key: cord-0321422-2b09c8o2 authors: Vespe, Michele; Iacus, Stefano Maria; Santamaria, Carlos; Sermi, Francesco; Spyratos, Spyridon title: On the Use of Data from Multiple Mobile Network Operators in Europe to fight COVID-19 date: 2021-06-10 journal: nan DOI: 10.1017/dap.2021.9 sha: b9b920dcfea33451d719244d231b6aa3ab86f305 doc_id: 321422 cord_uid: 2b09c8o2

The rapid spread of COVID-19 infections on a global level has highlighted the need for accurate, transparent and timely information regarding collective mobility patterns to inform de-escalation strategies as well as to provide forecasting capacity for re-escalation policies aiming at addressing further waves of the virus. Such information can be extracted using aggregate anonymised data from innovative sources such as mobile positioning data. This paper presents lessons learnt and results of a unique Business-to-Government (B2G) initiative between several Mobile Network Operators in Europe and the European Commission. Mobile positioning data have supported policy makers and practitioners with evidence and data-driven knowledge to understand and predict the spread of the disease, the effectiveness of the containment measures and their socio-economic impacts, while feeding scenarios at EU scale and in a comparable way across countries. The challenges of this data sharing initiative are not limited to data quality, harmonisation, and comparability across countries, however important they are. Equally essential aspects that need to be addressed from the outset are related to data privacy, security, fundamental rights and commercial sensitivity.

The new coronavirus disease 2019 rapidly spread throughout the world during the first quarter of 2020, reaching pandemic status on 11 March 2020. Authorities in most European countries and worldwide had to confront unprecedented challenges to contain the number of infections and prevent saturation of intensive care units in national health systems.
This required immediate policy responses. Governments reacted by passing a wide range of measures, including confinement measures designed to contain the spread of the virus, information campaigns, fiscal stimulus to support the economy in the short term, recovery plans for the aftermath and preparation for prospective second waves of the virus. Timely, accurate and reliable data to inform such decisions are of paramount importance. Against this backdrop, on 8 April 2020 the European Commission asked European Mobile Network Operators (MNOs) to share anonymised and aggregate mobile positioning data, with the aim of providing mobility patterns of population groups in support of the fight against COVID-19. Initiated by means of an exchange of letters, the terms of cooperation between MNOs and the European Commission are outlined by a Letter of Intent 1, which specifies that insights into mobility patterns of population groups extracted in the framework of this initiative are meant to serve the following purposes:
• "understand the spatial dynamics of the epidemics thanks to historical matrices of mobility national and international flows;
• quantify the impact of social distancing measures (travel limitations, non-essential activities closures, total lock-down, etc.) on mobility;
• feed SIR epidemiological models, contributing to the evaluation of the effects of social distancing measures on the reduction of the rate of virus spread in terms of reproduction number (expected number of secondary cases generated by one case);
• feed models to estimate the economic costs of the different interventions, as well as the impact of control extended measures on intra-EU cross border flows and traffic jams due to the epidemic; and
• cover all Member States in order to acquire insights."
The aim of the initiative was in line with the European Commission Recommendation to support exit strategies through mobile data and apps (Commission Recommendation 2020/518, 2020) and with the European strategy for data adopted in February 2020, just before the outbreak of the pandemic. This strategy drew from a number of efforts including the High-Level Expert Group on B2G data sharing, set up by the Commission in autumn 2018, whose members represented a broad range of interests and sectors. In its final report issued in February 2020 (Alemanno (2020)), the Expert Group called on the Commission, the Member States and all stakeholders to take the necessary steps to make more private data available and increase its reuse for the common good. These previous efforts proved very timely and paved the way for the COVID-19 data sharing initiative.

Given the European geographical scale of the MNOs involved, this exercise aims, through the processing of aggregate and anonymised data, at understanding and sharing best practices across countries, highlighting which mobility policies are the most effective in fighting COVID-19. The process applied by the operators transforms the raw mobile data 3 into aggregate and anonymised intermediate products (so-called Origin-Destination Matrices, see next section for more details). These matrices are the products delivered by the MNOs to the Commission. The matrices are not "ready-to-use" indicators, but their level of granularity and their attributes have given the Commission the opportunity to derive from them indicators specifically designed to meet the needs of JRC researchers, the ECDC 4, policymakers and practitioners at EU and Member State level. In addition to the increased flexibility to design indicators tailored to policymakers' needs, this arrangement also makes it possible to reduce the 'black box' effect thanks to the greater transparency and control over the process. This results in output indicators that are more usable by policymakers.
Even though some mobility data derived from social media 5 or mobile app 6 location data is openly available, contains rich disaggregations (e.g. walking mobility versus driving, or categorisation of places in retail, parks, etc.) and has global coverage, the MNO data available provide a number of advantages over that location data: level of granularity, both spatial (MNO data reaches up to municipality level while app data available is limited to region or province) and temporal (some MNOs provide more than daily updates); representativeness (MNO data probably better captures all the different population groups); availability of connectivity data (i.e. from an origin to a destination, as opposed to just mobility levels at a location); a higher level of transparency (more detailed methodological description). This makes the MNO data a very valuable source of human mobility insights.

The unique nature of the initiative lies in its geographical scope, the number of Mobile Network Operators involved, and the relatively rapid and in many cases unconditional support they offered. Thanks to continuous dialogue with the Commission, MNOs have shown concrete interest in being active and supportive, irrespective of their different levels of maturity in producing the required data; some were already collecting and processing aggregate and anonymised data to deliver similar insights to national authorities, while others had to develop ad hoc processes to be in a position to respond to the data request. Within a few months, data from 17 MNOs covering 22 EU Member States plus Norway have been transferred to the Commission on a daily basis, with an average latency of a few days, and in most cases covering historical data from February 2020. This enables the comparative analysis across countries of mobility before, during and after the release of lock-down measures.
3 In this paper, the term "raw data" means mobile phone data records or collection of variables that refer to individual users (not necessarily identifiable) and not to groups of users.

The urgency of producing useful insights and quickly supporting the response to the crisis, combined with the fact that the initiative is pro-bono, led to the decision to share Origin-Destination Matrix (ODM) data already available, requiring the least possible additional developments by the MNOs. As expected, the shared ODMs follow definitions and methodologies that inevitably differ among MNOs in various aspects, such as:
can be captured and described by different origin-destination movements depending on the approach adopted.
• Extrapolation. Some operators extrapolate the movement counts to the total population based on their market share in the country.
• Spatial and temporal resolution of ODMs. MNOs use different types of geographical areas for capturing movements, for example, administrative boundaries (such as municipalities, postcodes, and census areas) or regular geo-grids. Similarly, the time-frequency of the reported movements is heterogeneous, with time-windows ranging between one and twenty-four hours.
• Confidentiality thresholds. MNOs discard all movements below a given "confidentiality threshold" to reduce de-anonymisation risks. This confidentiality threshold is set in adequate proportion to the size of the adopted geographical area of reference (and its lowest population) and thus varies across MNOs.
• Syntactic heterogeneity. MNOs deliver ODMs in different data formats, provide geographic files in different coordinate reference systems and use different languages to describe the data. These syntactic heterogeneity issues are of minor importance compared to the other heterogeneity issues described above and can easily be addressed.
• Presence of additional attributes. Some MNOs provide additional information about mobile phone users such as age groups and sex as well as inbound or outbound roamers. As a matter of fact, the higher the level of disaggregation or granularity, the more movements fall below the confidentiality threshold and are therefore filtered out.

The large variations of the parameters described above lead to a low harmonisation of the aggregate and anonymised data across operators. Through further aggregation and relativisation (i.e. concentrating on mobility trends rather than absolute figures), it was possible to use the principle of common denominator to derive from the ODMs a number of mobility data products that preserve a certain amount of basic shared characteristics and are therefore mostly comparable across countries. Through the process of harmonisation, the ODMs received from the MNOs (which differ in geolocation, mobility definitions, etc.) are transformed into mobility data products with a higher degree of comparability across MNOs than the original matrices. Although the derived insights have proved extremely useful, constraints of data heterogeneity across operators could be overcome by formulating refined data requests following the Trusted Smart Statistics 7 concept as in Ricciato et al. (2020b), with the result of providing harmonised statistics on human mobility to more effectively feed epidemiological models.

The first results of the initiative were communicated to the general public by the European

7 The Trusted Smart Statistics is a concept that describes the ongoing efforts to augment the established components of statistical systems with the elements necessary to successfully exploit the increased datafication of society. These efforts involve the working models, operational processes and practices of statistical offices.
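To make the notions of confidentiality threshold, extrapolation by market share and relativisation more concrete, the following is a minimal Python sketch. It is purely illustrative and is not the processing actually applied by the MNOs or the JRC: the function names, the toy area labels, the counts, the threshold of 10 and the market share of 0.25 are all assumptions introduced here, and the baseline used for relativisation (the mean of the first days of the series) is only one possible choice.

```python
from statistics import mean

# Toy ODM: (origin, destination) -> number of movements in one time window.
# All area names and counts below are invented for illustration.
ODM = dict[tuple[str, str], float]


def apply_confidentiality_threshold(odm: ODM, threshold: float) -> ODM:
    """Drop every origin-destination pair whose count is below the threshold."""
    return {pair: n for pair, n in odm.items() if n >= threshold}


def extrapolate_to_population(odm: ODM, market_share: float) -> ODM:
    """Scale an operator's counts to the whole population via its market share."""
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in (0, 1]")
    return {pair: n / market_share for pair, n in odm.items()}


def relative_mobility(daily_totals: list[float], baseline_days: int) -> list[float]:
    """Express each daily total as a ratio to the mean of the first baseline_days."""
    baseline = mean(daily_totals[:baseline_days])
    return [total / baseline for total in daily_totals]


# Hypothetical usage with invented numbers.
odm = {("A", "B"): 120, ("A", "C"): 8, ("B", "A"): 95}
filtered = apply_confidentiality_threshold(odm, threshold=10)  # ("A", "C") is dropped
scaled = extrapolate_to_population(filtered, market_share=0.25)
trend = relative_mobility([200.0, 200.0, 100.0, 50.0], baseline_days=2)
```

Working with ratios such as `trend` rather than with absolute counts is what allows series from operators with different thresholds, market shares and extrapolation choices to be compared, at the cost of losing the absolute scale of the movements.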
8 https://ec.europa.eu/commission/presscorner/detail/en/mex_20_1359
9 https://ec.europa.eu/jrc/en/news/coronavirus-mobility-data-provides-insights-virus-spread-and-containment-help-inform-future
10 https://ec.europa.eu/digital-single-market/en/news/coronavirus-mobility-data-provides-insights-virus-spread-and-containment-help-inform-future
11 https://www.rtp.pt/noticias/economia/operadoras-em-portugal-e-outros-18-paises-da-ue-ja-forneceram-dados-a-bruxelas_n1245101
12 https://eng.belta.by/partner_news/view/eu-study-on-mobile-phone-data-reveals-correlation-between-human-mobility-covid-19-spread-131786-2020/
13 https://www.eureporter.co/frontpage/2020/07/16/coronavirus-mobility-data-provides-insights-into-virus-spread-and-containment-to-help-inform-future-responses/
14 http://www.xinhuanet.com/english/2020-07/16/c_139215507.htm
15 https://eurohealthnet.eu/newsletter-article-hh/july-2020/
16 https://www.eureporter.co/frontpage/2020/07/16/coronavirus-mobility-data-provides-insights-into-virus-spread-and-containment-to-help-inform-future-responses/

The Mobility Functional Areas (MFAs) are data-driven geographic zones with a high degree of inter-mobility exchange. The construction of the MFAs, which is completely data-driven, starts from the ODMs at the highest spatial granularity available irrespective of administrative borders.

[Figure caption, truncated: April 2020 (bottom) over MFAs in Austria. The COVID-19 geographic spread seems to follow MFAs more than political district (GKZ) borders (Iacus et al. (2021a)).]

The products are currently being expanded to feed early warning mechanisms to detect anomalies in usual mobility patterns such as gatherings (Iacus et al. (2021b)) and to inform scenarios for targeted COVID-19 non-pharmaceutical interventions (De Groeve et al. (2020)). The products are also feeding the Mobility Visualisation Platform (Figure 2), presenting insights comparable at national, regional and NUTS3 level and combining ECDC data.
Access to the platform is provided to practitioners and policymakers in the Commission, ECDC and EU Member States.

Because of its unprecedented nature, this B2G initiative highlighted some complex challenges that need to be addressed in order to benefit from the lessons learned. The most relevant challenges are listed below by main domain.
• Data Security and Integrity. Security and integrity of the data were primarily addressed by implementing end-to-end encryption of the data transferred from the MNOs to the JRC, and by developing a dedicated secure platform to host and process the data, accessible to a limited and controlled number of users. Data received from the MNOs at high spatial and temporal resolution was not allowed to exit the secure platform. All the data processing, analysis and storage took place remotely on the Unix secure platform using open-source technologies such as Python, R and PostgreSQL. The Common Denominator resulting from this process is represented by the Mobility Data products presented in Section 3.
• Privacy, Commercial Sensitivity and Fundamental Rights. Data privacy, the risk of re-identification of groups of individuals and ethical aspects related to the use of the data needed to be carefully addressed. Although the data shared by the MNOs (ODMs) contains only anonymised and aggregate data, in compliance with the EDPB guidelines (EDPB (2020)) the JRC carried out a so-called "Reasonability Test" upon the reception of preliminary data samples from MNOs. The objective of the test is twofold: to actively verify that the data specification in terms of origin-destination aggregate data was respected and to assess whether or not the risk of re-identification of the individuals was reasonably low.
Following the recommendations by the European Data Protection Privacy Guidelines (GSMA (2020)), and in order to respect fundamental rights, avoid discrimination and respect the legitimate business interests of operators, the JRC put in place measures such as: i) definition of conditions of non-disclosure and use of the data products only in well identified COVID-19 related fields (as set out in the Letter of Intent, see Section 1); ii) limited and controlled access not only to the original MNO data but also to the derived products; and iii) adoption of a data retention horizon.
• Communication and transparency. Communication aspects were duly analysed in order to appropriately convey the message that the initiative has dealt exclusively with anonymised and aggregate data, ultimately avoiding reputational damage both for the MNOs and the Commission as well as political backlash. This required consultation with MNOs and the GSMA prior to the publication of communication outlets as well as scientific results to the public.
• Data heterogeneity. Because of the need to react quickly to an emergency situation, the initiative has been based mostly on data already available at the MNOs. Yet, the JRC had to cope with a high degree of heterogeneity of the data as introduced in Section 2. This implied substantial downstream technical efforts by the JRC to harmonise the data to the greatest possible extent, and to find a common denominator across operators, resulting in lowered information content but guaranteeing data comparability.

This unprecedented initiative demonstrated the importance of an inter-disciplinary approach

should also suggest the right balance between privacy-compliance and level-of-detail starting from the raw mobile positioning data; this may help the MNOs to find a common standard for the ODMs, drastically reducing the heterogeneity. At the same time, the standards should not be so prescriptive as to keep MNOs from innovating.
Moreover, the working group should develop additional Mobility Data Products tailored to their different applications (epidemiological modelling, forecasting of the contagion on different scales, mobility-impact assessment, drop in connectivity and tourism, etc.).
• Establish an Ethics Committee with the mission of considering all ethical aspects and implications of the initiative and of making sure that it complies in all its parts with fundamental rights. The Ethics Committee should take into consideration the culturally shared privacy norms of the different countries involved in the initiative.
• Establish and maintain a Research Network of Collaboration made of data providers, data processors and data users. Such a network will facilitate the dialogue between the parties, ensuring the continuous improvement of procedures and the enhancement of the results.

In terms of priority, the ex-ante definition of a common standard for the raw data from the MNOs is the first step, since it requires a long time to be drafted and an even longer time to be implemented. Moreover, its adoption would ensure direct comparability across both regions and mobile data sources, while avoiding the time and resource demands of harmonisation on the data processor side (see the heterogeneous characteristics of the ODMs explained in Section 2), which not only entails an unavoidable loss of space-time granularity, but very often leads to sub-optimal indicators. Efforts should also aim at making insights and derived products publicly available to the research community in order to make use of their full potential and ensure reproducibility of research outcomes.

initiative, which is a partnership between academia and industry.
The expert group is nominated and funded by independent public foundations and works closely with experts within the industry to create a framework that allows access to the industry data while preserving both the privacy of the users and the commercial value of the companies involved. The composition of the team of experts and the length of support to the initiative is under the control of the foundations. Another example is the Big Data for Migration Alliance (https://data4migration.org/).

Based on the results obtained, this initiative can potentially help simplify the systematic use of data from the private sector in the framework of policy support. The global scale and spread of the COVID-19 pandemic highlight the need for a more harmonised or coordinated approach across countries (Oliver et al., 2020). By collecting mobility data across various EU Member States, this initiative is trying to address the COVID-19 crisis response in a more holistic way. The initiative provides a concrete example of setting up bilateral channels between the private and public sectors for understanding human mobility and providing support in addressing societal issues. Some of the procedures developed during the project should be consolidated in order to speed up future B2G processes. The success of the initiative demonstrates how actors at the interface between the private sector, the scientific community and the policy side can play a key intermediary role in ensuring that data-driven insights are used responsibly to effectively respond to pressing policy questions.
Towards a European strategy on business-to-government data sharing for the public interest
Working group on epidemic preparedness - preventing the spread of epidemics using ICT
Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions, A European strategy for data
Proposal for a Regulation of the European Parliament and of the Council on European data governance (Data Governance Act)
Communication from the Commission to the European Parliament and the Council, Staying safe from COVID-19 during winter
Commission Recommendation (EU) on a common Union toolbox for the use of technology and data to combat and exit from the COVID-19 crisis, in particular concerning mobile applications and the use of anonymised mobility data
Scenarios and tools for locally targeted COVID-19 non pharmaceutical intervention measures, JRC122800
Guidelines 04/2020 on the use of location data and contact
The GSMA COVID-19 privacy guidelines
Human mobility and COVID-19 initial dynamics
Mapping mobility functional areas (MFA) using mobile positioning data to inform COVID-19 policies
Mobility functional areas and COVID-19 spread
Anomaly detection of mobile positioning data with applications to COVID-19 situational awareness
Mobile phone data for informing public health actions across the COVID-19 pandemic life cycle
Towards a methodological framework for estimating present population density from mobile network operator data
Trusted smart statistics: How new data will change official statistics
Measuring the impact of COVID-19 confinement measures on human mobility using mobile positioning data. A European regional analysis

Acknowledgments The authors acknowledge the support of European MNOs (among which 3 Group (part of CK Hutchison), A1 Telekom Austria Group, Altice Portugal, Deutsche Telekom, Orange, Proximus, TIM Telecom Italia, Tele2, Telefonica, Telenor, Telia Company and Vodafone) in providing access to aggregate and anonymised data, an invaluable contribution to the initiative. The authors would also like to acknowledge the GSMA 19, colleagues from Eurostat 20 and ECDC for their input in drafting the data request. Finally, the authors would also like to acknowledge the support from JRC colleagues, and in particular the E3 Unit, for setting up a secure environment to host and process the data provided by MNOs, as well as the E6 Unit (the "Dynamic Data Hub team") for their valuable support in setting up the data lake.

19 GSMA is the GSM Association of Mobile Network Operators
20 Eurostat is the Statistical Office of the European Union

Ethical standards The research meets all ethical guidelines.

Author contributions All authors equally contributed to the work. All authors approved the final submitted draft.

Funding statement This work received no specific grant from any funding agency, commercial or not-for-profit sectors.

Data Availability Statement No datasets were processed to generate research results presented in the current study.