key: cord-0207695-0l0cnpjp authors: Ahn, Na-Young; Park, Jun Eun; Lee, Dong Hoon; Hong, Paul C. title: Balancing Personal Privacy and Public Safety in COVID-19: Case of Korea and France date: 2020-04-29 journal: nan DOI: nan sha: 93b88a9def985f65a75d1ff10010f9170a799812 doc_id: 207695 cord_uid: 0l0cnpjp There has been vigorous debate on how different countries responded to the COVID-19. To secure public safety, Korea has actively used personal information at the risk of personal privacy whereas France encourages voluntary cooperation at the risk of public safety. In this article, we examine the pros and cons of these two different approaches in epidemiological investigations. In addition, we present technological options of de-identification of personal information in the course of managing this pandemic under control. We discuss lessons for future research and policy implications. Increasingly, integration of big data and ICT promises enormous benefits to societal value creation. At the same time, concerns about privacy breaches in the context of big data usage are timely and relevant. The "old" debate over personal privacy and public security is not over. In a pandemic crisis, public safety is a top priority. This means that if a government pursues a legitimate use of certain information for the compelling public purpose, privacy rights may be at risk [1, 2, 3] . In particular, when personal information is crucial to control pandemic crisis such as COVID-19, "implicit" consent is more likely to be expected. In the early 1980s, privacy and confidentiality was generally accepted standard on any epidemiological studies [4, 5] . For a face-to-face investigation, the main focus was to achieve specific research goals ensuring the personal privacy. However, in view of increasing social costs associated with prevention, treatment and compensation of serious infectious diseases, there was a growing demand for accurate information in pursuit of potentially disastrous public health risks. For an example, Gilbert Beebe suggested that gathering accurate information for public interest might be a higher priority over privacy issues [7] . The widespread flu epidemic in 2009 again provided additional support for this line of reasoning. In the course of using personal information for epidemiological investigations, individual's explicit consent was not always obtained. In the United States, Health Insurance Portability and Accountability Act (HIPAA) has established the privacy rules that set limits on the use and disclosure of any personal health information without the patient's approval [8] . At the same time, aggregating personal information for public health purpose was a somewhat different matter [9] . COVID-19 is an extraordinary circumstance that almost the whole society is lockdown because of enormous public health risks-anticipating potentially millions or more deaths of people worldwide [10, 11, 12] . Nations are at war with corona virus. In this context, what does it mean to balance between personal privacy and public safety? What are the proper boundaries and acceptable norms? This paper aims to consider these questions and examine actual cases of two countries-Korea and France. The subsequent sections of this paper are organized in the following order. In section 2 we discuss characteristics of the virus that causes COVID-19. We then introduce an anti-displacement alternative to COVID-19. We further compare the results of the French government's and Korean government's quarantine measures against COVID-19. The subsequent risk analysis uses STRIDE threat modeling for the Korean government's quarantine system. After discussing the necessity of deidentification of personal information and de-identification technologies, we present lessons and implications for future epidemiological investigations. A new type of corona virus (SARS-CoV-2) was first reported in Wuhan, China in December 2019. Since then, this respiratory infection epidemic, designated as COVID-19, has spread throughout China and around the world. Upon infection, after 2 to 14 days of incubation period, the respiratory symptoms include high fever (about 37.5 degrees) and cough or dyspnea. However, it is estimated that there are a number of cases of asymptomatic infections with the disease without these symptoms. On January 21, 2020, the Chinese government officially reported the possibility of human infection in COVID-19 with the 15 confirmed cases. The medical staff involved incident became a credible evidence of human-to-human transmission [13] . On January 30th, 2020, World Health Organization (WHO) declared the continual spread of this infection as International Public Health Emergency (PHEIC). With an accelerating rate of confirmed patients around the globe, on March 11, 2020 WHO declared this corona virus epidemic as pandemic as it did regarding the Hong Kong Flu (1968) and H1N1 (2009) [14] . COVID-19 is a respiratory virus which spreads primarily through droplets generated when an infected person coughs or sneezes, or through droplets of saliva or discharge from the nose. The infected patient's saliva can be transmitted directly to another person's eye or if their eyes are rubbed with a virus-contaminated hand [13] . And the rapid propagation power of COVID-19 was expected to inevitably lead to a shortage of limited medical equipment and facilities due to an sudden increase in the number of explosive patients within a short period of time [15] . For these reasons, the fight against COVID-19 requires confinement of confirmed and contacted persons, especially in the early days of quarantine. In some countries, these responses were compulsory, while others were left to autonomy. Our study compared the cases of France and South Korea with special focus on their government's approaches in seeking the participation of their citizens. At the beginning the Korean government did not responded appropriately due to the lack of understanding of the characteristics of COVID-19. The initial optimism was based on the confidence in Korea's medical capabilities to handle public health challenges. In particular, the confusion was about how to determine a specific level to quarantine those who could be suspected of being asymptomatic. For example, a Chinese woman who arrived from Wuhan on January 20 was found to be the first confirmed case, but until then, if fever symptoms were not present, immigrants were allowed to enter and therefore the system did not follow through asymptomatic patients. However, after understanding the epidemiologic significance of asymptomatic patients and the nature of droplet infection, the new emphasis was to identify the pathogens of the confirmed patients and isolating contacts in close contact with the pathogens. This represents the essential elements of Disease Health Integration System (DHIMS) which collects and uses epidemiological survey data, referring to Figure 1 . Local governments conduct tests for epidemiologic investigation. Medical staff at public health centers and diagnostic screening centers follow-up with the confirmed patients. Local governments are responsible to operate screening clinics through large scale of drive-through or walk-through testing sites without harvesting virus transmission [16] . In case when a person is confirmed with a test positive, the health center or diagnostic center immediately uploads the relevant personal information of the confirmed to the DHIMS. At the same time, the health center or diagnostic center immediately informs the incident to the Centers for Disease Control (KCDC). The local government health center also conducts additional epidemiological investigations. Confirmed patients are required by law to disclose the recent moving paths and identify all the contacted persons. Using the mobile phones and credit cards of the confirmed patients, local governments upload all contact information to DHIMS including name, address, contact information, date of birth, gender, disease name, diagnosis date, age, occupation, place of residence, telephone number, and medical information. Also, for virus prevention and containment, the epidemiological investigation list includes all the relevant information of the contacted--name, address, contact information, date of birth, and health status. In this way, a national database of epidemiological investigations gather all the relevant information of the confirmed and all the contacted. The Korean government uses this database of epidemiological investigations to isolate confirmed persons and all the contacted. Diagnosis test is performed immediately among those who have symptoms. According to the severity of the symptoms of COVID-19, the people are either self-contained or hospitalized. Those who do not have any symptoms among the contacts are quarantined for 14 days from the contact date of the confirmed person. Selfquarantined individuals are monitored daily at local government call centers. The additional diagnostic test. result shows that the person is negative and after 14 days without symptoms, he/she is to be released. The Korean government implements the COVID-19 response system with 3P (Preemptive, Prompt, and Precise). 3T (Trace, Test, Treatment) plus P (Participation) quarantine response model [16] . It covers diagnosis to testing, epidemiology, tracing, and citizen participation. It uses innovative ICT systems such as self-isolation and diagnostic apps, drive-through and walking-through clinics and mobile phone location information analysis. The Korean government also counts on voluntary participation of citizens in developing additional resources from the database. For example, using these aggregated epidemiological survey databases, real time corona maps and monitoring apps are developed for the benefits of society at large. France organized Public Health France (PHF) on January 13, 2020 to monitor and respond to the COVID-19 epidemic. PHF's Crisis Center is responsible for coordination of monitoring epidemiological prevention, mobilizing health protection organizations, managing strategic resources of medical facilities, and offering support services. PHF conducts daily epidemiological investigations and released the aggregate details including the area, gender and age group of COVID-19 [17, 18] . PHF sets the surveillance system to monitor epidemiological and clinical aspects of COVID-19, using urban medicine, measuring the severity of the epidemic and its impact on the medical system, and the fatality rate. PHF actively takes preventative measures to limit the spread of COVID-19. They aim to reduce the risk of transmission among people through preventive messages to the people in the affected area. In addition, precautionary measures are to assist people to a better quality of life even in social isolation. In addition, PHF supports active health-related services by operating a remote support system. PHF allows healthcare professionals (doctors, nurses, pharmacists, physical therapists, midwives, etc.) and health professionals (managers, supervisors, health facility personnel, and engineers) to stay ready for the health center's request for help. The COVID-19 certainly disrupts French ways of life, restricts economic activities and affects social behaviors. The French government has implemented quarantine measures since March 16, 2020. PHF also monitors behavioral and mental health practices in response to these changes, and assesses anxiety levels. From the early days of the outbreak, PHF's main challenge is how to mobilize citizen participation in the fight against COVID-19. According to WHO data, the date of the first confirmed case in Korea was January 20, 2020. In France it was January 24, 2020 when there was the first confirmed case. In both countries, COVID-19 was introduced at the same week. However, after little more than two months, there is a marked difference in the cumulative confirmed cases and deaths of the two countries [14] . As shown in Figure 2 , the cumulative # of confirmed in Korea is 10, 537 and in France is 95,403. The cumulative deaths in Korea are 147 and in France are 14,393. As for the ratio of deaths to confirmed persons, Korea is 1.39 and France is 15.08. A simple comparison of the confirmed and mortality rates is significantly higher in France than in Korea. At the rate of confirmers per 1,000 populations, Korea is 0.204, while France is 1.374. Figure 3 and Table 1 shows that the outcomes of the KCDC's responses is different that of France's PHF. The Korean government's active preventive measures include the usage of the epidemiological investigation database and ICT. Both countries encouraged their citizens to join in the fight against COVID-19. However, there are significant differences in their outcomes. Our research suggests the epidemiological investigation database and the ICT technology usage in Korea have resulted in enormous benefits. This do not mean that any country can adopt the Korean government's approaches without any difficulty. It is not a goal of this research to fully describe the development processes of the Korean government's quarantine system and its operational mechanisms. For the purpose of this research we used available response guidelines released by the Korean government and other related announcement and analyzed the security threat using the STRIDE analysis technique on the reconstructed quarantine system [19, 20] . We examined the relevant data from the Korean government's epidemiological database using the STRIDE analysis modeling. We performed the security evaluation by the STRIDE threat modeling and examined the dynamic investigation data input and output of the Disease Health Integration System (DHIMS) [16, 21] . Spoofing Identity: The appropriate security level of the DHIMS system requires safeguarding identity and restricting access to epidemiological investigation data, referring to Figure 4 . After performing the basic authentication operation procedures, the relevant medical personnel, epidemiologists, government agencies, or civilians are allowed to get access the epidemiological investigation database. Tempering Data: The epidemiological data collected contain personal information in detail: name, address, contact information, gender, age, and phone number of the confirmer/ all contacted. DHIMS does not automatically deidentify such epidemiological survey data. At the time of entering epidemiological survey data, medical personnel or epidemiologists may arbitrarily change epidemiological survey data arbitrarily. Repudiation: Since a legitimate user can handle both the input and output of epidemiological survey data to and from DHIMS, there may not be sufficient correct and check procedure in the processing data operation. Medical/ mechanical investigators may not be held accountable if they arbitrarily change epidemiological investigation data. results of epidemiological investigations in an offline state do not always match exactly with the epidemiological data entered in an online state. Therefore, ensuring integrity of the epidemiological investigation data is a real challenge. Information Disclosure: The retention period for epidemiological data in DHIMS is permanent or semipermanent. If systematic security of DHIMS is solid, then all personal information is assumed to be kept safe. However, a third party requests particular set of epidemiological investigation data, DHIMS is supposed to conduct deidentification process and offer specific numbers instead of names [21] . However, it is also possible the privacy of individuals might not always be respected. Quality Control Measures: DHIMS use quality control measures which access permission and denial based on relevant parameters (e.g., proper authorization and examination of usage patterns). DHIMS has issues related to the overall system security level because epidemiological survey data contain identifiable personal information. This may potentially lead to serious privacy violations of all those who are required to provide multiple aspects of personal information. Despite its effective response to COVID-19 using epidemiological survey data, the entire process is also embedded with serious potential privacy violation. Even more problematic is the fact that the retention period of these epidemiological investigation data is legally permanent or semi-permanent [21] . Unless strict safeguarding measures are in operation, there remains serous privacy risk concerns. The Korean government disclosed the COVID-19 confirmatory movement path, the address of quarantined building, and enforced two weeks of self-containment for all the confirmed and contacted. In the early days of the endemic outbreak, corona maps tracked the path of the confirmer and thus raised awareness of many people. With the rapid increase of the confirmed persons from Shincheonji church congregation, the nature of public safety needed a better definition. Shincheonji church, in good faith, provided the personal information of its members to the Korean government that guaranteed strict protection of personal information. Shincheonji local branch provided social security number and phone number of each member. The Korean government followed up the Figure 5 shows a comparison of cumulative deaths in Korea and France. The number of deaths before and after the Pandemic, declared on March 11, shows a sharp difference. There seems high correlations between the usage of big data that contains personal information and effectiveness of epidemiological investigations in this pandemic situation. In an extraordinary crisis situation-COVID-19 pandemic, unidentified aggregate information has no value. For public safety, the government had to use a huge church big data that provides identifiable personal information. members of Shincheonji church with phone calls and conducted investigations for their the COVID-19 related visits and contacts. Confirmed # 31 is a member of Shincheonji church. The third confirmed person also received much social attention. Public safety requires "right to know" about the status of coronavirus infection. In the age of digital age, balancing the public safety and personal privacy is enormously challenging [22, 23, 24] . Individuals may waive their privacy rights for public safety that requires informing people about relevant coronavirus infection information. The question are, "What is relevant? How much details are allowed to be openly available to public?" In the early stage of the outbreak, the Korean government collected the personal information of confirmed patients to prevent further infection. The use of personal information-is essential for the prevention of occupational diseases. The privacy problem described above may not be how it will be developed in the current process. The Korean government specified the path of infection, identified all the contacted persons, conducted disaster prevention and implemented self-containment of the contact person. The details of patient's personal information included credit cards, phone number, and address. The initial response of the Korean government had a considerable effect with moderate success. On February 18, 2020, there was the 31st confirmed person in a Shincheonji church in Daegu area. With a sudden increase of confirmed patients among Shincheonji church members, the Korean government changed its approach to more aggressive follow-up. Shincheonji church, as a new religious movement, was not yet accepted as a part of mainstream religions in Korea. Hundreds of the church leaders attended their international missionary outreach gathering in Wuhan, China and returned back to Korea in January, 2020. At the request of the Korean government Shincheonji Church provided social security number and phone number of its members. Local governments called the church member in their region and examined the symptoms and conducted corona tests. As a result, almost all of the Shincheonji church members (about 212,000) have been examined. In the meantime, the number of confirmed persons has increased explosively-up to 7,513 on March 10, 2020 [25] . The Shincheonji church ledger is a big data of more than 200,000 people. Korean government used this big data to prevent COVID-19 pandemic. Their methods of recruitment of new members and education of its members have controversial elements. In particular, their regular mass meeting often occurred in an enclosed huge hall. Considering the situation in which virus transmission is easy due to the characteristics of the church contacting in an enclosed space. With the use of this church big data and follow-up testing, one of the main sources of widespread outbreak was effectively contained. Although the personal information issue of big data has been raised before, it's real challenges actually occur in Korea in the contest of COVID-19. Public safety and personal privacy are two important value pillars related to use of information through Big Data. One way to balance public safety and personal privacy might be using deidentified information. As a proactive measure for pandemic containment and prevention, the Korean government uses epidemiological data through ICT technology. Medical testing equipment expedited massive coronavirus testing which was instrumental in reducing mortality. It is public safety imperative to use personal information to control and prevent the spread of pandemic. Managing personal information stored in big data requires appropriate privacy protection measures. It is one thing to use raw data at the initial stage of quarantine and quite another to manage huge volume of personal information as big-identified data. Privacy violation is related to the use of identifiable personal information. Therefore, serious privacy issues can be effectively handled with non-identifiable personal information. Right type of technology is needed to implement such de-identification options. In the United States, Health Insurance Portability and Accountability Act (HIPPA) set national standards for protecting an individual's medical records and their personal health information. It applies to health plans, health care information centers, and health care providers that electronically transmit any health care transactions. This rule requires appropriate safeguards to protect the privacy of personal health information and sets limits and specifies conditions for the use and disclosure of such information without patient consent and approval [9, 27] . However, the Korean government uses identifiable personal information with limited restrictions, leading to possible serious violations of privacy related to confirmed persons /all those contacted. Securing personal information in quarantine measures is appropriate. In respect to personal privacy, it is important to use the information gathered for the specific intended purpose only. Using identifiable personal information for some other purposes is breach of confidence and trust. Moreover, legal provision of keeping quarantine investigation data either in permanent or semipermanent period is not reasonable at all. Requiring deidentification of personal information and rapid deployment relevant technology is urgently in need. In the privacy rules of HIPAA, two approaches are proposed regarding the de-identification of personal health information: the safe harbor method and the expert determination method [27] . The safe harbor method deletes 18 personal identification variables such as name, social security number, contact information, IP address, fingerprint, photo, and detailed address. The method of using experts is to processing personal information in non-identifying algorithms. The release and forget model, the data use agreement (DUA) model, and the enclave model are all useful to achieve effective control in data storage and usage processes. The general pre-sale model is to release unidentified personal information to the public by posting data on the Internet. In this way, once personal information is disclosed, it is almost impossible for any institution to recover it. The data usage agreement model is usually a method of proactively negotiating, linking to other information, or prohibiting information redistribution in advance, or controlling it with a user click-through license agreement on the Internet. The closed room model maintains a kind of closed room (analytic environment in which physical access, network, etc. is blocked) that restricts the export of unidentified personal information original information and instead allows qualified researchers to inquire and analyses for desirable results. It is a physical and technical control method to respond and export [28] . It is not easy to preserve the scientific utilization value of the collected data and de-identify personal information at the same time. These two factors are in conflict with each other. In other words, researchers look for more precise analysis results using the original data that contain a minimum deidentification level of personal information. On the other hand, any institutions that provide the data aim to satisfy personal privacy requirements by ensuring the anonymity of the data. There are diverse approaches of de-identification of personal information. An increasing level of de-identification is negatively related to the quality of the data and the precision of the research results. Conversely, higher data quality and outcome precision requires lower level of deidentification. A greater level of personal identification is related to a higher possibility of privacy infringement. Then, how can epidemiological investigators de-identify personal information to ensure personal privacy? The primary purpose of the epidemiological investigation is to minimize the contacts with the confirmed patient. For this reason, isolation of a corona virus test positive individual is imperative in the prevention and spread of infectious diseases. Here are several practical suggestions to enhance security in the epidemiological investigations: First, the consent to use personal information is obtained at the time of epidemiological investigations. Such permission will specify the period of storage and use of personal information. Because of the extraordinary nature of COVID-19, in the early breakout period personal information was often collected without getting proper consent. Later, it is mandatory to obtain personal information consent. If any individuals are reluctant to give their content, the personal information used in the epidemiological investigation is immediately deleted. Second, there are other options for the proper use of identifiable personal information. It is worth considering concepts such as copyright payment. For sales of any product with copyright the corresponding amount of money is set aside to compensate the copyright holder. If complete deidentification of personal information is not possible, it is plausible to compensate each individual for the use of their personal information. In addition, epidemiological investigations can be done either offline or online. In all these investigation, getting consent from individuals that participate is a must. Third, a medical person or epidemiologist personal store the personal information collected offline in the system by applying de-identification technology. When the required information is uploaded in the system, it is notified to the individual for accuracy and consent. Afterward, the offline information is immediately destroyed. The fact is communicated to the individual as well. Fourth, personal information collected online will be stored in the system by applying de-identification technology by epidemiologists. When the personal information collected in the system is uploaded, the fact is immediately notified to the individual. Fifth, if third parties have to receive personal information of individuals collected by epidemiological investigations, the system immediately notifies them of this fact. At this time, the personal information that is made available to third party is in the form of non-identifying numbers of symbols. Sixth, the third party may need to re-identify the unidentified personal information as necessary. At this time, the third party must immediately communicate the individual of the re-identification need and obtain the consent of the individual. If an individual's consent is not obtained, personal information must be immediately destroyed. With widely available de-identification technologies, it is difficult to prevent individuals from being re-identified from de-identification measures. Researchers including Montjoye of Imperial College London, UK, conducted experiments with published data from the United States, Turkey, etc., and found certain attributes accurately even by using deidentified data [29] . Their machine learning model could identify individuals with 99.98% accuracy from any anonymized data using only 15 demographic attributes (age, gender, marital status, etc.). Montjoye suggests that there is a need to shift the paradigm of de-identification: "We need to de-identify, then move on. Anonymity is not a property of the data set, it depends on how the person who writes it uses it." In other words, what matters is not anonymizing the data set but designing and organizing the data set that matters most. So what are the alternatives? It is time to move from the idea of de-identification to the application of appropriate technologies/ It is to strike a balance between the use of data and personal privacy. Technologies such as secure multiparty computation and homomorphic encryption are emerging. More innovations are certainly in progress in post-COVID-19 world of new big data technologies and ICT applications [30, 31] . The development of these new technologies will increase our choice options when dealing with infectious diseases. In the epidemic prevention and control, there is no doubt that personal information collected in epidemiological investigations is for public safety purpose. There is no real disagreement that personal information collected should be protected from privacy infringement. It is imperative to balance personal privacy and public safety even in the context of COVID-19. Personal information collected under the consent of the individual may be used for overcoming this pandemic and achieving related research purposes. Initially, encrypted or unidentified personal information may be used, and in a pandemic situation, original personal information may be used. Of course, personal consent is essential whenever the collected personal information is used. COVID-19 is a testing case for the debate between personal privacy and public safety. In essence, the public disclosure of personal information is not absolutely required to epidemiological investigations of rapidly spreading epidemics. In COVID-19, the Korean government actively used personal information using ICT and demonstrated fairly successful outcomes in pandemic control. However, that is only a part of the whole story. ICT usage patterns imply that the more convenient it is to use for effective control of pandemic, the easier it is to infringe personal privacy. For the proper use of personal information in the fight against infectious diseases, consent to personal information is essential, and de-identified personal information must be used. The future research may explore further how de-identification technologies such as noisebased de-identification technologies are developed and applied. The rightful use of big data should make our lives safe, secure and free-balancing personal privacy and public safety. Authors of this article wish to express our deepest gratitude to all the dedicated medical practitioners and numerous patients in the world who are in the frontline battles against COVID-19. On the responsible use of digital data to tackle the COVID-19 pandemic Is there a duty to participate in digital epidemiology? Tracking Health Related Discussions on Reddit for Public Health Applications The epidemiologist's lament Putting the lamentations of epidemiologists into perspective Long-term follow-up is a problem Public Health Research Ethics: Clinical Registries and Informed Consent Tracking Healthy People Privacy Versus Public Health: The Impact of Current Confidentiality Rules Coronavirus 2019-nCoV: a brief perspective from the front line Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts Algorithms on regulatory lockdown in medicine Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and corona virus disease-2019 (COVID-19): the epidemic and the challenges Article 105924 COVID-2019 Situation Reports COVID-19, a worldwide public health emergency Flattening the curve on COVID-19 First cases of coronavirus disease 2019 (COVID-19) in France: surveillance, investigations and control measures SURVEILLANCE EPIDEMIOLOGIQUE DU COVID-19 Threat modeling for mobile health systems Threat Analysis for Wearable Health Devices and Environment Monitoring Internet of Things Integration System Privacy in the age of medical big data Differential Privacy for Data and Model Publishing of Medical Data Big Data, Cybersecurity, and Challenges in Healthcare Press_Release_(March10)_Afternoon.pdf Office for Civil Rights. Guidance Regarding Methods for Deidentification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule De-identification of personal information (NISTIR 8053) Estimating the success of re-identifications in incomplete datasets using generative models Governance, technology and citizen behavior in pandemic: Lessons from COVID-19 in East Asia Information Technology-Based Tracing Strategy in Response to COVID-19 in South Korea-Privacy Controversies