key: cord-0873925-a7johunh
authors: Birnhack, Michael
title: Who Controls Covid-Related Medical Data? Copyright and Personal Data
date: 2021-05-17
journal: IIC Int Rev Ind Prop Copyr Law
DOI: 10.1007/s40319-021-01067-5
sha: 46739500ff1193cf00be2852cb88b8dedf8acab5
doc_id: 873925
cord_uid: a7johunh

When the data are about a living person, we summon data protection law. Under the European Union's General Data Protection Regulation (GDPR), personal data, defined as "any information relating to an identified or identifiable natural person", deserves legal protection. [2] The ideal of data protection law is that only the data subject may determine whether to share the data, with whom, when, how, and under which conditions. This is the idea of (informational) privacy as control. The law permits processing of personal data with the subject's free and informed consent or in certain other cases. [3] For scientific research, which includes medical data, the GDPR offers an explicit derogation, subject to some safeguards (Art. 89).

Some scholars have argued in favor of treating personal data as property, but their arguments have been refuted: treating personal data as property would lead to more, not less, human commodification and would play into the hands of data giants rather than data subjects. [4]

Data about oneself may be precious to the individual and serve as a shield against actions such as discrimination or harassment. Controlling one's data facilitates autonomy and protects human dignity. However, the data are also valuable to others: the state, employers, insurance companies, and data giants. Digital data mongers are often interested in big data rather than in an individual's data; they analyze the dataset and ultimately target individuals. [5]

Access to big medical data enables the study of pre-existing data rather than of people in clinical trials. Such study is safer and cheaper.
Data analytics enables searching for unknown correlations, which leads to searching for causations. Instead of being limited to a sample population, the study can cover more people with more diverse relevant backgrounds.

Notwithstanding the benefits of big medical data studies, they pose a substantial challenge to data protection law: the data were collected in an identifying mode as a byproduct of medical treatment, and are then de-identified. The researchers often wish to have as much data as possible about each individual, such as the neighborhood they live in (one may be more polluted than another) or their ethnicity (which may reflect genetics). In other words, during collection, data subjects are identified; during processing, data subjects are identifiable; and at publication, the data are aggregate and statistical. A straightforward application of data protection law would require that data collection and processing be permitted only with the subject's consent. [6] If the data are truly anonymized, then data subjects are de facto protected. However, with the advance of de-anonymization and re-identification techniques, anonymity is fragile. [7] When the data subjects are identified, they have rights, and the data controller and data processor are subject to various obligations.

Copyright law places a "No Entry" sign and refuses to protect raw data. [8] This refusal is a well-entrenched legal maxim, for several reasons. First, a fact about the world is revealed, explained, and articulated by an author but not created by her. Second, data comprise raw material for further creativity, reflecting the notion of human knowledge being a collective human endeavor, echoing the idea of progress. [9] Third, the free circulation of information has an important democratic role.
In short, in the famous words of Justice Brandeis, referring to "knowledge, truths ascertained, conceptions, and ideas," and extended here to data, it should be "free as the air to common use." [10]

Turning from a single datum to big data, copyright law offers thin protection for structured, non-trivial compilations of data (i.e. selected and organized data), but not for the underlying data. [11] However, when the dataset is unstructured (i.e. without pre-defined selection criteria and no particular arrangement, and with each datum tagged), there is not much to protect under copyright law. The data aggregators achieve control of their datasets through other means, such as technological locks and trade secrets law. Importantly, copyright ownership of big medical data may hinder others' access. The importance of medical datasets, especially in times of crisis, cannot be overstated. The wise use of the data can decidedly save many lives. Such control may be a copyright ideal for some but a drawback for all.

"Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization", 57 UCLA L. Rev
Berne Convention for the Protection of Literary and Artistic Works
"The Idea of Progress in Copyright Law", 1 Buffalo Intell. Prop. L.J. p. 3
[10] International News Service v
Marrakesh Agreement Establishing the World Trade Organization, Annex 1C, 1869 U
Databases may also be protected under the sui generis European Directive. Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases

Combining the two bodies of law regarding big medical data results in four primary situations:
(a) Structured, anonymous datasets: The controller enjoys copyright, and data subjects enjoy de facto privacy protection.
(b) Unstructured, anonymous datasets: The controller does not enjoy copyright protection and may revert to other means, such as trade secrets law. Data subjects enjoy de facto privacy.
Both (a) and (b) carry the risk of de-anonymization, especially if the controller shares the dataset with others.
(c) Structured, identifying datasets: The controller enjoys copyright protection but is subject to various obligations to protect data subjects' privacy. Sharing the dataset would pose a high privacy risk. Data protection law reinforces copyright.
(d) Unstructured, identifying datasets: The controller does not enjoy copyright protection and is under an obligation to protect data subjects' privacy, pushing the controller to seek other restrictive practices that avoid sharing the data.

Returning to the Israeli context, according to the National Health Insurance Act, HMOs hold detailed medical data about the entire population, including vaccinations and their side effects. The HMOs can be classified under situation (c), with additional regulatory duties of confidentiality. The two largest HMOs also conduct medical research under situation (d). When the HMOs transfer aggregate and statistical data to the MoH, which, in turn, transfers it to Pfizer, we shift to situation (a), thus reducing privacy risks.

Combining copyright law and the obligations imposed by data protection law pushes the parties to protect the data under both copyright law and additional layers of protection, such as trade secrets law. This result means that other parties may have access to outcomes but not to raw data. To facilitate broad access to crucial data during a global health crisis, we need to address both bodies of law in an integrated manner.