Introduction

Over the past decades, the importance of obtaining consent in medical research settings has been strongly emphasized [1]. However, in data-intensive health research it is often regarded as impracticable or impossible to obtain (meaningful) consent [2,3,4,5]. A common misunderstanding of current European data protection law is that when consent is not used as the lawful basis, the processing of a person’s data is prohibited [6]. While obtaining consent is a way to secure legitimate data processing, it is not the only way. Article 6 of the General Data Protection Regulation (GDPR) contains six legal bases for the processing of personal data, of which consent is one.

Processing personal data for health research purposes most likely involves “special categories” of personal data. The European legislator has labeled genetic data, biometric data and data concerning health, among others, as special categories of data, which merit a higher level of protection [7]. As a result, the processing of special categories of data must have a lawful basis as outlined in Article 6 of the GDPR and must also fall under one of the ten exemptions listed in Article 9(2) GDPR.

The “research exemption” can be found under Article 9(2)(j) GDPR and allows for the processing of special categories of personal data if the processing is deemed necessary for scientific research purposes. In addition, it is required that the processing is in accordance with Article 89(1) GDPR and that it is based on Union or Member State law. Article 89(1) GDPR states that “technical and organizational measures” should be in place which “may include pseudonymization”. As such, Article 9(2)(j) GDPR contains an “opening clause”: Member States have been given the discretion to implement the research exemption into their national legislation. When they do so, they are required to provide for “suitable and specific measures to safeguard the fundamental rights and interests of the data subject” [7]. However, the GDPR does not provide much substance as to what constitutes suitable and specific measures.

Recent research has shown that the conditions under which, and the extent to which, processing of health data for scientific research is allowed without consent differ between the Member States [8], and that the many documents intended to guide policy in this area contain dissimilar terminology and concepts [9]. The fragmentation of data protection standards for scientific health research across the EU leaves researchers with a confusing legal landscape to navigate [10, 11]. Additionally, concerns have been raised about the possible emergence of a disparity between legal requirements and ethical standards [12].

The lack of clarity regarding the measures that should be implemented when invoking the research exemption may harm the protection of personal data as well as hinder progress in data-intensive health research. To address this, we conducted a systematic review of soft law instruments and academic literature. Our goal was to identify the measures outlined in documents regarding the processing of personal data for health research purposes without consent. These documents contain valuable opinions and suggestions on how to ensure legally and ethically sound data processing when consent by the data subject is lacking. Moreover, soft law instruments and academic papers are published at a considerably faster pace than official legal texts are issued. Therefore, the measures and safeguards referred to in those documents provide us with a more up-to-date reflection of the current data-intensive scientific research climate. With this review we aim to contribute to substantiating the GDPR’s requirement of installing suitable and specific measures when invoking the research exemption in Article 9(2)(j) GDPR.

Methods

To ensure complete and transparent reporting of the methods used, we based our review on the PRISMA-Ethics Reporting Guideline for Systematic Reviews on Ethics Literature [13]. Textual analysis and coding of the included soft law instruments and articles were carried out using NVivo 12 qualitative data analysis software. To conduct a thematic analysis, the authors retrieved quotes from all the included documents containing recommendations and/or opinions regarding measures that should be installed when health data are processed for scientific research purposes without consent. Each quote was assigned one or more codes, and an inductive approach was used to identify different overarching themes arising from the reviewed documents.

For the purpose of this systematic review, the term soft law is used to denote (international) declarations, guidelines, recommendations, frameworks and other documents that are not legally binding but that have an influence on the regulation of health research. Relevant soft law instruments were identified using the International compilation of human research standards (2020), a collection of laws, regulations and guidelines governing research from 133 countries and a number of international and regional organizations. We reviewed instruments that were included under Guidelines in the categories International and Europe Regionwide.

First, all instruments containing any guidance on processing personal data for scientific research were selected for review. This resulted in a list of 22 instruments. To ensure its comprehensiveness and to complement it if necessary, the list was reviewed by our academic and consortium partners with expertise in health law and research ethics. Ultimately, we included the instruments from this list that were: related to the GDPR’s territorial scope; specifically referring to the absence of consent and/or describing types of scientific research for which obtaining consent is impossible; mentioning measures that should be installed in such a situation. Exclusion criteria were: only describing legislation of non-EU countries; solely factually reflecting current legal policies without adding study results, views, opinions, reflections and/or suggestions for appropriate measures and safeguards; not written in English.

The academic literature was identified through a systematic search in PubMed and Embase. The queries were adjusted to the type of database (see Appendix 1 and 2). The initial search was performed on 02 Dec 2020 and produced 977 results in PubMed and 436 in Embase. After deduplication 1010 articles remained. Title/abstract screening left 250 articles for full-text screening. An additional search with the same queries was performed on 24 Jan 2022. The additional search produced 148 results in PubMed and 97 in Embase. After deduplication 194 remained for title/abstract screening, of which 24 articles were included for full-text screening (see Table 1).
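The record counts reported above can be cross-checked with simple arithmetic. The short Python sketch below only recomputes figures stated in this section; the function and variable names are illustrative, and the derived totals (duplicates removed, combined screening numbers) are not reported separately in the text.

```python
def duplicates_removed(pubmed: int, embase: int, after_dedup: int) -> int:
    """Number of records dropped during deduplication of one search round."""
    return pubmed + embase - after_dedup


# Figures as reported for the two search rounds.
initial = {"pubmed": 977, "embase": 436, "after_dedup": 1010, "full_text": 250}
update = {"pubmed": 148, "embase": 97, "after_dedup": 194, "full_text": 24}

initial_duplicates = duplicates_removed(
    initial["pubmed"], initial["embase"], initial["after_dedup"]
)
update_duplicates = duplicates_removed(
    update["pubmed"], update["embase"], update["after_dedup"]
)

# Combined totals across both rounds.
total_title_abstract = initial["after_dedup"] + update["after_dedup"]
total_full_text = initial["full_text"] + update["full_text"]
```

Under these figures, 403 duplicates were removed in the initial search and 51 in the additional search, giving 1204 records for title/abstract screening and 274 for full-text screening across both rounds.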

Table 1 PRISMA flow chart.

Inclusion criteria were: academic publications related to the GDPR’s territorial scope; specifically referring to the absence of consent and/or describing types of scientific research for which obtaining consent is impossible; mentioning measures that should be installed in such a situation. Exclusion criteria were: only describing legislation of non-EU countries; solely factually reflecting current legal policies without adding study results, views, opinions, reflections and/or suggestions for appropriate measures and safeguards; not written in English. Publications were considered to be of sufficiently high quality if they were published in an international peer-reviewed journal. The screening of the articles was performed by two separate assessors (J.S. and M.M.). Disagreements regarding the eligibility of articles were resolved by close deliberation and consensus between the two separate assessors.

Results

A total of 13 soft law instruments (see Table 2) and 26 scientific articles (see Table 3) were included, mentioning measures for processing health data for research purposes without consent. The thematic analysis of the quotes that were retrieved from the included soft law instruments and academic literature resulted in the identification of four overarching themes of suggested measures: organizational measures, technical measures, oversight and review mechanisms, and engagement and participation. Table 4 displays the literal wording of the retrieved quotes, along with their associated overarching themes.

Table 2 Included soft law instruments.
Table 3 Included scientific articles.
Table 4 Identified overarching themes.

Organizational measures

The first overarching theme regards organizational measures. According to the Nuffield Council on Bioethics, when performing data-intensive health research without consent, “additional governance arrangements are usually required.” This could include limiting the use of data through formal agreements such as Data Sharing Agreements, Data Re-use Agreements and Material Transfer Agreements [14]. The term governance is referred to in multiple other documents as well. One example is the requirement of extensive governance to ensure, among other things, that secondary uses are legitimate, for which the principles of transparency and accountability are vital [15]. Another example is the call for responsible data governance, in which the authors argue that data governance policies should not only aim to protect privacy but should also address broader societal issues such as fairness [16].

Of all the organizational measures that were mentioned, transparency was mentioned most often and emerged from the reviewed soft law documents as well as the scientific literature. The importance of clear and transparent policies regarding topics such as “data transfers, feedback of findings, storage of data, (..), re-contact of data subjects, access requests from third parties, access requests of data subjects, governance, and (where applicable) intellectual property and commercial use” was emphasized [4]. Furthermore, it was stated that by “adopting patient-friendly public disclosures relating to privacy safeguards and risks”, “describing how technology is used to safeguard participant data” and by providing “a privacy statement that increases database research transparency and discusses the software used to enhance privacy”, trust and transparency will most likely be promoted [17].

It was argued that a form of respecting patients’ interests is through informing and notifying them [18], and that “for nonconsensual research to be defensible, broader openness and accountability must play an even greater role [2].” It was suggested that effectuating transparency can largely be achieved through publication on websites and social media [19]. Individual notification as well as broad notification through posters, emails, brochures, social media, or web portals were also proposed [18].

In addition, several documents emphasize that patients and/or individuals should be able to exert control over ‘their’ data, that they should be able to express their preferences regarding the processing [20] and that they should be involved in crucial decisions about how their data will be used [21].

Technical measures

The second theme concerns technical measures that can be implemented for the protection of personal data and the rights of the data subject. Data security is regarded not just as an important safeguard against unauthorized access to data, but also against loss, destruction, and modification [15]. In several of the included scientific articles, technical measures are mentioned in conjunction with, or as a part of, a governance structure. For instance, some regard “security and oversight” as one of the main components of data governance [16]. In addition, others state that “proportionate technical and governance measures should be incorporated in the design of data-intensive medical research projects and infrastructures [3]”.

Examples of suggested technical measures are aggregating data [22], de-identifying data [23] and key-coding data [24]. In the preliminary opinion on the European Health Data Space (EHDS) by the European Data Protection Supervisor (EDPS), it is stated that the use of effective encryption should be a baseline requirement for the incorporation of state-of-the-art technical security measures. Furthermore, this document provides in-depth guidance on what should be understood by the term privacy enhancing technologies. For instance, the opinion refers to technologies “enabling to perform operations on encrypted data without having access to the data in clear or performing calculations on distributed data without having access to all data sources or enabling reliable statistical calculations on data where noise has been injected [25]”.
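To make three of the measures named above more concrete, the following minimal Python sketch illustrates key-coding, aggregation and noise injection. It is an illustration under stated assumptions, not a method taken from the reviewed documents: the function names, the example records, the secret key and the parameter `epsilon` are all hypothetical, keyed hashing (HMAC-SHA256) stands in for key-coding, and Laplace noise is only one common way of injecting noise into a count.

```python
import hashlib
import hmac
import random
from collections import Counter

# Hypothetical example records; not taken from any reviewed document.
RECORDS = [
    {"patient_id": "NL-0001", "diagnosis": "diabetes"},
    {"patient_id": "NL-0002", "diagnosis": "asthma"},
    {"patient_id": "NL-0003", "diagnosis": "diabetes"},
]


def key_code(patient_id: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed pseudonym.

    Assumption: HMAC-SHA256 is used as the coding function. Only a party
    holding secret_key can recompute the link between identifier and code.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def aggregate_by(records, field: str) -> Counter:
    """Reduce individual-level records to counts per category."""
    return Counter(record[field] for record in records)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Add Laplace noise with scale 1/epsilon to a count.

    The difference of two independent exponential draws with rate epsilon
    follows a Laplace distribution with scale 1/epsilon.
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Key-coded copy of the records: the direct identifier is replaced by a code.
SECRET_KEY = b"hypothetical-key-held-by-a-trusted-party"
coded_records = [
    {"code": key_code(r["patient_id"], SECRET_KEY), "diagnosis": r["diagnosis"]}
    for r in RECORDS
]

counts = aggregate_by(coded_records, "diagnosis")
released = noisy_count(counts["diabetes"], epsilon=1.0, rng=random.Random())
```

In this sketch a smaller `epsilon` adds more noise and therefore more protection at the cost of accuracy. Whether any of these steps is sufficient in a given project depends on the data and the risks involved, and key-coded data generally remain personal data as long as the key exists.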

Oversight and review mechanisms

Thirdly, many of the reviewed documents state that there should be some form of oversight and/or review when performing data-intensive health research without consent by the data subject. Table 4 shows that different mechanisms are deemed suitable for the task of performing oversight and/or review. Some documents state that oversight or review should be performed by “competent bodies or institutions [26]” or an “authorization body [19]”. (Research) Ethics Committees (RECs) were most often suggested [3, 4, 14, 16, 23, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]. Moreover, several documents mention Data Access Committees (DACs) as the appropriate body for oversight or review [3, 14, 36, 37]. It was asserted that “RECs and DACs have a critical role to play in protecting the rights and interests of data donors and promoting the social value and public good of genomic data sharing [37]”.

Various characteristics were attributed to the designated mechanism for oversight or review, such as “independent, multidisciplinary and pluralist [31]” or “coordinated and well-functioning [37]”. Furthermore, the importance of ensuring that oversight bodies have “adequate expertise” was stressed, meaning that they should possess sufficient knowledge about the processing of (genomic) data and the associated risks [36]. Some of the documents mention specific tasks and goals for the oversight or review mechanisms, for instance to “waive informed consent [28, 30, 32]”, “make an assessment of research proposals [36]”, “ensure that clinical data are used appropriately and only for purposes that will be beneficial to future patients [22]”, to perform “an independent necessity and proportionality test [3]” or to “address the requirements of adopting organizational measures and safeguards when processing personal data [..] [37].”

Multiple documents elaborate on the conditions under which the processing of personal data without consent should be permitted by the oversight or review body. Our analysis revealed that the conditions under which the consent requirement can be waived vary significantly across different documents and/or authors. Often, the acceptability of waiving the consent requirement is contextual and depends on the circumstances of a specific case. For instance, the World Medical Association’s (WMA) Declaration of Helsinki takes into account “exceptional situations where consent would be impossible or impracticable to obtain [27].” Alternatively, the Organization for Economic Co-operation and Development’s (OECD) Guidelines on Human Biobanks and Genetic Research Databases state that “in some jurisdictions, consent may be waived when it cannot be obtained, the risk to the participant is deemed minimal, and the rights and welfare of the participant are not adversely affected. In such cases, the informed consent may be waived by an authorized entity such as a research ethics committee in accordance with the applicable law and ethical principles pertaining to the protection of human subjects and will vary from jurisdiction to jurisdiction [30]”. Furthermore, the 2016 International Ethical Guidelines for Health-related Research Involving Humans of the Council for International Organizations of Medical Sciences (CIOMS) state that research ethics committees may approve a “waiver of informed consent to research if the research would not be feasible or practicable to carry out without the waiver, the research has important social value, and the research poses no more than minimal risks to participants [28].”

Public engagement and participation

The final overarching theme concerns the engagement of the public and the participation of relevant stakeholders in the research process. The reviewed documents prominently show the importance of engaging the public and not just the data subject: many of them emphasize public engagement, community consultation and/or stakeholder participation. It has been stated that “increasing public education about research and specific targeted information provision could promote trust in research processes and safeguards, which in turn could increase the acceptability of research without specific consent [38].”

Many of the reviewed documents suggest that simply providing information about how the data is handled and its intended purposes is inadequate. It was argued that researchers and research institutions should strive for “genuine engagement with stakeholders and public groups” which could include “the possibility of influencing matters, including the direction of research where appropriate [19].” Reciprocity appears to be growing in importance and therefore, continuing public engagement should be upheld “to ensure that the requirements for social license are fulfilled and the research community continues to deserve the trust of society [39].” One of the reviewed documents indicated that the involvement of stakeholders could complement the REC review and assist in legitimizing data research [16].

Discussion

This systematic review of relevant soft law instruments and academic literature resulted in the identification of four overarching themes of measures for performing data-intensive health research without consent. The aim of this review was to contribute to substantiating the GDPR’s requirement of installing suitable and specific measures when invoking the research exemption in Article 9(2)(j) GDPR.

One of the distinctive findings is that many of the reviewed documents recommend subjecting data-intensive health research without consent to review by a REC, DAC or a comparable review mechanism. In most European jurisdictions, obtaining research ethics approval for the (secondary) use of health data for research purposes is currently not a legal requirement [11]. Our research implies that in Member States where approval from an oversight or review mechanism is currently not legally required, proportionate review could be made part of the governance structure of health data research initiatives.

In its opinion on the proposed EHDS, which aims not only to improve access to and quality of healthcare but also to support scientific research, the EDPS emphasizes the importance of ethical data use. The opinion highlights the value of ethics committees and advises that they be taken into account in forthcoming legislation [25]. The benefits of implementing oversight bodies in genetic research specifically are emphasized by the EDPS: “Genetic research in particular has implications not only for the subject of the DNA tests but others in his or her family or with shared characteristics in this and future generations. Independent ethical committees could support the understanding of which activities qualify as genuine research and define the ethical standards referred to in the GDPR [40].”

It appears that the European legislator has already incorporated the EDPS’ views in the design of the Data Governance Act (DGA), which will be applicable from September 2023, and is intended to regulate the re-use of data collected in public institutions. The DGA introduces the concept of data altruism, which is the voluntary disclosure of data by individuals or companies for the common good, including scientific research purposes. The European legislator asserts that for the concept of data altruism to succeed, safeguards such as oversight by ethics councils or boards will ensure that the data controller complies with high standards of scientific ethics [41].

Another role for oversight and review bodies could be to assist in clarifying the role of consent in data-intensive scientific research. It seems that confusion has arisen about the role of consent, because the term “consent” is being used in various regulatory areas without necessarily fulfilling the same purpose [42]. For instance, consent can be used as a legal basis for personal data processing, but it can also serve as an ethical standard and/or safeguard, providing individuals with more choice and control [6, 43]. These different forms and purposes of consent can also be found in the documents that were included for this review. For instance, the Declaration of Helsinki and the CIOMS guidelines (i.a.) contain ethical norms. When those documents refer to consent, they refer to a different kind of consent from the one included in Articles 6 and 9 GDPR. According to the European Data Protection Board (EDPB) these different functions can and should be distinguished [7]. The EDPS is of the opinion that “viewing them as a single and indivisible requirement would be simplistic and misleading” [43]. Deliberations between the research community and data protection experts will be necessary to shape the notion of consent in the future of scientific research. Review and oversight bodies should be included in these deliberations.

Another notable result of our review is the identification of the theme public engagement and participation, which reveals emphasis on the importance of engaging the broader public in scientific research endeavors. Although the GDPR primarily focuses on the protection of the rights of the person whose data is being processed, most of the reviewed soft law instruments and, more prominently, the academic literature indicate that this is not sufficient. The majority of the included literature advocates informing the public, rather than solely informing the individual about (i.a.) data-intensive health research that is being performed without consent, the review processes by ethical oversight bodies, and the outcomes thereof. Furthermore, the reviewed literature seems to underline the importance of not just informing, but also actively involving and engaging the public, and thereby enabling them to genuinely participate in scientific research processes.

At the same time, some of the suggested measures identified in the reviewed soft law instruments and academic literature did not sufficiently clarify the GDPR’s requirement of installing suitable and specific measures when invoking the research exemption in Article 9(2)(j) GDPR. Many of the suggested measures included in the themes technical measures and organizational measures, such as transparency, accountability, data-minimization and pseudonymization, are a mere repetition of legal principles or standards deriving from the GDPR and are in fact applicable to all types of data processing, including situations where consent has been obtained [7].

In addition, many of the reviewed documents recommend a certain measure, such as “transparency” or “data security”, without any further specification or clarification of what those terms constitute or what should be done to promote them. As such, it is unclear whether these documents and authors use the terms “transparency” and “data security” in the same sense as the GDPR does. Moreover, the implementation of measures should be proportionate to, for instance, the risks or the sensitivity of the data. However, in the reviewed documents little attention is paid to the proportionality of the suggested measures.

The lack of specification of a large part of the identified measures impedes the substantiation of the suitable and specific measures requirement when invoking the research exemption. Moreover, it complicates determining whether there indeed is a disparity between ethical and legal requirements. The EDPS has suggested that in the context of the EHDS a gap analysis might be required. This gap analysis will reveal whether there is a need to integrate with other regulatory safeguards provided by, for instance, ethical guidelines [25]. A similar gap analysis in the context of the GDPR could be of value.

This study has potential limitations. The results could be influenced by the exclusion of documents that were not available in full text (see Table 1). Furthermore, it is possible that the search strategy used for the soft law instruments failed to identify all relevant documents. Moreover, by only including documents written in English with global and/or European relevance we might have missed valuable suggestions for specific measures included in, for example, national guidance documents. Future research endeavors could be aimed at exploring measures which are included in documents drafted for specific jurisdictions.

Conclusion

This review has provided us with some valuable insights on how to substantiate the GDPR’s requirement of installing suitable and specific measures in accordance with Article 9(2)(j) GDPR. The results suggest that this could be done, for instance, by making review by a REC or DAC part of the governance structure of health data research initiatives. It is also proposed to inform and engage not only the data subjects, but also different stakeholders and the public regarding the use of health data for research purposes.

This research does not provide sufficient basis to conclude whether it is also desirable to translate the suggestions we have found into legal obligations. This review can provide inspiration, but the results will still need to be reflected on. The mere fact that something is mentioned in soft law instruments or in the academic literature does not necessarily mean it should be turned into law. It would have to be evaluated, for instance, whether the suggested measures can withstand a subsidiarity and proportionality test. Therefore, we strongly encourage the European legislator, the Member States and the EDPB and/or other international ethical/legal guidance committees to further clarify the suitable and specific measures requirement and issue more in-depth guidance on this subject.