The concept of identifiability remains a foundational yet contentious criterion in European Union (EU) data protection law. Similarly, anonymisation has sparked intense debate.
This paper examines recent developments that have shaped the EU’s approaches to identifiability and anonymisation, including trends in the Court of Justice of the European Union (CJEU) case law, national supervisory authority (SA) assessments of anonymisation processes, and the recent European Data Protection Board (EDPB) Opinion 28/2024 addressing the anonymity of artificial intelligence models and EDPB Guidelines 01/2025 on pseudonymisation.
The paper explores how the balance between over-inclusiveness and under-inclusiveness is being calibrated, suggesting the emergence of a functional definition of personal data in CJEU case law. It underscores the importance of the burden of proof in evaluating anonymisation processes, as confirmed by national SA assessments. Finally, it highlights how to ensure consistency between the General Data Protection Regulation (GDPR) and data sharing mandates stemming from the new generation of EU data regulations.
Over the past decade, the European Union (EU) has transformed sustainability into a dense matrix of legally binding environmental, social and governance (ESG) reporting obligations for companies. Compliance increasingly hinges on firms’ ability to collect and verify thousands of datapoints deep into global supply chains – an exercise that is costly, error-prone and may yield non-comparable results. (Semi-)centralised ESG data-sharing arrangements – shared hubs where suppliers post one or more verified sets of sustainability figures that all their customers can reuse – can restore some efficiency by eliminating duplicate requests and supplying standardised, audit-ready inputs, but they also amplify competition-law risk. Drawing on competition law and policy and recent Dutch banking practice, the paper devises a set of legal “firewalls” and access rules that neutralise the collusive potential of the resulting information exchange while safeguarding smaller market players from exclusion. These safeguards are essential to ensure that ESG data collaboration supports – not hinders – the EU’s Twin Transition towards a green and digital economy.
Some of the practices that are believed to enhance the quality of science may produce bias. Studies with unexciting results may never be published, or results may be selectively reported to highlight positive outcomes. Investigators often measure multiple outcomes but report only those with statistically significant findings. The best remedy for this problem is to require prospective declaration of study plans through study registration, including prespecification of primary and secondary outcome variables and data analysis plans. Failure to report results of completed studies remains a serious problem. Further, results from many studies remain unpublished, and the probability of publication is higher for positive results, leading to overestimates of treatment benefit. It is possible that some encouraging clinical trial findings are actually false positive results. Data from a significant portion of relevant completed trials remain undisclosed at the time the pharmaceutical products are under US Food and Drug Administration evaluation.
Recent developments in national health data platforms have the potential to significantly advance medical research, improve public health outcomes, and foster public trust in data governance. Across Europe, initiatives such as the NHS Research Secure Data Environment in England and the Data Room for Health-Related Research in Switzerland are underway, offering examples analogous to the European Health Data Space in two non-EU nations. Policy discussions in England and Switzerland emphasize building public trust to foster participation and ensure the success of these platforms. Central to building public trust is investing effort in developing and implementing public involvement activities. In this commentary, we refer to three national research programs, namely the UK Biobank, Genomics England, and the Swiss Health Study, which implemented effective public involvement activities and achieved high participation rates. The public involvement activities used within these programs are presented against established guiding principles for fostering public trust in health data research. Under this lens, we provide actionable policy recommendations to inform the development of trust-building public involvement activities for national health data platforms.
The COVID-19 pandemic underscored the critical need for timely data and information to aid interventions and decision-making. Efforts by different actors resulted in various data-driven initiatives, generating experience in deploying data in the COVID-19 response and valuable lessons that can advance the sharing and use of data for social good beyond COVID-19. This commentary highlights key case studies detailing the experiences and lessons of those who implemented data science solutions for the COVID-19 response, as well as findings from 74 data-centric COVID-19 interventions. These interventions demonstrated successful data access strategies, productive intervention processes, and effective stakeholder engagement, all of which present potential pathways to overcoming data access obstacles across Africa. Additionally, this study briefly explores three areas for action (i.e., institutions, people, and platforms) that can inform future policy development to increase data sharing for societal benefit in the long term.
Posthumous data use policy, within the broader scope of navigating postmortem data privacy, is a procedurally complex landscape. Our study addresses this by exploring patterns in individuals’ willingness to donate data to health researchers after death and by developing practical recommendations.
Methods:
An electronic survey was conducted in April 2021 among adults (≥18 years of age) registered in ResearchMatch (www.researchmatch.org), a national health research registry. Descriptive statistics were used to observe trends, and multinomial logistic regression analyses with 95% confidence intervals were conducted to determine the association between willingness to donate some, all, or no data to researchers after death and participants’ demographics (education level, age range, duration of using online medical websites, and annual frequency of getting ill); an illustrative analysis sketch follows this abstract.
Results:
Of 399 responses, most participants were willing to donate health data (electronic medical record data [67%], prescription history data [63%], genetic data [54%], and fitness tracker data [53%]) after death. Among 397 respondents, individuals were more likely to donate some data after death (vs. no data) if they had a longer duration of using online medical websites (adjusted relative risk ratio = 1.22, p = 0.04, 95% CI: 1.01 to 1.48). No additional significant associations were observed between willingness to donate all, some, or none of their data after death and other demographic factors.
Conclusions:
Engaging patients in online medical websites may be one mechanism to encourage individuals to participate in posthumous data donation for health research purposes.
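To illustrate the analysis described in the Methods section above, the following is a minimal Python sketch using statsmodels, with hypothetical column names (willingness, web_use_years, education, age_range, illness_freq) and simulated data standing in for the survey; it is a sketch of the technique, not the authors’ analysis code.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical, simulated survey extract: the outcome is willingness to donate
# data after death ("none", "some", "all"); predictors are demographic variables.
rng = np.random.default_rng(0)
n = 399
df = pd.DataFrame({
    "willingness": rng.choice(["none", "some", "all"], size=n),
    "web_use_years": rng.integers(0, 15, size=n),
    "education": rng.integers(1, 6, size=n),
    "age_range": rng.integers(1, 7, size=n),
    "illness_freq": rng.integers(0, 10, size=n),
})

# Multinomial logistic regression with "none" as the reference category.
outcome = pd.Categorical(df["willingness"], categories=["none", "some", "all"])
X = sm.add_constant(df[["web_use_years", "education", "age_range", "illness_freq"]])
result = sm.MNLogit(outcome.codes, X).fit(disp=False)

# Exponentiated coefficients are adjusted relative risk ratios; conf_int()
# yields 95% confidence intervals comparable to those reported above.
print(np.exp(result.params))
print(np.exp(result.conf_int()))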
One of the goals of open science is to promote the transparency and accessibility of research. Sharing data and materials used in network research is critical to these goals. In this paper, we present recommendations for whether, what, when, and where network data and materials should be shared. We recommend that network data and materials should be shared, but access to or use of shared data and materials may be restricted if necessary to avoid harm or comply with regulations. Researchers should share the network data and materials necessary to reproduce reported results via a publicly accessible repository when an associated manuscript is published. To ensure the adoption of these recommendations, network journals should require sharing, and network associations and academic institutions should reward sharing.
Public agencies routinely collect administrative data that, when shared and integrated, can form a rich picture of the health and well-being of the communities they serve. One major challenge is that these datasets are often siloed within individual agencies or programs and using them effectively presents legal, technical, and cultural obstacles. This article describes work led by the North Carolina Department of Health and Human Services (NCDHHS) with support from university-based researchers to establish enterprise-wide data governance and a legal framework for routine data sharing, toward the goal of increased capacity for integrated data analysis, improved policy and practice, and better health outcomes for North Carolinians. We relied on participatory action research (PAR) methods and Deliberative Dialogue to engage a diverse range of stakeholders in the co-creation of a data governance process and legal framework for routine data sharing in NCDHHS. Four key actions were taken as a result of the participatory research process: NCDHHS developed a data strategy road map, created a data sharing guidebook to operationalize legal and ethical review of requests, staffed the Data Office, and implemented a legal framework. In addition to describing how these ongoing streams of work support data use across a large state health and human services agency, we provide three use cases demonstrating the impact of this work. This research presents a successful, actionable, and replicable framework for developing and implementing processes to support intradepartmental data access, integration, and use.
Enabling private sector trust stands as a critical policy challenge for the success of the EU Data Governance Act and Data Act in promoting data sharing to address societal challenges. This paper attributes the widespread trust deficit to the unmanageable uncertainty that arises from businesses’ limited usage control to protect their interests in the face of unacceptable perceived risks. For example, a firm may hesitate to share its data with others in case it is leaked and falls into the hands of business competitors. To illustrate this impasse, competition, privacy, and reputational risks are introduced, respectively, in the context of three suboptimal approaches to data sharing: data marketplaces, data collaboratives, and data philanthropy. The paper proceeds by analyzing seven trust-enabling mechanisms comprising technological, legal, and organizational elements to balance trust, risk, and control, and by assessing their capacity to operate in a fair, equitable, and transparent manner. Finally, the paper examines the regulatory context in the EU and the advantages and limitations of voluntary and mandatory data sharing, concluding that an approach that effectively balances the two should be pursued.
Although the federal government has continued to expand and improve its data sharing policies over the past 20 years, complex challenges remain. Our interviews with U.S. academic genetic researchers (n=23) found that the burden, translation, industry limitations, and consent structure of data sharing remain major governance challenges.
The advent of smart and digital cities is bringing data to the forefront as a critical resource for addressing the multifaceted transitions faced by African cities from rapid urbanization to the climate crisis. However, this commentary highlights the formidable considerations that must be addressed to realize the potential of data-driven urban planning and management. We argue that data should be viewed as a tool, not a panacea, drawing from our experience in modeling and mapping the accessibility of transport systems in Accra and Kumasi, Ghana. We identify five key considerations, including data choice, imperfections, resource intensity, validation, and data market dynamics, and propose three actionable points for progress: local data sharing, centralized repositories, and capacity-building. While our focus is on Kumasi and Accra, the considerations discussed are relevant to cities across the African continent.
Public health authorities (PHAs), including Tribal nations, have the right and responsibility to protect and promote the health of their citizens. Although Tribal nations have the same need and legal authority to access public health data as any other PHA, significant legal challenges continue to impede Tribal data access.
Real-time data on individuals’ location may provide significant opportunities for managing emergency situations. For example, in the case of outbreaks, besides informing on the proximity of people and hence supporting contact tracing activities, location data can be used to understand spatial heterogeneity in virus transmission. However, individuals’ low willingness to share their data, evidenced by the low penetration rate of contact tracing apps in several countries during the coronavirus disease-2019 (COVID-19) pandemic, re-opened the scientific and practitioner discussion on the factors and conditions that lead citizens to share their positioning data. Following the Antecedents → Privacy Concerns → Outcomes (APCO) model, and drawing on Privacy Calculus and Reasoned Action Theories, the study investigates the factors that lead university students to share their location data with public institutions during outbreaks. To this end, an explanatory survey was conducted in Italy during the second wave of COVID-19, collecting 245 questionnaire responses. Structural equation modeling was used to jointly investigate the role of trust, perceived benefit, and perceived risk as determinants of the intention to share location data during outbreaks. Results show that respondents’ trust in public institutions, the perceived benefits, and the perceived risk are significant predictors of the intention to disclose personal tracking data to public institutions. Results indicate that the latter two factors affect university students’ willingness to share data more than trust, prompting public institutions to rethink how they launch and manage the adoption process for these technological applications.
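As a rough illustration of the structural equation modeling approach described above, here is a minimal Python sketch that assumes the semopy package, hypothetical survey items (t1–t3, b1–b3, r1–r3, i1–i2), and simulated data; the specification is a simplified stand-in for the APCO-based model estimated in the study, not the authors’ code.

import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(42)
n = 245  # same sample size as the survey described above

# Simulate latent constructs and the hypothetical items that measure them.
trust, benefit, risk = (rng.normal(size=n) for _ in range(3))
intention = 0.2 * trust + 0.5 * benefit - 0.4 * risk + rng.normal(scale=0.5, size=n)

data = {}
for name, factor in [("t", trust), ("b", benefit), ("r", risk)]:
    for j in (1, 2, 3):
        data[f"{name}{j}"] = factor + rng.normal(scale=0.4, size=n)
for j in (1, 2):
    data[f"i{j}"] = intention + rng.normal(scale=0.4, size=n)
df = pd.DataFrame(data)

# Measurement model plus the structural paths of interest: intention to share
# location data regressed on trust, perceived benefit, and perceived risk.
desc = """
trust =~ t1 + t2 + t3
benefit =~ b1 + b2 + b3
risk =~ r1 + r2 + r3
intention =~ i1 + i2
intention ~ trust + benefit + risk
"""

model = semopy.Model(desc)
model.fit(df)
print(model.inspect())  # path estimates, standard errors, and p-values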
Chapter 3 shows why the contracts model doesn’t work: consent is absent in the information economy. Privacy harm can’t be seen as a risk that people accept in exchange for a service. Inferences, relational data, and de-identified data aren’t captured by consent provisions. Consent is unattainable in the information economy more broadly because the dynamic between corporations and users is plagued with uneven knowledge, inequality, and a lack of choices. Data harms are collective and unknowable, making individual choices to reduce them impossible. Worse, privacy has a moral hazard problem: corporations have incentives to behave against our best interests, creating profitable harms after obtaining agreements. Privacy’s moral hazard leads to informational exploitation. One manifestation of valid consent in the information economy is consent refusals. We can consider them by thinking of people’s data as part of them, as their bodies are.
This article focuses on copyright issues pertaining to generative artificial intelligence (AI) systems, with particular emphasis on the ChatGPT case study as a primary exemplar. In order to generate high-quality outcomes, generative AI systems require substantial quantities of training data, which may frequently comprise copyright-protected information. This prompts inquiries into the legal principles of fair use, the creation of derivative works and the lawfulness of data gathering and utilisation. The utilisation of input data for the purpose of training and enhancing AI models presents significant concerns regarding potential violations of copyright. This paper offers suggestions for safeguarding the interests of copyright holders and competitors, while simultaneously addressing legal challenges and expediting the advancement of AI technologies. This study analyses the ChatGPT platform as a case example to explore the necessary modifications that copyright regulations must undergo to adequately tackle the intricacies of authorship and ownership in the realm of AI-generated creative content.
Early in the pandemic, pre-print servers accelerated evidence sharing. A collaborative of major medical journals supported their use to ensure equitable access to scientific advancements. In the intervening three years, we have made major advances in the prevention and treatment of COVID-19 and learned about the benefits and limitations of pre-prints as a mechanism for sharing and disseminating scientific knowledge.
Pre-prints increase attention and citations and ultimately influence policy, often before findings are verified. Evidence suggests that pre-prints contain more spin than peer-reviewed publications. Clinical trial findings posted on pre-print servers do not change substantially following peer review, but other study types (e.g., modeling and observational studies) often undergo substantial revision or are never published.
Nuanced policies about sharing results are needed to balance rapid implementation of true and important advancements with accuracy. Policies recommending immediate posting of COVID-19-related research should be re-evaluated, and standards for evaluation and sharing of unverified studies should be developed. These may include specifications about what information is included in pre-prints and requirements for certain data quality standards (e.g., automated review of images and tables); requirements for code release and sharing; and limiting early postings to methods, results, and limitations sections.
Academic publishing needs to innovate and improve, but assessment of evidence quality remains a critical part of the scientific discovery and dissemination process.
The coronavirus disease-2019 (COVID-19) pandemic has led to the irrational use of drugs in the absence of clinical management guidelines. Access to individual participant data (IPD) from clinical trials aids evidence synthesis. We undertook a rapid review to infer IPD sharing intentions based on data availability statements by the principal investigators (PIs) of drug and vaccine trials in the context of COVID-19.
Searches were conducted on PubMed (NCBI). We considered randomized controlled trial (RCT) publications from January 1, 2020, to October 31, 2021. IPD sharing intentions were inferred from the data availability statements in the full-text manuscript publications. We included 180 articles: 81.7% (147/180) were publications of RCT findings alone, 12.8% (23/180) were protocol publications alone, and 5.6% (10/180) of the RCTs had both a published protocol and a publication of the trial findings. We report IPD sharing intentions separately for RCT protocol publications (n = 23 + 10) and publications of RCT findings (n = 147 + 10). Among RCT protocol publications, one-third (11/33) of the PIs intended to share IPD, whereas over half of the PIs (52.2%, 82/157) intended to share IPD in their published RCT findings. However, information about IPD sharing was missing for 57.6% (19/33) of RCT protocols and 38.2% (60/157) of published RCT findings.
Stakeholders must work together to ensure that overarching factors, such as legislation that governs clinical trial practices, are streamlined to bolster IPD sharing mechanisms.
Qualitative research provides an excellent opportunity to study digitalization. The purpose of this chapter is to explore the digitalization of government services by studying the longitudinal development of data-sharing practices across different parts of government in the United Kingdom. This chapter reports on a unique, qualitative, interpretive field study based on the author’s role as a participant observer and his analysis of the discourse and contents of the various documents presented in relation to both the creation and running of data-sharing practices in the United Kingdom. The chapter finds that despite government addressing many of the concerns identified in the literature on data sharing, practical and perceptual issues remain – issues that tell us much about the state of digitalization of government services.
Data sharing is a requisite for developing data-driven innovation and collaboration at the local scale. This paper aims to identify key lessons and recommendations for building trustworthy data governance at the local scale, involving both the public and private sectors. Our research is based on the experience gained in Rennes Metropole since 2010 and focuses on two thematic use cases: culture and energy. For each one, we analyzed how the power relations between actors and the local public authority shape the modalities of data sharing and exploitation. The paper elaborates on challenges and opportunities at the local level, placing them in perspective with national and European frameworks.
This article provides a critical review of new policies in China, the United States, and the European Union that characterize genomic data as a national strategic resource. Specifically, we review policies that regulate human genomic data for economic, national security, or other strategic purposes rather than ethical or individual rights purposes.