Big data
Table of contents
- HMA/EMA Big Data Steering Group
- Metadata list describing real world data
- Data quality framework for EU medicine regulation
- Data standardisation strategy
- Research projects
- Darwin EU
- Pilot on using raw data in medicine evaluation
- Progress updates
- Work of the former HMA/EMA Big Data Task Force
- Meetings and workshops
- Veterinary big data
- Data protection
- International collaboration on real-world evidence
Massive amounts of data are generated on a daily basis that could potentially be harnessed to support medicines regulation. The European Medicines Agency (EMA) and Heads of Medicines Agencies (HMA) set up a joint task force to describe the big data landscape from a regulatory perspective and identify practical steps for the European medicines regulatory network to make best use of big data in support of innovation and public health in the European Union (EU).
'Big data' is a widely-used term without a commonly-accepted definition. The HMA/EMA Big Data Task Force defined big data as ‘extremely large datasets which may be complex, multi-dimensional, unstructured and heterogeneous, which are accumulating rapidly and which may be analysed computationally to reveal patterns, trends, and associations. In general, big data sets require advanced or specialised methods to provide an answer within reliable constraints’.
A single dataset may not strictly meet the definition of big data but, when pooled or linked with other datasets, it may become sufficiently large or complex to assume the characteristics of big data. Sources include real-world data (such as electronic health records, insurance claims data and data from patient registries), genomics, clinical trials, spontaneous adverse drug reaction reports, social media and wearable devices.
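As a loose illustration of what linking datasets can mean in practice, the sketch below joins two small, made-up datasets on a shared pseudonymised patient key. All field names and records are hypothetical and are not taken from any EMA or HMA system; real record linkage involves governance, data protection and data quality steps that are not shown here.

```python
from collections import defaultdict

# Hypothetical records; field names and values are invented for illustration.
health_records = [
    {"patient_id": "P1", "diagnosis": "hypertension"},
    {"patient_id": "P2", "diagnosis": "type 2 diabetes"},
]
claims = [
    {"patient_id": "P1", "medicine": "medicine A"},
    {"patient_id": "P1", "medicine": "medicine B"},
    {"patient_id": "P3", "medicine": "medicine C"},
]


def link_by_patient(records, claim_rows):
    """Attach each patient's claimed medicines to their health record."""
    # Group claimed medicines by the shared patient key.
    medicines_by_patient = defaultdict(list)
    for claim in claim_rows:
        medicines_by_patient[claim["patient_id"]].append(claim["medicine"])
    # Keep every health record; patients without claims get an empty list.
    return [
        {**record, "medicines": medicines_by_patient.get(record["patient_id"], [])}
        for record in records
    ]


linked = link_by_patient(health_records, claims)
```

Each linked record now combines a diagnosis with the medicines claimed for the same patient, which neither source could provide on its own.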
Medicines regulators will increasingly use insights derived from big data to assess the benefit-risk of medicines across their lifecycle.
The joint HMA/EMA Big Data Steering Group advises the EMA Management Board and HMA on prioritisation and planning of actions to implement the ten priority recommendations in the Big Data Task Force final report (phase two).
The Steering Group began its work in May 2020. It is co-chaired by Jesper Kjær, Director of Data Analytics Centre at the Danish Medicines Agency, and Peter Arlett, Head of Data Analytics and Methods at EMA.
The Steering Group reviews the workplan annually to cover any new emerging topics. It last updated the workplan in July 2022.
The workplan aims to increase the utility of big data in regulation, from data quality through study methods to assessment and decision-making. It is patient-focused and guided by advances in science and technology.
Implementation of the workplan will be flexible and certain actions may be re-scheduled, since the European medicines regulatory network has to prioritise the unprecedented public health challenge of the Coronavirus disease (COVID-19) pandemic.
- Clusters of Excellence Discussion Paper (PDF/357.71 KB). First published: 13/03/2023
- Workplan 2022-2025 - HMA / EMA joint Big Data Steering Group (PDF/306.05 KB). First published: 28/07/2022
- Big Data Steering Group (BDSG): 2022 report (PDF/281.09 KB). First published: 13/01/2023. Last updated: 18/01/2023. EMA/826814/2022
- Big Data Steering Group (BDSG): 2021 report (PDF/382.67 KB). First published: 15/03/2022. EMA/40400/2022
- Workplan 2021-2023 - HMA / EMA joint Big Data Steering Group (PDF/306.52 KB). First published: 27/08/2021
- Workplan 2020 - HMA / EMA joint Big Data Steering Group (PDF/1.08 MB). First published: 14/09/2020. Last updated: 30/09/2020
- Big Data Steering Group (BDSG): 2020 report (PDF/528.47 KB). First published: 12/03/2021. EMA/48625/2021
- Mandate - HMA / EMA joint Big Data Steering Group (PDF/450.26 KB). First published: 14/09/2020. EMA/95333/2020
- Membership list - HMA / EMA joint Big Data Steering Group (PDF/141.89 KB). First published: 11/09/2020. Last updated: 17/03/2023
A list of metadata describing real-world data sources and studies is available below to help pharmaceutical companies and researchers to identify and use such data when investigating the use, safety and effectiveness of medicines.
Real-world data are observational data stored in repositories such as electronic health records and disease registries. Making use of these data sources can improve the evidence available to support benefit-risk decisions and facilitate getting better medicines to patients.
This metadata list will feed into two future EU catalogues on real-world data sources and studies:
- The catalogue of data sources will cover information on real-world databases, and is due to replace the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) catalogue in late 2023
- The catalogue of studies will cover studies performed on the data sources, enhancing and replacing the European Union electronic register of post-authorisation studies (EU PAS Register)
The catalogues have the following aims:
- Help regulators, researchers and pharmaceutical companies identify studies and data sources suitable to address research questions, based on the so-called ‘FAIR’ (findable, accessible, interoperable and reusable) data principles
- Boost transparency of observational studies
- Improve the ability of the aforementioned stakeholders to assess evidence from observational studies and real-world data sources
The HMA/EMA Big Data Steering Group adopted the metadata list in June 2022.
Improving the discoverability of data sources via these EU catalogues is a priority for the HMA-EMA joint Big Data Task Force. It is also in line with the European medicines agencies network strategy to 2025.
A good practice guide for the use of real-world metadata was available for public consultation until 16 November 2022.
This draft guide aims to help regulators, data holders, researchers, pharmaceutical companies and other interested stakeholders to use the catalogue of data sources that will replace the currently available ENCePP catalogue.
For instance, it provides recommendations on how to identify suitable real-world data sources for studies, and describes the required metadata elements.
A draft data quality framework for medicine regulation was available for public consultation until 18 November 2022.
This guidance document sets out the criteria for a more consistent and standardised approach to the quality of data used in medicine regulation to support benefit-risk decisions.
It is meant to:
- help identify, define and further develop data quality assessment procedures and recommendations for current and novel data types;
- support pharmaceutical companies and other stakeholders in selecting data sources for their studies;
- ensure the trust of patients and healthcare professionals in data-driven regulatory decision-making.
The data quality framework was co-produced by EMA, the Heads of Medicines Agencies (HMA) and the Joint Action Towards the European Health Data Space (TEHDAS).
Establishing this framework is a key element in the HMA-EMA Big Data Steering Group workplan 2022-2025.
A list of metadata describing real-world data has also been made available for public consultation. For more information, see Metadata list describing real world data.
The European medicines regulatory network's data standardisation strategy sets out principles to guide the definition, adoption and implementation of international data standards by the network.
It aims to:
- enable quicker uptake of international data standards across the EU;
- improve data quality;
- enable data linkage and data analysis to support medicine regulation.
The strategy is a key deliverable of the Big Data Steering Group workplan.
EMA and HMA published the strategy in December 2021 and will maintain it over time to reflect any changing priorities or new requirements.
EMA has contracted several institutions to conduct research projects collecting and analysing real-world data from clinical practice to help monitor the safety and effectiveness of medicines.
For research projects related to COVID-19, see Treatments and vaccines for COVID-19: post-authorisation.
EMA is establishing a coordination centre to provide timely and reliable evidence on the use, safety and effectiveness of medicines for human use, including vaccines, from real world healthcare databases across the EU.
This capability is called the Data Analysis and Real World Interrogation Network (DARWIN EU®).
Through a proof-of-concept pilot, selected applicants can submit 'raw data' to EMA as part of their initial and post-authorisation marketing authorisation applications.
Raw data refers to individual patient data from clinical trials. These include:
- clinical laboratory results;
- imaging data;
- patient medical charts.
Currently, applicants are submitting data in an aggregated format as clinical summaries or as individual patient data in PDF listings. This can hinder data analysis and slow down the evaluation process.
In contrast, raw data are stored in electronic structured format. This enables regulators to more easily visualise and analyse the data if needed.
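To illustrate why a structured electronic format is easier to analyse than a PDF listing, the sketch below reads a small table of laboratory results and summarises it by study arm in a few lines. The column names, values and CSV layout are hypothetical; they do not represent the format used in the pilot.

```python
import csv
import io
from statistics import mean

# Hypothetical structured laboratory results; columns and values are invented.
RAW_CSV = """patient_id,arm,alt_u_per_l
P1,treatment,32
P2,treatment,41
P3,placebo,29
P4,placebo,35
"""


def mean_by_arm(csv_text, value_column):
    """Return the mean of value_column for each study arm."""
    values_by_arm = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        values_by_arm.setdefault(row["arm"], []).append(float(row[value_column]))
    return {arm: mean(values) for arm, values in values_by_arm.items()}


summary = mean_by_arm(RAW_CSV, "alt_u_per_l")
```

The same values presented as a PDF listing would first need to be extracted or retyped before any such summary could be computed.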
The pilot aims to assess whether using raw data can help speed up and improve the medicine-evaluation process. The goal is to give patients faster and better-informed access to innovative medicines.
EMA launched the pilot in July 2022.
It will run for up to two years and include approximately ten regulatory procedures submitted to EMA from September 2022.
For any queries and to apply to take part in the pilot, write to rawdatapilot@ema.europa.eu.
The pilot is a key activity under the priority recommendations of the HMA/EMA Big Data Task Force. It relates to the priority recommendation on building network capability to analyse data.
- Information about the raw data proof-of-concept pilot for industry (PDF/132.78 KB). First published: 12/07/2022. Last updated: 28/10/2022. EMA/174598/2022
- Application of EMA’s transparency principles to the raw data proof-of-concept pilot (PDF/243.63 KB). First published: 01/02/2023. EMA/949891/2022
Further information is available to support pharmaceutical companies with their participation in EMA's raw data pilot.
The documents include:
- a questions and answers document on the raw data pilot;
- a participation letter to confirm pilot participation for a specific regulatory procedure;
- a cover letter for pilot participants to attach to their data packages.
- Questions and Answers about the raw data proof-of-concept pilot for industry (PDF/269.16 KB). First published: 28/10/2022. Last updated: 08/03/2023. EMA/658116/2022
- Pilot participation letter (DOCX/309.81 KB). First published: 28/10/2022. EMA/659352/2022
- Raw data submission cover letter template (DOCX/111.89 KB). First published: 28/10/2022. EMA/658203/2022
For information on data protection in the raw data proof-of-concept pilot, see:
EMA’s newsletter, published every three months, provides an update on progress in implementing the workplan of the HMA-EMA Big Data Steering Group.
- Big Data highlights - Issue 5 (PDF/639.6 KB). First published: 31/03/2023
- Big Data highlights - Issue 4 (PDF/936.3 KB). First published: 14/12/2022
- Big Data highlights - Issue 3 (PDF/672.36 KB). First published: 06/08/2022
- Big Data highlights - Issue 2 (PDF/1.29 MB). First published: 24/05/2022
- Big Data highlights - Issue 1 (PDF/1.07 MB). First published: 15/02/2022
The HMA/EMA Big Data Task Force operated from 2017 until December 2019 to report on the challenges and opportunities posed by big data in medicines regulation. It carried out its work in two phases.
In phase one, the task force:
- reviewed the landscape of big data from a regulatory perspective and identified opportunities for improvements in the operation of medicines regulation;
- performed online surveys of national regulatory agencies and the pharmaceutical industry on perspectives, expertise and challenges. This helped develop an understanding of the challenges and the current state of expertise in the regulatory network.
In phase two, the task force made practical recommendations to inform strategic decision-making and planning by the HMA and EMA and to contribute to the European medicines regulatory network's work on developing a five-year EU Network Strategy to 2025.
The task force was composed of experienced medicines regulators and data experts appointed by the national competent authorities, EMA and the European Commission (EC). For more information, see HMA/EMA Big Data Task Force.
For information on related meetings and workshops, see:
- Multi-stakeholder workshop on Real World Data (RWD) quality and experience in use of Real World Evidence (RWE) for regulatory decision-making (26-27/06/2023)
- Second bi-annual Big Data Steering Group and industry stakeholders meeting (03/11/2022)
- EMA/HMA Big Data Stakeholder Forum 2022 (01/12/2022)
- Big Data Steering Group and industry stakeholders meeting (30/05/2022)
- Data quality framework for medicines regulation (07/04/2022)
- EU Big Data Stakeholder Forum (07/12/2021)
- Learnings initiative webinar for optimal use of big data for regulatory purpose (30/11/2021)
- Veterinary Big Data stakeholder forum (01-02/06/2021)
- Data Standardisation Strategy stakeholder workshop (18/05/2021)
- Joint HMA/EMA workshop on artificial intelligence in medicines regulation (19-20/04/2021)
- Technical workshop on real-world metadata for regulatory purposes (12/04/2021)
- EU big data stakeholder virtual forum (15/12/2020)
- Heads of Medicines Agencies (HMA) / European Medicines Agency (EMA) Joint Big Data Task Force meeting: identifying solutions for big data challenges (04/05/2018)
- Workshop on identifying opportunities for 'big data' in medicines development and regulatory science (14-15/11/2016)
EMA and HMA established the veterinary big data initiative to explore the use of new digital technologies in key veterinary regulatory activities.
It takes account of the increasing amount of data generated via new digital systems put in place to implement the Veterinary Medicinal Products Regulation.
A European veterinary big data strategy sets out how the European medicines regulatory network intends to implement this initiative.
EMA is preparing dedicated guidance on the impact of EU data protection legislation on the secondary use of health data in support of the development, evaluation and supervision of medicines.
The aim is to help medicine developers, data providers and research bodies comply with EU data protection rules, and to help patients and consumers understand their rights and the existing safeguards to protect personal data.
Secondary use of data refers to the use of data for a different purpose than the one for which it was originally collected. It typically involves the use of electronic health records, health insurance claims data, registry data or drug consumption data for medicines research and public health purposes.
The guidance will cover various operational scenarios, including the development of medicines, the evaluation of marketing authorisation applications and post-authorisation safety monitoring.
By July 2020 EMA had gathered input from patients and consumers as data contributors as well as from medicines developers, research-performing and research-supporting infrastructures and other data providers (e.g. payers of healthcare).
In September 2020, stakeholders discussed with EMA the key questions concerning the application of the General Data Protection Regulation (GDPR) in the health sector and the secondary use of health data for medicines and public health purposes:
- Workshop on the application of the General Data Protection Regulation (GDPR) in the area of health and Secondary Use of Data for Medicines and Public Health Purposes
- Workshop on the General Data Protection Regulation (GDPR) and secondary use of data for medicines and public health purposes
EMA aims to finalise the guidance in consultation with the European Commission and the European Data Protection Supervisor (EDPS) in the last quarter of 2021. It will take into account stakeholder input and guidance from the EDPS on the processing of health data for research.
Ensuring that personal data are managed and analysed within a secure and ethical governance framework in compliance with EU data protection legislation is one of the recommended priorities of the HMA/EMA Big Data Task Force.
EU data protection legislation includes:
- Regulation (EU) 2016/679, known as the General Data Protection Regulation (GDPR), which applies to private and public entities in the Member States;
- Regulation (EU) 2018/1725, known as the EU Data Protection Regulation (EUDPR), which applies to all EU institutions and bodies.
EMA and the International Coalition of Medicines Regulatory Authorities (ICMRA) work together to help integrate real-world evidence into regulatory decision-making across the world.
ICMRA held a workshop for regulators to share experience in obtaining and using real-world evidence for the assessment of medicines, and issued a pledge in July 2022 to foster global efforts in this area.