The ability to inform regulatory decision-making hinges on trust in the underlying data. The Data Analysis and Real World Interrogation Network (DARWIN EU®), initiated by the European Medicines Agency (EMA), is founded on this principle, employing a structured and systematically executed Data Quality Assessment process to build a reliable network of high-quality real-world databases. A recent poster, co-authored by experts from The Hyve, details this critical function.

The assessment runs when a new database is onboarded and again after each new release from an existing partner in the network. The workflow relies on a suite of tools developed by the OHDSI community and under the frameworks of IMI EHDEN and DARWIN EU®, and proceeds in three steps:
- Data Analysis: Source data, mapped to the OMOP CDM, is processed using tools like CdmOnboarding and the DataQualityDashboard (DQD). These tools generate intrinsic metadata and perform quality checks.
- Review and Collaboration: The outputs are reviewed by the Network Operations team. Results, such as the Onboarding Report and DQD Shiny app visualizations, are discussed with the respective Data Partners to understand the cause of any potential data quality (DQ) issues.
- Systematic Tracking and Prioritization: All findings are documented in a central Data Quality Tracker. Each issue is categorized and assigned an expected impact on study execution (high, medium, or low) so that resolution can be prioritized. This creates a transparent and actionable feedback loop with the Data Partner.
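To make the tracking and prioritization step concrete, here is a minimal Python sketch of how tracker entries could be modelled and ordered by expected impact. It is an illustration only: the `Issue` fields, the `Impact` enum, the `prioritize` helper, and the sample entries are all hypothetical and do not reflect the actual structure of the DARWIN EU® Data Quality Tracker.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    """Expected impact on study execution; a lower value sorts first."""
    HIGH = 0
    MEDIUM = 1
    LOW = 2


@dataclass
class Issue:
    database: str   # hypothetical data source name
    release: str    # hypothetical release label
    category: str   # hypothetical category; the real taxonomy is not described here
    impact: Impact
    resolved: bool = False


def prioritize(issues):
    """Return the open issues, highest expected impact first."""
    open_issues = [issue for issue in issues if not issue.resolved]
    return sorted(open_issues, key=lambda issue: issue.impact)


# Made-up entries for illustration only.
tracker = [
    Issue("db_a", "2023-01", "mapping", Impact.LOW),
    Issue("db_a", "2023-01", "plausibility", Impact.HIGH),
    Issue("db_b", "2023-02", "completeness", Impact.MEDIUM, resolved=True),
    Issue("db_b", "2023-02", "conformance", Impact.MEDIUM),
]

for issue in prioritize(tracker):
    print(issue.impact.name, issue.database, issue.category)
```

Using an ordered enum for the impact level keeps the sort key explicit, so resolved issues drop out of the work queue while the open high-impact ones rise to the top.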
Demonstrating Progress Through Continuous Assessment
Since the start of DARWIN EU®, this process has been applied consistently, and more than 750 issues have been documented in the tracker. Of these, 54.9% have been resolved. The process shows clear maturation over time: for database releases from 2022, nearly 85% of issues have been resolved. Furthermore, high-impact issues are prioritized and closed at a higher rate than lower-impact issues, so the most critical problems are addressed first.
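Figures of this kind can be derived from tracker records with a simple grouped calculation. The sketch below shows one way to compute the percentage of resolved issues per release year or per impact level. The record layout, the `resolution_rates` function, and the sample data are assumptions made for illustration; the numbers are invented and are not DARWIN EU® results.

```python
from collections import defaultdict


def resolution_rates(issues, key):
    """Percentage of resolved issues per group; `key` names the grouping field."""
    totals = defaultdict(int)
    resolved = defaultdict(int)
    for issue in issues:
        group = issue[key]
        totals[group] += 1
        if issue["resolved"]:
            resolved[group] += 1
    return {group: round(100 * resolved[group] / totals[group], 1) for group in totals}


# Made-up records for illustration only; not DARWIN EU figures.
issues = [
    {"release_year": 2022, "impact": "high", "resolved": True},
    {"release_year": 2022, "impact": "low", "resolved": True},
    {"release_year": 2023, "impact": "high", "resolved": True},
    {"release_year": 2023, "impact": "medium", "resolved": False},
    {"release_year": 2023, "impact": "low", "resolved": False},
]

print(resolution_rates(issues, "release_year"))
print(resolution_rates(issues, "impact"))
```

Grouping by release year shows how closure progresses as releases age, while grouping by impact shows whether the most critical issues are indeed closed first.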
Ensuring Data is Fit-for-Purpose
Ultimately, the goal is to ensure that the data within the network is reliable and fit for purpose for any given study. The Data Quality Tracker, along with other database metadata, informs the Study Operations team about known issues and the overall reliability of each data source. This transparent, continuous assessment and its clear documentation are key pillars of trusted data collaboration that can confidently support regulatory science.