
Customer
Kythera Labs is a healthcare data technology company focused on unlocking the full value of real-world data (RWD) by making it more accessible, reliable, and actionable. Through its Wayfinder platform, Kythera integrates advanced data engineering, patient-centric remastering, and scalable analytics pipelines to transform complex, multi-source healthcare data into more comprehensive and research-ready patient journeys for life sciences, biotech, and healthcare organizations.
The Challenge
The promise of RWD is often challenged by inconsistent formats, vocabularies, and structural variations across datasets. When used for research, this fragmentation diminishes the ability of researchers to unlock the full value of RWD and effectively conduct large-scale observational studies.
Kythera and its customers found this fragmentation made it nearly impossible to construct a single, coherent view of a patient. Answering critical research questions was slow and inefficient. Kythera needed to empower its life sciences customers with a unified, FAIR (Findable, Accessible, Interoperable, and Reusable) dataset built from U.S.-based RWD sources (medical claims, lab data, and two different sources of electronic health records, or EHRs) that was reliable, scalable, and ready for advanced analysis.
How We Solved It
With the goal of empowering life sciences organizations with comprehensive, research-ready patient journeys, Kythera chose The Hyve as its partner in transforming its vast multi-source data into the Observational Medical Outcomes Partnership (OMOP) Common Data Model. This multi-source, single-model strategy avoided brittle, one-off pipelines and aligned with the standard framework used by life sciences organizations.
Our collaboration was a fusion of platform power and domain expertise:
- Laying the Groundwork: Kythera leveraged its powerful Wayfinder platform, built on Databricks, to handle the heavy lifting with Spark-native, distributed serverless computing. They first de-identified and tokenized all source data to ensure patient privacy. Raw information, like disparate claim lines, was structured into patient-level event records, while other data was organized into a staging layer before transformation.
- Harmonizing the Data: This is where our specialized expertise became critical. The Hyve collaborated closely with Kythera’s team to develop sophisticated semantic and code mapping logic. We translated diverse, source-specific vocabularies into the standard OMOP vocabularies, targeting the OMOP CDM version 5.4 schema, to ensure a high-fidelity OMOP translation in which a diagnosis in a claim means the same thing as a diagnosis in an EHR.
- Ensuring Quality at Scale: The Hyve delivered mapping validation and profiling strategies, and Kythera integrated and optimized the OHDSI QA tools Achilles and the Data Quality Dashboard (DQD) within the Wayfinder platform. The joint approach enabled efficient execution across hundreds of millions of records, streamlining data characterization and ensuring OMOP dataset quality, consistency, and reliability at scale.
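To make the harmonization and validation steps above more concrete, here is a minimal Python sketch of source-to-standard code mapping. It is an illustration only, not the mapping logic used in the project: the lookup table, the `EHR_LOCAL` vocabulary name, the `SourceEvent` type, and the concept IDs are all hypothetical placeholders. The one convention taken from OMOP itself is that unmapped codes receive concept ID 0.

```python
from dataclasses import dataclass

# Hypothetical lookup: (source vocabulary, source code) -> standard concept ID.
# The IDs are made-up placeholders, not real OMOP vocabulary entries.
SOURCE_TO_STANDARD = {
    ("ICD10CM", "E11.9"): 1001,   # diagnosis code as it might appear in a claim
    ("EHR_LOCAL", "DM2"): 1001,   # same diagnosis in an assumed local EHR code set
    ("ICD10CM", "I10"): 1002,
}

# OMOP convention: concept ID 0 means "no matching concept".
UNMAPPED_CONCEPT_ID = 0


@dataclass(frozen=True)
class SourceEvent:
    """One de-identified, patient-level event from a source dataset."""
    patient_token: str
    vocabulary: str
    code: str


def to_standard_concept(event: SourceEvent) -> int:
    """Return the standard concept ID for an event, or 0 if unmapped."""
    key = (event.vocabulary, event.code.strip().upper())
    return SOURCE_TO_STANDARD.get(key, UNMAPPED_CONCEPT_ID)


def mapping_coverage(events: list[SourceEvent]) -> float:
    """Fraction of events that map to a standard concept (a simple QA metric)."""
    if not events:
        return 0.0
    mapped = sum(1 for e in events if to_standard_concept(e) != UNMAPPED_CONCEPT_ID)
    return mapped / len(events)


events = [
    SourceEvent("tok-a", "ICD10CM", "E11.9"),
    SourceEvent("tok-a", "EHR_LOCAL", "dm2"),
    SourceEvent("tok-b", "EHR_LOCAL", "XYZ"),  # no mapping available
]
concept_ids = [to_standard_concept(e) for e in events]
coverage = mapping_coverage(events)
```

In this sketch the claim code and the local EHR code resolve to the same concept ID, which is the property the harmonization step aims for, while the coverage figure is the kind of simple check that mapping validation might track. At production scale this lookup would more plausibly be a distributed join against the vocabulary tables than an in-memory dictionary.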
The Outcome
The project successfully transitioned Kythera from a multi-model approach to a unified, multi-source, single-model strategy, establishing a replicable model for operationalizing OMOP at scale. The resulting OMOP dataset includes over 300 million unique patients, providing a comprehensive resource for reliable research and analysis.
This remastered data asset now enables the use of standardized, reusable queries built upon a robust data infrastructure. Critical data quality and characterization tasks now complete in hours rather than days.
Key outcomes include:
- A Unified, Scalable Data Asset: A massive, multi-source dataset was successfully standardized, creating a robust foundation for advanced, reusable analytics.
- Accelerated Data Processing: Optimizing transformation pipelines and embedding QA tools directly into the Wayfinder platform significantly reduced cycle times for data profiling and validation. This accelerated multiple rounds of QA, transformation tuning, and cohort development to deliver higher-quality outputs and a shorter path from ingestion to analysis.
- Enhanced Data Governance and Security: The new system incorporates transparent audit trails for full data lineage and utilizes Delta Sharing for secure, privacy-preserving collaboration with external partners, eliminating the need for data duplication.
This collaboration marks a significant step in Kythera’s mission to deliver clean, connected, and standards-aligned real-world data for life sciences research. OMOP transforms fragmented healthcare data into a unified, analyzable asset, eliminating the months typically spent preparing disparate datasets. With this foundation, life sciences organizations can immediately focus on generating insights that drive discovery, evidence generation, and commercial impact. By combining Kythera’s data engineering at scale with community-standard tools, this initiative enables faster research cycles, stronger outcomes, and accelerated time to value.