Building a knowledge graph to unlock immuno-oncology and cell therapy data

The challenge

Many companies and research institutions struggle to integrate their research data. The data lives in siloed systems and vendor solutions and is scattered across departments. Experimental results are communicated via email, SharePoint, or PowerPoint presentations, and there is often no shared standard that experimental data conforms to. This is a missed opportunity: annotating and integrating data enables scientists to answer broader research questions and therefore increases the value of their data.

The Hyve has helped solve this challenge for a global, top-10 pharmaceutical company with drug discovery and development programs in several therapeutic areas. Specifically, we built a semantic model and knowledge graph for their Immuno-Oncology & Cell Therapy (IOCT) domain.

How we solved it

We tackled the data-integration issue by first creating a semantic model that captures the IOCT research and business domain. This model then served as the foundation for generating a knowledge graph from data held in different systems within the company.

Figure 1. Schematic representation of the process to create a knowledge graph

We collaborated with key stakeholders at the customer to make an inventory of the data sources that needed to be mapped to the semantic model. The model uses both public domain ontologies (such as OBI and BFO) and customer-specific ontologies. This phase of the project involved investigating relevant use cases, systems, and data, with regular feedback sessions throughout.

Figure 2. Example of the semantic representation (left) of a CAR (chimeric antigen receptor) synthesis (right).

After creating the semantic model, we mapped the data from the different sources to entities in the model to create the knowledge graph. The Hyve evaluated several strategies and tools for the extract, transform, and load (ETL) process that populates this knowledge graph. The graph was then deployed on the client’s internal infrastructure, where it can be browsed, searched, and queried.
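To make the mapping step concrete, the following is a minimal sketch in plain Python of how rows from two source systems could be mapped to entities in one shared model. It is not the tooling used in the project: the namespace, the class and property names (`CARConstruct`, `targetsAntigen`, `hasInput`, `hasReadout`), the two source systems, and the sample rows are all hypothetical. In a real model, the classes would be taken from or aligned with ontologies such as OBI and BFO.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace; the client's real identifiers are not public.
EX = "https://example.org/ioct/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def map_notebook_rows(rows: Iterable[Dict[str, str]]) -> Set[Triple]:
    """Map rows from a hypothetical lab notebook export to triples."""
    triples: Set[Triple] = set()
    for row in rows:
        construct = EX + "construct/" + row["construct_id"]
        antigen = EX + "antigen/" + row["antigen"]
        triples.add((construct, RDF_TYPE, EX + "CARConstruct"))
        triples.add((construct, EX + "targetsAntigen", antigen))
    return triples


def map_assay_rows(rows: Iterable[Dict[str, str]]) -> Set[Triple]:
    """Map rows from a hypothetical assay database to triples."""
    triples: Set[Triple] = set()
    for row in rows:
        assay = EX + "assay/" + row["assay_id"]
        # Same identifier scheme as above, so both sources point
        # at the same construct node in the graph.
        construct = EX + "construct/" + row["construct"]
        triples.add((assay, RDF_TYPE, EX + "Assay"))
        triples.add((assay, EX + "hasInput", construct))
        triples.add((assay, EX + "hasReadout", row["readout"]))
    return triples


# Made-up sample rows, for illustration only.
notebook_rows = [{"construct_id": "C1", "antigen": "CD19"}]
assay_rows = [{"assay_id": "A7", "construct": "C1", "readout": "cytotoxicity"}]

# The "load" step here is just a set union of the mapped triples.
graph = map_notebook_rows(notebook_rows) | map_assay_rows(assay_rows)
```

The point of the sketch is that each source gets its own small mapping function, while the identifiers and vocabulary come from the shared model. That is what allows records from separate systems to connect once they are loaded into the same graph.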

Figure 3. Application-centric view (left) versus domain-centric view (right) of the data

The outcome

In this project, we delivered a semantic model that builds on public domain ontologies and aligns with other semantic models previously developed for the client. The model gives our client a stable representation of the entities and procedures used, because it does not depend on structures dictated by vendor applications. Building the semantic model also gave the client a clear picture of what information was missing in the different systems and would be needed to understand and integrate their research data more thoroughly. The model now serves as a reference for modelling newly generated research data and is used to integrate research data assets into the company’s enterprise knowledge graph.

Once the semantic model was created and the knowledge graph populated with data, queries based on the proposed use cases could be run. These queries demonstrated that end-to-end use cases can be answered using this knowledge graph. The knowledge graph transcends research domains and departments and is therefore a major step forward in the customer’s ongoing efforts to unlock research data from siloed systems.
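As an illustration of what an end-to-end query across formerly separate systems can look like, here is a small self-contained Python sketch. It is an assumption-laden toy, not one of the client's use cases: the triples, the property names, and the `match` and `antigens_with_readout` helpers are hypothetical, and a deployed knowledge graph would normally be queried through a graph query language rather than hand-written loops.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

EX = "https://example.org/ioct/"  # hypothetical namespace

# Toy graph: construct facts and assay facts, imagined to come
# from two different source systems.
graph: Set[Triple] = {
    (EX + "construct/C1", EX + "targetsAntigen", EX + "antigen/CD19"),
    (EX + "construct/C2", EX + "targetsAntigen", EX + "antigen/BCMA"),
    (EX + "assay/A7", EX + "hasInput", EX + "construct/C1"),
    (EX + "assay/A7", EX + "hasReadout", "cytotoxicity"),
    (EX + "assay/A8", EX + "hasInput", EX + "construct/C2"),
    (EX + "assay/A8", EX + "hasReadout", "cytokine release"),
}


def match(g: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in g
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def antigens_with_readout(g: Set[Triple], readout: str) -> Set[str]:
    """Follow assay -> construct -> antigen for assays with a given readout."""
    antigens: Set[str] = set()
    for assay, _, _ in match(g, p=EX + "hasReadout", o=readout):
        for _, _, construct in match(g, s=assay, p=EX + "hasInput"):
            for _, _, antigen in match(g, s=construct, p=EX + "targetsAntigen"):
                antigens.add(antigen)
    return antigens


result = antigens_with_readout(graph, "cytotoxicity")
```

The question "which antigens are targeted by constructs tested in cytotoxicity assays?" needs facts from both sources. In the graph it becomes a short path traversal, because both sources refer to the same construct nodes.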

Data Services

Our multidisciplinary team of consultants, data and knowledge engineers, and semantic modelling experts enables your organisation to turn data into FAIR assets.

Let’s start collaborating

  • Get in touch with our consultants to discuss how your organisation can benefit most from transforming siloed data sources into an enterprise knowledge graph

  • Together with our experts, build a semantic model that reflects your research domain, to serve as a reference for your data integration efforts

  • Let our data engineers explore your data assets to develop mapping scripts for your ETL pipeline
