FAIR data is here to stay, but where is the tooling?

On the 8th of November 2017 we attended the FAIR data tooling meeting, organized mainly for developers interested or involved in FAIR. We were invited to present what we currently do and what we need from the community to increase the FAIRness of our portfolio. In this blog post we want to share our insights and ideas from this meeting.

During the meeting, we heard some interesting presentations from the community which immediately addressed our concerns. There is a lot of interest in FAIR data right now. The main reason for this is simple: let’s be honest, who does not want their data to be Findable, Accessible, Interoperable and Reusable? How to achieve this is still unclear for most researchers, as it will take effort across all data layers (e.g., raw data and processed data):

  • Researchers will need to plan their data usage and data management.
  • Software developers will need to adjust their tooling to support FAIR data.
  • Research domains will need to establish ontologies to use.

And these three are not the only things needed.

Achieving "full FAIRness" of a dataset in one go and for all time will never be possible. There will always be new techniques that comply better with the FAIR principles. However, it is really inspiring to see how far the FAIR data community has come after all the effort and time invested in FAIR over the last couple of years. For example, part of the community has realised a piece of the Personal Health Train. What they achieved is an algorithm that learns to predict which radiotherapy treatment is best suited, based on the data held in hospitals all over the world, without the data ever leaving the hospital.

FAIR metrics and tooling

Our main goal for attending the meeting was to see which solutions we could add to our portfolio.

One of the most reassuring presentations we heard was the talk by Luiz Bonino, from DTL and GO-FAIR, on the process of creating FAIR metrics that will cover every aspect of FAIR at every scale. We believe that these metrics will give us concrete feedback about how FAIR our infrastructure is and what we will need to improve on our journey towards an infrastructure capable of handling and analysing fully FAIR-compliant data.

The presentation by Morris Swertz from UMC Groningen, one of the FAIR pioneers, on the FAIR tooling they are currently creating, gave good insight into the practicalities of FAIR.

How do you transform data that is not FAIR into FAIR data? What challenges lie ahead for us in this task? His presentation highlighted how fundamentally important FAIR data points, such as those we created for tranSMART during our FAIR hackathon, are to becoming Findable and Accessible.

So what are we currently working on at The Hyve?

Right now we are working on establishing an infrastructure capable of handling data as FAIRly as possible. We presented the approach we want to take and discussed it with the community. We received a lot of positive feedback, which encourages us to put this approach into practice even more than we currently do.