Abstract
Identifying and recruiting eligible research participants for studies is an expensive and time-consuming process. Data provenance becomes an issue especially when heterogeneous, multinationally distributed electronic data sources are involved. To make use of provenance information, we developed a provenance-aware Query Formulation Tool that researchers can use to collaboratively define protocols and to trace back the protocol development and participant identification processes via the provenance service.
Figure 2. Screenshot of the Query Formulation Tool, example Diabetes study protocol definition.
Introduction
TRANSFoRm Project
TRANSFoRm is a 5-year pan-European project funded by the European Union to advance information and computing technologies in order to address current market challenges in connecting healthcare and research. It will deliver a digital infrastructure and develop rigorous, generic methods that facilitate the reuse of primary care electronic health record (EHR) data, improving both patient safety and the conduct and volume of clinical research in Europe.
Query Formulation and Provenance
The Query Formulation Tool enables researchers to collaboratively build study eligibility criteria and identify potential subject counts from selected heterogeneous data sources. Tracking data provenance during the query formulation, execution and results display phases is important for the reproducibility and validation of the research participant identification process.
Methods
Query Formulation Tool requirements:
Heterogeneity of data sources and terminologies; eligibility criteria and study protocol definitions; query submission and display of results; researcher tools for collaboration, data sharing and participant recruitment; privacy and data provenance aspects.
Figure 3. Provenance Query Tool showing provenance graph and example provenance data tracked for the Diabetes study eligibility criteria.
Figure 1 shows how the Query Formulation Tool is used to obtain counts of participants with matching criteria and where provenance is recorded.
The clinical research domain is modelled by the Clinical Research Information Model (CRIM), and the design of the Query Formulation Tool is in turn based on CRIM. Researchers collaboratively use the Query Formulation Tool to define study protocols, and use the terminology service to work with interoperable coding systems. Queries from study protocols are expressed according to the Clinical Data Integration Model (CDIM) and then submitted for execution against heterogeneous data sources.
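The workflow above can be illustrated with a minimal Python sketch in which eligibility criteria are expressed once against a shared record shape and then evaluated against two differently structured data sources to obtain participant counts. This is not the TRANSFoRm implementation: `Criterion`, `count_eligible`, the adapter functions, the field names and the sample records are all hypothetical, and the shared record shape only stands in for a common data model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical common record shape; it only stands in for a shared data model.
Record = Dict[str, object]


@dataclass(frozen=True)
class Criterion:
    """One eligibility criterion over a common field (illustrative only)."""
    field: str
    test: Callable[[object], bool]
    description: str


def is_eligible(record: Record, criteria: List[Criterion]) -> bool:
    # A record matches only if every criterion holds for it.
    return all(c.field in record and c.test(record[c.field]) for c in criteria)


def count_eligible(records: Iterable[Record], criteria: List[Criterion]) -> int:
    # Only a count is returned, never the matching records themselves.
    return sum(1 for r in records if is_eligible(r, criteria))


# Two hypothetical sources with different local schemas (made-up data).
source_a = [
    {"age_years": 54, "dx_code": "E11"},
    {"age_years": 17, "dx_code": "E11"},
    {"age_years": 61, "dx_code": "I10"},
]
source_b = [
    {"birth_year": 1950, "diagnosis": "E11", "ref_year": 2012},
    {"birth_year": 1990, "diagnosis": "J45", "ref_year": 2012},
]


def adapt_a(row: dict) -> Record:
    # Map source A's local field names onto the common shape.
    return {"age": row["age_years"], "diagnosis": row["dx_code"]}


def adapt_b(row: dict) -> Record:
    # Source B stores a birth year, so age is derived before matching.
    return {"age": row["ref_year"] - row["birth_year"], "diagnosis": row["diagnosis"]}


# Illustrative criteria for a diabetes study: adults with diagnosis code E11.
criteria = [
    Criterion("age", lambda v: v >= 18, "adult"),
    Criterion("diagnosis", lambda v: v == "E11", "type 2 diabetes code"),
]

counts = {
    "source_a": count_eligible(map(adapt_a, source_a), criteria),
    "source_b": count_eligible(map(adapt_b, source_b), criteria),
}
print(counts)
```

In this sketch the per-source adapters carry the burden of heterogeneity, so the criteria themselves never need to know how a given source stores its data.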
Results
The Query Formulation Tool provides a flexible way for researchers to define complex study protocols. Figure 2 shows an example of a Diabetes study. The Provenance Service, using a novel concept of provenance templates, tracks information during the query formulation process, such as the user, the time, and the evolution of the eligibility criteria definitions. Provenance data can be queried through the Query Provenance Server. Figure 3 shows part of the provenance graph for the Diabetes study, focussing on the EligibilityCriteria artifact, together with the values recorded for this artifact.
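To make the kind of information tracked more concrete, the following minimal Python sketch records who changed a set of eligibility criteria, when, and which earlier version each change was derived from. It is an assumption-laden illustration only: it does not reproduce the Provenance Service or its provenance templates, and `CriteriaVersion`, `ProvenanceLog` and the user names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class CriteriaVersion:
    """One immutable snapshot of the eligibility criteria (hypothetical)."""
    version: int
    criteria: Tuple[str, ...]
    user: str
    timestamp: datetime
    derived_from: Optional[int]  # version this snapshot was edited from


@dataclass
class ProvenanceLog:
    """Append-only history of criteria versions (illustrative only)."""
    versions: List[CriteriaVersion] = field(default_factory=list)

    def record(self, user: str, criteria: Sequence[str],
               timestamp: Optional[datetime] = None) -> CriteriaVersion:
        # Each new version points back at the previous one, if any.
        parent = self.versions[-1].version if self.versions else None
        entry = CriteriaVersion(
            version=len(self.versions) + 1,
            criteria=tuple(criteria),
            user=user,
            timestamp=timestamp or datetime.now(timezone.utc),
            derived_from=parent,
        )
        self.versions.append(entry)
        return entry

    def lineage(self, version: int) -> List[CriteriaVersion]:
        # Walk the derived_from links back to the first version,
        # then return the chain in oldest-first order.
        by_id = {v.version: v for v in self.versions}
        chain: List[CriteriaVersion] = []
        current = by_id.get(version)
        while current is not None:
            chain.append(current)
            parent = current.derived_from
            current = by_id.get(parent) if parent is not None else None
        return list(reversed(chain))


log = ProvenanceLog()
log.record("researcher_a", ["age >= 18"])
log.record("researcher_b", ["age >= 18", "diagnosis == E11"])

history = log.lineage(2)
for v in history:
    print(v.version, v.user, v.criteria, v.derived_from)
```

Keeping each version immutable and linked to its predecessor is what allows the evolution of a definition to be traced back later, which is the property the provenance graph in Figure 3 is meant to expose.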
Acknowledgement
This project is partially funded by the European Commission under the 7th Framework Programme (Grant Agreement 247787).
Theodoros Arvanitis <t.arvanitis@bham.ac.uk> [Presenting author]; Brendan Delaney <brendan.delaney@kcl.ac.uk>