ReVeaLD: A User-driven Domain specific Interactive
Search Platform for Biomedical Research
Maulik R. Kamdar*, Dimitris Zeginis, Ali Hasnain, Stefan Decker, Helena F. Deus
*maulik.kamdar@deri.org

Motivation

“Are there Drugs with molecular weight
under 400 tested against ‘Colon Cancer’?”

[Figure: compound screening funnel. Stage labels: Literature, Query databases, Virtual Screening, (Linked) Data. Compound counts narrow from ~300 000 compounds to ~300 interesting compounds, ~10 interesting compounds, and ~5 compounds.]

Challenges

• Increasing adoption and usability by the non-technical biomedical researcher.
• Awareness of which datasets contain the required data and their data model.
• Heterogeneous biomedical data sources, too dynamic for data centralization.
• High cognitive entry barrier towards the assembly of SPARQL queries.
• Human-readable, domain-specific representation of query results is required.
• Trade-off between expressivity (SPARQL) and usability (NL-Queries).
• Making the User Experience engaging, while providing quality results.

Integrative Bioinformatics

Hypothesis Generation

“Do any Publications refer to assays using ‘Aspirin’ as the primary Drug in treatment of ‘Prostate Cancer’?”

Methods
[Figure labels: SPARQL Query, Experimental Datasets]

CanCO: a concise semantic model consisting of only those concepts and properties that are relevant to the cancer chemoprevention domain.

Rule Templates

chebi:Compound void-ext:subClassOf granatum:Molecule
drugbank:Drug void-ext:subClassOf granatum:Molecule
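The poster does not spell out how a rule template turns a DSL query into a source-specific one. The following is a minimal sketch, assuming the simplest possible reading: every `void-ext:subClassOf` rule whose DSL concept appears in the query yields one rewritten query per source class. The function name `transform_query` and the plain string substitution are assumptions made for illustration, not the ReVeaLD implementation.

```python
# Hypothetical rule table mirroring the two rule templates above:
# (source-specific class, predicate, DSL concept).
RULES = [
    ("chebi:Compound", "void-ext:subClassOf", "granatum:Molecule"),
    ("drugbank:Drug", "void-ext:subClassOf", "granatum:Molecule"),
]


def transform_query(dsl_query: str, rules=RULES) -> dict[str, str]:
    """Return one rewritten query per source class whose DSL concept occurs in dsl_query."""
    transformed = {}
    for source_class, predicate, dsl_concept in rules:
        # Only subclass rules are handled in this sketch.
        if predicate == "void-ext:subClassOf" and dsl_concept in dsl_query:
            # Naive textual substitution; a real engine would rewrite the parsed query.
            transformed[source_class] = dsl_query.replace(dsl_concept, source_class)
    return transformed


if __name__ == "__main__":
    dsl_query = "SELECT * WHERE { ?m a granatum:Molecule }"
    for source_class, query in transform_query(dsl_query).items():
        print(source_class, "->", query)
```

Textual substitution is used only to keep the sketch short; it would misfire on concept names that are prefixes of other names, which is one reason to operate on a parsed query instead.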

[Figure labels: Transformed Query, LSLOD Catalogue, Cataloguing & Links Creation]

Domain Specific Language (DSL)

SELECT * WHERE {<ResourceURI> ?p ?o}
Results are subjected to a set of Graphic Rules, which follow the Event-Condition-Action (ECA) paradigm and provide visual representations using the Fresnel Display Vocabulary.
Event: drugbank:targets_844 drugbank:pdbIdPage <Structure_File> (a single triple here; an event can span multiple triples)
Condition: pdbIdPage (Predicate) + http (Object)
Action: HTTP GET and invoke Resource Renderer
Resource Renderer: GLMol Molecular Viewer
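The Event-Condition-Action example above can be sketched as a small rule table. This is an illustrative sketch only: the names `GraphicRule`, `pdb_condition`, `pdb_action`, and `apply_rules` are hypothetical, the condition is read as "predicate is pdbIdPage and object starts with http", and the action merely reports which renderer would be invoked instead of performing the HTTP GET.

```python
from dataclasses import dataclass
from typing import Callable

Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class GraphicRule:
    condition: Callable[[Triple], bool]
    action: Callable[[Triple], str]


def pdb_condition(triple: Triple) -> bool:
    """Condition: predicate is pdbIdPage and the object looks like an HTTP URL."""
    _, predicate, obj = triple
    return predicate.endswith("pdbIdPage") and obj.startswith("http")


def pdb_action(triple: Triple) -> str:
    """Action stub: a real rule would HTTP GET the object URL and pass the
    structure file to the renderer; here we only name the renderer."""
    return f"GLMol Molecular Viewer <- {triple[2]}"


RULES = [GraphicRule(pdb_condition, pdb_action)]


def apply_rules(triples, rules=RULES):
    """Event: each result triple is offered to every rule in turn."""
    rendered = []
    for triple in triples:
        for rule in rules:
            if rule.condition(triple):
                rendered.append(rule.action(triple))
    return rendered


if __name__ == "__main__":
    results = [
        # The URL is a placeholder, not a real structure file location.
        ("drugbank:targets_844", "drugbank:pdbIdPage", "http://example.org/structure.pdb"),
        ("drugbank:targets_844", "rdfs:label", "Some target"),
    ]
    print(apply_rules(results))
```

Keeping conditions and actions as separate callables means a new resource type only needs a new rule entry, which matches the resource-dependent visualizations the poster describes.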

[Figure: system architecture. Component labels: Transformed Query (one per data source), Federated Query Engine, Graphic Rules. Data sources: ChEBI, DrugBank, UniProt, Others, grouped as Life Sciences Linked Open Data (LSLOD).]

Results
Concept Map Representation of the DSL

Evaluation
Visual Query Builder Interface (Single-Entity & Advanced Search)
Example query: Assays that identify potential Chemopreventive Agents with a Molecular Weight less than 300 and that Target Estrogen Receptors.

• 5 query formulation tasks: single or multiple concept selection from the DSL or LSLOD Catalogue.
• Structured on the Tracking Real-time User Experience (TRUE) methodology, popularly used to evaluate user experience in computer games.

SPARQL Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX granatum: <http://chem.deri.ie/granatum/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT * WHERE
{
  ?x0_Assay a granatum:Assay ;
      granatum:hasInput ?x1_Target ;
      granatum:identify ?x2_ChemopreventiveAgent ;
      granatum:outcome_method ?x3_outcome_method .
  ?x1_Target granatum:title ?x4_title .
  ?x2_ChemopreventiveAgent
      granatum:molecularWeight ?x10_molecularWeight ;
      granatum:SMILESnotation ?x9_SMILESnotation ;
      granatum:hasFormula ?x7_hasFormula ;
      granatum:HBD ?x5_Hydrogen_Bond_Donors ;
      granatum:HBA ?x6_Hydrogen_Bond_Acceptors ;
      granatum:TPSA ?x8_Topological_Polar_Surface_Area .
  FILTER regex(xsd:string(?x4_title), "estrogen receptor", "is")
  FILTER ( xsd:double(?x10_molecularWeight) < 300 )
} LIMIT 100
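To illustrate how a visual selection might map onto a query of this shape, here is a minimal sketch that assembles a single-concept SELECT from a concept name, a list of properties, and optional numeric upper bounds. The function `build_query` and its parameters are hypothetical; the variable naming only imitates the `?xN_name` pattern seen above, and the PREFIX declarations are assumed to be prepended separately. It does not cover the multi-concept joins or regex filters of the full query.

```python
def build_query(concept, properties, max_filters=None, limit=100):
    """Assemble a SELECT query for one concept, its properties, and optional upper bounds.

    concept      -- local name of a granatum class, e.g. "ChemopreventiveAgent"
    properties   -- local names of granatum properties to retrieve
    max_filters  -- mapping of property name to an exclusive numeric upper bound
    """
    max_filters = max_filters or {}
    subject = f"?x0_{concept}"
    # One numbered variable per property, in selection order.
    variables = {prop: f"?x{i}_{prop}" for i, prop in enumerate(properties, start=1)}

    lines = [f"{subject} a granatum:{concept}"]
    lines += [f"    granatum:{prop} {var}" for prop, var in variables.items()]
    pattern = " ;\n".join(lines) + " ."

    filters = [
        f"FILTER ( xsd:double({variables[prop]}) < {bound} )"
        for prop, bound in max_filters.items()
    ]
    body = "\n".join([pattern, *filters])
    return f"SELECT DISTINCT * WHERE\n{{\n{body}\n}} LIMIT {limit}"


if __name__ == "__main__":
    print(build_query(
        "ChemopreventiveAgent",
        ["molecularWeight", "SMILESnotation"],
        max_filters={"molecularWeight": 300},
    ))
```

Generating the variable names from the selected concepts keeps the result columns readable for the non-technical user, which is the usability concern raised under Challenges.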

[Figure labels: PubChem, UniProt, ChEBI]

ReVeaLD – Real-time Visual Explorer and Aggregator of Linked Data (http://reveald.info)

Domain specific Visualizations (resource-dependent)

Usability Hypotheses Evaluated:
• Does familiarity of the users with the DSL affect the time needed to formulate the query (intuitiveness)?
• Does a constrained (smaller) DSL lead to less time needed for query formulation?

Data Browser Interface (Faceted & Lens-based Data Navigation)

Acknowledgements: This work was funded by the EU FP7 GRANATUM project, ref. FP7-ICT-2009-6-270139, and Science Foundation Ireland Lion 2.

Enabling Networked Knowledge
