The crusade for big data in the AAL domain

Femke Ongenae

Session organizers – Hi!
Femke Ongenae
Knowledge Engineer
IBCN, UGent - iMinds
Femke De Backere
eCare Researcher
IBCN, UGent - iMinds
Griet Verhenneman
Legal Researcher
ICRI, KULeuven - iMinds
Julie Doyle
Research Fellow
CASALA

Ambient-Assisted Living
Trend towards more personalized & context-aware healthcare services

FallRisk
Social-aware and context-aware multi-sensor fall detection platform

Enable the collection of (near) real-life profile and context data

Closing the gap...
Developers
Researchers

Keynote by Dr. Gray
Femke Ongenae

Data Integration in a
Big Data Context
Open PHACTS Case Study
Alasdair J G Gray
A.J.G.Gray@hw.ac.uk
alasdairjggray.co.uk
@gray_alasdair

Big Data
Volume, Velocity, Variety, Veracity
http://i.kinja-img.com/gawker-media/image/upload/lvzm0afp8kik5dctxiya.jpg

Open PHACTS Use Case
“Let me compare MW, logP and PSA for launched inhibitors of human & mouse oxidoreductases”
• Chemical Properties (ChemSpider)
• Launched drugs (DrugBank)
• Human => Mouse (HomoloGene)
• Protein Families (Enzyme)
• Bioactivity Data (ChEMBL)
• … other info (UniProt/Entrez etc.)

Open PHACTS Mission:
Integrate Multiple Research Biomedical Data Resources Into A Single Open & Free Access Point

Literature
PubChem
Genbank
Patents
Databases
Downloads
Data Integration
Data Analysis
Firewalled Databases
Repeat @ each company
A single, shared solution.
Funded under
• IMI: 2011-14
• ENSO: 2014-16
Pre-competitive Data

Discovery Platform
• Cloud-Based “Production” Level System
• Secure & Private
• Guided By Business Questions
• Uses Semantic Web Technology
• Provides RESTful API
http://dx.doi.org/10.1016/j.websem.2014.03.003
http://dx.doi.org/10.1016/j.drudis.2013.05.008

Scientific Results
http://ceur-ws.org/Vol-1114/Demo_Dunlop.pdf
http://dx.doi.org/10.1016/j.drudis.2014.11.006
http://dx.doi.org/10.1002/minf.v31.8
http://dx.doi.org/10.1371/journal.pone.0115460

OPS Discovery Platform
Drug Discovery Platform
Apps
Domain API
Interactive responses
Production quality integration platform
Method Calls
Standard Web Technologies

App Ecosystem
An “App Store”?
Explorer Explorer2 ChemBioNavigator Target Dossier Pharmatrek Helium
MOE Collector Cytophacts Utopia Garfield SciBite
KNIME Mol. Data Sheets PipelinePilot scinav.it Taverna
https://www.openphacts.org/2/sci/apps.html

ChemBioNavigator
http://chembionavigator.com

API Hits
[Chart: number of hits (millions) per month, January 2013 to June 2015. Annotations: public launch of 1.2 API, 1.3 API, 1.4 API, 1.5 API.]

OPS Discovery Platform
[Architecture diagram. Components shown:]
• Apps
• Linked Data API (RDF/XML, TTL, JSON)
• Domain Specific Services
• Semantic Workflow Engine
• Identity Resolution Service
• Identifier Management Service
• Chemistry Registration, Normalisation & Q/C
• Indexing
• Data Cache (Virtuoso Triple Store)
• Core Platform
• Data sources (Nanopub, Db, VoID): Public Content, Commercial, Public Ontologies, User Annotations
• Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”

Open PHACTS Data

Data Licensing (Or Lack Of!)
John Wilbanks consulted for us
A framework built around STANDARD, well-understood Creative Commons licences, and how they interoperate
Deal with the problems by:
• Interoperable licences
• Appropriate terms
• Declare expectations to users and data publishers
One size won't fit all requirements

API: Complex Interactions
Disease
Tissue
Target
Compound
Pathway

Quantitative Data Challenges

STANDARD_TYPE     UNIT_COUNT
----------------  ----------
AC50                       7
Activity                 421
EC50                      39
IC50                      46
ID50                      42
Ki                        23
Log IC50                   4
Log Ki                     7
Potency                   11
log IC50                   0

STANDARD_TYPE  STANDARD_UNITS  COUNT(*)
-------------  --------------  --------
IC50           nM                829448
IC50           ug.mL-1            41000
IC50                              38521
IC50           ug/ml               2038
IC50           ug ml-1              509
IC50           mg kg-1              295
IC50           molar ratio          178
IC50           ug                   117
IC50           %                    113
IC50           uM well-1             52

~ 100 units
>5000 types
Implemented using the Quantities, Units, Dimension, Types Ontology (http://www.qudt.org/)
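
The unit spellings above (ug.mL-1, ug/ml, ug ml-1) all appear to denote the same unit, which is the kind of clean-up this slide is about. A minimal Python sketch of that idea follows. The synonym table, the canonical labels and the function names are illustrative assumptions, not the Open PHACTS implementation and not the QUDT API.

```python
# Hypothetical synonym table: spelling variants from the slide mapped to one
# canonical label. The canonical labels are an assumption for illustration.
CANONICAL_UNIT = {
    "nM": "nM",
    "ug.mL-1": "ug/mL",
    "ug/ml": "ug/mL",
    "ug ml-1": "ug/mL",
}

# Conversion factors to nanomolar, for molar-concentration units only.
TO_NANOMOLAR = {"nM": 1.0, "uM": 1_000.0, "mM": 1_000_000.0}


def canonical_unit(raw):
    """Return the canonical label for a raw unit string, or None if unknown."""
    if raw is None:
        return None
    return CANONICAL_UNIT.get(raw.strip())


def to_nanomolar(value, unit):
    """Convert a molar concentration to nM; raise ValueError for other units."""
    factor = TO_NANOMOLAR.get(unit)
    if factor is None:
        raise ValueError(f"cannot convert {unit!r} to nM without more information")
    return value * factor
```

Mass-based units such as ug/mL are deliberately not converted here, because that would need the molecular weight of each compound. Unknown or blank units return None rather than a guess, so they can be routed to manual quality control.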

Quality Assurance

Identity Mapping
P12047
X31045
GB:29384
Andy Law's Third Law
“The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study”
http://bioinformatics.roslin.ac.uk/lawslaws/

Gleevec®: Imatinib Mesylate
Drugbank ChemSpider PubChem
Imatinib
Mesylate Imatinib Mesylate
YLMAHDNUQAMNNX-UHFFFAOYSA-N

Are these records the same?
It depends upon your task!

Structure Lens
I need to perform an analysis, give me details of the active compound in Gleevec.
skos:exactMatch (InChI)
Scale: Strict (Analysing) to Relaxed (Browsing)

Name Lens
Which targets are known to interact with Gleevec?
skos:closeMatch (Drug Name)
skos:closeMatch (Drug Name)
skos:exactMatch (InChI)
Scale: Strict (Analysing) to Relaxed (Browsing)
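
The two lens slides suggest that each link between records carries a justification (InChI or Drug Name), and that a lens decides which justifications to accept for a given task. A minimal Python sketch of that idea follows. The record identifiers are placeholders, and the data layout and function name are assumptions for illustration, not the Open PHACTS Identity Mapping Service.

```python
# Each mapping links two records and states why they are considered linked.
# Record identifiers are placeholders, not real database accessions.
MAPPINGS = [
    ("chemspider:A", "pubchem:A", "skos:exactMatch", "InChI"),
    ("drugbank:B", "chemspider:A", "skos:closeMatch", "Drug Name"),
    ("drugbank:B", "pubchem:C", "skos:closeMatch", "Drug Name"),
]

# A lens names the justifications it is willing to follow.
LENSES = {
    "structure": {"InChI"},            # strict, for analysis
    "name": {"InChI", "Drug Name"},    # relaxed, for browsing
}


def equivalents(record, lens):
    """Return the record plus every record directly linked to it under the lens."""
    allowed = LENSES[lens]
    found = {record}
    for left, right, _predicate, justification in MAPPINGS:
        if justification not in allowed:
            continue
        if left == record:
            found.add(right)
        elif right == record:
            found.add(left)
    return found
```

The sketch follows direct links only. Whether close matches should also be chained across several hops is a separate design decision, which is left open here.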

Data Provenance

dev.openphacts.org

Open PHACTS Approach
1. Know your audience
Web developers
2. Understand your use cases
Prioritised business questions
3. Identify access pathways
Identify data
Identify connections
Implement API

Questions
Alasdair J G Gray
A.J.G.Gray@hw.ac.uk
alasdairjggray.co.uk
@gray_alasdair
Open PHACTS
contact@openphacts.org
openphacts.org
@open_phacts

Brainstorm session
Femke Ongenae

How do we enable an infrastructure/platform that allows the user-friendly and rapid sharing of Living Lab data?

Brainstorm: 3 Steps
Generation of ideas
Selection of best ideas
Further detailing top ideas
http://www.flandersdc.be/gps

Generating ideas
Table 1: Privacy & Ethics
Table 2: Business Models
Table 3: Data Sharing Infrastructure
Table 4: Quality & Reliability of data
Table 5: Data Usage Results

Practical arrangements
• Paper indicating table order
• Brainstorm round: +/- 15 minutes
• Moderators

Some tips!
Delay your judgement
Be open to naive and crazy ideas
Openness & enthusiasm
Use associative thinking
Piggyback on ideas of others

Selection of ideas
• Summarize 3 key ideas
• How to select?
– Keep the goal in mind!
– Think in opportunities
– What are you enthusiastic about?
– Personal engagement
– What is needed in the short term?
– Most promising

Selection of ideas
• 5 Votes
• Put your name & e-mail on the sheet if you want to be involved in working out the idea!

THANK YOU FOR YOUR TIME
Contact me @ Femke.Ongenae@intec.ugent.be
