Linked Data – challenges for Imagiology and Radiology
1. Exploration, sharing and privacy of data
Francisco Couto
Ciências ULisboa
FISMED 2017 – 6 Nov 2017
5. Application
• Similar medical images?
• Computer-assisted image processing
• Related medical images?
• Which are not necessarily similar
• Linked Data
6. Example
Discover new drugs to treat Alzheimer’s
What proteins are involved in signal transduction and are related to pyramidal neurons?
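The question above can be read as a join over linked statements. A minimal sketch in Python, assuming a toy set of triples; the protein identifiers and predicate names are hypothetical and not taken from any real dataset:

```python
# Toy triples (subject, predicate, object); all identifiers are hypothetical.
TRIPLES = {
    ("proteinA", "involved_in", "signal transduction"),
    ("proteinA", "related_to", "pyramidal neuron"),
    ("proteinB", "involved_in", "signal transduction"),
    ("proteinC", "related_to", "pyramidal neuron"),
}

def subjects(predicate, obj, triples=TRIPLES):
    """Return every subject that has the given predicate and object."""
    return {s for s, p, o in triples if p == predicate and o == obj}

def candidate_proteins(triples=TRIPLES):
    """Proteins involved in signal transduction AND related to pyramidal neurons."""
    return subjects("involved_in", "signal transduction", triples) & subjects(
        "related_to", "pyramidal neuron", triples
    )

print(sorted(candidate_proteins()))
```

Each condition selects a set of subjects, and the "and" in the question becomes a set intersection.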
Tim Berners-Lee’s Linked Data slides, from TED 2009
best known as the inventor of the World Wide Web
http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html
10. Definition
The term Linked Data refers to a set of best
practices for publishing and connecting
structured data on the Web using international
standards of the World Wide Web Consortium
11. Wikipedia
• Infoboxes
• box in the upper right of the page
• names, dates, places
• DBpedia project
• extracts this structured data
• publishes it as Linked Data
http://dbpedia.org
13. State of the LOD Cloud 2014
Topic                     Datasets   %
Government                183        18.05%
Publications              96         9.47%
Life Sciences             83         8.19%
User-generated content    48         4.73%
Cross-domain              41         4.04%
Media                     22         2.17%
Geographic                21         2.07%
Social Web                520        51.28%
Total                     1014
14. Important properties
• easily combined with other Linked Data
• best reason to explore and use Linked Data
• Important for data sharing
• self-documenting
• immediately figure out what a term means
• Linked Data can be private
• widely deployed behind enterprise firewalls on
private networks
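Why Linked Data combines so easily can be sketched in a few lines of Python: when two sources use the same URI for the same thing, merging them is a set union. The URIs, predicates, and values below are illustrative assumptions, not real published data:

```python
# Two independently published datasets; the URIs and values are illustrative.
SINTRA = "http://example.org/place/Sintra"
COLLAR = "http://example.org/object/collar"

museum_data = {(COLLAR, "foundIn", SINTRA)}
geo_data = {(SINTRA, "country", "Portugal")}

def combine(*datasets):
    """Merging Linked Data is a set union: shared URIs connect the statements."""
    merged = set()
    for dataset in datasets:
        merged |= dataset
    return merged

def country_of_origin(obj, triples):
    """Follow foundIn, then country, across whatever statements are available."""
    places = {o for s, p, o in triples if s == obj and p == "foundIn"}
    return {o for s, p, o in triples if s in places and p == "country"}

merged = combine(museum_data, geo_data)
print(country_of_origin(COLLAR, merged))
```

Neither dataset alone can answer the question; the answer only appears once the two are merged.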
15. Linked Data from Forms
• Manual human annotation, as in Wikipedia
• only works with a wide set of users
• Selecting the right terms is non-trivial
• takes time
• requires knowledge
• Common problems
• missing values
• too generic to be useful
B. Inácio, J. Ferreira, and F. Couto, Metadata analyser: measuring metadata quality, in Practical Applications
of Computational Biology and Bioinformatics (PACBB), pp. 197-204, 2017
16. Alternatives for Imagiology
• Explore the written reports
• Text mining approaches
• Use a specialized terminology (RadLex)
• multilingual approaches
• Sharing and privacy
• Share only the metadata, not the images
• There are still privacy concerns
17. Imagiology Written Reports
• written in free-text
• well structured
• more than any other clinical notes
• since the information has to pass from one health
care professional to another
18. Text Mining Tools
• Dictionary-based
• only requires a common terminology
• limited to that terminology
• ambiguity of terms
• Machine Learning
• requires a training set
• not limited to a terminology
• learns with experience
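The dictionary-based approach can be illustrated with a minimal sketch, assuming a tiny hand-written lexicon; a real system would load a terminology such as RadLex, and the terms and the sample report here are made up for the example:

```python
import re

# Tiny illustrative lexicon; a real system would load a terminology such as RadLex.
LEXICON = ["pleural effusion", "lung", "nodule"]

def find_terms(text, lexicon=LEXICON):
    """Return (start, end, term) for every lexicon term found in the text."""
    matches = []
    for term in lexicon:
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            matches.append((m.start(), m.end(), term))
    return sorted(matches)

report = "Small nodule in the right lung. No pleural effusion."
print(find_terms(report))
```

The sketch also shows the limits listed above: anything outside the lexicon is missed, and the negated "No pleural effusion" is still tagged, because plain matching does not interpret context.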
19. RadLex
• RSNA has produced RadLex(R)
• ontology focused on radiology
• terms (i.e. a lexicon) and the relationships
• over 75,000 terms and synonyms
https://www.rsna.org/RadLex.aspx
22. London Bills of Mortality
listed possible ways to die throughout the sixteenth, seventeenth and eighteenth centuries
Source: http://faculty.up.edu/asarnow/popular7.htm
25.
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Sintra_Collar">
    <dc:description>
      Gold collar. It was made from three circular sectioned and tapering gold bars that are
      fused at the ends forming a penannular neck-ring.
    </dc:description>
    <dc:date>1250BC-800BC (circa)</dc:date>
    <!-- dc:coverage is the Dublin Core 1.1 element for spatial scope -->
    <dc:coverage>
      Sintra, Portugal
      http://yboss.yahooapis.com/geo/placefinder?woeid=748874
    </dc:coverage>
    <dc:type>
      Gold
      http://purl.obolibrary.org/obo/CHEBI_30050
    </dc:type>
  </rdf:Description>
</rdf:RDF>
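To show how such a record becomes machine-readable statements, here is a minimal sketch that reads RDF/XML of this shape with Python's standard library. It assumes a shortened inline copy of the record and only handles flat `rdf:Description` elements with literal values; a full RDF parser covers far more of the syntax:

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the record above, kept inline so the sketch is self-contained.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Sintra_Collar">
    <dc:date>1250BC-800BC (circa)</dc:date>
    <dc:type>
      Gold
      http://purl.obolibrary.org/obo/CHEBI_30050
    </dc:type>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(rdf_xml):
    """Yield (subject, predicate URI, literal) for each property element."""
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; joining gives the URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            literal = " ".join((prop.text or "").split())
            yield subject, predicate, literal

for triple in extract_triples(RDF_XML):
    print(triple)
```

Every property of the description turns into one subject, predicate, object statement, which is the form in which Linked Data is combined and queried.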
27. Example
• MER
• Minimal Entity Recognizer
http://labs.fc.ul.pt/mer/
Couto, F. M., Campos, L., & Lamurias, A. MER: a minimal named-entity recognition tagger and
annotation server. Proceedings of the BioCreative, 2017
• DiShIn
• Semantic Similarity Measures using Disjunctive
Shared Information
http://labs.fc.ul.pt/dishin/
F. Couto and M. Silva, Disjunctive shared information between ontology concepts: application to Gene
Ontology, Journal of Biomedical Semantics, vol. 2, no. 5, pp. 1-16, 2011
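As a rough illustration of what a semantic similarity measure computes, the sketch below scores two terms by the overlap of their ancestors in a toy is-a hierarchy. This is a deliberately simpler technique (Jaccard overlap of ancestor sets), not the DiShIn measure itself, and the hierarchy is invented for the example:

```python
# Toy is-a hierarchy (child -> parents); the terms are illustrative only.
PARENTS = {
    "lung nodule": ["nodule", "lung finding"],
    "lung mass": ["mass", "lung finding"],
    "nodule": ["finding"],
    "mass": ["finding"],
    "lung finding": ["finding"],
    "finding": [],
}

def ancestors(term, parents=PARENTS):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def similarity(a, b, parents=PARENTS):
    """Jaccard overlap of the two ancestor sets, between 0 and 1."""
    anc_a, anc_b = ancestors(a, parents), ancestors(b, parents)
    return len(anc_a & anc_b) / len(anc_a | anc_b)

print(similarity("lung nodule", "lung mass"))
print(similarity("lung nodule", "nodule"))
```

Terms with multiple parents, such as "lung nodule" here, are the case where the choice of shared ancestors matters and where more refined measures differ from this simple overlap.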
32. Language
• Reports are usually written in the native language
• Text Mining tools are mostly developed for English
• RadLex is currently only available in English
• German and Portuguese translations of RadLex are currently in development
• This is an obstacle to sharing information
33. Two approaches
1) Translate the lexicon itself
• each new version of the lexicon requires translation
2) Translate the reports
• efficient automatic translation services are available nowadays
• state-of-the-art Text Mining tools are tuned for English
34. Multilingual Reports
• reports accessible to any doctor
• who understands English
• tourists access their reports in their language
• send them to their personal doctor at home
• hospital can get a highly specialized second
opinion
• in complex clinical cases from international
experts
35. Translation Efficiency
• 51 Research Articles related to Radiology
• originally written in Portuguese
• a human translation into English was available
• We measured NER accuracy
• using Yandex, Google and Unbabel
L. Campos, V. Pedro, and F. Couto, Impact of translation on named-entity recognition in radiology texts,
Database, vol. 2017, no. bax064, pp. 1-9, 2017
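One common way to measure this kind of accuracy is to compare the entities recognized in a machine translation against those recognized in the human translation. The sketch below assumes precision, recall, and F1 over sets of entities; the exact metric used in the study is not stated on the slide, and the entity sets are hypothetical:

```python
def ner_scores(reference, predicted):
    """Precision, recall and F1 of predicted entities against a reference set."""
    reference, predicted = set(reference), set(predicted)
    true_positives = len(reference & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical entities recognized in one article (not data from the study).
human = {"lung", "nodule", "pleural effusion", "thorax"}
machine = {"lung", "nodule", "effusion"}
print(ner_scores(human, machine))
```

Running the same comparison for each translation service gives scores that can be set side by side.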
37. Multilingual System Prototype
• Given a report
• Automatically translate the report
• Find the most similar reports
• According to the most similar RadLex terms
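The retrieval step can be sketched as ranking archived reports by how many terms they share with the query report. This is a minimal sketch assuming the terms were already recognized in the translated text and using plain Jaccard overlap; the report identifiers and terms are hypothetical, and the prototype may use a richer similarity:

```python
def jaccard(a, b):
    """Share of terms two reports have in common (0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def most_similar(query_terms, archive, top=2):
    """Rank archived reports by term overlap with the query report."""
    ranked = sorted(
        archive.items(),
        key=lambda item: (-jaccard(query_terms, item[1]), item[0]),
    )
    return [(report_id, jaccard(query_terms, terms)) for report_id, terms in ranked[:top]]

# Hypothetical archive: report id -> terms already recognized in the English text.
ARCHIVE = {
    "report-1": {"lung", "nodule"},
    "report-2": {"liver", "cyst"},
    "report-3": {"lung", "nodule", "pleural effusion"},
}
print(most_similar({"lung", "nodule", "thorax"}, ARCHIVE))
```

Ties are broken by report id so the ranking is deterministic.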
42. Privacy in Imagiology
• Like sharing photos
• Once on the web, it stays on the web
• An issue can be passed from family to family
• Encrypt data
• If the data is important, breaking the encryption is a question of time
• Cloud storage
• Is it secure?
• Split across different providers
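One possible reading of splitting data across providers is secret splitting: the data is divided into shares so that no single provider holds anything meaningful. The sketch below assumes a simple two-share XOR scheme chosen for illustration; it is not a description of any specific system mentioned in the talk:

```python
import secrets

def split(data: bytes):
    """Split data into two shares; either share alone is random noise."""
    pad = secrets.token_bytes(len(data))
    masked = bytes(d ^ p for d, p in zip(data, pad))
    return pad, masked

def join(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine both shares by XOR to recover the original bytes."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))

share_a, share_b = split(b"chest CT report")
print(join(share_a, share_b))
```

Each share would go to a different provider; both are needed to reconstruct the data, so losing either one makes it unrecoverable.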
43. Privacy in linked data
• You can share metadata without sharing truly privacy-sensitive information
• show what kind of images are available
• You control who receives that data
• using a secure channel
M. Fernandes, J. Decouchant, F. Couto, and P. Verissimo, Cloud-assisted read alignment and privacy, in
Practical Applications of Computational Biology and Bioinformatics (PACBB), pp. 220-227, 2017
44. Final Remarks (1)
• Linked Data is not necessarily free or open data
• it can have access restrictions
• Linked Data is not necessarily sound data
• it can be incomplete and have errors
• But there are many successful use cases in the Life and Health Sciences
M. Barros and F. Couto, Knowledge representation and management: a linked data perspective, IMIA
Yearbook of Medical Informatics, pp. 178-183, 2016
45. Final Remarks (2)
• go beyond technological advances
• create motivation mechanisms
• encourage data owners to share their data
• in a meaningful way
• Science is about replication
• without access to data there is no replication
46. Thanks!
Current Team:
• André Lamúrias and Tânia Maldonado (Text Mining)
• Gonçalo Figueiró (MRIR)
• Maria Fernandes and Mariana Pinhão (Genomic Privacy)
Publications & Tools
• http://labs.fc.ul.pt/
• http://webpages.fc.ul.pt/~fjcouto/?page_id=100