Towards Biomedical Data Integration for Analyzing the Evolution of Cognition


Presented at the Ontologies in Data and Life Sciences Workshop 2013.


  1. Towards Biomedical Data Integration for Analyzing the Evolution of Cognition. Amrapali Zaveri, Jens Lehmann, Katja Nowick
  2. Outline
     ●  Why study Evolution of Cognition?
     ●  Research Questions
     ●  Our Approach
     ●  Datasets
        ○  Conversion
        ○  Interlinking
        ○  Querying and Preliminary Results
     ●  Conclusions & Future Work
  3. Why study Evolution of Cognition?
     ●  Cognition refers to a group of mental processes that includes memory, attention, language (production and understanding), reasoning, learning, problem solving and decision making.
     ●  Some aspects of cognition are human-specific. It has been argued that human-specific evolutionary innovations have made us smarter on the one hand, but more vulnerable to cognitive disorders on the other, e.g. autism and Alzheimer's disease.
  4. Why study Evolution of Cognition?
     ●  The mental processes involved in cognition are not controlled by a few individual genes, but by the function and interplay of several hundred, if not thousands, of genes.
     ●  The relevant information is scattered across disparate databases or separate tables in publications.
     ●  Querying across these databases is
        ○  time-consuming: the data come in different formats
        ○  highly inefficient whenever any one of the datasets is updated or changed.
  5. Research Questions
     •  Which genes have been found to be positively selected in humans and have also been implicated in cognitive diseases?
     •  Which genes have been associated with human cognitive processes and show evidence of evolutionary signatures ("changes") within primates?
     •  Which genes have been associated with cognitive decline during ageing in humans? Do they show differential expression patterns during development ("ageing") when compared with other primates?
     •  Do genes involved in cognition and behaviour show high diversity within humans and higher divergence between humans and chimpanzees?
  6. Our Approach
     •  Use the Linking Open Data (LOD) principles
     •  Identify and acquire data from relevant disparate datasets
     •  Convert the data to a single human- and machine-readable format: RDF (Resource Description Framework)
     •  Integrate and interlink the datasets
     •  Query the integrated datasets
  7. Datasets
     ●  11 datasets
     ●  Genes: symbol, name, alternative names
     ●  Diseases
     ●  Chromosome location
     ●  Cross-species information
  8. Datasets Conversion
     Available formats:
     ●  CSV, TSV
     ●  TXT
     ●  PDF
     Transformed to RDF using:
     ●  Sparqlify
     ●  LODRefine
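The conversion step can be pictured with a small hand-written sketch. This is not how Sparqlify or LODRefine work internally; it only illustrates the idea of turning tabular rows into RDF triples. The namespace `http://example.org/cog/`, the column names and the sample row are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical namespace; the authors' actual vocabulary is not given in the slides.
BASE = "http://example.org/cog/"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(csv_text):
    """Turn CSV rows with 'symbol' and 'name' columns into N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        symbol = row["symbol"]
        subject = "<" + BASE + "gene/" + symbol + ">"
        triples.append(subject + " <" + BASE + 'symbol> "' + escape_literal(symbol) + '" .')
        triples.append(subject + " <" + BASE + 'name> "' + escape_literal(row["name"]) + '" .')
    return triples


# Made-up sample row, not taken from any of the 11 datasets.
sample = "symbol,name\nGENEA,example gene A\n"
ntriples = rows_to_ntriples(sample)
for line in ntriples:
    print(line)
```

Each input row yields one triple per column, all sharing the same subject, which is what makes the later interlinking by gene symbol possible.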
  9. Datasets Interlinking
     ●  Each gene was given a unique identifier based on its gene symbol to create a URI (Uniform Resource Identifier): a single, globally re-usable resource.
     ●  Common element across datasets: the gene symbol
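A minimal sketch of symbol-based URI minting, under stated assumptions: the base namespace is hypothetical, and the normalization (trimming whitespace and upper-casing) is a choice made here for illustration, not something the slides describe.

```python
from urllib.parse import quote

# Hypothetical base namespace, chosen only for this example.
GENE_BASE = "http://example.org/cog/gene/"


def gene_uri(symbol):
    """Build a URI from a gene symbol.

    Assumption: symbols are compared case-insensitively after trimming,
    so they are normalized before being percent-encoded.
    """
    normalized = symbol.strip().upper()
    return GENE_BASE + quote(normalized, safe="")


# Two records from different (made-up) datasets that mention the same gene.
record_a = {"symbol": "FMR2", "source": "dataset A"}
record_b = {"symbol": " fmr2 ", "source": "dataset B"}

same_resource = gene_uri(record_a["symbol"]) == gene_uri(record_b["symbol"])
print(gene_uri(record_a["symbol"]), same_resource)
```

Because both records resolve to the same URI, statements about the gene from either dataset attach to one resource once the data are loaded into the same store.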
  10. Datasets Querying and Initial Results
      •  Integrated datasets available at [URL missing] with the graph name [name missing]
      •  Research Question:
         o  “Which genes are involved in determining cognition and have changed during primate evolution?”
      •  Datasets:
         o  ID-TFs
            §  Transcription Factors associated with Intelligence Disorder
         o  Human Positive Selection Candidates
            §  dN/dS ratio: number of mutations that lead to an amino acid sequence change vs. number of mutations that do not lead to such a change
            §  The higher this ratio, the faster the protein is evolving.
            §  Genes with a dN/dS ratio > 1 are considered to evolve under positive selection.
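The dN/dS criterion can be written out as a short sketch. The function names and the input numbers are assumptions for illustration; only the threshold of 1 and the FMR2 value of 1.33 come from the slides.

```python
import math


def dn_ds_ratio(dn, ds):
    """Ratio of amino-acid-changing (dN) to non-changing (dS) mutation measures."""
    if ds == 0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return dn / ds


def is_positive_selection_candidate(ratio, threshold=1.0):
    """Apply the rule from the slide: a ratio above 1 suggests positive selection."""
    return ratio > threshold


# Made-up dN and dS values, used only to exercise the functions.
example_ratio = dn_ds_ratio(0.04, 0.03)
print(round(example_ratio, 2), is_positive_selection_candidate(example_ratio))

# The value reported in the slides for FMR2.
print(is_positive_selection_candidate(1.33))
```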
  11. Initial Results
      SPARQL Query:
          SELECT ?symbol1 ?dnbydns
          FROM
          WHERE {
            # genes from the positive selection dataset, with their dN/dS value
            ?gene1 rdf:type cog:gene .
            ?gene1 go:symbol ?symbol1 .
            ?gene1 cog:dnDs ?dnbydns .
            # genes from the second dataset, identified by cog:nsid
            ?gene2 rdf:type cog:gene .
            ?gene2 go:symbol ?symbol2 .
            ?gene2 cog:nsid ?ns .
            # keep only genes whose symbol appears in both datasets
            FILTER (?symbol1 = ?symbol2)
          }
      Result: FMR2, dN/dS = 1.33
      •  FMR2 has changed significantly more during primate evolution and might be under positive selection in humans.
      •  Patients with mutations in FMR2 have been reported to show mental retardation and autistic behavior*.
      * M. Bensaid, M. Melko, E.G. Bechara, L. Davidovic, A. Berretta, M.V. Catania, J. Gecz, E. Lalli and B. Bardoni. FRAXE-associated mental retardation protein (FMR2) is an RNA binding protein with high affinity for G-quartet RNA forming structure. Nucleic Acids Research, 2009.
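The join that this query performs can be mirrored in plain Python as a rough illustration. The in-memory records stand in for the two datasets; apart from the FMR2 value of 1.33 reported in the slides, every symbol, value and identifier is a made-up placeholder.

```python
# Stand-in for the positive selection dataset (symbol plus dN/dS value).
selection_rows = [
    {"symbol": "FMR2", "dnDs": 1.33},   # value reported in the slides
    {"symbol": "GENEX", "dnDs": 0.42},  # made-up placeholder
]

# Stand-in for the second dataset; the 'nsid' values are placeholders.
id_tf_rows = [
    {"symbol": "FMR2", "nsid": "placeholder-1"},
    {"symbol": "GENEY", "nsid": "placeholder-2"},
]


def join_on_symbol(selection, id_tfs):
    """Return (symbol, dN/dS) for genes whose symbol occurs in both datasets."""
    shared_symbols = {row["symbol"] for row in id_tfs}
    return [
        (row["symbol"], row["dnDs"])
        for row in selection
        if row["symbol"] in shared_symbols
    ]


matches = join_on_symbol(selection_rows, id_tf_rows)
print(matches)
```

Unlike the SPARQL query, this sketch returns each gene at most once per row of the first dataset, even if the second dataset lists the symbol several times.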
  12. Conclusions & Future Work
      ●  Presented preliminary work and ideas on publishing biomedical datasets as Linked Data, and demonstrated its use in analyzing the evolution of cognition.
      Future Work:
      ●  Perform complex queries
      ●  Answer more research questions
      ●  Add more datasets
      ●  Interlink with external datasets
      ●  Create a user interface
  13. Thank You. Questions?