How do we know what we don’t know:  Using the Neuroscience Information Framework to reveal knowledge gaps
Presentation at Tools for Integrating and Planning Experiments in Neuroscience-UCLA March 11, 2014

Published in: Education, Technology
  • Cue movie after this. It would be nice to pull this together visually with an animated view.
  • Current model: Scholars are producing multiple types of research objects; each type goes to its own infrastructure, with little coordination among them.
    The consumer is no longer exclusively a scholar: the general public wants access to what it pays for, and automated agents are accessing the content first and mining it.
  • Transcript

    1. How do we know what we don’t know: Using the Neuroscience Information Framework to reveal knowledge gaps. Maryann E. Martone, Ph.D., University of California, San Diego. Tools for Integrating and Planning Experiments in Neuroscience-UCLA, March 11, 2014
    2. A data integration problem. We say this to each other all the time, but we set up systems for scholarly advancement and communication that are the antithesis of integration. Multiple data types; multiple scales; multiple databases: whole brain data (20 µm microscopic MRI); mosaic LM images (1 GB+); conventional LM images; individual cell morphologies; EM volumes & reconstructions; solved molecular structures. No single technology serves these all equally well.
    3. • NIF is an initiative of the NIH Blueprint consortium of institutes (http://neuinfo.org)
         – What types of resources (data, tools, materials, services) are available to the neuroscience community?
         – How many are there?
         – What domains do they cover? What domains do they not cover?
         – Where are they?
           • Web sites
           • Databases
           • Literature
           • Supplementary material
           • PDF files
           • Desk drawers
         – Who uses them?
         – Who creates them?
         – How can we find them?
         – How can we make them better in the future?
    4. Old Model: Single type of content; single mode of distribution. Diagram labels: Scholar; Library; Scholar; Publisher. Systems for cataloging, standards, and citation in place
    5. Diagram labels: Scholar; Consumer; Libraries; Data Repositories; Code Repositories; Community databases/platforms; OA; Curators; Social Networks; Peer Reviewers; Narrative; Workflows; Data; Models; Multimedia; Nanopublications; Code
    6. The duality of modern scholarship. Observation: Those who build information systems from the machine side don’t understand the requirements of the human very well. Those who build information systems from the human side don’t understand the requirements of machines very well. Scholarship requires the ability to cite and track usage of scholarly artifacts. In our current mode of working, there is no way to track artifacts as they move through the ecosystem; no way to incrementally add human expertise; no way to look across the entirety
    7. Whither neuroscience information? ∞. What is easily machine processable and accessible. What is potentially knowable. What is known: Literature, images, human knowledge. Unstructured; natural language processing, entity recognition, image processing and analysis; paywalls; communication. Abstracts vs full text vs tables etc
    8. NIF: A New Type of Entity for New Modes of Scientific Dissemination
       • NIF’s mission is to maximize the awareness of, access to and utility of research resources produced worldwide to enable better science and promote efficient use
         – NIF unites neuroscience information without respect to domain, funding agency, institute or community
         – NIF is like a “PubMed” for all biomedical resources and a “PubMed Central” for databases
         – Makes them searchable from a single interface
         – Practical and cost-effective; tries to be sensible
         – Learned a lot about effective data sharing
       The Neuroscience Information Framework is an initiative of the NIH Blueprint consortium of institutes. http://neuinfo.org
    9. Surveying the resource landscape
    10. Data Federation: Deep search. http://neuinfo.org With the thousands of databases and other information sources available, simple descriptive metadata will not suffice
    11. A unified framework for neuroscience. Query: Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn”. NIF queries > 200 databases; ~400 million records
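The query on slide 11 suggests that a single search term is expanded into its synonyms before it is sent to the federated sources. A minimal sketch of that idea follows. The `SYNONYMS` table and the `expand_query` helper are hypothetical names introduced for illustration only; in NIF the synonyms would come from the NIFSTD ontology rather than a hard-coded dictionary.

```python
# Hypothetical synonym table; the slides indicate NIF draws synonyms from NIFSTD.
SYNONYMS = {
    "hippocampus": ["Cornu Ammonis", "Ammon's horn"],
}


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(term, synonyms=SYNONYMS):
    """Return the term OR-ed with its known synonyms.

    Terms without an entry in the synonym table pass through unchanged.
    """
    variants = [term] + synonyms.get(term.lower(), [])
    return " OR ".join(quote(v) for v in variants)


if __name__ == "__main__":
    print(expand_query("Hippocampus"))
```

With the toy table above, `expand_query("Hippocampus")` reproduces the OR query shown on the slide, while a term the table does not know is returned as typed.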
    12. NIF Semantic Framework: NIFSTD ontology
        • NIF uses ontologies to help navigate across and unify neuroscience resources
        • Ontologies are built from community ontologies → cross integration with other domains
        NIFSTD: Organism; NS Function; Molecule; Investigation; Subcellular structure; Macromolecule; Gene; Molecule Descriptors; Techniques; Reagent; Protocols; Cell; Resource; Instrument; Dysfunction; Quality; Anatomical Structure
    13. Bringing knowledge to data: Ontologies as framework. There is little obvious connection between data sets taken at different scales using different microscopies without an explicit representation of the biological objects that the data represent. Diagram labels: Purkinje Cell; Axon Terminal; Axon; Dendritic Tree; Dendritic Spine; Dendrite; Cell body; Cerebellar cortex
    14. Neurolex: > 1 million triples. Dr. Yi Zeng: Chinese neural knowledge base. NIF Cell Graph. This is your brain on computers
    15. Ontologies as a data integration framework
        • NIF Connectivity: 7 databases containing connectivity primary data or claims from literature on connectivity between brain regions
          • Brain Architecture Management System (rodent)
          • Temporal lobe.com (rodent)
          • Connectome Wiki (human)
          • Brain Maps (various)
          • CoCoMac (primate cortex)
          • UCLA Multimodal database (Human fMRI)
          • Avian Brain Connectivity Database (Bird)
        • Total: 1800 unique brain terms (excluding Avian)
        • Number of exact terms used in > 1 database: 42
        • Number of synonym matches: 99
        • Number of 1st order partonomy matches: 385
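Slide 15 counts three kinds of agreement between the term lists of different connectivity databases: exact matches, synonym matches, and first-order partonomy matches. The sketch below shows one way such a tally could be computed. Everything in it is an assumption made for illustration: the two toy term sets, the synonym and part-of tables, and the function names are not taken from NIF, and the slides do not say how NIF performed the comparison.

```python
from itertools import product

# Hypothetical toy inputs: region terms from two databases, plus small
# synonym and part-of tables standing in for an ontology.
DB_A = {"hippocampus", "dentate gyrus", "neostriatum"}
DB_B = {"hippocampus", "ammon's horn", "caudate nucleus", "thalamus"}
SYNONYMS = {"ammon's horn": "hippocampus"}   # variant -> preferred label
PART_OF = {                                   # child -> direct parent
    "dentate gyrus": "hippocampus",
    "caudate nucleus": "neostriatum",
}


def canonical(term):
    """Map a term to its preferred label, if the synonym table knows it."""
    return SYNONYMS.get(term, term)


def match_type(a, b):
    """Classify a pair of terms as 'exact', 'synonym', 'partonomy' or None."""
    if a == b:
        return "exact"
    ca, cb = canonical(a), canonical(b)
    if ca == cb:
        return "synonym"
    # First-order partonomy: one term is the direct parent of the other.
    if PART_OF.get(ca) == cb or PART_OF.get(cb) == ca:
        return "partonomy"
    return None


def count_matches(db_a, db_b):
    """Tally match types over every cross-database pair of terms."""
    counts = {"exact": 0, "synonym": 0, "partonomy": 0}
    for a, b in product(db_a, db_b):
        kind = match_type(a, b)
        if kind is not None:
            counts[kind] += 1
    return counts


if __name__ == "__main__":
    print(count_matches(DB_A, DB_B))
```

The checks are ordered from strictest to loosest, so each pair is counted once under the strongest relation that holds.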
    16. Open World-Closed World: Mapping the knowledge-data space. NIF lets us ask: where isn’t there data? What isn’t studied? Why? Figure labels: Data Sources; legend 0, 1-10, 11-100, >101
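The figure on slide 16 appears to colour each cell of a region-by-source grid by how many records exist, which makes empty cells visible. A minimal sketch of that bookkeeping follows, under stated assumptions: the region list, the source names, and the record counts are invented, the helper names are hypothetical, and the top legend bin (printed as ">101" on the slide) is treated here as "more than 100".

```python
def legend_bin(n_records):
    """Map a record count to a legend bin (top bin assumed to mean > 100)."""
    if n_records < 0:
        raise ValueError("record counts cannot be negative")
    if n_records == 0:
        return "0"
    if n_records <= 10:
        return "1-10"
    if n_records <= 100:
        return "11-100"
    return ">100"


def knowledge_gaps(regions, sources, counts):
    """List every (region, source) pair with no records.

    Because the regions come from a fixed list (the closed world of the
    ontology), a pair that is missing from `counts` is reported as a gap
    rather than silently skipped.
    """
    return [
        (region, source)
        for region in regions
        for source in sources
        if counts.get((region, source), 0) == 0
    ]


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    regions = ["forebrain", "midbrain"]
    sources = ["source 1", "source 2"]
    counts = {("forebrain", "source 1"): 250, ("midbrain", "source 1"): 7}
    for pair in knowledge_gaps(regions, sources, counts):
        print("no data:", pair)
```

Enumerating the full grid, rather than only the pairs that returned results, is what lets the question "where isn't there data?" be answered at all.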
    17. Neuroimaging Data-Knowledge Space? Figure labels: Forebrain; Midbrain; Hindbrain; Data Sources; legend 0, 1-10, 11-100, >101
    18. “The Data Homunculus”. Funding drives representation in the data space
    19. Neurolex.org: A computable lexicon for neuroscience. http://neurolex.org Larson et al, Frontiers in Neuroinformatics, 2013
        • Semantic MediaWiki
        • Provide a simple interface for defining the concepts required
        • Light weight semantics
        • Community based:
          • Anyone can contribute their terms, concepts, things
          • Anyone can edit
          • Anyone can link
        • Accessible: searched by Google
        • Growing into a significant knowledge base for neuroscience
        • 25,000 concepts
        • 200,000 edits
        • 150 contributors
        Demo D03
    20. Neurolex Structural Lexicon: Defining brain parts
    21. Structural Lexicon: The scourge of neuroanatomical nomenclature
        • Problem: Neuroscientists have myriad ways to parcellate the brain
          – Brains are made up of networks that do not respect gross anatomical boundaries
          – Partonomies are generally along multiple axes:
            • Volumetric (species dependent): NeuroNames
            • Functional (Swanson)
            • Developmental
            • Cytoarchitectural
          – Partonomies are often weak
            • Arbitrary but defensible
        Program on Ontologies for Neural Structures, INCF: creating a computable lexicon for neural structures
    22. Neuroanatomy without borders. Brainmaps.org
    23. Structural Lexicon in Neurolex
        Brain Region (e.g., Hippocampus, Dentate gyrus)
          • Trans-species
          • “Stateless”, i.e. no universal defining criteria
          • General structures and partonomies based on Neuroanatomy 101
        Partially overlaps
        Brain Parcel (e.g., Hippocampus of ABA2009)
          • Species specific
          • Specific reference
          • Defining criteria
          • Sometimes partonomy; sometimes not
    24. “When I use a word...it means what I choose it to mean”
    25. Neurolex Neuron
        • Led by Dr. Gordon Shepherd
        • > 30 world wide experts
        • Simple set of properties
        • Consistent naming scheme
        • Integrated with Structural Lexicon
        • Used for annotation in other resources, e.g., NeuroElectro
    26. “You have broken links”. Red Links: Information is missing (or misspelled)
    27. Location of Cell Soma; Location of dendrites; Location of local axon arbor
    28. Analysis of Red Links in the Neuron Registry
        • Analysis of red links tells us where instructions aren’t clear, the information isn’t available, or the model is insufficient
          – Conceptualization not clear
            • What is the most important thing about local axon terminals?
          – Tool doesn’t capture all details
        Social networks and community sites let us learn things from the collective behavior of contributors → INCF/HBP Knowledge Space
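A red link, as slides 26 to 28 describe it, is a wiki link whose target page does not exist, either because the information is missing or because the name is misspelled. The sketch below tallies red links per property, which is one way to see which property contributors struggle with most. The page names, the property values, and the helper functions are all hypothetical placeholders; only the three property names come from slide 27, and the slides do not describe how the actual analysis was implemented.

```python
from collections import Counter

# Hypothetical wiki state: which pages exist, and what each neuron page links to.
EXISTING_PAGES = {"Neuron A", "Neuron B", "Region X", "Region Y"}
NEURON_PROPERTIES = {
    "Neuron A": {
        "Location of cell soma": "Region X",
        "Location of dendrites": "Region X",
        "Location of local axon arbor": "Region Z",       # page never created
    },
    "Neuron B": {
        "Location of cell soma": "Region Y",
        "Location of dendrites": "Regoin Y",              # misspelled target
        "Location of local axon arbor": "Unknown layer",  # page never created
    },
}


def red_links(properties, existing_pages):
    """Yield (page, property, target) for each link to a non-existent page."""
    for page, props in properties.items():
        for prop, target in props.items():
            if target not in existing_pages:
                yield page, prop, target


def red_links_by_property(properties, existing_pages):
    """Count red links per property to show where contributors struggle."""
    return Counter(prop for _, prop, _ in red_links(properties, existing_pages))


if __name__ == "__main__":
    tally = red_links_by_property(NEURON_PROPERTIES, EXISTING_PAGES)
    for prop, n in tally.most_common():
        print(f"{prop}: {n} red link(s)")
```

A property that collects many red links across pages is a candidate for unclear instructions or an insufficient model, which is the reading slide 28 proposes.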
    29. Re-inventing Narrative: Do I have to write in triples?
        • Not all entities are well-enough specified that they lend themselves to deep annotation
          – And, as we’ve seen in the previous example, we probably don’t want to pretend that they are
        • But…sometimes they are
          – Semantic annotation of research papers to make them “machine-interpretable” has been a goal of many
          – Can we update the way that authors produce manuscripts so that they are easier to process?
        • → NIF pilot project: Semantic annotation of entities that researchers would understand
    30. The problem: How many papers were published that used my antibody? Paz et al, J Neurosci, 2010
    31. Now, go find the antibody http://www.millipore.com/searchsummary.do?tabValue=&q=gfap Nov 12,
    32. A catalog number is not a persistent identifier. Jan 15, 2014
    33. The problem is general across multiple resource types and disciplines. Vasilevsky et al, PeerJ 2013
    34. If we can’t do it, neither can the robot
        • Automated text mining tools were not deployed on this problem, because too few antibodies could be identified automatically
        • We are asking authors to change their ways, instead!
        • Almost all antibodies were identified with the company name, city and state, but the information is useless if the goal is to identify the antibody used
    35. The Resource Identification Initiative
        • NIF, FORCE11 and partners
          – Led by Anita Bandrowski and Melissa Haendel
        • Identify 3 types of research resources
          – Antibodies
          – Genetically modified animals
          – Software
        http://force11.org/Resource_identification_initiative
    36. Musings: You can’t do that!
        • Two powerful trends in the 21st century:
          – Networking machines and networking people
          – Moving science into a machine-accessible platform has been a challenge
            • Mechanistically
            • Culturally
            • Sociologically
        • “A foolish consistency is the hobgoblin of little minds”
          – When you have a lot of data and information in an accessible form, we can start to look at actual practices and trends
          – Focusing on the “negative space”, i.e., what we don’t know, reveals glimpses into sources of bias and confusion
        • When we scratch the surface of science, we find uncertainty and confusion
          – Not a failure, but an opportunity
            • Sometimes we can be precise, i.e., which reagents we used
            • Sometimes, we can’t → so we should set up systems so we can learn from that
    37. Next Steps: Neurolex to Knowledge Space. Diagram labels: Data Space; Laboratory Space; Knowledge Space; BAMS; Lexicon; Encyclopedia
    38. Anatomist ↔ Informaticist
    39. What is the “completeness” of our knowledge? Figure labels: Neocortex; Olfactory bulb; Neostriatum; Cochlear nucleus. All neurons with cell bodies in the same brain region are grouped together. Properties in Neurolex: • Simple set of properties that can be reasonably supplied with a minimal amount of effort
    40. The case of the meanest journal in the world, coincidentally having the lowest retraction rate
    41. The landscape is messy, diverse and evolving: Data to Knowledge – Knowledge to Data
        NIF favors a hybrid, tiered, federated system
        • Domain knowledge
          – Ontologies
        • Claims, models and observations
          – Virtuoso RDF triples
          – Model repositories
        • Data
          – Data federation
          – Spatial data
          – Workflows
        • Narrative
          – Full text access
        Diagram labels: Neuron; Brain part; Disease; Organism; Gene; Technique; People
        Example claims: Caudate projects to Snpc; Grm1 is upregulated in chronic cocaine; Betz cells degenerate in ALS
        NIF provides the tentacles that connect the pieces: a new type of entity for 21st century science
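Slide 41 lists three example claims and says that claims are held as RDF triples in Virtuoso. The sketch below only illustrates the subject-predicate-object shape of such claims and a wildcard lookup over them. It uses plain Python tuples, not Virtuoso or any RDF library, and the `Triple` type, the predicate strings, and the `match` helper are assumptions introduced here rather than NIF's actual schema.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One claim in subject-predicate-object form."""
    subject: str
    predicate: str
    obj: str


# The three example claims from the slide, split into triples.
CLAIMS = [
    Triple("Caudate", "projects to", "Snpc"),
    Triple("Grm1", "is upregulated in", "chronic cocaine"),
    Triple("Betz cells", "degenerate in", "ALS"),
]


def match(claims, subject=None, predicate=None, obj=None):
    """Return the claims that agree with every field that is not None.

    A field left as None acts as a wildcard, similar to a variable in a
    triple pattern.
    """
    wanted = (subject, predicate, obj)
    return [
        claim
        for claim in claims
        if all(w is None or w == value for w, value in zip(wanted, claim))
    ]


if __name__ == "__main__":
    for claim in match(CLAIMS, predicate="projects to"):
        print(claim.subject, claim.predicate, claim.obj)
```

Keeping every claim in the same three-part shape is what allows a brain part, a gene, and a cell type to be queried through one interface, which is the connecting role the slide assigns to NIF.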
    42. Data about the subthalamus. http://neuinfo.org
