Navigating the Neuroscience Data Landscape
Maryann Martone, Ph.D.
University of California, San Diego
“Neural Choreography”
“A grand challenge in neuroscience is to elucidate brain function in relation
to its multiple layers of organization that operate at different spatial
and temporal scales. Central to this effort is tackling “neural
choreography” -- the integrated functioning of neurons into brain
circuits--their spatial organization, local and long-distance connections,
their temporal orchestration, and their dynamic features. Neural
choreography cannot be understood via a purely reductionist approach.
Rather, it entails the convergent use of analytical and synthetic tools to
gather, analyze and mine information from each level of analysis, and
capture the emergence of new layers of function (or dysfunction) as we
move from studying genes and proteins, to cells, circuits, thought, and
behavior....
However, the neuroscience community is not yet fully engaged in exploiting
the rich array of data currently available, nor is it adequately poised to
capitalize on the forthcoming data explosion.”
Akil et al., Science, Feb 11, 2011
 NIF is an initiative of the NIH Blueprint consortium of institutes
 What types of resources (data, tools, materials, services) are
available to the neuroscience community?
 How many are there?
 What domains do they cover? What domains do they not cover?
 Where are they?
 Web sites
 Databases
 Literature
 Supplementary material
 Who uses them?
 Who creates them?
 How can we find them?
 How can we make them better in the future? http://neuinfo.org
• PDF files
• Desk drawers
How many resources are there?
•NIF Registry: A catalog of neuroscience-relevant resources
•> 4800 currently listed
•> 2000 databases
•And we are finding more every day
The Neuroscience Information Framework: Discovery and
utilization of web-based resources for neuroscience
 A portal for finding and
using neuroscience
resources
 A consistent framework for
describing resources
 Provides simultaneous
search of multiple types of
information, organized by
category
 Supported by an expansive
ontology for neuroscience
 Utilizes advanced
technologies to search the
“hidden web”
http://neuinfo.org
UCSD, Yale, CalTech, George Mason, Washington Univ.
Supported by NIH Blueprint
Literature
Database
Federation
Registry
What are the connections of the
hippocampus?
Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn”
Query expansion: Synonyms and related concepts
Boolean queries
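The query expansion above can be sketched roughly as follows. This is only an illustration, not NIF's implementation: the `SYNONYMS` table and the `expand_query` function are hypothetical names, and the synonyms are just the ones shown in the example query.

```python
# Hypothetical synonym table; the entries come from the example query above.
SYNONYMS = {
    "hippocampus": ["Cornu Ammonis", "Ammon's horn"],
}


def expand_query(term: str) -> str:
    """Return a Boolean OR query over a term and its known synonyms."""
    variants = [term] + SYNONYMS.get(term.lower(), [])
    # Quote multi-word variants so they are matched as phrases.
    quoted = [f'"{v}"' if " " in v else v for v in variants]
    return " OR ".join(quoted)


print(expand_query("Hippocampus"))
```

A term with no entry in the table is returned unchanged, so the expansion step can be applied to every query without special cases.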
Data sources
categorized by
“data type” and
level of nervous
system
Common views
across multiple
sources
Tutorials for using
full resource when
getting there from
NIF
Link back to
record in
original
source
Results are organized within a common
framework
Connects to
Synapsed with
Synapsed by
Input region
innervates
Axon innervates
Projects to
Cellular contact
Subcellular contact
Source site
Target site
Each resource implements a different, though related model;
systems are complex and difficult to learn, in many cases
The scourge of neuroanatomical
nomenclature
•NIF Connectivity: 6 databases containing connectivity primary data or claims
•Brain Architecture Management System (rodent)
•ConnectomeWiki (human)
•Brain Maps (various)
•CoCoMac (primate cortex)
•UCLA Multimodal database (Human fMRI)
•Avian Brain Connectivity Database (Bird)
•Total: 1800 unique brain terms (excluding Avian)
•Number of exact terms used in > 1 database: 42
•Number of synonym matches: 99
•Number of partonomy matches: 385
The INCF is working with NIF to develop semantic and spatial strategies for translating
anatomy across information systems
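The three kinds of match counted above (exact term, synonym, partonomy) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the database names, term lists, and lookup tables are toy data, not the contents of the six databases, and the sketch counts matching term pairs rather than unique terms.

```python
from itertools import combinations
from typing import Dict, Optional, Set

# Toy term lists for three hypothetical databases (not the real contents).
DB_TERMS: Dict[str, Set[str]] = {
    "db_a": {"hippocampus", "ca1"},
    "db_b": {"ammon's horn", "hippocampus"},
    "db_c": {"dentate gyrus"},
}
# Hypothetical synonym and part-of tables.
SYNONYM_OF = {"ammon's horn": "hippocampus"}
PART_OF = {"ca1": "hippocampus", "dentate gyrus": "hippocampus"}


def classify_match(a: str, b: str) -> Optional[str]:
    """Classify how two terms relate: exact, synonym, partonomy, or None."""
    if a == b:
        return "exact"
    # Map each term to its preferred label before comparing further.
    ca, cb = SYNONYM_OF.get(a, a), SYNONYM_OF.get(b, b)
    if ca == cb:
        return "synonym"
    if PART_OF.get(ca) == cb or PART_OF.get(cb) == ca:
        return "partonomy"
    return None


def count_matches(db_terms: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count cross-database term pairs by match type."""
    counts = {"exact": 0, "synonym": 0, "partonomy": 0}
    for (_, terms_a), (_, terms_b) in combinations(db_terms.items(), 2):
        for a in terms_a:
            for b in terms_b:
                kind = classify_match(a, b)
                if kind:
                    counts[kind] += 1
    return counts


print(count_matches(DB_TERMS))
```

Even in this toy case, most links between databases come from synonym and part-of relationships rather than identical strings, which is the pattern the counts above show.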
What is an ontology?
Brain
Cerebellum
Purkinje Cell Layer
Purkinje cell
neuron
has a
has a
has a
is a
 Ontology: an explicit, formal representation
of concepts and the relationships among them
within a particular domain that expresses
human knowledge in a machine readable
form
 Branch of philosophy: a theory of what is
 e.g., Gene ontologies
 Provide universals for navigating across
different data sources
 Semantic “index”
 Provide the basis for concept-based
queries to probe and mine data
 Perform reasoning
 Link data through relationships not just one-
to-one mappings
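A minimal sketch of how relationships support reasoning, using the Brain / Cerebellum / Purkinje cell example from this slide. The triple store and function names are hypothetical, and the relation names `has_part` and `is_a` are stand-ins for the "has a" and "is a" labels in the diagram; real ontologies are normally expressed in a language such as OWL rather than as Python lists.

```python
# Triples (subject, relation, object) taken from the slide's diagram.
TRIPLES = [
    ("Brain", "has_part", "Cerebellum"),
    ("Cerebellum", "has_part", "Purkinje cell layer"),
    ("Purkinje cell layer", "has_part", "Purkinje cell"),
    ("Purkinje cell", "is_a", "Neuron"),
]


def objects(subject, relation):
    """Direct objects of `relation` for `subject`."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def all_parts(subject):
    """All direct and indirect parts, treating has_part as transitive."""
    found = []
    stack = objects(subject, "has_part")
    while stack:
        part = stack.pop()
        if part not in found:
            found.append(part)
            stack.extend(objects(part, "has_part"))
    return found


def has_part_of_type(subject, type_name):
    """Simple inference: does `subject` contain anything that is_a `type_name`?"""
    return any(type_name in objects(p, "is_a") for p in all_parts(subject))


print(all_parts("Brain"))
print(has_part_of_type("Brain", "Neuron"))
```

No triple states directly that the brain contains neurons; the answer follows from chaining the stored relationships, which is what distinguishes this from a one-to-one mapping table.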
PONS program
 Structural Lexicon Taskforce
 Concentrate on Human, Non-human
Primate, Rat and Mouse
 Define structural concepts from level of
organ to macromolecular complexes
 Provide a set of criteria by which
structures can be identified
 Neuronal Registry Taskforce
 Establish conventions for naming new
types of neurons
 Establish a standard set of properties to
define neurons
 Create a Neuron Registry for registering
new types of neurons
 Deployment and representation (Alan
Ruttenberg)
 Brought together ontologists working
across scales
Courtesy of Chris Mungall, Lawrence
Berkeley Labs
***Not about imposing a
single view of anatomy;
about making concepts
computable and being
able to translate among
views
NeuroLex Wiki
http://neurolex.org Stephen Larson
•Provide a simple framework
for defining the concepts
required
•Cell, Part of brain,
subcellular structure,
molecule
•Community based:
•Avian neuroanatomy
•Fly neurons (England)
•Neuroimaging terms
•Brain regions identified
by text mining
•Creating a computable
index for neuroscience data
•INCF working to coordinate
Wiki efforts underway at
Allen Institute, Blue Brain
and Neurolex
Demo D03
Comparison of traffic to NIF Portal vs Neurolex
5000 hits 15000 hits
Wiki is readily indexed by search engines
Neurons in Neurolex
 INCF building a
knowledge base of
neurons and their
properties via the
Neurolex Wiki
 Led by Dr. Gordon
Shepherd
 Consistent and
parseable naming
scheme
 Knowledge is readily
accessible, editable
and computable
Stephen Larson
NIF data federation
Images
Drugs
Antibodies
Grants
Pathways
Animals
Figure: percentage of data records per data type (labels: connectivity, brain activation foci, microarray; one slice labeled 98%)
Primary data, secondary data, claims,
repositories
Recently added: BioNOT literature
mining tool; Retraction Watch blog
What do you mean by data?
Databases come in many shapes and sizes
 Primary data:
 Data available for
reanalysis, e.g., microarray data
sets from GEO; brain images from
XNAT; microscopic images
(CCDB/CIL)
 Secondary data
 Data features extracted through
data processing and sometimes
normalization, e.g., brain structure
volumes (IBVD), gene expression
levels (Allen Brain Atlas); brain
connectivity statements (BAMS)
 Tertiary data
 Claims and assertions about the
meaning of data
 E.g., gene
upregulation/downregulation
 Registries:
 Metadata
 Pointers to data sets or
materials stored elsewhere
 Data aggregators
 Aggregate data of the same
type from multiple
sources, e.g., Cell Image
Library, SUMSdb, Brede
 Single source
 Data acquired within a single
context, e.g., Allen Brain Atlas
Figure: data sources by brain region (Striatum, Hypothalamus, Olfactory bulb, Cerebral cortex, Brain)
Vadim Astakhov, Kepler Workflow Engine
NIF landscape analysis
How much of the landscape do we have?
Query for “reference” brain structures and their parts in NIF Connectivity database
NIF Reports:
Male vs Female
Gender bias
NIF can start to
answer interesting
questions about
neuroscience
research, not just
about neuroscience
Embracing duplication: Data Mash ups
•~300 PMIDs were common between Brede and SUMSdb
•Same information; value added
Same data; different aspects
Same data: different analysis
Chronic vs acute
morphine in striatum
 Drug Related Gene database:
extracted statements from
figures, tables and supplementary
data from published articles
 Gemma: Reanalyzed microarray
results from GEO using different
algorithms
 Both provide results of increased
or decreased expression as a
function of experimental
paradigm
 4 strains of mice
 3 conditions: chronic morphine,
acute morphine, saline
Mined NIF for all references to GEO
IDs: found a small number where the
same dataset was represented in two
or more databases
http://www.chibi.ubc.ca/Gemma/home.html
How easy was it to compare?
 Gemma: Gene ID + Gene Symbol
 DRG: Gene name + Probe ID
 Gemma: Increased expression/decreased expression
 DRG: Increased expression/decreased expression
 But...Gemma presented results relative to baseline chronic morphine; DRG with
respect to saline, so direction of change is opposite in the 2 databases
 Analysis:
 1370 statements from Gemma regarding gene expression as a function of
chronic morphine
 617 were consistent with DRG; over half of the claims of the paper were not
confirmed in this analysis
 Results for 1 gene were opposite in DRG and Gemma
 45 did not have enough information provided in the paper to make a judgment
NIF annotation
standard
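The Gemma and DRG comparison above depends on expressing both sets of results against the same baseline before comparing them. A rough sketch of that step is below. The gene symbols and directions are invented placeholders, not values from either database, and the function names are hypothetical. The sketch also assumes the two sources already share a gene identifier, which the slide notes was not the case.

```python
# Invented example records; gene symbols and directions are placeholders.
gemma = {"GeneA": "increased", "GeneB": "decreased", "GeneC": "increased"}
drg = {"GeneA": "decreased", "GeneB": "decreased", "GeneD": "increased"}

FLIP = {"increased": "decreased", "decreased": "increased"}


def to_saline_baseline(results):
    """Re-express results reported against chronic morphine relative to saline."""
    return {gene: FLIP[direction] for gene, direction in results.items()}


def compare(a, b):
    """Split genes shared by both sources into consistent and opposite sets."""
    shared = a.keys() & b.keys()
    consistent = {g for g in shared if a[g] == b[g]}
    return consistent, shared - consistent


consistent, opposite = compare(to_saline_baseline(gemma), drg)
print(sorted(consistent), sorted(opposite))
```

Without the baseline flip, the two example sources would appear to disagree on GeneA and agree on GeneB, the reverse of the result after normalization.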
Grabbing the long tail of small
data
 Analysis of NIF shows
multiple databases with
similar scope and content
 Many contain partially
overlapping data
 Data “flows” from one
resource to the next
 Data is
reinterpreted, reanalyzed
or added to
 When does it become
something else?
Phases of NIF
 2006-2008: A survey of what was out there
 2008-2009: Strategy for resource discovery
 NIF Registry vs NIF data federation
 Ingestion of data contained within different technology
platforms, e.g., XML vs relational vs RDF
 Effective search across semantically diverse sources
 NIFSTD ontologies
 2009-2011: Strategy for data integration
 Unified views across common sources
 Mapping of content to NIF vocabularies
 2011-present: Data analytics
 Uniform external data references
Data, not just stories about them!
47/53 major preclinical
published cancer studies
could not be replicated
 “The scientific community
assumes that the claims in a
preclinical study can be taken
at face value-that although
there might be some errors in
detail, the main message of
the paper can be relied on and
the data will, for the most
part, stand the test of time.
Unfortunately, this is not
always the case.”
 Getting data out sooner in a
form where they can be exposed
to many eyes and many
analyses, and easily compared,
may allow us to expose errors
and develop better metrics to
evaluate the validity of data
Begley and Ellis, Nature 483, 531 (29 March 2012)
 “There are no guidelines that
require all data sets to be
reported in a paper; often,
original data are removed
during the peer review and
publication process.”
A global view of data
 You (and the machine) have to be able to
find it
 Accessible through the web
 Annotations
 You have to be able to use it
 Data type specified and in a usable form
 You have to know what the data mean
 Some semantics
 Context: Experimental metadata
 Provenance: Where did the data come from?
Reporting neuroscience data within a consistent framework helps enormously
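One way to picture the requirements above is as a minimal record that travels with each data set. The sketch below is an assumption for illustration only: `DatasetRecord`, its field names, and the example URL are hypothetical and are not a NIF schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DatasetRecord:
    """Hypothetical minimal description of a shared data set."""
    url: str = ""            # findable: accessible through the web
    annotations: list = field(default_factory=list)  # findable: annotations
    data_type: str = ""      # usable: data type specified
    semantics: str = ""      # meaning: e.g. vocabulary terms used
    context: str = ""        # meaning: experimental metadata
    provenance: str = ""     # meaning: where the data came from


def missing_fields(record):
    """Names of fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


record = DatasetRecord(url="http://example.org/data", data_type="microarray")
print(missing_fields(record))
```

A check like `missing_fields` makes the gaps explicit: the example record can be found and opened, but nothing yet says what the data mean or where they came from.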
NIF team (past and present)
Jeff Grethe, UCSD, Co Investigator, Interim PI
Amarnath Gupta, UCSD, Co Investigator
Anita Bandrowski, NIF Project Leader
Gordon Shepherd, Yale University
Perry Miller
Luis Marenco
Rixin Wang
David Van Essen, Washington University
Erin Reid
Paul Sternberg, CalTech
Arun Rangarajan
Hans Michael Muller
Yuling Li
Giorgio Ascoli, George Mason University
Sridevi Polavarum
Fahim Imam, NIF Ontology Engineer
Larry Lui
Andrea Arnaud Stagg
Jonathan Cachat
Jennifer Lawrence
Lee Hornbrook
Binh Ngo
Vadim Astakhov
Xufei Qian
Chris Condit
Mark Ellisman
Stephen Larson
Willie Wong
Tim Clark, Harvard University
Paolo Ciccarese
Karen Skinner, NIH, Program Officer
Concept-based search: search by meaning
 Search Google: GABAergic neuron
 Search NIF: GABAergic neuron
 NIF automatically searches for types of
GABAergic neurons
Types of GABAergic
neurons
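Concept-based search of this kind can be sketched as expanding a term to include its subclasses before querying. The hierarchy below is a small hypothetical example, not the NIF vocabulary, and `expand_concept` is an illustrative name.

```python
# Hypothetical is_a hierarchy; the cell types are illustrative examples only.
SUBCLASSES = {
    "GABAergic neuron": ["Purkinje cell", "Basket cell"],
    "Basket cell": ["Cerebellar basket cell"],
}


def expand_concept(term):
    """Return the term plus all of its direct and indirect subclasses."""
    result = [term]
    for child in SUBCLASSES.get(term, []):
        result.extend(expand_concept(child))
    return result


print(expand_concept("GABAergic neuron"))
```

A record annotated only as "Purkinje cell" would then be returned for a "GABAergic neuron" search even though the string "GABAergic" never appears in it.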
More Related Content

What's hot

Databases and Ontologies: Where do we go from here?
Databases and Ontologies:  Where do we go from here?Databases and Ontologies:  Where do we go from here?
Databases and Ontologies: Where do we go from here?
Maryann Martone
 
Neuroscience as networked science
Neuroscience as networked scienceNeuroscience as networked science
Neuroscience as networked science
Neuroscience Information Framework
 
How do we know what we don’t know: Using the Neuroscience Information Framew...
How do we know what we don’t know:  Using the Neuroscience Information Framew...How do we know what we don’t know:  Using the Neuroscience Information Framew...
How do we know what we don’t know: Using the Neuroscience Information Framew...
Maryann Martone
 
The Neuroscience Information Framework: Making Resources Discoverable for the...
The Neuroscience Information Framework: Making Resources Discoverable for the...The Neuroscience Information Framework: Making Resources Discoverable for the...
The Neuroscience Information Framework: Making Resources Discoverable for the...
Neuroscience Information Framework
 
Martone acs presentation
Martone acs presentationMartone acs presentation
Martone acs presentation
Neuroscience Information Framework
 
Big data from small data: A deep survey of the neuroscience landscape data via
Big data from small data:  A deep survey of the neuroscience landscape data viaBig data from small data:  A deep survey of the neuroscience landscape data via
Big data from small data: A deep survey of the neuroscience landscape data via
Neuroscience Information Framework
 
Ibn Sina
Ibn SinaIbn Sina
Ibn Sina
Yasmine Gaber
 
The Neuroscience Information Framework: Establishing a practical semantic fra...
The Neuroscience Information Framework: Establishing a practical semantic fra...The Neuroscience Information Framework: Establishing a practical semantic fra...
The Neuroscience Information Framework: Establishing a practical semantic fra...
Neuroscience Information Framework
 
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Amit Sheth
 
Cartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defenseCartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan
 
Text-mining-based retrieval of protein networks
Text-mining-based retrieval of protein networksText-mining-based retrieval of protein networks
Text-mining-based retrieval of protein networks
Lars Juhl Jensen
 
Application and Implementation of different deep learning
Application and Implementation of different deep learningApplication and Implementation of different deep learning
Application and Implementation of different deep learning
JIEJackyZOUChou
 
NEUROINFORMATICS
NEUROINFORMATICSNEUROINFORMATICS
NEUROINFORMATICS
Ranjana Nagendra
 
B.3.5
B.3.5B.3.5
Building an informatics solution to sustain AI-guided cell profiling with hig...
Building an informatics solution to sustain AI-guided cell profiling with hig...Building an informatics solution to sustain AI-guided cell profiling with hig...
Building an informatics solution to sustain AI-guided cell profiling with hig...
Ola Spjuth
 
Neuroinformatics
NeuroinformaticsNeuroinformatics
Neuroinformatics
PritinshaRout
 
Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019
Pistoia Alliance
 
Data Landscapes: The Neuroscience Information Framework
Data Landscapes:  The Neuroscience Information FrameworkData Landscapes:  The Neuroscience Information Framework
Data Landscapes: The Neuroscience Information Framework
Maryann Martone
 
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Amit Sheth
 
2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dc2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dc
c.titus.brown
 

What's hot (20)

Databases and Ontologies: Where do we go from here?
Databases and Ontologies:  Where do we go from here?Databases and Ontologies:  Where do we go from here?
Databases and Ontologies: Where do we go from here?
 
Neuroscience as networked science
Neuroscience as networked scienceNeuroscience as networked science
Neuroscience as networked science
 
How do we know what we don’t know: Using the Neuroscience Information Framew...
How do we know what we don’t know:  Using the Neuroscience Information Framew...How do we know what we don’t know:  Using the Neuroscience Information Framew...
How do we know what we don’t know: Using the Neuroscience Information Framew...
 
The Neuroscience Information Framework: Making Resources Discoverable for the...
The Neuroscience Information Framework: Making Resources Discoverable for the...The Neuroscience Information Framework: Making Resources Discoverable for the...
The Neuroscience Information Framework: Making Resources Discoverable for the...
 
Martone acs presentation
Martone acs presentationMartone acs presentation
Martone acs presentation
 
Big data from small data: A deep survey of the neuroscience landscape data via
Big data from small data:  A deep survey of the neuroscience landscape data viaBig data from small data:  A deep survey of the neuroscience landscape data via
Big data from small data: A deep survey of the neuroscience landscape data via
 
Ibn Sina
Ibn SinaIbn Sina
Ibn Sina
 
The Neuroscience Information Framework: Establishing a practical semantic fra...
The Neuroscience Information Framework: Establishing a practical semantic fra...The Neuroscience Information Framework: Establishing a practical semantic fra...
The Neuroscience Information Framework: Establishing a practical semantic fra...
 
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
 
Cartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defenseCartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defense
 
Text-mining-based retrieval of protein networks
Text-mining-based retrieval of protein networksText-mining-based retrieval of protein networks
Text-mining-based retrieval of protein networks
 
Application and Implementation of different deep learning
Application and Implementation of different deep learningApplication and Implementation of different deep learning
Application and Implementation of different deep learning
 
NEUROINFORMATICS
NEUROINFORMATICSNEUROINFORMATICS
NEUROINFORMATICS
 
B.3.5
B.3.5B.3.5
B.3.5
 
Building an informatics solution to sustain AI-guided cell profiling with hig...
Building an informatics solution to sustain AI-guided cell profiling with hig...Building an informatics solution to sustain AI-guided cell profiling with hig...
Building an informatics solution to sustain AI-guided cell profiling with hig...
 
Neuroinformatics
NeuroinformaticsNeuroinformatics
Neuroinformatics
 
Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019
 
Data Landscapes: The Neuroscience Information Framework
Data Landscapes:  The Neuroscience Information FrameworkData Landscapes:  The Neuroscience Information Framework
Data Landscapes: The Neuroscience Information Framework
 
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
 
2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dc2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dc
 

Similar to Navigating the Neuroscience Data Landscape

The real world of ontologies and phenotype representation: perspectives from...
The real world of ontologies and phenotype representation:  perspectives from...The real world of ontologies and phenotype representation:  perspectives from...
The real world of ontologies and phenotype representation: perspectives from...
Maryann Martone
 
RDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework
RDAP14: Maryann Martone, Keynote, The Neuroscience Information FrameworkRDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework
RDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework
ASIS&T
 
Data Landscapes - Addiction
Data Landscapes - AddictionData Landscapes - Addiction
Data Landscapes - Addiction
Neuroscience Information Framework
 
NIFSTD: A Comprehensive Ontology for Neuroscience
NIFSTD: A Comprehensive Ontology for NeuroscienceNIFSTD: A Comprehensive Ontology for Neuroscience
NIFSTD: A Comprehensive Ontology for Neuroscience
Neuroscience Information Framework
 
Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...
Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...
Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...
Neuroscience Information Framework
 
Connected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul GrothConnected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul Groth
Connected Data World
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicine
Paul Groth
 
NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...
NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...
NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...
Neuroscience Information Framework
 
Bioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future Perspectives
University of Malaya
 
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
Neuroscience Information Framework
 
Martone grethe
Martone gretheMartone grethe
Martone grethe
Maryann Martone
 
Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224
Gully Burns
 
NCBO haendel talk 2013
NCBO haendel talk 2013NCBO haendel talk 2013
NCBO haendel talk 2013
mhaendel
 
Research Statement Chien-Wei Lin
Research Statement Chien-Wei LinResearch Statement Chien-Wei Lin
Research Statement Chien-Wei Lin
Chien-Wei Lin
 
Ontology Based Information Extraction for Disease Intelligence
Ontology Based Information Extraction for Disease Intelligence Ontology Based Information Extraction for Disease Intelligence
Ontology Based Information Extraction for Disease Intelligence
IJORCS
 
Diversity and Depth: Implementing AI across many long tail domains
Diversity and Depth: Implementing AI across many long tail domainsDiversity and Depth: Implementing AI across many long tail domains
Diversity and Depth: Implementing AI across many long tail domains
Paul Groth
 
Where are the Data? Perspectives from the Neuroscience Information Framework.
Where are the Data? Perspectives from the Neuroscience Information Framework. Where are the Data? Perspectives from the Neuroscience Information Framework.
Where are the Data? Perspectives from the Neuroscience Information Framework.
Neuroscience Information Framework
 
Data Mining in Rediology reports
Data Mining in Rediology reportsData Mining in Rediology reports
Data Mining in Rediology reports
Saeed Mehrabi
 
Reverse Engineering The Brain, Online
Reverse Engineering The Brain, OnlineReverse Engineering The Brain, Online
Reverse Engineering The Brain, Online
Stephen Larson
 
B4OS-2012
B4OS-2012B4OS-2012

Similar to Navigating the Neuroscience Data Landscape (20)

The real world of ontologies and phenotype representation: perspectives from...
The real world of ontologies and phenotype representation:  perspectives from...The real world of ontologies and phenotype representation:  perspectives from...
The real world of ontologies and phenotype representation: perspectives from...
 
RDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework
RDAP14: Maryann Martone, Keynote, The Neuroscience Information FrameworkRDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework
RDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework
 
Data Landscapes - Addiction
Data Landscapes - AddictionData Landscapes - Addiction
Data Landscapes - Addiction
 
NIFSTD: A Comprehensive Ontology for Neuroscience
NIFSTD: A Comprehensive Ontology for NeuroscienceNIFSTD: A Comprehensive Ontology for Neuroscience
NIFSTD: A Comprehensive Ontology for Neuroscience
 
Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...
Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...
Neuroscience Information Framework Ontologies: Nerve cells in Neurolex and NI...
 
Connected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul GrothConnected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul Groth
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicine
 
NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...
NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...
NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple B...
 
Bioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future Perspectives
 
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
 
Martone grethe
Martone gretheMartone grethe
Martone grethe
 
Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224
 
NCBO haendel talk 2013
NCBO haendel talk 2013NCBO haendel talk 2013
NCBO haendel talk 2013
 
Research Statement Chien-Wei Lin
Research Statement Chien-Wei LinResearch Statement Chien-Wei Lin
Research Statement Chien-Wei Lin
 
Ontology Based Information Extraction for Disease Intelligence
Ontology Based Information Extraction for Disease Intelligence Ontology Based Information Extraction for Disease Intelligence
Ontology Based Information Extraction for Disease Intelligence
 
Diversity and Depth: Implementing AI across many long tail domains
Diversity and Depth: Implementing AI across many long tail domainsDiversity and Depth: Implementing AI across many long tail domains
Diversity and Depth: Implementing AI across many long tail domains
 
Where are the Data? Perspectives from the Neuroscience Information Framework.
Where are the Data? Perspectives from the Neuroscience Information Framework. Where are the Data? Perspectives from the Neuroscience Information Framework.
Where are the Data? Perspectives from the Neuroscience Information Framework.
 
Data Mining in Rediology reports
Data Mining in Rediology reportsData Mining in Rediology reports
Data Mining in Rediology reports
 
Reverse Engineering The Brain, Online
Reverse Engineering The Brain, OnlineReverse Engineering The Brain, Online
Reverse Engineering The Brain, Online
 
B4OS-2012
B4OS-2012B4OS-2012
B4OS-2012
 

More from Neuroscience Information Framework

Why should my institution support RRIDs?
Why should my institution support RRIDs?Why should my institution support RRIDs?
Why should my institution support RRIDs?
Neuroscience Information Framework
 
Why should Journals ask fo RRIDs?
Why should Journals ask fo RRIDs?Why should Journals ask fo RRIDs?
Why should Journals ask fo RRIDs?
Neuroscience Information Framework
 
Funders and RRIDs
Funders and RRIDsFunders and RRIDs
INCF 2013 - Uniform Resource Layer
INCF 2013 - Uniform Resource LayerINCF 2013 - Uniform Resource Layer
INCF 2013 - Uniform Resource Layer
Neuroscience Information Framework
 
Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neuroscience Information Framework
 
The Neuroscience Information Framework: A Scalable Platform for Information E...
Navigating the Neuroscience Data Landscape

  • 1. Navigating the Neuroscience Data Landscape • Maryann Martone, Ph.D., University of California, San Diego
  • 2. “Neural Choreography” • “A grand challenge in neuroscience is to elucidate brain function in relation to its multiple layers of organization that operate at different spatial and temporal scales. Central to this effort is tackling ‘neural choreography’ -- the integrated functioning of neurons into brain circuits -- their spatial organization, local and long-distance connections, their temporal orchestration, and their dynamic features. Neural choreography cannot be understood via a purely reductionist approach. Rather, it entails the convergent use of analytical and synthetic tools to gather, analyze and mine information from each level of analysis, and capture the emergence of new layers of function (or dysfunction) as we move from studying genes and proteins, to cells, circuits, thought, and behavior.... However, the neuroscience community is not yet fully engaged in exploiting the rich array of data currently available, nor is it adequately poised to capitalize on the forthcoming data explosion.” • Akil et al., Science, Feb 11, 2011
  • 3. NIF is an initiative of the NIH Blueprint consortium of institutes • What types of resources (data, tools, materials, services) are available to the neuroscience community? • How many are there? • What domains do they cover? What domains do they not cover? • Where are they? Web sites, databases, literature, supplementary material, PDF files, desk drawers • Who uses them? • Who creates them? • How can we find them? • How can we make them better in the future? • http://neuinfo.org
  • 4. How many resources are there? • NIF Registry: a catalog of neuroscience-relevant resources • > 4800 currently listed • > 2000 databases • And we are finding more every day
  • 5. The Neuroscience Information Framework: Discovery and utilization of web-based resources for neuroscience • A portal for finding and using neuroscience resources • A consistent framework for describing resources • Provides simultaneous search of multiple types of information, organized by category • Supported by an expansive ontology for neuroscience • Utilizes advanced technologies to search the “hidden web” • http://neuinfo.org • UCSD, Yale, Caltech, George Mason, Washington Univ. • Supported by NIH Blueprint • Figure labels: Literature, Database Federation, Registry
  • 6. What are the connections of the hippocampus? • Query: Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn” • Query expansion: synonyms and related concepts • Boolean queries • Data sources categorized by “data type” and level of nervous system • Common views across multiple sources • Tutorials for using full resource when getting there from NIF • Link back to record in original source
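The query expansion on slide 6 can be sketched in a few lines. This is only an illustration, not NIF's implementation: the `SYNONYMS` table and the `expand_query` function are hypothetical names, and a real system would draw its synonyms from an ontology rather than a hard-coded dictionary.

```python
# Hypothetical synonym table (assumption); the entries mirror the slide's example.
SYNONYMS = {
    "hippocampus": ["Cornu Ammonis", "Ammon's horn"],
}


def expand_query(term: str) -> str:
    """Return a Boolean OR query covering the term and its known synonyms."""
    variants = [term] + SYNONYMS.get(term.lower(), [])
    # Quote multi-word variants so they are matched as phrases.
    quoted = [f'"{v}"' if " " in v else v for v in variants]
    return " OR ".join(quoted)


print(expand_query("Hippocampus"))
```

A term with no entry in the table is returned unchanged, so the expansion step never narrows a search.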
  • 7. Results are organized within a common framework • Figure labels: Connects to; Synapsed with; Synapsed by; Input region; innervates; Axon innervates; Projects to; Cellular contact; Subcellular contact; Source site; Target site • Each resource implements a different, though related, model; in many cases the systems are complex and difficult to learn
  • 8. The scourge of neuroanatomical nomenclature • NIF Connectivity: 6 databases containing connectivity primary data or claims • Brain Architecture Management System (rodent) • Connectome Wiki (human) • Brain Maps (various) • CoCoMac (primate cortex) • UCLA Multimodal database (human fMRI) • Avian Brain Connectivity Database (bird) • Total: 1800 unique brain terms (excluding Avian) • Number of exact terms used in > 1 database: 42 • Number of synonym matches: 99 • Number of partonomy matches: 385 • The INCF is working with NIF to develop semantic and spatial strategies for translating anatomy across information systems
  • 9. What is an ontology? • Example: Brain has a Cerebellum, which has a Purkinje Cell Layer, which has a Purkinje cell, which is a neuron • Ontology: an explicit, formal representation of concepts and the relationships among them within a particular domain that expresses human knowledge in a machine-readable form • Branch of philosophy: a theory of what is • e.g., Gene ontologies • Provide universals for navigating across different data sources • Semantic “index” • Provide the basis for concept-based queries to probe and mine data • Perform reasoning • Link data through relationships, not just one-to-one mappings
  • 10. PONS program • Structural Lexicon Taskforce • Concentrate on Human, Non-human Primate, Rat and Mouse • Define structural concepts from level of organ to macromolecular complexes • Provide a set of criteria by which structures can be identified • Neuronal Registry Taskforce • Establish conventions for naming new types of neurons • Establish a standard set of properties to define neurons • Create a Neuron Registry for registering new types of neurons • Deployment and representation (Alan Ruttenberg) • Brought together ontologists working across scales • Courtesy of Chris Mungall, Lawrence Berkeley Labs • Not about imposing a single view of anatomy; about making concepts computable and being able to translate among views
  • 11. NeuroLex Wiki • http://neurolex.org • Stephen Larson • Provide a simple framework for defining the concepts required • Cell, part of brain, subcellular structure, molecule • Community based: • Avian neuroanatomy • Fly neurons (England) • Neuroimaging terms • Brain regions identified by text mining • Creating a computable index for neuroscience data • INCF working to coordinate Wiki efforts underway at Allen Institute, Blue Brain and NeuroLex • Demo D03
  • 12. Comparison of traffic to NIF Portal vs. NeuroLex • Chart labels: 5000 hits, 15000 hits • Wiki is readily indexed by search engines
  • 13. Neurons in NeuroLex • INCF building a knowledge base of neurons and their properties via the NeuroLex Wiki • Led by Dr. Gordon Shepherd • Consistent and parseable naming scheme • Knowledge is readily accessible, editable and computable • Stephen Larson
  • 14. NIF data federation • Chart: percentage of data records per data type (labels: Images, Drugs, Antibodies, Grants, Pathways, Animals, Connectivity, Brain activation foci, Microarray 98%) • Primary data, secondary data, claims, repositories • Recently added: BioNOT literature mining tool; Retraction Watch blog
  • 15. What do you mean by data? Databases come in many shapes and sizes • Primary data: data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL) • Secondary data: data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD), gene expression levels (Allen Brain Atlas); brain connectivity statements (BAMS) • Tertiary data: claims and assertions about the meaning of data, e.g., gene upregulation/downregulation • Registries: metadata; pointers to data sets or materials stored elsewhere • Data aggregators: aggregate data of the same type from multiple sources, e.g., Cell Image Library, SUMSdb, Brede • Single source: data acquired within a single context, e.g., Allen Brain Atlas
  • 16. NIF landscape analysis • Figure labels: Striatum, Hypothalamus, Olfactory bulb, Cerebral cortex, Brain; Brain region; Data source • Vadim Astakhov, Kepler Workflow Engine
  • 17. How much of the landscape do we have? Query for “reference” brain structures and their parts in NIF Connectivity database
  • 18. NIF Reports: Male vs. Female • Gender bias • NIF can start to answer interesting questions about neuroscience research, not just about neuroscience
  • 19. Embracing duplication: Data mashups • ~300 PMIDs were common between Brede and SUMSdb • Same information; value added • Same data; different aspects
  • 20. Same data: different analysis • Chronic vs. acute morphine in striatum • Drug Related Gene database: extracted statements from figures, tables and supplementary data from published article • Gemma: reanalyzed microarray results from GEO using different algorithms • Both provide results of increased or decreased expression as a function of experimental paradigm • 4 strains of mice • 3 conditions: chronic morphine, acute morphine, saline • Mined NIF for all references to GEO IDs: found a small number where the same dataset was represented in two or more databases • http://www.chibi.ubc.ca/Gemma/home.html
  • 21. How easy was it to compare? • Gemma: Gene ID + Gene Symbol • DRG: Gene name + Probe ID • Gemma: increased expression/decreased expression • DRG: increased expression/decreased expression • But Gemma presented results relative to baseline chronic morphine and DRG with respect to saline, so the direction of change is opposite in the 2 databases • Analysis: • 1370 statements from Gemma regarding gene expression as a function of chronic morphine • 617 were consistent with DRG; over half of the claims of the paper were not confirmed in this analysis • Results for 1 gene were opposite in DRG and Gemma • 45 did not have enough information provided in the paper to make a judgment • NIF annotation standard
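The baseline mismatch on slide 21 is easy to get wrong when comparing claims across databases. The sketch below shows one way to put both directions of change on a common (saline) baseline before comparing them. It assumes a simple two-condition comparison in which swapping the baseline flips the direction; the function names are hypothetical and do not come from Gemma, DRG, or NIF.

```python
# Swapping the reference condition flips the reported direction of change.
FLIP = {"increased": "decreased", "decreased": "increased"}


def to_saline_baseline(direction: str, baseline: str) -> str:
    """Express a direction of change relative to saline (hypothetical helper)."""
    if baseline == "saline":
        return direction
    if baseline == "chronic morphine":
        return FLIP[direction]
    raise ValueError(f"unknown baseline: {baseline}")


def consistent(gemma_direction: str, drg_direction: str) -> bool:
    """Compare a Gemma claim (baseline: chronic morphine) with a DRG claim (baseline: saline)."""
    return to_saline_baseline(gemma_direction, "chronic morphine") == to_saline_baseline(
        drg_direction, "saline"
    )


# "decreased" relative to chronic morphine agrees with "increased" relative to saline.
print(consistent("decreased", "increased"))
```

Normalizing to one baseline first keeps the comparison itself a plain equality check.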
  • 22. Grabbing the long tail of small data • Analysis of NIF shows multiple databases with similar scope and content • Many contain partially overlapping data • Data “flows” from one resource to the next • Data is reinterpreted, reanalyzed or added to • When does it become something else?
  • 23. Phases of NIF • 2006-2008: A survey of what was out there • 2008-2009: Strategy for resource discovery • NIF Registry vs. NIF data federation • Ingestion of data contained within different technology platforms, e.g., XML vs. relational vs. RDF • Effective search across semantically diverse sources • NIFSTD ontologies • 2009-2011: Strategy for data integration • Unified views across common sources • Mapping of content to NIF vocabularies • 2011-present: Data analytics • Uniform external data references
  • 24. Data, not just stories about them! • 47/50 major preclinical published cancer studies could not be replicated • “The scientific community assumes that the claims in a preclinical study can be taken at face value -- that although there might be some errors in detail, the main message of the paper can be relied on and the data will, for the most part, stand the test of time. Unfortunately, this is not always the case.” • “There are no guidelines that require all data sets to be reported in a paper; often, original data are removed during the peer review and publication process.” • Begley and Ellis, Nature, vol. 483, p. 531, 29 March 2012 • Getting data out sooner in a form where they can be exposed to many eyes and many analyses, and easily compared, may allow us to expose errors and develop better metrics to evaluate the validity of data
  • 25. A global view of data • You (and the machine) have to be able to find it • Accessible through the web • Annotations • You have to be able to use it • Data type specified and in a usable form • You have to know what the data mean • Some semantics • Context: experimental metadata • Provenance: where did the data come from? • Reporting neuroscience data within a consistent framework helps enormously
  • 26. NIF team (past and present) • Jeff Grethe, UCSD, Co-Investigator, Interim PI • Amarnath Gupta, UCSD, Co-Investigator • Anita Bandrowski, NIF Project Leader • Gordon Shepherd, Yale University • Perry Miller • Luis Marenco • Rixin Wang • David Van Essen, Washington University • Erin Reid • Paul Sternberg, Caltech • Arun Rangarajan • Hans Michael Muller • Yuling Li • Giorgio Ascoli, George Mason University • Sridevi Polavarum • Fahim Imam, NIF Ontology Engineer • Larry Lui • Andrea Arnaud Stagg • Jonathan Cachat • Jennifer Lawrence • Lee Hornbrook • Binh Ngo • Vadim Astakhov • Xufei Qian • Chris Condit • Mark Ellisman • Stephen Larson • Willie Wong • Tim Clark, Harvard University • Paolo Ciccarese • Karen Skinner, NIH, Program Officer
  • 27. Concept-based search: search by meaning • Search Google: GABAergic neuron • Search NIF: GABAergic neuron • NIF automatically searches for types of GABAergic neurons • Figure label: Types of GABAergic neurons
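Concept-based search as described on slide 27 amounts to searching for a concept together with everything classified beneath it. The sketch below illustrates the idea with a tiny hand-written hierarchy. The `SUBTYPES` table is a toy assumption for illustration, not NIFSTD content, and the function names are hypothetical.

```python
# Toy "is a" hierarchy (assumption): parent concept -> direct subtypes.
SUBTYPES = {
    "GABAergic neuron": ["Purkinje cell", "Basket cell"],
    "Basket cell": ["Cerebellar basket cell"],
}


def descendants(concept: str) -> list[str]:
    """Collect all subtypes of a concept, breadth first, without duplicates."""
    found: list[str] = []
    queue = list(SUBTYPES.get(concept, []))
    while queue:
        child = queue.pop(0)
        if child not in found:
            found.append(child)
            queue.extend(SUBTYPES.get(child, []))
    return found


def concept_search_terms(concept: str) -> list[str]:
    """Return the concept plus every subtype, i.e. the terms a concept-based search would cover."""
    return [concept] + descendants(concept)


print(concept_search_terms("GABAergic neuron"))
```

A plain keyword search would match only the literal string "GABAergic neuron"; walking the hierarchy is what lets a search also reach records that mention only a specific cell type.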