Neuroscience research increasingly relies on large, heterogeneous datasets from various sources. Integrating these diverse data types and making them accessible presents challenges. The NIF (Neuroscience Information Framework) addresses this by creating a federated search engine and unified interface to access multiple neuroscience databases. NIF aims to make neuroscience data more discoverable, accessible, and usable through techniques like unique identifiers, metadata standards, and semantic integration. This will help researchers more effectively find and use relevant neuroscience information.
Data-knowledge transition zones within the biomedical research ecosystem (Maryann Martone)
Overview of the Neuroscience Information Framework and how it brings together data, in the form of distributed databases, and knowledge, in the form of ontologies, to map the data space and show where there are mismatches between data and knowledge.
How Portable Are the Metadata Standards for Scientific Data? (Jian Qin)
The one-covers-all approach in current metadata standards for scientific data has serious limitations in keeping up with ever-growing data. This paper reports the findings from a survey of metadata standards in the scientific data domain and argues for the need for a metadata infrastructure. The survey collected 4,400+ unique elements from 16 standards and categorized these elements into 9 categories. The highest counts of elements occurred in the descriptive category, and many of them overlapped with Dublin Core (DC) elements. The same pattern appeared in the elements that co-occurred across different standards: a small number of semantically general elements appeared across the largest number of standards, while the rest of the element co-occurrences formed a long tail with a wide range of specific semantics. The paper discusses the implications of these findings for metadata portability and infrastructure, and points out that large, complex standards and widely varied naming practices are the major hurdles to building a metadata infrastructure.
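The survey's core tally (how many standards each element appears in, and the long tail of specific elements) can be sketched in a few lines. The standards and element names below are a toy inventory invented for illustration, not the paper's actual data:

```python
from collections import Counter

# Hypothetical toy inventory: the metadata elements declared by each standard.
standards = {
    "DataCite": {"title", "creator", "subject", "date", "identifier"},
    "DDI":      {"title", "creator", "abstract", "methodology"},
    "EML":      {"title", "creator", "geographicCoverage", "taxonomicCoverage"},
    "FGDC":     {"title", "abstract", "spatialReference"},
}

# Count how many standards each element appears in.
spread = Counter(e for elems in standards.values() for e in elems)

# A few general elements (title, creator) span most standards, while the
# domain-specific ones form the long tail described in the paper.
for element, n in spread.most_common():
    print(f"{element}: appears in {n} of {len(standards)} standards")
```

Run against a real element inventory, the same tally reproduces the paper's finding: a handful of general, DC-like elements at the head and a long tail of one-off semantics.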
The Neuroscience Information Framework has indexed over 100 big-data databases, allowing us to ask questions about the big-data landscape. Anita Bandrowski presents an overview of the NIF system and provides insights into the addiction data landscape for JAX laboratories.
Amit Sheth with TK Prasad, "Semantic Technologies for Big Science and Astrophysics", Invited Plenary Presentation, at Earthcube Solar-Terrestrial End-User Workshop, NJIT, Newark, NJ, August 13, 2014.
Like many other fields of Big Science, astrophysics and solar physics deal with the challenges of Big Data, including Volume, Variety, Velocity, and Veracity. There is already significant work on handling volume-related challenges, including the use of high-performance computing. In this talk, we will mainly focus on the other challenges from the perspective of collaborative sharing and reuse of the broad variety of data created by multiple stakeholders, large and small, along with tools that offer semantic variants of search, browsing, integration, and discovery capabilities. We will borrow examples of tools and capabilities from state-of-the-art work supporting physicists (including astrophysicists) [1], the life sciences [2], and materials science [3], and describe the role of semantics and semantic technologies that make these capabilities possible or easier to realize. This applied and practice-oriented talk will complement more vision-oriented counterparts [4].
[1] Science Web-based Interactive Semantic Environment: http://sciencewise.info/
[2] NCBO Bioportal: http://bioportal.bioontology.org/ , Kno.e.sis’s work on Semantic Web for Healthcare and Life Sciences: http://knoesis.org/amit/hcls
[3] MaterialWays (a Materials Genome Initiative related project): http://wiki.knoesis.org/index.php/MaterialWays
[4] From Big Data to Smart Data: http://wiki.knoesis.org/index.php/Smart_Data
Data Landscapes: The Neuroscience Information Framework (Maryann Martone)
Overview of how to use the Neuroscience Information Framework for data discovery, presented at the Genetics of Addiction Workshop held at The Jackson Laboratory, Aug 28-Sept 1, 2014.
What is data discovery and how do people find out about data?
Metadata: What information helps potential users decide whether that data might be useful?
How and why do machines exchange information about research data?
Data without metadata and connections is useless:
Linked data
How Scholix is helping publishers and others to link data with publications and more
Metadata, controlled vocabularies, linked data and crosswalks
Things #11, #12, #13 of 23 Things
How do we make data FAIR: Findable, Accessible, Interoperable, Reusable?
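One concrete way the metadata, controlled-vocabulary, and crosswalk topics above fit together is a schema crosswalk: a mapping from one metadata standard's elements onto another's. The sketch below uses a hypothetical, heavily simplified DataCite-style-to-Dublin-Core mapping (field names are illustrative, not the full official schemas) to show why crosswalks lose information:

```python
# A minimal crosswalk sketch: translate a simplified DataCite-style record
# into Dublin Core element names, tracking fields that have no equivalent.
DATACITE_TO_DC = {
    "creators":            "dc:creator",
    "titles":              "dc:title",
    "publisher":           "dc:publisher",
    "publicationYear":     "dc:date",
    "resourceTypeGeneral": "dc:type",
    "identifier":          "dc:identifier",
}

def crosswalk(record: dict) -> dict:
    """Translate the fields we know; report what could not be mapped."""
    mapped, unmapped = {}, []
    for field, value in record.items():
        if field in DATACITE_TO_DC:
            mapped[DATACITE_TO_DC[field]] = value
        else:
            unmapped.append(field)   # information lost in translation
    return {"mapped": mapped, "unmapped": unmapped}

record = {"titles": "Whole brain imaging dataset",
          "publicationYear": 2014,
          "fundingReference": "NIH Blueprint"}
print(crosswalk(record))
```

The `unmapped` list is the point: semantically specific elements (here the made-up `fundingReference`) fall through, which is exactly the long-tail portability problem the survey above describes.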
Applying machine learning techniques to big data in the scholarly domain (Angelo Salatino)
Slides of the lecture at the 5th International School on Applied Probability Theory, Communications Technologies & Data Science (APTCT-2020)
12 Nov 2020
DataCite and Campus Data Services
Paul Bracke, Associate Dean for Digital Programs and Information Services, Purdue University
Research libraries are increasingly interested in developing data services for their campuses. There are many perspectives, however, on how to develop services that are responsive to the many needs of scientists; sensitive to the concerns of scientists who are not always accustomed to sharing their data; and that are attractive to campus administrators. This presentation will discuss the development of campus-based data services programs, the centrality of data citation to these efforts, and the ways in which engagement with DataCite can enhance local programs.
RDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework (ASIS&T)
Research Data Access and Preservation Summit, 2014
San Diego, CA
March 26-28, 2014
Maryann Martone, Principal Investigator, Neuroscience Information Framework, University of California, San Diego
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear... (dkNET)
The NIDDK Information Network (dkNET; http://dknet.org) is an open community resource for basic and clinical investigators in metabolic, digestive and kidney disease. dkNET's portal facilitates access to a collection of diverse research resources (i.e., the multitude of data, software tools, materials, services, projects and organizations available to researchers in the public domain) that advance the mission of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). This webinar was presented by dkNET principal investigator Dr. Jeffrey Grethe.
Applied semantic technology and linked data (William Smith)
Mapping a human brain generates petabytes of gene listings and the corresponding locations of these genes throughout the human brain. Because of the size of this dataset, a prototype Semantic Web application was created with the unique ability to link new datasets from similar fields of research and present these new models to an online community. The resulting application presents a large set of gene-to-location mappings and provides new information about diseases, drugs, and side effects in relation to genes and areas of the human brain.
In this presentation we will discuss the normalization processes and tools for adding new datasets, the user experience throughout the publishing process, the underlying technologies behind the application, and demonstrate the preliminary use cases of the project.
This talk presents areas of investigation underway at the Rensselaer Institute for Data Exploration and Applications. First presented at Flipkart, Bangalore India, 3/2015.
Enabling knowledge management in the Agronomic Domain (Pierre Larmande)
This talk focuses mainly on ongoing projects at the Institute of Computational Biology:
Agronomic Linked Data (AgroLD): a Semantic Web knowledge base designed to integrate data from various publicly available plant-centric data sources.
GIGwA: a tool developed to manage large genomic, transcriptomic and genotyping datasets resulting from NGS analyses.
Anita Bandrowski explains how the uniform resource layer of the Neuroscience Information Framework allows several interesting questions about the state of scientific research to be answered.
Maryann Martone
Making Sense of Biological Systems: Using Knowledge Mining to Improve and Validate Models of Living Systems; NIH COBRE Center for the Analysis of Cellular Mechanisms and Systems Biology, Montana State University, Bozeman, MT
August 24, 2012
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ... (Sérgio Sacani)
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters spanning 0.4-0.9 µm) and novel JWST images with 14 filters spanning 0.8-5 µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data at >2.3 µm to construct an ultradeep image, reaching as deep as ≈31.4 AB mag in the stack and 30.3-31.0 AB mag (5σ, r = 0.1" circular aperture) in individual filters. We measure photometric redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts z = 11.5-15. These objects show compact half-light radii of R_1/2 ∼ 50-200 pc, stellar masses of M⋆ ∼ 10^7-10^8 M⊙, and star-formation rates of SFR ∼ 0.1-1 M⊙ yr^-1. Our search finds no candidates at 15 < z < 20, placing upper limits at these redshifts. We develop a forward-modeling approach to infer the properties of the evolving luminosity function, without binning in redshift or luminosity, that marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results, and that the luminosity function normalization and UV luminosity density decline by a factor of ∼2.5 from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical models for evolution of the dark matter halo mass function.
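The forward-modeling idea in this abstract (fitting the luminosity function while marginalizing over each candidate's photometric-redshift uncertainty) can be illustrated schematically. Everything below, the Schechter parameters, the Gaussian photo-z PDF, and the flux-to-luminosity mapping, is an invented toy for illustration, not the authors' actual pipeline or values:

```python
import numpy as np

# Toy Schechter luminosity function: phi(L) = phi* (L/L*)^alpha exp(-L/L*).
def schechter(L, phi_star=1e-4, L_star=1.0, alpha=-2.0):
    x = L / L_star
    return phi_star * x**alpha * np.exp(-x)

zgrid = np.linspace(11.0, 16.0, 201)             # redshift grid
dz = zgrid[1] - zgrid[0]
pz = np.exp(-0.5 * ((zgrid - 13.0) / 0.5) ** 2)  # toy photo-z PDF peaked at z = 13
pz /= pz.sum() * dz                              # normalize to integrate to 1

# Toy mapping from redshift to intrinsic luminosity at fixed observed flux.
L_of_z = 0.5 * (zgrid / 13.0) ** 2

# Likelihood for one candidate, marginalized over z:
# integral of phi(L(z)) * p(z) dz, instead of assigning a single best-fit z.
like = np.sum(schechter(L_of_z) * pz) * dz
print(f"marginalized likelihood ~ {like:.3e}")
```

The point of the marginalization is that a candidate with an uncertain redshift contributes fractionally across the whole redshift range rather than being binned at its point estimate.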
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
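Cytoplasmic segregation and heteroplasmy are easy to see in a toy simulation: start a cell with a 50/50 mix of two organellar genotypes and let random partitioning at each division drift the daughters toward homoplasmy. The copy number (100 organelles) and generation count below are illustrative assumptions, not measured values:

```python
import random

# Toy simulation of cytoplasmic segregation: organelles are partitioned
# at random during division, so a heteroplasmic cell's descendants drift
# toward homoplasmy (all one genotype) over generations.
random.seed(42)

def divide(cell):
    """Randomly split organelles into one daughter, then regrow to full size."""
    daughter = [g for g in cell if random.random() < 0.5]
    # Regrow to the original copy number by duplicating random survivors.
    while len(daughter) < len(cell):
        daughter.append(random.choice(daughter))
    return daughter

cell = ["wild-type"] * 50 + ["mutant"] * 50   # 50/50 heteroplasmy
for generation in range(10):
    cell = divide(cell)
    frac = cell.count("mutant") / len(cell)
    print(f"gen {generation + 1}: mutant fraction = {frac:.2f}")
```

Because the split is random, the mutant fraction performs a random walk between 0 and 1; this is the same mechanism that makes mitochondrial disease severity vary among offspring of the same heteroplasmic mother.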
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep... (University of Maribor)
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
What are greenhouse gases, and how many affect the Earth? (moosaasad1975)
What are greenhouse gases, how do they affect the Earth and its environment, and what does the future hold for the environment and the Earth as weather and climate change?
(May 29th, 2024) Advancements in Intravital Microscopy - Insights for Preclini... (Scintica Instrumentation)
Intravital microscopy (IVM) is a powerful tool used to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been gained using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed-tissue imaging, IVM allows ultra-fast, high-resolution imaging of cellular processes over time and space in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, response to treatments, or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM Technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system's unique features and user-friendly software enable researchers to probe fast, dynamic biological processes such as immune cell tracking, cell-cell interaction, and vascularization and tumor metastasis in exceptional detail. This webinar also gives an overview of IVM in drug development, offering a view into the intricate interactions between drugs/nanoparticles and tissues in vivo and allowing the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancement of novel therapeutic strategies.
Cancer cell metabolism: Special Reference to the Lactate Pathway (AADYARAJPANDEY1)
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy they need to function.
Energy is stored in the bonds of glucose, and when glucose is broken down, much of that energy is released.
Cells utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Krebs cycle. The Krebs cycle allows cells to "burn" the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Kreb's - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
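The arithmetic behind the "a lot more sugar" claim, made explicit using the approximate textbook yields quoted above:

```python
# Approximate ATP yields per glucose molecule, as stated in the text.
ATP_GLYCOLYSIS_ONLY = 2    # cancer cell relying mostly on glycolysis
ATP_FULL_RESPIRATION = 36  # glycolysis + Krebs cycle + oxidative phosphorylation

# Glucose molecules a glycolysis-only cell must consume to match the ATP
# a fully respiring cell extracts from a single glucose:
glucose_needed = ATP_FULL_RESPIRATION / ATP_GLYCOLYSIS_ONLY
print(f"~{glucose_needed:.0f}x more glucose")  # → ~18x more glucose
```

That roughly 18-fold glucose demand is the basis of cancer cells' "glucose addiction" discussed in the Warburg section below (and of FDG-PET imaging in the clinic).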
Introduction to the Warburg phenomenon:
Warburg effect: Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose from outside than normal cells do.
Otto Heinrich Warburg (8 October 1883 – 1 August 1970) was awarded the Nobel Prize in Physiology or Medicine in 1931 for his "discovery of the nature and mode of action of the respiratory enzyme."
Warburg effect: the tendency of cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg observed that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Nutraceutical market, scope and growth: Herbal drug technology (Lokesh Patil)
As consumer awareness of health and wellness rises, the nutraceutical market, which includes goods like functional foods, beverages, and dietary supplements that provide health benefits beyond basic nutrition, is growing significantly. As healthcare expenses rise, the population ages, and people increasingly seek natural and preventive health solutions, this industry is expanding quickly. Innovations in product formulation and the use of cutting-edge technology for personalized nutrition are driving further market expansion. With its worldwide reach, the nutraceutical industry is expected to keep growing and to provide significant opportunities for research and investment across a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a... (Ana Luísa Pinho)
Functional Magnetic Resonance Imaging (fMRI) provides a means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects of interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich in features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and their capacity to elicit complex behavior composed of discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization.
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of post-transcriptional gene silencing in which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) has been reported in a wide range of eukaryotes, including worms, insects, mammals and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing of another gene, lin-14, at the appropriate time in the
development of the worm C. elegans.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi (non-coding RNA):
miRNA: 23-25 nt long; trans-acting; binds its target mRNA with mismatches; inhibits translation.
siRNA: 21 nt long; cis-acting; binds its target mRNA at a perfectly complementary sequence.
piRNA (Piwi-interacting RNA): 25-36 nt long; expressed in germ cells; regulates transposon activity.
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
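The three steps above can be sketched as a toy program: Dicer chops the long dsRNA into siRNA-sized pieces, and RISC silences any mRNA that contains a perfectly complementary stretch. The sequences and the fragment logic are illustrative (the dsRNA is represented by its sense strand, so a matching fragment appears literally in the mRNA); this is not real bioinformatics software:

```python
# Toy sketch of the RNAi mechanism: Dicer fragments the dsRNA, RISC keeps
# a guide strand, and a target mRNA is degraded on a perfect match.

def dicer(sense_strand: str, size: int = 21) -> list[str]:
    """Cut the dsRNA (represented by its sense strand) into siRNA fragments."""
    return [sense_strand[i:i + size]
            for i in range(0, len(sense_strand) - size + 1, size)]

def risc_silences(sirnas: list[str], mrna: str) -> bool:
    """An mRNA is silenced if any siRNA matches it; with the sense-strand
    representation, a perfectly complementary guide means the sense
    fragment appears verbatim in the mRNA."""
    return any(fragment in mrna for fragment in sirnas)

mrna = "AUGGCUACGAUCGGAUCCUAGCAUGCGAUAAGCUAGCUAA"
dsrna = mrna[5:35]   # dsRNA derived from part of this transcript
print(risc_silences(dicer(dsrna), mrna))   # True: transcript is silenced
```

The sequence-specificity in the real pathway comes from base pairing between the guide strand and the mRNA; the substring test here is just the string-level analogue of that perfect complementarity.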
THE RISC COMPLEX:
RISC is a large (>500 kDa) multi-protein RNA-binding complex that triggers mRNA degradation.
Unwinding of the double-stranded siRNA by an ATP-independent helicase.
The active component of RISC is the Ago protein (an endonuclease), which cleaves the target mRNA.
Dicer: endonuclease (RNase III family)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN:
1. PAZ (Piwi/Argonaute/Zwille): recognition of the target mRNA.
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of the mRNA (RNase H activity).
miRNA:
Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN (Sérgio Sacani)
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
2. We say this to each other all the time, but we set up systems for scholarly advancement and communication that are the antithesis of integration.
[Figure: data types spanning scales - whole brain data (20 µm microscopic MRI); mosaic LM images (1 GB+); conventional LM images; individual cell morphologies; EM volumes & reconstructions; solved molecular structures.]
No single technology serves these all equally well. Multiple data types; multiple scales; multiple databases: a data integration problem.
3. Solving the large problems of science?
• Observation
• Experimentation
• Modeling
• Cooperative data-intensive science
"An unaided human's ability to process large data sets is comparable to a dog's ability to do arithmetic, and not much more valuable." – Michael Nielsen, Reinventing Discovery, 2012.
4. Old Model: Single type of content; single mode of distribution
[Diagram: Scholar – Library – Scholar – Publisher]
FORCE11.org: Future of research communications and e-scholarship
6. The duality of modern scholarship
Observation: Those who build information systems from the machine side don't understand the requirements of the human very well; those who build information systems from the human side don't understand the requirements of machines very well.
Production of "reusable scholarly artifacts" = usable by humans and machines: findable, accessible, citable.
7. • NIF is an initiative of the NIH Blueprint consortium of institutes
– What types of resources (data, tools, materials, services) are available to the neuroscience community?
– How many are there?
– What domains do they cover? What domains do they not cover?
– Where are they?
• Web sites
• Databases
• Literature
• Supplementary material
• PDF files
• Desk drawers
– Who uses them?
– Who creates them?
– How can we find them?
– How can we make them better in the future?
http://neuinfo.org
NIF has been surveying, cataloging and tracking the neuroscience resource landscape since before 2008.
9. Population, Coverage and Linkage of the Resource Registry
[Chart: registry growth over the years by resource type: Database, Software Application, Data Analysis Service, Topical Portal, Core Facility, Ontology, Software Resource.]
Anita Bandrowski and Burak Ozyurt
10. • Automated text mining is used to look for “web page last updated” or copyright dates
– Identified for 570 resources
– 373 were not updated within the last 2 years (65%)
• Manual review of ~200 resources
– 38 not updated within the past 2 years (~20%)
– 8 migrated to new addresses or institutions
– 7 are no longer in service (~3%)
– 3 were deemed no longer appropriate
What happens to these resources? The Registry provides a persistent identifier and metadata record for what once existed but no longer does.
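The kind of text mining described on this slide can be sketched in a few lines. The regular expressions and the two-year staleness threshold below are illustrative assumptions, not NIF's actual pipeline:

```python
import re
from datetime import date

# Illustrative patterns for "last updated" and copyright years in page text.
DATE_PATTERNS = [
    re.compile(r"last\s+updated[:\s]+.*?(\d{4})", re.IGNORECASE),
    re.compile(r"copyright\s+(?:\(c\)\s*)?(\d{4})", re.IGNORECASE),
    re.compile(r"©\s*(\d{4})"),
]

def last_known_year(page_text):
    """Return the most recent four-digit year found near an update/copyright marker."""
    years = []
    for pattern in DATE_PATTERNS:
        years += [int(y) for y in pattern.findall(page_text)]
    return max(years) if years else None

def is_stale(page_text, threshold_years=2, today=None):
    """A resource counts as stale if its newest detected year is older than the threshold."""
    today = today or date.today()
    year = last_known_year(page_text)
    return year is not None and (today.year - year) > threshold_years

print(last_known_year("Page last updated: March 2009. Copyright 2011."))  # → 2011
```

In practice a survey like NIF's also has to handle pages with no machine-readable date at all, which is why manual review of a sample remains necessary.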
11. BD2K: Big Data to Knowledge
• BD2K is a trans-NIH initiative established to enable biomedical research as a digital research enterprise, to facilitate discovery and support new knowledge, and to maximize community engagement.
• BD2K aims to develop the new approaches, standards, methods, tools, software, and competencies that will enhance the use of biomedical Big Data by:
– Facilitating broad use of biomedical digital assets by making them discoverable, accessible, and citable
– Conducting research and developing the methods, software, and tools needed to analyze biomedical Big Data
– Enhancing training in the development and use of methods and tools necessary for biomedical Big Data science
– Supporting a data ecosystem that accelerates discovery as part of a digital enterprise
http://bd2k.nih.gov/
13. What resources are available for GRM1?
With the thousands of databases and other information sources
available, simple descriptive metadata will not suffice
14. NIF data federation
NIF was designed to accommodate the multiplicity of heterogeneous and distributed data resources, providing deep query of the contents and unified views.
250 sources; > 800 M records.
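The core idea of the federation, one query projected across many sources that each keep their own schema, can be sketched as follows. The source names, field names, and records are invented for illustration; the real NIF federation maps hundreds of sources this way:

```python
# Toy federated search: each source keeps its own schema, and a per-source
# mapping projects its records into one unified view at query time.
SOURCES = {
    "atlas_db": {
        "records": [{"gene_symbol": "GRM1", "region": "cerebellum", "level": "high"}],
        "mapping": {"gene": "gene_symbol", "structure": "region", "value": "level"},
    },
    "expr_db": {
        "records": [{"symbol": "GRM1", "area": "cortex", "expression": "low"}],
        "mapping": {"gene": "symbol", "structure": "area", "value": "expression"},
    },
}

def federated_query(gene):
    """Query every source and return matching records in the unified schema."""
    hits = []
    for name, source in SOURCES.items():
        m = source["mapping"]
        for rec in source["records"]:
            if rec[m["gene"]] == gene:
                hits.append({"source": name, "gene": gene,
                             "structure": rec[m["structure"]],
                             "value": rec[m["value"]]})
    return hits

for hit in federated_query("GRM1"):
    print(hit)
```

The design point is that integration happens in the mappings, not in the sources: no database has to change its schema to join the federation.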
15. What do you mean by data? Databases come in many shapes and sizes.
• Primary data:
– Data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL)
• Secondary data:
– Data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD), gene expression levels (Allen Brain Atlas), brain connectivity statements (BAMS)
• Tertiary data:
– Claims and assertions about the meaning of data, e.g., gene upregulation/downregulation, brain activation as a function of task
• Registries:
– Metadata
– Pointers to data sets or materials stored elsewhere
• Data aggregators:
– Aggregate data of the same type from multiple sources, e.g., Cell Image Library, SUMSdb, Brede
• Single source:
– Data acquired within a single context, e.g., Allen Brain Atlas
Researchers are producing a variety of information artifacts using a multitude of technologies.
18. Making it easier to access and understand distributed databases
Each resource implements a different, though related, model; in many cases the systems are complex and difficult to learn.
19. Current challenge: with so much available, how do I find what I need?
• “What genes are upregulated by chronic morphine?”
– It depends
• Most use cases require connecting a researcher to relevant data sets and appropriate tools
– Depending upon the data and tools, the answers may differ
• Many databases have tool bases and workflows that they support
• Much value can be added to individual data sets if we can connect to them
22. SciCrunch: a “social network” for resources
• NIF is a general search engine across all of neuroscience (biomedicine)
– Very powerful for discovery and general browsing
– Can perform analytics across the spectrum of biomedical resources
• Many communities want to create more focused portals
– Specialized for their domain
– Restricted to particular sources
– Organized according to their needs
– Under their own branding
• How do we create a system that satisfies community needs without creating another silo?
28. What is an effective information framework for neuroscience?
Knowledge in space and spatial relationships (the “where”); knowledge in words, terminologies and logical relationships (the “what”).
30. What can ontology do for us?
• Express neuroscience concepts in a way that is machine readable
– Unique identifier
– Synonyms, lexical variants
– Definitions
• Provide a means of disambiguating strings
– Nucleus part of cell; nucleus part of brain; nucleus part of atom
– Each of these concepts has a unique identifier that distinguishes it
• Properties
– Support reasoning
• Provide universals for navigating across different data sources
– Semantic “index”
– Link data through relationships, not just one-to-one mappings
• Provide the basis for concept-based queries to probe and mine data
• Establish a semantic framework for landscape analysis
• Deep data integration for some types of knowledge
Mathematics, computer code or Esperanto?
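The disambiguation point above can be made concrete: the same string "nucleus" resolves to different identifiers depending on context. GO:0005634 really is the Gene Ontology identifier for the cell nucleus; the other identifiers, synonym table, and `resolve` function are illustrative assumptions:

```python
# Sketch: one string, several concepts, each with a unique identifier.
# GO:0005634 is the real GO ID for the cell nucleus; "EX:" IDs are invented,
# and the UBERON ID shown is illustrative.
CONCEPTS = {
    ("nucleus", "cell"): {"id": "GO:0005634",
                          "definition": "membrane-bounded organelle containing chromosomes"},
    ("nucleus", "brain"): {"id": "UBERON:0002308",
                           "definition": "aggregate of neuron cell bodies in the brain"},
    ("nucleus", "atom"): {"id": "EX:atomic_nucleus",
                          "definition": "central core of an atom"},
}

# Synonyms and lexical variants map alternate strings onto the same concept.
SYNONYMS = {"cell nucleus": ("nucleus", "cell"),
            "brain nucleus": ("nucleus", "brain")}

def resolve(term, context=None):
    """Disambiguate a string to a concept identifier using a synonym table or context."""
    key = SYNONYMS.get(term, (term, context))
    concept = CONCEPTS.get(key)
    return concept["id"] if concept else None

print(resolve("nucleus", context="cell"))   # → GO:0005634
print(resolve("brain nucleus"))
```

Once strings are replaced by identifiers like these, concept-based queries and cross-source navigation become set operations over IDs rather than fuzzy string matching.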
31. The scourge of neuroanatomical nomenclature
• NIF Connectivity: 7 databases containing primary connectivity data or claims from the literature on connectivity between brain regions:
– Brain Architecture Management System (rodent)
– Temporal-lobe.com (rodent)
– Connectome Wiki (human)
– Brain Maps (various)
– CoCoMac (primate cortex)
– UCLA Multimodal Database (human fMRI)
– Avian Brain Connectivity Database (bird)
• Total: 1800 unique brain terms (excluding Avian)
• Number of exact terms used in > 1 database: 42
• Number that map to the same identifier, i.e., synonyms: 99
• Number of 1st-order partonomy matches: 385
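The overlap counts above reduce to simple set operations once each database's vocabulary is in hand. A tiny sketch with invented vocabularies (the real analysis ran over ~1800 terms across 7 databases):

```python
from collections import Counter

# Toy vocabularies standing in for the connectivity databases' term lists.
db_terms = {
    "BAMS": {"caudate nucleus", "substantia nigra", "CA1"},
    "CoCoMac": {"area 17", "CA1", "striate cortex"},
    "BrainMaps": {"caudate nucleus", "area 17", "V1"},
}

# Synonym mapping: many strings, one identifier (IDs are illustrative).
synonyms = {"striate cortex": "UBERON:V1", "V1": "UBERON:V1", "area 17": "UBERON:V1"}

# Exact term strings used in more than one database.
counts = Counter(t for terms in db_terms.values() for t in terms)
exact_shared = {t for t, n in counts.items() if n > 1}

# Identifiers shared across databases once synonyms are applied.
ids_per_db = [{synonyms.get(t, t) for t in terms} for terms in db_terms.values()]
id_counts = Counter(i for ids in ids_per_db for i in ids)
shared_ids = {i for i, n in id_counts.items() if n > 1}

print(sorted(exact_shared))   # exact strings used in > 1 database
print(sorted(shared_ids))     # identifiers shared after synonym mapping
```

Note how the synonym step grows the overlap: "area 17", "V1" and "striate cortex" only match once they map to one identifier, which is exactly why the slide's synonym count (99) exceeds its exact-match count (42).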
32. This is your brain on computers
• NeuroLex: > 1 million triples
• Dr. Yi Zeng: Chinese neural knowledge base
• NIF Cell Graph
33. Looking across the ecosystem: where are the data?
[Chart: data sources.]
Bringing knowledge to data: gap analysis
35. How much information makes it into the data space?
[Diagram: nested scopes, from the infinite (∞), to what is potentially knowable, to what is known (literature, images, human knowledge), to what is easily machine processable and accessible. Barriers between layers: unstructured content requiring natural language processing, entity recognition, and image processing and analysis; paywalls; file drawers; abstracts vs. full text vs. tables.]
Estimates suggest that > 50% of scientific output is not recoverable. Chan et al., Lancet, 383, 2014.
36. The tale of the tail
“Human neuroimaging typically is performed on a whole brain basis. However, for several reasons tail of the caudate activity can easily be missed.
• One reason is limitations in the normalization algorithms, that typically are optimized to maximize accuracy for cortical rather than subcortical structures. ...
• A second reason is that standard neuroimaging atlases such as the Harvard-Oxford structural atlas used with neuroimaging analysis programs such as FreeSurfer truncate the caudate at the body, and completely exclude the tail...
• A final reason is that the tail of the caudate is close to the hippocampus, and could be misidentified as such especially in tasks involving learning and memory.
Therefore, the tail of the caudate may be recruited in additional cognitive tasks, but yet not have been properly identified and reported in the neuroimaging literature.”
Seger CA. The visual corticostriatal loop through the tail of the caudate: circuitry and function. Front Syst Neurosci. 2013 Dec 6;7:104. doi: 10.3389/fnsys.2013.00104. eCollection 2013.
38. Data-knowledge mismatch
Dutkowski et al., 2013, Nature Biotechnology.
A major impediment for researchers using ontology identifiers is the perception that ontologies require a consensus on the definition of terms.
By matching assertions about biological entities to data, we can test both our knowledge and our data.
39. The Monarch Initiative
• Genotype-phenotype comparison engine
• Integrates large amounts of genotype-phenotype data
• Semantic similarity analytics
• Human disease ↔ animal model
monarchinitiative.org
Melissa Haendel, OHSU; Chris Mungall, LBL
42. SO ALL I AM IS A NUMBER?
The power of unique and persistent identifiers
43.
44. What studies used my monoclonal mouse antibody against actin in humans?
“The following antibodies were used for immunoblotting: β-actin mAb (1:10,000 dilution, Sigma-Aldrich)…”
Papers are currently poor at identifying the simplest part of the paper: the materials used.
45. Pilot Project
• Authors to identify 3 types of research resources:
– Software/databases
– Antibodies
– Model organisms
• Include a unique identifier (RRID) in the methods section
• Voluntary for authors
• Journals did not have to modify their submission systems
Launched February 2014: 3 month commitment and more…
Two simple questions: Could authors do it? Would authors do it?
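Part of what makes the pilot work is that an RRID in a methods section is trivially machine-findable. A sketch of the extraction step; the regular expression covers the common RRID shapes (AB_ for antibodies, SCR_ for software) but is an illustrative assumption rather than the official validation pattern, and the example accession numbers are invented:

```python
import re

# Illustrative pattern for RRIDs cited in methods text, e.g. RRID:AB_123456
# or RRID:SCR_123456. Not the official validation expression.
RRID_RE = re.compile(r"RRID:\s*([A-Z]+_[A-Za-z0-9_:-]+)")

def extract_rrids(methods_text):
    """Return every RRID citation found in a block of methods text."""
    return ["RRID:" + m for m in RRID_RE.findall(methods_text)]

methods = ("Actin was detected with a beta-actin mAb "
           "(Sigma-Aldrich, RRID:AB_476744); analysis used "
           "ImageJ (RRID:SCR_003070).")
print(extract_rrids(methods))
```

Contrast this with the unidentifiable citation on the previous slide: with the RRID present, linking the paper to the exact antibody or tool is a one-line pattern match instead of a manual curation task.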
46. Resource IDs from NIF-aggregated databases
• A single portal for authors
• >10 authoritative databases
• One search interface
• Simple directions
• Prominent “Cite This” button
• Help desk
RII Portal: http://scicrunch.org/resources
The initiative was possible because of the massive registries available and the aggregation services of NIF/SciCrunch.
47. RRIDs in the wild!
• >300 articles have appeared to date
• 47 journals
• 800+ RRIDs
• 96% correct!
Database available at: https://www.force11.org/node/5635
Authors can and will adopt new citation styles for research resources.
48. Increased identifiability of resources after the Resource Identification Initiative pilot
Update of Vasilevsky et al., PeerJ, 2013.
49. What can we do with an RRID?
• A resolver service has been created
• 3rd-party tools are being created to provide linkage between resources and papers
http://scicrunch.com/resolver/RRID:AB_90755
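Following the resolver URL pattern shown above, turning an RRID into an actionable link is pure string construction. A sketch (no network call is made; the base URL is taken from the slide):

```python
# Sketch: an RRID becomes actionable by prepending the resolver base URL,
# following the pattern shown above. Purely local; no request is issued.
RESOLVER_BASE = "http://scicrunch.com/resolver/"

def resolver_url(rrid):
    """Build a resolver URL for an identifier like 'RRID:AB_90755'."""
    if not rrid.startswith("RRID:"):
        raise ValueError("expected an identifier of the form 'RRID:<prefix>_<accession>'")
    return RESOLVER_BASE + rrid

print(resolver_url("RRID:AB_90755"))
# → http://scicrunch.com/resolver/RRID:AB_90755
```

Because the scheme is uniform, any third-party tool that finds an RRID in a paper can link back to the resource record without consulting each source database separately.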
50. “Alerting” service
• Teaming with Hypothes.is and ORCID to develop annotation tools for RRIDs, including “alerts” on reagents and tools
51. Hypothes.is is a tool for creating and sharing annotations on web pages
http://hypothes.is
52. An ecosystem for research objects: the social network of research resources
[Diagram: articles, code, blogs, workflows, data, and portals, each carrying IDs and connected through search engines.]
Unique and persistent identifiers, and a system for referencing them, allow an ecosystem to function.
53. WHAT CAN WE DO NOW?
Lessons learned from my career
54. Share your data and share it effectively
• Discoverability
– Data can be found
• Accessibility
– Data can be accessed and access rights are clear
– Links to data are stable
• Assessability
– The reliability of the data can be determined
• Understandability
– The data can be understood
• Usability
– The data are in a usable form
Publishing data on your website or as supplemental material is not the best way to make it available.
55. What about my data?
• Best practice: put it in a repository
• What repository?
– A community repository for your data type, e.g., NITRC, GEO
– A general repository: Dryad, FigShare, NIH Data Commons
– An institutional repository: research libraries are setting up repositories to manage their “digital assets”
NIF can help you find a place for your data.
56. Make sure you and your scholarly outputs can be linked
A distributed system like the biomedical data ecosystem runs on the ability to uniquely identify relevant entities.
• ORCID iD: unique researcher identifier
• Editors, authors: participate in the Resource Identification Initiative
“Sound, reproducible scholarship rests upon a foundation of robust, accessible data. Data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice.”
–Joint Declaration of Data Citation Principles, http://www.force11.org/datacitation
Coming soon: formal standards for citing data sets.
57. Future of Research Communications and e-Scholarship (FORCE11.org)
http://force11.org. Join FORCE11!
58. NIF team (past and present)
Jeff Grethe, UCSD, Co-PI
Amarnath Gupta, UCSD,
Anita Bandrowski, NIF Project Leader
Gordon Shepherd, Yale University
Perry Miller
Luis Marenco
Rixin Wang
David Van Essen, Washington University
Erin Reid
Paul Sternberg, Caltech
Arun Rangarajan
Hans Michael Muller
Yuling Li
Giorgio Ascoli, George Mason University
Sridevi Polavarum
Fahim Imam
Larry Lui
Andrea Arnaud Stagg
Jonathan Cachat
Jennifer Lawrence
Svetlana Sulima
Davis Banks
Vadim Astakhov
Xufei Qian
Chris Condit
Mark Ellisman
Stephen Larson
Willie Wong
Tim Clark, Harvard University
Paolo Ciccarese
Karen Skinner, NIH, Program Officer
(retired)
Jonathan Pollock, NIH, Program Officer
And my colleagues in Monarch, dkNet, 3DVC, Force 11
59.
60. The Encyclopedia of Life
A…
Access to data has changed over the years.
Tim Berners-Lee: web of data. Wikipedia defines Linked Data as “a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.”
http://linkeddata.org/
Genbank; PDB
“Whichever technology wins broad adoption will become, by default, the data web. That’s why we don’t need to know which technological vision of the data web will win to conclude that the data web is inevitable.” –Michael Nielsen
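The Linked Data idea on this slide reduces to facts expressed as subject-predicate-object triples whose terms are URIs, so statements published independently can be joined. A minimal sketch using plain tuples instead of an RDF library; the "ex:" URIs are invented stand-ins for real identifiers like Genbank or PDB accessions:

```python
# Minimal sketch of Linked Data: facts as (subject, predicate, object)
# triples whose terms are identifiers, so data from different publishers
# can be joined by following links. All "ex:" URIs are illustrative.
triples = {
    ("ex:GRM1", "ex:encodes", "ex:mGluR1"),
    ("ex:mGluR1", "ex:expressed_in", "ex:cerebellum"),
    ("ex:cerebellum", "ex:part_of", "ex:brain"),
}

def objects(subject, predicate):
    """All objects linked from a subject by a given predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}

# Follow links across statements that could come from different sources:
protein = objects("ex:GRM1", "ex:encodes").pop()
print(objects(protein, "ex:expressed_in"))   # → {'ex:cerebellum'}
```

A real deployment would use RDF serializations and SPARQL rather than Python sets, but the traversal shown here, joining facts purely by shared identifiers, is the mechanism the quote is betting on.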
61. “Empty Archives”
Repository | Type of data | Date started | Host | Public data | Comments
CARMEN | neuroscience / electrophysiology | 2008 | Newcastle University, United Kingdom | 100 | Requires account
INCF Dataspace | various | 2012 | International Neuroinformatics Coordinating Facility | ? |
Open Source Brain | models | 2014 | University College London | 47 | Cells and networks; 23 technology showcases
XNAT Central | neuroimaging | 2010 | Washington University School of Medicine in St. Louis, Missouri, USA | 34 | Site states 370 projects, 3804 subjects, and 5172 imaging sessions; 123 were visible but do not all appear to be public; 34 public data sets were listed under “Recent”
Open Connectome | serial electron microscopy and magnetic resonance | 2011 | Johns Hopkins University, Maryland, USA | 9 (graphs) | 7 image projects; 19 graphs
UCSF DataShare | biomedical, including neuroimaging, MRI, cognitive impairment, dementia, aging | 2011 | University of California at San Francisco, California, USA | 15 |
BrainLiner | various functional data | 2011 | ATR, Kyoto, Japan | 10 |
ModelDB | neuron models | 1996 | Yale University, Connecticut, USA | 875 |
NeuroMorpho | digitally reconstructed neurons | 2006 | George Mason University, Virginia, USA | 10004 |
Cell Image Library / Cell Centered Database | images, videos, and animations of cells | 2002 (CCDB), 2010 (CIL) | American Society for Cell Biology / University of California at San Diego, California, USA | 10,360 | The CCDB had 450 data sets when it merged with CIL; CIL also contains large imaging data sets that are not counted as separate images
CRCNS | computational neuroscience datasets | 2008 | University of California at Berkeley, California, USA | 38 |
OpenfMRI | fMRI | 2012 | University of Texas at Austin, Texas, USA | 22 |
“I finally gave NeuroMorpho my data so they would stop
63. Make your data machine-actionable
Van De Werd HJ, Uylings HB. Brain Struct Funct. 2014 Mar;219(2):433-59. doi: 10.1007/s00429-013-0630-
64. Use RRIDs in your papers, databases and journals!
• Antibody and model organism databases are adopting them
65. NIF Information Framework: query and alignment
• NIFSTD: an aggregate of community ontologies with some extensions for neuroscience, e.g., Gene Ontology, ChEBI, Protein Ontology
• Available as services through NIF and BioPortal
[Diagram: NIFSTD modules: Organism; NS Function; Molecule (Macromolecule, Gene, Molecule Descriptors); Investigation (Techniques, Reagent, Protocols, Resource, Instrument); Subcellular Structure; Cell; Anatomical Structure; Dysfunction; Quality.]
NIF uses ontologies to enhance search and discovery but is not constrained by them.