The document discusses the progress made over the past 10 years towards achieving a vision of having comprehensive biodiversity information accessible online. It examines several key components of this vision including developing a global taxon inventory, linking names to primary literature, assembling georeferenced species records, and creating machine-readable taxon trait data. For each component, it provides details on the scope of the task, current resources addressing it, and how complete the work is so far, noting many challenges that still remain.
10 years of global biodiversity databases: are we there yet?
1. 10 years of global biodiversity databases: are we there yet?
Tony Rees
Independent data consultant, Northern Rivers region, New South Wales, Australia
previously: CSIRO Marine & Atmospheric Research, Hobart, Tasmania
Global ocean bio-records in OBIS, 2015
2. The vision: “Biodiversity information on every desktop” [ / device]…
• A global taxon inventory: up-to-date species lists, synonymies, etc. (for all groups)
• Citations, links to primary literature: direct access to the primary taxonomic literature (for all described taxa), including full text (preferably…)
• “All” georeferenced records accessible, for all species: no need for individuals to do the data aggregation; map local / regional / global records, show details for any data item
• Indexes of taxon traits: e.g. to support sort / filter / group by…
• Predictive mapping / computed range maps for all taxa: fill sampling gaps via niche modelling, to produce comprehensive global species maps
• Plus more (phylogenies, illustrations, genetics, descriptions, keys…)
3. A standardised approach for this talk
Rationale for each component / activity (why do we care?)
Size of the problem (or sub-problem)
Who is addressing it (and what they are doing)
How far have they (“we”) got, and how much is still to be done…
Some other points to consider:
• open vs. closed access to relevant content (who can access?)
• machine vs. human retrievability ( -> services, not just pages to view)
• degree of consolidation available (saves querying multiple resources)
• web query only, or are the base data available for export/user upload?
4. A global taxon inventory
From presentation by Quentin Wheeler, International Institute for Species Exploration (IISE):
5. A global taxon inventory – why do we care?
Useful to know with what organisms we share the planet
History of life as its own study area, also key to understanding present life forms and their relationships
Ensure taxonomic names do not get accidentally re-used
Construct list once, use many times (no need to re-create from scratch)
Reconcile old names / synonyms to current taxon concepts (important for data integration)
Provide “taxonomic backbone” underpinning other biodiversity activities / projects.
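The reconciliation point above can be illustrated with a small sketch. It assumes a hypothetical in-memory "backbone" that maps every known name (accepted or synonym) to the currently accepted name; the `BACKBONE` entries, the `reconcile` helper, and the record counts are made up for illustration and do not come from any database discussed in the talk.

```python
from collections import defaultdict

# Hypothetical mini-backbone: each known name points to the accepted name.
# Accepted names point to themselves; synonyms point to the accepted name.
BACKBONE = {
    "Physeter macrocephalus": "Physeter macrocephalus",
    "Physeter catodon": "Physeter macrocephalus",  # synonym
    "Orcinus orca": "Orcinus orca",
    "Delphinus orca": "Orcinus orca",              # synonym (original combination)
}


def reconcile(records, backbone):
    """Merge (name, count) records under accepted names.

    Returns a dict of accepted name -> summed count, plus a list of
    names that the backbone does not recognise.
    """
    merged = defaultdict(int)
    unmatched = []
    for name, count in records:
        accepted = backbone.get(name.strip())
        if accepted is None:
            unmatched.append(name)
        else:
            merged[accepted] += count
    return dict(merged), unmatched


# Illustrative occurrence counts as they might arrive from different sources.
records = [
    ("Physeter catodon", 12),
    ("Physeter macrocephalus", 30),
    ("Delphinus orca", 3),
    ("Physeter sp.", 1),
]
merged, unmatched = reconcile(records, BACKBONE)
print(merged)
print(unmatched)
```

Because the backbone is built once and reused, each data aggregation project can merge records filed under old names without re-deriving the synonymy itself.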
7. From Chapman’s summary document (2009 edition):
NB “Others” includes non-green algae, Protista, prokaryotes and viruses (refer document for details)
“Invertebrates” includes 1m insects, 360k others (incl. 102k arachnids, 85k molluscs, 47k Crustacea)
“Estimated” total spp. for world is 11.3m i.e. only 17% of estimated global biodiversity yet named (!)
9. From Chapman’s summary document (2009 edition):
…add another 200k-300k(?) for known fossil species, maybe multiply 2x-3x to include synonyms…
…gives upwards of 5m species names to catalogue/organise (+0.5m genera…)
+ new species descriptions (~20k/year) and higher taxa, also new combinations (??/year)
5m+ names!
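The back-of-envelope estimate above can be worked through as a short calculation. This is a rough sketch using only the rounded figures quoted on the slides (11.3m estimated species, 17% named, 200k-300k fossil species, a 2x-3x multiplier for synonyms); the function name and the way the ranges are combined are assumptions made for illustration.

```python
# Rounded figures quoted on the slides (Chapman 2009 summary).
ESTIMATED_TOTAL_SPECIES = 11.3e6
DESCRIBED_FRACTION = 0.17                  # "only 17% ... yet named"
FOSSIL_SPECIES_RANGE = (200e3, 300e3)      # "200k-300k(?)"
SYNONYM_MULTIPLIER_RANGE = (2, 3)          # "multiply 2x-3x"


def names_to_catalogue():
    """Return (described species, low estimate, high estimate) of names."""
    described = ESTIMATED_TOTAL_SPECIES * DESCRIBED_FRACTION
    low = (described + FOSSIL_SPECIES_RANGE[0]) * SYNONYM_MULTIPLIER_RANGE[0]
    high = (described + FOSSIL_SPECIES_RANGE[1]) * SYNONYM_MULTIPLIER_RANGE[1]
    return described, low, high


described, low, high = names_to_catalogue()
print(f"described extant species: {described / 1e6:.1f}m")
print(f"species names to catalogue: {low / 1e6:.1f}m to {high / 1e6:.1f}m")
```

With these inputs the range comes out at roughly 4.2m to 6.7m names, which brackets the "5m+" figure on the slide.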
11. ION – Index to Organism Names
www.organismnames.com
2005: 1.8m names, all ranks (incl. synonyms), from 2.8m documents
2015: 5.2m names, all ranks (incl. synonyms), from 3.5m documents
- Animal names only (+ protists), cf. other resources for plants etc.
Likes:
• Comprehensive coverage (“most” zoological names held)
• Includes fossils as well as extant taxa
• Cites original publications for most post-1860 names
• Low latency (c. 6 months to name appearance in index)
• Some tax. hierarchy for all names
• ION ID minted for every name (usefulness varies)
Dislikes:
• Many more names than taxa (spelling + authority variations, synonyms, bad data) – needs deduplication before use
• Hard to work out which is “correct” name or what names are synonyms, etc.
• Some quirks in citations as given (including author spellings)
• Detailed publication and taxon info is behind paywall
12. Newly published names (all ranks) in ION,
Nov 2015 (total 2.01m)
names from Index Animalium
(1758-1850)
names from Zoological Record
(1864-current)
13. Catalogue of Life
www.catalogueoflife.org
2005: 530k valid species names (+ ?? synonyms), from 23 databases
2015: 1.6m valid species names (+ 1.3m synonyms), from 151 databases
- All taxonomic groups, extant only (a few fossils starting 2015)
Likes:
• Name quality high (all expert-supplied), synonymies explicit, no (few) duplicates
• Internally consistent taxonomic hierarchy, kingdom -> family
• Coverage increasing over time (claims currently 84% of all extant species)
Dislikes:
• Some groups not yet covered (also no fossils)
• Synonymies not always complete (some old names not listed)
• No author, synonym information at ranks above species
• No links to original literature (although these may be traceable via source databases)
• More latency than ION (takes a while for new names to appear)
• No stable IDs for names (cannot use for linking to current edition)
15. Partial ION listing – search for “Physeter”
Note: 1. ION often includes the same name in multiple variants (mix of “good” and
“bad” content), giving over-representation of number of “real” names
2. This is a list of names, not taxa (single taxon can have multiple names, e.g.
valid name plus synonyms – not distinguished in ION).
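A minimal sketch of the kind of deduplication this implies: collapse variant strings onto a crude canonical key (genus plus species epithet, authority and year dropped). The variant strings below are invented for illustration and are not taken from ION; the function names are hypothetical.

```python
import re
from collections import defaultdict

def canonical_key(name_string):
    """Reduce a name string to a lower-case 'genus [epithet]' key.

    Deliberately crude: keeps the first word (genus) plus the second
    word if it starts in lower case (species epithet), and drops
    everything else (authority, year, punctuation).
    """
    words = re.findall(r"[A-Za-z]+", name_string)
    if not words:
        return ""
    key = [words[0].lower()]
    if len(words) > 1 and words[1][0].islower():
        key.append(words[1].lower())
    return " ".join(key)

def group_variants(name_strings):
    """Group raw name strings that share the same canonical key."""
    groups = defaultdict(list)
    for s in name_strings:
        groups[canonical_key(s)].append(s)
    return dict(groups)

# Invented variant strings, for illustration only
variants = [
    "Physeter macrocephalus Linnaeus, 1758",
    "Physeter macrocephalus Linnaeus 1758",
    "Physeter  macrocephalus L.",
    "Physeter Linnaeus, 1758",
]
groups = group_variants(variants)
for key, members in groups.items():
    print(key, len(members))
```

This only merges authority and punctuation variants of an identically spelled name. It does not catch misspellings, and it says nothing about which names are synonyms of which taxon; that still needs expert-supplied data.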
17. Selected other names/taxon databases of note
PaleoBioDB (fossils) – formerly PaleoDB
• 2005: 60k names, all ranks
• 2015: 320k names, all ranks (incl. synonyms)
– Good coverage of many fossil taxa (most groups)
World Register of Marine Species (WoRMS)
• 2007: first release, 75k valid species + ?? syns
• 2015: 230k valid species + 96k synonyms
– Excellent coverage of marine taxa (almost all
groups), incl. some fossils
Interim Register of Marine and Nonmarine Genera (IRMNG)
– Tony Rees / OBIS project
• 2006: first release, 159k genus names incl. synonyms
(the latter partly known, part not)
• 2015: 488k genus names incl. synonyms (also 1.9m
species names incl. synonyms)
– Comprehensive genus level coverage of all groups,
extant + fossil, not all assigned to family as yet
18. For other groups (examples, NB completeness varies)
(etc., etc.)
20. Linking names to the literature – why do we
care?
Initial publication / description / designated type is “anchor” for
every taxonomic name and concept
Use to verify “indexing” details (taxon name + author, year) are
correctly represented
Included text details (title < abstract < full text) can be “mined”
to extract information useful for indexing (or just reading)
Entry point to wider literature via refs. list, subsequent citations,
etc.
22. Genus #1 in IRMNG: example “minimal” citation styles
Genus | Authority | Microcitation
Aa
Aa | Baker, 1940
Aa | Baker, 1940 | Bull. Bishop Mus., 165, 107
23. Genus #1 in IRMNG: example “better” (=standard) citation style
Genus | Authority | Microcitation | Full citation
Aa
Aa | Baker, 1940
Aa | Baker, 1940 | Bull. Bishop Mus., 165, 107
Aa | Baker, 1940 | | Baker, H.B., 1940. Zonitid snails from Pacific Islands. Part 2.-Hawaiian genera of Microcystinae. Bulletin Bishop Museum Honolulu, 165: 105-201.
ION has a subset of these (article title, citation only)
24. Genus #1 in IRMNG: example “best” citation style with online links
Genus | Authority | Microcitation | Full citation | Online link (abstract) | Online link (full text)
Aa
Aa | Baker, 1940
Aa | Baker, 1940 | Bull. Bishop Mus., 165, 107
Aa | Baker, 1940 | | Baker, H.B., 1940. Zonitid snails from Pacific Islands. Part 2.-Hawaiian genera of Microcystinae. Bulletin Bishop Museum Honolulu, 165: 105-201.
Aa | Baker, 1940 | | Baker, H.B., 1940. (etc.) | http://... (or DOI)
Aa | Baker, 1940 | | Baker, H.B., 1940. (etc.) | | http://... (or DOI)
ION has a subset of these (article title, citation only)
BioNames (R. Page project) has some of these
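One way to picture the three citation styles above is as a single record with progressively more fields filled in. The sketch below is illustrative only: the class and field names are hypothetical and do not reflect the actual IRMNG, ION or BioNames schemas.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NameCitation:
    """A genus name with progressively richer citation detail."""
    genus: str
    authority: Optional[str] = None       # e.g. "Baker, 1940"
    microcitation: Optional[str] = None   # abbreviated journal, volume, page
    full_citation: Optional[str] = None   # full bibliographic reference
    abstract_url: Optional[str] = None    # online link to abstract (or DOI)
    fulltext_url: Optional[str] = None    # online link to full text (or DOI)

    def level(self) -> str:
        """Classify the record as 'minimal', 'better' or 'best'."""
        if self.abstract_url or self.fulltext_url:
            return "best"
        if self.full_citation:
            return "better"
        return "minimal"

aa = NameCitation(
    genus="Aa",
    authority="Baker, 1940",
    microcitation="Bull. Bishop Mus., 165, 107",
)
print(aa.level())

aa.full_citation = (
    "Baker, H.B., 1940. Zonitid snails from Pacific Islands. "
    "Part 2.-Hawaiian genera of Microcystinae. "
    "Bulletin Bishop Museum Honolulu, 165: 105-201."
)
print(aa.level())
```

Keeping the link fields optional lets a database start with minimal records and upgrade them as literature links become available.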
25. Online access to scientific literature – 1
Q.: How many articles in the “scientific literature”?
A.: Guesstimate might be 180m total “scholarly articles”, 120m in all
sciences, 20m in biology over past 250 years
Google Scholar: ~160m citations (all disciplines)
Web of Science: 90m items indexed (1900 onwards)
PubMed: 24m records (mostly 1966 onwards)
Biological Abstracts: 12m records, 1926 onwards (includes some
non-journal material)
Ideally would like single master list, unique ID/hyperlink for each work
(article/chapter/book etc.)
DOI (Digital Object Identifier) system / CrossRef introduced in 2000,
good for newly published work
currently used for 114m “objects” (incl. some retrospective allocation;
NB not all are scientific literature)
“Publishers use CrossRef's tools to convert citations from dumb
strings to useful links” (quote from R. Page discussion post)
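As a rough illustration of turning a "dumb string" into a link, the sketch below builds a lookup URL for the public CrossRef REST API (its `works` endpoint with the `query.bibliographic` and `rows` parameters) and a resolver link for a DOI once one is known. It only constructs URLs and sends no request; the helper names are hypothetical, and the DOI in the usage line is a placeholder, not a real identifier.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"

def crossref_query_url(citation, rows=3):
    """Build a CrossRef search URL for a free-text citation string."""
    params = {"query.bibliographic": citation, "rows": rows}
    return f"{CROSSREF_WORKS}?{urlencode(params)}"

def doi_link(doi):
    """Turn a bare DOI into a resolvable https link."""
    return f"https://doi.org/{doi}"

url = crossref_query_url("Baker, H.B., 1940. Zonitid snails from Pacific Islands.")
print(url)
print(doi_link("10.9999/placeholder"))  # placeholder DOI, illustration only
```

A search like this can only succeed for works that have been allocated a DOI, which excludes much of the older taxonomic literature.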
26. Online access to scientific literature – 2
Zoological Record has indexed 3.5m works in zoology 1864-
current (increasing at 70k/year, 1.5k/week), but individual
records are behind paywall
27. Online access to scientific literature – 3
Biodiversity Heritage Library (BHL) is scanning older literature (esp.
pre-1923) and placing online
limited subset indexed by article title, otherwise (all) indexed by journal
and page no. (then has BHL page ID – can link to that)
search can be initiated by journal title, volume + page (if already
known)
can also search by taxon scientific name – but some instances will be
missed (BHL OCR [optical character recognition] is less than 100%
reliable)
this author’s experience looking for initial publication instances of
older names – success in around 1/3 of cases (not too bad), however
requires manual search (time consuming)
ideally, original description page links should be compiled somewhere
for others to re-use (not currently done on any scale)
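To illustrate why exact name searches miss some instances in OCR text, and one possible workaround, the sketch below scores each OCR token against the target name with Python's standard `difflib`. The OCR snippet is invented, the 0.85 threshold is an arbitrary assumption, and this is not how BHL's own name search works.

```python
from difflib import SequenceMatcher

def fuzzy_find(name, ocr_text, threshold=0.85):
    """Return (token, score) pairs for OCR tokens that resemble `name`."""
    target = name.lower()
    hits = []
    for token in ocr_text.split():
        cleaned = token.strip(".,;:()[]").lower()
        score = SequenceMatcher(None, target, cleaned).ratio()
        if score >= threshold:
            hits.append((token, round(score, 2)))
    return hits

# Invented OCR output with a typical "l" read as "1"
ocr_text = "The genus Megab1attina is here proposed for a large species"

print("Megablattina" in ocr_text)      # exact search fails
hits = fuzzy_find("Megablattina", ocr_text)
print(hits)
```

This handles single-word (genus) names only; binomials would need a sliding two-token window, and a lower threshold trades missed hits for more false positives.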
28. BHL sample page: American Journal of Science s4 v15 (1903) p. 312
(original description of Megablattina Sellards, 1903, a cockroach)
30. Online access to scientific literature – 4
More recent literature – mix of publisher websites and operations like
JSTOR, often behind paywalls (though abstracts typically not so) – but not
all yet available digitally (BHL also has some post-1922 content)
Subscription/abstracting services (Zoological Record, Web of Science,
etc.) have better coverage, but are often not open access for viewing or
external linking purposes (although PubMed is)
Some tools constructed around planned all-encompassing “Bibliography
of Life” project (from Europe, http://biblife.org/), but progress difficult to
gauge as yet (claims 215k references held); another European project:
GRIB (Global References Index to Biodiversity), however development
appears to have stopped…
31. In summary: online [open] access available to subsets of article titles
> abstracts > full text in decreasing proportions
No single comprehensive source of online refs. available at
this time, users must “mix and match” sources as available
Few direct links in current tax. databases to literature that is
online (some noteworthy exceptions)
Over 95% of taxonomic literature pre-dates year 2000 starting
point for DOIs
Most comprehensive indexes are currently commercial
products (behind paywalls), not much traction in “community
/ open access” equivalents as yet.
33. Machine-readable sets of taxon traits – why do
we care?
Powerful tools for automated subsetting / filtering out sets of
interest
Useful for data quality assurance (e.g. flag suspect data, fix
logical inconsistencies)
Can form the basis of auto-response “expert systems” / keys
e.g. as already available for specialised groups
Need for standardised vocabularies/ semantics for indexing
terms, units used, etc.
34. Operations like OBIS (Ocean Biogeographic Information
System) want to display only (e.g.) marine + extant taxa,
suppress others
No “trait bank” systems existed at that time, IRMNG was
created to fill this need: flag taxa as extant/fossil,
marine/nonmarine
IRMNG data & flags subsequently incorporated into other systems
e.g. WoRMS, ALA, OTOL, EOL, more… – IRMNG flags are ~70%
complete at genus level, 95%+ for species
EOL (Encyclopedia of Life) is establishing “TraitBank” (2014 on)
to capture similar traits + more
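A minimal sketch of how such flags can drive filtering of the kind OBIS wanted (marine + extant only). The record structure is hypothetical and is not the IRMNG schema; a flag of `None` stands for "not yet assessed", since the flags are incomplete. "Exemplum" is an invented placeholder genus.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Taxon:
    name: str
    is_marine: Optional[bool]   # None = not yet flagged
    is_extant: Optional[bool]   # None = not yet flagged

def marine_extant(taxa, include_unknown=False):
    """Names of taxa flagged marine and extant.

    With include_unknown=True, taxa with missing flags are also kept
    as long as no flag is explicitly False.
    """
    keep = []
    for t in taxa:
        flags = (t.is_marine, t.is_extant)
        if all(f is True for f in flags):
            keep.append(t.name)
        elif include_unknown and all(f is not False for f in flags):
            keep.append(t.name)
    return keep

taxa = [
    Taxon("Physeter", is_marine=True, is_extant=True),        # sperm whales
    Taxon("Helix", is_marine=False, is_extant=True),          # land snails
    Taxon("Tyrannosaurus", is_marine=False, is_extant=False), # fossil, terrestrial
    Taxon("Exemplum", is_marine=True, is_extant=None),        # invented placeholder
]

print(marine_extant(taxa))
print(marine_extant(taxa, include_unknown=True))
```

Whether to show or suppress unflagged taxa is a policy choice for the consuming system; the `include_unknown` switch just makes that choice explicit.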
35. EOL TraitBank most populous content (Oct 2015)
Note, EOL is an aggregator, not an original content generator (relies on
content supplied by third parties)
36. EOL TraitBank most populous content (Oct 2015)
EOL traits recently added to Google search, Nov 2015
37. Room for further development in this area…
e.g. TDWG (Taxonomic Databases Working Group) had active
interest in development of “SPM” (Species Profile Model)
around 2007-8, seems a bit quiet since
character matrices stored in computer-based keys e.g. Lucid,
DELTA, etc. could presumably be leveraged in some cases
some domains already well covered in standard manner (e.g.
FishBase for 33k fishes, SeaLifeBase for 71k non-fish marine
taxa)
SeaLifeBase example shown in next slide…
41. Assembling georeferenced species data – why
do we care?
“Where” is as important as “what” in biodiversity studies
Central repository much easier point of access than
thousands/millions of distributed sources
See gaps in existing data holdings / state of current data sampling,
digitisation and mobilisation
Overlay spatial distributions with other layers e.g. country
boundaries, habitats, environmental variables – generate regional
lists, understand controlling factors
Spot bad data (appearing in unlikely places on the map)
Use for spatial analysis (geography as computable data).
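As a small illustration of "spotting bad data" once coordinates are computable, the sketch below flags records with impossible coordinates, the common 0,0 placeholder, or positions outside an expected bounding box. The checks, the function name and the example box are assumptions for illustration, not rules used by OBIS or GBIF.

```python
def suspect_reasons(lat, lon, bbox=None):
    """Return a list of reasons a coordinate pair looks suspect.

    bbox, if given, is (min_lat, min_lon, max_lat, max_lon) for the
    area where the species is expected. Boxes that cross the
    antimeridian are not handled in this sketch.
    """
    reasons = []
    if not (-90 <= lat <= 90) or not (-180 <= lon <= 180):
        reasons.append("out of range")
    elif lat == 0 and lon == 0:
        reasons.append("zero coordinates")
    elif bbox is not None:
        min_lat, min_lon, max_lat, max_lon = bbox
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            reasons.append("outside expected area")
    return reasons

# Illustrative box roughly covering the Australian region
australia = (-50.0, 100.0, -5.0, 170.0)

print(suspect_reasons(-35.0, 150.0, australia))   # plausible record
print(suspect_reasons(0.0, 0.0, australia))       # placeholder coordinates
print(suspect_reasons(35.0, 150.0, australia))    # sign of latitude flipped?
print(suspect_reasons(-135.0, 50.0))              # latitude out of range
```

Flagged records are candidates for review rather than automatic deletion, since an "unlikely" point is sometimes a genuine range extension.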
42. Distributed data networks
First data networks in USA, late 1990s – VertNET, HerpNET, ORNIS –
connecting museum data (vertebrate specimen records) in participating
agencies (also in Australia: Australian Virtual Herbarium)
OBIS (2002 on) and GBIF (2004 on) provide gateways to both specimen
and observation data from multiple agencies worldwide
OBIS (marine species records only):
2005: 5.6m records from 38 data sources (40,700 species)
2015: 44.9m records from 1,916 data sources (147,000 species)
GBIF (all habitats):
2005: 45m records from 334 data sources (?? species)
2015: 577m records from 15,196 data sources (?? species)
OBIS data flows into GBIF (though with some issues), also into local
networks e.g. ALA (Atlas of Living Australia)
43. Building OBIS – 2002-5
(trying to make a working system, and provide a good user experience)
“OBIS v2” front page /
spatial search interface, 2005
44. Current OBIS sample map & data
OBIS records for Physeter macrocephalus (sperm whale) in Australian
region, Oct 2015 (51,756 global records)
45. GBIF sample map & data
GBIF records for Physeter macrocephalus in Australian region, Oct 2015
(34,436 global records)
46. ALA (Atlas of Living Australia) presentation of records for Physeter macrocephalus in
Australian region, Oct 2015
48. How complete are holdings of GBIF, OBIS, etc.?
From Hill et al., 2012 paper: at least 1 bn – 2 bn specimens in
biological collections worldwide (not all currently digitised)
Observations probably outnumber specimens by 100x - 1000x
Gives maybe 500 bn potential records +/- ; GBIF has 0.5 bn to
date (0.1%)…
Not all records are of equal importance for initial studies of
distributions (much redundancy), maybe OBIS/GBIF have <5% of
most useful records at this time…
Existing holdings presently heavily skewed towards better
sampled/accessible areas, also regions where digitisation is more
advanced
True “target numbers” difficult to assess (every individual of every
species, or what?)
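The back-of-envelope numbers on this slide can be reproduced as follows. All inputs are the slide's own figures; the variable names are illustrative.

```python
# Specimens in collections worldwide (billions), per Hill et al., 2012
specimens_bn = (1, 2)
# Observations per specimen (order-of-magnitude guess from the slide)
obs_per_specimen = (100, 1000)

low_bn = specimens_bn[0] * obs_per_specimen[0]    # most conservative case
high_bn = specimens_bn[1] * obs_per_specimen[1]   # most generous case

gbif_bn = 0.5         # GBIF holdings to date
nominal_bn = 500      # the slide's "maybe 500 bn" working figure

share_nominal = gbif_bn / nominal_bn * 100
share_low = gbif_bn / low_bn * 100
share_high = gbif_bn / high_bn * 100

print(f"potential records: {low_bn} bn to {high_bn} bn")
print(f"GBIF share: {share_nominal:.1f}% of {nominal_bn} bn "
      f"(range {share_high:.3f}% to {share_low:.1f}%)")
```

Even at the most conservative end of the range, GBIF's share of potential records stays below 1%.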
50. Predicted distributions (environmental niche
modelling) – why do we care?
Available georeferenced data are always incomplete, need a
mechanism to intelligently fill in data gaps, produce more
complete biodiversity maps & atlases
Move from hand drawn maps / non-digital “expert knowledge”
to computable data
Model potential spread of invasives into new areas (show
suitable habitat)
Model potential changes in species range in response to
changing climate or other factors
Facilitate better understanding of broad- (and fine-) scale
factors controlling species distributions.
51. Niche modelling concept
Range of methodologies available including MAXENT, GARP, simple niche
models e.g. Relative Environmental Suitability (RES)
Ready et al., 2010 (incl. Tony Rees) contend that simple methods work as
well as more complex ones:
Source: A. Guisan group web page, Université de Lausanne, Switzerland
http://www.unil.ch/idyst/en/home/menuinst/research-poles/geoinformatics-and-spatial-m/predictive-biogeography/advancing-the-science-of-eco.html
53. Global niche modelling/mapping projects
Lifemapper: Kansas University, c. 2003 onwards
• Models terrestrial niches (?only)
• No. of maps unclear (claims >100,000 species with data, perhaps only a subset with maps)
• Uses GARP modelling (computationally intensive, several hours per species map?), no expert review
AquaMaps: Kiel Marine Lab (+ co-developers), 2006 onwards
• Models marine niches only (plus some freshwater)
• 22,000 species mapped by Nov 2015 (incl. ~600 FW), mainly fishes
• Uses RES modelling (6*/8** environmental variables, <2 mins per species map) plus geographic partitioning and expert review
* Marine variables: bottom depth, water temperature (SST/bottom), salinity, primary production, sea ice concentration, distance to land
** FW variables: elevation, surface temperature, net primary productivity, soil pH, soil moisture, soil organic carbon, precipitation, compound topographic index
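The slide gives no algorithmic detail for RES, so the following is only a sketch of the general idea, assuming the trapezoidal "environmental envelope" form commonly described for RES-style models: each variable gets a (min, preferred min, preferred max, max) envelope, and per-variable scores are multiplied to give a cell's overall suitability. The envelope values and variable names below are invented and do not describe any real species or the AquaMaps code.

```python
def trapezoid(x, lo, pref_lo, pref_hi, hi):
    """Suitability in [0, 1] for value x under a trapezoidal envelope."""
    if x < lo or x > hi:
        return 0.0
    if x < pref_lo:
        return (x - lo) / (pref_lo - lo)   # rising edge
    if x > pref_hi:
        return (hi - x) / (hi - pref_hi)   # falling edge
    return 1.0                             # inside preferred range

def suitability(cell, envelopes):
    """Multiply per-variable scores for one grid cell."""
    score = 1.0
    for variable, envelope in envelopes.items():
        score *= trapezoid(cell[variable], *envelope)
    return score

# Invented envelopes: (min, preferred min, preferred max, max)
envelopes = {
    "depth_m": (0, 10, 200, 500),
    "temp_c": (5, 10, 16, 20),
}

good_cell = {"depth_m": 100, "temp_c": 12}
marginal_cell = {"depth_m": 100, "temp_c": 18}
deep_cell = {"depth_m": 600, "temp_c": 12}

for cell in (good_cell, marginal_cell, deep_cell):
    print(cell, suitability(cell, envelopes))
```

Because each cell needs only a handful of comparisons and multiplications, a model of this shape is cheap to compute, which fits the short per-map run times quoted above.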
55. Lifemapper example map
Lifemapper example map for milk or Spanish snail (Helix lactea, now = Otala lactea)
(yellow dots are data points, red is potential habitat)
56. Building AquaMaps – 2005
(trying to make a working system, and the models fit the data…)
58. AquaMaps example map for New Zealand sea lion (Phocarctos hookeri)
(without expert review): Data points used
59. AquaMaps example map for New Zealand sea lion (Phocarctos hookeri)
(without expert review): Computed AquaMap
60. AquaMaps example map for New Zealand sea lion (Phocarctos hookeri)
(without expert review): All suitable habitat
61. AquaMaps example map for New Zealand sea lion (Phocarctos hookeri)
(without expert review): All suitable habitat (detail)
Detail (square size = 50 km nominal for global coverage)
62. AquaMaps example maps for New Zealand sea lion (Phocarctos hookeri)
(without expert review): Current vs. computed year 2100 range
2010 2100
74. A report card to date…
Component | 2005 | 2015 | Status (/5)
Global taxon inventory – all species names (with synonyms) | 25%? | 60%+ | nnn(n)
All names linked to the literature (original descriptions), at least minimally | 5%? | 10-20%? | n(n)
Taxon traits databased, in machine-addressable form | 0 | 10%+? | n(n)
Distribution data (specimens, observations) in online systems | <1% | 5%+? | n
Predicted distributions/global range maps for all taxa | 0? | 5%+? (fishes 60%+) | n(n)
75. Take home message: progress is definitely being made,
however plenty still to do:
Complete master names lists, release as open data (also deal
with inflow of new names and taxonomic dynamism)
Improve online access to tax. literature (plus embedded links
from relevant databases)
More data into OBIS & GBIF (including datasets not yet
digitised)
More progress on predictive mapping (algorithms, base data,
habitat factors, species covered).
77. Thank you!
Tony Rees, Tony.Rees@marinespecies.org
◦ CSIRO Marine Research applications developer 1998-2014 including CAAB
(Codes for Australian Aquatic Biota), c-squares and Taxamatch
◦ OBIS steering committees (various) / system developer 2002-2005
◦ AquaMaps project co-developer 2004-current
◦ IRMNG developer 2006-current
◦ OBIS Australia Node manager 2006-2014
◦ Global Names Project collaborator 2006-current
◦ WoRMS contributor 2007-current
◦ GBIF & Open Tree of Life collaborator 2010-current
◦ iPlant collaborator 2010-2013
◦ Atlas of Living Australia consultant 2010-2012
◦ Catalogue of Life global team member 2010-2012
◦ GBIF Ebbe Nielsen Prize (for excellence in Biodiversity Informatics) winner 2014.
This talk available at: www.slideshare.net/tony1212/presentations