View a video recording here: https://vimeo.com/195024485
Franz & Sterner @ #TDWG16 - "A new power balance is needed for trustworthy biodiversity data". Talk # 1134, Friday, December 09, 2016, 11:30 am. Session Contributed Papers 05: Data Gaps, Trust, Knowledge Acquisition. See https://mbgserv18.mobot.org/ocs/index.php/tdwg/tdwg2016/schedConf/program
Franz et al ice 2016 addressing the name meaning drift challenge in open ende... (taxonbytes)
Presentation for the Symposium: Building the Biodiversity Knowledge Graph for Insects – Components, Progress, and Challenges; 2016 XXV International Congress of Entomology, Orlando, FL – September 26, 2016 (#ICE2016). See https://esa.confex.com/esa/ice2016/meetingapp.cgi/Session/24482
The Center for Expanded Data Annotation and Retrieval (CEDAR) has developed a suite of tools and services that allow scientists to create and publish metadata describing scientific experiments. Using these tools and services—referred to collectively as the CEDAR Workbench—scientists can collaboratively author metadata and submit them to public repositories. A key focus of our software is semantically enriching metadata with ontology terms. The system combines emerging technologies, such as JSON-LD and graph databases, with modern software development technologies, such as microservices and container platforms. The result is a suite of user-friendly, Web-based tools and REST APIs that provide a versatile end-to-end solution to the problems of metadata authoring and management. This talk presents the architecture of the CEDAR Workbench and focuses on the technology choices made to construct an easily usable, open system that allows users to create and publish semantically enriched metadata in standard Web formats.
The metadata about scientific experiments are crucial for finding, reproducing, and reusing the data that the metadata describe. We present a study of the quality of the metadata stored in BioSample, a repository of metadata about samples used in biomedical experiments managed by the U.S. National Center for Biotechnology Information (NCBI). We tested whether 6.6 million BioSample metadata records are populated with values that fulfill the stated requirements for such values. Our study revealed multiple anomalies in the analyzed metadata. The BioSample metadata field names and their values are not standardized or controlled: 15% of the metadata fields use field names not specified in the BioSample data dictionary. Only 9 out of 452 BioSample-specified fields ordinarily require ontology terms as values, and the quality of these controlled fields is better than that of uncontrolled ones, as even simple binary or numeric fields are often populated with inadequate values of different data types (e.g., only 27% of Boolean values are valid). Overall, the metadata in BioSample reveal that there is a lack of principled mechanisms to enforce and validate metadata requirements. The aberrancies in the metadata are likely to impede search and secondary use of the associated datasets.
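The kind of check behind a figure such as "only 27% of Boolean values are valid" can be illustrated with a minimal Python sketch. It is not the study's actual validation code: the set of accepted spellings (`VALID_BOOLEANS`), the function names, and the sample values are all assumptions made for illustration.

```python
from typing import Iterable

# Assumed set of acceptable Boolean spellings (hypothetical; not taken
# from the BioSample data dictionary).
VALID_BOOLEANS = {"true", "false"}


def is_valid_boolean(value: str) -> bool:
    """Return True if the raw field value is an accepted Boolean spelling."""
    return value.strip().lower() in VALID_BOOLEANS


def valid_fraction(values: Iterable[str]) -> float:
    """Fraction of values that pass the Boolean check (0.0 for empty input)."""
    items = list(values)
    if not items:
        return 0.0
    return sum(is_valid_boolean(v) for v in items) / len(items)


# Made-up field values showing typical problems: free-text synonyms,
# placeholders, and numeric encodings.
sample = ["true", "FALSE", "yes", "n/a", "1", " False "]
print(f"{valid_fraction(sample):.2f}")  # prints 0.50
```

Running the same kind of function over every field that a data dictionary declares as Boolean would give a per-field validity rate comparable in spirit to the percentages reported above.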
The Center for Expanded Data Annotation and Retrieval (CEDAR) aims to revolutionize the way that metadata describing scientific experiments are authored. The software we have developed, the CEDAR Workbench, is a suite of Web-based tools and REST APIs that allows users to construct metadata templates, to fill in templates to generate high-quality metadata, and to share and manage these resources. The CEDAR Workbench provides a versatile, REST-based environment for authoring metadata that are enriched with terms from ontologies. The metadata are available as JSON, JSON-LD, or RDF for easy integration in scientific applications and reusability on the Web. Users can leverage our APIs for validating and submitting metadata to external repositories. The CEDAR Workbench is freely available and open-source.
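To make "metadata enriched with terms from ontologies" and "available as JSON-LD" concrete, here is a small Python sketch of what an ontology-annotated record could look like. It does not reproduce CEDAR's actual template model or API: the `tissue` field, the `https://example.org/...` property IRI, and the `make_instance` helper are hypothetical. The term IRI is assumed to be the UBERON identifier for liver.

```python
import json


def make_instance(tissue_label: str, tissue_iri: str) -> dict:
    """Build a toy JSON-LD metadata record with one ontology-annotated field."""
    return {
        "@context": {
            # Hypothetical property IRI, used only for illustration.
            "tissue": "https://example.org/properties/tissue",
            "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        },
        # The value is an ontology term IRI plus a human-readable label,
        # instead of a free-text string.
        "tissue": {"@id": tissue_iri, "rdfs:label": tissue_label},
    }


instance = make_instance(
    "liver", "http://purl.obolibrary.org/obo/UBERON_0002107"
)
print(json.dumps(instance, indent=2))
```

Because the field value carries a term IRI, a downstream tool can match records on the identifier instead of guessing whether "Liver", "liver tissue", and "hepatic" mean the same thing.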
Franz 2017 uiuc cirss non unitary syntheses of systematic knowledge (taxonbytes)
Invited Presentation given at the University of Illinois Urbana Champaign iSchool, Center for Informatics Research in Science and Scholarship, CIRSS Seminar, Friday, February 17, 2017.
The availability of high-quality metadata is key to facilitating discovery in the large variety of scientific datasets that are increasingly becoming publicly available. However, despite the recent focus on metadata, the diversity of metadata representation formats and the poor support for semantic markup typically result in metadata that are of poor quality. There is a pressing need for a metadata representation format that provides strong interoperation capabilities together with robust semantic underpinnings. In this talk, we describe such a format, together with open-source Web-based tools that support the acquisition, search, and management of metadata. We outline an initial evaluation using metadata from a variety of biomedical repositories.
ICBO2017 - Supporting Ontology-Based Standardization of Biomedical Metadata i... (marcosmartinezromero)
In this talk I describe the main features CEDAR developed to make it possible to easily construct Web-based metadata-acquisition forms, enrich those forms with ontology concepts, and then fill out the forms to create ontology-annotated descriptions of scientific experiments.
Hail: SCALING GENETIC DATA ANALYSIS WITH APACHE SPARK: Keynote by Cotton Seed (Spark Summit)
In 2001, it cost ~$100M to sequence a single human genome. In 2014, due to dramatic improvements in sequencing technology far outpacing Moore’s law, we entered the era of the $1,000 genome. At the same time, the power of genetics to impact medicine has become evident: for example, drugs with supporting genetic evidence have twice the clinical trial success rate. These factors have led to an explosion in the volume of genetic data, in the face of which existing analysis tools are breaking down.
Therefore, we began the open-source Hail project (https://hail.is) to be a scalable platform built on Apache Spark to enable the worldwide genetics community to build, share, and apply new tools. Hail is focused on variant-level (post-read) data; querying genetic data, annotations and sample data; and performing rare and common variant association analyses. Hail has already been used to analyze datasets with hundreds of thousands of exomes and tens of thousands of whole genomes.
We will give an overview of the goals of the Hail project and its architecture. The challenge of efficiently manipulating genetic data in Spark has led to several innovations that may have wider applicability, including an RDD-like abstraction for representing multidimensional data and an OrderedRDD abstraction for ordered data (for example, data indexed by position in the genome). Finally, we will discuss Hail performance and future directions.
Science has evolved from the isolated individual tinkering in the lab, through the era of the “gentleman scientist” with his or her assistant(s), to group-based and then expansive collaboration, and now to an opportunity to collaborate with the world. With the advent of the internet, the opportunity for crowd-sourced contribution and large-scale collaboration has exploded and, as a result, scientific discovery has been further enabled. The contributions of enormous open data sets, liberal licensing policies, and innovative technologies for mining and linking these data have given rise to platforms that are beginning to deliver on the promise of semantic technologies and nanopublications, facilitated by the unprecedented computational resources available today, especially the increasing capabilities of handheld devices. The speaker will provide an overview of his experiences in developing a crowdsourced platform for chemists allowing for data deposition, annotation, and validation. The challenges of mapping chemical and pharmacological data, especially with regard to data quality, will be discussed. The promise of distributed participation in data analysis is already in place.
Use of ContentMine tools on the Open Access subset of EuropePubMedCentral to discover new knowledge about the Zika virus.
Three slides have embedded movies - these do not show in slideshare and a first pass of this can be seen as a single file at https://vimeo.com/154705161
Published on Feb 07, 2016 by PMR
Use of ContentMine tools on the Open Access subset of EuropePubMedCentral to discover new knowledge about the Zika virus. Includes clips of the software in action
Using the Semantic Web to Support Ecoinformatics (ebiquity)
We describe our on-going work in using the semantic web in support of ecological informatics, and demonstrate a distributed platform for constructing end-to-end use cases. Specifically, we describe ELVIS (the Ecosystem Location Visualization and Information System), a suite of tools for constructing food webs for a given location, and Triple Shop, a SPARQL query interface which allows scientists to semi-automatically construct distributed datasets relevant to the queries they want to ask. ELVIS functionality is exposed as a collection of web services, and all input and output data is expressed in OWL, thereby enabling its integration with Triple Shop and other semantic web resources.
Rphenoscape: Connecting the semantics of evolutionary morphology to comparat... (Hilmar Lapp)
Presentation of the software package RPhenoscape for the R platform for statistical computing. The package bridges between the ecosystem of packages for comparative phylogenetics in R and the data content and computational semantics services provided by the API of the Phenoscape Knowledgebase. Presented at the 2016 Evolution Meetings in Austin, TX.
Visualizing Primary Data form Taxonomic Literature (millerjeremya)
Visualizing Primary Data form Taxonomic Literature
Jeremy Miller, Donat Agosti, Lyubomir Penev, Guido Sautter, Teodor Georgiev, Terry Catapano, David Patterson, David King, Serrano Pereira, Rutger Aldo Vos, Soraya Sierra
EU BON General Meeting, 1-4 June 2015, Cambridge, United Kingdom
Next Generation Cancer Data Discovery, Access, and Integration Using Prizms a... (Jim McCusker)
To encourage data sharing in the life sciences, supporting tools need to minimize effort and maximize incentives. We have created infrastructure that makes it easy to create portals that support dataset sharing and simplified publishing of the datasets as high-quality linked data. We report here on our infrastructure and its use in the creation of a melanoma dataset portal. This portal is based on the Comprehensive Knowledge Archive Network (CKAN) and Prizms, an infrastructure to acquire, integrate, and publish data using Linked Data principles. In addition, we introduce an extension to CKAN that makes it easy for others to cite datasets from within both publications and subsequently derived datasets using the emerging nanopublication and World Wide Web Consortium provenance standards.
Publishing Germplasm Vocabularies as Linked Data (Valeria Pesce)
What has already been published?
What may still be needed?
How to do it?
This presentation is a part of the 3rd Session of the 1st International e-Conference on Germplasm Data Interoperability https://sites.google.com/site/germplasminteroperability/
Franz et al tdwg 2016 new developments for libraries of life (taxonbytes)
Franz et al. @ #TDWG16 - "New developments for the Libraries of Life project and app". Talk # 1138, Friday, December 09, 2016, 02:45 pm. Session Lightning Talks. See https://mbgserv18.mobot.org/ocs/index.php/tdwg/tdwg2016/schedConf/program
Franz et al tdwg 2016 introducing lep net (taxonbytes)
Franz et al. @ #TDWG16 - "Introducing LepNet – the Lepidoptera of North America Network". Talk # 1139, Friday, December 09, 2016, 02:40 pm. Session Lightning Talks. See https://mbgserv18.mobot.org/ocs/index.php/tdwg/tdwg2016/schedConf/program
Presentation for the San Francisco #IDCC14 conference (http://www.dcc.ac.uk/events/idcc14/day-two-papers). The presentation covers publishing zooarchaeology data with Open Context (http://opencontext.org) to study the spread of farming from the Near East to Europe through Anatolia. It looks at editorial processes, linked data annotation, and other workflow concerns relating to making raw data more usable for comparative analysis.
Connecting life sciences data at the European Bioinformatics Institute (Connected Data World)
Tony Burdett's slides from his talk at Connected Data London. Tony is a Senior Software Engineer at the European Bioinformatics Institute. He presented the complexity of the data at EMBL-EBI and the institute's approach to making sense of it.
A 45min presentation given at the 'Getting published in Nature's Scientific Data journal', hosted by the University of Cambridge Research Data Management team (www.data.cam.ac.uk). Presented on Monday 11th January 2016.
How SADI & SHARE help restore the Scientific Method to in silico science (Mark Wilkinson)
This is my presentation to the Bio Open Source Convention (BOSC) in Boston, July 2010. I start with a brief status update on the BioMoby project and then launch into a series of demonstrations of its successor, SADI + SHARE. Rather than discussing how SADI/SHARE work, I focus the discussion on what role I think these technologies can play in bringing the traditional "scientific method" back into in silico biology.
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes (Monica Munoz-Torres)
Precise elucidation of the many different biological features encoded in a genome requires a careful curation process that involves reviewing all available evidence to allow researchers to resolve discrepancies and validate automated gene models, protein alignments, and other biological elements. Genome annotation is an inherently collaborative task; researchers only rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families.
The i5k initiative seeks to sequence the genomes of 5,000 insect and related arthropod species. The selected species include those known to be important to worldwide agriculture, food safety, medicine, and energy production, many used as models in biology, those most abundant in world ecosystems, and representatives of every branch of the insect phylogeny, in an effort to better understand arthropod evolution and phylogeny. Because computational genome analysis remains an imperfect art, each newly sequenced genome will require visualization and curation.
Apollo is an instantaneous, collaborative, genome annotation editor, and the new JavaScript based version allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. The i5K is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process and Apollo is serving as the platform to empower this community. Here we offer details about this collaboration.
Scientists commonly find themselves overwhelmed by the amount of information accessible to them. The distribution of resources now includes the entire space of the worldwide web, access to primary databases such as CAS and, commonly, a plethora of internally developed systems. While the web has provided improved access to chemistry-related information, there has not been an online central resource allowing integrated chemical structure-searching of chemistry databases, chemistry articles, patents, and web pages such as blogs and wikis. ChemSpider has built a structure-centric community for chemists by providing free access to an online database and collaboration tool. The online database offers an environment for curating the data on ChemSpider as well as for depositing chemical structures, analytical data, and associated information, and provides a significant knowledge base and resource for chemists working in different domains. An overview of present and future capabilities is given.
NISO Virtual Conference: Expanding the Assessment Toolbox: Blending the Old and New Assessment Practices
Dismantling a Single-Discipline Journal Bundle: A Triangulation Method for Assessment Diane (DeDe) Dawson MSc, MLIS, Science Liaison Librarian, Science Library, University of Saskatchewan
Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ... (GigaScience, BGI Hong Kong)
Scott Edmunds talk at the AIST Computational Biology Research Center in Tokyo: Overcoming the Reproducibility Crisis: and why I stopped worrying and learned to love open data (& methods), July 1st 2014
Building bioinformatics resources for the global community (ExternalEvents)
http://www.fao.org/about/meetings/wgs-on-food-safety-management/en/
Building bioinformatics resources for the global community. Presentation from the Technical Meeting on the impact of Whole Genome Sequencing (WGS) on food safety management and GMI-9, 23-25 May 2016, Rome, Italy.
De-centralized but global: Redesigning biodiversity data aggregation for impr... (taxonbytes)
Biodiversity data pose fundamental challenges for unification-based paradigms of data science. In particular, a hierarchical, backbone-driven approach to aggregating global biodiversity data tends to limit community engagement. Data quality, trust, fitness for use, and impact are similarly reduced. This presentation will outline an alternative, de-centralized design for aggregating biodiversity data globally. The design requires a coordinative approach to representing and reconciling evolving systematic perspectives, and further social but technologically mediated coordination between regionally and taxonomically constrained "communities of practice" (sensu Wenger, 2000, https://doi.org/10.1177/135050840072002). Important next steps in this direction include the development of use cases that quantify the benefits of a de-centralized biodiversity data aggregation, in terms of lowering costs to expert engagement, raising efficiency of curation, validating novel integration services, and improving reproducibility and provenance tracking across heterogeneous data structures and portals.
Anzaldo franz 2017 ecn your daily weevil (taxonbytes)
Slides of the presentation "#YourDailyWeevil - a story of modest but gratifying social media success", given at the 2017 Annual Meeting of the Entomological Collections Network, November 05, 2017, Denver, Colorado.
Zhang et al ecn 2016 building an accessible weevil tissue collection for geno... (taxonbytes)
Poster describing the origin and function of the ASUHIC Weevil Tissue Collection (WTC), see tinyurl.com/weeviltissuecollection; presented at the 2016 Entomological Collections Network Meeting, September 23, 2016, Orlando, Florida. ECN website: http://ecnweb.org/
Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler... (taxonbytes)
Lightning talk at iEvoBio 2016 (http://www.ievobio.org/), given on June 21, 2016, at Evolution Meetings in Austin, Texas. Brief overview of using Euler/X to align phylogenies. See https://github.com/EulerProject
Johnston ESA 2014 Trogloderus Sand Dune Speciation (taxonbytes)
Andrew Johnston's presentation on Trogloderus (Coleoptera: Tenebrionidae) systematics and speciation in Southwestern United States sand dune habitat, given at the 2014 Annual Meeting of the Entomological Society of America in Portland, OR. http://www.entsoc.org/entomology2014
Franz 2014 BIGCB Tracking Change across Classifications and Phylogenies (taxonbytes)
Slides presented on the Euler/X toolkit at the "Understanding Taxon Ranges in Space and Time" Workshop – Berkeley Initiative in Global Change Biology (BIGCB); held on November 07-09, 2014, University of California at Berkeley, CA. See also http://taxonbytes.org/bigcb-workshop-at-uc-berkeley-tackling-the-taxon-concept-problem/
Arizona State University Natural History Collections - Moving to Alameda (201... (taxonbytes)
A collection of photos showing the transition of the Natural History Collections (School of Life Sciences, Arizona State University) from the Tempe Campus to the Alameda location. May, 2011 to August, 2014. See also http://taxonbytes.org/impressions-alameda-grand-opening/
Cobb, Seltmann, Franz. 2014. The Current State of Arthropod Biodiversity Data... (taxonbytes)
Cobb et al. 2014. The Current State of Arthropod Biodiversity Data: Addressing Impacts of Global Change. Presented at https://www.idigbio.org/content/collections-21st-century-symposium Program available at https://www.idigbio.org/wiki/index.php/Collections_for_the_21st_Century
Franz. 2014. Explaining taxonomy's legacy to computers – how and why?taxonbytes
Slides presented on the Euler/X project (http://taxonbytes.org/prior-work-on-concept-taxonomy-2013/ & https://bitbucket.org/eulerx/euler-project) - for the conference "The Meaning of Names: Naming Diversity in the 21st Century", CU Natural History Museum, September 30, 2014.
Ludäscher et al. 2014 - A Hybrid Diagnosis Approach Combining Black-Box and W...taxonbytes
Presentation given at RuleML 2014 conference (http://ruleml2014.vse.cz/) with updates on the Euler/X toolkit; see also http://taxonbytes.org/prior-work-on-concept-taxonomy-2013/
The sequential stages culminating in the publication of a morphological cladistic analysis of weevils in the Exophthalmus genus complex (Coleoptera: Curculionidae: Entiminae) are reviewed, with an emphasis on how early- stage homology assessments were gradually evaluated and refined in light of intermittent phylogenetic insights. In all, 60 incremental versions of the evolving character matrix were congealed and analysed, starting with an assembly of 52 taxa and ten traditionally deployed diagnostic characters, and ending with 90 taxa and 143 characters that reflect significantly more narrow assessments of phylogenetic similarity and scope. Standard matrix properties and analytical tree statistics were traced throughout the analytical process, and series of incongruence length indifference tests were used to identify critical points of topology change among succeeding matrix versions. This kind of parsimony-contingent rescoping is generally representative of the inferential process of character individuation within individual and across multiple cladistic analyses. The expected long-term outcome is a maturing observational terminology in which precise inferences of homology are parsimony-contingent, and the notions of homology and parsimony are inextricably linked. This contingent view of cladistic character individuation is contrasted with current approaches to developing phenotype ontologies based on homology-neutral structural equivalence expressions. Recommendations are made to transparently embrace the parsimony-contingent nature of cladistic homology.
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
The ambient solar wind that fills the heliosphere originates from multiple
sources in the solar corona and is highly structured. It is often described
as high-speed, relatively homogeneous, plasma streams from coronal
holes and slow-speed, highly variable, streams whose source regions are
under debate. A key goal of ESA/NASA’s Solar Orbiter mission is to identify
solar wind sources and understand what drives the complexity seen in the
heliosphere. By combining magnetic field modelling and spectroscopic
techniques with high-resolution observations and measurements, we show
that the solar wind variability detected in situ by Solar Orbiter in March
2022 is driven by spatio-temporal changes in the magnetic connectivity to
multiple sources in the solar atmosphere. The magnetic field footpoints
connected to the spacecraft moved from the boundaries of a coronal hole
to one active region (12961) and then across to another region (12957). This
is reflected in the in situ measurements, which show the transition from fast
to highly Alfvénic then to slow solar wind that is disrupted by the arrival of
a coronal mass ejection. Our results describe solar wind variability at 0.5 au
but are applicable to near-Earth observatories.
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. 
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
This PDF is about schizophrenia.
For more details visit on YouTube; @SELF-EXPLANATORY;
https://www.youtube.com/channel/UCAiarMZDNhe1A3Rnpr_WkzA/videos
Thanks...!
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity data
1. A new power balance is needed
for trustworthy biodiversity data
Please
@taxonbytes
Nico Franz1 & Beckett W. Sterner1
With contributions by Edward Gilbert1, Andrew Johnston1,
Guanyang Zhang1, Bertram Ludäscher2 & Alan Weakley3
1 School of Life Sciences, Arizona State University
2 iSchool, University of Illinois at Urbana-Champaign
3 Herbarium, University of North Carolina at Chapel Hill
TDWG 2016 – Biodiversity Information Standards
December 09, 2016 – Instituto Tecnológico de Costa Rica (#TDWG16)
@ http://www.slideshare.net/taxonbytes/franz-sterner-tdwg-2016-new-power-balance-needed-for-trustworthy-biodiversity-data
2. Largely derived from doi:10.3897/rio.2.e10610
3. Premise: We agree that there are significant data quality issues
Aggregated Australian millipede data 'taken to the cleaners'
4. Premise: We agree that there are significant data quality issues
Aggregated Australian millipede data 'taken to the cleaners'
Aggregators respond to the charges
5. Premise: We agree that there are significant data quality issues
Aggregated Australian millipede data 'taken to the cleaners'
Aggregators respond to the charges
But this leaves open the question(s):
Who (exactly) is responsible for
how much of each particular issue?
6. We seem to disagree on the question of responsibility assignment(s)
Source: Belbin et al. 2013. A specialist's audit […]: An 'aggregator's' perspective. doi:10.3897/zookeys.305.5438
Page 73
7. Often enough, aggregators respond by:
• Acknowledging the general issues and their relevance.
• Pointing to many issues that effectively reside "with the sources".
• Calling for more collaboration across all levels; as well as new tools and
annotation options that "motivate and empower" the research community.
Source: Belbin et al. 2013. A specialist's audit […]: An 'aggregator's' perspective. doi:10.3897/zookeys.305.5438
Page 74
8. Thesis: For taxonomy integration, this is both wrong and self-defeating
• Many aggregators are designed to impose a single taxonomic hierarchy –
one at a time – onto all taxonomically annotated records.
9. Thesis: For taxonomy integration, this is both wrong and self-defeating
• Many aggregators are designed to impose a single taxonomic hierarchy –
one at a time – onto all taxonomically annotated records.
• By design, these "backbones" are rarely attributable to individual (expert)
authors, but instead are newly created systematic theories that only appear
at the system level.
10. Thesis: For taxonomy integration, this is both wrong and self-defeating
• Many aggregators are designed to impose a single taxonomic hierarchy –
one at a time – onto all taxonomically annotated records.
• By design, these "backbones" are rarely attributable to individual (expert)
authors, but instead are newly created systematic theories that only appear
at the system level.
• Data are aggregated accordingly; yet backbone-driven modifications may
newly disrupt the original integrity of submitted data packages.
11. Thesis: For taxonomy integration, this is both wrong and self-defeating
• Many aggregators are designed to impose a single taxonomic hierarchy –
one at a time – onto all taxonomically annotated records.
• By design, these "backbones" are rarely attributable to individual (expert)
authors, but instead are newly created systematic theories that only appear
at the system level.
• Data are aggregated accordingly; yet backbone-driven modifications may
newly disrupt the original integrity of submitted data packages.
• By deflecting on responsibilities, aggregators may cause additional self-harm.
Ultimately, the power balance – as presently built in – must shift to bring
experts back into the process of licensing succinct, trustworthy data packages.
13. Taxonomic views of a frequently revised organismal lineage
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
• 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids, "pogonias")
14. Snapshot of a more frequently revised organismal lineage
• 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids, "pogonias")
• Vertical sections identify taxonomic concept regions
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
15. Snapshot of a more frequently revised organismal lineage
• 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids, "pogonias")
• Vertical sections identify taxonomic concept regions
• Colors identify lineages of taxonomic names (epithets) in use
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
16. Snapshot of a more frequently revised organismal lineage
• 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids)
• Vertical sections identify taxonomic concept regions
• Colors identify lineages of taxonomic names (epithets) in use
• There is no consensus! Five incongruent schemata are used concurrently
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
17. Further diagnosis:
If incongruent taxonomies are endorsed
– locally, provisionally, and democratically –
then what is the impact for
aggregated biodiversity data?
18. Further diagnosis:
Taxonomy becomes a variable
that we need to represent,
and thereby control for
(at the system level)
19. The 'consensus'
• Query: "Where do these orchid
species occur?"
• Same set of 250 orchid specimens,
according to 4 taxonomies.
"Controlling the taxonomic variable" Example: the Cleistes use case
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
20. The 'consensus' The 'bible'
"Controlling the taxonomic variable"
• Query: "Where do these orchid
species occur?"
• Same set of 250 orchid specimens,
according to 4 taxonomies.
Example: the Cleistes use case
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
21. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
"Controlling the taxonomic variable"
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
22. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
The 'best', latest
regional flora
"Controlling the taxonomic variable"
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
23. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
The 'best', latest
regional flora
"Controlling the taxonomic variable"
Expert views
are in conflict
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
24. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
The 'best', latest
regional flora
"Controlling the taxonomic variable"
Expert views
are in conflict
"Just bad"
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
25. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
The 'best', latest
regional flora
Impact:
Name-based aggregation has created
a novel synthesis that nobody believes in
"Controlling the taxonomic variable"
"Just bad"
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
26. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
The 'best', latest
regional flora
"Controlling the taxonomic variable"
"Just bad"
Expert views
are in conflict
Solution:
Instead of aggregating
an artificial 'consensus',
…
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
27. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
The 'best', latest
regional flora
"Controlling the taxonomic variable"
"Just bad"
Expert views
are reconciled
Solution:
Instead of aggregating
an artificial 'consensus',
build translation services
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
28. Challenges:
How can we redesign aggregation to yield
high-quality biodiversity data packages?
29. Challenges:
How can we redesign aggregation to yield
high-quality biodiversity data packages?
What does this mean for Darwin Core1
and how we use this aggregation standard?
1 Wieczorek et al. 2012. Darwin Core: an evolving […]. PLoS ONE 7(1): e29715. doi:10.1371/journal.pone.0029715
30. Preview of solution with eight steps
• DwC is insufficient, and part of the problem
31. # 1: Represent only taxonomic concept labels (TCLs) 1
• Syntax (TCL): taxonomic name [author, year, page] sec. source
1 Multi-taxonomy input/alignment visualizations generated with Euler/X toolkit: https://github.com/EulerProject/EulerX
Cleistes divaricata
sec. Gregg & Catling 1993
Pogonia
sec. Brown & Wunderlin 1997
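The TCL syntax above can be sketched as a small data structure. This is only an illustration: the class name `TCL`, its field names, and the `label` method are assumptions made here, not part of Euler/X or Darwin Core.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TCL:
    """Hypothetical taxonomic concept label: a name as used according to one source."""
    name: str                         # taxonomic name, e.g. "Cleistes divaricata"
    source: str                       # the 'sec.' source, e.g. "Gregg & Catling 1993"
    authorship: Optional[str] = None  # optional [author, year, page] of the name

    def label(self) -> str:
        # Render "name [authorship] sec. source", omitting authorship when absent.
        middle = f" {self.authorship}" if self.authorship else ""
        return f"{self.name}{middle} sec. {self.source}"


a = TCL("Cleistes divaricata", "Gregg & Catling 1993")
b = TCL("Cleistes divaricata", "Smith et al. 2004")
print(a.label())
print(a == b)  # same name, different sources: two distinct labels
```

Because the source is part of the label's identity, two usages of the same name according to different sources compare as different objects, which is the point of representing TCLs rather than bare names.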
32. # 1: DwC score keeping: TCLs are optional; < 1% realized?
• TCL ~ DwC: nameAccordingTo
• SCAN: 19,722 of nearly 9 million records have TCLs (0.2%)
• Lack of enforcement to use TCLs makes the standard less big data-ready
"Who authors GBIF's Backbone?"
https://storify.com/taxonbytes/who-authors-gbif-s-backbone
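A minimal sketch of the kind of score keeping reported above, assuming records are available as dictionaries keyed by Darwin Core terms. The function name `tcl_coverage` and the sample records are hypothetical; only the 19,722 count and the "nearly 9 million" total come from the slide, and the total is rounded to 9,000,000 here.

```python
from typing import Iterable, Mapping


def tcl_coverage(records: Iterable[Mapping[str, str]]) -> float:
    """Return the fraction of records whose nameAccordingTo field is non-empty."""
    total = 0
    with_tcl = 0
    for record in records:
        total += 1
        if (record.get("nameAccordingTo") or "").strip():
            with_tcl += 1
    return with_tcl / total if total else 0.0


# Hypothetical records, for illustration only.
sample_records = [
    {"scientificName": "Cleistes divaricata", "nameAccordingTo": "Smith et al. 2004"},
    {"scientificName": "Cleistes divaricata", "nameAccordingTo": ""},
    {"scientificName": "Pogonia"},
    {"scientificName": "Cleistes bifaria"},
]
coverage = tcl_coverage(sample_records)
print(f"{coverage:.0%} of sample records carry a TCL")

# The SCAN figure from the slide, with the total approximated as 9,000,000.
scan_share = 19_722 / 9_000_000
print(f"{scan_share:.1%}")
```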
33. # 2: Represent each source coherently (Parent-Child relationships)
• Syntax (PC): TCL1 is a child/parent of TCL2 [where TCL1/2 = same source]
Cleistesiopsis bifaria sec. Pans. & de Barr. 2008
is a child of
Cleistesiopsis sec. Pans. & de Barr. 2008
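The same-source constraint on parent-child relationships could be enforced roughly as follows. The `SourceHierarchy` class and the tuple representation of a concept are assumptions made for this sketch.

```python
from typing import Dict, Tuple

Concept = Tuple[str, str]  # (taxonomic name, 'sec.' source); hypothetical representation


class SourceHierarchy:
    """Parent-child links among TCLs that all belong to one source."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.parent_of: Dict[Concept, Concept] = {}

    def add_child(self, child: Concept, parent: Concept) -> None:
        # Reject links that would mix concepts from different sources.
        for name, concept_source in (child, parent):
            if concept_source != self.source:
                raise ValueError(f"{name} sec. {concept_source} is not from {self.source}")
        self.parent_of[child] = parent


source = "Pans. & de Barr. 2008"
hierarchy = SourceHierarchy(source)
hierarchy.add_child(("Cleistesiopsis bifaria", source), ("Cleistesiopsis", source))

try:
    hierarchy.add_child(("Cleistes divaricata", "Smith et al. 2004"), ("Cleistesiopsis", source))
    mixed_allowed = True
except ValueError:
    mixed_allowed = False
print(mixed_allowed)
```

Keeping one hierarchy object per source is one way to preserve each source's internal coherence; relations across sources are then left to the RCC–5 articulations introduced in step 4.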
34. # 2: DwC score keeping: Not (adequately) represented
• PC ~ DwC: genus, family, order (etc.; higherClassification)
• However, higher-level names in DwC are not modeled as TCLs
• Taxonomic coherence of sources cannot be preserved with DwC alone
DwC record with higherClassification
(BDJ)
35. # 3: Do not force a single hierarchy onto all tip-level TCLs
• Syntax (PC): Tip-level TCL1 , TCL2 , etc. [where TCL1/2 = different sources]
36. # 3: DwC score keeping: Optional; not (ever?) practiced
• No PC ~ DwC: infra-/specificEpithet only
• Typically, a single, 'unitary' higher-level classification is represented
• Combinations of algorithmic and social practices achieve the single hierarchy
"Who authors GBIF's Backbone?"
https://storify.com/taxonbytes/who-authors-gbif-s-backbone
37. # 4: Link TCLs via expert-provided RCC–5 articulations
• Syntax (RCC–5): TCL1 {==, >, <, ><, !} TCL2 [where TCL1/2 = diff. sources]
• RCC–5 = Region Connection Calculus
• 14 articulations provided by: http://tinyurl.com/Weakley-Flora-2015
Cleistes bifaria "Coastal Populations" sec. Smith et al. 2004
== (is congruent with)
Cleistesiopsis oricamporum sec. Brown & Pans. 2009
==
38. Source: Thau, D.M. 2010. Reasoning about taxonomies. Thesis, UC Davis. http://gradworks.proquest.com/3422778.pdf
Region Connection Calculus (semantics: set constraints)
== < > >< !
• Two regions N, M are either:
• congruent (N == M)
• properly inclusive (N < M)
• inversely properly inclusive (N > M)
• overlapping (N >< M)
• exclusive of each other (N ! M)
• RCC–5 articulations answer the query: "can we join regions N and M?"
• Taxonomies have multiple RCC–5 alignable components: nodes (parents,
children), node-associated traits, even node-anchoring specimens
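Under the set-constraint semantics above, the five relations can be read off two non-empty sets directly. The sketch below assumes each region is given extensionally, for example as a set of specimen identifiers; the function name `rcc5` and the sample identifiers are hypothetical, and this is not how Euler/X itself represents regions.

```python
from typing import AbstractSet, Hashable


def rcc5(n: AbstractSet[Hashable], m: AbstractSet[Hashable]) -> str:
    """Return the RCC-5 relation of region n to region m, using the slide's symbols."""
    if not n or not m:
        raise ValueError("RCC-5 regions must be non-empty")
    if n == m:
        return "=="   # congruent
    if n < m:
        return "<"    # n is properly included in m
    if n > m:
        return ">"    # n properly includes m
    if n & m:
        return "><"   # overlapping: shared and unshared members on both sides
    return "!"        # exclusive of each other


# Hypothetical specimen sets, for illustration only.
broad = frozenset({"s1", "s2", "s3", "s4"})
narrow = frozenset({"s1", "s2"})
other = frozenset({"s2", "s5"})
apart = frozenset({"s9"})

for left, right in [(narrow, broad), (broad, narrow), (broad, other), (narrow, apart)]:
    print(sorted(left), rcc5(left, right), sorted(right))
```

Exactly one of the five relations holds for any pair of non-empty sets, which is what makes the articulations usable as unambiguous join conditions.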
40. Oscillating meanings of the epithet hyalites – 1911 to 2003
Phenotypic diversity
Type-anchored name identity relations
Source: Vane-Wright. 2003. Indifferent philosophy versus […]. Syst. Biodiv. 1: 3–11. doi:10.1017/S1477200003001063
41. # 5: Identify occurrence records only to TCLs
Records:
EKY39235
MTSU003611
NCSC00040204
…
Records:
BOON8098
CLEMS0061133
WILLI39399
…
Records:
GMUF-0039355
IBE006808
USCH58399
…
Records:
CONV0006268
MDKY00006482
NCU00038930
…
Records:
BRYV0023582, BRYV0023584
KHD00032030, MISS0016604
MMNS000227, NCSC00040206
USMS_000002923, USMS_000002924
VSC0053223, VSC0065528
…
Records:
ARIZ393087
DBG39049
USCH51217
…
Records:
NCU00040710
USCH96248
VSC0053218
…
Records:
CLEMS0012881
FUGR0003293
GA023130
…
Records:
BOON8100
NCSC00040210
SJNM45487
…
Records:
GA023144
LSU00012494
MISS0016608
…
Records:
IBE006810, IND-0012374, MMNS000227
Records:
NY8654
• Syntax (ID): Occurrence / organism is identified to TCL
"CLEMS0012881"
is identified to
Cleistes divaricata sec. Smith et al. 2004
[additional ID metadata]
42. DwC record with Identification metadata
(BDJ)
# 5: DwC score keeping: ID metadata optional; > 50% realized
• ID ~ DwC: Identification, (date)identified(By), identificationReference
• SCAN: 4,715,277 of nearly 9 million records have ID metadata (52.5%)
• Enforcement…still also require use of TCLs
43. # 6: Generate comprehensive, consistent RCC–5 alignments
• Euler/X is a toolkit that infers logically consistent RCC–5 alignments
44. # 6: Generate comprehensive, consistent RCC–5 alignments
• Value added: MIR – set of Maximally Informative Relations containing
the RCC–5 articulation for every possible TCL pair
Scalability
Reasoner inference
46. The 'consensus' The 'bible'
The (formerly)
federal 'standard'
The 'best', latest
regional flora
"Controlling the taxonomic variable"
Impact:
"Please select your preference (A – D);
we can perform all translations"
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
47. # 8: "Do you trust us now?" Aggregation as a translational service
• We can now respond to queries such as:
• "Show all specimens identified to the taxonomic name Cleistes divaricata"
• Returns many records, resolving an incongruent lineage of name usages
48. # 8: "Do you trust us now?" Aggregation as a translational service
• We can now respond to queries such as:
• "Show all specimens identified to the taxonomic name Cleistes divaricata"
• Returns many records, resolving an incongruent lineage of name usages
• "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015"
• Returns record subset resolving only one narrowly circumscribed concept
49. # 8: "Do you trust us now?" Aggregation as a translational service
• We can now respond to queries such as:
• "Show all specimens identified to the taxonomic name Cleistes divaricata"
• Returns many records, resolving an incongruent lineage of name usages
• "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015"
• Returns record subset resolving only one narrowly circumscribed concept
• "Now show specimens identified to the TCL Cleistes divaricata sec. RAB 1968,
yet translated into the more granular TCLs sec. Weakley 2015"
• Returns (again) many records, yet represents and contrasts two treatments,
as opposed to providing the ambiguous lineage view (above)
• "Show all specimens with ambiguous 2010/2015 TCL identifications…" (etc.)
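The three kinds of query listed above can be contrasted in a small sketch. Everything here is illustrative: the record identifiers are invented, the two articulations are stand-ins rather than the expert-provided alignment, and the function names are assumptions rather than an existing service API.

```python
from typing import Dict, List, Tuple

SEC = " sec. "

# Hypothetical identifications: record id -> TCL (name sec. source).
identifications: Dict[str, str] = {
    "REC-001": "Cleistes divaricata sec. RAB 1968",
    "REC-002": "Cleistes divaricata sec. Smith et al. 2004",
    "REC-003": "Cleistesiopsis divaricata sec. Weakley 2015",
    "REC-004": "Cleistesiopsis bifaria sec. Weakley 2015",
}

# Illustrative RCC-5 articulations: (left TCL, relation, right TCL).
articulations: List[Tuple[str, str, str]] = [
    ("Cleistes divaricata sec. RAB 1968", ">", "Cleistesiopsis divaricata sec. Weakley 2015"),
    ("Cleistes divaricata sec. RAB 1968", ">", "Cleistesiopsis bifaria sec. Weakley 2015"),
]


def records_by_name(name: str) -> List[str]:
    """Name-based query: ignores the source, so concepts are mixed."""
    return sorted(r for r, tcl in identifications.items() if tcl.split(SEC)[0] == name)


def records_by_tcl(tcl: str) -> List[str]:
    """TCL-based query: only records identified to this exact concept label."""
    return sorted(r for r, label in identifications.items() if label == tcl)


def translate(tcl: str, target_source: str) -> Dict[str, str]:
    """Map a TCL onto the target source's TCLs via the stated articulations."""
    return {
        right: relation
        for left, relation, right in articulations
        if left == tcl and right.endswith(SEC + target_source)
    }


print(records_by_name("Cleistes divaricata"))
print(records_by_tcl("Cleistesiopsis divaricata sec. Weakley 2015"))
print(translate("Cleistes divaricata sec. RAB 1968", "Weakley 2015"))
```

The name-based query returns records identified under two different sources, while the translation keeps both treatments visible and states how their concepts relate.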
50. Conclusion – designing trusted biodiversity data services
• The Darwin Core standard for aggregating biodiversity data:
(1) Has under-utilized options for better representing taxonomic expertise
(2) Is part of a design paradigm that undermines the plurality of expertise
51. Conclusion – designing trusted biodiversity data services
• The Darwin Core standard for aggregating biodiversity data:
(1) Has under-utilized options for better representing taxonomic expertise
(2) Is part of a design paradigm that undermines the plurality of expertise
• Solutions are in development that realize data aggregation via translational
services – not as disenfranchising "backbones" – and without disrupting the
formation of expert-licensed, high-quality biodiversity data packages
52. Conclusion – designing trusted biodiversity data services
• The Darwin Core standard for aggregating biodiversity data:
(1) Has under-utilized options for better representing taxonomic expertise
(2) Is part of a design paradigm that undermines the plurality of expertise
• Solutions are in development that realize data aggregation via translational
services – not as disenfranchising "backbones" – and without disrupting the
formation of expert-licensed, high-quality biodiversity data packages
• All of us – not just aggregators – "own" the responsibility of designing
systems where the plurality of taxonomic expertise is fairly accommodated
53. Acknowledgments & links to products
• Cleistes use case: Alan Weakley (UNC)
• Euler/X toolkit: Shizhuo Yu (UC Davis)
• Other data issues, discussions: Andrew Johnston, Guanyang Zhang
• NSF DEB–1155984, DBI–1342595 (PI Franz)
• NSF IIS–118088, DBI–1147273 (PI Ludäscher)
• Euler/X code @ https://github.com/EulerProject/EulerX
• Franz et al. 2016. Two influential primate classifications logically aligned.
Systematic Biology 65(4): 561–582. Link
The more one looks, the more complicated it gets. Notice also the node labeling, or lack thereof.
The simple semantics of RCC-5 makes this a rather generic vocabulary for representing advancement in phylogenetic knowledge. At the same time, the onus is on the phylogeneticists to apply the articulations in such ways that the desired query services are actually obtained.