Literature and XML: or How to Have More Time to Think. Donat Agosti, Plazi. ArtDataBanken, Stockholm, Sept 21, 2009
Who is this? What do I know about her? Where does she live? Who are you? What do you do? Where are you from?
The answers are in several hundred million pages of printed species descriptions in our libraries, including the descriptions and redescriptions of an estimated 1.8M species, and an estimated 50K new (re-)descriptions annually.
Taxonomists at work  …… T. E. Lawrence: Seven Pillars of Wisdom – a triumph. 1st published for general circulation, 1935: p. 535
The traditional flux of information … … a  more or less closed, intransient system
What does this have to do with XML and semantic, enhanced documents?
Access
Scanning Pdf-conversion (WWW)
Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present. Through antbase.org's digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in a single month).
The Biodiversity Heritage Library is currently digitizing and making accessible >100 million pages, most of them out of copyright, i.e. older than 1925 ... to be finished in 2048 ...
Access to ant taxonomic publications through antbase.org /Smithsonian Institution, including currently the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
Can taxonomic work be copyrighted? Copyright legislation is national but is based on the  Berne Convention for the Protection of Literary and Artistic Works  which defines a minimal standard. This international copyright standard does not require the recognition of treatments, the building stones of taxonomic publications, as works.
“ work ” does not mean “text”, does not mean “data”, does not mean “information”. A “ work ” is something more. That kind of something more has many different definitions in the various legislations, but it is always there: It may be called originality, individuality, creation, personal expression, creative shaping or anyhow else, but it is a condition for qualifying a product as a work: “Work” is an intellectual product that is in a certain sense particular, individual, original, new.   (Egloff: EDIT IPR and Copyright, 2008)
Taxonomic treatments are highly structured and homogeneous, part of a global >100 million page corpus growing at a rate of ca. 20,000 new species descriptions per year, not counting five times as many redescriptions. Their structure is tightly controlled by a peer review process enforcing standards and a domain-specific vocabulary; they are written not as poems or in flowery language but in scientific jargon. Treatments do not qualify as works. The publications including the treatments might. (Egloff: EDIT IPR and Copyright, 2008)
It is about digesting millions of pages: >>100 M pages of taxonomic literature; 25M scientific publications / year; 25K journals, >1K with taxonomic descriptions; 20K descriptions of new species / year
Is this the access we need?!
No, we need open access to content, not the PDF per se.
It is about machines (not us) doing a great deal of the work for us: extracting data, formulating hypotheses, ...
It is about data and information in context
“Nothing makes sense in biology except in the light of treatments”.
“protein-protein interaction networks” (John Wilbanks, Neurocommons)
In a semantic Web environment (where machines talk to each other and do most of our work), data need to be able to talk to each other. [Figure: “protein-protein interaction networks”, John Wilbanks, Neurocommons; node labels: 27,266 papers, 4,563 papers, 41,985 papers, 10,365 papers, 128,437 papers]
Relational to Ontological Mapping. [Diagram: entities Drug, Neuron, Pathological Agent, Receptor, Channel, Agent, Neuronal Property, Pathological Change, Compartment; relations inhibits, involves, has, is_located_in.] Slide courtesy of Kei Chung, Yale
It will open up scientific literature for data mining. “protein-protein interaction networks” (John Wilbanks, Neurocommons)
TREATMENT Cremastogaster mimosae. Likely diagnostically related to: Cremastogaster tricolor; likely diagnostically related to: Cremastogaster tricolor; likely diagnostically related to: Cremastogaster amabilis; likely diagnostically related to: Cremastogaster tricolor; likely diagnostically related to: Cremastogaster amabilis; associated with: Acacia sienocarpa; living in: Mombasa; living in: Tanga
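The statements extracted from this treatment can be read as subject-predicate-object triples, which is the shape a machine needs for data mining. The following is a minimal sketch, assuming a plain tuple representation; the predicate names `diagnostically_related_to`, `associated_with` and `living_in` are illustrative labels chosen here, not a Plazi vocabulary.

```python
from collections import defaultdict

# Statements transcribed from the slide as (subject, predicate, object) triples.
# The predicate identifiers are hypothetical labels chosen for this sketch.
TREATMENT = "Cremastogaster mimosae"
triples = [
    (TREATMENT, "diagnostically_related_to", "Cremastogaster tricolor"),
    (TREATMENT, "diagnostically_related_to", "Cremastogaster amabilis"),
    (TREATMENT, "associated_with", "Acacia sienocarpa"),
    (TREATMENT, "living_in", "Mombasa"),
    (TREATMENT, "living_in", "Tanga"),
]


def objects_by_predicate(statements):
    """Group the objects of all statements by their predicate."""
    grouped = defaultdict(list)
    for _subject, predicate, obj in statements:
        grouped[predicate].append(obj)
    return dict(grouped)


relations = objects_by_predicate(triples)
```

Once the relations are held in this form, questions such as "where does this species live?" become a dictionary lookup (`relations["living_in"]`) rather than a reading task.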
It is more: it is about access to the original or source data
The semantically enhanced treatments, extracted, stored on Plazi.org, and served in a human readable form, are linked to the underlying data: Fisher & Smith, 2008, PLoS ONE.
Semantic, enhanced treatments do the job ...
... and XML is one way to go.
XML: XML stands for eXtensible Markup Language. XML is a markup language much like HTML. XML was designed to carry data, not to display data. XML tags are not predefined; you must define your own tags (schema). XML is designed to be self-descriptive. XML is a W3C Recommendation.
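To illustrate that XML carries data under tags of one's own choosing, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The element names `name`, `genus` and `species` are invented for this example and do not come from any schema.

```python
import xml.etree.ElementTree as ET

# Tag names are made up for this sketch; XML itself predefines none of them.
record = """
<name>
  <genus>Azteca</genus>
  <species>instabilis</species>
</name>
"""

root = ET.fromstring(record)

# Because the tags describe the data, a program can address each value by name.
fields = {child.tag: child.text for child in root}
```

Nothing in the record says how the name should be displayed; that decision is left to whatever program consumes `fields`.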
XML: Being open and non-proprietary, XML is an optimal archival format for the treatment/publication. Being a stable and rich data format, XML can be repurposed for a variety of uses.
XML: XML application design is an art in itself ... and thus cannot be explained in 15 minutes. There are plenty of resources on the Web to dive into XML, e.g. http://www.w3schools.com/.
This means developing a schema that models the logical content (e.g. TaxonX) and inserting tags that define what a word means, so that a computer can understand it as well. To ensure that everybody talks about the same species, the name can be linked to a reference name server. With the TaxonX schema, "Azteca instabilis" would then read like:
<tax:name>
  <tax:xmldata>  <!-- normalization of data, using an external schema -->
    <dc:Genus>Azteca</dc:Genus>
    <dc:Species>instabilis</dc:Species>
  </tax:xmldata>
  Azteca instabilis
</tax:name>
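A program can then read both the normalized name and the name as printed from such an element. The sketch below uses Python's standard `xml.etree.ElementTree`; the namespace URIs are placeholders invented for this example (the real ones should be taken from the schema files), and `read_name` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are placeholders invented for this sketch; the real URIs
# for the tax: and dc: prefixes should be taken from the schema files.
NS = {
    "tax": "http://example.org/taxonx",
    "dc": "http://example.org/dc",
}

snippet = """
<tax:name xmlns:tax="http://example.org/taxonx"
          xmlns:dc="http://example.org/dc">
  <tax:xmldata>
    <dc:Genus>Azteca</dc:Genus>
    <dc:Species>instabilis</dc:Species>
  </tax:xmldata>
  Azteca instabilis
</tax:name>
"""


def read_name(xml_text):
    """Return (normalized name, name as printed) from a tax:name element."""
    name = ET.fromstring(xml_text)
    genus = name.findtext("tax:xmldata/dc:Genus", namespaces=NS)
    species = name.findtext("tax:xmldata/dc:Species", namespaces=NS)
    # The printed form is the text that follows the xmldata child (its tail).
    printed = name.find("tax:xmldata", NS).tail.strip()
    return f"{genus} {species}", printed


normalized, printed = read_name(snippet)
```

Keeping the printed text alongside the normalized fields means the original wording of the publication is preserved even where it differs from the normalized form.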
This can also be applied to entire sections of text, such as the treatment of a species and its parts.
<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving to a sharp apical tooth, the apex parallel to the anterior clypeal margin. (Holotype with material in mandibles, so mandibles and anterior clypeus described below from paratypes.) Median clypeus ....</tax:p>
  </tax:div>
</tax:treatment>
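Once a description paragraph is delimited by mark-up, its contents become easy to mine. As a sketch, the holotype measurements quoted in the description can be pulled out with a regular expression; the pattern (a two-letter capital abbreviation followed by a number) is an assumption that fits this example and is not a general parser for morphometric data.

```python
import re

# Measurement sentence quoted from the description paragraph of the treatment.
description = (
    "HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, "
    "SI 137, PW 0.73, ML 0.38."
)

# Assumed pattern: a two-capital-letter abbreviation followed by a number.
MEASUREMENT = re.compile(r"\b([A-Z]{2})\s+(\d+(?:\.\d+)?)")


def extract_measurements(text):
    """Map each measurement abbreviation to its numeric value."""
    return {abbr: float(value) for abbr, value in MEASUREMENT.findall(text)}


measurements = extract_measurements(description)
```

The resulting dictionary can be compared across treatments, which is far harder to do with the same numbers locked in a PDF page.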
Globally unique identifiers (e.g. LSID) link up data
LSID for scientific publications; LSID for treatments; LSID for names (ZooBank / HNS ...); LSID for specimens; LSID for DNA sequences / characters (ontologies); LSID for repositories; GPS fixes for locations
"Azteca instabilis" would then read like:
<tax:name>
  <tax:xid source="LSID" identifier="urn:lsid:biosci.ohio-state.edu:osuc_concepts:13452"/>  <!-- link to external database -->
  <tax:xmldata>  <!-- normalization of data -->
    <dc:Genus>Azteca</dc:Genus>
    <dc:Species>instabilis</dc:Species>
  </tax:xmldata>
  Azteca instabilis
</tax:name>
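An LSID is a URN of the form `urn:lsid:authority:namespace:object`, optionally followed by a revision. The sketch below splits such an identifier into its parts; `parse_lsid` and the `LSID` tuple are hypothetical helpers written for this example, and the sample identifier is written in the standard five-part form, which is an assumption about how the identifier on the slide is meant to be split.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(urn: str) -> LSID:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = urn.strip().split(":")
    prefix_ok = [p.lower() for p in parts[:2]] == ["urn", "lsid"]
    if len(parts) not in (5, 6) or not prefix_ok or not all(parts[2:5]):
        raise ValueError(f"not a well-formed LSID: {urn!r}")
    return LSID(*parts[2:])


lsid = parse_lsid("urn:lsid:biosci.ohio-state.edu:osuc_concepts:13452")
```

The authority part names who issued the identifier, so a resolver knows where to ask for the data or metadata behind it.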
We need XML schemas and tools to convert and expose semantically enhanced documents.
Plazi workflow: overview. Plazi deliverables: TaxonX XML schema; GoldenGATE; DSpace application; eXist application; SRS; exchange protocols (SPM, TAPIR, REST)
Plazi workflow: GoldenGATE mark-up as an example. Get bibliographic metadata from HNS (MODS); get bibliographic GUIDs from bioguid (or EDIT?); get geographic long/lat from geonames.org
Plazi Search and Retrieval Server: access to data via TAPIR and SPM, for human and machine users
Materials examined from literature in GBIF
Plazi workflow: content. 11,000 descriptions online; 500 publications; 4,500 publications; Handle, SPM and TAPIR services; feeds into HNS and ZooBank (soon); is harvested by GBIF, EOL; support from GBIF, EOL, US-NSF, DFG
Does the retro mark-up process scale up to the millions of pages that need to be processed? Only partially: mark-up takes about 5 min/page; for 100 M pages that is 700 man-years (but it is only a first tool ...)
Does the mark-up process scale up to the millions of pages that need to be processed? Only partially: mark-up takes about 5 min/page; for 100 M pages that is 700 man-years (but it is only a first tool ...); wizards can reduce the time by several factors. But: how much does it cost to digitize specimens, and what is its quality?
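The effort estimate depends strongly on what counts as a working year. The sketch below makes the inputs explicit; the page count and time per page are taken from the slide, while `HOURS_PER_YEAR` (220 working days of 8 hours) is an assumption of this example and is not stated in the talk.

```python
def person_years(pages, minutes_per_page, hours_per_year):
    """Convert a per-page mark-up time into person-years of effort."""
    total_hours = pages * minutes_per_page / 60
    return total_hours / hours_per_year


PAGES = 100_000_000       # from the slide: 100 M pages
MINUTES_PER_PAGE = 5      # from the slide: about 5 min/page
HOURS_PER_YEAR = 1_760    # assumption: 220 working days of 8 hours

estimate = person_years(PAGES, MINUTES_PER_PAGE, HOURS_PER_YEAR)
```

With these assumed inputs the result is in the thousands of person-years, so the figure of 700 on the slide implies different inputs than those assumed here. Either way, the order of magnitude supports the point that manual mark-up alone covers the legacy literature only partially.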
The cost of converting legacy publications can be avoided by producing marked-up publications up-front
NLM/TaxonX  schema allows publishers to maintain richly encoded articles whose data can be distributed and presented in multiple formats for a variety of uses.
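The idea of one encoded source feeding several output formats can be sketched in a few lines. The record below is a deliberately tiny stand-in: its element names are invented for this example and are not NLM or TaxonX elements, and `to_html` and `to_triples` are hypothetical renderers, not part of any Plazi tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny stand-in for a marked-up treatment; the element names are invented
# for this sketch and are not NLM or TaxonX elements.
SOURCE = """
<treatment>
  <genus>Mystrium</genus>
  <species>leonie</species>
  <status>n. sp.</status>
</treatment>
"""


def load(xml_text):
    """Read the child elements of the record into a dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def to_html(record):
    """Render the record for human readers."""
    name = escape(f"{record['genus']} {record['species']}")
    return f"<p><i>{name}</i> {escape(record['status'])}</p>"


def to_triples(record):
    """Render the same record as statements for machine consumers."""
    subject = f"{record['genus']} {record['species']}"
    return [(subject, key, value) for key, value in record.items()]


record = load(SOURCE)
html_view = to_html(record)
rdf_like = to_triples(record)
```

Both outputs are derived from the same source, so a correction made once in the XML propagates to every presentation.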
[Diagram, built up over several slides: a single NLM/TaxonX XML document as the source, rendered to Print, PDF, HTML and SPM/RDF; these feed databases, HTML species pages (e.g. EOL, Scratchpads), an LSID resolver, Google, data mining, ...]
Semi-automatically generated semantic, enhanced e-publications are the only way to describe the missing 10 M species, and to deal with an increasing flood of data.
The future of publications: the publication, semi-automatically generated. [Workflow diagram labels: analysis & ms preparation; ms submission ("TaxonX version"); new ms alert; posting for review; edited ms; revised ms; accepted ms; publication: PDF; publication: hard copy; publication database ("TaxonX version"); new taxon alert; new data feedback. Linked resources: ontology, bibliography, ZooBank / NS, Taxon DB, Character DB, Specimen DB, Description DB, Distribution DB, Char. Matrix DB, Phyl. Tree DB, Char-state Im., Specimen Im., Habitat Image, Leg. Publicat.]
Journal authoring and production workflow. [Diagram labels: Word MS, DB, input forms (author); export, export, convert; NLM TaxPub; InDesign; NLM TaxPub (publisher)]
Journal authoring and production workflow: what do we miss? [Same workflow diagram, with components marked as available, prototypes, or to be developed]
Where do we stand? 2008 LSIDs, external links
Where do we stand? 2008 LSIDs, external links, XML
Where do we stand? 2008
Where do we stand? 2009 LSIDs, external links, external data via doi, export services
Where do we stand? 2009 LSIDs, external links
Self archive (the Green Road): UNIZ as one of the global leaders in self archiving
OECD Declaration for Access to Research Data from Public Funding (Spring 2007) How to implement this?
antbase.org: free access as a foundation…
http://plazi.org Thank you very much! Donat Agosti

 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translation
 

20090921 Art Databanken Agosti Final

  • 1. Literature and XML: or How to Have More Time to Think Donat Agosti Plazi ArtDataBanken Stockholm, Sept 21, 2009
  • 2. Who is this? What do I know about her? Where does she live? Who are you? What do you do? Where are you from?
  • 3. The answers are in several hundred million pages of printed species descriptions in our libraries, including the descriptions and redescriptions of an estimated 1.8M species, and an estimated 50K new (re-)descriptions annually.
  • 4. Taxonomists at work …… T. E. Lawrence: Seven Pillars of Wisdom – a triumph. 1st published for general circulation, 1935: p. 535
  • 5. The traditional flux of information … … a more or less closed, intransient system
  • 6. What has this to do with XML, semantic, enhanced documents?
  • 9. Before antbase.org, Harvard‘s Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 - present. Through antbase.org‘s digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in one month only).
  • 10. The Biodiversity Heritage Library is currently digitizing and making accessible >100 million pages, most of them out of copyright, i.e. older than 1925. ... to be finished in 2048...
  • 11. Access to ant taxonomic publications through antbase.org /Smithsonian Institution, including currently the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
  • 12. Can taxonomic work be copyrighted? Copyright legislation is national but is based on the Berne Convention for the Protection of Literary and Artistic Works which defines a minimal standard. This international copyright standard does not require the recognition of treatments, the building stones of taxonomic publications, as works.
  • 13. “Work” does not mean “text”, does not mean “data”, does not mean “information”. A “work” is something more. That kind of something more has many different definitions in the various legislations, but it is always there: It may be called originality, individuality, creation, personal expression, creative shaping or anyhow else, but it is a condition for qualifying a product as a work: “Work” is an intellectual product that is in a certain sense particular, individual, original, new. (Egloff: EDIT IPR and Copyright, 2008)
  • 14. Taxonomic treatments are highly structured and homogenous, part of a global >100 million page corpus growing at a rate of ca 20,000 new species descriptions per year, not counting 5 times more redescriptions. Its structure is tightly controlled by a peer review process enforcing standards, a domain specific vocabulary, not written as poem or in flowery language but scientific jargon. Treatments do not qualify as work. The publications including the treatments might. (Egloff: EDIT IPR and Copyright, 2008)
  • 15. It is about digesting millions of pages: >>100 M pages of taxonomic literature; 25M scientific publications / year; 25K journals; >1K with taxonomic descriptions; 20K descriptions of new species / year
  • 16. Is this the access we need?!
  • 17. No, we need open access to content, not the PDF per se.
  • 18. It is about machines (not us) doing a great deal of the work for us: extracting data, formulating hypotheses, ...
  • 19. It is about data and information in context
  • 20. “Nothing makes sense in biology except in the light of treatments.”
  • 21.
  • 22.  
  • 23. In a semantic Web environment (where machines talk to each other and do most of our work), data need to be able to talk to each other: “protein-protein interaction networks” (John Wilbanks, Neurocommons). [Figure labels: 27,266 papers; 4,563 papers; 41,985 papers; 10,365 papers; 128,437 papers]
  • 24. Relational to Ontological Mapping. [Diagram labels: Drug, Neuron, Pathological Agent, Receptor, Channel, Agent, Neuronal Property, Pathological Change, Compartment; relation labels: inhibits, involves, has, is_located_in] Slide courtesy of Kei Chung, Yale
  • 25. It will open up scientific literature for data mining: “protein-protein interaction networks” (John Wilbanks, Neurocommons)
  • 26. TREATMENT Cremastogaster mimosae. Likely Diagnostically Related to: Cremastogaster tricolor; Likely Diagnostically Related to: Cremastogaster tricolor; Likely Diagnostically Related to: Cremastogaster amabilis; Likely Diagnostically Related to: Cremastogaster tricolor; Likely Diagnostically Related to: Cremastogaster amabilis; Associated with: Acacia sienocarpa; Living in: Mombasa; Living in: Tanga
  • 27. It is more: it is about access to the original or source data
  • 28. The semantically enhanced treatments, extracted, stored on Plazi.org, and served in a human readable form, are linked to the underlying data: Fisher & Smith, 2008, PLoS ONE.
  • 30. ... and XML is one way to go.
  • 31. XML: XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to carry data, not to display data. XML tags are not predefined; you must define your own tags (schema). XML is designed to be self-descriptive. XML is a W3C Recommendation.
  • 32. XML: Being open and non-proprietary, XML is an optimal archival format for the treatment/publication. Being a stable and rich data format, XML can be repurposed for a variety of uses.
  • 33. XML: XML application design is an art in itself ... and thus cannot be explained in 15 minutes. There are plenty of resources to dive into XML on the Web, e.g. http://www.w3schools.com/.
  • 34. This means developing a schema that models the logical content (e.g. TaxonX) and inserting tags that define what a word means, so that a computer can understand it as well. To ensure that everybody talks about the same species, the name can be linked to a reference name server. Azteca instabilis would then read like: <tax:name> <tax:xmldata> <dc:Genus>Azteca</dc:Genus> <dc:Species>instabilis</dc:Species> </tax:xmldata> Azteca instabilis </tax:name> (slide callouts: Taxonx-schema; External schema; Normalization of data)
  • 35. This can also be applied to entire sections of text, such as the treatment of a species and its parts. <tax:treatment> <tax:nomenclature> <tax:name> <tax:xid source="HNS" identifier="193329"/> <tax:xmldata> <dc:Genus>Mystrium</dc:Genus> <dc:Species>leonie</dc:Species> </tax:xmldata> Mystrium leonie </tax:name> <tax:status>n. sp.</tax:status> Fig 1 D - F </tax:nomenclature> <tax:div type="description"> <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving to a sharp apical tooth, the apex parallel to the anterior clypeal margin. (Holotype with material in mandibles, so mandibles and anterior clypeus described below from paratypes.) Median clypeus .... </tax:p> </tax:div> </tax:treatment>
  • 36. global unique identifiers (e.g. LSID) link up data
  • 37. LSID for scientific publications; LSID for treatments; LSID for names (Zoobank / HNS ...); LSID for specimens; LSID for DNA sequences / characters (ontologies); LSID for repositories; GPS fixes for locations
  • 38. Azteca instabilis would then read like: <tax:name> <tax:xid source="LSID" identifier="urn:lsid:biosci.ohio-state.edu.osuc_concetps:13452"/> <tax:xmldata> <dc:Genus>Azteca</dc:Genus> <dc:Species>instabilis</dc:Species> </tax:xmldata> Azteca instabilis </tax:name> (slide callouts: Link to external database, on tax:xid; Normalization of data, on tax:xmldata)
  • 39. We need XML-schemas, tools to convert and expose semantically enhanced documents.
  • 40. Plazi workflow: overview. Plazi deliverables: TaxonX XML schema; GoldenGate; Dspace application; Exist application; SRS; Exchange protocols (SPM, TAPIR, REST)
  • 41.
  • 42. Plazi Search and Retrieval Server: Access to data via TAPIR, SPM. [Diagram labels: You, You, You; human; machine]
  • 43. Materials examined from literature in GBIF
  • 44. Plazi workflow: content. 11,000 descriptions online; 500 publications; 4,500 publications; Handle, SPM and Tapir services; feeds into HNS and Zoobank (soon); is harvested by GBIF, EOL; support from GBIF, EOL, US-NSF, DFG
  • 45. Does the retro mark-up process scale up to the millions of pages that need to be processed? Only partially: mark-up takes about 5 min/page; for 100 M pages = 700 man years (but it is only a first tool...)
  • 46. Does the mark-up process scale up to the millions of pages that need to be processed? Only partially: mark-up takes about 5 min/page; for 100 M pages = 700 man years (but it is only a first tool...); wizards can reduce the time by several factors. But: how much does it cost to digitize specimens, and what is its quality?
  • 47. The cost of converting legacy publications can be avoided by producing marked-up publications up-front
  • 48. NLM/TaxonX schema allows publishers to maintain richly encoded articles whose data can be distributed and presented in multiple formats for a variety of uses.
  • 49. NLM/Taxonx XML Document Print
  • 50. NLM/Taxonx XML Document PDF Print
  • 51. NLM/Taxonx XML Document HTML PDF Print
  • 52. NLM/Taxonx XML Document HTML SPM /RDF PDF Print SPM /RDF SPM /RDF
  • 53. NLM/Taxonx XML Document HTML SPM /RDF PDF Print Database HTML / Species Page HTML SPM /RDF SPM /RDF HTML / Species Page HTML / Species Page e.g. EOL, Scratchpads
  • 54. NLM/Taxonx XML Document HTML SPM /RDF PDF Print Database HTML / Species Page LSID resolver HTML SPM /RDF SPM /RDF HTML / Species Page HTML / Species Page e.g. EOL, Scratchpads
  • 55. NLM/Taxonx XML Document HTML SPM /RDF PDF Print Database HTML / Species Page Google Datamining, ... LSID resolver HTML SPM /RDF SPM /RDF HTML / Species Page HTML / Species Page e.g. EOL, Scratchpads
  • 56. Semi-automatically generated semantic, enhanced e-publications are the only way to describe the missing 10 M species, and to deal with an increasing flood of data.
  • 57. ms submission („Taxon-x-version“) new ms alert Posting for review Edited ms Revised ms Publication: pdf Publication: hard copy Publication database („taxon-x-version“) analysis & ms preparation Taxon DB New Data feedback Accepted ms New taxon alert The future of publications: The publication semiautomatically generated ontology bibliography ZooBank / NS Character DB Specimen DB Description DB Distribution DB Char. Matrix DB Phyl. Tree DB Char-state Im. Specimen Im. Habitat Image Leg. Publicat.
  • 58. Word MS DB Input forms export export convert NLM taxpub Indesign NLM taxpub author author author publisher publisher publisher Journal authoring and production workflow Ctd.
  • 59. NLM/Taxonx XML Document HTML SPM /RDF PDF Print Database HTML / Species Page Google Datamining, ... LSID resolver HTML SPM /RDF SPM /RDF HTML / Species Page HTML / Species Page e.g. EOL, Scratchpads Ctd.
  • 60. Word MS DB Input forms export export convert NLM taxpub Indesign NLM taxpub author author author publisher publisher publisher Journal authoring and production workflow: What do we miss? available prototypes to be developed
  • 61. Where do we stand? 2008 LSIDs, external links
  • 62. Where do we stand? 2008 LSIDs, external links, XML
  • 63. Where do we stand? 2008
  • 64. Where do we stand? 2009 LSIDs, external links, external data via doi, export services
  • 65. Where do we stand? 2009 LSIDs, external links
  • 66.
  • 67. Self archive (the Green Road): UNIZ as one of the global leaders in self archiving
  • 68.
  • 69. OECD Declaration for Access to Research Data from Public Funding (Spring 2007) How to implement this?
  • 70.
  • 71.
  • 72.
  • 73. http://plazi.org Thank you very much! Donat Agosti [email_address]
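
A note on slides 34 and 35: the point of the name markup is that a machine can read the normalized genus, species and external identifier straight out of the text. The following is a minimal sketch of that idea in Python, not Plazi's own tooling. The slides show only the prefixes tax and dc, so the two namespace URIs below are placeholders, and the function name read_name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions): the slides give only the
# prefixes "tax" and "dc", not the URIs they are bound to.
NS = {
    "tax": "http://example.org/taxonx",
    "dc": "http://example.org/darwin-core",
}

# The tax:name element from slide 35, with namespace declarations added
# so that a standard XML parser accepts it.
SAMPLE = """\
<tax:name xmlns:tax="http://example.org/taxonx"
          xmlns:dc="http://example.org/darwin-core">
  <tax:xid source="HNS" identifier="193329"/>
  <tax:xmldata>
    <dc:Genus>Mystrium</dc:Genus>
    <dc:Species>leonie</dc:Species>
  </tax:xmldata>
  Mystrium leonie
</tax:name>
"""


def read_name(xml_text):
    """Extract normalized name parts, verbatim name and external id from a tax:name element."""
    name = ET.fromstring(xml_text)
    xid = name.find("tax:xid", NS)
    xmldata = name.find("tax:xmldata", NS)
    # The name as printed in the publication follows the tax:xmldata block,
    # so ElementTree exposes it as that element's tail text.
    verbatim = (xmldata.tail or "").strip() if xmldata is not None else ""
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "verbatim": verbatim,
        "source": xid.get("source") if xid is not None else None,
        "identifier": xid.get("identifier") if xid is not None else None,
    }


if __name__ == "__main__":
    print(read_name(SAMPLE))
```

Keeping the verbatim string next to the normalized parts mirrors the slides: the printed name stays untouched for human readers, while the tax:xmldata block and the tax:xid identifier give software something unambiguous to link on.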