Good afternoon, and thanks for the opportunity to present on eMonocot, with a focus on the portal prior to its first release next month. FishBase, which we heard about before lunch, covers 33,000 species. Within botanical biodiversity informatics at least, most projects to date have focussed on one family or taxon with a few thousand species, like Araceae, Solanaceae or palms, or on a geographical area, such as ALA. eMonocot is the first project to synthesise and deliver taxonomic data from a major plant taxon to a specific primary user group. 70K species, ca 20% of all flowering plants, e.g. cereals (wheat, rice, maize), which provide over 50% of the world's food energy intake; also oil palm, yams, pineapples and ginger. Hedychium is a horticulturally important member of the ginger family Zingiberaceae, flowering now in gardens. Will cover progress in the project to date, describe the approaches taken, and look again at the issue of providing consensus classification, which generated some controversy in the early days of biodiversity informatics.
Project update. eMonocot portal. Look at consensus classification in biodiversity informatics (BI). Finally, summarise the presentation and look to the future.
Presenting on behalf of a team of researchers: 22 eMonocot team members at the first plenary meeting in early 2011. We are now just over halfway through the project. A partnership of taxonomists with ICT specialists such as software and web developers. This is taxonomy as a “team sport” [ref Sandy]. But in addition to the core team, international collaboration and input are needed to deliver both the initial project and medium- to long-term sustainability. eMonocot will involve hundreds or even thousands of people, from specialist monocot taxonomists through professional biodiversity scientists to the general public, and will be developed and owned by the community as a whole. A team activity moving taxonomy onto the web. Content team based at RBG Kew, capturing monocot taxonomic information; Kew will provide post-project sustainability. Oxford is central to software development. Taxonomic producer community outreach is based at the NHM, as are the Scratchpads project and the extension to zoology. Poster available upstairs with further project information.
1) ID to at least family, and to genus/species where possible, via keys and other ID tools. 2) Provide..... via taxon pages. 3) Users range from professional taxonomists (producers), through primary users such as ecologists and conservation biologists, to secondary users: gardeners and the interested public. 4) Particularly those working in smaller institutions without access to large collections and libraries. 5) Perhaps for animal groups, and for the rest of the flowering plants, with a view to GSPC Target 1 for 2020 (WFO).
How are we working towards those aims? eMonocot is formed by a distributed information system: multiple linked web-based taxonomic resources developed in open source software. Pre-existing resources [list]. Added to these are bespoke eMonocot scratchpads developed by monocot specialists and communities external to the project but collaborating with it [point], e.g. Juncaginaceae, Zingiberaceae, workshop taxa; and content developed by the Kew content team and uploaded into scratchpads [point]. REF Vince's talk re scratchpads. Added to this is biodiversity science data, e.g. habitat information and conservation status assessments, via web services or links. All supply content to an innovative portal, the web interface for users, which repeatedly harvests data from all sources using the open access GBIF standard, Darwin Core Archive, to provide rich information organised around a single classification. It is aimed mainly at the biodiversity scientist user group, e.g. ecologists, evolutionary biologists and conservation scientists, and secondarily at other user communities.
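As a rough illustration of what harvesting from a Darwin Core Archive involves, the sketch below reads a tab-delimited taxon core file out of a zipped archive and attaches synonyms to their accepted names. It is not the eMonocot harvester: the file name `taxon.txt`, the function names and the three example rows are assumptions made for illustration, and a real archive also carries a `meta.xml` descriptor that this sketch does not parse. The column names are standard Darwin Core terms.

```python
import csv
import io
import zipfile


def build_example_archive() -> bytes:
    """Build a tiny in-memory zip holding a taxon core file (illustrative rows only)."""
    rows = [
        ["taxonID", "scientificName", "taxonRank", "taxonomicStatus", "acceptedNameUsageID"],
        ["t1", "Dioscorea", "genus", "accepted", ""],
        ["t2", "Dioscorea elephantipes", "species", "accepted", ""],
        ["t3", "Testudinaria elephantipes", "species", "synonym", "t2"],
    ]
    body = "\n".join("\t".join(row) for row in rows) + "\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Assumed core file name; a real archive declares it in meta.xml.
        archive.writestr("taxon.txt", body)
    return buffer.getvalue()


def harvest_taxa(archive_bytes: bytes, core_name: str = "taxon.txt") -> dict:
    """Return accepted taxa keyed by taxonID, each with a list of its synonyms."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        with archive.open(core_name) as handle:
            text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
            records = list(csv.DictReader(text, delimiter="\t"))

    # First pass: index the accepted names.
    accepted = {
        r["taxonID"]: {"name": r["scientificName"], "rank": r["taxonRank"], "synonyms": []}
        for r in records
        if r["taxonomicStatus"] == "accepted"
    }
    # Second pass: hang each synonym on the accepted name it points to.
    for r in records:
        if r["taxonomicStatus"] == "synonym":
            target = accepted.get(r["acceptedNameUsageID"])
            if target is not None:
                target["synonyms"].append(r["scientificName"])
    return accepted


taxa = harvest_taxa(build_example_archive())
for taxon_id, taxon in taxa.items():
    print(taxon_id, taxon["name"], taxon["synonyms"])
```

Running the harvest repeatedly against each source and merging the results under a single accepted-name index is one plausible way to organise content around one classification.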
A major part of building the eMonocot system has been capturing content for delivery. Significant progress to date with content capture in the 8 core families (2100 genera, ca 75%), as you can see from the graph. Explain axes: % processed per taxonomic group. Genera complete in Palmweb and CATE-Araceae, also much species content. Grassbase is a completed resource, with all species and genera in DELTA. For Cyperaceae, 100% of genus content captured, plus many Carex species pages. Dioscoreaceae genera complete, Liliaceae almost so. Orchidaceae genera complete; Cypripedioideae is also among the species-level resources. Content gathering for European and SRLI monocot species pages is in progress and will move forward through late 2012 and 2013. Interactive key generation complete or under way in all 8 core families. Already a key to monocot families.
Content team uploading gathered content into eMonocot scratchpads, e.g. the Cypripedioideae Scratchpad. Main contributor Ruth Bone. Covers all genera and species of slipper orchids. Multi-access key. Descriptive content. Images, including from SOF via web service. Other orchid scratchpads already exist (Eulophiinae, Aeridinae); genera based on GO are in the pipeline. Several sites for the largest monocot family, based on community sustainability; most others are family-based.
e.g. Dioscoreaceae. Kew content team generic content and keys to Dioscoreaceae already in place. Species content and keys, e.g. for Madagascar, developed by a local collaborator. We can work on this independently. She gains much greater access to taxonomic resources, e.g. protologues and type images, and we are able to collaborate more closely. South African Dioscorea: a recent project with in-country collaborators. Developing content within the scratchpad; already have descriptions for ca half the species from previous paper publication. Will generate species limits and new descriptions for the rest from specimen-based research, and upload directly to the scratchpad. Aiming to submit a regional treatment, “Dioscorea of South Africa”, as a manuscript via the specialised scratchpad publications module to PhytoKeys. An example of a new mode of working for taxonomists provided by scratchpads.
Main function of the content team and external scratchpads is to supply the portal with data [show]. The portal will integrate this content with existing sources of data, including WCM and biodiversity science data, via the checklist web service and Darwin Core Archive harvesting. The portal will go live to users in early October. Currently 270K names and over 14K descriptions, forming taxon pages. Additional target content by end of October: 4 more keys; several hundred more taxon descriptions; IUCN conservation status.
Portal taxon pages. Skeleton taxon pages for all 70K+ taxa from the World Checklist with [show] accepted name, synonymy, nomenclature, distribution maps and references. 16% already have additional descriptive content, such as this Pothos page. NB map, clickable links to species. Will be augmented by data of specific interest to biodiversity scientists, e.g. conservation status (from IUCN via web service), common names (from Checklist bank), specimens via GBIF, protected/invasive species and habitat type information where available, WWF biomes, and climatic data from WorldClim. Aiming to enable research by biodiversity scientists across a range of themes. Taxon pages link to source system pages, which give information on the source of the data (in development). Source system owners can register for an account allowing them to update source system details and view information on the harvesting of data from their source system.
How does the portal deliver classification? Portal now: ability to browse the entire consensus classification from the World Checklist of Monocots by navigating between taxon pages, or to view the whole classification in one place via the classification tree. Planned future developments include a bulk validator for checklist names and an interactive phylogeny browser to allow users to explore and visualise phylogenetic trees. For taxa with keys available, a key icon appears, e.g. Dioscoreaceae.
The portal will help users to identify specimens and living plants using interactive multi-access keys, including illustrations of characters and states, linking through to the taxon pages in the portal. Users can search for keys to a particular taxonomic group and then use the key to identify their specimen, or use the family key first and then a lower-level key to narrow down the identification. Users can browse a gallery or slideshow of images of a specific taxon or group of taxa, or of taxa found in a certain part of the world. Will be adding searchable dichotomous keys to taxa, based on multi-access key matrices. A glossary will help users to understand the terminology of keys and taxon page descriptions.
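The idea behind a multi-access key can be sketched in a few lines: each taxon is scored against a matrix of characters and states, the user records observations in any order, and the taxa still compatible with every observation remain. The sketch below is an assumption-laden illustration, not the portal's key engine; the taxa, characters and states are placeholders, and the rule that an unscored character never excludes a taxon is a design choice made here.

```python
from typing import Dict, List, Set

# character -> set of states recorded for a taxon (placeholder data)
Scores = Dict[str, Set[str]]

EXAMPLE_MATRIX: Dict[str, Scores] = {
    "Taxon A": {"leaf arrangement": {"alternate"}, "flower colour": {"white", "yellow"}},
    "Taxon B": {"leaf arrangement": {"opposite"}, "flower colour": {"white"}},
    "Taxon C": {"leaf arrangement": {"alternate"}, "flower colour": {"red"}},
}


def is_compatible(scores: Scores, observations: Dict[str, str]) -> bool:
    """A taxon stays in play unless a scored character contradicts an observation."""
    for character, state in observations.items():
        states = scores.get(character)
        # Characters not scored for this taxon never exclude it.
        if states is not None and state not in states:
            return False
    return True


def remaining_taxa(matrix: Dict[str, Scores], observations: Dict[str, str]) -> List[str]:
    """Return the taxa consistent with all observations, in name order."""
    return sorted(t for t, scores in matrix.items() if is_compatible(scores, observations))


# Observations can be entered in any order, unlike a dichotomous key.
print(remaining_taxa(EXAMPLE_MATRIX, {"flower colour": "white"}))
print(remaining_taxa(EXAMPLE_MATRIX, {"flower colour": "white", "leaf arrangement": "alternate"}))
```

Using a family-level matrix first and then a lower-level one corresponds to calling the same function twice with different matrices.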
Faceted searching developed for the eMonocot portal: free text searches can be restricted by particular facets, e.g. presence in a geographical region, family or other rank. A powerful searching tool. Supplemented by map searching using points or polygons. Interface supports ad hoc queries, e.g. Largest family? Species number per genus of Poaceae? Which family of monocots has the most threatened species? Content downloads will also be supported in future. Ben Clark, lead software developer, will be presenting in more detail at the OBI workshop on biodiversity technologies next week.
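A minimal sketch of faceted searching, assuming a small in-memory list of taxon records rather than the portal's actual search index: a free text match on the name is narrowed by any number of facet values, and one of the ad hoc queries mentioned (species number per genus in a family) falls out as a count over the filtered records. The field names, function names and region codes `R1` to `R3` are hypothetical.

```python
from collections import Counter

# Placeholder records; the region codes are invented for illustration.
RECORDS = [
    {"name": "Triticum", "family": "Poaceae", "rank": "genus", "region": ["R1", "R2"]},
    {"name": "Triticum aestivum", "family": "Poaceae", "rank": "species", "region": ["R1", "R2"]},
    {"name": "Triticum durum", "family": "Poaceae", "rank": "species", "region": ["R1"]},
    {"name": "Oryza sativa", "family": "Poaceae", "rank": "species", "region": ["R2"]},
    {"name": "Dioscorea elephantipes", "family": "Dioscoreaceae", "rank": "species", "region": ["R3"]},
]


def matches_facet(record, facet, wanted):
    """Multi-valued fields match on membership, single-valued fields on equality."""
    value = record.get(facet)
    if isinstance(value, (list, set, tuple)):
        return wanted in value
    return value == wanted


def faceted_search(records, text="", **facets):
    """Free text search on the name, restricted by every facet supplied."""
    needle = text.lower()
    return [
        r for r in records
        if needle in r["name"].lower()
        and all(matches_facet(r, facet, wanted) for facet, wanted in facets.items())
    ]


def species_per_genus(records, family):
    """Ad hoc query: number of species in each genus of one family."""
    species = faceted_search(records, family=family, rank="species")
    return Counter(r["name"].split()[0] for r in species)


print([r["name"] for r in faceted_search(RECORDS, "triticum", rank="species", region="R1")])
print(species_per_genus(RECORDS, "Poaceae"))
```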
Turning now to consensus classification and its role in biodiversity informatics. Definition of consensus. Thus it follows that they don't all have to agree that it is perfect, merely that it is the best they can do at a point in time. [One could even argue that they don't even all have to agree: editors of journals and flora editorial boards often have to make majority decisions.]
Why is consensus classification important? It provides a collective vision, linking the taxonomic work we do across scales: from individuals, to communities, to groups working with higher taxa like flowering plants, to biodiversity as a whole. Everyone playing a part in delivering a team-based consensus classification is, in my view, taxonomy's big plan. Also a good idea because demonstrably people want it: see the success of unified taxonomic resources like WCM and TPL in terms of website hits and citations. It is what the users of taxonomy would like us to provide.
Does consensus classification already exist? In one sense, it is the system most people use, as pointed out in papers led both by Malcolm and by Ben Clark. Could call this PASSIVE CONSENSUS. Classifications, e.g. Linnaeus, Species Plantarum: devised by a single taxonomist but adopted by all for over 100 years. Engler's consensus classifications, Pflanzenreich and Pflanzenfamilien: multiple authors for different families. The more inclusive they are, the more they are used, and the more they become the passive consensus. Engler's flawed concept of Liliaceae is an example of when consensus is wrong and needs to be overturned. It took a combination of the genius of Rolf Dahlgren, Willi Hennig and the inventor of PCR to overturn it. When I was a student, Cronquist was the consensus flowering plant classification, even though lots of people didn't like it: one person's view, devoid of supporting evidence in parts. We are all used to working with consensus classification.
Recent years have moved towards more inclusive, evidence-based models: ACTIVE CONSENSUS, e.g. APG III/WCM, the two main classification sources for eMonocot. APG III: 40+ authors, flowering plant orders and families. WCM has incorporated the idea of consensus: 1/3 of monocot families have multiple contributors, and not all of the rest have active taxonomic communities. 84 inputs to the classification according to the website: Araceae 18 people, orchids 15. It will become increasingly a consensus classification as more monocot taxonomic communities move to the web and take ownership of “top copy” status of the checklist for their taxon.
Discussion of consensus classification was started in 2002 by CG's Nature commentary, “Challenges for taxonomy”. It described what he then called unitary taxonomy (the term has rather fallen out of use), and how that could be generated by moving the resources or building blocks of the taxonomy of a group online. This could be used to form the “first web revision”. Following peer review and discussion, this could form a single consensus classification of that taxon. Followed up by a paper by Malcolm in 2003, exploring these ideas in more detail.
This generated some controversy. In response, Thiele & Yeates drew attention to a critical dichotomy within taxonomy. On the one hand, a) taxonomy is a vibrant hypothesis-testing science, with each name representing a testable hypothesis; on the other, b) it also needs to provide a stable information service to users. They suggested it would be problematic for consensus classification to play a significant role in taxonomy: it would make it more rigid and hamper the uptake of new research findings into classification.
Others have gone further, suggesting that biodiversity informatics, by being focussed on users, will result in a simplified, inferior taxonomy, reduced in both importance and role, e.g. de Carvalho et al 2007.
The issue of consensus classification was addressed by the eTaxonomy proof-of-concept project CATE. It investigated means of working towards consensus while retaining alternative or rejected versions should they need to be revived. The paper cited above argued strongly for the value of consensus classification, especially to users. But CATE worked primarily with taxonomic specialists in two moderately sized groups of organisms, so exposure to diverse groups of users was limited. And it was exposed to users via 2 sites (1 for each group), so it had no opportunity to have different elements with different approaches to issues such as consensus.
Most of the debate took place when biodiversity informatics was still in its infancy, and was theoretical in nature. See what is actually happening now, especially given that we are now able to work at a significantly larger scale. Look at the practicalities of the eMonocot system: 2 places where we can promote consensus. 1) Family/taxon/scratchpad community scale consensus. Classifications based on rigorous, hypothesis-testing taxonomic science, based as far as possible on in-depth morphological and/or molecular data; monophyly. Contributions by all encouraged; data or observations contributed will generate or change the classification via a peer review system. Competing hypotheses can be evaluated by the community and a consensus decision reached. The same system of ensuring the reliability of our work as we have always used. Still allows for consensus to be overturned (e.g. Engler's flawed concept of Liliaceae). But importantly, the classifications we develop here are the building blocks of consensus at a larger scale. 2) At higher taxon scale: all of the monocots. Here it is vital to present a single overall consensus classification to users: initially WCM-based, but it will incorporate more community-generated classifications in its consensus as eMonocot scratchpads develop. The portal will harvest classifications generated by expert communities to provide the overall consensus. Thus we are achieving a means of decoupling the taxonomic system, as suggested by Thiele & Yeates: 1) the need to generate and test hypotheses robustly within the communities oriented around scratchpads (1); 2) providing stable, taxon-based information via the portal (2). Thus a model for how to go forward in terms of the biodiversity informatics of major taxonomic groups.
Developing consensus within the scratchpads. The eMonocot content team at Kew have started investigating 2 methods for discussing, peer-reviewing and collectively changing a classification in a scratchpad: either via forums enabling posting of comments via email, or via a custom content type enabling structured debate. Both will be trialled with eMonocot scratchpad communities in the months ahead.
The portal is explicitly consensus classification-based, providing one name linked to e.g. descriptive, image and trait data for all monocot taxa.
Summary. Vision of eMonocot as a distributed information system (show) established. Captured content across all 11 target areas (8 families, SRLI, European and slipper orchids), and in non-core families too. 15 bespoke eMonocot Scratchpads have been launched to deploy that content. Portal development is progressing well; first release to users will be early next month. Will benefit both taxonomic producer and biodiversity scientist end user communities, in part through presenting consensus classification. Taxonomists' benefits: access to taxonomic resources; new ways of collaborating; new ways to publish work; access to new audiences. Biodiversity scientist end users get a single consensus classification and an authoritative source for associated data. And it can be done without compromising the academic integrity of taxonomy. Lessons learnt already: interconnectivity [point] (in this case via DCA) is vital in BI to link separate taxonomic web resources, and will be increasingly vital to link resources for different purposes at different scales. User needs are key: too many taxonomic databases have focussed too much on the needs of taxonomists. An increasingly collective, community-focussed ethic is needed for taxonomy. Should look to apply these lessons as we move on to new challenges like WFO 2020.
Emonocot presentation linn_soc20912
eMonocot, the eMonocot Portal and Consensus Classification. Paul Wilkin. Hedychium densiflorum
Presentation structure: eMonocot project update; the eMonocot portal; Consensus classification and biodiversity informatics; Summary. Ceroxylon quindiuense. Photo R. Bernal
Aims of eMonocot When complete, eMonocot will: 1. Enable the identification of monocot plants anywhere in the world 2. Provide a wealth of information about monocot species, genera and families 3. Address separately the needs of different users 4. Link together monocot taxonomists to enhance their productivity 5. Provide a model for web taxonomy Watsonia confusa
eMonocot: a distributed information system
eMonocot community scratchpads: 15 (http://e-monocot.org/list-emonocot-scratchpads)
Existing eTaxonomic resources:
• CATE-Araceae (www.cate-araceae.org): 104 genera, ca 4000 species
• Palmweb (www.palmweb.org): 190 genera, ca 2600 species
• Grassbase (www.kew.org/data/grasses-db.html): ca 700 genera, 11611 species
• Monocot checklist (www.kew.org/wcsp/monocots): ca 70000 monocot species
What is consensus classification? Consensus: agreement in opinion; the collective unanimous opinion of a number of persons. Consensus classification: a single taxonomy that is subscribed to by all specialists for a given taxon. Chlorophytum sp.
Why is consensus classification important? Collective Vision ....in order for policy makers or big funding agencies to take us seriously we need to have a common vision - a big plan, not a robotic repetition of the same words (Knapp 2008) Meeting user requirements Tulipa sp. Photo M. Zarrei
Existing consensus classification: active consensus. WCM: ~1/3 of families with multiple contributors; 84 inputs by monocot taxonomists
Consensus classification and biodiversity informatics “The taxonomy of a particular group could reside in one place and be administered by a single organization. It could be self-contained and require reference to no other sources..... a number of things would then follow. First, the only logical way to organize a unitary taxonomy and to make it widely available is on the web” (Godfray 2002) Haemanthus puniceus
Why has consensus classification been controversial? “Names must both represent a volatile hypothesis and provide a key to lasting information..... A solution must adequately recognize these dual roles and decouple the system that allows maximum freedom of hypothesis-generation from the system that provides names for users” (Thiele & Yeates 2002). Kniphofia sp.
Why has consensus classification been controversial? “The ‘cybertaxonomic solution’.... reveals a traditional misunderstanding that regularly emanates from the more ‘applied’ side of biology - that the only significant data taxonomists provide are the species name, diagnosis, and distribution for the purposes of identification by non-taxonomic end-users” (de Carvalho et al 2007) Dendrobium cuthbertsonii. Photo W. Baker
Consensus classification in CATE “CATE....consensus taxonomy is intended to retain alternative views....so that they can, potentially, be revived. Good revisionary taxonomy, whether Web or paper based, explains differences of opinion but still proposes a recommendation. Consensus, therefore, is neither intended to stifle dissent nor does it imply immutability. It is needed to help users outside the taxonomic community” (Clark et al 2009) Lysichiton americanum. Photo I. Kitching
Community consensus classification in eMonocot [diagram with labels 1 and 2]
Helping producer communities to work towards consensus classification in scratchpads
Summary: Distributed information system established. Content capture on track. 15 new Scratchpads launched. Portal released to users October 2012. Will benefit both taxonomic producer and user communities. Lessons learned to date: • Interconnectivity • Users • Communities. Triticum aestivum