Thanks for coming. I’d like to talk to you today about the CATE Araceae project. CATE was a 3 year NERC funded project that ran until the end of 2008. We focussed on two taxonomic groups: Araceae and Sphingidae. Today I’d like to talk specifically about the Araceae project. On previous occasions I have talked to some of you about the process we went through to put content on the web. So this is really a talk about what has happened since the end of the project. We want to share some of our experiences in the hope they may be of use to future and current eTaxonomy ventures.
We are casting this talk as lessons learned: that there ought to be a difference between eTaxonomy and Taxonomy, and that it is hard to know what to do on the web.
So what is eTaxonomy? Here we have three quotes from the great and the good regarding eTaxonomy. It’s clear that they think it is a good thing, but is that enough? Many people have written about how important it is for taxonomy to embrace the web, but what does this really mean and what is the best way to go about it? What is it that makes a successful web revision?
When we started CATE, we had a grand vision, and we thought we knew what we were trying to do, but once we began to build it we realised we had to make many, many decisions. We realised that we all had slightly different ideas of what we expected the site to do and how we hoped it would do it. Trying to communicate those ideas was a long and painful process. The web is just like the “real world”. Many more websites fail than succeed. Assuming that all websites are successful is the “field of dreams” approach: assuming that all you have to do is build the website and people will use it. The web is no longer uncharted territory waiting to be colonised. In fact it is pretty densely colonised already. A fuzzy idea is not enough to build a site; we needed a more detailed one. We realised we would have to work hard at attracting and keeping users. We needed to think about why they should come.
It might help to look at eTaxonomy as a problem of publication. Until recently taxonomy was like most other knowledge industries: constrained by the cost of publication, which meant that it was important to get everything absolutely correct before you published it. This meant that only the experts could be involved (because we require high levels of quality control), and typically only one or two experts would collaborate to produce a book. This would be circulated to a small number of libraries and would be almost immediately out of date. This means that access to information is limited by physical access to libraries, by the time taken to publish information, and by the availability of experts. This is also a problem because a taxonomic monograph is an index, and an index has to be up to date. Subsequent scattered publications on a group may not be widely known about until the next monograph is published.
The internet removes many of these problems. Access is greatly improved. Because you can re-publish at will, you can publish incomplete data, or make corrections. In addition, collaboration is much easier and this allows many more people to become involved (provided they can maintain a suitable level of quality control). This may involve more non-experts, who often have more time. So the internet has huge potential that can be taken advantage of, as long as the details are thought through. The internet has a certain amount of cool, and can be attractive to funders. However, success is in the detail.
So we’re casting eTaxonomy as part of a more general pattern which is called “participation - collaboration”. Of course, the starting point is the information which you put on the internet in your first release. For us, this was in July 2009 when the site finally went “live”. By which I mean that it was then possible to add new taxa and update the taxonomy, as well as add new data and edit the existing data. Note that this occurred some time after the official end of the project. It was only with some additional funding that the site became sustainable and could be updated by its users rather than by sending data to Ben to upload. I am sure that it is a general truth that all web revisions are intended to be completed, if they are not already complete, and then curated so that they remain so. In Ben’s research as part of eMonocot, he spoke to 8 e-taxonomists: all said they would maintain their sites and keep them up to date. This requires ongoing effort which is distinct from the effort expended delivering the first web revision. We explored the opportunity to collaborate with our users.
This leads to a working definition of success – the important thing is that it doesn’t take sustainability for granted – sustainability is an aim and a criterion for success. We can assume that sustainable revisions would provide useful, high-quality information and this would, in turn, increase the likelihood that the revision would be sustained.
As we’ve identified earlier, the real challenge is knowing how to build a website that is a success. One of the things which we did was talk to our “editorial committee”, who are eminent Araceae experts or “druids” and who, in collaborating with us, gave the project credibility. They also provided real help in, for example, building the keys. We consulted with them on the design of the site and they gave us feedback on what it should look like and what it should do. However, we eventually realised that they did not actually use the CATE site themselves. The reason for this is that most of the information originated from them in the first place, and most of them have their own personal resource where they have access to the taxonomic data, for example a filing cabinet and an office. I think that it is a common pitfall to design websites based on a perception that the site will be used by your colleagues or your peers. In our case, this did not hold. However, I have hope that the site could be of more use to early career scientists who have not built up that office full of protologues and other literature. We have found that some younger taxonomists have embraced the site more. I would like to hope that some of the extra features of the site that I will mention later might persuade taxonomists that there is something in it for them: added value in adding data to the site.
This is not to say that CATE is not used. We’ve had over 12 thousand visits from over 6 thousand visitors, and a significant proportion of those visitors return to the site again and again. The problem is, we have no idea who they are. Many come from search engine results pages where they have searched for a scientific name.
One thing which is clear to us is that, for CATE to be sustainable, it needs to have an active community of users who keep it up to date. Another fallacy is that visitors, or even registered users, equal activity on a website; they do not. The 90-9-1 rule of online engagement (where 90% of users ‘lurk’, 9% are occasional contributors, and 1% are active) holds true in the case of CATE and probably other eTaxonomy websites. This seems to be a fact of life which means that, although a website might have thousands of visitors, it can have a very small number of users keeping the content up to date. Those users require a lot of help.
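As a rough illustration of that arithmetic, the sketch below applies the 90-9-1 split to the visitor count quoted in this talk and sets it beside the number of people who actually made edits. The function name `split_90_9_1` is made up for this example, and the rule is only a rule of thumb, not a measurement of CATE's users.

```python
def split_90_9_1(visitors: int) -> dict:
    """Split a visitor count using the 90-9-1 rule of thumb.

    Integer arithmetic is used so the three groups always add up
    to the original visitor count.
    """
    active = visitors // 100          # roughly 1% contribute regularly
    occasional = visitors * 9 // 100  # roughly 9% contribute now and then
    lurkers = visitors - occasional - active  # the remainder only read
    return {"lurkers": lurkers, "occasional": occasional, "active": active}


# Figures quoted in this talk: about 6,000 visitors, three of whom made edits.
VISITORS = 6000
EDITORS = 3

expected = split_90_9_1(VISITORS)
observed_editor_share = EDITORS / VISITORS * 100

print("Rule-of-thumb split:", expected)
print(f"Observed editors: {EDITORS} of {VISITORS} visitors "
      f"({observed_editor_share:.2f}%)")
```

Either way the point is the same: the people doing the editing are a very small fraction of the people visiting.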
It’s not all doom and gloom though. Greg is a retired alpaca farmer from Australia. So far, he’s made over one thousand edits to CATE and has spent hours uploading new descriptions to the site. We would never have anticipated that someone like Greg would contribute more to CATE, at least in terms of time online, than professional taxonomists with whom we already have a good working relationship. It is simply a problem of scale: there are very few professional Araceae taxonomists in the world, and they are all very busy. There are many more amateur botanists with an interest in the Araceae.
Finally, some ideas about making better use of this digital information. Because the data is stored in a database, we can do much more than publish pages online. CATE stores categorical and quantitative information and generates reports of that information on the fly.
We’ve also worked on producing offline copies of CATE for particular genera or geographical regions, formatting the content as a traditional monograph or flora. One example is a treatment of Dracontium as a PDF.
CATE sends its checklist to GBIF’s “Checklist Bank” to help improve the classification of the Araceae in other systems. Every month CATE produces a version of the checklist that GBIF harvests. It can be downloaded by other people, and it is also used within GBIF to improve search results and the view of the data. This is one way we’ve tried to make CATE more useful. GBIF is an appropriate outlet: people with an interest in all the plants of a geographical region, for example Latin America, wouldn’t come to CATE. But by feeding the data to broader sites, we have more of an impact.
In conclusion: clearly getting taxonomic content online is important, but I hope we’ve shown that, if a project is to be successful, sustainability is critical. Sustainability means keeping the information accurate and up to date, which is a big job. Distributing that effort across multiple people is one way of reducing the risk and increasing efficiency, but for that to be successful, you need to think carefully about who your users are, and be prepared to spend time supporting the users and curating the data. What happens after the project ends?
Whatever happened to CATE?
Whatever happened to CATE? Anna Haigh Ben Clark
In brief <ul><li>What we learned from CATE: </li></ul><ul><li>The difference between eTaxonomy and Taxonomy (and why that is important) </li></ul><ul><li>Figuring out what to do on the web (and why that is hard) </li></ul><ul><li>Pitfalls for the unwary (don’t repeat our mistakes) </li></ul>
So what is e-Taxonomy? “We have no doubt that the Internet will play a crucial role in the evolution of taxonomy . . .” House of Lords Select Committee on Science and Technology, 2008 “The role of taxonomy as an information science will increase greatly, most likely as a primarily web-based science” Taxonomy in Europe in the 21st Century (Report prepared for the board of directors of EDIT) “Imagine an electronic page for each species of organism on Earth . . .” Edward O. Wilson
It’s easy, right? Defining the problem that we’re trying to solve is important “If you build it, they will come” is a fallacy “If you build it, will they come?”
Defining the problem <ul><li>Until recently publishing looked like this: </li></ul><ul><li>Setup costs are large </li></ul><ul><li>Small runs are expensive </li></ul><ul><li>Errors can’t be corrected </li></ul><ul><li>Quality control is paramount </li></ul><ul><li>Experts are required </li></ul>
The internet <ul><li>The web solves this: </li></ul><ul><li>Setup costs are (quite) small </li></ul><ul><li>Distribution is (nearly) free </li></ul><ul><li>Re-publication is trivial </li></ul><ul><li>Collaboration is easy </li></ul><ul><li>“Many eyes make all errors shallow” </li></ul>
So what is e-Taxonomy? <ul><li>e-Taxonomy is part of a more general “Participation – Collaboration” pattern </li></ul><ul><li>The starting point is the first revision </li></ul><ul><li>The intent is that the revision should be complete and accurate </li></ul><ul><li>This requires ongoing effort </li></ul><ul><li>Collaboration with your users is an option which we explored </li></ul>Photo: Peter Boyce
Given that definition <ul><li>Success would be a web revision which: </li></ul><ul><li>Provides an up-to-date classification of the Aroids </li></ul><ul><li>Is a useful reference for people who want to know about the Araceae </li></ul><ul><li>Sustains itself by attracting users who help to maintain it </li></ul>Photo: Peter Boyce
What should we do? <ul><li>Talk to stakeholders </li></ul><ul><li>Our “committee of experts” gave the project credibility </li></ul><ul><li>They did not generally use the site themselves </li></ul>
Participation <ul><li>6,000 different people visit CATE Araceae </li></ul><ul><li>1,500 have visited 50+ times </li></ul><ul><li>100 have registered </li></ul><ul><li>Only three have made edits to CATE </li></ul><ul><li>The 90-9-1 rule of online engagement </li></ul>
Greg Sent: Thursday, September 09, 2010 5:50 AM To: [email_address] Hi, I have joined the site and was wondering what I can do to help? Cheers, Greg Ruckert Nairne, South Australia
What happened to CATE? <ul><li>Getting taxonomic content online is important </li></ul><ul><li>Sustaining a taxonomic resource is just as hard </li></ul><ul><li>Sustainability means distributing effort across people </li></ul><ul><li>People need curation, just like content does </li></ul>