1. Biodiversity Data
vs. the Web 2.0
OR
How I learned to stop worrying and
love the “systems”
Ana Dal Molin
J. B. Woolley
Texas A&M University
4. • Data providers
• Aggregators
• Tools
• etc
“growth in bioinformatics data
exceeded Moore’s Law, the well-
known observation that the number
of transistors on a chip doubles
every 18 months.”
(Butte, 2001, TRENDS in Biotechnology 19(5))
• Johnson, N. 2007. Annual Rev. Entomology
• http://www.ala.org.au/about-the-atlas/downloadable-tools/tools-review/
• IDigBio
47*
6. • Museums often have already decided on a
model/database system
• Individual researchers, on the other hand, may not
have, which raises questions:
– Content management systems (CMS)?
– Which output?
– Stability?
– Best practices?
7. ‘systems’ available
• First Generation: desktop-based (MS Access,
FileMaker)
• Second Generation: desktop-based with web output
• Third Generation: content management systems
(PHP, Ruby, MySQL, etc.)
12. • Taxonomy as 2-natured science
• Shifts in media format
13. Web 1.0 -> Web 3.0
1.0: Static HTML, e-mail, forums, chat
2.0: Dynamic HTML, Wikis, blogging,
commenting, social networking
3.0: …
*You and your work are not invisible before
publication*
14. • Web 3.0:
– “Social”
– Tags
– Cloud computing
– Ubiquitous connectivity
– Open technologies, open data formats (and open identity
too)
– Publishing in languages specifically designed for data
(databases, markup)
– Semantic web
– Marketing
23. • Web 3.0
1. People lie
2. People are lazy
3. People are stupid
4. Mission: impossible – know
thyself
5. Schemas aren’t neutral
6. Metrics influence results
7. There’s more than one way to
describe something
C. Doctorow, Metacrap, 2001
24. Issues
• “Unification”* is not going to happen: curators and
researchers will always have their own (although often
largely overlapping) set of crucial information fields,
which can be cross-linked
• These days, it is imperative that databases
communicate with each other
• ‘unitary taxonomy’ is also not possible and any big
database needs to allow the system to display
conflicting ideas
* Thomas, C. “Biodiversity databases spread, prompting unification
call”, Science v. 325 (2009)
** http://hymao.org
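A minimal sketch of the two points above: cross-linking databases that keep their own fields, and displaying conflicting ideas instead of forcing one answer. All field names, identifiers, and taxon names are hypothetical and invented for illustration; they do not come from any real database.

```python
from collections import defaultdict

# Hypothetical schemas: each database keeps its own field names.
MUSEUM_TO_SHARED = {"catalog_no": "specimen_id", "sci_name": "taxon",
                    "coll_date": "date_collected"}
RESEARCHER_TO_SHARED = {"id": "specimen_id", "species": "taxon",
                        "host_plant": "host"}


def to_shared(record, mapping):
    """Rename mapped fields to a shared vocabulary; keep unmapped fields as-is."""
    return {mapping.get(key, key): value for key, value in record.items()}


def cross_link(records_a, records_b, key="specimen_id"):
    """Group records from two sources by a shared key without merging them."""
    linked = defaultdict(list)
    for source, records in (("a", records_a), ("b", records_b)):
        for record in records:
            linked[record[key]].append((source, record))
    return dict(linked)


def conflicting_fields(linked_records):
    """Return the fields on which linked sources disagree, keeping every value."""
    values = defaultdict(dict)
    for source, record in linked_records:
        for field, value in record.items():
            values[field][source] = value
    return {f: v for f, v in values.items() if len(set(v.values())) > 1}


# Invented example records (placeholders, not real specimens or names).
museum = [to_shared({"catalog_no": "X-0001", "sci_name": "Genus alpha",
                     "coll_date": "1998-06-01"}, MUSEUM_TO_SHARED)]
researcher = [to_shared({"id": "X-0001", "species": "Genus beta",
                         "host_plant": "Malus"}, RESEARCHER_TO_SHARED)]

linked = cross_link(museum, researcher)
conflicts = conflicting_fields(linked["X-0001"])
print(conflicts)
```

Neither source is overwritten: the two taxon opinions are stored side by side and reported as a conflict, while fields that only one source holds (collection date, host) stay available through the link.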
26. Data ephemerality
• Digital data preservation: Internet Archive, IIPC
• Library of Congress discussions and recommendations
– Disclosure, Adoption, Transparency, External dependency, Technical
protection
• http://www.digitalpreservation.gov/formats
29. User perspective
“Incomplete” sites
Dynamic information
Selective information?
31. Online databases are both a taxonomic product and
marketing for your work
Online biodiversity databases complement your
work
But it is up to you to make the user
understand that your work is more than that
The user of online databases is probably not the
same as the person who will get your paper
33. … or work with a journal/team that can help you
• Make sure the system is flexible enough in your hands
• Decide who will do the maintenance of your data
– How big is your team?
– Fluidity (positive and negative)
• Think about stability and backup strategies
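One possible backup routine, sketched under assumptions: records are exported to plain CSV (an open, widely readable format) and a SHA-256 checksum is stored next to the file so a later copy can be verified. The function names and the `.sha256` sidecar convention are illustrative choices, not part of any particular system mentioned in these slides.

```python
import csv
import hashlib
from pathlib import Path


def export_csv(records, path):
    """Write records (a list of dicts) to a UTF-8 CSV file."""
    fieldnames = sorted({key for record in records for key in record})
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)  # missing fields are written as empty cells


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def backup_with_checksum(records, path):
    """Export records and store the checksum in a sidecar file."""
    export_csv(records, path)
    digest = sha256_of(path)
    Path(str(path) + ".sha256").write_text(digest + "\n", encoding="utf-8")
    return digest


def verify_backup(path):
    """Check that a backup file still matches its stored checksum."""
    expected = Path(str(path) + ".sha256").read_text(encoding="utf-8").strip()
    return sha256_of(path) == expected
```

Running `verify_backup` on each copy (local disk, institutional server, external archive) is a cheap way to detect silent corruption before the original is lost.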