Going Digital

Invited talk given at The Natural History Museum, London, 17 March 2009 (I gave a very similar talk at the Department of Zoology, University of Stockholm, 12 March 2009).

Published in: Technology, Education

Transcript of "Going Digital"

  1. 1. Going Digital Rod Page Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  2. 2. If you are not online you are invisible (Web 1.0)
  3. 3. All useful information will be online (Web 1.0)
  4. 4. Value is explicit and based on usage (links) (Web 1.0)
  5. 5. Reputation is created… (Web 2.0)
  6. 6. …not conferred by authority (Web 2.0)
  7. 7. Everything will have a URL (Web 3.0)
  8. 8. Yes, I’ve drunk the Kool Aid
  9. 9. …but I’m not alone
  10. 10. Social networking
  11. 11. Dinosaurs ban it
  12. 12. Scaremongers say it causes cancer
  13. 13. Some “get it”
  14. 14. Some do real work with it
  15. 15. #uksnow
  16. 16. @kzelnio Could you do me a favour: 10.1016/j.anbehav.2008.12.017
  17. 17. Where is the (digital) museum? Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  18. 18. www.nhm.ac.uk
  19. 19. Zoology
  20. 20. This dataset is not accessible by the public. For more information please contact the Department of Zoology.
  21. 21. Silo http://www.flickr.com/photos/kenmccown/132990634/
  22. 22. 404
  23. 23. GBIF http://www.flickr.com/photos/chrisfreeland/3306689322/
  24. 24. Top 10 GBIF data providers
  25. 25. League table (columns: Museum | GBIF data | Open access journal | Staff publications online | Social networking): Twitter, yes 3,446,016 Facebook, Youtube, etc. (searchable yes collections) (Twitter) 412,797 (planned) (planned)
  26. 26. Why go digital? Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  27. 27. Diverse kinds of data
  28. 28. Apomys datae
  29. 29. Apomys specimen
  30. 30. How do we integrate these data?
  31. 31. Why integrate?
  32. 32. Learn stuff we don’t know
  33. 33. • There are known knowns, things we know that we know • There are known unknowns, things we now know we don’t know • But there are also unknown unknowns, things we do not know we don’t know
  34. 34. Unknown knowns
  35. 35. Things we know …without knowing that we know
  36. 36. Melissotarsus insularis
  37. 37. Melissotarsus insularis → no hit; Melissotarsus insularis → CASENT0107663-D01; CASENT0107663-D01 → DQ176312; DQ176312 → Melissotarsus sp. BLF m1; Melissotarsus sp. BLF m1 = Melissotarsus insularis
  38. 38. No one source has all the answers
  39. 39. Joining the dots
  40. 40. Identifiers
  41. 41. Digital Object Identifier (DOI)
  42. 42. Identifies a publication
  43. 43. Globally unique
  44. 44. 10.1016/j.ympev.2006.04.006
  45. 45. Paper
  46. 46. Why have DOIs?
  47. 47. Link rot
  48. 48. Refs
  49. 49. Cites 2006 2006
  50. 50. Forward Cites 2006 2009
  51. 51. Shoulders of giants
  52. 52. progress is incremental
  53. 53. reuse past results
  54. 54. Forward Cites 2006 2008
  55. 55. Species Genes
  56. 56. data linking
  57. 57. data citation
  58. 58. http://iphylo.org/~rpage/challenge
  59. 59. demo
  60. 60. Vision
  61. 61. Chromis circumaurea
  62. 62. What should museums do? Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  63. 63. Do nothing Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  64. 64. The new
  65. 65. Don’t try this at home • Image storage (Flickr) • Video storage (YouTube, Vimeo) • Bibliographies (Connotea, Mendeley) • Social networking (Facebook, Twitter) • Annotation (CMS, Wikis, Blogs) • Bulk storage (Amazon S3) • Bulk computing (Amazon EC2)
  66. 66. Make it easy Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  67. 67. http://www.flickr.com/photos/scobleizer/2256358640/
  68. 68. http://taxonomy.zoology.gla.ac.uk/rod/treeview.html
  69. 69. http://abacus.gene.ucl.ac.uk/software/paml.html
  70. 70. http://mrbayes.csit.fsu.edu/
  71. 71. http://www.tree-puzzle.de/
  72. 72. http://atgc.lirmm.fr/phyml/
  73. 73. No branding
  74. 74. No corporate style
  75. 75. No permission needed
  76. 76. Institution provides infrastructure…
  77. 77. …then gets out of the way
  78. 78. Top five European papers in evolutionary biology 1996-2006 1,118 – 4,512 citations
  79. 79. Partnerships (EOL) Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  80. 80. “Dance of the initiatives” Christine Hine
  81. 81. Danger of too much money
  82. 82. Million Dollar Page
  83. 83. EOL in its present form
  84. 84. sucks
  85. 85. Can I do science with it?
  86. 86. Not yet…
  87. 87. Intellectual Property Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  88. 88. Fear
  89. 89. Ignorance
  90. 90. Silo http://www.flickr.com/photos/kenmccown/132990634/
  91. 91. AMNH Conditions 1. Except as otherwise expressly stated herein, the information, records, or images in these databases may not be reproduced, distributed, or publicly displayed, in whole or in part, without the express written permission of the American Museum of Natural History (AMNH). 2. AMNH does not grant permission for anyone to use, download, reproduce, publicly display, distribute, or reprint all or substantially all of the information, records, or images in the database. 3. Subsets of the information, records, or images in the database may be used, downloaded, reproduced, publicly displayed, distributed, or reprinted strictly for educational, scientific, scholarly, and other non-profit uses provided that AMNH is appropriately cited as the source of the information. 4. Subsets of the records from the database downloaded for use with data from other data sets must be clearly identified by the attribution “AMNH.” 5. Data are provided to individual users with the understanding that said data will not be passed on to third parties or redistributed, except with approval from AMNH. 6. …
  92. 92. 2. AMNH does not grant permission for anyone to use, download, reproduce, publicly display, distribute, or reprint all or substantially all of the information, records, or images in the database.
  93. 93. Elachistocleis ovalis http://www.flickr.com/photos/lleonebio/3328398741/
  94. 94. You know more than the AMNH database does!
  95. 95. DQ283405 FEATURES Location/Qualifiers source 1..2400 /organism="Elachistocleis ovalis" /organelle="mitochondrion" /mol_type="genomic DNA" /specimen_voucher="AMNH A141136" /db_xref="taxon:367647" /country="Guyana: Dubulay Ranch on the Berbice River, 200ft, 5'40'55N, 57'51'32W" misc_RNA <1..>2400 /note="contains 12S ribosomal RNA, tRNA-Val, and 16S ribosomal RNA"
  96. 96. Tens of thousands of copies all around the world
  97. 97. AMNH Conditions 1. Except as otherwise expressly stated herein, the information, records, or images in these databases may not be reproduced, distributed, or publicly displayed, in whole or in part, without the express written permission of the American Museum of Natural History (AMNH). 2. AMNH does not grant permission for anyone to use, download, reproduce, publicly display, distribute, or reprint all or substantially all of the information, records, or images in the database. 3. Subsets of the information, records, or images in the database may be used, downloaded, reproduced, publicly displayed, distributed, or reprinted strictly for educational, scientific, scholarly, and other non-profit uses provided that AMNH is appropriately cited as the source of the information. 4. Subsets of the records from the database downloaded for use with data from other data sets must be clearly identified by the attribution “AMNH.” 5. Data are provided to individual users with the understanding that said data will not be passed on to third parties or redistributed, except with approval from AMNH. 6. …
  98. 98. You are going digital whether you like it or not…
  99. 99. If it is on the web it will be found, and used
  100. 100. This is a good thing
  101. 101. Creative Commons
  102. 102. Why be open? Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  103. 103. Caeciliidae
  104. 104. Caeciliidae
  105. 105. Caeciliidae
  106. 106. Pagellus erythrinus
  107. 107. Pagellus erythrinus
  108. 108. Pagellus erythrinus
  109. 109. Mannophryne trinitatis MVZ 199838 MVZ 199828 (Aneides flavipunctatus)
  110. 110. Errors in databases
  111. 111. Errors in publications
  112. 112. The Carmen Electra argument for Open Access
  113. 113. treemap
  114. 114. reuse data
  115. 115. Electra pilosa
  116. 116. Carmen Electra versus Electra (guess who wins…)
  117. 117. reuse data
  118. 118. Homo sapiens
  119. 119. AJ711044
  120. 120. should be AJ971044
  121. 121. How do we find and fix these errors?
  122. 122. Don’t release data until it is “perfect” (wrong)
  123. 123. “given enough eyes, all bugs are shallow” Eric S Raymond
  124. 124. Credit Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  125. 125. Google Page Rank: A = 1.49, B = 0.78, C = 1.58, D = 0.15
  126. 126. Page rank for web page
  127. 127. Scientific citation
  128. 128. H-index for authors
  129. 129. Impact factor for journals
  130. 130. What about an impact factor for data?
  131. 131. Metric of the value of the data
  132. 132. Incentive to have globally unique, citable identifiers
  133. 133. What to digitise first? Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
  134. 134. First digitise that which has been cited
  135. 135. W D Lang Nature 139, 191 (1937) doi:10.1038/139191a0
  136. 136. http://www.flickr.com/photos/mtl_shag/1403957285/
  137. 137. www.nhm.ac.uk
  138. 138. V S Smith
  139. 139. …end Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
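
Slides 37 to 39 argue that no one source has all the answers and that shared identifiers let us join the dots. A minimal sketch of that idea, reading slide 37 as a chain from name to specimen to sequence to name. The three dictionaries are hypothetical in-memory stand-ins for separate databases; the slides do not say which services were queried. Only the identifiers themselves come from the slide.

```python
# Hypothetical stand-ins for three independent databases.
# Only the identifiers are taken from slide 37; the table layout is assumed.
SPECIMENS_BY_NAME = {"Melissotarsus insularis": ["CASENT0107663-D01"]}
SEQUENCES_BY_VOUCHER = {"CASENT0107663-D01": ["DQ176312"]}
NAME_BY_SEQUENCE = {"DQ176312": "Melissotarsus sp. BLF m1"}


def names_linked_by_identifiers(name):
    """Follow name -> specimen -> sequence -> name and return other names found."""
    linked = set()
    for voucher in SPECIMENS_BY_NAME.get(name, []):
        for accession in SEQUENCES_BY_VOUCHER.get(voucher, []):
            other = NAME_BY_SEQUENCE.get(accession)
            # Keep only names that differ from the one we started with.
            if other and other != name:
                linked.add(other)
    return linked


linked_names = names_linked_by_identifiers("Melissotarsus insularis")
print(linked_names)
```

No single table here connects the two names; the link only appears once the specimen code and the accession number are followed across sources.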
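
Slides 41 to 47 present the DOI as a globally unique identifier that guards against link rot. As a rough illustration, a DOI can be turned into a stable link by prefixing it with a resolver address. The resolver `https://doi.org/` is the current public DOI proxy and is not named on the slides; the function name `doi_to_url` is invented for this sketch.

```python
from urllib.parse import quote

# Public DOI proxy (an assumption of this sketch, not named in the talk).
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Build a resolver URL from a bare DOI or a 'doi:'-prefixed string."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("10.1016/j.ympev.2006.04.006"))
print(doi_to_url("doi:10.1038/139191a0"))
```

The link keeps working even if the publisher moves the paper, because the resolver rather than the citing page tracks the current location.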
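
Slide 95 shows that the GenBank record DQ283405 already carries the specimen voucher "AMNH A141136", so the link from sequence to museum specimen can be read straight out of the record. A minimal sketch of that extraction, assuming the qualifiers appear as `/key="value"` pairs as on the slide. The excerpt is abridged from slide 95, the helper name `parse_qualifiers` is invented here, and this is not a full GenBank parser.

```python
import re

# Abridged from the feature table on slide 95.
FEATURES = '''source 1..2400
/organism="Elachistocleis ovalis"
/organelle="mitochondrion"
/mol_type="genomic DNA"
/specimen_voucher="AMNH A141136"
/db_xref="taxon:367647"
'''

# Matches qualifiers of the form /key="value" (values without embedded quotes).
QUALIFIER = re.compile(r'/(\w+)="([^"]*)"')


def parse_qualifiers(text):
    """Return qualifiers as a dict; a repeated key keeps only its last value."""
    return dict(QUALIFIER.findall(text))


qualifiers = parse_qualifiers(FEATURES)
# Split the voucher into an institution code and a catalogue number.
institution, catalogue_number = qualifiers["specimen_voucher"].split(maxsplit=1)
print(qualifiers["organism"], institution, catalogue_number)
```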
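
Slide 125 shows PageRank values for four pages (A 1.49, B 0.78, C 1.58, D 0.15) as a model for measuring value through links. The sketch below uses the simple non-normalised form PR(p) = (1 - d) + d × Σ PR(q)/out(q) over pages q linking to p. The link structure in `LINKS` is an assumption: the transcript does not record the arrows, and this graph was chosen because, with damping d = 0.85, it reproduces the four values on the slide.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / out_degree(q)) over inbound q."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Each linking page shares its rank equally among its outbound links.
            inbound = sum(
                rank[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_rank[page] = (1 - damping) + damping * inbound
        rank = new_rank
    return rank


# Assumed link structure (not in the transcript): A->B, A->C, B->C, C->A, D->C.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

ranks = pagerank(LINKS)
for page in sorted(ranks):
    print(page, round(ranks[page], 2))
```

Page D has no inbound links and so keeps only the baseline 1 - d, while C collects rank from three pages. The same arithmetic would apply to a citation graph of datasets, which is what an "impact factor for data" would need.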