#LAWDI Open Context, publishing linked data in archaeology

  1. A Publication Approach to Linked Data in Archaeology. Eric C. Kansa, UC Berkeley / OpenContext.org. Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>
  2. • Started in 2007
     • Open access / open data publishing for archaeology
     • Archiving by California Digital Library
     • Referenced by NSF and NEH for grant data management
  3. My Precious Data?
  4. Data Sharing as Publication
     • Several projects studying editorial + publishing workflows
     • Current Funding: ACLS, NEH, Sloan, EOL
  5. Web of Data: Cross-discipline Connections. Open Context links with humanities data (CIDOC, Pleiades, British Museum) and natural sciences (EOL, UBERON).
  6. Pelagios API
  7. EOL Computable Data Challenge (Ben Arbuckle, Sarah Kansa, Eric Kansa)
  8. EOL Computable Data Challenge
     1. 15 different sites
     2. 34 zooarchaeologists
     3. Publishing: decoding, cleanup, metadata documentation
     4. Linked Data annotation (EOL, UBERON, biometrics)
     5. Collaborative analysis
     6. Reuse itself studied by DIPIR.org (U. Michigan ISchool)
  9. Data Publishing: Google / Open Refine
     1. Check consistency
     2. Edit functions
     3. All changes logged, can be rolled back
  10. Bibliography
     • Bibliographic references expressed as Linked Data (modeled after S. Heath)
     • Associates publication citation with Open Access variants
  11. Why UBERON?
     1. Expresses relevant expert knowledge, tremendous effort. Why ignore or duplicate this effort?
     2. Anatomic entities related to embryology, genetic networks. New research opportunities for zooarch?
     3. Zooarchaeology gains stakeholders (biometric data of wide interest)
  12. “Ovis aries” http://eol.org/pages/311906/ (labels shown: Code: 14; Domestic sheep; Code: 70; Code: 16; Ovis aries; Code: 15; Sheep; O. aries; Schaf; Sh.)
  13. “Distal epiphysis unfused” http://opencontext.org/vocabularies/open-context-zooarch/zoo-0058 (labels shown: dist. unfused; d. uf.; 30; uf. dist., f. prox.; Distal epiph. unfused; Distal end unf.)
  14. Sheep/Goat Distal Femur Fusion [chart: percentage unfused vs. fused, 0% to 100%, by site: Karain B Cave (N=53), Pınarbaşı (N=3), Çukuriçi Höyük (N=13), Suberde (N=0), Domuztepe (N=28), Ulucak (N=15)]
  15. “Distal epiphysis unfused” http://opencontext.org/vocabularies/open-context-zooarch/zoo-0058
  16. DIPIR: Data Documentation Practices. “I use an Excel spreadsheet… which I… inherited from my research advisers. …my dissertation advisor was still recording data for each specimen on paper when I was in graduate school so that's what I started… then quickly, I was like, ‘This is ridiculous.’ …I just started using an Excel spreadsheet that has sort of slowly gotten bigger and bigger over time with more variables or columns… I've added… color coding… I also use… a very sort of primitive numerical coding system, again, that I inherited from my research advisers… So, this little book that goes with me of codes which is sort of odd, but… we all know that a 14 is a sheep.” (CCU13) A long way to go before we get usable, intelligible data.
  17. CC-BY (Eduardo Otubo) http://www.flickr.com/photos/otubo/5091378744
  18. SPARQL endpoint easy to break (too big of a graph to query). Needed a work-around, so I also use the normal (“plain web”) index to query the British Museum.
  19. (1) Keyword search for relevant term.
      (2) Scrape results (blech!) for item identifiers (“objectid” parameter in URLs).
      (3) Use ObjectIDs in SPARQL queries (limits size of graph queried, so server doesn’t die).
  20. SELECT ?s ?oPart ?oThes ?oLab
      WHERE {
        ?s <http://collection.britishmuseum.org/id/crm/bm-extensions/codex_id> $objectID ;
           <http://collection.britishmuseum.org/id/crm/P46F.is_composed_of> ?oPart .
        ?oPart <http://collection.britishmuseum.org/id/crm/P45F.consists_of> ?oThes .
        ?oThes <http://www.w3.org/2004/02/skos/core#prefLabel> ?oLab .
      } LIMIT 10
  21. Why is linked data important?
     1. Improve data quality, expert curation of concepts + vocabularies
     2. Develop ties with other research communities (can feed back into collecting new / different data)
     3. Increasingly sophisticated open source tools, support services
     4. Part of the Web, not just on the Web
  22. Why is linked data important? … but participating in Linked Data requires effort!
  23. Image Credit: Copyright New Line Cinema
  24. One does not simply share usable data…
  25. Data are challenging
     1. “Raw data” often problematic, even with documentation (10X effort needed with decoded data)
     2. Tension between modeling needs and familiarity with tools (Excel)
     3. More work needed modeling research methods (esp. sampling, see DIPIR.org outcomes)
     4. You’re never going to be done!
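
Slides 12 and 13 show many local labels and codes resolving to one shared identifier. A minimal Python sketch of that kind of annotation step follows. It is an illustration only, not Open Context's implementation: the lookup table, the `normalize` and `annotate` helpers, and the per-project code table are all hypothetical. Only the two URIs and the text labels come from the slides. Numeric codes such as 14 are deliberately left out of the shared table, since a code only means something within the project that defined it.

```python
# URIs as given on slides 12 and 13.
EOL_SHEEP = "http://eol.org/pages/311906/"
ZOO_DISTAL_UNFUSED = (
    "http://opencontext.org/vocabularies/open-context-zooarch/zoo-0058"
)

# Hypothetical lookup table built from the text labels shown on the slides.
LABEL_TO_URI = {
    "ovis aries": EOL_SHEEP,
    "o. aries": EOL_SHEEP,
    "domestic sheep": EOL_SHEEP,
    "sheep": EOL_SHEEP,
    "schaf": EOL_SHEEP,
    "sh.": EOL_SHEEP,
    "dist. unfused": ZOO_DISTAL_UNFUSED,
    "d. uf.": ZOO_DISTAL_UNFUSED,
    "distal epiph. unfused": ZOO_DISTAL_UNFUSED,
    "distal end unf.": ZOO_DISTAL_UNFUSED,
}


def normalize(label):
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.strip().lower().split())


def annotate(label, project_codes=None):
    """Return the URI for a local label, or None if it is unknown.

    project_codes is an optional, hypothetical per-project table for
    codes (for example {"14": EOL_SHEEP}) that are ambiguous elsewhere.
    """
    key = normalize(label)
    if project_codes and key in project_codes:
        return project_codes[key]
    return LABEL_TO_URI.get(key)
```

Calling `annotate("Schaf")` and `annotate("  Ovis  aries ")` both return the EOL URI, while `annotate("14")` returns `None` unless a project-specific table is supplied. Unknown labels are left unannotated rather than guessed, which keeps the editorial decision with a person.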
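
Steps (2) and (3) of the work-around on slides 18 to 20 can be sketched in Python as follows. This is a sketch under stated assumptions, not the author's code: the function names are hypothetical, the example URLs in the usage note are placeholders on example.org, and it assumes that object IDs are plain digit strings substituted into the query as quoted literals. The slides do not say how `$objectID` is actually filled in. No request is sent; the sketch only extracts identifiers and builds the query text.

```python
import re
from string import Template
from urllib.parse import parse_qs, urlparse

# Query text from slide 20, with $objectID left as the placeholder.
QUERY_TEMPLATE = Template("""\
SELECT ?s ?oPart ?oThes ?oLab
WHERE {
  ?s <http://collection.britishmuseum.org/id/crm/bm-extensions/codex_id> $objectID ;
     <http://collection.britishmuseum.org/id/crm/P46F.is_composed_of> ?oPart .
  ?oPart <http://collection.britishmuseum.org/id/crm/P45F.consists_of> ?oThes .
  ?oThes <http://www.w3.org/2004/02/skos/core#prefLabel> ?oLab .
} LIMIT 10
""")


def extract_object_ids(urls):
    """Collect distinct "objectid" query parameters, in first-seen order."""
    ids = []
    for url in urls:
        params = parse_qs(urlparse(url).query)
        for name, values in params.items():
            # Match the parameter name case-insensitively.
            if name.lower() != "objectid":
                continue
            for value in values:
                if value not in ids:
                    ids.append(value)
    return ids


def build_query(object_id):
    """Fill the template for one object ID.

    Assumption: IDs are digit strings and are inserted as quoted literals.
    Anything else is rejected so that scraped text cannot alter the query.
    """
    if not re.fullmatch(r"\d+", object_id):
        raise ValueError(f"unexpected object ID: {object_id!r}")
    return QUERY_TEMPLATE.substitute(objectID=f'"{object_id}"')
```

For example, `extract_object_ids(["http://example.org/item?objectId=123&partId=1"])` yields `["123"]`, and `build_query("123")` returns the query with `"123"` in place of the placeholder. Querying one ID at a time keeps each request small, which is the point of step (3).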
