Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation

This presentation discusses how a model of “data sharing as publishing” can contribute to developing Linked Open Data resources in archaeology and the study of the ancient world. The paper gives examples from Open Context’s developing approach to data editing, documentation and quality improvement processes. The goal of these efforts is to better align the professional interests of individual researchers with the needs of the larger community to access and use high-quality data in Linked Data scenarios.

Published in: Technology, Education

  1. 1. Publishing to the “Web of Data” in Archaeology: Quality and Workflows Eric Kansa UC Berkeley / OpenContext.org Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>
  2. 2. Web of Data (2011) Main Contributors: ● Institutions (esp. government) ● Thematic collections / projects
  3. 3. Thousand Flowers ● Open access, open licensed data ● Archiving by California Digital Library ● Persistent Identifiers (DOIs, ARKs) ● Web services ● NSF/NEH links for data management plans
  4. 4. Thousand Flowers Fills a Gap: Most data sources are institutional. Open Context publishes individual, small group contributions
  5. 5. Thousand Flowers Fills a Gap: Most data sources are institutional. Open Context publishes individual, small group contributions. Challenge: Diverse contributions, needing lots of work to clean up and “link”
  6. 6. • 3-year project Oct 2010 – Sep 2013 • Funded with a National Leadership Grant from the Institute for Museum and Library Services, LG-06-10-0140-10, “Dissemination Information Packages for Information Reuse” • Ixchel Faniel, PI & Elizabeth Yakel, Co-PI http://www.dipir.org
  7. 7. Open Context Interviewees • 22 Ph.D. or graduate students interviewed – 13 men – 9 women • Novices / Experts – 19 experts – 3 novices • Interviewees who were curators, or professors who also had a curatorial role = 6
  8. 8. Open Context Interviewees
  9. 9. Data Documentation Practices “I use an Excel spreadsheet…which I … inherited from my research advisers. …my dissertation advisor was still recording data for each specimen on paper when I was in graduate school so that's what I started …then quickly, I was like, ‘This is ridiculous.’ … I just started using an Excel spreadsheet that has sort of slowly gotten bigger and bigger over time with more variables or columns…I've added …color coding…I also use…a very sort of primitive numerical coding system, again, that I inherited from my research advisers…So, this little book that goes with me of codes which is sort of odd, but …we all know that a 14 is a sheep.” (CCU13)
  10. 10. Data Documentation Practices “I use an Excel spreadsheet…which I … inherited from my research advisers. …my dissertation advisor was still recording data for each specimen on paper when I was in graduate school so that's what I started …then quickly, I was like, ‘This is ridiculous.’ … I just started using an Excel spreadsheet that has sort of slowly gotten bigger and bigger over time with more variables or columns…I've added …color coding…I also use…a very sort of primitive numerical coding system, again, that I inherited from my research advisers…So, this little book that goes with me of codes which is sort of odd, but …we all know that a 14 is a sheep.” (CCU13) A long way to go before we get Linked Data
  11. 11. Sometimes data is better served cooked.
  12. 12. Thousand Flowers ● Clean-up and document contributed data ● Map to ArchaeoML ● Mint URIs to entities (potsherds, projects, contexts, people) ● Link to important vocabularies / collections (Pleiades, Encyclopedia of Life) ● Working on CLAROS-based CIDOC-CRM (RDF) representations (not straightforward)
  13. 13. My Precious Data Image Credit: “Lord of the Rings” (2003, New Line), All Rights Reserved Copyright
  14. 14. Data sharing as publication
  15. 15. Data Publishing
  16. 16. Publishing Data Quality and Standards Alignment (1) Check consistency (2) Edit functions (3) Align to common standards (“Linked Data” if applicable) (4) Issue tracking, version control
  17. 17. Publishing Tools of the Trade (1) Google Refine (check, edit, consistency) (2) Mantis (issue-tracker, coordinate edits, metadata creation)
  18. 18. Publishing Project Metadata Column Descriptions
  19. 19. Publishing Entity Reconciliation (1) With Google Refine (2) Implemented, EOL and Pleiades (3) Need more vocabularies! (4) Simple model, not complex ontology mapping
  20. 20. ● CDL Archiving Service ● How do DOIs, ARKs, etc. work with Web and Linked Data? ● Question of granularity and emphasis (archive “objects”)
  21. 21. Summary Outcomes of Publishing Data: (1) Communicate and set expectations about content and quality (2) Organize workflows to improve data quality and usability (3) Make “datasets” first class citizens in world of scholarly communications
  22. 22. Final Thoughts: Publication needs to evolve! (1) Participating in Linked Data is a great goal, but far removed from most everyday practice (2) Researchers need help. (3) 19th century publication norms poorly suited to 21st century methods, research, public goals
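
The clean-up steps named on slides 10, 16 and 19 (decoding a private numeric codebook, checking consistency, and a "simple model" of entity reconciliation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Open Context's actual workflow, which the slides say relies on Google Refine and Mantis. The codebook beyond "14 = sheep", the example.org URIs, the sample rows, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical codebook: only "14 = sheep" comes from the interview quote (CCU13).
# Every other entry, and every URI below, is a placeholder for illustration.
CODEBOOK = {14: "sheep", 15: "goat"}

# Hypothetical label-to-URI table standing in for a shared vocabulary.
# These example.org addresses are not real vocabulary identifiers.
VOCAB_URIS = {
    "sheep": "http://example.org/vocab/sheep",
    "goat": "http://example.org/vocab/goat",
}


def normalize(value):
    """Collapse whitespace and lowercase so ' Sheep' and 'sheep' match."""
    return " ".join(str(value).split()).lower()


def decode(value):
    """Turn a numeric code into its label; pass free text through normalized."""
    text = normalize(value)
    if text.isdigit():
        return CODEBOOK.get(int(text))  # None when the code is not in the codebook
    return text


def reconcile(rows, column):
    """Attach a label and URI to each row; collect rows that could not be linked."""
    linked, issues = [], []
    for index, row in enumerate(rows):
        label = decode(row.get(column, ""))
        uri = VOCAB_URIS.get(label)
        if uri is None:
            # Candidates for an issue tracker: someone has to look at these.
            issues.append((index, row.get(column)))
        linked.append({**row, column + "_label": label, column + "_uri": uri})
    return linked, issues


# Made-up sample rows mixing codes, inconsistent spellings, and an unknown code.
rows = [
    {"specimen": "A-1", "taxon": 14},
    {"specimen": "A-2", "taxon": " Sheep"},
    {"specimen": "A-3", "taxon": "Goat"},
    {"specimen": "A-4", "taxon": 99},
]

linked, issues = reconcile(rows, "taxon")
print(Counter(row["taxon_label"] for row in linked))  # value counts per cleaned label
print(issues)  # rows needing editorial attention
```

A plain lookup table is used here because slide 19 asks for a "simple model, not complex ontology mapping"; values that fail the lookup are surfaced for a person to review rather than guessed at.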
