Interpretation, Context, and Metadata:
Examples from Open Context
Eric Kansa (@ekansa)

Presentation given at the International Digital Curation Conference (#IDCC16) in Amsterdam, at the "A Context-driven Approach to Data Curation for Reuse" workshop (organized by Ixchel Faniel and Elizabeth Yakel) on Monday, February 22, 2016.

Published in: Data & Analytics

  1. Interpretation, Context, and Metadata: Examples from Open Context. Eric Kansa (@ekansa)
  2. Data often discussed using language of compliance (Taylorist perspectives)
  3. ● Linked: Links with other systems & data (tDAR, ORCID, etc.)
     ● Open: Code, data (mainly CC-BY) on GitHub, machine-readable formats, APIs
     ● Long-term: NSF, NEH data management. California Digital Library archiving
     ● Global: Mirroring, collaboration with the German Archaeological Institute (DAI)
  4. ● Role: Publication (editorial & peer-review) and exhibition (like an online museum)
     ● Promote Data Reuse: Attempt to document context, annotate data to common vocabularies. Increasing emphasis on intervening earlier in the research data “life-cycle”.
  5. Spectrum of Less and More Structure
     1. More structured: classification, quantification
     2. Less structured: images, field notes
     3. Structured and less structured information need to cross-reference each other (URIs are useful); all provide context
  6. Open Context ≠ A conventional digital repository
  7. URIs & Unstructured Data
     Information and its stable URI:
     ● 300m wall circumference (estimated based on geomagnetic sounding, approximate): http://arcserver.usc.edu/reports/reports/TAA_2000_to_2007.pdf
     ● Wall foundation about 1.8m thick: http://opencontext.org/media/BF565965-98A8-4E84-2318-AFFA983277E1
     ● Brick dimensions 34 x 31 x 9 cm: http://opencontext.org/subjects/975143F2-B80E-436B-B078-1D67FD848352
     ● Surviving wall height 1.2 meters: http://opencontext.org/subjects/02B9D6E6-D6AD-4138-7FCC-3EF6F8BD5722
     Specific Citation Promotes Reproducibility
     1. Look at lots of pictures, read field notes.
     2. URIs facilitate reproducibility, link assertions with specific information sources
  8. APIs (Machine-Readable Data) make it easier to reuse, analyze, visualize, + interpret less structured data.
  9. Open Context ≠ A conventional digital repository
  10. Challenge of Complexity. Image Credit: Mark Skipper via Flickr (CC-BY) https://www.flickr.com/photos/bitterjug/7670055210
  11. Entity Relation Diagram: Anglo-Saxon Graves and Grave Goods of the 6th and 7th Centuries AD: A Chronological Framework. John Hines (2013) http://dx.doi.org/10.5284/1018290
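  12. Open Context compared with a conventional digital repository:
      ● Citation: Open Context cites archaeological entities (sites, coins, bones, etc.); a digital repository cites digital files (which can contain thousands of items).
      ● Granularity: Open Context is high (“1 URI per potsherd”); a digital repository is low (information aggregated in big files).
      ● Discovery, Querying: Open Context has a common schema and a common index for content, not just metadata; a digital repository indexes metadata only, so content is more opaque.
      ● Cost: Open Context is expensive (“Boutique Publishing”); a digital repository is cheaper and easier to scale, with self-service models.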
  13. Managing Complexity: Data about this coin came from several different files (relational databases, spreadsheets). Some archaeological projects can have dozens of different spreadsheets + databases!
  14. Publishing Workflow. Improve / Enhance: 1. Consistency 2. Context (intelligibility)
  15. Large scale data sharing & integration for exploring the origins of farming. Funded by EOL / NEH
  16. “Bos taurus” http://eol.org/pages/328699 [diagram labels: Code: 14; Cattle; Code: 70; Code: 16; Bos taurus; Code: 15; Cattle, domestic; B. taurus; Cattle (dom.)]
  17. Limitations
      • Diverse recovery, sampling, identification methods…
      • Data modeling problems in sources (esp. teeth)
      • Researchers need to understand how to make data better suited for reuse
  18. Bootstrapping Problem
      • (Linked) Data can feel like having a telephone with nobody to call
      • Links with other data can help build context. But relevance can have a very narrow scope
  19. Pelagios: Geographic context emerging as key way to aggregate multiple datasets (PIs: Leif Isaksen, Elton Barker)
  20. ● Digital Index of North American Archaeology (DINAA): David G. Anderson, Joshua Wells (PIs), NSF-funded.
      ● Publishes a gazetteer of archaeological “site” records (from state agencies). (A site is a key concept in archaeology)
  21. ● Cross-referenced site URIs with relevant records in tDAR and other public databases
  22. PeriodO (http://perio.do)
      • Led by Adam Rabinowitz, Ryan Shaw, Eric Kansa (NEH funding)
      • Sometimes little consensus in context (time periods)
  23. PeriodO Gazetteer of Periods, modeling:
      (1) Temporal scope
      (2) Geographic coverage
      (3) Scholarly authority [because of disagreements about High, Middle, and Low Chronologies]
  24. New Publishing Services
      1. Open Context will publish citable, formally modeled (SKOS) controlled vocabularies
      2. Context-informed reconciliation services to help researchers / curators link data
      3. Offer a recommendation service for relevant vocabularies for researchers (especially those seeking DMP help)
  25. Final Thoughts
      (Finally) some examples of data reuse and integration (in archaeology).
      In many cases, reuse is still aspirational. Need long time scales to develop context.
      “Context” is a hard research problem (including theoretical); requires better practice at each stage of the data life-cycle.
  26. THANK YOU! Special Thanks! DCC, DIPIR Team!
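
Slide 7 pairs each descriptive statement with the stable URI of the record or report that supports it. A minimal Python sketch of that idea follows, assuming the four statement/URI pairs as they appear on the slide. The helper names (`has_http_source`, `uncited`, `group_by_host`) are invented for illustration and are not part of Open Context.

```python
from urllib.parse import urlparse

# (assertion, source URI) pairs copied from slide 7.
ASSERTIONS = [
    ("300m wall circumference (estimated, geomagnetic sounding)",
     "http://arcserver.usc.edu/reports/reports/TAA_2000_to_2007.pdf"),
    ("Wall foundation about 1.8m thick",
     "http://opencontext.org/media/BF565965-98A8-4E84-2318-AFFA983277E1"),
    ("Brick dimensions: 34 x 31 x 9 cm",
     "http://opencontext.org/subjects/975143F2-B80E-436B-B078-1D67FD848352"),
    ("Surviving wall height: 1.2 meters",
     "http://opencontext.org/subjects/02B9D6E6-D6AD-4138-7FCC-3EF6F8BD5722"),
]


def has_http_source(uri):
    """True if the URI is syntactically an http(s) address with a host."""
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def uncited(assertions):
    """Return the assertions that lack a usable source URI."""
    return [text for text, uri in assertions if not has_http_source(uri)]


def group_by_host(assertions):
    """Group assertion texts by the host that publishes their source."""
    grouped = {}
    for text, uri in assertions:
        grouped.setdefault(urlparse(uri).netloc, []).append(text)
    return grouped
```

Calling `uncited(ASSERTIONS)` lists any statement a reader could not trace back to a source; the check is purely syntactic and does not confirm that a URI still resolves.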
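
Slide 16 shows several local codes and labels for cattle pointing at one shared URI. The sketch below illustrates that kind of reconciliation in Python. It assumes each numeric code on the slide is one project's local code for cattle; the project names (`project_a` and so on) are hypothetical, because the slide does not say which dataset used which code, and the function names are invented for illustration.

```python
from collections import Counter

EOL_BOS_TAURUS = "http://eol.org/pages/328699"  # URI shown on slide 16

# Free-text labels from slide 16, normalized to lower case.
LABEL_TO_URI = {
    "bos taurus": EOL_BOS_TAURUS,
    "b. taurus": EOL_BOS_TAURUS,
    "cattle": EOL_BOS_TAURUS,
    "cattle, domestic": EOL_BOS_TAURUS,
    "cattle (dom.)": EOL_BOS_TAURUS,
}

# A numeric code only has meaning inside one project's coding scheme, so
# codes are keyed by (project, code). Project names are hypothetical.
CODE_TO_URI = {
    ("project_a", "14"): EOL_BOS_TAURUS,
    ("project_b", "70"): EOL_BOS_TAURUS,
    ("project_c", "16"): EOL_BOS_TAURUS,
    ("project_d", "15"): EOL_BOS_TAURUS,
}


def reconcile(project, value):
    """Return the shared URI for a local taxon value, or None if unmapped."""
    text = str(value).strip()
    by_code = CODE_TO_URI.get((project, text))
    if by_code is not None:
        return by_code
    # Collapse repeated whitespace and ignore case before the label lookup.
    return LABEL_TO_URI.get(" ".join(text.lower().split()))


def count_by_uri(records):
    """Count (project, value) records per shared URI; unmapped ones fall under None."""
    return Counter(reconcile(project, value) for project, value in records)
```

Keying codes by project is a deliberate choice: the same number can mean different taxa in different datasets, so a bare code with no project context is left unmapped rather than guessed.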
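
Slide 23 lists three things a period gazetteer models: temporal scope, geographic coverage, and scholarly authority. The Python sketch below shows why the third matters, under stated assumptions: it is not PeriodO's actual data model, and the class, field names, and the convention of negative years for BCE are choices made here for illustration. Any period labels, dates, and authorities passed to it would be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodDefinition:
    """One authority's definition of a named period (illustrative model)."""
    label: str
    start_year: int       # negative values mean BCE (convention chosen here)
    end_year: int
    coverage: frozenset   # names of the regions the definition applies to
    authority: str        # who asserted this definition

    def covers(self, year, region):
        """True if the year and region both fall inside this definition."""
        return region in self.coverage and self.start_year <= year <= self.end_year


def definitions_matching(definitions, label, year, region):
    """Return every definition of `label` that includes the given year and region."""
    return [d for d in definitions if d.label == label and d.covers(year, region)]
```

Because each definition carries its authority, two sources can give the same label different date ranges without one overwriting the other, and a query can report which authorities would place a given year inside the period.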
