Publishing and Pushing
Linked Data in Archaeology
Unless otherwise indicated, this work is licensed under a Creative Commons
Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>
Eric C. Kansa (@ekansa)
UC Berkeley D-Lab
& Open Context
Introduction
Challenges in Reusing Data
1. Background
2. Data publishing workflow
3. Data curation and dynamism
“Gold Standard” of
professional contribution
My Precious Data:
Dysfunctional incentives
(poorly constructed metrics)
limit the scope and diversity of
publications
Image Credit: “Lord of the Rings” (2003, New
Line), All Rights Reserved Copyright
Need more carrots!
1. Citation, credit,
intellectually valued
2. Research outcomes
(new insights from data
reuse!)
Why linked data
is so important
EOL Computable Data
Challenge
(Ben Arbuckle, Sarah
W. Kansa, Eric Kansa)
Large scale data sharing &
integration for exploring the
origins of farming.
Funded by EOL / NEH
1. 300,000 bone specimens
2. Complex: dozens, up to 110
descriptive fields
3. 34 contributors from 15
archaeological sites
4. More than 4 person-years
of effort to create the data!
Relatively collaborative bunch,
Ben Arbuckle cultivated
relationships & built trust over
years prior to EOL funding.
Introduction
Challenges in Reusing Data
1. Background
2. Data publishing workflow
3. Data curation and dynamism
1. Open Context is referenced by the
US National Science Foundation and
National Endowment for the
Humanities for Data
Management
2. “Data sharing as
publishing” metaphor
Raw Data: Idiosyncratic,
sometimes highly coded,
often inconsistent
Raw Data Can Be Unappetizing
Publishing Workflow
Improve / Enhance
1. Consistency
2. Context
(intelligibility)
Sometimes data is better
served cooked
- Documentation
- Review, editing
- Annotation
“Ovis orientalis”
http://eol.org/pages/311906/
Code: 14
Wild
sheep
Code: 70
Code: 16
Ovis orientalis
Code: 15
Sheep,
wild
O.
orientalis
Sheep
(wild)
● Controlled vocabulary
● Linked Data applications
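The annotation step shown on the "Ovis orientalis" slide can be sketched in a few lines of Python. This is only an illustration, not Open Context's actual implementation: the URI and the labels are taken from the slide, while the dataset names (`site_a` and so on) and the pairing of each label with a dataset are hypothetical.

```python
# Sketch only: the URI and the labels come from the slide; the dataset names
# and the pairing of labels with datasets are hypothetical.
OVIS_ORIENTALIS = "http://eol.org/pages/311906/"

# Codes are only meaningful inside the dataset that defined them, so the
# lookup key is (dataset, label) rather than the label alone.
TAXON_ANNOTATIONS = {
    ("site_a", "code: 14"): OVIS_ORIENTALIS,
    ("site_b", "code: 70"): OVIS_ORIENTALIS,
    ("site_c", "wild sheep"): OVIS_ORIENTALIS,
    ("site_d", "sheep, wild"): OVIS_ORIENTALIS,
    ("site_e", "o. orientalis"): OVIS_ORIENTALIS,
}


def normalize(label: str) -> str:
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(label.lower().split())


def annotate(record: dict) -> dict:
    """Return a copy of the record with a 'taxon_uri' key added.

    The contributor's original label is kept untouched; unknown labels get
    None so they can be sent back to the data author for review.
    """
    key = (record["dataset"], normalize(record["taxon"]))
    return {**record, "taxon_uri": TAXON_ANNOTATIONS.get(key)}


if __name__ == "__main__":
    rows = [
        {"dataset": "site_a", "taxon": "Code: 14"},
        {"dataset": "site_c", "taxon": "Wild sheep"},
        {"dataset": "site_b", "taxon": "Code: 14"},  # same code, other dataset
    ]
    for row in rows:
        print(annotate(row))
```

Keying on the dataset as well as the label reflects the point that a bare code such as "14" carries no meaning outside the code book that defined it. The original value stays in the record, so the annotation adds to the contributor's data rather than overwriting it.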
“Sheep/goat”
http://eol.org/pages/32609438/
1. Needed to mint new
concepts like
“sheep/goat”
2. Vocabularies need to
be responsive for
multidisciplinary
applications
Linking to UBERON
1. Needed a controlled vocabulary for
bone anatomy
2. Better data modeling than common in
zooarchaeology, adds quality.
Linking to UBERON
1. Models links between anatomy,
developmental biology, and genetics
2. Unexpected links between the
Humanities and Bioinformatics!
7000 BC (many pigs, cattle)
7500 BC (sheep + goat dominate, few pigs, few cattle)
6500 BC (few pigs, mixing with wild animals?)
8000 BC (cattle, pigs,
sheep + goats)
• Not a neat model of progress to adopt a more productive
economy. Very different, sometimes piecemeal adoption in
different regions.
• Separate coastal and inland routes for the spread of domestic
animals, over a 1000-year time period.
Easy to Align
1. Animal taxonomy
2. Bone anatomy
3. Sex determinations
4. Side of the animal
5. Fusion (bone growth, up to
a point)
Hard to Align (poor modeling, recording)
1. Tooth wear (age)
2. Fusion data
3. Measurements
Despite common research methods!!
Professional expectations for data reuse
1. Need better data modeling
(than feasible with, cough,
Excel)
2. Data validation,
normalization
3. Requires training &
incentives for researchers
to care more about quality
of their data!
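As a rough illustration of the validation and normalization called for above, the sketch below checks a single specimen record before it is accepted. The field names (`side`, `fusion`, `length_mm`) and the allowed values are assumptions made for this example; they are not taken from the EOL project's actual schema.

```python
# Hypothetical controlled lists; a real project would publish its own.
ALLOWED_SIDES = {"left", "right", "unknown"}
ALLOWED_FUSION = {"fused", "fusing", "unfused", "unknown"}


def validate(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means clean."""
    problems = []

    side = str(record.get("side", "")).strip().lower()
    if side not in ALLOWED_SIDES:
        problems.append(f"side: unexpected value {record.get('side')!r}")

    fusion = str(record.get("fusion", "")).strip().lower()
    if fusion not in ALLOWED_FUSION:
        problems.append(f"fusion: unexpected value {record.get('fusion')!r}")

    # Measurements are optional, but if present they must be positive numbers.
    length = record.get("length_mm")
    if length is not None:
        try:
            value = float(length)
        except (TypeError, ValueError):
            problems.append(f"length_mm: not a number: {length!r}")
        else:
            if value <= 0:
                problems.append(f"length_mm: must be positive, got {value}")

    return problems


if __name__ == "__main__":
    print(validate({"side": "Left", "fusion": "fused", "length_mm": "23.5"}))
    print(validate({"side": "L?", "fusion": "fused", "length_mm": "abc"}))
```

Returning a list of problems, rather than raising on the first one, suits the back-and-forth with data authors that editing requires: every issue in a record can be reported in a single round.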
Nobody expected their data
to see wider scrutiny either…
… and not just academic
researchers, linked open
data involves many sectors!
Digital Index of North
American Archaeology
(DINAA)
1. State “site files” created
to comply with federal
preservation laws
2. Main record of human
occupation in North
America
3. PIs: David G. Anderson
and Josh Wells
DINAA
1. Stable URI for
each site file.
2. CC-Zero (public
domain)
3. Beginning to link
to controlled
vocabularies
Data are challenging!
1. Decoding takes 10x longer
2. Data management plans should
also cover data modeling, quality
control (esp. validation)
3. More work needed modeling
research methods (esp. sampling)
4. Editing, annotation requires lots of
back-and-forth with data authors
5. Data need investment to be
useful!
Introduction
Challenges in Reusing Data
1. Background
2. Data publishing workflow
3. Data curation and dynamism
Investing in Data is a Continual Need
1. Data and code co-evolve. New
visualizations, analysis may reveal
unseen problems in data.
2. Data and metadata change routinely
(revised stratigraphy requires ongoing
updates to data in this analysis)
3. Problems, interpretive issues in data
(and annotations) keep cropping up.
4. Is publishing a bad metaphor implying
a static product?
Data sharing as publication
Data sharing as open source
release cycles?
Data sharing as publication
AND
Data sharing as open source
release cycles
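One way to picture the "release cycle" half of this metaphor is a dataset release record that pairs a version label with a fingerprint of the data. The sketch below is a hypothetical illustration, not how Open Context versions its data: the function names, the record fields, and the version numbering are all assumptions.

```python
import hashlib
import json


def fingerprint(records: list) -> str:
    """SHA-256 over a canonical JSON serialization of the records.

    Sorting keys makes the hash independent of dictionary key order, so
    only real changes to the data change the fingerprint.
    """
    canonical = json.dumps(
        records, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_release(version: str, records: list, note: str) -> dict:
    """Describe one release: a citable version plus a changelog note."""
    return {
        "version": version,
        "note": note,
        "record_count": len(records),
        "sha256": fingerprint(records),
    }


if __name__ == "__main__":
    first = [{"id": 1, "taxon": "Ovis orientalis", "phase": "II"}]
    # A revised stratigraphy reassigns the specimen to another phase.
    second = [{"id": 1, "taxon": "Ovis orientalis", "phase": "III"}]

    r1 = make_release("1.0.0", first, "Initial publication")
    r2 = make_release("1.1.0", second, "Stratigraphy revised")
    print(r1["version"], r1["sha256"][:12])
    print(r2["version"], r2["sha256"][:12])
```

A fixed version and hash give a citation a stable target, which covers the "publication" side, while the changelog note lets later corrections ship as new releases instead of silently altering what was cited.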
Image Credit: “Brainchildvn” via Flickr (CC-By)
http://www.flickr.com/photos/brainchildvn/3957949195
Not an easy environment to
seek new investments.
Contingent Employment
Source: Washington Monthly (http://ecleader.org/2012/02/21/nation-wide-trend-towards-adjuncts-threatens-higher-ed/)
Bethany Nowviskie (University of Virginia)
Shifts in Career Paths and Professions
(#alt-academy), different publishing
incentives, emerging as data assume
a greater emphasis
Bethany Nowviskie (University of Virginia)
Alt-Acs (contingent, low status) not a
good answer, but reflect wider need
for institutional reform.
One does not simply
walk into Mordor
Academia and share
usable data…
Image Credit: Copyright New Line Cinema
Final Thoughts
Data require intellectual
investment, methodological and
theoretical innovation.
Institutional structures are poorly
configured to support
data-powered research.
New professional roles needed,
but who will pay for it?
Thank you!
University of
Pennsylvania Digital
Humanities Forum and
other Sponsors!

Editor's Notes

  1. The removal of objects is now forbidden in most countries and at many sites in the US. As a result, data collection methods have changed from description of a physical object accessible in the US to a full surrogate for an object that might be re-buried in the ground. Data collection has increased as the collection of objects has decreased. Still, individual systems of data collection (see examples on the right) have emerged, which: have developed over time; are handed down from mentors; and contain some technological adoption, particularly the adoption of Excel spreadsheets over relational databases. In all of our interviews there was no reference to existing guides on archaeological documentation, such as those of the UK's Archaeology Data Service or the Netherlands' DANS.
  2. We used archaeology as a case study. During our 22 semi-structured interviews, archaeologists were asked about their: (1) background and research interests; (2) data reuse experiences, both actual (using the critical incident, i.e. the last time they reused someone else's data for their research) and aspirational (those who had not reused someone else's data were asked what they would need or want in order to do so); (3) views on digital data repositories; (4) data sharing practices.
  8. The removal of objects is now forbidden in most countries and many sites in the US. As a result data collection methods have changed from description of a physical object accessible in the US to a full surrogate for an object that might be re-buried in the ground. Data collection has increased as the collection of objects has decreased.Still individual systems of data collection (see examples on the right) have emerged which have.Developed over timeAre Handed down from mentors Contain some technological adoption, particularly the adoption of Excel spreadsheets over relational databasesIn all of our interviews there was no reference to existing guides, such as the UK: Archaeological Data Service or Netherlands: DANS on archaeological documentation.
  9. The removal of objects is now forbidden in most countries and many sites in the US. As a result data collection methods have changed from description of a physical object accessible in the US to a full surrogate for an object that might be re-buried in the ground. Data collection has increased as the collection of objects has decreased.Still individual systems of data collection (see examples on the right) have emerged which have.Developed over timeAre Handed down from mentors Contain some technological adoption, particularly the adoption of Excel spreadsheets over relational databasesIn all of our interviews there was no reference to existing guides, such as the UK: Archaeological Data Service or Netherlands: DANS on archaeological documentation.
  10. The removal of objects is now forbidden in most countries and many sites in the US. As a result data collection methods have changed from description of a physical object accessible in the US to a full surrogate for an object that might be re-buried in the ground. Data collection has increased as the collection of objects has decreased.Still individual systems of data collection (see examples on the right) have emerged which have.Developed over timeAre Handed down from mentors Contain some technological adoption, particularly the adoption of Excel spreadsheets over relational databasesIn all of our interviews there was no reference to existing guides, such as the UK: Archaeological Data Service or Netherlands: DANS on archaeological documentation.
  11. The removal of objects is now forbidden in most countries and many sites in the US. As a result data collection methods have changed from description of a physical object accessible in the US to a full surrogate for an object that might be re-buried in the ground. Data collection has increased as the collection of objects has decreased.Still individual systems of data collection (see examples on the right) have emerged which have.Developed over timeAre Handed down from mentors Contain some technological adoption, particularly the adoption of Excel spreadsheets over relational databasesIn all of our interviews there was no reference to existing guides, such as the UK: Archaeological Data Service or Netherlands: DANS on archaeological documentation.