An Open Context for Archaeology

The common use by archaeologists of ubiquitous technologies such as computers and digital cameras means that archaeological research projects now produce huge amounts of diverse digital documentation. However, while the technology is available to collect this documentation, we still largely lack community-accepted dissemination channels appropriate for such torrents of data. Open Context (http://www.opencontext.org) aims to help fill this gap by providing open access data publication services for archaeology. Open Context has a flexible and generalized technical architecture that can accommodate most archaeological datasets, despite the lack of common recording systems or other documentation standards. Open Context includes a variety of tools to make data dissemination easier and more worthwhile. Authorship is clearly identified through citation tools, a web-based publication system enables individuals to upload their own data for review, and collaboration is facilitated through easy download and other features. While we have demonstrated a potentially valuable approach for data sharing, we face significant challenges in scaling Open Context up to serve large quantities of data from multiple projects.


Transcript

  • 1. An Open Context for Archaeology: Publishing Research Data on the Web. Eric Kansa, UC Berkeley School of Information. Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>
  • 2. Today
    • My background
    • Sharing Field Documentation
    • Open Context
    • Unresolved Issues and Next Steps
  • 4. Personal Background
    • Anthropology
      • Cultural
      • Archaeology (social)
    • Co-founder of the AAI, a “.org”
    • Currently Executive Director of ISD Program
  • 5. Career Directions
    • Frustrated with the practice of archaeology
      • Data sharing hard / nonexistent
      • Publication = paper
      • Impressionistic, hard to verify claims
    • Opportunity for research
      • Focus on data sharing / communication
  • 6.
    • Independent NGO / nonprofit corporation dedicated to promoting open content for cultural heritage research and education
    • Explore approaches for the community to share (technical, copyright, academic)
    • NOT a repository: Promoting data sharing by creating tools, methods, and exemplars.
  • 7. Today
    • My background
    • Sharing Field Documentation
    • Open Context
    • Unresolved Issues and Next Steps
  • 8. Why Focus on Field Projects and Collections?
    • New Research Opportunities
      • Encourage use and reuse of primary evidence
      • Enable broad scale, analytically rigorous investigations
    • Reduce costs and enhance effectiveness of preservation & access
      • Informal estimates: only 15-27% of research ever gets published [1, 2], often in inaccessible formats
    [1] James H. Ottaway, Jr. “Publish or Be Damned”, a lecture presented for the University of Cincinnati Classics Department, 5/2001.
    [2] Morag Kersel. “Publishing the Past: Some Shocking Statistics”, a lecture presented for the American Schools for Oriental Research annual conference, 2005.
  • 9. Why Primary Research Content?
    • Bumpus (1898) House Sparrow Data
      • Carey Bumpus published all of his raw data along with his syntheses
    • 10 subsequent groundbreaking papers reanalyzed these data
      • Invaluable dataset used for instruction
      • Key Point: Dataset becomes 10X more valuable with dissemination!
  • 10. Ads Screenshot
  • 11. The Conceptual Challenge
    • The content of field documentation (represented in spreadsheets and databases) varies greatly
      • Discipline has 1 foot in humanities, 1 in sciences
      • Archaeological documentation is also rich in media and narrative text
  • 12. Our Approach
    • Explore ways to pool data without overly constraining standards
    • Find / create tools for non-tech expert use and contribution
    • Find / create tools that enable casual browsing / exploratory analyses
  • 13. Our Approach
    • Stay cost-effective! Most archaeological data sharing initiatives are site / project specific.
    • More general solutions needed
  • 14. Global Schemas: ArchaeoML
    • Simple, general schema makes it easier to pool diverse content
    • Not overly determined, support multiple research agendas
    • Hard to implement, but we’re gaining experience
    UML Diagram of a subset of ArchaeoML
  • 15. Other Web Initiatives
    • Web resources using highly generalized data structures see growing popularity
    • Example: OpenRecord
      • Dojo Foundation (leading open source AJAX)
      • “Wiki for databases”
      • Data expressed in RDF triples (queried with SPARQL)
    • Likely needs some added meaningful structure to facilitate discipline specific use
    OpenRecord (www.openrecord.org)
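To make the "RDF triples" idea on this slide concrete, here is a minimal sketch in plain Python. It is not OpenRecord code and it does not use a real RDF library or SPARQL engine. The `item:` identifiers and the `match` helper are assumptions made for illustration. Each statement is a (subject, predicate, object) triple, and a query is a pattern in which `None` plays the role of a SPARQL variable.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical triples; identifiers and predicate names are illustrative only.
TRIPLES: list[Triple] = [
    ("item:231", "type", "Bone"),
    ("item:231", "taxon", "Ovis aries"),
    ("item:232", "type", "Pot"),
    ("item:232", "material", "ceramic"),
    ("item:233", "type", "Pot"),
    ("item:233", "material", "ceramic"),
]


def match(
    triples: list[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return triples matching the pattern; None matches anything."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly the pattern "?item material ceramic".
ceramic_items = [s for s, _, _ in match(TRIPLES, p="material", o="ceramic")]
```

Because every fact has the same three-part shape, very different datasets can share one store, which is also why some added discipline-specific structure is likely needed on top.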
  • 16. Other Web Initiatives Freebase (freebase.com)
  • 17. Other Web Initiatives Freebase (freebase.com)
    • More about this later…
  • 18. Today
    • My background
    • Sharing Field Documentation
    • Open Context
    • Unresolved Issues and Next Steps
  • 19. OCHRE, Open Context
    • OCHRE: Fully supports ArchaeoML global schema using a native XML database and free Java client
    • Open Context: Uses a subset of the ArchaeoML schema (via MySQL/PHP) for web-browser access and Internet search engine indexing.
    • Common services, including complex querying and analysis for pooled content
  • 20. General Approach
    • “Organic” development: originally planned just to use OCHRE
    • PHP/MySQL: Drives many dynamic content websites, relatively simple standard technology. “Bleeding edge” difficult for our target community.
    • Open to Search Engines: Increasingly important research tools (Harley 2006)
    • Easy integration of Open Source Tools (RSS-feeds, ping-back, etc.)
    2 Diane Harley, Sarah Earl-Novell, Jennifer Arter, Shannon Lawrence, and C. Judson King, Jr. “The Influence of Academic Values on Scholarly Publication and Communication Practices”, Research & Occasional Paper Series: CSHE.13.06 <http://cshe.berkeley.edu/publications/docs/ROP.Harley.AcademicValues.13.06.pdf>.
  • 21. Faceted Browse
    • Data from multiple projects browsed, queried (even with Boolean algebra), and results pooled together
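A faceted browse over pooled records can be sketched as follows. This is an illustration only, not Open Context's PHP/MySQL implementation. The record fields and values are assumptions chosen to echo the examples used later in the talk. The idea is to count the values of a facet across all projects, then narrow the pool by picking one value.

```python
from collections import Counter

# Hypothetical pooled records from two projects.
RECORDS = [
    {"project": "Domuztepe", "label": "Bone 231", "material": "bone"},
    {"project": "Domuztepe", "label": "Pot 232", "material": "ceramic"},
    {"project": "Pinarbasi", "label": "Find ID1-A", "material": "ceramic"},
]


def facet_counts(records: list[dict], facet: str) -> Counter:
    """Count how many records carry each value of the given facet."""
    return Counter(r[facet] for r in records if facet in r)


def narrow(records: list[dict], facet: str, value: str) -> list[dict]:
    """Keep only the records whose facet has the selected value."""
    return [r for r in records if r.get(facet) == value]


material_counts = facet_counts(RECORDS, "material")
ceramics = narrow(RECORDS, "material", "ceramic")
ceramic_projects = facet_counts(ceramics, "project")
```

Here the narrowed result still spans both projects, which is the sense in which results are pooled.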
  • 22. Records in Open Context
  • 23. Media Record
  • 24. Records in Open Context Contextual relationships: (Spatial containment)
  • 25. Records in Open Context Contextual relationships: (Stratigraphy)
  • 26. Ownership in Open Context: copyright ownership and Creative Commons license information, expressed in Internet-wide standard metadata that links ownership and permissions
  • 27. Ownership in Open Context Citation information with stable URL direct to the item being cited.
  • 28. Ownership in Open Context Zotero (www.zotero.org) uses COinS (a micro-format) metadata to make bibliographic references
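A COinS reference is an empty HTML `span` with class `Z3988` whose `title` attribute carries URL-encoded OpenURL key/value pairs. The sketch below shows one way such a span could be generated. The choice of the Dublin Core format and the `rft.*` fields, along with the example title, creator, date and URL, are assumptions for illustration, not Open Context's actual output.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title: str, creator: str, date: str, identifier: str) -> str:
    """Build a COinS span describing one item with Dublin Core style keys."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),
        ("rft.title", title),
        ("rft.creator", creator),
        ("rft.date", date),
        ("rft.identifier", identifier),
    ]
    # URL-encode the pairs, then HTML-escape the result for use in an attribute.
    title_attr = escape(urlencode(fields), quote=True)
    return f'<span class="Z3988" title="{title_attr}"></span>'


# Hypothetical item; the identifier is a placeholder URL.
span = coins_span("Bone 231", "Example Author", "2008", "http://example.org/item/231")
```

A tool such as Zotero that recognizes COinS can then pick up the reference from the page without any visible change to the layout.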
  • 29. Complex Querying
    • Data from multiple projects can be queried (with Boolean algebra), and results pooled together
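Boolean querying over pooled data can be sketched with small composable predicates. The helper names (`has`, `all_of`, `any_of`, `negate`) and the sample records are hypothetical. This is not how Open Context builds its SQL, only an illustration of combining conditions with AND, OR and NOT and pooling the matches.

```python
from typing import Callable

Record = dict[str, str]
Predicate = Callable[[Record], bool]


def has(field: str, value: str) -> Predicate:
    """True when the record has this exact field value."""
    return lambda r: r.get(field) == value


def all_of(*preds: Predicate) -> Predicate:  # Boolean AND
    return lambda r: all(p(r) for p in preds)


def any_of(*preds: Predicate) -> Predicate:  # Boolean OR
    return lambda r: any(p(r) for p in preds)


def negate(pred: Predicate) -> Predicate:  # Boolean NOT
    return lambda r: not pred(r)


# Hypothetical pooled records from two projects.
POOL: list[Record] = [
    {"project": "Domuztepe", "label": "Bone 231", "Material": "bone"},
    {"project": "Domuztepe", "label": "Pot 232", "Material": "ceramic"},
    {"project": "Pinarbasi", "label": "Find ID1-A", "Material": "ceramic"},
    {"project": "Pinarbasi", "label": "Find ID2-A", "Material": "stone"},
]

# (Material = ceramic OR Material = stone) AND NOT project = Domuztepe
wanted = all_of(
    any_of(has("Material", "ceramic"), has("Material", "stone")),
    negate(has("project", "Domuztepe")),
)
results = [r["label"] for r in POOL if wanted(r)]
```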
  • 30. Summary Statistics
  • 31. Making Meaningful Links
    • ArchaeoML essentially describes a network of atomic units and their relationships
      • Units and their links typically derived from source data
    Diagram (items): Domuztepe, Lot 1939, Bone 231, Pot 232, Pot 233; Pinarbasi, Cave Unit A, Find ID1-A, Find ID2-A, Find ID3-A
    Diagram (properties): Taxon: Ovis aries; Modification: Ground Point; Element: metacarpal; Material: ceramic; Color: Buff-orange; Type: Spindle-whorl
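The "network of atomic units and their relationships" can be pictured as a small graph. The sketch below is an assumption-laden illustration, not the ArchaeoML schema. The class names, the `contains` relation, and the assignment of properties to particular items are guesses based on the labels in the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """One atomic unit (site, lot, find) with its descriptive properties."""
    label: str
    properties: dict[str, str] = field(default_factory=dict)


@dataclass
class Network:
    units: dict[str, Unit] = field(default_factory=dict)
    # Each link is (source label, relation, target label).
    links: list[tuple[str, str, str]] = field(default_factory=list)

    def add(self, label: str, **properties: str) -> None:
        self.units[label] = Unit(label, dict(properties))

    def link(self, source: str, relation: str, target: str) -> None:
        self.links.append((source, relation, target))

    def related(self, source: str, relation: str) -> list[str]:
        """Labels reachable from source by one link of the given relation."""
        return [t for s, r, t in self.links if s == source and r == relation]


net = Network()
net.add("Domuztepe")
net.add("Lot 1939")
net.add("Bone 231", Taxon="Ovis aries", Element="metacarpal")  # assumed pairing
net.add("Pot 232", Material="ceramic")                         # assumed pairing
net.link("Domuztepe", "contains", "Lot 1939")
net.link("Lot 1939", "contains", "Bone 231")
net.link("Lot 1939", "contains", "Pot 232")

lot_contents = net.related("Lot 1939", "contains")
```

Other relation names (for example a stratigraphic one) would fit the same structure without changing the classes.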
  • 32. Making Meaningful Links
    • Current (limited) approach with “tags”
      • Assigned to 1 item or a whole set of items (esp. a query selection set)
      • Express a meaningful link between items
    Diagram (items): Domuztepe, Lot 1939, Bone 231, Pot 232, Pot 233; Pinarbasi, Cave Unit A, Find ID1-A, Find ID2-A, Find ID3-A
    Diagram (properties): Taxon: Ovis aries; Modification: Ground Point; Element: metacarpal; Material: ceramic; Color: Buff-orange; Type: Spindle-whorl
    Diagram (tag): “Weaving tool”
  • 33. Future Extensions
    • Extend tagging concept for more structure
      • Users can apply variable/value pairs.
      • Assign calendar dates to items
      • Apply more sophisticated ontologies / thesauri (Getty?)
    Diagram (items): Domuztepe, Lot 1939, Bone 231, Pot 232, Pot 233; Pinarbasi, Cave Unit A, Find ID1-A, Find ID2-A, Find ID3-A
    Diagram (properties): Taxon: Ovis aries; Modification: Ground Point; Element: metacarpal; Material: ceramic; Color: Buff-orange; Type: Spindle-whorl
    Diagram (variable/value tag): “Tool Type”: “Weaving tool”
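The step from plain tags to variable/value pairs can be sketched with one small index. The `TagIndex` class and the choice of which items receive the tag are hypothetical. A plain tag is stored with no variable, and the proposed extension adds a variable name alongside the value.

```python
from collections import defaultdict
from typing import Iterable, Optional


class TagIndex:
    """Maps (variable, value) pairs to the set of item labels carrying them."""

    def __init__(self) -> None:
        self._items: dict[tuple[Optional[str], str], set[str]] = defaultdict(set)

    def tag(self, items: Iterable[str], value: str, variable: Optional[str] = None) -> None:
        # One call can tag a single item or a whole query selection set.
        self._items[(variable, value)].update(items)

    def items(self, value: str, variable: Optional[str] = None) -> list[str]:
        return sorted(self._items.get((variable, value), set()))


index = TagIndex()
selection = ["Pot 233", "Find ID2-A"]  # hypothetical query selection set

# Current approach: a plain tag linking items from two projects.
index.tag(selection, "Weaving tool")

# Future extension: the same link expressed as a variable/value pair.
index.tag(selection, "Weaving tool", variable="Tool Type")

plain = index.items("Weaving tool")
structured = index.items("Weaving tool", variable="Tool Type")
```

Keeping the variable optional means existing plain tags would not need to change if the extension were adopted.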
  • 34. Using Tagged Sets Pingback: Register of a link made to a set tagged as “weaving tools” from a weblog
  • 35. Integration at a General Level
    • Speed and ease of mapping content into ArchaeoML systems
      • Significant cost reduction if most contributors can do it themselves
      • Important for small, individual or project generated research
    • Enables powerful query and analysis across multiple projects
    • But NOT very specific.
      • Example: Composing queries still uses each project’s local recording system (even though several projects can be queried simultaneously and their results pooled)
  • 36. Schema Mapping into ArchaeoML
    • The importer is an important part: most people work with Excel, FileMaker, Access…
      • Goal: Individuals can upload their own data, map them into ArchaeoML and submit for review and publishing
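The kind of schema mapping an importer performs can be sketched as follows. This is not Penelope or real ArchaeoML output. The mapping keys, the sample spreadsheet, and the output dictionaries are assumptions for illustration. The contributor states which column labels an item, which column gives its context, and which columns are descriptive properties. The properties keep the project's own column names, and the mapping itself can be saved for reuse.

```python
import csv
import io
import json

# Hypothetical export from a project spreadsheet.
SOURCE_CSV = "Lot,FindNo,Taxon,Element\n1939,231,Ovis aries,metacarpal\n1939,234,Bos taurus,femur\n"

# Hypothetical mapping parameters supplied by the contributor.
MAPPING = {
    "label_column": "FindNo",
    "label_prefix": "Bone ",
    "context_column": "Lot",
    "context_prefix": "Lot ",
    "property_columns": ["Taxon", "Element"],
}


def apply_mapping(csv_text: str, mapping: dict) -> list[dict]:
    """Turn spreadsheet rows into generic units with context and properties."""
    units = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        units.append({
            "label": mapping["label_prefix"] + row[mapping["label_column"]],
            "contained_in": mapping["context_prefix"] + row[mapping["context_column"]],
            # Original column names are kept as property names.
            "properties": {c: row[c] for c in mapping["property_columns"]},
        })
    return units


units = apply_mapping(SOURCE_CSV, MAPPING)
saved_mapping = json.dumps(MAPPING)  # stored so the import can be repeated
```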
  • 37. Outcomes
    • Data expressed in ArchaeoML
      • Ready for Open Context, OCHRE dissemination
      • Interoperability, longevity advantages
      • Project’s original terminology is maintained
    • Data described with high-level metadata
      • Dublin Core, TimeMap. Can be expressed in RDF, COinS, ArchaeoML (XML), etc.
    • Data schema mappings recorded
      • Import process saves mapping parameters.
    • Internet Archive Accession
  • 38. Petra Open City Eric C. Kansa Executive Director, ISD Program, School of Information, UC Berkeley
  • 39. Faceted Browse
    • Petra Great Temple:
      • 128,187 locations / objects
      • 1.1 million descriptions
      • 1,626 media objects (more to come)
      • 298,500 relationships
  • 40. Penelope 2
    • Petra Great Temple:
      • 12 individual databases (some very large)
      • ~200 text documents “mined”
      • 1600+ related media files
  • 41. Penelope 1
  • 42. Penelope 1
  • 43. Penelope 2
  • 44. Penelope 3
  • 45. Penelope 4
  • 46. Today
    • My background
    • Sharing Field Documentation
    • Open Context
    • Unresolved Issues and Next Steps
  • 47. User Experience
    • Bugs, interface problems
      • Truncated development (I got a new job…)
      • Just beginning user evaluations
    • Schema mapping is a major challenge
    • Recent collaborations, hiring should help (stay tuned!)
    Image by Jeff Kubina via Flickr (CC-by license) <http://www.flickr.com/photos/kubina/296367267/>
  • 48. Unlocking Open Context
  • 49. Unlocking Open Context
    • Web services
      • Clear need to facilitate “mash-ups”
      • Community / organization specific portals and views of content
    • Example Application: Second Life or Croquet
      • Most current virtual visualizations are one-off projects, have little applicability to other sites / collections
      • Dynamically link online data stores so visualizations can be easier to develop / more meaningful
  • 50. Records in Open Context XML data output, enables: (1) Sharing between web resources (2) Custom presentation (Brown University-specific style templates, etc.)
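The two uses of XML output named on this slide can be sketched with Python's standard library. The element and attribute names (`record`, `property`, `label`, `name`) are invented for this example and are not ArchaeoML or Open Context's real output format. One function serializes a record, and a separate consumer parses it and applies its own presentation.

```python
import xml.etree.ElementTree as ET


def record_to_xml(label: str, properties: dict[str, str]) -> str:
    """Serialize one record to a small XML document (hypothetical format)."""
    root = ET.Element("record", {"label": label})
    for name, value in properties.items():
        prop = ET.SubElement(root, "property", {"name": name})
        prop.text = value
    return ET.tostring(root, encoding="unicode")


def render_plain(xml_text: str) -> str:
    """A consumer's custom presentation: parse the XML and format it as text."""
    root = ET.fromstring(xml_text)
    lines = [root.get("label", "")]
    lines += [f"  {p.get('name')}: {p.text}" for p in root.findall("property")]
    return "\n".join(lines)


xml_text = record_to_xml("Pot 232", {"Material": "ceramic", "Color": "Buff-orange"})
plain_view = render_plain(xml_text)
```

Any other site could apply a different renderer to the same XML, which is the basis for organization-specific views and mash-ups.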
  • 51. Data Inundation
    • “Cultural Resource Management”
      • 90% of US archaeology
      • Un-circulated “gray literature” reports
    • Collaboration with the San Diego Archaeological Center
      • 400 datasets, representing 500,000 locations and objects
      • 4-5 “Petras” worth of data
    • Scaling issues becoming paramount!
    Image by “Doegox” via Flickr (CC-by license) <http://www.flickr.com/photos/doegox/2085419215/>
  • 52. Metaweb
    • Exploring Metaweb
      • ArchaeoML seems to map well to their data store
      • Powerful API
      • Large user community
      • Scale, performance
    Need to understand concerns!
  • 53. Open Data Protocol
    • Advocated by Science Commons
      • Use of “CC-zero” license
      • Public domain data, reliance on social norms for appropriate use
      • Solves important problems
    • Questions
      • Multiple stakeholder communities
      • Very different, conflicting norms
  • 54. Glocal Backlash
    • Internet, one of the principal ways traditional knowledge and heritage is/will be accessed
    • Local claims and notions of privacy, propriety, spirituality often missing
      • Can have dark-side too! (essentialism, ethno-nationalism, fundamentalisms)
    Captain Hook award winner for bio-piracy, 2006 [1]
    [1] Andrew Donoghue, ZDNet UK (March 29, 2006) http://news.zdnet.co.uk/business/0,39020645,39260264,00.htm
  • 55. Traditional Knowledge
    • Jason Schulz and Ahrash Bissell (co-authors)
    • How CC-licenses can be applied, where they may be inappropriate
    Some Rights Reserved
  • 56. Public Tensions
    • Prospects to collaborate with amateur communities?
      • Site security
      • “Fantastic” archaeology
    “Pothunters” destroying an archaeological site on the Columbia River (Oregon, USA). Image by “gbaku” via Flickr (CC-By-SA license) <http://www.flickr.com/photos/gbaku/1074322614/>
  • 57. Today
    • My background
    • Sharing Field Documentation
    • Open Context
    • Unresolved Issues and Next Steps
    … Now for the thanks!
  • 58. Open Context Developers
    • Eric Kansa (Lead developer, tagging system, interface design)
    • Ahrash Bissell (Penelope design, usability)
    • Nathan Hirth (XML, XSLT, schema mapping)
    • David Schloen (ArchaeoML schema)
    • Sarah W. Kansa (Usability, interface design, documentation)
    • Jeanne Lopiparo (Interface and graphic design, usability)
    • Michael Ashley (Filemaker item-view mockup)
    • Chris Hoffman (Usability, optimization)
  • 59. Special Thanks
    • University of Chicago: OCHRE Project
    • The Electronic Frontier Foundation
    • Doris and Donald Fisher
    • Presidio Archaeology: NPS, Golden Gate National Rec. Area
    • Science Commons
    • Internet Archive (media repository services)
    • “Friday Afternoon Seminar”