Thatcamp recap
Session notes from THATCampSTL on November 9, 2013 at Washington University in St. Louis

Published in: Technology, Education
Transcript of "Thatcamp recap"

  1. BEGINNERS DIGITAL HUMANITIES/SUBJECT LIBRARIAN BOOT CAMP
     What is Digital Humanities? Hack (building scholarly digital editions, projects) vs. Yack (theory).
     Two areas within the Hack part of DH: textual encoding with TEI (Text Encoding Initiative) XML, and text mining.
     We all use digital tools now: what differentiates something as uniquely digital humanities (vs. 'traditional') scholarship?
     Digital humanities scholarship will leverage the digital medium, i.e., create something that could not be duplicated in analog formats; if it could be reproduced in analog with no loss, it's not DH.
     The research team of scholars, programmers, and librarians is characteristic (and probably necessary) of DH, but new to the humanities, which had a tradition (if not completely accurate) of the "lone wolf" scholar.
     Pointers toward resources for getting started in DH will be on the THATCamp STL site next week!
     Session leaders: Chris Freeland and Andrew Rouner
  2. POTENTIAL LITERATURE
     We looked at Markov chain random text generation.
     Playing around with a "rhymer" script led to a discussion of lexical resources for text generation.
     We looked at a version of the "dada engine", which generates texts by applying a vocabulary to a grammar.
     We briefly surveyed John Cage's "mesostics".
     All of this led to a discussion of quantitative measures for literary creativity.
     Resources used in the session are available at http://ada.artsci.wustl.edu/dada/
     Session leader: Stephen Pentecost
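As a rough illustration of the Markov chain text generation mentioned in the Potential Literature session, here is a minimal Python sketch. It is not the script used in the session; the function names `build_chain` and `generate` and the sample sentence are assumptions made up for this example. The idea is to record which words follow each word in a source text, then take a random walk through those records.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each run of `order` words to the list of words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)

def generate(chain, length=12, seed=None):
    """Walk the chain from a random starting state, up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: nothing ever followed this state
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)

# Illustrative sample text (an assumption, not session material).
sample = "the king stood and the queen sat and the king spoke"
chain = build_chain(sample)
print(generate(chain, length=10, seed=1))
```

Raising `order` makes the output follow the source more closely; with a short source text and a high order, the walk simply reproduces the original.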
  3. BUILDING A SEMI-AUTOMATIC GEOCODING PROGRAM FOR TEXT DOCUMENTS
     Andrew introduced the concept of geocoding place references in text documents. Aaron said the technology is out there to do this. Jeff demonstrated Viewshare, Library of Congress open source software, which he used for mapping important place references in oral histories compiled for the Missouri State Historical Society.
     Aaron described how Clavin is based on a gazetteer that enables you to access coordinates for real place names. The problem is that it won't recognize historical places that no longer exist. It also will not function at the fine grain of street addresses. There was discussion about the need to create a gazetteer for St. Louis to incorporate lost landscapes and street addresses.
     Brian demonstrated Open Calais, name recognition software, and explained how he used it to map locations in St. Louis Beacon articles through the Google API. Anupam asked about different types of geographical data output, other than web-based displays. The session wrapped up with some playing around with the Clavin demo to make it usable.
     Session leader: Andrew Hurley
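The gazetteer problem raised in the geocoding session can be sketched in a few lines of Python. This is not Clavin's API or data: the dictionaries, the `geocode` function, and the coordinates (rough placeholder values, not surveyed ones) are all assumptions for illustration. The sketch shows why a place missing from the standard gazetteer goes unrecognized, and how a local supplement for lost places might fill the gap.

```python
import re

# Hypothetical gazetteer entries; coordinates are rough placeholder values.
GAZETTEER = {
    "st. louis": (38.63, -90.20),
    "forest park": (38.64, -90.28),
}

# Hypothetical local supplement for a neighborhood that no longer exists.
LOST_PLACES = {
    "mill creek valley": (38.63, -90.22),
}

def geocode(text, *gazetteers):
    """Return (name, lat, lon) for each known place name found in `text`."""
    hits = []
    lowered = text.lower()
    for gazetteer in gazetteers:
        for name, (lat, lon) in gazetteer.items():
            # Match whole names only, so "forest park" does not hit "forest parkway".
            if re.search(r"(?<!\w)" + re.escape(name) + r"(?!\w)", lowered):
                hits.append((name, lat, lon))
    return hits

passage = "She grew up in Mill Creek Valley before the family moved near Forest Park."
print(geocode(passage, GAZETTEER))               # standard gazetteer only
print(geocode(passage, GAZETTEER, LOST_PLACES))  # with the local supplement
```

A real pipeline would also need to disambiguate names that occur in several places, which plain string matching like this does not attempt.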
  4. SLU CENTER FOR DIGITAL HUMANITIES PT. 1
     SLU CDH Origin story
     • accidental opportunities come from casting a wide net
     • pursue impossible ideas and you might make connections that make them possible
     Collaborative spirit leaves the door open
     • Linked Open Data ideas can support interoperability and future collaboration, communication, or data reuse even if the data is not exposed to the world
     WashU Libraries
     • sharing experiences and seeking solutions for DLXS lack of support
     • working with technologies like Fedora, Hydra
     • the power of finding user groups and library communities
     WashU
     • Unique struggles with 20th, 21st century texts
     • publishing incomplete biographical text is a DH project that can best exist as an interactive digital object
     • financing, copyright, access control can interrupt standards and interoperability
     • even requests are not standardized and change from institution to institution and country to country
     • Finding tools that support standards, or that help mediate the IPR by remotely fetching images or supporting remote annotation so access can be used without violating rights, can help
     Session leader: Patrick Cuba
  5. SLU CENTER FOR THE DIGITAL HUMANITIES PT. 2
     Mizzou
     • Faces challenges for Digital Humanities support where sciences are prominent and geography is isolating
     • Digital Humanities may allow for more distant collaboration where interests overlap
     • Sometimes the DH projects need to precede institutional support until a critical mass of interest exists on campus
     Webster
     • Film project for annotated documentaries or user-guided stories
     • Tradamus (SLU-CDH) took from others to find standards and directions
     • Sharing obstacles with peers can aid in the discovery of tangential tools which nearly meet challenges as a starting point for new projects
     • Visualization tools for moving through a graph may assist in composition or user interface
     • LittleBigPlanet game allows users to move around well-defined visual components to create an experience, and the community reshares compositions (crowd-sourced documentary possibilities)
     Eastern Illinois
     • Past Tracker and Localities projects are great resources which would benefit from an update
     • challenges include a rotating grad position in charge of working on the project, lack of time at the institution, and decentralized resources for working with DH projects
     • Contact with other institutions revealed on-campus resources that may be available
     • When creating this as a DH project, tracking the history of the project itself may be of interest, both as popular interest and as an aid to future scholars
     Session leader: Patrick Cuba
  6. BLURRING THE BOUNDARIES BETWEEN SCHOLARSHIP
     1. Open-source tools in Digi Hum: calls on the public to do creative work with material
        • An example: http://t-pen.org/TPEN/
        • An example: http://rapgenius.com
     2. Crowd-sourcing & social media
     3. Community, broader impacts in digi-hum projects/products/methods
     4. How to convince students? How to incorporate into class construction?
     5. How do faculty involve students and still maintain the quality and integrity of the original product goals (this is true outside of the student context too, at community level)? Faculty-student collaboration? Faculty-student guidance/direction? Both?
     6. The "subject" as another type of community?
     7. Academic/faculty/scholar collaboration
     8. Futures? Communities for scholarly peer review in DigiHum, simultaneous, long-distance scholar input (using Wikipedia as an example of the beginnings of this)
     Session leader: Kristine Hildebrandt
  7. STL LAMS
     Going forward, the TECHO (Technology Exchange Humanities Cultural Organizations) group should:
     • continue meeting with a focus on projects; if it becomes only a "sharing group," participants will lose interest; projects require commitments
     • identify a better platform for collaboration than Google Groups, and at the same time have a public-facing resource, so interested parties can contact the group to join in (possibly WordPress)
     • build its network and collaborators
     • begin planning ongoing, informal training on relevant platforms and standards
     Session leader: Andrew Rouner
  8. XML, OAC, RDF, JSON-LD AND THE KING STOOD: THE UNIVERSE IS METADATA:
     TEI is a great schema for description and interoperability, but XML limits in too many ways
     • overlapping ranges are not allowed when annotating
     • XML document does not resemble simulated original
     • metadata in headers and in-line tags are artificially different
     • massive XML documents must be parsed and processed for relevant or wanted information
     RDF sought to fix some of the problems, but RDF-XML still stumbles
     OAC (openannotation.org) removes the description, conversation, and linking from the original digital object
     • solves all the listed problems of XML, leaves some common issues of vocab, convention, and data fragility
     • allows for TEI or DC or any vocabulary to be used in description
     • creates an independent digital object that can be stored, queried, or resolved from any location
     • complex chains of annotations and selectors can describe a resource so well that even if an original image or text becomes unavailable, the annotations can still recreate meaning
     • OAC abandons the idea that annotations should be easily human readable in favor of machine-navigable triples that can be passed easily between and within digital applications
     Thinking in oa:Annotations instead of XML allows for new possibilities
     SharedCanvas (shared-canvas.org) extends OAC and creates a sc:Canvas object for reference which has no content and is only annotated
     Tradamus (SLU-CDH project) creates digital editions whose text is only
     Session leader: Patrick Cuba
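To make the point about overlapping ranges concrete, here is a small Python sketch of stand-off annotations. It is loosely modeled on the Open Annotation vocabulary (oa:Annotation, oa:hasBody, oa:hasTarget, oa:TextPositionSelector) and is not an excerpt from any session material. The `annotate` and `covering` helpers, the example.org identifiers, and the sample text are assumptions for illustration. Because each annotation lives outside the text and points at a character range, two annotations can overlap in a way that nested XML elements cannot.

```python
import json

OA = "http://www.w3.org/ns/oa#"
CNT = "http://www.w3.org/2011/content#"

def annotate(anno_id, source, start, end, comment):
    """Build one annotation as a JSON-LD-style dict targeting a character range."""
    return {
        "@context": {"oa": OA, "cnt": CNT},
        "@id": anno_id,
        "@type": "oa:Annotation",
        "oa:hasBody": {"@type": "cnt:ContentAsText", "cnt:chars": comment},
        "oa:hasTarget": {
            "@type": "oa:SpecificResource",
            "oa:hasSource": {"@id": source},
            "oa:hasSelector": {
                "@type": "oa:TextPositionSelector",
                "oa:start": start,
                "oa:end": end,
            },
        },
    }

def covering(annotations, offset):
    """Return the ids of all annotations whose range includes `offset`."""
    hits = []
    for anno in annotations:
        selector = anno["oa:hasTarget"]["oa:hasSelector"]
        if selector["oa:start"] <= offset < selector["oa:end"]:
            hits.append(anno["@id"])
    return hits

text = "And the king stood"
source = "http://example.org/texts/1"  # hypothetical identifier
annotations = [
    annotate("http://example.org/anno/1", source, 4, 12, "noun phrase"),
    annotate("http://example.org/anno/2", source, 8, 18, "editorial note"),
]

print(text[4:12], "|", text[8:18])  # the two ranges share the word "king"
print(covering(annotations, 9))
print(json.dumps(annotations[0], indent=2))
```

Each dict is independent of the text it describes, so it can be stored or queried separately, which is the property the notes above attribute to OAC.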
  9. QGIS
     Introduced QGIS and the history of the project
     Discussed types of GIS possible with the software
     Demonstrated how to search for data and add simple data to a QGIS project
     Outlined various ways QGIS is similar to and different from ArcGIS
     Session leader: Aaron Addison
  10. DIGITAL PEDAGOGY
     Even in instructional settings where teaching DH is not the primary goal, DH or simply technology-assisted projects (as basic as creating sites, blogging, tweeting) can encourage students to interact, take ownership of content, teach peers, & learn important lessons about source documentation & context.
     Ongoing projects in particular are great for incorporating new/young/uneducated students, giving them built-in peer teaching, engagement, bigger sense of purpose, & responsibility to a "real" audience outside the classroom (examples from participants: http://widewideworlddigitaledition.siue.edu/ http://talus.artsci.wustl.edu/spenserArchivePrototype/).
     Combining content/theory & making/DH in one course is challenging: many approaches, incl. one hands-on session & one lecture each week, an additional lab option, periodic technical bootcamps throughout the semester, or a DH-customized lab track of a larger survey course; none of them perfect, all requiring institutional support!
     The DH playing field is absolutely not level: the digital divide is an issue in different institutional contexts, and not all languages can claim the level of successful digitization that English literature enjoys. So how can those of us who teach and/or study foreign languages expand the definition of DH to include basic digitization & translation projects that will be useful to them? Should we recenter DH to address socioeconomic & linguistic difference, especially if these are topics we encounter regularly in our classrooms? (possible example of richly multilingual project: http://library.princeton.edu/projects/bluemountain/)
     Session leader: Wendy Love Anderson
  11. INTEGRATING NEW TECHNOLOGIES INTO FIRST GENERATION DIGITIZATION PROJECTS
     Problem of intellectual stewardship: who is custodian of an archive? Should you share files, cede ownership? How do you ensure usability in the future? Front-end vs. back-end?
     Uniformity of standards: metadata should talk across platforms, archives.
     "We all want our stuff to work with other people's stuff to have better scholarship" is the underlying issue. Should we be agitating to change the rules?
     Session leader: Malgorzata Rymsza-Pawlowska
  12. SPATIAL HUMANITIES
     The discussion revolved around ways in which digital spatial tools have enhanced, or might in the future enhance, scholarship. The early part of the discussion focused on GIS mapping. There was also some discussion about 3D digital environments toward the end of the session.
     Campers identified several types of research that lend themselves to electronic spatial analysis:
     • Research involving data produced by crowd sourcing.
     • Research involving massive amounts of data.
     • Research about diffusion processes.
     • Research attempting to flesh out the physical dimensions of a place.
     • Research about material objects and architectural elements that can be reconstructed in 3D.
     Limitations of employing spatial digital tools included: temporal analysis is difficult to display through maps. Data collection and input, along with the building of 3D environments, are resource intensive, and there is the danger that such enterprises will be monopolized by corporate behemoths like G*****.
     The discussion ended on the subject of the portability of geographical data and issues of access.
     Session leader: Andrew Hurley
  13. UNSTRUCTURED DATA
     • Types of NoSQL DBs; other Big Data technologies
     • Application and use cases in Humanities
     • Crowdsourcing data
     • Word spotting
     • Data mining of archives
     • Need to be sure we are asking the right questions
     • Importance of metadata for all processes
     Session leader: Aaron Addison
  14. WORDPRESS
     WordPress can be used as a full content management system. It's not just a blogging platform.
     Some example WordPress sites:
     • http://taylorfamilyinstitute.wustl.edu
     • http://mallinckrodt-academy.org/
     • http://historyofmedicine.wustl.edu/
     The Advanced Custom Fields plugin makes it easy to enter and display data for site-specific types of content.
     For developers, WordPress strikes a good balance between flexibility and ease of use.
     WordPress is very popular. As free, open source software, it has a low barrier to entry. Its huge installed base makes it easy to find hosting, technical support, themes, and plugins.
     The easiest way to get started with WordPress is to sign up for an account at wordpress.com.
     Session leader: Brian Marston
  15. TIME SERIES
     The session on Databases Before Digital drew a small group for a discussion that spent some time on how to improve methods of working with tabular textual material that OCR often doesn't handle well, but also included shared curiosity about the history of how people have organized data and bureaucracies. There was some overlap with earlier discussions of 19th-century St. Louis city directories and what might be done with them in the form of a structured digital historical resource. The session ended early to enable participants to attend other sessions of interest at the same time.
     The session on Time Series delved into questions of modeling and visualization, and became a fascinating speculative conversation. We discussed how to represent spans of time and how to deal with fuzzy and unknown data.
     Simile timeline tools
     Session leader: Doug Knox
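One possible way to represent the fuzzy and unknown time spans discussed in the Time Series session is to store bounds instead of single dates. The Python sketch below is only an assumption about how this might be modeled; the `FuzzySpan` class, its field names, and the example years are invented for illustration and do not come from the session. An unknown bound is stored as `None`, and the two methods separate "could have been active" from "was certainly active".

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FuzzySpan:
    """A time span whose start and end are known only within bounds."""
    label: str
    earliest_start: Optional[int] = None  # None means unknown
    latest_start: Optional[int] = None
    earliest_end: Optional[int] = None
    latest_end: Optional[int] = None

    def possibly_active(self, year: int) -> bool:
        """True unless the known outer bounds rule `year` out."""
        if self.earliest_start is not None and year < self.earliest_start:
            return False
        if self.latest_end is not None and year > self.latest_end:
            return False
        return True

    def certainly_active(self, year: int) -> bool:
        """True only if `year` lies between the latest start and the earliest end."""
        if self.latest_start is None or self.earliest_end is None:
            return False
        return self.latest_start <= year <= self.earliest_end

# Hypothetical entry: opened sometime in 1850-1854, still open in 1871, closing date unknown.
shop = FuzzySpan("hypothetical shop", 1850, 1854, 1871, None)
for year in (1845, 1852, 1860, 1900):
    print(year, shop.possibly_active(year), shop.certainly_active(year))
```

A visualization could then draw the certain core of a span solidly and the uncertain margins more faintly.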
  16. ATTRIBUTION AND COLLABORATION
     Facing the challenges of attribution and credit in a digital world
     Traditional publishing offers monolithic intellectual objects marked with citation conventions
     Digital objects record micro-contributions and allow for chaining of annotations
     • precise citation and criticism becomes possible
     • crowd-sourced or collaborative work can be assembled by groups, rather than simply mass contributed and then munged into cohesion by a single editing entity
     • if an editorial decision is discredited, it becomes easier to find dependent opinions and revise them
     It introduces many scenarios we cannot resolve
     • How do we discriminate between users who contribute different types of work?
        • datasets
        • sparse, but critical editorial choices
        • advanced transcription and collation
        • helpful visualizations
        • proof-reading and corrective changes
        • linking, citation, and supportive annotation
     • How do we balance quality over quantity?
     • an RA may have created 95% of the annotations (editorial acts), but the PI may 'own' the critical, controversial, or significant 5%
     The act of reviewing and accepting an annotation doesn't necessarily change the credit of the contributor, but establishes some editorial hegemony
     Different institutions attach very different values to work like data collection, cataloging, transcription, collation, key-finding, inter-linking, etc.
     Session leader: Patrick Cuba
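The claim that chained annotations make it easier to find opinions that depend on a discredited editorial decision can be illustrated with a short Python sketch. Everything here is hypothetical: the `dependents` function, the `depends_on` mapping, and the annotation names are assumptions for the example, not a description of any tool discussed in the session. The sketch treats the annotations as a graph and walks outward from the discredited decision.

```python
from collections import deque

def dependents(depends_on, discredited):
    """Return every annotation that directly or indirectly builds on `discredited`."""
    # Invert the "depends on" links so we can walk from a decision to what builds on it.
    built_on = {}
    for anno, parents in depends_on.items():
        for parent in parents:
            built_on.setdefault(parent, []).append(anno)

    seen = set()
    queue = deque([discredited])
    while queue:
        current = queue.popleft()
        for child in built_on.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)

# Hypothetical chain of contributions: each entry lists what it builds on.
depends_on = {
    "transcription-1": [],
    "emendation-2": ["transcription-1"],
    "commentary-3": ["emendation-2"],
    "visualization-4": ["transcription-1"],
    "commentary-5": [],
}

print(dependents(depends_on, "emendation-2"))
print(dependents(depends_on, "transcription-1"))
```

If each entry also recorded its contributor, the same walk would show whose work needs review when a decision is revised.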