Dutch Book Trade 1660-1750: using the STCN to gain insight in publishers’ strategies

Despite stagnating domestic demand near the end of the seventeenth century, Dutch book producers managed to maintain their international market position. In a so-called embedded research project, the Short Title Catalogue, Netherlands (STCN) was used to gain insight into the strategies and decisions of these publishers. The STCN is a retrospective bibliography of publications from 1540 to 1800, containing information on title, author, book producer, language, subject and collation. Historians and computer scientists collaborated to open up the STCN and to connect it to other relevant datasets. To explore the possibilities of, and difficulties in, opening up and linking the bibliography, attention was turned to a particular strategy: publishing scandalous books. Besides explaining the process of converting and querying the STCN data, the presentation will deal with differences in handling data and the advantages of an Open Data approach in humanities research.

Published in: Spiritual, Education

Transcript

  • 1. e-Humanities Group Research Meeting: STCN 2013/10/10 Wouter Beek Albert Meroño Peñuela Rinke Hoekstra Fernie Maas Inger Leemans
  • 2. ‘OPENING’ THE STCN LINKING THE STCN
  • 3. Open data
  • 4. Linked Open Data • Connect to existing datasets • Connect to services • Queries/inferences run across datasets – The Picarta topic hierarchy allows us to infer that certain publications cover related topics. – GeoNames gives the latitude of publishing houses, allowing publishing decisions to be correlated to historical events. – Lexvo / ISO standards allow translations to be traced via related languages (e.g. language families). • Easy to create mashups / new applications.
  • 5. [Diagram; surviving labels: "died in", "Biografisch portaal", "same as"]
  • 6. Taking the STCN to the Semantic Web • 139,817 publications (4M facts) • 23,543 authors (120K facts) • 9,959 printers (55K facts) • 37K enriched concepts (DBpedia, Yago, Heidelberg Diglit, …) • 105 topics (1K facts) • Relate to international standards (GGC/OCLC/ISO/RFC/IANA) • Making the schema explicit (vocabulary)
  • 7. [Diagram: converting source data to RDF. Source formats: RDF files; relational DB; XML files (depends on structure); text files (ambiguous). Steps: domain-independent data conversions (fully automated); domain-dependent data conversions (domain knowledge needed); simple RDF; link to external sources via linksets (domain knowledge needed); fixing bad data (inconsistencies and inaccuracies in the original data); connect to services, e.g. query interface, maps (high level of reuse).]
  • 8. FROM THE LIBRARY TO THE LAB
  • 9. “How many publications by Arminius?”
  • 10. “How many publications by Gomarus?”
  • 11. What happens to the average publication format after 1619? Measured in terms of the number of folds: • Works by Arminius: 5.6 → 5.7 • Works by Gomarus: 6.8 → 4.9 Distant reading!
  • 12. Methodological implications From searching for resources (librarian) to validating/refuting hypotheses (scientist)
  • 13. humR: A WEB SERVICE FOR RESEARCH INVOLVING DISTANT READING (humanities + R, statistics processing software)
  • 14. Open issues 0: institutional hurdles • The products of publicly funded research should be publicly available (papers & datasets). – Not everybody makes their data publicly available. • Distant reading research is often restricted by the user interface.
  • 15. Open issues 1: meaning A large percentage of the data has no/unknown meaning: • “before 1808” • “This book was published between the Big Bang and 1808.” Context-dependent: • “The first dinosaur walked the earth before 300M years BC.” • “Einstein came up with the idea of general relativity before 1937.” Fuzziness: • “James Joyce’s Ulysses was published before 1925.”
  • 16. Open issues 2: statistics • Which query results are statistically relevant? • How to detect whether a statistically significant difference reflects reality and not the way in which the dataset was constructed?
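
The before/after comparison on slide 11 and the significance question on slide 16 can be combined in a small sketch. The slides do not say which statistical test, if any, the authors used; the permutation test below is just one common choice, swapped in for illustration. The function names and the format values are invented for this example and are not taken from the STCN.

```python
import random
from statistics import mean


def mean_difference(before, after):
    """Difference in mean format (number of folds): after minus before."""
    return mean(after) - mean(before)


def permutation_p_value(before, after, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Repeatedly shuffles the pooled values, splits them into two groups of
    the original sizes, and counts how often the shuffled difference is at
    least as large (in absolute value) as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean_difference(before, after))
    pooled = list(before) + list(after)
    n_before = len(before)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_before:]) - mean(pooled[:n_before]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical toy data: formats as number of folds (2 = folio, 4 = quarto,
# 8 = octavo, 12 = duodecimo). These values are NOT real STCN records.
formats_before_1619 = [4, 4, 8, 8, 8, 12, 4, 8]
formats_after_1619 = [4, 4, 4, 2, 4, 8, 4, 2]

shift = mean_difference(formats_before_1619, formats_after_1619)
p_value = permutation_p_value(formats_before_1619, formats_after_1619)
print(f"shift in mean format: {shift:+.2f}, p = {p_value:.4f}")
```

A small p-value only says that the shift is unlikely under random relabelling of the records. It does not answer the second question on slide 16: whether the difference reflects historical reality or the way the dataset was constructed.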
