Torsten Reimer


Published on

Web Semantics in Action: Web 3.0 in e-Science

11:00 – 11:25 Torsten Reimer: Classifying the (digital) Arts and Humanities

Published in: Education, Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Torsten Reimer

  1. 1. Classifying the (digital) Arts and Humanities Wishful thinking in fifteen slides By Dr Torsten Reimer Centre for e-Research, King's College London IEEE Conference on e-Science - 11/12/2009
  2. 2. Once upon a time ICT Guides • Projects • Methods • Tools • Events and reports • Community • Bibliography etc.
  3. 3.  an online hub for research & teaching in the digital arts and humanities  support for creating and using digital resources  enables members to locate information, promote their research and discuss ideas  mix of centrally provided and user contributed content  use of web 2.0 functionality such as tagging, feeds, wiki, blogging, user profiles etc.  community resource
  4. 4. Methods Taxonomy • Originally developed for the projects and methods database • Focus on resource creation • Used to categorize projects, tools, resources • Now part of • Seven main categories
  5. 5. Data analysis • Collating: Collation is the process of comparing different versions of a text to discover the location and type of textual variants. Collation is fundamental to a variety of scholarly pursuits, for example in the Arts and Humanities field it can be used for the accurate reconstruction of texts of classical works. In the past collation was performed by hand; today, it is performed with the assistance of a computer. Read more... • Collocating: Refers to the techniques used to detect patterns of words that appear together in a text more often than would be expected by chance. A collocation is a group or pair of words that are always used together, and can illustrate restrictions on which verbs or adjectives can be used with particular nouns, or the order in which words appear. Read more... • Content analysis: Content analysis is a research technique focused on the content and internal features of media. It is used to determine the presence of certain words, concepts, themes, phrases, characters, or sentences within texts or sets of texts and to quantify this presence in an objective manner. Read more... • Content-based image retrieval: Content-based image retrieval (CBIR) refers to techniques used to search for digital images by features of their content, which is particularly helpful when studying large databases. It is often preferable to perform searches relying on metadata, which can be expensive and time-consuming to produce, as it requires humans to describe each individual item in the database. Read more... • Content-based sound retrieval: Refers to techniques used to search for sound files by features of their content, using specialist software, which is particularly helpful when studying large databases. It is often preferable to perform searches relying on metadata, which can be expensive and time-consuming to produce, as it requires humans to describe each individual item in the database. Read more... • Data mining: Data mining is the process of using computing power to extract hidden patterns from data, analysing the results from different perspectives and summarising it into a useful format, such as a graph or table. This process is often facilitated by the use of metadata. It is important that any patterns found are verified and validated by comparison with other data samples. In this way, data mining can identify trends that go beyond simple data analysis. Read more... • Image feature measurement: Image feature measurement is a term to describe techniques used to acquire, measure, and analyse the parameters of digital images, such as size, shape, relative locations, textures, grey tones and colours. These parameters are also known as ‘perception attributes’. Read more...
  6. 6. Three partners – one system?
  7. 7. The 'mine, all mine' problem
  8. 8. CHAIN ADHO, centerNet, CLARIN, DARIAH, Project Bamboo, NoC Key theme: advocacy for an improved digital research infrastructure for the Humanities and Arts Knowledge base: all partners want one; we have one International desire to overcome 'mine, all mine problem' Coalition of Humanities and Arts Infrastructures and Networks
  9. 9. Problems with current set-up • Shared editing necessary • Versioning system • Distributed across several websites • Only parent-child relationships • Different terminology for same method in different fields • Only monolingual
  10. 10. Solution: semantic web? Linked Data: • 1. Use URIs to identify things. • 2. Use HTTP URIs so that these things can be referred to and looked up ("dereference") by people and user agents. • 3. Provide useful information (i.e., a structured description — metadata) about the thing when its URI is dereferenced. • 4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
  11. 11. Taxonomy as service Semantic web (linked data) Shared taxonomy • CeRch • DHO • OeRC • (CHAIN) • and you?
  12. 12. Glorious future • Build a resource owned by and useful for the wider Digital Humanities / Arts community • Bring field(s) together • Make what we do more easily accessible to funding bodies and the public