De conferentie 2012 - CLARIN
 

© All Rights Reserved

  • Quantitative Analysis of Culture Using Millions of Digitized Books (J.-B. Michel et al., 2010, Science, DOI: 10.1126/science.1199644)
  • Automatic interruption analysis: which party interrupted which party, and how often (Maarten Marx, UvA)

De conferentie 2012 - CLARIN: Presentation Transcript

  • Language Resources and Technology Infrastructure for the Humanities and the Social Sciences in the Netherlands. Arjan van Hessen
  • State of the Technology: Language and speech technology is (nearly) mature. Many applications are available. Most of it is usable (although not perfect), but…
  • Unused Technology & Resources: Lack of standardization is killing. Many scholars are not aware of the HLT & resources. It is less used than expected. A priori technical knowledge is still necessary. Use is too dependent on “friends” in the field.
  • Research life cycle (diagram): New Idea; Publications; Research; ?; Tuning; Building; Cultural Heritage Institution(s)
  • Unused Technology & Resources CAR
  • HLT & CHI paths (diagram): Language processing; Machine learning; CATCH; Cultural Heritage Institutions; Humanities
  • After the project
  • CLARIN-EU (2007-2012); CLARIN-NL (2009-2015); CLARIN-ERIC (2012-xxxx); CLARIAH (2015-…). Infrastructure program for the Humanities
  • Issues to address
    1. Finding the users
    2. Identification of their needs/problems
    3. Do our solutions correspond to their problems?
    4. Usability of tools: can they use them?
    5. Visualisation
    6. Tutorials and web material (movies, courses)
    7. Sustainability of tools and resources
  • 1. FINDING THE USERS: How to identify and convince potential users
  • Humanities enter a New Era: Huge amounts of digital data are becoming available. Hardware allows this, and many tools are available and under development. Spitzweg’s traditional “lonely scholar” no longer suffices. Big data, supported by automated methods.
  • User Surveys: Go out to ask potential users. User survey in the Netherlands (2010)
  • 2. IDENTIFICATION OF THEIR NEEDS/PROBLEMS: What do they need?
  • User attraction cycle: Finding new users; Convincing these users to participate; Listening to the users; Train these users; Support the users in the use of all those wonderful tools
  • 3. DO OUR SOLUTIONS CORRESPOND TO THEIR PROBLEMS? What to avoid in order NOT to scare off (potential) users
  • The CLARIN dream
    Give me digital copies of all contemporary documents in European archives that discuss the Great Plague of England (1348-1350)
    Give me all negative articles about Catholics in the Fryske Courant (1868-1924)
    Find European TV news interviews that involve discussions about Geert Wilders
  • The CLARIN nightmare in 6 sleepless nights: night 1
    Give me digital copies of all contemporary documents in European archives that discuss the Great Plague of England (1348-1350)
      • “All” means from all countries and all archives, not just some archives in some (9) countries that happen to be in CLARIN
      • If contemporary docs exist in digital form at all, they are probably pictures. How do we get access to the content?
      • Can we rely on standardized metadata to find them?
      • Many of the docs may be in Latin. Can we handle that, and what about the other languages?
      • How would a scholar know how to formulate this query?
      • How to present results?
  • 4. USABILITY OF TOOLS: The gearbox syndrome
  • The gearbox syndrome explained: Humanities scholar with a problem, waiting for a solution. First HLT researcher offering help
  • The gearbox syndrome explained: Humanities scholar with a problem, waiting for a solution. First generation named entity recognizer (rule based)
  • The gearbox syndrome explained: Humanities scholar with a problem, waiting for a solution. Second HLT researcher offering help
  • The gearbox syndrome explained: Humanities scholar with a problem, waiting for a solution. Second generation named entity recognizer (statistics based)
  • The gearbox syndrome explained: Humanities scholar with a problem, waiting for a solution. Third HLT researcher offering help
  • The gearbox syndrome explained: Humanities scholar with a problem, waiting for a solution. LREC 2012 paper about next generation named entity recognizer
  • The gearbox syndrome explained
  • Making understandable interfaces
  • 5. VISUALIZATION: A picture says more than 1000 words. Easy visualization fosters data analysis. Nice visualization eases use of analysis tools. Nice-to-look-at tools help to reach out to the community.
  • Who answered which words: visualizing word frequency information in letters. C. Culy. 2012. "Some challenges of language and linguistic data for information visualization." Invited keynote presentation at Advanced Visual Methods for Linguistics. University of York, September 7, 2012.
  • Parliamentary Debate: Which party interrupted which other party, and how often?
  • 6. TUTORIALS AND WEB MATERIAL: Create and publish web tutorials. Publish recorded lectures about CLARIN-specific topics. Make and publish showcases.
  • Web videos
  • Showcases
  • 7. SUSTAINABILITY OF TOOLS AND RESOURCES: Resources and tools must be accessible after a project finishes. Data and tools must use internationally accepted standards. Easy access via federated login.
  • CLARIN Centres
  • Conclusion CLARIN offers a good and sustainable infrastructure for long-term use of both Resources and Tools Participating in CLARIN gives you access to enclosure tools, standardized metadata, tools for metadata, the CLARIN community Give other groups/institutions access to your data….. If you want 37
  • So join us!www.clarin.nlTHANK YOU! 38
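
The Parliamentary Debate slide asks which party interrupted which other party, and how often. A minimal sketch of that kind of count is given here, assuming the debate has already been reduced to (interrupting party, interrupted party) pairs. The party names, the toy data, and the `count_interruptions` helper are hypothetical illustrations, not the method used in the analysis the slide refers to.

```python
from collections import Counter

# Hypothetical toy data: one (interrupting party, interrupted party) pair
# per interruption. A real debate transcript would need parsing first.
interruptions = [
    ("PartyA", "PartyB"),
    ("PartyA", "PartyB"),
    ("PartyB", "PartyA"),
    ("PartyC", "PartyA"),
]


def count_interruptions(events):
    """Count how often each party interrupted each other party.

    Pairs where a party interrupts itself are skipped, since the question
    is about interruptions between different parties.
    """
    return Counter((src, dst) for src, dst in events if src != dst)


counts = count_interruptions(interruptions)

# Print the pairs, most frequent first.
for (src, dst), n in counts.most_common():
    print(f"{src} interrupted {dst}: {n}")
```

The resulting counts form a party-by-party matrix, which is the kind of data an interruption visualization could be drawn from.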