Gathering and Organizing System for PErsonal Language Skills - GOSPELS


Provide appropriate documents to users based on their
language skills in English, Italian and German as
determined in accordance with guidelines provided by
the European Language Portfolio.

Usage Rights

CC Attribution License

    Presentation Transcript

    • Gathering and Organizing System for PErsonal Language Skills (G.O.S.PE.L.S.)
      Student: Enrico Zanardo
      Supervisor: Prof. Vittore Casarosa
      Free University of Bolzano-Bozen
      8th October 2010
    • Goal: Provide appropriate documents to users based on their language skills in English, Italian and German, as determined in accordance with the guidelines provided by the European Language Portfolio. (EN, DE, IT)
    • Outline
      ● Problems
      ● Proposed Solution
      ● Prototype & Results
      ● Conclusion
    • Objective [figure: language-level labels EN-C1, EN-B1, IT-A2, IT-C2, DE-B2, DE-A2, IT-C2, EN-B2, IT-B1, DE-B2, DE-A2]
    • Problems
      1. Classify documents according to the “GOSPELS rating system” and match that rating to the levels of the European Language Portfolio (A1, A2, ..., C1, C2).
      2. Know the users' language skills in the three languages supported by the system (English, Italian and German).
      3. Provide results in the three languages according to the users' skills in each language.
    • Solution to step 1 (classify documents) [diagram: Docs, frequency of most common words, and part of speech of the word → Algorithm → level of complexity of the document]
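The slide names the inputs of the classification algorithm (a list of the most common words and the part of speech of each word) but not its formula. As a rough sketch of one ingredient only, the code below scores a document by the share of its words that fall outside a frequency list. The function name `complexity_score`, the tiny `COMMON` word set, and the omission of part-of-speech information are all assumptions made for illustration, not the GOSPELS algorithm itself.

```python
import re

# Hypothetical stand-in for a "most common words" list of one language.
COMMON = {"the", "cat", "is", "on", "a"}


def complexity_score(text, common_words):
    """Return the percentage (0-100) of word tokens not in common_words.

    Assumption: a higher share of uncommon words means a harder document.
    Part-of-speech information, which the slide also lists, is ignored here.
    """
    # Match runs of letters only (works for accented Italian/German letters).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in common_words)
    return 100.0 * unknown / len(tokens)


if __name__ == "__main__":
    # One of six tokens ("mat") is outside the list.
    print(round(complexity_score("The cat is on the mat", COMMON), 2))
```

In a real setting the word list would come from a per-language frequency resource, which is why the conclusions mention word frequencies as a requirement for each new language.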
    • Solution to step 2 (users' language skills)
    • Match between GOSPELS algorithm & ELP [diagram: Docs, frequency of most common words, and part of speech of the word → Algorithm → level of complexity of the document; further labels: Template Documents, Range, Language Levels]
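The diagram labels suggest that a numeric complexity score is matched to ELP levels through score ranges. One possible way to express such a range lookup is sketched below. The thresholds in `LOWER_BOUNDS` are invented for illustration (the slides do not give the actual ranges), and `elp_level` is a hypothetical helper name.

```python
from bisect import bisect_right

LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Hypothetical minimum scores for A2, B1, B2, C1 and C2 (in that order).
# Any score below the first bound is treated as A1.
LOWER_BOUNDS = [10.0, 15.0, 20.0, 25.0, 30.0]


def elp_level(score, lower_bounds=LOWER_BOUNDS):
    """Map a numeric complexity score to an ELP level via sorted ranges."""
    # bisect_right counts how many bounds are <= score, which is exactly
    # the index of the level whose range contains the score.
    return LEVELS[bisect_right(lower_bounds, score)]


if __name__ == "__main__":
    for s in (5.0, 10.0, 22.5, 35.0):
        print(s, elp_level(s))
```

Keeping the bounds in a plain sorted list makes them easy to re-tune, which fits the stated need for further testing of the match with the ELP.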
    • Example results, Italian, GOSPELS algorithm [chart over levels A1, A2, B1, B2, C1, C2; series: Rating, Known words, Words; left axis 0 to 4500, right axis 0.00 to 40.00; visible data labels: 35.72, 34.09, 31.88, 25.51, 23.94, 12.66 (their assignment to levels is not recoverable from the transcript)]
    • Solution to step 3 (three-language results)
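The slide gives no detail on how results are split across the three languages. A minimal sketch of one plausible reading follows: each document carries a language and an ELP level, and a user sees, per language, only documents at or below their own level in that language. The record layout, the sample data, and the "at or below" rule are assumptions for illustration.

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]
RANK = {level: i for i, level in enumerate(LEVELS)}

# Hypothetical classified documents and user profile.
DOCS = [
    {"title": "Weather report", "lang": "en", "level": "B1"},
    {"title": "Legal commentary", "lang": "en", "level": "C2"},
    {"title": "Ricetta semplice", "lang": "it", "level": "A2"},
    {"title": "Zeitungsartikel", "lang": "de", "level": "C1"},
]
PROFILE = {"en": "C1", "it": "A2", "de": "B2"}


def results_for_user(docs, profile):
    """Group document titles by language, keeping only those whose
    level does not exceed the user's level in that language."""
    grouped = {lang: [] for lang in profile}
    for doc in docs:
        user_level = profile.get(doc["lang"])
        if user_level is not None and RANK[doc["level"]] <= RANK[user_level]:
            grouped[doc["lang"]].append(doc["title"])
    return grouped


if __name__ == "__main__":
    print(results_for_user(DOCS, PROFILE))
```

With this profile the English list keeps only the B1 document, the Italian list keeps the A2 document, and the German list is empty because the only German document is rated above B2.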
    • Prototype [architecture diagram; labelled components: Apache Nutch 1.1, Apache Solr 1.4, LanguageLevel plug-in, Apache Lucene, Indexer, Searcher, Crawler, TreeTagger, Wiktionary, Internet, Web GUI, “unibz.org”, J2EE, Google Translator API, Apache Tomcat 6.0, DB PostgreSQL 8.4.4, User Profile, User, Arch Linux 2010.05]
    • Conclusions and possible extensions
      ● The prototype is stable and seems to work well.
        ● Further testing is required to improve and tune the algorithm.
        ● Further testing is required to improve the matching with the ELP.
      ● The architecture can easily support other languages.
        ● It needs the frequency of words in the new language.
        ● It needs a PoS tagger for the new language.
      ● The prototype can easily be modified to become an additional function of an existing digital library.
        ● It has to be embedded in the indexer.
    • Thank you / Danke / Grazie. Questions? Demo?