ALOA - A Web Services Driven Framework for Automatic Learning Object Annotation

Talk given at the Metadata 2.0 workshop in Leuven on Feb 7, 2008.

Published in: Technology, Education

  1. Mohamed Amine Chatti, Informatik 5, RWTH Aachen, Germany. PROLEARN Network of Excellence. ALOA – A Web Services Driven Framework for Automatic Learning Object Annotation
  2. Agenda
       • Why Automatic Metadata Generation?
       • AMG v.1
       • AMG v.2 – SAmgI
       • ALOA
       • ALOA and AMG
       • Conclusion and Future Work
  3. Metadata
       • Metadata is crucial for search, access, sharing, and reuse.
       • Dealing with metadata cannot be a human task (Duval and Hodgins, 2004)
           • Metadata standards are complex (e.g. 9 LOM categories and 45 records at LOM level two)
           • The benefit is not immediately appreciated
           • Metadata creators are too expensive to employ
           • Tools are not user friendly (“electronic forms must die”)
           • Hence the need for Automatic Metadata Generation (AMG)
  4. Automatic Approach
       • Use information about the learning object (LO) and its context to extract or generate its metadata.
       • 4 aspects of AMG (Cardinaels et al., 2005)
           • Content analysis (the LO itself, e.g. keyword, language)
           • Context analysis (the environment the LO is stored or used in, e.g. LMS)
           • Usage analysis (e.g. time spent reading a document)
           • Structure analysis (relationships among LOs)
  5. AMG v.1
       • AMG at KUL (Cardinaels et al., 2005; Ochoa et al., 2005)
  6. AMG v.1 Limitations
       • It was an application (Java-based)
       • No support for different languages
       • Not possible to have a metadata subset as a result
       • Not flexible and extensible
       • Not really interoperable between platforms
  7. AMG v.2
       • Federated AMG
       • Simple AMG Interface (SAmgI) (Meire et al., 2007)
       • Main Design Goals:
           • Extensibility – Pluggability
           • Interoperability (Service oriented)
  8. AMG v.2 Extensibility
       • ObjectBasedGenerators based on the Factory design pattern
       • Problem: adding a generator means checking out the source code, then recompiling and rebuilding the whole application
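The extensibility problem named on this slide can be illustrated with a minimal sketch of a factory. This is not the AMG code: the class names, the `create_generator` function, and the returned fields are hypothetical, and Python is used only for illustration. The point is that the mapping from object type to generator lives in source code.

```python
from abc import ABC, abstractmethod


class ObjectBasedGenerator(ABC):
    """Hypothetical base class; the real AMG interface is not shown in the slides."""

    @abstractmethod
    def generate(self, content: str) -> dict:
        ...


class HtmlGenerator(ObjectBasedGenerator):
    def generate(self, content: str) -> dict:
        return {"format": "text/html", "size": len(content)}


class PdfGenerator(ObjectBasedGenerator):
    def generate(self, content: str) -> dict:
        return {"format": "application/pdf", "size": len(content)}


def create_generator(mime_type: str) -> ObjectBasedGenerator:
    # The supported types are hard-coded here, so supporting a new type
    # means editing this function and rebuilding the application.
    if mime_type == "text/html":
        return HtmlGenerator()
    if mime_type == "application/pdf":
        return PdfGenerator()
    raise ValueError(f"no generator for {mime_type}")
```

Under this design, a request for an unsupported type such as `video/mp4` fails until someone changes and redeploys the factory, which is the limitation the slide points at.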
  9. AMG v.2 Interoperability
       • Federated AMG Engine - SAmgI installations / service endpoints
       • Problem: some programming required (SAmgI WSDL specification, XML schemas, etc.)
  10. ALOA
       • A Framework for LOM-based Automatic LO Annotation
       • Service Oriented Architecture (SOA) / Web Services
       • Main focus on flexibility and extensibility
  11. ALOA Core Engine
       • Indexer, which performs these actions:
           • reads all configurations in the properties file (i.e. available extractors and generators, priority of each generator, maximum generated values)
           • accesses the LO as an array of bytes
           • detects the MIME type of the LO
           • looks for the available extractor for this particular MIME type
           • extracts the content and the embedded properties of the LO
           • contacts the available generators
           • resolves conflicts
           • translates the generated metadata into the required languages
           • returns the generation result to the Web Service stub
       • ConflictResolver
           • considers the priorities of the generators
       • Translator
           • uses Google Translate as its translation service
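The conflict-resolution step above can be sketched as follows. This is a minimal illustration and not ALOA's implementation: the function name, the data shapes, the rule that a larger number means higher priority, and the reading of "maximum generated values" as a per-element cap are all assumptions.

```python
from collections import defaultdict


def resolve_conflicts(candidates, priorities, max_values):
    """Merge generator outputs, preferring higher-priority generators.

    candidates: {generator_name: {element: [values]}}
    priorities: {generator_name: int}, larger means more trusted (assumed)
    max_values: cap on the number of values kept per element (assumed)
    """
    merged = defaultdict(list)
    # Visit generators from highest to lowest priority.
    ordered = sorted(candidates, key=lambda g: priorities.get(g, 0), reverse=True)
    for generator in ordered:
        for element, values in candidates[generator].items():
            for value in values:
                # Skip duplicates and stop adding once the cap is reached.
                if value not in merged[element] and len(merged[element]) < max_values:
                    merged[element].append(value)
    return dict(merged)
```

With two hypothetical generators that both propose keywords, the values of the higher-priority generator come first, and lower-priority values only fill the remaining slots up to the cap.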
  12. ALOA Components
       • Extractors
           • extract content information and embedded properties from LOs
           • only one extractor for each LO MIME type
           • html extractor (Jericho library)
           • pdf extractor (PDFBox library)
           • word extractor (Apache POI library)
           • ppt extractor (Apache POI library)
       • Generators
           • use the output of the extractors to generate one or more parts of the metadata
           • text/data mining libraries (e.g. Yahoo! Term Extraction, Tagthe, Topicalizer, LingPipe, Balie, Classifier4J)
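The "only one extractor for each MIME type" rule can be sketched as a small registry. The class, its methods, and the stand-in plain-text extractor are hypothetical names chosen for illustration; the slides do not show how ALOA stores its extractors.

```python
class ExtractorRegistry:
    """Maps each MIME type to exactly one extractor callable (hypothetical API)."""

    def __init__(self):
        self._extractors = {}

    def register(self, mime_type, extractor):
        # Enforce the one-extractor-per-MIME-type rule.
        if mime_type in self._extractors:
            raise ValueError(f"an extractor for {mime_type} is already registered")
        self._extractors[mime_type] = extractor

    def extract(self, mime_type, data: bytes):
        try:
            extractor = self._extractors[mime_type]
        except KeyError:
            raise LookupError(f"no extractor for {mime_type}") from None
        return extractor(data)


def plain_text_extractor(data: bytes) -> dict:
    # Stand-in extractor: decodes the bytes and reports no embedded properties.
    return {"content": data.decode("utf-8"), "properties": {}}
```

A generator would then receive the returned dictionary (content plus embedded properties) rather than the raw bytes of the LO.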
  13. ALOA User Interface
       • Based on the ALOA Web Services API
       • Automatically generates metadata from online LOs (html, plain text, word, ppt, pdf)
       • Parameters
           • URL location of the LO
           • Target metadata languages (English, German, Arabic, French, Spanish, Korean)
           • Subset of the generated metadata
           • Output format (LOM XML, HTML, LOM Editor)
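The four parameters listed on this slide can be modelled as a small request object. This is a sketch under stated assumptions: the class name, field names, the ISO 639-1 language codes, and the output format identifiers are invented here and are not taken from the ALOA API.

```python
from dataclasses import dataclass, field

# Assumed identifiers for the six languages and three formats on the slide.
SUPPORTED_LANGUAGES = {"en", "de", "ar", "fr", "es", "ko"}
OUTPUT_FORMATS = {"lom-xml", "html", "lom-editor"}


@dataclass
class GenerationRequest:
    url: str
    languages: list = field(default_factory=lambda: ["en"])
    subset: list = field(default_factory=list)  # empty means "all elements"
    output_format: str = "lom-xml"

    def __post_init__(self):
        unknown = set(self.languages) - SUPPORTED_LANGUAGES
        if unknown:
            raise ValueError(f"unsupported languages: {sorted(unknown)}")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unsupported output format: {self.output_format}")


def select_subset(metadata: dict, subset: list) -> dict:
    # Keep only the requested elements; an empty subset keeps everything.
    if not subset:
        return dict(metadata)
    return {key: value for key, value in metadata.items() if key in subset}
```

Returning only a subset of the metadata addresses one of the AMG v.1 limitations listed earlier in the deck.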
  14. ALOA Configuration Management Interface (CMI)
       • Makes it easy to plug in new components (extractors and generators), for instance:
           • Extractor for multimedia LOs (e.g. audio, video, image, flash)
           • Generator for a specific context (e.g. LMS)
       • The components can be deployed on different machines or on different application servers
       • Once deployed, a component can be plugged into ALOA by just giving the address of the component service
       • The ALOA core engine validates the component and adds it to the component list in the properties file
       • Dynamic addition at run time; no need to recompile and rebuild the system
       • The ALOA CMI also manages the priorities of the generators and defines the maximum number of generated values (used by the ALOA core engine)
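Plugging in a component by address can be sketched as follows. The class, the key naming in the generated properties text, and the validation rule are assumptions: the slides do not say what ALOA's validation covers, so this sketch only checks that the address is a well-formed http(s) URL.

```python
from urllib.parse import urlparse


class ComponentList:
    """In-memory stand-in for the component list kept in the properties file."""

    def __init__(self):
        self.generators = {}  # address -> priority

    def plug_in(self, address: str, priority: int = 0) -> None:
        parsed = urlparse(address)
        # Assumed validation: a syntactic check only, not a call to the service.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"invalid component address: {address!r}")
        self.generators[address] = priority

    def to_properties(self) -> str:
        # Hypothetical key naming, one line per generator, highest priority first.
        ordered = sorted(self.generators.items(), key=lambda item: -item[1])
        return "\n".join(
            f"generator.{index}={address};priority={priority}"
            for index, (address, priority) in enumerate(ordered, start=1)
        )
```

Because the list is data rather than code, a new generator becomes available as soon as its address is added, which is the run-time extensibility the slide contrasts with the rebuild required in AMG v.2.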
  15. ALOA and AMG
       • ALOA adopts a slightly modified version of the SAmgI WSDL specification
       • New methods: getLanguages, setLanguages
       • Modified method: getMetadata
       • Web Services-based interactions between ALOA and AMG are possible
       • ALOA as a new SAmgI installation used by the federated AMG engine
       • AMG as a new component (i.e. extractor or generator) of ALOA
  16. Conclusion
       • ALOA – A framework for LOM-based automatic metadata generation
       • ALOA already implements different components (i.e. extractors and generators)
       • ALOA already generates LOM from different types of LOs (html, plain text, pdf, ppt, word)
       • Primary focus on flexibility and extensibility of the framework
       • SOA-based architecture enabling new components to be easily plugged into the basic system
       • ALOA provides a public Web Services API for third party applications
  17. Future Work
       • Interactions between ALOA and AMG
       • Extension with more extractors and generators based on other text/data mining techniques
       • Look at model transformation techniques to support other metadata schemas (e.g. DC, MPEG)
       • Further research on the quality of automatically generated metadata
       • Combination of automatic metadata generation with a bottom-up approach (e.g. Web 2.0 social tagging)
  18. Thank You!