Requirements for Processing Datasets for Recommender Systems

Preliminary Experiences from Three Case Studies

Giannis Stoitsis
University of Alcala, Spain
Agro-Know Technologies, Greece

RecSys Challenge 2012, Dublin
the learning case
• technology-enhanced learning investigates how
  information and communication technologies can
  be used to support learning and teaching, and
  competence development throughout life.
• various levels/contexts
  – school
  – higher education and research
  – vocational education and training
  – adult education
recommend resources in Moodle
recommend resources in a learning portal
handling multiple, diverse sets & streams
• various types of social data
• different schemas and formats
• multiple languages and dimensions

[Figure panels: "Single criteria" and "Multi-criteria"]
why?
• support various usage and recommendation
  scenarios
• combining data from various sources may
  improve how recommenders work in
  education
  – bigger data
  – federated recommender systems
  – open science platform
a European social data infrastructure for learning

[Diagram labels: …portals…, each with metadata and social data behind an API; aggregation of metadata, social and usage data; federated recommendation services; resolution services; anonymised social data; metadata per URI]
challenges
• define a common metadata schema
• harvest/crawl social data
• transform each social data schema
• URI resolution
• scalability
• anonymised approach
• develop item-based, non-personalized
  algorithms that can perform well
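
The schema transformation and anonymisation challenges above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `SocialRecord` fields, the two portal layouts in `FIELD_MAPS`, and the salted-hash pseudonym are hypothetical and are not taken from the infrastructure described in the slides.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SocialRecord:
    """Hypothetical common schema for one piece of social data."""
    item_uri: str
    user_id: str   # anonymised pseudonym, never the raw identifier
    kind: str      # e.g. "rating", "tag", "review"
    value: str
    source: str


def anonymise(raw_user_id: str, salt: str) -> str:
    """Replace a raw user identifier with a salted SHA-256 pseudonym."""
    digest = hashlib.sha256((salt + raw_user_id).encode("utf-8")).hexdigest()
    return digest[:16]


# Hypothetical per-portal mappings: common field name -> portal field name.
FIELD_MAPS = {
    "portal_a": {"item_uri": "resource_url", "user_id": "username",
                 "kind": "type", "value": "content"},
    "portal_b": {"item_uri": "uri", "user_id": "member",
                 "kind": "activity", "value": "payload"},
}


def to_common(record: dict, source: str, salt: str) -> SocialRecord:
    """Map one portal-specific record onto the common schema."""
    m = FIELD_MAPS[source]
    return SocialRecord(
        # Light normalisation so the same resource gets the same key.
        item_uri=record[m["item_uri"]].strip().rstrip("/"),
        user_id=anonymise(record[m["user_id"]], salt),
        kind=record[m["kind"]],
        value=str(record[m["value"]]),
        source=source,
    )
```

With a shared salt, the same user appearing in two portals maps to the same pseudonym, which keeps records joinable without storing the raw identifier. Whether cross-portal linking is desirable at all is a policy decision this sketch does not settle.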
our open science case study
web app for testing neighborhood-based recommendation algorithms with multi-criteria rating dataset

[Diagram: a RecSys researcher/developer ("I need more!!!") and the web app functions: login; import dataset (sql, csv, xml); create dataset; prepare dataset; transform dataset; refine data; data characteristics; visualize dataset; visualize results; export results; export data (sql, csv)]
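
As an illustration of the "data characteristics" step named in the diagram, here is a minimal sketch that summarises an imported ratings file. The CSV layout (columns `user`, `item`, `rating`) and the choice of statistics are assumptions; the slides do not say which characteristics the web app reports.

```python
import csv
import io


def dataset_characteristics(csv_text: str) -> dict:
    """Summarise a ratings CSV with assumed columns user,item,rating."""
    reader = csv.DictReader(io.StringIO(csv_text))
    users, items = set(), set()
    count, total = 0, 0.0
    for row in reader:
        users.add(row["user"])
        items.add(row["item"])
        total += float(row["rating"])
        count += 1
    possible = len(users) * len(items)
    return {
        "users": len(users),
        "items": len(items),
        "ratings": count,
        "mean_rating": total / count if count else 0.0,
        # Share of the user-item matrix that has no rating.
        "sparsity": 1 - count / possible if possible else 0.0,
    }


sample = "user,item,rating\nu1,i1,4\nu1,i2,2\nu2,i1,5\nu2,i3,3\n"
stats = dataset_characteristics(sample)
```

Sparsity matters for neighborhood-based algorithms because two users can only be compared on the items both have rated.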
architecture

[Diagram, top to bottom:
 Web UI; Developers
 API
 Components: Import, Refine and transform, Visualize, Prepare/process, Evaluate
 API
 Cloud/Grid infra: Monte Carlo Simulator, Social Data (×3), Recommender services]
experience from Mendeley case
experience from multi-criteria rating dataset from a teachers portal
e.g. integration in classroom, relevance to topics, ability to help students learn

[Figure panels: "Size of the neighborhood" and "Correlation Weight Threshold value"]
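
The two parameters named in the figure, neighborhood size and correlation weight threshold, can be made concrete with a minimal sketch of neighborhood-based prediction over multi-criteria ratings. This is an illustration under assumptions, not the algorithm evaluated in the case study: user similarity is taken as the average of the per-criterion Pearson correlations, the overall rating as the mean of the criteria, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean

# user -> item -> tuple of criterion ratings
Ratings = dict[str, dict[str, tuple[float, ...]]]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if undefined."""
    if len(xs) < 2:
        return 0.0
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0


def similarity(ratings: Ratings, u: str, v: str) -> float:
    """Average the per-criterion correlations over co-rated items."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    n_criteria = len(ratings[u][common[0]])
    return mean(
        pearson([ratings[u][i][c] for i in common],
                [ratings[v][i][c] for i in common])
        for c in range(n_criteria)
    )


def user_mean(ratings: Ratings, u: str) -> float:
    """Mean overall rating of a user (overall = mean of the criteria)."""
    return mean(mean(r) for r in ratings[u].values())


def predict(ratings: Ratings, user: str, item: str,
            k: int = 2, threshold: float = 0.0) -> float:
    """Predict the overall rating from at most k neighbours whose
    similarity exceeds the correlation weight threshold."""
    candidates = [(similarity(ratings, user, v), v)
                  for v in ratings if v != user and item in ratings[v]]
    neighbours = sorted((s, v) for s, v in candidates if s > threshold)[-k:]
    base = user_mean(ratings, user)
    den = sum(abs(s) for s, _ in neighbours)
    if den == 0:
        return base  # no usable neighbour: fall back to the user's mean
    num = sum(s * (mean(ratings[v][item]) - user_mean(ratings, v))
              for s, v in neighbours)
    return base + num / den


ratings = {
    "ann": {"r1": (5, 4, 5), "r2": (2, 1, 2), "r3": (4, 4, 3)},
    "bob": {"r1": (4, 5, 5), "r2": (1, 2, 1), "r3": (5, 4, 4), "r4": (5, 5, 4)},
    "cat": {"r1": (1, 2, 1), "r2": (5, 4, 5), "r4": (1, 1, 2)},
}
prediction = predict(ratings, "ann", "r4", k=2, threshold=0.0)
```

Raising `threshold` discards weakly correlated neighbours and raising `k` admits more of them, so the two parameters trade coverage against the quality of each neighbour.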
DEMO

Editor's Notes

  1. smirti.bhagat@technicolor.com
  2. Examples of using the Recommendation API: recommend(itemURI, limit_of_resources), recommend(itemURI, user_tags). Examples of the social data API provided by the aggregator: get_tags(itemURI), get_reviews(itemURI), etc.
  3. Here we present the architecture of such an environment and the proposed software stack. Monte Carlo will be a separate component that can also run on the Grid and that will be provided through an API. The API will be documented.
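
The API calls named in note 2 can be sketched as a toy in-memory implementation. Only the names `recommend`, `get_tags` and `get_reviews` and their arguments come from the note; the `Aggregator` class, the `add_tag`/`add_review`/`item_uris` helpers, and the tag-overlap (Jaccard) ranking are assumptions chosen to show one possible item-based, non-personalized recommender, not the behaviour of the real services.

```python
from collections import defaultdict


class Aggregator:
    """Toy in-memory stand-in for the aggregator's social data API."""

    def __init__(self):
        self._tags = defaultdict(set)
        self._reviews = defaultdict(list)

    def add_tag(self, item_uri, tag):          # hypothetical helper
        self._tags[item_uri].add(tag.lower())

    def add_review(self, item_uri, text):      # hypothetical helper
        self._reviews[item_uri].append(text)

    def get_tags(self, item_uri):
        return sorted(self._tags.get(item_uri, set()))

    def get_reviews(self, item_uri):
        return list(self._reviews.get(item_uri, []))

    def item_uris(self):                       # hypothetical helper
        return sorted(self._tags)


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(aggregator, item_uri, limit_of_resources):
    """Rank other items by tag overlap with the given item.

    No user profile is involved, so the result is the same for everyone.
    """
    target = set(aggregator.get_tags(item_uri))
    scored = [(jaccard(target, set(aggregator.get_tags(other))), other)
              for other in aggregator.item_uris() if other != item_uri]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [uri for score, uri in ranked if score > 0][:limit_of_resources]


agg = Aggregator()
for uri, tags in {"a": ["math", "algebra"], "b": ["math", "algebra", "school"],
                  "c": ["math"], "d": ["history"]}.items():
    for tag in tags:
        agg.add_tag(uri, tag)
agg.add_review("a", "useful in class")
top = recommend(agg, "a", 2)
```

The second form in the note, recommend(itemURI, user_tags), could reuse the same ranking with the caller's tags added to the target set; it is left out here because the note does not say how the two inputs are combined.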