MyMediaLite is a lightweight recommender system library written in C# that provides functionality for rating prediction, item recommendation from implicit feedback, and algorithm testing. It is designed to be simple, free, scalable, and well-documented. The library allows users to easily implement their own recommendation methods by defining model data structures and writing train and predict methods.
MyMediaLite
1. MyMediaLite
a lightweight, multi-purpose library of recommender system algorithms
Zeno Gantner
University of Hildesheim
February 5, 2011
Zeno Gantner, University of Hildesheim: MyMediaLite Recommender System Library — http://ismll.de/mymedialite 1 / 16
2. Introduction
What are Recommender Systems?
3. Introduction
MyMediaLite: Recommender System Algorithm Library
functionality
rating prediction
item recommendation from implicit feedback
algorithm testbed
target groups
recommender system researchers
educators and students
application developers
why use it?
simple
free
scalable
well-documented
choice
misc info
written in C#, runs on Mono
GNU General Public License (GPL)
regular releases (1 or 2 per month)
4. Using MyMediaLite
Data Flow
[Diagram: data flow between hyperparameters, interaction data, user/item attributes, Recommender, Model, predictions, and disk]
5. Using MyMediaLite
Methods Implemented in MyMediaLite
rating prediction
averages: global, user, item
linear baseline method by Koren and Bell
frequency-weighted Slope One
k-nearest neighbor (kNN):
user or item similarities, different similarity measures
collaborative or attribute-/content-based
(biased) matrix factorization
item prediction from implicit feedback
random
most popular item
linear content-based model optimized for BPR (BPR-Linear)
support-vector machine using item attributes
k-nearest neighbor (kNN)
weighted regularized matrix factorization (WR-MF)
matrix factorization optimized for BPR (BPR-MF)
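To make one of the listed rating prediction methods concrete, here is a minimal sketch of frequency-weighted Slope One in plain Python. It is not MyMediaLite's C# implementation; the function names, the dict-of-dicts data layout, and the toy ratings are all assumptions made for illustration.

```python
from collections import defaultdict


def train_slope_one(ratings):
    """ratings: dict user -> dict item -> rating (assumed layout).

    Returns the average rating difference and the co-rating count
    for every ordered item pair (i, j).
    """
    diff_sum = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[(i, j)] += r_i - r_j
                    count[(i, j)] += 1
    avg_diff = {pair: diff_sum[pair] / count[pair] for pair in diff_sum}
    return avg_diff, dict(count)


def predict_slope_one(ratings, avg_diff, count, user, item):
    """Weight each item-pair deviation by how often the pair was co-rated."""
    numerator = 0.0
    denominator = 0
    for j, r_j in ratings.get(user, {}).items():
        n = count.get((item, j), 0)
        if j == item or n == 0:
            continue
        numerator += (avg_diff[(item, j)] + r_j) * n
        denominator += n
    if denominator == 0:
        return None  # no co-rated items, so no prediction is possible
    return numerator / denominator


# Toy data, made up for this sketch.
example_ratings = {
    "alice": {"a": 5, "b": 3, "c": 2},
    "bob": {"a": 3, "b": 4},
    "carol": {"b": 2, "c": 5},
}
avg_diff, count = train_slope_one(example_ratings)
carol_a = predict_slope_one(example_ratings, avg_diff, count, "carol", "a")
```

In the toy data, item pair (a, b) is co-rated twice and (a, c) once, so the (a, b) deviation counts double when predicting carol's rating for item a.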
6. Using MyMediaLite
Command-Line Tools
one for each task: rating prediction, item recommendation
simple text format: CSV
pick method and parameters using command-line arguments
evaluate, store/load models
http://ismll.de/mymedialite/documentation/command_line.html
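As a rough illustration of what a simple CSV rating file could look like to a program, the sketch below parses rows in plain Python. The column order (user, item, rating) and the function name are assumptions; the documentation page linked above is the authority on the format the command-line tools actually accept.

```python
import csv
import io


def read_ratings(lines):
    """Parse 'user,item,rating' rows (assumed layout) into tuples."""
    triples = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        user, item, rating = row[0], row[1], float(row[2])
        triples.append((user, item, rating))
    return triples


# An in-memory stand-in for a rating file, made up for this sketch.
sample = io.StringIO("1,10,4\n1,20,3.5\n2,10,5\n")
parsed = read_ratings(sample)
```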
7. Using MyMediaLite
Embedding MyMediaLite: C#
using System;
using MyMediaLite.Data;
using MyMediaLite.Eval;
using MyMediaLite.IO;
using MyMediaLite.ItemRecommendation;

public class Example
{
    public static void Main(string[] args)
    {
        // load the data
        var user_mapping = new EntityMapping();
        var item_mapping = new EntityMapping();
        var training_data = ItemRecommenderData.Read(args[0], user_mapping, item_mapping);
        var relevant_items = item_mapping.InternalIDs;
        var test_data = ItemRecommenderData.Read(args[1], user_mapping, item_mapping);

        // set up the recommender
        var recommender = new MostPopular();
        recommender.SetCollaborativeData(training_data);
        recommender.Train();

        // measure the accuracy on the test data set
        var results = ItemPredictionEval.Evaluate(recommender, test_data, training_data, relevant_items);
        Console.WriteLine("prec@5={0}", results["prec5"]);

        // make a prediction for a certain user and item
        Console.WriteLine(recommender.Predict(user_mapping.ToInternalID(1), item_mapping.ToInternalID(1)));
    }
}
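The example above reports prec@5. As a sketch of what precision at k measures, here is a plain Python version. The function name and the toy lists are assumptions, and the library's own evaluation may differ in details that the slides do not show, such as how items already seen in training are handled.

```python
def precision_at_k(ranked_items, relevant_items, k=5):
    """Fraction of the top-k ranked items that are relevant for the user."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k


# Toy ranking and ground truth, made up for this sketch.
ranking = [10, 20, 30, 40, 50, 60]
relevant = {20, 50, 60}
score = precision_at_k(ranking, relevant, k=5)
```

Item 60 is relevant but ranked sixth, so it does not count towards the top five.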
8. Using MyMediaLite
Embedding MyMediaLite: Python
#!/usr/bin/env ipy
import clr
clr.AddReference("MyMediaLite.dll")
from MyMediaLite import *

# load the data
user_mapping = Data.EntityMapping()
item_mapping = Data.EntityMapping()
train_data = IO.ItemRecommenderData.Read("u1.base", user_mapping, item_mapping)
relevant_items = item_mapping.InternalIDs
test_data = IO.ItemRecommenderData.Read("u1.test", user_mapping, item_mapping)

# set up the recommender
recommender = ItemRecommendation.MostPopular()
recommender.SetCollaborativeData(train_data)
recommender.Train()

# measure the accuracy on the test data set
print Eval.ItemPredictionEval.Evaluate(recommender, test_data, train_data, relevant_items)

# make a prediction for a certain user and item
print recommender.Predict(user_mapping.ToInternalID(1), item_mapping.ToInternalID(1))
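The examples pass IDs from the data files through an entity mapping before calling Predict. The slides do not show how EntityMapping works internally; the plain Python sketch below shows one common way such a mapping could behave, assigning dense integer IDs in order of first appearance. The class and method names are hypothetical and are not the library's API.

```python
class EntityMappingSketch:
    """Hypothetical stand-in: maps external IDs to dense internal integers."""

    def __init__(self):
        self._to_internal = {}
        self._to_original = []

    def to_internal_id(self, original_id):
        # Assign the next free integer the first time an ID is seen.
        if original_id not in self._to_internal:
            self._to_internal[original_id] = len(self._to_original)
            self._to_original.append(original_id)
        return self._to_internal[original_id]

    def to_original_id(self, internal_id):
        return self._to_original[internal_id]

    @property
    def internal_ids(self):
        return list(range(len(self._to_original)))


mapping = EntityMappingSketch()
first = mapping.to_internal_id("u42")
second = mapping.to_internal_id("u7")
```

Dense internal IDs let a model index arrays directly, while the file can use arbitrary identifiers.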
9. Using MyMediaLite
Embedding MyMediaLite: Ruby
#!/usr/bin/env ir
require 'MyMediaLite'

min_rating = 1
max_rating = 5

# load the data
user_mapping = MyMediaLite::Data::EntityMapping.new()
item_mapping = MyMediaLite::Data::EntityMapping.new()
train_data = MyMediaLite::IO::RatingPredictionData.Read("u1.base", min_rating, max_rating,
                                                        user_mapping, item_mapping)
test_data = MyMediaLite::IO::RatingPredictionData.Read("u1.test", min_rating, max_rating,
                                                       user_mapping, item_mapping)

# set up the recommender
recommender = MyMediaLite::RatingPrediction::UserItemBaseline.new()
recommender.MinRating = min_rating
recommender.MaxRating = max_rating
recommender.Ratings = train_data
recommender.Train()

# measure the accuracy on the test data set
eval_results = MyMediaLite::Eval::RatingEval::Evaluate(recommender, test_data)
eval_results.each do |entry|
  puts "#{entry}"
end

# make a prediction for a certain user and item
puts recommender.Predict(user_mapping.ToInternalID(1), item_mapping.ToInternalID(1))
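For readers who want to see what a user/item baseline predictor computes, here is a minimal plain Python sketch: global average plus an item bias plus a user bias, clamped to the rating range, with RMSE as the accuracy measure. The function names, the regularization parameters (defaulting to 0), and the toy data are assumptions; the slides do not show how UserItemBaseline or RatingEval are implemented or which measures they report.

```python
import math
from collections import defaultdict


def train_baseline(ratings, reg_item=0.0, reg_user=0.0):
    """ratings: list of (user, item, rating). Returns (mu, user_bias, item_bias)."""
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Item bias: (regularized) mean deviation from the global average.
    item_sum, item_cnt = defaultdict(float), defaultdict(int)
    for _, i, r in ratings:
        item_sum[i] += r - mu
        item_cnt[i] += 1
    item_bias = {i: item_sum[i] / (reg_item + item_cnt[i]) for i in item_sum}

    # User bias: mean of what is left after removing mu and the item bias.
    user_sum, user_cnt = defaultdict(float), defaultdict(int)
    for u, i, r in ratings:
        user_sum[u] += r - mu - item_bias[i]
        user_cnt[u] += 1
    user_bias = {u: user_sum[u] / (reg_user + user_cnt[u]) for u in user_sum}
    return mu, user_bias, item_bias


def predict_baseline(model, user, item, min_rating=1, max_rating=5):
    mu, user_bias, item_bias = model
    score = mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)
    return max(min_rating, min(max_rating, score))


def rmse(model, test_ratings):
    squared_error = sum((predict_baseline(model, u, i) - r) ** 2
                        for u, i, r in test_ratings)
    return math.sqrt(squared_error / len(test_ratings))


# Toy data, made up for this sketch.
toy = [("u1", "a", 4), ("u1", "b", 2), ("u2", "a", 5), ("u2", "b", 3)]
model = train_baseline(toy)
```

Unknown users or items fall back to a bias of zero, so a completely new pair is predicted as the global average.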
10. Extending MyMediaLite
Roll Your Own Recommendation Method
It’s easy.
for basic functionality
define model data structures
write Train() method
write Predict() method
That’s all!
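The three steps above can be sketched in a few lines. The example is written in plain Python rather than against the library's C# interfaces, so the class name, method names, and data layout are hypothetical; it only illustrates the shape of a method that has model data structures, a train step, and a predict step.

```python
class ItemAverageRecommender:
    """Hypothetical method: predicts each item's average training rating."""

    def __init__(self):
        # Step 1: model data structures.
        self.global_average = 0.0
        self.item_average = {}

    def train(self, ratings):
        # Step 2: fit the model from (user, item, rating) triples.
        sums, counts = {}, {}
        total = 0.0
        for _, item, rating in ratings:
            sums[item] = sums.get(item, 0.0) + rating
            counts[item] = counts.get(item, 0) + 1
            total += rating
        self.global_average = total / len(ratings)
        self.item_average = {i: sums[i] / counts[i] for i in sums}

    def predict(self, user, item):
        # Step 3: score a user/item pair; unseen items get the global average.
        return self.item_average.get(item, self.global_average)


recommender = ItemAverageRecommender()
recommender.train([("u1", "a", 4), ("u2", "a", 2), ("u1", "b", 5)])
```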
11. Extending MyMediaLite
Roll Your Own: Define Model Data Structures
12. Extending MyMediaLite
Roll Your Own: Write Train() Method
13. Extending MyMediaLite
Roll Your Own: Write Predict() Method
14. Extending MyMediaLite
Roll Your Own Recommendation Method
It’s easy.
You do not need to worry about adding the new method to the command-line tools; reflection takes care of that.
advanced functionality
CanPredict() method
load/store models
on-line updates
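The advanced features listed above can be sketched in the same spirit. This plain Python example is hypothetical throughout (class, method names, and the JSON serialization are assumptions, not the library's API); it shows a check for whether a prediction is possible, a model that can be stored and loaded, and an update that folds in one new rating without retraining.

```python
import json


class OnlineItemAverage:
    """Hypothetical item-average model with the advanced features."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def train(self, ratings):
        self.sums, self.counts = {}, {}
        for _, item, rating in ratings:
            self.add_rating(item, rating)

    def add_rating(self, item, rating):
        # On-line update: adjust running totals instead of retraining.
        self.sums[item] = self.sums.get(item, 0.0) + rating
        self.counts[item] = self.counts.get(item, 0) + 1

    def can_predict(self, item):
        # A prediction is only possible for items seen at least once.
        return self.counts.get(item, 0) > 0

    def predict(self, item):
        return self.sums[item] / self.counts[item]

    def dumps(self):
        # Store: serialize to text that could be written to disk.
        # JSON keys are strings, so this sketch assumes string item IDs.
        return json.dumps({"sums": self.sums, "counts": self.counts})

    @classmethod
    def loads(cls, text):
        # Load: rebuild a model from the serialized text.
        data = json.loads(text)
        model = cls()
        model.sums = data["sums"]
        model.counts = data["counts"]
        return model


online_model = OnlineItemAverage()
online_model.train([("u1", "a", 4.0)])
online_model.add_rating("a", 2.0)
restored = OnlineItemAverage.loads(online_model.dumps())
```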
15. Conclusion
MyMediaLite
future work
more methods (contributions welcome ...)
additional scenarios: context-aware recommendation, tags, ...
distributed/parallel computing
Methods now shipped with MyMediaLite were used in the MyMedia field trials (>50,000 users).
acknowledgements
authors: Zeno Gantner, Steffen Rendle, Christoph Freudenthaler
funding by EC FP7 project “Dynamic Personalization of Multimedia” (MyMedia) under grant agreement no. 215006.
feedback, patches, suggestions: Thorsten Angermann, Fu Changhong, Andreas Hoffmann, Artus Krohn-Grimberghe, Christina Lichtenthäler, Damir Logar, Thai-Nghe Nguyen