SlideShare a Scribd company logo
AN EXAMPLE OF
PREDICTIVE ANALYTICS:
BUILDING A
RECOMMENDATION
ENGINE USING PYTHON
A little about me
• Data Scientist at Texas Advanced Computing Center (TACC)
• My Contact: atrivedi@tacc.utexas.edu ; @anurive
• Independent research center at UT Austin
• One of the largest supercomputer center, HIPAA compliant
• ~250 faculty, researchers, students and staff
• We work on providing support to large scale computing problems
1
MOTIVATION
Talk Outline
• Predictive Analytics Vs Recommender Systems
• Introduction to recommender systems
• Types of recommender systems
• Recommender Systems for Pubmed document
• Workflow of TACC’s Recommender Systems
• Live Demo
3
Predictive Analytics
Vs
Recommender Systems
4
Data Analytics
• Mainly three types of Data Analytics:
Descriptive Predictive Prescriptive
1. The simplest class of
analytics
2. Allows you to condense
big data into smaller and
more useful chunks of
information
3. Use it when you need to
understand things at an
aggregate level
4. E,g: Vast majority of the
statistics that we use
(sums, averages, percent
changes etc).
1. It utilizes a variety of
statistical modeling, data
mining, and machine
learning techniques to
study recent and
historical data
2. It allows analysts to make
predictions about the
future.
3. Use Predictive analytics
any time you need to
know something about
the future or fill in the
information that you do
not have.
4. E.g: Sentiment Analysis
1. Prescriptive analytics is a
type of predictive
analytics
2. Prescribe or recommend
an action to end users
3. Helps informed decision
based on data
4. E.g: Recommendation
System
5
Introduction:
Recommender Systems
6
What are Recommendation Systems?
• Recommender System helps match users with item
• Implicit or explicit user feedback or item suggestion
• Different Recommender System designs
 Based on the availability of data or content/context of the data
• Our Recommendation system:
 We try to build a model which recommends Pubmed documents
to users
7
Why Recommendations ?
The world is an over-crowded
place
8
Types of Recommender System
9
Types of Recommender System
Types Pros Cons
Knowledge‐based
(i.e, search)
Deterministic
recommendations,
assured quality,
no cold‐ start
Knowledge engineering effort to
bootstrap,
basically static
Content‐based No community required,
comparison between items
possible
Content descriptions necessary,
cold start for new users
Collaborative No knowledge‐
engineering effort,
serendipity of results
Requires some form of rating
feedback,
cold start for new users and new
items
10
Commonly used Example: User-
based CF
• The recommendation system recommends books to customer C
• Customer B is very close to C (s/he has bought all the books C has
bought). Book 5 is highly recommended for customer C
• Customer D is somewhat close. Book 6 is recommended to a lower
extent
• Customers A and E are not similar at all
11
Paradigms of a Hybrid Recommender
System
Hybrid: combinations of various inputs
and/or composition of different
mechanism
12
Why need Hybrid Model for
recommending Pubmed Documents
• The problem that we are trying to solve
 Users search documents in Pubmed
 Based on their search, create a user profile with relevant
documents
 Recommend documents in context with the search
• The steps needed to solve:
 Search
 Information Retrieval for users based on Content and Context
Recommend documents in context with the search
13
Our Model: Hybrid Recommendation
Engine for Pubmed
Step 1: Content-based filtering on Pubmed documents
 We search documents using query term
 We tokenize each document into a combination of unique terms
 We compare the pubmed documents to a sparse matrix of
documents and terms
Step 2: We weight the query term in the sparse matrix and rank
documents
# We apply Vector Space Model (VSM) to combine Step 1 and Step 2
1414
Our Model: Hybrid Recommendation
Engine
Step 3:
• We filter the weighted/ranked documents using collaborative
filtering/recommendation
• We use a python library python-recsys (by Oscar Celma - Director of
Reseach at Pandora) for recommending Pubmed documents.
• We used Pandas for preprocessing data
• We use Scikit-learn to evaluate our model
• We do two types of recommendation -
➢ Item-based Recommendation
➢ User-based Recommendation
• Several data pre-processing steps are involved in Pubmed
document recommendation.
1515
Vector Space Model (VSM)
• Given:
 A set of Pubmed documents
 N features (terms) describing the documents in the set
• VSM builds an N-dimensional Vector Space.
• Each itemdocument is represented as a point in the Vector Space
• Information Retrieval based on search
 Query: A point in the Vector Space
 We apply TFIDF to the tokenized documents to weight the documents and
convert the documents to vectors
 We compute cosine similarity between the tokenized documents and the query
term
 We select top 3 documents matching our query
1616
Available Recommendation Libraries
LibRec
mrec
Python-recsys
1717
Motivation for using Python-
Recsys library
18
• A python library for building recommendation engines
• Works with Scikit-learn for collaborative,content and hybrid filtering
• Underlying model: K-SVD (good for sparse matrix)
• Open Source: https://github.com/ocelma/python-recsys
Python-Recsys library
19
Code Execution: Demo
We try to show that our Hybrid model (using VSM and python-
recsys combined) works better that using python-recsys library
alone
➢ First, we explain the collaborative filtering technique, using
the python-recsys library on a simple movie data set
➢ Next, we explain our hybrid model for pubmed
recommendation
➢ We run comparative evaluations on the model (RMSE &
MAE). Our model performs better – as it utilizes the
content/context of items
2020
Data Used for Demo
• Movie dataset format: • Pubmed dataset format:
21
UserID MovieID Rating
1 1 5.0
2 1 3.5
3 2 4.0
4 3 5.0
5 3 2.5
6 4 4.5
7 4 4.0
ID User
Name
User
ID
Query
Term
Doc
ID
Doc
Rank
Doc
Name
44 paul 11 genetics 7713442 0.341534906
…
Each_Doc_Token Combined_Docs_Tokens
…. ….
21
Conclusion
◆Pros:
➢ Python-Recsys is very accurate recommendation library
➢The Hybrid model increase the prediction/recommendation
accuracy
➢Great community support
◆Cons:
➢Parallelization is not that great
➢For Big Data, need to use different functions instead of SVD.
22
Acknowledgement
1. Oscar Celma: Python-Recsys - https://github.com/ocelma/python-
recsys
2. Alan Said: Comparative Recommender System Evaluation -
Benchmarking Recommendation Frameworks
3. Pasquale Lops: Semantics-aware Content-based Recommender
23
THANK YOU !
Thanks for your attention.
Questions?
24

More Related Content

What's hot

Recommender systems
Recommender systemsRecommender systems
Recommender systems
Tamer Rezk
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Francesco Casalegno
 
Data Wrangling
Data WranglingData Wrangling
Data Wrangling
Ashwini Kuntamukkala
 
Data analytics introduction
Data analytics introductionData analytics introduction
Data analytics introduction
amiyadash
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender Systems
Enrico Palumbo
 
Algorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at SpotifyAlgorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at Spotify
Chris Johnson
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
Bartlomiej Twardowski
 
Music Recommendation 2018
Music Recommendation 2018Music Recommendation 2018
Music Recommendation 2018
Fabien Gouyon
 
Movie recommendation system using collaborative filtering system
Movie recommendation system using collaborative filtering system Movie recommendation system using collaborative filtering system
Movie recommendation system using collaborative filtering system
Mauryasuraj98
 
Apriori and Eclat algorithm in Association Rule Mining
Apriori and Eclat algorithm in Association Rule MiningApriori and Eclat algorithm in Association Rule Mining
Apriori and Eclat algorithm in Association Rule MiningWan Aezwani Wab
 
Recommendation Systems Basics
Recommendation Systems BasicsRecommendation Systems Basics
Recommendation Systems Basics
Jarin Tasnim Khan
 
Personalized Playlists at Spotify
Personalized Playlists at SpotifyPersonalized Playlists at Spotify
Personalized Playlists at Spotify
Rohan Agrawal
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
Viet-Trung TRAN
 
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Massimo Quadrana
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
Förderverein Technische Fakultät
 
Music Recommendations at Scale with Spark
Music Recommendations at Scale with SparkMusic Recommendations at Scale with Spark
Music Recommendations at Scale with Spark
Chris Johnson
 
Tutorial on sequence aware recommender systems - UMAP 2018
Tutorial on sequence aware recommender systems - UMAP 2018Tutorial on sequence aware recommender systems - UMAP 2018
Tutorial on sequence aware recommender systems - UMAP 2018
Paolo Cremonesi
 
Data Discovery and Visualization
Data Discovery and VisualizationData Discovery and Visualization
Data Discovery and Visualization
Dr. Neil Brittliff
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
Anamta Sayyed
 

What's hot (20)

Recommender systems
Recommender systemsRecommender systems
Recommender systems
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Data Wrangling
Data WranglingData Wrangling
Data Wrangling
 
Data analytics introduction
Data analytics introductionData analytics introduction
Data analytics introduction
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender Systems
 
Algorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at SpotifyAlgorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at Spotify
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
 
Music Recommendation 2018
Music Recommendation 2018Music Recommendation 2018
Music Recommendation 2018
 
Movie recommendation system using collaborative filtering system
Movie recommendation system using collaborative filtering system Movie recommendation system using collaborative filtering system
Movie recommendation system using collaborative filtering system
 
Apriori and Eclat algorithm in Association Rule Mining
Apriori and Eclat algorithm in Association Rule MiningApriori and Eclat algorithm in Association Rule Mining
Apriori and Eclat algorithm in Association Rule Mining
 
Recommendation Systems Basics
Recommendation Systems BasicsRecommendation Systems Basics
Recommendation Systems Basics
 
Personalized Playlists at Spotify
Personalized Playlists at SpotifyPersonalized Playlists at Spotify
Personalized Playlists at Spotify
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
 
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Music Recommendations at Scale with Spark
Music Recommendations at Scale with SparkMusic Recommendations at Scale with Spark
Music Recommendations at Scale with Spark
 
Tutorial on sequence aware recommender systems - UMAP 2018
Tutorial on sequence aware recommender systems - UMAP 2018Tutorial on sequence aware recommender systems - UMAP 2018
Tutorial on sequence aware recommender systems - UMAP 2018
 
Data Discovery and Visualization
Data Discovery and VisualizationData Discovery and Visualization
Data Discovery and Visualization
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
 

Similar to An Example of Predictive Analytics: Building a Recommendation Engine Using Python-Anusua Trivedi

RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
S. Diana Hu
 
Tag based recommender system
Tag based recommender systemTag based recommender system
Tag based recommender system
Karen Li
 
Recsys 2016
Recsys 2016Recsys 2016
Recsys 2016
Mindaugas Zickus
 
Towards effective research recommender systems for repositories
Towards effective research recommender systems for repositoriesTowards effective research recommender systems for repositories
Towards effective research recommender systems for repositories
petrknoth
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Kris Jack
 
Major_Project_Presentaion_B14.pptx
Major_Project_Presentaion_B14.pptxMajor_Project_Presentaion_B14.pptx
Major_Project_Presentaion_B14.pptx
LokeshKumarReddy8
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
Hilary Aben
 
 Towards Reproducible Data Analysis Using Cloud and Container Technologies
 Towards Reproducible Data Analysis Using Cloud and Container Technologies Towards Reproducible Data Analysis Using Cloud and Container Technologies
 Towards Reproducible Data Analysis Using Cloud and Container Technologies
inside-BigData.com
 
Final presentation
Final presentationFinal presentation
Final presentation
Nitish Upreti
 
Filtering content bbased crs
Filtering content bbased crsFiltering content bbased crs
Filtering content bbased crs
Aravindharamanan S
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Simon Hughes
 
Data Science With Python | Python For Data Science | Python Data Science Cour...
Data Science With Python | Python For Data Science | Python Data Science Cour...Data Science With Python | Python For Data Science | Python Data Science Cour...
Data Science With Python | Python For Data Science | Python Data Science Cour...
Simplilearn
 
How Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment AnalysisHow Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment Analysis
CrowdFlower
 
UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...
UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...
UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...
UKSG: connecting the knowledge community
 
Elsevier - Smart Data and Algorithms for the Publishing Industry
Elsevier - Smart Data and Algorithms for the Publishing IndustryElsevier - Smart Data and Algorithms for the Publishing Industry
Elsevier - Smart Data and Algorithms for the Publishing Industry
Antonio Gulli
 
Recommendation engines
Recommendation enginesRecommendation engines
Recommendation enginesGeorgian Micsa
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
Stanley Wang
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data Management
Anita de Waard
 
UKSG webinar - TERMS revisited: developing the combination of electronic reso...
UKSG webinar - TERMS revisited: developing the combination of electronic reso...UKSG webinar - TERMS revisited: developing the combination of electronic reso...
UKSG webinar - TERMS revisited: developing the combination of electronic reso...
UKSG: connecting the knowledge community
 

Similar to An Example of Predictive Analytics: Building a Recommendation Engine Using Python-Anusua Trivedi (20)

RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
Tag based recommender system
Tag based recommender systemTag based recommender system
Tag based recommender system
 
Recsys 2016
Recsys 2016Recsys 2016
Recsys 2016
 
Towards effective research recommender systems for repositories
Towards effective research recommender systems for repositoriesTowards effective research recommender systems for repositories
Towards effective research recommender systems for repositories
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
 
Major_Project_Presentaion_B14.pptx
Major_Project_Presentaion_B14.pptxMajor_Project_Presentaion_B14.pptx
Major_Project_Presentaion_B14.pptx
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
 
 Towards Reproducible Data Analysis Using Cloud and Container Technologies
 Towards Reproducible Data Analysis Using Cloud and Container Technologies Towards Reproducible Data Analysis Using Cloud and Container Technologies
 Towards Reproducible Data Analysis Using Cloud and Container Technologies
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Filtering content bbased crs
Filtering content bbased crsFiltering content bbased crs
Filtering content bbased crs
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
 
Data Science With Python | Python For Data Science | Python Data Science Cour...
Data Science With Python | Python For Data Science | Python Data Science Cour...Data Science With Python | Python For Data Science | Python Data Science Cour...
Data Science With Python | Python For Data Science | Python Data Science Cour...
 
How Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment AnalysisHow Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment Analysis
 
UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...
UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...
UKSG 2018 Breakout - TERMS redefined: developing the combination of electroni...
 
Elsevier - Smart Data and Algorithms for the Publishing Industry
Elsevier - Smart Data and Algorithms for the Publishing IndustryElsevier - Smart Data and Algorithms for the Publishing Industry
Elsevier - Smart Data and Algorithms for the Publishing Industry
 
Recommendation engines
Recommendation enginesRecommendation engines
Recommendation engines
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data Management
 
UKSG webinar - TERMS revisited: developing the combination of electronic reso...
UKSG webinar - TERMS revisited: developing the combination of electronic reso...UKSG webinar - TERMS revisited: developing the combination of electronic reso...
UKSG webinar - TERMS revisited: developing the combination of electronic reso...
 

More from PyData

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
PyData
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
PyData
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
PyData
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
PyData
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
PyData
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
PyData
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
PyData
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
PyData
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
PyData
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
PyData
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
PyData
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
PyData
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
PyData
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
PyData
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
PyData
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
PyData
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
PyData
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
PyData
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
PyData
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
PyData
 

More from PyData (20)

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
 

Recently uploaded

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
eddie19851
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
mzpolocfi
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
u86oixdj
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 

Recently uploaded (20)

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 

An Example of Predictive Analytics: Building a Recommendation Engine Using Python-Anusua Trivedi

  • 1. AN EXAMPLE OF PREDICTIVE ANALYTICS: BUILDING A RECOMMENDATION ENGINE USING PYTHON
  • 2. A little about me • Data Scientist at Texas Advanced Computing Center (TACC) • My Contact: atrivedi@tacc.utexas.edu ; @anurive • Independent research center at UT Austin • One of the largest supercomputer center, HIPAA compliant • ~250 faculty, researchers, students and staff • We work on providing support to large scale computing problems 1
  • 4. Talk Outline • Predictive Analytics Vs Recommender Systems • Introduction to recommender systems • Types of recommender systems • Recommender Systems for Pubmed document • Workflow of TACC’s Recommender Systems • Live Demo 3
  • 6. Data Analytics • Mainly three types of Data Analytics: Descriptive Predictive Prescriptive 1. The simplest class of analytics 2. Allows you to condense big data into smaller and more useful chunks of information 3. Use it when you need to understand things at an aggregate level 4. E,g: Vast majority of the statistics that we use (sums, averages, percent changes etc). 1. It utilizes a variety of statistical modeling, data mining, and machine learning techniques to study recent and historical data 2. It allows analysts to make predictions about the future. 3. Use Predictive analytics any time you need to know something about the future or fill in the information that you do not have. 4. E.g: Sentiment Analysis 1. Prescriptive analytics is a type of predictive analytics 2. Prescribe or recommend an action to end users 3. Helps informed decision based on data 4. E.g: Recommendation System 5
  • 8. What are Recommendation Systems? • Recommender System helps match users with item • Implicit or explicit user feedback or item suggestion • Different Recommender System designs  Based on the availability of data or content/context of the data • Our Recommendation system:  We try to build a model which recommends Pubmed documents to users 7
  • 9. Why Recommendations ? The world is an over-crowded place 8
  • 11. Types of Recommender System Types Pros Cons Knowledge‐based (i.e, search) Deterministic recommendations, assured quality, no cold‐ start Knowledge engineering effort to bootstrap, basically static Content‐based No community required, comparison between items possible Content descriptions necessary, cold start for new users Collaborative No knowledge‐ engineering effort, serendipity of results Requires some form of rating feedback, cold start for new users and new items 10
  • 12. Commonly used Example: User- based CF • The recommendation system recommends books to customer C • Customer B is very close to C (s/he has bought all the books C has bought). Book 5 is highly recommended for customer C • Customer D is somewhat close. Book 6 is recommended to a lower extent • Customers A and E are not similar at all 11
  • 13. Paradigms of a Hybrid Recommender System Hybrid: combinations of various inputs and/or composition of different mechanism 12
  • 14. Why need Hybrid Model for recommending Pubmed Documents • The problem that we are trying to solve  Users search documents in Pubmed  Based on their search, create a user profile with relevant documents  Recommend documents in context with the search • The steps needed to solve:  Search  Information Retrieval for users based on Content and Context Recommend documents in context with the search 13
  • 15. Our Model: Hybrid Recommendation Engine for Pubmed Step 1: Content-based filtering on Pubmed documents  We search documents using query term  We tokenize each document into a combination of unique terms  We compare the pubmed documents to a sparse matrix of documents and terms Step 2: We weight the query term in the sparse matrix and rank documents # We apply Vector Space Model (VSM) to combine Step 1 and Step 2 1414
  • 16. Our Model: Hybrid Recommendation Engine Step 3: • We filter the weighted/ranked documents using collaborative filtering/recommendation • We use a python library python-recsys (by Oscar Celma - Director of Reseach at Pandora) for recommending Pubmed documents. • We used Pandas for preprocessing data • We use Scikit-learn to evaluate our model • We do two types of recommendation - ➢ Item-based Recommendation ➢ User-based Recommendation • Several data pre-processing steps are involved in Pubmed document recommendation. 1515
  • 17. Vector Space Model (VSM) • Given:  A set of Pubmed documents  N features (terms) describing the documents in the set • VSM builds an N-dimensional Vector Space. • Each itemdocument is represented as a point in the Vector Space • Information Retrieval based on search  Query: A point in the Vector Space  We apply TFIDF to the tokenized documents to weight the documents and convert the documents to vectors  We compute cosine similarity between the tokenized documents and the query term  We select top 3 documents matching our query 1616
  • 19. Motivation for using Python- Recsys library 18
  • 20. • A python library for building recommendation engines • Works with Scikit-learn for collaborative,content and hybrid filtering • Underlying model: K-SVD (good for sparse matrix) • Open Source: https://github.com/ocelma/python-recsys Python-Recsys library 19
  • 21. Code Execution: Demo We try to show that our Hybrid model (using VSM and python- recsys combined) works better that using python-recsys library alone ➢ First, we explain the collaborative filtering technique, using the python-recsys library on a simple movie data set ➢ Next, we explain our hybrid model for pubmed recommendation ➢ We run comparative evaluations on the model (RMSE & MAE). Our model performs better – as it utilizes the content/context of items 2020
  • 22. Data Used for Demo • Movie dataset format: • Pubmed dataset format: 21 UserID MovieID Rating 1 1 5.0 2 1 3.5 3 2 4.0 4 3 5.0 5 3 2.5 6 4 4.5 7 4 4.0 ID User Name User ID Query Term Doc ID Doc Rank Doc Name 44 paul 11 genetics 7713442 0.341534906 … Each_Doc_Token Combined_Docs_Tokens …. …. 21
  • 23. Conclusion ◆Pros: ➢ Python-Recsys is very accurate recommendation library ➢The Hybrid model increase the prediction/recommendation accuracy ➢Great community support ◆Cons: ➢Parallelization is not that great ➢For Big Data, need to use different functions instead of SVD. 22
  • 24. Acknowledgement 1. Oscar Celma: Python-Recsys - https://github.com/ocelma/python- recsys 2. Alan Said: Comparative Recommender System Evaluation - Benchmarking Recommendation Frameworks 3. Pasquale Lops: Semantics-aware Content-based Recommender 23
  • 25. THANK YOU ! Thanks for your attention. Questions? 24