Collaborative Filtering for fun …and profit!
Elissa Brown, Erin Shellman, Stella Rowlett
  
How’s it done? 
‣ Collaborative filtering! 
‣ Look for people who like the stuff 
you like, and recommend the 
things they’ve rated positively 
that you haven’t seen yet.
Step 1: Collect ratings 
          Leaf Shorts   Floral Shorts
Elissa         5              5
Erin           2              5
Stella         4              1
[Figure: scatter plot of each person's ratings, Leaf Shorts on the x-axis and Floral Shorts on the y-axis, with one point each for Elissa, Erin, and Stella.]
Step 2: Find someone 
similar for Mr. New Customer 
‣ A new customer wants to 
buy a pair of shorts for 
his new “girlfriend.” 
 
‣ Which of us is most 
similar to Justin? 
 
 

Finding the nearest neighbor 
‣ Remember the Pythagorean Theorem?
$a^2 + b^2 = c^2$
$c = \sqrt{a^2 + b^2}$
$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
$\sqrt{(1 - 5)^2 + (4 - 5)^2} = 4.12$
[Figure: scatter plot of Leaf Shorts (x-axis) vs. Floral Shorts (y-axis) with points for Elissa, Erin, Justin, and Stella.]
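The distance formula can be checked in a few lines of Python. This is only a minimal sketch: the helper name `euclidean_distance` and the (Leaf Shorts, Floral Shorts) tuple layout are assumptions made for illustration and do not come from the slides.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two (leaf, floral) rating points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

# Each point is (Leaf Shorts rating, Floral Shorts rating).
justin = (1, 4)
elissa = (5, 5)

print(round(euclidean_distance(justin, elissa), 2))  # 4.12
```

The smaller the number, the more alike two people's ratings are.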
What’s the distance 
between Justin and Stella? 
[Figure: scatter plot of Leaf Shorts (x-axis) vs. Floral Shorts (y-axis) with points for Elissa, Erin, Justin, and Stella, and a "?" beside Stella.]
$\sqrt{(1 - 4)^2 + (4 - 1)^2} = 4.24$
Meant to be! 
 
‣ Justin and Erin are a match made in heaven.
$\sqrt{(1 - 2)^2 + (4 - 5)^2} = 1.41$
‣ How can this help us shop for Justin’s girlfriend (not Erin)?
[Figure: scatter plot of Leaf Shorts (x-axis) vs. Floral Shorts (y-axis) with points for Elissa, Erin, Justin, and Stella.]
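Comparing Justin with all three raters at once might look like the sketch below. The variable names are illustrative assumptions, and the points are the (Leaf Shorts, Floral Shorts) ratings from the slides; `math.dist` requires Python 3.8 or later.

```python
import math

# Ratings as (Leaf Shorts, Floral Shorts) points, taken from the slides.
ratings = {
    "Elissa": (5, 5),
    "Erin": (2, 5),
    "Stella": (4, 1),
}
justin = (1, 4)

# math.dist computes the Euclidean distance between two points.
distances = {name: round(math.dist(justin, point), 2)
             for name, point in ratings.items()}

# The nearest neighbor is the person with the smallest distance.
nearest = min(distances, key=distances.get)

print(distances)  # {'Elissa': 4.12, 'Erin': 1.41, 'Stella': 4.24}
print(nearest)    # Erin
```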
Step 3: Make a suggestion! 
          Leaf Shorts   Floral Shorts   Jungle Shorts
Elissa         5              5               1
Erin           2              5               5
Stella         4              1               2
Justin         1              4               ?
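One way to fill in the "?" is to borrow the rating from Justin's closest match. The sketch below assumes Erin is that match, as the slides conclude, and simply looks up what she rated that Justin has not; the variable names are illustrative.

```python
user_ratings = {
    "Elissa": {"Leaf Shorts": 5, "Floral Shorts": 5, "Jungle Shorts": 1},
    "Erin":   {"Leaf Shorts": 2, "Floral Shorts": 5, "Jungle Shorts": 5},
    "Stella": {"Leaf Shorts": 4, "Floral Shorts": 1, "Jungle Shorts": 2},
    "Justin": {"Leaf Shorts": 1, "Floral Shorts": 4},
}

# Assumption: Erin is Justin's nearest neighbor, per the slides.
nearest_neighbor = "Erin"

# Items the neighbor has rated that Justin has not.
unseen = {item: rating
          for item, rating in user_ratings[nearest_neighbor].items()
          if item not in user_ratings["Justin"]}

print(unseen)  # {'Jungle Shorts': 5}
```

Erin gave the Jungle Shorts a 5, so they are a reasonable suggestion for Justin.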
Let’s go shopping! 

How should we store our ratings data?
A dictionary! 
# Store user ratings
user_ratings = {
    'Elissa': {'Leaf Shorts': 5, 'Floral Shorts': 5},
    'Erin': {'Leaf Shorts': 2, 'Floral Shorts': 5},
    'Stella': {'Leaf Shorts': 4, 'Floral Shorts': 1},
}
Python Warm-up! 
# Store user ratings
user_ratings = {
    'Elissa': {'Leaf Shorts': 5, 'Floral Shorts': 5},
    'Erin': {'Leaf Shorts': 2, 'Floral Shorts': 5},
    'Stella': {'Leaf Shorts': 4, 'Floral Shorts': 1},
}

# Print Stella's ratings.
print(user_ratings['Stella'])

# What did Elissa think of the floral ones?
print(user_ratings['Elissa']['Floral Shorts'])

def what_did_they_think(rating):
    # Ratings above 3 count as a positive opinion.
    if rating > 3:
        opinion = 'LOVED IT!'
    else:
        opinion = 'HATED IT!'
    return opinion

my_humble_opinion = what_did_they_think(user_ratings['Erin']['Leaf Shorts'])

print("Erin's review of the Leaf Shorts: " + my_humble_opinion)
What functions do we need 
to build a recommender?
1. Function to compute distances 
2. Function to find nearby people 
3. Function to recommend items I haven’t rated yet
Pseudocode: 
Function to compute distances
compute_distance 
def compute_distance(user1_ratings, user2_ratings):
    """
    Compute the distance between two users' ratings.
    Each argument is one user's ratings: a dictionary keyed on item.
    """
    distances = []
    # Only compare items that both users have rated.
    for key in user1_ratings:
        if key in user2_ratings:
            distances.append((user1_ratings[key] - user2_ratings[key]) ** 2)
    total_distance = round(sum(distances) ** 0.5, 2)
    return total_distance
Pseudocode: 
Function to find closest match
find_nearest_neighbors 
def find_nearest_neighbors(username, user_ratings):
    """
    Return a list of (distance, user) tuples, nearest neighbor first.
    Call like this: find_nearest_neighbors('Erin', user_ratings)
    """
    distances = []
    for user in user_ratings:
        # Skip the user we are finding neighbors for.
        if user != username:
            distance = compute_distance(user_ratings[user], user_ratings[username])
            distances.append((distance, user))
    distances.sort()
    return distances
Pseudocode: 
Function to make recommendations
get_recommendations 
def get_recommendations(username, user_ratings):
    """
    Return a list of (item, rating) recommendations, highest rating first.
    """
    nearest_users = find_nearest_neighbors(username, user_ratings)
    recommendations = []

    # Input user's ratings
    ratings = user_ratings[username]

    for neighbor in nearest_users:
        neighbor_name = neighbor[1]
        for item in user_ratings[neighbor_name]:
            # Only suggest items the user has not rated yet.
            if item not in ratings:
                recommendations.append((item, user_ratings[neighbor_name][item]))

    return sorted(recommendations,
                  key=lambda item_rating: item_rating[1],
                  reverse=True)
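As written, get_recommendations appends one entry per neighbor, so an item rated by several neighbors shows up several times. The sketch below is a hypothetical variant, not part of the original deck: `get_unique_recommendations` keeps only the rating from the nearest neighbor who rated each item. The compact `compute_distance` here is a restatement so the example runs on its own.

```python
def compute_distance(a, b):
    """Euclidean distance over the items both users rated."""
    shared = [item for item in a if item in b]
    return round(sum((a[i] - b[i]) ** 2 for i in shared) ** 0.5, 2)

def get_unique_recommendations(username, user_ratings):
    """Hypothetical variant: one entry per item, using the nearest rater's score."""
    mine = user_ratings[username]
    # (distance, name) tuples sort by distance, nearest first.
    neighbors = sorted(
        (compute_distance(mine, theirs), name)
        for name, theirs in user_ratings.items()
        if name != username
    )
    picks = {}
    for _, name in neighbors:
        for item, rating in user_ratings[name].items():
            # Keep the first rating seen, which comes from the nearest neighbor.
            if item not in mine and item not in picks:
                picks[item] = rating
    return sorted(picks.items(), key=lambda pair: pair[1], reverse=True)

user_ratings = {
    "Elissa": {"Leaf Shorts": 5, "Floral Shorts": 5, "Jungle Shorts": 1},
    "Erin":   {"Leaf Shorts": 2, "Floral Shorts": 5, "Jungle Shorts": 5},
    "Stella": {"Leaf Shorts": 4, "Floral Shorts": 1, "Jungle Shorts": 2},
    "Justin": {"Leaf Shorts": 1, "Floral Shorts": 4},
}

print(get_unique_recommendations("Justin", user_ratings))  # [('Jungle Shorts', 5)]
```

Taking the nearest rater's score is one design choice among several; averaging the neighbors' ratings, weighted by distance, would be another.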
Limitations 
‣ What happens if everyone rated the 
same group of items? 
‣ What if there’s no overlap? 
‣ What about new users with no ratings?
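The "no overlap" question has a concrete answer for the compute_distance function in this deck: with no shared items the sum is empty, so the distance comes out as 0.0 and the stranger looks like a perfect match. The sketch below demonstrates this and adds a hypothetical guard, `compute_distance_or_none`, which is an illustrative assumption rather than part of the original code.

```python
def compute_distance(user1_ratings, user2_ratings):
    """Same logic as the deck's compute_distance."""
    distances = []
    for key in user1_ratings:
        if key in user2_ratings:
            distances.append((user1_ratings[key] - user2_ratings[key]) ** 2)
    return round(sum(distances) ** 0.5, 2)

def compute_distance_or_none(user1_ratings, user2_ratings):
    """Hypothetical variant: return None when the users share no rated items."""
    if not set(user1_ratings) & set(user2_ratings):
        return None
    return compute_distance(user1_ratings, user2_ratings)

erin = {"Leaf Shorts": 2, "Floral Shorts": 5}
newcomer = {"Jungle Shorts": 3}  # hypothetical user with nothing in common

print(compute_distance(erin, newcomer))          # 0.0, looks like a perfect match
print(compute_distance_or_none(erin, newcomer))  # None
print(compute_distance_or_none(erin, {}))        # None, a new user with no ratings
```

A caller could then skip neighbors whose distance is None instead of ranking them first.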
Now what? 
‣ Ask your classmates to rate their electives, 
and make a recommender to help students 
pick their classes. 
‣ Poll recent graduates from your school about 
their college, and make a recommender to 
help students pick colleges.
 
https://github.com/erinshellman/girls-who-code-recommender
