People who liked this talk also liked … Building Recommendation Systems Using Ruby

From Amazon, to Spotify, to thermostats, recommendation systems are everywhere. The ability to provide recommendations for your users is becoming a crucial feature for modern applications. In this talk I'll show you how you can use Ruby to build recommendation systems for your users. You don't need a PhD to build a simple recommendation engine -- all you need is Ruby. Together we'll dive into the dark arts of machine learning and you'll discover that writing a basic recommendation engine is not as hard as you might have imagined. Using Ruby I'll teach you some of the common algorithms used in recommender systems, such as: Collaborative Filtering, K-Nearest Neighbor, and Pearson Correlation Coefficient. At the end of the talk you should be on your way to writing your own basic recommendation system in Ruby.

People who liked this talk also liked …
Building Recommendation Systems
Using Ruby

Ryan Weald, @rweald
LA RubyConf 2013

1

Who is this guy?

What does he know
about recommendation
systems?

2

Data Scientist @Sharethrough

Native advertising
platform

3

Outline
1) What is a recommendation system?
2) Collaborative filtering based recommendations
3) Content based recommendations
4) Hybrid systems - the best of both worlds
5) Evaluating your recommendation system
6) Resources & existing libraries

5

What this Talk is Not
• Everything there is to know about
recommendation systems.
• Bleeding edge machine learning
• How to use a specific library

6

A program that predicts
a user’s preferences using information
about the user, other users, and the
items in your system.

8

Two Main Categories of Algorithm

1. Collaborative Filtering (CF)

2. Content Based - Classification

14

Collaborative Filtering

Fill in missing user preferences using
similar users or items

15

Two Types of CF
1. Memory Based - Uses similarity
between users or items. Dataset
usually kept in memory

2. Model Based - Model generated
to “explain” observed ratings

16

User Based CF

(User x Item) Matrix + Similarity
Function = Top-K most similar users

17

Collaborative Filtering
         Video 1  Video 2  Video 3  Video 4  Video 5
User 1      0        1        0        5        0
User 2      1        2        1        0        5
User 3      2        5        0        0        2
User 4      5        4        4        1        1
User 5      2        4        2        ?        ?

* 0 denotes not rated

18

Similarity Functions

• Pearson Correlation Coefficient
• Cosine Similarity
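
The two similarity functions above can be sketched in plain Ruby. This is a minimal illustration, not code from the talk: it assumes each user's ratings are an array in video order, that 0 means "not rated" (as in the matrix slide), and that only videos rated by both users are compared. The names RATINGS, co_rated, pearson and cosine are made up for this example.

```ruby
# Ratings from the User x Item matrix; 0 means "not rated".
# User 5's two unknown ("?") entries are stored as 0 (an assumption).
RATINGS = {
  "User 1" => [0, 1, 0, 5, 0],
  "User 2" => [1, 2, 1, 0, 5],
  "User 3" => [2, 5, 0, 0, 2],
  "User 4" => [5, 4, 4, 1, 1],
  "User 5" => [2, 4, 2, 0, 0]
}.freeze

# Keep only the rating pairs where both users rated the video.
def co_rated(a, b)
  a.zip(b).reject { |x, y| x.zero? || y.zero? }
end

# Pearson correlation over co-rated videos.
# Returns 0.0 when it is undefined (fewer than 2 pairs or zero variance).
def pearson(a, b)
  pairs = co_rated(a, b)
  n = pairs.size
  return 0.0 if n < 2

  mean_a = pairs.sum { |x, _| x } / n.to_f
  mean_b = pairs.sum { |_, y| y } / n.to_f
  cov    = pairs.sum { |x, y| (x - mean_a) * (y - mean_b) }
  var_a  = pairs.sum { |x, _| (x - mean_a)**2 }
  var_b  = pairs.sum { |_, y| (y - mean_b)**2 }
  denom  = Math.sqrt(var_a * var_b)
  denom.zero? ? 0.0 : cov / denom
end

# Cosine similarity over co-rated videos; 0.0 when nothing is co-rated.
def cosine(a, b)
  pairs = co_rated(a, b)
  return 0.0 if pairs.empty?

  dot    = pairs.sum { |x, y| x * y }
  norm_a = Math.sqrt(pairs.sum { |x, _| x**2 })
  norm_b = Math.sqrt(pairs.sum { |_, y| y**2 })
  dot / (norm_a * norm_b)
end

user5 = RATINGS["User 5"]
puts pearson(user5, RATINGS["User 2"]).round(3) # User 2 rates in the same pattern
puts pearson(user5, RATINGS["User 4"]).round(3) # User 4 rates in the opposite pattern
puts cosine(user5, RATINGS["User 2"]).round(3)
```

With only one co-rated video (User 5 and User 1), Pearson is undefined and this sketch falls back to 0.0, while cosine still returns 1.0; that difference is one reason to prefer Pearson on sparse data.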

19

Collaborative Filtering
         Video 1  Video 2  Video 3  Video 4  Video 5
User 1      0        1        0        5        0
User 2      1        2        1        0        5
User 3      2        5        0        0        2
User 4      5        4        4        1        1
User 5      2        4        2        ?        ?

* 0 denotes not rated
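
One way to fill in User 5's missing ratings from this matrix is a similarity-weighted average over the top-K most similar users who rated the video. The sketch below is an assumption about how that could look, not the talk's implementation: it uses Pearson correlation on co-rated videos, ignores neighbours with zero or negative similarity, and uses K = 2. The names RATINGS, pearson and predict are hypothetical.

```ruby
# Ratings from the matrix above; 0 means "not rated" ("?" stored as 0).
RATINGS = {
  "User 1" => [0, 1, 0, 5, 0],
  "User 2" => [1, 2, 1, 0, 5],
  "User 3" => [2, 5, 0, 0, 2],
  "User 4" => [5, 4, 4, 1, 1],
  "User 5" => [2, 4, 2, 0, 0]
}.freeze

# Pearson correlation over videos both users rated; 0.0 when undefined.
def pearson(a, b)
  pairs = a.zip(b).reject { |x, y| x.zero? || y.zero? }
  n = pairs.size
  return 0.0 if n < 2

  mean_a = pairs.sum { |x, _| x } / n.to_f
  mean_b = pairs.sum { |_, y| y } / n.to_f
  cov    = pairs.sum { |x, y| (x - mean_a) * (y - mean_b) }
  denom  = Math.sqrt(pairs.sum { |x, _| (x - mean_a)**2 } *
                     pairs.sum { |_, y| (y - mean_b)**2 })
  denom.zero? ? 0.0 : cov / denom
end

# Predict `user`'s rating for the video at index `item` (0-based).
# Returns nil when no positively similar user has rated that video.
def predict(ratings, user, item, k: 2)
  target = ratings.fetch(user)
  neighbours = ratings
    .reject { |name, row| name == user || row[item].zero? }
    .map    { |_, row| [pearson(target, row), row[item]] }
    .select { |sim, _| sim > 0 }
    .max_by(k) { |sim, _| sim }
  return nil if neighbours.empty?

  neighbours.sum { |sim, rating| sim * rating } / neighbours.sum { |sim, _| sim }
end

p predict(RATINGS, "User 5", 4) # Video 5: average of User 2 and User 3's ratings
p predict(RATINGS, "User 5", 3) # Video 4: nil, no similar user has rated it
```

The nil result for Video 4 is the data sparsity problem in miniature: the only users who rated it (User 1 and User 4) are not positively correlated with User 5.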

29

Problems With CF

• Cold Start
• Data Sparsity
• Resource expensive

31

Doesn’t the video
content matter for
recommendations?

32

Content Based Recommendations

Classify items based on features of
the item. Pick other items from
same class to recommend.

33

Content Based Algorithms
• K-means clustering
• Random Forest
• Support Vector Machines
• ...
• Insert your favorite ML algorithm

34

Content Based Algorithms
         Type of content  Duration  Maturity Rating
Video 1  comedy           60        G
Video 2  action           120       G
Video 3  comedy           34        PG-13
Video 4  romantic         15        R
Video 5  sports           120       G

35

K-means Clustering

Group items into K clusters.
Assign new item to a cluster and
pick items from that cluster
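
A small Ruby sketch of this idea, using the five videos from the content table. Several choices here are assumptions made only for illustration: the one-hot encoding of content type, scaling duration by 120, the ordinal maturity scale, K = 2, and seeding the centroids with the first K videos so the run is repeatable (real k-means usually picks random starting points). All constant and method names are hypothetical.

```ruby
TYPES    = %w[comedy action romantic sports].freeze
MATURITY = { "G" => 0.0, "PG-13" => 0.5, "R" => 1.0 }.freeze # assumed ordinal scale

VIDEOS = {
  "Video 1" => ["comedy",   60,  "G"],
  "Video 2" => ["action",   120, "G"],
  "Video 3" => ["comedy",   34,  "PG-13"],
  "Video 4" => ["romantic", 15,  "R"],
  "Video 5" => ["sports",   120, "G"]
}.freeze

# Turn one table row into a numeric feature vector.
def encode(type, duration, rating)
  TYPES.map { |t| t == type ? 1.0 : 0.0 } + [duration / 120.0, MATURITY.fetch(rating)]
end

def sq_dist(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Index of the centroid closest to `point`.
def nearest(point, centroids)
  centroids.each_index.min_by { |i| sq_dist(point, centroids[i]) }
end

# Lloyd's algorithm: assign points to the nearest centroid, recompute
# centroids as cluster means, stop when assignments no longer change.
def kmeans(points, k, max_iter: 20)
  centroids  = points.first(k) # deterministic seed, for illustration only
  assignment = nil
  max_iter.times do
    new_assignment = points.map { |pt| nearest(pt, centroids) }
    break if new_assignment == assignment

    assignment = new_assignment
    centroids = centroids.each_index.map do |i|
      members = points.each_index.select { |j| assignment[j] == i }.map { |j| points[j] }
      next centroids[i] if members.empty? # keep an empty cluster's centroid

      members.transpose.map { |dim| dim.sum / members.size }
    end
  end
  [assignment, centroids]
end

names  = VIDEOS.keys
points = VIDEOS.values.map { |row| encode(*row) }
assignment, centroids = kmeans(points, 2)

# Assign a hypothetical new video to a cluster and recommend its cluster mates.
new_item    = encode("comedy", 45, "G")
cluster     = nearest(new_item, centroids)
recommended = names.select.with_index { |_, i| assignment[i] == cluster }
p assignment
p recommended
```

With this encoding the two 120-long, G-rated videos end up in one cluster and the three shorter videos in the other, so the new comedy is matched with Videos 1, 3 and 4. Different feature scaling would give different clusters, which is the "limited by features of content" problem in practice.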

36

Problems With Content Based
Recommendations

• Unsupervised Learning is hard
• Training data limited or expensive
• Doesn’t take user into account
• Limited by features of content

38

Hybrid Recommendations

Combine collaborative filtering with a
content based algorithm to achieve
better results

39

Hybrid Recommendations

[Diagram: Input -> CF Based Recommender -> Combiner -> Reco;
Input -> Content Based Recommender -> Combiner]
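
The slide does not say how the combiner merges the two outputs. One simple option, shown here purely as an assumption, is a weighted sum of per-item scores from each recommender. The scores, the cf_weight parameter and the combine method are all hypothetical.

```ruby
# Hypothetical per-item scores in [0, 1] produced by each recommender.
cf_scores      = { "Video 4" => 0.9, "Video 5" => 0.4 }
content_scores = { "Video 4" => 0.2, "Video 5" => 0.8, "Video 3" => 0.6 }

# Weighted-sum combiner: items missing from one recommender score 0.0 there.
# Returns [item, score] pairs, best first.
def combine(cf, content, cf_weight: 0.5)
  items = cf.keys | content.keys
  items.map { |item|
    score = cf_weight * cf.fetch(item, 0.0) +
            (1 - cf_weight) * content.fetch(item, 0.0)
    [item, score]
  }.sort_by { |_, score| -score }
end

p combine(cf_scores, content_scores).map(&:first)
p combine(cf_scores, content_scores, cf_weight: 0.8).map(&:first)
```

Raising cf_weight favours the collaborative filtering ranking; lowering it favours the content based one, which can help with cold-start items that have no ratings yet.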

40

Hybrid Recommendations

[Diagram: Input -> Content Recommender -> CF Recommender -> Reco]

42

Hybrid Recommendations

[Diagram: Input, CF Recommender, Content Recommender, Reco]

43

Evaluating Recommendation Quality

• Precision vs. Recall
• Clicks
• Click through rate
• Direct user feedback
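
The first and third metrics above reduce to simple ratios. A minimal Ruby sketch, with made-up example data and hypothetical method names; it assumes the recommended and relevant lists contain no duplicates:

```ruby
# Precision: share of recommended items that were relevant.
# Recall: share of relevant items that were recommended.
def precision_recall(recommended, relevant)
  hits      = (recommended & relevant).size
  precision = recommended.empty? ? 0.0 : hits.to_f / recommended.size
  recall    = relevant.empty? ? 0.0 : hits.to_f / relevant.size
  [precision, recall]
end

# Click through rate: clicks per recommendation shown.
def click_through_rate(clicks, impressions)
  impressions.zero? ? 0.0 : clicks.to_f / impressions
end

recommended = ["Video 4", "Video 5", "Video 3"] # what the system suggested
relevant    = ["Video 5", "Video 1"]            # what the user actually liked

precision, recall = precision_recall(recommended, relevant)
puts precision.round(3)
puts recall
puts click_through_rate(30, 1200)
```

Precision and recall pull against each other: recommending more items tends to raise recall and lower precision, which is why the slide frames them as "Precision vs. Recall".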

44

Summary of What We’ve Learned

• Collaborative Filtering using similar users
• Content clustering using k-means
• Combining 2 algorithms to boost quality
• How to evaluate your recommender

47

Don’t Reinvent the Wheel

• Apache Mahout
• JRuby mahout gem
• SciRuby
• Recommenderlab for R

48

Resources & Further Reading
• Recommender Systems: An Introduction
• Linden, Greg, Brent Smith, and Jeremy York.
"Amazon.com recommendations: Item-to-item
collaborative filtering."
• Resnick, Paul, et al. "GroupLens: an open architecture
for collaborative filtering of netnews."
• ACM RecSys Conference Proceedings

49

We’re Hiring
http://bit.ly/str-engineering

50

Thanks!
Twitter: @rweald
Email: ryan@sharethrough.com

51
