From Amazon, to Spotify, to thermostats, recommendation systems are everywhere. The ability to provide recommendations for your users is becoming a crucial feature for modern applications. In this talk I'll show you how you can use Ruby to build recommendation systems for your users. You don't need a PhD to build a simple recommendation engine -- all you need is Ruby. Together we'll dive into the dark arts of machine learning and you'll discover that writing a basic recommendation engine is not as hard as you might have imagined. Using Ruby I'll teach you some of the common algorithms used in recommender systems, such as: Collaborative Filtering, K-Nearest Neighbor, and Pearson Correlation Coefficient. At the end of the talk you should be on your way to writing your own basic recommendation system in Ruby.
Modern Perspectives on Recommender Systems and their Applications in MendeleyMaya Hristakeva
Presentation given for one of Pearson's Data Research teams. It motivates the use of recommender systems, describes common approaches to building and evaluating them and gives examples of how they are used in Mendeley. Joint work with Kris Jack, Chief Data Scientist at Mendeley.
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
"I know it when I see it".
This term was coined by a Supreme Court Justice in reference to obscenity, but he might as well been talking about relevancy and search engine results. Testing search engines is rarely a binary process of "it works, it doesn't work", instead it draws on our human skills to design tests that capture the intangibles that make up a great search engine implementation! The behavior of a search engine changes as the data changes, so a search that returns one set of results today will return a different set tomorrow. Is that a bug? Or just a finely tuned search engine responding to changes in the data it searches? Search Engine testing often focuses on the very first layer of functionality, "Do I get results?", without digging deeper into "Do I get great relevant results?".
Modern Perspectives on Recommender Systems and their Applications in MendeleyKris Jack
Presentation given for one of Pearson's Data Research teams. It motivates the use of recommender systems, describes common approaches to building and evaluating them and gives examples of how they are used in Mendeley. Thanks to Maya Hristakeva for creating some of the slides.
Modern Perspectives on Recommender Systems and their Applications in MendeleyMaya Hristakeva
Presentation given for one of Pearson's Data Research teams. It motivates the use of recommender systems, describes common approaches to building and evaluating them and gives examples of how they are used in Mendeley. Joint work with Kris Jack, Chief Data Scientist at Mendeley.
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
"I know it when I see it".
This term was coined by a Supreme Court Justice in reference to obscenity, but he might as well been talking about relevancy and search engine results. Testing search engines is rarely a binary process of "it works, it doesn't work", instead it draws on our human skills to design tests that capture the intangibles that make up a great search engine implementation! The behavior of a search engine changes as the data changes, so a search that returns one set of results today will return a different set tomorrow. Is that a bug? Or just a finely tuned search engine responding to changes in the data it searches? Search Engine testing often focuses on the very first layer of functionality, "Do I get results?", without digging deeper into "Do I get great relevant results?".
Modern Perspectives on Recommender Systems and their Applications in MendeleyKris Jack
Presentation given for one of Pearson's Data Research teams. It motivates the use of recommender systems, describes common approaches to building and evaluating them and gives examples of how they are used in Mendeley. Thanks to Maya Hristakeva for creating some of the slides.
Software quality is critical to consistently and continually delivering new features to our users. This talk covers the importance of software quality and how to deliver it via unit testing, Test Driven Development and clean code in general.
This is the deck from a talk I gave at Desert Code Camp 2013.
Code quality is not an exact science and is rather subjective, which brings the need of well defined rules and principles to follow.
Clean code is all about readability, furthermore, principles like GRASP or techniques like fluent APIs makes the code even cleaner and maintainable.
Design Patterns on the other hand are typical solutions to common problems in software design.
You will walk away with a taste of quality principles and metrics for building quality software.
Also previews of some useful books like Martin Fowler's Clean Code and Kent Beck's Implementation Patterns are presented.
10 Easy Ways to Take Your Website from Good to GreatChris Sietsema
Presentation given at the IRMA Conference in October 2012. Covers methods to improve website effectiveness including usability, search engine optimization, email automation, social sharing, content planning and more. From @sietsema.
Refactoring is a widespread practice that helps developers
to improve the maintainability and readability of their code.
However, there is a limited number of studies empirically
investigating the actual motivations behind specific refactoring operations applied by developers. To fill this gap, we monitored Java projects hosted on GitHub to detect recently applied refactorings, and asked the developers to explain the reasons behind their decision to refactor the code.
By applying thematic analysis on the collected responses,
we compiled a catalogue of 44 distinct motivations for 12
well-known refactoring types. We found that refactoring activity is mainly driven by changes in the requirements and
much less by code smells. Extract Method is the most versatile refactoring operation serving 11 different purposes.
Finally, we found evidence that the IDE used by the developers affects the adoption of automated refactoring tools.
A content inventory is an essential early step in your content strategy. How do you know what content you need if you don't know what you have? But that's not where the process should end. How do you dig deeper to not only understand what you have, but if it's useful, relevant, and on-brand? Using practical examples, Rick will walk through the process of creating a content inventory and a quantitative and qualitative audit to evaluate content quality.
Agile Software Development in practice: Experience, Tips and Tools from the T...Valerie Puffet-Michel
In the Division of Student Affairs at the University of Connecticut, the Applications Development team has been developing and delivering custom software using agile methods for over four years. In this session, we'll share our experiences and give you a behind the scenes look at how agile software development really works by walking you through how we translate the unique business needs of our clients into deployed software.
Avatara: OLAP for Web-scale Analytics Products Lili Wu
We recently presented a paper about Avatara at VLDB 2012. Avatara is LinkedIn's scalable, low latency, and highly-available OLAP system for "sharded" multi-dimensional queries. It has been powering our analytical products, such as "Who's Viewed My Profile?" and "Who's Viewed This Job?", for two and a half years.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
More Related Content
Similar to People who liked this talk also liked … Building Recommendation Systems Using Ruby
Software quality is critical to consistently and continually delivering new features to our users. This talk covers the importance of software quality and how to deliver it via unit testing, Test Driven Development and clean code in general.
This is the deck from a talk I gave at Desert Code Camp 2013.
Code quality is not an exact science and is rather subjective, which brings the need of well defined rules and principles to follow.
Clean code is all about readability, furthermore, principles like GRASP or techniques like fluent APIs makes the code even cleaner and maintainable.
Design Patterns on the other hand are typical solutions to common problems in software design.
You will walk away with a taste of quality principles and metrics for building quality software.
Also previews of some useful books like Martin Fowler's Clean Code and Kent Beck's Implementation Patterns are presented.
10 Easy Ways to Take Your Website from Good to GreatChris Sietsema
Presentation given at the IRMA Conference in October 2012. Covers methods to improve website effectiveness including usability, search engine optimization, email automation, social sharing, content planning and more. From @sietsema.
Refactoring is a widespread practice that helps developers
to improve the maintainability and readability of their code.
However, there is a limited number of studies empirically
investigating the actual motivations behind specific refactoring operations applied by developers. To fill this gap, we monitored Java projects hosted on GitHub to detect recently applied refactorings, and asked the developers to explain the reasons behind their decision to refactor the code.
By applying thematic analysis on the collected responses,
we compiled a catalogue of 44 distinct motivations for 12
well-known refactoring types. We found that refactoring activity is mainly driven by changes in the requirements and
much less by code smells. Extract Method is the most versatile refactoring operation serving 11 different purposes.
Finally, we found evidence that the IDE used by the developers affects the adoption of automated refactoring tools.
A content inventory is an essential early step in your content strategy. How do you know what content you need if you don't know what you have? But that's not where the process should end. How do you dig deeper to not only understand what you have, but if it's useful, relevant, and on-brand? Using practical examples, Rick will walk through the process of creating a content inventory and a quantitative and qualitative audit to evaluate content quality.
Agile Software Development in practice: Experience, Tips and Tools from the T...Valerie Puffet-Michel
In the Division of Student Affairs at the University of Connecticut, the Applications Development team has been developing and delivering custom software using agile methods for over four years. In this session, we'll share our experiences and give you a behind the scenes look at how agile software development really works by walking you through how we translate the unique business needs of our clients into deployed software.
Avatara: OLAP for Web-scale Analytics Products Lili Wu
We recently presented a paper about Avatara at VLDB 2012. Avatara is LinkedIn's scalable, low latency, and highly-available OLAP system for "sharded" multi-dimensional queries. It has been powering our analytical products, such as "Who's Viewed My Profile?" and "Who's Viewed This Job?", for two and a half years.
Similar to People who liked this talk also liked … Building Recommendation Systems Using Ruby (20)
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 To discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
5. Outline
1) What is a recommendation system?
2) Collaborative filtering based
recommendations
3) Content based recommendations
4) Hybrid systems - the best of both worlds
5) Evaluating your recommendation system
6) Resources & existing libraries
5
6. What this Talk is Not
• Everything there is to know about
recommendation systems.
• Bleeding edge machine learning
• How to use a specific library
6
16. Two Types of CF
1. Memory Based - Uses similarity
between users or items. Dataset
usually kept in memory
2. Model Based - Model generated
to “explain” observed ratings
16
17. User Based CF
(User x Item) Matrix + Similarity
Function = Top-K most similar users
17
18. Collaborative Filtering
Video 1 Video 2 Video 3 Video 4 Video 5
User 1 0 1 0 5 0
User 2 1 2 1 0 5
User 3 2 5 0 0 2
User 4 5 4 4 1 1
User 5 2 4 2
? ?
* 0 denotes not rated
18
29. Collaborative Filtering
Video 1 Video 2 Video 3 Video 4 Video 5
User 1 0 1 0 5 0
User 2 1 2 1 0 5
User 3 2 5 0 0 2
User 4 5 4 4 1 1
User 5 2 4 2
? ?
* 0 denotes not rated
29
33. Content Based Recommendations
Classify items based on features of
the item. Pick other items from
same class to recommend.
33
34. Content Based Algorithms
• K-means clustering
• Random Forrest
• Support Vector Machines
• ...
• Insert your favorite ML algorithm
34
35. Content Based Algorithms
Type of Duration Maturity
content Rating
Video 1 comedy 60 G
Video 2 action 120 G
Video 3 comedy 34 PG-13
Video 4 romantic 15 R
Video 5 sports 120 G
35
36. K-means Clustering
Group items into K clusters.
Assign new item to a cluster and
pick items from that cluster
36
38. Problems With Content Based
Recommendations
• Unsupervised Learning is hard
• Training data limited or expensive
• Doesn’t take user into account
• Limited by features of content
38
47. Summary of What We’ve Learned
• Collaborative Filtering using similar users
• Content clustering using k-means
• Combining 2 algorithms to boost quality
• How to evaluate your recommender
47
48. Don’t Reinvent the Wheel
• Apache Mahout
• JRuby mahout gem
• SciRuby
• Recommenderlab for R
48
49. Resources & Further Reading
• Recommender Systems: An Introduction
• Linden, Greg, Brent Smith, and Jeremy York.
"Amazon. com recommendations: Item-to-item
collaborative filtering."
• Resnick, Paul, et al. "GroupLens: an open architecture
for collaborative filtering of netnews."
• ACM RecSys Conference Proceedings
49