1) More data is not always better than better models: sometimes improved modeling techniques matter more than simply collecting more data.
2) Ensembles of different models generally perform better than any single model and are commonly used in practice. Feature engineering to create new inputs for ensembles can improve their effectiveness.
3) Implicit signals from user behavior usually provide more useful information than explicit feedback, but both should be used to best represent users' long-term goals.
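Lesson 2 can be made concrete with a minimal sketch (the models, features, and data below are hypothetical illustrations, not from the slides): blending the scores of several simple models, including one built on an engineered feature, often beats any single one.

```python
# Minimal ensemble sketch: blend three toy "models" by weighted averaging.
# All names and data are illustrative.

def model_popularity(item):
    return item["plays"] / 100.0           # score by raw popularity

def model_recency(item):
    return 1.0 / (1.0 + item["age_days"])  # score by freshness

def model_personal(item, user_tags):
    # engineered feature: tag overlap between user and item
    return len(user_tags & item["tags"]) / max(len(user_tags), 1)

def ensemble_score(item, user_tags, weights=(0.4, 0.2, 0.4)):
    scores = (model_popularity(item),
              model_recency(item),
              model_personal(item, user_tags))
    return sum(w * s for w, s in zip(weights, scores))

items = [
    {"id": "a", "plays": 90, "age_days": 30, "tags": {"ml", "python"}},
    {"id": "b", "plays": 40, "age_days": 1,  "tags": {"ml"}},
]
user = {"ml", "recsys"}
ranked = sorted(items, key=lambda it: ensemble_score(it, user), reverse=True)
print([it["id"] for it in ranked])  # highest blended score first
```

In practice the blend weights themselves are often learned (stacking), which is where feature engineering for the ensemble inputs pays off.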
Presentation slides at RecSys 2016, Boston. At Quora, our mission is to share and grow the world’s knowledge. Recommender systems are at the core of this mission: we need to recommend the most important questions to the people most likely to write great answers, and recommend the best answers to the people interested in reading them. Driven by this mission, we have a variety of interesting and challenging recommendation problems and a large, rich data set we can use to build novel solutions for them. In this talk, we will describe several of these recommendation problems and present our approaches to solving them.
Big & Personal: the data and the models behind Netflix recommendations by Xa... - BigMine
Since the Netflix $1 million Prize, announced in 2006, our company has been known for having personalization at the core of our product. Even at that point in time, the dataset that we released was considered “large”, and we spurred innovation in the (Big) Data Mining research field. Our current product offering is now focused on instant video streaming, and our data is now many orders of magnitude larger. Not only do we have many more users in many more countries, but we also receive many more streams of data. Besides ratings, we now also use information such as what our members play, browse, or search for.
In this talk I will discuss the different approaches we follow to deal with these large streams of data in order to extract information for personalizing our service. I will describe some of the machine learning models used, as well as the architectures that allow us to combine complex offline batch processes with real-time data streams.
Past, present, and future of Recommender Systems: an industry perspective - Xavier Amatriain
Keynote for the ACM Intelligent User Interfaces conference in 2016 in Sonoma, CA. I start with the past by talking about the recommender problem and the Netflix Prize. Then I move to the present and the future by talking about approaches that go beyond rating prediction and ranking, finishing with some of the most important lessons learned over the years. Throughout the talk I put special emphasis on the relationship between algorithms and the user interface.
Déjà Vu: The Importance of Time and Causality in Recommender Systems - Justin Basilico
Talk at RecSys 2017 in Como, Italy on 2017-08-29.
Abstract:
Time plays a key role in recommendation. Handling it properly is especially critical when using recommender systems in real-world applications, which may not be as clear when doing research with historical data. In this talk, we will discuss some of the important challenges of handling time in recommendation algorithms at Netflix. We will focus on challenges related to how our users, items, and systems all change over time. We will then discuss some strategies for tackling these challenges, which revolve around the proper treatment of causality in our systems.
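One concrete consequence of taking time seriously, sketched under my own assumptions (this is not Netflix's code): evaluating a recommender with a random train/test split leaks future behavior into training, whereas splitting at a time cutoff mirrors what a production system actually sees.

```python
# Time-aware evaluation split. Each event is (user, item, timestamp).
# Training only on events strictly before the cutoff avoids leaking the
# future into the model, unlike a random split over the same events.

def temporal_split(events, cutoff):
    """Train on events strictly before `cutoff`, test on the rest."""
    train = [e for e in events if e[2] < cutoff]
    test = [e for e in events if e[2] >= cutoff]
    return train, test

events = [
    ("u1", "i1", 100), ("u1", "i2", 200),
    ("u2", "i1", 150), ("u2", "i3", 300),
]
train, test = temporal_split(events, cutoff=250)
print(len(train), len(test))  # 3 events before t=250, 1 after
```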
This is part 1 of the tutorial Xavier and Deepak gave at RecSys 2016. You can find the second part at http://www.slideshare.net/xamat/recsys-2016-tutorial-lessons-learned-from-building-reallife-recommender-systems
A Multi-Armed Bandit Framework For Recommendations at Netflix - Jaya Kawale
In this talk, we present a general multi-armed bandit framework for recommendations on the Netflix homepage. We present two example case studies of using MABs at Netflix: a) artwork personalization, which selects a personalized visual for each title for each of our members, and b) billboard recommendation, which selects the right title to feature on the billboard.
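For intuition, a multi-armed bandit can be reduced to a toy epsilon-greedy policy, where each arm might stand for an artwork variant (this sketch and its click-through rates are illustrative, not the framework from the talk):

```python
import random

class EpsilonGreedyBandit:
    """Toy epsilon-greedy bandit: each arm could be an artwork variant."""
    def __init__(self, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms   # running mean reward per arm

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))  # explore
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n  # incremental mean

random.seed(0)
bandit = EpsilonGreedyBandit(n_arms=3, epsilon=0.1)
true_ctr = [0.1, 0.5, 0.3]   # hypothetical click-through rates per variant
for _ in range(2000):
    arm = bandit.select()
    reward = 1.0 if random.random() < true_ctr[arm] else 0.0
    bandit.update(arm, reward)
print(bandit.counts)  # most pulls should concentrate on the best arm
```

Production systems typically use richer policies (e.g. contextual bandits with Thompson sampling), but the explore/exploit trade-off is the same.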
The first requirement for an online mathematics homework engine is to encourage students to practice and reinforce their mathematics skills in ways that are as good as or better than traditional paper homework. The use of the computer and the internet should not limit the kind or quality of the mathematics that we teach and, if possible, should expand it.
Now that much of homework practice takes place online, we have the potential for a new and much better window into how students learn mathematics, but we must continue to ensure that students are studying the mathematics we want them to learn and not just mathematics that is easily gradable. Several of the open-source mathematics engines that do this well are represented at this conference.
The WeBWorK mathematics rendering engine started twenty years ago as a stand-alone application. Since then, homework questions contributed by many mathematicians to the Open Problem Library (OPL) have grown into a collection of over 30,000 Creative Commons-licensed problems, primarily directed toward calculus but ranging from basic algebra through matrix linear algebra.
I’ll present one of the adaptations of WeBWorK that allows it to render mathematics questions for a standard Moodle quiz in much the same way that STACK functions. Both STACK and WeBWorK vastly increase Moodle’s ability to handle mathematics. Using the Moodle quiz format will make the OPL available to many more educators and allows use of Moodle’s facilities for collecting student data.
If there is time I’ll show a second adaptation which allows WeBWorK to serve as an assignment type within Moodle. These same mechanisms allow active WeBWorK questions to be embedded in other learning management systems, in interactive textbooks and even HTML pages. This capability fits well with an emerging trend to use smaller, more specialized, inter-operating components for online education.
The term “machine learning” was coined in 1959 by Arthur Samuel, an American pioneer in the fields of computer gaming and artificial intelligence, who stated that it “gives computers the ability to learn without being explicitly programmed”. In 1997, Tom Mitchell gave a “well-posed” mathematical and relational definition: “A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E”.
Machine learning is needed for tasks that are too complex for humans to code directly. Instead, we provide a large amount of data to a machine learning algorithm and let the algorithm work it out by exploring that data and searching for a model that achieves what the programmers have set out to achieve.
Xavier Amatriain, VP of Engineering, Quora at MLconf SEA - 5/01/15 - MLconf
Machine learning applications for growing the world’s knowledge at Quora: At Quora our mission is to “share and grow the world’s knowledge”. We want to do this by getting the right questions to the right people to answer them, but also by getting the existing answers to people who are interested in them. In order to accomplish this we need to build a complex ecosystem where we value issues such as content quality, engagement, demand, interests, and reputation. It is not possible to build a system like this unless most of the processes are highly automated and scalable. We are fortunate, though, to have lots of very good quality data on which to build machine learning solutions that can help address all of the previous requirements.
In this talk I will describe some interesting uses of machine learning at Quora that range from different recommendation approaches such as personalized ranking to classifiers built to detect duplicate questions or spam. I will describe some of the modeling and feature engineering approaches that go into building these systems. I will also share some of the challenges faced when building such a large-scale knowledge base of human-generated knowledge.
Deep Learning For Practitioners, lecture 2: Selecting the right applications... - ananth
In this presentation we articulate when deep learning techniques yield the best results from a practitioner's viewpoint. Should we apply deep learning techniques to every machine learning problem? What characteristics of an application make it suitable for deep learning? Does more data automatically imply better results regardless of the algorithm or model? Does "automated feature learning" obviate the need for data preprocessing and feature design?
Deep learning has accomplished impressive feats in areas such as voice recognition, image processing, and natural language processing. Deep learning enthusiasts have rushed to predict that this family of algorithms is likely to take over most other applications in the near future. This focus on deep architectures seems to have cast a shadow over more “traditional” machine learning and data science approaches, leaving researchers and practitioners alike wondering whether there is any point in investing in feature engineering or simpler models.
In this talk, I will go over what deep learning can and cannot do for you, both now and in the near future. I will also describe how different approaches will continue to be needed, and why their demand will likely grow despite the rise of deep learning. I will support my claims not only by looking at recent publications, but also by using practical examples drawn from my experience at companies at the forefront of machine learning applications, such as Quora.
Lean DevOps - Lessons Learned from Innovation-driven Companies - Xavier Amatriain
Presentation I gave at the IEEE Devops Symposium in the Computer History Museum, Mountain View. I describe the CASSSH model for Devops as well as lessons learned in innovation-driven companies.
From Idea to Execution: Spotify's Discover Weekly - Chris Johnson
Discover Weekly is a personalized mixtape of 30 songs curated and delivered to Spotify's 75M active users every Monday. It has received high acclaim in the press and reached 1B streams within its first 10 weeks. In this slide deck we dive into the narrative of how Discover Weekly came to be, highlighting technical challenges, data-driven development, and the machine learning models used to power our recommendations engine.
Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw... - Christian Posse
Invited Talk at KDD 2012 (Industry Practice Expo)
http://kdd2012.sigkdd.org/indexpo.shtml#posse
Abstract: By helping members to connect, discover and share relevant content or find a new career opportunity, recommender systems have become a critical component of user growth and engagement for social networks. The multidimensional nature of engagement and diversity of members on large-scale social networks have generated new infrastructure and modeling challenges and opportunities in the development, deployment and operation of recommender systems.
This presentation will address some of these issues, focusing on the modeling side, for which new research is much needed, while describing a recommendation platform that enables real-time recommendation updates at scale as well as batch computations, and cross-leverage between different product recommendations. Topics covered on the modeling side will include optimizing for multiple competing objectives, reconciling contradictory business goals, modeling user intent and interest to maximize the placement and timeliness of recommendations, utility metrics beyond CTR that leverage real-time tracking of both explicit and implicit user feedback, gathering training data for new product recommendations, virality-preserving online testing, and virtual profiling.
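The "multiple competing objectives" point can be illustrated with a toy scalarization (the metric names, weights, and candidates are my own hypothetical examples): several utility signals are combined into a single ranking score whose weights encode the business trade-off.

```python
# Toy multi-objective ranking: combine competing utilities (a CTR estimate,
# predicted long-term engagement, a diversity bonus) into one score.
# Weights encode the business trade-off and are purely illustrative.

def utility(candidate, weights):
    return sum(weights[k] * candidate[k] for k in weights)

weights = {"ctr": 0.5, "engagement": 0.4, "diversity": 0.1}
candidates = [
    {"id": "job_a", "ctr": 0.30, "engagement": 0.10, "diversity": 0.9},
    {"id": "job_b", "ctr": 0.20, "engagement": 0.60, "diversity": 0.2},
]
ranked = sorted(candidates, key=lambda c: utility(c, weights), reverse=True)
print([c["id"] for c in ranked])  # job_b wins despite its lower CTR
```

Shifting the weights shifts the trade-off, which is exactly why such weights end up being business decisions rather than purely modeling ones.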
A.E. Eiben's presentation slides on Methodological Issues in Bio-inspired Computing or How to Get a PhD in …? From Bionetics 2011. Sponsored by the Awareness Initiative.
Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15 - MLconf
10 More Lessons Learned from Building Real-Life ML Systems: A year ago I presented a collection of 10 lessons at MLconf. The goal of the presentation was to highlight some of the practical issues that ML practitioners encounter in the field, many of which are not included in traditional textbooks and courses. The original 10 lessons included some related to issues such as feature complexity, sampling, regularization, distributing/parallelizing algorithms, and how to think about offline vs. online computation.
Since that presentation and its associated material were published, I have been asked to complement it with newer material. In this talk I will present 10 new lessons that not only build upon the original ones, but also relate to my recent experiences at Quora. I will talk about the importance of metrics, training data, and the debuggability of ML systems. I will also describe how to combine supervised and unsupervised approaches and the role of ensembles in practical ML systems.
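One common shape of "combining supervised and unsupervised approaches", sketched with toy data and my own hypothetical labels: an unsupervised step (here a tiny 1-D 2-means clustering) derives a cluster-id feature, and a small amount of labeled data then attaches a label to each cluster.

```python
# Unsupervised step: 1-D 2-means clustering derives a cluster-id feature.
# Supervised step: learn the majority label per cluster from a few labels.
# All data and labels are illustrative.

def two_means(xs, iters=10):
    c0, c1 = min(xs), max(xs)
    for _ in range(iters):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0 = sum(g0) / len(g0)
        c1 = sum(g1) / len(g1)
    return c0, c1

def cluster_id(x, centroids):
    return min(range(2), key=lambda i: abs(x - centroids[i]))

unlabeled = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]   # plentiful unlabeled data
centroids = two_means(unlabeled)
labeled = [(0.12, "spam"), (0.18, "spam"), (5.1, "ok")]  # scarce labels

majority = {}
for x, y in labeled:
    majority.setdefault(cluster_id(x, centroids), []).append(y)
rule = {cid: max(set(ys), key=ys.count) for cid, ys in majority.items()}

print(rule[cluster_id(0.16, centroids)])  # classify a new point
```

The unsupervised structure does most of the work; the labels only have to name the clusters, which is why this pattern is attractive when labels are expensive.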
Scaling Recommendations at Quora (RecSys talk 9/16/2016) - Nikhil Dandekar
Talk about scaling Quora's recommendations and ML systems, given at the Large Scale Recommendation Systems (LSRS) workshop at the ACM RecSys conference in Boston.
Production-Ready BIG ML Workflows - from zero to hero - Daniel Marcous
Data science isn't an easy task to pull off.
You start with exploring data and experimenting with models.
Finally, you find some amazing insight!
What now?
How do you transform a little experiment to a production ready workflow? Better yet, how do you scale it from a small sample in R/Python to TBs of production data?
Building a BIG ML Workflow - from zero to hero is about the work process you need to follow in order to have a production-ready workflow up and running.
Covering :
* Small - Medium experimentation (R)
* Big data implementation (Spark MLlib + ML Pipelines)
* Setting Metrics and checks in place
* Ad hoc querying and exploring your results (Zeppelin)
* Pain points & Lessons learned the hard way (is there any other way?)
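The "metrics and checks" bullet above can be sketched as a quality gate a production workflow might run before promoting a new model (the metric names and thresholds below are invented for illustration):

```python
# Toy model-quality gate: a workflow step that refuses to promote a model
# unless its offline metrics clear configured thresholds.

THRESHOLDS = {"auc": 0.75, "precision_at_10": 0.30}  # illustrative floors

def quality_gate(metrics, thresholds=THRESHOLDS):
    """Return (passed, list of failed metric names)."""
    failures = [name for name, floor in thresholds.items()
                if metrics.get(name, 0.0) < floor]
    return (len(failures) == 0), failures

ok, failed = quality_gate({"auc": 0.81, "precision_at_10": 0.28})
print(ok, failed)  # gate fails on precision_at_10
```

In a real pipeline this check would run automatically after training and block deployment, surfacing the failed metrics in the workflow logs.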
Slides for application prototyping workshop on web and mobile application design.
We discussed
- product and project requirements definition
- rationale for wireframes, mockups, prototypes
- functional prototypes vs. production software
- tools: Balsamiq, myBalsamiq, Webflow
- MVP (minimum viable product) implementation in JavaScript, HTML/CSS on Node.js
What Are the Basics of Product Manager Interviews by Google PM - Product School
Ankit walked through an intro to the Product Manager role, the skills needed, and how the role differs between small and large companies. He wrapped up with some advice that's helped him in his Product Manager interviews over the years.
He gave a structured approach to thinking about what a Product Manager actually does (structured, meaning no "top 10" lists) and the skills you need to do well as a Product Manager.
CSSC × GDSC: Intro to Machine Learning!
Aaron Shah and Manav Bhojak on October 5, 2023
🤖 Join us for an exciting ML Workshop! 🚀 Dive into the world of Machine Learning, where we'll unravel the mysteries of CNNs, RNNs, Transformers, and more. 🤯
Get ready to embark on a journey of discovery! We'll begin with an easy-to-follow introduction to the fascinating realm of ML. 📚
🛠️ In our hands-on session, we'll walk you through setting up your environment. No tech hurdles here! 🌐
🔍 Then, we'll get down to the nitty-gritty, guiding you through our starter code for a thrilling hands-on example. Together, we'll explore the power of ML in action! 💡
Machine learning: A Walk Through School Exams - Ramsha Ijaz
When it comes to studying, machines and students have one thing in common: examinations. To perform well on their final evaluations, humans take classes, read books, and solve practice quizzes. Similarly, machines need artificial intelligence to memorize data, infer feature correlations, and pass validation standards in order to solve almost any problem. In this quick introductory session, we'll walk through these analogies to learn the core concepts behind machine learning, and why it works so well!
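The exam analogy maps directly onto holdout validation: the machine "studies" a training split and sits its "exam" on held-out data it has never seen (toy numbers, illustrative only):

```python
# Holdout validation: fit a trivial slope model on the training split and
# "examine" it on held-out data it never saw.

data = [(1, 2.0), (2, 4.1), (3, 5.9), (4, 8.2), (5, 9.8)]  # (x, y) pairs
train, test = data[:4], data[4:]

# "Studying": estimate a slope for y ≈ slope * x from the training pairs.
slope = sum(y / x for x, y in train) / len(train)

# "Exam": mean absolute error on unseen data.
mae = sum(abs(slope * x - y) for x, y in test) / len(test)
print(round(slope, 2), round(mae, 2))
```

A model that merely memorized the training pairs would ace the "practice quizzes" but fail this exam, which is exactly why the held-out split exists.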
Effective Tips for Building ML Products by Rally Health Lead PM - Product School
- Model building is only 10% of an ML Product, the rest is good Product Management. Focus on building good tracking and testing. Models will get better.
- ML Product Roadmaps look very different. It may take 2-3 years to get true gain (test vs actual accuracy). Keep managing stakeholder misinformation. Educating is part of the job.
- ML is more than a feature in your app. Relook at the whole canvas. You'll need expertise beyond data; Re-think tech design, UX and most importantly Business strategy.
Similar to BIG2016 - Lessons Learned from building real-life user-focused Big Data systems
Data/AI driven product development: from video streaming to telehealth - Xavier Amatriain
Healthcare is different from any other application domain, or is it not? While it is true that there are specific aspects, such as high-stakes decisions and a complex regulatory framework, that make healthcare somewhat different, it is also the case that many of the lessons learned from building data-driven products in other domains translate remarkably well into healthcare. This is particularly so because healthcare is also a user-facing domain, where users can be either patients or healthcare professionals. Given that data has been shown to improve user experience while ensuring quality and scalability, few would argue that healthcare cannot benefit from being much more data-driven than it has traditionally been.
In this talk, I described how this experience building impactful data and AI solutions into user facing products for decades can be leveraged to revolutionize telehealth. At Curai, we combine approaches such as state-of-the-art large language models with expert systems in areas such as NLP, vision, and automated diagnosis to augment and scale doctors, and to improve user experience and healthcare outcomes. We will see some of those applications while analyzing the role of data and ML algorithms in making them possible.
AI-driven product innovation: from Recommender Systems to COVID-19 - Xavier Amatriain
AI/Machine Learning has become an integral part of many household tech products, from Netflix to our phones. In this talk I will draw from my experience driving AI teams at some of those companies to showcase how AI can positively impact products as different as Netflix and Curai, an online telehealth service.
With half of the world’s population lacking access to healthcare services, and 30% of the adult population in the US having inadequate health insurance coverage to get even basic access to services, it should have been clear that a pandemic like COVID-19 would strain the global healthcare system well beyond its maximum capacity. In this context, many are trying to embrace and encourage the use of telehealth as a way to provide safe and convenient access to care. However, telehealth by itself cannot scale to cover all our needs unless we improve scalability and efficiency through AI and automation.
In this talk, we will describe how our work on combining the latest AI advances with medical experts and online access has the potential to change the landscape of healthcare access and provide 24/7 quality healthcare. By combining areas such as NLP, vision, and automatic diagnosis, we can augment and scale doctors. We will describe our work on combining expert systems with deep learning to build state-of-the-art medical diagnostic models that are also able to model the unknowns. We will also show our work on using language models for medical Q&A. More importantly, we will describe how those approaches have been used to address the urgent and immediate needs of the current pandemic.
AI for COVID-19: An online virtual care approach - Xavier Amatriain
Slides for the talk I gave at the AI and COVID-19 virtual conference at Stanford. Video here: https://hai.stanford.edu/events/covid-19-and-ai-virtual-conference/video-archive
From one to zero: Going smaller as a growth strategy - Xavier Amatriain
This talk was designed for engineering managers. Having been at companies of all sizes, I recommend that managers who want to grow go smaller. At the same time, I reflect on which important things remain constant regardless of size and context, and which do not.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, along with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
7. More data or better models?
Really?
Anand Rajaraman: VC, Founder, Stanford Professor
8. More data or better models?
Sometimes, it’s not about more data
9. More data or better models?
Norvig: “Google does not have better algorithms, only more data”
Many features / low-bias models
10. More data or better models?
Sometimes, it’s not about more data
11. How useful is Big Data?
● “Everybody” has Big Data
○ Does everyone need it?
○ E.g. How many users do you need to compute a MF of 100 factors?
● Smart (e.g. stratified) sampling can produce as good (or better) results!
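As a dependency-free sketch of the point above (the log rows and segment names are invented), stratified sampling keeps every user segment represented at the same rate instead of letting the majority swamp rare groups:

```python
import random
from collections import defaultdict

def stratified_sample(rows, key, frac, seed=0):
    """Sample the same fraction from every stratum, so rare groups
    stay represented instead of being swamped by the majority."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for group in strata.values():
        k = max(1, round(frac * len(group)))
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical usage log: 90% casual users, 10% power users.
rows = [{"user": i, "segment": "casual"} for i in range(900)]
rows += [{"user": i, "segment": "power"} for i in range(900, 1000)]
sample = stratified_sample(rows, key=lambda r: r["segment"], frac=0.1)
```

With `frac=0.1` the sample keeps 90 casual and 10 power rows, preserving the 90/10 split at a tenth of the size.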
13. Better models and features that “don’t work”
● E.g. You have a linear model and have been selecting and optimizing features for that model
■ More complex model with the same features -> improvement not likely
■ More expressive features with the same model -> improvement not likely
● More complex features may require a more complex model
● A more complex model may not show improvements with a feature set that is too simple
15. Hyperparameter optimization
● Automate hyperparameter optimization by choosing the right metric.
○ But, is it as simple as choosing the max?
● Bayesian Optimization (Gaussian Processes) better than grid search
○ See spearmint, hyperopt, AutoML, MOE...
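None of those libraries is needed to see why exhaustive grids waste evaluation budget. As a dependency-free stand-in for the Bayesian optimizers above, here is plain random search, a simpler technique that already probes each dimension more finely than a grid of the same size (the objective is a toy stand-in for validation loss):

```python
import random

def objective(lr, reg):
    # Toy stand-in for validation loss; in practice this would
    # train a model and evaluate it on held-out data.
    return (lr - 0.03) ** 2 + 0.1 * (reg - 0.01) ** 2

# Grid search: 16 evaluations, but only 4 distinct values per dimension.
grid = [(lr, reg) for lr in (0.001, 0.01, 0.1, 1.0)
                  for reg in (0.001, 0.01, 0.1, 1.0)]
best_grid = min(grid, key=lambda p: objective(*p))

# Random search: same 16-evaluation budget, but 16 distinct values
# per dimension, so important dimensions get probed far more finely.
rng = random.Random(0)
randoms = [(10 ** rng.uniform(-3, 0), 10 ** rng.uniform(-3, 0))
           for _ in range(16)]
best_rand = min(randoms, key=lambda p: objective(*p))
```

Bayesian optimizers go further by fitting a surrogate model (e.g. a Gaussian Process) to past evaluations and choosing the next point to try, instead of sampling blindly.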
17. Supervised/Unsupervised Learning
● Unsupervised learning as dimensionality reduction
● Unsupervised learning as feature engineering
● The “magic” behind combining unsupervised/supervised learning
○ E.g. 1: clustering + kNN
○ E.g. 2: Matrix Factorization
■ MF can be interpreted as
● Unsupervised:
○ Dimensionality reduction a la PCA
○ Clustering (e.g. NMF)
● Supervised:
○ Labeled targets ~ regression
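A minimal sketch of that dual reading, assuming nothing beyond NumPy and a synthetic rank-2 "ratings" matrix: gradient descent on R ≈ UVᵀ is unsupervised dimensionality reduction if you look at the learned factors, and regression on the observed entries if you look at the loss:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "ratings" matrix with an exact rank-2 structure.
R = rng.random((20, 2)) @ rng.random((2, 15))

k, lr = 2, 0.01
U = 0.1 * rng.standard_normal((20, k))   # user factors
V = 0.1 * rng.standard_normal((15, k))   # item factors
losses = []
for _ in range(1000):
    E = R - U @ V.T          # residuals = "regression" errors on entries
    U = U + lr * E @ V       # gradient step on user factors
    V = V + lr * E.T @ U     # gradient step on item factors
    losses.append(float((E ** 2).mean()))
# U and V are low-dimensional embeddings (the unsupervised view);
# the squared-error loss on R's entries is the supervised view.
```

Adding nonnegativity constraints to U and V would turn this into NMF, whose factors often read as soft cluster memberships.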
18. Supervised/Unsupervised Learning
● One of the “tricks” in Deep Learning is how it combines unsupervised/supervised learning
○ E.g. Stacked Autoencoders
○ E.g. training of convolutional nets
20. Ensembles
● Netflix Prize was won by an ensemble
○ Initially BellKor was using GBDTs
○ BigChaos introduced ANN-based ensemble
● Most practical applications of ML run an ensemble
○ Why wouldn’t you?
○ At least as good as the best of your methods
○ Can add completely different approaches (e.g. CF and content-based)
○ You can use many different models at the ensemble layer: LR, GBDTs, RFs, ANNs...
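A toy illustration of the “at least as good” claim, with a synthetic ground truth and two noisy stand-in models (not any production setup): the averaged ensemble’s error is never worse than the average of the members’ errors, by Jensen’s inequality:

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.standard_normal(1000)                  # ground truth
# Two imperfect "models": truth plus independent noise.
pred_a = y + rng.normal(0, 0.5, size=y.shape)
pred_b = y + rng.normal(0, 0.5, size=y.shape)

def mse(p):
    return float(((p - y) ** 2).mean())

ensemble = (pred_a + pred_b) / 2
# Squared error of the mean prediction <= mean of the squared errors,
# so averaging can only help (a lot, when the errors are uncorrelated).
```

With independent errors the ensemble’s MSE is roughly half that of either member; the gain shrinks as the members’ errors become correlated, which is why adding genuinely different approaches (CF plus content-based) pays off.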
21. Ensembles & Feature Engineering
● Ensembles are the way to turn any model into a feature!
● E.g. Don’t know if the way to go is to use Factorization Machines, Tensor Factorization, or RNNs?
○ Treat each model as a “feature”
○ Feed them into an ensemble
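A sketch of the model-as-feature trick, with two fixed prediction vectors standing in for hypothetical FM and RNN scorers, and ordinary least squares as the ensemble layer (it could equally be GBDTs):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(500)

# Hypothetical "base models": here just two noisy prediction vectors
# standing in for, say, an FM scorer and an RNN scorer.
pred_fm = y + rng.normal(0, 0.3, 500)
pred_rnn = y + rng.normal(0, 0.6, 500)

# Treat each base model's output as a feature of a linear ensemble,
# fit by least squares (intercept column included).
F = np.column_stack([pred_fm, pred_rnn, np.ones(500)])
w, *_ = np.linalg.lstsq(F, y, rcond=None)
blended = F @ w
```

Because the ensemble can always assign weight 1 to the best base model and 0 to the rest, its fit can never be worse than the best single input, which is exactly the “why wouldn’t you?” argument.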
24. Feature Engineering
● Main properties of a well-behaved ML feature
○ Reusable
○ Transformable
○ Interpretable
○ Reliable
● Reusability: You should be able to reuse features in different models, applications, and teams
● Transformability: Besides directly reusing a feature, it should be easy to use a transformation of it (e.g. log(f), max(f), ∑f_t over a time window…)
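With a made-up raw feature of daily upvote counts, the three transformations named above are one-liners:

```python
import math

# Hypothetical raw feature: daily upvote counts for one answer.
upvotes = [0, 3, 1, 8, 2, 2, 5]

log_f = [math.log1p(f) for f in upvotes]   # log(1 + f): tames heavy tails
max_f = max(upvotes)                       # max(f) over the history
window_sum = sum(upvotes[-3:])             # sum of f_t over a 3-day window
```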
25. Feature Engineering
● Main properties of a well-behaved ML feature
○ Reusable
○ Transformable
○ Interpretable
○ Reliable
● Interpretability: In order to do any of the previous, you need to be able to understand the meaning of features and interpret their values.
● Reliability: It should be easy to monitor and detect bugs/issues in features
26. Feature Engineering Example - Quora Answer Ranking
What is a good Quora answer?
• truthful
• reusable
• provides explanation
• well formatted
• ...
27. Feature Engineering Example - Quora Answer Ranking
How are those dimensions translated into features?
• Features that relate to the answer quality itself
• Interaction features (upvotes/downvotes, clicks, comments…)
• User features (e.g. expertise in topic)
29. Implicit vs. Explicit
● Many have acknowledged that implicit feedback is more useful
● Is implicit feedback really always more useful?
● If so, why?
30. Implicit vs. Explicit
● Implicit data is (usually):
○ More dense, and available for all users
○ Better representative of user behavior vs. user reflection
○ More related to final objective function
○ Better correlated with AB test results
● E.g. rating vs. watching
31. Implicit vs. Explicit
● However
○ It is not always the case that direct implicit feedback correlates well with long-term retention
○ E.g. clickbait
● Solution:
○ Combine different forms of implicit + explicit to better represent the long-term goal
33. Defining training/testing data
● Training a simple binary classifier for good/bad answer
○ Defining positive and negative labels -> non-trivial task
○ Is this a positive or a negative?
● funny uninformative answer with many upvotes
● short uninformative answer by a well-known expert in the field
● very long informative answer that nobody reads/upvotes
● informative answer with grammar/spelling mistakes
● ...
34. Other training data issues: Time traveling
● Time traveling: usage of features that originated after the event you are trying to predict
○ E.g. Your upvoting an answer is a pretty good prediction of you reading that answer, especially because most upvotes happen AFTER you read the answer
○ Tricky when you have many related features
○ Whenever I see an offline experiment with huge wins, I ask: “Is there time traveling?”
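A minimal guard against time traveling (the event log is invented): build features only from events strictly before the prediction time, so the later upvote cannot leak into a model predicting the read:

```python
from datetime import datetime

# Hypothetical event log for one (user, answer) pair.
events = [
    {"type": "impression", "t": datetime(2016, 9, 1, 10, 0)},
    {"type": "read",       "t": datetime(2016, 9, 1, 10, 5)},
    {"type": "upvote",     "t": datetime(2016, 9, 1, 10, 6)},
]

def features_at(events, cutoff):
    """Only events strictly before the prediction time may become
    features; anything at or after the cutoff would be time traveling."""
    return {e["type"] for e in events if e["t"] < cutoff}

# Predicting the 10:05 read: only the impression is a legal feature;
# the read itself and the later upvote are excluded.
feats = features_at(events, cutoff=datetime(2016, 9, 1, 10, 5))
```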
36. Training a model
● Model will learn according to:
○ Training data (e.g. implicit and explicit)
○ Target function (e.g. probability of user reading an answer)
○ Metric (e.g. precision vs. recall)
● Example 1 (made up):
○ Optimize probability of a user going to the cinema to watch a movie and rate it “highly” by using purchase history and previous ratings. Use NDCG of the ranking as final metric, using only movies rated 4 or higher as positives.
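The metric in that made-up example is easy to pin down in code: a small binary-gain NDCG where relevance 1 stands for “rated 4 or higher”:

```python
import math

def ndcg(ranked_relevances):
    """NDCG with binary gains; relevance 1 = 'rated 4 or higher'."""
    dcg = sum(rel / math.log2(i + 2)
              for i, rel in enumerate(ranked_relevances))
    ideal = sorted(ranked_relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# The model ranked five movies; the user rated the 1st and 3rd >= 4,
# so the ranking is penalized for placing a negative at position 2.
score = ndcg([1, 0, 1, 0, 0])
```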
37. Example 2 - Quora’s feed
● Training data = implicit + explicit
● Target function: value of showing a story to a user ~ weighted sum of actions: v = ∑_a v_a · 1{y_a = 1}
○ Predict probabilities for each action, then compute the expected value: v_pred = E[V | x] = ∑_a v_a · p(a | x)
● Metric: any ranking metric
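With made-up per-action values and predicted probabilities, the expected value above is a one-line sum:

```python
# Hypothetical per-action values v_a and predicted probabilities
# p(a | x) for one (user, story) pair; the numbers are illustrative.
values = {"upvote": 5.0, "share": 10.0, "click": 1.0, "hide": -20.0}
p      = {"upvote": 0.10, "share": 0.02, "click": 0.30, "hide": 0.01}

# v_pred = E[V | x] = sum over actions a of v_a * p(a | x)
v_pred = sum(values[a] * p[a] for a in values)
# 0.5 + 0.2 + 0.3 - 0.2 = 0.8
```

Stories are then ranked by v_pred, so a high hide probability can outweigh a high click probability, which is exactly the clickbait defense from slide 31.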
38. Offline testing
● Measure model performance using (IR) metrics
● Offline performance = indication to make decisions on follow-up A/B tests
● A critical (and mostly unsolved) issue is how offline metrics correlate with A/B test results.
41. The curse of presentation bias
● User can only click on what you decide to show
● But, what you decide to show is the result of what your model predicted is good
● Simply treating things you show as negatives is not likely to work
● Better options:
○ Correcting for the probability a user will click on a position -> attention models
○ Explore/exploit approaches such as MAB
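A sketch of the simplest explore/exploit scheme, epsilon-greedy (the CTRs are invented; real systems would likely use more sample-efficient bandits such as Thompson sampling): the exploration step keeps showing items the model currently ranks low, which is what breaks the presentation-bias feedback loop:

```python
import random

def epsilon_greedy(true_ctr, trials=10000, eps=0.1, seed=0):
    """Explore with probability eps, otherwise exploit the best
    estimate so far. Exploration gives unshown items a chance."""
    rng = random.Random(seed)
    n = len(true_ctr)
    counts, rewards = [0] * n, [0.0] * n
    for _ in range(trials):
        if rng.random() < eps:
            arm = rng.randrange(n)                     # explore
        else:
            est = [rewards[i] / counts[i] if counts[i] else 0.0
                   for i in range(n)]
            arm = max(range(n), key=lambda i: est[i])  # exploit
        counts[arm] += 1
        # Simulated click with the arm's (hidden) true CTR.
        rewards[arm] += 1.0 if rng.random() < true_ctr[arm] else 0.0
    return counts

counts = epsilon_greedy([0.02, 0.05, 0.10])
```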
43. Distributing ML
● Most of what people do in practice can fit into a multi-core machine
○ Smart data sampling
○ Offline schemes
○ Efficient parallel code
● Dangers of “easy” distributed approaches such as Hadoop/Spark
● Do you care about costs? How about latencies?
44. Distributing ML
● Example of optimizing computations to fit them into one machine
○ Spark implementation: 6 hours, 15 machines
○ Developer time: 4 days
○ C++ implementation: 10 minutes, 1 machine
● Most practical applications of Big Data can fit into a (multicore) implementation
46. ● In data, size is not all that matters
● Understand dependencies between data, models & systems
● Choose the right metric & optimize what matters
● Be thoughtful about
○ Your ML infrastructure/tools
○ Interaction between data and UX