To download please go to: http://www.intelligentmining.com/category/knowledge-base/
Slides as presented by Alex Lin to the NYC Predictive Analytics Meetup group: http://www.meetup.com/NYC-Predictive-Analytics/ on Dec. 10, 2009.
Counterfactual Learning for Recommendation - Olivier Jeunen
Slides for our presentation at the REVEAL workshop for RecSys '19 in Copenhagen and a Data Science Leuven Meetup, titled "Counterfactual Learning for Recommendation".
Traditional randomized experiments allow us to determine the overall causal impact of a treatment program (e.g. in marketing, medicine, education, or social and political programs). Uplift modeling (also known as true lift, net lift, or incremental lift) goes a step further: it uses data mining and machine learning to identify the individuals who are truly positively influenced by a treatment. This technique allows us to identify the “persuadables” and thus optimize target selection in order to maximize treatment benefits. This important subfield of data mining, data science, and business analytics has gained significant attention in areas such as personalized marketing, personalized medicine, and political elections, with many publications and presentations appearing in recent years from both industry practitioners and academics.
In this workshop, I will introduce the concept of uplift, review existing methods, contrast them with the traditional approach, and introduce a new method that can be implemented with standard software. A method and metrics for model assessment will be recommended. Our discussion will include new approaches to the general situation where only observational data are available, i.e. without randomized experiments, using techniques from causal inference. Additionally, an integrated modeling approach for uplift and direct response (where it can be identified who actually responded, e.g. through click-through or coupon scanning) will be discussed. Last but not least, I will discuss an extension to the multiple-treatment situation, with solutions for optimizing treatments at the individual level. While the talk is geared towards marketing applications (“personalized marketing”), the same methodologies can be readily applied in other fields such as insurance, medicine, education, politics, and social programs. Examples from the retail and non-profit industries will be used to illustrate the methodologies.
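As a rough illustration of the uplift idea described above, the sketch below estimates uplift per customer segment as the difference between the response rate of treated and control customers in a randomized experiment. It is not the method from the workshop; the function name `uplift_by_segment`, the segment labels, and the toy data are all hypothetical.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """Estimate uplift per segment from randomized-experiment records.

    records: iterable of (segment, treated, responded) tuples.
    Returns {segment: treated response rate - control response rate}.
    """
    # counts[segment][treated] = [number of customers, number of responders]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, responded in records:
        cell = counts[segment][treated]
        cell[0] += 1
        cell[1] += int(responded)

    uplift = {}
    for segment, groups in counts.items():
        (n_t, r_t), (n_c, r_c) = groups[True], groups[False]
        if n_t == 0 or n_c == 0:
            continue  # both groups are needed to estimate a difference
        uplift[segment] = r_t / n_t - r_c / n_c
    return uplift

# Hypothetical toy data: (segment, treated, responded)
records = (
    [("new", True, True)] * 30 + [("new", True, False)] * 70
    + [("new", False, True)] * 10 + [("new", False, False)] * 90
    + [("loyal", True, True)] * 50 + [("loyal", True, False)] * 50
    + [("loyal", False, True)] * 50 + [("loyal", False, False)] * 50
)
uplift = uplift_by_segment(records)
for segment, value in sorted(uplift.items()):
    print(f"{segment}: {value:+.2f}")
```

In this toy data the "loyal" segment responds at the same rate with or without treatment, so targeting it adds nothing even though its raw response rate is the highest; the "new" segment is where the persuadables are.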
This is a brief review of net lift models based on the presentation I did at Truecar in June, 2013 after I attended the training provided by SAS Institute. Recently, I added a few things to the slides after I reviewed several online examples and papers.
Uplift Modelling as a Tool for Making Causal Inferences at Shopify - Mojan Hamed - Rising Media Ltd.
For many businesses, it is not enough to model the probability of an outcome; the real question is, “given a predictive model, what can we do to change the probability of this outcome?” The goal of this talk is to present how uplift modelling is used to make causal inferences that guide acquisition strategy at Shopify. Mojan will walk through a case study focused on the statistics and experimental design behind uplift modelling, as well as the lessons learned from bringing this model to production. The Python implementation of this presentation will be made available to attendees.
Revisiting Offline Evaluation for Implicit-Feedback Recommender Systems (Doct... - Olivier Jeunen
Slides for my Doctoral Symposium presentation at RecSys '19 in Copenhagen, titled "Revisiting Offline Evaluation for Implicit-Feedback Recommender Systems".
Recommender Systems and Active Learning - Dain Kaplan
This presentation gives a high-level overview of recommender systems and active learning, including the viewpoint of startups vs. established companies, the cold-start problem, etc.
Online recommendations at scale using matrix factorisation - Marcus Ljungblad
This presentation was used for my thesis defense held at Universidad Politecnica de Catalunya, Spain, for a double-degree master programme in Distributed Computing. The other two universities participating in the programme are Royal Institute of Technology, Stockholm, Sweden and Instituto Tecnico Superior, Lisbon, Portugal.
Recommendation Engine Powered by Hadoop - Pranab Ghosh - BigDataCloud
Personalized recommendations are ubiquitous on social network and shopping sites these days. How do they do it? As long as enough user interaction data is available for items (e.g. products on shopping sites), a recommendation engine based on what is known as 'Collaborative Filtering' is not that difficult to build. Since the solution causes a combinatorial explosion, Hadoop can play a critical role in processing the massive amounts of data in collaborative-filtering-based solutions. In this presentation, I will cover a Hadoop-based recommendation engine implementation using collaborative filtering.
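To make the collaborative filtering idea above concrete, here is a minimal in-memory sketch of item-item similarity from co-occurrence counts. It is an assumption-laden toy, not the Hadoop implementation from the talk: the function names, the cosine normalization, and the sample data are all hypothetical. The pair-counting loop is where the combinatorial explosion comes from, since every user contributes one count per pair of items they interacted with.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def item_similarities(interactions):
    """interactions: dict mapping user -> set of items.

    Returns {(item_a, item_b): cosine similarity} for co-occurring pairs,
    with each pair stored once in sorted order.
    """
    item_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for items in interactions.values():
        for item in items:
            item_counts[item] += 1
        # One count per item pair per user: quadratic in basket size.
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1
    return {
        (a, b): n / sqrt(item_counts[a] * item_counts[b])
        for (a, b), n in pair_counts.items()
    }

def recommend(user, interactions, sims, top_n=3):
    """Score unseen items by summed similarity to the user's items."""
    seen = interactions[user]
    scores = defaultdict(float)
    for (a, b), s in sims.items():
        if a in seen and b not in seen:
            scores[b] += s
        elif b in seen and a not in seen:
            scores[a] += s
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

# Hypothetical toy data
interactions = {
    "ann": {"book", "lamp"},
    "bob": {"book", "lamp", "desk"},
    "cat": {"book", "desk"},
    "dan": {"lamp"},
}
sims = item_similarities(interactions)
recs = recommend("dan", interactions, sims)
print(recs)
```

On real data the pair counts would typically be computed in a distributed job and the similarity table pruned to the strongest neighbours per item.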
Text mining to correct missing CRM information: a practical data science project - Jonathan Sedar
20-minute talk given at PyData London 2014
A client in the energy sector wanted to create predictive behavioural models of business customers at the company level, but the CRM data was messy, often containing several sub-accounts for each business, without any grouping identifiers, and so aggregation was impossible. In this talk I describe a short project where we used text mining, a handful of unsupervised learning techniques and pragmatic use of human skill, to identify the true company level structures in the CRM data.
Introduction to the Factorization Machines model with an example. Motivations (why you should have it in your toolbox), the model and its expressiveness, a use case for context-aware recommendations, and Field-Aware Factorization Machines.
DataEngConf: Building the Next New York Times Recommendation Engine - Hakka Labs
By Alex Spangher (Data Engineer, New York Times Digital)
Machine Learning is a discipline characterized by systematic approaches and common threads to seemingly diverse problems. In this talk I'll discuss several approaches taken during our work on the next New York Times Recommendation Engine, specifically focusing on spatial reasoning, dimensionality reduction, and testing strategies. Topics covered will include implicit regression, Bayesian modeling and neural networks. The talk will focus on the commonalities between the different approaches taken.
To download slides:
http://www.intelligentmining.com/category/knowledge-base/
These are my notes for a presentation I did internally at IM. It covers both the multinomial and multi-variate Bernoulli event models in Naive Bayes text classification.
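The difference between the two event models mentioned above can be sketched in a few lines. This is an illustrative toy, not taken from the notes: the function names, the vocabulary, and the documents are hypothetical, add-one (Laplace) smoothing is an assumption, and the class prior is left out for brevity. The multinomial model counts every token occurrence, while the multi-variate Bernoulli model only records whether each vocabulary term is present, and also scores absent terms.

```python
from collections import Counter
from math import log

def train(docs, vocab):
    """Estimate smoothed per-class parameters for both event models.

    docs: list of token lists, all from ONE class.
    Returns (multinomial, bernoulli) dicts mapping term -> probability.
    """
    term_counts = Counter(t for d in docs for t in d if t in vocab)
    doc_freq = Counter(t for d in docs for t in set(d) if t in vocab)
    total_terms = sum(term_counts.values())
    # Multinomial: P(term | class) from token counts.
    multinomial = {
        t: (term_counts[t] + 1) / (total_terms + len(vocab)) for t in vocab
    }
    # Bernoulli: P(term present | class) from document counts.
    bernoulli = {t: (doc_freq[t] + 1) / (len(docs) + 2) for t in vocab}
    return multinomial, bernoulli

def multinomial_loglik(doc, params):
    # Each occurrence contributes; terms absent from the doc are ignored.
    return sum(log(params[t]) for t in doc if t in params)

def bernoulli_loglik(doc, params):
    # Each vocabulary term contributes once, present or absent.
    present = set(doc)
    return sum(
        log(p) if t in present else log(1 - p) for t, p in params.items()
    )

# Hypothetical toy class "sports"
vocab = {"goal", "match", "vote"}
sports_docs = [["goal", "goal", "match"], ["match"]]
multinomial, bernoulli = train(sports_docs, vocab)

once, twice = ["goal"], ["goal", "goal"]
print(multinomial_loglik(once, multinomial), multinomial_loglik(twice, multinomial))
print(bernoulli_loglik(once, bernoulli), bernoulli_loglik(twice, bernoulli))
```

Repeating a word changes the multinomial score but leaves the Bernoulli score untouched, which is the practical distinction between the two models.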
We estimate that nearly one third of news articles contain references to future events. While this information can prove crucial to understanding news stories and how events will develop for a given topic, there is currently no easy way to access this information. We propose a new task to address the problem of retrieving and ranking sentences that contain mentions of future events, which we call ranking related news predictions. In this paper, we formally define this task and propose a learning-to-rank approach based on 4 classes of features: term similarity, entity-based similarity, topic similarity, and temporal similarity. Through extensive evaluations using a corpus consisting of 1.8 million news articles and 6,000 manually judged relevance pairs, we show that our approach is able to retrieve a significant number of relevant predictions related to a given topic.
To download please go to: http://www.intelligentmining.com/knowledge-base.html
Slides as presented by Alex Lin to the NYC Predictive Analytics Meetup group: http://www.meetup.com/NYC-Predictive-Analytics/ on April 1, 2010 (no joke!) :)
In this lecture, I will first cover recent advances in neural recommender systems, such as autoencoder-based and MLP-based recommender systems. Then, I will introduce recent achievements in automatic playlist continuation for music recommendation.
What really are recommendations engines nowadays?
This presentation introduces the foundations of recommendation algorithms, and covers common approaches as well as some of the most advanced techniques. Although more focused on efficiency than on theoretical properties, it uses the basics of matrix algebra and optimization-based machine learning throughout.
Table of Contents:
1. Collaborative Filtering
1.1 User-User
1.2 Item-Item
1.3 User-Item
* Matrix Factorization
* Stochastic Gradient Descent (SGD)
* Truncated Singular Value Decomposition (SVD)
* Alternating Least Squares (ALS)
* Deep Learning
2. Content Extraction
* Item-Item Similarities
* Deep Content Extraction: NLP, CNN, LSTM
3. Hybrid Models
4. In Production
4.1 Problematics
4.2 Solutions
4.3 Tools
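As one concrete instance of the matrix factorization and SGD items in the outline above, the sketch below learns user and item factors from a handful of ratings. It is a minimal illustration under stated assumptions, not the presentation's implementation: the function names, hyperparameters (`k`, `lr`, `reg`, `epochs`), and the toy ratings are all hypothetical.

```python
import random

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
              epochs=200, seed=0):
    """Fit rating ~ dot(P[user], Q[item]) with stochastic gradient descent.

    ratings: list of (user_index, item_index, rating) triples.
    Returns (P, Q), the user and item factor matrices as lists of lists.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - dot(P[u], Q[i])
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    total = sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5

# Hypothetical toy ratings: (user, item, rating)
ratings = [
    (0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
    (2, 1, 1), (2, 2, 5), (3, 0, 1), (3, 3, 4),
]
P0, Q0 = factorize(ratings, n_users=4, n_items=4, epochs=0)
P, Q = factorize(ratings, n_users=4, n_items=4)
rmse_before = rmse(ratings, P0, Q0)
rmse_after = rmse(ratings, P, Q)
print(f"training RMSE before: {rmse_before:.3f}, after: {rmse_after:.3f}")
```

ALS, also listed in the outline, optimizes the same kind of objective but alternates between solving for the user factors and the item factors in closed form rather than taking per-rating gradient steps.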
PredictionIO - Building Applications That Predict User Behavior Through Big D... - predictionio
Building Applications That Predict User Behavior Through Big Data Using Open-Source Technologies
Presented by PredictionIO at Big Data TechCon (Oct 17, 2013)
DN 2017 | Building a recommender system using collaborative filtering (CF) | ... - Dataconomy Media
You have surely had that moment of wondering what clothes to pack for your stay in Berlin, what places to visit, where to stay, and where and what to eat. You felt overwhelmed by the options! You surely want choices that appeal to your own tastes. How can this unique “You” be leveraged to get you out of the “information overload” while you search the web for the right fit?
The answer is: by personalization. Personalization is used by companies like Amazon, Netflix, and Booking. One of the most widely used tools is the recommendation engine, which tailors the customer experience to the user’s behavior.
While behavioral analysis is the secret to delivering real value, it is not easy to get right, because behavior differs from one customer to another and changes over time. This is where machine learning comes in, with a large landscape of algorithms at hand: text classification, KNN, k-means, stochastic gradient descent, neural networks. Which algorithm should you use? Which approach should you follow? Data about users comes in different forms, and sometimes it does not exist at all.
This talk provides a short walk through one of the most widely used techniques for discovering users' behavior: collaborative filtering (CF). Starting with when and where we experience it, we will first go through some example applications. Then we’ll dive into an exploration of standard CF-based methods (ML algorithms), looking at the challenges we may face as we try them and possible ways to overcome their weaknesses.
Quest for prediction accuracy by Mekbib Awono - Frosmo
Is it possible to accurately predict what users would want to consume next? Basic prediction models are becoming less and less effective, but there is another way.
About Mekbib
Mekbib is a team lead developer at FROSMO, working in the company's customer service wing. He has experience in a wide range of projects, from mobile apps to full-stack web development. Mekbib is currently pursuing his Master's in algorithms, logic and computation at Aalto University.
Explain Yourself: Why You Get the Recommendations You Do - Databricks
Machine learning recommender systems have supercharged the online retail environment by directly targeting what the customer wants. While customers are getting better product recommendations than ever before, in the age of GDPR there is growing concern about customer privacy and transparency with ML models. Many are asking, just why am I receiving these recommendations? While the current Implicit Collaborative Filtering (CF) algorithm in spark.ml is great for generating recommendations at scale, it currently lacks any method to explain why a particular customer is getting the recommendations they are getting. In this talk, we demonstrate a way to expand collaborative filtering so that the viewing history of a customer can be directly related to their recommendations. Why were you recommended footwear? Well, 40% of this recommendation came from browsing runners and 20% came from the shorts you recently purchased. It turns out that rethinking the linear algebra in the current spark.ml CF implementation makes this possible. We show how this is done and demonstrate it implemented as a new feature in spark.ml, expanding the API to allow everyone to explain recommendations at scale and create a more transparent ML future.
Authors: Niels Hanson, Kishori Konwar
This is the other category of recommender systems: Content-based filtering.
We will apply three of the various available methods: Decision Trees, Nearest Neighbor method and Polynomial Regression.
[Notebook](https://colab.research.google.com/drive/1ohF6b5LO1XA0cCSLITSApaioUIKyQowQ)
Basic functions and terminology of recommendation systems, with some algorithmic implementations on sample datasets to aid understanding. It explains all the layers of the RS framework.
Deep Learning for Recommendations: Fundamentals and Advances
In this part, we focus on the Fundamentals of Deep Recommender Systems.
Tutorial Website/slides: https://advanced-recommender-systems.github.io/ijcai2021-tutorial/
https://youtu.be/_M5S0Njmc_c
Crafting Recommenders: the Shallow and the Deep of it! Sudeep Das, Ph.D.
I present a brief review, and an outlook on the rapid changes happening in the field of recommendation engine research on the heels of the deep learning revolution!
Creating Documentation Your Users Will Love - Ena Arel
What are users looking for in technical documentation? This presentation describes 10 best practices in information development based on usability testing.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI-powered automation technology capabilities of UiPath. Hosted by our local partner Marc Ellis, the event offers a half-day packed with industry insights and networking with automation peers.
📕 Curious about our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35 Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 Discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
PHP Frameworks: I want to break free (IPC Berlin 2024) - Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed in releasing software to market, combined with traditionally slow and manual security checks, has left gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He brings around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Generative AI Deep Dive: Advancing from Proof of Concept to Production - Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
The Metaverse and AI: how can decision-makers harness the Metaverse for their... - Jen Stirrup
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses.
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Essentials of Automations: The Art of Triggers and Actions in FME - Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
3. Recommendation Engine
What is a Recommendation Engine (RE)?
An RE takes “observation” data and uses machine learning / statistical algorithms to predict outcomes or levels of interest.
“Recommender systems form a specific type of information filtering (IF) technique that attempts to present information items (movies, music, books, news, images, web pages, etc.) that are likely of interest to the user.” – Wikipedia
4. Recommendation Engine
Types of recommendation engine:
Demographic based
Clustering
Social Graph based
Knowledge based
Content based
Collaborative Filtering
Category based
Mixture Models based
This presentation will focus on Neighborhood-based Collaborative Filtering:
User-oriented method
Item-oriented method
5. Neighborhood-based Collaborative Filtering
Processing stages:
Input Data (Observations)
Data Representation: Data Normalization
Neighborhood Formation: User-user similarity vs. Item-item similarity
Recommendation Generation: Make Recommendation
Delivering Mechanism: Web Page, Email, Direct Mail, Marketing Campaign, etc.
12. User-oriented CF
Practical Implementation
Compute and store all user-user similarities.
Cosine similarity:
$$\mathrm{sim}(u,v) = \cos(u,v) = \frac{u \cdot v}{\lVert u \rVert_2 \, \lVert v \rVert_2}$$
Find N items that will be most likely purchased by user u:
Find k most similar users to u, save to Usim
Get all items purchased by Usim, save to Icandidate
Remove unavailable items in Icandidate
Get all items purchased by u, save to Ipurchased
Take Icandidate – Ipurchased = Irecmd
Re-order items in Irecmd based on sum of user-user similarity:
$$\mathrm{pred}(u,i) = \frac{\sum_{v \in k\text{-similarUser}(u)} \mathrm{userSim}(u,v) \cdot r_{vi}}{\sum_{v \in k\text{-similarUser}(u)} \mathrm{userSim}(u,v)}$$
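The user-oriented steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy `purchases` data, the function names `cosine` and `recommend_user_based`, and the `unavailable` parameter are hypothetical, r_vi is taken to be 1 when user v purchased item i, and similarities are computed on the fly rather than precomputed and stored as the slide suggests.

```python
import math
from collections import defaultdict

# Hypothetical toy data: user -> {item: r}, where r = 1 means "purchased".
purchases = {
    "u1": {"a": 1, "b": 1},
    "u2": {"a": 1, "b": 1, "c": 1},
    "u3": {"b": 1, "d": 1},
    "u4": {"e": 1},
}


def cosine(x, y):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x[key] * y[key] for key in x.keys() & y.keys())
    norms = math.sqrt(sum(v * v for v in x.values())) * math.sqrt(
        sum(v * v for v in y.values())
    )
    return dot / norms if norms else 0.0


def recommend_user_based(u, data, k=2, n=3, unavailable=frozenset()):
    """Return up to n (item, pred) pairs for user u, highest pred first."""
    # Usim: the k most similar users to u; zero-similarity users are dropped.
    ranked = sorted(
        ((cosine(data[u], data[v]), v) for v in data if v != u), reverse=True
    )
    neighbours = [(s, v) for s, v in ranked[:k] if s > 0]
    total_sim = sum(s for s, _ in neighbours)

    # Numerator of pred(u, i), accumulated only for items in Irecmd
    # (candidates that u has not purchased and that are still available).
    weighted = defaultdict(float)
    for s, v in neighbours:
        for item, r in data[v].items():
            if item not in data[u] and item not in unavailable:
                weighted[item] += s * r

    preds = {item: w / total_sim for item, w in weighted.items()}
    return sorted(preds.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


print(recommend_user_based("u1", purchases))
```

With purchase flags as ratings, pred(u,i) is the share of the neighbourhood's similarity mass that bought item i, so items bought by the closest neighbours rank first.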
17. Item-oriented CF
Practical Implementation
Compute and store all item-item similarities.
Cosine similarity:
$$\mathrm{sim}(a,b) = \cos(a,b) = \frac{a \cdot b}{\lVert a \rVert_2 \, \lVert b \rVert_2}$$
Find N items that will be most likely purchased by user u:
Get all items purchased by u, save to Ipurchased
For each item in Ipurchased, find k most similar items, save them to Icandidate
Remove unavailable items in Icandidate
Take Icandidate – Ipurchased = Irecmd
Re-order items in Irecmd based on:
$$\mathrm{pred}(u,i) = \frac{\sum_{j \in \mathrm{purchasedItems}(u)} \mathrm{itemSim}(i,j) \cdot r_{uj}}{\sum_{j \in \mathrm{purchasedItems}(u)} \mathrm{itemSim}(i,j)}$$
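The item-oriented steps above can be sketched the same way. Again this is an illustration under assumptions, not the presenter's implementation: the toy `ratings` data and the names `item_vectors`, `cosine` and `recommend_item_based` are hypothetical. Numeric ratings are used for r_uj because, with plain 0/1 purchase flags, the normalised formula evaluates to 1 for every candidate.

```python
import math
from collections import defaultdict

# Hypothetical toy data: user -> {item: rating r_uj}.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "b": 2, "c": 5},
    "u3": {"b": 4, "d": 4},
    "u4": {"a": 5, "c": 4},
}


def cosine(x, y):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x[key] * y[key] for key in x.keys() & y.keys())
    norms = math.sqrt(sum(v * v for v in x.values())) * math.sqrt(
        sum(v * v for v in y.values())
    )
    return dot / norms if norms else 0.0


def item_vectors(data):
    """Transpose user -> {item: r} into item -> {user: r}."""
    items = defaultdict(dict)
    for user, row in data.items():
        for item, r in row.items():
            items[item][user] = r
    return items


def recommend_item_based(u, data, k=2, n=3, unavailable=frozenset()):
    """Return up to n (item, pred) pairs for user u, highest pred first."""
    items = item_vectors(data)
    purchased = data[u]

    def sim(i, j):
        return cosine(items[i], items[j])

    # Icandidate: the k most similar items to each item in Ipurchased.
    candidates = set()
    for j in purchased:
        others = sorted((i for i in items if i != j), key=lambda i: (-sim(i, j), i))
        candidates.update(others[:k])
    # Irecmd = Icandidate - Ipurchased, minus unavailable items.
    candidates -= set(purchased) | set(unavailable)

    # pred(u, i): similarity-weighted average of u's own ratings.
    preds = {}
    for i in candidates:
        denom = sum(sim(i, j) for j in purchased)
        if denom > 0:
            preds[i] = sum(sim(i, j) * r for j, r in purchased.items()) / denom
    return sorted(preds.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


print(recommend_item_based("u1", ratings))
```

For purchase-only data, one option (not taken from the slides) is to rank candidates by the unnormalised numerator instead, since the normalised score no longer discriminates between items.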
19. The Challenges you will face
Data Sparsity Issue
Cold Start Problem
Curse of Dimensionality
Scalability
20. Data Sparsity Issue
Missing values in the Users / Items matrix.
[Figure: items (1 … m) × users (1 … n) matrix with only a few cells filled. Annotations: "Mostly Unknown" and "Not Reliable".]
$$\mathrm{pred}(u,i) = \frac{\sum_{v \in k\text{-similarUser}(u)} \mathrm{userSim}(u,v) \cdot r_{vi}}{\sum_{v \in k\text{-similarUser}(u)} \mathrm{userSim}(u,v)}$$
Netflix Prize data set: 98.82% of cells are blank.
A typical e-commerce transaction data set can be 10-100 times more sparse than the Netflix Prize data set!
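The sparsity figure can be checked with a short calculation. The counts below (100,480,507 ratings from 480,189 users on 17,770 movies) are the commonly cited size of the Netflix Prize training set; they do not appear on the slide and are an assumption here, as is the helper name `sparsity`.

```python
def sparsity(n_observations, n_users, n_items):
    """Fraction of cells in the users x items matrix that are blank."""
    return 1.0 - n_observations / (n_users * n_items)


# Assumed Netflix Prize training-set size (not stated on the slide).
netflix = sparsity(100_480_507, 480_189, 17_770)
print(f"{netflix:.2%}")  # prints 98.82%
```

A data set that is 10 times more sparse in this sense would have about 0.12% of its cells filled instead of about 1.18%.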
21. Cold Start Problem
It occurs when a new item or a new user is added to the data matrix.
[Figure: items (1 … m) × users (1 … n) matrix with 1s in the observed cells; most cells are empty.]
The RE does not have enough knowledge about this new user or this new item yet.
Content-based REs can be incorporated to alleviate the cold start problem.
22. Curse of Dimensionality
Adding more features (items or users) can increase the noise, and hence the error.
There aren’t enough observations to get good estimates.
[Figure: users × items matrix.]
23. Scalability
User neighborhood formation: $O(n^2)$ for n users
Item neighborhood formation: $O(m^2)$ for m items
When m (# of items) << n (# of users), item-based CF will be more efficient than user-based CF
Ability to update neighborhood incrementally
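To make the quadratic growth concrete, the number of distinct pairs whose similarity has to be computed is size × (size − 1) / 2. The sketch below compares the two neighborhood types for a hypothetical catalogue of 10,000 items and 1,000,000 users; both sizes and the name `pair_count` are illustrative assumptions.

```python
def pair_count(size):
    """Number of distinct unordered pairs among `size` users or items."""
    return size * (size - 1) // 2


n_users, m_items = 1_000_000, 10_000  # hypothetical sizes
user_pairs = pair_count(n_users)
item_pairs = pair_count(m_items)
print(f"user-user pairs: {user_pairs:,}")
print(f"item-item pairs: {item_pairs:,}")
print(f"ratio: {user_pairs / item_pairs:,.0f}x")
```

In this example the user neighborhood needs roughly ten thousand times as many similarity computations as the item neighborhood, which is why item-based CF scales better when m << n.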
25. Best Practices
Understand the data thoroughly
Define business objectives and conversion metrics judiciously
Understand context and user intent
Apply adaptive reinforcement learning
Optimize RE using cost-based methods
Be aware of data-shift issue
Optimize marketing messages delivered with recommendation results
26. Understand the data thoroughly
What data are available?
E-commerce data set typically contains:
Clickstream
Shopping cart / Saved Items / Wish list / Shared Item
Order / Return
User profile
User ratings
How are these data points being collected?
Is there pre-existing bias or leakage in the data?
Is the data related to what we want to predict?
27. Define business objectives and conversion metrics judiciously
Defining correct conversion metrics can be a competitive advantage.
[Diagram: Recommendation Engine → Web property → more revenue, through three paths:]
mitigate navigation inefficiency → increase # of orders
up-sell within the context → increase average item price
cross-sell within the context → increase items per order
28. Understand context and user intent
Context should be considered when the RE makes recommendations.
Month: December
Temperature: 45°F
30. Optimize RE using cost-based models
Cost-based engine optimization
[Figure: plot with axes labeled "Metrics" and "Parameter".]
31. Be aware of the data-shift issue
Data collection UI changes will influence the data significantly, creating an artificial data shift.
[Figure: Netflix Prize data set]
Y. Koren, "Collaborative Filtering with Temporal Dynamics," Proc. 15th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining (KDD 09), ACM Press, 2009, pp. 447-455.