Review: [SIGIR'22] Interpolative Distillation for Unifying Biased and Debiased Recommendation


This document summarizes a paper that proposes a new recommender system model called InterD that aims to perform well on both biased and debiased test sets. InterD uses distillation to train a student model using predictions from pre-trained biased and unbiased teacher models. The student model learns a weighted sum of the teacher predictions based on the environment (biased vs debiased) to determine recommendations. It also incorporates unobserved user-item pairs in training to further improve performance on both test sets. The experiments section evaluates InterD on benchmark datasets to demonstrate its effectiveness at achieving good performance on both biased and debiased recommendations.


Interpolative Distillation for Unifying Biased and Debiased Recommendation
SIGIR’22, Sihao Ding (USTC) et al.
POSTECH DI Lab
Presenter: Changsoo Kwak
2022.5.24

Motivation
▪ Test sets used to evaluate most recommender systems
▪ Normal biased test set ($D_b$)
▪ Debiased test set ($D_d$)
[1] Self-supervised Graph Learning for Recommendation, Jiancan Wu (USTC) et al., SIGIR’21
Existing models do not perform well on both test sets
▪ Each model is either biased or unbiased
▪ Each reflects only part of the whole picture

Intuitive solution?
▪ Unify $D_b$ and $D_d$
▪ Usually $|D_b| \gg |D_d|$, so $D_b$ dominates the unified data
▪ Train two models on $D_b$ and $D_d$ respectively, then ensemble them
▪ It is unclear which model is strong or weak for which type of users/items
▪ Existing ensemble strategies are not tailored to the win-win recommendation scenario
▪ Possible solution?
▪ Distillation!
▪ Aggregate the two models at the level of the user-item pair
▪ Determine the coefficients for distillation automatically

Proposed model (InterD)
Environment $E \in \{e_b, e_d\}$
▪ $P(E \mid U, I)$: probability of the environment given the user-item pair
▪ $P(R \mid U, I, E)$: predicted rating under a given environment assumption
Existing models consider only one environment
- They achieve good performance on only one of $D_b$ or $D_d$
Let the student model learn predicted ratings generated by a fine-grained weighted sum of the pre-trained models' predictions, taking the environment into account

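Proposed model (InterD)
$f_b$, $f_d$: pre-trained biased/unbiased models
▪ Estimate $P(R \mid U, I, E)$
▪ Directly use the predictions of $f_b$, $f_d$
▪ Estimate $P(E \mid U, I)$
$$w_b = \frac{L_b(r_b, r)^{\gamma_1}}{L_b(r_b, r)^{\gamma_1} + L_d(r_d, r)^{\gamma_1}}, \quad w_d = \frac{L_d(r_d, r)^{\gamma_1}}{L_b(r_b, r)^{\gamma_1} + L_d(r_d, r)^{\gamma_1}}$$
$L_b$: MSE, $L_d$: IPS-weighted MSE, $\gamma_1$: negative hyperparameter
$$P(R \mid U, I) = \sum_{E} P(R \mid U, I, E)\, P(E \mid U, I) = r^* = w_b r_b + w_d r_d$$
▪ Training the student model
Distillation loss
$$L_O = \frac{1}{|D_b| + |D_d|} \sum_{(u,i,r) \in D_b \cup D_d} L(r, r^*)$$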
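The weights and the distillation target on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The function names, the sample ratings, the `propensity` array, the small `eps` guard, the choice $\gamma_1 = -1$, and the use of a plain squared error between the student's prediction and $r^*$ are all assumptions made here. Because the exponent is negative, the teacher with the smaller loss on a pair receives the larger weight.

```python
import numpy as np

def interpolation_weights(loss_b, loss_d, gamma, eps=1e-8):
    """Per-pair weights from teacher losses raised to a negative power gamma."""
    # eps is an assumption of this sketch: it avoids 0 ** negative when a teacher is exact.
    p_b = (loss_b + eps) ** gamma
    p_d = (loss_d + eps) ** gamma
    total = p_b + p_d
    return p_b / total, p_d / total

def distillation_target(r_b, r_d, r, propensity, gamma1=-1.0):
    """Return r* = w_b * r_b + w_d * r_d together with the weights."""
    loss_b = (r_b - r) ** 2               # L_b: squared error of the biased teacher
    loss_d = (r_d - r) ** 2 / propensity  # L_d: IPS-weighted squared error (hypothetical propensities)
    w_b, w_d = interpolation_weights(loss_b, loss_d, gamma1)
    return w_b * r_b + w_d * r_d, w_b, w_d

# Toy data: three observed user-item pairs (values are illustrative only).
r = np.array([5.0, 1.0, 3.0])           # observed ratings
r_b = np.array([4.5, 3.0, 3.5])         # biased teacher predictions
r_d = np.array([3.0, 1.5, 2.5])         # debiased teacher predictions
propensity = np.ones(3)                 # assumed uniform propensities

r_star, w_b, w_d = distillation_target(r_b, r_d, r, propensity)

# Assumed student loss: mean squared error between student predictions and r*.
student_pred = np.array([4.0, 2.0, 3.0])
loss_O = np.mean((student_pred - r_star) ** 2)
```

On the first pair the biased teacher is closer to the observed rating and gets most of the weight. On the second pair the debiased teacher does. On the third pair both teachers are equally far off, so the target is their average.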

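Proposed model (InterD)
▪ Incorporate unobserved data $D_n = (U \times I) \setminus (D_b \cup D_d)$
$$w_b' = \frac{L_b(r_b, r)^{\gamma_2}}{L_b(r_b, r)^{\gamma_2} + L_d(r_d, r)^{\gamma_2}}, \quad w_d' = \frac{L_d(r_d, r)^{\gamma_2}}{L_b(r_b, r)^{\gamma_2} + L_d(r_d, r)^{\gamma_2}}$$
$$r^{*\prime} = w_b' r_b + w_d' r_d$$
Imputation distillation loss
$$L_N = \frac{1}{|D_n|} \sum_{(u,i) \in D_n} L(r, r^{*\prime})$$
The student model learns more from the closer teacher over unobserved data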
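A rough sketch of the imputation step for unobserved pairs follows. Since pairs in $D_n$ have no observed rating, this sketch assumes that $r$ in the weight formula is the student's own current prediction, which is how the editor's notes describe the weight computation. It also assumes a plain squared distance for both teachers (no propensity weighting), $\gamma_2 = -1$, and an `eps` guard. The function name and the sample values are hypothetical.

```python
import numpy as np

def imputation_target(r_b, r_d, student_pred, gamma2=-1.0, eps=1e-8):
    """Return the imputed target r*' and the weights w_b', w_d' for unobserved pairs."""
    # Distance of each teacher to the student's current prediction (assumed squared error).
    dist_b = (r_b - student_pred) ** 2
    dist_d = (r_d - student_pred) ** 2
    # Negative exponent: the closer teacher gets the larger weight.
    p_b = (dist_b + eps) ** gamma2
    p_d = (dist_d + eps) ** gamma2
    w_b = p_b / (p_b + p_d)
    w_d = p_d / (p_b + p_d)
    return w_b * r_b + w_d * r_d, w_b, w_d

# Two unobserved pairs with identical teacher predictions but different student predictions.
r_b = np.array([4.0, 4.0])            # biased teacher
r_d = np.array([2.0, 2.0])            # debiased teacher
student_pred = np.array([3.8, 2.5])   # student's current predictions (illustrative)

r_star_n, w_b_n, w_d_n = imputation_target(r_b, r_d, student_pred)

# Assumed imputation loss: mean squared error between the student and the imputed target.
loss_N = np.mean((student_pred - r_star_n) ** 2)
```

In the first pair the student is already near the biased teacher, so the target is pulled almost entirely toward it. In the second pair the student sits nearer the debiased teacher and the weights flip.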

Experiments
(Slides 7 and 8: no text content survives in the transcript.)

Editor's Notes

RCT: Randomized Controlled Trial (https://books.google.co.kr/books?id=JUTqDwAAQBAJ&pg=PA244&lpg=PA244&dq=yahoo!r3+randomized+controlled+trial&source=bl&ots=0cagKMc4KG&sig=ACfU3U3oFb-FZsxO3PuYDFYRz6gX9O97tA&hl=ko&sa=X&ved=2ahUKEwj5qp-psev3AhWim1YBHfVgC2QQ6AF6BAgDEAM#v=onepage&q=yahoo!r3%20randomized%20controlled%20trial&f=false)
In other words, the student tends to learn the easier aspects of knowledge, since the smaller distance makes it easier to follow the corresponding teacher. Because the student follows the easier side (the teacher at the smaller distance), could this be viewed as curriculum learning? And because the student's prediction enters the weight computation, could it also be viewed as self-paced learning?
