Pang Wei Koh and Percy Liang
"Understanding Black-box Predictions via Influence Functions", ICML 2017 Best Paper
References:
https://youtu.be/0w9fLX_T6tY
https://arxiv.org/abs/1703.04730
3. Introduction
A key question often asked of machine learning systems is
“Why did the system make this prediction?”
How can we explain where the model came from?
In this paper, we tackle this question by tracing a model’s predictions
through its learning algorithm and back to the training data, where the
model parameters ultimately derive from.
4. Introduction
Answering this question by perturbing the data and retraining the model
can be prohibitively expensive. To overcome this problem, we use
influence functions, a classic technique from robust statistics (Cook &
Weisberg, 1980) that tells us how the model parameters change as we
upweight a training point by an infinitesimal amount.
6. Approach
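We are given training points $z_1, \ldots, z_n$, where $z_i = (x_i, y_i) \in \mathcal{X} \times \mathcal{Y}$. For
a point $z$ and parameters $\theta \in \Theta$, let $L(z, \theta)$ be the loss, and let
$\frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta)$ be the empirical risk, with minimizer $\hat{\theta}$.
Assume that the empirical risk is twice-differentiable and strictly
convex in $\theta$.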
7. Approach
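Model parameters after training without $z$: $\hat{\theta}_{-z} \overset{\text{def}}{=} \arg\min_{\theta \in \Theta} \sum_{z_i \neq z} L(z_i, \theta)$
Model parameters after upweighting $z$ by $\epsilon$: $\hat{\theta}_{\epsilon, z} \overset{\text{def}}{=} \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z, \theta)$
Model parameters after perturbing $z$ to $z_\delta$: $\hat{\theta}_{\epsilon, z_\delta, -z} \overset{\text{def}}{=} \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z_\delta, \theta) - \epsilon L(z, \theta)$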
8. Approach
Let us begin by studying the change in model parameters due to
removing a point z from the training set.
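Formally, this change is $\hat{\theta}_{-z} - \hat{\theta}$.
The analogous changes for upweighting and for perturbing $z$ are
$\hat{\theta}_{\epsilon, z} - \hat{\theta}$ and $\hat{\theta}_{\epsilon, z_\delta, -z} - \hat{\theta}$.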
11. Up, params influence
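The influence of upweighting $z$ on the parameters is
$$\mathcal{I}_{\text{up,params}}(z) \overset{\text{def}}{=} \left. \frac{d \hat{\theta}_{\epsilon, z}}{d \epsilon} \right|_{\epsilon = 0} = -H_{\hat{\theta}}^{-1} \nabla_\theta L(z, \hat{\theta}),$$
where $H_{\hat{\theta}} \overset{\text{def}}{=} \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat{\theta})$ is the Hessian and is positive definite
(PD) by assumption. In essence, we form a quadratic approximation
to the empirical risk around $\hat{\theta}$ and take a single Newton step; see
appendix A for a derivation. Since removing a point $z$ is the same as
upweighting it by $\epsilon = -\frac{1}{n}$, we can linearly approximate the parameter
change due to removing $z$ by computing $\hat{\theta}_{-z} - \hat{\theta} \approx -\frac{1}{n} \mathcal{I}_{\text{up,params}}(z)$,
without retraining the model.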
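The leave-one-out approximation can be checked numerically on a small problem. The sketch below is an illustration only, not the authors' code: it assumes an L2-regularized squared loss $L(z, \theta) = \frac{1}{2}(x^T\theta - y)^2 + \frac{\lambda}{2}\|\theta\|^2$ on synthetic data, and the helper names `fit_ridge` and `influence_up_params` are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize (1/n) * sum_i [0.5*(x_i @ theta - y_i)**2 + 0.5*lam*||theta||^2]."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)

def influence_up_params(X, y, theta, lam, j):
    """I_up,params(z_j) = -H^{-1} grad_theta L(z_j, theta) for the loss above."""
    n, p = X.shape
    H = X.T @ X / n + lam * np.eye(p)                     # Hessian of the empirical risk
    grad_j = X[j] * (X[j] @ theta - y[j]) + lam * theta   # gradient of the j-th loss term
    return -np.linalg.solve(H, grad_j)

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
n, p, lam = 500, 3, 0.1
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

theta_hat = fit_ridge(X, y, lam)
j = 0  # index of the training point to remove

# Influence-function prediction: removing z_j is upweighting it by eps = -1/n.
predicted = -influence_up_params(X, y, theta_hat, lam, j) / n

# Ground truth: actually retrain without z_j.
mask = np.arange(n) != j
actual = fit_ridge(X[mask], y[mask], lam) - theta_hat

rel_err = np.linalg.norm(predicted - actual) / np.linalg.norm(actual)
print("relative error of the approximation:", rel_err)
```

For a quadratic loss like this one, the only error comes from using the full-data Hessian in place of the leave-one-out Hessian, so the relative error shrinks roughly like $1/n$.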
13. Perturbing a training input
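For a training point $z = (x, y)$, define $z_\delta \overset{\text{def}}{=} (x + \delta, y)$. Consider the
perturbation $z \to z_\delta$, and let $\hat{\theta}_{z_\delta, -z}$ be the empirical risk minimizer
on the training points with $z_\delta$ in place of $z$. To approximate its
effects, define the parameters resulting from moving $\epsilon$ mass from $z$
onto $z_\delta$:
$$\hat{\theta}_{\epsilon, z_\delta, -z} \overset{\text{def}}{=} \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z_\delta, \theta) - \epsilon L(z, \theta)$$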
14. Perturbing a training input
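If $x$ is continuous and $\delta$ is small, we can use a first-order Taylor expansion: for small $h$,
$$F(x + h) - F(x) \approx F'(x) \, h.$$
Applied to the gradient of the loss, this gives
$\nabla_\theta L(z_\delta, \hat{\theta}) - \nabla_\theta L(z, \hat{\theta}) \approx [\nabla_x \nabla_\theta L(z, \hat{\theta})] \, \delta$, and therefore
$$\hat{\theta}_{z_\delta, -z} - \hat{\theta} \approx -\frac{1}{n} H_{\hat{\theta}}^{-1} [\nabla_x \nabla_\theta L(z, \hat{\theta})] \, \delta.$$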
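A small numerical check of the perturbation formula, again as a sketch under assumptions rather than the paper's implementation: it uses the L2-regularized squared loss $L(z, \theta) = \frac{1}{2}(x^T\theta - y)^2 + \frac{\lambda}{2}\|\theta\|^2$, for which the mixed derivative is $\nabla_x \nabla_\theta L = x\theta^T + (x^T\theta - y) I$. The data, the choice of $\delta$, and the helper name `fit_ridge` are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize (1/n) * sum_i [0.5*(x_i @ theta - y_i)**2 + 0.5*lam*||theta||^2]."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(1)
n, p, lam = 500, 3, 0.1
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
theta_hat = fit_ridge(X, y, lam)

j = 0                                          # training point to perturb
delta = 0.01 * np.array([1.0, -1.0, 1.0])      # small input perturbation

H = X.T @ X / n + lam * np.eye(p)              # Hessian of the empirical risk
residual = X[j] @ theta_hat - y[j]
# Mixed derivative grad_x grad_theta L(z_j, theta_hat) for the squared loss.
J = np.outer(X[j], theta_hat) + residual * np.eye(p)

# Influence-function prediction of the parameter change.
predicted = -np.linalg.solve(H, J @ delta) / n

# Ground truth: retrain with x_j replaced by x_j + delta.
X_pert = X.copy()
X_pert[j] += delta
actual = fit_ridge(X_pert, y, lam) - theta_hat

rel_err = np.linalg.norm(predicted - actual) / np.linalg.norm(actual)
print("relative error of the approximation:", rel_err)
```

The approximation is first order in both $\delta$ and $1/n$, so the error grows if either the perturbation or the weight of a single point is made large.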
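16. Efficient calculation
We discuss two techniques for approximating $s_{\text{test}}$, both relying on
the fact that the Hessian-vector product (HVP) of a single term in $H_{\hat{\theta}}$,
$[\nabla_\theta^2 L(z_i, \hat{\theta})] v$, can be computed for arbitrary $v$ in the same time
that $\nabla_\theta L(z_i, \hat{\theta})$ would take, which is typically $O(p)$ (Pearlmutter, 1994).
$$s_{\text{test}} \overset{\text{def}}{=} H_{\hat{\theta}}^{-1} \nabla_\theta L(z_{\text{test}}, \hat{\theta})$$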
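To make the $O(p)$ claim concrete, here is a sketch for one specific loss. It assumes the logistic loss $L(z, \theta) = \log(1 + \exp(-y\,\theta^T x))$ with $y \in \{-1, +1\}$, whose Hessian is $\sigma(m)(1 - \sigma(m))\, x x^T$ with $m = y\,\theta^T x$. This is a closed-form shortcut for that loss, not Pearlmutter's general technique, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(m):
    return 1.0 / (1.0 + np.exp(-m))

def grad_logistic(x, y, theta):
    """Gradient of L(z, theta) = log(1 + exp(-y * theta @ x)), with y in {-1, +1}."""
    return -y * sigmoid(-y * (theta @ x)) * x

def hvp_logistic(x, y, theta, v):
    """[grad^2 L(z, theta)] v without forming the p x p Hessian: O(p) work."""
    s = sigmoid(y * (theta @ x))
    return s * (1.0 - s) * (x @ v) * x

# Random test vectors (assumed for illustration only).
rng = np.random.default_rng(0)
p = 6
x, theta, v = rng.normal(size=p), rng.normal(size=p), rng.normal(size=p)
y = 1.0

hv = hvp_logistic(x, y, theta, v)

# Reference 1: explicit Hessian, O(p^2) memory and time.
s = sigmoid(y * (theta @ x))
H_explicit = s * (1.0 - s) * np.outer(x, x)

# Reference 2: central finite difference of the gradient along v.
h = 1e-5
fd = (grad_logistic(x, y, theta + h * v) - grad_logistic(x, y, theta - h * v)) / (2 * h)

print(np.allclose(hv, H_explicit @ v), np.allclose(hv, fd, rtol=1e-4, atol=1e-6))
```

For losses without a convenient closed form, automatic differentiation frameworks provide the same product at a cost comparable to a gradient evaluation.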
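17. Efficient calculation - Conjugate gradients (CG)
Since $H_{\hat{\theta}} \succ 0$ by assumption,
$$H_{\hat{\theta}}^{-1} v \equiv \arg\min_t \left\{ \tfrac{1}{2} t^T H_{\hat{\theta}} t - v^T t \right\}.$$
We can solve this with CG approaches that only require the evaluation of
$H_{\hat{\theta}} t$, which takes $O(np)$ time, without explicitly forming $H_{\hat{\theta}}$.
$$s_{\text{test}} \overset{\text{def}}{=} H_{\hat{\theta}}^{-1} \nabla_\theta L(z_{\text{test}}, \hat{\theta})$$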
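A minimal conjugate-gradient sketch that touches the Hessian only through products $H t$. The setup is assumed for illustration: an L2-regularized logistic-regression Hessian on synthetic data, a random vector `v` standing in for $\nabla_\theta L(z_{\text{test}}, \hat{\theta})$, and a hand-written `conjugate_gradient` helper (a library solver would serve equally well).

```python
import numpy as np

def conjugate_gradient(hvp, v, tol=1e-10, max_iter=1000):
    """Solve H t = v for symmetric positive-definite H, using only products hvp(t) = H t."""
    t = np.zeros_like(v)
    r = v.copy()              # residual v - H t, with t = 0
    d = r.copy()              # current search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Hd = hvp(d)
        alpha = rs_old / (d @ Hd)
        t = t + alpha * d
        r = r - alpha * Hd
        rs_new = r @ r
        d = r + (rs_new / rs_old) * d
        rs_old = rs_new
    return t

# Synthetic logistic-regression curvature (assumed for illustration only).
rng = np.random.default_rng(0)
n, p, lam = 300, 5, 0.01
X = rng.normal(size=(n, p))
theta = rng.normal(size=p)
s = 1.0 / (1.0 + np.exp(-(X @ theta)))
w = s * (1.0 - s)             # per-example curvature of the logistic loss

def hvp(t):
    """H t in O(np) time: H = (1/n) X^T diag(w) X + lam * I is never formed."""
    return X.T @ (w * (X @ t)) / n + lam * t

v = rng.normal(size=p)        # stand-in for grad_theta L(z_test, theta_hat)
s_test = conjugate_gradient(hvp, v)

# Reference: form H explicitly and solve directly (only feasible for small p).
H = (X.T * w) @ X / n + lam * np.eye(p)
print(np.allclose(s_test, np.linalg.solve(H, v), rtol=1e-6))
```

In exact arithmetic CG needs at most $p$ iterations, each costing one HVP, which is what makes it attractive when $p$ is too large to store $H$.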
18. Efficient calculation - Stochastic estimation
$$s_{\text{test}} \overset{\text{def}}{=} H_{\hat{\theta}}^{-1} \nabla_\theta L(z_{\text{test}}, \hat{\theta})$$
Dropping the $\hat{\theta}$ subscript for clarity, let $H_j^{-1} \overset{\text{def}}{=} \sum_{i=0}^{j} (I - H)^i$, the first
$j$ terms in the Taylor expansion of $H^{-1}$. Rewrite this recursively as
$H_j^{-1} = I + (I - H) H_{j-1}^{-1}$. From the validity of the Taylor expansion,
$H_j^{-1} \to H^{-1}$ as $j \to \infty$. The key is that at each iteration, we can
substitute the full $H$ with a draw from any unbiased (and faster-to-compute)
estimator of $H$ to form $\tilde{H}_j$. Since $\mathbb{E}[\tilde{H}_j^{-1}] = H_j^{-1}$, we still
have $\mathbb{E}[\tilde{H}_j^{-1}] \to H^{-1}$.
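19. Efficient calculation - Stochastic estimation
$$\tilde{H}_j^{-1} v = v + \left( I - \nabla_\theta^2 L(z_{s_j}, \hat{\theta}) \right) \tilde{H}_{j-1}^{-1} v$$
Here $z_{s_j}$ is a training point sampled uniformly at random, and the recursion starts from $\tilde{H}_0^{-1} v = v$.
Empirically, we found this significantly faster than CG.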
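The recursion can be sketched in a few lines. Everything here is an assumption made for illustration: a squared loss with per-example Hessian $x_i x_i^T + \lambda I$, synthetic data, the helper name `neumann_inverse_hvp`, and an explicit `scale` factor that divides each sampled Hessian so that its eigenvalues lie in $(0, 1]$ and the series converges (the result is rescaled at the end).

```python
import numpy as np

def neumann_inverse_hvp(hvp_draw, v, scale, steps):
    """Run u_j = v + (I - H_draw/scale) u_{j-1}; return u/scale as an estimate of H^{-1} v.

    hvp_draw(u) must return (a draw of) H u. Dividing by `scale` keeps the
    eigenvalues of each draw in (0, 1], which the series needs to converge.
    """
    u = v.copy()
    for _ in range(steps):
        u = v + u - hvp_draw(u) / scale
    return u / scale

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
n, p, lam = 200, 3, 0.1
X = rng.normal(size=(n, p))
v = rng.normal(size=p)

H = X.T @ X / n + lam * np.eye(p)               # full Hessian, used only as a reference
exact = np.linalg.solve(H, v)
scale = np.max(np.sum(X**2, axis=1)) + lam      # upper bound on any per-example eigenvalue

def one_sample_hvp(u):
    """Unbiased draw of H u from one uniformly sampled training point, O(p) time."""
    i = rng.integers(n)
    return X[i] * (X[i] @ u) + lam * u

# Average several independent runs to reduce the variance of the estimate.
runs = [neumann_inverse_hvp(one_sample_hvp, v, scale, steps=1000) for _ in range(20)]
estimate = np.mean(runs, axis=0)

print("exact   :", exact)
print("estimate:", estimate)
```

With a deterministic `hvp_draw` (the full $H$) the recursion reduces to the plain Neumann series and converges to $H^{-1} v$; with single-sample draws the estimate is noisy, so the number of steps and repeats trades accuracy against cost.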
20. Non-convexity and non-convergence
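In practice the parameters $\tilde{\theta}$ obtained from training need not be the exact minimizer $\hat{\theta}$.
Our approach is to form a convex quadratic approximation of the loss
around $\tilde{\theta}$, i.e.,
$$\tilde{L}(z, \theta) = L(z, \tilde{\theta}) + \nabla L(z, \tilde{\theta})^T (\theta - \tilde{\theta}) + \tfrac{1}{2} (\theta - \tilde{\theta})^T (H_{\tilde{\theta}} + \lambda I) (\theta - \tilde{\theta}).$$
Here, $\lambda$ is a damping term that we add if $H_{\tilde{\theta}}$ has negative
eigenvalues; this corresponds to adding L2 regularization on $\theta$. We then
calculate $\mathcal{I}_{\text{up,loss}}$ using $\tilde{L}$. If $\tilde{\theta}$ is close to a local minimum, this is
correlated with the result of taking a Newton step from $\tilde{\theta}$ after removing $\epsilon$
weight from $z$.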
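Let $X \in \mathbb{R}^{m \times m}$ be a symmetric matrix with eigendecomposition
$$X = U \Sigma U^T, \qquad I = U I U^T,$$
so that
$$X + I = U (\Sigma + I) U^T, \qquad \text{and more generally} \qquad X + \lambda I = U (\Sigma + \lambda I) U^T.$$
Adding $\lambda I$ therefore shifts every eigenvalue of $X$ by $\lambda$.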
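The eigenvalue shift is easy to verify numerically. The sketch below uses a random symmetric matrix as a stand-in for an indefinite Hessian; the rule for choosing `lam` (just large enough to make the smallest eigenvalue 0.01) is an assumption for illustration, not a recommendation from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
X = (A + A.T) / 2                      # random symmetric matrix, typically indefinite

eigvals = np.linalg.eigvalsh(X)        # eigenvalues in ascending order

# Pick a damping term that lifts the smallest eigenvalue to 0.01.
lam = max(0.0, -eigvals.min()) + 0.01
shifted = np.linalg.eigvalsh(X + lam * np.eye(4))

print("original eigenvalues:", eigvals)
print("damped eigenvalues  :", shifted)
print("shift equals lam    :", np.allclose(shifted, eigvals + lam))
```

Because the eigenvectors are unchanged, damping makes the quadratic approximation convex without rotating its principal directions.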
24. Applications - Understanding model behavior
Influence functions reveal insights about
how models rely on and extrapolate from the training data.
Inception-v3 vs. RBF SVM (trained with a smooth hinge loss)
• The Inception network (a DNN) picked up on
the distinctive characteristics of the fish.
• The RBF SVM pattern-matched training images
superficially.
29. Application - Debugging domain mismatch
If a model makes a mistake, can we find out why?
Group                          Original   Modified   Change
Healthy + re-admitted adults   ~20k       ~20k       same
Healthy children               21         1          -20
Re-admitted children           3          3          same

Domain mismatch, where the training distribution
does not match the test distribution, can cause
models with high training accuracy to do poorly on
test data
(...)
we predicted whether a patient would be readmitted
to hospital. We used logistic regression to predict
readmission with a balanced training dataset of 20K
diabetic patients from 100+ US hospitals, each
represented by 127 features.
(...)
This caused the model to wrongly classify many
children in the test set
30. Application - Debugging domain mismatch
True test label: Healthy children
Model predicts: Re-admitted children
[Figure: influence (vertical axis, from -0.1 to 0.1) of the top 20 influential training examples]
32. Application - Fixing mislabeled examples
Training labels are noisy, and we have a small budget to manually inspect them
Can we prioritize which labels to try to fix?
Even if a human expert could recognize wrongly labeled
examples, it is impossible in many applications to manually
review all of the training data. We show that influence
functions can help human experts prioritize their attention,
allowing them to inspect only the examples that actually matter.
[Figure: two rows of example email labels: Ham, Spam, Spam, Spam, Ham / Ham, Spam, Spam, Ham, Spam]
We flipped the labels of a random 10% of the training data
33. Application - Fixing mislabeled examples
Plots of how test accuracy (left) and the fraction of flipped data
detected (right) change with the fraction of train data checked
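One way to turn this into a procedure is sketched below. The slides do not spell out the scoring rule, so this assumes each training point is scored by the magnitude of its influence on its own loss, $\nabla_\theta L(z_i, \hat{\theta})^T H_{\hat{\theta}}^{-1} \nabla_\theta L(z_i, \hat{\theta})$, and that points are inspected in decreasing order of that score. The data are synthetic, the model is an L2-regularized logistic regression fitted with Newton's method, and all names are hypothetical.

```python
import numpy as np

def sigmoid(m):
    return 1.0 / (1.0 + np.exp(-m))

def fit_logistic(X, y, lam, iters=50):
    """Newton's method on (1/n) sum_i log(1+exp(-y_i x_i@theta)) + 0.5*lam*||theta||^2."""
    n, p = X.shape
    theta = np.zeros(p)
    for _ in range(iters):
        m = y * (X @ theta)
        grad = -(X.T @ (y * sigmoid(-m))) / n + lam * theta
        w = sigmoid(m) * sigmoid(-m)
        H = (X.T * w) @ X / n + lam * np.eye(p)
        theta = theta - np.linalg.solve(H, grad)
    return theta

# Synthetic data with 10% of the labels flipped (assumed for illustration only).
rng = np.random.default_rng(0)
n, p, lam = 400, 5, 0.1
X = rng.normal(size=(n, p))
y_clean = np.where(X @ rng.normal(size=p) > 0, 1.0, -1.0)
flipped = rng.choice(n, size=n // 10, replace=False)
y = y_clean.copy()
y[flipped] *= -1.0

theta_hat = fit_logistic(X, y, lam)

# Per-example gradients (rows of G) and the Hessian at theta_hat.
m = y * (X @ theta_hat)
G = -(y * sigmoid(-m))[:, None] * X + lam * theta_hat
w = sigmoid(m) * sigmoid(-m)
H = (X.T * w) @ X / n + lam * np.eye(p)

# Self-influence magnitude g_i^T H^{-1} g_i for every training point.
self_influence = np.sum(G * np.linalg.solve(H, G.T).T, axis=1)

# Inspect the highest-scoring 20% first and see how many flips that would catch.
order = np.argsort(-self_influence)
checked = order[: n // 5]
found = np.isin(flipped, checked).mean()
print("fraction of flipped labels among the first 20% checked:", found)
```

The fraction printed at the end corresponds to one point on the right-hand plot described above (fraction of flipped data detected versus fraction of training data checked).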
35. References
Pang Wei Koh and Percy Liang. "Understanding Black-box Predictions via Influence Functions", ICML 2017 Best Paper
Paper link: https://arxiv.org/abs/1703.04730
Microsoft Research: Understanding Black-box Predictions via Influence Functions (by Pang Wei Koh)
Youtube: https://youtu.be/0w9fLX_T6tY