2. Outline
• Background
• Taylor's Formula
• Newton's Method
• Introduction
• Influence Function
• Definition
• Efficiently Calculating Influence
• Validation and Extensions
• Use cases of influence functions
3. Background: Taylor
• Taylor's theorem approximates a k-times differentiable function around a given point by its k-th order Taylor polynomial
• Linear approximation (first order)
• Quadratic approximation (second order); both are written out below
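For reference, the linear and quadratic approximations of F around a point x_n:

\[
F(x) \approx F(x_n) + F'(x_n)(x - x_n)
\]
\[
F(x) \approx F(x_n) + F'(x_n)(x - x_n) + \tfrac{1}{2} F''(x_n)(x - x_n)^2
\]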
4. Background: Newton
• Find x such that F(x) = 0 through iteration.
• Recall Taylor's formula:
• F(a) ≈ F(x_n) + F'(x_n)(a − x_n)
• Set F(a) = 0 to get a = x_n − F(x_n)/F'(x_n)
• Newton's method in optimization
• If x* = argmin F(x), then F'(x*) = 0
• Apply Newton's method to F'(x):
• x_{n+1} = x_n − F'(x_n)/F''(x_n) (a code sketch follows)
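A minimal Python sketch of both uses of Newton's method; the example function and starting point are illustrative choices, not from the slides:

```python
def newton_root(F, dF, x0, n_steps=20):
    """Find x with F(x) = 0 via x_{n+1} = x_n - F(x_n) / F'(x_n)."""
    x = x0
    for _ in range(n_steps):
        x = x - F(x) / dF(x)
    return x

def newton_minimize(dF, d2F, x0, n_steps=20):
    """Minimize F by driving F' to zero: x_{n+1} = x_n - F'(x_n) / F''(x_n)."""
    x = x0
    for _ in range(n_steps):
        x = x - dF(x) / d2F(x)
    return x

# Example: the root of F(x) = x^2 - 2 is sqrt(2).
print(newton_root(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0))
```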
10. Introduction
• Why did the model make this prediction?
• Retrieving images that maximally activate a neuron [Girshick et al. 2014]
• Finding the most influential part of the image [Zhou et al. 2016]
• But these methods assume a fixed model
11. Introduction
• Existing methods
• Treat the model as fixed
• Explain the prediction w.r.t. the parameters or the test input
• Our method
• Treat the model as a function of the training data
• Explain the prediction w.r.t. the training data "most responsible" for it
• How would the prediction change if we up-weighted or modified a training point? (see the definition below)
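For reference, this is the up-weighting influence from the paper: up-weighting a training point z by a small ε shifts the minimizer θ̂, and the first-order effect on the loss at a test point z_test is

\[
\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}})
= -\nabla_\theta L(z_{\text{test}}, \hat\theta)^\top \, H_{\hat\theta}^{-1} \, \nabla_\theta L(z, \hat\theta),
\qquad
H_{\hat\theta} = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta).
\]

Removing z corresponds to ε = −1/n, so the predicted change in test loss is −(1/n) · I_up,loss(z, z_test).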
22. Perturbing a training input
• If we change (x, y) to (x + δ, y), how does the test loss change?
• Moving (x, y) to (x + δ, y) is equivalent to removing (x, y) and then adding (x + δ, y) (the formula below follows from this view)
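Applying the up-weighting result to the remove-then-add view and letting δ → 0 gives the perturbation influence from the paper, with z = (x, y):

\[
\mathcal{I}_{\text{pert,loss}}(z, z_{\text{test}})^\top
= -\nabla_\theta L(z_{\text{test}}, \hat\theta)^\top \, H_{\hat\theta}^{-1} \, \nabla_x \nabla_\theta L(z, \hat\theta).
\]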
23. Efficiently calculating influence
• Two challenges:
• computing inverse-Hessian-vector products
• computing the influence over all n training points
• With n training points and p parameters:
• Forming and inverting the Hessian explicitly: O(np² + p³)
• Conjugate gradients (see paper): O(np) per iteration, as sketched below
• Stochastic estimation (see paper)
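A minimal sketch of the conjugate-gradient route, assuming a user-supplied hvp(v) that returns the Hessian-vector product Hv (e.g. via autodiff or finite differences of gradients); the names and the damping value are illustrative:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def inverse_hvp(hvp, grad_test, p, damping=0.01):
    """Solve (H + damping * I) s = grad_test with conjugate gradients.

    Each CG iteration needs only one Hessian-vector product, so the
    p x p Hessian is never formed. The damping term keeps the system
    positive definite when H is singular or slightly indefinite.
    """
    op = LinearOperator((p, p), matvec=lambda v: hvp(v) + damping * v)
    s_test, info = cg(op, grad_test, maxiter=100)
    return s_test

# With s_test = H^{-1} grad_test in hand, the influence of each training
# point reduces to a dot product: I_up,loss(z_i, z_test) = -grad_i @ s_test.
```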
24. Validation and Extensions
• The derivation rests on assumptions and approximations:
• the model parameters actually minimize the loss
• the loss is twice differentiable
• We want to check how influence functions perform when these assumptions are violated.
25. Validation and Extensions
• Influence functions vs. leave-one-out retraining
• Actually retrain a linear regression model after removing a training point, then compare the true loss change with the influence prediction (a self-contained sketch follows)
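A runnable sketch of this check for ridge-regularized linear regression on synthetic data (all setup choices here are illustrative): remove one point, retrain in closed form, and compare the actual test-loss change with the first-order influence prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 3, 1e-2
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
x_t, y_t = rng.normal(size=d), 0.0  # an arbitrary test point

def fit(X, y):
    # Minimizes 0.5 * ||X w - y||^2 + 0.5 * lam * ||w||^2 in closed form.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w = fit(X, y)
H_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))  # inverse Hessian
test_loss = lambda w: 0.5 * (x_t @ w - y_t) ** 2
grad_t = (x_t @ w - y_t) * x_t                    # test-loss gradient

for i in range(5):
    # Actual change: retrain without point i.
    mask = np.arange(n) != i
    actual = test_loss(fit(X[mask], y[mask])) - test_loss(w)
    # Influence prediction: delta_loss ≈ grad_t^T H^{-1} grad_i.
    grad_i = (X[i] @ w - y[i]) * X[i]
    predicted = grad_t @ H_inv @ grad_i
    print(f"point {i}: actual {actual:+.5f}  predicted {predicted:+.5f}")
```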
26. Validation and Extensions
• Non-convexity and non-convergence
• When θ̃ is not a minimizer, the paper forms a damped quadratic approximation around θ̃, so the predicted loss change differs slightly (see paper)
• Even with a non-convex loss, predicted and actual loss changes remain highly correlated
• Pearson's correlation = 0.86
27. Validation and Extensions
• Non-differentiable losses
• Hinge loss: approximate it with a smooth surrogate, then apply influence functions (one standard surrogate is shown below)
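One standard softplus-smoothed surrogate, with a temperature t that recovers the hinge loss as t → 0⁺:

\[
\ell_t(s) = t \log\!\left(1 + \exp\!\left(\frac{1 - s}{t}\right)\right)
\;\xrightarrow{\,t \to 0^+\,}\; \max(0,\, 1 - s).
\]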
28. Use cases of influence functions
• Understanding model behavior
• Fixing mislabeled examples
• Adversarial training examples
• Debugging domain mismatch (see paper)
29. Understanding model behavior
• Model 1: Inception v3 with all but the top layer frozen
• Model 2: SVM with RBF kernel
• Task: binary image classification, fish vs. dog
31. Fixing mislabeled examples
• We only have the training set, no trusted test data.
• What do we usually do?
• Baseline: flag the examples with the largest training loss
• With influence functions: flag the points with the highest self-influence, i.e. each point's influence on its own loss (a sketch follows)
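A minimal sketch of the self-influence ranking, assuming a helper influence(z, z_prime) that computes I_up,loss(z, z'); the helper name is hypothetical:

```python
import numpy as np

def rank_suspects(train_set, influence):
    # High self-influence marks points the model must fit largely on its
    # own, a typical signature of mislabeled or anomalous examples.
    scores = np.array([influence(z, z) for z in train_set])
    return np.argsort(scores)[::-1]  # indices to review first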
33. Adversarial training examples
• Prior work generates adversarial test images that are visually indistinguishable from the originals yet fool a classifier.
• We demonstrate that we can craft adversarial training images that flip a model's test predictions.
• The idea: iteratively perturb a training image in the direction given by the perturbation influence (one step is sketched below).
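One iteration of the attack might look like the sketch below, assuming inf_pert(x_train, z_test) returns the gradient of the test loss w.r.t. the training input (the perturbation influence); the signed step and clipping mirror standard pixel-space attacks, and the step size is an illustrative choice:

```python
import numpy as np

def attack_step(x_train, z_test, inf_pert, alpha=0.02):
    """One signed-gradient step on a training image to raise the test loss."""
    g = inf_pert(x_train, z_test)         # direction that increases test loss
    x_new = x_train + alpha * np.sign(g)  # small signed step in pixel space
    return np.clip(x_new, 0.0, 1.0)       # stay a valid image
```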
34. Adversarial training examples
• Data: the same fish vs. dog task
• The original model correctly classified 591/600 test images.
• For each test image, we perturb a single training image for 100 iterations.
• 335 of the 591 correctly classified test images (57%) were flipped.
• An attack on a single training image can also influence multiple test images.