Game theoretic concepts in Support Vector Machines

Using a game theoretic concept to obtain the separating hyperplane in classification machine learning problems.



  1. Game Theoretic Concepts in Support Vector Machine Learning
     Subhayan Mukerjee, BITS Pilani
  2. Game Theory
     • Game Theory is the strategic study of decision making.
     • More formally, it is "the study of mathematical models of conflict and cooperation between intelligent rational decision-makers."
     • The games studied in game theory are well-defined mathematical objects.
     • A game consists of a set of players, a set of moves (or strategies) available to those players, and a specification of payoffs for each combination of strategies. Most cooperative games are presented in the characteristic function form, while the extensive and the normal forms are used to define non-cooperative games.
  3. Game Theory (continued)
     • In other words, Game Theory models a conflict between two or more players using the specified payoffs of the different players under different combinations of deployed strategies.
  4. Machine Learning
     • Machine learning, a branch of artificial intelligence, is about the construction and study of systems that can learn from data.
     • Support Vector Machines, Perceptrons, Bayesian Classifiers, etc. are widely used algorithms for this purpose.
     • Machine Learning problems are categorized into classification problems and regression problems.
  5. Support Vector Machine
     • A standard and popular Machine Learning algorithm used to solve classification problems.
     • It is a supervised learning algorithm.
     • This means it tries to infer a function from a set of labeled training data, and then classifies a test data point accordingly.
     • The classification is done on the basis of the equation of a hyperplane, which is arrived at by analyzing the training data set.
  6. Support Vector Machine (continued)
     The test data point (the star) will be classified as a circle because of the side of the hyperplane it lies on.
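As a hedged illustration of this decision rule (not code from the original slides), a learned hyperplane w·x + b = 0 labels a test point by the sign of w·x + b; the hyperplane and the "star" point below are made-up values:

```python
# Sketch of the decision rule: a learned hyperplane w.x + b = 0 splits
# the space, and a test point is labeled by the sign of w.x + b.
# The hyperplane (w, b) and the "star" point here are hypothetical.
def classify(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "circle" if score > 0 else "triangle"

w, b = (1.0, 1.0), -4.0      # hypothetical hyperplane: x + y - 4 = 0
star = (3.0, 2.0)            # the test data point ("the star")
print(classify(w, b, star))  # -> circle, since 3 + 2 - 4 = 1 > 0
```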
  7. Game Theory in SVM
     The Chip Firing Classifier!
  8. The Chip Firing Classifier
     • Uses a chip firing algorithm to identify the support vectors in a given data set.
     • Various methods are then used to predict the separating hyperplane.
     • It is essentially an alternative iterative approach to solving the dual of the quadratic programming problem, which in turn is used to obtain the support vectors in SVM.
  9. The Chip Firing Classifier
     • It is a strategic two-player iterated game, in which the players are chosen from the data patterns.
     • Each data point starts off with the same fixed number of chips.
     • In every iteration, the two players are randomly selected from the same class of the training set.
     • A third participant (not technically a player) provides the utility value to the players based on its distance from them.
     • We now introduce two terms: fire and rest.
     • Fire refers to transferring half of one's chips to the other player; rest means retaining them.
  10. The Chip Firing Classifier
     • The rules of the game are as follows:
       • The player closer to the participant from the opposing class rests, and the one farther away fires its chips.
       • In the iterated game, this process carries on for many iterations with random sampling of the players and the opposing-class participant.
       • The game ends when a situation is reached where two data points, one from each class, attain the maximum number of chips.
     A sample game is shown in the next slide.
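The rules above can be sketched as a small simulation. This is an illustrative reimplementation in Python, not the authors' MATLAB code; the function names and the toy data are our own:

```python
import math
import random

random.seed(0)

def chip_firing_support_vector(cls_pts, opposing, chips_init=300, iters=500):
    # Sketch of the chip firing game: two random players from one class
    # are shown a random participant from the opposing class; the player
    # farther from the participant "fires" half of its chips to the
    # closer one, who "rests". Chips accumulate at points nearest the
    # opposing class, i.e. at support-vector candidates.
    chips = [float(chips_init)] * len(cls_pts)
    for _ in range(iters):
        i, j = random.sample(range(len(cls_pts)), 2)  # two random players
        p = random.choice(opposing)                   # opposing-class participant
        near, far = (i, j) if math.dist(cls_pts[i], p) < math.dist(cls_pts[j], p) else (j, i)
        chips[near] += chips[far] / 2.0               # near player rests and collects
        chips[far] /= 2.0                             # far player fires half its chips
    return max(range(len(cls_pts)), key=chips.__getitem__)

# Toy data: in class A, the point at (2, 0) is nearest to class B,
# so it should end up with the most chips (a support-vector candidate).
A = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
B = [(5.0, 0.0), (6.0, 0.0)]
print(chip_firing_support_vector(A, B))  # -> 2
```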
  11. The Chip Firing Classifier
  12. The Chip Firing Classifier
     • The data points with the maximum number of chips after convergence of the algorithm are the respective support vectors of either class.
     • Correctness?
     • Implementation in MATLAB.
  13. MATLAB implementation
     • We chose the Fisher iris dataset for classification purposes.
     • The dataset was divided into two classes: Setosa and non-Setosa.
     • There were a total of 150 data points.
     • We tried to compare the support vectors obtained using conventional SVM training with those obtained using the chip firing classifier.
     • For every experiment, the entire dataset was randomly divided into training and test sets.
     • The Support Vector Machine hyperplane was learnt through the inbuilt function svmtrain.
     • The accuracy of the classifier when applied on the test set exceeded 98% in all experiments.
     • For the chip firing classifier, 300 chips were given to each data point.
     • The number of iterations was varied from 100 to 10000. It turns out that even 500 iterations sufficed for a good enough approximation of the support vectors.
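The per-experiment random split can be sketched as follows; indices stand in for the 150 Fisher iris points, and the 80/20 ratio is our assumption, since the slides do not state one:

```python
import random

random.seed(0)

# Minimal sketch of the protocol above: each experiment randomly splits
# the 150 data points into training and test sets. Indices stand in for
# the points; the 80/20 ratio is assumed, not taken from the slides.
def train_test_split(n_points=150, train_fraction=0.8):
    idx = list(range(n_points))
    random.shuffle(idx)
    cut = int(n_points * train_fraction)
    return idx[:cut], idx[cut:]

train, test = train_test_split()
print(len(train), len(test))  # -> 120 30
```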
  14. MATLAB implementation (continued)
     • If there is more than one contender for a support vector, the chips get uniformly distributed over them. Consequently, if there is one single clear contender, the number of chips it has is very high.
  15. MATLAB implementation (continued)
  16. MATLAB implementation (continued)
  17. Drawing the Hyperplane
     • Three hypotheses were tried to obtain the equation of the hyperplane from the support vectors obtained from the chip firing algorithm. They are as follows:
       • Perpendicular bisector of the line segment joining the two support vectors of opposing classes. The accuracy of the classification obtained using this ranged from 56% to 86%.
       • Line equidistant from the support vectors having slope -1/m (perpendicular to the line joining the centroids), where m is the slope of the line joining the centroids of the two opposing classes. The successful classification rate ranged from 85% to 92% across different training and test arrangements of the Fisher iris dataset.
       • Line equidistant from the support vectors having the slope as the average of the slopes of the major axes of the opposing classes. The major axes of the classes are obtained by computing the first eigenvector of the covariance matrix of the data points; averaging the two slopes gives the slope of the line. The constant term is calculated so that the line is equidistant from the two support vectors. This technique gives more than 90% success in the classification process.
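The first hypothesis can be sketched directly: the points equidistant from the two support vectors s1 and s2 satisfy w·x + b = 0 with w = s1 - s2 and b = (|s2|^2 - |s1|^2)/2. This is an illustrative Python sketch, and the support vectors below are hypothetical:

```python
# Sketch of hypothesis 1: the separator as the perpendicular bisector
# of the segment joining support vectors s1 and s2 (one per class).
# A point x is equidistant from s1 and s2 iff w.x + b = 0, where
# w = s1 - s2 and b = (|s2|^2 - |s1|^2) / 2. The data are hypothetical.
def bisector(s1, s2):
    w = tuple(a - c for a, c in zip(s1, s2))
    b = (sum(c * c for c in s2) - sum(c * c for c in s1)) / 2.0
    return w, b

def side(w, b, x):                 # sign of w.x + b decides the class
    return sum(wi * xi for wi, xi in zip(w, x)) + b

s1, s2 = (2.0, 0.0), (5.0, 0.0)    # hypothetical support vectors
w, b = bisector(s1, s2)
print(side(w, b, (1.0, 3.0)) > 0)  # -> True: (1, 3) falls on s1's side
```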
  18. The Hyperplane
     Black: obtained using the usual SVM algorithm.
     Pink: perpendicular bisector.
     Blue: separator obtained by calculating the slope perpendicular to the line joining the centroids.
     Green: the slope as the average of the slopes of the major axes of the opposing classes.
  19. Conclusion
     • Even more ways of finding better separating hyperplanes?
     • Effect of outliers on the algorithm?
     • The rare class problem?
  20. Thank you
