This document discusses support vector machines (SVMs) for classification tasks. It describes how SVMs find the optimal separating hyperplane, the one with the maximum margin between classes in the training data. Finding this hyperplane is formulated as a quadratic optimization problem, typically solved via its Lagrangian dual. Non-linear SVMs are also discussed: these use the "kernel trick" to implicitly map data into higher-dimensional feature spaces, where a linear separator may exist even when the original data are not linearly separable. Common kernel functions and the theoretical justification for maximum margin classifiers are provided.
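The kernel trick mentioned above can be illustrated concretely. The sketch below (plain Python, with illustrative function names not drawn from any particular library) shows that a degree-2 polynomial kernel computes the inner product of an explicit high-dimensional feature map without ever constructing that map:

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel: K(x, z) = (x . z + 1)^2."""
    dot = sum(xi * zi for xi, zi in zip(x, z))
    return (dot + 1) ** 2

def explicit_features(x):
    """Explicit feature map phi for 2-D input such that
    phi(x) . phi(z) == (x . z + 1)^2."""
    x1, x2 = x
    return [
        1.0,
        math.sqrt(2) * x1,
        math.sqrt(2) * x2,
        x1 * x1,
        x2 * x2,
        math.sqrt(2) * x1 * x2,
    ]

x, z = [1.0, 2.0], [3.0, 0.5]
k = poly_kernel(x, z)                      # kernel evaluated directly in input space
phi_dot = sum(a * b for a, b in
              zip(explicit_features(x), explicit_features(z)))
print(round(k, 6), round(phi_dot, 6))      # the two values agree: 25.0 25.0
```

Because the dual formulation of the SVM depends on the training data only through inner products, replacing each inner product with such a kernel evaluation yields a non-linear classifier at essentially the cost of the linear one.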