Regularization adds a penalty term to the objective function optimized in machine learning in order to prevent overfitting. Common regularizers include the L1 norm (the sum of absolute weights), the L2 norm (the sum of squared weights), and other Lp norms. The L1 norm encourages sparsity by driving small weights to exactly zero. Gradient descent can be used to minimize an objective that combines a loss term and a regularizer; as long as the overall function is convex, it converges to a global minimum. During gradient descent, the gradient of the regularizer pushes the weights towards zero. Regularization is important for obtaining models that generalize well to new data.
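As a concrete illustration of the idea above, the following is a minimal sketch (not from the source) of gradient descent on an L2-regularized least-squares objective. The function name `ridge_gradient_descent`, the penalty strength `lam`, the learning rate, and the synthetic data are all illustrative assumptions; the point is only to show the loss gradient and the regularizer gradient being combined, with the penalty term pulling the weights toward zero.

```python
import numpy as np

def ridge_gradient_descent(X, y, lam=0.1, lr=0.01, steps=1000):
    """Gradient descent on the L2-regularized least-squares objective:
    J(w) = (1/2n) * ||Xw - y||^2 + (lam/2) * ||w||^2
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        grad_loss = X.T @ (X @ w - y) / n   # gradient of the loss term
        grad_reg = lam * w                  # gradient of the L2 regularizer
        w -= lr * (grad_loss + grad_reg)    # penalty shrinks weights toward zero
    return w

# Tiny synthetic example (illustrative data): compare fits with and without the penalty.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([3.0, 0.0, -2.0, 0.0, 1.0])
y = X @ true_w + 0.1 * rng.normal(size=100)
print(ridge_gradient_descent(X, y, lam=0.0))  # unregularized fit
print(ridge_gradient_descent(X, y, lam=1.0))  # weights shrunk toward zero
```

An L1 penalty would instead contribute a subgradient of the form `lam * np.sign(w)` (the term is non-differentiable at zero), and in practice is often handled with a soft-thresholding (proximal) update, which is what sets small weights to exactly zero and produces sparse models.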