Gradient descent is an iterative optimization algorithm that minimizes a cost function J(θ) by repeatedly stepping in the direction of steepest descent, i.e., along the negative gradient: θ := θ − α ∇J(θ). The step size of each update is controlled by the learning rate α (alpha). Potential limitations include getting stuck in a local minimum rather than reaching the global minimum when the cost function is non-convex, slow convergence when features have very different scales, and oscillation or divergence when α is too large. Feature scaling (e.g., standardizing each feature) and reducing α help address these issues. Gradient descent is typically considered converged when the decrease in the cost function between iterations falls below a small threshold, such as 0.001.
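
A minimal sketch of this loop in Python may make the update rule and the convergence test concrete. It assumes a differentiable cost with an analytic gradient; the names gradient_descent, grad, cost, alpha, and tol are illustrative, not from any particular library.

```python
import numpy as np

def gradient_descent(grad, cost, theta0, alpha=0.1, tol=1e-3, max_iters=10_000):
    """Minimize `cost` by stepping against its gradient `grad`.

    Stops when the decrease in cost between iterations falls below
    `tol` -- the convergence criterion described above.
    """
    theta = np.asarray(theta0, dtype=float)
    prev_cost = cost(theta)
    for _ in range(max_iters):
        theta = theta - alpha * grad(theta)   # step along the negative gradient
        curr_cost = cost(theta)
        if abs(prev_cost - curr_cost) < tol:  # cost barely changed: converged
            break
        prev_cost = curr_cost
    return theta

# Example: minimize J(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta_min = gradient_descent(grad=lambda t: 2 * (t - 3),
                             cost=lambda t: (t - 3) ** 2,
                             theta0=np.array([0.0]))
print(theta_min)  # approaches [3.0]
```

Rerunning this with a large alpha (say 1.5) illustrates the instability mentioned above: the iterates overshoot the minimum and the cost grows instead of shrinking.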