Machine Learning / Data Science 4a. Suppose the features in your training set have very different scales. What algorithm might suffer from this, and how? What can you do about it? 4b. Can gradient descent get stuck in a local minimum when training a logistic regression classifier? Does this depend on the used cost function, and if so, which cost function suffers problems and why?.