MLBox is a fully automated machine learning pipeline that cleans, preprocesses, models, and analyzes data. It features include reading and cleaning data, encoding categorical features, feature engineering, hyperparameter tuning, model validation, and interpreting results. MLBox addresses issues like data drift over time by detecting drifting features and selectively removing them if they negatively impact model performance. It also uses entity embeddings to learn low-dimensional vector representations of categorical features, providing an accurate, scalable, and interpretable encoding method. This technique was tested on a large automotive insurance dataset and improved both model accuracy and understanding of feature relationships.