MLlib is Apache Spark's machine learning component. It provides scalable implementations of classification, regression, clustering, and dimensionality reduction algorithms for distributed datasets. Recent updates to MLlib include improved documentation, a more stable API, support for sparse data representations that improves performance and scalability, and the addition of decision trees for both classification and regression. Future work will focus on standardizing interfaces, enabling parallel model training, adding statistical analysis functionality, and incorporating more learning algorithms.
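To make the benefit of sparse data representations concrete, here is a minimal sketch in plain Python. It is not MLlib's implementation or API: the `SparseVector` class, its `from_dense` and `dot` methods, and the sample data are all hypothetical names chosen for illustration. The sketch assumes the common size/indices/values layout, in which only non-zero entries are stored, so a dot product costs time proportional to the number of non-zeros rather than to the full dimension.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SparseVector:
    """Hypothetical sparse vector (illustration only, not MLlib's class)."""
    size: int             # full dimension, including the implicit zeros
    indices: List[int]    # positions of the non-zero entries, increasing
    values: List[float]   # non-zero entries, aligned with `indices`

    @staticmethod
    def from_dense(dense: Sequence[float]) -> "SparseVector":
        # Keep only the non-zero entries and remember where they were.
        pairs = [(i, v) for i, v in enumerate(dense) if v != 0.0]
        return SparseVector(
            size=len(dense),
            indices=[i for i, _ in pairs],
            values=[v for _, v in pairs],
        )

    def dot(self, weights: Sequence[float]) -> float:
        # Touches only the stored entries: O(non-zeros), not O(size).
        if len(weights) != self.size:
            raise ValueError("dimension mismatch")
        return sum(v * weights[i] for i, v in zip(self.indices, self.values))


def dense_dot(x: Sequence[float], w: Sequence[float]) -> float:
    # Reference implementation that visits every coordinate.
    return sum(a * b for a, b in zip(x, w))


# Made-up feature vector with two non-zeros out of six coordinates.
dense = [0.0, 3.0, 0.0, 0.0, 1.5, 0.0]
weights = [0.5, 2.0, -1.0, 4.0, 2.0, 1.0]
sparse = SparseVector.from_dense(dense)

print(sparse.indices, sparse.values)  # [1, 4] [3.0, 1.5]
print(sparse.dot(weights))            # 9.0
```

For linear models, this kind of dot product sits in the inner loop of both prediction and gradient computation, which is why storing only the non-zeros can reduce memory use and running time on high-dimensional, mostly-zero data.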