The document discusses the challenges of using machine learning to automate the diagnosis of performance issues in distributed systems. It describes four key challenges: 1) transforming large volumes of metrics data into useful information, 2) adapting models as the underlying system changes, 3) leveraging historical diagnoses to retrieve similar past issues, and 4) combining metrics data with unstructured log data from multiple sources. The author proposes an approach for each challenge, including Bayesian network classifiers, adaptive ensembles of models, issue signatures, and information extraction from logs.
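To make the third challenge concrete, here is a minimal sketch of retrieving a similar past issue by comparing signatures. The summary does not say how signatures are defined, so everything here is an assumption for illustration: the signature is taken to be a vector with one entry per metric (+1 abnormally high, -1 abnormally low, 0 normal), the similarity measure is cosine similarity, and the issue labels and the helper names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-normal signature matches nothing
    return dot / (norm_a * norm_b)


def most_similar(query, history):
    """Return (label, score) for the stored signature closest to the query."""
    return max(
        ((label, cosine_similarity(query, sig)) for label, sig in history.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical signatures of previously diagnosed issues (assumed encoding:
# one entry per metric; +1 = abnormally high, -1 = abnormally low, 0 = normal).
history = {
    "db_connection_pool_exhausted": [1, 0, -1, 1],
    "network_partition": [0, 1, 1, -1],
}

# Signature of the issue currently under investigation (also hypothetical).
query = [1, 0, -1, 0]

label, score = most_similar(query, history)
print(label, round(score, 3))
```

In this sketch the retrieved label is only a suggestion for the operator; a real system would also need a similarity threshold below which no past issue is reported as a match.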