This document summarizes challenges in developing machine learning models that detect cognitive and language disorders, such as Alzheimer's disease and aphasia, from speech. A key challenge is the scarcity of appropriately labeled training data, especially for conditions other than Alzheimer's disease. The document discusses how additional speech data from healthy speakers, together with techniques such as consensus networks, can help compensate for the lack of labels. It notes that automatic speech recognition (ASR) errors can significantly degrade model performance, and that syntactic features are more robust to these errors than lexical features. Finally, it discusses the importance of cross-language studies and of domain adaptation techniques such as optimal transport for overcoming the limitations of models trained only on English data.