This document summarizes a talk on data science for software engineering. It describes data science as drawing on several fields, including statistics, machine learning, and data mining. It notes that although "big data" is often discussed, software engineering data is typically small and sparse. Domain knowledge matters in data mining because without it the data is easily misinterpreted. Doing data science with software engineering data also requires understanding organizations and how willing they are to share data given privacy concerns. The document outlines how data, models, and methods can be shared to learn across different organizations, and it discusses techniques for balancing privacy and utility when sharing data.