This document summarizes a social scientist's perspectives on data science. It observes that data comes from many sources and in many formats, so data scientists need to know how to obtain data and work with different file types and APIs. It also notes that real data is often messy, with duplicates, missing values, and inconsistent formats, and that combining data from multiple sources requires tools such as UNIX commands, scripting languages, and databases. Although data munging takes 80% of the effort, the document argues that teaching hacking skills is straightforward because the material can be borrowed from computer science curricula. It also discusses exploring and modeling data with methods that scale and that suit different data types, such as text, geospatial, and web-scale data. The document advocates focusing