This presentation by Josh Wills of Cloudera is about data science. It discusses how to define data science and the roles of data scientists. It also covers the challenges of working with big data, including data modeling and the impedance mismatch between operational and analytical systems. The presentation promotes open data science and encourages sharing examples on GitHub to push beyond the limits of current tools.