Data science involves cleaning and analyzing large amounts of data to gain insights and make predictions. It draws on mathematics, statistics, and programming, combined with knowledge of the domain being studied. Although big data is often hyped as providing definitive answers, data scientists know that all models are imperfect and that data must be understood in context. Open-source tools such as R and Python are commonly used to explore and visualize data and to draw conclusions from it.
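
As a rough illustration of the workflow described above, the following Python sketch cleans a small dataset, fits a simple model, and inspects how far the model's predictions fall from the observed values. The records (hours studied versus exam score) and the helper names `clean`, `fit_line`, and `residuals` are hypothetical and chosen only for illustration; the model is an ordinary least-squares line, which is one of many possible choices and not a recommended method.

```python
from statistics import mean

# Hypothetical raw records: (hours_studied, exam_score).
# None marks a missing value that has to be handled during cleaning.
raw = [
    (1.0, 52.0), (2.0, 55.0), (None, 61.0), (3.0, 61.0),
    (4.0, 70.0), (5.0, None), (6.0, 74.0),
]


def clean(records):
    """Drop every record that has a missing field."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residuals(points, slope, intercept):
    """Observed value minus predicted value for each point."""
    return [y - (slope * x + intercept) for x, y in points]


data = clean(raw)
slope, intercept = fit_line(data)
errors = residuals(data, slope, intercept)

print(f"kept {len(data)} of {len(raw)} records")
print(f"fitted line: score = {slope:.2f} * hours + {intercept:.2f}")
print("residuals:", [round(e, 2) for e in errors])
```

The residuals are not all zero: even on the data it was fitted to, the line only approximates the observations. Whether such a fit means anything still depends on context, for example how the records were collected and why some values were missing.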