
This Data Science presentation will help you understand what Data Science is, why we need it, the prerequisites for learning it, what a Data Scientist does, the Data Science lifecycle with an example, and career opportunities in the Data Science domain. You will also learn the differences between Data Science and Business Intelligence. The role of data scientist has been called one of the sexiest jobs of the century. The demand for data scientists is high, and the number of opportunities for certified data scientists is increasing. Companies are looking for more and more skilled data scientists, and studies show that a continued shortfall in qualified candidates is expected. So, let us dive deep into Data Science and understand what it is all about.

This Data Science presentation covers the following topics:
1. Need for Data Science
2. What is Data Science?
3. Data Science vs Business Intelligence
4. Prerequisites for learning Data Science
5. What does a Data Scientist do?
6. Data Science lifecycle with use case
7. Demand for Data Scientists

This Data Science with Python course will establish your mastery of data science and analytics techniques using Python. With this Python for Data Science course, you’ll learn the essential concepts of Python programming and build skills in data analytics, machine learning, data visualization, web scraping and natural language processing. Python is a required skill for many data science positions, so jumpstart your career with this interactive, hands-on course.

Why learn Data Science? Data Scientists are being deployed in all kinds of industries, creating a huge demand for skilled professionals. Data scientist is the pinnacle rank in an analytics organization. Glassdoor ranked data scientist first in its 25 Best Jobs for 2016, and good data scientists are scarce and in great demand.

As a data scientist, you will be required to understand the business problem, design the analysis, collect and format the required data, apply algorithms or techniques using the correct tools, and finally make recommendations backed by data.

The Data Science with Python course is recommended for:
1. Analytics professionals who want to work with Python
2. Software professionals looking to get into the field of analytics
3. IT professionals interested in pursuing a career in analytics
4. Graduates looking to build a career in analytics and data science
5. Experienced professionals who would like to harness data science in their fields

- 1. ?
- 2. What’s in it for you: Need for Data Science; What is Data Science?; Data Science vs Business Intelligence; Prerequisites for learning Data Science; What does a Data Scientist do?; Data Science lifecycle with example; Data Scientist demand
- 3. Need for Data Science
- 4. Need For Data Science Does the thought of your car driving you home by itself excite you?
- 5. Need For Data Science Does the thought of your car driving you home by itself excite you? Is that even possible?
- 6. Need For Data Science This is where the ‘Need For Data Science’ comes into the picture. Data Science helps in making better decisions!
- 7. Need For Data Science This is where the ‘Need For Data Science’ comes into the picture. Data Science helps in making better decisions! You mean it will be able to take decisions like slowing down, stopping by itself, speeding up and all of that?
- 8. Need For Data Science This is where the ‘Need For Data Science’ comes into the picture. Data Science helps in making better decisions! You mean it will be able to take decisions like slowing down, stopping by itself, speeding up and all of that? Exactly! And then let the machine learn iteratively using unsupervised learning!
- 9. Need For Data Science This is where the ‘Need For Data Science’ comes into the picture. Data Science helps in making better decisions! You mean it will be able to take decisions like slowing down, stopping by itself, speeding up and all of that? Exactly! And then let the machine learn iteratively using unsupervised learning! That’s interesting!
- 10. Need For Data Science Self-driving cars could prevent many of the deaths caused by car accidents each year. The slide puts that toll at more than 2 million annually; WHO estimates are closer to 1.3 million.
- 11. Need For Data Science 1. Due to lack of data available, flights are often delayed or cancelled at the last minute. “We’re extremely sorry to inform you that your flight has been delayed by 4 hours due to bad weather conditions. We regret the inconvenience caused.”
- 12. Need For Data Science 1. Due to lack of data available, flights are often delayed or cancelled at the last minute. 2. Due to improper route planning, customers don’t get a flight for the desired time and duration. “We’re extremely sorry to inform you that there are no flights for the time selected. There’s a connecting flight for the same time tomorrow.”
- 13. Need For Data Science 1. Due to lack of data available, flights are often delayed or cancelled at the last minute. 2. Due to improper route planning, customers don’t get a flight for the desired time and duration. 3. Incorrect decisions in selecting the right equipment lead to unplanned delays and cancellations. “Dear Flyer, we regret to inform you that your flight has been cancelled due to a delay from Airbus on account of engine delivery.”
- 14. Need For Data Science With Data Science, it has become possible to predict such disruptions and reduce the loss for both the airline and the passenger.
- 15. Need For Data Science Using Data Science, we can achieve the following: route planning (whether to schedule direct or connecting flights); predictive analytics models that foresee flight delays; deciding which class of planes to purchase for better performance; promotional offers based on customer booking patterns.
- 16. Need For Data Science Logistics companies like FedEx are using Data Science models for operational efficiency: to discover the best routes to ship, the best-suited time to deliver, and the best mode of transport.
- 17. Need For Data Science So Data Science is mainly needed for: better decision making (whether A or B?); predictive analysis (what will happen next?); pattern discovery (is there any hidden information in the data?).
- 18. What is Data Science?
- 19. What is Data Science? Suppose you have decided to buy furniture online for your new office. How do you choose the right website?
- 20. What is Data Science? Want to buy furniture online? Flowchart: Does the website sell furniture? No: close website. Yes: is the rating > 4 out of 5? No: close website. Yes: is the discount > 20%? No: close website. Yes: purchase product.
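The flowchart on this slide is a chain of yes/no checks, which can be sketched in Python roughly as follows. The function name, the argument names and the three sample websites are hypothetical and only serve the illustration; the thresholds (rating above 4, discount above 20%) are the ones given on the slide.

```python
def should_purchase(sells_furniture: bool, rating: float, discount_pct: float) -> bool:
    """Walk the slide's flowchart: any failed check means 'close website'."""
    if not sells_furniture:
        return False  # website does not sell furniture
    if rating <= 4.0:
        return False  # rating is not above 4 out of 5
    if discount_pct <= 20.0:
        return False  # discount is not above 20%
    return True       # all checks passed: purchase product


# Hypothetical websites: (sells furniture?, rating, discount in %)
sites = {
    "site_a": (True, 4.5, 25.0),
    "site_b": (True, 3.9, 40.0),
    "site_c": (False, 4.8, 30.0),
}

decisions = {name: should_purchase(*checks) for name, checks in sites.items()}
print(decisions)
```

Only the site that passes all three checks ends in a purchase; every other path closes the website, as in the flowchart.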
- 21. What is Data Science? Data Science can answer a lot of other questions as well! Which route should my cab take so that I reach faster? Which viewers like the same kind of TV shows? Will this refrigerator fail in the next 3 years: yes or no? Who will win the elections?
- 22. What is Data Science? So, Data Science or data-driven science is about: asking the right questions and exploring the data; modeling the data using various algorithms; finally, communicating and visualizing the results.
- 25. Business Intelligence vs Data Science
- 26. Business Intelligence vs Data Science

  | Criterion | Business Intelligence | Data Science |
  |---|---|---|
  | Data source | Structured data, e.g. data warehouse | Unstructured data, e.g. web logs |
  | Method | Analytical | Scientific |
  | Skills | Statistics, visualization | Statistics, visualization, machine learning |
  | Focus | Past and present data | Present data and future predictions |

- 27. Prerequisites For Data Science
- 28. Prerequisites for Data Science The following are the 3 essential traits of a Data Scientist: CURIOSITY: only when you ask questions will you have a better understanding of the business problem.
- 29. Prerequisites for Data Science The following are the 3 essential traits of a Data Scientist: CURIOSITY; COMMON SENSE: to identify new ways to solve a business problem and to detect priority problems.
- 30. Prerequisites for Data Science The following are the 3 essential traits of a Data Scientist: CURIOSITY; COMMON SENSE; COMMUNICATION SKILLS: a Data Scientist needs to communicate their findings to business teams so that they can act upon the insights.
- 31. Prerequisites for Data Science 1. MACHINE LEARNING: Machine learning is the backbone of Data Science. It is one of the many ways that Data Science uses to find a solution to a problem.
- 32. Prerequisites for Data Science 1. MACHINE LEARNING 2. MATHEMATICAL MODELLING: Mathematical models can be extremely helpful for making fast calculations and predictions from what you know about your data.
- 33. Prerequisites for Data Science 1. MACHINE LEARNING 2. MATHEMATICAL MODELLING 3. STATISTICS: Statistics is foundational to Data Science. It lets you extract knowledge and obtain better results from the data.
- 34. Prerequisites for Data Science 1. MACHINE LEARNING 2. MATHEMATICAL MODELLING 3. STATISTICS 4. COMPUTER PROGRAMMING: You should know at least one programming language, preferably Python or R, for data modelling.
- 35. Prerequisites for Data Science 1. MACHINE LEARNING 2. MATHEMATICAL MODELLING 3. STATISTICS 4. COMPUTER PROGRAMMING 5. DATABASES: The discipline of querying databases teaches you to ask better questions as a Data Scientist.
- 36. Tools/Skills used in Data Science. Data Analysis: skills are R, Python, Statistics; tools are SAS, Jupyter, RStudio, MATLAB, Excel, RapidMiner. Data Warehousing: skills are ETL, SQL, Hadoop, Apache Spark; tools are Informatica/Talend, AWS Redshift. Data Visualization: skills are R, Python libraries; tools are Jupyter, Tableau, Cognos, RAW. Machine Learning: skills are Python, Algebra, ML Algorithms, Statistics; tools are Spark MLlib, Mahout, Azure ML Studio.
- 37. What does a Data Scientist do?
- 38. What does a Data Scientist do? Real World
- 39. What does a Data Scientist do? Real World → Raw Data
- 40. What does a Data Scientist do? Real World → Raw Data → Process and Analyze
- 41. What does a Data Scientist do? Real World → Raw Data → Process and Analyze → Meaningful Data
- 42. What does a Data Scientist do? Real World → Raw Data → Process and Analyze → Meaningful Data → Useful Insights
- 43. Must Know Machine Learning Algorithms The most basic and important techniques that you should know as a Data Scientist are: Regression, Decision Tree, Clustering, Naive Bayes, Support Vector Machine. Note to instructor: Please say that they can find the videos on specific algorithms in the video description below.
- 44. Data Science Lifecycle with Example
- 45. Concept Study – Life Cycle Concept study: understanding the problem statement requires a thorough study of the business model.
- 46. Concept Study – Example What is the example? What is the end goal? What is the budget? What are the various specifications?
- 47. Concept Study – Example Concept of the task: predict the price of a 1.35 carat diamond. Suppose we get the prices of diamonds from different diamond retailers. Now, we want to find out the price of a 1.35 carat diamond. Get to know the diamond industry and the terminology it uses. Understand the business problem and collect relevant and sufficient data.
- 48. Data Preparation - Life cycle Data preparation: also known as data munging, it is the most important step of the Data Science lifecycle, because valuable insights can only emerge from well-prepared data.
- 49. Data Preparation - Life cycle Data Integration: resolving any conflicts in the data and handling redundancies. Data Cleaning: correcting inconsistent data by filling in missing values and smoothing out noisy data. Data Transformation: normalization, transformation and aggregation of data using ETL methods. Data Reduction: using various strategies to reduce the size of the data while yielding the same outcome.
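Two of the steps named on this slide, handling redundancies (integration) and normalization (transformation), can be sketched with plain Python as below. This is a minimal illustration under stated assumptions: the carat and price records are made up, and min-max scaling is only one of several common normalization choices; the slides do not say which one is intended.

```python
# Made-up records; the third row is a redundant copy of the second.
records = [
    {"carat": 0.5, "price": 5200.0},
    {"carat": 1.0, "price": 9800.0},
    {"carat": 1.0, "price": 9800.0},
    {"carat": 1.5, "price": 15100.0},
]

# Integration: drop exact duplicate rows, keeping first-seen order.
seen = set()
unique = []
for row in records:
    key = (row["carat"], row["price"])
    if key not in seen:
        seen.add(key)
        unique.append(row)

# Transformation: min-max normalization rescales prices into [0, 1].
prices = [row["price"] for row in unique]
lowest, highest = min(prices), max(prices)
normalized = [(p - lowest) / (highest - lowest) for p in prices]

print(len(records), "rows before,", len(unique), "rows after de-duplication")
print(normalized)
```

After normalization the cheapest diamond maps to 0 and the most expensive to 1, which puts columns with different units on a comparable scale.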
- 50. Data Preparation - Example Data preparation: make the data clean and valuable. Issues flagged in the sample data: missing value, improper datatype, null value.
- 51. Data Preparation - Example Ways to fill missing data values: If the dataset is huge, we can simply remove the rows with missing values; this is the quickest way. Alternatively, we can substitute missing values with the mean of the rest of the data, i.e. we use the rest of the data to estimate the missing values. In Python this can be done with a pandas DataFrame, e.g. df.fillna(df.mean()).
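Both options from this slide can be tried on a small pandas DataFrame, as in the sketch below. The column names and values are invented for illustration, and `numeric_only=True` is passed on the assumption that only numeric columns should be averaged.

```python
import pandas as pd

# Made-up sample with one missing carat and one missing price.
df = pd.DataFrame(
    {
        "carat": [0.5, 1.0, None, 1.5],
        "price": [5000.0, None, 12000.0, 15000.0],
    }
)

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: replace each missing value with the mean of its column.
filled = df.fillna(df.mean(numeric_only=True))

print(dropped)
print(filled)
```

Dropping rows is quick but discards half of this tiny sample, which is why the slide recommends it only when the dataset is large.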
- 52. Data Preparation - Example • Split the data into train data (80%) and test data (20%), i.e. in the ratio 80:20 • It is generally advised to divide the dataset into two random partitions
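A random 80:20 partition can be sketched with the standard library alone. The helper name `split_train_test` and the fixed seed are assumptions made so the example is reproducible; they are not part of the slides.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then cut it into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 data records
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before cutting is what makes the two partitions random, so any ordering in the original data does not leak into the split.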
- 53. Model Planning - Life cycle Model planning: after the data in hand has been properly understood and cleaned, a suitable model is selected.
- 54. Model Planning - Life cycle Model planning: • This step involves Exploratory Data Analysis (EDA) to understand the relation between variables and to see what the data can tell us • Key variables are selected
- 55. Model Planning - Life cycle But what is Exploratory Data Analysis? Definition: deeper analysis of the dataset to better understand the data. Goals: • Know the datatypes and answer questions with the data • Understand how the data is distributed • Identify outliers • Identify patterns, if any
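Two of the EDA goals listed here, understanding the distribution and identifying outliers, can be sketched with the standard `statistics` module. The prices are made up, and the 1.5 × IQR rule used to flag outliers is a common convention assumed for this sketch, not something the slides prescribe.

```python
import statistics

# Made-up diamond prices; the last value is a deliberately planted outlier.
prices = [5100, 5300, 9800, 10100, 10400, 14900, 15200, 60000]

# Distribution: central tendency and quartiles.
mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
q1, q2, q3 = statistics.quantiles(prices, n=4)

# Outliers: values further than 1.5 * IQR outside the quartiles.
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [p for p in prices if p < lower or p > upper]

print("mean:", mean_price, "median:", median_price)
print("outliers:", outliers)
```

A mean that sits far above the median, as it does here, is itself a hint that a few extreme values are skewing the data.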
- 56. Model Planning - Life cycle Techniques: • Histogram • Trend Analysis [Chart: trend analysis] Using various techniques, we can easily figure out that the relation between the carat and the price of a diamond is linear in nature.
- 57. Model Planning - Example Train Data vs Test Data • Train data (80%) is used to develop the model • Test data (20%) is used to validate the model. Diagram labels: model is created, feedback, improvement.
- 58. Various tools used in Model Planning: SAS, MATLAB, Python, R
- 59. Model Building - Life cycle Model building: using various analytical tools and techniques, data is transformed with the goal of ‘discovering’ useful information to build the right model.
- 60. Model Building - Example Model building: on analyzing the data, we observe that the output progresses linearly. Hence, we use the Linear Regression algorithm for model building in this case. [Chart: price of diamond (Rs.) against carat, with the regression line and 1.35 carat marked]
- 61. Model Building - Example Linear regression describes the relation between 2 variables, X and Y. After the regression line is drawn, we can predict the Y value for an input X value using the following formula: Y = mX + c, where m is the slope of the line, c is the Y intercept and X is the independent variable.
- 62. Model Building - Example Linear regression describes the relation between 2 variables, X and Y. After the regression line is drawn, we can predict the Y value for an input X value using the following formula: Y = mX + c, where m is the slope of the line, c is the Y intercept, X is the independent variable and Y is the dependent variable.
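The slope m and intercept c in Y = mX + c can be estimated from data by ordinary least squares, which is the usual way a regression line is "drawn". The sketch below assumes that method and uses invented, exactly linear carat and price pairs, so its prediction is not meant to reproduce the figure quoted in the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one variable: returns (m, c) in Y = mX + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # slope of the line
    c = mean_y - m * mean_x  # Y intercept
    return m, c


# Hypothetical data lying exactly on price = 6000 * carat + 1000.
carats = [0.5, 0.75, 1.0, 1.25, 1.5]
prices = [4000.0, 5500.0, 7000.0, 8500.0, 10000.0]

m, c = fit_line(carats, prices)
predicted = m * 1.35 + c  # X = 1.35 carat is the independent variable
print(m, c, predicted)
```

With real retailer data the points would scatter around the line, and the fitted m and c would be the values that minimize the squared vertical distances to it.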
- 63. Model Building - Example Using the test data set, the built model is validated for the best accuracy. Diagram labels: collected and analysed data (carat, price); model building; test data (carat); prediction; output (price); feedback.
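Validation on held-out data can be sketched by comparing the model's predictions with the known prices of the test points. Everything here is an assumption for illustration: the slope and intercept stand in for an already fitted model, the test points are made up, and mean absolute error is just one possible accuracy measure.

```python
# Hypothetical fitted parameters of Y = mX + c (price from carat).
m, c = 6000.0, 1000.0

# Made-up held-out test points: (carat, actual price).
test_points = [(0.6, 4700.0), (1.1, 7500.0), (1.4, 9500.0)]

# Absolute difference between predicted and actual price for each point.
errors = [abs((m * carat + c) - actual) for carat, actual in test_points]
mean_abs_error = sum(errors) / len(errors)

print("mean absolute error (Rs.):", round(mean_abs_error, 2))
```

If the error on test data is much larger than on training data, that is the feedback signal to revisit the data or the model choice.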
- 64. Model Building - Example Prediction: after successful validation of the model, we predict the price of a 1.35 carat diamond. [Chart: price of diamond (Rs.) against carat, with the regression line and 1.35 carat marked]
- 65. Model Building - Example Prediction: thus, using the Simple Linear Regression algorithm, we have implemented a successful model and predicted the price of a 1.35 carat diamond to be Rs. 10,000. [Chart: price of diamond (Rs.) against carat, with the regression line and 1.35 carat marked]
- 66. Model Building - Example This model is easily built using Python packages like pandas, matplotlib and numpy. We will study this in detail in the upcoming Data Science Tutorial using Python.
- 67. Communication - Life cycle Communicate results: key findings are identified and conveyed to the stakeholders.
- 68. Communication - Life cycle The battle is not over yet! A good Data Scientist should be able to communicate their findings to the business team so that they move easily into the execution phase.
- 69. Life cycle of Data Science project Operationalize: final reports, code, and technical documents are delivered by the team.
- 70. Summary - Life cycle 1. Concept Study 2. Data Preparation 3. Model Planning 4. Model Building 5. Communicate Results 6. Operationalize
- 72. Demand for Data Scientist Industries with high demand for Data Scientists: Marketing, Finance, Healthcare, Gaming, Technology
- 73. Summary Need For Data Science; What is Data Science?; Prerequisites of Data Science; Tools used in Data Science; Lifecycle with example; Demand for Data Scientists
- 74. So what’s your next step?

- Data-driven science is an interdisciplinary field of scientific methods, processes, algorithms and systems for extracting knowledge or insights from data in various forms.
- A good understanding of how a DBMS works will take you a long way.
- A Data Scientist collects as much raw data as possible from the real world
- We can also use
- Natural language processing to enable it to communicate successfully in English (or some other human language). Knowledge representation to store information provided before or during the interrogation. Automated reasoning to use the stored information to answer questions and to draw new conclusions. Machine learning to adapt to new circumstances and to detect and extrapolate patterns.