Ds

Nourin Daudpoto

Data Structure slides

Data & Analytics

Introduction to Data Science
NOUREEN FATIMA DAUDPOTO

Data Science Job Roles
 Data scientists: Design data modeling processes to create algorithms and
predictive models and perform custom analysis
 Data analysts: Manipulate large data sets and use them to identify trends
and reach meaningful conclusions to inform strategic business decisions
 Data engineers: Clean, aggregate, and organize data from disparate
sources and transfer it to data warehouses.
 Business intelligence specialists: Identify trends in data sets
 Data architects: Design, create, and manage an organization’s data
architecture

OSEMN
 O: Obtaining our data
 S: Scrubbing / cleaning our data
 E: Exploring / visualizing our data, which allows us to find patterns and trends
 M: Modeling our data, which gives us predictive power
 N: iNterpreting our data

Business Question
1. How can we translate data into dollars?
2. What impact do I want to make with this data?
3. What business value does our model bring to the table?
4. What will save us lots of money?
5. What can be done to make our business run more efficiently?

Obtain Your Data
 As a rule of thumb, there are a few things to consider when obtaining your data. You must identify all of your available datasets (which can come from the internet or from external/internal databases), and you must extract the data into a usable format (.csv, .json, .xml, etc.).
 Skills Required:
1. Database Management: MySQL, PostgreSQL, MongoDB
2. Querying Relational Databases
3. Retrieving Unstructured Data: text, videos, audio files, documents
4. Distributed Storage: Hadoop, Apache Spark/Flink
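
The extraction step above can be sketched with Python's standard library alone. This is a minimal illustration, not the deck's own method: the sample text, the field names (customer_id, region, spend), and the helper names load_csv and load_json are all hypothetical. It reads one CSV source and one JSON source and brings them into the same record shape.

```python
import csv
import io
import json

# Hypothetical sample data standing in for two real sources.
CSV_TEXT = "customer_id,region,spend\n1,north,120.50\n2,south,80.00\n"
JSON_TEXT = '[{"customer_id": 3, "region": "east", "spend": 42.25}]'


def to_record(row):
    """Convert one raw row (all values possibly strings) to typed fields."""
    return {
        "customer_id": int(row["customer_id"]),
        "region": str(row["region"]),
        "spend": float(row["spend"]),
    }


def load_csv(text):
    """Parse CSV text into a list of typed records."""
    return [to_record(row) for row in csv.DictReader(io.StringIO(text))]


def load_json(text):
    """Parse a JSON array of objects into the same record shape."""
    return [to_record(row) for row in json.loads(text)]


# Both sources now share one usable format: a list of dicts.
records = load_csv(CSV_TEXT) + load_json(JSON_TEXT)
```

In practice the text would come from files, an API, or a database query rather than inline strings; the point is only that every source ends up in one common structure before cleaning begins.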

“Good data science is more about the questions you pose of the data rather than data munging and analysis”
— Riley Newman

Scrubbing / Cleaning Your Data
 This phase of the pipeline should require the most time and effort, because the output of your machine learning model is only as good as what you put into it. Basically: garbage in, garbage out.

Scrubbing / Cleaning Your Data
 Objective:
1. Examine the data: understand every feature you’re working with, identify
errors, missing values, and corrupt records
2. Clean the data: throw away, replace, and/or fill missing values/errors
 Skills Required:
1. Scripting languages: Python, R, SAS
2. Data Wrangling Tools: Python pandas, R
3. Distributed Processing: Hadoop, MapReduce / Spark
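
The two cleaning objectives above (examine, then throw away or fill) can be sketched as follows. This is a small standard-library illustration under stated assumptions: the records, the field names, and the policy (drop rows with a bad income, fill missing ages with the median) are hypothetical choices, and a library such as pandas would be the more common tool on real data.

```python
from statistics import median

# Hypothetical raw records: None marks a missing value, "n/a" a corrupt one.
raw = [
    {"id": 1, "age": 34, "income": 52000.0},
    {"id": 2, "age": None, "income": 61000.0},
    {"id": 3, "age": 29, "income": "n/a"},
    {"id": 4, "age": 41, "income": 58000.0},
]


def is_number(value):
    """True for int/float values, excluding bool."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def clean(records):
    # Throw away records whose income is missing or corrupt.
    kept = [dict(r) for r in records if is_number(r["income"])]
    # Fill missing ages with the median of the ages that are present.
    known_ages = [r["age"] for r in kept if is_number(r["age"])]
    fill = median(known_ages)
    for r in kept:
        if not is_number(r["age"]):
            r["age"] = fill
    return kept


cleaned = clean(raw)
```

Copying each record with dict(r) leaves the raw input untouched, so the original data can still be inspected after cleaning.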

Exploring (Exploratory Data Analysis)
 Understand
 visualizations
 statistical testing
 Objective:
1. Find patterns in your data through visualizations and charts
2. Extract features by using statistics to identify and test significant variables
 Skills Required:
1. Python: NumPy, Matplotlib, pandas, SciPy
2. R: ggplot2, dplyr
3. Inferential statistics
4. Experimental Design
5. Data Visualization
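
As a rough sketch of the exploration objectives above, the snippet below computes summary statistics and a Pearson correlation between two variables. The data (ad_spend, sales) and the function name pearson_r are hypothetical, and this is a descriptive check only, not a formal significance test; NumPy, SciPy, or pandas would normally supply these calculations.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical observations: advertising spend and resulting sales.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 41.0, 62.0, 79.0, 103.0]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Basic descriptive summary of the target variable.
summary = {
    "mean": mean(sales),
    "stdev": stdev(sales),
    "min": min(sales),
    "max": max(sales),
}

# A value near 1 suggests a strong positive linear relationship.
r = pearson_r(ad_spend, sales)
```

A strong correlation like this one marks ad_spend as a candidate feature for the modeling step, though it does not by itself establish causation.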

Modeling
 Objective:
1. In-depth Analytics: create predictive models/algorithms
2. Evaluate and refine the model
 Skills Required:
1. Machine Learning: Supervised/Unsupervised algorithms
2. Evaluation methods
3. Machine Learning Libraries: Python (scikit-learn) / R (caret)
4. Linear algebra & Multivariate Calculus
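
The two objectives above, building a predictive model and then evaluating it, can be illustrated with a one-variable least-squares fit written in plain Python. The data, the holdout split, and the helper names fit_line and mean_absolute_error are assumptions made for this sketch; scikit-learn or caret would be the usual choice for real work.

```python
from statistics import mean

# Hypothetical data: (feature, target) pairs.
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8), (5.0, 11.1), (6.0, 13.0)]

# Simple holdout split with no shuffling: fit on the first four, test on the rest.
train, test = data[:4], data[4:]


def fit_line(pairs):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def mean_absolute_error(pairs, slope, intercept):
    """Average absolute gap between predictions and true targets."""
    return mean(abs(y - (slope * x + intercept)) for x, y in pairs)


slope, intercept = fit_line(train)

# Evaluate on data the model has not seen.
mae = mean_absolute_error(test, slope, intercept)
```

Evaluating on held-out pairs rather than the training pairs is what makes the error figure a fair estimate; refining the model would mean changing features or the model form and repeating this measurement.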


Ds

1. Introduction to Data Science NOUREEN FATIMA DAUDPOTO

2. Data Sources

3. Data Sources


24. Data Pipeline  Data Science is OSEMN


32. Modeling


34. Interpreting (Data Storytelling)
