SlideShare a Scribd company logo
1 of 44
AnIntroductionto
DATASCIENCE
By: Dr. Shweta Saraswat
01
Definitions and Examples
02
What exactly data science and
data scientist do?
03
Relation between artificial
ineliigence and data science
04
To become a data scientist one
should know the various
techniques.
TABLEOFCONTENTS
DataScience&Importance DataScience Process
AIandDataScience Prerequisites forDS
WhatisDataScience?
Data science is an
interdisciplinary field that uses
algorithms, procedures, and
processes to examine large
amounts of data in order to
uncover hidden patterns,
generate insights, and direct
decision making.
Importanceof
DataScience
01
Career Opportunities
"The rise of Data Science needs will create roughly 11.5 million job openings by
2026" US Bureau of Labour Statistics
"By 2026, Data Scientists and Analysts will become the number one emerging
role in the world." World Economic Forum
Data Science and Artificial Intelligence are amongst the hottest fields of the
21st century that will impactall segments of daily life by 2025, from transport
and logistics to healthcare and customer service.
Examples:
Oil giant Shell, for instance, used data science to anticipate machine failure at
facilities across the world.
Agricultural company Cargill developed a mobile data-tracking app that helps
shrimp farmers reduce mortality rates.
Dr. Pepper Snapple Group analyzed data with machine learning to glean more
details about beverage sales and vendors.
And freight company Pitt Ohio used historical data and predictive analytics to
estimate delivery time with 99 percent accuracy.
Facts on Data Generation
Facts on Data Genaration
Statistics show that more than 500 terabytes of new data are
entered into the databases of the social networking site
Facebook every day.
•A single Jet engine can generate over 10 terabytes of data in
30 minutes of flight time. With several thousand flights per day,
data generation reaches several petabytes.
•Stock Exchange is also an example of big data that generates
about a terabyte of new trade data per day
HowdoesData
ScienceWork?
02
CollectData
Raw data is gathered
from various sources that
explain the business
problem
Using various statistical
analysis, and machine
learning approaches, data
modeling is performed to
get the optimum solutions
that best explain the
business problem.
Actionable insights that
will serve as a solution for
the business problems
gathered through data
science.
How does Data Science Work?
AnalyzeData Insights
CollectData
Gather the previous data
on the sales that were
closed.
Use statistical analysis to
find out the patterns that
were followed by the leads
that were closed.
Use machine learning to
get actionable insights for
finding out potential leads.
Consider an Example!
AnalyzeData Insights
Suppose there is an organization that is working
towards finding out potential leads for their sales
team. They can follow the following approach to
get an optimal solution using Data Science:
Letscheckrelationship betweenAIandDataScience
“In above example we saw machine
learning is required for insights”
AIandDataScience
03
Data science and artificial intelligence are not the same.
“Data science and artificial intelligence are two technologies
that are transforming the world. While artificial intelligence powers data
science operations, data science is not completely dependent on AI.
Data Science is leading the fourth industrial revolution. ”
Datascience alsorequires machinelearningalgorithms, which results in
dependencyonAI.
Comparison BetweenAIandDataScience
• Data science jobs require the knowledge of ML languages
like R and Python to perform various data operations and
computer science expertise.
• Data science uses more tools apart from AI. This is because
data science involves multiples steps to analyze data and
generate insights.
• Data science models are built for statistical insights
whereas AI is used to build models that mimic cognition and
human understanding.
Comparison BetweenAIandDataScience
• Today’s industries require both, data science and
artificial intelligence. Data science will help them
make necessary data-driven decisions and assess
their performance in the market, while artificial
intelligence will help industries work with
smarter devices and software that will minimize
workload and optimize all the processes for
improves innovation.
Comparison BetweenAIandDataScience
PrerequisitesforData
Science
04
Computer Science MachineLearning
Statistics
DataScienceisasciencewhichuses:
Visualization
To collect ,clean, Integrate, analyze, visualize, interact with
data to create data products.
HumanComputer Interaction
RoadmapofDataScience
Commonly used
languages are python
and R
Involves charts,graphs etc.
We have already two libraries
in Python for this i.e Seaborn,
matplotlib
Like mean, median ,
mode, standard
deviation etc.
Know about basic algorithms
of ML like Linear regression,
Logistic regression, Decision
tree, SVM algorithm, KNN
algorithm , Random forest
algorithm.
1.LearnProgramming
Language
3.DataVisualization
2.Mathematics &Statistics
4.MachineLearning
5.Project
Try to make projects with
the help of Kaggle
Python:
• Python is also most popular programming language among data scientists
these days.
• Python is very versatile and can be used in almost all the processes in Data
Science.
• Be it data mining or running embedded systems, python can do everything, and
because of this, 40% of the people that participated in a survey by O’Reilly
said that they used Python most often.
• Pandas, a python library, is used for data analysis and can do anything from
plotting data with histograms, to importing data from spreadsheets.
• Python can take data in various formats and import SQL tables to your code
easily.
• The python packages you need to master are Numpy, Matplotlib, PyTorch,
Pandas, Scikit-Learn, and Seaborn in Python.
RprogrammingLanguage
•Following Python, the next skill in the list was R Programming, mentioned in 32% of the job
postings. R is a language specifically designed for Data Science.
• It can be used to solve any Data Science related problem that you might encounter.
•It is the most popular language among Data Scientists.
•Infact, 43% of data scientists prefer to use R for solving statistical problems.
• It is one of the most important Data Science Prerequisites. However, the learning curve is steep. It
is difficult to master, especially if you already have an expertise in any other programming language.
•R can implement ML algorithms to give us a vast variety of statistical and graphical techniques like
time-series analysis, clustering, classical statistical tests etc.
•It is used for calculations and data manipulation. Tidyverse, Ggplot2, Stringr, Dplyr and Caret are
some of the things to master in R.
Mathematics and Statistics
Mathematics is one of the very popular Prerequisites for Data Science.
Probability and Statistics are used for data imputation, visualization of
features, feature transformation, model evaluation, dimensionality reduction,
feature engineering and data preprocessing.
Multivariable Calculus is used to build Machine Learning Models. For Model
Evaluation, Data Preprocessing and Data Transformation, we use linear
Algebra. A matrix is used to represent a Data set.
Mathematics and Statistics
There is no defined syllabus for what you need to learn in Mathematics and Statistics but
here are a few topics you should be familiar with –
•Mean, Median, Mode, Variance, Standard Deviation, and Percentiles
•Bayes Theorem And Probability Distribution (Normal, Poisson, and Binomial)
•Covariance Matrix and Correlation Coefficient
•Mean Square Error and R2 Score
•Statistical tests like p-value, hypothesis Testing, and chi-square
•Multivariate Functions, Cost Functions and Maxima and Minima of a Function
•Step Function, Sigmoid Function, Logit function and Rectified Linear Unit
•Vectors and Matrices; Transpose, Inverse, and Determinant of a matrix
•Dot Product, Cross Product, Eigenvalues, and Eigenvectors
Data Visualization
Data Visualization is a very important Prerequisite for Data Science. In
simple words, data visualization is a representation of data visually, through
graphs and charts. A data scientist should be able to represent data
graphically, using charts, graphs, maps, etc.
Data Visualization
There are multiple components in a good data visualization –
•Data Component – The first step in visualizing data is understanding the type of data, for example, it
could be continuous data, discrete data, or categorical data.
•Geometric Component – This means deciding what kind of visualization will best suit your data-
histograms, bar plots, heatmaps, scatter plots, pair plots, line graphs, etc.
•Mapping Component – In this component, you decide what variable you should use as the x-variable
(independent variable) and y-variable (dependent variable). This is especially important for multi-
dimensional datasets.
•Labels Component – This has the axes labels, legends, titles, font size, etc.
•Scale Component – Here you decide which scale you will be using- log scale, linear scale, etc.
•Ethical Component – This is to make sure that the visualization you have done, tells the true story,
and doesn’t mislead the audience.
Machine Learning and Artificial Intelligence –
•ML helps in analyzing large amounts of data using algorithms. Using Machine Learning, major
parts of a data scientist’s jobs can be automated.
•Only a small percentage of Data Scientists are proficient with advanced machine learning
techniques like adversarial learning, neural networks, reinforcement learning, Outlier Detection,
Time Series etc.
•The most skilled data scientists are highly familiar with advanced machine learning techniques
such as recommendation engines and Natural Language Processing.
•If you want to stand out from the crowd and be one at the top tier, knowledge of machine
learning techniques such as logistic regression, supervised machine learning, decision trees,
Survival Analysis, Computer Vision, etc., is a must.
Start making small Data Science projects
with the help of Kaggle
InsideKaggleyou’llfindallthecode&datayouneedto
doyourdatasciencework.
Useover50,000publicdatasetsand400,000
publicnotebookstoconqueranyanalysisinnotime.
Introduction to Data Science.pptx

More Related Content

Similar to Introduction to Data Science.pptx

The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfThe Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfData Science Council of America
 
Self Study Business Approach to DS_01022022.docx
Self Study Business Approach to DS_01022022.docxSelf Study Business Approach to DS_01022022.docx
Self Study Business Approach to DS_01022022.docxShanmugasundaram M
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceSrishti44
 
What is the difference between Data Science and Data Analytics.pdf
What is the difference between Data Science and Data Analytics.pdfWhat is the difference between Data Science and Data Analytics.pdf
What is the difference between Data Science and Data Analytics.pdfRoshni Sharma
 
Data science tutorial
Data science tutorialData science tutorial
Data science tutorialAakashdata
 
Data Engineer vs Data Scientist vs Data Analyst.pptx
Data Engineer vs Data Scientist vs Data Analyst.pptxData Engineer vs Data Scientist vs Data Analyst.pptx
Data Engineer vs Data Scientist vs Data Analyst.pptxCarolineRebeccaD
 
KIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfKIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfDr. Radhey Shyam
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAjaved75
 
Top Rated Dissertation Data Analysis Services | PhD Assistance
Top Rated Dissertation Data Analysis Services | PhD AssistanceTop Rated Dissertation Data Analysis Services | PhD Assistance
Top Rated Dissertation Data Analysis Services | PhD AssistancePHDAssistance2
 
Real-Time Data Analytics Examples
Real-Time Data Analytics ExamplesReal-Time Data Analytics Examples
Real-Time Data Analytics ExamplesPHDAssistance2
 
What is Data analytics? How is data analytics a better career option?
What is Data analytics? How is data analytics a better career option?What is Data analytics? How is data analytics a better career option?
What is Data analytics? How is data analytics a better career option?Aspire Techsoft Academy
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsIRJET Journal
 
"Data Science: Insight & Analysis" and fundamental of data science?
"Data Science: Insight & Analysis" and fundamental of data science?"Data Science: Insight & Analysis" and fundamental of data science?
"Data Science: Insight & Analysis" and fundamental of data science?arjunnegi34
 
Career in Python and data science
Career in Python and data science Career in Python and data science
Career in Python and data science Sagar Hedau
 
Introduction to Data Science.pdf
Introduction to Data Science.pdfIntroduction to Data Science.pdf
Introduction to Data Science.pdfUniversity of Sindh
 
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptxINTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptxMadhumitha N
 
Guide for a Data Scientist
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data ScientistRohit Dubey
 
Top 10 areas of expertise in data science
Top 10 areas of expertise in data scienceTop 10 areas of expertise in data science
Top 10 areas of expertise in data scienceGlobalTechCouncil
 

Similar to Introduction to Data Science.pptx (20)

Untitled document.pdf
Untitled document.pdfUntitled document.pdf
Untitled document.pdf
 
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfThe Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
 
Self Study Business Approach to DS_01022022.docx
Self Study Business Approach to DS_01022022.docxSelf Study Business Approach to DS_01022022.docx
Self Study Business Approach to DS_01022022.docx
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
What is the difference between Data Science and Data Analytics.pdf
What is the difference between Data Science and Data Analytics.pdfWhat is the difference between Data Science and Data Analytics.pdf
What is the difference between Data Science and Data Analytics.pdf
 
Data science tutorial
Data science tutorialData science tutorial
Data science tutorial
 
Data Engineer vs Data Scientist vs Data Analyst.pptx
Data Engineer vs Data Scientist vs Data Analyst.pptxData Engineer vs Data Scientist vs Data Analyst.pptx
Data Engineer vs Data Scientist vs Data Analyst.pptx
 
KIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfKIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdf
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATA
 
Top Rated Dissertation Data Analysis Services | PhD Assistance
Top Rated Dissertation Data Analysis Services | PhD AssistanceTop Rated Dissertation Data Analysis Services | PhD Assistance
Top Rated Dissertation Data Analysis Services | PhD Assistance
 
Real-Time Data Analytics Examples
Real-Time Data Analytics ExamplesReal-Time Data Analytics Examples
Real-Time Data Analytics Examples
 
What is Data analytics? How is data analytics a better career option?
What is Data analytics? How is data analytics a better career option?What is Data analytics? How is data analytics a better career option?
What is Data analytics? How is data analytics a better career option?
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data Analytics
 
"Data Science: Insight & Analysis" and fundamental of data science?
"Data Science: Insight & Analysis" and fundamental of data science?"Data Science: Insight & Analysis" and fundamental of data science?
"Data Science: Insight & Analysis" and fundamental of data science?
 
What is data science ?
What is data science ?What is data science ?
What is data science ?
 
Career in Python and data science
Career in Python and data science Career in Python and data science
Career in Python and data science
 
Introduction to Data Science.pdf
Introduction to Data Science.pdfIntroduction to Data Science.pdf
Introduction to Data Science.pdf
 
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptxINTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
 
Guide for a Data Scientist
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data Scientist
 
Top 10 areas of expertise in data science
Top 10 areas of expertise in data scienceTop 10 areas of expertise in data science
Top 10 areas of expertise in data science
 

More from Dr.Shweta

research ethics , plagiarism checking and removal.pptx
research ethics , plagiarism checking and removal.pptxresearch ethics , plagiarism checking and removal.pptx
research ethics , plagiarism checking and removal.pptxDr.Shweta
 
effective modular design.pptx
effective modular design.pptxeffective modular design.pptx
effective modular design.pptxDr.Shweta
 
software design: design fundamentals.pptx
software design: design fundamentals.pptxsoftware design: design fundamentals.pptx
software design: design fundamentals.pptxDr.Shweta
 
Search Algorithms in AI.pptx
Search Algorithms in AI.pptxSearch Algorithms in AI.pptx
Search Algorithms in AI.pptxDr.Shweta
 
Informed search algorithms.pptx
Informed search algorithms.pptxInformed search algorithms.pptx
Informed search algorithms.pptxDr.Shweta
 
constraint satisfaction problems.pptx
constraint satisfaction problems.pptxconstraint satisfaction problems.pptx
constraint satisfaction problems.pptxDr.Shweta
 
review paper publication.pptx
review paper publication.pptxreview paper publication.pptx
review paper publication.pptxDr.Shweta
 
SORTING techniques.pptx
SORTING techniques.pptxSORTING techniques.pptx
SORTING techniques.pptxDr.Shweta
 
Recommended System.pptx
 Recommended System.pptx Recommended System.pptx
Recommended System.pptxDr.Shweta
 
semi supervised Learning and Reinforcement learning (1).pptx
 semi supervised Learning and Reinforcement learning (1).pptx semi supervised Learning and Reinforcement learning (1).pptx
semi supervised Learning and Reinforcement learning (1).pptxDr.Shweta
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptxDr.Shweta
 
Unit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptxUnit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptxDr.Shweta
 
unit 1.2 supervised learning.pptx
unit 1.2 supervised learning.pptxunit 1.2 supervised learning.pptx
unit 1.2 supervised learning.pptxDr.Shweta
 
Introduction of machine learning.pptx
Introduction of machine learning.pptxIntroduction of machine learning.pptx
Introduction of machine learning.pptxDr.Shweta
 
searching techniques.pptx
searching techniques.pptxsearching techniques.pptx
searching techniques.pptxDr.Shweta
 
LINKED LIST.pptx
LINKED LIST.pptxLINKED LIST.pptx
LINKED LIST.pptxDr.Shweta
 
complexity.pptx
complexity.pptxcomplexity.pptx
complexity.pptxDr.Shweta
 

More from Dr.Shweta (20)

research ethics , plagiarism checking and removal.pptx
research ethics , plagiarism checking and removal.pptxresearch ethics , plagiarism checking and removal.pptx
research ethics , plagiarism checking and removal.pptx
 
effective modular design.pptx
effective modular design.pptxeffective modular design.pptx
effective modular design.pptx
 
software design: design fundamentals.pptx
software design: design fundamentals.pptxsoftware design: design fundamentals.pptx
software design: design fundamentals.pptx
 
Search Algorithms in AI.pptx
Search Algorithms in AI.pptxSearch Algorithms in AI.pptx
Search Algorithms in AI.pptx
 
Informed search algorithms.pptx
Informed search algorithms.pptxInformed search algorithms.pptx
Informed search algorithms.pptx
 
constraint satisfaction problems.pptx
constraint satisfaction problems.pptxconstraint satisfaction problems.pptx
constraint satisfaction problems.pptx
 
review paper publication.pptx
review paper publication.pptxreview paper publication.pptx
review paper publication.pptx
 
SORTING techniques.pptx
SORTING techniques.pptxSORTING techniques.pptx
SORTING techniques.pptx
 
Recommended System.pptx
 Recommended System.pptx Recommended System.pptx
Recommended System.pptx
 
semi supervised Learning and Reinforcement learning (1).pptx
 semi supervised Learning and Reinforcement learning (1).pptx semi supervised Learning and Reinforcement learning (1).pptx
semi supervised Learning and Reinforcement learning (1).pptx
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptx
 
Unit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptxUnit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptx
 
unit 1.2 supervised learning.pptx
unit 1.2 supervised learning.pptxunit 1.2 supervised learning.pptx
unit 1.2 supervised learning.pptx
 
Introduction of machine learning.pptx
Introduction of machine learning.pptxIntroduction of machine learning.pptx
Introduction of machine learning.pptx
 
searching techniques.pptx
searching techniques.pptxsearching techniques.pptx
searching techniques.pptx
 
LINKED LIST.pptx
LINKED LIST.pptxLINKED LIST.pptx
LINKED LIST.pptx
 
complexity.pptx
complexity.pptxcomplexity.pptx
complexity.pptx
 
queue.pptx
queue.pptxqueue.pptx
queue.pptx
 
STACK.pptx
STACK.pptxSTACK.pptx
STACK.pptx
 
dsa.pptx
dsa.pptxdsa.pptx
dsa.pptx
 

Recently uploaded

Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSAnaAcapella
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsSandeep D Chaudhary
 
What is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxWhat is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxCeline George
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
Play hard learn harder: The Serious Business of Play
Play hard learn harder:  The Serious Business of PlayPlay hard learn harder:  The Serious Business of Play
Play hard learn harder: The Serious Business of PlayPooky Knightsmith
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
Economic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food AdditivesEconomic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food AdditivesSHIVANANDaRV
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17Celine George
 
Details on CBSE Compartment Exam.pptx1111
Details on CBSE Compartment Exam.pptx1111Details on CBSE Compartment Exam.pptx1111
Details on CBSE Compartment Exam.pptx1111GangaMaiya1
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...Nguyen Thanh Tu Collection
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 
Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17Celine George
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningMarc Dusseiller Dusjagr
 
Simple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdfSimple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdfstareducators107
 

Recently uploaded (20)

Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & Systems
 
What is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxWhat is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Play hard learn harder: The Serious Business of Play
Play hard learn harder:  The Serious Business of PlayPlay hard learn harder:  The Serious Business of Play
Play hard learn harder: The Serious Business of Play
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Economic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food AdditivesEconomic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food Additives
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Details on CBSE Compartment Exam.pptx1111
Details on CBSE Compartment Exam.pptx1111Details on CBSE Compartment Exam.pptx1111
Details on CBSE Compartment Exam.pptx1111
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
VAMOS CUIDAR DO NOSSO PLANETA! .
VAMOS CUIDAR DO NOSSO PLANETA!                    .VAMOS CUIDAR DO NOSSO PLANETA!                    .
VAMOS CUIDAR DO NOSSO PLANETA! .
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learning
 
Simple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdfSimple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdf
 

Introduction to Data Science.pptx

  • 2. 01 Definitions and Examples 02 What exactly data science and data scientist do? 03 Relation between artificial ineliigence and data science 04 To become a data scientist one should know the various techniques. TABLEOFCONTENTS DataScience&Importance DataScience Process AIandDataScience Prerequisites forDS
  • 3. WhatisDataScience? Data science is an interdisciplinary field that uses algorithms, procedures, and processes to examine large amounts of data in order to uncover hidden patterns, generate insights, and direct decision making.
  • 5. Career Opportunities "The rise of Data Science needs will create roughly 11.5 million job openings by 2026" US Bureau of Labour Statistics "By 2026, Data Scientists and Analysts will become the number one emerging role in the world." World Economic Forum Data Science and Artificial Intelligence are amongst the hottest fields of the 21st century that will impactall segments of daily life by 2025, from transport and logistics to healthcare and customer service.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20. Examples: Oil giant Shell, for instance, used data science to anticipate machine failure at facilities across the world. Agricultural company Cargill developed a mobile data-tracking app that helps shrimp farmers reduce mortality rates. Dr. Pepper Snapple Group analyzed data with machine learning to glean more details about beverage sales and vendors. And freight company Pitt Ohio used historical data and predictive analytics to estimate delivery time with 99 percent accuracy.
  • 21. Facts on Data Generation
  • 22. Facts on Data Genaration Statistics show that more than 500 terabytes of new data are entered into the databases of the social networking site Facebook every day. •A single Jet engine can generate over 10 terabytes of data in 30 minutes of flight time. With several thousand flights per day, data generation reaches several petabytes. •Stock Exchange is also an example of big data that generates about a terabyte of new trade data per day
  • 24. CollectData Raw data is gathered from various sources that explain the business problem Using various statistical analysis, and machine learning approaches, data modeling is performed to get the optimum solutions that best explain the business problem. Actionable insights that will serve as a solution for the business problems gathered through data science. How does Data Science Work? AnalyzeData Insights
  • 25. CollectData Gather the previous data on the sales that were closed. Use statistical analysis to find out the patterns that were followed by the leads that were closed. Use machine learning to get actionable insights for finding out potential leads. Consider an Example! AnalyzeData Insights Suppose there is an organization that is working towards finding out potential leads for their sales team. They can follow the following approach to get an optimal solution using Data Science:
  • 26. Letscheckrelationship betweenAIandDataScience “In above example we saw machine learning is required for insights”
  • 28. Data science and artificial intelligence are not the same. “Data science and artificial intelligence are two technologies that are transforming the world. While artificial intelligence powers data science operations, data science is not completely dependent on AI. Data Science is leading the fourth industrial revolution. ”
  • 29. Datascience alsorequires machinelearningalgorithms, which results in dependencyonAI.
  • 30. Comparison BetweenAIandDataScience • Data science jobs require the knowledge of ML languages like R and Python to perform various data operations and computer science expertise. • Data science uses more tools apart from AI. This is because data science involves multiples steps to analyze data and generate insights. • Data science models are built for statistical insights whereas AI is used to build models that mimic cognition and human understanding.
  • 31. Comparison BetweenAIandDataScience • Today’s industries require both, data science and artificial intelligence. Data science will help them make necessary data-driven decisions and assess their performance in the market, while artificial intelligence will help industries work with smarter devices and software that will minimize workload and optimize all the processes for improves innovation.
  • 34. Computer Science MachineLearning Statistics DataScienceisasciencewhichuses: Visualization To collect ,clean, Integrate, analyze, visualize, interact with data to create data products. HumanComputer Interaction
  • 35. RoadmapofDataScience Commonly used languages are python and R Involves charts,graphs etc. We have already two libraries in Python for this i.e Seaborn, matplotlib Like mean, median , mode, standard deviation etc. Know about basic algorithms of ML like Linear regression, Logistic regression, Decision tree, SVM algorithm, KNN algorithm , Random forest algorithm. 1.LearnProgramming Language 3.DataVisualization 2.Mathematics &Statistics 4.MachineLearning 5.Project Try to make projects with the help of Kaggle
  • 36. Python: • Python is also most popular programming language among data scientists these days. • Python is very versatile and can be used in almost all the processes in Data Science. • Be it data mining or running embedded systems, python can do everything, and because of this, 40% of the people that participated in a survey by O’Reilly said that they used Python most often. • Pandas, a python library, is used for data analysis and can do anything from plotting data with histograms, to importing data from spreadsheets. • Python can take data in various formats and import SQL tables to your code easily. • The python packages you need to master are Numpy, Matplotlib, PyTorch, Pandas, Scikit-Learn, and Seaborn in Python.
  • 37. RprogrammingLanguage •Following Python, the next skill in the list was R Programming, mentioned in 32% of the job postings. R is a language specifically designed for Data Science. • It can be used to solve any Data Science related problem that you might encounter. •It is the most popular language among Data Scientists. •Infact, 43% of data scientists prefer to use R for solving statistical problems. • It is one of the most important Data Science Prerequisites. However, the learning curve is steep. It is difficult to master, especially if you already have an expertise in any other programming language. •R can implement ML algorithms to give us a vast variety of statistical and graphical techniques like time-series analysis, clustering, classical statistical tests etc. •It is used for calculations and data manipulation. Tidyverse, Ggplot2, Stringr, Dplyr and Caret are some of the things to master in R.
  • 38. Mathematics and Statistics Mathematics is one of the very popular Prerequisites for Data Science. Probability and Statistics are used for data imputation, visualization of features, feature transformation, model evaluation, dimensionality reduction, feature engineering and data preprocessing. Multivariable Calculus is used to build Machine Learning Models. For Model Evaluation, Data Preprocessing and Data Transformation, we use linear Algebra. A matrix is used to represent a Data set.
  • 39. Mathematics and Statistics There is no defined syllabus for what you need to learn in Mathematics and Statistics but here are a few topics you should be familiar with – •Mean, Median, Mode, Variance, Standard Deviation, and Percentiles •Bayes Theorem And Probability Distribution (Normal, Poisson, and Binomial) •Covariance Matrix and Correlation Coefficient •Mean Square Error and R2 Score •Statistical tests like p-value, hypothesis Testing, and chi-square •Multivariate Functions, Cost Functions and Maxima and Minima of a Function •Step Function, Sigmoid Function, Logit function and Rectified Linear Unit •Vectors and Matrices; Transpose, Inverse, and Determinant of a matrix •Dot Product, Cross Product, Eigenvalues, and Eigenvectors
  • 40. Data Visualization Data Visualization is a very important Prerequisite for Data Science. In simple words, data visualization is a representation of data visually, through graphs and charts. A data scientist should be able to represent data graphically, using charts, graphs, maps, etc.
  • 41. Data Visualization There are multiple components in a good data visualization – •Data Component – The first step in visualizing data is understanding the type of data, for example, it could be continuous data, discrete data, or categorical data. •Geometric Component – This means deciding what kind of visualization will best suit your data- histograms, bar plots, heatmaps, scatter plots, pair plots, line graphs, etc. •Mapping Component – In this component, you decide what variable you should use as the x-variable (independent variable) and y-variable (dependent variable). This is especially important for multi- dimensional datasets. •Labels Component – This has the axes labels, legends, titles, font size, etc. •Scale Component – Here you decide which scale you will be using- log scale, linear scale, etc. •Ethical Component – This is to make sure that the visualization you have done, tells the true story, and doesn’t mislead the audience.
  • 42. Machine Learning and Artificial Intelligence – •ML helps in analyzing large amounts of data using algorithms. Using Machine Learning, major parts of a data scientist’s jobs can be automated. •Only a small percentage of Data Scientists are proficient with advanced machine learning techniques like adversarial learning, neural networks, reinforcement learning, Outlier Detection, Time Series etc. •The most skilled data scientists are highly familiar with advanced machine learning techniques such as recommendation engines and Natural Language Processing. •If you want to stand out from the crowd and be one at the top tier, knowledge of machine learning techniques such as logistic regression, supervised machine learning, decision trees, Survival Analysis, Computer Vision, etc., is a must.
  • 43. Start making small Data Science projects with the help of Kaggle InsideKaggleyou’llfindallthecode&datayouneedto doyourdatasciencework. Useover50,000publicdatasetsand400,000 publicnotebookstoconqueranyanalysisinnotime.