Data Science
Muhammad Suleman Memon
Assistant Professor
Department of Information Technology,
Dadu Campus,
University of Sindh
What is
Data
Science?
Data science is the domain of
study that deals with vast
volumes of data.
Its goal is to find unseen patterns,
derive meaningful information, and
make business decisions.
Data science uses complex
machine learning algorithms to
build predictive models.
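As a rough illustration of what "building a predictive model" means, the sketch below fits a straight line to a handful of points and uses it to predict a new value. The data (advertising spend vs. sales) and the function names are hypothetical, and ordinary least squares is used only because it is the simplest possible model; real projects usually rely on far richer algorithms.

```python
# Hypothetical toy data: advertising spend (x) and resulting sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(spend, sales)


def predict(x):
    """Predict sales for an unseen spend value using the fitted line."""
    return slope * x + intercept


print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"predicted sales at spend 6.0: {predict(6.0):.2f}")
```

The model is "trained" on past observations and then applied to an input it has not seen, which is the core idea behind predictive modelling.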
Data Science
Applications
Sources of the Data
Data Science Lifecycle
Prerequisites
for Data
Science
1. Machine Learning
2. Modeling
3. Statistics
4. Programming
5. Databases
Who Oversees the Data Science Process?
• Business Managers
• To collaborate with the data science team to characterize the problem and
establish an analytical method.
• IT Managers
• Developing the infrastructure and architecture to enable data science
activities.
• Data Science Managers
• Supervise the working procedures of all data science team members.
• They also manage and keep track of the day-to-day activities of the three data
science teams.
What is a
Data
Scientist?
Data scientists are professionals who have
the technical ability to handle complicated
issues as well as the desire to investigate
what questions need to be answered.
They're a mix of mathematicians, computer
scientists, and trend forecasters.
They're also in high demand and well-paid
because they work in both the business and
IT sectors.
On a daily
basis, a data
scientist
may do the
following
tasks:
Discover patterns and
trends in datasets to get
insights.
Create forecasting
algorithms and data
models.
Improve the quality of
data or product offerings
by utilising machine
learning techniques.
Distribute suggestions to
other teams and top
management.
In data analysis, use data
tools such as R, SAS,
Python, or SQL.
Stay on top of innovations
in the data science field.
What Does a
Data Scientist
Do?
Determine the
problem.
Determine the
correct set of
variables and
datasets.
Gather structured
and unstructured
data from many
sources.
Convert raw data
into a suitable
format.
Apply ML
algorithms.
Interpret the data to
find opportunities
and solutions.
Prepare the
results and
insights to share
with stakeholders.
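The steps above can be sketched end to end on a tiny example. Everything here is an assumption made for illustration: the question (does study time relate to passing an exam?), the records, and the helper names. A simple threshold rule stands in for a real ML algorithm.

```python
# Steps 1-3: a hypothetical problem and raw data gathered as text records.
raw_records = [
    {"hours": "2", "passed": "no"},
    {"hours": "9", "passed": "yes"},
    {"hours": "", "passed": "no"},   # missing value in the raw data
    {"hours": "7", "passed": "yes"},
    {"hours": "3", "passed": "no"},
    {"hours": "8", "passed": "yes"},
]


def clean(records):
    """Step 4: convert raw text into (hours, passed) pairs, dropping gaps."""
    cleaned = []
    for record in records:
        if not record["hours"]:
            continue  # skip rows with a missing measurement
        cleaned.append((float(record["hours"]), record["passed"] == "yes"))
    return cleaned


def fit_threshold(rows):
    """Step 5: stand-in for an ML algorithm.

    Places a decision threshold midway between the average hours of
    students who passed and of those who did not.
    """
    passed = [hours for hours, ok in rows if ok]
    failed = [hours for hours, ok in rows if not ok]
    return (sum(passed) / len(passed) + sum(failed) / len(failed)) / 2


rows = clean(raw_records)
threshold = fit_threshold(rows)

# Step 6: interpret the result by checking how well the rule fits the data.
accuracy = sum((hours >= threshold) == ok for hours, ok in rows) / len(rows)

# Step 7: prepare a short summary for stakeholders.
print(f"Rows used: {len(rows)} of {len(raw_records)}")
print(f"Students studying at least {threshold:.2f} hours tend to pass "
      f"(rule matches {accuracy:.0%} of the cleaned records).")
```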
Why Become
a Data
Scientist?
• According to Glassdoor and Forbes,
demand for data scientists will
increase by 28 percent by 2026,
which speaks to the profession’s
durability and longevity. If you
want a secure career, data science
offers you that chance.
Use of Data
Science
1. Data science may detect patterns in seemingly
unstructured or unconnected data, allowing
conclusions and predictions to be made.
2. Tech businesses that acquire user data can
utilize strategies to transform that data into
valuable or profitable information.
3. Data Science has also made inroads into the
transportation industry, such as with driverless
cars.
4. Data Science applications provide a better level
of therapeutic customization through genetics
and genomics research.
Data Scientist
Job role: Data scientists determine
what the problem is, what questions
need answers, and where to
find the data. They also mine,
clean, and present the relevant
data.
Skills needed: Programming
skills (SAS, R, Python),
storytelling and data
visualization, statistical and
mathematical skills, knowledge
of Hadoop, SQL, and Machine
Learning.
Data Analyst
Job role: Analysts bridge the gap
between the data scientists and the
business analysts, organizing and
analyzing data to answer the
questions the organization poses.
They take the technical analyses and
turn them into qualitative action
items.
Skills needed: Statistical and
mathematical skills, programming
skills (SAS, R, Python), plus
experience in data wrangling and
data visualization.
Data Engineer
Job role: Data engineers focus on
developing, deploying, managing,
and optimizing the organization’s
data infrastructure and data
pipelines. Engineers support data
scientists by helping to transfer
and transform data for queries.
Skills needed: NoSQL databases
(e.g., MongoDB, Cassandra),
programming languages such as
Java and Scala, and frameworks
such as Apache Hadoop.
Data
Science
Tools
Data Analysis: SAS, Jupyter,
RStudio, MATLAB, Excel, RapidMiner
Data Warehousing: Informatica,
Talend, AWS Redshift
Data Visualization: Jupyter, Tableau,
Cognos, RAW
Machine Learning: Spark MLlib,
Mahout, Azure ML Studio
Difference
Between
Business
Intelligence
and Data
Science
BUSINESS INTELLIGENCE | DATA SCIENCE
Uses structured data | Uses both structured and unstructured data
Analytical in nature: provides a historical report of the data | Scientific in nature: performs an in-depth statistical analysis of the data
Uses basic statistics with emphasis on visualization (dashboards, reports) | Leverages more sophisticated statistical and predictive analysis and machine learning (ML)
Compares historical data to current data to identify trends | Combines historical and current data to predict future performance and outcomes
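The last row of the comparison can be made concrete with a small sketch. The revenue figures and function names below are hypothetical, and the "naive drift" forecast (extend the average past change) is only a stand-in for the predictive models a data science team would actually use.

```python
# Hypothetical monthly revenue figures, oldest first (illustrative only).
revenue = [100.0, 110.0, 120.0, 130.0]


def bi_report(history):
    """BI-style view: describe what has already happened."""
    previous, current = history[-2], history[-1]
    return {
        "total": sum(history),
        "change_vs_previous": current - previous,
    }


def naive_drift_forecast(history, steps=1):
    """Data-science-style view: extrapolate the average past change."""
    changes = [later - earlier for earlier, later in zip(history, history[1:])]
    average_change = sum(changes) / len(changes)
    return history[-1] + steps * average_change


report = bi_report(revenue)
forecast = naive_drift_forecast(revenue)

print("Historical report:", report)
print("Forecast for next month:", forecast)
```

The first function only summarises past data; the second combines past and current values to estimate a future one.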
Applications
of Data
Science
1. Healthcare
2. Gaming
3. Image Recognition
4. Recommendation Systems
5. Logistics
6. Fraud Detection
7. Internet Search
8. Speech Recognition
9. Targeted Advertising
10. Airline Route Planning
11. Augmented Reality
Programming Language
for Data Science
Python
Fundamental
Python
Libraries for
Data
Scientists
NumPy
SciPy
Pandas
Scikit-Learn
IDE
PyCharm
Getting Started
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Getting Started
data = {
    'year': [2010, 2011, 2012, 2010, 2011, 2012, 2010, 2011, 2012],
    'team': ['FCBarcelona', 'FCBarcelona', 'FCBarcelona',
             'RMadrid', 'RMadrid', 'RMadrid',
             'ValenciaCF', 'ValenciaCF', 'ValenciaCF'],
    'wins': [30, 28, 32, 29, 32, 26, 21, 17, 19],
    'draws': [6, 7, 4, 5, 4, 7, 8, 10, 8],
    'losses': [2, 3, 2, 4, 2, 5, 9, 11, 11],
}
# Build a DataFrame, fixing the column order explicitly
football = pd.DataFrame(data, columns=['year', 'team', 'wins', 'draws', 'losses'])
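Once the DataFrame exists, a natural next step is to derive new columns and summarise by group. The sketch below rebuilds the same table so that it runs on its own. The points column (3 per win, 1 per draw) is an assumption added for illustration and is not part of the original slides.

```python
import pandas as pd

data = {
    "year": [2010, 2011, 2012, 2010, 2011, 2012, 2010, 2011, 2012],
    "team": ["FCBarcelona", "FCBarcelona", "FCBarcelona",
             "RMadrid", "RMadrid", "RMadrid",
             "ValenciaCF", "ValenciaCF", "ValenciaCF"],
    "wins": [30, 28, 32, 29, 32, 26, 21, 17, 19],
    "draws": [6, 7, 4, 5, 4, 7, 8, 10, 8],
    "losses": [2, 3, 2, 4, 2, 5, 9, 11, 11],
}
football = pd.DataFrame(
    data, columns=["year", "team", "wins", "draws", "losses"]
)

# Assumed scoring rule for illustration: 3 points per win, 1 per draw.
football["points"] = 3 * football["wins"] + football["draws"]

# Total wins per team across the three seasons.
wins_by_team = football.groupby("team")["wins"].sum()

print(football)
print(wins_by_team)
```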
Output
Read CSV
• import pandas as pd
• mydata = pd.read_csv('data.csv')
First Five Rows
• mydata.head()
Last Five Rows
• mydata.tail()
Show Statistical Information
• mydata.describe()
Selecting Data
• mydata['column']
Subset of Rows
• mydata[5:10]
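The calls from the last few slides can be tried together without a file on disk. In this sketch the CSV content is made up and is read from an in-memory string with io.StringIO; with a real file, the path would be passed to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical CSV content: 12 rows with an id and a score column.
csv_text = "id,score\n" + "\n".join(f"{i},{10 * i}" for i in range(1, 13))

# read_csv accepts a file path or any file-like object.
mydata = pd.read_csv(io.StringIO(csv_text))

first_rows = mydata.head()      # first five rows
last_rows = mydata.tail()       # last five rows
stats = mydata.describe()       # count, mean, std, min, quartiles, max
scores = mydata["score"]        # select a single column (a Series)
subset = mydata[5:10]           # rows at positions 5 to 9

print(first_rows)
print(last_rows)
print(stats)
print(scores)
print(subset)
```

Note that the row slice uses zero-based positions and excludes the end position, so mydata[5:10] returns five rows.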
Thank You
