SlideShare a Scribd company logo
1 of 16
Introduction to Data Science
Kamal Al Nasr, Matthew Hayes and Jean-Claude Pedjeu
Computer Science and Mathematical Sciences
College of Engineering
Tennessee State University
Outline
Data, Big Data and Challenges
Data Science
Introduction
Why Data Science
Data Scientists
What do they do?
Major/Concentration in Data Science
What courses to take.
Data All Around
Lots of data is being collected
and warehoused
Web data, e-commerce
Financial transactions, bank/credit
transactions
Online trading and purchasing
Social Network
Big Data
Big Data is any data that is expensive to manage
and hard to extract value from
Volume
The size of the data
Velocity
The latency of data processing relative to the
growing demand for interactivity
Variety and Complexity
the diversity of sources, formats, quality, structures.
Big Data
Types of Data We Have
Relational Data
(Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
Social Network, Semantic Web (RDF), …
Streaming Data
You can afford to scan the data once
What To Do With These Data?
Aggregation and Statistics
Data warehousing and OLAP
Indexing, Searching, and Querying
Keyword based search
Pattern matching (XML/RDF)
Knowledge discovery
Data Mining
Statistical Modeling
What is Data Science?
An area that manages, manipulates,
extracts, and interprets knowledge from
tremendous amount of data
Data science (DS) is a multidisciplinary
field of study with goal to address the
challenges in big data
Data science principles apply to all data –
big and small
https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/
What is Data Science?
Theories and techniques from many fields and
disciplines are used to investigate and analyze a
large amount of data to help decision makers in
many industries such as science, engineering,
economics, politics, finance, and education
Computer Science
Pattern recognition, visualization, data warehousing, High
performance computing, Databases, AI
Mathematics
Mathematical Modeling
Statistics
Statistical and Stochastic modeling, Probability.
Why is it sexy?
Gartner’s 2014 Hype Cycle
Data Science
Data Science
Real Life Examples
Companies learn your secrets, shopping
patterns, and preferences
For example, can we know if a woman is
pregnant, even if she doesn’t want us to
know? Target case study
Data Science and election (2008, 2012)
1 million people installed the Obama
Facebook app that gave access to info on
“friends”
Data Scientists
Data Scientist
The Sexiest Job of the 21st Century
They find stories, extract knowledge. They
are not reporters
Data Scientists
Data scientists are the key to realizing the
opportunities presented by big data. They
bring structure to it, find compelling
patterns in it, and advise executives on the
implications for products, processes, and
decisions
What do Data Scientists do?
National Security
Cyber Security
Business Analytics
Engineering
Healthcare
And more ….

More Related Content

Similar to Introduction to Data Science 1116.pptx

Introduction to Data Science
Introduction to Data Science Introduction to Data Science
Introduction to Data Science Rajuyadav887963
 
Introduction to Data Science\
Introduction to Data Science\Introduction to Data Science\
Introduction to Data Science\Rajuyadav887963
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceSarmiHarsha
 
Introduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptxIntroduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptxdatapro2
 
Introduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptxIntroduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptxSanmati Jain
 
Opportunities in Data Science.ppt
Opportunities in Data Science.pptOpportunities in Data Science.ppt
Opportunities in Data Science.pptSwapnilTelrandhe1
 
Introduction to Data Science 1121.pptx
Introduction to Data Science 1121.pptxIntroduction to Data Science 1121.pptx
Introduction to Data Science 1121.pptxmark828
 
Introduction to Data Science 1113.pptx
Introduction to Data Science 1113.pptxIntroduction to Data Science 1113.pptx
Introduction to Data Science 1113.pptxmark828
 
Semantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextSemantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextMurad Daryousse
 
Why is Data Science a Popular Career Choice.pdf
Why is Data Science a Popular Career Choice.pdfWhy is Data Science a Popular Career Choice.pdf
Why is Data Science a Popular Career Choice.pdfUSDSI
 
Introduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .pptIntroduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .pptSangrangBargayary3
 
Big_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_ReddiboinaBig_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_ReddiboinaMadhu Reddiboina
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
Big Data: Big Deal or Big Brother?
Big Data: Big Deal or Big Brother?Big Data: Big Deal or Big Brother?
Big Data: Big Deal or Big Brother?John D. Johnson
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysisPoonam Kshirsagar
 

Similar to Introduction to Data Science 1116.pptx (20)

Introduction to Data Science
Introduction to Data Science Introduction to Data Science
Introduction to Data Science
 
Introduction to Data Science\
Introduction to Data Science\Introduction to Data Science\
Introduction to Data Science\
 
mkol.pptx
mkol.pptxmkol.pptx
mkol.pptx
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
hjol.pptx
hjol.pptxhjol.pptx
hjol.pptx
 
Introduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptxIntroduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptx
 
Introduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptxIntroduction to Data Science 5-13.pptx
Introduction to Data Science 5-13.pptx
 
Opportunities in Data Science.ppt
Opportunities in Data Science.pptOpportunities in Data Science.ppt
Opportunities in Data Science.ppt
 
Introduction to Data Science 1121.pptx
Introduction to Data Science 1121.pptxIntroduction to Data Science 1121.pptx
Introduction to Data Science 1121.pptx
 
Introduction to Data Science 1113.pptx
Introduction to Data Science 1113.pptxIntroduction to Data Science 1113.pptx
Introduction to Data Science 1113.pptx
 
Big Data Ethics
Big Data EthicsBig Data Ethics
Big Data Ethics
 
Semantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextSemantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data Context
 
Untitled document.pdf
Untitled document.pdfUntitled document.pdf
Untitled document.pdf
 
Why is Data Science a Popular Career Choice.pdf
Why is Data Science a Popular Career Choice.pdfWhy is Data Science a Popular Career Choice.pdf
Why is Data Science a Popular Career Choice.pdf
 
Introduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .pptIntroduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .ppt
 
Big_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_ReddiboinaBig_Data_ML_Madhu_Reddiboina
Big_Data_ML_Madhu_Reddiboina
 
Why Data Science is a Science
Why Data Science is a ScienceWhy Data Science is a Science
Why Data Science is a Science
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
Big Data: Big Deal or Big Brother?
Big Data: Big Deal or Big Brother?Big Data: Big Deal or Big Brother?
Big Data: Big Deal or Big Brother?
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysis
 

Recently uploaded

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Recently uploaded (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

Introduction to Data Science 1116.pptx

  • 1. Introduction to Data Science Kamal Al Nasr, Matthew Hayes and Jean-Claude Pedjeu Computer Science and Mathematical Sciences College of Engineering Tennessee State University
  • 2. Outline Data, Big Data and Challenges Data Science Introduction Why Data Science Data Scientists What do they do? Major/Concentration in Data Science What courses to take.
  • 3. Data All Around Lots of data is being collected and warehoused Web data, e-commerce Financial transactions, bank/credit transactions Online trading and purchasing Social Network
  • 4. Big Data Big Data is any data that is expensive to manage and hard to extract value from Volume The size of the data Velocity The latency of data processing relative to the growing demand for interactivity Variety and Complexity the diversity of sources, formats, quality, structures.
  • 6. Types of Data We Have Relational Data (Tables/Transaction/Legacy Data) Text Data (Web) Semi-structured Data (XML) Graph Data Social Network, Semantic Web (RDF), … Streaming Data You can afford to scan the data once
  • 7. What To Do With These Data? Aggregation and Statistics Data warehousing and OLAP Indexing, Searching, and Querying Keyword based search Pattern matching (XML/RDF) Knowledge discovery Data Mining Statistical Modeling
  • 8. What is Data Science? An area that manages, manipulates, extracts, and interprets knowledge from tremendous amount of data Data science (DS) is a multidisciplinary field of study with goal to address the challenges in big data Data science principles apply to all data – big and small https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/
  • 9. What is Data Science? Theories and techniques from many fields and disciplines are used to investigate and analyze a large amount of data to help decision makers in many industries such as science, engineering, economics, politics, finance, and education Computer Science Pattern recognition, visualization, data warehousing, High performance computing, Databases, AI Mathematics Mathematical Modeling Statistics Statistical and Stochastic modeling, Probability.
  • 10. Why is it sexy? Gartner’s 2014 Hype Cycle
  • 13. Real Life Examples Companies learn your secrets, shopping patterns, and preferences For example, can we know if a woman is pregnant, even if she doesn’t want us to know? Target case study Data Science and election (2008, 2012) 1 million people installed the Obama Facebook app that gave access to info on “friends”
  • 14. Data Scientists Data Scientist The Sexiest Job of the 21st Century They find stories, extract knowledge. They are not reporters
  • 15. Data Scientists Data scientists are the key to realizing the opportunities presented by big data. They bring structure to it, find compelling patterns in it, and advise executives on the implications for products, processes, and decisions
  • 16. What do Data Scientists do? National Security Cyber Security Business Analytics Engineering Healthcare And more ….