SlideShare a Scribd company logo
1 of 26
Ch. Eick Introduction to Data Science 8/28/2023
What is Data Science?
 What words come to mind when you think of Data
Science?
 What experience do you have with Data
Science?
 Why are you taking the Data Science I class?
1
Ch. Eick Introduction to Data Science 8/28/2023
Data Science Definitions
Data Science is an interdisciplinary field about processes and systems to extract
knowledge or insights from data in various forms, either structured or
unstructured, which is a continuation of some of the data analysis fields such as
statistics, data mining, and predictive analytics, similar to Knowledge Discovery
in Data (KDD).
Data Science is a multi-disciplinary field that uses scientific methods, processes,
algorithms and systems to extract knowledge and insights from structured and
unstructured data. Data science is the same concept as data mining and big
data: "use the most powerful hardware, the most powerful programming
systems, and the most efficient algorithms to solve problems" Data science is a
"concept to unify statistics, data analysis, machine learning and their related
methods" in order to "understand and analyze actual phenomena" with data
technology“ (Wikipedia)
Ch. Eick Introduction to Data Science 8/28/2023
Data Science Definitions II
 There are many, but most say data science is:
– Broad – broader than any one existing discipline
– Interdisciplinary: Computer Science, Statistics,
Information Science, Databases, Mathematics
– Applied focus on extracting knowledge from data to
inform decision making.
– Focuses on the skills needed to collect, manage,
store, distribute, analyze, visualize, reuse data and on
data storytelling.
 There are many visual representations in Data
Science
3
Ch. Eick Introduction to Data Science 8/28/2023
Some More Definitions
4
Ch. Eick Introduction to Data Science 8/28/2023
More Definitions
5
Ch. Eick Introduction to Data Science 8/28/2023
Data Science is Broad!
6
Ch. Eick Introduction to Data Science 8/28/2023
Data Science Word Cloud
7
Ch. Eick Introduction to Data Science 8/28/2023
Data Analysis
 We analyze data to extract meaning from it.
 Virtually all data analysis focuses on data
reduction
 Data reduction comes in the form of:
– Descriptive statistics
– Measures of association
– Graphical visualizations
 The objective is to abstract from all of the data
some feature or set of features that captures
evidence of the process you are studying
8
Ch. Eick Introduction to Data Science 8/28/2023
The Data Lifecycle
 Data science considers data at every stage of what is
called the data lifecycle.
 This lifecycle generally refers to everything from
collecting data to analyzing it to sharing it so others
can re-analyze it.
 New visions of this process in particular focus on
integrating every action that creates, analyzes, or
otherwise touches data.
 These same new visions treat the process as
dynamic – archives are not just digital shoe boxes
under the bed.
 There are many representations of the this lifecycle.
9
Ch. Eick Introduction to Data Science 8/28/2023
10
Ch. Eick Introduction to Data Science 8/28/2023
Data curation is a term used to indicate management
activities related to organization and integration
of data collected from various sources, annotation of the
data, and publication and presentation of the data such
that the value of the data is maintained over time, and the
data remains available for reuse and preservation.
Data Curation
Ch. Eick Introduction to Data Science 8/28/2023
News August 29, 2023
 Task1 ProblemSet1 should be available on Sept. 5 or earlier;
check course website!
 Plans for today’s class
– Finish Introduction to Data Science/Data Mining
– 3p: Conduct Survey R/Python Knowledge of Students in the
course
– Data Science Basics & Exploratory Data Analysis
 Plans for the August 31 class
– Continue Data Science Basics & Exploratory Data Analysis
– Preprocessing for DS/Data Mining
– Optional: Brief Discussion of Task1 of ProblemSet1
– More Discussion GHC and Announce GHC Groups
 There will be a lab as part of the Sept. 5 class; bring laptop!
12
Ch. Eick Introduction to Data Science 8/28/2023
What is Missing?
 Most definitions of data science underplay or
leave out discussions of:
– Substantive theory
– Metadata
– Privacy and Ethics
13
Ch. Eick Introduction to Data Science 8/28/2023
Privacy and Ethics
 Data, the elements of data science, and even so-
called “Big Data” are not new.
 One thing that is new is the greater variety of
data and, most importantly, the amount of data
available about humans.
 Discussion and good policy regarding privacy,
security, and the ethical use of data about people
lags behind the methods of collecting, sharing,
archiving, and analyzing data.
14
Ch. Eick Introduction to Data Science 8/28/2023
Big Data
 The launch of the Data Science conversation has
been sparked primarily by the so-called “Big
Data” revolution.
 As mentioned, we have always had data that
taxed our technical and computational capacities.
 “Big Data” makes front-page news, however,
because of the explosion of data about people.
 Contemporary definitions of Big Data focus on:
– Volume (the amount of data)
– Velocity (the speed of data in and out)
– Variety (the diverse types of data)
15
Ch. Eick Introduction to Data Science 8/28/2023
16
Ch. Eick Introduction to Data Science 8/28/2023
Talk Outline
1. Importance of Data Science
2. Data Science is More than Using Tools
3. Data Storytelling
4. Examples of Data Storytelling
5. Conclusion
Data Analysis and Intelligent Systems Lab
Data Science According to Swami Chandrasekaran
Ch. Eick Introduction to Data Science 8/28/2023
Data Science and Storytelling
 Google’s Chief Economist Dr. Hal R.Varian stated, "The
ability to take data—to be able to understand it, to process it,
to extract value from it, to visualize it, to communicate it—
that’s going to be a hugely important skill in the next
decades."
 “When hiring data scientists, people tend to focus primarily
on technical qualifications. It’s hard to find candidates who
have the right mix of computational and statistical skills. But
what’s even harder is finding people who have those skills
and are good at communicating the story behind the data.”
Michael Li
Data Analysis and Intelligent Systems Lab
Ch. Eick Introduction to Data Science 8/28/2023
Data Science and Storytelling 2
 “Data scientists are involved with gathering data, massaging it into a tractable
form, making it tell its story, and presenting that story to others.” – Mike
Loukides, VP, O’Reilly Media
 “Our challenge as data scientists is to translate this haystack of information
into guidance for staff so they can make smart decisions…We “humanize” the
data by turning raw numbers into a story about our performance. Data
scientists want to believe that data has all the answers. But the most important
part of our job is qualitative: asking questions, creating directives from our
data, and telling its story. ” Jeff Bladtand and Bob Filbin
 Telling Stories with Data in 3 Steps (Quick Study) - Bing video (showed
this video on Aug. 24, 2023)
 The Power in Effective Data Storytelling | Malavica Sridhar | TEDxUIUC -
Bing video starting 6:40
 20 of the Best Video Storytelling Examples EVER | Wyzowl
 20 Best Data Storytelling Examples (updated for 2023) — Juice Analytics
Data Analysis and Intelligent Systems Lab
Not covered in August 2023, except we showed one video!
Ch. Eick Introduction to Data Science 8/28/2023
Comment
 This concludes the Introduction to Data
Science/Data Mining
 The remaining slides of this slide show and other
slides will be discussed in a lecture centering of
“Data Storytelling”, in the second half of
November 2023…
 There will be also some limited coverage of Data
Science Ethics in November, 2023.
Ch. Eick Introduction to Data Science 8/28/2023
Data Science is More than Using Tools
 The problem with data is that it says a lot, but it also says nothing.
‘Big data’ is terrific, but it’s usually thin. To understand why
something is happening, we have to engage in both forensics and
guess work.”- Sendhil Mullainathan, Professor of economics,
Harvard
 “But a theory is not like an airline or bus timetable. We are not
interested simply in the accuracy of its predictions. A theory also
serves as a base for thinking. It helps us to understand what is
going on by enabling us to organize our thoughts. Faced with a
choice between a theory which predicts well but gives us little
insight into how the system works and one which gives us this
insight but predicts badly, I would choose the latter, and I am
inclined to think that most economists would do the same.”
― Ronald H. Coase, Essays on Economics and Economists
Data Analysis and Intelligent Systems Lab
Slides which follow were not covered in August 2023!
Ch. Eick Introduction to Data Science 8/28/2023
Data Storytelling is Currently “Hot”
Evidence:
 Quotations of Leaders in Data Science we
presented earlier
 Watch Commercials: https://www.tableau.com/solutions/customer/storytelling-data-0
 Popularity of TED Talks, most of which mostly
center on data storytelling.
 New productsaps: https://www.esri.com/arcgis-blog/story-maps/
 …Learn Data Story telling on Tableau 2022 in 7 mins - Bing video
 A lot of data storytelling contests
Data Analysis and Intelligent Systems Lab
Ch. Eick Introduction to Data Science 8/28/2023
Importance of Data Science
 “It is a capital mistake to theorize before one has
data.”- Arthur Conan Doyle, Author of Sherlock
Holmes
 If you’re a scientist, and you have to have an
answer, even in the absence of data, you’re not
going to be a good scientist.” – Neil deGrasse
Tyson, Astrophysicist
 “Without big data analytics, companies are blind
and deaf, wandering out onto the Web like deer
on a freeway.” – Geoffrey Moore, Partner at MDV
Data Analysis and Intelligent Systems Lab
Ch. Eick Introduction to Data Science 8/28/2023
Resposibilities of Data Scientists
1. We have to have some committment to telleth
„Torture the data long enough and it will confess to anything."
Nobel Prize winning economist Ronald Coase
“To find signals in data, we must learn to reduce the noise -
not just the noise that resides in the data, but also the noise
that resides in us. It is nearly impossible for noisy minds to
perceive anything but noise in data.”
― Stephen Few, Signal: Understanding What Matters in a
World of Noise
2. We have to know what we are doing
Data Analysis and Intelligent Systems Lab
Ch. Eick Introduction to Data Science 8/28/2023
COSC Data Science Curriculum
COSC 3337: Data Science II (starting in 2019)
Credit Hours: 3.0
Lecture Contact Hours: 3 Lab Contact Hours: 0
Prerequisite: ‘Data Structures’
Data science process, data preprocessing, exploratory data
analysis, data visualization, basic statistics, basic machine
learning concepts, classification and prediction, similarity
assessment, clustering, post-processing and interpreting data
analysis results, use of data analysis tools and programming
languages and data analysis case studies.
Data Analysis and Intelligent Systems Lab
Ch. Eick Introduction to Data Science 8/28/2023
26

More Related Content

Similar to Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. Descriptive statistics. Inferential statistics. Python Libraries for Data Science.

from_physics_to_data_science
from_physics_to_data_sciencefrom_physics_to_data_science
from_physics_to_data_scienceMartina Pugliese
 
Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...The Higher Education Academy
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
A Deep Dissertion Of Data Science Related Issues And Its Applications
A Deep Dissertion Of Data Science  Related Issues And Its ApplicationsA Deep Dissertion Of Data Science  Related Issues And Its Applications
A Deep Dissertion Of Data Science Related Issues And Its ApplicationsTracy Hill
 
Introduction to Big Data and Data Science
Introduction to Big Data and Data ScienceIntroduction to Big Data and Data Science
Introduction to Big Data and Data ScienceFeyzi R. Bagirov
 
Data Science & Analytics (light overview)
Data Science & Analytics (light overview) Data Science & Analytics (light overview)
Data Science & Analytics (light overview) Shalin Hai-Jew
 
Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생Han Woo PARK
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...Editor IJCATR
 
Sentiment Analysis In Retail Domain
Sentiment Analysis In Retail DomainSentiment Analysis In Retail Domain
Sentiment Analysis In Retail DomainEdureka!
 
International Collaboration Networks in the Emerging (Big) Data Science
International Collaboration Networks in the Emerging (Big) Data ScienceInternational Collaboration Networks in the Emerging (Big) Data Science
International Collaboration Networks in the Emerging (Big) Data Sciencedatasciencekorea
 
[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...
[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...
[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...DataScienceConferenc1
 
Making an impact with data science
Making an impact  with data scienceMaking an impact  with data science
Making an impact with data scienceJordan Engbers
 
A metadata scheme of the software-data relationship: A proposal
A metadata scheme of the software-data relationship: A proposalA metadata scheme of the software-data relationship: A proposal
A metadata scheme of the software-data relationship: A proposalKai Li
 
Ict와 사회과학지식간 학제간 연구동향(23 march2013)
Ict와 사회과학지식간 학제간 연구동향(23 march2013)Ict와 사회과학지식간 학제간 연구동향(23 march2013)
Ict와 사회과학지식간 학제간 연구동향(23 march2013)Han Woo PARK
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AlonePhilip Bourne
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIIJCSEA Journal
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIIJCSEA Journal
 

Similar to Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. Descriptive statistics. Inferential statistics. Python Libraries for Data Science. (20)

from_physics_to_data_science
from_physics_to_data_sciencefrom_physics_to_data_science
from_physics_to_data_science
 
Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
A Deep Dissertion Of Data Science Related Issues And Its Applications
A Deep Dissertion Of Data Science  Related Issues And Its ApplicationsA Deep Dissertion Of Data Science  Related Issues And Its Applications
A Deep Dissertion Of Data Science Related Issues And Its Applications
 
Lecture_1_Intro_toDS&AI.pptx
Lecture_1_Intro_toDS&AI.pptxLecture_1_Intro_toDS&AI.pptx
Lecture_1_Intro_toDS&AI.pptx
 
Introduction to Big Data and Data Science
Introduction to Big Data and Data ScienceIntroduction to Big Data and Data Science
Introduction to Big Data and Data Science
 
Data Science & Analytics (light overview)
Data Science & Analytics (light overview) Data Science & Analytics (light overview)
Data Science & Analytics (light overview)
 
Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
 
Sentiment Analysis In Retail Domain
Sentiment Analysis In Retail DomainSentiment Analysis In Retail Domain
Sentiment Analysis In Retail Domain
 
International Collaboration Networks in the Emerging (Big) Data Science
International Collaboration Networks in the Emerging (Big) Data ScienceInternational Collaboration Networks in the Emerging (Big) Data Science
International Collaboration Networks in the Emerging (Big) Data Science
 
From byte to mind
From byte to mindFrom byte to mind
From byte to mind
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...
[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...
[DSC Croatia 22] Writing scientific papers about data science projects - Mirj...
 
Making an impact with data science
Making an impact  with data scienceMaking an impact  with data science
Making an impact with data science
 
A metadata scheme of the software-data relationship: A proposal
A metadata scheme of the software-data relationship: A proposalA metadata scheme of the software-data relationship: A proposal
A metadata scheme of the software-data relationship: A proposal
 
Ict와 사회과학지식간 학제간 연구동향(23 march2013)
Ict와 사회과학지식간 학제간 연구동향(23 march2013)Ict와 사회과학지식간 학제간 연구동향(23 march2013)
Ict와 사회과학지식간 학제간 연구동향(23 march2013)
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not Alone
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 

Recently uploaded

(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...Call Girls in Nagpur High Profile
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 

Recently uploaded (20)

(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 

Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. Descriptive statistics. Inferential statistics. Python Libraries for Data Science.

  • 1. Ch. Eick Introduction to Data Science 8/28/2023 What is Data Science?  What words come to mind when you think of Data Science?  What experience do you have with Data Science?  Why are you taking the Data Science I class? 1
  • 2. Ch. Eick Introduction to Data Science 8/28/2023 Data Science Definitions Data Science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Data (KDD). Data Science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. Data science is the same concept as data mining and big data: "use the most powerful hardware, the most powerful programming systems, and the most efficient algorithms to solve problems" Data science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data technology“ (Wikipedia)
  • 3. Ch. Eick Introduction to Data Science 8/28/2023 Data Science Definitions II  There are many, but most say data science is: – Broad – broader than any one existing discipline – Interdisciplinary: Computer Science, Statistics, Information Science, Databases, Mathematics – Applied focus on extracting knowledge from data to inform decision making. – Focuses on the skills needed to collect, manage, store, distribute, analyze, visualize, reuse data and on data storytelling.  There are many visual representations in Data Science 3
  • 4. Ch. Eick Introduction to Data Science 8/28/2023 Some More Definitions 4
  • 5. Ch. Eick Introduction to Data Science 8/28/2023 More Definitions 5
  • 6. Ch. Eick Introduction to Data Science 8/28/2023 Data Science is Broad! 6
  • 7. Ch. Eick Introduction to Data Science 8/28/2023 Data Science Word Cloud 7
  • 8. Ch. Eick Introduction to Data Science 8/28/2023 Data Analysis  We analyze data to extract meaning from it.  Virtually all data analysis focuses on data reduction  Data reduction comes in the form of: – Descriptive statistics – Measures of association – Graphical visualizations  The objective is to abstract from all of the data some feature or set of features that captures evidence of the process you are studying 8
  • 9. Ch. Eick Introduction to Data Science 8/28/2023 The Data Lifecycle  Data science considers data at every stage of what is called the data lifecycle.  This lifecycle generally refers to everything from collecting data to analyzing it to sharing it so others can re-analyze it.  New visions of this process in particular focus on integrating every action that creates, analyzes, or otherwise touches data.  These same new visions treat the process as dynamic – archives are not just digital shoe boxes under the bed.  There are many representations of the this lifecycle. 9
  • 10. Ch. Eick Introduction to Data Science 8/28/2023 10
  • 11. Ch. Eick Introduction to Data Science 8/28/2023 Data curation is a term used to indicate management activities related to organization and integration of data collected from various sources, annotation of the data, and publication and presentation of the data such that the value of the data is maintained over time, and the data remains available for reuse and preservation. Data Curation
  • 12. Ch. Eick Introduction to Data Science 8/28/2023 News August 29, 2023  Task1 ProblemSet1 should be available on Sept. 5 or earlier; check course website!  Plans for today’s class – Finish Introduction to Data Science/Data Mining – 3p: Conduct Survey R/Python Knowledge of Students in the course – Data Science Basics & Exploratory Data Analysis  Plans for the August 31 class – Continue Data Science Basics & Exploratory Data Analysis – Preprocessing for DS/Data Mining – Optional: Brief Discussion of Task1 of ProblemSet1 – More Discussion GHC and Announce GHC Groups  There will be a lab as part of the Sept. 5 class; bring laptop! 12
  • 13. Ch. Eick Introduction to Data Science 8/28/2023 What is Missing?  Most definitions of data science underplay or leave out discussions of: – Substantive theory – Metadata – Privacy and Ethics 13
  • 14. Ch. Eick Introduction to Data Science 8/28/2023 Privacy and Ethics  Data, the elements of data science, and even so- called “Big Data” are not new.  One thing that is new is the greater variety of data and, most importantly, the amount of data available about humans.  Discussion and good policy regarding privacy, security, and the ethical use of data about people lags behind the methods of collecting, sharing, archiving, and analyzing data. 14
  • 15. Ch. Eick Introduction to Data Science 8/28/2023 Big Data  The launch of the Data Science conversation has been sparked primarily by the so-called “Big Data” revolution.  As mentioned, we have always had data that taxed our technical and computational capacities.  “Big Data” makes front-page news, however, because of the explosion of data about people.  Contemporary definitions of Big Data focus on: – Volume (the amount of data) – Velocity (the speed of data in and out) – Variety (the diverse types of data) 15
  • 16. Ch. Eick Introduction to Data Science 8/28/2023 16
  • 17. Ch. Eick Introduction to Data Science 8/28/2023 Talk Outline 1. Importance of Data Science 2. Data Science is More than Using Tools 3. Data Storytelling 4. Examples of Data Storytelling 5. Conclusion Data Analysis and Intelligent Systems Lab Data Science According to Swami Chandrasekaran
  • 18. Ch. Eick Introduction to Data Science 8/28/2023 Data Science and Storytelling  Google’s Chief Economist Dr. Hal R.Varian stated, "The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it— that’s going to be a hugely important skill in the next decades."  “When hiring data scientists, people tend to focus primarily on technical qualifications. It’s hard to find candidates who have the right mix of computational and statistical skills. But what’s even harder is finding people who have those skills and are good at communicating the story behind the data.” Michael Li Data Analysis and Intelligent Systems Lab
  • 19. Ch. Eick Introduction to Data Science 8/28/2023 Data Science and Storytelling 2  “Data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others.” – Mike Loukides, VP, O’Reilly Media  “Our challenge as data scientists is to translate this haystack of information into guidance for staff so they can make smart decisions…We “humanize” the data by turning raw numbers into a story about our performance. Data scientists want to believe that data has all the answers. But the most important part of our job is qualitative: asking questions, creating directives from our data, and telling its story. ” Jeff Bladtand and Bob Filbin  Telling Stories with Data in 3 Steps (Quick Study) - Bing video (showed this video on Aug. 24, 2023)  The Power in Effective Data Storytelling | Malavica Sridhar | TEDxUIUC - Bing video starting 6:40  20 of the Best Video Storytelling Examples EVER | Wyzowl  20 Best Data Storytelling Examples (updated for 2023) — Juice Analytics Data Analysis and Intelligent Systems Lab Not covered in August 2023, except we showed one video!
  • 20. Ch. Eick Introduction to Data Science 8/28/2023 Comment  This concludes the Introduction to Data Science/Data Mining  The remaining slides of this slide show and other slides will be discussed in a lecture centering of “Data Storytelling”, in the second half of November 2023…  There will be also some limited coverage of Data Science Ethics in November, 2023.
  • 21. Ch. Eick Introduction to Data Science 8/28/2023 Data Science is More than Using Tools  The problem with data is that it says a lot, but it also says nothing. ‘Big data’ is terrific, but it’s usually thin. To understand why something is happening, we have to engage in both forensics and guess work.”- Sendhil Mullainathan, Professor of economics, Harvard  “But a theory is not like an airline or bus timetable. We are not interested simply in the accuracy of its predictions. A theory also serves as a base for thinking. It helps us to understand what is going on by enabling us to organize our thoughts. Faced with a choice between a theory which predicts well but gives us little insight into how the system works and one which gives us this insight but predicts badly, I would choose the latter, and I am inclined to think that most economists would do the same.” ― Ronald H. Coase, Essays on Economics and Economists Data Analysis and Intelligent Systems Lab Slides which follow were not covered in August 2023!
  • 22. Ch. Eick Introduction to Data Science 8/28/2023 Data Storytelling is Currently “Hot” Evidence:  Quotations of Leaders in Data Science we presented earlier  Watch Commercials: https://www.tableau.com/solutions/customer/storytelling-data-0  Popularity of TED Talks, most of which mostly center on data storytelling.  New productsaps: https://www.esri.com/arcgis-blog/story-maps/  …Learn Data Story telling on Tableau 2022 in 7 mins - Bing video  A lot of data storytelling contests Data Analysis and Intelligent Systems Lab
  • 23. Ch. Eick Introduction to Data Science 8/28/2023 Importance of Data Science  “It is a capital mistake to theorize before one has data.”- Arthur Conan Doyle, Author of Sherlock Holmes  If you’re a scientist, and you have to have an answer, even in the absence of data, you’re not going to be a good scientist.” – Neil deGrasse Tyson, Astrophysicist  “Without big data analytics, companies are blind and deaf, wandering out onto the Web like deer on a freeway.” – Geoffrey Moore, Partner at MDV Data Analysis and Intelligent Systems Lab
  • 24. Ch. Eick Introduction to Data Science 8/28/2023 Resposibilities of Data Scientists 1. We have to have some committment to telleth „Torture the data long enough and it will confess to anything." Nobel Prize winning economist Ronald Coase “To find signals in data, we must learn to reduce the noise - not just the noise that resides in the data, but also the noise that resides in us. It is nearly impossible for noisy minds to perceive anything but noise in data.” ― Stephen Few, Signal: Understanding What Matters in a World of Noise 2. We have to know what we are doing Data Analysis and Intelligent Systems Lab
  • 25. Ch. Eick Introduction to Data Science 8/28/2023 COSC Data Science Curriculum COSC 3337: Data Science II (starting in 2019) Credit Hours: 3.0 Lecture Contact Hours: 3 Lab Contact Hours: 0 Prerequisite: ‘Data Structures’ Data science process, data preprocessing, exploratory data analysis, data visualization, basic statistics, basic machine learning concepts, classification and prediction, similarity assessment, clustering, post-processing and interpreting data analysis results, use of data analysis tools and programming languages and data analysis case studies. Data Analysis and Intelligent Systems Lab
  • 26. Ch. Eick Introduction to Data Science 8/28/2023 26

Editor's Notes

  1. r
  2. r
  3. r
  4. r
  5. r
  6. r
  7. r