Data science week_1
1. Data Science
Week 1. “Data revolution”
Introduction to data science
“The ascendance of data”
“Data, data, everywhere”
“Data deluge”
“Drowning in data”
What does all
this mean?!?
2. Week 1
Table of contents
1. Forms of data
i. Traditional (numerical data, textual data)
ii. Novel
2. Introduction to data science
i. Democratization of data science
ii. What is data science? Data scientist?
3. “Data driven development”
i. Data type by source and method of generation
ii. “Big data”, “open data”, “meta data” …
iii. Size of data
3. Data can be numbers
Team | Player | Age | Position | Apps | Mins | Goals | Assists | Yel | Red | SpG | PS% | AerialsWon | MotM | Rating
Manchester City | Kevin De Bruyne | 28 | M(CLR),FW | 15(1) | 1317 | 6 | 9 | 1 | - | 3 | 82.1 | 0.3 | 4 | 7.93
Leicester | James Maddison | 23 | AM(CLR) | 16 | 1415 | 5 | 3 | 1 | - | 2.8 | 83.8 | 0.3 | 4 | 7.73
Leicester | Ricardo Pereira | 26 | D(LR),M(R) | 17 | 1530 | 2 | 1 | 1 | - | 0.5 | 79.9 | 1.5 | 1 | 7.72
Liverpool | Sadio Mané | 27 | AM(CLR),FW | 15(1) | 1329 | 9 | 5 | 1 | - | 2.4 | 79.5 | 1.3 | 5 | 7.61
Leicester | Wilfred Ndidi | 23 | DMC | 16 | 1440 | 2 | - | 2 | - | 0.9 | 84 | 2.6 | 2 | 7.6
Wolverhampton Wanderers | Adama Traoré | 23 | M(R),FW | 14(2) | 1281 | 3 | 3 | - | - | 1 | 72.7 | 1.3 | 3 | 7.54
Leicester | Jamie Vardy | 32 | AM(L),FW | 17 | 1530 | 16 | 3 | 2 | - | 2.7 | 71.5 | 1.4 | 3 | 7.53
Manchester City | Raheem Sterling | 25 | M(CLR),FW | 16 | 1404 | 9 | 1 | 4 | - | 3.1 | 81.8 | 0.7 | - | 7.5
Manchester City | Riyad Mahrez | 28 | AM(CLR) | 7(6) | 680 | 4 | 4 | - | - | 1.8 | 89.7 | 0.2 | 2 | 7.46
Tottenham | Son Heung-Min | 27 | M(CLR),FW | 14(1) | 1238 | 5 | 7 | - | 1 | 2.7 | 85.9 | 0.4 | 3 | 7.46
Liverpool | Mohamed Salah | 27 | AM(CLR),FW | 14 | 1195 | 9 | 4 | - | - | 3.9 | 77.1 | 0.3 | 3 | 7.42
Wolverhampton Wanderers | Raúl Jiménez | 28 | FW | 17 | 1475 | 6 | 5 | 2 | - | 3.2 | 73.6 | 2 | 1 | 7.39
Liverpool | Virgil van Dijk | 28 | D(C) | 17 | 1530 | 3 | - | 1 | - | 0.8 | 87.6 | 5.4 | 3 | 7.39
Manchester City | Rodrigo | 23 | DMC | 13(2) | 1154 | 2 | 1 | 4 | - | 0.5 | 91.9 | 2.3 | 1 | 7.38
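As a minimal sketch of what treating this table as numerical data could look like, the Python snippet below copies four rows from the table and computes goals per 90 minutes and a mean rating. The choice of rows, the variable names, and the `goals_per_90` helper are illustrative assumptions, not part of the original slides.

```python
from statistics import mean

# Four rows copied from the table above: (player, minutes, goals, rating).
rows = [
    ("Kevin De Bruyne", 1317, 6, 7.93),
    ("James Maddison", 1415, 5, 7.73),
    ("Sadio Mané", 1329, 9, 7.61),
    ("Jamie Vardy", 1530, 16, 7.53),
]


def goals_per_90(minutes: int, goals: int) -> float:
    """Goals scored per 90 minutes played (hypothetical helper)."""
    return goals / minutes * 90


# Derived numerical quantities: a rate per player and an overall average.
rates = {player: goals_per_90(mins, goals) for player, mins, goals, _ in rows}
top_scorer = max(rates, key=rates.get)
average_rating = mean(row[3] for row in rows)

for player, rate in rates.items():
    print(f"{player}: {rate:.2f} goals per 90 minutes")
print(f"Highest scoring rate: {top_scorer}")
print(f"Mean rating of the sample: {average_rating:.2f}")
```

The point is only that numbers in a table can be combined arithmetically (rates, averages, rankings), which is the kind of analysis statistics has supported for decades.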
4. Data can be text
Team | Player | Age | Position | Apps | Mins | Goals | Assists | Yel | Red | SpG | PS% | AerialsWon | MotM | Rating
Manchester City | Kevin De Bruyne | 28 | M(CLR),FW | 15(1) | 1317 | 6 | 9 | 1 | - | 3 | 82.1 | 0.3 | 4 | 7.93
Leicester | James Maddison | 23 | AM(CLR) | 16 | 1415 | 5 | 3 | 1 | - | 2.8 | 83.8 | 0.3 | 4 | 7.73
Leicester | Ricardo Pereira | 26 | D(LR),M(R) | 17 | 1530 | 2 | 1 | 1 | - | 0.5 | 79.9 | 1.5 | 1 | 7.72
Liverpool | Sadio Mané | 27 | AM(CLR),FW | 15(1) | 1329 | 9 | 5 | 1 | - | 2.4 | 79.5 | 1.3 | 5 | 7.61
Leicester | Wilfred Ndidi | 23 | DMC | 16 | 1440 | 2 | - | 2 | - | 0.9 | 84 | 2.6 | 2 | 7.6
Wolverhampton Wanderers | Adama Traoré | 23 | M(R),FW | 14(2) | 1281 | 3 | 3 | - | - | 1 | 72.7 | 1.3 | 3 | 7.54
Leicester | Jamie Vardy | 32 | AM(L),FW | 17 | 1530 | 16 | 3 | 2 | - | 2.7 | 71.5 | 1.4 | 3 | 7.53
Manchester City | Raheem Sterling | 25 | M(CLR),FW | 16 | 1404 | 9 | 1 | 4 | - | 3.1 | 81.8 | 0.7 | - | 7.5
Manchester City | Riyad Mahrez | 28 | AM(CLR) | 7(6) | 680 | 4 | 4 | - | - | 1.8 | 89.7 | 0.2 | 2 | 7.46
Tottenham | Son Heung-Min | 27 | M(CLR),FW | 14(1) | 1238 | 5 | 7 | - | 1 | 2.7 | 85.9 | 0.4 | 3 | 7.46
Liverpool | Mohamed Salah | 27 | AM(CLR),FW | 14 | 1195 | 9 | 4 | - | - | 3.9 | 77.1 | 0.3 | 3 | 7.42
Wolverhampton Wanderers | Raúl Jiménez | 28 | FW | 17 | 1475 | 6 | 5 | 2 | - | 3.2 | 73.6 | 2 | 1 | 7.39
Liverpool | Virgil van Dijk | 28 | D(C) | 17 | 1530 | 3 | - | 1 | - | 0.8 | 87.6 | 5.4 | 3 | 7.39
Manchester City | Rodrigo | 23 | DMC | 13(2) | 1154 | 2 | 1 | 4 | - | 0.5 | 91.9 | 2.3 | 1 | 7.38
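The same table also contains text, such as team names. A minimal sketch, assuming we only want to count how often each team name appears, could use Python's standard `collections.Counter`; the variable names are illustrative and not from the slides.

```python
from collections import Counter

# Team column copied from the table above, one entry per player row.
teams = [
    "Manchester City", "Leicester", "Leicester", "Liverpool", "Leicester",
    "Wolverhampton Wanderers", "Leicester", "Manchester City",
    "Manchester City", "Tottenham", "Liverpool",
    "Wolverhampton Wanderers", "Liverpool", "Manchester City",
]

# Text values cannot be averaged, but they can be counted and grouped.
team_counts = Counter(teams)

for team, count in team_counts.most_common():
    print(f"{team}: {count} player(s)")
```

Counting categories like this is one of the simplest quantitative treatments of textual data.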
5. These two types have been analyzed
quantitatively (statistically) for decades
In the data science era, there are additional types
of data we can now use and analyze!
6. Data can be pictures (photograph, drawing, moving
images etc)
https://quickdraw.withgoogle.com/#
7. Data can be pictures
AI powered app Hananona
Take a picture of a flower and the app tells you the name of the flower
8. Data can be pictures
“Eigenfaces”
Source: Nev Acar. Eigenfaces: Recovering Humans from Ghosts. towardsdatascience.com
11. Data can be lots of texts
https://www.theguardian.com/books/booksblog/2017/dec/13/harry-potter-botnik-jk-rowling
12. Data can be location
COVID-19 cases in the United States
Source: Johns Hopkins University
13. Data can be movement of people
(origin-destination)
Visualization by ShinagawaJP@Twitter
Data courtesy of Agoop. Tokyo residents’ travel over 4 days in 2019.
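Origin-destination data can be thought of as a list of trips, each with a starting place and an ending place. The sketch below tallies such trips into origin-destination counts. The place labels "A", "B", "C" and the trips themselves are made up for illustration and do not come from the Agoop data.

```python
from collections import Counter

# Hypothetical trips as (origin, destination) pairs.
# These labels and counts are invented for illustration only.
trips = [
    ("A", "B"),
    ("A", "B"),
    ("B", "A"),
    ("A", "C"),
    ("C", "B"),
]

# An origin-destination table: how many trips went from each place to each other place.
od_counts = Counter(trips)


def flow(origin: str, destination: str) -> int:
    """Number of recorded trips from origin to destination (0 if none)."""
    return od_counts[(origin, destination)]


for (origin, destination), count in sorted(od_counts.items()):
    print(f"{origin} -> {destination}: {count} trip(s)")
```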
14. Data can be location and movement of, say, soccer
players on the field
Source: IEEE
15. Data can be handwriting from the 11th century
Can you read this? This is a very famous piece of writing.
Source: National Institute of Japanese Literature.
https://www.nijl.ac.jp/koten/kuzushiji/post-4.html
16. Kuzushiji data
Efforts to translate kuzushiji text into digital text using AI
Source: Sankei Shimbun
17. Introduction to Data Science
Democratization of data science
What is data science? Data scientist?
18. Cornelissen (2018). The Democratization of Data Science
• New uses for data science are found every day in _________________ sectors.
• Many organizations today relegate all data knowledge to a “handful of people.” How? _______________
• Why is such an approach problematic? ______________
• “Why would non-data scientists need to learn data science?” ____________________
• What three things does the author propose to democratize data science? ____________
19. Grus (2015)
• “Ascendance of Data”
– What generates data? __________. A piece of data is generated every time ________ occurs.
• What is Data Science? Data scientist?
“…a data scientist is someone who ____________________.” (p.2)
20. What is data science? Data scientist? (Grus, chapter 1)
“…a data scientist is someone who
extracts insights from messy data.”
(p.2)
21. What is data science? Robinson and Nolis
“Data science is the practice of using
data to try to understand and solve
real-world problems.” (p.5)
22. What is data science? Robinson and Nolis
No single person can do it all
25. World Bank. (2018). Data Driven Development
Chapter 1. Data: The Fuel of the Future
A data typology (p.2-3)
• Big data
• Personal data
• Open data
• Metadata
• Data platforms
26. World Bank. (2018). Data Driven Development
Chapter 1. Data: The Fuel of the Future
A data typology (p.2-)
Big data is characterized by its massive size and complexity
Table 1.2. Big data
Data agent | Data generation: Intentional | Data generation: Unintentional
Human | Primary content | Data exhaust
Machine | Secondary content | Internet of Things data
27. World Bank. (2018). Data Driven Development
Chapter 1. Data: The Fuel of the Future
A data typology (p.2-)
Open data are made available by both businesses and governments.
Example of private sector open data used by other businesses?
28. World Bank. (2018). Data Driven Development
Chapter 1. Data: The Fuel of the Future
A data typology (p.2-)
Metadata is “data about data”
A phone call’s main “data” is the content of the conversation.
Its “metadata” includes the date and time of call, the number
called, the duration of the call, etc.
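The phone call example can be sketched as a small data structure that keeps the content and the metadata apart. This is only an illustration: the class names, field names, phone number, and timestamps below are hypothetical and are not taken from the World Bank report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallMetadata:
    """Data about the call, not what was said (hypothetical fields)."""
    started_at: datetime      # date and time of the call
    number_called: str        # the number called
    duration: timedelta       # the duration of the call


@dataclass(frozen=True)
class PhoneCall:
    content: str              # the main "data": the conversation itself
    metadata: CallMetadata    # the "data about data"


# Invented example values, for illustration only.
call = PhoneCall(
    content="(the conversation itself)",
    metadata=CallMetadata(
        started_at=datetime(2020, 1, 15, 9, 30),
        number_called="555-0100",
        duration=timedelta(minutes=4, seconds=12),
    ),
)

# Metadata can be analyzed without ever looking at the content.
print(f"Called {call.metadata.number_called} "
      f"for {call.metadata.duration.total_seconds():.0f} seconds")
```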
29. World Bank. (2018). Data Driven Development
Chapter 1. Data: The Fuel of the Future
A data typology (p.2-): Data platform
Facebook, for example, is a
a) Peer-only data platform
b) Intranet data platform
c) Multisided data platform
Uber connects ______ with ______.
Airbnb connects ______ with ______.
Mercari connects ______ with ______.
Facebook connects ______ with ______.
30. World Bank. (2018). Data Driven Development
Chapter 1. Data: The Fuel of the Future
How governments use data (p.5-)
Transformation from e-government (1990s-, data
is just the payload of a transaction) to digital
government (2010-, data as strategic asset)
31. In summary, the “data revolution” means…
• There is a lot more data today (volume)
• There are new kinds of data today
(variety)
• Data accumulate rapidly (velocity)
• Human beings produce a lot more data
today (both intentional and unintentional)
• Data are increasingly generated and used
by non-humans (machines)
32. Additional background info
Size of data
• The smallest unit: a bit (a contraction of
“binary digit”)
• Eight (8) bits make up one (1) byte
• One letter of the Latin alphabet is one (1) byte (in ASCII or UTF-8)
• One Japanese (or Chinese, etc.) character
is two (2) or more bytes
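These sizes can be checked directly. The sketch below encodes a few characters and counts the resulting bytes. The exact byte count depends on the text encoding; UTF-8 and Shift_JIS are used here as example encodings chosen for illustration, and the `size_in_bytes` helper is hypothetical.

```python
BITS_PER_BYTE = 8


def size_in_bytes(text: str, encoding: str = "utf-8") -> int:
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


# One Latin letter and one Japanese character, in two encodings.
for char in ["A", "あ"]:
    for encoding in ["utf-8", "shift_jis"]:
        n_bytes = size_in_bytes(char, encoding)
        print(f"{char!r} in {encoding}: {n_bytes} byte(s), "
              f"{n_bytes * BITS_PER_BYTE} bits")
```

In this sketch the Latin letter takes one byte in both encodings, while the Japanese character takes two bytes in Shift_JIS and three in UTF-8, consistent with "two or more bytes."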
34. On Twitter (and other social networking sites), data begets data, which then begets even more data
Illustration using @realDonaldTrump
https://twitter.com/realDonaldTrump