Big data burst upon the scene in the first decade of the 21st century, and the first organizations to embrace it were online and startup firms. Arguably, firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning. Here is a brief introduction to what big data entails and how it could affect businesses today.
Introduction to Big Data
2. Agenda
• What is Big Data
• Example of Big Data
• Drivers of Big Data: HIPO vs “Geeks”
• Potential of Big Data
Gan, Jeremy
eMOT | MG 8783: Cloud Computing
3. What is Big Data?
• Three V’s of Big Data
– Volume
– Velocity
– Variety
4. VOLUME: HOW MUCH DATA?
6. Volume: How Much Data?
• Kilo-: 10^3 bytes
• Mega-: 10^6 bytes
• Giga-: 10^9 bytes
• Tera-: 10^12 bytes
• Peta-: 10^15 bytes
• Exa-: 10^18 bytes
• Zetta-: 10^21 bytes
• Yotta-: 10^24 bytes
As of 2013
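The prefix scale on this slide can be turned into a small conversion routine. The sketch below is illustrative only: the helper name `humanize_bytes` and the table `SI_PREFIXES` are made up for this example, and it assumes decimal (power-of-ten) prefixes as listed on the slide rather than binary (power-of-two) units.

```python
# Decimal (SI) prefixes as listed on the slide, smallest to largest.
SI_PREFIXES = [
    ("kilo", 10**3),
    ("mega", 10**6),
    ("giga", 10**9),
    ("tera", 10**12),
    ("peta", 10**15),
    ("exa", 10**18),
    ("zetta", 10**21),
    ("yotta", 10**24),
]


def humanize_bytes(n: int) -> str:
    """Express a byte count using the largest prefix that fits (hypothetical helper)."""
    if n < 0:
        raise ValueError("byte count must be non-negative")
    label, scale = "", 1
    for name, factor in SI_PREFIXES:
        if n >= factor:
            label, scale = name, factor
        else:
            break  # prefixes are sorted, so no larger one can fit
    return f"{n / scale:.2f} {label}bytes"


if __name__ == "__main__":
    for value in (500, 2_500_000, 10**12, 3 * 10**21):
        print(value, "->", humanize_bytes(value))
```

Keeping the prefixes in a sorted list means the scale can be extended by appending one entry, without touching the function.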
7. Volume: How Much Data? (cont.)
HELLA- (~10^27 bytes), aka “HELLUVA-”
8. Volume: How Much Data? (cont.)
If we were to take all that information and store
it in books, we could cover the entire area of the
US or China in 3 layers of books.
Martin Hilbert, Researcher, USC
9. VELOCITY: IMMEDIATE & REACTIVE
(REAL-TIME DATA ANALYSIS)
10. NYSE collects over 1 TB of trade info EACH session
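To get a feel for what that volume means as velocity, the figure can be converted into an average data rate. This is a rough sketch under stated assumptions, not a measurement: it takes 1 TB as 10^12 bytes (a lower bound, since the slide says "over 1 TB") and assumes a 6.5-hour regular trading session. The function name is hypothetical.

```python
def average_rate_bytes_per_sec(total_bytes: float, session_hours: float) -> float:
    """Average throughput needed to ingest total_bytes evenly over one session."""
    if session_hours <= 0:
        raise ValueError("session length must be positive")
    return total_bytes / (session_hours * 3600)


if __name__ == "__main__":
    ONE_TB = 10**12        # assumption: decimal terabyte
    SESSION_HOURS = 6.5    # assumption: regular session, 9:30 to 16:00
    rate = average_rate_bytes_per_sec(ONE_TB, SESSION_HOURS)
    print(f"average rate: {rate / 1e6:.1f} MB/s")
```

Real trading traffic is bursty, so peak rates would be well above this average; the calculation only gives a floor.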
11. Modern cars have over a HUNDRED sensors
21. Variety: Data In What Form?
• Goal
– Identify patterns
– Gain insights
• Why?
– Combine big data with traditional data to better
understand pain points
– Mitigate/limit negative impact
– Increase/create revenue stream
22. THREE V’S + 1 = VERACITY
23. Role of Data Scientist
• Keep data organized and accurate
• Poor data management costs the U.S.
economy roughly $3.1 trillion/year
24. Role of Data Scientist (cont.)
• Data used correctly could unlock limitless
potential
– Prevent disease
– Combat crime
– Revolutionize global R&D
– Disrupt conventional business models
– Challenge the HIPO’s gut instinct
26. DRIVERS OF BIG DATA: HIPO VS “GEEKS” (EXAMPLE)
28. HIPO vs Geek
Michael Slaby, CTO, OFA 2008
Harper Reed, CTO, OFA 2012
29. Breakdown
• Innovative solution by leveraging big data
– Facebook information
• Personal interest: Preferences
• Location: Hyper-local, better content distribution
• Relevant: Contact efficiency
– Push innovation into sales by using data to have a
conversation
– Twitter
• DM via President and First Lady’s Twitter accounts
32. Limitless
• Research by McKinsey in Jan 2013
– Companies using large-scale big data to shape
corporate strategy
• Example:
– IBM acquiring Kenexa Corp.
» Cloud (SaaS foundation) + big data (market insights)
» Remove “guess work” – replacing it with precision
• Hiring – Utilize behavioral traits
• Research by Harvard School of Public Health
– Big data could help prevent TB and
reduce health care costs
33. Harper’s Thoughts on Healthcare.gov
Source: NYT.com
Insight into consumer spending is used to sell you targeted ads. Merchants gain access to your email address with every swipe so they can send you emails.
designed by a team of neuroscientists, psychologists, and data scientists to suss out human potential. Play one of them for just 20 minutes and you’ll generate several megabytes of data, exponentially more than what’s collected by the SAT. End result? A high-resolution portrait of your psyche and intellect, and an assessment of your potential as a leader or an innovator.