Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
The Big Data - Same Humans Problem (CIDR 2015)
1. The Big Data – Same Humans
Problem
Alexandros Labrinidis
Advanced Data Management Technologies Lab
Department of Computer Science
University of Pittsburgh
CIDR 2015 – Gong Show – January 5, 2015
2. 2
You know Big Data is an
important problem if...
• It is featured on the cover of Nature and the
Economist!
3. 3
You know Big Data is an even
more important problem if...
• It has a Dilbert cartoon!
4. What is Big Data?
Definition #1:
• Big data is like teenage sex:
o everyone talks about it,
o nobody really knows how to do it,
o everyone thinks everyone else is doing it,
o so everyone claims they are doing it...
Definition #2:
• Anything that Won't Fit in Excel!
Definition #3:
• Using the Vs
4
5. If you have been living under a rock
for the last few years
5
The three Vs
6. • Volume - size does matter!
• Velocity - data at speed, i.e., the data
“fire-hose”
• Variety - heterogeneity is the rule
6
The three Vs
7. 7
Five more Vs
• Variability - rapid change of data characteristics
over time
• Veracity - ability to handle uncertainty,
inconsistency, etc
• Visibility – protect privacy and provide security
• Value – usefulness & ability to find the right-needle
in the stack
• Voracity - strong appetite for data!
8. 8
Enter Moore’s Law
[ Wikipedia Image ]
Moore's law is the observation that, over the
history of computing hardware, the number of
transistors in a dense integrated circuit doubles
approximately every two years. The law is
named after Gordon E. Moore, co-founder of
Intel Corporation, who described the trend in
his 1965 paper.
Source: http://en.wikipedia.org/wiki/Moore's_law
9. 9
Enter Bezos’ Law
Photo: http://www.slashgear.com/google-data-center-hd-photos-hit-where-the-internet-lives-gallery-17252451/
Bezos' law is the observation that, over
the history of cloud, a unit of computing
power price is reduced by 50% approximately
every 3 years
Source: http://blog.appzero.com/blog/futureofcloud
12. 12
We refer to this as the:
Big Data – Same Humans
Problem
13. Rethinking
13
Systems point of view Human point of view
• Response time
• Throughput
• Scale-up
• Scale-out
• Making sure humans
do not get lost in a
sea of data!
Scalability
14. Rethinking Scalability
14
Systems point of view Human point of view
• Fantastic! • Terrible!
Example:
Data Stream Management System processing
1,000,000 events per second
15. So now what?
• In addition to “traditional” scalability, it is
important to consider “big data” research
in:
o Summarization
o Personalization
o Ranking
o Recommender Systems
o Visual Analytics
o Crowd-sourcing
o …
15
18. Thank you
18
Further reading:
Big Data and Its Technical Challenges
http://bit.ly/bigdatachallenges
By Jagadish, Gehrke, Labrinidis, Papakonstantinou,
Patel, Ramakrishnan, and Shahabi,
Communications of the ACM, July 2014
Editor's Notes
We faced Big Data challenges for over four decades, though the meaning of “Big” has been evolving.
Unless you have been living under a rock for the last few years you must have already seen/heard the 3 Vs
Unless you have been living under a rock for the last few years you must have already seen/heard the 3 Vs
My take on the big data definition and problem is as follows