Managing Big Data: An Introduction to Data Intensive Computing
1. An Introduction to Data Intensive Computing
Chapter 2: Data Management
Robert Grossman, University of Chicago & Open Data Group
Collin Bennett, Open Data Group
November 14, 2011
2. 1. Introduction (0830-0900)
   a. Data clouds (e.g. Hadoop)
   b. Utility clouds (e.g. Amazon)
2. Managing Big Data (0900-0945)
   a. Databases
   b. Distributed file systems (e.g. Hadoop)
   c. NoSQL databases (e.g. HBase)
3. Processing Big Data (0945-1000 and 1030-1100)
   a. Multiple virtual machines & message queues
   b. MapReduce
   c. Streams over distributed file systems
4. Lab using Amazon's Elastic MapReduce (1100-1200)
3. What Are the Choices?
• Databases (SQL Server, Oracle, DB2)
• File systems
• Distributed file systems (Hadoop, Sector)
• Clustered file systems (GlusterFS, …)
• NoSQL databases (HBase, Accumulo, Cassandra, SimpleDB, …)
• Applications (R, SAS, Excel, etc.)
4. What Is the Fundamental Trade-Off?
Scale up vs. scale out …
6. Advice From Jim Gray
1. Analyzing big data requires scale-out solutions, not scale-up solutions (GrayWulf).
2. Move the analysis to the data.
3. Work with scientists to find the most common "20 queries" and make them fast.
4. Go from "working to working."
7. Pattern 1: Put the metadata in a database and point to files in a file system.
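A minimal sketch of Pattern 1 in Python, using the standard-library sqlite3 module as the metadata database. The table, columns, and file paths are illustrative assumptions, not part of the original slides.

```python
# Pattern 1 (illustrative sketch): metadata rows live in a relational
# database, while the bulky payloads stay as ordinary files in a file
# system. Table and column names here are hypothetical.
import sqlite3

conn = sqlite3.connect("catalog.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS observations (
           id        INTEGER PRIMARY KEY,
           target    TEXT,
           band      TEXT,
           taken_at  TEXT,
           file_path TEXT   -- pointer to the large file, not the data itself
       )"""
)

# Register a large image file by storing only its metadata and path.
conn.execute(
    "INSERT INTO observations (target, band, taken_at, file_path) VALUES (?, ?, ?, ?)",
    ("M31", "r", "2011-11-14", "/data/images/m31_r_20111114.fits"),
)
conn.commit()

# Queries run against the small metadata table; the application then
# opens the referenced files directly from the file system.
for target, path in conn.execute(
    "SELECT target, file_path FROM observations WHERE band = 'r'"
):
    print(target, "->", path)
```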
8. Example: Sloan Digital Sky Survey
• Two surveys in one
  – Photometric survey in 5 bands
  – Spectroscopic redshift survey
• Data is public
  – 40 TB of raw data
  – 5 TB of processed catalogs
  – 2.5 terapixels of images
• Catalog uses Microsoft SQL Server
• Started in 1992, finished in 2008
• JHU SkyServer serves millions of queries
10. [Architecture diagram: a GWT-based front end, database services, analysis pipelines & re-analysis services, data cloud services, data ingestion services, utility cloud services, and intercloud services.]
11. [The same architecture annotated with technologies: a GWT-based front end, database services (PostgreSQL), analysis pipelines & re-analysis services, large data cloud services (Hadoop, Sector/Sphere), data ingestion services, elastic cloud services (Eucalyptus, OpenStack), an ID service, and intercloud services (UDT, replication).]
13. Hadoop's Large Data Cloud
Hadoop's stack, from bottom to top:
• Storage services: Hadoop Distributed File System (HDFS)
• Compute services: Hadoop's MapReduce
• Data services: NoSQL databases
• Applications
14. Pattern 2: Put the data into a distributed file system.
15. Hadoop Design
• Designed to run over commodity components that fail.
• Data replicated, typically three times.
• Block-based storage (a toy placement sketch follows this list).
• Optimized for efficient scans with high throughput, not low-latency access.
• Designed for write once, read many.
• Append operation planned for the future.
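A toy sketch of the two design points above, block-based storage and three-way replication: a file is split into fixed-size blocks and each block is placed on three distinct nodes. This is not Hadoop code; the block size and node names are invented for illustration.

```python
# Toy illustration of block-based storage with 3x replication.
import random

BLOCK_SIZE = 64 * 1024 * 1024          # HDFS traditionally used 64 MB blocks
DATA_NODES = [f"datanode{i}" for i in range(1, 7)]
REPLICATION = 3

def place_blocks(file_size: int):
    """Return, for each block of the file, the nodes holding a replica."""
    num_blocks = (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE
    placements = []
    for _ in range(num_blocks):
        # Pick three distinct nodes for this block's replicas.
        placements.append(random.sample(DATA_NODES, REPLICATION))
    return placements

# A 200 MB file becomes 4 blocks, each stored on 3 of the 6 nodes, so the
# loss of any single commodity node never loses data.
for i, nodes in enumerate(place_blocks(200 * 1024 * 1024)):
    print(f"block {i}: {nodes}")
```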
16. Hadoop Distributed File System (HDFS) Architecture
[Diagram: a Name Node handles the control traffic from the client, while Data Nodes spread across racks carry the data traffic; a client-side usage sketch follows this slide.]
• HDFS is block-based.
• Written in Java.
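As a usage sketch of the control/data split above, the snippet below uses the third-party `hdfs` Python package (a WebHDFS client). That package, the namenode URL and port, the user, and the paths are all assumptions made for illustration; the slides do not mention them.

```python
# Hedged sketch: talking to HDFS from Python via WebHDFS.
# Assumes the third-party `hdfs` package (pip install hdfs); the URL,
# port, user, and paths below are placeholders.
from hdfs import InsecureClient

# The client sends control requests (create, list, locate blocks) to the
# Name Node; the block data itself flows to and from the Data Nodes.
client = InsecureClient("http://namenode.example.org:50070", user="hadoop")

client.write("/user/hadoop/example.txt", data=b"hello, hdfs\n", overwrite=True)

with client.read("/user/hadoop/example.txt") as reader:
    print(reader.read())

print(client.list("/user/hadoop"))
```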
17. Sector Distributed File System (SDFS) Architecture
• Broadly similar to the Google File System and the Hadoop Distributed File System.
• Uses the native file system; it is not block-based.
• Has a security server that provides authorizations.
• Has multiple master name servers so that there is no single point of failure.
• Uses UDT to support wide-area operations.
18. Sector Distributed File System (SDFS) Architecture
[Diagram: Master Nodes handle the control traffic from the client, with a Security Server alongside; Slave Nodes spread across racks carry the data traffic.]
• Sector is file-based.
• Written in C++.
• Security server.
• Multiple masters.
19. GlusterFS Architecture
• No metadata server.
• No single point of failure.
• Uses algorithms to determine the location of data (see the sketch after this list).
• Can scale out by adding more bricks.
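A toy sketch of the "algorithms, not a metadata server" idea: every client can compute which brick holds a file by hashing its path, so no central lookup is needed. This illustrates hash-based placement in general, not GlusterFS's actual elastic hashing; the brick names are invented.

```python
# Toy illustration of algorithmic (hash-based) file placement, in the
# spirit of GlusterFS's no-metadata-server design. Not GlusterFS code.
import hashlib

BRICKS = ["brick1", "brick2", "brick3", "brick4", "brick5", "brick6"]

def brick_for(path: str) -> str:
    """Any client can compute the brick for a path without asking a server."""
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return BRICKS[int(digest, 16) % len(BRICKS)]

for p in ["/genomes/sample1.fa", "/genomes/sample2.fa", "/logs/2011-11-14.log"]:
    print(p, "->", brick_for(p))

# Note: a plain modulo remaps many files when a brick is added; real
# systems use smarter hashing so that scaling out moves little data.
```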
20. GlusterFS Architecture
[Diagram: the client talks directly to GlusterFS Server bricks spread across racks; the data path is file-based.]
22. Evolution
• Standard architecture for simple web applications:
  – Presentation: front-end, load-balanced web servers
  – Business logic layer
  – Backend database
• The database layer does not scale with large numbers of users or large amounts of data.
• Alternatives arose:
  – Sharded (partitioned) databases or master-slave databases
  – memcache
23. Scaling RDBMSs
• Master-slave database systems
  – Writes go to the master.
  – Reads go to the slaves.
  – Writing to the slaves can be a bottleneck; slaves can be inconsistent.
• Sharded databases (see the routing sketch after this list)
  – Applications and queries must understand the sharding schema.
  – Both reads and writes scale.
  – No native, direct support for joins across shards.
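A minimal sketch of why "applications and queries must understand the sharding schema": the application itself hashes the shard key to pick a database, and a join across shards would have to be stitched together in application code. The shard count, table, and in-memory stand-in databases are hypothetical.

```python
# Illustrative sketch of application-side sharding.
import hashlib
import sqlite3

NUM_SHARDS = 4
# Stand-ins for four separate database servers.
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute("CREATE TABLE users (user_id TEXT PRIMARY KEY, name TEXT)")

def shard_for(user_id: str) -> sqlite3.Connection:
    """The application, not the database, maps a key to its shard."""
    h = int(hashlib.sha1(user_id.encode()).hexdigest(), 16)
    return shards[h % NUM_SHARDS]

def insert_user(user_id: str, name: str) -> None:
    shard_for(user_id).execute("INSERT INTO users VALUES (?, ?)", (user_id, name))

def get_user(user_id: str):
    cur = shard_for(user_id).execute(
        "SELECT name FROM users WHERE user_id = ?", (user_id,)
    )
    return cur.fetchone()

insert_user("u42", "Ada")
print(get_user("u42"))   # served by exactly one shard; cross-shard joins
                         # would have to be assembled in the application
```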
24. NoSQL Systems
• The name suggests no SQL support, but is also read as "Not Only SQL."
• One or more of the ACID properties is not supported.
• Joins are generally not supported.
• Usually flexible schemas.
• Some well-known examples: Google's BigTable, Amazon's Dynamo, and Facebook's Cassandra.
• Quite a few recent open source systems.
27. CAP – Choose Two Per Operation
C = Consistency, A = Availability, P = Partition-resiliency.
• CA: available and consistent, unless there is a partition.
• AP: a reachable replica provides service even in a partition, but may be inconsistent (e.g. Dynamo, Cassandra).
• CP: always consistent, even in a partition, but a reachable replica may deny service without a quorum (e.g. BigTable, HBase).
28. CAP Theorem
• Proposed by Eric Brewer, 2000.
• Three properties of a system: consistency, availability, and partitions.
• You can have at most two of these three properties for any shared-data system.
• Scale-out requires partitions.
• Most large web-based systems choose availability over consistency.
Reference: Brewer, PODC 2000; Gilbert/Lynch, SIGACT News 2002.
29. Eventual Consistency
• If no updates occur for a while, all updates eventually propagate through the system and all the nodes will be consistent.
• Eventually, a node is either updated or removed from service.
• Can be implemented with a gossip protocol (see the toy simulation after this list).
• Amazon's Dynamo popularized this approach.
• Sometimes this is called BASE (Basically Available, Soft state, Eventual consistency), as opposed to ACID.
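A toy simulation of the gossip idea: each round, every node exchanges its latest known (version, value) with a few random peers, and after enough quiet rounds all replicas converge. This is a last-write-wins sketch for illustration only, not Dynamo's actual protocol; the node count and fan-out are arbitrary.

```python
# Toy gossip / eventual-consistency simulation (last-write-wins).
import random

NUM_NODES = 8
FANOUT = 2

# Each node stores (version, value) for a single key.
nodes = [(0, "old") for _ in range(NUM_NODES)]

# A write lands on one replica first.
nodes[3] = (1, "new")

def gossip_round(nodes):
    updated = list(nodes)
    for i in range(len(nodes)):
        for j in random.sample(range(len(nodes)), FANOUT):
            # Exchange states; the higher version wins on both sides.
            winner = max(updated[i], updated[j])
            updated[i] = updated[j] = winner
    return updated

rounds = 0
while any(state != (1, "new") for state in nodes):
    nodes = gossip_round(nodes)
    rounds += 1

print(f"all {NUM_NODES} replicas converged after {rounds} gossip rounds")
```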
30. Different Types of NoSQL Systems
• Distributed key-value systems
  – Amazon's S3 key-value store (Dynamo)
  – Voldemort
  – Cassandra
• Column-based systems
  – BigTable
  – HBase
  – Cassandra
• Document-based systems
  – CouchDB
31. HBase Architecture
[Diagram: many clients reach the system through a REST API or a Java client; an HBaseMaster coordinates several HRegionServers, each backed by its own disk. Source: Raghu Ramakrishnan.]
32. HRegion Server
• Records are partitioned by column family into HStores.
  – Each HStore contains many MapFiles.
• All writes to an HStore are applied to a single memcache.
• Reads consult the MapFiles and the memcache.
• Memcaches are flushed to disk as MapFiles (HDFS files) when full (a simplified sketch follows this slide).
• Compactions limit the number of MapFiles.
[Diagram: within an HRegionServer, writes go to the memcache, reads consult the memcache and the HStore's MapFiles, and the memcache is flushed to disk. Source: Raghu Ramakrishnan.]
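A simplified sketch of the write path described above, written as generic Python rather than HBase code: writes go to an in-memory buffer (the "memcache"), which is flushed to an immutable sorted file when it fills, and a compaction merges the files to keep their number bounded. The thresholds and file handling are invented for illustration.

```python
# Simplified, generic sketch of the HStore-style write path:
# writes -> in-memory buffer -> flushed as sorted immutable "MapFiles"
# -> periodic compaction merges them. Not HBase's actual classes.
FLUSH_THRESHOLD = 3      # flush after this many buffered writes (arbitrary)
MAX_FILES = 2            # compact when more MapFiles than this exist

memcache = {}            # in-memory writes for one column family
mapfiles = []            # each "MapFile" is an immutable sorted list of (key, value)

def put(key, value):
    memcache[key] = value
    if len(memcache) >= FLUSH_THRESHOLD:
        flush()

def flush():
    mapfiles.append(sorted(memcache.items()))   # write out a sorted file
    memcache.clear()
    if len(mapfiles) > MAX_FILES:
        compact()

def compact():
    merged = {}
    for f in mapfiles:                          # older files first,
        merged.update(dict(f))                  # newer values overwrite
    mapfiles[:] = [sorted(merged.items())]

def get(key):
    if key in memcache:                         # reads consult the memcache...
        return memcache[key]
    for f in reversed(mapfiles):                # ...then the MapFiles, newest first
        d = dict(f)
        if key in d:
            return d[key]
    return None

for i in range(10):
    put(f"row{i}", f"value{i}")
print(get("row5"), len(mapfiles), "MapFile(s) after compaction")
```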
33. Facebook's Cassandra
• Modeled after BigTable's data model.
• Modeled after Dynamo's eventual consistency.
• Peer-to-peer storage architecture using consistent hashing (Chord-style hashing); a minimal ring sketch follows.
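A minimal sketch of consistent hashing, the Chord-style placement named above: nodes and keys hash onto the same ring, and a key belongs to the first node clockwise from it, so adding or removing a node only moves the keys adjacent to it. The node names and the choice of hash are illustrative.

```python
# Minimal consistent-hashing ring (Chord-style placement sketch).
import bisect
import hashlib

def ring_position(name: str) -> int:
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((ring_position(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        """A key belongs to the first node clockwise from its hash."""
        pos = ring_position(key)
        i = bisect.bisect(self.points, (pos, ""))
        return self.points[i % len(self.points)][1]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
for k in ["user:1", "user:2", "user:3"]:
    print(k, "->", ring.node_for(k))

# Adding a node only remaps the keys that now fall just before it on the
# ring; everything else keeps its old placement.
```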
34. Databases vs. NoSQL Systems
• Scalability: databases – 100's of TB; NoSQL systems – 100's of PB.
• Functionality: databases – full SQL-based queries, including joins; NoSQL systems – optimized access to sorted tables (tables with single keys).
• Optimization: databases are optimized for safe writes; NoSQL clouds are optimized for efficient reads.
• Consistency model: databases – ACID (Atomicity, Consistency, Isolation & Durability), the database is always consistent; NoSQL systems – eventual consistency, updates eventually propagate through the system.
• Parallelism: databases – difficult because of the ACID model, though shared-nothing is possible; NoSQL systems – the basic design incorporates parallelism over commodity components.
• Scale: databases – racks; NoSQL systems – a data center.
41. Pattern 4: Put the data into a distributed key-value store.
42. S3 Buckets
• S3 bucket names must be unique across AWS.
• A good practice is to use a pattern like tutorial.osdc.org/dataset1.txt for a domain you own.
• The file is then referenced as tutorial.osdc.org.s3.amazonaws.com/dataset1.txt.
• If you own osdc.org, you can create a DNS CNAME entry to access the file as tutorial.osdc.org/dataset1.txt.
43. S3 Keys
• Keys must be unique within a bucket.
• Values can be as large as 5 TB (formerly 5 GB).
44. S3 Security
• AWS access key (user name)
  – This functions as your S3 username. It is an alphanumeric text string that uniquely identifies users.
• AWS secret key (functions as a password)
(A minimal access sketch follows.)
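A minimal sketch tying the three S3 slides together, using the boto3 Python SDK, which is an assumption here (it post-dates these slides). The bucket name and key are placeholders, and the access/secret keys are read from the standard environment variables rather than hard-coded.

```python
# Hedged sketch of the S3 concepts above using boto3 (pip install boto3).
# The AWS access key and secret key are picked up from AWS_ACCESS_KEY_ID
# and AWS_SECRET_ACCESS_KEY; bucket and key names are placeholders.
import boto3

s3 = boto3.client("s3")

# Bucket names are global across AWS, so a domain-style name avoids clashes.
bucket = "tutorial.osdc.org"          # placeholder; must be globally unique
s3.create_bucket(Bucket=bucket)

# Keys only need to be unique within the bucket; values (objects) can be
# up to 5 TB each.
s3.put_object(Bucket=bucket, Key="dataset1.txt", Body=b"example contents\n")

obj = s3.get_object(Bucket=bucket, Key="dataset1.txt")
print(obj["Body"].read())
```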
49. The Basic Problem
• TCP was never designed to move large data sets over wide-area, high-performance networks.
• As a general rule, reading data off disks is slower than transporting it over the network.
50. TCP Throughput vs. RTT and Packet Loss
[Plot: TCP throughput (Mb/s) falls as the round-trip time (ms) grows from LAN distances to US-EU and US-Asia distances, and falls further as packet loss rises from 0.01% to 0.5%. Source: Yunhong Gu, 2007, experiments over a wide-area 1G network.]
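The shape of that plot is captured by the well-known Mathis et al. approximation for steady-state TCP throughput, reproduced here for context (it is not on the original slide): throughput falls inversely with round-trip time and with the square root of the loss rate.

```latex
% Mathis et al. (1997) approximation for steady-state TCP throughput:
% MSS is the segment size, RTT the round-trip time, p the packet loss rate.
\[
  \text{throughput} \;\lesssim\; \frac{\text{MSS}}{\text{RTT}} \cdot \frac{C}{\sqrt{p}},
  \qquad C \approx 1.22 .
\]
% For example, with MSS = 1460 bytes, RTT = 200 ms, and p = 0.1%,
% this gives roughly 2.3 Mb/s -- far below a 1 Gb/s link.
```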
51. The Solution
• Use parallel TCP streams
  – GridFTP
• Use specialized network protocols
  – UDT, FAST, etc.
• Use RAID to stripe data across disks to improve throughput when reading.
• These techniques are well understood in HEP and astronomy, but not yet in biology.
52. Case Study: Bio-mirror
"[The open source GridFTP] from the Globus project has recently been improved to offer UDP-based file transport, with long-distance speed improvements of 3x to 10x over the usual TCP-based file transport."
– Don Gilbert, August 2010, bio-mirror.net
53. Moving 113 GB of Bio-mirror Data

Site      RTT (ms)   TCP    UDT   TCP/UDT      Km
NCSA          10      139    139      1        200
Purdue        17      125    125      1        500
ORNL          25      361    120      3      1,200
TACC          37      616    120      5      2,000
SDSC          65      750    475      1.6    3,300
CSTNET       274     3722    304     12     12,000

GridFTP TCP and UDT transfer times for 113 GB from gridip.bio-mirror.net/biomirror/blast/ (Indiana, USA). All TCP and UDT times are in minutes. Source: http://gridip.bio-mirror.net/biomirror/
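As a rough sanity check on the table (my arithmetic, not the source's, taking 1 GB = 8 Gb): converting 113 GB and the listed minutes into an effective rate shows why the UDT column matters at long round-trip times.

```latex
% Effective throughput for 113 GB (~904 Gb):
\[
  \text{NCSA (TCP or UDT):}\quad
  \frac{904\ \text{Gb}}{139 \times 60\ \text{s}} \approx 108\ \text{Mb/s}
\]
\[
  \text{CSTNET:}\quad
  \frac{904\ \text{Gb}}{3722 \times 60\ \text{s}} \approx 4\ \text{Mb/s (TCP)}
  \qquad
  \frac{904\ \text{Gb}}{304 \times 60\ \text{s}} \approx 50\ \text{Mb/s (UDT)}
\]
```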
54. Case Study: CGI 60 Genomes
• Trace by Complete Genomics showing the performance of moving 60 complete human genomes from Mountain View to Chicago using the open source Sector/UDT.
• Approximately 18 TB at about 0.5 Gb/s on a 1G link.
Source: Complete Genomics.
55. Resource Use

Protocol         CPU Usage*      Memory*
GridFTP (UDT)    1.0% - 3.0%     40 MB
GridFTP (TCP)    0.1% - 0.6%      6 MB

*CPU and memory usage collected by Don Gilbert. He reports that rsync uses more CPU than GridFTP with UDT. Source: http://gridip.bio-mirror.net/biomirror/.
56. Sector/Sphere
• Sector/Sphere is a platform for data intensive computing built over UDT and designed to support geographically distributed clusters.
57. Questions?
For the most current version of these notes, see rgrossman.com.