2. Contents
• What does “Big Data” really mean?
• Big Data use cases
• Considerations when building your project/application
• Hosting options and Big Data challenges
• Operations-as-a-Service
• Customer close-up
3. “Big data is a collection of [unstructured] data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis.”
- Lisa Arthur, Forbes
4. Big Data use cases
DATA:
• Make unstructured info transparent and usable at much higher frequency
• Precisely tailor products/services for better analysis and segmentation
• Improve development of next gen products/services
• Create and store unstructured transactional data
5. Planning your build
When you’re building big data applications, you have to have a view of the complete Stack
The Stack
6. Requirements of Big Data Applications
• Big Data is power hungry
• 10 or 40Gbps networks at a minimum
• Big Data is distributed
• Big Data is monitoring intensive
– Requires accurate, specific and frequent diagnostics to run properly
• Big Data apps require tons of memory and storage
• Application Support Tools
7. What do you need for the back end?
Take a big task and divide it into smaller, discrete tasks that can be carried out in parallel
In the cloud, your data could be spread across multiple servers
Because of this complexity, the task needs to be divided into smaller tasks
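The idea of dividing one big task into smaller, discrete tasks that run in parallel can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word-count job, the function names, and the use of local threads in place of separate servers are all hypothetical and do not come from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Divide one big list of work items into smaller, discrete chunks."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def count_words(lines):
    """Hypothetical small task: count the words in one chunk of lines."""
    return sum(len(line.split()) for line in lines)


def count_words_in_parallel(lines, chunk_size=2, workers=2):
    """Run the small tasks concurrently, then combine the partial results."""
    chunks = split_into_chunks(lines, chunk_size)
    # Threads stand in for the servers a real deployment would use.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    return sum(partial_counts)


if __name__ == "__main__":
    sample = ["big data is distributed", "divide the task", "run in parallel"]
    print(count_words_in_parallel(sample))
```

In a real cluster the chunks would live on different machines, which is why the network between them matters as much as the compute.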
8. Choosing a hosting option for your project
In-house vs. Cloud vs. Colocation vs. Dedicated Managed Hosting (Operations as a Service)
10. In-house – What you get
• Purpose built system (custom design) = Fast!
• Minimal Packet Loss, Jitter and Latency
• Single Tenant
• Reduced/No Server or Data Sprawl
• Transparent Infrastructure
• 10 or 40Gbps Network
11. Challenges of Big Data w/ In-House hosting
• Do you have the experience and knowledge to design, build and maintain the network?
• Have you thought about the total costs?
– Data center costs
– Equipment costs
– Staffing costs
– Application Support costs
• Did you factor in application support tools?
• Do you want to be an internet plumber?
12. Cloud – what you get
• Quick spin-up time
• Lower equipment costs
• Lower personnel costs for infrastructure support
13. Challenges of Big Data in a Cloud environment
• Would your operations be adversely affected by packet loss, jitter and latency?
• Do you want to share resources with other companies on a system that’s designed to be big, but not fast?
• Does your data need to be “in one place”?
• Distributed data puts a stress on the network that most cloud environments were not designed for
14. Challenges of Big Data in a Cloud environment
• Is the cloud provider capable of providing the intensive monitoring needed by Big Data applications?
– Requires accurate, specific and frequent diagnostics to run properly
– The privacy of the cloud works against efficiency
15. Colocation Hosting – what you get
• Lower equipment costs
• Control over non-data center infrastructure (servers, network, etc.)
• Not responsible for data center design, build or maintenance
• No tech support for equipment
• Single-tenancy
16. Challenges of Big Data in a Colocation Environment
• Do you want to be responsible for all non-data center support?
• Are you comfortable with having no application support?
• Does the provider custom-design your architecture, or rely on a ‘one size fits most’ deployment?
• What hardware is single-tenant, and what is multi-tenant/shared, and would the shared elements impact your operations?
17. Operations-as-a-Service (Dedicated Managed Hosting)
Feature | In-House | Cloud | Colocation | OaaS via Peak Hosting
Minimal Packet Loss, Jitter and Latency | Yes | No | Maybe | Yes
Single Tenant | Yes | No | Yes | Yes
Reduced/No Server or Data Sprawl | Yes | No | Maybe | Yes
DC Techs Supplied | No | Yes | Yes | Yes
SysAdmin Supplied | No | No | No | Yes
Transparent Infrastructure | Yes | No | Yes | Yes
Custom Design | Yes | Yes | Maybe | Yes
10 or 40Gbps Network | Yes | ? | ? | Yes
Application Support tools | No | No | No | Yes
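One way to work with the comparison on this slide is to encode it as data and filter the hosting options by the features a project requires. The sketch below is illustrative only: the names `OPTIONS`, `MATRIX` and `options_meeting` are hypothetical, and treating "Maybe" and "?" as "not guaranteed" is an assumption, not something the slide states.

```python
# Comparison from the slide, encoded as data.
# True = yes, False = no, None = "Maybe" or "?" (treated as not guaranteed).
OPTIONS = ["In-House", "Cloud", "Colocation", "OaaS"]

MATRIX = {
    "Minimal packet loss, jitter and latency": [True, False, None, True],
    "Single tenant": [True, False, True, True],
    "Reduced/no server or data sprawl": [True, False, None, True],
    "DC techs supplied": [False, True, True, True],
    "SysAdmin supplied": [False, False, False, True],
    "Transparent infrastructure": [True, False, True, True],
    "Custom design": [True, True, None, True],
    "10 or 40Gbps network": [True, None, None, True],
    "Application support tools": [False, False, False, True],
}


def options_meeting(requirements):
    """Return the hosting options marked 'yes' for every required feature."""
    matches = []
    for column, option in enumerate(OPTIONS):
        if all(MATRIX[feature][column] is True for feature in requirements):
            matches.append(option)
    return matches


if __name__ == "__main__":
    print(options_meeting(["Single tenant", "DC techs supplied"]))
```

A stricter or looser reading of "Maybe" would change the results, so the filter is a starting point for discussion with a provider, not a verdict.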
18. Peak Hosting Customer Close-up
Big social data analytics company, delivering advanced social intelligence and real-time threat detection across the consumer packaged goods, food and beverage, media and entertainment and pharmaceutical industries.
Akuda Labs’ Pulsar real-time streaming classification engine available, currently processing 5 Billion SCOPS (was 500 million when they came to Peak Hosting) for their product, ListenLogic
19. The search
• Needs:
– At least 1 Billion SCOPS processing power to run Hadoop-level, deep dive questions
– Answers in real-time
Build vs. Buy?
Cloud
• Not an option due to shared and distributed infrastructure in a cloud environment
DIY (Build)
• Total control
• EXPENSIVE $$$
– HW
– Staffing
Dedicated Managed Hosting
• Their best option
• Now, which provider?
20. The choice
• Best performing hardware
• Fast network
• Customized Infrastructure – designed specifically for Akuda Labs
• Technical Support staff
• Operations-as-a-Service
21. What we did
2012:
• Provided 40-50 servers – 24 & 34 core machines w/ 128GB RAM
2013:
• Akuda upgrades to 64-core servers w/ 512GB RAM
• Still only 40-50 servers
• Connected via dual 10Gbps networking
Pool servers for customers and simply add more servers to the pool as needed – rather than deploy a new cluster per customer
New Abilities
• Process 100X the data they previously could
• Easily process 500 million SCOPS, with the ability to process 50 billion if they had enough data
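The figures on this slide can be turned into a quick capacity estimate. The sketch below assumes every server in a given year carried the stated amount of RAM, which the slide does not confirm; the function and variable names are illustrative.

```python
def total_ram_gb(server_count, ram_per_server_gb):
    """Aggregate RAM across a pool of identically sized servers."""
    return server_count * ram_per_server_gb


# 2012: 40-50 servers with 128GB RAM each (low and high end of the range).
ram_2012 = (total_ram_gb(40, 128), total_ram_gb(50, 128))

# 2013: still 40-50 servers, now with 512GB RAM each.
ram_2013 = (total_ram_gb(40, 512), total_ram_gb(50, 512))

# Per-server memory growth with no change in server count.
ram_growth_factor = 512 // 128

# Headroom between the stated workload and the stated ceiling, in SCOPS.
scops_headroom = 50_000_000_000 // 500_000_000

if __name__ == "__main__":
    print("2012 pool RAM (GB):", ram_2012)
    print("2013 pool RAM (GB):", ram_2013)
    print("RAM growth per server:", ram_growth_factor)
    print("SCOPS headroom factor:", scops_headroom)
```

Under these assumptions the pool's memory grew fourfold at a constant server count, and the stated 50 billion SCOPS ceiling is 100 times the stated 500 million SCOPS workload.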
22. The ROI
Better Efficiency, Better Service, Better Economics, More Productivity
• Trim server count by 20%
• Schedule tasks on-demand instead of waiting for resources
• Better performance, higher levels of customization and productivity
• All while paying 30% less than with previous provider
• Worked together to design, build, maintain, and support current infrastructure