This document discusses IoT and big data. It gives an overview of IoT, its impact, use cases that generate large amounts of data, and the challenges of data readiness. Key points: IoT connects physical objects so they can exchange data over networks, the number of IoT devices is expected to grow rapidly, and analyzing IoT data at scale in real time raises technical challenges around data storage, analytics infrastructure, and skills.
1. IoT and Big Data
Sanjay - Enterprise Data Architect and Analytics, Arrayent Inc.
@sabhub1
2. About Sanjay
– Enterprise Data Architect and Analytics, Arrayent Inc.
– More than 20 years in IT, mostly dealing with some form of data
– Worked at Apple, Accenture, Western Union
– Active open source junkie → download, compile, install, ask questions and help improve it
3. Agenda
• IoT
• Impact
• Use Cases
• Data
• Readiness
• Conclusion
4. What is IoT?
"The Internet of Things (IoT) is a scenario in which objects, animals or people are provided with unique identifiers and the ability to transfer data over a network without requiring human-to-human or human-to-computer interaction"
5. IoT?
The future is here. Like it or not, you will be sucked into it
6. We are all I(di)oTs?
• We called the big box the Idiot Box
• Now we all have a smaller version of the big box in our hands and pockets
• We are glued to it all the time
• We want more of it
• And we have had enough of it
• We started to think OOTB…
7. What Do We Want Next?
• We want everything around us connected.
• We want to control things around us
• We are asking for it
• We were actually doing it.
8. IoT Is a Large Space
All use cases you can think of
9. The Good, The Bad
• Good news is IoT is coming, it is happening
• Bad news is IoT is coming faster
• As per Gartner
– Internet of Things installed base will grow to 26 billion units by 2020
12. Use Cases
• Retail and Logistics
– Tracking of goods at item level is a feasible business case, with benefits including inventory accuracy, reduction of administrative overhead, automated customer check-out processes and a reliable anti-theft system.
– In-store beacons
14. Real-Time Actionable Insights
Devices:
– Sense what is happening
– Decide what to do
– Build your context
– Act quickly and consistently
• Context aware
• Predictive and rules driven
• Continuous, real-time, at scale
Latency needs are milliseconds or less
Sanjay.sabnis@arrayent.com
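The sense / build context / decide / act steps above can be sketched as a small rules-driven loop. This is only an illustrative sketch, not Arrayent's implementation: the `Reading`, `Rule` and `Engine` names, the `temp_c` metric, the thresholds and the action strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Reading:
    """One sensed value from a device (hypothetical shape)."""
    device_id: str
    metric: str
    value: float


@dataclass
class Rule:
    """A named condition over the new reading and the stored context."""
    name: str
    condition: Callable[[Reading, Dict[str, float]], bool]
    action: str


@dataclass
class Engine:
    rules: List[Rule]
    # Context: last seen value per "device/metric" key.
    context: Dict[str, float] = field(default_factory=dict)

    def handle(self, reading: Reading) -> List[str]:
        # Decide: evaluate every rule against the reading and prior context.
        actions = [r.action for r in self.rules if r.condition(reading, self.context)]
        # Build context: remember this value for the next event.
        self.context[f"{reading.device_id}/{reading.metric}"] = reading.value
        # Act: here we only return the action names; a real system would dispatch them.
        return actions


def too_hot(reading: Reading, context: Dict[str, float]) -> bool:
    # Simple threshold rule (30.0 is an assumed value).
    return reading.metric == "temp_c" and reading.value > 30.0


def rising_fast(reading: Reading, context: Dict[str, float]) -> bool:
    # Context-aware rule: compares against the previous reading, if any.
    previous = context.get(f"{reading.device_id}/{reading.metric}")
    return previous is not None and reading.value - previous > 5.0


def build_engine() -> Engine:
    return Engine(rules=[
        Rule("too_hot", too_hot, "turn_on_fan"),
        Rule("rising_fast", rising_fast, "send_alert"),
    ])
```

Keeping the context in memory and the rules as plain functions is one way to stay within a millisecond-level budget per event; predictive models could be plugged in as additional rule conditions.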
15. Are We Ready?
• Big Data like you have never seen before
• Gathering data from previously unexplored areas.
• It is not about the absence of data.
• It is about not missing out on data that you really need
16. Use Cases
• Assisted living
– Cost of nursing homes is increasing.
– Need for round-the-clock monitoring is challenging
• Smart Lighting
– Optimize use of street and building lights based on current conditions
• Traffic Monitoring
– Monitoring and analyzing traffic patterns to reroute drivers
• Waste Management
– Optimizing waste pickup by measuring container levels
• Security & Emergency Detection
– Detecting radiation, gases, and other hazardous conditions in real time.
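As a small illustration of the waste management use case above, the sketch below picks which containers to visit from their reported fill levels. The function name, the 0.0 to 1.0 fill scale and the 0.8 threshold are assumptions for the example, not something stated in the slides.

```python
from typing import Dict, List


def containers_to_collect(fill_levels: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """Return ids of containers at or above the threshold, fullest first.

    fill_levels maps a container id to a fill fraction between 0.0 and 1.0.
    """
    due = [(level, cid) for cid, level in fill_levels.items() if level >= threshold]
    # Sort by fill level descending, then by id so ties are deterministic.
    due.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cid for _, cid in due]
```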
17. Test
• Data centers will be overwhelmed by the data deluge
– Existing data center capacity will be put to the test
• A new focus on real-time analytics
– Existing real-time components will have to rethink what "real time" actually means in IoT
• Data retention/storage will reach a new level
– Cloud storage needs will surpass what we have now
• Network bandwidth
– New inventions are needed
• Privacy/security
– At the end of the day, the consumer will take control
• Skill set
– We need a lot more data engineers and scientists
• Data science
– Need "Data Science on the Wire"
• Value of data/time to react
– The window is getting shorter
19. What We Do
The Arrayent Platform enables trusted consumer brands to implement connected products and systems
20. How We Are Connected
• Home
• Device Virtualization
• 15,000 messages per second
• ~200 milliseconds round-trip latency
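To put the two figures above in perspective, here is a back-of-the-envelope calculation. It assumes the 15,000 messages per second is a sustained rate and that the round-trip latency approximates the time a message spends in the system; the slide states neither, so treat the results as rough orders of magnitude.

```python
MESSAGES_PER_SECOND = 15_000   # figure quoted on the slide
ROUND_TRIP_SECONDS = 0.200     # ~200 ms, figure quoted on the slide


def messages_per_day(rate_per_second: int) -> int:
    """Daily volume if the rate were sustained for 24 hours (an assumption)."""
    return rate_per_second * 60 * 60 * 24


def messages_in_flight(rate_per_second: float, latency_seconds: float) -> float:
    """Little's law, L = lambda * W: average number of messages in the system."""
    return rate_per_second * latency_seconds


if __name__ == "__main__":
    print(messages_per_day(MESSAGES_PER_SECOND))
    print(messages_in_flight(MESSAGES_PER_SECOND, ROUND_TRIP_SECONDS))
```

Under those assumptions this works out to roughly 1.3 billion messages per day, with about 3,000 messages in flight at any moment.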
21. Arrayent Data Architecture
Diagram labels: Real-Time Replication; Data Center 1, Data Center 2, Data Center 3; Batch Analytics; Quick Search/Query; Device Status; Region 1, Region 2, Region 3
Using Cassandra
• Isolated workloads
• Easy to manage
• Scalable
• Multi DC-region deployment
• Use network anycast
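One common way to get multi-data-center replication and workload isolation in Cassandra is a keyspace that uses `NetworkTopologyStrategy` with a replica count per data center, for example a separate data center that serves batch analytics. The helper below only builds the CQL text. The keyspace name, data center names and replica counts are hypothetical; the slides do not show Arrayent's actual schema.

```python
from typing import Dict


def create_keyspace_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement with per-data-center replica counts."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    # One entry per data center: name -> number of replicas kept there.
    parts += [f"'{dc}': {count}" for dc, count in replicas_per_dc.items()]
    options = ", ".join(parts)
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{options}}};"


if __name__ == "__main__":
    # Hypothetical layout: real-time traffic in one DC, batch analytics in another.
    print(create_keyspace_cql("device_status", {"dc_realtime": 3, "dc_analytics": 2}))
```

With a layout like this, analytics jobs can read from their own data center's replicas so they do not compete with latency-sensitive device traffic.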
22. Other Components
• Apache Usergrid
– Using it as RBAC
– It can do a lot more than that. It is a fancy data store running on Cassandra, highly scalable
• ELK Stack
– For massive log pipelining, storage, indexing and a pretty neat UI.
– Used in NOC
• Kafka
– For persistent distributed pub/sub
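What "persistent pub/sub" means can be shown with a toy, in-memory model: messages are appended to a retained log, and each consumer group tracks its own read offset, so a new or slow consumer can still read older messages. This is not Kafka's API and it is neither distributed nor durable; the `TopicLog` class and the group names are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List


class TopicLog:
    """Toy append-only topic with per-consumer-group offsets."""

    def __init__(self) -> None:
        self._messages: List[str] = []                        # retained log
        self._offsets: Dict[str, int] = defaultdict(int)      # next index per group

    def publish(self, message: str) -> int:
        """Append a message and return its offset in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, group: str, max_messages: int = 10) -> List[str]:
        """Return the next unread messages for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch
```

Because the log is retained rather than deleted on delivery, a group such as `"analytics"` can start reading from the beginning even after another group such as `"noc"` has consumed everything.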