2. 2
Sqrrl Data, Inc., All Rights Reserved
General Data Problems
Problem
<5% of Data
Solution
Source: Forrester
3. 3
What about security?
4. 4
What is the market saying?
• security becomes an “enabler” by making it possible to bring together huge stores of data
• You want security to be just as scalable, high-performance and self-organizing as the clusters
• most big data technologies don’t have any security features built in
• want fine-grained security and policy control at the database-level
5. 5
A few more risks…
• With every copy of data, there is an increased risk of unintended disclosure
• Every now and then people with access and privileges take a look at records without a legitimate business purpose, e.g., an employee of a banking system looking up their neighbor
6. 6
The Perfect Storm
[Figure labels: Security Analysis; Customer Support; Customer Profiles; Sales & Marketing; Social Media; Business Improvement; Big Data; Regulations & Breaches; Increased profits (repeated six times)]
7. 7
The Perfect Storm
• Big Data is a time bomb, based on how things are coming together
• Big Data deployment is growing fast; organizations are rushing into it
• Shortage in Big Data skills
• Big Data security solutions are not effective
• General shortage in security skills
8. 8
So what can we do?
9. 9
Data-Centric Security
(Def.) A form of security in which data carries with it the elements of provenance that are required to make policy decisions on its visibility:
• Separate data modeling for security and analysis
• Data comes with security attributes governing its visibility; data is self-describing
• Reusability of applications across security domains
• Distributed development of ingest and query applications
• Supported by Accumulo’s cell-level security
10. 10
Data-Centric Security
Within Accumulo, a key is a 5-tuple, consisting of:
• Row: Controls Atomicity
• Column Family: Controls Locality
• Column Qualifier: Controls Uniqueness
• Visibility Label: Controls Access
• Timestamp: Controls Versioning
Accumulo Key/Value Example
Row       Col. Fam.     Col. Qual.     Visibility   Timestamp  Value
John Doe  Notes         PCP            PCP_JD       20120912   Patient suffers from an acute …
John Doe  Test Results  Cholesterol    JD|PCP_JD    20120912   183
John Doe  Test Results  Mental Health  JD|PSYCH_JD  20120801   Pass
John Doe  Test Results  X-Ray          JD|PHYS_JD   20120513   1010110110100…
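The key structure and visibility labels in this example can be illustrated with a minimal Python sketch. This is a toy model, not Accumulo's API: the `Key` class, `is_visible`, and `visible_qualifiers` are hypothetical names introduced here, and the evaluator only handles labels joined by `|` (OR), which is all the example table uses. It also assumes that an empty visibility expression means the entry is unrestricted.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    """Toy model of the 5-tuple key from the slide (not Accumulo's API)."""
    row: str
    column_family: str
    column_qualifier: str
    visibility: str
    timestamp: int


def is_visible(expression: str, authorizations: set[str]) -> bool:
    """Return True if any '|'-separated label is among the authorizations.

    Simplification: only OR ('|') is supported; no '&' or parentheses.
    Assumption: an empty expression means the entry is unrestricted.
    """
    if not expression.strip():
        return True
    return any(label.strip() in authorizations for label in expression.split("|"))


# The four entries from the example table, as (key, value) pairs.
ENTRIES = [
    (Key("John Doe", "Notes", "PCP", "PCP_JD", 20120912),
     "Patient suffers from an acute …"),
    (Key("John Doe", "Test Results", "Cholesterol", "JD|PCP_JD", 20120912), "183"),
    (Key("John Doe", "Test Results", "Mental Health", "JD|PSYCH_JD", 20120801), "Pass"),
    (Key("John Doe", "Test Results", "X-Ray", "JD|PHYS_JD", 20120513), "1010110110100…"),
]


def visible_qualifiers(authorizations: set[str]) -> list[str]:
    """List the column qualifiers a reader with these authorizations can see."""
    return [
        key.column_qualifier
        for key, _value in ENTRIES
        if is_visible(key.visibility, authorizations)
    ]


if __name__ == "__main__":
    print(visible_qualifiers({"PCP_JD"}))
    print(visible_qualifiers({"JD"}))
```

In this sketch a reader holding only the `PCP_JD` authorization sees the PCP notes and the cholesterol result, while a reader holding only `JD` sees the three test results but not the notes.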
11. 11
Data-Centric Security
12. 12
Data-Centric Security
Row Col Value
1 Name Jones
1 Sales 100
1 Age 28
2 Name Smith
2 Sales 350
2 Age 25
2 Quota 1000

Row Col Value
1 Name Anon1
1 Sales 100
2 Name Smith
2 Sales 350
2 Quota 1000

[Figure labels: User 1, User 2, Data Store]

The data-centric security approach allows all the data to be stored on a single platform, and only authorized data is returned to the user.
Pushing security to the data level simplifies application development and enables more powerful queries.
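The idea of one store returning different results per user can be sketched as follows. This is a simplified illustration under stated assumptions, not the presenters' implementation: it assumes the first table on the slide is a fully authorized view and the second a restricted one, and the labels `pii`, `masked`, and `sales` as well as the `scan` function are hypothetical names chosen for this example.

```python
from __future__ import annotations

# Single shared store: (row, column, visibility label, value).
# The labels are hypothetical; the slide does not show them.
STORE = [
    (1, "Name", "pii", "Jones"),
    (1, "Name", "masked", "Anon1"),
    (1, "Sales", "sales", 100),
    (1, "Age", "pii", 28),
    (2, "Name", "sales", "Smith"),
    (2, "Sales", "sales", 350),
    (2, "Age", "pii", 25),
    (2, "Quota", "sales", 1000),
]


def scan(store, authorizations: set[str]):
    """Return (row, column, value) for every cell the caller may see.

    The filter runs inside the scan, so the calling application issues
    the same query for every user and never handles unauthorized cells.
    """
    return [
        (row, column, value)
        for row, column, label, value in store
        if label in authorizations
    ]


if __name__ == "__main__":
    for name, auths in [("first view", {"pii", "sales"}),
                        ("second view", {"masked", "sales"})]:
        print(name)
        for row, column, value in scan(STORE, auths):
            print(f"  {row} {column} {value}")
```

Because the filter lives with the data, the same application code serves both users; only the authorizations passed to `scan` differ.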
13. 13
Encryption of Files
We now have user access to the data secured. But what about your HDFS administrators?
14. 14
Encryption of Files
By encrypting the files we write into HDFS, we further limit who can access the data!