The Briefing Room with Dr. Robin Bloor and HPE Security
The Internet of Things brings new technological problems: sensor communications are bi-directional, the scale of data generation points has no precedent and, in this new world, security, privacy and data protection need to go out to the edge. Much of that data is likely to land in Hadoop and other big data platforms. With the need for rapid analytics never greater, companies try to seize opportunities in ever-tighter time windows. Yet cyber-threats are at an all-time high, targeting the most valuable of assets: the data.
Register for this episode of The Briefing Room to hear Analyst Dr. Robin Bloor explain the implications of today's divergent data forces. He'll be briefed by Reiner Kappenberger of HPE, who will discuss how a recent innovation, Apache NiFi, is revolutionizing the big data ecosystem. He'll explain how this technology dramatically simplifies data flow design, enabling a new era of business-driven analysis while also protecting sensitive data.
4. Mission
u Reveal the essential characteristics of enterprise software, good and bad
u Provide a forum for detailed analysis of today's innovative technologies
u Give vendors a chance to explain their product to savvy analysts
u Allow audience members to pose serious questions…and get answers!
8. HPE Security
u HPE offers comprehensive data security and privacy solutions for big data, the cloud and the Internet of Things
u Its solution features data encryption, tokenization and key management
u HPE partnered with Hortonworks to enhance its DataFlow solution (powered by Apache NiFi) with data protection
9. Guest
Reiner Kappenberger, Global Product Management, Big Data
HPE Security – Data Security
Reiner Kappenberger has over 20 years of computer software industry experience focusing on encryption and security for big data environments. His background ranges from device management in the telecommunications sector to GIS and database systems. He holds a Diploma in computer science from the FH Regensburg, Germany.
10. Solving the Really Big Tech Problems with IoT
HPE Security – Data Security
January 17, 2017
11. HPE Security – Data Security
We protect the world's most sensitive data
– Protect the world's largest brands & neutralize breach impact by securing sensitive data-at-rest, in-use and in-motion
– Over 80 patents & 51 years of expertise
Our Solutions
– Provide advanced encryption, tokenization & key management
Market leadership
– Data-centric security solutions used by eight of the top ten U.S. payment processors and nine of the top ten U.S. banks
– Thousands of enterprise customers across all industries including transportation, retail, financial services, payment processing, banking, insurance, high tech, healthcare, energy, telecom & public sector
– Email solution used by millions of users and thousands of enterprise and mid-sized businesses, including healthcare organizations, regional banks & insurance providers
– Contribute technology to multiple standards organizations
12. Why is securing Hadoop difficult?
– Rapid innovation in a well-funded open source community
– Multiple feeds of data in real time from different sources with different protection needs: mainframe, MQ, RDBMSs, XML, Salesforce, flat files
– Multiple types of data combined in a Hadoop "Data Lake"
13. Why is securing Hadoop difficult?
– Reduced control if Hadoop clusters are deployed in a cloud environment
– Automatic replication of data across multiple nodes once entered into the HDFS data store
– Access by many different users with varying analytic needs
14. Introducing "data-centric" security
Traditional IT infrastructure security is layered: disk encryption for storage and file systems, database encryption, SSL/TLS/firewalls, and authentication management. Each layer faces its own threats (malware and insiders against storage and file systems, SQL injection and malware against databases, traffic interceptors in middleware, credential compromise at the data and application layer), and security gaps remain between the layers.
HPE SecureData's data-centric security instead protects the data element itself, giving end-to-end protection across the whole ecosystem: storage, file systems, databases, middleware, and data and applications.
15. HPE Format-Preserving Encryption (FPE)
– Supports data of any format: name, address, dates, numbers, etc.
– Preserves referential integrity
– Only applications that need the original value need change
– Used for production protection and data masking
– NIST standard using FF1 AES encryption
Example. Plaintext: First Name: Gunther, Last Name: Robertson, SSN: 934-72-2356, DOB: 20-07-1966.
AES-CBC output (format destroyed): 8juYE%Uks&dDFa2345^WFLERG Ija&3k24kQotugDF2390^32 0OWioNu2(*872weW Oiuqwriuweuwr%oIUOw1@
AES-FPE output (format preserved): First Name: Uywjlqo, Last Name: Muwruwwbp, SSN: 253-67-2356, DOB: 18-06-1972. A Tax ID field holding the same value 934-72-2356 encrypts to the same 253-67-2356, preserving referential integrity.
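The format-preserving property can be illustrated with a toy Feistel cipher over digit strings. This is a minimal sketch, not HPE's product or the FF1 mode standardized in NIST SP 800-38G: it shows only how a keyed cipher can map a nine-digit SSN to another nine-digit value, deterministically (so joins still work) and reversibly.

```python
import hmac
import hashlib

def _round_value(key: bytes, rnd: int, half: str, width: int) -> int:
    """Keyed round function: HMAC-SHA256 of (round number, half) -> width-digit int."""
    digest = hmac.new(key, bytes([rnd]) + half.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (10 ** width)

def fpe_encrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    """Map a digit string to another digit string of the same length."""
    a, b = digits[: len(digits) // 2], digits[len(digits) // 2 :]
    for rnd in range(rounds):
        w = len(a)
        c = str((int(a) + _round_value(key, rnd, b, w)) % 10 ** w).zfill(w)
        a, b = b, c  # halves swap each round; widths alternate but always sum to len(digits)
    return a + b

def fpe_decrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    """Invert fpe_encrypt by running the Feistel rounds in reverse."""
    u = len(digits) // 2
    la = u if rounds % 2 == 0 else len(digits) - u  # width of the final left half
    a, b = digits[:la], digits[la:]
    for rnd in reversed(range(rounds)):
        w = len(b)
        p = str((int(b) - _round_value(key, rnd, a, w)) % 10 ** w).zfill(w)
        a, b = p, a
    return a + b
```

Because the mapping is deterministic under a fixed key, the same SSN in two tables encrypts to the same protected value, which is what keeps joins and referential integrity intact.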
16. Hyper Secure Stateless Tokenization (SST)
– Tokenization for PCI scope reduction
– Replaces the token database with a smaller token mapping table
– Token values mapped using random numbers
– Lower costs: no database hardware, software, or replication problems
– Hyper SST technology is architected to leverage the latest compute-platform advances
Example for credit card 4171 5678 8765 4321:
– SST: 8736 5533 4678 9453
– Partial SST (first six and last four digits preserved): 4171 5633 4678 4321
– Obvious SST (letters flag tokenized digits): 4171 56AZ UYTZ 4321
– BIN Mapping: 1236 5533 4678 4321
17. Granular Policy Managed by HPE SecureData
– A policy consists of Data Formats, Protection, and Data Access Rules
– Data Format: name (field or object), type (alphabets, formats), logic rules (meaning-preservation rules)
– Protection Method: FPE, SST or IBSE; key rotation policy for encryption; caching policy; dynamic masking policy (mask type, mask)
– Authentication Policy: system auth and app auth (PKI, secret, IP ranges, custom adapters/Java), LDAP groups
– Authorization: groups and roles mapped to No Access, Access or Masked Access; app permissions to encrypt, tokenize, detokenize or decrypt
Note: Some features vary by platform and by use of LDAP, IAM or IDM.
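The policy structure above can be sketched as configuration plus an access check. The field name, roles and mask template here are hypothetical examples, not HPE's actual schema.

```python
# Hypothetical policy: data format, protection method and per-role access rules.
POLICY = {
    "ssn": {
        "format": {"type": "digits", "length": 9},
        "protection": "FPE",                # FPE | SST | IBSE
        "access": {                         # role -> access rule
            "power_user": "clear",
            "analyst": "masked",
            "auditor": "no_access",
        },
        "mask": "XXX-XX-{last4}",           # dynamic masking template
    },
}

def render(field: str, value: str, role: str):
    """Apply the field's access rule for a role: clear, masked, or no access."""
    rule = POLICY[field]["access"].get(role, "no_access")
    if rule == "clear":
        return value
    if rule == "masked":
        return POLICY[field]["mask"].format(last4=value[-4:])
    return None  # no access
```

The point of keeping this in one policy object is that format, protection method and access rules travel together, so every access path enforces the same rules.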
18. Securing Sensitive Data in Big Data Platforms and Hadoop
[Diagram: any data source (sensor data, laptop log files, server log files, public data) is ingested via Flume, NiFi, Storm or Kafka into a "landing zone" of the big data platform (Teradata, Vertica, Hadoop with TDE), where Sqoop, Hive UDFs, MapReduce, SQL and Spark operate. BI tools and business processes work on protected data; only a power user re-identifies data.]
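The pattern in the diagram, protecting data at the landing zone so downstream tools only ever see protected values, can be sketched as a generator stage in the ingest path. The field names are hypothetical, and `protect_fn` stands in for an FPE or tokenization call.

```python
SENSITIVE_FIELDS = {"ssn", "credit_card"}  # hypothetical field names

def protect_record(record: dict, protect_fn) -> dict:
    """Protect sensitive fields; every other field passes through untouched."""
    return {k: protect_fn(v) if k in SENSITIVE_FIELDS else v
            for k, v in record.items()}

def ingest(stream, protect_fn):
    """Landing-zone stage: consumers downstream only ever see protected values."""
    for record in stream:
        yield protect_record(record, protect_fn)
```

Protecting at ingest means HDFS replication, cloud deployment and broad analytic access all operate on protected data, which addresses the Hadoop difficulties listed on the earlier slides.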
19. Leading Telecoms Provider – Big Data Primary Data Flow
60 data sources; 20 million records per day (about 1 TB); 250 nodes.
[Diagram: sensitive structured sources feed a staging area with the HPE SecureData File Processor and data cleansing, then the Hadoop cluster (Sqoop, Flume, Storm, Hive UDFs, MapReduce) and a Teradata EDW with UDFs. A data virtualization layer serves Tableau and the analytics and data science teams. HPE SecureData key servers and web service APIs, integrated with LDAP, support the flow.]
20. Threats in the IoT Space
[Diagram: the same ingestion flow as slide 18 (sensor data, laptop and server log files, public data entering via Flume, NiFi, Storm or Kafka into the "landing zone", with Sqoop, Hive UDFs, MapReduce, SQL and Spark downstream), highlighting that threats apply across the back-end infrastructure as well as at the edge.]
21. Leading Car Manufacturer – Big Data Primary Data Flow
~2 billion real-time transactions per day.
[Diagram: sensitive structured data reaches Hadoop edge nodes running HPE SecureData Hadoop tools; Flume handles real-time ingest into the "landing zone" of the Hadoop cluster (Sqoop, Hive UDFs, MapReduce). Other real-time feeds carry customer data from dealerships and manufacturers; existing data sets and third-party data (e.g. accident data) arrive via IBM DataStage through "integration controls". Data flows on to a data warehouse with UDFs, serving Cognos and the analytics and data science teams, with HPE SecureData key servers and web service APIs supporting the flow.]
25. Event Management and Processing
We have gradually entered an event-based IT world. It brings with it new realities. We need to consider "data in motion."
26. New Realities
u A good deal of (new) data is now sourced from outside the business
u Data has to be governed
u Provenance and lineage matter
u There is no perimeter anywhere: data access permissions and encryption apply everywhere
27. Events and Event Data
§ Time
§ Geographic location
§ Virtual/logical location
§ Source device
§ Device ID
§ Ownership and actors
§ Data
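The event attributes listed above can be captured in a simple record type. This is a sketch with hypothetical field values, showing only that every event should carry its context (time, location, device, ownership) alongside the data itself.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SensorEvent:
    device_id: str         # source device identity
    owner: str             # ownership and responsible actor
    geo: tuple             # geographic location (lat, lon)
    logical_location: str  # virtual/logical location, e.g. a network segment
    payload: dict          # the data itself
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```

Carrying this context on every event is what makes downstream governance, provenance and lineage tracking possible.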
28. NiFi v Kafka
NiFi:
§ Apache project originating at the NSA (Apache top-level project in 2015)
§ Highly scalable
§ Parallel operation
§ A distributed data flow platform
§ Point to point (pull-push)
§ Suited to IoT
Kafka:
§ Apache project originating at LinkedIn (Apache top-level project in 2012)
§ Highly scalable
§ Parallel operation
§ A distributed streaming platform
§ Publish-subscribe (push-pull)
§ Not so much IoT
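The delivery-model contrast above can be illustrated with two toy in-process classes. These are not the real NiFi or Kafka APIs, just a sketch of the difference between publish-subscribe (every subscriber sees every record) and point-to-point (each item goes to exactly one consumer).

```python
from collections import defaultdict, deque

class PubSubTopic:
    """Kafka-style publish-subscribe: an append-only log where every
    subscriber reads every record at its own offset (push in, pull out)."""
    def __init__(self):
        self.log = []
        self.offsets = defaultdict(int)

    def publish(self, record):
        self.log.append(record)

    def poll(self, subscriber):
        start = self.offsets[subscriber]
        self.offsets[subscriber] = len(self.log)
        return self.log[start:]

class PointToPointQueue:
    """NiFi-style point-to-point connection: each flowfile is delivered
    to exactly one downstream consumer."""
    def __init__(self):
        self.queue = deque()

    def push(self, flowfile):
        self.queue.append(flowfile)

    def pull(self):
        return self.queue.popleft() if self.queue else None
```

The point-to-point model fits IoT edge collection, where each reading should be routed onward exactly once; publish-subscribe fits fan-out to many independent back-end consumers.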
29. A View of a Coherent Data Lake
u Data Lakes are complex - more complex, for example, than a data warehouse
u It's becoming obvious that streaming and data flow are inherent to the data lake
u It is the primary place of governance
u There needs to be a strategy for data
[Diagram: sources (network devices, IoT, mobile, servers, desktops, embedded chips, RFID, the cloud, log files, OSes, VMs, ESB/messaging software, systems-management apps, social network data, web services, data streams, workflow, BI and office apps, business apps, SaaS) feed an ingest stage, then secure, transform & aggregate, data cleansing, metadata management and lifecycle stages over data storage and archive; extracts and ETL flow out to databases, data marts, search and query, BI, visualization and analytics, real-time apps and other apps.]
30. Compliance and Regulations
u Aside from sector initiatives there are many official regulations: HIPAA, SOX, FISMA, FERPA, GLBA (mainly US legislation)
u Standards (global): PCI-DSS, ISO/IEC 17799 (data should be owned)
u National regulations differ from country to country (even in Europe)
u GDPR, adopted in 2016, takes effect in May 2018
31. The Challenges
It is ceasing to be possible to include security as an afterthought. Security needs to be designed in from the get-go.
32. u How many companies doing big data projects do you believe have security properly organized? Is anything happening that is likely to complicate the situation even more?
u Security often comes with performance penalties. What is the performance cost of the solutions you are advocating?
u Costs? How much budget needs to be allocated? Can you give a feel for this?
33. u Where do you tend to see NiFi and Kafka (Storm, Flume, Flink…) being used?
u Security needs to be integrated, so encryption needs to shake hands with authentication. How does HPE make this work?
u Are there any environments/applications to which HPE's security technology is inapplicable: OLTP, data streaming and streaming analytics, BI, mobile, cloud, etc.?