Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path for Compute Frameworks

© Cloudera, Inc. All rights reserved.
Introducing RecordService
Lenni Kuff

RecordService is a distributed,
scalable, data access service for
unified authorization in Hadoop.

Motivation
• As the Hadoop ecosystem expands, new components continue to be added
  • Speaks to the overall flexibility of Hadoop
  • This is good: more functionality, more workloads, more use cases
• As use cases for Hadoop mature, user requirements and expectations increase:
  • Security
  • Performance
  • Compatibility
• The flexibility of Hadoop has come at the cost of increased complexity

[Diagram: Storage and Compute layers]

Example: Security
Challenge: Provide unified fine-grained security across compute frameworks
• Integrating a consistent security layer into every component is not scalable
• Securing data at the file level precludes fine-grained access control (column/row)
  • File ACLs are not enough: a user can view all of a file or nothing
  • Currently, files must be split and data duplicated, at large operational cost
Solution: Add a level of abstraction, a secure service to access datasets in “record” format
• Can now apply fine-grained constraints on a projection of the dataset
• The same access control policy can be applied uniformly across compute frameworks, decoupled from the underlying storage layer

Introducing RecordService

RecordService - Overview
• Simplifies
  • Provides a higher-level, logical abstraction for data (i.e., tables or views)
  • Returns schemed objects (instead of paths and bytes). No need for applications to worry about storage APIs and file formats.
  • HCatalog? Similar concept, but RecordService is secure and performant. Plan to support HCatalog as a data model on RecordService.
• Secures
  • Central location for all authorization checks using Sentry metadata
  • Secure service that does not execute arbitrary user code
• Accelerates
  • Unified data access path allows platform-wide performance improvements

Architecture
• Runs as a distributed service: Planner Servers and Worker Servers
• Servers do not store any state
  • Easy HA, fault tolerance
• Planner Servers are responsible for request planning
  • Retrieve and combine metadata (NameNode, Hive Metastore, Sentry)
  • Split generation: creates tasks for workers
  • Perform authorization
• Worker Servers read from storage and construct records
  • IO, file parsing, predicate evaluation
  • Run as the “source” for a DAG computation

Architecture – Server APIs
• Planner and Worker services expose Thrift APIs
  • PlanRequest(), Exec(), Fetch()
• PlanRequest()
  • Accepts SQL to specify the request: supports selection (filtering) and projection
  • Access to tables and views stored in HMS
  • Does not run operators that require data exchange; “map only”
  • Generates a list of tasks, each containing the request and its locality information
• Exec()/Fetch()
  • Returns records in a canonical, optimized columnar format
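
The plan/exec/fetch flow described above can be sketched in plain Java. This is an in-memory illustration only, not the RecordService Thrift API: the class, method signatures, `Task` fields, host names and the fixed five records per split are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PlanExecFetchSketch {

    /** Hypothetical task: the original request, a split id, and hosts holding the data. */
    static final class Task {
        final String request;
        final int splitId;
        final List<String> preferredHosts;

        Task(String request, int splitId, List<String> preferredHosts) {
            this.request = request;
            this.splitId = splitId;
            this.preferredHosts = preferredHosts;
        }
    }

    /** Stand-in for PlanRequest(): one task per split, each carrying the request and locality. */
    static List<Task> planRequest(String sql, int numSplits) {
        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            tasks.add(new Task(sql, i, Collections.singletonList("host-" + i)));
        }
        return tasks;
    }

    /** Stand-in for a running task on a worker; fetch() returns batches until it is empty. */
    static final class WorkerHandle {
        private final Task task;
        private final int totalRecords;
        private int next = 0;

        WorkerHandle(Task task, int totalRecords) {
            this.task = task;
            this.totalRecords = totalRecords;
        }

        List<String> fetch(int batchSize) {
            List<String> batch = new ArrayList<>();
            while (next < totalRecords && batch.size() < batchSize) {
                batch.add("split" + task.splitId + "-record" + next);
                next++;
            }
            return batch;
        }
    }

    /** Stand-in for Exec(): starts a task; here every split pretends to hold 5 records. */
    static WorkerHandle exec(Task task) {
        return new WorkerHandle(task, 5);
    }

    public static void main(String[] args) {
        List<Task> tasks = planRequest("SELECT name, balance FROM v", 2);
        int total = 0;
        for (Task task : tasks) {
            WorkerHandle handle = exec(task);
            List<String> batch;
            while (!(batch = handle.fetch(2)).isEmpty()) {
                total += batch.size();
            }
        }
        System.out.println("tasks=" + tasks.size() + " records=" + total);
    }
}
```

In a real deployment the compute framework (MR, Spark) would schedule each task near one of its preferred hosts; the loop in `main` only stands in for that scheduling.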

Architecture – Fault tolerance
• Cluster state persisted in ZooKeeper
  • Membership, delegation tokens, secret keys
• Servers do not communicate with each other directly, which helps scalability
• Planner services
  • Expected to run a few (e.g., 3) for HA
  • Fault tolerance handled by clients getting a list of planners and failing over
  • Plan requests are short
• Worker services
  • Expected to run on each node in the cluster with data
  • Fault tolerance handled by the framework (e.g., MR) rescheduling the task
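
Client-side planner failover, as described above, might look roughly like the following. This is a minimal sketch, not the RecordService client: the planner names, the `planWithFailover` helper and the simulated failure are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class PlannerFailoverSketch {

    /** Tries each planner in order and returns the first successful result. */
    static <T> T planWithFailover(List<String> planners, Function<String, T> call) {
        RuntimeException last = null;
        for (String planner : planners) {
            try {
                return call.apply(planner);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next planner
            }
        }
        throw new IllegalStateException("all planners failed", last);
    }

    public static void main(String[] args) {
        // Hypothetical planner list; a real client would obtain it from cluster membership.
        List<String> planners = Arrays.asList("planner-a", "planner-b", "planner-c");

        // Simulated plan call: planner-a is down, the others answer.
        Function<String, String> call = planner -> {
            if (planner.equals("planner-a")) {
                throw new IllegalStateException("planner-a is down");
            }
            return "plan from " + planner;
        };

        System.out.println(planWithFailover(planners, call)); // prints "plan from planner-b"
    }
}
```

Because plan requests are short and planners hold no state, simply retrying the whole request on the next planner is enough; nothing has to be resumed.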

Architecture – Security
• Authentication using Kerberos and delegation tokens
• Planner authorizes request using metadata in Sentry
  • Column level ACLs
  • Row level ACLs – create a view with a predicate
  • Masking – create a view with the masking function in the select list
• Tasks generated by the planner are signed with a shared key
• Worker runs generated tasks
  • Does not authorize, relies on signed tasks
  • Runs as user with full access to data, does not run user code
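
The idea of a planner signing tasks and a worker checking the signature before running them can be illustrated with a keyed MAC. The slides only say that tasks are "signed with a shared key"; the choice of HMAC-SHA256, the string task encoding and all names below are assumptions for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class SignedTaskSketch {

    /** Planner side: compute an HMAC-SHA256 tag over the serialized task. */
    static byte[] sign(byte[] sharedKey, String task) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(sharedKey, "HmacSHA256"));
            return mac.doFinal(task.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC unavailable or key invalid", e);
        }
    }

    /** Worker side: recompute the tag and compare in constant time. */
    static boolean verify(byte[] sharedKey, String task, byte[] signature) {
        return MessageDigest.isEqual(sign(sharedKey, task), signature);
    }

    public static void main(String[] args) {
        // Hypothetical shared key; the slides say secret keys are kept in cluster state.
        byte[] key = "example-shared-key".getBytes(StandardCharsets.UTF_8);
        String task = "SELECT ccn, name FROM v /* split 0 */";

        byte[] signature = sign(key, task);
        System.out.println("untampered task accepted: " + verify(key, task, signature));
        System.out.println("tampered task accepted:   "
                + verify(key, task.replace("FROM v", "FROM data"), signature));
    }
}
```

With this scheme a worker never needs to consult Sentry: a task that verifies was produced by a planner that already authorized it, and a client that rewrites the task to read the base table fails verification.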

Architecture – Security example
CREATE VIEW v AS
SELECT mask(credit_card_number) AS ccn,
       name, balance, region
FROM data WHERE region = 'Europe';
1. Restrict access to the data set: disable access to the ‘data’ table and the underlying files in HDFS.
2. Give access by creating the view, v.
3. Set column-level permissions on v per user if necessary.
Write path (ingest) is unchanged. Job is expected to run as a privileged user.
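
What a reader sees through the view v can be mimicked on in-memory rows: rows outside Europe are filtered out and the card number is replaced by its masked form. The `mask` behaviour here (keep the last four characters) is an assumption, since the slides do not define the masking function, and the sample rows are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ViewSemanticsSketch {

    /** Hypothetical masking function: hide everything except the last four characters. */
    static String mask(String ccn) {
        int keep = Math.min(4, ccn.length());
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ccn.length() - keep; i++) {
            sb.append('X');
        }
        return sb.append(ccn.substring(ccn.length() - keep)).toString();
    }

    /** Builds one row of the base table 'data'. */
    static Map<String, String> row(String ccn, String name, String balance, String region) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("credit_card_number", ccn);
        r.put("name", name);
        r.put("balance", balance);
        r.put("region", region);
        return r;
    }

    /** Applies the view: row filter (region = 'Europe') plus masked projection. */
    static List<Map<String, String>> readThroughView(List<Map<String, String>> data) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> r : data) {
            if (!"Europe".equals(r.get("region"))) {
                continue; // row-level restriction from the WHERE clause
            }
            Map<String, String> v = new LinkedHashMap<>();
            v.put("ccn", mask(r.get("credit_card_number"))); // raw number never leaves
            v.put("name", r.get("name"));
            v.put("balance", r.get("balance"));
            v.put("region", r.get("region"));
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, String>> data = Arrays.asList(
                row("4111111111111111", "Ann", "10", "Europe"),
                row("5500000000000004", "Bob", "20", "Asia"));
        System.out.println(readThroughView(data));
    }
}
```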

Client APIs – Integration with ecosystem
• Similar APIs designed to integrate with MapReduce and Spark
• Client APIs make things simpler; clients don’t need to:
  • Interact with HMS
  • Care about the underlying storage format: the worker always returns records in a canonical format
  • Know storage engine details (e.g., S3)

Client Integration APIs
• Drop-in replacements for common existing InputFormats
  • Text, Avro
• Can be used with Spark as well
• SparkSQL: integration with the Data Sources API
  • Predicate pushdown, projection
• Migration should be easy

MR Example
// Before: read Avro files directly from a file path.
//FileInputFormat.setInputPaths(job, new Path(args[0]));
//job.setInputFormatClass(AvroKeyInputFormat.class);
// After: read the table named by args[0] through RecordService.
RecordServiceConfig.setInputTable(configuration, null, args[0]);
job.setInputFormatClass(
    com.cloudera.recordservice.avro.mapreduce.AvroKeyInputFormat.class);

Spark Example
// Comment out one or the other
val file = sc.recordServiceTextFile(path)
//val file = sc.textFile(path)

Performance
• Shares some core components with Impala
  • IO management, optimized C++ code, runtime code generation, use of low-level storage APIs
• Highly efficient implementation of the scan functionality
• Optimized columnar on-wire format
  • Inspired by Apache Parquet
• Accelerates performance for many workloads

Terasort
• Roughly a worst-case scenario: minimal schema, a single STRING column
• Custom RecordServiceTeraInputFormat (similar to TeraInputFormat)
• 78-node cluster (12 cores/24 Hyper-Threaded, 12 disks)
• Ran at 1 billion, 50 billion and 1 trillion (~100TB) scales
• See the GitHub repo for more details and runnable examples

TeraChecksum
[Chart: TeraChecksum, normalized job time, without RecordService vs. with RecordService, at 1B, 50B and 1T scales on MapReduce and Spark. Data labels as extracted: 1, 0.48, 0.23, 1.03, 0.8, 0.85.]

Spark SQL
• Represents a more typical use case
  • Data is fully schemed
• TPC-DS
  • 500GB scale factor, on Parquet
• Cluster
  • 5-node cluster

[Chart: TPC-DS on SparkSQL, SparkSQL vs. SparkSQL with RecordService; axis range 0 to 350.]
~15% improvement in query times; queries are not scan bound

Spark SQL
[Chart: SparkSQL vs. SparkSQL with RecordService for two queries, 2% Selective Scan and Sum(col); axis range 0 to 35. Data labels as extracted: 29.5, 31, 14, 23.5.]

State of the project
• Available in v0.2 beta:
  • Integration with Spark, MR, Pig (via HCatalog)
  • Planner HA
  • Apache 2.0 licensed
  • Sentry column-level privilege support
• Mini roadmap:
  • Improved multi-tenancy
  • Complex types
  • More InputFormat support / integration options
• Intend to donate to the Apache Software Foundation

Conclusion
• RecordService provides a schemed data access service for Hadoop
  • Logical data access instead of physical
  • Much more powerful abstraction
• Demonstrated security enforcement and improved performance
• Simpler: clients don’t need to worry about low-level details such as storage APIs and file formats
• Opens the door for future improvements

Contributing!
• Mailing list: recordservice-user@googlegroups.com
• Discussion forum: http://community.cloudera.com/t5/Beta-Releases/bd-p/Beta
• Contributions: http://github.com/cloudera/RecordServiceClient/
• Documentation: http://cloudera.github.io/RecordServiceClient/
• Bug Reporting: https://issues.cloudera.org/projects/RS
• Beta Download: http://www.cloudera.com/downloads/beta/record-service/0-2-0.html

Thank you
