MapR, Implications for Integration

MapR, Implications for Integration CHUG – August 2011

Outline: MapR system overview; map-reduce review; MapR architecture; performance results; map-reduce on MapR; architectural implications; search indexing / deployment; EM algorithm for machine learning; … and more …

Map-Reduce [Diagram: input, shuffle, output]

Bottlenecks and Issues. Read-only files. Many copies in I/O path. Shuffle based on HTTP: can’t use new technologies; eats file descriptors. Spills go to local file space: bad for skewed distribution of sizes.

MapR Improvements. Faster file system: fewer copies; multiple NICs; no file descriptor or page-buf competition. Faster map-reduce: uses distributed file system; direct RPC to receiver; very wide merges.

MapR Innovations. Volumes: distributed management; data placement. Read/write random access file system: allows distributed meta-data; improved scaling; enables NFS access. Application-level NIC bonding. Transactionally correct snapshots and mirrors.

MapR's Containers. Files/directories are sharded into blocks, which are placed into mini NNs (containers) on disks.

No need to manage directly. Containers are 16-32 GB segments of disk, placed on nodes.

Container locations and replication. Container location database (CLDB) keeps track of nodes hosting each container. [Diagram: CLDB entries listing the nodes (N1, N2, N3) that hold each container's replicas]

MapR Scaling. Containers represent 16-32 GB of data.

100M containers = ~2 exabytes (a very large cluster). 250 bytes DRAM to cache a container.

But not necessary, can page to disk.

Typical large 10 PB cluster needs 2 GB. Container-reports are 100x-1000x smaller than HDFS block-reports.

Increase container size to 64 GB to serve 4 EB cluster.

Map/reduce not affected.
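The headline numbers on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope calculation; the function names are illustrative and the use of decimal units (1 GB = 10^9 bytes) is an assumption, not something the slides state.

```python
# Back-of-envelope check of the container scaling figures quoted above.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 EB = 10**18 bytes).
GB = 10**9
EB = 10**18


def cluster_capacity_bytes(num_containers, container_size_bytes):
    """Total data addressable by a given number of containers."""
    return num_containers * container_size_bytes


def location_cache_bytes(num_containers, bytes_per_container=250):
    """DRAM needed to cache every container location (250 bytes each)."""
    return num_containers * bytes_per_container


containers = 100_000_000  # the "100M containers" from the slide
low = cluster_capacity_bytes(containers, 16 * GB)
high = cluster_capacity_bytes(containers, 32 * GB)
cache = location_cache_bytes(containers)

print(f"capacity: {low / EB:.1f} to {high / EB:.1f} EB")
print(f"location cache: {cache / GB:.0f} GB")
```

With 16-32 GB containers the capacity range brackets the slide's "~2 exabytes", and caching every location costs tens of gigabytes of DRAM, which is why being able to page that cache to disk matters.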

Terasort on MapR. 10+1 nodes: 8 core, 24GB DRAM, 11 x 1TB SATA 7200 rpm. [Chart: elapsed time (mins); lower is better]

HBase on MapR. YCSB Random Read with 1 billion 1K records. 10+1 node cluster: 8 core, 24GB DRAM, 11 x 1TB 7200 RPM. [Chart: records per second; higher is better]

Small Files (Apache Hadoop, 10 nodes). Op: create file, write 100 bytes, close. Notes: NN not replicated; NN uses 20G DRAM; DN uses 2G DRAM. [Chart: rate (files/sec) vs. # of files (millions); series: out of box, tuned]

MUCH faster for some operations. Same 10 nodes … [Chart: create rate vs. # of files (millions)]

What MapR is not. Volumes != federation: MapR supports > 10,000 volumes, all with independent placement and defaults; volumes support snapshots and mirroring. NFS != FUSE: checksum and compress at gateway; IP fail-over; read/write/update semantics at full speed. MapR != maprfs.

NFS mounting models. Export to the world: NFS gateway runs on selected gateway hosts. Local server: NFS gateway runs on local host; enables local compression and checksumming. Export to self: NFS gateway runs on all data nodes, mounted from localhost.

Export to the world [Diagram: an NFS client connecting to several NFS servers]

Local server [Diagram: client application, NFS server, cluster nodes]

Universal export to self. Nodes are identical. [Diagram: cluster nodes, each running a task and an NFS server]

Application architecture So now we have a hammer Let’s find us some nails!

Sharded text indexing [Diagram: input documents -> map -> reducer -> local disk -> clustered index storage -> local disk -> search engine] Assign documents to shards. Index text to local disk and then copy index to distributed file store. Copy to local disk typically required before index can be loaded.

Sharded text indexing. Mapper assigns document to shard; shard is usually hash of document id. Reducer indexes all documents for a shard. Indexes created on local disk: on success, copy index to DFS; on failure, delete local files. Must avoid directory collisions (can’t use shard id!). Must manage and reclaim local disk space.
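The conventional flow on this slide can be sketched in a few lines. This is a minimal illustration, not the indexing code used in the talk: `shard_for` and `index_shard` are hypothetical names, the "index" is just a tab-separated text file, and the DFS is stood in for by an ordinary directory. A per-attempt random suffix is one way to avoid the directory collisions the slide warns about when a reducer is retried.

```python
import hashlib
import os
import shutil
import tempfile
import uuid


def shard_for(doc_id, num_shards):
    """Mapper side: shard is a hash of the document id.

    A stable digest is used because Python's built-in hash() for strings
    is salted per process and would not agree across tasks.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def index_shard(shard_id, docs, local_root, dfs_root):
    """Reducer side: build on local disk, copy on success, always clean up."""
    # Unique per-attempt directory: the bare shard id would collide if a
    # failed attempt left files behind or two attempts ran on one host.
    work_dir = os.path.join(local_root, f"shard-{shard_id}-{uuid.uuid4().hex}")
    os.makedirs(work_dir)
    try:
        with open(os.path.join(work_dir, "index.tsv"), "w") as out:
            for doc_id, text in sorted(docs):
                out.write(f"{doc_id}\t{text}\n")
        # Success: copy the finished index to the (stand-in) DFS directory.
        dest = os.path.join(dfs_root, f"shard-{shard_id}")
        shutil.copytree(work_dir, dest)
        return dest
    finally:
        # Success or failure: reclaim the local disk space.
        shutil.rmtree(work_dir, ignore_errors=True)


with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as dfs:
    docs = [("doc-1", "hello world"), ("doc-2", "map reduce")]
    by_shard = {}
    for doc_id, text in docs:
        by_shard.setdefault(shard_for(doc_id, 4), []).append((doc_id, text))
    for shard_id, shard_docs in by_shard.items():
        print(index_shard(shard_id, shard_docs, local, dfs))
```

All of the bookkeeping in `index_shard` (unique names, copy, cleanup) is exactly the part that disappears when the reducer can write its index directly into its task work directory.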

Conventional data flow [Diagram: input documents -> map -> reducer -> local disk -> clustered index storage -> local disk -> search engine] Failure of a reducer causes garbage to accumulate in the local disk. Failure of search engine requires another download of the index from clustered storage.

Simplified NFS data flows [Diagram: input documents -> map -> reducer -> clustered index storage -> search engine] Index to task work directory via NFS. Failure of a reducer is cleaned up by map-reduce framework. Search engine reads mirrored index directly.

Simplified NFS data flows [Diagram: input documents -> map -> reducer -> mirrors -> search engines] Mirroring allows exact placement of index data. Arbitrary levels of replication also possible.

K-means: classic E-M based algorithm. Given cluster centroids, assign each data point to nearest centroid, accumulate new centroids. Rinse, lather, repeat.
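The loop described on this slide can be written down directly. The following is a minimal single-machine sketch in plain Python; the function names are illustrative, and keeping an empty cluster's old centroid is an assumption made here for simplicity.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def kmeans_step(points, centroids):
    """One pass: assign each point to its nearest centroid, then re-average."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in points:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    # Assumption: a cluster with no points keeps its previous centroid.
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]


def kmeans(points, centroids, max_iters=100):
    """Rinse, lather, repeat until the centroids stop moving."""
    for _ in range(max_iters):
        updated = kmeans_step(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
print(kmeans(points, [[0, 0], [10, 10]]))
```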

K-means, the movie [Diagram: input and centroids -> assign to nearest centroid -> aggregate new centroids -> back to centroids]

Parallel Stochastic Gradient Descent [Diagram: input and model -> train sub model -> average models -> back to model]

Variational Dirichlet Assignment [Diagram: input and model -> gather sufficient statistics -> update model -> back to model]

Old tricks, new dogs. Mapper: assign point to cluster; emit cluster id, (1, point). Combiner and reducer: sum counts, weighted sum of points; emit cluster id, (n, sum/n). Output to HDFS. [Diagram annotations: read from HDFS to local disk by distributed cache; read from local disk from distributed cache; written by map-reduce]

Old tricks, new dogs. Mapper: assign point to cluster; emit cluster id, (1, point). Combiner and reducer: sum counts, weighted sum of points; emit cluster id, (n, sum/n). Output to HDFS. [Diagram annotations: read from NFS; written by map-reduce; MapR FS]
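The mapper and combiner/reducer contract on these two slides can be simulated without a cluster. In the sketch below the map-reduce framework is replaced by ordinary loops and dictionaries, and `mapper`, `combine`, and `run_iteration` are illustrative names. Because values are always (count, mean) pairs and are merged with a count-weighted sum, the same function can serve as both combiner and reducer.

```python
from collections import defaultdict


def mapper(point, centroids):
    """Assign point to its nearest cluster; emit cluster id, (1, point)."""
    k = min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )
    return k, (1, tuple(point))


def combine(values):
    """Sum counts and the count-weighted sum of points; emit (n, sum/n)."""
    values = list(values)
    n = sum(count for count, _ in values)
    dim = len(values[0][1])
    total = [0.0] * dim
    for count, mean in values:
        for d in range(dim):
            total[d] += count * mean[d]
    return n, tuple(t / n for t in total)


def run_iteration(splits, centroids):
    """Simulate one job: map and combine per input split, then reduce."""
    partials = defaultdict(list)
    for split in splits:
        grouped = defaultdict(list)
        for point in split:
            k, value = mapper(point, centroids)
            grouped[k].append(value)
        for k, values in grouped.items():
            partials[k].append(combine(values))  # combiner
    return {k: combine(values) for k, values in partials.items()}  # reducer


splits = [[(0, 0), (10, 10)], [(0, 2), (10, 12), (0, 4)]]
print(run_iteration(splits, [(0, 0), (10, 10)]))
```

The only thing that changes between the two slides is how `centroids` reaches the mapper: through the distributed cache and local disk in the first, or by an ordinary file read in the second.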

Poor man’s Pregel
Mapper. Lines in bold can use conventional I/O via NFS.

    while not done:
        read and accumulate input models
        for each input:
            accumulate model
        write model
        synchronize
        reset input format
    emit summary
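To make the loop concrete, here is a toy single-process simulation. Everything in it is an assumption made for illustration: the shared directory stands in for an NFS mount, the "model" is a single number, workers run one after another so the synchronize step is implicit, and the file naming scheme is invented. The point is only that reading and writing models is ordinary file I/O.

```python
import json
import os
import tempfile


def read_models(shared_dir, step):
    """Read every model that workers wrote during the given step."""
    models = []
    prefix = f"step{step}-"
    for name in sorted(os.listdir(shared_dir)):
        if name.startswith(prefix) and name.endswith(".json"):
            with open(os.path.join(shared_dir, name)) as f:
                models.append(json.load(f))
    return models


def write_model(shared_dir, step, worker_id, model):
    """Write to a temp name, then rename, so readers never see a partial file."""
    final = os.path.join(shared_dir, f"step{step}-worker{worker_id}.json")
    tmp = final + ".tmp"
    with open(tmp, "w") as f:
        json.dump(model, f)
    os.replace(tmp, final)


def run(worker_inputs, shared_dir, steps=20):
    for w in range(len(worker_inputs)):
        write_model(shared_dir, 0, w, {"value": 0.0})
    for step in range(1, steps + 1):           # while not done
        for w, inputs in enumerate(worker_inputs):
            peers = read_models(shared_dir, step - 1)   # read input models
            merged = sum(m["value"] for m in peers) / len(peers)
            total = 0.0
            for x in inputs:                   # for each input: accumulate
                total += x
            local = total / len(inputs)
            write_model(shared_dir, step, w, {"value": 0.5 * merged + 0.5 * local})
        # synchronize: a real job needs a barrier here; the sequential
        # loop over workers provides it in this simulation.
    final = read_models(shared_dir, steps)
    return sum(m["value"] for m in final) / len(final)   # emit summary


with tempfile.TemporaryDirectory() as shared:
    print(run([[1, 2, 3], [7, 8, 9]], shared))
```

In this toy the workers' shared estimate converges toward the mean of all inputs (5 for the sample data), even though no worker ever sees another worker's raw inputs.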

Click modeling architecture [Diagram: input -> feature extraction and down sampling -> data join -> sequential SGD learning. Labels: map-reduce; side-data, now via NFS]
