LLNL Talk

These slides are from a recent talk I gave at Lawrence Livermore Labs.

The talk gives an architectural outline of the MapR system and then discusses how this architecture facilitates large scale machine learning algorithms.

    Presentation Transcript

    • MapR Architecture and Machine Learning
    • Outline
      MapR system overview
      Map-reduce review
      MapR architecture
      Performance Results
      Map-reduce on MapR
      Machine learning on MapR
    • Map-Reduce
      [Diagram: Input → Shuffle → Output]
    • Bottlenecks and Issues
      Read-only files
      Many copies in I/O path
      Shuffle based on HTTP
      Can’t use new technologies
      Eats file descriptors
      Spills go to local file space
      Bad for skewed distribution of sizes
    • MapR Improvements
      Faster file system
      Fewer copies
      Multiple NICS
      No file descriptor or page-buf competition
      Faster map-reduce
      Uses distributed file system
      Direct RPC to receiver
      Very wide merges
    • MapR Innovations
      Volumes
      Distributed management
      Data placement
      Read/write random access file system
      Allows distributed meta-data
      Improved scaling
      Enables NFS access
      Application-level NIC bonding
      Transactionally correct snapshots and mirrors
    • MapR's Containers
      Files and directories are sharded into blocks, which are placed into mini NNs (containers) on disks
      • Each container contains
      • Directories & files
      • Data blocks
      • Replicated on servers
      • No need to manage directly
      Containers are 16-32 GB segments of disk, placed on nodes
    • Container locations and replication
      [Diagram: CLDB mapping each container to its hosting nodes, e.g. N1, N2; N3, N2; N1, N3]
      Container location database (CLDB) keeps track of nodes hosting each container
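      The CLDB above is essentially a lookup table from container id to the nodes holding its replicas. A minimal hypothetical sketch in Python (the class and method names are illustrative, not MapR's API):

      from typing import Dict, List

      class ContainerLocationDB:
          """Hypothetical CLDB-style lookup: container id -> nodes holding replicas."""
          def __init__(self) -> None:
              self.locations: Dict[int, List[str]] = {}

          def register(self, container_id: int, nodes: List[str]) -> None:
              # Record (or update) the replica set for a container
              self.locations[container_id] = nodes

          def lookup(self, container_id: int) -> List[str]:
              # Return the nodes hosting the container, empty list if unknown
              return self.locations.get(container_id, [])

      # Mirrors the diagram: containers replicated across nodes N1..N3
      cldb = ContainerLocationDB()
      cldb.register(1, ["N1", "N2"])
      cldb.register(2, ["N3", "N2"])
      cldb.register(3, ["N1", "N3"])
      print(cldb.lookup(2))   # ['N3', 'N2']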
    • MapR Scaling
      Containers represent 16 - 32GB of data
      • Each can hold up to 1 Billion files and directories
      • 100M containers = ~ 2 Exabytes (a very large cluster)
      250 bytes DRAM to cache a container
      • 25GB to cache all containers for 2EB cluster
      • But not necessary, can page to disk
      • Typical large 10PB cluster needs 2GB
      Container-reports are 100x - 1000x < HDFS block-reports
      • Serve 100x more data-nodes
      • Increase container size to 64G to serve 4EB cluster
      • Map/reduce not affected
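      The sizing above is back-of-envelope arithmetic; a quick check of the 2 EB and 25 GB figures, assuming roughly 20 GB per container (the midpoint of the 16-32 GB range):

      # Back-of-envelope check of the scaling figures on this slide
      containers = 100_000_000            # 100M containers
      data_per_container = 20 * 10**9     # assume ~20 GB per container, in bytes
      cache_per_container = 250           # bytes of DRAM to cache one container entry

      print(containers * data_per_container / 10**18, "EB of data")    # 2.0 EB
      print(containers * cache_per_container / 10**9, "GB of cache")   # 25.0 GB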
    • MapR's Streaming Performance
      [Chart: throughput in MB per sec for 11 x 7200rpm SATA vs. 11 x 15Krpm SAS; higher is better]
      Tests: (i) 16 streams x 120GB, (ii) 2000 streams x 1GB
    • Terasort on MapR
      10+1 nodes: 8 core, 24GB DRAM, 11 x 1TB SATA 7200 rpm
      [Chart: elapsed time (mins); lower is better]
    • MUCH faster for some operations
      Same 10 nodes …
      [Chart: file create rate vs. # of files (millions); test stopped here]
    • NFS mounting models
      Export to the world
      NFS gateway runs on selected gateway hosts
      Local server
      NFS gateway runs on local host
      Enables local compression and checksumming
      Export to self
      NFS gateway runs on all data nodes, mounted from localhost
    • Export to the world
      [Diagram: an NFS client accessing the cluster through multiple NFS server gateways]
    • Local server
      [Diagram: client application and NFS server running on the same host, talking to the cluster nodes]
    • Universal export to self
      [Diagram: every cluster node runs an application and an NFS server mounted from localhost]
      Nodes are identical
    • Sharded text indexing
      Mapper assigns document to shard
      Shard is usually hash of document id
      Reducer indexes all documents for a shard
      Indexes created on local disk
      On success, copy index to DFS
      On failure, delete local files
      Must avoid directory collisions
      can’t use shard id!
      Must manage local disk space
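      A hypothetical sketch of the two points above: shard assignment by hashing the document id, and a per-attempt local directory so a retried reducer never collides with leftovers (paths, shard count, and helper names are illustrative, not from the talk):

      import hashlib
      import os
      import uuid

      NUM_SHARDS = 16   # illustrative shard count

      def shard_for(doc_id: str) -> int:
          # Mapper side: the shard is a hash of the document id
          return int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16) % NUM_SHARDS

      def local_index_dir(shard: int) -> str:
          # Reducer side: key the local directory on something unique to this attempt,
          # not the shard id alone, so retries never collide with a failed attempt's files
          return os.path.join("/tmp", f"index-shard{shard:04d}-{uuid.uuid4().hex}")

      shard = shard_for("doc-42")
      print(shard, local_index_dir(shard))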
    • Conventional data flows
      [Diagram: input documents → Map → Reducer (local disk) → clustered index storage → (local disk) Search Engine]
      Failure of a reducer causes garbage to accumulate in the local disk
      Failure of search engine requires another download of the index from clustered storage.
    • Simplified NFS data flows
      [Diagram: input documents → Map → Reducer → clustered index storage → Search Engine]
      Failure of a reducer is cleaned up by map-reduce framework
      Search engine reads mirrored index directly.
    • Application to machine learning
      So now we have the hammer
      Let’s see some nails!
    • K-means
      Classic E-M based algorithm
      Given cluster centroids,
      Assign each data point to nearest centroid
      Accumulate new centroids
      Rinse, lather, repeat
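      A minimal single-node numpy sketch of this E-M loop (illustrative only, not the talk's map-reduce implementation):

      import numpy as np

      def kmeans(points: np.ndarray, k: int, iters: int = 10) -> np.ndarray:
          rng = np.random.default_rng(0)
          centroids = points[rng.choice(len(points), size=k, replace=False)]
          for _ in range(iters):                      # rinse, lather, repeat
              # E-step: assign each data point to the nearest centroid
              dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
              nearest = dists.argmin(axis=1)
              # M-step: accumulate new centroids as the mean of each cluster's points
              for j in range(k):
                  members = points[nearest == j]
                  if len(members):
                      centroids[j] = members.mean(axis=0)
          return centroids

      points = np.random.default_rng(1).normal(size=(500, 2))
      print(kmeans(points, k=3))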
    • K-means, the movie
      [Diagram: Input → assign to nearest centroid → aggregate new centroids → Centroids, looping]
    • But …
    • Parallel Stochastic Gradient Descent
      [Diagram: Input → train sub-models → average models → Model]
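      A toy numpy sketch of the train-sub-models-then-average scheme in the diagram, with partitions processed serially here for clarity; in the map-reduce setting each partition would be one mapper (data, learning rate, and names are illustrative):

      import numpy as np

      def sgd_submodel(X, y, lr=0.01, epochs=5):
          # Train one sub-model on one partition with plain SGD (least squares)
          w = np.zeros(X.shape[1])
          for _ in range(epochs):
              for xi, yi in zip(X, y):
                  w -= lr * (xi @ w - yi) * xi    # gradient of 0.5 * (x.w - y)^2
          return w

      rng = np.random.default_rng(0)
      true_w = np.array([2.0, -3.0])
      X = rng.normal(size=(1000, 2))
      y = X @ true_w + 0.1 * rng.normal(size=1000)

      partitions = np.array_split(np.arange(1000), 4)                    # four "mappers"
      sub_models = [sgd_submodel(X[idx], y[idx]) for idx in partitions]  # train sub-models
      print(np.mean(sub_models, axis=0))                                 # average models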
    • Variational Dirichlet Assignment
      [Diagram: Input → gather sufficient statistics → update model → Model]
    • Old tricks, new dogs
      Mapper
      Assign point to cluster
      Emit cluster id, (1, point)
      Combiner and reducer
      Sum counts, weighted sum of points
      Emit cluster id, (n, sum/n)
      Output to HDFS
      Read from local disk from distributed cache
      Read from HDFS to local disk by distributed cache
      Written by map-reduce
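      A hedged sketch of the mapper and combiner/reducer contract on this slide: the mapper emits (cluster id, (1, point)), and the combiner/reducer folds those pairs into (n, sum/n) so partial results can be combined at any level:

      import numpy as np

      def map_point(point, centroids):
          # Mapper: assign the point to the nearest cluster, emit (cluster id, (1, point))
          cluster_id = int(np.linalg.norm(centroids - point, axis=1).argmin())
          return cluster_id, (1, point)

      def combine(values):
          # Combiner/reducer: sum counts and weighted point sums, emit (n, sum/n).
          # Works on mapper output (count = 1) and on earlier combiner output alike.
          n, weighted_sum = 0, 0.0
          for count, mean in values:
              n += count
              weighted_sum = weighted_sum + count * mean
          return n, weighted_sum / n

      centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
      points = np.random.default_rng(2).normal(size=(10, 2))
      emitted = [map_point(p, centroids) for p in points]
      print(combine(value for key, value in emitted if key == 0))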
    • Old tricks, new dogs
      Mapper
      Assign point to cluster
      Emit cluster id, 1, point
      Combiner and reducer
      Sum counts, weighted sum of points
      Emit cluster id, n, sum/n
      Output to HDFS
      Read from NFS
      Written by map-reduce
      MapR FS
    • Click modeling architecture
      [Diagram: Input → feature extraction and down-sampling (map-reduce) → data join (map-reduce) → sequential SGD learning; side-data now delivered via NFS]
    • Poor man’s Pregel
      Mapper
      Lines in bold can use conventional I/O via NFS
      while not done:
          read and accumulate input models
          for each input:
              accumulate model
          write model
          synchronize
          reset input format
      emit summary
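      A hypothetical sketch of what the "conventional I/O via NFS" could look like: model files read and written as ordinary files under an NFS-visible path. The /mapr path, the pickle format, and the polling barrier are illustrative assumptions, not from the talk:

      import glob
      import pickle
      import time

      MODEL_DIR = "/mapr/my.cluster/pregel/models"    # hypothetical NFS-visible directory

      def read_and_accumulate_models(iteration):
          # Read every worker's model from the previous superstep and sum them
          model = {}
          for path in glob.glob(f"{MODEL_DIR}/iter{iteration - 1:03d}-*.pkl"):
              with open(path, "rb") as f:
                  for key, value in pickle.load(f).items():
                      model[key] = model.get(key, 0.0) + value
          return model

      def write_model(model, iteration, worker_id):
          # Write this worker's model where all other workers can read it over NFS
          with open(f"{MODEL_DIR}/iter{iteration:03d}-{worker_id}.pkl", "wb") as f:
              pickle.dump(model, f)

      def synchronize(iteration, num_workers):
          # Crude barrier: poll until every worker has written this superstep's model
          while len(glob.glob(f"{MODEL_DIR}/iter{iteration:03d}-*.pkl")) < num_workers:
              time.sleep(1)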
    • Trivial visualization interface
      Map-reduce output is visible via NFS
      Legacy visualization just works
      $ R
      > x <- read.csv("/mapr/my.cluster/home/ted/data/foo.out")
      > plot(error ~ t, x)
      > q(save='n')
    • Conclusions
      We used to know all this
      Tab completion used to work
      5 years of work-arounds have clouded our memories
      We just have to remember the future