
- 1. An NSA Big Graph experiment
  Paul Burkhardt, Chris Waring
  U.S. National Security Agency, Research Directorate - R6
  Technical Report NSA-RD-2013-056002v1
  May 20, 2013
- 2. Graphs are everywhere! A graph is a collection of binary relationships, i.e. networks of pairwise interactions, including social networks, digital networks, road networks, utility grids, the internet, and protein interactomes. (Images: M. tuberculosis interactome, Vashisht, PLoS ONE 7(7), 2012; brain network of C. elegans, Watts & Strogatz, Nature 393(6684), 1998.)
- 3. Scale of the first graph. Nearly 300 years ago, the first graph problem consisted of 4 vertices and 7 edges: the Seven Bridges of Königsberg problem. Crossing the River Pregel: is it possible to cross each of the Seven Bridges of Königsberg exactly once? Not too hard to fit in "memory".
- 4. Scale of real-world graphs. Graph scale in the current Computer Science literature is on the order of billions of edges, tens of gigabytes:
  - AS by Skitter (AS-Skitter): internet topology in 2005 (n = routers, m = traceroute links)
  - LiveJournal (LJ): social network (n = members, m = friendships)
  - U.S. Road Network (USRD): road network (n = intersections, m = roads)
  - Billion Triple Challenge (BTC): RDF dataset 2009 (n = objects, m = relationships)
  - WWW of UK (WebUK): Yahoo Web spam dataset (n = pages, m = hyperlinks)
  - Twitter graph (Twitter): Twitter network (n = users, m = tweets); http://an.kaist.ac.kr/traces/WWW2010.html
  - Yahoo! Web Graph (YahooWeb): WWW pages in 2002 (n = pages, m = hyperlinks)

  Popular graph datasets in the current literature:

  | Dataset | n (vertices, millions) | m (edges, millions) | Size |
  |---|---|---|---|
  | AS-Skitter | 1.7 | 11 | 142 MB |
  | LJ | 4.8 | 69 | 337.2 MB |
  | USRD | 24 | 58 | 586.7 MB |
  | BTC | 165 | 773 | 5.3 GB |
  | WebUK | 106 | 1877 | 8.6 GB |
  | Twitter | 42 | 1470 | 24 GB |
  | YahooWeb | 1413 | 6636 | 120 GB |
- 5. Big Data begets Big Graphs. The increasing volume, velocity, and variety of Big Data are significant challenges to scalable algorithms. How will graph applications adapt to Big Data at petabyte scale? The ability to store and process Big Graphs impacts typical data structures.
  Orders of magnitude: kilobyte (KB) = 2^10, megabyte (MB) = 2^20, gigabyte (GB) = 2^30, terabyte (TB) = 2^40, petabyte (PB) = 2^50, exabyte (EB) = 2^60.
  Undirected graph data structure space complexity, in Θ(bytes): Θ(n^2) for an adjacency matrix, Θ(n + 4m) for an adjacency list, Θ(4m) for an edge list.
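The space complexities above can be turned into concrete byte counts. The sketch below is a rough calculation that reproduces the numbers on the next three slides; the constants (a 1-bit-per-entry adjacency matrix, 8-byte words, 4 words per edge) are inferred from those slides, not stated explicitly in the deck:

```python
# Rough storage estimates for an undirected graph with n vertices and m edges.
# Assumptions (inferred from the slides): the adjacency matrix packs 1 bit per
# entry; adjacency and edge lists use 8-byte words, with 4 words per edge.

def graph_storage_bytes(n, m):
    return {
        "adjacency_matrix": n * n / 8,      # n^2 bits -> bytes
        "adjacency_list": 8 * (n + 4 * m),  # one word per vertex + 4 per edge
        "edge_list": 8 * 4 * m,             # 4 words per edge
    }

TB, PB = 2**40, 2**50

# "Social scale": 1 billion vertices, 100 billion edges.
social = graph_storage_bytes(10**9, 10**11)
print(round(social["adjacency_matrix"] / PB))    # ~111 PB
print(round(social["adjacency_list"] / TB, 2))   # ~2.92 TB
print(round(social["edge_list"] / TB, 2))        # ~2.91 TB
```

The same function reproduces the web-scale (271 EB matrix, 29.5 TB adjacency list) and brain-scale (2.84 PB lists) figures.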
- 6. Social scale: 1 billion vertices, 100 billion edges. 111 PB adjacency matrix; 2.92 TB adjacency list; 2.92 TB edge list. (Twitter graph image from the Gephi dataset, http://www.gephi.org)
- 7. Web scale: 50 billion vertices, 1 trillion edges. 271 EB adjacency matrix; 29.5 TB adjacency list; 29.1 TB edge list. (Internet graph from the Opte Project, http://www.opte.org/maps; web graph from the SNAP database, http://snap.stanford.edu/data)
- 8. Brain scale: 100 billion vertices, 100 trillion edges. 2.08 mNA bytes (molar bytes, where NA = 6.022 × 10^23 mol^-1) adjacency matrix; 2.84 PB adjacency list; 2.84 PB edge list. (Human connectome image: Gerhard et al., Frontiers in Neuroinformatics 5(3), 2011)
- 9. Benchmarking scalability on Big Graphs. Big Graphs challenge our conventional thinking on both algorithms and computer architecture. The new Graph500.org benchmark provides a foundation for conducting experiments on graph datasets, with problem classes from 17 GB to 1 PB: many times larger than common datasets in the literature.
- 10. Graph algorithms are challenging. They are difficult to parallelize: irregular data access increases latency, and skewed data distribution (a giant component, high-degree vertices) creates bottlenecks. Increased size imposes greater latency and resource contention (i.e. hot-spotting). Algorithm complexity really matters: a run time of O(n^2) on a trillion-node graph is not practical!
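To make "not practical" concrete, a back-of-the-envelope check, assuming a hypothetical machine sustaining 10^9 operations per second (a rate chosen for illustration, not taken from the deck):

```python
# How long would an O(n^2) algorithm take on a trillion-vertex graph,
# assuming exactly n^2 operations at a hypothetical 10^9 operations/second?
n = 10**12
ops = n ** 2                       # 10^24 operations
seconds = ops / 10**9              # 10^15 seconds
years = seconds / (365 * 24 * 3600)
print(f"{years:.1e} years")        # ~3.2e7 years
```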
- 11. Problem: how do we store and process Big Graphs? The conventional approach is to store and compute in memory.
  Shared memory: the Parallel Random Access Machine (PRAM) model; data in globally shared memory; implicit communication by updating memory; fast random access.
  Distributed memory: the Bulk Synchronous Parallel (BSP) model; data distributed to local, private memory; explicit communication by sending messages; easier to scale by adding more machines.
- 12. Memory is fast, but algorithms must exploit the computer memory hierarchy (registers; L1, L2, L3 cache; TLB; pages; disk), which is designed for spatial and temporal locality. That design is great for the unit-stride access common in many scientific codes, e.g. linear algebra, but common graph algorithm implementations make lots of random accesses to memory, causing many cache and TLB misses.
- 13.-14. Poor locality increases latency. QUESTION: What is the memory throughput if 90% of accesses hit the TLB, and 0.01% of TLB misses incur a page fault?
  Effective memory access time: T_n = p_n l_n + (1 - p_n) T_{n-1}
  Example: TLB = 20 ns, RAM = 100 ns, DISK = 10 ms (10 × 10^6 ns).
  T_2 = p_2 l_2 + (1 - p_2)(p_1 l_1 + (1 - p_1) T_0)
      = .9(TLB + RAM) + .1(.9999(TLB + 2 RAM) + .0001(DISK))
      = .9(120 ns) + .1(.9999(220 ns) + 1000 ns) = 230 ns
  ANSWER: 33 MB/s
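The two-level access-time calculation can be checked numerically. A small sketch; note the 33 MB/s answer additionally assumes 8 bytes transferred per access, which the slide does not state:

```python
# Effective memory access time T2 with TLB, RAM, and disk levels:
# 90% of accesses hit the TLB; on a TLB miss, 0.01% page-fault to disk.
TLB, RAM, DISK = 20, 100, 10 * 10**6   # latencies in nanoseconds

t2 = 0.9 * (TLB + RAM) + 0.1 * (0.9999 * (TLB + 2 * RAM) + 0.0001 * DISK)
print(round(t2))                        # ~230 ns per access

# Throughput, assuming 8 bytes moved per random access:
mb_per_s = 8 / (t2 * 1e-9) / 2**20
print(round(mb_per_s))                  # ~33 MB/s
```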
- 15. If it fits... Graph problems that fit in memory can leverage excellent advances in architecture and libraries: the Cray XMT2, designed for latency hiding; the SGI UV2, designed for large, cache-coherent shared memory; and a body of literature and libraries, including the Parallel Boost Graph Library (PBGL, Indiana University), the Multithreaded Graph Library (MTGL, Sandia National Labs), GraphCT/STINGER (Georgia Tech), GraphLab (Carnegie Mellon University), and Giraph (Apache Software Foundation). But some graphs do not fit in memory...
- 16. We can add more memory, but memory capacity is limited by the number of CPU pins, memory controller channels, and DIMMs per channel, and by memory bus width (parallel traces of the same length, one line per bit). Globally shared memory is further limited by CPU address space and cache coherency.
- 17. Larger systems, greater latency. Increasing memory can increase latency: there are more memory addresses to traverse, and a larger system puts greater physical distance between machines. Light is only so fast (but still faster than neutrinos!): it takes approximately 1 nanosecond for light to travel 0.30 meters. Latency causes significant inefficiency in new CPU architectures: a single 1.6 GHz IBM PowerPC A2 can perform 204.8 operations per nanosecond!
- 18. It is easier to increase capacity using disks. Current Intel Xeon E5 architectures: 384 GB max. per CPU (4 channels × 3 DIMMs × 32 GB); 64 TB max. globally shared memory (46-bit address space); 3881 dual Xeon E5 motherboards to store the Brain Graph, filling 98 racks. Disk capacity is not unlimited, but it is higher than memory: with the largest-capacity HDD on the market, 4 TB, the Brain Graph needs 728 drives and fits in 5 racks (4 drives per chassis). But disk is not enough; applications will still require memory for processing.
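The motherboard and drive counts can be reproduced from the Brain Graph's edge list on slide 8 (2.84 PB, i.e. 3.2 × 10^15 bytes). A quick check, assuming 768 GB per dual-CPU board (2 × 384 GB), which is an inference from the slide's per-CPU maximum:

```python
import math

# Brain Graph edge list: 100 trillion edges at 32 bytes each, ~2.84 PB.
edge_list_bytes = 8 * 4 * 10**14
GB, TB = 2**30, 2**40

# Dual Xeon E5 boards at 2 x 384 GB = 768 GB each:
boards = math.ceil(edge_list_bytes / (768 * GB))
print(boards)                            # 3881

# 4 TB hard drives:
drives = math.ceil(edge_list_bytes / (4 * TB))
print(drives)                            # 728
```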
- 19. Top supercomputer installations. The largest supercomputer installations do not have enough memory to process the Brain Graph (3 PB)!
  Titan, a Cray XK7 at ORNL (#1 on the Top500 in 2012): 0.5 million cores, 710 TB memory, 8.2 megawatts, 4300 sq. ft. (an NBA basketball court is 4700 sq. ft.).
  Sequoia, an IBM Blue Gene/Q at LLNL (#1 on the Graph500 in 2012): 1.5 million cores, 1 PB memory, 7.9 megawatts, 3000 sq. ft.
  Electrical power cost: at 10 cents per kilowatt-hour, $7 million per year just to keep the lights on! (Power specs from http://www.top500.org/list/2012/11)
- 20. Cloud architectures can cope with graphs at Big Data scales. Cloud technologies are designed for Big Data problems: scalable to massive sizes; fault-tolerant (restarting algorithms on Big Graphs is expensive); and a simpler programming model (MapReduce). And Big Graphs from real data are not static...
- 21. Distributed key,value repositories are much better for storing a distributed, dynamic graph than a distributed filesystem like HDFS: create an index on edges for fast random access, and sort edges for efficient block-sequential access. Updates can be fast, but they are not transactional, only eventually consistent.
- 22. NSA BigTable: Apache Accumulo. The Google BigTable paper inspired a small group of NSA researchers to develop an implementation with cell-level security. Apache Accumulo was open-sourced in 2011 under the Apache Software Foundation.
- 23. Using Apache Accumulo for Big Graphs. Accumulo records are flexible and can be used to store graphs with typed edges and weights: store the graph as a distributed edge list in an Edge Table; edges are natural key,value pairs, i.e. vertex pairs; updates to edges happen in memory and are immediately available to queries; and updates are written to write-ahead logs for fault tolerance.
  Apache Accumulo key,value record: KEY = (ROW ID; COLUMN, consisting of FAMILY, QUALIFIER, VISIBILITY; TIMESTAMP), mapped to VALUE.
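As an illustration of the Edge Table idea, here is a schematic in plain Python, not Accumulo's actual API. The key fields follow the record structure on the slide; the choice of row id = source vertex, family = edge type, qualifier = destination, and the example edges themselves, are illustrative assumptions:

```python
# Schematic of a typed, weighted edge as an Accumulo-style record:
# key = (row id, column family, column qualifier, visibility, timestamp).
import time

def edge_record(src, edge_type, dst, weight, visibility="public"):
    key = (src, edge_type, dst, visibility, int(time.time()))
    return key, str(weight)

edge_table = {}
for src, etype, dst, w in [("a", "follows", "b", 1), ("a", "cites", "c", 3)]:
    key, value = edge_record(src, etype, dst, w)
    edge_table[key] = value

# Row-oriented scan: all edges out of vertex "a" (keys sort by row id first,
# which is what makes per-vertex adjacency lookups a contiguous range scan).
out_edges = [k for k in sorted(edge_table) if k[0] == "a"]
print(len(out_edges))   # 2
```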
- 24. But for bulk processing, use MapReduce. Big Graph processing is well suited to MapReduce: large, bulk writes to HDFS are faster (no need to sort); there is temporary scratch space (local writes); and there is greater control over parallelization of algorithms.
- 25. Quick overview of MapReduce. MapReduce processes data as key,value pairs in three steps:
  1. MAP: map tasks independently process blocks of data in parallel and output key,value pairs, executing a user-defined "map" function.
  2. SHUFFLE: the framework sorts key,value pairs and distributes them to reduce tasks, ensuring the value set for a key is processed by one reduce task; it executes a user-defined "partitioner" function.
  3. REDUCE: reduce tasks process their assigned (key, value set) in parallel, executing a user-defined "reduce" function.
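The three steps can be simulated in a few lines. A toy sketch (plain Python, not Hadoop code) that computes vertex degrees from an edge list, with map, shuffle, and reduce as explicit phases:

```python
from collections import defaultdict

# Input: blocks of undirected edges; output: (vertex, degree) pairs.
edges = [("a", "b"), ("a", "c"), ("b", "c")]

# MAP: each edge emits both endpoints as (key, value) pairs.
mapped = [(v, 1) for u, w in edges for v in (u, w)]

# SHUFFLE: group values by key, so one "reduce task" sees each key's values.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# REDUCE: sum each vertex's occurrences to get its degree.
degrees = {vertex: sum(vals) for vertex, vals in groups.items()}
print(degrees)   # {'a': 2, 'b': 2, 'c': 2}
```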
- 26. MapReduce is not great for iterative algorithms. Iterative algorithms are implemented as a sequence of MapReduce jobs, i.e. rounds, but iteration in MapReduce is expensive: temporary data is written to disk; the sort/shuffle may be unnecessary for each round; and the framework adds overhead for scheduling tasks.
- 27. Good principles for MapReduce algorithms. Be prepared to Think in MapReduce:
  - avoid iteration and minimize the number of rounds
  - limit child JVM memory; the number of concurrent tasks is limited by per-machine RAM
  - set IO buffers carefully to avoid spills (this requires memory!)
  - pick a good partitioner
  - write raw comparators
  - leverage compound keys: distributing on the key minimizes hot-spots, and a secondary sort on compound keys is almost free
  Round-memory tradeoff: aim for constant, O(1), in both memory and rounds.
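The compound-key point deserves a concrete illustration. A minimal, framework-free sketch with invented (vertex, distance) records: partition on the vertex component only, so all of a vertex's records land on one reducer, then let the full-key sort deliver the distance order "for free":

```python
# Secondary sort with compound keys (vertex, distance): partition on the
# vertex only, so every record for a vertex reaches the same reducer, but
# sort on the full compound key so values arrive ordered by distance.
records = [("v2", 3), ("v1", 2), ("v1", 0), ("v2", 1), ("v1", 1)]
num_reducers = 2

partitions = [[] for _ in range(num_reducers)]
for key in records:
    # Partitioner: hash only the first component of the compound key.
    partitions[hash(key[0]) % num_reducers].append(key)

for p in partitions:
    p.sort()  # full compound-key sort: per-vertex distances now come in order
```

Each reducer then iterates, e.g., ("v1", 0), ("v1", 1), ("v1", 2) contiguously, without buffering and re-sorting values itself.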
- 28. Best of both worlds: MapReduce and Accumulo. Store the Big Graph in an Accumulo Edge Table, which is great for updating a big, typed multi-graph and gives more timely (lower-latency) access to the graph. Then extract a sub-graph from the Edge Table and use MapReduce for graph analysis.
- 29. How well does this really work? Prove it on the industry benchmark, Graph500.org: Breadth-First Search (BFS) on an undirected R-MAT graph, counting Traversed Edges Per Second (TEPS), with n = 2^scale and m = 16 × n. [Chart: storage in terabytes vs. problem-class scale.]
  Graph500 problem sizes:

  | Class | Scale | Storage |
  |---|---|---|
  | Toy | 26 | 17 GB |
  | Mini | 29 | 140 GB |
  | Small | 32 | 1 TB |
  | Medium | 36 | 17 TB |
  | Large | 39 | 140 TB |
  | Huge | 42 | 1.1 PB |
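The problem-class sizes follow directly from the scale parameter. A quick sketch, assuming 16 bytes per edge; that constant is an inference, chosen because it reproduces the storage column (scale 26 gives 17 GB, scale 42 gives 1.1 PB, in decimal units):

```python
# Graph500 problem size for a given scale: n = 2^scale vertices, m = 16n
# edges; storage assumes 16 bytes per edge (inferred from the table).
def graph500_size(scale):
    n = 2 ** scale
    m = 16 * n
    return n, m, 16 * m   # vertices, edges, bytes

n, m, b = graph500_size(42)   # the Huge class
print(f"{n / 1e12:.2f} trillion vertices")   # ~4.40 trillion
print(f"{m / 1e12:.1f} trillion edges")      # ~70.4 trillion
print(f"{b / 1e15:.1f} PB")                  # ~1.1 PB
```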
- 30. Walking a ridiculously Big Graph. Everything is harder at scale! Performance is dominated by the ability to acquire adjacencies, and requires a specialized MapReduce partitioner to load-balance queries from MapReduce to the Accumulo Edge Table. Space-efficient Breadth-First Search is critical: create vertex,distance records; slide a window over the distance records to minimize input at each iteration k; and have reduce tasks fetch adjacencies from the Edge Table, but only if all distance values for a vertex v are equal to k.
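The space-efficient BFS can be sketched as iterative rounds over (vertex, distance) records, with adjacency lookups only for the current frontier. A toy in-memory version; the real implementation streams these records through MapReduce rounds and fetches adjacencies from the Accumulo Edge Table, and the graph here is invented:

```python
# BFS as rounds over (vertex, distance) records: in round k, only vertices
# whose distance equals k fetch adjacencies (the frontier), mimicking the
# conditional Edge Table lookup described on the slide.
adjacency = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}

def bfs(source):
    distance = {source: 0}
    k = 0
    while True:
        frontier = [v for v, d in distance.items() if d == k]
        if not frontier:              # no records at distance k: done
            return distance
        for v in frontier:
            for w in adjacency.get(v, []):    # adjacency fetch for frontier
                distance.setdefault(w, k + 1) # first visit fixes the distance
        k += 1

print(bfs("a"))   # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```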
- 31.-34. Graph500 Experiment (shown as an incremental build across four slides). Graph500 Huge class, scale 42: 2^42 (4.40 trillion) vertices, 2^46 (70.4 trillion) edges, 1 petabyte. Cluster specs: 1200 nodes, 2 Intel quad-core CPUs per node, 48 GB RAM per node, 7200 RPM SATA drives. The Huge problem is 19.5× larger than cluster memory (57.6 TB), yet performance is linear from 1 trillion to 70 trillion edges, despite multiple hardware failures! [Chart, "Graph500 Scalability Benchmark": traversed edges (trillions) vs. time (hours) for problem classes at scales 36, 39, and 42 (17 TB, 140 TB, 1.1 PB), with the 57.6 TB memory line marked.]
- 35. Future work. What are the tradeoffs between disk- and memory-based Big Graph solutions, in power, space, and cooling efficiency, and in software development and maintenance? Can a hybrid approach be viable: memory-based processing for subgraphs or incremental updates, and disk-based processing for global analysis? Advances in memory and disk technologies will blur the lines. Will solid-state disks become another level of memory?
