Big graph analysis issues NSA: Presentation Transcript

  • An NSA Big Graph experiment
    Paul Burkhardt, Chris Waring
    U.S. National Security Agency
    Research Directorate - R6
    Technical Report NSA-RD-2013-056002v1
    May 20, 2013
  • Graphs are everywhere!
    A graph is a collection of binary relationships, i.e. networks of pairwise
    interactions, including social networks, digital networks...
      road networks
      utility grids
      internet
      protein interactomes
    [Figure: M. tuberculosis protein interactome (Vashisht, PLoS ONE 7(7), 2012)]
    [Figure: Brain network of C. elegans (Watts, Strogatz, Nature 393(6684), 1998)]
  • Scale of the first graph
    Nearly 300 years ago the first graph problem consisted of 4 vertices and
    7 edges—the Seven Bridges of Königsberg problem.
    Crossing the River Pregel: is it possible to cross each of the Seven
    Bridges of Königsberg exactly once?
    Not too hard to fit in "memory".
  • Scale of real-world graphs
    Graph scale in current Computer Science literature: on the order of
    billions of edges, tens of gigabytes.
      AS by Skitter (AS-Skitter) — internet topology in 2005 (n = router, m = traceroute)
      LiveJournal (LJ) — social network (n = members, m = friendship)
      U.S. Road Network (USRD) — road network (n = intersections, m = roads)
      Billion Triple Challenge (BTC) — RDF dataset 2009 (n = object, m = relationship)
      WWW of UK (WebUK) — Yahoo Web spam dataset (n = pages, m = hyperlinks)
      Twitter graph (Twitter) — Twitter network [1] (n = users, m = tweets)
      Yahoo! Web Graph (YahooWeb) — WWW pages in 2002 (n = pages, m = hyperlinks)
    Popular graph datasets in current literature:
      Dataset      n (vertices, millions)   m (edges, millions)   size
      AS-Skitter          1.7                       11            142 MB
      LJ                  4.8                       69            337.2 MB
      USRD               24                         58            586.7 MB
      BTC               165                        773            5.3 GB
      WebUK             106                       1877            8.6 GB
      Twitter            42                       1470            24 GB
      YahooWeb         1413                       6636            120 GB
    [1] http://an.kaist.ac.kr/traces/WWW2010.html
  • Big Data begets Big Graphs
    Increasing volume, velocity, and variety of Big Data are significant
    challenges to scalable algorithms.
    Big Data Graphs: how will graph applications adapt to Big Data at petabyte
    scale? The ability to store and process Big Graphs impacts typical data
    structures.
    Orders of magnitude:
      kilobyte (KB) = 2^10    terabyte (TB) = 2^40
      megabyte (MB) = 2^20    petabyte (PB) = 2^50
      gigabyte (GB) = 2^30    exabyte (EB)  = 2^60
    Undirected graph data structure space complexity: Θ(bytes per entry) ×
      Θ(n^2)       adjacency matrix
      Θ(n + 4m)    adjacency list
      Θ(4m)        edge list
  • Social scale...
    1 billion vertices, 100 billion edges
      111 PB   adjacency matrix
      2.92 TB  adjacency list
      2.92 TB  edge list
    [Figure: Twitter graph from the Gephi dataset (http://www.gephi.org)]
  • Web scale...
    50 billion vertices, 1 trillion edges
      271 EB   adjacency matrix
      29.5 TB  adjacency list
      29.1 TB  edge list
    [Figure: Internet graph from the Opte Project (http://www.opte.org/maps)]
    [Figure: Web graph from the SNAP database (http://snap.stanford.edu/data)]
  • Brain scale...
    100 billion vertices, 100 trillion edges
      2.08 mN_A · bytes [2] (molar bytes)  adjacency matrix
      2.84 PB  adjacency list
      2.84 PB  edge list
    [Figure: Human connectome (Gerhard et al., Frontiers in Neuroinformatics 5(3), 2011)]
    [2] N_A = 6.022 × 10^23 mol^-1
    (A sketch reproducing the storage figures on these three slides follows below.)
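The storage figures on the three scale slides follow directly from the space
complexities above. A minimal sketch, assuming 8-byte vertex identifiers for
the list structures and one bit per adjacency-matrix entry (assumptions that
reproduce the quoted numbers):

    // Sketch: estimate storage for adjacency matrix / adjacency list / edge list.
    // Assumes a bit-packed adjacency matrix and 8-byte vertex identifiers.
    public class GraphStorageEstimate {
        static String human(double bytes) {
            String[] units = {"B", "KB", "MB", "GB", "TB", "PB", "EB"};
            int i = 0;
            while (bytes >= 1024 && i < units.length - 1) { bytes /= 1024; i++; }
            return String.format("%.2f %s", bytes, units[i]);
        }

        static void estimate(String name, double n, double m) {
            double matrixBytes   = n * n / 8.0;          // Theta(n^2) entries, 1 bit each
            double adjListBytes  = 8.0 * (n + 4.0 * m);  // Theta(n + 4m) 8-byte words
            double edgeListBytes = 8.0 * (4.0 * m);      // Theta(4m) 8-byte words
            System.out.printf("%s: matrix %s, adjacency list %s, edge list %s%n",
                    name, human(matrixBytes), human(adjListBytes), human(edgeListBytes));
        }

        public static void main(String[] args) {
            estimate("Social scale", 1e9,   100e9);  // ~111 PB, ~2.92 TB, ~2.91 TB
            estimate("Web scale",    50e9,  1e12);   // ~271 EB, ~29.5 TB, ~29.1 TB
            estimate("Brain scale",  100e9, 100e12); // matrix ~1084 EB (2.08 mN_A bytes), lists ~2.84 PB
        }
    }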
  • Benchmarking scalability on Big Graphs
    Big Graphs challenge our conventional thinking on both algorithms and
    computer architecture.
    The new Graph500.org benchmark provides a foundation for conducting
    experiments on graph datasets.
    Graph500 benchmark: problem classes from 17 GB to 1 PB — many times larger
    than common datasets in the literature.
  • Graph algorithms are challenging
    Difficult to parallelize...
      irregular data access increases latency
      skewed data distribution creates bottlenecks
        giant component
        high-degree vertices
    Increased size imposes greater...
      latency
      resource contention (i.e. hot-spotting)
    Algorithm complexity really matters! A run time of O(n^2) on a
    trillion-node graph is not practical!
  • Problem: how do we store and process Big Graphs?
    The conventional approach is to store and compute in memory.
    SHARED-MEMORY: Parallel Random Access Machine (PRAM)
      data in globally-shared memory
      implicit communication by updating memory
      fast random access
    DISTRIBUTED-MEMORY: Bulk Synchronous Parallel (BSP)
      data distributed to local, private memory
      explicit communication by sending messages
      easier to scale by adding more machines
  • Memory is fast but...
    Algorithms must exploit the computer memory hierarchy:
      designed for spatial and temporal locality
      registers, L1/L2/L3 cache, TLB, pages, disk...
      great for the unit-stride access common in many scientific codes,
      e.g. linear algebra
    But common graph algorithm implementations have...
      lots of random access to memory, causing...
      many cache and TLB misses
  • Poor locality increases latency...
    QUESTION: What is the memory throughput with a 90% TLB hit rate and a
    0.01% page-fault rate on a miss?
    Effective memory access time:
      T_n = p_n * l_n + (1 - p_n) * T_{n-1}
    Example: TLB = 20 ns, RAM = 100 ns, DISK = 10 ms (10 × 10^6 ns)
      T_2 = p_2 * l_2 + (1 - p_2)(p_1 * l_1 + (1 - p_1) * T_0)
          = 0.9(TLB + RAM) + 0.1(0.9999(TLB + 2 RAM) + 0.0001(DISK))
          = 0.9(120 ns) + 0.1(0.9999(220 ns) + 1000 ns) = 230 ns
    ANSWER: 33 MB/s (see the check below)
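The quoted answer follows from dividing the data moved per access by the
effective access time. A minimal check, assuming one 8-byte word is
transferred per access (the word size is an assumption, not stated on the
slide):

    // Sketch: effective memory access time and the resulting throughput.
    // Assumes one 8-byte word per access; that assumption yields ~33 MB/s.
    public class EffectiveAccessTime {
        public static void main(String[] args) {
            double TLB = 20, RAM = 100, DISK = 10e6;   // latencies in nanoseconds
            double pTlbHit = 0.90, pInMemory = 0.9999; // hit rates

            // T2 = p2*l2 + (1 - p2) * (p1*l1 + (1 - p1)*T0)
            double t2 = pTlbHit * (TLB + RAM)
                      + (1 - pTlbHit) * (pInMemory * (TLB + 2 * RAM)
                                         + (1 - pInMemory) * DISK);

            double bytesPerAccess = 8.0;
            double throughputMBps = bytesPerAccess / (t2 * 1e-9) / (1 << 20);
            System.out.printf("effective access time = %.0f ns, throughput = %.0f MB/s%n",
                    t2, throughputMBps); // ~230 ns, ~33 MB/s
        }
    }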
  • If it fits...
    Graph problems that fit in memory can leverage excellent advances in
    architecture and libraries...
      Cray XMT2, designed for latency hiding
      SGI UV2, designed for large, cache-coherent shared memory
      a body of literature and libraries:
        Parallel Boost Graph Library (PBGL) — Indiana University
        Multithreaded Graph Library (MTGL) — Sandia National Labs
        GraphCT/STINGER — Georgia Tech
        GraphLab — Carnegie Mellon University
        Giraph — Apache Software Foundation
    But some graphs do not fit in memory...
  • We can add more memory but...
    Memory capacity is limited by...
      the number of CPU pins, memory controller channels, and DIMMs per channel
      memory bus width — parallel traces of the same length, one line per bit
    Globally-shared memory is limited by...
      CPU address space
      cache coherency
  • Larger systems, greater latency...
    Increasing memory can increase latency:
      traverse more memory addresses
      a larger system puts greater physical distance between machines
    Light is only so fast (but still faster than neutrinos!): it takes
    approximately 1 nanosecond for light to travel 0.30 meters.
    Latency causes significant inefficiency in new CPU architectures.
    IBM PowerPC A2: a single 1.6 GHz PowerPC A2 can perform 204.8 operations
    per nanosecond!
  • Easier to increase capacity using disks
    Current Intel Xeon E5 architectures:
      384 GB max. per CPU (4 channels × 3 DIMMs × 32 GB)
      64 TB max. globally-shared memory (46-bit address space)
      3881 dual Xeon E5 motherboards to store the Brain Graph — 98 racks
    Disk capacity is not unlimited, but it is higher than memory:
      largest-capacity HDD on the market is 4 TB — need 728 to store the Brain
      Graph, which fits in 5 racks (4 drives per chassis)
    Disk is not enough—applications will still require memory for processing.
  • Top supercomputer installations
    The largest supercomputer installations do not have enough memory to
    process the Brain Graph (3 PB)!
    Titan Cray XK7 at ORNL — #1 Top500 in 2012
      0.5 million cores
      710 TB memory
      8.2 Megawatts [3]
      4300 sq.ft. (an NBA basketball court is 4700 sq.ft.)
    Sequoia IBM Blue Gene/Q at LLNL — #1 Graph500 in 2012
      1.5 million cores
      1 PB memory
      7.9 Megawatts [3]
      3000 sq.ft.
    Electrical power cost: at 10 cents per kilowatt-hour, that is $7 million
    per year to keep the lights on! [3]
    [3] Power specs from http://www.top500.org/list/2012/11
  • Cloud architectures can cope with graphs at Big Data scales
    Cloud technologies are designed for Big Data problems:
      scalable to massive sizes
      fault-tolerant; restarting algorithms on Big Graphs is expensive
      simpler programming model: MapReduce
    Big Graphs from real data are not static...
  • Distributed ⟨key, value⟩ repositories
    Much better for storing a distributed, dynamic graph than a distributed
    filesystem like HDFS:
      create an index on edges for fast random access and...
      sort edges for efficient block-sequential access
      updates can be fast but...
      not transactional; eventually consistent
  • NSA BigTable — Apache Accumulo
    The Google BigTable paper inspired a small group of NSA researchers to
    develop an implementation with cell-level security.
    Apache Accumulo: open-sourced in 2011 under the Apache Software Foundation.
  • Using Apache Accumulo for Big Graphs
    Accumulo records are flexible and can be used to store graphs with typed
    edges and weights:
      store the graph as a distributed edge list in an Edge Table
      edges are natural ⟨key, value⟩ pairs, i.e. vertex pairs
      updates to edges happen in memory and are immediately available to
      queries and...
      updates are written to write-ahead logs for fault tolerance
    Apache Accumulo ⟨key, value⟩ record:
      KEY = ROW ID | COLUMN FAMILY | COLUMN QUALIFIER | COLUMN VISIBILITY | TIMESTAMP
      VALUE
    (A sketch of writing edges into such a table follows below.)
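A minimal sketch of how an edge could be laid out in such an Edge Table,
assuming the Apache Accumulo Java client and a hypothetical schema (row =
source vertex, column family = edge type, column qualifier = destination
vertex, value = weight); the slides do not spell out the authors' schema, so
this is illustrative only:

    // Sketch: one edge of a typed, weighted graph as Accumulo mutations.
    // The schema, the "follows" edge type, and the "PUB" visibility label are
    // hypothetical examples, not the layout used by the authors.
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.ColumnVisibility;

    public class EdgeTableExample {
        public static void main(String[] args) {
            // Edge: vertexA --follows--> vertexB, weight 3
            Mutation edge = new Mutation("vertexA");
            edge.put("follows", "vertexB", new ColumnVisibility("PUB"),
                     new Value("3".getBytes()));

            // For an undirected graph, also store the reverse direction so
            // either endpoint can be used as the row (lookup) key.
            Mutation reverse = new Mutation("vertexB");
            reverse.put("follows", "vertexA", new ColumnVisibility("PUB"),
                        new Value("3".getBytes()));

            // In a real client these mutations would be handed to a BatchWriter
            // obtained from a Connector; connection setup is omitted here.
            System.out.println("updates queued: " + (edge.size() + reverse.size()));
        }
    }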
  • But for bulk-processing, use MapReduce
    Big Graph processing is suitable for MapReduce...
      large, bulk writes to HDFS are faster (no need to sort)
      temporary scratch space (local writes)
      greater control over parallelization of algorithms
  • Quick overview of MapReduce
    MapReduce processes data as ⟨key, value⟩ pairs in three steps:
    1. MAP
       map tasks independently process blocks of data in parallel and output
       ⟨key, value⟩ pairs
       map tasks execute a user-defined "map" function
    2. SHUFFLE
       sorts ⟨key, value⟩ pairs and distributes them to reduce tasks
       ensures the value set for a key is processed by one reduce task
       the framework executes a user-defined "partitioner" function
    3. REDUCE
       reduce tasks process their assigned (key, value set) in parallel
       reduce tasks execute a user-defined "reduce" function
    (A minimal example follows below.)
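To make the three steps concrete, a minimal Hadoop sketch (not from the
slides) that computes vertex degrees from a tab-separated edge list: the map
step emits ⟨vertex, 1⟩ for each endpoint, the shuffle groups the pairs by
vertex, and the reduce step sums them.

    // Sketch: vertex degree counting as a single MapReduce round.
    // Input lines are assumed to be "sourceVertex<TAB>destinationVertex".
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class DegreeCount {

        // MAP: each edge contributes a count of 1 to both of its endpoints.
        public static class DegreeMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            @Override
            protected void map(LongWritable offset, Text line, Context ctx)
                    throws IOException, InterruptedException {
                String[] edge = line.toString().split("\t");
                ctx.write(new Text(edge[0]), ONE);
                ctx.write(new Text(edge[1]), ONE);
            }
        }

        // REDUCE: the shuffle delivers all counts for a vertex to one reduce
        // task, which sums them into the vertex degree.
        public static class DegreeReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text vertex, Iterable<IntWritable> counts, Context ctx)
                    throws IOException, InterruptedException {
                int degree = 0;
                for (IntWritable c : counts) degree += c.get();
                ctx.write(vertex, new IntWritable(degree));
            }
        }
    }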
  • MapReduce is not great for iterative algorithms
    Iterative algorithms are implemented as a sequence of MapReduce jobs,
    i.e. rounds. But iteration in MapReduce is expensive...
      temporary data written to disk
      the sort/shuffle may be unnecessary for each round
      framework overhead for scheduling tasks
  • Good principles for MapReduce algorithms
    Be prepared to Think in MapReduce...
      avoid iteration and minimize the number of rounds
      limit child JVM memory
        the number of concurrent tasks is limited by per-machine RAM
        set IO buffers carefully to avoid spills (requires memory!)
      pick a good partitioner
      write raw comparators
      leverage compound keys
        minimizes hot-spots by distributing on the key
        secondary sort on compound keys is almost free
    Round-memory tradeoff: constant, O(1), in memory and rounds.
    (A partitioner sketch follows below.)
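To make "pick a good partitioner" and "leverage compound keys" concrete, a
sketch of a Hadoop Partitioner (not from the slides) that hashes only the
natural part of a hypothetical "vertex:distance" compound key, so all records
for a vertex reach one reduce task while a secondary sort on the distance
field comes along for free:

    // Sketch: partition a compound key "vertex:distance" on the vertex part only.
    // All records for one vertex reach the same reduce task regardless of the
    // secondary-sort field, spreading load across reducers by vertex.
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class VertexPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text compoundKey, IntWritable value, int numPartitions) {
            // The "vertex:distance" compound-key layout is a hypothetical example.
            String vertex = compoundKey.toString().split(":", 2)[0];
            // Mask off the sign bit before taking the modulus.
            return (vertex.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }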
  • Best of both worlds – MapReduce and Accumulo
    Store the Big Graph in an Accumulo Edge Table...
      great for updating a big, typed multi-graph
      more timely (lower latency) access to the graph
    Then extract a subgraph from the Edge Table and use MapReduce for graph
    analysis.
  • How well does this really work?
    Prove it on the industry benchmark, Graph500.org!
      Breadth-First Search (BFS) on an undirected R-MAT graph
      count Traversed Edges Per Second (TEPS)
      n = 2^scale, m = 16 × n
    [Figure: Graph500 problem sizes, storage in terabytes vs. problem scale]
    Graph500 problem sizes:
      Class    Scale   Storage
      Toy       26      17 GB
      Mini      29     140 GB
      Small     32       1 TB
      Medium    36      17 TB
      Large     39     140 TB
      Huge      42     1.1 PB
  • Walking a ridiculously Big Graph...
    Everything is harder at scale!
      performance is dominated by the ability to acquire adjacencies
      requires a specialized MapReduce partitioner to load-balance queries from
      MapReduce to the Accumulo Edge Table
    Space-efficient Breadth-First Search is critical!
      create ⟨vertex, distance⟩ records
      slide a window over the distance records to minimize input at each
      iteration k
      reduce tasks get adjacencies from the Edge Table, but only if all
      distance values for a vertex v are equal to k
    (A sketch of one such round is given below.)
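A rough illustration of that last point as the reduce step of a single BFS
round at level k. This is a hedged reconstruction from the bullets above, not
the authors' code; the Edge Table lookup is a hypothetical placeholder for an
Accumulo scan:

    // Sketch: the reduce step of one BFS round k over <vertex, distance> records.
    // The adjacency lookup below is a hypothetical placeholder; the real job
    // would query the Accumulo Edge Table for the vertex's neighbors.
    import java.io.IOException;
    import java.util.Collections;
    import java.util.List;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class BfsRoundReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private int k; // current BFS level, read from the job configuration

        @Override
        protected void setup(Context ctx) {
            k = ctx.getConfiguration().getInt("bfs.level", 0);
        }

        @Override
        protected void reduce(Text vertex, Iterable<IntWritable> distances, Context ctx)
                throws IOException, InterruptedException {
            int min = Integer.MAX_VALUE;
            boolean allEqualK = true;
            for (IntWritable d : distances) {
                min = Math.min(min, d.get());
                if (d.get() != k) allEqualK = false;
            }

            // Keep the settled distance for this vertex.
            ctx.write(vertex, new IntWritable(min));

            // Only a vertex first reached at this level (all distances == k) is on
            // the frontier; fetch its adjacencies and propose distance k + 1.
            if (allEqualK) {
                for (String neighbor : lookupAdjacencies(vertex.toString())) {
                    ctx.write(new Text(neighbor), new IntWritable(k + 1));
                }
            }
        }

        // Hypothetical adjacency lookup standing in for an Accumulo Edge Table scan.
        private List<String> lookupAdjacencies(String vertex) {
            return Collections.emptyList(); // placeholder
        }
    }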
  • Graph500 Experiment
    Graph500 Huge class — scale 42
      2^42 (4.40 trillion) vertices
      2^46 (70.4 trillion) edges
      1 Petabyte
    Cluster specs:
      1200 nodes
      2 Intel quad-core CPUs per node
      48 GB RAM per node
      7200 RPM SATA drives
    Results:
      the Huge problem is 19.5x more than cluster memory
      linear performance from 1 trillion to 70 trillion edges...
      despite multiple hardware failures!
    [Figure: Graph500 scalability benchmark, traversed edges (trillions) vs.
    time (hours) for the 17 TB, 140 TB, and 1126 TB problem classes (scales
    36, 39, 42); cluster memory = 57.6 TB]
  • Future work
    What are the tradeoffs between disk- and memory-based Big Graph solutions?
      power, space, and cooling efficiency
      software development and maintenance
    Can a hybrid approach be viable?
      memory-based processing for subgraphs or incremental updates
      disk-based processing for global analysis
    Advances in memory and disk technologies will blur the lines. Will
    solid-state disks be another level of memory?