TinkerPop Backed By Accumulo
6/12/2014
Ryan Webb
Associate Professional
Ryan.Webb@jhuapl.edu
Agenda
 Introduction to TinkerPop
 Detailed Implementation
 Obstacles
 Overcoming Obstacles
 Map Reduce Integration
 Performance
Background
 Associate Professional at The Johns Hopkins Applied Physics
Laboratory
 Bachelor of Science in Computer Science with a minor in
Mathematics from the University of Delaware
 Pursuing a Master's in Computer Science with a focus on
Distributed Systems at the Whiting School of Engineering
TinkerPop Blueprints
 Foundational technology for a
complete graph stack
 Extensive test suite to ensure
implementations follow all the
rules required.
 Only a simple API
 getVertex
 getEdge
 setProperty
 getProperty
 Multiple Interfaces with
incremental features
TinkerPop Blueprints Graph API
Graph Creation
Configuration cfg = new AccumuloGraphConfiguration()
.instance("accumulo").user("user").zkHosts("zk1")
.password("password".getBytes()).name("myGraph");
Graph graph = GraphFactory.open(cfg);
Vertex v1 = graph.addVertex("1");
v1.setProperty("name", "Alice");
Vertex v2 = graph.addVertex("2");
v2.setProperty("name", "Bob");
Edge e1 = graph.addEdge("E1", v1, v2, "knows");
e1.setProperty("since", new Date());
Trade-off Spectrum
[Figure: spectrum between Consistency and Performance]
Accumulo Implementation
 Base Naïve implementation passes all required TinkerPop tests
 Far Right of the spectrum
 As consistent as you can get
 Table Structure
 Edge and Vertex
 Edge and Vertex Index table
 Metadata Table for indexes
Table Structure
Vertex
Row ID   | Column Family | Column Qualifier   | Value
VertexID | Label Flag    | Exists Flag        | [empty]
VertexID | INVERTEX      | OutVertexID_EdgeID | Edge Label
VertexID | OUTVERTEX     | InVertexID_EdgeID  | Edge Label
VertexID | Property Key  | [empty]            | Serialized Value

Edge
Row ID | Column Family | Column Qualifier       | Value
EdgeID | Label Flag    | InVertexID_OutVertexID | Edge Label
EdgeID | Property Key  | [empty]                | Serialized Value
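The layout above can be sketched in plain Java without any Accumulo dependency. This is only an illustration: the `Entry` class and the helper names are hypothetical, the marker strings "L" and "E" are placeholders because the slides do not give the real flag values, and reading INVERTEX as "this row's vertex is the in-vertex of the edge" is an interpretation of the table rather than something the slides state.

```java
import java.util.ArrayList;
import java.util.List;

final class TableLayoutSketch {
    /** One key/value entry: row ID, column family, column qualifier, value. */
    static final class Entry {
        final String row, family, qualifier, value;

        Entry(String row, String family, String qualifier, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override
        public String toString() {
            return row + " | " + family + " | " + qualifier + " | " + value;
        }
    }

    // Placeholder marker strings; the actual flag values are not shown in the slides.
    static final String LABEL_FLAG = "L";
    static final String EXISTS_FLAG = "E";

    /** Entry written to the vertex table when a vertex is created. */
    static Entry vertexEntry(String vertexId) {
        return new Entry(vertexId, LABEL_FLAG, EXISTS_FLAG, "");
    }

    /** Entries for an edge from outVertexId to inVertexId. */
    static List<Entry> edgeEntries(String edgeId, String outVertexId,
                                   String inVertexId, String label) {
        List<Entry> out = new ArrayList<>();
        // Edge table: the qualifier carries both endpoints.
        out.add(new Entry(edgeId, LABEL_FLAG, inVertexId + "_" + outVertexId, label));
        // Vertex table, row of the in-vertex: qualifier names the out-vertex and edge.
        out.add(new Entry(inVertexId, "INVERTEX", outVertexId + "_" + edgeId, label));
        // Vertex table, row of the out-vertex: qualifier names the in-vertex and edge.
        out.add(new Entry(outVertexId, "OUTVERTEX", inVertexId + "_" + edgeId, label));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(vertexEntry("1"));
        for (Entry e : edgeEntries("E1", "1", "2", "knows")) {
            System.out.println(e);
        }
    }
}
```

Storing each edge under both endpoint rows means a vertex's incident edges can be read from a single row, at the cost of three writes per edge.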
Graph Access and Index Creation/Use
// Access before Index
for (Vertex v : graph.getVertices()) {
    String name = v.getProperty("name");
}
((KeyIndexableGraph) graph)
    .createKeyIndex("name", Vertex.class);
// Access after Index
for (Vertex v : graph.getVertices()) {
    String name = v.getProperty("name");
}
Table Structure - Continued
Indexes
Row              | Column Family | Column Qualifier | Value
Serialized Value | Property Key  | VertexID         | [empty]

Metadata
Row        | Column Family | Column Qualifier | Value
Index Name | Index Class   | [empty]          | [empty]
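Because the index row is the serialized property value, a key lookup can become a short range scan over sorted keys instead of a full table scan. The sketch below models that with a `TreeSet`; the class name, the NUL separator, and the string encoding of keys are assumptions made for illustration and are not taken from the implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class KeyIndexSketch {
    private static final char SEP = '\u0000';

    // Sorted "value SEP propertyKey SEP vertexId" strings stand in for the sorted index table.
    private final TreeSet<String> indexTable = new TreeSet<>();

    /** Adds one index entry: row = serialized value, family = key, qualifier = vertex ID. */
    void index(String propertyKey, String serializedValue, String vertexId) {
        indexTable.add(serializedValue + SEP + propertyKey + SEP + vertexId);
    }

    /** Returns the IDs of vertices whose property equals the value, via a prefix range scan. */
    List<String> lookup(String propertyKey, String serializedValue) {
        String prefix = serializedValue + SEP + propertyKey + SEP;
        List<String> ids = new ArrayList<>();
        for (String key : indexTable.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break; // left the contiguous block of matching keys
            }
            ids.add(key.substring(prefix.length()));
        }
        return ids;
    }

    public static void main(String[] args) {
        KeyIndexSketch idx = new KeyIndexSketch();
        idx.index("name", "Alice", "1");
        idx.index("name", "Bob", "2");
        System.out.println(idx.lookup("name", "Alice"));
    }
}
```

In Blueprints 2.x the indexed read path is `graph.getVertices(key, value)`, which is the call a key index like this would be expected to serve.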
Obstacles
 Existence checking is expensive
 Required for TinkerPop test suite
 Writing every graph object out is expensive
 Building indexes post ingest is expensive
 Blocking, full table scan
 Consistency is expensive
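To see why existence checking hurts, consider a rough model in which every checked insert costs one read before its write. The class below is a hypothetical stand-in, not the Accumulo graph code: a `HashSet` plays the role of the vertex table and the counters play the role of round trips to the store.

```java
import java.util.HashSet;
import java.util.Set;

final class ExistenceCheckSketch {
    private final Set<String> vertexTable = new HashSet<>();
    private final boolean checkExistence;
    int reads = 0;   // stands in for scan round trips
    int writes = 0;  // stands in for mutations sent

    ExistenceCheckSketch(boolean checkExistence) {
        this.checkExistence = checkExistence;
    }

    /** Adds a vertex, optionally reading first to reject duplicate IDs. */
    void addVertex(String id) {
        if (checkExistence) {
            reads++;
            if (vertexTable.contains(id)) {
                throw new IllegalArgumentException("Vertex already exists: " + id);
            }
        }
        writes++;
        vertexTable.add(id);
    }

    public static void main(String[] args) {
        ExistenceCheckSketch checked = new ExistenceCheckSketch(true);
        ExistenceCheckSketch unchecked = new ExistenceCheckSketch(false);
        for (int i = 0; i < 1000; i++) {
            checked.addVertex("v" + i);
            unchecked.addVertex("v" + i);
        }
        System.out.println("checked:   reads=" + checked.reads + " writes=" + checked.writes);
        System.out.println("unchecked: reads=" + unchecked.reads + " writes=" + unchecked.writes);
    }
}
```

In this model the checked path doubles the number of store operations, and on a real cluster each read is a blocking lookup that cannot be batched with the write that follows it.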
Overcoming Obstacles
Give more power to users who know they are using an Accumulo
Graph
 Ingest Improvements
 Give option to disable existence checks
 Allow manual batching
 Specialized Ingest path
 Traversal Improvements
 Attribute preloading
 Property caching
 Element caching
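Property caching could look something like the following least-recently-used cache placed in front of the table read. This is a minimal sketch under stated assumptions: the class name, the `elementId/key` cache key, and the `loader` function (which stands in for a scan against the store) are all hypothetical, and the slides do not say which eviction policy the implementation uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class PropertyCacheSketch {
    private final Function<String, Object> loader; // stands in for a table read
    private final LinkedHashMap<String, Object> cache;
    int loads = 0; // number of times the store had to be consulted

    PropertyCacheSketch(final int capacity, Function<String, Object> loader) {
        this.loader = loader;
        // accessOrder=true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the property, reading from the store only on a cache miss. */
    Object getProperty(String elementId, String key) {
        String cacheKey = elementId + "/" + key;
        Object value = cache.get(cacheKey);
        if (value == null) {
            loads++;
            value = loader.apply(cacheKey);
            cache.put(cacheKey, value);
        }
        return value;
    }

    public static void main(String[] args) {
        PropertyCacheSketch cache = new PropertyCacheSketch(2, k -> "value-of-" + k);
        cache.getProperty("v1", "name");
        cache.getProperty("v1", "name"); // served from the cache
        System.out.println("loads = " + cache.loads);
    }
}
```

A cache like this trades consistency for speed: a cached value can be stale if another client updates the property, which is the kind of tuning decision the slide leaves to users who know they are running on Accumulo.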
Simple Bulk Ingest
// Will migrate to BatchGraph
AccumuloBulkIngester g = new AccumuloBulkIngester(cfg);
PropertyBuilder v1 = g.addVertex("ID1");
PropertyBuilder v2 = g.addVertex("ID2");
PropertyBuilder edge = g.addEdge("ID1", "ID2", "knows");
v1.add("name", "alice");
v2.add("name", "bob");
edge.add("since", new Date());
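The idea behind batching writes, whether through a dedicated ingester or a wrapper such as Blueprints' BatchGraph, is to buffer mutations and send them in groups. The sketch below shows only that buffering pattern; `ManualBatchSketch` and its methods are hypothetical names, and it does not reproduce the behavior of `AccumuloBulkIngester`.

```java
import java.util.ArrayList;
import java.util.List;

final class ManualBatchSketch {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    // Each inner list stands in for one group of mutations sent to the store.
    private final List<List<String>> flushed = new ArrayList<>();

    ManualBatchSketch(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
    }

    /** Buffers a mutation and flushes automatically once the batch is full. */
    void add(String mutation) {
        buffer.add(mutation);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; callers invoke this once more at the end of ingest. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushed.add(new ArrayList<>(buffer));
        buffer.clear();
    }

    int flushCount() {
        return flushed.size();
    }

    int pending() {
        return buffer.size();
    }

    public static void main(String[] args) {
        ManualBatchSketch batch = new ManualBatchSketch(2);
        for (int i = 0; i < 5; i++) {
            batch.add("vertex-" + i);
        }
        batch.flush(); // send the final partial batch
        System.out.println("flushes = " + batch.flushCount());
    }
}
```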
Map Reduce Integration
 In your Tool
j.setInputFormatClass(VertexInputFormat.class);
VertexInputFormat.setAccumuloGraphConfiguration(
    new AccumuloGraphConfiguration()
        .instance("accumulo").zkHosts("zk1").user("root")
        .password("secret".getBytes()).name("myGraph"));
 In your Mapper
public void map(Text k, Vertex v, Context c) {
    System.out.println(v.getId().toString());
}
Results
2 Nodes            | 4 Nodes             | 8 Nodes
20 Hours 9 Minutes | 13 Hours 47 Minutes | 7 Hours 4 Minutes
Cluster Stats
8 Node Cluster
64 GB Ram
Quad-Core Xeon Processor 2.50GHz 10MB
2x 4 TB 6.0Gb/s 7200 RPM Drives
1 Gb/s Networking
Accumulo 1.5.1, Hadoop 2.0.0 – MR1
Stanford SNAP Friendster Graph
65,608,366 Vertices
1,806,067,135 Edges
2 Nodes    | 4 Nodes    | 8 Nodes
55 Minutes | 29 Minutes | 15 Minutes
Vertex Iteration
Ingest
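The timings on this slide can be turned into speedup and parallel-efficiency figures with a little arithmetic. The sketch below uses only the numbers shown above; the class and method names are illustrative, and it deliberately refers to the two sets of timings by their position on the slide.

```java
final class ScalingSketch {
    static int minutes(int hours, int mins) {
        return hours * 60 + mins;
    }

    /** Speedup of a run relative to the baseline run. */
    static double speedup(int baselineMinutes, int scaledMinutes) {
        return (double) baselineMinutes / scaledMinutes;
    }

    /** Fraction of the ideal (linear) speedup actually achieved. */
    static double efficiency(double speedup, int baselineNodes, int scaledNodes) {
        return speedup / ((double) scaledNodes / baselineNodes);
    }

    public static void main(String[] args) {
        int[] nodes = {2, 4, 8};
        int[] firstTimings = {minutes(20, 9), minutes(13, 47), minutes(7, 4)};
        int[] secondTimings = {55, 29, 15};

        for (int i = 1; i < nodes.length; i++) {
            double s1 = speedup(firstTimings[0], firstTimings[i]);
            double s2 = speedup(secondTimings[0], secondTimings[i]);
            System.out.printf("%d -> %d nodes: first %.2fx (eff %.2f), second %.2fx (eff %.2f)%n",
                nodes[0], nodes[i],
                s1, efficiency(s1, nodes[0], nodes[i]),
                s2, efficiency(s2, nodes[0], nodes[i]));
        }
    }
}
```

Going from 2 to 8 nodes, the first set of timings improves by about 2.85x and the second by about 3.67x, against an ideal of 4x in both cases.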
Conclusion
 Simple, easy to read graph API
 Give developers a lot of tuning points for their implementations
 Performance is “good enough”
 Not meant for high performance, specialized solutions
 Quick to develop new ideas and investigate your graph.
 Easy to integrate and already integrated.
 Low effort to get REST access to your graph
Future
 Polish and open source
 Iterators
 Locality Groups
 Addressing Security
 Graph Query
 Extending MapReduce Integration
 Upgrading to Accumulo 1.6, TinkerPop 2.5
 Conditional Mutations
 Table namespaces
Resources
 http://www.tinkerpop.com/
 http://snap.stanford.edu/data/com-Friendster.html
Accumulo Summit 2014: Accumulo backed Tinkerpop Implementation
