Getting Started with Graph Databases


Exploiting graph databases to discover value in complex Big Data. Lunch will be provided while you discover the power of graph database technology for your Big Data needs.

Bring your charged laptops to this upcoming meetup to walk through how to get started with InfiniteGraph. Nick Quinn, Senior Software Developer for InfiniteGraph, will walk you through the initial installation of InfiniteGraph and the HelloGraph sample to get you started with your graph database. Download InfiniteGraph for free here:

Once we get through the tutorial, there will be time for Q&A and more hands-on support from additional members of the InfiniteGraph technical team.

If you have a complex Big Data problem and want to discover deeper connections and relationships within your data to create next-generation applications for social networks, healthcare, finance, telecom and security, this is a must-attend event! Get started quickly with our enterprise-proven, massively scalable, distributed graph database!

Published in: Technology

  • Kevin Norwood Bacon, born 1958 in Pennsylvania
  • Relationships and connections are EVERYWHERE. Examples include CRM, Telecom, Intelligence, Research, Healthcare, Finance and yes, social networks too. But notice that it’s absolutely not just about social networks in the Facebook sense. ANY application that needs to find connections and relationships separated by more than 2 degrees is a good candidate for InfiniteGraph.
  • SIMPLE_BREADTH_FIRST: Traversal from a given vertex proceeds to all related vertices that are one degree of separation out before backtracking to traverse to related vertices that are two degrees of separation out, and so forth. SIMPLE_DEPTH_FIRST: Traversal from a given vertex continues down a path until it reaches an endpoint before backtracking to the originating vertex to check for additional outgoing paths, and so forth.
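
The two visiting orders described in the note above can be illustrated with a small, self-contained Java sketch. It does not use the InfiniteGraph API: the class name `TraversalOrderSketch`, the adjacency-map representation, and the toy edges between the people in the sample are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a plain adjacency map stands in for a graph database.
public class TraversalOrderSketch {

    // Breadth-first: visit every vertex one hop out before any vertex two hops out.
    public static List<String> breadthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String vertex = queue.remove();
            order.add(vertex);
            for (String neighbor : graph.getOrDefault(vertex, List.of())) {
                if (seen.add(neighbor)) {   // add() returns false if already seen
                    queue.add(neighbor);
                }
            }
        }
        return order;
    }

    // Depth-first: follow one path to its end, then backtrack to try other paths.
    public static List<String> depthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        visit(graph, start, new HashSet<>(), order);
        return order;
    }

    private static void visit(Map<String, List<String>> graph, String vertex,
                              Set<String> seen, List<String> order) {
        if (!seen.add(vertex)) {
            return;                         // already visited
        }
        order.add(vertex);
        for (String neighbor : graph.getOrDefault(vertex, List.of())) {
            visit(graph, neighbor, seen, order);
        }
    }

    public static void main(String[] args) {
        // Made-up edges for the demonstration.
        Map<String, List<String>> graph = Map.of(
                "Alice", List.of("Bob", "Charlie"),
                "Bob", List.of("Carlos"));
        System.out.println(breadthFirst(graph, "Alice")); // [Alice, Bob, Charlie, Carlos]
        System.out.println(depthFirst(graph, "Alice"));   // [Alice, Bob, Carlos, Charlie]
    }
}
```

Breadth-first reaches Charlie (one hop from Alice) before Carlos (two hops), while depth-first follows the Alice, Bob, Carlos path to its end before backtracking to Charlie.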
  • Transcript

    • 1. Getting Started with Graph Databases. Nick Quinn, Principal Engineer, InfiniteGraph. 11/13/2013
    • 2. What are we talking about today?
      • Big Data and Databases
      • What is a Graph Database?
      • What is InfiniteGraph?
      • Demo and Q&A
        – Hands On
        – Installing InfiniteGraph
        – FlightPlan Sample
          • “Download Examples”
      Images courtesy of IMDB
    • 3. NoSQL 2013
      • Developers are embracing choice
      • More than Dynamo and BigTable clones
      • Incorporates specialized data models like Document, Object and Graph
      • 100+ projects and products (Wikipedia)
      • ~250 Groups (5 meetups this week!)
      • NoSQL fans consume 12% of the world’s Beer & Pizza
    • 4. NoSQL and BigData – What’s the Connection?
      “Big data is a loosely-defined term used to describe data sets so large and complex that they become awkward to work with using on-hand database management tools” (Wikipedia)
      • Making big data “appear” smaller
      • Partitioning, replication & distributed query
      • Storage model optimizations
      • Consistency trade-offs
      • Simplified query models
      • Dynamic views
    • 5. The Specialist!
      • Everyone specializes
        – Doctors, Lawyers, Bankers, Developers
      • Why was data so normalized for so long?
      • NoSQL is all about the data specialist
      • Specializing in…
        – Distribution / deployment
        – Physical data storage
        – Logical data model
        – Query mechanism
    • 6. Polyglot NoSQL Architectures
      [Diagram labels: Users; Business; Applications; RDBMS; Document; Graph Database; External / Legacy Data; Distributed Data Processing Platform; Transformation; MDM; Partitioned Distributed DB (often Document / KV)]
    • 7. NoSQL Landscape - How it all stacks up! (Data model: performance, scalability, flexibility, complexity, functionality)
      • Key–value stores: high, high, high, none, variable (none)
      • Column store: high, high, moderate, low, minimal
      • Document store: high, variable (high), low, variable (low)
      • Graph database: variable, variable, high, high, graph theory
      • Relational database: variable, variable, low, moderate, relational algebra
      From…
    • 8. Navigational Query Performance
    • 9. The Physical Data Model
      • Becoming a relationship specialist…
      Rows/Columns/Tables:
        – Meetings (P1, P2, Place, Time): Alice, Bob, Denver, 5-27-10
        – Calls (From, To, Time, Duration): Bob, Carlos, 13:20, 25; Bob, Charlie, 17:10, 15
        – Payments (From, To, Date, Amount): Carlos, Charlie, 5-12-10, 100000
      Relationship/Graph Optimized:
        – Vertices: Alice, Bob, Carlos, Charlie
        – Edges: Met 5-27-10; Called 13:20; Called 17:10; Paid 100000
    • 10. Sometimes Big Data is just Fast Data!
      • Some data is only actionable momentarily
        – Intelligence
        – IT Security
        – Site/page visit
        – Financial / trading behavior
      • Presents a different type of challenge
      • Latency of batch data processing becomes problematic
    • 11. Scaling Writes
      • Big/Fast data demands write performance
      • Most NoSQL solutions allow you to scale writes by…
        – Partitioning the data
        – Understanding your consistency requirements
        – Allowing you to defer conflicts
    • 12. Why a Graph Database?
    • 13. Relationships are everywhere
      • CRM, Sales & Marketing
      • Network Mgmt, Telecom
      • Intelligence (Government & Business)
      • PLM (Product Lifecycle Mgmt)
      • Finance
      • Social Networks
      • Healthcare
      • Research: Genomics
    • 14. Exploding Connections
      • More often than not… graphs are big!
    • 15. The Graph Database Landscape
      • Neo4j
      • Titan (Aurelius)
      • AllegroGraph (RDF)
      • FlockDB (Twitter)
      • DEX (Sparsity)
      • OrientDB (Document)
      • + 24 others (from …)
      Copyright © InfiniteGraph
    • 16. The Graph Database Landscape Cont’d
      • Graph Analytics: High latency, Batch Processing, offline
        – Apache Giraph
        – GraphLab
        – Intel’s Graph Builder
      • Visual Analytics: In Memory, High Performance, Poor Scalability
        – Tom Sawyer
        – D3JS
        – KeyLines
        – InfoVis
      • TinkerPop stack (Blueprints/Gremlin)
        – 16 implementations and counting…
    • 17. Why InfiniteGraph™?
      • Objectivity/DB is a proven foundation
        – Building highly connected databases since 1993
        – A complete database management system
          • Concurrency, transactions, cache, schema, query, indexing
      • It’s a Graph Specialist!
        – Simple but powerful API tailored for navigation through data
        – Easy-to-configure distribution model
    • 18. InfiniteGraph™ Basic Architecture
      [Diagram layers, top to bottom: User Apps; Blueprints; InfiniteGraph Core/API (Management Extensions, Navigation Execution, Placement, Session / TX Management, Configuration); Distributed Object and Relationship Persistence Layer]
    • 19. Fully Distributed Data Model
      [Diagram labels: AddVertex(); IG Core/API; ADP Placement; Distributed Object and Relationship Persistence Layer; Zone 1: HostA, HostB, HostC; Zone 2: HostX]
    • 20. InfiniteGraph is a Complete Database
      • InfiniteGraph helps manage the things you don’t want to do, but want to have done:
        – Concurrency
          • Transactions (commit/rollback)
          • Controlled multi-user reading during updates
        – Schema Control
          • Build complex data structures, make changes easily and migrate existing data
        – Distribution
          • Sharing large amounts of distributed data between distributed processes
        – Indexes
          • Choose built-in key-value, b-tree or other indexes
        – Cache
          • Keep large sections of the graphs in configurable memory caches
    • 21. Scaling Graph Writes
      [Diagram labels: App-1 (Ingest V1); App-2 (Ingest V2); App-3 (Ingest V3); edges E12{V1, V2} and E23{V2, V3}; InfiniteGraph; Objectivity/DB Persistence Layer holding V1, E12, V2, E23, V3]
    • 22. High Performance Edge Ingest
      [Diagram labels: IG Core/API; edges such as E(1->2), E(2->1), E(2->3), E(3->1), E(3->2); Pipeline; Pipeline Containers; Agent; Target Containers C1, C2, C3]
    • 23. Result…
      [Chart: nodes and edges per second, 0 to 500,000, plotted for 1, 2, 4, and 8 clients across host configurations from a single host up to 8 hosts]
    • 24. Scaling Reads and Query
      Partitioning and Read Replicas… easy, right!
      [Diagram labels: Application(s); Distributed API; four Processors, one each over Partition 1, Partition 2, Partition 3, Partition ...n]
    • 25. Why are Graphs Different?
      [Diagram labels: Application(s); Distributed API; four Processors, one each over Partition 1, Partition 2, Partition 3, Partition ...n]
    • 26. Optimizing Distributed Navigation
      • Detect local hops and perform in-memory traversal
        – Intelligently cache frequently accessed remote data
      • Route tasks to other hosts when it is optimal
      [Diagram labels: Application; Distributed API; two Processors; vertices A, B, C, D, E, F, G, X, Y across Partition 1 and Partition 2; path P(A,B,C,D)]
    • 27. Super Simple API
        Person alice = new Person("Alice");
        helloGraphDB.addVertex(alice);
        Person bob = new Person("Bob");
        helloGraphDB.addVertex(bob);
        Person carlos = new Person("Carlos");
        helloGraphDB.addVertex(carlos);
        Person charlie = new Person("Charlie");
        helloGraphDB.addVertex(charlie);
    • 28. Adding Edges
        // General form
        MyEdgeType edge = new MyEdgeType();
        vertexA.addEdge(edge, vertexB, EdgeKind.???, weight);
        // Alice and Bob met in Denver
        Meeting denverMeeting = new Meeting("Denver", "5-27-10");
        alice.addEdge(denverMeeting, bob, EdgeKind.BIDIRECTIONAL, (short)1);
        // Call between Bob and Carlos
        Call bobToCarlos = new Call(getRandomJulyTime());
        bob.addEdge(bobToCarlos, carlos, EdgeKind.OUTGOING, (short)0);
        // Payment from Carlos to Charlie
        Payment payment = new Payment(10000.00);
        carlos.addEdge(payment, charlie, EdgeKind.OUTGOING, (short)2);
        // Call between Bob and Charlie
        Call bobToCharlie = new Call(getRandomJulyTime());
        bob.addEdge(bobToCharlie, charlie, EdgeKind.INCOMING, (short)0);
    • 29. The Result…
    • 30. Graph Traversal (Navigation) Queries
      • Use an instance of the Navigator class to perform a navigation query.
      • A navigation instance is highly customizable, but is composed of the following basic parts:
        – The vertex from which to start the navigation query.
        – A guide strategy, which is a high-level navigational aid. You can create a custom guide or use one of the built-in guide strategies:
          • Guide.Strategy.NONE
          • Guide.Strategy.SIMPLE_BREADTH_FIRST
          • Guide.Strategy.SIMPLE_DEPTH_FIRST
        – Qualifiers
          • A path qualifier
          • A result qualifier
        – Handlers
          • A result handler
    • 31. Schema – It’s not your enemy! (well, not all the time...)
      • Schema vs Schema-less
        – Database religion
        – No time for a full debate here
        – InfiniteGraph supports schema
        – Planning to also support optional properties on schema types
      • Graph Views: A Great Use Case for Schema!
        – Filter by type and predicate during navigation
        – Connection Inference!
    • 32. Graph Views and Bacon!
      • Filter out uninteresting projects connected to Kevin Bacon
        GraphView view = new GraphView();
        // Exclude all instances of TvShow from navigation
        view.excludeClass(myDb.getTypeId(TvShow.class.getName()));
        // Exclude all movies made for TV/video
        view.excludeClass(myDb.getTypeId(Movie.class.getName()), "details.madeForTv || details.madeForVideo");
        // Include ActedIn with characterName not containing "Himself"
        view.excludeClass(myDb.getTypeId(WorkedOn.class.getName()));
        view.includeClass(myDb.getTypeId(ActedIn.class.getName()), "!CONTAINS(characterName, \"Himself\")");
      [Diagram labels: Actor: Kevin Bacon; TV Show: The Following (Ryan Hardy); Movie: Apollo 13 (Jack Swigert); Movie: Behind the Scenes (Himself)]
    • 33. Tools To Suit the Solution
    • 34. Demo
      • Installing InfiniteGraph
      • FlightPlan Sample
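
Slide 30 lists the parts of a navigation query: a start vertex, a guide strategy, a path qualifier, a result qualifier, and a result handler. The sketch below shows how those parts could fit together in plain Java. It is not the InfiniteGraph `Navigator` class: `NavigationSketch`, `navigate`, `findPaths`, the adjacency-map graph, and the sample edges are hypothetical names and data chosen for illustration, and the guide is fixed to breadth-first order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in for a navigation query; not the InfiniteGraph API.
public class NavigationSketch {

    public static void navigate(
            Map<String, List<String>> graph,
            String start,                           // vertex where navigation begins
            Predicate<List<String>> pathQualifier,  // should this path be followed?
            Predicate<List<String>> resultQualifier,// is this path a result?
            Consumer<List<String>> resultHandler) { // what to do with each result
        // A FIFO queue of partial paths gives breadth-first guidance.
        Deque<List<String>> frontier = new ArrayDeque<>();
        frontier.add(List.of(start));
        while (!frontier.isEmpty()) {
            List<String> path = frontier.remove();
            if (!pathQualifier.test(path)) {
                continue;                           // prune this branch
            }
            if (resultQualifier.test(path)) {
                resultHandler.accept(path);
            }
            String last = path.get(path.size() - 1);
            for (String next : graph.getOrDefault(last, List.of())) {
                if (path.contains(next)) {
                    continue;                       // do not revisit a vertex on the same path
                }
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                frontier.add(extended);
            }
        }
    }

    // Convenience wrapper: all paths from start to target using at most maxHops edges.
    public static List<List<String>> findPaths(
            Map<String, List<String>> graph, String start, String target, int maxHops) {
        List<List<String>> results = new ArrayList<>();
        navigate(graph, start,
                path -> path.size() - 1 <= maxHops,
                path -> path.get(path.size() - 1).equals(target),
                results::add);
        return results;
    }

    public static void main(String[] args) {
        // Made-up directed edges between the people used in the slides.
        Map<String, List<String>> graph = Map.of(
                "Alice", List.of("Bob"),
                "Bob", List.of("Carlos", "Charlie"),
                "Carlos", List.of("Charlie"));
        System.out.println(findPaths(graph, "Alice", "Charlie", 3));
        // [[Alice, Bob, Charlie], [Alice, Bob, Carlos, Charlie]]
    }
}
```

Keeping the path qualifier separate from the result qualifier means a branch can be pruned early (here, once it exceeds the hop limit) without affecting the test for what counts as a result.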