
Webinar: An Introduction to InfiniteGraph, and Connecting the Dots in Big Data.


This August 16, 2011 webinar, hosted by DBTA with InfiniteGraph, examines the technology behind InfiniteGraph and explores common use cases involving very large scale graph processing, and social network analysis. InfiniteGraph was designed specifically to traverse complex relationships in big data, and provide the framework for products built to provide real-time network analysis, business decision support and relationship analytics. Moderator: Tom Wilson, President, DBTA and Unisphere Research. Presenters: Darren Wood, Chief Architect, InfiniteGraph, and Mark Maagdenberg, Senior Field Engineer, InfiniteGraph.

Published in: Technology
  • Where graphs are used:
      • Social networks (Facebook, LinkedIn, Twitter): connecting people to people or companies; identifying the most connected participants, influencers, and important sub-networks
      • Gaming: connecting players with other players; looking for central players
      • Social CRM: connecting companies to customers, cases, email
      • HCM: connecting employees to projects, skills
      • GIS/geospatial: connecting people to places/events (POI), e.g. "what's around me?"
      • Recommendation engines: connecting people to places based on the credibility of others recommending those places; FOAF, "You might also like"
      • Computer/phone/utility networks: connecting computer systems and networking components to quickly detect issues and remediate problems
      • B2B or B2C: connecting areas to find shortest/cheapest routes by air, land, sea
      • Fraud/crime detection: connecting people to events, financial transactions, phone conversations; recognizing attack/threat patterns
      • Web: connecting URLs, triple stores (RDF)
      • Marketing: connecting people to web sites, habits
      • Intelligence: looking for bad guys by connecting phone calls between people, events
      • Transportation: calculating shortest routes by air, land, sea
  • Some SNA questions:
      • How highly connected is an entity within a network?
      • What is an entity's overall importance in a network?
      • How central is an entity within a network?
      • How does information flow within a network?
    Degree centrality: Bob has the highest degree centrality, which means that he is quite active in the network. However, he is not necessarily the most powerful person, because he is only directly connected within one degree to people in his clique; he has to go through Sam to get to other cliques.
    Betweenness centrality: Sam has the highest betweenness because he is between Bob and Joe, who are between other entities. Bob and Joe have a slightly lower betweenness because they are essentially only between their own cliques. Therefore, although Bob has a higher degree centrality, Sam has more importance in the network in certain respects.
    Closeness: As with the betweenness example, Sam has the highest closeness centrality because he can reach more entities through shorter paths. As such, Sam's placement allows him to connect to entities in his own clique, and to entities that span cliques.
    Eigenvalue: Bob and Sam are closer to other highly close entities in the network. Julie and Kate are also highly close, but to a lesser value.
  • Recognize common patterns of activity Complex chains of interaction

    1. Graph Database Overview and Feature Update
       Darren Wood, Chief Architect, InfiniteGraph
    2. History
         • Objectivity: massively scalable, distributed, object-oriented database
             • Used in government (DoD, intelligence)
                 • Machine-generated data such as sensor, acoustic…
             • OEM markets
                 • Either complex data models, or high ingest, or both
         • Significant technical advantage in highly connected (many-to-many) data models
       Copyright © InfiniteGraph
    3. Graph Databases
         • Key technical attributes
         • How InfiniteGraph addresses these
         • Query and navigation
         • Challenges/requirements of distribution
         • Practical applications
    4. Graph Databases
         • Optimized around data relationships
             • Relationships as first-class citizens
             • Super fast traversal between entities
             • Rich/flexible annotation of connections
         • Small, focused API (typically not SQL)
             • Natively work with concepts of Vertex/Edge
             • SQL has no concept of "navigation"
             • Most attempts based in SQL are convoluted
    5. Distributed Graph Must Haves
         • High performance distributed persistence
         • Ability to deal with remote data reads (fast)
         • Intelligent local cache of subgraphs
         • Distributed navigation processing
         • Distributed, multi-source concurrent ingest
         • Write modes supporting both strict and eventual consistency
    6. Some Code
           Vertex alice = myGraph.addVertex(new Person("Alice"));
           Vertex bob = myGraph.addVertex(new Person("Bob"));
           Vertex carlos = myGraph.addVertex(new Person("Carlos"));
           Vertex charlie = myGraph.addVertex(new Person("Charlie"));

           alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
           bob.addEdge(new Call(timestamp), carlos);
           carlos.addEdge(new Payment(100000.00), charlie);
           bob.addEdge(new Call(timestamp), charlie);
       Resulting graph: Alice -[Meets]-> Bob; Bob -[Calls]-> Carlos; Carlos -[Pays]-> Charlie; Bob -[Calls]-> Charlie
    7. Physical Storage Comparison
       Rows/Columns/Tables:
         Meetings
           P1      Place    Time      P2
           Alice   Denver   5-27-10   Bob
         Calls
           From    Time     Duration  To
           Bob     13:20    25        Carlos
           Bob     17:10    15        Charlie
         Payments
           From    Date     Amount    To
           Carlos  5-12-10  100000    Charlie
       Relationship/Graph Optimized:
         Alice -[Met 5-27-10]-> Bob
         Bob -[Called 13:20]-> Carlos
         Carlos -[Paid 100000]-> Charlie
         Bob -[Called 17:10]-> Charlie
    8. Query and Navigation
         • Queries, but not as you know them
         • More like a rules-based search and discovery
         • Asynchronous results
       Example graph: Alice -[Meets]-> Bob; Bob -[Calls]-> Carlos; Carlos -[Pays]-> Charlie; Bob -[Calls]-> Charlie
       Example queries:
         • "Find all paths between Alice and Charlie"
         • "Find all paths between Alice and Charlie, within 2 degrees"
         • "Find all paths between Alice and Charlie, events in May 2010"
    9. Navigation Example
           // Create a qualifier that describes the target vertex
           Qualifier findCharliePredicate =
               new VertexPredicate(personType, "name == 'Charlie'");

           // Construct a navigator which starts with Alice and uses a result qualifier
           // to find all paths in the graph to Charlie
           Navigator charlieFinder = alice.navigate(
               Guide.SIMPLE_BREADTH_FIRST,  // default guide
               Qualifier.ANY,               // no path constraints
               findCharliePredicate,        // find paths ending with Charlie
               myResultHandler);            // fire results to supplied handler

           // Start the navigator
           charlieFinder.start();
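The "find all paths between Alice and Charlie" queries from slide 8 can be sketched in plain Java on the four-person example graph. This is only an illustration of the idea, not the InfiniteGraph API: the `ToyGraph` class and its methods are hypothetical, the edges are treated as undirected, "within 2 degrees" is assumed to mean "at most two edges", and the search is a simple depth-first recursion rather than the breadth-first guide used on the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a graph store; NOT the InfiniteGraph API.
class ToyGraph {
    // Adjacency list: vertex name -> names of directly connected vertices.
    private final Map<String, List<String>> adjacency = new LinkedHashMap<>();

    // Assumption: connections are treated as undirected.
    void addEdge(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns every simple path from start to target that uses at most maxHops edges.
    List<List<String>> findPaths(String start, String target, int maxHops) {
        List<List<String>> results = new ArrayList<>();
        List<String> path = new ArrayList<>();
        path.add(start);
        walk(start, target, maxHops, path, results);
        return results;
    }

    private void walk(String current, String target, int hopsLeft,
                      List<String> path, List<List<String>> results) {
        if (current.equals(target)) {
            results.add(new ArrayList<>(path)); // copy, because path is mutated below
            return;
        }
        if (hopsLeft == 0) {
            return; // depth limit reached without finding the target
        }
        for (String next : adjacency.getOrDefault(current, List.of())) {
            if (path.contains(next)) {
                continue; // keep paths simple: never revisit a vertex
            }
            path.add(next);
            walk(next, target, hopsLeft - 1, path, results);
            path.remove(path.size() - 1); // backtrack
        }
    }

    // The example graph from the slides.
    static ToyGraph slideGraph() {
        ToyGraph g = new ToyGraph();
        g.addEdge("Alice", "Bob");      // Meets
        g.addEdge("Bob", "Carlos");     // Calls
        g.addEdge("Carlos", "Charlie"); // Pays
        g.addEdge("Bob", "Charlie");    // Calls
        return g;
    }

    public static void main(String[] args) {
        ToyGraph g = slideGraph();
        System.out.println(g.findPaths("Alice", "Charlie", 3)); // all paths
        System.out.println(g.findPaths("Alice", "Charlie", 2)); // "within 2 degrees"
    }
}
```

With the depth limit set to 2, only the path through Bob's direct call to Charlie qualifies; raising it to 3 also admits the longer path through Carlos.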
    10. Management of Large Data Graphs
          • Graphs grow quickly
              • Billions of phone calls / day in US
              • Emails, social media events, IP traffic
              • Financial transactions
          • Some analytics require navigation of large sections of the graph
          • Each step (often) depends on the last
          • Must distribute data and go parallel
    11. Basic Architecture
        [Diagram components: User Apps; Blueprints; IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management, Placement); Objectivity/DB Distributed Database]
    12. Feature Update 2.0
    13. Accelerated Ingest
        [Diagram: Standard Blocking Ingest/Placement (MDP Plugin). App-1, App-2 and App-3 ingest vertices V1, V2 and V3 and edges E12 {V1, V2} and E23 {V2, V3} through IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management, Placement) into Objectivity/DB.]
    14. Accelerated Ingest
        [Diagram: Placement (Standard) holding V1, V2, V3, E12 and E23 next to Placement (Accelerated), in which edges such as E(1->2), E(2->3), E(3->1), E(2->1) and E(3->2) move through Staging Containers and Distributed Pipelines into Pipeline Containers.]
    15. InfiniteGraph Visualizer
          • Really nice flexible graph viewer
          • Browser-style navigation and history
          • Full index support: search your data
          • Display connections around a selected point
          • Fully customize display to your data model
          • Full data view via selection
    16. InfiniteGraph Visualizer
    17. InfiniteGraph Visualizer
    18. Indexing Framework
          • Focused on providing choice!
          • Manual indexes for grouping data
          • Automatic indexes for cross population
          • Query interface with qualification language
          • Pluggable query operators
          • External index support (Lucene)
    19. >> next
          • Automated distributed navigation
          • Stored loadable navigators
          • Visualizer navigation plugins
          • More Visualizer enhancements
          • More import/export support
    20. Graphs are used everywhere!
          • Social network analysis
              • Targeted advertising
              • Recommendation engines
          • Transportation
          • Network analysis
          • Fraud detection/prevention
          • Crime detection/prevention
    21. Social Network Analysis
        Finding and measuring key players and relationships
        People in the example network: Sam, Bob, Julie, Kate, Mary, Mike, Joe, Susan, Jim, Laura
        Value     Degree Centrality  Betweenness Centrality  Closeness  Eigenvalue
        High      Bob                Sam                     Sam        Bob, Sam
        Moderate  Sam                Bob, Joe                Bob, Joe   Julie, Kate
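Degree centrality, the first measure in the table, is simply the number of direct connections each person has. A minimal Java sketch follows. The transcript does not list the edges of the slide's network, so the edge list below is invented purely for illustration (it only reuses the names); apart from Bob ranking first, the resulting numbers should not be read as the slide's values. The class and method names are likewise hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of raw (unnormalized) degree centrality on an undirected edge list.
class DegreeSketch {
    // ASSUMPTION: hypothetical edges, not taken from the slide's diagram.
    static final String[][] EDGES = {
        {"Bob", "Julie"}, {"Bob", "Kate"}, {"Bob", "Mary"}, {"Bob", "Mike"},
        {"Bob", "Sam"}, {"Sam", "Joe"},
        {"Joe", "Susan"}, {"Joe", "Jim"}, {"Joe", "Laura"}
    };

    // Counts how many edges touch each vertex.
    static Map<String, Integer> degrees(String[][] edges) {
        Map<String, Integer> degree = new TreeMap<>();
        for (String[] edge : edges) {
            degree.merge(edge[0], 1, Integer::sum);
            degree.merge(edge[1], 1, Integer::sum);
        }
        return degree;
    }

    // Returns the vertex with the largest degree (first one wins on ties).
    static String highest(Map<String, Integer> degree) {
        String best = null;
        for (Map.Entry<String, Integer> entry : degree.entrySet()) {
            if (best == null || entry.getValue() > degree.get(best)) {
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> degree = degrees(EDGES);
        System.out.println(degree);
        System.out.println("Highest degree: " + highest(degree));
    }
}
```

Betweenness, closeness and eigenvalue centrality need shortest-path or iterative computations over the whole graph, which is where fast traversal matters most.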
    22. Transportation
        Given a list of flights between airports represented as…
        FLIGHT NO  DEPART AIRPORT  ARRIVE AIRPORT  DEPART TIME       ARRIVE TIME       PRICE
        0          AMS             LHR             2007-03-01-11.30  2007-03-01-12.30  160.17
        1          LHR             ORD             2007-03-01-13.30  2007-03-01-19.30  964.29
        2          ORD             LAX             2007-03-01-20.30  2007-03-02-01.30  583.11
        3          LAX             SYD             2007-03-02-02.30  2007-03-02-12.30  1663.04
        4          AMS             TYO             2007-03-01-11.00  2007-03-01-22.00  1595.86
        5          TYO             SYD             2007-03-02-03.00  2007-03-02-14.00  1487.33
        6          AMS             LAX             2007-03-01-18.00  2007-03-02-07.00  1374.15
        7          AMS             JFK             2007-03-01-10.00  2007-03-01-16.00  964.61
        8          JFK             PHX             2007-03-01-19.00  2007-03-02-01.00  1069.99
        9          AMS             LGA             2007-03-01-10.00  2007-03-01-16.00  1081.56
        10         LGA             PHX             2007-03-01-20.00  2007-03-02-02.00  911.92
        11         AMS             EWR             2007-03-01-10.00  2007-03-01-17.00  911.36
        12         EWR             PHX             2007-03-01-19.00  2007-03-02-00.00  937.98
        13         AMS             CAI             2007-03-01-09.00  2007-03-01-16.00  1208.67
        14         CAI             TYO             2007-03-01-19.00  2007-03-02-00.00  977.95
        15         AMS             JFK             2007-03-01-15.00  2007-03-01-21.00  1155.43
        16         AMS             LGA             2007-03-01-12.00  2007-03-01-18.00  923.61
        17         AMS             LHR             2007-03-01-15.00  2007-03-01-16.00  114.23
        … try to answer the following:
        "Find me the cheapest flight from Amsterdam to Phoenix leaving on March 1, 2007, with a maximum of two stops, and each stop should be less than 4 hours"
    23. Transportation (graph model)
        Airports (vertices): AMS, LHR, ORD, LAX, SYD, TYO, JFK, LGA, PHX, EWR, CAI
        Flights (edges, labeled flight-price): F0-160.17, F1-964.29, F2-583.11, F3-1663.04, F4-1595.86, F5-1487.33, F6-1374.15, F7-964.61, F8-1069.99, F9-1081.56, F10-911.92, F11-911.36, F12-937.98, F13-1208.67, F14-977.95, F15-1155.43, F16-923.61, F17-114.23
        Path 1: AMS -(F16)-> LGA -(F10)-> PHX  Total Price: $1835.53
        Path 2: AMS -(F11)-> EWR -(F12)-> PHX  Total Price: $1849.34
        Path 3: AMS -(F09)-> LGA -(F10)-> PHX  Total Price: $1993.48
        Path 4: AMS -(F07)-> JFK -(F08)-> PHX  Total Price: $2034.60
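The flight query can be sketched as a depth-limited path search over the flight table, again in plain Java rather than the InfiniteGraph API. The class, method and field names are hypothetical. Two assumptions are worth flagging: times are compared exactly as printed (no time-zone handling), and the layover limit is treated as "at most 4 hours", because Path 3 on the slide has a layover of exactly 4 hours at LGA. The departure-date filter is omitted since every flight out of AMS in the table leaves on March 1.

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: airports are vertices, flights are edges.
class FlightSearchSketch {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd-HH.mm");

    static final class Flight {
        final int no; final String from, to;
        final LocalDateTime depart, arrive; final BigDecimal price;
        Flight(int no, String from, String to, String depart, String arrive, String price) {
            this.no = no; this.from = from; this.to = to;
            this.depart = LocalDateTime.parse(depart, FMT);
            this.arrive = LocalDateTime.parse(arrive, FMT);
            this.price = new BigDecimal(price);
        }
    }

    static final class Itinerary {
        final List<Flight> legs; final BigDecimal total;
        Itinerary(List<Flight> legs) {
            this.legs = new ArrayList<>(legs);
            BigDecimal sum = BigDecimal.ZERO;
            for (Flight f : legs) sum = sum.add(f.price);
            this.total = sum;
        }
    }

    // Flight table from slide 22.
    static final List<Flight> FLIGHTS = List.of(
        new Flight(0, "AMS", "LHR", "2007-03-01-11.30", "2007-03-01-12.30", "160.17"),
        new Flight(1, "LHR", "ORD", "2007-03-01-13.30", "2007-03-01-19.30", "964.29"),
        new Flight(2, "ORD", "LAX", "2007-03-01-20.30", "2007-03-02-01.30", "583.11"),
        new Flight(3, "LAX", "SYD", "2007-03-02-02.30", "2007-03-02-12.30", "1663.04"),
        new Flight(4, "AMS", "TYO", "2007-03-01-11.00", "2007-03-01-22.00", "1595.86"),
        new Flight(5, "TYO", "SYD", "2007-03-02-03.00", "2007-03-02-14.00", "1487.33"),
        new Flight(6, "AMS", "LAX", "2007-03-01-18.00", "2007-03-02-07.00", "1374.15"),
        new Flight(7, "AMS", "JFK", "2007-03-01-10.00", "2007-03-01-16.00", "964.61"),
        new Flight(8, "JFK", "PHX", "2007-03-01-19.00", "2007-03-02-01.00", "1069.99"),
        new Flight(9, "AMS", "LGA", "2007-03-01-10.00", "2007-03-01-16.00", "1081.56"),
        new Flight(10, "LGA", "PHX", "2007-03-01-20.00", "2007-03-02-02.00", "911.92"),
        new Flight(11, "AMS", "EWR", "2007-03-01-10.00", "2007-03-01-17.00", "911.36"),
        new Flight(12, "EWR", "PHX", "2007-03-01-19.00", "2007-03-02-00.00", "937.98"),
        new Flight(13, "AMS", "CAI", "2007-03-01-09.00", "2007-03-01-16.00", "1208.67"),
        new Flight(14, "CAI", "TYO", "2007-03-01-19.00", "2007-03-02-00.00", "977.95"),
        new Flight(15, "AMS", "JFK", "2007-03-01-15.00", "2007-03-01-21.00", "1155.43"),
        new Flight(16, "AMS", "LGA", "2007-03-01-12.00", "2007-03-01-18.00", "923.61"),
        new Flight(17, "AMS", "LHR", "2007-03-01-15.00", "2007-03-01-16.00", "114.23"));

    // Finds all itineraries with at most maxStops stops, cheapest first.
    static List<Itinerary> search(String origin, String destination,
                                  int maxStops, Duration maxLayover) {
        List<Itinerary> found = new ArrayList<>();
        extend(origin, destination, maxStops + 1, maxLayover, new ArrayList<>(), found);
        found.sort(Comparator.comparing((Itinerary it) -> it.total));
        return found;
    }

    private static void extend(String at, String destination, int legsLeft,
                               Duration maxLayover, List<Flight> legs, List<Itinerary> found) {
        if (at.equals(destination)) { found.add(new Itinerary(legs)); return; }
        if (legsLeft == 0) return;
        for (Flight f : FLIGHTS) {
            if (!f.from.equals(at)) continue;
            if (!legs.isEmpty()) {
                // A connection must depart after arrival and within the layover limit.
                Duration layover = Duration.between(legs.get(legs.size() - 1).arrive, f.depart);
                if (layover.isNegative() || layover.compareTo(maxLayover) > 0) continue;
            }
            legs.add(f);
            extend(f.to, destination, legsLeft - 1, maxLayover, legs, found);
            legs.remove(legs.size() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        for (Itinerary it : search("AMS", "PHX", 2, Duration.ofHours(4))) {
            StringBuilder sb = new StringBuilder("AMS");
            for (Flight f : it.legs) sb.append(" -(F").append(f.no).append(")-> ").append(f.to);
            System.out.println(sb + "  Total Price: $" + it.total);
        }
    }
}
```

Under these assumptions the search yields the same four itineraries as the slide, in the same price order. Each step depends on the arrival time of the previous leg, which is the "each step depends on the last" property that makes such queries awkward to express as flat SQL joins.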
    24. Finding Criminal Activity (by association)
    25. Finding Criminal Activity (by location)
    26. Thank you! [email_address] [email_address]
