Webinar: An Introduction to InfiniteGraph, and Connecting the Dots in Big Data.

This August 16, 2011 webinar, hosted by DBTA with InfiniteGraph, examines the technology behind InfiniteGraph and explores common use cases involving very large scale graph processing, and social network analysis. InfiniteGraph was designed specifically to traverse complex relationships in big data, and provide the framework for products built to provide real-time network analysis, business decision support and relationship analytics. Moderator: Tom Wilson, President, DBTA and Unisphere Research. Presenters: Darren Wood, Chief Architect, InfiniteGraph, and Mark Maagdenberg, Senior Field Engineer, InfiniteGraph.

  • Use cases:
    • Social networks (Facebook, LinkedIn, Twitter): connecting people to people or companies; finding the most connected participants, influencers, and important sub-networks
    • Gaming: connecting players with other players; looking for central players
    • Social CRM: connecting companies to customers, cases, email
    • HCM: connecting employees to projects, skills
    • GIS/geospatial: connecting people to places/events (POI), e.g. "what's around me?"
    • Recommendation engines: connecting people to places based on the credibility of others recommending those places; FOAF, "You might also like"
    • Computer/phone/utility networks: connecting computer systems and networking components to quickly detect issues and remediate problems
    • B2B or B2C: connecting areas to find the shortest/cheapest routes by air, land, sea
    • Fraud/crime detection: connecting people to events, financial transactions, phone conversations; recognizing attack/threat patterns
    • Web: connecting URLs, triple stores (RDF)
    • Marketing: connecting people to web sites, habits
    • Intelligence: looking for bad guys by connecting phone calls between people, events
    • Transportation: calculating shortest routes by air, land, sea
  • Some SNA questions:
    • How highly connected is an entity within a network?
    • What is an entity's overall importance in a network?
    • How central is an entity within a network?
    • How does information flow within a network?
  • Degree centrality: Bob has the highest degree centrality, which means that he is quite active in the network. However, he is not necessarily the most powerful person, because he is only directly connected within one degree to people in his clique; he has to go through Sam to get to other cliques.
  • Betweenness centrality: Sam has the highest betweenness because he is between Bob and Joe, who are between other entities. Bob and Joe have a slightly lower betweenness because they are essentially only between their own cliques. Therefore, although Bob has a higher degree centrality, Sam has more importance in the network in certain respects.
  • Closeness: As with the betweenness example, Sam has the highest closeness centrality because he can reach more entities through shorter paths. As such, Bob's placement allows him to connect to entities in his own clique and to entities that span cliques.
  • Eigenvalue: Bob and Sam are closer to other highly close entities in the network. Julie and Kate are also highly close, but to a lesser degree.
  • Recognize common patterns of activity; complex chains of interaction
  • Transcript

    • 1. Graph Database Overview and Feature Update Darren Wood Chief Architect, InfiniteGraph
    • 2. History
      • Objectivity – Massively scalable, distributed object-oriented database
        • Used in Government (DoD, Intelligence)
          • Machine generated data such as sensor, acoustic…
        • OEM Markets
          • Either complex data models, or high ingest or both
      • Significant technical advantage in highly connected (many-to-many) data models
      Copyright © InfiniteGraph
    • 3. Graph Databases
      • Key technical attributes
      • How InfiniteGraph addresses these
      • Query and navigation
      • Challenges/Requirements of Distribution
      • Practical applications
    • 4. Graph Databases
      • Optimized around data relationships
        • Relationships as first class citizens
        • Super fast traversal between entities
        • Rich/flexible annotation of connections
      • Small focused API (typically not SQL)
        • Natively work with concepts of Vertex/Edge
        • SQL has no concept of “navigation”
        • Most attempts based in SQL are convoluted
    • 5. Distributed Graph Must Haves
      • High performance distributed persistence
      • Ability to deal with remote data reads (fast)
      • Intelligent local cache of subgraphs
      • Distributed navigation processing
      • Distributed, multi-source concurrent ingest
      • Write modes supporting both strict and eventual consistency
    • 6. Some Code
        // Add four people to the graph as vertices
        Vertex alice = myGraph.addVertex(new Person("Alice"));
        Vertex bob = myGraph.addVertex(new Person("Bob"));
        Vertex carlos = myGraph.addVertex(new Person("Carlos"));
        Vertex charlie = myGraph.addVertex(new Person("Charlie"));

        // Connect them with typed edges that carry their own data
        alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
        bob.addEdge(new Call(timestamp), carlos);
        carlos.addEdge(new Payment(100000.00), charlie);
        bob.addEdge(new Call(timestamp), charlie);
      Diagram: Alice -Meets-> Bob; Bob -Calls-> Carlos; Carlos -Pays-> Charlie; Bob -Calls-> Charlie
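To make the vertex/edge model on this slide concrete without the product itself, here is a minimal in-memory sketch in plain Java. It is not the InfiniteGraph API: the class `TinyGraph`, its `Edge` type, and the method names are hypothetical, and the call times are borrowed from the storage comparison slide because the slide's `timestamp` value is not shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the slide's graph; not the InfiniteGraph API.
class TinyGraph {
    // A directed, labeled edge: the relationship itself carries data.
    static final class Edge {
        final String label;   // e.g. "Meets", "Calls", "Pays"
        final String detail;  // free-form annotation such as place/date, time, or amount
        final String target;  // name of the vertex the edge points to
        Edge(String label, String detail, String target) {
            this.label = label;
            this.detail = detail;
            this.target = target;
        }
    }

    // Vertex name -> outgoing edges, in insertion order.
    private final Map<String, List<Edge>> outgoing = new LinkedHashMap<>();

    void addVertex(String name) {
        outgoing.putIfAbsent(name, new ArrayList<>());
    }

    void addEdge(String from, String label, String detail, String to) {
        addVertex(from);
        addVertex(to);
        outgoing.get(from).add(new Edge(label, detail, to));
    }

    List<Edge> edgesFrom(String name) {
        return outgoing.getOrDefault(name, List.of());
    }

    // Builds the four-person graph from the slide.
    static TinyGraph slideExample() {
        TinyGraph g = new TinyGraph();
        g.addEdge("Alice", "Meets", "Denver, 5-27-10", "Bob");
        g.addEdge("Bob", "Calls", "13:20", "Carlos");    // time from the storage comparison slide
        g.addEdge("Carlos", "Pays", "100000.00", "Charlie");
        g.addEdge("Bob", "Calls", "17:10", "Charlie");   // time from the storage comparison slide
        return g;
    }

    public static void main(String[] args) {
        TinyGraph g = slideExample();
        for (Edge e : g.edgesFrom("Bob")) {
            System.out.println("Bob -" + e.label + " (" + e.detail + ")-> " + e.target);
        }
    }
}
```

Following a relationship here is a direct lookup on the source vertex rather than a join across tables, which is the point the deck makes about relationships being first-class.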
    • 7. Physical Storage Comparison
      • Rows/Columns/Tables
          Meetings
            P1     Place   Time     P2
            Alice  Denver  5-27-10  Bob
          Calls
            From  Time   Duration  To
            Bob   13:20  25        Carlos
            Bob   17:10  15        Charlie
          Payments
            From    Date     Amount  To
            Carlos  5-12-10  100000  Charlie
      • Relationship/Graph Optimized
          Alice -Met (5-27-10)-> Bob
          Bob -Called (13:20)-> Carlos
          Carlos -Paid (100000)-> Charlie
          Bob -Called (17:10)-> Charlie
    • 8. Query and Navigation
      • Queries – but not as you know them
      • More like a rules based search and discovery
      • Asynchronous Results
      • Example queries (on the Alice, Bob, Carlos, Charlie graph):
        • "Find all paths between Alice and Charlie"
        • "Find all paths between Alice and Charlie – within 2 degrees"
        • "Find all paths between Alice and Charlie – events in May 2010"
    • 9. Navigation Example
        // Create a qualifier that describes the target vertex
        Qualifier findCharliePredicate =
            new VertexPredicate(personType, "name == 'Charlie'");

        // Construct a navigator which starts with Alice and uses a result qualifier
        // to find all paths in the graph to Charlie
        Navigator charlieFinder = alice.navigate(
            Guide.SIMPLE_BREADTH_FIRST,  // default guide
            Qualifier.ANY,               // no path constraints
            findCharliePredicate,        // find paths ending with Charlie
            myResultHandler);            // fire results to supplied handler

        // Start the navigator
        charlieFinder.start();
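The idea behind the navigator (breadth-first expansion from a start vertex, with a qualifier that decides which paths count as results) can be sketched in plain Java. This is an illustration only, not how InfiniteGraph implements navigation: `PathSearchSketch`, `slideGraph`, and `findPaths` are hypothetical names, edges are followed in the direction drawn on the slide, and "within 2 degrees" is assumed to mean at most two hops.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of breadth-first path discovery; not the InfiniteGraph API.
class PathSearchSketch {
    // Directed adjacency list for the four-person graph from slide 6.
    static Map<String, List<String>> slideGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("Alice", List.of("Bob"));
        g.put("Bob", List.of("Carlos", "Charlie"));
        g.put("Carlos", List.of("Charlie"));
        g.put("Charlie", List.of());
        return g;
    }

    // Enumerates simple paths from start to target breadth-first,
    // keeping only paths with at most maxHops edges.
    static List<List<String>> findPaths(Map<String, List<String>> graph,
                                        String start, String target, int maxHops) {
        List<List<String>> results = new ArrayList<>();
        Deque<List<String>> frontier = new ArrayDeque<>();
        frontier.add(List.of(start));
        while (!frontier.isEmpty()) {
            List<String> path = frontier.poll();
            String last = path.get(path.size() - 1);
            if (last.equals(target)) {
                results.add(path);          // plays the role of the result qualifier
                continue;
            }
            if (path.size() - 1 >= maxHops) {
                continue;                   // hop limit reached; do not extend this path
            }
            for (String next : graph.getOrDefault(last, List.of())) {
                if (!path.contains(next)) { // never revisit a vertex on the same path
                    List<String> extended = new ArrayList<>(path);
                    extended.add(next);
                    frontier.add(extended);
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // All paths from Alice to Charlie, then only those within two hops.
        System.out.println(findPaths(slideGraph(), "Alice", "Charlie", 3));
        System.out.println(findPaths(slideGraph(), "Alice", "Charlie", 2));
    }
}
```

With a hop limit of three the search finds both routes (via Bob directly, and via Bob and Carlos); with a limit of two only the direct Bob route remains. The sketch returns results synchronously, whereas the deck describes results being delivered asynchronously to a handler.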
    • 10. Management of Large Data Graphs
      • Graphs grow quickly
        • Billions of phone calls / day in US
        • Emails, social media events, IP Traffic
        • Financial transactions
      • Some analytics require navigation of large sections of the graph
      • Each step (often) depends on the last
      • Must distribute data and go parallel
    • 11. Basic Architecture
      Diagram labels: User Apps; Blueprints; IG Core/API; Configuration; Navigation Execution; Management Extensions; Session / TX Management; Placement; Objectivity/DB Distributed Database
    • 12. Feature Update 2.0
    • 13. Accelerated Ingest
      Diagram labels: IG Core/API; Configuration; Navigation Execution; Management Extensions; Session / TX Management; Placement; Standard Blocking Ingest/Placement (MDP Plugin); Objectivity/DB; App-1 (Ingest V1), App-2 (Ingest V2), App-3 (Ingest V3); vertices V1, V2, V3; App-1 (E12 {V1, V2}), App-2 (E23 {V2, V3}), App-3; edges E12, E23
    • 14. Accelerated Ingest
      Diagram labels: IG Core/API; Configuration; Navigation Execution; Management Extensions; Session / TX Management; Placement (Standard); Placement (Accelerated); vertices V1, V2, V3; edges E12, E23; Distributed Pipelines; Staging Containers; Pipeline Containers; queued edge entries E(1->2), E(2->1), E(2->3), E(3->1), E(3->2)
    • 15. InfiniteGraph Visualizer
      • Really nice flexible graph viewer
      • Browser style navigation and history
      • Full index support – search your data
      • Display connections around a selected point
      • Fully customize display to your data model
      • Full data view via selection
    • 16. InfiniteGraph Visualizer
    • 17. InfiniteGraph Visualizer
    • 18. Indexing Framework
      • Focused on providing choice!
      • Manual Indexes for grouping data
      • Automatic Indexes for cross population
      • Query interface with qualification language
      • Pluggable query operators
      • External index support (Lucene)
    • 19. >> next
      • Automated Distributed Navigation
      • Stored Loadable Navigators
      • Visualizer Navigation Plugins
      • More Visualizer Enhancements
      • More Import/Export support
    • 20. Graphs are used everywhere!
      • Social Network Analysis
        • Targeted Advertising
        • Recommendation Engines
      • Transportation
      • Network Analysis
      • Fraud Detection/Prevention
      • Crime Detection/Prevention
    • 21. Social Network Analysis
      Finding and measuring key players and relationships
      Network diagram nodes: Sam, Bob, Julie, Kate, Mary, Mike, Joe, Susan, Jim, Laura
        Value     Degree Centrality  Betweenness Centrality  Closeness  Eigenvalue
        High      Bob                Sam                     Sam        Bob, Sam
        Moderate  Sam                Bob, Joe                Bob, Joe   Julie, Kate
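Degree centrality, the simplest of the four measures in the table, is just a count of the edges touching each vertex. The transcript does not include the edges of this slide's ten-person network, so the sketch below instead uses the four-person graph from slide 6, treated as undirected (an assumption). The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of degree centrality on a small undirected graph.
class DegreeCentralitySketch {
    // Edges of the four-person example from slide 6, treated as undirected here.
    static final String[][] EDGES = {
        {"Alice", "Bob"}, {"Bob", "Carlos"}, {"Carlos", "Charlie"}, {"Bob", "Charlie"}
    };

    // Counts, for every vertex, how many edges touch it.
    static Map<String, Integer> degrees(String[][] edges) {
        Map<String, Integer> degree = new LinkedHashMap<>();
        for (String[] e : edges) {
            degree.merge(e[0], 1, Integer::sum);
            degree.merge(e[1], 1, Integer::sum);
        }
        return degree;
    }

    public static void main(String[] args) {
        degrees(EDGES).forEach((name, d) -> System.out.println(name + ": " + d));
    }
}
```

In this small graph Bob touches three edges, Carlos and Charlie two each, and Alice one. Betweenness, closeness, and eigenvector measures need shortest-path or iterative computations over the whole graph and are not shown here.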
    • 22. Transportation
      Given a list of flights between airports represented as…
        FLIGHT NO  DEPART AIRPORT  ARRIVE AIRPORT  DEPART TIME       ARRIVE TIME       PRICE
        0          AMS             LHR             2007-03-01-11.30  2007-03-01-12.30  160.17
        1          LHR             ORD             2007-03-01-13.30  2007-03-01-19.30  964.29
        2          ORD             LAX             2007-03-01-20.30  2007-03-02-01.30  583.11
        3          LAX             SYD             2007-03-02-02.30  2007-03-02-12.30  1663.04
        4          AMS             TYO             2007-03-01-11.00  2007-03-01-22.00  1595.86
        5          TYO             SYD             2007-03-02-03.00  2007-03-02-14.00  1487.33
        6          AMS             LAX             2007-03-01-18.00  2007-03-02-07.00  1374.15
        7          AMS             JFK             2007-03-01-10.00  2007-03-01-16.00  964.61
        8          JFK             PHX             2007-03-01-19.00  2007-03-02-01.00  1069.99
        9          AMS             LGA             2007-03-01-10.00  2007-03-01-16.00  1081.56
        10         LGA             PHX             2007-03-01-20.00  2007-03-02-02.00  911.92
        11         AMS             EWR             2007-03-01-10.00  2007-03-01-17.00  911.36
        12         EWR             PHX             2007-03-01-19.00  2007-03-02-00.00  937.98
        13         AMS             CAI             2007-03-01-09.00  2007-03-01-16.00  1208.67
        14         CAI             TYO             2007-03-01-19.00  2007-03-02-00.00  977.95
        15         AMS             JFK             2007-03-01-15.00  2007-03-01-21.00  1155.43
        16         AMS             LGA             2007-03-01-12.00  2007-03-01-18.00  923.61
        17         AMS             LHR             2007-03-01-15.00  2007-03-01-16.00  114.23
      … try to answer the following:
      "Find me the cheapest flight from Amsterdam to Phoenix leaving on March 1, 2007, with a maximum of two stops, and each stop should be less than 4 hours"
    • 23. Transportation (graph model)
      Graph diagram: airports AMS, LHR, ORD, LAX, SYD, TYO, JFK, LGA, PHX, EWR, CAI as vertices; flights as edges labeled with flight number and price (F0-160.17 through F17-114.23, matching the flight list on the previous slide)
        Path 1: AMS -(F16)-> LGA -(F10)-> PHX   Total Price: $1835.53
        Path 2: AMS -(F11)-> EWR -(F12)-> PHX   Total Price: $1849.34
        Path 3: AMS -(F09)-> LGA -(F10)-> PHX   Total Price: $1993.48
        Path 4: AMS -(F07)-> JFK -(F08)-> PHX   Total Price: $2034.60
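The four paths above can be reproduced with a small depth-first search over the flight list, sketched here in plain Java rather than with the InfiniteGraph API. All class and method names are hypothetical. One assumption is worth flagging: Path 3 has a layover of exactly four hours (F09 arrives at 16:00, F10 departs at 20:00), so the sketch accepts layovers of at most four hours rather than strictly less.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical in-memory sketch of the route query; not the InfiniteGraph API.
class FlightSearchSketch {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("uuuu-MM-dd-HH.mm");

    static final class Flight {
        final int no; final String from; final String to;
        final LocalDateTime depart; final LocalDateTime arrive; final double price;
        Flight(int no, String from, String to, String depart, String arrive, double price) {
            this.no = no; this.from = from; this.to = to; this.price = price;
            this.depart = LocalDateTime.parse(depart, FMT);
            this.arrive = LocalDateTime.parse(arrive, FMT);
        }
    }

    static final class Itinerary {
        final List<Flight> legs;
        Itinerary(List<Flight> legs) { this.legs = legs; }
        double total() { return legs.stream().mapToDouble(f -> f.price).sum(); }
        String describe() {
            StringBuilder sb = new StringBuilder(legs.get(0).from);
            for (Flight f : legs) sb.append(" -(F").append(f.no).append(")-> ").append(f.to);
            return sb.toString();
        }
    }

    // The 18 flights from the slide's table.
    static final List<Flight> FLIGHTS = List.of(
        new Flight(0, "AMS", "LHR", "2007-03-01-11.30", "2007-03-01-12.30", 160.17),
        new Flight(1, "LHR", "ORD", "2007-03-01-13.30", "2007-03-01-19.30", 964.29),
        new Flight(2, "ORD", "LAX", "2007-03-01-20.30", "2007-03-02-01.30", 583.11),
        new Flight(3, "LAX", "SYD", "2007-03-02-02.30", "2007-03-02-12.30", 1663.04),
        new Flight(4, "AMS", "TYO", "2007-03-01-11.00", "2007-03-01-22.00", 1595.86),
        new Flight(5, "TYO", "SYD", "2007-03-02-03.00", "2007-03-02-14.00", 1487.33),
        new Flight(6, "AMS", "LAX", "2007-03-01-18.00", "2007-03-02-07.00", 1374.15),
        new Flight(7, "AMS", "JFK", "2007-03-01-10.00", "2007-03-01-16.00", 964.61),
        new Flight(8, "JFK", "PHX", "2007-03-01-19.00", "2007-03-02-01.00", 1069.99),
        new Flight(9, "AMS", "LGA", "2007-03-01-10.00", "2007-03-01-16.00", 1081.56),
        new Flight(10, "LGA", "PHX", "2007-03-01-20.00", "2007-03-02-02.00", 911.92),
        new Flight(11, "AMS", "EWR", "2007-03-01-10.00", "2007-03-01-17.00", 911.36),
        new Flight(12, "EWR", "PHX", "2007-03-01-19.00", "2007-03-02-00.00", 937.98),
        new Flight(13, "AMS", "CAI", "2007-03-01-09.00", "2007-03-01-16.00", 1208.67),
        new Flight(14, "CAI", "TYO", "2007-03-01-19.00", "2007-03-02-00.00", 977.95),
        new Flight(15, "AMS", "JFK", "2007-03-01-15.00", "2007-03-01-21.00", 1155.43),
        new Flight(16, "AMS", "LGA", "2007-03-01-12.00", "2007-03-01-18.00", 923.61),
        new Flight(17, "AMS", "LHR", "2007-03-01-15.00", "2007-03-01-16.00", 114.23));

    // Finds itineraries leaving origin on the given day, cheapest first.
    static List<Itinerary> search(String origin, String dest, LocalDate day,
                                  int maxStops, Duration maxLayover) {
        List<Itinerary> found = new ArrayList<>();
        for (Flight first : FLIGHTS) {
            if (first.from.equals(origin) && first.depart.toLocalDate().equals(day)) {
                List<Flight> legs = new ArrayList<>();
                legs.add(first);
                extend(legs, dest, maxStops, maxLayover, found);
            }
        }
        found.sort(Comparator.comparingDouble(Itinerary::total));
        return found;
    }

    // Depth-first extension of a partial itinerary, one connecting flight at a time.
    static void extend(List<Flight> legs, String dest, int maxStops,
                       Duration maxLayover, List<Itinerary> found) {
        Flight last = legs.get(legs.size() - 1);
        if (last.to.equals(dest)) {
            found.add(new Itinerary(new ArrayList<>(legs)));
            return;
        }
        if (legs.size() > maxStops) return;  // another leg would exceed the stop limit
        for (Flight next : FLIGHTS) {
            if (!next.from.equals(last.to)) continue;
            Duration layover = Duration.between(last.arrive, next.depart);
            // Assumption: a layover of exactly maxLayover is allowed (see lead-in).
            if (layover.isNegative() || layover.compareTo(maxLayover) > 0) continue;
            legs.add(next);
            extend(legs, dest, maxStops, maxLayover, found);
            legs.remove(legs.size() - 1);
        }
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2007, 3, 1);
        for (Itinerary it : search("AMS", "PHX", day, 2, Duration.ofHours(4))) {
            System.out.println(it.describe() + String.format(Locale.ROOT, "  $%.2f", it.total()));
        }
    }
}
```

Each step of the search depends on where the previous leg landed and when, which is the "each step depends on the last" property the deck highlights for graph navigation.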
    • 24. Finding Criminal Activity (by association)
    • 25. Finding Criminal Activity (by location)
    • 26. Thank you! [email_address] [email_address]