October 21, 2010
Warren Davidson wdavidson@infinitegraph.com
Darren Wood dwood@infinitegraph.com
InfiniteGraph www.infinit...
InfiniteGraph Presentation from Oct 21, 2010 DBTA Webcast

This is the presentation given by Warren Davidson, Director of Business Development, and Darren Wood, InfiniteGraph chief architect. The October 21, 2010 webinar, hosted by DBTA with InfiniteGraph and Riptano, covered new data technologies and how the NoSQL ("Not Only SQL") approach helps address the more complex application, scalability, and performance requirements of handling vast amounts of data, and of performing advanced analytics on those data volumes with greater ease and speed.

Published in: Technology
Notes
  • Key-value pair stores have a simple interface – Put, Get and Delete
    Voldemort – a distributed key-value storage system implemented as a fault-tolerant hash table
    Dynamo – a distributed, highly available, fault-tolerant key-value store
    BigTable – fast and extremely large scale; built on the distributed Google File System; MapReduce provides distributed parallel processing
    Cassandra – structured key-value store with a column-family data model; eventually consistent; distributed systems technology from Dynamo, data model from Google's BigTable
    HBase –
    Hypertable –
    CouchDB – a document-oriented database that can be queried and indexed in a MapReduce fashion using JavaScript
    MongoDB – document-oriented, with a more complex schema model than plain key/value pairs; written in C++; uses MapReduce for processing
    Neo4j –
    HyperGraphDB –
    AllegroGraph –
    Sones –
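The Put, Get and Delete interface mentioned in the notes above can be sketched in a few lines. This is an in-memory stand-in only; the class name `KeyValueStore` and its method signatures are assumptions made for illustration and do not reproduce the API of Voldemort, Dynamo, or any other product named here.

```python
from typing import Dict, Optional


class KeyValueStore:
    """In-memory stand-in for a key-value store (hypothetical, not a product API)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Store the value under the key, overwriting any previous value.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Return the stored value, or None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Remove the key and report whether it was present.
        return self._data.pop(key, None) is not None
```

A distributed store adds replication and partitioning behind the same three calls; none of that is modelled in this sketch.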
  • InfiniteGraph Presentation from Oct 21, 2010 DBTA Webcast

    1. October 21, 2010
       Warren Davidson wdavidson@infinitegraph.com
       Darren Wood dwood@infinitegraph.com
       InfiniteGraph www.infinitegraph.com
    2. Agenda
       • The NoSQL Landscape
       • InfiniteGraph
       • Solving what problems and how?
       Copyright © InfiniteGraph
    3. Some NoSQL Notes
       • NoSQL = Not Only SQL
       • NoSQL is requirements driven
       • NoSQL = open source?
       • NoSQL = cloud computing?
    4. The NoSQL Landscape
       Company Confidential
       Cassandra, InfiniteGraph
    5. NoSQL Landscape (axes: Complexity, Performance)
       • Key Value Stores: Voldemort – LinkedIn, Dynamo – Amazon
       • BigTable Clones: Cassandra – Facebook, HBase – Apache/Hadoop, Hypertable
       • Document databases: CouchDB – Apache, MongoDB
       • Graph Databases (Social Network Analysis, Intelligence Community): Neo4j, HypergraphDB, AllegroGraph, Sones
    6. Graph Databases
       • A graph database is used to trace relationships among entities, most commonly people, to any depth. Its characteristics are:
         – Very simple, fixed schema
         – Very complex data relationships
         – Used to support complex associations among like entities.
       Meta-Model: Node, Edge, Attribute(s)
       Instance Example (simplified): John Jones, Jane Jones-Smith, Nancy Jones, Paul Jones, Doris Smith, Jim Smith, Jeff Smith
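The meta-model on slide 6 (Node, Edge, Attribute(s)) can be sketched as a small data structure. The class names, the `related_to` label, and the two edges below are assumptions for illustration: the slide names the people but does not say how they are connected, and this is not the InfiniteGraph API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    source: Node
    target: Node
    label: str
    attributes: Dict[str, Any] = field(default_factory=dict)


class Graph:
    """Fixed, very simple schema: every record is either a Node or an Edge."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, name: str, **attributes: Any) -> Node:
        node = Node(name, attributes)
        self.nodes[name] = node
        return node

    def add_edge(self, source: str, target: str, label: str, **attributes: Any) -> Edge:
        edge = Edge(self.nodes[source], self.nodes[target], label, attributes)
        self.edges.append(edge)
        return edge

    def neighbors(self, name: str) -> List[str]:
        # Treat edges as undirected and return the connected names, sorted.
        found = set()
        for edge in self.edges:
            if edge.source.name == name:
                found.add(edge.target.name)
            elif edge.target.name == name:
                found.add(edge.source.name)
        return sorted(found)


g = Graph()
for person in ["John Jones", "Jane Jones-Smith", "Jim Smith"]:
    g.add_node(person)
# Hypothetical relationships; the slide does not specify them.
g.add_edge("John Jones", "Jane Jones-Smith", "related_to")
g.add_edge("Jane Jones-Smith", "Jim Smith", "related_to")
```

The schema stays fixed at two record types while the relationships between instances can become arbitrarily complex, which is the contrast the slide draws.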
    7. InfiniteGraph
       A business unit of Objectivity
       • In the business of distributed data management for over 10 years
       • Solving graph data problems for over 8 years
       • Focusing on the emerging requirements of graph data for cloud and on-premise distributed systems
    8. Graphs are everywhere
       Enterprise and government 2.0, bio-engineering, gene sequencing, drug development…
       LinkedIn, Facebook…
       Social network analytics, social CRM…
       Network analysis, complex BoM, predictive and real-time ISR, fraud detection and response…
    9. Graph Databases – What’s so Different?
       Darren Wood
       Chief Architect, InfiniteGraph
    10. Graph Databases
        • Key technical attributes
        • How InfiniteGraph addresses these
        • Query and navigation
        • Challenges/Requirements of Distribution
        • Practical applications
    11. Graph Databases
        • Optimized around data relationships
          – Relationships as first class citizens
          – Super fast navigation between entities
          – Rich/flexible annotation of connections
        • Small focused API (typically not SQL)
          – Natively work with concepts of Vertex/Edge
          – SQL has no concept of “navigation”
          – Most attempts based in SQL are convoluted
    12. Physical Storage Comparison
        Rows/Columns/Tables:
          Meetings
            P1      P2       Place    Time
            Alice   Bob      Denver   5-27-10
          Calls
            From    To       Time     Duration
            Bob     Carlos   13:20    25
            Bob     Charlie  17:10    15
          Payments
            From    To       Date     Amount
            Carlos  Charlie  5-12-10  100000
        Relationship/Graph Optimized:
          Alice  --Met (5-27-10)-->  Bob
          Bob    --Called (13:20)--> Carlos
          Bob    --Called (17:10)--> Charlie
          Carlos --Paid (100000)-->  Charlie
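The comparison on slide 12 can be made concrete with a short sketch that answers the same question ("who is connected to Bob?") against both layouts. The column order of the tables and the helper names (`contacts_from_tables`, `link`, `contacts_from_graph`) are assumptions; the rows themselves are the ones shown on the slide.

```python
from collections import defaultdict

# Rows as shown on the slide; the column order is an assumption.
meetings = [("Alice", "Bob", "Denver", "5-27-10")]
calls = [("Bob", "Carlos", "13:20", 25), ("Bob", "Charlie", "17:10", 15)]
payments = [("Carlos", "Charlie", "5-12-10", 100000)]


def contacts_from_tables(person):
    """Row layout: scan every table for rows that mention the person."""
    found = set()
    for table in (meetings, calls, payments):
        for row in table:
            first, second = row[0], row[1]
            if first == person:
                found.add(second)
            elif second == person:
                found.add(first)
    return found


# Relationship layout: each person keeps a list of (label, other, attributes).
adjacency = defaultdict(list)


def link(a, b, label, **attributes):
    # Record the relationship at both endpoints so either side can navigate it.
    adjacency[a].append((label, b, attributes))
    adjacency[b].append((label, a, attributes))


for p1, p2, place, when in meetings:
    link(p1, p2, "Met", place=place, date=when)
for caller, callee, time, duration in calls:
    link(caller, callee, "Called", time=time, duration=duration)
for payer, payee, when, amount in payments:
    link(payer, payee, "Paid", date=when, amount=amount)


def contacts_from_graph(person):
    """Graph layout: one lookup, no scan over unrelated rows."""
    return {other for _label, other, _attributes in adjacency[person]}
```

Both functions return the same set for Bob, but the table version touches every row of every table, while the graph version reads only the edges stored with Bob.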
    13. Query and Navigation
        • Queries – but not as you know them
        • More like a rules based search and discovery
        • Asynchronous Results
        Figure: Alice, Bob, Carlos and Charlie connected by Meets, Calls, Pays and Calls edges
        “Find all paths between Alice and Charlie”
        “Find all paths between Alice and Charlie – within 2 degrees”
        “Find all paths between Alice and Charlie – events in May 2010”
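The three example queries on slide 13 can be sketched as one path search with two optional rules. Several things here are assumptions: the edge list is taken from the storage comparison slide, edges are treated as undirected, "degrees" is read as the maximum number of edges in a path, dates are read as month-day-year, and the names `Edge`, `find_paths`, and `in_may_2010` are invented for illustration. This is not InfiniteGraph's navigation API.

```python
from datetime import date
from typing import Callable, Dict, List, NamedTuple, Optional


class Edge(NamedTuple):
    a: str
    b: str
    label: str
    when: Optional[date]  # None where the slides give only a time of day


EDGES = [
    Edge("Alice", "Bob", "Meets", date(2010, 5, 27)),
    Edge("Bob", "Carlos", "Calls", None),
    Edge("Bob", "Charlie", "Calls", None),
    Edge("Carlos", "Charlie", "Pays", date(2010, 5, 12)),
]


def find_paths(
    start: str,
    goal: str,
    max_degrees: Optional[int] = None,
    edge_ok: Callable[[Edge], bool] = lambda e: True,
) -> List[List[str]]:
    """Depth-first enumeration of simple paths (no vertex visited twice)."""
    adjacency: Dict[str, List[str]] = {}
    for e in EDGES:
        if edge_ok(e):  # rule: only navigate edges that qualify
            adjacency.setdefault(e.a, []).append(e.b)
            adjacency.setdefault(e.b, []).append(e.a)

    paths: List[List[str]] = []

    def walk(path: List[str]) -> None:
        node = path[-1]
        if node == goal:
            paths.append(list(path))
            return
        if max_degrees is not None and len(path) - 1 >= max_degrees:
            return  # rule: stop once the path is max_degrees edges long
        for nxt in adjacency.get(node, []):
            if nxt not in path:
                path.append(nxt)
                walk(path)
                path.pop()

    walk([start])
    return paths


def in_may_2010(e: Edge) -> bool:
    return e.when is not None and (e.when.year, e.when.month) == (2010, 5)


all_paths = find_paths("Alice", "Charlie")
within_two = find_paths("Alice", "Charlie", max_degrees=2)
may_only = find_paths("Alice", "Charlie", edge_ok=in_may_2010)
```

With this data the unrestricted query finds two paths (through Bob directly, and through Bob and Carlos), the 2-degree limit keeps only the direct one, and the May 2010 rule finds none because both paths depend on call edges that carry no date in the slides.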
    14. Management of Large Data Graphs
        • Graphs grow quickly
          – Billions of phone calls / day in US
          – Emails, social media events, IP Traffic
          – Financial transactions
        • Some analytics require navigation of large sections of the graph
        • Each step (often) depends on the last
        • Must distribute data and go parallel
    15. Graph Partitioning
        • Graph partitioning is not as simple
        • Graph operations are rarely partition bound
        • Graphs are ‘alive’
        • Repartitioning is expensive
        • Partitions must co-operate
    16. Graph Partitioning – Reality!
        Diagram: Application(s), Distributed API, Partition 1, Partition 2, Partition 3, Partition ...n, four Processor boxes
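The claim that graph operations are rarely partition bound can be illustrated by counting how many edges cross a partition boundary. The vertex placement in `PARTITION_OF` is hypothetical, as are the function names; the edges reuse the four-person example from the earlier slides.

```python
EDGES = [
    ("Alice", "Bob"),
    ("Bob", "Carlos"),
    ("Bob", "Charlie"),
    ("Carlos", "Charlie"),
]

# Hypothetical placement of vertices on two partitions.
PARTITION_OF = {"Alice": 0, "Bob": 0, "Carlos": 1, "Charlie": 1}


def cut_edges(edges, partition_of):
    """Edges whose endpoints live on different partitions."""
    return [(a, b) for a, b in edges if partition_of[a] != partition_of[b]]


def remote_hops(path, partition_of):
    """Number of steps along a path that cross from one partition to another."""
    return sum(
        1 for a, b in zip(path, path[1:]) if partition_of[a] != partition_of[b]
    )


crossing = cut_edges(EDGES, PARTITION_OF)
hops = remote_hops(["Alice", "Bob", "Charlie"], PARTITION_OF)
```

Even in this tiny graph, half of the edges cross the boundary, and the shortest Alice-to-Charlie path needs one remote step, so a navigation that starts on one partition has to be continued on another.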
    17. Distributed Graph Must Haves
        • High performance distributed persistence
        • Ability to deal with remote data reads (fast)
        • Intelligent local cache of subgraphs
        • Distributed navigation processing
        • Distributed, multi-source concurrent ingest
        • Write modes supporting both strict and eventual consistency
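One way to picture a local cache in front of remote reads is a small least-recently-used (LRU) cache of neighbor lists. LRU is a stand-in chosen for this sketch; the slides do not say which caching policy InfiniteGraph uses, and `CachedNeighborReader` and `fetch_remote` are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable, List


class CachedNeighborReader:
    """Keeps recently read neighbor lists locally to avoid repeated remote reads."""

    def __init__(self, fetch_remote: Callable[[str], List[str]], capacity: int = 2) -> None:
        self._fetch_remote = fetch_remote  # stands in for a network read
        self._capacity = capacity
        self._cache: "OrderedDict[str, List[str]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def neighbors(self, vertex: str) -> List[str]:
        if vertex in self._cache:
            self._cache.move_to_end(vertex)  # mark as most recently used
            self.hits += 1
            return self._cache[vertex]
        self.misses += 1
        result = self._fetch_remote(vertex)
        self._cache[vertex] = result
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return result


# A dictionary plays the role of the remote partition in this sketch.
REMOTE = {"Alice": ["Bob"], "Bob": ["Alice", "Carlos", "Charlie"], "Carlos": ["Bob", "Charlie"]}
reader = CachedNeighborReader(lambda v: REMOTE.get(v, []), capacity=2)
for vertex in ["Alice", "Alice", "Bob", "Carlos", "Alice"]:
    reader.neighbors(vertex)
```

In this run the second read of Alice is served locally, while the last one is remote again because Alice was evicted to make room for Carlos.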
    18. Practical Applications
    19. Graph Analysis (Algorithms)
        • Social Networks
          – Most connected participants
          – Influencers
          – Important Syndicates or Sub-networks
        • Central figures in crime organisations
        • Business Intelligence
          – Discovering Knowledge Assets
          – Complex analytics
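"Most connected participants" is, in its simplest form, a count of edges per vertex (degree). The sketch below applies it to the four-person example used earlier in the deck; the function name `most_connected` is an assumption, and measures of influence generally need more than a degree count.

```python
from collections import Counter

EDGES = [
    ("Alice", "Bob"),
    ("Bob", "Carlos"),
    ("Bob", "Charlie"),
    ("Carlos", "Charlie"),
]


def most_connected(edges, top=1):
    """Rank vertices by degree, i.e. the number of edges that touch them."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common(top)


top_participant = most_connected(EDGES, top=1)
```

Here Bob comes out on top with three connections, which matches his position in the middle of the example graph.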
    20. Graph Analysis (Patterns)
        • Crime (again)
          – Recognize common patterns of activity
          – Complex chains of interaction
        • Security
          – Recognize attack/threat patterns
          – Auditing / log analytics
        • Targeted Advertising
          – To specific browsing patterns
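Recognizing a chain of interaction can be sketched as matching a sequence of edge labels along directed edges. The function `match_chain`, the direction assigned to each edge, and the choice of the pattern "Called, then Paid" are assumptions for illustration; the edges come from the storage comparison slide.

```python
EDGES = [
    ("Alice", "Met", "Bob"),
    ("Bob", "Called", "Carlos"),
    ("Bob", "Called", "Charlie"),
    ("Carlos", "Paid", "Charlie"),
]


def match_chain(edges, labels):
    """Return vertex sequences whose consecutive edges carry the labels in order."""
    by_source = {}
    for src, label, dst in edges:
        by_source.setdefault(src, []).append((label, dst))

    matches = []

    def extend(path, remaining):
        if not remaining:
            matches.append(path)
            return
        for label, dst in by_source.get(path[-1], []):
            # Follow only edges with the next expected label, never revisiting a vertex.
            if label == remaining[0] and dst not in path:
                extend(path + [dst], remaining[1:])

    for start in list(by_source):
        extend([start], labels)
    return matches


called_then_paid = match_chain(EDGES, ["Called", "Paid"])
```

On this data the pattern matches once: Bob called Carlos, and Carlos then paid Charlie. Bob's call to Charlie does not match because Charlie makes no payment.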
    21. Many Many More!
        • Spatial data
        • Defence / Situational Awareness
        • Sciences
        • Health Care
        • Genealogy
        • Logistics
        • Tracking
    22. Thank you!
        darren.wood@infinitegraph.com
        wdavidson@infinitegraph.com
        Twitter - @infinitegraph
