Dbta Webinar Realize Value of Big Data with graph 011713
1. Realize The Value In Your Big Data
With Graph Technology
www.Objectivity.com
Leon Guzenda - Objectivity, Inc.
DBTA Webinar – January 17, 2013
2. Overview
• Who We Are
• Current Big Data Analytics
• Relationship Analytics
• Graph Technologies
• The Big Data Connection Platform
3. About Objectivity Inc.
• Objectivity, Inc. is headquartered in Sunnyvale, California.
• Established in 1988 to tackle database problems that network/hierarchical/relational
and file-based technologies struggle with.
• Objectivity has over two decades of Big Data and NoSQL experience
• Develops NoSQL platforms for managing and discovering relationships and
patterns in complex data:
– Objectivity/DB - an object database that manages localized, centralized or
distributed databases
– InfiniteGraph - a massively scalable graph database built on Objectivity/DB
that enables organizations to find, store and exploit the relationships in their
data
• Embedded in hundreds of enterprises, government organizations and products, with
millions of deployments.
6. We All Know The Problem - Information Overload!
Volume, Velocity, Variety, Veracity, Value...
Making sense of it all takes time and $$$
Current “Big Data” Analytics
7. A Typical “Big Data” Analytics Setup
Data Aggregation and Analytics Applications
Commodity Linux Platforms and/or High Performance Computing Clusters
Data stores: RDBMS, Column Store, Data W/H, Hadoop, Graph DB, Object DB, Doc DB, K-V Store
Data types: Structured, Semi-Structured, Unstructured
8. Incremental Improvements Aren’t Enough
All current solutions use the same basic architectural model
• None of the current solutions have a way to store connections between
entities in different silos
• Most analytic technology focuses on the content of the data nodes, rather
than the many kinds of connections between the nodes and the data in
those connections
• Why? Because traditional and earlier NoSQL solutions are bad at handling
relationships.
• Graph databases can efficiently store, manage and query the many kinds of
relationships hidden in the data.
9. Not Only SQL – a group of 4 primary technologies
• Key-Value Stores
• “Big Table” Clones
• Document Databases
• Object and Graph databases
Graph Database
Graph Processing
10. Not Only SQL – A group of 4 primary technologies
[Diagram: the four technology groups on a scale running from “Simple” to “Highly Interconnected”]
11. Graph Theory Terminology...
VERTEX: A single node in a graph data structure
EDGE: A connection between a pair of VERTICES
PROPERTIES: Data items that belong to a particular Vertex
WEIGHT: A quantity associated with a particular Edge
GRAPH: A collection of linked Vertex and Edge objects
Vertex 1: City: San Francisco, Pop: 812,826
Edge 1: Road: I-101, Miles: 47.8
Vertex 2: City: San Jose, Pop: 967,487
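The terminology above can be sketched in a few lines of Python. This is only an illustration of the terms, not Objectivity's or InfiniteGraph's API; the class names `Vertex`, `Edge` and `Graph` and their methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A single node; its properties are the data items that belong to it."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A connection between a pair of vertices, carrying a weight."""
    label: str
    a: Vertex
    b: Vertex
    weight: float


@dataclass
class Graph:
    """A collection of linked Vertex and Edge objects."""
    vertices: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_vertex(self, name, **properties):
        vertex = Vertex(name, properties)
        self.vertices.append(vertex)
        return vertex

    def add_edge(self, label, a, b, weight):
        edge = Edge(label, a, b, weight)
        self.edges.append(edge)
        return edge


# The example from the slide: two cities joined by one road.
g = Graph()
sf = g.add_vertex("Vertex 1", City="San Francisco", Pop=812_826)
sj = g.add_vertex("Vertex 2", City="San Jose", Pop=967_487)
road = g.add_edge("I-101", sf, sj, weight=47.8)  # weight = miles
```

Here the road length plays the role of the WEIGHT, while city name and population are PROPERTIES of each Vertex.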
12. ...Graph Theory Terminology...
SIMPLE/UNDIRECTED GRAPH: A Graph where each VERTEX may be linked to
one or more Vertex objects via Edge objects and each Edge object is connected to
exactly two Vertex objects. Furthermore, neither Vertex connected to an Edge is more
significant than the other.
DIRECTED GRAPH: A Graph in which each Edge has a direction: one Vertex in a
Vertex + Edge + Vertex group (an “Arc” or “Path”) is the “Head” of the
Path and the other is the “Tail”.
MIXED GRAPH: A Graph in which some paths are Undirected and others are
Directed.
13. ...Graph Theory Terminology
LOOP: An Edge whose two ends are both linked to the same Vertex
MULTIGRAPH: A Graph that allows multiple Edges and Loops
QUIVER: A Graph where Vertices are allowed to be connected by multiple Arcs.
A Quiver may include Loops.
WEIGHTED GRAPH: A Graph where a quantity is assigned to an Edge, e.g.
a Length assigned to an Edge representing a road between two Vertices representing
cities.
HALF EDGE: An Edge that is only connected to a single Vertex
LOOSE EDGE: An Edge that isn't connected to any Vertices.
CONNECTIVITY: Two Vertices are Connected if it is possible to find a path between
them.
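The CONNECTIVITY definition can be checked with a breadth-first search. The sketch below assumes an undirected graph stored as a plain adjacency dictionary; the helper names and every city other than San Francisco and San Jose are made up for illustration.

```python
from collections import deque


def undirected(edges):
    """Build an adjacency map in which every edge can be walked both ways."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def connected(adjacency, start, goal):
    """Return True if some path links start and goal."""
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == goal:
            return True
        for neighbour in adjacency.get(vertex, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


# Hypothetical road map: two separate groups of cities.
roads = undirected([
    ("San Francisco", "San Jose"),
    ("San Jose", "Gilroy"),
    ("Reno", "Sparks"),
])
sf_to_gilroy = connected(roads, "San Francisco", "Gilroy")
sf_to_reno = connected(roads, "San Francisco", "Reno")
```

With this sample data the first pair is connected through San Jose, while no road links San Francisco to Reno.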
15. Example 1 – Social Network Analysis
Sources may be covert or open
Telecom Call Detail Records
Banking transactions
Flight and hotel reservations
MASINT
Twitter
Facebook
Google+
LinkedIn
Plaxo
Flickr
YouTube
16. Example 2 – Finding Patterns In Open Source Data...
The Challenges
Data Volumes
Fast-Changing Data
Sensitivity of Data
Significance of Data
23. Relationship (Connection) Analytics...
A SQL Shortcoming
Think about the SQL query for finding all links between the two “blue” rows... it's hard!!
Table_A Table_B Table_C Table_D Table_E Table_F Table_G
There are some kinds of complex relationship handling problems that SQL
wasn't designed for.
24. Relationship (Connection) Analytics...
A SQL Shortcoming
Table_A Table_B Table_C Table_D Table_E Table_F Table_G
InfiniteGraph - The solution can be found with a few lines of code
A3 G4
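To give a feel for "a few lines of code", here is a minimal Python sketch that finds every chain of links between two rows once the rows are treated as vertices. It is not InfiniteGraph code; the `all_paths` helper and all row identifiers other than A3 and G4 are hypothetical.

```python
def all_paths(links, start, goal, path=None):
    """Yield every loop-free chain of links from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt in links.get(start, ()):
        if nxt not in path:  # never revisit a row on the same chain
            yield from all_paths(links, nxt, goal, path)


# Hypothetical row-to-row references across Table_A .. Table_G.
links = {
    "A3": ["B1", "C2"],
    "B1": ["D5"],
    "C2": ["D5", "E7"],
    "D5": ["G4"],
    "E7": ["F2"],
    "F2": ["G4"],
}
paths = list(all_paths(links, "A3", "G4"))
for p in paths:
    print(" -> ".join(p))
```

The search does not need to know in advance how many tables a chain crosses, which is the part that is awkward to express as a fixed set of SQL joins.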
25. Representing the Graph...
The existing data might look like this:
Events/Places: Situation X, Situation Y, Target T, Cafe C
People/Orgs: Combatant A, Bank X, Civilian P, Civilian Q, Civilian R, Civilian S
Facts: A Called P, P Called Q, P Called R, A Banks at X, A Seen At Y, A Eats At,
A Seen Near X, Q Seen Near T, R Seen Near T, S Seen Near T, P Emailed S, X Paid S
26. Representing the Graph...
We start by identifying the nodes (Vertices) and the connections (Edges)
NODES
Events/Places: Situation X, Situation Y, Target T, Cafe C
People/Orgs: Combatant A, Bank X, Civilian P, Civilian Q, Civilian R, Civilian S
CONNECTIONS
Facts: A Called P, P Called Q, P Called R, A Banks at X, A Seen At Y, A Eats At,
A Seen Near X, Q Seen Near T, R Seen Near T, S Seen Near T, P Emailed S, X Paid S
28. ...Representing the Graph..
VERTEX = “Nodes”, EDGE = “Connections”
Combatant A: Seen Near Situation X; Seen At Situation Y; Eats At Cafe C; Called Civilian P; Banks At Bank X
Civilian P: Called Civilian Q; Called Civilian R; Emailed Civilian S
Bank X: Paid Civilian S
Civilian Q, Civilian R, Civilian S: Seen Near Target T
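One way to turn the facts into vertices and edges is to store each fact as a (subject, label, object) triple. This is a sketch under assumptions: the short names on the slides are expanded by hand (the "X" in "Banks at X" and "X Paid S" is read as Bank X, the "X" in "Seen Near X" as Situation X), and the "Eats At" fact is pointed at Cafe C because the diagram draws it that way.

```python
from collections import defaultdict

# Each fact becomes one directed, labelled edge.
FACTS = [
    ("Combatant A", "Called", "Civilian P"),
    ("Civilian P", "Called", "Civilian Q"),
    ("Civilian P", "Called", "Civilian R"),
    ("Combatant A", "Banks At", "Bank X"),
    ("Combatant A", "Seen At", "Situation Y"),
    ("Combatant A", "Eats At", "Cafe C"),  # target assumed from the diagram
    ("Combatant A", "Seen Near", "Situation X"),
    ("Civilian Q", "Seen Near", "Target T"),
    ("Civilian R", "Seen Near", "Target T"),
    ("Civilian S", "Seen Near", "Target T"),
    ("Civilian P", "Emailed", "Civilian S"),
    ("Bank X", "Paid", "Civilian S"),
]

# Vertices are every name that appears at either end of a fact.
vertices = sorted({name for s, _, o in FACTS for name in (s, o)})

# Edges are grouped by their starting vertex for quick traversal.
out_edges = defaultdict(list)
for subject, label, obj in FACTS:
    out_edges[subject].append((label, obj))

for label, target in out_edges["Combatant A"]:
    print(f"Combatant A --{label}--> {target}")
```

Grouping edges by their starting vertex means following a connection is a dictionary lookup rather than a search through all the facts.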
29. ...Analyzing the Graph...
Situation X Seen Near Combatant A Seen At Situation Y
Called Banks At
Eats At
Cafe C Civilian P Bank X
Called Called Emailed Paid
Civilian Q Civilian R Civilian S
Seen Near Seen Near Seen Near
Target T
30. ...Threat Analysis
Situation X Seen Near Combatant A Seen At Situation Y
Called Banks At
SUSPECTS
Civilian P Bank X
Called Called Emailed Paid
Civilian Q Civilian R Civilian S
Seen Near Seen Near Seen Near
Target T NEEDS PROTECTION
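The slides do not say how the suspects were derived. One plausible reading, sketched here as an assumption, is that a suspect is any vertex lying on a chain of connections from Combatant A to Target T. That can be computed by intersecting what is reachable forward from A with what can reach T. The edge directions follow the facts listed earlier; the function names are hypothetical.

```python
def build(edges, reverse=False):
    """Adjacency map for the edges, optionally with every edge flipped."""
    adjacency = {}
    for a, b in edges:
        if reverse:
            a, b = b, a
        adjacency.setdefault(a, set()).add(b)
    return adjacency


def reachable(adjacency, start):
    """All vertices that can be reached from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        for nxt in adjacency.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Connections shown on the threat-analysis slide.
EDGES = [
    ("Combatant A", "Situation X"), ("Combatant A", "Situation Y"),
    ("Combatant A", "Civilian P"), ("Combatant A", "Bank X"),
    ("Civilian P", "Civilian Q"), ("Civilian P", "Civilian R"),
    ("Civilian P", "Civilian S"), ("Bank X", "Civilian S"),
    ("Civilian Q", "Target T"), ("Civilian R", "Target T"),
    ("Civilian S", "Target T"),
]

from_a = reachable(build(EDGES), "Combatant A")
to_t = reachable(build(EDGES, reverse=True), "Target T")
suspects = sorted((from_a & to_t) - {"Combatant A", "Target T"})
print(suspects)
```

Under this reading Situation X and Situation Y drop out because nothing leads from them to Target T.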
38. Conventional & Graph Analytics
[Diagram: Data Visualization & Analytics; Big Data Connection Platform + ORACLE or Other Big Data Solutions. Two vendor logos carry the footnotes “*Now HP” and “*Now IBM”.]
39. InfiniteGraph - The Enterprise Graph Database
• A high performance distributed database engine that supports analyst-time decision
support and actionable intelligence
• Cost effective link analysis – flexible deployment on commodity resources (hardware
and OS).
• Efficient, scalable, risk averse technology – enterprise proven.
• High Speed parallel ingest to load graph data quickly.
• Parallel, distributed queries
• Flexible plugin architecture
• Complementary technology
• Fast proof of concept – easy to use Graph API.
40. Basic Capabilities Of Most Graph Databases
• Rapid Graph Traversal
• Inclusive or Exclusive Selection
• Find the Shortest or All Paths Between Objects
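A rough sketch of two of these capabilities, shortest path and exclusive selection, using a plain breadth-first search in Python. This is not any graph database's API; the function name, the `exclude` parameter and the vertex names other than Start and Finish are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, start, finish, exclude=frozenset()):
    """Fewest-hops path from start to finish that avoids excluded vertices."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == finish:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = parents[vertex]
            return path[::-1]
        for nxt in adjacency.get(vertex, ()):
            if nxt not in parents and nxt not in exclude:
                parents[nxt] = vertex
                queue.append(nxt)
    return None  # no path survives the exclusions


graph = {
    "Start": ["A", "B"],
    "A": ["Finish"],
    "B": ["C"],
    "C": ["Finish"],
}
direct = shortest_path(graph, "Start", "Finish")
detour = shortest_path(graph, "Start", "Finish", exclude={"A"})
blocked = shortest_path(graph, "Start", "Finish", exclude={"A", "C"})
```

Excluding a vertex plays the role of the "X" marks in the traversal diagram: the search simply refuses to pass through it.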
42. Summary - Graph Analytics
• Can Be Used For:
– Social Network Analysis
– Pattern finding in open source data
– Logistics
– Campaign planning
– Energy usage, planning and protection
• The technology works best if the graph is extracted from existing
sources and stored in a Graph Database.
43. Thank You!
Please take a look at objectivity.com
For InfiniteGraph Online Demos, White Papers, Free
Downloads, Samples & Tutorials
info@objectivity.com