Choosing the Right Graph Database to Succeed in Your Project

Choosing the Right Graph
Database to Succeed in Your
Project
Marin Dimitrov (CTO)
Feb 2016

About Ontotext
• Provides products & solutions for content enrichment and metadata
management
− Founded in 2000, 70 employees
− HQ in Sofia (Bulgaria), sales presence in NYC and London
• Major verticals
− Media & publishing
− Healthcare & life sciences
− Cultural heritage & digital libraries
− Government
− Financial information providers
− Education
2Feb 2016Choosing the Right Graph Database to Succeed in Your Project

Some of Our Customers

Smart Data Management
4
Semantic Graph Database
• Flexible graph data
model
• Ontology data model &
metadata layer
Enrichment, Search, Discovery
• Metadata driven content
• Semantic, exploratory search
• Information discovery + recommendations
Text Mining & Interlinking
• Organisations, people, locations,
topics, relations
• Discover implicit relations
• Reuse open Knowledge Graphs
• Interlink with reference data
Feb 2016Choosing the Right Graph Database to Succeed in Your Project

Presentation Outline
• Use Cases for Graph Databases
• GraphDB by Ontotext
• Choosing a Database for Your Project
• Q & A

Graph Databases for Interconnected Data
• Integration of heterogeneous data sources
• Hierarchical or interconnected datasets
• Agile “schema-late” data integration
• Dynamic data models / schema evolution
• Relationship centric analytics / discovery
• Path traversal / navigation, sub-graph pattern matching
• Property graph DBs vs Semantic graph DBs (triplestores, RDF DBs)

Semantic Graph Databases – Advantages
• Simple, graph based data model
• Exploratory queries against unknown schema
• Agile schema / schema-less / schema-late
• Rich, semantic data models (schema)
• Easily map between data models (schemas)
• Global identifiers of nodes & relations
• Inference of implicit facts, based on rules
• Compliance to standards (RDF, SPARQL), no vendor lock-in
• Easy to publish / consume open Knowledge Graphs (Linked Data)

Semantic Graph Databases – Inferring New Facts

Typical Use Cases
• Network analysis (social, influencer, risk, fraud, …)
• Recommendation engines
• Heterogeneous data integration
• Master Data Management
• Metadata driven content / dynamic content publishing
• Knowledge Graphs / data sharing & reuse
• Information discovery / semantic search
#9Feb 2016Choosing the Right Graph Database to Succeed in Your Project

Use Cases – Knowledge Graphs

Use Cases – Content Management &
Recommendation

Use Cases – Metadata-Driven Content
Management & Recommendation

Ontotext and AstraZeneca
13
Profile
• Global, Bio-pharma company
• $28 billion in sales in 2012
• $4 billion in R&D across three continents
Goals
• Efficient design of new clinical studies
• Quick access to all of the data
• Improved evidence based decision-making
• Strengthen the knowledge feedback loop
• Enable predictive science
Challenges
• Over 7,000 studies and 23,000 documents are difficult
to obtain
• Searches returning 1,000 – 10,000 results
• Document repositories not designed for reuse
• Tedious process to arrive at evidence based decisions

Ontotext and Financial Times
14
• Goals
− Create a horizontal platform for
both data and content based on
semantics and serve all functionality
through it
• Challenges
− Critical part of FT.COM
− GraphDB used not only for data, but
for content storage as well
− Personalized recommendation
based on user behavior and
semantic context (Related Reads)

Ontotext and EuroMoney
15
• Goals
− Create a horizontal platform to
serve 100 different publications
− Platform which would include
the latest authoring, storing, and
display technologies including,
semantic annotation, search and
a triple store repository
• Challenges
− Multiple domains covered
− Sophisticated content analytics
including relation, template and
scenario extraction

LinkedLifeData – Knowledge Graph

Graph Database Landscape
“Despite all of this attention the market is
dominated by Neo4J and Ontotext
(GraphDB), which are graph and RDF
database providers respectively. These are
the longest established vendors in this
space (both founded in 2000) so they have a
longevity and experience that other
suppliers cannot yet match. How long this
will remain the case remains to be seen.”
Bloor Group report
Graph Databases, April 2015
http://www.bloorresearch.com/technology/graph-databases/

“Linking a few data sources is often simple,
but to do so with significant amounts of
heterogeneous data requires a radically new
approach. Graph databases are a powerful
optimized technology that link billions of
pieces of connected data to help create new
sources of value for customers and increase
operational agility for customer service. […]
they are well-suited for scenarios in which
relationships are important.”
Forrester report
Market Overview: Graph Databases, May 2015
https://www.forrester.com/Market+Overview+Graph+Databases/fulltext/-/E-RES121473

“What’s different in a graph store from a database
perspective is the sheer volume of connections, or
relationships—how people, places, and things relate
to one another through those interactions. If your
data is rich, you’ll see lots of relationships between
the entities in native graph form. Older database
technologies place less emphasis on relationships,
resulting in less context. Graphs offer the chance for
richer context through more connections and any-
to-any data models rather than the usual tabular or
hierarchical models”
PwC report
The promise of graph databases in public health, June 2015
http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape.html

• Q & A

GraphDB by Ontotext
• High performance semantic graph database, 10s of billions of
triples
• Full compliance to W3C standards
• Various inference profiles, including custom rules
• Extensions
−Geo-spatial, RDF Rank, full-text search, Blueprints/Gremlin, 3rd party plugins
• Tooling for DBAs

Advanced Features
• Connectors to Solr, Elasticsearch, MongoDB*
• Consistency checks
• RDF Rank for graph analytics
• Geo-spatial querying
• Notifications, plugin architecture for 3rd parties
• “Explain plan”
• High-availability cluster

GraphDB Connectors
Selective
replication
Query Processor
Graph indexesInternal indexes
SPARQL SELECT with or without an
embedded Solr / Elasticsearch
query
Solr / Elasticsearch
direct queries
Solr / Elasticsearch GraphDB engine
SPARQL INSERT/DELETE

High-Availability (Replication) Cluster
• Improved resilience & query
performance
• Worker nodes can be added/removed
dynamically
• “Graceful degradation” of cluster
performance when one or more
worker nodes fail
• Flexible topologies, multi-DC
deployment

GraphDB Editions
• Free (+ AWS Marketplace)
• Standard (+ AWS Marketplace)
• Enterprise
• Database-as-a-Service
25Feb 2016Feb 2016Choosing the Right Graph Database to Succeed in Your Project

Ontotext GraphDB
+ Java based, deploy anywhere
+ Maven artefacts
+ Docker images

GraphDB on the AWS Marketplace
• “1-Click” purchasing
• Variety of hardware configurations
• Manage big RDF graph data
• Pay-per-hour pricing, 5-day trial
27Nov 2015Feb 2016Choosing the Right Graph Database to Succeed in Your Project

Fully Managed Database-as-a-Service
• Low-cost DBaaS for Ontotext GraphDB
• Ideal for small to moderate data & query volumes
−database options: 10M (free), 50M, 250M & 1B triples
• Instantly deploy new databases when needed
−Easily scale up / down as data volume changes
• Zero administration
−automated operations, maintenance & upgrades
• Faster experimentation & prototyping, reduced TCO

Fully Managed Database-as-a-Service

Ontotext GraphDB – Key Advantages
1. High availability cluster
2. Performance & scalability
3. Advanced features & extensions
4. Variety of deployment options
5. Developed by an established vendor
6. Full lifecycle support – data modelling, integration, deployment
7. Proven in high-profile business critical use cases

• Q & A

From Experimentation to Production
• Priorities: cost, ease of deployment, performance, availability
• GraphDB options: Free, Standard, Enterprise (HA)
• Deployment: on premise, AWS cloud, database-as-a-service
• Seamless upgrade paths
−all options based on the same engine
Learning Prototype Pilot Production

Learning
• Priorities
−Free
−Easy & quick to set up, “sandbox” environment
• Recommended
−Database-as-a-Service (free 10M triples)
−GraphDB Free (on premise / on AWS)

Prototype
• Priorities
−Free / low-cost
−Easy & quick to set up, “sandbox” environment
• Recommended
−GraphDB Free (on premise / on AWS)
−Database-as-a-Service (10M – 50M triples)

Pilot
• Priorities
− Low-cost
− Performance & scalability
• Recommended
− GraphDB Standard (on premise / on AWS)
• Also consider
− Database-as-a-Service (250M – 1B triples)
− GraphDB Free (on premise / on AWS)

Production
• Priorities
− Performance & scalability
− High availability
• Recommended
− GraphDB Enterprise
• Also consider
− GraphDB Standard (on premise / on AWS)

Key Takeaways
• Graph databases are well suited for interconnected data,
heterogeneous data integration, relationship-centric analytics &
discovery, schema evolution
• Use cases include network analysis, MDM, knowledge graphs,
metadata management, recommendations, …
• Ontotext GraphDB is an enterprise-grade semantic graph
database, proven in mission-critical scenarios
• Various GraphDB deployment options, optimal for learning,
prototyping & experimentation, production

Links
• Ontotext GraphDB
−http://ontotext.com/products/graphdb/
−http://graphdb.ontotext.com/
−@OntotextGraphDB
• Customers & Verticals
−http://ontotext.com/company/customers/
−http://ontotext.com/knowledge-hub/case-studies/

Choosing the Right Graph Database to Succeed in Your Project
Thank You!

Choosing the Right Graph Database to Succeed in Your Project

More Related Content

What's hot

Similar to Choosing the Right Graph Database to Succeed in Your Project

More from Graphwise (Previously Ontotext)

Recently uploaded

In this document

Choosing the Right Graph Database to Succeed in Your Project