5 MINUTE OVERVIEW
STARDOG
ENTERPRISE KNOWLEDGE GRAPH
stardog.com
UNCONNECTED DATA IS A LIABILITY
ENTERPRISES NEED FLEXIBLE, REUSABLE DATA ON DEMAND, WITH LESS DISRUPTION AND OVERHEAD
KNOWLEDGE GRAPH IS THE ANSWER
FLEXIBLE
REUSABLE
ACCRETIVE
KNOWLEDGE GRAPH = KNOWLEDGE TOOLKIT + GRAPH DB
WHAT'S A KNOWLEDGE TOOLKIT?
VIRTUAL GRAPHS BUILD KNOWLEDGE ACROSS SILOS
BUSINESS LOGIC BUILDS REUSABLE, LOGICAL REASONING INTO THE GRAPH
MACHINE LEARNING INTEGRATES STATISTICAL REASONING
INTEGRITY CONSTRAINT VALIDATION EMPOWERS DATA STANDARDS
KNOWLEDGE = DATA PLUS REASONING
FACT COUNT: 4 EXPLICIT FACTS
[Diagram: graph with nodes Inferno, Rogue One, Tom Hanks, Felicity Jones, and Gareth Edwards, connected by three "actor" edges and one "director" edge]
KNOWLEDGE = DATA PLUS REASONING
actorOf inverseOf actor
directorOf inverseOf director
actorOf subPropertyOf workedOn
directorOf subPropertyOf workedOn
coworker propertyChain (workedOn [inverseOf workedOn])
coworker subPropertyOf connectedTo
connectedTo a TransitiveProperty
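As a rough illustration of how rules like these turn explicit facts into implicit ones, the sketch below applies them repeatedly until nothing new appears. It is plain Python, not Stardog's reasoner. The assignment of people to films is an assumption taken from the films' public credits, since the slide's diagram is not recoverable here, and the total it derives need not match the slide's count, which depends on whether reflexive and symmetric pairs are counted.

```python
# Rule tables transcribed from the slide; the Python encoding is illustrative only.
INVERSE_OF = {"actor": "actorOf", "actorOf": "actor",
              "director": "directorOf", "directorOf": "director"}
SUB_PROPERTY_OF = {"actorOf": "workedOn", "directorOf": "workedOn",
                   "coworker": "connectedTo"}

# Explicit facts as (subject, predicate, object). Edge assignment is assumed.
EXPLICIT = {
    ("Inferno", "actor", "Tom Hanks"),
    ("Inferno", "actor", "Felicity Jones"),
    ("Rogue One", "actor", "Felicity Jones"),
    ("Rogue One", "director", "Gareth Edwards"),
}


def infer(facts):
    """Apply the rules until a fixed point is reached; return all facts."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p in INVERSE_OF:          # inverseOf
                new.add((o, INVERSE_OF[p], s))
            if p in SUB_PROPERTY_OF:     # subPropertyOf
                new.add((s, SUB_PROPERTY_OF[p], o))
        # propertyChain: workedOn followed by the inverse of workedOn
        worked_on = [(s, o) for s, p, o in facts if p == "workedOn"]
        for person_a, work_a in worked_on:
            for person_b, work_b in worked_on:
                if work_a == work_b:
                    new.add((person_a, "coworker", person_b))
        # TransitiveProperty: connectedTo
        connected = [(s, o) for s, p, o in facts if p == "connectedTo"]
        for a, b in connected:
            for c, d in connected:
                if b == c:
                    new.add((a, "connectedTo", d))
        if new <= facts:
            return facts
        facts |= new


inferred = infer(EXPLICIT)
# Tom Hanks and Gareth Edwards never worked on the same film,
# but they are connected through Felicity Jones.
print(("Tom Hanks", "coworker", "Gareth Edwards") in inferred)
print(("Tom Hanks", "connectedTo", "Gareth Edwards") in inferred)
```

The first check prints False and the second True: the coworker link needs a shared film, while connectedTo follows chains of coworkers.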
[Diagram: the same graph with inferred edges added: actorOf, directorOf, workedOn, coworker, and connectedTo]
FACT COUNT: 15 EXPLICIT/IMPLICIT FACTS
BUSINESS LOGIC THAT BETTER EXPLAINS THE DOMAIN
KNOWLEDGE GRAPHS CONNECT ALL DATA
CONNECTING ALL DATA CHANGES EVERYTHING
THANK YOU
A.J. COOK, NORTH AMERICAN SALES
AJ@STARDOG.COM
Data Modeling & Metadata
for Graph Databases
Donna Burbank
Global Data Strategy Ltd.
Lessons in Data Modeling DATAVERSITY Series
July 27th, 2017
Global Data Strategy, Ltd. 2017
Donna Burbank
Donna is a recognised industry expert in
information management with over 20
years of experience in data strategy,
information management, data modeling,
metadata management, and enterprise
architecture. Her background is multi-faceted
across consulting, product
development, product management, brand
strategy, marketing, and business
leadership.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting
company that specializes in the alignment
of business drivers with data-centric
technology. In past roles, she has served in
key brand strategy and product
management roles at CA Technologies and
Embarcadero Technologies for several of
the leading data management products in
the market.
As an active contributor to the data
management community, she is a longtime
DAMA International member, Past
President and Advisor to the DAMA Rocky
Mountain chapter, and was recently
awarded the Excellence in Data
Management Award from DAMA
International in 2016. She was on the
review committee for the Object
Management Group’s (OMG) Information
Management Metamodel (IMM) and the
Business Process Modeling Notation
(BPMN). Donna is also an analyst at the
Boulder BI Brain Trust (BBBT) where she
provides advice and gains insight on the
latest BI and Analytics software in the
market.
She has worked with dozens of Fortune
500 companies worldwide in the Americas,
Europe, Asia, and Africa and speaks
regularly at industry conferences. She has
co-authored two books: Data Modeling for
the Business and Data Modeling Made
Simple with ERwin Data Modeler and is a
regular contributor to industry
publications. She can be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
Follow on Twitter @donnaburbank
Today’s hashtag: #LessonsDM
Lessons in Data Modeling Series
• January 26th How Data Modeling Fits Into an Overall Enterprise Architecture
• February 23rd Data Modeling and Business Intelligence
• March Conceptual Data Modeling – How to Get the Attention of Business Users
• April The Evolving Role of the Data Architect – What does it mean for your Career?
• May Data Modeling & Metadata Management
• June Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling
• July Data Modeling & Metadata for Graph Databases
• August Data Modeling & Data Integration
• September Data Modeling & MDM
• October Agile & Data Modeling – How Can They Work Together?
• December Data Modeling, Data Quality & Data Governance
This Year’s Line Up
Word from our Sponsor
Stardog Enterprise Knowledge Graph
www.stardog.com
Agenda
• What is a Graph Database
• Use Cases for Graph Databases
• Data Modeling & Metadata for Graph Databases
What we’ll cover today
What is a Graph Database?
• A graph database uses a set of nodes, edges, and
properties to represent and store data.
• With graph databases, the relationships between data
points often matter more than the individual points
themselves. To leverage those data relationships,
your organization needs a database technology that stores
those relationships as first-class constructs.
• These relationships can help you discover new insights
from your data.
Graph Database = Thing Relates to Thing
Node
Vertex
Edge
Relationship
The more formal way of referring to “thing relates to thing” is
“Nodes & Edges”, “Vertices & Relationships”, etc.
Graph Databases Mirror the Way We Think
Squirrel!
I should go
visit Mary
I wonder how her
brother John is doing?
Is he still dating
Stephanie?
…In the mind, as in data,
there are always random
data points…
Do they still have that
house at the Lake?
Riding their boats on the lake was great.
Remember when John crashed the boat?
Like my toy
as a child.
Graph databases can be intuitive to many, since they mirror the way the human brain
typically thinks – through Association.
“Traditional” way of Looking at the World: Hierarchies
• Carolus Linnaeus in 1735 established a hierarchy/taxonomy for organizing and identifying
biological systems.
Kingdom
Phylum
Class
Order
Family
Genus
Species
“New” Way of Looking at the World - Emergence
In philosophy, systems theory, science, and art, emergence is
the way complex systems and patterns arise out of a
multiplicity of relatively simple interactions.
- Wikipedia
Graph Databases Combine Flexibility w/ Structure & Meaning
• In many ways, graph databases provide the “best of both worlds”.
Flexibility of the “New World” of Discovery & “Emergence”
+
Structure & Meaning of the “Old World” through Ontologies
It’s All About Relationships
• In graph databases, relationships are first class constructs.
• Rather ironically, relational databases lack relationships.
• In relational databases, relationships are enforced through joins and constraints.
• NoSQL (e.g. Key Value) databases are also weak at supporting relationships.
“A relational database isn’t about relationships, it’s about constraints.”
– Karen Lopez
Customer Account
Is Owner Of
<Customer> <Owner Of> <Account>
Use Cases for Graph Databases
Social Networks
Donna
Sad, Lonely Person who
doesn’t like data
Who are the cool kids?
i.e. People linked with Donna
X Degrees of Separation – “The Bacon Number”
• What’s Audrey Hepburn’s “Bacon Number”? i.e. degrees of separation/relation to actor Kevin Bacon
• As always, metadata and data quality are important, i.e. which Audrey Hepburn?
Courtesy of oracleofbacon.org
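Degrees of separation amount to a shortest-path question over a co-star graph. The sketch below shows one common way to answer it, a breadth-first search in plain Python. The co-star data and the placeholder names "Actor A" through "Actor D" are hypothetical and stand in for real filmography data.

```python
from collections import deque


def degrees_of_separation(costars, start, target):
    """Return the smallest number of co-star links from start to target,
    or None if the two are not connected."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for other in costars.get(person, ()):
            if other == target:
                return distance + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, distance + 1))
    return None


# Hypothetical co-star graph: each person maps to the people they appeared with.
costars = {
    "Kevin Bacon": {"Actor A"},
    "Actor A": {"Kevin Bacon", "Actor B"},
    "Actor B": {"Actor A", "Actor C"},
    "Actor C": {"Actor B"},
    "Actor D": set(),
}

print(degrees_of_separation(costars, "Actor C", "Kevin Bacon"))
print(degrees_of_separation(costars, "Actor D", "Kevin Bacon"))
```

The data quality point applies directly: if two different people share one node because their names match, the search returns a path that does not exist in reality.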
Fraud Detection in Online Transactions
• Online transactions typically have certain identifiers, e.g. User ID, IP address, geo location, tracking cookie, credit card number, etc.
• Graph patterns can help detect fraud, e.g.
• The more interconnections exist among identifiers, the greater the cause for concern.
• Typically they would be 1:1.
• Some variations may occur, e.g. Multiple credit cards with one person. Families using same machine, etc.
• Large and tightly-knit graphs are very strong indicators that fraud is taking place.
• Triggers can be put into place so that these patterns are uncovered before they cause damage.
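One way to picture the pattern described above is to link identifiers that appear in the same transaction and then look at how large each connected cluster grows. The following is a minimal sketch in plain Python. The identifiers, the sample links, and the size threshold are all hypothetical, and a real system would tune the threshold and weigh many more signals.

```python
from collections import defaultdict


def connected_components(links):
    """Group identifiers into clusters that are linked directly or indirectly."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def suspicious_clusters(links, max_size=3):
    """Return clusters with more identifiers than max_size (an assumed threshold)."""
    return [c for c in connected_components(links) if len(c) > max_size]


# Hypothetical (IP address, credit card) pairs observed in transactions.
links = [
    ("ip:1", "cc:1"),                    # the typical 1:1 case
    ("ip:2", "cc:2"), ("ip:2", "cc:3"),  # e.g. a family sharing one machine
    ("ip:3", "cc:4"), ("ip:3", "cc:5"), ("ip:3", "cc:6"),
    ("ip:3", "cc:7"), ("ip:4", "cc:7"), ("ip:4", "cc:8"),
]

flagged = suspicious_clusters(links)
print(flagged)
```

Only the large cluster around ip:3 and ip:4 is flagged; the 1:1 and family-sized clusters fall under the threshold.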
[Diagram: IP addresses (IP) linked to credit cards (CC) in three patterns, labeled “Personal & Business Card”, “Family”, and “Fraud?”]
Recommendation Engines
• Recommendation Engines are familiar to most of us who do any online shopping.
• These engines can be powered by a graph database, e.g.
• Capture a customer’s browsing behavior and demographics
• Combine those with their buying history to provide relevant recommendations
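A very small version of this idea can be sketched with buying history alone: recommend items bought by customers whose purchases overlap with yours. The Python below is an illustrative assumption, not how any particular engine works. It ignores browsing behavior and demographics, and the customers and products are made up.

```python
from collections import Counter


def recommend(purchases, customer, top_n=3):
    """Suggest items the customer lacks, weighted by how many purchases
    they share with each other customer who bought the item."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue  # no shared purchases, so no evidence of similar taste
        for item in items - owned:
            scores[item] += overlap
    # Sort by score, then by name so ties come out in a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical buying history: customer -> set of purchased items.
purchases = {
    "alice": {"tent", "stove"},
    "bob": {"tent", "stove", "lantern"},
    "carol": {"tent", "sleeping bag"},
    "dave": {"novel"},
}

print(recommend(purchases, "alice"))
```

In graph terms this walks customer → item → other customer → other item, which is the kind of traversal a graph database is built to run.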
Data Quality & Volume Matter
• Recommendation engines are based on evaluating data sets. If those data sets are faulty or of
poor quality, your results will be flawed.
• This is especially true if the data sets are small.
Master Data Management (MDM)
• Master Data Management (MDM) is the practice of identifying, cleansing, storing & governing the
core data assets of the organization (e.g. customer, product, etc.).
• There are many architectural approaches to MDM. Two are the following:
Centralized – Commonly Relational (diagram: MDM hub)
• Core data stored in a common schema in a centralized “hub”.
• Used as a common reference for operational systems, DW, etc.
Virtualized/Registry – Commonly Graph (diagram: Virtualization Layer)
• Data remains in source systems.
• Referenced through a common virtualization layer.
BOTH require the same core foundation of data quality, parsing & matching, semantic meaning,
data governance, etc. in order to be successful… and that’s usually the hardest stuff.
Graph Databases for Data Warehousing
When you have a hammer, everything looks like a nail,
i.e. Data Warehouses serve a particular purpose for aggregating & summarizing data. Not ideal for graph databases.
Data Warehousing & Enterprise Knowledge Graph
Data Warehouse (Relational & Dimensional data model)
…Show me Total Sales by Region and by Customer each month in 2017
Enterprise Knowledge Graph (Graph data model)
…Who are my most influential customers (with the most connections)?
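If "most influential" is taken to mean "most connections", as the slide suggests, the question reduces to counting edges per node (degree). A minimal Python sketch follows. The customer names and links are hypothetical, and real influence measures are often more involved than a raw count.

```python
from collections import Counter


def most_connected(edges, top_n=1):
    """Rank nodes by number of connections (degree), highest first."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by degree, then by name so ties come out in a stable order.
    return sorted(degree, key=lambda node: (-degree[node], node))[:top_n]


# Hypothetical links between customers, e.g. referrals or shared accounts.
edges = [("Ann", "Ben"), ("Ann", "Cho"), ("Ann", "Dee"), ("Ben", "Cho")]

print(most_connected(edges))
print(most_connected(edges, top_n=2))
```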
Data Management & Ballroom Dancing
“First you dance with yourself, then with your partner, then you dance with the room.”
An Enterprise Knowledge Graph Provides a Holistic View of
the Organization through Relationships
“First you dance with yourself, then with your partner, then you dance with the room.”
Customer Data
Data Quality & Semantics are important
for core enterprise data assets.
Name: Audrey Hepburn
DOB: May 4, 1929
Current Customer: No
But the true value is in the
interrelationships between data assets.
Mother of
Name: Luca Dotti
DOB: February 8, 1970
Current Customer: Yes
Purchased Yacht Insurance
Purchased Home Insurance
Filed a Claim
Data Modeling & Metadata for Graph Databases
Data Modeling for Graph Databases
• There are several dominant ways to model graph databases. Two popular ones include:
• Resource Description Framework (RDF) Triples
• Labeled Property Graph
Labeled Property Graph
• Made up of nodes, relationships, properties & labels
• Sample Query language: Cypher
• Sample Vendor: Neo4j
Resource Description Framework (RDF) Triples
• Made up of subject, predicate, object triples
• Sample Query language: SPARQL
• Sample Vendor: Stardog
• Both have a close affinity between logical & physical models
• i.e. We already think in “thing relates to thing”
• In the following slides, we’ll use the RDF example, since that is a W3C Open Standard.
Graph Query Languages
• Unlike relational databases, where SQL is a general standard, there are a number
of query language options available for graph databases:
• SPARQL: a SQL-like declarative query language created by the W3C to query RDF
(Resource Description Framework) graphs.
• Cypher: also a declarative query language that resembles SQL. Created by Neo4j.
• GraphQL: a query language for APIs. It isn’t specific to graph databases, but can be used for
them. Developed by Facebook.
• Gremlin: a graph traversal language developed for Apache TinkerPop™, an open source,
vendor-agnostic graph computing framework distributed under the Apache 2.0 license.
Again, we’ll use SPARQL in our examples since it’s a W3C standard.
Resource Description Framework (RDF)
• The RDF (Resource Description Framework) model from the World Wide Web Consortium (W3C)
provides a way to link resources on the web (people, places, things) using the concept of “triples”.
• This linking structure forms a directed, labeled graph, where the edges represent the named link
between two resources, represented by the graph nodes.
[Diagram: RDF Triples, showing a Subject node linked to an Object node by a Predicate edge]
RDF Triple Example
Cynthia Fido
Is Owner Of
<Cynthia> <Owner Of> <Fido>
Reference
• Brackets indicate individual references in RDF. Note that these are
defined by URIs in RDF, but have been simplified for this example.
Subject Predicate Object
RDF Triples
<Cynthia> <type> <Person> .
<Fido> <type> <Dog> .
<Cynthia> <hasName> “Cynthia Smith” .
<Fido> <hasName> “Fido” .
<Cynthia> <ownerOf> <Fido> .
Class
Literal
Instance
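To make the triple structure concrete, the five statements above can be held as plain (subject, predicate, object) tuples and filtered by position. This Python sketch is only an illustration. It uses the slide's simplified names instead of URIs, and it does not distinguish literals from resources the way a real RDF store would.

```python
# The slide's triples as (subject, predicate, object) tuples.
triples = [
    ("Cynthia", "type", "Person"),
    ("Fido", "type", "Dog"),
    ("Cynthia", "hasName", "Cynthia Smith"),
    ("Fido", "hasName", "Fido"),
    ("Cynthia", "ownerOf", "Fido"),
]


def objects(subject, predicate):
    """Everything the subject points to through the given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj):
    """Everything that points to obj through the given predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]


print(objects("Cynthia", "ownerOf"))  # what does Cynthia own?
print(subjects("type", "Dog"))        # which instances are Dogs?
```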
RDF Triple Graphical Representation
• RDF triples can be intuitively visualized graphically
[Diagram: the five triples drawn as a graph, with <Cynthia>, <Fido>, <Person>, and <Dog> as nodes, “Cynthia Smith” and “Fido” as literals, and <type>, <hasName>, and <ownerOf> as labeled edges]
Logical Groupings
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix example: <http://example.org/example#> .
example:Cynthia rdf:type example:Person ;
    example:hasName "Cynthia Smith" ;
    example:ownerOf example:Fido .
example:Fido rdf:type example:Dog ;
    example:hasName "Fido" .
• A Person has a name
• A Person can be an owner
• A Dog has a name
Ontologies
• An ontology is a data model of sorts to describe the “things” in RDF data.
• Two types of languages include:
• OWL (W3C Web Ontology Language): a Semantic Web language designed to represent rich and complex
knowledge about things, groups of things, and relations between things.
• RDFS (RDF Schema): a general-purpose language for representing simple RDF vocabularies. It is
considered a precursor to OWL.
• For example:
RDFS
• People have Names
• People can own kinds of things
• Pets can be owned
• A dog is a pet
• Dogs can have names
OWL can be more Expressive
• A Mother is the intersection of (Parent, Woman)
• This Family ontology links with the Person ontology (meta-meta-metadata)
• Etc.
Ontologies help Define Queries
People have Names
People can own kinds of things
Pets can be owned
A dog is a pet
Dogs can have names
Ontology
Show me all of the People who Own Dogs
Query
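One way an ontology statement such as "A dog is a pet" helps a query: it lets a question about Pets find instances that were only ever typed as Dogs. The sketch below imitates that subclass reasoning in plain Python. The SUBCLASS_OF table and helper names are illustrative assumptions, not an RDFS or OWL implementation.

```python
# Ontology fragment: "A dog is a pet" (subclass -> superclass).
SUBCLASS_OF = {"Dog": "Pet"}

# Instance data, using the slides' simplified names instead of URIs.
triples = {
    ("Cynthia", "type", "Person"),
    ("Fido", "type", "Dog"),
    ("Cynthia", "ownerOf", "Fido"),
}


def types_of(resource):
    """Declared types plus every superclass reachable through SUBCLASS_OF."""
    result = set()
    for s, p, o in triples:
        if s == resource and p == "type":
            cls = o
            while cls is not None and cls not in result:
                result.add(cls)
                cls = SUBCLASS_OF.get(cls)
    return result


def owners_of(cls):
    """People who own at least one instance of cls, directly or by subclass."""
    return sorted({s for s, p, o in triples
                   if p == "ownerOf" and cls in types_of(o)})


print(owners_of("Dog"))
print(owners_of("Pet"))  # found only because Dog is declared a kind of Pet
```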
Putting Ontologies & Queries Together
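PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX example: <http://example.org/example#>
SELECT ?name
WHERE {
  ?person rdf:type example:Person ;
          example:hasName ?name ;
          example:ownerOf ?pet .
  ?pet rdf:type example:Dog .
}
-> RESULT "Cynthia Smith"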
Define Variables
Write out the Graph using Variables
Query across the Graph
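The three steps can be imitated in a few lines of Python to show what "querying across the graph" means mechanically: each triple pattern is tried against each stored triple, and variables (names starting with "?") must keep the same value across patterns. This is a toy matcher written for illustration, not how a SPARQL engine is implemented, and it uses the slides' simplified names instead of URIs.

```python
triples = [
    ("Cynthia", "type", "Person"),
    ("Fido", "type", "Dog"),
    ("Cynthia", "hasName", "Cynthia Smith"),
    ("Fido", "hasName", "Fido"),
    ("Cynthia", "ownerOf", "Fido"),
]


def match(patterns, data, binding=None):
    """Yield variable bindings that satisfy every triple pattern."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in data:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier value.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, data, candidate)


# The WHERE clause of the query, written out as triple patterns.
where = [
    ("?person", "type", "Person"),
    ("?person", "hasName", "?name"),
    ("?person", "ownerOf", "?pet"),
    ("?pet", "type", "Dog"),
]

names = [b["?name"] for b in match(where, triples)]
print(names)
```

Selecting only `?name` from each binding corresponds to the SELECT clause; the other variables exist just to tie the patterns together.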
Summary
• Graph Databases provide powerful enterprise-wide association using simple constructs
• “Thing Relates to Thing”
• Relationships are first class constructs
• The best-suited enterprise use cases are those that focus on interrelationships between data points
• Social Networks
• Fraud Detection
• Recommendation Engines
• Enterprise Knowledge Graph
• Data Modeling & Metadata are supported by simple constructs
• Data structures through Triples: Subject, Predicate, Object
• Semantics through Ontologies (e.g. OWL)
• Queries through SPARQL and other methods
About Global Data Strategy, Ltd
• Global Data Strategy is an international information management consulting company that specializes
in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technology solution.
• Clear & Relevant: We provide clear explanations using real-world examples.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves on excellence of execution, with years of
technical expertise in the industry.
Data-Driven Business Transformation
Business Strategy Aligned With Data Strategy
Visit www.globaldatastrategy.com for more information
Contact Info
• Email: donna.burbank@globaldatastrategy.com
• Twitter: @donnaburbank
@GlobalDataStrat
• Website: www.globaldatastrategy.com
Questions?
Thoughts? Ideas?

 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Data Modeling & Metadata for Graph Databases

  • 1. 5 MINUTE OVERVIEW STARDOG ENTERPRISE KNOWLEDGE GRAPH stardog.com
  • 2. UNCONNECTED DATA IS A LIABILITY
  • 3. ENTERPRISES NEED FLEXIBLE, REUSABLE DATA ON DEMAND, WITH LESS DISRUPTION AND OVERHEAD
  • 4. KNOWLEDGE GRAPH IS THE ANSWER: FLEXIBLE, REUSABLE, ACCRETIVE
  • 5. KNOWLEDGE GRAPH = KNOWLEDGE TOOLKIT + GRAPH DB
  • 6. WHAT'S A KNOWLEDGE TOOLKIT? • VIRTUAL GRAPHS BUILD KNOWLEDGE ACROSS SILOS • BUSINESS LOGIC BUILDS REUSABLE, LOGICAL REASONING INTO THE GRAPH • MACHINE LEARNING INTEGRATES STATISTICAL REASONING • INTEGRITY CONSTRAINT VALIDATION EMPOWERS DATA STANDARDS
  • 7. KNOWLEDGE = DATA PLUS REASONING. FACT COUNT: 4 EXPLICIT FACTS. Diagram nodes: Inferno, Gareth Edwards, Rogue One, Felicity Jones, Tom Hanks. Diagram edge labels: actor, director, actor, actor.
  • 8. KNOWLEDGE = DATA PLUS REASONING. Rules: actorOf inverseOf actor; directorOf inverseOf director; actorOf subPropertyOf workedOn; directorOf subPropertyOf workedOn; coworker propertyChain (workedOn [inverseOf workedOn]); coworker subPropertyOf connectedTo; connectedTo a TransitiveProperty. Diagram nodes: Inferno, Gareth Edwards, Rogue One, Felicity Jones, Tom Hanks. Diagram edge labels: actor, director, actor, actor (explicit); actorOf, actorOf, directorOf, workedOn, coworker, connectedTo (inferred). FACT COUNT: 15 EXPLICIT/IMPLICIT FACTS. BUSINESS LOGIC THAT BETTER EXPLAINS THE DOMAIN.
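The rules on slide 8 can be sketched as a small forward-chaining loop. This is plain Python for illustration, not how Stardog's reasoner works. The direction of the `actor` and `director` edges (film to person) is an assumption read off the diagram, and the sketch skips the reflexive `coworker` pairs that a strict property chain would also produce.

```python
# Explicit facts from slide 7 (edge direction film -> person is assumed).
EXPLICIT = {
    ("Inferno", "actor", "Tom Hanks"),
    ("Inferno", "actor", "Felicity Jones"),
    ("Rogue One", "actor", "Felicity Jones"),
    ("Rogue One", "director", "Gareth Edwards"),
}

# Rules from slide 8, written as lookup tables.
INVERSE_OF = {"actor": "actorOf", "director": "directorOf"}
SUB_PROPERTY_OF = {
    "actorOf": "workedOn",
    "directorOf": "workedOn",
    "coworker": "connectedTo",
}
TRANSITIVE = {"connectedTo"}


def infer(facts):
    """Apply the rules repeatedly until no new triple appears."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p in INVERSE_OF:
                new.add((o, INVERSE_OF[p], s))
            if p in SUB_PROPERTY_OF:
                new.add((s, SUB_PROPERTY_OF[p], o))
        # coworker = workedOn followed by the inverse of workedOn
        worked = [(s, o) for s, p, o in facts if p == "workedOn"]
        for person_a, film_a in worked:
            for person_b, film_b in worked:
                if film_a == film_b and person_a != person_b:
                    new.add((person_a, "coworker", person_b))
        # transitive properties: a -> b and b -> c imply a -> c
        for prop in TRANSITIVE:
            pairs = [(s, o) for s, p, o in facts if p == prop]
            for a, b in pairs:
                for c, d in pairs:
                    if b == c and a != d:
                        new.add((a, prop, d))
        if new <= facts:
            return facts
        facts |= new


all_facts = infer(EXPLICIT)
```

After the loop settles, `all_facts` contains `("Tom Hanks", "connectedTo", "Gareth Edwards")` even though the two never share a film: the link is derived through Felicity Jones.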
  • 9. KNOWLEDGE GRAPHS CONNECT ALL DATA. CONNECTING ALL DATA CHANGES EVERYTHING.
  • 10. THANK YOU. A.J. COOK, NORTH AMERICAN SALES, AJ@STARDOG.COM
  • 11. Data Modeling & Metadata for Graph Databases Donna Burbank Global Data Strategy Ltd. Lessons in Data Modeling DATAVERSITY Series July 27th, 2017
  • 12. Global Data Strategy, Ltd. 2017 Donna Burbank: Donna is a recognised industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. Her background is multifaceted across consulting, product development, product management, brand strategy, marketing, and business leadership. She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. In past roles, she has served in key brand strategy and product management roles at CA Technologies and Embarcadero Technologies for several of the leading data management products in the market. As an active contributor to the data management community, she is a long-time DAMA International member, Past President and Advisor to the DAMA Rocky Mountain chapter, and was awarded the Excellence in Data Management Award from DAMA International in 2016. She was on the review committee for the Object Management Group’s (OMG) Information Management Metamodel (IMM) and the Business Process Modeling Notation (BPMN). Donna is also an analyst at the Boulder BI Brain Trust (BBBT), where she provides advice and gains insight on the latest BI and analytics software in the market. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications. She can be reached at donna.burbank@globaldatastrategy.com. Donna is based in Boulder, Colorado, USA. Follow on Twitter @donnaburbank. Today’s hashtag: #LessonsDM
  • 13. Lessons in Data Modeling Series: This Year’s Line Up • January 26th: How Data Modeling Fits Into an Overall Enterprise Architecture • February 23rd: Data Modeling and Business Intelligence • March: Conceptual Data Modeling – How to Get the Attention of Business Users • April: The Evolving Role of the Data Architect – What does it mean for your Career? • May: Data Modeling & Metadata Management • June: Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling • July: Data Modeling & Metadata for Graph Databases • August: Data Modeling & Data Integration • September: Data Modeling & MDM • October: Agile & Data Modeling – How Can They Work Together? • December: Data Modeling, Data Quality & Data Governance
  • 14. Word from our Sponsor: Stardog Enterprise Knowledge Graph, www.stardog.com
  • 15. Agenda: What we’ll cover today • What is a Graph Database? • Use Cases for Graph Databases • Data Modeling & Metadata for Graph Databases
  • 16. What is a Graph Database? • A graph database uses a set of nodes, edges, and properties to represent and store data. • With graph databases, the relationships between data points often matter more than the individual points themselves. To leverage those relationships, your organization needs a database technology that stores them directly. • These relationships can help you discover new insights from your data.
  • 17. Graph Database = Thing Relates to Thing
  • 18. Graph Database = Thing Relates to Thing. The more formal way of referring to “thing relates to thing” is “Nodes & Edges”, “Vertices & Relationships”, etc. Diagram labels: Node / Vertex (a thing), Edge / Relationship (how two things relate).
  • 19. Graph Databases Mirror the Way We Think. Graph databases can be intuitive to many, since they mirror the way the human brain typically thinks: through association. Example chain of thought: I should go visit Mary. I wonder how her brother John is doing? Is he still dating Stephanie? Do they still have that house at the lake? Riding their boats on the lake was great. Remember when John crashed the boat? Like my toy as a child. Squirrel! …In the mind, as in data, there are always random data points…
  • 20. “Traditional” Way of Looking at the World: Hierarchies • Carolus Linnaeus in 1735 established a hierarchy/taxonomy for organizing and identifying biological systems: Kingdom, Phylum, Class, Order, Family, Genus, Species.
  • 21. “New” Way of Looking at the World: Emergence. “In philosophy, systems theory, science, and art, emergence is the way complex systems and patterns arise out of a multiplicity of relatively simple interactions.” (Wikipedia)
  • 22. Graph Databases Combine Flexibility w/ Structure & Meaning • In many ways, graph databases provide the “best of both worlds”: the flexibility of the “New World” of discovery & “emergence” + the structure & meaning of the “Old World” through ontologies.
  • 23. It’s All About Relationships • In graph databases, relationships are first-class constructs. • Rather ironically, relational databases lack relationships. • In relational databases, relationships are enforced through joins and constraints. • NoSQL (e.g. key-value) databases are also weak at supporting relationships. “A relational database isn’t about relationships, it’s about constraints.” – Karen Lopez. Diagram: Customer, Is Owner Of, Account, i.e. <Customer> <Owner Of> <Account>
  • 24. Use Cases for Graph Databases
  • 25. Social Networks. Who are the cool kids? i.e. people linked with Donna. Diagram labels: Donna (center of the network); “Sad, Lonely Person who doesn’t like data” (unconnected node).
  • 26. X Degrees of Separation – “The Bacon Number” • What’s Audrey Hepburn’s “Bacon Number”, i.e. her degrees of separation from actor Kevin Bacon? • As always, metadata and data quality are important, e.g. which Audrey Hepburn? Courtesy of oracleofbacon.org
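Degrees of separation are a shortest-path question, which a breadth-first search answers directly. The sketch below is a minimal illustration in plain Python. The co-star graph uses placeholder names and is entirely hypothetical; it is not data from oracleofbacon.org.

```python
from collections import deque


def degrees_of_separation(graph, start, target):
    """Breadth-first search: number of hops from start to target, or None."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # no path: the two people are not connected


# Hypothetical co-star links (an edge means "appeared in a film together").
CO_STARS = {
    "Actor A": ["Actor B"],
    "Actor B": ["Actor A", "Actor C"],
    "Actor C": ["Actor B"],
    "Actor D": [],
}

hops = degrees_of_separation(CO_STARS, "Actor A", "Actor C")
```

Here `hops` is 2 (A to B to C), while `Actor D` has no links and yields `None`. The data quality point from the slide applies: if two different people share one node name, every distance computed through that node is wrong.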
  • 27. Fraud Detection in Online Transactions • Online transactions typically have certain identifiers, e.g. user ID, IP address, geo location, tracking cookie, credit card number, etc. • Graph patterns can help detect fraud, e.g.: • The more interconnections exist among identifiers, the greater the cause for concern. • Typically they would be 1:1. • Some variations may occur, e.g. multiple credit cards with one person, families using the same machine, etc. • Large and tightly-knit graphs are very strong indicators that fraud is taking place. • Triggers can be put into place so that these patterns are uncovered before they cause damage. Diagram: IP addresses (IP1) linked to credit cards CC1 through CC17, with clusters labeled “Personal & Business Card”, “Family”, and “Fraud?”.
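One way to sketch the pattern described on this slide is to link identifiers that appear in the same transaction and then flag unusually large connected clusters. The identifier strings, the sample transactions, and the `max_size` threshold below are all hypothetical; a real system would tune the threshold and add further signals.

```python
from collections import defaultdict


def build_graph(transactions):
    """Link every pair of identifiers that appear in the same transaction."""
    graph = defaultdict(set)
    for identifiers in transactions:
        for a in identifiers:
            for b in identifiers:
                if a != b:
                    graph[a].add(b)
    return dict(graph)


def components(graph):
    """Return the connected components of an undirected graph."""
    seen = set()
    result = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        result.append(component)
    return result


def suspicious_clusters(transactions, max_size=4):
    """Flag clusters with more identifiers than max_size (assumed threshold)."""
    return [c for c in components(build_graph(transactions)) if len(c) > max_size]


# Hypothetical data: one IP address tied to five cards, another tied to two.
TRANSACTIONS = [
    ("ip:IP1", "card:CC1"),
    ("ip:IP1", "card:CC2"),
    ("ip:IP1", "card:CC3"),
    ("ip:IP1", "card:CC4"),
    ("ip:IP1", "card:CC5"),
    ("ip:IP2", "card:CC6"),
    ("ip:IP2", "card:CC7"),
]

flagged = suspicious_clusters(TRANSACTIONS)
```

With this data only the `ip:IP1` cluster is flagged. The `ip:IP2` cluster, which resembles the slide's "personal & business card" case, stays under the threshold.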
  • 28. Recommendation Engines • Recommendation engines are familiar to most of us who do any online shopping. • These engines can be powered by a graph database, e.g.: • Capture a customer’s browsing behavior and demographics. • Combine those with their buying history to provide relevant recommendations.
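A very small version of the idea: treat purchases as customer-to-product edges and recommend products bought by customers with overlapping baskets. The customers, products, and scoring rule below are hypothetical and only meant to show the graph-shaped reasoning, not any vendor's engine.

```python
from collections import Counter


def recommend(purchases, customer, top_n=3):
    """Score unseen products by how much their buyers overlap with customer."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(owned & items)  # shared purchases = strength of the link
        if overlap == 0:
            continue
        for item in items - owned:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical purchase history.
PURCHASES = {
    "ann": {"tent", "stove"},
    "bob": {"tent", "stove", "lantern"},
    "cat": {"tent", "boots"},
    "dan": {"novel"},
}

suggestions = recommend(PURCHASES, "ann")
```

For `ann`, the lantern ranks first (its buyer shares two items with her) and the boots second; `dan` shares nothing with her, so the novel is never suggested.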
  • 29. Data Quality & Volume Matter • Recommendation engines are based on evaluating data sets. If those data sets are faulty or of poor quality, your results will be flawed. • This is especially true if the data sets are small.
  • 30. Master Data Management (MDM) • Master Data Management (MDM) is the practice of identifying, cleansing, storing & governing the core data assets of the organization (e.g. customer, product, etc.). • There are many architectural approaches to MDM. Two are the following: • Centralized (commonly relational): core data is stored in a common schema in a centralized “hub” and used as a common reference for operational systems, DW, etc. • Virtualized/Registry (commonly graph): data remains in source systems and is referenced through a common MDM virtualization layer. BOTH require the same core foundation of data quality, parsing & matching, semantic meaning, data governance, etc. in order to be successful… and that’s usually the hardest stuff.
  • 31. Graph Databases for Data Warehousing. When you have a hammer, everything looks like a nail, i.e. data warehouses serve a particular purpose for aggregating & summarizing data. Not ideal for graph databases.
  • 32. Data Warehousing & Enterprise Knowledge Graph • Data Warehouse (relational & dimensional data model): “…Show me Total Sales by Region and by Customer each month in 2017.” • Enterprise Knowledge Graph (graph data model): “…Who are my most influential customers (with the most connections)?”
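The knowledge-graph question on this slide ("most connections") corresponds to counting edges per node, often called degree. A minimal sketch, with a hypothetical referral network between customers:

```python
from collections import Counter


def most_connected(edges, top_n=1):
    """Rank nodes by degree, i.e. by the number of edges touching them."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common(top_n)


# Hypothetical "knows / referred" links between customers.
EDGES = [
    ("ann", "bob"),
    ("ann", "cat"),
    ("ann", "dan"),
    ("bob", "cat"),
]

top = most_connected(EDGES)
```

`top` is `[("ann", 3)]`. Degree is only the simplest notion of influence; the slide does not say which measure is intended.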
  • 33. Data Management & Ballroom Dancing. “First you dance with yourself, then with your partner, then you dance with the room.”
  • 34. An Enterprise Knowledge Graph Provides a Holistic View of the Organization through Relationships. “First you dance with yourself, then with your partner, then you dance with the room.” • Data quality & semantics are important for core enterprise data assets, e.g. customer data: Name: Audrey Hepburn, DOB: May 4, 1929, Current Customer: No. • But the true value is in the interrelationships between data assets: Audrey Hepburn is Mother of Luca Dotti (Name: Luca Dotti, DOB: February 8, 1970, Current Customer: Yes). Other relationships shown: Purchased Yacht Insurance, Purchased Home Insurance, Filed a Claim.
  • 35. Data Modeling & Metadata for Graph Databases
  • 36. Data Modeling for Graph Databases • There are several dominant ways to model graph databases. Two popular ones are: • Resource Description Framework (RDF) Triples: made up of subject, predicate, object triples. Sample query language: SPARQL. Sample vendor: Stardog. • Labeled Property Graph: made up of nodes, relationships, properties & labels. Sample query language: Cypher. Sample vendor: Neo4j. • Both have a close affinity between logical & physical models, i.e. we already think in “thing relates to thing”. • In the following slides, we’ll use the RDF example, since that is a W3C open standard.
  • 37. Graph Query Languages • Unlike relational databases, where SQL is a general standard, there are a number of query language options available for graph databases: • SPARQL: a SQL-like declarative query language created by the W3C to query RDF (Resource Description Framework) graphs. • Cypher: also a declarative query language that resembles SQL. Created by Neo4j. • GraphQL: a query language for APIs. It isn’t specific to graph databases, but can be used for them. Developed by Facebook. • Gremlin: a graph traversal language developed for Apache TinkerPop™, an open source, vendor-agnostic graph computing framework distributed under the Apache 2 license. Again, we’ll use SPARQL in our examples since it’s a W3C standard.
  • 38. Resource Description Framework (RDF) • The RDF (Resource Description Framework) model from the World Wide Web Consortium (W3C) provides a way to link resources on the web (people, places, things) using the concept of “triples”. • This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. Diagram (RDF Triples): Subject, Predicate, Object.
  • 39. RDF Triple Example. Cynthia (Subject), Is Owner Of (Predicate), Fido (Object): <Cynthia> <Owner Of> <Fido> • Brackets indicate individual references in RDF. Note that these are defined by URIs in RDF, but have been simplified for this example.
  • 40. RDF Triples. <Cynthia> <type> <Person> . <Fido> <type> <Dog> . <Cynthia> <hasName> “Cynthia Smith” . <Fido> <hasName> “Fido” . <Cynthia> <ownerOf> <Fido> . Diagram labels: Class (e.g. <Person>), Literal (e.g. “Cynthia Smith”), Instance (e.g. <Cynthia>).
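The five triples on this slide can be held in a very small data structure. The sketch below uses plain Python tuples rather than an RDF library; the `Literal` wrapper and the short names standing in for URIs are simplifications introduced here for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain value such as a name, as opposed to a reference to a resource."""
    value: str


# (subject, predicate, object); short names stand in for full URIs.
TRIPLES = [
    ("Cynthia", "type", "Person"),
    ("Fido", "type", "Dog"),
    ("Cynthia", "hasName", Literal("Cynthia Smith")),
    ("Fido", "hasName", Literal("Fido")),
    ("Cynthia", "ownerOf", "Fido"),
]


def group_by_subject(triples):
    """Collect the (predicate, object) pairs that describe each subject."""
    grouped = defaultdict(list)
    for subject, predicate, obj in triples:
        grouped[subject].append((predicate, obj))
    return dict(grouped)


by_subject = group_by_subject(TRIPLES)
```

Grouping by subject gives one entry for `Cynthia` with three statements and one for `Fido` with two. Keeping `Literal("Fido")` distinct from the resource `"Fido"` mirrors the slide's distinction between a literal and an instance.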
  • 41. RDF Triple Graphical Representation • RDF triples can be intuitively visualized graphically. Diagram: <Cynthia> <type> <Person>; <Cynthia> <hasName> “Cynthia Smith”; <Cynthia> <ownerOf> <Fido>; <Fido> <type> <Dog>; <Fido> <hasName> “Fido”.
  • 42. Logical Groupings. @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix example: <http://example.org/example#> . example:Cynthia rdf:type example:Person ; example:hasName “Cynthia Smith” ; example:ownerOf example:Fido . example:Fido rdf:type example:Dog ; example:hasName “Fido” . • A Person has a name • A Person can be an owner • A Dog has a name
  • 43. Ontologies • An ontology is a data model of sorts to describe the “things” in RDF data. • Two ontology languages are: • OWL (W3C Web Ontology Language): a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. • RDFS (RDF Schema): a general-purpose language for representing simple RDF vocabularies. It is considered a precursor to OWL. • For example, in RDFS: • People have Names • People can own kinds of things • Pets can be owned • A dog is a pet • Dogs can have names • OWL can be more expressive: • A Mother is the union of (Parent, Woman) • This Family ontology links with the Person ontology (meta-meta-metadata) • Etc.
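The RDFS statement "a dog is a pet" is a subclass rule: anything typed as a Dog is also a Pet. The sketch below shows that one inference in plain Python. The `SUBCLASS_OF` table is a stand-in for `rdfs:subClassOf` statements, and the function name is hypothetical.

```python
# "A dog is a pet", expressed as child class -> parent class.
SUBCLASS_OF = {"Dog": "Pet"}

TRIPLES = [
    ("Cynthia", "type", "Person"),
    ("Fido", "type", "Dog"),
]


def types_of(instance, triples, subclass_of):
    """Return the asserted classes of instance plus all inherited ones."""
    found = set()
    for subject, predicate, obj in triples:
        if subject == instance and predicate == "type":
            cls = obj
            # Walk up the class hierarchy until it ends or repeats.
            while cls is not None and cls not in found:
                found.add(cls)
                cls = subclass_of.get(cls)
    return found


fido_types = types_of("Fido", TRIPLES, SUBCLASS_OF)
```

`fido_types` contains both `Dog` (asserted) and `Pet` (inferred), so a question about pets would include Fido even though no triple says so directly.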
  • 44. Ontologies Help Define Queries. Ontology: People have Names • People can own kinds of things • Pets can be owned • A dog is a pet • Dogs can have names. Query: Show me all of the People who Own Dogs.
  • 45. Putting Ontologies & Queries Together. SELECT ?name WHERE { ?person type Person ; hasName ?name ; ownerOf ?pet . ?pet type Dog . } -> RESULT: “Cynthia Smith”. Callouts: Define Variables • Write out the Graph using Variables (?person type Person ; hasName ?name ; ownerOf ?pet . ?pet type Dog .) • Query across the Graph
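To make the mechanics of the query concrete, the sketch below matches the same four patterns against the example triples in plain Python. It is a toy matcher written for illustration, not a SPARQL engine; literals are stored as ordinary strings and names starting with `?` are treated as variables.

```python
TRIPLES = [
    ("Cynthia", "type", "Person"),
    ("Fido", "type", "Dog"),
    ("Cynthia", "hasName", "Cynthia Smith"),
    ("Fido", "hasName", "Fido"),
    ("Cynthia", "ownerOf", "Fido"),
]


def match(pattern, triples, binding):
    """Yield copies of binding extended so that pattern matches a triple."""
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier value.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield candidate


def query(patterns, triples):
    """Join the patterns one by one, keeping only consistent bindings."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, triples, b)]
    return bindings


# The WHERE clause from the slide, written out one pattern per line.
PEOPLE_WHO_OWN_DOGS = [
    ("?person", "type", "Person"),
    ("?person", "hasName", "?name"),
    ("?person", "ownerOf", "?pet"),
    ("?pet", "type", "Dog"),
]

names = [b["?name"] for b in query(PEOPLE_WHO_OWN_DOGS, TRIPLES)]
```

`names` is `["Cynthia Smith"]`, matching the result on the slide. Each pattern narrows the set of variable bindings, which is the sense in which the query is "written out as a graph".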
  • 46. Summary • Graph databases provide powerful enterprise-wide association using simple constructs: • “Thing Relates to Thing” • Relationships are first-class constructs • The best-suited enterprise use cases are those that focus on interrelationships between data points: • Social Networks • Fraud Detection • Recommendation Engines • Enterprise Knowledge Graph • Data modeling & metadata are supported by simple constructs: • Data structures through triples: Subject, Predicate, Object • Semantics through ontologies (e.g. OWL) • Queries through SPARQL and other methods
  • 47. About Global Data Strategy, Ltd. Data-Driven Business Transformation: Business Strategy Aligned With Data Strategy • Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. • Our passion is data, and helping organizations enrich their business opportunities through data and information. • Our core values center around providing solutions that are: • Business-Driven: We put the needs of your business first, before we look at any technology solution. • Clear & Relevant: We provide clear explanations using real-world examples. • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography. • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry. Visit www.globaldatastrategy.com for more information.
  • 48. Contact Info • Email: donna.burbank@globaldatastrategy.com • Twitter: @donnaburbank @GlobalDataStrat • Website: www.globaldatastrategy.com
  • 49. Lessons in Data Modeling Series: This Year’s Line Up • January 26th: How Data Modeling Fits Into an Overall Enterprise Architecture • February 23rd: Data Modeling and Business Intelligence • March: Conceptual Data Modeling – How to Get the Attention of Business Users • April: The Evolving Role of the Data Architect – What does it mean for your Career? • May: Data Modeling & Metadata Management • June: Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling • July: Data Modeling & Metadata for Graph Databases • August: Data Modeling & Data Integration • September: Data Modeling & MDM • October: Agile & Data Modeling – How Can They Work Together? • December: Data Modeling, Data Quality & Data Governance
  • 50. Questions? Thoughts? Ideas?