Modelling event data as a graph
Combining event graphs with event grammar
Graph DBs have some key advantages over relational DBs
• People tend to visualise concepts intuitively as graphs, which do not translate well to a
tabular structure.
• Graph databases are often designed for low-latency performance, which can make them
a better choice for certain applications, such as recommendation engines, especially at
scale.
• Some questions (for example, path analysis) are difficult to answer with a relational
database but easy to answer with a graph.
We know how to model events in a table…
• In a relational event data model, each event is a record in a table or index.
• The table has as many columns or properties as there are facets to that event, eg user,
timestamp, URL, etc.
• There isn’t much scope for deviation from this basic model.
… but modelling events as a graph is relatively unexplored
• We can model events as nodes with properties, related through [:NEXT] edges:
(PageView1)-[:NEXT]->(PageView2).
• We can model events as relationships, eg (User)-[:VIEWS]->(Page).
• We can mix and match different methods.
• It’s not obvious which model is ‘the right one’, if there even is such a thing.
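As a minimal sketch of the first two options, the statements below write the same kind of page view once as a chain of event nodes and once as a relationship between entities. The property names (url, id, ts) and their values are illustrative assumptions, not part of the original deck.

```cypher
// Option 1: events as nodes, chained in time order by NEXT.
// ts is a made-up ordering value.
CREATE (pv1:PageView {url: '/home', ts: 1})
CREATE (pv2:PageView {url: '/pricing', ts: 2})
CREATE (pv1)-[:NEXT]->(pv2);

// Option 2: the event is the relationship itself,
// and event-level facts become relationship properties.
CREATE (u:User {id: 'u1'})
CREATE (p:Page {url: '/home'})
CREATE (u)-[:VIEWS {ts: 1}]->(p);
```

In the first form the event can carry any number of properties and be linked to other events. In the second form the event can only connect exactly two entities.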
Choosing the model dictates what queries we can run
• In a relational database, you’d always use the same query to get specific properties of an
event:
SELECT user_id, page_url FROM events;
• However, in a graph database your query syntax depends on whether those dimensions have
been modelled as nodes, relationships, or properties of nodes or relationships:
MATCH (u:User)-[:VIEWS]->(p:Page)
RETURN u.id, p.url;
MATCH (e:Event)
RETURN e.user_id, e.page_url;
Taking an event-grammar approach
In the event-grammar model, an event is a
snapshot of a set of entities in time.
This model is already a graph, with nodes
representing the various entities and
relationships between the nodes.
However, when mapping this model to a
tabular structure, the roles of each entity
and the relationships between them are
lost to users without knowledge of the
model and the domain.
To make the roles and relationships
explicit, we have to interpret the
event
This model makes it hard to find all events by the same user
MATCH (u:User)-[r:VIEWS]->(p:Page)
WHERE u.email = 'alice@mail.com'
RETURN COUNT(r)
But what if we have more than just page views, eg also link clicks, downloads, form
submits, etc?
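To make the difficulty concrete, here is a sketch of the two workarounds available when every event type is its own relationship type. The types CLICKS, DOWNLOADS and SUBMITS are hypothetical names for the extra event types mentioned above.

```cypher
// Workaround 1: list every relationship type that stands for an event.
// The list has to be updated whenever a new event type is added.
MATCH (u:User)-[r:VIEWS|CLICKS|DOWNLOADS|SUBMITS]->(target)
WHERE u.email = 'alice@mail.com'
RETURN type(r) AS event_type, count(r) AS events;

// Workaround 2: match any outgoing relationship.
// This also counts relationships that are not events at all.
MATCH (u:User)-[r]->(target)
WHERE u.email = 'alice@mail.com'
RETURN type(r) AS event_type, count(r) AS events;
```

Neither query can tell, from the graph alone, which relationships are events and which are not.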
The event graph approach
A popular option for modelling events in a
graph is to make each event a node that is
related to the event that happened immediately
before it and after it through
a NEXT / PREVIOUS relationship.
The event node then has
outgoing HAS relationships to all of its entities,
such as user nodes, context nodes, etc.
This model makes path analysis easy.
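As a rough illustration of path analysis in this model, the query below counts which pages most often come directly after a given page. It assumes Page nodes carry a url property, as in the earlier queries, and that consecutive events are linked by NEXT. The '/pricing' URL is a made-up value.

```cypher
// For every event on the starting page, step to the next event
// and look up the page attached to it.
MATCH (:Page {url: '/pricing'})<-[:HAS]-(e1:Event)-[:NEXT]->(e2:Event)-[:HAS]->(next:Page)
RETURN next.url AS next_page, count(*) AS transitions
ORDER BY transitions DESC
LIMIT 5
```

Longer journeys can be explored the same way by replacing the single NEXT hop with a variable-length pattern such as [:NEXT*1..5].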
The event-graph model makes other queries harder
MATCH (u:User)<-[:HAS]-(e:Event)-[:HAS]->(p:Page)
WHERE u.email = 'alice@mail.com'
RETURN COUNT(DISTINCT p)
We can still find all pages visited by the user, but we always have to add a ‘hop’ in the
query, because entities are related to each other only through the event node they belong
to.
The ‘denormalised’ graph approach
There is also the option to “denormalise” the
data, ie represent the same data in different
ways.
An example would be a model where each
event is a node in a time series, with outgoing
relationships to all its entities, but there
are also relationships between the entities.
This adds complexity and redundancy to the
model but makes queries easier.
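A minimal sketch of what writing one page view could look like in such a model: the same fact is stored both as an event node with HAS relationships and as a direct VIEWS relationship between the entities. The name and ts properties on the event are assumptions, and linking the new event into the NEXT chain is left out for brevity.

```cypher
// Reuse the entity nodes if they already exist.
MERGE (u:User {email: 'alice@mail.com'})
MERGE (p:Page {url: '/pricing'})
// Event-graph representation: one node per event, pointing at its entities.
CREATE (e:Event {name: 'page_view', ts: 3})
CREATE (e)-[:HAS]->(u)
CREATE (e)-[:HAS]->(p)
// Redundant shortcut between the entities, kept for simpler queries.
MERGE (u)-[:VIEWS]->(p)
```

Using MERGE for the shortcut keeps a single VIEWS relationship per user and page, so per-view detail remains only on the event nodes.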
Now we can easily write a variety of queries
MATCH (u:User)-[:VIEWS]->(p:Page)
WHERE u.email = 'alice@mail.com'
RETURN COUNT(DISTINCT p);
MATCH p = (u:User)<-[:HAS]-(:Event)-[:NEXT*1..5]->(:Event)
WHERE u.email = 'alice@mail.com'
RETURN p;
How is modelling event-level data as a graph valuable?
One key advantage is that any insight you glean from analysing the relationships between
entities in your events can be readily attached to your existing data set.
To illustrate, here’s a popular use case: an Identity Resolution Graph
• Users log in with different accounts
• On multiple devices
• Across multiple networks
An identity graph is powerful but it remains locked away from the rest of your relational data
You need to write extra code to:
• Check the Identity Graph for all aliases of a specific user / device / network;
• Fill in those aliases in SQL queries against your relational database;
• Union the results of those queries.
Contrast that with building the ID graph on top of your existing event graph
You will be able to easily:
• Find all events for a specific user / device / network;
• Build relationships that link all known aliases for this user / device / network to the same
events;
• Quickly discover all of the user / device / network history, regardless of which alias they
are using at the moment.
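One possible way to express this in Cypher, as a sketch only: aliases are linked directly to each other, and the event lookup walks those links before following HAS back to the events. The ALIAS_OF relationship type and the second email address are hypothetical, and the event-graph model with HAS relationships is assumed.

```cypher
// Record that two user nodes are aliases of the same person.
MATCH (a:User {email: 'alice@mail.com'}), (b:User {email: 'alice@work.example'})
MERGE (a)-[:ALIAS_OF]->(b);

// Find every event for Alice under any alias.
// *0..3 includes Alice's own node and aliases up to three links away.
MATCH (a:User {email: 'alice@mail.com'})-[:ALIAS_OF*0..3]-(alias:User)<-[:HAS]-(e:Event)
RETURN DISTINCT e
```

Because the aliases live in the same graph as the events, no separate lookup, query templating or union step is needed.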
e. dilyan@snowplowanalytics.com
t. @dilyan_damyanov
If you would like to explore how Snowplow can enable you to
take control of your data, and what that can make possible,
visit our product page, request a demo or get in touch.
Sign up to our mailing list and stay up-to-date with our new
releases and other news.
snowplowanalytics.com
© 2018 Snowplow Technology and services provider in digital analytics. All Rights Reserved.
