Modelling event data as a graph
Combining event graphs with event grammar
Graph DBs have some key advantages over relational DBs
• People tend to intuitively visualise concepts as graphs, which do not translate nicely to a tabular structure.
• Graph databases are often designed for low-latency performance, which can make them a better choice for certain applications, such as recommendation engines, especially at scale.
• There are some questions (for example path analysis) that are difficult to answer when using a relational database, but that are easy to answer with a graph.
We know how to model events in a table…
• In a relational event data model, each event is a record in a table or index.
• The table has as many columns or properties as there are facets to that event, eg user, timestamp, URL, etc.
• There isn’t much scope for deviation from this basic model.
… but modelling events as a graph is relatively unexplored
• We can model events as nodes with properties, related through [:NEXT] edges: (PageView1)-[:NEXT]->(PageView2).
• We can model events as relationships, eg (User)-[:VIEWS]->(Page).
• We can mix and match different methods.
• It’s not obvious which model is ‘the right one’, if there even is such a thing.
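The two modelling options above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample page views and the names `page_views`, `event_nodes`, `next_edges` and `views_edges` are hypothetical and do not come from the original slides.

```python
# Hypothetical sample data: (user, page) page views in time order.
page_views = [("alice", "/home"), ("alice", "/pricing"), ("alice", "/home")]

# Option 1: each event is a node with properties, chained by NEXT edges.
event_nodes = [
    {"id": i, "type": "PageView", "user": user, "page": page}
    for i, (user, page) in enumerate(page_views)
]
next_edges = [(i, i + 1) for i in range(len(event_nodes) - 1)]

# Option 2: each event is a relationship between two entity nodes.
views_edges = [(user, "VIEWS", page) for user, page in page_views]

print(next_edges)
print(views_edges)
```

The same three page views end up either as three nodes joined by two NEXT edges, or as three VIEWS relationships with no event nodes at all.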
Choosing the model dictates what queries we can run
• In a relational database, you’d always use the same query to get specific properties of an event:
SELECT user_id, page_url FROM events;
• However, in a graph database your query syntax depends on whether those dimensions have been modelled as nodes, relationships, or properties of nodes or relationships:
MATCH (u:User)-[:VIEWS]->(p:Page)
RETURN u.id, p.url;
MATCH (e:Event)
RETURN e.user_id, e.page_url;
Taking an event-grammar approach
In the event-grammar model, an event is a snapshot of a set of entities in time.
This model is already a graph, with nodes representing the various entities and relationships between the nodes.
However, when mapping this model to a tabular structure, the roles of each entity and the relationships between them are lost to users without knowledge of the model and the domain.
To make the roles and relationships explicit, we have to interpret the event
This model makes it hard to find all events by the same user
MATCH (u:User)-[r:VIEWS]->(p:Page)
WHERE u.email = 'alice@mail.com'
RETURN COUNT(r)
But what if we have more than just page views, eg link clicks, downloads, form submits, etc?
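One reading of this difficulty, sketched in Python with hypothetical data: when each event type becomes its own relationship type, a query written for VIEWS silently misses every other kind of event. The edge list and the helper `count_events` are assumptions made for illustration only.

```python
# Hypothetical edges in the 'events as relationships' model:
# (user, relationship type, target entity).
edges = [
    ("alice", "VIEWS", "/home"),
    ("alice", "CLICKS", "link-42"),
    ("alice", "DOWNLOADS", "whitepaper.pdf"),
    ("bob", "VIEWS", "/home"),
]

def count_events(edges, user, rel_types):
    """Count a user's events, looking only at the given relationship types."""
    return sum(1 for u, rel, _ in edges if u == user and rel in rel_types)

# Matching only VIEWS misses the click and the download.
views_only = count_events(edges, "alice", {"VIEWS"})

# Finding every event means knowing and listing every relationship type.
all_types = {rel for _, rel, _ in edges}
all_events = count_events(edges, "alice", all_types)

print(views_only, all_events)
```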
The event graph approach
A popular option for modelling events in a graph is to make each event a node that is related to the events that happened immediately before and after it through a NEXT / PREVIOUS relationship.
The event node then has outgoing HAS relationships to all of its entities, such as user nodes, context nodes, etc.
This model makes path analysis easy.
The event-graph model makes other queries harder
MATCH (u:User)<-[:HAS]-(e:Event)-[:HAS]->(p:Page)
WHERE u.email = 'alice@mail.com'
RETURN COUNT(DISTINCT p)
We can still find all pages visited by the user, but we always have to add a ‘hop’ in the query, because entities are related to each other only through the event node they belong to.
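The extra hop can be made concrete with a small Python sketch. The data and the function `distinct_pages` are hypothetical; the sketch only mirrors the shape of the query above, with one step from the user to its events and a second step from those events to their pages.

```python
# Hypothetical event-graph data: each event node HAS a user and a page.
# Each tuple is (event id, kind of entity, entity value).
has_edges = [
    ("e1", "user", "alice@mail.com"), ("e1", "page", "/home"),
    ("e2", "user", "alice@mail.com"), ("e2", "page", "/pricing"),
    ("e3", "user", "alice@mail.com"), ("e3", "page", "/home"),
    ("e4", "user", "bob@mail.com"), ("e4", "page", "/docs"),
]

def distinct_pages(has_edges, email):
    """Return the distinct pages seen by a user, going through event nodes."""
    # Hop 1: from the user to the events that HAS that user.
    events = {e for e, kind, value in has_edges
              if kind == "user" and value == email}
    # Hop 2: from those events to the pages they HAS.
    return {value for e, kind, value in has_edges
            if kind == "page" and e in events}

alice_pages = distinct_pages(has_edges, "alice@mail.com")
print(len(alice_pages))
```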
The ‘denormalised’ graph approach
There is also the option to “denormalise” the data, ie represent the same data in different ways.
An example would be a model where each event is a node in a time series, with outgoing relationships to all its entities, but there are also relationships between the entities.
This adds complexity and redundancy to the model but makes queries easier.
Now we can easily write a variety of queries
MATCH (u:User)-[:VIEWS]->(p:Page)
WHERE u.email = 'alice@mail.com'
RETURN COUNT(DISTINCT p)
MATCH p = (u:User)<-[:HAS]-(:Event)-[:NEXT*1..5]->(:Event)
WHERE u.email = 'alice@mail.com'
RETURN p
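As a rough Python analogue of the denormalised model, the sketch below keeps both the direct user-to-page edges and the NEXT chain of events. All names and data are hypothetical, and it assumes a linear chain in which each event has at most one NEXT event.

```python
# Hypothetical denormalised data: a NEXT chain of events plus direct VIEWS edges.
next_of = {"e1": "e2", "e2": "e3", "e3": "e4"}  # event -> following event
event_user = {"e1": "alice", "e2": "alice", "e3": "alice", "e4": "alice"}
views = {("alice", "/home"), ("alice", "/pricing")}  # direct user -> page edges

def paths_from(start, max_hops=5):
    """Return every NEXT path of 1..max_hops hops starting at `start`."""
    paths, path = [], [start]
    while path[-1] in next_of and len(path) - 1 < max_hops:
        path = path + [next_of[path[-1]]]
        paths.append(path)
    return paths

# Direct edges answer 'which pages?' without going through event nodes.
pages = {page for user, page in views if user == "alice"}

# The NEXT chain still supports path analysis from each of the user's events.
paths = [p for e, u in event_user.items() if u == "alice" for p in paths_from(e)]

print(len(pages), len(paths))
```

The redundancy is visible in the data itself: the page views are stored once as direct edges and again, implicitly, in the event chain.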
How is modelling event-level data as a graph valuable?
One key advantage is that any insight you glean from analysing the relationships of entities in your events can be readily attached to your existing data set.
To illustrate, here’s a popular use case: an Identity Resolution Graph
• Users log in with different accounts
• On multiple devices
• Across multiple networks
An identity graph is powerful but it remains locked away from the rest of your relational data
You need to write extra code to:
• Check the Identity Graph for all aliases of a specific user / device / network;
• Fill in those aliases in SQL queries against your relational database;
• Union the results of those queries.
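The three steps above might look roughly like the following sketch, which uses Python's built-in `sqlite3` module as a stand-in relational database. The identity graph contents, the `events` table layout and the helper `aliases_of` are all hypothetical.

```python
import sqlite3

# Hypothetical identity graph: alias -> set of directly linked aliases.
identity_graph = {
    "alice@mail.com": {"cookie-123"},
    "cookie-123": {"alice@mail.com", "device-9"},
    "device-9": {"cookie-123"},
}

def aliases_of(start):
    """Step 1: walk the identity graph to collect every alias of `start`."""
    seen, stack = {start}, [start]
    while stack:
        for neighbour in identity_graph.get(stack.pop(), ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen

# A stand-in relational event store, kept separate from the identity graph.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, page_url TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("alice@mail.com", "/home"), ("cookie-123", "/pricing"),
     ("device-9", "/docs"), ("bob@mail.com", "/home")],
)

# Steps 2 and 3: run one query per alias and union the results.
rows = []
for alias in sorted(aliases_of("alice@mail.com")):
    rows += conn.execute(
        "SELECT user_id, page_url FROM events WHERE user_id = ?", (alias,)
    ).fetchall()

print(rows)
```

The glue code between the two stores (alias lookup, parameter filling, result merging) is exactly the extra work the slide refers to.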
Contrast that with building the ID graph on top of your existing event graph
You will be able to easily:
• Find all events for a specific user / device / network;
• Build relationships that link all known aliases for this user / device / network to the same events;
• Quickly discover all of the user / device / network history, regardless of which alias they are using at the moment.
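When aliases and events live in the same graph, the lookup becomes a single traversal. The sketch below is a minimal illustration with hypothetical data: `same_as` stands for the alias links, `event_has` for the HAS edges from events to the alias that produced them, and `history` is an invented helper.

```python
# Hypothetical single graph: alias-to-alias links plus event-to-alias HAS edges.
same_as = [("alice@mail.com", "cookie-123"), ("cookie-123", "device-9")]
event_has = {
    "e1": "alice@mail.com",
    "e2": "cookie-123",
    "e3": "device-9",
    "e4": "bob@mail.com",
}

def history(alias):
    """Return all events for `alias` and for every alias linked to it."""
    # Build an undirected adjacency map from the alias links.
    neighbours = {}
    for a, b in same_as:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Traverse the alias links, then collect events attached to any alias found.
    seen, stack = {alias}, [alias]
    while stack:
        for n in neighbours.get(stack.pop(), ()):
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return sorted(e for e, owner in event_has.items() if owner in seen)

print(history("device-9"))
```

Whichever alias the lookup starts from, the same history comes back, with no second data store to query.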
e. dilyan@snowplowanalytics.com
t. @dilyan_damyanov
If you would like to explore how Snowplow can enable you to
take control of your data, and what that can make possible,
visit our product page, request a demo or get in touch.
Sign up to our mailing list and stay up-to-date with our new
releases and other news.
snowplowanalytics.com
© 2018 Snowplow Technology and services provider in digital analytics. All Rights Reserved.
