SlideShare a Scribd company logo
LDBC Social Network Benchmark
Business Intelligence Workload
Marcus Paradies, SAP SE
16/23/2016
Task Force Members
• Task Force members
• Arnau Prat (DAMA-UPC)
• Alex Averbuch (Neo Technologies)
• Moritz Kaufmann (TU München)
• Marcus Paradies (SAP)
• Orri Erling (Google)
• Peter Boncz (CWI)
26/23/2016
Interactive and BI Workload
• Interactive workload
• Only touch a small fraction of the data graph (< 3 hops)
• Queries usually start at a single person vertex
• Can be seen as OLTP-style workload
• Business intelligence workload
• Focus on analytic queries (grouping/aggregations)
• Queries touch the complete graph
• Batch updates
• Can be seen as OLAP-style workload
• 24 queries in the BI workload
• Interactive and BI workload share the benchmark driver and
generated data
• Workload-specific parameter bindings to reduce variability
between queries
3
Benchmark Driver
Data Generator
Interactive
Workload
Business Intelligence
Workload
6/23/2016
Examples Queries
Q1 - Posting Summary
Given a date, find all Messages created before that date.
Group them by a 3-level grouping:
1. by year of creation
2. for each year, group into message types, i.e., Posts or Comments
3. for each year-type group, split into four groups based on length of their content:
0 <= length < 40  short
40 <= length < 80  one liner
80 <= length < 160  tweet
160 <= length  long
Parameters:
date: Date
46/23/2016
Examples Queries
Q1 - Posting Summary
5
Result: (*)
For every 3-level group, return:
• year - 32-bit Integer
• message type  post/comment
• length category  (short/one-liner/tweet/long)
• message count  total number of Messages (Posts/Comments) in that group
• average message length  average length of the Message content in that group
• sum message length  sum of all message content lengths
• per messages  number of messages in a group as a percentage of all
messages created before the given date
* Final sorting and top k omitted here for brevity
6/23/2016
Examples Queries
Q16 - Experts in a Social Circle
6
Given a Person, find all other Persons that live in a given country and are
connected to given person by a path through the knows relation.
For each of these Persons, retrieve all of their Messages (Posts & Comments)
that contain at least one Tag belonging to a given TagClass (direct relation
not transitive). For each Message, also retrieve its Tags.
Parameters:
Person.id - 64-bit Integer
Country.name - String
TagClass.name - String
6/23/2016
Examples Queries
Q16 - Experts in a Social Circle
7
Grouping of results:
First, by Tag.name
Second, by Person.id
Result: (*)
For each group, return:
• Person.id
• Tag.name
• post_count  number of Messages created by that Person containing that Tag
* Final sorting and top k omitted here for brevity
6/23/2016
SNB BI Workload in a Nutshell
6/23/2016 8
Q13 - Popular Tags per month in a country
Q14 - Top thread initiators
Q15 - Social Normals
Q16 - Experts in Social Circle
Q17 - Friend Triangles
Q18 - How many persons have a given number of posts
Q19 - Stranger's Interaction
Q20 - High level topics
Q21 - Zombies in a country
Q22 - International Dialog
Q23 - Holiday Destinations
Q24 - Messages By Topic And Continent
Q1 - Posting summary
Q2 - Top tags for country, age, gender, time
Q3 - Tag evolution
Q4 - Popular topics in a country
Q5 - Top posters in a country
Q6 - Most active Posters of a given Topic
Q7 - Most authoritative users on a given topic
Q8 - Related Topics
Q9 - Forum with related Tags
Q10 - Central Person for a Tag
Q11 - Unrelated Replies
Q12 - Trending Posts
Choke Point Identification
9
• Choke point = implementation challenge
• Interactive workload already contains a number of
choke points
• We extend them by more choke points from TPC-H that
are relevant for BI queries
6/23/2016
Current Status of the BI Workload
• Query Definition
• All 24 queries are specified in plain english text
• Benchmark Driver
• Reads generated parameter bindings and issues queries
• Query mix has to be defined
• 22/24 queries are validated against Sparsity,Neo4j, and Postgres
• Choke points identified for 8/24 queries
10
https://github.com/ldbc/ldbc_snb_implementations/tree/new_bi/bi/postgres/src/main/sql/postgres/queries/bi
Link to Postgres draft reference implementation:
6/23/2016
Outlook and Next Steps
• Finishing of chokepoint identification and analysis of whether all
defined chokepoints are already triggered
• Finishing validation of remaining two queries
• Addition of refresh data sets (update batches)
• Definition
• Implementation in the benchmark driver
• Final polishing of query specifications
116/23/2016

More Related Content

Similar to 8th TUC Meeting – Marcus Paradies (SAP) Social Network Benchmark

Temporal and semantic analysis of richly typed social networks from user-gene...
Temporal and semantic analysis of richly typed social networks from user-gene...Temporal and semantic analysis of richly typed social networks from user-gene...
Temporal and semantic analysis of richly typed social networks from user-gene...
Zide Meng
 
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
Gábor Szárnyas
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczFOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczIoan Toma
 
Addressing scalability challenges in peer-to-peer search
Addressing scalability challenges in peer-to-peer searchAddressing scalability challenges in peer-to-peer search
Addressing scalability challenges in peer-to-peer search
Harisankar H
 
Music recommendations API with Neo4j
Music recommendations API with Neo4jMusic recommendations API with Neo4j
Music recommendations API with Neo4j
Boris Guarisma
 
Relevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search TechnologiesRelevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search Technologies
enterprisesearchmeetup
 
Msr2010 ibrahim
Msr2010 ibrahimMsr2010 ibrahim
Msr2010 ibrahimSAIL_QU
 
ICANN Updates by Yu Chang Kuek
ICANN Updates by Yu Chang KuekICANN Updates by Yu Chang Kuek
ICANN Updates by Yu Chang Kuek
MyNOG
 
Rubrics for DMPs
Rubrics for DMPsRubrics for DMPs
Rubrics for DMPs
Jisc RDM
 
Clickstream data with spark
Clickstream data with sparkClickstream data with spark
Clickstream data with spark
Marissa Saunders
 
Personalized search
Personalized searchPersonalized search
Personalized search
Toine Bogers
 
ODS Slack exploration
ODS Slack explorationODS Slack exploration
ODS Slack exploration
Lviv Data Science Summer School
 
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge BasesLOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2 Creating Knowledge out of Interlinked Data
 
eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"
eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"
eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"eMadrid network
 
Network Relationships and Job Changes of Software Developers at Sunbelt 2016
Network Relationships and Job Changes of Software Developers at Sunbelt 2016Network Relationships and Job Changes of Software Developers at Sunbelt 2016
Network Relationships and Job Changes of Software Developers at Sunbelt 2016
Dawn Foster
 
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013Digital Methods Initiative
 
Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...
amiraryani
 
Strata sf - Amundsen presentation
Strata sf - Amundsen presentationStrata sf - Amundsen presentation
Strata sf - Amundsen presentation
Tao Feng
 
How are project-specific forums utilized? A study of participation, content, ...
How are project-specific forums utilized? A study of participation, content, ...How are project-specific forums utilized? A study of participation, content, ...
How are project-specific forums utilized? A study of participation, content, ...
Yusuf Sulistyo Nugroho
 
Coursera data science specialization
Coursera data science specializationCoursera data science specialization
Coursera data science specialization
Mengshu Liu
 

Similar to 8th TUC Meeting – Marcus Paradies (SAP) Social Network Benchmark (20)

Temporal and semantic analysis of richly typed social networks from user-gene...
Temporal and semantic analysis of richly typed social networks from user-gene...Temporal and semantic analysis of richly typed social networks from user-gene...
Temporal and semantic analysis of richly typed social networks from user-gene...
 
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczFOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
 
Addressing scalability challenges in peer-to-peer search
Addressing scalability challenges in peer-to-peer searchAddressing scalability challenges in peer-to-peer search
Addressing scalability challenges in peer-to-peer search
 
Music recommendations API with Neo4j
Music recommendations API with Neo4jMusic recommendations API with Neo4j
Music recommendations API with Neo4j
 
Relevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search TechnologiesRelevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search Technologies
 
Msr2010 ibrahim
Msr2010 ibrahimMsr2010 ibrahim
Msr2010 ibrahim
 
ICANN Updates by Yu Chang Kuek
ICANN Updates by Yu Chang KuekICANN Updates by Yu Chang Kuek
ICANN Updates by Yu Chang Kuek
 
Rubrics for DMPs
Rubrics for DMPsRubrics for DMPs
Rubrics for DMPs
 
Clickstream data with spark
Clickstream data with sparkClickstream data with spark
Clickstream data with spark
 
Personalized search
Personalized searchPersonalized search
Personalized search
 
ODS Slack exploration
ODS Slack explorationODS Slack exploration
ODS Slack exploration
 
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge BasesLOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
 
eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"
eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"
eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"
 
Network Relationships and Job Changes of Software Developers at Sunbelt 2016
Network Relationships and Job Changes of Software Developers at Sunbelt 2016Network Relationships and Job Changes of Software Developers at Sunbelt 2016
Network Relationships and Job Changes of Software Developers at Sunbelt 2016
 
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
 
Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...
 
Strata sf - Amundsen presentation
Strata sf - Amundsen presentationStrata sf - Amundsen presentation
Strata sf - Amundsen presentation
 
How are project-specific forums utilized? A study of participation, content, ...
How are project-specific forums utilized? A study of participation, content, ...How are project-specific forums utilized? A study of participation, content, ...
How are project-specific forums utilized? A study of participation, content, ...
 
Coursera data science specialization
Coursera data science specializationCoursera data science specialization
Coursera data science specialization
 

More from LDBC council

8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...
8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...
8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...
LDBC council
 
8th TUC Meeting - Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...
8th TUC Meeting -  Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...8th TUC Meeting -  Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...
8th TUC Meeting - Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...
LDBC council
 
8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine
8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine
8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine
LDBC council
 
8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...
8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...
8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...
LDBC council
 
8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...
8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...
8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...
LDBC council
 
Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...
Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...
Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...
LDBC council
 
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
LDBC council
 
8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...
8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...
8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...
LDBC council
 
8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...
8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...
8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...
LDBC council
 
8th TUC Meeting -
8th TUC Meeting - 8th TUC Meeting -
8th TUC Meeting -
LDBC council
 
8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...
8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...
8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...
LDBC council
 
8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...
8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...
8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...
LDBC council
 
8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...
8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...
8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...
LDBC council
 
LDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status updateLDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status update
LDBC council
 
LDBC 6th TUC Meeting conclusions
LDBC 6th TUC Meeting conclusionsLDBC 6th TUC Meeting conclusions
LDBC 6th TUC Meeting conclusions
LDBC council
 
Parallel and incremental materialisation of RDF/DATALOG in RDFOX
Parallel and incremental materialisation of RDF/DATALOG in RDFOXParallel and incremental materialisation of RDF/DATALOG in RDFOX
Parallel and incremental materialisation of RDF/DATALOG in RDFOX
LDBC council
 
MODAClouds Decision Support System for Cloud Service Selection
MODAClouds Decision Support System for Cloud Service SelectionMODAClouds Decision Support System for Cloud Service Selection
MODAClouds Decision Support System for Cloud Service Selection
LDBC council
 
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
LDBC council
 
LDBC SNB Benchmark Auditing
LDBC SNB Benchmark AuditingLDBC SNB Benchmark Auditing
LDBC SNB Benchmark Auditing
LDBC council
 
Social Network Benchmark Interactive Workload
Social Network Benchmark Interactive WorkloadSocial Network Benchmark Interactive Workload
Social Network Benchmark Interactive Workload
LDBC council
 

More from LDBC council (20)

8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...
8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...
8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ...
 
8th TUC Meeting - Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...
8th TUC Meeting -  Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...8th TUC Meeting -  Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...
8th TUC Meeting - Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph...
 
8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine
8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine
8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine
 
8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...
8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...
8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a...
 
8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...
8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...
8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edg...
 
Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...
Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...
Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr...
 
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
 
8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...
8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...
8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg...
 
8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...
8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...
8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF ...
 
8th TUC Meeting -
8th TUC Meeting - 8th TUC Meeting -
8th TUC Meeting -
 
8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...
8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...
8th TUC Meeting - David Meibusch, Nathan Hawes (Oracle Labs Australia). Frapp...
 
8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...
8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...
8th TUC Meeting - Martin Zand University of Rochester Clinical and Translatio...
 
8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...
8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...
8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics...
 
LDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status updateLDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status update
 
LDBC 6th TUC Meeting conclusions
LDBC 6th TUC Meeting conclusionsLDBC 6th TUC Meeting conclusions
LDBC 6th TUC Meeting conclusions
 
Parallel and incremental materialisation of RDF/DATALOG in RDFOX
Parallel and incremental materialisation of RDF/DATALOG in RDFOXParallel and incremental materialisation of RDF/DATALOG in RDFOX
Parallel and incremental materialisation of RDF/DATALOG in RDFOX
 
MODAClouds Decision Support System for Cloud Service Selection
MODAClouds Decision Support System for Cloud Service SelectionMODAClouds Decision Support System for Cloud Service Selection
MODAClouds Decision Support System for Cloud Service Selection
 
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
 
LDBC SNB Benchmark Auditing
LDBC SNB Benchmark AuditingLDBC SNB Benchmark Auditing
LDBC SNB Benchmark Auditing
 
Social Network Benchmark Interactive Workload
Social Network Benchmark Interactive WorkloadSocial Network Benchmark Interactive Workload
Social Network Benchmark Interactive Workload
 

Recently uploaded

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 

Recently uploaded (20)

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 

8th TUC Meeting – Marcus Paradies (SAP) Social Network Benchmark

  • 1. LDBC Social Network Benchmark Business Intelligence Workload Marcus Paradies, SAP SE 16/23/2016
  • 2. Task Force Members • Task Force members • Arnau Prat (DAMA-UPC) • Alex Averbuch (Neo Technologies) • Moritz Kaufmann (TU München) • Marcus Paradies (SAP) • Orri Erling (Google) • Peter Boncz (CWI) 26/23/2016
  • 3. Interactive and BI Workload • Interactive workload • Only touch a small fraction of the data graph (< 3 hops) • Queries usually start at a single person vertex • Can be seen as OLTP-style workload • Business intelligence workload • Focus on analytic queries (grouping/aggregations) • Queries touch the complete graph • Batch updates • Can be seen as OLAP-style workload • 24 queries in the BI workload • Interactive and BI workload share the benchmark driver and generated data • Workload-specific parameter bindings to reduce variability between queries 3 Benchmark Driver Data Generator Interactive Workload Business Intelligence Workload 6/23/2016
  • 4. Examples Queries Q1 - Posting Summary Given a date, find all Messages created before that date. Group them by a 3-level grouping: 1. by year of creation 2. for each year, group into message types, i.e., Posts or Comments 3. for each year-type group, split into four groups based on length of their content: 0 <= length < 40  short 40 <= length < 80  one liner 80 <= length < 160  tweet 160 <= length  long Parameters: date: Date 46/23/2016
  • 5. Examples Queries Q1 - Posting Summary 5 Result: (*) For every 3-level group, return: • year - 32-bit Integer • message type  post/comment • length category  (short/one-liner/tweet/long) • message count  total number of Messages (Posts/Comments) in that group • average message length  average length of the Message content in that group • sum message length  sum of all message content lengths • per messages  number of messages in a group as a percentage of all messages created before the given date * Final sorting and top k omitted here for brevity 6/23/2016
  • 6. Examples Queries Q16 - Experts in a Social Circle 6 Given a Person, find all other Persons that live in a given country and are connected to given person by a path through the knows relation. For each of these Persons, retrieve all of their Messages (Posts & Comments) that contain at least one Tag belonging to a given TagClass (direct relation not transitive). For each Message, also retrieve its Tags. Parameters: Person.id - 64-bit Integer Country.name - String TagClass.name - String 6/23/2016
  • 7. Examples Queries Q16 - Experts in a Social Circle 7 Grouping of results: First, by Tag.name Second, by Person.id Result: (*) For each group, return: • Person.id • Tag.name • post_count  number of Messages created by that Person containing that Tag * Final sorting and top k omitted here for brevity 6/23/2016
  • 8. SNB BI Workload in a Nutshell 6/23/2016 8 Q13 - Popular Tags per month in a country Q14 - Top thread initiators Q15 - Social Normals Q16 - Experts in Social Circle Q17 - Friend Triangles Q18 - How many persons have a given number of posts Q19 - Stranger's Interaction Q20 - High level topics Q21 - Zombies in a country Q22 - International Dialog Q23 - Holiday Destinations Q24 - Messages By Topic And Continent Q1 - Posting summary Q2 - Top tags for country, age, gender, time Q3 - Tag evolution Q4 - Popular topics in a country Q5 - Top posters in a country Q6 - Most active Posters of a given Topic Q7 - Most authoritative users on a given topic Q8 - Related Topics Q9 - Forum with related Tags Q10 - Central Person for a Tag Q11 - Unrelated Replies Q12 - Trending Posts
  • 9. Choke Point Identification 9 • Choke point = implementation challenge • Interactive workload already contains a number of choke points • We extend them by more choke points from TPC-H that are relevant for BI queries 6/23/2016
  • 10. Current Status of the BI Workload • Query Definition • All 24 queries are specified in plain english text • Benchmark Driver • Reads generated parameter bindings and issues queries • Query mix has to be defined • 22/24 queries are validated against Sparsity,Neo4j, and Postgres • Choke points identified for 8/24 queries 10 https://github.com/ldbc/ldbc_snb_implementations/tree/new_bi/bi/postgres/src/main/sql/postgres/queries/bi Link to Postgres draft reference implementation: 6/23/2016
  • 11. Outlook and Next Steps • Finishing of chokepoint identification and analysis of whether all defined chokepoints are already triggered • Finishing validation of remaining two queries • Addition of refresh data sets (update batches) • Definition • Implementation in the benchmark driver • Final polishing of query specifications 116/23/2016