EDBT
Summer School
2015
Badenes, Carlos
Garijo, Daniel
Priyatna, Freddy
Palamos, Spain
31/8 - 4/9 2015
The Venue
(where we thought we would be) (where we actually were)
Overview
Graph Data Management
Part I: Theoretical
- Notes about lectures
Part II: Practical
- Sparksee Technology
- Challenges
Part I: Theoretical
๏ Large Scale Graph Processing System - (C. Badenes)
Sherif Sakr - National ICT Australia
๏ Graph Visualization - (C. Badenes)
Peter Eades - University of Sydney
๏ Graph Data Management - (F. Priyatna)
Claudio Gutierrez - Universidad de Chile
๏ Applications of Flexible Querying to Graphs - (F. Priyatna)
Alexandra Poulovassilis - Birkbeck, University of London
๏ Graph Management Benchmarking - (F. Priyatna)
Peter Boncz - CWI and Vrije Universiteit Amsterdam
๏ Graph Algorithms - (D. Garijo)
Dennis Shasha - New York University
๏ Parallel Processing - (D. Garijo)
Bin Shao - Microsoft Research, Beijing
Graph Data Management
Dr. Claudio Gutierrez
Computer Science Department
Universidad de Chile
http://richard.cyganiak.de/blog/2006/06/perez-et-al-semantics-and-complexity-of-sparql/
(2-2, 1-1)
A general view of the main features of current graph databases
Graph Data Management
A hypernode is a directed graph whose nodes can themselves be graphs (or hypernodes), allowing nesting of graphs.
A property graph is a directed, labeled, attributed multigraph: edges are directed, both nodes and edges are labeled and can have any number of properties (or attributes), and there can be multiple edges between any two vertices.
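The property graph definition above can be sketched as a small in-memory data structure. This is only an illustration, not any graph database's API: the class `PropertyGraph`, its nested `Node` and `Edge` types, and the sample labels and properties are all hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative in-memory property graph: directed, labeled, attributed multigraph. */
final class PropertyGraph {

    /** A labeled node carrying any number of key/value properties. */
    static final class Node {
        final String label;
        final Map<String, Object> properties = new HashMap<>();

        Node(String label) {
            this.label = label;
        }
    }

    /** A directed, labeled edge carrying its own properties. */
    static final class Edge {
        final String label;
        final Node source;
        final Node target;
        final Map<String, Object> properties = new HashMap<>();

        Edge(String label, Node source, Node target) {
            this.label = label;
            this.source = source;
            this.target = target;
        }
    }

    private final List<Node> nodes = new ArrayList<>();
    // Edges are kept in a list, so several edges may join the same two nodes.
    private final List<Edge> edges = new ArrayList<>();

    Node addNode(String label) {
        Node node = new Node(label);
        nodes.add(node);
        return node;
    }

    Edge addEdge(String label, Node source, Node target) {
        Edge edge = new Edge(label, source, target);
        edges.add(edge);
        return edge;
    }

    /** Counts the edges that go from source to target; direction matters. */
    long countEdges(Node source, Node target) {
        return edges.stream()
                .filter(e -> e.source == source && e.target == target)
                .count();
    }

    public static void main(String[] args) {
        PropertyGraph g = new PropertyGraph();
        Node alice = g.addNode("User");
        alice.properties.put("nickname", "alice");
        Node bob = g.addNode("User");
        bob.properties.put("nickname", "bob");

        // Two parallel edges with different labels between the same nodes.
        g.addEdge("knows", alice, bob).properties.put("since", 2015);
        g.addEdge("follows", alice, bob);

        System.out.println(g.countEdges(alice, bob)); // prints 2
        System.out.println(g.countEdges(bob, alice)); // prints 0
    }
}
```

Storing edges in a list rather than in a map keyed by endpoint pair is what makes this a multigraph; the two counts show that edges are directed.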
Applications of Flexible Querying to Graphs
Dr. Alexandra Poulovassilis
Department of Computer Science and Information Systems,
Birkbeck, University of London
Reasoning in Event-Based Distributed Systems
Authors: Helmer, Sven; Poulovassilis, Alexandra; Xhafa, Fatos
Adapting to Change in Content, Size, Topology and Use
Editors: Levene, Mark; Poulovassilis, Alexandra (Eds.)
The Functional Approach to Data Management: Modeling, Analyzing and Integrating Heterogeneous Data
Editors: Gray, P.M.D.; Kerschberg, L.; King, P.J.H.; Poulovassilis, A. (Eds.)
Applications of Flexible Querying to Graphs
Query relaxation generally returns additional answers compared to the exact form of the database query.
Query approximation returns potentially different answers compared to the exact form of the query.
Q2 = SELECT * WHERE {
?x :actedIn :Tea_with_Mussolini .
RELAX ( ?x :hasFamilyName ?z ) }
Q3 = SELECT * WHERE {
?x :actedIn :Tea_with_Mussolini .
?x :label ?z . }
Q3.1 = SELECT * WHERE {
?x :actedIn :Tea_with_Mussolini .
?x :hasGivenName ?z . }
Q3.2 = SELECT * WHERE {
?x :actedIn :Tea_with_Mussolini .
?x :hasFamilyName ?z . }
Q1 = SELECT * WHERE {
APPROX ( :Battle_of_Waterloo :happenedIn/(:hasLongitude|:hasLatitude) ?x )
}
Q1.1 = SELECT * WHERE {
:Battle_of_Waterloo :hasLongitude ?x }
Q1.2 = SELECT * WHERE {
:Battle_of_Waterloo :hasLatitude ?x }
SPARQL^AR
hasFamilyName subPropertyOf label
hasGivenName subPropertyOf label
Graph Management Benchmarking
Dr. Peter Boncz
Centrum Wiskunde & Informatica (CWI)
Graph Management Benchmarking
Description: Given a start Person, find the Forums which that Person’s friends and friends of
friends (excluding start Person) became Members of after a given date. Return top 20 Forums,
and the number of Posts in each Forum that was Created by any of these Persons. For each
Forum consider only those Persons which joined that particular Forum after the given date. Sort
results descending by the count of Posts, and then ascending by Forum identifier.
Graph Motifs
Parallel Processing
Large Scale Graph Processing System
Dr. Sherif Sakr
Associate Professor at
College of Public Health and Health Informatics at
King Saud bin Abdul-Aziz University
“Big Data (Graph) Processing Systems
State-of-the-art and open challenges”
Large Scale Graph Processing System
Pregel Family
Bulk Synchronous Parallel (BSP) model
Valiant. A Bridging Model for Parallel Computation. Commun. ACM, 1990
GraphLab Family
Gather, Apply, Scatter (GAS) model
Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein
Distributed GraphLab: A Framework for Machine Learning in the Cloud. PVLDB,
Graph Visualization
Dr. Peter Eades
Research Professor at
School of Information Technologies at
The University of Sydney
Data -> (visualization function) -> Drawing -> (perception function) -> Human
faithful + readable
Graph Visualization
Topology-Shape-Metric approach
Energy-based approach
Clustered Planarity
Multilevel methods: scaling to large graphs
Fast Approximations: scaling to large graphs
Part II: Practical
๏ Sparksee - Sparsity Technologies (C.Badenes)
Universitat Politècnica de Catalunya
๏ Challenges :: OEG-Team - (D.Garijo)
Similarities between Wikipedia Articles
Sparksee
Schema
// User node type with a unique "nickname" attribute
int nodeUser = graph.newNodeType("User");
int userNickName = graph.newAttribute(nodeUser, "nickname",
        DataType.String, AttributeKind.Unique);
// Directed 'knows' edge type
int edgeKnows = graph.newEdgeType("knows", true, true);
// User1
long user1 = graph.newNode(nodeUser);
graph.setAttribute(user1, userNickName, new Value().setString("User1"));
// User2 (not shown on the slide; added here so the edge below has a target)
long user2 = graph.newNode(nodeUser);
graph.setAttribute(user2, userNickName, new Value().setString("User2"));
// 'knows' edge from User1 to User2
long knows1 = graph.newEdge(edgeKnows, user1, user2);
Query: Get common Messages for the given Hashtags
// Find the OIDs of the Hashtags with the given hashtag texts (ht1, ht2).
int tag = g.findType("Tag");
int tagName = g.findAttribute(tag, "name");
long tag1 = g.findObject(tagName, new Value().setString(ht1));
long tag2 = g.findObject(tagName, new Value().setString(ht2));
// Retrieve the Messages tagged with each hashtag, then intersect the two collections.
int tags = g.findType("tags");
Objects msgs1 = g.neighbors(tag1, tags, EdgesDirection.Ingoing);
Objects msgs2 = g.neighbors(tag2, tags, EdgesDirection.Ingoing);
long nums = msgs1.intersection(msgs2);
Challenge
Similarities in Wikipedia
- Description
- To Evaluate
  - The design
  - A good proof of functionality
  - The efficiency, in terms of computation time
  - The originality of the proposed method
- Technical prerequisites of participants
  - Basic programming skills
  - To be familiar with some graph library
- Technical support provided to participants
  - English Wikipedia data (dump):
    - articles_ids.csv
    - articles_links.csv
    - articles_body.csv
    - articles_redirect.csv
    - categories_ids.csv
    - articles_category.csv
    - categories_relations.csv
Problem
Similarity between Wikipedia Articles
Wikipedia Article:
text
links
categories
Hypothesis
Wikipedia Article:
text -> simTxt (weight α)
links -> simLinks (weight β)
categories -> simCtg (weight ɣ)
simWA(R1,R2) = α·simTxt(R1,R2) + β·simLinks(R1,R2) + ɣ·simCtg(R1,R2)
where α+β+ɣ=1
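The weighted sum above can be written as a small function. This is a sketch under stated assumptions: the class and method names (`ArticleSimilarity`, `combine`) are hypothetical, and the sample inputs are one set of component scores reported on the Proof of Concept slide (simTxt = 0.980, simLinks = 0.175, simCtg = 0.217) with weights of 0.6 for simTxt and 0.2 each for simLinks and simCtg.

```java
import java.util.Locale;

/** Illustrative weighted combination of the three article similarities. */
final class ArticleSimilarity {

    /**
     * simWA = alpha * simTxt + beta * simLinks + gamma * simCtg,
     * where the three weights must sum to 1.
     */
    static double combine(double simTxt, double simLinks, double simCtg,
                          double alpha, double beta, double gamma) {
        if (Math.abs(alpha + beta + gamma - 1.0) > 1e-9) {
            throw new IllegalArgumentException("weights must sum to 1");
        }
        return alpha * simTxt + beta * simLinks + gamma * simCtg;
    }

    public static void main(String[] args) {
        double simWA = combine(0.980, 0.175, 0.217, 0.6, 0.2, 0.2);
        // 0.588 + 0.035 + 0.0434 = 0.6664
        System.out.println(String.format(Locale.ROOT, "%.3f", simWA)); // prints 0.666
    }
}
```

The result matches the combined value of 0.666 listed for that set of scores on the Proof of Concept slide.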
Similarity based on Text
Latent Dirichlet Allocation
Topics: TOPIC_1, TOPIC_2, …, TOPIC_n
Ri: p = [0.5, 0.3, .., 0.7]
Rj: q = [0.2, 0.4, .., 0.9]
Similarity based on Categories
Articles with multiple common categories are likely to be similar
Noise filtering is necessary (e.g., “All articles lacking in-text citations”).
See https://github.com/cbadenes/siminwikart-challenge4/blob/master/category/wikipedia_bad_categories.txt
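A minimal sketch of category-based similarity with noise filtering follows. The slides do not state the formula behind simCtg, so the Jaccard index used here is an assumption, as are the names `CategorySimilarity` and `simCtg` and the sample category sets. Only the noise category itself comes from the slide.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Illustrative category similarity: Jaccard index after dropping noisy categories. */
final class CategorySimilarity {

    /** Assumption: simCtg is the Jaccard index of the filtered category sets. */
    static double simCtg(Set<String> categoriesA, Set<String> categoriesB, Set<String> noise) {
        Set<String> a = new HashSet<>(categoriesA);
        a.removeAll(noise);
        Set<String> b = new HashSet<>(categoriesB);
        b.removeAll(noise);

        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0; // no informative categories on either side
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return (double) common.size() / union.size();
    }

    public static void main(String[] args) {
        String maintenance = "All articles lacking in-text citations";
        Set<String> noise = new HashSet<>(Arrays.asList(maintenance));

        // Hypothetical category sets for two articles.
        Set<String> a = new HashSet<>(Arrays.asList("Racing drivers", "Living people", maintenance));
        Set<String> b = new HashSet<>(Arrays.asList("Footballers", "Living people", maintenance));

        // Unfiltered: 2 common categories out of 4 distinct ones.
        System.out.println(simCtg(a, b, Collections.<String>emptySet()));
        // Filtered: 1 common category out of 3 distinct ones.
        System.out.println(simCtg(a, b, noise));
    }
}
```

In this example the shared maintenance category inflates the unfiltered score from one third to one half, which is the effect the noise list is meant to remove.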
Similarity based on Links
Sim(A,B) = |links(A) ∩ links(B)| / ( |links(A) ∪ links(B)| / 2 )
Articles with multiple common links are likely to be similar
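The link-based measure can be sketched directly from the formula on this slide, reading the set expressions as set sizes. The class and method names (`LinkSimilarity`, `simLinks`) and the sample link sets are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative link similarity: shared links divided by half the size of the union. */
final class LinkSimilarity {

    static double simLinks(Set<String> linksA, Set<String> linksB) {
        Set<String> union = new HashSet<>(linksA);
        union.addAll(linksB);
        if (union.isEmpty()) {
            return 0.0; // neither article has links
        }
        Set<String> common = new HashSet<>(linksA);
        common.retainAll(linksB);
        return common.size() / (union.size() / 2.0);
    }

    public static void main(String[] args) {
        // Hypothetical outgoing links of two articles.
        Set<String> a = new HashSet<>(Arrays.asList("Madrid", "Argentina", "Chile", "Barcelona"));
        Set<String> b = new HashSet<>(Arrays.asList("Madrid", "Argentina", "Tokyo", "Japan"));

        // 2 shared links, 6 distinct links: 2 / (6 / 2)
        System.out.println(simLinks(a, b));
    }
}
```

Taken literally, the formula equals twice the Jaccard index, so two articles with identical non-empty link sets score 2 rather than 1. If scores in [0, 1] are needed before weighting, dividing by the full union size would be one option.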
Proof of Concept
Articles compared: Fernando Alonso, Lionel Messi, Iker Casillas, Princess Akiko
Weights: α = 0.6 (simTxt), β = 0.2 (simLinks), ɣ = 0.2 (simCtg)
Combined similarity per article pair:
[1] 0.062, [3] 0.075
[1] 0.666, [3] 0.683
[1] 0.058, [3] 0.069
[1] 0.043, [3] 0.072
[1] 0.019, [3] 0.023
[1] 0.068, [3] 0.069
Component similarities per article pair:
simTxt = 0.059, simLinks = 0.019, simCtg = [1] 0.117, [3] 0.181
simTxt = 0.065, simLinks = 0.0, simCtg = [1] 0.095, [3] 0.161
simTxt = 0.052, simLinks = 0.019, simCtg = [1] 0.166, [3] 0.172
simTxt = 0.980, simLinks = 0.175, simCtg = [1] 0.217, [3] 0.302
simTxt = 0.060, simLinks = 0.008, simCtg = [1] 0.030, [3] 0.172
simTxt = 0.069, simLinks = 0.004, simCtg = [1] 0.080, [3] 0.134
Comparison
Lionel Messi vs. Princess Akiko
simTxt = 0.060 -> <common words>
simLinks = 0.008 -> (England, Buenos_Aires, Chile, Madrid, Argentina)
simCtg = [1] 0.030 -> living_person
Proposal
Graph based on Links Graph based on Similarities
Problem
Wikipedia links reliability (missing links)
Wikipedia Article:
text
links
categories
Further Refinement
Similarities between categories (as topics) can define relations between articles
Graph based on Links, Graph based on Similarities
Subgraph Pattern Matching + Topic Model
Code
https://github.com/cbadenes/siminwikart-challenge4
Happy Ending
Kitkat Time
• Suggestions?
• Name for the system?
• Contributors?
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
 
WIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting OntologiesWIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting Ontologies
 
Towards Automating Data Narratives
Towards Automating Data NarrativesTowards Automating Data Narratives
Towards Automating Data Narratives
 
Automated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific WorkflowsAutomated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific Workflows
 
OntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific SoftwareOntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific Software
 
OEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology EngineeringOEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology Engineering
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciences
 
Reproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overview
 

EDBT 2015: Summer School Overview

  • 1. EDBT Summer School 2015. Badenes, Carlos; Garijo, Daniel; Priyatna, Freddy. Palamos, Spain, 31/8 - 4/9 2015
  • 2. The Venue: (where we thought we would be) vs. (where we actually were)
  • 3. Overview: Graph Data Management. Part I: Theoretical (notes about lectures). Part II: Practical (Sparksee technology, challenges)
  • 4. Part I: Theoretical ๏ Large Scale Graph Processing System (C. Badenes): Sherif Sakr, National ICT Australia ๏ Graph Visualization (C. Badenes): Peter Eades, University of Sydney ๏ Graph Data Management (F. Priyatna): Claudio Gutierrez, Universidad de Chile ๏ Applications of Flexible Querying to Graphs (F. Priyatna): Alexandra Poulovassilis, Birkbeck, University of London ๏ Graph Management Benchmarking (F. Priyatna): Peter Boncz, CWI and Vrije Universiteit Amsterdam ๏ Graph Algorithms (D. Garijo): Dennis Shasha, New York University ๏ Parallel Processing (D. Garijo): Bin Shao, Microsoft Research, Beijing
  • 5. Graph Data Management: Dr. Claudio Gutierrez, Computer Science Department, Universidad de Chile. http://richard.cyganiak.de/blog/2006/06/perez-et-al-semantics-and-complexity-of-sparql/ (2-2, 1-1)
  • 6. Graph Data Management: a general view of the main features of current graph databases. A hypernode is a directed graph whose nodes can themselves be graphs (or hypernodes), allowing nesting of graphs. A property graph is a directed, labelled, attributed multigraph. That is, a graph where the edges are directed, both nodes and edges are labelled and can have any number of properties (or attributes), and there can be multiple edges between any two vertices.
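The property graph definition on slide 6 can be made concrete with a small data structure. The following is only a minimal sketch under stated assumptions: the class name `PropertyGraphSketch`, its methods, and the sample labels are hypothetical and do not come from the lecture or from any graph database API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory property graph; all names here are illustrative only.
class PropertyGraphSketch {
    // A node carries a label and any number of key/value properties.
    static final class Node {
        final String label;
        final Map<String, Object> properties = new HashMap<>();
        Node(String label) { this.label = label; }
    }

    // An edge is directed (source -> target), labelled, and attributed like a node.
    static final class Edge {
        final Node source;
        final Node target;
        final String label;
        final Map<String, Object> properties = new HashMap<>();
        Edge(Node source, Node target, String label) {
            this.source = source;
            this.target = target;
            this.label = label;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    // Edges are kept in a list, not keyed by endpoints, so parallel edges are allowed.
    final List<Edge> edges = new ArrayList<>();

    Node addNode(String label) {
        Node n = new Node(label);
        nodes.add(n);
        return n;
    }

    Edge addEdge(Node source, Node target, String label) {
        Edge e = new Edge(source, target, label);
        edges.add(e);
        return e;
    }

    // All edges going from source to target; direction matters.
    List<Edge> edgesBetween(Node source, Node target) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.source == source && e.target == target) result.add(e);
        }
        return result;
    }

    public static void main(String[] args) {
        PropertyGraphSketch g = new PropertyGraphSketch();
        Node u1 = g.addNode("User");
        Node u2 = g.addNode("User");
        u1.properties.put("nickname", "User1");
        g.addEdge(u1, u2, "knows").properties.put("since", 2015);
        g.addEdge(u1, u2, "follows"); // second edge between the same pair
        System.out.println(g.edgesBetween(u1, u2).size());
    }
}
```

Storing edges as first-class objects is what lets them carry their own label and properties, and is also what permits several edges between the same two vertices.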
  • 7. Applications of Flexible Querying to Graphs: Dr. Alexandra Poulovassilis, Department of Computer Science and Information Systems, Birkbeck, University of London. "Reasoning in Event-Based Distributed Systems" (Authors: Helmer, Sven; Poulovassilis, Alexandra; Xhafa, Fatos). "Adapting to Change in Content, Size, Topology and Use" (Editors: Levene, Mark; Poulovassilis, Alexandra). "The Functional Approach to Data Management: Modeling, Analyzing and Integrating Heterogeneous Data" (Editors: Gray, P.M.D.; Kerschberg, L.; King, P.J.H.; Poulovassilis, A.)
  • 8. Applications of Flexible Querying to Graphs (SparqlAR)
      Query relaxation generally returns additional answers compared to the exact form of the database query. Query approximation returns potentially different answers compared to the exact form of the query.
      Relaxation (hasFamilyName and hasGivenName are each subPropertyOf label):
        Q2 = SELECT * WHERE { ?x :actedIn :Tea_with_Mussolini . RELAX ( ?x :hasFamilyName ?z ) }
        Q3 = SELECT * WHERE { ?x :actedIn :Tea_with_Mussolini . ?x :label ?z . }
        Q3.1 = SELECT * WHERE { ?x :actedIn :Tea_with_Mussolini . ?x :hasGivenName ?z . }
        Q3.2 = SELECT * WHERE { ?x :actedIn :Tea_with_Mussolini . ?x :hasFamilyName ?z . }
      Approximation:
        Q1 = SELECT * WHERE { APPROX ( :Battle_of_Waterloo :happenedIn/(:hasLongitude|:hasLatitude) ?x ) }
        Q1.1 = SELECT * WHERE { :Battle_of_Waterloo :hasLongitude ?x }
        Q1.2 = SELECT * WHERE { :Battle_of_Waterloo :hasLatitude ?x }
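To illustrate why relaxation on slide 8 returns additional answers, here is a minimal sketch that replaces a property by its super-property and matches triples under the subPropertyOf hierarchy shown on the slide. It is not the SparqlAR implementation: the class `RelaxationSketch`, its methods, and the sample triples (`actorA`, `"Family"`, `"Given"`) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical one-step relaxation over a subPropertyOf hierarchy.
class RelaxationSketch {
    // child property -> parent property, as drawn on the slide
    static final Map<String, String> SUPER_PROPERTY = new HashMap<>();
    static {
        SUPER_PROPERTY.put("hasFamilyName", "label");
        SUPER_PROPERTY.put("hasGivenName", "label");
    }

    // Relaxing a property means moving one step up the hierarchy (or staying put at the top).
    static String relax(String property) {
        String parent = SUPER_PROPERTY.get(property);
        return parent == null ? property : parent;
    }

    // True if a triple using 'actual' also counts as a triple using 'wanted'.
    static boolean entails(String actual, String wanted) {
        for (String p = actual; p != null; p = SUPER_PROPERTY.get(p)) {
            if (p.equals(wanted)) return true;
        }
        return false;
    }

    // Objects o such that (subject, property, o) holds, taking sub-properties into account.
    static List<String> objects(List<String[]> triples, String subject, String property) {
        List<String> result = new ArrayList<>();
        for (String[] t : triples) {
            if (t[0].equals(subject) && entails(t[1], property)) result.add(t[2]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {"actorA", "hasFamilyName", "Family"});
        triples.add(new String[] {"actorA", "hasGivenName", "Given"});
        // Exact pattern: only the family name matches.
        System.out.println(objects(triples, "actorA", "hasFamilyName"));
        // Relaxed pattern: both names match through 'label'.
        System.out.println(objects(triples, "actorA", relax("hasFamilyName")));
    }
}
```

The relaxed pattern matches every triple the exact pattern matched, plus those reachable through the sibling sub-property, which is the sense in which relaxation only adds answers.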
  • 9. Graph Management Benchmarking: Dr. Peter Boncz, Centrum Wiskunde & Informatica (CWI)
  • 10. Graph Management Benchmarking. Description: Given a start Person, find the Forums that the Person's friends and friends of friends (excluding the start Person) became Members of after a given date. Return the top 20 Forums, and the number of Posts in each Forum that were created by any of these Persons. For each Forum, consider only those Persons who joined that particular Forum after the given date. Sort results descending by the count of Posts, and then ascending by Forum identifier.
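The benchmark query described on slide 10 can be traced step by step over in-memory collections. This is only a sketch of the query's logic, not a benchmark implementation: the class `ForumQuerySketch`, the `Membership` and `Post` records, and the use of plain `long` values for identifiers and dates are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory evaluation of the forum query.
class ForumQuerySketch {
    static final class Membership {
        final long person, forum, joinDate;
        Membership(long person, long forum, long joinDate) {
            this.person = person; this.forum = forum; this.joinDate = joinDate;
        }
    }

    static final class Post {
        final long forum, creator;
        Post(long forum, long creator) { this.forum = forum; this.creator = creator; }
    }

    // Returns rows of {forumId, postCount}. 'knows' is an adjacency map listing both directions.
    static List<long[]> topForums(Map<Long, Set<Long>> knows, List<Membership> memberships,
                                  List<Post> posts, long start, long minDate) {
        // 1. Friends and friends of friends, excluding the start person.
        Set<Long> circle = new HashSet<>();
        for (long friend : knows.getOrDefault(start, Collections.emptySet())) {
            circle.add(friend);
            circle.addAll(knows.getOrDefault(friend, Collections.emptySet()));
        }
        circle.remove(start);

        // 2. Per forum, the persons of the circle who joined after the given date.
        Map<Long, Set<Long>> joined = new HashMap<>();
        for (Membership m : memberships) {
            if (m.joinDate > minDate && circle.contains(m.person)) {
                joined.computeIfAbsent(m.forum, k -> new HashSet<>()).add(m.person);
            }
        }

        // 3. Count posts in each forum created by those late joiners of that same forum.
        Map<Long, Long> counts = new HashMap<>();
        for (long forum : joined.keySet()) counts.put(forum, 0L);
        for (Post p : posts) {
            Set<Long> members = joined.get(p.forum);
            if (members != null && members.contains(p.creator)) counts.merge(p.forum, 1L, Long::sum);
        }

        // 4. Sort by post count descending, then forum id ascending; keep the top 20.
        List<long[]> rows = new ArrayList<>();
        for (Map.Entry<Long, Long> e : counts.entrySet()) rows.add(new long[] {e.getKey(), e.getValue()});
        rows.sort((a, b) -> a[1] != b[1] ? Long.compare(b[1], a[1]) : Long.compare(a[0], b[0]));
        return new ArrayList<>(rows.subList(0, Math.min(20, rows.size())));
    }
}
```

Note that a forum with qualifying members but no qualifying posts still appears with a count of 0, which is one possible reading of the description.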
  • 16. Large Scale Graph Processing System: Dr. Sherif Sakr, Associate Professor at the College of Public Health and Health Informatics, King Saud bin Abdul-Aziz University. “Big Data (Graph) Processing Systems: State-of-the-art and open challenges”
  • 17. Large Scale Graph Processing System. Pregel family: Bulk Synchronous Parallel (BSP) model. Valiant. A Bridging Model for Parallel Computation. Commun. ACM, 1990. GraphLab family: Gather, Apply, Scatter (GAS) model. Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A Framework for Machine Learning in the Cloud. PVLDB,
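The BSP model named on slide 17 alternates local computation, message exchange, and a global barrier. The sketch below simulates that cycle in a single process by propagating the maximum vertex value through a graph. It is an assumption-laden illustration, not Pregel's API: the class `BspSketch`, the method `propagateMax`, and the choice of max propagation as the example are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-process simulation of BSP supersteps (max-value propagation).
class BspSketch {
    // values[v] is the initial value of vertex v; neighbors[v] lists the vertices v sends to.
    static int[] propagateMax(int[] values, int[][] neighbors) {
        int n = values.length;
        int[] state = values.clone();

        // inbox.get(v) holds the messages delivered to v for the current superstep.
        List<List<Integer>> inbox = new ArrayList<>();
        for (int v = 0; v < n; v++) inbox.add(new ArrayList<>());

        for (int superstep = 0; ; superstep++) {
            List<List<Integer>> outbox = new ArrayList<>();
            for (int v = 0; v < n; v++) outbox.add(new ArrayList<>());
            boolean sent = false;

            // Local computation: each vertex only sees its own state and its inbox.
            for (int v = 0; v < n; v++) {
                boolean changed = false;
                for (int message : inbox.get(v)) {
                    if (message > state[v]) {
                        state[v] = message;
                        changed = true;
                    }
                }
                // A vertex speaks in superstep 0 and afterwards only when its value changed.
                if (superstep == 0 || changed) {
                    for (int u : neighbors[v]) {
                        outbox.get(u).add(state[v]);
                        sent = true;
                    }
                }
            }

            // Barrier: messages become visible only in the next superstep.
            if (!sent) break; // no messages in flight, so every vertex has halted
            inbox = outbox;
        }
        return state;
    }

    public static void main(String[] args) {
        int[] values = {3, 6, 2, 1};
        int[][] chain = {{1}, {0, 2}, {1, 3}, {2}};
        System.out.println(java.util.Arrays.toString(propagateMax(values, chain)));
    }
}
```

Swapping `outbox` into `inbox` at the end of each iteration plays the role of the synchronization barrier: no vertex can read a message sent in the same superstep.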
  • 18. Graph Visualization: Dr. Peter Eades, Research Professor at the School of Information Technologies, The University of Sydney. Data, Drawing, Human; Visualization Function, Perception Function; faithful + readable
  • 19. Graph Visualization: Topology-Shape-Metric approach; Energy-based approach; Clustered Planarity; Multilevel methods; Fast Approximations; scaling to large graphs
  • 20. Part II: Practical ๏ Sparksee, Sparsity Technologies (C. Badenes): Universitat Politècnica de Catalunya ๏ Challenges :: OEG-Team (D. Garijo): Similarities between Wikipedia Articles
  • 22. Sparksee
      Schema:
        // User node type with a unique "nickname" attribute
        int nodeUser = graph.newNodeType("User");
        int userNickName = graph.newAttribute(nodeUser, "nickname", DataType.String, AttributeKind.Unique);
        // knows edge type
        int edgeKnows = graph.newEdgeType("knows", true, true);
        // User1
        long user1 = graph.newNode(nodeUser);
        graph.setAttribute(user1, userNickName, new Value().setString("User1"));
        // User2 (not shown on the slide; assumed to be created like User1)
        long user2 = graph.newNode(nodeUser);
        // edge 'knows'
        long knows1 = graph.newEdge(edgeKnows, user1, user2);
      Query: Get common Messages for the given Hashtags
        // Find the OIDs of the Hashtags with the given hashtag texts (ht1, ht2).
        int tag = g.findType("Tag");
        int tagName = g.findAttribute(tag, "name");
        long tag1 = g.findObject(tagName, new Value().setString(ht1));
        long tag2 = g.findObject(tagName, new Value().setString(ht2));
        // Retrieve the Messages carrying each hashtag and intersect the two collections.
        int tags = g.findType("tags");
        Objects msgs1 = g.neighbors(tag1, tags, EdgesDirection.Ingoing);
        Objects msgs2 = g.neighbors(tag2, tags, EdgesDirection.Ingoing);
        long nums = msgs1.intersection(msgs2);
  • 23. Challenge: Similarities in Wikipedia. Description. To evaluate: the design; a good proof of functionality; the efficiency, in terms of computation time; the originality of the proposed method. Technical prerequisites of participants: basic programming skills; familiarity with some graph library. Technical support provided to participants: English Wikipedia data (dump): articles_ids.csv, articles_links.csv, articles_body.csv, articles_redirect.csv, categories_ids.csv, articles_category.csv, categories_relations.csv
  • 24. Problem: Similarity between Wikipedia Articles. Wikipedia Article: text, links, categories
  • 25. Hypothesis: a Wikipedia Article has text, links and categories, giving simTxt, simLinks and simCtg, weighted by ɣ, α and β respectively: simWA(R1,R2) = α·simLinks(R1,R2) + β·simCtg(R1,R2) + ɣ·simTxt(R1,R2), where α+β+ɣ=1
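The weighted combination on slide 25 can be checked numerically. The sketch below uses the weight assignment given on the proof-of-concept slide (α = 0.2 for simLinks, β = 0.2 for simCtg, ɣ = 0.6 for simTxt) and one set of component values listed there (simTxt = 0.980, simLinks = 0.175, simCtg = 0.217). The class `WeightedSimilaritySketch` and the weight-sum check are assumptions for illustration, not the team's code.

```java
// Hypothetical helper for the weighted article similarity.
class WeightedSimilaritySketch {
    // alpha weights links, beta weights categories, gamma weights text; they must sum to 1.
    static double simWA(double simLinks, double simCtg, double simTxt,
                        double alpha, double beta, double gamma) {
        if (Math.abs(alpha + beta + gamma - 1.0) > 1e-9) {
            throw new IllegalArgumentException("weights must sum to 1");
        }
        return alpha * simLinks + beta * simCtg + gamma * simTxt;
    }

    public static void main(String[] args) {
        // Component values taken from the proof-of-concept slide.
        double s = simWA(0.175, 0.217, 0.980, 0.2, 0.2, 0.6);
        System.out.printf("%.4f%n", s);
    }
}
```

With these inputs the sum is 0.035 + 0.0434 + 0.588 = 0.6664, which agrees with the 0.666 figure listed on the proof-of-concept slide.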
  • 26. Similarity based on Text: Latent Dirichlet Allocation over topics TOPIC_1, TOPIC_2, …, TOPIC_n. Ri: p = [0.5, 0.3, .., 0.7]; Rj: q = [0.2, 0.4, .., 0.9]
  • 27. Similarity based on Categories: Articles with multiple common categories are likely to be similar. Noise filtering is necessary (e.g., “All articles lacking in-text citations”). See https://github.com/cbadenes/siminwikart-challenge4/blob/master/category/wikipedia_bad_categories.txt
  • 28. Similarity based on Links: Articles with multiple common links are likely to be similar. Sim(A,B) = |links(A) ∩ links(B)| / ( |links(A) ∪ links(B)| / 2 )
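The link-based measure on slide 28 can be computed directly on sets of link targets. The sketch below transcribes the formula literally, reading the set expressions as cardinalities. The class `LinkSimilaritySketch`, the treatment of two link-free articles as similarity 0, and the sample link names `X` and `Y` are assumptions. As transcribed, the score ranges over [0, 2]; whether the slide intended a different normalisation cannot be recovered here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical transcription of the slide's link similarity.
class LinkSimilaritySketch {
    static double simLinks(Set<String> linksA, Set<String> linksB) {
        Set<String> union = new HashSet<>(linksA);
        union.addAll(linksB);
        if (union.isEmpty()) return 0.0; // assumption: no links at all means no evidence

        Set<String> common = new HashSet<>(linksA);
        common.retainAll(linksB);

        // |A ∩ B| divided by half the size of the union, as written on the slide.
        return common.size() / (union.size() / 2.0);
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("Madrid", "Argentina", "Chile", "X"));
        Set<String> b = new HashSet<>(Arrays.asList("Madrid", "Y"));
        System.out.println(simLinks(a, b)); // 1 common link, 5 links in the union
    }
}
```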
  • 29. Proof of Concept: Fernando Alonso, Lionel Messi, Iker Casillas, Princess Akiko. Weights: α = 0.2 (simLinks), β = 0.2 (simCtg), ɣ = 0.6 (simTxt). Combined similarity per pair: [1]0.062 [3]0.075; [1]0.666 [3]0.683; [1]0.058 [3]0.069; [1]0.043 [3]0.072; [1]0.019 [3]0.023; [1]0.068 [3]0.069. Components per pair: simTxt = 0.059, simLinks = 0.019, simCtg = [1]0.117 [3]0.181; simTxt = 0.065, simLinks = 0.0, simCtg = [1]0.095 [3]0.161; simTxt = 0.052, simLinks = 0.019, simCtg = [1]0.166 [3]0.172; simTxt = 0.980, simLinks = 0.175, simCtg = [1]0.217 [3]0.302; simTxt = 0.060, simLinks = 0.008, simCtg = [1]0.030 [3]0.172; simTxt = 0.069, simLinks = 0.004, simCtg = [1]0.080 [3]0.134
  • 30. Comparison: Lionel Messi vs. Princess Akiko. simTxt = 0.060 -> <common words>; simLinks = 0.008 -> (England, Buenos_Aires, Chile, Madrid, Argentina); simCtg = [1]0.030 -> living_person
  • 31. Proposal: Graph based on Links; Graph based on Similarities
  • 32. Problem: Wikipedia links reliability (missing links). Wikipedia Article: text, links, categories
  • 33. Further Refinement: Similarities between categories (as topics) can define relations between articles. Graph based on Links + Graph based on Similarities + Subgraph Pattern Matching + Topic Model
  • 36. Kitkat Time • Suggestions? • Name for the system? • Contributors?