BotNetBM

          A Benchmark for Social Networks


                                       CWI
                            Project Meeting@Innsbruck
                              Feb 28 - Mar 04, 2011




Wednesday, March 02, 2011
Motivation
     —   Highly linked data

     —   No (good) benchmark yet for social
          networks




                            Feb 28 - Mar 4, 2011 ProjectMeeting@Innsbruck


BotNetBM
     —   A benchmark for social networks

     —   Simulates an RDF OLTP backend

     —   Simulates random activities of a large number of users

     —   Simulates on-site “analyst” ➠ weekly
          “analytic report”

     —   One parameter: scale (#user accounts to
          start BM)


BotNetBM Queries
     —   SPARQL 1.1 + SPARUL

     —   User Actions

          ◦ Interactive queries (80%)

          ◦ Update transactions (20%)

     —   Measurement: successful #clicks/min.

          ◦ Transactions commit, penalty for > 3 sec.

          ◦ Interactive queries response time < 3 sec.

     —   Analytic queries (must finish within simulated weekend)
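The measurement rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how large the penalty for slow operations is, so the sketch simply does not count any operation that takes longer than 3 seconds, and it does not count update transactions that fail to commit. The `Click` record and the function name are hypothetical.

```python
from dataclasses import dataclass

LIMIT_SECONDS = 3.0  # response-time limit taken from the slide


@dataclass
class Click:
    kind: str               # "query" (interactive) or "update" (transaction)
    latency: float          # observed response time in seconds
    committed: bool = True  # only meaningful for updates


def successful_clicks_per_minute(clicks, elapsed_seconds):
    """Count clicks that met the limit and scale the count to one minute.

    Assumption: a click slower than LIMIT_SECONDS, or an update that did
    not commit, counts as unsuccessful. The real penalty may differ.
    """
    successful = 0
    for click in clicks:
        if click.kind == "update" and not click.committed:
            continue  # aborted transaction
        if click.latency > LIMIT_SECONDS:
            continue  # too slow to count
        successful += 1
    return successful * 60.0 / elapsed_seconds
```

For a run of four clicks in 60 seconds where one query is too slow and one update aborts, the function reports 2 successful clicks per minute.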



Limitations
     —   Data generator: too uniform, not realistic for social networks

          ◦ 10 operations / user / simulated day

          ◦ all users are equally active

          ◦ some queries have no “meaningful” relation to each other

          ◦ read/write contention unrealistically frequent
          ◦ ...

     —   Query mix:

          ◦ Does not exploit SPARQL 1.1 advanced features
          ◦ No link to other RDF datasets

     —   Queries do not run with the open source ed. of Virtuoso Server



Our Goals
     —   Exploit SPARQL 1.1 features in queries

          ◦ “Property Path Expressions”

     —   Add links to well-known RDF data sets to the queries

          ◦ DBpedia

     —   Use real-life analysis info (e.g., Twitter)

          ◦ redesign data generator

          ◦ distribution of interactive/update queries

     —   Use real-life social network data

          ◦ Twitter, Facebook, Orkut, MySpace, ...

     —   Migration to MonetDB



Done
     —   Loaded into the Virtuoso Server (commercial ed.)

     —   Design of new query mix

     —   Twitter datasets

          ◦ http://infochimps.com/collections/twitter-census

          ◦ http://an.kaist.ac.kr/traces/WWW2010.html

          ◦ http://snap.stanford.edu/data/twitter7.html

          ◦ http://twitter.mpi-sws.org/

     —   Analysis information

          ◦ “The Man Your Man Could Smell Like: Twitter Analytics Report”

          ◦ “Characterizing user behavior in online social networks”

          ◦ “User Interactions in Social Networks and their Implications”




Interactive & Analytic Queries

     Q1 - Q8: Information of Profiles & Friends
     1.   Find all users whose first names contain a particular string, e.g., “Minh”.

     2.   Return the names of people who study in the same school and have the same age as a user. These
          people can be the classmates of the user.

     3.   Find people who studied at the same school and are connected to you by a path of friend relationships.
          (Use a SPARQL 1.1 “Property Path Expression” with an arbitrary-length path.)

     4.   Find all friends who like an action movie starring Tom Cruise. (Use the information from DBpedia
          for the movie and the actor Tom Cruise.)

     5.   Find all people living in a specific location, e.g., Amsterdam, who can be reached from a user in at most
          3 friend-relationship steps.

     6.   Show all of your friends who live in Europe. (Use the information from DBpedia. For example,
          Amsterdam is a city in Europe, London is a city in Europe.)

     7.   Find the top-10 suggested friends for a user: people who are currently not the user's friends but are friends
          of many of the user's friends. (Get all friends of the user's friends and order them by the number of
          people in the user's friend list who are connected to them.)

     8.   Return all users who have not joined a specific group but have more than 5 friends who joined the group.
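To make the friend-suggestion query (Q7) concrete, here is a minimal Python sketch of the same logic on an in-memory friendship graph. It is not the benchmark's SPARQL query; the dictionary representation, the function name, and the alphabetical tie-break are assumptions made for illustration.

```python
from collections import Counter


def suggest_friends(friends, user, k=10):
    """Rank non-friends of `user` by the number of mutual friends.

    `friends` maps each user name to the set of that user's friends
    (a hypothetical in-memory stand-in for the RDF friendship graph).
    """
    mine = friends.get(user, set())
    mutual = Counter()
    for friend in mine:
        for candidate in friends.get(friend, set()):
            # Skip the user and people who are already friends.
            if candidate != user and candidate not in mine:
                mutual[candidate] += 1
    # Most mutual friends first; ties broken by name (an assumption).
    ranked = sorted(mutual.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]
```

With `a` befriending `b` and `c`, where both `b` and `c` know `d` and only `b` knows `e`, the sketch suggests `d` before `e`.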





Interactive & Analytic Queries

     Q9 - Q14: Posts or Tweets
     9.   Show the 10 latest posts/tweets from your friends or friends of your friends. (Order by posting time.)

     10. Show active posts/tweets: the 10 most recently commented posts/tweets from your friends. (Order
         by the timestamp of the last comment on each post.)

     11. Return the top-10 most interesting posts from your friends: order first by the number of
         “likes” (or, in Twitter, the number of “re-tweets”) on the posts, then
         by the number of comments.

     12. Return all posts about an event (e.g., unrest in Tunisia) in the last 10 days. (Based on the
         hash tags if they are available. If the post has no tags, check whether its content
         contains the search terms for the event.)

     13. Show all posts about a specific location, e.g., Egypt, in the last 10 days. (Use the information
         from DBpedia to identify the location of the post. For example, Cairo is the capital of Egypt,
         Tahrir Square is in Cairo.)

     14. Find the number of inactive users: users whose accounts were activated at least 30 days ago but who
         have never posted, or users who have not posted for 60 days.
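The two inactivity conditions in Q14 can be written out as a short Python sketch. The data layout (a dictionary from user name to activation time and post times) and the function name are hypothetical, and treating exactly 30 or 60 days as already inactive is an assumption the slides do not settle.

```python
from datetime import datetime, timedelta


def count_inactive(users, now):
    """Count users matching either inactivity rule of Q14.

    `users` maps a user name to (activated_at, list_of_post_times).
    Rule 1: no posts at all and activated at least 30 days ago.
    Rule 2: has posts, but the latest one is at least 60 days old.
    """
    inactive = 0
    for activated_at, posts in users.values():
        if not posts:
            if now - activated_at >= timedelta(days=30):
                inactive += 1
        elif now - max(posts) >= timedelta(days=60):
            inactive += 1
    return inactive
```

A user who registered 10 days ago and has not posted yet is therefore not counted, while a long-standing user whose last post is three months old is.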





Interactive & Analytic Queries

     Q15 - Q17: Hash tags
     15. Show all photos posted by my friends in which I am tagged.

     16. Find the top-10 friends or friends of friends who share common
        interests with you. (Based on the similarity between the tags in
        your posts and the tags in their posts.)

     17. What are the current hottest events/problems? (Get the hash tags
        from posts and order them by the number of appearances in the
        last 10 days.)
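The hottest-events query (Q17) boils down to counting hash tags inside a time window. The following Python sketch shows that counting step on plain (timestamp, text) pairs; the regular expression, the case-insensitive matching, and the function name are assumptions rather than part of the benchmark definition.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

HASHTAG = re.compile(r"#(\w+)")  # simplistic hash tag pattern (assumption)


def hottest_tags(posts, now, days=10, k=10):
    """Return the k most frequent hash tags in posts from the last `days` days.

    `posts` is an iterable of (posted_at, text) pairs. Tags are compared
    case-insensitively, which is a choice of this sketch.
    """
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for posted_at, text in posts:
        if cutoff <= posted_at <= now:
            counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(k)
```

Posts older than the window contribute nothing, so a tag that was popular two months ago does not appear in the result.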






Interactive & Analytic Queries

     Q18 - Q19: other information
     18. Which area is the most active? (Order by the total number of
        posts in each location in the last 5 days.)

     19. Return the top-10 locations with the fastest growth in the
        number of users. (Count the number of people who joined more than 10
        days ago and those who joined during the last 10 days, then
        compute the growth rate.)
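One way to read the growth computation in Q19 is as the ratio of recent joiners to earlier joiners per location. The Python sketch below follows that reading; the ratio definition, the decision to skip locations that had no users before the window (where the ratio is undefined), and all names are assumptions.

```python
from datetime import datetime, timedelta


def fastest_growing(users, now, days=10, k=10):
    """Rank locations by (users joined in the last `days` days) / (users joined earlier).

    `users` is an iterable of (location, joined_at) pairs. Locations with
    no earlier users are skipped because the ratio is undefined there.
    """
    cutoff = now - timedelta(days=days)
    before, recent = {}, {}
    for location, joined_at in users:
        if joined_at < cutoff:
            before[location] = before.get(location, 0) + 1
        elif joined_at <= now:
            recent[location] = recent.get(location, 0) + 1
    rates = {loc: recent.get(loc, 0) / count for loc, count in before.items()}
    # Highest rate first; ties broken by location name (an assumption).
    ranked = sorted(rates.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

A location that doubles its user base in the window gets a rate of 1.0 under this definition; other definitions (for example, percentage change) would rank the same locations in the same order.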






SPARQL/Update Queries
     1. Update user profile

     2. Posts/Tweets:

           2.1. Add a post (Popularity: high)

           2.2. Remove a post (Popularity: low)

           2.3. Add tags for your friends

           2.4. Add/Remove a comment

     3. Friends

           3.1. Add a friend (Popularity: high)

           3.2. Remove a friend (Popularity: low)



SPARQL/Update Queries
     4. Group, Event

           4.1. Join/Leave a group/event

           4.2. Add/Delete post in the group/event

     5. Photos

           5.1. Add/Delete a photo

           5.2. Add/Remove tags in the photo

           5.3. Add/Remove a comment
           5.4. Remove tags of me from all of my friends' pictures
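As an illustration of what the friend updates (3.1 and 3.2) might look like as SPARQL 1.1 Update requests, the Python sketch below builds `INSERT DATA` and `DELETE DATA` strings. The slides do not give the benchmark's vocabulary, so the use of `foaf:knows`, the example IRIs, and the helper names are assumptions; the sketch also writes only one direction of the friendship.

```python
import re

FOAF_PREFIX = "PREFIX foaf: <http://xmlns.com/foaf/0.1/>"

# Characters that may not appear inside <...> in a SPARQL IRI reference.
_SAFE_IRI = re.compile(r'[^<>"{}|^`\\\x00-\x20]+')


def _checked(iri):
    """Reject IRIs that could break out of the <...> delimiters."""
    if not _SAFE_IRI.fullmatch(iri):
        raise ValueError(f"unsafe IRI: {iri!r}")
    return iri


def _knows_triple(user_iri, friend_iri):
    # foaf:knows is an assumed predicate for the friend relationship.
    return f"<{_checked(user_iri)}> foaf:knows <{_checked(friend_iri)}> ."


def add_friend(user_iri, friend_iri):
    """Build an update that adds one friendship triple."""
    return f"{FOAF_PREFIX}\nINSERT DATA {{ {_knows_triple(user_iri, friend_iri)} }}"


def remove_friend(user_iri, friend_iri):
    """Build an update that removes one friendship triple."""
    return f"{FOAF_PREFIX}\nDELETE DATA {{ {_knows_triple(user_iri, friend_iri)} }}"
```

Calling `add_friend("http://example.org/user/alice", "http://example.org/user/bob")` yields a request string that could be sent to any endpoint supporting SPARQL 1.1 Update; the example.org IRIs are placeholders.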

                            Feb 28 - Mar 4, 2011 ProjectMeeting@Innsbruck


Wednesday, March 02, 2011

More Related Content

Viewers also liked

Privacy-Preserving Schema Reuse
Privacy-Preserving Schema ReusePrivacy-Preserving Schema Reuse
Privacy-Preserving Schema Reuse
PlanetData Network of Excellence
 
Dl2014 slides
Dl2014 slidesDl2014 slides
Arrays in database systems, the next frontier?
Arrays in database systems, the next frontier?Arrays in database systems, the next frontier?
Arrays in database systems, the next frontier?
PlanetData Network of Excellence
 
Exposing Real World Information for the Web of Things
Exposing Real World Information for the Web of ThingsExposing Real World Information for the Web of Things
Exposing Real World Information for the Web of Things
PlanetData Network of Excellence
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
PlanetData Network of Excellence
 
Tractor Pulling on Data Warehouse
Tractor Pulling on Data WarehouseTractor Pulling on Data Warehouse
Tractor Pulling on Data Warehouse
PlanetData Network of Excellence
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
PlanetData Network of Excellence
 
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data StreamsEfficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
PlanetData Network of Excellence
 
Planetdata
PlanetdataPlanetdata

Viewers also liked (9)

Privacy-Preserving Schema Reuse
Privacy-Preserving Schema ReusePrivacy-Preserving Schema Reuse
Privacy-Preserving Schema Reuse
 
Dl2014 slides
Dl2014 slidesDl2014 slides
Dl2014 slides
 
Arrays in database systems, the next frontier?
Arrays in database systems, the next frontier?Arrays in database systems, the next frontier?
Arrays in database systems, the next frontier?
 
Exposing Real World Information for the Web of Things
Exposing Real World Information for the Web of ThingsExposing Real World Information for the Web of Things
Exposing Real World Information for the Web of Things
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
 
Tractor Pulling on Data Warehouse
Tractor Pulling on Data WarehouseTractor Pulling on Data Warehouse
Tractor Pulling on Data Warehouse
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
 
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data StreamsEfficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
 
Planetdata
PlanetdataPlanetdata
Planetdata
 

Similar to BotNetBenchmark - A Benchmark for Social Network

Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook:  case study of BlogTOPredicting what gets ‘Likes’ on Facebook:  case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Toronto Metropolitan University
 
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOPredicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Priya Kumar
 
Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar
Symeon Papadopoulos
 
CSE509 Lecture 5
CSE509 Lecture 5CSE509 Lecture 5
Social media as a tool for terminological research
Social media as a tool for terminological researchSocial media as a tool for terminological research
Social media as a tool for terminological research
TERMCAT
 
Flux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semesterFlux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semester
thomas alisi
 
Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...
John Domingue
 
2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke
Dr Martina Emke
 
Social Media Analysis... according to Net7
Social Media Analysis... according to Net7Social Media Analysis... according to Net7
Social Media Analysis... according to Net7
Net7
 
Analyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogsAnalyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogs
Stefan Sommer
 
Text mining on Twitter information based on R platform
Text mining on Twitter information based on R platformText mining on Twitter information based on R platform
Text mining on Twitter information based on R platform
Fayan TAO
 
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
Bang Hui Lim
 
Anticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsAnticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community Forums
Matthew Rowe
 
Social Media in Japan (Panel in Blogtalk2009)
Social Media in Japan (Panel in Blogtalk2009)Social Media in Japan (Panel in Blogtalk2009)
Social Media in Japan (Panel in Blogtalk2009)
National Institute of Informatics (NII)
 
Mapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of TwitterMapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of Twitter
Axel Bruns
 
IAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptxIAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptx
ssuseraae9cd
 
Twitter in Academic Conferences
Twitter in Academic ConferencesTwitter in Academic Conferences
Twitter in Academic Conferences
Denis Parra Santander
 
Open Annotation Collaboration Briefing
Open Annotation Collaboration BriefingOpen Annotation Collaboration Briefing
Open Annotation Collaboration Briefing
Timothy Cole
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ Deloitte
Deep Kayal
 
Accessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptxAccessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptx
LadduAnanu
 

Similar to BotNetBenchmark - A Benchmark for Social Network (20)

Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook:  case study of BlogTOPredicting what gets ‘Likes’ on Facebook:  case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
 
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOPredicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
 
Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar
 
CSE509 Lecture 5
CSE509 Lecture 5CSE509 Lecture 5
CSE509 Lecture 5
 
Social media as a tool for terminological research
Social media as a tool for terminological researchSocial media as a tool for terminological research
Social media as a tool for terminological research
 
Flux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semesterFlux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semester
 
Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...
 
2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke
 
Social Media Analysis... according to Net7
Social Media Analysis... according to Net7Social Media Analysis... according to Net7
Social Media Analysis... according to Net7
 
Analyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogsAnalyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogs
 
Text mining on Twitter information based on R platform
Text mining on Twitter information based on R platformText mining on Twitter information based on R platform
Text mining on Twitter information based on R platform
 
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
 
Anticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsAnticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community Forums
 
Social Media in Japan (Panel in Blogtalk2009)
Social Media in Japan (Panel in Blogtalk2009)Social Media in Japan (Panel in Blogtalk2009)
Social Media in Japan (Panel in Blogtalk2009)
 
Mapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of TwitterMapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of Twitter
 
IAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptxIAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptx
 
Twitter in Academic Conferences
Twitter in Academic ConferencesTwitter in Academic Conferences
Twitter in Academic Conferences
 
Open Annotation Collaboration Briefing
Open Annotation Collaboration BriefingOpen Annotation Collaboration Briefing
Open Annotation Collaboration Briefing
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ Deloitte
 
Accessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptxAccessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptx
 

More from PlanetData Network of Excellence

A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
PlanetData Network of Excellence
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory Sensing
PlanetData Network of Excellence
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
PlanetData Network of Excellence
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
PlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
PlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
PlanetData Network of Excellence
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
PlanetData Network of Excellence
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
PlanetData Network of Excellence
 
CLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data ArchitectureCLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data Architecture
PlanetData Network of Excellence
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
PlanetData Network of Excellence
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
PlanetData Network of Excellence
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
PlanetData Network of Excellence
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
PlanetData Network of Excellence
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
PlanetData Network of Excellence
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
PlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
PlanetData Network of Excellence
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
PlanetData Network of Excellence
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of Endpoints
PlanetData Network of Excellence
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
PlanetData Network of Excellence
 
OntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image CollectionsOntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image Collections
PlanetData Network of Excellence
 

More from PlanetData Network of Excellence (20)

A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory Sensing
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
 
CLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data ArchitectureCLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data Architecture
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of Endpoints
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
 
OntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image CollectionsOntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image Collections
 

BotNetBenchmark - A Benchmark for Social Network

  • 1. BotNetBM A Benchmark for Social Network CWI Project Meeting@Innsbruck Feb 28 - Mar 04, 2011 Wednesday, March 02, 2011
  • 2. Motivation
      • Highly linked data
      • No (good) benchmark yet for social networks
  • 3. BotNetBM
      • A benchmark for social networks
      • Simulates an RDF OLTP backend
      • Simulates random activities of large #users
      • Simulates on-site “analyst” ➠ weekly “analytic report”
      • One parameter: scale (#user accounts to start BM)
  • 4. BotNetBM Queries
      • SPARQL 1.1 + SPARUL
      • User Actions
          ◦ Interactive queries (80%)
          ◦ Update transactions (20%)
      • Measurement: successful #clicks/min.
          ◦ Transactions commit, penalty for > 3 sec.
          ◦ Interactive queries response time < 3 sec.
      • Analytic queries (must finish within simulated weekend)
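The measurement on slide 4 can be illustrated with a minimal Python sketch. The slide only states the 3-second threshold and that slow transactions are penalized; the size of the penalty is not given, so the `slow_commit_weight` parameter (and its default of 0.5), the `Click` record, and the function name are all assumptions made for illustration.

```python
from dataclasses import dataclass

LIMIT_SECONDS = 3.0  # threshold stated on the slide


@dataclass
class Click:
    kind: str               # "query" (interactive) or "update" (transaction)
    seconds: float          # response time or commit time
    committed: bool = True  # only meaningful for updates


def successful_clicks_per_minute(clicks, elapsed_minutes, slow_commit_weight=0.5):
    """Score a run as successful clicks per minute.

    Assumption: a committed update slower than 3 s still counts, but only
    with weight `slow_commit_weight`; the slides do not define the penalty.
    """
    score = 0.0
    for c in clicks:
        if c.kind == "query":
            # Interactive queries count only if answered in under 3 s.
            if c.seconds < LIMIT_SECONDS:
                score += 1.0
        elif c.kind == "update" and c.committed:
            # Updates must commit; slow commits are penalized.
            score += 1.0 if c.seconds <= LIMIT_SECONDS else slow_commit_weight
    return score / elapsed_minutes
```

With this weighting, a fast query, a slow query, a fast commit, a slow commit, and an aborted update in half a minute would score (1 + 0 + 1 + 0.5 + 0) / 0.5 clicks per minute.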
  • 5. Limitations
      • Data generator: too uniform, not realistic for social networks
          ◦ 10 operations / user / simulated day
          ◦ all users are equally active
          ◦ some queries have no “meaningful” relation to each other
          ◦ read/write contention unrealistically frequent
          ◦ ...
      • Query mix:
          ◦ Does not exploit SPARQL 1.1 advanced features
          ◦ No link to other RDF datasets
      • Queries do not run with the open source ed. of Virtuoso Server
  • 6. Our Goals
      • Exploit SPARQL 1.1 features in queries
          ◦ “Property Path Expressions”
      • Add links to well-known RDF data sets into the queries
          ◦ DBpedia
      • Use real-life analysis info (e.g., Twitter)
          ◦ redesign data generator
          ◦ distribution of interactive/update queries
      • Use real-life social network data
          ◦ Twitter, Facebook, Orkut, MySpace, ...
      • Migration to MonetDB
  • 7. Done
      • Loaded into the Virtuoso Server (commercial ed.)
      • Design of new query mix
      • Twitter datasets
          ◦ http://infochimps.com/collections/twitter-census
          ◦ http://an.kaist.ac.kr/traces/WWW2010.html
          ◦ http://snap.stanford.edu/data/twitter7.html
          ◦ http://twitter.mpi-sws.org/
      • Analysis information
          ◦ “The Man Your Man Could Smell Like: Twitter Analytics Report”
          ◦ “Characterizing user behavior in online social networks”
          ◦ “User Interactions in Social Networks and their Implications”
  • 8. Interactive & Analytic Queries
      Q1 - Q8: Information of Profiles & Friends
      1. Find all users whose first names contain a particular string, e.g., “Minh”.
      2. Return the names of people who study at the same school and are the same age as a given user. These people may be the user's classmates.
      3. Find people who studied at the same school and are connected to you by a path of friend relationships. (Use a “Property Path Expression” from SPARQL 1.1 with an arbitrary-length path.)
      4. Find all friends who like an action movie starring Tom Cruise. (Use information from DBpedia for the movie and the actor Tom Cruise.)
      5. Find all people living in a specific location, e.g., Amsterdam, who can be reached from a user in at most 3 steps of the friend relationship.
      6. Show all of your friends who live in Europe. (Use information from DBpedia. For example, Amsterdam is a city in Europe, and London is a city in Europe.)
      7. Find the top-10 suggested friends for a user: people who are not currently your friends but are friends of many of your friends. (Get all friends of your friends and order them by the number of people in your friends list who are connected to them.)
      8. Return all users who have not joined a specific group but have more than 5 friends who joined it.
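The graph logic behind Q5 and Q7 can be sketched in plain Python over an in-memory friend graph. This is only an illustration of the intended semantics, not the benchmark's SPARQL: the adjacency-dictionary layout, the function names, and the alphabetical tie-breaking are assumptions. Q5's location filter would be applied to the set that `within_steps` returns.

```python
from collections import Counter, deque


def within_steps(friends, start, max_steps=3):
    """Q5 core: users reachable from `start` in at most `max_steps` friend hops.

    `friends` maps a user to the set of that user's friends (assumed layout).
    """
    seen = {start}
    reached = set()
    frontier = deque([(start, 0)])
    while frontier:
        user, depth = frontier.popleft()
        if depth == max_steps:
            continue  # do not expand beyond the hop limit
        for other in friends.get(user, ()):
            if other not in seen:
                seen.add(other)
                reached.add(other)
                frontier.append((other, depth + 1))
    return reached


def suggest_friends(friends, user, k=10):
    """Q7: non-friends ranked by how many of the user's friends know them."""
    mine = friends.get(user, set())
    counts = Counter()
    for f in mine:
        for fof in friends.get(f, ()):
            if fof != user and fof not in mine:
                counts[fof] += 1
    # Highest mutual-friend count first; ties broken by name (an assumption).
    return sorted(counts, key=lambda u: (-counts[u], u))[:k]
```

A breadth-first search is used for Q5 because the hop limit maps directly onto the search depth. In SPARQL 1.1 the same query would likely rely on property paths, as Q3 suggests.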
  • 9. Interactive & Analytic Queries
      Q9 - Q14: Posts or Tweets
      9. Show the 10 latest posts/tweets from your friends or from their friends. (Order by posting time.)
      10. Show active posts/tweets: the 10 most recently commented posts/tweets from your friends. (Order by the timestamp of the last comment on each post.)
      11. Return the top-10 most interesting posts from your friends: order first by the number of “likes” (or, in Twitter, the number of “re-tweets”) on the posts, then by the number of comments.
      12. Return all posts about an event (e.g., unrest in Tunisia) from the last 10 days. (Use the hash tags if they are available. If a post has no tag, check whether its content contains the search terms for the event.)
      13. Show all posts about a specific location, e.g., Egypt, from the last 10 days. (Use information from DBpedia to identify the location of the post. For example, Cairo is the capital of Egypt, and Tahrir Square is in Cairo.)
      14. Find the number of inactive users: all users whose accounts were activated at least 30 days ago but who have never posted, or all users who have not posted for 60 days.
  • 10. Interactive & Analytic Queries
      Q15 - Q17: Hash tags
      15. Show all photos posted by my friends in which I was tagged.
      16. Find the top-10 friends, or friends of friends, who share common interests with you. (Based on the similarity between the tags in your posts and the tags in their posts.)
      17. What are the current hottest events/problems? (Get the hash tags from posts and order them by their number of appearances in the last 10 days.)
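Q17 amounts to counting hash tags over a sliding time window. The sketch below shows one way to express that in Python. The `(timestamp, text)` post layout, the `#\w+` tag pattern, and the case-insensitive counting are assumptions, since the slides do not say how tags are stored or normalized.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

HASHTAG = re.compile(r"#(\w+)")  # assumed tag syntax


def hottest_tags(posts, now, days=10, k=10):
    """Return the k most frequent hash tags in posts from the last `days` days.

    `posts` is an iterable of (posted_at, text) pairs (assumed layout).
    """
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for posted_at, text in posts:
        if cutoff <= posted_at <= now:
            # Lower-casing merges "#Tunisia" and "#tunisia" (an assumption).
            counts.update(tag.lower() for tag in HASHTAG.findall(text))
    # Most frequent first; ties broken alphabetically for a stable result.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

The same windowed count, grouped by location instead of by tag, would cover Q18.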
  • 11. Interactive & Analytic Queries
      Q18 - Q19: Other information
      18. Which area is the most active? (Order by the total number of posts in each location over the last 5 days.)
      19. Return the top-10 locations with the fastest growth in the number of users. (Count the people who joined more than 10 days ago and those who joined during the last 10 days, then compute the growth rate.)
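The growth computation described for Q19 can be sketched as follows. The slide does not define the rate, so this sketch assumes it is the number of users who joined in the last 10 days divided by the number who joined earlier. It also assumes that locations with no earlier users are skipped to avoid dividing by zero. The `(location, joined)` input layout is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def fastest_growing(users, today, days=10, k=10):
    """Rank locations by (recent joiners) / (earlier joiners).

    `users` is an iterable of (location, joined_date) pairs (assumed layout).
    """
    cutoff = today - timedelta(days=days)
    before = defaultdict(int)
    recent = defaultdict(int)
    for location, joined in users:
        if joined < cutoff:
            before[location] += 1
        elif joined <= today:
            recent[location] += 1
    # Only locations that already had users can have a finite rate (assumption).
    rates = {loc: recent[loc] / before[loc] for loc in before}
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

Other definitions, for example relative growth over the total user count, fit the slide's wording equally well and would only change the expression that builds `rates`.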
  • 12. SPARQL/Update Queries
      1. Update user profile
      2. Posts/Tweets:
          2.1. Add a post (Popularity: high)
          2.2. Remove a post (Popularity: low)
          2.3. Add tags for your friends
          2.4. Add/Remove a comment
      3. Friends
          3.1. Add a friend (Popularity: high)
          3.2. Remove a friend (Popularity: low)
  • 13. SPARQL/Update Queries
      4. Group, Event
          4.1. Join/Leave a group/event
          4.2. Add/Delete a post in the group/event
      5. Photos
          5.1. Add/Delete a photo
          5.2. Add/Remove tags in the photo
          5.3. Add/Remove a comment
          5.4. Remove tags of me from all my friends' pictures