Presentation of the spatiotemporal RDF store Strabon at the Linked Data Europe Workshop, co-located with the European Data Forum in Athens, Greece (21 March 2014)
Web-based framework for online sketch-based image retrieval – Lukas Tencer
My presentation for the course SYS821 "Pattern recognition and inspection" at ETS. It describes the implementation of my project on the topic "Web-based framework for online sketch-based image retrieval".
Challenge@RuleML2015 Modeling Object-Relational Geolocation Knowledge in PSOA... – RuleML
In recent years, many geospatial data sets have become available on the Web. These data can be incorporated into real-world applications to answer advanced geospatial queries. In this paper, we present a use case to integrate a local data set with external geospatial data sets on the Web. The data sets are modeled in different paradigms – relational and object-centered. The integration uses Positional-Slotted Object-Applicative (PSOA) RuleML, which combines the relational and object-centered modeling paradigms for databases as well as knowledge bases (KBs).
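As a rough illustration (not actual PSOA syntax, and not taken from the paper), the sketch below shows in Python the two modeling views that PSOA RuleML unifies in a single kind of atom; the Address predicate and its slots are hypothetical.

```python
# Hypothetical example: the same geolocation fact in the two paradigms that
# PSOA RuleML combines. A PSOA atom can mix positional arguments with named
# slots in one term; here the two views are kept separate for illustration.

# Relational (positional) view: an ordered tuple under a predicate name.
address_relational = ("Address", "Berlin", "Germany")

# Object-centered (slotted) view: a typed object with named slots.
address_object = {
    "type": "Address",
    "city": "Berlin",      # slot name -> filler
    "country": "Germany",
    "street": "Main St",
    "zip": "12345",
}
print(address_relational, address_object)
```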
Track 13. New Trends in Digital Humanities
Authors: Alejandro Benito; Antonio G. Losada; Roberto Theron; Amelie Dorn; Melanie Seltmann; Eveline Wandl-Vogt
https://youtu.be/5tTot6vinZk
Making Use of the Linked Data Cloud: The Role of Index Structures – Thomas Gottron
The intensive growth of the Linked Open Data Cloud has spawned a web of data in which a multitude of data sources provides huge amounts of valuable information across different domains. Nowadays, when accessing and using Linked Data, the challenging question is more and more often not so much whether relevant data is available, but rather where it can be found and how it is structured. Thus, index structures play an important role in making use of the information in the LOD cloud. In this talk I will address three aspects of Linked Data index structures: (1) a high-level view and categorization of index structures and how they can be queried and explored, (2) approaches for building index structures and the need to maintain them, and (3) some example applications that greatly benefit from indices over Linked Data.
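As a toy illustration of the first aspect (not taken from the talk), the sketch below builds one very simple Linked Data index, mapping each predicate to the sources that use it, so a consumer can locate relevant data; all names are made up.

```python
# Minimal sketch of a source-selection index for Linked Data: predicate IRI ->
# set of data sources using that predicate. Quads are (s, p, o, source).
from collections import defaultdict

def build_predicate_index(quads):
    """Map each predicate to the set of sources in which it occurs."""
    index = defaultdict(set)
    for _s, p, _o, source in quads:
        index[p].add(source)
    return index

quads = [
    ("ex:alice", "foaf:knows", "ex:bob", "http://example.org/social"),
    ("ex:berlin", "geo:asWKT", "POINT(13.4 52.5)", "http://example.org/geo"),
]
index = build_predicate_index(quads)
print(index["foaf:knows"])  # {'http://example.org/social'}
```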
Provenance Analytics at AAAI Human Computation Conference 2013 – T Dong Huynh
Trung Dong Huynh presenting the paper "Interpretation of Crowdsourced Activities using Provenance Network Analysis" – how analysing provenance graphs can help interpret crowdsourced activities in CollabMap.
An Empirical Comparison of Fast and Efficient Tools for Mining Textual Data – vtunali
In order to effectively manage and retrieve the information contained in vast amounts of text documents, powerful text mining tools and techniques are essential. In this paper we evaluate and compare two state-of-the-art data mining tools for clustering high-dimensional text data, Cluto and Gmeans. Several experiments were conducted on three benchmark datasets, and the results are analysed in terms of clustering quality, memory usage, and CPU time. We empirically show that Gmeans offers high scalability by sacrificing clustering quality, while Cluto delivers better clustering quality at the expense of memory and CPU time.
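Cluto and Gmeans are standalone tools, so as a rough stand-in (not the paper's setup) the sketch below shows the same kind of task, clustering high-dimensional tf-idf vectors with k-means, using scikit-learn:

```python
# Illustrative analogue of the benchmarked task: k-means clustering of
# documents represented as sparse tf-idf vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "cloud storage for data intensive computing",
    "query engines for linked data",
    "text mining and document clustering tools",
    "clustering high dimensional text data",
]
X = TfidfVectorizer().fit_transform(docs)  # sparse document-term matrix
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # cluster id per document
```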
Feta: Federated QuEry TrAcking for Linked Data – serrano-p
Following the principles of Linked Data (LD), data providers are producing thousands of interlinked datasets in multiple domains, including life science, government, social networking, media, and publications. Federated query engines allow data consumers to query several datasets through a federation of SPARQL endpoints. However, data providers only receive the subqueries resulting from the decomposition of the original federated query. Consequently, they do not know how their data are combined with other datasets of the federation. In this paper, we propose FETA, a Federated quEry TrAcking system for LD. We assume that data providers collaborate by sharing their query logs. Then, from a federated log, FETA infers Basic Graph Patterns (BGPs) containing joined triple patterns executed across endpoints. We evaluated FETA with logs produced by FedBench queries executed with the Anapsid and FedX federated query engines. Experiments show that FETA is able to infer BGPs of joined triple patterns with good precision and recall.
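A minimal sketch of the inference idea as described in the abstract (my reading, not FETA's actual algorithm or code): group the triple patterns observed in a federated log into candidate BGPs by connecting patterns that share a variable.

```python
# Toy BGP inference: triple patterns that share a '?var' are assumed to be
# joined and are merged into the same group (candidate Basic Graph Pattern).
def infer_bgps(patterns):
    def vars_of(tp):
        return {term for term in tp if term.startswith("?")}
    groups = []  # each group: {"patterns": [...], "vars": {...}}
    for tp in patterns:
        touching = [g for g in groups if vars_of(tp) & g["vars"]]
        merged = {"patterns": [tp], "vars": vars_of(tp)}
        for g in touching:
            merged["patterns"] += g["patterns"]
            merged["vars"] |= g["vars"]
            groups.remove(g)
        groups.append(merged)
    return [g["patterns"] for g in groups]

log = [("?drug", "ex:treats", "?disease"),    # seen at endpoint A
       ("?drug", "ex:name", "?n"),            # seen at endpoint B
       ("?city", "ex:locatedIn", "ex:Greece")]
print(infer_bgps(log))  # drug patterns joined; city pattern stands alone
```

Real logs also carry timestamps and result shapes, which a system like FETA can use as additional evidence; this sketch ignores them.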
Deep Learning algorithms are gaining momentum as main components in a large number of fields, from computer vision and robotics to finance and biotechnology. At the same time, the use of Field Programmable Gate Arrays (FPGAs) for data-intensive applications is increasingly widespread thanks to the possibility of customizing hardware accelerators and achieving high-performance implementations with low energy consumption. Moreover, FPGAs have proven to be a viable alternative to GPUs in embedded systems applications, where their reconfigurability makes the system more robust, capable of tolerating failures and of meeting the constraints of embedded devices. In this work, we present a framework that helps implement Deep Learning algorithms by exploiting the PYNQ platform. In particular, we optimized the communication interface, failure tolerance, and on-chip memory usage.
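For context, a hedged sketch of the PYNQ workflow this kind of framework builds on: load a bitstream ("overlay") and stream buffers through a DMA engine. The bitstream name and the axi_dma_0 instance are assumptions about a user design, not artifacts of this work.

```python
# Minimal PYNQ usage sketch (assumes a Zynq board with a custom accelerator
# design containing a DMA block named 'axi_dma_0').
import numpy as np
from pynq import Overlay, allocate

overlay = Overlay("dl_accelerator.bit")  # hypothetical bitstream
dma = overlay.axi_dma_0                  # DMA engine from the design

in_buf = allocate(shape=(1024,), dtype=np.float32)   # physically contiguous
out_buf = allocate(shape=(1024,), dtype=np.float32)
in_buf[:] = np.random.rand(1024).astype(np.float32)

dma.sendchannel.transfer(in_buf)   # host -> FPGA
dma.recvchannel.transfer(out_buf)  # FPGA -> host
dma.sendchannel.wait()
dma.recvchannel.wait()
```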
Sector - Presentation at Cloud Computing & Its Applications 2009 – Robert Grossman
This is a presentation about Sector that I gave at the Cloud Computing and Its Applications (CCA 09) Workshop that took place in Chicago on October 20, 2009. Sector is an open-source cloud computing framework designed for data-intensive computing.
Scalable Semantic Version Control for Linked Data Management (presented at 2n... – Claudius Hauptmann
Scalable Semantic Version Control for Linked Data Management
Claudius Hauptmann, Michele Brocco and Wolfgang Wörndl
Technische Universität München
2nd Workshop on Linked Data Quality at ESWC 2015 (June 1st, 2015 - Portorož, Slovenia)
http://ldq.semanticmultimedia.org/program
https://www.dropbox.com/s/o2wbd386g676ul6/LDQ2015_paper_06.pdf?dl=1
Process mining methods use data recorded by information systems to analyze the real execution of processes. This event data is stored in an event log, which is the main input to most process mining methods. The XES standard provides a uniform way to store event logs. OpenXES is the XES reference implementation and is widely used by research tools; however, OpenXES does not scale to large event logs. XESLite provides solutions for managing large event logs that are compatible with the OpenXES interfaces, so it can be used as a drop-in replacement for existing algorithms. This presentation investigates the storage requirements of different types of event logs, describes XESLite, and benchmarks XESLite against OpenXES on real-life event logs.
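Not part of the talk, but for readers who want to poke at XES logs from Python, a minimal sketch using the pm4py library (the file path is a placeholder):

```python
# Read an XES event log; recent pm4py versions return a pandas DataFrame of
# events with standard XES attribute columns.
import pm4py

log = pm4py.read_xes("example.xes")  # placeholder path
print(log[["case:concept:name", "concept:name", "time:timestamp"]].head())
```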
In this paper, we propose the problem of implementing an efficient query processing system for incomplete temporal and geospatial information in RDFi as a challenge to the SSTD community.
The linked open data cloud is constantly evolving as datasets are continuously updated with newer versions. As a result, representing, querying, and visualizing the temporal dimension of linked data is crucial. This is especially important for geospatial datasets, which form the backbone of large-scale open data publication efforts in many sectors of the economy (e.g., the public sector and the Earth observation sector). Although there has been some work on the representation and querying of linked geospatial data that change over time, and although visualizing the temporal evolution of geospatial data is common practice in the GIS area, to the best of our knowledge there is currently no tool that handles linked geospatial data and allows for the visualization of both its spatial and temporal dimensions. In this demo paper, we present SexTant, a Web-based system for the visualization and exploration of time-evolving linked geospatial data and the creation, sharing, and collaborative editing of "temporally-enriched" thematic maps produced by combining different sources of such data.
AN EFFECT OF USING A STORAGE MEDIUM IN DIJKSTRA ALGORITHM PERFORMANCE FOR IDE... – ijcsit
The graph model is widely used for representing connected objects within a specific area. These objects are defined as nodes, and the connections between them are represented as arcs called edges. Finding the shortest path between two nodes is one of the problems that attracts the most research attention, and many algorithms with different structural approaches have been developed to reduce the cost of computing it. The most widely used is the Dijkstra algorithm, which has seen various structural refinements aimed at reducing the shortest-path cost. This paper presents the idea of using a storage medium to store the solution paths produced by the Dijkstra algorithm and then using them to retrieve the implicit path at an ideal time cost. The performance of the Dijkstra algorithm using an appropriate data structure is improved. Our results show that the search time through the given data structure is reduced across different graph models.
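A sketch of the paper's idea as described above (illustrative, not the authors' code): run Dijkstra once from a source, store the resulting predecessor map, and answer later path queries by a cheap lookup instead of a new search.

```python
# Dijkstra with a stored shortest-path tree; cached_path() reconstructs any
# source-to-target path from the stored predecessors in O(path length).
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbor, weight), ...]}; returns (dist, prev) maps."""
    dist, prev = {source: 0}, {}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return dist, prev

def cached_path(prev, target):
    """The 'storage medium' step: read a path back without re-searching."""
    path = [target]
    while path[-1] in prev:
        path.append(prev[path[-1]])
    return path[::-1]

graph = {"A": [("B", 2), ("C", 5)], "B": [("C", 1)], "C": []}
dist, prev = dijkstra(graph, "A")
print(cached_path(prev, "C"))  # ['A', 'B', 'C'], cost 3
```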
Second part of the course "Java Open Source GIS Development - From the building blocks to extending an existing GIS application", held at the University of Potsdam in August 2011.
Data integration with a façade. The case of knowledge graph construction. – Enrico Daga
"Data integration with a façade.
The case of knowledge graph construction." is an overview of recent research in façade-based data access. The slides introduce core notions of façade-based data access and the design principles of SPARQL Anything, a system that allows querying of many formats (CSV, JSON, XML, HTML, Markdown , Excel, ...) in plain SPARQL.
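For flavor, a minimal example of the façade pattern the slides describe: SPARQL Anything exposes non-RDF sources through a SERVICE clause with the x-sparql-anything: IRI scheme. The JSON location and the xyz:name property are placeholders; running the query requires a SPARQL Anything engine, not shown here.

```python
# A plain-SPARQL query over a JSON file via the SPARQL Anything façade,
# held as a Python string for illustration.
query = """
PREFIX xyz: <http://sparql.xyz/facade-x/data/>
SELECT ?name WHERE {
  SERVICE <x-sparql-anything:location=https://example.org/people.json> {
    ?person xyz:name ?name .
  }
}
"""
print(query)
```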
Design and Development of a Provenance Capture Platform for Data Science – Paolo Missier
A talk given at the DATAPLAT workshop, co-located with the IEEE ICDE conference (May 2024, Utrecht, NL).
Data Provenance for Data Science is our attempt to provide a foundation to add explainability to data-centric AI.
It is a prototype, with lots of work still to do.
Age of Information in an URLLC-enabled Decode-and-Forward Wireless Communicat... – Chathuranga Basnayaka
Age of Information (AoI) measures the freshness of data in mission-critical Internet-of-Things (IoT) applications, e.g., the industrial internet, intelligent transportation systems, etc. In this paper, a new system model is proposed to estimate the average AoI (AAoI) in an ultra-reliable low latency communication (URLLC) enabled wireless communication system with a decode-and-forward relay scheme over quasi-static Rayleigh block fading channels. A short-packet communication scheme is used to meet both the reliability and latency requirements of the proposed wireless network. By resorting to finite-blocklength information theory, queuing theory, and stochastic processes, a closed-form expression for the AAoI is obtained. Finally, the impact of system parameters, such as the update generation rate, the block length, and the block length allocation factor, on the AAoI is investigated. All results are validated numerically. Index Terms: Age of Information, finite blocklength regime, latency, reliability, ultra-reliable low latency communications (URLLC), and 5GB.
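The paper's own closed-form AAoI is not reproduced here; for orientation only, the classic average-AoI result for an M/M/1 first-come-first-served queue (Kaul, Yates and Gruteser, 2012) shows the general shape of such expressions, with lambda the update generation rate, mu the service rate, and rho their ratio:

```latex
% Classic M/M/1 FCFS average Age of Information (not this paper's result):
\[
  \Delta_{\mathrm{M/M/1}} = \frac{1}{\mu}\left(1 + \frac{1}{\rho}
    + \frac{\rho^{2}}{1-\rho}\right),
  \qquad \rho = \frac{\lambda}{\mu}.
\]
```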
GRASS and OSGeo: a framework for archeology – Markus Neteler
Use of GIS and geospatial data in archeology. Contribution to:
Quarto Workshop Italiano "Open Source, Free Software e Open Format nei processi di ricerca archeologica" (Fourth Italian Workshop on Open Source, Free Software and Open Formats in Archaeological Research Processes), Rome, 27-28 April 2009, at the headquarters of the Consiglio Nazionale delle Ricerche (CNR).
http://www.archeo-foss.org/
Abstract:
With the widespread availability of desktop GIS, archaeologists have gained the tools to comprehensively analyze the important spatial component of their data. Initial archaeological use of GIS was (and in many instances still is) for making maps of archaeological sites. Rather quickly, GIS came to be used for predictive modeling of site locations. More recently, viewshed analysis has seen increasing use in efforts to understand prehistoric perceptions of the landscape.
In recent years, Open Source GIS software has evolved into a powerful set of products that support both scientific and general GIS users. In particular, the integration of GIS with image processing capabilities, geospatial data analysis, database management systems, and Web mapping software enables archaeologists to perform their tasks in a completely free environment. Since 2006, the Open Source Geospatial Foundation (OSGeo) has operated as the umbrella foundation for Web Mapping, Desktop GIS Applications, Geospatial Libraries, and Metadata Catalog projects, as well as the Public Geospatial Data project and the Education and Curriculum project.
In our presentation, we focus on GRASS GIS (http://grass.osgeo.org/) for spatial data analysis and visualization. GRASS is the largest Open Source GIS program currently available. The new version, GRASS 6.4.0, is interoperable, as it supports all common vector and raster GIS formats. Its capabilities cover raster and volume spatial analysis and modeling, time-series and landscape analysis, image processing, and visualization of 2D and 3D (voxel) raster data. Vector data can be digitized, extracted, extruded to 3D, and vector networks analyzed. Vector data are handled topologically. Vector attributes are stored in internal or externally connected databases. All general GIS tasks, such as map reprojection, georeferencing, and transformations, are available for raster and vector data. The data storage concept of GRASS permits single- as well as multi-user access set up via a network file system.
GRASS 6.4.0, the new stable release after more than one year of development and testing, brings a number of exciting enhancements to the GIS. Besides hundreds of new module features, supported data formats, and language translations, the 6.4.0 release also runs on MS Windows, and a new installer is provided. A new graphical user interface with an integrated location wizard and a new vector digitizer is also included.
The presentation concludes with a series of applications relevant to archaeology, including image processing, Lidar data analysis, fast viewshed analysis, and more.
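As a hedged illustration of the "fast viewshed analysis" mentioned above, a GRASS Python scripting call might look like the sketch below (assumes a running GRASS 7+ session; the raster name and coordinate values are made up):

```python
# Compute a viewshed from a hypothetical site location using GRASS's
# r.viewshed module via the grass.script API.
import grass.script as gs

gs.run_command(
    "r.viewshed",
    input="elevation",               # DEM raster in the current mapset
    output="site_viewshed",
    coordinates=(599505, 4920830),   # observer (site) position
    observer_elevation=1.75,         # eye height in meters
    overwrite=True,
)
```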
ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data – Kostis Kyzirakos
In this tutorial we present the life cycle of linked geospatial data, focusing on two important steps: the publication of geospatial data as RDF graphs and the interlinking of such data with each other. Given the proliferation of geospatial information on the Web, many kinds of geospatial data are now becoming available as linked datasets (e.g., Google and Bing maps, user-generated geospatial content, public sector information published as open data, etc.). The topic of the tutorial is related to all core research areas of the Semantic Web (e.g., semantic information extraction, transformation of data into RDF graphs, interlinking linked data), since there is often a need to reconsider existing core techniques when dealing with geospatial information. Thus, it is timely to train Semantic Web researchers, especially those in the early stages of their careers, on the state of the art in this area and to invite them to contribute to it.
In this tutorial we give a comprehensive background on data models, query languages, and implemented systems for linked geospatial data, and we discuss recent approaches to publishing and interlinking geospatial data. The tutorial is complemented with a hands-on session that familiarizes the audience with state-of-the-art tools for publishing and interlinking geospatial information.
http://event.cwi.nl/eswc2015-geo/
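Not from the tutorial materials themselves: a minimal rdflib sketch of the publication step the tutorial covers, turning a record with coordinates into RDF carrying a GeoSPARQL WKT literal (the ex: resources are placeholders):

```python
# Publish a point of interest as GeoSPARQL-style RDF with rdflib.
from rdflib import Graph, Literal, Namespace, RDF

GEO = Namespace("http://www.opengis.net/ont/geosparql#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("geo", GEO)
g.add((EX["athens"], RDF.type, GEO.Feature))
g.add((EX["athens"], GEO.hasGeometry, EX["athens/geom"]))
g.add((EX["athens/geom"], GEO.asWKT,
       Literal("POINT(23.7275 37.9838)", datatype=GEO.wktLiteral)))
print(g.serialize(format="turtle"))
```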
Geographica: A Benchmark for Geospatial RDF Stores – Kostis Kyzirakos
Geospatial extensions of SPARQL, such as GeoSPARQL and stSPARQL, have recently been defined, and corresponding geospatial RDF stores have been implemented. However, there is no widely used benchmark for evaluating geospatial RDF stores that takes into account recent advances in the state of the art in this area. In this paper, we develop a benchmark, called Geographica, which uses both real-world and synthetic data to test the functionality and performance of some prominent geospatial RDF stores.
We present a new version of the data model stRDF and the query language stSPARQL for the representation and querying of geospatial data. The new versions of stRDF and stSPARQL use OGC standards to represent geometries where the original version of stSPARQL used linear constraints. In this sense stSPARQL is a subset of the recent standard GeoSPARQL proposed by OGC. We discuss the implementation of the system Strabon which is a storage and query evaluation module for stRDF/stSPARQL and the corresponding subset of GeoSPARQL. We study the performance of Strabon experimentally and show that it scales to very large data volumes.
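To make the query side concrete, a hedged sketch of posing a GeoSPARQL-style spatial query to an endpoint such as Strabon's from Python with SPARQLWrapper (the endpoint URL is an assumption; adjust to your deployment):

```python
# Ask for features whose geometry lies within a bounding polygon, using the
# GeoSPARQL geof:sfWithin filter function.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:8080/strabon/Query")  # assumed URL
sparql.setQuery("""
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT ?f ?wkt WHERE {
  ?f geo:hasGeometry/geo:asWKT ?wkt .
  FILTER (geof:sfWithin(?wkt,
    "POLYGON((23 37, 24 37, 24 38, 23 38, 23 37))"^^geo:wktLiteral))
}
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["f"]["value"], row["wkt"]["value"])
```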
Slide 1. "Linked Data Europe: Tools and Applications" – Kostis Kyzirakos, University of Athens, School of Science, Faculty of Informatics and Telecommunications; Database Architectures group, Centrum Wiskunde & Informatica, Netherlands. March 21, 2014.
Slide 7. Real-world Workload: 500 million triples – cold caches. [Charts: response time (sec) vs. number of nodes in the query region, for thematic selectivities of 100% and 0.1%.]