This talk was given by FORTH, Greece, at the European Data Forum (EDF) 2012, which took place on June 6-7, 2012 in Copenhagen, Denmark, at the Copenhagen Business School (CBS).
Abstract:
Given the increasing amount of sensitive RDF data available on the Web, it becomes increasingly critical to guarantee secure access to this content. Access control is complicated when RDFS inference rules and other dependencies between the access permissions of triples need to be considered; this is necessary, e.g., when we want to associate the access permissions of inferred triples with those of the triples that implied them. In this paper we advocate the use of abstract provenance models, defined by means of abstract tokens and operators, to support fine-grained access control for RDF graphs. The access label of a triple is a complex expression that encodes how that label was produced (i.e., the triples that contributed to its computation). This feature allows us to know exactly the effects of any possible change, thereby avoiding a complete recomputation of the labels when a change occurs. In addition, the same application can choose to enforce different access control policies, or different applications can enforce different policies on the same data, without recomputing the triples' labels. Preliminary experiments have shown the applicability and benefits of our approach.
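To make this concrete, here is a small worked illustration in the token notation used later in the talk (the triples and tokens are illustrative):

    t1: (&a, type, Student)    with access label l1
    t2: (Student, sc, Person)  with access label l2
    t3: (&a, type, Person)     inferred from t1 and t2, labeled l1 ⊙ l2

If a policy later changes the concrete value assigned to l2, only the stored expression l1 ⊙ l2 needs to be re-evaluated; the inference that produced t3 is not re-run.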
The formulation of constraints and the validation of RDF data against these constraints is a common requirement and a much sought-after feature, particularly as this is taken for granted in the XML world. Recently, RDF validation as a research field has gained momentum due to the shared needs of data practitioners from a variety of domains. Several languages for constraint formulation and RDF data validation exist or are currently being developed. Yet, none of these languages is able to meet all the requirements raised by data professionals.
We have published a set of constraint types that are required by diverse stakeholders for data applications. We use these constraint types to gain a better understanding of the expressiveness of solutions, investigate the role that reasoning plays in practical data validation, and give directions for the further development of constraint languages.
We introduce a validation framework that enables the consistent execution of RDF-based constraint languages on RDF data and the formulation of constraints of any type, in such a way that mappings from high-level constraint languages to an intermediate generic representation can be created straightforwardly. The framework reduces the representation of constraints to the absolute minimum, is based on formal logics, and consists of a very simple conceptual model with a small, lightweight vocabulary. We demonstrate that using another layer on top of SPARQL ensures consistency regarding validation results and enables constraint transformations for each constraint type across RDF-based constraint languages.
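To give a flavor of this SPARQL layer, the following minimal sketch (the foaf: vocabulary and the constraint are illustrative examples, not the framework's actual intermediate representation) expresses a simple "property is required" constraint as a query whose solutions are exactly the violating resources:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    # Every foaf:Person must have a foaf:name; each solution is a violation.
    SELECT ?violation
    WHERE {
      ?violation a foaf:Person .
      FILTER NOT EXISTS { ?violation foaf:name ?name }
    }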
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016) - Dr.-Ing. Thomas Hartmann
In this thesis, a validation framework is introduced that enables the consistent execution of RDF-based constraint languages on RDF data and the formulation of constraints of any type. The framework reduces the representation of constraints to the absolute minimum, is based on formal logics, consists of a small lightweight vocabulary, ensures consistency regarding validation results, and enables constraint transformations for each constraint type across RDF-based constraint languages.
RDF Constraint Checking using RDF Data Descriptions (RDD) - Alexander Schätzle
Linked Open Data (LOD) sources on the Web are increasingly becoming a mainstream method to publish and consume data. For real-life applications, mechanisms to describe the structure of the data and to provide guarantees are needed, as recently emphasized by the W3C in its Data Shapes Working Group. Using such mechanisms, data providers will be able to validate their data, assuring that it is structured in a way expected by data consumers. In turn, data consumers can design and optimize their applications to match the data format to be processed.
In this paper, we present several crucial aspects of RDD, our language for expressing RDF constraints. We introduce the formal semantics and describe how RDD constraints can be translated into SPARQL for constraint checking. Based on our fully working validator, we evaluate the feasibility and efficiency of this checking process using two popular, state-of-the-art RDF triple stores. The results indicate that even a naive implementation of RDD based on SPARQL 1.0 incurs only a moderate overhead on the RDF loading process, yet some constraint types contribute an outsize share of that overhead and scale poorly. Incorporating several preliminary optimizations, some of them based on SPARQL 1.1, we provide insights on how to overcome these limitations.
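For illustration, a maximum-cardinality constraint of this kind can be expressed as a SPARQL 1.1 query whose solutions are the violating subjects (a hedged sketch with an invented ex: vocabulary, not the actual query RDD generates):

    PREFIX ex: <http://example.org/>
    # Subjects with more than one ex:birthDate violate "at most one birth date".
    SELECT ?s
    WHERE { ?s ex:birthDate ?d }
    GROUP BY ?s
    HAVING (COUNT(?d) > 1)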
Presentation of the paper "Reasoning in the OWL 2 Full Ontology Language using First-Order Automated Theorem Proving" by Michael Schneider, FZI Karlsruhe, and Geoff Sutcliffe, University of Miami, at the 23rd International Conference on Automated Deduction (CADE 23), August 2011.
Presentation at the ESWC 2011 PhD Symposium in May 2011, by Michael Schneider, FZI. Included are backup slides that have not been presented at the event. The corresponding PhD proposal can be found in the ESWC proceedings at <http: />. Alternatively, the PhD proposal can be downloaded from <http: />.
A non-technical explanation of the main ideas and notions in OWL. This talk was also recorded on video and is available online at http://videolectures.net/koml04_harmelen_o/
Dependent Haskell has long been desired in the community of Haskell programmers. The goal of this project is to make the core language of Haskell, known as System FC, dependently typed, as a step towards dependent Haskell.
This is a work-in-progress project. As a small step towards our final goal, this talk focuses on coercion quantification. Coercion quantification is necessary to support homogeneous equality, which simplifies the core language and is important for the meta-theory of a dependently typed core.
Coercion quantification is interesting both for people working on the core language and for Haskell users. For GHC hackers, the patch to the core formalization is worth attention: adding coercion quantification requires refactoring many files across the compilation pipeline and introduces several subtleties. For Haskell users, coercion quantification opens up new questions in the design space of source Haskell, requiring a non-trivial extension of the constraint solver. We would like Haskell users to tell us whether this feature is desired in their development.
In this talk, we will share the high-level storyline of the dependently typed core and our low-level progress in implementing coercion quantification, discuss the design space involved, and seek feedback from the broader community.
https://imatge-upc.github.io/synthref/
Integrating computer vision with natural language processing has achieved significant progress over the last years owing to the continuous evolution of deep learning. A novel vision-and-language task, which is tackled in the present Master thesis, is referring video object segmentation, in which a language query defines which instance to segment from a video sequence. One of the biggest challenges for this task is the lack of relatively large annotated datasets, since a tremendous amount of time and human effort is required for annotation. Moreover, existing datasets suffer from poor-quality annotations in the sense that approximately one out of ten language expressions fails to uniquely describe the target object.
The purpose of the present Master thesis is to address these challenges by proposing a novel method for generating synthetic referring expressions for an image (video frame). This method produces synthetic referring expressions by using only the ground-truth annotations of the objects as well as their attributes, which are detected by a state-of-the-art object detection deep neural network. One of the advantages of the proposed method is that its formulation allows its application to any object detection or segmentation dataset.
By using the proposed method, the first large-scale dataset with synthetic referring expressions for video object segmentation is created, based on an existing large benchmark dataset for video instance segmentation. A statistical analysis and comparison of the created synthetic dataset with existing ones is also provided in the present Master thesis.
The experiments conducted on three different datasets used for referring video object segmentation demonstrate the effectiveness of the generated synthetic data. More specifically, the obtained results show that by pre-training a deep neural network with the proposed synthetic dataset, one can improve the ability of the network to generalize across different datasets, without any additional annotation cost.
Comparison of features between ShEx (Shape Expressions) and SHACL (Shapes Constraint Language)
Changelog:
11/06/2017
- Removed slides about compositionality
31/05/2017
- Added slide 30 about validation report
- Added slide 32 about stems
- Changed slides 7 and 8 adapting compact syntax to new operator .
23/05/2017:
Slide 14: Repaired typos in sh:entailment, rdfs:range
21/05/2017:
- Slide 8. Changed the example to be an IRI and a datatype
- Added "typically" in slide 9
- Slide 10: Removed the phrase: "Target declarations can problematic when reusing/importing shapes"
and created slide 27 to talk about reusability
- Added slide 11 to talk about the differences in triggering validation
- Created slide 14 to talk about inference
- Renamed slide 15 as "Inference and triggering mechanism"
- Added slides 27 and 28 to talk about reusability
- Added slide 29 to talk about annotations
18/05/2017
- Slide 9 now includes an example using the ShEx RDF vocabulary
- Slide 10 now says that target declarations are optional
- Slide 13 now says that some RDF Schema terms have special treatment in SHACL
- Example in slide 18 now uses sh:or instead of sh:and
- Added slides 22, 23 and 24 which show some features supported by SHACL but not supported by ShEx (property pair constraints, uniqueLang and owl:imports)
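To give a feel for one of the SHACL-only features mentioned in the changelog above, a property-pair constraint such as sh:lessThan can conceptually be checked with a SPARQL query whose solutions are the violations (an illustrative sketch with an invented ex: vocabulary, not necessarily the query a SHACL engine runs):

    PREFIX ex: <http://example.org/>
    # Violations of "every ex:start value must be less than the ex:end value".
    SELECT ?s
    WHERE {
      ?s ex:start ?v1 ;
         ex:end   ?v2 .
      FILTER (!(?v1 < ?v2))
    }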
Linguistic markup and transclusion processing in XML documents - Simon Dew
Transclusion can have linguistic consequences. This presentation proposes a markup scheme that can be used to indicate [a] the required form (e.g. syntactic case) of a transcluded term in an XML document, and [b] the syntactic features of the transcluded term that demand agreement in the surrounding document. It also describes a set of XSLT transformations that can be used to select the correct form of any dependent words in the surrounding document, using dictionaries that conform to the TEI.
Definition of the digital twin: how it works, what its origin is, what the use cases of digital twins in industry are, and what the relationship between JSON-LD and RDF is. What is the current market for digital twins, and what tools are available in the market to create a digital twin? Do digital twins only work with JSON-LD, or do they work with RDF as well?
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge - Rathachai Chawuthai
Presented in : JIST2015, Yichang, China
Prototype: http://rc.lodac.nii.ac.jp/rdf4u/
Video: https://www.youtube.com/watch?v=z3roA9-Cp8g
Abstract: Semantic Web and Linked Open Data (LOD) are known to be powerful technologies for knowledge management, and explicit knowledge is expected to be represented in RDF (Resource Description Framework); however, ordinary users stay away from RDF because of the technical skills it requires. Since a concept map or a node-link diagram can enhance the learning ability of users from beginner to advanced level, RDF graph visualization is a suitable tool for familiarizing users with Semantic Web technology. However, an RDF graph generated from a whole query result is not suitable for reading, because it is highly connected like a hairball and poorly organized. To make a graph presenting knowledge easier to read, this research introduces an approach that sparsifies a graph using the combination of three main functions: graph simplification, triple ranking, and property selection. These functions are mostly initiated based on the interpretation of RDF data as knowledge units, together with statistical analysis, in order to deliver an easily readable graph to users. A prototype is implemented to demonstrate the suitability and feasibility of the approach. It shows that the simple and flexible graph visualization is easy to read and leaves a good impression on users. In addition, the tool helps inspire users to realize the advantageous role of linked data in knowledge management.
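As a rough illustration of the kind of statistics that property selection can build on, the following SPARQL query ranks properties by how often they occur (illustrative only; RDF4U's actual ranking functions are more involved):

    # Count how often each property is used and keep the ten most frequent.
    SELECT ?p (COUNT(*) AS ?uses)
    WHERE { ?s ?p ?o }
    GROUP BY ?p
    ORDER BY DESC(?uses)
    LIMIT 10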
Presentation given* at the 13th International Semantic Web Conference (ISWC), in which we propose a compressed format to represent RDF data streams. See the original article at: http://dataweb.infor.uva.es/wp-content/uploads/2014/07/iswc14.pdf
* Presented by Alejandro Llaves (http://www.slideshare.net/allaves)
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud - Ontotext
This webinar will break the roadblocks that prevent many from reaping the benefits of heavyweight Semantic Technology in small scale projects. We will show you how to build Semantic Search & Analytics proof of concepts by using managed services in the Cloud.
A brief history of the RDF4J Project and an overview of tools and code examples that demonstrate how to work with it in your applications.
Slides accompanying the Lotico Webinar event on May 14, 2020 - see http://www.lotico.com/index.php/Eclipse_RDF4J_-_Working_with_RDF_in_Java
Property graph vs. RDF Triplestore comparison in 2020 - Ontotext
This presentation goes all the way from an introduction to what graph databases are, to a table comparing RDF vs. property graphs, plus two different diagrams presenting the market circa 2020.
Authors: Nguyen Quoc Viet Hung (1), Nguyen Thanh Tam (1), Zoltán Miklós (2), Karl Aberer (1),
Avigdor Gal (3), and Matthias Weidlich (4)
1 École Polytechnique Fédérale de Lausanne
2 Université de Rennes 1
3 Technion – Israel Institute of Technology
4 Imperial College London
by Irene Celino, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo and Thorsten Krüger
CEFRIEL – Politecnico di Milano – SIEMENS
by G. Larkou, J. Metochi, G. Chatzimilioudis and D. Zeinalipour-Yazti
Presented at: 1st IEEE International Workshop on Mobile Data Management Mining and Computing on Social Networks, collocated with IEEE MDM'13
The presentation was delivered by FORTH at the 3rd International Workshop on the Role of Semantic Web in Provenance Management 2012 (SWPM 2012) in Heraklion, Greece, on May 28, 2012.
Abstract:
Workflow systems can produce very large amounts of provenance information. In this paper we introduce provenance-based inference rules as a means to reduce the amount of provenance information that has to be stored, and to ease quality control (e.g., corrections). We motivate this kind of (provenance) inference and identify a number of basic inference rules over a conceptual model appropriate for representing provenance. The proposed inference rules concern the interplay between (i) actors and carried out activities, (ii) activities and devices that were used for such activities, and, (iii) the presence of information objects and physical things at events. However, since a knowledge base is not static but it changes over time for various reasons, we also study how we can satisfy change requests while supporting and respecting the aforementioned inference rules. Towards this end, we elaborate on the specification of the required change operations.
This paper was presented by Vassilis Papakonstantinou at the 17th ACM Symposium on Access Control Models and Technologies (ACM SACMAT 2012) in Newark, USA, June 20 - 22, 2012.
Abstract:
The Resource Description Framework (RDF) has become the defacto standard for representing information in the Semantic Web. Given the increasing amount of sensitive RDF data available on the Web, it becomes increasingly critical to guarantee secure access to this content. In this paper we advocate the use of an abstract access control model to ensure the selective exposure of RDF information. The model is defined by a set of abstract operators. Tokens are used to label RDF triples with access information. Abstract operators model RDF Schema inference rules and propagation of labels along the RDF Schema (RDFS) class and property hierarchies. In this way, the access label of a triple is a complex expression that involves the labels of the triples and the operators applied to obtain said label. Different applications can then adopt different concrete access policies that encode an assignment of the abstract tokens and operators to concrete (specific) values. Following this approach, changes in the interpretation of abstract tokens and operators can be easily implemented resulting in a very flexible mechanism that allows one to easily experiment with different concrete access policies (defined per context or user). To demonstrate the feasibility of the approach, we implemented our ideas on top of the MonetDB and PostgreSQL open source database systems. We conducted an initial set of experiments which showed that the overhead for using abstract expressions is roughly linear to the number of triples considered; performance is also affected by the characteristics of the dataset, such as the size and depth of class and property hierarchies as well as the considered concrete policy.
The talk was delivered by Martin Kersten from CWI, the Netherlands, at the workshop on "Global Scientific Data Infrastructures: The Findability Challenge", held in Taormina, Sicily, Italy, on May 10-11, 2012.
This talk was given at the 13th International Conference on Principles of Knowledge Representation and Reasoning (KR 2012), held in Rome, Italy, June 10-14, 2012, by Ilias Tachmazidis (FORTH).
Abstract:
We are witnessing an explosion of available data from the Web, government authorities, scientific databases, sensors and more. Such datasets could benefit from the introduction of rule sets encoding commonly accepted rules or facts, application- or domain-specific rules, commonsense knowledge, etc. This raises the question of whether, how, and to what extent knowledge representation methods are capable of handling the vast amounts of data for these applications. In this paper, we consider nonmonotonic reasoning, which has traditionally focused on rich knowledge structures. In particular, we consider defeasible logic, and analyze how parallelization, using the MapReduce framework, can be used to reason with defeasible rules over huge datasets. Our experimental results demonstrate that defeasible reasoning with billions of facts is performant, and has the potential to scale to trillions of facts.
The presentation was delivered during the 1st International Conference on Health Information Science (HIS 2012) on April 9th, 2012 in Beijing, China.
Abstract:
In cytomics, bookkeeping of the data generated during lab experiments is crucial. The current approach in cytomics is to conduct High-Throughput Screening (HTS) experiments so that cells can be tested under many different experimental conditions. Given the large number of different conditions and the readout of the conditions through images, it is clear that the HTS approach requires a proper data management system to reduce the time needed for experiments and the chance of man-made errors. As different types of data exist, the experimental conditions need to be linked to the images produced by the HTS experiments, with their metadata and the results of further analysis. Moreover, HTS experiments never stand by themselves: as more experiments are lined up, the amount of data and the computation needed to analyze it increase rapidly. To that end, cytomic experiments call for automated and systematic solutions that provide convenient and robust features for scientists to manage and analyze their data. In this paper, we propose a platform for managing and analyzing HTS images resulting from cytomics screens, taking the automated HTS workflow as a starting point. This platform seamlessly integrates the whole HTS workflow into a single system. The platform relies on a modern relational database system to store user data and process user requests, while providing a convenient web interface to end-users. By implementing this platform, the overall workload of HTS experiments, from experiment design to data analysis, is reduced significantly. Additionally, the platform provides the potential for data integration to accomplish genotype-to-phenotype modeling studies.
The talk was given at the 15th International Conference on Extending Database Technology (EDBT 2012) on March 29, 2012 in Berlin, Germany.
Abstract:
Query optimization in RDF stores is a challenging problem, as SPARQL queries typically contain many more joins than equivalent relational plans and hence lead to a large join-order search space. In such cases, cost-based query optimization often is not possible. One practical reason for this is that statistics are typically missing in web-scale settings such as the Linked Open Data (LOD) cloud. The more profound reason is that, due to the absence of schematic structure in RDF, join-hit-ratio estimation requires complicated forms of correlated join statistics, and currently there are no methods to identify the relevant correlations beforehand. For this reason, the use of good heuristics is essential in SPARQL query optimization, even when they are partially combined with cost-based statistics (i.e., hybrid query optimization). In this paper we describe a set of useful heuristics for SPARQL query optimizers. We present these in the context of a new Heuristic SPARQL Planner (HSP) that is capable of exploiting the syntactic and structural variations of the triple patterns in a SPARQL query in order to choose an execution plan without the need of any cost model. For this, we define the variable graph and we show a reduction of the SPARQL query optimization problem to the maximum-weight independent set problem. We implemented our planner on top of the MonetDB open-source column store and evaluated its effectiveness against the state-of-the-art RDF-3X engine, also comparing plan quality with a relational (SQL) equivalent of the benchmarks.
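To illustrate the kind of structure such heuristics exploit, consider a small basic graph pattern (an invented example; the comments sketch the variable-sharing structure that the variable graph captures, though the paper's formal definition may differ):

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX ex:  <http://example.org/>
    SELECT ?x ?name
    WHERE {
      ?x rdf:type ex:Person .   # tp1: two constant positions, syntactically most selective
      ?x ex:knows ?y .          # tp2: shares ?x with tp1
      ?y ex:name  ?name .       # tp3: shares ?y with tp2
    }

A heuristic planner can start from the syntactically most selective pattern (tp1) and follow the shared variables (?x, then ?y) to pick a join order, without consulting any cost model.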
JMeter webinar - integration with InfluxDB and Grafana - RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps means. We also ran a lovely workshop with the participants, trying to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Accelerate your Kubernetes clusters with Varnish Caching - Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GraphRAG is All You need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Smart TV Buyer Insights Survey 2024 - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, the aspects they look at in a new TV, and their TV buying preferences.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
UiPath Test Automation using UiPath Test Suite series, part 3 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, along with traditionally slow and manual security checks, has caused gaps in continuous security as an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Elevating Tactical DDD Patterns Through Object Calisthenics - Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Abstract Access Control Model for Dynamic RDF Datasets
1. Abstract Access Control Models for Dynamic RDF Datasets
Irini Fundulaki, CWI & FORTH-ICS
Giorgos Flouris, FORTH-ICS
Vassilis Papakonstantinou, FORTH-ICS & University of Crete
European Data Forum 2012
2. Controlling Access to RDF Data
• Why RDF data?
  • RDF is the de-facto standard for publishing data in the Linked Open Data Cloud
    • Public Government Data (US, UK, France, Austria, The Netherlands, …)
    • E-Science (astronomy, life sciences, earth sciences)
    • Social Networks
    • DBPedia, Wikipedia, CIA World FactBook, …
• Why access control?
  • Crucial for sensitive content, since it ensures the selective exposure of information to different classes of users
3. Controlling Access to RDF Data
• Fine-grained access control model for RDF
  • focus at the RDF triple level
  • focus on read-only permissions
  • with support for RDFS inference to infer new knowledge
  • encodes how an access label has been computed (i.e., the contributing triples)
• Implementation of a fine-grained, repository-independent access control framework, portable across platforms, on top of the MonetDB column-store engine
4. Access Control Annotations
• Standard access control models associate a concrete access label with each triple:

    s        p     o        permission
    &a       type  Student  allowed
    Student  sc    Person   denied

• An implied RDF triple can be accessed if and only if all its implying triples can be accessed:

    s        p     o        permission
    &a       type  Person   denied
5. Access Control Annotations
• In the case of any kind of update, the implied triples and their labels must be re-computed:

    s        p     o        permission
    &a       type  Student  allowed
    Student  sc    Person   denied → allowed  (label updated)

• An implied RDF triple can be accessed if and only if all its implying triples can be accessed:

    s        p     o        permission
    &a       type  Person   denied → allowed  (must be re-derived)

• The overhead can be substantial when updates occur frequently
6. Access Control Annotations
• Annotation models are easy to handle but are not amenable to changes, since there is no knowledge of the affected triples
• Any change leads to the re-computation of inferred triples and their labels:
  • if the access label of one triple changes
  • if a triple is deleted, modified or added
  • if the semantics according to which the labels of inferred triples are computed change
  • if the policy changes (e.g., a liberal policy becomes conservative)
7. Abstract Access Control Models for RDF
• Encode how the label of an implied triple was computed
• Triples are assigned abstract tokens, not concrete values:

    s        p     o        permission
    &a       type  Student  l1
    Student  sc    Person   l2

    s        p     o        permission
    &a       type  Person   l1 ⊙ l2

• l1, l2: abstract tokens
• ⊙: operator that encodes that inference was used to produce the inferred triple
8. Annotation: Computing the Access Labels
• Triples are assigned labels through authorization queries
• RDFS inference rules are applied to infer new knowledge

Authorizations (query, abstract token):
    A1: (construct {?x firstName ?y} where {?x type Student}, l1)
    A2: (construct {?x sc ?y}, l2)
    A3: (construct {?x type Student}, l3)

RDF quadruples (s, p, o, l):
    q1: Student  sc         Person   l2
    q2: Person   sc         Agent    l2
    q3: &a       type       Student  l3
    q4: &a       firstName  Alice    l1
    q5: Agent    type       class    l4
    q6: Student  sc         Person   l5
9. Annotation: Applying RDFS Inference Rules
RDFS inference: quadruple-generating rules
    (A1, sc, A2, l1), (A2, sc, A3, l2) → (A1, sc, A3, l1 ⊙ l2)
    (&r1, type, A1, l1), (A1, sc, A2, l2) → (&r1, type, A2, l1 ⊙ l2)

RDF quadruples (s, p, o, l):
    q1: Student  sc    Person   l2
    q2: Person   sc    Agent    l2
    q3: &a       type  Student  l3
    q6: Student  sc    Person   l5

Inferred RDF quadruples (s, p, o, l):
    q8:  Student  sc    Agent   l2 ⊙ l2
    q9:  Student  sc    Agent   l5 ⊙ l2
    q10: &a       type  Person  l3 ⊙ l2
    q11: &a       type  Agent   (l3 ⊙ l2) ⊙ l2
    q12: &a       type  Agent   (l5 ⊙ l2) ⊙ l2
10. Evaluation: Assign Concrete Values to Abstract Expressions
• Set of concrete tokens and a mapping from abstract to concrete tokens
• Set of concrete operators that implement the abstract ones
• Conflict-resolution operator to resolve ambiguous labels
• Access function to decide when a triple is accessible
11. Abstract Access Control Models for RDF
• Use of concrete policies to assign concrete values to the abstract tokens and operators:

    s             p     o        l   permission
    q11: &a       type  Student  l3  true
    q12: Student  sc    Person   l2  false

• l3 maps to true and l2 maps to false
• ⊙ maps to logical conjunction

    s   p     o       l        permission
    &a  type  Person  l3 ⊙ l2  true and false = false
12. Abstract Access Control Models for RDF: Updates
• If a concrete policy changes, only the expressions need to be re-evaluated:

    s             p     o        l   permission
    q11: &a       type  Student  l3  false
    q12: Student  sc    Person   l2  true

• l3 maps to false and l2 maps to true
• ⊙ maps to logical disjunction

    s   p     o       l        permission
    &a  type  Person  l3 ⊙ l2  false or true = true
13. Pros & Cons of Abstract Access Control Models
• Pros:
  • The same application can experiment with different concrete policies over the same dataset
    • liberal vs. conservative policies for different classes of users
  • Different applications can experiment with different concrete policies for the same data
  • In the case of updates there is no need to re-compute the inferred triples
• Cons:
  • overhead in the required storage space
  • algebraic expressions can become complex, depending on the structure of the dataset
14. Conclusions & Future Work
• Abstract models to record how the access label of a triple has been computed: beneficial in the case of updates
• Currently working towards a robust implementation of the proposed approach using the MonetDB column-store engine
Editor's Notes
In the last years we have seen an explosion of massive amounts of graph-shaped data coming from a variety of applications related to social networks like Facebook, Twitter, blogs and other on-line media, and telecommunication networks. Furthermore, the W3C Linking Open Data initiative has boosted the publication and interlinkage of a large number of datasets on the Semantic Web, resulting in the Linked Data Cloud. Datasets with billions of RDF triples, such as Wikipedia, the U.S. Census Bureau, the CIA World Factbook, DBPedia, and government sites, have been created and published online. Moreover, numerous datasets and vocabularies from e-science are published nowadays as RDF graphs, most notably in the life and earth sciences and astronomy, in order to facilitate community annotation and interlinkage of both scientific and scholarly data of interest.