ABSTRACT: Knowledge Graphs (KGs) are an emerging, highly flexible and Web-friendly technology for integrating, representing, and querying semi-structured data in a semantically rich model formalized by an Ontology. KGs may be built using specialized data management software (e.g., triplestores) or, by leveraging suitable mappings and query rewriting techniques, as "Virtual Knowledge Graph" (VKG) views over some legacy data source, such as a relational database. In this talk, we provide background information on VKGs and their underlying technologies, with particular emphasis on the open-source Ontop VKG engine, and we discuss ongoing research and development efforts towards their extension to Web APIs as a non-relational data source of practical relevance. This extension, supported by the HIVE and OntoCRM projects, would also enable transparent access to both static relational data and dynamically-computed Web API data as part of a regular VKG query.
BIO: Francesco Corcoglioniti is a researcher at the Free University of Bozen-Bolzano, Italy, where he contributes to research, development, and project collaborations related to Virtual Knowledge Graphs (VKG), their extensions, and their implementation in the open-source Ontop system.
These are the slides I used in a talk about Semantic Web use cases. Not everyone knows what exactly the Semantic Web is about, so I created a set of slides that explain it in a simple and correct way. The use-case slides have been removed from this publicly available version. An animated version is available at goo.gl/qKoF6k . Contact me for the sources!
This document discusses ontology mapping. It begins with an introduction to the semantic web and ontologies. Ontology mapping is important for allowing different ontologies to be aligned and related. There are different types of ontology mapping including alignment, merging, and mapping. The document then surveys some popular ontology mapping techniques including GLUE, PROMPT, and QOM. It evaluates these techniques and discusses their inputs, outputs, and approaches. The document concludes that semantic web research is important for advancing web technologies and realizing the goals of web 3.0. Future work could involve developing new ontology mapping techniques and publishing research on existing mapping methods.
Implementing the Open Government Directive using the technologies of the Soci... (George Thomas)
This presentation demonstrates the use of Semantic Web technologies with Social Networking tools, considering metadata specifications as Social Media. Example ontologies and instance data from the Capital Planning and Investment Control and Business Motivation are created that link 'what' (Agency IT investments) with 'why' (Agency goals and objectives), using a simple linking ontology. Knowledge Workers use a Semantic Halo Mediawiki to curate the data.
Data integration with a façade. The case of knowledge graph construction. (Enrico Daga)
"Data integration with a façade. The case of knowledge graph construction." is an overview of recent research in façade-based data access. The slides introduce the core notions of façade-based data access and the design principles of SPARQL Anything, a system that allows querying many formats (CSV, JSON, XML, HTML, Markdown, Excel, ...) in plain SPARQL.
SEMANTIC WEB SOURCES – comparison of open-source Knowledge Graphs (MatteoBelcao)
A theoretical and practical comparison between the currently most used open-source Knowledge Graphs: DBpedia, Wikidata, and Yago.
A practical explanation of how to query each Knowledge Graph with SPARQL, using the available sandboxes.
A hands-on overview of the semantic web (Marakana Inc.)
This document provides an overview of the Semantic Web. It defines the Semantic Web as linking data to data using technologies like RDF, RDFS, OWL and SPARQL. It explains that RDF represents information as subject-predicate-object statements that can be queried using SPARQL. RDFS allows defining schemas and classes for RDF data, while OWL adds more expressiveness for defining complex ontologies. The document outlines popular Semantic Web tools, public ontologies, and companies working in this domain. It positions the Semantic Web as a way to represent and share data universally on the web.
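The subject-predicate-object model described above can be illustrated with a toy example. The following is a minimal sketch, not a real RDF library: an in-memory list of triples and a pattern-matching function that behaves like a SPARQL basic graph pattern, where `None` plays the role of a variable such as `?x`. All names (`ex:Alice`, `ex:Person`, ...) are invented for illustration.

```python
# Toy in-memory triple store: each statement is a
# (subject, predicate, object) tuple, as in RDF.
triples = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Bob",   "rdf:type", "ex:Person"),
]

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard,
    analogous to a SPARQL variable like ?x."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Analogue of: SELECT ?x WHERE { ?x rdf:type ex:Person }
people = [s for s, _, _ in match(p="rdf:type", o="ex:Person")]
print(people)  # ['ex:Alice', 'ex:Bob']
```

A real system would use a proper RDF library (e.g. rdflib or Jena, both mentioned in the surveyed documents) rather than plain tuples, but the query model is the same: bind variables by matching triple patterns against the graph.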
The document discusses using semantic wikis and rule-based systems together. It describes translating semantic wiki data and queries into logic rules to enable new applications like access control and integrity checking. Examples are given of translating Semantic MediaWiki (SMW) data, queries, and modeling language into rules. The goal is to use semantic wikis for applications that require rules-based reasoning. Next steps discussed include applying it to policy modeling, expanding SMW's expressiveness with negation, and choosing a rule engine.
WWW09 - Triplify: Light-Weight Linked Data Publication from Relational Databases (Sören Auer)
Triplify is a tool that publishes semantic data from relational databases on the web as Linked Data. It works by mapping SQL queries to RDF representations. The SQL queries select structured data from databases behind existing web applications. Triplify then converts the query results into RDF triples. This exposes the semantics behind web applications and makes the data accessible to semantic search engines and applications. Triplify aims to overcome the lack of semantic data on the web by leveraging existing relational data sources.
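The mapping idea behind Triplify can be sketched in a few lines. The following is a hedged illustration, not Triplify's actual code: a SQL query is run against a relational database (here an in-memory SQLite database with an invented `users` table), and each result row is turned into RDF triples by filling a URI pattern with the key column and mapping the remaining columns to predicates.

```python
import sqlite3

# Invented example schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Alan")])

def sql_to_triples(conn, query, subject_pattern, predicate_map):
    """Map each result row to triples: the first column fills the
    subject URI pattern; remaining columns become predicate/object pairs."""
    cur = conn.execute(query)
    cols = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        subject = subject_pattern.format(row[0])
        for col, value in zip(cols[1:], row[1:]):
            triples.append((subject, predicate_map[col], value))
    return triples

triples = sql_to_triples(
    conn,
    "SELECT id, name FROM users",
    "http://example.org/user/{}",
    {"name": "http://xmlns.com/foaf/0.1/name"},
)
print(triples)
# [('http://example.org/user/1', 'http://xmlns.com/foaf/0.1/name', 'Ada'),
#  ('http://example.org/user/2', 'http://xmlns.com/foaf/0.1/name', 'Alan')]
```

This is the same pattern that mapping-based systems like Triplify and VKG engines such as Ontop generalize: the SQL query selects the data, and a declarative mapping decides which URIs and predicates the rows become.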
Although the amount of Linked Data published on the web is steadily increasing, its consumption is still mainly limited to technical users and domain experts. It is therefore necessary to foster intuitive visualizations of Linked Data in order to support users without a technical background. DBpedia Mobile Explorer is a visualization framework that enables non-experts to visualize Linked Data on mobile devices, relying on DBpedia (the Linked Data version of Wikipedia).
Wi2015 - Clustering of Linked Open Data - the LODeX tool (Laura Po)
Presentation of the tool LODeX (http://www.dbgroup.unimore.it/lodex2/testCluster) at the 2015 IEEE/WIC/ACM International Conference on Web Intelligence, Singapore, December 6-8, 2015
Integrating a Domain Ontology Development Environment and an Ontology Search ... (Takeshi Morita)
In order to reduce the cost of building domain ontologies manually, in this paper, we propose a method and a tool named DODDLE-OWL for domain ontology construction reusing texts and existing ontologies extracted by an ontology search engine: Swoogle. In the experimental evaluation, we applied the method to a particular field of law and evaluated the acquired ontologies.
Knowledge graph construction with a façade - The SPARQL Anything Project (Enrico Daga)
The document discusses a project called "SPARQL Anything" which aims to simplify knowledge graph construction by using SPARQL as the single language for representing and transforming diverse data formats into RDF. It presents an approach called "Facade-X" which defines a common RDF structure that can be applied over different formats like CSV, JSON, HTML, etc. This facade focuses on the RDF meta-model and aims to apply minimal ontological commitments. The document outlines how Facade-X can be used to represent different formats and provides examples of using SPARQL to transform sample data into RDF without committing to a specific domain ontology.
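The Facade-X idea of a minimal, ontology-agnostic RDF view can be approximated in a few lines of Python. The sketch below is a loose simplification invented for illustration, not SPARQL Anything's actual model (which also uses blank-node roots, typing, and richer conventions): a CSV file becomes a container of rows, and each row a container of cells, using only the generic RDF membership properties `rdf:_1`, `rdf:_2`, ... and no domain ontology.

```python
import csv
import io

# Generic RDF container membership property rdf:_n.
RDF_LI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#_{}"

def csv_to_facade_triples(text):
    """Expose CSV content as generic triples: the root node contains
    row nodes; each row node contains its cell values in order."""
    triples = []
    root = "_:root"
    for i, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        row_node = f"_:row{i}"
        triples.append((root, RDF_LI.format(i), row_node))
        for j, cell in enumerate(row, start=1):
            triples.append((row_node, RDF_LI.format(j), cell))
    return triples

sample = "name,city\nAda,London\n"
for t in csv_to_facade_triples(sample):
    print(t)
```

The point of such a facade is that the output commits to no domain vocabulary: once the data is visible as generic triples, a single SPARQL CONSTRUCT query can reshape it into whatever domain ontology the knowledge graph requires.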
Architecture Patterns for Semantic Web Applications (bpanulla)
This document provides an overview of non-relational database (NoSQL) architectures and patterns for semantic web applications. It discusses NoSQL key-value and graph databases as alternatives to relational databases for domains where schemas change rapidly or data is sparse. It also covers semantic web technologies like RDF, OWL, SPARQL and linked data for representing information and relationships in a machine-readable way. The document uses examples to illustrate concepts like modeling bookmark data from a social bookmarking site in RDF and querying it with SPARQL.
The document summarizes Ivan Herman's presentation on semantic technology and business applications at the 5th June 2012 Semantic Technology & Business Conference in San Francisco. The presentation covered several topics relating to semantic technologies including knowledge graphs, linked data, ontologies, semantic search, semantic data integration, standards like RDF, OWL, and SPARQL, and applications of semantic technologies in domains like life sciences, publishing, and government. It also discussed ongoing and future work at the W3C relating to areas like provenance, access control, and constraints on semantic web data.
DCMI Keynote: Bridging the Semantic Gaps and Interoperability (Mike Bergman)
M. Bergman's presentation, 'Bridging the Gaps: Adaptive Approaches to Data Interoperability,' was a keynote at the DCMI's DC 2010 International Conference in Pittsburgh, PA, on October 22, 2010.
In the presentation, Bergman points to the Dublin Core Metadata Initiative as a unique and key player in plugging the semantics "gap" within the semantic Web. Some specific activities and roles are suggested.
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs (dgarijo)
In this presentation we describe the Ontology-Based APIs framework (OBA), our approach to automatically create REST APIs from ontologies while following RESTful API best practices. Given an ontology (or ontology network), OBA uses standard technologies familiar to web developers (the OpenAPI Specification, JSON) and combines them with W3C standards (OWL, JSON-LD frames, and SPARQL) to create maintainable APIs with documentation, unit tests, automated validation of resources, and clients (in Python, JavaScript, etc.) that allow non-Semantic-Web experts to access the contents of a target knowledge graph. We showcase OBA with three examples that illustrate the capabilities of the framework for different ontologies.
A Semantic Wiki Based Light-Weight Web Application Model (Jie Bao)
The document proposes a semantic wiki-based light-weight web application model. It describes using semantic wikis like Semantic MediaWiki to create structured data models and process data through wiki templates and queries. Two prototype applications are discussed - an event mapping application and an ontology editing application - to demonstrate this model. The model aims to make structured data and processing more open and accessible to support customized web application development.
The document proposes a five-layer semantic grid architecture with a knowledge layer on top of the Gridbus broker. The knowledge layer includes two modules: semantic description and discovery. Semantic description represents grid domain knowledge in an ontology template using Protege-OWL APIs. Semantic discovery uses the Algernon inference engine to retrieve resource information based on user queries. This knowledge layer is implemented in the Gridbus broker and can support popular middleware like Globus and Alchemi.
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR) (Beat Signer)
This document discusses the Semantic Web and related technologies. It is comprised of several sections that describe key concepts such as the Semantic Web vision, Resource Description Framework (RDF), RDF Schema (RDFS), Web Ontology Language (OWL), and the Jena Semantic Web framework. The document provides examples and explanations of how these technologies relate to representing semantic data on the Web in a way that is accessible to machines.
The document discusses data discovery, conversion, integration and visualization using RDF. It covers topics like ontologies, vocabularies, data catalogs, converting different data formats to RDF including CSV, XML and relational databases. It also discusses federated SPARQL queries to integrate data from multiple sources and different techniques for visualizing linked data including analyzing relationships, events, and multidimensional data.
Ontology-based Cooperation of Information Systems (Raji Ghawi)
This document summarizes an ontology-based approach for cooperation of heterogeneous information systems. It proposes using ontologies and mappings between data sources (e.g. relational databases, XML) and ontologies to enable transparent querying across distributed sources. It describes database-to-ontology and XML-to-ontology mapping specifications and processes, including generating mappings using associations with SQL statements or the DOML mapping language. It also covers query translation from SPARQL to SQL using the different mapping approaches.
Evaluation Initiatives for Entity-oriented Search (krisztianbalog)
This document discusses evaluation initiatives for entity-oriented search tasks. It describes several shared tasks and evaluation campaigns that have been held at conferences like TREC, CLEF, and INEX to provide standardized test collections, gold standard annotations, and evaluation metrics for tasks like ad-hoc entity retrieval from knowledge bases. Examples of test collections created for entity retrieval include datasets using Wikipedia, DBpedia, and a web crawl. The document also discusses related entity finding, entity linking, and understanding keyword queries by associating terms with entities.
This tutorial explains the Data Web vision, some preliminary standards and technologies as well as some tools and technological building blocks developed by AKSW research group from Universität Leipzig.
What do a Lego brick and the XZ backdoor have in common? (Speck&Tech)
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only the fact that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case have much more than that in common.
Join the presentation to immerse yourself in a story of interoperability, standards, and open formats, and then discuss the important role that contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several LibreOffice-related events, migrations, and training activities. She previously worked on LibreOffice migrations and training courses for various public administrations and private organizations. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager; when she is not pursuing her passion for computers and for Geeko, she cultivates her curiosity about astronomy (which is where her nickname deneb_alpha comes from).
Open data: a digital right, to be claimed and nurtured (Speck&Tech)
ABSTRACT: Access to data is essential for a participatory and informed democracy, and it represents a fundamental right, on a par with the right to vote or to free expression. However, it is often not perceived as such, but rather as a mere need of technicians or transparency enthusiasts. We will therefore discuss not only how to claim this right effectively, but also how crucial and beneficial it is to contribute actively to its growth and improvement. We will focus on the importance of machine-readable, up-to-date, well-described data, shared in ways that foster effective use. Because a proactive approach not only broadens access to data, it also increases its value for the whole community.
BIO: Andrea Borruso is an expert in Geographic Information Systems (GIS) and Open Data. He has worked on GIS design and on the processing of spatial (and non-spatial) data, both for public administrations and for companies. He is currently Technical Advisor at Planetek Italia. He is president of onData, a non-profit association that promotes the opening of public data to make it accessible to everyone.
Similar to Towards Virtual Knowledge Graphs over Web APIs
AI in criminal law, from investigations to the drafting of judgments (Speck&Tech)
ABSTRACT: In this talk I will discuss the use of AI as a tool that facilitates the commission of crimes (for example through the use of social engineering in banking fraud), as an investigative tool within criminal proceedings (for example through the use of special surveillance systems), and as a support for judges in determining sentences and drafting judgments. Finally, I will address the possible discrimination that an uncritical use of AI systems can introduce into the administration of justice, distinguishing between mathematical/statistical and social discrimination (as also happens in credit-score assessment for mortgage applications).
BIO: After qualifying as a lawyer, Carlotta Capizzi worked at one of the world's leading consulting firms, focusing in particular on digital payment and anti-fraud systems. She currently works on ICT compliance within a major Italian banking group, with a particular focus on the regulations of the EU digital finance package (DORA, FIDA, and the AI Act). She recently published an article on the AI-based special surveillance systems trialled in Trento and Rovereto, and has studied in depth the interconnection between AI and criminal law.
Vecchi e nuovi diritti per l'intelligenza artificialeSpeck&Tech
ABSTRACT: In questo talk vorrei proporre 2 linee di pensiero volte ad aggiornare il catalogo dei diritti umani alla luce delle caratteristiche tecniche dell'AI, delle sue potenzialità e dei rischi del suo utilizzo. La prima considera l'adattamento di alcuni principi tradizionali (consenso informato e non discriminazione) rispetto alle sfide poste dall'AI. La seconda comprende 4 nuovi diritti: diritto all'eroe, diritto alla discontinuità, diritto ad un ambiente umano, diritto alla AI. Farò anche qualche riferimento all'AI Act, il nuovo regolamento approvato recentemente dal Parlamento dell'Unione Europea.
BIO: Carlo Casonato is professor of comparative constitutional law at the Faculty of Law of the University of Trento and holds the Jean Monnet Chair in European Law of AI (T4F). He has been a Visiting Fellow at Oxford and Yale and is a member of the CNR Commission for Ethics and Integrity in Research. He is chief editor of the BioLaw Journal and the author or editor of over 160 publications (including some twenty books).
What should 6G be? - 6G: bridging gaps, connecting futures (Speck&Tech)
ABSTRACT: The transformative influence of Internet and Communication Technology (ICT) has reshaped society, touching every aspect from the economy to healthcare. As the widespread deployment of 5G continues, there is an ongoing focus on the inception of the sixth generation (6G) of wireless communication systems (WCSs). Anticipated to shape the future of connectivity in the 2030s, 6G aims to deliver unparalleled communication services to meet the demands of hyper-connectivity.
While densely populated urban areas have traditionally been the primary beneficiaries of WCS advancements, the vision for 6G transcends city limits. Aligned with the United Nations' sustainability goals for 2030, an important aspect of 6G endeavors to democratize the benefits of ICT, fostering global connectivity sustainably. This talk delves into this particular envisioned landscape of 6G, providing insights into the future of wireless communication and guiding research efforts toward sustainable, inclusive, and high-speed connectivity solutions for the future.
Central to this discussion are two emerging technologies: Free Space Optics (FSO) and Non-Terrestrial Networks (NTN). These innovative solutions hold the promise of extending high-speed connectivity beyond urban hubs to underserved regions, fostering digital inclusivity, and contributing to the development of remote areas.
Through this exploration, we aim to convey the potential of 6G and its role in shaping a connected, sustainable future for all.
BIO: Mohamed-Slim Alouini was born in Tunis, Tunisia. He earned his Ph.D. from the California Institute of Technology (Caltech) in 1998 before serving as a faculty member at the University of Minnesota and later at Texas A&M University in Qatar. In 2009, he became a founding faculty member at King Abdullah University of Science and Technology (KAUST), where he currently is the Al-Khawarizmi Distinguished Professor of Electrical and Computer Engineering and where he holds the UNESCO Chair on Education to Connect the Unconnected.
Prof. Alouini is a Fellow of the IEEE and OPTICA and his research interests encompass a wide array of research topics in wireless and satellite communications. He is currently particularly focusing on addressing the technical challenges associated with information and communication technologies (ICT) in underserved regions and is deeply committed to bridging the digital divide by tackling issues related to the uneven distribution, access to, and utilization of ICT in rural, low-income, disaster-prone, and hard-to-reach areas.
Creating artificial blood: "buon sangue non mente" ("good blood never lies") (Speck&Tech)
ABSTRACT: The need for blood and blood products is constantly high in hospitals around the world. For this reason, over the years the creation of "artificial blood" has been a challenge taken up by many researchers. The laboratory synthesis of red blood cells, starting from totipotent hematopoietic cells, to replace conventional transfusions has been experimented with but is still far from becoming clinical practice. However, other blood components are already used as substitutes today, while still others have been synthesized in the laboratory and are used for therapeutic purposes. This talk will take us through the advances in this field.
BIO: Alvise Berti is a physician-researcher at the S. Chiara hospital in Trento. He specialized in Clinical Immunology in 2018 at the San Raffaele University in Milan and obtained his PhD in Biomolecular Sciences at CIBIO, University of Trento, in 2021. After two years at the Mayo Clinic (Rochester, MN, USA), he worked as a senior physician in the Rheumatology unit of the APSS Santa Chiara Hospital in Trento, where he developed a translational research line on immune-mediated diseases, autoimmunity and inflammation, and contributed to founding units dedicated to the care of systemic autoimmune diseases. Since 2022 he has been a researcher at the Interdepartmental Center for Medical Sciences (CISMed) of the University of Trento.
AWS: managing scalability at large scale (Speck&Tech)
ABSTRACT: In this talk we will discuss the Cloud and cutting-edge open-source solutions for managing scalability. We will reveal the secrets of managing containers at large scale and what goes on behind the scenes of major serverless services such as AWS Lambda and AWS Fargate, used by more than one million customers worldwide.
BIO: Andrea Catalano holds a degree in Computer Science from the University of Bologna and has more than 15 years of experience in academia. He collaborated with the main Italian universities as a Team Leader at CINECA, the largest Italian computing center, and for the past four years has been a Solution Architect at AWS.
Practically speaking... AWS - Amazon Web Services (Speck&Tech)
ABSTRACT: The cloud is nothing new, and AWS offers a very broad range of services. But how are they used in practice? In this talk we present two use cases for a multinational company and the reasoning involved regarding complexity, architecture and cost-effectiveness in a complex, distributed context.
BIO: Alberto Martinelli holds a degree in Computer Science from the University of Trento and has worked for several local companies in Trentino serving provincial, national and international clients. An expert in software architectures at various scales, he currently works at Fincons as a Manager and Solution Architect.
Data Sense-making: navigating the world through the lens of information design (Speck&Tech)
ABSTRACT: Every day, we're inundated with a staggering amount of information that continues to grow exponentially. How can we process all these inputs and grasp even a fraction of the available knowledge? In my talk, I'll offer a personal reflection on information design and its tools for accessing the world's complexity without necessarily simplifying it. And, why not, I'll also share how I planned a data-driven visit to my favorite theme park.
BIO: Alessandro Zotta is an Information Designer who is the Head of Data Visualization at Accurat and teaches Data-Driven Design at NABA.
Data Activism: data as rhetoric, data as power (Speck&Tech)
ABSTRACT: Contrary to popular beliefs that depict data as truthful or objective, a data activist navigates the data-sphere from an opposite worldview: data is never neutral, and data visualization is inevitably rhetorical. But don’t worry: this is a feature, not a bug. This talk will focus on the many ways in which data can be used for activism, with a particular focus on data-inspired housing rights initiatives like Inside Airbnb and OCIO Venezia, and the works by the information design studio Sheldon.studio.
BIO: Alice Corona is a partner and data journalist at Sheldon.studio, Board Member at Inside Airbnb, and Data activist at OCIO Venezia.
Delve into the world of the human microbiome and metagenomics (Speck&Tech)
ABSTRACT: Shotgun metagenomics provides a comprehensive snapshot of the entire microbial community present in a given environment. PreBiomics, an academic start-up of the University of Trento, employs state-of-the-art sequencing technologies and bioinformatics tools to allow the identification and characterization of the multitude of known and unknown bacteria, viruses, fungi, and other microorganisms present, shedding light on their functional capabilities, interactions, and potential impact on health or specific applications. During the talk, we will see an example of how the human microbiome is emerging as a key target of personalized medicine.
BIO: Mattia Bolzan is the Chief Technology Officer in PreBiomics with more than 8 years of experience as a scientist and bioinformatician between academia and industry. He holds an MSc in Cellular and Molecular Biotechnology and through varied work experiences has developed several skills in microbiology and sequencing data analysis, specialising in computational metagenomics.
Home4MeAi: a social project that uses IoT devices to harness the ... (Speck&Tech)
ABSTRACT: This presentation is not just a technical overview of an innovative project, but also a story of passion and commitment that began way back (from a dev's sense of time 😎) in 2020, with a hackathon during the "Accessibility Days 2020" conference. I will present the vision behind Home4MeAi and its technical architecture, but also how to find free resources to get a project off the ground. I will also introduce Federico Villa: a former Paralympic athlete who believes in the social dimension of technology.
BIO: My name is Luca Nardelli and I am a Software Architect working with .Net and Node.js, but first and foremost I am a passionate and curious engineer. I like experimenting with many technologies: IoT devices (mainly Arduino), wearables (Tizen 5.5), cognitive services (mainly Azure, GCP, OpenAI), mobile development (Flutter), etc. I also love the "human dimension": whenever I can, I travel to get to know people and cultures, and I enjoy nature and animals (😸).
Monitoring a bus fleet: architecture of an acquisition project... (Speck&Tech)
ABSTRACT: In South Tyrol the public transport system is undergoing major changes, with the introduction of new technologies and large investments; the transition takes time, and to keep monitoring the fleet during this phase we decided to build an in-house monitoring system based on free software. We will talk about the system's architecture and the main challenges we encountered during development and deployment to production.
BIO: I am Marco Pavanelli, head of the internal development team at Sasa Spa. I have been working with software for almost 30 years and have a great passion for Python and the open-source ecosystem; I spoke at PyCon Italia 2022 and PyCon Sweden 2023, and was also invited to speak at SFScon 2023 in Bolzano.
ABSTRACT: Ever since OpenAI released ChatGPT, LLMs have been applied to the most diverse domains, from education to medicine. However, what they basically do is to look for patterns in huge amounts of text and use those patterns to guess what the next word in a string is. Having no access to the real world, LLMs are therefore great at mimicry but show important limitations when employed for critical tasks, which may affect human well-being, social justice, and access to digital services. This talk will discuss all the above issues, highlighting the risks but also the huge potential of LLMs.
BIO: Sara Tonelli has been the head of the Digital Humanities research unit at Fondazione Bruno Kessler since 2013. She holds a PhD in Language Sciences from Università Ca' Foscari in Venice and is a member of the Center of Computational Social Science and Human Dynamics, jointly with the University of Trento.
Building intelligent applications with Large Language Models (Speck&Tech)
ABSTRACT: Large Language Models (LLMs) have demonstrated extraordinary capabilities in language understanding and generation, but their most promising feature is probably their reasoning capability. The fact that LLMs can understand complex problems, plan step-by-step solutions, and even work by intuition makes them powerful reasoning engines to be placed at the core of AI-powered applications. In this session, we will explore how LLMs are revolutionizing the world of software development and paving the way for a new landscape of LLM-powered applications.
BIO: I'm Valentina Alto, a Data Science MSc graduate and Cloud Specialist at Microsoft, focusing on Analytics and AI workloads within the manufacturing and pharmaceutical industry since 2022. I've been working on customers' digital transformations, designing cloud architecture and modern data platforms, including IoT, real-time analytics, Machine Learning, and Generative AI. I'm also a tech author, contributing articles on machine learning, AI, and statistics, and I recently published a book on Generative AI and Large Language Models. In my free time, I love hiking and climbing around the beautiful Italian mountains, running, and enjoying a good book with a cup of coffee.
ABSTRACT: The advent of real functional quantum computers will cause a privacy problem. Indeed, quantum computers are particularly good at solving algorithms that ensure information privacy, like the RSA algorithm. In this talk, we will see how quantum computers can be used to restore unconditional security and privacy.
BIO: Nicolò Leone is a Postdoctoral researcher at the Department of Physics of the University of Trento. He obtained his PhD in 2022. His research interests are quantum information and integrated photonics.
ABSTRACT: After introducing the fundamental ideas of quantum computing, we will discuss the possibilities offered by quantum computers in machine learning.
BIO: Davide Pastorello obtained an M.Sc. in Physics (2011) and a Ph.D. in Mathematics (2014) from Trento University. After serving at the Dept. of Mathematics and DISI in Trento, he is currently an assistant professor at the Dept. of Mathematics, University of Bologna. His main research interests concern the mathematical aspects of quantum information theory, quantum computing, and quantum machine learning.
Give your Web App superpowers by using GPUs (Speck&Tech)
ABSTRACT: GPUs are cool! You can use them to play games, create entire movies and even run machine learning models. By using some native libraries it's possible to embrace their powers, using programming languages such as Python or C++. How does this work, though, in a world where everything is a web page? New exciting technologies, such as WebGL and WebGPU, are giving programmers the full power of the underlying hardware. We will see how to build applications that use GPU superpowers to offer things that were previously impossible, such as rendering giant scenes or running ChatGPT locally in the browser, using some real-world production examples.
BIO: Giulio Zausa is a Technical Lead and Software Engineer working at Flux, where he uses web technologies to bring hardware design into the future. He is passionate about 3D rendering technologies, performance optimisations, and building cool stuff for the web, like custom React reconcilers, real-time computer vision on Web Workers, and hardware emulators with WebAssembly.
From leaf to orbit: exploring forests with technology (Speck&Tech)
ABSTRACT: Forests are vital to our planet's health and to many human activities. Given the increasing impact of climate change and human actions, monitoring the condition of our forests has become imperative. At the Forest Ecology Unit of Fondazione Edmund Mach, we employ an array of cutting-edge technologies to examine forest dynamics and enhance management practices. From tree-mounted sensors to orbiting satellites, the talk will provide an overview of the technologies at our disposal and how we are using them.
BIO: Daniele Marinelli is a researcher at the Forest Ecology Unit of Fondazione Edmund Mach where he uses remote sensing to monitor forest dynamics and changes.
ABSTRACT: Wood plays a key role in our societies, especially in achieving sustainable development in the construction sector. Making the best of this precious raw material is essential. MiCROTEC is the leading provider of innovative wood scanning solutions that allow making the best of every single tree. What makes MiCROTEC so successful at what it does? Innovation is the key and it permeates the company’s every activity, from the development of superior hardware to the deployment of AI to detect defects.
BIO: Carlo Saporito Moretto is responsible for Corporate Business Process Management at MiCROTEC, catalyzing data-driven change.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Must-Know Postgres Extensions for DBAs and Developers during Migration (Mydbops)
Mydbops Opensource Database Meetup 16
Topic: Must-Know PostgreSQL Extensions for Developers and DBAs During Migration
Speaker: Deepak Mahto, Founder of DataCloudGaze Consulting
Date & Time: 8th June | 10 AM - 1 PM IST
Venue: Bangalore International Centre, Bangalore
Abstract: Discover how PostgreSQL extensions can be your secret weapon! This talk explores how key extensions enhance database capabilities and streamline the migration process for users moving from other relational databases like Oracle.
Key Takeaways:
* Learn about crucial extensions like oracle_fdw, pgtt, and pg_audit that ease migration complexities.
* Gain valuable strategies for implementing these extensions in PostgreSQL to achieve license freedom.
* Discover how these key extensions can empower both developers and DBAs during the migration process.
* Don't miss this chance to gain practical knowledge from an industry expert and stay updated on the latest open-source database trends.
Mydbops Managed Services specializes in taking the pain out of database management while optimizing performance. Since 2015, we have been providing top-notch support and assistance for the top three open-source databases: MySQL, MongoDB, and PostgreSQL.
Our team offers a wide range of services, including assistance, support, consulting, 24/7 operations, and expertise in all relevant technologies. We help organizations improve their database's performance, scalability, efficiency, and availability.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an... (Jason Yip)
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
High-performance Serverless Java on AWS - GoTo Amsterdam 2024 (Vadym Kazulkin)
Java has for many years been one of the most popular programming languages, but it used to have a hard time in the serverless community: it is known for high cold-start times and a high memory footprint compared to other languages like Node.js and Python. In this talk I'll look at the general best practices and techniques we can use to decrease memory consumption and cold-start times for Java serverless development on AWS, including GraalVM Native Image and AWS's own offering SnapStart, based on Firecracker microVM snapshot and restore and CRaC (Coordinated Restore at Checkpoint) runtime hooks. I'll also provide extensive benchmarking of Lambda functions, trying out various deployment package sizes, Lambda memory settings, Java compilation options, and HTTP (a)synchronous clients, and measuring their impact on cold and warm start times.
"Scaling RAG Applications to serve millions of users", Kevin GoedeckeFwdays
How we managed to grow and scale a RAG application from zero to thousands of users in 7 months. Lessons from technical challenges around managing high load for LLMs, RAGs and Vector databases.
Introduction of Cybersecurity with OSS at Code Europe 2024 (Hiroshi SHIBATA)
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
In this talk we will discuss DDoS protection tools and best practices, network architectures, and what AWS has to offer. We will also look into one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022. We'll see what techniques helped keep web resources available for Ukrainians, and how AWS improved DDoS protection for all customers based on the experience in Ukraine.
Northern Engraving | Nameplate Manufacturing Process - 2024
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
Fueling AI with Great Data with Airbyte Webinar (Zilliz)
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
The Department of Veteran Affairs (VA) invited Taylor Paschal, Knowledge & Information Management Consultant at Enterprise Knowledge, to speak at a Knowledge Management Lunch and Learn hosted on June 12, 2024. All Office of Administration staff were invited to attend and received professional development credit for participating in the voluntary event.
The objectives of the Lunch and Learn presentation were to:
- Review what KM ‘is’ and ‘isn’t’
- Understand the value of KM and the benefits of engaging
- Define and reflect on your “what’s in it for me?”
- Share actionable ways you can participate in Knowledge Capture & Transfer
Skybuffer SAM4U tool for SAP license adoption (Tatiana Kojar)
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
AppSec PNW: Android and iOS Application Security with MobSF (Ajin Abraham)
Mobile Security Framework - MobSF is a free and open source automated mobile application security testing environment designed to help security engineers, researchers, developers, and penetration testers to identify security vulnerabilities, malicious behaviours and privacy concerns in mobile applications using static and dynamic analysis. It supports all the popular mobile application binaries and source code formats built for Android and iOS devices. In addition to automated security assessment, it also offers an interactive testing environment to build and execute scenario based test/fuzz cases against the application.
This talk covers:
Using MobSF for static analysis of mobile applications.
Interactive dynamic security assessment of Android and iOS applications.
Solving Mobile app CTF challenges.
Reverse engineering and runtime analysis of Mobile malware.
How to shift left and integrate MobSF/mobsfscan SAST and DAST in your build pipeline.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency (ScyllaDB)
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from one minute of downtime: $5-$10 thousand. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for the development of highly loaded fintech solutions. We will focus on using queues and streaming to efficiently work and manage large amounts of data in real-time and to minimize latency.
We will focus special attention on the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
Towards Virtual Knowledge Graphs over Web APIs
1. Towards Virtual Knowledge Graphs over Web APIs
Francesco Corcoglioniti
2022-11-09
postdoc @ KRDB, Free University of Bozen-Bolzano,
supported by HIVE Fusion Grant project (2021-2022), OntoCRM project (2022-2024), and Ontopic s.r.l
slides available online at https://bit.ly/3WOoldB
2. 1. Introduction
2. The VKG Framework
3. The Ontop VKG System
4. VKGs over Web APIs
5. Conclusions
3. Big Data Context
Towards Virtual Knowledge Graphs over Web APIs – slides available at https://bit.ly/3WOoldB 1/34
4. Variety Drives Data Management Initiatives
Relative importance of the "three V's" in driving data management initiatives: Variety 69%, Volume 25%, Velocity 6% (source: http://sloanreview.mit.edu/article/variety-not-volume-is-driving-big-data-initiatives/, 2016)
Data model heterogeneity: relational data, graph data, XML, JSON, CSV, text files, ...
System heterogeneity: even when systems adopt the same data model, they are not always fully compatible
Schema heterogeneity: different people see things differently, and design schemas differently
Data-level heterogeneity: e.g., 'IBM' vs. 'Int. Business Machines' vs. 'International Business Machines'
5. Querying Data Takes Time and IT Expertise (besides Domain Knowledge)
Query from the Statoil (now Equinor) use case (EU FP7 Optique project):
• Natural language: in a given area, return pressure data tagged with stratigraphy and quality control attributes
• SQL: a huge query joining 9 tables, the main one with 38 columns with cryptic names

Query from the Sloan Digital Sky Survey use case (EU H2020 INODE project):
• Natural language: get all white dwarf stars
• SQL: an unintelligible query defining 'white dwarf':

SELECT objID
FROM skyserverv3_correct.star
WHERE u - g < .4 AND g - r < .7 AND
      r - i > .4 AND i - z > .4
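The contrast between the two formulations can be made concrete: the expert condition behind "white dwarf" can be named once and then queried by name, which is, in miniature, what a VKG mapping does declaratively. Below is a minimal sketch using a SQL view in Python's sqlite3 (the sample rows are invented for illustration and are not SDSS data):

```python
# Sketch: hide the expert "white dwarf" colour cuts behind a named concept.
# A VKG generalizes this idea with declarative mappings; a SQL view is the
# simplest relational analogue.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE star (objID INTEGER, u REAL, g REAL, r REAL, i REAL, z REAL)")
# Invented sample data: only the first row satisfies the white-dwarf colour cuts.
conn.executemany("INSERT INTO star VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 10.0, 9.8, 9.2, 8.7, 8.2),
                  (2, 12.0, 11.0, 10.0, 9.0, 8.0)])
# The cryptic condition is stated once, inside the view definition.
conn.execute("""CREATE VIEW white_dwarf AS
    SELECT objID FROM star
    WHERE u - g < .4 AND g - r < .7 AND r - i > .4 AND i - z > .4""")
# Users now query the concept, not the formula.
rows = conn.execute("SELECT objID FROM white_dwarf").fetchall()
print(rows)  # [(1,)]
```

Unlike a view, a VKG mapping also lifts the result into the vocabulary of the ontology, so the same concept can be shared across many heterogeneous sources.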
6. Virtual Knowledge Graphs (VKG) – a Data Access / Integration Solution
Three key ideas:
1. use a global (or integrated) schema and map the data sources to the global schema
2. adopt a very flexible data model for the global schema → a Knowledge Graph (KG) whose vocabulary is expressed in an ontology
3. exploit virtualization, i.e., the KG is not materialized, but kept virtual

This gives rise to the Virtual Knowledge Graph (VKG) approach to data access / integration, also called Ontology-Based Data Access / Integration (OBDA)
7. Virtual Knowledge Graphs (VKG) – Core Components
Ontology: conceptualizes a domain of interest in terms of classes and (binary) properties, overall defining the terminological knowledge (TBox) of the VKG
Data sources: provide the data forming the RDF triples, i.e., the assertional knowledge (ABox), of the VKG
Mapping: defines how to generate the RDF triples from the raw data (e.g., relational), via mapping assertions that populate each class/property of the ontology
Queries: formulated against the VKG (which is virtual) and rewritten into native queries evaluated over the sources
[Figure: queries and results flow through the ontology O, which is connected via the mapping M to the data sources D]
9. VKG Framework – Which Languages to Use?
Need to balance:
• expressive power of the adopted languages for O, M, q
• query answering efficiency with respect to data size
W3C has standardized languages that are suitable for VKGs:
• Knowledge graph: expressed in RDF (W3C Rec. 2014)
• Ontology O: expressed in OWL 2 QL (W3C Rec. 2012)
• Mapping M: expressed in R2RML (W3C Rec. 2012)
• Query q: expressed in SPARQL (W3C Rec. 2013)
10. RDF – Data Represented as a Graph
The graph consists of a set of ⟨subject, predicate, object⟩ triples, over IRI, literal and blank nodes
• IRI nodes (formerly URI):
<http://example.org/M-25>,
<M-25>, ex:M-25 or :M-25
• Literal nodes:
"2008-02-12", "The Matrix"@en,
"511"^^xsd:integer
• class membership triples:
<A-1> rdf:type :Actor .
• object property triples:
<A-1> :playsIn <M-25> .
• data property triples:
<M-25> :releaseDate "2008-02-12" .
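The three kinds of triples above can be sketched in a few lines of plain Python (no RDF library; the tuple encoding is illustrative, not a standard serialization):

```python
# A minimal sketch of the slide's triple kinds, represented as
# (subject, predicate, object) tuples. Objects of object properties
# are IRIs; objects of data properties are literals.

triples = {
    ("<A-1>", "rdf:type", ":Actor"),             # class membership
    ("<A-1>", ":playsIn", "<M-25>"),             # object property (object is an IRI)
    ("<M-25>", ":releaseDate", '"2008-02-12"'),  # data property (object is a literal)
}

# A graph is just a set of such triples; e.g., all subjects typed as :Actor:
actors = {s for (s, p, o) in triples if p == "rdf:type" and o == ":Actor"}
# actors == {'<A-1>'}
```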
11. OWL 2 QL – Lightweight Ontology Language for Accessing Large Amounts of Data
Standard sub-language of OWL 2 [W3C Rec. 2012]
Its assertions encode a logical theory in the
DL-Lite fragment of description logics that
enables reasoning by query rewriting
Close correspondence with UML class diagrams
and ER schemas used in conceptual modeling
:actsIn rdfs:range :Movie
:actsIn rdfs:subPropertyOf :playsIn
. . . owl:someValuesFrom . . .
[Figure: UML class diagram: Actor (name: String) with {disjoint} subclasses SeriesActor and MovieActor; Play (title: String) with subclass Movie; associations playsIn (Actor to Play) and actsIn (MovieActor to Movie, 1..⋆)]
Assertion type DL syntax OWL syntax
Subclass assertion MovieActor ⊑ Actor :MovieActor rdfs:subClassOf :Actor .
Class disjointness Actor ⊑ ¬Movie :Actor owl:disjointWith :Movie .
Domain of a property ∃actsIn ⊑ MovieActor :actsIn rdfs:domain :MovieActor .
Range of a property ∃actsIn⁻ ⊑ Movie :actsIn rdfs:range :Movie .
Subproperty assertion actsIn ⊑ playsIn :actsIn rdfs:subPropertyOf :playsIn .
Inverse properties actsIn ≡ hasActor⁻ :actsIn owl:inverseOf :hasActor .
Mandatory participation MovieActor ⊑ ∃actsIn owl:someValuesFrom in superclass expression
12. Mappings
Define how to populate classes & properties via assertions of the form:
Qsql(x⃗) ⇝ iri(x⃗) rdf:type C
Qsql(x⃗) ⇝ iri₁(x⃗) P iri₂(x⃗)
Ontology O:
:actsIn rdfs:domain :MovieActor .
:actsIn rdfs:range :Movie .
:Movie rdfs:subClassOf :Play .
:title rdfs:domain :Play .
:title rdfs:range xsd:string .
...
Mapping M:
m1: SELECT mcode, mtitle FROM MOVIE WHERE type = 'm'
⇝ :m-{mcode} rdf:type :Movie . :m-{mcode} :title {mtitle} .
m2: SELECT M.mcode, A.acode FROM MOVIE M, ACTOR A
WHERE M.mcode = A.pcode AND M.type = 'm'
⇝ :a-{acode} :actsIn :m-{mcode} .
Database D:
MOVIE
mcode mtitle myear type · · ·
511 The Matrix 1999 m · · ·
227 Blade Runner 1982 m · · ·
ACTOR
pcode acode aname · · ·
511 43 K. Reeves · · ·
511 57 C.A. Moss · · ·
VKG V from O, M, D:
:m-511 rdf:type :Movie .
:m-227 rdf:type :Movie .
:m-511 :title "The Matrix" .
:m-227 :title "Blade Runner" .
:a-43 :actsIn :m-511 .
:a-57 :actsIn :m-511 .
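The way m1 and m2 turn the MOVIE and ACTOR rows into the VKG triples can be sketched as follows (a hypothetical illustration of the unfolding, not Ontop code):

```python
# Applying mapping assertions m1 and m2 to the slide's tables.

MOVIE = [  # (mcode, mtitle, myear, type)
    (511, "The Matrix", 1999, "m"),
    (227, "Blade Runner", 1982, "m"),
]
ACTOR = [  # (pcode, acode, aname)
    (511, 43, "K. Reeves"),
    (511, 57, "C.A. Moss"),
]

triples = set()
# m1: SELECT mcode, mtitle FROM MOVIE WHERE type = 'm'
for mcode, mtitle, _myear, mtype in MOVIE:
    if mtype == "m":
        triples.add((f":m-{mcode}", "rdf:type", ":Movie"))
        triples.add((f":m-{mcode}", ":title", mtitle))
# m2: join MOVIE and ACTOR on mcode = pcode, restricted to type = 'm'
for mcode, _t, _y, mtype in MOVIE:
    for pcode, acode, _name in ACTOR:
        if mtype == "m" and mcode == pcode:
            triples.add((f":a-{acode}", ":actsIn", f":m-{mcode}"))

# triples now equals the VKG V shown on the slide (6 triples).
```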
13. SPARQL Query Language
Standard query language for RDF data [W3C Rec. 2008, 2013], based on graph matching
SELECT ?a ?t WHERE {
?a rdf:type :Actor .
?a :playsIn ?m .
?m rdf:type :Movie .
?m :title ?t .
}
Additional language features (SPARQL 1.1):
• UNION: matches one of alternative graph patterns
• OPTIONAL: produces a match even when part of the pattern is missing
• complex FILTER conditions
• GROUP BY, to express aggregations
• MINUS, to remove possible solutions
• property paths (regular expressions)
• · · ·
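The graph-matching idea behind SPARQL evaluation can be sketched as a naive binding join (an illustrative toy, not a real SPARQL engine):

```python
# Each triple pattern is matched against the data, joining compatible
# variable bindings; one solution per consistent assignment.

data = {
    ("<A-1>", "rdf:type", ":Actor"),
    ("<A-1>", ":playsIn", "<M-25>"),
    ("<M-25>", "rdf:type", ":Movie"),
    ("<M-25>", ":title", "The Matrix"),
}
pattern = [("?a", "rdf:type", ":Actor"),
           ("?a", ":playsIn", "?m"),
           ("?m", "rdf:type", ":Movie"),
           ("?m", ":title", "?t")]

def match(pattern, data):
    solutions = [{}]                                  # start with the empty binding
    for tp in pattern:
        extended = []
        for binding in solutions:
            for triple in data:
                new, ok = dict(binding), True
                for term, value in zip(tp, triple):
                    if term.startswith("?"):          # variable term
                        if new.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:               # constant term must match
                        ok = False
                        break
                if ok:
                    extended.append(new)
        solutions = extended
    return solutions

# match(pattern, data) -> [{'?a': '<A-1>', '?m': '<M-25>', '?t': 'The Matrix'}]
```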
14. Query Answering in VKGs
Goal: answer a query q over a VKG V by jointly considering:
• the data provided by the data source D
• the mapping M, encoding how such data translates to the ontology vocabulary
• the ontology O, encoding domain knowledge that can be used to enrich answers.
Example:
• suppose that an entity :m-511 of class Movie can be obtained from the data D using some
mapping assertion in M (e.g., m1 about table MOVIE)
• suppose the ontology O states that each Movie is a Play, i.e., :Movie rdfs:subClassOf :Play
• if query q asks for all Plays, we should also return :m-511, which is a Movie and thus also a Play
Solution: query answering by Query Reformulation
15. Query Answering in VKGs – Query Reformulation
[Figure: query reformulation pipeline over ontology O, mappings M, data sources D]
Ontological query q → Rewriting (using O) → rewritten query → Unfolding (using M) → SQL query → Evaluation (over D) → relational answer → Result Translation → ontological answer

Example, with:
O: :Movie rdfs:subClassOf :Play
M: SELECT mcode FROM MOVIE ⇝ :m-{mcode} a :Movie
D: MOVIE (mcode, mtitle, …)

Ontological query:
SELECT ?p {
  ?p rdf:type :Play
}

Rewritten query:
SELECT ?p {
  { ?p rdf:type :Play }
  UNION
  { ?p rdf:type :Movie }
}

Unfolded SQL query:
SELECT mcode
FROM MOVIE
WHERE type = 'm'

Relational answer: mcode = 511 → ontological answer: ?p = :m-511
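The Rewriting step can be sketched as a tiny fixpoint over the ontology's subclass assertions (a toy illustration, not Ontop's actual rewriter):

```python
# Expand a query for all :Play instances into a UNION over :Play and
# every class the ontology declares (transitively) as its subclass.

subclass_of = [(":Movie", ":Play")]  # O: :Movie rdfs:subClassOf :Play

def rewrite(cls):
    """Return cls together with all its (transitive) subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for sub, sup in subclass_of:
            if sup in result and sub not in result:
                result.add(sub)
                changed = True
    return result

# rewrite(":Play") -> {':Play', ':Movie'}, matching the UNION on the slide.
```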
16. 1. Introduction
2. The VKG Framework
3. The Ontop VKG System
4. VKGs over Web APIs
5. Conclusions
17. The Ontop VKG System
https://ontop-vkg.org/
• state-of-the-art VKG system developed at UNIBZ since 2009 (first research in 2004)
• compliant with all relevant Semantic Web standards:
RDF, RDFS, OWL 2 QL, R2RML, SPARQL, and GeoSPARQL
• implemented in Java (v1.8+) and also available as Docker image
• supports all major relational DBMSs:
Oracle, DB2, MS SQL Server, Postgres, MySQL, Teiid, Dremio, Denodo, etc.
• open-source (Apache 2) project with a solid community
200+ mailing list members, 9000+ downloads in the last 2 years
• commercial services (open-core model) by Ontopic, a UNIBZ spin-off founded in 2019
18. Ontop Usage Scenarios
[Figure: from ontology + mapping + data, either virtualize into a Virtual Knowledge Graph queried directly, or materialize into a Materialized Knowledge Graph loaded into a triple store]
VKG query answering
• supports most of SPARQL 1.1 under
OWL 2 QL inference regime
• standard-compliant SPARQL endpoint
• over one relational source, or
• over multiple heterogeneous sources,
together with a data federation system
(e.g., Teiid, Dremio) providing an
integrated relational view of sources
VKG materialization
• use ontology and mappings to efficiently
& scalably materialize all the VKG triples
• the produced RDF file can be loaded in
any triplestore
20. Ontop in Research and Industrial Projects
Research projects
• Optique (EU FP7, 11/2012-10/2016)
Ontop-based scalable end-user access to
big data, 10 partners incl. Statoil, Siemens
• EPNet (ERC Advanced Grant)
cultural heritage project on food production
and distribution in the Roman Empire
• KAOS (Euregio, 06/2016-05/2019)
preparing standardized log files from
timestamped log data for process mining
• INODE (EU H2020, 11/2019-10/2022)
intelligent open data exploration
• IDEE (ERDF 2014-2020)
building & energy consumption data VKG
Industrial projects
• NOI Techpark
development of the South Tyrol tourism KG
• SIRIS Academic (Barcelona)
open data integration and dashboards
• Siemens Corporate Technology (Munich)
access to temporal and streaming data
• Robert Bosch GmbH (Stuttgart)
analysis of manufacturing log data
• Metaphacts (Germany)
inclusion of Ontop in their platform
• Fluxicon (Milano)
• Isagog (Rome)
21. Ontop in Action – Optique Project, Statoil Use Case
From SQL query over the data source ...
SELECT wellbore.identifier, stratigraphic_zone.strat_column_identifier,
pty_pressure.pty_pressure_s, stratigraphic_zone.strat_unit_identifier
FROM wellbore, pty_pressure, activity fp_depth_data LEFT JOIN (
pty_location_1d AS fp_depth_pt1_loc
JOIN picked_stratigraphic_zones AS zs
ON zs.strat_zone_entry_md <= fp_depth_pt1_loc.Data_value_1_o AND
zs.strat_zone_exit_md >= fp_depth_pt1_loc.Data_value_1_o AND
zs.strat_zone_depth_uom = fp_depth_pt1_loc.Data_value_1_ou
JOIN stratigraphic_zone
ON zs.wellbore = stratigraphic_zone.wellbore AND
zs.strat_column_identifier = stratigraphic_zone.strat_column_identifier AND
zs.strat_interp_version = stratigraphic_zone.strat_interp_version AND
zs.strat_zone_identifier = stratigraphic_zone.strat_zone_identifier
) ON fp_depth_data.facility_s = zs.wellbore AND
fp_depth_data.activity_s = fp_depth_pt1_loc.activity_s,
activity_class AS form_pressure_class
WHERE wellbore.wellbore_s = fp_depth_data.Facility_s AND
fp_depth_data.activity_s = pty_pressure.activity_s AND
fp_depth_data.kind_s = form_pressure_class.activity_class_s AND
wellbore.ref_existence_kind = 'actual' AND
form_pressure_class.name = 'formation pressure depth data'
... to VKG SPARQL query
SELECT ?wellbore ?chronostrat_unit
?top_md_m ?lithostrat_unit
{
?w a :Wellbore ;
:name ?wellbore ;
:hasWellboreInterval ?intv .
?intv a :StratigraphicZone ;
:hasUnit ?cu ;
:hasTopDepth ?top .
?cu :name ?chronostrat_unit ;
:ofStratigraphicColumn
[ a :ChronoStratigraphicColumn ] .
?top a :MeasuredDepth ;
:valueInStandardUnit ?top_md_m .
?intv :overlapsWellboreInterval
?litho_intv .
?litho_intv :hasUnit ?lu .
?lu :name ?lithostrat_unit ;
:ofStratigraphicColumn
[ a :LithoStratigraphicColumn ] .
}
22. Ongoing Research & Development Directions
Mapping patterns
• bootstrapping (semi-automated generation) of mappings & possibly ontology for a data source
• reduces VKG deployment costs, which are mostly related to mapping authoring
Provenance & explanations
• report which sources/tuples, mappings and ontology axioms contributed to a query answer
• prototype Ontop extension based on provenance approaches (semirings) from the DB community
Geospatial queries
• support GeoSPARQL to manipulate & query for geometries, leveraging DB support (e.g., PostGIS)
Temporal/streaming extensions
• support SQL-enabled stream processors like Flink and pattern matching over streaming data
Non-relational sources
• support non-relational data sources such as MongoDB, Neo4J and Web APIs
23. 1. Introduction
2. The VKG Framework
3. The Ontop VKG System
4. VKGs over Web APIs
5. Conclusions
24. Accessing Web APIs
Data is increasingly available via Web APIs
• access to 3rd-party and/or dynamically-computed data
• access to data-related services, e.g., text search
Some API statistics*
• 83% of all Internet traffic belongs to API-based services
• 2M+ API repositories on GitHub
• 90% of developers use APIs
• 30% of development time spent on coding APIs
Complex data access problem for applications operating on
data from both databases and APIs
* https://nordicapis.com/20-impressive-api-economy-statistics/
[Figure: an application accessing RDB sources via SQL and API sources via API calls: a complex data access problem]
25. Accessing Web APIs – Open Data Hub (ODH) RDB + Semantic Search API Example
Answer hybrid queries like:
• get (plot) IRI, description, rating &
location of accommodations ...
• whose rating is 3 stars or more
(structured constraint) and ...
• whose EN description matches the
search string “horse riding” (text
constraint)
Semantic search: improved text search
that aims at capturing and leveraging
text meaning (vs term matching only)
• e.g., via a BERT-based model from the
Sentence Transformers library
26. Accessing Web APIs – Unified Access using a VKG
• applications operate on a unified VKG spanning APIs and
other involved sources
→ each API operation as an independent source
→ data federation setting due to multiple sources
• VKG built (e.g., via Ontop) over a Virtual Database (VDB)
federating all sources
→ VDB produced by a data federation system (e.g., Teiid)
→ the VDB offers a relational view of API data
→ VKG query reformulation may be tuned to this setting
• delegate the complex orchestration of source sub-queries
and API calls to a VKG + data federation system
• exploit existing database techniques to cope with API access
pattern restrictions during query answering
[Figure: User / Application issues SPARQL queries to the VKG (Ontop extension), which sends SQL to a Virtual DB (Teiid extension) federating RDB sources (via SQL) and API sources (via calls)]
27. VDB – SQL/MED Specification
SQL/MED allows federating multiple sources in a virtual database (VDB)
• standardized SQL extension supported by some data federation systems like Teiid
• VDB as a set of schemas mapped to foreign data sources accessed via wrappers/translators
• we extend Teiid with a new service translator for accessing APIs
Example using Teiid with our extensions:
CREATE DATABASE vdb_example OPTIONS ( "... connection options for federated sources ..." );
USE DATABASE vdb_example;
CREATE SERVER db_source FOREIGN DATA WRAPPER postgresql; -- define RDB source with schema 'db'
CREATE SCHEMA db SERVER db_source; -- using 'postgresql' translator to access it
CREATE SERVER srv_source FOREIGN DATA WRAPPER service; -- define API source with schema 'srv'
CREATE SCHEMA srv SERVER srv_source; -- using 'service' translator to access it
IMPORT FOREIGN SCHEMA public FROM SERVER db_source INTO db OPTIONS ( importer.catalog 'public' );
SET SCHEMA srv;
-- CREATE FOREIGN TABLE / PROCEDURE statements mapped to API operations (API bindings)
28. VDB – API Bindings
API operations as SQL/MED procedures
• input tuple → 0..n output tuples
• URL, method, request/response templates
CREATE FOREIGN PROCEDURE api_semsearch_query (
query VARCHAR
) RETURNS TABLE (
query VARCHAR,
id VARCHAR,
score DOUBLE,
excerpt VARCHAR
) OPTIONS (
"method" 'post',
"url" 'http://semsearch:8080/query',
"requestBody" '{"query": "{query}", "n": 100}',
"responseBody" '{"matches": [{
"id": "{id}",
"score": "{score}",
"excerpt": "{excerpt}" }] }'
);
API data as SQL/MED virtual tables
• linked to API operations/procedures
• each procedure defines an access pattern
CREATE FOREIGN TABLE vt_semsearch_match (
query VARCHAR NOT NULL,
id VARCHAR NOT NULL,
score DOUBLE NOT NULL,
excerpt VARCHAR NOT NULL,
PRIMARY KEY (query, id)
) OPTIONS ( "select" 'api_semsearch_query' );
CREATE FOREIGN TABLE vt_semsearch_index (
id VARCHAR PRIMARY KEY,
text VARCHAR NOT NULL
) OPTIONS (
"UPDATABLE" 'true',
"upsert" 'api_semsearch_store',
"delete" 'api_semsearch_clear'
);
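The idea behind the binding OPTIONS above (request template filled with the input tuple, output tuples extracted following the responseBody shape) can be sketched as follows. This is an illustrative approximation, not Teiid's actual translator code:

```python
# Instantiate a {name}-style request template and extract output tuples
# from a JSON response shaped like the responseBody template.
import json
import re

request_template = '{"query": "{query}", "n": 100}'

def build_request(template, inputs):
    # Replace {name} placeholders with the corresponding input values.
    return re.sub(r"\{(\w+)\}", lambda m: str(inputs[m.group(1)]), template)

body = build_request(request_template, {"query": "horse riding"})
# body parses as {"query": "horse riding", "n": 100}

# Extracting (query, id, score, excerpt) output tuples from a response:
response = {"matches": [{"id": "A1", "score": 0.9, "excerpt": "riding stables"}]}
rows = [("horse riding", m["id"], m["score"], m["excerpt"])
        for m in response["matches"]]
```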
29. VDB – Query Translation & Execution
Given a VDB defined using SQL/MED + API Bindings and an input query over the VDB
• Teiid splits the query into sub-queries based on translator capabilities and cost heuristics
• sub-queries are sent to translators & Teiid handles remaining operations (e.g., federated joins)
Example SQL query
SELECT s.score,
s.excerpt,
a."AccoCategoryId",
a."AccoDetail-en-Name",
a."AccoDetail-en-City"
FROM srv.vt_semsearch_match AS s
JOIN db.v_accommodationsopen AS a
ON s.id = a."Id"
WHERE s.query = 'horse riding'
ORDER BY s.score DESC
LIMIT 10
Execution plan
LimitNode (limit = 10)
SortNode (s.score DESC)
ProjectNode (s.score, ... a."AccoDetail-en-City")
JoinNode (s.id = a."Id", merge join strategy)
AccessNode (API)
SELECT id, excerpt, score
FROM vt_semsearch_match
WHERE query = 'horse riding'
AccessNode (RDB)
SELECT "Id", "AccoDetail-en-Name",
"AccoDetail-en-City"
FROM v_accommodationsopen
30. VDB – Push-down of Projection, Filtering, Sorting, Slicing
Special input attributes map API capabilities related to standard relational operators
• filtering: return/process only objects matching some criteria (e.g., attribute = or ≥ constant)
• projection: include/exclude certain attributes in returned results
• sorting: sort results according to a certain attribute and direction (ascending/descending)
• slicing: return only a given page of all possible results
CREATE FOREIGN PROCEDURE api_station_data_from_to (
stype VARCHAR NOT NULL,
sname VARCHAR NOT NULL,
tname VARCHAR NOT NULL,
__min_inclusive__mvaliddate DATE NOT NULL, -- filter push down (conditions min <= mvaliddate <= max)
__max_inclusive__mvaliddate DATE NOT NULL,
__limit__ INTEGER -- slicing push down
) RETURNS TABLE ( ... )
OPTIONS ( ... );
Partial/complete push down of these operators whenever possible
• allows offloading computation to the API (e.g., sorting)
• allows reducing costs by manipulating & transferring less data
31. VDB – Exploiting Bulk API Operations
Bulk API operations operate on multiple input tuples, such as lookup by a set of IDs or bulk store
• their use enables better performance due to fewer API calls
• useful to speed up dependent joins (using the IN operator) between RDBMS and API data
[Figure: dependent join R ⨝(R.A = S.A) S between an RDBMS table R and a virtual table S backed by a bulk API operation with input attribute A]
1. SELECT A, … FROM R WHERE … is evaluated on the RDBMS.
2. The values a1, a2, … of the join attribute A are extracted.
3. SELECT A, … FROM S WHERE A IN (a1, a2, …) AND … is evaluated over the virtual table.
4. The API bindings issue bulk API calls, each with multiple input tuples for different values of A: a1, a2, …
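The dependent-join speed-up can be sketched as batching the join-key values into IN-lists, so one API call covers many tuples (`call_api` below is a hypothetical stand-in for the translator's bulk call):

```python
# Dependent join accelerated by a bulk API operation.

def dependent_join(rdb_rows, call_api, batch_size=100):
    """rdb_rows: list of dicts with join key 'A'; returns merged rows."""
    keys = [r["A"] for r in rdb_rows]
    api_rows = {}
    for i in range(0, len(keys), batch_size):       # one bulk call per batch
        batch = keys[i:i + batch_size]              # ~ WHERE A IN (a1, a2, ...)
        for row in call_api(batch):
            api_rows[row["A"]] = row
    return [{**r, **api_rows[r["A"]]} for r in rdb_rows if r["A"] in api_rows]

# Fake bulk API: returns one row per requested key.
fake_api = lambda ks: [{"A": k, "score": len(k)} for k in ks]
joined = dependent_join([{"A": "a1"}, {"A": "a2"}, {"A": "a3"}],
                        fake_api, batch_size=2)
# two API calls instead of three; joined has 3 merged rows
```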
32. VDB – Data Materialization
Data materialization: required by API operations that cannot be invoked at query time
• operations too expensive to call at query time (e.g., align API and DB identifiers)
• operations instrumental to the use of external APIs (e.g., text indexing in a search engine)
Solution #1: materialized views in Teiid (or other data federation system used)
Solution #2: dedicated materialization engine for
flexibly executing arbitrary materialization rules:
• identifier – for documentation & diagnostics
• target – the system-managed computed table
(possibly virtual) where data is stored
• source – arbitrary SQL query (over any tables)
that produces the data to store
rules:
- id: index_accommodation_texts
target: vt_semsearch_index
source: |-
SELECT "Id" AS id,
"AccoDetail-en-Longdesc" AS text
FROM v_accommodationsopen
WHERE "AccoDetail-en-Longdesc"
IS NOT NULL
- ... other rules ...
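Executing one such rule amounts to running its SQL source query and storing the result into its target table. A minimal sketch, using sqlite3 as a stand-in for the federated VDB (names follow the slide's example rule):

```python
# Run the rule's source query and insert the result into the target table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE v_accommodationsopen ("Id" TEXT, "AccoDetail-en-Longdesc" TEXT);
INSERT INTO v_accommodationsopen VALUES
  ('A1', 'Riding stables nearby'), ('A2', NULL);
CREATE TABLE vt_semsearch_index (id TEXT PRIMARY KEY, text TEXT NOT NULL);
""")

rule_source = """
SELECT "Id" AS id, "AccoDetail-en-Longdesc" AS text
FROM v_accommodationsopen
WHERE "AccoDetail-en-Longdesc" IS NOT NULL
"""
conn.execute("INSERT INTO vt_semsearch_index " + rule_source)
rows = conn.execute("SELECT id, text FROM vt_semsearch_index").fetchall()
# rows: [('A1', 'Riding stables nearby')] -- the NULL-description row is skipped
```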
33. VDB – Data Materialization (cont’d)
Rules (their SQL source queries) are analyzed to derive a rule dependency graph, which is mapped
to an execution plan using fixpoint rule evaluation for strongly connected components
[Figure: rule/table dependencies condensed into rule dependencies: R3 and R4 depend on each other (a strongly connected component), R2 feeds them, R1 is independent, and R5 depends on the results]
Execution plan:
sequence (
  parallel (
    R1,
    sequence (
      R2,
      fixpoint (
        parallel (
          R3,
          R4
        )
      )
    )
  ),
  R5
)
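Deriving such a plan can be sketched as: compute the strongly connected components of the rule dependency graph (a fixpoint group each), then evaluate the condensed DAG in topological order. A hedged sketch using a plain Kosaraju pass, with a dependency graph mirroring the slide's example (not the actual engine code):

```python
# SCCs become fixpoint groups; the condensation is topologically ordered.
from graphlib import TopologicalSorter

deps = {  # rule -> set of rules it depends on
    "R1": set(), "R2": set(),
    "R3": {"R2", "R4"}, "R4": {"R2", "R3"},
    "R5": {"R1", "R3", "R4"},
}

def sccs(deps):
    """Strongly connected components via Kosaraju's two-pass algorithm."""
    order, seen = [], set()
    def dfs(v):
        seen.add(v)
        for w in deps[v]:
            if w not in seen:
                dfs(w)
        order.append(v)                       # post-order finish times
    for v in deps:
        if v not in seen:
            dfs(v)
    rev = {v: set() for v in deps}            # reversed edges
    for v, ws in deps.items():
        for w in ws:
            rev[w].add(v)
    comps, assigned = [], set()
    for v in reversed(order):                 # decreasing finish time
        if v in assigned:
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in assigned:
                assigned.add(u)
                comp.add(u)
                stack.extend(rev[u] - assigned)
        comps.append(frozenset(comp))
    return comps

comp_of = {v: c for c in sccs(deps) for v in c}
dag = {c: {comp_of[w] for v in c for w in deps[v]} - {c}
       for c in set(comp_of.values())}
plan = list(TopologicalSorter(dag).static_order())
# plan ends with {R5}; {R3, R4} forms one component, evaluated as a fixpoint
```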
34. VKG – Example of Ontology & Mappings over the VDB
Ontology
schema:Accommodation a owl:Class ;
rdfs:subClassOf schema:Place ;
rdfs:label "Accommodation"@en ;
...
schema:name a owl:DatatypeProperty ;
...
hive:Match a owl:Class ...
Current ontology formalism (OWL 2 QL) reused
as is, but now also models data from APIs
Mappings
mappingId Semantic Search
target data:match/accommodation/{id}/{query}
a hive:Match;
hive:query {query}^^xsd:string;
hive:resource data:accommodation/{id};
hive:excerpt {excerpt}@en;
hive:score {score}^^xsd:decimal.
source SELECT *
FROM hiveodh.srv.vt_semsearch_match
Current VKG mapping formalism reused as is, but
data may now come from API virtual tables
35. VKG – Query Rewriting & Evaluation Example
User-supplied SPARQL query
SELECT ?h ?posLabel ?rating ?pos {
[] a hive:Match ;
hive:query "horse riding"^^xsd:string ;
hive:resource ?h ;
hive:excerpt ?excerpt ;
hive:score ?score .
?h a schema:LodgingBusiness ;
geo:defaultGeometry/geo:asWKT ?pos ;
schema:name ?name ;
schema:description ?description ;
schema:starRating/schema:ratingValue ?rating.
FILTER (?rating >= 3 && lang(?name) = 'en' &&
lang(?description) = 'en')
BIND (CONCAT(?name, " <br><br>...", ?excerpt,
"...<br><br>", ?description) AS ?posLabel)
}
ORDER BY DESC(?score) LIMIT 10
SQL query rewritten by Ontop
SELECT
v1.id,
v1.excerpt, -- fields used
v2."AccoDetail-en-Name", -- for deriving
v2."AccoDetail-en-Longdesc", -- ?posLabel
... complex expression computing rating ...,
ST_ASTEXT(v2."Geometry")
FROM
hiveodh.srv.vt_semsearch_match v1,
hiveodh.db.v_accommodationsopen v2
WHERE
v1."id" = v2."Id" AND
CAST(v1."query" AS TEXT) = 'horse riding' AND
... complex condition on rating >= 3 ... AND
... nonnull conditions for output columns ...
ORDER BY CAST(v1."score" AS DECIMAL) DESC
LIMIT 10
SQL query evaluated on the VDB by Teiid
36. VKG – ODH with Semantic Search Demo
Data sources
DB with ODH tourism data +
Semantic search API to index &
query accommodations texts
System
Ontop embedding Teiid +
materialization engine
Demo
https://hive.inf.unibz.it/odh/vkg/
37. Overall Framework for VKGs over APIs
Virtual DB (VDB): Teiid + service translator, federating RDB sources (via SQL) and API sources (via calls)
• API Bindings: define how to query/update a virtual table via API calls, if possible → limited access patterns
• Materialization Rules: pre-compute results of expensive API calls → VDB/VKG no longer fully “virtual”
Virtual Knowledge Graph (VKG): Ontop, over the VDB (via SQL)
• VKG Mappings: including virtual tables, used for query rewriting
• VKG Ontology: formalizes the classes/properties (the “schema”) of the VKG, enabling reasoning
Applications access the VKG via SPARQL, or the VDB directly via SQL.
38. 1. Introduction
2. The VKG Framework
3. The Ontop VKG System
4. VKGs over Web APIs
5. Conclusions
39. Takeaway Messages
Virtual Knowledge Graphs (VKG): flexible technology for building KGs over existing data source(s)
• useful for inherently relational data, where a VKG engine + RDBMS may outperform a triplestore
• useful for RDF-ifying existing data via VKG materialization to an RDF file
Ontop: mature, open-source VKG system with a solid user & developer community
• allows a VKG over a single RDB, with support for multiple database engines
• allows a VKG over multiple heterogeneous sources, in combination with an intermediate data
federation system such as the open-source Teiid or Dremio
• active research & development for adding new features and new data sources
VKGs over Web APIs: ongoing research & development effort
• enables transparent access to dynamically-computed API data via declarative queries
• API operations mapped to virtual relations, accessed through a Teiid extension
• optimizations for better exploiting API features, such as bulk operations and operator push-down
• expensive API operations supported via pre-computation and data materialization