Knowledge Patterns for the Web: extraction, transformation, and reuse Andrea Nuzzolese
KPs are an abstraction of frames as introduced by Fillmore and Minsky. KP discovery needs to address two main research problems: the heterogeneity of sources, formats and semantics in the Web (i.e., the knowledge soup problem) and the difficulty of drawing a relevant boundary around data that captures the meaningful knowledge with respect to a certain context (i.e., the knowledge boundary problem). Hence, we introduce two methods that provide different solutions to these two problems by tackling KP discovery from two different perspectives: (i) the transformation of KP-like artifacts (i.e., top-down defined artifacts that can be compared to KPs, such as FrameNet frames or Ontology Design Patterns) to KPs formalized as OWL2 ontologies; (ii) the bottom-up extraction of KPs by analyzing how data are organized in Linked Data. The two methods address the knowledge soup and boundary problems in different ways. The first method is based on a purely syntactic transformation of the original source to RDF, followed by a refactoring step whose aim is to add semantics to RDF by selecting meaningful RDF triples. The second method draws boundaries around RDF in Linked Data by analyzing type paths. A type path is a possible route through an RDF graph that takes into account the types associated with the nodes of a path. Unfortunately, type paths are not always available. In fact, Linked Data is a knowledge soup because of the heterogeneous semantics of its datasets and because of the limited intensional as well as extensional coverage of ontologies (e.g., DBpedia ontology, YAGO) or other controlled vocabularies (e.g., SKOS, FOAF). Thus, we propose a solution for enriching Linked Data with additional axioms (e.g., rdf:type axioms) by exploiting the natural language available, for example, in annotations (e.g., rdfs:comment) or in corpora on which datasets in Linked Data are grounded (e.g., DBpedia is grounded on Wikipedia). Then we present K∼ore, a software architecture conceived to be the basis for developing KP discovery systems and designed according to two software architectural styles, i.e., Component-based and REST. K∼ore is the architectural binding of a set of tools, i.e., K∼tools, which implement the methods for KP transformation and extraction. Finally, we provide an example of KP reuse based on Aemoo, an exploratory search tool which exploits KPs for performing entity summarization.
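To make the notion of a type path more concrete, here is a minimal Python sketch. It is not the author's implementation: it assumes a toy graph held as plain (subject, predicate, object) tuples, considers only paths of length one, and uses made-up `ex:` names throughout. It counts how often each (subject type, property, object type) combination occurs, and it silently skips triples whose nodes carry no `rdf:type`, which is the situation the abstract describes as type paths not being available.

```python
from collections import Counter, defaultdict

# Toy RDF graph as (subject, predicate, object) tuples.
# All ex: identifiers are hypothetical and exist only for illustration.
TRIPLES = [
    ("ex:Rome", "rdf:type", "ex:City"),
    ("ex:Italy", "rdf:type", "ex:Country"),
    ("ex:Colosseum", "rdf:type", "ex:Monument"),
    ("ex:Pantheon", "rdf:type", "ex:Monument"),
    ("ex:Colosseum", "ex:locatedIn", "ex:Rome"),
    ("ex:Pantheon", "ex:locatedIn", "ex:Rome"),
    ("ex:Rome", "ex:capitalOf", "ex:Italy"),
    ("ex:Rome", "ex:twinnedWith", "ex:Paris"),  # ex:Paris has no rdf:type
]


def type_paths(triples):
    """Count length-1 type paths: (subject type, property, object type)."""
    # First pass: collect the declared types of every node.
    types = defaultdict(set)
    for s, p, o in triples:
        if p == "rdf:type":
            types[s].add(o)

    # Second pass: lift each non-typing triple to the level of types.
    counts = Counter()
    for s, p, o in triples:
        if p == "rdf:type":
            continue
        # If either end is untyped, the inner loops do not run and
        # the triple contributes no type path.
        for s_type in types.get(s, ()):
            for o_type in types.get(o, ()):
                counts[(s_type, p, o_type)] += 1
    return counts


if __name__ == "__main__":
    for path, n in type_paths(TRIPLES).most_common():
        print(n, path)
```

In this sketch the `ex:twinnedWith` triple yields no type path because `ex:Paris` is untyped; enriching the data with an `rdf:type` axiom for that node would make the path appear.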
The Open Knowledge Extraction (OKE) Challenge focuses on the production of new knowledge aimed at either populating and enriching existing knowledge bases or creating new ones. This means that the defined tasks focus on extracting concepts, individuals, properties, and statements that do not necessarily already exist in a target knowledge base, and on representing them according to Semantic Web standards so that they can be directly injected into linked datasets and their ontologies. The OKE challenge has the ambition to advance a reference framework for research on Knowledge Extraction from text for the Semantic Web by re-defining a number of tasks (typically from information and knowledge extraction) to take into account specific Semantic Web requirements. The Challenge is open to everyone from industry and academia.
The ability to take data, understand it, visualize it and extract useful information from it is becoming a hugely important skill. How can you turn all those logs, histories of purchases and trades or open government data into useful information that helps your business make money?
In this talk, we’ll look at doing data science using F#. The F# language is perfectly suited for this task – type providers integrate external data directly into the language – your language suddenly _understands_ CSV, XML, JSON, REST services and other sources. The interactive development style makes it easy to explore data and test your algorithms as you’re writing them. A rich set of libraries for working with data frames, time series and visualization gives you all the tools you need. And finally – F# easily integrates with statistical environments like R and Matlab, giving you access to the industry standard libraries.
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications Francesco Osborne
In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture ‘standard’ scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and semantic technologies, for mining technologies from research publications and generating an OWL ontology describing their relationships with other research entities. The resulting knowledge base can support a number of tasks, such as: richer semantic search, which can exploit the technology dimension to support better retrieval of publications; richer expert search; monitoring the emergence and impact of new technologies, both within and across scientific fields; studying the scholarly dynamics associated with the emergence of new technologies; and others.
TechMiner was evaluated on a manually annotated gold standard and the results indicate that it significantly outperforms alternative NLP approaches and that its semantic features improve performance significantly with respect to both recall and precision.
Abaques Lecko - Lessons from the 2013 benchmark Lecko
This document gathers the 8 key lessons drawn from the benchmark carried out by Lecko, which compared the social activity on the enterprise social network (RSE, réseau social d'entreprise) platforms of a panel made up of: Air liquide, Albéa Groupe, Allianz, CDC Climat, Crédit Agricole Creditor Insurance, ENRx, JCDecaux, GDF Suez, Kaufmann & Broad, Lafarge, Lecko, Louis Vuitton, Michelin, Simply Market, Suez Environnement.
Journal presented at AlignmentTrack at ISWC2017.
This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).
Deep Learning for Information Retrieval: Models, Progress, & Opportunities Matthew Lease
Talk given at the 8th Forum for Information Retrieval Evaluation (FIRE, http://fire.irsi.res.in/fire/2016/), December 10, 2016, and at the Qatar Computing Research Institute (QCRI), December 15, 2016.
An overview of existing solutions for link discovery, looking into some of the state-of-the-art algorithms for the rapid execution of link discovery tasks, with a focus on algorithms which guarantee result completeness.
(HOBBIT project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688227.)
With the continuously increasing number of datasets that are published in the Web of Data and form part of the Linked Open Data Cloud, it becomes more and more essential to identify resources that correspond to the same real-world object in order to interlink web resources and set the basis for large-scale data integration. This requirement becomes apparent in a multitude of domains ranging from science (marine research, biology, astronomy, pharmacology) to semantic publishing and cultural domains. In this context, instance matching is of crucial importance.
It is, however, essential at this point to develop, along with instance and entity matching systems, benchmarks that determine the weak and strong points of those systems, as well as their overall quality, in order to support users in deciding which system to use for their needs. Hence, well-defined, good-quality benchmarks are important for comparing the performance of the developed instance matching systems.
In this tutorial we aim at:
- Discussing the state-of-the-art instance matching benchmarks
- Presenting the benchmark design principles
- Providing an analysis of the performance results of instance matching systems for the presented benchmarks
- Presenting the research directions that should be exploited for the creation of novel benchmarks to answer the needs of the Linked Data paradigm.
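As a small illustration of how a benchmark can score an instance matching system, the Python sketch below compares the links a system produced against a gold standard and reports precision, recall and F-measure. These measures are assumed here as the usual ones for such comparisons; the tutorial abstract does not name its metrics, and the `a:`/`b:` identifiers and the function name `evaluate_matching` are hypothetical.

```python
def evaluate_matching(found_links, gold_links):
    """Score produced links against a gold standard.

    Both arguments are iterables of (source_id, target_id) pairs.
    Returns (precision, recall, f_measure) as floats.
    """
    found = set(found_links)
    gold = set(gold_links)
    true_positives = len(found & gold)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical gold standard: four pairs that denote the same real-world object.
GOLD = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3"), ("a:4", "b:4")}
# Hypothetical system output: two correct links and one wrong link.
FOUND = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:9")}

if __name__ == "__main__":
    p, r, f = evaluate_matching(FOUND, GOLD)
    print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

Precision penalises wrong links and recall penalises missed ones, so reporting both (or their harmonic mean) shows whether a system is cautious or greedy rather than collapsing that trade-off into a single accuracy figure.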
Tutorial web page: http://www.ics.forth.gr/isl/BenchmarksTutorial/
Keynote: SemSci 2017: Enabling Open Semantic Science
1st International Workshop co-located with ISWC 2017, October 2017, Vienna, Austria
https://semsci.github.io/semSci2017/
Abstract
We have all grown up with the research article and article collections (let’s call them libraries) as the prime means of scientific discourse. But research output is more than just the rhetorical narrative. The experimental methods, computational codes, data, algorithms, workflows, Standard Operating Procedures, samples and so on are the objects of research that enable reuse and reproduction of scientific experiments, and they too need to be examined and exchanged as research knowledge.
We can think of “Research Objects” as different types and as packages of all the components of an investigation. If we stop thinking of publishing papers and start thinking of releasing Research Objects (software), then scholarly exchange is a new game: ROs and their content evolve; they are multi-authored and their authorship evolves; they are a mix of virtual and embedded, and so on.
But first, some baby steps before we get carried away with a new vision of scholarly communication. Many journals (e.g. eLife, F1000, Elsevier) are just figuring out how to package together the supplementary materials of a paper. Data catalogues are figuring out how to virtually package multiple datasets scattered across many repositories to keep the integrated experimental context.
Research Objects [1] (http://researchobject.org/) is a framework by which the many, nested and contributed components of research can be packaged together in a systematic way, and their context, provenance and relationships richly described. The brave new world of containerisation provides the containers, and Linked Data provides the metadata framework for the container manifest construction and profiles. It’s not just theory but also practice, with examples in Systems Biology modelling, Bioinformatics computational workflows, and Health Informatics data exchange. I’ll talk about why and how we got here, the framework and examples, and what we need to do.
[1] Sean Bechhofer, Iain Buchan, David De Roure, Paolo Missier, John Ainsworth, Jiten Bhagat, Philip Couch, Don Cruickshank, Mark Delderfield, Ian Dunlop, Matthew Gamble, Danius Michaelides, Stuart Owen, David Newman, Shoaib Sufi, Carole Goble, Why linked data is not enough for scientists, In Future Generation Computer Systems, Volume 29, Issue 2, 2013, Pages 599-611, ISSN 0167-739X, https://doi.org/10.1016/j.future.2011.08.004
Profile-based Dataset Recommendation for RDF Data Linking Mohamed BEN ELLEFI
With the emergence of the Web of Data, most notably Linked Open Data (LOD), an abundance of data has become available on the web. However, LOD datasets and their inherent subgraphs vary heavily with respect to their size, topic and domain coverage, the schemas and their data dynamicity (respectively schemas and metadata) over time. To this extent, identifying suitable datasets, which meet specific criteria, has become an increasingly important, yet challenging task to support issues such as entity retrieval or semantic search and data linking. Particularly with respect to the interlinking issue, the current topology of the LOD cloud underlines the need for practical and efficient means to recommend suitable datasets: currently, only well-known reference graphs such as DBpedia (the most obvious target), YAGO or Freebase show a high amount of in-links, while there exists a long tail of potentially suitable yet under-recognized datasets. This problem is due to the semantic web tradition in dealing with "finding candidate datasets to link to", where data publishers are used to identify target datasets for interlinking.
While an understanding of the nature of the content of specific datasets is a crucial prerequisite for the mentioned issues, we adopt in this dissertation the notion of "dataset profile": a set of features that describe a dataset and allow the comparison of different datasets with regard to their represented characteristics. Our first research direction was to implement a collaborative filtering-like dataset recommendation approach, which exploits both existing dataset topic profiles, as well as traditional dataset connectivity measures, in order to link LOD datasets into a global dataset-topic-graph. This approach relies on the LOD graph in order to learn the connectivity behaviour between LOD datasets. However, experiments have shown that the current topology of the LOD cloud group is far from being complete to be considered as a ground truth and consequently as learning data.
Facing the limits of the current topology of LOD (as learning data), our research has led to break away from the topic profiles representation of "learn to rank" approach and to adopt a new approach for candidate datasets identification where the recommendation is based on the intensional profiles overlap between different datasets. By intensional profile, we understand the formal representation of a set of schema concept labels that best describe a dataset and can be potentially enriched
Abstract:
An increasing number of applications rely on RDF, OWL 2, and SPARQL for storing and querying data. SPARQL, however, is not targeted towards end-users, and suitable query interfaces are needed. Faceted search is a prominent approach for end-user data access, and several RDF-based faceted search systems have been developed. There is, however, a lack of rigorous theoretical underpinning for faceted search in the context of RDF and OWL 2. In this paper, we provide such solid foundations. We formalise faceted interfaces for this context, identify a fragment of first-order logic capturing the underlying queries, and study the complexity of answering such queries for RDF and OWL 2 profiles. We then study interface generation and update, and devise efficiently implementable algorithms. Finally, we have implemented and tested our faceted search algorithms for scalability, with encouraging results.
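The abstract above can be grounded with a small Python sketch of what a faceted query over RDF-style data amounts to. This is an illustration under stated assumptions, not the paper's formalisation or its algorithms: triples are plain tuples, a facet is taken to be a property with a set of selectable values, values within one facet are combined disjunctively and different facets conjunctively, and no OWL 2 reasoning is performed. All `ex:` names and both function names are hypothetical.

```python
from collections import defaultdict

# Toy dataset as (subject, predicate, object) tuples; identifiers are made up.
TRIPLES = [
    ("ex:p1", "ex:topic", "ex:RDF"),
    ("ex:p1", "ex:year", "2015"),
    ("ex:p2", "ex:topic", "ex:OWL"),
    ("ex:p2", "ex:year", "2015"),
    ("ex:p3", "ex:topic", "ex:RDF"),
    ("ex:p3", "ex:year", "2016"),
]


def select(triples, selection):
    """Return subjects matching a facet selection.

    `selection` maps a property to a set of accepted values.
    A subject matches if, for every selected property, it has
    at least one of the accepted values.
    """
    by_subject = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        by_subject[s][p].add(o)
    return {
        s
        for s, props in by_subject.items()
        if all(props.get(p, set()) & values for p, values in selection.items())
    }


def remaining_facets(triples, subjects):
    """List the facet values still offered for the current result set."""
    facets = defaultdict(set)
    for s, p, o in triples:
        if s in subjects:
            facets[p].add(o)
    return dict(facets)


if __name__ == "__main__":
    hits = select(TRIPLES, {"ex:topic": {"ex:RDF"}})
    print(sorted(hits))
    print(remaining_facets(TRIPLES, hits))
```

Recomputing `remaining_facets` after each selection corresponds to the interface update step: the interface only offers values that still lead to a non-empty result.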
A paper presented at the 1st International Workshop on Benchmarking Linked Data (BLINK). We present experimental results with the instance matching benchmark generator LANCE that is developed in the context of HOBBIT.
(HOBBIT project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688227.)
Effective Semantics for Engineering NLP Systems Andre Freitas
Provide a synthesis of the emerging representation trends behind NLP systems.
Shift in perspective:
Effective engineering (task driven, scalable) instead of sound formalism.
Best-effort representation.
Knowledge Graphs (Frege revisited)
Information Extraction & Text Classification
Distributional Semantic Models
Knowledge Graphs & Distributional Semantics
(Distributional-Relational Models)
Applications of DRMs
KG Completion
Semantic Parsing
Natural Language Inference
The new frontiers of AI in RPA with UiPath Autopilot™ UiPathCommunity
In this free online event, organized by the UiPath Italian Community, you can explore the new features of Autopilot, the tool that integrates Artificial Intelligence into the processes of developing and using automations.
📕 Together we will look at some examples of using Autopilot in different tools of the UiPath Suite:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Semantic Publishing Domain
1. SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Semantic Publishing Domain
T. Saveta (1), E. Daskalaki (1), G. Flouris (1), I. Fundulaki (1), M. Herschel (2), A.-C. Ngonga Ngomo (3)
(1) FORTH-ICS, (2) University of Stuttgart, (3) University of Leipzig
2. Semantic Publishing Instance Matching Benchmark (SPIMBENCH) 2
Instance Matching in Linked Data
[Figure labels: data acquisition, data evolution, data integration, open/social data]
How can we automatically recognize multiple mentions of the same entity across or within sources? = Instance Matching
3. Benchmarking
Instance matching research has led to the development of various systems and algorithms.
• How to compare these?
• How can we assess their performance?
• How can we push the systems to get better?
These systems need to be benchmarked.
4. SPIMBENCH
• Based on the Semantic Publishing Benchmark (SPB) of the Linked Data Benchmark Council (LDBC)
• Synthetic benchmark for the Semantic Publishing Domain
• Value-based, structure-based and semantics-aware transformations [FMN+11, FLM08]
• Deterministic, scalable data generation on the order of billions of triples
• Weighted gold standard
5. Instance Matching Benchmark Ingredients [FLM08]
A benchmark consists of:
• Datasets
• Gold Standard
• Test Cases
• Metrics
7. Value & Structure Based Transformations
Value: mainly typographical errors and the use of different data formats [FMN+11].
– Blank Character Addition/Deletion
– Random Character Addition/Deletion/Modification
– Token Addition/Deletion/Shuffle
– Multi-linguality (65 supported languages)
– Date Format
– Change Number
– Synonym/Antonym
– Abbreviation
– Stem of a Word
Structure: changes that occur to the properties.
– Property Addition/Deletion
– Property Aggregation/Extraction
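The value-based and structure-based transformations listed on this slide can be illustrated with a minimal Python sketch. This is not the SPIMBENCH implementation: modelling an entity as a plain dictionary of property values, and all function and property names (`delete_random_char`, `aggregate_properties`, `firstName`, and so on), are assumptions made only for illustration.

```python
import random

def delete_random_char(value: str, rng: random.Random) -> str:
    """Value-based: drop one character at a random position."""
    if not value:
        return value
    i = rng.randrange(len(value))
    return value[:i] + value[i + 1:]

def add_blank(value: str, rng: random.Random) -> str:
    """Value-based: insert one blank character at a random position."""
    i = rng.randrange(len(value) + 1)
    return value[:i] + " " + value[i:]

def shuffle_tokens(value: str, rng: random.Random) -> str:
    """Value-based: reorder the whitespace-separated tokens."""
    tokens = value.split()
    rng.shuffle(tokens)
    return " ".join(tokens)

def delete_property(entity: dict, prop: str) -> dict:
    """Structure-based: return a copy of the entity without one property."""
    return {p: v for p, v in entity.items() if p != prop}

def aggregate_properties(entity: dict, props: list, new_prop: str) -> dict:
    """Structure-based: merge several property values into one new property."""
    merged = " ".join(str(entity[p]) for p in props if p in entity)
    result = {p: v for p, v in entity.items() if p not in props}
    result[new_prop] = merged
    return result

# Hypothetical source entity; the property names are illustrative only.
rng = random.Random(42)  # fixed seed keeps the generated target repeatable
source = {
    "title": "Semantic Publishing Benchmark",
    "firstName": "Ada",
    "lastName": "Lovelace",
}
# Combine one structure-based and one value-based transformation.
target = aggregate_properties(source, ["firstName", "lastName"], "name")
target["title"] = delete_random_char(target["title"], rng)
```

Seeding the random generator is one simple way to obtain the deterministic generation the deck mentions; how SPIMBENCH itself achieves determinism is not stated here.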
10. Weighted Gold Standard
• Detailed GS for debugging reasons
• Final GS: contains only URIs that we consider a match and their similarity
[Figure: gold standard schema. Classes: owl:Thing, spimbench:Match, spimbench:Transformation, spimbench:ValueTransf, spimbench:StructureTransf, spimbench:SemanticsAwareTransf, and the individual transformations spimbench:VT1 … spimbench:VTi, spimbench:ST1 … spimbench:STi, spimbench:SAT1 … spimbench:SATi. Properties: spimbench:source, spimbench:target, spimbench:weight (xsd:string), spimbench:transformation, spimbench:onProperty (rdf:Property). Edge types in the legend: rdfs:subPropertyOf, rdfs:subClassOf, rdf:type.]
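The relation between the detailed and the final gold standard can be sketched in Python as follows. The field names mirror the properties shown in the schema (source, target, weight, transformation, onProperty), but attaching all of them to a single record, the `DetailedMatch` class, the `final_gold_standard` helper and the example URIs are assumptions for illustration, not the authors' data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DetailedMatch:
    """One entry of a detailed gold standard (hypothetical layout)."""
    source: str          # URI of the source instance
    target: str          # URI of the matching target instance
    weight: str          # kept as a string, since the schema shows xsd:string
    transformation: str  # e.g. "spimbench:VT1" (illustrative identifier)
    on_property: str     # property the transformation was applied to

def final_gold_standard(detailed):
    """Keep only the matched URIs and their similarity, as in the final GS."""
    return sorted({(m.source, m.target, float(m.weight)) for m in detailed})

# Hypothetical example entry; the URIs are placeholders.
detailed = [
    DetailedMatch(
        source="http://example.org/source/1",
        target="http://example.org/target/1",
        weight="0.8",
        transformation="spimbench:VT1",
        on_property="http://example.org/title",
    )
]
final = final_gold_standard(detailed)
```

The detailed entries keep which transformation was applied to which property, which is what makes them useful for debugging a matcher; the final projection drops that information.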
11. Scalability Experiments (1/2)
• Scalability experiments for datasets of up to 500M triples
• 1000 triples ~ 36 entities
• Data generation together with data transformation is linear in the number of triples
• Transformation overhead is negligible for value-based, structure-based, semantics-aware and simple combinations
• Overhead for complex combinations is higher by one order of magnitude
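As a rough worked example, the ratio of about 36 entities per 1000 triples reported above can be turned into entity counts for different dataset sizes. The helper name `estimated_entities` is hypothetical, and the result is only an approximation that assumes the ratio holds at every scale.

```python
ENTITIES_PER_1000_TRIPLES = 36  # approximate ratio reported on the slide

def estimated_entities(num_triples: int) -> int:
    """Approximate number of entities in a dataset with num_triples triples."""
    return num_triples * ENTITIES_PER_1000_TRIPLES // 1000

# Dataset sizes mentioned in the deck: 10K, 25K, 50K and 500M triples.
for n in (10_000, 25_000, 50_000, 500_000_000):
    print(f"{n} triples ~ {estimated_entities(n)} entities")
```

Under that assumption, the largest dataset of 500M triples corresponds to roughly 18 million entities.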
13. Performance of LogMap [JG11]
[Figures: performance of LogMap for 10K, 25K and 50K triples]
14. Conclusions
• Schema-aware variations
– Complex class definitions
– Property constraints
– Equivalence, Disjointness, etc.
• Combination of transformations
• Scalable data generation on the order of billions of triples
– Uses sampling
• Weighted gold standard
– Final gold standard
– Detailed gold standard for debugging reasons
15. Future Work
• SPIMBENCH will be used as one of the Ontology Alignment Evaluation Initiative [OAEI] benchmarks for 2015.
• Domain-independent instance matching test case generator.
• Definition of more sophisticated metrics that take into account the difficulty (weight).
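To make the last item concrete, the sketch below computes standard precision and recall against a gold standard and, next to them, one possible weight-aware recall. The weighted variant is purely a hypothetical illustration: the deck lists such metrics as future work and does not define them, so the formula, the function names and the example data are all assumptions.

```python
def precision_recall(found: set, gold: set):
    """Standard precision and recall of found matches against the gold standard."""
    correct = len(found & gold)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

def weighted_recall(found: set, gold_difficulty: dict) -> float:
    """Hypothetical variant: each gold match counts in proportion to its difficulty."""
    total = sum(gold_difficulty.values())
    if total == 0:
        return 0.0
    hit = sum(d for pair, d in gold_difficulty.items() if pair in found)
    return hit / total

# Illustrative data: (source, target) pairs with an assumed difficulty weight.
gold_difficulty = {("s1", "t1"): 0.2, ("s2", "t2"): 0.8}
found = {("s1", "t1"), ("s3", "t3")}

precision, recall = precision_recall(found, set(gold_difficulty))
w_recall = weighted_recall(found, gold_difficulty)
```

In this example the system finds only the easier of the two gold matches, so its weighted recall is lower than its plain recall, which is the kind of distinction a difficulty-aware metric is meant to expose.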
16. Acknowledgments
This work was partially supported by the ongoing FP7 European Project LDBC (Linked Data Benchmark Council) (317548) and was done in collaboration with I. Fundulaki, M. Herschel (University of Stuttgart), G. Flouris, E. Daskalaki and A.-C. Ngonga Ngomo (University of Leipzig).
17. References
[FLM08] A. Ferrara, D. Lorusso, S. Montanelli, and G. Varese. Towards a Benchmark for Instance Matching. In OM, 2008.
[FMN+11] A. Ferrara, S. Montanelli, J. Noessner, and H. Stuckenschmidt. Benchmarking Matching Applications on the Semantic Web. In ESWC, 2011.
[NV13] M. Nickel and V. Tresp. Tensor Factorization for Multi-relational Learning. In Machine Learning and Knowledge Discovery in Databases, Springer Berlin Heidelberg, 2013, pp. 617-621.
[J11] J. M. Joyce. Kullback-Leibler Divergence. In International Encyclopedia of Statistical Science, Springer Berlin Heidelberg, 2011, pp. 720-722.
[JG11] E. Jimenez-Ruiz and B. C. Grau. LogMap: Logic-based and scalable ontology matching. In ISWC, 2011.
[FT04] B. Fuglede and F. Topsoe. Jensen-Shannon divergence and Hilbert space embedding. In IEEE International Symposium on Information Theory, 2004.
[OAEI] Ontology Alignment Evaluation Initiative, http://oaei.ontologymatching.org/
We are currently working on a domain-independent instance matching test case generator for Linked Data, whose aim is to take any ontology and RDF dataset as source and produce a target dataset that implements the test cases discussed earlier. We are also studying how to define more sophisticated metrics that take into account the difficulty (weight) of the correctly identified matches, to be used in tandem with the standard precision and recall metrics.
SPIMBENCH will also be used as one of the OAEI benchmarks for 2015.
---------------------------------------------------------------------------------------------------------------
Regarding the future development of the system, we will try to make SPIMBench completely independent of any domain. It will also be able to support more combinations of transformations in a more automatic way. We also need to re-examine the metrics (precision, recall) so that they can take the weights into account.
Wald method [ref] for sampling?? -> it also predicts how large the error will be, depending on k
++++
We looked at all the basic constructs of OWL Lite and OWL RL, and the ones we covered were the only ones that made sense; otherwise it would have been very difficult for the systems, and so on.