SEALS presented the results of its first SEALS Evaluation Campaigns at I-SEMANTICS 2011, held from September 7-9 in Graz, Austria.
Read more about SEALS at http://www.seals-project.eu/
Methodology and Campaign Design for the Evaluation of Semantic Search Tools - Stuart Wrigley
The main problem with the state of the art in the semantic search domain is the lack of comprehensive evaluations. There exist only a few efforts to evaluate semantic search tools and to compare the results with other evaluations of their kind.
In this paper, we present a systematic approach for testing and benchmarking semantic search tools that was developed within the SEALS project. Unlike other Semantic Web evaluations, our methodology tests search tools both automatically and interactively, with a human user in the loop. This allows us to test not only functional performance measures, such as precision and recall, but also usability issues, such as ease of use and comprehensibility of the query language.
The paper describes the evaluation goals and assumptions; the criteria and metrics; and the type of experiments we will conduct, as well as the datasets required to conduct the evaluation in the context of the SEALS initiative. To our knowledge, it is the first effort to present a comprehensive evaluation methodology for Semantic Web search tools.
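The functional measures named above are the standard information-retrieval ones. As a quick, illustrative sketch (the result sets below are invented for the example, not SEALS data), precision, recall and F-measure for a single query can be computed as:

```python
def precision_recall_f1(returned, expected):
    """Compute precision, recall and F-measure for one query.

    `returned` and `expected` are sets of result items
    (e.g. triples or document IDs)."""
    returned, expected = set(returned), set(expected)
    true_positives = len(returned & expected)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative query: 3 of 4 returned answers are correct,
# and 3 of 5 expected answers were found.
p, r, f = precision_recall_f1({"a", "b", "c", "x"},
                              {"a", "b", "c", "d", "e"})
# p == 0.75, r == 0.6, f ≈ 0.667
```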
Final thesis: Technological maturity of future energy systems - Nina Kallio
For my Master's thesis I built a methodology to assess system maturity in the energy sector. The aim was to build a framework, process, and tools for assessing emerging systems and their current technological maturity in a uniform and quantitative way.
A Critical Technology Element (CTE) is a new or novel technology that a platform or system depends on to achieve successful development or production, or to meet a system's operational threshold requirement. Technology Readiness Levels (TRLs) are a method of estimating the maturity of a program's CTEs during the acquisition process. They are determined during a Technology Readiness Assessment (TRA), which examines program concepts, technology requirements, and demonstrated technology capabilities.
Agile Product Line Engineering Literature Review - Heba Elshandidy
This is a brief presentation of the main previous work in the area of agile product line engineering (APLE). This research area focuses on methods, frameworks, and algorithms that bring agile development and software product line engineering (SPLE) together in order to make the best use of the advantages of each approach.
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class... - Feng Zhang
Defect prediction on projects with limited historical data has attracted great interest from both researchers and practitioners. Cross-project defect prediction, which reuses classifiers trained on other projects, has been the main area of progress. However, existing approaches require some degree of homogeneity (e.g., a similar distribution of metric values) between the training projects and the target project. Satisfying the homogeneity requirement often requires significant effort and is currently a very active area of research.
An unsupervised classifier does not require any training data; the heterogeneity challenge therefore disappears. In this paper, we examine two types of unsupervised classifiers: (a) distance-based classifiers (e.g., k-means); and (b) connectivity-based classifiers. While distance-based unsupervised classifiers have previously been used in the defect prediction literature with disappointing performance, connectivity-based classifiers have never been explored before in our community.
We compare the performance of unsupervised classifiers versus supervised classifiers using data from 26 projects from three publicly available datasets (i.e., AEEEM, NASA, and PROMISE). In the cross-project setting, our proposed connectivity-based classifier (via spectral clustering) ranks as one of the top classifiers among five widely-used supervised classifiers (i.e., random forest, naive Bayes, logistic regression, decision tree, and logistic model tree) and five unsupervised classifiers (i.e., k-means, partition around medoids, fuzzy C-means, neural-gas, and spectral clustering). In the within-project setting (i.e., models are built and applied on the same project), our spectral classifier ranks in the second tier, while only random forest ranks in the first tier. Hence, connectivity-based unsupervised classifiers offer a viable solution for cross and within project defect predictions.
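As a rough, library-free illustration of the connectivity-based idea (not the authors' implementation), the sketch below bipartitions items by the sign of the Fiedler vector of the graph Laplacian, found via power iteration on a shifted matrix. The toy similarity matrix, the shift constant, and the iteration budget are assumptions for the example; in the defect-prediction setting one would then label the cluster with the larger metric values as defect-prone.

```python
import random

def fiedler_split(sim):
    """Bipartition items by the sign of the Fiedler vector of the
    graph Laplacian, via power iteration (no numpy/scipy needed).

    `sim` is a symmetric n x n similarity (adjacency) matrix."""
    n = len(sim)
    degree = [sum(row) for row in sim]
    # Unnormalized Laplacian L = D - W.
    lap = [[(degree[i] if i == j else 0.0) - sim[i][j] for j in range(n)]
           for i in range(n)]
    # Shifted matrix M = c*I - L has the same eigenvectors as L with the
    # eigenvalue order reversed; c = 2*max_degree + 1 bounds L's spectrum
    # (Gershgorin), so M is positive definite.
    c = 2.0 * max(degree) + 1.0
    rng = random.Random(0)
    v = [rng.random() for _ in range(n)]
    for _ in range(500):
        # Deflate the constant vector (L's eigenvector for eigenvalue 0)
        # so iteration converges to the Fiedler vector instead.
        mean = sum(v) / n
        v = [x - mean for x in v]
        w = [c * v[i] - sum(lap[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5 or 1.0
        v = [x / norm for x in w]
    return [0 if x < 0 else 1 for x in v]

# Two tightly connected triads with weak cross-links split cleanly.
sim = [[0, 1, 1, 0.01, 0.01, 0.01],
       [1, 0, 1, 0.01, 0.01, 0.01],
       [1, 1, 0, 0.01, 0.01, 0.01],
       [0.01, 0.01, 0.01, 0, 1, 1],
       [0.01, 0.01, 0.01, 1, 0, 1],
       [0.01, 0.01, 0.01, 1, 1, 0]]
labels = fiedler_split(sim)
```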
A Bug Report Analysis and Search Tool (presentation for M.Sc. degree) - yguarata
An M.Sc. dissertation presented to the Federal University of Pernambuco in partial fulfillment of the requirements for the degree of M.Sc. in Computer Science.
Presentation of the technical foundation, Improve Foundations, which is an operational distribution of open-source components dedicated to Java development of business applications.
Learn about the Open Data Center Alliance Workgroups, Usage Models and Roadmap Structure from the perspective of the Alliance Technical Coordination Committee. This presentation was used in the Nov. 18, 2010 Alliance Webcast delivered by Howard Grodin, VP of Strategic Programs, Terremark, and Alliance Technical Coordination Committee Member, and Ravi Subramaniam, Intel Corporation, Alliance Technical Advisor.
For more information about the Open Data Center Alliance, visit www.opendatacenteralliance.org. You will also find the Webcast recording that accompanies this presentation there.
Infrastructure and Workflow for the Formal Evaluation of Semantic Search Tech... - Stuart Wrigley
This paper describes an infrastructure for the automated evaluation of semantic technologies and, in particular, semantic search technologies. For this purpose, we present an evaluation framework which follows a service-oriented approach for evaluating semantic technologies and uses the Business Process Execution Language (BPEL) to define evaluation workflows that can be executed by process engines. This framework supports a variety of evaluations from different semantic areas, including search, and is extensible to new evaluations. We show how BPEL addresses this diversity, as well as how it is used to solve specific challenges such as heterogeneity, error handling and reuse.
Presented at Data infrastructurEs for Supporting Information Retrieval Evaluation (DESIRE 2011) Workshop, Co-located with CIKM 2011, the 20th ACM Conference on Information and Knowledge Management
Friday 28th October 2011, Glasgow, UK
http://www.promise-noe.eu/events/desire-2011/
Accelrys Announces Experiment Knowledge Base (EKB) for Enterprise Lab Management - BIOVIA
Today’s complex lab environments inhibit innovation productivity, slowing time-to-market and increasing costs. Improve lab efficiency, knowledge capture and reuse with a solution that quickly transforms scientific data into knowledge. For Chemicals, Manufacturing and Materials companies.
This is a general presentation of the EU Project SCAPE, http://www.scape-project.eu from 2011. The project is about large-scale digital preservation and runs from 2011 to 2014.
With the rise of agile development and the adoption of continuous integration, the software industry has seen increasing interest in test automation. Many organizations invest in test automation but fail to reap the expected benefits, most likely due to a lack of test-automation maturity. In this talk, we present the results of a test automation maturity survey collecting responses from 151 practitioners at 101 organizations in 25 countries. We make observations regarding the state of the practice and provide a benchmark for assessing the maturity of an agile team. The benchmark resulted in a self-assessment tool for practitioners, to be released under an open source license. An alpha version is presented herein. The research underpinning the survey was conducted through the TESTOMAT project, a European project with 34 partners from six different countries.
(Presentation delivered at the Test Automation Days and the Testnet Autumn Event; October 2020)
The New Frontiers of AI in RPA with UiPath Autopilot™ - UiPathCommunity
In this free online event, organized by the Italian UiPath Community, you can explore the new features of Autopilot, the tool that brings Artificial Intelligence into the development and use of automations.
📕 Together we will look at some examples of using Autopilot in several tools of the UiPath Suite:
Autopilot for Studio Web
Autopilot for Studio
Autopilot for Apps
Clipboard AI
GenAI applied to Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
UiPath Test Automation using UiPath Test Suite series, part 4 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series, part 4. In this session, we will cover an overview of Test Manager along with the SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed in releasing software to market, together with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface of their application supply chains and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
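The talk does not define the DBOM format; purely as a hedged illustration, a minimal deployment bill of materials could be a manifest of deployed artifacts with content digests. The `make_dbom` helper, its field names, and the sample artifacts below are invented for this sketch:

```python
import hashlib
import json

def make_dbom(artifacts):
    """Build a minimal deployment bill of materials: one entry per
    deployed artifact with a SHA-256 digest of its content.

    `artifacts` maps artifact names to their raw bytes."""
    entries = [
        {"name": name, "sha256": hashlib.sha256(data).hexdigest()}
        for name, data in sorted(artifacts.items())
    ]
    return json.dumps({"dbom_version": "0.1", "artifacts": entries},
                      indent=2)

# Illustrative deployment of two artifacts.
manifest = make_dbom({"app.jar": b"binary-contents",
                      "config.yaml": b"replicas: 3\n"})
```

Capturing such a manifest at deploy time lets later audits verify exactly which artifact versions reached production.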
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Essentials of Automations: The Art of Triggers and Actions in FME - Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf - Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to ops, infra, and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on.
Climate Impact of Software Testing at Nordic Testing Days - Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint: a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Presentation of the SEALS project
1. SEALS (Semantic Evaluation At Large Scale)
http://www.seals-project.eu/
Contact person: Asunción Gómez Pérez <asun@fi.upm.es>
EC contribution: 3.500.000 €
Duration: June 2009 - May 2012
Partners: Universidad Politécnica de Madrid, Spain (Coordinator); University of Sheffield, UK; University of Mannheim, Germany; Forschungszentrum Informatik, Germany; University of Zurich, Switzerland; University of Innsbruck, Austria; STI International, Austria; Institut National de Recherche en Informatique et en Automatique, France; Open University, UK; Oxford University, UK
2. Motivation
[Diagram: a set of semantic tools (T1-T7) annotated with today's problems: not scalable, execution problems, unstable, not interoperable, and incomparable performance claims such as a maximum ontology size of 50 or 80 triples per millisecond.]
3. SEALS Objectives
The SEALS Platform
•A lasting reference infrastructure for semantic technology evaluation
•The evaluations to be executed on-demand at the SEALS Platform
The SEALS Evaluation Campaigns
•Two public evaluation campaigns including the best-in-class semantic technologies:
– Ontology engineering tools
– Ontology storage and reasoning systems
– Ontology matching tools
– Semantic search tools
– Semantic Web Service tools
•Semantic technology roadmaps
The SEALS Community Service
•Activities around the evaluation of semantic technologies
4. The SEALS Platform
Provides the infrastructure for evaluating semantic technologies
• Open (everybody can use it)
• Scalable (to users, data size)
• Extensible (to more tests, different technology, more measures)
• Sustainable (beyond SEALS)
• Independent (unbiased)
• Repeatable (evaluations can be reproduced)
A platform for remote evaluation of semantic technology:
• Ontology engineering tools
• Storage systems and reasoners
• Ontology matching
• Semantic search
• Semantic web services
According to criteria:
• Interoperability
• Scalability
• Specific measures (e.g., completeness of query answers, matching precision)
5. Overall SEALS Platform Architecture
[Diagram: evaluation organisers, technology providers and technology adopters interact through the SEALS Portal. Evaluation requests and entity management requests go to the SEALS Service Manager, which drives the Runtime Evaluation Service (software agents, i.e., technology evaluators) and the SEALS Repositories: the Test Data Repository Service, Tools Repository Service, Results Repository Service and Evaluation Descriptions Repository Service.]
6. Project overview
Networking Activities:
• WP1: Project Management (UPM)
• WP2: Dissemination, Community Building and Sustainability (STI2)
• WP3: Evaluation Campaigns and Semantic Technology Roadmaps (USFD)
Service Activities (SEALS Platform):
• WP4: SEALS Service Manager (UPM)
• WP5: Test Data Repository Service (UIBK)
• WP6: Tools Repository Service (UIBK)
• WP7: Results Repository Service (UIBK)
• WP8: Evaluations Repository Service (FZI)
• WP9: Runtime Evaluation Service (UPM)
Joint Research Activities:
• WP10: Ontology Engineering Tools (FZI)
• WP11: Storage and Reasoning Systems (OXF)
• WP12: Matching Tools (INRIA)
• WP13: Semantic Search Tools (USFD)
• WP14: Semantic Web Service Tools (OU)
7. Two-phase action plan
[Diagram: two 18-month phases of the service activities. In each phase, the SEALS Platform services run evaluations; evaluation results feed the SEALS technology roadmaps, and new requirements feed back into the platform for the next phase.]
8. Project overview
(Work-package overview repeated from slide 6.)
9. The SEALS Platform in the evaluation campaigns
[Diagram: technology developers and technology users interact with the Runtime Evaluation Service, coordinated by the evaluation organisers through the SEALS Service Manager, which connects to the Evaluation, Test Data, Tool and Result repository services.]
10. The SEALS entities
Entities: tools, test data, evaluation descriptions, evaluation results.
• Entities are described using:
– Data (the entity itself)
– Metadata (that describes the entity)
• Machine-interpretable descriptions of evaluations
– Using BPEL
11. The SEALS ontologies
• Describe:
– Evaluations
• + all relevant information
– Evaluation campaigns
• Reused existing ontologies (e.g., Dublin Core, FOAF, VCard)
http://www.seals-project.eu/ontologies/
14.09.2010
12. Project overview
(Work-package overview repeated from slide 6.)
13. Ontology Engineering Tools
• Goal: To evaluate the ontology management capabilities of ontology engineering tools
– Ontology editors
• Protégé
• NeOn Toolkit
• (your tool here)
– Ontology management frameworks and APIs
• Jena
• Sesame
• OWL API
• (your tool here)
14. Ontology Engineering Tools
• Evaluation services for:
– Conformance
– Interoperability
– Scalability
• Test data:
– RDF(S) Import Test Suite
– OWL Lite Import Test Suite
– OWL DL Import Test Suite
– OWL Full Import Test Suite
– Scalability test data
15. Storage and Reasoning Systems
• Goals: Evaluating the interoperability and performance of description logic based systems (DLBSs)
• Standard reasoning services
– Classification
– Class satisfiability
– Ontology satisfiability
– Logical entailment
16. Storage and Reasoning Systems
• Test Data
– Gardiner evaluation suite (300 ontologies)
– [Wang06] ontologies suite (600 ontologies)
– Various versions of the GALEN ontology
– Ontologies created in EU funded projects: SEMINTEC, VICODI, AEO, ...
– ABox generator [Stoilos10]
17. Storage and Reasoning Systems
• Evaluation Criteria
– Interoperability
– Performance
• Metrics
– The number of tests passed without I/O errors
– Time
• Tools
– HermiT, Pellet, FaCT++, Racer Pro, CEL, CB, …
18. Matching Tools
• Goals:
– To evaluate the competence of matching systems with respect to different evaluation criteria.
– To demonstrate the feasibility and benefits of automating matching evaluation.
• Criteria:
– Conformance
• standard precision and recall
• restricted semantic precision and recall
• alignment coherence
– Efficiency
• runtime
• memory consumption
19. Matching Tools
Data sets: three subsets from OAEI
• Anatomy: matching the Adult Mouse Anatomy (2744 classes) and the NCI Thesaurus (3304 classes) describing the human anatomy.
• Benchmark: the goal is to identify the areas in which each matching algorithm is strong or weak. One particular ontology of the bibliography domain is compared with a number of alternative ontologies on the same domain.
• Conference: a collection of conference organization ontologies. The goal is to materialize, in alignments, aggregated statistical observations and/or implicit design patterns.
20. Matching Tools
• Evaluation criteria and metrics
– Conformance
• standard precision and recall
• restricted semantic precision and recall
• alignment coherence
– Efficiency
• runtime
• memory consumption
• Tools
– ASMOV
– Aroma
– Falcon-AO
– Lily
– SAMBO
21. Matching Tools
Scenario 1
Test data: Benchmark
Criteria: conformance with expected results
Scenario 2
Test data: Anatomy
Criteria: conformance with expected results, efficiency in terms of memory
consumption and execution time
Scenario 3
Test data: Conference
Criteria: conformance with expected results and coherence
22. Semantic Search
• Goals
– Benchmark effectiveness of search tools
– Emphasis on tool usability, since search is an
inherently user-centered activity.
– Still interested in automated evaluation for other
aspects
– Two phase approach:
• Automated evaluation: runs on SEALS Platform
• User-in-the-loop: human experiment
23. Semantic Search: Data
• User-in-the-loop: Mooney
– Pre-existing dataset
– Extended the question set with unseen questions and a number of more
'complex' questions.
– Well suited to human-based experiments: easy to understand domain
• Automated: EvoOnt
– Bespoke dataset
– 5 different sizes (1k, 10k, 100k, 1M, 10M triples)
– Well suited to automated experiments: range of sizes and questions can be of
arbitrary complexity
• Each of the 6 data sets (1 Mooney, 5 EvoOnt) has a set of natural
language questions and associated groundtruths
25. Semantic Search
• Metrics
– Core metrics: precision, recall and f-measure of the triples
returned for each query.
– Other metrics:
• tool performance metrics (e.g., memory usage, CPU load, etc)
• user-centric metrics (e.g., time to obtain the final answer)
• System Usability Scale (SUS) questionnaire
– Also collect demographic information to correlate with
metrics
• Tools
– K-Search (produced by K-Now, a Sheffield spin-out
company)
– Ginseng (Zurich)
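The System Usability Scale score can be computed with the standard formula: ten 1-5 responses, odd-numbered items contributing (response - 1), even-numbered items (5 - response), with the sum scaled to 0-100. A minimal sketch:

```python
def sus_score(responses):
    """System Usability Scale score from ten 1-5 Likert responses.
    Odd-numbered items contribute (response - 1), even-numbered
    items (5 - response); the sum is scaled to the 0-100 range."""
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# All-neutral answers (all 3s) yield the midpoint score of 50.0
midpoint = sus_score([3] * 10)
```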
26. Semantic Web Services
• Goal: To evaluate Semantic Web Service
discovery
• Test data:
– OWLS Test Collection (OWLS-TC)
– SAWSDL Test Collection (SAWSDL-TC)
– Seekda Services
– OPOSSum Services
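Discovery results over such collections are commonly compared by degree of match (exact, plug-in, subsumes, fail, in the usual matchmaker terminology). A sketch of ranking candidates by degree; the service names and data are invented for illustration.

```python
# Degrees of match commonly used in semantic service discovery,
# ordered from best (0) to worst (3). The candidate services
# below are invented for illustration.
DEGREES = {"exact": 0, "plugin": 1, "subsumes": 2, "fail": 3}

def rank(candidates):
    """candidates: list of (service_name, degree) pairs,
    returned sorted from best to worst degree of match."""
    return sorted(candidates, key=lambda sd: DEGREES[sd[1]])

hits = [("BookFlight2", "subsumes"), ("BookFlight1", "exact"),
        ("BookTrip", "plugin")]
best = rank(hits)[0][0]   # "BookFlight1"
```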
27. Project overview
• Networking Activities
– WP1: Project Management (UPM)
– WP2: Dissemination, Community Building and Sustainability (STI2)
– WP3: Evaluation Campaigns and Semantic Technology Roadmaps (USFD)
• Service Activities (SEALS Platform)
– WP4: SEALS Service Manager (UPM)
– WP5: Test Data Repository Service (UIBK)
– WP6: Tools Repository Service (UIBK)
– WP7: Results Repository Service (UIBK)
– WP8: Evaluations Repository Service (FZI)
– WP9: Runtime Evaluation Service (UPM)
• Joint Research Activities
– WP10: Ontology Engineering Tools (FZI)
– WP11: Storage and Reasoning Systems (OXF)
– WP12: Matching Tools (INRIA)
– WP13: Semantic Search Tools (USFD)
– WP14: Semantic Web Service Tools (OU)
30. Dissemination activities
• Portal
• Evaluation campaigns each have their own section
• News pages with RSS for announcements & updates
• Events list
• Next Events
• IWEST Workshop at ISWC, November 2010
• Campaign events at ISWC: Ontology Matching
workshop and S3 Semantic Service Selection
• EKAW2010 sponsorship of best paper
• Previous Events
• ESWC2010 sponsorship, tutorial, News from the
Front, material distribution
• AAAI2010 Outstanding Paper
31. Community Building
• Registration form on the portal
– Community area provides a Web interface to
SEALS portal functionalities, such as registering
and uploading a tool, accessing evaluation results
• SEALS Community launched in summer 2010
– >100 persons from research & industry
• Provider Involvement Program
– Invites to be sent out directly by the campaigns to
tool vendors (116 tools from 100 vendors have
been identified by the campaigns to date)
33. Community participation in the 1st phase
• November 2009: the community provides requirements
• May 2010: definition of the evaluations and test data; the
community comments on the evaluations and test data
• July 2010: launch of the 1st Evaluation Campaign and first
release of the SEALS Platform; join the Evaluation Campaign
and run your own evaluations
• November 2010: results of the 1st Evaluation Campaign; see
and discuss the Evaluation Campaign results
34. Conclusions
• We will provide (1st prototype in 2010):
– Evaluation services and datasets for the evaluation of semantic technologies:
Ontology engineering tools
Storage and reasoning systems
Matching tools
Semantic search tools
Semantic Web Service tools
• Benefits for:
– Researchers. Validate their research and compare with others
– Developers. Evaluate their tools, compare with others and monitor
– Providers. Verify and show that their tools work, increase visibility
– Users. Select tools or sets of tools from among alternatives
• We ask for:
– Semantic technologies using the SEALS Platform for their evaluations
– Semantic technologies participating in the evaluation campaigns
35. Semantic technology roadmaps
[Diagram: a semantic technology roadmap derived from evaluation
results. Tools T1-T7 are positioned by measured properties such as
maximum ontology size, triples per millisecond, scalability,
stability, interoperability, and execution problems.]
36. SEALS provides evaluation
services to the community!
Contact:
Coordinator:
Asunción Gómez Pérez <asun@fi.upm.es>
SEALS Community Portal:
http://www.seals-project.eu/
SEALS Evaluation Campaigns:
http://www.seals-project.eu/seals-evaluation-campaigns