The document discusses the development of K-HAL, an AI assistant, through three versions. K-HAL V1.0 used a simple ontology and knowledge base to represent spacecraft piloting knowledge. K-HAL V2.0 leveraged existing online ontologies and linked data to expand its knowledge. It also used crowdsourcing to add new facts. K-HAL V3.0 would represent processes by modeling interactions between a virtual choir, conductor, and listeners. The conclusion advocates reusing and sharing ontologies and data to benefit the semantic web community.
Approximation and Self-Organisation on the Web of Data (Kathrin Dentler)
This document discusses using computational intelligence techniques like evolutionary computing and collective intelligence to handle challenges posed by the growing Web of Data. It describes how these techniques can provide adaptive, scalable, and robust approaches to tasks like ontology mapping, query answering, and reasoning. Evolutionary computing is proposed for optimization problems, while collective intelligence approaches may enable emergent behaviors from decentralized data flows and reasoning. While computational intelligence loses precision, it gains properties like adaptation, simplicity, scalability, and interactive behavior that are well-suited to the dynamic, distributed nature of the Web of Data.
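For a flavour of the evolutionary side, a generic (1+1) evolutionary algorithm can be sketched in a few lines. This is a textbook scheme shown purely for illustration, not code from the talk; the OneMax fitness function and all parameter values are my own choices:

```python
import random

# Generic (1+1) evolutionary algorithm sketch: mutate a bit string
# (each bit flips with probability 1/length) and keep the mutant if it
# is at least as fit. Illustrates the approximate, anytime flavour of
# evolutionary computing; not tied to any specific ontology-mapping system.

def one_plus_one_ea(fitness, length=20, generations=3000, seed=0):
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(length)]
    for _ in range(generations):
        # Mutation: flip each bit independently with probability 1/length.
        child = [bit ^ (rng.random() < 1.0 / length) for bit in parent]
        # Selection: fitness never decreases, so any answer is usable.
        if fitness(child) >= fitness(parent):
            parent = child
    return parent
```

With `fitness=sum` (the classic OneMax problem, maximising the number of ones) the loop climbs towards the all-ones string; stopping it early simply yields a coarser approximation, which is exactly the precision-for-adaptivity trade-off the talk describes.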
The document discusses processing linked data at high speeds using the Signal/Collect graph algorithm framework. It provides examples of how Signal/Collect can be used to perform tasks like RDFS subclass inference and PageRank calculation on semantic graphs. It also summarizes performance results showing that TripleRush, an implementation of Signal/Collect, outperforms other graph processing systems on benchmark datasets. Finally, it discusses ongoing work on graph partitioning with TripleRush.
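The vertex-centric model behind Signal/Collect can be sketched as follows. This is an illustrative toy in Python, not the actual Signal/Collect API (which is a Scala framework with asynchronous scheduling); the example runs PageRank, one of the tasks mentioned above:

```python
# Toy signal/collect-style PageRank. In each iteration, every vertex
# "signals" its rank share along outgoing edges, then "collects" the
# incoming signals into a new rank. Sketch only; the real framework
# schedules signal and collect operations per vertex, asynchronously.

def pagerank(edges, damping=0.85, iterations=30):
    vertices = {v for e in edges for v in e}
    out_degree = {v: 0 for v in vertices}
    for src, _ in edges:
        out_degree[src] += 1
    rank = {v: 1.0 / len(vertices) for v in vertices}
    for _ in range(iterations):
        # Signal phase: each vertex sends rank / out_degree along its edges.
        signals = {v: [] for v in vertices}
        for src, dst in edges:
            signals[dst].append(rank[src] / out_degree[src])
        # Collect phase: each vertex combines its incoming signals.
        rank = {v: (1 - damping) / len(vertices) + damping * sum(signals[v])
                for v in vertices}
    return rank
```

RDFS subclass inference fits the same mould: a vertex signals its known superclasses to its subclasses, and each vertex collects the union of what arrives.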
Keep fit (a bit) - ESWC SSchool 14 - Student project (eswcsummerschool)
The document presents a web-based project called "Keep Fit(a Bit) in Kalamaki" which aims to make Kalamaki, Greece a smart city by developing a personalized health planner. The planner integrates data on restaurants, dishes, energy/calorie content, prices, and walking distances to provide personalized recommendations to help users like Fred stay fit on holidays in Kalamaki. The project team collected data from various sources and implemented a prototype interface that allows users to view personalized recommendations. Future steps include publishing more Kalamaki data, adding social features, and integrating additional health and weather data.
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web (eswcsummerschool)
This document discusses programming with semantic web data and Linked Open Data. It introduces SchemEX, a schema-level index for Linked Open Data that uses type clusters and bi-simulations to efficiently construct an index of the schema. It also discusses an application called LODatio that extends this index to support active user assistance for SPARQL queries, such as providing related queries, result snippets and references to relevant data sources. Finally, it introduces LiteQ, a language for integrating RDF types and queries into programming languages to allow exploring, programming and typing with semantic web data.
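The notion of a type cluster can be made concrete with a small sketch. This is my own simplification for illustration: subjects are grouped by the exact set of their rdf:type values, whereas SchemEX additionally applies bisimulation over outgoing properties:

```python
# Minimal sketch of schema-level "type clusters": group subjects by the
# exact set of their rdf:type objects. Illustrative simplification only;
# SchemEX refines these clusters further via bisimulation.

RDF_TYPE = "rdf:type"

def type_clusters(triples):
    # Collect the set of types per subject.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    # Subjects with identical type sets fall into the same cluster.
    clusters = {}
    for subject, type_set in types.items():
        clusters.setdefault(frozenset(type_set), set()).add(subject)
    return clusters
```

An index keyed on such clusters can answer "which sources contain instances of this type combination?" without touching instance data, which is the kind of lookup LODatio builds on.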
SyrtAPI is a new entertainment platform that combines music and book data from multiple sources like MusicBrainz and NYTimes reviews. It uses Linked Data and SPARQL queries to extract lyrics and reviews and recommend songs that match the content of the books. The team learned about using Linked Data vocabularies and linking datasets while building the prototype, which currently retrieves 25 lyrics and 10 book reviews through its pipeline. Future work includes adding more data sources, developing a mobile app, and using natural language processing to better analyze texts.
Usage of Linked Data: Introduction and Application Scenarios (EUCLID project)
This presentation introduces the main principles of Linked Data, the underlying technologies and background standards. It provides basic knowledge for how data can be published over the Web, how it can be queried, and what are the possible use cases and benefits. As an example, we use the development of a music portal (based on the MusicBrainz dataset), which facilitates access to a wide range of information and multimedia resources relating to music.
This presentation gives details on technologies and approaches towards exploiting Linked Data by building LD applications. In particular, it gives an overview of popular existing applications and introduces the main technologies that support implementation and development. Furthermore, it illustrates how data exposed through common Web APIs can be integrated with Linked Data in order to create mashups.
In this webinar Lorenz Bühmann presents the ontology repair and enrichment tool ORE, as well as DL-Learner, a machine learning tool that solves supervised learning tasks and supports knowledge engineers in constructing knowledge. These two closely related tools in the LOD2 Stack are used for classification and the subsequent quality analysis of Linked Data.
The document discusses challenges with querying decentralized knowledge graphs. It begins by defining data and knowledge, and how combining them provides meaning. It then discusses how public SPARQL endpoints for knowledge graphs like DBpedia and Wikidata often do not allow querying the full graph and returning complete results due to performance and scalability issues. Federated query engines have been developed to query multiple knowledge graphs at once, but still face challenges with complex queries involving property paths or aggregations over large datasets.
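One reason property paths are hard at scale: a pattern like `?x knows+ ?y` amounts to computing a transitive closure. A minimal sketch of evaluating a `p+` path (one or more hops over predicate `p`) from a fixed start node over an in-memory graph, purely my own illustration and unrelated to any particular engine:

```python
from collections import deque

# Evaluate a SPARQL-style property path p+ from a fixed start node by
# breadth-first search over one predicate's edges. Illustration only;
# a federated engine must do this across multiple remote endpoints,
# which is where the scalability problems described above arise.

def path_plus(triples, start, predicate):
    adjacency = {}
    for s, p, o in triples:
        if p == predicate:
            adjacency.setdefault(s, []).append(o)
    reachable, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in reachable:
                reachable.add(nxt)
                queue.append(nxt)
    return reachable
```

Even this toy version must touch an unbounded number of nodes per query, which is why public endpoints often time out or truncate results on such queries.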
The document summarizes a talk given by Dr. Johannes Keizer on the CIARD (Coherence in Information for Agricultural Research for Development) initiative and a global infrastructure for linked open data (LOD). The CIARD initiative aims to provide open access to agricultural research by promoting standards and sharing information. It involves institutions contributing their research outputs through the CIARD RING and adopting standards. The proposed infrastructure comprises distributed repositories linked through vocabularies and LOD. Tools are being developed to generate LOD and link datasets through shared concepts.
This is one of a series of presentations which I gave during a recent trip to the United States. I will make them all public, although the content does not vary much between some of them.
Calit2: An Experiment in Social Networks (Larry Smarr)
06.08.16
Invited Talk
Conversation on Social Networks, Social Movements
Third Annual Seminar in Experimental Critical Theory
University of California Humanities Research Institute, UCI
Title: Calit2: An Experiment in Social Networks
Irvine, CA
FAIR Workflows: A step closer to the Scientific Paper of the Future (dgarijo)
Keynote presented at the Computational and Autonomous Workflows workshop (CAW-2021) at the Oak Ridge National Laboratory. The keynote gives an overview of the different aspects to take into account when aiming to create FAIR workflows and associated resources.
Calit2 - a Persistent UCSD/UCI Framework for Collaboration (Larry Smarr)
05.02.16
Invited Talk
Sun Microsystems Global Education and Research
Conference 2005
Title: Calit2-a Persistent UCSD/UCI Framework for Collaboration
San Francisco, CA
Talk at the 3rd Keystone Training School - Keyword Search in Big Linked Data - Institute for Software Technology and Interactive Systems, TU Wien, Austria, 2017
Digital Science: Reproducibility and Visibility in Astronomy (Jose Enrique Ruiz)
The science done in Astronomy is digital science, from observing proposals to final publication, to the data and software used: every element and action involved in scientific output can be recorded in electronic form. Even so, the final outcome of an experiment is still difficult to reproduce: the procedure can be long and tedious, and it may not be easily accessible or understandable, even to the author. At the same time, we have a rich infrastructure of files, observational data and publications, which could be used more efficiently if we achieved greater visibility of scientific production, avoiding duplication of effort and reinvention.
Reproducibility is a cornerstone of the scientific method, and extracting relevant information from the current and future data flood is key in Astronomy. The AMIGA group (Analysis of the interstellar Medium of Isolated GAlaxies, IAA-CSIC, http://amiga.iaa.es) faces these two challenges in the European project "Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science", which enables the preservation of methodology in scalable semantic repositories to facilitate its discovery, access, inspection, exploitation and distribution. These repositories store experiments as "Research Objects" whose main constituents are digital scientific workflows. These provide a comprehensive view and clear scientific interpretation of the experiment, as well as automation of the method, going beyond the usual pipelines that normally end at data processing.
The quantitative leap in volume and complexity of the next generation of archives will require analysis and data-mining tasks to live closer to the data, in distributed computing and storage environments; at the same time, these tasks should be modular enough to allow customization by scientists and be easily accessible, to foster their dissemination in the community. Astronomy is a collaborative science, but it has also become highly specialized, like many other disciplines. Sharing, preservation, discovery and much simpler access to resources in the composition of scientific workflows will enable astronomers to benefit greatly from each other's highly specialized know-how; workflows are a way to push Astronomy to share and publish not only results and data, but also processes and methodologies.
We will show how the use of scientific workflows can help to improve the reproducibility of the experiment and a more efficient exploitation of astronomical archives, as well as the visibility of the scientific methodology and its reuse.
Digital Infrastructure: Storage and Content Management (Noreen Whysel)
Discusses analogies between the rise of the electric power grid and the Internet. Describes storage capacity issues and requirements for digital repositories. Reviews different repository platforms specific to archival and digital collection management. Has a really cool picture of Burden's Wheel.
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools... (Artefactual Systems - AtoM)
These slides accompanied a June 4th, 2016 presentation made by Dan Gillean of Artefactual Systems at the Association of Canadian Archivists' 2016 Conference in Montreal, QC, Canada.
This presentation aims to examine several existing or emerging computing paradigms, with specific examples, to imagine how they might inform next-generation archival systems to support digital preservation, description, and access. Topics covered include:
- Distributed Version Control and git
- P2P architectures and the BitTorrent protocol
- Linked Open Data and RDF
- Blockchain technology
The session is part of an attempt by the ACA to create interactive "working sessions" at its conferences. Accompanying notes can be found at: http://bit.ly/tech-Proche
Participants were also asked to use the Twitter hashtag of #techProche for online interaction during the session.
All Things Open 2014 - Day 1
Wednesday, October 22nd, 2014
Arfon Smith
Chief Scientist for GitHub
Open Government/Open Data
What Academia Can Learn from Open Source
Find more by Arfon here: https://speakerdeck.com/arfon
Gridforum David De Roure New e-Science 20080402 (vrij)
The document discusses the evolution of e-Science and how it enables new forms of collaborative research. Key points include:
- e-Science has progressed from specialized teams doing "heroic science" to everyday researchers conducting routine research using ubiquitous digital tools and data sharing.
- Web 2.0 technologies and approaches like open data, workflows, and social networking are empowering researchers and supporting new types of collaborative, data-driven science.
- Future e-Science relies on making these technologies simple and accessible to researchers from all domains to further break down barriers to collaborative, data-centric research.
Finding Emerging Topics Using Chaos and Community Detection in Social Media G... (Paragon_Science_Inc)
In this talk, we describe our recent work in the analysis of Twitter-based network graphs, including the Ebola crisis in 2014 and the stock market in 2015.
Knowledge graph construction with a façade - The SPARQL Anything Project (Enrico Daga)
The document discusses a project called "SPARQL Anything" which aims to simplify knowledge graph construction by using SPARQL as the single language for representing and transforming diverse data formats into RDF. It presents an approach called "Facade-X" which defines a common RDF structure that can be applied over different formats like CSV, JSON, HTML, etc. This facade focuses on the RDF meta-model and aims to apply minimal ontological commitments. The document outlines how Facade-X can be used to represent different formats and provides examples of using SPARQL to transform sample data into RDF without committing to a specific domain ontology.
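A very rough sketch of the facade idea, using my own toy flattening rather than the actual Facade-X specification: any JSON-like value can be turned into generic container/slot triples with no domain ontology involved (the real project defines RDF containers with rdf:_1, rdf:_2, ... slots and typed literals):

```python
# Toy "facade" flattening: map any JSON-like value to generic
# (container, slot, value) triples, with minimal ontological commitment.
# Rough analogy to Facade-X only; node naming here is invented.

def to_triples(value, node="root"):
    triples = []
    if isinstance(value, dict):
        # Each key becomes a slot pointing to a child container.
        for key, child in value.items():
            child_node = f"{node}/{key}"
            triples.append((node, key, child_node))
            triples.extend(to_triples(child, child_node))
    elif isinstance(value, list):
        # List positions become numbered slots, as in RDF containers.
        for i, child in enumerate(value, start=1):
            child_node = f"{node}/_{i}"
            triples.append((node, f"_{i}", child_node))
            triples.extend(to_triples(child, child_node))
    else:
        # Leaves become plain literal values.
        triples.append((node, "value", value))
    return triples
```

Because the same flattening applies to CSV rows, JSON objects or HTML trees alike, a single SPARQL query can then reshape the generic triples into a domain ontology, which is the project's central point.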
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati... (Mark Wilkinson)
This slide deck accompanies the manuscript "Interoperability and FAIRness through a novel combination of Web technologies", submitted to PeerJ Computer Science: https://doi.org/10.7287/peerj.preprints.2522v1
It describes the output of the "Skunkworks" FAIR implementation group, who were tasked with building a prototype infrastructure that would fulfill the FAIR Principles for scholarly data publishing. We show how a novel combination of the Linked Data Platform, RDF Mapping Language (RML) and Triple Pattern Fragments (TPF) can be combined to create a scholarly publishing infrastructure that is markedly interoperable, at both the metadata and the data level.
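The core operation of Triple Pattern Fragments is serving the matches for a single triple pattern at a time. A toy in-memory matcher conveys the idea (my sketch only; the real TPF interface additionally provides paging, counts and hypermedia controls):

```python
# Toy single-triple-pattern matcher: the basic operation a Triple
# Pattern Fragments server exposes. None/s/p/o play the role of
# variables vs. bound terms. Sketch only; not the actual TPF interface.

def match_pattern(triples, s=None, p=None, o=None):
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]
```

Restricting the server to this one cheap operation, and letting clients combine fragments into full query answers, is what makes the interface inexpensive enough for reliable public hosting.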
This slide deck (or something close) will be presented at the Dutch Techcenter for Life Sciences Partners Workshop, November 4, 2016.
Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R
Digital Identity is fundamental to collaboration in bioinformatics research and development because it enables attribution, contribution, and publication to be recorded and quantified.
However, current models of identity are often obsolete and have problems capturing both small contributions ("micro-attribution") and large contributions ("mega-attribution") in science. Without adequate identity mechanisms, the incentive for collaboration can be reduced and the utility of collaborative social tools hindered.
Using examples of metabolic pathway analysis with the Taverna workbench and myexperiment.org, this talk will illustrate problems and solutions for identifying scientists accurately and effectively in collaborative bioinformatics networks on the Web.
Integration of oreChem with the eCrystals repository for crystal structures (Mark Borkum)
This document discusses integrating the oreChem ontology for representing scientific experiments and their provenance with the eCrystals repository for crystallography data. It describes how eCrystals represents crystallography experiments and data, the motivation for open access to research, and the oreChem ontology's concepts for representing methodologies, enactments, and provenance. It then outlines a proposed plugin for eCrystals that would map eCrystals data to oreChem concepts to capture the provenance of experiments and link data to the methods used.
The document describes a semantic recommendation system for helping customers select fish for an aquarium. The system takes into account various criteria like temperature, predator/prey relationships between fish, food requirements, ecosystem needs, size, and color preferences. It integrates data from multiple sources and uses semantic technologies like ontologies and linked data to make personalized recommendations based on a user's needs and preferences. The system aims to connect people interested in fish keeping through a social network application.
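The kind of multi-criteria filtering described could look like the following sketch. The data model here is purely hypothetical (the field names `temp_range` and `preys_on` are my invention); the actual system draws these facts from ontologies and linked data sources:

```python
# Hypothetical sketch of multi-criteria compatibility filtering for an
# aquarium: keep only candidate fish whose temperature range covers the
# tank's temperature and that do not prey on any fish already kept.
# Field names (temp_range, preys_on) are invented for illustration.

def compatible(candidates, tank_temp, current_fish):
    result = []
    for fish in candidates:
        lo, hi = fish["temp_range"]
        if not (lo <= tank_temp <= hi):
            continue  # temperature criterion fails
        if any(prey in fish["preys_on"] for prey in current_fish):
            continue  # predator/prey criterion fails
        result.append(fish["name"])
    return result
```

Further criteria from the description (food requirements, ecosystem needs, size, colour preferences) would slot in as additional filter clauses in the same loop.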
This document discusses the creation of an Arabic sentiment lexicon and finding related entities from Arabic text. It involves processing Arabic financial text data by tagging parts of speech, removing stop words, translating verbs and adjectives to English using Google Translate, stemming the words, and using an existing English sentiment lexicon like SentiWordNet to assign positive and negative sentiment scores. Related entities are extracted using a window-based approach to find nouns occurring near sentiment words. The process aims to create an Arabic sentiment lexicon and identify related entities to help with sentiment analysis on Arabic text.
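The window-based step described above can be sketched as follows. This is a simplified illustration of the general technique; the actual pipeline runs on POS-tagged Arabic text with stop words already removed:

```python
# Sketch of window-based related-entity extraction: collect the nouns
# that occur within +/- `window` tokens of any sentiment-bearing word.
# Simplified illustration of the described approach; in the real
# pipeline, tokens are POS-tagged Arabic words after stop-word removal.

def related_entities(tokens, sentiment_words, nouns, window=3):
    entities = set()
    for i, token in enumerate(tokens):
        if token in sentiment_words:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for neighbour in tokens[lo:hi]:
                if neighbour in nouns:
                    entities.add(neighbour)
    return entities
```

Pairing each extracted noun with the polarity score of the nearby sentiment word then yields the entity-level sentiment signal the lexicon is built for.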
FIT-8BIT: An activity music assistant - ESWC SSchool 14 - Student project (eswcsummerschool)
The document discusses the advantages of music in sports. It outlines five key ways music can influence preparation and performance: 1) dissociation to lower effort perception, 2) arousal regulation as a stimulant or sedative, 3) synchronization with exercise for increased output, 4) positive impact on acquiring motor skills, and 5) attainment of flow state. It also discusses links between sport and music, defining tempo rhythms, and providing scenarios for a music application interface and workflow.
Personal Tours at the British Museum - ESWC SSchool 14 - Student project (eswcsummerschool)
This document discusses creating personal tours at the British Museum using a mobile app. It would allow visitors to choose a starting point and then be suggested next artifacts to view based on their interests, time constraints, and what they liked or disliked. Challenges included data issues and changing requirements, but enriching descriptions, collecting visitor analytics, and adding game elements could improve the experience.
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro... (eswcsummerschool)
This document describes a project to create a visualization of current fish population and fishing legislation around the world. The project, called PYTHEIA, will provide information to fishing businesses to help them choose suitable new locations by linking data on fish populations, laws, and management from various sources. It outlines the user scenario, workflow, ontology developed to represent the data, and plans for the user interface and enhancing the system in the future.
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014 (eswcsummerschool)
This document discusses combining the social web and semantic web through crowdsourcing. It defines key concepts like the social web, crowdsourcing, and semantic technologies. It then provides examples of how semantic tasks can be crowdsourced, such as annotating research papers, mapping topics to ontologies, and curating linked data. Challenges with crowdsourcing semantic tasks are also explored, such as how to optimally structure tasks and validate crowd responses.
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014 (eswcsummerschool)
This document discusses tools and techniques for monitoring global media data and events. It introduces several systems developed at the Jozef Stefan Institute for collecting news articles from around the world, enriching documents with semantic annotations, linking information across languages, and analyzing news reporting bias. It also addresses representing events with structured and semantic descriptions and tracking how topics evolve over time through an event registry system. The overall goal is to establish an integrated real-time pipeline for processing multilingual media, identifying events, and providing visualization of global event dynamics.
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014 (eswcsummerschool)
This document provides an overview of Amazon Mechanical Turk (MTurk) and how it can be used for crowdsourcing projects. It discusses key MTurk concepts like requesters, workers, HITs, assignments, and qualifications. It then walks through the steps to create an MTurk project, including defining the HIT properties, previewing templates, creating batches, publishing HITs, and reviewing results. Finally, it discusses best practices like testing HITs in the sandbox environment and monitoring worker forums.
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC... (eswcsummerschool)
This document describes querying a marine data warehouse using SPARQL. It discusses the MarineTLO ontology used to integrate data about marine species from multiple sources. Examples are provided of SPARQL queries against the MarineTLO warehouse to retrieve information about species, their distributions, relationships and more. A series of 21 example queries is also listed, demonstrating different ways of interrogating the semantic data in the warehouse.
This document discusses different data formats for representing cultural data on the web and their pros and cons, including CSV, RDBMS, XML/SOAP, and JSON/REST. It advocates for using URIs, HTTP, and semantic web standards like RDF and SPARQL to represent cultural data in a way that is distributed, extensible, and links related resources on the web.
The document outlines the schedule and activities for a summer school on semantic web technologies. The summer school will include tutorials on topics such as linked data, ontologies, and data publishing/preservation. Students will work in groups on mini-projects with guidance from tutors. There will be keynote speakers each day and social events planned. The goal is for students to learn practical skills through hands-on experience while interacting with peers and experts in the field.
This document discusses querying cultural heritage data stored as graphs using SPARQL. It provides examples of retrieving single and sets of triples from the graph and explains how a SPARQL server can perform additional reasoning. Exercises demonstrate querying for object owners and their names, exporting query results to CSV, and counting objects made of different materials.
This document outlines the goals and instructions for a hands-on session to publish a dataset as linked data. The session will divide participants into three groups to work on creating, interlinking, and publishing the RDF dataset. Each group will have 40 minutes to select vocabularies, design URIs, transform tabular data into RDF, select target datasets to link to, create metadata using VoID, and select a license. Each group will then present its work in 1 minute, without slides.
This document discusses knowledge engineering and the use of knowledge on the web. It covers web data representation using standards like RDF, HTML5 and SKOS. It discusses categorizing knowledge from different sources and aligning categories. It also discusses using knowledge through techniques like visualization, graph-based search across linked data, and improving search through vocabulary alignment and location-based queries.
This document provides an overview of querying linked data using SPARQL. It begins with an introduction and motivation for querying linked data. It then covers the basics of SPARQL including its components like prefixes, query forms, and solution modifiers. Several examples are provided demonstrating how to construct ASK, SELECT, and other types of SPARQL queries. The document also discusses SPARQL algebra and updating linked data with SPARQL 1.1.
This document provides an overview of SPARQL, the query language for retrieving and manipulating data stored in RDF format. It describes the basic components of SPARQL including triple patterns, basic graph patterns, group graph patterns, filters, and how these patterns are matched against RDF data to retrieve variable bindings. It also gives a brief introduction to SPARQL 1.1 features for querying and updating RDF stores.
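The matching of triple patterns and basic graph patterns described above can be illustrated with a small sketch. This is a toy pure-Python matcher, not SPARQL itself; the `ex:` terms and the data are invented for the example.

```python
# Toy illustration of how a basic graph pattern (a list of triple patterns
# with '?var' terms) is matched against RDF-like triples to yield bindings.

def match_pattern(pattern, triple, bindings):
    """Try to unify one triple pattern against one triple."""
    bindings = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):                    # variable term
            if bindings.get(p_term, t_term) != t_term:
                return None                           # conflicting binding
            bindings[p_term] = t_term
        elif p_term != t_term:                        # constant mismatch
            return None
    return bindings

def match_bgp(patterns, triples, bindings=None):
    """Match a basic graph pattern against a set of triples."""
    if bindings is None:
        bindings = {}
    if not patterns:
        return [bindings]
    results = []
    for triple in triples:
        b = match_pattern(patterns[0], triple, bindings)
        if b is not None:
            results.extend(match_bgp(patterns[1:], triples, b))
    return results

triples = [
    ("ex:HAL", "rdf:type", "ex:OnboardAI"),
    ("ex:HAL", "ex:hasName", "'HAL'"),
    ("ex:Dave", "rdf:type", "ex:HumanCrew"),
]

# Roughly: SELECT ?ai WHERE { ?ai rdf:type ex:OnboardAI . ?ai ex:hasName ?name }
bgp = [("?ai", "rdf:type", "ex:OnboardAI"), ("?ai", "ex:hasName", "?name")]
print(match_bgp(bgp, triples))   # [{'?ai': 'ex:HAL', '?name': "'HAL'"}]
```

A real SPARQL engine adds filters, solution modifiers and optimised joins, but the core idea is this unification of patterns against triples.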
This document discusses providing linked data. It covers the core tasks of creating, interlinking, and publishing linked data. For creating data, it describes extracting data, naming things with URIs, and selecting vocabularies. Interlinking involves creating RDF links between datasets using properties like owl:sameAs, rdfs:seeAlso, and SKOS mapping properties. Publishing linked data involves creating metadata to describe the dataset, making the data accessible, exposing it in repositories, and validating it.
The document discusses challenges in preserving linked data. It describes the PRELIDA project which aims to identify differences between linked data/open data and digital preservation approaches. Key issues discussed include whether linked data preservation requires storing RDF data alone or additional context. Case studies on preserving DBpedia and Europeana datasets are provided, highlighting dependencies on external linked datasets and challenges of preserving interconnected evolving data.
This document discusses three use cases for preserving linked data: 1) archiving and retrieving RDF data from DBpedia at specific points in time, 2) rendering archived DBpedia data as web pages, and 3) reconstructing the functionality of DBpedia's SPARQL endpoint for past times. It also presents exercises for each use case and discusses additional technical, organizational, and economic issues around long-term linked data preservation.
This document discusses research data management and long-term preservation strategies. It outlines the Open Archival Information System (OAIS) reference model, which provides a framework for describing and comparing digital archive architectures. The OAIS model defines key concepts like the designated community, which is the intended users of preserved data, and representation information, which allows users to understand preserved digital objects. The document also discusses challenges in preserving linked data and outlines the components of a research data management plan.
13. Informal Ontology Explanation
• Used to structure knowledge
• Facilitates interoperability
• A formal, explicit, shared conceptualisation of a domain
• A set of concepts, relationships and individuals over which there is an agreed consensus
www.sti2.org
27. [Diagram: K-HAL v1.0 ontology (small portion). Concepts: Engineered Artifact, Space Ship, Rocket, Celestial Body, Star, Planet, Asteroid, Agent, Crew, Human Crew, Onboard AI. Relations: Has Component, Generates Thrust, Has Name, Has Mass, Has Volume.]
30. [Diagram: K-HAL v1.0 ontology/KB (small portion). Concepts: Engineered Artifact, Space Ship, Rocket, Celestial Body, Star, Planet, Asteroid, Agent, Crew, Human Crew, Onboard AI. Relations: Has Component, Generates Thrust, Has Name, Has Mass, Has Volume. Instances: HAL (Onboard AI), Dave Bowman (Human Crew), The Sun (Star), Jupiter (Planet).]
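The ontology/KB portion on this slide can be sketched as plain subject–predicate–object triples. This is only an illustration: the `ex:` prefix and identifier spellings are invented, and a real K-HAL store would use full URIs.

```python
# The slide's ontology/KB portion as a set of RDF-like triples
# (illustrative names; ex: is an invented prefix).
kb = {
    # schema (small portion)
    ("ex:SpaceShip", "rdfs:subClassOf", "ex:EngineeredArtifact"),
    ("ex:Star", "rdfs:subClassOf", "ex:CelestialBody"),
    ("ex:Planet", "rdfs:subClassOf", "ex:CelestialBody"),
    ("ex:OnboardAI", "rdfs:subClassOf", "ex:Agent"),
    ("ex:HumanCrew", "rdfs:subClassOf", "ex:Crew"),
    ("ex:SpaceShip", "ex:hasComponent", "ex:Rocket"),
    # instances
    ("ex:HAL", "rdf:type", "ex:OnboardAI"),
    ("ex:DaveBowman", "rdf:type", "ex:HumanCrew"),
    ("ex:TheSun", "rdf:type", "ex:Star"),
    ("ex:Jupiter", "rdf:type", "ex:Planet"),
}

def instances_of(kb, cls):
    """Direct instances of a class (no subclass reasoning)."""
    return {s for (s, p, o) in kb if p == "rdf:type" and o == cls}

print(sorted(instances_of(kb, "ex:Star")))   # ['ex:TheSun']
```

Representing both schema and instance data uniformly as triples is exactly what lets the KB grow without changing the storage model.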
44. Problems to be resolved (ontology)
• Finding ontologies
• Understanding ontologies
• Connecting ontologies
• Adapting ontologies
• Version control
• Agility – new ontologies, changes in used ontologies …
• …
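One common first step toward connecting ontologies is aligning classes by label similarity. The sketch below uses stdlib difflib; the class labels and the 0.8 threshold are invented for the example, and real ontology matchers combine many more signals (structure, instances, background knowledge).

```python
# Toy ontology alignment: propose class correspondences whose label
# similarity meets a threshold (labels and threshold are illustrative).
from difflib import SequenceMatcher

def align(labels_a, labels_b, threshold=0.8):
    """Candidate alignments between two ontologies' class labels."""
    pairs = []
    for a in labels_a:
        for b in labels_b:
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                pairs.append((a, b, round(score, 2)))
    return pairs

onto_a = ["SpaceShip", "CelestialBody", "Crew"]
onto_b = ["Spaceship", "HeavenlyBody", "CrewMember"]
print(align(onto_a, onto_b))
```

A human (or the crowd, as later slides suggest) would still need to validate the proposed correspondences before they are used.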
50. Problems to be resolved (data)
• Finding semantic data
• Transforming unstructured data to a semantic format
• Transforming structured data to a semantic format
• Connecting semantic datasets
• Querying/reasoning over connected semantic data
• Sharing new data
• Agility – new datasets, changes in used datasets …
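Of the problems above, transforming structured data to a semantic format is the most mechanical; a minimal sketch, assuming a small invented CSV table and `ex:` prefix:

```python
# Transform tabular (CSV) rows into RDF-like triples.
# Column names and the ex: prefix are invented for the example.
import csv
import io

table = io.StringIO("name,type,mass_kg\nJupiter,Planet,1.9e27\nTheSun,Star,2.0e30\n")

def rows_to_triples(reader):
    """One subject per row; one triple per mapped column."""
    triples = []
    for row in reader:
        subject = f"ex:{row['name']}"
        triples.append((subject, "rdf:type", f"ex:{row['type']}"))
        triples.append((subject, "ex:hasMass", row["mass_kg"]))
    return triples

triples = rows_to_triples(csv.DictReader(table))
print(triples[0])   # ('ex:Jupiter', 'rdf:type', 'ex:Planet')
```

Tools such as R2RML make this row-to-triples mapping declarative rather than hand-coded, but the underlying transformation is the same.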
51. Linked Data Basics
• Fundamentals of Linked Data: main standards & technology components, motivating application scenario
  – Barry Norton, tutorial, 10:45am today
• Querying Linked Data: SPARQL 101
  – Irini Fundulaki, tutorial, 2pm today
• Semantic Web languages and standards: RDF, RDFS, SPARQL
  – Barry Norton & Irini Fundulaki, hands-on, 3:30pm today
52. Publishing and Using Linked Data
• Providing and consuming Linked Data
  – Maribel Acosta, tutorial, 2:30pm Tuesday
• Publishing and consuming Linked Open Data
  – Maribel Acosta, hands-on, 4pm Tuesday
53. Linked Data and the Unstructured World
• Linked Data for NLP
  – Barry Norton, tutorial, Wednesday 10:45am
• Using Linked Data and GATE
  – Barry Norton & Isabelle Augenstein, Wednesday 11:30am
58. Getting help: tutorials and hands-on
• Social Semantic Web and crowdsourcing
  – Elena Simperl, tutorial, Wednesday 2pm
• Using Mechanical Turk to solve Linked Data problems
  – Maribel Acosta, hands-on, Wednesday 3pm
59. [Diagram: K-HAL v1.0 architecture – a Reasoner over the Ontology and Knowledge Base, with an Input/Output layer (vision system, speech generation, speech understanding) facing the User.]
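One core thing the Reasoner box can do over the ontology and knowledge base is RDFS-style type inference: if x rdf:type C and C rdfs:subClassOf D, then x rdf:type D. A minimal forward-chaining sketch, with invented `ex:` names mirroring the earlier slides:

```python
# Forward-chaining RDFS-style type inference over a triple set,
# repeated until no new triples can be derived (a fixpoint).

def infer_types(triples):
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        new = {
            (x, "rdf:type", d)
            for (x, p1, c) in triples if p1 == "rdf:type"
            for (c2, p2, d) in triples if p2 == "rdfs:subClassOf" and c2 == c
        }
        if not new <= triples:          # found something not yet known
            triples |= new
            changed = True
    return triples

kb = {
    ("ex:OnboardAI", "rdfs:subClassOf", "ex:Agent"),
    ("ex:HAL", "rdf:type", "ex:OnboardAI"),
}
print(("ex:HAL", "rdf:type", "ex:Agent") in infer_types(kb))   # True
```

Production reasoners implement many more entailment rules and far better indexing, but the saturate-to-fixpoint loop is the essential pattern.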
60. [Diagram: K-HAL v2.0 architecture – the HAL Ontology draws on Linked Open Vocabularies; HAL Facts in an RDF Store draw on Linked Open Data, crowdsourced facts and corporate data; a Reasoner over ontology and facts connects to the Input/Output layer (vision system, speech generation, speech understanding).]
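A v2.0-style RDF store that mixes Linked Open Data, crowdsourced facts and corporate data needs to remember where each fact came from. One common way to track provenance is to extend triples to quads, with the source as the fourth element; a sketch with invented source names and facts:

```python
# Merge per-source triple sets into one quad store that records provenance.

def merge_sources(sources):
    """Combine {source_name: triples} into quads (s, p, o, source)."""
    store = set()
    for source, triples in sources.items():
        for (s, p, o) in triples:
            store.add((s, p, o, source))
    return store

sources = {
    "linked-open-data": {("ex:Jupiter", "rdf:type", "ex:Planet")},
    "crowdsourced":     {("ex:HAL", "ex:hasName", "'HAL'")},
    "corporate":        {("ex:HAL", "rdf:type", "ex:OnboardAI")},
}

store = merge_sources(sources)
# Provenance lets the system weight or filter facts by origin,
# e.g. keep only the corporate facts:
corporate = {(s, p, o) for (s, p, o, src) in store if src == "corporate"}
print(corporate)   # {('ex:HAL', 'rdf:type', 'ex:OnboardAI')}
```

In RDF terms this corresponds to named graphs: each source's facts live in their own graph, so crowdsourced claims can be trusted differently from curated corporate data.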
65. [Diagram: Conductor – dictates the song, uses a common notation, selects performances, edits and mixes; Choir – autonomous singers, available online.]
66. [Diagram: Listener – has a desire, has preferences; Conductor – dictates the song, uses a common notation, selects performances, edits and mixes; Choir – autonomous singers, available online.]
68. Conclusions (1/2)
In its current state the Semantic Web/Web of Data facilitates the re-use of ontologies and data.
• Other problems arise, associated with ontology and data quality and with adapting/aligning ontologies and data …
• Good SW/LD practitioners know online ontologies and datasets as a good researcher knows the related literature.
Be Lazy
69. Conclusions (2/2)
Releasing ontologies and data:
• Provides a community benefit, for expected and unexpected uses
• Can increase the value of the released artifacts
• May be obligatory depending on context (e.g. if paid for by public funding)
• Has associated issues related to training, quality, privacy, maintenance …
Be kind and share