The eleventh international world wide web conference will be held in Honolulu, Hawaii, USA from May 7-11. Participants from over 20 countries will register to attend the prestigious event, which will see Tim Berners-Lee and Ian Foster speak.
Presentation by Lori Farnsworth, Propel Schools, and Norton Gusky for Three Rivers Educational Technology Conference on the Dynamic Resource Portal, an enterprise search project attempting to make searching more relevant and easier.
The document discusses open source enterprise content management and how it can be enhanced by integrating semantic web technologies. It describes how semantic technologies can help extract meaning from unstructured content, connect information to form knowledge, reason about the knowledge, and present it in an actionable way. The document also provides an overview of Nuxeo's work on semantic ECM through various research projects and their semantic engine which extracts metadata from content.
- The document is a slide presentation on semantic analysis in language technology that discusses the semantic web and ontologies. It provides examples of question answering systems like START, Siri, and IBM Watson and discusses the evolution of the web from Web 1.0 to Web 2.0 to the proposed Web 3.0. It also introduces key concepts like ontologies, semantic metadata, and the role of semantics in allowing machines to process information.
Search, Signals & Sense: An Analytics-Fueled Vision (Seth Grimes)
The document discusses how text analytics can fuel semantic search and sensemaking by extracting features from documents, analyzing relationships between entities, and integrating search with other data sources. It outlines trends toward more unified search platforms that incorporate user context and infer intent to provide categorized, clustered results rather than just hit lists. The goal is for search to be the starting point for iterative sensemaking through analysis and synthesis of information.
Untangling Concepts, Objects, and Information (Jim Logan)
This presentation aims to answer many questions related to concept modeling:
• What is a concept?
• How do we get from concepts and objects to information about objects?
• Can we untangle concepts, objects, and information?
• What kinds of models are there?
• Is it useful to separate things in reality from evidence, measurements, samplings, and recordings?
Ontologies and the humanities: some issues affecting the design of digital in... (Toby Burrows)
This document discusses issues related to designing digital infrastructure for the humanities using ontologies. It notes that there are many ongoing efforts to develop ontologies for different domains in digital humanities. However, it also acknowledges linguistic, semantic, and conceptual difficulties in representing humanities knowledge through ontologies. As an alternative, it discusses strategies like topic modeling, linked data, and conceptual spaces that may better capture humanistic perspectives on relationships, cognition, and meaning. It argues that future humanities research should look beyond ontologies alone and examine computational modeling from cognitive science and philosophy.
Open Source Web Content Management Technologies for Libraries (Anil Mishra)
This document provides an agenda and overview for an open source web content management technologies pre-conference tutorial focused on libraries. The agenda covers topics including the current information landscape, open source overview, categories of open source software for libraries, and several specific open source digital library systems and content management platforms. An overview of each topic is provided along with considerations around selecting and implementing open source solutions for libraries.
folksonomy, social tagging, tag clouds, automatic folksonomy construction, word clouds, wordle, context-preserving word cloud visualisation, CPEWCV, seam carving, inflate and push, star forest, cycle cover, quantitative metrics, realized adjacencies, distortion, area utilization, compactness, aspect ratio, running time, semantics in language technology
UVA MDST 3703 Thematic Research Collections 2012-09-18 (Rafael Alvarado)
The document discusses thematic research collections (TRCs) as an emerging genre of digital scholarship. TRCs consolidate related content to overcome the problem of traditional libraries scattering content. Key features of TRCs include being electronic, structured yet open-ended, research-oriented, and achieving "contextual mass" by making connections between resources. The document then examines several examples of TRCs and evaluates them based on six criteria like content, organization, findability, connections between resources, tools provided, and community involvement.
The document analyzes how cultural factors influence the design of Yahoo's websites in different countries. It first provides context on cultural models and how they have been used to evaluate software and website design across cultures. It then compares Yahoo's homepages in the US and China, finding key differences that reflect the cultural dimensions identified in the literature review. Specifically, the Yahoo China homepage is longer, has more dense information and content areas, and less structured navigation compared to the US version. The analysis suggests cultural factors like power distance and uncertainty avoidance help explain these differences in cross-cultural design.
The document outlines Pablo Mendes' PhD dissertation defense on adaptive semantic annotation of entities and concepts in text. It discusses Pablo Mendes' conceptual model for knowledge base tagging, the DBpedia knowledge base and DBpedia Spotlight system, core evaluations of the system, and case studies applying the system to tweets, audio transcripts, and educational material. The presentation concludes by thanking the audience.
Mapping the use of digital sources amongst Humanities scholars in the Netherl... (Max Kemman)
1) The document reports on a survey of 294 Dutch and Belgian academics regarding their use of digital sources and databases.
2) It finds that text is the most commonly used digital medium, and Google is the dominant search tool and platform. Younger academics are more confident in using audiovisual search tools.
3) Disciplines like history and literature most commonly use images and digitized objects, while fields like social studies and linguistics make more use of video, audio, and statistical data.
4) The study has implications for how to increase awareness, appeal and adoption of digital humanities approaches through user-focused design and inclusion in education.
This document discusses tagging keywords to provide context and improve search results. It proposes tagging search keywords with metadata tags, called "metawords", to reduce the "context error" of searches. It describes building a metaquery by allowing users to freely tag keywords, querying databases using the metaquery, and re-ranking results based on a proposed "ContextRank" algorithm that estimates relevance, authority and contextuality. Revenue would come from affiliate programs when referring users to products or advertisers.
This document discusses semantic analysis and natural language processing. It describes 7 typical tasks: 1) topic detection, 2) named entity recognition, 3) co-reference resolution and word sense disambiguation, 4) relation extraction, 5) sentiment analysis, 6) social annotation, and 7) text summarization. For each task it provides details on the goal, common techniques used, and examples. The document also discusses publishing content as linked data using semantic vocabularies and ontologies to make it machine-readable and processable.
Beyond document retrieval using semantic annotations (Roi Blanco)
Traditional information retrieval approaches deal with retrieving full-text documents in response to a user's query. However, applications that go beyond the "ten blue links" and make use of additional information to display and interact with search results are becoming increasingly popular and have been adopted by all major search engines. In addition, recent advances in text extraction allow for inferring semantic information about particular items present in textual documents. This talk presents how enhancing a document with structures derived from shallow parsing can convey a different user experience in search and browsing scenarios, and what challenges we face as a consequence.
How to model digital objects within the semantic web (Angelica Lo Duca)
These slides describe the general concept of semantic Web and Linked Data, then they illustrate the concept of digital object. Finally they give a use case.
Language as social sensor - Dubrovnik - HrTAL2016 - 30 Sep ... (Marko Grobelnik)
At the HrTAL2016 conference I presented the talk "Language as a Social Sensor to Operate with Knowledge". The talk included a section on language as an interface between physical nature and the world of the human mind and human society. The role of language as a 'sensor' has several consequences for the uncertainties and inexactness of language evolution as we know it. The talk was accompanied by several live demonstrations of systems for semantic annotation (wikifier.org) and media monitoring (eventregistry.org).
The document is about the Semantic Web conference to be held in Honolulu, Hawaii from May 7-11, 2002. It provides information about the conference name, location, date, slogan, and lists some of the expected participants. XML markup is suggested to add structure and meaning to the information for machines to better understand the document.
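As a sketch of the markup that summary alludes to, the conference facts could be serialized as XML. The element names below are illustrative assumptions, not the markup the original slides use; the example is built with Python's standard library:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML structure for the WWW2002 announcement: the element
# names are invented for illustration, to show how markup can make the
# facts stated in the prose machine-readable.
conf = ET.Element("conference")
ET.SubElement(conf, "name").text = "The Eleventh International World Wide Web Conference"
ET.SubElement(conf, "location").text = "Honolulu, Hawaii, USA"
ET.SubElement(conf, "startDate").text = "2002-05-07"
ET.SubElement(conf, "endDate").text = "2002-05-11"
speakers = ET.SubElement(conf, "speakers")
ET.SubElement(speakers, "speaker").text = "Tim Berners-Lee"
ET.SubElement(speakers, "speaker").text = "Ian Foster"

xml_text = ET.tostring(conf, encoding="unicode")
print(xml_text)
```

A program consuming this markup can now extract the location or the speaker list without parsing natural-language prose.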
Ontology: the study of, or concern about, what kinds of things exist, i.e. what entities there are in the universe. The term derives from the Greek onto (being) and logia (written or spoken). Ontology is a branch of metaphysics, the study of first principles or the root of things.
The document discusses the evolution of the internet and web technologies. It describes early technologies like Vannevar Bush's memex and hypertext, the development of the World Wide Web through HTTP and HTML. It outlines the rise of user-generated content through blogs, photos, video and social sharing sites. It also discusses the potential for machines to understand semantic meaning through standards like XML, RDF and ontologies.
The document introduces Topic Maps, which provide a standardized way to represent knowledge. Topic Maps use topics to represent subjects, associations to represent relationships between topics, and occurrences to link topics to information resources. They allow multiple perspectives on knowledge through the use of scopes. Topic Maps can be used to improve access to information through semantic queries and customized views. They also enable easy integration and sharing of information across systems through merging of Topic Maps.
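The three building blocks named above (topics, associations, occurrences) can be sketched as plain data structures. This is a deliberately simplified, hypothetical rendering; the real ISO 13250 Topic Maps model is considerably richer:

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    name: str
    # occurrences link the topic to information resources
    occurrences: list = field(default_factory=list)

@dataclass
class Association:
    kind: str       # the typed relationship, e.g. "composed"
    members: tuple  # the topics it connects

# Illustrative example data (not from the slides):
puccini = Topic("Puccini", occurrences=["https://en.wikipedia.org/wiki/Giacomo_Puccini"])
tosca = Topic("Tosca")
wrote = Association("composed", (puccini, tosca))

# Navigate from a topic through an association to a related topic:
related = [t.name for t in wrote.members if t is not puccini]
print(related)  # ['Tosca']
```

Merging two topic maps then amounts to unifying topics that represent the same subject and taking the union of their associations and occurrences.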
The document discusses using semantics to improve search engines and information retrieval. It describes some current limitations like recall issues, results being dependent on vocabulary, and content not being machine-readable. It then outlines several key aspects of using semantics: semantic analysis to extract facts from text, using semantic vocabularies as channels to publish linked data, using ontologies for semantic content modeling, and semantic matchmaking for automatic distribution of content. The goal is to move from isolated data silos to a global web of data where objects are linked with typed relationships and explicit semantics.
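The contrast between untyped hyperlinks and typed relationships can be illustrated with a toy triple store; the subjects and predicates below are invented for illustration:

```python
# Each fact is a (subject, typed-relation, object) triple -- the core idea
# behind moving from isolated data silos to a web of linked data.
triples = [
    ("WWW2002", "heldIn", "Honolulu"),
    ("WWW2002", "hasSpeaker", "Tim Berners-Lee"),
    ("Tim Berners-Lee", "worksAt", "W3C"),
]

def query(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Because links are typed, a machine can ask *who speaks*,
# not merely *what is linked*:
speakers = [o for _, _, o in query(s="WWW2002", p="hasSpeaker")]
print(speakers)  # ['Tim Berners-Lee']
```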
Semantic technology in a nutshell 2013. Semantic! Are you a linguist? (Heimo Hänninen)
I have often faced the challenge of a person coming to me and asking "what is semantic technology?" Especially when one linguist wanted to know how I define semantics (in IT), I had to step back and do some homework on the topic. We IT people are notorious for abusing any term from any domain whenever we need yet another buzzword to mystify some basic concepts. So, this is what I came up with as I tried to explain to her what semantic technology is in a nutshell.
01 History Of Hypertext + Bibliography 2010 (Paul Kahn)
This document summarizes key ideas from influential pioneers in the development of hypertext and digital media including Vannevar Bush, Ted Nelson, Douglas Engelbart, and Alan Kay. It discusses Bush's concept of the memex, an early vision of hypertext. It outlines Nelson's ideas about linking documents and his Xanadu project. It describes Engelbart's work developing the NLS system and introducing the mouse and graphical user interface. It shares Kay's vision of portable personal devices for storing and manipulating information like notebooks that could outpace human senses.
An elementary explanation of the difficulties of combining indexes for web pages and books, and of the means by which book index data can optimize general web searches at scale.
A talk given at the annual Computer Science for High School Teachers event at Victoria University of Wellington. I presented some basics of the World Wide Web and why it is worth preserving, our work on non-expert tools for populating semantically enriched content, a current project to identify NZ native birds from their calls using citizen science and contemporary deep learning with TensorFlow, a project investigating the impact of online citizen science on the development of science capabilities in primary school children, and my collaboration with Adam Grener from the School of English, Film, Theater and Media Studies at VUW, with whom I am working on computational tools for literature studies.
This document provides an overview of ontologies and the semantic web. It defines ontologies as formal specifications of conceptualizations that are shared between people and computers. Ontologies provide a common vocabulary and conceptual structure to facilitate understanding between humans and machines. They allow different systems and communities to work together by providing shared definitions of concepts and relationships. The development of ontologies and the semantic web aims to make web resources more computer-readable and enable machines to better understand and process online information.
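One of the simplest inferences a shared ontology enables is transitive class membership. The tiny hierarchy below is a hypothetical example, not drawn from the slides:

```python
# A toy subclass hierarchy: each class maps to its direct superclass.
# The names are illustrative assumptions.
subclass_of = {
    "Rector": "Clergy",
    "Clergy": "Person",
    "Professor": "Academic",
    "Academic": "Person",
}

def is_a(cls, ancestor):
    """Follow subclass links upward to test class membership transitively."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False

print(is_a("Rector", "Person"))    # True  (Rector -> Clergy -> Person)
print(is_a("Rector", "Academic"))  # False
```

A machine holding only the explicit facts can thus answer questions that were never stated directly, which is precisely what a shared conceptual structure buys both humans and machines.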
This document summarizes a talk given by Andraz Tori about his company Zemanta. It discusses how Zemanta started as a system for closed captioning Slovenian television, which led Tori to start a startup. Zemanta provides a personal writing assistant that suggests images, related articles, in-text links and tags to bloggers as they write. It analyzes text using natural language processing and information retrieval against a database containing Wikipedia, Freebase and other web data. Tori discusses Zemanta's technology, growth serving over 80,000 bloggers monthly, and plans to open its API to more users. He emphasizes lessons like accelerators being beneficial, the importance of monetizing early, and focusing on one
This document provides a summary of a lecture on texts and models. It discusses HTML and how it is used to generate a form of hypertext. The lecture reviews the history of digital text and encoding dating back to 1949. It examines what text is by looking at material examples and how documents are structured through descriptive markup languages like XML. XML is introduced as a simplified version of SGML that defines rules for tagging structural elements in a document hierarchy.
The document discusses the history and evolution of the World Wide Web from its creation in 1989 to present times. It outlines key developments including the commercialization of the web in the late 1990s, the rise of Web 2.0 technologies in the 2000s, and ideas for Semantic Web and Web 3.0 technologies that aim to make the web more intelligent and accessible to machines.
3. History of the Semantic Web
• Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN
• TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:
“... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”
• TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web
– E.g., article in May 2001 issue of Scientific American…
5. Beware of the Hype
• Hype seems to suggest that Semantic
Web means: “semantics + web = AI”
– “A new form of Web content that is
meaningful to computers will unleash a
revolution of new abilities”
• More realistic to think of it as meaning:
“semantics + web + AI = more useful
web”
– Realising the complete “vision” is too
hard for now (probably)
– But we can make a start by adding
semantic annotation to web resources
Images from Christine Thompson and David Booth
6. Where we are Today: the Syntactic Web
[Hendler & Miller 02]
7. The Syntactic Web is…
• A hypermedia, a digital library
– A library of documents (called web pages) interconnected by
hypermedia links
• A database, an application platform
– A common portal to applications accessible through web pages, and
presenting their results as web pages
• A platform for multimedia
– BBC Radio 4 anywhere in the world! Terminator 3 trailers!
• A naming scheme
– Unique identity for those documents
A place where computers do the presentation (easy) and people
do the linking and interpreting (hard).
Why not get computers to do more of the hard work?
[Goble 03]
8. Hard Work using the Syntactic Web…
Find images of Peter Patel-Schneider, Frank van Harmelen and
Alan Rector…
Rev. Alan M. Gates, Associate Rector of the
Church of the Holy Spirit, Lake Forest, Illinois
9. Impossible (?) using the Syntactic Web…
• Complex queries involving background knowledge
– Find information about “animals that use sonar but are
not either bats or dolphins”, e.g., Barn Owl
• Locating information in data repositories
– Travel enquiries
– Prices of goods and services
– Results of human genome experiments
• Finding and using “web services”
– Visualise surface interactions between two proteins
• Delegating complex tasks to web “agents”
– Book me a holiday next weekend somewhere warm, not
too far away, and where they speak French or English
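The first query above shows what background knowledge buys us: once "uses sonar" is represented explicitly rather than buried in page text, the answer is a simple set operation. A toy Python sketch (the animal data and property names are illustrative, not drawn from any real knowledge base):

```python
# Toy knowledge base: each animal carries an explicit "uses_sonar" fact.
# With facts machine-accessible, the slide's query becomes set algebra.
animals = {
    "Bat":      {"uses_sonar": True},
    "Dolphin":  {"uses_sonar": True},
    "Barn_Owl": {"uses_sonar": True},
    "Horse":    {"uses_sonar": False},
}

# "Animals that use sonar but are not either bats or dolphins"
answer = {name for name, props in animals.items()
          if props["uses_sonar"] and name not in {"Bat", "Dolphin"}}
```

On the syntactic Web this query fails because no page states the facts in a form a machine can combine; here it is a one-line comprehension.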
10. What is the Problem?
• Consider a typical web page:
• Markup consists of:
– rendering information (e.g., font size and colour)
– hyper-links to related content
• Semantic content is accessible to humans but not (easily) to computers…
11. What information can we see…
WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact
Registered participants coming from
australia, canada, chile, denmark, france, germany, ghana, hong kong, india,
ireland, italy, japan, malta, new zealand, the netherlands, norway,
singapore, switzerland, the united kingdom, the united states, vietnam,
zaire
Register now
On the 7th May Honolulu will provide the backdrop of the eleventh
international world wide web conference. This prestigious event …
Speakers confirmed
Tim berners-lee
Tim is the well known inventor of the Web, …
Ian Foster
Ian is the pioneer of the Grid, the next generation internet …
12. What information can a machine see…
…
…
…
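The gap between slides 11 and 12 can be made concrete: semantic annotation exposes the facts a human reads off the WWW2002 page as triples a machine can query. A minimal pure-Python sketch (a real deployment would use RDF; the names and dates below are taken from the page text, the predicate names are made up for the example):

```python
# The WWW2002 page facts as subject-predicate-object triples.
# Predicate names are hypothetical stand-ins for ontology terms.
triples = {
    ("WWW2002", "type", "Conference"),
    ("WWW2002", "location", "Honolulu"),
    ("WWW2002", "startDate", "2002-05-07"),
    ("TimBernersLee", "speakerAt", "WWW2002"),
    ("IanFoster", "speakerAt", "WWW2002"),
}

def query(s=None, p=None, o=None):
    """Return all triples matching a pattern; None is a wildcard."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# A machine can now answer "who speaks at WWW2002?" directly,
# instead of seeing only uninterpreted markup.
speakers = sorted(s for (s, _, _) in query(p="speakerAt", o="WWW2002"))
```

The pattern-matching `query` is the essence of what triple stores provide; the annotations are what turn a rendering-only page into data.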
15. Need to Add “Semantics”
• External agreement on meaning of annotations
– E.g., Dublin Core
• Agree on the meaning of a set of annotation tags
– Problems with this approach
• Inflexible
• Limited number of things can be expressed
• Use Ontologies to specify meaning of annotations
– Ontologies provide a vocabulary of terms
– New terms can be formed by combining existing ones
– Meaning (semantics) of such terms is formally specified
– Can also specify relationships between terms in multiple
ontologies
16. Ontology: Origins and History
Ontology in Philosophy
a philosophical discipline—a branch of philosophy that
deals with the nature and the organisation of reality
• Science of Being (Aristotle, Metaphysics, IV, 1)
• Tries to answer the questions:
What characterizes being?
Eventually, what is being?
17. Ontology in Linguistics
[Figure: the semiotic triangle of Ogden & Richards, 1923 — a linguistic Form (e.g., the word “Tank”) activates a Concept; the Concept relates to a Referent in the world; the Form only indirectly “stands for” the Referent, a link the diagram marks with “?”.]
18. Ontology in Computer Science
• An ontology is an engineering artifact:
– It is constituted by a specific vocabulary used to describe a
certain reality, plus
– a set of explicit assumptions regarding the intended meaning
of the vocabulary.
• Thus, an ontology describes a formal specification of a certain
domain:
– Shared understanding of a domain of interest
– Formal and machine manipulable model of a domain of
interest
“An explicit specification of a conceptualisation”
[Gruber93]
19. Structure of an Ontology
Ontologies typically have two distinct components:
• Names for important concepts in the domain
– Elephant is a concept whose members are a kind of animal
– Herbivore is a concept whose members are exactly those
animals who eat only plants or parts of plants
– Adult_Elephant is a concept whose members are exactly those
elephants whose age is greater than 20 years
• Background knowledge/constraints on the domain
– Adult_Elephants weigh at least 2,000 kg
– All Elephants are either African_Elephants or Indian_Elephants
– No individual can be both a Herbivore and a Carnivore
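The two components above — concept definitions and background constraints — can be sketched in a few lines. This is a toy Python rendering of the slide's examples (the individuals and their attributes are invented; a real ontology would express this in a logic-based language such as OWL):

```python
# Toy individuals; attribute values are illustrative only.
individuals = {
    "Dumbo": {"kind": "Elephant", "age": 4,  "weight_kg": 900,  "diet": "plants"},
    "Jumbo": {"kind": "Elephant", "age": 35, "weight_kg": 5400, "diet": "plants"},
}

# Concept definitions: membership conditions from the slide.
def is_herbivore(x):
    """Members are exactly those animals who eat only plants."""
    return individuals[x]["diet"] == "plants"

def is_adult_elephant(x):
    """Elephants whose age is greater than 20 years."""
    ind = individuals[x]
    return ind["kind"] == "Elephant" and ind["age"] > 20

# Background knowledge: Adult_Elephants weigh at least 2,000 kg.
def constraint_violations():
    return [x for x in individuals
            if is_adult_elephant(x) and individuals[x]["weight_kg"] < 2000]

violations = constraint_violations()
```

In a real ontology language the definitions are declarative and a reasoner checks the constraints; the hand-written predicates above only mimic that behaviour for these two individuals.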
20. A Semantic Web — First Steps
Make web resources more accessible to automated processes
• Extend existing rendering markup with semantic markup
– Metadata annotations that describe content/function of web
accessible resources
• Use Ontologies to provide vocabulary for annotations
– “Formal specification” is accessible to machines
• A prerequisite is a standard web ontology language
– Need to agree common syntax before we can share semantics
– Syntactic web based on standards such as HTTP and HTML
21. Ontology Design and Deployment
• Given key role of ontologies in the Semantic Web, it will be
essential to provide tools and services to help users:
– Design and maintain high quality ontologies, e.g.:
• Meaningful — all named classes can have instances
• Correct — captured intuitions of domain experts
• Minimally redundant — no unintended synonyms
• Richly axiomatised — (sufficiently) detailed descriptions
– Store (large numbers) of instances of ontology classes, e.g.:
• Annotations from web pages
– Answer queries over ontology classes and instances, e.g.:
• Find more general/specific classes
• Retrieve annotations/pages matching a given description
– Integrate and align multiple ontologies
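The "find more general/specific classes" query above reduces to walking subclass links transitively. A toy Python sketch over the elephant hierarchy used earlier in the deck (the class names follow slide 19; the exact hierarchy is an assumption for the example):

```python
# Direct subclass links; keys are subclasses, values their parent class.
subclass_of = {
    "African_Elephant": "Elephant",
    "Indian_Elephant":  "Elephant",
    "Elephant":         "Herbivore",
    "Herbivore":        "Animal",
}

def superclasses(cls):
    """All classes more general than cls, nearest first."""
    out = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        out.append(cls)
    return out

def subclasses(cls):
    """All classes more specific than cls (direct and indirect)."""
    direct = [c for c, parent in subclass_of.items() if parent == cls]
    return direct + [d for c in direct for d in subclasses(c)]
```

A production ontology service does this with a description-logic reasoner, which also handles classes defined by conditions (like Adult_Elephant) rather than by explicitly asserted links.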