KNOWLEDGE
A BRIEF INTRO TO ACQUISITION,
REPRESENTATION AND PUBLISHING
CCS515 Guest Lecture
by Dr. Gan Keng Hoon
27 October 2015
Outline
◦Knowledge Publishing
◦Knowledge Representation
◦Knowledge Acquisition
Knowledge
◦What is knowledge?
Knowledge is a familiarity, awareness or understanding of
someone or something, such as facts, information,
descriptions, or skills, which is acquired through
experience or education by perceiving, discovering, or
learning. (Wikipedia)
My Knowledge
◦ List your knowledge.
◦ Where did you get the knowledge?
◦ How do you want to share them?
Knowledge Management
Wikipedia (2015)
Knowledge management (KM) is the process of capturing, developing, sharing, and
effectively using organizational knowledge. It refers to a multi-disciplinary approach to
achieving organizational objectives by making the best use of knowledge.
Davenport (1994) offered the still widely quoted definition:
"Knowledge management is the process of capturing, distributing, and effectively using
knowledge."
(Duhon, 1998):
"Knowledge management is a discipline that promotes an integrated approach to
identifying, capturing, evaluating, retrieving, and sharing all of an enterprise's
information assets. These assets may include databases, documents, policies,
procedures, and previously un-captured expertise and experience in individual workers."
Manipulating Knowledge
(Duhon, 1998):
“Knowledge management is a discipline that promotes an integrated approach to identifying,
capturing, evaluating, retrieving, and sharing all of an enterprise's information assets. These assets
may include databases, documents, policies, procedures, and previously un-captured expertise
and experience in individual workers.”
Knowledge Representation
Knowledge Acquisition
Knowledge Publishing
Knowledge Publishing/Sharing
◦ Advantage
◦ Avoid losing information that an organization or individual has acquired.
◦ Support quicker decision making and better efficiency.
◦ Reusable by, and beneficial to, countless users, staff and readers.
◦ Build reputations in terms of expertise.
◦ Promote knowledge exchange and creation of new knowledge.
Tools for Knowledge Publishing
◦ Offline
◦ Books, newspaper, news etc.
◦ Online
◦ e-books, e-newspapers, e-news etc.
◦ Websites
◦ Blogger (Blog)
◦ Tumblr (Microblog)
◦ Twitter (Short message)
◦ Pinterest (Image)
◦ YouTube (Video)
etc…
Personal Knowledge Publishing
◦ One way to communicate personal knowledge is by sharing and publishing.
Activities Involved in Personal
Knowledge Publishing
◦Exploring
◦Push
◦Share links via social media
◦Pull
◦Enable SEO
◦Bookmarked
Personal Knowledge Publishing using
Blog
◦ Let’s blog.
◦ Why blog?
◦ Make implicit knowledge (e.g. not codified or structured) more
explicit.
◦ Reflect on own learning.
◦ Responsibility needed as it is publicly available.
Personal Knowledge Publishing using
Images
◦ Knowledge is not limited to text.
◦ It can be conveyed using images or other media.
◦ Pin a series of images to show an implicit idea.
Personal Knowledge Publishing using
Videos
◦Record your knowledge.
◦What else can be recorded?
Personal Knowledge Publishing
Issues
◦ Knowledge overload.
Can the World Wide Web handle the ever-growing volume of information?
BIG DATA, STORAGE, CLOUD, NETWORK
◦ Credibility
Do you have any idea which information source to trust?
FEEDBACK, SENTIMENT ANALYSIS, DATA ANALYSIS
◦ Discovery
Are there better ways to discover the published knowledge?
SEARCH, FRIENDS RECOMMENDATION
Knowledge Representation
How to make knowledge publishing better?
1. I want users to be able to locate my video.
2. I want users to discover the slides I share.
3. I want the navigation of my blog posts to be topic-related.
4. Many more…
Knowledge Representation
Tag a concept:
Automata
Computer Science
Tag a name: Michael Benjamin
Tag a name: Shai Simonson
Knowledge Representation
◦ So, we have resources like videos, images, texts…
◦ We need a way to make them more meaningful.
Resource and Description
However, each resource has its own format.
Need standard form.
SOLUTION: Define a standard language for writing the
description -> Metadata (Semantic Web Terminology)
Knowledge Representation
Resource
Description
Instructor: Shai Simonson
Title: Lecture 1 – Finite State Machines (Part 1/9)
Duration: 9:59
Uploaded: May 7 2010
Knowledge Representation
- Resource Description Framework
◦ This leads to one of the Semantic Web's main tasks:
Metadata Annotation
- description of resources using a standard language
◦ Useful for search and discovery.
Knowledge Representation
- Resource Description Framework
◦ Common language for describing resources
◦ A statement with structure.
◦ A statement is a triple.
◦ Subject-predicate-object
◦ Subject: resource
◦ Predicate: a verb/property/relation
◦ Object: A resource/a literal string
Knowledge Representation
- Resource Description Framework
To describe the statement: "The instructor of https://www.youtube.com/watch?v=HyUK5RAJg1c is Shai Simonson".
The subject of the statement above is: https://www.youtube.com/watch?v=HyUK5RAJg1c
The predicate is: instructor
The object is: Shai Simonson
Simplified RDF
<?xml version="1.0"?>
<RDF>
<Description about="https://www.youtube.com/watch?v=HyUK5RAJg1c">
<instructor>Shai Simonson</instructor>
<title>Lecture 1 – Finite State Machines (Part 1/9)</title>
</Description>
</RDF>
Study: http://www.w3schools.com/xml/xml_rdf.asp
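A minimal sketch of how a program might read the simplified RDF above back into subject-predicate-object triples. It uses only Python's standard xml.etree.ElementTree module; the function name extract_triples is an illustrative assumption, and the sketch handles only this simplified, namespace-free form, not full RDF/XML.

```python
import xml.etree.ElementTree as ET

# The simplified RDF description from the slide, kept namespace-free.
SIMPLIFIED_RDF = """<?xml version="1.0"?>
<RDF>
<Description about="https://www.youtube.com/watch?v=HyUK5RAJg1c">
<instructor>Shai Simonson</instructor>
<title>Lecture 1 – Finite State Machines (Part 1/9)</title>
</Description>
</RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples from the simplified RDF."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall("Description"):
        subject = description.get("about")   # the resource being described
        for prop in description:             # each child element is one statement
            triples.append((subject, prop.tag, prop.text))
    return triples


for triple in extract_triples(SIMPLIFIED_RDF):
    print(triple)
```

Each child element of Description becomes one triple, so the two properties in the example yield two statements about the same video.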
Knowledge Representation
- Resource Description Framework
Source: Fulvio Corno, Semantic Web,
Metadata, Knowledge Representation,
Ontologies
Knowledge Representation
- Resource Description Framework
Solution: metadata standardization is required
Many standardization bodies are involved
General standard
e.g. Dublin Core (DC)
or may depend on goal, context, domain, …
e.g. educational resources (IEEE LOM), multimedia resources (MPEG-7),
images (VRA), people (FOAF, IEEE PAPI), geospatial resources (GSDGM),
bibliographical resources (MARC, OAI), cultural heritage resources
(CIDOC CRM)
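As an illustration of a general metadata standard, the sketch below describes the lecture video with Dublin Core elements inside an RDF description. The namespace URIs are the published Dublin Core and RDF ones; mapping "instructor" to dc:creator, writing the upload date as 2010-05-07, and the function name describe_with_dublin_core are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def describe_with_dublin_core(resource_url, fields):
    """Build an RDF/XML string describing one resource with Dublin Core elements."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_url}
    )
    for name, value in fields.items():
        # Each Dublin Core element becomes one property of the resource.
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = describe_with_dublin_core(
    "https://www.youtube.com/watch?v=HyUK5RAJg1c",
    {
        "creator": "Shai Simonson",  # assumption: instructor mapped to dc:creator
        "title": "Lecture 1 – Finite State Machines (Part 1/9)",
        "date": "2010-05-07",        # assumption: upload date in ISO form
    },
)
print(xml_text)
```

Because the element names come from a shared standard, any tool that understands Dublin Core can interpret the description without knowing about this particular video site.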
Knowledge Representation - Ontology
Semantically rich descriptions need “understanding” the
meaning of a resource and the domain related to the resource
Disambiguation of terms
Shared agreement on meanings
Description of the domain, with concepts and relations among
concepts
Knowledge Representation - Ontology
◦ Controlled vocabularies
◦ Taxonomies
◦ Thesauri
◦ Faceted classification
◦ Ontologies
◦ Folksonomies
◦ Others
Knowledge Representation - Ontology
Taxonomy
Subject-based classification that arranges the
terms in the controlled vocabulary into a
hierarchy
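A taxonomy can be sketched as a mapping from each term to its parent term. The three-level hierarchy below is a hypothetical example, not taken from any official classification, and path_from_root is an illustrative helper name.

```python
# Hypothetical taxonomy: each term maps to its broader (parent) term.
PARENT = {
    "Computer Science": None,                     # root of the hierarchy
    "Theory of Computation": "Computer Science",
    "Automata": "Theory of Computation",
}


def path_from_root(term):
    """Return the chain of terms from the root down to the given term."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return list(reversed(path))


print(" > ".join(path_from_root("Automata")))
```

Walking up the parent links gives the full classification path of a tag, which is what lets a resource tagged "Automata" also be found under its broader topics.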
Knowledge Representation - Ontology
◦ ACM Classification system.
◦ Used to annotate bibliography.
Knowledge Representation - Ontology
Model for describing the world that consists of a set of types, properties, and relationships.
Knowledge Representation - Ontology
Ontologies generally describe:
Individuals
◦ the basic or “ground level” objects
Classes
◦ sets, collections, or types of objects
Attributes
◦ properties, features, characteristics, or parameters that objects can have and share
Relationships
◦ ways that objects can be related to one another
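The four components can be sketched in a few lines of Python. The ice cream domain, class names, attribute values and the is_instance_of helper below are all hypothetical, chosen only to show how individuals, classes, attributes and relationships fit together; a real ontology would normally be written in a language such as OWL.

```python
from dataclasses import dataclass, field

# Classes: hypothetical hierarchy, class name -> superclass (None for top level).
SUPERCLASS = {"Dessert": None, "IceCream": "Dessert", "Topping": None}


@dataclass
class Individual:
    """A ground-level object that belongs to a class and carries attributes."""
    name: str
    cls: str
    attributes: dict = field(default_factory=dict)


# Individuals with their attributes.
vanilla_cone = Individual("vanilla_cone", "IceCream", {"flavour": "vanilla"})
chocolate_chips = Individual("chocolate_chips", "Topping")

# Relationships between individuals, written as subject-predicate-object.
RELATIONSHIPS = [("vanilla_cone", "hasTopping", "chocolate_chips")]


def is_instance_of(individual, cls):
    """Check class membership, following superclass links upward."""
    current = individual.cls
    while current is not None:
        if current == cls:
            return True
        current = SUPERCLASS[current]
    return False


print(is_instance_of(vanilla_cone, "Dessert"))
```

The membership check shows how a class hierarchy makes implicit knowledge explicit: nothing states directly that the cone is a dessert, yet it follows from IceCream being a kind of Dessert.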
Knowledge Representation - Ontology
◦ How much knowledge do you have about ice cream?
Knowledge Representation - Ontology
Web Ontology Language (OWL) is a family of knowledge
representation languages for authoring ontologies.
Built upon a W3C XML standard for objects called the
Resource Description Framework (RDF).
Computational logic-based language, exploited by computer
programs, e.g., to verify the consistency of that knowledge or
to make implicit knowledge explicit.
Knowledge Acquisition
Where does the knowledge come from?
Manual
◦ Written by experts.
Automated
◦ Gathered from what experts have written.
◦ Allow aggregation, consolidation and organization for better usage.
◦ Allow enhancement like semantic annotation, classification.
Knowledge Acquisition
◦ Knowledge acquisition is the process of extracting, structuring and
organizing knowledge from one source, usually human experts.
◦ Extraction
◦ Get resource from texts.
◦ Structuring
◦ Annotate the resource.
◦ Organizing
◦ Store the resource in representation like ontology.
Knowledge Acquisition
Knowledge can be extracted from
Unstructured Text
◦ Web pages
◦ Article
◦ Scanned document
Semi Structured Text
◦ XML
◦ Excel
◦ CSV
◦ BIB
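Semi-structured sources are the easier case, because the column names can serve directly as properties. The sketch below, using Python's standard csv module, turns each CSV row into triples; the column names, the sample row and the function name csv_to_triples are assumptions for illustration.

```python
import csv
import io

# Hypothetical CSV export describing one lecture video.
CSV_TEXT = (
    "url,instructor,title\n"
    "https://www.youtube.com/watch?v=HyUK5RAJg1c,Shai Simonson,Lecture 1\n"
)


def csv_to_triples(csv_text, subject_column):
    """Turn each row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = row[subject_column]
        for column, value in row.items():
            # Every other non-empty column becomes a property of the subject.
            if column != subject_column and value:
                triples.append((subject, column, value))
    return triples


for triple in csv_to_triples(CSV_TEXT, "url"):
    print(triple)
```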
Knowledge Acquisition
Extraction from unstructured text
◦ Can you differentiate between Person and Organization?
Knowledge Acquisition
Extracting aspect and sentiment from a sentence.
Use Part of Speech Tagging.
Review sentence:
The room is beautiful.
POS tagged sentence:
The/DT room/NN is/VBZ beautiful/JJ ./.
Representing the acquired knowledge:
RDF triple (subject, predicate, object): (room, hasSentiment, beautiful)
General simple rule (R1):
+.*(/nn1) +.*(/jj1) +
Mapping of aspect and opinion (M1):
map (nn1, jj1)
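The noun-followed-by-adjective idea can be sketched as below. This is one possible reading of the rule, not the lecturer's implementation: it assumes the sentence is already POS tagged in word/TAG form with tokens separated by spaces, takes the first noun (NN*) as the aspect and the first adjective (JJ*) after it as the opinion, and returns the result in subject-predicate-object order. The function names are illustrative.

```python
def parse_tagged(sentence):
    """Split 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        word, _, tag = token.rpartition("/")  # split on the last slash
        pairs.append((word, tag))
    return pairs


def extract_aspect_sentiment(tagged_sentence):
    """Pair the first noun with the first adjective that follows it."""
    tokens = parse_tagged(tagged_sentence)
    for i, (word, tag) in enumerate(tokens):
        if tag.startswith("NN"):
            for later_word, later_tag in tokens[i + 1:]:
                if later_tag.startswith("JJ"):
                    return (word, "hasSentiment", later_word)
    return None  # no noun followed by an adjective


tagged = "The/DT room/NN is/VBZ beautiful/JJ ./."
print(extract_aspect_sentiment(tagged))
```

A rule this simple will mispair aspects and opinions in longer sentences, which is one reason extraction from unstructured text remains an open research area.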
Knowledge Acquisition – Road Ahead
Too much knowledge out there to be acquired.
Lots of research opportunities, especially in:
Converting unstructured resources to structured resources
Identifying relations in a resource
Identifying implicit meaning in a resource
Contact
Gan Keng Hoon
khganATusm.my
Visit our works at
ir.cs.usm.my
Picture Source: http://www.mindonsolutions.com
