2. Semantic Web
The Semantic Web is an extension of the current web in which information is
given well-defined meaning, better enabling computers and people to work in
cooperation (Tim Berners-Lee, 2001).
Semantics is the study of meaning: it focuses on the relation between
signifiers, such as words, phrases, signs, and symbols, and what they stand for.
3. Semantic web problems
Too much web information
Around 1,000,000,000 (1×10^9) resources.
Many different types of resources
• Texts, images, graphics
• Audio, video, multimedia
• Databases, web applications
4. Semantic web problems
Information is not indexable
No common scheme for doing so.
Differing relationships between authors, publishers, information
intermediaries, and users.
Each community uses its own approach.
Information is not shareable
Difficult to share information about information (metadata).
No common catalog scheme.
6. Second issue
A language for expressing metadata must be:
–universal (so all can understand)
–flexible (to incorporate different types)
–extensible (open to custom types)
–simple (to encourage adoption)
–modular (so that schemes can be mixed and extended)
8. RDF
RDF stands for Resource Description Framework.
It is a framework for machine-understandable metadata.
RDF is a graphical formalism (+ XML syntax + semantics):
–for representing metadata
–for describing the semantics of information in a machine-accessible way
9. Resource Description Framework (RDF)
RDF is a language for representing resources:
A resource can be anything.
In the context of the web, the focus is on web resources, i.e. anything
that can be located via a URL (Uniform Resource Locator).
The basic building block is the statement (or triple).
One of the main applications: data integration.
10. RDB2RDF
The adaptation of the relational model to the web gave rise to RDF.
From tuples to triples.
Any relational data can be represented as triples:
Row key → subject
Column → property / relation
Value → value / object (literal)
Statement: subject → property → value
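The tuple-to-triple mapping can be sketched in a few lines of Python. The row contents, the key column "id", and the "ex:" prefix are made up for illustration; this is a minimal sketch of the idea, not a standard RDB-to-RDF mapping.

```python
def row_to_triples(row, key_column, base="ex:"):
    """Turn one relational row (a dict) into (subject, property, value) triples."""
    # The row key becomes the subject of every triple for this row.
    subject = f"{base}{row[key_column]}"
    return [
        (subject, f"{base}{column}", value)  # column -> property, cell -> value
        for column, value in row.items()
        if column != key_column and value is not None
    ]

# Hypothetical row from a "person" table keyed by "id".
row = {"id": "p1", "name": "Alice", "age": 30}
triples = row_to_triples(row, key_column="id")
```

Each non-key cell yields exactly one statement, so a row with n non-null, non-key columns produces n triples that share the same subject.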
11. RDF2Rules
Rule Learning Approach
Learning Rules from RDF Knowledge Bases by Mining Frequent
Predicate Cycles
RDF2Rules first mines frequent predicate cycles (FPCs), a kind of interesting
frequent pattern in knowledge bases, and then generates rules
from the mined FPCs.
It uses entity type information when generating and evaluating
rules.
12. Quality of RDF KB
To enrich the knowledge in an RDF KB,
information extraction techniques are usually used to extract more entities
and their relations from plain or semi-structured text.
Another way to expand a KB is to infer new facts from the existing ones by using
inference rules, for example:
hasChild(A,B) ∧ hasSpouse(A,C) ⇒ hasChild(C,B)
[Figure labels: Entities, Facts]
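How the rule hasChild(A,B) ∧ hasSpouse(A,C) ⇒ hasChild(C,B) expands a KB can be sketched as follows. The facts are invented for illustration, and the function hard-codes this one rule rather than implementing a general rule engine.

```python
def apply_spouse_rule(facts):
    """Return hasChild facts implied by the rule that are not already known."""
    children = {(s, o) for s, p, o in facts if p == "hasChild"}
    spouses = {(s, o) for s, p, o in facts if p == "hasSpouse"}
    inferred = {
        (c, "hasChild", b)      # head of the rule: hasChild(C, B)
        for a, b in children    # body atom: hasChild(A, B)
        for a2, c in spouses    # body atom: hasSpouse(A, C)
        if a == a2
    }
    return inferred - set(facts)

# Hypothetical facts as (subject, predicate, object) triples.
facts = {
    ("Ann", "hasChild", "Bob"),
    ("Ann", "hasSpouse", "Carl"),
}
new_facts = apply_spouse_rule(facts)
```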
13. RDF graph
RDF is a graph-based data model:
a set of RDF triples constitutes an RDF
graph.
nodes ↔ resources
directed edges ↔ predicates
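A set of triples can be read as a directed, labeled graph. The sketch below stores it as an adjacency list; the "ex:" names are invented for illustration.

```python
from collections import defaultdict

def build_graph(triples):
    """Map each subject node to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))  # directed edge s --p--> o
    return graph

# Hypothetical triples.
triples = [
    ("ex:Ann", "ex:spouse", "ex:Carl"),
    ("ex:Ann", "ex:child", "ex:Bob"),
    ("ex:Carl", "ex:child", "ex:Bob"),
]
graph = build_graph(triples)
# Every subject and every object is a node of the graph.
nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
```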
14. RDF graph (Nodes)
There are three kinds of nodes (resources) in an RDF graph:
IRIs are global identifiers for resources such as people, organizations, and places;
Literals are basic values, including strings, dates, and numbers;
Blank nodes represent resources without global identifiers.
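One way to keep the three node kinds apart in code is a small tagged representation. The class names and the example.org IRIs below are illustrative assumptions, not the API of any particular RDF library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IRI:
    value: str      # global identifier for a resource

@dataclass(frozen=True)
class Literal:
    value: object   # basic value: string, date, number, ...

@dataclass(frozen=True)
class BlankNode:
    label: str      # local label only, no global identifier

def node_kind(node):
    """Return the kind of a node as a string."""
    return type(node).__name__

# Hypothetical statement: the resource "ann" has age 42.
triple = (IRI("http://example.org/ann"), IRI("http://example.org/age"), Literal(42))
address = BlankNode("b0")
```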
15. Path and Cycle
Path
A path in an RDF KB G = (E, P, T), with entities E, predicates P, and triples T, is a sequence of consecutive entities and predicates.
Cycle
A cycle in an RDF graph is a special path that starts and ends at the same node.
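These two definitions can be checked mechanically. The sketch below represents a path as an alternating list of entities and predicates and assumes that an edge may be traversed in either direction; the triples are invented for illustration.

```python
def is_path(seq, triples):
    """Check that seq = [e0, p0, e1, p1, ..., en] follows triples in the KB.

    Assumption: an edge may be traversed in either direction.
    """
    if len(seq) < 3 or len(seq) % 2 == 0:
        return False
    for i in range(0, len(seq) - 2, 2):
        a, p, b = seq[i], seq[i + 1], seq[i + 2]
        if (a, p, b) not in triples and (b, p, a) not in triples:
            return False
    return True

def is_cycle(seq, triples):
    """A cycle is a path that starts and ends at the same node."""
    return is_path(seq, triples) and seq[0] == seq[-1]

# Hypothetical KB.
triples = {
    ("Ann", "spouse", "Carl"),
    ("Carl", "child", "Bob"),
    ("Ann", "child", "Bob"),
}
path = ["Ann", "spouse", "Carl", "child", "Bob"]
cycle = path + ["child", "Ann"]
```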
16. Predicate path and Cycle
Predicate path
A predicate path is a sequence of entity variables and predicates.
Predicate cycle
A predicate cycle is a special predicate path that starts and ends at the same entity
variable.
17. FPC (Frequent Predicate Cycle)
The interesting patterns represented by predicate cycles can be used to infer new
facts in KBs:
dbo:spouse(x1,x2) ∧ dbo:children(x2,x3)
⇒ dbo:children(x1,x3)
dbo:children(x1,x3) ∧ dbo:children(x2,x3)
⇒ dbo:spouse(x1,x2)
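Both rules come from the same predicate cycle over x1, x2, and x3. A simplified way to see this: treat the cycle as a list of atoms and let each atom in turn be the rule head, with the remaining atoms as the body. This sketch ignores edge directions and the rule evaluation step, so it should not be read as the full RDF2Rules procedure.

```python
def rules_from_cycle(atoms):
    """Generate one candidate rule per atom: that atom is the head, the rest the body."""
    rules = []
    for i, head in enumerate(atoms):
        body = atoms[:i] + atoms[i + 1:]
        rules.append((body, head))
    return rules

def format_atom(atom):
    pred, x, y = atom
    return f"{pred}({x},{y})"

def format_rule(rule):
    body, head = rule
    return " ∧ ".join(format_atom(a) for a in body) + " ⇒ " + format_atom(head)

# The predicate cycle behind the two rules on this slide.
cycle = [
    ("dbo:spouse", "x1", "x2"),
    ("dbo:children", "x2", "x3"),
    ("dbo:children", "x1", "x3"),
]
rules = [format_rule(r) for r in rules_from_cycle(cycle)]
```

A cycle with three atoms yields three candidate rules; which of them are kept would depend on how they are scored against the KB.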
18. FPC (Frequent Predicate Cycle)
The number of instances of a predicate path (cycle) that exist in the given RDF KB
is called its support. If the support of a predicate path (cycle) is not less than a
specified threshold, it is called a frequent predicate path (cycle).
Because frequent predicate cycles (FPCs) are patterns that appear frequently in the KB,
rules generated from FPCs tend to be reliable.
The proposed approach, RDF2Rules, therefore first mines frequent predicate cycles from
RDF KBs and then generates inference rules from the FPCs.
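Support can be computed by counting the variable bindings that satisfy every atom of a predicate path or cycle. The sketch below does this by brute force over an invented KB; it only illustrates the definition and makes no attempt at the efficient mining described in the approach.

```python
def instances(atoms, triples, binding=None):
    """Yield variable bindings that satisfy every atom (pred, x, y) in the KB."""
    binding = binding or {}
    if not atoms:
        yield dict(binding)
        return
    (pred, x, y), rest = atoms[0], atoms[1:]
    for s, p, o in triples:
        if p != pred:
            continue
        new = dict(binding)
        # Keep the triple only if it agrees with variables bound so far.
        if new.setdefault(x, s) != s or new.setdefault(y, o) != o:
            continue
        yield from instances(rest, triples, new)

def support(atoms, triples):
    """Number of instances of the predicate path (cycle) in the KB."""
    return sum(1 for _ in instances(atoms, triples))

def is_frequent(atoms, triples, threshold):
    """Frequent means the support is not less than the threshold."""
    return support(atoms, triples) >= threshold

# Hypothetical KB: two complete families and one couple without children.
triples = {
    ("Ann", "spouse", "Carl"), ("Carl", "children", "Bob"), ("Ann", "children", "Bob"),
    ("Dan", "spouse", "Eve"), ("Eve", "children", "Fay"), ("Dan", "children", "Fay"),
    ("Gus", "spouse", "Hal"),
}
cycle = [("spouse", "x1", "x2"), ("children", "x2", "x3"), ("children", "x1", "x3")]
```

Here the cycle has two instances (one per family), so it is frequent for a threshold of 2 but not for a threshold of 3.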
19. RDF INDEXING FOR MINING ALGORITHM
An in-memory indexing structure supports the mining algorithm,
instead of an existing RDF storage system. It answers three kinds of queries:
Given a predicate, find all the entity pairs that it connects;
Given an entity, find all its incident edges and its neighbor entities;
Given a predicate path, find all of its path instances.
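A minimal in-memory index covering these three lookups might look like the following. The class and method names are hypothetical, predicate paths are followed in the forward direction only, and the actual index used by RDF2Rules may be organized differently.

```python
from collections import defaultdict

class TripleIndex:
    """Hypothetical in-memory index over (subject, predicate, object) triples."""

    def __init__(self, triples):
        self.pairs_by_predicate = defaultdict(set)
        self.edges_by_entity = defaultdict(set)
        for s, p, o in triples:
            self.pairs_by_predicate[p].add((s, o))
            self.edges_by_entity[s].add((p, "out", o))
            self.edges_by_entity[o].add((p, "in", s))

    def pairs(self, predicate):
        """All entity pairs connected by the predicate."""
        return self.pairs_by_predicate.get(predicate, set())

    def incident_edges(self, entity):
        """All edges touching the entity, as (predicate, direction, other entity)."""
        return self.edges_by_entity.get(entity, set())

    def neighbors(self, entity):
        """All entities one edge away from the entity."""
        return {other for _, _, other in self.incident_edges(entity)}

    def path_instances(self, predicates):
        """Entity sequences that follow the given predicates, left to right."""
        if not predicates:
            return []
        paths = [(s, o) for s, o in self.pairs(predicates[0])]
        for p in predicates[1:]:
            paths = [path + (o,) for path in paths
                     for s, o in self.pairs(p) if s == path[-1]]
        return paths

# Hypothetical KB.
index = TripleIndex([
    ("Ann", "spouse", "Carl"),
    ("Carl", "children", "Bob"),
    ("Ann", "children", "Bob"),
])
```

Building both maps once trades memory for constant-time lookups by predicate or by entity, which is what a mining loop that repeatedly extends paths would rely on.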
20. RDF2Rules
RDF2Rules learns rules very quickly.
RDF2Rules consistently finds more rules.
It is evaluated on the quality of predictions and the running time.
Duplicate rules are removed through FPCs.
21. Conclusion
RDF is a simple, graph-based data model.
RDF has an extensible URI-based vocabulary.
Anyone can make statements about any resource (open world
assumption).
Rules are learned by finding frequent predicate cycles in RDF
graphs.
The approach is judged on the quality of predictions and the running time.
Editor's Notes
“You’ll be able to find and take pieces of data sets from different places, aggregate them without warehousing, and analyze them in a more straightforward, powerful way than you can now.” (PWC, May 2009)