2. Notetaking – not really necessary, just bookmark:
http://hyperdata.org/krdb2010
Questions – if I'm not clear any time, please raise your hand,
general questions at the end
Mobile phones – be discreet!
IRC -
Server: irc.freenode.net
Channel: #swig
available through a browser at: http://www.mibbit.com
Twitter tag - #krdb
(I'm @danja)
3. Objectives
To answer questions like :
● What is a platform?
● What are the benefits of using platforms?
● What is a Web platform?
● How can Semantic Web technologies contribute?
● How do different kinds of platforms compare, and
what analogies might be useful?
4. Part 1 : Platforms in General
● Defining “Platform”
● A Plethora of Platforms
● Working with Web Platforms
[a <http://dbpedia.org/resource/Coffee_break>]
Part 2 : Semantic Web Platforms
● Review of Semantic Web Technologies
● Semantic Web Platforms
● The Web as Platform
6. “Platform” (“Piattaforma”)
● a raised horizontal surface (palco)
● political program - a document stating the
aims and principles of a political party
● the combination of a particular computer
and a particular operating system
● weapons platform - any military structure
or vehicle bearing weapons
● platform shoe, chopine - a woman's shoe
with a very high thick sole (zeppa)
things which support something else
Source: WordNet
12. Working Definition
A platform is a system designed to keep developers
and users out of the mud and closer to heaven.
Una piattaforma è un sistema progettato per
mantenere gli sviluppatori e gli utenti fuori dal fango e
più vicino al cielo.
13. Layered Models
Layer n + 1
Supports Depends on
Layer n
Supports Depends on
Layer n - 1
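A minimal sketch of the layered model, assuming an illustrative stack of four named layers (the slide itself only shows n - 1, n and n + 1; the names and the helper functions `depends_on` and `supports` are hypothetical):

```python
# Illustrative layer names, bottom to top (an assumption, not from the slide).
LAYERS = ["Hardware", "Operating System", "Virtual Machine", "Applications"]


def depends_on(layer):
    """Return the layer directly below, or None for the bottom layer."""
    i = LAYERS.index(layer)
    return LAYERS[i - 1] if i > 0 else None


def supports(layer):
    """Return the layer directly above, or None for the top layer."""
    i = LAYERS.index(layer)
    return LAYERS[i + 1] if i + 1 < len(LAYERS) else None


print("Virtual Machine depends on:", depends_on("Virtual Machine"))
print("Virtual Machine supports:", supports("Virtual Machine"))
```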
14. A Plethora of Platforms
loose taxonomy:
● Abstract Platforms
● Workbench Platforms
● Dedicated Platforms
15. Abstract Platforms
What do they support? : Ideas
● branches of mathematics
e.g. geometry, logic
● the sciences
● human languages
● the arts
16. Geometry
Typical mode of use : modelling physical systems
Applications : surveying (earth-measuring),
architecture, engineering...
18. Logic
(propositional, just declarative statements)
[Diagram: AND gate (&) with inputs A and B and output C]
C = A∧B
Typical mode of use : modelling electronic systems
Applications : control circuits, building computers...
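The expression C = A∧B can be checked by listing its truth table. This is only a sketch; the function name `and_gate` is chosen here for illustration:

```python
from itertools import product


def and_gate(a: bool, b: bool) -> bool:
    """C = A AND B: true only when both inputs are true."""
    return a and b


# Enumerate every combination of the two inputs.
truth_table = [(a, b, and_gate(a, b)) for a, b in product([False, True], repeat=2)]

print("A B C")
for a, b, c in truth_table:
    print(int(a), int(b), int(c))
```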
19. Logic
(adding predicates and quantifiers)
(∃x)(∃y)(Go(x) ∧ Person(John) ∧ City(Boston) ∧ Bus(y)
∧ Agnt(x,John) ∧ Dest(x,Boston) ∧ Inst(x,y))
Typical mode of use : modelling physical systems
Applications : knowledge representation & processing
Source: John F. Sowa, http://www.jfsowa.com/krbook
20. Conceptual Graphs
(a dialect of Common Logic)
(∃x)(∃y)(Go(x) ∧ Person(John) ∧ City(Boston) ∧ Bus(y)
∧ Agnt(x,John) ∧ Dest(x,Boston) ∧ Inst(x,y))
Concepts :
Named Entities : John, Boston
Entity Types : Person, Go, City, Bus
Relations : Agnt (Agent), Dest (Destination), Inst (Instrument)
Source: John F. Sowa, http://www.jfsowa.com/krbook
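One way to hold the same graph in a program is as two sets of tuples, one for the typed concepts and one for the relations between them. This is a rough sketch, not a standard Conceptual Graph format; the helper names `type_of` and `describe` are made up for the example, and "x" and "y" stand for the existentially quantified variables:

```python
# Typed concepts: (type, individual or variable).
types = {("Go", "x"), ("Person", "John"), ("City", "Boston"), ("Bus", "y")}

# Relations: (relation, source, target).
relations = {("Agnt", "x", "John"), ("Dest", "x", "Boston"), ("Inst", "x", "y")}


def type_of(individual):
    """Look up the type asserted for an individual, or None if untyped."""
    return next((t for t, i in types if i == individual), None)


def describe(rel, source, target):
    """Render one relation as a labelled arc between two typed concepts."""
    return f"{type_of(source)}({source}) -{rel}-> {type_of(target)}({target})"


lines = sorted(describe(*r) for r in relations)
for line in lines:
    print(line)
```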
22. Natural Language to CGs
John is going to Boston by bus.
Informal
The person John is the agent of some instance of
going, the city Boston is the destination, and a bus is
the instrument.
Formal
23. Conceptual Graphs can be derived from
Natural Language.
Conceptual Graphs express knowledge
in a formal mathematical language.
But why should we care about
something so abstract?
24. Case Study : Legacy Re-engineering
Analyze software and documentation of a large corporation.
Generate :
● English glossary of all terms with pointers to the software
● Structure diagrams of the programs, files, and data
● List of discrepancies between software and documentation
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
25. Case Study : Legacy Re-engineering
Software
1.5 million lines of COBOL programs in daily use, some of
which are up to 40 years old
Documentation
100 megabytes of English reports, manuals, e-mails, Lotus
Notes, HTML, and program comments
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
26. COBOL Examples
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO-WORLD.
PROCEDURE DIVISION.
DISPLAY 'Hello, world'.
STOP RUN.
ADD YEARS TO AGE
age = age + years
“The use of COBOL cripples the mind; its teaching should, therefore, be regarded
as a criminal offense.” - Dijkstra
Source : http://en.wikipedia.org/wiki/Cobol
27. Case Study : Legacy Re-engineering
A major consulting firm had estimated that the job would take
40 people two years to analyze the documentation and
determine the cross references.
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
28. Case Study : Legacy Re-engineering
Approach
● Translate the COBOL programs to Conceptual Graphs
● Use the Conceptual Graphs from COBOL to interpret the
English
● Use the Analogy Engine to compare the graphs derived
from COBOL to the graphs derived from English
● Record the similarities and discrepancies
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
29. Case Study : Legacy Re-engineering
VivoMind Analogy Engine
Three methods of analogy:
1. Matching labels:
* Compare type labels on conceptual graphs.
2. Matching subgraphs:
* Compare subgraphs independent of labels.
3. Matching transformations:
* Transform subgraphs.
Source: John Sowa, http://www.jfsowa.com/talks/mitre.htm
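A toy illustration of the first method only (matching labels). It is not how VAE works internally, which the slides do not describe. The Jaccard overlap is a measure swapped in here for the example, and both sample graphs are invented:

```python
def type_labels(graph):
    """Collect the type labels of a graph given as (type, individual) pairs."""
    return {label for label, _ in graph}


def label_overlap(graph_a, graph_b):
    """Jaccard overlap of the two label sets: 1.0 means identical labels."""
    a, b = type_labels(graph_a), type_labels(graph_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical graphs, one derived from code and one from documentation.
from_cobol = {("File", "f1"), ("Record", "r1"), ("Program", "p1")}
from_english = {("File", "x"), ("Record", "y"), ("Month", "z")}

score = label_overlap(from_cobol, from_english)
print(score)
```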
30. Case Study : Legacy Re-engineering
Excerpt from the Documentation
The input file that is used to create this piece of the Billing
Interface for the General Ledger is an extract from the 61 byte file
that is created by the COBOL program BILLCRUA in the Billing
History production run. This file is used instead of the history file
for time efficiency. This file contains the billing transaction codes
(types of records) that are to be interfaced to General Ledger for
the given month.
For this process the following transaction codes are used: 32 —
loss on unbilled, 72 — gain on uncollected, and 85 — loss on
uncollected. Any of these records that are actually taxes are
bypassed. Only client types 01 — Mar, 05 — Internal
Non/Billable, 06 — Internal Billable, and 08 — BAS are selected.
This is determined by a GETBDATA call to the client file.
Note that none of the files or COBOL variables are named.
By matching the English graphs to the COBOL graphs, VAE
identified all the file names and COBOL variables involved.
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
31. Case Study : Legacy Re-engineering
Job finished in 8 weeks by two programmers, Arun Majumdar
and André LeClerc.
● Four weeks for customization: Design, ontology, and
additional programming for I/O formats.
● Three weeks to run English parser + VAE + extensions:
VAE handled matches with strong evidence (close semantic
distance). Matches with weak evidence were confirmed or
corrected by Majumdar and LeClerc.
● One week to produce a CD-ROM with integrated views
of the results: Glossary, data dictionary, data flow diagrams,
process architecture, system context diagrams.
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
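A back-of-envelope comparison of the consulting firm's estimate (40 people for two years) with the actual effort (two programmers for 4 + 3 + 1 weeks). The figure of 52 working weeks per year is an assumption made here, so the ratio is indicative only:

```python
WEEKS_PER_YEAR = 52  # assumption: no allowance for holidays

# Estimate from the consulting firm: 40 people for two years.
estimated_person_weeks = 40 * 2 * WEEKS_PER_YEAR

# Actual effort: two programmers for 4 + 3 + 1 weeks.
actual_person_weeks = 2 * (4 + 3 + 1)

ratio = estimated_person_weeks / actual_person_weeks
print(estimated_person_weeks, actual_person_weeks, ratio)
```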
32. Case Study : Legacy Re-engineering
Contradiction Found by VAE
From analyzing English documentation:
● Every employee is a human being.
● No human being is a computer.
From analyzing COBOL programs:
● Some employees are computers.
What is the reason for this contradiction?
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
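The three statements cannot all be true together. A brute-force sketch, assuming each individual is described by three yes/no properties, shows that no small "world" satisfies all of them (the function name `consistent` and the encoding are illustrative only):

```python
from itertools import product


def consistent(individuals):
    """Check the three statements against (is_employee, is_human, is_computer) tuples."""
    every_employee_is_human = all(h for e, h, c in individuals if e)
    no_human_is_computer = not any(h and c for e, h, c in individuals)
    some_employee_is_computer = any(e and c for e, h, c in individuals)
    return (every_employee_is_human
            and no_human_is_computer
            and some_employee_is_computer)


# All eight possible kinds of individual.
kinds = list(product([False, True], repeat=3))

# Every world containing one or two individuals.
worlds = [(k,) for k in kinds] + list(product(kinds, repeat=2))

satisfiable = any(consistent(w) for w in worlds)
print(satisfiable)
```

Any employee who is a computer must also be human (first statement), which the second statement forbids, so the search finds nothing.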
33. Case Study : Legacy Re-engineering
In 1979 a COBOL programmer made a quick patch :
● Two computers were used to assist human consultants.
● But there was no provision to bill for computer time.
● Therefore, the programmer named the computers Bob and
Sally, and assigned them employee IDs.
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
34. Case Study : Legacy Re-engineering
For more than 20 years:
● Bob and Sally were issued payroll checks.
● But they never cashed them.
VAE discovered the two computer “employees.”
Source: John Sowa, http://www.jfsowa.com/talks/iss.pdf
35. Why should we care about
abstract platforms?
- concrete benefits.
36. All models are wrong.
Some are useful.
- George E. P. Box
37. A Plethora of Platforms
loose taxonomy:
● Abstract Platforms
● Workbench Platforms
● Dedicated Platforms
38. Workbench Platforms
What do they support? :
tools and raw materials
...but end product will often be
indirect
40. A “Jig”
- a device that holds a piece of work and
guides the tools operating on it
54. Virtual Machine as Platform
- Java Style
User          | Developer
Applications  | Coding Tools
JRE           | JDK
JVM
Operating System
Hardware
55. Virtual Machine as Platform
- Squeak Style
Users and Developers
Applications & Coding Tools
System Image
VM
Operating System
Hardware
(See also : emacs)
62. Eclipse Platform
● Core functionality : fairly generic app stuff
● Built on a mechanism for discovering, integrating,
and running modules called plug-ins
● Plug-ins represented as bundles based on the
OSGi * specification
(* originally Open Services Gateway initiative)
83. HTML
<!DOCTYPE html>
<html>
...
<h2>
<a href="http://localhost/wordpress/?p=4">Hello localhost!</a>
</h2>
<p>This is some sample text which doesn’t really</p>
<p>say a lot</p>
…
</html>
84. RSS
...
<item>
<title>Hello localhost!</title>
<link>http://localhost/wordpress/?p=4</link>
<pubDate>Wed, 15 Sep 2010 14:31:01 +0000</pubDate>
<dc:creator>admin</dc:creator>
<description><![CDATA[This is some sample text which
doesn’t really say a lot]]></description>
</item>
...
85. System Characteristics :
Recording Studio
● Raw Input : instrument output/sounds
(various acoustic/electrical signals)
● Output : music (combined and more structured
acoustic/electrical signals) sometimes
● Processing : analog & digital signal processing and mixing
● Storage : computer filesystem
● User Interface : core DAW GUI, plus per-module UI
86. Data Characteristics : Recording Studio
“Models”
● Analog Signals
● Digital Signals
● Structured Recordings (multi-track time/amplitude)
Formats
● Audio data formats (wav, aiff, mp3, CD formats)
● MIDI file format
● Proprietary DAW multi-track format
Protocols
● Analog audio (various levels/impedances)
● MIDI protocol
87. System Characteristics :
Content Management System
● Raw Input : human-readable text + annotations
● Output : more structured text published as Web
resources (HTTP+HTML/RSS)
● Processing : text data structured into DB, converted into
markup
● Storage : SQL Database
● User Interface : HTML in Web browser (dashboard or output)
88. Data Characteristics :
Content Management System
Models
● DB Schema
● Markup
Formats
● HTML
● RSS
● Binary image formats
● SQL
● (PHP)
Protocols
● HTTP
89. Parallels can be drawn
between different kinds of
platforms.
So what?
90. Problem : Impedance mismatch
I want to connect electric guitar directly to mixer, but -
● Signal from electric guitar : 2 volts @ 10 kΩ impedance
● Signal expected by mixer is 20 mV @ 100 Ω
92. Impedance Matching Transformer
(DI Box)
Signal from electric guitar (10 kΩ output impedance)
→ Signal to mixer (100 Ω input)
(electricity - magnetic flux - electricity)
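A rough calculation of what such a transformer has to do, assuming an ideal (lossless) transformer, where the turns ratio is the square root of the impedance ratio. The impedances and the 2 volt source level come from the slides; the ideal model is an assumption and real DI boxes will differ, so the result is not meant to reproduce the mixer level quoted earlier:

```python
import math

source_impedance_ohms = 10_000  # guitar output impedance, from the slide
load_impedance_ohms = 100       # mixer input impedance, from the slide
source_voltage = 2.0            # volts, from the slide

# Ideal transformer: impedance scales with the square of the turns ratio.
turns_ratio = math.sqrt(source_impedance_ohms / load_impedance_ohms)

# The voltage is stepped down by the same ratio.
secondary_voltage = source_voltage / turns_ratio

print(turns_ratio, secondary_voltage)
```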
93. Problem : Impedance mismatch
I want to connect a blog feed directly to a particular aggregator,
but -
● Signal from blog is RSS 2.0
● Signal expected by aggregator is Atom
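The software counterpart of a DI box is a format converter. Below is a minimal sketch that maps one RSS 2.0 item to an Atom entry using only the Python standard library. The sample item is adapted from the RSS slide; the function name `rss_item_to_atom_entry` and the choice to reuse the link as the entry id are assumptions, and a real converter would also need feed-level metadata:

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

ATOM_NS = "http://www.w3.org/2005/Atom"

RSS_ITEM = """<item>
  <title>Hello localhost!</title>
  <link>http://localhost/wordpress/?p=4</link>
  <pubDate>Wed, 15 Sep 2010 14:31:01 +0000</pubDate>
  <description>This is some sample text which doesn't really say a lot</description>
</item>"""


def rss_item_to_atom_entry(item_xml: str) -> ET.Element:
    """Map the common RSS 2.0 item fields onto an Atom entry element."""
    item = ET.fromstring(item_xml)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = item.findtext("title")
    link = item.findtext("link")
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", href=link)
    # Assumption: reuse the link as the entry id.
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = link
    # RSS dates are RFC 822 style; Atom expects an RFC 3339 timestamp.
    updated = parsedate_to_datetime(item.findtext("pubDate")).isoformat()
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM_NS}}}summary").text = item.findtext("description")
    return entry


entry = rss_item_to_atom_entry(RSS_ITEM)
print(ET.tostring(entry, encoding="unicode"))
```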
103. Parallels can be drawn
between different kinds of
platforms.
So what?
Problems in one domain may
have been solved in another.
104. Problem : multitrack structures not
available outside DAW for other tools
● proprietary format only
● damn, need to talk to their developers
105. WordPress Export
“When you click the button below WordPress will create an XML
file for you to save to your computer.
This format, which we call WordPress eXtended RSS or WXR,
will contain your posts, pages, comments, custom fields,
categories, and tags.”
119. Semantic Web = Web of Data
Informed by :
● Traditional RDBMS & other kinds of data store
● Logics
● Grids
● “The Cloud”
● Hypertext and the Web!
120. Logical Base
● open world assumption
● uniform identifiers
● declarative sentences
121. What works on the Web?
● Uniform Identifiers (URIs)
– for resources
● Common Interface Protocol (HTTP)
● Standard Representation Formats
(notably HTML)
Altogether: a REST Configuration
122. A hyperlink
page.html --> home.html
<a href="http://example.org/home.html">
Home Page
</a>
123. Evolving the Link
page.html --home--> home.html
<a href="http://example.org/home.html"
rel="home">
Home Page
</a>
124. page --x:home--> home
<rdf:Description
rdf:about="http://example.org/page">
<x:home
rdf:resource="http://example.org/home" />
</rdf:Description>
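The step from a typed hyperlink to a statement can be sketched in a few lines: read the link, then emit a (subject, predicate, object) triple. The class name `LinkTriples` and the page URI passed to it are assumptions for the example; it uses only the standard library HTML parser, not an RDF toolkit, and leaves the `rel` value as a plain string rather than a full URI:

```python
from html.parser import HTMLParser


class LinkTriples(HTMLParser):
    """Collect (page, rel, href) triples from <a> elements that carry a rel."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs and "rel" in attrs:
            # Subject: the page holding the link; predicate: rel; object: target.
            self.triples.append((self.page_uri, attrs["rel"], attrs["href"]))


parser = LinkTriples("http://example.org/page.html")
parser.feed('<a href="http://example.org/home.html" rel="home">Home Page</a>')
print(parser.triples)
```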
138. Part 1 : Platforms in General
● Defining “Platform”
● A Plethora of Platforms
● Working with Web Platforms
[a <http://dbpedia.org/resource/Coffee_break>]
Part 2 : Semantic Web Platforms