the knowledge web
13 november 2008
ucla graduate school of
education and information studies
knowledge gaps
process failures
transaction costs
lost opportunities
is the answer more information?
many information
products advance
incrementally
the discovery process:
thanks to the products,
we already know a lot...
we need information innovations
and process innovations
to match product innovations.
1.
the “digital commons” represents a
methodology that lowers the cost and
increases the volume of transactions
at the “knowledge layer” of the net
does the ability to ask more
questions, faster, lead us to more
knowledge or just more data?
what’s different about
communications and computers?
1. we know stuff.
2. open networks.
content
code
physical
knowledge
content
code
physical
knowledge rights
“the commons”
“digital commons”
interoperability
low transaction costs
law and technology
user interface to copyright
140,000,000+ digital objects online
under our licenses
licenses “ported” to 50+ countries
integrated with Google, Yahoo,
Firefox, Microsoft Office...
2.
the digital commons is a stable
methodology to manage data,
materials, and content for science.
project development “do no harm”
funding
pro bono “running code”
community
“think market”
early focus on life sciences
exploring climate change, geospatial, elsewhere
what would move via the science network?
Open Access Content
making knowledge legally and
technically available for re-use and
composition into new knowledge.
we use digital
tools to replicate
paper technology
>1000 journals under CC
image from the public library of science
licensed to the public under CC-BY 3.0
PubMedCentral ~ 1,000,000 articles
permissions granted: 50,000
(6% of PMC legal for transformative use)
(0.003 of all PubMed records)
what do these
ideas mean in
a world of integrated data?
creative
work?
“So, out of all of this
discussion my
question is whether
ChemSpider is
Content or Data.” -
Antony Williams
“The motivation behind this memorandum is
interoperability of scientific data.”
is it legal?
1. Converge on the public domain by waiving all rights
based on intellectual property.
2. Converge on the public domain by waiving other
statutory or intellectual property rights.
3. Converge on the public domain by imposing no
contractual controls.
4. Provide for interoperation with databases not
available under the Protocol through open metadata.
a protocol, not a license.
conflicts with the protection instinct
the protection instinct is frequently an instinct to protect “freedom”
3.
we have to build infrastructure for data
into the web of documents that we have.
solves the legal problem
but not the
container problem.
web 2.0, science 3.0, what about
making Google work better?
over 200
years at
one paper/day
what you want is
a list of genes.
not a list of documents.
building a web for data:
the “semantic web”
making computers understand links between documents
Web page --links to--> Web page
making computers understand relationships between concepts
drinking coffee --causes--> feel awake
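One way to picture "relationships between concepts" is as subject / predicate / object statements that a program can filter. The sketch below is only an illustration under assumptions: the `Triple` type, the `objects_of` helper, and the placeholder page names are hypothetical and not part of the talk; only the coffee statement comes from the slide.

```python
from collections import namedtuple

# A statement is a (subject, predicate, object) triple.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Illustrative facts; the page names are placeholders (assumption).
TRIPLES = [
    Triple("web page A", "links to", "web page B"),
    Triple("drinking coffee", "causes", "feel awake"),
]


def objects_of(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [
        t.obj
        for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


print(objects_of(TRIPLES, "drinking coffee", "causes"))
```

The point of the typed link is that the question "what does drinking coffee cause?" gets an answer made of concepts, not a list of pages to read.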
we need a Domain Name System for concepts:
192.168.1.1 http://sciencecommons.org
coffee http://ontology.foo.org/coffee
use the web to
integrate information
from different places
and different names
“coffee”, “cafe”, “kopi”
all name one concept: coffee
http://ontology.foo.org/coffee
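The "Domain Name System for concepts" idea can be sketched as a lookup from local names to one shared identifier, much as DNS connects a host name and an address. This is a minimal sketch under assumptions: the table and the `resolve` and `same_concept` helpers are hypothetical, and the URI is the placeholder shown on the slide, not a real ontology.

```python
# Placeholder concept URI taken from the slide (not a live ontology).
COFFEE = "http://ontology.foo.org/coffee"

# Hypothetical label table: different names, one concept.
LABEL_TO_CONCEPT = {
    "coffee": COFFEE,
    "cafe": COFFEE,
    "kopi": COFFEE,
}


def resolve(label):
    """Map a local name to its shared concept URI, or None if unknown."""
    return LABEL_TO_CONCEPT.get(label.strip().lower())


def same_concept(a, b):
    """True when both labels resolve to the same known concept URI."""
    uri = resolve(a)
    return uri is not None and uri == resolve(b)


print(resolve("kopi"))
print(same_concept("cafe", "coffee"))
```

Once two sources agree on the URI, records filed under “cafe” in one place and “kopi” in another can be joined without either source renaming anything.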
[figure: concept graph around “coffee”.
nodes: bed, person, get out of bed, make coffee, pour coffee, pick up cup, drink, drink coffee, feel awake, feel jittery, open eyes, coffee, cafe, cup, sugar, wet.
relations: located at, located in, wants, does not want, causes, first subevent, last subevent, subevent, after, is a, is for, property of, often near]
(too much work for
coffee)
(distributed, networked
approaches start to
look pretty good)
Open Source
Data Integration
formatting digital knowledge into
modular building blocks for
composition into new knowledge.
what are the odds that the organizations making the
namespaces will be here in 50 years? 100 years?
Huntington’s
Parkinson’s
library
ALS
Multiple Sclerosis
Autism
“In any case, it is clear that a library containing all possible
books, arranged at random, is equivalent (as a source of
information) to a library containing zero books.”
http://en.wikipedia.org/wiki/The_Library_of_Babel
1. Books are for use.
2. Every reader his [or her] book.
3. Every book its reader.
4. Save the time of the User.
5. The library is a growing organism.
what’s the digital version of the
five laws?
call to action:
1. join up with the semantic web people - support discipline-driven
namespaces and ontologies.
2. queries are the interface - the average user doesn’t know
how to ask complicated questions on the research web.
3. make the library the hub of the research web.
thank you
wilbanks@creativecommons.org
http://sciencecommons.org