Slides for talk given at IWMW 1998 held at the University of Newcastle on 15-17 September 1998.
See http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-sep1998/materials/
1. 1
Deploying New Web Technologies
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Email Address: B.Kelly@ukoln.ac.uk
URL: http://www.ukoln.ac.uk/
UKOLN is funded by the British Library Research and Innovation Centre,
the Joint Information Systems Committee of the Higher Education Funding
Councils, as well as by project funding from the JISC’s Electronic Libraries
Programme and the European Union.
UKOLN also receives support from the University of Bath where it is based.
2. 2
Contents
• Background
• Web Developments:
• Data Formats
• Transport
• Addressing
• Metadata
• Deployment Issues
• Questions
Aims of Talk
• To give an overview of the Web architecture and Web standardisation
• To review new web developments
• To address implementation models
3. 3
Web and Standardisation
W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member and public review
• UK members include JISC, UKERNA, Southampton and Bristol
IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"
ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produces robust standards
Proprietary
• De facto standards
• Often initially appealing (cf PowerPoint)
• May emerge as standards
Examples shown on the slide:
PNG, HTML, Z39.50, Java?
PNG, HTML, HTTP
HTTP, URN
HTML extensions, PDF and Java?
4. 4
The Web Vision
Tim Berners-Lee's (and W3C's) vision for the
Web:
• Evolvability is critical
• Automation of information management:
If a decision can be made by machine, it should
• All structured data formats should be based on
XML
• Migrate HTML to XML
• All logical assertions to map onto RDF model
• All metadata to use RDF
See keynote talk at WWW 7 conference at
<URL: http://www.w3.org/Talks/1998/0415-Evolvability/slide1-1.htm>
5. 5
Web Protocols
Web initially based on three
simple protocols:
• Data Formats
HTML (HyperText Markup Language) provides the data format for native documents
• Addressing
URLs (Uniform Resource Locators) provide an addressing mechanism for web resources
• Transport
HTTP (HyperText Transfer Protocol) defines transfer of resources between client and server
[Diagram: Data Format (HTML); Addressing (URL); Transport (HTTP)]
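As a rough illustration of how the three pieces fit together, the sketch below takes a URL (addressing), builds the text of a minimal HTTP/1.0 request for it (transport), and shows the kind of HTML document (data format) a server might send back. The function name build_get_request and the sample HTML are assumptions made for this example; no network access is performed.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.0 GET request for `url`."""
    parts = urlsplit(url)          # Addressing: split the URL into components
    path = parts.path or "/"       # an empty path means the server root
    if parts.query:
        path += "?" + parts.query
    # Transport: request line, a Host header, then a blank line ending the headers
    return f"GET {path} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"


# Data format: a tiny HTML document of the kind a server might return (illustrative).
SAMPLE_HTML = "<HTML><HEAD><TITLE>UKOLN</TITLE></HEAD><BODY><H1>UKOLN</H1></BODY></HTML>"

print(build_get_request("http://www.ukoln.ac.uk/"))
```

The request is only built as a string here; sending it over a socket is left out to keep the sketch self-contained.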
6. 6
HTML History
HTML 1.0 (1992): Unpublished specification.
HTML 2.0 (1994): Spec. based on innovations from NCSA (forms and inline images!)
HTML 3.0 (1995): Proposed spec. (renamed from HTML+). Very comprehensive. Failed to complete IETF standardisation. Little implementation experience.
Proprietary (1994-5): Introduction of proprietary HTML elements by Netscape and Microsoft
HTML 3.2 (1997): Spec. based on description of mainstream innovations in marketplace
HTML 4.0 (1998): Current recommendation
Dilemma
Proprietary extensions cause problems.
But experiments are needed
7. 7
HTML 4.0, CSS 2.0 and DOM
HTML 4.0 used in conjunction with CSS 2.0
(Cascading Style Sheets) and the DOM provides an
architecturally pure, yet functionally rich environment
HTML 4.0 : W3C-Rec
• Improved forms
• Hooks for stylesheets
• Hooks for scripting
languages
• Table enhancements
• Better printing
CSS 2.0 : W3C-Rec
• Support for all HTML
formatting
• Positioning of HTML
elements
• Multiple media support
CSS Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with
known bugs
DOM : W3C-Rec
• Document Object Model
• Hooks for scripting
languages
• Permits changes to
HTML & CSS properties
and content (DHTML)
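The idea of changing content and style properties through the DOM can be sketched outside a browser. The example below uses Python's standard xml.dom.minidom module as a stand-in for a browser's DOM implementation (in a browser this would normally be done from a scripting language). The element names, the ID value and the style string are illustrative assumptions.

```python
from xml.dom.minidom import parseString

# minidom is an XML parser, so the fragment must be well formed (all tags closed).
doc = parseString('<BODY><H1 ID="title">Old heading</H1></BODY>')

heading = doc.getElementsByTagName("H1")[0]
heading.firstChild.data = "New heading"        # change the content of the element
heading.setAttribute("STYLE", "color: red")    # change a style property

print(heading.toxml())
```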
8. 8
HTML Limitations
HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
– Time-consuming standardisation process (<ABBREV>)
– Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
– Covers specialist area (maths, music, ...)
– Application-specific (<STUD-NUM>)
• HTML is a display (output) not storage format
• HTML's lack of arbitrary structure limits
functionality:
– Find all memos copied to John Smith
– How many unique tracks on Jackson Browne CDs
9. 9
XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc)
• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• Various XML DTDs already agreed (MathML, CML)
• Support in Netscape 5 and IE 5
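Arbitrary elements make structured queries possible, such as "find all memos copied to John Smith". The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The MEMO, TO, CC and SUBJECT element names and the sample data are invented for illustration and do not come from any agreed DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo markup; element names are assumptions for this sketch.
MEMOS = """
<MEMOS>
  <MEMO><TO>Ann Jones</TO><CC>John Smith</CC><SUBJECT>Budget</SUBJECT></MEMO>
  <MEMO><TO>John Smith</TO><SUBJECT>Timetable</SUBJECT></MEMO>
  <MEMO><TO>Bob Brown</TO><CC>John Smith</CC><CC>Ann Jones</CC><SUBJECT>Servers</SUBJECT></MEMO>
</MEMOS>
"""


def memos_copied_to(xml_text: str, name: str) -> list[str]:
    """Return the subjects of all memos that list `name` in a CC element."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("SUBJECT")
        for memo in root.findall("MEMO")
        if any(cc.text == name for cc in memo.findall("CC"))
    ]


print(memos_copied_to(MEMOS, "John Smith"))
```

The same question cannot be answered reliably from HTML, where "copied to" would be indistinguishable from any other paragraph of text.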
10. 10
XML Deployment
Ariadne issue 15 has an article on "What Is XML?"
Describes how XML support can be provided:
• Natively by new browsers
• Back end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML
Examples of intermediaries
See http://www.ariadne.ac.uk/issue15/what-is/
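Back end conversion of XML to HTML can be sketched in a few lines. The example below is only an illustration, assuming a hypothetical STUDENT record with NAME and STUDENT-NUMBER elements; a real service would more likely be driven by a stylesheet than by hand-written code.

```python
import xml.etree.ElementTree as ET
from html import escape


def student_record_to_html(xml_text: str) -> str:
    """Convert a hypothetical <STUDENT> record into a fragment of HTML."""
    record = ET.fromstring(xml_text)
    # escape() protects against markup characters appearing in the data
    name = escape(record.findtext("NAME", default=""))
    number = escape(record.findtext("STUDENT-NUMBER", default=""))
    return f"<H1>{name}</H1>\n<P>Student number: {number}</P>"


SOURCE = "<STUDENT><NAME>A. Student</NAME><STUDENT-NUMBER>12345</STUDENT-NUMBER></STUDENT>"
print(student_record_to_html(SOURCE))
```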
11. 11
XLink, XPointer and XSL
XLink will provide sophisticated
hyperlinking missing in HTML:
• Links that lead user to multiple destinations
• Bidirectional links
• Links with special behaviors:
– Expand-in-place / Replace / Create new window
– Link on load / Link on user action
• Link databases
XPointer will provide access to arbitrary portions of an XML resource.
Interesting IPR issues!
XSL stylesheet language will provide extensibility and
transformation facilities (e.g. create a table of contents)
<commentary xml:link="extended" inline="false">
<locator href="smith2.1" role="Essay"/>
<locator href="jones1.4" role="Rebuttal"/>
<locator href="robin3.2" role="Comparison"/>
</commentary>
12. 12
Addressing
URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/latest.html)
have limitations:
• Lack of long-term persistency
– Organisation changes name
– Department shut down / merged
– Directory structure reorganised
• Inability to support multiple versions of resources
(mirroring)
URNs (Uniform Resource Names):
• Proposed as solution
• Difficult to implement (no W3C activity in this area)
13. 13
Addressing - Solutions
DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a solution
• Aimed at supporting rights ownership
• Business model needed
PURLs (Persistent URLs):
• Provide single level of redirection
Pragmatic Solution:
• URLs don't break - people break them
• Design URLs to have long life-span
Further information:
<URL: http://www.ukoln.ac.uk/metadata/resources/urn/>
<URL: http://hosted.ukoln.ac.uk/biblink/wp2/links.html>
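The single level of redirection that PURLs provide can be sketched as a lookup table held by the PURL server: the published name stays fixed, and only the table entry changes when the resource moves. The paths, the table contents and the function name below are hypothetical, and the new location uses a reserved example domain.

```python
# Hypothetical PURL table: persistent path -> current location of the resource.
PURL_TABLE = {
    "/net/music/latest": "http://www.bristol-poly.ac.uk/depts/music/latest.html",
}


def resolve_purl(path: str, table: dict) -> tuple:
    """Return (status, headers) as a redirecting server might answer."""
    target = table.get(path)
    if target is None:
        return 404, {}
    return 302, {"Location": target}   # one redirect to wherever the resource lives now


print(resolve_purl("/net/music/latest", PURL_TABLE))

# The organisation changes its name: only the table entry is updated,
# so links to the persistent path keep working.
PURL_TABLE["/net/music/latest"] = "http://www.new-name.example/music/latest.html"
print(resolve_purl("/net/music/latest", PURL_TABLE))
```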
14. 14
Transport
HTTP/0.9 and HTTP/1.0:
Design flaws and implementation problems
HTTP/1.1:
Addresses some of these problems
60% server support
Performance benefits! (60% packet traffic reduction)
Is acting as fire-fighter
Not sufficiently flexible or extensible
HTTP/NG:
Radical redesign using object-oriented technologies
Undergoing trials
Gradual transition (using proxies)
15. 15
Metadata
Metadata - the missing architectural component from the initial implementation of the web
[Diagram: Metadata / RDF (PICS, TCN, MCF, DSig, DC, ...); Addressing (URL); Data format (HTML); Transport (HTTP)]
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management
16. 16
Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 2.0 will be based on RDF and will support signed assertions:
– This page is from the University of Bath
– This page is a legally-binding list of courses
provided by the University
P3P (Platform for Privacy Preferences):
• Developing methods for exchanging Privacy Practices of Web sites and users
Note that discussions about additional rights
management metadata are currently taking place
17. 17
RDF
RDF (Resource Description Framework):
• Highlight of WWW 7 conference
• Provides a metadata framework ("machine
understandable metadata for the web")
• Based on ideas from content rating (PICS),
resource discovery (Dublin Core) and site mapping
• Based on a formal data model (directed labelled graphs)
• Applications include:
– cataloging resources
– resource discovery
– electronic commerce
– intelligent agents
– digital signatures
– content rating
– intellectual property rights
– privacy
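The graph data model behind RDF can be approximated as a set of (resource, property, value) statements, each one a labelled arc from a resource to a value. The sketch below is a simplification under that assumption: it uses Dublin Core style property names, and the titles and publisher values are illustrative rather than real catalogue data. It does not use RDF's own syntax.

```python
# Each statement is (resource, property, value): one labelled arc in the graph.
# The values are illustrative assumptions, not real metadata records.
TRIPLES = [
    ("http://www.ukoln.ac.uk/", "dc:title", "UKOLN home page"),
    ("http://www.ukoln.ac.uk/", "dc:publisher", "UKOLN"),
    ("http://www.ukoln.ac.uk/web-focus/", "dc:title", "UK Web Focus"),
    ("http://www.ukoln.ac.uk/web-focus/", "dc:publisher", "UKOLN"),
]


def describe(resource: str, triples: list) -> dict:
    """Collect every property recorded for one resource."""
    return {prop: value for subj, prop, value in triples if subj == resource}


def resources_with(prop: str, value: str, triples: list) -> list:
    """Resource discovery: find resources carrying a given property value."""
    return [subj for subj, p, v in triples if p == prop and v == value]


print(describe("http://www.ukoln.ac.uk/web-focus/", TRIPLES))
print(resources_with("dc:publisher", "UKOLN", TRIPLES))
```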
18. 18
Browser Support for RDF
Mozilla (Netscape's source code release) provides support for RDF.
Mozilla supports site maps in RDF, as well as bookmarks and history lists.
See Netscape's or HotWired home page for a link to the RDF file.
[Figure labels: Trusted 3rd Party Metadata; Embedded Metadata (e.g. sitemaps)]
Image from http://purl.oclc.org/net/eric/talks/www7/devday/
19. 19
Deployment Issues
Various interesting new technologies have
been outlined
How can they be deployed in our environment?
Should we:
• Ignore them?
• Accept them fully?
• Accept them partly?
20. 20
Ignore New Developments
We can choose to ignore new developments,
and continue to use HTML 3.2:
Safe option, with no new training, support or
software costs
Experience in effectiveness, limitations, etc.
Fails to address current performance problems
Fails to address accessibility problems
Fails to provide new functionality
Service likely to look "old-fashioned" compared
with competition
21. 21
Fully Accept New Developments
We can choose to move fully to, say, HTML 4.0 and CSS 2.0:
Can be exciting to be at leading edge
Performance benefits
Accessibility benefits
Based on open standards
Provides motivation for users to upgrade browsers
Likely to be the solution at some point (cf. Gopher)
Backwards compatibility problems with old browsers
Costly to deploy new authoring tools, training, ...
Likely to be bugs and incompatibilities with new tools and browsers
22. 22
Implement "Safe" Solutions
An alternative is to use "safe" technologies
which are backwards compatible and avoid
major browser bugs
Attractive sounding compromise position
Lose some functionality, but not all
Can be difficult or expensive to find "safe" options
(does .margin-left work on IE on SGI?)
Tools may not allow safe options to be chosen
Lack of validation tools for checking conformance with a restricted subset of a specification
Note: See <URL: www.webreview.com/guides/style/insafegrid.htm> for unsafe CSS 2.0 properties
23. 23
Decision Time
Which would you opt for?
Stick with current technologies
Cheap, default option. Continuation of
performance and accessibility problems.
Unlikely to be long term solution.
Deploy new technologies
More expensive option. Functionality,
performance and accessibility benefits. Access
problems for old browsers.
Use "safe" new technologies
May require home-grown tools and support.
Avoids some of the problems of other solutions
24. 24
An Alternative
An alternative approach to deploying new
technologies is available:
• Use more intelligent server-side software
• Use "proxies" to address limitations of
browser technologies. The term intermediary
was used in a paper [1] at the WWW 7
conference to describe this approach
• Protocol solutions, such as Transparent
Content Negotiation (TCN)
[1] "Intermediaries: New Places For Producing and
Manipulating Web Content"
25. 25
Intelligent Server Software
Simple model:
• Server receives request for resource
• Server delivers resource to client
More sophisticated model:
• Server receives request for resource
• Server processes header information from client
• Server delivers resource to client based on client
information
This is referred to as browser-sniffing or user-agent
negotiation
Note that server support is now available in Apache and
in server add-ons such as PHP/FI and MS Active
Server Pages (ASP)
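The more sophisticated model can be sketched as a function that inspects the User-Agent header and picks the resource to deliver. This is a minimal illustration, not how any particular server implements it; the style sheet file names are hypothetical.

```python
def choose_stylesheet(user_agent: str) -> str:
    """Pick a style sheet based on the client's User-Agent header."""
    ua = user_agent or ""
    # Internet Explorer also identifies itself as "Mozilla/...", so test for it first.
    if "MSIE" in ua:
        return "style-ie.css"          # hypothetical file name
    if ua.startswith("Mozilla/"):
        return "style-netscape.css"    # hypothetical file name
    return "style-basic.css"           # safe default for unknown browsers


for agent in (
    "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)",
    "Mozilla/4.05 [en] (Win95; I)",
    "Lynx/2.8",
):
    print(agent, "->", choose_stylesheet(agent))
```

The order of the tests matters: checking for "Mozilla/" first would send the Netscape style sheet to Internet Explorer as well.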
26. 26
Portion of CSS file for IE
Total 797 lines
W3C CSS Gallery
W3C have a link to a core
style sampler service.
The service provides 8 core
style sheets which can be
freely linked to.
The style sheets use "browser
sniffing". Different style
sheets are delivered to
different browsers.
H1, H2, H3, H4, H5, H6, ..
{color: black;
background: white}
Portion of CSS file for Netscape
Total 169 lines
H1
{font-family: Tahoma, ...
font-size-adjust: .53;
margin-top: 1.33em;
font-weight: 500; ...}
27. 27
Java Intermediaries
Netscape and Internet
Explorer don't support
MathML
Who cares? MathML
Java renderers are
available
This concept can be
generalised to deploying
support for other new
markup languages.
For example see the Displets work at
http://www.cs.unibo.it/~fabio/displet/
28. 28
Deploying URNs
Problem
Today's browsers can't process URNs, such as:
urn:doi:10.1000/1
Possible Solution
• A separate program could resolve URNs into URLs
• Andy Powell (UKOLN) has demonstrated use of
Netscape's autoproxy to pass on URNs of the
format above to Squid for resolution [1]
• Example of use of an intermediary to deploy new
technologies not supported by current browsers
[1] "Resolving DOI Based URNs Using Squid" at
http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/june98/06powell.html
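The resolution step itself, turning a URN of the form shown above into a URL, can be sketched as follows. This is not the autoproxy and Squid configuration described in [1]; it only illustrates the mapping a separate resolver program might perform. The resolver base URL is a placeholder on a reserved example domain.

```python
# Hypothetical table: URN namespace -> base URL of a resolution service.
RESOLVERS = {
    "doi": "http://resolver.example.org/doi/",
}


def urn_to_url(urn: str) -> str:
    """Map a URN such as urn:doi:10.1000/1 onto a URL a browser can fetch."""
    parts = urn.split(":", 2)          # scheme, namespace, namespace-specific string
    if len(parts) != 3 or parts[0].lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    namespace, identifier = parts[1].lower(), parts[2]
    base = RESOLVERS.get(namespace)
    if base is None:
        raise ValueError(f"no resolver known for namespace {namespace!r}")
    return base + identifier


print(urn_to_url("urn:doi:10.1000/1"))
```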
29. 29
Intermediaries
Intermediaries:
• Enable new functionality to be introduced to the
web without extending the client or the server
• Intermediaries can be implemented using proxies
• Intermediaries can be used for applications such
as web personalisation, document caching,
content distillation and protocol extension
• Demonstration available using WBI (Web Browser
Intelligence)
• See <URL: http://wwwcssrv.almaden.ibm.com/wbi/>
• Another example for web accessibility at <URL:
http://www.inf.ethz.ch/department/IS/ea/blinds/>
30. 30
Web Applications
An Example
• We're familiar with HTML
validation services
(e.g. HENSA mirror)
• We can "go there" and use the service
• We can also have a link from the page which
will run the service (rather than just go to the
form)
• Consider:
– Web page is in Bath
– User is in Sheffield
– Application is in Kent
• An example of a web (intermediary?)
application
31. 31
Examples
Examples of remote web
applications include:
• Link checking
• Website analysis
• Document format
conversion
• Accessibility support
Imagine an intermediary service which called an XML to HTML conversion service if the browser didn't support XML
http://www.ukoln.ac.uk/web-focus/webwatch/services/url-info/
http://wheel.compose.cs.cmu.edu:8001/cgi-bin/browse/objweb
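The imagined intermediary that converts XML only for browsers lacking XML support might look something like the sketch below. It assumes the browser advertises XML support by listing text/xml in its Accept header, which is a simplification, and it uses a stub in place of a real remote conversion service. All names here are hypothetical.

```python
from typing import Callable, Tuple


def xml_to_html_stub(xml_text: str) -> str:
    """Stand-in for a remote conversion service: shows the XML source as escaped text."""
    return "<PRE>" + xml_text.replace("&", "&amp;").replace("<", "&lt;") + "</PRE>"


def intermediary(accept: str, content_type: str, body: str,
                 convert: Callable[[str], str] = xml_to_html_stub) -> Tuple[str, str]:
    """Pass the response through unchanged, or convert it for a browser without XML support."""
    accepted = {item.split(";")[0].strip().lower() for item in accept.split(",")}
    if content_type == "text/xml" and "text/xml" not in accepted:
        return "text/html", convert(body)   # call the conversion service
    return content_type, body               # browser can handle it as it is


print(intermediary("text/html, image/gif", "text/xml", "<A>1</A>"))
print(intermediary("text/html, text/xml", "text/xml", "<A>1</A>"))
```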
32. 32
Content Negotiation
Transparent Content Negotiation (TCN):
• Method of deploying new formats
Client:
Accept: image/gif, image/png
Server:
If foo.png exists, send it; otherwise send foo.gif
• Used for logos on W3C website
• Not widely deployed
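The GIF/PNG exchange above can be sketched as a server-side choice driven by the Accept header. This is a simplified illustration that ignores quality values and wildcards, not an implementation of the full Transparent Content Negotiation protocol; the function name and the set of available files are assumptions.

```python
def negotiate_image(accept_header: str, available: set[str]) -> str:
    """Choose between foo.png and foo.gif from the client's Accept header."""
    # Keep only the media type of each entry, dropping parameters such as ";q=0.8".
    accepted = {item.split(";")[0].strip().lower() for item in accept_header.split(",")}
    if "image/png" in accepted and "foo.png" in available:
        return "foo.png"
    return "foo.gif"   # fallback that every graphical browser is assumed to handle


files = {"foo.gif", "foo.png"}
print(negotiate_image("image/gif, image/png", files))
print(negotiate_image("image/gif", files))
```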
Transparent Feature Negotiation:
• Proposal for deploying new HTML elements
• Over-engineered? Requires naming authority
33. 33
Fourth and Fifth Ways
Several other options for deploying new web
technologies (e.g. on low spec PCs):
Run Browser on Server
• Use Windows Terminal Server, Citrix, etc.
• Browser runs on NT server
Deploy JavaPC (e.g. for DOS)
• Use JavaPC and run the HotJava browser (min. spec 486 PC with 8 MB)
Opera
• Supports CSS, Frames, … on 486 PCs (8 MB)
• See <URL: http://www.operasoftware.com/>
34. 34
Conclusions
To conclude:
• New web protocols are still being developed
• Deployment of new technologies can be expensive
or time-consuming, but is likely to be needed
• Various deployment models:
– Don't implement
– Implement fully
– Implement via proxy
– Others (thin clients, …)
• We can't do it all ourselves
• Experience in developing (wide-area) web
applications will help in developing intermediaries