Lecture 5: XML
<lecture number="8" date="2-8-2007">
  <lecturer> Simon McCallum</lecturer>
  <slide number="1">
    <frame pos="0,0" width="1024" height="40">
      <img src="compSFDV2001.gif" />
    </frame>
    <frame pos="40,70" width="500" height="30"> ........ </frame>
    .........
  </slide>
</lecture>
SFDV2001 – Web Development
more than the text that you see
General concept is meta-information
Plays – Mark what the actors do
11/09/07 (SFDV2001:8)XML
HORATIO. Peace! Who comes here?
[Enter young Osric, a courtier.]
OSRIC. Your lordship is right welcome back to Denmark.
HAMLET. I humbly thank you, sir. [Aside to Horatio] Dost know this waterfly?
Letting the computer in on the secret
Web - HTML
LaTeX typesetting language
Generalised Markup Language (GML) – developed at IBM in 1969
Used in the printing industry
SGML - Standard Generalized Markup Language (ISO 8879, 1986)
This is the basis for HTML, XML, MML and most modern markup languages.
IBM moved 90% of its documents to SGML in the 1980s
Marking up information lets you:
Move data between applications.
Edit data with other programs.
Create files in Word and have them look the same on the web
Move the information from one machine to another
PDA, cellphones, consoles, ...
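A minimal sketch of moving data between applications, using Python's standard-library xml.etree.ElementTree. One "application" writes a lecture record as XML text and a second one recovers the values from that text alone. The element and attribute names follow the title-slide example; the function names export_lecture and import_lecture are hypothetical.

```python
import xml.etree.ElementTree as ET


def export_lecture(number, date, lecturer):
    """'Application A': turn plain values into an XML string."""
    lecture = ET.Element("lecture", {"number": str(number), "date": date})
    ET.SubElement(lecture, "lecturer").text = lecturer
    return ET.tostring(lecture, encoding="unicode")


def import_lecture(xml_text):
    """'Application B': recover the values from the XML string alone."""
    lecture = ET.fromstring(xml_text)
    return {
        "number": int(lecture.get("number")),
        "date": lecture.get("date"),
        "lecturer": lecture.findtext("lecturer"),
    }


xml_text = export_lecture(8, "2-8-2007", "Simon McCallum")
record = import_lecture(xml_text)
print(xml_text)
print(record)
```

The two functions share nothing except the XML text, so the receiving side could equally be a different program on a different machine.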
Too many DTDs lead to a lack of consistency.
Poor implementations lead to inconsistent recognition.
Browsers are not able to recover from errors as they do for HTML.
Separate files are usually used for the XML document, the DTD and the presentation.
Bloated documents if you just need a specific part of the information
MS and XML
“Office 12” to be fully XML-based.
Officially, the short answer is because of:
improved file and data management,
improved interoperability, and
a published file-format specification
This is what customers and governments have asked us for.
Laws being passed about open formats for public documents.
Office will print to PDF for similar reasons
Steven Sinofsky
Microsoft still wants control, so:
XPS - XML Paper Specification, Microsoft's XML-based fixed-layout document format.
They have created a licence to use their XML definitions
Licence incompatible with other Open Source Software licences
As it is not a community-developed format, they can change it without warning, and they still want control over how it is implemented
OpenOffice has been XML-based for several years
XHTML – the implementation of HTML as an XML application
Readily viewed, edited, and validated with standard XML tools.
XHTML documents can be written to operate as well or better in existing HTML 4-conforming browsers.
XHTML documents can utilize applications that rely upon either the HTML Document Object Model or the XML Document Object Model.
XHTML is more likely to be interoperable in the future.
XHTML is more extensible than HTML.
HTML - XHTML differences
You will notice
All tags must have an ending
<p> must have </p>
empty tags such as <br> become <br />
<img ..... />
All attribute values must be quoted, e.g. size="3"
Makes using the DOM easy
Everything is encapsulated with no ambiguity.
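The rules above can be checked with any XML parser. The sketch below uses Python's standard-library xml.etree.ElementTree as a stand-in for "standard XML tools"; the helper is_well_formed and the two sample fragments are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML habits: <p> never closed, bare <br>, unquoted attribute value
html_style = '<p>First line<br>Second line <font size=3>big</font>'

# The same content following the XHTML rules
xhtml_style = '<p>First line<br />Second line <font size="3">big</font></p>'

print(is_well_formed(html_style))
print(is_well_formed(xhtml_style))
```

An XML parser stops at the first violation instead of guessing, which is why an XHTML document has no ambiguity left by the time the DOM is built.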
NOTE: your first project is in HTML, NOT XHTML
Newsfeeds - RSS
XML can be used for website data.
RSS – Rich Site Summary.
Make a file that records changes to the website.
Publication date, enclosures, links.
Aggregator – collects XML feeds and displays them
On the web
Becoming very common.
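A rough sketch of what an aggregator does with a feed, again using Python's xml.etree.ElementTree. The feed below is hand-written in RSS 2.0 style; every title, URL and date in it is invented, and read_items is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# A hand-written RSS 2.0 feed with one item; all values are made up.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example site</title>
    <link>http://www.example.com/</link>
    <description>Recent changes to the site</description>
    <item>
      <title>Lecture notes added</title>
      <link>http://www.example.com/lecture8.html</link>
      <pubDate>Tue, 11 Sep 2007 09:00:00 GMT</pubDate>
      <enclosure url="http://www.example.com/lecture8.mp3"
                 length="123456" type="audio/mpeg" />
    </item>
  </channel>
</rss>
"""


def read_items(feed_text):
    """Return one dict per <item>, as a simple aggregator might."""
    channel = ET.fromstring(feed_text).find("channel")
    items = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "published": item.findtext("pubDate"),
            # Enclosures are optional, so an item may not have one
            "enclosure": enclosure.get("url") if enclosure is not None else None,
        })
    return items


for entry in read_items(FEED):
    print(entry["published"], "-", entry["title"], entry["link"])
```

A real aggregator would fetch the feed text over HTTP from many sites and merge the items by date; only the parsing step is shown here.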
Other XML Applications
There are a number of common web-based XML document formats
RDF – Resource Description Framework.
Used to give metadata about a document
Small graphics; IE does not support it, so it is not used
Video, images and audio
MathML – maths markup. HTML is terrible for maths
DOM – Document Object Model
Describes the structure of your document.
Think of tags as containers
Folders in folders in folders….
Allows you to access the objects in the structure.
Allows parts of an HTML page to be easily identified.
Provides an Application Programming Interface (API) for HTML.
Enables site developers to modify the content and visual presentation of HTML easily.
Is language-independent, so it can be used from any programming language.
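As a small illustration of the points above, here is a sketch using Python's standard-library xml.dom.minidom, which offers the same style of DOM calls that browsers expose to scripts. The page content is invented, and because minidom is an XML parser the page has to be well-formed (XHTML-style).

```python
from xml.dom import minidom

# A tiny well-formed page; the content is made up for illustration.
PAGE = """<html>
  <body>
    <h1 id="title">Old heading</h1>
    <p>First paragraph</p>
    <p>Second paragraph</p>
  </body>
</html>"""

doc = minidom.parseString(PAGE)

# Containers inside containers: html -> body -> h1 / p / p
body = doc.documentElement.getElementsByTagName("body")[0]
child_tags = [n.tagName for n in body.childNodes
              if n.nodeType == n.ELEMENT_NODE]

# Identify parts of the page by tag name
paragraphs = doc.getElementsByTagName("p")
second_text = paragraphs[1].firstChild.data

# Modify content and presentation through the same API
heading = doc.getElementsByTagName("h1")[0]
heading.firstChild.data = "New heading"
heading.setAttribute("style", "color: red")

print(child_tags)
print(second_text)
print(heading.toxml())
```

The method names getElementsByTagName and setAttribute come from the W3C DOM itself, so the same calls appear in other languages that implement it.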
DOM – Document Object Model
Model of document allows you to access things like