Everything You Always Wanted To Know About XML But Were Afraid To Ask

Short internal XML course (2003). Interesting to note that RSS doesn't feature; it hadn't quite hove into our view at that point. Also no mention of OxygenXML. Was I really not using it then? I seem to have been using it forever.

Published in: Technology

    1. “Everything you always wanted to know about XML*” (*But were afraid to ask)
    2. <everything>
       • XML basics
       • A bit of web history
       • XML in detail
       • Where is XML used?
       • Creating XML
       • Manipulating XML
       • Pros and cons
    3. XML basics
           <?xml version="1.0"?>
           <manual id="CRDA/ULCC/IMP/BPM/1.0">
             <title>Biscuit Procedures Manual</title>
             <author>Ruth Vyse</author>
             <date>1998-04-01</date>
             <content>
               <heading>Purpose</heading>
               <para>This document describes the procedures for
                 <list>
                   <item>provision of an adequate quantity of biscuits</item>
                   <item>provision of an adequate variety of biscuits</item>
                 </list>
               </para>
             </content>
           </manual>
       • XML declaration
       • Elements
       • Attributes
       • Well-formed
       • Valid?
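
The difference between "well-formed" and "valid" on the slide above can be sketched in code. This is a minimal illustration in Python (not the course's own tooling), assuming the standard-library `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the trimmed copy of the manual are made up for the example. Note that ElementTree is a non-validating parser: it can confirm well-formedness, but checking validity needs a DTD and a validating parser.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the manual from the slide, held in a string for illustration.
MANUAL = """<?xml version="1.0"?>
<manual id="CRDA/ULCC/IMP/BPM/1.0">
  <title>Biscuit Procedures Manual</title>
  <author>Ruth Vyse</author>
  <date>1998-04-01</date>
</manual>"""


def is_well_formed(text):
    """Hypothetical helper: True if the text parses as XML at all."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(MANUAL)
print(root.tag)                # element name of the document root
print(root.get("id"))          # value of the id attribute
print(root.findtext("title"))  # character data inside <title>

print(is_well_formed(MANUAL))
# Mismatched tags: <title> is never closed, so this is not well-formed.
print(is_well_formed("<manual><title>Unclosed</manual>"))
```

A document can pass this check and still be invalid, because validity is always relative to a particular DTD or schema.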
    4. A bit of web history
       • SGML
       • HTML
       • Cascading Style Sheets (CSS)
       • Javascript
    5. Standard Generalized Markup Language (SGML)
       • ISO Standard 8879:1986
       • Define structured document types
       • Markup languages for structured documents
       • Main components:
           • Elements delimited by tags
           • Attributes
           • Character data
           • Entities
    6. Hypertext Markup Language (HTML)
       • An application of SGML
       • Hyperlinks / hypertext
       • Designed for structural markup of documents
       • Evolution:
           • 4 major versions
           • Non-standard extensions (Netscape, MS)
           • Superseded by XHTML
    7. Cascading Style Sheets (CSS)
           body { font-family: serif;
                  text-align: justify;
                  margin: 0pt;
                  background-color: white;
                  color: black }
           h1, h2, h3 { font-family: sans-serif;
                        font-weight: bold;
                        text-align: left; }
           .reverse { background-color: black;
                      color: white }
       • Non-SGML syntax
       • Properties cascade to descendants
       • Styles can be defined:
           • In external files
           • In document header
           • Inline
    8. Javascript
       • Interface to browser functions
       • "Document object"
       • Event-driven interaction with
           • Forms
           • Images
           • Formatting
       • Security: limited IO
       • Object-oriented
       • ECMA 262 / ISO 16262
           <html>
           <head>
           <script language="Javascript">
           function helloWorld() {
             var message = "Hello World";
             document.forms[0].elements[0].value = message;
           }
           </script>
           </head>
           <body>
           <form>
             <input type="text">
             <input type="button" onClick="helloWorld();">
           </form>
           </body>
           </html>
    9.
           <mylink style="color: blue; text-decoration: underline"
                   onClick="location='http://ndad.ulcc.ac.uk/';">NDAD</mylink>
    10. XML in a little more detail
           <?xml version="1.0" encoding="ISO-8859-1"?>
           <!DOCTYPE ead SYSTEM "ead.dtd">
           <ead audience="internal">
             <eadheader langencoding="ISO 639-3">
               <titleproper>Lord Chancellor's Department: Judge Advocate General's Office Case Index System</titleproper>
               <date>2002-08-06 10:52:20</date>
             </eadheader>
             <archdesc level="series">
               <scopecontent id="AIM-PURPOSE">
                 <head>Aim and purpose</head>
                 <p>A court martial is a court convened to try armed forces personnel who have committed military or criminal offences.</p>
               </scopecontent>
             </archdesc>
           </ead>
       • Character encoding
       • DTD
       • Namespaces
    11. Character encoding
       • ASCII: ISO 646
           • 7 bit
       • ISO 8859
           • 8 bit
           • Top half interchangeable
       • Unicode: ISO 10646
           • Code for every symbol of every language
           • Variable 8 - 32 bit encoding (UTF-8)
           • ASCII transparent in 8 bit encoding
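
The encoding points above can be checked with a few lines of Python. This is an illustrative sketch only; the helper name `utf8_length` and the sample characters are assumptions, not part of the original course. It shows UTF-8 using one to four bytes per character, ASCII text being byte-for-byte identical in UTF-8, and one reading of "top half interchangeable": the same byte above 0x7F means different things in different ISO 8859 parts.

```python
def utf8_length(ch):
    """Hypothetical helper: number of bytes UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


# A, e-acute, euro sign, musical G clef (escaped so the source stays ASCII).
samples = ["A", "\u00e9", "\u20ac", "\U0001D11E"]
for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), hex {encoded.hex()}")

# ASCII is transparent: pure ASCII text has the same bytes in both encodings.
print("XML".encode("ascii") == "XML".encode("utf-8"))

# Byte 0xA4 sits in the top half, where the ISO 8859 parts differ.
latin1 = b"\xa4".decode("iso-8859-1")    # currency sign
latin9 = b"\xa4".decode("iso-8859-15")   # euro sign
print(f"U+{ord(latin1):04X} versus U+{ord(latin9):04X}")
```

This is why the `encoding` pseudo-attribute in the XML declaration matters: without it, a parser cannot know which interpretation of the top half was intended.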
    12. Document Type Definition (DTD)
       • SGML compatible
       • Non-XML syntax *
       • Defines document structure
           • Elements
           • Attributes
           • Entities
       • * pointy brackets notwithstanding
           <!ELEMENT table (title, datafile, field+) >
           <!ELEMENT title (#PCDATA) >
           <!ELEMENT datafile (bytes, numrecs, maxrecsize) >
           <!ELEMENT field (name, description, ddtext?, note?, choices?) >
           <!ATTLIST table reference CDATA #REQUIRED >
           <!ATTLIST datafile type CDATA #REQUIRED
                              location CDATA #REQUIRED >
           <!ATTLIST field type CDATA #REQUIRED >
           <!ENTITY NDAD "http://ndad.ulcc.ac.uk/" >
    13. Namespaces
       • Mix and match XML applications
       • Avoid conflicting elements
       • Limited DTD compatibility
           <?xml version="1.0" encoding="ISO-8859-1"?>
           <table reference="CRDA/8/DS/1/1/1">
             <title>Court Details</title>
             <datafile type="CSV" location="/data/ready/8/court.txt">
               <bytes>20583</bytes>
               <numrecs>388</numrecs>
               <maxrecsize>65</maxrecsize>
             </datafile>
             <description xmlns:html="http://www.w3.org/1999/xhtml">
               <html:p>For further details see the
                 <html:a href="/datasets/8/series.htm">Series Catalogue</html:a>.
               </html:p>
             </description>
           </table>
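
To see what the namespace prefix does in practice, here is a small Python sketch using the standard-library `xml.etree.ElementTree` (an assumption for illustration; the course itself used other tools). The sample is a cut-down copy of the table above, and the prefix map `NS` is defined locally. A parser resolves `html:a` to the pair (namespace URI, local name), so the XHTML elements cannot be confused with the archive's own `title` or `description`.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the table from the slide.
TABLE = """<table reference="CRDA/8/DS/1/1/1">
  <title>Court Details</title>
  <description xmlns:html="http://www.w3.org/1999/xhtml">
    <html:p>For further details see the
      <html:a href="/datasets/8/series.htm">Series Catalogue</html:a>.
    </html:p>
  </description>
</table>"""

# Prefix map for lookups; the prefix used here need not match the document's.
NS = {"html": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(TABLE)
link = root.find("description/html:p/html:a", NS)

print(link.tag)                 # {http://www.w3.org/1999/xhtml}a
print(link.get("href"))         # /datasets/8/series.htm
print(root.find("title").tag)   # title (no namespace)
```

The prefix itself is only shorthand; it is the URI that identifies the vocabulary.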
    14. Where is XML used?
       • Web
       • Desktop applications
       • New markup applications
       • Standards for data exchange
       • Configuration files
    15. Web
       • Server side
           • on-the-fly transformation to HTML / XHTML using XSLT
       • Client side
           • rendered natively using CSS
           • transformed to HTML / XHTML using XSLT
       • Metadata
    16. Desktop applications
       • OpenOffice/StarOffice
       • Mozilla/Netscape 6+
       • MS Office 2000
    17. Other applications
       • Encoded Archival Description (EAD)
       • Text Encoding Initiative (TEI)
       • Scalable Vector Graphics (SVG)
       • XHTML
       • Custom applications
    18. Creating XML
       • By hand
       • XML editors
       • Programming
       • Using XML ...
    19. XML editors
       • Non-validating
           • XML Notepad
       • Validating
           • Xmetal
           • XML Spy
           • XML Writer
    20. Programming
       • A simple Perl example:
           #!/usr/bin/perl
           use strict;
           use warnings;
           use XML::LibXML;

           # Parse the file and collect every <field> element under the root.
           my $parser = XML::LibXML->new;
           my $doc    = $parser->parse_file("myfile.xml");
           my $root   = $doc->getDocumentElement;
           my @fields = $root->getElementsByTagName("field");
           foreach my $field (@fields) {
               # do something with $field
           }
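
For comparison, the same parse-and-iterate pattern can be sketched in Python with the standard-library `xml.dom.minidom`, which exposes the same DOM method names as the Perl example. The sample document is hypothetical: the field names and types are invented for illustration, and it is parsed from a string rather than from `myfile.xml` so that the snippet is self-contained. It is not intended to be valid against the DTD shown earlier, and minidom does not validate.

```python
from xml.dom import minidom

# Hypothetical sample data standing in for myfile.xml.
SOURCE = """<table reference="CRDA/8/DS/1/1/1">
  <field type="text"><name>court</name><description>Court name</description></field>
  <field type="date"><name>sitting</name><description>Date of sitting</description></field>
</table>"""

doc = minidom.parseString(SOURCE)
root = doc.documentElement                     # the <table> element
fields = root.getElementsByTagName("field")    # every <field> descendant

for field in fields:
    # Text of the first <name> child, plus the type attribute.
    name = field.getElementsByTagName("name")[0].firstChild.data
    print(field.getAttribute("type"), name)
```

To read from disk instead, `minidom.parse("myfile.xml")` takes a filename in place of the string.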
    21. Manipulating XML
       • Document Object Model (DOM)
       • XPath
       • XML Stylesheets (XSL)
       • More XML applications
    22. Document Object Model (DOM)
       • W3C recommendation
       • Application independent
       • Language/OS neutral
       • Hierarchical
           • Parent
           • Children
           • Siblings
       • Node types
           • Document
           • Element
           • Attribute
           • Text
           • Comment
           • Entity
           • + 6 more
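
The hierarchy and node types listed above can be explored with a short Python sketch. It assumes the standard-library `xml.dom.minidom` as the DOM implementation, and the tiny biscuit document is made up for the example (written without whitespace between elements, so no extra text nodes appear among the children).

```python
from xml.dom import minidom, Node

# Hypothetical document: one comment and two elements under the root.
doc = minidom.parseString(
    '<list><!-- biscuits --><item kind="plain">digestive</item>'
    "<item>bourbon</item></list>"
)
root = doc.documentElement
comment, first, second = root.childNodes       # children, in document order

# Parent and sibling relationships
print(first.parentNode is root)                # True
print(first.nextSibling is second)             # True
print(second.previousSibling is first)         # True

# Node types are exposed as integer constants on Node
print(doc.nodeType == Node.DOCUMENT_NODE)
print(root.nodeType == Node.ELEMENT_NODE)
print(comment.nodeType == Node.COMMENT_NODE)
print(first.firstChild.nodeType == Node.TEXT_NODE)
print(first.getAttributeNode("kind").nodeType == Node.ATTRIBUTE_NODE)
```

Because the DOM is language neutral, the same property names (`parentNode`, `childNodes`, `nextSibling`, `nodeType`) appear in other languages' DOM bindings as well.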
    23. XPath
       • W3C recommendation
       • XML document as a tree of nodes
       • Non-XML syntax
       • Location paths
           • Relative: ../../tr
           • Absolute: /html/body/h1
           • Attributes: img/@src
           • Axes: parent, child, sibling, etc.
       • Functions
           • String: contains(), substring()
           • Array/node: last(), count(), position()
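
A few location paths can be tried out in Python. This sketch assumes the standard-library `xml.etree.ElementTree`, which supports only a subset of XPath: paths are evaluated relative to an element, attribute values are read with `.get()` rather than selected with `@src`, and functions such as `contains()` or `count()` are not available. The sample page is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical page to query.
PAGE = """<html><body>
<h1>Datasets</h1>
<table>
<tr><td><img src="a.png"/></td></tr>
<tr><td><img src="b.png"/></td></tr>
</table>
</body></html>"""

root = ET.fromstring(PAGE)   # root is the <html> element

# Counterpart to the absolute path /html/body/h1, written relative to <html>.
heading = root.findtext("body/h1")
print(heading)

# Every <img> that has a src attribute, anywhere below the root.
sources = [img.get("src") for img in root.findall(".//img[@src]")]
print(sources)

# last() as a position predicate: the final <tr> within its parent.
last_src = root.find(".//tr[last()]/td/img").get("src")
print(last_src)

# count() is not in the subset, so count the matches in Python instead.
row_count = len(root.findall(".//tr"))
print(row_count)
```

A full XPath 1.0 engine, such as the one behind the Perl XML::LibXML module used earlier, accepts the complete syntax, including `img/@src` and the string functions.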
    24. XML Stylesheets (XSL)
       • W3C Recommendation
       • XML syntax
       • Transformations (XSLT)
       • Formatting objects (XSL-FO)
       • XSLT processor e.g. Sablotron (sabcmd)
    25. More XML applications
       • XHTML (strict, transitional)
       • XML-Schema
       • RELAX-NG
       • XLink
       • XPointer
       • XML Query
    26.
       • Open standard
       • Flexible
       • Transformable
       • Not going to be around forever
       • Simple
       • Complex
       • Machine-readable
       • Human-readable
    27. </everything>
