2. Objectives
• Explain what XML is and the need for XML
• Know other markup languages – SGML, HTML,
XHTML
• Understand the difference between SGML,
HTML and XML
• Know the various applications of XML
• Know the pros and cons of XML
3. XML
• XML stands for Extensible Markup Language.
• XML is a tool for transporting and storing data in a
platform- and language-neutral way.
• XML plays an important role in the exchange of a
wide variety of data on the web
• XML defines a set of rules for encoding documents
in a format that is both human-readable and
machine-readable
• All rules are defined in the XML 1.0 specification,
an open standard developed by the W3C
• Many parsers and APIs (Application Programming
Interfaces) are available to process XML data
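As a minimal sketch of what "processing XML data with a parser" can look like, the example below uses the ElementTree API from Python's standard library. The document content and the tag names (library, book, title, price) are made up for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag names are illustrative only.
XML_TEXT = """<?xml version="1.0"?>
<library>
  <book id="b1">
    <title>Learning XML</title>
    <price currency="USD">29.95</price>
  </book>
  <book id="b2">
    <title>XML in Practice</title>
    <price currency="USD">35.00</price>
  </book>
</library>"""


def read_books(xml_text):
    """Return a list of (id, title, price) tuples from the sample document."""
    root = ET.fromstring(xml_text)  # parse the text into an element tree
    books = []
    for book in root.findall("book"):  # direct <book> children of <library>
        book_id = book.get("id")  # attribute value
        title = book.findtext("title")  # text of the <title> child
        price = float(book.findtext("price"))  # text is a string; convert it
        books.append((book_id, title, price))
    return books


if __name__ == "__main__":
    for book_id, title, price in read_books(XML_TEXT):
        print(book_id, title, price)
```

The same text is readable by a person and by the parser, which is the "human-readable and machine-readable" property noted above.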
4. History of XML
• One of the W3C’s primary goals is to make the Web universally
accessible, regardless of disability, language, culture, etc.
• The Internet is a collection of interconnected computers
• ARPANET (Advanced Research Projects Agency Network)
was the first network to interconnect academic,
government and private research organizations
• Initially, the Internet was used for sending electronic messages and
transferring files.
• FTP (File Transfer Protocol) allows people to request files
from another system
• Limitations
– The format of a requested file is not known in advance
– The receiving system may not be able to process the file
5. Contd.
• CERN browser
– Used to request files over the internet and display
them in a predefined format
– Uses
• HTTP (Hypertext Transfer Protocol) and
• HTML (Hypertext Markup Language)
• Presentation details cannot be transferred because they
are coded in a machine-specific manner that may
not be understood at the receiving end
6. Contd.
• Standard Generalized Markup Language (SGML) -
allows information about the document's structure to be
preserved
• DSSSL – Document Style Semantics and Specification
Language
• SGML is used to specify markup languages.
• The purpose of SGML is to create vocabularies that
can be used to mark up documents with structural tags.
• HTML - one of the most popular applications of SGML
• HTML - a markup language used for presentation, i.e. designing
a webpage
• HTML - all tags are predefined
7. Contd.
• Limitations of HTML
– HTML is not suited to data storage or data
interchange
– All tags are predefined
• XML bridges this gap
– It is human-readable, while being flexible enough to
support platform- and architecture-independent
data interchange
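To make the data interchange point concrete, here is a small sketch in which one program serializes a record to XML using tags of its own choosing and another reads it back. It assumes Python's standard xml.etree.ElementTree module. The order record, the tag names and the two function names are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def order_to_xml(order_id, customer, items):
    """Sender side: encode an order using user-defined (hypothetical) tags."""
    order = ET.Element("order", attrib={"id": order_id})
    ET.SubElement(order, "customer").text = customer
    for name, quantity in items:
        item = ET.SubElement(order, "item", attrib={"quantity": str(quantity)})
        item.text = name
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(order, encoding="unicode")


def xml_to_order(xml_text):
    """Receiver side: rebuild the record from the XML text alone."""
    order = ET.fromstring(xml_text)
    items = [(item.text, int(item.get("quantity"))) for item in order.findall("item")]
    return order.get("id"), order.findtext("customer"), items


if __name__ == "__main__":
    text = order_to_xml("A-17", "Asha", [("pen", 2), ("notebook", 1)])
    print(text)
    print(xml_to_order(text))
```

Because the exchanged text carries its own structure, the receiver needs only an XML parser and agreement on the tag names, not the sender's platform or programming language.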
8. SGML vs HTML vs XML
• HTML allows hypertext links to be specified;
SGML itself does not define hypertext links
• HTML describes the presentation of the content, not
its meaning;
XML describes the meaning of the content
• HTML is not extensible;
XML is highly extensible
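The presentation-versus-meaning contrast can be sketched by marking up the same fact both ways and listing the tags each version uses. The two fragments below are invented for illustration. The HTML fragment is deliberately well-formed so that the same XML parser can read it, which is not true of HTML in general.

```python
import xml.etree.ElementTree as ET

# Same fact, two markups (both fragments are hypothetical examples).
HTML_FRAGMENT = "<p><b>Ravi</b> <i>Engineer</i></p>"
XML_FRAGMENT = "<employee><name>Ravi</name><role>Engineer</role></employee>"


def tag_names(markup):
    """Return the element names in document order."""
    return [element.tag for element in ET.fromstring(markup).iter()]


if __name__ == "__main__":
    # Predefined HTML tags say how to display the text: paragraph, bold, italic.
    print(tag_names(HTML_FRAGMENT))
    # User-defined XML tags say what the text is: a name and a role.
    print(tag_names(XML_FRAGMENT))
    # Only the XML version can be queried by meaning.
    print(ET.fromstring(XML_FRAGMENT).findtext("role"))
```

A program reading the HTML version learns only that one word is bold and another italic, while the XML version identifies which value is the name and which is the role.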
9. Semi-Structured Data
• Data may be
– Structured
– Unstructured
– Raw data
• Text Database – Unstructured
• Text Markup – Markup languages
• SGML – meta language
• HTML – markup language with predefined
tags