Use of XML (The eXtensible Markup Language)
& Web Server
1
Contents :
 Why XML ?
 What is XML ?
 The Basic Rules
 HTML vs XML
 Validation
 DTD & Schemas
 Parsers
 Advantages & Disadvantages of XML
2
Why XML ?
3
Although HTML is widely used for formatting and
structuring Web documents, it is not suitable for
specifying structured data that is extracted from
databases.
A new language, namely XML, has emerged as
the standard for structuring and exchanging data
over the Web.
What is XML ?
 XML stands for eXtensible Markup Language.
 A markup language is used to provide information
about a document.
 Tags are added to the document to provide the
extra information.
 XML tags give a reader some idea of what some of
the data means.
 XML and HTML have a similar syntax; both are
derived from SGML.
4
The Basic Rules
 XML is case sensitive
 All start tags must have end tags
 Elements must be properly nested
 The XML declaration, if present, must be the first statement
 Every document must contain a root element
 Attribute values must have quotation marks
 Certain characters are reserved for parsing
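A parser enforces these rules, so they can be tried out directly. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the sample strings and the helper name `is_well_formed` are illustrative and not part of the original slides.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative inputs, one per rule.
good = '<address id="1"><name>AMU</name></address>'
bad_case = "<address></Address>"                 # tag names differ in case
bad_nesting = "<name><email>x</name></email>"    # elements overlap
bad_attribute = "<address id=1></address>"       # attribute value not quoted
two_roots = "<a/><b/>"                           # more than one root element

for label, doc in [("good", good), ("case", bad_case), ("nesting", bad_nesting),
                   ("attribute", bad_attribute), ("two roots", two_roots)]:
    print(f"{label:10s} well-formed: {is_well_formed(doc)}")

# Reserved characters must be escaped before they are placed in element text.
escaped = escape("AT&T <main>")
print(escaped)
```

Only the first document is accepted; each of the others breaks exactly one of the rules listed above.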
5
Encoding
 XML uses Unicode to encode characters.
 Unicode has several encoding forms, such as UTF-8 and UTF-16.
 The most common one used in the West is UTF-8.
 UTF-8 is a variable-length code. Characters
are encoded in 1, 2, 3, or 4 bytes.
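UTF-8 uses between one and four bytes per character, depending on the code point. A small sketch that measures this; the four sample characters are chosen here purely for illustration.

```python
# One sample character from each UTF-8 length class (illustrative choices).
samples = [
    "A",           # U+0041, ASCII
    "\u00e7",      # U+00E7, Latin small letter c with cedilla
    "\u0905",      # U+0905, Devanagari letter A
    "\U0001F600",  # U+1F600, an emoji outside the Basic Multilingual Plane
]

byte_lengths = [len(ch.encode("utf-8")) for ch in samples]

for ch, n in zip(samples, byte_lengths):
    print(f"U+{ord(ch):04X} is encoded in {n} byte(s)")
```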
6
Example :
<?xml version="1.0"?>
<address>
<name>
<first>AMU</first>
<last>MCA</last>
</name>
<email>csdamu@gmail.com</email>
<phone>123-45-6789</phone>
<birthday>
<year>1920</year>
<month>01</month>
<day>09</day>
</birthday>
</address>
7
XML Files are Trees
 An XML document has a single root node.
 Preorder traversal is usually used.
address
    name
        first
        last
    email
    phone
    birthday
        year
        month
        day
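The tree view can be checked with a short preorder walk. This sketch parses the address example with Python's standard `xml.etree.ElementTree`; the generator name `preorder` is an illustrative choice.

```python
import xml.etree.ElementTree as ET

ADDRESS_XML = """<?xml version="1.0"?>
<address>
  <name><first>AMU</first><last>MCA</last></name>
  <email>csdamu@gmail.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1920</year><month>01</month><day>09</day></birthday>
</address>"""


def preorder(node, depth=0):
    """Yield (depth, tag) for the node first, then for each subtree in order."""
    yield depth, node.tag
    for child in node:
        yield from preorder(child, depth + 1)


root = ET.fromstring(ADDRESS_XML)
visited = list(preorder(root))
tags = [tag for _, tag in visited]

for depth, tag in visited:
    print("    " * depth + tag)
```

`Element.iter()` walks the tree in the same document order, so `[e.tag for e in root.iter()]` gives the same list of tags.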
8
HTML vs XML
HTML:
 Fixed set of tags
 Presentation oriented
 No data validation capabilities
 Single presentation
 Tags are used for display.
XML:
 Extensible set of tags
 Content oriented
 Standard data infrastructure
 Allows multiple output forms
 Tags are used to describe documents and data.
9
Validation
 A well-formed document has a tree structure
and obeys all the XML rules.
 A particular application may add more rules in
either a DTD (document type definition) or in a
schema.
 Many specialized DTDs and schemas have
been created to describe particular areas.
 DTDs were developed first, so they are not as
comprehensive as schemas.
10
DTD : Document Type Definitions
 A DTD describes the tree structure of a
document and something about its data.
 There are two data types, PCDATA and CDATA.
– PCDATA is parsed character data.
– CDATA is character data, not usually parsed.
 A DTD determines how many times a node may
appear, and how child nodes are ordered.
11
DTD for address Example
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
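To make the content model concrete, the sketch below checks a document against the same parent/child sequences. It is not a DTD validator: Python's standard library does not validate against DTDs, so the table `CONTENT_MODEL` is copied by hand from the declarations above, and the function name `conforms` is hypothetical. It also ignores stray text inside element-only content.

```python
import xml.etree.ElementTree as ET

# Hand-copied from the DTD: element -> required child sequence,
# or None for (#PCDATA) elements, which hold text only.
CONTENT_MODEL = {
    "address": ["name", "email", "phone", "birthday"],
    "name": ["first", "last"],
    "birthday": ["year", "month", "day"],
    "first": None, "last": None, "email": None, "phone": None,
    "year": None, "month": None, "day": None,
}


def conforms(node) -> bool:
    """Check child names and order recursively against CONTENT_MODEL."""
    if node.tag not in CONTENT_MODEL:
        return False
    expected = CONTENT_MODEL[node.tag]
    children = list(node)
    if expected is None:
        return not children              # text-only element: no child elements
    if [child.tag for child in children] != expected:
        return False
    return all(conforms(child) for child in children)


valid_doc = ET.fromstring(
    "<address><name><first>AMU</first><last>MCA</last></name>"
    "<email>csdamu@gmail.com</email><phone>123-45-6789</phone>"
    "<birthday><year>1920</year><month>01</month><day>09</day></birthday>"
    "</address>"
)
# Same data, but email and phone are swapped, which breaks the declared order.
swapped_doc = ET.fromstring(
    "<address><name><first>AMU</first><last>MCA</last></name>"
    "<phone>123-45-6789</phone><email>csdamu@gmail.com</email>"
    "<birthday><year>1920</year><month>01</month><day>09</day></birthday>"
    "</address>"
)

print(conforms(valid_doc), conforms(swapped_doc))
```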
12
Schemas
 Schemas are themselves XML documents.
 They were standardized after DTDs and provide
more information about the document.
 They have a number of data types including
string, decimal, integer, boolean, date, and time.
 They divide elements into simple and complex
types.
 They also determine the tree structure and how
many children a node may have.
13
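Schema for First address Example
(This schema describes a flat variant of the document, in which name is a single string and birthday is a single date such as 1920-01-09.)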
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="address">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
<xs:element name="phone" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
14
XSLT
Extensible Stylesheet Language Transformations
 XSLT is used to transform one XML document
into another, often an HTML document.
 The transformation classes (javax.xml.transform) have been part of Java since version 1.4.
 An XSLT processor takes one XML document as
input and produces another document as output.
 If the resulting document is in HTML, it can be
viewed by a web browser.
 This is a good way to display XML data.
15
A Style Sheet to Transform address.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="address">
<html><head><title>Address Book</title></head>
<body>
<xsl:value-of select="name"/>
<br/><xsl:value-of select="email"/>
<br/><xsl:value-of select="phone"/>
<br/><xsl:value-of select="birthday"/>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Result (for an address.xml whose name and birthday elements each hold a single text value):
AMU MCA
csdamu@gmail.com
123-45-6789
1920-01-09
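Python's standard library has no XSLT processor, so the sketch below does not run the stylesheet. It imitates the four `xsl:value-of` steps by hand with `xml.etree.ElementTree`, assuming a flat input document in which `name` and `birthday` each hold a single text value. The names `FLAT_ADDRESS`, `string_value`, and `transform` are illustrative.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed flat input: name and birthday are single text values.
FLAT_ADDRESS = (
    "<address><name>AMU MCA</name><email>csdamu@gmail.com</email>"
    "<phone>123-45-6789</phone><birthday>1920-01-09</birthday></address>"
)


def string_value(node) -> str:
    """Concatenate all text beneath a node, as value-of does for an element."""
    return "".join(node.itertext()) if node is not None else ""


def transform(xml_text: str) -> str:
    """Build the same HTML skeleton as the stylesheet, one field per line."""
    address = ET.fromstring(xml_text)
    fields = ["name", "email", "phone", "birthday"]
    values = [escape(string_value(address.find(field))) for field in fields]
    body = "<br/>".join(values)
    return ("<html><head><title>Address Book</title></head>"
            f"<body>{body}</body></html>")


html_page = transform(FLAT_ADDRESS)
print(html_page)
```

A real transformation would hand both files to an XSLT processor; this version only shows what each `value-of` contributes to the page.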
16
Parsers
 There are two principal models for parsers.
 SAX – Simple API for XML
– Uses a call-back method
– Similar to Java event listeners
 DOM – Document Object Model
– Creates a parse tree
– Requires a tree traversal
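The two models can be compared side by side. This is a minimal sketch using Python's standard `xml.sax` and `xml.dom.minidom` modules rather than the Java APIs; the document string and the names `TagCollector` and `walk` are illustrative.

```python
import xml.sax
from xml.dom import minidom

DOC = "<address><name>AMU</name><email>csdamu@gmail.com</email></address>"


class TagCollector(xml.sax.ContentHandler):
    """SAX: the parser calls back into this handler as it reads each start tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


handler = TagCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_tags = handler.tags


def walk(node):
    """DOM: the whole tree is already in memory, so it is traversed explicitly."""
    tags = []
    if node.nodeType == node.ELEMENT_NODE:
        tags.append(node.tagName)
    for child in node.childNodes:
        tags.extend(walk(child))
    return tags


dom_tags = walk(minidom.parseString(DOC).documentElement)

print(sax_tags)
print(dom_tags)
```

Both approaches see the same elements in the same order. SAX never holds the full document in memory, while DOM builds the tree first and lets the program revisit any node.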
17
Advantages of XML
 XML is text (Unicode) based.
– Takes up less space.
– Can be transmitted efficiently.
 One XML document can be displayed differently
in different media.
– HTML, video, CD, DVD
– You only have to change the XML document in order
to change all the rest.
 XML documents can be modularized. Parts can
be reused.
18
Disadvantages of XML
19
 More difficult, demanding, and precise than HTML.
 Lack of browser support / end-user applications.
 Still experimental / not solidified.
Web Server
20
Contents :
 Introduction
 Use of Web Server
 Common features
 Path translation
 Kernel-mode and user-mode web servers
 Market Share
Introduction
 A web server is a computer system that
processes requests via HTTP, the basic network
protocol used to distribute information on the
World Wide Web.
 The term can refer either to the entire system, or
specifically to the software that accepts and
supervises the HTTP requests.
21
 The primary function of a web server is to store,
process and deliver web pages to clients.
 The communication between client and server
takes place using HTTP.
 Many generic web servers also support server-side
scripting using Active Server Pages (ASP),
PHP, or other scripting languages.
 They can also be found embedded in devices
such as printers and routers, serving only a local
network.
22
Use of Web Server
The most common use of web servers is to host
websites, but there are other uses such as
gaming, data storage, running enterprise
applications, handling email, FTP, or other web
uses.
23
Common features
 Virtual hosting to serve many web sites using one
IP address
 Large file support to be able to serve files whose
size is greater than 2 GB on a 32-bit OS
 Bandwidth throttling to limit the speed of responses
in order to not saturate the network and to be able
to serve more clients
 Server-side scripting to generate dynamic web
pages, while keeping web server and website
implementations separate from each other.
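Name-based virtual hosting can be pictured as a lookup from the HTTP `Host` header to a per-site document root. The sketch below is an assumption-laden illustration: the host names, directory paths, and the function `document_root` are all hypothetical, and IPv6 literal hosts are not handled.

```python
# Hypothetical site table: many host names served from one IP address.
SITES = {
    "www.example.com": "/srv/www/example",
    "blog.example.com": "/srv/www/blog",
}
DEFAULT_ROOT = "/srv/www/default"  # used when the host name is unknown


def document_root(host_header: str) -> str:
    """Pick the document root for a request from its Host header value."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    host = host_header.split(":", 1)[0].strip().lower()
    return SITES.get(host, DEFAULT_ROOT)


print(document_root("www.example.com"))
print(document_root("BLOG.example.com:8080"))
print(document_root("unknown.example.org"))
```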
24
Path Translation
Web servers are able to map the path component
of a Uniform Resource Locator (URL) into:
 A local file system resource (for static requests)
 An internal or external program name
(for dynamic requests)
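A rough sketch of path translation follows. The document root `/var/www/html`, the `/cgi-bin/` prefix for dynamic requests, and the function name `translate` are assumptions made for illustration; real servers make these rules configurable.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOC_ROOT = "/var/www/html"   # hypothetical document root
CGI_PREFIX = "/cgi-bin/"     # hypothetical prefix that marks dynamic requests


def translate(url: str):
    """Map a URL to ("static", file path) or ("dynamic", program name)."""
    raw_path = unquote(urlsplit(url).path)
    # Force a single leading slash, then collapse "." and ".." segments so the
    # result cannot climb above the document root.
    normalized = posixpath.normpath("/" + raw_path.lstrip("/"))
    if normalized.startswith(CGI_PREFIX):
        return ("dynamic", normalized[len(CGI_PREFIX):])
    return ("static", DOC_ROOT + normalized)


print(translate("http://www.example.com/path/file.html"))
print(translate("http://www.example.com/cgi-bin/search?q=xml"))
print(translate("http://www.example.com/../../etc/passwd"))
```

The third call shows why normalization matters: the `..` segments are collapsed before the path is joined to the document root, so the result stays inside it.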
25
Kernel-mode and User-mode web servers
 A web server can be implemented either in the
OS kernel or in user space.
 An in-kernel web server (like Microsoft IIS on
Windows) will usually work faster because, as
part of the system, it can directly use all the
hardware resources it needs, such as non-paged
memory, CPU time slices, network adapters, or
buffers.
26
 Web servers that run in user-mode have to ask
the system for permission to use more memory
or more CPU resources. Not only do these
requests to the kernel take time, but they are
not always satisfied because the system
reserves resources for its own usage and has
the responsibility to share hardware resources
with all the other running applications.
27
Load limits
 A web server has defined load limits, because it
can handle only a limited number of concurrent
client connections per IP address, depending on:
 its own settings,
 the HTTP request type,
 whether the content is static or dynamic,
 whether the content is cached, and
 the hardware and software limitations.
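One way such a limit can be enforced is with a counting semaphore. The sketch below caps the total number of concurrent connections rather than tracking them per IP address; the class name `ConnectionLimiter` and the limit of 2 are illustrative assumptions.

```python
import threading


class ConnectionLimiter:
    """Allow at most max_connections clients to be served at the same time."""

    def __init__(self, max_connections: int):
        self._slots = threading.BoundedSemaphore(max_connections)

    def try_accept(self) -> bool:
        """Take a slot without waiting; False means the server is at its limit."""
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        """Give the slot back when a connection closes."""
        self._slots.release()


limiter = ConnectionLimiter(max_connections=2)
attempts = [limiter.try_accept() for _ in range(3)]   # third client is refused
limiter.release()                                     # one client disconnects
after_release = limiter.try_accept()                  # a slot is free again

print(attempts, after_release)
```

A refused client would typically receive an error response or be queued, depending on the server's settings.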
28
Market Share
29
Latest statistics of the market share of the top web servers on the
Internet by Netcraft Survey, April and May 2014:
Product  Vendor       April 2014    %        May 2014      %        Change
Apache   Apache       361,853,003   37.74%   366,262,346   37.56%   -0.18%
IIS      Microsoft    316,843,695   33.04%   325,854,054   33.41%   +0.37%
GWS      Google        20,983,310    2.19%    20,685,165    2.12%   -0.07%
nginx    NGINX, Inc.  146,204,067   15.25%   142,426,538   14.60%   -0.64%
Apache, IIS, and nginx are the most used web servers on the
Internet.
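The Change column is the May share minus the April share, in percentage points. A short sketch that recomputes it from the rounded shares in the table, using `Decimal` to avoid floating-point noise; the dictionary name `SHARES` is illustrative.

```python
from decimal import Decimal

# (April 2014 share, May 2014 share) in percent, as printed in the table.
SHARES = {
    "Apache": (Decimal("37.74"), Decimal("37.56")),
    "IIS": (Decimal("33.04"), Decimal("33.41")),
    "GWS": (Decimal("2.19"), Decimal("2.12")),
    "nginx": (Decimal("15.25"), Decimal("14.60")),
}

# Change in percentage points = May share - April share.
changes = {name: may - april for name, (april, may) in SHARES.items()}

for name, delta in changes.items():
    print(f"{name:7s} {delta:+.2f}")
```

Recomputing from the rounded shares gives -0.65 for nginx, while the table prints -0.64. The difference is presumably because the published change was calculated from unrounded shares.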
References
 Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, 5th Edition
 http://en.wikipedia.org/w/index.php?title=Web_server&oldid=629776395
 "What is web server?" (http://www.webdevelopersnotes.com/basics/what_is_web_server.php)
30
Thank You
31

Editor's Notes

  • #2 XML: A markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It was developed by the W3C in 1998 and derived from SGML. The design goals of XML emphasize simplicity, generality, and usability over the Internet. It is a textual data format with strong support, via Unicode, for different human languages.
  • #4 Structured data: Information stored in databases is known as structured data because it is represented in a strict format. Semi-structured data: This data may have a certain structure, but not all the information collected will have identical structure. Unstructured data: A third category is known as unstructured data, because there is very limited indication of the type of data.
  • #5 W3C: XML Version 1.0 was introduced by the World Wide Web Consortium (W3C) in 1998. ML: A markup language (ML) is a modern system for annotating a document in a way that is syntactically distinguishable from the text. SGML: Standard Generalized Markup Language.
  • #6 Case sensitive: <address> is not the same as <Address>, so </Address> cannot close <address>. Nested: <name><email>…</name></email> is not allowed; <name><email>…</email></name> is allowed. Parser: An XML parser is used to check that all the rules have been obeyed.
  • #7 UTF-8: UCS (Universal Character Set) Transformation Format, 8-bit, is a character encoding capable of encoding all possible characters in Unicode. 1 byte: the first 128 characters in Unicode, which are ASCII. 2 bytes: code points 128 to 2047, which include the accented letters used in western Europe, such as ã, á, å, or ç. 3 bytes: the rest of the Basic Multilingual Plane, which includes most Asian ideographs. 4 bytes: all remaining code points, such as the ideographs that are left.
  • #15 The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. (xs = XML Schema)
  • #18 Parser: An XML parser is used to check that all the rules have been obeyed. Parsers are also available for free download over the Internet. One is Xerces, from the Apache open-source project. Java 1.4 also supports an open-source parser.
  • #23 HTTP: Hypertext Transfer Protocol. PHP: PHP is a server-side scripting language designed for web development but also used as a general-purpose programming language.
  • #26 For a static request, the URL path specified by the client is relative to the web server's root directory. Consider the following URL as it would be requested by a client: http://www.example.com/path/file.html The client's user agent will translate it into a connection to www.example.com with the following HTTP 1.1 request:
        GET /path/file.html HTTP/1.1
        Host: www.example.com
  • #27 OS kernel: In computing, the kernel is a computer program that manages input/output requests from software and translates them into data processing instructions for the central processing unit and other electronic components of a computer.
  • #29 Causes of overload: Too much legitimate web traffic, with thousands or even millions of clients connecting to the web site in a short interval (e.g., the Slashdot effect).
  • #30 IIS: GWS: NGINX: