Unit-IV/Web Engineering Truba College of Sc. & Tech., Bhopal
Prepared By: Ms. Nandini Sharma(CSE DEPT.) Page 1
INTRODUCTION
XML (Extensible Markup Language) is a flexible way to create common information formats
and share both the format and the data on the World Wide Web, intranets, and elsewhere. For
example, computer makers might agree on a standard or common way to describe the
information about a computer product (processor speed, memory size, and so forth) and then
describe the product information format with XML. Such a standard way of describing data
would enable a user to send an intelligent agent (a program) to each computer maker's Web site,
gather data, and then make a valid comparison. XML can be used by any individual or group of
individuals or companies that wants to share information in a consistent way.
XML, a formal recommendation from the World Wide Web Consortium (W3C), is similar to the
language of today's Web pages, the Hypertext Markup Language (HTML). Both XML and
HTML contain markup symbols to describe the contents of a page or file. HTML, however,
describes the content of a Web page (mainly text and graphic images) only in terms of how it is
to be displayed and interacted with. For example, the letter "p" placed within markup tags starts
a new paragraph. XML describes the content in terms of what data is being described. For
example, the word "phonenum" placed within markup tags could indicate that the data that
follows is a phone number. This means that an XML file can be processed purely as data by
a program, stored with similar data on another computer, or, like an HTML file, displayed.
For example, depending on how the application in the receiving computer wants to handle the
phone number, it could be stored, displayed, or dialled.
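The idea that tagged data can be processed by a program can be sketched in a few lines of Python. This is only an illustration: the <contact> record and its element names are made up for the example, and the standard-library ElementTree parser stands in for whatever the receiving application would use.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the element names are illustrative, not a standard.
record = """
<contact>
  <name>Jane Q. Public</name>
  <phonenum>407-555-1212</phonenum>
</contact>
"""

contact = ET.fromstring(record)
phone = contact.findtext("phonenum")

# The receiving application decides what to do with the value:
# here it is displayed as written and reduced to digits for dialling.
digits_to_dial = "".join(ch for ch in phone if ch.isdigit())
print("display:", phone)
print("dial:", digits_to_dial)
```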
XML is "extensible" because, unlike HTML, the markup symbols are unlimited and self-
defining. XML is actually a simpler and easier-to-use subset of the Standard Generalized
Markup Language (SGML), the standard for how to create a document structure. It is expected
that HTML and XML will be used together in many Web applications. XML markup, for
example, may appear within an HTML page.
Early applications of XML include Microsoft's Channel Definition Format (CDF), which
describes a channel, a portion of a Web site that has been downloaded to your hard disk and is
then updated periodically as information changes. A specific CDF file contains data that
specifies an initial Web page and how frequently it is updated. Another early application is
ChartWare, which uses XML as a way to describe medical charts so that they can be shared by
doctors. Applications related to banking, e-commerce ordering, personal preference profiles,
purchase orders, litigation documents, part lists, and many others are anticipated.
VALIDATING XML FILES
When you validate your XML file, the XML validator will check to see that your file is valid
and well-formed. The XML editor will process XML files that are invalid or not well-formed.
The editor uses heuristics to open a file using the best interpretation of the tagging that it can.
For example, an element with a missing end tag is simply assumed to end at the end of the
document. As you make updates to a file, the editor incrementally reinterprets your document,
changing the highlighting, tree view, and so on. Many formation errors are easy to spot in the
syntax highlighting, so you can easily correct obvious errors on-the-fly. However, there will be
other cases when it will be beneficial to perform formal validation on your documents.
You can validate your file by selecting it in the Navigator view, right-clicking it, and
clicking Validate. Any validation problems are indicated in the Problems view. You can double-
click on individual errors, and you will be taken to the invalid tag in the file, so that you can
make corrections.
Note: If you receive an error message indicating that the Problems view is full, you can increase
the number of error messages allowed by clicking Window > Preferences and
selecting General > Markers. Select the Use marker limits check box and change the number
in the Limit visible items per group field.
You can set up a project's properties so that different types of project resources are automatically
validated when you save them. From a project's pop-up menu, click Properties, then
select Validation. Any validators you can run against your project will be listed in the
Validation page.
The purpose of a Document Type Definition or DTD is to define the structure of a document
encoded in XML (Extensible Markup Language).
It is possible to build and use files containing XML tags without ever defining what tags are
legal. However, if you want to ensure that files conform to a known structure, writing a DTD is
the preferred method.
• A well-formed file is one that obeys the general XML rules for tags: tags must be
properly nested, opening and closing tags must be balanced, and empty tags must end
with '/>'.
• A valid file is not only well-formed, but it must also conform to a publicly available
DTD that specifies which tags it uses, what attributes those tags can contain, and which
tags can occur inside which other tags, among other properties.
The advantage of a valid file is that its contents are more predictable for applications that want to
process or present that file. The DTD ensures that only certain tags can be used in certain places.
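The well-formedness rules can be tried out with a short Python sketch. It assumes the standard-library ElementTree parser, which checks well-formedness only; it does not validate a document against a DTD, so validity needs a separate validating tool.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML (well-formedness only, not validity)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<a><b/></a>"))     # properly nested, empty tag ends with '/>'
print(is_well_formed("<a><b></a></b>"))  # tags are not properly nested
print(is_well_formed("<a><b></a>"))      # closing tag for <b> is missing
```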
DEFINITIONS
We need to review some terminology before proceeding:
• A proper XML name must start with a letter or underscore (_); the remaining characters may be
letters, digits, underscores, hyphens (-), or periods (.).
• A tag is one of the XML constructs used to mark up documents. All tags start with a
less-than symbol (<) and end with a greater-than symbol (>).
• An element is a section of an XML document that acts as a unit. It may be either an empty
element, or it may have content.
• An empty element consists of a single tag of the form
<gi.../>
where gi is the tag type (or "generic identifier"), and the tag may include attributes. Note the
slash before the closing ">"; this signifies an empty tag.
• An opening tag begins a section of an XML document that ends with the
corresponding closing tag. An opening tag has this form:
<gi...>
where gi is the tag type (or "generic identifier"), and the tag may include attributes. A closing
tag has the form:
</gi>
• The content is everything between the opening tag and its corresponding closing tag. The
content may be other elements or just plain text.
The DTD can contain several different types of declarations:
• Element declarations let you specify what kinds of tags can be used, and what (if
anything) can appear inside the contents of the element.
• Attribute declarations define what attributes you can use inside a given element.
• Entity declarations define chunks of fixed text that can be included elsewhere.
• Notation declarations define file types (like JPG and WAV files) so you can refer to
non-XML files like image and sound files.
ELEMENTS WITH MIXED CONTENT
In general, an element can have any mixture of text and other elements as children. You can
specify exactly which elements can be children. If you like, you can even specify that the
children must occur in a given order. You can also specify that the child elements are optional.
So, in the general form of the declaration <!ELEMENT gi (content)>, the content is an
expression syntax—that is, it consists of operators and operands arranged in arbitrarily complex
ways. Let's start with some simple cases to show you the features of a content declaration, but
keep in mind that these features can be used in combination. The simplest case is when an
element a has a single child element b:
<!ELEMENT a (b)>
The above declaration in a DTD means that an element <a>...</a> must contain exactly
one <b> element.
To specify that a child element can occur one or more times, append a plus sign (+) after the
child element name. For example, to say that a <squid> element may contain one or
more <tentacle> elements:
<!ELEMENT squid (tentacle+)>
You can also specify that a child element can occur any number of times, or not at all. Append
an asterisk (*), meaning "zero or more of the previous," after the child element name:
<!ELEMENT lizard (leg*)> <!-- some <lizard>s have no <leg>s -->
The question-mark suffix (?) means the child element is optional: it can occur zero or one time
in the content of the element you're declaring. For example, suppose an <oven> element can
either be empty or contain a <pie> element:
<!ELEMENT oven (pie?)>
If you want a certain sequence of children, name the child elements in a comma-separated list.
For example, suppose a <memo> element must contain exactly one <from> element, then
one <to> element, one <subject>, and one <message> element:
<!ELEMENT memo (from,to,subject,message)>
You can also use the +, *, and ? operators in this declaration. For example, suppose that you want
to require that a <memo> must have <from> and <to> elements, but the <subject> element is
optional, and it can have zero or more <message> elements. You'd then declare it like this:
<!ELEMENT memo (from,to,subject?,message*)>
Sometimes you need to specify that there is a choice of children. The "or" operator (|) can be
used to separate the choices. For example, suppose that a <trophy> element can have either a
child named <bowling> or a child named <tennis>. Here's how you'd declare it:
<!ELEMENT trophy (bowling|tennis)>
You can also apply the usual suffix operators to groups of elements. For example, suppose you
have an element <timerecord> that starts with a required <purpose> element, followed by zero
or more pairs of <start-time> and <end-time> records:
<!ELEMENT timerecord (purpose,(start-time,end-time)*)>
Here's another more general example:
<!ELEMENT stock ((pig|chicken|cow)*)>
The above example says a <stock> element can contain any number of the three child elements,
in any order.
Moreover, you can allow regular, untagged text to be mixed in with your specified child tags by
placing #PCDATA at the start of a list of choices. For example, suppose a <speech> element can
contain any mixture of regular text, and text tagged with the elements <loud> and <soft>:
<!ELEMENT speech (#PCDATA|loud|soft)*>
<!ELEMENT loud (#PCDATA)>
<!ELEMENT soft (#PCDATA)>
So, the content part of the element declaration can be arbitrarily complex. There are some
restrictions on #PCDATA: in mixed content it must come first in the list of choices, and the
group must be followed by an asterisk (*). There are other uncommon features you may need;
refer to the XML standard or a good book on the subject.
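Because a content model is an expression built from sequence, choice, and repetition operators, it behaves much like a regular expression over the names of the child elements. The Python sketch below makes that analogy concrete. It is an illustration under simplifying assumptions, not how a real validating parser works: the helper names are invented for this example, and it handles only element-name models (no #PCDATA, EMPTY, or ANY).

```python
import re

def model_to_regex(model: str) -> "re.Pattern":
    """Translate a simple DTD content model into a regular expression.

    Each child name becomes a group matching that name plus a space;
    ',' (sequence) is dropped because concatenation already means sequence;
    '|', '?', '*', '+', and parentheses carry over unchanged.
    """
    model = re.sub(r"\s+", "", model)
    body = re.sub(r"[A-Za-z_][\w.-]*",
                  lambda m: "(?:" + re.escape(m.group()) + " )",
                  model)
    return re.compile(body.replace(",", ""))

def children_match(model: str, child_names) -> bool:
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in child_names)
    return model_to_regex(model).fullmatch(sequence) is not None

memo = "(from,to,subject?,message*)"
print(children_match(memo, ["from", "to", "message", "message"]))
print(children_match(memo, ["to", "from"]))
print(children_match("(purpose,(start-time,end-time)*)",
                     ["purpose", "start-time", "end-time"]))
```

The first and third checks succeed, while the second fails because the sequence operator fixes the order of <from> and <to>.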
ATTRIBUTE DECLARATIONS
If an element is to have attributes, the names and possible values of those attributes must
be declared in the DTD. Here is the general form:
<!ATTLIST ename {aname atype default} ...>
where ename is the name of the element for which you're defining attributes, aname is the name
of one of that element's possible attributes, atype describes what values it can have,
and default describes whether it has a default value. The last three items can be repeated inside
an <!ATTLIST...> declaration, one group per attribute.
The atype part describing the attribute's type can have three kinds of values:
• The keyword CDATA means that the attribute can have any character string as a value.
For example, suppose you want every <play> element to have a title attribute that can contain
any text, and that attribute is required. Here is the complete attribute declaration:
<!ATTLIST play title CDATA #REQUIRED>
• There are several tokenized attribute types, which are required to have a certain
structure. See tokenized attributes below.
• You can provide a specific set of legal values for the attribute; see enumerated
attributes below.
The last part of the declaration, default, specifies whether the attribute can be omitted, and what
value it will have if omitted. This must be one of the following:
#REQUIRED - The attribute must always be supplied.
#IMPLIED - The attribute can be omitted, and the DTD does not provide a default value.
Anyone reading this file may assume a default value, but that is not the DTD's problem.
"value" - The attribute can be omitted, and the default value is the quoted string that you provide.
#FIXED "value" - The attribute can be omitted, in which case it takes the given "value"; if it is
supplied, it must have exactly that value.
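The four kinds of default can be summarised as a small Python function. This is a sketch only: the declarations dictionary is a hypothetical stand-in for what a validating parser would read from <!ATTLIST ...> declarations, and the attribute names are invented for the example.

```python
def resolve_attributes(supplied, declarations):
    """Apply DTD-style defaults to a dict of supplied attribute values.

    declarations maps an attribute name to (kind, value), where kind is
    "#REQUIRED", "#IMPLIED", "#FIXED", or "default" (a plain quoted default).
    """
    resolved = dict(supplied)
    for name, (kind, value) in declarations.items():
        if kind == "#REQUIRED":
            if name not in supplied:
                raise ValueError(f"missing required attribute {name!r}")
        elif kind == "#FIXED":
            # May be omitted, but if present it must equal the fixed value.
            if supplied.get(name, value) != value:
                raise ValueError(f"attribute {name!r} must be {value!r}")
            resolved[name] = value
        elif kind == "default":
            resolved.setdefault(name, value)
        # "#IMPLIED": nothing to do; the attribute simply stays absent.
    return resolved

declarations = {
    "title": ("#REQUIRED", None),
    "language": ("default", "en"),
    "version": ("#FIXED", "1.0"),
    "subtitle": ("#IMPLIED", None),
}
print(resolve_attributes({"title": "Hamlet"}, declarations))
```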
TOKENIZED ATTRIBUTES
You can restrict an attribute to have only values with a certain structure. Here are the possible
values of the atype part of the attribute declaration for such attributes:
ID
An ID attribute must be a unique identifier for that node. This allows other nodes to refer to it.
The attribute value must also be a valid XML name (see above).
IDREF
An IDREF attribute is a reference to an ID attribute in a different node.
For example, suppose that in your DTD, there is a <sailor> element with an ID-
type nickname attribute, and another element <duty> with an IDREF-type attribute called sailor-
nick. Then if you have an element like this:
<sailor nickname='Bluto'>...</sailor>
then this tag would refer to that element:
<duty sailor-nick='Bluto'>...</duty>
IDREFS
The value of an IDREFS attribute must contain one or more ID references separated by spaces.
Example:
<roster sailor-nicks='Bluto Popeye Olive_Oyl'/>
ENTITY
Use this attribute type to refer to external, non-parsed entities. See the section on notations,
below.
ENTITIES
Like ENTITY, but the attribute can be a list of one or more entity names separated by spaces.
NMTOKEN
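The attribute value must be a name token: one or more name characters (letters, digits, hyphens,
underscores, periods, or colons). Unlike an XML name (see above), a name token may begin with
any of these characters, including a digit.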
NMTOKENS
Like NMTOKEN, but the attribute value can contain one or more name tokens separated by
spaces.
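The ID and IDREF rules above amount to two checks: every ID value is unique, and every reference points at an existing ID. The Python sketch below performs both checks by hand. It assumes the attribute names from the sailor examples and a hypothetical <crew> wrapper element; the standard-library parser does not read attribute types from a DTD, so the types are hard-coded here.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<crew>
  <sailor nickname='Bluto'/>
  <sailor nickname='Popeye'/>
  <duty sailor-nick='Bluto'/>
  <roster sailor-nicks='Bluto Popeye Olive_Oyl'/>
</crew>
""")

# Assumption: these names mirror the examples above; a validating parser
# would learn the attribute types from the DTD instead.
ID_ATTR, IDREF_ATTR, IDREFS_ATTR = "nickname", "sailor-nick", "sailor-nicks"

ids = [el.get(ID_ATTR) for el in doc.iter() if el.get(ID_ATTR) is not None]
duplicates = {value for value in ids if ids.count(value) > 1}

references = []
for el in doc.iter():
    if el.get(IDREF_ATTR) is not None:
        references.append(el.get(IDREF_ATTR))
    # An IDREFS value is a space-separated list of references.
    references.extend(el.get(IDREFS_ATTR, "").split())

dangling = sorted(set(references) - set(ids))
print("duplicate IDs:", duplicates)
print("dangling references:", dangling)
```

In this sample, Olive_Oyl is referenced in the roster but no <sailor> declares that nickname, so it is reported as a dangling reference.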
ENUMERATED ATTRIBUTES
You can specify that attributes must have one of a set of one or more values. Here is the
general form of the atype part of the <!ATTLIST...> declaration:
(value1|value2|...)
For example, suppose you want your <vehicle> element to have a kind attribute that must have a
value of either "car", "truck", or "boat":
<!ATTLIST vehicle
kind (car|truck|boat) #REQUIRED>
You can also supply a default value in quotes. For example:
<!ATTLIST vehicle
kind (car|truck|boat) "car">
DECLARING AND USING ENTITIES
In a DTD, entities come in four flavours:
• A general entity is a chunk of text with a name attached, so you can use the entity as a
sort of shorthand to get the related text substituted in its place.
For example, suppose you are working on a new product called Project Giant-Slayer, but you
know that the marketing department will change the name when it's released to the market. You
could define the current product name as an entity named &product;, and use it everywhere in
your product literature. Then, when the marketing department decides on the final name, you can
change the declaration of the entity and the new name will appear in place of the old
one in all your web pages and brochures.
• A character entity is one of the many standardized special characters that you can use
when you need a character unavailable in your local character set.
• A parameter entity is like a general entity, but it can be used as shorthand for parts of a
content declaration in an element declaration.
• A binary or non-parsed entity represents an external file that is not in XML format.
GENERAL ENTITIES
General entities have names of the form &name;, where the name follows the usual rules for
XML names (above).
To declare a general entity, use a declaration of this general form in your DTD:
<!ENTITY ename "text">
where ename is the name of the entity you are defining (without the initial & and final ;),
and text is the text you want substituted for that entity.
For example, to define an entity named &cr; with your copyright string, you might use a
declaration like this:
<!ENTITY cr "Copyright (C) 1763 Cotton Mather LLP">
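To see a general entity being substituted, the following Python sketch declares the hypothetical &product; entity from the earlier example in an internal DTD subset and parses a one-line document. It relies on the standard-library ElementTree parser expanding internally declared entities; the <brochure> element name is invented for the illustration.

```python
import xml.etree.ElementTree as ET

document = """<!DOCTYPE brochure [
  <!ENTITY product "Project Giant-Slayer">
]>
<brochure>Introducing &product;, available soon.</brochure>"""

brochure = ET.fromstring(document)
# The parser replaces &product; with the declared text.
print(brochure.text)
```

Changing the quoted string in the <!ENTITY ...> declaration changes every place where &product; is used.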
CHARACTER ENTITIES
To use special characters in your document, you can use the form &#n; where n is
the decimal number of the character you want. A table of these entities is online
at http://www.w3.org/TR/html401/sgml/entities.html.
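As a quick check of the &#n; form, this Python sketch parses a fragment that writes the copyright sign twice: once with its decimal number (169) and once in the hexadecimal form &#xh;, which XML also accepts. The <notice> element is just a container for the example.

```python
import xml.etree.ElementTree as ET

notice = ET.fromstring("<notice>&#169; and &#xA9;</notice>")
# Both references resolve to the same character, U+00A9.
print(notice.text)
```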
PARAMETER ENTITIES
The purpose of a parameter entity is to serve as a shorthand for some or all of the content part of
an element declaration.
The general form is:
<!ENTITY % ename "text">
For example, suppose you have a lot of tags whose content model is "(#PCDATA|bold|ital)*".
You could define an entity like this:
<!ENTITY % bitext "(#PCDATA|bold|ital)*">
Then, to define an element <excuse> with that content:
<!ELEMENT excuse %bitext;>
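Conceptually, a parameter entity reference is a text substitution performed inside the DTD. The Python sketch below imitates that substitution with a regular expression so the result can be inspected. It is a simplification for illustration only: the function name is invented, and a real parser applies extra rules (such as where references may appear) that are ignored here.

```python
import re

def expand_parameter_entities(declaration: str, entities: dict) -> str:
    """Replace each %name; reference with its declared replacement text."""
    return re.sub(r"%([A-Za-z_][\w.-]*);",
                  lambda m: entities[m.group(1)],
                  declaration)

entities = {"bitext": "(#PCDATA|bold|ital)*"}
expanded = expand_parameter_entities("<!ELEMENT excuse %bitext;>", entities)
print(expanded)
```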
BINARY (NON-PARSED) ENTITIES
This last type of entity represents a file, like an image or sound file, that is not XML. To declare
such an entity:
<!ENTITY ename SYSTEM "url" NDATA nname>
where ename is the name of the entity you are defining, url is the URL where the file can be
found, and nname is the name of the notation that the file uses. See the section on
notations below for an example.
NOTATION DECLARATIONS
The purpose of a notation declaration is to define the format of some external non-XML file,
such as a sound or image file, so you can refer to such files in your document.
The general form of a notation declaration can be either of these:
<!NOTATION nname PUBLIC std>
<!NOTATION nname SYSTEM url>
where nname is the name you are giving to the notation; std is the published name of a public
notation, and url is a reference to a program that can render a file in the given notation.
There are four steps to connecting an attribute to a notation:
1. Declare the notation. For example:
<!NOTATION jpeg PUBLIC "JPG 1.0">
2. Declare the entity. For example:
<!ENTITY bogie-pic SYSTEM
"http://stars.com/bogart.jpg" NDATA jpeg>
3. Declare the attribute as type ENTITY. For example:
<!ATTLIST star-bio pin-shot ENTITY #REQUIRED>
4. Use the attribute:
<star-bio pin-shot="bogie-pic">...</star-bio>
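To show how an application might follow the chain from attribute to entity to notation, here is a Python sketch using the standard-library expat bindings, which report notation and unparsed-entity declarations through handler callbacks. The declarations are the ones from the steps above, placed in an internal DTD subset; the dictionaries and handler names are choices made for this example.

```python
import xml.parsers.expat

document = """<!DOCTYPE star-bio [
  <!NOTATION jpeg PUBLIC "JPG 1.0">
  <!ENTITY bogie-pic SYSTEM "http://stars.com/bogart.jpg" NDATA jpeg>
  <!ATTLIST star-bio pin-shot ENTITY #REQUIRED>
]>
<star-bio pin-shot="bogie-pic"></star-bio>"""

notations = {}  # notation name -> public identifier
entities = {}   # entity name -> (URL, notation name)
pictures = []   # entity names found in pin-shot attributes

def on_notation(name, base, system_id, public_id):
    notations[name] = public_id

def on_unparsed_entity(name, base, system_id, public_id, notation_name):
    entities[name] = (system_id, notation_name)

def on_start(tag, attributes):
    if "pin-shot" in attributes:
        pictures.append(attributes["pin-shot"])

parser = xml.parsers.expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.UnparsedEntityDeclHandler = on_unparsed_entity
parser.StartElementHandler = on_start
parser.Parse(document, True)

# Resolve each attribute value to the file it names and that file's format.
for name in pictures:
    url, notation = entities[name]
    print(name, "->", url, "in notation", notation, "(" + notations[notation] + ")")
```

The parser never fetches or interprets the image; it only hands the application the names, leaving the application to decide how to render a file in that notation.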
In a way, you could argue that the most widespread use of XML is XHTML. Because
XHTML is simply HTML 4.0 reworked as XML, many HTML 4.0 sites are actually using an invalid
form of XHTML.
But the benefit of XML is not that it already exists as XHTML, but that you can create
web documents from XML using XSLT to transform your documents into HTML. You can then
send your XML to an XSLT processor on the web server and serve that result to the web
browser. This makes your documentation available in whatever format you need it to be in.
XML AND CONTENT MANAGEMENT
Ironically, with most websites that use XML, the web designers and content developers might
not even know that XML is there. This is because there is generally a CMS or content
management system that sits in front of the XML to make it easier for the content writers to
write their web content without worrying about how to write HTML or design web pages.
XML AND DOCUMENTATION
Many companies are moving to XML to write their internal documentation. The most common
XML platform for this is DocBook. The advantage of XML for documentation is that it can be
used to define the common traits in books, magazines, stories, advertisements, and so forth. And
DocBook already has that type of information defined.
The best thing about XML for documentation is that it is easy for humans to understand: both
the actual documentation and the XML code surrounding it are readable. XML can be used for
any type of documentation, from a publishing house's books to marketing materials.
Here is an example of documentation written in XML:
<howto>
<title>How to Write a Mail Link</title>
<author>Jennifer Kyrnin, Web Design Guide</author>
<description>
<paragraph>
Use a HTML tag to allow your readers to send email directly from your Web site.
</paragraph>
</description>
<directions>
<step>Write a link as usual <a href="">email me</a></step>
<step>Where you would normally put a URL, put the code "mailto" <a href="mailto:">email
me</a></step>
<step>Then put your email address after the colon <a
href="mailto:html@aboutguide.com">email me</a></step>
</directions>
</howto>
As you can see, both the data and the XML are readable and understandable. The content is also
in an order that would be expected by a human reading the document.
XML AND DATABASE DEVELOPMENT
Databases are a natural use for XML, because XML is all about data. Unlike XML for
documentation, XML for databases does not need to be readable by humans. The data is simply
written in such a way to allow machines to read it and make it accessible to a database.
Here's XML that might be loaded into a database:
<item number="00001">
<name>
<first>Jane</first>
<middle>Q</middle>
<last>Public</last>
</name>
<phone type="voice">
<areacode>407</areacode>
<number>555-1212</number>
</phone>
<phone type="fax">
<areacode>407</areacode>
<number>555-1213</number>
</phone>
<email>jpublic@gmail.com</email>
</item>
Unlike the document XML, it's not necessary that this be easily readable by humans. Since it is
meant to be input into a database, it is only important that it be processable by a computer.
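As a rough illustration of loading such a record, the Python sketch below parses the <item> example and inserts it into an in-memory SQLite database. The two-table schema is a hypothetical choice made for this sketch; a real application would map the XML onto whatever schema it already has.

```python
import sqlite3
import xml.etree.ElementTree as ET

record = """
<item number="00001">
  <name><first>Jane</first><middle>Q</middle><last>Public</last></name>
  <phone type="voice"><areacode>407</areacode><number>555-1212</number></phone>
  <phone type="fax"><areacode>407</areacode><number>555-1213</number></phone>
  <email>jpublic@gmail.com</email>
</item>
"""

item = ET.fromstring(record)
db = sqlite3.connect(":memory:")

# Hypothetical schema: one row per person, one row per phone number.
db.execute("CREATE TABLE person (number TEXT PRIMARY KEY, first TEXT, last TEXT, email TEXT)")
db.execute("CREATE TABLE phone (person TEXT, type TEXT, areacode TEXT, number TEXT)")

db.execute(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    (item.get("number"), item.findtext("name/first"),
     item.findtext("name/last"), item.findtext("email")),
)
for phone in item.findall("phone"):
    db.execute(
        "INSERT INTO phone VALUES (?, ?, ?, ?)",
        (item.get("number"), phone.get("type"),
         phone.findtext("areacode"), phone.findtext("number")),
    )

rows = db.execute("SELECT type, areacode, number FROM phone ORDER BY type").fetchall()
print(rows)
```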
HTML versus XML
The most salient difference between HTML and XML is that HTML describes presentation and
XML describes content. An HTML document rendered in a web browser is human readable.
XML is aimed toward being both human and machine readable.
Consider the following HTML.
<html>
<head><title>Books</title></head>
<body>
<h2>Books</h2>
<hr>
<em>Sense and Sensibility</em>, <b>Jane Austen</b>, 1811<br>
<em>Pride and Prejudice</em>, <b>Jane Austen</b>, 1813<br>
<em>Alice in Wonderland</em>, <b>Lewis Carroll</b>, 1866<br>
<em>Through the Looking Glass</em>, <b>Lewis Carroll</b>, 1872<br>
</body>
</html>
Rendered in a browser, the previous HTML shows a "Books" heading above a horizontal rule,
followed by one line per book with the title in italics and the author in bold.
The HTML above describes how bibliography information is to be presented and formatted for a
human to view in a web browser. Knowing that Sense and Sensibility is enclosed in italic tags
does not however help a program determine that it is the title of a book. XML attempts to
describe web data to address this void.
The following is XML describing the contents of the books HTML page above.
<books>
<book>
<title>Sense and Sensibility</title>
<author>Jane Austen</author>
<year>1811</year>
</book>
<book>
<title>Pride and Prejudice</title>
<author>Jane Austen</author>
<year>1813</year>
</book>
<book>
<title>Alice in Wonderland</title>
<author>Lewis Carroll</author>
<year>1866</year>
</book>
<book>
<title>Through the Looking Glass</title>
<author>Lewis Carroll</author>
<year>1872</year>
</book>
</books>
A program parsing this data can take advantage of the fact that all book titles are enclosed
in <title> tags. Where would such a program find such information? An XML document may
contain an optional description of its grammar. A grammar describes which tags are used in the
XML document and how such tags can be nested. A grammar is a schema or road map for the
XML document. Originally an XML grammar was specified in a DTD (Document Type
Definition). A newer standard, XML Schema (XSD), has since been adopted. XML Schema
addresses some of the limitations of DTDs.
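For instance, a program can collect every book title without guessing at formatting. The Python sketch below uses the standard-library ElementTree parser on a shortened copy of the books document (two of the four books, to keep the example small).

```python
import xml.etree.ElementTree as ET

books_xml = """
<books>
  <book>
    <title>Sense and Sensibility</title>
    <author>Jane Austen</author>
    <year>1811</year>
  </book>
  <book>
    <title>Alice in Wonderland</title>
    <author>Lewis Carroll</author>
    <year>1866</year>
  </book>
</books>
"""

books = ET.fromstring(books_xml)
# Titles are found by tag name, not by how they happen to be formatted.
titles = [book.findtext("title") for book in books.findall("book")]
print(titles)
```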
As can be seen above, XML does not contain any information indicating how the document
should be rendered in a browser. In other words, XML separates data from presentation. The beauty of
this feature is that the same data can be presented in a variety of ways without having to replicate
any data (e.g., consider making book titles bold and authors italic).
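The following Python sketch illustrates the point by rendering one copy of the data in two styles: the original one (titles emphasised, authors bold) and the alternative suggested above (titles bold, authors italic). The render function and its parameters are invented for this example, and only two books are included to keep it short.

```python
import html
import xml.etree.ElementTree as ET

books_xml = """
<books>
  <book><title>Sense and Sensibility</title><author>Jane Austen</author><year>1811</year></book>
  <book><title>Pride and Prejudice</title><author>Jane Austen</author><year>1813</year></book>
</books>
"""

def render(books, title_tag, author_tag):
    """Produce one HTML line per book, wrapping title and author in the given tags."""
    lines = []
    for book in books.findall("book"):
        title = html.escape(book.findtext("title"))
        author = html.escape(book.findtext("author"))
        year = html.escape(book.findtext("year"))
        lines.append(
            f"<{title_tag}>{title}</{title_tag}>, "
            f"<{author_tag}>{author}</{author_tag}>, {year}<br>"
        )
    return "\n".join(lines)

books = ET.fromstring(books_xml)
original_style = render(books, "em", "b")
alternate_style = render(books, "b", "i")
print(original_style)
print(alternate_style)
```

Only the presentation code changes between the two outputs; the XML data is written once.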
XML SYNTAX DIFFERS FROM HTML
• New tags may be defined at will
• Tags may be nested to arbitrary depth
• May contain an optional description of its grammar
XML can be used to store data inside HTML documents. XML data can be stored inside HTML
pages as "Data Islands", a feature of older versions of Internet Explorer. HTML provides a way
to format and display the data, while XML stores the data inside the HTML document. The data
contained in an XML file is of little value unless it can be displayed, and HTML files are used
for that purpose.
The simple way to insert XML code into an HTML file is to use the <xml> tag. The <xml> tag
informs the browser that the contents are to be parsed and interpreted using the XML parser.
Like most other HTML tags, the <xml> tag has attributes. The most important attribute is ID,
which provides for the unique naming of the code. The contents of the <xml> tag come from one
of two sources: inline XML code or an imported XML file.
• If the code appears in the current location, it is said to be inline.
Example
Embedding XML code inside an HTML File.
<html>
<xml id="msg">
<message>
<to> Visitors </to>
<from> Author </from>
<Subject> XML Code Islands </Subject>
<body> In this example, XML code is embedded inside HTML code
</body>
</message>
</xml>
</html>
• The more efficient way is to create a separate XML file and import it. You can easily do so by
using the SRC attribute of the <xml> tag.
Syntax
<xml id="msg" src="example1.xml">
</xml>
DATA BINDING
Data binding involves mapping, synchronizing, and moving data from a data source, usually on
a remote server, to an end user's local system where the user can manipulate the data. Using data
binding means that after a remote server transmits data, the user can perform some minor data
manipulations on their own local system. The remote server does not have to perform all the data
manipulations nor repeatedly transmit variations of the same data.
• Data binding involves moving data from a data source to a local system, and then
manipulating the data (for example, searching, sorting, and filtering it) on the local system.
• When you bind data in this way, you do not have to request that the remote server
manipulate the data and then retransmit the results; you can perform some data
manipulation locally.
• In data binding, the data source provides the data, and the appropriate applications
retrieve and synchronize the data and present it on the terminal screen.
• If the data changes, the applications are written so they can alter their presentation to
reflect those changes.
• Data binding is used to reduce traffic on the network and to reduce the work of the Web
server, especially for minor data manipulations.
• Binding data also separates the task of maintaining data from the tasks of developing and
maintaining binding and presentation programs.
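The local-manipulation idea can be sketched in Python. The example below pretends that an XML document has already been transmitted once by the server, then sorts, filters, and searches it locally with no further requests. The records are made-up sample data, and this sketch shows only the principle, not any particular browser's data-binding mechanism.

```python
import xml.etree.ElementTree as ET

# Pretend this document was transmitted once by the remote server.
received = """<employees>
  <employee id="1"><name>Mohit Jain</name><city>Karnal</city></employee>
  <employee id="2"><name>Rahul Kapoor</name><city>Ambala</city></employee>
  <employee id="3"><name>Asha Verma</name><city>Karnal</city></employee>
</employees>"""

records = [
    {"id": e.get("id"), "name": e.findtext("name"), "city": e.findtext("city")}
    for e in ET.fromstring(received).findall("employee")
]

# All of the following happens locally, without contacting the server again.
sorted_names = [r["name"] for r in sorted(records, key=lambda r: r["name"])]
in_karnal = [r["name"] for r in records if r["city"] == "Karnal"]
search_hits = [r["name"] for r in records if "kapoor" in r["name"].lower()]

print(sorted_names)
print(in_karnal)
print(search_hits)
```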
CONVERTING XML TO HTML FOR DISPLAY
There exist several ways to convert XML to HTML for display on the Web.
Using HTML alone
If your XML file is of a simple tabular form only two levels deep then you can display XML
files using HTML alone.
Using HTML + CSS
This is a substantially more powerful way to transform XML to HTML than HTML alone, but
lacks the full power and flexibility of the methods listed below.
Using HTML with JavaScript
Fully general XML files of any type and complexity can be processed and displayed using a
combination of HTML and JavaScript. The advantage of this approach is that any possible
transformation and display can be carried out, because JavaScript is a fully general-purpose
programming language. The disadvantage is that it often requires large, complex, and very
detailed programs using recursive functions (functions that call themselves), which
many people find difficult to grasp.
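To give a feel for the recursive style this approach needs, here is a small sketch that turns any XML tree into nested HTML lists. It is written in Python rather than JavaScript to match the other examples in these notes; the function name and the choice of <ul>/<li> output are assumptions made for the illustration.

```python
import html
import xml.etree.ElementTree as ET

def to_html_list(element):
    """Recursively render an element and its descendants as nested <ul> lists."""
    label = html.escape(element.tag)
    text = (element.text or "").strip()
    if text:
        label += ": " + html.escape(text)
    # The function calls itself once for every child element.
    children = "".join(to_html_list(child) for child in element)
    if children:
        return f"<li>{label}<ul>{children}</ul></li>"
    return f"<li>{label}</li>"

doc = ET.fromstring(
    "<book><title>Pride and Prejudice</title><author>Jane Austen</author></book>"
)
page = "<ul>" + to_html_list(doc) + "</ul>"
print(page)
```

The recursion handles documents of any depth, which is exactly what makes the general approach powerful and also harder to follow than a fixed template.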
Using XSL and XPath
XSL (eXtensible Stylesheet Language) is often considered the best way to convert XML to HTML.
The advantages are that the language is very compact, very sophisticated HTML can be
displayed with relatively small programs, it is easy to re-purpose XML to serve a variety of
purposes, it is non-procedural in that you generally specify only what you wish to accomplish as
opposed to detailed instructions as to how to achieve it, and it greatly reduces or eliminates the
need for recursive functions. The disadvantages are that it requires a very different mindset to
use, and the language is still evolving, so many XSL processors on Web servers are out of
date and newer ones must sometimes be invoked through DOS.
DISPLAYING XML DOCUMENT USING XSL
XSL is a language for expressing stylesheets. It consists of two parts:
• A language for transforming XML documents (XSLT)
• An XML vocabulary for specifying formatting semantics
An XSL stylesheet specifies the presentation of a class of XML documents by describing how an
instance of the class is transformed into an XML document that uses the formatting vocabulary.
Like a CSS style sheet, an XSL style sheet is linked to an XML document and tells the browser
how to display each of the document's elements. An XML document with an attached style sheet
can be opened directly in Internet Explorer. You don't need to use an HTML page to access and
display the data.
An XML document can be displayed with either a CSS or an XSL style sheet. The example that
follows uses a CSS style sheet. There are two basic steps:
• Create the style sheet file.
• Link the style sheet to the XML document.
CREATING THE STYLE SHEET FILE
A CSS style sheet is a plain text file with a .css extension that contains a set of rules telling the
web browser how to format and display the elements in a specific XML document. You can create
a CSS file using your favorite text editor, such as Notepad or WordPad, or any other text or HTML
editor, as shown below:
general.css
employees
{
background-color: #ffffff;
width: 100%;
}
id
{
display: block; margin-bottom: 30pt; margin-left: 0;
}
name
{
color: #FF0000;
font-size: 20pt;
}
city,state,zipcode
{
color: #0000FF;
font-size: 20pt;
}
LINKING
To link to a style sheet you use an XML processing instruction to associate the style sheet with the
current document. This statement should occur before the root node of the document.
<?xml-stylesheet type="text/css" href="styles/general.css"?>
The two attributes of the processing instruction are as follows:
href
The URL for the style sheet.
type
The MIME type of the style sheet being linked, which in this case is text/css. For an XSL style
sheet, text/xsl is conventionally used instead.
MIME stands for Multipurpose Internet Mail Extensions. It is a standard that defines how to make
systems aware of the type of content being included in e-mail messages.
The CSS file is attached to the XML document as shown below:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!--This xml file represent the details of an employee-->
<?xml-stylesheet type="text/css" href="styles/general.css"?>
<employees>
<employee id="1">
<name>
<firstName>Mohit</firstName>
<lastName>Jain</lastName>
</name>
<city>Karnal</city>
<state>Haryana</state>
<zipcode>98122</zipcode>
</employee>
<employee id="2">
<name>
<firstName>Rahul</firstName>
<lastName>Kapoor</lastName>
</name>
<city>Ambala</city>
<state>Haryana</state>
<zipcode>98112</zipcode>
</employee>
</employees>
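A program that receives such a document can discover the attached style sheet by reading the
processing instruction in the prolog. The following is a minimal sketch in Python, assuming the
standard xml.dom.minidom module; the shortened XML_TEXT sample and the find_stylesheet helper are
illustrative names, not part of the original notes.

```python
import re
from xml.dom import minidom

# Hypothetical, shortened copy of the employee document.
XML_TEXT = """<?xml version="1.0"?>
<!--This xml file represents the details of an employee-->
<?xml-stylesheet type="text/xsl" href="styles/general.xsl"?>
<employees>
  <employee id="1"><city>Karnal</city></employee>
</employees>
"""


def find_stylesheet(xml_text):
    """Return the pseudo-attributes of the first xml-stylesheet instruction."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        # Instructions in the prolog are children of the document node itself.
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # node.data holds the raw text: type="..." href="..."
            return dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', node.data))
    return None


sheet = find_stylesheet(XML_TEXT)
print(sheet)
```

The pseudo-attributes are pulled out with a regular expression because, to the XML parser, the
content of a processing instruction is plain text rather than real attributes.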
REWRITING
Let's say you have a proxy running on www.myproxy.com and have proxied the site
www.remotesite.com to the directory /remote. The links on the proxied pages of
www.remotesite.com don't know they are being proxied, which can create some problems. Let's
start by looking at the three different link types.
• <a href="myfile.html"> - This link will work
• <a href="/myfile.html"> - This link won't work
• <a href="http://www.remotesite.com/myfile.html"> - This link won't work
The first link will work since it is relative to the current page.
The second link is relative to the server root, so the browser will request
http://www.myproxy.com/myfile.html. That file isn't found, since only requests under the directory
/remote are forwarded to www.remotesite.com. We have to change the link so that it points to
/remote/myfile.html.
The third link is absolute, so the browser will follow it to
http://www.remotesite.com/myfile.html. This works correctly, but only if the remote site is
visible to the client. The site being proxied is often an internal server that is not accessible
from the outside, so we have to change the link to http://www.myproxy.com/remote/myfile.html.
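The three cases above can be sketched as a single mapping function. This is an illustration in
Python under the host names and directory used in the example, not the proxy's actual code; the
names rewrite_link, PROXY_HOST, REMOTE_HOST and MOUNT_DIR are assumptions made for the sketch.

```python
from urllib.parse import urlsplit

PROXY_HOST = "www.myproxy.com"      # host the proxy runs on
REMOTE_HOST = "www.remotesite.com"  # site being proxied
MOUNT_DIR = "/remote"               # directory the remote site is mapped to


def rewrite_link(href):
    """Map a link found on a proxied page to one that goes through the proxy."""
    parts = urlsplit(href)
    if parts.netloc:
        # Absolute link: rewrite it only when it points at the proxied site.
        if parts.netloc == REMOTE_HOST:
            return parts._replace(netloc=PROXY_HOST,
                                  path=MOUNT_DIR + parts.path).geturl()
        return href
    if href.startswith("/"):
        # Root-relative link: prefix the proxy directory.
        return MOUNT_DIR + href
    # Page-relative link: it already resolves below /remote, so leave it alone.
    return href


for link in ("myfile.html", "/myfile.html", "http://www.remotesite.com/myfile.html"):
    print(link, "->", rewrite_link(link))
```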
The rewrite filter
As you should already have learned, the proxy is built using a filter that proxies
all incoming requests. To make the rewriting work, a second filter is supplied: the rewrite filter.
The proxy filter works perfectly well without the rewrite filter and has no knowledge
of the possibility of links being rewritten. This makes it just as easy to run the proxy with
rewriting as without it.
How it works
The current rewriting is done by parsing the HTML, JavaScript and CSS files, looking for links
using regular expressions.
One reason the proxy uses regular expressions is that it can then use the same type of
parsing to find links in both CSS and HTML. The other reason for using regular expressions
rather than an XML parser is that many pages aren't written in XHTML. Since there are so many
pages that are not XML-compatible, a standard XML parser wouldn't work. There are other options,
such as javax.swing.text.html, and changing from regular expressions is being considered for
the next versions. There will have to be some measurable performance benefit for doing so,
however.
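To make the idea concrete, here is a small Python sketch of regular-expression link finding that
works on markup an XML parser would reject. The two patterns and the find_links helper are
simplified assumptions for illustration; the proxy's real expressions are not shown in these notes.

```python
import re

# href="..." or src="..." in HTML, with single or double quotes.
HTML_LINK = re.compile(r"""\b(?:href|src)\s*=\s*["']([^"']+)["']""", re.IGNORECASE)
# url(...) in CSS, with or without quotes.
CSS_LINK = re.compile(r"""url\(\s*["']?([^"')]+)["']?\s*\)""", re.IGNORECASE)


def find_links(text):
    """Return every link target found by either pattern, in order of appearance."""
    hits = [(m.start(), m.group(1))
            for pattern in (HTML_LINK, CSS_LINK)
            for m in pattern.finditer(text)]
    return [link for _, link in sorted(hits)]


# Not well-formed XML: <P>, <a>, <br> and <img> are never closed.
page = '<P><a href="/myfile.html">file<br><img src=\'logo.gif\'>'
style = "body { background: url(/img/bg.png); }"

print(find_links(page))
print(find_links(style))
```

The same function handles both inputs, which is the benefit the text describes: one kind of
parsing covers HTML and CSS alike.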
Turn on rewrite
web.xml
By default, the proxy does not do any link rewriting, but you can easily turn rewriting on by
adding the rewrite filter. An alternate web.xml with rewriting enabled is supplied with the proxy.
The file is called web_rewriting.xml and can be found in
TOMCAT_HOME/webapps/J2EP_INSTALL_DIR/WEB-INF/. To enable rewriting, rename
web_rewriting.xml to web.xml, making sure that you overwrite the existing file.
data.xml (config file)
Here is the good news: you don't have to do anything (almost). If you have mapped a site for
the proxy, all of its links except the absolute ones will be rewritten. Absolute links aren't
rewritten by default because you might want to leave them as they are and let the user
follow those links.
You will probably want to turn absolute link rewriting on, however. To do this, simply add
the parameter isRewriting="true" to the server. All absolute links found on a page will be
matched to see if we have them mapped in the config. If we have the server mapped
and isRewriting is set to "true", absolute links for the server will be rewritten.
Not all servers support isRewriting="true"; for instance, RoundRobinCluster will always
do rewriting. Consult the documentation of the servers for more information.
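The decision described above can be summarized in a few lines. The sketch below is in Python and
only models the rule from the text; the Server class, its fields and the host names in the sample
list are hypothetical stand-ins, not the proxy's configuration format.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Server:
    """Hypothetical stand-in for one mapped server entry in the config."""
    host: str
    directory: str
    is_rewriting: bool = False  # mirrors isRewriting="true" in the text


def should_rewrite(href, servers):
    """Decide whether a link found on a proxied page gets rewritten."""
    host = urlsplit(href).netloc
    if not host:
        # Non-absolute links are always rewritten for a mapped site.
        return True
    # Absolute link: the server must be mapped AND have rewriting switched on.
    return any(s.host == host and s.is_rewriting for s in servers)


servers = [
    Server("www.remotesite.com", "/remote", is_rewriting=True),
    Server("intranet.example", "/intra"),  # mapped, but isRewriting not set
]
print(should_rewrite("http://www.remotesite.com/a.html", servers))
print(should_rewrite("http://intranet.example/a.html", servers))
```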
Other forms of rewrites
There are two more issues with rewriting. One is when the server says a page has moved and
sends a location for the new page; we have to rewrite that location. The other is when a
cookie is sent from the server; we have to change the cookie so that it is set for the correct
directory. Both of these issues are handled by the proxy without any extra configuration.
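As a rough illustration of these two adjustments, the Python sketch below rewrites a redirect
target and the Path attribute of a cookie. It assumes the same example hosts and the /remote
directory used earlier; the function names are made up for the sketch and the proxy's own
implementation may differ.

```python
import re
from urllib.parse import urlsplit

PROXY_HOST = "www.myproxy.com"
REMOTE_HOST = "www.remotesite.com"
MOUNT_DIR = "/remote"


def rewrite_location(location):
    """Point a redirect target at the proxy instead of the remote site."""
    parts = urlsplit(location)
    if parts.netloc == REMOTE_HOST:
        return parts._replace(netloc=PROXY_HOST,
                              path=MOUNT_DIR + parts.path).geturl()
    if not parts.netloc and location.startswith("/"):
        return MOUNT_DIR + location
    return location


def rewrite_cookie_path(set_cookie):
    """Prefix the Path attribute of a Set-Cookie value with the proxy directory."""
    return re.sub(r"(?i)(;\s*path=)(/[^;]*)",
                  lambda m: m.group(1) + MOUNT_DIR + m.group(2),
                  set_cookie)


print(rewrite_location("http://www.remotesite.com/new/page.html"))
print(rewrite_cookie_path("session=abc; Path=/; HttpOnly"))
```

Without the cookie change, a cookie set for the remote site's root path would also be sent to
every other site mapped on the same proxy host.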
HTML, SGML, and XML
First, you should know that SGML (Standard Generalized Markup Language) is the basis for
both HTML and XML. SGML is an international standard (ISO 8879) that was published in
1986.
Second, you need to know that XHTML is XML. "XHTML 1.0 is a reformulation of HTML
4.01 in XML, and combines the strength of HTML 4 with the power of XML."
Third, XML is NOT itself a markup language with a fixed set of tags; it is a set of rules for
creating XML-based languages. Thus, XHTML 1.0 uses the tags of HTML 4.01 but follows the rules
of XML.
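One practical consequence of "XHTML is XML" is that an XHTML fragment can be read by any XML
parser, while the equivalent HTML 4.01 fragment may be rejected. A small Python sketch, using the
standard xml.etree.ElementTree module (the is_well_formed helper and the two sample strings are
illustrative):

```python
import xml.etree.ElementTree as ET

xhtml = "<p>First line<br/>second line</p>"  # <br/> is closed, as XML requires
html = "<p>First line<br>second line</p>"    # fine in HTML 4.01, but <br> is never closed


def is_well_formed(markup):
    """Return True if the markup can be parsed as XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(xhtml))
print(is_well_formed(html))
```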
The Document
A typical document is made up of three layers:
• Structure
• Content
• Style
Structure
Structure would be the document's title, author, paragraphs, topics, chapters, head, body, etc.
Content
Content is the actual information that makes up a title, author, paragraphs, etc.
Style
Style is how the content within the structural elements is displayed, such as font color, type
and size, text alignment, etc.
Markup
HTML, SGML, and XML all mark up content using tags. The difference is that SGML and XML
mainly deal with the relationship between content and structure: the structural tags that mark up
the content are not predefined (you can make up your own language), and style is kept
TOTALLY separate. HTML, on the other hand, is a mix of content marked up with both
structural and stylistic tags, and its tags are predefined by the HTML language.
By mixing structure, content and style, you limit yourself to one form of presentation, and in
HTML's case that would be in a limited group of browsers for the World Wide Web.
By separating structure and content from style, you can take one file and present it in multiple
forms. XML can be transformed to HTML/XHTML and displayed on the Web, the information can be
transformed and published to paper, and the data can be read by any XML-aware browser or
application.
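The idea of taking one file and presenting it in multiple forms can be sketched in a few lines of
Python. The sample below is an assumption-laden illustration: SOURCE is a shortened employee
record, and records, to_html and to_text are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from html import escape

# One source file: structure and content only, no styling.
SOURCE = """<employees>
  <employee id="1">
    <name><firstName>Mohit</firstName><lastName>Jain</lastName></name>
    <city>Karnal</city>
  </employee>
</employees>"""


def records(xml_text):
    """Yield (full name, city) pairs; no presentation decisions are made here."""
    root = ET.fromstring(xml_text)
    for emp in root.iter("employee"):
        full_name = f"{emp.findtext('name/firstName')} {emp.findtext('name/lastName')}"
        yield full_name, emp.findtext("city")


def to_html(xml_text):
    """Presentation 1: an HTML list for the Web."""
    items = "".join(f"<li><b>{escape(n)}</b>, {escape(c)}</li>"
                    for n, c in records(xml_text))
    return f"<ul>{items}</ul>"


def to_text(xml_text):
    """Presentation 2: plain text, for example for a printed report."""
    return "\n".join(f"{n} ({c})" for n, c in records(xml_text))


print(to_html(SOURCE))
print(to_text(SOURCE))
```

Both outputs come from the same source; only the presentation functions differ.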
SGML (Standard Generalized Markup Language)
Historically, electronic publishing applications such as Microsoft Word, Adobe PageMaker or
QuarkXPress "marked up" documents in a proprietary format that was recognized only by that
particular application. The document markup for both structure and style was mixed in with the
content and was published to only one medium, the printed page.
These programs and their proprietary markup had no capability to define the appearance of the
information for any medium other than paper, and did not describe the actual content of the
document very well beyond paragraphs, headings and titles. The file format could not be
read by or exchanged with other programs; it was useful only within the application that
created it.
Because SGML is a nonproprietary international standard, it allows you to create documents that
are independent of any specific hardware or software. The document structure (what elements
are used and their relationship to each other) is described in a file called the DTD (Document
Type Definition). The DTD defines the relationships between a document's elements, creating a
consistent, logical structure for each document.
SGML is good for handling large-scale, long-term information management needs and has been
around for more than a decade as the language of defense contractors and the electronic
publishing industry. Because SGML is very large, powerful, and complex it is hard to learn and
understand and is not well suited for the Web environment.
XML (Extensible Markup Language)
XML is a "restricted form of SGML" which removes some of the complexity of SGML. XML,
like SGML, retains the flexibility of describing customized markup languages with a
user-defined document structure (DTD) in a non-proprietary file format for both storage and
exchange of text and data, both on and off the Web.
As mentioned before, XML separates structure and content from style, and the structural markup
tags can actually describe the content because they can be customized for each XML-based
markup language. A good example of this is the Mathematical Markup Language (MathML), which
is an XML application for describing mathematical notation and capturing both its structure and
content.
Until MathML, the ability to communicate mathematical expressions on the Web was limited to
mainly displaying images (JPG or GIF) of the scientific notation or posting the document as a
PDF file. MathML allows the information to be displayed on the Web, and makes it available for
searching, indexing, or reuse in other applications.
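Because MathML is ordinary XML, a program can search an expression instead of treating it as an
opaque image. The Python sketch below assumes a small hand-written MathML fragment for
x squared plus y; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# x squared plus y, written with MathML presentation elements.
MATHML = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msup><mi>x</mi><mn>2</mn></msup>
    <mo>+</mo>
    <mi>y</mi>
  </mrow>
</math>"""

NS = {"m": "http://www.w3.org/1998/Math/MathML"}
root = ET.fromstring(MATHML)

# Index the expression: which identifiers and operators does it contain?
identifiers = [e.text for e in root.iterfind(".//m:mi", NS)]
operators = [e.text for e in root.iterfind(".//m:mo", NS)]

print(identifiers)
print(operators)
```

None of this is possible when the same formula is published as a JPG or GIF image.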
HTML (Hypertext Markup Language)
HTML is a single, predefined markup language that forces Web designers to use its limiting and
lax syntax and structure. The HTML standard was not designed with other platforms in mind,
such as Web TVs, mobile phones or PDAs. The structural markup does little to describe the
content beyond paragraph, list, title and heading.
XML breaks the restricting chains of HTML by allowing people to create their own markup
languages for exchanging information. The tags can be descriptive of the content, and authors
decide how the document will be displayed using style sheets (CSS and XSL). Because of
XML's consistent syntax and structure, documents can be transformed and published to multiple
forms of media, and content can be exchanged between XML applications.
HTML was useful in the part it has played in the success of the Web but has been outgrown as
the Web requires more robust, flexible languages to support its expanding forms of
communication and data exchange.
XML will never completely replace SGML because SGML is still considered better for
long-term storage of complex documents. With the creation of XHTML 1.0, however, the W3C
recommended an XML-based markup language for the Web as the successor to HTML 4.01.
XHTML did not make the HTML that already existed on the Web obsolete, and HTML 4.01 did not
turn out to be the last version of HTML: HTML5 followed, and it can be written in either HTML
or XML syntax. XHTML (an XML application) was intended as the foundation for a
universally accessible, device-independent Web.
Semantic Web Services, like conventional web services, are the server end of a client–
server system for machine-to-machine interaction via the World Wide Web. Semantic services
are a component of the semantic web because they use markup which makes data machine-
readable in a detailed and sophisticated way (as compared with human-readable HTML which is
usually not easily "understood" by computer programs).
WEB ONTOLOGY LANGUAGE
The Web Ontology Language (OWL) is a family of knowledge representation languages, or ontology
languages, for authoring ontologies or knowledge bases. The languages are characterised by formal
semantics and RDF/XML-based serializations for the Semantic Web. OWL is endorsed by
the World Wide Web Consortium (W3C) and has attracted academic, medical and commercial
interest.
In October 2007, a new W3C working group was started to extend OWL with several new
features as proposed in the OWL 1.1 member submission. W3C announced the new version of
OWL on 27 October 2009. This new version, called OWL 2, soon found its way into semantic
editors such as Protégé and semantic reasoners such as Pellet. The OWL family contains many
species, serializations, syntaxes and specifications with similar names. Here, OWL and OWL 2
refer to the 2004 and 2009 specifications, respectively. Full species names are used,
including the specification version (for example, OWL 2 EL). When referring to the family more
generally, OWL Family is used.
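To show what an RDF/XML-based serialization looks like in practice, the Python sketch below reads
a tiny OWL fragment and lists its subclass relationships. The two classes under
http://example.org/hardware# are invented for illustration, and the code only reads the XML; it
is not an OWL reasoner.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF = "{" + NS["rdf"] + "}"  # ElementTree spells qualified attribute names this way

# Hypothetical two-class ontology in RDF/XML.
ONTOLOGY = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/hardware#Card"/>
  <owl:Class rdf:about="http://example.org/hardware#VideoCard">
    <rdfs:subClassOf rdf:resource="http://example.org/hardware#Card"/>
  </owl:Class>
</rdf:RDF>"""


def subclass_pairs(rdf_xml):
    """Return (class, parent class) pairs declared with rdfs:subClassOf."""
    root = ET.fromstring(rdf_xml)
    pairs = []
    for cls in root.findall("owl:Class", NS):
        for parent in cls.findall("rdfs:subClassOf", NS):
            pairs.append((cls.get(RDF + "about"), parent.get(RDF + "resource")))
    return pairs


print(subclass_pairs(ONTOLOGY))
```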
TYPES OF ONTOLOGIES
Domain ontology - A domain ontology (or domain-specific ontology) models a specific domain,
which represents part of the world. A domain ontology provides the particular meanings of terms
as they apply to that domain. For example, the word card has many different meanings. An
ontology about the domain of poker would model the "playing card" meaning of the word, while
an ontology about the domain of computer hardware would model the "punched card" and
"video card" meanings.
Since domain ontologies represent concepts in very specific and often eclectic ways, they are
often incompatible. As systems that rely on domain ontologies expand, they often need to merge
domain ontologies into a more general representation. This presents a challenge to the ontology
designer. Different ontologies in the same domain arise due to different languages, different
intended usage of the ontologies, and different perceptions of the domain (based on cultural
background, education, ideology, etc.).
At present, merging ontologies that are not developed from a common foundation ontology is a
largely manual process and therefore time-consuming and expensive. Domain ontologies that
use the same foundation ontology to provide a set of basic elements with which to specify the
meanings of the domain ontology elements can be merged automatically. There are studies on
generalized techniques for merging ontologies, but this area of research is still largely
theoretical.
Upper ontology - An upper ontology (or foundation ontology) is a model of the common
objects that are generally applicable across a wide range of domain ontologies. It employs a core
glossary that contains the terms and associated object descriptions as they are used in various
relevant domain sets.
There are several standardized upper ontologies available for use, including Dublin
Core, GFO, OpenCyc/ResearchCyc, SUMO, and DOLCE. WordNet, while considered an upper
ontology by some, is not strictly an ontology. However, it has been employed as a linguistic tool
for learning domain ontologies.
Hybrid ontology - The Gellish ontology is an example of a combination of an upper and a
domain ontology.

 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Basic Electronics for diploma students as per technical education Kerala Syll...
Basic Electronics for diploma students as per technical education Kerala Syll...Basic Electronics for diploma students as per technical education Kerala Syll...
Basic Electronics for diploma students as per technical education Kerala Syll...
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
fitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptfitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .ppt
 

Web engineering UNIT IV as per RGPV syllabus

  • 1. Unit-IV/Web Engineering Truba College of Sc. & Tech., Bhopal Prepared By: Ms. Nandini Sharma (CSE DEPT.)
    INTRODUCTION
    XML (Extensible Markup Language) is a flexible way to create common information formats and share both the format and the data on the World Wide Web, intranets, and elsewhere. For example, computer makers might agree on a standard or common way to describe the information about a computer product (processor speed, memory size, and so forth) and then describe the product information format with XML. Such a standard way of describing data would enable a user to send an intelligent agent (a program) to each computer maker's Web site, gather data, and then make a valid comparison. XML can be used by any individual or group of individuals or companies that wants to share information in a consistent way.
    XML, a formal recommendation from the World Wide Web Consortium (W3C), is similar to the language of today's Web pages, the Hypertext Markup Language (HTML). Both XML and HTML contain markup symbols to describe the contents of a page or file. HTML, however, describes the content of a Web page (mainly text and graphic images) only in terms of how it is to be displayed and interacted with. For example, the letter "p" placed within markup tags starts a new paragraph. XML describes the content in terms of what data is being described. For example, the word "phone" placed within markup tags could indicate that the data that followed was a phone number. This means that an XML file can be processed purely as data by a program, stored with similar data on another computer, or, like an HTML file, displayed. For example, depending on how the application in the receiving computer wanted to handle the phone number, it could be stored, displayed, or dialled.
    XML is "extensible" because, unlike HTML, the markup symbols are unlimited and self-defining.
    XML is actually a simpler and easier-to-use subset of the Standard Generalized Markup Language (SGML), the standard for how to create a document structure. It is expected that HTML and XML will be used together in many Web applications. XML markup, for example, may appear within an HTML page. Early applications of XML include Microsoft's Channel Definition Format (CDF), which describes a channel, a portion of a Web site that has been downloaded to your hard disk and is then updated periodically as information changes. A specific CDF file contains data that specifies an initial Web page and how frequently it is updated. Another early application is ChartWare, which uses XML as a way to describe medical charts so that they can be shared by doctors. Applications related to banking, e-commerce ordering, personal preference profiles, purchase orders, litigation documents, part lists, and many others are anticipated.
    VALIDATING XML FILES
    When you validate your XML file, the XML validator will check to see that your file is valid and well-formed. The XML editor will still process XML files that are invalid or not well-formed. The editor uses heuristics to open a file using the best interpretation of the tagging that it can. For example, an element with a missing end tag is simply assumed to end at the end of the document. As you make updates to a file, the editor incrementally reinterprets your document, changing the highlighting, tree view, and so on. Many formation errors are easy to spot in the
  • 2. syntax highlighting, so you can easily correct obvious errors on the fly. However, there will be other cases when it will be beneficial to perform formal validation on your documents.
    You can validate your file by selecting it in the Navigator view, right-clicking it, and clicking Validate. Any validation problems are indicated in the Problems view. You can double-click on individual errors, and you will be taken to the invalid tag in the file, so that you can make corrections.
    Note: If you receive an error message indicating that the Problems view is full, you can increase the number of error messages allowed by clicking Window > Preferences and selecting General > Markers. Select the Use marker limits check box and change the number in the Limit visible items per group field.
    You can set up a project's properties so that different types of project resources are automatically validated when you save them. From a project's pop-up menu, click Properties, then select Validation. Any validators you can run against your project will be listed in the Validation page.
    The purpose of a Document Type Definition or DTD is to define the structure of a document encoded in XML (Extensible Markup Language). For introductory material about XML, see the XML help page. It is possible to build and use files containing XML tags without ever defining what tags are legal. However, if you want to ensure that files conform to a known structure, writing a DTD is the preferred method.
    • A well-formed file is one that obeys the general XML rules for tags: tags must be properly nested, opening and closing tags must be balanced, and empty tags must end with '/>'.
    • A valid file is not only well-formed, but it must also conform to a DTD (typically a publicly available one) that specifies which tags it uses, what attributes those tags can contain, and which tags can occur inside which other tags, among other properties. The advantage of a valid file is that its contents are more predictable for applications that want to process or present that file. The DTD ensures that only certain tags can be used in certain places.
    DEFINITIONS
    We need to review some terminology before proceeding:
    • A proper XML name must start with a letter or underscore (_); the remaining characters may be letters, digits, underscores, hyphens (-), or periods (.).
    • A tag is one of the XML constructs used to mark up documents. All tags start with a less-than symbol (<) and end with a greater-than symbol (>).
  • 3. • An element is a section of an XML document that acts as a unit. It may be either an empty element, or it may have content.
    • An empty element consists of a single tag of the form
        <gi.../>
      where gi is the tag type (or "generic identifier"), and the tag may include attributes. Note the slash before the closing ">"; this signifies an empty tag.
    • An opening tag begins a section of an XML document that ends with the corresponding closing tag. An opening tag has this form:
        <gi...>
      where gi is the tag type (or "generic identifier"), and the tag may include attributes. A closing tag has the form:
        </gi>
    • The content is everything between the opening tag and its corresponding closing tag. The content may be other elements or just plain text.
    The DTD can contain several different types of declarations:
    • Element declarations let you specify what kinds of tags can be used, and what (if anything) can appear inside the contents of the element.
    • Attribute declarations define what attributes you can use inside a given element.
    • Entity declarations define chunks of fixed text that can be included elsewhere.
    • Notation declarations define file types (like JPG and WAV files) so you can refer to non-XML files like image and sound files.
    ELEMENTS WITH MIXED CONTENT
    In general, an element can have any mixture of text and other elements as children. You can specify exactly which elements can be children. If you like, you can even specify that the children must occur in a given order. You can also specify that the child elements are optional. So, in the general form of the declaration <!ELEMENT gi (content)>, the content is written in an expression syntax: it consists of operators and operands arranged in arbitrarily complex ways.
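As a preview of the four declaration kinds listed above, here is a minimal sketch of a DTD fragment that uses one of each. The element, attribute, entity, and notation names (photo, caption, source, studio, cover, jpeg) and the file name cover.jpg are hypothetical and chosen only for illustration; the individual declaration forms are explained in the sections that follow.

```xml
<!-- Hypothetical DTD fragment: one declaration of each kind -->

<!-- Element declaration: <photo/> is always empty -->
<!ELEMENT photo EMPTY>

<!-- Attribute declarations for <photo>: an optional caption, a required source -->
<!ATTLIST photo
          caption CDATA  #IMPLIED
          source  ENTITY #REQUIRED>

<!-- Entity declaration: &studio; expands to the quoted text -->
<!ENTITY studio "Example Studio Ltd.">

<!-- Notation declaration: names the format of a non-XML file -->
<!NOTATION jpeg SYSTEM "image/jpeg">

<!-- Unparsed entity that refers to an external file in that notation -->
<!ENTITY cover SYSTEM "cover.jpg" NDATA jpeg>
```

With these declarations in place, a document could contain <photo source="cover" caption="Front cover"/>, where the source attribute names the unparsed entity rather than the file itself.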
Let's start with some simple cases to show you the features of a content declaration, but keep in mind that these features can be used in combination. The simplest case is when an element a has a single child element b: <!ELEMENT a (b)>
  • 4. The above declaration in a DTD means that an element <a>...</a> must contain exactly one <b> element.
    To specify that a child element can occur one or more times, append a plus sign (+) after the child element name. For example, to say that a <squid> element may contain one or more <tentacle> elements:
        <!ELEMENT squid (tentacle+)>
    You can also specify that a child element can occur any number of times, or not at all. Append an asterisk (*), meaning "zero or more of the previous," after the child element name:
        <!ELEMENT lizard (leg*)> <!-- some <lizard>s have no <leg>s -->
    The question-mark suffix (?) means the child element is optional: it can occur zero or one time in the content of the element you're declaring. For example, suppose an <oven> element can either be empty or contain a <pie> element:
        <!ELEMENT oven (pie?)>
    If you want a certain sequence of children, name the child elements in a comma-separated list. For example, suppose a <memo> element must contain exactly one <from> element, then one <to> element, one <subject>, and one <message> element:
        <!ELEMENT memo (from,to,subject,message)>
    You can also use the +, *, and ? operators in this declaration. For example, suppose that you want to require that a <memo> must have <from> and <to> elements, but the <subject> element is optional, and it can have zero or more <message> elements. You'd then declare it like this:
        <!ELEMENT memo (from,to,subject?,message*)>
    Sometimes you need to specify that there is a choice of children. The "or" operator (|) can be used to separate the choices. For example, suppose that a <trophy> element can have either a child named <bowling> or a child named <tennis>. Here's how you'd declare it:
        <!ELEMENT trophy (bowling|tennis)>
    You can also apply the usual suffix operators to groups of elements.
For example, suppose you have an element <timerecord> that starts with a required <purpose> element, followed by zero or more pairs of <start-time> and <end-time> records: <!ELEMENT timerecord (purpose,(start-time,end-time)*)> Here's another more general example: <!ELEMENT stock ((pig|chicken|cow)*)> The above example says a <stock> element can contain any number of the three child elements, in any order.
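To see a content model at work, the following sketch puts the <memo> declaration from the text into the internal DTD subset of a small document. The leaf elements are declared as (#PCDATA), meaning plain text only; the sample names and message text are invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!-- from and to are required, subject is optional, message may repeat -->
  <!ELEMENT memo    (from,to,subject?,message*)>
  <!ELEMENT from    (#PCDATA)>
  <!ELEMENT to      (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT message (#PCDATA)>
]>
<memo>
  <from>Ada</from>
  <to>Grace</to>
  <!-- no <subject>: allowed because of the ? suffix -->
  <message>The build is green.</message>
  <message>Lunch at noon?</message>
</memo>
```

This instance is valid because the children appear in the declared order, the optional <subject> is simply left out, and <message> occurs twice. Swapping <to> and <from>, or adding a second <subject>, would make the document invalid while leaving it well-formed. A validating parser can check this; for example, if the libxml2 tools are installed and the document is saved under the assumed name memo.xml, `xmllint --valid --noout memo.xml` is one way to do so.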
  • 5. Moreover, you can allow regular, untagged text to be mixed in with your specified child tags by placing #PCDATA at the start of a list of choices. For example, suppose a <speech> element can contain any mixture of regular text, and text tagged with the elements <loud> and <soft>:
        <!ELEMENT speech (#PCDATA|loud|soft)*>
        <!ELEMENT loud (#PCDATA)>
        <!ELEMENT soft (#PCDATA)>
    So, the content part of the element declaration can be arbitrarily complex. There are some ways #PCDATA cannot be used, and there are other uncommon features you may need; refer to the XML standard or a good book on the subject.
    ATTRIBUTE DECLARATIONS
    If an element is to have attributes, the names and possible values of those attributes must be declared in the DTD. Here is the general form:
        <!ATTLIST ename {aname atype default} ...>
    where ename is the name of the element for which you're defining attributes, aname is the name of one of that element's possible attributes, atype describes what values it can have, and default describes whether it has a default value. The last three items can be repeated inside an <!ATTLIST...> declaration, one group per attribute.
    The atype part describing the attribute's type can have three kinds of values:
    • The keyword CDATA means that the attribute can have any character string as a value. For example, suppose you want every <play> element to have a title attribute that can contain any text, and that attribute is required. Here is the complete attribute declaration:
        <!ATTLIST play title CDATA #REQUIRED>
    • There are several tokenized attribute types, which are required to have a certain structure. See tokenized attributes below.
    • You can provide a specific set of legal values for the attribute; see enumerated attributes below.
    The last part of the declaration, default, specifies whether the attribute can be omitted, and what value it will have if omitted. This must be one of the following:
    #REQUIRED - The attribute must always be supplied.
    #IMPLIED - The attribute can be omitted, and the DTD does not provide a default value. An application reading the file may assume a default of its own, but that is outside the DTD's control.
    "value" - The attribute can be omitted, and the default value is the quoted string that you provide.
  • 6. #FIXED "value" - The attribute may be omitted, but it always has the given "value"; if it is supplied, it must be supplied with exactly that value.
    TOKENIZED ATTRIBUTES
    You can restrict an attribute to have only values with a certain structure. Here are the possible values of the atype part of the attribute declaration for such attributes:
    ID - An ID attribute must be a unique identifier for that node. This allows other nodes to refer to it. The attribute value must also be a valid XML name (see above).
    IDREF - An IDREF attribute is a reference to an ID attribute in a different node. For example, suppose that in your DTD, there is a <sailor> element with an ID-type nickname attribute, and another element <duty> with an IDREF-type attribute called sailor-nick. Then if you have an element like this:
        <sailor nickname='Bluto'>...</sailor>
    then this tag would refer to that element:
        <duty sailor-nick='Bluto'>...</duty>
    IDREFS - The value of an IDREFS attribute must contain one or more ID references separated by spaces. Example:
        <roster sailor-nicks='Bluto Popeye Olive_Oyl'/>
    ENTITY - Use this attribute type to refer to external, non-parsed entities. See the section on notations, below.
    ENTITIES - Like ENTITY, but the attribute can be a list of one or more entity names separated by spaces.
    NMTOKEN - The attribute value must be a name token. A name token is built from the same characters as an XML name (see above), but without the restriction on the first character, so it may start with a digit or hyphen.
    NMTOKENS - Like NMTOKEN, but the attribute value can contain one or more name tokens separated by spaces.
    ENUMERATED ATTRIBUTES
  • 7. You can specify that attributes must have one of a set of one or more values. Here is the general form of the atype part of the <!ATTLIST...> declaration:
        (value1|value2|...)
    For example, suppose you want your <vehicle> element to have a kind attribute that must have a value of either "car", "truck", or "boat":
        <!ATTLIST vehicle kind (car|truck|boat) #REQUIRED>
    You can also supply a default value in quotes. For example:
        <!ATTLIST vehicle kind (car|truck|boat) "car">
    DECLARING AND USING ENTITIES
    In a DTD, entities come in four flavours:
    • A general entity is a chunk of text with a name attached, so you can use the entity as a sort of shorthand to get the related text substituted in its place. For example, suppose you are working on a new product called Project Giant-Slayer, but you know that the marketing department will change the name when it's released to the market. You could define the current product name as an entity named &product;, and use it everywhere in your product literature. Then, when the marketing department decides on the final name, you can change the declaration of the entity and the new name will appear in place of the old one in all your web pages and brochures.
    • A character entity is one of the many standardized special characters that you can use when you need a character unavailable in your local character set.
    • A parameter entity is like a general entity, but it can be used as shorthand for parts of a content declaration in an element declaration.
    • A binary or non-parsed entity represents an external file that is not in XML format.
    GENERAL ENTITIES
    General entities have names of the form &name;, where the name follows the usual rules for XML names (above).
To declare a general entity, use a declaration of this general form in your DTD: <!ENTITY ename "text"> where ename is the name of the entity you are defining (without the initial & and final ;), and text is the text you want substituted for that entity.
  • 8. For example, to define an entity named &cr; with your copyright string, you might use a declaration like this:
        <!ENTITY cr "Copyright (C) 1763 Cotton Mather LLP">
    CHARACTER ENTITIES
    To use special characters in your document, you can use the form &#n; where n is the decimal number of the character you want. A table of these entities is online at http://www.w3.org/TR/html401/sgml/entities.html.
    PARAMETER ENTITIES
    The purpose of a parameter entity is to serve as a shorthand for some or all of the content part of an element declaration. The general form is:
        <!ENTITY % ename "text">
    For example, suppose you have a lot of tags whose content model is "(#PCDATA|bold|ital)*". You could define an entity like this:
        <!ENTITY % bitext "(#PCDATA|bold|ital)*">
    Then, to define an element <excuse> with that content:
        <!ELEMENT excuse %bitext;>
    BINARY (NON-PARSED) ENTITIES
    This last type of entity represents a file, like an image or sound file, that is not XML. To declare such an entity:
        <!ENTITY ename SYSTEM "url" NDATA nname>
    where ename is the name of the entity you are defining, url is the URL where the file can be found, and nname is the name of the notation that the file uses. See the section on notations below for an example.
    NOTATION DECLARATIONS
    The purpose of a notation declaration is to define the format of some external non-XML file, such as a sound or image file, so you can refer to such files in your document. The general form of a notation declaration can be either of these:
        <!NOTATION nname PUBLIC std>
        <!NOTATION nname SYSTEM url>
  • 9. where nname is the name you are giving to the notation; std is the published name of a public notation, and url is a reference to a program that can render a file in the given notation.
    There are four steps to connecting an attribute to a notation:
    1. Declare the notation. For example:
        <!NOTATION jpeg PUBLIC "JPG 1.0">
    2. Declare the entity. For example:
        <!ENTITY bogie-pic SYSTEM "http://stars.com/bogart.jpg" NDATA jpeg>
    3. Declare the attribute as type ENTITY. For example:
        <!ATTLIST star-bio pin-shot ENTITY #REQUIRED>
    4. Use the attribute:
        <star-bio pin-shot="bogie-pic">...</star-bio>
    You could argue that the most widespread use of XML on the Web is XHTML. Because XHTML is simply HTML 4.0 reworked as XML, many HTML 4.0 sites are actually using an invalid form of XHTML. But the benefit of XML is not that it already exists as XHTML, but that you can create web documents from XML using XSLT to transform your documents into HTML. You can send your XML to an XSLT processor on the web server and serve the result to the web browser. This makes your documentation available in whatever format you need it to be in.
    XML AND CONTENT MANAGEMENT
    Ironically, with most websites that use XML, the web designers and content developers might not even know that XML is there. This is because there is generally a CMS (content management system) that sits in front of the XML so that content writers can write their web content without worrying about how to write HTML or design web pages.
    XML AND DOCUMENTATION
    Many companies are moving to XML to write their internal documentation. The most common XML platform for this is DocBook. The advantage of XML for documentation is that it can be used to define the common traits in books, magazines, stories, advertisements, and so forth. And DocBook already has that type of information defined.
    The best thing about XML for documentation is that it is easy for humans to understand: both the documentation itself and the XML code surrounding it. XML can be used for any type of documentation, from a publishing house's books to marketing materials.
  • 10. Here is an example of documentation written in XML:
        <howto>
          <title>How to Write a Mail Link</title>
          <author>Jennifer Kyrnin, Web Design Guide</author>
          <description>
            <paragraph> Use a HTML tag to allow your readers to send email directly from your Web site. </paragraph>
          </description>
          <directions>
            <step>Write a link as usual <a href="">email me</a></step>
            <step>Where you would normally put a URL, put the code "mailto" <a href="mailto:">email me</a></step>
            <step>Then put your email address after the colon <a href="mailto:html@aboutguide.com">email me</a></step>
          </directions>
        </howto>
    As you can see, both the data and the XML are readable and understandable. The content is also in an order that would be expected by a human reading the document.
    XML AND DATABASE DEVELOPMENT
    Databases are a natural use for XML, because XML is all about data. Unlike XML for documentation, XML for databases does not need to be readable by humans. The data is simply written in such a way as to allow machines to read it and make it accessible to a database. Here's XML that might be loaded into a database:
        <item number="00001">
          <name>
            <first>Jane</first>
            <middle>Q</middle>
            <last>Public</last>
          </name>
          <phone type="voice">
            <areacode>407</areacode>
            <number>555-1212</number>
          </phone>
          <phone type="fax">
            <areacode>407</areacode>
            <number>555-1213</number>
          </phone>
          <email>jpublic@gmail.com</email>
        </item>
  • 11. Unlike the document XML, it's not necessary that this be easily readable by humans. Since it is meant to be input into a database, it is only important that it be processable by a computer.
    HTML versus XML
    The most salient difference between HTML and XML is that HTML describes presentation and XML describes content. An HTML document rendered in a web browser is human readable. XML is aimed toward being both human and machine readable. Consider the following HTML.
        <html>
        <head><title>Books</title></head>
        <body>
        <h2>Books</h2>
        <hr>
        <em>Sense and Sensibility</em>, <b>Jane Austen</b>, 1811<br>
        <em>Pride and Prejudice</em>, <b>Jane Austen</b>, 1813<br>
        <em>Alice in Wonderland</em>, <b>Lewis Carroll</b>, 1866<br>
        <em>Through the Looking Glass</em>, <b>Lewis Carroll</b>, 1872<br>
        </body>
        </html>
    The previous HTML is rendered in a browser as follows.
  • 12. The HTML above describes how bibliography information is to be presented and formatted for a human to view in a web browser. Knowing that Sense and Sensibility is enclosed in italic tags does not, however, help a program determine that it is the title of a book. XML attempts to describe web data to address this void. The following is XML describing the contents of the books HTML page above.
        <books>
          <book>
            <title>Sense and Sensibility</title>
            <author>Jane Austen</author>
            <year>1811</year>
          </book>
          <book>
            <title>Pride and Prejudice</title>
            <author>Jane Austen</author>
            <year>1813</year>
          </book>
  • 13.   <book>
            <title>Alice in Wonderland</title>
            <author>Lewis Carroll</author>
            <year>1866</year>
          </book>
          <book>
            <title>Through the Looking Glass</title>
            <author>Lewis Carroll</author>
            <year>1872</year>
          </book>
        </books>
    A program parsing this data can take advantage of the fact that all book titles are enclosed in <title> tags. Where would such a program find such information? An XML document may contain an optional description of its grammar. A grammar describes which tags are used in the XML document and how such tags can be nested. A grammar is a schema or road map for the XML document. Originally an XML grammar was specified in a DTD (Document Type Definition). A newer standard, however, XML Schema (sometimes written XSchema), has since been adopted. XML Schema addresses some of the limitations of DTDs.
    As can be seen above, XML does not contain any information indicating how the document should be rendered in a browser. XML therefore separates data from presentation. The beauty of this feature is that the same data can be presented in a variety of ways without having to replicate any data (e.g., consider making book titles bold and authors italic).
    XML SYNTAX DIFFERS FROM HTML
    • New tags may be defined at will
    • Tags may be nested to arbitrary depth
    • May contain an optional description of its grammar
    XML can be used to store data inside HTML documents. XML data can be stored inside HTML pages as "Data Islands". As HTML provides a way to format and display the data, XML stores
  • 14. data inside the HTML documents. The data contained in an XML file is of little value unless it can be displayed, and HTML files are used for that purpose.
    The simple way to insert XML code into an HTML file is to use the <xml> tag (a feature specific to Microsoft Internet Explorer). The <xml> tag informs the browser that the contents are to be parsed and interpreted using the XML parser. Like most other HTML tags, the <xml> tag has attributes. The most important attribute is the ID, which provides for the unique naming of the code. The contents of the <xml> tag come from one of two sources: inline XML code or an imported XML file.
    • If the code appears in the current location, it's said to be inline. Example: embedding XML code inside an HTML file.
        <html>
          <xml id="msg">
            <message>
              <to> Visitors </to>
              <from> Author </from>
              <Subject> XML Code Islands </Subject>
              <body> In this example, XML code is embedded inside HTML code </body>
            </message>
          </xml>
        </html>
    • The efficient way is to create a file and import it. You can easily do so by using the SRC attribute of the <xml> tag. Syntax:
        <xml id="msg" src="example1.xml"> </xml>
    DATA BINDING
    Data binding involves mapping, synchronizing, and moving data from a data source, usually on a remote server, to an end user's local system where the user can manipulate the data. Using data binding means that after a remote server transmits data, the user can perform some minor data manipulations on their own local system. The remote server does not have to perform all the data manipulations nor repeatedly transmit variations of the same data.
    • Data binding involves moving data from a data source to a local system, and then manipulating the data (for example, searching, sorting, and filtering it) on the local system.
  • 15.
    • When you bind data in this way, you do not have to request that the remote server manipulate the data and then retransmit the results; you can perform some data manipulation locally.
    • In data binding, the data source provides the data, and the appropriate applications retrieve and synchronize the data and present it on the terminal screen.
    • If the data changes, the applications are written so that they can alter their presentation to reflect those changes.
    • Data binding is used to reduce traffic on the network and to reduce the work of the Web server, especially for minor data manipulations.
    • Binding data also separates the task of maintaining data from the tasks of developing and maintaining binding and presentation programs.
    CONVERTING XML TO HTML FOR DISPLAY
    There are several ways to convert XML to HTML for display on the Web.
    Using HTML alone
    If your XML file has a simple tabular form that is only two levels deep, you can display it using HTML alone.
    Using HTML + CSS
    This is a substantially more powerful way to transform XML to HTML than HTML alone, but it lacks the full power and flexibility of the methods listed below.
    Using HTML with JavaScript
    Fully general XML files of any type and complexity can be processed and displayed using a combination of HTML and JavaScript. The advantage of this approach is that any possible transformation and display can be carried out, because JavaScript is a fully general-purpose programming language. The disadvantage is that it often requires large, complex, and very detailed programs using recursive functions (functions that call themselves), which many people find difficult to grasp.
    Using XSL and XPath
    XSL (eXtensible Stylesheet Language) is often considered the best way to convert XML to HTML.
    The advantages are that the language is very compact; very sophisticated HTML can be displayed with relatively small programs; it is easy to re-purpose XML to serve a variety of purposes; it is non-procedural, in that you generally specify only what you wish to accomplish rather than detailed instructions for how to achieve it; and it greatly reduces or eliminates the need for recursive functions. The disadvantages are that it requires a very different mindset to use, and that the language is still evolving, so many XSL processors in Web servers are out of date and newer ones must sometimes be invoked from the DOS command prompt.
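    As a rough illustration of the general-purpose-language approach described above, the following sketch walks an XML tree recursively and emits nested HTML lists. It is written in Python rather than JavaScript so that it can be run outside a browser; the function name xml_to_html and the sample document are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET
from html import escape


def xml_to_html(element):
    """Recursively render an XML element as a nested HTML list item."""
    text = (element.text or "").strip()
    label = escape(element.tag)
    if text:
        label += ": " + escape(text)
    # The function calls itself once for every child element.
    children = "".join(xml_to_html(child) for child in element)
    if children:
        return "<li>" + label + "<ul>" + children + "</ul></li>"
    return "<li>" + label + "</li>"


# Hypothetical sample data, modelled on the book list used earlier in these notes.
SAMPLE = """<books>
  <book><title>Alice in Wonderland</title><author>Lewis Carroll</author></book>
</books>"""

root = ET.fromstring(SAMPLE)
html_page = "<html><body><ul>" + xml_to_html(root) + "</ul></body></html>"
print(html_page)
```

    Even for this small case the recursion has to handle text, tag names, and nesting explicitly, which is the kind of detail an XSL style sheet leaves to the processor.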
  • 16. DISPLAYING XML DOCUMENT USING XSL
    XSL (eXtensible Stylesheet Language) is a language for expressing style sheets. It consists of two parts:
    • A language for transforming XML documents (XSLT)
    • An XML vocabulary for specifying formatting semantics
    An XSL style sheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary. Like a CSS style sheet, an XSL style sheet is linked to an XML document and tells the browser how to display each of the document's elements. An XML document with an attached style sheet can be opened directly in Internet Explorer; you do not need an HTML page to access and display the data. The example that follows uses a CSS style sheet, which is attached in the same way.
    There are two basic steps for using a style sheet to display an XML document:
    • Create the style sheet file.
    • Link the style sheet to the XML document.
    CREATING THE STYLE SHEET FILE
    A CSS style sheet is a plain text file with a .css extension that contains a set of rules telling the Web browser how to format and display the elements in a specific XML document. You can create it using your favorite text editor, such as Notepad, WordPad, or another text or HTML editor, as shown below:
    general.css
        employees { background-color: #ffffff; width: 100%; }
        id { display: block; margin-bottom: 30pt; margin-left: 0; }
        name { color: #FF0000; font-size: 20pt; }
        city,state,zipcode { color: #0000FF; font-size: 20pt; }
  • 17. LINKING
    To link to a style sheet, you use an XML processing instruction that associates the style sheet with the current document. This instruction must occur before the root element of the document.
        <?xml-stylesheet type="text/css" href="styles/general.css"?>
    The two attributes of the instruction are as follows:
    href: The URL of the style sheet.
    type: The MIME type of the style sheet being linked, which in this case is text/css (an XSL style sheet would use text/xsl). MIME stands for Multipurpose Internet Mail Extensions. It is a standard that defines how to make systems aware of the type of content included in e-mail messages.
    The style sheet is attached to the XML document as shown below:
        <?xml version="1.0" encoding="utf-8" standalone="no"?>
        <!--This XML file represents the details of employees-->
        <?xml-stylesheet type="text/css" href="styles/general.css"?>
        <employees>
          <employee id="1">
            <name>
              <firstName>Mohit</firstName>
              <lastName>Jain</lastName>
            </name>
            <city>Karnal</city>
            <state>Haryana</state>
            <zipcode>98122</zipcode>
          </employee>
          <employee id="2">
            <name>
              <firstName>Rahul</firstName>
              <lastName>Kapoor</lastName>
            </name>
            <city>Ambala</city>
            <state>Haryana</state>
            <zipcode>98112</zipcode>
          </employee>
        </employees>
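    Because the XML file carries only data, a program can read the same document without any style sheet at all. The sketch below uses Python's standard xml.etree.ElementTree module to pull the employee records out of a trimmed copy of the document above; the helper name read_employees and the choice of fields are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the employee document (declaration and style sheet link omitted).
EMPLOYEES_XML = """<employees>
  <employee id="1">
    <name><firstName>Mohit</firstName><lastName>Jain</lastName></name>
    <city>Karnal</city><state>Haryana</state><zipcode>98122</zipcode>
  </employee>
  <employee id="2">
    <name><firstName>Rahul</firstName><lastName>Kapoor</lastName></name>
    <city>Ambala</city><state>Haryana</state><zipcode>98112</zipcode>
  </employee>
</employees>"""


def read_employees(xml_text):
    """Return one dict per <employee> element in the document."""
    root = ET.fromstring(xml_text)
    records = []
    for employee in root.findall("employee"):
        first = employee.findtext("name/firstName")
        last = employee.findtext("name/lastName")
        records.append({
            "id": employee.get("id"),          # attribute value
            "name": first + " " + last,        # text of nested elements
            "city": employee.findtext("city"),
        })
    return records


for record in read_employees(EMPLOYEES_XML):
    print(record["id"], record["name"], record["city"])
```

    The same records could be rendered as a table, exported, or searched, while the browser continues to display the file through its style sheet.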
  • 18. REWRITING
    Let's say you have a proxy running on www.myproxy.com and have proxied the site www.remotesite.com to the directory /remote. The links on the proxied pages of www.remotesite.com don't know they are being proxied, and this can create some problems. Let's start by looking at the three different link types.
    • <a href="myfile.html"> - This link will work
    • <a href="/myfile.html"> - This link won't work
    • <a href="http://www.remotesite.com/myfile.html"> - This link won't work
    The first link will work since it is relative to the current page. The second link is mapped to the root, and therefore the browser will request the page http://www.myproxy.com/myfile.html. This file isn't found, since only requests for the directory /remote are sent to www.remotesite.com. We have to change the link so that it points to /remote/myfile.html. The third link is absolute, and therefore the browser will follow it to http://www.remotesite.com/myfile.html. This works correctly, but only if the remote site is visible to the client. The site being proxied is probably some internal server that is not accessible from the outside, so we have to change the link to http://www.myproxy.com/remote/myfile.html.
    The rewrite filter
    As you should already have learned, the proxy is built using a filter that proxies all incoming requests. To make rewriting work, another filter is supplied: the rewrite filter. The proxy filter works perfectly well without a rewrite filter and has no knowledge of the possibility that links may be rewritten. This makes it just as easy to run the proxy with rewriting as without.
    How it works
    Rewriting is currently done by parsing the HTML, JavaScript, and CSS files, looking for links using regular expressions.
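    A minimal sketch of the regular-expression approach, written in Python, is shown here. It is not the proxy's actual code: the pattern, the function name rewrite_links, and its parameters are assumptions for illustration, and only double-quoted href attributes are handled.

```python
import re

# Matches href="..." and captures the link target (double-quoted values only).
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def rewrite_links(html, proxy_host, remote_host, directory):
    """Rewrite root-relative and absolute links so they pass through the proxy."""
    remote_prefix = "http://" + remote_host

    def replace(match):
        link = match.group(1)
        if link.startswith(remote_prefix + "/"):
            # Absolute link to the proxied site: point it at the proxy instead.
            link = "http://" + proxy_host + directory + link[len(remote_prefix):]
        elif link.startswith("/"):
            # Root-relative link: prepend the proxied directory.
            link = directory + link
        # Links relative to the current page are left unchanged.
        return 'href="' + link + '"'

    return HREF_PATTERN.sub(replace, html)


page = (
    '<a href="myfile.html">1</a>'
    '<a href="/myfile.html">2</a>'
    '<a href="http://www.remotesite.com/myfile.html">3</a>'
)
print(rewrite_links(page, "www.myproxy.com", "www.remotesite.com", "/remote"))
```

    The three links in the sample page correspond to the three link types listed above: the first is left alone, the second gains the /remote prefix, and the third is redirected to the proxy host.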
    The proxy uses regular expressions because it can then use the same type of parsing to find links in both CSS and HTML. There is another reason for preferring regular expressions over an XML parser: many pages are not written in XHTML. Since there are so many pages that are not XML-compatible, a standard XML parser would not work. There are other options, such as javax.swing.text.html, and moving away from regular expressions is being considered for future versions. There would have to be some measurable performance benefit for doing so, however.
    Turn on rewrite
    web.xml
  • 19. The default setting of the proxy is to not do any link rewriting, but you can easily turn rewriting on by adding the rewrite filter. An alternate web.xml that has rewriting enabled is supplied with the proxy. The file is called web_rewriting.xml and can be found in TOMCAT_HOME/webapps/J2EP_INSTALL_DIR/WEB-INF/. To enable rewriting, rename web_rewriting.xml to web.xml, making sure that you overwrite the existing file.
    data.xml (config file)
    Here is the good news: you have to do (almost) nothing. If you have mapped a site for the proxy, all of the links except the absolute ones will be rewritten. Absolute links are not rewritten by default because you might want to leave them as they are and let the user follow them. You will probably want to turn absolute link rewriting on, however. To do this, simply add the parameter isRewriting="true" to the server. All absolute links found on a page will be checked to see whether they are mapped in the config. If the server is mapped and isRewriting is set to "true", absolute links for the server will be rewritten. Not all servers support isRewriting="true"; for instance, RoundRobinCluster will always do rewriting. Consult the documentation of the servers for more information.
    Other forms of rewriting
    There are two more issues with rewriting. One arises when the server says a page has moved and sends a location for the new page; we have to rewrite that location. The other arises when a cookie is sent from the server; we have to change it so that the cookie is set for the correct directory. Both of these issues are handled by the proxy without any extra configuration.
    HTML, SGML, and XML
    First, you should know that SGML (Standard Generalized Markup Language) is the basis for both HTML and XML. SGML is an international standard (ISO 8879) that was published in 1986.
    Second, you need to know that XHTML is XML. "XHTML 1.0 is a reformulation of HTML 4.01 in XML, and combines the strength of HTML 4 with the power of XML." Third, XML is NOT a language; it is a set of rules for creating XML-based languages. Thus, XHTML 1.0 uses the tags of HTML 4.01 but follows the rules of XML.
    The Document
    A typical document is made up of three layers:
    • Structure
    • Content
    • Style
  • 20. Structure
    Structure would be the document's title, author, paragraphs, topics, chapters, head, body, etc.
    Content
    Content is the actual information that composes a title, author, paragraphs, etc.
    Style
    Style is how the content within the structural elements is displayed, such as font color, type and size, text alignment, etc.
    Markup
    HTML, SGML, and XML all mark up content using tags. The difference is that SGML and XML mainly deal with the relationship between content and structure: the structural tags that mark up the content are not predefined (you can make up your own language), and style is kept TOTALLY separate. HTML, on the other hand, is a mix of content marked up with both structural and stylistic tags, and its tags are predefined by the HTML language.
    By mixing structure, content, and style, you limit yourself to one form of presentation, and in HTML's case that would be a limited group of browsers for the World Wide Web. By separating structure and content from style, you can take one file and present it in multiple forms. XML can be transformed to HTML/XHTML and displayed on the Web, the information can be transformed and published to paper, and the data can be read by any XML-aware browser or application.
    SGML (Standard Generalized Markup Language)
    Historically, electronic publishing applications such as Microsoft Word, Adobe PageMaker, or QuarkXPress "marked up" documents in a proprietary format that was recognized only by that particular application. The document markup for both structure and style was mixed in with the content and was published to only one medium, the printed page. These programs and their proprietary markup had no capability to define the appearance of the information for any medium besides paper, and did not describe the actual content of the document very well beyond paragraphs, headings, and titles.
    The file format could not be read by or exchanged with other programs; it was useful only within the application that created it.
    Because SGML is a nonproprietary international standard, it allows you to create documents that are independent of any specific hardware or software. The document structure (what elements are used and their relationship to each other) is described in a file called the DTD (Document Type Definition). The DTD defines the relationships between a document's elements, creating a consistent, logical structure for each document. SGML is good for handling large-scale, long-term information management needs and has been around for more than a decade as the language of defense contractors and the electronic
  • 21. publishing industry. Because SGML is very large, powerful, and complex, it is hard to learn and understand and is not well suited to the Web environment.
    XML (Extensible Markup Language)
    XML is a "restricted form of SGML" that removes some of the complexity of SGML. XML, like SGML, retains the flexibility of describing customized markup languages with a user-defined document structure (DTD) in a non-proprietary file format for both storage and exchange of text and data, both on and off the Web.
    As mentioned before, XML separates structure and content from style, and the structural markup tags can actually describe the content because they can be customized for each XML-based markup language. A good example of this is the Mathematical Markup Language (MathML), an XML application for describing mathematical notation and capturing both its structure and content. Until MathML, the ability to communicate mathematical expressions on the Web was limited mainly to displaying images (JPG or GIF) of the notation or posting the document as a PDF file. MathML allows the information to be displayed on the Web and makes it available for searching, indexing, or reuse in other applications.
    HTML (Hypertext Markup Language)
    HTML is a single, predefined markup language that forces Web designers to use its limiting and lax syntax and structure. The HTML standard was not designed with other platforms in mind, such as Web TVs, mobile phones, or PDAs. The structural markup does little to describe the content beyond paragraph, list, title, and heading.
    XML breaks the restricting chains of HTML by allowing people to create their own markup languages for exchanging information. The tags can be descriptive of the content, and authors decide how the document will be displayed using style sheets (CSS and XSL).
    Because of XML's consistent syntax and structure, documents can be transformed and published to multiple forms of media, and content can be exchanged between XML applications.
    HTML was useful for the part it played in the success of the Web, but it has been outgrown as the Web requires more robust, flexible languages to support its expanding forms of communication and data exchange.
    XML will never completely replace SGML, because SGML is still considered better for long-term storage of complex documents. With the creation of XHTML 1.0, however, the W3C recommended an XML-based markup language for the Web. XHTML did not make the HTML that already existed on the Web obsolete, and although HTML 4.01 was expected at the time to be the last version of HTML, HTML5 was later published as a W3C Recommendation. XHTML (an XML application) was intended as the foundation for a universally accessible, device-independent Web.
    Semantic Web Services, like conventional web services, are the server end of a client-server system for machine-to-machine interaction via the World Wide Web. Semantic services
  • 22. are a component of the semantic web because they use markup which makes data machine-readable in a detailed and sophisticated way (as compared with human-readable HTML, which is usually not easily "understood" by computer programs).
    WEB ONTOLOGY LANGUAGE
    The Web Ontology Language (OWL) is a family of knowledge representation languages, or ontology languages, for authoring ontologies or knowledge bases. The languages are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web. OWL is endorsed by the World Wide Web Consortium (W3C) and has attracted academic, medical, and commercial interest.
    In October 2007, a new W3C working group was started to extend OWL with several new features as proposed in the OWL 1.1 member submission. W3C announced the new version of OWL on 27 October 2009. This new version, called OWL 2, soon found its way into semantic editors such as Protégé and semantic reasoners such as Pellet.
    The OWL family contains many species, serializations, syntaxes, and specifications with similar names. OWL and OWL2 are used to refer to the 2004 and 2009 specifications, respectively. Full species names are used where needed, including the specification version (for example, OWL2 EL). When referring to the languages more generally, the term OWL Family is used.
    TYPES OF ONTOLOGIES
    Domain ontology - A domain ontology (or domain-specific ontology) models a specific domain, which represents part of the world. The particular meanings of terms as they apply to that domain are provided by the domain ontology. For example, the word card has many different meanings. An ontology about the domain of poker would model the "playing card" meaning of the word, while an ontology about the domain of computer hardware would model the "punched card" and "video card" meanings.
    Since domain ontologies represent concepts in very specific and often eclectic ways, they are often incompatible.
As systems that rely on domain ontologies expand, they often need to merge domain ontologies into a more general representation. This presents a challenge to the ontology designer. Different ontologies in the same domain arise due to different languages, different intended usage of the ontologies, and different perceptions of the domain (based on cultural background, education, ideology, etc.). At present, merging ontologies that are not developed from a common foundation ontology is a largely manual process and therefore time-consuming and expensive. Domain ontologies that use the same foundation ontology to provide a set of basic elements with which to specify the meanings of the domain ontology elements can be merged automatically. There are studies on generalized techniques for merging ontologies, but this area of research is still largely theoretical. Upper ontology - An upper ontology (or foundation ontology) is a model of the common objects that are generally applicable across a wide range of domain ontologies. It employs a core
  • 23. glossary that contains the terms and associated object descriptions as they are used in various relevant domain sets. There are several standardized upper ontologies available for use, including Dublin Core, GFO, OpenCyc/ResearchCyc, SUMO, and DOLCE. WordNet, while considered an upper ontology by some, is not strictly an ontology. However, it has been employed as a linguistic tool for learning domain ontologies.
    Hybrid ontology - The Gellish ontology is an example of a combination of an upper and a domain ontology.