Thames Valley University



  Faculty of Professional Studies


 BSc Computing and Information
              Systems



Investigating the use of
XML in a DJ database
             system

 Thomas David Stanley Rogers


             May 2010
BSc Computing and Information Systems Project                      Thomas Rogers




 INVESTIGATING THE USE OF XML IN A DJ DATABASE SYSTEM –
                                   MAY 2010




                         Thomas David Stanley Rogers




      Submitted in support of BSc Computing and Information Systems




                                    May 2010




                                DECLARATION




“I certify that this work has not been accepted in substance for any degree, and is
not concurrently being submitted for any degree other than that of BSc Computing
and Information Systems being studied at Thames Valley University.




I also declare that this work is the result of my own investigations except where
otherwise identified by references and that I have not plagiarised another’s work”.








                                               Contents

Figures
Abstract
Introduction
   Aims and Objectives
   Research Element
   Developmental Element
   Artefact Deliverables
   Terms of Reference
   Background
   Target Audience
Research Element
1. Literature Review
   2.1 The role XML plays in handling semi-structured data
   2.3 NXD versus XML-enabled relational databases
       2.3.1 Investigating Oracle XMLType:CLOB
       2.3.2 Investigating Oracle XSU (XML SQL)
       2.3.3 Investigating eXist
   2.3.4 Investigation Results
Developmental Element
3. Methodology
3.1 Initial User Requirements
   3.1.1 Possible requirements
3.2 Use Case Diagrams
3.3 User Stories
   3.3.1 Register Details
   3.3.2 User Login
   3.3.3 Upload Feedback
   3.3.4 View Feedback
   3.3.5 Upload Tips and Experience
   3.3.6 Listen to MP3’s
3.4 MoSCoW Rules
4. Artefact Design and Construction
   4.1 Prototypes
   4.1.1 Low Fidelity Prototype
   4.1.2 Medium Fidelity Prototype
   4.1.3 Vertical Prototype
   4.2 XML Experiment
   4.5 Technology Used
   4.6 Database Schema
   4.7 Code Walkthrough
   4.7.1 XML PHP DOM
   4.7.2 XML Document Storage
   4.8 Artefact Testing
   4.8.1 Test Case
   4.8.2 User Acceptance Test
5. Project Assessment
   5.1 Requirements Analysis
   5.2 Failed Requirement
   5.3 Effectiveness of Method Used
6. Project Alterations
   6.1 NXD Installation Issues
   6.2 eXist
   6.3 BDBXML
   6.4 Oracle, PHP and XML Issues
   6.5 Data Structure
7. Conclusion
   7.1 Recommendations
   7.2 Future Work
8. Critique
9. References
10. Appendix


                                                      Figures

Figure 1: Semi-structured Data
Figure 2: Hierarchical Data Structure
Figure 3: XML Data (Kroenke, 2002, p452)
Figure 4: XML Represented in a Hierarchical Structure
Figure 5: Relational 'Flat' Structure (Source: Appelquist, p20)
Figure 6: Complex Query
Figure 7: Sparse Data
Figure 8: Semi-structured to Relational
Figure 9: XML Index
Figure 10: Hong et al. XML Tag
Figure 11: Register Details and Login
Figure 12: Feedback
Figure 13: Experience/Tips
Figure 14: Register User Story
Figure 15: Login User Story
Figure 16: Upload Feedback User Story
Figure 17: View Feedback User Story
Figure 18: Experiences/Tips user Story
Figure 19: MP3 User Story
Figure 20: Low Fidelity Prototype
Figure 21: Medium Fidelity
Figure 22: Details
Figure 23: Entry
Figure 24: Prototype Return
Figure 25: XML Experiment
Figure 26: Database Schema
Figure 27: XML Inputs
Figure 28: Test Case
Figure 29: User Acceptance Test
Figure 30: Alias Session
Figure 31: Entry
Figure 32: Entry
Figure 33: Entry
Figure 34: Feedback Id
Figure 35: Entry
Figure 36: Data Displayed
Figure 37: Data Inserted
Figure 38: Unformatted XML
Figure 39: Error
Figure 40: DBXML Create
Figure 41: DBXML Select
Figure 42: DBXML Query
Figure 43: Structured Data Format
Figure 44: Intended Structure





Abstract
EXtensible Mark-up Language (XML) is used to transfer data between
applications in a semi-structured format which does not conform to the traditional
relational database format of tables and columns. Research into the role XML
plays in the handling of semi-structured data, and how it handles such data better
than the relational approach, leads on to a discussion of native XML databases
(NXD) and XML-enabled databases within the initial research element of this
report; NXDs are the standard for storing XML documents and XML-enabled
databases are used to store XML documents in a relational database.
The research element is followed by a developmental element and implementation
of an artefact that incorporates the findings and demonstrates the use of XML in a
DJ database system. The artefact is an XML PHP DJ forum Web application and
database that enables users to upload and download semi-structured XML data in
the form of feedback, tips, experiences and set samples.
The development process adheres to the Dynamic Systems Development Method
(DSDM) software development methodology in conjunction with eXtreme
Programming (XP), and each phase is explained and presented within this
document. This involves requirements elicitation and development, UML
modelling and prototype design, incremental testing, implementation and
evaluation of the artefact and user acceptance testing.

Introduction

Aims and Objectives

The aim of this project is to investigate the use of XML in a DJ database system.
It is a research and development project in which XML and its storage methods
will be investigated, followed by the design, development, implementation and
testing of an artefact based on the DSDM framework.

Research Element
The research element of the project will investigate the following areas:
   •   Investigate XML and the role it plays in handling semi-structured data and
       the storage options for XML documents
   •   Discussion and comparison of native XML databases and XML-enabled
       relational databases

Developmental Element

In investigating the use of XML in a DJ database system the following areas will
be incorporated in the development of the project:
   •   A PHP application capturing user data
   •   XML to transfer data
   •   XML-enabled relational database such as Oracle to store the data or a
       native XML database

Artefact Deliverables
I have a particular interest in being a DJ and recording and storing music that I
produce. The proposed software development project will attempt to:
   •   Enable the user to register and log in to the system
   •   Capture and store user data or feedback related to DJ sets performed, i.e.
       semi-structured data
   •   Provide a user upload area enabling the sharing of DJ tips and experience
   •   Store MP3 samples of mixes that users can download; these, however, cannot
       be stored in the XML document because the file would become too large
   •   Possibly provide a blog area enabling users to add their tips and experience

Terms of Reference

Background
A particular interest in Web-based applications and the PHP scripting language
has led to an interest in learning XML and the programming involved in
producing an application using this paradigm. It was seen as beneficial to
combine being a DJ with this interest in the completion of the literature
review and production of the DJ database system.

Target Audience
The intended target audience of this project could be:
   •   Web developers interested in using XML
               Demonstrate reasons for employing XML in Web application
               Alternatives i.e. what type of XML database to employ
   •   Aspiring DJs wanting to market themselves
   •   Users uncertain of DJ music genres


   •   DJs wanting to share information with each other, i.e. experiences and tips

Prior to the design and implementation of the system, research into the topic area
is carried out.

Research Element

1. Literature Review
The proposed software application intends to demonstrate the use of XML in a DJ
database system; consequently, it is essential to investigate XML, its components
and storage methods prior to design and implementation. The following
literature review ultimately attempts to convey the need for this project and the
development of the database system. According to Sperberg-McQueen (2005),
XML plays a huge part in solving the issues of semi-structured data. This review
therefore outlines what semi-structured data is and the role XML plays in
handling it, and discusses how XML handles semi-structured data differently from
the traditional relational approach. This leads to a discussion and comparison of
two separate XML storage techniques: XML-enabled databases and native XML
databases (NXD); the comparison of the two will form the basis of the qualitative
and quantitative data collection and evaluation of the proposed application in the
later stages of this project.

2.1 The role XML plays in handling semi-structured data
In the IT industry, commonplace information such as reports and programming
structures assumes forms that differ from the rigidly structured data
traditionally associated with relational databases; this data is referred to as
semi-structured (Sperberg-McQueen, 2005). According to Buneman (1997),
semi-structured or unstructured data is data that does not fit into traditional data
models or the constraints those models impose. Sperberg-McQueen (2005) expands
on this notion by describing semi-structured data as having repetitive
characteristics and varying structures in a hierarchical representation. When
such data is stored in a database, this continual repetition results in large
numbers of tables being created, and these tables are heavily utilised through
numerous insert and delete activities (Burleson, 2009), which are characteristic
of the Internet. Figure 1 displays an example of semi-structured data:

   name: Peter Wood
   email: ptw@dcs.bbk.ac.uk, p.wood@bbk.ac.uk

   name:
          first name: Mark
          last name: Levene
   email: mark@dcs.bbk.ac.uk

   name: Alex Poulovassilis
   affiliation: Birkbeck

Source: http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/notes.html
                            Figure 1: Semi-structured Data

The three records differ in structure and follow no common pattern, which makes
this data hard to fit into a regular relational database and subsequently to query
within the database. This will be discussed further in section 2.2.
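
The difficulty can be sketched in a few lines of Python, used here purely for illustration (the artefact itself is written in PHP). The sketch holds the three records of Figure 1 as dictionaries and forces them into one fixed list of columns, as a single relational table would require. The variable names and the helper `to_row` are assumptions introduced for this example and do not come from the project code.

```python
# The three records from Figure 1, held as nested Python dictionaries.
records = [
    {"name": "Peter Wood",
     "email": ["ptw@dcs.bbk.ac.uk", "p.wood@bbk.ac.uk"]},
    {"name": {"first name": "Mark", "last name": "Levene"},
     "email": "mark@dcs.bbk.ac.uk"},
    {"name": "Alex Poulovassilis", "affiliation": "Birkbeck"},
]

# A fixed relational schema must cover every field seen in any record.
columns = sorted({key for record in records for key in record})


def to_row(record):
    """Force one record into the fixed column list; absent fields become None."""
    return tuple(record.get(column) for column in columns)


rows = [to_row(record) for record in records]

# Count the empty cells that the fixed schema creates.
missing = sum(value is None for row in rows for value in row)

print(columns)
print(missing)
```

Every record leaves one cell empty, and the name and email columns still hold values of different shapes (a plain string, a list, a nested structure), which a single relational column type does not accommodate directly.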
The majority of data handled today is organised hierarchically:
computer file systems, “library classification schemes” and directories such as
the Yellow Pages are all examples of data that is structured hierarchically
(Jagadish et al., 1999, p3). Figure 2 displays how semi-structured data is stored
hierarchically in a number of directories and sub-directories: A, B and C are
directories, B and C are also sub-directories of A, and the remaining nodes can be
regarded as sub-directories of B and C.

                                          A



                            B                           C



                  E                F             G              H
                      Figure 2: Hierarchical Data Structure

Sperberg-McQueen (2005) states that XML transfers data in a hierarchical structure
and is well suited to representing semi-structured data that is in this format. Figure 3
is a basic example of an XML document:

<customer>
    <name>
        <firstname>Michelle</firstname>
        <lastname>Correll</lastname>
    </name>
    <address>
        <street>1824 East 7th Avenue</street>
        <street>Suite 700</street>
        <city>Memphis</city>
        <state>TN</state>
        <zip>32123-7788</zip>
    </address>
</customer>

                   Figure 3: XML Data (Kroenke, 2002, p452)

The XML data displayed in Figure 3 can be represented in a hierarchical structure
as follows:
Customer
    Name
        Firstname
        Lastname
    Address
        City
        State
        Street
        Zip

              Figure 4: XML Represented in a Hierarchical Structure

Kasukawa et al. (1999) note that XML can handle semi-structured data effectively
because of the varying and flexible structure that it allows. Sperberg-McQueen
(2005, p34) supports this assertion by stating that XML data is not confined to a
flat “table of rows and columns”, allowing for more complex and intricate data
structures. According to W3C, XML is a technology that describes and transfers
data and documents over the Internet. XML is a means to structure and store data
and then transport it in plain text format; because XML has no predefined tags,
programmers are free to create their own entities, defining tags or child
elements (W3S), which results in improved handling of semi-structured data.
Figure 3, although very basic, is a good example of how flexible an XML document
can be, purely through the control it gives developers to create their own
hierarchical document structure. This is essential when relating this concept to
semi-structured data.
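
As a minimal sketch of this hierarchy, the document from Figure 3 can be parsed with Python's standard `xml.etree.ElementTree` module and walked from the root downwards. Python is used only for illustration (the artefact uses PHP), and the helper `outline` is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET

# The XML document shown in Figure 3.
CUSTOMER_XML = """
<customer>
    <name>
        <firstname>Michelle</firstname>
        <lastname>Correll</lastname>
    </name>
    <address>
        <street>1824 East 7th Avenue</street>
        <street>Suite 700</street>
        <city>Memphis</city>
        <state>TN</state>
        <zip>32123-7788</zip>
    </address>
</customer>
"""


def outline(element, depth=0):
    """Return one indented line per element, mirroring the document hierarchy."""
    text = (element.text or "").strip()
    line = "    " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CUSTOMER_XML)
print("\n".join(outline(root)))

# The repeated <street> element needs no special handling; it simply occurs twice.
streets = [street.text for street in root.findall("./address/street")]
print(streets)
```

The two street elements illustrate the flexibility described above: a repeated value is expressed by repeating the element, without any change to a table definition.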




XML can give the semi-structured data shown in Figure 1 structure through the
definition of an XML schema. According to Roy et al. (2001), the XML schema
language defines the structure of, and the constraints imposed upon, an XML
document, together with the data types and the various elements and attributes
within the document. Roy et al. (2001) further specify the control the schema
provides by stating that it allows for the definition of data types, such as:
   •   String
   •   Boolean
   •   Date
   •   Time
In the interest of developers, however, user-defined data types can also be
created, allowing custom constraints to be defined for the attributes and
elements within the XML document and a suitable syntax and format to be
enforced (Roy et al., 2001). Semi-structured data can therefore be
structured using XML and an XML schema; the schema gives even more control
to the developer through the definition of types and structure for the
elements and attributes created in the XML document.
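
The following sketch shows what such a schema might look like and how its type declarations can be read back. The schema is hypothetical: the element names (`feedback`, `comment`, `recommended`, `setDate`, `setTime`, `rating`) and the user-defined type `ratingType` are assumptions made for illustration and are not taken from the project. Because an XML schema is itself an XML document, Python's standard parser can inspect it; note that this only reads the declarations and does not validate a document against them, which would require a schema-aware processor.

```python
import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree attaches to XML Schema element names.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for one piece of DJ set feedback (names are illustrative).
FEEDBACK_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ratingType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="5"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="feedback">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="comment" type="xs:string"/>
        <xs:element name="recommended" type="xs:boolean"/>
        <xs:element name="setDate" type="xs:date"/>
        <xs:element name="setTime" type="xs:time"/>
        <xs:element name="rating" type="ratingType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(FEEDBACK_XSD)

# Map each typed element declaration to the data type the schema assigns it.
declared_types = {
    element.get("name"): element.get("type")
    for element in schema.iter(XS + "element")
    if element.get("type") is not None
}

# User-defined simple types and the built-in type each one restricts.
user_types = {
    simple.get("name"): simple.find(XS + "restriction").get("base")
    for simple in schema.iter(XS + "simpleType")
}

print(declared_types)
print(user_types)
```

The built-in types string, boolean, date and time appear directly, while `ratingType` illustrates a user-defined type that narrows a built-in integer to the range 1 to 5.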
Roy et al. (2001) indicate that XML schemas allow for inheritance, which in turn
enables the reuse of software, again assisting developers; code does not have to be
written from scratch each time, which assists the overall XML software
development process and the maintenance of the code that it is based on.
However, even if an XML document does not have a schema related to it, the
document can still be accessed or manipulated because XML is self-describing
(IBM). This means that a separately defined data structure is not required: the
system can construct a structure around the XML tags, so semi-structured
data is understood and structured within the XML document itself.
Relational databases, by contrast, require a rigid “static schema definition”
because, according to IBM (2005), each row within the table must have the same schema.
In the context of this project, the database system will be an XML PHP Web
application. The World Wide Web is a perfect example of a semi-structured and
hierarchical data source that is ever expanding and does not conform to the
constraints of a specific data model; it is extremely difficult to make this
medium cohere with a structured data model such as the relational approach
(Buneman, 1997). XML’s self-describing capabilities result in elegant handling of
semi-structured data sources on the Internet, and this is an excellent place to use
XML (IBM). Therefore creating a Web application that handles semi-structured
data in XML format using an XML schema is a perfect means to investigate and
evaluate the concept.
Having ascertained how XML handles semi-structured data, it is important to
compare the role XML plays in handling semi-structured data with the traditional
relational approach. For the purposes of the project it is important to investigate
XML from differing angles to gain a perspective of how XML compares to another
method of semi-structured data management.
Data within a relational database is not stored hierarchically; it is stored as flat
data (Figure 5). The following table is a representation of a relational table of flat
data:

         Book_id     Book_title               Author_name

         2           Sense and Sensibility    Jane Austen
         3           Pride and Prejudice      Jane Austen

         Figure 5: Relational 'Flat' Structure (Source: Appelquist, p20)

According to Mo et al. (2002), semi-structured data is largely stored in flat files
and, if stored in a relational database, is hard to query or update, largely due to
the difficulty of querying columns that hold multiple values; if an attribute has
multiple values within its column then it is hard to search for a specific value
within that attribute. For example, in Figure 5 the Author_name attribute holds
multiple values, i.e. Jane and Austen; in order to find the surname of the author
the query becomes more complex, e.g. if using SQL to query the table, a ‘WHERE’
clause using the ‘LIKE’ condition has to be entered into the statement to return
the specified value (assume the table is named book).

SELECT Author_name
FROM book
WHERE Author_name LIKE '%Austen';



                             Figure 6: Complex Query

Query and performance issues will be discussed further in 2.2.
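
The query in Figure 6 can be tried against an in-memory SQLite database, shown here in Python for illustration only (the project itself targets PHP and Oracle, so SQLite is a stand-in). For contrast, the second half of the sketch holds the same two books in a hypothetical XML layout, invented for this example, in which the author's name is split into separate elements so that the surname can be addressed directly by its path.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Build the flat table of Figure 5 in an in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE book (Book_id INTEGER, Book_title TEXT, Author_name TEXT)"
)
connection.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [
        (2, "Sense and Sensibility", "Jane Austen"),
        (3, "Pride and Prejudice", "Jane Austen"),
    ],
)

# Relational form: the surname is buried inside a combined value, so pattern
# matching is required, as in Figure 6.
like_rows = connection.execute(
    "SELECT Author_name FROM book WHERE Author_name LIKE '%Austen'"
).fetchall()

# Hypothetical XML form of the same data, with the name split into elements.
BOOKS_XML = """
<books>
    <book id="2">
        <title>Sense and Sensibility</title>
        <author><firstname>Jane</firstname><lastname>Austen</lastname></author>
    </book>
    <book id="3">
        <title>Pride and Prejudice</title>
        <author><firstname>Jane</firstname><lastname>Austen</lastname></author>
    </book>
</books>
"""
root = ET.fromstring(BOOKS_XML)

# The surname is reached by its path; no pattern matching is involved.
surnames = [node.text for node in root.findall("./book/author/lastname")]

print(like_rows)
print(surnames)
```

The comparison applies to the table as drawn in Figure 5; a relational design could equally split the name into two columns, at the cost of altering the table definition.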
Mo et al. (2002) go on to state that semi-structured data held within relational
databases, especially with “the edge table approach” (p253), is very often ever
expanding and too large to handle and too costly to maintain, and large amounts of
the data can be redundant or empty. This reiterates Burleson’s findings in 2.1.
Refer to Figure 7 for an example of sparse data within a relational database.

 Book_id   Book_title               Author_name   Co-author     Editor

 2         Sense and Sensibility    Jane Austen   -----         An Editor
 3         Pride and Prejudice      Jane Austen   Enid Blyton   An Editor
 4         A Book                   ----          ----          An Editor

                                Figure 7: Sparse Data

XML can reduce the problems of redundant data and can also eliminate sparse
data because, according to Sperberg-McQueen (2005), XML’s syntax does not allow
the storage of non-existent data.
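
One reading of this point is that an XML document can simply omit an element that has no value, whereas a relational row must still carry an empty cell. The sketch below, in Python for illustration only, converts the rows of Figure 7 to XML and creates child elements only for values that are present. The helper `row_to_element` and the wrapper element `library` are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

COLUMNS = ("Book_id", "Book_title", "Author_name", "Co-author", "Editor")

# Rows of Figure 7; None marks the cells shown as dashes.
rows = [
    (2, "Sense and Sensibility", "Jane Austen", None, "An Editor"),
    (3, "Pride and Prejudice", "Jane Austen", "Enid Blyton", "An Editor"),
    (4, "A Book", None, None, "An Editor"),
]


def row_to_element(row):
    """Build a <book> element, creating child elements only for present values."""
    book = ET.Element("book")
    for column, value in zip(COLUMNS, row):
        if value is not None:
            ET.SubElement(book, column).text = str(value)
    return book


library = ET.Element("library")
library.extend([row_to_element(row) for row in rows])

# Empty cells carried by the table versus elements actually stored in the XML.
empty_cells = sum(value is None for row in rows for value in row)
element_count = sum(len(book) for book in library)

print(empty_cells, element_count)
print(ET.tostring(library, encoding="unicode"))
```

The table carries fifteen cells, three of them empty, while the XML form stores only the twelve values that exist.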
In order for relational tables to handle semi-structured data, the hierarchically
structured data has to be manipulated so that it fits into the flat structure of a
relational table; this is achieved by mapping semi-structured data using XML to
relational table attributes and columns individually or through “partial
decomposition”, which will be discussed in 2.3 (Appelquist, 2002, p94).
According to the findings of 2.1, XML has a hierarchical data structure, so the
semi-structured data itself does not need to be transformed; the data remains as
it is initially, and it is the XML document that needs to be manipulated so that it
fits into the specified model. Chaudhri et al. (2003) state that mapping an XML
document so that it conforms to the relational model causes compatibility and
performance issues due to “impedance mismatch”, a state that occurs between two
different technological paradigms, i.e. an XML document and a relational table.
This will be investigated further in 2.3. Figure 8 displays a basic example of how
semi-structured data from a hierarchical data structure is mapped to a relational
table using XML.
Hierarchy:

        D
    A   B   C

XML:

<D>
    <A>Data</A>
    <B>Data</B>
    <C>Data</C>
</D>

Relation: D

A      B      C
Data   Data   Data

                    Figure 8: Semi-structured to Relational
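
The mapping in Figure 8 can be sketched as follows, assuming the intended XML is a well-formed `<D>` element with one `<A>`, `<B>` and `<C>` child. Python and an in-memory SQLite database are used purely for illustration and stand in for the PHP and relational database of the project. The sketch treats the root element as the relation name and each child element as a column.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Well-formed version of the XML fragment shown in Figure 8.
D_XML = "<D><A>Data</A><B>Data</B><C>Data</C></D>"

root = ET.fromstring(D_XML)

# The root element names the relation; each child element becomes a column.
table = root.tag
columns = [child.tag for child in root]
values = [child.text for child in root]

connection = sqlite3.connect(":memory:")

# Identifiers come from the literal document above, so quoting them is enough
# here; untrusted XML would need its names checked before being used in SQL.
column_definitions = ", ".join('"{}" TEXT'.format(column) for column in columns)
connection.execute('CREATE TABLE "{}" ({})'.format(table, column_definitions))

placeholders = ", ".join("?" for _ in columns)
connection.execute(
    'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), values
)

stored = connection.execute('SELECT A, B, C FROM "D"').fetchall()
print(stored)
```

This direct mapping only works because each child occurs exactly once and contains no nested elements; repeated or nested elements, such as the two street elements of Figure 3, would require additional tables or some other decomposition.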

Having investigated the role XML plays in handling semi-structured data and how
its handling differs from the relational approach, different methods of storing
XML documents will now be investigated. The intended objective of the artefact to
be delivered is to let users insert and upload semi-structured data to a DJ forum
web site; essentially, the ability to manipulate and query data in XML format.
For the purposes of this project, storage and retrieval of XML documents is
essential. Available XML storage methods and related work therefore need to be
investigated and reviewed. To gain a comparative perspective it is essential to
analyse XML document storage techniques from two differing angles: XML-enabled
databases and NXDs.

2.3 NXD versus XML-enabled relational databases
According to Chaudri et al. (2003), XML-enabled databases are employed to store XML
documents in a relational table, whereas NXDs are designed to hold data that
conforms purely to the XML hierarchical structure; an NXD does not merely
deconstruct the XML document and store it in a relational database. According to
Serna et al. (2005), when an XML-enabled database is used, the XML document has to
be transformed both when it is stored in, and when it is returned from, a relational
table within a relational database. NXDs are a direct means of storing an XML
document in a database, and neither conversion to another format nor mapping of the
document between two separate data models is needed (Serna et al., 2005). Having
ascertained the basic principles of NXDs and XML-enabled databases, the advantages
and disadvantages of both have to be considered.
According to Ford (2003), a key advantage of utilising an NXD instead of an
XML-enabled relational database is that the XML document does not need to be
manipulated in order to conform to a relational model, and the data held in the
document is maintained exactly. In fact, Kroenke (2005, p89) controversially
believes that “the relational database model is doomed” due to the ability of
advancing XML document storage technologies, i.e. NXDs, to store XML documents
entirely as they are, without the need to deconstruct them. An experiment performed
in 2005 by Ainhoa Serna and Jon Keppa Gerrikagoitia is used to test Kroenke’s
controversial claim, and its results are presented later in this review.
However, Hong et al. (2007) consider XML documents a useful means to store, extract
and query data that is based upon a single repeatable entity, but believe that
efficiency suffers when multiple XML documents are queried. They raise the concern
that when XQuery and XPath (W3C recommendations for querying XML documents) are
utilised to query multiple XML documents, these methods are not as efficient as
merging the documents into a relational database and querying them using SQL,
because the XML files have to be read individually prior to processing. Pardede et
al. (2006, p205) add to the limitations of NXDs by implying that, even though they
are continuously improving, they are a form of technology in its “infancy”, whereas
established “mature” relational database management systems (RDBMS) are more
familiar to programmers and consequently more competitive. Pardede et al. (2006)
emphasise the fact that RDBMS are tried and tested technologies, and argue that this
outweighs the limitations that restructuring an XML document imposes.
Connolly et al. (2005) reiterate that, despite the flexibility, intricacies and
complexity of handling data in XML documents, RDBMS are still the standard for
storing XML documents due to their reliability, available tools, performance and
scalability, an essential asset in handling semi-structured data (XML). However,
the conversion process involved in storing an XML document in an XML-enabled
database requires a great deal of processing in order for the document to conform
to the relational structure of rows and columns. A particular conversion method is
document shredding. Connolly and Begg (2005) state that the XML document has to be
decomposed, or shredded, into elements and placed into columns within one or
multiple relational tables within the database. One benefit of this is that an
element value can be indexed so that it can be accessed when querying the table
using an SQL statement, e.g.

<xml title="i"> <!-- i = index value -->
 <content>Hello world</content>
</xml>

                              Figure 9: XML Index

This index value will also need its own attribute or column within the table,
which in turn increases the size of the table and the need for table scalability,
an aspect that could be perceived to be a disadvantage (Connolly et al., 2005).
When querying a large table with many attributes, retrieving specific data may
prove to be difficult; sifting through large amounts of potentially redundant data
for index values may decrease query performance, and the queries themselves may
need to be more complex in order to acquire specific data (refer to 2.2).
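
Shredding with an index value can be sketched in a few lines of PHP. This is a sketch under stated assumptions rather than the method of any cited author: the function shredDocument(), the row layout (doc_index, element, value) and the root element name document are all hypothetical, and a plain PHP array stands in for the relational table.

```php
<?php
// Illustrative only: shredDocument() and the row layout are assumptions for
// this sketch; a PHP array stands in for the relational table.

/**
 * Shred an XML document into one row per leaf element. The index value is
 * read from the root element's "title" attribute, as in Figure 9.
 */
function shredDocument(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $root  = $doc->documentElement;
    $index = $root->getAttribute('title');

    $rows = [];
    foreach ($root->getElementsByTagName('*') as $element) {
        // Only leaf elements (no child elements) carry a value to store.
        if ($element->getElementsByTagName('*')->length === 0) {
            $rows[] = [
                'doc_index' => $index,
                'element'   => $element->nodeName,
                'value'     => $element->textContent,
            ];
        }
    }
    return $rows;
}

$rows = array_merge(
    shredDocument('<document title="1"><content>Hello world</content></document>'),
    shredDocument('<document title="2"><content>Goodbye</content></document>')
);

// Equivalent in spirit to: SELECT element, value FROM rows WHERE doc_index = '2'
$matches = array_filter($rows, function (array $row): bool {
    return $row['doc_index'] === '2';
});

foreach ($matches as $row) {
    echo $row['element'], ': ', $row['value'], "\n"; // prints: content: Goodbye
}
```

Every shredded value carries its index with it, which is the extra column, and the extra table growth, that the paragraph above describes.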
But do the processes involved in shredding an XML document, inserting it into an
XML-enabled database such as Oracle, and then attempting to query, update, retrieve
and rebuild the XML document in its original structure decrease performance levels?
Hong et al. (2005) have proposed such a method of XML document decomposition and
reproduction, using index values to store and retrieve an XML document of
semi-structured data from a relational database. In their experiment Hong et al.
(2005) shredded each XML tag in the document (refer to Figure 10) into the
relational table separately, and used a Microsoft computer program to co-ordinate
the data within each tag in the XML document so that it matched exactly when
produced in the relational table.
<node sessionID="0000000001" nodeID="000000004" userID="000000001"
username="117:YEONG-TAE SONG" pntSeq="5" time_stamp="114728264"
opnDetail="It looks good" />



                            Figure 10: Hong et al. XML Tag

They found this to be difficult due to the mapping of data between two different
data models, known as impedance mismatch. By using the displayed indexes they were
able to specify which data they required from the document when querying the
relational table using MS SQL. Further difficulty was encountered when querying the
database, in that the XML document returned was fragmented, and further alterations
to the query were required in order to retrieve the XML document exactly as it was
created. However, Hong et al. (2005) have only proposed a means to convert or shred
XML documents into a relational model, and no query performance details have been
offered. They reiterate that the process is difficult due to the differing data
models, but performance is the key issue.
Alternatively, shredding the XML document does not always need to occur when
storing XML in an XML-enabled database: Oracle has a method of storing the XML
document as a whole within a single row of a relational table. This method is known
as Oracle XMLType:CLOB. XMLType:CLOB stores an XML document exactly according to
its initial hierarchy and structure, and it uses XPath to update the document
without rewriting any elements of it (W3S); this will be discussed further in
2.3.1.
The key issue is performance: retrieving and updating the XML document, sending it
to and retrieving it from the specified database. Serna et al. (2005) state that,
when using an NXD, data is stored exactly as it was created, with no conversion
required; what is received is an XML document and what is sent back is an XML
document, so performance is not lost through conversion to a different structure,
i.e. a relational table. Serna et al. (2005) performed experiments for both an
XML-enabled database and an NXD. They compared the open-source NXD eXist and the
XML-enabled relational database Oracle, the latter from two perspectives: XSU-SQL,
which maps the XML document to the database, and XMLType:CLOB, within which the
whole document is stored and mapping of document structure to relational table
attributes is not needed.




The following experiments were conducted by Ainhoa Serna and Jon Keppa
Gerrikagoitia in 2005 (http://xml.sys-con.com/node/104980). Three different tests
were performed:

   •    XML:DB XPath, which involved recurring searches, searches for attribute
    nodes, searches with two conditions and searches using pre-defined XPath
    functions

   •    XML:DB test, which entailed add, delete and update document queries as
    well as creating and dropping collections

   •    XML:DB robustness, which tested the maximum scalability of each storage
    method

To evaluate the performance of each database, the same queries were executed
against continuously increasing amounts of data: 5000 XML documents of 592 bytes
each, 4 sub-collections each with 5000 files, and a further 2 sub-collections
comprising 4000 files and 6000 XML documents.

2.3.1 Investigating Oracle XMLType:CLOB
As stated earlier, the XMLType:CLOB facility stores each XML document as a whole
within 1 row of a relational table, and was used in conjunction with the XPath
query operation. XMLType possesses functions that enable a query to extract
particular nodes and fragments of an XML document, and Serna et al. (2006) utilised
XMLType functions such as extract() and getStringVal() to query data. A table named
‘clientes’ of type XMLTYPE was created and the documents were inserted into it.
Numerous operations were performed, including updates, queries using XPath, queries
of several tables and deletions from the table.

2.3.2 Investigating Oracle XSU (XML SQL)
This involved mapping the XML document nodes to a relational table. Two methods of
storing and retrieving the XML documents were implemented by Serna et al. (2006):
relational table storage, as already mentioned, and object-relational (OR) table
storage. Both solutions were implemented as part of the investigation in order to
compare and contrast the two. For OR storage the required information was stored
in data types, and for the relational storage method the data was stored in a
relational table. Again, as with the XMLType feature, a number of SQL insert,
update and delete queries were performed using Java, its JDBC extension and
Oracle’s XML parser. However, because XML nodes are mapped to a relational table,
the names of the XML elements and the table columns have to be identical.

2.3.3 Investigating eXist
The storing of the XML documents did not need any mapping or configuration to
comply with an XML–enabled database when using eXist, so the NXD was
merely queried using XPath.
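
The kinds of XPath search described for the tests (attribute nodes, two conditions, pre-defined functions) can be sketched as follows. This is an illustration only: PHP's DOMXPath class stands in for the XPath interface of an NXD such as eXist, and the sample document, its element names and the helper queryText() are assumptions, not data from the cited experiment.

```php
<?php
// Illustrative only: the sample document and queryText() are assumptions;
// PHP's DOMXPath stands in for the XPath interface of an NXD.

$xml = <<<XML
<forum>
  <entry id="1"><reviewer>DJ One</reviewer><month>May</month><feedback>It looks good</feedback></entry>
  <entry id="2"><reviewer>DJ Two</reviewer><month>May</month><feedback>Needs more bass</feedback></entry>
  <entry id="3"><reviewer>DJ One</reviewer><month>June</month><feedback>Good set</feedback></entry>
</forum>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

/** Run an XPath expression and return the text of each matching node. */
function queryText(DOMXPath $xpath, string $expression): array
{
    $result = [];
    foreach ($xpath->query($expression) as $node) {
        $result[] = $node->textContent;
    }
    return $result;
}

// Search for attribute nodes.
print_r(queryText($xpath, '/forum/entry/@id'));

// Search with two conditions, using the "=" operator.
print_r(queryText($xpath, '/forum/entry[reviewer="DJ One" and month="May"]/feedback'));

// Search using a pre-defined XPath function (contains is case-sensitive).
print_r(queryText($xpath, '/forum/entry[contains(feedback, "ood")]/@id'));
```

No mapping step appears anywhere in the sketch: the document is queried in the structure in which it was written, which is the property attributed to NXDs above.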

2.3.4 Investigation Results
The results showed that eXist was the most efficient as regards recurring searches
and had the best response times in all of the tests performed. When utilising the
XPath query standard, functions such as “contains” returned quicker response times
than queries that required the “=” operator; the “=” operator is utilised by both
XSU and XMLType, so their query performance was lower than that of eXist in this
case also.
For the insertion of 30,000 XML documents into each of the databases, insertion
into the OR tables and relational tables of the Oracle database performed better
than insertion into eXist. The XMLType:CLOB and eXist insertion results were
similar, but both were poorer than the OR and relational table results. The random
access times of XMLType were also the worst of all of the solutions. In general,
the results showed that update and select queries were much faster with eXist and
the structured storage formats than with the CLOB file method.

The investigation within the research element of the project will assist in
ascertaining which XML database to utilise in the completion of the DJ database
system.







Developmental Element

3. Methodology
Having investigated XML, how it handles semi-structured data, alternative means of
handling semi-structured data, i.e. RDBMS, and potential means to store and
manipulate XML documents from two perspectives, the initial research element was
completed. The developmental element could then be initiated, and a software
development lifecycle was adhered to in order for the project to be successful. In
a lifecycle model, the ability to sign off each stage of system development makes
it possible to gauge where the project stands and limits the time taken for each
stage to be completed, i.e. through the creation of timeboxes for each development
process (Ramsin et al., 2008, p54) (refer to appendix for the DSDM project plan and
the timeboxes established).
Essentially, due to the project timescale, the project followed a rapid application
development (RAD) methodology. Two RAD techniques were used in tandem in the
implementation of the proposed system: Dynamic Systems Development Method (DSDM)
and eXtreme Programming (XP). DSDM was the development methodology the project
assumed, and DSDM is believed to be the standard for RAD methodologies (Ramsin et
al., 2008, p52).
According to Ramsin et al. (2008, p53), DSDM is an “iterative-incremental” model in
which business processes and user requirements are developed and applied to two
iterative development phases, where prototypes are developed and sequentially
implemented. The second method, XP, is similarly an iterative agile design process
with the aim of creating and adapting software rapidly according to ever-changing
user requirements (Hedin, G et al., 2003); both processes stress the importance of
testing before implementing.
In applying these two development concepts together, the aim was to develop and
implement a high specification system in a short period of time. DSDM provides the
framework upon which XP can develop; XP provides the format for system testing and
programming, and DSDM defines the product lifecycle and required deliverables, i.e.
what was required of the system (Camara, S. 2009). This harmony between the
methodologies justified employing the two in tandem in this development project.
The two agile development methodologies were also utilised due to familiarity and
past experience of using the two concepts together prior to this project.
Refer to the appendix for the original DSDM project plan and an activity diagram
detailing the overall business process design and activities undertaken in the
completion of the project.
Having decided on a relevant development methodology to employ, requirements
elicitation was required, i.e. establishing what the user requires of the proposed
system.

3.1 Initial User Requirements
The following requirements list displays the initial user and system requirements.
1. The user shall be able to register their personal details and store them in an
XML database
1.2 Details to be registered:
   •   Firstname
   •   Lastname
   •   Username
   •   Password
   •   Alias
   •   Email Address

2.1 The user shall be able to login to the web site
2.2 Details to be entered will be:
    • Username
    • Password
    • DJ alias
3.1 The user shall be able to upload feedback to the web site using a stored XML
document
3.2 Feedback to be stored:
     • Name
     • Regarding
     • Month
     • Comment
3.3 Name: the name of the user using the DJ forum
3.4 Regarding: what the comment the user is uploading relates to i.e. an event
or MP3

3.5 Month: when the feedback was uploaded
3.6 Comment: comment user wishes to add
4.1 The user shall be able to view the feedback they have uploaded via a feedback
link in the web site (XML document returned and styled using XSL)
4.2 The user shall be able to view all feedback uploaded to the web site
5. The user shall be able to upload tips and experience to the web site, storing the
XML data in an XML database and returning the XML document styled using
XSL
6. The user shall be able to listen to MP3 samples accessed via a web link


NB. This data will not be in XML format; it will be stored on the server because
the file will be too large for an XML document to handle.

3.1.1 Possible requirements
7. The user shall be able to upload DJ tutorials to the web site, again handling the
data using XML and storing the data in an XML database
8. The proposed system will be a web log or blog web site enabling real-time
commentary
After establishing the basic requirements, use case diagrams were created to gain
a visual perspective of what was required.

3.2 Use Case Diagrams
In order to gain a better perspective of what was required of the system, UML use
case diagrams were produced according to the initial requirements that were
established: one use case diagram for each of the main requirements, i.e. register
details and login (1, 2), feedback entry and return (3, 4), and experience/tips
entry and return (5). The following UML use case diagrams (Figures 11 to 13) were
created using StarUML.








                   Figure 11: Register Details and Login




                           Figure 12: Feedback








                           Figure 13: Experience/Tips

Having gained a visual perspective of what is required of the system through use
case depiction, user stories were developed.
3.3 User Stories
User stories were established in tandem with a potential user of the system (a
fellow DJ) to ascertain the processes completed by the user in order to satisfy the
specified requirements.

3.3.1 Register Details

 After entering the Web site URL the user will be able to select the ‘Register’
 link from the home page (index.php). Register.php will be opened and the user
 will be able to enter their first name, last name, username, password, DJ alias
 and email address. After clicking ‘Submit’ the user data will be transferred to
 an XML database using an XML document and they will be returned to
 Revolt’s home page.

 1. User will click the ‘Register’ link apparent on the home page
 2. User will enter all required data in to the text fields apparent
 3. The user will press ‘Submit’
 4. The user will be returned to the home page


                          Figure 14: Register User Story

3.3.2 User Login

 After being returned to the home page the user will be able to enter their
 username, password and alias. After clicking ‘Login’ the user will enter the
 Web site and will open welcome.php.

 1. User enters username
 2. User enters password
 3. User enters DJ alias
 4. User clicks ‘Login’
 5. User enters Web site

                           Figure 15: Login User Story

3.3.3 Upload Feedback

 After entering the Web site, the user will be able to select ‘Forum’ from the link
 tab displayed. By selecting the ‘Forum’ link from welcome.php the user will be
 able to enter their name, what the comment relates to, the month that the
 feedback is being uploaded and a comment. The user will be able to click
 ‘Submit’ to upload their comments.

 1. User clicks ‘Forum’ and enters welcome.php
 2. User enters their name
 3. User enters what the feedback is regarding
 4. User enters the month
 5. User enters their comment
 6. User clicks ‘Submit’ to upload comments

                   Figure 16: Upload Feedback User Story

3.3.4 View Feedback
 After clicking ‘Submit’ the user will open confirm.php and will be able to view the
 comments they have just uploaded. After selecting the radio button from the
 same page the user will be able to click ‘Submit’ and will be manoeuvred to
 confirmation.php where all uploaded comments will be displayed.

 1. User views uploaded comments from previous page
 2. User selects the radio button
 3. User clicks ‘Submit’ to manoeuvre to confirmation.php



                    Figure 17: View Feedback User Story







3.3.5 Upload Tips and Experience

 After logging in to the web site the user will be able to select experience/tips
 and fill in the required fields shown on the page. After clicking ‘Submit’ the user
 will be able to see the entries they have entered in to the Revolt DJ forum.
 1. User clicks ‘Experience/tips’
 2. User enters their alias
 3. User enters the ‘Give it a name’ field
 4. User enters the experience
 5. User enters their tips
 6. User clicks ‘Submit’ to upload comments

                    Figure 18: Experiences/Tips User Story

3.3.6 Listen to MP3’s

 After entering the relevant URL the user will be able to select ‘MP3’ from the
 link tab and will be manoeuvred to mp3.html where they will be able to select
 an MP3 link from the present page.

 1. User clicks ‘MP3’ link
 2. User selects desired MP3 link from mp3.html


                          Figure 19: MP3 User Story

When developing a piece of software using RAD techniques, i.e. DSDM, it is
important to establish which requirements are essential in order to complete the
project successfully and within the stipulated time period. It was not feasible to
implement all of the user stories established in 3.3, and it was therefore necessary
to refine the requirements. In the following section MoSCoW rules are used to
establish the importance of each user requirement.

3.4 MoSCoW Rules
Ramsin et al. (2008, p55) state that the rules as regards requirement priority are
specified as “must haves”, “should haves”, “could haves” and “won’t haves”:
essential, important, not essential and not present in the project respectively.
The user stories established in 3.3 were analysed using MoSCoW rules as follows:
   •    The system MUST be able to upload user feedback to an XML database
   •    The system MUST enable users to view individual feedback they have
        uploaded and previous user feedback, data that is in XML format and
        styled using XSL
   •    The system SHOULD offer user ability to upload experience and tips to
        an XML database
   •    The system COULD be able to provide user registration
   •    The system COULD enable users to login
   •    The system WON’T enable users to listen to uploaded MP3s in this project

From the above analysis it was possible to see that the system MUST possess three
of the original 6 user stories in order for the project to be a success; it was
considered feasible that these user stories could be completed within the timescale
of the project. The revised user stories completed were therefore 3.3.3, 3.3.4 and
3.3.5 (refer to 3.1). The SHOULD user stories were perceived to be requirements
that could be satisfied within the specified timescale, and the COULD and WON’T
user stories were perceived to be requirements that could not be satisfied within
the project timescale; these are areas of future work. The three MUST user stories
were developed in the functional iteration model stage of the DSDM product
lifecycle (refer to appendix for DSDM project plan). Having identified the user
stories that were essential to the project and incorporated them in the functional
iteration stage, the design and implementation process began.

4. Artefact Design and Construction
4.1 Prototypes
To gain an idea of what was to be designed it was essential to identify the
processes that the user went through in satisfying the requirements of the system.
User requirements and user stories (3.1, 3.3) state what is required of the system,
and to gain an interactive perspective of what was required a number of design
processes were completed, including:
    •   Low fidelity prototype: sketches or throw-away depictions of the system
        interface
    •   Medium fidelity prototype: interactive depictions of the system interface
        and how it operates
    •   Vertical prototype: a functional representation of a small part of the
        system

4.1.1 Low Fidelity Prototype
After creating the user stories, mock-ups (sketches) of the system interface were
produced in order to create a general perspective of it. The low fidelity prototype
was merely a means to gain an initial idea of what the system interface would look
like. Figure 20 shows a mock-up design of the home page (index.php):




                       Figure 20: Low Fidelity Prototype

According to the DSDM model the usability of the system is tested at this stage of
the development process; the mock-ups were created in tandem with the user
ascertaining exactly what was required of the system interface. This prototype was
developed further in the next phase of prototyping.


4.1.2 Medium Fidelity Prototype
The low fidelity prototype gave an idea of how the interface would look; a medium
fidelity prototype was used to show how each page was going to be accessed and how
to manoeuvre between them, giving a general depiction of what was required of the
system. Figure 21 is a representation of the medium fidelity prototype; the medium
fidelity code can be found in the ‘medium_fidelity’ folder on the CDROM in the
appendix.








                           Figure 21: Medium Fidelity

4.1.3 Vertical Prototype
A vertical prototype is a representation of a small piece of system functionality.
The vertical prototype is essentially a medium fidelity prototype that has a
specific piece of functionality; in relation to this assignment the vertical
prototype satisfies requirements 3 and 4 using a back end database and PHP to
connect to, access and manipulate the required data. Prior to the implementation of
the vertical prototype it was necessary to produce a back end database on a server.
A MySQL database containing 1 table (medium) was created on the TVU Apache server.

Requirement 3 involves user ability to upload data to the forum and requirement 4
enables users to see the data they have uploaded to the forum on request. The
following diagrams show the processes the user goes through in satisfying both
requirements 3 and 4. The code can be found in the appendix in the
‘vertical_prototype’ folder on the CDROM.




Step 1: Insert data




                               Figure 22: Details

Step 2: View and submit data




                               Figure 23: Entry

Step 3: Retrieve Entry








                         Figure 24: Prototype Return

The title of the project is to investigate the use of XML in a DJ database system,
yet all that the prototypes investigated was the development of the system
interface, the uploading of data to a database, and the accessing and manipulating
of data using PHP. To gain an idea of how XML worked and how to format the XML
document, a template XML document and XSL stylesheet were produced. The XML
document and XSL stylesheet can be found in the ‘xml_template’ folder on the CDROM
in the appendix. Figure 25 displays the XML document styled by the XSL stylesheet.

4.2 XML Experiment




                         Figure 25: XML Experiment


4.5 Technology Used
Having investigated XML and how it is styled using XSL, the means of storing and
manipulating the XML document was established. PHP DOM was used to manipulate the
XML document, and it was decided that an XML-enabled database would be used to
store the specified XML documents in BLOB form, i.e. the whole XML document is
stored as one row in a relational table in a MySQL database. Due to problems with
the Apache server at TVU (refer to 6.4) a server was set up remotely; WampServer
2.0 was utilised to store all PHP and XML documents in an XML-enabled MySQL
database. After investigating the technology that was to be used, a proposed
database schema was developed.
4.6 Database Schema
A UML class diagram was created to represent how the tables within the database
were formed. It was created in relation to the user requirements formulated in
section 3. Figure 26 is the database schema created:




                          Figure 26: Database Schema

A code walkthrough of how the main parts of the system function will now ensue.
4.7 Code Walkthrough
The main functions of the system are to create an XML document using PHP
DOM, load the XML document into an XML-enabled database and retrieve the
XML document from the XML-enabled database.

4.7.1 XML PHP DOM
Refer to the folder ‘revolt_final’ on the CDROM in the appendix for all of the
following code excerpts.
The user inserts the relevant data into the fields found on revolt_index.php
(‘reviewer’, ‘regarding’, ‘month’ and ‘feedback’), and the page calls
revolt_insert.php on submission of the form:


<form name="input" method="post" action="revolt_insert.php">
 <div class="subheading">
  <p>Reviewer</p>
 </div>
 <p><input type="text" name="reviewer"/></p>
 <div class="subheading">
  <p>Regarding</p>
 </div>
 <p><input type="text" name="regarding"/></p>
 <div class="subheading">
  <p>Date</p>
 </div>
 <p><input type="text" name="month"/></p>
 <div class="subheading">
  <p>Comment</p>
 </div>
 <p><textarea name="feedback" rows="5" cols="25"></textarea></p>
 <p><input type="submit" value="Submit"/></p>
</form>

                             Figure 27: XML Inputs

On submission of the form the page revolt_insert.php is called. On this page the
PHP document object model (DOM) is created using the following code:
<?php

DOM document is created...

  $xmldoc = new DOMDocument();

XML document and XSL stylesheet are loaded...

  $xmldoc->load('revolt_final.xml');
  $xmldoc->load("revolt_final.xsl");

‘reviewer’ is called from revolt_index.php (Figure 27)

  $newAct = $_POST['reviewer'];

                                       34
BSc Computing and Information Systems Project                      Thomas Rogers


$root variable is created containing the first child node of the element ‘reviewer’
in the revolt_final.xml document...

  $root = $xmldoc->firstChild;

An empty element node named ‘reviewer’ is created in $xmldoc to hold the
‘reviewer’ input from revolt_index.php...

  $newElement = $xmldoc->createElement('reviewer');

The new ‘reviewer’ element in $newElement is appended as the last child node of
$root...

  $root->appendChild($newElement);

A new text node is created containing the value of ‘reviewer’ posted in
$newAct...

  $newText = $xmldoc->createTextNode($newAct);

The text node in $newText is then appended to the new element, so that the
element contains the value of ‘reviewer’...

  $newElement->appendChild($newText);

Repeat this process for the other three input values ‘regarding’, ‘month’ and
‘feedback’ found in Figure 27.

The xml document is then saved...

 $xmldoc->save('revolt_final.xml');

And the user is sent to the page revolt_return.php

  header('Location:revolt_return.php');
?>
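The repeated create-element, create-text-node and append steps above could be wrapped in a small helper and run once per form field. The following is only a sketch: the helper name appendField, the root element name reviews and the sample values are assumptions for illustration and are not taken from the artefact.

```php
<?php
// Hypothetical helper (not part of the original artefact): creates an
// element named $name, gives it a text node holding $value and appends
// it as the last child of $parent.
function appendField(DOMDocument $doc, DOMNode $parent, string $name, string $value): DOMElement
{
    $element = $doc->createElement($name);
    // createTextNode() stores the raw value; characters such as & and <
    // are escaped automatically when the document is serialised.
    $element->appendChild($doc->createTextNode($value));
    $parent->appendChild($element);
    return $element;
}

// Stand-in for revolt_final.xml; the root name "reviews" is assumed.
$xmldoc = new DOMDocument();
$xmldoc->loadXML('<reviews/>');

// Stand-in for the values that would arrive in $_POST.
$posted = [
    'reviewer'  => 'DJ Tom',
    'regarding' => 'Event',
    'month'     => 'May',
    'feedback'  => 'Great set & venue',
];

// documentElement is the root element even if a comment precedes it.
$root = $xmldoc->documentElement;
foreach (['reviewer', 'regarding', 'month', 'feedback'] as $field) {
    appendField($xmldoc, $root, $field, $posted[$field] ?? '');
}

echo $xmldoc->saveXML();
```

In the real page the array would be replaced by $_POST and the document would be loaded from and saved to revolt_final.xml as in the walkthrough.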
The DOM document is created and the user is sent to the page revolt_return.php.
At this page the user will be able to see the DOM document reloaded;
revolt_final.xml is reloaded as follows:


<?php



DOM document created in variable $xmldoc and revolt_final.xml is loaded in to
said variable...

  $xmldoc = new DOMDocument();
  $xmldoc->load("revolt_final.xml", LIBXML_NOBLANKS);

Create $name variable containing the first child node of the root element of
‘revolt_final.xml’...

  $name = $xmldoc->firstChild->firstChild;

If $name is not empty...

  if($name!=null){
     while($name!=null){

then the text of each element of the XML document is returned using the
textContent XML DOM property...

          echo $name->textContent.'<div class="subheading"><p>Entry</p></div>';

After the subheading class the next element of the XML document is returned
using the nextSibling property...

          $name = $name->nextSibling;
      }
  }
?>
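The traversal above can also be written with a foreach over the root element's child nodes, escaping each value before it is placed in the HTML page. This is a minimal sketch under assumed names: the function renderEntries, the root element reviews and the sample content are hypothetical.

```php
<?php
// Hypothetical helper: returns one HTML block per child element of the
// document's root element.
function renderEntries(DOMDocument $doc): string
{
    $html = '';
    foreach ($doc->documentElement->childNodes as $entry) {
        if ($entry->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip whitespace text nodes and comments
        }
        // htmlspecialchars() stops user-supplied text being read as markup.
        $html .= '<div class="subheading"><p>' . htmlspecialchars($entry->nodeName) . '</p></div>';
        $html .= '<p>' . htmlspecialchars($entry->textContent) . '</p>' . "\n";
    }
    return $html;
}

// Stand-in for revolt_final.xml.
$xmldoc = new DOMDocument();
$xmldoc->loadXML('<reviews><reviewer>DJ Tom</reviewer><feedback>Loud &amp; clear</feedback></reviews>');

$output = renderEntries($xmldoc);
echo $output;
```

Checking nodeType makes the loop independent of whether the document was loaded with LIBXML_NOBLANKS.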

4.7.2 XML Document Storage
After displaying the XML document using the PHP DOM method, the document
is inserted into the XML-enabled database using a submit button. On the clicking
of the submit button found on revolt_return.php, update.php is called and the
XML document is uploaded using the following code:


<?php

Connection to the database is accomplished using the regular PHP SQL
connection method. Connect to the remote server...

          $con = mysql_connect("localhost", "thomr", "thomr1981");


       if ( ! $con )
       die("Cannot connect to server!");

Connect to the database revolt...

       $db = mysql_select_db("revolt");
       if ( ! $db )
       die ("Cannot connect to database!".mysql_error());

Create a variable containing an INSERT SQL statement and insert the data into
the revolt_feedback table in the revolt database...

       $sql = "INSERT INTO revolt_feedback VALUES
                    (NULL,

Load the revolt_final.xml document using the MySQL LOAD_FILE function...

       LOAD_FILE('c:/wamp/www/revolt/final/revolt_final.xml'))";

Execute the query, $sql...

       $result = mysql_query($sql);
       if ( ! $result )
       die ("Cannot execute query!".mysql_error());

Send the user to display.php where the XML document will be displayed...

       header('Location:display.php');

?>
At display.php the XML document is retrieved by using a SELECT SQL
statement. Connection is achieved via the same means as above and the statement
is as follows:

       $sql = "SELECT * FROM revolt_feedback";

Because the XML document is stored as a single row in revolt_feedback, the
document is returned as one field of a row. The variable $row is created
containing the first row fetched from $result, which holds the result set of the
SQL statement above...

       $row = mysql_fetch_row($result);
       while ( $row != null )
       {
       echo "<div class=\"subheading\"><p>Feedback Id # $row[0]
       </p></div> ";

The XML document contained within $row is then echoed by PHP and displayed...



          echo "<tr> <td> $row[1] </td> </tr>\n";
          $row = mysql_fetch_row($result);
          }
This method has been used in exactly the same manner when satisfying the
uploading of experience and tips to the Revolt DJ Forum; the inputs that have
been posted using PHP however are different. The final artefact can be found on a
remote server and the source code can be found in the ‘final’ folder on the
CDROM in the appendix.
In adherence to the DSDM framework upon which the project was based, having
created the artefact it was necessary to test what had been produced as part of the
iterative nature of the framework.

4.8 Artefact Testing
In a DSDM development project, testing is completed at each iteration of the
project, i.e. if something is changed then the change is tested in tandem with
the specified user according to their needs. The following testing methods were
produced after the final iteration of the project, when the artefact was deemed
complete.
Three forms of testing were implemented to test the usability of the system
produced. The tests were created in order to assess whether or not the artefact
satisfied the user requirements and stories. Firstly, a test case was produced in
which the user was asked to carry out a number of inputs to the system and
describe the output that the system returned. Secondly, a user acceptance test
was produced, again involving a number of user inputs and recording of the output
returned; the first two tests were used as part of a verification process.
Thirdly, heuristic evaluations were performed (refer to 5.2).

4.8.1 Test Case
Figure 28 is an example of a user input to the system and the required response
format.
Input 6
Click ‘Forum’ on the Welcome page. Fill in the fields apparent and click
‘Submit’. Are the details you just entered present on the page? Tick Yes or No
Result:

Yes          No

                                Figure 28: Test Case


4.8.2 User Acceptance Test
Figure 29 is an example of 1 specific user acceptance test. A number of
acceptance tests were performed testing each particular user requirement.

1. Register Details

Purpose               Register personal details
Prerequisite          User must have visited the Revolt DJ Forum
Test Data             Firstname {Enter fname into login table}
                      Lastname {Enter lname into login table}
                      Username {Enter username into login table}
                      Password {Enter password into login table}
                      Alias {Enter alias into login table}
                      Email {Enter email into login table}
Steps                 1. User visits Revolt DJ Forum
                      2. User clicks ‘Register’
                      3. User enters required data into text boxes
                      4. User clicks ‘Submit’
                      5. User sees login page
Notes and Questions


                         Figure 29: User Acceptance Test

Having investigated semi-structured data and XML, XML storage methods,
elicited what was required of the system, developed user stories, created an
artefact and tested what was produced, a discussion of what was found out is
required.

5. Project Assessment
The purpose of the project was to investigate the use of XML in a DJ database
system. In investigating XML a number of requirements were elicited involving
storing and manipulating XML documents in an XML-enabled database. Using
the results gathered from the tests performed in 4.8, it is possible to assess
whether or not the requirements of the XML database system were satisfied,
essentially establishing whether or not the artefact and in turn the project was a
success. Each requirement will now be analysed to determine whether or not it
was satisfied by the artefact.

5.1 Requirements Analysis
1. The user shall be able to register their personal details and store them in an
XML database



1.2 Details to be registered:
   •   Firstname
   •   Lastname
   •   Username
   •   Password
   •   Alias
   •   Email Address

Result: Requirement satisfied

Rationale: The user can register details and create an account. The results of
input 1 of the test case show that a user cannot login to the system unless they
have registered their details on the database. On clicking ‘Login’ without
registering details the following error message is returned:

You have entered Incorrect details, please try again.
please try to login again.Click here to retry

2.1 The user shall be able to login to the web site
2.2 Details to be entered will be:
    • Username
    • Password
    • DJ alias
Result: Requirement satisfied
Rationale: The user can login to the system. The results of input 5 of the test
case display the user ‘Alias’ on the welcome page, thereby proving that the user
session has been created and the user has logged in.



                 Figure 30: Alias Session (callout: DJ ‘Alias’ displayed)

3.1 The user shall be able to upload feedback to the web site using a stored XML
document
3.2 Feedback to be stored:
     • Name
     • Regarding
     • Month
     • Comment
3.3 Name: the name of the user using the DJ forum
3.4 Regarding: what the comment the user is uploading relates to i.e. an event
or MP3

3.5 Month: when the feedback was uploaded
3.6 Comment: comment user wishes to add
Result: Requirement satisfied
Rationale: The user acceptance tests 3 and 4 results verify that the requirement is
satisfied. Input 4 of the test case asks for the feedback Id# range. Figure 31
displays the original Id # range.




                                    Figure 31: Entry

The range returned by all tests is 1 to 2. On completion of Input 6 as follows...








                                 Figure 32: Entry

On clicking ‘Submit’ the entries are displayed...




                                 Figure 33: Entry

On clicking ‘Update’...








                 Figure 34: Feedback Id (callout: Feedback Id # 3)

The entries are added to the XML-enabled database. The result of Input 9 displays
that the user entry has been added and the feedback Id # is 3.


4.1 The user shall be able to view the feedback they have uploaded via a feedback
link in the web site (XML document returned and styled using XSL)
4.2 The user shall be able to view all feedback uploaded to the web site
Result: Requirement satisfied
Rationale: Refer to requirement analysis 3 for proof of satisfaction
5. The user shall be able to upload tips and experience to the web site, storing the
XML data in an XML database
Result: Requirement satisfied
Rationale: The results of user acceptance test 5 return the following:
After steps 1 to 3 user inserts data (Step 4)...








                                Figure 35: Entry

User clicks ‘Submit’ and sees data entered (Steps 5 and 6)...




                           Figure 36: Data Displayed

User completes steps 7 to 9 and the following is displayed...








       Figure 37: Data Inserted (callout: Entry inserted into XML database)

6. The user shall be able to listen to MP3 samples accessed via a web link
Result: Requirement partially satisfied
Rationale: The user acceptance results for test 6 reveal that 2 of the samples
can be listened to; the other 2 are not MP3 files.

5.2 Failed Requirement
When assessing the requirements in relation to the MoSCoW rules formulated in
3.4, it was possible to see that all of the requirements, regardless of priority,
had been implemented successfully. However, one issue was raised regarding how
the XML document was returned. For both requirements 4 and 5 it was required that
the XML document would be returned and styled using an XSL stylesheet. This
mini requirement was not satisfied, because the whole XML document was stored in
1 row of a relational table and returned as 1 row. Attempts were made to open an
XSL stylesheet (revolt.xsl) using the XML DOM load() method, but they were not
successful. Figure 38 displays how the XML document was returned for the
feedback requirement (4).

                         Figure 38: Unformatted XML
   (callout: XML document is not styled; the whole document is returned as one
                                 row of data)

This fact was re-affirmed by the heuristic evaluations performed when testing the
system. A number of test results indicated that the system was usable and
aesthetically pleasing apart from this particular requirement; the data entries
returned were not aesthetically pleasing (refer to appendix for sample heuristic
evaluation).
This particular mini requirement could have been satisfied if the XML document
had been stored in its original hierarchical structure in an NXD or shredded into
an XML-enabled database with schema and index reference. However, problems
were experienced in both of these areas and project scope alterations were
required as a result.
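For reference, one way in which a stored XML document might be styled on return is PHP's XSLTProcessor class, which requires the xsl extension to be enabled. The sketch below is not the artefact's code: the XML string stands in for the value retrieved from the database row, and the stylesheet and element names are assumptions. Note that a second load() call on the same DOMDocument object replaces the document loaded first, so the stylesheet is kept in its own DOMDocument.

```php
<?php
// Stand-in for the XML document returned from the database (e.g. $row[1]).
$xmlString = '<reviews><review><reviewer>DJ Tom</reviewer>'
           . '<comment>Great night</comment></review></reviews>';

// Hypothetical stylesheet; revolt.xsl itself is not reproduced here.
$xslString = <<<'XSL'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="no"/>
  <xsl:template match="/reviews">
    <ul>
      <xsl:for-each select="review">
        <li><xsl:value-of select="reviewer"/>: <xsl:value-of select="comment"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
XSL;

// The data document and the stylesheet live in separate DOM objects.
$xmldoc = new DOMDocument();
$xmldoc->loadXML($xmlString);

$xsldoc = new DOMDocument();
$xsldoc->loadXML($xslString);

// Apply the stylesheet to the data document.
$processor = new XSLTProcessor();
$processor->importStylesheet($xsldoc);
$html = $processor->transformToXml($xmldoc);

echo $html;
```

Because loadXML() accepts a string, the document does not need to be written back to a file before it is transformed.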

5.3 Effectiveness of Method Used
The method utilised was XP coupled with the DSDM framework; the DSDM
framework relies on a feasibility study followed by iterative development stages
in which aspects of the system are developed, tested and then developed and
tested further until the system is ready to be implemented. Each phase of the
DSDM framework relies on constant interaction with the user of the system in
order to ascertain exactly what it is they require. Adhering to the framework
proved difficult because coding of the proposed system commenced as soon as the
project was initiated. As a result a number of the elements of the framework had
to be reverse engineered, e.g. MoSCoW rules were reverse engineered because all
of the requirements were completed in unison and not in order of requirement
importance.



In planning the project according to the DSDM framework, it would have been
beneficial to implement timeboxes to aid in the completion of the various stages
of the project itself i.e. how long it should take to complete each iteration or finish
the feasibility study. When using timeboxes one timebox could be set up
portraying the whole duration of the project and within the project timebox
smaller ones are set up defining when the element should be completed (Ramsin
et al. 2008). Each timebox normally lasts about two to six weeks and they aid in
defining start and end dates as well as setting boundaries and sign off periods for
iterations (Ramsin et al. 2008).
Timeboxes were not used because it was decided that having boundaries for each
element of the development process and initiating sign offs may put pressure on
other areas of the project. It was believed that finishing a specific area before
starting another was the better method to employ. In some cases less important
areas were put on hold or eliminated because more pressing issues ensued i.e.
PICTIVE design sessions were not finished because a medium-fidelity prototype
was perceived as a better method of defining interface design and orientation.
Testing in iterative phases was non-existent because the coding process followed
merely a trial and error process; if something was wrong then it was fixed without
the user at hand. Very little testing of each phase of the system took place
because user interaction was at a minimum. Testing took place once the system had
been implemented, as a means to ascertain whether or not the requirements of the
system had been satisfied.
Also, a feasibility study was not undertaken prior to design and implementation;
the design phase began immediately. It would have been beneficial to conduct a
feasibility study prior to implementation to assess the ease of installation of an
NXD and XML-enabled database and manipulation of an XML document. Had the
feasibility of what was proposed been assessed, the problems encountered when
attempting to install an NXD (refer to 6) could have been avoided and more time
could have been taken in the investigation of XML-enabled databases and the
configuration procedures involved.
However, the prototyping phase proved very useful in depicting both the user
interface (medium fidelity prototype) and system functionality (vertical
prototype). The user interface was experimented with and an aesthetically
pleasing interface was created using the low and medium fidelity prototypes.
System functionality was ascertained using the vertical prototype, which proved
to be an extremely beneficial element of the DSDM framework.
In hindsight, the XP/DSDM methodology employed was useful because it
provided general guidelines upon which the software development project could
rely; although a number of processes were not completed, the processes that were
completed assisted in the creation and completion of the DJ database system, i.e.
the gathering of requirements, user stories, prototyping and user acceptance
testing post implementation.
In completing the project in the specified timeframe a number of alterations were
made. The following section details the problems encountered and the alterations
implemented as a result.

6. Project Alterations
A number of problems were encountered in completing the project, largely
problems related to experimenting with different NXDs and XML-enabled
databases. As a result a number of alterations were made to the overall scope of
the project.

6.1 NXD Installation Issues
Having investigated NXDs and XML-enabled databases in section 2, the artefact
was originally intended to contain both an NXD and an XML-enabled database: an
XML-enabled database for requirements 3 and 4 and an NXD for requirement 5.
The two implementations would then have been tested and compared to ascertain
which database was more efficient based on the query speeds of each database,
similar to the tests mentioned in the literature review.
Due to configuration issues with NXD implementation it was decided to
implement only an XML-enabled database for the XML documents to be stored in.
Attempts were made to implement two NXDs: eXist and Oracle Berkeley
DBXML.

6.2 eXist
It was possible to install the eXist NXD but issues were apparent when attempting
to open the database; an account could be created but access to the database was
not granted. Figure 39 displays the problem that was encountered when
attempting to access the eXist NXD.




                               Figure 39: Error

Because of the access problems experienced with eXist, it was decided to try a
second NXD: Oracle Berkeley DBXML (BDBXML).

6.3 BDBXML
With this NXD it was possible to install the database and even to create XML
documents, insert the required data and query the document created using an
MSDOS command prompt, but issues were raised when attempting to access and
manipulate the XML document using PHP and display the query results on the
specified PHP Web page. Figures 40 to 42 display creating and querying an XML
document in BDBXML.
Create document:




                         Figure 40: DBXML Create

Return document:







                          Figure 41: DBXML Select

Query document:




                         Figure 42: DBXML Query

When attempting to open the document revolt.dbxml using the following PHP
(test.php) script...
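<?php
// Create the Berkeley DB XML manager (needs the BDBXML PHP extension)
$mgr = new XmlManager(null);
// Open the container if it already exists, otherwise create it
if (file_exists("revolt.dbxml")) {
    $con = $mgr->openContainer("revolt.dbxml");
} else {
    $con = $mgr->createContainer("revolt.dbxml");
}
// $book_name and $book_content are expected to hold the document name and XML
$con->putDocument($book_name, $book_content);
?>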
The following error message was returned:

Fatal error: Class 'XmlManager' not found in C:\wamp\www\test.php on line 3

This error was due to the required PHP extension not being available in the
php.ini file and BDBXML having to be built using an IDE such as MS Visual
Studio. The point of the project was to investigate the use of XML in a DJ
database system and it was perceived that the installation issues experienced were
detracting from the real essence of the project; it was therefore decided to
alter the concept of the project and decide upon one XML database to utilise. An
XML-enabled database was therefore utilised, purely because of the ability to
store and retrieve an XML document from it without installation problems.

6.4 Oracle, PHP and XML Issues
Prior to deciding to utilise the BLOB format of a MySQL database to hold the
whole XML document in 1 row of a relational table, investigation was made into
employing the XMLType data type of an Oracle database to store an XML
document and schema. Because an NXD could not be installed and configured
successfully, it was decided to attempt to utilise an Oracle database and the
XSU (XML SQL Utility) method in order to form a basis for comparison between two
XML-enabled databases: one that stored the data as a whole document in 1 row
(MySQL and its BLOB utility), and one that shredded each data element into a
different row of a relational table (Oracle XML XSU). Initially, it was possible to
store an XML document in an Oracle database that TVU utilised but problems
arose when attempting to access and manipulate the XML documents using PHP.
The oci8 extension that PHP requires to access an Oracle database via the
oci_connect PHP command did not exist on the Apache server at TVU. Even after
installing a remote Oracle database and creating and inserting XML documents
manually, issues still arose when attempting to access and manipulate the XML
document using PHP in the Revolt Web application. The following error was
returned on execution of an Oracle test page:

Fatal error: Call to undefined function oci_connect() in
C:\wamp\www\oracle_test.php on line 4

This problem again arose due to issues with the oci8 extension, and attempts to
re-configure the php.ini file to enable the extension and the related PATH
environment variable were unsuccessful. It was at this stage that it was decided
to utilise only the MySQL BLOB type, which enabled the storage of an entire XML
document in 1 row of the table. This was purely because it was the only method
that proved successful as regards XML document manipulation via PHP DOM and
storage of the document created.
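Both fatal errors above arise because a class or function supplied by a PHP extension was not loaded. A short diagnostic along the following lines can confirm what is available before the application code runs. This is only a sketch: the function name checkXmlBackends and the labels are hypothetical, while class_exists and function_exists are standard PHP functions.

```php
<?php
// Hypothetical diagnostic: reports whether the class or function each
// storage option depends on has been made available by its extension.
function checkXmlBackends(): array
{
    return [
        'PHP DOM (DOMDocument)'         => class_exists('DOMDocument'),
        'Berkeley DB XML (XmlManager)'  => class_exists('XmlManager'),
        'Oracle oci8 (oci_connect)'     => function_exists('oci_connect'),
    ];
}

$status = checkXmlBackends();
foreach ($status as $name => $available) {
    echo $name . ': ' . ($available ? 'available' : 'missing') . "\n";
}
```

A "missing" entry points to the extension not being enabled in php.ini, rather than to a fault in the script that uses it.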

6.5 Data Structure
The data being accessed, stored and manipulated was not in semi-structured
format; the data is actually structured. Section 2.2 describes how
semi-structured data appears, and it was apparent that the data being gathered
by the DJ forum was not in semi-structured form. Semi-structured data has an
irregular and unstructured form whereas the data in the artefact was structured.




                       Figure 43: Structured Data Format

This data can be represented as...
<review>
<name></name>
<reviewer></reviewer>
<date></date>
<comment></comment>
</review>

The original intention of the artefact was to collect more extensive feedback in
relation to the event that the customer had been to. It was proposed originally for
the feedback requirement that data relating to music genre, venue description, how
easy the venue was to find, quality of mixing and preferred music would be
captured and stored in the XML database. A varying set of data would have been
captured, some comments long, some short and not all of them relevant to the
actual DJ event. If this data had been stored in a relational database, a classic
example of sparse data would have resulted (refer to Figure 7, page 14); such
data is well suited to XML because XML does not impose any set boundaries on
structure. The intended structure could have been like the following, and
therefore semi-structured:
<review>
<name></name>
<reviewer></reviewer>
<comment>
 <genre></genre>
 <venue>
  <venue_description></venue_description>
  <find></find>
 </venue>
</comment>
</review>

                         Figure 44: Intended Structure
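To illustrate how semi-structured data of this kind could be queried, the sketch below uses PHP's DOMXPath class on a small sample in which one review carries venue details and another does not. The sample data and the root element reviews are assumptions; the element name venue_description is also assumed, since XML element names cannot contain spaces.

```php
<?php
// Hypothetical sample data following the intended structure.
$xml = <<<'XML'
<reviews>
  <review>
    <name>Club night</name>
    <reviewer>DJ Tom</reviewer>
    <comment>
      <genre>House</genre>
      <venue>
        <venue_description>Small basement club</venue_description>
        <find>Easy</find>
      </venue>
    </comment>
  </review>
  <review>
    <name>Festival</name>
    <reviewer>Sam</reviewer>
    <comment>
      <genre>Techno</genre>
    </comment>
  </review>
</reviews>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Select only the reviews that actually carry a venue description.
$withVenue = $xpath->query('/reviews/review[comment/venue/venue_description]');
foreach ($withVenue as $review) {
    $name  = $xpath->evaluate('string(name)', $review);
    $venue = $xpath->evaluate('string(comment/venue/venue_description)', $review);
    echo $name . ': ' . $venue . "\n";
}

// An element that is absent yields an empty string, with no empty
// column having to be stored for it.
$second  = $xpath->query('/reviews/review[2]')->item(0);
$missing = $xpath->evaluate('string(comment/venue/find)', $second);
```

The second review simply omits the venue element, which is the irregularity that a fixed relational table would have to represent with empty columns.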

Unfortunately there were serious delays in starting the project due to the
inability of TVU to make PHP with Oracle available (the preferred technology).
Eventually a remote server named WampServer had to be implemented and the
application had to be simplified because of the time constraints; a simplified
version was implemented to test the application.

7. Conclusion
Having investigated the use of XML in a DJ database system, XML and
XML storage methods have been assessed, and a DJ database system has been
implemented loosely utilising the XP/DSDM software development methodology.
Due to the inability to install and in turn utilise an NXD, a comparison could not
be made between an NXD and an XML-enabled database as originally intended.
Efforts were made to install two NXDs (BDBXML and eXist) so that a comparison
could be made, but these proved unsuccessful.
In the literature review it is mentioned that XML-enabled databases are more
reliable due to them being tried and tested and the standard for data storage
(Connolly et al., 2006). Pardede et al. (2005) reiterate the limitations of
NXDs, stating that they are in their “infancy” and that XML-enabled databases
are preferred because of the mature RDBMS foundation they are based on. Given
the problems experienced in installing two NXDs and configuring an NXD to use
PHP, which made it impossible to experiment with NXDs at all, it is possible to
agree with this argument. The MySQL database used in the DJ database system did
not have to be configured to utilise the PHP scripting language; an XML document
was just stored as a BLOB and was accessed and manipulated accordingly using SQL
statements, although not styled using XSL on return.
Conversely, only the scripting language PHP was investigated and this is not the
basis for a solid argument, because numerous other programming and querying
languages, e.g. Java and XPath, could be employed instead of PHP and SQL to
access and manipulate the XML document; PHP and SQL were used merely
because of familiarity with the two languages. Also, if a proper feasibility study
had been performed prior to the design and build phase, further investigation
could have been completed in ascertaining how NXDs are installed and extra time
could have been set aside to install and configure the desired NXD. Without a
basis for comparison between the two originally required XML databases, NXDs
cannot be judged. Reference can be made to the experiment performed by Serna and
Gerrikagoitia (2005) in the literature review, who found that in general the
eXist NXD performed better than the XML-enabled Oracle as regards query time,
but again this is only their experiment performed under their conditions.
Assessment cannot be made until experiments are made first hand.
As a result, a number of different areas of improvement, recommendations and
future work can be offered.
7.1 Recommendations
Below are recommendations that could assist in improving the Revolt DJ database
system:
   •   Investigate the retrieval of an XML document with an XSL stylesheet
       so that the XML document returned is formatted in an aesthetic manner
   •   Look into different methods of manipulating XML documents other than
       PHP DOM
   •   Investigate how to configure an NXD to utilise PHP so that an NXD can
       be investigated
   •   Investigate the query language XPath instead of relying merely on the
       query language SQL to access the specified back-end database

7.2 Future Work
Below are some possible areas of future work:
   •   Implement an NXD for one of the requirements so that a comparison of
       the two XML databases can be completed
   •   Style the XML documents retrieved (revolt_feedback.xml and
       experience.xml) by the DJ database system
   •   Create a totally XML based web application i.e. XML login and XML
       registration of details


8. Critique
The following critique is a summary of what has been found in 6. The initial
objective of the assignment was to investigate the use of XML in a DJ database
system, and in doing this an artefact was produced investigating this paradigm.
The objective was to produce a PHP XML Web application that enabled users to
register details, login, upload feedback and experience to an XML database, and
return the feedback and experience to the Web page using PHP. There were 6
initial requirements that were set up and all of the requirements have been met.
However, the sub-requirement which involved styling the XML document
returned was not completed and this has been outlined in 5.
The requirements were met but a proposed comparison between XML-enabled
databases and NXDs in order to ascertain which was the most efficient to employ
was not completed due to installation issues (refer to 6). Also, the data that was
captured by the Web site was not semi-structured (refer to 6) due to configuration
problems, so the overall purpose of the project was somewhat undermined by this
fact. However, XML is widely used on the Internet, so an investigation into how
it is used proved beneficial.
Even though a number of problems arose, the overall question and purpose of the
project was realised in that a Web application using XML was produced and the
XML document could be manipulated and stored in an XML database. Working with
the XML technology proved difficult but the production of the application and
the eventual storing and manipulation of the document proved most satisfying.
Further investigation into XML and how it is used on the Web will be of interest
in the future.




9. References
BUNEMAN, P. (1997) Semi-structured data. Tutorial, In: Symposium on
Principles of Database Systems. UPenn DB Engineering. Retrieved March 16th,
2010, from http://www.cis.upenn.edu/~db/abstracts/semistructured.html



SPERBERG-MCQUEEN, C.M. (2005) XML. Queue, 3 (8). October, pp.34-41.





BURLESON, D. (2009) Identifying Oracle sparse tables. Burleson Consulting.
Retrieved March 16th, 2010, from
http://www.dbaoracle.com/t_identify_sparse_tables.htm



WOOD, P. Semi-Structured Data: Example of Semi-Structured Data. Retrieved
14th May, 2010, from http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/notes.html



ROY, J. & RAMANUJAN, A. (2001) XML schema language: taking XML to
the next level. IT Professional, 3 (2), pp.37-40.



WORLD WIDE WEB CONSORTIUM (W3C). (2008) Extensible Markup
Language (XML) 1.0 (Fifth Edition) W3C Recommendation. November 26th.
Retrieved March 16th, 2010, from http://www.w3.org/TR/xml/



W3 SCHOOLS (W3S). Introduction to XML. Retrieved March 16th, 2010, from
http://www.w3schools.com/xml/xml_whatis.asp



HONG, S. & SONG, Y-T. (2007) Efficient XML query using Relational Data
Model. In: Software Engineering, Artificial Intelligence, Networking, and
Parallel/Distributed Computing. Eighth ACIS International Conference. 3,
pp.1095-1100.

APPLEQUIST, D.K. (2002) XML and SQL: Developing Web Applications.
Addison-Wesley.


SERNA, A. & GERRIKAGOITIA, J-K. (2005) David and Goliath: A
Comparison of XML-Enabled and Native XML Data Management
Techniques. June 28th. Retrieved March 25th from http://xml.sys-con.com/node/104980


CHAUDHRI, A.B., RASHID, A. & ZICARI, R. (2003) XML data management:
native XML and XML-enabled database systems.


Investigating the Use of XML in a DJ Database System

  • 1. Thames Valley University Faculty of Professional Studies BSc Computing and Information Systems Investigating the use of XML in a DJ database system Thomas David Stanley Rogers May 2010
  • 2. BSc Computing and Information Systems Project Thomas Rogers INVESTIGATING THE USE OF XML IN A DJ DATABASE SYSTEM – MAY 2010 Thomas David Stanley Rogers Submitted in support of BSc Computing and Information Systems May 2010 DECLARATION “I certify that this work has not been accepted in substance for any degree, and is not concurrently being submitted for any degree other than that of BSc Computing and Information Systems being studied at Thames Valley University. I also declare that this work is the result of my own investigations except where otherwise identified by references and that I have not plagiarised another’s work”. 2
  • 3. BSc Computing and Information Systems Project Thomas Rogers Contents Page Figures ..................................................................................................................... 5 Abstract ................................................................................................................... 7 Introduction ............................................................................................................. 7 Aims and Objectives ........................................................................................... 7 Research Element ................................................................................................ 7 Developmental Element ...................................................................................... 8 Artefact Deliverables ........................................................................................... 8 Terms of Reference ............................................................................................. 8 Background ......................................................................................................... 8 Target Audience .................................................................................................. 8 Research Element .................................................................................................... 9 1. Literature Review ................................................................................................ 9 2.1 The role XML plays in handling semi-structured data.................................. 9 2.3 NXD versus XML-enabled relational databases ......................................... 15 2.3.1 Investigating Oracle XMLType:CLOB................................................ 19 2.3.2 Investigating Oracle XSU (XML SQL) ............................................... 19 2.3.3 Investigating eXist ................................................................................ 
20 2.3.4 Investigation Results ................................................................................ 20 Developmental Element ........................................................................................ 21 3. Methodology ..................................................................................................... 21 3.1 Initial User Requirements ................................................................................ 22 3.1.1 Possible requirements ............................................................................... 23 3.2 Use Case Diagrams ......................................................................................... 23 3.3 User Stories ..................................................................................................... 25 3.3.1 Register Details ........................................................................................ 25 3.3.2 User Login ................................................................................................ 26 3.3.3 Upload Feedback ...................................................................................... 26 3
  • 4. BSc Computing and Information Systems Project Thomas Rogers 3.3.4 View Feedback ......................................................................................... 26 3.3.5 Upload Tips and Experience .................................................................... 27 3.3.6 Listen to MP3’s ........................................................................................ 27 3.4 MoSCoW Rules ............................................................................................... 27 4. Artefact Design and Construction ..................................................................... 28 4.1 Prototypes .................................................................................................... 28 4.1.1 Low Fidelity Prototype ............................................................................. 29 4.1.2 Medium Fidelity Prototype ...................................................................... 29 4.1.3 Vertical Prototype..................................................................................... 30 4.2 XML Experiment ........................................................................................ 32 4.5 Technology Used ......................................................................................... 33 4.6 Database Schema ......................................................................................... 33 4.7 Code Walkthrough ...................................................................................... 33 4.7.1 XML PHP DOM ...................................................................................... 34 4.7.2 XML Document Storage .......................................................................... 36 4.8 Artefact Testing ........................................................................................... 38 4.8.1 Test Case .................................................................................................. 
38 4.8.2 User Acceptance Test ............................................................................... 39 5. Project Assessment ............................................................................................ 39 5.1 Requirements Analysis ................................................................................ 39 5.2 Failed Requirement ..................................................................................... 45 5.3 Effectiveness of Method Used .................................................................... 46 6. Project Alterations ............................................................................................. 48 6.1 NXD Installation Issues............................................................................... 48 6.2 eXist............................................................................................................. 48 6.3 BDBXML .................................................................................................... 49 6.4 Oracle, PHP and XML Issues ...................................................................... 51 6.5 Data Structure .............................................................................................. 52 7. Conclusion ......................................................................................................... 53 7.1 Recommendations ....................................................................................... 54 7.2 Future Work ................................................................................................ 55 8. Critique .............................................................................................................. 55 4
  • 5. BSc Computing and Information Systems Project Thomas Rogers 9. References ......................................................................................................... 56 10. Appendix ......................................................................................................... 60 Figures Figure 1: Semi-structured Data ............................................................................. 10 Figure 2: Hierarchical Data Structure.................................................................... 10 Figure 3: XML Data (Kroenke, 2002,p452) ........................................................ 11 Figure 4: XML Represented in a Hierarchical Structure....................................... 11 Figure 5: Relational 'Flat' Structure Source: Applequist, p20 ............................... 13 Figure 6: Complex Query ...................................................................................... 14 Figure 7: Sparse Data ............................................................................................ 14 Figure 8: Semi-structured to Relational ................................................................ 15 Figure 9: XML Index ............................................................................................ 17 Figure 10: Hong et al. XML Tag ........................................................................... 18 Figure 11: Register Details and Login .................................................................. 24 Figure 12: Feedback .............................................................................................. 24 Figure 13: Experience/Tips ................................................................................... 25 Figure 14: Register User Story .............................................................................. 25 Figure 15: Login User Story .................................................................................. 
26 Figure 16: Upload Feedback User Story ............................................................... 26 Figure 17: View Feedback User Story .................................................................. 26 Figure 18: Experiences/Tips user Story ................................................................ 27 Figure 19: MP3 User Story ................................................................................... 27 Figure 20: Low Fidelity Prototype ........................................................................ 29 Figure 21: Medium Fidelity .................................................................................. 30 Figure 22: Details .................................................................................................. 31 Figure 23: Entry..................................................................................................... 31 Figure 24: Prototype Return .................................................................................. 32 Figure 25: XML Experiment ................................................................................. 32 Figure 26: Database Schema ................................................................................. 33 Figure 27: XML Inputs .......................................................................................... 34 5
  • 6. BSc Computing and Information Systems Project Thomas Rogers Figure 28: Test Case .............................................................................................. 38 Figure 29: User Acceptance Test .......................................................................... 39 Figure 30: Alias Session ........................................................................................ 40 Figure 31: Entry..................................................................................................... 41 Figure 32: Entry..................................................................................................... 42 Figure 33: Entry..................................................................................................... 42 Figure 34: Feedback Id .......................................................................................... 43 Figure 35: Entry..................................................................................................... 44 Figure 36: Data Displayed ..................................................................................... 44 Figure 37: Data Inserted ........................................................................................ 45 Figure 38: Unformatted XML ............................................................................... 46 Figure 39: Error ..................................................................................................... 49 Figure 40: DBXML Create.................................................................................... 49 Figure 41: DBXML Select .................................................................................... 50 Figure 42: DBXML Query .................................................................................... 50 Figure 43: Structured Data Format ........................................................................ 
52 Figure 44: Intended Structure ................................................................................ 53 6
  • 7. BSc Computing and Information Systems Project Thomas Rogers Abstract EXtensible Mark-up Language (XML) is used to transfer data between applications in a semi-structured format which does not conform to the traditional relational database format of tables and columns. Research into the role XML plays in the handling of semi-structured data and how it handles said data better than the relational approach leads on to a discussion of native XML databases (NXD) and XML-enabled databases within the initial research element of the this report; NXD’s are the standard for storing XML documents and XML-enabled databases are used to store XML documents in a relational database. The research element is preceded by a developmental element and implementation of an artefact that incorporates the findings and demonstrates the use of XML in a DJ database system. The artefact is a XML PHP DJ forum Web application and database that enables users to upload and download semi-structured XML data in the form of feedback, tips, experiences and set samples. The development process adheres to the Dynamic Systems Development Method (DSDM) software development methodology in cohesion with eXtreme Programming (XP) and each phase is explained and presented within this document. This involves requirements elicitation and development, UML modelling and prototype design, incremental testing, implementation and evaluation of the artefact and user acceptance testing. Introduction Aims and Objectives The aim of this project is to investigate the use of XML in a DJ database system. 
It is a research and development project in which XML and its storage methods will be investigated followed by the design, development, implementation and testing of an artefact based on the DSDM framework Research Element The research element of the project will investigate the following areas: • Investigate XML and the role it plays in handling semi-structured data and the storage options for XML documents • Discussion and comparison of native XML databases and XML-enabled relational databases 7
  • 8. BSc Computing and Information Systems Project Thomas Rogers Developmental Element In investigating the use of XML in a DJ database system the following areas will be incorporated in the development of the project: • A PHP application capturing user data • XML to transfer data • XML-enabled relational database such as Oracle to store the data or a native XML database Artefact Deliverables I have a particular interest in being a DJ and recording and storing music that I produce. The proposed software development project will attempt to: • Enable the user to register and login to the system • Capture and store user data or feedback related to DJ sets performed i.e. semi-structured data • User upload area enabling the sharing of DJ tips and experience • Store MP3 samples of mixes that users can download; this however cannot be stored in the XML document because the file will get too big • Possibly a blog area enabling users to add their tips and experience Terms of Reference Background A particular interest in Web based applications and the PHP scripting language has lead to an interest in learning XML and the programming involved in producing an application using this paradigm. It was seen as beneficial to attempt to incorporate being a DJ with this interest in the completion of the literature review and production of the DJ database system. Target Audience The intended target audience of this project could be: • Web developers interested in using XML Demonstrate reasons for employing XML in Web application Alternatives i.e. what type of XML database to employ • Aspiring DJ’s wanting to market themselves • Users uncertain of DJ music genres 8
  • 9. BSc Computing and Information Systems Project Thomas Rogers • DJ’s wanting to share information with each other i.e. experiences and tips Prior to the design and implementation of the system, research in to the topic area will ensue. Research Element 1. Literature Review The proposed software application intends to demonstrate the use of XML in a DJ database system; resultantly the need to investigate XML, its components and storage methods is essential prior to design and implementation. The following literature review ultimately attempts to convey the need for this project and the development of the database system. According to Sperberg-Mcqueen (2005) XML plays a huge part in solving the issues of semi-structured data. This review subsequently outlines what semi-structured data is and the role XML plays in the handling of it and discusses how XML handles semi-structured data differently to the traditional relational approach. This leads to a discussion and comparison of two separate XML storage techniques: XML-enabled and Native XML Databases (NXD); the comparison of the two will form the basis of the qualitative and quantitative data collection and evaluation of the proposed application in the latter stages of this project. 2.1 The role XML plays in handling semi-structured data In the IT industry commonplace information such as reports and programming structures assume forms that are different from the rigidly structured data traditionally associated with relational databases; this data is referred to as semi- structured (Sperberg-Mcqueen, 2005). According to Buneman (1997) semi- structured or unstructured data is data that does not fit into traditional data models and the constraints imposed upon data held within these models. Sperberg- Mcqueen (2005) expands on this notion by describing semi-structured data as having repetitive characteristics and varying structures in a hierarchical representation (Figure 1). 
As a consequence, this continual replication results in large numbers of tables being created and stored in a database, and these tables are heavily utilised through numerous insert and delete activities (Burleson, 2009), which is characteristic of data on the Internet. Figure 1 displays an example of semi-structured data:
name: Peter Wood
email: ptw@dcs.bbk.ac.uk, p.wood@bbk.ac.uk

name:
    first name: Mark
    last name: Levene
email: mark@dcs.bbk.ac.uk

name: Alex Poulovassilis
affiliation: Birkbeck

Source: http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/notes.html

Figure 1: Semi-structured Data

It is possible to see that the three records have differing structures and follow no single pattern, which makes this data hard to fit into a regular relational database and subsequently to query. This will be discussed further in section 2.2.

The majority of data that is handled in current times is organised hierarchically: computer file systems, “library classification schemes” and directories such as the Yellow Pages are all examples of data that is structured hierarchically (Jagadish et al., 1999, p3). Figure 2 displays how semi-structured data is stored hierarchically in a number of directories and sub-directories, i.e. A, B and C are directories and the rest can be perceived to be sub-directories. B and C are also sub-directories of A.

[Tree diagram: A at the root, B and C beneath it, and E, F, G and H at the lowest level]

Figure 2: Hierarchical Data Structure

Sperberg-Mcqueen (2005) states that XML transfers data in a hierarchical structure and is well suited to representing semi-structured data that is in this format. Figure 3 is a basic example of an XML document:

<customer>
    <name>
        <firstname>Michelle</firstname>
        <lastname>Correll</lastname>
    </name>
    <address>
        <street>1824 East 7th Avenue</street>
        <street>Suite 700</street>
        <city>Memphis</city>
        <state>TN</state>
        <zip>32123-7788</zip>
    </address>
</customer>

Figure 3: XML Data (Kroenke, 2002, p452)

The XML data displayed in Figure 3 can be represented in a hierarchical structure as follows:

Customer
    Name
        Firstname
        Lastname
    Address
        City
        State
        Street
        Zip

Figure 4: XML Represented in a Hierarchical Structure

Kasukawa et al. (1999) refer to the fact that XML has the ability to handle semi-structured data effectively because of the varying and flexible structure that it allows. Sperberg-Mcqueen (2005, p34) confirms this assertion by stating that XML data is not confined to a flat “table of rows and columns”, which allows for more complex and intricate data structures.

According to W3C, XML is a technology that describes and transfers data and documents over the Internet. XML is a means to structure and store data and then transport it in plain text format; because XML has no pre-defined tags (W3S), programmers have the freedom to create their own entities and to define their own tags and child elements, which results in improved handling of semi-structured data. Figure 3, although very basic, is a good example of how flexible an XML document can be, purely through the control it gives developers to create their own hierarchical document structure. This is essential when relating this concept to semi-structured data.
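The hierarchy described for Figures 3 and 4 can be recovered directly from the document's tags. The following is a minimal sketch in Python (chosen here purely for illustration; the project itself targets PHP) using the standard library's ElementTree parser. The helper name outline is an assumption introduced for this example and is not part of the project.

```python
import xml.etree.ElementTree as ET

# The customer document from Figure 3 (Kroenke, 2002).
CUSTOMER_XML = """<customer>
  <name>
    <firstname>Michelle</firstname>
    <lastname>Correll</lastname>
  </name>
  <address>
    <street>1824 East 7th Avenue</street>
    <street>Suite 700</street>
    <city>Memphis</city>
    <state>TN</state>
    <zip>32123-7788</zip>
  </address>
</customer>"""


def outline(element, depth=0):
    """Return one indented line per element, mirroring the document hierarchy."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (f": {text}" if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CUSTOMER_XML)
tree_lines = outline(root)

# A repeated element (two <street> lines) needs no extra column or table.
streets = [s.text for s in root.findall("./address/street")]

print("\n".join(tree_lines))
print(streets)
```

The repeated street element is the point of interest: the hierarchy simply holds two children with the same tag, whereas a flat table would need either a second column or a second table to hold them.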
XML can assist in giving structure to the semi-structured data found in Figure 1 through the definition of an XML schema. According to Roy et al. (2001), the XML schema language defines the structure of an XML document and the constraints imposed upon it, together with the data types and the various elements and attributes within the document. Roy et al. (2001) further specify the control that the schema enables by stating that it allows for the definition of data types such as:
• String
• Boolean
• Date
• Time

In the interest of developers, however, user-defined data types can also be created, allowing custom constraints to be defined for the attributes and elements within the XML document and a suitable syntax and format to be enforced (Roy et al., 2001). Semi-structured data can therefore be structured using XML and an XML schema; the schema gives even more control to the developer through the definition of types and structure according to the XML document and the elements and attributes created initially. Roy et al. (2001) indicate that XML schemas allow for inheritance, which in turn enables the reuse of software and again assists developers: code does not have to be created from the beginning each time, which assists the overall XML software development process and the maintenance of the code that it is based on.

However, even if an XML document does not have a schema related to it, the document can still be accessed or manipulated because XML is self-describing (IBM). This means that a predefined data structure is not required, because the system can construct a structure around the XML tags; as a result, semi-structured data can be understood and structured within the XML document itself. Relational databases, by contrast, require a rigid “static schema definition” because, according to IBM (2005), each row within a table must have the same schema.
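The self-describing property discussed above can be illustrated with the three records from Figure 1. This is a minimal sketch only: the element names (people, person, first, last) are assumptions made for this example, and no schema is supplied, so the structure of each record is derived purely from its tags.

```python
import xml.etree.ElementTree as ET

# The three records of Figure 1, encoded with hypothetical element names.
PEOPLE_XML = """<people>
  <person>
    <name>Peter Wood</name>
    <email>ptw@dcs.bbk.ac.uk</email>
    <email>p.wood@bbk.ac.uk</email>
  </person>
  <person>
    <name><first>Mark</first><last>Levene</last></name>
    <email>mark@dcs.bbk.ac.uk</email>
  </person>
  <person>
    <name>Alex Poulovassilis</name>
    <affiliation>Birkbeck</affiliation>
  </person>
</people>"""


def paths(element, prefix=""):
    """List the tag path of every leaf element at or below `element`."""
    here = f"{prefix}/{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        return [here]
    found = []
    for child in children:
        found.extend(paths(child, here))
    return found


people = ET.fromstring(PEOPLE_XML)

# One list of leaf paths per record: each record carries its own structure.
structures = [paths(person) for person in people]

for structure in structures:
    print(structure)
```

Each record yields a different list of paths, yet all three are read by the same code without any structure being declared in advance. A relational table would instead need one fixed set of columns for all three rows.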
In the context of this project, the database system will be an XML PHP Web application. The World Wide Web is a perfect example of a semi-structured and hierarchical data source that is ever expanding and does not conform to the constraints of a specific data model; it is extremely difficult to enforce coherence of this medium with a structured data model such as the relational approach (Buneman, 1997). XML’s self-describing capabilities result in elegant handling of semi-structured data sources on the Internet, and this is an excellent place to use XML (IBM). Therefore, creating a Web application that handles semi-structured data in XML format using an XML schema is a suitable means to investigate and evaluate the concept.

Having ascertained how XML handles semi-structured data, it is important to compare the role XML plays in handling semi-structured data with the traditional relational approach. For the purposes of the project it is important to investigate XML from differing angles to gain a perspective of how XML compares to another method of semi-structured data management.

Data within a relational database is not stored hierarchically; it is stored as flat data (Figure 5). The following table is a representation of a relational table of flat data:

Book_id    Book_title               Author_name
2          Sense and Sensibility    Jane Austen
3          Pride and Prejudice      Jane Austen

Figure 5: Relational 'Flat' Structure
Source: Appelquist, p20

According to Mo et al. (2002), semi-structured data is largely stored in flat files, and if it is stored in a relational database, the data within these tables is hard to query or update, largely because of the inability to query columns with multiple values; if an attribute has multiple values within its column then it will be hard to search for a specific value within that attribute. For example, in Figure 5 the Author_name attribute holds multiple values, i.e. Jane and Austen; in order to find the surname of the author the query has to be more complex. If SQL is used to query the table, a ‘WHERE’ clause using the ‘LIKE’ condition has to be entered into the statement to return the specified value (assume the table name is book).

SELECT Author_name FROM book WHERE Author_name LIKE '%Austen';
Figure 6: Complex Query

Query and performance issues will be discussed further in 2.2. Mo et al. (2002) go on to state that semi-structured data held within relational databases, especially under “the edge table approach” (p253), is very often ever expanding, too large to handle and too costly to maintain, and that large amounts of the data can be redundant or empty. This reiterates Burleson’s findings in 2.1 (p8). Refer to Figure 7 for an example of sparse data within a relational database.

Book_id    Book_title               Author_name    Co-author      Editor
2          Sense and Sensibility    Jane Austen    -----          An Editor
3          Pride and Prejudice      Jane Austen    Enid Blyton    An Editor
4          A Book                   ----           ----           An Editor

Figure 7: Sparse Data

XML can reduce the problems of redundant data and can also eliminate sparse data because, according to Sperberg-Mcqueen (2005), XML’s syntax does not allow the storage of non-existent data.

In order for relational tables to handle semi-structured data, the hierarchically structured data has to be manipulated so that it fits into the flat structure of a relational table; this is achieved by mapping semi-structured data using XML to relational table attributes and columns individually or through “partial decomposition”, which will be discussed in 2.3 (Appelquist, D.K. 2002, p94). According to the findings of 2.1, XML has a hierarchical data structure, so the semi-structured data itself does not need to be transformed; the data remains as it was initially, and it is the XML document that needs to be manipulated so that it fits into the relational model. Chaudri et al. (2003) state that mapping an XML document so that it conforms to the relational model causes compatibility and performance issues due to “impedance mismatch”, a state that occurs between two different technological paradigms, i.e. an XML document and a relational table. This will be investigated further in 2.3. Figure 8 displays a basic example of how semi-structured data from a hierarchical data structure is mapped to a relational table using XML.

<D>
    <A>Data</A>
    <B>Data</B>
    <C>Data</C>
</D>

Relation: D

A       B       C
Data    Data    Data

Figure 8: Semi-structured to Relational

Having investigated the role XML plays in handling semi-structured data and how its handling differs from the relational approach, different methods of storing XML documents will now be investigated. The intended objective of the artefact to be delivered is to give users the ability to insert and upload semi-structured data to a DJ forum web site, essentially the ability to manipulate and query data in XML format. For the purposes of this project, storage and retrieval of XML documents is essential. Available XML storage methods and related work therefore need to be investigated and reviewed. To gain a comparative perspective it is essential to analyse XML document storage techniques from two differing angles: XML-enabled databases and NXDs.

2.3 NXD versus XML-enabled relational databases

According to Chaudri et al. (2003), XML-enabled databases are employed to store XML documents in a relational table, whereas NXDs are designed to hold data that conforms purely to the XML hierarchical structure and do not merely deconstruct the XML document and store it in a relational database. According to Serna et al. (2005), when an XML-enabled database is utilised, transformation of the XML document needs to take place both when storing the document in and when returning it from a relational table within a relational database.
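The two-way transformation described above, and the mapping shown in Figure 8, can be sketched as follows. This is an illustration under stated assumptions rather than the approach of any particular XML-enabled product: SQLite stands in for the relational database, and the function names shred and rebuild are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The document from Figure 8, with distinct values so the mapping is visible.
DOC = "<D><A>Data1</A><B>Data2</B><C>Data3</C></D>"
COLUMNS = ("A", "B", "C")


def shred(conn, xml_text):
    """Storing: map each child element of <D> to the column of the same name."""
    root = ET.fromstring(xml_text)
    values = [root.findtext(tag) for tag in COLUMNS]
    conn.execute("INSERT INTO D (A, B, C) VALUES (?, ?, ?)", values)


def rebuild(row):
    """Returning: reassemble an XML document from one relational row."""
    root = ET.Element("D")
    for tag, value in zip(COLUMNS, row):
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE D (A TEXT, B TEXT, C TEXT)")

shred(conn, DOC)
row = conn.execute("SELECT A, B, C FROM D").fetchone()
restored = rebuild(row)

print(row)
print(restored)
```

Note that the element names and the column names have to agree for this mapping to work, and that both directions require code of their own. A native XML database avoids both steps by accepting and returning the document as it is.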
NXDs are a direct means of storing an XML document in a database, and neither conversion to another format nor mapping of the document between two separate data models is needed (Serna et al., 2005). Having ascertained the basic principles of NXDs and XML-enabled databases, the advantages and disadvantages of both have to be considered.

According to Ford (2003), a key advantage of utilising an NXD instead of an XML-enabled relational database is that the XML document does not need to be manipulated to conform to a relational model and that the data held in the document is maintained exactly. In fact, Kroenke (2005, p89) controversially believes that “the relational database model is doomed” owing to the ability of advancing XML document storage technologies, i.e. NXDs, to store XML documents entirely as they are, without the need to deconstruct them. An experiment performed in 2005 by Ainhoa Serna and Jon Keppa Gerrikagoitia is used later in this review to test Kroenke’s controversial claims, and its results are presented there.

However, according to Hong et al. (2007), XML documents are a useful means to store, extract and query data that is based upon a single repeatable entity, but they suffer as regards efficiency when multiple XML documents are queried. Hong et al. (2007) raise the concern that when XQuery and XPath, the W3C recommendations for querying XML documents, are utilised to query multiple XML documents, these methods are not as efficient as merging the documents into a relational database and querying them using SQL, because the XML files have to be read individually prior to processing. Pardede et al. (2006, p205) add to the limitations of NXDs by implying that, even though they are continuously improving, they are technologies in their “infancy”, and that established “mature” relational database management systems (RDBMS) are more familiar to programmers and consequently more competitive. Pardede et al.
(2006) emphasise the fact that RDBMS are tried and tested technologies and this outweighs the limitations restructuring an XML document imposes. Connolly et al. (2005) further iterate the fact that despite the flexibility, intricacies and complexity of handling data in XML documents, RDBMS are still the standard for storing XML documents due to its reliability, available tools, performance and scalability – an essential asset in handling semi-structured data (XML). However the conversion process involved in storing an XML document in an XML-enabled database requires a great deal of processing in order for it to 16
  • 17. BSc Computing and Information Systems Project Thomas Rogers conform to the relational structure of rows and columns. A particular conversion method is document shredding. Connolly and Begg (2005) state that the XML document has to be decomposed or shredded into elements and placed into columns within one or multiple relational tables within the database. This enables one benefit which is that of indexing an element value so that it can be accessed when querying the table using an SQL statement e.g. <xml title=”i”> (i = index value) <content>Hello world</content> </xml> Figure 9: XML Index This index value itself will also have to possess its own attribute or column within the table, which in turn increases the size of the table and the need for table scalability, an aspect that could be perceived to be a disadvantage (Connolly et al., 2005). When querying a large table with many attributes retrieving specific data may prove to be difficult; ciphering through large amounts of potentially redundant data for index values may decrease query performance and the queries themselves may need to be more complex in the acquiring of specific data (refer to 2.2). But do the processes involved in shredding and decomposing an XML document and inserting it into an XML-enabled database such as Oracle and then attempting to query, update, retrieve and restructure the XML document in its original structure decrease performance levels? Hong et al. (2005) have proposed such a method of XML document decomposition and reproduction using index values to store and retrieve an XML document of semi-structured data from a relational database. In their experiment Hong et al. (2005) shredded each XML tag (refer to Figure 10) in the document into the relational table separately and used a Microsoft computer program to co-ordinate all the data within each tag in the XML document so that they matched exactly when produced in the relational table. 
<node sessionID="0000000001" nodeID="000000004" userID="000000001" username="117:YEONG-TAE SONG" pntSeq="5" time_stamp="114728264" opnDetail="It looks good" />
  • 18. BSc Computing and Information Systems Project Thomas Rogers Figure 10: Hong et al. XML Tag They believe that this proved to be difficult due to the mapping of data between two different data models, which can be known as impedance mismatch. In using the displayed indexes they were able to specify which data they required from the document when querying the relational table using MS SQL. Difficulty was encountered further however when querying the database in that the XML document returned was fragmented and further alterations to the query were required in order to retrieve the XML document exactly as it was created. However Hong et al. (2005) have only proposed a means to convert or shred XML documents into a relational model and no query performance details have been offered. Hong et al. (2005) have iterated the fact that the process is difficult due to differing data models but performance is the key issue. Alternatively, shredding the XML document does not always need to occur when storing XML in an XML-enabled database: Oracle have a method of storing the XML document as a whole within one single row of a relational table. This method is known as Oracle XMLType:CLOB. The XMLType:CLOB stores an XML document exactly according to its initial hierarchy and structure and it uses XPath to update the document without rewriting any elements of it (W3S); this will be discussed further in 2.3.3. The key issue is performance: retrieving and updating the XML document, sending it too and retrieving it from the specified database. Serna et al. (2005) state that, when using an NXD, due to data being stored exactly as it is created initially with no conversion of data required so that what is received is an XML document and what is sent back is an XML document, performance levels are not lost due to conversion to a different structure i.e. a relational table. Serna et al. (2005) performed experiments for both an XML-enabled database and NXD. 
They compared the open-source NXD eXist with the XML-enabled relational database Oracle from two perspectives: XSU-SQL, which maps the XML document to the database, and XMLType:CLOB, in which the whole document is stored and no mapping of the document structure to relational table attributes is needed.
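The whole-document alternative can be sketched in the same spirit. The example below is an analogy only and does not use Oracle or XMLType: a TEXT column in SQLite stands in for the CLOB, ElementTree's limited path support stands in for XPath, and the table name xml_docs, the helper select_by_path and the sample documents are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample documents, each stored whole in a single row.
documents = [
    "<customer><name>Michelle Correll</name><city>Memphis</city></customer>",
    "<customer><name>A Customer</name><city>London</city></customer>",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xml_docs (id INTEGER PRIMARY KEY, doc TEXT)")
conn.executemany("INSERT INTO xml_docs (doc) VALUES (?)", [(d,) for d in documents])


def select_by_path(conn, path, expected):
    """Return ids of stored documents whose element at `path` has text `expected`."""
    matches = []
    for doc_id, text in conn.execute("SELECT id, doc FROM xml_docs ORDER BY id"):
        root = ET.fromstring(text)  # every stored document is parsed per query
        if root.findtext(path) == expected:
            matches.append(doc_id)
    return matches


hits = select_by_path(conn, "./city", "Memphis")
stored = conn.execute("SELECT doc FROM xml_docs WHERE id = 1").fetchone()[0]

print(hits)
print(stored == documents[0])
```

In this sketch nothing has to be mapped on the way in or out, and the document comes back byte for byte as it was stored. The cost appears at query time, where each document is parsed before it can be tested. Whether a real product behaves this way depends on its indexing and storage options, which this sketch does not model.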
  • 19. BSc Computing and Information Systems Project Thomas Rogers The following experiments were conducted by Ainhoa Serna and Jon Keppa Gerrikagoitia in 2005. Three different tests were performed. They were to: http://xml.sys-con.com/node/104980 • XML:DB XPath which involved recurring searches, searches for attribute nodes, searches with two conditions and searches using pre-defined XPath functions • XML:DB test which entailed add, delete and update document queries as well as creating and dropping collections • XML:DB robustness which tested maximum scalability of each storage method When evaluating each database performance 5000 XML documents – each document being 592 bytes, 4 sub-collections each with 5000 files, and a further 2 sub-collections comprising 4000 files and 6000 XML documents, were queried where the same queries were executed with continuously increasing amounts of data. 2.3.1 Investigating Oracle XMLType:CLOB As stated earlier, the XMLType:CLOB facility stored the XML documents as a whole within 1 row of a relational table, and was used in conjunction with the XPath query operation. XMLType possesses functions that enable the query to extract particular nodes and fragments of an XML document, and Serna et al., (2006) have utilised the XMLType functions such as extract() and getStringVal to query data. A table named ‘clientes’ of XMLTYPE was created and the documents were inserted into it. Numerous queries including updates, queries using XPath, queries of several tables and deletions from the table were performed. 2.3.2 Investigating Oracle XSU (XML SQL) This involved mapping the XML document nodes to a relational table and two methods of storing and retrieving the XML documents were implemented by Serna et al. (2006). These were relational table storage as already mentioned and object-relational table storage methods and Serna et al. (2006) implemented both solutions as part of the investigation to compare and contrast the two. When 19
  • 20. BSc Computing and Information Systems Project Thomas Rogers concerning OR storage the required information was stored in data types and for the relational storage method the data was stored in a relational table. Again, as with the XMLType feature a number of SQL insert, update and delete queries were performed using Java, its JDBC extension and Oracle’s XML parser. However, due to the mapping of XML nodes to a relational table the names of the XML elements and table columns have to be identical. 2.3.3 Investigating eXist The storing of the XML documents did not need any mapping or configuration to comply with an XML–enabled database when using eXist, so the NXD was merely queried using XPath. 2.3.4 Investigation Results The results returned the fact that eXist was the most efficient as regards recurring searches and has the best response times in all of the tests performed, and when utilising the XPath query standard functions such as “contains” returned quicker response times than those queries that required the “=” operator; the “=” operator is utilised by both the XSU and XMLType so their query times were lower than eXist in this case also. When concerning the insertion of 30,000 XML documents in to each of the databases it was apparent that the OR tables and relational tables into the Oracle database was better than that of eXist. XMLType:CLOB and eXist insertion results are similar but both are less than the OR and relational table results. Also the random access times of the XMLType returned the worst results for all of the solutions. In general the results returned the fact that update and select queries were much faster when eXist and structured storage formats were compared to the CLOB file method. The investigation within the research element of the project will assist in ascertaining which XML database to utilise in the completion of the of the DJ database system. 20
  • 21. BSc Computing and Information Systems Project Thomas Rogers Developmental Element 3. Methodology Having investigated XML, how it handles semi-structured data, alternative means of handling semi-structured data i.e. RDBMS, and potential means to store and manipulate XML documents from two perspectives, the initial research element was completed. The developmental element can now be initiated and a software development lifecycle was adhered to in order for the project to be successful. In using a lifecycle model the ability to sign off each stage of system development enables the gauging of where the project is at and limits the time taken for each stage to be completed i.e. the creation of timeboxes for each development process (Ramsin et al. p54) (refer to appendix for DSDM project plan and Timeboxes established). Essentially, due to the project timescale the project will be a rapid application development (RAD) methodology. Two RAD techniques have been used in tandem in the implementation of the proposed system: Dynamic Systems Development Method (DSDM) and eXtreme Programming (XP). DSDM was the development methodology the project assumed and DSDM is believed to be the standard for RAD methodologies (Ramsin et al., 2008, p52). According to Ramsin et al. (2008, p53) DSDM is an “iterative-incremental” model in which business processes and user requirements are developed and applied to two iteration development phases where prototypes are developed and sequentially implemented. The second method XP is similarly an iterative agile design process with the aim of creating and adapting software rapidly according to ever changing user requirements (Hedin, G et al., 2003); both processes press the importance of testing before implementing. In applying these two development concepts together the aim was to develop and implement a high specification system in a short period of time. 
DSDM provides the framework upon which XP can develop; XP provides the format for system testing and programming, and DSDM defines the product lifecycle and the required deliverables, i.e. what was required of the system (Camara, S. 2009). This harmony between the methodologies justified employing the two in
tandem in this development project. The two agile development methodologies were also utilised because of familiarity with them and past experience of using the two concepts together prior to this project. Refer to the appendix for the original DSDM project plan and an activity diagram detailing the overall business process design and the activities undertaken in the completion of the project. Having decided on a relevant development methodology to employ, requirements elicitation was required, i.e. establishing what the user requires of the proposed system.

3.1 Initial User Requirements

The following requirements list displays the initial user and system requirements.

1. The user shall be able to register their personal details and store them in an XML database
1.2 Details to be registered:
• Firstname
• Lastname
• Username
• Password
• Alias
• Email Address

2.1 The user shall be able to login to the web site
2.2 Details to be entered will be:
• Username
• Password
• DJ alias

3.1 The user shall be able to upload feedback to the web site using a stored XML document
3.2 Feedback to be stored:
• Name
• Regarding
• Month
• Comment
3.3 Name: the name of the user using the DJ forum
3.4 Regarding: what the comment the user is uploading relates to, i.e. an event or MP3
3.5 Month: when the feedback was uploaded
3.6 Comment: the comment the user wishes to add

4.1 The user shall be able to view the feedback they have uploaded via a feedback link in the web site (XML document returned and styled using XSL)
4.2 The user shall be able to view all feedback uploaded to the web site

5. The user shall be able to upload tips and experience to the web site, storing the XML data in an XML database and returning the XML document styled using XSL

6. The user shall be able to listen to MP3 samples accessed via a web link
NB. This data will not be in XML format; it will be stored on the server because the file will be too large for an XML document to handle.

3.1.1 Possible requirements

7. The user shall be able to upload DJ tutorials to the web site, again handling the data using XML and storing the data in an XML database
8. The proposed system will be a web log or blog web site enabling real-time commentary

After establishing the basic requirements, use case diagrams were created to gain a visual perspective of what was required.

3.2 Use Case Diagrams

In order to gain a better perspective of what was required of the system, UML use case diagrams were used. The use case diagrams were produced according to the initial requirements that were established, with one use case diagram for each of the main requirements, i.e. register details and login (1, 2), feedback entry and return (3, 4), and experience/tips entry and return (5). The following UML use case diagrams were created using StarUML (refer to page 24).
Figure 11: Register Details and Login

Figure 12: Feedback
Figure 13: Experience/Tips

Having gained a visual perspective of what is required of the system through use case depiction, user stories were developed.

3.3 User Stories

User stories were established in tandem with a potential user of the system (a fellow DJ) to ascertain the processes completed by the user in order to satisfy the specified requirements.

3.3.1 Register Details

After entering the Web site URL the user will be able to select the ‘Register’ link from the home page (index.php). Register.php will be opened and the user will be able to enter their first name, last name, username, password, DJ alias and email address. After clicking ‘Submit’ the user data will be transferred to an XML database using an XML document and the user will be returned to Revolt’s home page.

1. User will click the ‘Register’ link on the home page
2. User will enter all required data into the text fields
3. The user will press ‘Submit’
4. The user will be returned to the home page

Figure 14: Register User Story

3.3.2 User Login

After being returned to the home page the user will be able to enter their username, password and alias. After clicking ‘Login’ the user will enter the Web site and welcome.php will open.

1. User enters username
2. User enters password
3. User enters DJ alias
4. User clicks ‘Login’
5. User enters Web site

Figure 15: Login User Story

3.3.3 Upload Feedback

After entering the Web site, the user will be able to select ‘Forum’ from the link tab displayed. By selecting the ‘Forum’ link from welcome.php the user will be able to enter their name, what the comment relates to, the month in which the feedback is being uploaded and a comment. The user will be able to click ‘Submit’ to upload their comments.

1. User clicks ‘Forum’ and enters welcome.php
2. User enters their name
3. User enters what the feedback is regarding
4. User enters the month
5. User enters their comment
6. User clicks ‘Submit’ to upload comments

Figure 16: Upload Feedback User Story

3.3.4 View Feedback

After clicking ‘Submit’ the user will open confirm.php and will be able to view the comments they have just uploaded. After selecting the radio button on the same page the user will be able to click ‘Submit’ and will be taken to confirmation.php, where all uploaded comments will be displayed.

1. User views uploaded comments from previous page
2. User selects the radio button
3. User clicks ‘Submit’ to move to confirmation.php

Figure 17: View Feedback User Story
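The upload and view feedback stories above amount to appending an entry to an XML document and reading every entry back. The following is a minimal sketch of that data flow, written in Python for illustration rather than in the project's PHP. The four fields come from requirement 3.2, while the element names (feedback, entry and the lower-case field tags), the function names and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET

FIELDS = ("name", "regarding", "month", "comment")


def add_feedback(root, name, regarding, month, comment):
    """Append one <entry> holding the four feedback fields of requirement 3.2."""
    entry = ET.SubElement(root, "entry")
    for tag, value in zip(FIELDS, (name, regarding, month, comment)):
        ET.SubElement(entry, tag).text = value
    return entry


def all_comments(root):
    """Return (name, comment) pairs for every stored entry, oldest first."""
    return [(e.findtext("name"), e.findtext("comment")) for e in root.findall("entry")]


feedback = ET.Element("feedback")
add_feedback(feedback, "DJ Example", "Event", "May", "Great set")
add_feedback(feedback, "Another DJ", "MP3", "May", "Nice mix <3")

# Serialising escapes markup characters typed by the user, so free-text
# comments cannot break the structure of the stored document.
serialised = ET.tostring(feedback, encoding="unicode")
reloaded = all_comments(ET.fromstring(serialised))

print(serialised)
print(reloaded)
```

Building the document through an XML library rather than by string concatenation is a deliberate choice in this sketch: the second comment contains a "<" character, which the serialiser escapes and the parser restores.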
3.3.5 Upload Tips and Experience

After logging in to the Web site the user will be able to select ‘Experience/tips’ and fill in the required fields on the page. After clicking ‘Submit’ the user will be able to see the entries they have added to the Revolt DJ forum.

1. User clicks ‘Experience/tips’
2. User enters their alias
3. User completes the ‘Give it a name’ field
4. User enters the experience
5. User enters their tips
6. User clicks ‘Submit’ to upload comments

Figure 18: Experiences/Tips User Story

3.3.6 Listen to MP3’s

After entering the relevant URL the user will be able to select ‘MP3’ from the link tab and will be taken to mp3.html, where they will be able to select an MP3 link from the page.

1. User clicks ‘MP3’ link
2. User selects desired MP3 link from mp3.html

Figure 19: MP3 User Story

When developing a piece of software using RAD techniques such as DSDM it is important to establish which requirements are essential in order to complete the project successfully and within the stipulated time period. It was not feasible to implement all of the user stories established in 3.3 and it was therefore necessary to refine the requirements. In the following section MoSCoW rules are used to establish the importance of each user requirement.

3.4 MoSCoW Rules

Ramsin et al. state that the rules as regards requirement priority are specified as “must haves”, “should haves”, “could haves” and “won’t haves”: essential, important, not essential and not present in the project respectively (2008, p55). The user stories established in 3.3 were analysed using MoSCoW rules as follows:

• The system MUST be able to upload user feedback to an XML database
• The system MUST enable users to view individual feedback they have uploaded and previous user feedback, data that is in XML format and styled using XSL
• The system SHOULD offer users the ability to upload experience and tips to an XML database
• The system COULD be able to provide user registration
• The system COULD enable users to login
• The system WOULD enable users to listen to MP3’s uploaded

From the above analysis it was possible to see that the system MUST possess three of the original 6 user stories in order for the project to be a success; it was considered feasible that these user stories could be completed within the time scale of the project. The revised user stories completed were therefore 3.3.3, 3.3.4 and 3.3.5 (refer to 3.1). The SHOULD user stories were perceived to be requirements that could be satisfied within the specified timescale, and the COULD and WOULD user stories were perceived to be requirements that could not be satisfied within the project timescale; these are areas of future work. The three MUST user stories were developed in the functional iteration model stage of the DSDM product lifecycle (refer to appendix for DSDM project plan).

Having identified the user stories that were essential to the project and incorporated them in the functional iteration stage, the design and implementation process began.

4. Artefact Design and Construction

4.1 Prototypes

To gain an idea of what was to be designed it was essential to identify the processes that the user went through in satisfying the requirements of the system.
User requirements and user stories (3.1, 3.2) state what is required of the system. To gain an interactive perspective of what was required, a number of design processes were completed, including:

• Low fidelity prototype: sketches or throw away depictions of the system interface
• Medium fidelity prototype: interactive depictions of the system interface and how it operates
• Vertical prototype: a functional representation of a small part of the system

4.1.1 Low Fidelity Prototype

After creating the user stories, mock-ups (sketches) of the system interface were produced in order to create a general perspective of the interface. The low fidelity prototype was merely a means to gain an initial idea of what the system interface would look like. Figure 20 shows a mock-up design of the home page (index.php):

Figure 20: Low Fidelity Prototype

According to the DSDM model the usability of the system is tested at this stage of the development process; the mock-ups were created in tandem with the user, ascertaining exactly what was required of the system interface. This prototype was developed further in the next phase of prototyping.

4.1.2 Medium Fidelity Prototype

The low fidelity prototype gave an idea of how the interface would look; a medium fidelity prototype was used to show how each page was going to be accessed and how to move between them, giving a general depiction of what was required of the system. Figure 21 is a representation of the medium fidelity prototype; the code can be found in the ‘medium_fidelity’ folder on the CDROM in the appendix.
Figure 21: Medium Fidelity

4.1.3 Vertical Prototype

A vertical prototype is a representation of a small piece of system functionality. The vertical prototype is essentially a medium fidelity prototype that has a specific piece of functionality; in relation to this assignment the vertical prototype satisfies requirements 3 and 4 using a back end database and PHP to connect to, access and manipulate the required data. Prior to the implementation of the vertical prototype it was necessary to produce a back end database on a server. A MySQL database containing 1 table (medium) was created on the TVU Apache server. Requirement 3 involves the user's ability to upload data to the forum and requirement 4 enables users to see the data they have uploaded to the forum on request. The following diagrams show the processes the user goes through in satisfying both requirements 3 and 4. The code can be found in the appendix in the ‘vertical_prototype’ folder on the CDROM.
Step 1: Insert data

Figure 22: Details

Step 2: View and submit data

Figure 23: Entry

Step 3: Retrieve Entry
Figure 24: Prototype Return

The title of the project is to investigate the use of XML in a DJ database system; the prototypes, however, investigated only the development of the system interface, the uploading of data to a database and the accessing and manipulating of data using PHP. To gain an idea of how XML worked and how to format the XML document, a template XML document and XSL stylesheet were produced. The XML document and XSL stylesheet can be found in the CDROM ‘xml_template’ folder in the appendix. Figure 25 displays the XML document styled by the XSL stylesheet.

4.2 XML Experiment

Figure 25: XML Experiment
4.5 Technology Used

Having investigated XML and how it is formatted using XSL, the next step was to establish how to store and manipulate the XML document. PHP DOM was used to manipulate the XML document and it was decided that an XML-enabled database would be used to store the specified XML documents in BLOB form, i.e. the whole XML document is stored as one row in a relational table in a MySQL database. Due to problems with the Apache server at TVU (refer to 6.4) a server was set up remotely; WampServer 2.0 was utilised to store all PHP and XML documents in an XML-enabled MySQL database. After investigating the technology that was to be used, a proposed database schema was developed.

4.6 Database Schema

A UML class diagram was created to represent how the tables within the database were formed. It was created in relation to the user requirements formulated in section 3. Figure 26 is the database schema created:

Figure 26: Database Schema

A code walkthrough of how the main parts of the system function now follows.

4.7 Code Walkthrough

The main functions of the system are to create an XML document using PHP DOM, load the XML document into an XML-enabled database and retrieve the XML document from the XML-enabled database.
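The first of these functions, adding form values to an XML document with PHP DOM, can be previewed in condensed form. The following is a minimal sketch rather than the artefact's code: the helper name append_feedback, the root element name review and the sample values are assumptions made for illustration, and the document is built in memory instead of being loaded from revolt_final.xml.

```php
<?php
// Sketch only: append_feedback() is a hypothetical helper, not code from the artefact.
// It appends one child element per form field to the root element of the document.
function append_feedback(DOMDocument $doc, array $fields): void
{
    $root = $doc->documentElement; // root element of the loaded document
    foreach ($fields as $name => $value) {
        $element = $doc->createElement($name);
        // Text nodes are escaped on output, so characters such as & are safe
        $element->appendChild($doc->createTextNode($value));
        $root->appendChild($element);
    }
}

$doc = new DOMDocument();
$doc->loadXML('<review/>'); // stand-in for revolt_final.xml; the root name is assumed

append_feedback($doc, [
    'reviewer'  => 'DJ Example',
    'regarding' => 'Club night',
    'month'     => 'May',
    'feedback'  => 'Great set & good crowd',
]);

echo $doc->saveXML();
```

Looping over the field names replaces the repeated create, append and text-node calls that a field-by-field version needs; in a real page the array would be filled from $_POST after validation.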
4.7.1 XML PHP DOM

Refer to the folder ‘revolt_final’ on the CDROM in the appendix for all of the following code excerpts. The user inserts the relevant data into the fields found on revolt_index.php (‘reviewer’, ‘regarding’, ‘month’ and ‘feedback’) and the form calls revolt_insert.php on submission:

    <form name="input" method="post" action="revolt_insert.php">
      <div class="subheading"><p>Reviewer</p></div>
      <p><input type="text" name="reviewer"/></p>
      <div class="subheading"><p>Regarding</p></div>
      <p><input type="text" name="regarding"/></p>
      <div class="subheading"><p>Date</p></div>
      <p><input type="text" name="month"/></p>
      <div class="subheading"><p>Comment</p></div>
      <p><textarea name="feedback" rows="5" cols="25"></textarea></p>
      <p><input type="submit" value="Submit"/></p>
    </form>

Figure 27: XML Inputs

On submission of the form the page revolt_insert.php is called. On this page the PHP document object model (DOM) is created using the following code:

    <?php
    // DOM document is created...
    $xmldoc = new DOMDocument();
    // XML document is loaded. The original excerpt also called
    // $xmldoc->load("revolt_final.xsl") on the same object; a second load()
    // replaces the document already loaded, so that call is left out here.
    $xmldoc->load('revolt_final.xml');
    // 'reviewer' is read from the form posted by revolt_index.php (Figure 27)
    $newAct = $_POST['reviewer'];
    // $root holds the first child node of the revolt_final.xml document...
    $root = $xmldoc->firstChild;
    // An element node named 'reviewer' is created in $xmldoc...
    $newElement = $xmldoc->createElement('reviewer');
    // $newElement is appended as the last child of $root...
    $root->appendChild($newElement);
    // A new text node is created containing the value of 'reviewer' posted in $newAct...
    $newText = $xmldoc->createTextNode($newAct);
    // The text node is then added to the new element...
    $newElement->appendChild($newText);
    // Repeat this process for the other three input values 'regarding',
    // 'month' and 'feedback' found in Figure 27.
    // The XML document is then saved...
    $xmldoc->save('revolt_final.xml');
    // And the user is sent to the page revolt_return.php
    header('Location: revolt_return.php');
    ?>

The DOM document is created and the user is sent to the page revolt_return.php. At this page the user will be able to see the DOM document reloaded; revolt_final.xml is reloaded as follows:

    <?php
    // DOM document created in variable $xmldoc and revolt_final.xml is loaded into it...
    $xmldoc = new DOMDocument();
    $xmldoc->load("revolt_final.xml", LIBXML_NOBLANKS);
    // $name holds the first child of the document's first node...
    $name = $xmldoc->firstChild->firstChild;
    // If $name is not empty...
    if ($name != null) {
        while ($name != null) {
            // ...the text of the current element is returned using the
            // textContent DOM property, followed by the subheading markup...
            echo $name->textContent . "<div class=\"subheading\"><p>Entry</p></div>";
            // ...and the next element is selected using the nextSibling property
            $name = $name->nextSibling;
        }
    }
    ?>

4.7.2 XML Document Storage

After displaying the XML document using the PHP DOM method, the document is inserted into the XML-enabled database using a submit button. On clicking the submit button found on revolt_return.php, update.php is called and the XML document is uploaded using the following code:

    <?php
    // Connection to the database uses the regular PHP MySQL connection functions.
    // Connect to the server...
    $con = mysql_connect("localhost", "thomr", "thomr1981");
    if (!$con) die("Cannot connect to server!");
    // Connect to the database revolt...
    $db = mysql_select_db("revolt");
    if (!$db) die("Cannot connect to database!" . mysql_error());
    // Create a variable containing an INSERT SQL statement that inserts the data
    // into the revolt_feedback table in the revolt database. The revolt_final.xml
    // document is read using the MySQL LOAD_FILE() function...
    $sql = "INSERT INTO revolt_feedback VALUES (NULL,
            LOAD_FILE('c:/wamp/www/revolt/final/revolt_final.xml'))";
    // Execute the query, $sql...
    $result = mysql_query($sql);
    if (!$result) die("Cannot execute query!" . mysql_error());
    // Send the user to display.php where the XML document will be displayed...
    header('Location: display.php');
    ?>

At display.php the XML document is retrieved by using a SELECT SQL statement. Connection is achieved via the same means as above and the statement is as follows:

    $sql = "SELECT * FROM revolt_feedback";

Because each XML document is stored as a single row in revolt_feedback, the document is returned as a row element. The variable $row is fetched from $result, which holds the result of executing the SQL statement above...

    $row = mysql_fetch_row($result);
    while ($row != null) {
        echo "<div class=\"subheading\"><p>Feedback Id # $row[0] </p></div> ";
        // The XML document contained within $row is then echoed by PHP and displayed...
        echo "<tr> <td> $row[1] </td> </tr>\n";
        $row = mysql_fetch_row($result);
    }

This method has been used in exactly the same manner when satisfying the uploading of experience and tips to the Revolt DJ Forum; the inputs that have been posted using PHP however are different. The final artefact can be found on a remote server and the source code can be found in the ‘final’ folder on the CDROM in the appendix.

In adherence to the DSDM framework upon which the project was based, having created the artefact it was necessary to test what had been produced as part of the iterative nature of the framework.

4.8 Artefact Testing

In a DSDM development project, testing is completed at each iteration of the project, i.e. if something is changed then the change is tested in tandem with the specified user according to their needs. The following testing methods were produced after the final iteration of the project when the artefact was deemed complete. Three forms of testing were implemented in testing the usability of the system produced. The tests were created in order to assess whether or not the artefact satisfied the user requirements and stories. Firstly a test case was produced in which the user was asked to enter a number of inputs in to the system and describe the output that the system returned. Secondly a user acceptance test was produced, again involving a number of user inputs and recording of the output returned; the first two tests were used as part of a verification process.

4.8.1 Test Case

Figure 28 is an example of a user input to the system and the required response format.

Input 6: Click ‘Forum’ on the Welcome page. Fill in the fields provided and click ‘Submit’. Are the details you just entered present on the page? Tick Yes or No
Result: Yes / No

Figure 28: Test Case
4.8.2 User Acceptance Test

Figure 29 is an example of 1 specific user acceptance test. A number of acceptance tests were performed testing each particular user requirement.

1. Register Details
Purpose: Register personal details
Prerequisite: User must have visited the Revolt DJ Forum
Test Data:
   Firstname {Enter fname into login table}
   Lastname {Enter lname into login table}
   Username {Enter username into login table}
   Password {Enter password into login table}
   Alias {Enter alias into login table}
   Email {Enter email into login table}
Steps:
   1. User visits Revolt DJ Forum
   2. User clicks ‘Register’
   3. User enters required data into text boxes
   4. User clicks ‘Submit’
   5. User sees login page
Notes and Questions:

Figure 29: User Acceptance Test

Having investigated semi-structured data and XML and XML storage methods, elicited what was required of the system, developed user stories, created an artefact and tested what was produced, a discussion of what was found out is required.

5. Project Assessment

The purpose of the project was to investigate the use of XML in a DJ database system. In investigating XML a number of requirements were elicited involving storing and manipulating XML documents in an XML-enabled database. Using the results gathered from the tests performed in 4.8, it is possible to assess whether or not the requirements of the XML database system were satisfied, essentially establishing whether or not the artefact and in turn the project was a success. Each requirement will now be analysed to determine whether or not it was satisfied by the artefact.

5.1 Requirements Analysis

1.1 The user shall be able to register their personal details and store them in an XML database
1.2 Details to be registered:
• Firstname
• Lastname
• Username
• Password
• Alias
• Email Address

Result: Requirement satisfied

Rationale: The user can register details and create an account. The results of input 1 of the test case show that a user cannot login to the system unless they have registered their details on the database. On clicking ‘Login’ without registering details the following error message is returned:

You have entered Incorrect details, please try again. please try to login again.Click here to retry

2.1 The user shall be able to login to the web site
2.2 Details to be entered will be:
• Username
• Password
• DJ alias

Result: Requirement satisfied

Rationale: The user can login to the system. The result of input 5 of the test case displays the user ‘Alias’ on the welcome page, thereby proving that the user session has been created and the user has logged in.

DJ ‘Alias’ displayed

Figure 30: Alias Session
3.1 The user shall be able to upload feedback to the web site using a stored XML document
3.2 Feedback to be stored:
• Name
• Regarding
• Month
• Comment
3.3 Name: the name of the user using the DJ forum
3.4 Regarding: what the comment the user is uploading relates to, e.g. an event or MP3
3.5 Month: when the feedback was uploaded
3.6 Comment: comment user wishes to add

Result: Requirement satisfied

Rationale: The results of user acceptance tests 3 and 4 verify that the requirement is satisfied. Input 4 of the test case asks for the feedback Id # range. Figure 31 displays the original Id # range.

Figure 31: Entry

The range returned by all tests is 1 to 2. On completion of Input 6 as follows...
Figure 32: Entry

On clicking ‘Submit’ the entries are displayed...

Figure 33: Entry

On clicking ‘Update’...
Feedback Id # 3

Figure 34: Feedback Id

The entries are added to the XML-enabled database. The result of Input 9 displays that the user entry has been added and the feedback Id # is 3.

4.1 The user shall be able to view the feedback they have uploaded via a feedback link in the web site (XML document returned and styled using XSL)
4.2 The user shall be able to view all feedback uploaded to the web site

Result: Requirement satisfied

Rationale: Refer to requirement analysis 3 for proof of satisfaction

5. The user shall be able to upload tips and experience to the web site, storing the XML data in an XML database

Result: Requirement satisfied

Rationale: The results of user acceptance test 5 return the following:

After steps 1 to 3 user inserts data (Step 4)...
Figure 35: Entry

User clicks ‘Submit’ and sees data entered (Steps 5 and 6)...

Figure 36: Data Displayed

User completes steps 7 to 9 and the following is displayed...
Entry inserted into XML database

Figure 37: Data Inserted

6. The user shall be able to listen to MP3 samples accessed via a web link

Result: Requirement partially satisfied

Rationale: The user acceptance results for test 6 reveal that 2 of the MP3 samples can be listened to; 2 are not MP3’s.

5.2 Failed Requirement

When assessing the requirements in relation to the MoSCoW rules formulated in 3.4, it was possible to see that all of the requirements regardless of priority had been implemented successfully. However, 1 issue was raised as regards how the XML document was returned. For both requirements 4 and 5 it was required that the XML document would be returned and styled using an XSL stylesheet. This mini requirement was not satisfied. This was due to the whole XML document being stored in 1 row of a relational table and returned as 1 row.

(Figure annotation: XML document is not styled; the whole document is returned as one row of data)

Attempts were made to open an XSL stylesheet (revolt.xsl) using the XML DOM property LOAD, but
they were not successful. Figure 38 displays how the XML document was returned for the feedback requirement (4).

Figure 38: Unformatted XML

This fact was re-affirmed by the heuristic evaluations performed when testing the system. A number of test results indicated that the system was usable and aesthetically pleasing apart from this particular requirement; the data entries returned were not aesthetically pleasing (refer to appendix for sample heuristic evaluation). This particular mini requirement could have been satisfied if the XML document had been stored in its original hierarchical structure in an NXD or shredded into an XML-enabled database with schema and index reference. However problems were experienced in both of these areas and required project scope alterations as a result.

5.3 Effectiveness of Method Used

The method utilised was XP coupled with the DSDM framework; the DSDM framework relies on a feasibility study followed by iterative development stages in which aspects of the system are developed, tested and then developed and tested further until the system is ready to be implemented. Each phase of the DSDM framework relies on constant interaction with the user of the system in order to ascertain exactly what it is they require. Adhering to the framework proved difficult because coding of the proposed system commenced as soon as the project was initiated. As a result a number of the elements of the framework had to be reverse engineered, e.g. MoSCoW rules were reverse engineered because all of the requirements were completed in unison and not in order of requirement importance.
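Returning briefly to the styling problem described in 5.2: in general, PHP can apply a stylesheet to XML held as a single string, provided the stylesheet is loaded into its own DOMDocument and handed to an XSLTProcessor. The following is a minimal sketch under stated assumptions, not the artefact's code and not verified against the WampServer 2.0 setup used in the project: it needs PHP's xsl extension, the helper name style_xml is hypothetical, and the sample XML and stylesheet are invented stand-ins for a fetched row and revolt.xsl.

```php
<?php
// Sketch only: style_xml() is a hypothetical helper; it requires PHP's xsl extension.
function style_xml(string $xml, string $xsl): string
{
    $source = new DOMDocument();
    $source->loadXML($xml);        // XML text, e.g. the value fetched from the table

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($xsl);    // the stylesheet lives in a separate DOMDocument

    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);

    return (string) $processor->transformToXML($source);
}

// Invented sample data standing in for one stored document
$xml = '<review><reviewer>DJ Example</reviewer><comment>Great set</comment></review>';

// Invented stylesheet standing in for revolt.xsl
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/review">
    <div class="subheading"><p><xsl:value-of select="reviewer"/></p></div>
    <p><xsl:value-of select="comment"/></p>
  </xsl:template>
</xsl:stylesheet>
XSL;

echo style_xml($xml, $xsl);
```

Whether this would have resolved the issue in the artefact is not established here; it only shows that storage as one row does not by itself prevent a transformation once the row's content is back in PHP as a string.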
  • 47. BSc Computing and Information Systems Project Thomas Rogers In planning the project according to the DSDM framework, it would have been beneficial to implement timeboxes to aid in the completion of the various stages of the project itself i.e. how long it should take to complete each iteration or finish the feasibility study. When using timeboxes one timebox could be set up portraying the whole duration of the project and within the project timebox smaller ones are set up defining when the element should be completed (Ramsin et al. 2008). Each timebox normally lasts about two to six weeks and they aid in defining start and end dates as well as setting boundaries and sign off periods for iterations (Ramsin et al. 2008). Timeboxes were not used because it was decided that having boundaries for each element of the development process and initiating sign offs may put pressure on other areas of the project. It was believed that finishing a specific area before starting another was the better method to employ. In some cases less important areas were put on hold or eliminated because more pressing issues ensued i.e. PICTIVE design sessions were not finished because a medium-fidelity prototype was perceived as a better method of defining interface design and orientation. Testing in iterative phases was non-existent because the coding process followed merely a trial and error process; if something was wrong then it was fixed without the user at hand. Very little testing of each phase of the system took place due to user interaction being at a minimum. Testing took place once the system had been implemented as a means to ascertain whether or not the requirements of the system had been satisfied or not. Also a feasibility study was not undertaken prior to design and implementation, the design phase began immediately. 
It would have been beneficial to conduct a feasibility study prior to implementation to assess the ease of installation of an NXD and XML-enabled database and manipulation of an XML document. Had the feasibility of what was proposed been assessed, the problems encountered when attempting to install an NXD (refer to 6) could have been avoided and more time could have been spent investigating XML-enabled databases and the configuration procedures involved.

However the prototyping phase proved very useful in depicting both the user interface (medium fidelity prototype) and system functionality (vertical prototype). The user interface was experimented with and an aesthetically
pleasing interface was created using the low and medium fidelity prototypes. System functionality was ascertained using the vertical prototype and proved to be an extremely beneficial element of the DSDM framework.

In hindsight, the XP/DSDM methodology employed was useful because it provided general guidelines upon which the software development project could rely; although a number of processes were not completed, the processes that were completed assisted in the creation and completion of the DJ database system, i.e. the gathering of requirements, user stories, prototyping and user acceptance testing post implementation.

In completing the project in the specified timeframe a number of alterations were made. The following section details the problems encountered and the alterations implemented as a result.

6. Project Alterations

A number of problems were encountered in completing the project, largely problems related to experimenting with different NXD’s and XML-enabled databases. As a result a number of alterations were made to the overall scope of the project.

6.1 NXD Installation Issues

Originally, having investigated NXD’s and XML-enabled databases in section 2, the artefact produced was intended to contain both an NXD and an XML-enabled database: an XML-enabled database for requirements 3 and 4 and an NXD for requirement 5. The two implementations would then have been tested and compared to ascertain which database was more efficient based on the query speeds of each database, similar to the tests mentioned in the literature review. Due to configuration issues with NXD implementation it was decided to implement only an XML-enabled database for the XML documents to be stored in. Attempts were made to implement two NXD’s: eXist and Oracle Berkeley DBXML.
6.2 eXist

It was possible to install the eXist NXD but issues were apparent when attempting to open the database; an account could be created but access to the database was
not granted. Figure 39 displays the problem that was encountered when attempting to access the eXist NXD.

Figure 39: Error

Because of the access problems experienced with eXist, it was decided to try a second NXD: Oracle Berkeley DBXML (BDBXML).

6.3 BDBXML

With this NXD it was possible to install the database, create XML documents, insert the required data and query the document created using an MSDOS command prompt, but issues were raised when attempting to access and manipulate the XML document using PHP and display the query results on the specified PHP Web page. Figures 40 to 42 display creating and querying an XML document in BDBXML.

Create document:

Figure 40: DBXML Create

Return document:
Figure 41: DBXML Select

Query document:

Figure 42: DBXML Query

When attempting to open the document revolt.dbxml using the following PHP (test.php) script...

    <?php
    $mgr = new XmlManager(null);
    if (file_exists("revolt.dbxml")) {
        // The opened container is assigned so that $con is set in both branches
        $con = $mgr->openContainer("revolt.dbxml");
    } else {
        $con = $mgr->createContainer("revolt.dbxml");
    }
    // $book_name and $book_content are not defined in this excerpt
    $con->putDocument($book_name, $book_content);
    ?>

The following error message was returned:

Fatal error: Class 'XmlManager' not found in C:\wamp\www\test.php on line 3

This error was due to the required PHP extensions not being available in the php.ini file and the need to build BDBXML using an IDE such as MS Visual Studio. The point of the project was to investigate the use of XML in a DJ
database system and it was perceived that the installation issues experienced were detracting from the real essence of the project; it was therefore decided to alter the concept of the project and settle on one XML database to utilise. An XML-enabled database was therefore utilised purely because of the ability to store and retrieve an XML document from it without installation problems.

6.4 Oracle, PHP and XML Issues

Prior to deciding to utilise the BLOB format of a MySQL database to hold the whole XML document in 1 row of a relational table, investigation was made into employing the XMLType property of an Oracle database to store an XML document and schema. Due to not being able to install and configure an NXD successfully it was decided to attempt to utilise an Oracle database and the XSU (XML SQL) query method in order to form a basis for comparison between two XML-enabled databases: one that stored the data as a whole document in 1 row (MySQL and its BLOB utility), and 1 that shredded each data element into a different row of a relational table (Oracle XML XSU).

Initially, it was possible to store an XML document in an Oracle database that TVU utilised but problems arose when attempting to access and manipulate the XML documents using PHP. The oci8 extension that PHP requires to access an Oracle database via the oci_connect PHP command did not exist on the Apache server at TVU. Even after installing a remote Oracle database and creating and inserting XML documents manually, issues still arose when attempting to access and manipulate the XML document using PHP in the Revolt Web application.
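Both this problem and the XmlManager error in 6.3 amount to PHP being asked for a function or class whose extension is not loaded. As an illustration only (this was not part of the artefact, and the helper name missing_capabilities is hypothetical), a guard of the following kind reports the gap as a readable message before any connection is attempted, using PHP's built-in function_exists() and class_exists():

```php
<?php
// Sketch only: missing_capabilities() is a hypothetical helper, not part of the artefact.
// It returns one message for each required function or class that PHP cannot find.
function missing_capabilities(array $functions, array $classes): array
{
    $missing = [];
    foreach ($functions as $function) {
        if (!function_exists($function)) {
            $missing[] = "function $function() is not available";
        }
    }
    foreach ($classes as $class) {
        if (!class_exists($class)) {
            $missing[] = "class $class is not available";
        }
    }
    return $missing;
}

// oci_connect() comes from the oci8 extension; XmlManager from the BDBXML PHP binding.
foreach (missing_capabilities(['oci_connect'], ['XmlManager']) as $message) {
    echo $message, ": check the extension settings in php.ini\n";
}
```

A check like this does not install anything; it only makes a configuration gap visible on the page that needs the extension.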
On execution of an Oracle test page the following error was returned:

Fatal error: Call to undefined function oci_connect() in C:\wamp\www\oracle_test.php on line 4

This problem again arose due to issues with the oci8 extension, and attempts to re-configure the php.ini file to enable the extension and the related PATH environment variable were unsuccessful. It was at this stage that it was decided to utilise only the MySQL BLOB property that enabled the storage of an entire XML document in 1 row of the table. This was purely due to the fact this was the only
method that proved successful as regards XML document manipulation via PHP DOM and storage of the document created.

6.5 Data Structure

The data being accessed, stored and manipulated was not in semi-structured format; the data is actually structured. In section 2.2 there is a description of how semi-structured data appears and it was apparent that the data being gathered by the DJ forum was not in semi-structured form. Semi-structured data has an irregular and unstructured form whereas the data in the artefact was structured.

Figure 43: Structured Data Format

This data can be represented as...

    <review>
      <name></name>
      <reviewer></reviewer>
      <date></date>
      <comment></comment>
    </review>

The original intention of the artefact was to collect more extensive feedback in relation to the event that the customer has been to. It was proposed originally for the feedback requirement that data relating to music genre, venue description, how
easy the venue was to find, quality of mixing and preferred music would be captured and stored in the XML database. A varying set of data would have been captured: some comments long, some short, and not all of them relevant to the actual DJ event. If this data had been stored in a relational database, a classic example of sparse data would have resulted (refer to Figure 7, page 14); such data is well suited to XML because XML does not impose set boundaries on structure. The intended structure could have been like the following, and therefore semi-structured:

<review>
  <name></name>
  <reviewer></reviewer>
  <comment>
    <genre></genre>
    <venue>
      <venue_description></venue_description>
      <find></find>
    </venue>
  </comment>
</review>

Figure 44: Intended Structure

Unfortunately, there were serious delays in starting the project because TVU was unable to make PHP with Oracle (the preferred technology) available. Eventually a remote server named WampServer had to be implemented, and because of the time constraints the application had to be simplified; a simplified version was implemented to test the application.

7. Conclusion

Having investigated the use of XML in a DJ database system, XML and XML storage methods have been assessed, and a DJ database system has been implemented loosely utilising the XP/DSDM software development methodology. Because an NXD could not be installed and in turn utilised, a comparison could not be made between an NXD and an XML-enabled database as originally intended. Efforts were made to install two NXDs (BDBXL and eXist) so that a comparison could be made, but these proved unsuccessful.

In the literature review it is mentioned that XML-enabled databases are more reliable because they are tried and tested and are the standard for data storage (Connolly et al., 2006). Pardede et al. (2005) further reiterate the limitations of
NXDs, stating that they are in their “infancy” and that XML-enabled databases are preferred because of the mature RDBMS foundation they are based on. Given the problems experienced in installing two NXDs and in configuring an NXD to use PHP, which made it impossible to experiment with NXDs at all, it is possible to agree with this argument. The MySql database used in the DJ database system did not have to be specially configured to work with the PHP scripting language; an XML document was simply stored as a BLOB and was accessed and manipulated accordingly using SQL statements, although it was not styled using XSL on return.

On the other hand, only the scripting language PHP was investigated, and this is not the basis for a solid argument because other programming and query languages, e.g. Java and XPath, could be employed instead of PHP and SQL to access and manipulate the XML document; PHP and SQL were used merely because of familiarity with the two languages. Also, if a proper feasibility study had been performed prior to the design and build phase, further investigation could have been completed into how NXDs are installed, and extra time could have been set aside to install and configure the desired NXD.

Without a basis for comparison between the two originally required XML databases, NXDs cannot be judged. Reference can be made to the experiment performed by Serna and Gerrikagoitia (2005), described in the literature review, in which they found that in general the eXist NXD performed better than the XML-enabled Oracle as regards query time, but this is only their experiment performed under their conditions. Assessment cannot be made until experiments are carried out first hand. As a result, a number of areas of improvement, recommendations and future work can be offered.
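To illustrate the point made above that a DOM API and XPath-style expressions can be used to manipulate and query the review document, a minimal sketch follows. It is written in Python rather than PHP, so it is not the code used in the Revolt application; the sample reviews and the function names comments_by and append_review are hypothetical. Python's ElementTree supports only a limited subset of XPath, which is sufficient here.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample document following the review structure of Figure 43.
REVIEWS_XML = """<feedback>
  <review>
    <name>Launch night</name>
    <reviewer>Alice</reviewer>
    <date>2010-03-16</date>
    <comment>Great mixing.</comment>
  </review>
  <review>
    <name>Spring party</name>
    <reviewer>Bob</reviewer>
    <date>2010-04-02</date>
    <comment>Hard to find.</comment>
  </review>
</feedback>"""


def append_review(xml_text, name, reviewer, date, comment):
    """Add one <review> element to the document using the DOM API."""
    dom = minidom.parseString(xml_text)
    review = dom.createElement("review")
    fields = (("name", name), ("reviewer", reviewer),
              ("date", date), ("comment", comment))
    for tag, value in fields:
        element = dom.createElement(tag)
        element.appendChild(dom.createTextNode(value))
        review.appendChild(element)
    dom.documentElement.appendChild(review)
    return dom.documentElement.toxml()


def comments_by(xml_text, reviewer):
    """Select comments with an XPath-style predicate instead of SQL.

    The reviewer name is placed directly in the expression, so this sketch
    assumes the name contains no quote characters.
    """
    root = ET.fromstring(xml_text)
    matches = root.findall(f"./review[reviewer='{reviewer}']")
    return [review.findtext("comment") for review in matches]


updated = append_review(REVIEWS_XML, "Summer all-nighter", "Carol",
                        "2010-05-14", "Loved the set.")
print(comments_by(updated, "Carol"))
print(comments_by(updated, "Bob"))
```

The path expression selects review elements by the text of a child element, which is the kind of query that would otherwise require the document to be shredded into relational rows before SQL could answer it.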
7.1 Recommendations

Below are recommendations that could assist in improving the Revolt DJ database system:

• Investigate the retrieval of an XML document with an XSL stylesheet so that the XML document returned is formatted in an aesthetic manner
• Look into methods of manipulating XML documents other than PHP DOM
• Investigate how to configure an NXD to work with PHP so that an NXD can be investigated
• Investigate the query language XPath instead of relying solely on the query language SQL to access the specified back-end database

7.2 Future Work

Below are some possible areas of future work:

• Implement an NXD for one of the requirements so that a comparison of the two XML databases can be completed
• Style the XML documents retrieved (revolt_feedback.xml and experience.xml) by the DJ database system
• Create a totally XML-based web application, i.e. XML login and XML registration of details

8. Critique

The following critique is a summary of what has been found in section 6. The initial objective of the assignment was to investigate the use of XML in a DJ database system, and in doing this an artefact was produced to investigate this paradigm. The objective was to produce a PHP XML Web application that enabled users to register details, log in, upload feedback and experience to an XML database, and return the feedback and experience to the Web page using PHP. Six initial requirements were set and all of them have been met. However, the sub-requirement which involved styling the XML document returned was not completed, and this has been outlined in section 5.

The requirements were met, but a proposed comparison between XML-enabled databases and NXDs to ascertain which was the more efficient to employ was not completed due to installation issues (refer to section 6). Also, the data that was captured by the Web site was not semi-structured (refer to section 6) due to configuration problems, so the overall purpose of the project was somewhat undermined by this fact. However, XML is widely used for data exchange on the Internet, so an investigation into how it is used proved beneficial.
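The distinction between structured and semi-structured data referred to in this critique can be illustrated with a short sketch. It is a Python illustration under stated assumptions, not part of the Revolt artefact: the sample reviews are invented, the element name venue_description is assumed because XML names cannot contain spaces, and the column-to-path mapping is hypothetical. Flattening reviews with optional elements into fixed relational columns leaves empty cells, which is the sparse data problem discussed in section 6.5.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured feedback: the second review omits <venue>.
SEMI_STRUCTURED_XML = """<feedback>
  <review>
    <name>Launch night</name>
    <reviewer>Alice</reviewer>
    <comment>
      <genre>House</genre>
      <venue>
        <venue_description>Small basement club</venue_description>
        <find>Easy</find>
      </venue>
    </comment>
  </review>
  <review>
    <name>Spring party</name>
    <reviewer>Bob</reviewer>
    <comment>
      <genre>Drum and bass</genre>
    </comment>
  </review>
</feedback>"""

# Fixed relational columns and the element path each one is read from.
COLUMNS = {
    "name": "name",
    "reviewer": "reviewer",
    "genre": "comment/genre",
    "venue_description": "comment/venue/venue_description",
    "find": "comment/venue/find",
}


def flatten(xml_text):
    """Map each review onto the fixed columns; missing elements become None."""
    root = ET.fromstring(xml_text)
    return [
        {column: review.findtext(path) for column, path in COLUMNS.items()}
        for review in root.findall("review")
    ]


rows = flatten(SEMI_STRUCTURED_XML)
null_cells = sum(value is None for row in rows for value in row.values())
for row in rows:
    print(row)
print("empty cells:", null_cells)
```

In the XML form the second review simply omits the venue element, whereas the relational form must carry an empty cell for every optional field that a reviewer did not supply.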
Even though the aforementioned issues caused a number of problems, the overall question and purpose of the project were realised in that a Web application
using XML was produced and the XML document could be manipulated and stored in an XML database. Working with XML technology proved difficult, but the production of the application and the eventual storing and manipulation of the document proved most satisfying. Further investigation into XML and how it is used on the Web will be of interest in the future.

9. References

BUNEMAN, P. (1997) Semi-structured data. Tutorial. In: Symposium on Principles of Database Systems. UPenn DB Engineering. Retrieved March 16th, 2010, from http://www.cis.upenn.edu/~db/abstracts/semistructured.html

SPERBERG-MCQUEEN, C.M. (2005) XML. Queue, 3 (8). October, pp.34-41.
BURLESON, D. (2009) Identifying Oracle sparse tables. Burleson Consulting. Retrieved March 16th, 2010, from http://www.dbaoracle.com/t_identify_sparse_tables.htm

WOOD, P. Semi-Structured Data: Example of Semi-Structured Data. Retrieved May 14th, 2010, from http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/notes.html

ROY, J. & RAMANUJAN, A. (2001) XML schema language: taking XML to the next level. In: IT Professional, 3 (2), pp.37-40.

WORLD WIDE WEB CONSORTIUM (W3C). (2008) Extensible Markup Language (XML) 1.0 (Fifth Edition). W3C Recommendation. November 26th. Retrieved March 16th, 2010, from http://www.w3.org/TR/xml/

W3 SCHOOLS (W3S). Introduction to XML. Retrieved March 16th, 2010, from http://www.w3schools.com/xml/xml_whatis.asp

HONG, S. & SONG, Y-T. (2007) Efficient XML query using Relational Data Model. In: Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing. Eighth ACIS International Conference. 3, pp.1095-1100.

APPLEQUIST, D.K. (2002) XML and SQL: Developing Web Applications. Addison-Wesley.

SERNA, A. & GERRIKAGOITIA, J-K. (2005) David and Goliath: A Comparison Of XML-Enabled And Native XML Data Management Techniques. June 28th. Retrieved March 25th from http://xml.sys-con.com/node/104980

CHAUDHRI, A.B., RASHID, A., & ZICARI, R. (2003) XML data management: native XML and XML-enabled database systems.