Investigating the Use of XML in a DJ Database System

Thames Valley University

Faculty of Professional Studies

BSc Computing and Information
Systems

Investigating the use of
XML in a DJ database
system

Thomas David Stanley Rogers

May 2010

BSc Computing and Information Systems Project Thomas Rogers

Submitted in support of BSc Computing and Information Systems

May 2010

DECLARATION

“I certify that this work has not been accepted in substance for any degree, and is
not concurrently being submitted for any degree other than that of BSc Computing
and Information Systems being studied at Thames Valley University.

I also declare that this work is the result of my own investigations except where
otherwise identified by references and that I have not plagiarised another’s work”.

Contents

Figures
Abstract
Introduction
Aims and Objectives
Research Element
Developmental Element
Artefact Deliverables
Terms of Reference
Background
Target Audience
Research Element
1. Literature Review
2.1 The role XML plays in handling semi-structured data
2.3 NXD versus XML-enabled relational databases
2.3.1 Investigating Oracle XMLType:CLOB
2.3.2 Investigating Oracle XSU (XML SQL)
2.3.3 Investigating eXist
2.3.4 Investigation Results
Developmental Element
3. Methodology
3.1 Initial User Requirements
3.1.1 Possible requirements
3.2 Use Case Diagrams
3.3 User Stories
3.3.1 Register Details
3.3.2 User Login
3.3.3 Upload Feedback
3.3.4 View Feedback
3.3.5 Upload Tips and Experience
3.3.6 Listen to MP3’s
3.4 MoSCoW Rules
4. Artefact Design and Construction
4.1 Prototypes
4.1.1 Low Fidelity Prototype
4.1.2 Medium Fidelity Prototype
4.1.3 Vertical Prototype
4.2 XML Experiment
4.5 Technology Used
4.6 Database Schema
4.7 Code Walkthrough
4.7.1 XML PHP DOM
4.7.2 XML Document Storage
4.8 Artefact Testing
4.8.1 Test Case
4.8.2 User Acceptance Test
5. Project Assessment
5.1 Requirements Analysis
5.2 Failed Requirement
5.3 Effectiveness of Method Used
6. Project Alterations
6.1 NXD Installation Issues
6.2 eXist
6.3 BDBXML
6.4 Oracle, PHP and XML Issues
6.5 Data Structure
7. Conclusion
7.1 Recommendations
7.2 Future Work
8. Critique
9. References
10. Appendix

Figures

Figure 1: Semi-structured Data
Figure 2: Hierarchical Data Structure
Figure 3: XML Data (Kroenke, 2002, p452)
Figure 4: XML Represented in a Hierarchical Structure
Figure 5: Relational 'Flat' Structure Source: Appelquist, p20
Figure 6: Complex Query
Figure 7: Sparse Data
Figure 8: Semi-structured to Relational
Figure 9: XML Index
Figure 10: Hong et al. XML Tag
Figure 11: Register Details and Login
Figure 12: Feedback
Figure 13: Experience/Tips
Figure 14: Register User Story
Figure 15: Login User Story
Figure 16: Upload Feedback User Story
Figure 17: View Feedback User Story
Figure 18: Experiences/Tips user Story
Figure 19: MP3 User Story
Figure 20: Low Fidelity Prototype
Figure 21: Medium Fidelity
Figure 22: Details
Figure 23: Entry
Figure 24: Prototype Return
Figure 25: XML Experiment
Figure 26: Database Schema
Figure 27: XML Inputs
Figure 28: Test Case
Figure 29: User Acceptance Test
Figure 30: Alias Session
Figure 31: Entry
Figure 32: Entry
Figure 33: Entry
Figure 34: Feedback Id
Figure 35: Entry
Figure 36: Data Displayed
Figure 37: Data Inserted
Figure 38: Unformatted XML
Figure 39: Error
Figure 40: DBXML Create
Figure 41: DBXML Select
Figure 42: DBXML Query
Figure 43: Structured Data Format
Figure 44: Intended Structure

Abstract
Extensible Markup Language (XML) is used to transfer data between
applications in a semi-structured format that does not conform to the traditional
relational database format of tables and columns. The initial research element of
this report examines the role XML plays in handling semi-structured data and how
it handles such data better than the relational approach, and then discusses
native XML databases (NXDs) and XML-enabled databases. NXDs are purpose-built
for storing XML documents, whereas XML-enabled databases store XML documents in
a relational database.
The research element is followed by a developmental element: the implementation
of an artefact that incorporates the findings and demonstrates the use of XML in a
DJ database system. The artefact is an XML and PHP DJ forum Web application and
database that enables users to upload and download semi-structured XML data in
the form of feedback, tips, experiences and set samples.
The development process follows the Dynamic Systems Development Method
(DSDM) software development methodology in conjunction with eXtreme
Programming (XP), and each phase is explained and presented within this
document. This involves requirements elicitation and development, UML
modelling and prototype design, incremental testing, implementation and
evaluation of the artefact, and user acceptance testing.

Introduction

Aims and Objectives

The aim of this project is to investigate the use of XML in a DJ database system.
It is a research and development project in which XML and its storage methods
are investigated, followed by the design, development, implementation and
testing of an artefact based on the DSDM framework.

Research Element
The research element of the project will cover the following areas:
• XML, the role it plays in handling semi-structured data, and the storage
options for XML documents
• A discussion and comparison of native XML databases and XML-enabled
relational databases

Developmental Element

In investigating the use of XML in a DJ database system, the following will
be incorporated in the development of the project:
• A PHP application capturing user data
• XML to transfer data
• An XML-enabled relational database such as Oracle, or a native XML
database, to store the data

Artefact Deliverables
I have a particular interest in being a DJ and in recording and storing the music that I
produce. The proposed software development project will attempt to:
• Enable the user to register and log in to the system
• Capture and store user data or feedback related to DJ sets performed, i.e.
semi-structured data
• Provide a user upload area enabling the sharing of DJ tips and experience
• Store MP3 samples of mixes that users can download; these, however, cannot
be stored in the XML document because the file would become too large
• Possibly provide a blog area enabling users to add their tips and experience

Terms of Reference

Background
A particular interest in Web-based applications and the PHP scripting language
has led to an interest in learning XML and the programming involved in
producing an application with it. It was seen as beneficial to combine this
with an interest in being a DJ in completing the literature
review and producing the DJ database system.

Target Audience
The intended target audience of this project could be:
• Web developers interested in using XML
  ◦ Demonstrates reasons for employing XML in a Web application
  ◦ Presents alternatives, i.e. what type of XML database to employ
• Aspiring DJs wanting to market themselves
• Users uncertain of DJ music genres
• DJs wanting to share information with each other, i.e. experiences and tips

Prior to the design and implementation of the system, research into the topic area
will be carried out.

Research Element

1. Literature Review
The proposed software application intends to demonstrate the use of XML in a DJ
database system; it is therefore essential to investigate XML, its components and
storage methods prior to design and implementation. The following
literature review sets out the need for this project and the
development of the database system. According to Sperberg-McQueen (2005),
XML plays a major part in solving the issues of semi-structured data. This review
therefore outlines what semi-structured data is, the role XML plays in
handling it, and how XML handles semi-structured data differently from
the traditional relational approach. This leads to a discussion and comparison of
two separate XML storage techniques: XML-enabled databases and native XML databases
(NXDs). The comparison of the two will form the basis of the qualitative and
quantitative data collection and evaluation of the proposed application in the later
stages of this project.

2.1 The role XML plays in handling semi-structured data
In the IT industry, commonplace information such as reports and programming
structures takes forms that differ from the rigidly structured data
traditionally associated with relational databases; this data is referred to as
semi-structured (Sperberg-McQueen, 2005). According to Buneman (1997),
semi-structured or unstructured data is data that does not fit into traditional data models
or the constraints those models impose. Sperberg-McQueen (2005) expands on
this notion by describing semi-structured data as
having repetitive characteristics and varying structures in a hierarchical
representation (Figure 1). This continual repetition results in
large numbers of tables being created and stored in a given database, which are
then heavily used through numerous insert and delete operations (Burleson, 2009),
a pattern characteristic of the Internet. Figure 1 displays an example of semi-structured
data:


name: Peter Wood
email: ptw@dcs.bbk.ac.uk, p.wood@bbk.ac.uk

name:
first name: Mark
last name: Levene
email: mark@dcs.bbk.ac.uk

name: Alex Poulovassilis
affiliation: Birkbeck

Source: http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/notes.html
Figure 1: Semi-structured Data

The three records each have a different structure and follow no
consistent pattern, which makes this data hard to fit into a regular relational database and
subsequently to query. This will be discussed further in
section 2.2.
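
As an illustrative sketch that is not part of the original project, the three records in Figure 1 can be written as Python dictionaries. The variable and function names are assumptions made for this example. Comparing the field names shows why no single set of columns fits all three records.

```python
# The three records from Figure 1, written as nested Python dictionaries.
records = [
    {"name": "Peter Wood",
     "email": ["ptw@dcs.bbk.ac.uk", "p.wood@bbk.ac.uk"]},    # two e-mail values
    {"name": {"first name": "Mark", "last name": "Levene"},   # name is nested
     "email": "mark@dcs.bbk.ac.uk"},
    {"name": "Alex Poulovassilis",
     "affiliation": "Birkbeck"},                              # no e-mail at all
]


def field_names(record):
    """Return the top-level field names of one record."""
    return set(record)


# Fields shared by every record versus fields that appear in at least one.
common = set.intersection(*(field_names(r) for r in records))
union = set.union(*(field_names(r) for r in records))

print("fields in every record:", sorted(common))
print("fields in any record:  ", sorted(union))
```

Only the name field is present in all three records, and even that field is a plain string in two records and a nested structure in the third.
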
The majority of data that is handled in current times is organized hierarchically:
computer file systems, “library classification schemes” and directories such as
the Yellow Pages are all examples of data that is structured hierarchically
(Jagadish et al., 1999, p3). Figure 2 displays how semi-structured data is stored
hierarchically in a number of directories and sub-directories: A is the top-level
directory, B and C are sub-directories of A, and the remaining nodes can be
perceived as sub-directories of B and C.

A

B C

E F G H
Figure 2: Hierarchical Data Structure

Sperberg-McQueen (2005) states that XML transfers data in a hierarchical structure
and is well suited to representing semi-structured data that is in this format. Figure 3
is a basic example of an XML document:

<customer>
  <name>
    <firstname>Michelle</firstname>
    <lastname>Correll</lastname>
  </name>
  <address>
    <street>1824 East 7th Avenue</street>
    <street>Suite 700</street>
    <city>Memphis</city>
    <state>TN</state>
    <zip>32123-7788</zip>
  </address>
</customer>

Figure 3: XML Data (Kroenke, 2002, p452)

The XML data displayed in Figure 3 can be represented in a hierarchical structure
as follows:
Customer

Name Address

Firstname Lastname City State Street Zip
Figure 4: XML Represented in a Hierarchical Structure
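
The mapping from Figure 3 to Figure 4 can be reproduced with a short Python sketch using the standard library's xml.etree.ElementTree module. This is an illustration only; the function name outline is an assumption and the project itself is built with PHP.

```python
import xml.etree.ElementTree as ET

# The XML document from Figure 3.
CUSTOMER_XML = """
<customer>
  <name>
    <firstname>Michelle</firstname>
    <lastname>Correll</lastname>
  </name>
  <address>
    <street>1824 East 7th Avenue</street>
    <street>Suite 700</street>
    <city>Memphis</city>
    <state>TN</state>
    <zip>32123-7788</zip>
  </address>
</customer>
"""


def outline(element, depth=0):
    """Return the element hierarchy as a list of indented tag names."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CUSTOMER_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The street element appears twice under address, which is the kind of repetition a hierarchy accommodates without any change to its structure.
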
Kasukawa et al. (1999) note that XML is able to handle semi-structured
data effectively because of the varying and flexible
structure that it allows. Sperberg-McQueen (2005, p34) supports this
assertion by stating that XML data is not confined to a flat “table of rows
and columns”, allowing for more complex and intricate data structures. According
to the W3C, XML is a technology that describes
and transfers data and documents over the Internet. XML is a means to structure
and store data and then transport it in plain text format; because XML has no
predefined set of tags (W3S), programmers are free to define their own entities,
tags and child elements, resulting in improved handling of
semi-structured data. Figure 3, although very basic, is a good example of how
flexible an XML document can be, purely through the control it gives developers
to create their own hierarchical document structure. This is essential
when relating this concept to semi-structured data.

XML can give the semi-structured data found in Figure 1 structure through the
definition of an XML schema. According to Roy et al. (2001), the XML schema
language defines the structure of, and the constraints imposed upon, an XML
document, including the data types and the various elements and attributes within the document.
Roy et al. (2001) further describe the control the schema provides
by stating that it allows for the definition of data types, such as:
• String
• Boolean
• Date
• Time
In the interest of developers, however, user-defined data types can also be
created, allowing custom constraints to be defined for the
attributes and elements within the XML document and a
suitable syntax and format to be enforced (Roy et al., 2001). Semi-structured data can be
structured using XML and an XML schema; the schema gives even more control
to the developer through the definition of types and structure for the
elements and attributes created in the XML document.
Roy et al. (2001) indicate that XML schemas allow for inheritance, which in turn
enables the reuse of software, again assisting developers: code does not have to be
written from scratch each time, which helps both the overall XML
software development process and the maintenance of the code on which it is based.
However, even if an XML document does not have a schema associated with it, the
document can still be accessed or manipulated because XML is self-describing
(IBM). This means that a separate data structure definition is not required: the
system can construct a structure from the XML tags, so that semi-structured
data is understood and structured within the XML document itself.
Relational databases, by contrast, require a rigid “static schema definition” because, according
to IBM (2005), each row within the table must have the same schema.
In the context of this project, the database system will be an XML and PHP Web
application. The World Wide Web is a perfect example of a semi-structured and
hierarchical data source that is ever expanding and does not conform to the
constraints of a specific data model; it is extremely difficult to make
this medium cohere with a structured data model such as the relational approach
(Buneman, 1997). XML’s self-describing capabilities result in elegant handling of
semi-structured data sources on the Internet, which makes this an excellent place to use
XML (IBM). Creating a Web application that handles semi-structured
data in XML format using an XML schema is therefore a suitable means to investigate and
evaluate the concept.
Having ascertained how XML handles semi-structured data, it is important to
compare the role XML plays in handling semi-structured data with the
traditional relational approach. For the purposes of the
project it is important to investigate XML from differing angles to gain a
perspective of how XML compares to another method of semi-structured data
management.
Data within a relational database is not stored hierarchically; it is stored as flat
data (Figure 5). The following table is a representation of a relational table of flat
data:

Book_id Book_title Author_name

2 Sense and Sensibility Jane Austen

3 Pride and Prejudice Jane Austen

Figure 5: Relational 'Flat' Structure Source: Appelquist, p20

According to Mo et al. (2002), semi-structured data is largely stored in flat files;
if it is stored in a relational database, the data within the tables is hard to query or
update, largely because of the difficulty of querying columns that hold multiple values: if an
attribute has multiple values within its column, it is hard to search for a
specific value within that attribute. For example, in Figure 5 the
Author_name attribute holds two values, i.e. Jane and Austen; to find
the surname of the author the query becomes more complex. If SQL is used to
query the table, a ‘WHERE’ clause with the ‘LIKE’ condition has to be
included in the statement to return the specified value (assuming the table is named book).

SELECT Author_name
FROM book
WHERE Author_name LIKE '%Austen';

Figure 6: Complex Query
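
The query in Figure 6 can be tried against an in-memory SQLite database through Python's standard sqlite3 module. SQLite is only a stand-in for the relational database here, and the table definition and the XML layout with separate firstname and lastname elements are assumptions made for this sketch.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Relational side: the flat table from Figure 5 (SQLite is an assumed stand-in).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (Book_id INTEGER, Book_title TEXT, Author_name TEXT)"
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(2, "Sense and Sensibility", "Jane Austen"),
     (3, "Pride and Prejudice", "Jane Austen")],
)

# Figure 6: the surname is part of Author_name, so a pattern match is needed.
like_rows = conn.execute(
    "SELECT Author_name FROM book WHERE Author_name LIKE '%Austen'"
).fetchall()

# XML side: a hypothetical layout that keeps the two parts of the name apart.
books = ET.fromstring(
    "<books>"
    "<book><title>Sense and Sensibility</title>"
    "<author><firstname>Jane</firstname><lastname>Austen</lastname></author></book>"
    "<book><title>Pride and Prejudice</title>"
    "<author><firstname>Jane</firstname><lastname>Austen</lastname></author></book>"
    "</books>"
)
surnames = [b.findtext("author/lastname") for b in books.findall("book")]

print(like_rows)
print(surnames)
```

The SQL query returns the whole Author_name value and relies on a pattern match, whereas the nested XML layout lets the surname be addressed directly by its path.
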

Query and performance issues will be discussed further in 2.2.
Mo et al. (2002) go on to state that semi-structured data held within relational
databases, especially under “the edge table approach” (p253), is very often ever
expanding, too large to handle and too costly to maintain, and that large amounts of
data can be redundant or empty. This reiterates Burleson’s findings in 2.1.
Refer to Figure 7 for an example of sparse data within a relational database.

Book_id   Book_title              Author_name   Co-author     Editor

2         Sense and Sensibility   Jane Austen   -----         An Editor

3         Pride and Prejudice     Jane Austen   Enid Blyton   An Editor

4         A Book                  ----          ----          An Editor

Figure 7: Sparse Data

XML can reduce the problem of redundant data and can also eliminate sparse
data because, according to Sperberg-McQueen (2005), XML’s syntax does not allow
the storage of non-existent data.
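
A minimal sketch of this point, assuming the rows of Figure 7 and using None for each empty cell: when the rows are converted to XML, an element is created only where a value exists. The function name row_to_xml and the book element name are hypothetical.

```python
import xml.etree.ElementTree as ET

COLUMNS = ["Book_id", "Book_title", "Author_name", "Co-author", "Editor"]

# Rows from Figure 7; None marks a cell that holds no data.
rows = [
    (2, "Sense and Sensibility", "Jane Austen", None, "An Editor"),
    (3, "Pride and Prejudice", "Jane Austen", "Enid Blyton", "An Editor"),
    (4, "A Book", None, None, "An Editor"),
]


def row_to_xml(row):
    """Build a <book> element, creating child elements only for real values."""
    book = ET.Element("book")
    for column, value in zip(COLUMNS, row):
        if value is not None:
            ET.SubElement(book, column).text = str(value)
    return book


books = [row_to_xml(r) for r in rows]
empty_cells = sum(value is None for row in rows for value in row)
elements = sum(len(book) for book in books)

for book in books:
    print(ET.tostring(book, encoding="unicode"))
print("empty relational cells:", empty_cells, "XML elements:", elements)
```

The relational table has to reserve a cell for every column in every row, while each XML record carries only the elements that actually hold data.
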
In order for relational tables to handle semi-structured data, the hierarchically
structured data has to be manipulated so that it fits into the flat structure of a
relational table; this is achieved by mapping semi-structured data, using XML, to
relational table attributes and columns individually or through “partial
decomposition”, which will be discussed in 2.3 (Appelquist, D.K. 2002, p94).
According to the findings of 2.1, XML has a hierarchical data structure, so the
semi-structured data itself does not need to be transformed; the data remains as it is
initially, and it is the XML document that needs to be manipulated so that it fits into
the relational model. Chaudri et al. (2003) state that mapping an XML document
so that it conforms to the relational model causes compatibility and performance
issues due to “impedance mismatch”, a condition that arises between two
different technological paradigms, i.e. the XML document and the relational table. This
will be investigated further in 2.3. Figure 8 displays a basic example of how
semi-structured data from a hierarchical data structure is mapped to a relational table
using XML.
Hierarchical structure:
    D
  A B C

XML document:
<D>
  <A>Data</A>
  <B>Data</B>
  <C>Data</C>
</D>

Relation: D
A      B      C
Data   Data   Data
Figure 8: Semi-structured to Relational
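
The mapping in Figure 8 can be sketched in Python with the standard xml.etree.ElementTree and sqlite3 modules. SQLite stands in for the relational database, and the rule used here (root tag becomes the table, child tags become columns) is an assumption chosen to mirror the figure rather than a general mapping algorithm.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = ET.fromstring("<D><A>Data</A><B>Data</B><C>Data</C></D>")

# The root tag becomes the table name and each child tag becomes a column.
table = doc.tag
columns = [child.tag for child in doc]
values = [child.text for child in doc]

# Identifiers are taken from a trusted, hard-coded document; tag names from
# untrusted input would need validating before being placed in SQL text.
column_defs = ", ".join(name + " TEXT" for name in columns)
placeholders = ", ".join("?" for _ in columns)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE " + table + " (" + column_defs + ")")
conn.execute("INSERT INTO " + table + " VALUES (" + placeholders + ")", values)

stored = conn.execute("SELECT A, B, C FROM " + table).fetchall()
print(columns)
print(stored)
```

This works only because every child of D occurs exactly once; a repeated or nested element would not fit a single flat row, which is where the mapping becomes more involved.
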

Having investigated the role XML plays in handling semi-structured data and how
its handling differs from the relational approach, different
methods of storing XML documents will now be investigated. The intended
objective of the artefact is to let users insert and upload semi-structured
data to a DJ forum web site; essentially, the ability to manipulate and
query data in XML format. For the purposes of this project, storage and retrieval
of XML documents is therefore essential, and available XML storage methods and related
work need to be investigated and reviewed. To gain a comparative
perspective it is essential to analyse XML document storage techniques from two
differing angles: XML-enabled databases and NXDs.

2.3 NXD versus XML-enabled relational databases
According to Chaudri et al. (2003), XML-enabled databases are employed to store
XML documents in a relational table, whereas NXDs are designed to hold data that
conforms purely to the XML hierarchical structure and do not merely
deconstruct the XML document and store it in a relational database. According to
Serna et al. (2005), when utilizing an XML-enabled database the
XML document must be transformed both when it is stored in and when it is
retrieved from a relational table. NXDs are a direct
means to store an XML document in a database, and conversion to another format
or mapping of the document between two separate data models is not needed
(Serna et al., 2005). Having ascertained the basic principles of NXDs and
XML-enabled databases, the advantages and disadvantages of both have to be
considered.
According to Ford (2003), a key advantage of storing XML documents in an NXD
rather than an XML-enabled relational database is that the XML document does not
need to be manipulated to conform to a relational model and that the data held in
the document is preserved exactly. In fact, Kroenke (2005, p89) controversially
believes that “the relational
database model is doomed” because advancing XML document storage
technologies, i.e. NXDs, are able to store XML documents entirely as they are
without the need to deconstruct them. An experiment performed in 2005 by Ainhoa
Serna and Jon Keppa Gerrikagoitia will be used to assess Kroenke’s
controversial claim, and the results will be displayed later in this review.
However, according to Hong et al. (2007), XML documents are a useful means to
store, extract and query data that is based upon a single repeatable entity, but
suffer in efficiency when multiple
XML documents are queried. Hong et al. (2007) raise the concern that when XQuery and
XPath, the W3C recommendations for querying XML documents, are used to
query multiple XML documents, they are less efficient than merging
the documents into a relational database and querying them using SQL, because
the XML files have to be read individually prior to processing. Pardede et al.
(2006, p205) point to a further limitation of NXDs: even though they
are continuously improving, they are a technology in its
“infancy”, whereas established, “mature” relational database management systems
(RDBMS) are more familiar to programmers and consequently more competitive.
Pardede et al. (2006) emphasise that RDBMS are tried and tested
technologies and argue that this outweighs the limitations that restructuring an XML document
imposes.
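
The two approaches that Hong et al. contrast can be sketched as follows. The three feedback documents, their element names and the table layout are hypothetical, SQLite stands in for the relational database, and Python's ElementTree path search stands in for a full XPath or XQuery engine. The sketch shows only the difference in procedure and makes no claim about relative performance.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical feedback documents, one per DJ set.
documents = [
    "<set><dj>Alpha</dj><feedback rating='4'>Good mix</feedback></set>",
    "<set><dj>Beta</dj><feedback rating='2'>Too short</feedback></set>",
    "<set><dj>Alpha</dj><feedback rating='5'>Great set</feedback></set>",
]

# Approach 1: read and parse every document in turn, then search inside it.
xml_hits = []
for text in documents:
    root = ET.fromstring(text)
    if root.findtext("dj") == "Alpha":
        xml_hits.extend(f.text for f in root.findall("feedback"))

# Approach 2: merge the documents into one table once, then query with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback (dj TEXT, rating INTEGER, comment TEXT)")
for text in documents:
    root = ET.fromstring(text)
    for f in root.findall("feedback"):
        conn.execute(
            "INSERT INTO feedback VALUES (?, ?, ?)",
            (root.findtext("dj"), int(f.get("rating")), f.text),
        )
sql_hits = [
    row[0]
    for row in conn.execute(
        "SELECT comment FROM feedback WHERE dj = 'Alpha' ORDER BY rowid"
    )
]

print(xml_hits)
print(sql_hits)
```

Both approaches return the same comments; the difference is that the first must open each document on every query, while the second pays the conversion cost once when the documents are merged.
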
Connolly et al. (2005) reiterate that, despite the flexibility, intricacy
and complexity of handling data in XML documents, RDBMS are still the
standard for storing XML documents due to their reliability, available tools,
performance and scalability, an essential asset in handling semi-structured data
(XML). However, the conversion process involved in storing an XML document
in an XML-enabled database requires a great deal of processing in order for the
document to conform to the relational structure of rows and columns. One such conversion
method is document shredding. Connolly and Begg (2005) state that the XML
document has to be decomposed, or shredded, into elements that are placed into
columns within one or more relational tables within the database. This brings
one benefit: an element value can be indexed so that it can be accessed
when querying the table using an SQL statement, e.g.

<xml title="i"> <!-- i = index value -->
<content>Hello world</content>
</xml>

Figure 9: XML Index

This index value will itself require its own attribute or column within the
table, which in turn increases the size of the table and the need for table
scalability, an aspect that could be perceived as a disadvantage (Connolly et
al., 2005). When querying a large table with many attributes, retrieving
specific data may prove difficult: sifting through large amounts of potentially
redundant data for index values may decrease query performance, and the queries
themselves may need to be more complex in order to acquire specific data (refer
to 2.2).
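
The shredding and indexing idea described above can be illustrated with a
minimal sketch. It is written in Python rather than the PHP used for the
artefact, and it uses an in-memory SQLite database as a stand-in for a
relational DBMS. The forum/entry document, the entry table and the idx column
are hypothetical names chosen for illustration; they are not taken from
Connolly and Begg (2005).

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: each <entry> carries its index value in an "id" attribute.
XML_TEXT = """
<forum>
  <entry id="1"><content>Hello world</content></entry>
  <entry id="2"><content>Second post</content></entry>
</forum>
"""


def shred(xml_text, conn):
    """Decompose each <entry> into one row: (index value, element text)."""
    conn.execute("CREATE TABLE entry (idx INTEGER PRIMARY KEY, content TEXT)")
    root = ET.fromstring(xml_text)
    for node in root.findall("entry"):
        conn.execute(
            "INSERT INTO entry (idx, content) VALUES (?, ?)",
            (int(node.get("id")), node.findtext("content")),
        )
    conn.commit()


def content_for(conn, idx):
    """Look up one shredded element value through its index column."""
    row = conn.execute(
        "SELECT content FROM entry WHERE idx = ?", (idx,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
shred(XML_TEXT, conn)
print(content_for(conn, 1))
```

The extra idx column is exactly the additional attribute discussed above: it
makes the SQL lookup straightforward, but every shredded element costs a row
and the original document structure is no longer stored anywhere.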
But do the processes involved in shredding an XML document, inserting it into
an XML-enabled database such as Oracle, and then attempting to query, update,
retrieve and rebuild the XML document in its original structure decrease
performance levels? Hong et al. (2005) have proposed such a method of XML
document decomposition and reproduction, using index values to store an XML
document of semi-structured data in a relational database and retrieve it
again. In their experiment Hong et al. (2005) shredded each XML tag in the
document (refer to Figure 10) into the relational table separately and used a
Microsoft computer program to co-ordinate the data within each tag so that it
matched exactly when reproduced in the relational table.
<node sessionID="0000000001" nodeID="000000004" userID="000000001"
username="117:YEONG-TAE SONG" pntSeq="5" time_stamp="114728264"
opnDetail="It looks good" />

Figure 10: Hong et al. XML Tag

They found this difficult because data had to be mapped between two different
data models, a problem known as impedance mismatch. By using the displayed
indexes they were able to specify which data they required from the document
when querying the relational table using MS SQL. Further difficulty was
encountered when querying the database, in that the XML document returned was
fragmented and the query had to be altered further in order to retrieve the XML
document exactly as it was created.
However, Hong et al. (2005) have only proposed a means to convert or shred XML
documents into a relational model, and no query performance details have been
offered. Hong et al. (2005) have stressed that the process is difficult because
of the differing data models, but performance is the key issue.
Alternatively, the XML document does not always need to be shredded when it is
stored in an XML-enabled database: Oracle provides a method of storing the XML
document as a whole within a single row of a relational table. This method is
known as Oracle XMLType:CLOB. XMLType:CLOB stores an XML document exactly
according to its initial hierarchy and structure, and it uses XPath to update
the document without rewriting any elements of it (W3S); this is discussed
further in 2.3.1.
The key issue is performance: retrieving and updating the XML document, sending
it to and retrieving it from the specified database. Serna et al. (2005) state
that, when using an NXD, data is stored exactly as it was created and no
conversion is required, so that what is received is an XML document and what is
sent back is an XML document; performance is therefore not lost through
conversion to a different structure, i.e. a relational table. Serna et al.
(2005) performed experiments on both an XML-enabled database and an NXD. They
compared the open-source NXD eXist with the XML-enabled relational database
Oracle, the latter from two perspectives: XSU-SQL, which maps the XML document
to the database, and XMLType:CLOB, in which the whole document is stored and no
mapping of document structure to relational table attributes is needed.

The following experiments were conducted by Ainhoa Serna and Jon Keppa
Gerrikagoitia in 2005 (http://xml.sys-con.com/node/104980). Three different
tests were performed:

• XML:DB XPath, which involved recurring searches, searches for attribute
nodes, searches with two conditions and searches using pre-defined XPath
functions

• XML:DB test, which entailed add, delete and update document queries as
well as creating and dropping collections

• XML:DB robustness, which tested the maximum scalability of each storage
method
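
To make the search categories in the first test more concrete, the sketch below
shows one possible query of each kind. It is written in Python (not the
technology used by Serna et al.) and relies on the limited XPath subset of the
standard library module xml.etree.ElementTree. The clients document is
hypothetical and is not the data set used in the experiments. ElementTree does
not implement XPath functions such as contains(), so that case is emulated with
an ordinary Python filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection; one <client> is nested to show a search at any depth.
DOC = ET.fromstring(
    "<clients>"
    "<client id='1' city='London'><name>Ann</name></client>"
    "<client id='2' city='Leeds'><name>Bob</name></client>"
    "<group><client id='3' city='London'><name>Cy</name></client></group>"
    "</clients>"
)

# Search at any depth: every <client> element below the root.
all_ids = [c.get("id") for c in DOC.findall(".//client")]

# Attribute-node search: clients whose city attribute has a given value.
london_ids = [c.get("id") for c in DOC.findall(".//client[@city='London']")]

# Two conditions: chained predicates behave as a logical AND.
ann_ids = [
    c.get("id")
    for c in DOC.findall(".//client[@city='London'][name='Ann']")
]

# Emulation of contains(@city, 'ee'), which ElementTree does not provide.
ee_ids = [c.get("id") for c in DOC.iter("client") if "ee" in c.get("city")]

print(all_ids, london_ids, ann_ids, ee_ids)
```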

To evaluate the performance of each database, the same queries were executed
against continuously increasing amounts of data: 5000 XML documents of 592
bytes each, 4 sub-collections each with 5000 files, and a further 2
sub-collections comprising 4000 files and 6000 XML documents.

2.3.1 Investigating Oracle XMLType:CLOB
As stated earlier, the XMLType:CLOB facility stores each XML document as a
whole within one row of a relational table, and it was used in conjunction with
XPath queries. XMLType possesses functions that enable a query to extract
particular nodes and fragments of an XML document, and Serna et al. (2006)
utilised XMLType functions such as extract() and getStringVal() to query data.
A table named ‘clientes’ of XMLTYPE was created and the documents were inserted
into it. Numerous queries were performed, including updates, queries using
XPath, queries of several tables and deletions from the table.

2.3.2 Investigating Oracle XSU (XML SQL)
This involved mapping the XML document nodes to a relational table. Two methods
of storing and retrieving the XML documents were implemented by Serna et al.
(2006): relational table storage, as already mentioned, and object-relational
(OR) table storage. Both solutions were implemented as part of the
investigation in order to compare and contrast the two. For OR storage the
required information was stored in data types, and for the relational storage
method the data was stored in a relational table. Again, as with the XMLType
feature, a number of SQL insert, update and delete queries were performed using
Java, its JDBC extension and Oracle’s XML parser. However, because XML nodes
are mapped to a relational table, the names of the XML elements and table
columns have to be identical.

2.3.3 Investigating eXist
Storing the XML documents in eXist did not require any of the mapping or
configuration needed to comply with an XML-enabled database, so the NXD was
simply queried using XPath.

2.3.4 Investigation Results
The results showed that eXist was the most efficient as regards recurring
searches and had the best response times in all of the tests performed. Queries
utilising standard XPath functions such as “contains” returned quicker response
times than queries that required the “=” operator; the “=” operator is utilised
by both XSU and XMLType, so their results were also poorer than those of eXist
in this case.
For the insertion of 30,000 XML documents into each of the databases, insertion
into the OR and relational tables of the Oracle database performed better than
insertion into eXist. The XMLType:CLOB and eXist insertion results were
similar, but both were poorer than the OR and relational table results. The
random access times of XMLType were also the worst of all the solutions. In
general, update and select queries were much faster with eXist and the
structured storage formats than with the CLOB method.

The investigation within the research element of the project will assist in
ascertaining which XML database to utilise in the completion of the DJ
database system.

Developmental Element

3. Methodology
With XML, its handling of semi-structured data, alternative means of handling
semi-structured data (i.e. RDBMS) and potential means of storing and
manipulating XML documents from two perspectives all investigated, the initial
research element was complete. The developmental element could then be
initiated, and a software development lifecycle was adhered to in order for the
project to be successful. With a lifecycle model, the ability to sign off each
stage of system development makes it possible to gauge how far the project has
progressed and limits the time taken for each stage, i.e. through the creation
of timeboxes for each development process (Ramsin et al., 2008, p54) (refer to
the appendix for the DSDM project plan and the timeboxes established).
Essentially, due to the project timescale, the project follows a rapid
application development (RAD) methodology. Two RAD techniques have been used in
tandem in the implementation of the proposed system: Dynamic Systems
Development Method (DSDM) and eXtreme Programming (XP). DSDM was the
development methodology the project assumed, and DSDM is believed to be the
standard for RAD methodologies (Ramsin et al., 2008, p52).
According to Ramsin et al. (2008, p53) DSDM is an “iterative-incremental”
model in which business processes and user requirements are developed and
applied to two iterative development phases where prototypes are developed and
sequentially implemented. The second method, XP, is similarly an iterative
agile design process with the aim of creating and adapting software rapidly
according to ever-changing user requirements (Hedin, G et al., 2003); both
processes stress the importance of testing before implementing.
In applying these two development concepts together the aim was to develop and
implement a high specification system in a short period of time. DSDM provides
the framework upon which XP can develop; XP provides the format for system
testing and programming, and DSDM defines the product lifecycle and required
deliverables, i.e. what was required of the system (Camara, S. 2009). This
harmony between the methodologies justified employing the two in tandem in
this development project. The two agile development methodologies were also
chosen because of familiarity with, and past experience of, using the two
concepts together prior to this project.
Refer to the appendix for the original DSDM project plan and an activity diagram
detailing the overall business process design and activities undertaken in the
completion of the project.
Having decided on a relevant development methodology to employ, requirements
elicitation was required, i.e. establishing what the user requires of the
proposed system.

3.1 Initial User Requirements
The following requirements list displays the initial user and system requirements.
1. The user shall be able to register their personal details and store them in an
XML database
1.2 Details to be registered:
• Firstname
• Lastname
• Username
• Password
• Alias
• Email Address

2.1 The user shall be able to login to the web site
2.2 Details to be entered will be:
• Username
• Password
• DJ alias
3.1 The user shall be able to upload feedback to the web site using a stored XML
document
3.2 Feedback to be stored:
• Name
• Regarding
• Month
• Comment
3.3 Name: the name of the user using the DJ forum
3.4 Regarding: what the comment the user is uploading relates to i.e. an event
or MP3

3.5 Month: when the feedback was uploaded
3.6 Comment: comment user wishes to add
4.1 The user shall be able to view the feedback they have uploaded via a feedback
link in the web site (XML document returned and styled using XSL)
4.2 The user shall be able to view all feedback uploaded to the web site
5. The user shall be able to upload tips and experience to the web site, storing the
XML data in an XML database and returning the XML document styled using
XSL
6. The user shall be able to listen to MP3 samples accessed via a web link

NB. This data will not be in XML format; it will be stored on the server
because the files would be too large for an XML document to handle.

3.1.1 Possible requirements
7. The user shall be able to upload DJ tutorials to the web site, again handling the
data using XML and storing the data in an XML database
8. The proposed system will be a web log or blog web site enabling real-time
commentary
After establishing the basic requirements, use case diagrams were created to gain
a visual perspective of what was required.

3.2 Use Case Diagrams
UML use case diagrams were used to gain a better perspective of what was
required of the system. The use case diagrams were produced according to the
initial requirements that were established, with one use case diagram for each
of the main requirements, i.e. register details and login (1, 2), feedback
entry and return (3, 4), and experience/tips entry and return (5). The
following UML use case diagrams were created using StarUML (refer to Figures
11 to 13).

Figure 11: Register Details and Login

Figure 12: Feedback

Figure 13: Experience/Tips

Having gained a visual perspective of what is required of the system through use
case depiction, user stories were developed.
3.3 User Stories
User stories were established in tandem with a potential user of the system (a
fellow DJ) to ascertain the processes completed by the user in order to satisfy the
specified requirements.

3.3.1 Register Details

After entering the Web site URL the user will be able to select the ‘Register’
link from the home page (index.php). Register.php will be opened and the user
will be able to enter their first name, last name, username, password, DJ alias
and email address. After clicking ‘Submit’ the user data will be transferred to
an XML database using an XML document and they will be returned to
Revolt’s home page.

1. User will click the ‘Register’ link apparent on the home page
2. User will enter all required data in to the text fields apparent
3. The user will press ‘Submit’
4. The user will be returned to the home page

Figure 14: Register User Story

3.3.2 User Login

After being returned to the home page the user will be able to enter their
username, password and alias. After clicking ‘Login’ the user will enter the
Web site and will open welcome.php.

1. User enters username
2. User enters password
3. User enters DJ alias
4. User clicks ‘Login’
5. User enters Web site

Figure 15: Login User Story

3.3.3 Upload Feedback

After entering the Web site, the user will be able to select ‘Forum’ from the
link tab displayed. By selecting the ‘Forum’ link from welcome.php the user
will be able to enter their name, what the comment relates to, the month in
which the feedback is being uploaded and a comment. The user will then be able
to click ‘Submit’ to upload their comments.

1. User clicks ‘Forum’ and enters welcome.php
2. User enters their name
3. User enters what the feedback is regarding
4. User enters the month
5. User enters their comment
6. User clicks ‘Submit’ to upload comments

Figure 16: Upload Feedback User Story

3.3.4 View Feedback
After clicking ‘Submit’ the user will open confirm.php and will be able to view
the comments they have just uploaded. After selecting the radio button on the
same page the user will be able to click ‘Submit’ and will be taken to
confirmation.php, where all uploaded comments will be displayed.

1. User views uploaded comments from previous page
2. User selects the radio button
3. User clicks ‘Submit’ to manoeuvre to confirmation.php

Figure 17: View Feedback User Story

3.3.5 Upload Tips and Experience

After logging in to the web site the user will be able to select
‘Experience/tips’ and fill in the required fields on the page. After clicking
‘Submit’ the user will be able to see the entries they have added to the Revolt
DJ forum.

1. User clicks ‘Experience/tips’
2. User enters their alias
3. User enters the ‘Give it a name’ field
4. User enters the experience
5. User enters their tips
6. User clicks ‘Submit’ to upload comments

Figure 18: Experiences/Tips User Story

3.3.6 Listen to MP3’s

After entering the relevant URL the user will be able to select ‘MP3’ from the
link tab and will be manoeuvred to mp3.html where they will be able to select
an MP3 link from the present page.

1. User clicks ‘MP3’ link
2. User selects desired MP3 link from mp3.html

Figure 19: MP3 User Story

When developing a piece of software using RAD techniques such as DSDM it is
important to establish which requirements are important in order to complete
the project successfully and within the stipulated time period. It was not
feasible to implement all of the user stories established in 3.3 and it was
therefore necessary to refine the requirements. In the following section
MoSCoW rules are used to establish the importance of each user requirement.

3.4 MoSCoW Rules
Ramsin et al. (2008, p55) state that requirement priorities are specified as
“must haves”, “should haves”, “could haves” and “won’t haves”: essential,
important, not essential and not present in the project respectively.
The user stories established in 3.3 were analysed using MoSCoW rules as follows:
• The system MUST be able to upload user feedback to an XML database

• The system MUST enable users to view individual feedback they have
uploaded and previous user feedback, data that is in XML format and
styled using XSL
• The system SHOULD offer user ability to upload experience and tips to
an XML database
• The system COULD be able to provide user registration
• The system COULD enable users to login
• The system WOULD enable users to listen to MP3’s uploaded

From the above analysis it was possible to see that the system MUST possess
three of the original 6 user stories in order for the project to be a success; it was
considered feasible that these user stories could be completed within the time
scale of the project. The revised user stories completed were therefore 3.3.3, 3.3.4
and 3.3.5 (refer to 3.1). The SHOULD user stories were perceived to be
requirements that could be satisfied within the specified timescale, and the
COULD and WOULD user stories were perceived to be requirements that could
not be satisfied within the project timescale; these are areas of future work. The
three MUST user stories were developed in the functional iteration model stage of
the DSDM product lifecycle (refer to appendix for DSDM project plan). In
identifying the user stories that were essential to the project and incorporated in
the functional iteration stage, the design and implementation process began.

4. Artefact Design and Construction
4.1 Prototypes
To gain an idea of what was to be designed it was essential to identify the
processes that the user went through in satisfying the requirements of the
system. The user requirements and user stories (3.1, 3.3) state what is
required of the system; to gain an interactive perspective of what was
required, a number of design processes were completed, including:
• Low fidelity prototype – sketches or throw away depictions of the system
interface
• Medium fidelity prototype – interactive depictions of the system interface
and how it operates

• Vertical prototype – a functional representation of a small part of the
system

4.1.1 Low Fidelity Prototype
After creating the user stories, mock-ups (sketches) of the system interface
were produced in order to create a general perspective of the interface. The
low fidelity prototype was merely a means to gain an initial idea of what the
system interface would look like. Figure 20 shows a mock-up design of the home
page (index.php):

Figure 20: Low Fidelity Prototype

According to the DSDM model the usability of the system is tested at this stage of
the development process; the mock-ups were created in tandem with the user
ascertaining exactly what was required of the system interface. This prototype was
developed further in the next phase of prototyping.

4.1.2 Medium Fidelity Prototype
The low fidelity prototype gave an idea of how the interface would look; a
medium fidelity prototype was used to show how each page was going to be
accessed and how to manoeuvre between them, giving a general depiction of what
was required of the system. Figure 21 is a representation of the medium
fidelity prototype; the medium fidelity code can be found in the
‘medium_fidelity’ folder on the CDROM in the appendix.

Figure 21: Medium Fidelity

4.1.3 Vertical Prototype
A vertical prototype is a representation of a small piece of system
functionality. The vertical prototype is essentially a medium fidelity
prototype that has a specific piece of functionality; in relation to this
assignment the vertical prototype satisfies requirements 3 and 4 using a back
end database and PHP to connect to, access and manipulate the required data.
Prior to the implementation of the vertical prototype it was necessary to
produce a back end database on a server. A MySQL database containing one table
(medium) was created on the TVU Apache server.

Requirement 3 involves user ability to upload data to the forum and requirement 4
enables users to see the data they have uploaded to the forum on request. The
following diagrams show the processes the user goes through in satisfying both
requirements 3 and 4. The code can be found in the appendix in the
‘vertical_prototype’ folder on the CDROM.

Step 1: Insert data

Figure 22: Details

Step 2: View and submit data

Figure 23: Entry

Step 3: Retrieve Entry

Figure 24: Prototype Return

The aim of the project is to investigate the use of XML in a DJ database
system, yet the prototypes only investigated the development of the system
interface, the uploading of data to a database and the accessing and
manipulating of data using PHP. To gain an idea of how XML worked and how to
format an XML document, a template XML document and XSL stylesheet were
produced. The XML document and XSL stylesheet can be found in the
‘xml_template’ folder on the CDROM in the appendix. Figure 25 displays the XML
document styled by the XSL stylesheet.

4.2 XML Experiment

Figure 25: XML Experiment

4.5 Technology Used
Having investigated XML and how it is formatted using XSL, the means of storing
and manipulating the XML document was established. PHP DOM was used to
manipulate the XML document, and it was decided that an XML-enabled database
would be used to store the specified XML documents in BLOB form, i.e. the whole
XML document is stored as one row in a relational table in a MySQL database.
Due to problems with the Apache server at TVU (refer to 6.4) a server was set
up remotely; WampServer 2.0 was utilised to host all PHP and XML documents,
with the XML documents stored in an XML-enabled MySQL database. After
investigating the technology that was to be used, a proposed database schema
was developed.
4.6 Database Schema
A UML class diagram was created to represent how the tables within the database
were formed. It was created in relation to the user requirements formulated in
section 3. Figure 26 is the database schema created:

Figure 26: Database Schema

A code walkthrough of how the main parts of the system function now follows.
4.7 Code Walkthrough
The main functions of the system are to create an XML document using PHP
DOM, load the XML document into an XML-enabled database and retrieve the
XML document from the XML-enabled database.

4.7.1 XML PHP DOM
Refer to the folder ‘revolt_final’ on the CDROM in the appendix for all of the
following code excerpts.
The user inserts the relevant data into the fields found on revolt_index.php
(‘reviewer’, ‘regarding’, ‘month’ and ‘feedback’) and the page calls
‘revolt_insert.php’ on submission of the form:

<form name="input" method="post" action="revolt_insert.php">
<div class="subheading">
<p>Reviewer</p>
</div>
<p><input type="text" name="reviewer"/></p>
<div class="subheading">
<p>Regarding</p>
</div>
<p><input type="text" name="regarding"/></p>
<div class="subheading">
<p>Date</p>
</div>
<p><input type="text" name="month"/></p>
<div class="subheading">
<p>Comment</p>
</div>
<p><textarea name="feedback" rows="5" cols="25"></textarea></p>
<p><input type="submit" value="Submit"/></p>
</form>

Figure 27: XML Inputs

On submission of the form the page revolt_insert.php is called. On this page the
PHP document object model (DOM) is created using the following code:
<?php

// A DOM document is created...

$xmldoc = new DOMDocument();

// The XML document is loaded. Calling load() a second time on the same object
// would replace the XML with the stylesheet, so revolt_final.xsl is not loaded
// into $xmldoc here...

$xmldoc->load('revolt_final.xml');

// ‘reviewer’ is read from the form posted by revolt_index.php (Figure 27)...

$newAct = $_POST['reviewer'];

// $root holds the first child node of the document, i.e. the root element of
// revolt_final.xml when nothing precedes it...

$root = $xmldoc->firstChild;

// An empty element node named ‘reviewer’ is created in $xmldoc...

$newElement = $xmldoc->createElement('reviewer');

// The new ‘reviewer’ element is appended as the last child of $root...

$root->appendChild($newElement);

// A new text node is created containing the value of ‘reviewer’ posted in
// $newAct...

$newText = $xmldoc->createTextNode($newAct);

// The text node is then added to the new ‘reviewer’ element...

$newElement->appendChild($newText);

// This process is repeated for the other three input values ‘regarding’,
// ‘month’ and ‘feedback’ found in Figure 27.

// The XML document is then saved...

$xmldoc->save('revolt_final.xml');

// And the user is sent to the page revolt_return.php

header('Location: revolt_return.php');
?>
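
The same create-element, create-text-node, append pattern can be shown as a
compact, self-contained sketch. It is written in Python with the standard
library module xml.dom.minidom rather than PHP DOM, purely for illustration;
the add_feedback helper and the sample values are hypothetical and are not part
of the Revolt code on the CDROM.

```python
from xml.dom import minidom


def add_feedback(xml_text, fields):
    """Append one child element per (name, value) pair to the root element."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    for name, value in fields.items():
        element = doc.createElement(name)                # e.g. <reviewer/>
        element.appendChild(doc.createTextNode(value))   # text posted by the user
        root.appendChild(element)                        # becomes the last child
    return root.toxml()


updated = add_feedback(
    "<feedback/>",
    {"reviewer": "DJ A", "regarding": "Event", "month": "May"},
)
print(updated)
```

Because the value is added through a text node rather than by string
concatenation, characters such as "<" in a user's comment are escaped when the
document is serialised, so the stored XML stays well formed.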
The DOM document is created and the user is sent to the page revolt_return.php.
At this page the user will be able to see the DOM document reloaded;
revolt_final.xml is reloaded as follows:

<?php

// A DOM document is created in the variable $xmldoc and revolt_final.xml is
// loaded into it, ignoring blank (whitespace-only) nodes...

$xmldoc = new DOMDocument();
$xmldoc->load("revolt_final.xml", LIBXML_NOBLANKS);

// $name holds the first child element of the root of ‘revolt_final.xml’...

$name = $xmldoc->firstChild->firstChild;

// While $name is not empty...

while ($name != null) {

    // the text of the current element is returned using the textContent
    // DOM property, followed by a subheading...

    echo $name->textContent . '<div class="subheading"><p>Entry</p></div>';

    // and the next element of the XML document is selected using the
    // nextSibling property...

    $name = $name->nextSibling;
}
?>
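
The firstChild/nextSibling traversal can be sketched in a self-contained form
as follows. The example uses Python's xml.dom.minidom instead of PHP DOM and is
illustrative only; text_of and entries are hypothetical helper names. minidom
has no textContent property and no equivalent of LIBXML_NOBLANKS, so the sketch
gathers text itself and skips whitespace-only text nodes explicitly.

```python
from xml.dom import minidom


def text_of(node):
    """Concatenate all text beneath a node (stand-in for textContent)."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def entries(xml_text):
    """Walk the children of the root element using firstChild/nextSibling."""
    doc = minidom.parseString(xml_text)
    node = doc.documentElement.firstChild
    found = []
    while node is not None:
        if node.nodeType == node.ELEMENT_NODE:  # skip whitespace text nodes
            found.append((node.tagName, text_of(node)))
        node = node.nextSibling
    return found


sample = "<feedback>\n  <reviewer>DJ A</reviewer>\n  <month>May</month>\n</feedback>"
for tag, text in entries(sample):
    print(tag, text)
```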

4.7.2 XML Document Storage
After displaying the XML document using the PHP DOM method, the document
is inserted into the XML-enabled database using a submit button. On the clicking
of the submit button found on revolt_return.php, update.php is called and the
XML document is uploaded using the following code:

<?php

// Connection to the database is accomplished using the regular PHP MySQL
// connection method. Connect to the server...

$con = mysql_connect("localhost", "thomr", "thomr1981");
if ( ! $con )
die("Cannot connect to server!");

// Connect to the database revolt...

$db = mysql_select_db("revolt");
if ( ! $db )
die ("Cannot connect to database!".mysql_error());

// Create a variable containing an INSERT SQL statement that inserts the data
// into the revolt_feedback table in the revolt database. The revolt_final.xml
// document is read using the MySQL LOAD_FILE() function...

$sql = "INSERT INTO revolt_feedback VALUES
(NULL,
LOAD_FILE('c:/wamp/www/revolt/final/revolt_final.xml'))";

// Execute the query, $sql...

$result = mysql_query($sql);
if ( ! $result )
die ("Cannot execute query!".mysql_error());

// Send the user to display.php where the XML document will be displayed...

header('Location:display.php');

?>
At display.php the XML document is retrieved by using a SELECT SQL
statement. Connection is achieved via the same means as above and the statement
is as follows:

$sql = "SELECT * FROM revolt_feedback";

Because the XML document is stored as a single row in revolt_feedback, the
document is returned as a row element. The query is executed and its result set
is held in the variable $result; the variable $row then holds each row fetched
from $result...

$result = mysql_query($sql);
$row = mysql_fetch_row($result);
while ( $row != null )
{
echo "<div class=\"subheading\"><p>Feedback Id # $row[0]
</p></div> ";

The XML document contained within $row is then echoed by PHP and displayed...

echo "<tr> <td> $row[1] </td> </tr>\n";
$row = mysql_fetch_row($result);
}
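
The whole-document storage approach (one XML document per row, retrieved with a
plain SELECT) can be illustrated with the following minimal sketch. It uses
Python and an in-memory SQLite database as stand-ins for PHP and MySQL. The
table name revolt_feedback comes from the walkthrough above, while the column
names id and doc and the two helper functions are assumptions made for the
example.

```python
import sqlite3


def store_document(conn, xml_text):
    """Insert the whole XML document as one row and return its generated id."""
    cur = conn.execute(
        "INSERT INTO revolt_feedback (doc) VALUES (?)", (xml_text,)
    )
    conn.commit()
    return cur.lastrowid


def all_documents(conn):
    """Return every stored document, unshredded, in insertion order."""
    return conn.execute(
        "SELECT id, doc FROM revolt_feedback ORDER BY id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revolt_feedback (id INTEGER PRIMARY KEY AUTOINCREMENT, doc TEXT)"
)
first = store_document(conn, "<feedback><reviewer>DJ A</reviewer></feedback>")
second = store_document(conn, "<feedback><reviewer>DJ B</reviewer></feedback>")

for doc_id, doc in all_documents(conn):
    print(f"Feedback Id # {doc_id}: {doc}")
```

Storing the document whole keeps insertion and retrieval simple, at the cost
that individual elements cannot be searched with SQL unless the document is
parsed again after it has been fetched.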
This method has been used in exactly the same manner when satisfying the
uploading of experience and tips to the Revolt DJ Forum; the inputs that have
been posted using PHP however are different. The final artefact can be found on a
remote server and the source code can be found in the ‘final’ folder on the
CDROM in the appendix.
In adherence to the DSDM framework upon which the project was based, having
created the artefact it was necessary to test what had been produced as part of the
iterative nature of the framework.

4.8 Artefact Testing
In a DSDM development project, testing is completed at each iteration, i.e. if
something is changed then the change is tested in tandem with the specified
user according to their needs. The following testing methods were applied
after the final iteration of the project, when the artefact was deemed
complete.
Three forms of testing were implemented to test the usability of the system
produced. The tests were created in order to assess whether or not the artefact
satisfied the user requirements and stories. Firstly, a test case was produced
in which the user was asked to enter a number of inputs into the system and
describe the output that the system returned. Secondly, a user acceptance test
was produced, again involving a number of user inputs and the recording of the
output returned; these first two tests were used as part of a verification
process.

4.8.1 Test Case
Figure 28 is an example of a user input to the system and the required response
format.
Input 6
Click ‘Forum’ on the Welcome page. Fill in the fields apparent and click
‘Submit’. Are the details you just entered present on the page? Tick Yes or No
Result:

Yes No

Figure 28: Test Case

4.8.2 User Acceptance Test
Figure 29 is an example of 1 specific user acceptance test. A number of
acceptance tests were performed testing each particular user requirement.

1. Register Details

Purpose               Register personal details
Prerequisite          User must have visited the Revolt DJ Forum
Test Data             Firstname {Enter fname into login table}
                      Lastname {Enter lname into login table}
                      Username {Enter username into login table}
                      Password {Enter password into login table}
                      Alias {Enter alias into login table}
                      Email {Enter email into login table}
Steps                 1. User visits Revolt DJ Forum
                      2. User clicks ‘Register’
                      3. User enters required data into text boxes
                      4. User clicks ‘Submit’
                      5. User sees login page
Notes and Questions

Figure 29: User Acceptance Test

Having investigated semi-structured data and XML and XML storage methods,
elicited what was required of the system, developed user stories, created an
artefact and tested what was produced, a discussion of the findings is
required.

5. Project Assessment
The purpose of the project was to investigate the use of XML in a DJ database
system. In investigating XML a number of requirements were elicited involving
storing and manipulating XML documents in an XML-enabled database. Using the
results gathered from the tests performed in 4.8, it is possible to assess
whether or not the requirements of the XML database system were satisfied,
essentially establishing whether or not the artefact, and in turn the project,
was a success. Each requirement will now be analysed to determine whether or
not it was satisfied by the artefact.

5.1 Requirements Analysis
1. The user shall be able to register their personal details and store them in an
XML database

1.2 Details to be registered:
• Firstname
• Lastname
• Username
• Password
• Alias
• Email Address

Result: Requirement satisfied

Rationale: The user can register details and create an account. The results of
input 1 of the test case show that a user cannot log in to the system unless
their details have been registered in the database. Clicking ‘Login’ without
registering details returns the following error message:

You have entered Incorrect details, please try again.
please try to login again.Click here to retry

2.1 The user shall be able to login to the web site
2.2 Details to be entered will be:
• Username
• Password
• DJ alias
Rationale: The user can log in to the system. The result of input 5 of the test
case displays the user's ‘Alias’ on the welcome page, thereby proving that the
user session has been created and the user has logged in.

DJ ‘Alias’
displayed

Figure 30: Alias Session

3.1 The user shall be able to upload feedback to the web site using a stored XML
document
3.2 Feedback to be stored:
• Name
• Regarding
• Month
• Comment
3.3 Name: the name of the user using the DJ forum
3.4 Regarding: what the comment being uploaded relates to, e.g. an event or
MP3

3.5 Month: when the feedback was uploaded
3.6 Comment: comment user wishes to add
Rationale: The results of user acceptance tests 3 and 4 verify that the
requirement is satisfied. Input 4 of the test case asks for the feedback Id #
range. Figure 31 displays the original Id # range.

Figure 31: Entry

The range returned by all tests is 1 to 2. On completion of Input 6 as follows...

Figure 32: Entry

On clicking ‘Submit’ the entries are displayed...

Figure 33: Entry

On clicking ‘Update’...

Feedback Id # 3

Figure 34: Feedback Id

The entries are added to the XML-enabled database. The result of Input 9 shows
that the user entry has been added and that the feedback Id # is 3.

4.1 The user shall be able to view the feedback they have uploaded via a feedback
link in the web site (XML document returned and styled using XSL)
4.2 The user shall be able to view all feedback uploaded to the web site
Rationale: Refer to requirement analysis 3 for proof of satisfaction
5. The user shall be able to upload tips and experience to the web site, storing the
XML data in an XML database
Rationale: The results of user acceptance test 5 return the following:
After steps 1 to 3 user inserts data (Step 4)...

Figure 35: Entry

User clicks ‘Submit’ and sees data entered (Steps 5 and 6)...

Figure 36: Data Displayed

User completes steps 7 to 9 and the following is displayed...

Entry inserted into
XML database

Figure 37: Data Inserted

6. The user shall be able to listen to MP3 samples accessed via a web link
Result: Requirement partially satisfied
Rationale: The user acceptance results for test 6 reveal that two of the
samples can be listened to; the other two are not MP3 files.

5.2 Failed Requirement
When assessing the requirements in relation to the MoSCoW rules formulated in
3.4, it was possible to see that all of the requirements, regardless of
priority, had been implemented successfully. However, one issue was raised
regarding how the XML document was returned. For both requirements 4 and 5 it
was required that the XML document would be returned and styled using an XSL
stylesheet. This mini requirement was not satisfied because the whole XML
document was stored in one row of a relational table and returned as one row.
Attempts were made to open an XSL stylesheet (revolt.xsl) using the XML DOM
load method, but they were not successful. Figure 38 displays how the XML
document was returned for the feedback requirement (4).

XML document is not
styled; the whole
document is returned
as one row of data

Figure 38: Unformatted XML

This was reaffirmed by the heuristic evaluations performed when testing the
system. A number of evaluations indicated that the system was usable and
aesthetically pleasing apart from this particular requirement; the data entries
returned were not aesthetically pleasing (refer to appendix for sample heuristic
evaluation).
This particular mini requirement could have been satisfied if the XML document
had been stored in its original hierarchical structure in an NXD or shredded
into an XML-enabled database with schema and index reference. However, problems
were experienced in both of these areas and required project scope alterations
as a result.
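
As an illustration of what formatting the returned document involves, the
sketch below parses a feedback document held as a single string (as it would be
when read back from one row of a table) and renders each entry as an HTML table
row. It is written in Python with the standard xml.etree.ElementTree module, not
the PHP DOM and XSL used in the artefact, so it is a substituted technique and
not the revolt.xsl approach that was attempted. The element names (feedback,
entry, name, regarding, month, comment) and the sample values are assumptions
based on the fields listed in requirement 3.2.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical document, as it might be read back from a single table row.
FEEDBACK_XML = """<feedback>
  <entry id="1">
    <name>Tom</name>
    <regarding>Event</regarding>
    <month>May</month>
    <comment>Great night</comment>
  </entry>
  <entry id="2">
    <name>Sam</name>
    <regarding>MP3</regarding>
    <month>April</month>
    <comment>Good mix &amp; sound</comment>
  </entry>
</feedback>"""

# Assumed field names, taken from the feedback fields in requirement 3.2.
FIELDS = ("name", "regarding", "month", "comment")


def render_feedback(xml_text):
    """Return an HTML table with one row per <entry> element."""
    root = ET.fromstring(xml_text)
    lines = ["<table>"]
    lines.append("<tr>" + "".join(f"<th>{f.title()}</th>" for f in FIELDS) + "</tr>")
    for entry in root.findall("entry"):
        # findtext returns the default ("") when a child element is missing.
        cells = (escape(entry.findtext(f, default="")) for f in FIELDS)
        lines.append("<tr>" + "".join(f"<td>{c}</td>" for c in cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_feedback(FEEDBACK_XML))
```

The point of the sketch is only that a document returned as one string can
still be parsed and laid out element by element; whether the same would have
worked within the Revolt application was not tested.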

5.3 Effectiveness of Method Used
The method utilised was XP coupled with the DSDM framework. The DSDM
framework relies on a feasibility study followed by iterative development stages
in which aspects of the system are developed, tested and then developed and
tested further until the system is ready to be implemented. Each phase of the
DSDM framework relies on constant interaction with the user of the system in
order to ascertain exactly what they require. Adhering to the framework proved
difficult because coding of the proposed system commenced as soon as the
project was initiated. As a result a number of elements of the framework had
to be reverse engineered, e.g. the MoSCoW rules were reverse engineered because
all of the requirements were completed in unison and not in order of
importance.

In planning the project according to the DSDM framework, it would have been
beneficial to implement timeboxes to aid in the completion of the various stages
of the project, i.e. to define how long it should take to complete each
iteration or finish the feasibility study. When using timeboxes, one timebox can
be set up covering the whole duration of the project, and within it smaller
timeboxes are set up defining when each element should be completed (Ramsin
et al. 2008). Each timebox normally lasts about two to six weeks, and they aid in
defining start and end dates as well as setting boundaries and sign-off periods
for iterations (Ramsin et al. 2008).
Timeboxes were not used because it was decided that having boundaries for each
element of the development process and initiating sign-offs might put pressure
on other areas of the project. It was believed that finishing a specific area
before starting another was the better method to employ. In some cases less
important areas were put on hold or eliminated because more pressing issues
arose, e.g. PICTIVE design sessions were not finished because a medium-fidelity
prototype was perceived as a better method of defining interface design and
orientation.
Testing in iterative phases was non-existent because the coding process followed
a trial-and-error approach; if something was wrong then it was fixed without
the user at hand. Very little testing of each phase of the system took place
because user interaction was at a minimum. Testing took place once the system
had been implemented, as a means of ascertaining whether or not the
requirements of the system had been satisfied.
Also, a feasibility study was not undertaken prior to design and implementation;
the design phase began immediately. It would have been beneficial to conduct a
feasibility study to assess the ease of installing an NXD and an XML-enabled
database and of manipulating an XML document. Had the feasibility of what was
proposed been assessed, the problems encountered when attempting to install an
NXD (refer to 6) could have been avoided and more time could have been spent
investigating XML-enabled databases and the configuration procedures involved.
However, the prototyping phase proved very useful in depicting both the user
interface (medium-fidelity prototype) and system functionality (vertical
prototype). The user interface was experimented with and an aesthetically
pleasing interface was created using the low- and medium-fidelity prototypes.
System functionality was ascertained using the vertical prototype, which proved
to be an extremely beneficial element of the DSDM framework.
In hindsight, the XP/DSDM methodology employed was useful because it
provided general guidelines upon which the software development project could
rely. Although a number of processes were not completed, those that were
completed assisted in the creation and completion of the DJ database system,
namely the gathering of requirements, user stories, prototyping and user
acceptance testing post implementation.
In completing the project in the specified timeframe a number of alterations were
made. The following section details the problems encountered and the alterations
implemented as a result.

6. Project Alterations
A number of problems were encountered in completing the project, largely
related to experimenting with different NXDs and XML-enabled databases. As a
result a number of alterations were made to the overall scope of the project.

6.1 NXD Installation Issues
Following the investigation of NXDs and XML-enabled databases in section 2,
the artefact was originally intended to contain both an NXD and an XML-enabled
database: an XML-enabled database for requirements 3 and 4 and an NXD for
requirement 5. The two implementations would then have been tested and
compared to ascertain which database was more efficient based on query speed,
similar to the tests mentioned in the literature review. Due to configuration
issues with NXD implementation it was decided to implement only an XML-enabled
database for the XML documents to be stored in. Attempts were made to implement
two NXDs: eXist and Oracle Berkeley DBXML.

6.2 eXist
It was possible to install the eXist NXD, but issues became apparent when
attempting to open the database; an account could be created but access to the
database was not granted. Figure 39 displays the problem that was encountered
when attempting to access the eXist NXD.

Figure 39: Error

Because of the access problems experienced with eXist, it was decided to try a
second NXD: Oracle Berkeley DBXML (BDBXML).

6.3 BDBXML
With this NXD it was possible to install the database, create XML documents,
insert the required data and query the document created using an MS-DOS
command prompt, but issues arose when attempting to access and manipulate the
XML document using PHP and display the query results on the specified PHP Web
page. Figures 40 to 42 show an XML document being created and queried in
BDBXML.
Create document:

Figure 40: DBXML Create

Return document:

Figure 41: DBXML Select

Query document:

Figure 42: DBXML Query

When attempting to open the document revolt.dbxml using the following PHP
(test.php) script...
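<?php
$mgr = new XmlManager(null);
// Open the container if it already exists, otherwise create it.
// In both branches the returned container is kept in $con.
if (file_exists("revolt.dbxml")) {
    $con = $mgr->openContainer("revolt.dbxml");
} else {
    $con = $mgr->createContainer("revolt.dbxml");
}
// $book_name and $book_content are assumed to be set elsewhere in the script.
$con->putDocument($book_name, $book_content);
?>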
The following error message was returned:

Fatal error: Class 'XmlManager' not found in C:\wamp\www\test.php on line 3

This error arose because the required BDBXML PHP extension was not available
to be enabled in the php.ini file and would have had to be built using an IDE
such as MS Visual Studio. The point of the project was to investigate the use of
XML in a DJ database system and it was perceived that the installation issues
experienced were detracting from the real essence of the project; it was
therefore decided to alter the concept of the project and settle upon one XML
database. An XML-enabled database was chosen purely because of the ability to
store and retrieve an XML document from it without installation problems.

6.4 Oracle, PHP and XML Issues
Prior to deciding to utilise the BLOB format of a MySQL database to hold the
whole XML document in one row of a relational table, investigation was made into
employing the XMLType data type of an Oracle database to store an XML
document and schema. Having been unable to install and configure an NXD
successfully, it was decided to attempt to utilise an Oracle database and the
XSU (XML SQL Utility) method in order to form a basis for comparison between
two XML-enabled databases: one that stored the data as a whole document in one
row (MySQL and its BLOB type), and one that shredded each data element into a
different row of a relational table (Oracle XSU). Initially, it was possible to
store an XML document in an Oracle database that TVU utilised, but problems
arose when attempting to access and manipulate the XML documents using PHP.
The oci8 extension that PHP requires to access an Oracle database via the
oci_connect function was not available on the Apache server at TVU. Even after
installing a remote Oracle database and creating and inserting XML documents
manually, issues still arose when attempting to access and manipulate the XML
document using PHP in the Revolt Web application. The following error was
returned on execution of an Oracle test page:

Fatal error: Call to undefined function oci_connect() in
C:\wamp\www\oracle_test.php on line 4

This problem again arose due to issues with the oci8 extension; attempts to
reconfigure the php.ini file to enable the extension and to set the related
PATH environment variable were unsuccessful. It was at this stage that it was
decided to utilise only the MySQL BLOB type, which enabled the storage of an
entire XML document in one row of the table. This was purely because it was the
only method that proved successful as regards XML document manipulation via PHP
DOM and storage of the document created.
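
The two storage layouts compared in this section can be sketched as follows.
The example uses Python's built-in sqlite3 module purely as a stand-in for
MySQL and Oracle, so it shows only the shape of the tables and not the BLOB or
XSU features of either product. The table names, column names and sample
document are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical feedback document.
DOC = (
    "<feedback>"
    "<entry><name>Tom</name><month>May</month><comment>Great night</comment></entry>"
    "<entry><name>Sam</name><month>April</month><comment>Good mix</comment></entry>"
    "</feedback>"
)


def store_whole(conn, name, xml_text):
    """Layout 1: the entire document goes into one row as a BLOB."""
    conn.execute("CREATE TABLE IF NOT EXISTS xml_docs (name TEXT PRIMARY KEY, doc BLOB)")
    conn.execute("INSERT INTO xml_docs VALUES (?, ?)", (name, xml_text.encode("utf-8")))


def store_shredded(conn, xml_text):
    """Layout 2: each leaf element of each entry becomes its own row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xml_nodes (entry_no INTEGER, element TEXT, value TEXT)"
    )
    root = ET.fromstring(xml_text)
    for entry_no, entry in enumerate(root.findall("entry"), start=1):
        for child in entry:
            conn.execute(
                "INSERT INTO xml_nodes VALUES (?, ?, ?)", (entry_no, child.tag, child.text)
            )


def comments_from_whole(conn, name):
    """The single row must be fetched and parsed before any element can be read."""
    (blob,) = conn.execute("SELECT doc FROM xml_docs WHERE name = ?", (name,)).fetchone()
    root = ET.fromstring(blob.decode("utf-8"))
    return [e.findtext("comment") for e in root.findall("entry")]


def comments_from_shredded(conn):
    """Individual elements can be selected directly with SQL."""
    rows = conn.execute(
        "SELECT value FROM xml_nodes WHERE element = 'comment' ORDER BY entry_no"
    )
    return [value for (value,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_whole(conn, "revolt_feedback.xml", DOC)
    store_shredded(conn, DOC)
    print(comments_from_whole(conn, "revolt_feedback.xml"))
    print(comments_from_shredded(conn))
```

In the first layout the database sees only an opaque value, so any selection
of elements happens in application code after the document is parsed; in the
second, the database itself can filter on element names.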

6.5 Data Structure
The data being accessed, stored and manipulated was not semi-structured; it was
in fact structured. Section 2.2 describes how semi-structured data appears, and
it was apparent that the data being gathered by the DJ forum was not in that
form. Semi-structured data has an irregular structure, whereas the data in the
artefact was structured.

Figure 43: Structured Data Format

This data can be represented as...
<review>
  <name></name>
  <reviewer></reviewer>
  <date></date>
  <comment></comment>
</review>

The original intention of the artefact was to collect more extensive feedback in
relation to the event that the customer had attended. It was originally proposed
for the feedback requirement that data relating to music genre, venue
description, how easy the venue was to find, quality of mixing and preferred
music would be captured and stored in the XML database. A varying set of data
would have been captured: some comments long, some short and not all of them
relevant to the actual DJ event. If this data had been stored in a relational
database, a classic example of sparse data would have resulted (refer to Figure
7, page 14); such data is well suited to XML because XML does not impose set
boundaries on structure. The intended structure could have been like the
following, and therefore semi-structured:
<review>
  <name></name>
  <reviewer></reviewer>
  <comment>
    <genre></genre>
    <venue>
      <venue_description></venue_description>
      <find></find>
    </venue>
  </comment>
</review>

Figure 44: Intended Structure
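
To make the contrast with relational storage concrete, the following Python
sketch flattens two reviews with differing optional elements into a fixed set of
columns. The empty cells (None) show the kind of sparse rows a relational table
would hold, whereas each XML review carries only the elements it actually has.
The element names follow Figure 44, with the venue description written as
venue_description because XML element names cannot contain spaces; the mixing
element and all sample values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical reviews: each one supplies a different subset of elements.
REVIEWS_XML = """<reviews>
  <review>
    <name>Club night</name>
    <reviewer>Tom</reviewer>
    <comment>
      <genre>House</genre>
      <venue>
        <venue_description>Small basement bar</venue_description>
        <find>Easy</find>
      </venue>
    </comment>
  </review>
  <review>
    <name>Warehouse set</name>
    <reviewer>Sam</reviewer>
    <comment>
      <mixing>Smooth</mixing>
    </comment>
  </review>
</reviews>"""

# Fixed columns a relational table would need, mapped to paths inside <review>.
COLUMNS = {
    "name": "name",
    "reviewer": "reviewer",
    "genre": "comment/genre",
    "venue_description": "comment/venue/venue_description",
    "find": "comment/venue/find",
    "mixing": "comment/mixing",
}


def to_rows(xml_text):
    """Flatten each review into one dict per row; absent elements become None."""
    root = ET.fromstring(xml_text)
    return [
        {col: review.findtext(path) for col, path in COLUMNS.items()}
        for review in root.findall("review")
    ]


def sparsity(rows):
    """Fraction of cells that are empty across all rows."""
    cells = [value for row in rows for value in row.values()]
    return sum(value is None for value in cells) / len(cells)


if __name__ == "__main__":
    rows = to_rows(REVIEWS_XML)
    for row in rows:
        print(row)
    print(f"empty cells: {sparsity(rows):.0%}")
```

With these two sample reviews, four of the twelve cells are empty; every new
optional element would add a further column that most rows leave unused.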

Unfortunately, there were serious delays in starting the project due to the
inability of TVU to make PHP with Oracle (the preferred technology) available.
Eventually a separate server, WampServer, had to be set up and, because of the
time constraints, the application had to be simplified; a simplified version was
implemented to test the application.

7. Conclusion
Through investigating the use of XML in a DJ database system, XML and
XML storage methods have been assessed, and a DJ database system has been
implemented loosely utilising the XP/DSDM software development methodology.
Due to the inability to install and in turn utilise an NXD, a comparison could
not be made between an NXD and an XML-enabled database as originally intended.
Efforts were made to install two NXDs (BDBXML and eXist) so that a comparison
could be made, but they proved unsuccessful.
In the literature review it is mentioned that XML-enabled databases are more
reliable because they are tried and tested and are the standard for data storage
(Connolly et al., 2006). Pardede et al. (2005) reiterate the limitations of
NXDs, stating that they are in their “infancy” and that XML-enabled databases
are preferred because of the mature RDBMS foundation they are based on. Given
the problems experienced in installing two NXDs and in configuring an NXD to
use PHP, which prevented any experimentation with NXDs, it is possible to agree
with this argument. The MySQL database used in the DJ database system did not
have to be specially configured to work with the PHP scripting language; an XML
document was simply stored as a BLOB and was accessed and manipulated
accordingly using SQL statements, although it was not styled using XSL on
return.
Conversely, only the scripting language PHP was investigated, and this is not
the basis for a solid argument because other programming and query languages,
e.g. Java and XPath, could be employed instead of PHP and SQL to access and
manipulate the XML document; PHP and SQL were used merely because of
familiarity with the two languages. Also, if a proper feasibility study had been
performed prior to the design and build phase, further investigation could have
been completed into how NXDs are installed and extra time could have been set
aside to install and configure the desired NXD. Without a basis for comparison
between the two originally required XML databases, NXDs cannot be judged.
Reference can be made to the experiment performed by Serna and Gerrikagoitia
(2005) in the literature review, in which they found that in general the eXist
NXD performed better than the XML-enabled Oracle as regards query time, but
this is only their experiment performed under their conditions. Assessment
cannot be made until experiments are carried out first hand.
As a result, a number of areas of improvement, recommendations and future work
can be offered.

7.1 Recommendations
Below are recommendations that could assist in improving the Revolt DJ database
system:
• Investigate the retrieval of an XML document with an XSL stylesheet so
that the XML document returned is formatted in an aesthetic manner
• Look into methods of manipulating XML documents other than PHP DOM
• Investigate how to configure an NXD to utilise PHP so that an NXD can
be investigated
• Investigate the query language XPath instead of relying solely on the
query language SQL to access the specified back-end database
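
As a small illustration of the XPath recommendation, the sketch below queries a
feedback document using the limited XPath subset supported by Python's standard
xml.etree.ElementTree module. Python is used here only for illustration; the
artefact itself used PHP and SQL. The document structure, element names and
values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical feedback document.
FEEDBACK_XML = """<feedback>
  <entry id="1"><name>Tom</name><regarding>Event</regarding><month>May</month></entry>
  <entry id="2"><name>Sam</name><regarding>MP3</regarding><month>April</month></entry>
  <entry id="3"><name>Ana</name><regarding>Event</regarding><month>May</month></entry>
</feedback>"""

root = ET.fromstring(FEEDBACK_XML)


def names_regarding(topic):
    """Names of users whose feedback concerns the given topic."""
    # [regarding='...'] selects entries whose <regarding> child has that text.
    # The topic is interpolated directly, so it is assumed to be trusted input.
    return [e.findtext("name") for e in root.findall(f"./entry[regarding='{topic}']")]


def entry_by_id(entry_id):
    """Look up a single entry by its id attribute, or None if absent."""
    return root.find(f"./entry[@id='{entry_id}']")


if __name__ == "__main__":
    print(names_regarding("Event"))
    print(entry_by_id("2").findtext("month"))
```

The path expressions address elements by their position in the document
hierarchy, which is the kind of access that SQL over a single BLOB column
cannot express.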

7.2 Future Work
Below are some possible areas of future work:
• Implement an NXD for one of the requirements so that a comparison of
the two XML databases can be completed
• Style the XML documents retrieved (revolt_feedback.xml and
experience.xml) by the DJ database system
• Create an entirely XML-based web application, e.g. XML login and XML
registration of details

8. Critique
The following critique is a summary of what has been found in 6. The initial
objective of the assignment was to investigate the use of XML in a DJ database
system, and in doing this an artefact was produced investigating this paradigm.
The objective was to produce a PHP XML Web application that enabled users to
register details, log in, upload feedback and experience to an XML database, and
return the feedback and experience to the Web page using PHP. Six initial
requirements were set up and all of them have been met. However, the
sub-requirement which involved styling the XML document returned was not
completed, and this has been outlined in 5.
The requirements were met, but a proposed comparison between XML-enabled
databases and NXDs in order to ascertain which was the more efficient to employ
was not completed due to installation issues (refer to 6). Also, the data that
was captured by the Web site was not semi-structured (refer to 6) due to
configuration problems, so the overall purpose of the project was somewhat
undermined. However, XML is widely used on the Internet, so an investigation
into how it is used proved beneficial.
Even though a number of problems arose, the overall question and purpose of the
project was realised in that a Web application using XML was produced and the
XML document could be manipulated and stored in an XML database. Working with
XML technology proved difficult, but the production of the application and the
eventual storing and manipulation of the document proved most satisfying.
Further investigation into XML and how it is used on the Web will be of
interest in the future.

9. References
BUNEMAN, P. (1997) Semi-structured data. Tutorial, In: Symposium on
Principles of Database Systems. UPenn DB Engineering. Retrieved March 16th,
2010, from http://www.cis.upenn.edu/~db/abstracts/semistructured.html

SPERBERG-MCQUEEN, C.M. (2005) XML. Queue, 3 (8). October, pp.34-41.

BURLESON, D. (2009) Identifying Oracle sparse tables. Burleson Consulting.
Retrieved March 16th, 2010, from
http://www.dbaoracle.com/t_identify_sparse_tables.htm

WOOD, P. Semi-Structured Data: Example of Semi-Structured Data. Retrieved
14th May, 2010, from http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/notes.html

ROY, J. & RAMANUJAN, A. (2001). XML schema language: taking XML to
the next level. In: IT Professional, 3 (2), pp.37-40.

WORLD WIDE WEB CONSORTIUM (W3C). (2008) Extensible Markup
Language (XML) 1.0 (Fifth Edition) W3C Recommendation. November 26th.
Retrieved March 16th 2010 from http://www.w3.org/TR/xml/

W3 SCHOOLS (W3S). Introduction to XML. Retrieved March 16th, 2010, from
http://www.w3schools.com/xml/xml_whatis.asp

HONG, S. & SONG, Y-T. (2007) Efficient XML query using Relational Data
Model. In: Software Engineering, Artificial Intelligence, Networking, and
Parallel/Distributed Computing. Eighth ACIS International Conference. 3,
pp.1095-1100.

APPLEQUIST, D.K. (2002). XML and SQL: Developing Web Applications.
Addison-Wesley.

SERNA, A. & GERRIKAGOITIA, J-K. (June 28th, 2005) David and Goliath: A
Comparison Of XML-Enabled And Native XML Data Management
Techniques. Retrieved March 25th from http://xml.sys-con.com/node/104980

CHAUDHRI, A.B., RASHID, A., & ZICARI, R. (2003). XML data management:
native XML and XML-enabled database systems.
