Searching a Native XML Website
Using XML Technologies
Alex Sumner
Sophomore, Computer Science Major
Winston-Salem State University
Dr. Mustafa Atay
Associate Professor, Department of Computer Science
Winston-Salem State University
Abstract
XML (Extensible Markup Language) is a standard to represent and exchange data. XML
separates content from its style. The separation of content and formatting should allow
web programmers to come up with efficient document search modules for XML-based
websites. In this research project, we aim to develop a client-side search module for a
native XML website using technologies and standards such as Extensible Stylesheet
Language Transformations (XSLT), XML Path Language (XPath), Document Object
Model (DOM), Regular Expressions and JavaScript.
We have developed JavaScript code for client-side searching, which is placed and invoked
within XSLT code files. The JavaScript code navigates the DOM hierarchy of the
underlying document. We used the string match operation along with regular expressions
for effective and targeted search operations. We used our XML-based website, previously
developed for the WSSU SURE program, as the test bed. Our developed application and
observations showed that a native XML website can be effectively searched using XML
technologies combined with a scripting language.
The method of this research project includes the following steps: (i) Review XML and
other technologies and tools available such as XSLT, XPath and DOM to support
searching a native XML website. (ii) Implement a client-side search utility using the
selected technologies and tools and a scripting language on a test bed XML website. (iii)
Observe and report the applicability, effectiveness and challenges of using XML to
incorporate a client-side search utility over XML websites.
Keywords: XML, XSLT, JavaScript, DOM, client-side search, Regular Expressions
Table of Contents:
Abstract
Introduction
    Background Information
    Problem Statement
Materials and Methods
Research Project
    Basic Search
    Advanced Search
Conclusions
Future Work
Experiences
Acknowledgements
References
1. Introduction
XML is used to store and transport data and is a recommendation of the World Wide Web
Consortium. The focus of this research is to successfully introduce a client-side search
utility on a native XML website. To accomplish this we used XML technologies, such as
XSLT, XPath, DOM, and Regular Expressions, and a scripting language, JavaScript.
1.1 Background Information
1.1.1 XML
XML is a markup language like HTML; however, it is not used the same way as HTML.
XML is used for the storage and transportation of data. It is a W3C (World Wide Web
Consortium) recommendation, that is, a web standard endorsed by the consortium.
XML does not have predefined tags, which allows users to create their own
self-descriptive tags. 1
1.1.2 XSLT
XSLT is a stylesheet language for XML. It converts XML documents into another form
such as HTML or XHTML. This transformation is necessary in order for the document to
be read properly and displayed by the browser. XSLT became a W3C recommendation
in 1999. It is supported by all of the major browsers and has the ability to incorporate
many of the other XML technologies into its code. 2
1.1.3 JavaScript
JavaScript is a popular programming language for the web and is used in most modern
HTML web pages. Along with HTML and CSS, JavaScript is one of the languages that
web developers are expected to know. It has the ability to change, create, delete, and
copy HTML elements. In an HTML page, JavaScript code is placed between the opening
and closing script tags, and the script element itself can be placed within either the
head or the body of the page. 3
1.1.4 XPath
XPath is used to navigate through an XML document. It can be an extremely useful tool
in an XSLT document. By using XPath expressions to navigate through the XML document,
an XSLT document can easily select and transform the data. XPath has also been a W3C
standard since 1999, and it is supported by all major browsers. 4
1.1.5 DOM
The DOM standard has three parts: Core, XML, and HTML. In this research we used the DOM
to help with navigation in our JavaScript code and with accessing specific data in the XSLT
code. This was possible by using node navigation properties and DOM methods such as
firstChild, nextSibling, getElementsByTagName(), and getElementById(). These
properties and methods were used together in the JavaScript code to locate where the
search would take place in the XSLT code. 5
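The navigation properties named above can be illustrated with a minimal sketch. It is not the code used in this project; the function name collectText and the commented browser usage are assumptions made for illustration. The function gathers the text beneath a node by following firstChild and nextSibling links, which is one way a search routine might read the contents of a table cell.

```javascript
// Value of Node.TEXT_NODE in the DOM specification.
const TEXT_NODE = 3;

// Hypothetical helper: return all text found beneath `node`,
// visiting children through firstChild and nextSibling.
function collectText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  let text = "";
  for (let child = node.firstChild; child; child = child.nextSibling) {
    text += collectText(child);
  }
  return text;
}

// Possible use in a browser (assumes the page contains <td> cells):
//   var cells = document.getElementsByTagName("td");
//   for (var i = 0; i < cells.length; i++) {
//     console.log(collectText(cells[i]));
//   }
```

Because the function relies only on nodeType, nodeValue, firstChild, and nextSibling, it should behave the same way on HTML and XML DOM nodes.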
1.1.6 Regular Expression
Regular Expressions are search patterns formed by a series of characters. They are
primarily used for searching text, but can also be used for replacing text. Patterns can
contain single characters, part of a word, or a whole word. Patterns work together
with modifiers. These modifiers are single characters that make the
search case insensitive (i), global (g), or multiline (m).
Examples:
var patt = /pattern/modifiers;
var search = new RegExp(key, 'i'); 6
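To make the effect of each modifier concrete, here is a short, self-contained sketch. The keyword and sample strings are invented for illustration and are not taken from the test bed website.

```javascript
// Hypothetical keyword, standing in for whatever the user typed.
const key = "atay";

// Literal syntax: the pattern is fixed when the code is written.
const literal = /atay/i;

// Constructor syntax: useful when the pattern is only known at run time.
const dynamic = new RegExp(key, "i");

console.log(literal.test("Dr. Mustafa Atay"));          // true
console.log(dynamic.test("Dr. Mustafa Atay"));          // true
console.log(new RegExp(key).test("Dr. Mustafa Atay"));  // false: no "i", so case matters

// "g" (global) returns every match instead of only the first one.
const allMatches = "XML, xml, Xml".match(/xml/gi);
console.log(allMatches);                                // [ 'XML', 'xml', 'Xml' ]

// "m" (multiline) lets ^ and $ match at the start and end of each line.
const lineStarts = "XML\nXSLT\nDOM".match(/^X\w+/gm);
console.log(lineStarts);                                // [ 'XML', 'XSLT' ]
```

The constructor form is the one that suits a search box, since the pattern has to be built from user input.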
1.1.7 HTML Forms
HTML forms were the only aspect of HTML that we used directly in this research. HTML Forms and
Inputs were used to display the radio buttons, text boxes, checkboxes, submit buttons,
and reset buttons. Each input created one of the listed items for our search module. Each
module is one form. Forms are able to pass data to a server; in our client-side modules, the
form data was used by the JavaScript code to display the results of the search. 7
1.2 Problem Statement
Can a native XML (Extensible Markup Language) website be effectively augmented with
a client-side search utility using XML technologies and a scripting language?
2. Materials and Methods
2.1 Materials & Software
We used the XML website developed for the WSSU SURE program as our test bed
website. We used Notepad++ as our JavaScript program development editor.
2.2 Methods
In our process we first had to review and learn some of the XML technologies and
JavaScript. Next, we applied what we learned to create the client-side search utilities
on our test bed. After creating the search utilities, we debugged the code to make
sure everything worked properly. Finally, we introduced the code to the other pages on
our XML test bed, and resolved issues as time allowed.
3. Research Project
3.1 Basic Search
The first contribution to the website is the Basic Search. This search is exactly as the
name implies. The user is able to search using a keyword from the participants page.
Figure 1: 2014 Participants Page
This keyword could be a series of letters or numbers, the participant’s name, their
discipline, the advisor’s name, or the advisor’s phone number. The search can also
be made case sensitive or case insensitive. The primary concept in the
development of this search utility was to allow the user to search through all of the text
fields using one text box. The code for the basic search is capable of searching through
all four columns of the table using the user’s input.
Figure 2: Basic Search Code
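The idea of searching all four columns from one text box can be sketched as follows. This is a minimal illustration under stated assumptions, not the code shown in the figure: the table is represented as a plain array of rows, the sample data is invented, and the names rows, escapeRegExp, and basicSearch are hypothetical. The keyword is escaped before the pattern is built so that characters such as "." or "(" in the user's input are matched literally.

```javascript
// Hypothetical stand-in for the participants table: one array per row,
// holding participant name, discipline, advisor name, and advisor phone.
const rows = [
  ["Participant A", "Computer Science", "Advisor X", "555-0100"],
  ["Participant B", "Biology", "Advisor Y", "555-0101"],
  ["Participant C", "Chemistry", "Advisor X", "555-0102"],
];

// Escape regular expression metacharacters so the keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the rows in which at least one column contains the keyword.
function basicSearch(tableRows, keyword, caseSensitive) {
  if (keyword === "") {
    return tableRows.slice(); // an empty search box matches every row
  }
  const pattern = new RegExp(escapeRegExp(keyword), caseSensitive ? "" : "i");
  return tableRows.filter((cells) => cells.some((cell) => pattern.test(cell)));
}

console.log(basicSearch(rows, "advisor x", false).length); // 2
console.log(basicSearch(rows, "advisor x", true).length);  // 0
```

In a page, the matching rows would be shown and the others hidden; here the function simply returns the matches so the logic can be checked on its own.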
3.2 Advanced Search
The other search utility that we introduced is the Advanced Search. This search
offers more options than the Basic Search. With this search the user is able to
choose which column they want to search through. There is the option of searching
through the advisor’s name, the participant’s name, or the research discipline. The
user can also search through the three columns at the same time to refine their search.
Once again there is also the option for the search to be case sensitive or insensitive. In
our research the advanced search was our ultimate goal.
Figure 3: Basic Search Results
This search uses several HTML input types: radio buttons, two text boxes, a checkbox, and two
buttons. The radio buttons are in place for the user to search through the discipline in
which the research takes place. The first text box is used to search through the names of
the advisors. The second text box is used to search through the names of the
participants. Like the basic search, the advanced search also contains a checkbox with
the capability of making the search case sensitive or insensitive. The first button is a
submit button used to call the doSearch() function for searching through the table. The
second button is a reset button that calls the Reset() function for resetting the HTML form.
Figure 4: Advanced Search Code
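The way the three criteria combine can be sketched as below. This is an illustrative sketch, not the doSearch() function from the figure: the record layout, the sample data, and the names participants, makeMatcher, and advancedSearch are assumptions. The discipline is compared exactly, as a radio button selection would be, while the two text fields are matched as substrings; a field left empty places no restriction on the results.

```javascript
// Hypothetical records for the three searchable columns.
const participants = [
  { participant: "Participant A", discipline: "Computer Science", advisor: "Advisor X" },
  { participant: "Participant B", discipline: "Biology", advisor: "Advisor Y" },
  { participant: "Participant C", discipline: "Computer Science", advisor: "Advisor Y" },
  { participant: "Participant D", discipline: "Chemistry", advisor: "Advisor X" },
];

// Build a test function for one text box; an empty box matches everything.
function makeMatcher(text, caseSensitive) {
  if (text === "") {
    return () => true;
  }
  const escaped = text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(escaped, caseSensitive ? "" : "i");
  return (value) => pattern.test(value);
}

// Keep only the records that satisfy every criterion that was filled in.
function advancedSearch(records, criteria) {
  const { discipline = "", advisor = "", participant = "", caseSensitive = false } = criteria;
  const advisorMatches = makeMatcher(advisor, caseSensitive);
  const participantMatches = makeMatcher(participant, caseSensitive);
  return records.filter((record) =>
    (discipline === "" || record.discipline === discipline) &&
    advisorMatches(record.advisor) &&
    participantMatches(record.participant));
}

const hits = advancedSearch(participants, { discipline: "Computer Science", advisor: "advisor y" });
console.log(hits.map((record) => record.participant)); // [ 'Participant C' ]
```

Combining the criteria with a logical AND is what lets the user refine a search by filling in more than one field.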
4. Conclusions
Our Basic and Advanced Search modules showed that an XML website can
be successfully augmented with a client-side search utility using JavaScript for effective
searching. We made use of the HTML DOM and simple Regular Expressions in our search
modules. In the future, we plan to extend our research with more complex Regular Expressions
for finer filtering, the XML DOM, and a server-side search utility.
5. Future Work
In the future we would like to enhance the basic search so that it operates on the home,
application, and personnel pages. We also want to enhance the advanced search to
operate on the personnel pages. Finally, we want to introduce a global search to our test
bed.
Figure 5: Advanced Search Results
6. Experiences
While working on this project we came across a multitude of major and minor problems.
The first problem we encountered was trying to link the input from our first text box to the
column containing the advisors’ names. Once we were able to solve this problem, it made
other parts of the project significantly easier. Another problem we faced was creating the
basic search. It seemed like a simple task because of its similarity to the
advanced search; however, it was more difficult than we anticipated. Although we
ran into quite a few problems, I personally gained many positive experiences.
As a result of this research I have gained a better understanding of XML, XSLT,
JavaScript, XPath, and HTML. I have also learned some aspects of DOM and Regular
Expressions. In addition, I have learned how to search through a document using JavaScript
and other XML technologies, which benefits me as a Computer Science major.
7. Acknowledgements
This project is funded and supported by the NSF HBCU-UP Implementation Grant:
Raising Achievement in Mathematics and Science (RAMS) with Award #0927905. It has
also encouraged me to pursue a graduate degree in the future.
8. References
1. “Introduction to XML.” XML Introduction. World Wide Web Consortium, Web. 21
May 2014.
2. “XSLT Tutorial.” XSLT Tutorial. World Wide Web Consortium, Web. 21 May 2014.
3. “JavaScript Tutorial.” JavaScript Tutorial. World Wide Web Consortium, Web. 21
May 2014.
4. “XPath Introduction.” XPath Introduction. World Wide Web Consortium, Web. 22
May 2014.
5. “JavaScript HTML DOM Navigation.” JavaScript HTML DOM Navigation. World
Wide Web Consortium, Web. 22 May 2014.
6. “JavaScript RegExp Object.” JavaScript RegExp Object. World Wide Web
Consortium, Web. 22 May 2014.
7. “HTML Forms and Input.” HTML Forms and Input. World Wide Web Consortium,
Web. 22 May 2014.

SURE Research Report

  • 1. Searching a Native XML Website Using XML Technologies Alex Sumner Sophomore, Computer Science Major Winston-Salem State University Dr. Mustafa Atay Associate Professor, Department of Computer Science Winston-Salem State University
  • 2. 2 Abstract XML (Extensible Markup Language) is a standard to represent and exchange data. XML separates content from its style. The separation of content and formatting should allow web programmers to come up with efficient document search modules for XML-based websites. In this research project, we aim to develop a client-side search module for a native XML website using technologies and standards such as Extensible Stylesheet Language Transformations (XSLT), XML Path Language (XPath), Document Object Model (DOM), Regular Expressions and JavaScript. We have developed JavaScript code for client-side searching which isplaced and invoked within XSLT code files. The JavaScript code navigated the DOM hierarchy of the underlying document. We have used string match operator along with regular expressions for effective and targeted search operations. We used our XML-based website previously developed for WSSU SURE program as the test bed. Our developed application and observations showed that a native XML website can be effectively searched using XML technologies combined with a scripting language. The method of this research project includes the following steps: (i) Review XML and other technologies and tools available such as XSLT, XPath and DOM to support searching a native XML website. (ii) Implement a client-side search utility using the selected technologies and tools and a scripting language on a test bed XML website. (iii) Observe and report the applicability, effectiveness and challenges of using XML to incorporate a client-side search utility over XML websites. Keywords: XML, XSLT, JavaScript, DOM, client-side search, Regular Expressions
  • 3. 3 Table of Contents: Abstract ....................................................................................................2 Introduction...............................................................................................4-6 Background Information.......................................................4-6 Problem Statement...............................................................6 Materials and Methods ............................................................................6-7 Research Project.....................................................................................7-10 Basic Search.........................................................................7-8 Advanced Search.................................................................8-10 Conclusions..............................................................................................10 Future Work..............................................................................................10-11 Experiences .............................................................................................11 Acknowledgements .................................................................................11 References ..............................................................................................12
  • 4. 4 1. Introduction XML is used to store and transport data and is a recommendation of the World Wide Web Consortium. The focus of this research is to successfully introduce a client-side search utility on a native XML website. To accomplish this we used XML technologies, such as XSLT, XPath, DOM, and Regular Expressions, and a scripting language, JavaScript. 1.1 Background Information 1.1.1 XML XML is a markup language like HTML; however, it is not used the same way as HTML. XML is used for the storage and transportation of data. It is a W3C (World Wide Web Consortium) recommendation. This means that it promotes fairness and quality on the web and is a web standard. XML does not have predefined tags which allows the user to create their own, self-descriptive, tags.1 1.1.2 XSLT XSLT is a stylesheet language for XML. It converts XML documents into another form such as HTML or XHTML. This transformation is necessary in order for the document to be read properly and displayed by the browser. XSLT became a W3C recommendation in 1999. It is supported by all of the major browsers and has the ability to incorporate many of the other XML technologies into its code. 2 1.1.3 JavaScript
  • 5. 5 JavaScript is a popular programming language for the web. It is a necessity in all modern HTML web pages. Along with HTML and CSS, JavaScript is one of the languages that all successful web developers must know. It has the ability to change, create, delete, and copy HTML elements. There are specific tags that must be used with JavaScript. JavaScript must be placed between the script tags, and it has to either be placed within the head or body tags. 3 1.1.4 XPath XPath is used to navigate through an XML document. It can be an extremely useful tool in an XSLT document. By using XPath to navigate through the XML document an XSLT document can easily access and modify the data. XPath is also a W3C standard as of 1999 and it is supported by all major browsers. 4 1.1.5 DOM There are three types of DOM; Core, XML, and HTML. In this research we used the DOM to help with our navigation in our JavaScript code and accessing specific data in the XSLT code. This was possible by using Navigation Nodes and DOM Methods such as: firstChild, nextSibling, getElementsByTagName(), and getElementsById(). These methods and nodes were used simultaneously in the JavaScript code to locate where the search would take place in the XSLT code. 5 1.1.6 Regular Expression Regular Expressions are search patterns formed by a series of characters. They are primarily used for searching text, but can also be used for replacing text. Patterns can
contain single characters, part of a word, or a whole word. Patterns work in conjunction with modifiers. These modifiers are single characters that make the search case insensitive (i), global (g), or multiline (m). [6]

Examples:
    var patt = /pattern/modifier;
    var search = new RegExp(key, 'i');

1.1.7 HTML Forms

HTML Forms were the only aspect of HTML that we used in this research. Form inputs were used to display the radio buttons, text boxes, checkboxes, submit buttons, and reset buttons, with each input creating one of these items for our search module. Each module is one form. Although forms are able to pass data to a server, in our modules the form data was used by the JavaScript code to display the results of the search. [7]

1.2 Problem Statement

Can a native XML (Extensible Markup Language) website be effectively augmented with a client-side search utility using XML technologies and a scripting language?

2. Materials and Methods

2.1 Materials & Software

We used the XML website developed for the WSSU SURE program as our test bed website. We used Notepad++ as our editor for JavaScript program development.
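The pattern-and-modifier idea from Section 1.1.6 can be illustrated with a minimal sketch. The helper names escapeRegExp and buildPattern are hypothetical and are not taken from the project's code. Escaping the keyword is an added assumption, so that characters such as + or . typed by the user are matched literally rather than treated as pattern syntax.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a search pattern from a keyword.
// The "i" modifier makes the match case insensitive; no modifier keeps it case sensitive.
function buildPattern(key, caseSensitive) {
  return new RegExp(escapeRegExp(key), caseSensitive ? "" : "i");
}

var insensitive = buildPattern("atay", false);
var sensitive = buildPattern("atay", true);

console.log(insensitive.test("Dr. Mustafa Atay")); // true
console.log(sensitive.test("Dr. Mustafa Atay"));   // false
console.log(buildPattern("C++", false).test("c++ basics")); // true
```

Passing the modifier as the second argument of the RegExp constructor, rather than writing a literal such as /atay/i, is what allows the pattern to be built from text entered at run time.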
2.2 Methods

In our process, we first reviewed and learned some of the XML technologies and JavaScript. Next, we applied what we learned to create the client-side search utilities on our test bed. After creating the search utilities, we debugged the code to make sure everything worked properly. Finally, we introduced the code to the other pages on our XML test bed and resolved issues as time allowed.

3. Research Project

3.1 Basic Search

The first contribution to the website is the Basic Search. This search is exactly what the name implies: from the participants page, the user is able to search using a keyword.

Figure 1: 2014 Participants Page
This keyword could be a series of letters or numbers, the participant’s name, their discipline, the advisor’s name, or the advisor’s phone number. The search also offers the option of being case sensitive or insensitive. The primary concept in the development of this search utility was to allow the user to search through all of the text fields using one text box. The code for the basic search checks all four columns of the table against the user’s input.

Figure 2: Basic Search Code
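Since the code in Figure 2 is not reproduced in this text, the following is only a minimal sketch of the idea described above: one keyword is tested against every column of every row. It is not the project's actual code. The function names, the representation of each table row as an array of four strings, and the sample data are all assumptions made for illustration; in the browser, the cell text would instead be read from the table through the DOM.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the rows in which at least one cell matches the keyword.
// rows: array of rows, each row an array of cell strings
//       (participant, discipline, advisor, advisor phone).
function basicSearch(rows, key, caseSensitive) {
  var pattern = new RegExp(escapeRegExp(key), caseSensitive ? "" : "i");
  return rows.filter(function (cells) {
    return cells.some(function (cell) {
      return cell.match(pattern) !== null; // string match with a regular expression
    });
  });
}

// Made-up sample data, not taken from the test bed website.
var sampleRows = [
  ["Participant A", "Computer Science", "Advisor X", "555-0100"],
  ["Participant B", "Biology", "Advisor Y", "555-0101"],
  ["Participant C", "Chemistry", "Advisor X", "555-0102"]
];

console.log(basicSearch(sampleRows, "advisor x", false).length); // 2
console.log(basicSearch(sampleRows, "advisor x", true).length);  // 0
console.log(basicSearch(sampleRows, "0101", false)[0][0]);       // Participant B
```

Because the same pattern is applied to every cell, a single text box is enough to cover names, disciplines, and phone numbers.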
Figure 3: Basic Search Results

3.2 Advanced Search

The other search utility that we introduced is the Advanced Search. This search offers more options than the Basic Search. With this search the user is able to choose which column they want to search through: the advisor’s name, the participant’s name, or the research discipline. The user can also search through the three columns at the same time to refine their search. Once again, there is also the option for the search to be case sensitive or insensitive.

In our research the advanced search was our ultimate goal. This search utilizes several HTML input types: radio buttons, text boxes, a checkbox, and two buttons. The radio buttons are in place for the user to search through the discipline in which the research takes place. The first text box is used to search through the names of
the advisors. The second text box is used to search through the names of the participants. Like the basic search, the advanced search also contains a checkbox with the capability of making the search case sensitive or insensitive. The first button is a submit button used to call the doSearch() function for searching through the table. The second button is a reset button that calls the Reset() function for resetting the HTML form.

Figure 4: Advanced Search Code
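The per-column filtering described for the advanced search can be sketched as follows. This is an illustration under stated assumptions, not the doSearch() function shown in Figure 4: the names advancedSearch and fieldMatches, the row objects, and the sample data are hypothetical, and combining the three criteria with a logical AND is an assumption based on the statement that searching several columns at once refines the search.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// An empty criterion places no restriction on its column.
function fieldMatches(value, key, flags) {
  if (key === "") {
    return true;
  }
  return value.match(new RegExp(escapeRegExp(key), flags)) !== null;
}

// Keep the rows that satisfy every non-empty criterion (assumed AND semantics).
// criteria: { discipline, advisor, participant, caseSensitive }
function advancedSearch(rows, criteria) {
  var flags = criteria.caseSensitive ? "" : "i";
  return rows.filter(function (row) {
    return fieldMatches(row.discipline, criteria.discipline, flags) &&
           fieldMatches(row.advisor, criteria.advisor, flags) &&
           fieldMatches(row.participant, criteria.participant, flags);
  });
}

// Made-up sample data, not taken from the test bed website.
var sampleTable = [
  { participant: "Participant A", discipline: "Computer Science", advisor: "Advisor X" },
  { participant: "Participant B", discipline: "Biology", advisor: "Advisor Y" },
  { participant: "Participant C", discipline: "Computer Science", advisor: "Advisor Y" },
  { participant: "Participant D", discipline: "Chemistry", advisor: "Advisor X" }
];

var found = advancedSearch(sampleTable, {
  discipline: "Computer Science", // would come from the selected radio button
  advisor: "x",                   // first text box
  participant: "",                // second text box left empty
  caseSensitive: false            // checkbox
});

console.log(found.length);         // 1
console.log(found[0].participant); // Participant A
```

In the browser, the criteria object would be filled from the form inputs before the filter runs, and the matching rows would then be displayed in the table.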
Figure 5: Advanced Search Results

4. Conclusions

Our Basic and Advanced Search modules showed that an XML website can be successfully augmented with a client-side search utility using JavaScript for effective searching. We made use of the HTML DOM and simple Regular Expressions in our search modules. In the future, we plan to extend our research with the use of complex Regular Expressions for finer filtering, the XML DOM, and a server-side search utility.

5. Future Work

In the future we would like to enhance the basic search so that it operates on the home, application, and personnel pages. We also want to enhance the advanced search to
operate on the personnel pages. Finally, we want to introduce a global search to our test bed.

6. Experiences

While working on this project we came across a multitude of major and minor problems. The first problem we encountered was linking the input from our first text box to the column containing the advisors’ names. Once we were able to solve this problem, other parts of the project became significantly easier. Another problem we faced was creating the basic search. It seemed like a simple task because its code is similar to that of the advanced search; however, it was more difficult than we anticipated.

Although we ran into quite a few problems, I personally gained many positive experiences. As a result of this research I have gotten a better understanding of XML, XSLT, JavaScript, XPath, and HTML. I have also learned some aspects of DOM and Regular Expressions. In addition, I have learned how to search through a document using JavaScript and other XML technologies, which benefits me as a Computer Science major.

7. Acknowledgements

This project is funded and supported by the NSF HBCU-UP Implementation Grant: Raising Achievement in Mathematics and Science (RAMS) with Award #0927905. The project has also encouraged me to pursue a graduate degree in the future.
8. References

1. “Introduction to XML.” XML Introduction. World Wide Web Consortium, Web. 21 May 2014.
2. “XSLT Tutorial.” XSLT Tutorial. World Wide Web Consortium, Web. 21 May 2014.
3. “JavaScript Tutorial.” JavaScript Tutorial. World Wide Web Consortium, Web. 21 May 2014.
4. “XPath Introduction.” XPath Introduction. World Wide Web Consortium, Web. 22 May 2014.
5. “JavaScript HTML DOM Navigation.” JavaScript HTML DOM Navigation. World Wide Web Consortium, Web. 22 May 2014.
6. “JavaScript RegExp Object.” JavaScript RegExp Object. World Wide Web Consortium, Web. 22 May 2014.
7. “HTML Forms and Input.” HTML Forms and Input. World Wide Web Consortium, Web. 22 May 2014.