2. XML
XML documents have a hierarchical structure and can
conceptually be interpreted as a tree structure, called an XML tree.
XML is becoming the standard for exchanging and
publishing data over the Internet.
3. Labelling schemes for XML
Need for labelling:
Labelling schemes have been developed to optimize query retrieval,
since they provide a quick way to determine the type of relationship
(for example ancestor-descendant, parent-child, or sibling) that holds
between two nodes.
The main novelty of labelling the tree is that it allows people with
limited IT skills to query and explore one (or multiple) data sources
without prior knowledge of the schema, structure, vocabulary, or
any technical details of these sources.
4. Prefix labelling schemes:
Prefix-based schemes directly encode the parent of a node in the tree
as a prefix of the node's label.
Existing/proposed prefix-based schemes:
Dewey encoding
Cohen et al.
LSDX
ORDPATH
Kheing et al.
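Because the parent's label is a prefix of the child's label, structural relationships can be decided from two labels alone, without walking the tree. The following is a minimal sketch in Python, assuming Dewey-style labels whose components are separated by "."; the function names are illustrative and do not come from any of the listed schemes.

```python
def components(label):
    """Split a Dewey-style label such as "0.2.0" into its components."""
    return label.split(".")


def is_ancestor(a, b):
    """True if the node labelled `a` is a proper ancestor of the node labelled `b`."""
    ca, cb = components(a), components(b)
    return len(ca) < len(cb) and cb[:len(ca)] == ca


def is_parent(a, b):
    """True if the node labelled `a` is the parent of the node labelled `b`."""
    ca, cb = components(a), components(b)
    return len(cb) == len(ca) + 1 and cb[:-1] == ca


def are_siblings(a, b):
    """True if the two labels belong to different nodes with the same parent."""
    ca, cb = components(a), components(b)
    return a != b and len(ca) == len(cb) and ca[:-1] == cb[:-1]
```

The comparison is made component by component rather than on the raw strings, so that "0.1" is not mistaken for an ancestor of "0.10".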
5. Performance of various schemes
We compare these schemes on the basis of:
Robustness
Time to process a query
Tools/techniques used
SAX parser
NetBeans IDE
7. Requirements
Functional Requirements
Graphical User interface which the user
Provide accessibility to the application even to people with limited IT skills, so that
they can generate a tree.
It should be able to extract the file and send it to the SAX parser efficiently.
Non-functional Requirements
Scalability: the XML file may be large, with a large number of attributes,
and should still be parsed efficiently.
Reliability: the ability of the system to behave consistently in a user-acceptable
manner when operating within the environment for which the
system was intended.
Usability: finding usability problems in the UI design, making recommendations for
fixing them, and improving the UI design.
8. Dewey - Structure
Each node is assigned a label that represents the path from the document’s root to the node.
Each component of the label represents the local order of an ancestor node.
Nodes with the same number of delimiters (“.”) in their labels are at the same level.
Example tree with Dewey labels (figure):
Bib (0)
    book (0.0)
        author (0.0.0)
            Tim (0.0.0.0)
    paper (0.1)
    paper (0.2)
        author (0.2.0)
            Sarah (0.2.0.0)
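The labelling of the example tree (Bib, book, paper, author) can be sketched with a SAX-style parser, since Dewey labels only need the path of currently open elements and a child counter per level. The slides do not show the project's own code, so this sketch uses Python's standard xml.sax module as an assumption; the class name DeweyHandler, the function dewey_labels, and the sample document are hypothetical and chosen to mirror the figure. Text nodes are labelled like element nodes, as in the figure.

```python
import xml.sax


class DeweyHandler(xml.sax.ContentHandler):
    """Assigns a Dewey label to every element and non-empty text node."""

    def __init__(self):
        super().__init__()
        self.path = []         # Dewey components of the currently open elements
        self.next_child = [0]  # next free local position at each open level
        self.text = []         # buffered character data (may arrive in chunks)
        self.labels = []       # (label, name) pairs in document order

    def _assign(self, name):
        position = self.next_child[-1]
        self.next_child[-1] += 1
        label = ".".join(str(c) for c in self.path + [position])
        self.labels.append((label, name))
        return position

    def _flush_text(self):
        text = "".join(self.text).strip()
        self.text = []
        if text:
            self._assign(text)

    def startElement(self, name, attrs):
        self._flush_text()
        position = self._assign(name)
        self.path.append(position)   # descend one level
        self.next_child.append(0)

    def endElement(self, name):
        self._flush_text()
        self.path.pop()              # climb back to the parent level
        self.next_child.pop()

    def characters(self, content):
        self.text.append(content)


def dewey_labels(xml_text):
    handler = DeweyHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.labels


SAMPLE = (
    "<Bib><book><author>Tim</author></book>"
    "<paper/><paper><author>Sarah</author></paper></Bib>"
)

for label, name in dewey_labels(SAMPLE):
    print(label, name)
```

Because the handler keeps only one counter per open element, memory use grows with the depth of the document rather than its size, which is the usual reason for choosing SAX over a tree-building parser for large files.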
10. LSDX - Structure
Duong et al. - 2005
Labeling Scheme for Dynamic XML data
Allows updates without re-labelling other nodes
Combines numbers and letters to label each node
For a node X, its label is:
level(X)parent(X).positionalIdentifier(X)
where parent(X) is the letter part of the parent's label
Example tree with LSDX labels (figure):
Bib (0a)
    book (1a.b)
        author (2ab.b)
            Tim (3abb.b)
    paper (1a.c)
    paper (1a.d)
        author (2ad.b)
            Sarah (3adb.b)
The first child's positional identifier is “b”, so that
codes remain available for any
insert-before operation.
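The initial labelling shown in the figure can be reproduced with a short recursive sketch. It assumes the tree is given as nested (name, children) tuples, that the root is labelled "0a", and that no node has more than 25 children (identifiers "b" to "z"); how LSDX continues beyond "z" and how it labels later insertions is not covered in these slides, so the sketch does not attempt either. The function name lsdx_labels is hypothetical.

```python
import string

# Positional identifiers for initial children: 'b', 'c', ..., 'z'.
# 'a' is skipped so that codes stay free for insert-before operations.
CHILD_IDS = string.ascii_lowercase[1:]


def lsdx_labels(tree, level=0, parent_letters="", position="a"):
    """Return (label, name) pairs in document order for a (name, children) tree."""
    name, children = tree
    if level == 0:
        label = "0" + position  # root label, "0a" in the figure
    else:
        label = f"{level}{parent_letters}.{position}"
    result = [(label, name)]

    if len(children) > len(CHILD_IDS):
        raise ValueError("sketch only handles up to 25 children per node")

    # The letter part handed to the children is this node's letters
    # followed by its own positional identifier.
    letters = parent_letters + position
    for child, child_position in zip(children, CHILD_IDS):
        result.extend(lsdx_labels(child, level + 1, letters, child_position))
    return result


BIB = ("Bib", [
    ("book", [("author", [("Tim", [])])]),
    ("paper", []),
    ("paper", [("author", [("Sarah", [])])]),
])

for label, name in lsdx_labels(BIB):
    print(label, name)
```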
12. XML query processing:
XQuery is a query and functional programming language that is
designed to query and transform collections of structured and
unstructured data, usually in the form of XML or text, and, with
vendor-specific extensions, other data formats (JSON, binary, etc.). The
language is developed by the XML Query working group of
the W3C.