This document provides an overview of SAX and DOM, which are APIs for parsing XML documents. It discusses:
- SAX and DOM are standards for XML parsers that allow reading and interpreting XML files. SAX is event-based and reads XML sequentially, while DOM loads the entire XML document into memory allowing random access.
- SAX works through callbacks, where the parser calls methods supplied by the application to handle events like start elements, end elements, and character data.
- A simple SAX program is demonstrated using two classes - one contains the main method and parses the XML, while the other implements handler callbacks.
- Key differences between SAX and DOM are that DOM provides random access to
The SAX parser is an event-based XML parser that does not build a DOM tree. It reads XML documents sequentially and notifies registered handlers of parsing events like start elements, end elements, and character data. SAX is useful for large documents since it has a smaller memory footprint than DOM. Handlers implement callback methods to process events. The DefaultHandler class provides empty implementations of required handler interfaces.
This document summarizes key aspects of XML including:
- XML is a text-based format for describing data structures that is both human and machine readable.
- XML became a W3C standard in 1998 and is commonly used for exchanging data between disparate systems.
- Java can be used to generate, access, format, parse, validate, and transform XML data.
- XML documents have a root element containing other nested elements and attributes to describe hierarchical data.
- Well-formed XML documents follow syntax rules for proper nesting of start/end tags and quotes around attribute values.
- XML parsers like SAX and DOM are used to read XML documents sequentially or build a navigable tree structure in memory
- The DOM (Document Object Model) views an XML document as a tree structure where each node represents a component of the XML structure.
- The DOM parser constructs an internal representation of the XML data as a tree structure in memory, allowing traversal and manipulation of nodes.
- To validate an XML document using a schema with DOM, the parser factory is configured to create a validating parser, the schema language and source are set, and errors are handled.
The document discusses XML parsing and processing. It describes two main approaches:
1) Simple API for XML (SAX) - Parses XML as a sequence of events by using event handlers for start/end tags. This is faster but requires processing elements sequentially.
2) Document Object Model (DOM) - Parses XML into a tree structure of nodes that can be randomly accessed. This allows non-sequential access but uses more memory.
It also discusses the Java API for XML Processing (JAXP) which provides a standardized way to access SAX and DOM parsers from Java code.
This document discusses different approaches to parsing XML documents in Java, including DOM, SAX, and StAX parsers.
It begins by explaining that an XML parser reads XML files and makes the information available to applications. It then describes two main parsing approaches: tree-based APIs like DOM that build an in-memory model of the XML document, and event-based APIs like SAX and StAX that parse the document sequentially and throw events.
DOM parsers create a full object model of the XML document that allows random access but uses more memory. SAX is a push-style event-based parser that is more memory efficient but provides less control. StAX is a pull-style parser that gives clients full control
The document discusses different methods for parsing XML data, including SAX, DOM, XML Pull, and JDOM/JSoup. SAX is an event-based parser that is faster but only allows one-pass traversal, while DOM builds an in-memory tree that is slower but allows modification and multi-directional traversal. The document provides code samples for each method and compares their performance, lines of code, and capabilities. It recommends using SAX for large files due to its small memory footprint and speed.
The document discusses XML parsing and processing using different approaches like SAX, DOM and StAX. It explains that XML parsers allow handling XML content in a desired manner by shielding programmers from manually parsing XML. SAX is an event-based API for sequential XML parsing while DOM represents XML as a tree of nodes. StAX is a pull-based parsing approach controlled by the application. The key aspects of SAX parsing in JAXP like creating parsers, handlers and parsing XML files are also covered.
This document discusses different ways to parse XML and JSON documents in Java, including DOM, SAX, StAX for XML and common JSON libraries for Java. It provides examples of parsing XML using DOM and SAX, and generating XML and JSON documents from Java.
The SAX parser is an event-based XML parser that does not build a DOM tree. It reads XML documents sequentially and notifies registered handlers of parsing events like start elements, end elements, and character data. SAX is useful for large documents since it has a smaller memory footprint than DOM. Handlers implement callback methods to process events. The DefaultHandler class provides empty implementations of required handler interfaces.
This document summarizes key aspects of XML including:
- XML is a text-based format for describing data structures that is both human and machine readable.
- XML became a W3C standard in 1998 and is commonly used for exchanging data between disparate systems.
- Java can be used to generate, access, format, parse, validate, and transform XML data.
- XML documents have a root element containing other nested elements and attributes to describe hierarchical data.
- Well-formed XML documents follow syntax rules for proper nesting of start/end tags and quotes around attribute values.
- XML parsers like SAX and DOM are used to read XML documents sequentially or build a navigable tree structure in memory
- The DOM (Document Object Model) views an XML document as a tree structure where each node represents a component of the XML structure.
- The DOM parser constructs an internal representation of the XML data as a tree structure in memory, allowing traversal and manipulation of nodes.
- To validate an XML document using a schema with DOM, the parser factory is configured to create a validating parser, the schema language and source are set, and errors are handled.
The document discusses XML parsing and processing. It describes two main approaches:
1) Simple API for XML (SAX) - Parses XML as a sequence of events by using event handlers for start/end tags. This is faster but requires processing elements sequentially.
2) Document Object Model (DOM) - Parses XML into a tree structure of nodes that can be randomly accessed. This allows non-sequential access but uses more memory.
It also discusses the Java API for XML Processing (JAXP) which provides a standardized way to access SAX and DOM parsers from Java code.
This document discusses different approaches to parsing XML documents in Java, including DOM, SAX, and StAX parsers.
It begins by explaining that an XML parser reads XML files and makes the information available to applications. It then describes two main parsing approaches: tree-based APIs like DOM that build an in-memory model of the XML document, and event-based APIs like SAX and StAX that parse the document sequentially and throw events.
DOM parsers create a full object model of the XML document that allows random access but uses more memory. SAX is a push-style event-based parser that is more memory efficient but provides less control. StAX is a pull-style parser that gives clients full control
The document discusses different methods for parsing XML data, including SAX, DOM, XML Pull, and JDOM/JSoup. SAX is an event-based parser that is faster but only allows one-pass traversal, while DOM builds an in-memory tree that is slower but allows modification and multi-directional traversal. The document provides code samples for each method and compares their performance, lines of code, and capabilities. It recommends using SAX for large files due to its small memory footprint and speed.
The document discusses XML parsing and processing using different approaches like SAX, DOM and StAX. It explains that XML parsers allow handling XML content in a desired manner by shielding programmers from manually parsing XML. SAX is an event-based API for sequential XML parsing while DOM represents XML as a tree of nodes. StAX is a pull-based parsing approach controlled by the application. The key aspects of SAX parsing in JAXP like creating parsers, handlers and parsing XML files are also covered.
This document discusses different ways to parse XML and JSON documents in Java, including DOM, SAX, StAX for XML and common JSON libraries for Java. It provides examples of parsing XML using DOM and SAX, and generating XML and JSON documents from Java.
This document discusses three different approaches to parsing XML documents: SAX, DOM, and JDOM. SAX is an event-based parser that works by invoking methods as it reads the XML document and encounters different components. DOM parses the entire XML document into a tree structure that can be manipulated in memory. JDOM is a Java library that provides an API for parsing XML into Java objects similar to DOM. The document then provides code examples demonstrating how to use each of these XML parsing approaches to search for a book by ISBN in a sample XML book catalog.
This document provides an overview of XML, XML schema, parsing XML, and GladeXML. It defines XML and its components like elements and attributes. It describes XML schema and provides a simple example. It explains how to parse an XML document into a DOM object and access elements. It also gives an overview of how GladeXML can dynamically load user interfaces from XML descriptions.
Tool Development 05 - XML Schema, INI, JSON, YAMLNick Pruehs
Chapter 05 of the lecture Tool Development taught at SAE Institute Hamburg.
Introduction to XML serialization in .NET, XML Schema and Schema validation in .NET, as well as other common text file formats.
JAXB (Java Architecture for XML Binding) is a Java framework that allows for bi-directional data binding between XML documents and XML data to Java objects. It provides an easier way to work with XML compared to DOM and SAX by mapping XML schema to Java classes and binding XML elements/attributes to class properties. The major steps in the JAXB process are generating Java classes from an XML schema, unmarshalling XML documents into Java objects, processing/modifying the objects, and marshalling the objects back into XML documents.
Module 06 covers Java file input/output (IO) including byte streams, character streams, listing and creating directories and files, deleting directories and files, and using the Java console. Key topics include IO streams for input and output, using byte streams like FileInputStream and FileOutputStream to copy file bytes, character streams like BufferedReader and BufferedWriter to copy file lines, listing and interacting with directory objects, creating/deleting files and directories, and using the Java console for password input. The module provides examples of reading/writing files using byte and character streams, listing directory contents, creating/deleting files and directories, and taking password input from the console.
The document provides an introduction to XSLT (Extensible Stylesheet Language Transformations), including:
1) It discusses XSLT basics like using templates to extract values from XML and output them, using for-each loops to process multiple elements, and if/choose for decisions.
2) It covers XPath for addressing parts of an XML document, and functions like contains() and position().
3) The document gives examples of transforming sample XML data using XSLT templates, value-of, and apply-templates.
Start programming in a more functional style in Java. This is the second in a two part series on lambdas and streams in Java 8 presented at the JoziJug.
The document provides an overview of the Arbortext Command Language (ACL), which is a scripting language embedded in Arbortext Editor. It describes ACL's basic syntax, variables, functions, flow control structures, and built-in functions for working with files, documents, and document content. ACL allows users to control the Arbortext software, navigate and modify documents, and interface with other components through scripting.
This document provides tips and tricks for debugging Arbortext applications. It discusses challenges like debugging components with multiple interfaces and custom code. It recommends using messages like response() and eval to monitor state, and debugging tools like the Java console. It also suggests adding debug messages programmatically, using binary search, and getting a second set of eyes to help find bugs. Maintaining backups and good documentation are emphasized.
This document provides a summary of Solr's built-in query parsers. It details the "lucene" query parser, dismax query parser, and other parsers like spatial, boost, term, and prefix parsers. It explains how to specify a query parser and leverage nested querying. The document concludes by covering new features in Solr 4.x like the surround and switch query parsers.
The document discusses processing XML in .NET, including using DOM parsers, streaming parsers, XPath, and LINQ to XML. It provides examples of parsing and modifying XML using the DOM parser, XmlReader, XmlWriter, XPath, and LINQ. Exercises are included to practice working with XML using these .NET APIs.
JAXB (Java Architecture for XML Binding) defines an API for reading and writing Java objects to and from XML documents. It uses annotations to map XML elements and attributes to Java objects. This allows Java objects to be automatically marshalled to XML and XML to be unmarshalled to Java objects without needing to understand XML parsing techniques. JAXB comes bundled with the JDK so there are no extra dependencies. It provides a simpler model than DOM or SAX for working with XML in Java applications.
An XML writer represents a component that provides a fast, forward-only way of outputting XML data to streams or files. It features ad hoc methods to write any XML node type and guarantees well-formed XML output without worrying about syntax. The .NET Framework provides the XmlTextWriter class, which takes precautions like escaping characters to ensure valid XML. Readers and writers are fundamental for I/O operations like serialization and data access in .NET.
Java uses streams to handle input/output operations. Streams provide a standardized way to read from and write to various sources and sinks like files, networks, and buffers. There are byte streams that handle input/output of bytes and character streams that handle characters. Common stream classes include FileInputStream, FileOutputStream, BufferedReader, and BufferedWriter which are used to read from and write to files and console. Streams can be chained together for complex I/O processing.
After a thorough overview of the main features and benefits of Apache Solr (an open source search server), the architecture of Solr and strategies to adopt it for your PHP application and data model will be presented. The main lessons learned around dealing with a mix of structured and non-structured content, multilingual aspects, tuning and the various state-of-the-art features of Solr will be shared as well
The document discusses Scala, a programming language designed to be scalable. It can be used for both small and large programs. Scala combines object-oriented and functional programming. It interoperates seamlessly with Java but allows developers to add new constructs like actors through libraries. The Scala community is growing, with industrial adoption starting at companies like Twitter.
A namespace is a declarative region that provides a scope to the identifiers (the names of types, functions, variables, etc) inside it. It is used to organize code into logical groups and to prevent name collisions that can occur especially when our code base includes multiple libraries. Namespace provides a class-like modularization without class-like semantics
This document provides an overview of .NET Framework and C# programming basics. It covers topics such as .NET Framework features like Common Language Runtime and Base Class Library. It also discusses C# language basics, including types like value types and reference types. The document includes code examples demonstrating typical C# programs and features like properties, arrays, and anonymous types.
The document discusses the XML DOM (Document Object Model). It defines the DOM as a standard for accessing and manipulating XML documents through a tree structure representation. The DOM defines all elements in an XML document as nodes that can be traversed and modified. It outlines DOM properties and methods for navigating and manipulating the node tree. Advantages of the DOM include its traversable and modifiable tree structure, while disadvantages include higher resource usage compared to SAX parsing.
DOM and SAX are two APIs for working with XML documents. DOM loads the entire XML document into memory as a tree structure, allowing manipulation of the document. SAX is event-based and reads the XML document sequentially, invoking callback functions at element boundaries. Compared to DOM, SAX uses less memory but only allows reading of the document. Both APIs are supported across many programming languages including PHP.
This document discusses three different approaches to parsing XML documents: SAX, DOM, and JDOM. SAX is an event-based parser that works by invoking methods as it reads the XML document and encounters different components. DOM parses the entire XML document into a tree structure that can be manipulated in memory. JDOM is a Java library that provides an API for parsing XML into Java objects similar to DOM. The document then provides code examples demonstrating how to use each of these XML parsing approaches to search for a book by ISBN in a sample XML book catalog.
This document provides an overview of XML, XML schema, parsing XML, and GladeXML. It defines XML and its components like elements and attributes. It describes XML schema and provides a simple example. It explains how to parse an XML document into a DOM object and access elements. It also gives an overview of how GladeXML can dynamically load user interfaces from XML descriptions.
Tool Development 05 - XML Schema, INI, JSON, YAMLNick Pruehs
Chapter 05 of the lecture Tool Development taught at SAE Institute Hamburg.
Introduction to XML serialization in .NET, XML Schema and Schema validation in .NET, as well as other common text file formats.
JAXB (Java Architecture for XML Binding) is a Java framework that allows for bi-directional data binding between XML documents and XML data to Java objects. It provides an easier way to work with XML compared to DOM and SAX by mapping XML schema to Java classes and binding XML elements/attributes to class properties. The major steps in the JAXB process are generating Java classes from an XML schema, unmarshalling XML documents into Java objects, processing/modifying the objects, and marshalling the objects back into XML documents.
Module 06 covers Java file input/output (IO) including byte streams, character streams, listing and creating directories and files, deleting directories and files, and using the Java console. Key topics include IO streams for input and output, using byte streams like FileInputStream and FileOutputStream to copy file bytes, character streams like BufferedReader and BufferedWriter to copy file lines, listing and interacting with directory objects, creating/deleting files and directories, and using the Java console for password input. The module provides examples of reading/writing files using byte and character streams, listing directory contents, creating/deleting files and directories, and taking password input from the console.
The document provides an introduction to XSLT (Extensible Stylesheet Language Transformations), including:
1) It discusses XSLT basics like using templates to extract values from XML and output them, using for-each loops to process multiple elements, and if/choose for decisions.
2) It covers XPath for addressing parts of an XML document, and functions like contains() and position().
3) The document gives examples of transforming sample XML data using XSLT templates, value-of, and apply-templates.
Start programming in a more functional style in Java. This is the second in a two part series on lambdas and streams in Java 8 presented at the JoziJug.
The document provides an overview of the Arbortext Command Language (ACL), which is a scripting language embedded in Arbortext Editor. It describes ACL's basic syntax, variables, functions, flow control structures, and built-in functions for working with files, documents, and document content. ACL allows users to control the Arbortext software, navigate and modify documents, and interface with other components through scripting.
This document provides tips and tricks for debugging Arbortext applications. It discusses challenges like debugging components with multiple interfaces and custom code. It recommends using messages like response() and eval to monitor state, and debugging tools like the Java console. It also suggests adding debug messages programmatically, using binary search, and getting a second set of eyes to help find bugs. Maintaining backups and good documentation are emphasized.
This document provides a summary of Solr's built-in query parsers. It details the "lucene" query parser, dismax query parser, and other parsers like spatial, boost, term, and prefix parsers. It explains how to specify a query parser and leverage nested querying. The document concludes by covering new features in Solr 4.x like the surround and switch query parsers.
The document discusses processing XML in .NET, including using DOM parsers, streaming parsers, XPath, and LINQ to XML. It provides examples of parsing and modifying XML using the DOM parser, XmlReader, XmlWriter, XPath, and LINQ. Exercises are included to practice working with XML using these .NET APIs.
JAXB (Java Architecture for XML Binding) defines an API for reading and writing Java objects to and from XML documents. It uses annotations to map XML elements and attributes to Java objects. This allows Java objects to be automatically marshalled to XML and XML to be unmarshalled to Java objects without needing to understand XML parsing techniques. JAXB comes bundled with the JDK so there are no extra dependencies. It provides a simpler model than DOM or SAX for working with XML in Java applications.
An XML writer represents a component that provides a fast, forward-only way of outputting XML data to streams or files. It features ad hoc methods to write any XML node type and guarantees well-formed XML output without worrying about syntax. The .NET Framework provides the XmlTextWriter class, which takes precautions like escaping characters to ensure valid XML. Readers and writers are fundamental for I/O operations like serialization and data access in .NET.
Java uses streams to handle input/output operations. Streams provide a standardized way to read from and write to various sources and sinks like files, networks, and buffers. There are byte streams that handle input/output of bytes and character streams that handle characters. Common stream classes include FileInputStream, FileOutputStream, BufferedReader, and BufferedWriter which are used to read from and write to files and console. Streams can be chained together for complex I/O processing.
After a thorough overview of the main features and benefits of Apache Solr (an open source search server), the architecture of Solr and strategies to adopt it for your PHP application and data model will be presented. The main lessons learned around dealing with a mix of structured and non-structured content, multilingual aspects, tuning and the various state-of-the-art features of Solr will be shared as well
The document discusses Scala, a programming language designed to be scalable. It can be used for both small and large programs. Scala combines object-oriented and functional programming. It interoperates seamlessly with Java but allows developers to add new constructs like actors through libraries. The Scala community is growing, with industrial adoption starting at companies like Twitter.
A namespace is a declarative region that provides a scope to the identifiers (the names of types, functions, variables, etc) inside it. It is used to organize code into logical groups and to prevent name collisions that can occur especially when our code base includes multiple libraries. Namespace provides a class-like modularization without class-like semantics
This document provides an overview of .NET Framework and C# programming basics. It covers topics such as .NET Framework features like Common Language Runtime and Base Class Library. It also discusses C# language basics, including types like value types and reference types. The document includes code examples demonstrating typical C# programs and features like properties, arrays, and anonymous types.
The document discusses the XML DOM (Document Object Model). It defines the DOM as a standard for accessing and manipulating XML documents through a tree structure representation. The DOM defines all elements in an XML document as nodes that can be traversed and modified. It outlines DOM properties and methods for navigating and manipulating the node tree. Advantages of the DOM include its traversable and modifiable tree structure, while disadvantages include higher resource usage compared to SAX parsing.
DOM and SAX are two APIs for working with XML documents. DOM loads the entire XML document into memory as a tree structure, allowing manipulation of the document. SAX is event-based and reads the XML document sequentially, invoking callback functions at element boundaries. Compared to DOM, SAX uses less memory but only allows reading of the document. Both APIs are supported across many programming languages including PHP.
XML parsers are software libraries that allow client applications to work with XML documents. There are two main types: DOM parsers build an in-memory tree representation, while SAX parsers use event-based callbacks. Xerces-J is a popular Java XML parser that implements both DOM and SAX interfaces. An example extracts circle data from an XML file using both a DOM parser to iterate through nodes and a SAX parser overriding callback methods.
This document provides an overview of DOM and SAX, two common XML APIs in Java. It describes the key differences between DOM and SAX, including that DOM builds an in-memory tree representation, while SAX parses the XML as a stream of events. The document also provides code examples for using SAX to parse XML and extract data, and examples of how to access and manipulate DOM trees after parsing XML.
eXtensible Markup Language APIs in Java 1.6 - Simple and efficient XML parsin...Wojciech Podgórski
Presentation describes modern ways of parsing XML documents using Java language. It shows different approaches to the same problem, their capabilities, advantages, disadvantages and their comparison. Moreover, we can learn what to expect from Java 7 in context of XML.
An XML processor takes an XML document and DTD file as input and processes them so that applications can access the information. There are two main API approaches for XML processors - SAX and DOM. SAX is an event-based approach where the processor signals events to the application as it recognizes syntactic structures. DOM builds a hierarchical tree of the document in memory that can then be randomly accessed by applications. SAX is faster but DOM allows random access and rearranging of the document.
The document discusses XML DOM and SAX. XML DOM defines a standard for accessing and manipulating XML documents and is a W3C standard. It defines a model for XML, HTML, and other structured documents. The XML DOM defines the objects, properties, and methods to access and modify XML elements. SAX is an alternative event-based XML parsing API that operates sequentially on each XML element. SAX parsers use less memory than DOM parsers but make validation of XML documents more difficult.
This document summarizes XML parsing techniques including DOM, SAX, and Microsoft XML DOM objects. DOM builds a hierarchical model of the XML document as a tree structure in memory. SAX is event-based and parses the document sequentially, triggering events. Microsoft XML DOM provides classes that map to the W3C DOM standard for manipulating XML documents. The document compares DOM and SAX, describing their advantages and disadvantages. It also outlines common DOM objects and their properties and methods for traversing and manipulating the XML document tree.
The document discusses Semantic Web technologies including XML, DOM, RDF, and ontologies. It provides an overview of how these layers work together, from the basic levels of Unicode and URIs, to XML which enables data sharing and transport, to RDF triplets that represent relationships between resources, to ontologies that define classes and connect related items, and finally to higher levels of logic, digital signatures, and trust. The goal of the Semantic Web is to make data on the web more intelligible to computers and enable more sophisticated question answering about relationships between different entities.
The document provides information about Bobby wanting to invest his $5,631 savings at one of three banks - Royal Bank of Canada (RBC), Toronto Dominion Bank (TD), or Scotia Bank. RBC offers 8% interest bi-annually, TD offers 3% interest monthly, and Scotia Bank offers 5% interest quarterly. Using the compound interest formula, the document calculates that after 2 years Bobby would earn $956.47 interest at RBC, $347.75 interest at TD, and $802.49 interest at Scotia Bank, so RBC would increase his savings the most.
XSLT is used to transform XML documents into other XML documents or HTML. It uses XPath to navigate XML documents. Templates are used to define transformation rules that are applied when nodes are matched. Common elements used in XSLT include value-of to extract node values, for-each for loops, apply-templates to apply templates to child nodes, and copy to duplicate nodes in the output. Conditional logic can be added using elements like if, choose, when and otherwise.
The document describes an XML file containing contact information for a person named John Johansen Doe with a phone number. The XML file is parsed by an XML parser that converts it into an abstract model representation as a DOM tree, where each element in the XML file becomes a node in the tree structure with properties like parentNode, firstChild, and nextSibling. The abstract model can then be used by various XML processing components for tasks like loading, saving, validation, views, and querying with XPath.
This document provides an overview of using the DOM (Document Object Model) API in Java to parse, traverse, modify, and generate XML documents. It describes the core DOM interfaces like Document and Node, how to parse an XML file into a DOM document tree using JAXP, and examples of traversing nodes, modifying content, and generating a new XML document from scratch using the DOM.
El documento describe el Modelo de Objetos de Documento (DOM), que es una interfaz para programas y scripts para acceder y modificar el contenido, estructura y estilo de documentos XML y HTML. El DOM define una representación del documento como un árbol de nodos y objetos, permitiendo la navegación y modificación de la estructura del documento. El DOM ha evolucionado a través de varios niveles para agregar funcionalidades como manipulación de contenido, hojas de estilo, eventos y rangos.
If you are using jQuery, you need to understand the Document Object Model and how it accounts for all the elements inside any HTML document or Web page.
The Document Object Model (DOM) is a standard for representing and interacting with objects in HTML, XML and SVG documents. It defines the logical structure of documents and the way a document is accessed and manipulated. The DOM represents the document as nodes and objects, which can be manipulated programmatically by JavaScript to change the document structure, style and content. It allows dynamic access to and manipulation of page content that is useful for building interactive web applications. The DOM specification is developed by the W3C and provides a platform- and language-neutral interface that can be used across different web technologies.
The document discusses XML, XSLT, and XSL-FO, explaining that XML is used to store and exchange data, XSLT transforms XML documents, and XSL-FO formats XML data for output to different mediums like screens and paper. It also provides examples of using these technologies together and with Java or CSS to display XML content in different formats.
The document discusses two common XML parsers - DOM and SAX. DOM builds an in-memory tree representation of the entire XML document, allowing random access. SAX is event-based and parses the document sequentially, notifying the application of elements and attributes through callback methods. DOM is used when random access or rearranging elements is needed, while SAX is better for large documents or streaming data. Common DOM and SAX methods are also outlined.
This document provides an example of using SAX (Simple API for XML) to parse an XML document and print its outline or structure. It defines a PrintHandler class that extends the DefaultHandler and overrides startElement, endElement, and characters methods to print the start and end tags and first word of tag bodies as the document is parsed. The PrintHandler is used along with a SAXParser to parse an XML file and output its outline. This demonstrates a basic use of SAX to parse and process an XML document sequentially through event-based callbacks.
1. A class is a blueprint for objects that defines common properties and methods. It can include modifiers, name, superclass/interfaces, and a class body.
2. An object is created using the new keyword, assigning the object to a reference variable. Reference variables store an object's address in memory. Assigning one reference variable to another does not create distinct copies but points to the same object.
3. A method performs a specific task and can return a result. It includes modifiers, return type, name, parameters, and a method body. Method overloading allows methods with the same name but different parameters.
This document summarizes best practices from Twitter engineers for writing effective Scala code. It covers topics like formatting, types and generics, collections, concurrency, control structures, functional and object-oriented programming, garbage collection, Java compatibility, and Twitter's standard libraries like Util and Finagle. The recommendations emphasize immutable and declarative code, functional patterns like futures and options, and avoiding side effects.
The document discusses Java strings, string buffers, file I/O, serialization, dates and numbers. Some key points include:
- String objects are immutable but string references are mutable. The String constant pool stores shared strings.
- StringBuilder is like StringBuffer but not thread-safe. Both are used for file I/O with large input streams.
- File handles file paths. FileReader/Writer read/write characters. BufferedReader/Writer improve performance.
- Serialization converts objects to streams. ObjectOutputStream writes and ObjectInputStream reads objects.
- Inheritance affects serialization if the superclass is serializable. Constructors don't run on deserialization.
- Calendar manipulates dates. Date
The document discusses several advanced C++ programming concepts including abstract classes, exception handling, standard libraries, templates, and containers. It defines abstract classes as classes that contain pure virtual functions and cannot be instantiated. Exception handling allows programs to continue running or terminate gracefully after errors using try, catch, and throw blocks. The standard library provides common functions and classes for input/output, strings, and containers. Templates allow writing generic and reusable code for different data types, including class templates and function templates. The Standard Template Library includes common containers like vectors, lists, and maps that store and organize data using templates and iterators.
This document provides an introduction and overview of Java lambda expressions and functional programming concepts in Java 8. It begins with getting started instructions and a review of pre-lambda approaches like anonymous inner classes. It then covers the basics of lambda expressions like syntax, type inference, implied returns, and effectively final variables. It discusses the @FunctionalInterface annotation and introduces Java's built-in java.util.function package for common functional interfaces. The document provides examples throughout and concludes with a wrap-up of lambda expressions in Java 8.
This document provides information on multithreading, networking, and other Java concepts. It discusses how to create threads in Java by extending the Thread class or implementing Runnable, and describes methods like sleep(), wait(), notify() etc. It also covers networking topics like sockets, servers, clients and protocols. Examples of socket programming and a chatting program using sockets are included. Finally, it discusses classes like URL, URLConnection and HttpURLConnection for working with URLs.
This document discusses multithreading and concurrency in .NET. It covers key concepts like processes and threads, and how they relate on an operating system level. It also discusses the Thread Pool, Task Parallel Library (TPL), Tasks, Parallel LINQ (PLINQ), and asynchronous programming patterns in .NET like async/await. Examples are provided for common threading techniques like producer/consumer and using the Timer class. Overall it serves as a comprehensive overview of multithreading and concurrency primitives available in the .NET framework.
Lambda expressions allow implementing functional interfaces using anonymous functions. Method references provide a shorthand syntax for referring to existing methods as lambda expressions. The Stream API allows functional-style operations on streams of values, including intermediate and terminal operations. The new Date/Time API provides a safer and more comprehensive replacement for the previous date/time classes in Java.
Java programs run on the Java Virtual Machine (JVM). The JVM provides a runtime environment that executes Java bytecode. Key aspects of Java include its use of object-oriented programming, garbage collection, and strong typing. Popular integrated development environments for Java include Eclipse and IntelliJ IDEA.
The document discusses Java packages and interfaces. It provides details about:
1) Packages in Java are used to group related classes and interfaces to prevent naming conflicts, make classes easier to find, and control access levels. Packages can contain subpackages.
2) Interfaces define abstract methods that classes implement, allowing multiple inheritance. Interface methods are public and abstract by default. Interfaces are extended, not implemented like classes.
3) The java.lang package contains fundamental classes like Object, String, and Math. The java.util package contains utility classes like Calendar, Collections, and Random. The java.io package supports input/output with streams, files, and serialization.
This document provides information about methods and functional programming in Ruby. It defines what a method is, how methods are defined and invoked, and how they can return values. It discusses defining simple methods, method return values, singleton methods, and undefining methods. It also covers method name conventions, omitting parentheses in method calls, variable-length argument lists, default parameter values, and using hashes for named arguments. The document concludes by explaining procs and lambdas, how they are created, and how they are invoked differently than regular methods.
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Lucidworks
The document discusses troubleshooting techniques for Solr, including using a "TreeMap" process of establishing boundaries, splitting the problem, identifying relevant parts, zooming in, and re-formulating boundaries in an iterative process until the problem is fixed. It outlines how to establish boundaries by defining the identity, location, timing, and magnitude of the problem. Examples are provided for indexing and searching processes in Solr.
The document discusses Java packages and interfaces. It provides details about:
- What packages are and how they are used to prevent naming conflicts and organize classes.
- How to define a user-defined package with an example.
- What interfaces are and how they allow for multiple inheritance by implementing multiple interfaces.
- Examples of defining and implementing interfaces.
Threads in java, Multitasking and Multithreadingssusere538f7
Threads allow Java programs to take advantage of multiprocessor systems by performing multiple tasks simultaneously. There are two main ways to create threads in Java - by extending the Thread class or implementing the Runnable interface. Threads can be started using the start() method and terminate when their run() method completes. The Java scheduler uses priority to determine which runnable threads get CPU time, with higher priority threads preempting lower priority ones. Threads provide concurrency but not true parallelism since Java threads still run on one CPU.
Rust is a new systems programming language designed for safety, concurrency, and speed. It uses a borrow checker to ensure memory safety and avoid use-after-free bugs, and enables safe concurrency through message passing and ownership rules. The Rust team has been redesigning and implementing the language over the past year based on lessons learned from writing the compiler in Rust. Their goals are to have a memory-safe, concurrent language that achieves performance comparable to C++. They plan to use Rust for the parallel engine of the Servo browser project.
Improved Developer Productivity In JDK8Simon Ritter
The document discusses new features in Java SE 8 that improve developer productivity. It describes lambda expressions and method references that allow for more functional-style programming. It also covers streams, which provide a way to process collections of objects in a declarative way and support aggregate operations. Default methods are discussed as a means to evolve library APIs in a backwards compatible way.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: A prima vista, un mattoncino Lego e la backdoor XZ potrebbero avere in comune il fatto di essere entrambi blocchi di costruzione, o dipendenze di progetti creativi e software. La realtà è che un mattoncino Lego e il caso della backdoor XZ hanno molto di più di tutto ciò in comune.
Partecipate alla presentazione per immergervi in una storia di interoperabilità, standard e formati aperti, per poi discutere del ruolo importante che i contributori hanno in una comunità open source sostenibile.
BIO: Sostenitrice del software libero e dei formati standard e aperti. È stata un membro attivo dei progetti Fedora e openSUSE e ha co-fondato l'Associazione LibreItalia dove è stata coinvolta in diversi eventi, migrazioni e formazione relativi a LibreOffice. In precedenza ha lavorato a migrazioni e corsi di formazione su LibreOffice per diverse amministrazioni pubbliche e privati. Da gennaio 2020 lavora in SUSE come Software Release Engineer per Uyuni e SUSE Manager e quando non segue la sua passione per i computer e per Geeko coltiva la sua curiosità per l'astronomia (da cui deriva il suo nickname deneb_alpha).
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
2. SAX and DOM
• SAX and DOM are standards for XML parsers--program
APIs to read and interpret XML files
– DOM is a W3C standard
– SAX is an ad-hoc (but very popular) standard
– SAX was developed by David Megginson and is open source
• There are various implementations available
• Java implementations are provided as part of JAXP (Java
API for XML Processing)
• JAXP is included as a package in Java 1.4
– JAXP is available separately for Java 1.3
• Unlike many XML technologies, SAX and DOM are
relatively easy
3. Difference between SAX and DOM
• DOM reads the entire XML document into memory and
stores it as a tree data structure
• SAX reads the XML document and calls one of your methods
for each element or block of text that it encounters
• Consequences:
– DOM provides “random access” into the XML document
– SAX provides only sequential access to the XML document
– DOM is slow and requires huge amounts of memory, so it cannot be
used for large XML documents
– SAX is fast and requires very little memory, so it can be used for
huge documents (or large numbers of documents)
• This makes SAX much more popular for web sites
– Some DOM implementations have methods for changing the XML
document in memory; SAX implementations do not
4. Callbacks
• SAX works through callbacks: you call the parser,
it calls methods that you supply
Your program
startDocument(...)
main(...)
The SAX parser
parse(...)
startElement(...)
characters(...)
endElement( )
endDocument( )
5. Simple SAX program
• The following program is adapted from CodeNotes® for
XML by Gregory Brill, pages 158-159
• The program consists of two classes:
– Sample -- This class contains the main method; it
•
•
•
•
•
Gets a factory to make parsers
Gets a parser from the factory
Creates a Handler object to handle callbacks from the parser
Tells the parser which handler to send its callbacks to
Reads and parses the input XML file
– Handler -- This class contains handlers for three kinds of callbacks:
• startElement callbacks, generated when a start tag is seen
• endElement callbacks, generated when an end tag is seen
• characters callbacks, generated for the contents of an element
6. The Sample class, I
import javax.xml.parsers.*; // for both SAX and DOM
import org.xml.sax.*;
import org.xml.sax.helpers.*;
// For simplicity, we let the operating system handle exceptions
// In "real life" this is poor programming practice
public class Sample {
public static void main(String args[]) throws Exception {
// Create a parser factory
SAXParserFactory factory = SAXParserFactory.newInstance();
// Tell factory that the parser must understand namespaces
factory.setNamespaceAware(true);
// Make the parser
SAXParser saxParser = factory.newSAXParser();
XMLReader parser = saxParser.getXMLReader();
7. The Sample class, II
• In the previous slide we made a parser, of type XMLReader
• Now we continue with:
// Create a handler
Handler handler = new Handler();
// Tell the parser to use this handler
parser.setContentHandler(handler);
// Finally, read and parse the document
parser.parse("hello.xml");
} // end of Sample class
• You will need to put the file hello.xml :
– In the same directory, if you run the program from the command line
– Or where it can be found by the particular IDE you are using
8. The Handler class, I
• public class Handler extends DefaultHandler {
– DefaultHandler is an adapter class that defines these methods and
others as do-nothing methods, to be overridden as desired
– We will define three very similar methods to handle (1) start tags,
(2) contents, and (3) end tags--our methods will just print a line
– Each of these three methods could throw a SAXException
•
// SAX calls this method when it encounters a start tag
public void startElement(String namespaceURI,
String localName,
String qualifiedName,
Attributes attributes)
throws SAXException {
System.out.println("startElement: " + qualifiedName);
}
9. The Handler class, II
•
// SAX calls this method to pass in character data
public void characters(char ch[], int start, int length)
throws SAXException {
System.out.println("characters: " +
new String(ch, start, length));
}
•
// SAX call this method when it encounters an end tag
public void endElement(String namespaceURI,
String localName,
String qualifiedName)
throws SAXException {
System.out.println("Element: /" + qualifiedName);
}
} // End of Handler class
10. Results
• If the file hello.xml contains:
<?xml version="1.0"?>
<display>Hello World!</display>
• Then the output from running java Sample will be:
startElement: display
characters: Hello World!
Element: /display
11. Parser factories
• A factory is an alternative to constructors
• To create a SAX parser factory, call this method:
SAXParserFactory.newInstance()
– This returns an object of type SAXParserFactory
– It may throw a FactoryConfigurationError
• You can then say what kind of parser you want:
– public void setNamespaceAware(boolean awareness)
• Call this with true if you are using namespaces
• The default (if you don’t call this method) is false
– public void setValidating(boolean validating)
• Call this with true if you want to validate against a DTD
• The default (if you don’t call this method) is false
• Validation will give an error if you don’t have a DTD
12. Getting a parser
• Once you have a SAXParserFactory set up (say it’s named
factory), you can create a parser with:
SAXParser saxParser = factory.newSAXParser();
XMLReader parser = saxParser.getXMLReader();
• Note: older texts may use Parser in place of XMLReader
– Parser is SAX1, not SAX2, and is now deprecated
– SAX2 supports namespaces and some new parser properties
• Note: SAXParser is not thread-safe; to use it in multiple
threads, create a separate SAXParser for each thread
– This is unlikely to be a problem in class projects
13. Declaring which handler to use
• Since the SAX parser will be calling our methods, we need
to supply these methods
• In the example these are in a separate class, Handler
• We need to tell the parser where to find the methods:
Handler handler = new Handler();
parser.setContentHandler(handler);
• These statements could be combined:
parser.setContentHandler(new Handler());
• Finally, we call the parser and tell it what file to parse:
parser.parse("hello.xml");
• Everything else will be done in the handler methods
14. SAX handlers
• A callback handler for SAX must implement these four
interfaces:
– interface ContentHandler
• This is the most important interface--it handles basic parsing
callbacks, such as element starts and ends
– interface DTDHandler
• Handles only notation and unparsed entity declarations
– interface EntityResolver
• Does customized handling for external entities
– interface ErrorHandler
• Must be implemented or parsing errors will be ignored!
• You could implement all these interfaces yourself, but
that’s a lot of work--it’s easier to use an adapter class
15. Class DefaultHandler
• DefaultHandler is in package org.xml.sax.helpers
• DefaultHandler implements ContentHandler, DTDHandler,
EntityResolver, and ErrorHandler
• DefaultHandler is an adapter class--it provides empty
methods for every method declared in each of the four
interfaces
– Empty methods don’t do anything
• To use this class, extend it and override the methods that are
important to your application
– We will cover some of the methods in the ContentHandler and
ErrorHandler interfaces
16. ContentHandler methods, I
• public void setDocumentLocator(Locator loc)
– This method is called once, when parsing first starts
– The Locator contains either a URL or a URN, or both, that
specifies where the document is located
– You may need this information if you need to find a document
whose position is specified relative to this XML document
– Locator methods include:
• public String getPublicId() returns the public identifier for the
current document
• public String getSystemId() returns the system identifier for the
current document
– Every ContentHandler method except this one may throw a
SAXException
17. ContentHandler methods, II
• public void processingInstruction(String target,
String data)
throws SAXException
• This method is called once for each processing instruction
(PI) that is encountered
• The PI is presented as two strings: <?target data?>
• According to XML rules, PIs may occur anywhere in the
document after the initial <?xml ...?> line
– This means calls to processingInstruction do not
necessarily occur before startElement is called with
the document root
18. ContentHandler methods, III
• public void startDocument()
throws SAXException
– This is called just once, at the beginning of parsing
• public void endDocument()
throws SAXException
– This is called just once, and is the last method called by the parser
• Remember: when you override a method, you can throw fewer kinds of
exceptions, but you can’t throw any new kinds
– In other words: your methods don’t have to throw a SAXException
– But if they do throw an exception, it can only be a SAXException
19. ContentHandler methods, IV
• public void startElement(String namespaceURI,
String localName,
String qualifiedName,
Attributes atts)
throws SAXException
• This method is called at the beginning of every element
• If the parser is namespace-aware,
– namespaceURI will hold the prefix (before the colon)
– localName will hold the element name (without a prefix)
– qualifiedName will be the empty string
• If the parser is not using namespaces,
– namespaceURI and localName will be empty strings
– qualifiedName will hold the element name (possibly with prefix)
20. Attributes, I
• When SAX calls startElement, it passes in a parameter of
type Attributes
• Attributes is an interface that defines a number of useful
methods; here are a few of them:
–
–
–
–
–
getLength() returns the number of attributes
getLocalName(index) returns the attribute’s local name
getQName(index) returns the attribute’s qualified name
getValue(index) returns the attribute’s value
getType(index) returns the attribute’s type, which will be one of
the Strings "CDATA", "ID", "IDREF", "IDREFS", "NMTOKEN",
"NMTOKENS", "ENTITY", "ENTITIES", or "NOTATION”
• As with elements, if the local name is the empty string, then
the attribute’s name is in the qualified name
21. Attributes, II
• SAX does not guarantee that the attributes will be returned
in the same order they are written
– After all, the order is irrelevant in XML
• The following methods look up attributes by name rather
than by index:
– public String getValue(String qualifiedName)
– public String getValue(String uri, String localName)
• However, some methods of Attributes require an index, so:
– public int getIndex(String qualifiedName)
– public int getIndex(String uri, String localName)
• An Attributes object is valid only during the call to
startElement
– If you need to remember attributes longer, use:
AttributesImpl attrImpl = new AttributesImpl(attributes);
22. ContentHandler methods, V
• endElement(String namespaceURI,
String localName,
String qualifiedName)
throws SAXException
• The parameters to endElement are the same as those to
startElement, except that the Attributes parameter is
omitted
23. ContentHandler methods, VI
• public void characters(char[] ch,
int start,
int length) throws SAXException
• ch is an array of characters
– Only length characters, starting from ch[start], are the
contents of the element
• The String constructor new String(ch, start, length) is an
easy way to extract the relevant characters from the char array
• characters may be called multiple times for one element
– Newlines and entities may break the data characters into separate calls
– characters may be called with length = 0
– All data characters of the element will eventually be given to characters
24. Example
• If hello.xml contains:
– <?xml version="1.0"?>
<display>
Hello World!
</display>
• Then the sample program we started with gives:
– startElement: display
characters:
characters:
characters:
characters:
<-- zero length string
<-- LF character (ASCII 10)
Hello World! <-- spaces are preserved
<-- LF character (ASCII 10)
Element: /display
25. Whitespace
• Whitespace is a major nuisance
– Whitespace is characters; characters are PCDATA
– IF you are validating, the parser will ignore whitespace
where PCDATA is not allowed by the DTD
– If you are not validating, the parser cannot ignore whitespace
– If you ignore whitespace, you lose your indentation
• To ignore whitespace when validating:
– Happens automatically
• To ignore whitespace when not validating:
– Use the String function trim() to remove whitespace
– Check the result to see if it is the empty string
26. Handling ignorable whitespace
• A nonvalidating parser cannot ignore whitespace, because it
cannot distinguish it from real data
• A validating parser can, and does, ignore whitespace where
character data is not allowed
– For processing XML, this is usually what you want
– However, if you are manipulating and writing out XML, discarding
whitespace ruins your indentation
– To capture ignorable whitespace, you can override this method
(defined in DefaultHandler):
public void ignorableWhitespace(char[] ch,
int start,
int length)
throws SAXException
• Parameters are the same as those for characters
27. Error Handling, I
• SAX error handling is unusual
• Errors are ignored unless you register an error
handler (org.xml.sax.ErrorHandler)
– Ignored errors can cause bizarre behavior
– Failing to provide an error handler is unwise
• The ErrorHandler interface declares:
– public void fatalError (SAXParseException exception)
throws SAXException // XML not well structured
– public void error (SAXParseException exception)
throws SAXException // XML validation error
– public void warning (SAXParseException exception)
throws SAXException // minor problem
28. Error Handling, II
• If you are extending DefaultHandler, it implements
ErrorHandler and registers itself
– DefaultHandler’s version of fatalError() throws a
SAXException, but...
– its error() and warning() methods do nothing!
• You can (and should) override these methods
• Note that the only kind of exception your override
methods can throw is a SAXException
– When you override a method, you cannot add exception types
– If you need to throw another kind of exception, say an
IOException, you can encapsulate it in a SAXException:
• catch (IOException ioException) {
throw new SAXException("I/O error: ", ioException)
}
29. Error Handling, III
• If you are not extending DefaultHandler:
– Create a new class (say, MyErrorHandler) that
implements ErrorHandler (by supplying the three
methods fatalError, error, and warning)
– Create a new object of this class
– Tell your XMLReader object about it by sending it the
following message:
setErrorHandler(ErrorHandler handler)
• Example:
XMLReader parser = saxParser.getXMLReader();
parser.setErrorHandler(new MyErrorHandler());