The eXtensible Markup Language (XML) is not a language itself, but rather a meta-language used to create markup languages to suit whatever purpose you may have. In this session you will learn the basic rules of XML and the philosophy behind it. You will also be introduced to the basics of the popular XML editor oXygen.
11. Basic Rules of XML
every XML document must declare itself
as an XML document
<?xml version="1.0"?>
<?xml version="1.0" encoding="utf-8"?>
12. Basic Rules of XML
every XML document must have a root
element that wraps the entire
document
<TEI></TEI>
or:
<modsCollection></modsCollection>
13. Basic Rules of XML
every XML tag that opens must close
<div1></div1>
<head></head>
<name></name>
• The only exception to this are self-closing tags:
<pb/>
<milestone/>
<link/>
14. Basic Rules of XML
tags are case-sensitive, and tag-pairs
must match
<title></title>
not:
<title></TITLE>
or:
<Title></TITLE>
15. Basic Rules of XML
all tags must nest correctly
<title><persName>Dr. Strangelove</persName>,
<subtitle> or, How I learned to stop worrying and
love the bomb.</subtitle></title>
not:
<title><persName>Dr. Strangelove
</persName>,<subtitle> or, How I
learned to stop worrying and love the
bomb.</title></subtitle>
16. Basic Rules of XML
Well-formed XML
The following is NOT a well-formed document. Why?
<?xml version="1.0"?>
<BOOK>
<TITLE>The Adventures of Huckleberry Finn
<AUTHOR>Mark Twain</TITLE></AUTHOR>
<BINDING>mass market paperback</BINDING>
<PAGES>298</PAGES>
<PRICE>$5.49</price>
</BOOK>
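One way to see why the document above is rejected is to hand it to an XML parser. The following is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`; the helper name `check_well_formed` is hypothetical and not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The not-well-formed example from the slide, reproduced as-is.
BROKEN_BOOK = """<?xml version="1.0"?>
<BOOK>
<TITLE>The Adventures of Huckleberry Finn
<AUTHOR>Mark Twain</TITLE></AUTHOR>
<BINDING>mass market paperback</BINDING>
<PAGES>298</PAGES>
<PRICE>$5.49</price>
</BOOK>
"""


def check_well_formed(xml_text):
    """Return (True, "") if xml_text parses, else (False, parser message).

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


if __name__ == "__main__":
    ok, message = check_well_formed(BROKEN_BOOK)
    print(ok, message)
```

A parser of this kind stops at the first problem it meets, so the nesting error is reported first; the mixed-case closing tag only surfaces once the nesting has been repaired.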
17. Review: Basic Rules of XML
• an XML document must have an XML declaration:
<?xml version="1.0"?>
• every XML document must have a root element that wraps the
entire document:
• every XML tag that opens must close: the only exception to this
are self-closing tags
• tags are case-sensitive and tags must match
• all tags must nest correctly
18. Exercise 1
Using what you’ve learned about well-formed XML,
create an XML file describing a text.
1. Open WordPad or Notepad
2. Open springtime.txt from student_files
3. Use any tags you like to mark up the text to
create a well-formed XML document.
19. Key Concepts of XML
XML applications
Dublin Core: broad metadata standard that supports various
purposes and business models
MathML: Math Markup Language
GedML: Genealogical Markup Language
ParlML: Parliamentary Markup Language
RETS: Real Estate Transaction Standard
TEI: Text Encoding Initiative
For more examples, see:
List of XML Markup Languages.
20. Key Concepts of XML
Valid XML
an XML application’s tag set is enforced through
an XML schema
OR
a DTD (document type definition)
21. Structure of an XML document
• the prolog
• The XML declaration
<?xml version="1.0"?>
• other declarations (e.g., DTD, entities)
<!DOCTYPE COLL SYSTEM "red.textclass.dtd">
<!ENTITY TEI "Text Encoding Initiative">
• the document element
• defined by root element
<TEI></TEI>
22. Building Blocks of XML
• elements and attributes
• general entities
• XML data
23. Building Blocks of XML
elements and attributes
<front>
CONTENTS
PAGE
<chapter>SPRINGTIME</chapter> <pageNo>1</pageNo>
SOME NAMES OF CHARACTERS IN FICTION 15
THOMAS HEARNE, 1678–1735 29
RECOLLECTIONS 51
</front>
24. Building Blocks of XML
elements and attributes
<text type="essay">
Governesses used to tell us that the seasons of the
year each consist of three months, and of these
<month type="third">March</month>, April, and May
make the springtime.</text>
<element attribute="value">content</element>
Attribute values must always be
in single or double quotes
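To make the element/attribute distinction concrete, here is a small sketch that reads the attributes of the marked-up passage above. It assumes Python's standard-library `xml.etree.ElementTree`; the helper `parses` is a hypothetical name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# The passage from the slide, written with straight quotes.
SAMPLE = (
    '<text type="essay">Governesses used to tell us that the seasons of the '
    'year each consist of three months, and of these '
    '<month type="third">March</month>, April, and May '
    'make the springtime.</text>'
)


def parses(xml_text):
    """Hypothetical helper: True if xml_text is accepted by the parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(SAMPLE)
month = root.find("month")

if __name__ == "__main__":
    print(root.tag, root.attrib)           # element name and attribute dictionary
    print(month.get("type"), month.text)   # one attribute value and the content
    print(parses('<text type=essay>x</text>'))  # unquoted value is rejected
```

The last line illustrates the quoting rule: the same element with an unquoted attribute value is not accepted.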
25. Review: Basic Rules of XML
• an XML document must have an XML declaration
• every XML document must have a root element that wraps
the entire document:
• every XML tag that opens must close: the only exception
to this are self-closing tags
• tags are case-sensitive and tags must match
• all tags must nest correctly
• attribute values must always be in single or double
quotation marks
26. Exercise 1, cont.
Using the text you marked up earlier, add attributes and
values to the elements.
Ex: BY <author type="knight">SIR FRANCIS
DARWIN</author>
27. Building Blocks of XML
general entities
• used as a placeholder for non-ASCII data, such as
special characters, non-Roman alphabets, and
non-text media
• to be used in the document element, entities must
be declared in prolog
(except for XML Unicode entities)
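As a rough illustration of declaring an entity in the prolog and then using it in the document element, the sketch below parses a tiny document with Python's standard-library `xml.etree.ElementTree`. The `note` element and its sentence are made up for this example; only the `TEI` entity comes from the slides.

```python
import xml.etree.ElementTree as ET

# The entity is declared inside the document type declaration in the prolog,
# then referenced as &TEI; in the document element.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY TEI "Text Encoding Initiative">
]>
<note>Encoded following the &TEI; guidelines.</note>
"""

# The same document element without the declaration in the prolog.
UNDECLARED = "<note>Encoded following the &TEI; guidelines.</note>"

root = ET.fromstring(DOC)

if __name__ == "__main__":
    print(root.text)  # the entity reference is replaced by its declared text
    try:
        ET.fromstring(UNDECLARED)
    except ET.ParseError as err:
        print("rejected:", err)
```

The parser substitutes the declared replacement text wherever the reference appears, and refuses a reference to an entity it has never seen declared.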
28. Building Blocks of XML
general entities
• within the document element (anywhere after the
prolog) an entity takes the standard syntax of
starting with & and ending with ;
• ampersands (&) and angle brackets (<>) are
reserved characters in XML and must be encoded
as entities
<measure type="weight">&gt; 50lbs</measure>
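When element content is generated by a program, the reserved characters can be encoded automatically. The sketch below is one possible approach, assuming Python's standard-library `xml.sax.saxutils.escape`; the sample string with `& rising` is invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical raw text containing reserved characters.
raw_text = "> 50lbs & rising"

# escape() encodes &, < and > as &amp;, &lt; and &gt;.
encoded = escape(raw_text)
document = '<measure type="weight">' + encoded + '</measure>'

# Parsing the document turns the entities back into the original characters.
measure = ET.fromstring(document)

if __name__ == "__main__":
    print(encoded)
    print(document)
    print(measure.text)
```

Encoding on the way in and decoding on the way out means the text a reader sees is unchanged, while the stored file stays parseable.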
29. Review: Basic Rules of XML
• an XML document must have an XML declaration
• every XML document must have a root element that wraps
the entire document:
• every XML tag that opens must close: the only exception
to this are self-closing tags
• tags are case-sensitive and tags must match
• all tags must nest correctly
• attribute values must always be in single or double
quotation marks
• ampersands (&) and angle brackets (<>) are
reserved characters in XML and must be
encoded as entities
30. Building Blocks of XML
data
CDATA (character data)
• text data not parsed as markup by the XML parser
PCDATA (parsed character data)
• text data parsed by XML parser
NDATA (notation data)
• all other media types referenced in the
XML document
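The difference between CDATA and PCDATA can be seen by parsing the same characters both ways. This is a minimal sketch assuming Python's standard-library `xml.etree.ElementTree`; the `example` element and its content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Inside a CDATA section, < and & are taken as literal text, not markup.
AS_CDATA = "<example><![CDATA[<b>bold</b> & more]]></example>"

# As PCDATA, the same characters are parsed: <b> becomes a child element
# and the ampersand has to be written as an entity.
AS_PCDATA = "<example><b>bold</b> &amp; more</example>"

cdata_root = ET.fromstring(AS_CDATA)
pcdata_root = ET.fromstring(AS_PCDATA)

if __name__ == "__main__":
    print(repr(cdata_root.text), len(cdata_root))   # literal text, no children
    print(pcdata_root[0].tag, repr(pcdata_root[0].tail))  # child element and trailing text
```

In the first document the element has no children at all; in the second, the parser has turned the tag into a real child element.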
31. Review: Key Concepts of XML
• Well-formed XML
• Follows the basic rules--no content model
• Valid XML
• an XML schema
• a DTD (document type definition)
32. Review: Structure of XML document
• the prolog
• The XML declaration
<?xml version="1.0"?>
• other declarations (e.g., DTD, entities)
• the document element
• defined by root element (e.g., <TEI>)
34. Software: oXygen XML Editor
WU site-wide license @ http://sl.wustl.edu/catalog/index.php
• Easy to use, with robust functionality for editing,
project management, and validation of structured markup
sources.
• Supports output to multiple target formats, including PDF,
TXT, HTML, and XML
35. oXygen Features
• Multiplatform availability: Windows, Mac
• Multilanguage support: English, German, French, Italian,
and Japanese
• Unicode support
• Spell checking supporting English, German and French
• Easy error tracking
• Content completion
• Built-in templates
36. oXygen Features
• Preview transformation results as XHTML or XML or in
your browser
• Import data from a database, Excel, HTML or text file
• XML project manager
• Manual and automatic validation of XML documents
against XML Schema schemas and DTDs
• Batch validate selected files in project
Editor's Notes
What is XML?
It is a self-describing document
The tags used describe what the document is about
Example: This book declares itself to be a book
Simplicity - Information coded in XML is just plain text. It’s easy to read and understand, plus it can be processed easily by computers.
It’s an open, non-proprietary standard. Because it’s stored as plain text files, no special software is required to access the data. The data can theoretically be around forever because it’s not software dependent. The WordPerfect document you have stored on a floppy disk would still be readable today if it were in XML.
Extensibility – XML stands for extensible markup language. This means that there is no fixed set of tags. New tags can be created as they are needed.
Interoperability - XML is a W3C standard, endorsed by software industry market leaders. [Is everyone here familiar with the W3C? The W3C develops open specifications (de facto standards) to enhance the interoperability of web-related products.] This means XML can be used by many different systems, transformed from one metadata standard to another (e.g., Dublin Core to TEI), and shared by institutions.
It separates content from presentation - XML tags describe meaning not presentation. XML is used to transport data, whereas HTML is used to format and display data.
It can output to multiple formats
The look and feel of an XML document can be controlled by XSL style sheets, allowing the look of a document (or of a complete Web site) to be changed without touching the content of the document.
More importantly, this includes yet unknown formats.
There are many other benefits, such as:
It supports multilingual documents and Unicode (Red Brush, True Crimes projects)
It facilitates the comparison and aggregation of data (standard formats allow different data sets to be aggregated and compared)
You can embed multiple data types (can wrap image/video in a METS wrapper)
And rapid adoption by industry (XML is used by scholarly databases, the publishing industry, software vendors, and many others)
SGML was first developed in the 1980s out of the idea that markup should focus on the structure of a text, not its presentation
It was not widely accepted because of its overwhelming number of tags and because it was not modifiable
From SGML, HTML was developed and became the language of the web. HTML indicates how a text should be displayed as well as its structure.
The h1 and i tags tell the browser how to display the chapter heading – as large, italic text
XML was also developed from SGML and introduced flexibility. It was not intended to address display issues, but only to indicate a document’s structure. Strictly speaking, it is not a markup language but a set of rules for creating markup languages.
No display information is given. Only the structure of the document.
Benefit of describing documents structure – if you have content identified as author name, book title, chapter heading, etc. you can limit your search to just that content
XHTML was developed around early 2000. The W3C recommendations for HTML have been based on XML rather than SGML since then. XHTML documents must be well formed XML documents. It is a somewhat stricter version of HTML, allowing for more rigorous and robust documents, while using tags used by HTML.
Allows the billions of web pages in HTML to retroactively become a subset of XML, without abandoning HTML altogether.
Differences in XHTML – all tags must be closed, all attribute values must be quoted, and all tag and attribute names must be lowercase
To summarize the relationships - HTML, XML and XHTML are all subsets of SGML, which predated the web by over a decade. XML is not an extension of HTML, but represents an attempt to revive the original ideas of SGML, while adding the benefits of HTML (web-friendliness and simplicity) and extensibility.
The basic rules for creating a markup language in XML are relatively simple:
Every XML document must declare itself as an XML document, with <?xml version="1.0"?>, in the first line of the document.
At the very least, it must identify the version of XML but can include other information such as the type of character encoding used. (utf-8 - representing any character in Unicode standard – used for Chinese, Japanese, etc.)
The initial fixed character string also allows parsers to test the type of character encoding present. (An XML parser is a processor that reads an XML document and determines the structure and properties of the data.)
every XML document must have a single root element that contains all the other elements comprising the document.
It is the top element and all other elements are hierarchically subordinate to it.
Its start tag begins the document and its end tag is the last to occur in the document
every XML tag that opens must close
The only exception is self-closing tags, which are used for tags that don’t enclose data but mark a point in a document, like the break tag in HTML and TEI
Tags are case sensitive and must match
all tags must nest correctly:
When an element’s start tag is inside another element, its end tag must also be inside that element.
Elements that have subordinate elements, like the root element, are called “container elements” and are also referred to as “parents”. The parent element here is <title>
To continue the parent-child analogy, subordinate elements are called “children” and elements at the same level within a given container element are called “siblings”. The children of <title> are <subtitle> and <persName>. <subtitle> and <persName> are siblings.
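The parent, child, and sibling relationships described here can be inspected programmatically. The sketch below assumes Python's standard-library `xml.etree.ElementTree` and uses the title example from the nesting slide; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The correctly nested title example from the slides.
TITLE = (
    "<title><persName>Dr. Strangelove</persName>, "
    "<subtitle>or, How I learned to stop worrying and love the bomb."
    "</subtitle></title>"
)

# The parent (container) element is the root of this small document.
title = ET.fromstring(TITLE)

# Iterating over an element yields its direct children, in document order;
# those children are siblings of one another.
children = [child.tag for child in title]

if __name__ == "__main__":
    print("parent:", title.tag)
    print("children (siblings of each other):", children)
```

Because the tags nest correctly, the parser can build this tree; with the overlapping version of the same markup there would be no tree to walk.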
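<TITLE> does not close properly (its end tag comes before the end tag of <AUTHOR>, which opened inside it)
<PRICE> is closed with </price>, so the case does not match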
There are a few additions to these rules (we’ll add some as we go along), but basically this is all you need to create well-formed XML. Well-formedness is a key concept of XML: it means that the document follows these minimal requirements for an XML document.
If you follow these rules, you can use ANY tags you want, because XML is extensible, and this will be well-formed XML.
Put basic rules slide back up and give 5 minutes
Show finished XML file in oXygen and show it is well formed
Now that we have some basic rules of XML, we’ll move on to some key concepts of XML
As we saw from this exercise, the benefit of XML is that it allows you to be as precise (or obscure) as you want to be. The drawback is that it allows you to be as precise and obscure as you want to be. You could tag every word, but you have to consider what your target audience will be using this information for. Will scholars really care that “and” is marked as a conjunction?
XML mediates the problem of large and unwieldy tag sets (SGML) on the one hand, and overly simple tag sets that get overloaded and used for purposes they weren’t intended for (HTML) on the other, by allowing for a middle ground, the XML application
Not to be confused with software applications, an XML application is a set of XML tags and rules for their use agreed on by a given community or consortium for use in their common subject area or purpose.
Some examples of XML applications include: [see slide]
These are more or less successful, more and less accepted standards, and more are appearing all the time.
Having a set group of tags and attributes means standardization and interoperability. Looking at a set of tags will also help you decide which information to mark up. For example, the TEI standard does not have a <word> tag. It would be overkill to mark every word with an element.
If well-formed XML can be created using any tags, how is a community’s agreement on a certain set of tags and rules for them (an XML application) enforced?
An XML application’s tag set is enforced in one of two ways: an XML schema or a DTD (document type definition)
Both serve essentially the same function, in that they list the tags the community or group has agreed on (and what they mean) and the rules for where and how they can be used in a document.
DTD (Document Type Definition) – defines the elements in your document and the attributes they can have, as well as the ordering and nesting of elements; it is referenced in a doctype declaration after the XML declaration
Schema – essentially does the same by defining elements and attributes, but is itself an XML document
To enforce the rules of an XML application in an XML document, the document points to a DTD or schema file (either stored locally on the computer or at a location on the web). A validating XML parser then reads the rules in the DTD or schema and validates the XML against them.
By definition, an XML document that validates against a schema or DTD is also well-formed. The reverse does not hold: a well-formed document is only called valid if it also conforms to the rules of a schema or DTD.
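To make the difference between well-formed and valid more concrete, here is a toy sketch. It is not real DTD or schema validation (the standard-library parser used here does not validate); it only checks element names against a hypothetical agreed tag set, `ALLOWED_TAGS`, which is invented for this example. A real schema also constrains order, nesting, and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag set for a made-up XML application (an assumption).
ALLOWED_TAGS = {"text", "front", "body", "chapter", "title"}


def unknown_tags(xml_text: str) -> set:
    """Return the element names in xml_text that are outside ALLOWED_TAGS.

    Raises ET.ParseError if xml_text is not even well-formed.
    """
    root = ET.fromstring(xml_text)
    return {element.tag for element in root.iter()} - ALLOWED_TAGS


# Well-formed and uses only agreed tags.
conforming = "<text><front><title>Contents</title></front><body/></text>"

# Well-formed, but <sidebar> is not part of the agreed tag set.
well_formed_only = "<text><sidebar>Not in the agreed set</sidebar></text>"

print(unknown_tags(conforming))
print(unknown_tags(well_formed_only))
```

Both documents parse without error; only the second uses a tag the imaginary community never agreed on.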
Example of DTD: http://www.tei-c.org/Vault/P4/Lite/DTD/teixlite.dtd
Example of schema: http://www.tei-c.org/Vault/P5/1.7.0/xml/tei/custom/schema/relaxng/tei_all.rng
There is also a basic structure to an XML document that must be followed. At its most basic, an XML document is made of two parts: the prolog and the document element.
The prolog normally begins with the XML declaration, which you’ve already seen. When it is present, it must be at the very top of the XML document: &lt;?xml version=&quot;1.0&quot;?&gt;
The prolog is also where a reference to a DTD or schema will be located. Some documents omit the XML declaration and begin directly with the reference to the DTD. The reference to a DTD in the prolog is unfortunately called a document type declaration, not to be confused with the file it refers to, which is the document type definition (DTD). The prolog is also the place where any entities are declared.
The document element is the root element that wraps the whole document
The TEI entity would be used if you had to write “Text Encoding Initiative” many times in your document and wanted to use a short placeholder instead.
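Here is a small sketch of the two-part structure, assuming a document whose prolog holds the XML declaration and a document type declaration with one internal entity. The sample sentence is invented; the `TEI` entity and the `text` root element follow the examples in these notes.

```python
import xml.etree.ElementTree as ET

# Prolog: XML declaration, then a document type declaration that
# declares the TEI entity. Document element: <text>.
document = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE text [\n'
    '  <!ENTITY TEI "Text Encoding Initiative">\n'
    ']>\n'
    '<text>The &TEI; publishes guidelines.</text>'
)

root = ET.fromstring(document)

# The root (document) element wraps the whole document.
print(root.tag)

# The parser has replaced the &TEI; reference with the declared text.
print(root.text)
```

Changing the entity declaration once in the prolog changes every place where `&TEI;` is referenced.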
With the prolog and document element, we have the basic outline of an XML document—what we need now are the basic building blocks
Fundamentally, you have tags on the one hand and the content you want to encode on the other.
To this point I’ve talked about XML “tags,” and you’ll often hear people refer to them. This generally includes any markup at the level of the root element and above. But “tags” is loose jargon, not really official terminology.
More formally, what is meant by XML “tags” are XML elements and attributes.
For content that can be represented in ASCII text, there is no problem. But for anything else—special characters, or any non-text data you want to refer to (such as jpgs, movie or audio files)—you need a placeholder in the XML to refer to the content outside the XML file. That role is filled in XML by entities.
There are also 3 types of data that go in an XML document that we will discuss later.
XML tags describe the content they contain—they say what the content is—and the way they do this is with XML elements.
From our example before, I used the root element &lt;text&gt; to denote that the resource we are encoding is a text.
The example we’re working with came from a book with a table of contents, so we might want to denote that as &lt;front&gt; matter and describe the chapters and page numbers
But a general description is often not precise enough, and requires further clarification or elaboration. There may be several different kinds of texts in this one resource – poems, narratives, etc. We can expand on the &lt;text&gt; element by including attributes.
Elements and attributes are broadly analogous to nouns and adjectives in grammar: an element describes a general conceptual category, while an attribute gives further information describing the content. The element name always appears first inside the angle brackets (&lt;&gt;) that identify text as markup, while attributes follow it. Attributes, in addition, must have an assigned value.
Here, the attribute is “type” and the value of the attribute is “third.” Note that you don’t have closing tags for the attribute: an attribute is part of the element, so only the element closes.
All attribute values must be in double or single quotation marks. This is one of the basic rules of XML. Double quotes are probably the most common, but if the value itself contains double quotes, as in someone’s nickname, the attribute value can be placed in single quotes.
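The quoting rule can be tried out with a short sketch. The element names and values below (`type="poem"`, the `nickname` attribute) are illustrative assumptions, not taken from the exercise file.

```python
import xml.etree.ElementTree as ET

# Double quotes are the common case.
double_quoted = ET.fromstring('<text type="poem">...</text>')

# Single quotes let the value itself contain double quotes.
single_quoted = ET.fromstring("<person nickname='The \"Boss\"'>Bruce</person>")

print(double_quoted.get("type"))
print(single_quoted.get("nickname"))

# An unquoted attribute value breaks well-formedness.
try:
    ET.fromstring("<text type=poem>...</text>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False

print(unquoted_ok)
```

Note that only the element has a closing tag; the attribute lives entirely inside the opening tag.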
Give a few minutes for exercise and add attributes to my file
Show well-formedness in oXygen
As mentioned above, general entities function as placeholders for any data that is beyond the ASCII character set, including special characters, characters from non-Roman alphabets, and multimedia formats.
But since it serves as a placeholder, it can also serve as a way to store repeated text (either in the XML prolog, or in a separate file) to reference in your XML file, as a way to save on re-typing.
With the exception of a few pre-defined XML entities and Unicode character references, entities must be declared in the prolog before they can be referenced in the document element (where your content goes).
Once an entity is declared, it can be referenced in the document. An entity reference starts with an ampersand (&) and ends with a semicolon (;).
Just as in HTML, ampersands and angle brackets have to be encoded as entities in order to have well-formed XML. These are reserved characters that have special meaning in XML.
The first example is incorrect. The second has the greater than sign encoded as the entity &gt;. Since gt stands for “greater than,” you can probably guess what the entity for “less than” is.
This is one of the key concepts of XML, so we’ll add it to our list
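A quick sketch of the reserved-character rule, using the standard library's `escape` helper to produce the pre-defined entities. The sample sentence is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Fish & chips < 5 pounds"

# escape() replaces &, < and > with their pre-defined entity references.
escaped = escape(raw)
print(escaped)

# With the entities in place the document parses, and the parser
# hands back the original characters.
parsed_text = ET.fromstring(f"<p>{escaped}</p>").text
print(parsed_text)

# Leaving the reserved characters as they are is not well-formed.
try:
    ET.fromstring(f"<p>{raw}</p>")
    raw_ok = True
except ET.ParseError:
    raw_ok = False

print(raw_ok)
```

In practice an XML editor or library usually does this escaping for you when it writes out text content.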
CDATA is not parsed by the XML parser
Some text, like JavaScript, contains a lot of illegal characters like brackets and ampersands
If for some reason you had a snippet of JavaScript code in your XML, you would want to put it in a CDATA section so it is ignored by the XML parser
A CDATA section starts with &lt;![CDATA[ and ends with ]]&gt;
PCDATA (parsed character data) is the rest of the text in your document: the parser reads it and acts on any markup it finds, such as elements and entity references.
NDATA marks unparsed entities, which refer to images and other media files.
Example: &lt;!ENTITY pic SYSTEM &quot;http://www.w3schools.com/picture.jpg&quot; NDATA JPEG&gt;
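Coming back to CDATA sections, the following sketch shows the effect of wrapping a code snippet in one. The JavaScript-like snippet and the `script` element name are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# A snippet full of characters that are reserved in XML.
snippet = "if (a < b && ready) { run(); }"

# Inside a CDATA section the parser does not interpret < or &.
with_cdata = f"<script><![CDATA[{snippet}]]></script>"
root = ET.fromstring(with_cdata)
print(root.text)

# Without the CDATA section the same text is not well-formed.
try:
    ET.fromstring(f"<script>{snippet}</script>")
    plain_ok = True
except ET.ParseError:
    plain_ok = False

print(plain_ok)
```

The only thing a CDATA section cannot contain is its own closing sequence, `]]>`.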
Open Fanny Lewald file in oXygen and Window &gt; Reset Layout if necessary
Point out Project window, Outline window, completion of elements and attributes, content completion
Show error window
Show creating a new file from a template
Show Tools &gt; Compare Files with two Lewald files
Show Project and Add files, Find/Replace across project files
Well formedness exercise
Open Sleepy Hollow in oXygen and check for well-formedness
Errors to find: title type value needs to be in quotes, h2 not nested properly, P tags in different cases, no end tag for blockquote
Show transformation to HTML if time
Use file-to-transform.xml and xml2html.xsl