Developing an STM DTD/Schema: Strategic design choices

Alexander (‘Sasha’) Schwarzman, AGU (sschwarzman@agu.org)
Extreme Markup Languages 2006, Montréal, Canada
August 7 – 11, 2006

Requirements
 Does an agreed-upon Requirements document exist? (Get one!)

 What is your XML’s role?
 Archival copy-of-record (preserving scientific content)?

 Means of producing a pretty PDF?

 Both?

 Much more?

Architecture
 When during production is XML created? How is accuracy checked at each stage?
 Dummy empty elements for not-yet-assigned metadata plus use of configurable
production-stage-specific Business Rules Checker / Validator / QC Tool?

 Multiple DTDs: a separate one for each production stage?

 XML “layering”: What “layer” to use for enforcing editorial style and business
rules?
 DTD / parser?

 Validator / Schematron?

 Human editors?

 Revisable unit (what is the elemental unit?)
 Article?

 Issue?

 Arbitrary / cross-journal article collection?

 Volume / year?

 Journal?

 More than one of these?
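
The idea of placeholder elements combined with a production-stage-specific business rules checker can be sketched roughly as follows. This is only an illustration: the element names (`title`, `doi`, `volume`, `issue`), the stage names, the sample DOI, and the `check_stage` helper are all hypothetical and are not taken from any actual production system.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: elements that must be filled in by each production stage.
REQUIRED_BY_STAGE = {
    "copyedit": ["title"],
    "issue-assembly": ["title", "doi"],
    "archive": ["title", "doi", "volume", "issue"],
}

def check_stage(xml_text, stage):
    """Return a list of problems for the given production stage."""
    root = ET.fromstring(xml_text)
    problems = []
    for name in REQUIRED_BY_STAGE[stage]:
        elem = root.find(f".//{name}")
        if elem is None:
            problems.append(f"missing <{name}>")
        elif not (elem.text or "").strip():
            # The element exists but is still a dummy placeholder.
            problems.append(f"<{name}> is still an empty placeholder")
    return problems

# Sample article: volume and issue are not yet assigned (made-up content).
ARTICLE = """<article>
  <meta>
    <title>Sample article</title>
    <doi>10.1000/example</doi>
    <volume/>
    <issue/>
  </meta>
</article>"""

if __name__ == "__main__":
    for stage in REQUIRED_BY_STAGE:
        print(stage, check_stage(ARTICLE, stage))
```

With this arrangement a single permissive DTD can stay valid throughout production, while the configurable rule table decides what "complete" means at each stage. Whether that is preferable to a separate DTD per stage is exactly the choice the list above leaves open.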

Scope
 For what material?
 Current?

 Future-only?

 Legacy?

 All of the above or some combination?

 What is the extent of an article / book?
 Does it include supplementary material, like datasets and computable spreadsheets?

 Do you model “extra stuff” as just another structured section or is it something different?

 Special links (“related links”) section?


Modeling Language Choices
 Which constraint language is primary?
 DTD?

 XSD?

 RELAX NG?

 How many DTDs / schemas (purpose of each)?
 Authoring?

 Conversion / Transformation?

 Production?

 Archiving?

 Separate or shared: If your content includes journal article, newspaper article, book
chapter, book, case study, lecture notes, etc., should you use:
 Distinct DTD / schema for each?

 A large shared structure?

 A DTD / schema suite with common modules?

 “Off-the-shelf, Altered-to-fit, or Bespoke?” (T. Usdin)
 If altered, what public model?

 “compatible with” or “informed by” (subset or superset)?

 If bespoke, do you use any public models at all (for tables and math, for instance)?
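
For the "altered-to-fit" case, one rough way to see how far a customized DTD has drifted from its public model is to compare the element names each one declares. The sketch below assumes both DTDs are available as text with parameter entities already expanded; the sample declarations and the `index-term` element are invented for illustration. It compares names only, so it says nothing about content models or attributes and cannot by itself establish that a model is a true subset.

```python
import re

# Matches the element name in declarations such as <!ELEMENT sec (title, p+)>.
# Simplification: assumes parameter entities have already been expanded.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([A-Za-z_:][\w.:\-]*)")

def declared_elements(dtd_text):
    """Return the set of element names declared in a DTD fragment."""
    return set(ELEMENT_DECL.findall(dtd_text))

def extra_elements(custom_dtd, public_dtd):
    """Element names the customized DTD declares that the public model lacks."""
    return sorted(declared_elements(custom_dtd) - declared_elements(public_dtd))

# Made-up fragments standing in for a public model and a local customization.
PUBLIC_DTD = """
<!ELEMENT article (front, body)>
<!ELEMENT front (#PCDATA)>
<!ELEMENT body (sec+)>
<!ELEMENT sec (p+)>
<!ELEMENT p (#PCDATA)>
"""

CUSTOM_DTD = """
<!ELEMENT article (front, body)>
<!ELEMENT front (#PCDATA | index-term)*>
<!ELEMENT body (p+)>
<!ELEMENT p (#PCDATA)>
<!ELEMENT index-term (#PCDATA)>
"""

if __name__ == "__main__":
    print(extra_elements(CUSTOM_DTD, PUBLIC_DTD))
```

An empty result would suggest the customization stays within the public vocabulary at the name level; any names listed are additions that documents conforming to the public model alone would not contain.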

Modeling Design Choices
 “Prussian” or “Californian”: prescriptive or descriptive? Flexible or enforcing?

 Generated or Explicit text? (depends on XML’s role)
 Preserve generation / rendition rules?

 Different approach for text and bibliographic references?

 How to model bibliographic references?
 Mixed content?

 Genre-specific “strict models” (with an escape hatch provided)?

 “Tag abuse” tolerance?

 How to reference non-XML components, e.g., figures, in XML?
 By an ID that maps to a set of multiple images in an archive?

 By naming a specific file from the set? Which one is “the mother of all images”?

 Which components to store / migrate? Is “storing cheaper than thinking”? (D. Lapeyre)

 How to model math?
 MathML presentation versus content (computation)?
 How to ensure that the same math symbols look identical in different browsers (the same Unicode
code points can render differently in various browsers, e.g., epsilon and varepsilon)?

 LaTeX plus GIFs?
 How to ensure that special characters that occur both in a displayed formula and inline look
identical?

 Just GIFs?

 “Just because you can, doesn’t mean you should” (D. Lapeyre)
 The lure of modeling for its own sake. Simple models are easier to maintain over time.
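
On the epsilon / varepsilon question above, part of the problem can at least be made visible on the markup side by auditing which code points the XML actually stores. The sketch below lists each non-ASCII character in a string with its code point and Unicode name. The sample strings and the `audit_symbols` helper are hypothetical, and the check does nothing about how a given browser font draws a code point.

```python
import unicodedata

def audit_symbols(text):
    """List each distinct non-ASCII character with its code point and name."""
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")) for ch in seen]

# Made-up text content: an inline mention and a displayed formula that were
# meant to use the same epsilon but were keyed with different characters.
INLINE = "for small \u03b5"
DISPLAY = "\u03f5 \u2192 0"

if __name__ == "__main__":
    print(audit_symbols(INLINE))   # GREEK SMALL LETTER EPSILON
    print(audit_symbols(DISPLAY))  # GREEK LUNATE EPSILON SYMBOL, RIGHTWARDS ARROW
```

A report like this could show that an inline symbol and its displayed counterpart are stored as different characters, which is a prerequisite for deciding which one the model should treat as canonical.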
