Using UML to define XML document types
  • 1. Using UML To Define XML Document Types W. Eliot Kimber John D. Heintz
  • 2. Agenda
    • Problem Definition
      • What we are and are not building
      • General system and document modeling
      • Modeling information structures (e.g., DTDs)
      • Difficult to integrate DTDs with rest of system model
    • Solution
      • Catalysis-style refinement
      • DTD models as implementation refinement of abstract business objects
      • Types and Stereotypes
    • Simple Example
    • Summary
    • Future Work
  • 3. Transcend Syntax
  • 4. Problem Definition How do we integrate traditional system engineering modeling practice with traditional SGML and XML document analysis and modeling?
  • 5. What Are We Building
    • We build standards-based information management systems, primarily SGML and XML-based
    • E.g., documentation authoring, production, and delivery
    • Often integrated with other core business processes:
      • Product engineering
      • Marketing support
      • Legislation
    • XML-based data is primary work product of system users
  • 6. Typical System Requirements
    • Must support many document types:
      • Reflect complex (and often arcane) business rules
      • Reflect distinct cultures and practices of authors
      • Form families of related document types
      • May need to integrate with industry standards (ATA 2100, DocBook, etc.)
    • Tens or hundreds of thousands of individual documents
    • Hyperlinking and use by reference
    • Must integrate with other information systems and business processes
    • Multiple outputs from a single source: print, HTML, etc.
    • Long life cycle documents (20-100+ years)
  • 7. What We Are Not Building
    • Not using XML just for simple object marshalling
    • Not using XML just for messaging among system components
  • 8. System and Document Modeling
    • Want to use UML-based data and object modeling to define our systems
    • Traditional document analysis does not use formal data modeling
    • Impedance mismatch similar to storing a program in a relational database
    • Two ways to solve:
      • Define mapping mechanism from DTDs to object models
      • Define mapping mechanism from object models to DTDs
  • 9. Mapping from DTDs to Object Models
    • This approach is problematic:
      • DTDs are not really data models…
      • … only weak syntax constraints
      • DTDs provide no way to capture abstraction across models
      • DTDs are implementation views of some higher-level abstraction
      • May be many ways to interpret a given XML structure as objects
      • May need multiple related DTDs for the same business object
        • Authoring vs. delivery, different languages or cultures, etc.
  • 10. Practical Difficulties of DTD Development
    • Tools for developing DTDs not integrated with other system design tools
    • No standard graphical representation
    • Difficult to engineer system objects and models from DTDs
    • Difficult to integrate DTD documentation with DTD definition
    • DTDs are not modular, making management of related DTDs difficult…
    • … DTDs are not inherently shareable.
  • 11. We Had To Reject Traditional Approach
    • We wanted to apply formal system modeling to XML-based systems
    • With focus on DTDs, XML document type components could be bound to implementation definition only through documentation strings
    • No automated traceability from requirements to XML rules:
      • Difficult to define relationships among XML data and related code objects
      • Difficult to define relationships among different XML components (architectures provide some but not all)
    • Difficult to capture re-usable XML parts of designs
    • DTD documentation became unmanageable
  • 12. Solution: Map Object Models to XML Document Types Where we realize that DTDs are just implementation refinements of higher-level abstractions
  • 13. Mapping Objects to Document Type Definitions
    • DTD becomes implementation view of higher-level abstractions…
    • … Design focus is on business objects not data representation details
    • Traceability from system objects and formal requirements to DTD implementation
    • Can use facilities of modeling language not available in DTDs
    • Can bind documentation directly to model
    • Can use formal constraints to define semantic and syntactic constraints
  • 14. Refinement: Relating Layers of Abstraction
    • A complete system model will have several layers of abstraction:
      • High-level system model
      • Functional requirements model
      • Implementation design model
    • Objects in one level will be reflected in other levels, but not necessarily directly
    • Design traceability requires formal mapping from objects in one level to objects in the adjacent models
    • Any number of implementation models can refine a given functional model
  • 15. Refinement from High to Low Abstraction
    • Abstract System Design: A, B
    • System Implementation 1: A, B, C, D (Refinement: A->A,C,D; B->B)
    • System Implementation 2: B, E, F (Refinement: A->E,F; B->B)
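The refinement mappings on slide 15 can be written down as data, which is one way to make design traceability checkable by a program. The following is a minimal sketch, not the authors' tooling: the object labels A to F come from the slide, while the dictionary layout and the function names `untraced` and `trace_back` are hypothetical.

```python
# Objects in the abstract system design (labels taken from slide 15).
ABSTRACT = {"A", "B"}

# Each implementation model maps every abstract object to the
# implementation objects that refine it (layout is an assumption).
REFINEMENTS = {
    "System Implementation 1": {"A": {"A", "C", "D"}, "B": {"B"}},
    "System Implementation 2": {"A": {"E", "F"}, "B": {"B"}},
}


def untraced(abstract, refinement):
    """Return the abstract objects that no implementation object refines."""
    return {obj for obj in abstract if not refinement.get(obj)}


def trace_back(refinement, impl_obj):
    """Return the abstract objects that a given implementation object refines."""
    return {a for a, impls in refinement.items() if impl_obj in impls}


if __name__ == "__main__":
    for name, refinement in REFINEMENTS.items():
        print(name, "untraced:", untraced(ABSTRACT, refinement))
    print(trace_back(REFINEMENTS["System Implementation 2"], "E"))
```

Both implementation models refine the same abstract design, so the same check applies to each of them independently.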
  • 16. Problem: How To Bind Types to XML Syntax?
    • UML data models define types
    • Must have formal, computer-sensible way to map UML types to XML DTD syntactic constructs: elements, attributes, content models, notations
    • A fixed mapping from UML graphical components to XML components won’t work:
      • Some types will be element types
      • Some types will be attributes
      • Some types will be notations
    • No direct analog of content models in UML language
  • 17. Solution: UML Stereotypes
    • Stereotypes characterize UML syntactic components to add specialized semantics:
    • UML does not define how a set of stereotypes is formally defined
    • Example: <<element>> Book, <<attribute>> Author
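To illustrate how a stereotype can select the XML construct that a UML type maps to, here is a minimal sketch using the Book and Author example from slide 17. The class `StereotypedType`, the function `to_dtd`, the choice of `ANY` and `CDATA #IMPLIED`, and the assumption that Author is an attribute of Book are all hypothetical; the slides do not describe a generator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StereotypedType:
    """A UML type tagged with a stereotype (hypothetical representation)."""
    stereotype: str               # "element" or "attribute"
    name: str
    owner: Optional[str] = None   # element that carries an attribute


def to_dtd(t: StereotypedType) -> str:
    """Pick the DTD declaration form from the stereotype, not from the UML shape."""
    if t.stereotype == "element":
        # Content model ANY is a placeholder assumption.
        return f"<!ELEMENT {t.name} ANY>"
    if t.stereotype == "attribute":
        if t.owner is None:
            raise ValueError(f"attribute {t.name} needs an owning element")
        # CDATA #IMPLIED is a placeholder assumption.
        return f"<!ATTLIST {t.owner} {t.name} CDATA #IMPLIED>"
    raise ValueError(f"unsupported stereotype: {t.stereotype}")


if __name__ == "__main__":
    model = [
        StereotypedType("element", "Book"),
        StereotypedType("attribute", "Author", owner="Book"),
    ]
    for t in model:
        print(to_dtd(t))
```

The two model entries have the same shape; only the stereotype decides whether the output is an element or an attribute declaration, which is the point slide 16 makes about a fixed mapping not working.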
  • 18. Components of Our Solution
    • We had to define the set of stereotypes needed to enable mapping to XML DTD syntax
    • Had to define the semantics of those stereotypes:
      • Defined the stereotypes as UML types in their own package
      • Formal constraints on these types plus prose documentation defines the semantics
      • The XML stereotypes in turn map back to the formal models for XML as defined by ISO and/or W3C
    • The stereotypes reflect the abstract model for XML DTD declarations (element type, attribute, notation, etc.)…
    • … therefore, can map to any XML DTD representation syntax (markup declarations, XML Schema, etc.)
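The claim that one abstract declaration model can be rendered in more than one representation syntax can be sketched as follows. This is an illustration under stated assumptions, not the authors' generator: the dictionary `DECL`, the element name `title`, the attribute name `lang`, and the functions `as_dtd` and `as_xsd` are hypothetical, and the example covers only a text-only element with optional string attributes.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical abstract declaration: one element type with text content
# and a list of optional attributes.
DECL = {"element": "title", "attributes": ["lang"]}


def as_dtd(decl):
    """Render the declaration as DTD markup declarations."""
    lines = [f"<!ELEMENT {decl['element']} (#PCDATA)>"]
    for attr in decl["attributes"]:
        lines.append(f"<!ATTLIST {decl['element']} {attr} CDATA #IMPLIED>")
    return "\n".join(lines)


def as_xsd(decl):
    """Render the same declaration as an XML Schema document."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    element = ET.SubElement(schema, f"{{{XS}}}element", name=decl["element"])
    ctype = ET.SubElement(element, f"{{{XS}}}complexType")
    content = ET.SubElement(ctype, f"{{{XS}}}simpleContent")
    ext = ET.SubElement(content, f"{{{XS}}}extension", base="xs:string")
    for attr in decl["attributes"]:
        # Attributes are optional by default in XML Schema, matching #IMPLIED.
        ET.SubElement(ext, f"{{{XS}}}attribute", name=attr, type="xs:string")
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    print(as_dtd(DECL))
    print(as_xsd(DECL))
```

Both functions read the same `DECL` value, so neither output syntax is the source of truth; the abstract declaration is.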
  • 19. Document Analysis Produces Business Object Design
    • Document analysis now results in document business object models
    • For us, document analysis is part of the larger system analysis task…
    • …documents are just another kind of business object…
    • …may or may not be represented in XML in implementation.
  • 20. Document Analysis (cont)
    • Focus of document analysis stays on business requirements, not syntax details
    • Document analysis results in abstract information model for business objects that are documents (in the everyday sense)
    • From this model, multiple implementations (“DTDs”) may be refined
    • XML syntax details defined as part of implementation task, not system analysis task
    • Can have multiple XML or non-XML implementations of same document business objects with full design traceability
  • 21. Simple Example A trivial but representative example of an abstract information model and an implementation refinement
  • 22. Document Business Object Model
  • 23. XML Implementation Model Top Level
  • 24. <P> Element Type
  • 25. Re-Use of OASIS Table Model
  • 26. Full DTD View
  • 27. Summary
  • 28. Benefits for System Design and Implementation
    • Offers traceability from abstract system modeling to XML implementation
    • Offers rich set of features for managing DTD definitions:
      • Provides modularity through UML packages
      • Provides all of UML’s typing to XML components
      • Provides formal syntactic and semantic constraints through UML object constraint language (or equivalent)
    • Focus of document analysis is on business objects, not on implementation technology or representation syntax
    • Same business models can be refined into XML, CORBA IDL, Java objects, RDBMS tables, etc.
  • 29. Benefits for XML Practitioner
    • XML design completely integrated into larger system design
    • Can use existing tools to develop and maintain DTD definition (e.g., Rose, ObjectDomain)
    • Provides design documentation in form easily understood by implementors
    • Get graphical representations of DTDs for free
    • Documentation can be bound directly to model
    • Elevates XML to first-class citizen in system design
    • No need to choose a particular DTD representation syntax (e.g., DTD declarations vs. XML Schema)…
    • … both are simply generated from UML model
  • 30. Work to Be Done
    • Implement DTD syntax output generators
    • Better understand how UML packages, Catalysis refinement, OO frameworks, and SGML architectures interact
    • Understand how to map these models to groves at different levels of abstraction (e.g., groves that reflect the business object model, not the XML syntax model)
    • Expand model to include hyperlink representation
    • Apply approach in practice
  • 31. Contact Info
    • W. Eliot Kimber [email_address]
    • John Heintz [email_address]