Day Of Dot Net Ann Arbor 2007
My presentation on Office 2007/OpenXML file formats

Presentation Transcript

  • Creating Office Documents with Open XML
    • David Truxall, Ph.D., Principal Consultant, NuSoft Solutions
  • Agenda
    • Overview
    • System.IO.Packaging
    • Building Documents with .NET
  • Open XML
    • An Ecma standard that describes a family of XML schemas
    • Defines the XML vocabularies for word-processing, spreadsheet, and presentation documents
    • Defines the packaging of documents that conform to these schemas
  • Features of Office Open XML
  • Support for Open XML
    • iPhone
    • iWork
    • Microsoft Office
    • OpenOffice
    • Gnumeric
    • WordPerfect
    • Palm OS
    • NeoOffice
    • PHP
    • Java
    • Monarch v.9.0
    • OpenXML Writer
    • Word Counter 2.2.1
    • Altsoft XML2PDF
    • MindMapping
    • XmlSpy
  • Open XML Format Architecture
    • User view: a single Office file
    • Developer view: a modular file
    • Document Parts
    • Most parts are XML
    • Each XML part is a discrete component
    • Individual parts can be added, extracted, and modified without using Office programs
    • Corruption of one part does not necessarily prevent the file from opening
    [Diagram labels: File Container, Document Properties, Comments, WordML / SpreadsheetML, Custom XML, Embedded Code, Images / Video / Sound]
  • Open Packaging Organization
    • Package – The container (a ZIP archive)
    • Document Parts – The files inside the container
    • Relationships – Every part that references other parts does so via a relationship
    [Diagram labels: Document Properties, Application Properties, Custom Properties, Workbook, Sheet 1, Sheet 2, Sheet 3, Strings, Theme]
  • Exploring the Document Package
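Because the package is an ordinary ZIP archive, any ZIP library can list its parts. The following is a minimal sketch in Python (not the .NET API this talk covers); the part names and the placeholder XML bodies are assumptions chosen only to mimic a workbook layout.

```python
import io
import zipfile


def build_sample_package() -> bytes:
    """Create a tiny in-memory ZIP laid out like an Open XML package.

    The XML bodies are placeholders, not valid Open XML content.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("xl/workbook.xml", "<workbook/>")
        package.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    return buffer.getvalue()


def list_parts(package_bytes: bytes) -> list:
    """Return the names of all entries in the package, in archive order."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()


parts = list_parts(build_sample_package())
for name in parts:
    print(name)
```

The same `list_parts` call should work on the bytes of a real .docx, .xlsx, or .pptx file, since those are ZIP archives as well.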
  • Reference Schemas
    • XML Reference Schemas
      • 80+ that make up the standard
        • Display oriented
        • Document format
    • Custom Schemas
      • Specific to your business
        • Data oriented
        • Business information
  • Custom XML Content
    • Enables interoperability with other systems
      • Documents can provide a rich view of back-end data sources
      • Documents can update back-end data sources
    • Exposes business data in Open XML documents
      • Heterogeneous systems can easily read data from documents
      • Business-specific semantics can be applied to document data
    • Separates presentation and data
      • Simplified programming model for all of the above
    • Custom XML schema support was a key design objective for Open XML: any schema can be used in Open XML documents.
  • System.IO.Packaging
    • Part of Windows Presentation Foundation
    • Installed with .NET 3.0
    • Requires .NET 2.0 Runtime
    • Enables package manipulation for
      • Office Open XML File Formats
      • XML Paper Specification Files
      • Any Open Packaging Convention files
  • The Package
    • Package Class
    • Provides methods to create, enumerate and delete the following entities:
      • Package
      • Package Properties
      • PackageRelationships
      • PackageParts
    [Diagram labels: Common Package Parts (Package Relationships, Core Properties, Digital Signatures); Specific Format Parts (Office Document, Part Relationships, XML Part, XML Part, Part Rels, Etc …)]
  • The PackagePart
    • A PackagePart is a unit of data stored within the Package
    • It provides support to create, enumerate and delete part relationships
    • Get data as a System.IO.Stream
    • PackagePart properties:
      • CompressionOption
      • ContentType
      • Package
      • Uri
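A part's content type is recorded in the package's `[Content_Types].xml`, either as an `Override` for one part name or as a `Default` for a file extension. The sketch below, in Python rather than .NET, shows one way to look it up; the sample XML and the helper name `content_type_for` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Sample [Content_Types].xml for a one-sheet workbook (illustrative).
CONTENT_TYPES_XML = """
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels"
           ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Override PartName="/xl/workbook.xml"
            ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
</Types>
"""


def content_type_for(part_name: str, content_types_xml: str) -> Optional[str]:
    """Return the content type declared for a part, or None if undeclared."""
    root = ET.fromstring(content_types_xml)
    # An Override for the exact part name takes precedence.
    for override in root.findall(CT_NS + "Override"):
        if override.get("PartName", "").lower() == part_name.lower():
            return override.get("ContentType")
    # Otherwise fall back to the Default registered for the extension.
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in root.findall(CT_NS + "Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


print(content_type_for("/xl/workbook.xml", CONTENT_TYPES_XML))
print(content_type_for("/_rels/.rels", CONTENT_TYPES_XML))
```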
  • PackageRelationship
    • Relationships are required to find parts, because part names are not guaranteed
    • Iterate through a PackageRelationshipCollection, filtered by relationship type, or look up a single relationship by ID
    • Relationship Properties
      • Id
      • Package
      • RelationshipType
      • SourceUri
      • TargetMode
      • TargetUri
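A relationships part is a small XML file, so the lookup by type or by ID can be illustrated without the .NET classes. The following Python sketch parses a sample `.rels` body; the sample relationships and the helper names (`parse_relationships`, `by_type`, `by_id`) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
OFFICE_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Sample contents of xl/_rels/workbook.xml.rels (illustrative).
WORKBOOK_RELS = f"""
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="{OFFICE_REL}/worksheet" Target="worksheets/sheet1.xml"/>
  <Relationship Id="rId2" Type="{OFFICE_REL}/worksheet" Target="worksheets/sheet2.xml"/>
  <Relationship Id="rId3" Type="{OFFICE_REL}/styles" Target="styles.xml"/>
</Relationships>
"""


def parse_relationships(rels_xml: str) -> list:
    """Return one dict per Relationship element."""
    root = ET.fromstring(rels_xml)
    return [
        {
            "id": rel.get("Id"),
            "type": rel.get("Type"),
            "target": rel.get("Target"),
            # TargetMode is optional and defaults to Internal.
            "target_mode": rel.get("TargetMode", "Internal"),
        }
        for rel in root.findall(REL_NS + "Relationship")
    ]


def by_type(relationships: list, rel_type: str) -> list:
    """Filter relationships by their type URI."""
    return [rel for rel in relationships if rel["type"] == rel_type]


def by_id(relationships: list, rel_id: str):
    """Return the relationship with the given ID, or None."""
    return next((rel for rel in relationships if rel["id"] == rel_id), None)


relationships = parse_relationships(WORKBOOK_RELS)
sheet_targets = [rel["target"] for rel in by_type(relationships, OFFICE_REL + "/worksheet")]
print(sheet_targets)
```

Finding worksheets by relationship type rather than by file name is what keeps the code working when a producer names its parts differently.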
  • Package Uri Helper
    • Find a related PackagePart by searching relationships, either by relationship type or relationship ID
      • This returns a list of PackageRelationship objects
    • A PackageRelationship defines two relative URIs
      • Source URI, pointing to the source PackagePart
      • Target URI, pointing to the target PackagePart
    • Retrieve a PackagePart by using a URI relative to the root of the Package
      • Translation of Source and Target URIs is required
      • Use the PackUriHelper class to aid in the translation
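The URI translation described above amounts to resolving a relative target against the folder of the source part. Here is a rough Python sketch of that idea; it is not the PackUriHelper API, and the function names are hypothetical.

```python
import posixpath


def source_part_for_rels(rels_name: str) -> str:
    """Map a relationships part name to the part that owns it.

    '/xl/_rels/workbook.xml.rels' -> '/xl/workbook.xml'
    '/_rels/.rels'                -> '/' (the package root)
    """
    folder, file_name = posixpath.split(rels_name)
    parent = posixpath.dirname(folder)      # strip the '_rels' folder
    owner = file_name[: -len(".rels")]      # strip the '.rels' suffix
    return posixpath.join(parent, owner) if owner else "/"


def resolve_target(source_part: str, target: str) -> str:
    """Resolve a relationship Target to a name relative to the package root."""
    if target.startswith("/"):
        return posixpath.normpath(target)   # already root-relative
    base = source_part if source_part.endswith("/") else posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(base, target))


source = source_part_for_rels("/xl/_rels/workbook.xml.rels")
print(source)
print(resolve_target(source, "worksheets/sheet1.xml"))
print(resolve_target("/xl/worksheets/sheet1.xml", "../tables/table1.xml"))
```

This sketch handles internal targets only; an external target (TargetMode="External") points outside the package and should not be resolved this way.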
  • System.IO.Packaging
  • SpreadsheetML
    • Workbooks, Worksheets
    • Rows, Columns, Values
    • Formulas
    [Diagram labels: Workbook, properties, table, chart, styles, calcChain, sharedStrings, sheet1..N, drawing]
  • The Minimal xlsx
    • Required: workbook.xml , the document “start part”
    • Required: at least one sheet, worksheet.xml
    • Required: one relationship part ( .rels )
      • Must be in a _rels folder
    • Required: [Content_Types].xml
      • Required part for all Open XML documents
      • Three content types must be defined:
        • SpreadsheetML main document (for the start part)
        • Worksheet
        • Package relationships (for the required relationships)
    • Everything else is optional
      • Worksheet <sheetData> is required, but may be empty
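The required parts listed above can be assembled with any ZIP writer. The sketch below uses Python's standard library instead of System.IO.Packaging. One assumption goes beyond the slide's wording: it writes two relationship parts, a package-level `_rels/.rels` that points at the workbook and a workbook-level one that points at the sheet. The part names under `xl/` are a common convention, not a requirement.

```python
import io
import zipfile

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
SML_TYPE = "application/vnd.openxmlformats-officedocument.spreadsheetml"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'

PARTS = {
    # Content types: relationships, main document (start part), worksheet.
    "[Content_Types].xml": (
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        f'<Override PartName="/xl/workbook.xml" ContentType="{SML_TYPE}.sheet.main+xml"/>'
        f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{SML_TYPE}.worksheet+xml"/>'
        "</Types>"
    ),
    # Package-level relationship: identifies the start part.
    "_rels/.rels": (
        f'<Relationships xmlns="{PKG_REL_NS}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}/officeDocument" Target="xl/workbook.xml"/>'
        "</Relationships>"
    ),
    "xl/workbook.xml": (
        f'<workbook xmlns="{MAIN_NS}" xmlns:r="{DOC_REL}">'
        '<sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>'
        "</workbook>"
    ),
    # Workbook-level relationship: resolves r:id="rId1" to the sheet part.
    "xl/_rels/workbook.xml.rels": (
        f'<Relationships xmlns="{PKG_REL_NS}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}/worksheet" Target="worksheets/sheet1.xml"/>'
        "</Relationships>"
    ),
    # sheetData is required but may be empty.
    "xl/worksheets/sheet1.xml": f'<worksheet xmlns="{MAIN_NS}"><sheetData/></worksheet>',
}


def build_minimal_xlsx() -> bytes:
    """Return the bytes of a package containing only the parts above."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, body in PARTS.items():
            package.writestr(name, XML_DECL + body)
    return buffer.getvalue()


xlsx_bytes = build_minimal_xlsx()
# To inspect the result in a spreadsheet application, write it to disk:
# with open("minimal.xlsx", "wb") as handle:
#     handle.write(xlsx_bytes)
```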
  • SpreadsheetML Tables
    • SpreadsheetML tables provide structure and formatting for worksheet information
    • Separation of presentation and data:
      • Data stays in the worksheet
      • Table definition in separate part (implicit relationship)
    • Open XML has different types of tables for each document type, optimized for different scenarios:
      • WordprocessingML has its tbl element
      • SpreadsheetML has its table element
      • PresentationML uses DrawingML tables ( tbl inside graphicData )
  • SpreadsheetML Table
    • Headings = shared strings
    • Worksheet (sheet1.xml):
      <sheetData>
        <row r="1" spans="1:2">
          <c r="A1" t="s"><v>0</v></c>
          <c r="B1" t="s"><v>1</v></c>
        </row>
        <row r="2" spans="1:2">
          <c r="A2"><v>1</v></c>
          <c r="B2"><v>4</v></c>
        </row>
        <row r="3" spans="1:2">
          <c r="A3"><v>2</v></c>
          <c r="B3"><v>5</v></c>
        </row>
        <row r="4" spans="1:2">
          <c r="A4"><v>3</v></c>
          <c r="B4"><v>6</v></c>
        </row>
      </sheetData>
      ...
      <tableParts count="1">
        <tablePart r:id="rId2"/>
      </tableParts>
    • Table definition (table1.xml):
      <table … ref="A1:B4" …>
        <autoFilter ref="A1:B4"/>
        <tableColumns count="2">
          <tableColumn id="1" name="Column1" />
          <tableColumn id="2" name="Column2" />
        </tableColumns>
        <tableStyleInfo …/>
      </table>
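Cells marked `t="s"` store an index into the shared strings part rather than the text itself. The Python sketch below reads the worksheet fragment from this slide and resolves those indexes. The shared string values ("Column1", "Column2") are an assumption inferred from the table column names, and the namespace declaration is added because the slide fragment omits it.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}

SHEET_DATA = f"""
<sheetData xmlns="{MAIN_NS}">
  <row r="1" spans="1:2"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>
  <row r="2" spans="1:2"><c r="A2"><v>1</v></c><c r="B2"><v>4</v></c></row>
  <row r="3" spans="1:2"><c r="A3"><v>2</v></c><c r="B3"><v>5</v></c></row>
  <row r="4" spans="1:2"><c r="A4"><v>3</v></c><c r="B4"><v>6</v></c></row>
</sheetData>
"""

# Assumed contents of the shared strings part, in index order.
SHARED_STRINGS = ["Column1", "Column2"]


def read_cells(sheet_data_xml: str, shared_strings: list) -> dict:
    """Map cell references (e.g. 'A1') to resolved values."""
    cells = {}
    root = ET.fromstring(sheet_data_xml)
    for cell in root.iterfind("m:row/m:c", NS):
        value = cell.find("m:v", NS)
        if value is None:
            continue  # empty cell
        if cell.get("t") == "s":
            # Shared string: <v> holds an index, not the text.
            cells[cell.get("r")] = shared_strings[int(value.text)]
        else:
            # No type attribute: treat <v> as a number.
            cells[cell.get("r")] = float(value.text)
    return cells


cells = read_cells(SHEET_DATA, SHARED_STRINGS)
print(cells["A1"], cells["B1"])
print(cells["A2"], cells["B4"])
```

Only the shared-string and untyped numeric cases are handled here; other cell types (booleans, inline strings, formulas) are left out of this sketch.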
  • ExcelPackage
    • Open source API on CodePlex
    • Wraps System.IO.Packaging and SpreadsheetML
    http://www.codeplex.com/ExcelPackage
  • WordprocessingML Document
    • A WordprocessingML file is a collection of multiple “stories”:
      • The main story
      • Header(s) / Footer(s)
      • Footnote(s) / Endnote(s)
      • Subdocuments
      • Comment(s)
    [Diagram labels: Document, body, properties, fontTable, headers/footers, images, numberingDefinitions, styles, customXML, footnotes/endnotes, comments]
  • Main Document Part
    • The top-level element in the start part (e.g., document.xml) is document
    • Document has two optional child elements:
      • The background element, which specifies the settings for the background for the document
      • The body element, which contains the content of the main story
  • Block-Level Elements
    • The body element contains the main document story, made up of block-level elements:
      • Paragraphs
      • Tables
      • Custom XML markup
      • Alternate format chunks
      • Subdocuments
      • Final section properties
      • Future extensibility containers
    • Nested elements: a table may contain a table which contains a paragraph, etc.
  • Inline Structures
    • The <w:p> paragraph element contains inline structures:
      • Runs (containing <w:t> text regions)
      • Custom Markup (can occur at block or inline level)
      • Annotations (comments, tracked changes, bookmarks)
      • DrawingML elements
      • Fields (date, page number, document creator, etc.)
      • Hyperlinks
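The nesting described so far (document, body, paragraphs, runs, text) can be walked with any XML parser. Below is a small Python sketch that collects the text of each top-level paragraph; the sample document content is invented for illustration, and paragraphs nested inside tables or other containers are deliberately ignored.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Illustrative main document part with two paragraphs.
DOCUMENT_XML = f"""
<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r><w:t>Hello, </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Open XML</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Second paragraph</w:t></w:r>
    </w:p>
  </w:body>
</w:document>
"""


def paragraph_texts(document_xml: str) -> list:
    """Return the concatenated run text of each top-level paragraph."""
    root = ET.fromstring(document_xml)
    texts = []
    for paragraph in root.iterfind("w:body/w:p", NS):
        # Text lives in <w:t> elements inside runs; join them in order.
        pieces = (t.text or "" for t in paragraph.iterfind("w:r/w:t", NS))
        texts.append("".join(pieces))
    return texts


texts = paragraph_texts(DOCUMENT_XML)
print(texts)  # ['Hello, Open XML', 'Second paragraph']
```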
  • Paragraphs <w:p>
    • The most basic unit of a WordprocessingML document
    • Contains three pieces of information:
      • Paragraph properties
      • Inline content
      • optional revision IDs used for document merge and compare
    • A paragraph may occur at any location which allows block level content:
      • At the top-most level within a story (e.g. header, footer, main document)
      • Nested within a table cell
      • Nested within a structured document tag or annotation markers
  • Paragraph Properties
    • Can be set directly on a paragraph (below) or in a paragraph style
    • 24 total property settings
    <w:p>
      <w:pPr>
        <w:widowControl w:val="on" />
        <w:keepNext/>
        <w:keepLines/>
        <w:pageBreakBefore/>
        <w:suppressLineNumbers />
        <w:suppressAutoHyphens />
        <w:textBoxTightWrap />
      </w:pPr>
      … runs, paragraph content …
    </w:p>
  • Runs <w:r>
    • A run is a region of text with a common set of properties
    • All text must be contained within runs
    • All runs must be contained within paragraphs
    • A run contains three types of information:
      • Run properties
      • Run content (text, fields, soft line breaks, pictures, etc.)
      • Optional revision IDs for document comparison
  • Run Properties
    • Define formatting for individual characters
    • Font attributes, size/position, etc.
    • 24 total properties
    <w:r>
      <w:rPr>
        <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
        <w:b />
        <w:i />
        <w:sz w:val="11" />
        <w:dstrike w:val="true" />
      …
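Generating runs with properties follows the same shape in any language: create the run, add a properties element, then add the text. The sketch below builds a paragraph in Python with `xml.etree.ElementTree`; the helper names (`make_run`, `make_paragraph`) are hypothetical, and only bold, italic, and size are covered. Note that `w:sz` is expressed in half-points.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # serialize with the conventional w: prefix


def w(tag: str) -> str:
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def make_run(text: str, bold: bool = False, italic: bool = False, half_points=None):
    """Build a <w:r> with optional bold, italic, and size properties."""
    run = ET.Element(w("r"))
    props = ET.SubElement(run, w("rPr"))  # run properties come first
    if bold:
        ET.SubElement(props, w("b"))
    if italic:
        ET.SubElement(props, w("i"))
    if half_points is not None:
        ET.SubElement(props, w("sz"), {w("val"): str(half_points)})
    text_element = ET.SubElement(run, w("t"))
    text_element.text = text
    return run


def make_paragraph(*runs, keep_next: bool = False):
    """Build a <w:p> with optional keepNext, followed by the given runs."""
    paragraph = ET.Element(w("p"))
    if keep_next:
        props = ET.SubElement(paragraph, w("pPr"))  # paragraph properties first
        ET.SubElement(props, w("keepNext"))
    paragraph.extend(runs)
    return paragraph


paragraph = make_paragraph(
    make_run("Plain text, "),
    make_run("bold 11pt text", bold=True, half_points=22),
    keep_next=True,
)
paragraph_xml = ET.tostring(paragraph, encoding="unicode")
print(paragraph_xml)
```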
  • PresentationML
    [Diagram labels: Presentation, View Properties, Presentation Properties, Code, Themes, Fonts, Notes Masters, Slides, Handout Masters, Slide Masters, Notes Slides, Slide Layouts]
  • The Minimal pptx
    • Presentation Element
      • Presentation.xml
        • Slide Masters
        • Notes Masters
        • Handout Masters
        • Slides
    • Relationships Part
      • Links to slide parts
  • Slide Parts
    <p:sld xmlns:p="…/presentationml/2006/main" xmlns:a="…/drawingml/2006/main" …>
      <p:cSld>
        <p:spTree>
          <p:sp>
            <p:nvSpPr>
              <p:cNvPr id="2" name="7-Point Star 1" />
          …
          <p:sp>
            <p:nvSpPr>
              <p:cNvPr id="3" name="TextBox 2" />
          …
          <p:graphicFrame>
            <p:nvGraphicFramePr>
              <p:cNvPr id="4" name="Chart 3" />
          …
        </p:spTree>
      </p:cSld>
      <p:clrMapOvr>
        <a:masterClrMapping />
      </p:clrMapOvr>
    </p:sld>
    [Callouts: Shape, Chart, Textbox]
  • Object Parts – DrawingML
    [Diagram labels: Chart Part (chart1.xml), Data source, Shape, Chart, Textbox]
  • DrawingML
    • Five main types of objects
      • Shape
      • Group Shape
      • Connector
      • Picture
      • Graphic Frame
        • General-purpose container
        • Used for Charts, Diagrams, Tables
    • Most widely used elements are Property elements
      • Non-visual properties (nvPr): union of common and object-specific non-visual properties
      • Visual properties: object specific
  • Shapes
    • Preset geometry
      • Pick the preset shape
      • Specify the adjust values for the shape
    • Text geometry
      • Pick the preset text shape
      • Specify the adjust values for the text shape
    • Custom geometry
      • Not covered in this course
  • Shape Line and Fill Properties
    • BLIP (Binary Large Image or Pictures) fill; r:embed indicates the relationship id of the image data
      <a:blipFill>
        <a:blip r:embed="rId2" />
        <a:stretch>
          <a:fillRect />
        </a:stretch>
      </a:blipFill>
    • Line: dashed line with solid fill
      <a:ln>
        <a:solidFill>
          <a:srgbClr val="4F81BD" />
        </a:solidFill>
        <a:prstDash val="sysDash" />
      </a:ln>
    • Gradient fill; each a:gs is a gradient stop and color
      <a:gradFill flip="none" rotWithShape="1">
        <a:gsLst>
          <a:gs pos="0">
            <a:srgbClr val="DDEBCF" />
          </a:gs>
          <a:gs pos="50000">
            <a:srgbClr val="9CB86E" />
          </a:gs>
          ...
        </a:gsLst>
        <a:lin ang="4200000" scaled="0" />
        <a:tileRect />
      </a:gradFill>
  • Pictures
    • Define a picture: <p:pic/>
    • Source image relationship id: <a:blip r:embed="rId2"/>
    • Acts similarly to a shape: <p:spPr/>
    • Non-visual picture properties hold picture-specific saved properties: <p:nvPicPr/>
    • Similar for audio and video
    <p:pic>
      <p:nvPicPr>
        <p:cNvPr id="4" name="lake.jpeg" />
        <p:cNvPicPr>
          <a:picLocks noChangeAspect="1" />
        </p:cNvPicPr>
        <p:nvPr />
      </p:nvPicPr>
      <p:blipFill>
        <a:blip r:embed="rId2" />
        <a:stretch>
          <a:fillRect />
        </a:stretch>
      </p:blipFill>
      <p:spPr>
        <a:xfrm>
          <a:off x="762000" y="571500" />
          <a:ext cx="7620000" cy="5715000" />
        </a:xfrm>
        <a:prstGeom prst="rect">
          <a:avLst />
        </a:prstGeom>
      </p:spPr>
    </p:pic>
  • Pictures vs. Shapes
    • Shapes
      • Single fill allowed
      • Borders grow in/outward
      • Must be done by app
      • Can have text attached
      • Can have shape properties
      • Shape specific UI enabled
    • Pictures
      • Two overlaid fills allowed
      • Borders grow outward
      • Lock aspect ratio flag
      • Cannot have text attached
      • Can have shape properties
      • Picture specific UI enabled
  • Graphic Objects
    • Graphic element represents a single graphical object
    • graphicData element and its uri attribute
      • Specifies the namespace for the embedded content
      • Tells the consumer how to interpret the graphicData
      • Ability to render is application specific
      • Office supports a set of specific URI values:
        • http://schemas.openxmlformats.org/drawingml/2006/chart
        • http://schemas.openxmlformats.org/drawingml/2006/diagrams
    Graphic Object (the URI means a chart follows):
    <graphic>
      <a:graphicData uri="http://schemas.../drawingml/2006/chart">
        <c:chart xmlns:c="http://schemas.../drawingml/2006/chart"
                 xmlns:r="http://schemas.../officeDocument/2006/relationships"
                 r:id="rd123232" />
      </a:graphicData>
    </graphic>
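A consumer can decide how to handle embedded content by reading the `uri` attribute before looking at the child elements. The Python sketch below illustrates that dispatch for the chart URI only. The sample fragment spells out the namespaces that the slide abbreviates, and the relationship id `rId3` and helper name `describe_graphic` are invented for the example.

```python
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
CHART_URI = "http://schemas.openxmlformats.org/drawingml/2006/chart"

# Content kinds this sketch knows how to interpret, keyed by graphicData uri.
KNOWN_CONTENT = {CHART_URI: "chart"}

GRAPHIC_XML = f"""
<a:graphic xmlns:a="{A_NS}">
  <a:graphicData uri="{CHART_URI}">
    <c:chart xmlns:c="{CHART_URI}" xmlns:r="{R_NS}" r:id="rId3"/>
  </a:graphicData>
</a:graphic>
"""


def describe_graphic(graphic_xml: str):
    """Return (kind, relationship id) for a graphic element."""
    root = ET.fromstring(graphic_xml)
    data = root.find(f"{{{A_NS}}}graphicData")
    if data is None:
        return ("unknown", None)
    # The uri tells the consumer how to interpret the children.
    kind = KNOWN_CONTENT.get(data.get("uri"), "unknown")
    child = data[0] if len(data) else None
    # For a chart, r:id points at the separate chart part.
    rel_id = child.get(f"{{{R_NS}}}id") if child is not None else None
    return (kind, rel_id)


print(describe_graphic(GRAPHIC_XML))  # ('chart', 'rId3')
```

Content with an unrecognized URI is reported as "unknown" rather than rejected, which mirrors the point that the ability to render is application specific.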
  • Charts
    • Graphic Object definition
      • References separate XML chart part
      • Defined in DrawingML namespace
    • Chart XML Part
      • Visual representation of data.
      • Includes a cache of data for chart.
      • Includes formatting using DrawingML.
    • Data Relationship
      • External relationship to file, or
      • Internal relationship to embedded spreadsheet
      • Spreadsheets point to their own data.
    • Chart Drawing
      • Contains shapes and pictures drawn on chart
    [Diagram labels: Graphic Object, XML Chart Part, Data Source, Chart Drawing]
  • Build a Document in Code
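The transcript does not include the code from this demo. As a stand-in, here is a minimal Python sketch that writes a one-paragraph WordprocessingML package using only the standard library. It is not the System.IO.Packaging code from the session; the part name `word/document.xml` follows common convention, and the paragraph text is invented.

```python
import io
import zipfile
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
WML_MAIN = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'


def build_docx(paragraphs: list) -> bytes:
    """Return the bytes of a package with one run of text per paragraph."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>" for text in paragraphs
    )
    parts = {
        "[Content_Types].xml": (
            f'<Types xmlns="{CT_NS}">'
            '<Default Extension="rels" '
            'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
            f'<Override PartName="/word/document.xml" ContentType="{WML_MAIN}"/>'
            "</Types>"
        ),
        # Package-level relationship: identifies the start part.
        "_rels/.rels": (
            f'<Relationships xmlns="{PKG_REL_NS}">'
            f'<Relationship Id="rId1" Type="{DOC_REL}/officeDocument" '
            'Target="word/document.xml"/>'
            "</Relationships>"
        ),
        # Start part: document -> body -> paragraphs -> runs -> text.
        "word/document.xml": (
            f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
        ),
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, content in parts.items():
            package.writestr(name, XML_DECL + content)
    return buffer.getvalue()


docx_bytes = build_docx(["Hello from Open XML", "Fish & chips"])
# To inspect the result in a word processor, write it to disk:
# with open("hello.docx", "wb") as handle:
#     handle.write(docx_bytes)
```

Text is passed through `escape` so that characters such as `&` and `<` do not break the XML of the document part.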
  • Resources
    • OpenXMLDeveloper.org
    • OpenXMLSDK
    • Package Explorer
    • Code Snippets
    http://blogs.nusoftsolutions.com/DTruxall/