Generating Documents with OpenOffice
michael@koziarski.com
Introduction - Me
TheRailsWay.com
Introduction - PlanHQ
Online Business Planning
What we needed
Unique Challenges


• No limits on amount of text
• Real time generation
Visual Design


• Printed output must be high quality
• Frequent iteration and enhancement
• Worked on by non-programmers
What we evaluated
MS Office APIs
Office APIs
Downsides

• Windows Only
• None of us knew .NET
• Complicated APIs
Downsides


• No PDF export
• Design Changes are Expensive
Expensive Design Changes
Web Programming Works
The design output should be as close
as possible to the programming output.
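Design Changes
Imperative
• Change Design
• Reverse engineer into method calls
• Apply changes to the code
• Test
Markup
• Change Design
• Apply changes to code
• Test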
PDF-Writer
Downsides


• Design Changes are Expensive
• No suitable .doc conversion
RTF
• Relatively Simple Format
• Cross Platform Support
     \fs30 First line with 15 point text\line
     \fs20 Second line with 10 point text\line
     \i Italics on \i0 Italics off\line
     \b Bold on \b0 Bold off\line
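
Control words like these are most of what simple RTF needs, which is why generating it imperatively looks tempting. A minimal sketch in Python, assuming two hypothetical helpers, `rtf_escape` and `rtf_document`, that are not from the talk; it handles plain ASCII text only:

```python
def rtf_escape(text):
    # Backslash and braces are RTF syntax, so escape them (backslash first).
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def rtf_document(lines):
    """Build a tiny RTF document from (font size in half-points, text) pairs."""
    body = "".join(
        "\\fs%d %s\\line " % (size, rtf_escape(text)) for size, text in lines
    )
    # {\rtf1 ...} is the document group; \ansi selects the ANSI character set.
    return "{\\rtf1\\ansi " + body + "}"


doc = rtf_document([
    (30, "First line with 15 point text"),
    (20, "Second line with 10 point text"),
])
```

Every layout decision ends up as an argument in calls like these, so each design change has to be translated back into code by a programmer.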
Downsides


• Design Changes are Expensive
• Subset of Formatting options available
• No PDF support
LaTeX


• Beautiful output
• Some Existing ‘GUI’ Editors (LyX, TeXmacs)
Downsides


• Design Changes are Expensive
• No Mathematics in our output
• No .doc support
HTML

• Automated conversion tools produce UGLY
  results
• Browsers give poor control over printed
  output
ODF
Why ODF?
Open Standard
Cross Platform Tools
Simple Design Lifecycle
Creation
Structure
  Zip File
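
Because the package is an ordinary zip file, it can be assembled with any zip library. The sketch below uses Python's `zipfile` to build a deliberately minimal text document: a `mimetype` entry (written first and stored uncompressed, as the ODF packaging rules expect), `META-INF/manifest.xml`, and `content.xml`. It leaves out `meta.xml`, `settings.xml` and styles, and the `build_odt` helper and its one-paragraph template are assumptions made for illustration, not the code used in the talk.

```python
import io
import zipfile
from xml.sax.saxutils import escape

MIMETYPE = "application/vnd.oasis.opendocument.text"

MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
 <manifest:file-entry manifest:full-path="/" manifest:media-type="application/vnd.oasis.opendocument.text"/>
 <manifest:file-entry manifest:full-path="content.xml" manifest:media-type="text/xml"/>
</manifest:manifest>
"""

# Hypothetical one-paragraph template; %s is replaced with escaped text.
CONTENT = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  office:version="1.1">
 <office:body>
  <office:text>
   <text:p>%s</text:p>
  </office:text>
 </office:body>
</office:document-content>
"""


def build_odt(paragraph):
    """Return the bytes of a minimal .odt package holding one paragraph."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as odt:
        # The mimetype entry goes first and is stored uncompressed.
        odt.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        odt.writestr("META-INF/manifest.xml", MANIFEST)
        odt.writestr("content.xml", CONTENT % escape(paragraph))
    return buf.getvalue()


odt_bytes = build_odt("Plans & budgets")
```

Writing `odt_bytes` to a file with an `.odt` extension should give a document that OpenOffice can open, though a real template would start from a file saved by the designers rather than a hand-written string.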
meta.xml
settings.xml

• PrintControls
• PrintEmptyPages
• DoNotCaptureDrawObjsOnPage
• ConsiderTextWrapOnObjPos
META-INF/manifest.xml
Configurations2


   I have NO idea
content.xml
Final Step
Why not use Builder?
ODF XML

24 Separate Namespaces!
Inserting an image uses 4 separate namespaces!
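
To make the namespace count concrete, here is a sketch that builds an image frame with Python's `xml.etree.ElementTree`. The element and attribute names (`draw:frame` wrapping `draw:image`, sized with `svg:` attributes, anchored with `text:anchor-type`, linked with `xlink:href`) follow the ODF schema as I understand it; the helper names and the sample picture path are hypothetical, and this is not the markup shown on the original slides.

```python
import xml.etree.ElementTree as ET

# The four namespaces touched when inserting a single image.
NS = {
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "svg": "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "xlink": "http://www.w3.org/1999/xlink",
}
for prefix, uri in NS.items():
    # Keep the conventional prefixes when the tree is serialized.
    ET.register_namespace(prefix, uri)


def qname(prefix, local):
    """Return the {uri}local form that ElementTree uses for qualified names."""
    return "{%s}%s" % (NS[prefix], local)


def image_frame(href, width, height, name="Image1"):
    """Build a draw:frame element that embeds the picture at href."""
    frame = ET.Element(qname("draw", "frame"), {
        qname("draw", "name"): name,
        qname("text", "anchor-type"): "paragraph",
        qname("svg", "width"): width,
        qname("svg", "height"): height,
    })
    ET.SubElement(frame, qname("draw", "image"), {
        qname("xlink", "href"): href,
        qname("xlink", "type"): "simple",
        qname("xlink", "show"): "embed",
        qname("xlink", "actuate"): "onLoad",
    })
    return frame


frame = image_frame("Pictures/logo.png", "5cm", "3cm")
xml_bytes = ET.tostring(frame)
```

A generic XML builder can produce this, but every element and attribute needs its prefix spelled out, which is the friction the slides point at.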
Conversion
Surprisingly Difficult!
Conversion?


ooffice --save-as pdf some_file.odt
UNO

 No Ruby Bindings!

Stuck with Python!
ooextract.py

Takes a document on the command line,
     and converts it to pdf or html


     Cargo Culting to the rescue
What’s the filterName for MS Office?
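
For reference, the filter names I have seen used for Writer documents are `writer_pdf_Export` for PDF, `MS Word 97` for .doc and `HTML (StarWriter)` for HTML. Treat them as assumptions and check them against your own OpenOffice install. The sketch below only picks a filter name from the output file extension; `filter_name_for` is a hypothetical helper, and the UNO call that would consume the name is described in a comment because it needs a running OpenOffice instance.

```python
import os

# Assumed Writer export filter names; verify against your installation.
FILTERS = {
    ".pdf": "writer_pdf_Export",
    ".doc": "MS Word 97",
    ".html": "HTML (StarWriter)",
}


def filter_name_for(output_path):
    """Pick an export filter name from the output file's extension."""
    ext = os.path.splitext(output_path)[1].lower()
    try:
        return FILTERS[ext]
    except KeyError:
        raise ValueError("no export filter known for %r" % ext)


# In a UNO script the name would be wrapped in a PropertyValue named
# "FilterName" and passed to the loaded document's storeToURL(url, props).
word_filter = filter_name_for("business_plan.doc")
```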
One last problem
X11
New Deployment Architecture
• Separate ‘converter’ nodes
• S3 to hold ODF files
• SQS to trigger conversion
• Conversion must be idempotent to allow retries
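
The idempotency requirement can be sketched without any AWS code. In the example below a plain dict stands in for the S3 bucket and a dict for the SQS message; `output_key`, `handle_message` and `fake_convert` are hypothetical names, not taken from the PlanHQ code. The point is that the output name is derived only from the request, so a redelivered message finds the finished file and does nothing.

```python
def output_key(source_key, fmt):
    # Derive the output name from the request alone, so a retry
    # targets the same object instead of creating a new one.
    return source_key.rsplit(".", 1)[0] + "." + fmt


def handle_message(message, store, convert):
    """Process one conversion request; safe to run more than once."""
    out_key = output_key(message["key"], message["format"])
    if out_key not in store:  # skip work done by an earlier delivery
        store[out_key] = convert(store[message["key"]], message["format"])
    return out_key


calls = []


def fake_convert(data, fmt):
    # Stand-in for the real OpenOffice conversion step.
    calls.append(fmt)
    return fmt.encode() + b":" + data


store = {"plans/42.odt": b"odt-bytes"}
msg = {"key": "plans/42.odt", "format": "pdf"}
handle_message(msg, store, fake_convert)
handle_message(msg, store, fake_convert)  # redelivery: no second conversion
```

Queues that deliver at least once can hand the same message to a worker twice, so making the handler safe to repeat is simpler than trying to prevent duplicates.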
Pain!
ODF Toolkit
The solution?


• Still in the incubator phase
• No production ready code yet.
Conclusions
ODF - Good
ODF - Weird
Conversion - Bad
Questions?
michael@koziarski.com