HTML to ODT to XML to PDF to . . .
             FrOSCon 2010


     Tobias Schlitt <toby@qafoo.com>


             August 22, 2010


    License


                  Copyright by Tobias Schlitt, Qafoo GmbH


                    Licensed under Creative Commons
             Attribution-NonCommercial-ShareAlike 2.0 Generic


    About me



         Tobias Schlitt
               Apprenticed IT specialist
               Waiting for diploma to be approved (TU Dortmund)
               Open source enthusiast
               Co-founder of Qafoo - passion for software quality
               PMC member of Apache Zeta Components
               Contributor to various other OSS projects


    Outline


    Introduction


    The Document component


    Getting into the code


    End


    Apache Zeta Components


         Previously developed by eZ Systems
               http://ez.no
         Originally named eZ Components
         Code donated to the Apache Software Foundation
               http://apache.org
         Currently incubating
               Re-organizing / re-gathering community
               Join us!


    Apache Zeta Components


         General purpose component library for PHP 5.1+
         Open source (Apache 2.0 license)
         Focus
               High code quality
               Excellent docs
               Backwards compatibility
         Professional support available
         http://zetacomponents.org


    The components


    49 components; a selection:
         Archive
         ConsoleTools
         Graph
         Mail
         MvcTools
         Webdav
         Workflow


    Goal


         Applications need to deal with
               Different input mechanisms
                    WYSIWYG editor (HTML)
                    Simple text editor (wiki markup)
                    Emails (ReST)

               Different output formats
                    Web front end (HTML)
                    Technical documentation management (Docbook)
                    Print (PDF)

         The Document component converts markup formats


    Supported formats


         Currently supported formats
               Docbook
               (X)Html
               eZ XML
               ReST
               Wiki
                    Dokuwiki, popular PHP based wiki (wiki.php.net) (read-only)
                    Creole, wiki markup standardization initiative
                    Confluence, Atlassian wiki dialect (read-only)
               PDF (write only)
               ODF (flat XML files only)
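
The read-only and write-only remarks in this list decide which conversions are possible at all: the source must be readable and the target writable. A minimal sketch in plain PHP, not using the Document component itself. The table is transcribed from this slide; the function name `canConvert()` is hypothetical, and treating unmarked formats as readable and writable, and ODF as write-only, are assumptions.

```php
<?php
// Capability table transcribed from the "Supported formats" slide.
// Assumptions: formats without a read-only/write-only remark are treated as
// readable and writable; ODF is marked write-only because this talk only
// shows ODT generation.
$formats = array(
    'docbook'    => array('read' => true,  'write' => true),
    'xhtml'      => array('read' => true,  'write' => true),
    'ezxml'      => array('read' => true,  'write' => true),
    'rst'        => array('read' => true,  'write' => true),
    'dokuwiki'   => array('read' => true,  'write' => false),
    'creole'     => array('read' => true,  'write' => true),
    'confluence' => array('read' => true,  'write' => false),
    'pdf'        => array('read' => false, 'write' => true),
    'odf'        => array('read' => false, 'write' => true),
);

/**
 * Hypothetical helper: a conversion is possible if both formats are known,
 * the source can be read, and the target can be written.
 */
function canConvert(array $formats, $from, $to)
{
    return isset($formats[$from], $formats[$to])
        && $formats[$from]['read']
        && $formats[$to]['write'];
}

var_dump(canConvert($formats, 'dokuwiki', 'pdf')); // bool(true)
var_dump(canConvert($formats, 'pdf', 'rst'));      // bool(false)
```

Under these assumptions a Dokuwiki page can end up as PDF, but a PDF cannot serve as input for anything.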


    Approach

         Docbook as central conversion format
               Possible conversion shortcuts
               Conversions always configurable and extensible


         [Diagram: Docbook in the center, linked by conversion arrows to
         (X)Html, ReST, Creole, eZ XML, Dokuwiki, PDF, ODF and Confluence]
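
With a central format, each markup format needs at most one reader and one writer instead of one converter per pair of formats. The following plain PHP sketch illustrates that idea only; it is not the Document component's API. `convertViaHub()`, the `$readers` and `$writers` tables, and the array of paragraphs standing in for Docbook are all hypothetical.

```php
<?php
// Hypothetical sketch, not the Zeta Components API: a plain array of
// paragraph strings stands in for the central Docbook document.

// Readers turn a source string into the central representation.
$readers = array(
    'text' => function ($input) {
        // Paragraphs are separated by one or more blank lines.
        return preg_split('/\n\s*\n/', trim($input));
    },
);

// Writers turn the central representation into a target string.
$writers = array(
    'html' => function (array $paragraphs) {
        $out = '';
        foreach ($paragraphs as $paragraph) {
            $out .= '<p>' . htmlspecialchars($paragraph) . "</p>\n";
        }
        return $out;
    },
    'text' => function (array $paragraphs) {
        return implode("\n\n", $paragraphs) . "\n";
    },
);

/**
 * Convert $input from format $from to format $to by going through the
 * central representation: read first, then write.
 */
function convertViaHub(array $readers, array $writers, $from, $to, $input)
{
    if (!isset($readers[$from])) {
        throw new InvalidArgumentException("No reader for format '$from'");
    }
    if (!isset($writers[$to])) {
        throw new InvalidArgumentException("No writer for format '$to'");
    }
    return $writers[$to]($readers[$from]($input));
}

echo convertViaHub($readers, $writers, 'text', 'html', "Hello\n\nWorld & more");
```

Adding a new output format in this sketch means adding one entry to `$writers`; every existing reader can then target it without further work.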


    Reading ReStructuredText

         ==================
         PHP @ FrOSCon 2010
         ==================

         For the fifth time we__ will be at the `Free and Open Source Conference`__
         (FrOSCon) in St. Augustin, near Bonn, organizing a track full of PHP related
         talks. We also offer space to discuss PHP related topics, or just hack with
         other open minded people around you. We would love to welcome you in the PHP
         room.

         We are currently looking for talks for the PHP room, and the `Call For
         Papers`__ will last until the 23.05.2010, just like the `Call For Papers`__ of
         the FrOSCon. Please submit a talk, if you got something interesting to talk
         about.

         You might also want to submit talks to the `main schedule`__ of the
         conference, which also accepts PHP related talks. For talks in the main
         conference your costs will be covered as usual, details are on the `dedicated
         website`__.

         __ http://phpugdo.de/
         __ http://froscon.de/
         __ /callforpapers.html
         __ http://www.froscon.de/index.php?id=15&mid=119&ret=15&L=0&L=0
         __ http://www.froscon.de/index.php?id=15&mid=119&ret=15&L=0&L=0
         __ http://www.froscon.de/index.php?id=15&mid=119&ret=15&L=0&L=0


    Reading ReStructuredText

         <?php

         require './autoload.php';

         // Load the ReST source file
         $document = new ezcDocumentRst();
         $document->loadFile('froscon.txt');

         // Convert it to Docbook and print the result
         echo $document->getAsDocbook();


    Reading ReStructuredText

         <?xml version="1.0"?>
         <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
             "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
         <article xmlns="http://docbook.org/ns/docbook">
           <section ID="phpfroscon2010">
             <title>PHP @ FrOSCon 2010</title>
             <para>For the fifth time <ulink url="http://phpugdo.de/">we</ulink> will be
               at the <ulink url="http://froscon.de/">Free and Open Source Conference</ulink>
               (FrOSCon) in St. Augustin, near Bonn, organizing a track full of PHP related
               talks. We also offer space to discuss PHP related topics, or just hack with
               other open minded people around you. We would love to welcome you in the PHP
               room.</para>
             <para>We are currently looking for talks for the PHP room, and the
               <ulink url="/callforpapers.html">Call For Papers</ulink> will last until the
               23.05.2010, just like the
               <ulink url="http://www.froscon.de/index.php?id=15&amp;amp;mid=119&amp;amp;ret=15&amp;amp;L=0&amp;amp;L=0">Call For Papers</ulink>
               of the FrOSCon. Please submit a talk, if you got something interesting to
               talk about.</para>
             <para>You might also want to submit talks to the
               <ulink url="http://www.froscon.de/index.php?id=15&amp;amp;mid=119&amp;amp;ret=15&amp;amp;L=0&amp;amp;L=0">main schedule</ulink>
               of the conference, which also accepts PHP related talks. For talks in the main
               conference your costs will be covered as usual, details are on the
               <ulink url="http://www.froscon.de/index.php?id=15&amp;amp;mid=119&amp;amp;ret=15&amp;amp;L=0&amp;amp;L=0">dedicated website</ulink>.</para>
           </section>
         </article>


     HTML to RST conversion


         <?php

         require 'autoload.php';

         $xhtml = new ezcDocumentXhtml();
         $xhtml->setFilters(array(
             new ezcDocumentXhtmlElementFilter(),
             new ezcDocumentXhtmlContentLocatorFilter(),
         ));

         $xhtml->loadFile('whitepaper.html');
         // $xhtml->loadFile('http://qafoo.com/whitepaper.html');

         $rst = new ezcDocumentRst();
         $rst->createFromDocbook($xhtml->getAsDocbook());

         echo $rst;


      HTML to RST conversion
         ------------------------------
         Your softwares quality matters
         ------------------------------

         Contents
         ========

         - `Introduction`__
           - `Business goals`__
         [..]

         Raising and continuously monitoring the quality of software allows you to
         improve the Return On Investment, to reduce the time to market and to increase
         customer satisfaction for your software product. No matter if you develop for
         company internal use only, on a customer basis or standard software.

         __ introduction
         __ business-goals
         [..]

         .. figure:: /images/whitepaper/stairwaystoquality.png
            :width: 500
            :height: 225
            :alt: Stairways to quality

         Introduction
         ============

         Quality impacts various areas of your company's software development [..]


     PDF generation
         <?php
         require 'autoload.php';

         // Convert some web page to PDF
         $xhtml = new ezcDocumentXhtml();
         $xhtml->setFilters(array(
             new ezcDocumentXhtmlElementFilter(),
             new ezcDocumentXhtmlXpathFilter('//div[@class="content"]'),
         ));
         $xhtml->loadFile('consulting.html');

         // Load the docbook document and create a PDF from it
         $pdf = new ezcDocumentPdf();
         $pdf->options->errorReporting = E_PARSE | E_ERROR | E_WARNING;

         // Load a custom style sheet
         $pdf->loadStyles('custom.css');

         // Add a customized header
         $pdf->registerPdfPart(new ezcDocumentPdfHeaderPdfPart(
             new ezcDocumentPdfFooterOptions(array(
                 'showPageNumber' => false,
                 'height'         => '10mm',
             ))
         ));

         $pdf->createFromDocbook($xhtml->getAsDocbook());
         file_put_contents(__FILE__ . '.pdf', $pdf);


     PDF generation
         article {
             font-family: "sans-serif";
             font-size: "10pt";
         }

         title {
             font-family: "sans-serif";
             color: #97BF0D;

             border-bottom: 1px solid #555753;
             border-left: 4px solid #555753;
             padding-left: 4px;
         }

         section > section > section > section > title {
             border-color: #babdb6;
         }

         page {
             padding: "15mm 30mm";
         }

         ulink {
             color: #97BF0D;
         }

         link {
             color: #97BF0D;
         }


     ODT generation


         <?php

         require 'autoload.php';

         // Convert some input XHTML file to Docbook
         $xhtml = new ezcDocumentXhtml();
         $xhtml->setFilters(array(
             new ezcDocumentXhtmlElementFilter(),
             new ezcDocumentXhtmlXpathFilter('//div[@class="content"]'),
         ));
         $xhtml->loadFile('consulting.html');

         // Set up the Docbook to ODT converter with a custom style sheet
         $converter = new ezcDocumentDocbookToOdtConverter();
         $converter->options->styler->addStylesheetFile('custom.css');

         // Convert and store the result as a flat ODT file
         $odt = $converter->convert($xhtml->getAsDocbook());
         file_put_contents(__FILE__ . '.fodt', $odt);


    Thanks for listening


         Apache Zeta Components
               http://zetacomponents.org
               #zetacomponents @ Freenode
         Stay in touch
               toby@qafoo.com
               @tobySen
         Take care of your software quality
               http://qafoo.com
               Also consulting / training / support for Zeta
