HTML to ODT to XML to PDF to . . .
FrOSCon 2010
Tobias Schlitt <toby@qafoo.com>
August 22, 2010
HTML to ODT to XML to PDF to . . . 2 / 25
License
Copyright by Tobias Schlitt, Qafoo GmbH
Licensensed under Creative Commons
Attribution-NonCommercial-ShareAlike 2.0 Generic
HTML to ODT to XML to PDF to . . . 3 / 25
About me
Tobias Schlitt
Apprenticed IT specialist
Waiting for diploma to be approved (TU Dortmund)
Open source enthusiast
Co-founder of Qafoo - passion for software quality
PMC member of Apache Zeta Components
Contributor to various other OSS projects
HTML to ODT to XML to PDF to . . . 4 / 25
Outline
Introduction
The Document component
Getting into the code
End
HTML to ODT to XML to PDF to . . . 5 / 25
eZ Components
HTML to ODT to XML to PDF to . . . 5 / 25
Apache Zeta Components
apache
Zeta
Components
HTML to ODT to XML to PDF to . . . 6 / 25
Apache Zeta Components
Previously developed by eZ Systems
http://ez.no
Originally named eZ Components
Code donated to the Apache Software Foundation
http://apache.org
Currently incubating
Re-organizing / re-gathering community
Join us!
HTML to ODT to XML to PDF to . . . 7 / 25
Apache Zeta Components
General purpose component library for PHP 5.1+
Open source (Apache 2.0 license)
Focus
High code quality
Excellent docs
Backwards compatibility
Professional support available
http://zetacomponents.org
HTML to ODT to XML to PDF to . . . 8 / 25
The components
49 components, an extract . . .
Archive
ConsoleTools
Graph
Mail
MvcTools
Webdav
Workflow
HTML to ODT to XML to PDF to . . . 9 / 25
Outline
Introduction
The Document component
Getting into the code
End
HTML to ODT to XML to PDF to . . . 10 / 25
Goal
Applications need to deal with
Different input mechanisms
WYSIWYG editor (HTML)
Simple text editor (wiki markup)
Emails (ReST)
Different output formats
Web front end (HTML)
Technical documentation management (Docbook)
Print (PDF)
The Document component converts markup formats
HTML to ODT to XML to PDF to . . . 10 / 25
Goal
Applications need to deal with
Different input mechanisms
WYSIWYG editor (HTML)
Simple text editor (wiki markup)
Emails (ReST)
Different output formats
Web front end (HTML)
Technical documentation management (Docbook)
Print (PDF)
The Document component converts markup formats
HTML to ODT to XML to PDF to . . . 10 / 25
Goal
Applications need to deal with
Different input mechanisms
WYSIWYG editor (HTML)
Simple text editor (wiki markup)
Emails (ReST)
Different output formats
Web front end (HTML)
Technical documentation management (Docbook)
Print (PDF)
The Document component converts markup formats
HTML to ODT to XML to PDF to . . . 10 / 25
Goal
Applications need to deal with
Different input mechanisms
WYSIWYG editor (HTML)
Simple text editor (wiki markup)
Emails (ReST)
Different output formats
Web front end (HTML)
Technical documentation management (Docbook)
Print (PDF)
The Document component converts markup formats
HTML to ODT to XML to PDF to . . . 11 / 25
Supported formats
Currently supported formats
Docbook
(X)Html
eZ XML
ReST
Wiki
Dokuwiki, popular PHP based wiki (wiki.php.net) (read-only)
Creole, wiki markup standardization initiative
Confluence, Apache Atlassian wiki dialect (read-only)
PDF (write only)
ODF (only flat)
HTML to ODT to XML to PDF to . . . 12 / 25
Approach
Docbook as central conversion format
Possible conversion shortcuts
Conversions always configurable and extensible
(X)Html ➤ ReST Creole
➤➤
➤
➤
➤
➤
eZ XML ➤ ➤ Docbook ➤ Dokuwiki
➤➤
➤
➤
PDF ODF Confluence
HTML to ODT to XML to PDF to . . . 13 / 25
Outline
Introduction
The Document component
Getting into the code
End
HTML to ODT to XML to PDF to . . . 14 / 25
Reading ReStructered Text
1 ==================
2 PHP @ FrOSCon 2010
3 ==================
4
5 For the f i f t h time we w i l l be a t t h e ‘ F r e e and Open S o u r c e C o n f e r e n c e ‘
6 ( FrOSCon ) i n St . A u g u s t i n , n e a r Bonn , o r g a n i z i n g a t r a c k f u l l o f PHP r e l a t e d
7 t a l k s . We a l s o o f f e r s p a c e t o d i s c u s s PHP r e l a t e d t o p i c s , o r j u s t hack w i t h
8 o t h e r open minded p e o p l e a r o u n d you . We would l o v e t o welcome you i n t h e PHP
9 room .
10
11 We a r e c u r r e n t l y l o o k i n g f o r t a l k s f o r t h e PHP room , and t h e ‘ C a l l F o r
12 Papers ‘ w i l l l a s t u n t i l the 2 3 . 0 5 . 2 0 1 0 , j u s t l i k e the ‘ C a l l For Papers ‘ of
13 t h e FrOSCon . P l e a s e s u b m i t a t a l k , i f you g o t s o m e t h i n g i n t e r e s t i n g t o t a l k
14 about .
15
16 You m i g h t a l s o want t o s u b m i t t a l k s t o t h e ‘ main s c h e d u l e ‘ of the
17 conference , w h i c h a l s o a c c e p t s PHP r e l a t e d t a l k s . F o r t a l k s i n t h e main
18 conference y o u r c o s t s w i l l be c o v e r e d a s u s u a l , d e t a i l s a r e on t h e ‘ d e d i c a t e d
19 website ‘ .
20
21 h t t p : / / phpugdo . de /
22 h t t p : / / f r o s c o n . de /
23 / c a l l f o r p a p e r s . html
24 h t t p : / /www . f r o s c o n . de / i n d e x . php ? i d =15&mid=119& r e t =15&L=0&L=0
25 h t t p : / /www . f r o s c o n . de / i n d e x . php ? i d =15&mid=119& r e t =15&L=0&L=0
26 h t t p : / /www . f r o s c o n . de / i n d e x . php ? i d =15&mid=119& r e t =15&L=0&L=0
HTML to ODT to XML to PDF to . . . 15 / 25
Reading ReStructered Text
1 <?php
2
3 require ’ . / a u t o l o a d . php ’ ;
4
5 $document = new ezcDocumentRst ( ) ;
6 $document− o a d F i l e ( ’ f r o s c o n . t x t ’ ) ;
>l
7
8 echo $document−>getAsDocbook ( ) ;
HTML to ODT to XML to PDF to . . . 15 / 25
Reading ReStructered Text
1 <?php
2
3 require ’ . / a u t o l o a d . php ’ ;
4
5 $document = new ezcDocumentRst ( ) ;
6 $document− o a d F i l e ( ’ f r o s c o n . t x t ’ ) ;
>l
7
8 echo $document−>getAsDocbook ( ) ;
HTML to ODT to XML to PDF to . . . 15 / 25
Reading ReStructered Text
1 <?php
2
3 require ’ . / a u t o l o a d . php ’ ;
4
5 $document = new ezcDocumentRst ( ) ;
6 $document− o a d F i l e ( ’ f r o s c o n . t x t ’ ) ;
>l
7
8 echo $document−>getAsDocbook ( ) ;
HTML to ODT to XML to PDF to . . . 15 / 25
Reading ReStructered Text
1 <?php
2
3 require ’ . / a u t o l o a d . php ’ ;
4
5 $document = new ezcDocumentRst ( ) ;
6 $document− o a d F i l e ( ’ f r o s c o n . t x t ’ ) ;
>l
7
8 echo $document−>getAsDocbook ( ) ;
HTML to ODT to XML to PDF to . . . 16 / 25
Reading ReStructered Text
1 <? xml v e r s i o n=” 1 . 0 ” ?>
2 <!DOCTYPE a r t i c l e PUBLIC ”−//OASIS //DTD DocBook XML V4 . 5 / /EN” ” h t t p : //www . o a s i s −
open . o r g / docbook / xml / 4 . 5 / d o c b o o k x . d t d ”>
3 < a r t i c l e x m l n s=” h t t p : // docbook . o r g / n s / docbook ”>
4 <s e c t i o n ID=” p h p f r o s c o n 2 0 1 0 ”>
5 < t i t l e>PHP @ FrOSCon 2010</ t i t l e>
6 <p a r a>F o r t h e f i f t h t i m e <u l i n k u r l=” h t t p : // phpugdo . de / ”>we</ u l i n k> w i l l be
a t t h e <u l i n k u r l=” h t t p : // f r o s c o n . de / ”>F r e e and Open S o u r c e C o n f e r e n c e<
/ u l i n k> ( FrOSCon ) i n St . A u g u s t i n , n e a r Bonn , o r g a n i z i n g a t r a c k f u l l
o f PHP r e l a t e d t a l k s . We a l s o o f f e r s p a c e t o d i s c u s s PHP r e l a t e d t o p i c s
, o r j u s t hack w i t h o t h e r open minded p e o p l e a r o u n d you . We would l o v e
t o welcome you i n t h e PHP room .</ p a r a>
7 <p a r a>We a r e c u r r e n t l y l o o k i n g f o r t a l k s f o r t h e PHP room , and t h e <u l i n k
u r l=” / c a l l f o r p a p e r s . h t m l ”>C a l l F o r P a p e r s</ u l i n k> w i l l l a s t u n t i l t h e
2 3 . 0 5 . 2 0 1 0 , j u s t l i k e t h e <u l i n k u r l=” h t t p : //www . f r o s c o n . de / i n d e x . php ?
i d =15& ; amp ; mid=119& ; amp ; r e t =15& ; amp ; L=0& ; amp ; L=0”>C a l l F o r
P a p e r s</ u l i n k> o f t h e FrOSCon . P l e a s e s u b m i t a t a l k , i f you g o t
s o m e t h i n g i n t e r e s t i n g t o t a l k a b o u t .</ p a r a>
8 <p a r a>You m i g h t a l s o want t o s u b m i t t a l k s t o t h e <u l i n k u r l=” h t t p : //www .
f r o s c o n . de / i n d e x . php ? i d =15& ; amp ; mid=119& ; amp ; r e t =15& ; amp ; L=0&
amp ; amp ; L=0”>main s c h e d u l e</ u l i n k> o f t h e c o n f e r e n c e , w h i c h a l s o
a c c e p t s PHP r e l a t e d t a l k s . F o r t a l k s i n t h e main c o n f e r e n c e y o u r c o s t s
w i l l be c o v e r e d a s u s u a l , d e t a i l s a r e on t h e <u l i n k u r l=” h t t p : //www .
f r o s c o n . de / i n d e x . php ? i d =15& ; amp ; mid=119& ; amp ; r e t =15& ; amp ; L=0&
amp ; amp ; L=0”>d e d i c a t e d w e b s i t e</ u l i n k>.</ p a r a>
9 </ s e c t i o n>
10 </ a r t i c l e>
HTML to ODT to XML to PDF to . . . 17 / 25
HTML to RST conversion
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l C o n t e n t L o c a t o r F i l t e r ( ) ,
9 ) );
10
11 $xhtml− o a d F i l e ( ’ w h i t e p a p e r . h t m l ’ ) ;
>l
12 // $xhtml− o a d F i l e ( ’ h t t p : / / q a f o o . com/
>l
w h i t e p a p e r . html ’ ) ;
13
14 $ r s t = new ezcDocumentRst ( ) ;
15 $ r s t − r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( )
);
16
17 echo $ r s t ;
HTML to ODT to XML to PDF to . . . 17 / 25
HTML to RST conversion
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l C o n t e n t L o c a t o r F i l t e r ( ) ,
9 ) );
10
11 $xhtml− o a d F i l e ( ’ w h i t e p a p e r . h t m l ’ ) ;
>l
12 // $xhtml− o a d F i l e ( ’ h t t p : / / q a f o o . com/
>l
w h i t e p a p e r . html ’ ) ;
13
14 $ r s t = new ezcDocumentRst ( ) ;
15 $ r s t − r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( )
);
16
17 echo $ r s t ;
HTML to ODT to XML to PDF to . . . 17 / 25
HTML to RST conversion
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l C o n t e n t L o c a t o r F i l t e r ( ) ,
9 ) );
10
11 $xhtml− o a d F i l e ( ’ w h i t e p a p e r . h t m l ’ ) ;
>l
12 // $xhtml− o a d F i l e ( ’ h t t p : / / q a f o o . com/
>l
w h i t e p a p e r . html ’ ) ;
13
14 $ r s t = new ezcDocumentRst ( ) ;
15 $ r s t − r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( )
);
16
17 echo $ r s t ;
HTML to ODT to XML to PDF to . . . 17 / 25
HTML to RST conversion
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l C o n t e n t L o c a t o r F i l t e r ( ) ,
9 ) );
10
11 $xhtml− o a d F i l e ( ’ w h i t e p a p e r . h t m l ’ ) ;
>l
12 // $xhtml− o a d F i l e ( ’ h t t p : / / q a f o o . com/
>l
w h i t e p a p e r . html ’ ) ;
13
14 $ r s t = new ezcDocumentRst ( ) ;
15 $ r s t − r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( )
);
16
17 echo $ r s t ;
HTML to ODT to XML to PDF to . . . 17 / 25
HTML to RST conversion
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l C o n t e n t L o c a t o r F i l t e r ( ) ,
9 ) );
10
11 $xhtml− o a d F i l e ( ’ w h i t e p a p e r . h t m l ’ ) ;
>l
12 // $xhtml− o a d F i l e ( ’ h t t p : / / q a f o o . com/
>l
w h i t e p a p e r . html ’ ) ;
13
14 $ r s t = new ezcDocumentRst ( ) ;
15 $ r s t − r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( )
);
16
17 echo $ r s t ;
HTML to ODT to XML to PDF to . . . 18 / 25
HTML to RST conversion
1 −−−−−−−−−−−−−−−
−−−−−−−−−−−−−−−
2 Your s o f t w a r e s q u a l i t y m a t t e r s
3 −−−−−−−−−−−−−−−
−−−−−−−−−−−−−−−
4
5 Contents
6 ========
7
8 − ‘ Introduction ‘
9 − ‘ Business goals ‘
10 [..]
11
12 R a i s i n g and c o n t i n u o u s l y m o n i t o r i n g t h e q u a l i t y o f s o f t w a r e a l l o w s you t o
13 i m p r o v e t h e R e t u r n On I n v e s t m e n t , t o r e d u c e t h e t i m e t o m a r k e t and t o i n c r e a s e
14 c u s t o m e r s a t i s f a c t i o n f o r y o u r s o f t w a r e p r o d u c t . No m a t t e r i f you d e v e l o p f o r
15 company i n t e r n a l u s e o n l y , on a c u s t o m e r b a s i s o r s t a n d a r d s o f t w a r e .
16
17 introduction
18 b u s i n e s s −g o a l s
19 [..]
20
21 .. f i g u r e : : / i m a g e s / w h i t e p a p e r / s t a i r w a y s t o q u a l i t y . png
22 : w i d t h : 500
23 : h e i g h t : 225
24 : a l t : Stairways to q u a l i t y
25
26 Introduction
27 ============
28
29 Q u a l i t y i m p a c t s v a r i o u s a r e a s o f y o u r company ’ s s o f t w a r e d e v e l o p m e n t [..]
HTML to ODT to XML to PDF to . . . 19 / 25
PDF generation
1 <?php
2 require ’ a u t o l o a d . php ’ ;
3
4 // C o n v e r t some web page t o PDF
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
9 ) );
10 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
11
12 // Load t h e docbook document and c r e a t e a PDF from i t
13 $ p d f = new ezcDocumentPdf ( ) ;
14 $p df− p t i o n s − r r o r R e p o r t i n g = E PARSE | E ERROR | E WARNING ;
>o >e
15
16 // Load a custom s t y l e s h e e t
17 $p df− o a d S t y l e s ( ’ custom . c s s ’ ) ;
>l
18
19 // Add a c u s t o m i z e d h e a d e r
20 $p df− e g i s t e r P d f P a r t ( new e z c D o c u m e n t P d f H e a d e r P d f P a r t (
>r
21 new e z c D o c u m e n t P d f F o o t e r O p t i o n s ( a r r a y (
22 ’ showPageNumber ’ = false ,
>
23 ’ height ’ = ’ 10mm’ ,
>
24 ) )
25 ) );
26
27 $p df− r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( ) ) ;
28 file put contents ( FILE . ’ . pdf ’ , $pdf ) ;
HTML to ODT to XML to PDF to . . . 19 / 25
PDF generation
1 <?php
2 require ’ a u t o l o a d . php ’ ;
3
4 // C o n v e r t some web page t o PDF
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
9 ) );
10 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
11
12 // Load t h e docbook document and c r e a t e a PDF from i t
13 $ p d f = new ezcDocumentPdf ( ) ;
14 $p df− p t i o n s − r r o r R e p o r t i n g = E PARSE | E ERROR | E WARNING ;
>o >e
15
16 // Load a custom s t y l e s h e e t
17 $p df− o a d S t y l e s ( ’ custom . c s s ’ ) ;
>l
18
19 // Add a c u s t o m i z e d h e a d e r
20 $p df− e g i s t e r P d f P a r t ( new e z c D o c u m e n t P d f H e a d e r P d f P a r t (
>r
21 new e z c D o c u m e n t P d f F o o t e r O p t i o n s ( a r r a y (
22 ’ showPageNumber ’ = false ,
>
23 ’ height ’ = ’ 10mm’ ,
>
24 ) )
25 ) );
26
27 $p df− r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( ) ) ;
28 file put contents ( FILE . ’ . pdf ’ , $pdf ) ;
HTML to ODT to XML to PDF to . . . 19 / 25
PDF generation
1 <?php
2 require ’ a u t o l o a d . php ’ ;
3
4 // C o n v e r t some web page t o PDF
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
9 ) );
10 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
11
12 // Load t h e docbook document and c r e a t e a PDF from i t
13 $ p d f = new ezcDocumentPdf ( ) ;
14 $p df− p t i o n s − r r o r R e p o r t i n g = E PARSE | E ERROR | E WARNING ;
>o >e
15
16 // Load a custom s t y l e s h e e t
17 $p df− o a d S t y l e s ( ’ custom . c s s ’ ) ;
>l
18
19 // Add a c u s t o m i z e d h e a d e r
20 $p df− e g i s t e r P d f P a r t ( new e z c D o c u m e n t P d f H e a d e r P d f P a r t (
>r
21 new e z c D o c u m e n t P d f F o o t e r O p t i o n s ( a r r a y (
22 ’ showPageNumber ’ = false ,
>
23 ’ height ’ = ’ 10mm’ ,
>
24 ) )
25 ) );
26
27 $p df− r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( ) ) ;
28 file put contents ( FILE . ’ . pdf ’ , $pdf ) ;
HTML to ODT to XML to PDF to . . . 19 / 25
PDF generation
1 <?php
2 require ’ a u t o l o a d . php ’ ;
3
4 // C o n v e r t some web page t o PDF
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
9 ) );
10 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
11
12 // Load t h e docbook document and c r e a t e a PDF from i t
13 $ p d f = new ezcDocumentPdf ( ) ;
14 $p df− p t i o n s − r r o r R e p o r t i n g = E PARSE | E ERROR | E WARNING ;
>o >e
15
16 // Load a custom s t y l e s h e e t
17 $p df− o a d S t y l e s ( ’ custom . c s s ’ ) ;
>l
18
19 // Add a c u s t o m i z e d h e a d e r
20 $p df− e g i s t e r P d f P a r t ( new e z c D o c u m e n t P d f H e a d e r P d f P a r t (
>r
21 new e z c D o c u m e n t P d f F o o t e r O p t i o n s ( a r r a y (
22 ’ showPageNumber ’ = false ,
>
23 ’ height ’ = ’ 10mm’ ,
>
24 ) )
25 ) );
26
27 $p df− r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( ) ) ;
28 file put contents ( FILE . ’ . pdf ’ , $pdf ) ;
HTML to ODT to XML to PDF to . . . 19 / 25
PDF generation
1 <?php
2 require ’ a u t o l o a d . php ’ ;
3
4 // C o n v e r t some web page t o PDF
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
9 ) );
10 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
11
12 // Load t h e docbook document and c r e a t e a PDF from i t
13 $ p d f = new ezcDocumentPdf ( ) ;
14 $p df− p t i o n s − r r o r R e p o r t i n g = E PARSE | E ERROR | E WARNING ;
>o >e
15
16 // Load a custom s t y l e s h e e t
17 $p df− o a d S t y l e s ( ’ custom . c s s ’ ) ;
>l
18
19 // Add a c u s t o m i z e d h e a d e r
20 $p df− e g i s t e r P d f P a r t ( new e z c D o c u m e n t P d f H e a d e r P d f P a r t (
>r
21 new e z c D o c u m e n t P d f F o o t e r O p t i o n s ( a r r a y (
22 ’ showPageNumber ’ = false ,
>
23 ’ height ’ = ’ 10mm’ ,
>
24 ) )
25 ) );
26
27 $p df− r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( ) ) ;
28 file put contents ( FILE . ’ . pdf ’ , $pdf ) ;
HTML to ODT to XML to PDF to . . . 19 / 25
PDF generation
1 <?php
2 require ’ a u t o l o a d . php ’ ;
3
4 // C o n v e r t some web page t o PDF
5 $ x h t m l = new ezcDocumentXhtml ( ) ;
6 $xhtml− e t F i l t e r s ( a r r a y (
>s
7 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
8 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
9 ) );
10 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
11
12 // Load t h e docbook document and c r e a t e a PDF from i t
13 $ p d f = new ezcDocumentPdf ( ) ;
14 $p df− p t i o n s − r r o r R e p o r t i n g = E PARSE | E ERROR | E WARNING ;
>o >e
15
16 // Load a custom s t y l e s h e e t
17 $p df− o a d S t y l e s ( ’ custom . c s s ’ ) ;
>l
18
19 // Add a c u s t o m i z e d h e a d e r
20 $p df− e g i s t e r P d f P a r t ( new e z c D o c u m e n t P d f H e a d e r P d f P a r t (
>r
21 new e z c D o c u m e n t P d f F o o t e r O p t i o n s ( a r r a y (
22 ’ showPageNumber ’ = false ,
>
23 ’ height ’ = ’ 10mm’ ,
>
24 ) )
25 ) );
26
27 $p df− r e a t e F r o m D o c b o o k ( $xhtml−
>c >getAsDocbook ( ) ) ;
28 file put contents ( FILE . ’ . pdf ’ , $pdf ) ;
HTML to ODT to XML to PDF to . . . 20 / 25
PDF generation
1 article {
2 f o n t−f a m i l y : ” s a n s−s e r i f ” ;
3 f o n t−s i z e : ”10 p t ” ;
4 }
5
6 title {
7 f o n t−f a m i l y : ” s a n s−s e r i f ” ;
8 c o l o r : #97BF0D ;
9
10 b o r d e r−bottom : 1 px s o l i d #555753;
11 b o r d e r−l e f t : 4 px s o l i d #555753;
12 p a d d i n g−l e f t : 4 px ;
13 }
14
15 section > section > section > section > t i t l e {
16 b o r d e r−c o l o r : #babdb6 ;
17 }
18
19 page {
20 p a d d i n g : ”15mm 30mm” ;
21 }
22
23 ulink {
24 c o l o r : #97BF0D ;
25 }
26
27 link {
28 c o l o r : #97BF0D ;
29 }
HTML to ODT to XML to PDF to . . . 21 / 25
PDF generation
HTML to ODT to XML to PDF to . . . 22 / 25
ODT generation
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 // C o n v e r t some i n p u t R S T f i l e t o docbook
6 $ x h t m l = new ezcDocumentXhtml ( ) ;
7 $xhtml− e t F i l t e r s ( a r r a y (
>s
8 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
9 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
10 ) );
11 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
12
13 $ c o n v e r t e r = new ezcDocumentDocbookToOdtConverter ( ) ;
14 $ c o n v e r t e r − p t i o n s − t y l e r − d d S t y l e s h e e t F i l e ( ’ custom . c s s ’ ) ;
>o >s >a
15
16 $ o d t = $ c o n v e r t e r − o n v e r t ( $xhtml−
>c >getAsDocbook ( ) ) ;
17 file put contents ( FILE . ’ . f o d t ’ , $odt ) ;
HTML to ODT to XML to PDF to . . . 22 / 25
ODT generation
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 // C o n v e r t some i n p u t R S T f i l e t o docbook
6 $ x h t m l = new ezcDocumentXhtml ( ) ;
7 $xhtml− e t F i l t e r s ( a r r a y (
>s
8 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
9 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
10 ) );
11 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
12
13 $ c o n v e r t e r = new ezcDocumentDocbookToOdtConverter ( ) ;
14 $ c o n v e r t e r − p t i o n s − t y l e r − d d S t y l e s h e e t F i l e ( ’ custom . c s s ’ ) ;
>o >s >a
15
16 $ o d t = $ c o n v e r t e r − o n v e r t ( $xhtml−
>c >getAsDocbook ( ) ) ;
17 file put contents ( FILE . ’ . f o d t ’ , $odt ) ;
HTML to ODT to XML to PDF to . . . 22 / 25
ODT generation
1 <?php
2
3 require ’ a u t o l o a d . php ’ ;
4
5 // C o n v e r t some i n p u t R S T f i l e t o docbook
6 $ x h t m l = new ezcDocumentXhtml ( ) ;
7 $xhtml− e t F i l t e r s ( a r r a y (
>s
8 new e z c D o c u m e n t X h t m l E l e m e n t F i l t e r ( ) ,
9 new e z c D o c u m e n t X h t m l X p a t h F i l t e r ( ’ // d i v [ @ c l a s s =” c o n t e n t ” ] ’ ) ,
10 ) );
11 $xhtml− o a d F i l e ( ’ c o n s u l t i n g . h t m l ’ ) ;
>l
12
13 $ c o n v e r t e r = new ezcDocumentDocbookToOdtConverter ( ) ;
14 $ c o n v e r t e r − p t i o n s − t y l e r − d d S t y l e s h e e t F i l e ( ’ custom . c s s ’ ) ;
>o >s >a
15
16 $ o d t = $ c o n v e r t e r − o n v e r t ( $xhtml−
>c >getAsDocbook ( ) ) ;
17 file put contents ( FILE . ’ . f o d t ’ , $odt ) ;
HTML to ODT to XML to PDF to . . . 23 / 25
ODT generation
HTML to ODT to XML to PDF to . . . 24 / 25
Outline
Introduction
The Document component
Getting into the code
End
HTML to ODT to XML to PDF to . . . 25 / 25
Thanks for listening
Apache Zeta Components
http://zetacomponents.org
#zetacomponents @ Freenode
Stay in touch
toby@qafoo.com
@tobySen
Take care for your software quality
http://qafoo.com
Also consulting / training / support for Zeta