SlideShare a Scribd company logo
Generating PDF from Python web applications
Gaël LE MIGNOT
Pilot Systems
June 6, 2017
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Summary
1 Introduction
2 Tools
3 Tips, tricks and pitfalls
4 Conclusion
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Introduction
Pilot Systems
Free Software service provider
Python Web application development and hosting
Using Zope/Plone (since 2000) and Django (since 0.96)
All kind of customers (public/private, small/big, . . . )
Generating PDFs
Very frequently asked
Different purpose require different tools
Several pitfalls to avoid
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Weasyprint - presentation
What is weasyprint?
Free Software Python library
Convert HTML5 page (using a print CSS) into PDF
Also exists in command-line
When to use it?
To convert an existing HTML document
Consistency: same templating engine, same language
For simple page layouts
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Weasyprint - code details
Simple usage
from weasyprint import HTML, CSS
html = template()
data = HTML(string=html).write_pdf()
Some mangling with BeautifulSoup
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
bl = ('typography.com', 'logged-in.css')
for css in soup.findAll("link"):
for cssname in bl:
if cssname in css['href']:
css.extract()
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Weasyprint - code details
Add some page header/footer
@page {
margin: 3cm 2cm;
@bottom-right {
content: "Page " counter(page)
}
@top-center {
content: "Pilot Systems";
}
}
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Reportlab - presentation
What is reportlab?
Python library for generating PDF and graphs
Powerful RML templating language
Template and story concepts
Versions and tools
Complicated licensing
Reportlab PDF toolkit: limited Free Software version
Reportlab PLUS: non-free complete version
trml2pdf: free software, third-party implementation of RML
RMLPageTemplate: Zope integration of trml2pdf
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Reportlab - code example
Bright warning
<template>
<fill color="yellow"/>
<rect x="115mm" y="217mm"
width="90mm" height="18mm"
fill="yes" stroke="yes"/>
<frame id="warning" x1="115mm" y1="213mm"
width="90mm" height="24mm" />
</template>
<story>
<para>
TEMPORARY DOCUMENT - DO NOT PRINT
</para>
</story>
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
pdftk
What is pdftk?
Toolbox to manipulate PDF
Perform operation like extract pages, concatenate
Can also stamp a PDF on top of another
Command-line tool, so use subprocess
Use-case
Afdas - collect taxes and finance training
Companies make a yearly declaration
Take a background and fill cells
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
LATEX
What is LATEX?
Very powerful document composition system
Used for scientific publishing, among others
Used for those slides, too
How to use it?
Generate a .tex file
Can use a template, or intermediate language (like rst)
Then execute pdflatex
When to use it?
Rich formatting
Table of content, index, glossary, . . .
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Other tools
For the brave
Client-side rendering with JS libraries
Using LibreOffice with pyuno
Generate QR-code/datamatrix with elaphe
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
HTTP Headers
Don’t forget HTTP headers
Specify the content-type
Hint between displaying and downloading
Provide default filename
Code example
response.setHeader('Content-Type',
'application/pdf')
cd = 'attachment; filename="%s"' % filename
response.setHeader('Content-Disposition', cd)
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Handling long generation times
The problem
Generate a PDF report of 500 pages
It takes 10 minutes
Timeout or users get angry
Solutions
Increase timeouts, inform users
Use fork or threads to generate async
Use a scheduler like Celery
Send the result by email, with a link
Cleaning
find /path/to/pdfs -mtime +14 -delete
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Careful with search engines
Typical situation
Public website (using a CMS)
Button on each page to get a PDF version
A crawler comes... and boom.
Don’t panic
Use robots.txt file, but limited
Have the button do a POST
Use load-balancer like haproxy and pin PDF requests
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
CPU and RAM usage
PDF generation is expensive
PDF generation can be heavy both in CPU and RAM
Always estimate your volume before deploying
Task schedulers (like Celery) are great help
Be nice!
#!/bin/sh
PDFTK=/usr/bin/pdftk
exec nice -n 10 taskset -c 0 $PDFTK "$@"
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Accessing external resources
The problem
Restricted access CSS and images
Common with weasyprint, but can also happen with other
tools
Solutions
Reuse the user’s cookies in the sub-requests
Extract the resources to a temporary directory
Allow unprotected access from localhost (dangerous)
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Accessing external resources
Cookie code example
import urllib2
cookies = request.cookies.items()
cookies = [ '%s=%s' % (k,v) for k,v in cookies ]
cookiestr = "; ".join(cookies)
cookiestr = cookiestr.replace('n', '')
opener = urllib2.build_opener()
opener.addheaders.append(('Cookie', cookies))
html = opener.open(ressource_url).read()
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Encrypted PDFs
Typical use-case
User submitted a form with text fields and PDF
attachments
At the end the answers are contactened into a PDF
Or even all the answers of all users!
Use weasyprint + pdftk or LATEX
What happens
It works most of the time
But on some PDF it breaks weirdly
The culprit: DRM (Digital Restrictions Management)
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Encrypted PDFs
What to do?
Ensure the PDF is not DRM-protected
Use pdfinfo from poppler
Code example
out = subprocess.check_output([ 'pdfinfo',
pdffile ])
if re.search('Encrypted:.*yes', out):
raise ValueError, "DRM protected"
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Conclusion
Conclusion
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
Conclusion
Thanks for listening!
Any question?
Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications

More Related Content

What's hot

DaNode - A home made web server in D
DaNode - A home made web server in DDaNode - A home made web server in D
DaNode - A home made web server in D
Andrei Alexandrescu
 
Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.
Yurii Bychenok
 
Infrastructure as "Code" with Pulumi
Infrastructure as "Code" with PulumiInfrastructure as "Code" with Pulumi
Infrastructure as "Code" with Pulumi
Venura Athukorala
 
Infrastructure-as-Code with Pulumi - Better than all the others (like Ansible)?
Infrastructure-as-Code with Pulumi- Better than all the others (like Ansible)?Infrastructure-as-Code with Pulumi- Better than all the others (like Ansible)?
Infrastructure-as-Code with Pulumi - Better than all the others (like Ansible)?
Jonas Hecht
 
Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5
Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5
Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5
David Voyles
 
Our Puppet Story (Linuxtag 2014)
Our Puppet Story (Linuxtag 2014)Our Puppet Story (Linuxtag 2014)
Our Puppet Story (Linuxtag 2014)
DECK36
 
20151117 IoT를 위한 서비스 구성과 개발
20151117 IoT를 위한 서비스 구성과 개발20151117 IoT를 위한 서비스 구성과 개발
20151117 IoT를 위한 서비스 구성과 개발
영욱 김
 
Beachhead implements new opcode on CLR JIT
Beachhead implements new opcode on CLR JITBeachhead implements new opcode on CLR JIT
Beachhead implements new opcode on CLR JIT
Kouji Matsui
 
Nodejs intro
Nodejs introNodejs intro
Nodejs intro
Ndjido Ardo BAR
 
#PDR15 - waf, wscript and Your Pebble App
#PDR15 - waf, wscript and Your Pebble App#PDR15 - waf, wscript and Your Pebble App
#PDR15 - waf, wscript and Your Pebble App
Pebble Technology
 
PyHEP 2018: Tools to bind to Python
PyHEP 2018:  Tools to bind to PythonPyHEP 2018:  Tools to bind to Python
PyHEP 2018: Tools to bind to Python
Henry Schreiner
 
Data Management and Streaming Strategies in Drakensang Online
Data Management and Streaming Strategies in Drakensang OnlineData Management and Streaming Strategies in Drakensang Online
Data Management and Streaming Strategies in Drakensang Online
Andre Weissflog
 
Node js introduction
Node js introductionNode js introduction
Node js introduction
Joseph de Castelnau
 
An Introduction of Node Package Manager (NPM)
An Introduction of Node Package Manager (NPM)An Introduction of Node Package Manager (NPM)
An Introduction of Node Package Manager (NPM)
iFour Technolab Pvt. Ltd.
 
Machine Learning on Your Hand - Introduction to Tensorflow Lite Preview
Machine Learning on Your Hand - Introduction to Tensorflow Lite PreviewMachine Learning on Your Hand - Introduction to Tensorflow Lite Preview
Machine Learning on Your Hand - Introduction to Tensorflow Lite Preview
Modulabs
 
Puppetizing Your Organization
Puppetizing Your OrganizationPuppetizing Your Organization
Puppetizing Your Organization
Robert Nelson
 
Real-Time Web Apps & Symfony. What are your options?
Real-Time Web Apps & Symfony. What are your options?Real-Time Web Apps & Symfony. What are your options?
Real-Time Web Apps & Symfony. What are your options?
Phil Leggetter
 
Infrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with PulumiInfrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with Pulumi
inovex GmbH
 
node.js dao
node.js daonode.js dao
node.js dao
Vladimir Miguro
 
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
DigitalOcean
 

What's hot (20)

DaNode - A home made web server in D
DaNode - A home made web server in DDaNode - A home made web server in D
DaNode - A home made web server in D
 
Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.
 
Infrastructure as "Code" with Pulumi
Infrastructure as "Code" with PulumiInfrastructure as "Code" with Pulumi
Infrastructure as "Code" with Pulumi
 
Infrastructure-as-Code with Pulumi - Better than all the others (like Ansible)?
Infrastructure-as-Code with Pulumi- Better than all the others (like Ansible)?Infrastructure-as-Code with Pulumi- Better than all the others (like Ansible)?
Infrastructure-as-Code with Pulumi - Better than all the others (like Ansible)?
 
Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5
Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5
Getting started with Emscripten – Transpiling C / C++ to JavaScript / HTML5
 
Our Puppet Story (Linuxtag 2014)
Our Puppet Story (Linuxtag 2014)Our Puppet Story (Linuxtag 2014)
Our Puppet Story (Linuxtag 2014)
 
20151117 IoT를 위한 서비스 구성과 개발
20151117 IoT를 위한 서비스 구성과 개발20151117 IoT를 위한 서비스 구성과 개발
20151117 IoT를 위한 서비스 구성과 개발
 
Beachhead implements new opcode on CLR JIT
Beachhead implements new opcode on CLR JITBeachhead implements new opcode on CLR JIT
Beachhead implements new opcode on CLR JIT
 
Nodejs intro
Nodejs introNodejs intro
Nodejs intro
 
#PDR15 - waf, wscript and Your Pebble App
#PDR15 - waf, wscript and Your Pebble App#PDR15 - waf, wscript and Your Pebble App
#PDR15 - waf, wscript and Your Pebble App
 
PyHEP 2018: Tools to bind to Python
PyHEP 2018:  Tools to bind to PythonPyHEP 2018:  Tools to bind to Python
PyHEP 2018: Tools to bind to Python
 
Data Management and Streaming Strategies in Drakensang Online
Data Management and Streaming Strategies in Drakensang OnlineData Management and Streaming Strategies in Drakensang Online
Data Management and Streaming Strategies in Drakensang Online
 
Node js introduction
Node js introductionNode js introduction
Node js introduction
 
An Introduction of Node Package Manager (NPM)
An Introduction of Node Package Manager (NPM)An Introduction of Node Package Manager (NPM)
An Introduction of Node Package Manager (NPM)
 
Machine Learning on Your Hand - Introduction to Tensorflow Lite Preview
Machine Learning on Your Hand - Introduction to Tensorflow Lite PreviewMachine Learning on Your Hand - Introduction to Tensorflow Lite Preview
Machine Learning on Your Hand - Introduction to Tensorflow Lite Preview
 
Puppetizing Your Organization
Puppetizing Your OrganizationPuppetizing Your Organization
Puppetizing Your Organization
 
Real-Time Web Apps & Symfony. What are your options?
Real-Time Web Apps & Symfony. What are your options?Real-Time Web Apps & Symfony. What are your options?
Real-Time Web Apps & Symfony. What are your options?
 
Infrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with PulumiInfrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with Pulumi
 
node.js dao
node.js daonode.js dao
node.js dao
 
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
 

Similar to Ways to generate PDF from Python Web applications, Gaël Le Mignot

PyQt Application Development On Maemo
PyQt Application Development On MaemoPyQt Application Development On Maemo
PyQt Application Development On Maemo
achipa
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
Florian Wilhelm
 
Introduction to Google App Engine with Python
Introduction to Google App Engine with PythonIntroduction to Google App Engine with Python
Introduction to Google App Engine with Python
Brian Lyttle
 
Tool overview – how to capture – how to create basic workflow .pptx
Tool overview – how to capture – how to create basic workflow .pptxTool overview – how to capture – how to create basic workflow .pptx
Tool overview – how to capture – how to create basic workflow .pptx
RUPAK BHATTACHARJEE
 
Continuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon OttoContinuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon Otto
Peter Bittner
 
Company Visitor Management System Report.docx
Company Visitor Management System Report.docxCompany Visitor Management System Report.docx
Company Visitor Management System Report.docx
fantabulous2024
 
بررسی چارچوب جنگو
بررسی چارچوب جنگوبررسی چارچوب جنگو
بررسی چارچوب جنگو
railsbootcamp
 
Lamp Zend Security
Lamp Zend SecurityLamp Zend Security
Lamp Zend Security
Ram Srivastava
 
Software Quality Assurance Tooling - Wintersession 2024
Software Quality Assurance Tooling - Wintersession 2024Software Quality Assurance Tooling - Wintersession 2024
Software Quality Assurance Tooling - Wintersession 2024
Henry Schreiner
 
Taking Your FDM Application to the Next Level with Advanced Scripting
Taking Your FDM Application to the Next Level with Advanced ScriptingTaking Your FDM Application to the Next Level with Advanced Scripting
Taking Your FDM Application to the Next Level with Advanced Scripting
Alithya
 
Programmable infrastructure with FlyScript
Programmable infrastructure with FlyScriptProgrammable infrastructure with FlyScript
Programmable infrastructure with FlyScript
Riverbed Technology
 
How do we do it
How do we do itHow do we do it
How do we do it
Peter Samoilov
 
Cloud Native Development
Cloud Native DevelopmentCloud Native Development
Cloud Native Development
Manuel Garcia
 
From localhost to the cloud: A Journey of Deployments
From localhost to the cloud: A Journey of DeploymentsFrom localhost to the cloud: A Journey of Deployments
From localhost to the cloud: A Journey of Deployments
Tegar Imansyah
 
Pre press workflow
Pre press workflowPre press workflow
Pre press workflow
Kulakkada pradeep
 
Php Development Stack
Php Development StackPhp Development Stack
Php Development Stack
Bipin Upadhyay
 
Php Development Stack
Php Development StackPhp Development Stack
Php Development Stack
shah_neeraj
 
Improving code quality using CI
Improving code quality using CIImproving code quality using CI
Improving code quality using CI
Martin de Keijzer
 

Similar to Ways to generate PDF from Python Web applications, Gaël Le Mignot (20)

PyQt Application Development On Maemo
PyQt Application Development On MaemoPyQt Application Development On Maemo
PyQt Application Development On Maemo
 
Django by rj
Django by rjDjango by rj
Django by rj
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Introduction to Google App Engine with Python
Introduction to Google App Engine with PythonIntroduction to Google App Engine with Python
Introduction to Google App Engine with Python
 
Tool overview – how to capture – how to create basic workflow .pptx
Tool overview – how to capture – how to create basic workflow .pptxTool overview – how to capture – how to create basic workflow .pptx
Tool overview – how to capture – how to create basic workflow .pptx
 
CGI by rj
CGI by rjCGI by rj
CGI by rj
 
Continuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon OttoContinuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon Otto
 
Company Visitor Management System Report.docx
Company Visitor Management System Report.docxCompany Visitor Management System Report.docx
Company Visitor Management System Report.docx
 
بررسی چارچوب جنگو
بررسی چارچوب جنگوبررسی چارچوب جنگو
بررسی چارچوب جنگو
 
Lamp Zend Security
Lamp Zend SecurityLamp Zend Security
Lamp Zend Security
 
Software Quality Assurance Tooling - Wintersession 2024
Software Quality Assurance Tooling - Wintersession 2024Software Quality Assurance Tooling - Wintersession 2024
Software Quality Assurance Tooling - Wintersession 2024
 
Taking Your FDM Application to the Next Level with Advanced Scripting
Taking Your FDM Application to the Next Level with Advanced ScriptingTaking Your FDM Application to the Next Level with Advanced Scripting
Taking Your FDM Application to the Next Level with Advanced Scripting
 
Programmable infrastructure with FlyScript
Programmable infrastructure with FlyScriptProgrammable infrastructure with FlyScript
Programmable infrastructure with FlyScript
 
How do we do it
How do we do itHow do we do it
How do we do it
 
Cloud Native Development
Cloud Native DevelopmentCloud Native Development
Cloud Native Development
 
From localhost to the cloud: A Journey of Deployments
From localhost to the cloud: A Journey of DeploymentsFrom localhost to the cloud: A Journey of Deployments
From localhost to the cloud: A Journey of Deployments
 
Pre press workflow
Pre press workflowPre press workflow
Pre press workflow
 
Php Development Stack
Php Development StackPhp Development Stack
Php Development Stack
 
Php Development Stack
Php Development StackPhp Development Stack
Php Development Stack
 
Improving code quality using CI
Improving code quality using CIImproving code quality using CI
Improving code quality using CI
 

More from Pôle Systematic Paris-Region

OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...
OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...
OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...
Pôle Systematic Paris-Region
 
OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...
OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...
OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...
Pôle Systematic Paris-Region
 
OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...
OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...
OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...
Pôle Systematic Paris-Region
 
OSIS19_Cloud : Performance and power management in virtualized data centers, ...
OSIS19_Cloud : Performance and power management in virtualized data centers, ...OSIS19_Cloud : Performance and power management in virtualized data centers, ...
OSIS19_Cloud : Performance and power management in virtualized data centers, ...
Pôle Systematic Paris-Region
 
OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...
OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...
OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...
Pôle Systematic Paris-Region
 
OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...
OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...
OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...
Pôle Systematic Paris-Region
 
OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...
OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...
OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...
Pôle Systematic Paris-Region
 
Osis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick Moy
Osis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick MoyOsis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick Moy
Osis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick Moy
Pôle Systematic Paris-Region
 
Osis18_Cloud : Pas de commun sans communauté ?
Osis18_Cloud : Pas de commun sans communauté ?Osis18_Cloud : Pas de commun sans communauté ?
Osis18_Cloud : Pas de commun sans communauté ?
Pôle Systematic Paris-Region
 
Osis18_Cloud : Projet Wolphin
Osis18_Cloud : Projet Wolphin Osis18_Cloud : Projet Wolphin
Osis18_Cloud : Projet Wolphin
Pôle Systematic Paris-Region
 
Osis18_Cloud : Virtualisation efficace d’architectures NUMA
Osis18_Cloud : Virtualisation efficace d’architectures NUMAOsis18_Cloud : Virtualisation efficace d’architectures NUMA
Osis18_Cloud : Virtualisation efficace d’architectures NUMA
Pôle Systematic Paris-Region
 
Osis18_Cloud : DeepTorrent Stockage distribué perenne basé sur Bittorrent
Osis18_Cloud : DeepTorrent Stockage distribué perenne basé sur BittorrentOsis18_Cloud : DeepTorrent Stockage distribué perenne basé sur Bittorrent
Osis18_Cloud : DeepTorrent Stockage distribué perenne basé sur Bittorrent
Pôle Systematic Paris-Region
 
Osis18_Cloud : Software-heritage
Osis18_Cloud : Software-heritageOsis18_Cloud : Software-heritage
Osis18_Cloud : Software-heritage
Pôle Systematic Paris-Region
 
OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...
OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...
OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...
Pôle Systematic Paris-Region
 
OSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riot
OSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riotOSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riot
OSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riot
Pôle Systematic Paris-Region
 
OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...
OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...
OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...
Pôle Systematic Paris-Region
 
OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...
OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...
OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...
Pôle Systematic Paris-Region
 
OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...
OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...
OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...
Pôle Systematic Paris-Region
 
OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)
OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)
OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)
Pôle Systematic Paris-Region
 
PyParis 2017 / Un mooc python, by thierry parmentelat
PyParis 2017 / Un mooc python, by thierry parmentelatPyParis 2017 / Un mooc python, by thierry parmentelat
PyParis 2017 / Un mooc python, by thierry parmentelat
Pôle Systematic Paris-Region
 

More from Pôle Systematic Paris-Region (20)

OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...
OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...
OSIS19_IoT :Transparent remote connectivity to short-range IoT devices, by Na...
 
OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...
OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...
OSIS19_Cloud : SAFC: Scheduling and Allocation Framework for Containers in a ...
 
OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...
OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...
OSIS19_Cloud : Qu’apporte l’observabilité à la gestion de configuration? par ...
 
OSIS19_Cloud : Performance and power management in virtualized data centers, ...
OSIS19_Cloud : Performance and power management in virtualized data centers, ...OSIS19_Cloud : Performance and power management in virtualized data centers, ...
OSIS19_Cloud : Performance and power management in virtualized data centers, ...
 
OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...
OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...
OSIS19_Cloud : Des objets dans le cloud, et qui y restent -- L'expérience du ...
 
OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...
OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...
OSIS19_Cloud : Attribution automatique de ressources pour micro-services, Alt...
 
OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...
OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...
OSIS19_IoT : State of the art in security for embedded systems and IoT, by Pi...
 
Osis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick Moy
Osis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick MoyOsis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick Moy
Osis19_IoT: Proof of Pointer Programs with Ownership in SPARK, by Yannick Moy
 
Osis18_Cloud : Pas de commun sans communauté ?
Osis18_Cloud : Pas de commun sans communauté ?Osis18_Cloud : Pas de commun sans communauté ?
Osis18_Cloud : Pas de commun sans communauté ?
 
Osis18_Cloud : Projet Wolphin
Osis18_Cloud : Projet Wolphin Osis18_Cloud : Projet Wolphin
Osis18_Cloud : Projet Wolphin
 
Osis18_Cloud : Virtualisation efficace d’architectures NUMA
Osis18_Cloud : Virtualisation efficace d’architectures NUMAOsis18_Cloud : Virtualisation efficace d’architectures NUMA
Osis18_Cloud : Virtualisation efficace d’architectures NUMA
 
Osis18_Cloud : DeepTorrent Stockage distribué perenne basé sur Bittorrent
Osis18_Cloud : DeepTorrent Stockage distribué perenne basé sur BittorrentOsis18_Cloud : DeepTorrent Stockage distribué perenne basé sur Bittorrent
Osis18_Cloud : DeepTorrent Stockage distribué perenne basé sur Bittorrent
 
Osis18_Cloud : Software-heritage
Osis18_Cloud : Software-heritageOsis18_Cloud : Software-heritage
Osis18_Cloud : Software-heritage
 
OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...
OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...
OSIS18_IoT: L'approche machine virtuelle pour les microcontrôleurs, le projet...
 
OSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riot
OSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riotOSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riot
OSIS18_IoT: La securite des objets connectes a bas cout avec l'os et riot
 
OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...
OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...
OSIS18_IoT : Solution de mise au point pour les systemes embarques, par Julio...
 
OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...
OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...
OSIS18_IoT : Securisation du reseau des objets connectes, par Nicolas LE SAUZ...
 
OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...
OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...
OSIS18_IoT : Ada and SPARK - Defense in Depth for Safe Micro-controller Progr...
 
OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)
OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)
OSIS18_IoT : RTEMS pour l'IoT professionnel, par Pierre Ficheux (Smile ECS)
 
PyParis 2017 / Un mooc python, by thierry parmentelat
PyParis 2017 / Un mooc python, by thierry parmentelatPyParis 2017 / Un mooc python, by thierry parmentelat
PyParis 2017 / Un mooc python, by thierry parmentelat
 

Recently uploaded

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 

Ways to generate PDF from Python Web applications, Gaël Le Mignot

  • 1. Generating PDF from Python web applications Gaël LE MIGNOT Pilot Systems June 6, 2017 Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 2. Summary 1 Introduction 2 Tools 3 Tips, tricks and pitfalls 4 Conclusion Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 3. Introduction Pilot Systems Free Software service provider Python Web application development and hosting Using Zope/Plone (since 2000) and Django (since 0.96) All kind of customers (public/private, small/big, . . . ) Generating PDFs Very frequently asked Different purpose require different tools Several pitfalls to avoid Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 4. Weasyprint - presentation What is weasyprint? Free Software Python library Convert HTML5 page (using a print CSS) into PDF Also exists in command-line When to use it? To convert an existing HTML document Consistency: same templating engine, same language For simple page layouts Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 5. Weasyprint - code details Simple usage from weasyprint import HTML, CSS html = template() data = HTML(string=html).write_pdf() Some mangling with BeautifulSoup from bs4 import BeautifulSoup soup = BeautifulSoup(html, 'html.parser') bl = ('typography.com', 'logged-in.css') for css in soup.findAll("link"): for cssname in bl: if cssname in css['href']: css.extract() Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 6. Weasyprint - code details Add some page header/footer @page { margin: 3cm 2cm; @bottom-right { content: "Page " counter(page) } @top-center { content: "Pilot Systems"; } } Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 7. Reportlab - presentation What is reportlab? Python library for generating PDF and graphs Powerful RML templating language Template and story concepts Versions and tools Complicated licensing Reportlab PDF toolkit: limited Free Software version Reportlab PLUS: non-free complete version trml2pdf: free software, third-party implementation of RML RMLPageTemplate: Zope integration of trml2pdf Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 8. Reportlab - code example Bright warning <template> <fill color="yellow"/> <rect x="115mm" y="217mm" width="90mm" height="18mm" fill="yes" stroke="yes"/> <frame id="warning" x1="115mm" y1="213mm" width="90mm" height="24mm" /> </template> <story> <para> TEMPORARY DOCUMENT - DO NOT PRINT </para> </story> Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 9. pdftk What is pdftk? Toolbox to manipulate PDF Perform operation like extract pages, concatenate Can also stamp a PDF on top of another Command-line tool, so use subprocess Use-case Afdas - collect taxes and finance training Companies make a yearly declaration Take a background and fill cells Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 10. LATEX What is LATEX? Very powerful document composition system Used for scientific publishing, among others Used for those slides, too How to use it? Generate a .tex file Can use a template, or intermediate language (like rst) Then execute pdflatex When to use it? Rich formatting Table of content, index, glossary, . . . Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 11. Other tools For the brave Client-side rendering with JS libraries Using LibreOffice with pyuno Generate QR-code/datamatrix with elaphe Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 12. HTTP Headers Don’t forget HTTP headers Specify the content-type Hint between displaying and downloading Provide default filename Code example response.setHeader('Content-Type', 'application/pdf') cd = 'attachment; filename="%s"' % filename response.setHeader('Content-Disposition', cd) Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 13. Handling long generation times The problem Generate a PDF report of 500 pages It takes 10 minutes Timeout or users get angry Solutions Increase timeouts, inform users Use fork or threads to generate async Use a scheduler like Celery Send the result by email, with a link Cleaning find /path/to/pdfs -mtime +14 -delete Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 14. Careful with search engines Typical situation Public website (using a CMS) Button on each page to get a PDF version A crawler comes... and boom. Don’t panic Use robots.txt file, but limited Have the button do a POST Use load-balancer like haproxy and pin PDF requests Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 15. CPU and RAM usage PDF generation is expensive PDF generation can be heavy both in CPU and RAM Always estimate your volume before deploying Task schedulers (like Celery) are great help Be nice! #!/bin/sh PDFTK=/usr/bin/pdftk exec nice -n 10 taskset -c 0 $PDFTK "$@" Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 16. Accessing external resources The problem Restricted access CSS and images Common with weasyprint, but can also happen with other tools Solutions Reuse the user’s cookies in the sub-requests Extract the resources to a temporary directory Allow unprotected access from localhost (dangerous) Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 17. Accessing external resources Cookie code example import urllib2 cookies = request.cookies.items() cookies = [ '%s=%s' % (k,v) for k,v in cookies ] cookiestr = "; ".join(cookies) cookiestr = cookiestr.replace('n', '') opener = urllib2.build_opener() opener.addheaders.append(('Cookie', cookies)) html = opener.open(ressource_url).read() Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 18. Encrypted PDFs Typical use-case User submitted a form with text fields and PDF attachments At the end the answers are contactened into a PDF Or even all the answers of all users! Use weasyprint + pdftk or LATEX What happens It works most of the time But on some PDF it breaks weirdly The culprit: DRM (Digital Restrictions Management) Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 19. Encrypted PDFs What to do? Ensure the PDF is not DRM-protected Use pdfinfo from poppler Code example out = subprocess.check_output([ 'pdfinfo', pdffile ]) if re.search('Encrypted:.*yes', out): raise ValueError, "DRM protected" Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 20. Conclusion Conclusion Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications
  • 21. Conclusion Thanks for listening! Any question? Gaël LE MIGNOT Pilot Systems Generating PDF from Python web applications