pypy: is it ready for production?

Mark Rees
Group CTO
Censof Holdings Berhad
pypy & me

not affiliated with pypy team
have followed its development since
2004
use cpython and jython at work
used ironpython for small projects
the question:
would pypy improve performance of
some of our workloads?
i am a manager who still wants to be a
programmer, so i did the analysis
pypy
what is pypy?
 - RPython translation toolchain, a framework for
generating dynamic programming language
implementations
 - an implementation of Python in Python using the
framework
history
- first sprint 2003, EU project from 2004 – 2007
- open source project from 2007
  https://bitbucket.org/pypy
- pypy 1.4 first release suitable for “production”
12/2010
pypy

want to know more about pypy
- http://pypy.org/
- david beazley pycon 2012 keynote
http://goo.gl/5PXFQ
- how the pypy jit works http://goo.gl/dKgFp
- why pypy by example http://goo.gl/vpQyJ
production ready – a definition
http://programmers.stackexchange.com/questions/61726/define-production-ready

           it runs
           it satisfies the project requirements
           its design was well thought out
           it's stable
           it's maintainable
           it's scalable
           it's documented

           it works with the python modules we use
           it is as fast or faster than cpython
pypy – does it run?

                     of course, it runs




See http://pypy.readthedocs.org/en/latest/cpython_differences.html
for differences between PyPy and CPython
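One documented difference is that PyPy does not use reference counting, so a file that is simply dropped is closed whenever the garbage collector runs rather than immediately. A minimal sketch of code that behaves the same on both interpreters (the helper names `running_on_pypy` and `write_and_read` are made up for this example):

```python
import os
import platform
import tempfile


def running_on_pypy():
    # platform.python_implementation() returns e.g. 'CPython' or 'PyPy'.
    return platform.python_implementation() == "PyPy"


def write_and_read(path, text):
    # 'with' closes each file deterministically. On PyPy a file left open
    # is only closed when the garbage collector gets to it, not as soon as
    # the last reference is dropped.
    with open(path, "w") as f:
        f.write(text)
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        print(platform.python_implementation(), write_and_read(path, "hello"))
    finally:
        os.remove(path)
```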
pypy – other production criteria
does it satisfy the project requirements
- yes
was its design well thought out
- I would assume so
is it stable
- yes
is it maintainable
- 7 out of 10
is it scalable
- stackless & greenlets built in
is it documented
- cpython docs for functionality, rpython toolchain 8 out
of 10
pypy – does it work with the modules we use

standard library modules supported:
 __builtin__, __pypy__, _ast, _bisect, _codecs, _collections, _ffi, _hashlib,
 _io, _locale, _lsprof, _md5, _minimal_curses, _multiprocessing, _random,
 _rawffi, _sha, _socket, _sre, _ssl, _warnings, _weakref, _winreg, array,
 binascii, bz2, cStringIO, clr, cmath, cpyext, crypt, errno, exceptions,
 fcntl, gc, imp, itertools, marshal, math, mmap, operator, oracle, parser,
 posix, pyexpat, select, signal, struct, symbol, sys, termios, thread, time,
 token, unicodedata, zipimport, zlib

these modules are supported but written in
python:
 cPickle, _csv, ctypes, datetime, dbm, _functools, grp, pwd, readline,
 resource, sqlite3, syslog, tputil

many python libs are known to work, like:
 ctypes, django, pyglet, sqlalchemy, PIL. See
 https://bitbucket.org/pypy/compatibility/wiki/Home for a more
 exhaustive list.
pypy – does it work with the modules we use
pypy c-api support is beta, worked most of
the time but failed with reportlab:
Fatal error in cpyext, CPython compatibility layer, calling PySequence_GetItem
Either report a bug or consider not using this particular extension
<OpErrFmt object at 0x7f1e89587e88>
RPython traceback:
 File "module_cpyext_api_2.c", line 51963, in PySequence_GetItem
 File "module_cpyext_pyobject.c", line 1071, in BaseCpyTypedescr_realize
 File "objspace_std_objspace.c", line 3396, in allocate_instance__W_ObjectObject
 File "objspace_std_typeobject.c", line 3010, in W_TypeObject_check_user_subclass
Segmentation fault
But this was the only compatibility issue we
had running all of our python code under
pypy, and we could fall back to the pure python
reportlab extensions anyway.
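A common way to keep a pure python fallback available is an optional import, sketched below. The module name `_hypothetical_speedups` and the function `escape_text` are invented for illustration and are not reportlab's API. Note that this pattern only covers an extension that is missing. An extension that imports and then crashes, as in the traceback above, has to be left uninstalled for the fallback to take over.

```python
try:
    # Hypothetical compiled accelerator: the module name is made up for
    # this sketch and is not part of reportlab.
    from _hypothetical_speedups import escape_text
    USING_C_EXTENSION = True
except ImportError:
    USING_C_EXTENSION = False

    def escape_text(text):
        # Pure python fallback offering the same interface.
        return (text.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;"))


if __name__ == "__main__":
    print(USING_C_EXTENSION, escape_text("a<b & c"))
```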
pypy – does it run as fast as cpython




                 but!



           http://speed.pypy.org/
pypy django benchmark
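import time

from django.template import Context, Template

# Template that renders a table, escaping every cell.
DJANGO_TMPL = Template("""<table>
{% for row in table %}
<tr>{% for col in row %}<td>{{ col|escape }}</td>{% endfor %}</tr>
{% endfor %}
</table>
""")

def test_django(count):
    # Python 2 code (xrange): a 150 x 150 table of integers.
    table = [xrange(150) for _ in xrange(150)]
    context = Context({"table": table})

    # Warm up Django.
    DJANGO_TMPL.render(context)
    DJANGO_TMPL.render(context)

    # Time each render separately and return the per-run durations.
    times = []
    for _ in xrange(count):
        t0 = time.time()
        data = DJANGO_TMPL.render(context)
        t1 = time.time()
        times.append(t1 - t0)
    return times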
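The same warm-up-then-measure pattern can be pulled out into a small harness that does not depend on Django. This is only a sketch: `time_calls` and `average` are names chosen for this example, not functions from the benchmark repository. The untimed warm-up calls matter for a JIT, which needs a few iterations before compiled code is in use.

```python
import time


def time_calls(func, count, warmup=2):
    """Call func() `warmup` times untimed, then time `count` further calls."""
    for _ in range(warmup):
        func()
    times = []
    for _ in range(count):
        t0 = time.time()
        func()
        t1 = time.time()
        times.append(t1 - t0)
    return times


def average(times):
    # Mean of the per-run durations, in seconds.
    return sum(times) / len(times)


if __name__ == "__main__":
    times = time_calls(lambda: sum(i * i for i in range(100000)), count=5)
    print("average: %.4f s over %d runs" % (average(times), len(times)))
```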
my csv to xml benchmark
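import csv

# SAXWriter is the benchmark's XML writer helper; it is not shown on the slide.
def bench(data, output):
    f = open(data, 'rb')  # binary mode, as the Python 2 csv module expects
    fn = ['age', ...]  # column names; the rest of the list is elided on the slide
    reader = csv.DictReader(f, fn)
    writer = SAXWriter(output)
    writer.start_doc()
    writer.start_tag('data')
    try:
        for row in reader:
            writer.start_tag('row')
            for key in row.keys():
                # Spaces are not valid in XML tag names, so replace them.
                writer.tag(key.replace(' ', '_'), body=row[key])
            writer.end_tag('row')
    finally:
        f.close()
        writer.end_tag('data')
        writer.end_doc()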
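The slide does not show SAXWriter. One possible shape for it, assuming it is a thin wrapper around the standard library's `xml.sax.saxutils.XMLGenerator` (an assumption; the real helper may differ), is sketched here with the same method names the benchmark calls:

```python
import io
from xml.sax.saxutils import XMLGenerator


class SAXWriter(object):
    """Hypothetical stand-in for the benchmark's writer, not the original."""

    def __init__(self, output, encoding="utf-8"):
        self._gen = XMLGenerator(output, encoding)

    def start_doc(self):
        self._gen.startDocument()

    def end_doc(self):
        self._gen.endDocument()

    def start_tag(self, name):
        # No attributes are needed for this benchmark, so pass an empty dict.
        self._gen.startElement(name, {})

    def end_tag(self, name):
        self._gen.endElement(name)

    def tag(self, name, body=""):
        # Write <name>body</name>; characters() escapes &, < and >.
        self.start_tag(name)
        self._gen.characters(body)
        self.end_tag(name)


if __name__ == "__main__":
    buf = io.StringIO()
    writer = SAXWriter(buf)
    writer.start_doc()
    writer.start_tag("data")
    writer.tag("age", body="42")
    writer.end_tag("data")
    writer.end_doc()
    print(buf.getvalue())
```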
my pypy benchmarks
https://bitbucket.org/hexdump42/pypy-benchmarks

average execution time (in seconds)

benchmark      cpython 2.7.3   pypy-jit 1.9              pypy-jit nightly
bm_csv2xml     88.26/94.04     28.89   3.0549 x faster   28.96   3.3723 x faster
bm_csv         1.54/1.65       5.89    3.8122 x slower   5.78    3.5025 x slower
bm_openpyxml   1.31/1.21       3.26    2.4871 x slower   3.15    2.6051 x slower
bm_xhtml2pdf   1.91/1.95       3.27    1.7155 x slower   4.22    2.1637 x slower
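The "x faster" and "x slower" factors read as ratios of average times. A small sketch of that arithmetic follows, assuming the factor is the larger time divided by the smaller and using the first cpython figure from each row. The slides do not state how the published factors were derived, so results computed this way from the rounded table values can differ slightly from them. `compare` is a name chosen for this example.

```python
def compare(cpython_seconds, pypy_seconds):
    """Return (factor, 'faster' or 'slower') for pypy relative to cpython."""
    if pypy_seconds <= cpython_seconds:
        return cpython_seconds / pypy_seconds, "faster"
    return pypy_seconds / cpython_seconds, "slower"


if __name__ == "__main__":
    # Times taken from the bm_csv2xml and bm_csv rows (pypy-jit 1.9 column).
    for name, cpython_s, pypy_s in [("bm_csv2xml", 88.26, 28.89),
                                    ("bm_csv", 1.54, 5.89)]:
        factor, direction = compare(cpython_s, pypy_s)
        print("%s: %.4f x %s" % (name, factor, direction))
```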
my pypy benchmarks
https://bitbucket.org/hexdump42/pypy-benchmarks

max memory use

benchmark      cpython 2.7.3   pypy-jit 1.9               pypy-jit nightly
bm_interp      5412/5248       12556    2.32 x larger     21880    4.1692 x larger
bm_csv2xml     7048/7064       55180    7.8292 x larger   55232    7.8188 x larger
bm_csv         5812/5180       52200    8.9814 x larger   52176    10.0726 x larger
bm_openpyxml   12656/12656     77252    6.1040 x larger   80428    6.3549 x larger
bm_xhtml2pdf   48880/34884     236792   4.8444 x larger   101376   2.906 x larger
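The slides do not say how max memory use was measured or in which unit. One way to read a process's peak memory on Unix is the standard `resource` module, sketched below as an assumption rather than the method used for the table. `peak_memory` is a name chosen for this example.

```python
import resource


def peak_memory():
    # ru_maxrss is the peak resident set size of this process so far.
    # Units are platform dependent: kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


if __name__ == "__main__":
    before = peak_memory()
    data = [str(i) for i in range(200000)]  # allocate something measurable
    print("peak before: %d, after: %d" % (before, peak_memory()))
```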
what is the pypy jit doing?
https://bitbucket.org/pypy/jitviewer/
modified csv pypy benchmarks
https://bitbucket.org/hexdump42/pypy-benchmarks

average execution time (in seconds)

benchmark        cpython 2.7.3   pypy-jit 1.9              pypy-jit nightly
bm_csv2xml_mod   88.25/90.02     23.65   3.7315 x faster   23.86   3.7728 x faster
bm_csv_mod       1.62/1.69       1.89    0.8571 x slower   1.72    0.9825 x slower
is pypy ready for production

1. it runs
2. it satisfies the project requirements
3. its design was well thought out
4. it's stable
5. it's maintainable
6. it's scalable
7. it's documented
8. it works with the python modules we use
9. it is as fast or faster than cpython
some other reasons to consider pypy

cffi – foreign function interface for python
- http://cffi.readthedocs.org/
pypy version of numpy
py3k version of pypy
check out the STM/AME project


http://www.pypy.org/howtohelp.html
contact details




                          Mark Rees
                   mark at censof dot com
                        +Mark Rees
                       @hexdump42
                  hex-dump.blogspot.com


http://www.slideshare.net/hexdump42/pypy-is-it-ready-for-production

Is PyPy Ready for Production

  • 1. is it ready for production? Mark Rees Group CTO Censof Holdings Berhad
  • 2. pypy & me not affiliated with pypy team have followed its development since 2004 use cpython and jython at work used ironpython for small projects the question: would pypy improve performance of some of our workloads? i am a manager who still wants to be a programmer, so i did the analysis
  • 3. pypy what is pypy? - RPython translation toolchain, a framework for generating dynamic programming language implementations - an implementation of Python in Python using the framework history - first sprint 2003, EU project from 2004 – 2007 - open source project from 2007 https://bitbucket.org/pypy - pypy 1.4 first release suitable for “production” 12/2010
  • 4. pypy want to know more about pypy - http://pypy.org/ - david beazley pycon 2012 keynote http://goo.gl/5PXFQ - how the pypy jit works http://goo.gl/dKgFp - why pypy by example http://goo.gl/vpQyJ
  • 5. production ready – a definition http://programmers.stackexchange.com/questions/61726/define-production-ready it runs it satisfies the project requirements its design was well thought out it's stable it's maintainable it's scalable it's documented it works with the python modules we use it is as fast or faster than cpython
  • 6. pypy – does it run? of course, it runs See http://pypy.readthedocs.org/en/latest/cpython_differences.html for differences between PyPy and CPython
  • 7. pypy – other production criteria does it satisfy the project requirements - yes is its design well thought out - I would assume so is it stable - yes is it maintainable - 7 out of 10 is it scalable - stackless & greenlets built in is it documented - cpython docs for functionality, rpython toolchain 8 out of 10
  • 8. pypy – does it work with the modules we use
    standard library modules supported: __builtin__, __pypy__, _ast, _bisect, _codecs, _collections, _ffi, _hashlib, _io, _locale, _lsprof, _md5, _minimal_curses, _multiprocessing, _random, _rawffi, _sha, _socket, _sre, _ssl, _warnings, _weakref, _winreg, array, binascii, bz2, cStringIO, clr, cmath, cpyext, crypt, errno, exceptions, fcntl, gc, imp, itertools, marshal, math, mmap, operator, oracle, parser, posix, pyexpat, select, signal, struct, symbol, sys, termios, thread, time, token, unicodedata, zipimport, zlib
    these modules are supported but written in python: cPickle, _csv, ctypes, datetime, dbm, _functools, grp, pwd, readline, resource, sqlite3, syslog, tputil
    many python libs are known to work, like: ctypes, django, pyglet, sqlalchemy, PIL. See https://bitbucket.org/pypy/compatibility/wiki/Home for a more exhaustive list.
  • 9. pypy – does it work with the modules we use
    pypy c-api support is beta, worked most of the time but failed with reportlab:
        Fatal error in cpyext, CPython compatibility layer, calling PySequence_GetItem
        Either report a bug or consider not using this particular extension
        <OpErrFmt object at 0x7f1e89587e88>
        RPython traceback:
          File "module_cpyext_api_2.c", line 51963, in PySequence_GetItem
          File "module_cpyext_pyobject.c", line 1071, in BaseCpyTypedescr_realize
          File "objspace_std_objspace.c", line 3396, in allocate_instance__W_ObjectObject
          File "objspace_std_typeobject.c", line 3010, in W_TypeObject_check_user_subclass
        Segmentation fault
    But this was the only compatibility issue we had running all of our python code under pypy and we could fall back to pure python reportlab extensions anyway.
  • 10. pypy – does it run as fast as cpython but! http://speed.pypy.org/
  • 11. pypy django benchmark
        # Python 2 code (uses xrange); imports added for completeness.
        import time
        from django.template import Context, Template

        DJANGO_TMPL = Template("""<table>
        {% for row in table %}
        <tr>{% for col in row %}<td>{{ col|escape }}</td>{% endfor %}</tr>
        {% endfor %}
        </table>
        """)

        def test_django(count):
            table = [xrange(150) for _ in xrange(150)]
            context = Context({"table": table})

            # Warm up Django.
            DJANGO_TMPL.render(context)
            DJANGO_TMPL.render(context)

            times = []
            for _ in xrange(count):
                t0 = time.time()
                data = DJANGO_TMPL.render(context)
                t1 = time.time()
                times.append(t1 - t0)
            return times
  • 12. my csv to xml benchmark
        import csv

        # SAXWriter is not defined on the slide.
        def bench(data, output):
            f = open(data, 'rb')
            fn = ['age', ...]  # remaining field names are elided on the slide
            reader = csv.DictReader(f, fn)
            writer = SAXWriter(output)
            writer.start_doc()
            writer.start_tag('data')
            try:
                for row in reader:
                    writer.start_tag('row')
                    for key in row.keys():
                        writer.tag(key.replace(' ', '_'), body=row[key])
                    writer.end_tag('row')
            finally:
                f.close()
            writer.end_tag('data')
            writer.end_doc()
  • 13. my pypy benchmarks https://bitbucket.org/hexdump42/pypy-benchmarks
    average execution time (in seconds)
    benchmark    | cpython 2.7.3 | pypy-jit 1.9            | pypy-jit nightly
    bm_csv2xml   | 88.26/94.04   | 28.89 (3.0549 x faster) | 28.96 (3.3723 x faster)
  • 14. my pypy benchmarks https://bitbucket.org/hexdump42/pypy-benchmarks
    average execution time (in seconds)
    benchmark    | cpython 2.7.3 | pypy-jit 1.9            | pypy-jit nightly
    bm_csv2xml   | 88.26/94.04   | 28.89 (3.0549 x faster) | 28.96 (3.3723 x faster)
    bm_csv       | 1.54/1.65     | 5.89 (3.8122 x slower)  | 5.78 (3.5025 x slower)
  • 15. my pypy benchmarks https://bitbucket.org/hexdump42/pypy-benchmarks
    average execution time (in seconds)
    benchmark    | cpython 2.7.3 | pypy-jit 1.9            | pypy-jit nightly
    bm_csv2xml   | 88.26/94.04   | 28.89 (3.0549 x faster) | 28.96 (3.3723 x faster)
    bm_csv       | 1.54/1.65     | 5.89 (3.8122 x slower)  | 5.78 (3.5025 x slower)
    bm_openpyxml | 1.31/1.21     | 3.26 (2.4871 x slower)  | 3.15 (2.6051 x slower)
  • 16. my pypy benchmarks https://bitbucket.org/hexdump42/pypy-benchmarks
    average execution time (in seconds)
    benchmark    | cpython 2.7.3 | pypy-jit 1.9            | pypy-jit nightly
    bm_csv2xml   | 88.26/94.04   | 28.89 (3.0549 x faster) | 28.96 (3.3723 x faster)
    bm_csv       | 1.54/1.65     | 5.89 (3.8122 x slower)  | 5.78 (3.5025 x slower)
    bm_openpyxml | 1.31/1.21     | 3.26 (2.4871 x slower)  | 3.15 (2.6051 x slower)
    bm_xhtml2pdf | 1.91/1.95     | 3.27 (1.7155 x slower)  | 4.22 (2.1637 x slower)
  • 17. my pypy benchmarks https://bitbucket.org/hexdump42/pypy-benchmarks
    max memory use
    benchmark    | cpython 2.7.3 | pypy-jit 1.9             | pypy-jit nightly
    bm_interp    | 5412/5248     | 12556 (2.32 x larger)    | 21880 (4.1692 x larger)
    bm_csv2xml   | 7048/7064     | 55180 (7.8292 x larger)  | 55232 (7.8188 x larger)
    bm_csv       | 5812/5180     | 52200 (8.9814 x larger)  | 52176 (10.0726 x larger)
    bm_openpyxml | 12656/12656   | 77252 (6.1040 x larger)  | 80428 (6.3549 x larger)
    bm_xhtml2pdf | 48880/34884   | 236792 (4.8444 x larger) | 101376 (2.906 x larger)
  • 18. what is the pypy jit doing? https://bitbucket.org/pypy/jitviewer/
  • 19. modified csv pypy benchmarks https://bitbucket.org/hexdump42/pypy-benchmarks
    average execution time (in seconds)
    benchmark      | cpython 2.7.3 | pypy-jit 1.9            | pypy-jit nightly
    bm_csv2xml_mod | 88.25/90.02   | 23.65 (3.7315 x faster) | 23.86 (3.7728 x faster)
    bm_csv_mod     | 1.62/1.69     | 1.89 (0.8571 x slower)  | 1.72 (0.9825 x slower)
  • 20. is pypy ready for production 1. it runs 2. it satisfies the project requirements 3. its design was well thought out 4. it's stable 5. it's maintainable 6. it's scalable 7. it's documented 8. it works with the python modules we use 9. it is as fast or faster than cpython
  • 21. some other reasons to consider pypy cffi – foreign function interface for python - http://cffi.readthedocs.org/ pypy version of numpy py3k version of pypy check out the STM/AME project http://www.pypy.org/howtohelp.html
  • 22. contact details Mark Rees mark at censof dot com +Mark Rees @hexdump42 hex-dump.blogspot.com http://www.slideshare.net/hexdump42/pypy-is-it-ready-for-production
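
The csv to xml benchmark on slide 12 relies on a SAXWriter helper that the slide does not show. As a rough, self-contained sketch of the same load, parse and emit loop, the version below swaps in the standard library's `xml.sax.saxutils.XMLGenerator`. This is an assumption for illustration, not the helper used in the talk. The function name `csv_to_xml`, the sample rows and the `home town` field are hypothetical, and the sketch is written for a current Python 3 interpreter rather than the Python 2.7 used in the benchmarks.

```python
import csv
import io
from xml.sax.saxutils import XMLGenerator


def csv_to_xml(csv_file, xml_file, fieldnames):
    """Read CSV rows and write them as <data><row>...</row></data>."""
    reader = csv.DictReader(csv_file, fieldnames)
    writer = XMLGenerator(xml_file, encoding="utf-8")
    writer.startDocument()
    writer.startElement("data", {})
    for row in reader:
        writer.startElement("row", {})
        for key in fieldnames:
            # Spaces are not allowed in XML tag names, so replace them
            # the same way the slide does.
            tag = key.replace(" ", "_")
            writer.startElement(tag, {})
            writer.characters(row[key] or "")  # short rows yield None
            writer.endElement(tag)
        writer.endElement("row")
    writer.endElement("data")
    writer.endDocument()


# Hypothetical two-row sample standing in for the census file.
source = io.StringIO("34,Kuala Lumpur\n51,Johor Bahru\n")
target = io.StringIO()
csv_to_xml(source, target, ["age", "home town"])
xml_text = target.getvalue()
print(xml_text)
```

Timing a function like this on a large file exercises both the csv parser and the xml writer, which is why the talk splits out a csv-only benchmark to see which half dominates.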

Editor's Notes

  1. I have listed a number of resources that I found helpful but this talk is more about using pypy rather than how it works.
  2. The first 7 criteria came from a question on stackexchange, the last 2 are my additional requirements. It is a little more detailed than the management version: it runs, it makes money. You may disagree with the list but it’s the criteria I will be using. Also I will be biased towards the needs of the company I work for. So let’s work through the list to see how pypy stacks up.
  3. It runs great on x86 32bit and 64bit platforms under Linux, Windows and OS X. There are other backend implementations – ARM, PPC, Java & .NET VMs. Some have had more love than others. Pypy implements the Python language version 2.7.2, supporting all of the core language and passing the Python test suite. It supports most of the standard library modules. It has support for the CPython C API but it is beta quality. I will go into more detail about standard library and other module compatibility later in the talk.
  4. I am not a language interpreter designer so I cannot really comment on the design, but you would assume that with the number of years of development & refactoring by the pypy team it is a well thought out design. With regard to maintainability, because much of the pypy toolchain uses RPython and the architecture is complex, I feel it is hard for the normal python programmer to contribute to coding maintenance of pypy. The learning curve is steep, but maintaining the pure-python portions of the pypy components is certainly easier.
  5. As I said before pypy implements python language version 2.7.2
  6. As at pypy 1.9 c-api support is considered beta and while it worked for many of the modules we use, e.g. PIL, it failed with the c extensions for reportlab. This wasn’t a show-stopper as these extensions also have python equivalents in the standard reportlab distribution. Of course, our python library use will be different from yours, so your experience will be different as well.
  7. The above plot represents PyPy trunk (with JIT) benchmark times normalized to CPython as at 12 August 2012. Smaller is better. The standard benchmarks are limited to one domain and in a lot of cases do not cover complete processes or workloads. For example:
  8. The django benchmark is in the standard pypy benchmark suite and was originally part of the unladen swallow benchmarks. This benchmark only tests the template rendering performance of django. There is nothing wrong with this and it’s a standard benchmark technique. So if you see the results of this benchmark, then it’s likely the performance of django template rendering under pypy would be faster than cpython. Does this mean your django website performance would be better? Maybe.
  9. My benchmarks are a little different from the standard pypy benchmarks as they simulate workloads similar to what we use python for at work. So rather than benchmarking a small portion or function as the standard benchmarks do, mine cover either a complete process or the majority of one. So my benchmarks are impacted by io as well as the in-program execution. Since the majority of the non-web use of python in our workplace is extract/transform/load (ETL) tasks, this is what the benchmarks are doing.
  10. To perform the benchmarks, I cloned the pypy benchmark tools and added my benchmarks. You can see these at https://bitbucket.org/hexdump42/pypy-benchmarks. The benchmarks were run on a VMware virtual instance with 2GB RAM and one 64bit core running Scientific Linux 6.2. The base CPython used was 2.7.2 and comparison benchmarks were run against pypy-jit release 1.9 and the nightly pypy-jit build of August 14 2012, collecting average execution time and memory use over 50-iteration benchmark runs. For the bm_csv2xml benchmark, a 100MB csv file of census data is loaded, parsed and output as xml to a file. So it is faster than cpython, things are looking good. But I had hoped it would be a little better. So
  11. I created a benchmark of just the csv load and parse and was surprised to see that it was slower than the cpython equivalent, so in my previous benchmark the xml output was what gave the improved performance under pypy.
  12. .
  13. The bm_interp benchmark just provides a baseline of the memory the interpreter uses prior to any real work. In case these benchmark results were caused by something in my vm configuration, I also reran the benchmarks on physical hardware and obtained similar results. If I had stopped here, you would say that pypy didn’t meet my production criteria, but since some of the components that affect the performance are written in python under pypy, I decided to see why performance wasn’t the same or better than cpython. I decided to start with the low hanging fruit – csv performance.
  14. You can use the pypy jitviewer to see what is happening and of course I can review the source of _csv.py since it’s written in pure python. Thanks to some input in pypy issue tracker https://bugs.pypy.org/issue641
  15. I was able, after a number of attempts, to modify _csv.py so that the bm_csv benchmark performed at the same speed as cpython. This also gave a small performance improvement in the bm_csv2xml benchmark. Based on these improvements, it is very likely we will use pypy in place of cpython for the ETL where we load csv files and convert to xml. I also intend to investigate where the performance bottlenecks are in the other ETL process benchmarks to see if we can get gains similar to what we get with pypy for the bm_csv2xml benchmark.
  16. If we revisit the definition of production ready, and just use items 1-7 as the criteria, pypy is certainly production ready when compared with other python implementations that are being used in production. If you want to run existing python code under pypy, then pypy compatibility with non-standard python libraries needs to be considered, and getting your hands dirty by running the code under pypy is really the best way to see if pypy will work. If nothing else you can report an issue to the pypy team and they can use it to improve compatibility. And will our company be deploying anything in production under pypy? It is likely sometime this year we will look at deploying it for certain ETL workloads due to measured benchmark performance. The additional memory overhead isn’t an issue for us. So my recommendation is that if you are looking for performance improvements, give pypy a go, you may be surprised.
  17. But performance shouldn’t be the only reason to consider pypy, there are various pypy side projects that will have good benefits for the python community as a whole. Last week the pypy team released cffi, a Foreign Function Interface for Python calling C code. The aim of this project is to provide a convenient and reliable way of calling C code from Python. It works with both pypy and cpython 2.6+. The pypy team are working on a pypy implementation of numpy and are close to a py3k language compliant version. If you want to help with pypy, check out the howto help page & the donation page.
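
The measurement approach described in the notes (a couple of warm-up calls, a fixed number of timed iterations, then an average and an "x faster / x slower" factor against cpython) can be sketched roughly as below. This is an illustrative sketch, not the code from the pypy benchmark tools or the author's repository. The names `time_benchmark`, `average` and `speed_ratio` are hypothetical, and the 50-iteration default simply mirrors the run length mentioned in note 10.

```python
import time


def time_benchmark(func, iterations=50, warmup=2):
    """Call func a few times untimed, then return a list of timed runs."""
    for _ in range(warmup):
        func()  # give a JIT the chance to compile the hot loops first
    times = []
    for _ in range(iterations):
        t0 = time.time()
        func()
        times.append(time.time() - t0)
    return times


def average(times):
    return sum(times) / len(times)


def speed_ratio(base_avg, other_avg):
    """Describe other_avg relative to base_avg as (factor, label)."""
    if other_avg <= base_avg:
        return base_avg / other_avg, "faster"
    return other_avg / base_avg, "slower"


# A trivial stand-in workload, only to show the harness running.
runs = time_benchmark(lambda: sum(range(100000)), iterations=5)
print("average: %.6f s over %d runs" % (average(runs), len(runs)))

# Applying the ratio to the rounded averages shown on the slides.
factor, label = speed_ratio(88.26, 28.89)
print("bm_csv2xml on pypy-jit 1.9: %.4f x %s" % (factor, label))
```

The factors recomputed from the rounded slide averages can differ slightly from the ones printed on the slides, which were presumably derived from unrounded timings.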