Four Python Pains
               Stefane Fermigier
Scripting Languages Workshop, IRILL, May 2011
Part 1: Some context
Wrong
(on several points)
Python
    (“Scripting Language”)
• Has built-in syntactical support for arbitrarily
  complex data structures (lists, dictionaries)
• Supports significantly complex applications:
  ERP5, Plone, Nuxeo CPS (~320 KLOC),
  OpenERP (~150 KLOC), Komodo (~250
  KLOC)
• Is compiled (to bytecode)
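The first and last bullets can be checked from the interpreter. This is a minimal sketch, assuming CPython 3; the data and the names `source`, `code`, `opnames` and `namespace` are illustrative and do not come from the talk. It compiles a nested literal to a code object, lists the bytecode instruction names, then executes the compiled code.

```python
import dis

# A nested structure written with literal syntax only (illustrative data).
source = "catalog = {'langs': ['Python', 'Scala'], 'sizes': {'small': 1}}"

# compile() turns source text into a code object holding bytecode.
code = compile(source, "<slide-example>", "exec")

# Collect the names of the bytecode instructions the compiler produced.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

# Run the compiled code in a fresh namespace.
namespace = {}
exec(code, namespace)
```

Calling `dis.dis(code)` prints the full listing; the exact opcodes vary between CPython versions.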
Scala
    (“Sys. Prog. Language”)


• Is interpreted (it has a REPL)
• Is a high-level language - in other words, has a
  high CPU instructions / statement ratio
  (comparable or higher than Python)
Other counterexamples

• Java, pre-generics (Java SE < 5, i.e. at the time
  of Ousterhout’s paper), had really weak
  support for static typing
• Clojure (which many consider to be in the same
  league as Scala) is not even statically typed
At which expense?
• Execution speed?
• Safety?
• Maintainability?
• What else?
Or maybe Python is a
scripting language after all?
Uses of Python
• Education
• Small to medium-sized sysadmin and
  software engineering projects
• Medium to large web frameworks (Zope,
  Django...) and applications (YouTube...)
• Scientific computing (numpy, scipy...)
• Scripting of games, desktop apps, etc.
Recap: 4 pain points

• Speed
• Scalability
• Type-safety
• Adherence (tight coupling) to external code
  bases, and to one implementation (CPython)
Part 2: Discussion
Speed

• Python (seems to be) about 10x slower than
  Clojure and V8 (JavaScript)
• Both Clojure and JavaScript are dynamically
  typed languages
• TODO: look at Clojure and V8 and see
  how they do it
Scalability
•   CPython multi-threading / multi-core
    performance is constrained by “the GIL” (global
    interpreter lock)

•   It doesn’t have to be there (Jython doesn’t have
    one) but removing it from Python is
    controversial

•   Python has support for some parallel programming
    constructs (see: http://wiki.python.org/moin/
    ParallelProcessing) but not the fancy new ones
    (a la Scala, Clojure, Go, etc.)
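The GIL constraint can be observed with the standard library alone. This is a rough sketch, assuming a standard CPython build with the GIL; `burn_cpu` and `timed_run` are hypothetical helper names, and timings depend on the machine. The same CPU-bound function is mapped over a pool of workers, and the pool class is passed in so threads and processes can be compared.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def burn_cpu(n: int) -> int:
    """Pure-Python CPU-bound loop; it holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(executor_cls, tasks: int = 4, n: int = 200_000):
    """Run burn_cpu `tasks` times on the given pool class.

    Returns the list of results and the elapsed wall-clock seconds.
    """
    start = time.perf_counter()
    with executor_cls(max_workers=tasks) as pool:
        results = list(pool.map(burn_cpu, [n] * tasks))
    return results, time.perf_counter() - start
```

With `ThreadPoolExecutor`, the elapsed time is expected to stay close to the sequential time, since only one thread executes Python bytecode at a time. Passing `ProcessPoolExecutor` instead (from a script, under an `if __name__ == "__main__":` guard) sidesteps the GIL by using separate interpreter processes, at the cost of pickling arguments and results.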
Type Safety
•   Static typing helps create safer programs,
    and also eases refactoring (a necessary
    condition for agility)

•   Since Python 3, annotations are possible on
    function parameters and return values (and,
    since Python 3.6, on variables)

•   Some tools (e.g. JetBrains’ PyCharm) have
    support for type inference on Python, and
    enable: code completion, error detection,
    refactoring...
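A short sketch of what such annotations look like, assuming Python 3.9 or later for the `list[float]` syntax; the function `scale` is a made-up example, not code from the talk. The interpreter stores annotations but does not enforce them, which is why the benefit comes from external tools.

```python
def scale(values: list[float], factor: float) -> list[float]:
    """Multiply every value by factor."""
    return [value * factor for value in values]


# Annotations are plain metadata, available for tools to inspect.
hints = scale.__annotations__

# A correctly typed call.
doubled = scale([1.0, 2.0], 2.0)

# CPython does not check annotations at run time: this call still executes,
# although a static checker is expected to flag the string argument.
unchecked = scale(["ab"], 2)
```

Here `unchecked` ends up as `["abab"]`: the mistake surfaces only as wrong data, which is the kind of error a static analysis tool can catch before the program runs.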
Adherence to CPython
•   There are at least 4 major Python
    implementations: CPython, Jython, IronPython
    and PyPy
•   But only CPython is mainstream
•   Why? Because the extensions written in
    “system programming languages” for CPython
    cannot be used with the others
•   And we’re talking about hundreds of packages
Solution?
•   There are also tools, languages and
    extensions that ease the task of writing a
    CPython extension: SWIG, ctypes, Pyrex,
    Cython...
•   By encapsulating part of the complexity and
    low-level details of creating a Python
    extension, they could make extensions easier
    to write, more robust, and easier to share
    between language implementations
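As an illustration of the higher-level route, here is a minimal ctypes sketch, assuming a POSIX system where the C library can be loaded; `c_strlen` is a hypothetical wrapper name. It calls the C function `strlen` without writing any C extension code against the CPython API.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. On POSIX, a None path makes
# CDLL fall back to the symbols already loaded in the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Return the length of a byte string as computed by C's strlen."""
    return libc.strlen(text)
```

Because the binding is described in Python rather than compiled against interpreter internals, the same file has a chance of running on implementations other than CPython, provided they ship a ctypes module.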
Conclusion
• Python could maybe be made 10x faster by
  applying ideas from Clojure and V8

• Python could be made more scalable by
  removing the GIL (which probably requires
  rewriting the interpreter, though) and adding
  modern parallel programming paradigms
• Python programs could be made more
  robust with the help of static type analysis
  tools + a little help from the developers
  (i.e. adding a few type annotations to their
  programs and libraries)

• Python is a scripting language after all (but
  not only), and could benefit from a way of
  writing extensions that is higher level than
  the common one (but then: how to convert
  the existing code base?)
