Os Vanrossum


Published in: Technology

  1. 1. Python 3000 Phoenix, Amsterdam, Vilnius, The Dalles, Portland (Google, CWI, EuroPython, Google, OSCON) Guido van Rossum [email_address] [email_address]
  2. 2. What Is Python 3000? <ul><li>The next major Python release </li></ul><ul><ul><li>to be released as Python 3.0 </li></ul></ul><ul><li>The first one in a long time to be incompatible </li></ul><ul><ul><li>but not a completely different language </li></ul></ul><ul><li>Concept first formed around 2000 </li></ul><ul><ul><li>Py3k nickname was a play on Windows 2000 </li></ul></ul><ul><li>Goal: to correct my early design mistakes </li></ul><ul><ul><li>those that would require incompatibility to fix </li></ul></ul><ul><ul><li>reduce cognitive load for first-time learners </li></ul></ul><ul><ul><li>fix “regrets” and “warts” </li></ul></ul>
  3. 3. Recent History <ul><li>Work started for real in early 2006 </li></ul><ul><li>PEP submission deadline was April 2007 </li></ul><ul><li>41 PEPs submitted (including meta-PEPs) </li></ul><ul><ul><li>12 open, 10 accepted, 10 final, 8 rejected </li></ul></ul><ul><li>Several branches created in Subversion </li></ul><ul><ul><li>p3yk (sic): main Py3k branch </li></ul></ul><ul><ul><li>py3k-struni: string unification branch </li></ul></ul><ul><ul><li>and some private branches </li></ul></ul><ul><ul><li>regular merges: trunk (2.6) -> p3yk -> py3k-struni </li></ul></ul>
  4. 4. Tentative Release Schedule <ul><li>Interleaving 3.0 and 2.6 releases </li></ul><ul><li>3.0 alpha 1: August 2007 </li></ul><ul><li>2.6 alpha 1: December 2007 </li></ul><ul><li>2.6 final: June 2008 </li></ul><ul><li>3.0 final: August 2008 </li></ul><ul><li>Standard library reorganization work to start in earnest after 3.0a1 release </li></ul>
  5. 5. Compatibility <ul><li>Python 3.0: </li></ul><ul><ul><li>will break backwards compatibility </li></ul></ul><ul><ul><li>not even aiming for a usable common subset </li></ul></ul><ul><li>Python 2.6: </li></ul><ul><ul><li>will be fully backwards compatible with Python 2.5 </li></ul></ul><ul><ul><li>will provide forward compatibility features: </li></ul></ul><ul><ul><ul><li>“Py3k warnings mode” detects runtime problems </li></ul></ul></ul><ul><ul><ul><li>many Py3k features backported </li></ul></ul></ul><ul><ul><ul><ul><li>may need “from __future__ import <feature>” </li></ul></ul></ul></ul>
  6. 6. 2to3: Source Conversion Tool <ul><li>In subversion: sandbox/trunk/2to3 </li></ul><ul><li>Context-free source-to-source translation </li></ul><ul><li>Can do things like </li></ul><ul><ul><ul><li>`x` -> repr(x) </li></ul></ul></ul><ul><ul><ul><li>x <> y -> x != y </li></ul></ul></ul><ul><ul><ul><li>apply(f, args, kwds) -> f(*args, **kwds) </li></ul></ul></ul><ul><ul><ul><li>d.iterkeys() -> iter(d.keys()) </li></ul></ul></ul><ul><ul><ul><li>d.keys() -> list(d.keys()) </li></ul></ul></ul><ul><ul><ul><li>xrange() -> range() </li></ul></ul></ul><ul><ul><ul><li>range() -> list(range()) </li></ul></ul></ul><ul><ul><ul><li>except E, err: -> except E as err: </li></ul></ul></ul><ul><li>Can’t do dataflow analysis or type inference </li></ul>
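The right-hand sides of the translations on this slide are ordinary Python 3. A minimal sketch of a few of them running; the names `f`, `args`, `kwds` and `d` are made up for illustration and do not come from the talk:

```python
# Hypothetical names, for illustration only.
def f(a, b, scale=1):
    return (a + b) * scale

args = (1, 2)
kwds = {"scale": 10}
d = {"x": 1, "y": 2}

r = repr("hi")                 # replaces the backtick form `x`
called = f(*args, **kwds)      # replaces apply(f, args, kwds)
keys = list(d.keys())          # d.keys() no longer returns a list
numbers = list(range(3))       # range() no longer returns a list

try:
    int("not a number")
except ValueError as err:      # replaces "except ValueError, err:"
    message = str(err)
```

Because the tool cannot infer types, it wraps every `.keys()` and `range()` call in `list()` even where the caller only iterates.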
  7. 7. How to Support 2.6 and 3.0 <ul><li>0. Start with excellent unit tests & full coverage </li></ul><ul><li>1. Port project to 2.6 </li></ul><ul><li>2. Test with Py3k warnings mode turned on </li></ul><ul><li>3. Fix all warnings </li></ul><ul><li>4. Use 2to3 to convert to 3.0 syntax </li></ul><ul><ul><ul><li>no hand-editing of output allowed!! </li></ul></ul></ul><ul><li>5. Test converted source code under 3.0 </li></ul><ul><li>6. To fix problems, edit 2.6 source and go to (2) </li></ul><ul><li>7. Release separate 2.6 and 3.0 tarballs </li></ul>
  8. 8. Do’s and Don'ts <ul><li>Do start using “modern” features now: </li></ul><ul><ul><li>e.g. new-style classes, sorted(), xrange(), int//int, relative import, new exception hierarchy </li></ul></ul><ul><ul><li>segregate Unicode processing </li></ul></ul><ul><li>Don't try to write source-level compatible code </li></ul><ul><ul><li>intersection of 2.6 and 3.0 is large but incomplete </li></ul></ul><ul><li>Don’t go in one step from 2.5 (or older) to 3.0 </li></ul><ul><ul><li>always plan to transition via 2.6 </li></ul></ul><ul><ul><li>users can easily upgrade to 2.6 </li></ul></ul>
  9. 9. What’s New in Python 3.0 Status of Individual Features
  10. 10. Unicode Source Code <ul><li>Default source encoding is UTF-8 </li></ul><ul><ul><li>was (7-bit) ASCII; this is fully forward compatible </li></ul></ul><ul><li>Unicode letters allowed for identifiers </li></ul><ul><li>Open issues: </li></ul><ul><ul><li>normalization </li></ul></ul><ul><ul><li>which alphabets are supported </li></ul></ul><ul><ul><li>support for right-to-left scripts </li></ul></ul><ul><li>Standard library remains ASCII only! </li></ul><ul><ul><li>except for author names and a few unit tests </li></ul></ul>
  11. 11. Unicode Strings <ul><li>Java-like model: </li></ul><ul><ul><li>strings (the str type) are always Unicode </li></ul></ul><ul><ul><li>separate bytes type </li></ul></ul><ul><ul><li>must explicitly specify encoding to go between these </li></ul></ul><ul><li>Implementation is ~same as 2.x unicode </li></ul><ul><li>Dropping u”…” prefix to string literals </li></ul><ul><li>Codecs changes: </li></ul><ul><ul><li>.encode() always goes from str -> bytes </li></ul></ul><ul><ul><li>.decode() always goes from bytes -> str </li></ul></ul><ul><ul><li>base64, rot13, bz2 “codecs” dropped </li></ul></ul>
  12. 12. Bytes Type <ul><li>Mutable sequence of small ints (0…255) </li></ul><ul><ul><li>b[0] is an int; b[:1] is a new bytes object </li></ul></ul><ul><li>Implemented efficiently as unsigned char[] </li></ul><ul><li>Has some list-like methods, e.g. .extend() </li></ul><ul><li>Has some string-like methods, e.g. .find() </li></ul><ul><ul><li>But none that depend on locale </li></ul></ul><ul><li>bytes literals: b&quot;ascii or \xDD or \012&quot; </li></ul><ul><li>bytes and str don’t mix: </li></ul><ul><ul><li>must use .encode() or .decode() </li></ul></ul>
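A minimal sketch of the str/bytes split described on the two slides above. One caveat: in released Python 3 the mutable type is spelled `bytearray` and `bytes` itself is immutable, so the list-like methods are shown on `bytearray` here. The sample text is made up.

```python
text = "caf\u00e9"                   # str is always Unicode
data = text.encode("utf-8")          # str -> bytes, encoding given explicitly
round_trip = data.decode("utf-8")    # bytes -> str

first = data[0]                      # indexing yields an int
head = data[:1]                      # slicing yields a new bytes object

buf = bytearray(b"abc")              # mutable byte sequence
buf.extend(b"de")                    # list-like method
buf[0] = ord("A")                    # item assignment takes a small int
pos = buf.find(b"cd")                # string-like method

try:
    mixed = data + text              # bytes and str don't mix
except TypeError:
    mixed = None
```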
  13. 13. New I/O Library <ul><li>Stackable components (inspired by Java, Perl) </li></ul><ul><ul><li>Lowest level: unbuffered bytes I/O </li></ul></ul><ul><ul><ul><li>platform-specific; doesn't use C stdio </li></ul></ul></ul><ul><ul><li>Middle layer: buffering </li></ul></ul><ul><ul><li>Top layer: unicode encoding/decoding </li></ul></ul><ul><ul><ul><li>encoding explicitly specified or system default </li></ul></ul></ul><ul><ul><ul><li>optionally handles CRLF/LF mapping </li></ul></ul></ul><ul><li>Compatible API </li></ul><ul><ul><li>open(filename) returns a buffered text file </li></ul></ul><ul><ul><ul><li>read() and readline() return strings </li></ul></ul></ul><ul><ul><li>open(filename, &quot;rb&quot;) returns a buffered binary file </li></ul></ul><ul><ul><ul><li>read() returns bytes; readline() too (!?) </li></ul></ul></ul>
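The stacking can be sketched with the `io` module as it shipped. This is an illustration under assumptions: `io.BytesIO` stands in for a real unbuffered file so the example needs no file on disk, and the sample data is invented.

```python
import io

# In-memory stand-in for the lowest (bytes) layer.
raw = io.BytesIO("line one\r\nline two\r\n".encode("utf-8"))

# Middle layer: buffering.
buffered = io.BufferedReader(raw)

# Top layer: decoding plus optional CRLF/LF mapping (newline=None
# turns on universal-newlines translation when reading).
text = io.TextIOWrapper(buffered, encoding="utf-8", newline=None)

lines = text.readlines()
```

Calling `open(filename)` assembles a comparable stack automatically, and `open(filename, "rb")` stops at the buffered layer.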
  14. 14. Print is a Function <ul><li>print x, y -> print(x, y) </li></ul><ul><li>print x, -> print(x, end=&quot; &quot;) </li></ul><ul><li>print >>f, x -> print(x, file=f) </li></ul><ul><li>Automatic translation is 98% correct </li></ul><ul><li>Fails for cases involving softspace cleverness: </li></ul><ul><ul><li>print &quot;x\n&quot;, &quot;y&quot; doesn't insert a space before y </li></ul></ul><ul><ul><li>print(&quot;x\n&quot;, &quot;y&quot;) does </li></ul></ul><ul><ul><li>ditto for print &quot;x\t&quot;, &quot;y&quot; </li></ul></ul>
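A small sketch of the function form, writing into an `io.StringIO` buffer so the result can be inspected. Passing `sep=""` is one way to suppress the separator in the cases the automatic translation gets wrong; that choice is an illustration, not something the slide prescribes.

```python
import io

out = io.StringIO()

print("x", "y", file=out)              # was: print >>out, "x", "y"
print("x", end=" ", file=out)          # was: print >>out, "x",
print("y", file=out)
print("x\n", "y", sep="", file=out)    # no space inserted before "y"

result = out.getvalue()
```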
  15. 15. String Formatting <ul><li>Examples (see PEP 3101 for more): </li></ul><ul><ul><li>&quot;See {0}, {1} and {foo}&quot;.format(&quot;A&quot;, &quot;B&quot;, foo=&quot;C&quot;) </li></ul></ul><ul><ul><ul><li>&quot;See A, B and C&quot; </li></ul></ul></ul><ul><ul><li>&quot;my name is {0} :-{{}}&quot;.format(&quot;Fred&quot;) </li></ul></ul><ul><ul><ul><li>&quot;my name is Fred :-{}&quot; </li></ul></ul></ul><ul><ul><li>&quot;File name {0.name}&quot;.format(open(&quot;foo.txt&quot;)) </li></ul></ul><ul><ul><ul><li>&quot;File name foo.txt&quot; </li></ul></ul></ul><ul><ul><li>&quot;Name is {0[name]}&quot;.format({&quot;name&quot;: &quot;Fred&quot;}) </li></ul></ul><ul><ul><ul><li>&quot;Name is Fred&quot; </li></ul></ul></ul><ul><ul><li>&quot;Shoe size {0:8}&quot;.format(42) </li></ul></ul><ul><ul><ul><li>&quot;Shoe size       42&quot; </li></ul></ul></ul>
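The same kinds of lookup can be tried directly. In this sketch the `Shoe` class is hypothetical and replaces the open file so that no file has to exist:

```python
class Shoe:                          # hypothetical, for attribute lookup
    def __init__(self, brand):
        self.brand = brand

a = "See {0}, {1} and {foo}".format("A", "B", foo="C")
b = "my name is {0} :-{{}}".format("Fred")        # {{ and }} are escapes
c = "Brand {0.brand}".format(Shoe("Acme"))        # attribute lookup
d = "Name is {0[name]}".format({"name": "Fred"})  # item lookup
e = "Shoe size {0:8}".format(42)                  # width 8, numbers right-aligned
```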
  16. 16. Classic Classes are Dead <ul><li>In 2.2 … 2.9: </li></ul><ul><ul><li>class C: # classic class (0.1 … 2.1) </li></ul></ul><ul><ul><li>class C(object): # new-style class (old now :-) </li></ul></ul><ul><li>In 3.0: </li></ul><ul><ul><li>both are new-style classes (just say &quot;classes&quot;) </li></ul></ul><ul><li>Differences are subtle, few of you will notice </li></ul><ul><ul><li>new-style classes don’t support “magic” methods (e.g. __hash__) stored in instance __dict__ </li></ul></ul>
  17. 17. Class Decorators <ul><li>@some.decorator class C: … </li></ul><ul><li>Same semantics as function decorators: </li></ul><ul><li>class C: … C = some.decorator(C) </li></ul>
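A minimal sketch of the equivalence, using a hypothetical `register` decorator that records class names:

```python
registry = []

def register(cls):
    """Hypothetical decorator: record the class name, return the class unchanged."""
    registry.append(cls.__name__)
    return cls

@register
class C:
    pass

# The same thing spelled out by hand:
class D:
    pass
D = register(D)
```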
  18. 18. Signature Annotations <ul><li>NOT type declarations! “Meaning” is up to you </li></ul><ul><li>Example: </li></ul><ul><ul><li>def foo(x: &quot;whatever&quot;, y: list(range(3))) -> 42*2: … </li></ul></ul><ul><li>Argument syntax is (roughly): </li></ul><ul><ul><li>NAME [':' expr] ['=' expr] </li></ul></ul><ul><li>Both expressions are evaluated at 'def' time </li></ul><ul><ul><li>foo.func_annotations is: </li></ul></ul><ul><ul><ul><li>{'x': &quot;whatever&quot;, 'y': [0, 1, 2], &quot;return&quot;: 84} </li></ul></ul></ul><ul><ul><li>NO use is made of annotations by the language </li></ul></ul><ul><ul><ul><li>library may use them, e.g. generic functions </li></ul></ul></ul>
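A runnable sketch of annotations as arbitrary expressions. Note that released Python 3 exposes the mapping as `__annotations__` rather than the `func_annotations` name used at the time of the talk; the function body here is invented.

```python
def foo(x: "whatever", y: list(range(3)) = None) -> 42 * 2:
    # Annotations are plain expressions; the language attaches no meaning to them.
    return x, y

annotations = foo.__annotations__
```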
  19. 19. New Metaclass Syntax <ul><li>class C(B1, B2, metaclass=MC): … </li></ul><ul><li>Other keywords passed to MC.__new__ </li></ul><ul><li>__metaclass__ dropped (in module too) </li></ul><ul><li>Class heading has full function call syntax: </li></ul><ul><ul><li>bases = (B1, B2) class C(*bases): … </li></ul></ul><ul><li>Metaclass can provide __prepare__() </li></ul><ul><ul><li>returns the namespace dict for class body execution </li></ul></ul><ul><ul><li>use case: ordered dict to define e.g. db schema </li></ul></ul>
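A sketch of the pieces on this slide working together. `Ordered`, `Table` and the `db` keyword are hypothetical names chosen for the example:

```python
from collections import OrderedDict

class Ordered(type):
    """Hypothetical metaclass that remembers attribute definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return OrderedDict()         # namespace used while the class body runs

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        cls.fields = [k for k in namespace if not k.startswith("__")]
        cls.options = kwds           # extra keywords from the class heading
        return cls

    def __init__(cls, name, bases, namespace, **kwds):
        super().__init__(name, bases, namespace)   # drop the extra keywords

bases = (object,)

class Table(*bases, metaclass=Ordered, db="main"):
    id = 1
    name = 2
    email = 3
```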
  20. 20. issubclass(), isinstance() <ul><li>Overloadable on the second argument (a class) </li></ul><ul><li>isinstance(x, C) tries C.__instancecheck__(x) </li></ul><ul><li>issubclass(D, C) tries C.__subclasscheck__(D) </li></ul><ul><li>If not overloaded, behavior is unchanged </li></ul><ul><li>Used for “virtual inheritance” from ABCs </li></ul>
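A minimal sketch of overloading both checks. The hooks are looked up on the type of the second argument, so they are defined on a metaclass here; `DuckMeta`, `Duck` and `Robot` are hypothetical names.

```python
class DuckMeta(type):
    def __instancecheck__(cls, obj):
        return hasattr(obj, "quack")

    def __subclasscheck__(cls, sub):
        return hasattr(sub, "quack")

class Duck(metaclass=DuckMeta):
    pass

class Robot:                         # does not inherit from Duck
    def quack(self):
        return "beep"

is_duck = isinstance(Robot(), Duck)
is_duck_class = issubclass(Robot, Duck)
int_is_duck = isinstance(3, Duck)
```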
  21. 21. Abstract Base Classes: abc.py <ul><li>Voluntary base classes for standard APIs </li></ul><ul><ul><li>e.g. Iterable, MutableMapping, Real, RawIOBase </li></ul></ul><ul><li>Usable as a mix-in class (like DictMixin) </li></ul><ul><ul><li>provides abstract methods you must override </li></ul></ul><ul><ul><li>@abstractmethod decorates an abstract method </li></ul></ul><ul><ul><ul><li>class with abstract methods left can’t be instantiated </li></ul></ul></ul><ul><ul><ul><li>requires metaclass=ABCMeta </li></ul></ul></ul><ul><ul><ul><li>from abc import ABCMeta, abstractmethod </li></ul></ul></ul><ul><li>Alternatively, can register virtual subclasses </li></ul><ul><ul><li>after A.register(C), issubclass(C, A) is true </li></ul></ul><ul><ul><li>however, C isn’t modified </li></ul></ul><ul><ul><ul><li>IOW C must already implement A’s abstract methods </li></ul></ul></ul>
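Both routes, inheriting and registering, can be sketched as follows. `Shape`, `Square` and `Circle` are hypothetical classes used only for illustration:

```python
from abc import ABCMeta, abstractmethod

class Shape(metaclass=ABCMeta):
    @abstractmethod
    def area(self):
        ...

class Square(Shape):                 # real subclass: must override area()
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

class Circle:                        # unrelated class that already has area()
    def area(self):
        return 3.0

Shape.register(Circle)               # virtual subclass; Circle is not modified

try:
    Shape()                          # abstract methods left: cannot instantiate
    instantiated = True
except TypeError:
    instantiated = False
```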
  22. 22. Standard ABCs <ul><li>“One-trick ponies” (collections.py): </li></ul><ul><ul><ul><li>Hashable, Iterable, Iterator, Sized, Container, Callable </li></ul></ul></ul><ul><ul><ul><li>these check for presence of magic method </li></ul></ul></ul><ul><ul><ul><ul><li>e.g. isinstance(x, Callable) ~ what used to be callable(x) </li></ul></ul></ul></ul><ul><li>Containers (collections.py): </li></ul><ul><ul><ul><li>Set, Mapping, Sequence; Mutable<ditto> </li></ul></ul></ul><ul><li>I/O classes (io.py): </li></ul><ul><ul><ul><li>IOBase, RawIOBase, BufferedIOBase, TextIOBase </li></ul></ul></ul><ul><li>Numbers (numbers.py?): </li></ul><ul><ul><ul><li>Number, Complex, Real, Rational, Integer </li></ul></ul></ul>
  23. 23. Exception Reform <ul><li>&quot;raise E(arg)&quot; replaces &quot;raise E, arg&quot; </li></ul><ul><li>&quot;except E as v:&quot; replaces &quot;except E, v:&quot; </li></ul><ul><ul><li>v is deleted at end of except block!!! </li></ul></ul><ul><li>All exceptions must derive from BaseException </li></ul><ul><ul><li>better still, Exception; StandardError removed </li></ul></ul><ul><li>New standard exception attributes: </li></ul><ul><ul><li>__traceback__: instead of sys.exc_info()[2] </li></ul></ul><ul><ul><li>__cause__: set by raise E from v </li></ul></ul><ul><ul><li>__context__: set when raising in except/finally block </li></ul></ul><ul><li>Exceptions aren’t sequences; use v.args </li></ul>
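A short sketch of the new raise/except forms and the exception attributes. `ConfigError` and `load` are hypothetical names:

```python
class ConfigError(Exception):
    pass

def load(raw):
    try:
        return int(raw)
    except ValueError as err:
        raise ConfigError("bad value", raw) from err   # sets __cause__

try:
    load("abc")
except ConfigError as exc:
    caught = exc                     # keep a reference: "exc" is deleted afterwards

try:
    exc                              # the name no longer exists here
    name_gone = False
except NameError:
    name_gone = True

cause = caught.__cause__
args = caught.args                   # exceptions aren't sequences; use .args
has_traceback = caught.__traceback__ is not None
```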
  24. 24. Int/Long Unification <ul><li>A single built-in integer type </li></ul><ul><li>Name is int </li></ul><ul><li>Behaves like old long </li></ul><ul><li>‘L’ suffix is dropped </li></ul><ul><li>C API is mostly compatible </li></ul><ul><li>Performance may still need some boosting </li></ul>
  25. 25. Int Division Returns a Float <ul><li>1/2 == 0.5 </li></ul><ul><li>Always! </li></ul><ul><li>Same effect in 2.x with </li></ul><ul><ul><li>from __future__ import division </li></ul></ul><ul><li>Use // for int division </li></ul><ul><li>Use -Q option to Python 2.x to find old usage </li></ul><ul><li>Has been supported since 2.2! </li></ul>
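For illustration, the two operators side by side under Python 3 semantics:

```python
half = 1 / 2          # true division: always a float
exact = 4 / 2         # still a float, even when the result is whole
floor = 1 // 2        # floor division on ints stays an int
negative = -7 // 2    # floors toward negative infinity
```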
  26. 26. Octal and Binary Literals <ul><li>0o777 instead of 0777 </li></ul><ul><li>Avoid accidental mistakes in data entry </li></ul><ul><li>Avoid confusing younger generation :-) </li></ul><ul><li>0b1010 is a binary number </li></ul><ul><li>bin(10) returns '0b1010' </li></ul>
  27. 27. Itera{tors,bles}, not Lists <ul><li>range() behaves like old xrange() </li></ul><ul><li>zip(), map(), filter() return iterators </li></ul><ul><li>dict.keys(), .items(), .values() return views, not lists </li></ul>
  28. 28. Dictionary Views <ul><li>Inspired by Java Collections Framework </li></ul><ul><li>Remove .iterkeys(), .iteritems(), .itervalues() </li></ul><ul><li>Change .keys(), .items(), .values() </li></ul><ul><li>These return a dict view </li></ul><ul><ul><li>Not an iterator </li></ul></ul><ul><ul><li>A lightweight object that can be iterated repeatedly </li></ul></ul><ul><ul><li>.keys(), .items() have set semantics </li></ul></ul><ul><ul><li>.values() has &quot;collection&quot; semantics </li></ul></ul><ul><ul><ul><li>supports iter(), len(), and not much else </li></ul></ul></ul>
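A small sketch of how views differ from both lists and iterators; the dictionary contents are made up:

```python
d = {"a": 1, "b": 2}
keys = d.keys()

first_pass = list(keys)
second_pass = list(keys)             # a view can be iterated repeatedly

d["c"] = 3                           # views reflect later changes to the dict
live = list(keys)

common = keys & {"b", "c", "z"}      # .keys() has set semantics

values = d.values()                  # "collection" semantics: iter() and len()
count = len(values)
```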
  29. 29. Default Comparison Changed <ul><li>Default ==, != compare object identity </li></ul><ul><ul><li>unchanged from 2.x </li></ul></ul><ul><ul><li>many types override this </li></ul></ul><ul><li>New: default <, <=, >, >= raise TypeError </li></ul><ul><li>Example: [1, 2, &quot;&quot;].sort() raises TypeError </li></ul><ul><li>Rationale: 2.x default ordering is bogus </li></ul><ul><ul><li>depends on type names </li></ul></ul><ul><ul><li>depends on addresses </li></ul></ul>
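A minimal sketch of both defaults; `Point` is a hypothetical class that defines no comparison methods:

```python
class Point:
    pass

p, q = Point(), Point()
same = (p == p)                      # default equality is identity
different = (p == q)

try:
    [1, 2, ""].sort()                # mixed int/str ordering
    sort_ok = True
except TypeError:
    sort_ok = False

try:
    p < q                            # no default ordering between objects
    ordered = True
except TypeError:
    ordered = False
```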
  30. 30. Nonlocal Statement <ul><li>Paul Graham’s challenge * finally met (sort of): </li></ul><ul><ul><li>def new_accumulator(n): def accumulator(i): nonlocal n n += i return n return accumulator </li></ul></ul><ul><li>* Revenge of the Nerds, “Appendix: Power” </li></ul>
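The accumulator from the slide, laid out with its indentation restored and followed by a short usage example (the sample values are made up):

```python
def new_accumulator(n):
    def accumulator(i):
        nonlocal n                   # rebind n in the enclosing function's scope
        n += i
        return n
    return accumulator

acc = new_accumulator(10)
first = acc(5)
second = acc(2.5)
```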
  31. 31. New super() Call <ul><li>PEP 3135 (was 367) </li></ul><ul><ul><li>note: PEP is behind; actual design chosen is different </li></ul></ul><ul><li>Instead of super(ThisClass, self), use super() </li></ul><ul><ul><li>old style calls still supported </li></ul></ul><ul><li>When called without args, digs out of frame: </li></ul><ul><ul><li>__class__ cell, the class defining the method </li></ul></ul><ul><ul><ul><li>based on static, textual inclusion </li></ul></ul></ul><ul><ul><ul><li>cell is filled in after metaclass created the class </li></ul></ul></ul><ul><ul><ul><ul><li>but before class decorators run </li></ul></ul></ul></ul><ul><ul><li>first argument (self; or cls for class methods) </li></ul></ul>
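A sketch of the zero-argument form in an instance method and a class method, next to the old spelling. The class names are hypothetical:

```python
class Base:
    def greet(self):
        return "base"

    @classmethod
    def make(cls):
        return cls()

class Child(Base):
    def greet(self):
        return "child+" + super().greet()      # uses __class__ cell and self

    @classmethod
    def make(cls):
        obj = super().make()                   # first argument is cls here
        obj.made = True
        return obj

class OldStyleCall(Base):
    def greet(self):
        return "old+" + super(OldStyleCall, self).greet()   # still supported
```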
  32. 32. Set Literals <ul><li>{1, 2, 3} is the same as set([1, 2, 3]) </li></ul><ul><li>No empty set literal; use set() </li></ul><ul><li>No frozenset literal; use frozenset({…}) </li></ul><ul><li>Set comprehensions: </li></ul><ul><ul><li>{f(x) for x in S if P(x)} </li></ul></ul><ul><ul><ul><li>same as set(f(x) for x in S if P(x)) </li></ul></ul></ul>
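For illustration, the literal and comprehension forms with made-up values:

```python
s = {1, 2, 3}
empty = set()                        # {} is still an empty dict
not_a_set = {}
frozen = frozenset({1, 2})

squares = {x * x for x in range(-3, 4) if x % 2}       # odd x only
same = set(x * x for x in range(-3, 4) if x % 2)
```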
  33. 33. And More – Much More! <ul><li>it.next() -> it.__next__() </li></ul><ul><li>f.func_code -> f.__code__ </li></ul><ul><li>reduce() is dead (I can’t read code using it) </li></ul><ul><li><> is dead; use != </li></ul><ul><li>`…` is dead; use repr(…) </li></ul><ul><li>raw_input() -> input() </li></ul><ul><li>Read PEP 3100 for many more </li></ul><ul><li>PS. lambda lives! </li></ul>
  34. 34. C API Changes <ul><li>Too early to tell what will happen exactly </li></ul><ul><ul><li>Still working on this… </li></ul></ul><ul><li>Will have to recompile at the very least </li></ul><ul><li>Biggest problem expected: Unicode, bytes </li></ul><ul><li>For now, these simple rules: </li></ul><ul><ul><li>Adding APIs is okay (of course) </li></ul></ul><ul><ul><li>Deleting APIs is okay </li></ul></ul><ul><ul><li>Changing APIs incompatibly is NOT OKAY </li></ul></ul>
  35. 35. Questions PS: Read my blog at artima.com
  36. 36. Why reduce() is an Attractive Nuisance <ul><li>~90% of reduce() calls found in practice can be rewritten using sum() </li></ul><ul><li>half the rest are concatenating sequences, i.e. O(N**2) running time </li></ul><ul><li>First example found by Google Code Search (google.com/codesearch): </li></ul><ul><ul><li>quotechar = reduce(lambda a, b: (quotes[a] > quotes[b]) and a or b, quotes.keys()) </li></ul></ul><ul><li>I find the following rewrite much more readable: </li></ul><ul><ul><li>quotechar = quotes.keys()[0] for a in quotes.keys(): if quotes[a] > quotes[quotechar]: quotechar = a </li></ul></ul><ul><li>Another: reduce(lambda a, b: a+'|'+b, value) Rewrite as: '|'.join(value) </li></ul><ul><li>Another: reduce(lambda x, y: x or y.ambiguous and True, parents, False) Rewrite as: any(y.ambiguous for y in parents) </li></ul><ul><li>All-time worst, from Python Cookbook (unreadable and O(N**2) running time): </li></ul><ul><ul><li>def wrap(text, width): return reduce(lambda line, word, width=width: '%s%s%s' % (line, ' \n'[(len(line) - line.rfind('\n') - 1 + len(word.split('\n', 1)[0]) >= width)], word), text.split(' ')) </li></ul></ul>
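A sketch comparing a few reduce() calls with their rewrites. In Python 3, `reduce` lives in `functools`. The sample data is made up, and using `max()` with a key function is an assumption of this sketch rather than the rewrite shown on the slide; on ties it keeps the first key, whereas the reduce() version keeps the last.

```python
from functools import reduce
import operator

numbers = [1, 2, 3, 4]
total_reduce = reduce(operator.add, numbers)
total_sum = sum(numbers)                       # the common case

value = ["a", "b", "c"]
joined_reduce = reduce(lambda a, b: a + "|" + b, value)
joined = "|".join(value)                       # linear time, and readable

quotes = {"'": 2, '"': 5, "`": 1}              # hypothetical quote counts
picked_reduce = reduce(lambda a, b: a if quotes[a] > quotes[b] else b, quotes)
quotechar = max(quotes, key=quotes.get)        # key with the highest count
```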