PyData: Past, Present, Future
Peter Wang
@pwang
Continuum Analytics
PyData SV 2014

PyData: Past, Present, Future (PyData SV 2014 Keynote)

From the closing keynote: a look back at the last two years of PyData, a discussion of Python's role in the growing and changing data analytics landscape, and encouragement of ways to grow the community.

Published in: Data & Analytics, Technology

Transcript of "PyData: Past, Present Future (PyData SV 2014 Keynote)"

  1. PyData: Past, Present, Future. Peter Wang (@pwang), Continuum Analytics, PyData SV 2014
  2. How did we get here?
  3. “Python Data Workshop” March 3, 2012, Google HQ
  4. “Guido, please help us convince core dev to work with us to solve the packaging problem!”
  5. “Guido, please help us convince core dev to work with us to solve the packaging problem!” “Meh. Feel free to solve it yourselves.”
  6. “Guido, please help us convince core dev to work with us to solve the packaging problem!” “Meh. Feel free to solve it yourselves.”
  7. “What Packaging Problem?”
  8. “What Packaging Problem?” “I just use….”
  9. “What Packaging Problem?” “I just use….” • pip & virtualenv
  10. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew
  11. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm
  12. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get
  13. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get • emerge
  14. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get • emerge • tar -zxf
  15. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get • emerge • tar -zxf • double-click MSI
  16. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get • emerge • tar -zxf • double-click MSI • configure ; make ; make install
  17. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get • emerge • tar -zxf • double-click MSI • configure ; make ; make install • export PYTHONPATH=…
  18. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get • emerge • tar -zxf • double-click MSI • configure ; make ; make install • export PYTHONPATH=…
  19. “What Packaging Problem?” “I just use….” • pip & virtualenv • homebrew • rpm • apt-get • emerge • tar -zxf • double-click MSI • configure ; make ; make install • export PYTHONPATH=… from python import technical_debt
  20. This Packaging Problem
  21. This Packaging Problem
  22. This Packaging Problem
  23. This Packaging Problem
  24. This Packaging Problem
  25. PyData: The First 2 Years • Oct 2012: First PyData Conf, NYC • March 2013: PyData SV (PyCon) • July 2013: PyData Boston (Microsoft) • Oct 2013: PyData NYC (JP Morgan) • Feb 2014: PyData UK (Level39) • May 2014: PyData SV (Facebook) • July 2014: PyData Berlin (EuroPython) • October 2014: NYC (Strata NYC) • October 2014: NYC (YOUR COMPANY HERE)
  26. PyData: The First 10 Years
  27. PyData: The First 10 Years • IPython Notebook: 2005-2011 • pandas: 2008-2009 • scikit-learn: 2007 • NumPy: 2006
  28. PyData: The First 15 Years • IPython Notebook: 2005-2011 • pandas: 2008-2009 • scikit-learn: 2007 • NumPy: 2006 • SciPy: 1999 • IPython: 2001 • matplotlib: 2002
  29. PyData: The First 15 Years • IPython Notebook: 2005-2011 • pandas: 2008-2009 • scikit-learn: 2007 • NumPy: 2006 • SciPy: 1999 • IPython: 2001 • matplotlib: 2002 http://numfocus.org/johnhunter.html
  30. PyData: The First 20 Years • Numarray: 2001 • Numeric: 1995 • Matrix Obj: 1994 • IPython Notebook: 2005-2011 • pandas: 2008-2009 • scikit-learn: 2007 • NumPy: 2006 • IPython: 2001 • matplotlib: 2002
  31. Way Way Back
  32. Way Way Back • python: 1989-1991
  33. Way Way Back • python: 1989-1991 • v1.0: 1994
  34. Way Way Back • python: 1989-1991 • v1.0: 1994 • “ABC, SETL…
  35. Way Way Back • python: 1989-1991 • v1.0: 1994 • “ABC, SETL… …That would appeal to UNIX/C hackers”
  36. Way Way Back • python: 1989-1991 • v1.0: 1994 • “ABC, SETL… …That would appeal to UNIX/C hackers” $ conda create -n py10 python=1.0
  37. Way Way Back • python: 1989-1991 • v1.0: 1994 • “ABC, SETL… …That would appeal to UNIX/C hackers” http://continuum.io/blog/python-1.0 $ conda create -n py10 python=1.0
  38. Way Way Back It is interactive, structured, high-level, and intended to be used instead of BASIC, Pascal, or AWK. It is not meant to be a systems-programming language but is intended for teaching or prototyping.
  39. “In June [1960] we were introduced to this tall college kid that always signed his name with lowercase letters. He was don knuth … don claimed that he could write the [Algol] compiler and a language manual all by himself during his three and a half month summer vacation.”
  40. PyData NYC 2013 Keynote
  41. PyData NYC 2013 Keynote
  42. PyData NYC 2013 Keynote
  43. http://tuulos.github.io/sf-python-meetup-sep-2013/#/ “One of the most exciting features in development is the Numba-based UDF compiler. Building UDFs for Impala currently requires writing C++ or Java code and registering them manually with the cluster. Writing C++/Java code is more difficult, time-consuming, and error-prone for many data analysts.” http://blog.cloudera.com/blog/2014/04/a-new-python-client-for-impala/
  44. http://grokbase.com/t/python/python-list/01az9hmtf1/python-development-practices
  45. http://grokbase.com/t/python/python-list/01az9hmtf1/python-development-practices
  46. Glue 2.0 Python’s legacy as a powerful glue language • manipulate files • call fast libraries Next-gen Glue: • Link data silos • Link disjoint memory & compute • Unify disparate runtime models • Transcend legacy models of computers
  47. Hard Problems in Data Science Lots of data Messy data Noisy data
  48. Hard Problems in Data Science Lots of data Messy data Noisy data Lots of computers Lots of tools Lots of hacking
  49. Hard Problems in Data Science Lots of data Messy data Noisy data Lots of computers Lots of tools Lots of hacking More questions More data More people
  50. The Hype & The Opportunity “Internet Revolution” True Believer, 1996: Businesses that build network capability into their core will outcompete and destroy their competition.
  51. The Hype & The Opportunity “Internet Revolution” True Believer, 1996: Businesses that build network capability into their core will outcompete and destroy their competition. “Data Revolution” True Believer, 2014: Businesses that build data comprehension into their core will destroy their competition over the next 5-15 years.
  52. The Hype & The Opportunity “Internet Revolution” True Believer, 1996: Businesses that build network capability into their core will outcompete and destroy their competition. “Data Revolution” True Believer, 2014: Businesses that build data comprehension into their core will destroy their competition over the next 5-15 years. (1993 == 2011?)
  53. Soft Problems in Data Science
  54. Soft Problems in Data Science Computers EE
  55. Soft Problems in Data Science Computers EE Applications CS
  56. Soft Problems in Data Science Computers EE Applications CS DATA Insights Math, Stats
  57. Computers Applications Data Insights
  58. Computers Applications Data Insights
  59. Computers DATA Applications Data Scientist
  60. 2013 Data Science Salary Survey! http://www.oreilly.com/data/free/stratasurvey.csp
  61. “Python is the second best language…” ...Because it blurs the lines between “user” and “maker”. We stand on the shoulders of Users who became Makers. Some people say: “R has a very strong user community.” I want people to say that “Python has a strong maker community.”
  62. Standing Tall
  63. Standing Tall • Science: Standing on the shoulders of giants
  64. Standing Tall • Science: Standing on the shoulders of giants • Programming: Standing on each other's toes
  65. Standing Tall • Science: Standing on the shoulders of giants • Programming: Standing on each other's toes • But in Python, we stand on each other's shoulders: a community that bootstraps itself
  66. “For there is but one veritable problem - the problem of human relations…” (Antoine de Saint-Exupéry)
  67. https://archive.org/details/Scipy2010-PeterWang-PythonEvangelism101
  68. Participate • Submit issues and pull requests • Represent for the tools you love in social media conversations • Start PyData meetups • Come to PyData conferences and present • Encourage diversity!!
  69. How did we get here? • Hard Work • By a community of people • Who cared • About code and people
  70. Where do we go from here? • More hard work • More community • More caring • More code • More people Python is not just glue. Python and PyData are communities!
  71. Where do we go from here? • More hard work • More community • More caring • More code • More people Python is not just glue. Python and PyData are communities!