Intro to Python Data Analysis in Wakari

Outlines the vision and philosophy for Wakari.io with a basic overview of popular python data analysis packages. Most of the talk is conducted in Wakari and is not visible on these slides. 90 minutes for PyData NYC, November 8th 2013.

Published in: Technology, Education
Notes
  • I do web programming
  • How many of you use Python on a daily basis for data analysis? In the past year, raise your hand if you’ve worked primarily in Python.
  • Domain-specific libraries:
    • Statsmodels => statistical computing
    • Scikit-image => image manipulation
    • OpenCV => image processing with an interface that can accept NumPy arrays
    • PyTables => HDF5 integration
    • Numexpr => lets you write expressions on your data that are evaluated in a cache-aware way; it’s very efficient
    There are more packages in the Python scientific stack than just these. But it’s good to know NumPy so you can get down and dirty with your data and manipulate it if need be.
  • PACKAGES! Occasional programmers can jump on
  • PACKAGES!
  • THIS SHOULD NEVER HAPPEN. At Continuum Analytics, we never want these words to be uttered again.
  • Python in 60 seconds: NumPy, SciPy, pandas, Matplotlib, scikit-learn
  • Homogeneous
  • We’re going to pull it all together in Wakari.
  • And this is why sharing in Wakari is so important.
  • Transcript

    • 1. Intro to Python Data Analysis in Wakari Karissa McKelvey Software Developer Continuum Analytics @karissamck November 8, 2013 PyData NYC
    • 2. $ WHOAMI karissamck.com @karissamck
    • 3. truthy.indiana.edu
    • 4. More Tweets, More Votes
    • 5. MY GOALS: Get you excited about data analysis in Wakari; walk through some basic analysis packages and Wakari workflows; kick-start your journey
    • 6. WHO ARE YOU?
    • 7. Putting Science back in Comp Sci • Much of the software stack is for systems programming --- C++, Java, .NET, ObjC, web - Complex numbers? - Vectorized primitives? • Software stack for scientists is not as helpful as it should be • Fortran is still where many scientists end up
    • 8. Why Python?
    • 9. High Performance with BIG DATA
    • 10. Packages for data analysis and visualization
    • 11. Syntax – Gets out of your way
    • 12. Community Driven
    • 13. Ready for web applications, too.
    • 14. • “Python is good for data cleanup, R for statistical models” “Which is the better Data Analysis language? R or Python?” Quora. http://www.quora.com/Data-Analysis/Which-is-the-better-Data-analysis-language-R-or-Python
    • 15. • “Python is good for data cleanup, R for statistical models” • “R is quirky and weird but the statisticians love it and there really isn’t any compelling reason to switch” “Which is the better Data Analysis language? R or Python?” Quora. http://www.quora.com/Data-Analysis/Which-is-the-better-Data-analysis-language-R-or-Python
    • 16. • “Python is good for data cleanup, R for statistical models” • “R is quirky and weird but the statisticians love it and there really isn’t any compelling reason to switch” • “You’re running an MCMC simulation on a laptop? Perhaps you should write it in C++/FORTRAN” “Which is the better Data Analysis language? R or Python?” Quora. http://www.quora.com/Data-Analysis/Which-is-the-better-Data-analysis-language-R-or-Python
    • 17. “You’re running an MCMC simulation on a laptop? Perhaps you should write it in C++/FORTRAN” Ready for DATA, and then some
    • 18. Numba: just-in-time compiler to LLVM through @decorators numba.pydata.org
    • 19. Numba: just-in-time compiler to LLVM through @decorators* numba.pydata.org *aka, fast. easy.
    • 20. Basic packages for data analysis and visualization
    • 21. NumPy: The foundation of the Python Data Analysis stack
    • 22. NumPy: Array-oriented
    • 23. Pandas: Builds upon NumPy
    • 24. Matplotlib: 2D plotting library
    • 25. IPython: Interactive Python (+ in the Web): tab completion, magic %-commands, inline plots
    • 26. Anaconda: pulls it all together
    • 27. wakari.io Browser-based Python & Linux environment
    • 28. IPython Notebook, Scientific Packages, Terminal. Share files, IPython notebooks, and plots with pay-as-you-go compute
    • 29. Sharing in Wakari • Packages IPython notebooks, files, folders, data, and environment • Get a link • Share that link.
    • 30. Reproducible Research
    • 31. “A rule of thumb among biotechnology venture capitalists is that half of published research cannot be replicated”
    • 32. How do we replicate research today?
    • 33. collaborate on How do we replicate research today?
    • 34. collaborate on How do we replicate research today? data analysis
    • 35. How do we collaborate today?
    • 36. How do we collaborate today?
    • 37. How do we collaborate today?
    • 38. How do we collaborate today?
    • 39. ????????
    • 40. How do we replicate research today?
    • 41. wakari.io Browser-based Python & Linux environment
    • 42. Enterprise or Cloud Online at wakari.io or install locally for access to your hardware and data
    • 43. wakari.io Browser-based Python & Linux environment
    • 44. Coming Soon
    • 45. Project-based interaction user Projects starting at $10/month with unlimited team members
    • 46. Interactive Plotting Next-generation collaborative data manipulation, analysis, and presentation
    • 47. Talks to see • Jake VanderPlas (Washington) – Efficient computing with NumPy • 29th Floor combo 3pm (Right now, next door!) • Julia Evans (N/A) – A practical introduction to IPython Notebook & pandas • Here, 4:45pm.
    • 48. Talks to see • Sarah Guido (Michigan) – A Beginner’s Guide to Machine Learning with scikit-learn • Imran Haque (Counsyl) – Beyond the dict • Peter Wang (Continuum) – Bokeh Workshop
    • 49. Special Thanks Ben Zaitlin Mark Florisson Clayton Davis Bryan Van de Ven Travis Oliphant
    • 50. Karissa McKelvey @karissamck
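
Slides 21 to 23 describe NumPy as the array-oriented foundation of the stack and pandas as a layer built on top of it. Most of the talk was conducted live in Wakari and is not captured in these slides, so the following is only a minimal sketch of that idea, not the code shown in the session. The city labels and temperature values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Array-oriented NumPy: operate on the whole array at once, no Python loop.
temps_c = np.array([12.5, 15.0, 9.8, 21.3])  # made-up sample values
temps_f = temps_c * 9.0 / 5.0 + 32.0         # vectorized arithmetic

# Homogeneous: every element of an ndarray shares a single dtype.
print(temps_c.dtype)

# pandas builds upon NumPy: a numeric DataFrame column is backed by an ndarray.
df = pd.DataFrame({"city": ["A", "B", "C", "D"], "temp_c": temps_c})
df["temp_f"] = df["temp_c"] * 9.0 / 5.0 + 32.0  # same vectorized expression
backing = df["temp_c"].to_numpy()               # get the underlying NumPy array

# Boolean masks work the same way in both libraries.
warm = df[df["temp_c"] > 12.0]
print(warm)
```

The same arithmetic expression works on the raw array and on the DataFrame column, which is what makes it easy to drop down to NumPy when a higher-level package does not do what you need.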
