
Data Visualization Tools in Python

Overview of tools available in Python for data visualization (statistical, geographical, reporting, etc.). Prepared for Minsk DataViz Day (October 4, 2017).

Published in: Data & Analytics

  1. Data visualization tools in Python. Roman Merkulov, Data Scientist at InData Labs. r_merkulov@indatalabs.com, merkylovecom@mail.ru
  2. Content:
     - why dataviz is important
     - dataviz libraries in Python
     - Facets tool
     - interactive maps
     - Apache Superset
  3. Data visualization:
     - EDA & understanding the data
     - fixing data
     - showing insights
     - model validation
     - analytics & reporting
  4. Plots vs descriptive statistics: Anscombe's quartet *https://en.wikipedia.org/wiki/Anscombe%27s_quartet
  5. Plots vs descriptive statistics: Anscombe's quartet *https://en.wikipedia.org/wiki/Anscombe%27s_quartet
     Property                      Value              Accuracy
     Mean of x                     9                  exact
     Sample variance of x          11                 exact
     Mean of y                     7.5                2 decimal places
     Sample variance of y          4.125              ±0.003
     Correlation coefficient       0.816              3 decimal places
     Linear regression             y = 3.00 + 0.5x    2 decimal places
     Coefficient of determination  0.67               2 decimal places
  6. *http://blog.revolutionanalytics.com/2017/05/the-datasaurus-dozen.html
  7. matplotlib *https://matplotlib.org/gallery.html
  8. matplotlib pros:
     - very powerful
     - large community, long history
  9. Doesn’t look simple enough...
  10. matplotlib cons:
     - imperative API
     - poor support for interactivity
     Just to add a popup...
  11. matplotlib-based solutions *https://speakerdeck.com/jakevdp/pythons-visualization-landscape-pycon-2017
  12. matplotlib-based solutions:
     - http://yhat.github.io/ggpy/
     - http://scitools.org.uk/cartopy/docs/latest/gallery.html
     - https://seaborn.pydata.org/examples/index.html
     - https://networkx.github.io/documentation/networkx-1.9.1/examples/drawing/random_geometric_graph.html
  13. JavaScript-based solutions, including folium and bqplot *https://speakerdeck.com/jakevdp/pythons-visualization-landscape-pycon-2017
  14. Plotly *https://plot.ly/python/
     Pros:
     - interactivity
     - lots of visualization types
     - both declarative and imperative capabilities
     Cons:
     - paid features
  15. bokeh
     Pros:
     - interactivity
     - lots of visualization types
     - both declarative and imperative capabilities
     Cons:
     - limited vector graphics export
  16. Datashader: for when you have millions or billions of points. Examples: NYC Taxi, US Census 2010 *https://datashader.readthedocs.io/en/latest/
  17. Altair (based on Vega-Lite): fully declarative paradigm *https://altair-viz.github.io/#
  18. Facets: Overview and Dive. Quick Draw Dataset demo: https://pair-code.github.io/facets/quickdraw.html *https://pair-code.github.io/facets/ https://github.com/PAIR-code/facets
  19. *https://pair-code.github.io/facets/quickdraw.html
  20. https://research.googleblog.com/2017/07/facets-open-source-visualization-tool.html
  21. Folium *https://github.com/python-visualization/folium
  22. https://indatalabs.com/discover-hong-kong-through-the-lense-of-instagram/ https://indatalabs.com/brands-on-london-instagram
     Visualization of the week according to InsideBigData: https://insidebigdata.com/2016/02/03/visualization-of-the-week-hong-kong-social-media-data-map/
  23. Apache Superset *https://superset.incubator.apache.org/
  24. Apache Superset: works with whatever database you use, if a SQLAlchemy dialect is available for it *https://github.com/apache/incubator-superset
  25. Apache Superset
     Who uses it: Airbnb, Amino, Brilliant.org, Clark.de, Digit Game Studios, Douban, Endress+Hauser, FBK - ICT center, Faasos, GfK Data Lab, InData Labs, Maieutical Labs, Qunar, Shopkick, Tails.com, Tobii, Tooploox, Udemy, Yahoo!, Zalando
     Project names over time: Panoramix, Caravel, Superset *https://github.com/apache/incubator-superset
     Article on Superset benefits and limitations: https://indatalabs.com/blog/data-strategy/open-source-data-visualization-tool-superset
     Roaring Elephant podcast, Episode 41: https://roaringelephant.org/2017/04/25/episode-41-news-news-and-some-more-news/
  26. Thanks for your attention! Some of the examples shown are available here: https://github.com/merkylove/data_visualisations_for_datathon_2017
     Contacts: r_merkulov@indatalabs.com, merkylovecom@mail.ru, https://www.linkedin.com/in/roman-merkulov-a61804a4/
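
As a rough check of the Anscombe's quartet figures on slide 5, the sketch below recomputes the summary statistics using only Python's standard library. The slides do not include the raw numbers, so the four datasets are typed in from the commonly published quartet values (an assumption), and the helper name `summarize` is hypothetical, chosen only for this illustration.

```python
from math import sqrt
from statistics import mean, variance

# Commonly published Anscombe's quartet values (assumed; not taken from the slides).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (
        [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
    ),
}


def summarize(x, y):
    """Return the descriptive statistics listed on slide 5 for one dataset."""
    mx, my = mean(x), mean(y)
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx                # least-squares slope
    intercept = my - slope * mx      # least-squares intercept
    corr = sxy / sqrt(sxx * syy)     # Pearson correlation coefficient
    return {
        "mean_x": mx,
        "var_x": variance(x),        # sample variance (n - 1 denominator)
        "mean_y": my,
        "var_y": variance(y),
        "corr": corr,
        "intercept": intercept,
        "slope": slope,
        "r2": corr ** 2,             # coefficient of determination
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in ANSCOMBE.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

Running the script prints one row per dataset. The rows agree to within the accuracy stated on the slide, even though the four scatter plots look very different, which is the argument for plotting the data rather than relying on descriptive statistics alone.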
