The Weather of the Century: Visualization
A. Jesse Jiryu Davis 
Senior Python Engineer, MongoDB 
@jessejiryudavis
MongoDB natively supports geospatial indexing and querying, and it integrates easily with open source visualization tools. In this webinar, learn high-performance techniques for querying and retrieving geospatial data, and how to create a rich visual representation of global weather data using Python, Monary, and Matplotlib.

Published in: Data & Analytics, Technology

Weather of the Century: Visualization

  1. 1. The Weather of the Century: Visualization A. Jesse Jiryu Davis Senior Python Engineer, MongoDB @jessejiryudavis
  2. 2. Serious MongoDB Talk
  3. 3. Serious MongoDB Talk Database
  4. 4. Serious MongoDB Talk
  5. 5. This Talk
  6. 6. Where’s the data from?
  7. 7. Where’s the data from?
  8. 8. How Much Is There?
  9. 9. Visualization
  10. 10. Visualization Pipeline: MongoDB → PyMongo → Python dicts → NumPy → SciPy → Matplotlib
  11. 11. GeoJSON:
          {
            ts: ISODate("1991-01-01T00:00:00Z"),
            position: {
              type: "Point",
              coordinates: [ -94.6, 39.117 ]
            },
            airTemperature: { value: 45, quality: "1" }
          }
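  12. 12. import numpy
          import pymongo
          data = []
          db = pymongo.MongoClient().my_database
          # `query` is a MongoDB filter document defined elsewhere.
          for doc in db.collection.find(query):
              # GeoJSON coordinates are [longitude, latitude].
              data.append((
                  doc['position']['coordinates'][0],
                  doc['position']['coordinates'][1],
                  doc['airTemperature']['value']))
          # One row per document: longitude, latitude, temperature.
          arrays = numpy.array(data)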
  13. 13. # NumPy column access syntax.
          lons = arrays[:, 0]
          lats = arrays[:, 1]
          temps = arrays[:, 2]
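  14. 14. import numpy
          from scipy.interpolate import griddata
          from matplotlib import pyplot
          xs = numpy.linspace(-180, 180, 361)
          ys = numpy.linspace(-90, 90, 181)
          # griddata takes (points, values, xi); xs and ys broadcast to a 181 x 361 grid.
          zs = griddata((lons, lats), temps,
                        (xs[None, :], ys[:, None]), method='linear')  # Magic!!
          pyplot.contour(xs, ys, zs)  # Also magic!!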
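  15. 15. import numpy
          from scipy.interpolate import griddata
          from matplotlib import pyplot
          xs = numpy.linspace(-180, 180, 361)
          ys = numpy.linspace(-90, 90, 181)
          # griddata takes (points, values, xi); xs and ys broadcast to a 181 x 361 grid.
          zs = griddata((lons, lats), temps,
                        (xs[None, :], ys[:, None]), method='linear')
          pyplot.contour(xs, ys, zs)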
  16. 16. Triangulation
  17. 17. Triangulation
  18. 18. Triangulation What temperature?
  19. 19. Barycentric Interpolation. What temperature? 53, 48, 54. Weighted average: 51.1
  20. 20. Interpolation 51.1
  21. 21. Interpolation
  22. 22. Interpolation
  23. 23. Contours
  24. 24. Contours
  25. 25. Not terrifically fast:
          import numpy
          import pymongo
          data = []
          db = pymongo.MongoClient().my_database
          # Every document is decoded into a Python dict before reaching NumPy.
          for doc in db.collection.find(query):
              data.append((
                  doc['position']['coordinates'][0],
                  doc['position']['coordinates'][1],
                  doc['airTemperature']['value']))
          arrays = numpy.array(data)
  26. 26. Analyzing large datasets
          • Querying: 109k documents per second
          • (On localhost)
          • Can we go faster?
          • Enter “Monary”
  27. 27. Monary, by David Beach
          • Before: MongoDB → PyMongo → Python dicts → NumPy → Matplotlib
          • After: MongoDB → Monary → NumPy → Matplotlib
  28. 28. import monary
          monary_connection = monary.Monary()
          # One array comes back per requested field, in the order listed.
          arrays = monary_connection.query(
              db='my_database',
              coll='collection',
              query=query,
              fields=[
                  'position.coordinates.0',
                  'position.coordinates.1',
                  'airTemperature.value'],
              types=[
                  'float32',
                  'float32',
                  'float32'])
  29. 29. Monary
          • PyMongo: 109k documents per second
          • Monary: 817k documents per second
  30. 30. Visualization
  31. 31. Monary
          • Author: David Beach
          • Interns: Kyle Suarez, Matt Cotter
          • Mentors: A. Jesse Jiryu Davis, Jason Carey
  32. 32. Monary: recent features
          • Easy installation
          • Nested field access
          • Aggregation
          • Python 3
  33. 33. Monary: future
          • Insert, update, remove
          • SSL and authentication mechanisms
          • parallelCollectionScan
  34. 34. Thanks
          • Monary
          • NumPy
          • SciPy
          • Matplotlib
  35. 35. Thanks
  36. 36. Thank you A. Jesse Jiryu Davis Senior Python Engineer, MongoDB #MongoDBWorld
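
Slides 18 and 19 ask what the temperature is at a point inside a triangle of stations and answer with a weighted average. A minimal sketch of that idea in plain Python follows. The station positions are hypothetical (the transcript gives only the three temperatures), so it does not try to reproduce the 51.1 on the slide; the function names `barycentric_weights` and `interpolate` are illustrative as well.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of point p in triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    wc = 1.0 - wa - wb  # the three weights always sum to 1
    return wa, wb, wc


def interpolate(p, triangle, values):
    """Weighted average of the vertex values, weighted barycentrically."""
    weights = barycentric_weights(p, *triangle)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical (x, y) station positions; temperatures taken from slide 19.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
temps = [53.0, 48.0, 54.0]

at_vertex = interpolate((0.0, 0.0), triangle, temps)       # first station itself
at_edge_mid = interpolate((2.0, 0.0), triangle, temps)     # halfway between first two
at_centroid = interpolate((4 / 3, 4 / 3), triangle, temps)  # equal weights
```

At a vertex the result is that station's own reading (53.0), halfway along the first edge it is the mean of the two endpoints (50.5), and at the centroid all three stations count equally. Points closer to a station weight that station more heavily, which is why the slide's answer need not be the plain mean.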

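
The interpolation step on slides 14 and 15 can be tried without a database. The sketch below assumes NumPy and SciPy are installed and swaps the real weather data for hypothetical random stations with a synthetic temperature field that is linear in position, chosen only so the result is easy to check. `scipy.interpolate.griddata` with `method='linear'` triangulates the input points and interpolates linearly within each triangle, which is the procedure slides 16 through 22 illustrate.

```python
import numpy
from scipy.interpolate import griddata


def synthetic_temperature(lon, lat):
    """Hypothetical temperature field, linear in position (not real data)."""
    return 20.0 + 0.1 * lon + 0.5 * lat


# Hypothetical station locations scattered over the globe.
rng = numpy.random.default_rng(0)
lons = rng.uniform(-180, 180, 500)
lats = rng.uniform(-90, 90, 500)
temps = synthetic_temperature(lons, lats)

# One grid point per degree, as on the slides.
xs = numpy.linspace(-180, 180, 361)
ys = numpy.linspace(-90, 90, 181)
grid_x, grid_y = numpy.meshgrid(xs, ys)  # both have shape (181, 361)

# Arguments are (points, values, xi); cells outside the convex hull
# of the stations come back as NaN.
zs = griddata((lons, lats), temps, (grid_x, grid_y), method='linear')

inside = ~numpy.isnan(zs)
expected = synthetic_temperature(grid_x, grid_y)
max_error = numpy.abs(zs[inside] - expected[inside]).max()
```

Because the synthetic field is linear, linear interpolation recovers it up to rounding wherever the grid lies inside the stations' convex hull. With Matplotlib available, `pyplot.contour(xs, ys, zs)` followed by `pyplot.show()` should draw the contours from this `zs`, since its shape is `(len(ys), len(xs))`.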