Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS - FOSS4G 2017 Workshop

Slides from the 2017 FOSS4G Workshop "Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS"

See the repository at https://github.com/lossyrob/foss4g-2017-geopyspark-workshop

  1. Rob Emanuele @lossyrob. ANALYZING LARGE RASTER DATA IN A JUPYTER NOTEBOOK WITH GEOPYSPARK ON AWS
  2. Connect to the WIFI Network: Harvard University, http://getonline.harvard.edu. Click “I am a guest”. Credentials: U: foss4g2017@gmail.com P: 7RFQU3rm. FIRST: Find your Jupyter Notebook URL at https://git.io/v77lh (lowercase L) and visit the URL next to your name. Log in to the Jupyter Hub: U: hadoop P: hadoop
  3. OUTLINE: 8:00 - 8:30 Intro and Background; 8:30 - 9:10 Section 1: Land Cover data; 9:10 - 10:00 Section 2: Landsat 8 data; 10:00 - 10:10 BREAK; 10:10 - 10:30 Deployment and Ingestion; 10:30 - 11:10 Section 3: Combining data layers; 11:10 - 12:00 Section 4: Making Cool Maps
  4. NOW: A MOTIVATING EXAMPLE
  5. BY
  6. rdd.map(lambda x: x + 1) Source: http://silverpond.com.au/2016/10/06/balancing-spark.ht
  7. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2)
  8. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2) Node 1 Node 2 Node 3
  9. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2) Node 1 Node 2 Node 3
  10. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2) Node 1 Node 2 Node 3
  11. (1, 1) (2, 1) (0, 1) Node 1 Node 2 Node 3
  12. (1, 1) (2, 1) (0, 1) Node 1 Node 2 Node 3 rdd.bufferTiles(…)
  13. Interactive and Batch Processing of large raster data. Web-Speed Processing of small to medium sized raster data.
  14. GeoTrellis Ecosystem. Raster Foundry by Spark SQL and Spark ML support. Raster Frames by Spark SQL and Spark ML support. GeoPySpark: Python bindings. Vector Pipes: Vector Tiles on Spark. PDAL integration: Point Clouds on Spark.
  15. GeoPySpark
  16. GeoPySpark: Started December 2016. Follows PySpark’s model of communication between the Java Virtual Machine and Python. Access GeoTrellis functionality through Python, and integrate with your favorite Python raster tools (numpy + friends). 0.2 is released!
  17. EXERCISE 1: ANALYZING LAND COVER DATA
  18. EXERCISE 2: WORKING WITH LANDSAT IMAGERY AND NDVI THROUGH TIME
  19. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … SpaceTimeKey ≈ (col, row, instant)
  20. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … lambda lambda lambda (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) …
  21. … (SpatialKey, [(DateTime, Tile) (DateTime, Tile)]) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, [(DateTime, Tile)]) …
  22. … (SpatialKey, [(DateTime, Tile) (DateTime, Tile)]) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, [(DateTime, Tile)]) (Shuffle) …
  23. (SpatialKey, [(DateTime, Tile) (DateTime, Tile)]) (SpatialKey, [(DateTime, Tile)]) … mosaic (SpatialKey, Tile) (SpatialKey, Tile) … mosaic
  24. BREAK!
  25. WHERE AND HOW ARE THESE NOTEBOOKS RUNNING?
  26. WHERE’S THIS DATA COMING FROM?
  27. Supported Backends
  28. EXERCISE 3: COMBINING LAND COVER AND NDVI TO DETECT CROP CYCLES
  29. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) …
  30. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … map_to_spatial (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) … map_to_spatial map_to_spatial STK = SpaceTimeKey
  31. (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) … (SpatialKey, Tile) (SpatialKey, Tile) … ndwi_rdd nlcd_layer.to_numpy_rdd() (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) …
  32. (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) … (SpatialKey, Tile) (SpatialKey, Tile) … ndwi_rdd nlcd_layer.to_numpy_rdd() (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) … (Shuffle)
  33. mask_ndwi mask_ndwi mask_ndwi (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) …
  34. EXERCISE 4: COMBINING IMAGERY, ELEVATION AND LAND COVER DATA TO MAKE A COOL LOOKING MAP
  35. EXERCISE 4: COMBINING IMAGERY, ELEVATION AND LAND COVER DATA TO MAKE A COOL LOOKING MAP. TWEET YOUR SWEET MAP SCREENSHOTS WITH #GEOPYSPARK #FOSS4G!
  36. FINAL QUESTIONS?
  37. Thank you!
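
Slides 6 to 12 contrast a cell-by-cell `rdd.map` with a buffering step that needs cells from neighbouring tiles, which may sit on other nodes. The sketch below imitates that idea in plain Python, without Spark or GeoPySpark. Every name in it (`make_layer`, `node_for`, `buffer_tile`, `remote_neighbours`), along with the tile size, the fill value and the by-column placement of tiles on nodes, is an assumption made for illustration and not the workshop's or GeoTrellis's actual implementation.

```python
from typing import Dict, List, Tuple

Key = Tuple[int, int]    # (col, row) of a tile in the layout, as on the slides
Tile = List[List[int]]   # a tiny 2D grid of cell values

NODATA = -1       # assumed fill value for cells that fall outside the layer
TILE_SIZE = 2     # assumed tile width and height, in cells
NUM_NODES = 3     # the slides show three nodes


def make_layer(cols: int, rows: int) -> Dict[Key, Tile]:
    """Toy layer: each cell stores 100 * global_row + global_col."""
    return {
        (col, row): [
            [100 * (row * TILE_SIZE + r) + (col * TILE_SIZE + c)
             for c in range(TILE_SIZE)]
            for r in range(TILE_SIZE)
        ]
        for col in range(cols)
        for row in range(rows)
    }


def node_for(key: Key) -> int:
    """Hypothetical partitioner: place tiles on nodes by column."""
    return key[0] % NUM_NODES


def buffer_tile(layer: Dict[Key, Tile], key: Key, border: int = 1) -> Tile:
    """Pad the tile at `key` with `border` cells taken from neighbouring tiles."""
    col, row = key
    padded = []
    for r in range(-border, TILE_SIZE + border):
        line = []
        for c in range(-border, TILE_SIZE + border):
            # Convert the padded-tile offset to a global cell position,
            # then find the tile that owns that cell.
            global_row = row * TILE_SIZE + r
            global_col = col * TILE_SIZE + c
            owner = (global_col // TILE_SIZE, global_row // TILE_SIZE)
            tile = layer.get(owner)
            if tile is None:
                line.append(NODATA)
            else:
                line.append(tile[global_row % TILE_SIZE][global_col % TILE_SIZE])
        padded.append(line)
    return padded


def remote_neighbours(layer: Dict[Key, Tile], key: Key) -> List[Key]:
    """Neighbouring tile keys that live on a different node than `key`."""
    col, row = key
    return [
        (col + dc, row + dr)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dc, dr) != (0, 0)
        and (col + dc, row + dr) in layer
        and node_for((col + dc, row + dr)) != node_for(key)
    ]


layer = make_layer(3, 3)                        # the 3 x 3 grid of keys from slide 7
buffered_centre = buffer_tile(layer, (1, 1))    # 4 x 4 tile with a one-cell border
buffered_corner = buffer_tile(layer, (0, 0))    # border outside the layer is NODATA
remote = remote_neighbours(layer, (1, 1))       # tiles that must cross the network
```

Under this toy placement, buffering the centre tile needs cells from tiles held by other nodes, which is why a buffering step costs more than a plain `map` that touches each tile in isolation.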

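Slides 19 to 23 walk through re-keying `(SpaceTimeKey, Tile)` records by their spatial part, grouping them (the shuffle), and mosaicking each group into a single tile. The following is a minimal plain-Python sketch of that flow, with lists standing in for RDDs. The helper names (`to_spatial`, `group_by_key`, `mosaic`) are hypothetical, and the mosaic rule used here (newest value wins, older tiles fill gaps) is an assumption, since the slides do not state which rule the workshop used.

```python
from collections import defaultdict
from datetime import datetime

NODATA = None  # assumed marker for cells without a value

# (SpaceTimeKey, Tile) records, with SpaceTimeKey modelled as (col, row, instant).
records = [
    ((0, 0, datetime(2017, 6, 1)), [[1, NODATA], [NODATA, 4]]),
    ((0, 0, datetime(2017, 7, 1)), [[NODATA, 20], [NODATA, 40]]),
    ((1, 0, datetime(2017, 6, 1)), [[5, 6], [7, 8]]),
]


def to_spatial(record):
    """(SpaceTimeKey, Tile) -> (SpatialKey, (DateTime, Tile)); the 'lambda' step."""
    (col, row, instant), tile = record
    return ((col, row), (instant, tile))


def group_by_key(pairs):
    """Collect all values that share a key; in Spark this is where the shuffle happens."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def mosaic(dated_tiles):
    """Merge tiles newest-first, keeping the first non-NODATA value per cell."""
    ordered = sorted(dated_tiles, key=lambda dated: dated[0], reverse=True)
    rows = len(ordered[0][1])
    cols = len(ordered[0][1][0])
    result = [[NODATA] * cols for _ in range(rows)]
    for _instant, tile in ordered:
        for r in range(rows):
            for c in range(cols):
                if result[r][c] is NODATA and tile[r][c] is not NODATA:
                    result[r][c] = tile[r][c]
    return result


spatial_pairs = [to_spatial(record) for record in records]
grouped = group_by_key(spatial_pairs)
mosaicked = {key: mosaic(dated_tiles) for key, dated_tiles in grouped.items()}
```

The result holds one tile per spatial key, matching the `(SpatialKey, Tile)` records at the end of slide 23.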
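
Slides 29 to 33 describe joining a space-time index layer with a purely spatial land cover layer and then masking the index tiles. A rough plain-Python sketch of those three steps follows, again with lists in place of RDDs. The names `map_to_spatial`, `mask_ndwi`, `ndwi_rdd` and `nlcd` come from the slides, but the function bodies, the `join` helper, the sample values and the choice of class code to keep are all assumptions made for illustration.

```python
from collections import defaultdict

NODATA = None
KEEP_CLASSES = {82}  # assumed land cover class code to keep; illustrative only


def map_to_spatial(record):
    """(SpaceTimeKey, Tile) -> (SpatialKey, (SpaceTimeKey, Tile))."""
    space_time_key, tile = record
    col, row, _instant = space_time_key
    return ((col, row), (space_time_key, tile))


def join(left, right):
    """Inner join of two lists of (key, value) pairs, in the spirit of a key-based RDD join."""
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    return [
        (key, (left_value, right_value))
        for key, left_value in left
        for right_value in right_by_key.get(key, [])
    ]


def mask_ndwi(joined_record):
    """(SpatialKey, ((STK, Tile), Tile)) -> (SpaceTimeKey, Tile) with unwanted cells blanked."""
    _spatial_key, ((space_time_key, index_tile), land_cover_tile) = joined_record
    masked = [
        [value if cls in KEEP_CLASSES else NODATA
         for value, cls in zip(index_row, class_row)]
        for index_row, class_row in zip(index_tile, land_cover_tile)
    ]
    return (space_time_key, masked)


# Toy inputs: two dates for tile (0, 0), plus one tile with no land cover match.
ndwi_rdd = [
    ((0, 0, "2017-06-01"), [[0.1, 0.2], [0.3, 0.4]]),
    ((0, 0, "2017-07-01"), [[0.5, 0.6], [0.7, 0.8]]),
    ((5, 5, "2017-06-01"), [[0.9, 0.9], [0.9, 0.9]]),
]
nlcd = [((0, 0), [[82, 41], [82, 11]])]

joined = join([map_to_spatial(record) for record in ndwi_rdd], nlcd)
masked_layer = [mask_ndwi(record) for record in joined]
```

Because the join is an inner join, the tile at `(5, 5)` drops out: it has no land cover partner. The surviving records are keyed by `SpaceTimeKey` again, as on slide 33.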