Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS - FOSS4G 2017 Workshop
Slides from the 2017 FOSS4G Workshop "Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS"

See the repository at https://github.com/lossyrob/foss4g-2017-geopyspark-workshop

  1. Rob Emanuele @lossyrob ANALYZING LARGE RASTER DATA IN A JUPYTER NOTEBOOK WITH GEOPYSPARK ON AWS
  2. Connect to the WIFI Network: Harvard University http://getonline.harvard.edu Click “I am a guest” Credentials: U: foss4g2017@gmail.com P: 7RFQU3rm FIRST: Find your Jupyter Notebook URL https://git.io/v77lh (lowercase L) visit the URL next to your name Log in to the Jupyter Hub U: hadoop P: hadoop
  3. OUTLINE 8:00 - 8:30 Intro and Background 8:30 - 9:10 Section 1: Land Cover data 9:10 - 10:00 Section 2: Landsat 8 data 10:00 - 10:10 BREAK 10:10 - 10:30 Deployment and Ingestion 10:30 - 11:10 Section 3: Combining data layers 11:10 - 12:00 Section 4: Making Cool Maps
  4. NOW: A MOTIVATING EXAMPLE
  5. BY
  6. rdd.map(lambda x: x + 1) Source: http://silverpond.com.au/2016/10/06/balancing-spark.ht
  7. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2)
  8. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2) Node 1 Node 2 Node 3
  9. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2) Node 1 Node 2 Node 3
  10. (1, 1) (2, 1) (0, 1) (0, 0) (1, 0) (2, 0) (1, 2) (2, 2) (0, 2) Node 1 Node 2 Node 3
  11. (1, 1) (2, 1) (0, 1) Node 1 Node 2 Node 3
  12. (1, 1) (2, 1) (0, 1) Node 1 Node 2 Node 3 rdd.bufferTiles(…)
  13. + + Interactive and Batch Processing of large raster data Web-Speed Processing of small to medium sized raster data
  14. GeoTrellis Ecosystem Raster Foundry by Spark SQL and Spark ML support Raster Frames by Spark SQL and Spark ML support GeoPySpark Python bindings Vector Pipes Vector Tiles on Spark PDAL integration Point Clouds on Spark
  15. GeoPySpark
  16. Started December 2016 Follows PySpark’s model of communication between the Java Virtual Machine and Python Access GeoTrellis functionality through Python, and integrates with your favorite Python raster tools (numpy + friends). 0.2 is released! GeoPySpark
  17. EXERCISE 1: ANALYZING LAND COVER DATA
  18. EXERCISE 2: WORKING WITH LANDSAT IMAGERY AND NDVI THROUGH TIME
  19. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … SpaceTimeKey ≈ (col, row, instant)
  20. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … lambda lambda lambda (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) …
  21. … (SpatialKey, [(DateTime, Tile) (DateTime, Tile)]) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, [(DateTime, Tile)]) …
  22. … (SpatialKey, [(DateTime, Tile) (DateTime, Tile)]) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, (DateTime, Tile)) (SpatialKey, [(DateTime, Tile)]) (Shuffle) …
  23. (SpatialKey, [(DateTime, Tile) (DateTime, Tile)]) (SpatialKey, [(DateTime, Tile)]) … mosaic (SpatialKey, Tile) (SpatialKey, Tile) … mosaic
  24. BREAK!
  25. WHERE AND HOW ARE THESE NOTEBOOKS RUNNING?
  26. WHERE’S THIS DATA COMING FROM?
  27. Supported Backends
  28. EXERCISE 3: COMBINING LAND COVER AND NDVI TO DETECT CROP CYCLES
  29. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) …
  30. (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … map_to_spatial (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) … map_to_spatial map_to_spatial STK = SpaceTimeKey
  31. (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) … (SpatialKey, Tile) (SpatialKey, Tile) … ndwi_rdd nlcd_layer.to_numpy_rdd() (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) …
  32. (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) (SpatialKey, (STK, Tile)) … (SpatialKey, Tile) (SpatialKey, Tile) … ndwi_rdd nlcd_layer.to_numpy_rdd() (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) … (Shuffle)
  33. mask_ndwi mask_ndwi mask_ndwi (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) (SpaceTimeKey, Tile) … (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) (SpatialKey, ((STK, Tile), Tile)) …
  34. EXERCISE 4: COMBINING IMAGERY, ELEVATION AND LAND COVER DATA TO MAKE A COOL LOOKING MAP
  35. EXERCISE 4: COMBINING IMAGERY, ELEVATION AND LAND COVER DATA TO MAKE A COOL LOOKING MAP TWEET YOUR SWEET MAP SCREENSHOTS WITH #GEOPYSPARK #FOSS4G!
  36. FINAL QUESTIONS?
  37. Thank you!
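
Slides 19 to 23 trace a three-step regrouping: each (SpaceTimeKey, Tile) record is mapped to (SpatialKey, (DateTime, Tile)), records that share a SpatialKey are gathered into one list (the shuffle), and each list is mosaicked into a single tile. The sketch below mimics that flow in plain Python so it runs without a cluster. It is not the GeoPySpark API: the tuple keys, the list-of-rows tiles, the `None` no-data marker, the helper names `to_spatial`, `group_by_key` and `mosaic`, and the "newest data cell wins" merge rule are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical stand-ins: a key is a plain tuple, a tile is a list of rows,
# and None marks a cell with no data. None of this is the GeoPySpark API.
NODATA = None


def to_spatial(record):
    """(SpaceTimeKey, Tile) -> (SpatialKey, (DateTime, Tile))."""
    (col, row, instant), tile = record
    return (col, row), (instant, tile)


def group_by_key(pairs):
    """Collect the values of each key; on a cluster this step is the shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def mosaic(dated_tiles):
    """Merge tiles oldest to newest so that newer data cells win (assumed rule)."""
    ordered = sorted(dated_tiles, key=lambda pair: pair[0])
    merged = [list(cells) for cells in ordered[0][1]]
    for _, tile in ordered[1:]:
        for r, cells in enumerate(tile):
            for c, cell in enumerate(cells):
                if cell is not NODATA:
                    merged[r][c] = cell
    return merged


# Made-up records: two dates for tile (0, 0) and one date for tile (1, 0).
records = [
    ((0, 0, datetime(2017, 6, 1)), [[1, NODATA], [1, 1]]),
    ((0, 0, datetime(2017, 7, 1)), [[NODATA, 2], [2, NODATA]]),
    ((1, 0, datetime(2017, 6, 1)), [[3, 3], [3, 3]]),
]

grouped = group_by_key(map(to_spatial, records))
mosaicked = {key: mosaic(tiles) for key, tiles in grouped.items()}
print(mosaicked)
```

With a real RDD, the first two steps would correspond to PySpark's `rdd.map(...)` and `rdd.groupByKey()`, and the mosaic would run per key on the grouped values. Only the grouping step moves data between nodes, which is why the slides mark it as the shuffle.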