Hadoop World 2011: Indexing the Earth - Large Scale Satellite Image Processing Using Hadoop - Oliver Guinan, Skybox Imaging

Skybox Imaging is using Hadoop as the engine of its satellite image processing system. Using CDH to store and process vast quantities of raw satellite image data enables Skybox to create a system that scales as they launch larger numbers of ever more complex satellites. Skybox has developed a CDH-based framework that allows image processing specialists to develop complex processing algorithms using native code and then publish those algorithms into the highly scalable Hadoop MapReduce interface. This session will provide an overview of how we use HDFS, HBase, and MapReduce to process raw camera data into high-resolution satellite images.

Published in: Technology, Business

Transcript

  • 1. Indexing the Earth. Hadoop World NYC 2011. Oliver Guinan, VP Ground Data Systems, ollie@skyboximaging.com
  • 2. Session Agenda: Skybox; The Big Data problem; Indexing the planet at scale; Questions. HadoopWorld 2011
  • 3. Today’s data is old. Image taken September 2008, more than three years old. Callouts: bridge under construction (completed 2009); convention center under construction (completed 2010); stadium under construction (completed 2010).
  • 4. A problem of scale
  • 5. Satellite Imagery = Transparency... Callouts: -15% vegetation; 43% damage; 55,245 gallons of crude oil; 215 automobiles; 6,254 containers. [Chart: monthly time axis, J through D.]
  • 6. The problem of capacity
  • 7. Sensor network in space
  • 8. New approach: Many distributed, low-cost satellites
  • 9. Total Raw Data (compute), Sensor Network. Satellites produce ~1 TB of raw data/day. [Chart: data captured per year (PB, axis 0 to 15) and sensors in network (axis 0 to 20) over Year 1 to Year 5; series: Single Satellite, Sensors in Network.]
  • 10. Total Raw Data (storage), Sensor Network. Satellites produce ~1 TB of raw data/day. [Chart: data captured per year (PB, axis 0 to 30) and sensors in network (axis 0 to 20) over Year 1 to Year 5; series: Single Satellite, Sensors in Network.]
  • 11. Enter the elephant
  • 12. Hadoop from space - processing bits. Hadoop is bad at calling native C code or libraries at scale; scientific computing is immature in Java.
  • 13. Hadoop from space - processing bits. Standard Java Hadoop: Hadoop knows where data is stored; jobs are efficiently scheduled close to the data; throughput is optimized.
  • 14. Hadoop from space - processing bits. Hadoop Pipes & Streaming: Hadoop schedules jobs without regard to the data required by the job; native code reads data across the network; this drives up network costs and drives down throughput.
  • 15. Hadoop from space - processing bits. BusBoy: Hadoop manages data reads and writes; Hadoop schedules jobs close to the data; jobs read data and hand off to native code for processing.
  • 16. Architecture Overview. Diagram components: Hadoop Task; BusBoy; C code; math.lib; gdal.lib; cv.lib; Inputs; Outputs; Logging; Progress; Hadoop JobTracker; HDFS; HBase; Hive.
  • 17. Framework Benefits - Deployment: low time to first byte; insight into job progress; diagnostics for large-scale operations; logging.
  • 18. Framework Benefits - Development: prototyping outside of Hadoop; rapid turnaround; testable interfaces.
  • 19. Skybox providing Big Data: produce the most complete and timely data about the world; make data available to users to mine the raw data for information; turn Big Data into knowledge, at Earth scale. Diagram labels: Skybox, BusBoy.
  • 20. Color Images: simulated from aerial platform using flight sensor
  • 21. HD Video
  • 22. Questions? Sample Data? bigdata@skyboximaging.com
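
Slides 9 and 10 quote roughly 1 TB of raw data per day and chart yearly totals in petabytes. The arithmetic behind that kind of estimate can be sketched as follows. This is only an illustration: it assumes the 1 TB/day figure applies per satellite, uses decimal units (1 PB = 1000 TB), and the fleet sizes in `fleet_by_year` are hypothetical, so it does not attempt to reproduce the values on the charts.

```python
TB_PER_PB = 1000  # decimal units (assumption)


def raw_pb_per_year(satellites, tb_per_day=1.0, days=365):
    """Raw data captured in one year, in PB, for a fleet of a given size.

    Assumes every satellite produces `tb_per_day` terabytes each day.
    """
    return satellites * tb_per_day * days / TB_PER_PB


def cumulative_pb(fleet_by_year, tb_per_day=1.0):
    """Running total of raw data (PB) that storage must hold, year by year."""
    total = 0.0
    totals = []
    for satellites in fleet_by_year:
        total += raw_pb_per_year(satellites, tb_per_day)
        totals.append(total)
    return totals


if __name__ == "__main__":
    fleet_by_year = [1, 4, 8, 14, 20]  # hypothetical fleet growth, not from the slides
    for year, total in enumerate(cumulative_pb(fleet_by_year), start=1):
        yearly = raw_pb_per_year(fleet_by_year[year - 1])
        print(f"Year {year}: {yearly:.2f} PB captured, {total:.2f} PB stored so far")
```

Under these assumptions a single satellite contributes about 0.365 PB per year, so yearly capture grows linearly with fleet size while the stored total grows faster because it accumulates.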

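Slide 15 describes BusBoy as letting Hadoop manage reads, writes, and scheduling while each job hands its data to native code, and slide 16 lists inputs, outputs, logging, and progress as parts of the architecture. The slides do not show BusBoy's actual interface, so the following is only a sketch of that hand-off pattern. Every name in it (`run_task`, `invert_pixels`, `report_progress`) is hypothetical, and the pure-Python `invert_pixels` merely stands in for a compiled C routine.

```python
import logging

log = logging.getLogger("handoff_sketch")


def invert_pixels(payload: bytes) -> bytes:
    """Stand-in for native image-processing code (hypothetical).

    In a real deployment this would be a call into a compiled library.
    """
    return bytes(255 - b for b in payload)


def run_task(records, native_process, report_progress):
    """Framework-side loop: the framework owns I/O, native code owns the math.

    records         -- iterable of (key, bytes) pairs already read by the framework
    native_process  -- callable taking bytes and returning bytes
    report_progress -- callable receiving a fraction in (0, 1]
    """
    records = list(records)
    total = len(records)
    for done, (key, payload) in enumerate(records, start=1):
        result = native_process(payload)  # hand-off to the processing routine
        log.info("processed %s (%d bytes in, %d bytes out)", key, len(payload), len(result))
        report_progress(done / total)  # the framework can surface job progress
        yield key, result  # the framework writes the output


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    tiles = [("tile-0", b"\x00\xff"), ("tile-1", b"\x10")]
    for key, out in run_task(tiles, invert_pixels, lambda f: print(f"progress {f:.0%}")):
        print(key, out.hex())
```

Because the processing routine only sees bytes in and bytes out, it can be prototyped and tested outside Hadoop, which is the development benefit slide 18 points to.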