Project Matsu: Large Scale On-Demand Image Processing for Disaster Relief
Collin Bennett, Robert Grossman, Yunhong Gu, and Andrew Levine
Open Cloud Consortium
June 21, 2010
www.opencloudconsortium.org
Project Matsu Goals
Provide persistent data resources and elastic computing to assist in disasters:
- Make imagery available for disaster relief workers
- Elastic computing for large-scale image processing
- Change detection for temporally different and geospatially identical image sets
- Provide a resource for standards testing and interoperability studies of large data clouds
Part 1: Open Cloud Consortium
501(c)(3) not-for-profit corporation
- Supports the development of standards, interoperability frameworks, and reference implementations.
- Manages testbeds: the Open Cloud Testbed and the Intercloud Testbed.
- Manages cloud computing infrastructure to support scientific research: the Open Science Data Cloud.
- Develops benchmarks.
www.opencloudconsortium.org
OCC Members
- Companies: Aerospace, Booz Allen Hamilton, Cisco, InfoBlox, Open Data Group, Raytheon, Yahoo
- Universities: CalIT2, Johns Hopkins, Northwestern Univ., University of Illinois at Chicago, University of Chicago
- Government agencies: NASA
- Open source projects: Sector Project
Operates Clouds
- 500 nodes
- 3,000 cores
- 1.5+ PB
- Four data centers
- 10 Gbps
- Target to refresh 1/3 each year.
Open Cloud Testbed
Open Science Data Cloud
Intercloud Testbed
Project Matsu: Cloud-based Disaster Relief Services
Open Science Data Cloud:
- Astronomical data
- Biological data (Bionimbus)
- Networking data
- Image processing for disaster relief
Focus of OCC Large Data Cloud Working Group
Layered stack (top to bottom): applications; table-based and relational-like data services; cloud compute services (MapReduce, UDF, and other programming frameworks); cloud storage services.
Developing APIs for this framework.
Tools and Standards
- Apache Hadoop/MapReduce
- Sector/Sphere large data cloud
- Open Geospatial Consortium Web Map Service (WMS)
- OCC tools are open source (matsu-project): http://code.google.com/p/matsu-project/
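Processed map layers are exposed through OGC WMS. As a hedged illustration of the standard (not a real Matsu endpoint), a WMS 1.1.1 GetMap request can be built as below; the host name and layer name are placeholders:

    # Sketch of a WMS 1.1.1 GetMap request URL. The host and LAYERS value are
    # illustrative placeholders; the query parameters follow the WMS standard.
    from urllib.parse import urlencode

    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": "delta_set",                 # placeholder layer name
        "SRS": "EPSG:4326",
        "BBOX": "-135.0,45.0,-112.5,67.5",     # minx,miny,maxx,maxy
        "WIDTH": "512",
        "HEIGHT": "512",
        "FORMAT": "image/png",
    }
    url = "http://wms.example.org/wms?" + urlencode(params)
    print(url)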
Part 2: Technical Approach
- Hadoop – Lead: Andrew Levine
- Hadoop with Python Streams – Lead: Collin Bennett
- Sector/Sphere – Lead: Yunhong Gu
Implementation 1: Hadoop & MapReduce
Andrew Levine
Image Processing in the Cloud - Mapper
Step 1: Input to the mapper. Input key: a bounding box (e.g., minx = -135.0, miny = 45.0, maxx = -112.5, maxy = 67.5). Input value: the image.
Step 2: Processing in the mapper. The mapper resizes and/or cuts the original image into pieces, producing a smaller bounding box for each piece.
Step 3: Mapper output. Output key: the bounding box of a piece. Output value: the image piece plus its timestamp.
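A minimal Python sketch of the tiling logic described above, assuming Pillow for image handling and a fixed grid size; it illustrates the mapper's behavior rather than reproducing the project's actual Hadoop mapper:

    # Split one source image covering a known bounding box into a grid of tiles,
    # each keyed by its own bounding box plus the acquisition timestamp.
    # Grid size, the BBox type, and the use of Pillow are illustrative assumptions.
    from collections import namedtuple
    from PIL import Image

    BBox = namedtuple("BBox", "minx miny maxx maxy")

    def tile_image(image: Image.Image, bbox: BBox, timestamp: str, grid: int = 4):
        """Yield (tile_bbox, (timestamp, tile)) pairs for a grid x grid split."""
        width, height = image.size
        dx = (bbox.maxx - bbox.minx) / grid
        dy = (bbox.maxy - bbox.miny) / grid
        px, py = width // grid, height // grid   # edge pixels dropped if not divisible
        for row in range(grid):
            for col in range(grid):
                # Pixel window for this tile; image row 0 is the top, which
                # corresponds to maxy in geographic coordinates.
                window = (col * px, row * py, (col + 1) * px, (row + 1) * py)
                tile = image.crop(window)
                tile_bbox = BBox(
                    minx=bbox.minx + col * dx,
                    miny=bbox.maxy - (row + 1) * dy,
                    maxx=bbox.minx + (col + 1) * dx,
                    maxy=bbox.maxy - row * dy,
                )
                yield tile_bbox, (timestamp, tile)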
Image Processing in the Cloud - Reducer
Step 1: Input to the reducer. Input key: a bounding box (e.g., minx = -45.0, miny = -2.8125, maxx = -43.59375, maxy = -2.109375). Input values: the image pieces for that bounding box.
Step 2: Processing in the reducer. Assemble the images based on timestamps and compare them; the result is a delta of the two images.
Step 3: Reducer output. Three sets of images (the timestamp 1 set, the timestamp 2 set, and the delta set), each going to a different map layer for display in WMS.
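A minimal Python sketch of the change-detection step, assuming exactly two timestamps per bounding box and Pillow's ImageChops for the pixel-wise difference. The project's reducer assembles multiple pieces per timestamp before comparing; one image per timestamp is assumed here for brevity:

    # Group incoming tiles for one bounding-box key by timestamp and emit a
    # pixel-wise difference image alongside the two originals.
    from PIL import Image, ImageChops

    def reduce_bbox(bbox, values):
        """values: iterable of (timestamp, image) pairs for the same bounding box."""
        by_time = {}
        for timestamp, image in values:
            by_time.setdefault(timestamp, []).append(image)

        # Assume exactly two acquisition times for this bounding box.
        (t1, images1), (t2, images2) = sorted(by_time.items())
        image1, image2 = images1[0], images2[0]

        # Pixel-wise absolute difference; both images must share mode and size.
        delta = ImageChops.difference(image1, image2)
        return {t1: image1, t2: image2, "delta": delta}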
Implementation 2: Hadoop & Python Streams
Collin Bennett
Preprocessing Step
All images (in a batch to be processed) are combined into a single file.
Each line contains the image's byte array transformed to pixels (raw bytes don't seem to work well with the one-line-at-a-time Hadoop streaming paradigm):

geolocation \t timestamp | tuple size ; image width ; image height ; comma-separated list of pixels

The tuple size, image width, and image height fields are metadata needed to process the image in the reducer.
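A minimal sketch of how one image could be serialized into this one-line record, assuming Pillow and an RGB image; the geolocation string format is an assumption, since it is not spelled out here:

    # Serialize an image file into the record format above:
    # geolocation \t timestamp | tuple size ; width ; height ; pixels
    from PIL import Image

    def image_to_record(path: str, geolocation: str, timestamp: str) -> str:
        image = Image.open(path).convert("RGB")
        width, height = image.size
        pixels = list(image.getdata())          # list of (R, G, B) tuples
        tuple_size = len(pixels[0])             # 3 for RGB
        flat = ",".join(str(channel) for pixel in pixels for channel in pixel)
        return f"{geolocation}\t{timestamp}|{tuple_size};{width};{height};{flat}"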
Map and Shuffle
We can use the identity mapper.
All of the work for mapping was done in the pre-process step.
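Because the preprocessing step already emits key-value lines, the streaming mapper can simply pass its input through, and Hadoop's shuffle groups records by the key before the first tab (the geolocation). A minimal identity mapper is sketched below; the streaming job could equally be launched with -mapper /bin/cat:

    #!/usr/bin/env python
    # Identity mapper for Hadoop Streaming: every preprocessed record is echoed
    # unchanged, so the shuffle groups lines by the geolocation key that
    # precedes the first tab.
    import sys

    for line in sys.stdin:
        sys.stdout.write(line)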
