Environmental Science, Big Data and the Cloud

Scientific instruments, environmental sensors, and large-scale simulations are creating more scientific data than ever before. Using advanced, large-scale information processing facilities, scientists can now analyze massive volumes of data in ways that would not have been possible just a few years ago. While a few researchers have access to these large computer systems, most are limited by the processing capacity they can access conveniently and quickly. Cloud computing solutions built on Microsoft Azure give environmental science researchers the compute and storage resources they need, when they need them, without the up-front financial investment, and help reduce the time between progress and breakthroughs. Microsoft Azure brings on-demand computing and data access to environmental scientists and researchers everywhere.

Published in: Science, Technology

Environmental Science, Big Data and the Cloud

  1. Microsoft Research: Computational Ecology and Environmental Science Group. http://research.microsoft.com/en-us/groups/ecology/
  2. Manual Measurement; Automated Measurement; Sample Collection; Historical Photographs; Counting; Ubiquitous Motes; Aircraft Surveys; Model Output; Typing
  3. Monitoring; Collation; Quality assurance; Aggregation; Analysis; Reporting; Forecasting; Distribution. "Done poorly, but a few notable counter-examples"; "Done poorly to moderately, not easy to find"; "Sometimes done well, generally discoverable and available, but could be improved". Integration. (I. Zaslavsky & CSIRO, BOM, WMO)
  4. Data-intensive Science: Data Acquisition & modelling; Collaboration and visualisation; Analysis & data mining; Dissemination & sharing; Archiving and preserving. fourthparadigm.org
  5. Complex shared detector; Simple instrument (if any); Complex and heavy process by experts; Ad hoc observations and models. KB, GB, TB, PB. Science happens when PBs, TBs, GBs, and KBs can be mashed up simply. Provenance and trust vary widely. Data acquisition, early processing, and reporting range from a large government agency to individual scientists. Smaller data is often passed around in email; big data downloads can take days (if at all). Data sharing concerns and patterns vary: open access followed by (non-repeatable and tedious) pre-processing; a true science-ready data set, but with concerns about misuse and misunderstanding, particularly for hard-won data. Computational tools differ. Not everyone can get an account at a supercomputer center. Very large computations require engineering (error handling). Space and time aren’t always simple dimensions.
  6. Getting what you need, when you need it. Cloud computing is good for…
  7. http://github.com/windowsazure
  8. Customer Data Center
  9. http://fetchclimate2.cloudapp.net/
  10. Data Marketplaces
  11. Web search: “open weather data azure”
  12. Weather Forecast Computation as a Service. http://aka.ms/oljnt2
  13. http://weatherservice.cloudapp.net
  14. http://research.microsoft.com/en-us/projects/azure/technical-papers.aspx
  15. http://aka.ms/dm0 http://research.microsoft.com/projects/msrceesdm/
  16. Windows Azure for Research Group. @azure4research www.azure4research.com
  17. MODIS Azure: Computing Evapotranspiration (ET) in the Cloud. A pipeline for download, processing, and reduction of diverse NASA MODIS satellite imagery. Catharine van Ingen (Microsoft Research), Jie Li, Marty Humphrey (UVA), Youngryel Ryu (UCB), Deb Agarwal (BWC/LBL), Keith Jackson (BL), Jay Borenstein (Stanford), Team SICT: Vlad Andrei, Klaus Ganser, Samir Selman, Nandita Prabhu (Stanford), Team Nimbus: David Li, Sudarshan Rangarajan, Shantanu Kurhekar, Riddhi Mittal (Stanford)
  18. MODIS Azure Service (architecture diagram). Labels: Scientists; MODIS Azure Service Web Role Portal; Request Queue; Data Collection Stage; Download Queue; Source Imagery Download Sites; Source Metadata; Reprojection Stage; Reprojection Queue; Derivation Reduction Stage; Reduction #1 Queue; Analysis Reduction Stage; Reduction #2 Queue; Scientific Results Download; Science results. (Credits as on slide 17.)
  19. Use laptops & desktop computers. Overwhelmed by data. Finding analysis ever more difficult; sharing even harder.
  20. www.azure4research.com
  21. Windows Azure for Research Group. @azure4research www.azure4research.com
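
The MODIS Azure Service slide shows work passing through a series of queues (download, reprojection, reduction #1, reduction #2). A minimal sketch of that pattern follows, under several assumptions: the stage order is one reading of the flattened diagram, Python's standard `queue` and `threading` modules stand in for the Azure queues and worker roles, and every function name, handler, and tile ID here is hypothetical rather than taken from the MODIS Azure code.

```python
import queue
import threading


def stage_worker(handler, inbox, outbox):
    """Pull tasks from inbox, apply handler, push results to outbox."""
    while True:
        task = inbox.get()
        if task is None:          # sentinel: forward it downstream and stop
            outbox.put(None)
            return
        # No error handling here; a real service would retry or log failures.
        outbox.put(handler(task))


def run_pipeline(requests, handlers):
    """Chain one queue per stage and return the final results in order."""
    queues = [queue.Queue() for _ in range(len(handlers) + 1)]
    threads = [
        threading.Thread(target=stage_worker, args=(h, queues[i], queues[i + 1]))
        for i, h in enumerate(handlers)
    ]
    for t in threads:
        t.start()
    for request in requests:
        queues[0].put(request)
    queues[0].put(None)           # signal that no more requests are coming
    for t in threads:
        t.join()
    results = []
    while True:
        item = queues[-1].get()
        if item is None:
            break
        results.append(item)
    return results


# Placeholder handlers: each only records that its stage ran.
def download(tile):
    return {"tile": tile, "steps": ["download"]}


def reproject(item):
    item["steps"].append("reprojection")
    return item


def derive(item):
    item["steps"].append("reduction_1")
    return item


def analyze(item):
    item["steps"].append("reduction_2")
    return item


if __name__ == "__main__":
    tiles = ["h08v05", "h09v05"]  # illustrative tile IDs only
    for result in run_pipeline(tiles, [download, reproject, derive, analyze]):
        print(result["tile"], result["steps"])
```

Putting a queue between stages lets each stage run at its own pace and be scaled independently, which is the usual reason for this layout. The sketch runs one worker per stage, so results come out in request order; with several workers per stage that ordering would no longer be guaranteed.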
