Time to Science, Time to Results: Accelerating Research with AWS - AWS Symposium 2014 - Washington D.C.

This session demonstrates how the cloud can accelerate breakthroughs in scientific research by providing on-demand access to powerful computing. It features scientific researchers who use the cloud to shorten their time to results.

Published in: Technology, Sports
  • Carat is a free app that tells you what is using up the battery of your mobile device
    Personalized actionable recommendations to increase battery life
    Over 700,000 users
  • Research the old way: Move data around
    Difficult with today’s Big Data
  • Leverage a large ecosystem of tools
  • Cohorts for Heart and Aging Research in Genomic Epidemiology project (CHARGE)
    200 researchers across 5 institutions
    Working to identify genes that contribute to aging and heart disease
    DNA sequence of 14,000 individuals -- 3,751 whole genomes and 10,771 whole exomes
    2.4 million core-hours of computational time
    Generated 440 TB (terabytes) of results
    Nearly a petabyte of total storage
  • Ever growing ecosystem of tools and HPC partners

    1. AWS Government, Education, and Nonprofits Symposium, Washington, DC | June 24, 2014 - June 26, 2014
       Accelerating Research with AWS
       Steve Halliwell (shall@amazon.com), Jamie Kinney (jkinney@amazon.com), Angel Pizarro (pizarroa@amazon.com)
    2. Why?
       • “Work hard, have fun, make history”
       • Accelerate the pace of scientific discovery
       What?
       • Motivations, Theory, and Practice
    3. AWS Research Grants
       • Apply for credits to teach advanced courses, tackle research endeavors, and explore new projects
       • Bootstrap projects that previously would have required expensive up-front and ongoing investments in infrastructure
    4. http://aws.amazon.com/solutions/case-studies/university-of-california-berkeley-amp-lab-carat-project/
    5. Some more examples
       • MIT, Mark Pearrow, McGovern Institute – Genetic and computational analysis, electrophysiological recordings, and non-invasive brain imaging
       • University of Illinois Urbana-Champaign, Indranil Gupta, Computer Science – Research issues in loosely federated clouds
       • Singapore Management University, Ming Jiang – New techniques in malware analysis
       • Technion, Israel Institute of Technology, Alex Zlotnik – Systems for efficient execution of scientific workloads
       • University of Maryland, Michael Schatz, Center for Bioinformatics and Computational Biology – Assembly of large genomes using cloud computing
       • ETH Zurich, Till Quack, Computer Vision Lab – Large scale annotation of photo collections
       • University of Pennsylvania, Zachary Ives, Computer and Information Science Department – Orchestra, collaborative data sharing system on the cloud
       • Monash University, Blair Bethwaite, eScience and Grid Engineering Laboratory – Mixing grids and clouds for high throughput science
       • Harvard University, Vinothan N. Manoharan, SEAS, Department of Physics – Exploring the physics of self-organization with digital holographic microscopy
    6. Take home message
       AWS Research Grants are a great way to bootstrap a project or to experiment on AWS
       http://aws.amazon.com/grants
    7. Scientific Computing Initiatives Y0L0!
    8. Global Alliance for Genomics & Health
       UCSF, UCSC, UCB; BGI; University of Cape Town; UT/MD Anderson; Seven Bridges Genomics; Caltech; Monash University; Sanger Institute; Wellcome Trust; Fred Hutchinson Cancer Research Center & Sage Bionetworks; Broad Institute; OICR; U. Chicago; Cancer Research UK; OHSU; RIKEN; Indian Society of Human Genetics
       Plus hundreds of other sites around the world for Co-Is and colleagues
    9. 1+ Million Cancer Genome Data Warehouse
    10. Enable collaboration
        • Easily and securely share data and applications across institutions
        • Publish preconfigured resources
    11. Data to the compute
    12. Compute to the data
    13. Download and Copy S3 Amazon RDS
    14. Amazon RDS Access in the Cloud S3 RDS RDS RDS
    15. Compute in the Cloud S3 Amazon RDS
    16. Baylor College of Medicine
        A platform built by the Baylor College of Medicine Human Genome Sequencing Center and DNAnexus using the Mercury Pipeline for the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium
        • Supports 300+ researchers around the world
        • Analyzed the genomes of over 14,000 individuals, encompassing 3,751 whole genomes and 10,940 whole exomes (~1 PB of data)
        • Used 3.3 million core hours over 4 weeks to complete the job
        • 5.7x faster than what could have been accomplished on-premises
        The outcomes?
        1. Easier collaboration
        2. Faster time to science
        3. Cost-effective: on-premises was prohibitively expensive
        4. No longer constrained by on-premises capacity
        5. Scientists focusing on science as opposed to infrastructure
    17. AWS Public Data Sets
        • A centralized repository of public datasets
        • Seamless integration with cloud based applications
        • No charge to the community
        • Tell us what else you’d like for us to host …
        Data sets:
        • 1000 Genomes Project
        • Ensembl, GenBank, UniGene, PubChem
        • NASA NEX: Earth science data sets
        • The Cannabis Sativa Genome
        • US Census Data: US demographic data from 1980, 1990, and 2000 US Censuses
        • Freebase Data Dump: A data dump of all the current facts and assertions in the Freebase system, an open database covering millions of topics
        • Google Books n-grams
    18. Technical computing: Why AWS?
        The IT infrastructure needed for technical computing is:
        • Large, complex, expensive
        • Poorly utilized due to project cycles
        • Rapidly obsolete due to technology advances
        Big simulations can require days or weeks per iteration
        “Time in the queue” is a growing problem in larger firms
        Result? Engineering innovation is slowed
    19. Big JOB to do …
    20. … with few resources to do it.
    21. Use a large shared resource …
    22. … but there is a queue.
    23. The hidden cost of queues
        • HPC users seek the fastest possible time-to-results and must compete for scarce cluster resources
        • IT support team seeks the highest possible utilization of expensive cluster resources
        • Result:
          • The job queue becomes the buffer for managing IT capacity
          • Time needed to complete simulations is too long and hard to predict
    24. Properly size your clusters …
    25. … from small …
    26. … to large …
    27. … and lots of them!
    28. Computational compound analysis
        • Solar panel material
        • Estimated serial computation time: 264 years
        • 156,314 core cluster across 8 regions
        • 1.21 petaFLOPS (Rpeak)
        • Simulated 205,000 materials
        • 18 hours for $33,000
        • 16¢ per molecule
        http://news.cnet.com/8301-1001_3-57611919-92/supercomputing-simulation-employs-156000-amazon-processor-cores/
    29. Time: +00h, <10 cores
    30. Time: +24h, >1500 cores
    31. Time: +72h, <10 cores
    32. AWS value for HPC
        • Security: Deploy applications and store data in a secure, highly configurable VPC environment
        • Agility: Deploy the right infrastructure for each technical computing job, at the right time
        • Scalability: Add and subtract servers in minutes to optimize time-to-results
        • Cost Savings: Pay only for what you use, don’t pay for idle or outdated servers
    33. Experiment often. Fail quickly, at low cost. More innovation.
    34. HPC Partners and Apps
    35. Kyushu University
        Support seasonal demand for engineering and science computational resources.
    36. Downstream Analysis Compute Analytics Tools Databases Storage
    37. Questions
    38. http://aws.amazon.com/solutions/case-studies/baylor/
    39. Elastic Map Reduce (S3, Amazon EMR)
        Very high, non-blocking, parallel bandwidth
        1. Put data in S3
        2. Start a cluster (Hadoop, SGE, custom)
        3. Get the results
    40. Easily scale to more computational nodes
    41. Use Spot instances to save $$$
    42. (no text on slide)
    43. Amazon EC2
    44. Launch in VPC for secure computing
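
The headline figures on slides 16 and 28 can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration, not part of the original deck. The function and variable names are hypothetical. It assumes a year of 365.25 days, a 4-week run of exactly 672 wall-clock hours, and (for the efficiency estimate) that all 156,314 cores ran for the full 18 hours, which the slides do not state.

```python
# Back-of-the-envelope checks on figures quoted in slides 16 and 28.
# All names here are illustrative; the inputs are taken from the slides.

HOURS_PER_YEAR = 365.25 * 24  # assumption: Julian year


def ratio(numerator: float, denominator: float) -> float:
    """Plain division, named for readability in the checks below."""
    return numerator / denominator


# Slide 28: computational compound analysis
total_cost_usd = 33_000
materials = 205_000
serial_years = 264
wall_hours = 18
cores = 156_314

cost_per_material = ratio(total_cost_usd, materials)        # dollars per material
speedup = ratio(serial_years * HOURS_PER_YEAR, wall_hours)  # serial time / wall time
# Assumption: every core was busy for the whole run, so this is a rough estimate.
implied_efficiency = ratio(speedup, cores)

# Slide 16: CHARGE analysis
core_hours = 3.3e6
run_weeks = 4
run_hours = run_weeks * 7 * 24
avg_cores = ratio(core_hours, run_hours)  # average cores in use over the run
on_prem_weeks = run_weeks * 5.7           # implied by "5.7x faster"

if __name__ == "__main__":
    print(f"Cost per material: {cost_per_material * 100:.1f} cents")
    print(f"Speedup over serial estimate: {speedup:,.0f}x")
    print(f"Implied parallel efficiency: {implied_efficiency:.0%}")
    print(f"Average cores in use (CHARGE): {avg_cores:,.0f}")
    print(f"Implied on-premises duration: {on_prem_weeks:.1f} weeks")
```

The cost per material rounds to the 16¢ quoted on slide 28. The average core count for the CHARGE run is an average only, since the slides do not say how the cluster size varied over the 4 weeks.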