Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013

MACPAC is a federal legislative branch agency tasked with reviewing state and federal Medicaid and Children's Health Insurance Program (CHIP) access and payment policies and making recommendations to Congress. By March 15 and again by June 15 each year, the agency produces a comprehensive report for Congress that compiles results from Medicaid and CHIP data sources for the 50 states and territories. The CIO of MACPAC wanted a secure, cost-effective, high-performance platform capable of crunching this large volume of health data. In this session, learn how MACPAC and 8KMiles helped set up the agency’s Big Data/HPC analytics platform on AWS using SAS analytics software.

Published in: Technology, Education


Transcript

  • 1. Empowering Congress with Data-Driven Analytics. Mathew Chase and Sri Vasireddy, November 13, 2013. © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 2. • A small federal legislative branch agency • Newly established in late 2010 • Going beyond the “Cloud First” goal to “Cloud Only”
  • 3. Hello • Mathew Chase • Federal CIO • Over 20 years experience in the public and private sectors leading technology operations
  • 4. Who are you? • Government • Health care industry • Cloud newbies • AWS ninjas • Whoops… wrong session
  • 5. Question? How many of you are using AWS as your primary computing datacenter?
  • 6. MACPAC’s AWS Datacenter • AWS to replace an onsite or hosted datacenter • Single primary region with cold recovery on the other coast • Multiple AZs for redundancy • Separate VPCs for security “air gaps”
  • 7. MACPAC: the “perfect” cloud customer • Predictable work cycles • Two intense work periods (annual) • Growing with an undefined future • Potential need for more computing resources • Very cost conscious • No legacy infrastructure
  • 8. What we achieved in the cloud • > 40% reduction in capital expenses – With additional savings in rent, utilities, and labor • Cost spread over typical equipment lifespan • On-demand storage and archiving • Zero over-provisioning • Ability to expand and contract resources at will
  • 9. Core focus Recommendations to Congress on Medicaid and the Children’s Health Insurance Program
  • 10. Reports to the Congress Reports due by: • March 15th & • June 15th www.MACPAC.gov/reports
  • 11. Research backed by analytics • Analyze Medicaid program data • Find intersections with Medicare • Evaluate Medicaid survey information
  • 12. Tools • SAS Office Analytics enterprise platform • Red Hat Enterprise Linux x64 • Amazon EC2
  • 13. Concerns 1. Security 2. Performance
  • 14. Security
  • 15. Security Requirements • Multi-user controlled environment • Isolated environment with strong controls • No sensitive and personal data sitting at periphery • Data encrypted at rest and in transit
  • 16. Access Protection Challenge • Twenty Instances • Twenty Ports for AD • 20 x 20 = 400 Rules
  • 17. Access Control Using Security Groups [diagram] • AD Security Group: AD-1, AD-2 • Infra Security Group: client instances • DNS Security Group: DNS-1, DNS-2 • Rules shown: accept AD-related requests from ‘Infra’ group; accept DNS queries from AD group; accept DNS queries from ‘Infra’ group
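The rule-count arithmetic on slide 16 and the group-based approach on slide 17 can be sketched as below. This is only an illustration under stated assumptions: one rule per (source instance, port) pair in the per-instance case, and one rule per port when the rule's source is a whole security group. The function names are hypothetical and nothing here calls AWS.

```python
def rules_per_instance(num_instances: int, num_ports: int) -> int:
    """One rule per (source instance, port) pair, as counted on slide 16."""
    return num_instances * num_ports


def rules_per_group(num_source_groups: int, num_ports: int) -> int:
    """One rule per (source security group, port) pair.

    Assumption: every client instance belongs to a single source group
    (the 'Infra' group on slide 17), so each rule names the group rather
    than an individual instance.
    """
    return num_source_groups * num_ports


per_instance = rules_per_instance(20, 20)  # twenty instances, twenty AD ports
per_group = rules_per_group(1, 20)         # one 'Infra' group, twenty AD ports
print(per_instance, per_group)
```

Under these assumptions, adding a twenty-first client instance changes only group membership, not the rule set.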
  • 18. Encrypted Data flow
  • 19. Cloud Security Design
  • 20. Performance
  • 21. SAS Requirements • Very I/O intensive • Sequential reads and writes – 35-70 MB/sec per core of I/O desired – GOAL: 4-core system = ~200 MB/sec I/O
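The per-core figure on slide 21 translates into a system-level range as follows. This is a small worked calculation, assuming the desired throughput scales linearly with core count; the function name is hypothetical.

```python
def io_target_range(cores: int, low_per_core: float = 35.0,
                    high_per_core: float = 70.0) -> tuple[float, float]:
    """Return the (low, high) sequential I/O target in MB/sec for a system.

    Assumption: the 35-70 MB/sec per-core guidance from the slide scales
    linearly with the number of cores.
    """
    return cores * low_per_core, cores * high_per_core


low, high = io_target_range(4)
goal = 200.0  # the slide's stated goal for a 4-core system, in MB/sec
goal_in_range = low <= goal <= high
print(low, high, goal_in_range)
```

The slide's goal of roughly 200 MB/sec sits near the middle of the range this produces for four cores.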
  • 22. Base AWS Structure • M3 extra large running RHEL x64 for cluster – 1 TB EBS RAID 10 for primary data (4 × 500 GB drives) – 1 TB EBS RAID 0 for temp work space (4 × 256 GB drives) – 1 TB EBS LUKS-encrypted RAID 0 for ETL (4 × 256 GB drives)
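The volume sizes on slide 22 follow from the usual capacity rules for the two RAID levels: RAID 0 stripes across all drives, while RAID 10 mirrors pairs and stripes across the mirrors, so half the raw capacity is usable. A minimal sketch, with a hypothetical helper name and ignoring filesystem and metadata overhead:

```python
def usable_gb(level: str, drives: int, drive_gb: int) -> int:
    """Approximate usable capacity in GB for a RAID 0 or RAID 10 array.

    Ignores filesystem, LUKS, and RAID metadata overhead.
    """
    if level == "raid0":
        # Striping only: all raw capacity is usable, with no redundancy.
        return drives * drive_gb
    if level == "raid10":
        if drives < 4 or drives % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        # Mirrored pairs, striped: half the raw capacity is usable.
        return drives * drive_gb // 2
    raise ValueError(f"unsupported RAID level: {level}")


primary = usable_gb("raid10", 4, 500)  # primary data volume
scratch = usable_gb("raid0", 4, 256)   # temp work space and ETL volumes
print(primary, scratch)
```

Both layouts come out at about 1 TB, which matches the slide; the RAID 0 volumes trade redundancy for capacity and throughput.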
  • 23. Can AWS yield the necessary performance?
  • 24. In the immortal words of Spinal Tap: “These go to eleven!”
  • 25. Turning up the AWS dial
  • 26. Volume @ 3 Specifications • M3 extra large • 4 × 256 GB EBS disks • RAID 0 stripe
  • 27. fio Sequential Read @ 3 (callout: 77,166 KB/s)
      [ec2-user]# fio sastest.fio
      job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
      fio-2.1.2
      Starting 1 process
      Jobs: 1 (f=1)
      job1: (groupid=0, jobs=1): err= 0: pid=31661: Sun Oct 27 23:07:18 2013
        read : io=102400KB, bw=77167KB/s, iops=19291, runt= 1327msec
          clat (usec): min=3, max=25911, avg=44.70, stdev=572.02
           lat (usec): min=5, max=25913, avg=46.86, stdev=572.02
      Run status group 0 (all jobs):
         READ: io=102400KB, aggrb=77166KB/s, minb=77166KB/s, maxb=77166KB/s, mint=1327msec, maxt=1327msec
  • 28. Volume @ 10 Specifications • M3 extra large • 4 × 256 GB EBS disks • 4000 IOPS per drive • RAID 0 stripe
  • 29. fio Sequential Read @ 10 (callout: 191,401 KB/s)
      [ec2-user]$ fio sastest.fio
      job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
      fio-2.1.2
      Starting 1 process
      job1: (groupid=0, jobs=1): err= 0: pid=2731: Tue Nov 5 22:55:33 2013
        read : io=102400KB, bw=191402KB/s, iops=47850, runt= 535msec
          clat (usec): min=3, max=51820, avg=13.29, stdev=337.22
           lat (usec): min=4, max=51821, avg=15.52, stdev=337.21
      Run status group 0 (all jobs):
         READ: io=102400KB, aggrb=191401KB/s, minb=191401KB/s, maxb=191401KB/s, mint=535msec, maxt=535msec
  • 30. “If we need that extra push over the cliff. You know what we do?” “11! Exactly.” — Nigel
  • 31. fio Sequential Read @ 11 (callout: 432,067 KB/s)
      [ec2-user]$ fio sastest.fio
      job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
      fio-2.1.2
      Starting 1 process
      job1: (groupid=0, jobs=1): err= 0: pid=3133: Tue Nov 5 23:13:13 2013
        read : io=102400KB, bw=432068KB/s, iops=108016, runt= 237msec
          clat (usec): min=0, max=1594, avg= 8.26, stdev=42.59
           lat (usec): min=0, max=1594, avg= 8.38, stdev=42.59
      Run status group 0 (all jobs):
         READ: io=102400KB, aggrb=432067KB/s, minb=432067KB/s, maxb=432067KB/s, mint=237msec, maxt=237msec
  • 32. Volume @ 11 Specifications • 4 × 256 GB EBS disks • 4000 IOPS per drive • RAID 0 stripe • cg1.4xlarge (10 Gb I/O channel)
  • 33. fio Sequential Read @ 11 (callout: 432,067 KB/s)
      [ec2-user]$ fio sastest.fio
      job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
      fio-2.1.2
      Starting 1 process
      job1: (groupid=0, jobs=1): err= 0: pid=3133: Tue Nov 5 23:13:13 2013
        read : io=102400KB, bw=432068KB/s, iops=108016, runt= 237msec
          clat (usec): min=0, max=1594, avg= 8.26, stdev=42.59
           lat (usec): min=0, max=1594, avg= 8.38, stdev=42.59
      Run status group 0 (all jobs):
         READ: io=102400KB, aggrb=432067KB/s, minb=432067KB/s, maxb=432067KB/s, mint=237msec, maxt=237msec
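The three fio runs can be cross-checked and compared against the ~200 MB/sec goal with a short script. This is a sketch, not part of the original talk: the summary lines are copied from the slides, the regular expression is written only for this fio 2.1.2 output shape, and the conversion to MB/sec assumes fio's KB means 1024 bytes.

```python
import re

# "read :" summary lines copied from the @ 3, @ 10 and @ 11 slides.
SUMMARY_LINES = {
    "@ 3": "read : io=102400KB, bw=77167KB/s, iops=19291, runt= 1327msec",
    "@ 10": "read : io=102400KB, bw=191402KB/s, iops=47850, runt= 535msec",
    "@ 11": "read : io=102400KB, bw=432068KB/s, iops=108016, runt= 237msec",
}

PATTERN = re.compile(r"io=(\d+)KB, bw=(\d+)KB/s, iops=(\d+), runt=\s*(\d+)msec")


def parse_summary(line: str) -> dict:
    """Extract the numeric fields from one fio read summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognised fio summary line: {line!r}")
    io_kb, bw_kb_s, iops, runt_ms = map(int, match.groups())
    return {
        "io_kb": io_kb,
        "bw_kb_s": bw_kb_s,
        "iops": iops,
        "runt_ms": runt_ms,
        # Bandwidth recomputed from data volume and runtime, as a sanity check.
        "derived_bw_kb_s": io_kb / (runt_ms / 1000),
        # Assumption: 1 MB = 1024 KB for comparison with the 200 MB/sec goal.
        "mb_per_sec": bw_kb_s / 1024,
    }


results = {label: parse_summary(line) for label, line in SUMMARY_LINES.items()}
speedup = results["@ 11"]["bw_kb_s"] / results["@ 3"]["bw_kb_s"]

for label, r in results.items():
    print(f"{label}: {r['mb_per_sec']:.1f} MB/sec "
          f"(derived {r['derived_bw_kb_s']:.0f} KB/s)")
print(f"speedup @ 11 vs @ 3: {speedup:.1f}x")
```

Note that each run reads only 102400 KB in well under two seconds, so the figures are best read as indicative rather than as sustained throughput.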
  • 34. I am pretty sure I can make the dial go higher • RAM disks • Block sizes • Larger stripes • Application tuning • Etc…
  • 35. WARNING! • Be sure to touch all sectors of a new disk per AWS guidance prior to testing and production. Command for Unix environments: $ dd if=/dev/md0 of=/dev/null
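What the dd command on slide 35 does, reading the whole device from start to end and discarding the data, can be sketched in Python as below. This is an illustrative equivalent only: the function name and chunk size are assumptions, reading a real device such as /dev/md0 would require root privileges, and the demonstration runs against a temporary file instead.

```python
import os
import tempfile


def touch_all_blocks(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Read a file or block device sequentially to the end, discarding data.

    Returns the number of bytes read. Mirrors `dd if=<path> of=/dev/null`.
    """
    total = 0
    # buffering=0 gives raw, unbuffered reads straight from the OS.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:  # empty bytes object signals end of file/device
                break
            total += len(chunk)
    return total


# Demonstration on a temporary file rather than a real block device.
expected_size = 3 * 1024 * 1024 + 123
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * expected_size)
    demo_path = tmp.name

bytes_read = touch_all_blocks(demo_path)
os.remove(demo_path)
print(bytes_read)
```

In practice dd remains the simpler tool; the sketch is mainly useful if the step has to be driven from an existing provisioning script.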
  • 36. You are not alone… • Guidance from software vendors • AWS professional services • Use an iterative process (fail quickly) • Third-party partners (8kMiles) so get going!
  • 37. What did we learn? • Make a decision • Start at zero… • Spend time really thinking about security • And then crank it up where you need it. “Try again. Fail again. Fail better.” Samuel Beckett, Worstward Ho (1983)
  • 38. References • Amazon EBS Volume Performance – http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html • AWS Microsoft Platform Security – http://media.amazonwebservices.com/AWS_Microsoft_Platform_Security.pdf • Benchmarking SAS I/O: Verifying I/O Performance Using fio – http://support.sas.com/resources/papers/proceedings13/4792013.pdf • This is Spinal Tap (Movie, 1984, Rob Reiner - Director)
  • 39. Special Thanks to: 8kMiles, AWS, and SAS And thank you for your time today.
  • 40. Contact Information mathew.chase@macpac.gov www.macpac.gov
  • 41. Please give us your feedback on this presentation BDT304 As a thank you, we will select prize winners daily for completed surveys!