Asgard, the Grails App that Deploys Netflix to the Cloud

Speaker: Joe Sondow
Asgard is a free and open source Grails application built and used by Netflix to deploy code changes and to manage resources in the Amazon cloud at large scale.
In this talk we'll delve into how Asgard works, covering topics such as:
Open source motivation, presence, and support.
Tour of the user interface.
Customizing Asgard for your own company.
How Netflix philosophies drive Asgard's design.
The Netflix Cloud Model.
Easy large deployments and fast rollback.
Caching the cloud with Groovy.
Visual language for the cloud.
Publishing a REST API.
Mocking AWS for offline testing.
Comparison with the AWS Management Console.

Published in: Technology, Education

  1. 1. The Grails App that Deploys Netflix to the Cloud Joe Sondow, Netflix @joesondow
  2. 2. Slides online http://slideshare.net/joesondow @joesondow
  3. 3. Who am I? @joesondow
  4. 4. Who am I? @joesondow
  5. 5. Who am I? Joe Sondow @joesondow
  6. 6. Who am I? Joe Sondow Netflix since 2010 @joesondow
  7. 7. Who am I? Joe Sondow Netflix since 2010 Asgard lead @joesondow
  8. 8. Who am I? Joe Sondow Netflix since 2010 Asgard lead Grails @joesondow
  9. 9. Who am I? Joe Sondow Netflix since 2010 Asgard lead Grails jQuery @joesondow
  10. 10. Why am I here?
  11. 11. Why am I here?
  12. 12. Why am I here? Sell you something
  13. 13. Why am I here? Sell you something Discuss business plans
  14. 14. Why am I here? Sell you something Discuss business plans Answer technical questions
  15. 15. Why am I here? Sell you something Discuss business plans Answer technical questions Be a smaller fish in AWS
  16. 16. Why am I here? Sell you something Discuss business plans Answer technical questions Be a smaller fish in AWS Give back to community
  17. 17. Why am I here? Sell you something Discuss business plans Answer technical questions Be a smaller fish in AWS Give back to community Build cloud standards
  18. 18. Why am I here? Sell you something Discuss business plans Answer technical questions Be a smaller fish in AWS Give back to community Build cloud standards Steal your engineers
  19. 19. Asgard
  20. 20. Asgard
  21. 21. Asgard Screen shots
  22. 22. Asgard Application list
  23. 23. Asgard Auto Scaling Group list
  24. 24. Asgard Cluster deployment, ready for fast rollback
  25. 25. Asgard
  26. 26. Asgard Application deployment
  27. 27. Asgard Application deployment Cloud management
  28. 28. Asgard Application deployment Cloud management Started 2010
  29. 29. Asgard Application deployment Cloud management Started 2010 Open source June 2012
  30. 30. Asgard Application deployment Cloud management Started 2010 Open source June 2012 http://netflix.github.io
  31. 31. Asgard Application deployment Cloud management Started 2010 Open source June 2012 http://netflix.github.io 100’s of Jira tickets
  32. 32. Asgard Application deployment Cloud management Started 2010 Open source June 2012 http://netflix.github.io 100’s of Jira tickets Actively developed
  33. 33. Source code and download https://github.com/Netflix/asgard
  34. 34. User forum https://groups.google.com/group/AsgardUsers
  35. 35. Twitter feed https://twitter.com/AsgardOSS
  36. 36. The Asgard Show https://youtube.com/TheAsgardShow
  37. 37. Open Source, Closed Config
  38. 38. Open Source, Closed Config Pull company-specific details out of Asgard
  39. 39. Open Source, Closed Config
  40. 40. Open Source, Closed Config
  41. 41. Open Source, Closed Config Asgard for Netflix is configured to use company-specific extension points such as standard utility links for instances
  42. 42. Open Source, Closed Config Out-of-the-box Asgard installation has no instance utility links
  43. 43. Open Source, Closed Config Netflix specific $ASGARD_HOME/Config.groovy
          link {
              // Avoid GStrings here because these Strings are stored dynamic templates for arbitrary server names.
              String logUrlStart = 'http://${server}:7777'
              String configUrlStart = 'http://${server}:9999/AdminConfig'
              instanceLinkGroupingsToLinkTemplateLists = [
                  'Logs': [
                      new TextLinkTemplate(logUrlStart + '/Admin/list?view=tomcat/catalina.out', 'catalina.out'),
                      new TextLinkTemplate(logUrlStart + '/Admin/list', 'Log File Archive'),
                      new TextLinkTemplate(logUrlStart + '/Admin/threaddumps', 'Thread Dumps'),
                      new TextLinkTemplate(logUrlStart + '/AdminProxy', 'Admin Proxy Info'),
                      new TextLinkTemplate(logUrlStart + '/AdminStatus', 'Admin Proxy Status'),
                      new TextLinkTemplate(logUrlStart + '/GC/index', 'GC Visualization')
                  ],
                  'Netflix Configuration': [
                      new TextLinkTemplate(configUrlStart + '/prop.html', 'NetflixConfiguration Properties Console'),
                      new TextLinkTemplate(configUrlStart + '/libs.html', 'Libraries Console'),
                      new TextLinkTemplate(configUrlStart + '/machineProps', 'Machine Readable Properties'),
                      new TextLinkTemplate(configUrlStart + '/webapp/META-INF/MANIFEST.MF', 'Manifest'),
                  ]
              ]
          }
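The comment on slide 43 explains that the link URLs are kept as plain strings so that `${server}` survives as a placeholder. A minimal sketch of that idea follows. `fillTemplate` and the host names are hypothetical and are not taken from Asgard's `TextLinkTemplate`; the sketch only contrasts a double-quoted GString, which is resolved immediately, with a single-quoted template that is filled in later for each server.

```groovy
// Hypothetical helper for illustration; not Asgard's TextLinkTemplate code.
String fillTemplate(String template, Map<String, String> values) {
    String result = template
    values.each { String key, String value ->
        // Single quotes keep '${' literal, so nothing is interpolated here.
        result = result.replace('${' + key + '}', value)
    }
    result
}

// A GString is resolved once, at the point where it is written.
String server = 'config-time-host'
String eager = "http://${server}:7777"
assert eager == 'http://config-time-host:7777'

// A single-quoted String keeps ${server} as a placeholder for later use.
String logUrlStart = 'http://${server}:7777'
String template = logUrlStart + '/Admin/threaddumps'

assert fillTemplate(template, [server: 'ec2-host-1']) == 'http://ec2-host-1:7777/Admin/threaddumps'
assert fillTemplate(template, [server: 'ec2-host-2']) == 'http://ec2-host-2:7777/Admin/threaddumps'
```

The same stored template can then produce a link for any instance, which is presumably why the config avoids GStrings.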
  44. 44. Open Source, Closed Config grails-app/conf/Config.groovy references external configuration file ~/.asgard/Config.groovy https://github.com/Netflix/asgard/blob/master/grails-app/conf/Config.groovy
          asgardHome = System.getenv('ASGARD_HOME') ?: System.getProperty('ASGARD_HOME') ?:
                  "${System.getProperty('user.home')}/.asgard"

          // Locations to search for config files that get merged into the main config.
          // Config files can either be Java properties files or ConfigSlurper scripts.
          grails.config.locations = [
              "file:${asgardHome}/Config.groovy",
              'classpath:sourceVersion.properties'
          ]
  45. 45. Open Source, Closed Config External Config.groovy also holds the AWS account credentials, or references for finding them.
          grails {
              awsAccounts=["171234567890"]
              awsAccountNames=["171234567890": "prod"]
          }
          secret {
              accessId="AKOBIWANLANDOLUKEHAN"
              secretKey="Od0INd/C3P0/R2D2atatTIEFIGHTERdeathstar"
          }
          cloud {
              accountName="prod"
              publicResourceAccounts=["amazon"]
          }
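As a rough illustration of how an external file can override built-in defaults, the sketch below parses two ConfigSlurper scripts and merges them. The `loadMerged` helper, the script contents, and the "test" default are assumptions made for the example. Grails does its own merging of `grails.config.locations`, so this only approximates the idea.

```groovy
// Hypothetical helper: parse defaults first, then let the overrides win.
ConfigObject loadMerged(String defaultScript, String overrideScript) {
    ConfigObject config = new ConfigSlurper().parse(defaultScript)
    config.merge(new ConfigSlurper().parse(overrideScript))
    config
}

// Stands in for the defaults shipped with the application (values are illustrative).
String defaultScript = '''
cloud {
    accountName = "test"
    publicResourceAccounts = ["amazon"]
}
'''

// Stands in for the company-specific file kept outside the open source code.
String externalScript = '''
grails {
    awsAccountNames = ["171234567890": "prod"]
}
cloud {
    accountName = "prod"
}
'''

ConfigObject merged = loadMerged(defaultScript, externalScript)

assert merged.cloud.accountName == 'prod'                        // overridden externally
assert merged.cloud.publicResourceAccounts == ['amazon']         // default kept
assert merged.grails.awsAccountNames['171234567890'] == 'prod'   // added externally
```

Keeping the override file outside the repository is what lets the code stay open while the account details stay private.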
  46. 46. Netflix is the world’s leading Internet television network with nearly 38 million members in 40 countries enjoying more than one billion hours of TV shows and movies per month, including original series. (from http://ir.netflix.com)
  47. 47. Freedom and Responsibility
  48. 48. Freedom and Responsibility Corporate culture and the Cloud
  49. 49. Freedom and Responsibility
  50. 50. Freedom and Responsibility Cloud SOA
  51. 51. Freedom and Responsibility Cloud SOA 100’s of services
  52. 52. Freedom and Responsibility Cloud SOA 100’s of services Small teams
  53. 53. Freedom and Responsibility Cloud SOA 100’s of services Small teams Independent releases
  54. 54. Freedom and Responsibility Cloud SOA 100’s of services Small teams Independent releases Controlled chaos
  55. 55. Cloud deployment model
  56. 56. Cloud deployment model Applications and Clusters
  57. 57. Cloud deployment model
  58. 58. Cloud deployment model Auto Scaling Group
  59. 59. Cloud deployment model Auto Scaling Group Launch Configuration
  60. 60. Cloud deployment model Auto Scaling Group Launch Configuration Elastic Load Balancer
  61. 61. Cloud deployment model Auto Scaling Group Elastic Load Balancer Launch Configuration Amazon Machine Image
  62. 62. Cloud deployment model Auto Scaling Group Elastic Load Balancer Security Group Launch Configuration Amazon Machine Image
  63. 63. Cloud deployment model Auto Scaling Group Elastic Load Balancer Security Group Launch Configuration Amazon Machine Image Instances
  64. 64. Cloud deployment model Auto Scaling Group Elastic Load Balancer Security Group Launch Configuration Amazon Machine Image Instances
  65. 65. Cloud deployment model Auto Scaling Group Elastic Load Balancer Security Group Launch Configuration Amazon Machine Image Instances
  66. 66. Cloud deployment model Auto Scaling Group Elastic Load Balancer Security Group Launch Configuration Amazon Machine Image Instances
  67. 67. Cloud deployment model
  68. 68. Cloud deployment model Search
  69. 69. Cloud deployment model API Search
  70. 70. Cloud deployment model Ratings API Search
  71. 71. Cloud deployment model Streaming Starts Ratings API Search
  72. 72. Cloud deployment model Streaming Starts Ratings Autocomplete API Search
  73. 73. Cloud deployment model Streaming Starts Sign Up Ratings Autocomplete API Search
  74. 74. Cloud deployment model Streaming Starts Sign Up Ratings Application Application Application Autocomplete Application API Application Search Application
  75. 75. Inventing the Application
  76. 76. Inventing the Application Problem: Application is not an Amazon concept
  77. 77. Inventing the Application Problem: Application is not an Amazon concept Solution: Create an Application database in the cloud Enforce naming conventions on Amazon objects
  78. 78. Fast Rollback
  79. 79. Fast Rollback Optimism causes outages
  80. 80. Fast Rollback Optimism causes outages Production traffic is unique
  81. 81. Fast Rollback Optimism causes outages Production traffic is unique Keep old version running
  82. 82. Fast Rollback Optimism causes outages Production traffic is unique Keep old version running Switch traffic to new version
  83. 83. Fast Rollback Optimism causes outages Production traffic is unique Keep old version running Switch traffic to new version Monitor results
  84. 84. Fast Rollback Optimism causes outages Production traffic is unique Keep old version running Switch traffic to new version Monitor results Revert traffic quickly
  85. 85. Fast Rollback
  86. 86. Fast Rollback api-frontend api-usprod-v007
  87. 87. Fast Rollback api-frontend api-usprod-v007 api-usprod-v008
  88. 88. Fast Rollback api-frontend api-usprod-v007 api-usprod-v008
  89. 89. Fast Rollback api-frontend api-usprod-v007 api-usprod-v008
  90. 90. Fast Rollback api-frontend api-usprod-v007 api-usprod-v008
  91. 91. Fast Rollback api-frontend api-usprod-v007
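The fast rollback sequence on slides 78 to 91 can be sketched as a small in-memory model. `DemoGroup`, `DemoCluster`, and the `takingTraffic` flag are hypothetical names for illustration. A real deployment would change load balancer registration through the AWS APIs, which this sketch does not attempt.

```groovy
// Hypothetical model of one Auto Scaling Group within a cluster.
class DemoGroup {
    String name
    boolean takingTraffic = false
}

class DemoCluster {
    List<DemoGroup> groups = []

    // Launch the new version next to the old one, then send traffic to it.
    // The old group keeps running so that a revert needs no new instances.
    DemoGroup deploy(String newName) {
        DemoGroup next = new DemoGroup(name: newName)
        groups << next
        switchTrafficTo(next)
        next
    }

    // Revert traffic to the previous group, which is still running.
    DemoGroup rollback() {
        if (groups.size() < 2) {
            throw new IllegalStateException('No previous group to roll back to')
        }
        DemoGroup previous = groups[-2]
        switchTrafficTo(previous)
        previous
    }

    private void switchTrafficTo(DemoGroup target) {
        groups.each { it.takingTraffic = it.is(target) }
    }

    List<String> activeNames() {
        groups.findAll { it.takingTraffic }*.name
    }
}

def cluster = new DemoCluster()
cluster.deploy('api-usprod-v007')
cluster.deploy('api-usprod-v008')
assert cluster.activeNames() == ['api-usprod-v008']

// Monitoring shows a problem, so traffic goes back to the old version.
cluster.rollback()
assert cluster.activeNames() == ['api-usprod-v007']
assert cluster.groups*.name == ['api-usprod-v007', 'api-usprod-v008']
```

The point of the model is that rollback only flips which group receives traffic, so it can be fast.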
  92. 92. Inventing the Cluster
  93. 93. Inventing the Cluster Problem: Two ASGs with one function but different names
  94. 94. Inventing the Cluster Problem: Two ASGs with one function but different names Solution: Append version number in reserved format Parse ASG name to determine long-term “cluster”
  95. 95. Inventing the Cluster Instead of keeping a database in sync, use naming conventions to store the source of truth in Amazon’s API
          api               Application
          api-usprod        Cluster
          api-usprod-v007   Auto Scaling Group
          api-usprod-v008   Auto Scaling Group
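The naming convention above can be illustrated with a short parser. This is a simplified sketch, not Asgard's `Relationships` code. It assumes the application name contains no hyphen and that the version suffix is always `-v` followed by three digits, as in the slide's examples. `parseGroupName` is a hypothetical name.

```groovy
// Hypothetical parser for names like "api-usprod-v007".
Map parseGroupName(String groupName) {
    def matcher = groupName =~ '(.+)-v(\\d{3})'
    boolean versioned = matcher.matches()
    // Without a version suffix, the whole name is treated as the cluster name.
    String cluster = versioned ? matcher.group(1) : groupName
    Integer version = versioned ? matcher.group(2).toInteger() : null
    // Assumption: the application is everything before the first hyphen.
    String app = cluster.tokenize('-').first()
    [app: app, cluster: cluster, version: version]
}

assert parseGroupName('api-usprod-v007') == [app: 'api', cluster: 'api-usprod', version: 7]
assert parseGroupName('api-usprod-v008').cluster == 'api-usprod'

// Grouping by the parsed cluster name recovers the long-term cluster
// without storing anything outside the group names themselves.
def groupNames = ['api-usprod-v007', 'api-usprod-v008', 'search-v001']
def byCluster = groupNames.groupBy { parseGroupName(it).cluster }
assert byCluster == ['api-usprod': ['api-usprod-v007', 'api-usprod-v008'], 'search': ['search-v001']]
```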
  96. 96. Database Aversion
  97. 97. Database Aversion Storing metadata on cloud objects
  98. 98. Database Aversion
  99. 99. Database Aversion Naming conventions
  100. 100. Database Aversion Naming conventions Tagging conventions
  101. 101. Database Aversion Naming conventions Tagging conventions No GORM domain objects
  102. 102. Database Aversion Naming conventions Tagging conventions No GORM domain objects AWS Java SDK
  103. 103. Database Aversion Naming conventions Tagging conventions No GORM domain objects AWS Java SDK Less to go out of sync
  104. 104. Database Aversion Naming conventions Tagging conventions No GORM domain objects AWS Java SDK Less to go out of sync Shared source of truth
  105. 105. Caching the Cloud
  106. 106. Caching the Cloud Responsive, massive, multi-region metadata
  107. 107. Caching the Cloud
  108. 108. Caching the Cloud Large counts
  109. 109. Caching the Cloud Large counts Many types
  110. 110. Caching the Cloud Large counts Many types Complex relationships
  111. 111. Caching the Cloud Large counts Many types Complex relationships Multiple regions
  112. 112. Caching the Cloud Large counts Many types Complex relationships Multiple regions Consistent single objects
  113. 113. Caching the Cloud Large counts Many types Complex relationships Multiple regions Consistent single objects Eventually consistent lists
  114. 114. Caching the Cloud
           class Caches {
               final CachedMap<AppRegistration> allApplications
               final CachedMap<ApplicationMetrics> allApplicationMetrics
               final CachedMap<HardwareProfile> allHardwareProfiles
               final MultiRegionCachedMap<MetricAlarm> allAlarms
               final MultiRegionCachedMap<ApplicationInstance> allApplicationInstances
               final MultiRegionCachedMap<AutoScalingGroup> allAutoScalingGroups
               final MultiRegionCachedMap<AvailabilityZone> allAvailabilityZones
               final MultiRegionCachedMap<Cluster> allClusters
               final MultiRegionCachedMap<DBInstance> allDBInstances
               final MultiRegionCachedMap<DBSecurityGroup> allDBSecurityGroups
               final MultiRegionCachedMap<DBSnapshot> allDBSnapshots
               final MultiRegionCachedMap<String> allDomains
               final MultiRegionCachedMap<FastProperty> allFastProperties
               final MultiRegionCachedMap<Image> allImages
               final MultiRegionCachedMap<Instance> allInstances
               final MultiRegionCachedMap<InstanceTypeData> allInstanceTypes
               final MultiRegionCachedMap<KeyPairInfo> allKeyPairs
               final MultiRegionCachedMap<LaunchConfiguration> allLaunchConfigurations
               // etc.
               // etc.
               // etc.
           }
  116. 116. Caching the Cloud
           class AwsRdsService implements CacheInitializer, InitializingBean {

               MultiRegionAwsClient<AmazonRDS> awsClient
               Caches caches

               void initializeCaches() {
                   caches.allDBInstances.ensureSetUp({ Region region -> retrieveDBInstances(region) })
               }

               private List<DBInstance> retrieveDBInstances(Region region) {
                   awsClient.by(region).describeDBInstances(new DescribeDBInstancesRequest()).getDBInstances()
               }

               Collection<DBInstance> getDBInstances(UserContext userContext) {
                   caches.allDBInstances.by(userContext.region).list()
               }
           }
  120. 120. Caching the Cloud
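The caching slides show how services register a retrieval closure per region and then answer list requests from memory. The sketch below is a much-reduced guess at that shape. `SimpleCachedMap` and `SimpleMultiRegionCachedMap` are hypothetical classes. They refresh only when asked, whereas a production cache would presumably refresh on a timer in background threads. The extra `keyer` argument is also an assumption added to keep the example self-contained.

```groovy
import java.util.concurrent.ConcurrentHashMap

// Hypothetical single-region cache: holds the last list fetched from the cloud.
class SimpleCachedMap<T> {
    final Map<String, T> entries = new ConcurrentHashMap<String, T>()
    Closure<List<T>> retriever
    Closure<String> keyer

    void ensureSetUp(Closure<List<T>> retriever, Closure<String> keyer) {
        this.retriever = retriever
        this.keyer = keyer
        refresh()
    }

    // Re-fetch the full list and replace the cached entries with it.
    void refresh() {
        List<T> fresh = retriever.call()
        Map<String, T> replacement = fresh.collectEntries { [(keyer.call(it)): it] }
        entries.keySet().retainAll(replacement.keySet())
        entries.putAll(replacement)
    }

    // Served from memory, so callers never wait on the cloud API.
    Collection<T> list() {
        entries.values().toList()
    }
}

// Hypothetical multi-region wrapper: one SimpleCachedMap per region name.
class SimpleMultiRegionCachedMap<T> {
    final Map<String, SimpleCachedMap<T>> regionCaches = [:]

    void ensureSetUp(List<String> regions, Closure<List<T>> retriever, Closure<String> keyer) {
        regions.each { String region ->
            def cache = new SimpleCachedMap<T>()
            cache.ensureSetUp({ retriever.call(region) }, keyer)
            regionCaches[region] = cache
        }
    }

    SimpleCachedMap<T> by(String region) {
        regionCaches[region]
    }
}

// Usage with a fake cloud that counts how often it is queried.
int cloudCalls = 0
def fakeCloud = ['us-east-1': ['db-a', 'db-b'], 'eu-west-1': ['db-c']]
def allDBInstances = new SimpleMultiRegionCachedMap<String>()
allDBInstances.ensureSetUp(fakeCloud.keySet().toList(),
        { String region -> cloudCalls++; fakeCloud[region] }, { it })

assert allDBInstances.by('us-east-1').list().sort() == ['db-a', 'db-b']
assert allDBInstances.by('eu-west-1').list() == ['db-c']
assert cloudCalls == 2   // one fetch per region; the list() calls hit only the cache
```

Because lists are refreshed as a whole, a list can lag behind the cloud for a while, which matches the "eventually consistent lists" point on the slides.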
  121. 121. Visual Language for the Cloud
  122. 122. Visual Language for the Cloud Tango open source icons
  123. 123. Visual Language for the Cloud
  124. 124. Visual Language for the Cloud AWS is intimidating
  125. 125. Visual Language for the Cloud AWS is intimidating Many object types
  126. 126. Visual Language for the Cloud AWS is intimidating Many object types Help newbie users
  127. 127. Visual Language for the Cloud AWS is intimidating Many object types Help newbie users Reduce cognitive load
  128. 128. Visual Language for the Cloud AWS is intimidating Many object types Help newbie users Reduce cognitive load Make it easy
  129. 129. Visual Language for the Cloud AWS is intimidating Many object types Help newbie users Reduce cognitive load Make it easy Avoid surprises
  130. 130. Visual Language for the Cloud
  131. 131. Visual Language for the Cloud At a glance, these nav bar items look alike.
  132. 132. Visual Language for the Cloud At a glance, these nav bar items look alike.
  133. 133. Visual Language for the Cloud
  134. 134. Visual Language for the Cloud Some screens have multiple action buttons that look too similar.
  135. 135. Visual Language for the Cloud Some screens have multiple action buttons that look too similar.
  136. 136. Visual Language for the Cloud
  137. 137. Visual Language for the Cloud Because of naming conventions, these links look alike.
  138. 138. Visual Language for the Cloud Because of naming conventions, these links look alike.
  139. 139. Visual Language for the Cloud
  140. 140. Visual Language for the Cloud The indicators for the current AWS region are too easy to miss.
  141. 141. Visual Language for the Cloud The indicators for the current AWS region are too easy to miss.
  142. 142. Visual Language for the Cloud These availability zones are important to recognize at a glance but their names look similar, and they appear on many screens.
  143. 143. Visual Language for the Cloud These availability zones are important to recognize at a glance but their names look similar, and they appear on many screens.
  144. 144. Visual Language for the Cloud
  145. 145. Tango Icons
  146. 146. Tango Icons http://tango.freedesktop.org/
  147. 147. Tango Icons http://tango.freedesktop.org/ http://tango.freedesktop.org/Tango_Icon_Theme_Guidelines
  148. 148. Tango Icons http://tango.freedesktop.org/ http://tango.freedesktop.org/Tango_Icon_Theme_Guidelines http://commons.wikimedia.org/wiki/Tango_icons
  149. 149. Tango Icons http://tango.freedesktop.org/ http://tango.freedesktop.org/Tango_Icon_Theme_Guidelines http://commons.wikimedia.org/wiki/Tango_icons Used by Firefox, Jenkins, GIMP, OpenOffice, VMware
  150. 150. REST API in Grails
  151. 151. REST API in Grails Enable external mashups with cloud data
  152. 152. REST API in Grails
  153. 153. REST API in Grails
  154. 154. REST API in Grails
  155. 155. REST API in Grails
  156. 156. REST API in Grails
  157. 157. REST API in Grails
  158. 158. REST API in Grails
  159. 159. REST API in Grails
  160. 160. REST API in Grails
  161. 161. REST API in Grails ApplicationController.groovy
           def show = {
               String name = params.id
               UserContext userContext = UserContext.of(request)
               AppRegistration app = applicationService.getRegisteredApplication(userContext, name)
               def groups = awsAutoScalingService.getAutoScalingGroupsForApp(userContext, name)
               List<String> clusterNames = groups.collect {
                   Relationships.clusterFromGroupName(it.autoScalingGroupName)
               }.unique()
               Map details = [
                   app: app,
                   strictName: Relationships.checkStrictName(app.name),
                   clusters: clusterNames,
                   groups: groups,
                   balancers: awsLoadBalancerService.getLoadBalancersForApp(userContext, name),
                   securities: awsEc2Service.getSecurityGroupsForApp(userContext, name),
                   appSecurityGroup: awsEc2Service.getSecurityGroup(userContext, name),
                   launches: awsAutoScalingService.getLaunchConfigurationsForApp(userContext, name),
               ]
               withFormat {
                   html { return details }
                   xml { new XML(details).render(response) }
                   json { new JSON(details).render(response) }
               }
           }
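The controller on slide 161 builds one `details` map and renders it in whichever format the caller asks for. Outside Grails, the same pattern might look like the sketch below, which uses Groovy's `JsonOutput` and `JsonSlurper` in place of the Grails `withFormat` block and `JSON` converter. The `respond` function and the sample data are hypothetical.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Hypothetical stand-in for withFormat: one model, several renderings.
def respond(Map details, String format) {
    switch (format) {
        case 'json':
            return JsonOutput.toJson(details)
        default:
            // Like the html branch: hand the model back for a view to render.
            return details
    }
}

Map details = [app: 'api', clusters: ['api-usprod'], groups: ['api-usprod-v007', 'api-usprod-v008']]

String body = respond(details, 'json')
assert body == '{"app":"api","clusters":["api-usprod"],"groups":["api-usprod-v007","api-usprod-v008"]}'

// An external mashup can read the same data back without knowing about the UI.
def parsed = new JsonSlurper().parseText(body)
assert parsed.clusters == ['api-usprod']
assert parsed.groups.size() == 2

assert respond(details, 'html').is(details)
```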
  162. 162. Offline Development
  163. 163. Offline Development Makes on a plane
  164. 164. Offline Development Mock data Mock behavior System property switch offline=true
  165. 165. Mock Data http://asgardprod/us-east-1/autoScaling/list.json
  166. 166. Mock Data Parse JSON
  167. 167. Mock Behavior Override Amazon Java client methods
  168. 168. System Property grails run-app -Doffline=true
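The offline approach on slides 162 to 168 combines saved JSON, overridden client behavior, and a system property switch. A minimal sketch of how those pieces could fit together follows. `MockGroupSource`, `LiveGroupSource`, and `chooseGroupSource` are hypothetical names, the JSON is made-up sample data, and the live branch is deliberately left unimplemented rather than guessing at AWS client calls.

```groovy
import groovy.json.JsonSlurper

// Mock source: answers from JSON captured earlier, for example from a list.json URL.
class MockGroupSource {
    String savedJson

    List listGroups() {
        new JsonSlurper().parseText(savedJson) as List
    }
}

// Placeholder for the real source, which would call the AWS Java SDK.
class LiveGroupSource {
    List listGroups() {
        throw new UnsupportedOperationException('A real implementation would call AWS here')
    }
}

// The switch: Boolean.getBoolean reads the "offline" system property, as set by -Doffline=true.
def chooseGroupSource(String savedJson) {
    Boolean.getBoolean('offline') ? new MockGroupSource(savedJson: savedJson) : new LiveGroupSource()
}

// Made-up sample data for illustration.
String savedJson = '[{"autoScalingGroupName": "api-usprod-v007", "minSize": 3}]'

// Set here only so the sketch runs standalone; normally this comes from the command line.
System.setProperty('offline', 'true')

def source = chooseGroupSource(savedJson)
assert source instanceof MockGroupSource
assert source.listGroups()[0].autoScalingGroupName == 'api-usprod-v007'
```

With the property unset, the same call would return the live source, so the rest of the application does not need to know which mode it is in.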
  169. 169. Why not the AWS console?
  170. 170. Why not the AWS console?
  171. 171. Why not the AWS console?
  172. 172. Why not the AWS console? Hide keys
  173. 173. Why not the AWS console? Hide keys Customize model
  174. 174. Why not the AWS console? Hide keys Customize model Enforce conventions
  175. 175. Why not the AWS console? Hide keys Customize model Enforce conventions Automate workflow
  176. 176. Why not the AWS console? Hide keys Customize model Enforce conventions Automate workflow Log changes
  177. 177. Why not the AWS console? Hide keys Customize model Enforce conventions Automate workflow Log changes Integrate systems
  178. 178. Why not the AWS console? Hide keys Customize model Enforce conventions Automate workflow Log changes Integrate systems Customize REST API
  179. 179. @NetflixOSS
  180. 180. @NetflixOSS http://techblog.netflix.com
  181. 181. @NetflixOSS http://techblog.netflix.com http://netflix.github.io
  182. 182. http://github.com/Netflix/asgard Thank you
  183. 183. http://github.com/Netflix/asgard Thank you Questions? @AsgardOSS @joesondow jobs.netflix.com
