DevOps Fest 2019. Oleg Beletskyi. Using Chef to manage hardware-based infrastructure

This talk covers practical experience of using Chef for deployment, configuration management, and release management of a medium-scale infrastructure (1000+ bare-metal servers). It addresses the following questions:
- How do you manage hardware in a world of clouds and k8s?
- How much time and how many engineers does it take to release to 1000 machines?
- What tooling does Chef provide for monitoring release progress and pre-release and post-release status?
- How do you keep, manage, and rotate secrets in a secure way?
- How long does it take to change the root password across the whole environment?
- Bonus: how many trucks does it take to move a datacenter?

  1. Oleg Beletskyi. Chef Managed Infrastructure. Continuous Delivery. Continuous DevOps. 6th April 2019, Kyiv, Ukraine
  2. Goal
     ● Give an example of infrastructure managed by Chef
     ● Share a commit-based release workflow
     ● Describe secret management patterns
  3. Infrastructure overview
     ● Medium scale: ~1000 servers
     ● Hardware-based
     ● Geo-distributed: 3 datacenters (changing over time)
     ● RPM-based package management (RHEL, yum)
     ● Chef configuration management
  4. Infrastructure overview: Unit
     ● A unit is a group of system data servers
     ● Uses the same portion of the frontend/db stack
     ● Unit of horizontal scaling
     ● We will discuss only data servers
  5. Infrastructure overview: DC View
  6. Release Infrastructure
     ● Chef Server, UI, Reporting
     ● GitHub: chef-repo
     ● yum/PXE servers
     ● Jenkins
     ● SecretServer
  7. Data Server Bootstrap
     ● IPMI config
     ● PXE boot
     ● Minimal install
     ● Network, RAID, volumes
     ● Bootstrap in Chef
     ● Profit (in about 1h)
  8. Chef: Client Overview
     ● Each server has chef-client installed
     ● Classic pull model: each machine runs chef-client every 30 minutes
     ● ~30 server check-ins per minute
     ● 1.5-minute average chef-client run duration
     ● ~540 resources managed

     Sample chef-client output:

         * simple_iptables_rule[log FORWARD] action append (up to date)
         * execute[chkconfig iptables on] action run
           - execute chkconfig iptables on
         Running handlers:
         Running handlers complete
         Chef Client finished, 22/536 resources updated in 01 minutes 14 seconds
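The check-in figure on this slide can be sanity-checked with simple arithmetic. The sketch below assumes check-ins are spread evenly across the 30-minute interval (the slides do not say how runs are staggered) and uses Little's law to estimate how many runs are in flight at once. The method names are illustrative only.

```ruby
# Average check-in rate for a fleet where every node runs on a fixed interval.
def checkins_per_minute(servers, interval_minutes)
  servers / interval_minutes.to_f
end

# Little's law: average runs in flight = arrival rate * average run duration.
def concurrent_runs(rate_per_minute, run_minutes)
  rate_per_minute * run_minutes
end

# ~1000 servers, a 30-minute interval, and a 1.5-minute average run (slide figures).
rate = checkins_per_minute(1000, 30)
puts format("check-ins per minute: %.1f", rate)
puts format("average runs in flight: %.1f", concurrent_runs(rate, 1.5))
```

With the slide's numbers this gives roughly 33 check-ins per minute, in line with the quoted ~30, and about 50 chef-client runs in flight at any moment.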
  9. Chef: chef-repo
     ● chef-repo: configuration management GitHub repository
       ○ Cookbooks
       ○ Environments
       ○ Roles
       ○ Tests
  10. chef-repo: cookbooks
      ● Cookbooks: configuration code units
      ● Used for server configuration:
        ○ ntp, iptables, packages, resolvers, users
      ● App config: packages, services, configs
      ● Attributes
        ○ Default system/app configuration
      ● ~45 cookbooks in repo
      ● ~80 with dependencies
      ● 2 product app cookbooks
  11. chef-repo: cookbooks: attributes

      Cookbook defaults:

          # File: cookbooks/app_volume/attributes/default.rb
          # Default app_volume settings
          default["app_volume"]["pki_master"] = "dev-ctrl01.local"
          default["app_volume"]["zone_controller"] = "dev-ctrl01.local"
          ...

      Unit1 environment override (unit1_prod.json):

          "name": "unit1_prod",
          "default_attributes": {
            "app_volume": {
              "pki_master": "frk-pki01.intra",
              "zone_controller": "frk-zone01.intra"
            }
          ...

      Unit12 environment override (unit12_prod.json):

          "name": "unit12_prod",
          "default_attributes": {
            "app_volume": {
              "pki_master": "pa-pki01.intra",
              "zone_controller": "pa-zone01.intra"
            }
          ...
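To make the override behaviour on this slide concrete, here is a minimal sketch of how an environment's `default_attributes` end up replacing a cookbook's defaults. It is a simplified model in plain Ruby, not Chef's actual merge implementation; `deep_merge` and the constant names are made up for illustration, and the values are the ones shown on the slide.

```ruby
# Recursively merge two attribute hashes; values in `override` win.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Cookbook defaults from attributes/default.rb on the slide.
COOKBOOK_DEFAULTS = {
  "app_volume" => {
    "pki_master" => "dev-ctrl01.local",
    "zone_controller" => "dev-ctrl01.local"
  }
}.freeze

# default_attributes from unit1_prod.json on the slide.
UNIT1_PROD = {
  "app_volume" => {
    "pki_master" => "frk-pki01.intra",
    "zone_controller" => "frk-zone01.intra"
  }
}.freeze

effective = deep_merge(COOKBOOK_DEFAULTS, UNIT1_PROD)
puts effective["app_volume"]["pki_master"]
```

A node in unit1_prod therefore sees the frk-* hosts, while a node in an environment without overrides keeps the dev defaults.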
  12. chef-repo: environments
      ● Environments: policy files
      ● Stored in raw JSON format
      ● Reflect the Units layout
      ● Reflect DTSP (Dev/Test/Stage/Prod) areas
      ● Attribute overrides
        ○ Unit configuration: dbs, api, etc.
        ○ DC-specific config: ntp, dc, dns
      ● Cookbook version locks
        ○ Primary release control mechanism

          "name": "unit1_prod",
          "json_class": "Chef::Environment",
          "description": "Unit1 Production Environment",
          "default_attributes": {
            "app_volume": {
              "pki_master": "frk-pki01.intra",
              "zone_controller": "frk-zone01.intra"
            },
            ...
          },
          "cookbook_versions": {
            "app_volume": "= 2.1.17",
            "os-hardening": "= 1.0.2",
            "chef-client": "= 3.0.1",
            ...
          }
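The version locks are what gate a release: a node only converges with a cookbook version that satisfies its environment's constraint. The sketch below illustrates that check using RubyGems' `Gem::Requirement` as a stand-in, since its constraint syntax is similar; Chef has its own constraint handling, so treat this as an approximation. `allowed?` is a hypothetical helper.

```ruby
require "rubygems" # Gem::Version and Gem::Requirement ship with Ruby

# Pins copied from the unit1_prod example on the slide.
COOKBOOK_VERSIONS = {
  "app_volume" => "= 2.1.17",
  "os-hardening" => "= 1.0.2",
  "chef-client" => "= 3.0.1"
}.freeze

# True when `version` satisfies the environment's constraint for `cookbook`.
# Cookbooks without a pin are treated as unconstrained.
def allowed?(pins, cookbook, version)
  constraint = pins.fetch(cookbook, ">= 0")
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?(COOKBOOK_VERSIONS, "app_volume", "2.1.17")
puts allowed?(COOKBOOK_VERSIONS, "app_volume", "2.1.18")
```

Because every pin is an exact `=` constraint, uploading a newer cookbook to the Chef server changes nothing for the unit until its environment file is updated.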
  13. chef-repo: roles
      ● Roles are used to specify a server's run_list
      ● Problem:
        ○ Roles are not versioned in Chef
        ○ A role change affects all environments

          "name": "unit_data_server",
          "description": "System Unit Data Server",
          "json_class": "Chef::Role",
          "chef_type": "role",
          "run_list": [
            "role[monitored]",
            "recipe[ohai]",
            "recipe[app_volume]",
            "recipe[chef-client]",
            "recipe[os-hardening]"
          ]
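A run_list can nest roles, so the effective list of recipes is only known after expansion. The following sketch shows one way to model that expansion in plain Ruby. The `unit_data_server` run_list is from the slide; the contents of the `monitored` role are not shown in the deck, so `monitoring_agent` is a hypothetical placeholder, as is the `expand_run_list` helper.

```ruby
# Roles keyed by name. unit_data_server is from the slide;
# the contents of "monitored" are hypothetical.
ROLES = {
  "unit_data_server" => [
    "role[monitored]", "recipe[ohai]", "recipe[app_volume]",
    "recipe[chef-client]", "recipe[os-hardening]"
  ],
  "monitored" => ["recipe[monitoring_agent]"]
}.freeze

# Expand a run_list into a flat, de-duplicated list of recipe names,
# descending into nested roles in order.
def expand_run_list(run_list, roles, seen = [])
  run_list.flat_map do |item|
    case item
    when /\Arole\[(.+)\]\z/
      name = Regexp.last_match(1)
      raise ArgumentError, "role cycle at #{name}" if seen.include?(name)

      expand_run_list(roles.fetch(name), roles, seen + [name])
    when /\Arecipe\[(.+)\]\z/
      [Regexp.last_match(1)]
    else
      raise ArgumentError, "unknown run_list item: #{item}"
    end
  end.uniq
end

puts expand_run_list(["role[unit_data_server]"], ROLES).inspect
```

Note that the expansion takes no environment argument. That mirrors the problem stated on the slide: since roles carry no version, editing one changes the result for every environment at once.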
  14. chef-repo: tests
      ● The Chef test ecosystem is inherited from Ruby
      ● Syntax: cookstyle (rubocop in Ruby)
      ● Linting: foodcritic
      ● Unit: chefspec (rspec in Ruby)
      ● Integration: inspec

      ● ChefSpec
        ○ In-memory
        ○ chef-client run emulation
        ○ Fast (1-2 minutes)
        ○ Roles, Environments, DataBags
        ○ Chef-Zero!
      ● Cons
        ○ Dozens of stubs
        ○ Overtesting
        ○ Complex
  15. Chef toolbelt
      ● Inventory
      ● Reporting
      ● CLI (knife)
        ○ knife status
        ○ knife search
        ○ knife node edit
      ● Attributes override
        ○ cookbook -> environment -> node
  16. Chef toolbelt: knife examples

      Search nodes in an environment:

          $ knife search chef_environment:unit101_dev -i
          3 items found
          unit101-s1-d.local
          unit101-s2-d.local
          unit101-s3-d.local
          ...

      Search nodes in an environment and show each node's cookbook version:

          $ knife search chef_environment:unit12_prod -a cookbooks.app_volume.version
          130 items found
          unit12-s4:
            cookbooks.app_volume.version: 1.9.1
          unit12-s5:
            cookbooks.app_volume.version: 1.9.1
          unit12-s13:
            cookbooks.app_volume.version: 1.9.1
          ...
  17. Chef toolbelt: knife examples

      Search nodes not yet updated to the release version:

          $ knife search 'NOT cookbooks_app_volume_version:2.0.0' -a cookbooks.app_volume.version
          12 items found
          unit102-s1-d:
            cookbooks.app_volume.version: 2.1.0
          unit102-s2-d:
            cookbooks.app_volume.version: 2.1.0
          unit2-s3:
            cookbooks.app_volume.version: 1.9.1
          ...

      Search nodes with the release version:

          $ knife search 'cookbooks_app_volume_version:2.0.0' -i
          659 items found
          ...
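The two searches on this slide amount to partitioning the fleet by converged cookbook version. As a rough illustration of that logic outside knife, the sketch below splits a small in-memory inventory into updated and pending nodes. The first three nodes and their versions are taken from the slide output; `unit1-s1` being on 2.0.0 is a made-up entry so both groups are non-empty, and `release_progress` is a hypothetical helper.

```ruby
# Split nodes into those already on the release version and those lagging.
# `nodes` maps node name => converged app_volume cookbook version.
def release_progress(nodes, release_version)
  done, pending = nodes.partition { |_name, version| version == release_version }
  { done: done.map(&:first), pending: pending.map(&:first) }
end

NODES = {
  "unit102-s1-d" => "2.1.0",
  "unit102-s2-d" => "2.1.0",
  "unit2-s3" => "1.9.1",
  "unit1-s1" => "2.0.0" # hypothetical node already on the release version
}.freeze

progress = release_progress(NODES, "2.0.0")
puts "updated: #{progress[:done].size}, pending: #{progress[:pending].join(', ')}"
```

As in the slide, "not on the release version" catches both nodes that lag behind (1.9.1) and dev nodes that are already ahead (2.1.0), so the pending list still needs a human eye.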
  18. Chef toolbelt: knife examples

      Convergence status of nodes:

          $ knife status 'NOT cookbooks_app_volume_version:2.0.0'
          ...
          2 minutes ago, unit102-s3-d, centos 7.6.1810.
          2 minutes ago, unit102-s1-d, centos 7.6.1810.
          1 minute ago, unit102-s2-d, centos 7.6.1810.

      Show unhealthy nodes:

          $ knife status 'NOT cookbooks_app_volume_version:2.0.0' --hide-by-mins 45
          431654 hours ago, ob-test12.
          431654 hours ago, ob-test13.
          25 hours ago, unit1-s1, redhat 6.10.
          45 minutes ago, unit2-s3, redhat 6.10.
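The unhealthy-node view boils down to filtering on the time since the last check-in. The sketch below reproduces that filter over an in-memory map of timestamps; it is an illustration, not how knife is implemented. The ages mirror the slide output, and the "431654 hours ago" entries appear consistent with a last check-in of zero (the Unix epoch), which is assumed here to mean the node never converged. `stale_nodes` is a hypothetical helper.

```ruby
# Return names of nodes whose last check-in is at least `mins` minutes old,
# oldest first.
def stale_nodes(last_checkin, now, mins)
  cutoff = now - mins * 60
  last_checkin.select { |_name, at| at <= cutoff }
              .sort_by { |_name, at| at }
              .map(&:first)
end

now = Time.utc(2019, 4, 6, 12, 0, 0) # arbitrary reference time for the example
checkins = {
  "unit102-s1-d" => now - 2 * 60,
  "unit2-s3" => now - 45 * 60,
  "unit1-s1" => now - 25 * 3600,
  "ob-test12" => Time.at(0).utc # assumed: never checked in, reads as the epoch
}

puts stale_nodes(checkins, now, 45).inspect
```

A node that checked in two minutes ago drops out of the list, while the 45-minute, 25-hour, and never-seen nodes remain, matching the shape of the slide output.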
  19. Environments overview: Dev/Test

      Dev area:
      ● Automatic code promotion
      ● Multiple dev environments
      ● Jenkins jobs: automatic

      Test area:
      ● Engineering-controlled promotion
      ● Jenkins jobs: automatic/manual
      ● Multiple test environments
      ● Some environments are rebuilt from a bare-metal state
  20. Environments overview: Stage/Prod

      Stage area:
      ● develop/master branch model
      ● Git: PR develop -> master, approve
      ● Jenkins job: manual

      Prod:
      ● Multiple *_prod environments!
      ● Reflects the Units layout
      ● Dev, Test, Stage passed
      ● Bureaucracy gate: get all approvals
      ● Canary release to 1 unit
      ● Environment constraints updated
      ● Jenkins job: manual/console
      ● Profit.
  21. Typical release

      Preparations:
      ● Dev/Test passed
      ● Approve, then merge the PR into master
      ● Automatic cookbook upload to Chef
      ● Manual trigger: Jenkins job, release to stage
      ● Approve/sign off for prod release

      Release:
      ● Jenkins job: bump cookbook versions on the canary unit (a dedicated prod unit with real data)
      ● Approve
      ● Jenkins job: bump cookbook versions on the rest of the units (environment files)
      ● Monitor the release flow
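The release step itself is an edit to an environment file: the cookbook pin is moved to the new version. The deck does not show what the Jenkins job does internally, so the following is only a sketch of that edit using Ruby's standard `json` library. `bump_pin` is a hypothetical helper, and 2.1.18 is an invented target version.

```ruby
require "json"

# Return a copy of the environment with `cookbook` pinned to exactly `version`.
def bump_pin(environment, cookbook, version)
  pins = environment.fetch("cookbook_versions", {}).merge(cookbook => "= #{version}")
  environment.merge("cookbook_versions" => pins)
end

# Trimmed-down environment file modelled on the unit1_prod example.
UNIT1_PROD_JSON = <<~JSON
  {
    "name": "unit1_prod",
    "cookbook_versions": {
      "app_volume": "= 2.1.17",
      "os-hardening": "= 1.0.2"
    }
  }
JSON

# 2.1.18 is a hypothetical release version.
bumped = bump_pin(JSON.parse(UNIT1_PROD_JSON), "app_volume", "2.1.18")
puts JSON.pretty_generate(bumped)
```

Running the same edit first against the canary unit's file and then, after approval, against the remaining *_prod files gives the two-stage rollout described above. Nodes pick up the change on their next scheduled chef-client run.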
  22. Monitoring release
      ● 1 DevOps engineer
      ● Uses knife status/search
      ● Uses monitoring dashboards
      ● Uses Chef reporting logs/graphs
      ● Communicates with engineering
      ● Waits 30+ minutes for completion

      Post-release:
      ● Search for and catch systems that failed to update
      ● knife status -> reporting -> logs
      ● A chef-client run helps 99% of the time
  23. Security
      ● All sensitive data is stored in SecretServer
      ● Cookbooks use only the secret ID
      ● Chef code uses a simple secret server library
      ● chef-client fetches secrets from SecretServer on each run:
        ○ to make interactive calls
        ○ to fill configuration file templates
  24. Security: Chef-Vault
      How to get secrets from SecretServer safely?
      ● Use Chef-Vault
      ● Encrypted items hold the credentials for the secret server
      ● Maintain ACLs of client keys allowed to decrypt
      ● Maintain the list of admins allowed to decrypt
  25. Security: Summary
      ● Security audit
      ● Secrets access audit
      ● Tools to rotate secrets
      ● Clean git
  26. Use case: root password management
      ● root access is allowed only from the console
        ○ iDRAC
      ● Chef recipe os-hardening::change_root_password
        ○ Generates a new password
        ○ Creates or updates the secret item on the secret server
        ○ Sets the local root password
      ● Used during the bootstrap phase
      ● Can be used at any time by Ops to rotate the password
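The three steps of the recipe can be sketched in plain Ruby to show the ordering. This is not the author's recipe: the secret server client is replaced by a hypothetical in-memory store, the `root/<hostname>` item naming is invented, and the step that would set the local password is passed in as a callable so the example runs without touching the system.

```ruby
require "securerandom"

# Hypothetical stand-in for the secret server: an in-memory key/value store.
class InMemorySecretStore
  def initialize
    @items = {}
  end

  # Create the item or overwrite an existing one.
  def upsert(id, value)
    @items[id] = value
  end

  def fetch(id)
    @items.fetch(id)
  end
end

# Generate a password, record it in the store, then apply it via `apply`.
# The order follows the slide: the secret is stored before the host changes.
def rotate_root_password(hostname, store, apply:)
  password = SecureRandom.alphanumeric(32)
  store.upsert("root/#{hostname}", password)
  apply.call(password)
  password
end

store = InMemorySecretStore.new
applied = []
rotate_root_password("unit1-s1", store, apply: ->(pw) { applied << pw })
puts store.fetch("root/unit1-s1") == applied.last
```

Storing the secret before changing the host means a failed local update leaves a recorded password that was never applied, rather than an applied password that was never recorded.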
  27. Real infrastructure problems
      ● The Chef codebase requires maintenance
      ● Chef deprecates versions too fast (for enterprise)
      ● Hard to fix legacy bad patterns
      ● Dependency hell - wait a minute??
      ● Many infrastructure dependencies:
        ○ Why do NTP servers in dev always give the wrong time?
        ○ Why were the passwords on these 2 servers set to 500?
  28. Q/A
  29. Thank you!
