Cloud-Init
Customizing Virtual Machines
What is this cloud-init?
Per the doc: “Cloud-init is the defacto multi-distribution package that
handles early initialization of a cloud instance.”


A set of services that can be configured to run tasks and set up a VM
instance, mostly during the early stages of the VM’s lifecycle


A specification for providing configuration to these services


A specification for adding functionality to these services (modules)


Developed by Canonical, dual licensed under GPL3 and Apache 2.0
Basic overview
Boot Stages


https://cloudinit.readthedocs.io/en/latest/topics/boot.html
1. Generator - decides if cloud-init services should run


2. Local - cloud-init-local.service - network setup, local datasource


3. Network - cloud-init.service - cloud_init_modules


4. Config - cloud-config.service - cloud_config_modules


5. Final - cloud-final.service - pkg installations, etc
Datasources


https://cloudinit.readthedocs.io/en/latest/topics/datasources.html
Cloud-init defines an API for fetching userdata and metadata


Implement the API to create a datasource


CloudStack relevant datasources:


CloudStack - Fetched from router VM


ConfigDrive v2 - Fetched from ISO
Modules


https://cloudinit.readthedocs.io/en/latest/topics/modules.html
Bundle of code that implements a specific config task


Has a ‘frequency’ property (once, instance, always)


Has defined inputs (config keys) accepted via userdata


Bundled modules found in cloud-init source at “cloudinit/config” dir


Can be referenced to run in “cloud-init”, “cloud-config”, or “cloud-final”
stages
Cloud-init Configurations
Configurations
/etc/cloud/cloud.cfg


Usually static, template level configuration starting point


Commonly used to specify which modules and datasources the template supports


Userdata configuration


Dynamic, instance level configuration


Used to customize individual instances


Variety of format options


Instance metadata


Comes from cloud provider, values can be referenced in userdata
Userdata formats


https://cloudinit.readthedocs.io/en/latest/topics/format.html
Shell scripts - script is run


Include file - file containing urls to other userdata files


Upstart job - placed in /etc/init and run as any other


Boothook data - stored in /var/lib/cloud and executed


Part handler - python, override or provide new module


Cloud config - yaml file to call modules that perform configuration


MIME multi-part archive - provide any number of the above, sections labeled with
mime tags
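As a rough illustration of the MIME multi-part option, the sketch below uses Python's standard `email` package to bundle several userdata parts into one archive. The header-to-MIME-subtype table follows the formats listed above; the helper names (`subtype_for`, `build_archive`) and the sample filenames are hypothetical, chosen only for this example.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Dict

# First-line header of each userdata format -> text/<subtype> MIME tag.
PREFIX_TO_SUBTYPE = {
    "#!": "x-shellscript",
    "#include": "x-include-url",
    "#upstart-job": "upstart-job",
    "#cloud-boothook": "cloud-boothook",
    "#part-handler": "part-handler",
    "#cloud-config": "cloud-config",
}


def subtype_for(part: str) -> str:
    """Guess the MIME subtype of a userdata part from its first line."""
    first_line = part.lstrip().splitlines()[0] if part.strip() else ""
    for prefix, subtype in PREFIX_TO_SUBTYPE.items():
        if first_line.startswith(prefix):
            return subtype
    raise ValueError(f"unrecognized userdata header: {first_line!r}")


def build_archive(parts: Dict[str, str]) -> str:
    """Combine {filename: content} parts into one MIME multi-part document."""
    archive = MIMEMultipart()
    for filename, body in parts.items():
        section = MIMEText(body, subtype_for(body))
        section.add_header("Content-Disposition", "attachment", filename=filename)
        archive.attach(section)
    return archive.as_string()


# Hypothetical example parts: one shell script and one cloud-config file.
example = build_archive({
    "setup.sh": "#!/bin/sh\necho hello\n",
    "config.yaml": "#cloud-config\npackages: [htop]\n",
})
example_types = [p.get_content_type() for p in message_from_string(example).get_payload()]
```

The resulting string can be supplied as userdata in place of a single script or cloud-config file.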
1. Install cloud-init


2. Configure /etc/cloud/cloud.cfg


3. Create a template


4. Deploy w/userdata
Installation
yum install cloud-init


apt install cloud-init


Kickstart package includes


Cloud templates from distributions
/etc/cloud/cloud.cfg
Data source config


Network config (cloud-init-local)


Modules for cloud-init service


Modules for cloud-config service


Modules for cloud-final service
/etc/cloud/cloud.cfg (cont’d)
/etc/cloud/cloud.cfg overrides a default, built-in
cloud.cfg that comes with the distribution package


/etc/cloud/cloud.cfg can be overridden by other
configs, such as userdata or those provided via CLI
when running “cloud-init” manually


Configs are merged, not replaced
Note on overrides
/etc/cloud/cloud.cfg (cont’d)
Add individual module to a stage


Can override module frequency


Modules are run in order of configuration


Some modules depend on other modules (e.g. runcmd
needs scripts-user)
Defining modules for a stage
Module configuration


https://cloudinit.readthedocs.io/en/latest/topics/modules.html
“once” - run only once (do not re-run for new instance-id)


“once-per-instance” - run on boot when instance-id changes


“always” - run every boot


Modules have a default frequency, see documentation
Module frequency
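The three frequency values can be read as a small decision rule. The sketch below is only a model of the behaviour described above, not cloud-init's own code; the function name and its two boolean inputs are assumptions made for illustration.

```python
def should_run(frequency: str, ran_before: bool, ran_for_this_instance: bool) -> bool:
    """Decide whether a module runs on this boot.

    ran_before:            the module has run at least once on this image
    ran_for_this_instance: the module has run under the current instance-id
    """
    if frequency == "always":
        return True                       # every boot
    if frequency == "once-per-instance":
        return not ran_for_this_instance  # again after the instance-id changes
    if frequency == "once":
        return not ran_before             # never again, even for a new instance-id
    raise ValueError(f"unknown frequency: {frequency!r}")
```

For example, a VM deployed from a template that has already booted once gets a new instance-id, so “once-per-instance” modules run again while “once” modules do not.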
Module configuration


https://cloudinit.readthedocs.io/en/latest/topics/modules.html
YAML cloud-config data


Each module has its own supported configuration, see
documentation
Module keys
Cloud-init with CloudStack
CloudStack datasources
Configured by network offering


Virtual router based


ConfigDrive based (cd-rom)


Baremetal (MaaS, Canonical)
Virtual Router Datasource
http-based


Hosted on virtual router


curl http://<router IP>/latest/user-data


Router provides data based on caller’s IP


Router should also be configured as DHCP server for the network
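The curl call above has a direct Python equivalent, which can be handy when checking from inside a VM what the router would serve. This is a minimal sketch using only the standard library; the function names and the timeout default are assumptions, and the router address must be supplied by the caller (for example, taken from the DHCP lease).

```python
import urllib.request


def userdata_url(router_ip: str) -> str:
    """Build the user-data URL served by the virtual router."""
    return f"http://{router_ip}/latest/user-data"


def fetch_userdata(router_ip: str, timeout: float = 5.0) -> bytes:
    """Fetch this VM's userdata; the router selects it by the caller's IP."""
    with urllib.request.urlopen(userdata_url(router_ip), timeout=timeout) as resp:
        return resp.read()
```

Because the router answers based on the caller's IP, the request has to originate from the VM itself to return that VM's data.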
ConfigDrive Datasource
ISO image attached to VM with filesystem label “config-2”


ConfigDrive v2 format


Can be used for network setup, as it can be processed in the
cloud-init-local stage


Configuration “vm.configdrive.primarypool.enabled” - will host ISO on
primary storage when used with KVM
CloudStack catch-all datasource configuration (/etc/cloud/cloud.cfg)
datasource_list: [ ConfigDrive, CloudStack, None ]
datasource:
  ConfigDrive:
    dsmode: local
  CloudStack: {}
  None: {}

Looks for ConfigDrive first, during cloud-init-local,
falls back to checking for virtual router
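The fallback order can be pictured as "first datasource in the list that answers wins". The sketch below models only that ordering; it is not how cloud-init implements discovery, and the `probes` mapping of callables is a hypothetical stand-in for the real checks (ISO present, router reachable).

```python
from typing import Callable, Dict, List, Optional


def pick_datasource(
    datasource_list: List[str],
    probes: Dict[str, Callable[[], bool]],
) -> Optional[str]:
    """Return the first datasource, in configured order, whose probe succeeds."""
    for name in datasource_list:
        probe = probes.get(name)
        if probe is not None and probe():
            return name
    return None


# Hypothetical probe results: no config drive attached, router reachable.
order = ["ConfigDrive", "CloudStack", "None"]
chosen = pick_datasource(order, {
    "ConfigDrive": lambda: False,
    "CloudStack": lambda: True,
    "None": lambda: True,  # the "None" datasource is the last-resort fallback
})
```

With these assumed probe results the VM ends up on the virtual router datasource; if both real datasources fail, the “None” entry is what remains.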
Providing user data
“deployVirtualMachine userdata=<base64 encoded>”


“updateVirtualMachine userdata=<base64 encoded>”


32 KB CloudStack limit with HTTP POST


Can gzip
$ cat my-userdata | gzip | base64 -w0
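The same gzip-then-base64 step can be done in Python, which also makes it easy to check the result against the size limit before calling the API. This is a sketch under one assumption that should be verified for your CloudStack version: that the 32 KB limit applies to the length of the encoded string. The function and constant names are illustrative.

```python
import base64
import gzip

# Assumption: the limit noted above is measured on the encoded payload.
CLOUDSTACK_POST_LIMIT = 32 * 1024


def encode_userdata(raw: bytes) -> str:
    """gzip, then base64-encode on a single line (like `base64 -w0`)."""
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decode_userdata(encoded: str) -> bytes:
    """Reverse of encode_userdata, useful for checking a payload."""
    return gzip.decompress(base64.b64decode(encoded))


def fits_limit(encoded: str, limit: int = CLOUDSTACK_POST_LIMIT) -> bool:
    """True if the encoded userdata is within the assumed size limit."""
    return len(encoded) <= limit


sample = b"#cloud-config\npackages:\n  - htop\n"
encoded_sample = encode_userdata(sample)
```

The value of `encoded_sample` is what would be passed as the `userdata` parameter.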
Example: setting up data disk
#cloud-config
disk_setup:
  /dev/vdb:
    table_type: gpt
    layout:
      - [33, 82]
      - 67
fs_setup:
  - label: swap
    filesystem: swap
    device: /dev/vdb1
  - label: data
    filesystem: ext4
    device: /dev/vdb2
mounts:
  - [ /dev/vdb1, none, swap, sw, '0', '0' ]
  - [ /dev/vdb2, /data, ext4, defaults, '0', '0' ]
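To make the `layout` values concrete: each entry is a percentage of the disk, optionally paired with a partition type code (82 is Linux swap). The sketch below just does that arithmetic for a given disk size. It ignores alignment and partition-table overhead, so real partition sizes will differ slightly, and the function name is made up for this example.

```python
from typing import List, Optional, Tuple, Union

LayoutEntry = Union[int, List[int]]


def partition_plan(disk_mib: int, layout: List[LayoutEntry]) -> List[Tuple[int, Optional[int]]]:
    """Turn a disk_setup layout into (size_in_MiB, partition_type) pairs.

    An entry is either a bare percentage or a [percentage, type] pair;
    the type is None when the layout does not specify one.
    """
    plan = []
    for entry in layout:
        if isinstance(entry, list):
            percent, part_type = entry
        else:
            percent, part_type = entry, None
        plan.append((disk_mib * percent // 100, part_type))
    return plan


# The layout from the example above, applied to a hypothetical 10000 MiB disk.
plan = partition_plan(10000, [[33, 82], 67])
# plan == [(3300, 82), (6700, None)]
```

So /dev/vdb1 gets roughly a third of the disk as swap, and /dev/vdb2 gets the remainder for the ext4 data filesystem.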
Other Examples


https://cloudinit.readthedocs.io/en/latest/topics/examples.html
Run apt upgrade/yum update


Add package repos


Set up users, ssh keys, groups, etc.


Call a url when finished


Set up puppet/chef agent


Reboot


Run arbitrary scripts or write files


Much more…


Developing cloud-init data
Remember to clear data, or module frequency config may block
execution


“cloud-init init” - run the init stage, or the local stage with the “--local” flag


“cloud-init modules” - run the init, config, or final stage; see the “--mode” flag


“cloud-init clean” - clears data for cloud-init so phases can be run again


“cloud-init single” - run a single module; can override frequency with the “--frequency” flag


“cloud-init query” - look up the value of specific metadata
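When iterating on userdata, it can help to script the manual re-run. The sketch below only builds the command lines and leaves execution to an optional helper. Note that the data-clearing subcommand is spelled `clean` in current cloud-init releases. The Python function names are hypothetical, and the commands normally need root.

```python
import subprocess
from typing import List, Optional


def clean_cmd() -> List[str]:
    """Clear cloud-init's recorded state so stages can run again."""
    return ["cloud-init", "clean"]


def init_cmd(local: bool = False) -> List[str]:
    """Run the init stage, or the local stage when local=True."""
    return ["cloud-init", "init"] + (["--local"] if local else [])


def modules_cmd(mode: str) -> List[str]:
    """Run the modules of one stage: init, config, or final."""
    if mode not in ("init", "config", "final"):
        raise ValueError(f"unknown mode: {mode!r}")
    return ["cloud-init", "modules", "--mode", mode]


def single_cmd(name: str, frequency: Optional[str] = None) -> List[str]:
    """Run one module, optionally overriding its frequency."""
    cmd = ["cloud-init", "single", "--name", name]
    if frequency is not None:
        cmd += ["--frequency", frequency]
    return cmd


def rerun_everything() -> None:
    """Clean, then replay the stages in boot order (not called here)."""
    steps = [clean_cmd(), init_cmd(local=True), init_cmd(),
             modules_cmd("config"), modules_cmd("final")]
    for step in steps:
        subprocess.run(step, check=True)
```

For a quick test of one module, `single_cmd("runcmd", "always")` yields a command that runs it regardless of its recorded frequency.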
Cloud-init analyze show
Semaphores
Used to track the running of modules and determine frequency
thresholds


Found in /var/lib/cloud/instance/sem/ and /var/lib/cloud/sem/


Format of semaphore files is: <stage>_<module_name>.<frequency>


Can be deleted individually to allow re-run of module
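The naming pattern above is easy to take apart in a script, for instance to list which modules have already run. This is a small sketch that assumes filenames follow the pattern exactly as stated; the sample filename in the comment is illustrative, and `clear_semaphore` is a hypothetical helper for the "delete to allow re-run" step.

```python
from pathlib import Path
from typing import NamedTuple


class Semaphore(NamedTuple):
    stage: str
    module: str
    frequency: str


def parse_semaphore(filename: str) -> Semaphore:
    """Split '<stage>_<module_name>.<frequency>' into its three fields."""
    stem, dot, frequency = filename.rpartition(".")
    if not dot:
        raise ValueError(f"no frequency suffix in {filename!r}")
    # The stage is everything before the first underscore; module names
    # may themselves contain underscores.
    stage, underscore, module = stem.partition("_")
    if not underscore:
        raise ValueError(f"no stage prefix in {filename!r}")
    return Semaphore(stage, module, frequency)


def clear_semaphore(sem_dir: Path, filename: str) -> bool:
    """Delete one semaphore file so its module can run again."""
    path = sem_dir / filename
    if path.exists():
        path.unlink()
        return True
    return False


# Illustrative filename following the stated pattern.
sem = parse_semaphore("config_scripts_user.once-per-instance")
```

Here `sem.module` is `scripts_user`, so removing that one file would let the scripts-user module run on the next boot without clearing everything else.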
Questions?

CloudStack and cloud-init