Exploring cloud-init

Let’s explore cloud-init.

I’ve been doing that since yesterday. My main goal was to define an Ansible role alongside my Terraform code and pass it to the instance via user-data. I know that you are cringing right now and saying: the right tool for the job is Packer, why don’t you use it? Also, do you know that there is a 16 KB limit on user-data (on AWS at least)?

Those are valid points, but sometimes I just want to run the damn Ansible, man. I don’t want to bother with the cloud-init syntax and its modules. Also, how am I passing my Ansible code to the instance if not via user-data, exactly? Am I setting up an S3 bucket (plus IAM policies) and fetching it from there? Or am I using a Git token so that I can fetch the role directly from a private Git repo? Bloat. Either way, you’ll most likely end up with some amount of cloud-init in there, so you might as well learn the trade-offs. Packer? It would be ideal to use it, yes, but the AMI needs to be stored somewhere, and that obviously isn’t free.

So, maybe, just maybe, if all you want to do is pass a small Ansible playbook to your instances and you’re trying to be frugal about it, you can get away with using just cloud-init, dodging the bloat.

Cloud-init pros and cons

Let’s assume the following: we want to use a combination of Terraform and Ansible to provision and configure our servers. There are many ways to achieve that.

Across the internet, you’ll find multiple posts that use the Terraform provisioners (remote-exec and local-exec) to trigger Ansible. I would advise you not to do so, and so does the Terraform documentation. It provides a very solid explanation of why you should only use remote-exec and local-exec as a last resort: their behavior doesn’t fit well with the way Terraform tracks state.

So, we are left with user-data.

But really, is it possible to pass an Ansible role via user-data?

Well, sadly I don’t think there is a clean way to pass an Ansible role directly via user-data. I’ve tried it, but cloud-init is very specific about how it accepts files. If I passed a file’s content and a path, cloud-init would flatten the path structure. So files/something.yml would be turned into files_something.yml. Looking at the GitHub repo, we can find where that behaviour is defined.

This behaviour makes it unviable to pass an Ansible role via user-data, because Ansible is very particular about the file structure of roles.

So, what options do we have left? Well, if you really need to pass an Ansible role, you’re back to square one: you’ll need to pull the role from inside the machine somehow, and you’ll have to deal with authentication and all that jazz.

How to use an Ansible playbook with cloud-init

If, on the other hand, you define all your Ansible code in a single playbook file, it’s simple to pass it to the instance.

Here is a dummy example. This is the only cloud-init that you’ll write: it installs Ansible and invokes the playbook. You can now write all your bootstrapping logic in Ansible instead of cloud-init.

#cloud-config
packages:
  - ansible
runcmd:
  - ansible-playbook /var/lib/cloud/instance/scripts/playbook.yml

Here is our playbook template. Notice the what variable.

  - name: "Just a stupidly small example"
    hosts: localhost
    connection: local
    vars:
      what: ${what}
    tasks:
      - debug:
          msg: "I'm just a simple debug {{ what }}"

You can then use the cloudinit_config resource to get your cloud-init definition ready on the Terraform side.

Notice that we are invoking templatefile on the playbook. We are also cheating a little: we say that the content_type is text/x-shellscript when in reality we are providing a YAML file, not a script. That is what makes cloud-init drop the file under /var/lib/cloud/instance/scripts/, where the ansible-playbook command above picks it up. You can bother with the proper format if you want to. Check the docs on multi-part archives, which is the functionality that cloudinit_config uses under the hood; notice that our Terraform code provides multiple parts.

data "cloudinit_config" "example" {
  part {
    filename     = "playbook.yml"
    content_type = "text/x-shellscript"
    # Assumed location of the playbook template, next to cloud-init.yml.
    content = templatefile("files/playbook.yml", {
      what = "passed-by-terraform"
    })
  }

  part {
    content      = file("files/cloud-init.yml")
    content_type = "text/cloud-config"
  }
}
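If the content-type cheat bothers you, one alternative is to ship the playbook through cloud-init’s write_files module, which takes an explicit destination path. What follows is only a rough sketch under a few assumptions: the "alt" data source name, the cloud_config local, the files/playbook.yml template path and the /opt/bootstrap destination are all made up for illustration.

```hcl
# Sketch only: every name and path below is an illustrative assumption.
locals {
  cloud_config = {
    packages = ["ansible"]
    write_files = [
      {
        # Hypothetical destination for the rendered playbook.
        path    = "/opt/bootstrap/playbook.yml"
        content = templatefile("files/playbook.yml", { what = "passed-by-terraform" })
      }
    ]
    runcmd = ["ansible-playbook /opt/bootstrap/playbook.yml"]
  }
}

data "cloudinit_config" "alt" {
  part {
    content_type = "text/cloud-config"
    # cloud-config documents start with this header line.
    content = "#cloud-config\n${yamlencode(local.cloud_config)}"
  }
}
```

With this variant everything lives in a single cloud-config part, so there is no separate cloud-init.yml file to maintain.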

Then all you need to do is provide your rendered cloud-init via user-data.

module "ec2" {
  # ...
  user_data = data.cloudinit_config.example.rendered
}
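Since the 16 KB user-data limit is easy to forget, it can help to let Terraform complain before AWS does. Here is a minimal sketch, assuming Terraform 1.2 or newer (for preconditions on outputs); the output name is hypothetical. By default cloudinit_config gzips and base64-encodes its result, so the rendered string is somewhat longer than the raw payload, which makes this check err on the safe side.

```hcl
# Hypothetical guard: fail when the rendered user-data grows past 16 KB.
output "user_data_size" {
  # Length of the rendered string, in characters.
  value = length(data.cloudinit_config.example.rendered)

  precondition {
    condition     = length(data.cloudinit_config.example.rendered) <= 16384
    error_message = "Rendered user-data is larger than 16 KB; trim the playbook or fetch it from elsewhere."
  }
}
```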

Just a couple of tips for debugging: you’ll most likely find your cloud-init configuration in the /var/lib/cloud/ folder. Also, to check the logs, look at /var/log/cloud-init-output.log.
