Terraform Chef Provisioner

Posted: October 24, 2019/Under: Chef, Django, Jenkins, Terraform/By: sdarwin

This article is a proof of concept to explore using the Terraform Chef Provisioner and Chef Vault to deploy a Django App on AWS.

It should be noted that these technologies have significant caveats.

Regarding Terraform Chef Provisioner: “Provisioners should only be used as a last resort. For most common situations there are better alternatives.” (from https://www.terraform.io/docs/provisioners/chef.html)

The Terraform Chef Provisioner does not support auto-scaling since it’s provisioning individual nodes.

Regarding Chef Vault: “However granting or revoking new servers requires human interaction. This means chef-vault is incompatible with auto-scaling or self-healing systems.” (from https://coderanger.net/chef-secrets/ )

Create 3 servers:

Jenkins
Chef Infra Server
Chef Workstation

Jenkins

Launch a small EC2 instance. Tag the machine Name=Jenkins

Create a security group called “jenkins”. Allow 8080, HTTPS, HTTP, SSH. Assign this to the server.

Once it’s up, ssh in.

# Set a servername
hostnamectl set-hostname jenkins

# Install java
apt install openjdk-8-jre-headless

# Install Jenkins
wget -q -O - https://pkg.jenkins.io/debian/jenkins.io.key | sudo apt-key add -
sudo sh -c 'echo deb http://pkg.jenkins.io/debian-stable binary/ > /etc/apt/sources.list.d/jenkins.list'
sudo apt-get update
sudo apt-get install -y jenkins

Bug fix, if necessary:
inside /var/lib/jenkins/secrets/initialAdminPassword, add the prefix passwd: in front of the password. For example,
passwd:qwjewlkjdflwlf

Create a DNS entry for jenkins.example.com

Connect to the server at jenkins.example.com:8080. Complete installation.

Set up an Nginx Front-end Proxy for Jenkins

apt install nginx ssl-cert

In /etc/nginx/sites-available/default, add the following sections:

server {
    listen 80 default_server;
    listen [::]:80 default_server;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl default_server;
    listen [::]:443 ssl default_server;
    include snippets/snakeoil.conf;
    location / {
        include /etc/nginx/proxy_params;
        proxy_pass http://localhost:8080;
        proxy_read_timeout 90s;
    }
}

Set the URL inside of Jenkins->Manage Jenkins->Configure System to be https://_url_ , replacing _url_ with the IP address or hostname.

Chef Server

Launch a small EC2 instance. Tag the machine Name=Chef-Server

Create a security group called “chef-server”. Allow HTTPS, SSH. Assign this to the server.

Once it’s up, ssh in.

# Set a servername
hostnamectl set-hostname chefserver

Follow the instructions at https://docs.chef.io/install_server.html to install.

A review of the steps:

Download Chef Server locally. Then copy the package to the chef server.

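# Run locally: copy the package to the server
scp -i key.pem chef-server-core_13.0.17-1_amd64.deb ubuntu@_server_ip_:/tmp/
# Run on the Chef server, as root
mkdir -p /opt/downloads
mv /tmp/chef-server-core_13.0.17-1_amd64.deb /opt/downloads/
cd /opt/downloads
dpkg -i chef-server-core_13.0.17-1_amd64.deb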

Sidebar: Should the Chef infrastructure be based on public or private IPs? Your AWS network might have private instances in a private subnet. Therefore, use private IPs…

sudo chef-server-ctl reconfigure

#sudo chef-server-ctl user-create USER_NAME FIRST_NAME LAST_NAME EMAIL 'PASSWORD' --filename FILE_NAME

mkdir /root/.chef
sudo chef-server-ctl user-create sdarwin sam darwin [email protected] 'password' --filename /root/.chef/sdarwin.pem

#sudo chef-server-ctl org-create short_name 'full_organization_name' --association_user user_name --filename ORGANIZATION-validator.pem
sudo chef-server-ctl org-create testorg 'Test Corporation' --association_user sdarwin --filename /root/.chef/testorg-validator.pem

Chef Workstation

Launch a small EC2 instance. Tag the machine Name=Chef-Workstation

Create a security group called “chef-workstation”. Allow SSH. Assign this to the server.

Once it’s up, ssh in.

# Set a servername
hostnamectl set-hostname chefworkstation

Add the chefserver IP address to /etc/hosts:

172.31.6.99 chefserver

# Install Chef Workstation
mkdir -p /opt/downloads
cd /opt/downloads
wget https://packages.chef.io/files/stable/chef-workstation/0.8.7/ubuntu/18.04/chef-workstation_0.8.7-1_amd64.deb
dpkg -i chef-workstation_0.8.7-1_amd64.deb
#Please enter the chef server URL: [https://chefworkstation/organizations/myorg]
https://chefserver/organizations/testorg

# copy over sdarwin.pem and testorg-validator.pem to /root/.chef on the workstation
mkdir /root/.chef
cd /root/.chef
knife ssl fetch
knife node list

Django

In order to have a Django website, create a new one from scratch using the steps which are already thoroughly documented here:
https://docs.djangoproject.com/en/2.2/intro/install/
https://docs.djangoproject.com/en/2.2/intro/tutorial01/
https://docs.djangoproject.com/en/2.2/intro/tutorial02/
https://docs.djangoproject.com/en/2.2/intro/tutorial03/
https://docs.djangoproject.com/en/2.2/intro/tutorial04/
https://docs.djangoproject.com/en/2.2/intro/tutorial05/
https://docs.djangoproject.com/en/2.2/intro/tutorial06/
https://docs.djangoproject.com/en/2.2/intro/tutorial07/

The final results of that are available here:
https://github.com/sdarwin/django-website

Database

Create a MySQL database in RDS:
https://console.aws.amazon.com/rds/home?region=us-east-1
Set the same username/password/database which will be configured on the Chef side.
The resulting endpoint such as djangodatabase.c4sb5x9i4nxh.us-east-1.rds.amazonaws.com will be the “host”.
Assign an AWS security group to the DB that permits port 3306 from the relevant servers.

Customizations

Database credentials and secrets will be stored in Chef Vault. How are those values propagated all the way from Vault to Django? As follows:

In Django, the database credentials are stored in mysite/settings.py.

Change them to read in environment variables instead of literal values.

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': os.environ.get('DATABASE_NAME'),
        'USER': os.environ.get('DATABASE_USER'),
        'PASSWORD': os.environ.get('DATABASE_PASSWORD'),
        'HOST': os.environ.get('DATABASE_HOST'),
        'PORT': '3306'
    }
}
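
One thing to keep in mind is that os.environ.get returns None when a variable is missing, which tends to show up later as a confusing database error. A minimal sketch of a stricter lookup is below. The name require_env is a hypothetical helper for illustration; it is not part of Django or of the example repository.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail loudly.

    Hypothetical helper for settings.py; not part of Django itself.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Placeholder value so that the sketch runs on its own.
os.environ.setdefault("DATABASE_HOST", "localhost")
print(require_env("DATABASE_HOST"))
```

In settings.py, a call such as require_env('DATABASE_HOST') could then stand in for os.environ.get('DATABASE_HOST') wherever a missing value should stop startup.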

How do you get environment variables into Django?
In this case, python-dotenv was used. There are other choices such as django-dotenv and django-environ.

– Add python-dotenv to requirements.txt
– Add this code to settings.py and wsgi.py

import os
from dotenv import load_dotenv
dotenv_path = os.path.join(os.path.dirname(__file__), '.env')
load_dotenv(dotenv_path)
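
To make it clearer what that call does, here is a simplified stand-in written with only the standard library. It is an illustration under stated assumptions, not the python-dotenv implementation: load_env_file is a hypothetical name, and it skips the quoting, escaping and "export" handling that the real library provides. Like load_dotenv with its default settings, it leaves variables that are already set untouched.

```python
import os


def load_env_file(path):
    """Simplified stand-in for dotenv.load_dotenv (illustration only).

    Reads KEY=VALUE lines and sets any variable that is not already set.
    Returns False when the file does not exist, True otherwise.
    """
    if not os.path.exists(path):
        return False
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
    return True
```

Because existing variables win, values exported by the shell (for example in a Jenkins job) take precedence over the .env file.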

Create a template in the chef cookbook (to be discussed next) which will create the /home/django/repo/mysite/.env file.

DATABASE_USER=<%= @database_user %>
DATABASE_PASSWORD=<%= @database_password %>
DATABASE_NAME=<%= @database_name %>
DATABASE_HOST=<%= @database_host %>
SECRET_KEY=<%= @secret_key %>

Chef will read these values from Chef Vault.

Chef

Bootstrap a target node for testing:
Launch an EC2 instance. Get its IP address.
Add chefserver to the target’s /etc/hosts file.

knife bootstrap 172.31.15.79 -N "django" -i /root/.ssh/id_rsa -U ubuntu --sudo

Create a chef repo and cookbook.

chef generate repo chef-repo
cd chef-repo
cd cookbooks
chef generate cookbook django-website
cd django-website

The cookbook code is available here: https://github.com/sdarwin/django-website-cookbook

It’s a wrapper cookbook around django_platform : https://github.com/ualaska-it/django_platform

If you read the wrapper cookbook, you will see that it installs only a few things beyond what ualaska-it/django_platform does: namely, the MySQL client libraries and the .env template mentioned earlier.

Many attributes are set in the wrapper cookbook’s attributes/default.rb as required by django_platform.

The cookbook utilizes a Policyfile workflow as an alternative to Berkshelf. See Policyfile.rb in the cookbook. “production” is the policy group.

chef install
chef push production

chef update
chef push production

#knife node policy set NODE POLICY_GROUP POLICY_NAME (options)
knife node policy set django production django-website

ssh django
sudo chef-client

“Cookbooks that you upload with policyfile commands via chef are stored in a separate API, so you won’t see them with knife cookbook list.”

Make sure metadata.rb in the tutorial cookbook has:
depends 'django_platform'

Sidebar: The FQDN of a host can be set in its /etc/hosts file: 1.2.3.4 django.example.com django

Chef Vault

export EDITOR=vi
knife vault create vaultbag gitkey -M client
knife vault edit vaultbag gitkey -M client

{ "thekey" : "____" }

To get the key onto a single line, with newlines replaced by a literal \n, run this locally:

sed -E ':a;N;$!ba;s/\r{0,1}\n/\\n/g' originalkey.pem > newkey.txt

sed -i ':a;N;$!ba;s/\n/\\n/g' newkey.txt

Or from http://jtimberman.housepub.org/blog/2013/09/10/managing-secrets-with-chef-vault/

ruby -rjson -e 'puts JSON.generate({"thekey" => File.read("originalkey.pem")})' > output.json
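
For readers who prefer Python, the same idea can be sketched with the standard json module, which escapes the newlines in the key automatically. The function name key_to_vault_json and the sample text are placeholders for illustration only.

```python
import json


def key_to_vault_json(pem_text):
    """Wrap PEM text in the {"thekey": ...} structure used for the vault item."""
    return json.dumps({"thekey": pem_text})


# Placeholder text standing in for the contents of originalkey.pem.
sample = "-----BEGIN EXAMPLE-----\nabc123\n-----END EXAMPLE-----\n"
print(key_to_vault_json(sample))
```

For a real key, pass the contents of originalkey.pem instead of the sample string and paste the printed JSON into the vault editor.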

Next,

knife vault create vaultbag auth -M client
knife vault edit vaultbag auth -M client

The contents should be something like this:

{
"database_user": "deploy",
"database_password": "test1234",
"database_name": "tutorial",
"database_host": "tutorialserver.c4sb5x9i4nxh.us-east-1.rds.amazonaws.com",
"secret_key": "f(2ppgp&bcr&92d1gy47jpgw*s65gh!uj()-%m9&&$ot#2v_e3"
}
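
Before saving the item, it can be worth checking that it contains every key the .env template expects. The sketch below assumes that the cookbook passes each vault key to the template variable of the same name. The names REQUIRED_KEYS and missing_vault_keys are hypothetical and not part of Chef or the cookbook.

```python
import json

# Keys referenced by the .env template earlier in this article (assumed 1:1 mapping).
REQUIRED_KEYS = {
    "database_user",
    "database_password",
    "database_name",
    "database_host",
    "secret_key",
}


def missing_vault_keys(item_json):
    """Return the sorted list of required keys absent from a vault item (JSON text)."""
    item = json.loads(item_json)
    return sorted(REQUIRED_KEYS - set(item))


# Placeholder item with two keys left out on purpose.
example = '{"database_user": "deploy", "database_name": "tutorial", "database_host": "db"}'
print(missing_vault_keys(example))
```

An empty list means the item covers everything the template reads.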

#Add the node as a client for the vault bag:
knife vault update vaultbag auth -C "django.example.com" -M client
knife vault update vaultbag gitkey -C "django.example.com" -M client

#Or after setting the node's environment:
knife vault edit vaultbag auth -S "chef_environment:production" -M client
knife vault edit vaultbag gitkey -S "chef_environment:production" -M client

django_platform is using python virtual environments on the chef client machine:

virtualenv env
source env/bin/activate

Terraform

Download and install terraform from their website
https://releases.hashicorp.com/terraform/0.12.10/terraform_0.12.10_linux_amd64.zip

The terraform code is available at https://github.com/sdarwin/django-terraform

Add to .bashrc:

export AWS_ACCESS_KEY_ID="_id_"
export AWS_SECRET_ACCESS_KEY="_key_"
export AWS_DEFAULT_REGION="us-east-1"

To review what the terraform code does:

– Create two django web server instances.
– Set chefserver in the /etc/hosts file
– Run the chef provisioner which installs and runs chef-client the first time
– Create a null resource which can be run for all subsequent iterations. Use “terraform taint” to cause this to be re-run.
– Add ALB resources so the web servers will be behind a load balancer

The Chef provisioner has settings to access Chef Vault. It will add each node as a vault client.

# To re-run chef-client
terraform apply # it works
terraform apply # a second time. No action.
terraform state list
terraform taint null_resource.ProvisionRemoteHosts[0]
terraform taint null_resource.ProvisionRemoteHosts[1]
terraform apply # it works

Jenkins

Install terraform on the Jenkins machine:

cd /usr/local/bin
wget https://releases.hashicorp.com/terraform/0.12.10/terraform_0.12.10_linux_amd64.zip
apt install unzip
unzip terraform_0.12.10_linux_amd64.zip

mkdir -p /opt/github
cd /opt/github
git clone https://github.com/sdarwin/django-terraform

Create the file /var/lib/jenkins/load_env.sh

Add contents:

export AWS_ACCESS_KEY_ID="_id_"
export AWS_SECRET_ACCESS_KEY="_key_"
export AWS_DEFAULT_REGION="us-east-1"
export SECRET_KEY="123"

#for local testing only. Not the official rds instance.
export DATABASE_NAME="xyz"
export DATABASE_USER="jenkins"
export DATABASE_PASSWORD="test123"
export DATABASE_HOST="localhost"

Terraform has been configured to use environment variables which will be read from load_env.sh during a Jenkins job as:

. ~/load_env.sh

Copy the Chef user key (sdarwin.pem) over to the Jenkins machine.

#for root
mkdir /root/.chef/
vi /root/.chef/sdarwin.pem
chmod 600 /root/.chef/sdarwin.pem

#and for jenkins
cp -rp /root/.chef/ /var/lib/jenkins
chown -R jenkins:jenkins /var/lib/jenkins

Copy the SSH private key over to the Jenkins machine.

vi /root/.ssh/id_rsa
chmod 600 /root/.ssh/id_rsa

Install SSH Agent plugin in Jenkins.

In Jenkins, create a job named “Django”.

For the pipeline definition, choose the second option and fill it in:
– Pipeline script from SCM
– git
– git@github.com:sdarwin/django-terraform.git

Add a credential – an ssh key to access the github repo.

Testing

Let’s compose a script which Jenkins will run in the test stage.

pre-requisites:
as root:

apt install python3-pip
pip3 install virtualenv
apt install mysql-client libmysqlclient-dev
pip3 install mysqlclient #not needed at this step, it's in requirements.txt

mysql:

CREATE USER 'jenkins'@'%' IDENTIFIED BY 'test123';
GRANT ALL PRIVILEGES ON *.* TO 'jenkins'@'%';

as jenkins:

virtualenv env

The test script:

set -e
. ~/env/bin/activate
. ~/load_env.sh
git clone git@github.com:sdarwin/django-website.git || true
cd django-website
git fetch --all
git reset --hard origin/master
pip3 install -r requirements.txt
python manage.py test polls
cd ..

That script is added as the Test stage in the Jenkinsfile.
It can be simplified by removing a few of those git steps because Jenkins automatically fetches the latest version during each checkout.

Structuring

For the Jenkins job, set the Build Trigger to “GitHub hook trigger for GITScm polling”

In github.com, create a webhook for the terraform repo:
https://jenkins.logchart.com/github-webhook/

This will trigger based on the terraform repository. Generally, a Jenkins job is triggered from only one repository.

However, this project has three repositories, not one:
– Django
– Terraform
– Chef

We’d like to trigger a Jenkins build when any of those changes.

One possibility is to encode all the Jenkins functionality into a single Jenkinsfile in the Terraform codebase and create “stub” jobs for the other repos: small Jenkins jobs which do very little except trigger the main build, since one Jenkins job can trigger another.

Another possibility is to merge the Django code and the Terraform code into one single git repo. Have a subdirectory of the django repo called “terraform/”. A Jenkinsfile in the root of this single codebase will handle all the logic.

A third possibility, and the one explored further here, is to keep Django and Terraform separate. This is appealing for logistical reasons, to keep the infrastructure and the web code separated. Have the Jenkinsfile in the Django codebase run the Tests. After that succeeds, it will trigger the Terraform job. Observe the Jenkinsfiles for both Django and Terraform, with this setup in place:

https://github.com/sdarwin/django-website/blob/master/Jenkinsfile
https://github.com/sdarwin/django-terraform/blob/master/Jenkinsfile

Environments

Create pipeline Jenkins jobs for:
django-production
django-staging
terraform-production
terraform-staging

Each will be based on a Jenkinsfile, and point to the corresponding github project and github branch.
The jobs should be parameterized. Add a string parameter ENVIRONMENT, and set its default value to “staging” or “production” to match the job.

Manually add webhooks in github, for example to https://jenkins.example.com/github-webhook/

What code changes were required to have two environments, both staging and production?

– Of course, create the two branches in each repository for staging and production. Their contents should actually be identical though, to allow merges.

– Create two backend databases in RDS.

– Create two vault data bags in chef:
vaultbag-production
vaultbag-staging

Each will have different database credentials. It also turns out that with Terraform’s implementation of Chef Vault, there might be a conflict if you don’t have separate data bags for prod and staging.

– Parameterize the Jenkinsfiles. Notice how they now refer to $ENVIRONMENT. The parameter is passed in from the job configuration in the Jenkins front-end.

– In the Terraform code, parameterize items that will be different between environments. For example, there will be two load balancers. They are now called “django-${var.environment}” to distinguish between them. Add a “variable” called “environment”.

– In the Chef cookbook, a convenient way to handle multiple environments is to keep the variables in the attributes file of the cookbook, and not in the Policy, Role, or Environment. Here is how that’s done in attributes/default.rb:

case node.policy_group
  when "production"
    default['http_platform']['www']['additional_aliases'] = {'django.logchart.com' => { 'log_level' => 'info' }}
    default['django_platform']['git_ssh_key']['vault_data_bag'] = 'vaultbag-production'
    default['django_platform']['app_repo']['git_revision'] = 'master'
  when "staging"
    default['http_platform']['www']['additional_aliases'] = {'staging.logchart.com' => { 'log_level' => 'info' }}
    default['django_platform']['git_ssh_key']['vault_data_bag'] = 'vaultbag-staging'
    default['django_platform']['app_repo']['git_revision'] = 'staging'
  else
    default['http_platform']['www']['additional_aliases'] = {'dev.logchart.com' => { 'log_level' => 'info' }}
end

Conclusion

That’s a good start to deploying Django with the Terraform Chef Provisioner. I hope this documentation was helpful. Let me know if you have any feedback or questions.

The example code is available at:
https://github.com/sdarwin/django-website
https://github.com/sdarwin/django-website-cookbook
https://github.com/sdarwin/django-terraform
