Atmosphere Release Deployment Process

Two Weeks Prior to Release

Select a crew chief who will have the following duties:
- Take ownership of the deployment to ensure this process is completed, delegating specific tasks as needed
- Serve as point person for communication with other teams, leadership, etc.
- Ensure that the driver has what they need for a successful release
Select a driver who will have the following duties:
- Perform technical deployment tasks
- Communicate with the crew chief regarding progress, setbacks, blocking issues, etc.

CyVerse Only:
- Send a maintenance notice to atmosphere-users and post a maintenance banner to the Atmosphere User Manual. At the very latest, the notice must be sent two working days prior to release.
- Ensure the maintenance window is on the "CyVerse Maintenance and Release Schedule" calendar
- Ensure a scheduled maintenance is set up on status.io (see instructions)
Jetstream Only:
- Ensure that the Jetstream tech team is aware of the deployment date/time, has it on their calendar, has notified their users, etc.

Each development cycle has a major release. During a release, hotfixes will be applied and sometimes we will create patch releases. Each release is identified by a tag of the form Major-Minor (e.g. 34-0 or 34-1). A tag is used because it always refers to an exact snapshot of the project and doesn't change. A release branch, by contrast, may continue to receive changes, so it doesn't identify the project at any particular point in time.
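Since the later steps all key off the tag, it can be worth checking the tag string before using it. The following is a minimal sketch, assuming tags follow the Major-Minor pattern described above with an optional leading `v` (as in `v36-6`, used later in this document). The function name `is_release_tag` is hypothetical.

```shell
# Hypothetical helper: succeed only if the argument looks like a
# Major-Minor release tag (e.g. 34-0, 34-1, v36-6).
is_release_tag() {
    printf '%s\n' "$1" | grep -Eq '^v?[0-9]+-[0-9]+$'
}

# Example: refuse to continue with a malformed tag.
tag="v36-6"
if is_release_tag "$tag"; then
    echo "tag ok: $tag"
else
    echo "unexpected tag format: $tag" >&2
fi
```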

To create the release candidate on GitHub:

Update the changelog, moving "Unreleased" items into a named release
Create a PR for these changes and merge into master
Create and tag a new release by visiting the releases page for the repo and clicking "Draft a new release"
This will trigger a new tagged build on Docker Hub
- NOTE: If you cannot wait for the Docker Hub build to complete, do a shallow clone of Atmosphere/Troposphere and build from that (rather than from your local repository, which includes all branches and other extras):

```shell
git clone --depth 1 --branch v36-6 https://github.com/cyverse/atmosphere.git && cd atmosphere
docker build -t cyverse/atmosphere:v36-6 .
```
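The shallow-clone build in the note above can be parameterized by repository and tag. This is only a sketch: `run` and `build_release_image` are hypothetical names, and it assumes the Troposphere repository and image follow the same `cyverse/<name>` naming as Atmosphere. Setting `DRY_RUN=1` prints the commands instead of running them, which is a cheap way to review them first.

```shell
# Hypothetical wrapper: with DRY_RUN=1, print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Shallow-clone one tag and build an image from it.
# The cyverse/<repo> naming for anything other than atmosphere is an assumption.
build_release_image() {
    repo="$1"   # e.g. atmosphere
    tag="$2"    # e.g. v36-6
    run git clone --depth 1 --branch "$tag" "https://github.com/cyverse/${repo}.git" "$repo" &&
        run docker build -t "cyverse/${repo}:${tag}" "$repo"
}

# Example (prints two commands, runs nothing):
#   DRY_RUN=1 build_release_image atmosphere v36-6
```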

Prepare Release on Atmobeta

Merge PRs slated for release in Atmosphere, Troposphere, and Atmosphere-Ansible
Deploy release to atmobeta (See "Deployment Day" and apply to atmobeta)
Test changes on atmobeta
Prepare production variables
If you are creating a new server for this release, please refer to the wiki documentation

Deployment Day!

Pre-deployment: add the following to your aliases to save time typing:

```shell
alias atmosphere-docker="/opt/dev/atmosphere-docker/atmosphere-docker.sh -f /opt/dev/atmosphere-docker/docker-compose.prod.yml"
```

Deploying Atmosphere-Docker for the First Time

Clone the atmosphere-docker and atmosphere-docker-secrets repositories - typically do this at /opt/dev/
Set up PostgreSQL
1. Install PostgreSQL with apt-get (search for an up-to-date guide)
2. Edit /etc/postgresql/<version>/main/postgresql.conf to uncomment listen_addresses = 'localhost' and add the Docker host address: listen_addresses = 'localhost,172.17.0.1'
3. Also make sure that ssl = off; otherwise it will cause issues with user login
4. Edit /etc/postgresql/<version>/main/pg_hba.conf to add this line which will allow connections from within the Docker network: host all all 172.16.0.0/12 md5
5. Restart PostgreSQL: systemctl restart postgresql
6. Create the atmo_app user and databases:

   ```sql
   CREATE USER atmo_app WITH CREATEDB NOSUPERUSER CREATEROLE;
   ALTER USER atmo_app WITH PASSWORD 'password';
   -- Reconnect as atmo_app user
   CREATE DATABASE atmo_prod;
   CREATE DATABASE troposphere;
   ```
7. Load data from database dumps:

   ```shell
   psql -h localhost -U atmo_app atmo_prod < data_base_dump
   psql -h localhost -U atmo_app troposphere < data_base_dump
   ```
Continue to Upgrading with Atmosphere-Docker instructions
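
Before moving on, the PostgreSQL settings from the steps above can be sanity-checked with a few `grep` calls. This is a rough sketch, not a full parser: `check_pg_conf` is a hypothetical helper, and it only recognizes the literal forms shown in this document (for example `ssl = off`, not `ssl = false`).

```shell
# Hypothetical check for the settings described above.
# Usage: check_pg_conf /etc/postgresql/<version>/main
check_pg_conf() {
    conf_dir="$1"
    status=0
    # listen_addresses must be uncommented and include the Docker host address.
    grep -Eq "^[[:space:]]*listen_addresses[[:space:]]*=.*172\.17\.0\.1" "$conf_dir/postgresql.conf" ||
        { echo "listen_addresses does not include 172.17.0.1" >&2; status=1; }
    # ssl must be explicitly set to off.
    grep -Eq "^[[:space:]]*ssl[[:space:]]*=[[:space:]]*off" "$conf_dir/postgresql.conf" ||
        { echo "ssl is not set to off" >&2; status=1; }
    # pg_hba.conf must allow connections from the Docker network range.
    grep -Eq "^[[:space:]]*host[[:space:]]+all[[:space:]]+all[[:space:]]+172\.16\.0\.0/12[[:space:]]+md5" "$conf_dir/pg_hba.conf" ||
        { echo "pg_hba.conf lacks the 172.16.0.0/12 md5 entry" >&2; status=1; }
    return "$status"
}
```

A non-zero exit status means at least one of the three settings was not found; the messages on stderr say which.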

Upgrading with Atmosphere-Docker

Even though the database is outside of Docker and should not be touched by the upgrade, back it up:

```shell
pg_dump -U atmo_app atmo_prod > "/root/atmo-$(date +%F).sql"
pg_dump -U atmo_app troposphere > "/root/tropo-$(date +%F).sql"
```
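
A failed `pg_dump` can leave an empty file behind, so it may be worth confirming the dumps exist and are non-empty before continuing. The sketch below uses two hypothetical helpers, `dump_path` and `verify_dump`; the `BACKUP_DIR` variable is an assumption that defaults to the `/root` location used above.

```shell
# Hypothetical helper: build the dated dump path used above,
# e.g. "dump_path atmo" -> /root/atmo-<today's date>.sql
dump_path() {
    echo "${BACKUP_DIR:-/root}/$1-$(date +%F).sql"
}

# Hypothetical helper: fail if a dump file is missing or empty.
verify_dump() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty dump: $1" >&2
        return 1
    fi
}

# Example, after running the pg_dump commands:
#   verify_dump "$(dump_path atmo)" && verify_dump "$(dump_path tropo)"
```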
Start maintenance for Atmosphere and Troposphere:

```shell
atmosphere-docker exec troposphere /bin/bash -c \
  "source /opt/env/troposphere/bin/activate && ./manage.py maintenance start \
  --title '2019-06-16 v36-1 deployment' \
  --message 'Atmosphere is down for a Scheduled Maintenance, Today between 9am - 4pm MST.'"

atmosphere-docker exec atmosphere /bin/bash -c \
  "source /opt/env/atmo/bin/activate && ./manage.py maintenance start \
  --title '2019-06-16 v36-1 deployment' \
  --message 'Atmosphere is down for a Scheduled Maintenance, Today between 9am - 4pm MST.'"
```
- If there is a separate proxy server or nginx for Atmosphere (atmo.cyverse.org for CyVerse prod and use.jetstream-cloud.org for Jetstream prod), switch that over to a maintenance message:

```shell
cd /etc/nginx/sites-enabled
ln -sf /etc/nginx/sites-available/default proxy.conf
systemctl restart nginx
```
- You could also schedule the proxy switch (and the later step that removes the maintenance message from the proxy server) as 'at' jobs on the proxy server beforehand so that they run automatically
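
One possible way to prepare such an `at` job is to keep the proxy commands in a small function and pipe its output to `at`. This is a sketch only: `maintenance_on_job` is a hypothetical name, the 09:00 start time is an assumption taken from the example maintenance message, and the job would need to be queued by a user allowed to restart nginx.

```shell
# Hypothetical helper: print the commands that switch the proxy
# to the maintenance page (same commands as shown above).
maintenance_on_job() {
    cat <<'EOF'
cd /etc/nginx/sites-enabled
ln -sf /etc/nginx/sites-available/default proxy.conf
systemctl restart nginx
EOF
}

# On the proxy server, the job could then be queued with at(1),
# which reads the commands to run from standard input:
#   maintenance_on_job | at 09:00
# and the queue inspected with:
#   atq
```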
Check out the correct branch of atmosphere-docker-secrets (prod, jetstream-prod, atmobeta, etc.) and double-check that the variables look correct for this server and include your latest updates, if any
Stop the current containers and back them up to images:

```shell
atmosphere-docker stop
docker commit atmosphere-docker_atmosphere_1 old_atmosphere-docker_atmosphere
docker commit atmosphere-docker_troposphere_1 old_atmosphere-docker_troposphere
```
- Unfortunately, there is no way to preserve the containers without creating an image. Even if you rename the containers, docker-compose uses the IDs.
Make sure the atmosphere-docker repository is up to date. Be careful to note any significant changes before doing this, because git reset --hard discards uncommitted local changes:

```shell
git reset --hard
git pull
```
Edit docker-compose.prod.yml to use the image tags from the new release (if those changes are not already in the repository), then pull the images and start the containers:

```shell
atmosphere-docker pull
atmosphere-docker up -d
```
- If the alias gives you trouble, type out the full command (see the alias above)
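
If the tag has to be changed by hand, a small `sed` wrapper can make the edit repeatable. The sketch below is illustrative: `set_image_tag` is a hypothetical helper, and it assumes docker-compose.prod.yml pins images on lines of the form `image: cyverse/atmosphere:<tag>`, which may not match the real file.

```shell
# Hypothetical helper: point every "image: <name>:<old-tag>" line for the
# given image at a new tag. Assumes images are pinned as name:tag.
set_image_tag() {
    file="$1"; image="$2"; tag="$3"
    tmp="$(mktemp)" || return 1
    if sed -E "s|(image:[[:space:]]*${image}):[^[:space:]]+|\1:${tag}|" "$file" > "$tmp"; then
        # Write back in place so ownership and permissions are kept.
        cat "$tmp" > "$file"
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Example:
#   set_image_tag docker-compose.prod.yml cyverse/atmosphere v36-6
```

Reviewing the result with `git diff` before running `atmosphere-docker pull` is a cheap safeguard against an unintended match.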
Use and test, test, test
1. Also run apt-get update and apt-get upgrade, then reboot the server
2. Check that logs are rotating okay and not getting too big
3. Delete any unnecessary images
4. If there are issues with containers connecting to postgres, restart the postgresql service
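
For the log check in particular, a quick way to spot oversized files is `find` with a size threshold. This is a sketch: `large_logs` is a hypothetical helper, and the log directory is left as a parameter because its location depends on how the deployment mounts its logs.

```shell
# Hypothetical helper: list files under a directory that are larger than
# the given size in megabytes (default 100).
large_logs() {
    dir="$1"
    limit_mb="${2:-100}"
    # -size +Nc matches files larger than N bytes.
    find "$dir" -type f -size +"$((limit_mb * 1024 * 1024))"c
}

# Example (the path is a placeholder):
#   large_logs /path/to/logs 100
```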
Once you are confident the services are ready to go, remove the maintenance message from the proxy server:

```shell
cd /etc/nginx/sites-enabled
ln -sf /etc/nginx/sites-available/proxy.conf proxy.conf
systemctl restart nginx
```
End the maintenance by removing the maintenance messages from Atmosphere and Troposphere:

```shell
atmosphere-docker exec troposphere /bin/bash -c \
  "source /opt/env/troposphere/bin/activate && ./manage.py maintenance stop"

atmosphere-docker exec atmosphere /bin/bash -c \
  "source /opt/env/atmo/bin/activate && ./manage.py maintenance stop"
```
- If this isn't working, go inside the containers for both atmo and tropo and run `./manage.py maintenance stop`

Hotfixing

The wonderful thing about Docker and containers in general is that they are easy to create and destroy. Also, changes inside the container do not have to be reflected outside the container to the services interacting with it. This gives us the ability to easily swap out one container for a new one.

Imagine you want to deploy an important change that cannot wait until the next scheduled maintenance. Once you have a container image ready with this new feature, follow these instructions:

Edit the docker-compose.prod.yml file with the new image tag
Scale up the Atmosphere service:

```shell
./atmosphere-docker.sh -f docker-compose.prod.yml up -d --scale atmosphere=2 --no-recreate --no-deps
```

By scaling atmosphere up, compose will create a new container alongside the first one. Requests will then be load-balanced round-robin between the two containers.
The --no-recreate flag here is very important: it prevents compose from stopping and deleting the original, out-of-date container.

Wait for the new container (atmosphere_2) to finish the setup tasks, then stop and delete the original:

```shell
docker kill atmosphere-docker_atmosphere_1
docker rm atmosphere-docker_atmosphere_1
```

Note: the container names may differ slightly from those above.
Unfortunately, the new container will always have index 2, so every compose command that interacts with it requires --index=2. On Jetstream, add this flag to the relevant crontab entries after completing the next step.
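
The waiting step can be scripted with a small retry loop instead of watching by hand. This is a generic sketch: `wait_until` is a hypothetical helper, and the readiness probe passed to it (`container_ready` in the example comment) is a placeholder, since this document does not say how to detect that the setup tasks have finished.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the attempt limit is reached.
wait_until() {
    attempts="$1"; shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example (container_ready is a placeholder for your own readiness check):
#   wait_until 600 container_ready &&
#       docker kill atmosphere-docker_atmosphere_1 &&
#       docker rm atmosphere-docker_atmosphere_1
```

Chaining with `&&` means the original container is only removed if the probe succeeded within the limit.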
- Scaling back down to 1 will delete the newest container
- Renaming will not change anything since docker-compose relies on immutable labels
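
To avoid forgetting the flag, the index can be baked into a small wrapper around the compose `exec` command. This is a sketch under assumptions: `atmo_exec` and the `COMPOSE` variable are hypothetical names, and the default command line is the one used elsewhere in this section.

```shell
# Compose command line to wrap; override COMPOSE if your paths differ.
COMPOSE="${COMPOSE:-./atmosphere-docker.sh -f docker-compose.prod.yml}"

# Hypothetical wrapper: run a command in a specific container index of a service.
# Usage: atmo_exec <index> <service> <command> [args...]
atmo_exec() {
    index="$1"; service="$2"; shift 2
    # $COMPOSE is intentionally unquoted so it splits into command and flags.
    $COMPOSE exec --index="$index" "$service" "$@"
}

# Example:
#   atmo_exec 2 atmosphere ./manage.py maintenance stop
```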

Now you will need to restart Nginx, probably due to the way the underlying docker network handles hostname mapping:

```shell
./atmosphere-docker.sh -f docker-compose.prod.yml exec troposphere bash
service nginx restart # inside troposphere container
```

Remember to update any crontab entries using docker-compose to include --index=2