Jérôme Decoster

3x AWS Certified - Architect, Developer, Cloud Practitioner

03 Jun 2022

EKS + ArgoCD + Multi-Environments and Multi-clusters Gitops

The Goal
  • Create a voting app
  • Automate docker image build on ECR after each push using Github Actions workflows
  • Automatic management of feature branches in specific dynamically created repositories
  • Infrastructure management with Terraform from a dedicated git repository
  • Creation of 2 EKS clusters: one for the staging environment and one for production
  • Installation of ArgoCD in the staging cluster. Management of the production cluster from this instance
  • Creation of 2 ArgoCD applications. One per environment
  • Managing GitOps deployments using separate repositories
  • Quick synchronization of ArgoCD applications from a Github Action workflow


    The project

    The project is composed of 2 git repositories that you can fork :

    The voting app

    You can launch the voting application via this command :

    # run vote website using npm - dev mode (livereload + nodemon)
    $ make vote

    This command executes this script

    The site is available at http://localhost:4000/


    The site is a static version: it is not linked to a database

    The returned vote values are random

    function randInt(min, max) {
      return Math.floor(Math.random() * (max - min + 1)) + min
    }
    // ...
    app.get('/vote', async (req, res) => {
      let up = randInt(1, 9)
      let down = randInt(1, 9)
      return res.send({ up, down })
    })
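
    The inclusive range arithmetic of randInt can be checked outside the app. This is a minimal shell sketch, not part of the project, that reproduces the same formula with awk. The function name rand_int is an assumption made for the illustration.

```shell
# rand_int MIN MAX: print a random integer in [MIN, MAX], both bounds included.
# Same arithmetic as the JavaScript randInt: floor(random * (max - min + 1)) + min
# Note: srand() seeds from the clock, so two calls made within the same second
# may return the same value. This is fine for a demo, not for real randomness.
rand_int() {
  awk -v min="$1" -v max="$2" \
    'BEGIN { srand(); print int(rand() * (max - min + 1)) + min }'
}

up=$(rand_int 1 9)
down=$(rand_int 1 9)
printf '{ "up": %s, "down": %s }\n' "$up" "$down"
```

    The "+ 1" is what makes the upper bound reachable: without it, a call with 1 and 9 would only return values from 1 to 8.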

    Continuous containerization of the voting application

    Containerization is managed by 3 Github Actions workflows


    The tag.yml workflow is triggered only when a tag using semver notation is pushed :

    name: tag
    on:
      push:
        tags: [ "v*.*.*" ]
    # ...
        runs-on: ubuntu-latest
        if: github.ref_type == 'tag'
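
    The filter "v*.*.*" is a glob, so it also accepts names such as vfoo.bar.baz. If a stricter check is wanted, a small guard can run at the start of the job. This is a sketch only: the function name is_release_tag is an assumption and the workflow above does not contain it.

```shell
# is_release_tag NAME: succeed when NAME looks like v<major>.<minor>.<patch>.
# Stricter than the workflow glob "v*.*.*": only digits are accepted.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

for ref in v1.0.0 v1.0 feature-one vfoo.bar.baz; do
  if is_release_tag "$ref"; then
    echo "$ref: release tag"
  else
    echo "$ref: ignored"
  fi
done
```

    Inside a workflow step, the value to test would be the ref name exposed by Github Actions (github.ref_name in the snippets of this article).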

    In this specific case, the docker image is built and pushed to the ECR repository named multienv-vote :

    The pushed image is tagged with the name of the git tag, a value similar to v1.0.0

      ECR_REPOSITORY: multienv-vote
    # ...
      run: |
        cd vote
        docker image build \
          --build-arg NODE_ENV=production \
          --build-arg VERSION=${{ github.ref_name }} \
          --tag ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:${{ github.ref_name }} \
          --tag ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:latest \
          .
        docker push --all-tags ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}

    The cd.yml workflow is triggered only when a commit is pushed to a feature branch :

          - master
          - master

    In this specific case, the docker image is built and pushed to a dynamically created ECR repository named multienv-vote-${ feature-branch-name } :

    The pushed image is tagged with the first 8 characters of the commit sha, a value similar to 23e44fea

      ECR_REPOSITORY: multienv-vote-${{ github.event.ref }}
      SHORT_SHA: $(echo ${{ github.sha }} | cut -c 1-8)
    # ...
      run: |
        cd vote
        docker image build \
          --build-arg NODE_ENV=${{ github.ref_name }} \
          --build-arg VERSION=${{ env.SHORT_SHA }} \
          --tag ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:${{ env.SHORT_SHA }} \
          --tag ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:latest \
          .
        docker push --all-tags ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}
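
    The naming rules used by this workflow can be reproduced locally to predict which image references a push will create. This is a minimal sketch: the function names and the sample sha are assumptions, only the multienv-vote- prefix and the 8-character cut come from the workflow.

```shell
# short_sha SHA: first 8 characters of a commit sha (same as `cut -c 1-8` in the workflow).
short_sha() {
  printf '%s\n' "$1" | cut -c 1-8
}

# feature_repository BRANCH: ECR repository name for a feature branch.
feature_repository() {
  printf 'multienv-vote-%s\n' "$1"
}

# image_tags REGISTRY BRANCH SHA: the two image references built by the workflow.
image_tags() {
  repo=$(feature_repository "$2")
  printf '%s/%s:%s\n' "$1" "$repo" "$(short_sha "$3")"
  printf '%s/%s:latest\n' "$1" "$repo"
}

image_tags xxxxx.dkr.ecr.eu-west-3.amazonaws.com feature-one \
  23e44feaa1b2c3d4e5f60718293a4b5c6d7e8f90
```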

    The delete.yml workflow is triggered only when a branch is deleted on github :

    on: delete

    When a feature branch is deleted, we also delete its associated ECR repository :

      ECR_REPOSITORY: gitops-multienv-${{ github.event.ref }}
    # ...
      run: |
        # --force is needed when the repository still contains images
        aws ecr delete-repository \
          --repository-name ${{ env.ECR_REPOSITORY }} \
          --query 'repository.repositoryUri' \
          --output text \
          --force
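
    The delete event is sent for deleted tags as well as deleted branches, and its payload carries a ref_type field that tells them apart. A guard like the one below can avoid calling the AWS CLI for refs that never had a repository. This is a sketch under assumptions: the function name and its argument order are hypothetical, and the prefix is passed in rather than hard-coded.

```shell
# repository_to_delete REF_TYPE REF PREFIX: print the ECR repository to remove,
# or fail when the deleted ref is not a feature branch.
repository_to_delete() {
  case "$1:$2" in
    branch:master) return 1 ;;                      # master has no feature repository
    branch:?*)     printf '%s-%s\n' "$3" "$2" ;;    # feature branch
    *)             return 1 ;;                      # tags and empty refs are ignored
  esac
}

repository_to_delete branch feature-one gitops-multienv
```

    In the workflow, the two first arguments would come from github.event.ref_type and github.event.ref.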

    Setting up the workflow

    To be able to push an image into ECR from Github, you must create an IAM user and associate an access key with it.

    You must then store the AccessKeyId / SecretAccessKey pair in the secrets settings of the Github project :


    To create this user we execute the following command :

    # create iam user + access key
    $ make user-create

    This command executes the user-create.sh script :

    The 2 variables are output in 2 files at the root of the project

    • .env_AWS_ACCESS_KEY_ID

    Add these values to the secrets of your Github repository
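
    If you prefer the terminal to the web interface, the gh CLI can store a secret read from standard input. The sketch below is a dry run that only prints the commands. It assumes that each .env_* file holds one value and that the secret name is the file name without its .env_ prefix. The function names are hypothetical.

```shell
# secret_name FILE: derive the secret name from a file such as ".env_AWS_ACCESS_KEY_ID".
secret_name() {
  basename "$1" | sed 's/^\.env_//'
}

# print_secret_commands DIR: dry run, print one `gh secret set` command per .env_* file.
print_secret_commands() {
  for file in "$1"/.env_*; do
    [ -f "$file" ] || continue   # no match: the glob stays literal, skip it
    printf 'gh secret set %s < %s\n' "$(secret_name "$file")" "$file"
  done
}

print_secret_commands .
```

    Review the printed commands before running them, since every .env_* file found in the directory is listed.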

    Workflow testing

    I’m going to create a feature branch named feature-one and make a CSS modification in it :

    # make sure we are on the master branch
    $ git checkout master
    # create + switch to the feature-one branch
    $ git checkout -b feature-one

    To modify our site we uncomment the following lines in the main.css file

    /* uncomment below to create feature-one */
    color: black;
    background-color: #eceff1;
    border: #37474f 1px dashed;
    padding: .7em; 

    Changes can be seen at http://localhost:4000/ after running the following command :

    # run vote website using npm - dev mode (livereload + nodemon)
    $ make vote


    We push these modifications on Github :

    $ git add vote
    $ git commit -m 'css update'
    $ git push -u origin feature-one

    The workflow gets started :


    The workflow completes quickly and successfully :


    The repository was created automatically :


    The image was successfully pushed :


    To locally test the image we just pushed to the ECR repository we run this command :

    # run latest image pushed to ecr
    $ make ecr-run

    This command runs the ecr-run.sh script

    It launches the website with docker run, using the latest image pushed to the first feature repository found :

    # get the first feature repository (so, everything except master branch)
    REPOSITORY_NAME=$(aws ecr describe-repositories \
      --query "repositories[?starts_with(repositoryName, '$PROJECT_NAME-')].[repositoryName]" \
      --output text | \
      head -n 1)
    docker run \
      --rm \
      -e WEBSITE_PORT=4000 \
      -p 4000:4000 \
      --name vote \

    By opening http://localhost:4000/ we have our website :


    Note that the footer no longer indicates the same data

    The footer is created using this template :

    {% if node_env == 'development' %}<u>development</u> {% endif %}version: {{ version }}

    These instructions :

    FROM node:18.2-slim
    # ...

    And these arguments :

    run: |
      docker image build \
        --build-arg NODE_ENV=${{ github.ref_name }} \
        --build-arg VERSION=${{ env.SHORT_SHA }} \
        # ...

    We are happy with these changes. We merge this feature into the master branch, push it and create a new tag :

    # switch to the master branch 
    $ git checkout master
    # merge feature
    $ git merge feature-one
    $ git push
    # create a tag
    $ git tag v1.0.0
    # push tag on github
    $ git push --tags

    The tag is added to the github repository :


    The tag workflow starts :


    The ECR repository is created :


    The image is pushed :


    The tag name is v1.0.0

    The infrastructure

    The infrastructure is managed via a specific repository

    Terraform is used to create the 2 EKS environments :


    2 environments are available, via 2 separate folders :

    Managing environments through separate directories instead of terraform workspaces keeps the configuration and the state of each environment explicit and isolated

    We use a backend on S3 :

    terraform {
      backend "s3" {}
    }

    Which is defined via terraform init :

    terraform -chdir="$CHDIR" init \
      -input=false \
      -backend=true \
      -backend-config="bucket=$S3_BUCKET" \
      -backend-config="key=$CONFIG_KEY" \
      -backend-config="region=$AWS_REGION"
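
    Because the backend block is empty, every setting is supplied at init time, which lets the 2 environments share one bucket with distinct state keys. The sketch below prints the resulting command for each environment. The bucket name and the "<env>/terraform.tfstate" key layout are assumptions chosen for the illustration, not values taken from the project.

```shell
# init_command DIR BUCKET KEY REGION: print the terraform init command for one environment.
init_command() {
  printf 'terraform -chdir=%s init -input=false -backend=true' "$1"
  # printf reuses its format for each remaining argument
  printf ' -backend-config=%s' "bucket=$2" "key=$3" "region=$4"
  printf '\n'
}

# Hypothetical layout: one state key per environment inside the same bucket.
for env in staging production; do
  init_command "$env" my-bucket "$env/terraform.tfstate" eu-west-3
done
```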

    To create our S3 bucket we run the command :

    # setup project + create S3 bucket
    $ make init

    This command creates the bucket with versioning enabled :

    aws s3 mb s3://$S3_BUCKET --region $AWS_REGION
    # /!\ important for terraform states : enable bucket versioning
    aws s3api put-bucket-versioning \
      --bucket $S3_BUCKET \
      --versioning-configuration Status=Enabled \
      --region $AWS_REGION

    The bucket is created :


    The script also creates 3 variables in 3 files at the root of the project :

    • .env_AWS_ID
    • .env_S3_BUCKET
    • .env_UUID

    Deployment of the staging environment

    To deploy the environment we just need to run these 2 commands :

    # terraform init the staging env
    $ make staging-init
    # terraform plan + apply the staging env
    $ make staging-apply

    The cluster is deployed :


    The tfstate is in the S3 bucket :


    2 important steps are still missing at this point :

    • kubectl has not been configured to interact with the cluster
    • We cannot see the contents of the cluster, as this unpleasant warning confirms :


    To correct these 2 points we execute the following command :

    # setup kubectl config + aws-auth configmap for staging env
    $ make eks-staging-config

    We configure kubectl using the update-kubeconfig command from this script :

    aws eks update-kubeconfig \
      --name $NAME \
      --region $REGION

    Making resources visible is more complex. You must edit the aws-auth ConfigMap

    Editing it manually is the easy way. We just need to run :

    $ EDITOR=nano kubectl edit configmap aws-auth -n kube-system

    And add the mapUsers section defined like this :

    mapUsers: |
      - userarn: arn:aws:iam::[account_id]:root
        groups:
        - system:masters

    Automating this modification with a script is more complex. The script below uses jq and yq :

    kubectl get configmap aws-auth \
      --namespace kube-system \
      --output yaml > "$PROJECT_DIR/aws-auth-configmap.yaml"
    # convert to json
    yq "$PROJECT_DIR/aws-auth-configmap.yaml" -o json > "$PROJECT_DIR/aws-auth-configmap.json"
    # add mapUsers (use jq instead of yq to add mapUsers because it's MUCH simpler and cleaner)
    jq '.data += {"mapUsers": "- userarn: arn:aws:iam::'$AWS_ID':root\n  groups:\n  - system:masters\n"}' "$PROJECT_DIR/aws-auth-configmap.json" \
    | yq --prettyPrint > "$PROJECT_DIR/aws-auth-configmap.yaml"
    # apply updated aws-auth-configmap.yaml
    kubectl apply --filename "$PROJECT_DIR/aws-auth-configmap.yaml" --namespace kube-system
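
    The trickiest part of the script is the quoting of the mapUsers string inside the jq filter. One way to make it easier to read is to build the YAML entry in a small function first. This is a sketch, the function name map_users_entry is an assumption.

```shell
# map_users_entry ACCOUNT_ID: print the YAML list item that grants system:masters
# to the root user of the given AWS account.
map_users_entry() {
  printf '%s\n' \
    "- userarn: arn:aws:iam::$1:root" \
    "  groups:" \
    "  - system:masters"
}

map_users_entry 123456789012

# The entry could then be passed to jq as a variable instead of being
# spliced into the filter (not executed here):
#   jq --arg users "$(map_users_entry "$AWS_ID")" '.data.mapUsers = $users' aws-auth-configmap.json
```

    Passing the value with --arg keeps the jq filter in plain single quotes, so the account id no longer has to break out of the quoting.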

    Installing ArgoCD

    To install ArgoCD we run this command :

    # install argocd in staging env
    $ make argo-install

    Installing ArgoCD takes only a few commands :

    kubectl create namespace argocd
    kubectl apply \
      --namespace argocd \
      --filename https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
    kubectl wait deploy argocd-server \
      --timeout=180s \
      --namespace argocd \
      --for=condition=Available=True
    kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'

    You can now connect to the ArgoCD web application with the data displayed in the Terminal :

    Accept self-signed certificate :


    Then login :


    Connect ArgoCD to Github repository

    ArgoCD must have access to our github repository gitops-multienv-infra

    Chances are that this repository is private

    It is therefore necessary to link the repository and ArgoCD using an SSH key

    These operations are performed by the command shown after the note. It uses gh to automate adding the SSH key to Github. However, you can easily do it manually.

    Important : the command uses the GIT_REPO variable which must be adjusted to your repository

    # add git repo connection + create ssh key + add ssh key to github
    $ make argo-add-repo

    The script executes a few lines :

    ssh-keygen -t ed25519 -N "" -f ~/.ssh/$PROJECT_NAME.pem
    # ...
    gh ssh-key add ~/.ssh/$PROJECT_NAME.pub --title $PROJECT_NAME
    argocd repo add $GIT_REPO \
      --insecure-ignore-host-key \
      --ssh-private-key-path ~/.ssh/$PROJECT_NAME.pem

    The repository is added :


    The public key is added to Github :


    The ArgoCD staging application

    The Kubernetes application is managed through the argocd folder

    Deploying multiple environments using GitOps principles is neither easy nor obvious

    We use the method explained by Kostis Kapelonis in this excellent article

    The application is defined using k8s manifests and a kustomization manifest within a base folder :


    The customization of the environments is then done via 2 specific folders within an overlays folder :


    The ArgoCD application is generated from a template :

    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: app-${NAMESPACE}
      namespace: argocd
    spec:
      project: default
      source:
        repoURL: ${GIT_REPO}
        targetRevision: HEAD
        path: argocd/overlays/${NAMESPACE}
      destination:
        server: ${SERVER}
        namespace: gitops-multienv

    The creation of the application is done in a few lines with envsubst :

    export NAMESPACE=staging
    export SERVER=https://kubernetes.default.svc
    kubectl config use-context $PROJECT_NAME-staging
    cat argocd/argocd-app.yaml | envsubst | kubectl apply -f -
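
    By default envsubst replaces every variable it finds in its input. It also accepts a list of variables to restrict the substitution, which protects any other "$..." sequence of the manifest. A minimal sketch, assuming the envsubst binary from gettext is installed. The function name render_app and the GIT_REPO value are placeholders.

```shell
# render_app: read a template on stdin and substitute only the 3 variables
# used by the ArgoCD application template, leaving anything else untouched.
render_app() {
  envsubst '${NAMESPACE} ${GIT_REPO} ${SERVER}'
}

export NAMESPACE=staging
export SERVER=https://kubernetes.default.svc
export GIT_REPO=git@github.com:your-account/your-infra-repo.git   # placeholder

render_app <<'EOF'
name: app-${NAMESPACE}
repoURL: ${GIT_REPO}
server: ${SERVER}
untouched: ${SOMETHING_ELSE}
EOF
```

    With the real template, the output would be piped to kubectl apply -f - as in the script above.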

    The application deploys the image set via this patch :

    Important: you must configure and push the correct docker image URI before installing the ArgoCD staging application

    - target:
        kind: Deployment
        name: website
      patch: |-
        - op: replace
          path: /spec/template/spec/containers/0/image
          value: xxxxxx.dkr.ecr.eu-west-3.amazonaws.com/multienv-vote:v1.0.0

    We execute the following command :

    # create argocd staging app
    $ make argo-staging-app


    The application is successfully deployed :


    We get the load balancer URL :


    Set up the production EKS cluster

    The deployment of the cluster is done via the commands :

    # terraform init the production env
    $ make production-init
    # terraform plan + apply the production env
    $ make production-apply

    After a long wait, the cluster is deployed :


    We configure the access to the cluster with this command :

    # setup kubectl config + aws-auth configmap for production env
    $ make eks-production-config

    Connect ArgoCD to the new EKS cluster

    We want a single ArgoCD instance to manage our multiple clusters

    We are therefore not going to install ArgoCD within the production cluster but connect this cluster to the ArgoCD instance running in the staging cluster

    We can see the list of clusters managed by argocd via this command :

    $ argocd cluster list
    SERVER                          NAME        VERSION  STATUS      MESSAGE  PROJECT
    https://kubernetes.default.svc  in-cluster  1.21+    Successful

    This corresponds to this visual interface :


    To add a cluster, its context must already be defined in the kubectl configuration. This has been done by the previous steps

    You can check the list of clusters manageable by kubectl via this command :

    $ kubectl config get-contexts
    CURRENT   NAME                           CLUSTER                                                             AUTHINFO                                                               
    *         multienv-infra-production      arn:aws:eks:eu-west-3:xxxxx:cluster/multienv-infra-production       arn:aws:eks:eu-west-3:xxxxx:cluster/multienv-infra-production       
              multienv-infra-staging         arn:aws:eks:eu-west-3:xxxxx:cluster/multienv-infra-staging          arn:aws:eks:eu-west-3:xxxxx:cluster/multienv-infra-staging

    We associate a cluster with argocd with this simple command :

    $ argocd cluster add --yes <cluster-context-name>
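
    The context name can be looked up instead of typed by hand. The sketch below filters a list of context names read on standard input. The function name context_for is an assumption, and the sample names are the ones listed by kubectl above.

```shell
# context_for ENV: read context names on stdin, print the first one ending in "-ENV".
context_for() {
  grep -- "-$1\$" | head -n 1
}

# With kubectl available, the list would come from:
#   kubectl config get-contexts -o name
printf '%s\n' multienv-infra-production multienv-infra-staging | context_for production
```

    The result could then be given to argocd cluster add --yes in place of <cluster-context-name>.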

    We execute this command :

    # argocd add production cluster
    $ make argo-add-cluster

    The cluster is added :

    $ argocd cluster list
    SERVER                                         NAME                       VERSION  STATUS      MESSAGE
    https://xxxxx.gr7.eu-west-3.eks.amazonaws.com  multienv-infra-production           Unknown     Cluster has no applications and is not being monitored.  
    https://kubernetes.default.svc                 in-cluster                 1.21+    Successful

    It can also be seen via the interface :


    The ArgoCD production application

    We deploy the production application using the following command :

    # create argocd production app
    $ make argo-production-app

    The application deploys the image set via this patch :

    Important : you must set up and push the correct docker image URI before installing the ArgoCD production application

    - target:
        kind: Deployment
        name: website
      patch: |-
        - op: replace
          path: /spec/template/spec/containers/0/image
          value: xxxxxx.dkr.ecr.eu-west-3.amazonaws.com/multienv-vote:v1.0.0


    Faster synchronization of ArgoCD applications

    ArgoCD checks the git repository for updates every 3 minutes

    You can use a Github webhook to trigger a sync after each push

    If the webhook is not an option, we can also trigger the sync with the argocd CLI

    We need to generate an ArgoCD token

    At the moment we don’t have any :

    $ argocd account get --account admin
    Name:               admin
    Enabled:            true
    Capabilities:       login

    We get an error if we try to generate one :

    $ argocd account generate-token
    FATA[0000] rpc error: code = Unknown desc = account 'admin' does not have apiKey capability

    We need to edit the argocd-cm ConfigMap :

    $ EDITOR=nano kubectl edit configmap argocd-cm -n argocd --context multienv-infra-staging

    Before :

    apiVersion: v1
    kind: ConfigMap
    metadata:
        # ...

    After :

    apiVersion: v1
    kind: ConfigMap
    data:
      accounts.admin: apiKey
    metadata:
        # ...

    Then generate the token again :

    $ argocd account generate-token

    Let’s check the data returned for the admin account :

    $ argocd account get --account admin
    Name:               admin
    Enabled:            true
    Capabilities:       login, apiKey
    ID                                    ISSUED AT                  EXPIRING AT
    xxxxxxxx-3f07-44b7-a9d4-40972f273484  2022-02-20T00:00:00+02:00  never

    We get the address of the ArgoCD server with this command :

    # argocd cli login + show access data
    $ make argo-login 
    SERVER  xxxxx-xxxxx.eu-west-3.elb.amazonaws.com

    We add the token and the ArgoCD server URL as Github secrets :


    The cd.yml workflow is used to synchronize our 2 applications :

    - name: Install argocd and run `app sync`
      run: |
        curl -sSL -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
        chmod +x /usr/local/bin/argocd
        argocd app get app-staging \
          --auth-token ${{ secrets.ARGOCD_TOKEN }} \
          --server ${{ secrets.ARGOCD_SERVER }} \
          --insecure \
          --grpc-web \
          --refresh >/dev/null
        argocd app wait app-staging \
          --auth-token ${{ secrets.ARGOCD_TOKEN }} \
          --server ${{ secrets.ARGOCD_SERVER }} \
          --insecure \
          --grpc-web \
          --sync \
        # ...
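
    The same 4 connection flags are repeated for every argocd call in this step. A possible way to shorten it is a small wrapper plus a loop over the applications. This is a sketch, not the content of cd.yml: the function names and the ARGOCD override variable are assumptions, and app-production is assumed from the app-${NAMESPACE} naming of the template.

```shell
# ARGOCD can be overridden, e.g. ARGOCD=echo for a dry run that only prints the arguments.
ARGOCD=${ARGOCD:-argocd}

# argocd_auth ARGS...: run an argocd command with the shared connection flags.
argocd_auth() {
  "$ARGOCD" "$@" \
    --auth-token "$ARGOCD_TOKEN" \
    --server "$ARGOCD_SERVER" \
    --insecure \
    --grpc-web
}

# sync_apps APP...: refresh each application, then wait until it is synced.
sync_apps() {
  for app in "$@"; do
    argocd_auth app get "$app" --refresh >/dev/null || return 1
    argocd_auth app wait "$app" --sync || return 1
  done
}
```

    In the workflow step, ARGOCD_TOKEN and ARGOCD_SERVER would be set from the 2 Github secrets, followed by a call such as sync_apps app-staging app-production.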

    Let’s deploy a new feature in staging

    We create a new branch in our multienv-vote project :

    $ git checkout -b feature-two

    We modify the css again :

    footer {
      /* ... */
      color: grey;
      background-color: #eceff1;
      border: grey 1px solid;
      padding: .7em;

    Our local server gives this at address http://localhost:4000/ :


    We push this branch on github :

    $ git add .
    $ git commit -m 'css again'
    $ git push --set-upstream origin feature-two

    The repository is added :


    We get the URI of the docker image :


    We modify our kustomization.yaml file of the staging environment :

    - target:
        kind: Deployment
        name: website
      patch: |-
        - op: replace
          path: /spec/template/spec/containers/0/image
          value: xxxxx.dkr.ecr.eu-west-3.amazonaws.com/multienv-vote-feature-two:cf03d643
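
    This edit is a one-line change, so it can be scripted. The sketch below rewrites the value line of the patch with sed. It assumes that the file contains a single "value:" line, and the function name set_image is hypothetical.

```shell
# set_image FILE IMAGE: replace what follows "value: " in FILE with IMAGE,
# keeping the original indentation of the line.
set_image() {
  tmp=$(mktemp)
  # "|" is used as the sed delimiter because image URIs contain "/"
  sed "s|^\( *value: \).*|\1$2|" "$1" > "$tmp" && mv "$tmp" "$1"
}
```

    A call would look like set_image followed by the path of the staging kustomization.yaml and the image URI copied from ECR.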

    We push this modification :

    $ git add .
    $ git commit -m feature-two
    $ git push

    The workflow is activated :


    The update of the 2 applications took 11 seconds :


    After reloading my browser at the staging URL, the change is visible :


    This demonstration is over. It is important to remove all resources :

    # delete argocd apps then argocd
    $ make argo-destroy
    # terraform destroy the production env
    $ make production-destroy
    # terraform destroy the staging env
    $ make staging-destroy