Automated process with Bitbucket Pipelines for quick and easy creation of custom Docker images

Reading Time: 9 minutes

Since its first release back in 2013, Docker has kept growing every day. More and more images are created and more containers are used, so we need something that helps us complete the repetitive tasks quickly and easily. Every second we can save is a plus. That is why I wrote this article: to save you some time so you can focus on new things. In it, I will walk through a process that I find very useful when working with Docker, and we will see how to easily speed up the creation of Docker images for our custom usage.

For this walkthrough we need several prerequisites: Docker installed on your machine, a Google Artifact Registry, the appropriate Google Cloud accounts set up (authenticated and ready for use), and Bitbucket Pipelines. Following along can cost money, so please take a look at the pricing of these resources first.
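A quick way to verify the prerequisites on your machine could look like this (a minimal sketch, nothing more):

# Docker is installed and the daemon is running
docker version

# gcloud is installed, authenticated and pointing at the right project
gcloud auth list
gcloud config get-value project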

The process

First, we have to create the folder structure from which we will build and push the Docker images. Create a folder like Docker_Images and, inside it, two subfolders: base_images and extension_images. In the base_images folder we will store the base layer images that are used by the extension Docker images, so we can take a base image, add new features on top of it and quickly get a new custom Docker image. With this approach we can maintain several base images and always create new images easily by building on top of those base layers.
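If you prefer the command line, the whole structure can be created in one go (a sketch, already including the folder for the base image we create in the next step):

mkdir -p Docker_Images/base_images/linux_dind_openjdk11
mkdir -p Docker_Images/extension_images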

We will create one base Docker image and one extension Docker image. First the base image: create a new subdirectory in the base_images directory with an appropriate name; I will use linux_dind_openjdk11 as my base image. Inside that subdirectory we create the Dockerfile, which for this base image looks like this:

# you can use a different base image here if you need a specific one
FROM docker:19.03-dind
USER root
# in ENV we list the packages that we want to be installed
ENV \
    RUNTIME_DEPS="tar unzip curl openjdk11 bash docker-compose"
# with RUN we add the package repository and install the requested packages
RUN \
    echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories && \
    apk update && \
    apk add --no-cache $RUNTIME_DEPS

As you can see, the Dockerfile has several parts: FROM, USER, ENV and RUN, and each of them does a specific job. A Dockerfile can have additional instructions, but in our case these are the only ones we need. With this, we have created our base image. You can find more info about the Dockerfile structure here: https://docs.docker.com/engine/reference/builder/.

The next step is to build the image and push it to the Google Artifact Registry where we store our images, so that every member of our company or team can use them.
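Before the first push, Docker has to be allowed to authenticate against the Artifact Registry host. With an already authenticated gcloud setup this is a one-time command (us-central1 is just the location from my example):

gcloud auth configure-docker us-central1-docker.pkg.dev

After that, the build and push commands look like this: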

docker build -t us-central1-docker.pkg.dev/gcp-team-platform/docker-registry-name/linux_dind_openjdk11:1.0.0 .
docker push us-central1-docker.pkg.dev/gcp-team-platform/docker-registry-name/linux_dind_openjdk11:1.0.0

When these two commands finish successfully, our base image is pushed to the Google Artifact Registry. In the Google Cloud Console it can be found by simply typing Artifact Registry in the search bar and opening the registry, where we will see the image we have just pushed.
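If you prefer the command line over the Cloud Console, the pushed image should also show up when listing the repository with gcloud (the project and repository names are the placeholders from my example):

gcloud artifacts docker images list us-central1-docker.pkg.dev/gcp-team-platform/docker-registry-name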

After this step we can continue with our new extension Docker image, using the base image as a starting point. Create a new subfolder in the extension_images folder with a similar naming pattern, e.g. linux_dind_openjdk11_maven3_gradle7; this folder must also contain a Dockerfile. This image adds new features, Maven and Gradle, and its Dockerfile looks like this:

# the base image that we created previously
FROM us-central1-docker.pkg.dev/gcp-team-platform/docker-registry-name/linux_dind_openjdk11:1.0.0
USER root
# the new packages added on top of the base image
ENV \
    RUNTIME_DEPS="maven gradle"
RUN \
    echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories && \
    apk update && \
    apk add --no-cache $RUNTIME_DEPS

When we build and push this Dockerfile, the extension image contains everything from the base image plus the new packages, which means Maven and Gradle are installed together with OpenJDK 11, DinD (Docker in Docker) and Linux.
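The build and push commands are the same as for the base image, only the image name changes (again with my placeholder project and repository names):

docker build -t us-central1-docker.pkg.dev/gcp-team-platform/docker-registry-name/linux_dind_openjdk11_maven3_gradle7:1.0.0 .
docker push us-central1-docker.pkg.dev/gcp-team-platform/docker-registry-name/linux_dind_openjdk11_maven3_gradle7:1.0.0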

This process can be repeated for every Docker image variation we need: create base images in the base_images folder and extend them in the extension_images folder, where new Docker images are built on top of the base layers. As you can see, this is a repetitive task, so we can automate parts of it.

For the automation part I use Bitbucket Pipelines, which gives us an easy, fast and secure way of executing these steps. You can read more here: https://bitbucket.org/product/features/pipelines.

Bitbucket Pipelines

Bitbucket Pipelines is a CI/CD tool that is closely tied to Docker: every build we run is executed inside a Docker container.

For our example we need a Bitbucket repository to store our files, so please create one and push the files there. For the pipeline to work we also need a bitbucket-pipelines.yml file in the root of our working directory; the name must be exactly bitbucket-pipelines.yml so that Bitbucket recognizes it. Inside this file we can automate the steps needed to build and push the images.
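In my setup the content of the Docker_Images folder is the root of the repository, so the layout looks roughly like this (the script build_push_docker_images_script.sh is introduced a bit further down):

/base_images
  /linux_dind_openjdk11
    Dockerfile
/extension_images
  /linux_dind_openjdk11_maven3_gradle7
    Dockerfile
bitbucket-pipelines.yml
build_push_docker_images_script.sh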

There are several parts of the Bitbucket pipeline file that we have to take care of. The first one is the image at the top of the document: it pulls the Google Cloud SDK image so we can execute gcloud commands right away.

image: gcr.io/google.com/cloudsdktool/cloud-sdk:latest

Next comes the clone section, where we tell Bitbucket to clone the repository with full depth so we get the complete structure of files and directories.

clone:
  depth: full

Then comes the definitions part, where we define the script that automates the steps:

definitions:
  scripts:
    - script: &buildDockerImage
      - echo $SERVICE_ACCOUNT_KEY | base64 -d > key.json
      - gcloud auth activate-service-account $SERVICE_ACCOUNT_EMAIL --key-file=key.json
      - gcloud auth configure-docker $DOCKER_REGISTRY_LOCATION --quiet
      - chmod +rx build_push_docker_images_script.sh
      - ./build_push_docker_images_script.sh

The first command decodes the service account key: as mentioned, we need a service account already created for this task, and here we take its base64-encoded key from a variable and write it to key.json. The next two gcloud commands connect us to the Google Cloud platform: the first authenticates the service account, the second configures Docker authentication against the Artifact Registry location.
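The $SERVICE_ACCOUNT_KEY, $SERVICE_ACCOUNT_EMAIL, $DOCKER_REGISTRY_LOCATION and $DOCKER_REGISTRY values are assumed to be stored as secured repository variables in Bitbucket. The key variable holds the base64-encoded JSON key file of the service account, which can be produced like this (GNU coreutils):

# encode the downloaded service account key file into a single line
base64 -w 0 key.json
# paste the output into the secured Bitbucket variable SERVICE_ACCOUNT_KEY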

The last two commands make our custom script build_push_docker_images_script.sh executable and run it. The script enters every folder of the extension_images directory, builds the new image using the folder name as the image name and tag, and pushes it to the Artifact Registry if everything is correct.

The shell script is a simple one: it loops through every directory and executes a few commands, see the code below:

#!/bin/bash
# folder that contains one subfolder per extension image, next to this script
extension_path="$(dirname "$(realpath "$0")")/extension_images"

for dir in "$extension_path"/*; do
  if [ -d "$dir" ]; then
    cd "$dir" || exit 1
    folder_name=$(basename "$dir")
    # the folder name becomes the image name
    docker build -t "$DOCKER_REGISTRY/$folder_name:1.0.0" .
    docker push "$DOCKER_REGISTRY/$folder_name:1.0.0"
  fi
done

The for loop iterates over every folder in the extension_images directory, enters each of them and executes the docker build and docker push commands.
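You can also run the script locally to test it before wiring it into the pipeline, assuming Docker is authenticated against the registry as shown earlier and the registry variable is set (a sketch):

export DOCKER_REGISTRY=us-central1-docker.pkg.dev/gcp-team-platform/docker-registry-name
chmod +x build_push_docker_images_script.sh
./build_push_docker_images_script.sh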

The last part of the Bitbucket pipeline file is the section where we define how the pipeline is triggered. There are several different ways of triggering a pipeline; one of them is per branch, so a push to a specific branch triggers the pipeline. In our case that branch is master.

pipelines:
  branches:
    master:
      - step:
          name: Build and Deploy Docker Images
          deployment: Dev
          script: *buildDockerImage
          services:
            - docker

In this code you can see that we again have a step; in Bitbucket Pipelines every step runs in a separate Docker container. The step has a name for easy recognition of what it does, and the deployment (Dev) is the environment in which this pipeline executes our script. The last part is the services section, where an additional service container is spun up alongside the step; in our case we have chosen docker, which provides the Docker daemon needed for the docker build and push commands.

You can find more about Bitbucket Pipeline Triggers on this link https://support.atlassian.com/bitbucket-cloud/docs/pipeline-triggers/.

Summary

In short, these are the steps for creating and automating Docker images, using Bitbucket Pipelines as a fast way to get Docker images ready for use:

  • First, the manual creation of the base Docker image: how to create the folder structure and a base image that can later be reused as many times as we need.
  • The extension Docker image: how to reuse the base image to create a new custom image and use this concept for fast image creation.
  • Bitbucket Pipelines: a fast and reliable way of running pipelines, with automated triggers and other features, which makes it a great option for this kind of task.
  • Shell scripting: automating the execution of the repetitive commands.

I hope this helped you understand Docker image creation, the base and extension image concept explained in short here, and the automation with Bitbucket Pipelines.

Note!!!

Please be aware that some variables in this article are specific to my setup, so if you run this yourself, make sure to change them to your own values so everything works on your end.

What is CI? Continuous Integration Explained

Reading Time: 5 minutes

Continuous Integration (CI) is a software development practice that requires members of a team to frequently integrate their code changes into a central repository (master branch), preferably several times a day.

Each merge is then verified by automatically generating a build, and running automated tests against that build.

By integrating regularly, you can detect errors quickly and locate and fix them more easily.

Why is Continuous Integration Needed?

Back in the day, BCI – Before Continuous Integration – developers from a single team might have worked in isolation for a long period of time and merged their code changes only when they finished working on a particular feature or bug fix.

This caused the well-known merge hell (integration hell): a lot of code conflicts, newly introduced bugs, lots of time invested in analysis, as well as frustrated developers and project managers.

All these ingredients made it harder to deliver updates and value to the customers on time.

How does Continuous Integration Work?

Continuous Integration as a software development practice entails two components: an automation component and a cultural one.

The cultural component focuses on the principle of frequent integrations of your code changes to the mainline of the central repository, using a version control system such as Git, Mercurial or Subversion.

By applying the cultural component you will drastically lower the frustration and time wasted on merging code because, in reality, you are merging small changes all the time.

As a matter of fact, you can practice Continuous Integration using only this principle, but by adding the automation component into your CI process you can exploit the full potential of Continuous Integration.

[Image: Continuous Integration workflow]

As shown in the picture above, this includes a CI server that will generate builds automatically, run automated tests against those builds and notify (or alert) the team members of the results.

By leveraging the automation component you will immediately be aware of any errors, thus allowing the team to fix them fast and without too much time spent analysing.

There are plenty of CI tools out there that you can choose from, but the most common are: Jenkins, CircleCI, GitHub Actions, Bitbucket Pipelines etc.

Continuous Integration Best Practices and Benefits

Everyone should commit to the mainline daily

By doing frequent commits and integrations, developers let other developers know about the changes they’ve done, so passive communication is being maintained.

Other benefits that come with developers integrating multiple times a day:

  • integration hell is drastically reduced
  • conflicts are easily resolved as not much has changed in the meantime
  • errors are quickly detected

The builds should be automated and fast

Given the fact several integrations will be done daily, automating the CI Pipeline is crucial to improving developer productivity as it leads to less manual work and faster detection of errors and bugs.

Another important aspect of the automated build is optimising its execution speed and making it as fast as possible, as this enables faster feedback and leads to more satisfied developers and customers.

Everyone should know what’s happening with the system

Given Continuous Integration is all about communication, a good practice is to inform each team member of the status of the repository.

In other words, whenever a merge is made, thus a build is triggered, each team member should be notified of the merge as well as the results of the build itself.

To notify all team members or stakeholders, use your imagination: email is the most common channel, but you can also leverage SMS or integrate your CI server with communication platforms like Slack, Microsoft Teams, Webex etc.
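As a minimal sketch of such an integration, a CI step could post a message to a Slack channel through an incoming webhook, assuming the webhook URL is stored as the SLACK_WEBHOOK_URL variable:

curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"CI build for the master branch passed"}' \
  "$SLACK_WEBHOOK_URL"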

Test Driven Development

Test Driven Development (TDD) is a software development approach relying on the principle of writing tests before writing the actual code. What TDD offers in general is improved test coverage and an even better understanding of the system requirements.

But, put those together, Continuous Integration and TDD, and you will get a lot more trust and comfort in the CI Pipelines as every new feature or bug fix will be shipped with even better test coverage.

Test Driven Development also inspires a cultural change in the team and even the whole organisation, by motivating developers to write even better and more robust test cases.

Pull requests and code review

A big portion of software development teams nowadays practice a pull request and code review workflow.

A pull request is typically created whenever a developer is ready to merge new code changes into the mainline, making the pull request perfect for triggering the CI Pipeline.

Usually, additional manual approval is required after a successful build, where other developers review the new code, make suggestions and approve or deny the pull request. This final step brings additional value such as knowledge sharing and an additional layer of communication between the team members.

Summary

Building software solutions in a multi-developer team is as complex as it was five, ten or even twenty years ago if you are not using the right tools and exercising the right practices and principles, and Continuous Integration is definitely one of them.


I hope you enjoyed this article and you are not leaving empty-handed.
Feel free to leave a comment. πŸ˜€


How we deploy with Terraform and BitBucket to Azure Kubernetes

Reading Time: 6 minutes

N47 implemented a set of back-office web applications for Prestige, a real estate management company located in Zurich, Switzerland. One application is a tool for displaying construction projects near properties managed by Prestige, and a second example is a tool for creating and assigning orders to craftsmen. The following examples, however, aren't specific to those use cases.

Screenshot of the Construction Project tool.

An Overview

The project entails one frontend application with multiple microservices whereby each service has its own database schema.

The application consumes data from Prestige’s main ERP system Abacus and third-party applications.

N47 is responsible for setting up and maintaining the full Kubernetes stack, MySQL Database, Azure Application Gateway and Azure Active Directory applications.

Another company is responsible for the networking and the Abacus part.

Architectural Overview

Involved Technologies

Our application uses the following technologies:

  • Database: MySQL 8
  • Microservices: Java 11, Spring Boot 2.3, Flyway for database schema updates
  • Frontend: Vue.js 2.5 and Vuetify 2.3
  • API Gateway: nginx

The CI/CD technology stack includes:

  • Source code: BitBucket (Git)
  • Pipelines: BitBucket Pipelines
  • Static code analysis: SonarCloud
  • Infrastructure: Terraform
  • Cloud provider: Azure

We’ll focus on the second list of technologies.

Infrastructure as Code (IaC) with Terraform and BitBucket Pipelines

One thing I really like when using IaC is having the definition of the involved services and resources of the whole project in source code. That enables us to track the changes over time in the Git log and of course, it makes it far easier to set up a stage and deploy safely to production.

Please read more about Terraform in our blog post Build your own Cloud Infrastructure using Terraform. The Terraform website is of course as well a good resource.

Storage of Terraform State

One important thing when dealing with Terraform is storing the state in an appropriate place. We’ve chosen to create an Azure Storage Account and use Azure Blob Storage like this:

terraform {
  backend "azurerm" {
    storage_account_name = "prestigetoolsterraform"
    container_name       = "prestige-tools-dev-tfstate"
    key                  = "prestige-tools-dev"
  }
}
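The storage account and the blob container themselves can be created up front with the Azure CLI, for example like this (a sketch; the resource group name and location are placeholders, not the real project values):

az storage account create \
  --name prestigetoolsterraform \
  --resource-group prestige-tools-rg \
  --location westeurope \
  --sku Standard_LRS

az storage container create \
  --name prestige-tools-dev-tfstate \
  --account-name prestigetoolsterraform

# prints the access keys that the pipeline later passes to terraform init
az storage account keys list \
  --resource-group prestige-tools-rg \
  --account-name prestigetoolsterraform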

The required access_key is passed as an argument to terraform within the pipeline (more on that later). You can find further details in the official Microsoft tutorial Store Terraform state in Azure Storage.
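The same commands the pipeline runs can also be executed locally against the shared state, assuming you export the access key yourself (a sketch):

cd environments/dev
terraform init -backend-config="access_key=$DEV_TF_CONFIG_ACCESS_KEY"
terraform plan -out out-overall.plan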

Another important point is not to run pipelines in parallel, as this could result in conflicts with locks.

Used Terraform Resources

We provision the needed resources on Azure via BitBucket + Terraform. The most important ones show up as modules in the project structure below.

Structure of Terraform Project

We created an entry point for each stage (local, dev, test and prod), which is relatively small and mainly aggregates the modules with some environment-specific configuration.

The configurations, credentials and other data are stored as variables in the BitBucket pipelines.

/environments
  /local
  /dev
  /test
  /prod
/modules
  /azure_active_directory
  /azure_application_gateway
  /azure_application_insights
    /_variables.tf
    /_output.tf
    /main.tf
  /azure_mysql
  /azure_kubernetes_cluster
  /...

The modules themselves always contain the files _variables.tf, main.tf and _output.tf, to have a clean separation of input, logic and output.


Example source code of the azure_application_insights module (please note that some of the text has been shortened in order to display it properly):

_variables.tf

variable "name" {
  type = string
}

variable "location" {
  type = string
}

variable "resource_group_name" {
  type = string
}

main.tf

resource "azurerm_application_insights" "ai" {
  name                = var.name
  location            = var.location
  resource_group_name = var.resource_group_name
  application_type    = "web"
}

_output.tf

output "instrumentation_key" {
  value = azurerm_application_insights.ai.instrumentation_key
}

BitBucket Pipeline

The BitBucket pipeline controls Terraform and includes the init, plan and apply steps. We decided to apply the infrastructure changes manually in the beginning.

image: hashicorp/terraform:0.12.26

pipelines:
  default:
    - step:
        name: Plan DEV
        script:
          - cd environments/dev
          - terraform init -backend-config="access_key=$DEV_TF_CONFIG_ACCESS_KEY"
          - terraform plan -out out-overall.plan
        artifacts:
          - environments/dev/out-overall.plan

  branches:
    develop:
      - step:
          name: Plan DEV
          script:
            - cd environments/dev
            - terraform init -backend-config="access_key=$DEV_TF_CONFIG_ACCESS_KEY"
            - terraform plan -out out-overall.plan
          artifacts:
            - environments/dev/out-overall.plan
            - environments/dev/.terraform/**
      - step:
          name: Apply DEV
          trigger: manual
          deployment: dev
          script:
            - cd environments/dev
            - terraform apply out-overall.plan

    master:
      # PRESTIGE TEST
      - step:
          name: Plan TEST
          script:
            - cd environments/test
            - terraform init -backend-config="access_key=$PRESTIGE_TF_CONFIG_ACCESS_KEY"
            - terraform plan -out out-overall.plan
          artifacts:
            - environments/test/out-overall.plan
            - environments/test/.terraform/**
      - step:
          name: Apply TEST
          trigger: manual
          deployment: test
          script:
            - cd environments/test
            - terraform apply out-overall.plan

      # PRESTIGE PROD ...

Needed Steps for Deploying to Production

1. Create feature branch with some changes

2. Push to Git (the BitBucket pipeline with the step Plan DEV will run). All the details about the changes can be found in the output of the terraform plan step

3. Create a pull request and merge the feature branch into develop. This will start another pipeline with the two steps (plan + apply)

4. Check the output of the plan step before triggering the deploy on dev

5. Now the dev stage is updated and, if everything works as you wish, create another pull request to merge from develop to master, and repeat the same steps for production or other stages

We have just deployed an infrastructure change to production without logging into any system except BitBucket. Time for celebration.

Symbol picture of N47 production deployment party (from Unsplash)

Is Everything Really That Shiny?

Well, everything is a big word.

We found issues, for example with cross-module dependencies, which aren’t just solvable with a depends_on. Luckily, there are some alternatives:

network module:

output "id" {
  description = "The Azure assigned ID generated after the Virtual Network resource is created and available."
  value = azurerm_virtual_network.virtual_network.id
}

kubernetes cluster module, which depends on network:

variable "subnet_depends_on" {
  description = "Variable to force module to wait for the Virtual Network creation to finish"
}

and the usage of those two modules in environments/dev/main.tf

module "network" {
  source = "../../modules/azure_network"
}

module "kubernetes_cluster" {
  source = "../../modules/azure_kubernetes_cluster"
  subnet_depends_on = module.network.id
}

After having things set up, it is a real joy to wipe out a stage and provision everything from scratch just by running a BitBucket pipeline.