Docker is finally bringing Squash support

For a while, Docker image size has been a recurring topic of discussion in the containerization world, for example the way images can silently grow if you don't use the --no-cache flag while building. The layered structure of a Docker image can spook a beginner.

Typically, these layers are handled so that re-use and storage are optimized as far as possible. However, overlaying file systems often adds unnecessary size to an image, especially when the same file system locations are edited in different layers.

To get around the image size problem, there were several recommended workarounds. You'd put all the RUN commands in a single instruction. You'd minimize the changes made to the same file locations in different instruction layers. There were even suggestions to pull all the files into the Docker image from a temporary HTTP server at build time to minimize file changes. We at WSO2 did that successfully for the WSO2 Dockerfiles, bringing the image size down to the minimum possible with an Ubuntu base image.
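As a rough illustration of the single-instruction advice, the sketch below writes a small Dockerfile whose install and cleanup steps share one RUN instruction. The helper name write_single_run_dockerfile, the ubuntu:16.04 base, and the curl package are placeholders chosen for this example; they are not taken from the WSO2 Dockerfiles.

```shell
# Hypothetical helper: writes an example Dockerfile to the path given as $1.
write_single_run_dockerfile() {
  cat > "$1" <<'EOF'
FROM ubuntu:16.04
# Install and clean up in the same RUN instruction, so the files removed
# at the end are never committed to an image layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
EOF
}

# Example usage:
#   write_single_run_dockerfile ./Dockerfile
```

Splitting the same three commands across three RUN instructions would keep the package index in an earlier layer even after a later layer deletes it.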

However, it quickly became apparent that these workarounds add a lot of noise around the core purpose of a Dockerfile, which is simply to describe how a Docker image is built. They constantly required special scripts to do some kind of bootstrapping before the actual Docker build, which made it almost impossible to use the Dockerfiles with plain old Docker CLI commands. There was a clear trade-off between readable Dockerfiles and size-efficient Dockerfiles. Being forced to always use --no-cache was also a downer, since incremental builds took longer than they needed to.

Squashing images used to be more of a hack, one that a lot of third-party tools were available for. These tools would basically flatten all the layers by saving the image to a TAR file and then reloading it. Most of the time this would lose layer-related metadata, and the results were not always consistent across platforms.
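One common variant of that hack can be sketched with the stock docker create, docker export, and docker import commands. The helper name flatten_image is made up for this example, and the third-party tools mentioned above may well have worked differently.

```shell
# Hypothetical helper: flatten_image <source-image> <target-image>
flatten_image() {
  src="$1"
  dst="$2"
  # Create (but do not start) a container from the source image.
  cid="$(docker create "$src")" || return 1
  # Export the container's merged file system as a TAR stream and
  # import it back as a new single-layer image.
  docker export "$cid" | docker import - "$dst" > /dev/null
  status=$?
  # Remove the temporary container regardless of the outcome.
  docker rm "$cid" > /dev/null
  return "$status"
}

# Example usage:
#   flatten_image chamilad/testdocker:0.1 chamilad/testdocker:0.1-flat
```

An image produced this way does not carry over settings such as CMD, ENV, or EXPOSE, which is the kind of metadata loss described above.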

Docker has finally introduced a --squash option for the build command as an experimental feature. Having a vendor-supported way to reduce image sizes is a massive relief from a headache that would otherwise be the first part of any Docker-related discussion. In most cases there seems to be around a 40% reduction in image size, which is not bad for an experimental feature.

Enable Docker Experimental Mode

First, check whether the Docker daemon has experimental mode enabled.

$ docker version
Client:
 Version:      17.05.0-ce
 API version:  1.29
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.05.0-ce
 API version:  1.29 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64
 Experimental: false

If Experimental is set to false, we have to enable it in the daemon configuration and then restart the daemon.
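For scripts, the same check can be done without reading the whole listing by asking docker version for just that field through its --format option. The helper name docker_experimental_enabled is an assumption made for this sketch.

```shell
# Hypothetical helper: succeeds (exit status 0) only when the daemon
# reports Experimental: true.
docker_experimental_enabled() {
  [ "$(docker version --format '{{.Server.Experimental}}' 2>/dev/null)" = "true" ]
}

# Example usage:
#   if docker_experimental_enabled; then
#     echo "experimental mode is on"
#   else
#     echo "experimental mode is off"
#   fi
```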

To enable experimental mode, create the file /etc/docker/daemon.json with the following content. If the file already exists, add the experimental key to it instead.

{
 "experimental": true
}
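If you would rather script this step, a minimal sketch follows. The helper name write_experimental_config is hypothetical, and it deliberately refuses to touch an existing file so that other daemon settings are not overwritten.

```shell
# Hypothetical helper: write_experimental_config <path>
# Writes the setting only if no file exists at <path>.
write_experimental_config() {
  if [ -e "$1" ]; then
    echo "refusing to overwrite existing $1; add the key by hand" >&2
    return 1
  fi
  printf '{\n  "experimental": true\n}\n' > "$1"
}

# Example usage (writing under /etc/docker typically requires root):
#   write_experimental_config /etc/docker/daemon.json
```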

Then restart the Docker daemon. Experimental should now be set to true.

$ sudo systemctl restart docker.service
$ docker version
Client:
 Version:      17.05.0-ce
 API version:  1.29
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.05.0-ce
 API version:  1.29 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64
 Experimental: true

Building with Squash Option

The --squash option is now available for the docker build command.

$ docker build --squash -t chamilad/testdocker:0.1 .

If you inspect the resulting Docker image using the docker history command, you will see that although the individual layers are still listed, their sizes are shown as 0B, and there is an additional layer with a comment of the form merge sha256:<hash> to <hash>. This is all the previous layers (up to the parent image) squashed into one layer. With the intermediate union file system layers out of the picture, the single merged layer needs less storage than the original stack of layers.
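To put a number on the saving for your own image, the arithmetic can be sketched as below. The helper name size_reduction_pct and the 0.1-unsquashed tag in the usage comment are made up for the example. The sizes are assumed to be byte counts such as the ones docker image inspect reports.

```shell
# Hypothetical helper: size_reduction_pct <bytes-before> <bytes-after>
# Prints the reduction as a whole-number percentage (integer division).
size_reduction_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# Example usage, assuming the same Dockerfile was built twice, once
# without --squash (tagged 0.1-unsquashed here) and once with it:
#   before=$(docker image inspect --format '{{.Size}}' chamilad/testdocker:0.1-unsquashed)
#   after=$(docker image inspect --format '{{.Size}}' chamilad/testdocker:0.1)
#   size_reduction_pct "$before" "$after"
```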


Written on August 22, 2017 by chamila de alwis.

Originally published on Medium