Here it is (base64-encoded):
H4sICIa2A1sCA2IA7Vrrbts2FFYL7M9+7QUGGNyfDYhtkuJFFLAhWZOhBYJmaLMOWBAEFC+xVlkyJLpYEBjdY+0l+k6jfGvqtkEWp2qD8TMg8vAqnsNzDg9lQhhmEjHDhY4zgWJBBUQJ5ZnCGAubMUQMyhJqoRRMJxYbo7Q2CedYxlQO/myqMroeEEHICIngApspxohEKI4h5DHmGEUQQw7jqAejDjBtnKz9q2w7zubi7gkugazVKHdGuWltQArkWDMCdoCqSpufg/QSPK4aV8pxW+nL96uxzMu39G+NqRe5PeekGj13Oi9BamXRmCtl1dS9X2jqel147C7W+aOJKd8dZ04dlcqsSw7KVyA9Ab/uHT/+cTht6mFRKVkMmywv0yv0mnxbMc8sSP8Apzvg0ViDtJwWxQ54Mpbny5W9qIrp2DSrmt+r+mVenu/ny+UelK6+mFR56VYtjsqfp3mxHupQZqZYdp/NGeo850x99r9j7QloyWEz8kvpK//47vuymvzQ29vf79m8MKnIaIa8bUmwRdByw6TKREIoIzE3xBrjrY7MGDUilomQ3GrNrFaIKqSZ4lkvL3tD12sn/IQCrI10xtcC7C1kH9I+xseQpYilRAwoZ5AI9IcfWFfqpRfzK1M3eeUZDRAfQDGAfc/jHTDKG1fVXiInlzcfctnwLPP9Vszs9VXvUzFy5jlZV5WzTbtN3cWkZWkhL/yS2gXm1p7lumkl24wkpv51FbYcU0EZy7SV0ucEZowkiCjvLbAVikCaGUqhyjT0c0Lj/YrElmmSWANOZ7MooHPwRCiLRaJEzBXKFGTCy49lUHNKjEigVdD6H4uTzPj9wzDCSawU0TQT2ujhjVwjgZzSj/n/eX7D/xPm/T8N/v/Ll/+Lg2fPnxw93eL85xFvyB9Rn4TzXwdAAxiMYLD/t9f/7eM/xDja1P+YBf3vKP7L2+PnttsA/IfjcQiE7nkgdH18Ey4O7pjdH7ygmX0p9n8eFA5aG3pb+0/eP/9jzFmw/13AdTBHK3/OPx7/Ic4X8qecQ9K244QG/98JXh8c/vLwwYM1/TD6KWqpv6LdOb37gT67URKterTpVxu1V9PXq3lW1d8skn++9Y83f4cDeEBAQMBnwliWuTWNu8l33G38/3X3fzGk79wFQ4S4Lwr+vwOcXIJHy4ANkLv4L4APcJ6ZSXUsz+efh1xaSOf3VxstHS6+H/nSu4s6wOns9OugxrdG7WXV5K6qc9NEn0n/ESab+s9o0P+O7v9ce1WzVNI7uAiczYI6BgQEBNwD/AvqV/+XACoAAA==
How I Got There
A colleague of mine showed me a Docker image he was using to test Kubernetes clusters. It did nothing: it just started up as a pod and sat there until you killed it.
‘Look, it’s only 700kb! Really quick to download!’
This got me wondering what the smallest Docker image I could create was.
I wanted one I could base64 encode and send ‘anywhere’ with a cut and paste.
Since a Docker image is just a tar file, and a tar file is ‘just’ a file, this should be quite possible.
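For the receiving end of that cut and paste, here is a minimal sketch of how the base64 text could be turned back into an image archive. It assumes GNU coreutils `base64`; the function name `decode_image` and the file name `image.b64` are my own placeholders, not anything Docker requires.

```shell
# Decode a pasted base64 blob back into the gzip-compressed image archive.
# Assumes GNU coreutils `base64`; image.b64 is a hypothetical file holding
# the pasted text.
decode_image() {
    base64 -d "$1"
}

# List the archive contents without involving Docker:
#   decode_image image.b64 | gunzip | tar -tf -
# Load it into a local Docker daemon (docker load accepts gzipped archives):
#   decode_image image.b64 | docker load
```

Since the image has no CMD, running it after loading should look something like `docker run t /t`, assuming the tag `t` survived the save and load.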
A Tiny Binary
The first thing I needed was a tiny Linux binary that does nothing.
There’s some prior art here, a couple of fantastic and instructive articles on creating small executables, which are well worth reading:
A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux
I didn’t want a ‘Hello World’, but a program that just slept and that worked on x86_64.
I started with an example from the first article above:
SECTION .data
msg: db "Hi World",10
len: equ $-msg
SECTION .text
global _start
_start:
    mov edx,len
    mov ecx,msg
    mov ebx,1
    mov eax,4
    int 0x80
    mov ebx,0
    mov eax,1
    int 0x80
Running:
nasm -f elf64 hw.asm -o hw.o
ld hw.o -o hw
strip -s hw
Produces a binary of 504 bytes.
But I don’t want a ‘hello world’.
First, I figured I didn’t need the .data or .text sections, nor did I need to load up the data. I figured the top half of the _start section was doing the printing so tried:
global _start
_start:
    mov ebx,0
    mov eax,1
    int 0x80
Which compiled at 352 bytes.
But that’s no good, because it just exits. I need it to sleep. So after a little further digging I worked out that the mov eax instruction loads the CPU register with the relevant Linux syscall number, and int 0x80 makes the syscall itself. More info on this here.
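I found a list of these here. Syscall 1 is ‘exit’, so what I wanted was syscall 29: pause. (These are the 32-bit syscall numbers, which is what int 0x80 uses even in a 64-bit binary; in the native x86_64 table, pause is 34.)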
This made the program:
global _start
_start:
    mov eax, 29
    int 0x80
Which shaved 8 bytes off to compile at 344 bytes, and creates a binary that just sits there waiting for a signal, which is exactly what I want.
Hexering
At this point I took out the chainsaw and started hacking away at the binary. To do this I used hexer, which is essentially a vim you can use on binary files to edit the hex directly. After a lot of trial and error I got from this:
to this:
Which appeared to do the same thing. Notice how the strings are gone, as well as a lot of whitespace. Along the way I referenced this doc, but mostly it was trial and error.
That got me down to 136 bytes.
Sub-100 Bytes?
I wanted to see if I could get any smaller. Reading this suggested I could get down to 45 bytes, but alas, no. That worked for a 32-bit executable, but pulling the same stunts on a 64-bit one didn’t seem to fly at all.
The best I could do was lift a 64-bit version of the program in the above blog and sub in my syscall:
BITS 64
            org     0x400000

ehdr:                                   ; Elf64_Ehdr
            db      0x7f, "ELF", 2, 1, 1, 0 ; e_ident
    times 8 db      0
            dw      2                   ; e_type
            dw      0x3e                ; e_machine
            dd      1                   ; e_version
            dq      _start              ; e_entry
            dq      phdr - $$           ; e_phoff
            dq      0                   ; e_shoff
            dd      0                   ; e_flags
            dw      ehdrsize            ; e_ehsize
            dw      phdrsize            ; e_phentsize
            dw      1                   ; e_phnum
            dw      0                   ; e_shentsize
            dw      0                   ; e_shnum
            dw      0                   ; e_shstrndx
ehdrsize    equ     $ - ehdr

phdr:                                   ; Elf64_Phdr
            dd      1                   ; p_type
            dd      5                   ; p_flags
            dq      0                   ; p_offset
            dq      $$                  ; p_vaddr
            dq      $$                  ; p_paddr
            dq      filesize            ; p_filesz
            dq      filesize            ; p_memsz
            dq      0x1000              ; p_align
phdrsize    equ     $ - phdr

_start:
            mov     eax, 29
            int     0x80

filesize    equ     $ - $$
Which gave me a binary of 127 bytes.
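As a rough sanity check on that number, the arithmetic can be sketched in shell. The header sizes are the fixed sizes of the two ELF64 structures in the listing, and the code size comes from the standard x86 encodings of the two instructions; the function name `elf_size` is just something I made up for the sketch.

```shell
# Total file size for a single-segment ELF64 binary with no section headers:
# ELF header + one program header + the machine code itself.
elf_size() {
    ehdr_size=64    # sizeof(Elf64_Ehdr)
    phdr_size=56    # sizeof(Elf64_Phdr)
    echo $((ehdr_size + phdr_size + $1))
}

# mov eax, 29 encodes to 5 bytes and int 0x80 to 2 bytes, so 7 bytes of code.
elf_size 7
```

So 120 of the bytes are headers, and only 7 are the program, which is why shaving instructions doesn't buy much from here.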
I gave up reducing at this point, and am open to suggestions.
A Teensy Docker Image
Now that I had my ‘sleep’ executable, I needed to put it in a Docker image.
To try and squeeze every byte possible, I created a binary with a filename one byte long called ‘t’ and put it in a Dockerfile built from scratch, a virtual 0-byte image:
FROM scratch
ADD t /t
Note there’s no CMD, as that increases the size of the Docker image. A command needs to be passed to the docker run command for this to run.
Using docker save to create a tar file, and then using maximum compression with gzip, I got to a portable Docker image file that was less than 1000 bytes:
$ docker build -t t .
$ docker save t | gzip -9 - | wc -c
976
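To get from that gzipped tar to something that can be pasted around, the pipeline can be extended with a base64 step. This is a minimal sketch rather than the exact command I ran; the function name `pack_b64` and the output file name are placeholders of my own.

```shell
# Read an image archive (e.g. the output of `docker save`) on stdin and
# print it as a single line of base64 with no line breaks.
pack_b64() {
    gzip -9 | base64 | tr -d '\n'
}

# Usage, assuming the image is tagged `t` as above:
#   docker save t | pack_b64 > image.b64
```

Stripping the newlines with `tr` keeps the blob on one line, which makes it easier to select and paste.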
I tried to reduce the size of the tar file by fiddling with the Docker manifest file, but my efforts were in vain: due to the nature of the tar file format and the gzip compression algorithm, these attempts actually made the final gzip bigger!
I also tried other compression algorithms, but gzip did best on this small file.
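If you want to repeat that comparison yourself, something like the following would do it. It's a sketch that assumes each tool accepts `-9 -c` the way gzip, bzip2, and xz do; the function name `compressed_size` and the file name `t.tar` are hypothetical.

```shell
# Print the size in bytes of FILE after compressing it with TOOL at level 9.
# Assumes TOOL accepts `-9 -c` (compress to stdout), as gzip, bzip2, xz do.
compressed_size() {
    tool=$1
    file=$2
    "$tool" -9 -c "$file" | wc -c
}

# Usage, with a hypothetical t.tar produced by `docker save t > t.tar`:
#   for tool in gzip bzip2 xz; do
#       echo "$tool: $(compressed_size "$tool" t.tar)"
#   done
```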
Can You Get This Lower?
Keen to hear from you if you can…
Code
is here.
zopfli compression reduces the size to 956 bytes
Google Brotli compresses this image down to 868B. I know it’s specially tuned for HTML but its predefined dictionary and optimized header seem to squeeze another 15% down. I just tried this with larger images but brotli takes much longer to compress.
https://github.com/google/brotli
gz compression was my choice rather than Docker’s. Apparently you can squeeze more out with specific flags to xz, but I like the (relative) universality of gzip compression.
Agree. It might be nice to have a flexible compression option sometime but that could get ugly. Nice post!
Thanks! It was a fun one :)
Larger system image from Docker Hub (111M):
$ ls -lh test.tar*
111M May 22 16:45 test.tar
26M May 22 16:35 test.tar.br
41M May 22 16:35 test.tar.gz
Brotli compression takes many times longer, but decompression is just about as fast as gzip (there is no parallel option currently that I know of).
Changing “mov eax,29” to “mov al,29” reduced the executable size to 124 bytes here.