Losslessly Compressing My JPEG Photos with jpegoptim


I’ve recently been running low on disk space on my laptop. I’ve freed some by removing files, but I’ve also been looking for ways to save space through compression.

My photo collection is currently 117GB. And that’s after removing the “everyone closed their eyes” shots and walrus memes!

Looks like a prime candidate for compression.

Introducing jpegoptim

jpegoptim is an open source utility for losslessly optimizing JPEGs. It’s from simpler times when names were obvious and websites didn’t need CSS. It works its magic by optimizing the Huffman coding used to compress the image data.

JPEG encoders don’t always find the optimal coding for an image, prioritizing speed over perfection. Camera software especially opts for speed to keep the “shutter” available.

Verification

I wanted to verify jpegoptim would be safe to use on my photos. I first optimized a single photo:

$ cp IMG_4586.JPG IMG_4586-optim.JPG
$ jpegoptim IMG_4586-optim.JPG
IMG_4586-optim.JPG 4032x3024 24bit N Exif XMP ICC  [OK] 4750852 --> 4018279 bytes (15.42%), optimized.
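
The percentage jpegoptim reports is simply bytes saved over the original size, which is easy to double-check:

```shell
# Recompute jpegoptim's reported saving from the byte counts it printed:
awk 'BEGIN { printf "%.2f%%\n", (4750852 - 4018279) / 4750852 * 100 }'
# prints 15.42%
```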

I verified that jpegoptim preserved the exact pixels by running GraphicsMagick compare on the before and after images:

$ gm compare -metric mse IMG_4586.JPG IMG_4586-optim.JPG
Image Difference (MeanSquaredError):
           Normalized    Absolute
          ============  ==========
     Red: 0.0000000000        0.0
   Green: 0.0000000000        0.0
    Blue: 0.0000000000        0.0
   Total: 0.0000000000        0.0

Zero difference - they're exactly the same!
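
If you want to script this check over many files, the compare output is easy to parse. Here's a minimal sketch (check_identical is my own helper name, not a GraphicsMagick feature):

```shell
# Hypothetical helper: reads a `gm compare -metric mse` report on stdin
# and succeeds only when the Total row is exactly zero.
check_identical() {
  awk '/Total:/ { found = 1; ok = ($2 == 0); exit }
       END { exit (found && ok) ? 0 : 1 }'
}

# Usage (requires GraphicsMagick):
# gm compare -metric mse before.jpg after.jpg | check_identical \
#   || echo "pixels differ!"
```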

I double-checked this by converting the images to bitmaps with GraphicsMagick convert, then comparing the hashes with sha256sum:

$ gm convert IMG_4586.JPG IMG_4586.bmp
$ gm convert IMG_4586-optim.JPG IMG_4586-optim.bmp
$ sha256sum IMG_4586.bmp IMG_4586-optim.bmp
0fd1d8d5a2286ca220746317612c0d8dfb757919a027ca8ad21e0c680f0954df  IMG_4586.bmp
0fd1d8d5a2286ca220746317612c0d8dfb757919a027ca8ad21e0c680f0954df  IMG_4586-optim.bmp

The hashes of the bitmap files match, so the decoded images contain exactly the same data. jpegoptim is indeed lossless.

JPEGs also have Exif metadata. This contains tags such as the date/time the photo was actually taken, the GPS coordinates, and the camera settings.

I used exiftool and diff to check that jpegoptim preserves all this metadata:

$ exiftool IMG_4586.JPG > IMG_4586.JPG-exif
$ exiftool IMG_4586-optim.JPG > IMG_4586-optim.JPG-exif
$ diff IMG_4586.JPG-exif IMG_4586-optim.JPG-exif
2c2
< File Name                       : IMG_4586.JPG
---
> File Name                       : IMG_4586-optim.JPG
4,7c4,7
< File Size                       : 4.5 MB
< File Modification Date/Time     : 2019:09:20 11:27:24+01:00
< File Access Date/Time           : 2019:09:20 11:27:33+01:00
< File Inode Change Date/Time     : 2019:09:20 11:27:24+01:00
---
> File Size                       : 3.8 MB
> File Modification Date/Time     : 2019:09:20 11:27:36+01:00
> File Access Date/Time           : 2019:09:20 11:27:37+01:00
> File Inode Change Date/Time     : 2019:09:20 11:27:36+01:00
11a12
> JFIF Version                    : 1.01
72c73
< Thumbnail Offset                : 2312
---
> Thumbnail Offset                : 2330

Great. The only differences are in fields that exiftool reads from the file system rather than from Exif, such as the file name and timestamps, plus two structural details (the JFIF version marker and the thumbnail's byte offset) that don't affect the metadata itself. The Exif data remains intact, including the time the photo was actually taken.
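
To make this diff come up empty on a clean run, the file-system fields can be filtered out before comparing. A sketch (strip_fs_fields is my own name; process substitution assumes bash):

```shell
# Hypothetical filter: drop the fields exiftool derives from the file
# system rather than from the image itself.
strip_fs_fields() {
  grep -v -E '^(File Name|Directory|File Size|File (Modification|Access|Inode Change) Date/Time)'
}

# Usage:
# diff <(exiftool IMG_4586.JPG | strip_fs_fields) \
#      <(exiftool IMG_4586-optim.JPG | strip_fs_fields)
```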

But How Much Savings?

When I ran jpegoptim above, its output ended with:

4750852 --> 4018279 bytes (15.42%), optimized

15% saving - not bad! (Or “good” in non-British English.)

This one image might just have been unusually compressible, though. To get a more accurate figure, I ran jpegoptim on my entire “incoming imports” folder. This contains the last 30 days of photos imported from my phone (my only camera).

I checked the disk usage of the folder before and after with du.

$ du -sh .
2.4G    .
$ jpegoptim *.JPG
APFG7754.JPG 1920x1440 24bit P JFIF  [OK] 377974 --> 366834 bytes (2.95%), optimized.
AVEL7283.JPG 2583x2583 24bit N Exif XMP IPTC JFIF  [OK] 1245429 --> 1208442 bytes (2.97%), optimized.
BFJO7034.JPG 1440x1920 24bit P JFIF  [OK] 359923 --> 352126 bytes (2.17%), optimized.
...
XTPF0658.JPG 1081x1920 24bit P JFIF  [OK] 272653 --> 264491 bytes (2.99%), optimized.
YCQQ4283.JPG 3840x2160 24bit N Exif  [OK] 1583447 --> 1534028 bytes (3.12%), optimized.
YHDI8749.JPG 1600x901 24bit P JFIF  [OK] 161337 --> 156934 bytes (2.73%), optimized.
YHFZ1946.JPG 1600x1200 24bit P JFIF  [OK] 115151 --> 115385 bytes (-0.20%), skipped.
$ du -sh .
2.0G    .

So we went from 2.4GB to 2.0GB: 0.4 / 2.4 ≈ 0.17, a saving of about 17%. Across my current collection that would be around 117 ⨉ 0.17 ≈ 19.9GB. Not bad indeed!

(N.B. The first and last images in the folder, the ones starting with random letters rather than IMG_, are from WhatsApp. It seems WhatsApp compresses them better than my phone’s camera does, so there is less saving to be had. The camera images compressed by up to 40%.)

(N.B.2. When jpegoptim reports a “negative” saving, like the last file in my output, it means it could only find a worse coding. In that case it leaves the original file in place rather than increasing its size!)
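
Since du only reports rounded figures, another option is to total up jpegoptim's per-file output directly. This awk summarizer is my own sketch, not a jpegoptim feature (jpegoptim also has a -t/--totals flag that prints its own summary); it counts skipped files as unchanged, since their originals are kept:

```shell
# Hypothetical summarizer: totals the "before --> after bytes" columns
# of jpegoptim's output and prints the overall saving.
sum_savings() {
  awk '{
    b = a = 0
    for (i = 1; i <= NF; i++)
      if ($i == "-->") { b = $(i - 1); a = $(i + 1) }
    if (/skipped/) a = b  # original kept when no saving is found
    before += b; after += a
  }
  END {
    if (before > 0)
      printf "%d -> %d bytes (%.2f%% saved)\n",
             before, after, (before - after) / before * 100
  }'
}

# Usage: jpegoptim *.JPG | sum_savings
```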

Fin

I’ll be running jpegoptim on my whole collection in due course. This will cost a lot of internet bandwidth afterwards: my backup software, Arq, will have to back up the photos again, since their contents count as entirely new data.

I’ll also make it part of my import process so I don’t need to think about it again.
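
As a sketch of what that import step could look like (optimize_tree is my own name; assumes GNU find and xargs; --preserve is jpegoptim's flag for keeping file modification times):

```shell
# Sketch of an import step: losslessly optimize every JPEG under a
# directory, preserving modification times so date-sorted imports
# aren't disturbed.
optimize_tree() {
  find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -print0 \
    | xargs -0 -r jpegoptim --preserve
}

# Usage:
# optimize_tree ~/Pictures/incoming
```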

I hope this post has helped advertise a great tool to you,

—Adam



