pre-commit for safe image handling
Modern cameras put a lot of metadata into images:
- GPS location
- Device model
- Camera software version
- Subject distance
- Facing direction
- Colour profiles
- Time, with timezone
- Reasonable camera stuff like ISO, shutter speed, flash usage, focal length
This is mostly good, actually, since it is useful to you as the photographer: I've frequently found a use for the geolocation of old photos. However, it can present a significant privacy risk if you ever post or send someone verbatim image files taken by such a camera; the GPS coordinates in particular. Out of an abundance of caution, I would prefer to strip everything that is not necessary for displaying the image.
Image metadata, of course, is not the only way to cause yourself privacy problems with images. The data itself can be just as big of a problem: your OS vendor can fuck up their filesystem APIs undocumentedly and cause an "acropalypse", or particularly motivated stalkers can geolocate most distinctive things, especially if there's a background of "outside". That said, the phrase "your threat model is not my threat model but your threat model is OK" always rings true, and this may or may not actually be a consideration.
Forgetting to strip image metadata is a well-known bug class, so relatively few web tools make that mistake, but sometimes people mess it up. If I used a web-based content management system, I would double check it, but I would expect it to strip any private metadata off of images.
That said, this website (like many others run by computer dorks) is maintained with a static site generator, a forked version of Zola, which takes text and image sources to generate HTML, as a compilation process: the source files are left untouched. Further, perhaps unfortunately, my source files are public so I had better not check in anything bad.
Ruh roh. Better not check in any images with metadata. I have, to date, succeeded in this purely by vigilance, but vigilance is not a robust process. Typically I stick the images into the GNU Image Manipulation Program and export fresh files without retaining EXIF metadata.
Let's fix this by instituting an automated barrier that also fixes images:
exiftool conveniently supports most image formats, and can do arbitrary metadata editing. We can ensure that it is always run on files before they are checked in by using a tool like pre-commit to create user-friendly Git hooks.
First, we need to find an exiftool invocation. We want to keep some metadata that is crucial to having the image display correctly: we need the colour profile so the colours are right, and we need the orientation (since phone cameras tend to rotate the image on the viewer side, probably because that makes rotation lossless).
The manual states:
--TAG
Exclude specified tag from extracted information. (...) May also be used following a -tagsFromFile option to exclude tags from being copied (when redirecting to another tag, it is the source tag that should be excluded), or to exclude groups from being deleted when deleting all information (eg. -all= --exif:all deletes all but EXIF information). But note that this will not exclude individual tags from a group delete (unless a family 2 group is specified, see note 4 below). Instead, individual tags may be recovered using the -tagsFromFile option (eg. -all= -tagsfromfile @ -artist).
Hmm, so -all= --icc_profile:all -tagsfromfile @ -orientation, maybe?
exiftool output
» exiftool PXL_20220116_223722991.jpg
ExifTool Version Number : 12.50
File Name : PXL_20220116_223722991.jpg
Directory : .
File Size : 1551 kB
File Modification Date/Time : 2023:04:24 15:05:02-07:00
File Access Date/Time : 2023:04:24 15:05:02-07:00
File Inode Change Date/Time : 2023:04:24 15:05:02-07:00
File Permissions : -rw-r--r--
File Type : JPEG
File Type Extension : jpg
MIME Type : image/jpeg
Exif Byte Order : Big-endian (Motorola, MM)
Orientation : Horizontal (normal)
X Resolution : 72
Y Resolution : 72
Resolution Unit : inches
Y Cb Cr Positioning : Centered
Profile CMM Type :
Profile Version : 4.0.0
Profile Class : Display Device Profile
Color Space Data : RGB
Profile Connection Space : XYZ
Profile Date Time : 2016:12:08 09:38:28
Profile File Signature : acsp
Primary Platform : Unknown ()
CMM Flags : Not Embedded, Independent
Device Manufacturer : Google
Device Model :
Device Attributes : Reflective, Glossy, Positive, Color
Rendering Intent : Perceptual
Connection Space Illuminant : 0.9642 1 0.82491
Profile Creator : Google
Profile ID : 75e1a6b13c34376310c8ab660632a28a
Profile Description : sRGB IEC61966-2.1
Profile Copyright : Copyright (c) 2016 Google Inc.
Media White Point : 0.95045 1 1.08905
Media Black Point : 0 0 0
Red Matrix Column : 0.43604 0.22249 0.01392
Green Matrix Column : 0.38512 0.7169 0.09706
Blue Matrix Column : 0.14305 0.06061 0.71391
Red Tone Reproduction Curve : (Binary data 32 bytes, use -b option to extract)
Chromatic Adaptation : 1.04788 0.02292 -0.05019 0.02959 0.99048 -0.01704 -0.00922 0.01508 0.75168
Blue Tone Reproduction Curve : (Binary data 32 bytes, use -b option to extract)
Green Tone Reproduction Curve : (Binary data 32 bytes, use -b option to extract)
Image Width : 4080
Image Height : 3072
Encoding Process : Baseline DCT, Huffman coding
Bits Per Sample : 8
Color Components : 3
Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)
Image Size : 4080x3072
Megapixels : 12.5
Looks like it. It's not overwriting the file though, but it looks like there's -overwrite_original for that.
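The pieces found so far can be collected into a small shell function, which makes it easier to try on a scratch copy of a photo before wiring it into a hook. This is only a sketch: strip_metadata is a made-up helper name, and the flags are exactly the ones discussed above.

```shell
# strip_metadata is a hypothetical helper name, not part of exiftool.
# Flags, in order:
#   -all=                 delete every tag ...
#   --icc_profile:all     ... except the ICC_Profile group (the colour profile)
#   -tagsfromfile @       then copy tags back from the file itself ...
#   -orientation          ... but only the Orientation tag
#   -overwrite_original   rewrite the file in place instead of keeping a
#                         FILE_original backup next to it
strip_metadata() {
    exiftool -all= --icc_profile:all -tagsfromfile @ -orientation \
        -overwrite_original "$@"
}
```

Something like cp photo.jpg /tmp/scratch.jpg && strip_metadata /tmp/scratch.jpg, followed by a plain exiftool /tmp/scratch.jpg, should then show output like the listing above.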
Let's put it all together into pre-commit: we want a repo-local hook because it's easier to manage, so something like this as .pre-commit-config.yaml (the file name pre-commit looks for by default):
repos:
  - repo: local
    hooks:
      - id: no-spicy-exif
        name: Ban spicy exif data
        description: Ensures that there is no sensitive exif data committed
        language: system
        entry: exiftool -all= --icc_profile:all -tagsfromfile @ -orientation -overwrite_original
        exclude_types: ["svg"]
        types: ["image"]
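A config file on its own does nothing until the hook is installed into the repository, so a one-time setup step is needed. A minimal sketch, assuming pre-commit is already on the PATH; install_and_check is a hypothetical wrapper name, while the two pre-commit subcommands are real:

```shell
# install_and_check is a hypothetical wrapper around two pre-commit subcommands.
install_and_check() {
    # Write the hook into .git/hooks/pre-commit so it runs on every commit.
    pre-commit install || return 1
    # Run the configured hooks once over every tracked file. This reports a
    # failure if any hook had to modify a file, which is the expected outcome
    # if images with metadata were committed before the hook existed.
    pre-commit run --all-files
}
```

Running it once from the repository root is enough; after that, Git invokes the hook by itself on each commit.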
Check with git add .pre-commit-config.yaml image-with-gps.jpg && pre-commit run, and it fails as expected: pre-commit treats a hook that modifies files as a failure, and exiftool has just rewritten the image. If we git add the file again, it will pass, and the file is now devoid of problematic metadata. Success!
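For extra peace of mind, the result can be checked by asking exiftool for any tag whose name starts with GPS and seeing whether anything comes back. This is a sketch under assumptions: has_gps is a made-up helper name, and it relies on exiftool printing nothing on standard output when no requested tag is present.

```shell
# has_gps is a hypothetical helper: it succeeds (exit status 0) if the file
# still carries any tag whose name starts with "GPS".
has_gps() {
    # -s prints short tag names; "-gps*" selects tags by wildcard and is
    # quoted so the shell does not try to expand it as a glob.
    [ -n "$(exiftool -s "-gps*" "$1")" ]
}

# Example use:
#   if has_gps image-with-gps.jpg; then echo "still has GPS data"; fi
```

It only looks at GPS tags, so it is a spot check for the scariest metadata and not a replacement for the hook.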