Ingo Karkat - Git trailers to document project parameters

Git trailers to document project parameters

Posted Tuesday, 19-Nov-2024 by Ingo Karkat

trailers?!

Git trailers are metadata at the end of a commit (or tag) message that follow a <token>: <value> format. They are usually added via git commit --trailer <token>[(=|:)<value>] (configuration can influence the exact ordering, duplicate handling, etc.), but can also be typed or edited directly in the commit message editor. They are typically used for sign-offs, giving credit to co-authors, adding a ticket ID, or naming a target release. Brooke Kuhlmann has a nice article with more details. The placement is both very readable for us humans and amenable to machine processing, especially as git log has format specifiers to extract (all or selected) trailers.

Brooke emphasizes the opportunity to remove all "noise" from the commit subject via trailers. I agree that most metadata is too long and unimportant to clutter the valuable subject line, but wouldn't go so far as banning ticket IDs (e.g. TASK-1234) or keywords like Documentation: Typo:, as they provide crucial categorization [1] that I don't want to be stowed away at the end of a potentially long message and whose direct context I would otherwise have to partially repeat in the subject text. Although I can configure git log to extract some trailers and put them back right at the front where they used to be, there's currently little or no support for trailer extraction in platforms like GitHub and most graphical Git clients. As long as there's no universal support for trailers, I'm going to keep those exceptions at the front of the subject line.
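For illustration, putting a trailer back in front of the subject could look roughly like the following. The pretty-format alias name ticketfirst and the Ticket token are made up for this sketch; pretty.<name> and the %(trailers:…) options are regular Git features.

```shell
# Throwaway repository with one commit whose ticket ID lives in a trailer.
repo=$(mktemp -d)
git -C "$repo" init --quiet
git -C "$repo" -c user.name='Example' -c user.email='example@example.invalid' \
    commit --quiet --allow-empty -m 'Fix spelling in README' -m 'Ticket: TASK-1234'

# Hypothetical format alias: ticket value first, then the subject.
# separator=%x2C joins several values with a comma and drops the newline
# that would otherwise follow each value.
git -C "$repo" config pretty.ticketfirst \
    'format:%(trailers:key=Ticket,valueonly,separator=%x2C) %s'

git -C "$repo" log --pretty=ticketfirst
```

This only helps on the command line, which is exactly the limitation described above.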

my use case

Most of my Git repos are small personal projects. I rarely have the opportunity to credit other people there, and have no need for metadata to control the bare-bones release process. However, most of those little tools and tweaks are developed on the side, often ad-hoc: A short alias added to simplify a repetitive task at work here, a small compatibility fix for the programming tool update I just installed on my dev system there. I use these customizations on my personal notebook, work systems, and VMs; while they all run some sort of Linux distribution (or Cygwin a long time ago), the actual versions vary quite a bit. Likewise, on my main systems I closely follow major tools' releases (e.g. through Ubuntu PPAs), whereas other systems just have the stock vanilla version. So it happens quite frequently that I make a change during the day on my work system, and then in the evening I sync those to my personal notebook, and soon realize that there's a compatibility issue. To fix that, it's super helpful to know the other system's versions, but that computer is off now or even miles away at the office.

Big projects can (and should!) ensure compatibility via a CI/CD pipeline; e.g. through a matrix of test environments in a GitHub workflow. But these small personal pet projects often don't even have automated tests; they only get exercised by me during my interactions on one of my systems (but that can be a lot of coverage!). That's where I got the idea of automatically collecting system and version information and attaching it to every commit.
Git trailers provide the right place and scope for this kind of metadata: when I amend a commit from another system, the differing parameters are added to the existing ones, whereas amends on the same (type of) system add nothing. My git-extensions already support extending built-in Git commands, so with this happening automatically there's no additional parameter for me to remember, and because the added trailers appear in the commit message editor, I can directly review them as part of the editing.

implementation

I basically just had to add a bit of code to my git-commit wrapper that reads the project parameter configuration, runs the configured shell commands, and then adds the corresponding --trailer parameters to the commit command. I've identified two parameters so far:

System-info shows information about the system the change has been developed on; i.e. the operating system version and architecture
Platform-info contains project-specific information; for my extensions, that's the version of the command that is extended, and maybe some additional infrastructure like testing library or command configuration
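The core of such a wrapper can be sketched as follows. This is not the actual git-commit extension: the function names, the TOKEN=COMMAND argument convention, and joining multi-line output with ", " (inferred from the sample commit further down) are all assumptions.

```shell
#!/bin/bash
# Sketch only; names and conventions here are hypothetical.

# Run a configured shell command and join its non-empty output lines with ", ".
trailerValue()
{
    bash -c "$1" 2>/dev/null |
        awk 'NF { printf "%s%s", sep, $0; sep = ", " } END { if (sep != "") print "" }'
}

# Fill the global array trailerArgs with one --trailer argument per
# TOKEN=COMMAND pair whose command produced any output.
collectTrailerArgs()
{
    trailerArgs=()
    local pair token value
    for pair in "$@"; do
        token="${pair%%=*}"                      # text before the first "="
        value="$(trailerValue "${pair#*=}")"     # command after the first "="
        [ -n "$value" ] && trailerArgs+=(--trailer "${token}: ${value}")
    done
    return 0
}

collectTrailerArgs 'system=uname -s; uname -m' 'platform=echo "bash $BASH_VERSION"'
printf '%s\n' "${trailerArgs[@]}"
# A wrapper would finally run something like: git commit "${trailerArgs[@]}" "$@"
```

Commands that print nothing (for example because a tool is not installed) simply contribute no trailer.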

The commands that produce the information can be configured via (in order of precedence):

worktree or local Git configuration commit.trailer.*
environment variable GIT_COMMIT_TRAILER_*
global or system Git configuration commit.trailer.*

Environment variables work well for settings across multiple projects, and can be set in an environment setup script (which my configuration automatically sources when a corresponding directory is entered in the shell). Git configuration can be used for global settings, as well as to tactically override an environment variable for a specific project.
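The lookup order could be implemented along these lines. The function name trailerCommand is hypothetical; the configuration keys and variable names follow the scheme described above.

```shell
#!/bin/bash
# Sketch only: prints the command configured for a parameter such as
# "system" or "platform"; fails if nothing is configured.
trailerCommand()
{
    local name="${1:?parameter name required}" scope value envVar

    # 1. Worktree or local Git configuration wins.
    for scope in --worktree --local; do
        if value=$(git config "$scope" --get "commit.trailer.$name" 2>/dev/null); then
            printf '%s\n' "$value"
            return 0
        fi
    done

    # 2. Then the environment variable, e.g. GIT_COMMIT_TRAILER_SYSTEM.
    envVar="GIT_COMMIT_TRAILER_$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')"
    if [ -n "${!envVar:-}" ]; then
        printf '%s\n' "${!envVar}"
        return 0
    fi

    # 3. Finally the global or system Git configuration.
    for scope in --global --system; do
        if value=$(git config "$scope" --get "commit.trailer.$name" 2>/dev/null); then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1
}

trailerCommand system || echo 'no command configured for system'
```

Querying each scope separately (instead of a plain git config --get) is what lets the environment variable sit between the repository-level and the user-level configuration.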

In addition, I register the two trailer tokens in the Git configuration so that Git knows where to insert them and how to deal with duplicates. With this, the extension can use the short token name (system) instead of the full name (System-info). Done!
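Registering the tokens might look like this. trailer.<token>.key and trailer.<token>.ifExists are standard Git settings; the chosen addIfDifferent policy, the throwaway repository, and the sample values are assumptions, not necessarily the configuration used here.

```shell
# Throwaway repository; in real use this would go into the global configuration.
repo=$(mktemp -d)
git -C "$repo" init --quiet

# Map the short tokens to their full names and only add values that differ
# from the ones already present.
git -C "$repo" config trailer.system.key 'System-info'
git -C "$repo" config trailer.system.ifExists addIfDifferent
git -C "$repo" config trailer.platform.key 'Platform-info'
git -C "$repo" config trailer.platform.ifExists addIfDifferent

# Simulate amending a message that already carries one System-info trailer:
# the identical value should not be repeated, the differing one is appended.
printf 'Subject line\n\nSystem-info: Ubuntu 24.04, x86_64\n' |
    git -C "$repo" interpret-trailers \
        --trailer 'system=Ubuntu 24.04, x86_64' \
        --trailer 'system=Armbian 24.8.4 noble, aarch64'
```

This duplicate handling is what makes amends from a different system accumulate parameters while amends on the same system leave the trailers unchanged.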

example

I currently set the system and platform parameters for all of my personal customizations as environment variables:

$ e GIT_COMMIT_TRAILER_
declare -x GIT_COMMIT_TRAILER_PLATFORM="echo \"\${BASH_VERSION:+bash }\${BASH_VERSION}\""
declare -x GIT_COMMIT_TRAILER_SYSTEM="distro && arch"

The platform one defaults to the Bash version as most of my projects are shell scripts. The Bats version is added if such tests are detected in the project.

For my git-extensions, the working copy has a local configuration that overrides the platform command:

$ git config --get commit.trailer.platform
hub --version; gh --version | head -n 1; git-lfs --version

Thus, any Git commit in my Git extensions now records the system information as well as the currently installed versions of Git, hub, GitHub CLI, and Large File Storage (if installed). A commit looks like this:

commit 33216e5b137ae5e86d69f9352bfd150c6f428a99
Author: Ingo Karkat <swdev@ingo-karkat.de>
Date:   2024-11-14 19:17:09 +0100

    git-do-extensions: BUG: -P short form of --predicate-command is not parsed

    System-info: Armbian 24.8.4 noble, aarch64
    Platform-info: git version 2.47.0, hub version 2.14.2, gh version 2.60.1 (2024-10-25)
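Once recorded, the information can be pulled out again when a compatibility issue shows up on another machine. A small sketch, with a throwaway repository and invented trailer values standing in for real history:

```shell
# Throwaway repository with two commits made on (pretend) different systems.
repo=$(mktemp -d)
git -C "$repo" init --quiet
demoCommit()
{
    git -C "$repo" -c user.name='Example' -c user.email='example@example.invalid' \
        commit --quiet --allow-empty -m "$1" -m "$2"
}
demoCommit 'First change' 'System-info: Ubuntu 24.04, x86_64'
demoCommit 'Second change' 'System-info: Armbian 24.8.4 noble, aarch64'

# One line per commit: abbreviated hash, subject, and the recorded system.
git -C "$repo" log --format='%h %s (%(trailers:key=System-info,valueonly,separator=%x2C))'

# All distinct systems that contributed commits.
git -C "$repo" log --format='%(trailers:key=System-info,valueonly)' | sed '/^$/d' | sort -u
```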

I had to go through all of my extension projects once to configure the corresponding commands (e.g. kubectl-extensions uses kubectl version). Using local Git configuration is the simplest to add and works for me as I synchronize (and therefore reuse a single instance of) working copies across all of my systems. For a more persistent configuration and multi-user scenarios, a local shell environment file checked into the repo would work better. Supporting a .gitautotrailer file (analogous to .gitattributes and .gitignore) would be trivial to add to git-commit, but since I already have environment variables and per-directory configuration, I don't have a need for that right now.

conclusion

Git trailers aren't yet used much, probably because it takes some time for people to find out about them and add them to their workflow. Due to that, tool support is lagging, too. (At least the raw trailer data should be there wherever full commit messages are shown. A tool would have to actively filter them out to get rid of them.) In the Git CLI, trailers are already a nice way to automatically record important context information, and the management of the metadata is less complex than Git notes. I'm sure I'll discover more neat uses over time.

[1] I also find Brooke's five categories (Fixed / Removed / Added / Updated / Refactored) far too limiting. The 68 types that I currently have are probably excessive, but I definitely miss categories for Documentation, Housekeeping and Testing (these types of commits can be safely skipped when bisecting, and that can be a huge timesaver during troubleshooting). His are more geared towards automatic changelog extraction, though I'd still miss Deprecated and Security then. And I'm personally not a fan of generating the changelog from commits, as it discourages fine-grained commits. To me, if it's worth providing a changelog, then it should be treated like any other form of documentation, and hand-crafted for maximum benefit of the readers. But I digress…

Ingo Karkat, 19-Nov-2024

ingo's blog is licensed under Attribution-ShareAlike 4.0 International