Using mailmap to Tidy Git Contributors

Do you ever contribute to a Git repository from different machines? Yeah, you probably do. Sometimes you’re on your work machine. Other times you’re on your personal laptop. Or your gaming desktop. And you might have a different Git identity on each of those. And this means that your Git log ends up looking a bit messy. Who are all of these people with similar names but different email addresses? A .mailmap file can be used to tidy things up.

The Problem

If you work on multiple machines then you might have a different Git identity set up on each of them. For example, here are my (fictional) configurations across three different machines:

# ~/.gitconfig on work machine.
[user]
        name = Andrew Collier
        email = andrew@fathomdata.dev
# ~/.gitconfig on personal laptop.
[user]
        name = Andrew B. Collier
        email = andrew@personal-laptop.org
# ~/.gitconfig on gaming desktop.
[user]
        name = Andrew
        email = datawookie@gaming-beast.com

Now if I were to contribute to a repository from each of these machines then I might end up with logs that look like this:

commit 8afad98ab02d9daa92b543a8bf4356f3b7729921 (HEAD -> master)
Author: Andrew <datawookie@gaming-beast.com>
Date:   Mon May 1 06:02:24 2023 +0100

    chore: Add requirements.txt with initial dependencies

commit eb65cdc168157874768431dd5e9e7f2e130e80ce
Author: Andrew B. Collier <andrew@personal-laptop.org>
Date:   Mon May 1 05:59:59 2023 +0100

    feat: Add scraping script (minimal version)

commit 6081bdf3edbd7132242937461f7b23bf954ecf26
Author: Andrew Collier <andrew@fathomdata.dev>
Date:   Mon May 1 05:58:35 2023 +0100

    feat: Virgin repository; added README

It looks like these are from three distinct people, but they are all just different versions of me.

The Solution

To fix this I can create a .mailmap file in the repository. Suppose that I choose to consolidate on my work persona. Then the .mailmap would look like this:

Andrew Collier <andrew@fathomdata.dev> Andrew B. Collier <andrew@personal-laptop.org>
Andrew Collier <andrew@fathomdata.dev> <datawookie@gaming-beast.com>

Each line of this file consists of a canonical identity followed by a contributor identity, where each identity is an email address optionally preceded by a name. When the author or committer details of a commit match a contributor identity, they are mapped to the corresponding canonical identity.

With this .mailmap file in place the logs now look like this:

commit 8afad98ab02d9daa92b543a8bf4356f3b7729921
Author: Andrew Collier <andrew@fathomdata.dev>
Date:   Mon May 1 06:02:24 2023 +0100

    chore: Add requirements.txt with initial dependencies

commit eb65cdc168157874768431dd5e9e7f2e130e80ce
Author: Andrew Collier <andrew@fathomdata.dev>
Date:   Mon May 1 05:59:59 2023 +0100

    feat: Add scraping script (minimal version)

commit 6081bdf3edbd7132242937461f7b23bf954ecf26
Author: Andrew Collier <andrew@fathomdata.dev>
Date:   Mon May 1 05:58:35 2023 +0100

    feat: Virgin repository; added README

The commit SHAs do not change because .mailmap entries do not modify the contents of the commits. They only affect the way that the author and committer details of those commits are presented.
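
To see this for yourself, here is a minimal sketch that rebuilds the scenario in a throwaway repository. The commit_as helper is my own invention for this illustration, and the script assumes that git and a POSIX shell are available. It compares the stored author fields (%an, %ae) with the mapped ones (%aN, %aE) and records the commit SHAs before and after the .mailmap file appears.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing outside this directory is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: create an empty commit under a given identity.
commit_as() {
    GIT_AUTHOR_NAME="$1" GIT_AUTHOR_EMAIL="$2" \
    GIT_COMMITTER_NAME="$1" GIT_COMMITTER_EMAIL="$2" \
        git commit -q --allow-empty -m "$3"
}

commit_as "Andrew Collier" andrew@fathomdata.dev "feat: Virgin repository; added README"
commit_as "Andrew B. Collier" andrew@personal-laptop.org "feat: Add scraping script (minimal version)"
commit_as "Andrew" datawookie@gaming-beast.com "chore: Add requirements.txt with initial dependencies"

shas_before=$(git log --format='%H')

# Same entries as in the .mailmap shown above.
cat > .mailmap <<'EOF'
Andrew Collier <andrew@fathomdata.dev> Andrew B. Collier <andrew@personal-laptop.org>
Andrew Collier <andrew@fathomdata.dev> <datawookie@gaming-beast.com>
EOF

shas_after=$(git log --format='%H')

# %an/%ae print the identity stored in the commit.
# %aN/%aE print the identity after .mailmap has been applied.
raw_authors=$(git log --format='%an <%ae>' | sort -u)
mapped_authors=$(git log --format='%aN <%aE>' | sort -u)

echo "Stored identities:"
echo "$raw_authors"
echo "Mapped identities:"
echo "$mapped_authors"
```

Note that the .mailmap in this sketch is never committed. Git picks it up from the top level of the working tree, so you can experiment with entries before adding the file to the repository.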

Other Uses

You can use a .mailmap file to handle contributions from multiple people operating on a shared email address. Suppose, for example, that Jane and Joe both make contributions from the communal email address hero@fathomdata.dev. You’d then see entries that looked like this:

commit 92c55945dc9c2756bbaf1ff1233d0d01cd1539b5 (HEAD -> master)
Author: Joe <hero@fathomdata.dev>
Date:   Mon May 1 06:20:24 2023 +0100

    fix: Replace CSS with XPath

commit 71a16c4747868e17addb1aae4cec8ff3aeed9813
Author: Jane <hero@fathomdata.dev>
Date:   Mon May 1 06:19:42 2023 +0100

    fix: Update CSS selector

These .mailmap entries can clarify those commits.

Jane Brown <jane@fathomdata.dev> Jane <hero@fathomdata.dev>
Joe Smith <joe@fathomdata.dev> Joe <hero@fathomdata.dev>

Now the commit history looks like this:

commit 92c55945dc9c2756bbaf1ff1233d0d01cd1539b5 (HEAD -> master)
Author: Joe Smith <joe@fathomdata.dev>
Date:   Mon May 1 06:20:24 2023 +0100

    fix: Replace CSS with XPath

commit 71a16c4747868e17addb1aae4cec8ff3aeed9813
Author: Jane Brown <jane@fathomdata.dev>
Date:   Mon May 1 06:19:42 2023 +0100

    fix: Update CSS selector

Checking Mailmap

The check-mailmap command can be used to check whether a particular contributor identity is mapped via a mailmap entry.

git check-mailmap "Jane <hero@fathomdata.dev>"
Jane Brown <jane@fathomdata.dev>

Jane’s contributions will be remapped to her personal identity. But contributions from Alice, who doesn’t appear in .mailmap, still retain the communal email address.

git check-mailmap "Alice <hero@fathomdata.dev>"
Alice <hero@fathomdata.dev>
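
If you have several identities to check, the --stdin option of check-mailmap reads one contact per line. The sketch below is an illustration under my own assumptions (a throwaway repository, a scratch file called contacts.txt, and a mix of the fictional entries used in this post). It also contrasts an entry that matches on name and email with one that matches on email alone.

```shell
#!/bin/sh
set -eu

# Throwaway repository; check-mailmap reads .mailmap from its top level.
repo=$(mktemp -d)
cd "$repo"
git init -q .

cat > .mailmap <<'EOF'
# Match on name and email: needed when several people share one address.
Jane Brown <jane@fathomdata.dev> Jane <hero@fathomdata.dev>
Joe Smith <joe@fathomdata.dev> Joe <hero@fathomdata.dev>
# Match on email only: any name using this address is rewritten.
Andrew Collier <andrew@fathomdata.dev> <datawookie@gaming-beast.com>
EOF

# Contacts to resolve, one per line (contacts.txt is a hypothetical scratch file).
cat > contacts.txt <<'EOF'
Jane <hero@fathomdata.dev>
Joe <hero@fathomdata.dev>
Alice <hero@fathomdata.dev>
Anyone <datawookie@gaming-beast.com>
EOF

# Resolve every contact in a single call.
resolved=$(git check-mailmap --stdin < contacts.txt)
echo "$resolved"
```

Jane and Joe should come back with their personal identities and Alice should be unchanged, as above. The last contact should resolve to Andrew Collier even though the name "Anyone" appears nowhere in .mailmap, because that entry matches on the email address alone.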

Mailmap and Other Commands

Mailmap entries can also affect other Git commands.

The cat-file command can be used to get information about repository objects (identified by their SHA). Here the object is a commit.

git cat-file -p 92c55945dc9c2756bbaf1ff1233d0d01cd1539b5
tree 73585d4062a8cbfb22c3438f80f40c7f5cb00ae5
parent 71a16c4747868e17addb1aae4cec8ff3aeed9813
author Joe <hero@fathomdata.dev> 1682918424 +0100
committer Joe <hero@fathomdata.dev> 1682918424 +0100

fix: Replace CSS with XPath

Joe’s commit still uses the communal email address. Why? As mentioned above, the commit history is not modified by .mailmap. If we want to apply .mailmap entries to the output from cat-file then simply provide the --mailmap (or --use-mailmap) flag.

git cat-file --mailmap -p 92c55945dc9c2756bbaf1ff1233d0d01cd1539b5
tree 73585d4062a8cbfb22c3438f80f40c7f5cb00ae5
parent 71a16c4747868e17addb1aae4cec8ff3aeed9813
author Joe Smith <joe@fathomdata.dev> 1682918424 +0100
committer Joe Smith <joe@fathomdata.dev> 1682918424 +0100

fix: Replace CSS with XPath
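
Another command that honours .mailmap is shortlog, which is not covered above but is handy for counting commits per contributor. This is a rough sketch using the three fictional identities from earlier in a throwaway repository. The commit_as helper is hypothetical, and HEAD is passed explicitly because shortlog reads from standard input when it is given no revision and is not attached to a terminal.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing outside this directory is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: create an empty commit under a given identity.
commit_as() {
    GIT_AUTHOR_NAME="$1" GIT_AUTHOR_EMAIL="$2" \
    GIT_COMMITTER_NAME="$1" GIT_COMMITTER_EMAIL="$2" \
        git commit -q --allow-empty -m "$3"
}

commit_as "Andrew Collier" andrew@fathomdata.dev "first"
commit_as "Andrew B. Collier" andrew@personal-laptop.org "second"
commit_as "Andrew" datawookie@gaming-beast.com "third"

# -s: counts only, -n: sort by count, -e: include email addresses.
before=$(git shortlog -sne HEAD)

cat > .mailmap <<'EOF'
Andrew Collier <andrew@fathomdata.dev> Andrew B. Collier <andrew@personal-laptop.org>
Andrew Collier <andrew@fathomdata.dev> <datawookie@gaming-beast.com>
EOF

after=$(git shortlog -sne HEAD)

echo "Without .mailmap:"
echo "$before"
echo "With .mailmap:"
echo "$after"
```

Without the .mailmap file the summary should list three contributors with one commit each. With it, all three commits should be credited to a single identity.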