linuxlab.io
Tutorials▾
  • Linux & networking
    File system, processes, TCP/IP, BGP and OSPF
    →
  • Terraform & IaC
    HCL, state, plan/apply on a LocalStack sandbox
    →
  • Git & GitHub
    Object model, plumbing, branching, GitHub Actions
    →
All tutorials →
PricingAboutSign inCreate account
/
Intro
Lessons
Footer
linuxlab-TutorialsPricingAboutPrivacy & cookies
Copyright © 2026 LinuxLab. All rights reserved.

kb/security ── Security ── advanced

git filter-repo: Rewriting History

The modern replacement for the deprecated `git filter-branch`. Rewrites history in place: removes files, changes author emails, replaces strings. Use it to remove secrets or large binaries that landed in the repo.

aka: filter-repo, rewrite-history

git filter-repo is a standalone tool (not part of Git core; install it separately). It rewrites history completely: every commit is recreated with a different tree or metadata. Think of it as surgery. Use with care.

It replaces git filter-branch, which Git's own documentation has warned against since Git 2.24. filter-branch is slow and easy to misuse. filter-repo is fast (written in Python, built on git fast-export and git fast-import) and safer: by default it refuses to run anywhere but a fresh clone.

Installation

bash
# macOS
brew install git-filter-repo
# Debian/Ubuntu
apt install git-filter-repo
# Pip
pip install git-filter-repo
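
To confirm that the install worked, a minimal sketch like the following can help. It assumes the package put an executable named git-filter-repo on your PATH, which is how the package managers above normally install it; the function names are hypothetical.

```shell
# Succeeds if an executable named git-filter-repo is on PATH.
have_filter_repo() {
    command -v git-filter-repo >/dev/null 2>&1
}

# Print the installed version, or a hint on stderr if the tool is missing.
check_filter_repo() {
    if have_filter_repo; then
        git filter-repo --version
    else
        echo "git-filter-repo not found on PATH" >&2
        return 1
    fi
}
```

Call check_filter_repo once after installing. If it reports that the tool is missing, the directory your package manager installs scripts into is probably not on PATH.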

Main uses

Remove a file from all history

The typical scenario is an accidentally committed secret. See secret-scanning: rotate the key first, then clean the history.

bash
git filter-repo --path secrets.env --invert-paths

--invert-paths means "delete the matches." The command removes secrets.env from every commit in history.
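
After the rewrite it is worth checking that the path really is gone from every ref. The sketch below uses only plain git log; the function names are hypothetical and secrets.env is the example path from above.

```shell
# List every commit (on any ref) that touched the given path.
# Empty output means the path no longer appears anywhere in history.
commits_touching_path() {
    git log --all --format=%H -- "$1"
}

# Succeeds only if no commit on any ref touches the path.
path_absent_from_history() {
    [ -z "$(commits_touching_path "$1")" ]
}
```

Run path_absent_from_history secrets.env inside the rewritten clone. If it fails, commits_touching_path secrets.env shows which commits still contain the file.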

Replace a string

If you need to scrub a specific value (an API key, say) rather than a whole file:

bash
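# Create a file with replacement rules, one per line: MATCH==>REPLACEMENT
# Rules are literal by default; "regex:" and "glob:" prefixes change that.
# The quoted 'EOF' stops the shell from expanding anything inside the rules.
cat > replace.txt <<'EOF'
AKIAIOSFODNN7EXAMPLE==>[REMOVED]
literal:my-secret-password==>[REMOVED]
regex:ghp_[a-zA-Z0-9]{36}==>[REMOVED]
EOF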
git filter-repo --replace-text replace.txt

This rewrites the contents of files in every commit, substituting matching strings with the placeholder. Commit messages are not touched; they have a separate option, --replace-message.
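
To find where a literal value entered history, or to confirm afterwards that it is gone, Git's pickaxe search (git log -S) is one option. This is a sketch with a hypothetical function name. It only covers literal strings, so a regex rule such as the ghp_ one needs git log -G instead.

```shell
# List commits (on any ref) whose changes added or removed an occurrence
# of the given literal string in file contents.
commits_changing_string() {
    git log --all --format=%H -S"$1"
}
```

Before the rewrite, commits_changing_string AKIAIOSFODNN7EXAMPLE lists the commits that introduced or removed the key. After a successful rewrite the output should be empty.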

Remove a large file

Someone committed a 500 MB dataset and the repo ballooned. Strip every blob above a size threshold from history (this removes all files over 100 MB, not just the dataset):

bash
git filter-repo --strip-blobs-bigger-than 100M
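
To see what a size threshold would catch, you can list the largest blobs in history first. The pipeline below is a common plain-git sketch, not part of filter-repo; the function name is hypothetical.

```shell
# Print the N largest blobs reachable from any ref as "<size-bytes> <path>",
# largest first. N defaults to 10.
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        cut -d' ' -f2- |
        sort -n -r |
        head -n "${1:-10}"
}
```

largest_blobs 5 prints the five biggest files ever committed, with their paths, which helps you pick either a threshold or an exact path to remove.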

Or by path:

bash
git filter-repo --path dataset.csv --invert-paths

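filter-repo normally expires reflogs and repacks at the end of a run, so .git/ should shrink on its own. If it is still physically large (for example after a run with --partial, which skips that cleanup), reclaim the space manually: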

bash
git reflog expire --expire=now --all
git gc --aggressive --prune=now
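
To check whether space was actually reclaimed, compare the size of the object store before and after. This sketch reads the size and size-pack fields that git count-objects -v reports in KiB; the function name is hypothetical.

```shell
# Print the total size of loose and packed objects in KiB, summing the
# "size" and "size-pack" fields of `git count-objects -v`.
object_store_kib() {
    git count-objects -v |
        awk '$1 == "size:" || $1 == "size-pack:" { total += $2 } END { print total + 0 }'
}
```

Record object_store_kib before the rewrite and again after the gc. A number that barely moves usually means some ref still points at the old history.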

Change an email across all history

You committed with a personal email and want it to show your work address:

bash
cat > mailmap.txt <<EOF
Your Name <work@company.com> <personal@gmail.com>
EOF
git filter-repo --mailmap mailmap.txt

Every commit authored or committed as personal@gmail.com is rewritten to use work@company.com. Commit SHAs change.
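
To confirm the old address is gone, you can inspect the author and committer emails across all refs. This is a sketch using plain git log, with hypothetical function names.

```shell
# Count commits per author email across all refs, most frequent first.
author_email_counts() {
    git log --all --format=%ae | sort | uniq -c | sort -n -r
}

# Succeeds if at least one commit on any ref still carries the given
# email as author or committer.
email_in_history() {
    git log --all --format='%ae%n%ce' | grep -x -F -e "$1" >/dev/null
}
```

After the rewrite, email_in_history personal@gmail.com should fail, and author_email_counts should show the work address instead.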

After filter-repo

The command rewrites history: every commit SHA changes. The consequences:

  • All clones are now stale. Nobody can just git pull: the histories have diverged. Everyone needs a fresh clone.
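  • PRs and issues referencing old SHAs break. Those commits no longer exist in the rewritten history, although a hosting service may keep the old objects reachable for a while.
  • Force push to the remote. filter-repo removes the origin remote after a run, so re-add it first. Be careful: branch protection normally blocks force pushes and must be temporarily disabled.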
  • A backup is required. Before running: git clone <url> backup-clone. If something goes wrong, you have a restore point.

These consequences make filter-repo a "once a year" operation, not a daily one. It is usually a team effort, done with coordination and advance notice.
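
The backup and force-push steps can be sketched as two small helpers. The function names are hypothetical and the URL is whatever your remote is. The sketch uses a mirror clone for the backup so that all refs are captured, and it re-adds origin because filter-repo removes that remote after a run.

```shell
# Make a full mirror backup of a remote before rewriting.
# $1 = remote URL or path, $2 = destination directory for the mirror
backup_remote() {
    git clone --quiet --mirror "$1" "$2"
}

# Publish rewritten history from the current repo.
# $1 = remote URL or path
publish_rewrite() {
    # Drop any existing origin, then point it at the target again.
    git remote remove origin 2>/dev/null || true
    git remote add origin "$1"
    # Overwrite all branches, then all tags, on the remote.
    git push --quiet --force origin --all
    git push --quiet --force origin --tags
}
```

Run backup_remote <url> backup.git before touching anything, and publish_rewrite <url> only after the team has been told to stop pushing. Tags that no longer exist locally are not deleted on the server by this push.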

Alternatives

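  • BFG Repo-Cleaner is a JVM tool (written in Scala) that is also fast on large repos. It is less flexible: it deletes files by name or size and replaces text (--replace-text), but it cannot filter by full path or rewrite author metadata.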
  • Just rotate and move on. If the secret was in a public repo, it is already compromised. Cleaning the history does not undo the exposure. Sometimes the right call is to rotate the key, learn the lesson, and leave the history alone.

Pitfalls

  • filter-repo only works on a fresh clone by default. Run it in a repo that does not look freshly cloned (extra remotes, local reflog history, uncommitted changes) and it will refuse. You can pass --force, but that is a signal to stop and think.
  • Submodules are preserved by filter-repo, but if pointers change you may need separate work inside each submodule. See detached-head.
  • Very large repos (tens of GB) can take hours to filter. Run with nohup or inside a screen session.
  • Tags are rewritten locally but not in others' clones. On your machine, filter-repo updates refs/tags: a tag pointing to a rewritten commit starts pointing to the new SHA; a tag on a deleted commit disappears. But colleagues who cloned earlier keep their old tags locally. git fetch does not remove them automatically. They need git fetch --prune --prune-tags or a fresh clone. On the server, delete stale tags explicitly: git push origin :refs/tags/<tag>.
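
For colleagues who keep their existing clone, the tag cleanup above can be wrapped in a small helper. This is a sketch with a hypothetical function name. It adds --force on the assumption that recent Git versions refuse to overwrite an existing local tag during fetch without it.

```shell
# Make local tags match the remote: fetch tags, delete local tags that no
# longer exist on the remote, and (via --force) accept tags that moved.
# Note: this also deletes tags that were only ever created locally.
# $1 = remote name, defaults to origin
sync_tags_with_remote() {
    git fetch --quiet --prune --prune-tags --force "${1:-origin}"
}
```

Run sync_tags_with_remote in the old clone after the server-side tags have been cleaned up. Branches still need to be reset to the rewritten history separately.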

§ Commands

bash
git filter-repo --path file.env --invert-paths

Remove a file from all history

bash
git filter-repo --replace-text replace.txt

Replace strings across all commits

bash
git filter-repo --strip-blobs-bigger-than 100M

Remove all large files from history

bash
git reflog expire --expire=now --all && git gc --aggressive --prune=now

Reclaim disk space after filter-repo

§ See also

  • secret-scanning: Secret Scanning in a Repository. Scan your repo regularly for accidentally committed secrets (API keys, passwords, tokens). The main tools: gitleaks, detect-secrets, trufflehog. The best time to catch them is before the commit, with a pre-commit hook. After an exposure, key rotation is non-negotiable. History cleanup is optional.
  • gitignore: .gitignore. A file in the repo root listing ignore patterns: what Git should skip entirely. Do not confuse it with staging. It has no effect on already-tracked files. It is your primary defense against accidentally committing secrets and junk.
  • rebase: git rebase. Rewrites the commits of a branch so they descend from a different commit. Each commit gets a new SHA; history becomes linear. Safe only on branches that no one else has seen.