linuxlab.io
Copyright © 2026 LinuxLab. All rights reserved.

kb/commands ── Commands ── intermediate

xargs and find -exec: bulk operations

Two ways to apply a command to a set of files: `find ... -exec cmd {} +` (inside find) and `... | xargs cmd` (via pipe). For safety with spaces and special characters, use `find -print0 | xargs -0`.

aka: xargs, find-exec, find-print0, xargs-0, parallel-xargs

Why not just a bash loop

Say you want to compress all .log files older than 7 days. A naive bash approach:

bash
for f in $(find /var/log -name '*.log' -mtime +7); do
    gzip "$f"
done

Three problems:

  • Word splitting on spaces. A file named my app.log splits into two arguments.
  • A separate fork per file. With thousands of files, this is slow.
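  • Unquoted expansion. The entire find output is buffered in memory before the first iteration, and any glob characters in a name (*, ?, [) are expanded by the shell.

find -exec and xargs -0 avoid all of this: each path is passed as one argument, calls are batched, and every batch is kept under the kernel's ARG_MAX limit automatically.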
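The word-splitting problem is easy to reproduce in a throwaway directory. The sketch below is only an illustration: the temporary directory and the file name my app.log are made up for the demo, and it assumes mktemp -d returns a path without spaces.

```shell
# Sandbox: a temporary directory holding one file with a space in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/my app.log"

# Naive loop: the unquoted $(...) is split on whitespace,
# so one path becomes two words.
loop_words=0
for f in $(find "$demo_dir" -name '*.log'); do
    loop_words=$((loop_words + 1))
done

# find -exec hands the path over as a single argument;
# the inner sh reports how many arguments it received.
exec_args=$(find "$demo_dir" -name '*.log' -exec sh -c 'echo "$#"' _ {} +)

echo "loop: $loop_words words, -exec: $exec_args argument(s)"
rm -rf "$demo_dir"
```

The loop iterates twice for a single file, while find -exec passes exactly one argument.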

find -exec: two forms

\;: one file at a time

bash
find /var/log -name '*.log' -mtime +7 -exec gzip {} \;

Runs gzip separately for each file. {} is substituted with the path. \; is a literal semicolon (escaped from bash).

Downside: 1000 files = 1000 forks = slow.

+: batch mode

bash
find /var/log -name '*.log' -mtime +7 -exec gzip {} +

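Accumulates arguments and calls gzip once with a batch of files (or several times if the batch would exceed ARG_MAX). On large file sets this is typically far faster than \; because a process is started once per batch instead of once per file.

Use + whenever the command accepts multiple files as trailing arguments (gzip, rm, chmod, chown). {} must be the last argument before +, so a command that takes its destination last, such as cp, needs the GNU form cp -t DIR {} +. Use \; only when you need one file at a time, for example inside a shell construct with a condition.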
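To see the difference between the two forms, you can count how many times the command is started. This is a minimal sketch on three made-up files in a temporary directory; sh -c 'echo "$#"' stands in for the real command and prints how many paths each invocation received.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/a.log" "$demo_dir/b.log" "$demo_dir/c.log"

# \; form: sh is started once per file, so it prints one line per file.
runs_semicolon=$(find "$demo_dir" -name '*.log' \
    -exec sh -c 'echo "$#"' _ {} \; | wc -l | tr -d ' ')

# + form: sh is started once and receives all three paths as arguments.
runs_plus=$(find "$demo_dir" -name '*.log' \
    -exec sh -c 'echo "$#"' _ {} + | wc -l | tr -d ' ')

echo "one-at-a-time form: $runs_semicolon runs, batch form: $runs_plus run(s)"
rm -rf "$demo_dir"
```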

xargs: the pipe form

bash
find /var/log -name '*.log' -mtime +7 | xargs gzip

For filenames without whitespace, quotes, or backslashes, this is equivalent to find ... -exec gzip {} +. By default xargs batches arguments and keeps each command line under ARG_MAX on its own.

Why both exist: xargs reads from any pipe, not just find, and has more options (parallelism, mid-command placeholder substitution). find -exec is shorter for simple cases.
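Because xargs only cares about its stdin, the input does not have to come from find. A small sketch with hypothetical host names fed from printf:

```shell
# Input from a non-find source: three made-up host names, one per line.
# xargs appends all of them to a single echo invocation.
result=$(printf 'web1\nweb2\ndb1\n' | xargs echo hosts:)
echo "$result"
```

All three items end up as arguments of one echo call, which is the default batching behavior.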

Important: -print0 / -0 for safety

The default xargs delimiter is whitespace (blanks and newlines). A file named foo bar.log gets split in two, and xargs tries to process foo and bar.log as separate paths. Quotes and backslashes are also treated specially.

The fix is to use a null byte as the delimiter:

bash
find /var/log -name '*.log' -print0 | xargs -0 gzip

  • find -print0 prints names separated by \0 instead of \n.
  • xargs -0 reads input split on \0.

A filename cannot contain \0 (it is the C string terminator), so this pipeline is always correct.

Rule: if you are not 100% certain that filenames have no spaces or quotes, always use -print0 | xargs -0. Or use find -exec ... +.
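The effect of the delimiter can be checked by counting the arguments the command actually receives. A sketch, assuming GNU or BSD find and xargs (both support -print0 and -0); the file name is invented for the demo.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/foo bar.log"

# Default delimiter: the single path is split at the space into two arguments.
unsafe_args=$(find "$demo_dir" -name '*.log' | xargs sh -c 'echo "$#"' _)

# Null delimiter: the path arrives intact as one argument.
safe_args=$(find "$demo_dir" -name '*.log' -print0 | xargs -0 sh -c 'echo "$#"' _)

echo "without -0: $unsafe_args arguments, with -0: $safe_args argument"
rm -rf "$demo_dir"
```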

xargs: useful options

| Flag | What it does |
|---|---|
| -0 | use \0 as delimiter (for find -print0) |
| -I {} | place {} anywhere in the command |
| -n N | pass N arguments per invocation |
| -P N | run up to N parallel processes |
| -r / --no-run-if-empty | skip the command if input is empty |
| -t | trace: print each command before running it |
| --max-args=1 | same as -n 1 |
| -d $'\n' | custom delimiter (e.g., newline only) |
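As a quick illustration of -r from the table, the sketch below feeds xargs an empty input and then a non-empty one. It assumes an xargs that accepts -r (GNU does; BSD accepts it for compatibility).

```shell
# Empty input with -r: the command is never started, so nothing is printed.
with_r=$(printf '' | xargs -r echo "command ran" | wc -l | tr -d ' ')

# Non-empty input with -r behaves as usual.
with_input=$(printf 'a b\n' | xargs -r echo "command ran:")

echo "lines printed on empty input with -r: $with_r"
echo "$with_input"
```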

Substituting into the middle of a command: -I

bash
ls *.tar.gz | xargs -I {} mv {} /backup/{}.bak
# or
cat hosts.txt | xargs -I HOST ssh HOST 'uptime'

By default xargs appends arguments at the end of the command. With -I it inserts them wherever you put the placeholder. This is useful when you need the filename both before and after (for example mv X /dst/X.bak).

The cost: -I runs the command once per input item (in GNU xargs it implies -L 1 and splits input on newlines rather than on blanks), so you are back to one invocation per argument.
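A null-safe variant of the mv pattern can be sketched as follows. The directory layout and file names are hypothetical; printf '%s\0' emits one null-terminated name per file so that -0 and -I can be combined.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/backup"
touch "$demo_dir/a.tar.gz" "$demo_dir/b c.tar.gz"

# The shell expands the glob inside the sandbox; -I {} then runs mv once
# per name and substitutes the name in both places.
(
    cd "$demo_dir" || exit 1
    printf '%s\0' *.tar.gz | xargs -0 -I {} mv {} backup/{}.bak
)

moved=$(find "$demo_dir/backup" -name '*.bak' | wc -l | tr -d ' ')
echo "moved $moved archives"
rm -rf "$demo_dir"
```

Both files are moved, including the one with a space in its name.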

Parallelism: -P

bash
find . -name '*.log' -print0 | xargs -0 -n1 -P4 gzip

▸4 gzip processes running in parallel

Useful when the task is CPU-bound and you have multiple cores. For I/O-bound work you can set -P higher than the core count.

Note: with -P > 1, output order is unpredictable. Multiple processes write to stdout interleaved.
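A self-contained way to try -P is to compress a handful of generated files. This sketch assumes gzip is installed and that xargs supports -P (GNU and BSD do); the eight log files are created only for the demo.

```shell
demo_dir=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8; do
    echo "line $i" > "$demo_dir/app $i.log"
done

# One file per gzip process (-n1), up to four processes at a time (-P4).
find "$demo_dir" -name '*.log' -print0 | xargs -0 -n1 -P4 gzip

gz_count=$(find "$demo_dir" -name '*.log.gz' | wc -l | tr -d ' ')
left_uncompressed=$(find "$demo_dir" -name '*.log' | wc -l | tr -d ' ')
echo "compressed: $gz_count, left uncompressed: $left_uncompressed"
rm -rf "$demo_dir"
```

xargs waits for all workers before it exits, so the counts are taken only after every gzip has finished.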

Batch size: -n

bash
echo {1..100} | xargs -n 5 echo

▸1 2 3 4 5

▸6 7 8 9 10

▸...

Useful when the command accepts only a limited number of arguments, or when you want to see progress in groups.
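When the item count is not a multiple of N, the last batch is simply shorter. A small sketch that prints the size of each batch for 12 items and -n 5 (seq and paste are assumed to be available, as on any coreutils system):

```shell
# Each invocation of the inner sh prints how many arguments it received;
# paste joins the per-batch counts onto one line.
batch_sizes=$(seq 1 12 | xargs -n 5 sh -c 'echo "$#"' _ | paste -s -d ' ' -)
echo "$batch_sizes"
```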

Common patterns

Delete old logs

bash
find /var/log -name '*.log.gz' -mtime +30 -delete
# or
find /var/log -name '*.log.gz' -mtime +30 -print0 | xargs -0 rm

-delete is built into find and faster than xargs (no rm fork). Use it when deletion is all you need.
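Since -delete is irreversible, one cautious workflow is to run the same expression with -print first and only then swap in -delete. The sketch below does this in a temporary directory; the file names and the 2020 timestamp are made up so that one file is older than 30 days.

```shell
demo_dir=$(mktemp -d)
touch -t 202001010000 "$demo_dir/old.log.gz"   # backdated far beyond 30 days
touch "$demo_dir/new.log.gz"                   # modified just now

# Dry run: same tests, but only print what would be removed.
preview=$(find "$demo_dir" -name '*.log.gz' -mtime +30 -print)

# Real run: -delete comes after the tests, so it only acts on matches.
find "$demo_dir" -name '*.log.gz' -mtime +30 -delete

remaining=$(ls "$demo_dir")
echo "would delete: $preview"
echo "remaining: $remaining"
rm -rf "$demo_dir"
```

Keep -delete at the end of the expression: find evaluates left to right, and an action placed before the tests applies to every file it visits.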

Search for content only in recent files

bash
find /var/log -name '*.log' -mtime -1 -print0 | xargs -0 grep -l 'ERROR'

grep -l prints only the names of files where the pattern was found.

Run an SSH command on a list of hosts

bash
cat servers.txt | xargs -I {} -P 10 ssh {} 'systemctl restart myapp'

▸10 parallel SSH sessions

Kill processes by pattern

bash
ps aux | grep '[m]yapp' | awk '{print $2}' | xargs kill

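Without [m], grep matches its own process line. A cleaner alternative: pkill -f myapp, which matches against the full command line and sends the signal from a single process instead of a four-process pipeline. Plain pkill myapp matches the process name only.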

Change mode or owner on many files

bash
find /srv/upload -type f -print0 | xargs -0 chmod 644
find /srv/upload -type d -print0 | xargs -0 chmod 755

Convert many images in parallel

bash
find . -name '*.png' -print0 | xargs -0 -P"$(nproc)" -I IMG \
    convert IMG -resize 50% IMG.thumb.png

$(nproc) returns the core count. -I already runs one command per input item, so -n1 is not needed; GNU xargs treats -n and -I as mutually exclusive and warns if both are given.

Pitfalls

  1. xargs without -r runs the command even on empty input:

    bash
    find . -name '*.tmp' | xargs rm        # if find found nothing, rm runs with no arguments

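    With GNU xargs (Linux) the command still runs once: rm prints "missing operand" and exits non-zero, which is harmless here but can abort a script running under set -e and can be destructive with other commands. BSD xargs (including macOS) does not run the command on empty input. Safe form: xargs -r rm. Alternatively, use find -exec ... +, which never runs on empty input.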

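  2. Quotes and xargs: ', " and \ are special to xargs. Without -0, the input "foo bar" (including the quotes) is parsed as the single token foo bar with the quotes stripped, and a name such as it's.log fails with an unmatched-quote error. That is not what you want when reading a file of paths. Always use -0, or -d $'\n' with GNU xargs.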

  3. find -exec ... vs find ... | xargs: for a single command the result is the same, but find -exec is shorter and always null-safe (find passes paths directly via argv). Use it when you do not need a pipe.

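  4. {} inside a shell construct: do not embed it in the script string:

    bash
    find . -name '*.log' -exec sh -c 'echo Processing {}' \;

    ▸GNU find splices the path into the script text; other implementations may leave {} untouched

    Either way it is wrong. POSIX only guarantees substitution when {} is an argument on its own, and where the path is spliced in, a name containing $(...) or ; is executed as shell code. Pass the path as a positional argument instead: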

    bash
    find . -name '*.log' -exec sh -c 'echo "Processing $1"' _ {} \;

    _ is a dummy $0; {} becomes $1.
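The positional-argument form also works with + and a loop over "$@", which keeps the batching. The sketch below uses a deliberately hostile, made-up file name to show that the name is treated as data and never evaluated.

```shell
demo_dir=$(mktemp -d)
# Hypothetical hostile name: if it were spliced into the script text,
# the shell would run `touch INJECTED`.
touch "$demo_dir/\$(touch INJECTED).log"

output=$(
    cd "$demo_dir" || exit 1
    find . -name '*.log' -exec sh -c '
        for f in "$@"; do
            printf "Processing %s\n" "$f"
        done
    ' _ {} +
)

# If the name had been evaluated, a file called INJECTED would now exist.
if [ -e "$demo_dir/INJECTED" ]; then injected=yes; else injected=no; fi
echo "$output"
echo "injected: $injected"
rm -rf "$demo_dir"
```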

When not to use xargs

  • Complex per-file shell logic. Write a plain while read loop (see bash-scripting):

    bash
    find . -type f -name '*.log' -print0 | while IFS= read -r -d '' f; do
        if [[ -s "$f" ]]; then
            echo "Non-empty: $f"
        fi
    done

  • You need arrays or maps. Rewrite in Python.

  • Parallel execution with result handling. Use GNU parallel, which covers the same ground as xargs -P and adds ordered output (--keep-order), a progress display (--progress), and a job log (--joblog).

§ commands

bash
find /var/log -name '*.log' -mtime +7 -exec gzip {} +

Compress all .log files older than 7 days. + instead of \; runs in batch, an order of magnitude faster.

bash
find . -type f -print0 | xargs -0 grep -l 'ERROR'

Safe grep across all files. The null delimiter handles filenames with spaces.

bash
cat hosts.txt | xargs -I {} -P 10 ssh {} 'uptime'

Run ssh in parallel on a list of hosts, 10 sessions at a time.

bash
find . -name '*.tmp' -delete

Built-in find deletion, faster than xargs rm and no extra fork.

bash
ls *.png | xargs -P"$(nproc)" -I IMG convert IMG -resize 50% IMG.small.png

Parallel image resize using all CPU cores.

§ see also

  • cmd-find ── find: search files by predicates ── `find` walks a directory tree and applies predicates (name, type, time, size, permissions). Actions: `-print` (default), `-delete`, `-exec`, `| xargs`.
  • cmd-grep ── grep: search lines by pattern ── `grep` searches stdin or files for lines matching a regex. Key modes: `-E` (ERE), `-P` (PCRE), `-F` (fixed string), `-r` (recursive tree walk).
  • cmd-sed ── sed: stream editor ── sed is a stream editor: it applies commands (`s/a/b/`, `d`, `p`, ...) to each line. `-i` edits a file in place; `-E` enables ERE; the address range `/start/,/end/` filters a block. Hold space is a second buffer.
  • cmd-awk ── awk: field-oriented processing of structured text ── awk splits a line into fields by FS (default is whitespace) and applies pattern { action }. `$1..$NF`, `NR` (a counter), BEGIN/END for a prologue and totals. It covers 80% of "process the columns" tasks without Python.
  • bash-scripting ── bash scripts: basics and idioms ── A bash script is a text file with shebang `#!/usr/bin/env bash` and `chmod +x`. Start every script with `set -euo pipefail` and run `shellcheck` to catch errors early.
  • cmd-rsync ── rsync: incremental file synchronization ── rsync copies only the changed blocks of files, locally or over SSH. `-avz` is the baseline combination (archive + verbose + compress). `--delete` mirrors. `--dry-run` is required before the first run.

§ mentioned in lessons

  • ›intermediate-10-xargs-and-parallel