Why not just a bash loop
Say you want to compress all .log files older than 7 days. A naive bash approach:
for f in $(find /var/log -name '*.log' -mtime +7); do
    gzip "$f"
done
Three problems:
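- Word splitting on spaces. A file named my app.log splits into two arguments.
- A separate fork per file. With thousands of files, this is slow.
- The command substitution buffers the entire find output before the loop starts, and the unquoted result also undergoes glob expansion. Dropping the loop in favor of gzip $(find ...) is worse: a long enough list exceeds ARG_MAX and the exec fails with "Argument list too long".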
find -exec and xargs handle all of this. They deal with spaces correctly,
batch calls, and respect ARG_MAX automatically.
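The word-splitting problem is easy to reproduce. The following is a minimal sketch, assuming a throwaway directory from mktemp (whose path contains no spaces) and a single illustrative file name; it counts how many items each approach sees.

```shell
# Sketch: one file with a space in its name, seen by two approaches.
# demo_dir and the file name are illustrative assumptions.
demo_dir=$(mktemp -d)
touch "$demo_dir/my app.log"

# Naive loop: the unquoted $(...) result is split on whitespace.
loop_count=0
for f in $(find "$demo_dir" -name '*.log'); do
    loop_count=$((loop_count + 1))
done

# find -exec: the path arrives as exactly one argument; the inner sh prints $#.
exec_count=$(find "$demo_dir" -name '*.log' -exec sh -c 'echo "$#"' _ {} +)

echo "loop saw $loop_count items, find -exec passed $exec_count argument(s)"
rm -rf "$demo_dir"
```

The loop reports two items for one file, while find -exec hands over a single intact path.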
find -exec: two forms
\;: one file at a time
find /var/log -name '*.log' -mtime +7 -exec gzip {} \;
Runs gzip separately for each file. {} is substituted with the path.
\; is a literal semicolon, escaped so that bash does not treat it as a command separator.
Downside: 1000 files = 1000 forks = slow.
+: batch mode
find /var/log -name '*.log' -mtime +7 -exec gzip {} +
Accumulates arguments and calls gzip once with a batch of files
(or several times if ARG_MAX is hit). Often an order of magnitude faster on large file sets.
Use + whenever the command accepts multiple files
(gzip, rm, chmod, chown). {} must come immediately before +, so a command that takes
its destination last, such as cp, needs a form like GNU cp's -t DIR. Use \; only when you need one file
at a time, for example inside a shell construct with a condition.
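To see the difference between the two forms, the sketch below counts how many times the command is started. It assumes five illustrative files in a temporary directory and uses a small sh -c script as a stand-in for gzip; each start of that script prints one line.

```shell
# Sketch: count command invocations for \; versus +.
# The directory and file names are illustrative assumptions.
demo_dir=$(mktemp -d)
for i in 1 2 3 4 5; do
    touch "$demo_dir/file$i.log"
done

# Each start of the inner sh prints one line, so wc -l counts invocations.
per_file_runs=$(find "$demo_dir" -name '*.log' -exec sh -c 'echo "$#"' _ {} \; | wc -l | tr -d ' ')
batch_runs=$(find "$demo_dir" -name '*.log' -exec sh -c 'echo "$#"' _ {} + | wc -l | tr -d ' ')

printf '%s form: %s invocations\n' '\;' "$per_file_runs"
printf '%s form: %s invocation(s)\n' '+' "$batch_runs"
rm -rf "$demo_dir"
```

With five files, the \; form starts the command five times and the + form starts it once.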
xargs: the pipe form
find /var/log -name '*.log' -mtime +7 | xargs gzip
For filenames without whitespace, quotes, or backslashes, this is equivalent to
find ... -exec gzip {} +. By default xargs batches and
handles ARG_MAX on its own.
Why both exist: xargs reads from any pipe, not just find, and has
more options (parallelism, mid-command placeholder substitution).
find -exec is shorter for simple cases.
Important: -print0 / -0 for safety
The default xargs delimiter is whitespace. A file named foo bar.log gets
split in two, and xargs tries to process foo and bar.log as separate paths.
Quotes and backslashes are also treated specially: xargs interprets them instead of passing them through.
The fix is to use a null byte as the delimiter:
find /var/log -name '*.log' -print0 | xargs -0 gzip
find -print0 prints names separated by \0 instead of \n. xargs -0 reads input split on \0.
A filename cannot contain \0 (it is the C string terminator), so this
pipeline is always correct.
Rule: if you are not 100% certain that filenames have no spaces or quotes,
always use -print0 | xargs -0. Or use find -exec ... +.
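The rule can be checked directly. This is a small sketch, assuming a temporary directory (with no spaces in its own path) and one illustrative file name; the sh -c script only prints how many arguments xargs handed to it.

```shell
# Sketch: one file with a space, passed through xargs with and without -0.
# demo_dir and the file name are illustrative assumptions.
demo_dir=$(mktemp -d)
touch "$demo_dir/foo bar.log"

# Default delimiter: the path is cut at the space, so two arguments arrive.
plain_args=$(find "$demo_dir" -name '*.log' | xargs sh -c 'echo "$#"' _)

# Null delimiter: the path stays in one piece.
null_args=$(find "$demo_dir" -name '*.log' -print0 | xargs -0 sh -c 'echo "$#"' _)

echo "without -0: $plain_args arguments, with -0: $null_args argument"
rm -rf "$demo_dir"
```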
xargs: useful options
| Flag | What it does |
|---|---|
| -0 | use \0 as delimiter (for find -print0) |
| -I {} | place {} anywhere in the command |
| -n N | pass N arguments per invocation |
| -P N | run up to N parallel processes |
| -r / --no-run-if-empty | skip the command if input is empty |
| -t | trace: print each command before running it |
| --max-args=1 | same as -n 1 |
| -d $'\n' | custom delimiter (e.g., newline only) |
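The -r row is the easiest one to try in isolation. The sketch below feeds xargs an empty input and counts output lines; the count without -r depends on the implementation (GNU xargs is assumed to run the command once), so only the -r result should be relied on.

```shell
# Sketch: behaviour on empty input, with and without -r.
# 'echo ran' is a harmless stand-in for a real command.

# Without -r: GNU xargs is assumed to run the command once anyway.
runs_without_r=$(printf '' | xargs echo ran | wc -l | tr -d ' ')

# With -r: the command is skipped when there is nothing to process.
runs_with_r=$(printf '' | xargs -r echo ran | wc -l | tr -d ' ')

echo "without -r: $runs_without_r run(s), with -r: $runs_with_r run(s)"
```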
Substituting into the middle of a command: -I
ls *.tar.gz | xargs -I {} mv {} /backup/{}.bak
# or
cat hosts.txt | xargs -I HOST ssh HOST 'uptime'
By default xargs appends arguments at the end of the command. With -I
it inserts them wherever you put the placeholder. This is useful when you
need the filename both before and after (for example mv X /dst/X.bak).
The cost: -I implies -L 1, so xargs runs the command once per input item and you lose batching.
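A dry run makes the placeholder behaviour visible without moving anything. This sketch assumes three made-up archive names fed one per line, and puts echo in front of mv so the commands are only printed.

```shell
# Sketch: -I inserts each input line wherever the placeholder appears.
# The archive names are hypothetical; echo stands in front of mv as a dry run.
planned=$(printf '%s\n' a.tar.gz b.tar.gz 'c d.tar.gz' |
    xargs -I {} echo mv {} /backup/{}.bak)

printf '%s\n' "$planned"
planned_count=$(printf '%s\n' "$planned" | wc -l | tr -d ' ')
first_plan=$(printf '%s\n' "$planned" | head -n 1)
```

Three input lines produce three separate commands, and the name containing a space stays in one piece because -I splits on newlines only.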
Parallelism: -P
find . -name '*.log' -print0 | xargs -0 -n1 -P4 gzip
▸4 gzip processes running in parallel
Useful when the task is CPU-bound and you have multiple cores. For I/O-bound
work you can set -P higher than the core count.
Note: with -P > 1, output order is unpredictable. Multiple processes
write to stdout interleaved.
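One way to sidestep interleaved output is to give every job its own output file and combine the files afterwards. The following is a sketch under that assumption; OUT_DIR and the job body are hypothetical stand-ins for real work.

```shell
# Sketch: parallel jobs write to separate files, merged in a fixed order later.
# OUT_DIR and the "result for job N" payload are illustrative assumptions.
OUT_DIR=$(mktemp -d)
export OUT_DIR

# Up to 4 jobs at a time; each job receives one number as $1.
printf '%s\n' 1 2 3 4 5 6 7 8 |
    xargs -n 1 -P 4 sh -c 'echo "result for job $1" > "$OUT_DIR/$1.out"' _

# xargs returns only after all jobs finish, so the files are complete here.
combined=$(for n in 1 2 3 4 5 6 7 8; do cat "$OUT_DIR/$n.out"; done)
printf '%s\n' "$combined"

combined_count=$(printf '%s\n' "$combined" | wc -l | tr -d ' ')
first_result=$(printf '%s\n' "$combined" | head -n 1)
rm -rf "$OUT_DIR"
```

The jobs may finish in any order, but the merged result is always listed from job 1 to job 8.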
Batch size: -n
echo {1..100} | xargs -n 5 echo
▸echo 1 2 3 4 5
▸echo 6 7 8 9 10
▸...
Useful when the command accepts only a limited number of arguments, or when you want to see progress in groups.
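A smaller variant shows what happens when the input does not divide evenly. This sketch uses seq instead of brace expansion so that it also runs in a plain POSIX shell; twelve items with -n 5 leave a short final batch.

```shell
# Sketch: 12 items in batches of 5; the last batch gets the remainder.
batches=$(seq 1 12 | xargs -n 5 echo)
printf '%s\n' "$batches"
# prints:
# 1 2 3 4 5
# 6 7 8 9 10
# 11 12
```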
Common patterns
Delete old logs
find /var/log -name '*.log.gz' -mtime +30 -delete
# or
find /var/log -name '*.log.gz' -mtime +30 -print0 | xargs -0 rm
-delete is built into find and faster than xargs (no rm fork). Use it
when deletion is all you need.
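Because -delete is irreversible, it can help to run the same expression with -print first. The sketch below assumes GNU touch (for the -d '40 days ago' syntax) and a temporary directory with one old and one fresh file.

```shell
# Sketch: preview with -print, then delete with the identical expression.
# log_dir and the file names are illustrative; touch -d is GNU-specific.
log_dir=$(mktemp -d)
touch -d '40 days ago' "$log_dir/old.log.gz"
touch "$log_dir/new.log.gz"

# Dry run: list what would be removed.
would_delete=$(find "$log_dir" -name '*.log.gz' -mtime +30 -print)
echo "would delete: $would_delete"

# Real run: same tests, -print swapped for -delete.
find "$log_dir" -name '*.log.gz' -mtime +30 -delete

remaining=$(find "$log_dir" -name '*.log.gz' | wc -l | tr -d ' ')
echo "files remaining: $remaining"
rm -rf "$log_dir"
```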
Search for content only in recent files
find /var/log -name '*.log' -mtime -1 -print0 | xargs -0 grep -l 'ERROR'
grep -l prints only the names of files where the pattern was found.
Run an SSH command on a list of hosts
cat servers.txt | xargs -I {} -P 10 ssh {} 'systemctl restart myapp'
▸10 parallel SSH sessions
Kill processes by pattern
ps aux | grep '[m]yapp' | awk '{print $2}' | xargs kill
Without [m], grep matches its own process line. A cleaner alternative:
pkill myapp (or pkill -f myapp to match against the full command line, as the grep above does), which skips the four-process pipeline.
Change mode or owner on many files
find /srv/upload -type f -print0 | xargs -0 chmod 644
find /srv/upload -type d -print0 | xargs -0 chmod 755
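The same fix can be written with find -exec ... +, which needs no pipe and does not run chmod at all when nothing matches. This is a sketch against a temporary tree; upload_dir and its contents are made up for the demonstration.

```shell
# Sketch: normalise permissions on a scratch tree with find -exec ... +.
# upload_dir, sub, and report.txt are illustrative assumptions.
upload_dir=$(mktemp -d)
mkdir "$upload_dir/sub"
touch "$upload_dir/sub/report.txt"
chmod 600 "$upload_dir/sub/report.txt"
chmod 700 "$upload_dir/sub"

find "$upload_dir" -type f -exec chmod 644 {} +
find "$upload_dir" -type d -exec chmod 755 {} +

# Read back the first 10 characters of the ls -l mode column.
file_mode=$(ls -ld "$upload_dir/sub/report.txt" | cut -c1-10)
dir_mode=$(ls -ld "$upload_dir/sub" | cut -c1-10)
echo "file: $file_mode  dir: $dir_mode"
rm -rf "$upload_dir"
```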
Convert many images in parallel
find . -name '*.png' -print0 | xargs -0 -P"$(nproc)" -I IMG \
    convert IMG -resize 50% IMG.thumb.png
$(nproc) returns the core count. -I already runs one command per input item, so -n1 is not needed; GNU xargs treats -n and -I as mutually exclusive and warns when both are given.
Pitfalls
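- xargs without -r runs the command even on empty input:
  find . -name '*.tmp' | xargs rm # if find found nothing, rm runs with no arguments
  On GNU xargs (Linux), rm then fails with "missing operand" and xargs exits non-zero. That is harmless for rm, but it can abort a script running under set -e. BSD xargs (FreeBSD, macOS) skips the command on empty input by default. Safe form: xargs -r rm. Alternatively, use find -exec ... +, which never runs on empty input.
- Quotes and xargs: ' and " are special to xargs. Without -0, the string "foo bar" is parsed as one token with the quotes stripped, and a lone apostrophe (as in it's.log) aborts with an unmatched-quote error. That is not what you want when reading a file of paths. Always use -0, or -d $'\n' on GNU xargs.
- find -exec ... vs find ... | xargs: for a single command the result is the same, but find -exec is shorter and safe for any filename (find passes paths directly via argv, so no delimiter is involved). Use it when you do not need a pipe.
- {} inside a shell construct: do not embed {} in the script text:
  find . -name '*.log' -exec sh -c 'echo Processing {}' \;
  ▸depending on the find implementation, prints Processing {} unchanged or pastes the raw path into the script
  POSIX leaves an embedded {} implementation-defined. GNU find substitutes it, which means a filename containing quotes or $(...) is interpreted as shell code. Pass the path as a positional argument instead:
  find . -name '*.log' -exec sh -c 'echo "Processing $1"' _ {} \;
  _ is a dummy $0; {} becomes $1.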
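The positional-argument pattern also works with the batch form: the inner shell receives many paths at once and loops over them. The following is a minimal sketch, assuming two illustrative files in a temporary directory.

```shell
# Sketch: sh -c with {} +, looping over the batch inside the inner shell.
# demo_dir and the file names are illustrative assumptions.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.log" "$demo_dir/b c.log"

# _ fills $0; the paths become "$@" of the inner script.
processed=$(find "$demo_dir" -name '*.log' -exec sh -c '
    for path in "$@"; do
        echo "Processing $(basename "$path")"
    done
' _ {} + | sort)

printf '%s\n' "$processed"
rm -rf "$demo_dir"
```

This keeps per-file shell logic while starting only one sh per batch, and the name containing a space survives intact.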
When not to use xargs
- Complex per-file shell logic. Write a plain while read loop (see bash-scripting):
  find . -type f -name '*.log' -print0 | while IFS= read -r -d '' f; do
      if [[ -s "$f" ]]; then
          echo "Non-empty: $f"
      fi
  done
- You need arrays or maps. Rewrite in Python.
- Parallel execution with result handling. Use GNU parallel, which covers most xargs use cases and adds a progress display and job logging.