Why jq
HTTP APIs increasingly return JSON, and core tools like kubectl, docker, gh,
and aws all support JSON output. Parsing JSON with regular expressions is
painful and fragile. jq is the standard for CLI JSON processing: a single
binary with no dependencies, and a filter language similar to XPath/JSONPath
but more expressive.
The alternative, python -c 'import json,sys;...', is harder to read in a pipe.
Basic syntax
jq [OPTIONS] FILTER [FILE...]
Without a file, jq reads stdin. The simplest filter is . (identity):
curl -s api.example.com/users | jq .
This gives you pretty-printed output, colored when jq writes to a terminal. Often that is enough.
Selectors
jq '.name' # field name
jq '.users[0]' # first element of the array
jq '.users[]' # ALL elements as a stream (not an array)
jq '.users[].email' # email field of each element
jq '.users | length' # length of the array
jq '.users | keys' # keys of an object (or indices of an array)
jq '.["weird-key"]' # keys with hyphens/spaces via []
jq '..|.email? // empty' # recursive search for email anywhere in the tree
select(): filtering
# Active users
jq '.users[] | select(.active == true)'
# IDs of users with >100 commits
jq '.users[] | select(.commits > 100) | .id'
# Pods in CrashLoopBackOff state (any() lists each pod once, even if several containers are failing)
kubectl get pods -o json | jq '
.items[]
| select(any(.status.containerStatuses[]?; .state.waiting.reason == "CrashLoopBackOff"))
| .metadata.name'
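? after a path expression suppresses errors: if the input has the wrong type (for example, .foo applied to a number, or [] applied to null), jq emits nothing instead of failing. A key that is simply missing from an object is not an error; it yields null with or without ?.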
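The behavior of ? can be checked on a few throwaway inputs. This is a minimal sketch: the sample values and the variable names (missing, plain, quiet) are made up for illustration.

```shell
# A missing key on an object is not an error: the result is null.
missing=$(jq -n -c '{"a": 1} | .b')

# .b applied to a number is a type error, so jq exits non-zero.
if jq -n '5 | .b' >/dev/null 2>&1; then plain=ok; else plain=error; fi

# With ?, the same error is suppressed and nothing is emitted,
# so collecting the results gives an empty array.
quiet=$(jq -n -c '[5 | .b?]')

echo "missing=$missing plain=$plain quiet=$quiet"
```

The last line should print missing=null plain=error quiet=[].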
Transformation
# Only name and email of each user, as an array of objects
jq '.users | map({name, email})'
# From an array of objects to flat TSV
jq -r '.users[] | [.id, .name, .email] | @tsv'
# From an array to CSV with a header row
jq -r '(.users[0] | keys_unsorted), (.users[] | [.[]]) | @csv'
# Grouping
jq 'group_by(.team) | map({team: .[0].team, count: length})'
Formatters at the end of a filter: @tsv, @csv, @sh (escape for bash),
@json, @uri, @base64, @base64d.
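A few of these formatters can be tried on literal strings. The sketch below assumes jq 1.6 or newer (for @base64d); the sample strings and variable names are illustrative only.

```shell
# @sh wraps the string in single quotes and escapes embedded ones.
sh_quoted=$(jq -n -r --arg s "it's here" '$s | @sh')

# @uri percent-encodes reserved characters such as space and &.
uri=$(jq -n -r --arg s "a b&c" '$s | @uri')

# @base64 encodes; @base64d decodes back to the original string.
b64=$(jq -n -r '"hello" | @base64')
decoded=$(jq -n -r --arg e "$b64" '$e | @base64d')

printf '%s\n' "$sh_quoted" "$uri" "$b64" "$decoded"
```

All four calls use -r so that the formatted string is printed without JSON quoting.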
-r, -c, and -s
- -r (raw output) strips JSON quotes from strings. Without -r you get "foo"; with -r you get foo. Use it when passing output to the shell.
- -c (compact) puts one object per line with no newlines inside. Useful for NDJSON logs and xargs.
- -s (slurp) reads all stdin as a single array. By default jq reads a stream of JSON documents.
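The effect of each flag is easiest to see on a tiny input. A minimal sketch with two made-up documents on stdin; the variable names are illustrative.

```shell
# Two JSON documents in one stream (sample data).
input='{"name":"a"} {"name":"b"}'

quoted=$(printf '%s' "$input" | jq '.name')              # JSON strings, with quotes
raw=$(printf '%s' "$input" | jq -r '.name')              # bare strings for the shell
compact=$(printf '%s' "$input" | jq -c '.')              # one object per line
slurped=$(printf '%s' "$input" | jq -c -s 'map(.name)')  # whole stream as one array

printf '%s\n' "$quoted" "$raw" "$compact" "$slurped"
```

Without -s the filter runs once per document; with -s it runs once on an array holding both.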
# NDJSON: one object per line
cat events.jsonl | jq -c 'select(.severity=="ERROR")'
# Extract IPs for xargs
jq -r '.hits[].ip' alerts.json | sort -u | xargs -I{} whois {}
Variables and parameters
# Pass a value from the shell
jq --arg user "$USER" '.users[] | select(.login == $user)' data.json
# Numeric (not a string)
jq --argjson min 100 '.events[] | select(.duration_ms > $min)' data.json
Without --arg, shell substitutions inside a filter are a common source of
injection bugs. Do not write jq ".x == \"$VAR\"". Write
jq --arg v "$VAR" '.x == $v' instead.
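To make the risk concrete, here is a small sketch with a hostile value. The data, the value in evil, and the variable names are invented for illustration.

```shell
data='[{"login":"alice"},{"login":"bob"}]'

# Hostile value: closes the string literal and appends an always-true clause.
evil='x" or "1"=="1'

# Unsafe: the value becomes part of the program text, so the filter turns into
#   select(.login == "x" or "1"=="1")
# and every user matches.
unsafe=$(printf '%s' "$data" | jq -c "[.[] | select(.login == \"$evil\")]")

# Safe: --arg binds the value as a plain string; no user has that login.
safe=$(printf '%s' "$data" | jq -c --arg u "$evil" '[.[] | select(.login == $u)]')

echo "unsafe: $unsafe"
echo "safe:   $safe"
```

The unsafe call returns both users, while the --arg version returns an empty array.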
Reduce, foreach, paths
For aggregation:
# Sum the size field
jq '[.files[].size] | add'
jq '.files | reduce .[] as $f (0; . + $f.size)'
# All paths to leaf nodes (useful for exploring an unknown structure)
jq '[paths(scalars)]'
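foreach works like reduce but emits every intermediate state, which is handy for running totals. A minimal sketch on made-up data, using the three-argument form of foreach:

```shell
data='{"files":[{"size":10},{"size":20},{"size":5}]}'

# reduce keeps only the final accumulator.
total=$(printf '%s' "$data" | jq '.files | reduce .[] as $f (0; . + $f.size)')

# foreach emits the accumulator after every element (a running total).
running=$(printf '%s' "$data" | jq -c '[foreach .files[] as $f (0; . + $f.size; .)]')

# paths(scalars) lists the path to every leaf value.
leaves=$(printf '%s' "$data" | jq -c '[paths(scalars)]')

printf '%s\n' "$total" "$running" "$leaves"
```

On this input the three lines printed are 35, [10,30,35], and [["files",0,"size"],["files",1,"size"],["files",2,"size"]].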
jq with kubectl, docker, and aws
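All three CLIs can emit JSON: kubectl with -o json, docker with --format '{{json .}}', and aws with --output json (its default unless configured otherwise):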
# Nodes and their kubelet version
kubectl get nodes -o json | jq -r '.items[] | [.metadata.name, .status.nodeInfo.kubeletVersion] | @tsv'
# Containers by image
docker ps --format='{{json .}}' | jq -r 'select(.Image | contains("nginx")) | .Names'
# All S3 buckets tagged owner=team-x
# (the bucket name is passed to sh as $1 instead of being spliced into the script)
aws s3api list-buckets | jq -r '.Buckets[].Name' \
| xargs -I{} sh -c 'aws s3api get-bucket-tagging --bucket "$1" 2>/dev/null \
| jq -r --arg b "$1" ".TagSet[] | select(.Key==\"owner\" and .Value==\"team-x\") | \$b"' _ {}
When things go wrong
- jq: error: Cannot index ... with ... means you applied .field to a non-object (an array, a string, or a number). Use ? or select(type=="object").
- null or missing output instead of an error: a missing key (or indexing null) yields null, and ? turns errors into no output at all. Remove ? and check the path one step at a time to see where the failure occurs.
- Quotes in output are in the way means you forgot -r.
- Newlines inside values: -r emits them as real line breaks, which can break a pipe into awk. Use @tsv (which escapes them as \n), @csv with a CSV-aware parser, or -c with a downstream JSON parser.
- Large file is slow means jq is loading the entire input into memory, often because of -s. Without -s, jq processes a stream of documents one at a time, although a single huge document is still parsed whole. For gigabyte-scale files, look at gojq or jaq.
- JSON5/JSONC (with comments) is not parsed by jq. Strip comments first with yq -p json or a preprocessor.
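The newline problem is easy to reproduce. In this sketch the record and variable names are made up; it counts output lines for a single record whose note field contains a line break.

```shell
# One record; the JSON string contains an escaped newline.
data='{"id":1,"note":"line one\nline two"}'

# -r prints the newline for real, so one record becomes two lines.
raw_lines=$(printf '%s' "$data" | jq -r '.note' | wc -l | tr -d ' ')

# @tsv escapes the newline as \n, so the record stays on one line.
tsv_lines=$(printf '%s' "$data" | jq -r '[.id, .note] | @tsv' | wc -l | tr -d ' ')

echo "raw: $raw_lines line(s), tsv: $tsv_lines line(s)"
```

A line-oriented tool such as awk would see two records in the first case and one in the second.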
Alternatives
- yq (Mike Farah): jq-like syntax for YAML, TOML, and XML
- gojq: Go implementation, somewhat faster on large files
- jaq: Rust implementation, faster still, not 100% compatible with all features
- fx: interactive JSON explorer (TUI)
- jless: a less-style pager for JSON, for paging through large documents