linuxlab.io
Copyright © 2026 LinuxLab. All rights reserved.
kb/observability ── Observability & monitoring ── advanced

Service discovery in Prometheus: k8s, Consul, file_sd, relabel

Prometheus discovers targets through the k8s API, Consul, or file_sd (static files). relabel_configs runs before the scrape: it filters targets and rewrites their labels. metric_relabel_configs runs after the scrape: it drops or rewrites individual metrics. Without relabeling, label cardinality from k8s explodes.

aka: service-discovery, prometheus-sd, kubernetes-sd, consul-sd, relabel, relabel-configs

Why service discovery

A static config works for 5 hosts. For 500 it does not. In Kubernetes the endpoints change constantly (rollouts, autoscaling). You need a way to learn automatically who to scrape.

The answer is service discovery (SD): Prometheus says "give me all pods/services with these labels", the SD mechanism returns a list of endpoints, and Prom scrapes them.

Around 30 SD mechanisms are supported, including kubernetes, consul, dns, ec2, azure, gce, file_sd, and http_sd. The most common are k8s and consul.

Discovery → relabel → scrape

┌──────────────┐
│ SD mechanism │  returns targets with meta-labels
│ (k8s, etc)   │  __meta_kubernetes_pod_name, etc
└──────┬───────┘
       │ raw targets with __meta_* labels
       ▼
┌──────────────┐
│  relabel_    │  filter + transform labels
│  configs     │  action: keep/drop/replace/labelmap
└──────┬───────┘
       │ final targets
       ▼
┌──────────────┐
│  scrape      │  HTTP GET /metrics
└──────┬───────┘
       │ raw metrics
       ▼
┌──────────────┐
│  metric_     │  drop bad metrics, rewrite names
│  relabel_    │
│  configs     │
└──────┬───────┘
       │
       ▼
     TSDB

Critical insight: labels starting with __, including every __meta_* label, are dropped after relabeling. If you want one of them in the TSDB, copy it to a regular label with an explicit replace action.
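A minimal Python sketch of that last step, assuming only the rule stated above: once relabeling finishes, every label whose name starts with `__` is discarded. The function names and sample values are hypothetical, and this is not Prometheus's own code.

```python
def copy_label(labels, source, target):
    """Mimic a simple replace rule: copy one label's value to another name."""
    out = dict(labels)
    if source in out:
        out[target] = out[source]
    return out


def finalize(labels):
    """Drop every label whose name starts with a double underscore."""
    return {k: v for k, v in labels.items() if not k.startswith("__")}


discovered = {
    "__address__": "10.0.0.5:8080",
    "__meta_kubernetes_namespace": "shop",
    "__meta_kubernetes_pod_name": "cart-7d9f",
    "job": "kubernetes-pods",
}

# Without an explicit copy, the namespace never reaches the TSDB.
lost = finalize(discovered)

# With the copy, it survives under a regular label name.
kept = finalize(copy_label(discovered, "__meta_kubernetes_namespace", "namespace"))
```

`lost` contains only `job`, while `kept` also carries `namespace`.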

Kubernetes SD

yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod              # pod | service | endpoints | endpointslice | node | ingress
    relabel_configs:
      # Only pods with the annotation prometheus.io/scrape=true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: 'true'
      # Take the port from the annotation prometheus.io/port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: '$1:$2'
        target_label: __address__
      # Path from the annotation prometheus.io/path (default /metrics)
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: '(.+)'
      # All pod labels → target labels (the __meta_kubernetes_pod_label_ prefix is stripped)
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      # Convenience labels
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
      - source_labels: [__meta_kubernetes_pod_node_name]
        target_label: node

Result: every pod with the prometheus.io/scrape=true annotation is scraped. All its k8s labels are copied into metric labels.
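The `__address__` rule is the least obvious one in that config, so here is a rough Python sketch of what it computes. It assumes the default `;` separator between source_labels and Prometheus's fully anchored regex matching, which `re.fullmatch` imitates. `rewrite_address` is a hypothetical helper, not part of any Prometheus API.

```python
import re

# Same pattern as the __address__ rule above. Prometheus anchors relabel
# regexes at both ends, which re.fullmatch reproduces here.
ADDRESS_RULE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address, port_annotation):
    """Return the scrape address after the replace rule is applied."""
    # source_labels values are joined with ';' (the default separator).
    joined = f"{address};{port_annotation}"
    match = ADDRESS_RULE.fullmatch(joined)
    if match is None:
        # No match means the replace action leaves __address__ untouched.
        return address
    # '$1:$2' in the rule corresponds to r'\1:\2' in Python.
    return match.expand(r"\1:\2")


with_port = rewrite_address("10.0.0.5:8080", "9102")
without_port = rewrite_address("10.0.0.5", "9102")
no_annotation = rewrite_address("10.0.0.5:8080", "")
```

The annotation port replaces the discovered one, gets appended when the address has no port, and a pod without the annotation keeps its address as is.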

Roles in kubernetes_sd

| Role | What it returns | When |
| --- | --- | --- |
| node | Kubernetes nodes (kubelet) | host metrics, kubelet |
| pod | every pod | application metrics |
| service | k8s Service objects | blackbox probes to services |
| endpoints | Endpoints objects (legacy) | use instead of service to scrape the pods behind it, e.g. kube-state-metrics |
| endpointslice | EndpointSlice objects (modern) | k8s 1.21+, scales better |
| ingress | Ingress objects | blackbox checks of ingresses |

Modern setup: the endpointslice role instead of endpoints (better performance on large clusters).

Consul SD

yaml
scrape_configs:
  - job_name: consul
    consul_sd_configs:
      - server: consul.example.com:8500
        tags: ['prometheus']      # only services with the tag
    relabel_configs:
      - source_labels: [__meta_consul_service]
        target_label: service
      - source_labels: [__meta_consul_tags]
        target_label: tags

Consul is popular in non-k8s stacks (Nomad, classic VMs). A service registers itself in Consul, and Prom learns about it through SD.

file_sd: static with granularity

When there is no k8s or Consul, but you have a script that knows who to scrape:

yaml
scrape_configs:
  - job_name: file-discovery
    file_sd_configs:
      - files: ['/etc/prometheus/targets/*.json']
        refresh_interval: 30s

The file:

json
[
  {
    "targets": ["host1:9100", "host2:9100"],
    "labels": {"env": "prod", "team": "infra"}
  },
  {
    "targets": ["dbhost:9187"],
    "labels": {"env": "prod", "team": "db"}
  }
]

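An external tool (Ansible, Terraform, Chef) generates the JSON. Prometheus watches the files and applies changes without a restart; refresh_interval (30s here) is the periodic re-read that backs up the file watch. Flexible and simple.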
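As an illustration of the generator side, here is a small Python sketch that writes a target file in the shape shown above. Everything in it is an assumption for the example: the helper names, the `.tmp` suffix, and the choice to write a temporary file and rename it so Prometheus never reads a half-written document.

```python
import json
import os
import tempfile


def group(targets, **labels):
    """Build one target group in the shape file_sd expects."""
    return {"targets": sorted(targets), "labels": labels}


def write_targets(path, groups):
    """Write target groups as JSON, replacing the file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary name ends in .tmp, so a '*.json' glob ignores it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(groups, handle, indent=2)
        # mkstemp creates the file owner-only; make it readable for Prometheus.
        os.chmod(tmp_path, 0o644)
        # Rename over the real file so readers see the old or the new version.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

A call such as `write_targets("/etc/prometheus/targets/nodes.json", [group(["host1:9100", "host2:9100"], env="prod", team="infra")])` would produce the first group from the JSON example; the file name `nodes.json` is made up.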

relabel actions

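| Action | What it does |
| --- | --- |
| replace | if regex matches the joined source_labels, writes the expanded replacement into target_label |
| keep | drops the target unless the joined source_labels match regex |
| drop | drops the target if the joined source_labels match regex |
| keepequal | keeps the target only if the joined source_labels equal the value of target_label |
| dropequal | drops the target if the joined source_labels equal the value of target_label |
| hashmod | target_label = hash(source) % modulus (for sharding) |
| labelmap | copies every label whose name matches regex to a new name built from replacement |
| labeldrop | removes labels whose names match regex |
| labelkeep | keeps only labels whose names match regex |
| lowercase / uppercase | case transform |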

keep and drop are the most common for filtering. replace and labelmap are for shaping labels.
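To make the filtering and shaping actions concrete, here is a toy Python evaluator for keep, drop, replace, labeldrop, and labelkeep. It is a simplified model rather than the real implementation: it assumes the `;` separator, fully anchored regexes, and only `$1`-style group references (not `${1}`). Rules are plain dicts that mirror the YAML keys.

```python
import re


def to_python(replacement):
    """Translate $1-style group references into Python's \\1 style."""
    return re.sub(r"\$(\d+)", r"\\\1", replacement)


def relabel(labels, rules):
    """Apply rules in order; return the new label set, or None if dropped."""
    labels = dict(labels)
    for rule in rules:
        action = rule.get("action", "replace")
        regex = re.compile(rule.get("regex", "(.*)"), re.DOTALL)
        # Missing labels read as empty strings; values are joined with ';'.
        source = ";".join(labels.get(n, "") for n in rule.get("source_labels", []))
        match = regex.fullmatch(source)

        if action == "keep" and match is None:
            return None
        if action == "drop" and match is not None:
            return None
        if action == "replace" and match is not None:
            value = match.expand(to_python(rule.get("replacement", "$1")))
            if value:
                labels[rule["target_label"]] = value
            else:
                # An empty result removes the target label.
                labels.pop(rule["target_label"], None)
        if action == "labeldrop":
            labels = {k: v for k, v in labels.items() if not regex.fullmatch(k)}
        if action == "labelkeep":
            labels = {k: v for k, v in labels.items() if regex.fullmatch(k)}
    return labels


rules = [
    {"source_labels": ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"],
     "action": "keep", "regex": "true"},
    {"source_labels": ["__meta_kubernetes_namespace"], "target_label": "namespace"},
    {"action": "labeldrop", "regex": "pod_template_hash"},
]

annotated = {
    "__meta_kubernetes_pod_annotation_prometheus_io_scrape": "true",
    "__meta_kubernetes_namespace": "shop",
    "pod_template_hash": "7d9f6c",
}
unannotated = {"__meta_kubernetes_namespace": "shop"}

kept = relabel(annotated, rules)        # survives, gains "namespace"
dropped = relabel(unannotated, rules)   # None: the keep rule did not match
```

Order matters: the keep rule runs first, so the unannotated target is gone before any later rule sees it.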

Sharding with hashmod

Three Proms scrape 1000 targets, split evenly:

yaml
relabel_configs:
  - source_labels: [__address__]
    modulus: 3
    target_label: __tmp_hash
    action: hashmod
  - source_labels: [__tmp_hash]
    regex: '0'              # this Prom, shard 0
    action: keep

Each Prom holds about 333 targets; the other two instances use the same rules with regex '1' and '2'. Federation aggregates upward.
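The split can be simulated in a few lines of Python. This sketch assumes the hash is derived from MD5 (the last 8 bytes of the digest read as an unsigned big-endian integer), which is my understanding of how Prometheus computes hashmod; treat that detail as an assumption. The target addresses are invented.

```python
import hashlib
from collections import Counter


def hashmod(value, modulus):
    """Map a label value to a shard number in range(modulus)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    # Assumption: use the last 8 bytes of the digest as a big-endian integer.
    # Any stable hash would spread the targets in a similar way.
    return int.from_bytes(digest[8:], "big") % modulus


def shard_targets(addresses, modulus, shard):
    """Return the addresses that one Prometheus instance keeps."""
    return [a for a in addresses if hashmod(a, modulus) == shard]


# 1000 hypothetical targets split across three instances.
addresses = [f"10.0.{i // 250}.{i % 250}:9100" for i in range(1000)]
sizes = Counter(hashmod(a, 3) for a in addresses)
```

Each address lands in exactly one shard, and the shard does not change between reloads as long as `__address__` stays the same. The split is roughly even, not exact.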

metric_relabel_configs: after scrape

Applies to already scraped metrics, before they are written to the TSDB.

yaml
scrape_configs:
  - job_name: ...
    metric_relabel_configs:
      # Drop high-cardinality metrics
      - source_labels: [__name__]
        regex: 'go_gc_pauses_seconds_bucket'
        action: drop
      # Drop a specific label with user_id (cardinality)
      - regex: 'user_id'
        action: labeldrop
      # Rewrite metric name
      - source_labels: [__name__]
        regex: 'old_metric_name'
        replacement: 'new_metric_name'
        target_label: __name__

Used to fight cardinality explosions from ill-behaved exporters. It is better to fix the problem in the exporter's code, but sometimes you have no access to it.
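A small Python sketch of what dropping `user_id` does to cardinality. The metric name and label values are invented, and each series is modeled as a plain dict of labels.

```python
def drop_label(series, label):
    """Return the series with one label removed, like a labeldrop rule."""
    return [{k: v for k, v in s.items() if k != label} for s in series]


def distinct_series(series):
    """Count unique label sets, which is what cardinality measures."""
    return len({tuple(sorted(s.items())) for s in series})


# Hypothetical exporter output: one series per user.
scraped = [
    {"__name__": "http_requests_total", "path": "/cart", "user_id": str(uid)}
    for uid in range(1000)
]

before = distinct_series(scraped)
after = distinct_series(drop_label(scraped, "user_id"))
```

Here 1000 series collapse into one. That is the point of the rule, but it also means the collapsed samples now share a single label set, so check that the remaining labels still identify each series you care about.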

Best practices

  • Filter at the SD stage, not the metric stage: keep/drop is cheaper than metric_relabel, and it puts less load on the target.
  • Convenience labels (namespace, pod, service): use stable names across all jobs. __meta_kubernetes_* labels never reach the TSDB, so queries cannot rely on them.
  • Do not copy every pod label with labelmap blindly. k8s attaches controller-revision-hash, pod-template-hash, and so on. That is cardinality. Whitelist with a regex in labelmap:
    yaml
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(app|version|component)
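  • CI-test your relabel rules: promtool check config validates the file, and promtool check service-discovery runs discovery plus relabeling for one job (limited in CI, because it needs access to the real SD backend).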
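The labelmap whitelist from the list above works because the regex has to match the whole label name. A Python sketch of that behaviour, with a hypothetical `labelmap` helper and invented pod labels:

```python
import re

WHITELIST = re.compile(r"__meta_kubernetes_pod_label_(app|version|component)")


def labelmap(labels, regex, replacement=r"\1"):
    """Copy labels whose full name matches regex to the expanded new name."""
    out = dict(labels)
    for name, value in labels.items():
        match = regex.fullmatch(name)
        if match:
            out[match.expand(replacement)] = value
    return out


pod = {
    "__meta_kubernetes_pod_label_app": "cart",
    "__meta_kubernetes_pod_label_pod_template_hash": "7d9f6c",
    "__meta_kubernetes_pod_label_controller_revision_hash": "abc123",
    "__meta_kubernetes_pod_label_app_kubernetes_io_name": "cart",
}

mapped = labelmap(pod, WHITELIST)
copied = sorted(name for name in mapped if not name.startswith("__"))
```

Only `app` is copied. The hash labels are skipped, and so is `app_kubernetes_io_name`: a whitelist entry has to name the sanitized label exactly.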

kube-state-metrics + node-exporter

The standard k8s monitoring stack:

  • node-exporter on every node → node_* metrics
  • kube-state-metrics, a single instance → kube_* metrics about the state of k8s objects
  • cAdvisor in the kubelet → container metrics
  • app metrics through annotation discovery

All through k8s SD with different relabel configs.

When things go wrong

  • No targets in /targets: a relabel keep is too strict and nothing is left. Remove one rule at a time and check the UI.
  • Targets exist, but the scrape fails with "401 Unauthorized": a kubelet scrape needs a ServiceAccount with RBAC permissions, and its token passed via bearer_token_file: /var/run/secrets/.../token.
  • Cardinality explosion after a rollout: labelmap copied pod-template-hash. Whitelist the labels.
  • Targets are duplicated: the same endpoint appears in several roles. Deduplicate: one role plus the right selector.
  • Slow SD updates (5+ minutes): often a k8s API rate limit. Use the endpointslice role instead of endpoints; for polling mechanisms such as Consul, lower refresh_interval.
  • __address__ has the wrong port: the pod role emits one target per declared container port, so the port you want may not be the one scraped. Override it from an annotation with replace.
  • Stale data after a k8s namespace delete: queries can still return the last sample for up to --query.lookback-delta (default 5m). This is normal.

§ commands

bash
curl -s http://prometheus:9090/api/v1/targets | jq '.data.activeTargets[] | {scrapeUrl, labels}'

All active targets with their final labels, after relabel

bash
curl -s http://prometheus:9090/api/v1/targets/metadata | jq '.data | length'

How many metrics the targets declare, a volume estimate

bash
promtool check config /etc/prometheus/prometheus.yml

Validate the whole config: scrape_configs, relabel, rules

bash
curl -s http://prometheus:9090/api/v1/status/config | jq -r .data.yaml | grep -A 20 relabel

The live runtime config, what Prom actually uses (after reload)

bash
kubectl get servicemonitor -A  # if prometheus-operator is in use

ServiceMonitor CRD: a declarative replacement for scrape_configs in k8s

bash
consul catalog services -tags  # for consul SD

All Consul services with tags, what Prom sees through consul_sd

§ see also

  • kubelet-internals: kubelet, the Kubernetes node agent architecture. kubelet is a daemon on every node. It receives the PodSpec through the API, starts containers through CRI, mounts volumes through CSI, and watches health. Under pressure it evicts pods. Image GC and the cgroup tree are also its job.