Why bonding
A single network interface is a single point of failure (SPOF). Servers typically carry 2-4 NICs and combine them:
- High availability - one port drops, the other holds the link (active-backup)
- Throughput - 2×10G physically, up to 20G on one logical interface (LACP with the right hashing)
- Load balancing - spread bandwidth across outbound sessions
Linux implements this through the bonding driver, which creates bond0
on top of the physical slaves. The alternative is team (more modular), but
bonding has historically been more popular.
Modes
| mode | Name | Description |
|---|---|---|
| 0 | balance-rr | round-robin: packets sent to each slave in turn. Good for raw throughput, bad for ordering: reordered packets cause TCP retransmits |
| 1 | active-backup | one active, the rest standby. The switch needs no configuration. The universal default |
| 2 | balance-xor | hash → slave; by default (src MAC XOR dst MAC) modulo the slave count, configurable via xmit_hash_policy. A stable "one flow, one path" |
| 3 | broadcast | everything to all slaves. For special HA cases |
| 4 | 802.3ad (LACP) | standards-based aggregation (IEEE 802.3ad); needs a LAG configured on the switch |
| 5 | balance-tlb | Transmit Load Balancing - outbound goes to the less loaded slave |
| 6 | balance-alb | the same plus inbound via ARP tricks (no switch support needed) |
The most common: mode 1 (active-backup) for simplicity and mode 4 (LACP) for performance.
active-backup (mode 1)
Simple and universal. The switch needs no configuration.
ip link add bond0 type bond mode active-backup miimon 100
# an interface must be down before it can be enslaved
ip link set eth0 down
ip link set eth0 master bond0
ip link set eth1 down
ip link set eth1 master bond0
ip link set bond0 up
ip addr add 10.0.0.5/24 dev bond0
- miimon 100 - check the link every 100ms via MII (PHY status)
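- When the link on the active slave breaks, traffic fails over to the backup in roughly 200ms
- By default (fail_over_mac=none) every slave is set to the bond's MAC, taken from the first enslaved interface, so the MAC does not change on failover; the bond sends a gratuitous ARP so that switches relearn which port it lives on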
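To confirm which slave currently carries the traffic, the bonding driver exposes the active slave in sysfs. A minimal sketch: the helper name bond_active_slave and its optional second argument (an alternate sysfs root, handy for testing) are assumptions made for illustration, not part of any tool.

```shell
# Print the currently active slave of an active-backup bond.
# $1: bond name (default bond0)
# $2: sysfs root (default /sys; illustrative override for testing)
bond_active_slave() (
    bond=${1:-bond0}
    sysfs=${2:-/sys}
    cat "$sysfs/class/net/$bond/bonding/active_slave"
)
```

Calling bond_active_slave bond0 before and after taking the active port down should show the name switch from one slave to the other.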
LACP (mode 4) - 802.3ad
The standard mechanism. On the switch you configure a port-channel / LAG on the ports the slaves are plugged into:
ip link add bond0 type bond mode 802.3ad miimon 100 \
lacp_rate fast xmit_hash_policy layer3+4
- lacp_rate fast - LACPDU every second (vs 30s by default)
- xmit_hash_policy layer3+4 - hash by src/dst IP + port. One flow goes over one slave, different flows are spread out.
Important: one TCP flow always goes through one slave. So a single connection gets at most 10G on a 2×10G LACP; several flows (different clients or different ports) together can reach up to 20G.
Hash policy
| xmit_hash_policy | Hash |
|---|---|
| layer2 | src+dst MAC |
| layer2+3 | src+dst MAC + src+dst IP |
| layer3+4 | src+dst IP + src+dst port (spreads the most) |
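layer3+4 is not strictly 802.3ad-compliant: fragmented and unfragmented packets of the same conversation can hash to different slaves and arrive out of order. In practice everything works. Switches do not care how you pick the slave for outbound traffic; the switch applies its own hash to the traffic it sends toward the host.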
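To see why different flows land on different slaves, the layer3+4 selection can be approximated in shell arithmetic. This is a simplified sketch modelled on the formula given in the kernel bonding documentation for unfragmented TCP/UDP packets; the real driver works on parsed packet headers, so actual slave choices may differ. The function names ip_to_int and l34_slave are illustrative.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1          # split on "." into $1..$4
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Pick a slave index for a flow.
# Usage: l34_slave SRC_IP SRC_PORT DST_IP DST_PORT [SLAVE_COUNT]
l34_slave() (
    src=$(ip_to_int "$1")
    dst=$(ip_to_int "$3")
    h=$(( ($2 << 16) | $4 ))     # both ports as one 32-bit word
    h=$(( $h ^ $src ^ $dst ))    # mix in the IP addresses
    h=$(( $h ^ ($h >> 16) ))
    h=$(( $h ^ ($h >> 8) ))
    echo $(( $h % ${5:-2} ))     # reduce modulo the slave count
)

l34_slave 10.0.0.5 40000 10.0.0.9 443   # prints 0
l34_slave 10.0.0.5 40001 10.0.0.9 443   # prints 1
```

The same 4-tuple always yields the same index, which is the "one flow, one slave" property; changing only the source port is enough to move the flow to the other slave.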
Configuration in systemd-networkd
# /etc/systemd/network/10-bond.netdev
[NetDev]
Name=bond0
Kind=bond
[Bond]
Mode=802.3ad
LACPTransmitRate=fast
TransmitHashPolicy=layer3+4
MIIMonitorSec=100ms
# /etc/systemd/network/20-bond-eth0.network
# (create the same file for eth1, or match both with Name=eth0 eth1)
[Match]
Name=eth0
[Network]
Bond=bond0
# /etc/systemd/network/30-bond-ip.network
[Match]
Name=bond0
[Network]
Address=10.0.0.5/24
Gateway=10.0.0.1
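Each slave needs its own [Match]/[Network] pair, so with several NICs the per-slave files can be generated instead of written by hand. A small sketch, assuming the 20-bond-NIC.network naming used above; the helper name write_slave_units is hypothetical.

```shell
# Write one .network file per slave NIC into a target directory.
# Usage: write_slave_units DIR BOND NIC...
write_slave_units() (
    dir=$1
    bond=$2
    shift 2
    for nic in "$@"; do
        printf '[Match]\nName=%s\n\n[Network]\nBond=%s\n' "$nic" "$bond" \
            > "$dir/20-bond-$nic.network"
    done
)
```

For example, write_slave_units /etc/systemd/network bond0 eth0 eth1 followed by systemctl restart systemd-networkd would apply the result.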
Bonding + VLAN
To bring up VLANs on top of the bond, create the bond first, then the VLAN interface on top of it:
ip link add bond0 type bond mode 802.3ad
ip link add link bond0 name bond0.10 type vlan id 10
You get LACP aggregation plus [[vlan-and-trunk|802.1Q]] tags on top.
Bonding + bridge
In virtualization (KVM, libvirt) a bridge on top of a bond is the standard
configuration. VMs see br0, and the path out goes through the bond:
eth0 ──┐
├─ bond0 ── br0 ── (vnetX to the VMs)
eth1 ──┘
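The diagram above could be built with iproute2 roughly as follows. This is a sketch that assumes bond0 already exists; the run wrapper and the DRY_RUN variable are illustrative additions so the commands can be previewed without root.

```shell
# Run a command, or just print it when DRY_RUN=1 (illustrative wrapper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Put an existing bond under a new bridge.
# Usage: bridge_over_bond BRIDGE BOND
bridge_over_bond() {
    run ip link add "$1" type bridge
    run ip link set "$2" master "$1"
    run ip link set "$2" up
    run ip link set "$1" up
}
```

With DRY_RUN=1 set, bridge_over_bond br0 bond0 prints the four commands instead of running them. In this layout the host's IP address goes on br0 rather than on bond0.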
When something goes wrong
- LACP does not converge - the LAG is not configured on the switch, or it is not in active mode. Check:
  cat /proc/net/bonding/bond0
  It shows the per-slave LACP state
- No throughput gain - a single TCP flow, or hash policy layer2. Switch to layer3+4 and use several flows
- Failover is slow - miimon is too high. Set it to 100ms
- MAC changes on failover - fail_over_mac=active is set, so the bond takes the MAC of whichever slave is active. Use fail_over_mac=none (the default, all slaves share the bond MAC) or fail_over_mac=follow (the bond MAC stays fixed and is moved to the slave that becomes active)
- Bridge over bond complains - net.ipv4.conf.all.arp_filter=1 or arp_ignore=1 are needed on the bonding interface
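When checking slave state, the per-slave link status can be pulled out of /proc/net/bonding/bond0 with a short awk filter. A minimal sketch that relies on the "Slave Interface:" and "MII Status:" lines of that file; the helper name bond_slave_status is illustrative.

```shell
# Print "<slave> <MII status>" for each slave of a bond.
# $1: status file (default /proc/net/bonding/bond0)
bond_slave_status() {
    awk '
        /^Slave Interface:/ { slave = $3 }
        # The bond-level "MII Status" line comes before any slave and is skipped.
        /^MII Status:/ && slave != "" { print slave, $3 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

A slave reported as down here while the cable is plugged in usually points at the switch port or the NIC rather than at the bond configuration.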