Why VLANs
By default, a physical switch forms a single [[broadcast-domain|broadcast domain]]. Every host receives every other host's broadcasts, ARP requests are flooded to all ports, and there is no traffic isolation. If accounting must not reach the DMZ, you need two physical switches.
A VLAN solves this virtually: one switch behaves like several independent ones. Hosts in different VLANs cannot see each other at L2. Communication between VLANs requires a [[default-gateway|router]] (inter-VLAN routing).
802.1Q: the tagged frame
A standard [[ethernet-frame|Ethernet frame]] looks like this:
| dst-MAC | src-MAC | EtherType | payload | FCS |
802.1Q inserts 4 tag bytes between src-MAC and EtherType:
| dst-MAC | src-MAC | 0x8100 | TCI | EtherType | payload | FCS |
- 0x8100 (the TPID, in the position where EtherType normally sits) signals that an 802.1Q tag follows
- TCI (Tag Control Information), 16 bits:
  - PCP (3 bits) - QoS priority 0-7
  - DEI (1 bit) - drop eligible
  - VID (12 bits) - VLAN ID, 0-4095 (0 and 4095 are reserved, so usable IDs are 1-4094)
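The tag layout above can be sketched in Python with the standard struct module. This is a minimal illustration, not a full frame parser; the helper names build_tag and parse_tag are made up for this example.

```python
import struct

TPID_8021Q = 0x8100  # value in the EtherType position that marks an 802.1Q tag


def build_tag(vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Pack the 4 tag bytes: 16-bit TPID followed by 16-bit TCI."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VID must fit in 12 bits")
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must fit in 3 bits")
    if dei not in (0, 1):
        raise ValueError("DEI is a single bit")
    # TCI layout, most significant bit first: PCP(3) | DEI(1) | VID(12)
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_tag(tag: bytes) -> dict:
    """Split exactly 4 tag bytes back into TPID, PCP, DEI and VID."""
    tpid, tci = struct.unpack("!HH", tag)
    return {
        "tpid": tpid,
        "pcp": tci >> 13,
        "dei": (tci >> 12) & 0x1,
        "vid": tci & 0x0FFF,
    }


tag = build_tag(vid=10, pcp=5)
print(tag.hex())       # 8100a00a
print(parse_tag(tag))
```

With PCP 5 and VID 10 the TCI is 0xA00A: the top three bits carry the priority and the low twelve bits carry the VLAN ID.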
Access vs. Trunk
A switch port operates in one of two modes:
- Access port - the port belongs to one VLAN. A frame from the host arrives at the switch untagged; the switch adds the tag internally. When the frame leaves toward the host, the tag is stripped. The host has no knowledge of VLANs.
- Trunk port - the port carries multiple VLANs. Frames travel with an 802.1Q tag. Used between switches or toward a host machine that processes tags itself (virtualization, containers).
VLAN 10 VLAN 20 VLAN 10
| | |
|access |access |access
[SW1] --- trunk --- [SW2]
(10,20)
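The two port modes can be modelled as a pair of rules: what VLAN an arriving frame is assigned to, and how a frame leaves. The sketch below is a deliberately simplified model (the Port class and both functions are hypothetical, and untagged frames on trunks are simply dropped here); real switches differ in details such as how an access port treats tagged frames.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Port:
    mode: str                                # "access" or "trunk"
    access_vlan: int = 1                     # used only in access mode
    allowed: FrozenSet[int] = frozenset()    # used only in trunk mode


def ingress(port: Port, tag_vid: Optional[int]) -> Optional[int]:
    """Return the VLAN assigned to an arriving frame, or None to drop it."""
    if port.mode == "access":
        # Hosts send untagged frames; this model drops tagged ones.
        return port.access_vlan if tag_vid is None else None
    # Trunk: the frame must carry a tag for an allowed VLAN.
    return tag_vid if tag_vid in port.allowed else None


def egress(port: Port, vlan: int) -> Optional[str]:
    """Describe how a frame in `vlan` leaves the port, or None if not sent."""
    if port.mode == "access":
        # The tag is stripped, so the host never sees it.
        return "untagged" if vlan == port.access_vlan else None
    return f"tagged vid={vlan}" if vlan in port.allowed else None


host_port = Port("access", access_vlan=10)
uplink = Port("trunk", allowed=frozenset({10, 20}))

vlan = ingress(host_port, None)   # host frame enters VLAN 10
print(vlan, egress(uplink, vlan))
```

A frame from a host in VLAN 10 is classified on the access port and then leaves the trunk carrying tag 10, which is how the second switch knows where it belongs.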
Native VLAN
A trunk port can carry one VLAN untagged, the native VLAN: a frame arriving without a tag is assigned to that VLAN, and frames in that VLAN leave the trunk untagged. On many switches the default native VLAN is VLAN 1.
This is a source of bugs and attacks (VLAN hopping). Best practices:
- Set the native VLAN explicitly (do not leave the default)
- Do not use VLAN 1 for user traffic
- On a trunk, prefer tagging everything
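Why a native VLAN mismatch causes trouble can be shown with a small model of trunk ingress. This is a sketch under simplifying assumptions: classify_on_trunk and its tag_native parameter are hypothetical names, and tag_native stands for a trunk configured to tag everything and drop untagged frames.

```python
from typing import Optional, Set


def classify_on_trunk(
    tag_vid: Optional[int],
    native_vlan: int,
    allowed: Set[int],
    tag_native: bool = False,
) -> Optional[int]:
    """Return the VLAN an arriving frame lands in, or None if it is dropped."""
    if tag_vid is None:
        # Untagged frame: it falls into the native VLAN,
        # unless the trunk is set to tag everything.
        if tag_native:
            return None
        return native_vlan if native_vlan in allowed else None
    return tag_vid if tag_vid in allowed else None


allowed = {1, 10, 20, 99}

# SW1 uses native VLAN 1 and therefore sends VLAN 1 frames untagged.
# SW2 uses native VLAN 99, so the same frame silently lands in VLAN 99.
print(classify_on_trunk(None, native_vlan=99, allowed=allowed))  # 99

# With everything tagged, the stray untagged frame is dropped instead.
print(classify_on_trunk(None, native_vlan=99, allowed=allowed, tag_native=True))  # None
```

The mismatch leaks traffic from one VLAN into another without any error, which is why setting the native VLAN explicitly and identically on both ends matters.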
VLANs on Linux
Sub-interface via ip link
ip link add link eth0 name eth0.10 type vlan id 10
ip addr add 10.10.0.1/24 dev eth0.10
ip link set eth0.10 up
This creates sub-interface eth0.10, which sends all its traffic out of eth0 tagged as VLAN 10 and receives the frames that arrive on eth0 with that tag.
VLAN-aware bridge (modern approach)
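Hosts that run virtual machines or containers (for example with libvirt, or with Docker and Kubernetes network plugins) can use a single Linux bridge with VLAN filtering instead of one sub-interface per VLAN: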
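ip link add br0 type bridge vlan_filtering 1
ip link set eth0 master br0                     # ports must be attached to the bridge first
ip link set eth1 master br0
bridge vlan add dev eth0 vid 10 pvid untagged   # access: pvid and untagged are flags
bridge vlan add dev eth1 vid 10                 # trunk: tagged is the default, there is no "tagged" keyword
bridge vlan add dev eth1 vid 20
ip link set br0 up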
One [[bridge|bridge]] emulates a switch with access and trunk ports.
Inter-VLAN routing
Hosts in VLAN 10 and VLAN 20 cannot reach each other at L2. To communicate, they need an L3 route. The standard approach is router-on-a-stick: one physical router interface with two sub-interfaces tagged 10 and 20, each with its own IP. One IP is the gateway for VLAN 10, the other for VLAN 20:
router# show ip route
10.10.0.0/24 -> eth0.10
10.20.0.0/24 -> eth0.20
Hosts in VLAN 10 set their default gateway to the IP of the sub-interface in VLAN 10.
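The forwarding decision can be sketched with Python's ipaddress module: a host sends to its gateway whenever the destination is outside its own subnet, and the router then picks the sub-interface whose network contains the destination. The function names and the host addresses are illustrative; the networks mirror the route table above.

```python
import ipaddress
from typing import Optional

# Connected routes of the router, one per tagged sub-interface.
ROUTES = {
    ipaddress.ip_network("10.10.0.0/24"): "eth0.10",
    ipaddress.ip_network("10.20.0.0/24"): "eth0.20",
}


def needs_gateway(src: str, dst: str, prefix: int = 24) -> bool:
    """A host uses its default gateway when dst is outside its own subnet."""
    own_net = ipaddress.ip_interface(f"{src}/{prefix}").network
    return ipaddress.ip_address(dst) not in own_net


def egress_interface(dst: str) -> Optional[str]:
    """Return the router sub-interface that reaches dst, or None if no route."""
    addr = ipaddress.ip_address(dst)
    for net, ifname in ROUTES.items():
        if addr in net:
            return ifname
    return None


# Host in VLAN 10 talks to a host in VLAN 20: via the gateway, out eth0.20.
print(needs_gateway("10.10.0.5", "10.20.0.7"))   # True
print(egress_interface("10.20.0.7"))             # eth0.20

# Same VLAN: delivered directly at L2, the router is not involved.
print(needs_gateway("10.10.0.5", "10.10.0.9"))   # False
```

On the wire, the routed frame crosses the same physical link twice: once tagged 10 toward the router and once tagged 20 on the way back.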
QinQ (double tagging): 802.1ad
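When customer VLAN tags need to pass through a provider that has its own VLANs, you use double encapsulation: an S-tag (provider service VLAN) is placed in front of the C-tag (customer VLAN). The S-tag uses EtherType 0x88a8; the inner C-tag keeps 0x8100.
Linux supports this through iproute2: type vlan proto 802.1ad for a sub-interface, and vlan_protocol 802.1ad for a VLAN-aware bridge.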
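The double-tag layout can be sketched in Python as two consecutive 4-byte tags, outer S-tag first. This is a minimal illustration that leaves PCP and DEI at zero; the helper names qinq_tags and parse_qinq are invented for the example.

```python
import struct
from typing import Tuple

TPID_S = 0x88A8  # 802.1ad service tag (outer)
TPID_C = 0x8100  # 802.1Q customer tag (inner)


def qinq_tags(s_vid: int, c_vid: int) -> bytes:
    """Return the 8 bytes that sit between src-MAC and the real EtherType."""
    for vid in (s_vid, c_vid):
        if not 1 <= vid <= 4094:
            raise ValueError("VID must be in 1-4094")
    # With PCP and DEI at zero, each TCI is just the VID.
    return struct.pack("!HHHH", TPID_S, s_vid, TPID_C, c_vid)


def parse_qinq(tags: bytes) -> Tuple[int, int]:
    """Return (s_vid, c_vid) from the first 8 tag bytes."""
    tpid_s, tci_s, tpid_c, tci_c = struct.unpack("!HHHH", tags[:8])
    if (tpid_s, tpid_c) != (TPID_S, TPID_C):
        raise ValueError("expected an 802.1ad S-tag followed by an 802.1Q C-tag")
    return tci_s & 0x0FFF, tci_c & 0x0FFF


tags = qinq_tags(s_vid=100, c_vid=10)
print(tags.hex())        # 88a800648100000a
print(parse_qinq(tags))  # (100, 10)
```

The provider switches only look at the outer tag (VLAN 100 here), so the customer's VLAN 10 travels through unchanged.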
Troubleshooting
- No ping across trunk - native VLAN mismatch between two switches, or the port is not in trunk mode (access is the default on many platforms)
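- VLAN frames are dropped - check that every device on the trunk path accepts frames 4 bytes larger than untagged ones (1522 bytes instead of 1518, including header and FCS); on platforms whose MTU setting counts the tag, that means an MTU of at least 1504 (1500 payload plus 4 tag bytes)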
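- VLAN-hopping attack - the attacker sits in the trunk's native VLAN and sends double-tagged frames; the first switch strips the outer tag and forwards the frame with the inner tag into the target VLAN. Mitigations: tag the native VLAN or use a dedicated, otherwise unused native VLAN, and disable DTP (auto-trunk negotiation) on host-facing ports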
- Host receives broadcasts from a foreign VLAN - the host's access port is configured in the wrong VLAN
- Jumbo frames with VLANs - the MTU must leave room for the tag: where the MTU setting counts the tag, use 9000 on access ports and 9004 on trunk ports
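The MTU figures above follow from simple frame-size arithmetic, sketched here in Python. The constants are the standard Ethernet field sizes; whether a given platform's MTU setting includes the tag varies, so treat the mapping to a concrete MTU value as something to verify per device.

```python
ETH_HEADER = 14  # dst-MAC (6) + src-MAC (6) + EtherType (2)
VLAN_TAG = 4     # TPID (2) + TCI (2)
FCS = 4          # frame check sequence


def frame_size(payload: int, tags: int = 0) -> int:
    """Total bytes on the wire for a payload with the given number of VLAN tags."""
    return ETH_HEADER + tags * VLAN_TAG + payload + FCS


for payload in (1500, 9000):
    for tags, label in ((0, "untagged"), (1, "802.1Q"), (2, "QinQ")):
        print(f"{payload:>5} payload, {label:<8}: {frame_size(payload, tags)} bytes")
```

A standard 1500-byte payload gives 1518 bytes untagged, 1522 with one tag and 1526 with two; every link on the path has to accept the largest of these that it will carry.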