Why kernel modules
The Linux kernel is a monolith, but it is not static. A Loadable Kernel
Module (LKM) is an object file (.ko) that you can load into a running
kernel and unload again without a reboot. This gives you:
- Hardware drivers loaded on demand. You plug in a USB stick and
usb_storage loads.
- A modular kernel. Distributions build a minimal vmlinuz and keep the
rest as modules. A RHEL/Ubuntu kernel is roughly a 20 MB vmlinuz plus
hundreds of MB of modules under /lib/modules/.
- Optional features. nf_nat, vfat, and nvidia.ko load only when they are needed.
- Out-of-tree modules. Drivers kept outside mainline, for licensing or other reasons (NVIDIA, ZFS, VirtualBox), work without patching the kernel.
Alternatives to an LKM:
- eBPF (ebpf-basics) is sandboxed and needs no ring0 trust. It cannot do everything a module can, though (a new driver, a new syscall).
- Userspace drivers through uio/vfio, used for NICs (DPDK) and accelerators.
Loading and unloading
Three levels of abstraction:
insmod / rmmod (raw)
insmod /lib/modules/$(uname -r)/kernel/fs/fat/vfat.ko
rmmod vfat
These do not resolve dependencies. If module A.ko needs B.ko and B
is not loaded, you get Unknown symbol in module. Use them rarely, only
for debugging.
modprobe (with dependencies)
modprobe vfat
modprobe -r vfat # remove
modprobe -v nf_nat # verbose
modprobe --first-time vfat # error if already loaded
It resolves dependencies through modules.dep (built by depmod). It
loads fat first (the dependency), then vfat.
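As a rough illustration of what modprobe reads, the sketch below looks up one module's entry in modules.dep-style text. It assumes the usual text layout (path/to/module.ko: dep1.ko dep2.ko ..., paths relative to the module directory). The helper name deps_of is made up for this example, and unlike modprobe it does not treat - and _ in module names as equivalent.

```shell
# deps_of NAME: read modules.dep-style text on stdin and print the
# dependency paths recorded for module NAME, one per line.
# Hypothetical helper for illustration only.
deps_of() {
    awk -v name="$1" '
    {
        path = $1
        sub(/:$/, "", path)                 # strip the trailing colon
        n = split(path, parts, "/")
        base = parts[n]                     # file name without directories
        sub(/\.ko(\..*)?$/, "", base)       # drop .ko, .ko.xz, .ko.zst, ...
        if (base == name)
            for (i = 2; i <= NF; i++)
                print $i
    }'
}

# Example with inline sample data:
printf '%s\n' \
    'kernel/fs/fat/vfat.ko: kernel/fs/fat/fat.ko' \
    'kernel/fs/fat/fat.ko:' | deps_of vfat
```

On a real system you would feed it /lib/modules/$(uname -r)/modules.dep. For everyday use, modprobe --show-depends vfat prints the load sequence without loading anything.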
Module parameters:
modprobe nvidia NVreg_PreserveVideoMemoryAllocations=1
Persistent parameters go in /etc/modprobe.d/<name>.conf:
options nvidia NVreg_PreserveVideoMemoryAllocations=1
blacklist nouveau
blacklist stops a module from loading automatically, but you can still
load it by hand. To block it entirely, use install <module> /bin/false.
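The difference between the two directives can be checked mechanically. This is a minimal sketch, assuming one directive per line and the exact install <module> /bin/false form described above; module_policy is a hypothetical helper name, not part of kmod.

```shell
# module_policy NAME: classify NAME from modprobe.d-style lines on stdin.
# Prints "blocked" (install NAME /bin/false), "blacklisted", or "default".
module_policy() {
    awk -v name="$1" '
    $1 == "install"   && $2 == name && $3 == "/bin/false" { blocked = 1 }
    $1 == "blacklist" && $2 == name                       { listed = 1 }
    END {
        if (blocked)      print "blocked"
        else if (listed)  print "blacklisted"
        else              print "default"
    }'
}

# Example with inline sample data:
printf '%s\n' \
    'options nvidia NVreg_PreserveVideoMemoryAllocations=1' \
    'blacklist nouveau' | module_policy nouveau
```

To run it against the effective configuration, pipe in modprobe --showconfig instead of the sample lines.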
depmod: the dependency tree
depmod -a
It scans /lib/modules/$(uname -r)/, reads the symbols from every .ko,
and builds:
- modules.dep: "A depends on B, C"
- modules.symbols: "symbol foo lives in A.ko"
- modules.alias: "alias 'pci:v0x8086d0x10b8*' = e1000.ko"
- modules.builtin: what is compiled into vmlinuz
It runs automatically during make modules_install and when a
linux-image package is installed. You rarely run it by hand.
Automatic loading
Modern systemd-udev does this on its own through alias matching:
- A PCI device has a vendor/device ID
- The kernel and udev generate a MODALIAS string
- The match against modules.alias loads the right module
Check it:
udevadm test /sys/class/net/eth0 # what matches
USB works the same way. For filesystems, the kernel tries to load the
module when you run mount -t vfat ....
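The alias match itself is glob matching, which the shell can imitate. The sketch below assumes modules.alias lines of the form alias <pattern> <module> and uses made-up sample data; match_alias is a hypothetical helper, and the real lookup is done by kmod, not by a shell loop.

```shell
# match_alias MODALIAS: read modules.alias-style lines on stdin and print
# every module whose alias pattern matches MODALIAS.
match_alias() {
    modalias=$1
    while read -r keyword pattern module; do
        [ "$keyword" = "alias" ] || continue   # skip comments and other lines
        case $modalias in
            $pattern) printf '%s\n' "$module" ;;   # unquoted: treated as a glob
        esac
    done
}

# Example with inline sample data (8086:100E is the Intel 82540EM NIC):
printf '%s\n' \
    'alias pci:v00008086d0000100Esv*sd*bc*sc*i* e1000' \
    'alias usb:v*p*d*dc*dsc*dp*ic08isc06ip50in* usb_storage' |
    match_alias 'pci:v00008086d0000100Esv00001AF4sd00001100bc02sc00i00'
```

A device's real MODALIAS string can be read from its modalias file in sysfs, for example /sys/class/net/eth0/device/modalias, and modprobe --resolve-alias performs the same lookup against the installed index.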
modinfo: module details
modinfo nvidia
filename: /lib/modules/.../nvidia.ko
version: 535.183.01
license: NVIDIA
description: NVIDIA Driver
author: NVIDIA Corporation
depends: ...
retpoline: Y
intree: N
vermagic: 6.5.0-44-generic SMP preempt mod_unload
parm: NVreg_OpenRmEnableUnsupportedGpus:int
...
Useful for:
- vermagic: which kernel it was built against (if it does not match, it will not load without modprobe --force)
- intree: N: out-of-tree (see taint below)
- parm: the parameters you can set
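A quick pre-flight check is to compare the release part of vermagic with the running kernel. This sketch compares only the first word. The kernel itself compares the whole vermagic string, flags included, so a match here is necessary but not sufficient. vermagic_matches is a hypothetical helper name.

```shell
# vermagic_matches VERMAGIC RELEASE: succeed if the first word of VERMAGIC
# (the kernel release the module was built for) equals RELEASE.
vermagic_matches() {
    built_for=${1%% *}          # keep everything before the first space
    [ "$built_for" = "$2" ]
}

# Example with the vermagic string from the modinfo output above:
if vermagic_matches '6.5.0-44-generic SMP preempt mod_unload' '6.5.0-44-generic'; then
    echo "release matches"
else
    echo "release differs"
fi
```

Against a live system: vermagic_matches "$(modinfo -F vermagic nvidia)" "$(uname -r)".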
/proc/modules and lsmod
lsmod | head
Module Size Used by
nvidia_uvm 1810432 0
nf_conntrack_netlink 65536 0
nf_nat 57344 0
nf_conntrack 196608 2 nf_nat,nf_conntrack_netlink
The columns are name, size in memory, and the refcount plus who uses it. When the refcount is above 0, you cannot unload the module:
rmmod: ERROR: Module nf_conntrack is in use by: nf_nat,nf_conntrack_netlink
Unload in the right order (the dependents first).
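To find the dependents without reading the table by eye, you can parse /proc/modules, where the fourth field is a comma-terminated list of users or a single -. The sketch below assumes that layout and uses sample lines modelled on the lsmod output above; users_of is a hypothetical helper name.

```shell
# users_of NAME: read /proc/modules-style lines on stdin and print the
# modules that hold a reference on NAME, one per line.
users_of() {
    awk -v name="$1" '
    $1 == name && $4 != "-" {
        n = split($4, users, ",")       # field 4 looks like "a,b," or "-"
        for (i = 1; i <= n; i++)
            if (users[i] != "")
                print users[i]
    }'
}

# Example with inline sample data:
printf '%s\n' \
    'nf_nat 57344 0 - Live 0x0000000000000000' \
    'nf_conntrack 196608 2 nf_nat,nf_conntrack_netlink, Live 0x0000000000000000' |
    users_of nf_conntrack
```

Run users_of nf_conntrack < /proc/modules on a real system. A nonzero refcount with no listed users usually means something other than a module holds the reference, such as a mounted filesystem or an open device.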
Module signing
Since version 3.7 the kernel supports module signing. On a UEFI Secure Boot system this is usually enforced: an unsigned module will not load.
Signing:
/usr/src/linux-headers-$(uname -r)/scripts/sign-file sha256 \
MOK_PRIVATE_KEY.pem MOK_CERT.x509 my_module.ko
The certificate has to be in the kernel keyring or in the MOK (Machine Owner Key):
mokutil --import my_cert.der
# reboot, enroll in shim mokmanager, enter the password
After enrollment the kernel trusts that certificate.
Signing state:
cat /proc/sys/kernel/modules_disabled # 1 = loading is blocked entirely
modinfo -F signer <m> # prints the signer if the module file is signed, nothing otherwise
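sign-file appends the signature to the end of the .ko, followed by a fixed marker string, so a signed file can be recognized from its last bytes. A minimal sketch, assuming an uncompressed .ko (a .ko.xz or .ko.zst has to be decompressed first); has_signature is a hypothetical helper name. It only detects the marker and does not verify the signature.

```shell
# has_signature FILE: succeed if FILE ends with the marker that module
# signing appends after the signature data.
has_signature() {
    marker='~Module signature appended~'
    # The marker plus its trailing newline is 28 bytes; NUL bytes are dropped
    # so that arbitrary binary tails do not upset the shell.
    tail_text=$(tail -c 28 "$1" | tr -d '\000')
    [ "$tail_text" = "$marker" ]
}
```

Usage: has_signature my_module.ko && echo signed. For signed modules, modinfo also reports signer and sig_hashalgo fields.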
Lockdown mode
Kernel 5.4+ has the lockdown LSM:
- none: full root access to kernel memory, MSR, kexec, /dev/mem
- integrity: anything that lets userspace modify the running kernel (for example writing to kernel text) is blocked, but reads are allowed
- confidentiality: reads are blocked too (sensitive keys cannot be sniffed)
How it turns on:
- On UEFI Secure Boot it is usually integrity by default
- Or by hand: the kernel cmdline lockdown=integrity
- Or echo integrity > /sys/kernel/security/lockdown
The effect:
- Unsigned modules do not load
- kexec without a signature is blocked
- /dev/mem reads are blocked
- Some ioctls (SET_KERNEL, the ftrace marker) are blocked
It is used on hardened systems (Federal POS, ChromeOS, Tails).
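Reading /sys/kernel/security/lockdown shows all three modes with the active one in square brackets. The sketch below extracts it, assuming that format and that the lockdown LSM is enabled so the file exists; active_lockdown is a hypothetical helper name.

```shell
# active_lockdown: read the contents of /sys/kernel/security/lockdown on
# stdin and print the mode shown in square brackets (the active one).
active_lockdown() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Example with a sample line:
printf '%s\n' 'none [integrity] confidentiality' | active_lockdown
```

On a real system: active_lockdown < /sys/kernel/security/lockdown.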
Out-of-tree modules and DKMS
Some modules are not in the mainline kernel: NVIDIA, ZFS, VirtualBox, and WireGuard before 5.6. After every kernel upgrade these modules have to be rebuilt.
DKMS (Dynamic Kernel Module Support) automates that:
- The module source and spec live in /usr/src/<module>-<version>/
- When new kernel-headers are installed, DKMS rebuilds every registered module
- It puts the .ko in /lib/modules/$(new_kernel)/updates/dkms/
- depmod runs automatically
Registration:
dkms add -m my-driver -v 1.0
dkms build -m my-driver -v 1.0
dkms install -m my-driver -v 1.0
dkms status
Without DKMS, an apt upgrade && reboot loses NVIDIA and you get a black
screen.
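The spec DKMS reads is a dkms.conf file in the source directory. As a sketch, the function below prints a minimal one. The package name, version, and module name are placeholders, and a real driver usually needs more directives (for example a custom MAKE[0] line). dkms_conf is a hypothetical helper, not a DKMS command.

```shell
# dkms_conf PACKAGE VERSION MODULE: print a minimal dkms.conf on stdout.
# All three values are placeholders chosen for this example.
dkms_conf() {
    cat <<EOF
PACKAGE_NAME="$1"
PACKAGE_VERSION="$2"
BUILT_MODULE_NAME[0]="$3"
DEST_MODULE_LOCATION[0]="/updates/dkms"
AUTOINSTALL="yes"
EOF
}

# Example: the file that would sit next to the sources for "dkms add".
dkms_conf my-driver 1.0 my_driver
```

Redirect the output to /usr/src/my-driver-1.0/dkms.conf before running the dkms add, build, and install steps shown above.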
Kernel taint
When a "bad" module loads, the kernel marks itself tainted:
cat /proc/sys/kernel/tainted
The bitmap:
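- 1 (P): proprietary module loaded (NVIDIA)
- 2 (F): module force-loaded (modprobe --force)
- 4 (S): kernel running on an out-of-specification system (historically: SMP on CPUs not designed for SMP)
- 512 (W): WARN_ON triggered
- 4096 (O): out-of-tree (externally built) module loaded
- 8192 (E): unsigned module loaded
- 32768 (K): kernel was live-patched
- 65536 (X): auxiliary taint, defined for and used by distributions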
In a bug report the maintainers look at the taint first. If it is P, you get no support (NVIDIA is your problem). Decode it:
cat /proc/sys/kernel/tainted
# or
dmesg | grep -i taint
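The number in /proc/sys/kernel/tainted is a bitmask, so decoding it is a matter of testing each bit. The sketch below covers only a subset of the flags, with bit values taken from the kernel's tainted-kernels documentation; decode_taint is a hypothetical helper name. The kernel source tree ships a fuller script as tools/debugging/kernel-chktaint.

```shell
# decode_taint VALUE: print the letters of the taint flags set in VALUE.
# Covers a subset of flags only: P F S W O E K X.
decode_taint() {
    value=$1
    flags=""
    for entry in 1:P 2:F 4:S 512:W 4096:O 8192:E 32768:K 65536:X; do
        bit=${entry%%:*}        # numeric value before the colon
        letter=${entry#*:}      # flag letter after the colon
        if [ $(( value & bit )) -ne 0 ]; then
            flags="$flags$letter"
        fi
    done
    printf '%s\n' "$flags"
}

# Example: 4097 = 1 (proprietary) + 4096 (out-of-tree)
decode_taint 4097
```

On a live system: decode_taint "$(cat /proc/sys/kernel/tainted)". An empty line means none of the listed flags is set.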
Module init/exit
Every module has:
#include <linux/init.h>
#include <linux/module.h>
static int __init my_init(void) {
    pr_info("loaded\n");
    return 0;
}
static void __exit my_exit(void) {
    pr_info("unloaded\n");
}
module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("...");
insmod triggers the function registered with module_init. If init returns
nonzero (by convention a negative errno), the load fails and the module is
unloaded without a call to module_exit. rmmod triggers the function
registered with module_exit.
The pr_info logs (= printk(KERN_INFO ...)) show up through
[[cmd-dmesg|dmesg]] or journalctl -k.
When something goes wrong
- Required key not available: Secure Boot is on and the module is not signed. Sign it or disable Secure Boot (at your own risk).
- Exec format error: the vermagic does not match. The module was built for a different kernel. Rebuild it.
- Unknown symbol in module: a dependency is not loaded. Use modprobe instead of insmod. Or the dependency does not exist (a classic pain with out-of-tree modules).
- Cannot rmmod: Module is in use. Find what uses it (lsmod | grep <name>) and unload the dependents first.
- Kernel panic on load: the module is broken. Boot from another initramfs and delete the .ko from /lib/modules/.../updates/.
- DKMS does not rebuild after an upgrade: run dkms autoinstall by hand and check the logs in /var/lib/dkms/<module>/<ver>/build/make.log.
- A Lockdown: ... is restricted message: lockdown mode is blocking the action. Check dmesg | grep -i lockdown.
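The list above can be folded into a small lookup for scripts that wrap modprobe. This is only a sketch: it matches on the message fragments quoted in this section, the exact wording can differ between kernel and kmod versions, and explain_load_error is a hypothetical helper name.

```shell
# explain_load_error MESSAGE: print a one-line hint for a module load or
# unload error, based on the message fragments listed above.
explain_load_error() {
    case $1 in
        *"Required key not available"*) echo "module is not signed: sign it or change Secure Boot policy" ;;
        *"Exec format error"*)          echo "vermagic mismatch: rebuild for the running kernel" ;;
        *"Unknown symbol in module"*)   echo "missing dependency: use modprobe instead of insmod" ;;
        *"is in use"*)                  echo "module has users: unload the dependents first" ;;
        *"Lockdown:"*)                  echo "blocked by lockdown: check dmesg" ;;
        *)                              echo "unrecognized: check dmesg" ;;
    esac
}

# Example with the rmmod error shown earlier:
explain_load_error 'rmmod: ERROR: Module nf_conntrack is in use by: nf_nat,nf_conntrack_netlink'
```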