Why BGP
Inside a single organization, interior gateway protocols (IGPs) do the job: OSPF, IS-IS, RIP. They converge quickly, and the link-state ones (OSPF, IS-IS) give every router the full topology, but they do not scale to the size of the internet.
BGP is an EGP (exterior gateway protocol). It:
- Runs between autonomous systems (AS), independent networks each with its own number (ASN, for example 65001 for private use or 13335 for Cloudflare)
- Does not share topology, only reachability: "AS 65001 can reach prefix 10.0.0.0/24 via the path AS 65002 -> AS 65003"
- Picks routes not by shortest metric, but by policy: prefer cheaper transit, do not leak prefixes to competitors, and so on
The internet is roughly 75,000 ASes exchanging BGP updates with each other. The full internet BGP table currently holds about 950,000 IPv4 prefixes.
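The "reachability, not topology" idea can be sketched in a few lines: each AS prepends its own number when it passes a route on, and drops any route that already contains its own number. This is a minimal illustration, not a BGP implementation; the function names `advertise` and `accept` are made up for this example.

```python
def advertise(as_path, local_asn):
    """AS-path as sent to an eBGP neighbor: our own ASN is prepended."""
    return [local_asn] + as_path


def accept(as_path, local_asn):
    """Loop prevention: reject a route whose AS-path already contains our ASN."""
    return local_asn not in as_path


# AS 65003 originates 10.0.0.0/24 and advertises it to AS 65002,
# which passes it on to AS 65001. The path grows by one ASN per eBGP hop.
path_at_65002 = advertise([], 65003)
path_at_65001 = advertise(path_at_65002, 65002)

print(path_at_65001)                                      # [65002, 65003]
print(accept(path_at_65001, 65001))                       # True
# If AS 65001 sent the route back toward AS 65003, it would be rejected there.
print(accept(advertise(path_at_65001, 65001), 65003))     # False
```

AS 65001 learns only the list of ASes on the way to the prefix, never the routers or links inside them.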
eBGP vs iBGP
| Type | Between | Notes |
|---|---|---|
| eBGP | different AS | TTL=1 by default (neighbors expected on the same link), AS-path grows |
| iBGP | within one AS | full mesh (or route reflectors) required, AS-path unchanged |
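You need iBGP when you have multiple BGP routers inside one AS. They must all hold the same routes to forward transit traffic consistently, and because a route learned over iBGP is not passed on to other iBGP peers, every router would have to peer with every other. Instead of a full mesh you can use a route reflector: one router peers with all the others and reflects updates between them.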
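To see why the full mesh becomes a burden, it helps to count sessions. The sketch below assumes every non-reflector router is a client of every reflector and that the reflectors peer with each other in a full mesh; both function names are illustrative.

```python
def full_mesh_sessions(routers):
    """iBGP sessions needed when every router peers with every other."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers, reflectors=1):
    """Sessions when each client peers only with the reflectors.

    Assumption: reflectors are fully meshed among themselves.
    """
    clients = routers - reflectors
    return clients * reflectors + reflectors * (reflectors - 1) // 2


print(full_mesh_sessions(10))            # 45
print(route_reflector_sessions(10))      # 9
print(route_reflector_sessions(10, 2))   # 17
```

The full mesh grows quadratically with the number of routers, while the route-reflector design grows roughly linearly, even with a second reflector for redundancy.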
Session states (FSM)
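A BGP session runs over TCP port 179. The state machine covers both the TCP connection and the BGP handshake that follows:
Idle -> Connect -> OpenSent -> OpenConfirm -> Established (Active is the fallback when the TCP attempt fails)
- Idle - initial state, waiting for a start trigger
- Connect - attempting to open TCP to the neighbor
- Active - the TCP attempt failed; listening for the neighbor to connect and retrying
- OpenSent - OPEN sent, waiting for the neighbor's OPEN
- OpenConfirm - OPENs exchanged, waiting for the first KEEPALIVE
- Established - session up, prefix exchange can begin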
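If show ip bgp summary shows a neighbor stuck in Active, TCP is not getting through (firewall, wrong neighbor IP, no route to the neighbor). A wrong AS number looks similar but is a different failure: TCP connects, the OPEN is rejected, and the session drops back without ever reaching Established. See cmd-vtysh.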
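The session states can be read as a small transition table. The sketch below is a simplified model of session setup only; the event names (`start`, `tcp_ok`, `tcp_fail`, `retry_timer`, `open_ok`, `keepalive`, `update`) are invented for this example, and real implementations handle many more events and timers.

```python
# (current state, event) -> next state; anything not listed falls back to Idle.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_ok"): "OpenSent",         # TCP up, OPEN is sent
    ("Connect", "tcp_fail"): "Active",         # could not connect, wait and listen
    ("Active", "tcp_ok"): "OpenSent",          # neighbor connected to us instead
    ("Active", "retry_timer"): "Connect",      # try an outbound connection again
    ("OpenSent", "open_ok"): "OpenConfirm",    # neighbor's OPEN was acceptable
    ("OpenConfirm", "keepalive"): "Established",
    ("Established", "keepalive"): "Established",
    ("Established", "update"): "Established",
}


def step(state, event):
    """Apply one event; errors such as a rejected OPEN reset the session to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state


print(run(["start", "tcp_ok", "open_ok", "keepalive"]))  # Established
print(run(["start", "tcp_fail"]))                        # Active
print(run(["start", "tcp_ok", "open_bad"]))              # Idle
```

The second run is the "stuck in Active" case: the TCP connection never opens. The third is what a rejected OPEN (for example a mismatched AS number) looks like in this model: TCP works, but the session resets.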
BGP attributes (how the best path is chosen)
When a router receives the same prefix via multiple paths, it selects the best-path in this order:
- Local Preference (higher is better) - within an AS
- AS-path length (shorter is better)
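- Origin (IGP preferred over EGP, EGP over incomplete)
- MED (Multi-Exit Discriminator, lower is better; by default compared only between paths from the same neighboring AS)
- eBGP-learned paths over iBGP-learned paths
- IGP metric to next-hop (lower is better)
- Router ID (tie-breaker, lowest wins)
This is simplified. Vendor documentation lists more steps (Cisco documents 13), starting with the vendor-specific weight. In practice most decisions are settled early: a higher Local-Pref wins outright, and a shorter AS-path decides between paths with equal Local-Pref.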
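The simplified order above maps naturally onto a sort key: compare the first attribute, and move to the next only on a tie. The sketch below is illustrative only. The `Path` class and its field names are hypothetical, MED is compared across all paths (a real router by default compares it only between paths from the same neighboring AS), and vendor-specific steps such as weight are left out.

```python
from dataclasses import dataclass

# Lower rank is preferred.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    next_hop: str
    local_pref: int = 100
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def preference_key(path):
    """Tuple ordered like the list above; the smallest tuple is the best path."""
    return (
        -path.local_pref,                    # higher Local-Pref is better
        len(path.as_path),                   # shorter AS-path is better
        ORIGIN_RANK[path.origin],            # IGP, then EGP, then incomplete
        path.med,                            # lower MED is better
        0 if path.ebgp else 1,               # eBGP over iBGP
        path.igp_metric,                     # closer next-hop is better
        tuple(int(octet) for octet in path.router_id.split(".")),  # lowest ID
    )


def best_path(paths):
    return min(paths, key=preference_key)


short = Path("10.0.0.2", local_pref=100, as_path=(65002,))
preferred = Path("10.0.1.2", local_pref=200, as_path=(65003, 65004, 65005))
longer = Path("10.0.2.2", local_pref=100, as_path=(65002, 65009))

print(best_path([short, preferred]).next_hop)  # 10.0.1.2
print(best_path([short, longer]).next_hop)     # 10.0.0.2
```

The first result shows Local-Pref overriding a much shorter AS-path; the second shows AS-path length deciding once Local-Pref is equal.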
Minimal configuration (FRR)
Neighbor on a p2p link 10.0.0.0/30, local AS 65001, remote AS 65002:
router bgp 65001
 bgp router-id 1.1.1.1
 no bgp default ipv4-unicast
 neighbor 10.0.0.2 remote-as 65002
 address-family ipv4 unicast
  neighbor 10.0.0.2 activate
  network 192.168.10.0/24
 exit-address-family
What each line does:
- network - announce your prefix to the neighbor (it must be in the RIB)
- no bgp default ipv4-unicast - modern best practice: explicitly activate the address family per neighbor
- bgp router-id - stable 32-bit ID (often a loopback IP)
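Verify: in show ip bgp summary the State/PfxRcd column should show a number (the count of accepted prefixes) rather than a state name; a number there means the session is Established. On FRR 7.4 and later, an eBGP neighbor with no inbound or outbound policy shows (Policy) instead, because bgp ebgp-requires-policy is on by default.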
Prefix filters
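Without outbound filters, a router with more than one eBGP neighbor re-advertises routes learned from one neighbor to the others, which can turn you into a transit for half the internet. At minimum: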
ip prefix-list MY-PREFIXES seq 10 permit 192.168.10.0/24
router bgp 65001
 address-family ipv4 unicast
  neighbor 10.0.0.2 prefix-list MY-PREFIXES out
 exit-address-family
This sends only your prefix to neighbor 10.0.0.2; everything else hits the implicit deny at the end of the prefix-list.
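How the filter decides can be sketched as follows. This is a simplified model, not FRR's implementation: entries are checked in order, an entry without ge/le matches only the exact prefix and length, and anything unmatched is denied. The `permitted` helper and the list layout are made up for this example, and ge/le ranges are not modeled.

```python
import ipaddress

# Mirrors: ip prefix-list MY-PREFIXES seq 10 permit 192.168.10.0/24
MY_PREFIXES = [("permit", "192.168.10.0/24")]


def permitted(prefix, prefix_list):
    """Return True if the first matching entry permits the prefix."""
    network = ipaddress.ip_network(prefix)
    for action, entry in prefix_list:
        if network == ipaddress.ip_network(entry):  # exact prefix and length
            return action == "permit"
    return False  # implicit deny when nothing matches


print(permitted("192.168.10.0/24", MY_PREFIXES))  # True
print(permitted("192.168.10.0/25", MY_PREFIXES))  # False
print(permitted("0.0.0.0/0", MY_PREFIXES))        # False
```

The second result is a common surprise: a more specific prefix inside 192.168.10.0/24 is not covered by the entry, because the match is on the exact prefix length.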
BGP in the data center
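In modern DCs (Clos / spine-leaf), BGP has displaced OSPF even in the IGP role. This pattern is called BGP-as-IGP, usually eBGP between leaves and spines as described in RFC 7938. EVPN is related but distinct: it is a BGP address family that carries the overlay control plane. The reasons: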
- Simple policy control (route-map, prefix-list)
- Multipath (ECMP) across paths whose AS-paths differ but have equal length (multipath-relax)
- No area design required, unlike OSPF
- Clean separation of underlay from overlay (VXLAN/EVPN)
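The multipath point is easy to get wrong, so here is a small sketch of the eligibility rule as FRR documents it: without `bgp bestpath as-path multipath-relax` the whole AS-path must match the best path, with it only the length must match. The function name and the ASNs are illustrative, and the other multipath conditions (equal Local-Pref, MED and so on) are assumed to be already satisfied.

```python
def multipath_set(best, candidates, relax=False):
    """AS-paths allowed to share load with the best path.

    relax=False: the entire AS-path must be identical.
    relax=True:  only the AS-path length must be equal.
    """
    if relax:
        return [path for path in candidates if len(path) == len(best)]
    return [path for path in candidates if path == best]


# A leaf sees the same prefix through two spines in different ASes,
# plus one longer path that should never be used for ECMP.
via_spine1 = (65100, 65012)
via_spine2 = (65101, 65012)
detour = (65102, 65013, 65012)
candidates = [via_spine1, via_spine2, detour]

print(multipath_set(via_spine1, candidates))              # [(65100, 65012)]
print(multipath_set(via_spine1, candidates, relax=True))  # [(65100, 65012), (65101, 65012)]
```

With the relaxed rule both spines carry traffic; the longer detour is excluded either way.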
For a deep reference: "BGP in the Data Center" by Dinesh Dutt (published by O'Reilly for Cumulus, now NVIDIA) is the de facto standard.