As your network and its internet traffic grow, it starts to make sense to peer directly with other networks rather than simply pay an ISP to deliver all that traffic. Peering is one of the areas in life where the Pareto principle, also known as the 80-20 rule, applies: 20% of your potential peers are responsible for 80% of the peering traffic. (Give or take.) So you’ll want to send a really nice message to your largest prospective peers, and possibly do private peering using dedicated circuits. But at some point, you’re going to hit diminishing returns as you send peering requests and set up BGP sessions towards smaller and smaller networks.
The good news is that there’s a better way to exchange traffic with lots of small peers than to set up BGP sessions with each one: peering over an internet exchange with a route server.
Normally, when AS 123 peers with AS 456 and AS 456 peers with AS 789, AS 456 won’t propagate AS 123’s prefixes to AS 789—if AS 123 and AS 789 want to exchange traffic, they’ll have to peer directly. Figure 1 shows regular peering over an IX, where each network has a direct BGP session with the networks it peers with.
Figure 1: Direct peering over an internet exchange.
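To make the export rule behind figure 1 concrete, here is a minimal Python sketch rather than a real BGP implementation. It assumes the conventional peering policy of announcing only your own and your customers’ routes to peers. The `Route` class, the `learned_from` labels and the 198.51.100.0/24 prefix for AS 456 are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple      # leftmost AS is the most recent one
    learned_from: str   # "local", "customer", or "peer" (illustrative labels)

def export_to_peer(route):
    # Assumed conventional policy: routes learned from one peer
    # are not passed on to another peer.
    return route.learned_from in ("local", "customer")

def advertise_to_peer(local_as, table):
    # Prepend our own AS to every route we are willing to export.
    return [
        Route(r.prefix, (local_as,) + r.as_path, "peer")
        for r in table
        if export_to_peer(r)
    ]

# AS 456 has its own (hypothetical) prefix plus AS 123's prefix learned over peering.
as456_table = [
    Route("198.51.100.0/24", (), "local"),
    Route("1.2.3.0/24", (123,), "peer"),
]
sent_to_as789 = advertise_to_peer(456, as456_table)
print([(r.prefix, r.as_path) for r in sent_to_as789])
# prints [('198.51.100.0/24', (456,))]
```

AS 789 never hears about 1.2.3.0/24 from AS 456, which is why AS 123 and AS 789 need a session of their own.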
However, things are different for route servers. Route servers do propagate prefixes from one peer to all their other peers. At least, that’s how route servers typically operate today. In the 1990s, there were some route servers that used extensive filters to limit the propagation of prefixes.
In figure 2, the networks from figure 1 no longer peer directly, but all maintain BGP sessions with AS 25, the route server. Note that now each AS has all prefixes, and the AS path is a hop longer because AS 25 appears in it.
Figure 2: BGP sessions towards a route server.
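The propagation in figure 2 can be sketched in a few lines of Python. This is an illustration under assumptions, not how any particular route server is implemented: `route_server_propagate` is a hypothetical helper, and the prefixes for AS 456 and AS 789 are invented documentation prefixes.

```python
def route_server_propagate(rs_as, announcements):
    """Return, per peer, the routes the route server sends to that peer.

    announcements maps a peer AS to a list of (prefix, as_path) tuples.
    Every route goes to every peer except the one it came from, with the
    route server's AS prepended, as in figure 2.
    """
    sent = {}
    for receiver in announcements:
        sent[receiver] = [
            (prefix, (rs_as,) + as_path)
            for sender, routes in announcements.items()
            if sender != receiver
            for prefix, as_path in routes
        ]
    return sent

announcements = {
    123: [("1.2.3.0/24", (123,))],
    456: [("198.51.100.0/24", (456,))],   # hypothetical prefix
    789: [("203.0.113.0/24", (789,))],    # hypothetical prefix
}
sent = route_server_propagate(25, announcements)
print(sent[789])
# prints [('1.2.3.0/24', (25, 123)), ('198.51.100.0/24', (25, 456))]
```

With a single session to AS 25, AS 789 now learns the prefixes of both other networks.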
Under normal circumstances, the traffic would now flow through the route server, because a BGP router normally replaces the next hop address with its own address when it propagates a route to a router in another AS. That would make running a route server on a big internet exchange a non-starter. However, BGP is smart enough to recognize that all the route server peers are connected to the same subnet (the internet exchange peering LAN). So in this case, BGP leaves the next hop address unchanged. This means that when AS 456 gets the route to 1.2.3.0/24 from the route server, the next hop address isn’t the route server’s address, but the AS 123 router’s address. As a result, packets don’t flow through the route server but are delivered directly to the right AS, as shown in figure 3.
Figure 3: Traffic flow when peering with a route server.
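The next hop decision can be approximated as follows. This is a simplified sketch of the shared-subnet check, not the full BGP rule set. The peering LAN 192.0.2.0/24 and all router addresses are assumptions chosen for the example, and `next_hop_to_advertise` is a hypothetical helper.

```python
import ipaddress

def next_hop_to_advertise(received_next_hop, own_address, peer_address, peering_lan):
    """Pick the next hop to put in an update sent to peer_address.

    If the original next hop and the receiving peer are both on the
    peering LAN, keep the original next hop. Otherwise fall back to
    our own address, which is the usual behavior between ASes.
    """
    lan = ipaddress.ip_network(peering_lan)
    next_hop = ipaddress.ip_address(received_next_hop)
    peer = ipaddress.ip_address(peer_address)
    if next_hop in lan and peer in lan:
        return str(next_hop)
    return own_address

# Assumed addressing: AS 123 router .123, route server .25, AS 456 router .56.
kept = next_hop_to_advertise("192.0.2.123", "192.0.2.25", "192.0.2.56", "192.0.2.0/24")
print(kept)
# prints 192.0.2.123

# A next hop outside the peering LAN is replaced by the sender's own address.
replaced = next_hop_to_advertise("10.0.0.1", "192.0.2.25", "192.0.2.56", "192.0.2.0/24")
print(replaced)
# prints 192.0.2.25
```

In the first case AS 456 forwards packets for 1.2.3.0/24 straight to the AS 123 router, so the route server only handles BGP updates.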
However, there are also some networks that have an open peering policy and will peer directly with anyone, but don’t peer with route servers, because that would mean giving up control over their peerings. For instance, when there is an issue with a peer, it’s useful to be able to temporarily shut down the BGP session with that peer. If the peering happens through a route server, you’ll either have to wait for the route server operator to take action or shut down the BGP session towards the route server and impact lots of other peers, too.
Most networks prefer to peer through the route servers exclusively when possible, while other networks like to add direct peering in addition to the route server peering. The argument in favor of a route-server-only policy is that it keeps the amount of work and the number of BGP sessions to a minimum. On the other hand, adding direct peering has the advantage that if there’s an issue with the route server peering, the direct peering is still there to carry the traffic.
The BGP session towards a route server is configured exactly the same way as any other BGP session used for peering. The route server itself can also be a standard BGP router, or a Unix(-like) system running BGP software. The route server is configured to allow prefixes from each peer to propagate to other peers. A regular router performing route server duties will include its own AS in the AS path for prefixes it propagates, as shown in figures 2 and 3. This makes paths learned through the route server a hop longer, so paths learned through direct peering will be preferred. However, it’s not uncommon for route servers to be set up to leave out their own AS number in the AS path propagated to peers, so there’s no impact on AS path length.
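The effect on path selection can be illustrated with a small Python sketch. It only compares AS path length, which is just one of the steps a real BGP implementation goes through when selecting a path. The `transparent` flag and both helper functions are names invented for this example.

```python
def propagate_path(as_path, rs_as, transparent):
    # A transparent route server leaves its own AS out of the path;
    # a regular router doing route server duty prepends it.
    return as_path if transparent else (rs_as,) + as_path

def prefer(candidates):
    # Shortest AS path wins. On a tie, min() keeps the first candidate,
    # standing in for the later tie-breakers this sketch leaves out.
    return min(candidates, key=lambda candidate: len(candidate[1]))

direct = ("direct", (123,))
via_rs = ("route server", propagate_path((123,), 25, transparent=False))
via_transparent_rs = ("route server", propagate_path((123,), 25, transparent=True))

print(via_rs[1])                       # prints (25, 123)
print(prefer([via_rs, direct])[0])     # prints direct
print(via_transparent_rs[1])           # prints (123,)
```

With the transparent setup, both paths to AS 123 have the same length, so which one wins depends on other attributes or on local policy such as local preference.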