11/04/2025 | Press release | Distributed by Public on 11/03/2025 19:15
In this series 'Network edge design', Brandon Hitzel shares practical lessons learned from designing edge networks, building on his first series on the topic.
Previously, I wrote a series of posts about network edge design, with a focus on using dual Internet Service Providers (ISPs), Demilitarized Zone Network (DMZ) delineation, and network edge designs and topologies.
After publishing that series, I received many questions about specific aspects of network edge design. If you haven't read the first series (part 1, part 2, part 3), I recommend starting there - it provides useful context and foundational concepts, and I won't be revisiting everything covered in that series.
In this new series, I'll go into more depth on some of the topics from the previous series and address some of the questions and comments from readers. By the end of it, you'll be better prepared to set up your network so it meets your goals.
We'll look at Border Gateway Protocol (BGP) and the physical design of your circuits, with a focus on on-premises networking. You will find some configuration examples, but not bulk republication of material you can find elsewhere.
We won't be looking at things like application programming interface (API) endpoints, load balancers, storage containers, and so on, even though they might be part of your edge network design.
In this blog post, we'll explore key considerations for circuit design at the network edge, including path diversity, active/active strategies, Autonomous System Number (ASN) filtering, BGP communities, and resilient edge architectures.
Circuit design
When designing a circuit, you want to consider the physical and logical network paths - including equipment and locations. You want to have redundant equipment and have diversity in fibre routes and geographic areas. What does this look like? Let's dive in.
The Common Language Location Identifier (CLLI) code is commonly used in the United States to identify large exchange and network interchange points like central offices (COs), smaller exchange Points of Presence (PoPs), and serving wiring centres (SWCs) in cable plant buildings. These facility terms are widely used in the provider world and are sometimes used interchangeably. They are carry-over terms from the old telephone days.
I'd be interested to know how other regions like Europe, India and Australia handle this. The CLLI code is important because different addresses can sometimes refer to the same location. The code helps identify this and document key sites in your design.
When designing a circuit, consider the path your traffic will take on your dual circuit edge network setup. Whether it's for intra-network connectivity between PoPs or creating diverse Internet paths without using local data centres' Internet access, it's important to trace out the path. Different addresses listed for a CO could still be within the same data centre 'campus', or collection of buildings that share infrastructure.
CLLI codes can help you determine whether locations are truly separate. Google Maps can help, as can the TelcoData.US telecommunications database.
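As a rough illustration, the check can be reduced to comparing the building portion of two CLLI codes. This sketch assumes the common 11-character layout (four characters for the place, two for the state or region, two for the building, and three for the equipment entity). The function name and the codes are made up for the example.

```python
def same_building(clli_a: str, clli_b: str) -> bool:
    """Return True when two CLLI codes share place, region and building.

    Assumes the common layout: 4-char place + 2-char region + 2-char
    building, optionally followed by a 3-char entity code.
    """
    a, b = clli_a.strip().upper(), clli_b.strip().upper()
    if len(a) < 8 or len(b) < 8:
        raise ValueError("need at least the 8-character building code")
    return a[:8] == b[:8]


# Made-up codes for illustration only.
print(same_building("EXMPTX01DS0", "EXMPTX01W12"))  # same building, different equipment
print(same_building("EXMPTX01DS0", "EXMPTX02DS0"))  # different buildings in the same place
```

Two circuits whose terminating codes match on the first eight characters land in the same building, whatever street addresses appear on the order forms.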
Typically, a location will connect to a local wiring centre where much of the local fibre terminates. This is often the first stop before reaching a larger CO or a peering point. These facilities can include Layer 1, 2, or 3 equipment, but they are usually focused on local connectivity and fibre splicing. Depending on the city, network, and provider, these roles may be combined or separate.
Here's how I recommend approaching path diversity, along with some questions to consider. I'll use fibre as an example because it's the most reliable and highest-performing medium, but you can apply the same principles to any Layer 1 technology. We'll cover both outside-plant components and logical IP considerations.
Start at your location: Diversity begins with equipment inside the building, such as Customer Premises Equipment (CPE). From there, the fibre exits the building to the street and then connects to the SWC. Next, trace the path from the SWC to the CO or PoP for upstream connectivity, and repeat this process for each segment. If you have internal tools like ArcGIS or Lightyear, use them. Otherwise, you can often request a KMZ file that shows the fibre path on a map.
Building multiple entry points into a site or between sites can be cost-prohibitive, so compromises - such as using a common conduit - may be necessary. Once the fibre reaches the street, check whether it uses different vaults for underground cable or separate poles for aerial cable. Which streets do your two fibres travel along? Is the connection passive optical or dark fibre? Avoid common paths wherever possible. If both fibres are aerial, the risk is higher than if they are underground.
Next, you may have two different SWCs located across town from each other. This helps ensure physical local access or last-mile diversity once the fibres, paths, and SWCs are confirmed in the topology. However, keep in mind that if they are in the same town, they might still share the same power source from the same utility provider.
If the building is off-net for you or the provider, you may need to contract with a Local Exchange Carrier (LEC) for network access. In such cases, the path might travel through a PoP before reaching a CO, where a network-to-network interconnect (NNI) exists. Inside these locations, fibre and equipment can be shared between circuits, so it's important to use different switches and routers within both your network and the provider's network for NNI. Hostnames and CLLI codes can help verify this. If there is only one NNI in the area for both circuits, consider using different providers. Investigate your circuit thoroughly if off-net connectivity is involved.
From there, analyse the middle-mile or intermediate path from the wiring centre to the CO in the same way you reviewed the first leg. At the egress point, check whether you have separate network terminations or interconnections with your neighbor's equipment. Aim for diversity in switches and routers. As mentioned in the previous series, if you have two circuits inside a data centre, use separate cross-connect cables whenever possible.
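Once each leg has been traced, the comparison itself is simple. The sketch below assumes you have written down every element each circuit touches (building entry, vault, SWC, CO, NNI and so on) as a list of labels. All of the labels here are hypothetical.

```python
def shared_elements(path_a, path_b):
    """Return the elements two circuit paths have in common, in path_a order."""
    in_b = set(path_b)
    return [hop for hop in path_a if hop in in_b]


# Hypothetical path records for two circuits leaving the same site.
circuit_1 = ["cpe-1", "entry-north", "vault-12", "swc-east", "co-a", "nni-1"]
circuit_2 = ["cpe-2", "entry-north", "vault-40", "swc-west", "co-b", "nni-2"]

print(shared_elements(circuit_1, circuit_2))  # ['entry-north']
```

Anything the function returns is a shared risk to either accept knowingly (such as a common conduit) or design out.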
Figure 3 shows a diverse dual-edge network design. The co-location site is a smaller data centre with limited connectivity, but it illustrates the design from a Layer 1 and Layer 2 perspective. Tools such as Geographic Information Systems (GIS) can show the actual physical fibre paths on a map.
In this example, Edge Router 2 connects to Provider 2, but the data centre is off-net for that provider. As a result, Provider 2 uses Provider 1's Layer 1 and Layer 2 connectivity to reach the site. Meanwhile, Provider 1's connection for Edge Router 1 uses direct fibre. This setup provides physical and logical cabling diversity once the connection leaves the co-location. The NNI is also at a different PoP than the one used by Edge Router 1, and the ASNs are diverse upstream for Layer 3 connectivity.
At Layer 3, consider the IP space and the BGP ASN paths your traffic will take. Is your Internet access from two providers with different ASNs, or from one provider using the same ASN? If it's only one ASN and there's a peering issue, your connectivity could be affected. For enterprise networks without provider-independent IP space, you may need to use the same provider. However, if you have your own IP space, you can establish multiple peerings with different entities for better routing resilience. The example above assumes provider-independent space.
Some providers require you to order two circuits at the same time under the same order if you want them to remain independently diverse. Without this, future maintenance could result in equipment being consolidated or cables being moved to the same card, unless the original order specifies otherwise.
Even if you use two different carriers in a dual-ISP network, the last mile may still come from a single provider that owns the building's facilities. This could mean both circuits share the same switch or fibre cable (as shown in Figure 3). Local verification or documented confirmation is essential. Even with wave circuits, review your fibre paths for both working and protection routes to ensure proper resiliency.
Finally, if you are evaluating different locations for co-location and edge routing, check the site's interconnection details in PeeringDB. A more interconnected site usually offers better pricing and more fibre options. Once you've chosen a location and want to select peers, tools like Hurricane Electric's BGP resources can help with research.
Active/active design
We talked about active/active setups in the previous series, and I wanted to add some more thoughts surrounding this type of edge network design.
In Figure 3, we showed edge PoPs in different cities connected through different transport mediums - one using dark fibre and the other using metro fibre. While this provides strong diversity, these options can have very different service-level agreements (SLAs) and performance characteristics, which is why I highlighted them.
This is an important factor when selecting access circuits. Metro fibre typically offers some middle-mile protection, whereas dark fibre often does not. On-net fibre generally provides better latency and jitter than an off-net circuit that traverses multiple provider networks and may even hairpin through an NNI.
The packet delivery SLA for the on-net fibre might be 99.99%, whereas the off-net circuit might only have a 99.9% packet delivery guarantee. That looks like a small difference, but it allows ten times as much packet loss, and some people will care about it. Consider this for your traffic in active/active edge routing, because if you have latency-sensitive flows, the customer experience could differ from one circuit to the other on a per-flow basis.
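To put numbers on the two guarantees, here is a small calculation of how many packets each SLA allows a provider to lose. The function name and the one-million-packet sample are chosen only for illustration.

```python
def max_lost_packets(delivery_pct: float, packets_sent: int) -> int:
    """Worst-case packets lost while still meeting a delivery SLA."""
    return round(packets_sent * (100 - delivery_pct) / 100)


print(max_lost_packets(99.99, 1_000_000))  # 100
print(max_lost_packets(99.9, 1_000_000))   # 1000
```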
This issue has become less significant in recent years with the rise of Software Defined Wide Area Networking (SD-WAN), improvements in network equipment, and overall Internet latency reductions. However, consider scenarios such as a branch office with a dedicated Internet access (DIA) primary link and a 4G or satellite backup. In such cases, Voice Over Internet Protocol (VoIP) performance can vary dramatically when running active/active. If there is a large difference in reliability, latency, or jitter between your two circuits, it's usually not advisable - even with SD-WAN - to run active/active for VoIP or other latency-sensitive applications.
When selecting two cities, analyse who you are serving from each location and what resources you are accessing between them. Different cities can mean different network paths, which can lead to varying latency. Even with 10 Gbps circuits, poor latency can degrade performance. Whenever possible, obtain latency metrics directly. If that's not feasible, use public tools to establish a baseline. For example, tools like Global Ping or Uniti Fibre's latency tool can provide useful insights.
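One way to see why latency matters regardless of circuit size: a single window-limited flow cannot move more than one window of data per round trip. The sketch below assumes a fixed 1 MiB window and a few sample round-trip times. Both are illustrative values, not measurements.

```python
def max_flow_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on a single window-limited flow, in Mbit/s."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000


WINDOW = 1_048_576  # assumed 1 MiB window, for illustration

for rtt in (5, 40, 150):
    print(f"{rtt} ms RTT -> {max_flow_mbps(WINDOW, rtt):.0f} Mbit/s")
```

With these assumptions the ceiling drops from well over 1 Gbit/s at 5 ms to a few tens of Mbit/s at 150 ms, however large the circuit is.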
Also, research features such as BGP multipath for multi-homed active/active designs. These can help you make better use of all active peers, improve capacity, and justify monthly costs.
ASN filtering
The question of whether to run full routing tables or partial tables often comes up, and the answer is usually 'it depends'. For an enterprise data centre or an on-premises data centre hosting a single website or application, a default route only is typically sufficient, as you are likely operating in an active/backup configuration at the border. If you are a provider with multiple peers and serving many networks, full tables with active/active routing may be more appropriate. Regardless of the approach, ensure you only advertise your own networks or your customers' networks - never anything else. A strict routing policy is essential.
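The "only advertise your own networks" rule can be expressed as a simple containment check. This is a minimal sketch using documentation prefixes in place of real address space. The allowed list and the function name are assumptions, and a production policy would normally also limit how specific an announcement may be.

```python
import ipaddress

# Documentation prefixes standing in for your own and your customers' space.
ALLOWED = [ipaddress.ip_network(p) for p in ("192.0.2.0/24", "198.51.100.0/24")]


def may_advertise(prefix: str) -> bool:
    """True if the prefix is equal to, or contained in, an allowed block."""
    net = ipaddress.ip_network(prefix)
    return any(
        net.version == allowed.version and net.subnet_of(allowed)
        for allowed in ALLOWED
    )


print(may_advertise("192.0.2.0/25"))    # True
print(may_advertise("203.0.113.0/24"))  # False
```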
There are scenarios where you might run active/backup, but still request full tables to gain flexibility through filtering. For example, if you have specific cloud destinations for outbound traffic or want to implement one-way load balancing, you could leak certain prefixes from one router to force traffic to exit through a preferred path. This can be done by adding a route-map entry for that prefix above your default-only rule and applying a higher local preference on the desired router. However, if your goal is to load-balance inbound traffic, full tables will not help, as inbound policy is controlled by external peers.
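The ordering described above (a specific entry placed ahead of the default-only rule) can be modelled as a first-match policy. This is a conceptual sketch in Python, not router configuration. The prefixes and local-preference values are illustrative assumptions.

```python
import ipaddress

# Ordered policy: first matching entry wins, anything unmatched is dropped.
POLICY = [
    {"match": "203.0.113.0/24", "local_pref": 200},  # leaked cloud prefix
    {"match": "0.0.0.0/0", "local_pref": 100},       # default-only rule
]


def apply_policy(prefix: str):
    """Return the local preference for an accepted route, or None if filtered."""
    net = ipaddress.ip_network(prefix)
    for entry in POLICY:
        if net == ipaddress.ip_network(entry["match"]):
            return entry["local_pref"]
    return None


print(apply_policy("203.0.113.0/24"))   # 200
print(apply_policy("0.0.0.0/0"))        # 100
print(apply_policy("198.51.100.0/24"))  # None
```

The router applying this policy installs the leaked prefix with the higher local preference, so traffic to that destination exits there while everything else follows the default.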
If you accept only local routes from each neighboring ASN, you can configure your routing as active/active and allow inbound flows to follow the natural paths within each provider's network. This typically results in symmetrical traffic without the need for prepending. You can also leak selected prefixes from each provider into your edge routing table. Eventually, you will need to choose a default route, and factors such as latency, bandwidth, and cost will influence that decision. Communities can also help control traffic, which we will cover later.
Accepting full Internet routing tables still requires sufficient router memory to handle the size of the routing information base (RIB), even if you filter most of the routes and end up with a default-only approach.
One advantage of receiving full tables without a default route - rather than using a default route only - is the ability to filter traffic based on destination ASNs or AS-PATHS. This can support use cases such as geo-blocking or blocking adversarial networks during normal operations or hijack scenarios. Without a default route, you can send undesirable traffic to null. Another benefit is filtering out excessive AS prepending from your route table.
The following Cisco IOS-XR example demonstrates this:
I tried several different IOS-XR combinations and tests to filter out excessive ASN prepending, and also searched Google/AI, but I couldn't get it working exactly as intended. Some examples I tried were:
All AI-provided examples failed.
However, the shown above will block excessively prepended routes, which the option cannot block. See the output below.
After filtering, note that AS65007 is gone because it was prepended excessively.
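The logic of a prepend filter, independent of any vendor syntax, is to measure the longest run of the same ASN in a path and drop the route when that run exceeds a threshold. The following is a conceptual Python sketch rather than router configuration. The paths, prefixes, and the threshold of three are made up for illustration.

```python
from itertools import groupby


def max_prepend_run(as_path):
    """Longest run of the same ASN appearing consecutively in the path."""
    return max((len(list(run)) for _, run in groupby(as_path)), default=0)


def accept(as_path, max_run=3):
    """Accept a route unless one ASN is repeated more than max_run times."""
    return max_prepend_run(as_path) <= max_run


# Hypothetical routes: prefix -> AS path as received.
routes = {
    "198.51.100.0/24": [65001, 65005, 65005],
    "203.0.113.0/24": [65001, 65007, 65007, 65007, 65007, 65007],
}
kept = [prefix for prefix, path in routes.items() if accept(path)]
print(kept)  # ['198.51.100.0/24']
```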
With IOS-XE, you need to use a path Access Control List (ACL), which works slightly differently and casts a wider net as shown in this example:
Oh no - we just blackholed a friendly AS (65005)! This highlights why you need to be careful. The example was intentional to show how things can go wrong. Always analyse your routing table for long AS-PATHS to identify candidate prefixes you may want to remove.
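A quick offline pass over exported routes can help with that analysis. This sketch lists long paths alongside the number of distinct ASNs in each, which separates a path that is long because of prepending from one that simply crosses many networks. The routes are hypothetical, and the input format (a dictionary of prefix to AS-path list) is an assumption about how you export your table.

```python
def long_path_report(routes, min_length=5):
    """List (prefix, path length, unique ASNs) for paths at or above min_length."""
    report = []
    for prefix, path in routes.items():
        if len(path) >= min_length:
            report.append((prefix, len(path), len(set(path))))
    return sorted(report, key=lambda row: row[1], reverse=True)


# Hypothetical routes: prefix -> AS path.
routes = {
    "192.0.2.0/24": [65001, 65002],
    "198.51.100.0/24": [65001, 65002, 65003, 65004, 65005],        # long, not prepended
    "203.0.113.0/24": [65001, 65007, 65007, 65007, 65007, 65007],  # heavily prepended
}
for prefix, length, unique in long_path_report(routes):
    print(prefix, length, unique)
```

A filter based on path length alone would catch both of the long routes here. The distinct-ASN count shows that only one of them is actually prepended.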
Communities and templates
Another topic that often comes up - and was only briefly mentioned in the first series - is tagging BGP communities. It's generally best practice to define a few policies in your network so you can tag different types of traffic using the BGP community attribute. This enables strategies for traffic engineering and operational flexibility. Examples include tagging customer routes, transit routes, internal announcements, and geographic distinctions. Usually, you'll want to strip external communities on both inbound and outbound traffic, except in specific scenarios we'll cover later. Local preference remains the primary BGP mechanism for influencing routing decisions.
Tag or assign communities at the route's origin and on inbound routes at the border. This allows you to apply internal policies by matching communities downstream within your AS. Community tags can also simplify troubleshooting - often, you can quickly understand routing behaviour by checking which communities are applied. The next diagram illustrates the basic principle.
Setting communities at all points allows routing policy to prioritize paths - for example, preferring private peering first, then Internet Exchange (IX), and finally using transit (the most expensive option) as a last resort. For inbound advertisements, consider cases where the same network (such as Network 1A in Figure 7) sends specific routes via both a private peering and an IX. In such scenarios, you want all routers in your network to prefer the private peering connection for reasons such as bandwidth efficiency or return on investment (ROI).
This example shows how BGP community policy can be implemented in Cisco IOS-XR:
Ensure all routers interpret the community consistently and apply the same local preference through routing policy - in this case, preferring it. IOS-XR is highly granular, allowing you to nest these policies within other route policies, which provides significant flexibility. Communities can also signal attributes, such as , or enable features like Distributed Denial of Service (DDoS) mitigation using remotely triggered black hole (RTBH) filtering.
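Conceptually, every router applies the same table that maps a community to a local preference and then picks the highest value. Here is a minimal Python model of that idea. The community values, local preferences, and path records are all hypothetical.

```python
# Hypothetical community values and local preferences.
LOCAL_PREF_BY_COMMUNITY = {
    "65000:100": 300,  # private peering
    "65000:200": 200,  # Internet Exchange
    "65000:300": 100,  # transit
}
DEFAULT_LOCAL_PREF = 100


def local_pref(communities):
    """Highest local preference implied by any community on the route."""
    prefs = [
        LOCAL_PREF_BY_COMMUNITY[c]
        for c in communities
        if c in LOCAL_PREF_BY_COMMUNITY
    ]
    return max(prefs, default=DEFAULT_LOCAL_PREF)


def best_path(paths):
    """Pick the path with the highest local preference (first wins on a tie)."""
    return max(paths, key=lambda p: local_pref(p["communities"]))


# The same network heard over an IX and over a private peering.
paths = [
    {"via": "ix", "communities": ["65000:200"]},
    {"via": "private-peering", "communities": ["65000:100"]},
]
print(best_path(paths)["via"])  # private-peering
```

Because the mapping lives in one table, changing the relative preference of peer types means changing one value rather than editing per-neighbor policy.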
Lumen, a large Tier 1 provider, uses communities in a way that might also be useful to customers or peers (BGP Tools login required).
BGP templates or peer groups provide a repeatable way to standardize settings for neighbors, which is especially useful for current or future automation plans. You can apply common settings such as route maps or policies that include community tagging, filtering, BGP timers, descriptions, and maximum prefix limits. Templates can also be organized by peer type, based on the community strategies you've defined. Using templates simplifies customer turn-ups, supports automation through scripts or ticket-based pushes, and ensures consistency across your Autonomous System.
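For automation, the same idea can be modelled as data before it ever reaches a router: a template per peer type, merged with the few details that differ per neighbor. The sketch below is vendor-neutral. The policy names, limits, and timers are placeholder assumptions, not recommended values.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PeerTemplate:
    """Settings shared by every neighbor of one peer type."""
    inbound_policy: str
    outbound_policy: str
    max_prefixes: int
    keepalive_s: int = 30
    hold_s: int = 90


# Placeholder templates, organized by peer type.
TEMPLATES = {
    "customer": PeerTemplate("CUSTOMER-IN", "CUSTOMER-OUT", max_prefixes=100),
    "transit": PeerTemplate("TRANSIT-IN", "TRANSIT-OUT", max_prefixes=1_500_000),
}


def build_neighbor(address, remote_as, peer_type, description):
    """Merge per-neighbor details with the shared template for its peer type."""
    settings = asdict(TEMPLATES[peer_type])
    settings.update(address=address, remote_as=remote_as, description=description)
    return settings


print(build_neighbor("192.0.2.1", 65010, "customer", "Customer A"))
```

A customer turn-up then needs only an address, an ASN, and a description. Everything else comes from the template, which is what keeps neighbors consistent.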
You may want to check out how Juniper implements BGP Groups or these Cisco BGP Peer templates for inspiration.
Advanced network edge design tips
I still see many people using and recommending mechanisms such as Cisco's IP SLA, object tracking, conditional advertisements, or policy-based routing at the border. In most cases, these are unnecessary if BGP is configured correctly. The only time such mechanisms make sense is for branch offices with small provider-assigned (PA) IP address blocks that do not use or support BGP.
If you use ping monitors, always test against destinations you control. Where you do control the destination, prefer Two-Way Active Measurement Protocol (TWAMP) or a similar tool over plain ping.
For better failure detection, use Bidirectional Forwarding Detection (BFD) or more aggressive BGP timers, and design the network properly. Avoid non-protocol mechanisms for route or availability tracking, as they often lead to route flapping and phantom issues that reduce reliability. Even built-in features such as next-hop address tracking in BGP can cause excessive dampening if misused.
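To compare the options, it helps to work out the worst-case detection time for a silent failure, where the link stays up but the neighbor stops responding. BFD declares the session down after a number of missed packets (interval multiplied by the detect multiplier), while plain BGP waits for the hold time to expire. The timer values below are illustrative assumptions, not recommendations.

```python
def bfd_detection_ms(tx_interval_ms: int, multiplier: int) -> int:
    """Approximate BFD detection time: interval x detect multiplier."""
    return tx_interval_ms * multiplier


def bgp_detection_ms(hold_time_s: int) -> int:
    """Worst case for a silent failure: the full hold time has to expire."""
    return hold_time_s * 1000


print(bfd_detection_ms(300, 3))  # 900
print(bgp_detection_ms(90))      # 90000
print(bgp_detection_ms(9))       # 9000, an aggressive hold time
```

Even with aggressive BGP timers, BFD in this example detects the failure roughly ten times sooner.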
A common question involves whether to run full tables on Multiprotocol Label Switching (MPLS) provider edge (PE) routers that sit closer to the core and behind edge routers (typically connecting customer edge [CE] devices). Here's my view:
If the network is small - say, two edge routers, two PE routers, and two provider (P) routers arranged symmetrically as in Figure 7 - full tables are probably unnecessary for simplicity's sake. However, as the network grows, running full tables on PE routers becomes essential. For example, you can run an Internet virtual routing and forwarding (VRF) and carry traffic for DIA customers attached to the core PEs, importing routes as needed.
This approach allows you to run MPLS labels end-to-end without maintaining a separate pure IP network for DIA full-table customers and MPLS for Virtual Private Network (VPN) customers. It also keeps the core BGP-free while simplifying route leaking compared to using numerous Layer 2 cross-connects. Additionally, large public routing changes won't impact your core label switch routers (LSRs). Hardware capacity, however, can be a limiting factor.
If you run only a default route on a few perimeter and PE routers, you're more likely to encounter asymmetric traffic flows because traffic may load balance inefficiently and return incorrectly from edge routers to PEs. Conversely, running full tables between your active/active border and internal PEs provides more balanced return paths and symmetrical flows to and from the core - especially important if firewalls are involved.
Next in the series
In the next post, I'll share a practical network edge design checklist to help you verify that all critical considerations are covered before designing your own edge architecture.
Stay tuned!
Brandon Hitzel (Twitter) is a network engineer who has worked in multiple industries for a number of years. He holds multiple networking and security certifications and enjoys writing about networking, cyber defence, and other related topics on his blog.
Originally published on Network Defense Blog.