APNIC Pty Ltd.

06/24/2026 | Press release | Distributed by Public on 06/23/2026 23:52

Hunting stuck routes with the BGP Clock and the BGP Stuck Route Observatory

This post was co-authored by Antonios Chariton.

Border Gateway Protocol (BGP) stuck routes, also known as ghost routes or BGP zombies, are not a new problem, but they remain an important one for network operators. They have been observed across the global Internet for years and discussed across operator and research communities. The issue persists because it is operationally disruptive, difficult to prove from a single vantage point, and more common than many operators may expect.

At a protocol level, the problem is straightforward: BGP UPDATE messages are used to announce prefixes, change path attributes, and withdraw prefixes. If a router fails to process or propagate a withdrawal, downstream networks may continue to believe a route is valid even after the origin has withdrawn it.
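The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not a BGP implementation: the Router class, the drops_withdrawals switch, and the documentation prefix 2001:db8::/48 are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy speaker that stores prefixes and relays updates downstream."""
    name: str
    drops_withdrawals: bool = False  # hypothetical fault switch for this sketch
    rib: set = field(default_factory=set)
    downstream: list = field(default_factory=list)

    def announce(self, prefix: str) -> None:
        self.rib.add(prefix)
        for peer in self.downstream:
            peer.announce(prefix)

    def withdraw(self, prefix: str) -> None:
        if self.drops_withdrawals:
            # The fault: the withdrawal is neither applied nor propagated.
            return
        self.rib.discard(prefix)
        for peer in self.downstream:
            peer.withdraw(prefix)


# Chain: origin -> transit -> edge, where the transit router is faulty.
edge = Router("edge")
transit = Router("transit", drops_withdrawals=True, downstream=[edge])
origin = Router("origin", downstream=[transit])

origin.announce("2001:db8::/48")
origin.withdraw("2001:db8::/48")

stuck = [r.name for r in (origin, transit, edge) if "2001:db8::/48" in r.rib]
print(stuck)  # ['transit', 'edge']
```

The edge router is healthy, yet it still holds the route because the withdrawal never reached it. This is why the symptom can appear several hops away from the device at fault.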

The operational impact is often significant. Consider a network experiencing congestion on a path. The operator performs traffic engineering and withdraws that path so traffic can move to a healthier route. If the withdrawal is not propagated correctly, some downstream networks may continue sending traffic toward the congested or invalid path. The result can include packet loss, blackholing, and a troubleshooting process where the control plane presents an incomplete or misleading view of reality.

Cisco ThousandEyes research indicates that this behaviour is not rare. Evidence of stuck routes appears hundreds of times per day across the Internet.

Visibility is crucial

The industry is making progress toward reducing the impact of stuck routes. One important development is RFC 9687, which defines the BGP Send Hold Timer. As with any new proposal, however, adoption takes time and resources. Software-based BGP implementations have generally moved faster in adopting this functionality, but broader vendor support remains important.

Standards and implementation improvements reduce risk, but measurement remains essential. Operators still need Internet-scale visibility to understand where stuck routes occur, how long they persist, which autonomous systems may be affected, and where remediation should begin.

To help bridge this visibility gap, Cisco ThousandEyes introduced a methodology for measuring the prevalence of stuck routes: The BGP Clock.

BGP Clock

As part of previous research, we studied stuck routes by manually advertising prefixes, withdrawing them after a defined period, and then checking whether they persisted in the global routing table long after they should have disappeared. More specifically, the methodology was based on periodically advertising an IPv6 /48 prefix, withdrawing it, and then using public BGP sources such as RIPE Routing Information Service (RIS) to determine which autonomous systems still held the prefix after it should have been withdrawn.

The findings were notable. Some withdrawn prefixes remained visible for months. In certain cases, routes remained visible even after the prefixes became Resource Public Key Infrastructure (RPKI) invalid, demonstrating that stale routing information can persist long after the origin has taken the expected corrective action.

Feedback from operator and research communities shaped the next phase of the work. The earlier study was IPv6-only, which prompted questions about IPv4. Operators wanted a way to inspect their own Routing Information Bases (RIBs). Researchers wanted a longer-term beaconing system for pattern analysis. The broader community also benefited from more public data available through sources such as RIPE RIS and RouteViews.

The BGP Clock was created to address these needs.

The concept is intentionally simple: Encode time into the BGP advertisement. By embedding timing information in the prefix itself, researchers and operators can determine when a route was originated and when it should no longer be visible.

For IPv6, the BGP Clock uses prefixes in the format , where N represents the index of the ten-minute period since the beginning of the year. For example, 1 January at 00:00 UTC maps to , while 1 January at 00:10 UTC maps to .
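The index N can be computed from a timestamp as in the sketch below. It makes two assumptions that this text does not confirm: numbering is zero-based (1 January 00:00 UTC gives 0), and the input is a timezone-aware datetime. The function name ten_minute_index is hypothetical, and how N is embedded in the prefix is not covered here.

```python
from datetime import datetime, timezone


def ten_minute_index(moment: datetime) -> int:
    """Index of the ten-minute period since the start of the UTC year.

    Assumption: zero-based numbering. `moment` must be timezone-aware.
    """
    moment = moment.astimezone(timezone.utc)
    year_start = datetime(moment.year, 1, 1, tzinfo=timezone.utc)
    return int((moment - year_start).total_seconds() // 600)


print(ten_minute_index(datetime(2026, 1, 1, 0, 0, tzinfo=timezone.utc)))   # 0
print(ten_minute_index(datetime(2026, 1, 1, 0, 10, tzinfo=timezone.utc)))  # 1
print(ten_minute_index(datetime(2026, 1, 2, 0, 0, tzinfo=timezone.utc)))   # 144
```

With 144 ten-minute periods per day, a non-leap year yields 52,560 distinct index values under this numbering.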

For IPv4, the BGP Clock uses prefixes in the format , where N = HH (mod 8) in UTC. At 00:00 UTC, the advertised /24 is ; one hour later, it is , and so on.
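The IPv4 slot calculation can be sketched in a few lines. The aggregate 10.0.0.0/21 below is a placeholder, not the real clock aggregate, and mapping slot N to the N-th /24 inside the /21 is an assumption made for illustration. Both function names are hypothetical.

```python
import ipaddress
from datetime import datetime, timezone

# Placeholder only: the real BGP Clock aggregate is not reproduced here.
PLACEHOLDER_AGGREGATE = ipaddress.ip_network("10.0.0.0/21")


def ipv4_clock_slot(moment: datetime) -> int:
    """Return N = HH mod 8 for the UTC hour of a timezone-aware datetime."""
    return moment.astimezone(timezone.utc).hour % 8


def ipv4_clock_prefix(moment, aggregate=PLACEHOLDER_AGGREGATE):
    """Assumption: slot N selects the N-th /24 inside the /21 aggregate."""
    subnets = list(aggregate.subnets(new_prefix=24))
    return subnets[ipv4_clock_slot(moment)]


at_1300 = datetime(2026, 1, 1, 13, 0, tzinfo=timezone.utc)
print(ipv4_clock_slot(at_1300))    # 5
print(ipv4_clock_prefix(at_1300))  # 10.0.5.0/24
```

Because a /21 contains exactly eight /24 prefixes, the modulo-8 slot cycles through all of them before repeating.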

The process is similarly straightforward. For IPv6, a clock prefix is announced at the top of the hour and withdrawn ten minutes later. IPv6 clock prefixes are recycled yearly, enabling long-term observation of stuck routes. For IPv4, prefixes recycle every eight hours, which provides a practical balance between frequent measurement and enough time for routing behaviour to settle.

Documentation on the mechanics is available through whois for the IPv6 and IPv4 clock prefixes, including details about the encoding method and advertised communities.

How operators can check their own networks

The BGP Clock is designed to make local validation possible. Operators can search for BGP Clock prefixes directly in their routing tables and compare what they see with expected behaviour.

Under normal conditions, only a small number of clock routes should be visible. For IPv6, operators will typically see the current ten-minute interval and, briefly, possibly the previous one while withdrawals propagate. For IPv4, operators should see the current /24 prefix and the aggregate /21 prefix.

If many older BGP Clock prefixes remain visible in the routing table, that is a strong signal that stale routing information may be present and should be investigated.
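A local check along these lines could be scripted as in the sketch below. It assumes the operator has already extracted the IPv6 clock slot indices seen in their routing table; that extraction step depends on the router platform and is not shown. The function name stale_slots and the grace parameter are hypothetical.

```python
def stale_slots(observed: set[int], current: int, grace: int = 1) -> list[int]:
    """Return observed slot indices that fall outside the expected window.

    The expected window is the current slot plus `grace` earlier slots that
    may still be visible while withdrawals propagate. The wrap at the start
    of a new yearly cycle is deliberately ignored to keep the sketch short.
    """
    expected = {current - k for k in range(grace + 1)}
    return sorted(observed - expected)


print(stale_slots({100, 99}, current=100))         # []
print(stale_slots({100, 99, 42, 7}, current=100))  # [7, 42]
```

An empty result matches the normal condition described above. Any indices returned identify clock routes worth investigating.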

BGP Stuck Route Observatory

Comprehensive visibility into the global BGP routing table from multiple vantage points was essential for this research. To support that view, Cisco ThousandEyes used public BGP data from the RIPE RIS project along with our own global BGP collection network. The RIPE RIS project provides public RIB and UPDATE dumps from hundreds of peers. Cisco ThousandEyes operates a global BGP collection network with near real-time IPv4 and IPv6 visibility.

The ThousandEyes BGP Stuck Route Observatory is a free, web-based tool that requires no login. Operators can enter an Autonomous System Number (ASN), and the Observatory checks whether that ASN appears to be affected by stuck routes based on BGP Clock data and ThousandEyes detection logic.

The Observatory can return several types of results.

The best case is that no evidence of impact is observed. In this scenario, the queried ASN does not appear in paths containing BGP Clock prefixes that should no longer be visible.

In other cases, an ASN may appear to be affected by another network. This can occur when an upstream provider or another autonomous system along the path is holding a stuck route. This is often the hardest scenario operationally because mitigation options may be limited. Traffic engineering may help in some cases, but operators may need to escalate to the network that appears to be retaining the stale route.

The Observatory may also indicate that the queried ASN itself appears to be contributing to the issue. In this case, the tool provides the relevant prefix and observed autonomous system paths to help operators begin their internal investigation.
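The ThousandEyes detection logic is not described in this post, so the sketch below covers only the basic check named above: whether an ASN appears in AS paths for clock prefixes that should no longer be visible. It is an illustration under stated assumptions, not the Observatory's method. The function names are hypothetical, the ASNs come from the documentation range, and paths are assumed to be ordered from the collector peer (left) to the origin (right).

```python
def stale_paths_through(asn: int, stale_paths: list[list[int]]) -> list[list[int]]:
    """Return the stale AS paths that contain `asn`."""
    return [path for path in stale_paths if asn in path]


def toward_origin(asn: int, path: list[int]) -> list[int]:
    """ASNs that follow `asn` in the path, ending with the origin."""
    return path[path.index(asn) + 1:]


# Hypothetical AS paths observed for a clock prefix that should be gone.
paths = [
    [64500, 64501, 64496],
    [64502, 64501, 64496],
    [64503, 64504, 64496],
]

hits = stale_paths_through(64500, paths)
print(hits)                           # [[64500, 64501, 64496]]
print(toward_origin(64500, hits[0]))  # [64501, 64496]
print(stale_paths_through(64510, paths))  # []
```

The segment returned by toward_origin lists the networks between the queried ASN and the origin, which may be useful when deciding where to escalate. It does not establish which of them is retaining the stale route.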

By providing a shared view of the prefix, timing model, and observed AS paths, the Observatory gives operators a common reference point for troubleshooting and escalation.

Challenges: Visibility, invisibility, and intra-AS behaviour

The team faced three primary challenges in conducting this large-scale research: Visibility, invisibility, and intra-AS inconsistencies.

The visibility challenge is direct: Measurement systems can only analyse what their data sources can see. Today, there is direct visibility into approximately 1,500 autonomous systems between BGP data providers and the BGP Clock origin. This includes Tier 1 networks, large telcos and enterprises, but it still represents approximately 1% of the total ASN space.

Customer-cone analysis can extend the likely affected view to a much larger portion of the Internet. For example, if a large provider has a stuck route visible in around 20 data feeds and its customer cone contains approximately 2,000 ASNs, it may be reasonable to suspect that many of those downstream networks are affected. However, inference is not the same as direct proof.

The invisibility challenge is different. Some ASNs do not appear in AS paths by design. Internet Exchange Point (IXP) route servers are a common example. If a route server retains a stuck route but its ASN is not visible in the path, attribution becomes difficult. The issue may appear to belong to many route server peers rather than the route server itself. These days, many route servers use modern software BGP stacks, but attribution remains challenging.

The third challenge is intra-AS inconsistency. An ASN can represent a small network or a very large one with thousands of routers. Most routers inside an AS may be patched, upgraded, and operating normally, while one device, a route reflector cluster, a confederation segment, or an update group could behave differently. Pinpointing that device or internal component can be difficult.

Cisco ThousandEyes research has observed cases where a stuck route inside a Tier 1 provider affected only about 5% of downstream networks. These types of symptoms are difficult to troubleshoot externally. BGP communities can sometimes provide hints about geography or topology, but they are not standardized enough to support reliable automation across networks. BGP Monitoring Protocol (BMP) can help by creating a timeline of BGP advertisements inside an AS, but deployment remains uneven, and some operators remain cautious because of implementation maturity and scale concerns.

For these reasons, the Observatory should be viewed as a starting point for investigation, not a replacement for operational analysis.

Turning ground truth into action

Stuck routes are not just local routing artifacts. A single router failing to process or propagate a withdrawal can create symptoms several hops away. A stale path can make a legitimate traffic engineering change appear ineffective, while an issue in one route reflector cluster may affect only a narrow subset of downstream networks. These characteristics make the problem difficult to reproduce, diagnose, and escalate.

The BGP Clock provides a shared source of ground truth: A route was announced at a known time, withdrawn at a known time, and should no longer be visible. The BGP Stuck Route Observatory turns that signal into an operator-facing workflow by showing whether an ASN appears in paths for prefixes that should have disappeared.

We intend to continue improving the BGP Clock methodology, the Observatory, and the surrounding research. However, broader operator participation is essential: Each push for RFC 9687 adoption, each additional vantage point, each validation, and each escalation helps the community identify stale routing information earlier, before it becomes a prolonged and difficult-to-diagnose incident.

Kemal is a results-oriented engineer focusing on designing, operating and troubleshooting large-scale networks. Over the last two decades, he has worked at several large-scale companies applying NRE practices and automating remediation actions. As a Principal Internet Analyst at ThousandEyes, he focuses on research and providing deep and meaningful insights into outages through the lens of ThousandEyes.

Antonios is a security and networking leader with experience in creating, designing, standardising and operating Internet-scale solutions that keep everyone more secure and private online. He is involved in TLS, PKI, routing, IPv6, firmware, policy development, post-quantum encryption, and anything that could improve people's safety online. He is working as a Researcher at Cisco ThousandEyes.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.
