06/29/2026 | Press release | Distributed by Public on 06/28/2026 21:41
AI eats the world
NANOG 97 was held in Bellevue, Washington, at the start of June 2026. These days, you could be excused for suspecting that the world has gone AI-mad, and if you were at the NANOG meeting, your suspicions would've only been confirmed! The topics discussed included the design of data centres used to generate Large Language Models (LLMs) that underpin today's AI tools, and the application of AI tools in network operations.
The mood of the meeting was set by Cogent's Dave Schaeffer, who provided a set of statistics and associated observations that, I must admit, failed to win my uncritical enthusiasm and instead reinforced my own alarm about the extent of the irrationality pervading this space. Expenditure in 2026 is centred on a whole new round of data centres built specifically for the demanding requirements of intense deployments of GPUs.
In historical terms, the sums involved are massive. In the US alone, expenditure will approach USD 750 billion for this year (2026), almost double last year's USD 450 billion. Current projections of the longer-term multi-year build cost within the US are now heading to USD 7 trillion. In economic terms, that's 2% of the US GDP for the next four years. And that's not counting expenditure occurring in other economies. There's no doubt we're in the middle of a wild boom. It's an economic inevitability that booms are followed by crashes, and the emerging question is how soon and how catastrophic the crash will be when this bubble bursts.
Oddly enough, it's not as if all the major investors are there because they're driven by a sense of exuberance and optimism over the future prospects of market dominance and wealth through their AI platform. Many of these investors are today's digital behemoths, and for them, it appears that the driving motivation is that engagement with AI is a forced activity: If they don't sign up and invest in their own AI platform now, then they will be unable to survive when their competitors steal their existing markets with these new AI platforms. For example, what good is search when an AI tool can provide me with the answers? If it's fear and greed that drive markets, then today it appears that fear dominates many corporate decisions to play in this space!
AI training, inference and agentic workloads are placing a new generation of stresses on today's data centre network architectures. Large clusters of GPU engines coupled with the decoupling of processing from memory using Remote Direct Memory Access (RDMA) generate sustained high-volume loss-intolerant traffic in the data centre's transmission and switching capability. These demands challenge the prior conventional data centre design parameters around bandwidth density, electrical reach, and optical capacity. So far, we've relied on the constant increase in chip capabilities to resolve these scaling challenges, but as these environments demand even further scaling, limitations are emerging that cannot be addressed through incremental capacity upgrades to existing designs.
Bear in mind that the true genius of the Transmission Control Protocol (TCP) was its tolerance of packet loss (and jitter). RDMA over Converged Ethernet, Version 2 (RoCEv2) is a network protocol that allows computers to transfer data directly between memory on different servers, bypassing the operating system kernel and CPU, using standard Ethernet. This enables low-latency memory access and is vital for large-scale AI clusters and high-performance computing. However, RoCEv2 is completely intolerant of packet loss, packet reordering and jitter. This implies that TCP is not a suitable transport, and RoCEv2 is layered over UDP. Normally, it would be left to the application to detect and repair packet loss and packet reordering, but RoCEv2 contains no such support. The outcome is that the network simply cannot lose a single packet! This is a rather novel challenge for large-scale high-speed data centre design.
We are building out on the fine margins of what we can deliver in terms of silicon processors, storage systems, photonics and switching, power delivery and cooling. For example, in the power market, these data centres are expected to consume more electricity in the US than all energy-intensive manufacturing combined by the end of the 2020s. The current projection of power requirements will account for up to 12% of all US electricity consumption within a few years, and it's not as if some other sector is switching off its power by a comparable amount over this period. This is a new demand on the economy's power generation system, and represents a new source of pressure on existing electricity generation capabilities and power grid infrastructure.
It's not just the packet handling layer that is feeling the stress of large-scale AI data centre deployments. AI infrastructure is also placing new demands on the underlying optical layer. Photonics is now becoming a binding architectural constraint in AI data centre networks rather than a transparent transport layer. Current architectures are encountering scaling limits, including faceplate density, thermal and power budgets for pluggable optics, and the operational complexity of rapidly expanding fibre plants.
Several optical architectural approaches are now being evaluated, including Co-Packaged Optics, external laser source models, and optical circuit switching as a complement to packet-switched fabrics. For each, the presentation highlighted the problems the approach aims to address, the new constraints it introduces, and where it might provide needed capability within operational environments.
Investment in data centre optical systems is expected to grow at a compound annual rate of 45% over the coming few years. Gallium Arsenide semiconductor systems using multi-mode fibre have been a mainstay for some years, but that technology appears to top out at around 200G per lane, and it mandates short interconnects. Higher capacity systems are achievable with Indium Phosphide semiconductors, which are used with single-mode optics for optical spans of 500 metres and above. This is typically what has been deployed en masse by the large hyperscalers. These systems support 200G per lane, and there is a general expectation that 400G per lane is going to be deployed in the near future.
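To put a 45% compound annual growth rate in perspective, the following minimal sketch computes the cumulative multiplier year by year. The five-year horizon is an assumption chosen purely for illustration, as the projection only refers to 'the coming few years'.

```python
def growth_multiplier(cagr: float, years: int) -> float:
    """Cumulative growth factor after `years` of compounding at rate `cagr`."""
    return (1.0 + cagr) ** years


CAGR = 0.45  # 45% per year, the figure quoted above

# A horizon of 1 to 5 years is an illustrative assumption, not a quoted figure.
for year in range(1, 6):
    print(f"after {year} year(s): {growth_multiplier(CAGR, year):.2f}x")
```

At this rate, investment slightly more than doubles every two years, since 1.45 squared is 2.1025.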
However, 400G per lane is encountering some slowdown in deployment, and nobody really knows what's going to happen next after 400G. Higher capacities tend to run into power density and thermal dissipation limits. Pluggable optics no longer scale as easily, and while we can foresee total per-fibre capabilities of 3.2T as being achievable, the path to yet higher capacity is not clear, in terms of the scaling capabilities of optics, silicon, power, cooling, physical packaging and unit costs per bit.
The construction of these data centres doesn't seem to resemble a traditional speculative boom with the prospect of short-term windfall returns for investors who are underwriting the costs. The best-case scenario for investors is that data centres earn steady utility-like returns in the long term. The worst-case scenario is that the capital cost is underwritten by a financial bubble that bursts mid-way through the construction process. But maybe this time really is different from the conventional boom and bust business cycles. The difference is that you and I appear to be major investors, indirectly funding this AI infrastructure through public-sector tax concessions provided to the companies building it.
It's a fascinating journey we are on with AI. There is a well-founded belief that the quality of the AI systems is heavily reliant on the scale of the processing used to assemble the models, and so far, the rule of thumb is that larger scale produces a better outcome. There is no clear idea of what scale is 'enough' or even if there is a point of diminishing returns where double the scale of processing capability only generates marginal improvements in the service quality. Without a clear idea of where any logical endpoint might exist, we continue to push hard at every aspect of the components in these facilities to gain further scale. It's a wild ride!
Geolocation
The task of geolocation, or assigning a location of a remote device based on the IP address it uses to access a service, has become a popular topic in recent NANOG meetings.
It's an area that continues to raise many more questions than we can provide answers to. It appears to start with a rather innocuous question: 'Where are you?'
But who are you in this question? Is it you who is the human using a device, or is it the device itself? In human terms, the question of location becomes one of adapting the service experience using an appropriate language, an appropriate character set and presentation layout that matches the conventional use and cultural norms in that location. There is also the societal dimension. What laws apply to my ability to access digital services? Which economy's legal code applies to my transactions? If transaction taxes are applicable, which taxation code is applicable?
Where is a similarly complex question. We could use latitude/longitude coordinates to pinpoint a location to a point on the surface of the earth. Or perhaps a more useful taxonomy is to use a geopolitical location system, using economy, state, city and so on.
Is my location where I am located physically, or is it somewhere else? If I use a VPN service or a remote proxy agent, then is my location the location of the remote portal? There is also a temporal dimension, as I could be carrying my mobile device in a train, a boat or an aeroplane. Mobility also raises the question of stability and duration of location. What is the useful lifetime of IP location data?
What's the granularity of location? What are the tensions between personal privacy and high-precision location data? The deprecation of post codes has already been seen in some geofeed systems, where the post code is at such a level of precision that it can be used as personally identifying information. When we consider the precision of location, it's useful to ask if this information is to be used as a delivery address, as in 'Where do I deliver the pizza?' Or should we blur the precision and simply refer to an economy, or a state within an economy? Some service providers use location as a means of making a good selection of a 'nearby' data centre.
In this case, the metrics of 'proximity' and 'distance' are not based on a physical distance but are based on metrics that use an underlying network topology and the state of the network's routing system.
The common intention of geolocation is a mapping service that takes an IP address as input and generates a location as output. The question is where and how you can find the data that acts as a seed for this IP-to-location map.
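As a rough illustration of such a mapping service, here is a minimal sketch of an IP-to-location lookup that returns the entry for the most specific matching prefix. The seed table is entirely hypothetical: it uses the reserved documentation prefixes with made-up locations, and the names `Location`, `SEED`, `build_table` and `locate` are illustrative rather than taken from any real geolocation product.

```python
import ipaddress
from typing import NamedTuple, Optional


class Location(NamedTuple):
    country: str
    region: str
    city: str


# Hypothetical seed data: documentation prefixes with made-up locations.
SEED = {
    "192.0.2.0/24": Location("US", "US-WA", "Bellevue"),
    "192.0.2.128/25": Location("US", "US-FL", "Miami"),
    "2001:db8::/32": Location("AU", "AU-QLD", "Brisbane"),
}


def build_table(seed):
    """Parse prefix strings into network objects, longest prefix first."""
    table = [(ipaddress.ip_network(prefix), loc) for prefix, loc in seed.items()]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def locate(table, address: str) -> Optional[Location]:
    """Return the location of the longest matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    for network, loc in table:
        # Compare only within the same address family (IPv4 or IPv6).
        if ip.version == network.version and ip in network:
            return loc
    return None


TABLE = build_table(SEED)
print(locate(TABLE, "192.0.2.200"))   # covered by both the /24 and the /25
print(locate(TABLE, "192.0.2.10"))    # covered only by the /24
print(locate(TABLE, "198.51.100.1"))  # not covered by any seed entry
```

The lookup itself is the easy part. A linear scan is adequate for a sketch, and a production service would more likely use a trie. The hard questions are where the seed data comes from and whether it can be trusted.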
Some network providers maintain this map for the IP addresses that they use. For example, Starlink and Apple's Private Relay service both publish location maps, but this is the exception rather than the rule in the ISP space. In some cases, it's left to individual users to publish their own location, but how can a consumer of this data determine the truth (or otherwise) of these self-assertions of location? Is there any form of external validation that could be used to test the veracity of this geolocation data? Is my location a piece of public data? Or is it an instance of personally identifying information that would allow my identity to be inferred? What level of location granularity turns a general locale into my specific location?
I also can't help but ask myself, why are there multiple geolocation providers? And why do they differ in detail? Surely location is impartial information and not open to variable interpretation?
In many ways, these panel sessions on geolocation are curiously illuminating. What seems to be a very simple question has oddly complicated answers. There are multiple layers of nuance and complexity lurking behind what at the outset is a very simple question.
Measuring IPv6
How much IPv6 has been deployed? How much IPv6 is being used on today's public Internet? There are a couple of web pages that report on IPv6 use. The IPv6 measurement site operated by Google reports on the IPv6 capability of a sample of users who access the Google home page. There is also the data published by APNIC Labs. Both of these reports look at the capability of users to use IPv6, but not actual use. Several popular Internet Exchange Points (IXPs) report on traffic volumes, such as the AMS-IX report on IPv6 volumes at their exchange.
These efforts to generate a 'big picture' report can be complemented by small-scale studies of individual cases, as is reported in the presentation on a non-binary view of IPv6 adoption. The results of an analysis of user traffic at a small number of sites show that a number of common applications do not have support for IPv6, including Zoom, TikTok, GitHub, and Twitch. Depending on the level of use of these applications, different users will show different levels of IPv6 use.
On the server side, of the Tranco top 110k sites, 58% are IPv4-only, 30% require IPv4 to load, and 12% of these sites are fully IPv6 enabled. Of course, what this does not show is the number of users of these services, the frequency of use and the amount of traffic generated by these services.
In the cloud service environment, some platforms have extensive IPv6 support, such as Cloudflare, and others, notably Amazon, are mainly IPv4. Again, however, this data needs to be placed into a context of relative use and relative traffic volumes.
The conclusion for me is that while there are clear signs of progress with the transition to IPv6, the pace is slow. If anyone is impatient to move to an IPv6-only environment, then they will necessarily exist within a highly fragmented space. If we wish to preserve the coherence of the Internet, then some further patience is needed while we traverse this transition path.
The cost of SSH
These days, Secure Shell (SSH) is the default transport for many applications. When you need encryption on the wire (and only the most stupidly cavalier would ignore channel encryption in a world of Wi-Fi), and maybe you'd like some assurance that you are communicating with the service that you intended to communicate with, then SSH can help.
But it's slow.
We learned last October at NANOG 95 that many of the performance issues with SSH arise from poor internal configuration and buffer dimensioning, and High-Performance Networking SSH (HPN-SSH) is an SSH variant that shows what is achievable in a performance-optimized SSH implementation.
However, there is another dimension to SSH performance, and that is the overhead to start an SSH session. There's the TCP handshake and the SSH initial version exchange, which account for two Round-Trip Time intervals (RTTs). This is followed by algorithm initiation and key exchange, and a further one to three RTT intervals. At this point, the application layer needs to make a service request, and the parties need to perform authentication, with a further two to four RTTs. For a terminal session, there is then the channel open exchange, shell initiation and terminal display characteristics. The total delay is some 10 to 15 RTT intervals. Can we improve on this situation?
A useful observation is that TLS 1.3 makes significant improvements to this connection establishment time, and HTTP using TLS 1.3 can complete a connection within just three RTT intervals. This is exploited in a proposal to use an edge proxy, where the network service uses HTTPS over TLS 1.3, and the edge proxy talks terminal emulation (PTY) over SSH to the device.
If you are using SSH to manage a small number of devices with network automation, the SSH connection overhead can be annoying, but it is not a major issue. On the other hand, if you are automating a system with 10,000 or 20,000 devices, the additional RTT delays become a significant factor, and an HTTPS-to-SSH proxy approach starts to look very attractive.
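A back-of-the-envelope sketch shows why scale matters here. It multiplies the RTT counts quoted above (10 to 15 for an SSH terminal session, three for HTTPS over TLS 1.3) by a fleet of 10,000 devices. The 50 ms round-trip time and the assumption that devices are contacted one at a time are mine, not figures from the presentation, and the calculation ignores the proxy-to-device SSH leg, which is assumed to be short or already established.

```python
def setup_delay_seconds(rtts: int, rtt_ms: float, devices: int) -> float:
    """Total connection setup delay when devices are contacted sequentially."""
    return rtts * (rtt_ms / 1000.0) * devices


RTT_MS = 50.0     # assumed round-trip time, not a figure from the presentation
DEVICES = 10_000  # the smaller of the two fleet sizes mentioned above

CASES = [
    ("SSH, low estimate", 10),
    ("SSH, high estimate", 15),
    ("HTTPS over TLS 1.3", 3),
]

for label, rtts in CASES:
    total = setup_delay_seconds(rtts, RTT_MS, DEVICES)
    print(f"{label}: {total / 60:.1f} minutes of setup delay")
```

Under these assumptions, a sequential sweep of the fleet spends roughly 83 to 125 minutes just establishing SSH sessions, against about 25 minutes over HTTPS. Running connections in parallel shrinks the wall-clock time in both cases, but the ratio between the two stays the same.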
NANOG
This is a small sample of a busy three-day program. The full content is available on the NANOG 97 website. The next NANOG meeting will be held in Miami, 19 - 21 October 2026.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.