09/04/2025 | News release | Distributed by Public on 09/04/2025 11:56
Hey everyone, I'm back to exploring how agentic AI might fit into a network engineer's workflow and become a valuable tool in our tool chest.
In my blog post, Creating a NetAI Playground for Agentic AI Experimentation, I began this journey by exploring how we can utilize Model Context Protocol (MCP) servers and the concept of "tools" to enable our AI agents to interact with network devices by sending show commands. If you haven't read that post yet, definitely check it out because it is some really fabulous prose. Oh, and there is some really cool NetAI stuff in there, too.
While it was fascinating to see how well AI could understand a network engineering task presented in natural language, create a plan, and then execute that plan in the same way I would, there was a limitation in that first example. The only "tool" the agent had was the ability to send show commands to the network device. I had to explicitly provide the details about the network device: details that are readily available in my "source of truth."
To realize the power of agentic AI, NetAI needs to have access to the same information as human network engineers. For today's post, I wanted to explore how I could provide source-of-truth data to my NetAI agent. So, let's dig in!
NetBox has long been a favorite tool of mine. It is an open-source network source of truth, written in Python, and available in various deployment options. NetBox has been with me through much of my network automation exploration; it seemed fitting to see how it could fit into this new world of AI.
Initially, I expected to put a simple MCP server together to access NetBox data. I quickly learned that the team at NetBox Labs had already released an open-source basic MCP server on GitHub. It only provides "read access" to data, but as we saw in my first NetAI post, I'm starting out slowly with read-only work anyway. Having a starting point for introducing some source of truth into my playground was going to significantly speed up my exploration. Totally awesome.
Have you ever been working on a project and gotten distracted by another "cool idea?" No? I guess it's just me then…
Like most of my network labs and explorations, I'm using Cisco Modeling Labs (CML) to run the network playground for AI. This wasn't the first time I wanted to have NetBox as part of a CML topology. And as I was prepping to play with the NetBox MCP server, I had the thought…
Hank, wouldn't it be great if there were a CML NetBox node that could be easily added to a topology, and that would automatically populate NetBox with the topology information from CML?
Of course I answered myself…
Heck yeah, Hank, that's a great idea!
My mind immediately started working out the details of how to put it together. I knew it would be super easy and fast to knock out. And I figured other people would find it handy as well. So I took a "short detour."
I'm sure many of you raised your eyebrows when I said "super easy" and "fast." You were right to be skeptical, of course. It wasn't quite as easy or straightforward as I expected. However, I was able to get it working, and it is really cool and handy for anyone who wants to add not only a NetBox server to a CML network but also have it pre-populated with the devices, links, and IP details from the CML topology.
I still need to compile the documentation for the new node definition before I can post it to the CML-Community on GitHub for others to use. However, consider this blog post my public accountability post, indicating that it is forthcoming. You can hold me to it.
But enough of the side track in this blog post, let's get back to the AI stuff!
As I mentioned in the last blog post, I'm using LM Studio to run the Large Language Model (LLM) for my AI agent locally on my laptop. The main reason is to avoid sending any network information to a cloud AI service. Although I'm using a "lab network" for my exploration, there are details in the lab setup that I do NOT want to be public or risk ending up in future training data for an LLM.
If this exploration is successful, using the same approach with production data would be the next step, and sending production network data to a cloud AI service is definitely not something that aligns with a responsible AI approach.
Cloning down the netbox-mcp-server code from GitHub was easy enough. The README included an example MCP server configuration that provided everything I needed to update my mcp.json file in LM Studio to add it to my already configured pyATS MCP server.
```json
{
  "mcpServers": {
    "pyats": {
      "url": "http://localhost:8002/mcp"
    },
    "netbox": {
      "command": "uv",
      "args": [
        "--directory",
        "/Users/hapresto/code/netbox-mcp-server",
        "run",
        "server.py"
      ],
      "env": {
        "NETBOX_URL": "http://{{MY NETBOX IP ADDRESS}}/",
        "NETBOX_TOKEN": "{{MY NETBOX API TOKEN}}"
      }
    }
  }
}
```
As soon as I saved the file, LM Studio discovered the tools available.
There are three tools provided by the NetBox MCP server.
I was, and continue to be, interested in the approach the NetBox Labs folks took with this MCP server. Rather than providing purpose-built tools like "get_devices" and "get_ips", they expose a small set of generic retrieval tools. NetBox's APIs and object model are well thought out and make a generic approach like this possible, and it certainly means less code and development time. Still, it essentially gives the LLM raw API access and shifts the burden of "thought" and "processing the data" back onto the model. Agentic AI and MCP are still very new standards and approaches, so there aren't yet established best practices or design patterns for what works best here. I'll come back to this approach and what I see as some possible downsides later on in the post.
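To make the generic pattern concrete, here's a minimal sketch of the idea: one function that maps any object type plus arbitrary filters onto NetBox's REST API, rather than a dedicated tool per object type. The endpoint layout follows NetBox's `/api/<app>/<model>/` convention, but treat this as an illustration of the pattern, not the actual netbox-mcp-server code.

```python
# Sketch of the "generic tool" pattern: one function covers every NetBox
# object type instead of a tool per type. Illustrative only -- not the
# actual netbox-mcp-server implementation.
from urllib.parse import urlencode

def build_netbox_query(base_url: str, object_type: str, filters: dict) -> str:
    """Build a NetBox REST API URL for any object type, e.g. 'dcim/devices'."""
    query = f"?{urlencode(filters)}" if filters else ""
    return f"{base_url.rstrip('/')}/api/{object_type.strip('/')}/{query}"

# One code path serves devices, IPs, cables, and everything else:
print(build_netbox_query("http://netbox.local", "dcim/devices", {"site": "lab"}))
# http://netbox.local/api/dcim/devices/?site=lab
```

The trade-off is exactly the one described above: the server stays tiny, but the LLM now has to know which object types and filter names are valid.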
I then loaded the newly released open model by OpenAI, gpt-oss, and sent the first query.
My first thought: "Success!" And then I scratched my head for a second. 10 devices? Scroll back up to the CML topology image and count how many devices are in the topology. Go ahead, I'll wait…
Yeah. I counted seven devices, too. And if I check NetBox itself, it also shows seven devices.
So what happened? LM Studio shows the exact response from the tool call, so I went and checked. Sure enough, only seven devices' worth of information was returned. Then I remembered that counting is one of the notoriously meme-worthy failings of many AI tools. Blueberries, anyone?
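One simple guardrail for this class of mistake is to do the arithmetic deterministically in code and hand the model the number, rather than asking it to count items in a payload. A tiny sketch, using a NetBox-style paginated response shape as an illustrative assumption:

```python
# LLMs can mis-count items in a tool response; counting in code sidesteps
# that entirely. The payload shape mimics NetBox's paginated API responses
# and is illustrative, not the exact MCP tool output.
def device_count(tool_response: dict) -> int:
    """Count devices from a NetBox-style response instead of trusting the LLM."""
    return len(tool_response.get("results", []))

sample = {"count": 7, "results": [{"name": f"device{i}"} for i in range(7)]}
print(device_count(sample))  # 7, regardless of what the model "thinks"
```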
So this turned into a nice teachable moment about AI: AI is fantastic, but it can be wrong. And it will get some of the strangest things wrong. Stay vigilant, my friends.
After resolving the issue with the 10 devices, I spent a considerable amount of time asking more questions and observing the AI utilize the tools to retrieve data from NetBox. In general, I was pretty impressed, and having access to source-of-truth data will be key to any Agentic NetAI work we undertake. When you try this out on your own, definitely play around and see what you can do with the LLM and your NetBox data. However, I wanted to explore what was possible in bringing tools together.
I wanted to start out with something that felt both useful and pretty straightforward. So I sent this prompt.
> I'd like to verify that router01 is physically connected to the correct devices per the NetBox cable connections.
>
> Note: The credentials for router01 are: `netadmin / 1234QWer`
>
> Can you:
>
> 1. Check NetBox for what network devices router01 is supposed to be connected to, and on what interfaces
> 2. Look up the Out of Band IP address and SSH port from NetBox, and use these to connect to router01.
> 3. Use CDP on router01 to check what neighbors are seen
> 4. Compare the NetBox information to the CDP information.
I still had to tell the LLM what the credentials are for the devices. That's because while NetBox is a fantastic source of truth, it does NOT store secrets/credentials. I'm planning on exploring what tool options exist for pulling data from secret storage later on.
If you are wondering why I provided a list of steps to tackle this problem rather than let the LLM "figure it out," the answer is that while GenAI LLMs can seem "brilliant", they are NOT network engineers. Or, more specifically, they haven't been trained and tuned to BE network engineers. Likely, the future will offer tuned LLMs for specific job roles rather than the general-purpose LLMs of today. Until then, the best practice for "prompt engineering" is to provide the LLM with detailed instructions on what you want it to do. That dramatically increases the chances of success and the speed at which the LLM can tackle the problem.
Let's look at how the LLM handled the first step in the request, looking up the device connections.
At first glance, this looks pretty good. It "knew" that it needed to check the Cables from NetBox. However, there are some problems here. The LLM crafted what appears to be a valid filter for the lookup: `"device_a_name": "router01"`. However, that is actually NOT a valid filter. It is a hallucination.
An entire blog post could be written on the reason this hallucination happened, but the TL;DR is that the NetBox MCP server does NOT provide explicit details on how to craft filters. It relies on the LLM to be able to build a filter based on the training data. And while every LLM has benefited from the copious amounts of NetBox documentation available on the internet, in all of my testing, I have yet to have any LLM successfully craft the correct filter for anything but the most basic searches for NetBox.
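One way to sidestep the filter-guessing problem entirely is to fetch the device's interfaces and read the connected endpoints from each record, rather than trying to filter the cables table. A hedged sketch follows; the field names (`connected_endpoints` and its nested `device`/`name` keys) are based on the general shape of NetBox's REST API but should be treated as assumptions, since they vary across NetBox versions.

```python
# Deterministic alternative to a hallucinated cable filter: walk a device's
# interfaces and collect each connected endpoint. Field names are assumptions
# modeled on NetBox's REST API shape and may differ between versions.
def expected_neighbors(interfaces: list) -> dict:
    """Map local interface -> (peer device, peer interface) from NetBox data."""
    neighbors = {}
    for intf in interfaces:
        for peer in intf.get("connected_endpoints") or []:
            neighbors[intf["name"]] = (peer["device"]["name"], peer["name"])
    return neighbors

sample = [
    {"name": "Ethernet0/0",
     "connected_endpoints": [{"name": "Ethernet0/0",
                              "device": {"name": "switch01"}}]},
    {"name": "Ethernet0/1", "connected_endpoints": None},  # nothing cabled
]
print(expected_neighbors(sample))
# {'Ethernet0/0': ('switch01', 'Ethernet0/0')}
```

The point isn't this particular helper; it's that deterministic lookups like this don't depend on the LLM guessing filter names correctly.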
This has led me to start building my own "opinion" on how MCP servers should be built, and it involves requiring less "guessing" from the LLMs that use them. I'll most certainly be back with more on this topic in later posts and presentations. But enough on that for now.
The LLM doesn't know that the filter was wrong; it assumes that all the cables returned are connected to router01. This leads to other errors in the reporting, as the "Thought" process reveals. It sees both Cable 1 and Cable 4 as connected to Ethernet0/0, when in truth Cable 4 is connected to Ethernet0/0 on switch01. We'll see how this factors into the summary of the data later.
Once it has the cable information, the LLM proceeds through the rest of the tool calls to gather data.
Finding the Out of Band IP and SSH port was straightforward. But the first attempt to run "show cdp neighbors" failed because the LLM initially didn't use the SSH port as part of the tool call. But this is an excellent example of how Agentic AI can understand errors from MCP servers and "fix them." It realized the need for SSH and tried again.
I've seen several cases where AI agents will resolve errors with tool calls through trial and error and iteration. In fact, some MCP servers seem to be designed with this as the expected behavior. Good error messages give the LLM the context required to fix the problem, similar to how we humans react and adjust when we get an error from a command or API call. This is a powerful capability of LLMs; however, I think MCP servers can and should be designed to limit the amount of trial and error required. I've also seen LLMs "give up" after too many errors.
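The recover-from-error behavior can be sketched as a simple loop: call the tool, and if it fails, feed the error message back into the decision about what arguments to try next. Everything here is a hypothetical stand-in, the fake SSH tool, its error text, and the keyword matching all play the role of a real MCP tool call and the LLM's reasoning.

```python
# Toy sketch of agentic retry: a fake SSH "tool" fails without a port, and
# the loop uses the error message to adjust its arguments -- as the agent
# did when it added the Out of Band SSH port. All names here are hypothetical.
def ssh_tool(host, port=None):
    if port is None:
        raise ValueError("connection refused on port 22; specify the SSH port")
    return f"connected to {host}:{port}"

def run_with_retry(host, oob_port, max_attempts=3):
    args = {"host": host}
    for _ in range(max_attempts):
        try:
            return ssh_tool(**args)
        except ValueError as err:
            # Use the error text to fix the call, like the agent did.
            if "port" in str(err):
                args["port"] = oob_port
    raise RuntimeError("gave up after repeated tool errors")

print(run_with_retry("router01", 2222))  # connected to router01:2222
```

Note the `max_attempts` cap: it's the code-level analog of an LLM "giving up," and a reminder that well-designed tools should make the first call succeed more often.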
Let's take a look at the final response from the AI agent after it completed gathering and processing the results.
So how did it do?
First, the good things. It correctly recognized that the link to switch01 from NetBox matched a CDP entry. Excellent. It also called out the missing CDP neighbor for the "mgmt" switch. It's missing because "mgmt" is an unmanaged switch and doesn't run CDP.
It would have been really "cool" if the LLM had noticed that the device type of "mgmt" was "Unmanaged Switch" and commented on that being the reason CDP information was missing. As already mentioned, the LLM is NOT tuned for network engineering use cases, so I'll give it a pass on this.
And now the mistakes… The problem with the filter for the cable resulted in two errors in the findings. There aren't two cables on Ethernet0/0, and the "Other unused cables" aren't connected to router01.
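Step 4 of the prompt, the actual comparison, is another place where deterministic code could back up the LLM. A sketch of that check, with illustrative dict shapes (local interface mapped to a peer device/interface tuple) standing in for the real NetBox and CDP data:

```python
# Deterministic version of "compare the NetBox to CDP information."
# The dict shapes are illustrative: local interface -> (peer device, peer intf).
def compare_neighbors(expected: dict, seen: dict) -> dict:
    report = {"match": [], "missing_from_cdp": [], "unexpected_in_cdp": []}
    for intf, peer in expected.items():
        if seen.get(intf) == peer:
            report["match"].append(intf)
        else:
            report["missing_from_cdp"].append(intf)
    for intf in seen:
        if intf not in expected:
            report["unexpected_in_cdp"].append(intf)
    return report

netbox = {"Ethernet0/0": ("switch01", "Ethernet0/0"),
          "Ethernet0/3": ("mgmt", "Port1")}   # unmanaged switch, no CDP
cdp = {"Ethernet0/0": ("switch01", "Ethernet0/0")}
print(compare_neighbors(netbox, cdp))
```

Run against correct source-of-truth data, a check like this would have flagged the "mgmt" link as missing from CDP without any risk of hallucinated extra cables.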
I was definitely a little disappointed that my initial tests weren't 100% successful; that would have made for a great story in this blog post. But if I'm honest, running into a few problems was even better for the post.
AI can be downright amazing and jaw-dropping with what it can do. But it isn't perfect. We are in the very early days of Agentic AI and AIOps, and there is a lot of work left to do, from developing and offering tuned LLMs with domain-specific knowledge to finding the best practices for building the best functioning tools for AI use cases.
What I did see in this experiment, and all my experiments and learning, is the true potential for NetAI to provide network engineers a powerful tool for designing and operating their networks. I'll be continuing my exploration and look forward to seeing that potential come to fruition.
There's so much more I learned from this project, but the blog post is getting quite long, so it'll have to wait for another installment. While I'm working on that, let me know what you think of AI and the potential for making your daily work as a network engineer better.
How has AI helped you recently? What's the best hallucination you've run into so far?
Let me know in the comments!
Read next:
Creating a NetAI Playground for Agentic AI Experimentation
Wrangling the Wild West of MCP Servers
Sign up for Cisco U. | Join the Cisco Learning Network today for free.
Use #CiscoU and #CiscoCert to join the conversation.