Personal Imzy Blog / Community
Birds of a Feather
At work I use a network monitoring system called Cacti. It's really quite a useful system, and you can't beat the bang for your buck, insofar as it's completely free: Cacti is GPL-licensed.
Cacti pings all the cable modems across my company, all the time. This builds telemetry graphs so I can look over the history of each branch and tell when their cable modems start running into trouble, at least on the segment between each modem and the Windstream-run, AT&T-carried 100M fiber at work, that is.
In my Cacti system, I've got all my branches organized by their IP space indexes, and since my IP space is organized on geographical markers, the list is organized the same way. We have two coax carriers that we use, Charter and Comcast.
On a lot of days, much like today, I bring up my Cacti telemetry for all the cable modems across my company and certain patterns emerge. The shops where Charter provides service have pretty staid, flat, uninteresting graphs: some small variability in ping returns, but nothing really remarkable. The same cannot be said about Comcast. Starting yesterday at around 3pm, there is a mound in the telemetry, and it appears at every cable modem served by Comcast. So yesterday there was a lot of something going on, and it got progressively worse until right around midnight last night. Then at 2:45am there is a really nasty spike where the average ping went to 140ms, and then within a few minutes it dropped right down to 55ms, like a stone.
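For anyone who wants to see that carrier-versus-carrier pattern in numbers rather than by eyeballing graphs, here is a rough sketch of the idea in Python. It is not how Cacti does it and it does not touch Cacti's data. The function name `mean_rtt_by_carrier`, the `(hour, carrier, rtt_ms)` tuple layout, and the sample values are all made up for illustration, assuming you have exported ping results in some form like this.

```python
from collections import defaultdict
from statistics import mean


def mean_rtt_by_carrier(samples):
    """Average ping round-trip time per carrier per hour.

    samples: iterable of (hour, carrier, rtt_ms) tuples (hypothetical layout).
    Returns {carrier: {hour: mean_rtt_ms}} with hours in ascending order.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for hour, carrier, rtt_ms in samples:
        buckets[carrier][hour].append(rtt_ms)
    return {
        carrier: {hour: mean(rtts) for hour, rtts in sorted(hours.items())}
        for carrier, hours in buckets.items()
    }


# Illustrative, invented numbers: two modems per carrier, polled at 2am and 3am.
example = [
    (2, "Comcast", 130), (2, "Comcast", 150),
    (3, "Comcast", 55), (3, "Comcast", 55),
    (2, "Charter", 30), (2, "Charter", 34),
    (3, "Charter", 31), (3, "Charter", 33),
]

for carrier, hours in mean_rtt_by_carrier(example).items():
    print(carrier, hours)
```

If every modem on one carrier moves together while the other carrier stays flat, the per-carrier averages will show it, which points at something upstream of the individual shops.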
It's these sorts of graphs that make me wonder what happened "behind the wizard's curtain" for Comcast. Yesterday was a mess, but today it seems to be almost unnaturally shackled to a specific ping return of about 55ms per ping packet. This is in no way a complaint; it's more along the lines of a curiosity. It's kind of neat being able to see headaches for these two companies. Sometimes I can even see the "problems" travel around the country, as my different modems all demonstrate different behaviors when polled by Cacti.
I think ICMP ping may make up something like 10% of my company's traffic, but having these graphs means I have ammunition, and being able to tell a carrier "Your service crapped out at 2:35am and stayed down until 7:13am" is worth its weight in gold. It helps them pin down a diagnosis of network problems, especially if I can give them graphs showing regular failure intervals.
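Pulling a "down from X until Y" statement out of raw ping results is simple enough to sketch. The Python below is only an illustration, not anything Cacti provides: the function name `outage_windows` and the sample format (a time-ordered list of `(timestamp, rtt_ms)` pairs, with `None` meaning the ping got no reply) are assumptions of mine, and the sample data is invented to match the example in the quote.

```python
def outage_windows(samples):
    """Find stretches where pings went unanswered.

    samples: time-ordered list of (timestamp, rtt_ms) pairs, where rtt_ms is
    None when no reply came back (hypothetical format).
    Returns a list of (start, end) pairs: start is the first failed poll,
    end is the first successful poll afterwards, or None if still down.
    """
    windows = []
    start = None
    for timestamp, rtt_ms in samples:
        if rtt_ms is None:
            if start is None:
                start = timestamp  # first missed reply opens a window
        elif start is not None:
            windows.append((start, timestamp))  # first good reply closes it
            start = None
    if start is not None:
        windows.append((start, None))  # data ended while still down
    return windows


# Invented polls for one modem.
polls = [
    ("02:30", 55.0),
    ("02:35", None),
    ("02:40", None),
    ("07:13", 48.0),
    ("07:18", 50.0),
]

for start, end in outage_windows(polls):
    print(f"down from {start} until {end}")
```

Run against several days of polls, the same list of windows would also make regular failure intervals easy to spot, since the start times can be compared directly.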



Who knows what difficulties they may have been wrestling with upstream? A herd of zombie user machines suddenly lagging the system? Some part of the supporting infrastructure temporarily down for maintenance? It would be nice to know for our education and awareness.