We had two more unscheduled reboots of critical systems this past week. On one of them, file corruption on a SAN-mounted volume forced our system folk to restore to an earlier snapshot. For those not in the industry, the situation briefly described above is fairly horrible and something you hope never to encounter. My part in this troubleshooting, as resident network weasel, has been to set up a continuous packet capture running on the SAN network. As soon as an incident is confirmed, someone runs and pulls the plug on the packet capture so we have a record of what was going on over the network at that moment.
tcpdump is my new best friend. I had played with it before, mostly in various classroom settings, and I always thought it was an interesting tool. Anything with that many options has to be given some respect. It is the command-line Swiss Army knife of packet analysis. (And yes, I know just how much of a techno geek saying so makes me, and I am okay with it.) In general I tend to use snoop to look at packets on an interface in a Unix system, so this was my first real use of tcpdump in production. We have a nice GUI product running on Win2K in a rack-mount server in the network core that I use for day-to-day troubleshooting, but there was not enough free memory on that server for the continuous looping buffer of packet capture we needed for this problem.
The loop cycles through 2000 files, each 20 megabytes in size, for a total of 40 gigabytes of network packets; once the last file fills up, the capture wraps around and overwrites the oldest one. And it keeps only the first 96 bytes of each packet, so it does not include the actual data being passed, just the packet headers (information needed by network interfaces and network hardware to make sure the packet gets to its destination). Heady stuff, as it were.
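For the curious, here is a rough sketch of how a loop like that could be put together and how the arithmetic works out. It is an illustration, not the exact command in use: it assumes the ring buffer is built with tcpdump's `-C` (file size, in millions of bytes) and `-W` (number of files) options together with `-s` for the 96-byte snapshot length. The interface name `eth1`, the output path, and the two Python function names are made up for the example.

```python
def ring_buffer_command(interface, prefix, file_count=2000, file_mb=20, snaplen=96):
    """Build a tcpdump argument list for a looping, header-only capture.

    interface and prefix are hypothetical placeholders supplied by the caller.
    """
    return [
        "tcpdump",
        "-i", interface,        # interface to listen on
        "-s", str(snaplen),     # keep only the first `snaplen` bytes of each packet
        "-C", str(file_mb),     # start a new file after this many millions of bytes
        "-W", str(file_count),  # with -C: wrap around after this many files
        "-w", prefix,           # write raw packets to files based on this name
    ]


def ring_buffer_bytes(file_count=2000, file_mb=20):
    """Approximate disk space the full loop occupies, in bytes.

    tcpdump's -C unit is 1,000,000 bytes, not 1,048,576.
    """
    return file_count * file_mb * 1_000_000


if __name__ == "__main__":
    # "eth1" and the path below are assumptions for illustration only.
    print(" ".join(ring_buffer_command("eth1", "/var/capture/san.pcap")))
    print(ring_buffer_bytes() / 1e9, "GB on disk when the loop is full")
```

The list can be handed to something like `subprocess.Popen` to start the capture, and stopping that process is the scripted equivalent of pulling the plug once an incident is confirmed.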
OK, enough about work. I hope to post something about the book I am currently reading. The author is new to me, at least, and very fun.