Broadcom NetXtreme II “bnx2” driver under Red Hat: MSI-X may not necessarily be your friend…

Today I am going to TRY to start logging a daily blog entry. I’ll add more info about myself later, but I am essentially a sysadmin generalist. That means I manage a LOT of stuff, and I don’t necessarily specialize in ONE particular thing.

My blog entries are going to be totally random, mostly tech stuff (some not), as a means of documenting the day-to-day things I learn or discover. I’ve been doing this for 20 years. I’m still learning.

Today’s tip of the day: Broadcom NIC drivers suck. That opinion was more or less confirmed by the masses in IRC #lopsa today 🙂

OK, more specifically, I have a pair of new Dell servers. One of these servers is a backup storage node (Legato NetWorker), which means that system has a tape library attached. It also means that when backups start, a half dozen nodes or so start shoving boatloads of data toward the storage node in parallel. *HERE*, say the nodes.

At 1am (when my backups start) this one node has had a habit of having its network interface stop. No errors, no runs, no one on base. Nothing. It just stops. A simple ifdown/ifup pair done from the console gets it going again. The NIC is a Broadcom NetXtreme II. The driver under Red Hat is “bnx2”.

I’ve been battling this for almost 2 weeks now. And losing lots of sleep.

But there’s hope. I found this today:
========================================================

Module Parameters
=================

One optional parameter “disable_msi” can be supplied as a command line
argument to the insmod or modprobe command. This parameter is used
to disable Message Signaled Interrupts (MSI) and the parameter is only
valid on 2.6 kernels that support MSI. On 2.4 kernels, this parameter
cannot be used. By default, the driver will enable MSI if it is supported
by the kernel. It will run an interrupt test during initialization to
determine if MSI is working. If the test passes, the driver will enable
MSI. Otherwise, it will use legacy INTx mode.

Set the “disable_msi” parameter to 1 as shown below to always disable
MSI on all NetXtreme II NICs in the system.

insmod bnx2.ko disable_msi=1

or

modprobe bnx2 disable_msi=1

The parameter can also be set in modprobe.conf. See the man page
for more information.
================================================
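For the modprobe.conf route the README mentions, here is a minimal sketch of how I would make the setting stick across reboots. The helper name `add_bnx2_option` is my own invention, and the path is an assumption: /etc/modprobe.conf on Red Hat releases of this vintage, a file under /etc/modprobe.d/ on newer ones.

```shell
# Hypothetical helper (not part of the driver): append the bnx2 option
# to a modprobe config file unless the exact line is already present.
# Usage: add_bnx2_option /etc/modprobe.conf   (root needed for the real file)
add_bnx2_option() {
    conf="$1"
    line="options bnx2 disable_msi=1"
    touch "$conf"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "$line" "$conf" || echo "$line" >> "$conf"
}
```

The option is only read when the module loads, so the driver has to be reloaded (modprobe -r bnx2, then modprobe bnx2) or the box rebooted. Do that from the console, not over the NIC you are about to unload. Running modinfo -p bnx2 first should confirm that the installed driver actually offers disable_msi.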

Basically, MSI-X is a PCI feature, not anything Xeon-specific: instead of pulling on one shared interrupt line, the NIC raises interrupts as messages, and with MSI-X it can use several interrupt vectors, which lets the interrupt work be spread across CPU cores “in parallel”. As soon as I found this, my “system administration generalist” red flag went up, and I think this is the ticket. And yeah, I have a 2.6 kernel. What triggered my search was looking at the boot info for my NIC, where I saw “MSI-X enabled” as part of the NIC firing-up process at boot time. Hmmmm.
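To check which interrupt mode the NIC actually ended up in, something like the following could work. It is a rough sketch under a couple of assumptions: the function name `irq_mode` is mine, the interface name is whatever yours is (eth0 here), and it relies on /proc/interrupts labelling message-signaled interrupts with a string containing “MSI”, which can vary between kernel versions.

```shell
# Hypothetical helper: report whether an interface's interrupt line(s)
# in /proc/interrupts look message-signaled (MSI or MSI-X) or legacy.
# Usage: irq_mode eth0            (reads /proc/interrupts)
#        irq_mode eth0 some_file  (reads a saved copy instead)
irq_mode() {
    iface="$1"
    file="${2:-/proc/interrupts}"
    # -w avoids matching eth1 against eth10; "|| true" tolerates no match
    lines=$(grep -w -- "$iface" "$file" || true)
    if [ -z "$lines" ]; then
        echo "not-found"
    elif printf '%s\n' "$lines" | grep -q 'MSI'; then
        echo "msi"
    else
        echo "legacy"
    fi
}
```

Running dmesg | grep -i bnx2 is the other quick check, since that is where the “MSI-X enabled” message showed up for me.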

I set that parameter tonight. We shall see if indeed this is the culprit.