One of those problems I noticed and shared with the EqualLogic team: after following EqualLogic's "Configuring and Installing the EqualLogic Multipathing Extension Module for VMware vSphere and PS Series SANs" guide to set up MPIO for iSCSI targets, traffic on the ESXi host was leaving through the incorrect vmnic.
The Environment:
- ESXi Host 5.0.0 build 768111 fully patched as of 9-16-2012
- Dell Equallogic PS6100XS with firmware 5.2.5
- Using EqualLogic-ESX-Multipathing-Module v1.1.1
The Problem
If you follow the install guide for the EqualLogic-ESX-Multipathing-Module and use the setup.pl script to configure a vSwitch or Distributed Switch (vDS), you'll create two iSCSI vmks and a storage heartbeat vmk.
vSwitch Setup for just iSCSI
The Dell setup.pl script sets Failback to No on each iSCSI network. Following these best practices, I would expect vmk2 traffic to leave only through vmnic2, since vmnic3 is set to Unused for it. And vice versa: vmk3 traffic should leave only through vmnic3, since vmnic2 is marked as Unused for it.
However, when I SSH to the ESXi host, run "esxtop", and hit "n" for the network view, the output shows that vmk2 is in fact using vmnic3, which is supposed to be "unused" for it. I pointed this out to Dell; they said it was odd but didn't have any answers, and they never followed up with me on it. I was asked for VMware support bundles, but not even asked to recreate the problem.
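The check above is easy to repeat across several hosts if you write down what esxtop shows and compare it with the intended bindings. The following is only a minimal sketch: the expected mapping comes from the setup described in this post, while the function name `find_misbound` and the sample observations are hypothetical, and the observed values are assumed to be transcribed by hand from the esxtop network view rather than read from the host.

```python
# Intended bindings from the teaming policy described in this post:
# each iSCSI vmk has one active uplink and the other uplink is Unused.
EXPECTED_UPLINK = {"vmk2": "vmnic2", "vmk3": "vmnic3"}


def find_misbound(observed, expected=EXPECTED_UPLINK):
    """Return (vmk, expected_nic, observed_nic) for each vmk on the wrong uplink.

    `observed` maps a vmk name to the uplink esxtop lists for it,
    transcribed by hand. vmks without an expected entry are ignored.
    """
    problems = []
    for vmk, nic in sorted(observed.items()):
        want = expected.get(vmk)
        if want is not None and nic != want:
            problems.append((vmk, want, nic))
    return problems


if __name__ == "__main__":
    # Hypothetical transcription of a faulty host: vmk2 sits on vmnic3,
    # which is marked Unused for it.
    seen = {"vmk2": "vmnic3", "vmk3": "vmnic3"}
    for vmk, want, got in find_misbound(seen):
        print(f"{vmk}: expected {want}, but esxtop shows {got}")
```

An empty result means every iSCSI vmk is on its intended uplink; anything else is a host worth a closer look.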
The Fix
I set this up every way I could think of: using both vSwitches and Distributed Switches, rebuilding my hosts from CD, trying different ESX hosts with different hardware, and trying hosts on different network infrastructure. All had the same issue of using the incorrect NIC.
After all, I had used the script to create the setup precisely to ensure it was correct and followed best practice. I then created it by hand instead of with the script and double-checked every setting to ensure everything matched. Still no luck. So of course, after it was already time to go home, I checked esxtop one last time on an ESX host I hadn't finished configuring and, behold, it was working correctly. Each vmk was bound to its correct vmnic.
vmk using the correct vmnic
The Reason this Happened
The explanation came from some reading in this epic post by Joshua Townsend, which lays out the recent changes around VMware iSCSI networking. His quick fix at the time was in fact to set failback to No. Looking at the EqualLogic-ESX-Multipathing-Module v1.1.1 setup.pl, you can see it following Joshua's quick fix and setting the failback option. I also found the same thing in version 1.1.0. The script's comments even say that it's doing so because of a VMware bug. However, VMware says that bug is now fixed (VMware KB 2008144), and it looks like setting this option now introduces a bug instead.
So if you used the Dell EqualLogic-ESX-Multipathing-Module setup script or followed the install guide, you may want to check whether you in fact have this problem, because network throughput, multipathing, and network redundancy may not work as you expect.
I really appreciate you taking the time to document this. Fixed an issue I've been looking into for a couple weeks now thanks to your post!