
As some of you read previously, I had been experiencing disk latency issues on our SAN and tried many initial methods to troubleshoot and understand the root cause. Due to other, more pressing issues this was put aside until we started to see VMs being occasionally restarted by vSphere HA because the lock had been lost on a given VMDK file. (NOT GOOD!!)
The Environment:-
3x vSphere 5.1 Hosts
2x 4-port 1GbE NICs (allowing 2x iSCSI vmkernel ports per host for redundancy)
Dedicated switching (isolated from the LAN) for iSCSI and vMotion (on separate respective VLANs)
MSA2312i SAN G2 (with 4 shelves)
The iSCSI multipathing policy was set to Round Robin.
SIOC is enabled.
After a great deal of digging I resolved to contact VMware support, who in turn pointed me to the SAN, as the host log files had the following..
So, duly armed, I contacted HP support, who immediately escalated the issue internally. During this time I had a very helpful conversation with a good friend
When HP did eventually come back to me they suggested the SAN was perfectly fine. However, enough time had passed since the iSCSI port configuration change that I could already see a noticeable drop in latency.
I waited another week, and since then I am very glad to say the latency has been considerably lower, with no recurrence of the locks being lost on VM VMDK files.