02-22-2012 12:07 PM
You can do it as described in VMware KB 1026256 (Creating a physical RDM):
vmkfstools -z /vmfs/devices/disks/<device> example.vmdk
Which block size do you use at the RAID 5 level? 64k?
02-23-2012 08:21 AM
Thanks for the quick response. As for block size, we would have used whatever the SmartStart default for block size is when we created the RAID 5 arrays on the two DL380's. What should we be using?
One question I have is whether you think creating physical RDMs on local storage will be supported in the future, or if this is a feature that might go away in some future upgrade?
Also, do you have your logical disks set up so the first one is small just to hold ESXi (if not booting from an SD card) and the VSA, and the second logical disk represents the rest of the storage that you would point to with the physical RDM? Also how would you prepare/format the second disk for the physical RDM?
I apologize, but I am not as familiar as I should be with RDMs and their use, so if you could describe in a little more detail how to do this, I would greatly appreciate it.
Thanks very much,
02-23-2012 09:24 AM
> One question I have is whether you think creating physical RDMs on local storage will be supported in the future, or if this is a feature that might go away in some future upgrade?
Support is a question that only HP or VMware could address authoritatively, but it would be tough for them to remove such features now.
But we can "intelligently speculate"... :-)
For VMware: what they say is that by default local RDMs are not supported, but there *are* cases where local RDMs are fine (refer to KB 1017530). It comes down to whether your controller exports a globally unique ID (refer to KB 1014953). In our case, VMware engineering specifically recommended local RDMs.
A few more KBs to check out:
For HP: the VSA just sees virtual disks (VMDKs). It doesn't know or care what is physically behind that virtual disk - be it a VMDK on top of a VMFS, a mapping to a block storage device, or even a mapping to another iSCSI device. Where the VSA *does* care is regarding the performance (latency, etc) characteristics of the device. So, while you *could* give your VSA a pile of VMDKs that live on remote NFS or iSCSI or FC, you would want to understand the implications vs VMDKs living on local storage. (I've heard of individuals using this feature to consolidate storage from other vendors into SAN/IQ.) Anyway, this brings us back to the benefits of local RDMs... which, in theory, by removing layers of complexity, should perform slightly better than the default VMDK on VMFS.
> Also, do you have your logical disks set up so the first one is small just to hold ESXi (if not booting from an SD card) and the VSA, and the second logical disk represents the rest of the storage that you would point to with the physical RDM?
You've got it.
We boot ESXi from a USB device. We have a tiny logical disk to hold the VSA "OS" disks. We then divided the rest of the storage into < 2TB logical disk chunks (although we have tested > 2 TB local RDMs with ESXi 5.0). Each of these logical disks appears as an naa.* device in /dev/disks. We created RDMs from these devices and then added those RDMs to the VSAs as VMDKs.
> Also how would you prepare/format the second disk for the physical RDM?
No need to prepare them any differently than a regular VMDK. You just add them to the virtual machine (VSA) as disks like you would with regular VMDKs -- on their own virtual controller, etc. In the VM's settings, instead of creating the VMDK on a VMFS, you'll add a VMDK that "already exists" and then point to the RDMs you created.
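For anyone who wants to script the RDM creation step, here is a rough dry-run sketch. It only prints one `vmkfstools -z` command (the same form as in the KB example earlier in the thread) per naa.* device and does not run anything. The datastore folder and the `-rdmp.vmdk` file naming are placeholders I made up for illustration, not something from our setup, so review the list by hand and drop any device you do not want mapped (for example the small logical disk that holds the VSA "OS" disks).

```shell
#!/bin/sh
# Dry-run sketch: print one vmkfstools command per naa.* device.
# DEV_DIR and RDM_DIR are assumptions; adjust them to your host.
DEV_DIR="${DEV_DIR:-/vmfs/devices/disks}"
RDM_DIR="${RDM_DIR:-/vmfs/volumes/datastore1/rdm}"   # hypothetical datastore folder

rdm_commands() {
    for dev in "$DEV_DIR"/naa.*; do
        [ -e "$dev" ] || continue      # glob did not match anything, skip
        name=$(basename "$dev")
        case "$name" in
            *:*) continue ;;           # skip partition entries such as naa.xxx:1
        esac
        # -z creates a physical-mode RDM pointer file for the device
        echo "vmkfstools -z $dev $RDM_DIR/$name-rdmp.vmdk"
    done
}

rdm_commands
```

Once the printed commands look right, they can be pasted into the ESXi shell one at a time, and the resulting pointer files added to the VSA as existing disks.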
> I apologize, but I am not as familiar as I should be with RDMs and their use, so if you could describe in a little more detail how to do this, I would greatly appreciate it.
Don't apologize -- this method is a bit trickier, so you can see why it is not something that HP may want to document in their normal "quick-start" guides. But for those running into strange problems that could stem from the extra layers of plumbing or those wanting/needing to glean more performance, this is a path to consider. We went "all in" on this method after it resolved our serious performance problems, but it may not be a good solution for your case.
Hope that helps.
For anyone else reading this far (and still awake) -- have you tried RDMs? Did it make any difference? Any other tips/tricks?
02-23-2012 12:52 PM
@virtualmatrix: Can you suggest > 2 TB local RDMs with ESXi 5.0? Did you also notice the (300+) latencies with VM version 7? Did you play around with the underlying RAID level (RAID 5) block size, or did you always use the default block size of 256k? Thanks for your feedback.
02-23-2012 01:28 PM
> can you suggest > 2 TB local RDMs with ESXi 5.0?
I can say that it worked for some basic tests: The VSA starts up, reports the correct amount of space, and we can create and use volumes. I wouldn't say I could "suggest" it since we aren't using that configuration in a heavy workload/production environment yet. We are building out another VSA based cluster and we'll need to make that decision soon.
> Did you also notice the (300+) latencies with VM version 7?
We have not upgraded any of our VSAs to version 7. We *did* see latency symptoms in version 4, but in our case, these were resolved by using RDMs. I suspect the original poster for this thread is experiencing a different root issue than we've seen, but it is interesting that the symptoms are similar.
> Did you play around with the underlying RAID level (RAID 5) block size, or did you always use the default block size of 256k? Thanks for your feedback.
We left the controllers at their default configuration except for enabling caches. We didn't see much benefit to increasing the blocksize given the overall random nature of I/O in a busy virtualized environment, but we did not spend time benchmarking to prove our thoughts right or wrong. It would be different if we could better predict the most common I/O patterns for a particular group of disks.
Sorry I'm not more help for those questions.
If others have a different perspective on how to best optimize controllers for a VSA environment, please share!
03-03-2012 02:19 PM
03-06-2012 05:59 AM
As I'm currently running on some older DL380 G5s this isn't an option for me, but what about using DirectPath to pass the NICs directly to the VSA, bypassing the problem altogether?
03-21-2012 04:23 AM
Having exactly the same issue as the thread starter. Any news on this?
I'm using VSA on ESXi 5 in a c7000 with ds2200 storage blades and Flex-10. Usually performance is great, but when removing snapshots latency peaks at over 700 ms, and VMs even disconnect RDP sessions.
How did you change the driver to E1000? Did you change just the driver or the virtual nic of the VSA VM? Is there another way to use 10Gbit Ethernet than VMXNET3? I don't think so...
03-21-2012 10:20 AM
It doesn't work with all controllers. If it does work with yours, you have to use vmkfstools to set it up.
You should be able to Google it. It is definitely not supported by VMware.
03-21-2012 01:50 PM
M.Braak - do you have a vmware incident and/or bug number, that others could refer to if they experience the same issue?
I was hoping that if a fix was forthcoming, it would be included in the U1 which just came out last week ...
03-21-2012 02:20 PM
VMware recommended using local RDMs? What was the performance issue? Was it related to the vmxnet3 NICs described in the rest of this thread, with the RDM route seen as a way of bypassing whatever the issue was?
Huh - so I have not yet had a chance to look over the links you provided, but I'm hoping that one of them contains a list of raid controllers that export GUIDs?
Anyway - I generally HATE RDMs - management of them once you get more than a few is such a pain, and all the perf docs that I have seen show little to no advantage of an RDM over a virtual disk file (VMDK) ...
But in the case of a VSA, it seems to me a bit more sensible - you want the VSA to have as direct and unmediated access to the disk as possible, I would think.
Nice post, worth thinking about.
03-30-2012 06:36 AM
This bug bit me yesterday. I got one node in a 2-node cluster replaced and am replacing the other one as we speak. I am deploying from the 9.0 OVF and then upgrading to 9.5. The virtual hardware version seems to be the problem: I can run 9.5 on hardware version 4 and it runs great, but if you deploy from the 9.5 OVF, which uses the newer hardware version, it doesn't work well at all. I had issues with disk latency from the VSA to the virtual disks on the servers. All the underlying storage performance was good; it was just the virtual disks on the VMs where the latency was horrible.
04-03-2012 10:45 AM
I now have a copy of the 9.0 OVF. I have a few servers to use for testing, so I'm going to give the upgrade to 9.5 on VM HW version 4 a try and see what happens.
04-20-2012 05:38 AM
The last reply was that they are investigating the issue, but it could take a while because they are very busy at the moment (?!?!), and since there is a workaround available it has no priority...
We discarded the VSA this week because of the severe performance issues.
I didn't have time to rebuild the VSA using HW4, however.
04-20-2012 05:43 AM
and the workaround is: simply to not use the VMXNET3 nic?
This response does not make sense to me - I'd like to see whether we can look into this incident from our side - can you get me the incident number?
04-20-2012 05:46 AM
Not using VMXNET3 speeds things up, but performance is still bad.
The only stable solution seems to be using VMware virtual hardware version 4. This means deploying VSA 9.0 and then upgrading to 9.5, so that the hardware stays at version 4.
VMware Support Request number is 11128945812
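For anyone wanting to confirm which virtual hardware version a deployed VSA actually ended up with, a small sketch like the following reads the `virtualHW.version` key from the VM's .vmx file. The function name and the example path are placeholders for illustration only; the vSphere Client shows the same value in the VM summary.

```shell
#!/bin/sh
# Sketch: print the virtual hardware version number stored in a .vmx file.
# Usage (hypothetical path): vmx_hw_version /vmfs/volumes/datastore1/VSA1/VSA1.vmx
vmx_hw_version() {
    # Matches a line such as: virtualHW.version = "4" and prints only the number
    sed -n 's/^[[:space:]]*virtualHW\.version[[:space:]]*=[[:space:]]*"\([0-9]*\)".*/\1/p' "$1"
}
```

An output of 4 would correspond to the configuration reported as stable in this thread (deployed from the 9.0 OVF and upgraded to 9.5).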
04-24-2012 09:40 PM - edited 04-24-2012 09:47 PM
I've been following this thread for a while but haven't posted in it till now.
I am trying to roll out 2 P4000 VSA 9.5 on ESXi 5 with a server that uses a Supermicro X7DWN+ and an Adaptec 51245 with 10 300 GB 15K rpm SAS drives. Just a single host for now while testing. All updates available via Update Manager have been loaded for both VMware and the P4000.
I am getting constant latency warnings and according to esxtop the davg on the iSCSI hba is anywhere between 0 and 150. Physical networking isn't the problem because all traffic is on 1 host and on the same vswitch.
davg on the Adaptec HBA is ALWAYS less than 10 and mostly 0 to 5. Absolutely nothing in the logs about latency on the Adaptec HBA.
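To compare davg per adapter over a longer window than the interactive esxtop screen shows, one option is to capture batch output and pull out the peak value afterwards. The sketch below assumes the batch CSV labels the davg counter as "Average Driver MilliSec/Command" under "Physical Disk Adapter(vmhbaN)"; check the header row of your own capture before relying on it. The function name and file path are placeholders.

```shell
#!/bin/sh
# Capture first, for example 12 samples at 5-second intervals:
#   esxtop -b -d 5 -n 12 > /tmp/esxtop.csv
#
# Sketch: print the highest davg sample for one adapter from that CSV.
# $1 = CSV file, $2 = adapter name such as vmhba1
davg_max() {
    awk -F'","' -v hba="$2" '
        NR == 1 {
            # Find the column whose header names this adapter and the davg counter
            for (i = 1; i <= NF; i++)
                if (index($i, "Physical Disk Adapter(" hba ")") &&
                    index($i, "Average Driver MilliSec/Command"))
                    col = i
            next
        }
        col && $col + 0 > max { max = $col + 0 }
        END { if (col) print max + 0; else print "column not found" }
    ' "$1"
}
```

Running it once for the iSCSI adapter and once for the local RAID controller gives a quick side-by-side of where the latency is coming from.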
I contacted HP about it and their response was to disable multipathing. Well that doesn't appear to have fixed it, though it does improve latency. They also recommended vmxnet3 and hardware version 8.
I referred the VMware tech to your support request number, and he said it's not applicable to my situation because it was opened under ESX 4.1.
Today I got a PSOD and VMware support said it was because of high latency. They told me not to ignore the latency any longer.
I haven't tried installing the VSA 9.0 yet but that will be my next step.
05-10-2012 05:16 AM
Has anyone tested the 9.5 VSA (HW version 7+) with ESXi 5 Update 1? I have Update 1 installed, but I'll have to restripe everything and deploy the latest VSA again.
05-14-2012 06:50 AM - edited 05-14-2012 07:22 AM
Interesting post; unfortunately I have the same problem here!
This is with vSphere 5 U1 and HP VSA 9.5 (a new deployment, not an upgrade from 9.0).
I also have an SR open with VMware and am hoping for news soon.
By the way, do you have any more info about downgrading from HP VSA 9.5 to HP VSA 9.0 without breaking my VMFS?