07-24-2013 10:50 AM - edited 07-24-2013 10:55 AM
I'm currently building the following infrastructure on VMware vSphere 5.1 Enterprise. Here is what's in place:
7x HP DL380 G7 servers, each with 2 CPU sockets and 192 GB RAM.
Every server is equipped with 12 network ports: 4 onboard plus 2x HP NC364T quad-port cards on the riser card.
The storage is a NetApp FAS2240-2 with 2 controllers and 24 disks.
The NetApp has a mezzanine card in each controller, providing 2x 10 Gb NICs per controller.
The 10 Gb NICs on the NetApp are bundled into a virtual interface (VIF) on each controller.
The switches are Cisco WS-C3750X-24 units, also equipped with 10 Gb modules.
This Cisco switch model is certified by NetApp, and the cabling from storage to switch was supplied by NetApp directly.
Each HP server runs VMware ESXi 5.1, build 799733.
The ESXi installation image is the HP-customized build with the server drivers included.
The NetApp runs Data ONTAP 8.1.1P1 in 7-Mode.
We use the software iSCSI HBA on the ESXi hosts.
Now the big problem:
I converted some physical servers to VMs using VMware Converter.
Some of these servers run MS SQL Server and Oracle.
The problem is that storage I/O over the software iSCSI HBA in VMware is now slower than it was before.
I'm attaching IOmeter screenshots to this post.
The virtual networking in vCenter is built as follows:
All four NICs (HP NC364T) on each server in the VM cluster send their storage traffic to the Cisco switch.
The Cisco switch forwards the traffic via its 10 Gb modules on to the NetApp.
I followed the how-tos to set up iSCSI multipathing on the vSwitch that carries the storage traffic to the NetApp.
The virtual machine VMDKs are stored in one big aggregate on the NetApp.
The aggregate is split into three volumes (one for the NetApp WAFL system and two for VM data).
Each of the two data volumes contains one LUN, where the VMDKs are stored.
We use the VMware software iSCSI HBA on each server.
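A quick way to sanity-check the port-binding setup described above from the ESXi shell (a sketch; the adapter name `vmhba33` is an assumption — substitute whatever name your software iSCSI HBA actually has):

```shell
# List iSCSI adapters and confirm the software iSCSI HBA is present/enabled
esxcli iscsi adapter list

# Show which VMkernel ports are bound to the software iSCSI adapter
# (vmhba33 is an assumed name -- use your own software iSCSI HBA here)
esxcli iscsi networkportal list --adapter=vmhba33

# Check the path selection policy per device; for a multipathed iSCSI setup
# like this, Round Robin (VMW_PSP_RR) is what NetApp generally recommends
esxcli storage nmp device list
```

If each bound VMkernel port doesn't show up as an active path to the LUNs, the port-binding step from the how-tos likely didn't take effect.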
I'm now experimenting with the VMware best-practice guide "Oracle Databases on VMware". Interestingly, the performance is equally bad on every server and virtual machine.
It doesn't matter whether the VM runs MS SQL, Oracle, or is just a plain Windows 2008 server with nothing on it: IOmeter still shows bad results for 4k block reads and writes.
So the question is: what is wrong here?
Many thanks for any solution.
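Before digging deeper, it may be worth ruling out an MTU mismatch on the storage path, which classically shows up as poor small-block throughput. A minimal check from the ESXi shell (the target address below is a placeholder on the 192.168.10.x storage network mentioned later in the thread; adjust to your NetApp's actual iSCSI IP):

```shell
# Test whether jumbo frames pass end-to-end without fragmentation:
# -d sets "don't fragment", -s 8972 = 9000-byte MTU minus IP/ICMP headers.
# 192.168.10.10 is an assumed placeholder address for the NetApp iSCSI interface.
vmkping -d -s 8972 192.168.10.10

# If that fails but a standard-size frame gets through, some hop (vSwitch,
# physical switch port, or NetApp VIF) is not configured for jumbo frames:
vmkping -d -s 1472 192.168.10.10
```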
07-24-2013 11:50 AM - edited 07-24-2013 12:06 PM
I tested the infrastructure with a brand-new DL380 G8 with the same NC364T NICs, and the
performance was as bad as on the G7 servers, even when I moved the storage cables to the onboard NICs.
What puzzles me is that on the G8 servers the Broadcom onboard NICs no longer show up
as available (dependent) hardware iSCSI adapters.
By the way, should I rather use software or hardware iSCSI adapters?
I'm under so much pressure that I've already ordered a QLogic QLE4062C hardware iSCSI HBA for testing.
07-25-2013 04:56 AM
I know it's a long shot, but which Cisco switch are you using?
What's the maximum bandwidth of the switch, and the bandwidth per VLAN?
I've seen similar issues when we set up a single VLAN for multiple servers and were only using a fraction of the bandwidth.
07-25-2013 01:40 PM
I'm using two Cisco WS-C3750X-24T-L switches in one stack.
I've attached the configuration file to this post.
My colleague and I suspect that the channel groups (EtherChannels) configured for the ESX servers may be responsible for the performance problems.
In another infrastructure with the same storage, same switches, and same servers, we have an Oracle RAC cluster on Red Hat Enterprise Linux (2x DL380 G7), and there are no channel groups configured on the Ciscos.
By the way, our QLogic hardware iSCSI HBA (QLE4062C) arrived today. We put it into a DL380 from the VMware cluster, and its IOmeter performance was equal to the software iSCSI HBA.
Interestingly, we were not able to ping the NetApp from the QLogic BIOS of the HBA. After we configured both ports of the QLogic HBA (192.168.10.61 and 192.168.10.62), a mapped LUN appeared in VMware on port 2 of the adapter after 5-6 minutes; port 1 still has no mapped LUN.
Info: the channel groups on the Cisco were still active at this time. Tomorrow morning our Cisco support engineer is going to deactivate all the channel groups on the Cisco stack.
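For reference, removing the channel groups from the ESX-facing ports would look roughly like this on a 3750X (a sketch only; the interface range is an assumption — use the ports your ESXi storage vmnics actually plug into. The motivation: iSCSI port binding expects independent switch ports, not an EtherChannel):

```shell
! Cisco IOS, entered from privileged EXEC mode
configure terminal
! Interface range is an assumption -- substitute your ESXi iSCSI uplink ports
interface range GigabitEthernet1/0/1 - 4
 ! Remove the port from its EtherChannel so each vmnic path stands alone
 no channel-group
 ! Common per-port settings for iSCSI access ports
 spanning-tree portfast
 flowcontrol receive on
end
```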
07-25-2013 07:07 PM
Seriously: $75 for some old BE2 NICs. And remember to use DAC/SFP+, since Base-T latency is enormous.