Home Lab Gen IV – Part V Installing Mellanox HCAs with ESXi 6.5

The next step on my InfiniBand home lab journey was getting the InfiniBand HCAs to play nice with ESXi. To do this I needed to update the HCA firmware, which proved to be a bit of a challenge. In this blog post I go into how I solved this issue and got the cards working with ESXi 6.5.

My initial HCA selection was the ConnectX (aka HP INFINIBAND 4X DDR PCI-E HCA CARD 452372-001) and the Mellanox MHGA28-XTC InfiniHost III HCA. These two cards proved to be a challenge when updating their firmware. I tried all types of operating systems, different drivers, different motherboards, and different MFT tool versions, but the cards would not update or be recognized by the OS. The only thing I didn’t try was Linux. The Mellanox forums are filled with folks trying to solve these issues with mixed success. I went with these cheaper cards, and they simply do not have the product support necessary. I don’t recommend using them with ESXi, and I have migrated to a ConnectX-3, which you will see below.

Updating the ConnectX-3 Card:

After a little trial and error, here is how I updated the firmware on the ConnectX-3. I found the ConnectX-3 card worked very well with Windows 2012: I was able to install the latest Mellanox OFED for Windows (aka the Windows drivers for Mellanox HCA cards) and update the firmware very smoothly.

First, I confirm the drivers via Windows Device Manager (update to the latest if needed).

Once you confirm the device works in Windows, install the Mellanox Firmware Tools for Windows (aka WinMFT).

Next, it’s time to update the HCA firmware. To do this you need to know the exact model number and sometimes the card revision. Normally this information can be found on the back of your HCA. With this in hand, go to the Mellanox firmware page, locate your card, and download the update.

After you download the firmware, place it in an accessible directory. Next, open a command prompt, navigate to the WinMFT directory, and use the ‘mst status’ command to reveal the HCA identifier, i.e. the MST Device Name. If this command works, it is a good sign your HCA is working properly and communicating with the OS. Next, I use the flint command to update my firmware. The syntax is: flint -d <MST Device Name> -i <Firmware Name> burn
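The update flow can be sketched as a small shell function that only prints the flint commands in the order I would run them, so it is safe to try anywhere. The device name and firmware file name in the example call are placeholders I made up, not values from my setup; use the name that ‘mst status’ reports and the image you downloaded. The extra query step before and after the burn is my own suggestion for checking the firmware version.

```shell
#!/bin/sh
# Print the flint commands for one HCA, in the order they would be run.
# Arguments: $1 = MST device name (as reported by `mst status`)
#            $2 = path to the downloaded firmware image
flint_plan() {
    dev="$1"
    img="$2"
    echo "flint -d $dev query"          # show the current firmware version and PSID
    echo "flint -d $dev -i $img burn"   # write the new image to the card
    echo "flint -d $dev query"          # confirm the version after the burn
}

# Placeholder values for illustration only.
flint_plan mt4099_pci_cr0 fw-ConnectX3.bin
```

The printed lines can be pasted into the command prompt one at a time; flint asks for confirmation before it burns.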

Tip: If you are having trouble with your Mellanox HCA I highly recommend the Mellanox communities. The community there is generally very responsive and helpful!

Installation of ESXi 6.5 with Mellanox ConnectX-3

I would love to tell you how easy this was, but the truth is it was hard. Again, old HCAs with a new ESXi release do not make for an easy or simple install, but they do make for home lab fun. Let me save you hours of work: here is the simple solution for getting Mellanox ConnectX cards working with ESXi 6.5. In the end I was able to get ESXi 6.5 working with my ConnectX card (aka HP INFINIBAND 4X DDR PCI-E HCA CARD 452372-001) and with my ConnectX-3 CX354A.

Tip: I do not recommend the use of the ConnectX card (aka HP INFINIBAND 4X DDR PCI-E HCA CARD 452372-001) with ESXi 6.x. No matter what I tried, I could not update its firmware, and it has VERY limited or non-existent support. Save time and go with ConnectX-3 or above.

After I installed ESXi 6.5, I ran the following commands and it worked like a champ.

Disable native driver for vRDMA

  • esxcli system module set --enabled=false -m=nrdma
  • esxcli system module set --enabled=false -m=nrdma_vmkapi_shim
  • esxcli system module set --enabled=false -m=nmlx4_rdma
  • esxcli system module set --enabled=false -m=vmkapi_v2_3_0_0_rdma_shim
  • esxcli system module set --enabled=false -m=vrdma
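If you would rather not type five nearly identical lines, the same step can be sketched as a loop. The DRY_RUN switch and the run helper are my own additions for illustration; with DRY_RUN=1 (the default here) the commands are only printed, so you can check them before running anything on the host.

```shell
#!/bin/sh
# Modules taken from the list above.
RDMA_MODULES="nrdma nrdma_vmkapi_shim nmlx4_rdma vmkapi_v2_3_0_0_rdma_shim vrdma"

# Hypothetical safety switch: 1 = print commands, 0 = execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Disable each RDMA-related module in turn.
disable_rdma_modules() {
    for mod in $RDMA_MODULES; do
        run esxcli system module set --enabled=false -m="$mod"
    done
}

disable_rdma_modules
```

Set DRY_RUN=0 in the ESXi shell to apply the changes. Afterwards, ‘esxcli system module list’ should show these modules as not enabled.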

Uninstall default driver set

  • esxcli software vib remove -n net-mlx4-en
  • esxcli software vib remove -n net-mlx4-core
  • esxcli software vib remove -n nmlx4-rdma
  • esxcli software vib remove -n nmlx4-en
  • esxcli software vib remove -n nmlx4-core
  • esxcli software vib remove -n nmlx5-core
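Not every host has all six of these VIBs installed, and a remove command for a missing VIB just returns an error. As a rough sketch, the helper below filters the list down to the VIBs that actually show up in ‘esxcli software vib list’ and prints the matching remove commands. The function names are mine, and I am assuming the VIB name is the first column of the listing and that the ESXi shell provides awk and grep.

```shell
#!/bin/sh
# VIBs taken from the list above, in the same order.
MLX_VIBS="net-mlx4-en net-mlx4-core nmlx4-rdma nmlx4-en nmlx4-core nmlx5-core"

# Read `esxcli software vib list` output on stdin and print the VIBs
# from MLX_VIBS that are installed (assumes the name is column one).
installed_mlx_vibs() {
    listing="$(cat)"
    for vib in $MLX_VIBS; do
        if printf '%s\n' "$listing" | awk '{print $1}' | grep -qx -- "$vib"; then
            echo "$vib"
        fi
    done
}

# Print (not run) one remove command per installed Mellanox VIB.
plan_vib_removal() {
    esxcli software vib list | installed_mlx_vibs | while read -r vib; do
        echo "esxcli software vib remove -n $vib"
    done
}
```

Running plan_vib_removal on the host prints only the remove commands that apply to that host, which you can then run by hand.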

Install Mellanox OFED for ESXi 6.x.

  • esxcli software vib install -d /var/log/vmware/MLNX-OFED-ESX-

After a quick reboot, I got 40Gb networking up and running. I did a few vmkpings between hosts and they pinged perfectly.
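For anyone who wants to repeat the ping check across several hosts, here is a minimal sketch. The vmk interface name and the IP addresses in the usage line are made up for illustration, and the PING_CMD variable is my own addition so the loop can be tried off-host; on ESXi it defaults to vmkping.

```shell
#!/bin/sh
# Command used for the check; vmkping on an ESXi host.
PING_CMD="${PING_CMD:-vmkping}"

# Ping each peer from one VMkernel interface and report the result.
# Arguments: $1 = vmk interface, remaining = peer IP addresses.
# Returns non-zero if any peer did not answer.
check_peers() {
    vmk="$1"
    shift
    failed=0
    for ip in "$@"; do
        if $PING_CMD -I "$vmk" -c 3 "$ip" >/dev/null 2>&1; then
            echo "$ip ok"
        else
            echo "$ip FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

Example call with placeholder values: check_peers vmk1 10.10.10.2 10.10.10.3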

So, what’s next? Now that I have the HCA working, I need to get VSAN (if possible) working with my new high-speed network, but that, folks, is another post.

If you like my ‘no-nonsense’ blog articles that get straight to the point… then post a comment or let me know… Else, I’ll start writing boring blog content.

