One of the initial limitations of NVMe-oF was the inability to boot from SAN–though this is no longer the case, you do need some fairly new drivers across the board to do it. As far as I am aware (as of the publication of this post), boot from SAN via NVMe is currently only supported via Fibre Channel, not RoCEv2. But I will keep an eye on that. You do need NVMe-oF/FC-capable HBAs–a list of them can be found here:
I am using an Emulex LightPulse LPe32002-M2 2-port 32Gb Fibre Channel adapter in my server, so I will go through the Emulex instructions. The ESXi side of things will be similar for other vendors, but the HBA driver version/configuration will differ.
First off, want to learn more about NVMe? Check out this from SNIA:
In order to boot from SAN via NVMe/FC you need a few other things:
- ESXi 7.0 Update 1 (or later)
- 12.8.x release (or later) of the lpfc and brcmnvmefc drivers from Broadcom/Emulex
- FlashArray//X or //C with Purity 6.1 or later (Pure customers of course) and FC ports
- Latest release of the HBA firmware
Build a custom ESXi ISO
As of the writing of this post, the required drivers for Emulex to support boot from SAN via NVMe-oF are not included in ANY image of ESXi, as far as I can tell. There is an important change in the lpfc driver in particular, listed in the release notes:
The NVMe feature is no longer disabled by default in LPe31000-series and LPe32000-series adapters
This is important because enabling NVMe on the adapters is required for ESXi (and in this case the installer) to discover NVMe namespaces–but enabling it requires a reboot. And the ESXi installer, by its very nature, is stateless. So prior drivers had a bootstrap issue: the setting isn’t enabled by default, enabling it requires a reboot, and the setting is not persisted across that reboot, so it can never take effect. The result is that you will not see any NVMe namespaces in the ESXi installer:
Consequently, you must generate a custom ESXi ISO with the newer drivers.
You can download the 12.8.340 LPFC drivers here:
and the brcmnvmefc 12.8.329 here:
Of course, also download the ESXi 7.0 U1 offline bundle (NOT the ISO):
Download the zip files:
Now for the drivers: you will need to unzip each main zip file and pull out the offline-bundle zip file inside it.
Repeat this for both drivers. You do NOT need to unzip the ESXi download.
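If you end up doing this often, the unwrap step is easy to script. Here is a minimal Python sketch that pulls the inner zip out of a driver download–the function name and file names in the usage comment are mine, not part of the Broadcom downloads:

```python
import zipfile
from pathlib import Path

def extract_inner_zip(outer_zip: str, dest_dir: str) -> list[str]:
    """Extract any .zip members (the offline bundle) from a driver download."""
    extracted = []
    with zipfile.ZipFile(outer_zip) as z:
        for name in z.namelist():
            if name.lower().endswith(".zip"):
                z.extract(name, dest_dir)
                extracted.append(str(Path(dest_dir) / name))
    return extracted

# Hypothetical file names--substitute the actual Broadcom downloads:
# extract_inner_zip("lpfc-driver-download.zip", "custom")
# extract_inner_zip("brcmnvmefc-driver-download.zip", "custom")
```

The extracted inner zips are what you feed to Add-EsxSoftwareDepot later.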
Next, install PowerCLI. The custom ISO builder is not yet supported on PowerShell Core, so you must use Windows with PowerShell 5.x installed for this. Ugh.
The first step is to add the ESXi offline bundle to the depot with Add-EsxSoftwareDepot.
Then do the same for the two drivers:
Add-EsxSoftwareDepot C:\Users\cody\Documents\custom\Broadcom-ELX-brcmnvmefc_12.8.329.0-1OEM.700.1.0.15843807_16963846.zip
Add-EsxSoftwareDepot C:\Users\cody\Documents\custom\Broadcom-ELX-lpfc_12.8.340.12-1OEM.700.1.0.15843807_17305774.zip
Then identify the image profiles available in the ESXi bundle with Get-EsxImageProfile:
I usually go with the latest standard profile. Then create a new profile from it with a new name–I am appending "Emulex" to it to make the contents of the ISO clear.
New-EsxImageProfile -cloneprofile ESXi-7.0U1c-17325551-standard -Name ESXi-7.0U1c-17325551-standard-Emulex -Vendor Emulex
Then add the right packages–specify the profile you just created (with the Emulex name) and the driver names. You can identify the driver names with the Get-EsxSoftwarePackage command if you need to:
But for this they are lpfc and brcmnvmefc:
Finally create the ISO:
Export-EsxImageProfile -ImageProfile ESXi-7.0U1c-17325551-standard-Emulex -ExportToIso -FilePath C:\Users\cody\Documents\custom\ESXi-7.0U1c-17325551-standard-Emulex.iso
This is the ISO you need to mount to your server.
Configuring the HBA NVMe-oF Boot
The next step is to configure the NVMe boot from SAN support in the HBAs themselves. You need the 12.6.x or later version of the firmware. In my case:
Boot the bare metal server and go into system setup:
So this will vary a bit depending on your server vendor, but the basic idea is the same:
- Enable UEFI boot
- Discover NVMe targets
- Create FlashArray host object and add NQNs
- Provision FlashArray volume
- Choose namespace in HBA configuration
- Repeat for each NVMe controller.
First enable UEFI boot.
The boot settings:
Commit the change. Now go to Device Settings:
Now choose the Emulex FC port. You will need to repeat this process for EACH of the ports.
Scroll to the bottom and look for Emulex NVMe over FC Boot Settings.
Enable it and commit the changes.
This will enable the rest of the options.
The next step is to configure the FlashArray with the NQNs. The HBA has its own hardware NQNs to provide access to the NVMe namespace at boot time. You will see the NQN for each port listed at the top of the NVMe over FC Boot Settings page:
If you do not, it means you are not running the 12.8.x firmware. Download it from here, create an ISO with it, and run the firmware update utility:
Note that I think the 12.6 firmware will work as well, but 12.8 is recommended here regardless. Reboot the server and the NQN should then show up.
Now create a new host object on the FlashArray:
I am creating a host object specifically for the boot NQNs, and later I will create a second host object specifically for the eventual ESXi NQN. You don’t have to do this–if you have 100s of hosts you might eventually run into a host count limit doing it this way–so you might just want to consolidate them. If you separate them (like I did), you do not need to assign a host personality to the boot host object, as seen above.
Then identify each HBA port NQN. Note that you can determine these ahead of time if needed. The prefix “nqn.2017-01.com.broadcom:ecd:nvmf:fc:” is static, and the NQNs are made unique by appending each port’s WWN to the end. So if some of the setup is scripted, you can pre-determine the NQNs (assuming you know the WWNs).
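If you are scripting this, the derivation is trivial. A minimal Python sketch–I am assuming the WWN is appended as raw lowercase hex with no colon separators, which is what I observed on my adapters; the example WWPN is made up:

```python
BOOT_NQN_PREFIX = "nqn.2017-01.com.broadcom:ecd:nvmf:fc:"

def boot_nqn_from_wwn(wwpn: str) -> str:
    """Derive an HBA port's hardware boot NQN from its WWPN.

    The FlashArray requires all-lowercase NQNs, so normalize case
    and strip any colon separators from the WWN before appending.
    """
    wwn = wwpn.replace(":", "").lower()
    return BOOT_NQN_PREFIX + wwn

# Example with a made-up WWPN:
# boot_nqn_from_wwn("10:00:00:90:FA:C7:A0:9C")
# -> "nqn.2017-01.com.broadcom:ecd:nvmf:fc:10000090fac7a09c"
```

Lowercasing here also sidesteps the case-sensitivity issue with namespace discovery mentioned below.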
Then put them in the host object.
One thing to note is that you must enter the NQNs entirely in lowercase–there is some case sensitivity in the NVMe stack at the moment that will break namespace discovery if you enter them with uppercase letters. This can be a tricky thing to troubleshoot otherwise (believe me…).
Once added, the ports will come online in the Health > Host Connections screen:
Verify redundant access to BOTH controllers. If not, check your zoning.
Next create a boot volume and connect it to the FlashArray host. Storage > Volume > Create Volume.
Next, flip back to the host–we need to discover the NVMe environment. Click Add NVMe over FC Boot Device.
Once complete (takes a second or so) you will see each discovered NVMe controller.
Each listed controller actually refers to a port on the FlashArray that the HBA port is zoned to, so you should see at least two if configured correctly. By the way, the 99.9 is a dummy value we put in for the firmware version–some hosts are sensitive to that value changing, so instead of updating it for each Purity upgrade it just stays the same.
Now navigate back and click Add NVMe over FC Boot Device.
You should see the FlashArray target ports. If you do NOT see anything it means the zoning is not done correctly and/or there is a cabling issue.
Click on the first listed controller and it will then show the array port NQN. Click on that.
Clicking on that will return the connected namespace.
This namespace is the volume we created/connected earlier. I speak more about what these UUIDs are in this post:
If you do not see any namespace you either:
- Didn’t connect the volume to the host or the right host
- Typo’ed the NQN (incorrect characters or case)
- Forgot to scan to new namespaces
Click on the namespace then choose Commit Changes.
Do this for the other controller(s) on that HBA port and then repeat for the other port(s). Doing this on each port/controller pair will ensure multipathing is configured correctly.
Now connect the ISO and boot!
You will get to the storage selection screen and notice an absence of any NVMe devices. This is because you are no longer using the boot NQNs to discover storage–the ESXi installer is now the OS of record and does not have NVMe-oF connectivity established.
You can prove this out by looking at the FlashArray: the boot host I created no longer shows the NQNs as active:
Those NQNs are only for initial boot volume discovery–once the boot phase passes, they are no longer active. So we must enable the ESXi installer’s NQN. To do this, move over to the management CLI, which you can access by pressing Alt-F1:
Credentials are root with no password (just hit enter).
In normal circumstances, you use the esxcli command to query for NVMe information, but since this is the installer and hostd is not running (which esxcli requires), there is an alternative command called localcli. So to query for the NQN you can run:
localcli nvme info get
But if you run this, you will get nothing:
Well, this is because VMware does not generate an NQN if the host name is not set–this prevents NQN collisions from many hosts accidentally getting the same NQN based off of “localhost”. So set the runtime host name.
You can use the following command to set the host name:
esxcfg-advcfg -s <hostname> /Misc/hostname
Replace <hostname> with your desired FQDN. Ideally you want to set the host name to the name you plan on actually using for this host–that way the generated NQN will be the same post-install.
So if you re-run the NQN query:
You have the NQN. This NQN can be easily predicted:
nqn.2014-08.com.<domain name>:nvme:<host name>
So if you enter in esxi-16.purecloud.com it will be:
If you enter only a host name and not an FQDN, it will use “vmware” as the domain:
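Because the pattern is deterministic, you can predict the NQN ahead of time for provisioning scripts. A minimal sketch of the logic as I understand it (the helper name is mine; note the domain labels end up reversed, e.g. purecloud.com becomes com.purecloud):

```python
def predict_esxi_nqn(hostname: str) -> str:
    """Predict the ESXi-generated host NQN from its configured host name.

    ESXi builds it as nqn.2014-08.<reversed domain>:nvme:<short name>,
    falling back to com.vmware when no domain is present.
    """
    short_name, _, domain = hostname.partition(".")
    if not domain:
        domain = "vmware.com"  # default when only a host name is set
    reversed_domain = ".".join(reversed(domain.split(".")))
    return f"nqn.2014-08.{reversed_domain}:nvme:{short_name}"

# predict_esxi_nqn("esxi-16.purecloud.com")
# -> "nqn.2014-08.com.purecloud:nvme:esxi-16"
```

Handy if you want to pre-create the FlashArray host object before the install even starts.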
So go to the FlashArray, and create a new host (or add it to the one you created earlier). I will do the former. In this case, set the personality to ESXi as this will be the NQN used to actually present VM storage to ESXi. If you are using the previous host object, ensure the personality is set to ESXi now.
Now add the NQN to this host.
Finally, connect the boot volume to this host (of course if you are using the same host as earlier it should already be connected so you can skip this):
Now back at the ESXi install, press Alt-F2 to get back to the installer, then press F5 to rescan and discover the connected volume (namespace).
Choose the namespace and press enter. The rest of the process is normal install.
Complete the install and reboot. If the host does not boot into the new installation, it means you didn’t configure the HBA NVMe boot selection correctly (or at all), and/or the server is not configured for UEFI boot–ensure it isn’t set to BIOS boot.
Once booted, add the host to your vCenter. You can verify it is booting off NVMe with a handy PowerCLI script that William Lam wrote:
You will also see that the hardware NQNs for the HBA are no longer logged in–they are only used for boot. The ESXi NQN is what is active now. Since they share the same WWNs, the zoning does not change and is valid for both: