Migrating From SCSI To NVMe on vCenter (Part 1 – Live Migration)

This is going to be broken up into two parts: first, a live migration where no VMs get powered off during the migration; second, a migration where you temporarily power off the VMs attached to the SCSI datastore.

Why would you want to do it one way or another?

Pros of live migration:

  • No VM downtime
  • Simpler configuration changes, since the old and new configurations can overlap during the move; less to go wrong or mess up

Pros of powering off VMs:

  • The total migration time will be significantly shorter because no data has to be moved. VMware currently doesn’t support XCOPY offload for NVMe-oF (even on the same array), so a live Storage vMotion has to copy all of the data

Great, you’ve decided on a live migration for your VMs because you don’t care how long it takes; you just want to minimize VM downtime as much as possible. If you haven’t already, you’ll need to follow the guides Pure Storage has for setting up NVMe-oF in your environment.

Once you’ve configured NVMe-oF in your environment, you’ll need to create the namespace (volume), connect it to the appropriate host group, create the NVMe-oF datastore in vCenter, and finally Storage vMotion your VMs from the SCSI datastore to the NVMe datastore.

Create the Volume

From a FlashArray perspective, this is identical to SCSI except for slightly different terms and labels. Cody wrote a nice article explaining the differences. Log into your FlashArray, select (1) Storage, then (2) Volumes, then click the (3) + on the right-hand side of the GUI.

In the window that pops up, populate a (1) Name for the namespace (volume), give it a (2) Provisioned Size then click (3) Create.

Note the volume serial number by going to (1) Storage then (2) Volumes, finding the name of your (3) Volume, then (4) clicking on the hyperlink name of it.

On the next window, note the Serial of the volume. We will use this later in vCenter to validate that we are connecting the right namespace.
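
If you’d rather script this step, here is a minimal sketch using the Pure Storage PowerShell SDK v2. Treat the cmdlet and parameter names as assumptions to verify against the SDK documentation, and the array address, credentials, volume name, and size as placeholders for your environment.

# Assumes the PureStoragePowerShellSDK2 module is installed; names and values below are placeholders
Import-Module PureStoragePowerShellSDK2

$fa = Connect-Pfa2Array -Endpoint 'flasharray.lab.local' -Credential (Get-Credential) -IgnoreCertificateError

# Create the namespace (volume); -Provisioned is in bytes, so PowerShell's 1TB literal provisions 1 TiB
New-Pfa2Volume -Array $fa -Name 'nvme-ds-01' -Provisioned 1TB

# Note the serial; we'll match it against the namespace shown in vCenter later
(Get-Pfa2Volume -Array $fa -Name 'nvme-ds-01').Serial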

Connect The Volume To the Appropriate Host Group

Still in the FlashArray GUI, go back to (1) Storage, select (2) Hosts, then select the (3) Host Group you have created for your NVMe-oF hosts. In this case, I am setting this up for NVMe-FC but the steps will be the same for NVMe-RoCE after you have followed the previously linked KB articles.

Next, click the three vertical dots (this is called a kebab menu, not a hamburger) and select Connect.

For the last step in the FlashArray GUI, select the (1) Namespace (volume) you created before then click (2) Connect.
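
The connection can be scripted the same way. Again, this is a hedged sketch: New-Pfa2Connection and its -HostGroupNames/-VolumeNames parameters are my recollection of the SDK v2 cmdlet, and the host group name is a placeholder.

# Connect the new namespace (volume) to the NVMe-oF host group (names are placeholders)
New-Pfa2Connection -Array $fa -HostGroupNames 'nvme-fc-hostgroup' -VolumeNames 'nvme-ds-01'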

Create The NVMe-oF Datastore

Switching over to vCenter, we’ll first want to create a datastore from the namespace that we’ve just presented to our host group. This process is easier than with SCSI datastores because you do not have to rescan the storage adapters: all you need to do is create a datastore on top of the NVMe namespace that is already present.

(1) Right click on the vSphere cluster you’ve presented the namespace to, hover over (2) Storage, then click (3) New Datastore.

Select (1) VMFS (currently vVols is unsupported by VMware with NVMe-oF) and click (2) Next.

Specify a (1) Name for your datastore, (2) select a host that the namespace was presented to, select the (3) namespace from the list, and click (4) Next. Validate that the serial number shown in the Name column matches the serial you noted earlier in the FlashArray GUI.

Select (1) VMFS 6 (who uses 5 anymore anyways?!) and click (2) Next.

Click (1) Next.

Review the details and click (1) Finish.

Validate the hosts are connected to your newly created NVMe-oF datastore by going to the (1) Storage tab, selecting the (2) Datastore Name and clicking on the (3) Hosts tab. If anything looks incorrect here (not all hosts from the cluster are connected, etc), please review your NVMe-oF configuration for issues.
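
If you prefer PowerCLI for this part, here is a rough sketch of the same workflow. The host, datastore, and device names are placeholders, and it assumes a single NVMe namespace is presented; the eui-based canonical name should match the serial you noted on the FlashArray.

# Find the NVMe namespace on one host it was presented to (filter further if you have more than one)
$vmhost = Get-VMHost 'esxi-01.lab.local'
$ns = Get-ScsiLun -VmHost $vmhost -LunType disk | Where-Object { $_.CanonicalName -like 'eui.*' }

# Create a VMFS 6 datastore on top of the namespace
New-Datastore -VMHost $vmhost -Name 'nvme-ds-01' -Path $ns.CanonicalName -Vmfs -FileSystemVersion 6

# Validate that every host in the cluster sees the new datastore
Get-VMHost -Datastore 'nvme-ds-01' | Select-Object Name, ConnectionState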

Storage vMotion the VMs from SCSI-backed Datastore(s) to NVMe-backed Datastore(s)

Staying in the vCenter GUI, select the (1) Hosts and Clusters tab, right click on the (2) VM you want to migrate from SCSI to NVMe then select (3) Migrate… from the list that pops up.

Select (1) Change storage only from the window that pops up and click (2) Next.

Select the (1) NVMe datastore you created before, then click (2) Next. Optionally, you can also modify the storage policy for the VM and the virtual disk format here.

Finally, verify the details of the migration and click (1) Finish.
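
Or, to do the Storage vMotion with PowerCLI instead of the GUI (the VM and datastore names here are placeholders):

# Storage vMotion a running VM from the SCSI-backed datastore to the NVMe-backed one
Move-VM -VM (Get-VM 'app-vm-01') -Datastore (Get-Datastore 'nvme-ds-01')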

And now wait until the VM has migrated to the NVMe-oF datastore. Migrations in general can be very daunting, but luckily with NVMe-oF they can be extremely simple. Hopefully you found this helpful.

S.M.A.R.T. Alerts in vmkernel Log with FlashArray™ Hardware-backed Volumes

Hello- Nelson Elam, a Solutions Engineer on Cody’s team at Pure, guest-writing here again.

If you are a current Pure customer and have had ESXi issues that warranted checking the vmkernel logs of a host, you may have noticed a significant number of messages similar to this for SCSI:

Cmd(0x45d96d9e6f48) 0x85, CmdSN 0x6 from world 2099867 to dev "naa.624a9370f439f7c5a4ab425000024d83" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

Or this for NVMe-oF:

WARNING: NvmeScsi: 172: SCSI opcode 0x85 (0x45d9757eeb48) on path vmhba67:C0:T1:L258692 to namespace eui.00f439f7c5a4ab4224a937500003f285 failed with NVMe error status: 0x1 translating to SCSI error
ScsiDeviceIO: 4131: Cmd(0x45d9757eeb48) 0x85, CmdSN 0xc from world 2099855 to dev "eui.00f439f7c5a4ab4224a937500003f285" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

If you reached out to Pure Storage support to ask what the deal is with these, you were likely told that they are 0x85s and nothing to worry about, because it’s a VMware-side error that doesn’t mean anything with Pure devices.

But why would this be logged and what is happening here?

ESXi regularly checks the S.M.A.R.T. status of attached storage devices, including array-backed devices that aren’t local. When the FlashArray software receives this SCSI command (opcode 0x85), it fails it and returns the following sense data back to the ESXi host:

failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

These can be quite challenging to decode. Luckily, virten.net has a powerful tool for decoding these. When I paste this output into that site, I get the following details:

  • Host Status [0x0] OK: This status is returned when there is no error on the host side. This is when you will see if there is a status for a Device or Plugin. It is also when you will see Valid sense data instead of Possible sense data.
  • Device Status [0x2] CHECK_CONDITION: This status is returned when a command fails for a specific reason. When a CHECK CONDITION is received, the ESX storage stack will send out a SCSI command 0x3 (REQUEST SENSE) in order to get the SCSI sense data (Sense Key, Additional Sense Code, ASC Qualifier, and other bits). The sense data is listed after Valid sense data in the order of Sense Key, Additional Sense Code, and ASC Qualifier.
  • Plugin Status [0x0] GOOD: No error. (ESXi 5.x / 6.x only)
  • Sense Key [0x5]: ILLEGAL REQUEST
  • Additional Sense Data [20/00]: INVALID COMMAND OPERATION CODE

The key thing here is the Sense Key, which has a value of ILLEGAL REQUEST. The FlashArray software does not support S.M.A.R.T. SCSI requests from hosts, so it returns ILLEGAL REQUEST to tell the ESXi host that we don’t support that request type.

This is for two reasons:

  1. Since the FlashArray software’s volumes are not physically attached storage devices on the ESXi host, S.M.A.R.T. checks from the ESXi host don’t really make sense.
  2. The FlashArray software handles drive failures and drive health independently of ESXi; monitoring the health of the drives that back the volumes is the array’s job, not ESXi’s. You can read more about this in this datasheet.

Great Nelson, thanks for explaining that. Why are you talking about this now?

Pure has been working with VMware to reduce the noise and unnecessary concern caused by these errors. Seeing a failed ScsiDeviceIO in your vmkernel logs is alarming. In vSphere 7.0U3c, VMware fixed this problem: these messages will now only be logged once, when the ESXi host boots, instead of as often as every 15 minutes.

This means that in vSphere 7.0U3c, if you are doing any ESXi host troubleshooting, you no longer have to concern yourself with these errors; for me, this means I won’t have to filter these out of my greps anymore when looking into an ESXi issue in my lab. Great news all around!
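
And if you are still on a build older than 7.0U3c, here is one (hypothetical) way to keep these out of your own log review when working through a copied vmkernel.log with PowerShell:

# Show failed commands from a copied vmkernel.log, skipping the benign 0x85 (S.M.A.R.T.) entries
Select-String -Path .\vmkernel.log -Pattern 'failed H:' |
    Where-Object { $_.Line -notmatch 'Cmd\(0x[0-9a-f]+\) 0x85' } |
    ForEach-Object { $_.Line }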

ESXi NVMe-oF Namespace IDs, LUNs, and other Identifiers

In the world of SCSI, a storage device is generally addressed by two things:

  1. LUN (Logical Unit Number). This is a number used to address the device down a specific path, to a specific array, for a specific host. So it is not really a unique number; it is not guaranteed to be unique to an array, to a host, or to a volume, and every path to a volume can present a different LUN number. Think of it like a street address: there are many “100 Maple Streets”, so it needs the city, the state/province, and the country to really be meaningful, and a street name can change. It can usually get you what you want, but it isn’t guaranteed.
  2. Serial number. This is a globally unique identifier of the volume. It is entirely unique to that volume and it cannot be changed. It is the same for everyone and everything that uses that volume. To continue the metaphor, think of it as the GPS coordinates of the house instead of the address: they will always get you where you need to go.
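
A quick way to see both identifiers side by side for SCSI devices is PowerCLI (the host name below is a placeholder):

# RuntimeName (e.g. vmhba2:C0:T1:L251) embeds the path-specific LUN number;
# CanonicalName (e.g. naa.624a9370...) is built from the globally unique volume serial
Get-ScsiLun -VmHost (Get-VMHost 'esxi-01.lab.local') -LunType disk |
    Select-Object RuntimeName, CanonicalName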

So how does this change with NVMe? Well these things still exist, but how they interact is…different.

Now, first, let me remind you that these concepts are generally vendor neutral, but how things are generated, reported, and even sometimes named varies. I write this from a Pure Storage perspective, so keep that in mind.

Continue reading “ESXi NVMe-oF Namespace IDs, LUNs, and other Identifiers”

Disk.DiskMaxIOSize and the Blue Screen of Death

While the title of this post does sound like a halfway decent Harry Potter novel, this is far more nefarious. Pure Storage, like many other vendors, has a best practice around lowering the Disk.DiskMaxIOSize setting on ESXi hosts when using UEFI boot for your Windows VMs. Why? Well:

Yes, not having it set correctly can, in a few situations, cause a BSOD. First off, why?
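
For reference, here is a hedged PowerCLI sketch of checking and lowering the setting; 4096 KB (4 MB) is the value commonly recommended for this issue, and the cluster name is a placeholder.

# Disk.DiskMaxIOSize is set per host and expressed in KB (the default is 32767, roughly 32 MB)
foreach ($vmhost in Get-Cluster 'Prod' | Get-VMHost) {
    $setting = Get-AdvancedSetting -Entity $vmhost -Name 'Disk.DiskMaxIOSize'
    "{0}: {1} KB" -f $vmhost.Name, $setting.Value

    # Lower it to 4 MB so UEFI-booted Windows VMs don't issue I/Os larger than they can handle
    $setting | Set-AdvancedSetting -Value 4096 -Confirm:$false
}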

Continue reading “Disk.DiskMaxIOSize and the Blue Screen of Death”

Latency-based PSP in ESXi 6.7 Update 1: A test drive

Yesterday, I wrote a post introducing the new latency-based round robin multipathing policy in ESXi 6.7 Update 1. You can check that out here:

Latency Round Robin PSP in ESXi 6.7 Update 1

In normal scenarios, you may not see much of a performance difference between the standard IOPS switching-based policy and the latency one. So don’t necessarily expect that switching policies will change anything. But then again, multipathing primarily exists not for healthy states, but to protect you during times of poor health. Continue reading “Latency-based PSP in ESXi 6.7 Update 1: A test drive”

Latency Round Robin PSP in ESXi 6.7 Update 1

This is my first (but certainly not last) post on the new path selection policy option in vSphere 6.7 Update 1. In reality, this option was introduced in the initial release of 6.7, but it was not officially supported until Update 1.

So what is it? Well first off, see the official words from my colleague Jason Massae at VMware here:

https://storagehub.vmware.com/t/vsphere-storage/vsphere-6-7-core-storage-1/vsphere-6-7-u1-enhanced-round-robin-load-balancing/

Why was this PSP option introduced? Well, the most common path selection policy is NMP Round Robin, VMware’s built-in path selection policy for arrays that offer multiple paths. Round Robin was a great way to leverage the full performance of your array by actively using all of the paths simultaneously. Well…almost simultaneously.
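
If you want to try it on a single device while reading, here is a rough sketch using esxcli through PowerCLI. The host and device names are placeholders, the device must already be claimed by Round Robin, and the host needs to be on ESXi 6.7 Update 1 or later.

# Switch one device on one host from IOPS-based to latency-based Round Robin
$esxcli = Get-EsxCli -VMHost (Get-VMHost 'esxi-01.lab.local') -V2
$cfg = $esxcli.storage.nmp.psp.roundrobin.deviceconfig.set.CreateArgs()
$cfg.device = 'naa.624a9370f439f7c5a4ab425000024d83'
$cfg.type = 'latency'
$esxcli.storage.nmp.psp.roundrobin.deviceconfig.set.Invoke($cfg)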

Continue reading “Latency Round Robin PSP in ESXi 6.7 Update 1”

What’s New in Core Storage in vSphere 6.7 Part VI: Flat LUN ID Addressing Support

vSphere 6.7 core storage “what’s new” series:

A while back I wrote a blog post about LUN ID addressing and ESXi, which you can find here:

ESXi and the Missing LUNs: 256 or Higher

In short, VMware only supported one mechanism of LUN ID addressing, called “peripheral”. A different mechanism, called “flat”, is generally encouraged by the SAM (SCSI Architecture Model), especially for larger LUN IDs (256 and above). If a storage array used flat addressing, ESXi would not see LUNs from that target. This is often why ESXi could not see LUN IDs greater than 255: arrays would use flat addressing for LUN IDs at that number or higher.

ESXi 6.7 adds support for flat addressing. Continue reading “What’s New in Core Storage in vSphere 6.7 Part VI: Flat LUN ID Addressing Support”

What is the latency stat QAVG?

I wrote a blog post a year or so ago about ESXi and storage queues which has received a lot of wonderful feedback (thank you!!), and I eventually turned it into a VMworld session and other engagements.

So in the past year I have had quite a few discussions around this. And one part has always bothered me a bit.

In ESXi, there are a variety of latency metrics:

  • GAVG. Guest average. Sometimes called “VM observed latency”. This is the amount of time it takes for an I/O to be completed after it leaves the VM: through ESXi, through the SAN (or iSCSI network), committed to the array, and acknowledged back.
  • KAVG. Kernel average. This is how long an I/O spends in the ESXi kernel. If this is anything but zero, there is some kind of bottleneck (often a maxed-out queue).
  • DAVG. Device average. This is how long it takes for the I/O to be sent from the host, through the SAN, to the array, and acknowledged back.

Continue reading “What is the latency stat QAVG?”

VMware Capacity Reporting Part V: VVols and UNMAP

Storage capacity reporting seems like a pretty straightforward topic: how much storage am I using? But when you introduce the concept of multiple levels of thin provisioning AND data reduction into it, not all usage is equal (does it compress well? does it dedupe well? is it zeroes?).

This multi-part series will break it down in the following sections:

  1. VMFS and thin virtual disks
  2. VMFS and thick virtual disks
  3. Thoughts on VMFS Capacity Reporting
  4. VVols and capacity reporting
  5. VVols and UNMAP

Let’s talk about the ins and outs of these in detail, then of course finish it up with why VVols makes this so much better.

NOTE: Examples in this series are given from a FlashArray perspective, so mileage may vary depending on the type of array you have. The VMFS layer and above, though, are the same for all arrays. This is the benefit of VMFS: it abstracts the physical layer. This is also the downside, as I will describe in these posts.
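
As a starting point for the VMFS-level numbers (the layer vSphere itself reports on), here is a quick PowerCLI view; array-side data reduction is invisible at this layer.

# Capacity, free space, and provisioned space as VMFS sees them
Get-Datastore | ForEach-Object {
    $summary = $_.ExtensionData.Summary
    [pscustomobject]@{
        Name          = $_.Name
        CapacityGB    = [math]::Round($_.CapacityGB, 1)
        FreeGB        = [math]::Round($_.FreeSpaceGB, 1)
        ProvisionedGB = [math]::Round(($summary.Capacity - $summary.FreeSpace + $summary.Uncommitted) / 1GB, 1)
    }
}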

Continue reading “VMware Capacity Reporting Part V: VVols and UNMAP”

Upgrading ESXi environment with PowerCLI

A new ESXi 6.5 patch came out today:

https://kb.vmware.com/s/article/2151104

And I wanted to upgrade my whole lab environment to it, but I haven’t set up Auto Deploy or Update Manager yet (I plan to, which will make all of this much easier to manage). So I wrote a quick and dirty PowerCLI script that updates each host to the latest patch and, if the host doesn’t have any VMs on it, puts it into maintenance mode and reboots it. I will reboot the other ones as needed.
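
This isn’t the actual script from this post, but here is a minimal sketch of that kind of loop; the depot path and bundle name are made-up placeholders, and in real life you’d want to check the result of each step before rebooting.

# For each host with no VMs on it: enter maintenance mode, apply the patch bundle, reboot
$depot = '/vmfs/volumes/datastore1/ESXi650-example-patch.zip'   # placeholder path to the offline bundle

foreach ($vmhost in Get-VMHost) {
    if ((Get-VM -Location $vmhost).Count -eq 0) {
        Set-VMHost -VMHost $vmhost -State Maintenance -Confirm:$false

        # Apply the patch with esxcli (equivalent to: esxcli software vib update -d <depot>)
        $esxcli = Get-EsxCli -VMHost $vmhost -V2
        $update = $esxcli.software.vib.update.CreateArgs()
        $update.depot = $depot
        $esxcli.software.vib.update.Invoke($update)

        Restart-VMHost -VMHost $vmhost -Confirm:$false
    }
}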

My script is so short it’s not really even worth throwing on GitHub, but I might make it cleaner and smarter at some point and put it there. Continue reading “Upgrading ESXi environment with PowerCLI”