Migrating From SCSI To NVMe on vCenter (Part 1 – Live Migration)

This is going to be broken up into two parts: first, a live migration where no VMs get powered off during the migration; second, a migration where you temporarily power off the VMs attached to the SCSI datastore.

Why would you want to do it one way or another?

Pros of live migration:

  • No VM downtime
  • Simpler configuration changes and overlap. Less to go wrong or mess up

Pros of powering off VMs:

  • The total migration time will be significantly shorter because no data has to be moved. VMware does not currently support XCOPY for NVMe-oF, even within the same array, so a live migration has to copy every block through the host

Great, you’ve decided on a live migration for your VMs because you don’t care about how long it takes; you just want to minimize downtime of your VMs as much as possible. If you haven’t already, you’ll need to follow the guides Pure Storage has for setting up NVMe-oF in your environment.

Once you’ve configured NVMe-oF in your environment, you’ll need to create the namespace (volume), connect it to the appropriate host group, create the NVMe-oF datastore in vCenter and finally Storage vMotion your VMs from the SCSI datastore to the NVMe datastore.

Create the Volume

From a FlashArray perspective, this is identical to SCSI except for the slightly different terms and labels. Cody wrote a nice article explaining the differences. Log into your FlashArray, select (1) Storage then (2) Volumes then click the (3) + on the right hand side of the GUI.

In the window that pops up, populate a (1) Name for the namespace (volume), give it a (2) Provisioned Size then click (3) Create.

Note the volume serial number by going to (1) Storage then (2) Volumes, finding the name of your (3) Volume, then (4) clicking on the hyperlink name of it.

On the next window, note the Serial of the volume. We will use this later in vCenter to validate that we are connecting the right namespace.

Connect The Volume To the Appropriate Host Group

Still in the FlashArray GUI, go back to (1) Storage, select (2) Hosts, then select the (3) Host Group you have created for your NVMe-oF hosts. In this case, I am setting this up for NVMe-FC but the steps will be the same for NVMe-RoCE after you have followed the previously linked KB articles.

Next, click the three vertical dots (usually called a kebab menu) and select Connect.

For the last step in the FlashArray GUI, select the (1) Namespace (volume) you created before then click (2) Connect.

Create The NVMe-oF Datastore

Switching over to vCenter, we’ll first want to create a datastore from the namespace that we’ve just presented to our host group. This process is easier than with SCSI datastores because you do not have to rescan the storage adapters. All you need to do is create a datastore on top of the NVMe namespace that is already present.

(1) Right click on the vSphere cluster you’ve presented the namespace to, hover over (2) Storage, then click (3) New Datastore.

Select (1) VMFS (currently vVols is unsupported by VMware with NVMe-oF) and click (2) Next.

Specify a (1) Name for your datastore, (2) select a host that the namespace was presented to, select the (3) namespace from the list and click (4) Next. Before you continue, confirm that the serial number you noted in the FlashArray GUI appears in the Name column for the namespace you selected.
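If you would rather not compare the serial by eye, the check can be sketched in a few lines of Python. This is only a sketch under assumptions that do not come from this walkthrough: it assumes FlashArray SCSI devices are named naa.624a9370 followed by the 24-character serial, and that NVMe namespaces are named eui.00, then the first 14 characters of the serial, then 24a937, then the last 10 characters. The function names and the sample display name are made up for illustration.

```python
import re

# Assumed FlashArray identifier layouts (assumptions, not from this walkthrough):
#   SCSI: naa.624a9370 + 24-hex-digit volume serial
#   NVMe: eui.00 + first 14 serial digits + 24a937 + last 10 serial digits
PURE_NAA_PREFIX = "naa.624a9370"
PURE_OUI = "24a937"
_DEVICE_ID = re.compile(r"(?:naa|eui)\.[0-9a-f]+", re.IGNORECASE)


def serial_from_name(display_name: str) -> str:
    """Extract the upper-case volume serial from a vCenter device name."""
    match = _DEVICE_ID.search(display_name)
    if match is None:
        raise ValueError(f"no naa./eui. identifier found in: {display_name}")
    dev = match.group(0).lower()
    if dev.startswith(PURE_NAA_PREFIX):
        serial = dev[len(PURE_NAA_PREFIX):]
    elif dev.startswith("eui.00") and dev[20:26] == PURE_OUI:
        # Stitch the serial back together around the embedded OUI.
        serial = dev[6:20] + dev[26:]
    else:
        raise ValueError(f"not a recognized FlashArray identifier: {dev}")
    if len(serial) != 24:
        raise ValueError(f"unexpected serial length in: {dev}")
    return serial.upper()


def matches_serial(display_name: str, array_serial: str) -> bool:
    """True if the device name embeds the serial noted in the FlashArray GUI."""
    return serial_from_name(display_name) == array_serial.strip().upper()
```

For example, matches_serial("NVMe Fibre Channel Disk (eui.00f439f7c5a4ab4224a937500003f285)", "F439F7C5A4AB42500003F285") would return True under the layout assumed above.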

Select (1) VMFS 6 (who uses 5 anymore anyways?!) and click (2) Next.

Click (1) Next.

Review the details and click (1) Finish.

Validate the hosts are connected to your newly created NVMe-oF datastore by going to the (1) Storage tab, selecting the (2) Datastore Name and clicking on the (3) Hosts tab. If anything looks incorrect here (not all hosts from the cluster are connected, etc), please review your NVMe-oF configuration for issues.

Storage vMotion the VMs from SCSI-backed Datastore(s) to NVMe-backed Datastore(s)

Staying in the vCenter GUI, select the (1) Hosts and Clusters tab, right click on the (2) VM you want to migrate from SCSI to NVMe then select (3) Migrate… from the list that pops up.

Select (1) Change storage only from the window that pops up and click (2) Next.

Select the (1) NVMe datastore you created before then click (2) Next. Optionally you can modify the storage policies for the VM and the virtual disk format.

Finally, verify the details of the migration and click (1) Finish.

And now wait until the VM has migrated to the NVMe-oF datastore. Migrations in general can be very daunting, but luckily with NVMe-oF, it can be extremely simple. Hopefully you found this helpful.

Pure Storage vSphere Remote Plugin™ 5.1.0 launch: vVol Point-in-Time Recovery

We are excited to announce the launch of the latest version of Pure Storage’s remote vSphere plugin, 5.1.0. It includes a number of bug fixes PLUS a highly sought after feature: vVols VM point-in-time (PiT) recovery!

Why am I excited about this feature?

With vVol PiT VM recovery, you can now easily recover an entire VM that was accidentally deleted (and eradicated), or restore a VM to the point in time of a snapshot, directly from vCenter through Pure’s vSphere plugin.

PiT revert requires Pure’s vSphere remote plugin 5.1.0 and Purity™ 6.2.6 or higher. The same versions are required to undelete a vVol VM whose FlashArray™ volumes have already been eradicated from the FlashArray itself. Undeleting a vVol VM that has not been eradicated yet also works on Purity versions 6.1 and lower.

For PiT VM revert, you will also need to make sure that you have snapshots of all of the volumes associated with the vVol VM except swap, which means at least one data volume and one configuration volume.

For VM undelete before the volumes have been eradicated, you will need a snapshot of the vVol VM’s configuration volume.

For VM undelete after the vVol-backed VM has been eradicated, you’ll need a FlashArray protection group snapshot of all the VM’s data volumes, managed snapshots and configuration volumes.
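To keep the three sets of snapshot requirements straight, here is a small lookup sketched in Python. It only restates the requirements listed above; the operation names and volume-type labels are my own shorthand, not identifiers from the plugin or from Purity.

```python
# Snapshot types each recovery operation needs, as described above.
# The keys and labels are illustrative shorthand, not plugin identifiers.
REQUIRED_SNAPSHOTS = {
    "revert": {"data", "config"},
    "undelete_before_eradication": {"config"},
    "undelete_after_eradication": {"data", "managed_snapshot", "config"},
}


def missing_snapshots(operation: str, available: set) -> list:
    """Return the snapshot types still needed before attempting the operation."""
    return sorted(REQUIRED_SNAPSHOTS[operation] - set(available))


# Only a configuration volume snapshot exists for this VM.
print(missing_snapshots("revert", {"config"}))
print(missing_snapshots("undelete_before_eradication", {"config"}))
```

An empty list means the snapshot prerequisites for that operation are covered; the version requirements still have to be checked separately.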

Rather than rehash what my teammate Alex Carver has put a lot of work into, I’m just going to link to the KB and videos he created:

Download the new plugin (part of Pure’s OVA), read the release notes and test out vVol PiT recovery today! Like a lot of things, it’s better to have some understanding of what’s happening and why before needing something that might be part of your recovery process. Please note that you can also upgrade in-place from 5.0.0 to 5.1.0 (and future remote plugin releases) by following this guide.

vCenter Storage Provider “Refresh certificate” Functionality Restored

This will be a short blog, partially because my teammate Alex Carver already wrote a great blog that covers one workaround for this button not working that uses vCenter’s MOB.

If you have been using self-signed certificates in your vVols environment since vCenter 6.7 and updated to vCenter 7.0, you might have noticed something frustrating when trying to refresh those certificates manually: the button was greyed out! If you were like me, you were probably wondering why this useful functionality was removed and thought maybe it was for security reasons; your concerns might have been validated when searching VMware’s KB system and finding this KB that read like it was functionality that was removed on purpose (recently updated to reflect the current situation better).

Turns out my guess was wrong and that KB was a little misleading. VMware has brought this button’s functionality back in vCenter 7.0U3d and higher. You might say to yourself “that’s great Nelson, but I don’t upgrade my production vCenter whenever a new vCenter version comes out”. If your certificates are going to expire before you can upgrade and you want a simpler workflow than re-creating the storage providers, Alex Carver has the method for you: it uses vCenter’s MOB to refresh the storage providers without re-creating them.

Pure Storage’s vROps Management Pack™ 3.2.0 – New Features and Changes

Pure Storage recently launched a new management pack for vROps that had a number of important fixes and some changes to the interface. You can download it here and find the full release notes here. What’s new?

  • Interface changes
    • Updated icons
    • Restructuring of Pure Storage objects in the Object view of vROps
  • Add Offload Snapshot capacity metric
  • Add FlashArray Software™ version property

Let’s go over the interface changes first. If you navigate to Environment -> Object Browser -> Pure Storage FlashArray -> FlashArray Resources -> PureStorage World and expand an array, the layout will look quite different than what was there before. For starters, the icons have almost all been updated to mirror what you would expect to see on a modern FlashArray Purity version (or vCenter if that is a vCenter object). We made this change to make the vROps management pack experience as close to the FlashArray experience as possible.

Additionally, we moved the structure of the objects around to be more consistent with what you’d expect from the FlashArray. No objects were removed and the same object can be listed in multiple places where it makes sense (for example, if you expand a Hosts group, you will see the pertinent volumes there as well as under the Volumes group).

Next, we’ve added the Offload Snapshot capacity metric in this version as well as a FlashArray Offload Target object. The Offload Target object is visible under Protection and you can see the current space used by that Offload Target in the badge for that object; additionally, there is a Capacity metric for this object that shows historical consumption.

Lastly, you can now retrieve the Purity version of the array directly from vROps to help plan your FlashArrays’ upgrades. This information is found by selecting a FlashArray and going to Metrics -> Properties -> Details -> Purity Version.

Native Pure Storage FlashArray™ File Replication – Purity 6.3


With the release of Purity 6.3, Native FA File replication has been added to the Pure Storage FlashArray™ software. This adds an often important feature to the FA File folder redirection solution I wrote about last year. Pure Storage is referring to this feature as ActiveDR for File Services.

ActiveDR for File Services is a useful feature if you’ve set up or are going to set up folder redirection on FA File and you would like the file data to be replicated asynchronously to a different array, whether that FlashArray hardware is at the same site or a different one. This feature is included with FlashArray.

This allows you to use your FlashArray for native block and file workloads that need the protection that replication provides, and allows you to benefit from the great data reduction rate that FlashArray is known for with those replicated file sets.

Now, if you lose a site or an array for some reason, the file workload you have hosted on FA File can be recovered natively on FlashArray easily and quickly.

There are some differences between file and block workloads when it comes to ActiveDR replication. You can read more in the ActiveDR for File Services section of this Pure KB.

Horizon Folder Redirection Hosted on FlashArray™ File

Late last year, I wrote a KB for a solution that I wanted to bring up here- hosting Horizon’s VDI user directories on FlashArray™ File with folder redirection controlled through a group policy object (GPO). I’d like to discuss this for a couple of reasons:

1. Configuring FA File was surprisingly easy, especially compared to what I remember of setting up a Windows file server for the same purpose in a previous role.
2. Why I landed on using folder redirection for this KB instead of roaming profiles or another solution for user shares in a VDI environment.

When I have managed or set up VDI environments from scratch in previous jobs, there were always a ton of considerations that went into the VDI environment. From determining the appropriate amount of virtual resources to deploy to each VDI user to determining how much hardware I actually needed to buy to support the full deployment, each step can be more painful than the last. Any opportunity we can take to help ourselves be successful in the project is a good step to invest in. But when that step is easier and I don’t have to invest any resources to get the benefit of improving the success of the project, I have to take a step back and appreciate what just went so well.

It took me roughly 30 minutes to deploy and configure FA File in my existing Active Directory environment in my lab the first time. That included carefully digesting all the applicable new-to-me Pure documentation. From what I can recall with this process from my previous roles, that was at best a 2 hour job with a carefully put together and well documented Active Directory environment with automated Windows server deployments; at worst, that might have taken me a full day or two when I had to build everything from scratch. When any task took a day or more, I always had interrupts that would drag the process out and I ended up taking more time to review what I had done and what I needed to do from a documentation perspective.

On the point of why I used folder redirection instead of roaming profiles with Active Directory, VMware has this very helpful KB that outlines decisions you might make if you are using Dynamic Environment Manager (DEM), but I think a lot of the points are applicable even if you aren’t using DEM. I’d like to highlight some of the disadvantages of roaming profiles that they list:

Disadvantages

  • Large roaming profiles might get corrupted and cause the individual roaming profile to reset completely. As a result, users might spend a lot of time getting all personalized settings back.
  • Roaming profiles do not roam across different operating systems. This results in multiple roaming profiles per user in a mixed environment, like desktops and Terminal Services.
  • Potential for unnecessary growth of roaming profile, causing long login times.

When I saw these three specifically, I decided to go with folder redirection instead of roaming profiles. Anytime corruption is mentioned I try to avoid it. With VDI projects (let’s be real, most IT projects), you always want to minimize the impact on end users, partly because a bad experience hurts adoption and reduces confidence in the project from different groups in the company.

There is more to come with FA File and data protection, so please keep this blog in mind!

Validating SafeMode configuration on your Pure Storage Fleet with Pure1 API via PowerShell

If you are not familiar with Pure SafeMode, you should be. Check out the details here:

Or some of my thoughts in general here

Each FlashArray, Cloud Block Store, and FlashBlade has a built-in REST API, but so does Pure1, which aggregates reporting for your whole fleet in one API. Reporting on SafeMode configuration is a useful way to make sure the extra protections are in place. If they aren’t, reach out to Pure support: for security reasons customers cannot turn SafeMode on or off themselves.

The Pure1 REST API has a beta release out (v1.1.b) that includes Safemode reporting in the arrays endpoint and it is super easy to pull via PowerShell.

Install the module (if you haven’t):

Install-Module PureStorage.Pure1

Create a new certificate (if you haven’t already) and retrieve the public key

Log in to Pure1.purestorage.com as an admin, add the public key to Pure1 and get the application ID.

If you already created an app ID you do not need to do all of the above each time. Just once.

Now normally you can just connect, but the module auto-connects to the latest GA REST version by default, so before you do you need to set a variable to force it to the beta release:

Now connect.

Next, use Get-PureOneArrays and store it in a variable.

If you look at one of the results, you will see each array returned has a new property:

safe_mode

And there are additional properties available stating what is turned on (if at all)

If you see all-disabled, SafeMode is not enabled on that platform. Keep in mind that this is a beta API, so it may, and likely will, change by the GA release of the REST API, especially since our SafeMode work is rapidly expanding on the storage platforms.
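If you export the results (for example as JSON) and want to flag the arrays that still need attention, the check itself is simple. The Python below is a rough sketch under loud assumptions: it assumes each exported record is a dictionary with a name key and a safe_mode key, and that all-disabled appears as the plain value of safe_mode. The beta API may shape this differently, and the sample records are invented placeholders, not real API output.

```python
def arrays_without_safemode(arrays):
    """Return sorted names of arrays whose safe_mode is missing or 'all-disabled'.

    Assumes each record is a dict with 'name' and 'safe_mode' keys; that shape
    is an assumption for illustration, not the documented Pure1 schema.
    """
    flagged = []
    for array in arrays:
        status = array.get("safe_mode")
        if status is None or status == "all-disabled":
            flagged.append(array.get("name", "<unnamed>"))
    return sorted(flagged)


# Invented placeholder records; the third safe_mode value is not a real API value.
sample = [
    {"name": "array-b", "safe_mode": "all-disabled"},
    {"name": "array-a"},
    {"name": "array-c", "safe_mode": "placeholder-enabled-value"},
]
print(arrays_without_safemode(sample))
```

Anything that shows up in the returned list is a candidate for a conversation with Pure support.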

S.M.A.R.T. Alerts in vmkernel Log with FlashArray™ Hardware-backed Volumes

Hello, this is Nelson Elam, a Solutions Engineer on Cody’s team at Pure, guest-writing here again.

If you are a current Pure customer and have had ESXi issues that warranted you checking the vmkernel logs of a host, you may have noticed a significant amount of messages similar to this for SCSI:

Cmd(0x45d96d9e6f48) 0x85, CmdSN 0x6 from world 2099867 to dev "naa.624a9370f439f7c5a4ab425000024d83" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

Or this for NVMe-oF:

WARNING: NvmeScsi: 172: SCSI opcode 0x85 (0x45d9757eeb48) on path vmhba67:C0:T1:L258692 to namespace eui.00f439f7c5a4ab4224a937500003f285 failed with NVMe error status: 0x1 translating to SCSI error
ScsiDeviceIO: 4131: Cmd(0x45d9757eeb48) 0x85, CmdSN 0xc from world 2099855 to dev "eui.00f439f7c5a4ab4224a937500003f285" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

If you reached out to Pure Storage support to ask what the deal is with this, you were likely told that these are 0x85s and nothing to worry about because it’s a VMware error that doesn’t mean anything with Pure devices.

But why would this be logged and what is happening here?

ESXi regularly checks the S.M.A.R.T. status of attached storage devices, including array-backed devices that aren’t local. It does this with SCSI opcode 0x85 (ATA PASS-THROUGH(16)). When the FlashArray software receives that command, it fails it and returns the following status and sense data to the ESXi host:

failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

These can be quite challenging to decode. Luckily, virten.net has a powerful tool for decoding these. When I paste this output into that site, I get the following details:

Type | Code | Name | Description
Host Status | [0x0] | OK | This status is returned when there is no error on the host side. This is when you will see if there is a status for a Device or Plugin. It is also when you will see Valid sense data instead of Possible sense Data.
Device Status | [0x2] | CHECK_CONDITION | This status is returned when a command fails for a specific reason. When a CHECK CONDITION is received, the ESX storage stack will send out a SCSI command 0x3 (REQUEST SENSE) in order to get the SCSI sense data (Sense Key, Additional Sense Code, ASC Qualifier, and other bits). The sense data is listed after Valid sense data in the order of Sense Key, Additional Sense Code, and ASC Qualifier.
Plugin Status | [0x0] | GOOD | No error. (ESXi 5.x / 6.x only)
Sense Key | [0x5] | ILLEGAL REQUEST |
Additional Sense Data | 20/00 | INVALID COMMAND OPERATION CODE |

The key thing here is the Sense Key which has a value of ILLEGAL REQUEST. The FlashArray software does not support S.M.A.R.T. SCSI requests from hosts, so the FlashArray software returns ILLEGAL REQUEST to the ESXi host to tell the host we don’t support that request type.
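If you want to do the same decoding offline, the fields are easy to pull apart yourself. Here is a minimal Python sketch that parses the status portion of a vmkernel line and translates only the codes discussed in this post; anything else is passed through as hex. The function name and lookup tables are mine, and this is nowhere near as complete as the virten.net decoder.

```python
import re

# Only the codes discussed in this post; unknown codes fall back to hex.
HOST_STATUS = {0x0: "OK"}
DEVICE_STATUS = {0x2: "CHECK_CONDITION"}
PLUGIN_STATUS = {0x0: "GOOD"}
SENSE_KEY = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x20, 0x00): "INVALID COMMAND OPERATION CODE"}

_STATUS = re.compile(
    r"H:(0x[0-9a-f]+) D:(0x[0-9a-f]+) P:(0x[0-9a-f]+) "
    r"(?:Valid|Possible) sense data: (0x[0-9a-f]+) (0x[0-9a-f]+) (0x[0-9a-f]+)",
    re.IGNORECASE,
)


def decode_status(line):
    """Decode the H/D/P status and sense data in a vmkernel line, or return None."""
    match = _STATUS.search(line)
    if match is None:
        return None
    host, device, plugin, key, asc, ascq = (int(v, 16) for v in match.groups())
    return {
        "host": HOST_STATUS.get(host, hex(host)),
        "device": DEVICE_STATUS.get(device, hex(device)),
        "plugin": PLUGIN_STATUS.get(plugin, hex(plugin)),
        "sense_key": SENSE_KEY.get(key, hex(key)),
        "additional_sense": ASC_ASCQ.get((asc, ascq), f"{asc:02x}/{ascq:02x}"),
    }


print(decode_status("failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0"))
```

For the line above, the sense key decodes to ILLEGAL REQUEST and the additional sense data to INVALID COMMAND OPERATION CODE, matching the table.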

This is for two reasons:

1. Since the FlashArray software’s volumes are not a physically attached storage device on the ESXi host, S.M.A.R.T. from the ESXi host doesn’t really make sense.
2. The FlashArray software handles drive failures and drive health independent of ESXi and monitoring the health of these drives that back the volumes is handled by the FlashArray software, not ESXi. You can read more about this in this datasheet.

Great Nelson, thanks for explaining that. Why are you talking about this now?

Pure has been working with VMware to reduce the noise and unnecessary concern caused by these errors. Seeing a failed ScsiDeviceIO in your vmkernel logs is alarming. In vSphere 7.0U3c, VMware fixed this problem: the message is now logged only once, when the ESXi host boots, instead of as often as every 15 minutes.

This means that in vSphere 7.0U3c if you are doing any ESXi host troubleshooting you no longer have to concern yourself with these errors; for me, this means I won’t have to filter these out in my greps anymore when looking into an ESXi issue in my lab. Great news all around!
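If you are still on an older release and want the same filtering without a chain of greps, here is a rough Python equivalent. It is a sketch that assumes the noise always looks like the sample lines earlier in this post (opcode 0x85 with sense data 0x5 0x20 0x0, or the NvmeScsi warning for opcode 0x85); the function name is made up, and you should adjust the pattern if your logs differ.

```python
import re

# Matches the SCSI and NVMe-oF S.M.A.R.T. messages shown earlier in this post.
_SMART_NOISE = re.compile(
    r"Cmd\(0x[0-9a-f]+\) 0x85,.*sense data: 0x5 0x20 0x0"
    r"|NvmeScsi: \d+: SCSI opcode 0x85 ",
    re.IGNORECASE,
)


def drop_smart_noise(lines):
    """Return the log lines that are not S.M.A.R.T. 0x85 failures."""
    return [line for line in lines if not _SMART_NOISE.search(line)]
```

To use it on a log bundle, read the file line by line and pass the lines in, for example drop_smart_noise(open("vmkernel.log")), where the file name is just a placeholder for wherever your log lives.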

vSphere Remote Plugin: .local vCenter Domains

Hello, Nelson Elam here! I’m a VMware Solutions Engineer at Pure Storage and wanted to make you aware of an issue we’ve seen crop up a couple of times recently with our vSphere Remote Plugin and provide a quick explanation.

If your vCenter uses a .local domain (vcenter.purestorage.local is one example), you might have seen the following 3 errors in Pure’s vSphere Remote Plugin in vCenter:

  1. In the FlashArray list page, the error “Error retrieving array list. Please try again later.” is returned.
  2. When trying to import arrays via Pure1, the error “Authenticate with Pure1 to use this feature” is returned despite previously successful registration with Pure1 through the plugin.
  3. When adding an array manually, a “no permissions” error is returned.

Resolution:
To resolve this, follow step 14 from the Online Deployment Procedure for the remote plugin by running this command after customizing it to your environment:
pureuser@purestorage-vmware-appliance:~$ puredns setattr --search {your .local domain} --nameservers {ip or FQDN of DNS server}

So what’s going on here? When the OVA hosting the Remote vSphere Plugin tries to reach a vCenter with a .local domain suffix, it can’t resolve the vCenter’s DNS name unless you’ve provided the appropriate search domain for the OVA. The plugin then returns different errors depending on where in vCenter you are interacting with it.
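A quick way to confirm that name resolution is the culprit is to test the lookup from a machine that uses the same DNS settings as the OVA. This is a minimal Python sketch, assuming Python is available on that machine (I have not checked whether it is usable on the appliance itself); the function name and the resolver parameter are mine, added so the check can be exercised without a real DNS server.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if the hostname resolves using the given resolver function."""
    try:
        resolver(hostname, 443)
    except socket.gaierror:
        # Raised when the name cannot be resolved with the current DNS settings.
        return False
    return True
```

Calling can_resolve("vcenter.purestorage.local") before and after running the puredns command gives you a simple before-and-after check, using the example vCenter name from this post.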

Luckily this is a simple fix despite the seemingly unrelated errors that pop up. Hopefully this was helpful!

Refreshing a VM configuration from a vVol Snapshot

A lot of the time when we talk about vVols and snapshots we talk about restoring the virtual disks (the data vVols). This of course is a huge benefit of vVols–the virtual disks are 1:1 to a volume on the array so the snapshots (and other array features) can be used at a level of virtual disk. Need to restore a database on virtual disk B (the E:\ drive or whatever), just use the snapshot restore to instantly refresh the entire disk. No need to mount a copied datastore, resignature, remove the old disk etc. etc. Just copy from the snapshot to the vVol volume and re-mount the file system in the guest. Fast and easy.

VMware snapshots exist with vVols too, and they create array-based copies. But when you restore from them, you restore the whole VM. Their existence also complicates the VM configuration with extra pointers, files and so on. So a common vVol practice is to use VMware snapshots only temporarily, for backup procedures or for one-off protection of a VM while I run an upgrade, and then delete the snapshot once the upgrade works.

What if I want to refresh the VM configuration from a snapshot? Keep the data on the disk as is, but refresh the VM config files (VMX mainly) from a snapshot?

This is possible from a VMFS but quite complex. For a vVol VM this is really simple. Process?

  1. Shut down the VM.
  2. Copy from the snapshot to the config volume.
  3. Reload the VMX.
  4. Power on the VM.

So for some background, in a vVol world the VM directory (which houses the VMX file, some logs, virtual disk pointers, and some other frivolities) looks like a folder. But in reality it is a logical pointer to a volume on the array. This volume is called a config vVol and each “directory” in the vVol datastore maps to one. This config vVol is actually a mini VMFS. See more details here

Since this is a volume, you can of course take snapshots of it. There are a few ways to do this: either create one-off snapshots of it or use protection policies.
