I’m going to be writing a series of blogs on ActiveDR (Active Disaster Recovery) with NFS datastores over the next couple of posts. Some of the other posts I have planned in the near future are:
Failover scenarios where the ESXi hosts are connected to both arrays
Failover scenarios where the ESXi hosts are only connected to one array
In this blog I will do an introduction to the technologies and I will give some high level information on how you might want to use them together.
Replication Options on FlashArray
This post will be covering FlashArray specific replication techniques that you may or may not be familiar with. If it is the latter, my colleague Cody Hosterman has a great primer on our technologies that might be worth a read for you.
It can be difficult to understand the work that has gone into Pure Storage’s vSphere plugin if you’re not digesting the release notes for every release. Because NVMe-oF is going to become more and more relevant I think it’s worth highlighting some recent improvements we’ve made around NVMe-oF in the vSphere plugin. I’ll mostly be referencing the vSphere plugin release notes in this blog. I strongly recommend installing the vSphere plugin for all of your vCenter + FlashArray needs but it is a requirement of following along with the new features of the plugin later.
The first update that involves NVMe-oF datastores was back in April of 2020 and was version 4.3.0. We added support for identification of NVMe-oF datastores. A good first step!
Hello- Nelson Elam here. I wanted to go over the reasons why I think you should enable automatic directory management (autodir) if you are planning to use NFS datastores on FA File. A quick note before we get started- autodir is not restricted to ESXi hosts but ESXi hosts will be the focus of this blog.
What is autodir? Autodir is a way for FlashArray to reflect the current directory structure on an NFS datastore that’s managed by a connected host- a managed directory. What does this mean for ESXi? Whenever a VM gets created on an NFS datastore, a new directory (folder) gets created for the VM on the datastore. When a VM gets deleted from disk, the directory gets destroyed. Note that directories you create or destroy manually on an NFS datastore in vCenter get reflected in FlashArray as well. Simple enough!
If you’ve read the FA File launch blogs or have seen some of the webinars we’ve done about FA File or NFS datastores, you’ve likely seen or heard us talk about VM granular management being part of FA File. Autodir enables VM granular management. Let’s dive into VM granular management in the context of NFS datastores.
With autodir enabled, these changes are reflected on FlashArray and enable FlashArray administrators to be able to see the current state of the NFS file system from a directory perspective.
Want to figure out why the data reduction ratio of a file system dropped so significantly? Now you can see that at a per-VM basis on FlashArray.
Want to see which VMs are spiking in load at inopportune times? You can use the FlashArray GUI to help figure that out. Worth mentioning this info is more easily consumed in Pure1 when using the VM analytics collector.
Want to have a special snapshot schedule for a certain group of VMs on a FlashArray-backed NFS datastore? With autodir, you can create snapshot policies and apply them to specific directories, allowing you to get around having to snapshot an entire NFS datastore like it’s a VMFS datastore. You can still snapshot the entire NFS file system if you want! Autodir enables you to have other options.
Your mission critical VMs likely have more complex snapshot retention and frequency requirements than your test VMs. With autodir, you can also apply multiple snapshot policies to the same directory (VM).
That’s sounds great Nelson, but surely autodir isn’t a good option for every NFS datastore on FlashArray. What are the reasons you wouldn’t want to enable autodir?
The main circumstance where autodir doesn’t make sense is if the scale limits of autodir are less than the directory count in your NFS datastore. Those can be found in this KB under “Managed Directories per array“.
Today I want to tell to you about what I use the vSphere plugin for regularly in my lab to hopefully help you get more value out of your existing Pure array and tools. The assumption of this guide is that you already have the vSphere plugin installed (follow this guide if you don’t currently have it installed or would like to upgrade to a more feature-rich remote plugin version). Our vSphere plugin release notes KB covers the differences between versions. If you aren’t sure what version you want, use the latest version.
Why should you care about the vSphere plugin and why would I highlight these workflows for you? Pure’s vSphere plugin can save you a significant amount of time in the configuration/management of your vSphere+FlashArray environment. It can also greatly reduce the barriers to success in your projects by reducing the steps required of the administrator for successful completion of a workflow. Additionally, you might currently be using the vSphere plugin for a couple of workflows but didn’t realize all of the great work our engineers have put into making your life easier.
I am planning to write more blogs on the vSphere plugin and the next one I plan to write is on the highest value features that exist in current vSphere plugin versions.
Create and Manage FlashArray Hosts and Host Group Objects
If you’re currently a Pure customer, you have likely managed your host and host group objects directly from the array. Did you know you can also do this from the vSphere plugin without having to copy over WWNs/IPs manually? (1) Right-click on the ESXi cluster you want to create/manage a host or host group object on, (2) hover over Pure Storage, then (3) left-click on Add/Update Host Group.
In this menu, there are currently Fibre Channel and iSCSI protocol configuration options. We are currently exploring options here for NVMe-oF configuration; stay tuned by following this KB. You can also check a box to configure your ESXi hosts for Pure’s best practices with iSCSI, making it so you don’t have to manually configure new iSCSI ESXi hosts.
FlashArray VMFS Datastore and Volume Management(Creation and Deletion)
When you use the plugin for datastore creation, the plugin will create the appropriate datastore in vSphere, the volume on the FlashArray, and it will connect the volume to the appropriate host(s) and host group objects on the FlashArray. (1) Right-click on the pertinent cluster or host object in vSphere, (2) hover over Pure Storage and finally (3) left-click on Create Datastore. This will bring up a wizard with a lot of options that I won’t cover here, but the end result will be a datastore that has a FlashArray volume backing it.
The great thing about deleting a datastore from the plugin is that there are no additional steps required on the array to clean up the objects. This is the most satisfying workflow for me personally because cleanup in a lab can feel like it’s not a good use of time until I’ve got hundreds of objects worth cleaning up. This workflow enables me to quickly clean up every time after I’ve completed testing instead of letting this work pile up.
(1) Right-click the datastore you want to delete, (2) hover over Pure Storage and (3) left-click on Destroy Datastore. After the confirmation prompt, the FlashArray volume backing that datastore will be destroyed and is pending eradication for whatever that value is configured on the FlashArray (default 24 hours, configurable up to 30 days with SafeMode). That’s it!
FlashArray Snapshot Creation
One of the benefits of FlashArray is its portable and lightweight snapshots. The good news is that you can create these directly from vSphere without having to log into the FlashArray. It’s worth mentioning that although the snapshot recovery workflows built into the vSphere plugin (vVols and VMFS) are far more powerful and useful when you really need them, I’m covering what I use regularly and I rarely have to recover from snapshots in my lab. I try to take snapshots every time I make a major change to my environment in case I need to quickly roll-back.
There are two separate workflows for snapshot creation: one for VMFS and one for vVols. The granularity advantage with vVols over VMFS is very clear here- with VMFS, you are taking snapshots of the entire VMFS datastore, no matter how many VMs or disks are attached to those VMs. With vVols, you only have to snapshot the volumes you need to, as granular as a single disk attached to a single VM.
With VMFS, (1) right click on the datastore, (2) hover over Pure Storage and (3) left click on Create Snapshot.
For a vVols backed disk, from the Virtual Machine Configure tab, navigate to the Pure Storage – Virtual Volumes pane, (1) select the disk you would like to snapshot and (2) click Create Snapshot.
A prompt will pop up to add a suffix to the snapshot if you’d like; click on create and you’ve got your FlashArray snapshot of a vVols backed disk created!
Stay tuned for a blog on the vSphere plugin features you might not know about that, like the above, can save you a significant amount of time and effort.
Hello- Nelson Elam, a Solutions Engineer on Cody’s team at Pure, guest-writing here again.
If you are a current Pure customer and have had ESXi issues that warranted you checking the vmkernel logs of a host, you may have noticed a significant amount of messages similar to this for SCSI:
Cmd(0x45d96d9e6f48) 0x85, CmdSN 0x6 from world 2099867 to dev "naa.624a9370f439f7c5a4ab425000024d83" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
Or this for NVMe-oF:
WARNING: NvmeScsi: 172: SCSI opcode 0x85 (0x45d9757eeb48) on path vmhba67:C0:T1:L258692 to namespace eui.00f439f7c5a4ab4224a937500003f285 failed with NVMe error status: 0x1 translating to SCSI error
ScsiDeviceIO: 4131: Cmd(0x45d9757eeb48) 0x85, CmdSN 0xc from world 2099855 to dev "eui.00f439f7c5a4ab4224a937500003f285" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
If you reached out to Pure Storage support to ask what the deal is with this, you were likely told that these are 0x85s and nothing to worry about because it’s a VMware error that doesn’t mean anything with Pure devices.
But why would this be logged and what is happening here?
ESXi regularly checks the S.M.A.R.T. status of attached storage devices, including for array-backed devices that aren’t local. When the SCSI command is received on the FlashArray software, it returns 0x85 with the following sense data back to the ESXi host:
failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
These can be quite challenging to decode. Luckily, virten.net has a powerful tool for decoding these. When I paste this output into that site, I get the following details:
This status is returned when there is no error on the host side. This is when you will see if there is a status for a Device or Plugin. It is also when you will see Valid sense data instead of Possible sense Data.
This status is returned when a command fails for a specific reason. When a CHECK CONDITION is received, the ESX storage stack will send out a SCSI command 0x3 (REQUEST SENSE) in order to get the SCSI sense data (Sense Key, Additional Sense Code, ASC Qualifier, and other bits). The sense data is listed after Valid sense data in the order of Sense Key, Additional Sense Code, and ASC Qualifier.
No error. (ESXi 5.x / 6.x only)
Additional Sense Data
INVALID COMMAND OPERATION CODE
The key thing here is the Sense Key which has a value of ILLEGAL REQUEST. The FlashArray software does not support S.M.A.R.T. SCSI requests from hosts, so the FlashArray software returns ILLEGAL REQUEST to the ESXi host to tell the host we don’t support that request type.
This is for two reasons:
1. Since the FlashArray software’s volumes are not a physically attached storage device on the ESXi host, S.M.A.R.T. from the ESXi host doesn’t really make sense. 2. The FlashArray software handles drive failures and drive health independent of ESXi and monitoring the health of these drives that back the volumes is handled by the FlashArray software, not ESXi. You can read more about this in this datasheet.
Great Nelson, thanks for explaining that. Why are you talking about this now?
Pure has been working with VMware to reduce the noise and unnecessary concern caused by these errors. Seeing a failed ScsiDeviceIO in your vmkernel logs is alarming. In vSphere 7.0U3c, VMware fixed this problem and this will now only log once this when the ESXi host boots up instead of as often as every 15 minutes.
This means that in vSphere 7.0U3c if you are doing any ESXi host troubleshooting you no longer have to concern yourself with these errors; for me, this means I won’t have to filter these out in my greps anymore when looking into an ESXi issue in my lab. Great news all around!
Note: This is another guest blog by Kyle Grossmiller. Kyle is a Sr. Solutions Architect at Pure and works with Cody on all things VMware.
VMware Tanzu is a game-changing piece of technology for numerous reasons, but probably the most transformational piece of it is also the most apparent – it provides the capability for the vCenter admin to give resources for both consumers of traditional virtual machines as well as Kubernetes/DevOps users from the same set of compute hosts and storage. This consolidation means that the vCenter admin can more easily see what is being allocated where, as well as gaining insight into what application(s) might be candidates to make the move into a container-based environment from a virtual machine.
A Tanzu deployment is comprised of quite a few moving pieces and a central piece of this is durable storage made possible by persistent volumes. While container nodes and pods are ephemeral by nature (which is one of their major advantages), the data that they consume, produce and manipulate must be performant, portable and often, saved. So, there is obviously a different set of things we care about for persistent data vs the Kubernetes nodes that Tanzu runs in unison with here. For the remainder of this post we will show a couple of quick and easy ways you can change your persistent volumes to suit your application needs. There’s a bit of work and some choices to be made around getting a Tanzu environment up and running in vSphere, and I’d encourage you to check out the VMware Tanzu User Guide on our Pure Storage support site or Cody’s blog series to get some additional information.
With that being said, when a persistent volume is created via either dynamic or static provisioning, one of the first things the application developer needs to decide is what will happen to that volume and data when the application that uses it itself is no longer needed. The default behavior for an SPBM policy/storageclass assigned to a vSphere Namespace is to delete it, but through a simple kubectl patch command line, the persistent volume can be saved for future usage.
To make this change, first get the persistent volume name that you want to Retain/save:
$ kubectl get pv
NAME CAPACITY RECLAIM POLICY STATUS
pvc-f37c39fd 5Gi Delete Bound
Next, apply this kubectl command line to it to switch the reclaim policy from Delete to Retain:
When we run the kubectl get pv command again, we can see it is set to Retain, so we are all set:
$ kubectl get pv
NAME CAPACITY RECLAIM POLICY STATUS
pvc-f37c39fd 5Gi Retain Bound
If there is anything close to a certainty in the storage world – it is that the longer a volume exists, the more full of data it will become. This becomes even more of a certainty if a persistent volume is retained and reused across multiple application instances for increasing amounts of time. In the vSphere and Supervisor cluster 7.0U2 release VMware has introduced the capability for Online Volume Expansion. What this means is that while in previous versions users had to unbind their persistent volume claim from a pod or node prior to resizing it (otherwise known as offline volume expansion) – now they are able to accomplish that same operation without that step . This is a huge advantage as the offline expansion required that the volume be effectively be taken out of service when additional space was added to it, which could lead to application downtime. With the online volume expansion enhancement that annoyance goes away completely.
Online volume expansion operation is really simple to do. This time we find the persistent volume claim (which is basically the glue between the persistent volume and the application) that we need to expand:
$ kubectl get pvcNAME STATUS VOLUME CAPACITY
pvc-vvols-mysql Bound pvc-f37c39fd 5Gi
Now we run the following patch command against the PVC name we found above so that it knows to request additional storage for the persistent volume that it is bound to. In this case, we will ask to expand from 5Gi to 6Gi:
After waiting for a few moments for the expansion to complete, we look at the pvc in order to confirm we have the additional space that we asked for and we can see it has been added:
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY
pvc-vvols-mysql Bound pvc-f37c39fd 6Gi
Taking a closer look at the PVC via the describe command shows that it indeed increased the PV size while it remained mounted to the mysql-deployment node under the events section:
$ kubectl describe pvc Name: pvc-vvols-mysql Namespace: default StorageClass: cns-vvols Status: Bound Volume: pvc-f37c39fd-dbe9-4f27-abe8-bca85bf9e87c Labels: <none> Annotations: pv.kubernetes.io/bind-completed: yes pv.kubernetes.io/bound-by-controller: yes volume.beta.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com volumehealth.storage.kubernetes.io/health: accessible Finalizers: [kubernetes.io/pvc-protection] Capacity: 6Gi Access Modes: RWO VolumeMode: Filesystem Mounted By: mysql-deployment-5d8574cb78-xhhq5 Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning ExternalExpanding 52s volume_expand Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC. Normal Resizing 52s external-resizer csi.vsphere.vmware.com External resizer is resizing volume pvc-f37c39fd-dbe9-4f27-abe8-bca85bf9e87c Normal FileSystemResizeRequired 51s external-resizer csi.vsphere.vmware.com Require file system resize of volume on node Normal FileSystemResizeSuccessful 40s kubelet, tkc-120-workers-mbws2-68d7869b97-sdkgh MountVolume.NodeExpandVolume succeeded for volume "pvc-f37c39fd-dbe9-4f27-abe8-bca85bf9e87c"
Those are just a couple of the ways we can update our persistent volumes to do what we need them to do within a Tanzu deployment, and we have really just scratched the surface with these few examples. To see how to do more advanced operations like migrating a persistent volume to a different Tanzu Kubernetes Cluster, please head over to our new Tanzu User Guide. Of course, it also is very important to mention that Portworx combined with Tanzu gives us even more features and functionalities like RBAC, automated backup and recovery and a whole lot more. Getting deeper into how Portworx interoperates with Tanzu is what I’m working on next so please stay tuned for some more cool stuff.
This feature has many names. Creating a larger config vVol. Creating a sub-vVol datastore. Creating an ISO repository. Etc.
In 7.0U2, VMware added a new feature that supports creating a custom size config vVol–while this was technically possible in earlier releases, it was not supported. Also, I should note that this is not supported by all vVol vendors, so of course speak to your vendor first.
First to review what a config vVol is check out this post:
In short, it is a mini VMFS that gets created when you create a directory in a vVol datastore (most commonly created by creating a new VM). This defaults to 4 GB in size. Enough to store the general VM files; some logs, VMDK pointers, vmx file, and some other frivolities.
The issue though is that this was not large enough to store large things like ISOs or vib files or whatever. So if you tried to upload something to a vVol datastore folder it would fail with an out-of-space issue. And you cannot upload to the root of a vVol datastore because a vVol datastore is not a file system. So you had to use VMFS or NFS to store those objects.
Note: This is another guest blog by Kyle Grossmiller. Kyle is a Sr. Solutions Architect at Pure and works with Cody on all things VMware.
VMware Tanzu is a great way for VMware users to manage their virtual machine environments while in parallel coming up to speed with containers all under the same familiar pane of glass. In fact, that’s possibly the biggest value proposition that Tanzu gives us today: extending vSphere and all of its enterprise features and goodness to the realm of K8s in a recognizable context.
There’s a catch, though. Before one can start to use Tanzu one has to get Tanzu setup. While it’s reasonably straight forward to do so – it is also important to note that there are multiple ways to enable Tanzu and to distinguish the pros/cons of each of them. It’s also key to understand what these underlying components are and how they interact to help troubleshoot any potential problems. This post will focus on two of these methods: vCenter Network Option (HA-Proxy) and NSX-T (VMware Cloud Foundation). Cody covered the 3rd option of directly deploying Tanzu Kubernetes Grid to vSphere in an earlier post that can be found here.
It’s critical to note that you must be at a minimum of vSphere 7 before you can use either of these below methods we will cover. ESXi 6.7U3 and up is supported via the method shown in Cody’s post I linked above. The two Workload Management enablement networking options are built right into the vSphere 7 UI under Workload Management when you add a cluster:
The other important prerequisite is that you will need one or more SPBM (Storage Policy Based Management) policies defined in order to get the Supervisor Cluster up and running. There are a couple of KB articles on our Platform Guide which shows how to do this on the FlashArray and the differences between VMFS and vVols based policies.
So let’s lay out the key components and differences between the vCenter Server Network and NSX-T backed Workload Management options and provide some more information on how to enable each of them…
The vCenter Server Network option allows customers to get Workload Management working in their vSphere environment by deploying a single OVA (known as HA-Proxy) and does not require any other external products like NSX-T or SDDC Manager. This OVA and associated information can be found at this link on GitHub. The main benefit of this option is the relative simplicity of getting Tanzu running since the only items you need to setup are a distributed switch portgroup to handle your Kubernetes ingress/egress ranges and the HA-Proxy OVA itself. The HA-Proxy will act as the load balancer for kubernetes traffic and provide the supervisor cluster API endpoint. The downside is that the HA-Proxy VM represents a single point of failure and larger kubernetes deployments at some point will more than likely overwhelm the CPU/memory resources available to it. This option is best for those who want to look at Tanzu in a POC/exploratory type of setup.
Here’s a quick narrated technical video I created showing how to setup Workload Management with the HA-Proxy OVA:
After Workload Management was enabled, I created another quick demo video showing how to create Namespaces and deploy a Tanzu Kubernetes Guest Cluster:
NSX-T based Workload Management is recommended to be backed/managed by a VMware Cloud Foundation deployment. This option gives you all of the enterprise grade features, resilience and lifecycle management that comes embedded with SDDC Manager. The con of this option is there is more setup work and moving pieces involved than the other Tanzu deployment choices and it requires additional licensing once the trial period expires. The key component that needs to be setup for Tanzu in particular is called an NSX-T Edge Cluster. An Edge Cluster is comprised of at least one (but really it should be two for resiliency) Edge VMs which help route and load balance network traffic from your top of rack switch to the underlying kubernetes deployments. The Edge Cluster deployment can be automated to a large extent from within SDDC Manager via a wizard.
In our lab, I went the route of manually deploying a 2 node Edge Cluster within NSX-T as this gives a better ‘under the hood’ view of how everything works together. As most customers likely do not have top of rack switch access to setup BGP, I also decided to use the static routing option within NSX-T. Here’s a video showing how to set this up end-to-end:
With the EdgeCluster built, the next step is to enable Workload Management which is shown in this video:
Hopefully this post has provided a bit of guidance and insight in terms of what Tanzu solution might be right for your environment. We are continuing to investigate and document how to best leverage the FlashArray with kubernetes so please check back here and our platform guide often for updates. Thanks for reading.
In the world of SCSI, a storage device is generally addressed by two things:
LUN–Logical Unit Number. This is a number used to address the device down a specific path to a specific array, for a specific host. So it is not a unique number really, it is not guaranteed to be unique to an array, to a host, or a volume. So for every path to a volume there could be a different LUN number. Think of it like a street address. 100 Maple St. There are many “100 Maple Streets”. So it requires the city, the state/province/etc, the country to really be meaningful. And a street name can change. So can other things. So it can usually get you want you want, but it isn’t guaranteed.
Serial number. This is a globally unique identifier of the volume. This means it is entirely unique for that volume and it cannot be change. It is the same for everyone and everything who uses that volume. To continue our metaphor, look at it like the GPS coordinates of the house instead of the address. It will get you where you need, always.
So how does this change with NVMe? Well these things still exist, but how they interact is…different.
Now, first, let me remind that generally these concepts are vendor neutral, but how things are generated, reported, and even sometimes named vary. So I write this for Pure Storage, so keep that in mind.
So in the previous post, I wrote about deploying a Tanzu Kubernetes Grid management cluster.
VMware defines this as:
A management cluster is the first element that you deploy when you create a Tanzu Kubernetes Grid instance. The management cluster is a Kubernetes cluster that performs the role of the primary management and operational center for the Tanzu Kubernetes Grid instance. This is where Cluster API runs to create the Tanzu Kubernetes clusters in which your application workloads run, and where you configure the shared and in-cluster services that the clusters use.
In other words this is the cluster that manages them all. This is your entry point to the rest. You can deploy applications in it, but not usually the right move. Generally you want to deploy clusters that are specifically devoted to running workloads.
These clusters are called Tanzu Kubernetes Cluster. Which VMware defines as:
After you have deployed a management cluster, you use the Tanzu Kubernetes Grid CLI to deploy CNCF conformant Kubernetes clusters and manage their lifecycle. These clusters, known as Tanzu Kubernetes clusters, are the clusters that handle your application workloads, that you manage through the management cluster. Tanzu Kubernetes clusters can run different versions of Kubernetes, depending on the needs of the applications they run. You can manage the entire lifecycle of Tanzu Kubernetes clusters by using the Tanzu Kubernetes Grid CLI. Tanzu Kubernetes clusters implement Antrea for pod-to-pod networking by default.