Affecting Persistent Volumes in VMware Tanzu

Note: This is another guest blog by Kyle Grossmiller. Kyle is a Sr. Solutions Architect at Pure and works with Cody on all things VMware.

VMware Tanzu is a game-changing piece of technology for numerous reasons, but probably its most transformational aspect is also the most apparent: it lets the vCenter admin provide resources to consumers of traditional virtual machines and to Kubernetes/DevOps users from the same set of compute hosts and storage. This consolidation means the vCenter admin can more easily see what is being allocated where, and gain insight into which applications might be candidates to move from a virtual machine into a container-based environment.

A Tanzu deployment comprises quite a few moving pieces, and a central one is durable storage made possible by persistent volumes. While container nodes and pods are ephemeral by nature (which is one of their major advantages), the data they consume, produce and manipulate must be performant, portable and, often, preserved. So the things we care about for persistent data are different from those for the Kubernetes nodes Tanzu runs alongside. For the remainder of this post we will show a couple of quick and easy ways you can change your persistent volumes to suit your application needs. There is a bit of work, and some choices to be made, around getting a Tanzu environment up and running in vSphere; I'd encourage you to check out the VMware Tanzu User Guide on our Pure Storage support site or Cody's blog series for additional information.

With that said, when a persistent volume is created via either dynamic or static provisioning, one of the first things the application developer needs to decide is what should happen to that volume and its data once the application that uses it is no longer needed. The default behavior for an SPBM policy/StorageClass assigned to a vSphere Namespace is to delete the volume, but with a simple kubectl patch command the persistent volume can be retained for future use.
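
If you would rather know up front which behavior new volumes will get, you can check the reclaim policy your StorageClass stamps onto them before provisioning anything. A minimal check, assuming a hypothetical StorageClass named pure-vvols (substitute the one synced into your vSphere Namespace; output columns are trimmed for readability):

$ kubectl get storageclass pure-vvols
NAME         PROVISIONER              RECLAIMPOLICY
pure-vvols   csi.vsphere.vmware.com   Delete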

To make this change, first get the persistent volume name that you want to Retain/save:

$ kubectl get pv
NAME           CAPACITY    RECLAIM POLICY   STATUS
pvc-f37c39fd   5Gi         Delete           Bound

Next, apply this kubectl command line to it to switch the reclaim policy from Delete to Retain:

$ kubectl patch pv <PV_Name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

So for our PV example:

$ kubectl patch pv pvc-f37c39fd -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

When we run the kubectl get pv command again, we can see it is set to Retain, so we are all set:

$ kubectl get pv
NAME           CAPACITY    RECLAIM POLICY   STATUS  
pvc-f37c39fd   5Gi         Retain           Bound          
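
One piece of general Kubernetes behavior is worth noting here (it is not Tanzu-specific): once the claim that was using a retained volume is deleted, the PV moves to a Released status rather than becoming immediately reusable, because it still holds a reference to the old claim. A sketch of clearing that reference so the volume can be bound by a new claim, using our example PV:

$ kubectl patch pv pvc-f37c39fd --type json -p '[{"op":"remove","path":"/spec/claimRef"}]'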

If there is anything close to a certainty in the storage world, it is that the longer a volume exists, the more data it will accumulate. That becomes even more certain when a persistent volume is retained and reused across multiple application instances over time. With the vSphere 7.0U2 release (and its corresponding Supervisor Cluster version), VMware introduced the capability for Online Volume Expansion. In previous versions, users had to unbind their persistent volume claim from a pod or node prior to resizing it (otherwise known as offline volume expansion); now they can accomplish the same operation without that step. This is a huge advantage, as offline expansion required that the volume effectively be taken out of service while additional space was added, which could lead to application downtime. With the online volume expansion enhancement, that annoyance goes away completely.

The online volume expansion operation is really simple to perform. This time we find the persistent volume claim (which is essentially the glue between the persistent volume and the application) that we need to expand:

$ kubectl get pvc
NAME             STATUS  VOLUME        CAPACITY
pvc-vvols-mysql  Bound   pvc-f37c39fd  5Gi      
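
Before patching, it is worth confirming that the StorageClass behind the claim permits expansion at all; the allowVolumeExpansion flag on the StorageClass must be set to true or the resize request will be rejected. A quick sanity check against the cns-vvols StorageClass this claim uses:

$ kubectl get storageclass cns-vvols -o jsonpath='{.allowVolumeExpansion}'
true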

Now we run the following patch command against the PVC name we found above so that it knows to request additional storage for the persistent volume that it is bound to. In this case, we will ask to expand from 5Gi to 6Gi:

$ kubectl patch pvc pvc-vvols-mysql -p '{"spec": {"resources": {"requests":{"storage": "6Gi"}}}}'

After waiting a few moments for the expansion to complete, we look at the PVC to confirm the additional space we asked for has been added:

$ kubectl get pvc
NAME              STATUS   VOLUME         CAPACITY
pvc-vvols-mysql   Bound    pvc-f37c39fd   6Gi     

Taking a closer look at the PVC via the describe command shows, under the Events section, that the persistent volume size was indeed increased while the volume remained mounted to the mysql-deployment pod:

$ kubectl describe pvc
Name:          pvc-vvols-mysql
Namespace:     default
StorageClass:  cns-vvols
Status:        Bound
Volume:        pvc-f37c39fd-dbe9-4f27-abe8-bca85bf9e87c
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com
               volumehealth.storage.kubernetes.io/health: accessible
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      6Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Mounted By:    mysql-deployment-5d8574cb78-xhhq5
Events:
  Type     Reason                      Age   From                                             Message
  ----     ------                      ----  ----                                             -------
  Warning  ExternalExpanding           52s   volume_expand                                    Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC.
  Normal   Resizing                    52s   external-resizer csi.vsphere.vmware.com          External resizer is resizing volume pvc-f37c39fd-dbe9-4f27-abe8-bca85bf9e87c
  Normal   FileSystemResizeRequired    51s   external-resizer csi.vsphere.vmware.com          Require file system resize of volume on node
  Normal   FileSystemResizeSuccessful  40s   kubelet, tkc-120-workers-mbws2-68d7869b97-sdkgh  MountVolume.NodeExpandVolume succeeded for volume "pvc-f37c39fd-dbe9-4f27-abe8-bca85bf9e87c"

Those are just a couple of the ways we can update our persistent volumes to do what we need them to do within a Tanzu deployment, and we have really just scratched the surface with these examples. To see how to perform more advanced operations, like migrating a persistent volume to a different Tanzu Kubernetes Cluster, please head over to our new Tanzu User Guide. It is also worth mentioning that Portworx combined with Tanzu gives us even more features and functionality, like RBAC, automated backup and recovery, and a whole lot more. Getting deeper into how Portworx interoperates with Tanzu is what I'm working on next, so please stay tuned for more cool stuff.

Digging into vSphere Workload Management Options

Note: This is another guest blog by Kyle Grossmiller. Kyle is a Sr. Solutions Architect at Pure and works with Cody on all things VMware.

VMware Tanzu is a great way for VMware users to manage their virtual machine environments while, in parallel, coming up to speed with containers, all under the same familiar pane of glass. In fact, that is possibly the biggest value proposition Tanzu gives us today: extending vSphere, and all of its enterprise features and goodness, to the realm of K8s in a recognizable context.

There’s a catch, though. Before one can start to use Tanzu, one has to get Tanzu set up. While it’s reasonably straightforward to do so, it is also important to note that there are multiple ways to enable Tanzu, and to distinguish the pros and cons of each of them. It’s also key to understand what the underlying components are and how they interact, to help troubleshoot any potential problems. This post will focus on two of these methods: the vCenter Network option (HA-Proxy) and NSX-T (VMware Cloud Foundation). Cody covered the third option, directly deploying Tanzu Kubernetes Grid to vSphere, in an earlier post that can be found here.

It’s critical to note that you must be at a minimum of vSphere 7 before you can use either of the methods covered below; ESXi 6.7U3 and up is supported via the method shown in Cody’s post linked above. The two Workload Management enablement networking options are built right into the vSphere 7 UI under Workload Management when you add a cluster.

The other important prerequisite is that you will need one or more SPBM (Storage Policy Based Management) policies defined in order to get the Supervisor Cluster up and running. There are a couple of KB articles on our Platform Guide which show how to do this on the FlashArray and the differences between VMFS- and vVols-based policies.

So let’s lay out the key components and differences between the vCenter Server Network and NSX-T backed Workload Management options and provide some more information on how to enable each of them…

The vCenter Server Network option allows customers to get Workload Management working in their vSphere environment by deploying a single OVA (known as HA-Proxy) and does not require any other external products like NSX-T or SDDC Manager. The OVA and associated information can be found at this link on GitHub. The main benefit of this option is the relative simplicity of getting Tanzu running, since the only items you need to set up are a distributed switch portgroup to handle your Kubernetes ingress/egress ranges and the HA-Proxy OVA itself. The HA-Proxy acts as the load balancer for Kubernetes traffic and provides the Supervisor Cluster API endpoint. The downside is that the HA-Proxy VM represents a single point of failure, and larger Kubernetes deployments will, at some point, more than likely overwhelm the CPU/memory resources available to it. This option is best for those who want to look at Tanzu in a POC/exploratory type of setup.
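
For a sense of the developer experience once Workload Management is enabled, this is roughly what logging into the Supervisor Cluster through the HA-Proxy-served endpoint looks like with the kubectl vSphere plugin; the server address, user and namespace below are placeholders for your own environment:

$ kubectl vsphere login --server=192.168.1.2 --insecure-skip-tls-verify --vsphere-username administrator@vsphere.local
$ kubectl config use-context my-namespace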

Here’s a quick narrated technical video I created showing how to setup Workload Management with the HA-Proxy OVA:

After Workload Management was enabled, I created another quick demo video showing how to create Namespaces and deploy a Tanzu Kubernetes Guest Cluster:
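
For reference alongside the video, here is a minimal TanzuKubernetesCluster manifest sketch; the cluster name, namespace, distribution version and virtual machine class below are all assumptions to be replaced with values valid in your environment:

$ kubectl apply -f - <<'EOF'
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkc-demo              # hypothetical cluster name
  namespace: demo-namespace   # an existing vSphere Namespace
spec:
  distribution:
    version: v1.18            # resolves to an available Tanzu Kubernetes release
  topology:
    controlPlane:
      count: 1
      class: best-effort-small
      storageClass: cns-vvols
    workers:
      count: 3
      class: best-effort-small
      storageClass: cns-vvols
EOF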

NSX-T-based Workload Management is recommended to be backed/managed by a VMware Cloud Foundation deployment. This option gives you all of the enterprise-grade features, resilience and lifecycle management that come embedded with SDDC Manager. The con of this option is that there is more setup work, and there are more moving pieces, than with the other Tanzu deployment choices, and it requires additional licensing once the trial period expires. The key component that needs to be set up for Tanzu in particular is an NSX-T Edge Cluster. An Edge Cluster is composed of at least one Edge VM (though it really should be two for resiliency) which helps route and load balance network traffic from your top-of-rack switch to the underlying Kubernetes deployments. The Edge Cluster deployment can be automated to a large extent from within SDDC Manager via a wizard.

In our lab, I went the route of manually deploying a two-node Edge Cluster within NSX-T, as this gives a better ‘under the hood’ view of how everything works together. As most customers likely do not have top-of-rack switch access to set up BGP, I also decided to use the static routing option within NSX-T. Here’s a video showing how to set this up end-to-end:

With the Edge Cluster built, the next step is to enable Workload Management, which is shown in this video:

Hopefully this post has provided a bit of guidance and insight into which Tanzu solution might be right for your environment. We are continuing to investigate and document how to best leverage the FlashArray with Kubernetes, so please check back here and on our Platform Guide often for updates. Thanks for reading.

vVols, please report to the Principal’s Office! VCF 4.1 and vVols!

Note: This is another guest blog by Kyle Grossmiller. Kyle is a Sr. Solutions Architect at Pure and works with Cody on all things VMware.

In VMware Cloud Foundation (VCF) version 4.1, vVols have taken center stage as a Principal Storage type available for Workload Domain deployments. This inclusion in one of VMware’s premier products reinforces VMware’s continued emphasis on vVols and all the benefits they enable. vVols with iSCSI is particularly exciting to us, as this is the first instance of the iSCSI protocol being supported as a Principal Storage type within VCF. We at Pure Storage are honored to have had a little bit of influence over this added functionality by serving as a design partner for this new feature, and we are confident you are going to like what you see!

Someone who is using VMFS datastores with VCF today might ask themselves ‘why vVols?’ This is a great question, deserving of an expansive answer beyond this blog post. Fundamentally, though, using vVols enables you to fully use the FlashArray in the way it was intended. By leveraging VASA (vSphere APIs for Storage Awareness) you gain far more granular control and monitoring abilities over your individual VMs. Native FlashArray capabilities such as snapshots and replication are executed directly against the underlying array via policy-driven constructs. Further information on these and other benefits of vVols is available here.

Using vVols as Principal Storage is a lot like the methods VCF customers are used to for pre-existing Principal Storage options: image an ESXi host, apply a few prerequisites to it, commission it to SDDC Manager and create Workload Domains. Deploying Workload Domains with VMware Cloud Foundation automates, and takes all the guesswork out of, deploying vCenter and NSX-T for modern use cases such as Kubernetes via Workload Management.

Stepping into some specifics for a moment, here’s the process for using FlashArray iSCSI and vVols for VCF Workload Domains:

The most fundamental update to SDDC Manager to allow vVols is the capability to register a VASA provider. The detailed steps below show an example of how you can add a FlashArray using another block protocol, Fibre Channel:

  1. Provide a descriptive name for the VASA provider.  It is recommended to use the FlashArray name and append it with -ct0 or -ct1 to denote which controller the entry is associated with.
  2. Provide the URL for the VASA provider.  This cannot be the management VIP of the array; instead, this field needs to be the management IP address associated with one of the controllers.  The URL is also required to have the VASA port and version.xml appended to it.  The format for the URL is:  https://<IP of FlashArrayController>:8084/version.xml (a quick reachability check for this URL is sketched just after this list).
  3. Provide a FlashArray username with the arrayadmin role.  The procedure for how to create such a user can be found here.  While the pureuser account can be used, we recommend creating and using a separate FlashArray user for VASA operations.
  4. Provide the password for the FlashArray username to be used.
  5. Container Name must be Vvol container.  Note that this value is case-sensitive.
  6. For Container Type, select FC from the drop-down menu to use Fibre Channel.
  7. Once all entries are completed, click Save.
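
Before clicking Save, one quick way to confirm the VASA endpoint is reachable is to request version.xml directly from a workstation on the array’s management network; the controller IP below is a placeholder:

$ curl -k https://10.21.100.11:8084/version.xml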

Obviously, there’s a lot more to share here so we will be expanding on this substantially in the very near future on our VMware Platform Guide site.

Rounding out this post, I’m happy to show a demo video of just how easy it is to deploy an FC+vVols-based Workload Domain with VMware Cloud Foundation.

Automating FlashStack with SmartConfig and VMware Cloud Foundation

Note: This is another guest blog by Kyle Grossmiller. Kyle is a Sr. Solutions Architect at Pure and works with Cody on all things VMware.

One of the (many) fun things we get to work on at Pure is researching and figuring out new ways to streamline things that are traditionally repetitive and time-consuming (read: boring).  Recently, we looked at how we could go about automating the deployment of FlashStack™ end-to-end, since a traditional deployment absolutely includes some of these repetitive tasks.  Our goal was to start off with a completely greenfield FlashStack (racked, powered, cabled and otherwise completely unconfigured) and automate everything possible, ending up with a fully functional VMware environment ready for use.  After some thought, reading and discussion, we found that this goal was achievable with the combination of SmartConfig™ and VMware Cloud Foundation™.

Automating a FlashStack deployment makes a ton of sense: from the moment new hardware is procured and delivered to a datacenter, the race is on for it to switch from a liability to a money-producing asset for the business.  Further, using SmartConfig and Cloud Foundation together really means combining two blueprint-driven solutions: Cisco Validated Designs (CVDs) and VMware Validated Designs (VVDs).  That does a lot to take the guesswork out of building the underlying infrastructure and hypervisor layers, since firmware, hardware and software versions have all been pre-validated and tested by Cisco, VMware and Pure Storage.  In addition, these two tools also set up these blueprints automatically via a customizable and repeatable framework.

Once we started working through this in the lab, the following automation workflow emerged:

Along with some introduction to the key technologies in play, we have divided the in-depth deployment guide into 3 core parts.  All of these sections, including product overviews and click-by-click instructions are publicly available here on the Pure Storage VMware Platform Guide.

  1. Deploy FlashStack with ESXi via SmartConfig.  The input of this section will be factory-reset Cisco hardware, and the output will be a fully functional imaged/zoned/deployed UCS chassis with ESXi 7 installed and ready for use with VMware Cloud Foundation.
  2. Build VMware Cloud Foundation SDDC Manager on FlashStack.  The primary input for CloudBuilder is, fittingly, the output of the work in part 1: ESXi hosts and their underlying infrastructure, from which we will automatically deploy a Management Domain with CloudBuilder.
  3. The last section will show how to deploy a VMware Cloud Foundation Workload Domain with Pure Storage as both Principal Storage (VMFS on FC) and Supplemental Storage (vVols).  Options such as iSCSI are covered in additional KB articles in the VMware Cloud Foundation section of the Pure Storage support site.

Post-deployment, customers will enjoy the benefits of single-click lifecycle management for the bulk of their UCS and VMware components, and the ability to dynamically scale their Workload Domain resources up or down, independently or collectively, based upon specific needs (e.g. compute/memory, network and/or storage), all from SDDC Manager.

For those who prefer a more interactive demo, I’ve recorded an in-depth overview video of this automation project followed by a four-part demo video series that shows click-by-click just how easy and fast it is to deploy a FlashStack with VMware from scratch. 

Craig Waters and I gave a Light Board session on this subject:

And this is an in-depth PowerPoint overview of the project:

Finally, this is a video series showing the end-to-end process in-depth broken into a few parts for brevity.

Extending vVols to VMware Cloud Foundation

Note: This is a guest blog by Kyle Grossmiller. Kyle is a Sr. Solutions Architect at Pure and works with Cody on all things VMware.

As we’ve covered in past posts, VMware Cloud Foundation (VCF) offers immense advantages to VMware users in terms of simplifying day 0 and day 1 activities and streamlining management operations within the vSphere ecosystem.  Today, we dive into how to use Pure Storage’s leading vVols implementation as Supplemental Storage with your Management and Workload Domains.

First, though, a brief description of the differences between Principal Storage and Supplemental Storage, and how they relate to VCF, is in order to set the table.  Fortunately, it is very easy to distinguish between the two storage types:

Principal Storage is any storage type that you can connect to your Workload Domain as part of the setup process within SDDC Manager.  Today, that comprises vSAN, NFS and VMFS on Fibre Channel.  We’ve shown how to use VMFS on FC previously.

Supplemental Storage simply means that you connect your storage system to a Workload Domain after it has been deployed.  Examples of this storage type today include iSCSI and the focus of this blog:  vVols.


VMware Cloud Foundation and Pure Storage

A few weeks ago, VMware released the latest version of VMware Cloud Foundation, version 3.9. There were more than a few things in this release, but one enhancement was around Fibre Channel:

https://docs.vmware.com/en/VMware-Cloud-Foundation/3.9/rn/VMware-Cloud-Foundation-39-Release-Notes.html

The release notes mention:

Fibre Channel Storage as Principal Storage: Virtual Infrastructure (VI) workload domains now support Fibre Channel as a principal storage option in addition to VMware vSAN and NFS.

VCF 3.9 Release Notes

So this leads to three questions:

  • What does this mean?
  • What does this mean for non-FC storage?
  • What was the stance around FC storage BEFORE this release?

Let’s answer the first question first.
