Latency Round Robin PSP in ESXi 6.7 Update 1

This is my first (but certainly not last) post on the new path selection policy option in vSphere 6.7 Update 1. In reality, this option was introduced in the initial release of 6.7, but it was not officially supported until Update 1.

So what is it? Well first off, see the official words from my colleague Jason Massae at VMware here:

https://storagehub.vmware.com/t/vsphere-storage/vsphere-6-7-core-storage-1/vsphere-6-7-u1-enhanced-round-robin-load-balancing/

Why was this PSP option introduced? Well, the most common path selection policy is NMP Round Robin, VMware’s built-in path selection policy for arrays that offer multiple paths. Round Robin has been a great way to leverage the full performance of your array by actively using all of the paths simultaneously. Well…almost simultaneously.
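Once you are on 6.7 U1, switching a device over is a one-liner per device. Here is a minimal PowerCLI sketch, assuming an existing PowerCLI connection to vCenter; the host name and naa.* device identifier are hypothetical placeholders, not anything from the linked doc.

```powershell
# Minimal sketch, assuming an existing PowerCLI connection; the host name and the
# naa.* device identifier below are hypothetical placeholders.
$esx    = Get-VMHost -Name 'esxi-01.lab.local'
$esxcli = Get-EsxCli -VMHost $esx -V2

# Equivalent of: esxcli storage nmp psp roundrobin deviceconfig set -d <naa> --type=latency
$esxcli.storage.nmp.psp.roundrobin.deviceconfig.set.Invoke(@{
    device = 'naa.624a9370d4d78052ea564a7e00011030'
    type   = 'latency'
})

# Check that the device's Round Robin config now reports the latency policy
$esxcli.storage.nmp.device.list.Invoke(@{device = 'naa.624a9370d4d78052ea564a7e00011030'})
```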

Continue reading “Latency Round Robin PSP in ESXi 6.7 Update 1”

What’s New in Core Storage in vSphere 6.7 Part VI: Flat LUN ID Addressing Support

This post is part of the vSphere 6.7 core storage “what’s new” series.

A while back I wrote a blog post about LUN ID addressing and ESXi, which you can find here:

ESXi and the Missing LUNs: 256 or Higher

In short, VMware only supported one mechanism of LUN ID addressing, called “peripheral”. A different mechanism, called “flat”, is generally encouraged by the SCSI Architecture Model (SAM), especially for larger LUN IDs (256 and above). If a storage array used flat addressing, ESXi would not see LUNs from that target. This is often why ESXi could not see LUN IDs greater than 255: arrays typically switch to flat addressing at that point.
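To make the 255 ceiling concrete, here is a small PowerShell sketch of my own (not from the post) that builds the first two bytes of the SCSI LUN field under each addressing method as SAM defines them; peripheral only has a single byte for the LUN ID, while flat gets 14 bits.

```powershell
# My own illustration (not from the post): the first two bytes of the 8-byte SCSI
# LUN field under each addressing method, per SAM. Peripheral has one byte for the
# LUN ID (max 255); flat uses 14 bits (max 16383).
function Get-LunFieldBytes {
    param(
        [Parameter(Mandatory)][int]$LunId,
        [Parameter(Mandatory)][ValidateSet('Peripheral', 'Flat')][string]$Method
    )
    switch ($Method) {
        'Peripheral' {
            if ($LunId -gt 255) { throw 'Peripheral addressing tops out at LUN ID 255' }
            # Byte 0: address method 00b plus bus 0; Byte 1: the LUN ID itself
            '{0:X2} {1:X2}' -f 0x00, $LunId
        }
        'Flat' {
            # Byte 0: address method 01b in the top two bits plus the upper 6 bits of
            # the LUN ID; Byte 1: the lower 8 bits of the LUN ID
            '{0:X2} {1:X2}' -f (0x40 -bor ($LunId -shr 8)), ($LunId -band 0xFF)
        }
    }
}

Get-LunFieldBytes -LunId 300 -Method Flat        # 41 2C
Get-LunFieldBytes -LunId 300 -Method Peripheral  # throws: why older ESXi never saw it
```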

ESXi 6.7 adds support for flat addressing.  Continue reading “What’s New in Core Storage in vSphere 6.7 Part VI: Flat LUN ID Addressing Support”

What is the latency stat QAVG?

I wrote a blog post a year or so ago about ESXi and storage queues, which has received a lot of wonderful feedback (thank you!!), and I eventually turned it into a VMworld session and other engagements.

So in the past year I have had quite a few discussions around this. And one part has always bothered me a bit.

In ESXi, there are a variety of latency metrics (a quick PowerCLI sketch for pulling these counters follows the list):

  • GAVG. Guest average, sometimes called “VM observed latency”. This is the amount of time it takes for an I/O to complete after it leaves the VM: through ESXi, through the SAN (or iSCSI network), committed to the array, and acknowledged back.
  • KAVG. Kernel average. This is how long an I/O spends in the ESXi kernel. If this is anything but zero, there is some kind of bottleneck (often a maxed-out queue).
  • DAVG. Device average. This is how long it takes for the I/O to be sent from the host, through the SAN to the array, and acknowledged back.
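If you prefer pulling these from vCenter rather than esxtop, here is a hedged PowerCLI sketch of my own (not from the post); the host name is a placeholder and the mapping of counter names to the esxtop metrics is approximate.

```powershell
# My own sketch: vCenter counters that roughly correspond to the esxtop metrics
# (GAVG ~ disk.totalLatency, KAVG ~ disk.kernelLatency, DAVG ~ disk.deviceLatency,
# QAVG ~ disk.queueLatency). The host name is a hypothetical placeholder.
$esx = Get-VMHost -Name 'esxi-01.lab.local'
Get-Stat -Entity $esx -Realtime -MaxSamples 6 -Stat @(
    'disk.totalLatency.average',
    'disk.kernelLatency.average',
    'disk.deviceLatency.average',
    'disk.queueLatency.average'
) | Sort-Object Timestamp |
    Format-Table Timestamp, MetricId, Instance, Value -AutoSize
```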

Continue reading “What is the latency stat QAVG?”

VMware Capacity Reporting Part V: VVols and UNMAP

Storage capacity reporting seems like a pretty straightforward topic: how much storage am I using? But when you introduce multiple levels of thin provisioning AND data reduction, all usage is not equal (does it compress well? does it dedupe well? is it all zeroes?).

This multi-part series will break it down in the following sections:

  1. VMFS and thin virtual disks
  2. VMFS and thick virtual disks
  3. Thoughts on VMFS Capacity Reporting
  4. VVols and capacity reporting
  5. VVols and UNMAP

Let’s talk about the ins and outs of these in detail, then of course finish it up with why VVols makes this so much better.

NOTE: Examples in this series are given from a FlashArray perspective, so mileage may vary depending on the type of array you have. The VMFS layer and above, though, are the same for all arrays. This is the benefit of VMFS: it abstracts the physical layer. It is also the downside, as I will describe in these posts.
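As a starting point, this is roughly what the vSphere layer itself will tell you per datastore with PowerCLI (my own sketch, not from the series); “provisioned” counts thin disks at their full size via the datastore summary’s uncommitted figure, and none of it reflects array-side data reduction.

```powershell
# My own sketch: capacity, free, and provisioned space as vSphere reports them.
# Provisioned = capacity - free + uncommitted (so thin disks count at full size).
Get-Datastore | Select-Object Name, Type,
    @{N = 'CapacityGB';    E = { [math]::Round($_.CapacityGB, 1) } },
    @{N = 'FreeGB';        E = { [math]::Round($_.FreeSpaceGB, 1) } },
    @{N = 'ProvisionedGB'; E = { [math]::Round(($_.ExtensionData.Summary.Capacity - $_.ExtensionData.Summary.FreeSpace + $_.ExtensionData.Summary.Uncommitted) / 1GB, 1) } }
```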

Continue reading “VMware Capacity Reporting Part V: VVols and UNMAP”

Upgrading ESXi environment with PowerCLI

A new ESXi 6.5 patch came out today:

https://kb.vmware.com/s/article/2151104

And I wanted to upgrade my whole lab environment to it, but I haven’t set up Auto Deploy or Update Manager yet (I plan to, which will make all of this much easier to manage). So I wrote a quick and dirty PowerCLI script that updates each host to the latest patch and, if the host doesn’t have any VMs on it, puts it into maintenance mode and reboots it. I will reboot the other ones as needed.

So it’s short, not really even worth throwing on GitHub, but I might make it cleaner and smarter at some point and put it there. Continue reading “Upgrading ESXi environment with PowerCLI”
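The actual script isn’t shown in this excerpt, but a rough reconstruction of what it describes might look like the following (my own sketch, not the author’s code; the depot ZIP path is a hypothetical placeholder, and the patch is staged live and only takes effect at reboot).

```powershell
# My own rough reconstruction of the described workflow, not the author's script.
# The depot ZIP path below is a hypothetical placeholder on a shared datastore.
$depot = '/vmfs/volumes/datastore1/ESXi650-patch-depot.zip'

foreach ($esx in Get-VMHost) {
    # Stage the patch with esxcli; it takes effect at the next reboot
    $esxcli = Get-EsxCli -VMHost $esx -V2
    $esxcli.software.vib.update.Invoke(@{ depot = $depot }) | Out-Null

    # Only put hosts with no powered-on VMs into maintenance mode and reboot them
    $running = @(Get-VM -Location $esx | Where-Object { $_.PowerState -eq 'PoweredOn' }).Count
    if ($running -eq 0) {
        Set-VMHost -VMHost $esx -State Maintenance -Confirm:$false | Out-Null
        Restart-VMHost -VMHost $esx -Confirm:$false | Out-Null
    }
    else {
        Write-Host "$($esx.Name): patched, reboot deferred ($running VMs powered on)"
    }
}
```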

Moving from an RDM to a VVol

Migrating VMDKs or virtual mode RDMs to VVols is easy: Storage vMotion. No downtime, no pre-creating of volumes. Simple and fast. But physical mode RDMs are a bit different.

As we all begrudgingly admit, there are still more than a few Raw Device Mappings out there in VMware environments. There are two primary use cases (a quick PowerCLI sketch for inventorying these RDMs follows the list):

  • Microsoft Clustering. Virtual disks can only be used for Failover Clustering if all of the VMs are on the same ESXi host, which feels a bit like defeating the purpose. So most opt for RDMs so they can split the VMs up.
  • Physical to virtual. Sharing copies of data between physical servers and VMs (or another hypervisor) is the most common reason I see these days, mostly around database dev/test scenarios. Data locked inside a VMDK cannot be easily shared outside of vSphere, so RDMs provide a workaround.
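Before any migration, it helps to know what you actually have. A small PowerCLI sketch of my own (not the post’s procedure) that lists physical-mode RDMs along with the array device backing each one:

```powershell
# My own sketch, not the post's procedure: list passthrough (physical-mode) RDMs
# and the naa.* identifier of the LUN backing each one.
Get-VM | Get-HardDisk -DiskType RawPhysical |
    Select-Object @{N = 'VM'; E = { $_.Parent.Name } },
                  Name,
                  @{N = 'CapacityGB'; E = { [math]::Round($_.CapacityGB, 1) } },
                  ScsiCanonicalName, DeviceName
```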

Continue reading “Moving from an RDM to a VVol”

Do thin VVols perform better than thin VMDKs?

Yes. Any questions?

Ahem, I suppose I will prove it out. The real answer is, well, maybe. It depends on the array.

Debates have raged for quite some time around the performance of virtual disk types, and while the difference has diminished drastically over the years, eagerzeroedthick has always outperformed thin. Therefore, many users opted not to use thin virtual disks.

So first off, why the difference?
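If you want to test this yourself, the key is that the penalty shows up on first writes to unallocated blocks, so the comparison disks need to be fresh. A hedged PowerCLI sketch for setting that up (my own, not the post’s benchmark; the VM and datastore names are placeholders):

```powershell
# My own test setup sketch, not the post's benchmark. VM and datastore names are
# hypothetical placeholders; run your I/O tool of choice against each disk in-guest,
# making sure it writes to previously untouched blocks.
$vm = Get-VM -Name 'perf-test-vm'
$ds = Get-Datastore -Name 'vmfs-ds-01'

New-HardDisk -VM $vm -CapacityGB 100 -Datastore $ds -StorageFormat Thin
New-HardDisk -VM $vm -CapacityGB 100 -Datastore $ds -StorageFormat EagerZeroedThick
```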

Continue reading “Do thin VVols perform better than thin VMDKs?”

Monitoring Automatic VMFS-6 UNMAP in ESXi

With VMFS-6, space reclamation is now an automatic, but asynchronous, process. This is great because, well, you don’t have to worry about running UNMAP anymore. But since it is asynchronous (and I mean like 12-24 hours later asynchronous), you lose the instant gratification of reclamation.

So you do find yourself wondering, did it actually reclaim anything?

Besides looking at the array and seeing space reclaimed, how can I see from ESXi if my space was reclaimed?
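One sanity check before digging into counters is to confirm that automatic reclamation is even enabled on the datastore. A hedged PowerCLI sketch (my own, not the post’s method; the host and datastore names are placeholders):

```powershell
# My own sketch: confirm the datastore's automatic reclaim settings via esxcli.
# A reclaim priority of "none" means automatic UNMAP is off for that volume.
$esx    = Get-VMHost -Name 'esxi-01.lab.local'
$esxcli = Get-EsxCli -VMHost $esx -V2

# Equivalent of: esxcli storage vmfs reclaim config get -l <datastore label>
$esxcli.storage.vmfs.reclaim.config.get.Invoke(@{ volumelabel = 'vmfs6-ds-01' })
```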

Continue reading “Monitoring Automatic VMFS-6 UNMAP in ESXi”

In-Guest UNMAP, EnableBlockDelete and VMFS-6

EnableBlockDelete is a setting in ESXi that has been around since ESXi 5.0 P3, I believe. It was initially introduced as a way to turn on and off the automatic VMFS UNMAP feature that was introduced in 5.0 and then eventually canned in 5.0 U1.

The description of the setting back in 5.0 was “Enable VMFS block delete”. The setting was then hidden and made defunct (it did nothing whether you turned it on or off) until ESXi 6.0, when the description changed to “Enable VMFS block delete when UNMAP is issued from guest OS”. Continue reading “In-Guest UNMAP, EnableBlockDelete and VMFS-6”
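If you want to check or flip the setting on a host, here is a minimal PowerCLI sketch (my own, not from the post); VMFS3.EnableBlockDelete is the real advanced option name, while the host name is a placeholder.

```powershell
# My own sketch: check, then enable, the advanced option on a host.
# VMFS3.EnableBlockDelete is the actual setting name; the host name is hypothetical.
$esx = Get-VMHost -Name 'esxi-01.lab.local'
Get-AdvancedSetting -Entity $esx -Name 'VMFS3.EnableBlockDelete'

Get-AdvancedSetting -Entity $esx -Name 'VMFS3.EnableBlockDelete' |
    Set-AdvancedSetting -Value 1 -Confirm:$false
```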

NMP Multipathing rules for the FlashArray are now default

As you might have noticed, vSphere 6.5 Update 1 just came out (7/27/2017), and there are quite a few enhancements and fixes. I will be blogging about these in subsequent posts, but there is one that I wanted to specifically and immediately call out now.

Round Robin with an IO Operations Limit of 1 is now the default in ESXi for the Pure Storage FlashArray! This means you no longer need to create a custom SATP rule when provisioning a new host or adding your first FlashArray into an existing environment. Continue reading “NMP Multipathing rules for the FlashArray are now default”
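If you want to verify that the built-in rule is present on an upgraded host, here is a hedged PowerCLI sketch (my own; the host name is a placeholder):

```powershell
# My own sketch: list the SATP claim rules and pull out the FlashArray entry.
# On 6.5 U1 it should show VMW_PSP_RR as the default PSP with iops=1 in the options.
$esx    = Get-VMHost -Name 'esxi-01.lab.local'
$esxcli = Get-EsxCli -VMHost $esx -V2

$esxcli.storage.nmp.satp.rule.list.Invoke() |
    Where-Object { $_.Vendor -eq 'PURE' -and $_.Model -eq 'FlashArray' }
```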