I wrote a blog post a year or so ago about ESXi and storage queues, which has received a lot of wonderful feedback (thank you!!), and I eventually turned it into a VMworld session and other engagements.
So in the past year I have had quite a few discussions around this. And one part has always bothered me a bit.
In ESXi, there are a variety of latency metrics:
GAVG. Guest average. Sometimes called “VM observed latency”. This is the amount of time it takes for an I/O to be completed after it leaves the VM. That path runs through ESXi, across the SAN (or iSCSI network), to the array where the I/O is committed, and back with the acknowledgement.
KAVG. Kernel average. This is how long an I/O spends in the ESXi kernel. If this is anything but zero, there is some kind of bottleneck (often a maxed-out queue).
DAVG. Device average. This is how long it takes for the I/O to be sent from the host, through the SAN, to the array and acknowledged back.
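A quick way to see how these three relate: GAVG is roughly DAVG plus KAVG, so whatever guest latency is not explained by the device side was spent in the kernel. Here is a minimal sketch of that arithmetic. The function name and the sample values are made up for illustration, and real esxtop counters are averages that will not always line up this neatly.

```python
def kernel_latency_ms(gavg_ms: float, davg_ms: float) -> float:
    """Estimate KAVG as the share of guest latency not spent at the device.

    Assumes the usual relationship GAVG ~= DAVG + KAVG. The inputs are
    illustrative values, not readings from a real host.
    """
    if gavg_ms < 0 or davg_ms < 0:
        raise ValueError("latencies must be non-negative")
    # Sampling noise can make DAVG appear slightly larger than GAVG,
    # so clamp the estimate at zero instead of returning a negative value.
    return max(gavg_ms - davg_ms, 0.0)


if __name__ == "__main__":
    # Hypothetical sample: 2.0 ms seen by the guest, 0.5 ms at the device.
    print(kernel_latency_ms(2.0, 0.5))
```

If the result is consistently above zero, that is the hint to go look at queue depths on the host.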
In the latest GA release of Purity, version 4.1.5, there have been some nice improvements in how we handle host connectivity/balance reporting. There is a new CLI command to monitor the balance of I/O from a host standpoint, and we have changed how host connectivity is reported and displayed in the FlashArray web GUI. Let’s take a look at these enhancements. In Part 1, I will talk about the CLI enhancement.
The Pure Storage Management Pack for VMware vRealize Operations Manager version 1 is now out! Download it here. This is the latest in our aggressive 2015 roadmap of VMware management integration, whether that be integration points that are new or updated.
So first, what is a management pack? A management pack is a plugin of sorts that can be installed into vRealize Operations Manager (vROPs) that provides context and relationships to existing objects inside vROPs. How these objects are related depends on what the pack represents. In the case of Pure Storage, the pack relates VMware objects, such as VMs and datastores, to volumes on a particular FlashArray. This is in addition to FlashArray host groups and hosts.
Here is another “look what I found” storage-related post for vSphere 6. Once again, I am still looking into exact design changes, so this is what I observed and my educated guess on how it was done. Look for more details as time wears on.
***This blog post really turned out longer than I expected and probably should have been a two-parter, so I apologize for the length.***
As usual, let me wax historical for a bit… A little over a year ago, in my previous job, I wrote a proposal document to VMware to improve how they handled XCOPY. XCOPY, as you may be aware, is the SCSI command used by ESXi to clone VMs, Storage vMotion them, or deploy them from a template on a compatible array. It seems that in vSphere 6.0 VMware implemented these requests (my good friend Drew Tonnesen recently blogged on this). My request centered around three things:
Allow XCOPY to use a much larger transfer size (current maximum is 16 MB), that is, how much space a single XCOPY SCSI command can describe. Things like Microsoft ODX can handle XCOPY sizes up to 256 MB, for example (though the ODX implementation is a bit different).
Allow ESXi to query the Maximum Segment Length during an Extended Copy (XCOPY) Receive Copy Results and use that value. This value tells ESXi what to use as a maximum transfer size. This will allow the end user to avoid the hassle of having to deal with manual transfer size changes.
Allow for thin virtual disks to leverage a larger transfer size than 1 MB.
The first two are currently supported only in a very limited fashion by VMware right now (but stay tuned on this!), so for this post I am going to focus on the thin virtual disk enhancement and what it means on the FlashArray.
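To put rough numbers on why the transfer size matters, here is a small back-of-the-envelope sketch. It assumes the simplest possible model: every XCOPY command describes exactly one full transfer size, so the command count is just the amount of data divided by the transfer size, rounded up. The function name and the 100 GB example are mine, and this ignores how ESXi actually segments a copy.

```python
MB = 1024 * 1024
GB = 1024 * MB


def xcopy_commands_needed(bytes_to_copy: int, transfer_size_bytes: int) -> int:
    """Count XCOPY commands under a simplified one-command-per-chunk model."""
    if bytes_to_copy < 0:
        raise ValueError("bytes_to_copy must be non-negative")
    if transfer_size_bytes <= 0:
        raise ValueError("transfer_size_bytes must be positive")
    # Integer ceiling division: a partial final chunk still needs a command.
    return -(-bytes_to_copy // transfer_size_bytes)


if __name__ == "__main__":
    # Hypothetical 100 GB virtual disk at the three sizes mentioned above.
    for size_mb in (1, 16, 256):
        count = xcopy_commands_needed(100 * GB, size_mb * MB)
        print(f"{size_mb} MB transfer size: {count} commands")
```

Under this model a disk copied at 1 MB needs sixteen times as many commands as the same disk copied at 16 MB, which is the gap the thin virtual disk request was aimed at.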
Quick post here. I am working on updating some documentation and I wanted to add a bit more color to a section on changing the IO Operations limit for ESXi NMP Round Robin devices. The Pure Storage recommendation is to change this value to 1 from the default of 1,000. ESXi will then switch logical paths after each I/O instead of after every 1,000 I/Os. There are some performance benefits to this and some evidence for improved failover time (in the case of a path failure) with this setting. I am not going to get into the veracity of these benefits right now. What I wanted to share here is that there is no doubt changing this to 1 makes a big difference to I/O balance on the array itself.
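A toy model makes the balance effect easy to see. The sketch below simply counts how many I/Os land on each path when a single host switches paths after a fixed number of I/Os. It is an illustration only: the function name and the numbers are made up, and it is not how NMP is implemented internally.

```python
from typing import List


def ios_per_path(total_ios: int, path_count: int, io_operations_limit: int) -> List[int]:
    """Count I/Os per path for a simple round robin with a switch limit."""
    if path_count <= 0 or io_operations_limit <= 0:
        raise ValueError("path_count and io_operations_limit must be positive")
    counts = [0] * path_count
    path = 0
    sent_on_current_path = 0
    for _ in range(total_ios):
        counts[path] += 1
        sent_on_current_path += 1
        # Move to the next path once the limit is reached on this one.
        if sent_on_current_path == io_operations_limit:
            path = (path + 1) % path_count
            sent_on_current_path = 0
    return counts


if __name__ == "__main__":
    # Hypothetical burst of 2,500 I/Os over four paths.
    print(ios_per_path(2500, 4, 1000))
    print(ios_per_path(2500, 4, 1))
```

With the limit at 1,000, a short burst piles onto the first couple of paths and can leave others idle. With the limit at 1, the same burst is spread evenly.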
Today I posted a new document to our repository on purestorage.com: Pure Storage and VMware Storage APIs for Array Integration—VAAI. This is a new white paper that describes in detail the VAAI block primitives that VMware offers and that we support. Furthermore, performance expectations are described, comparing before/after and how the operations do at scale. There are some best practices listed as well, and the why and how of those recommendations are also described within.
I have to say, especially when it comes to XCOPY, I have never seen a storage array do so well with it. It is really quite impressive how fast XCOPY sessions complete and how scaling it up (in terms of numbers of VMs or size of the VMDKs) doesn’t weaken the process at all. The main purpose of this post is to alert you to the new document but I will go over some high level performance pieces of information as well. Read the document for the details and more.
I posted a week or so ago about the ESXCLI UNMAP process with vSphere 5.5 on the Pure Storage FlashArray here and came to the conclusion that larger block counts are highly beneficial to the UNMAP process. So the recommendation was to simply use a larger block count than the default to speed up the UNMAP operation, something sufficiently higher than the default of 200 MB. I received a few questions about a more specific recommendation (and had some myself), so I decided to dive into this a little deeper to see if I could provide some guidance that was a little more concrete. In the end a large block count is perfectly fine. If you want to know more details, read on!
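As a rough illustration of why the block count matters, the sketch below estimates how many reclaim passes it takes to cover a given amount of free space, and builds the matching `esxcli storage vmfs unmap` command line. It assumes a 1 MB VMFS block size, so a block count of 200 covers 200 MB per pass. The datastore name, the free space figure, and the larger block count are placeholders, not recommendations.

```python
import shlex


def unmap_passes(free_space_mb: int, block_count: int, block_size_mb: int = 1) -> int:
    """Estimate reclaim passes, assuming each pass covers block_count blocks."""
    if block_count <= 0 or block_size_mb <= 0:
        raise ValueError("block_count and block_size_mb must be positive")
    mb_per_pass = block_count * block_size_mb
    # Integer ceiling division: a partial final pass still counts.
    return -(-free_space_mb // mb_per_pass)


def unmap_command(datastore_label: str, block_count: int) -> str:
    """Build the esxcli UNMAP command line for a datastore (not executed here)."""
    return shlex.join([
        "esxcli", "storage", "vmfs", "unmap",
        f"--volume-label={datastore_label}",
        f"--reclaim-unit={block_count}",
    ])


if __name__ == "__main__":
    free_mb = 1_000_000  # hypothetical: about 1 TB of free space
    for count in (200, 60_000):
        print(count, "blocks per pass:", unmap_passes(free_mb, count), "passes")
    print(unmap_command("Datastore01", 60_000))  # "Datastore01" is a placeholder
```

The pass count is only a proxy for how long the operation takes, but it shows how quickly the number of iterations drops as the block count grows.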
This is a topic I have posted about in the past, but this time I am going to speak about it with the Pure Storage FlashArray. Anyone familiar with the VMware Native Multipathing Plugin probably knows about the Round Robin “IOPS” value, which I will also refer to interchangeably as the IO Operation Limit. This value dictates how often NMP switches paths to the device: after a configured number of I/Os, NMP will move to a different path. The default value is 1,000 but it can be changed to as low as 1. For the highest performance Pure recommends changing this setting to 1 for all devices. The tricky thing is that it has to be done for every device on every host, and doing this in a simple way isn’t immediately obvious. But here is the procedure.
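Since the setting is per device, one way to take the tedium out of it is to generate the command for each device from a list. The sketch below only builds the `esxcli storage nmp psp roundrobin deviceconfig set` command lines as strings and does not run anything. The device identifiers are made-up placeholders, and how you collect the real list and run the commands on each host is left open.

```python
import shlex
from typing import Iterable, List


def round_robin_iops_commands(device_ids: Iterable[str], iops: int = 1) -> List[str]:
    """Build one esxcli command per device to set the Round Robin IOPS value."""
    if iops < 1:
        raise ValueError("iops must be at least 1")
    commands = []
    for device_id in device_ids:
        commands.append(shlex.join([
            "esxcli", "storage", "nmp", "psp", "roundrobin", "deviceconfig", "set",
            f"--device={device_id}",
            "--type=iops",
            f"--iops={iops}",
        ]))
    return commands


if __name__ == "__main__":
    # Placeholder device identifiers, not real volumes.
    devices = ["naa.0000000000000001", "naa.0000000000000002"]
    for command in round_robin_iops_commands(devices):
        print(command)
```

The same list of commands would need to be run on every host that sees the devices.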
A common recommendation from storage vendors is to change the default IOPS setting for VMware’s Native Multi-Pathing (NMP) Path Selection Policy (PSP) Round Robin. The IOPS setting controls how many I/Os are sent down a single logical path before switching to the next path. By default this number is 1,000 I/Os. The VMAX recommendation is to set this to 1. The purpose of this blog post is not to debate the setting, but to help those who want to use it. Regardless, I have seen many customers benefit from this recommendation. Once they see a benefit they want to know: can I make this setting a default?
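One way to approach a default is a SATP claim rule that assigns Round Robin with `iops=1` to matching devices as they are claimed. The sketch below builds an `esxcli storage nmp satp rule add` command line as a string. The SATP, vendor, and model values in the example are my assumptions for a VMAX and should be checked against the array vendor's documentation before use. Nothing is executed here.

```python
import shlex


def satp_rule_add_command(satp: str, vendor: str, model: str, iops: int = 1) -> str:
    """Build a claim rule command that sets Round Robin with an IOPS option."""
    if iops < 1:
        raise ValueError("iops must be at least 1")
    return shlex.join([
        "esxcli", "storage", "nmp", "satp", "rule", "add",
        f"--satp={satp}",
        f"--vendor={vendor}",
        f"--model={model}",
        "--psp=VMW_PSP_RR",
        f"--psp-option=iops={iops}",
    ])


if __name__ == "__main__":
    # Assumed values for illustration; verify them for your array.
    print(satp_rule_add_command("VMW_SATP_SYMM", "EMC", "SYMMETRIX"))
```

A rule like this generally applies to devices claimed after it is added, so devices that are already presented would still need the per-device change.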