In the world of SCSI, a storage device is generally addressed by two things:
LUN–Logical Unit Number. This is a number used to address the device down a specific path to a specific array, for a specific host. So it is not a unique number really, it is not guaranteed to be unique to an array, to a host, or a volume. So for every path to a volume there could be a different LUN number. Think of it like a street address. 100 Maple St. There are many “100 Maple Streets”. So it requires the city, the state/province/etc, the country to really be meaningful. And a street name can change. So can other things. So it can usually get you want you want, but it isn’t guaranteed.
Serial number. This is a globally unique identifier of the volume. This means it is entirely unique for that volume and it cannot be change. It is the same for everyone and everything who uses that volume. To continue our metaphor, look at it like the GPS coordinates of the house instead of the address. It will get you where you need, always.
So how does this change with NVMe? Well these things still exist, but how they interact is…different.
Now, first, let me remind that generally these concepts are vendor neutral, but how things are generated, reported, and even sometimes named vary. So I write this for Pure Storage, so keep that in mind.
While the title of this post does sound like a halfway decent Harry Potter novel, this is far more nefarious. Pure Storage, like many other vendors have a best practice around lowering the Disk.DiskMaxIOSize setting on ESXi hosts when using UEFI boot for your Windows VMs. Why? Well:
Yes not having it set in a few situations would cause BSOD. First off, why?
In normal scenarios, you may not see much of a performance difference between the standard IOPS switching-based policy and the latency one. So don’t necessarily expect that switching policies will change anything. But then again, multipathing primarily exists not for healthy states, but instead exists to protect during times of poor health. Continue reading “Latency-based PSP in ESXi 6.7 Update 1: A test drive”
This is my first (but certainly not last post) on the new path selection policy option in vSphere 6.7 Update 1. In reality, this option was introduced in the initial release of 6.7, but it was not officially supported until update 1.
So what is it? Well first off, see the official words from my colleague Jason Massae at VMware here:
Why was this PSP option introduced? Well the most common path selection policy is the NMP Round Robin. This is VMware’s built-in path selection policy for arrays that offer multiple paths. Round Robin was a great way to leverage the full performance of your array by actively using all of the paths simultaneously. Well…almost simultaneously.
In short, VMware only supported one mechanism of LUN ID addressing which is called “peripheral”. A different mechanism is generally encouraged by the SAM called “flat” especially for larger LUN IDs (like 256 and above). If a storage array used flat addressing, then ESXi would not see LUNs from that target. This is often why ESXi could not see LUN IDs greater than 255, as arrays would use flat addressing for LUN IDs that number or higher.
I wrote a blog post a year or so ago about ESXi and storage queues which has received a lot of wonderful feedback (thank you!!) and I eventually turned it into a VMworld session and other engagements:
So in the past year I have had quite a few discussions around this. And one part has always bothered me a bit.
In ESXI, there are a variety of latency metrics:
GAVG. Guest average. Sometimes called “VM observed latency”. This is the amount of time it takes for an I/O to be completed, after it leaves the VM. So through ESXi, through the SAN (or iSCSI network) and committed to the array and acknowledged back.
KAVG. Kernel average. This is how long an I/O is spending in the ESXi kernel. If this is anything but zero, there is some kind of bottleneck (often a maxed out queue)
DAVG. This is how long it takes for the I/O to be sent from host, through the SAN and to the array and acknowledged back.
Storage capacity reporting seems like a pretty straight forward topic. How much storage am I using? But when you introduce the concept of multiple levels of thin provisioning AND data reduction into it, all usage is not equal (does it compress well? does it dedupe well? is it zeroes?).
This multi-part series will break it down in the following sections:
Let’s talk about the ins and outs of these in detail, then of course finish it up with why VVols makes this so much better.
NOTE: Examples in this are given from a FlashArray perspective. So mileage may vary depending on the type of array you have. The VMFS and above layer though are the same for all. This is the benefit of VMFS–it abstracts the physical layer. This is also the downside, as I will describe in these posts.
And I wanted to upgrade my whole lab environment to it and I haven’t set up auto-deploy or update manager yet (I plan to, making all of this much easier to manage). So I wrote a quick and dirty PowerCLI script that updates to the latest patch and if the host doesn’t have any VMs on it, puts it into maintenance mode and reboots it. I will reboot the other ones as needed.
Migrating VMDKs or virtual mode RDMs to VVols is easy: Storage vMotion. No downtime, no pre-creating of volumes. Simple and fast. But physical mode RDMs are a bit different.
As we all begrudgingly admit there are still more than a few Raw Device Mappings out there in VMware environments. Two primary use cases:
Microsoft Clustering. Virtual disks can only be used for Failover Clustering if all of the VMs are on the same ESXi hosts which feels a bit like defeating the purpose. So most opt for RDMs so they can split the VMs up.
Physical to virtual. Sharing copies of data between physical and virtual or some other hypervisor is the most common reason I see these days. Mostly around database dev/test scenarios. The concept of a VMDK can keep your data from being easily shared, so RDMs provide a workaround.
Ahem, I suppose I will prove it out. The real answer is, well maybe. Depends on the array.
So debates have raged on for quite some time around performance of virtual disk types and while the difference has diminished drastically over the years, eagerzeroedthick has always out-performed thin. And therefore many users opted to not use thin virtual disks because of it.