Another UNMAP post, are you shocked? A common question that comes up is: which volumes have dead space? Which datastores should I run UNMAP on?
My usual response was: well, it is hard to say. Dead space is introduced when you move or delete a VM. The array will not release the space until you either delete the volume, overwrite it, or issue UNMAP. Until vSphere 6.5, UNMAP for VMFS was not automatic; you had to run a CLI command to do it. Which leads back to the question: I have 100 datastores, which ones should I run it on?
So to find out, you need to know two things:
- How much space the file system reports as currently being used.
- How much space the array is physically storing for the volume hosting that file system.
The delta between these two numbers is the amount of dead space that can be reclaimed. If the array-side number is much higher than what the file system reports, there is a lot of dead space, because that delta represents space consumed by files that have since been deleted or moved.
With a data-reduction AFA like the FlashArray, this is not necessarily a simple thing. The FlashArray, for instance, reports how much space is being stored after data reduction: pattern removal (such as zero removal), compression, and deduplication. This shrinks the footprint the FlashArray reports as actually used, so you can't simply compare the two numbers.
Let’s take this example:
I have 80 GB used on my file system and the FlashArray reports 20 GB used for the underlying volume, so the data reduction ratio is 4:1. If I delete 20 GB from my file system, file system usage drops to 60 GB, leaving 20 GB of dead space. My array still reports 20 GB used, though, because it does not know the files were deleted.
But the 20 GB the array reports is still far less than the file-system-reported 60 GB. So you can't really tell that you have up to 20 GB of dead space (how reducible the deleted data is dictates how much you will actually get back); it just looks like your data set is now reducing 3:1 instead of 4:1. Unless you deleted A LOT of data (or your data set is not very reducible), it is hard to know you have dead space on a volume, because our number is almost always going to be lower than whatever the file system says.
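To make that arithmetic concrete, here is a tiny sketch using the example numbers above (the 4:1 reduction ratio is purely illustrative):

```python
# Illustrative numbers from the example: data reduction hides dead space.
fs_used_before = 80      # GB the file system reports as used
array_used = 20          # GB the array physically stores (4:1 reduction)

fs_used_after = fs_used_before - 20   # delete 20 GB of files
# The array still reports 20 GB used -- it never saw the deletes.
apparent_reduction = fs_used_after / array_used
print(apparent_reduction)  # 3.0 -- looks like 3:1 reduction, not 20 GB dead
```

Nothing in the array-side number flags the 20 GB as dead; it just looks like a slightly worse reduction ratio.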
So this begs the question–is there a way on the FlashArray to see what the host has written in total at a point in time for a given device, before data reduction? Well up until recently I didn’t think so. But after an unrelated conversation with engineering I found out there is indeed a way to figure this out!
So we have a metric called “thin_provisioning”, which is reported as a percentage in our CLI and REST API but is not reflected in our GUI.
I always thought this was a throw-away number: who cares about thin provisioning savings? All that matters is how well your actual data is reduced by data reduction techniques, right? Who cares how much physical capacity is saved by array-based thin provisioning? That is sooo 2008…
But! Actually! This is an important metric when it comes to dead space identification. Let’s look at what it actually means: the percentage of the overall volume that the host has never written to. So 1 minus the thin_provisioning value gives you the fraction of the volume the host has actually written to. Multiply that by the provisioned capacity of the volume and you get how much space we think the host has written to us.
(1 – thin_provisioning) * size = host written space
Pretty simple! Take this number, subtract what the file system itself shows as written, and you get the dead space!
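As a quick sketch, the arithmetic can be wrapped in a couple of Python functions (the function names are mine, not from any SDK):

```python
def host_written_gb(thin_provisioning: float, size_gb: float) -> float:
    """Space the host has written, per the formula above."""
    return (1 - thin_provisioning) * size_gb

def dead_space_gb(thin_provisioning: float, size_gb: float, fs_used_gb: float) -> float:
    """Host-written space minus what the file system says is in use."""
    return host_written_gb(thin_provisioning, size_gb) - fs_used_gb

# The example volume later in the post: a 2 TB volume.
print(round(host_written_gb(0.8438614518381655, 2048), 2))  # 319.77
```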
Let’s look at an example. I have a VMFS with 20 VMs on it and it is currently using 320 GB (1.68 TB is free) out of 2 TB:
Now to look at the FlashArray volume. You can do this a variety of ways; I used our REST API via vRealize Orchestrator and our vRO plugin. You can use anything that can make REST calls, or of course our CLI.
This is the GET REST call with the URI of:
The CLI command would be:
purevol list UNMAP-Test --space
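If you script the REST route, the space output carries the thin_provisioning value alongside the volume size. Here is a minimal sketch of pulling the number out of a response; the response shape below is an assumption modeled on the CLI/REST space output, so check it against your array's REST version:

```python
# Hypothetical FlashArray volume-space response (field names assumed
# from the space output discussed above -- verify on your REST version).
sample_response = {
    "name": "UNMAP-Test",
    "size": 2048 * 1024**3,                  # provisioned size in bytes
    "thin_provisioning": 0.8438614518381655,
}

# (1 - thin_provisioning) * size = host-written space
host_written_bytes = (1 - sample_response["thin_provisioning"]) * sample_response["size"]
print(round(host_written_bytes / 1024**3, 2))  # GB
```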
I prefer REST because it gives you a very accurate number; the CLI reports just two digits. But that is probably good enough regardless.
My volume reports a value of 0.8438614518381655 for thin_provisioning.
If I run that number through the math:
(1 – 0.8438614518381655) * 2048 GB = host written space
The FlashArray reports 319.77 GB as host-written, which (other than some rounding error) is exactly what VMFS sees! So let’s create some dead space: I will delete half of my VMs.
We can see that half have been deleted and I now have 160 GB used instead of 320 GB.
But if we look at the FlashArray, the thin_provisioning value has not changed, because the array does not know those VMs have been deleted. It still reports 320 GB as written by the host.
The difference between VMFS and the FlashArray is now 320 GB – 160 GB, so we have 160 GB of dead space! Time to run UNMAP!
There are a million ways to run UNMAP, but I will just use old-fashioned SSH and esxcli:
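For reference, a typical esxcli invocation for VMFS-5 reclamation looks like this (the datastore name is from the example above; the `-n` blocks-per-iteration count is optional):

```shell
# Reclaim dead space on the VMFS datastore (VMFS-5, ESXi 5.5+).
# -l is the datastore label; -n sets blocks unmapped per iteration.
esxcli storage vmfs unmap -l UNMAP-Test -n 200
```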
Now if we check the thin_provisioning value for that volume again, it is up to 92%, and if you do the math that works out to roughly the 160 GB that VMFS reports. Dead space totally gone!
So a quick FAQ:
Q: If I run UNMAP on this volume, will I get physical space back on my array?
A: Maybe, maybe not. This is reporting dead virtual space, so if that space is heavily deduplicated it may not be returned immediately. Running UNMAP on the FlashArray is about cleaning up the metadata tables; once the final metadata pointer for a given block has been cleared, the physical space of that block will be returned.
Q: Is this 100% accurate?
A: It is pretty accurate in my testing. The major caveat is that it is only truly accurate when thin virtual disks are used. Thick virtual disks increase the allocated, but not the written, space on the VMFS, and the array does not count a fully allocated thick virtual disk as host-written space. So with thick-type virtual disks, results may vary. On datastores with only thin virtual disks, this works very well.
Q: What about vSphere 6.5 and VMFS-6?
A: Since space reclamation is automatic with VMFS-6, this is really only targeted at VMFS-5. So upgrade to vSphere 6.5 and VMFS-6 and you don’t need to worry about this anymore.
Q: How do I automate this?
A: I wrote a PowerShell/PowerCLI script! Read on.
So I wrote a PowerCLI script to automate the discovery of this. This script does the following:
- Fully interactive. There is no need to edit the script. Just run it, and it will ask you what it needs.
- It will ask you how many FlashArrays you want to look at. It looks at your datastores and figures out which FlashArrays they are on, so the script needs connectivity to each FlashArray. Enter how many, and then an IP/FQDN for each.
- Enter in FlashArray credentials.
- Enter in the vCenter IP/FQDN.
- Either let it re-use your FlashArray credentials for vCenter (if they have access to vCenter, of course), or enter new ones.
- Enter a dead space threshold. The script looks at all of your FlashArray datastores, finds how much space each has allocated from a VMFS perspective, and compares that to what the FlashArray sees as virtually allocated; the difference is the dead space. It then returns the datastores. Now, you may not care if a volume has 30 MB of dead space, so the number you enter here filters out any volume with less than that many GB of dead space. I chose 25 in my video below.
- The script requires PowerCLI 6.3 or later. It will check for it and import it, if it is installed.
- The script also requires the FlashArray PowerShell SDK. It will check for it and import it, if it is installed.
- It also logs everything to a log file; it will ask you for a directory. Some basic information is written to the screen.
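The script itself is PowerShell/PowerCLI, but the core filtering logic is simple enough to sketch in a few lines of Python (the function and field names here are mine, purely illustrative of the flow, not the script's actual code):

```python
def find_dead_space(datastores, threshold_gb):
    """Return datastores whose dead space meets or exceeds the threshold.

    Each datastore dict is assumed to carry the VMFS-used figure plus the
    thin_provisioning and size numbers gathered from its FlashArray.
    """
    results = []
    for ds in datastores:
        host_written = (1 - ds["thin_provisioning"]) * ds["size_gb"]
        dead = host_written - ds["vmfs_used_gb"]
        if dead >= threshold_gb:
            results.append((ds["name"], round(dead, 2)))
    return results

# Example: the 2 TB datastore from the post, after deleting half the VMs.
print(find_dead_space(
    [{"name": "UNMAP-Test", "thin_provisioning": 0.8438614518381655,
      "size_gb": 2048, "vmfs_used_gb": 160}],
    threshold_gb=25,
))
```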
Download it here:
See a demo here: