Changing the default VMware Round Robin IO Operation Limit value for Pure Storage FlashArray devices

This is a topic I have posted about in the past, but this time I am going to cover it for the Pure Storage FlashArray. Anyone familiar with the VMware Native Multipathing Plugin (NMP) probably knows about the Round Robin “IOPS” value, which I will also refer to interchangeably as the IO Operation Limit. This value dictates how often NMP switches paths to a device: after the configured number of I/Os, NMP moves to a different path. The default value is 1,000, but it can be set as low as 1. For the highest performance, Pure recommends changing this setting to 1 for all devices. The tricky part is that it has to be done for every device on every host, and a simple way to do this isn’t immediately obvious. Here is the procedure.

The most common method has been to set it on each device using esxcli, but this does not scale well: it has to be done for every device on every host until the end of time. It is much easier to create a rule that sets an IOPS value for every Pure device as it is claimed. The SATP that claims Pure devices is the standard ALUA one, VMW_SATP_ALUA, so the rule needs to be assigned for Pure devices claimed by that SATP. First you need some information.

To create a rule specific enough to encompass only Pure devices we need to get the vendor information from an existing device. The simplest way to do this (or a simple one at least) is to just grep the vmkernel log after a rescan:

grep -i scsiscan /var/log/vmkernel.log

This will give you lines that look like so:

2014-05-14T21:54:50.756Z cpu13:33081 opID=2ac75bde)ScsiScan: 976: Path 'vmhba3:C0:T5:L11': Vendor: 'PURE ' Model: 'FlashArray ' Rev: '342 '
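If you would rather not read the vendor and model off the log lines by eye, a small filter can pull them out. This is only a sketch that assumes the ScsiScan line format shown above; the function name `scsiscan_vendor_model` and the `|` separator are my own choices for illustration.

```shell
# Read ScsiScan log lines on stdin and print each unique VENDOR|MODEL pair.
# Assumes the "Vendor: '...' Model: '...'" layout shown above; the padding
# spaces inside the quotes are trimmed.
scsiscan_vendor_model() {
    sed -n "s/.*Vendor: '\([^']*\)' *Model: '\([^']*\)'.*/\1|\2/p" |
        sed 's/ *|/|/; s/ *$//' |
        sort -u
}

# Usage on a host:
#   grep -i scsiscan /var/log/vmkernel.log | scsiscan_vendor_model
```

The values it prints are the ones to pass to the vendor and model options of the rule.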

We just need the vendor and model names, which unsurprisingly are PURE and FlashArray respectively. To create a rule that both makes sure Pure devices use Round Robin and always sets the IOPS value to 1, run this command on all of your ESXi hosts:

esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"

This is case sensitive so make sure you type this exactly as above.
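If SSH is enabled on your hosts, one way to avoid typing the rule on each of them is a small loop run from a management machine. This is a rough sketch, not a tested rollout tool: the host names, the `root` login, and the `DRY_RUN` switch are all assumptions made for illustration.

```shell
# Hypothetical host list; replace with your own ESXi hosts.
HOSTS="esxi01.example.com esxi02.example.com"

# The rule from above, kept in one place so every host gets the same string.
RULE_CMD='esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"'

# With DRY_RUN=1 (the default here) the ssh commands are only printed.
# Set DRY_RUN=0 to actually run them; this assumes SSH access as root.
add_pure_rule_everywhere() {
    for host in $HOSTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh root@$host '$RULE_CMD'"
        else
            ssh "root@$host" "$RULE_CMD"
        fi
    done
}

# Usage:
#   add_pure_rule_everywhere              # print what would be run
#   DRY_RUN=0 add_pure_rule_everywhere    # run it on every host
```

Printing the commands first makes it easy to check the exact case-sensitive string before it touches any host.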

***See how to do this with PowerCLI here***

Note that existing devices will not get this change! If they are currently using another policy such as MRU, or have a different IOPS value, the rule will not change them. You need to either change existing devices specifically, unclaim and reclaim them (which requires taking the device offline), or reboot the host. If you want to change specific devices without taking them offline you can run the following (with your own NAA of course):

esxcli storage nmp device set -d naa.6006016055711d00cff95e65664ee011 --psp=VMW_PSP_RR
esxcli storage nmp psp roundrobin deviceconfig set -d naa.6006016055711d00cff95e65664ee011 -I 1 -t iops

Regardless, all new devices will be claimed with Round Robin and an IOPS value of 1 from this point on. You can check the IO Operation Limit value for a given device by running:

esxcli storage nmp psp roundrobin deviceconfig get --device naa.624a9370753d69fe46db318d00010000
 Byte Limit: 10485760
 Device: naa.624a9370753d69fe46db318d00010000
 IOOperation Limit: 1
 Limit Type: Default
 Use Active Unoptimized Paths: false
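When checking more than a handful of devices it can help to reduce that output to just the number. A minimal sketch, assuming the `IOOperation Limit:` label appears exactly as in the output above; the function name `io_operation_limit` is made up for this example.

```shell
# Read "deviceconfig get" output on stdin and print only the IO Operation Limit.
# Assumes the "IOOperation Limit: N" label shown in the output above.
io_operation_limit() {
    sed -n 's/^ *IOOperation Limit: *\([0-9][0-9]*\).*/\1/p'
}

# Usage on a host:
#   esxcli storage nmp psp roundrobin deviceconfig get \
#       --device naa.624a9370753d69fe46db318d00010000 | io_operation_limit
```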

To change or remove the rule, you cannot simply run the command again with 1,000 or some other number. You must first remove the existing rule; then you can either create a new one with a different number or leave it without a rule to go back to the default of 1,000.

esxcli storage nmp satp rule remove -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"
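Since a change is always a remove followed by an add, the two steps can be wrapped together. This is a sketch under a few assumptions: the helper names `run` and `replace_pure_iops_rule` and the `DRY_RUN` switch are mine, the rule is the VMW_SATP_ALUA one created earlier, and the value 10 in the usage line is purely an example.

```shell
# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Remove the rule holding the old iops value, then add one with the new value.
# The add is skipped if the remove fails.
replace_pure_iops_rule() {
    old_iops="$1"
    new_iops="$2"
    run esxcli storage nmp satp rule remove -s VMW_SATP_ALUA -V PURE \
        -M FlashArray -P VMW_PSP_RR -O "iops=$old_iops" &&
    run esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V PURE \
        -M FlashArray -P VMW_PSP_RR -O "iops=$new_iops"
}

# Usage on a host (10 is only an example value):
#   replace_pure_iops_rule 1 10              # print both commands
#   DRY_RUN=0 replace_pure_iops_rule 1 10    # actually swap the rule
```

As with the original rule, this only affects devices claimed afterwards.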

If you don’t remember what you set or want to take a look at the existing rules, run:

esxcli storage nmp satp rule list -s VMW_SATP_ALUA

Pretty straightforward!

17 Replies to “Changing the default VMware Round Robin IO Operation Limit value for Pure Storage FlashArray devices”

  1. If you just don’t want to pay too much attention to the naa attribute, using these two commands should help.

    RR activation:
    # for i in `esxcli storage nmp device list | grep PURE | awk '{gsub(/[()]/,""); print $8}'` ; do esxcli storage nmp device set -d $i --psp=VMW_PSP_RR; done

    Path switching to 1:
    # for i in `esxcli storage nmp device list | grep PURE | awk '{gsub(/[()]/,""); print $8}'` ; do esxcli storage nmp psp roundrobin deviceconfig set -d $i -I 1 -t iops; done

    1. For any device presented you should see it. esxcfg-scsidevs -l should show it too. What vendor are you looking to configure for?

  2. Oh wow that is cool, I ran “esxcfg-scsidevs -l” and looks like there are 15 different ones.
    Three pertain to “Vendor: PURE” and “Model: FlashArray”.
    I just have one array, should there be 15 different naa.#’s?
    Each is Multipath Plugin: NMP

    1. Every volume (or datastore or LUN or whatever you want to call it) you provision will have its own NAA. The NAA is based on the volume serial number, so each one is unique; it is what VMware uses to identify each datastore uniquely. For the FlashArray, though, the vendor and model info will always be PURE and FlashArray. This is not unique to a volume; it is common to all storage from our array. To create a SATP rule you would use those values for us. If you are running the latest versions of ESXi, though, you do not need to do this anymore.

  3. Ah OK, so I am on ESXi 5.5. I’d just run the rule for PURE FlashArray and should be good. Or alternatively, I’d upgrade to ESXi 6.5 and wouldn’t need to add the rule to change the Round Robin IO limit?

    Thanks again

    1. You’re welcome! Yep, exactly! If you are on 5.5, run that rule on each ESXi host once and you are good. If you are on 6.0 Express Patch 5 or later, or 6.5 U1 or later, you don’t need to do it at all, as these recommendations are now the default in ESXi for the FlashArray in those releases and later.
