Removing an incorrect SATP Rule

This was a fun one. I’ve posted in the past about changing and removing a SATP rule. That post, however, only works if the SATP rule is valid and you want to change or remove it for whatever reason. I am going to re-use the same image I made for that previous post, because it still holds true:

[Image: mordor]

Anyway. The issue:

Someone setting up their ESXi environment for the FlashArray wanted to apply our main recommendation: changing the default SATP configuration for Pure Storage (vendor) FlashArray (model) volumes from MRU to Round Robin, with the IO Operations value set to one. Changing this default takes a single command that adds this rule:

 esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"

They ran into an issue with this though. What happened is that the command was emailed or copy/pasted into an enhanced editor at some point and the straight double quotes around the Path Selection Policy (PSP) were “autocorrected” into curly double quotes.

So it looked like this:

esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P “VMW_PSP_RR” -O "iops=1"

Notice the slight difference between the PSP double quotes.

Wrong:

“VMW_PSP_RR”

Right:

"VMW_PSP_RR"

The issue is that the ESXi shell does not treat curly double quotes as string delimiters, the way it treats straight double quotes, but as part of the string itself. The string “VMW_PSP_RR” is not a PSP name; VMW_PSP_RR is. So ESXi returned an error saying the PSP does not exist, and that they could add --force to the command if they really wanted to.
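To see this behavior for yourself, here is a minimal sketch that should run in any POSIX-style shell. The variable names are only for illustration. It assigns the same PSP name once with straight quotes and once with curly quotes, then prints what the shell actually stored:

```shell
# Straight double quotes are shell syntax: the shell strips them and
# keeps only the text between them.
straight="VMW_PSP_RR"

# Curly quotes are ordinary characters to the shell, so they stay in the value.
curly=“VMW_PSP_RR”

# Brackets make it easy to see exactly what each variable holds.
printf 'straight: [%s]\n' "$straight"
printf 'curly:    [%s]\n' "$curly"
```

The first value comes out as the bare name, while the second keeps the curly quotes as part of the value, which is what ESXi then tried to look up as a PSP.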

See the error below:

[Image: satperror]

So they ran it again with the --force flag and the rule (the incorrect one) was added. If you run a SATP rule query (for example, esxcli storage nmp satp rule list) you can see it looks funky:

[Image: rule]

The PSP has the quotes around it, the others do not.

The person noticed the extra quotes, and after some troubleshooting I realized the command used to add the rule had curly quotes. The problem is that the esxcli command to remove a SATP rule (you cannot change one, only add or remove) does not have a --force option. It does not allow you to specify a PSP that does not exist, even if the rule exists with it. No matter what method I tried, it would fail to remove:

[Image: failedtoremove]

It always failed with an “Unknown PSP…” error.

I tried PowerCLI and host profiles (creating one from that host, removing the rule in the profile, then re-applying it), but the rule remained.

I originally thought it was stored in the esx.conf file but somehow could not find it. I spoke with Cormac and after some back and forth he concurred there seems to be no other way, but he located it in the conf file. I had missed it. So thanks Cormac!!

So as far as either of us is aware, the only way to remove a bad rule like this is to remove it manually from the esx.conf file. This is not a process that should be taken lightly though!

I recommend doing the following things prior to editing:

  1. vMotion any running VMs off of the host
  2. Put the host in maintenance mode
  3. Back up the conf file by running something like:
    1. cp /etc/vmware/esx.conf /etc/vmware/esx.confbackup
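As a rough sketch of step 3, a small helper like the one below makes a timestamped copy, so a second attempt does not overwrite an earlier backup. The function name backup_conf and the backup naming scheme are hypothetical, not something ESXi provides, and the sketch assumes cp and date behave as they do in a standard POSIX environment:

```shell
# Hypothetical helper: copy a config file to a timestamped backup next to it
# and print the backup path. Defaults to the esx.conf path used in this post.
backup_conf() {
    src="${1:-/etc/vmware/esx.conf}"
    dst="${src}.backup-$(date +%Y%m%d-%H%M%S)"

    # -p preserves the mode and timestamps of the original file.
    cp -p "$src" "$dst" || return 1

    # Refuse to continue if the copy is missing or empty.
    [ -s "$dst" ] || return 1

    printf '%s\n' "$dst"
}
```

Running backup_conf with no argument would back up /etc/vmware/esx.conf; passing a path backs up that file instead, which is handy for trying it on a scratch file first.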

Take a look at this KB too:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1017022

Now, open the conf file in vi:

vi /etc/vmware/esx.conf

Then locate the five lines (there could be more or fewer depending on the rule; check the rule number in the file).

[Image: ruleinconffile]
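If the rule is hard to spot by eye, searching for the curly quotes themselves can point you at it. This is only a sketch: find_curly_quote_lines is a hypothetical helper name, and it assumes the bad value was written to the file with its curly quotes intact, the way it shows up in the rule listing:

```shell
# Hypothetical helper: print every line (with its line number) that contains
# a curly double quote. Defaults to the esx.conf path used in this post.
find_curly_quote_lines() {
    # Two -e patterns, one per quote character, so the match works
    # byte-for-byte regardless of the shell's locale settings.
    grep -n -e '“' -e '”' "${1:-/etc/vmware/esx.conf}"
}
```

This only finds the line holding the bad PSP value. The other lines that belong to the same rule do not contain curly quotes, so you still need to match them up by the rule number on that line.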

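Type i to get into insert mode and delete the lines. Press Esc, then J (capital) to join and remove the empty lines left behind, to keep things clean. Alternatively, from command mode, dd deletes the whole line under the cursor and leaves no empty line behind. Press Esc again, then type :wq! to save the file. Reboot the ESXi host and the rule will be gone!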

The lessons learned here should be:

  1. Don’t use --force unless you know why
  2. Edit the esx.conf file with great care
  3. Characters matter
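One way to act on the third lesson is to normalize a pasted command before running it. The sketch below is an illustration, not a tool that ships with ESXi: straighten_quotes is a hypothetical helper that replaces curly double quotes with straight ones using sed. It only prints the cleaned-up text, so you can still read it over before executing anything:

```shell
# Hypothetical helper: print the given text with curly double quotes
# replaced by straight double quotes. It does not execute anything.
straighten_quotes() {
    printf '%s\n' "$1" | sed -e 's/“/"/g' -e 's/”/"/g'
}

# The mangled command from this post, held as plain text in single quotes.
pasted='esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P “VMW_PSP_RR” -O "iops=1"'

# Print the corrected command for review.
straighten_quotes "$pasted"
```

Retyping the quotes by hand works just as well. The point is to catch the substitution before the command reaches the host.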

UPDATE: I received a question that I think is worth sharing here:

The question: Could you not just remove the double quotes in the esx.conf file vs. having to delete the line and then re-add it?  Just thought I would check to see if that is possible.

My answer: While it is possible, in my opinion, it won’t really help for a few reasons:

  1. You would still need to reboot the host for it to pick up the change
  2. Deleting is far less error prone than editing; you are more likely to screw it up via an edit than a delete
  3. When you enter a bad PSP the iops=1 (that we recommend) setting gets dropped so you would need to enter a whole new line for it, which is another set of problems.

So while it is possible, it really doesn’t get you much.

 

3 Replies to “Removing an incorrect SATP Rule”

  1. When running the command esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"

    did you put the server in maintenance mode?

    also what was the command you ran to show you the wrong entry?

    Thanks

    1. You do not need to put the host in maintenance mode to add this.

      All rules are stored in the file: /etc/vmware/esx.conf

      So to view invalid rules you can look in there.

  2. In vSphere v7, you can remove your user rule like this (all fields are needed):

    # esxcli storage nmp satp rule remove -s VMW_SATP_ALUA -V PURE -M FlashArray -P “VMW_PSP_RR” -O “iops=1” -e “FlashArray Best Practice SATP Claim”
