This was a fun one. I’ve posted in the past about changing and removing a SATP rule. That post, however, only works if the SATP rule is valid and you want to change or remove it for whatever reason. I am going to re-use the same image I made for that previous post, because it still holds true:
Anyway. The issue:
I was helping someone set up their ESXi environment for the FlashArray. They wanted to apply our main recommendation: changing the default SATP rule for Pure Storage (vendor) FlashArray (model) volumes so that the path selection policy is Round Robin instead of MRU, with the IO Operations value set to one. Changing this default takes a single command that adds this rule:
esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"
They ran into an issue with this though. What happened is that the command was emailed or copy/pasted into an enhanced editor at some point and the straight double quotes around the Path Selection Policy (PSP) were “autocorrected” into curly double quotes.
So it looked like this:
esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P “VMW_PSP_RR” -O "iops=1"
Notice the slight difference between the PSP double quotes.
The issue is that the ESXi shell does not treat curly double quotes as string delimiters the way it treats straight double quotes; it treats them as part of the string. So when they ran the command, ESXi saw the PSP name as “VMW_PSP_RR”, quotes included. That is not a PSP name; VMW_PSP_RR is. ESXi therefore returned an error saying the PSP does not exist and that they could add --force to the command if they really wanted to.
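To make the quoting behavior above concrete, here is a minimal sketch in plain POSIX shell. The `has_curly_quotes` helper is a hypothetical name of my own, not part of esxcli or ESXi; it simply checks a pasted command for curly quotes before you run it. It assumes the text is UTF-8.

```shell
# Straight double quotes are shell syntax and are stripped before the
# program sees the argument. Curly quotes are ordinary characters and
# are passed through as part of the argument.
printf '%s\n' "VMW_PSP_RR"   # prints VMW_PSP_RR
printf '%s\n' “VMW_PSP_RR”   # prints “VMW_PSP_RR”

# Hypothetical helper: succeeds (exit 0) if the text contains any curly
# single or double quote, fails (exit 1) otherwise.
has_curly_quotes() {
    case "$1" in
        *“*|*”*|*‘*|*’*) return 0 ;;
        *) return 1 ;;
    esac
}

# The two commands from this post, held as plain strings (never executed).
good='esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"'
bad='esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P “VMW_PSP_RR” -O "iops=1"'

for cmd in "$good" "$bad"; do
    if has_curly_quotes "$cmd"; then
        echo "curly quotes found, do not run this as-is"
    else
        echo "no curly quotes found"
    fi
done
```

A check like this is cheap insurance whenever a command has travelled through email or a word processor on its way to the shell.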
See the error below
So they ran it again with the --force option and the rule (the incorrect one) was added. If you run a SATP query you can see it looks funky:
The PSP has the quotes around it, the others do not.
The person noticed the extra quotes, and after some troubleshooting I realized the problem was that the command used to add the rule had curly quotes. The trouble is that the esxcli command to remove a SATP rule (you cannot change one, only add or remove) does not include a force option. It does not allow you to specify a PSP that does not exist, even if a rule exists with it. No matter what method I tried, it would fail to remove:
Always an error “Unknown PSP…”
I tried PowerCLI and host profiles (creating a profile from that host, removing the rule in the profile, then re-applying it), but the rule remained.
I originally thought it was stored in the esx.conf file but somehow could not find it there. I spoke with Cormac and after some back and forth he concurred there seems to be no other way, but he located it in the conf file. I had missed it. So thanks Cormac!!
So as far as either of us is aware, the only way to remove a bad rule like this is via manual removal from the esx.conf file. This is not a process that should be taken lightly though!
I recommend doing the following things prior to editing:
- vMotion any running VMs off of the host
- Put the host in maintenance mode
- Back up the conf file by running something like:
  cp /etc/vmware/esx.conf /etc/vmware/esx.confbackup
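If you expect to edit the file more than once, a timestamped copy avoids overwriting an earlier backup. The following is a minimal sketch of that idea; `backup_conf` and the `ESX_CONF` variable are hypothetical names of my own, and the `.backup.<timestamp>` suffix is just one possible naming choice.

```shell
# Hypothetical helper: copy a file to a timestamped backup next to it and
# print the backup path. Returns non-zero if the copy fails or is empty.
backup_conf() {
    src="$1"
    dst="${src}.backup.$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dst" || return 1
    [ -s "$dst" ] || return 1
    echo "$dst"
}

# Usage with the path from this post. The existence check means the sketch
# does nothing on a machine that has no esx.conf.
conf="${ESX_CONF:-/etc/vmware/esx.conf}"
if [ -f "$conf" ]; then
    backup_conf "$conf"
fi
```

Keep the printed path handy; that is the file you would copy back if the edit goes wrong.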
Take a look at this KB too:
Now, open the conf file in vi:
Then locate the five lines (there could be more or fewer depending on the rule; check the rule number in the file).
Type i to enter insert mode and delete the lines. Press Esc, then J (capital) to join lines and remove the empty ones, to keep things clean. Press Esc again, then type :wq! to save the file. Reboot the ESXi host and the rule will be gone!
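Before opening the editor, it can help to know where to look. This is a small sketch that prints the line numbers of any lines containing curly double quotes; `find_curly_quote_lines` and `ESX_CONF` are hypothetical names of my own. Note the assumption: it only finds the line that holds the quoted PSP, so you still need to use the rule number on that line to find the remaining lines of the same rule.

```shell
# Hypothetical helper: print "lineno:line" for every line in a file that
# contains a curly double quote. Exits non-zero if none are found.
find_curly_quote_lines() {
    grep -n -F -e '“' -e '”' "$1"
}

# Usage with the path from this post. The existence check means the sketch
# does nothing on a machine that has no esx.conf.
conf="${ESX_CONF:-/etc/vmware/esx.conf}"
if [ -f "$conf" ]; then
    find_curly_quote_lines "$conf" || echo "no curly quotes found in $conf"
fi
```

In vi you can then jump straight to a reported line by typing its number followed by G in command mode.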
The lessons learned here should be:
- Don’t use --force unless you know why
- Edit the esx.conf file with great care
- Characters matter
UPDATE: I received a question that I think is worth sharing here:
The question: Could you not just remove the double quotes in the esx.conf file vs. having to delete the line and then re-add it? Just thought I would check to see if that is possible.
My answer: While it is possible, in my opinion, it won’t really help for a few reasons:
- You would still need to reboot the host for it to pick up the change
- Deleting is far less error prone than editing; you are more likely to screw it up via an edit than a delete
- When you enter a bad PSP, the iops=1 setting (that we recommend) gets dropped, so you would need to enter a whole new line for it, which is another set of problems.
So while it is possible, it really doesn’t get you much.