Testing New SRA Release with a 2nd SRM Pair

At the time of writing, we are at work on the next release of our Storage Replication Adapter for the FlashArray. In a discussion with a customer who needs the feature we are adding (what a nice coincidence!), the question came up: “What is the best way to test?” They want to test the SRA without fouling up their production SRM environment.

The simple answer is, well, deploy two new vCenters and an SRM pair. But that requires hosts, a similar network configuration, authentication, and so on. So they wanted to use their existing vCenters but NOT their existing SRM servers.

SRM used to be a fairly rigid tool (for good reason, let’s not break your DR). But in the past few years VMware has really opened it up: it loosened the tight coupling between vCenter version and SRM version, and added shared recovery sites and multiple SRM pairs per vCenter pair. This is where we come in.

Site Recovery Manager and ActiveCluster Part III: Creating Protection Groups and Recovery Plans

Now that all of the prerequisites are complete, it is time to start creating protection groups and recovery plans.

This is part 3 of this series; the earlier parts were:

Site Recovery Manager and ActiveCluster Part II: Configuring SRM

In my last post, I walked through configuring ActiveCluster and your VMware environment to prepare for use in Site Recovery Manager.

Site Recovery Manager and ActiveCluster Part I: Pre-SRM Configuration

In this post, I will walk through configuring Site Recovery Manager itself. There are a few prerequisites at this point:

  • Everything from Part I is done.
  • Site Recovery Manager is installed and paired.
  • Inventory mappings in SRM are complete (networks, folders, clusters, resource pools, etc.).
  • The FlashArray SRA 3.x or later is downloaded and installed on both SRM servers.

Multi-site FlashArray Configuration with Site Recovery Manager

The FlashArray Storage Replication Adapter for VMware Site Recovery Manager has supported many-to-many replication since the 2.0 release of the SRA. Use of test failover, failover and reprotect is no different than with 1:1, nor is the setup of the volumes. The only real difference is how you configure the array managers in SRM. So let’s review how this is done.

Querying SRM for Protected VMs with PowerCLI

I was recently asked how to query SRM for protected VMs, and I decided it would make a good quick blog post. There is a great post here on using PowerCLI with SRM, but it doesn’t show how to return per-virtual-machine information by default. That needs a bit more.

All it returns is an SRM-based virtual machine ID, which doesn’t relate to what a user is probably looking for (a virtual machine name). So it needs a few more simple steps. The following script, which can be found on my GitHub page here, does the following things:

  1. Connects to a vCenter
  2. Connects to SRM
  3. Creates a log folder with a time stamp in the name
  4. Iterates through each Protection Group
  5. Logs every virtual machine in that protection group

Site Recovery Manager 6 and Storage DRS Tagging: Part I–The Basics

VMware vCenter Site Recovery Manager 6.0 was mostly a compatibility release–getting it to work right with vCenter 6.0 essentially. That being said, there were a few new features (and some nice tweaks in the GUI) included in the release. One of the new features that sparked my interest was SRM and Storage DRS compatibility enhancements.

Ben Meadowcroft, a VMware PM who works on, amongst other things, SRM, blogged about this new feature here. Find the VMware KB here.

Ben covers most of the history of this in his post, so I will skip over that. Let’s take a closer look at this functionality, though. To start, there are three tags that SRM introduces to a datastore:

  • SRM-com.vmware.vcDr:::status (indicates that the datastore is replicated)
  • SRM-com.vmware.vcDr:::consistencyGroup (indicates what CG the datastore belongs to, if any)
  • SRM-com.vmware.vcDr:::protectionGroup (indicates what PG the datastore belongs to, if any)

The replication status tag is assigned as soon as SRM (and its respective Storage Replication Adapter) discovers the datastore to be replicated through a Device Discovery operation. Upon this discovery a consistency group tag is also assigned. If the volume is not advertised by the SRA as being in a consistency group, a unique one will be created for that volume, basically indicating it is in its own consistency group.

A protection group tag is not assigned until the volume is actually added to a protection group. Once the datastore is assigned to a protection group it will receive the tag (remember, a volume can only be in one PG and SRM only supports it being in one CG, so there will only ever be one of each to assign).

So what do these tags do? Well, Storage DRS will note these tags and not make any automatic moves if a Storage vMotion would violate any of them. This means it will not move a VM from one datastore to another if:

1) Source datastore is replicated and target is not
2) Source datastore is NOT replicated and target is
3) Source datastore is in a different consistency group than the target
4) Source datastore is replicated AND in a protection group, but target is replicated and NOT in a protection group

Basically, Storage DRS will not move a VM from one datastore to another if it deems the move would change the protection group configuration or the consistency of a virtual machine.
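
To make the four rules concrete, here is a minimal sketch in Python. This is my own toy model of the rules as described above, not how Storage DRS or SRM actually implements them; the Datastore class and the move_violation function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Datastore:
    """Hypothetical stand-in for a datastore and its three SRM tags."""
    name: str
    replicated: bool = False                 # status tag
    consistency_group: Optional[str] = None  # consistencyGroup tag
    protection_group: Optional[str] = None   # protectionGroup tag


def move_violation(source: Datastore, target: Datastore) -> Optional[str]:
    """Return why an automatic move would be blocked, or None if it is allowed."""
    # Rule 1: replicated -> non-replicated
    if source.replicated and not target.replicated:
        return "source is replicated, target is not"
    # Rule 2: non-replicated -> replicated
    if not source.replicated and target.replicated:
        return "source is not replicated, target is"
    # Rule 3: consistency groups differ
    if source.consistency_group != target.consistency_group:
        return "different consistency groups"
    # Rule 4: source is in a protection group, target is not
    if source.protection_group and not target.protection_group:
        return "source is in a protection group, target is not"
    return None
```

Calling move_violation with two datastores that share a consistency group and a protection group returns None, meaning nothing in this model would stop an automatic move. Any other combination returns the first rule that would be broken.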

So automatic Storage DRS will never make these moves. It may suggest them if it cannot find a better option, but it will never make a move that will violate these rules. If for some reason you want this to occur you can always override the warning and execute the operation.

Let’s take a look now at the relevant configurable behavior in SRM.

There are four options:

Setting Name                                   Description
storage.enableSdrsStandardTagCategoryCreation  Creates the three tag categories in vCenter for you.
storage.enableSdrsTagging                      Actually applies the tags to the datastores when discovered etc.
storage.enableSdrsTaggingRepair                Allows SRM to fix datastore tags when something has changed (PG/CG membership changes for instance).
storage.sdrsTaggingPollInterval                How often SRM checks tags to make sure they are accurate.

All of these options are enabled by default. Well, kinda: the last one is an interval, and it is just set to 50 seconds.

So, like the table says, the enableSdrsStandardTagCategoryCreation option is pretty straightforward: it creates the three categories. You can, of course, create them yourself if you choose to. I am not sure why you would, though, with the exception of the reason stated in the option description:

“In Federated SSO setups, this flag should be disabled and the tags and tag categories should be manually created.”

When enableSdrsTagging is enabled, SRM will place the correct tags at the appropriate times, that is, when a new device is discovered or its protection group membership changes.

The option enableSdrsTaggingRepair takes a little more thought. With it disabled, new tags will still be placed on datastores: replicated/CG tags during device discovery, PG tags upon adding a datastore to a new or different PG. But SRM will not fix or remove them. If you remove the datastore from a PG or delete the PG, the tag will remain. If you delete the SRM-provided tag and replace it with your own, SRM will not fix it. If you add the datastore to a new PG, SRM will remove an old tag if it exists and then give it the correct one, but it won’t ever do that unless you make that PG change.

A note about the repair functionality. If you decide to delete an SRM-provided tag and make your own, it will not last long if this feature is enabled. SRM will right things quite quickly (50 seconds or less). So if you want more control over this tagging for SRM-related devices, disabling this is an option. Of course, disabling this can easily lead to stale information in the tags, so do so at your own risk.
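
The difference between repair being on and off can be sketched as a single polling pass. Again, this is only my toy model in Python of the behavior described above, not SRM code; poll_once and its arguments are hypothetical names, and the tag dictionary is keyed by the protection group tag name listed earlier.

```python
from typing import Dict, Optional

PG_TAG = "SRM-com.vmware.vcDr:::protectionGroup"


def poll_once(tags: Dict[str, str],
              actual_pg: Optional[str],
              repair_enabled: bool) -> Dict[str, str]:
    """Model one poll interval: return the tags as they look after the pass.

    tags           -- current tags on the datastore (not modified)
    actual_pg      -- the protection group the datastore really belongs to, if any
    repair_enabled -- stands in for storage.enableSdrsTaggingRepair
    """
    result = dict(tags)
    if not repair_enabled:
        # Repair disabled: stale or hand-edited tags are left alone.
        return result
    if actual_pg is None:
        # Datastore is no longer in a PG: drop the stale tag.
        result.pop(PG_TAG, None)
    else:
        # Put back the tag that matches actual membership.
        result[PG_TAG] = actual_pg
    return result
```

With repair enabled, a leftover tag from a deleted PG disappears on the next pass; with it disabled, the same tag survives every pass until you change the PG membership yourself.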

In general, I think this is a great enhancement. I would like to see more granular control from the SRM side of things (enable/disable CG auto-tagging when a CG doesn’t exist for that device, for instance). This should also have a place in non-SRM environments; it’s just a bit more work because you have to do the tagging yourself.

In Part II, I will take a look at how this works with the FlashArray SRA and what’s involved in that.

Add Storage Wizard Slowness and Unresolved VMFS Volumes

This week I received a question from a customer about some slowness they were seeing in the vSphere “Add Storage” wizard. This is a problem that has occurred quite a few times over the years for a variety of reasons. VMware has fixed most of them, and luckily this latest cause is known and has a relatively simple solution: an option called VMFS.UnresolvedVolumeLiveCheck.

Pure Storage FlashArray SRA for Site Recovery Manager

I have been working with VMware’s vCenter Site Recovery Manager since the tail end of the 1.x release, and I have to say this is the most excited I can remember being about a Storage Replication Adapter release. Since I started with Pure in late April 2014 I have been working with our development team and product management to design and shape this initial release of the Pure Storage SRA. I have to say it has been a blast: a really great team that does some really amazing work! It is now officially approved and posted on VMware’s compatibility guide and SRA download site:

http://www.vmware.com/resources/compatibility/detail.php?productid=38264&deviceCategory=sra&details=1&partner=399

https://my.vmware.com/group/vmware/details?downloadGroup=SRM_SRA55&productId=451

Site Recovery Manager with PowerCLI Automation Gotcha

Quick post here. I have been reviewing some great posts from @vmKen and @BenMeadowcroft about automating Site Recovery Manager operations with PowerCLI and wanted to give it a try myself. They outlined the process rather clearly in their blogs, so it was a breeze to get most of the stuff up and running. But when I went to actually execute a test recovery or a recovery etc., it kept failing! The PowerCLI command to start the recovery was $VMrp.Start($RPmode), with $VMrp being my recovery plan and $RPmode being the recovery plan mode of a recovery. The command was accepted but the recovery plan never started.

I got the following error in vCenter:

Unable to start the requested operation. Another operation may be in progress. Please wait for it to finish and try again.

Hmm…weird. I could kick off a test from the GUI with no issue, so nothing was “interfering” from what I could tell. I thought that since I was using Site Recovery Manager 5.8 maybe something had changed, so I tried it with my 5.5 environment and got the same result.

Just as I was about to lose my mind, it finally occurred to me that I was connecting to the protected vCenter and the protected SRM server (I did enter remote credentials for the recovery SRM server, though). While I could query the recovery plan etc. without issue from here, maybe SRM didn’t allow a recovery plan to be started unless you connected directly to the recovery vCenter/SRM server.

So I reconnected to the recovery site and it worked! So I guess it makes a difference, so FYI. Now there might be a workaround to this and it is definitely possible I missed something that allows this but this seems to be what you need to do. If you find this isn’t true please let me know!

Thanks Ken and Ben for getting me started!! Cool stuff. Ken’s posts:

http://blogs.vmware.com/vsphere/2014/05/automate-failover-with-srm.html

http://blogs.vmware.com/vsphere/2014/05/srm-powercli-reporting.html

http://blogs.vmware.com/vsphere/2014/05/powercli-and-the-srm-api.html