Migrating a virtual machine that uses 100% virtual disks is a simple task thanks to VMware Storage vMotion, but migrating a VM that uses Raw Device Mappings (RDMs) from one array to another is somewhat trickier. There are options to convert an RDM into a virtual disk, but that might not be feasible for applications that still require RDMs. Other options are host-based or in-guest mechanisms that copy the data from one device to another, but those can be complex and may require special drivers or even downtime to fully finish the transition. To solve this issue for physical hosts, EMC introduced a Symmetrix feature called Federated Live Migration.
Federated Live Migration (FLM) allows the migration of a device from one Symmetrix array to another without any downtime to the host, and it does not affect the SCSI inquiry information of the original source device. Therefore, even though the device now resides on a completely different Symmetrix array, the host is none the wiser. FLM leverages Open Replicator functionality to migrate the data, so it has some SAN requirements: the source array must be zoned to the target array. An FLM setup looks like the image below:
In the most recent release of Enginuity this feature was extended to support VMware ESXi devices as well. It can be used for VMFS volumes or RDMs, but I feel RDMs are the more common use case and therefore that is what I will demonstrate here. Before walking through the process, what are the requirements? The source array must be running at least 5773 code (the most likely scenario) or 5875, and the target array must be no lower than 5876.229. Some fixes need to be applied to a 5773 array, so it must be at 5773.184.130 with fixes 65924 and 65510. The ESXi host must be 5.0 U1 or later and requires the VMware patch PR667599 for proper functioning. The source device must also be configured to use VMware NMP Round Robin (NMP Fixed is not supported and PowerPath/VE is not yet supported). Solutions Enabler 7.6 is required for FLM with VMware. For official documentation of these requirements see the Simple Support Matrix for FLM here:
FLM allows the source device to be thick and the target to be thin, which provides a non-disruptive way not only to migrate a device from one array to another but also to immediately leverage the benefits of Symmetrix Virtual Provisioning, such as FAST VP, after the migration. FLM also offers the ability to run zero detection during the migration so that large chunks of contiguous zeros are not copied to the target device, which is especially valuable when migrating to a thin device. Furthermore, the target device can be the same size as or larger than the source device, but it cannot be smaller.
Without further ado the following is a quick run-through of performing a FLM operation on a VMware RDM. For detailed instructions and further information refer to the Symmetrix Procedure Generator and the FLM Technote found here:
The environment I have set up is a DMX-4 and a VMAX 40K, with a 200 GB RDM assigned to a Windows Server 2008 R2 virtual machine.
First a few prep steps:
- For each fabric, create a zone from the FLM target FA ports to the FLM source FA ports. One-to-one zoning should be specified.
- Map, but do not mask, the target devices to the identified target ports that will be used for Open Replicator. Masking records of any kind for the target VMAX devices will prevent creation of the FLM session.
- Mask the FLM source devices to the FLM target FA ports.
- Adjust the ORS ceiling on the FLM target FA ports. For a 5773 source array, EMC recommends setting the ceiling limit for the target FA ports to 40 percent:

  symrcopy -sid 275 -dir 7e -p 0 set ceiling 40
- Create an FLM pair file. My text file looks like this:
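As a hedged sketch of what such a pair file contains: a standard Open Replicator pair file lists one control/remote device pair per line, with the control (VMAX target) device first. The device and array numbers below are illustrative, not the actual pairs used.

```shell
# VMAX target (control) device    DMX source (remote) device
symdev=000195700398:01EE  symdev=000190300207:01CD
```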
Once these steps are complete, the FLM session can be created. I ran the following command to do so. Note that zero detection was leveraged by adding the -frontend_zero parameter.
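As a hedged sketch of the create step (the pair file name is hypothetical, and exact option spelling may vary by Solutions Enabler version), an FLM session is created with symrcopy using the -migrate flag:

```shell
# create a hot-pull Open Replicator session in FLM (migration) mode,
# with zero detection enabled
symrcopy create -file flm_pairs.txt -copy -pull -migrate -frontend_zero -nop
```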
You can query the FLM session to get the current details by running the following:
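A hedged sketch of the query, using the same hypothetical pair file name:

```shell
# show session type, state, percent copied, and zero-detection status
symrcopy query -file flm_pairs.txt -detail
```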
As you can see, the session is created, it is a migration session, and zero detection is enabled. We can query the VMAX device and it will show that the identities of the DMX devices have been spoofed.
C:\>symdev -sid 398 list -identity -range 1ee:1ee

Symmetrix ID: 000195700398

              Device               FLG           External Identity
---------------------------------- --- ----------------------------------------
Sym   Physical      Config    Sts  IG  Array ID      Num    Ser Num     Cap (MB)
---------------------------------- --- ----------------------------------------
01EE  Not Visible   TDEV      RW   XX  000190300207  001CD  07001CD000         0

Legend:
 Flags:
  (I)dentity : X = The device has a non-native external identity set
               . = The device does not have an external identity set
  (G)eometry : X = The device has a user defined geometry
               . = The device does not have a user defined geometry
You can see that the VMAX devices (SN 398) now report the SN (207) and other information from the DMX devices. All the fields listed under the External Identity column for each device must match the associated source devices. In this example, the value of X under (I)dentity and (G)eometry in the FLG column indicates that the FLM target devices have both a user-defined external identity and geometry.
Now the VMAX target devices must be presented to the ESXi host via autoprovisioning groups. Once the SCSI bus of the ESXi host is rescanned you will notice the pathing of the RDM will change from showing only the original paths to include the new paths to the VMAX. Note that the new paths will be reported as “dead” at this point. Do not panic, that is expected!
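A hedged sketch of this presentation and rescan step (the storage group name is hypothetical, and this assumes the host already has an existing masking view on the VMAX):

```shell
# add the FLM target device to the ESXi host's storage group (name hypothetical)
symaccess -sid 398 -type storage -name esx_host_sg add devs 1EE

# rescan the SCSI bus on the ESXi host so the new paths appear
esxcli storage core adapter rescan --all
```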
In the image below you can see the transition of the path count and state from right before the create FLM operation/masking to after.
At this point the session can be activated by running the following command:
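A hedged sketch of the activate, again using the hypothetical pair file name:

```shell
# activate the FLM session: starts the copy and cuts host access over
# to the VMAX target devices
symrcopy activate -file flm_pairs.txt -nop
```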
This will start the Open Replicator session and begin to copy data from the DMX to the VMAX device. Continue to query the FLM session until the state becomes “Copied”.
The activate also sets the target devices to host access mode active and the source devices to host access mode passive, meaning that the host is now using the device on the VMAX. This can be verified by once again looking at the path states on the RDM in the vSphere Client. The paths to the DMX are now dead and the paths to the VMAX are active.
Once the FLM session reaches the state “Copied” all of the data is on the VMAX. The session can now be terminated. It is important to note that once the session is terminated, the DMX source device no longer receives donor update writes. As a result, you no longer have the ability to fail back to the old source device.
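A hedged sketch of the terminate (same hypothetical pair file; remember that this removes the ability to fail back):

```shell
# end the FLM session once the state is "Copied"; donor updates to the
# DMX source device stop at this point
symrcopy terminate -file flm_pairs.txt -nop
```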
Once the session is terminated the Open Replicator session is over and the VM is now solely using the VMAX. The non-disruptive migration of the RDM is complete! The DMX devices can now be removed from the host and from the Open Replicator ports leading to the VMAX. The VMAX devices can be unmapped from the ports leading to the DMX.
The host may run indefinitely with a spoofed identity/geometry; however, it is still recommended that the spoofed identity be removed as soon as it is convenient. The primary reason for unspoofing is to reduce potential confusion for administrators who may not be familiar with the details of FLM spoofing. This recommendation is simply for ease of management, as spoofing does not block the use of the vast majority of VMAX features (there are a few minor restrictions though, so check out the release notes). VMAX devices may remain federated indefinitely and there is no requirement to unspoof at any time.
That being said, spoofing will confuse some VMware integration products such as EMC Storage Viewer. Until the device is unspoofed it will still show up in Storage Viewer as the old DMX device. I haven't yet checked how this affects VMware vCenter SRM; that is something I plan to do at some point this summer.
Unspoofing requires downtime for the device, as it must be unmasked from the host and unmapped from the front-end ports first. So in the case of an RDM: remove it from the virtual machine, unmask/unmap it, and then unspoof. Then re-present it to the ESXi host and add it back to the virtual machine as a new RDM. The data will be preserved, of course.
C:\>symconfigure -cmd "set dev 01EE identity = no identity;" commit -nop -sid 398
C:\>symconfigure -cmd "set dev 01EE geometry = no geometry;" commit -nop -sid 398
Okay, that was a long one. Will try to keep them shorter in the future.
8 Replies to “Migrating a Raw Device Mapping with Federated Live Migration”
HMMM… for some reason I forgot that FLM needs source and target arrays to both be Symms. Might have confused myself with Federated Tiered Storage.
Thanks for all that info Cody…
Cody, very interesting post, as always thank you for providing the details.
Step 2) map but do not mask… I assume this uses the traditional symconfigure -cmd "map dev 0BAD to dir blah:blah" and not symaccess. Unless there is a way to use symaccess to map but not mask?
Step 5) do you have to specify the full Symmetrix serial number if both source and target have been discovered with SE? I've always used the last 4 digits of the serial for OR sessions.
Can you use "-precopy" with FLM?
Same zoning requirements as with an OR push: every FA used on the source device should be zoned to a target FA?
Glad you both liked it!
@Dynamox, unfortunately no. Symaccess does not provide a mechanism to map but not mask. It must be done with symconfigure
Nah, if the arrays are both discovered you only need to put in enough digits to indicate a unique array. I just put the whole thing in to make it clear I was referring to the array SNs.
No, as far as I am aware precopy is not supported with FLM. I will double check that though.
And yes, that is my understanding for the zoning requirements.
Have you tried changing the identity one path at a time to avoid downtime?
The command won’t work if the device is masked/mapped on any port to protect against corruption. Regardless the RDM would still need to be removed and re-added.
ahh, darn. Need to be able to swap that identity without downtime otherwise i can achieve the same thing using OR with hot push or hot pull. I personally would not want to leave those spoofed WWNs, nightmare to troubleshoot something at 2am.
Yeah, agreed. FLM is mostly about being able to delay the downtime until a more convenient window