Using Red Hat Enterprise Virtualization 3.6 (Live Migration)

At this point we’re firmly rooted in getting work done and doing more fun things in Red Hat Enterprise Virtualization. We’ve moved on from “deploying”: we’ve deployed RHEV, we’ve deployed resources, and we’ve even imported some resources from a completely different hypervisor/environment. Today, we’ll talk about and demonstrate another cornerstone feature of any enterprise virtualization platform:

Live migration

It goes by different names, depending on the brand/make of hypervisor or the marketing department behind it, but the driving force behind it is largely the same. Essentially, the running VM is relocated from one hypervisor to another without disruption to the applications or the users. Because virtualization has been a data center mainstay for many years now, we take this for granted, but it’s really kind of cool how it all works. Much like flying airplanes, we’ve been doing it for so long that we forget what goes into it. It’s still really cool when you stop and watch it work.

How does it work?

While I want to keep this post short and sweet, I’ll cover the basics and provide you with a link if you’d like to dig into the deeper details. Here is the rundown of how live migration works in KVM (and therefore RHEV):

It works in 3 basic stages:

  1. Marking all of the RAM dirty – This essentially sets everything up. All of the RAM for the VM is tagged “dirty” so that the next step copies all of it to the new hypervisor.
  2. Sending dirty RAM to the destination host until a certain condition is reached. This might be a low watermark or some other state. The guest keeps running (and writing to memory) during this step, so there is typically some RAM left that will still need to be pushed over. This leads us to the last step.
  3. Pausing the guest and transferring the remaining RAM and device state. There will always need to be a “cutover”, however brief. The guest and application may not feel it, but there has to be a fraction of a second where the “old” VM hands off to the “new” VM in order to give the illusion of a seamless transition.

If you follow the link I provide below, you’ll know that what I just wrote above is a huge over-generalization; the process is not trivial.
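With that caveat in mind, here is a toy sketch of the three stages in Python, just to make the loop concrete. This is not KVM’s actual code or algorithm. The function name, the page counts, the fixed “re-dirty” percentage, and the low watermark are all assumptions I made up for illustration.

```python
def simulate_precopy(total_pages=100_000, redirty_percent=10,
                     low_watermark=500, max_rounds=30):
    """Toy model of the three stages; returns (rounds, pages_sent, final_pages).

    All parameters are hypothetical illustration values, not KVM defaults.
    """
    # Stage 1: mark all guest RAM dirty so the first pass copies everything.
    dirty = total_pages
    pages_sent = 0
    rounds = 0

    # Stage 2: copy dirty pages while the guest keeps running. Each pass,
    # the guest re-dirties some memory, modeled here as a fixed percentage
    # of what was just sent.
    while dirty > low_watermark and rounds < max_rounds:
        pages_sent += dirty
        rounds += 1
        dirty = dirty * redirty_percent // 100

    # Stage 3: pause the guest and send whatever is still dirty (the cutover).
    final_pages = dirty
    pages_sent += final_pages
    return rounds, pages_sent, final_pages


if __name__ == "__main__":
    rounds, sent, final = simulate_precopy()
    print(f"{rounds} pre-copy rounds, {sent} pages sent, "
          f"{final} pages sent while paused")
```

Notice what happens if the guest dirties memory as fast as you can send it (try redirty_percent=100): the loop never gets under the watermark and only stops at max_rounds, leaving a big final transfer while the guest is paused. That is the intuition for why a busy VM on a slow migration network is harder to move.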

Tips for Seamless Live Migration

How do we mitigate risk in Live Migration? Great question. Regardless of your hypervisor or virtualization platform, the following guidelines will hold true.

Consistent settings for shared storage. All Live Migration requires shared storage, and for production, you likely want “enterprise” grade. Your storage vendor likely has “best” or “recommended” practices. Start with those, then test them to get a baseline. Then change the settings based on your environment and business needs and compare the new numbers against the original baseline. Determine whether they are better or worse, and whether the configuration is still supported. Oh, and then document the settings!!

Consistent settings for network configuration. If you’re using Ethernet storage, then you need to use 10GbE and VLANs. Use separate VLANs for each storage subnet. Use separate VLANs for management and Live Migration. Document everything.

Make sure that your hypervisors match. Instruction sets matter. In RHEV, you can use AMD and Intel in the same RHEV data center, but not in the same RHEV cluster. You can even use different Intel (or AMD) CPU generations in the same cluster, but it will force you to use the oldest (lowest common denominator) CPU type in order to guarantee that all VMs Live Migrating within the cluster will have the same underlying capabilities.
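To picture what “lowest common denominator” means, here is a small Python sketch that intersects the CPU feature flags of a few hosts. It is only an illustration: RHEV itself has you pick a named CPU type for the cluster rather than computing anything like this, and the host names and flag lists below are made up (the flag names are the kind you would see in /proc/cpuinfo on Linux).

```python
def common_cpu_flags(hosts):
    """Return the set of CPU flags that every host in the mapping supports."""
    flag_sets = [set(flags) for flags in hosts.values()]
    if not flag_sets:
        return set()
    return set.intersection(*flag_sets)


# Hypothetical hypervisors and flags, for illustration only.
hosts = {
    "hv1": {"sse4_2", "aes", "avx", "avx2"},
    "hv2": {"sse4_2", "aes", "avx"},
    "hv3": {"sse4_2", "aes"},
}

if __name__ == "__main__":
    print(sorted(common_cpu_flags(hosts)))
```

The newest host (hv1) loses avx and avx2 as far as the guests are concerned, because a VM that started using them could not safely land on hv3.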

Time. This is a silent killer. If your time and dates don’t match up between hypervisors, or between the hypervisors and the management platform, it could really mess things up. NTP is your special friend. Use it, configure it. You have been warned.
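If you want a quick sanity check, something like the following Python sketch works once you have collected each machine’s clock offset (in seconds, relative to the same reference) from your NTP client. The host names, the offsets, and the one-second tolerance are all assumptions for illustration; RHEV does not document this particular threshold, so pick one that suits your environment.

```python
def max_clock_skew(offsets):
    """Largest difference, in seconds, between any two hosts' clock offsets."""
    values = list(offsets.values())
    if not values:
        return 0.0
    return max(values) - min(values)


def hosts_in_sync(offsets, tolerance_seconds=1.0):
    """True if every host is within tolerance_seconds of every other host.

    tolerance_seconds is a hypothetical threshold, not a RHEV requirement.
    """
    return max_clock_skew(offsets) <= tolerance_seconds


# Hypothetical offsets gathered from each machine's NTP client.
offsets = {"rhevm": 0.002, "hv1": -0.010, "hv2": 3.5}

if __name__ == "__main__":
    print(f"max skew: {max_clock_skew(offsets):.3f}s, "
          f"in sync: {hosts_in_sync(offsets)}")
```

Here hv2 is the odd one out, and it is the sort of thing you want to catch before a migration rather than after.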

Enough talk, let’s see the demo. Like the other demos in this series, this one is straightforward. Check it out:

(Best viewed in fullscreen):

If you want to read more, Amit Shah wrote up an in-depth description of what really happens during KVM Live Migration. You can find it at this blog.

Hope this helps,

Captain KVM

Agree? Disagree? Something to add to the conversation?