Live Migration of Storage and Storage Connections for KVM & OpenStack

In this post, I want to cover 2 of the core features of NetApp Clustered Data ONTAP and tie them specifically to how they help KVM (and OpenStack) scale. While this is meant to be a tech discussion, not a sales pitch, I can’t help but point out that this article shows you some things you can do with NetApp that you can’t do with any other storage, be it “enterprise” or “commodity”.

I know, I know, that’s pretty bold talk, but I’m about to back it up. In order to provide a level set, here are some things that you should know and/or understand:

  • Data ONTAP is NetApp’s storage operating system, with the current version being 8.2
  • Clustered ONTAP is NOT the old (now referred to as 7-mode) ONTAP in a simple 2 node active/active cluster.
  • Clustered ONTAP uses active/active pairs as building blocks to create a global namespace for SAN and NAS.
  • Clustered ONTAP has a different architecture and command set as compared to ONTAP 7.x and earlier as well as ONTAP 8.x running in 7-mode.
  • All of the things that folks love about the old NetApp (dedupe, thin provisioning, cloning, Snapshots, etc.) are still there.

And for good measure, here is a visual representation of a 4-node cluster:

[Figure: a 4-node clustered Data ONTAP (cDOT) cluster]

From top to bottom, we see that the SAN/NAS clients attach to the storage by way of “LIFs”, or logical interfaces. These could be for Ethernet storage or for FC storage. We also see that the 4-node cluster is made up of 2 HA (active/active) pairs, linked by what is referred to as the cluster interconnect: a 10GbE network that carries both cluster traffic and data traffic. This is very important to understand in the context of the global namespace.

Under the storage controllers and the cluster interconnect, we see 4 stacks of disk shelves overlaid with “SVM 1” and “SVM 2”. A storage virtual machine, or SVM, is the secure container within Clustered ONTAP that owns storage volumes and storage interfaces (LIFs). As illustrated, an SVM can span the entire cluster by way of the cluster interconnect and the global namespace. Here is an example.

If the client on the far left is connected to the NetApp controller on the far left by way of LIF4, it can access the data (NFS export, CIFS share, iSCSI LUN, FC LUN) regardless of where the actual storage volume lives. For example, that far-left host using LIF4 will have no issue accessing the yellow volume at the far right bottom. The cluster interconnect simply routes the traffic there once Clustered ONTAP realizes the data is not on the local node. For the initiated, this is one of the primary benefits of a “global namespace”, and NetApp is not the only player here.
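
To make that concrete, here is a minimal client-side sketch, assuming a hypothetical data LIF at 192.168.10.14 and a volume junctioned into the SVM namespace at /vol_yellow. Nothing in the mount ties the client to the node that physically holds the volume.

```python
import subprocess

# Hypothetical values: any data LIF owned by the SVM will do, no matter which
# node actually hosts the volume; the cluster interconnect routes the I/O.
LIF_IP = "192.168.10.14"        # e.g. the address behind LIF4 (assumed)
JUNCTION_PATH = "/vol_yellow"   # assumed junction path of the yellow volume
MOUNT_POINT = "/mnt/yellow"

# A plain NFS mount; the client never needs to know the cluster topology.
subprocess.run(
    ["mount", "-t", "nfs", f"{LIF_IP}:{JUNCTION_PATH}", MOUNT_POINT],
    check=True,
)
```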

However, we’re about to go into uncharted territory. What if the NetApp controller on the far right is getting hammered? No problem; we look at the volume or volumes that are getting the most requests and make a decision. Do we want to move the offending volumes to a different node, or do we want to move the less important volumes instead? The decision will be based on business factors, but the action you take is the same: you live migrate the volume(s) from node 4 to whatever other node in the cluster you need. Again, your SAN/NAS clients can still access that data through their respective LIFs, so there is no disruption to storage access. The length of time it takes to move the volume depends on the size of the volume(s) and the amount of traffic on the cluster.
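
For reference, the move itself is a single clustershell command. Here is a rough sketch of driving it over SSH from a management host; the SVM, volume, and aggregate names (svm1, vol_busy, node2_aggr1) are made up for illustration.

```python
import subprocess

CLUSTER_MGMT = "admin@cluster-mgmt"   # assumed cluster management LIF and user

def ontap(cmd: str) -> str:
    """Run a clustershell command over SSH and return its output."""
    out = subprocess.run(["ssh", CLUSTER_MGMT, cmd],
                         capture_output=True, text=True, check=True)
    return out.stdout

# Kick off a non-disruptive move of the busy volume to an aggregate on node 2.
ontap("volume move start -vserver svm1 -volume vol_busy "
      "-destination-aggregate node2_aggr1")

# Watch the move job; clients keep reading and writing the whole time.
print(ontap("volume move show -vserver svm1 -volume vol_busy"))
```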

Let’s throw in another example. What if you wanted to replace the 2 nodes on the left with newer/faster models? No problem. The first thing to do is live migrate the LIF(s) on the far left to the second controller. Again, the storage clients can still access their data without interruption; LIF migration is near instantaneous. Now that the storage access is moved off of node 1, we shut it down and replace it with the new node. Then we live migrate all of the LIFs on node 2 over to node 1 and replace that node. Again, this is all done non-disruptively. Once node 2 is up and joined to the cluster, we spread the LIFs across the 2 nodes again. We never even touched the storage volumes in this example.
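
The LIF side of that procedure looks much the same. A sketch with hypothetical SVM, LIF, node, and port names: `network interface migrate` parks the LIF on another node’s port, and `network interface revert` sends it back to its home port once the replacement node is in place.

```python
import subprocess

CLUSTER_MGMT = "admin@cluster-mgmt"   # assumed cluster management LIF and user

def ontap(cmd: str) -> None:
    subprocess.run(["ssh", CLUSTER_MGMT, cmd], check=True)

# Evacuate the data LIF from node1 ahead of the hardware swap; clients stay
# connected to the same IP address throughout.
ontap("network interface migrate -vserver svm1 -lif lif4 "
      "-destination-node node2 -destination-port e0c")

# Once the new node1 has joined the cluster, send the LIF back home.
ontap("network interface revert -vserver svm1 -lif lif4")
```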

OK, one more example, then we tie this back to KVM and OpenStack. Your environment is ready to grow again, but because the new storage will be used for an existing project on the cluster, you don’t want that data spread across disparate controllers in the data center. Not a problem. Add an additional NetApp HA pair to the cluster. You’ll likely want to add some new LIFs to the cluster as well. Once your new cluster nodes are added (non-disruptively), you can both add new storage volumes and load balance the existing storage. A NAS-only cluster can grow to 24 nodes and thousands of petabytes. A SAN-only or mixed cluster can grow to 8 nodes and still grow into thousands of petabytes. Each HA pair can handle 256 LIFs (you’ll want to spread them equally), and each cluster can handle hundreds of SVMs.
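
For the “add some new LIFs” part of that step, here is a sketch along the same lines; the node names, ports, and addresses are made up.

```python
import subprocess

CLUSTER_MGMT = "admin@cluster-mgmt"   # assumed cluster management LIF and user

def ontap(cmd: str) -> None:
    subprocess.run(["ssh", CLUSTER_MGMT, cmd], check=True)

# Home a new NFS data LIF on each of the freshly added nodes so that client
# connections spread across the whole cluster. All values are illustrative.
for node, addr in [("node5", "192.168.10.15"), ("node6", "192.168.10.16")]:
    ontap(f"network interface create -vserver svm1 -lif {node}_nfs_lif "
          f"-role data -data-protocol nfs -home-node {node} -home-port e0c "
          f"-address {addr} -netmask 255.255.255.0")
```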

Hopefully you’ve already seen the possibilities for your virtualization and cloud environments. But just in case, here are some pointers. The fact that you can migrate volumes and LIFs on the fly means that your storage can move as fast as or faster than the applications it supports. Have a tier 1 database that you know is going to go into overdrive in the next few days? Go ahead and plan your volume migrations and be ready. Expecting a bunch of new hypervisors? Start adding LIFs to spread the access across the cluster so that the storage connection itself is not the bottleneck.

Need dual paths for your boot LUNs? Again, not an issue, as your initiators can log into LIFs on multiple nodes in the cluster, allowing the Linux multipath layer (dm-multipath) to handle path failover. Want to implement pNFS? Not a problem. Mount your pNFS client to any LIF in the cluster that has access to the SVM and volume, and pNFS will automatically maintain the optimal connection without human intervention. Move the pNFS export? No problem. pNFS will automatically reconnect elsewhere without interruption and without re-configuration.
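
From the hypervisor side, a pNFS mount is just an NFSv4.1 mount, and the boot-LUN check is the usual multipath one. A quick sketch with illustrative addresses and paths:

```python
import subprocess

LIF_IP = "192.168.10.21"                  # any data LIF with access to the SVM (assumed)
EXPORT = "/vm_datastore"                  # assumed junction path
MOUNT_POINT = "/var/lib/libvirt/images"   # typical KVM image store

# pNFS rides on NFSv4.1, so mount with the 4.1 minor version; the client then
# keeps the data path optimal on its own, even if the export moves.
subprocess.run(
    ["mount", "-t", "nfs4", "-o", "minorversion=1",
     f"{LIF_IP}:{EXPORT}", MOUNT_POINT],
    check=True,
)

# For FC/iSCSI boot LUNs, confirm dm-multipath sees paths through LIFs on more
# than one node before trusting it with failover.
paths = subprocess.run(["multipath", "-ll"], capture_output=True, text=True)
print(paths.stdout)
```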

All of this means that the number of KVM and OpenStack hypervisors can grow without creating bottlenecks at the storage connection. It means that different storage volumes (applications!!) with different SLAs can be placed accordingly on faster or slower controllers. Storage volumes and storage connections can be live migrated non-disruptively to balance and scale with your environment, pain free.

No one else can do that.

Hope this helps,

Captain KVM

4 thoughts on “Live Migration of Storage and Storage Connections for KVM & OpenStack”

  1. Hi Captain,

    We need to perform a live storage migration using LVM mirroring, and while performing it we saw huge I/O waits that made the (Oracle) database non-operational. Hence we decided to reduce the priority of lvconvert using ionice. Is it advisable to use ionice for the lvconvert process alone in a production environment?
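
    For concreteness, the sort of invocation in question would be something like the following sketch (the volume group and device names are placeholders, not our real ones):

    ```python
    import subprocess

    # Placeholder names: "datavg/oralv" is the LV being mirrored, the swing-kit
    # LUN is the target PV. Class 3 ("idle") asks the I/O scheduler to give the
    # mirror copy disk time only when nothing else is waiting.
    subprocess.run(
        ["ionice", "-c", "3",
         "lvconvert", "-m", "1", "--mirrorlog", "core",
         "datavg/oralv", "/dev/mapper/swing_lun"],
        check=True,
    )
    ```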

    1. Hi Vijay,

      Honestly, if you are having to resort to LVM mirroring as a way to do live storage migration, you might need to re-evaluate your storage solution and/or layout. If it’s housing an Oracle database, then the storage needs to be robust and flexible to start with. I get that not everyone can afford high-end EMC, NetApp, or Hitachi, but even the lower-end NetApp is really good and useful, and when compared with the cost of added disk shelves and of downtime from unexpected outages, it evens out.

      Anyway, the first question to ask is, “why am I migrating the volume?” If it’s because you weren’t getting the performance you needed, because you were sharing network I/O and/or storage I/O with other, lower-priority things, then this is a planning issue. If it’s purely a network problem, then look at the layout, and if you’re not already using 10GbE with jumbo frames and VLANs, you should be.

      Just out of curiosity, what platform are you doing live storage migration with? RHEL+KVM?

      Captain KVM

      1. Thank you, Captain, for the reply. Here are the necessary details.

        Issue description :

        We are planning to migrate a storage array without downtime for the attached Linux servers. Below is our approach (a rough command sketch follows the list):

        1. Attach new LUNs from the swing kit array
        2. Mirror at the host LVM level onto the swing kit array
        3. Once mirroring completes, remove the production storage array from LVM and move it to the new DC
        4. After the move, attach it back and remove the swing kit LUNs from LVM.
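
        Roughly, the per-volume-group commands behind steps 2 and 3 look like this (the volume group, LV, and device names below are placeholders):

        ```python
        import subprocess

        def run(*cmd: str) -> None:
            subprocess.run(cmd, check=True)

        SWING = "/dev/mapper/swing_lun"   # placeholder swing-kit LUN
        PROD = "/dev/mapper/prod_lun"     # placeholder production (VNX) LUN

        # Step 2: bring the swing LUN into the VG and add a mirror leg on it.
        run("pvcreate", SWING)
        run("vgextend", "datavg", SWING)
        run("lvconvert", "-m", "1", "--mirrorlog", "core", "datavg/oralv", SWING)

        # ...wait for Cpy%Sync to reach 100.00 in `lvs` output...

        # Step 3: drop the leg on the production array and pull it out of LVM.
        run("lvconvert", "-m", "0", "datavg/oralv", PROD)
        run("vgreduce", "datavg", PROD)
        run("pvremove", PROD)
        ```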

        Problem statement

        While doing this mirroring we are seeing high I/O wait times for application I/O, although the LVM mirror itself proceeds normally. We initially suspected storage performance, so we tried another, better storage array, with the same result. To avoid a direct performance impact on the application LUN, we tested against a backup LUN and simulated the workload by doing an RMAN backup. Before initiating the LVM mirror, the RMAN backup was normal and I/O wait% was also normal. After initiating the LVM mirror, the RMAN backup slowed down and I/O wait also increased.

        We logged a case with Red Hat to check if LVM mirror I/O can be deprioritized so that application I/O can be processed quickly. But the answer is no.

        Environment details as follows

        Server OS – RHEL [x86_64] 6.1
        Cluster – RHEL cluster
        Production storage array – VNX 5700
        Swing Kit array – CX4-480

        Can we use ionice to deprioritize the lvconvert command?

        Thanks & Regards,
        Vijay

        1. Vijay,

          So are you using RHEL 6.1 as a hypervisor with Red Hat Cluster Suite? If this is the case, then stop. Switch to Red Hat Enterprise Virtualization (RHEV). RHEV supports live storage migration. It also supports HA for important VMs like ones running Oracle, and can actually live migrate the VM running Oracle without dropping the connection. I know, because I’ve run the tests. When I still worked at NetApp, we used to set up virtualized Oracle RAC nodes on RHEV and NetApp, as well as single instance Oracle on RHEV and NetApp with really great results. Put the cluster suite away and switch to RHEV. RHEV will work with the EMC just fine.

          Captain KVM
