Supporting Multiple RHEV 3.0 Environments Simultaneously

So you’re planning out a single environment around the soon-to-be-released 3.0 version of RHEV. You know you’re eventually going to have multiple environments by the end of the year, and each environment will need varying levels of separation. You know you can set up iptables and SELinux, but that’s at the host and VM level. VLANs provide additional separation, but again, that’s only at the network level.

What can we do at the storage level to complement the virtualization and separation already in place at the compute and network layers, so that it supports multiple RHEV environments? And would it make high availability, scaling out, scaling up, and load balancing more difficult?

What if I told you that you could virtualize a NetApp controller, get that same separation at the storage layer, and not make any of those things more difficult?

MultiStore to the Rescue!!

It’s going to be difficult to make this not sound like a sales or product pitch, so bear with me and I’ll tie this back into RHEV ASAP. I promise.

<product speak>
Using NetApp’s MultiStore feature, we can carve up a single NetApp controller into multiple “vFilers”. Here are some defining characteristics and capabilities of vFilers (roughly sketched in code after the figure note below):

  • A lightweight instance of Data ONTAP
  • Maintains its own IP space and routing table
  • Authenticates against its own LDAP directory (Active Directory, etc.)
  • Can be migrated from one NetApp controller to another
  • Maintains its own Ethernet storage, separate from other vFilers (FC is not supported)

The figure below provides a logical representation of vFilers.
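
To make that list a bit more concrete, here is a rough sketch in Python of what a vFiler “is”, conceptually. This is not NetApp code or an actual API; the class and the names (vf_devtest, vf_prod, and so on) are made up purely to illustrate that each vFiler carries its own network identity, its own authentication target, and its own volumes, independent of any other vFiler on the same physical controller.

    from dataclasses import dataclass, field

    @dataclass
    class VFiler:
        """Conceptual model only: a lightweight Data ONTAP instance carved from one controller."""
        name: str
        ip_space: str                                   # its own IP space...
        routes: list = field(default_factory=list)      # ...and its own routing table
        ldap_domain: str = ""                           # authenticates against its own directory (AD, etc.)
        volumes: list = field(default_factory=list)     # volumes/exports owned by this vFiler alone
        protocols: tuple = ("nfs", "iscsi", "cifs")     # Ethernet storage only; FC stays on the base controller

    # Two vFilers on the same physical controller, each with a fully separate identity.
    dev_test = VFiler("vf_devtest", ip_space="ips_devtest", ldap_domain="dev.example.com",
                      volumes=["/vol/devtest_vm", "/vol/devtest_iso"])
    production = VFiler("vf_prod", ip_space="ips_prod", ldap_domain="corp.example.com",
                        volumes=["/vol/prod_vm", "/vol/prod_iso"])

The “migrate” bullet then just means picking up one of those bundles, identity and all, and landing it on another controller.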

There is much more to MultiStore and vFilers than this, but again, I don’t want this to be a “pitch”. This is meant to be a means of solving a problem from an architectural standpoint. So, having said that, I hereby provide the “product end tag” – </product speak>.

I Cut it Twice and it’s Still Too Short

As with anything, if it’s worth doing, it’s worth doing right. And while it might be worth doing a second time, nobody likes having to do something over. Even if no one is watching, you know the feeling that comes with realizing that you didn’t do something right and will have to start over. Here are some things that my Dad used to drill into my brother and me growing up:

  • In a jovial voice during any given carpentry-type project he would joke, “Dammit! I cut it twice, and it’s still too short!!”
  • The 5 P’s, “Prior Planning Prevents Piss-Poor Performance” (the 6th P doesn’t count because of the hyphen)
  • Measure twice, cut once

In other words, preparation and laying the required groundwork are prerequisites for success. The same holds true for any technology project, especially virtualization. We can’t necessarily predict the future, but we can certainly make decisions and implement technologies that leave us the most options down the road.

RHEV 3.0 is certainly easy enough to install and configure, with VMs up and running in minutes, but this discussion goes beyond the sandbox lab. This is less “kick the tires” and more “I’m going to have to live with this until they make me CTO and I can pawn it off on someone else”.

For example, you might know that you will have one RHEV environment that is ultimately used by a group of developers. You know that if it works well in “dev/test”, someone is likely going to want to try it out in production in the next 6-12 months. It may be tempting to think, “I’ll just carve out new LUNs or NFS exports when the time comes”. But don’t limit yourself to that line of thinking. Better yet, don’t limit your environment.

What I’m getting at here is the recommendation to use a vFiler for each environment. Think of it this way: the dev/test environment is humming along, and someone requests additional storage for a production RHEV environment. You carve off a LUN, an NFS export, or both from the same NetApp controller. Things seem to be going great until one or more of the following scenarios rears its ugly head:

  • The production RHEV sysadmin can “see” the dev/test storage and starts co-mingling virtual machines. (or dev/test can see production…)
  • The dev/test RHEV sysadmin can “see” the production storage and accidentally overwrites it.
  • dev/test VLANs, and therefore dev/test storage, are on the same network(s) as the production VLANs and the security team has a cow.
  • The production VMs and dev/test VMs are in different clusters but the same RHEV datacenter, and the production VMs are supposed to have I/O priority.

I could go on and on with the examples, but hopefully you get the idea. And yes, you could also solve some of this by limiting actions and views by login, but allow me to drive it home: replace the word “production” with “Coca-Cola” and “dev/test” with “Pepsi” and tell me this isn’t a recipe for disaster.

Yes, technically you could create different RHEV data centers, RHEV clusters, and RHEV logical networks, but again, that’s only part of the solution: those constructs live at the compute and network layers (hypervisor, VM, and logical networking), not in the storage underneath. The separation needs to exist not at one or two layers in the stack, but at every layer in the stack.

Rather than having to go through the pain, heartache, and lost weekends to re-do the environment, do it right the first time. Yes, we want “shared storage” in order to provide storage efficiency and storage virtualization, but we also need to provide layers of separation and security to balance things out. Best of both worlds and all of that.

RHEV & MultiStore – A Match Made in Techie Heaven

Here’s where we tie this back specifically to hosting multiple RHEV environments on the same NetApp controller. The example below shows a fairly basic design, but should illustrate some of our points nicely.

In dev/test, we’ve got our RHEV-H nodes booting over iSCSI, our VM storage on one NFS export, and our ISO storage on another NFS export. Our example production environment makes use of the same NetApp controller, but instead uses a separate vFiler from dev/test. The storage layout is the same, except that the production hypervisors boot over FCP.
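
Expressed as a small sketch (the vFiler, volume, and LUN names below are invented for illustration, not pulled from a real config), that layout might be captured like this:

    # Hypothetical layout for the two environments described above.
    environments = {
        "dev_test": {
            "vfiler": "vf_devtest",
            "hypervisor_boot": {"protocol": "iscsi",
                                "luns": ["/vol/devtest_boot/rhevh1", "/vol/devtest_boot/rhevh2"]},
            "vm_storage":  {"protocol": "nfs", "export": "/vol/devtest_vm"},
            "iso_storage": {"protocol": "nfs", "export": "/vol/devtest_iso"},
        },
        "production": {
            "vfiler": "vf_prod",
            # FC isn't available inside a vFiler, so these boot LUNs come from the base controller (see below).
            "hypervisor_boot": {"protocol": "fcp",
                                "luns": ["/vol/prod_boot/rhevh1", "/vol/prod_boot/rhevh2"]},
            "vm_storage":  {"protocol": "nfs", "export": "/vol/prod_vm"},
            "iso_storage": {"protocol": "nfs", "export": "/vol/prod_iso"},
        },
    }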

But you said I can’t use FCP!!!!

Actually, I didn’t; I said that vFilers can’t use FC. The base (physical) NetApp controller still has full support for FCP and FCoE. The whole reason we’re using vFilers is to provide security and separation for Ethernet storage while still utilizing the shared storage model. There is no loss of security or separation if any needed FC LUNs are presented from the base NetApp controller. Related boot LUNs can be contained (and deduplicated) in the same FlexVol, while unrelated data LUNs can be housed in their own FlexVols.
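
Sketched as data (again, hypothetical volume and LUN names), the base-controller side of that might look like the following; the point is simply that nearly identical boot LUNs live together so deduplication can collapse them, while unrelated data LUNs each get their own FlexVol.

    # Illustrative FC LUN layout on the base (physical) controller.
    base_controller_flexvols = {
        "/vol/prod_boot": {                 # all related hypervisor boot LUNs in one FlexVol...
            "dedupe": True,                 # ...so their nearly identical contents deduplicate well
            "luns": ["rhevh1_boot", "rhevh2_boot", "rhevh3_boot"],
        },
        "/vol/prod_data1": {"dedupe": False, "luns": ["app_db01"]},   # unrelated data LUNs...
        "/vol/prod_data2": {"dedupe": False, "luns": ["app_db02"]},   # ...each in their own FlexVol
    }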

How does this solve the problems described above?

Well, for one, we’ve guaranteed that Coke and Pepsi (or dev/test and production) data is completely separate. Each vFiler maintains its own routing table and IP space, and has the ability to authenticate to its own LDAP domain. Pushing down a little further, each vFiler can maintain its own VLANs and virtual network interfaces (VIF or ifgrp). Combined, this provides us with clear separation and some of the pieces required for security*.

*Remember, security is not a product. It’s the careful orchestration of multiple layers of technology, procedure, and policy that requires constant challenge and evolution.

This helps to protect the underlying storage from both intentional and unintentional security breaches. Additionally, our vFilers can now be logically aligned with the different environments using the resources. An environment could be defined as a separate RHEV data center, or as multiple instances of RHEV-M. Regardless of the definition, we can use a separate vFiler for each RHEV data center, or one or more vFilers to support multiple RHEV-M instances.
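
One way to picture that alignment (purely illustrative names again): each RHEV data center, or each RHEV-M instance, points at its own vFiler, and a quick sanity check can confirm that no vFiler, VLAN, or export is shared between environments.

    # Hypothetical alignment of RHEV environments to vFilers, with their VLANs and NFS exports.
    rhev_environments = {
        "rhev_dc_devtest": {"vfiler": "vf_devtest", "vlans": {101, 102},
                            "exports": {"/vol/devtest_vm", "/vol/devtest_iso"}},
        "rhev_dc_prod":    {"vfiler": "vf_prod",    "vlans": {201, 202},
                            "exports": {"/vol/prod_vm", "/vol/prod_iso"}},
    }

    def check_separation(envs):
        """Fail loudly if any two environments share a vFiler, a VLAN, or an export."""
        names = list(envs)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                assert envs[a]["vfiler"] != envs[b]["vfiler"], f"{a} and {b} share a vFiler"
                assert not envs[a]["vlans"] & envs[b]["vlans"], f"{a} and {b} share a VLAN"
                assert not envs[a]["exports"] & envs[b]["exports"], f"{a} and {b} share an export"

    check_separation(rhev_environments)   # silence here means the storage layer really is separate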

Add in the ability to migrate vFilers between different NetApp controllers, and the issues around high availability and load balancing aren’t as daunting either. If one NetApp controller comes under consistently increasing load, vFilers can be migrated non-disruptively to other NetApp controllers to balance out the storage load.
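
As a toy illustration of that load-balancing decision (invented load numbers and names; the actual migration is a Data ONTAP operation, not a script like this), the reasoning is roughly:

    # Hypothetical load per physical controller, and which vFilers currently live where.
    controller_load = {"netapp_a": 0.85, "netapp_b": 0.40}
    vfiler_home = {"vf_devtest": "netapp_a", "vf_prod": "netapp_a", "vf_qa": "netapp_b"}

    def pick_migration(controller_load, vfiler_home):
        """Suggest moving one vFiler from the busiest controller to the least busy one."""
        busiest = max(controller_load, key=controller_load.get)
        calmest = min(controller_load, key=controller_load.get)
        if busiest == calmest:
            return None
        candidates = [v for v, home in vfiler_home.items() if home == busiest]
        # Because a vFiler bundles its own IP space, exports, and LUNs, moving it relocates an
        # entire environment's storage personality without touching the other environments.
        return (candidates[0], busiest, calmest) if candidates else None

    print(pick_migration(controller_load, vfiler_home))   # e.g. ('vf_devtest', 'netapp_a', 'netapp_b')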

The other important point to bring up is that while you can add the MultiStore feature at any time, it’s best to implement it at the beginning, in order to avoid the disruption of moving data and having it temporarily unavailable later on.

How about Scaling Out?

Scaling out is as easy as creating a new vFiler. More importantly, you can create vFilers as they are needed, without affecting the existing storage or environments. You can then move those vFilers around as necessary in order to scale your environment. Among other things, this means that you don’t have to be a psychic when planning out your storage for the year. It just means that you need to build the proper foundation.
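
In the same spirit as the earlier sketches (hypothetical names, not an actual provisioning API), scaling out is just one more entry alongside the existing environments; nothing about dev/test or production has to change:

    # Existing environments, abbreviated from the earlier layout sketch.
    environments = {
        "dev_test":   {"vfiler": "vf_devtest"},
        "production": {"vfiler": "vf_prod"},
    }

    # Adding a third environment later means a new vFiler; the existing entries are untouched.
    environments["staging"] = {
        "vfiler": "vf_staging",
        "vm_storage":  {"protocol": "nfs", "export": "/vol/staging_vm"},
        "iso_storage": {"protocol": "nfs", "export": "/vol/staging_iso"},
    }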

Again, we’re trying to address the problem of providing proper separation in the planning stages of supporting multiple RHEV environments. We can apply VLANs, firewalls, role-based logins, and policies to the VM, hypervisor, and network layers, but if we don’t apply complementary technologies and settings at the storage layer, the job is incomplete. It’s akin to going through the trouble of separating your recyclables into different bins for paper, plastic, and aluminum, only to throw them all in the same dumpster.

So “measure twice” and cut once. Do it right the first time, and be done with it. Be a beacon of all things pre-planned, correct, and smug. Incorporate vFilers in the architecture planning for your multiple RHEV environments.
