Don’t kickstart – Clone!!!!

So you have the perfect “Golden Image” built by way of a kickstart file.  You have a dedicated network for your installs and your boot disk is a NetApp LUN.  You’ve set every conceivable tunable to ensure that every subsequent build via PXE install is under 10 minutes, including post configurations in “%post”.

Guess what.  I can beat it by 9.5 minutes.  Consistently.

If you’re smart enough to use NetApp for your boot LUN, you’re smart enough to use FlexClone to do the rest of the work for you after your initial Kickstart.  Work smarter, not harder, right?  Allow me to explain.  Let’s back up to the first sentence of this paragraph…  “smart enough to use NetApp for your boot LUN…”

If your boot LUNs are on a NetApp controller, housed in the same FlexVol, then you can take advantage of Thin Provisioning, Dedupe, and SnapShot copies, all of which reduce the physical storage actually used by your boot LUNs.  But let’s take this a step further and tie this back into sub-30 second boot LUNs.

Let’s go back to the concept of the “Golden Image”.  You’ve spent some time picking the right packages, subtracting the wrong ones, addressing file system alignment where needed, adding in your clever “%post” scripts, and you’re ready to go.  The end result is a host that is ready to rock – possibly even “turn key”.  Every time you Kickstart a new RHEL system with this image, you marvel at your ingenious use of “awk” and “sed” to custom edit various configuration files.

I’ve got a better idea.  Clone it.

You’ve already created the system once.  Why create it again? And again? And again?  Save yourself and your users a lot of time and clone your LUN.  Use Kickstart for the initial build of the Golden Image/template, but then use FlexClone to rapidly deploy cloned instances of that template.  I mentioned earlier the ability to create SnapShot copies – a read-only copy of the entire FlexVol (including the LUNs in that FlexVol).  This time, we’re going to use FlexClone – a writable snapshot copy of a FlexVol, LUN, or file.

Before we get there, we have a little prep work to do first.  When we’re done, you’ll see that it is actually part of the process of creating that golden image.  Besides, you can likely script this part.  So what is the prep work?  What is so important that I can’t just show you the clone commands? (cue dramatic music..)

Configuration artifacts.

Before you accuse me of going all “archeology” on this article, allow me to illuminate.  Every time a system is installed, there are several things configured in the operating system that would cause problems if they were cloned along with the rest of the system.  For example, imagine 20 servers cloned from the same LUN that all share the same SSH host keys.  (If you can’t see the issue here, please go take a class on remedial computer security.)  Or the same 20 servers that all have the same WWID hard-coded in the initramfs (or initrd for RHEL 5).  (I’ll give you a hint – it breaks multi-pathing.)  There are other things as well, and we’re about to cover them.

Here is the workflow for the entire process, starting with the creation of the golden image:

1. Perform an installation of a server that uses a NetApp LUN to boot from, then perform the necessary updates, configurations, edits and other things required for your golden image.
2. Clean out all static artifacts – these are the things that once configured (or edited) don’t change unless manually changed again

  • Strip hostname, gateway, and IP information (/etc/hosts, /etc/sysconfig/network, /etc/sysconfig/network-scripts/ifcfg-{eth*,br*,bond*})
  • Strip MAC addresses from ethernet configuration files (/etc/sysconfig/network-scripts/ifcfg-{eth*,br*,bond*})
  • If registered to RHN (including Satellite & Proxy), strip the ‘systemid’ (/etc/sysconfig/rhn/systemid)
  • Strip the iSCSI initiator name in /etc/iscsi/initiatorname.iscsi (can be re-generated with the ‘iscsi-iname’ command)
  • Replace LUN WWID with either a NetApp friendly wildcard or full wildcard in multipath configuration (/etc/multipath.conf) (wwid “360a98000572d4273685a664462667a36” becomes wwid “360a9*”)
  • Rebuild initrd (RHEL 5) or initramfs (RHEL 6) (mkinitrd for RHEL 5, dracut for RHEL 6)
  • Clear out multipath bindings (RHEL 6) (/etc/multipath/bindings)
  • Label boot device and direct system to boot from the label and not a UUID (RHEL 6) or path (RHEL 5) (`e2label` & /etc/fstab)

NOTE: Nothing says you can’t script this (and the next step) to be part of the Kickstart %post

3. Clean out all dynamic artifacts – these are the things that are automatically repopulated or reconfigured on boot if they are not set or deleted prior

  • Clear out LVM cache (/etc/lvm/cache/*)
  • Remove UDEV rule for ethernet device assignment (RHEL 6) (/etc/udev/rules.d/70-persistent-net.rules)
  • Remove remaining “persistent” UDEV rules (RHEL 6) (/etc/udev/rules.d/*-persistent-*.rules)
  • Remove SSH host keys (/etc/ssh/ssh_host*)

4. Clone LUN with FlexClone at will (once for each new physical server being added)
5. Map new LUNs to new physical servers
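
The static cleanup in step 2 could be scripted roughly as follows. This is only a sketch under assumptions: the function name `strip_static_artifacts`, its optional root-directory argument, and the placeholder hostname are mine, not part of any NetApp or Red Hat tooling. It leaves out the /etc/hosts edit, the initrd/initramfs rebuild, and the boot device labeling, which depend on your layout.

```shell
#!/bin/bash
# Sketch: strip static artifacts from a golden image before cloning.
# The root argument is a hypothetical convenience (default "/") so the
# function can be tried against a scratch copy of /etc first.
strip_static_artifacts() {
    local root="${1:-/}"
    local etc="${root%/}/etc"
    local f

    # Drop MAC, UUID, IP, netmask and gateway lines from interface configs.
    for f in "$etc"/sysconfig/network-scripts/ifcfg-eth* \
             "$etc"/sysconfig/network-scripts/ifcfg-br* \
             "$etc"/sysconfig/network-scripts/ifcfg-bond*; do
        [ -f "$f" ] || continue
        sed -i -r '/^(HWADDR|MACADDR|UUID|IPADDR|NETMASK|PREFIX|GATEWAY)=/d' "$f"
    done

    # Reset the hostname (placeholder value is an assumption) and drop the gateway.
    if [ -f "$etc/sysconfig/network" ]; then
        sed -i -r -e 's/^HOSTNAME=.*/HOSTNAME=localhost.localdomain/' \
                  -e '/^GATEWAY=/d' "$etc/sysconfig/network"
    fi

    # Remove the RHN system ID and the iSCSI initiator name, if present.
    rm -f "$etc/sysconfig/rhn/systemid" "$etc/iscsi/initiatorname.iscsi"

    # Replace a hard-coded NetApp WWID with the "360a9*" wildcard.
    if [ -f "$etc/multipath.conf" ]; then
        sed -i -r 's/^([[:space:]]*wwid[[:space:]]+)"?360a9[0-9a-f]+"?/\1"360a9*"/' \
            "$etc/multipath.conf"
    fi

    # Empty the multipath bindings file (RHEL 6).
    if [ -f "$etc/multipath/bindings" ]; then
        : > "$etc/multipath/bindings"
    fi
    return 0
}
```

Calling `strip_static_artifacts /tmp/scratch` against a copy of the image’s /etc is a low-risk way to check the result before running it with no argument on the golden image itself.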

NOTE: Remember – anything under the list of “dynamic” artifacts needs to be removed every time you reboot that golden image.  For example, if you boot up the golden image in order to install a new package or change something, then the dynamic artifacts need to be removed/cleared out again.
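
Since the dynamic artifacts have to be cleared after every boot of the golden image, a small function that can be rerun just before shutdown may help. The sketch below is an assumption-laden example, not an official procedure: the name `clear_dynamic_artifacts` and the optional root-directory argument are hypothetical, and the extra `.cache` removal reflects my understanding that LVM keeps its cache in a hidden file that a plain `*` glob would miss.

```shell
#!/bin/bash
# Sketch: clear dynamic artifacts from the golden image before shutdown.
# The root argument is a hypothetical convenience (default "/").
clear_dynamic_artifacts() {
    local root="${1:-/}"
    local etc="${root%/}/etc"

    # LVM cache; ".cache" is named explicitly because "*" skips dotfiles.
    rm -f "$etc"/lvm/cache/* "$etc"/lvm/cache/.cache

    # Persistent udev rules, including 70-persistent-net.rules (RHEL 6).
    rm -f "$etc"/udev/rules.d/*-persistent-*.rules

    # SSH host keys; each clone generates its own on first boot.
    rm -f "$etc"/ssh/ssh_host*
}
```

Run it as the last step before powering off the golden image, so that nothing repopulates the files between the cleanup and the clone.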

So, let’s move on to the actual cloning part!

Unlike SnapShot, Dedupe, and Thin Provisioning, FlexClone requires an additional license, so this is part of the prep-work, along with enabling Dedupe on the FlexVol:

license add <license_number>
sis on /vol/vol_BootVol

Then enter the advanced privilege level and clone it:

priv set adv
clone start /vol/vol_BootVol/BootLUN_AppSrv /vol/vol_BootVol/BootLUN_AppSrv01 -n -l

This will generally take less than 30 seconds – many times it will be even faster depending on the size of the LUN.
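
When several servers are being added at once, the clone command can be generated per host instead of typed by hand. The helper below is a hypothetical sketch: `clone_cmds` and the `BootLUN_<name>` naming pattern are my assumptions, modeled on the example paths above. It only prints the commands; how they reach the controller (console, SSH, API) is left to you.

```shell
#!/bin/bash
# Sketch: print one "clone start" line per new server.
# Usage: clone_cmds <template_lun_path> <name> [<name> ...]
clone_cmds() {
    local template="$1"
    shift
    local name
    for name in "$@"; do
        # The clone lands in the same FlexVol as the template, named
        # BootLUN_<name> (a naming convention assumed for this example).
        printf 'clone start %s %s -n -l\n' \
            "$template" "${template%/*}/BootLUN_${name}"
    done
}
```

For example, `clone_cmds /vol/vol_BootVol/BootLUN_AppSrv AppSrv01 AppSrv02` prints the command shown above for AppSrv01, followed by the matching line for AppSrv02.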

After the LUN is cloned, it’s a matter of creating a new “igroup” and mapping that igroup to the cloned LUN.  You might argue that the mapping adds to the 30 seconds – but you were going to have to do this anyway.  Besides, you probably have a clever script for that as well.  (or even better, you do this through the NetApp API…)
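
The igroup and mapping step might look like the sketch below, which prints the two 7-Mode commands for one server. Treat the details as assumptions to check against your environment: the function name `map_cmds` is hypothetical, `-f` assumes an FCP igroup (an iSCSI igroup would use `-i` with an IQN instead of a WWPN), and LUN ID 0 is simply a common choice for a boot LUN.

```shell
#!/bin/bash
# Sketch: print the igroup and LUN mapping commands for one cloned LUN.
# Usage: map_cmds <lun_path> <igroup_name> <initiator_wwpn>
map_cmds() {
    local lun="$1" igroup="$2" wwpn="$3"
    # -f creates an FCP igroup; -t linux sets the OS type.
    printf 'igroup create -f -t linux %s %s\n' "$igroup" "$wwpn"
    # Map the clone at LUN ID 0 (assumed here for a boot LUN).
    printf 'lun map %s %s 0\n' "$lun" "$igroup"
}
```

Paired with the clone step, this keeps the whole deployment to a handful of generated commands per server.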

So that’s the rub – don’t kickstart when you can clone.  (Work smarter, not harder.)  But wait you say… didn’t I mention something about cloning files with FlexClone?  And how does this relate to KVM?  That will have to wait.  For the next article….

c.k.

7 thoughts on “Don’t kickstart – Clone!!!!”

  1. And how would you manage post-installation config changes? If you have a tool like puppet with all modules ready, why not also use it to configure a new kickstarted server? You are more flexible this way and you don’t have to mind unique identifiers.

    1. Hi,

      Actually, there’s no reason not to use puppet and cloning together. Cloning a boot disk is still faster than deploying any other way (unless you’re running a true stateless environment). If you have to create a new boot LUN, you might as well have the OS already on it as it will only add a fraction of a second to the process. However, performing config changes with puppet may very well be the fastest way to perform the post-config.

      Captain KVM

      1. Thanks for the good info provided. I have a question: I have a boot LUN template on a NetApp filer, and whenever we need to install the OS we clone the LUN from the boot LUN template and the OS comes up fine. But all the servers whose OS was cloned from the boot LUN are getting the same UUID.

        Now we have all servers with the same UUID inside the OS in /etc/fstab. Is it advisable to have the same UUID for all?

        Please help me ASAP, that will help me a lot.

        1. Hi Ramesh,

          Thanks for stopping by. There are a number of things you can do – one of them is in the article itself:
          * Label boot device and direct system to boot from the label and not a UUID (RHEL 6, 7) or path (RHEL 5) (`e2label` & /etc/fstab)

          Give that a try,

          Captain KVM

  2. Thanks for sharing the info. This is really a nice post on Kickstarter. Yes, that’s right that competing with Kickstarter itself by making a Kickstarter clone is not easy, because Kickstarter is a very popular and well-known crowdfunding and fundraising website.

    1. Hi,

      I think you’re a little confused. Kickstart and Kickstarter are 2 totally different things. You’re clearly familiar with Kickstarter, the crowdfunding site. Kickstart is a Linux utility that is used for automating Linux installations.

      hope this helps,

      Captain KVM

      1. Ohh.. Okay.. then I’m sorry, I thought the description was related to the Kickstarter website, and when I was reading I also read Kickstarter somewhere, so I wrote it.

        btw, thanks for the clarification. 🙂

        Thanks
