How to clone availability in the cloud with better outcomes

Tips from the movies — Multiplicity

Multiplicity is a 1996 American science fiction comedy film starring Michael Keaton as Doug Kinney, a busy construction worker struggling to make time for both his family and his demanding job. When a scientist offers to clone him, Doug agrees, hoping a copy will make his schedule and commitments easier to meet. But then the copies begin making copies of themselves. By the time the last copy is made, the point is clear: cloning may not be all it’s cracked up to be, or at the very least it comes with strong warnings, challenges, and side effects. The famous original Star Trek episode “The Trouble with Tribbles” illustrates a similar point.

Tips for how to get better outcomes when you clone your availability system

1. Clone operational systems

This sounds obvious, but I have seen broken systems cloned more than once in real enterprise environments. If you clone a non-functional system, the clone will be just as non-functional and problematic when you restore it. Make sure the clone comes from an operational, functional system.

2. Sync data to disk and resync on restore

File system integrity is critical. If you don’t ensure that your application and/or VM are in a consistent state, most vendors will not guarantee the resulting image. Snapshots capture only the data that has been written to your volume at the time the snapshot command is issued, so they may exclude data still cached by applications or the operating system. Making sure data has been properly synced to the file system is an important step, and it is absolutely critical in a cluster environment.
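As a minimal sketch of what "synced to disk" means at the application level, the Python below flushes a file's buffered data and then asks the operating system to commit it before a snapshot is taken. The function names and the idea of calling them from a pre-snapshot hook are assumptions for illustration; databases and clustered applications usually provide their own quiesce or freeze mechanisms, which should be preferred where they exist.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data and push it through both the application and OS caches."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's userspace buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit its cache to the device


def flush_all_before_snapshot() -> None:
    """Ask the OS to write out all dirty buffers (os.sync is Unix only)."""
    if hasattr(os, "sync"):
        os.sync()
```

A pre-snapshot script might call `flush_all_before_snapshot()` after the application has stopped writing and immediately before issuing the snapshot command.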

3. Stop your instance

Many environments do not require you to stop an instance to create an image, and some, such as AWS, will power down the node before making the copy. However, many tools and sites recommend making sure applications are stopped and file system writes are properly synced first, to avoid damage, loss of integrity, or images that have trouble starting, stopping, or running installed applications.

4. Label everything in the cloud (nodes, disks, NICs, everything)

While creating a clone is a free operation, the resulting disks and components typically are not. AWS states, for example, that you are “charged for the snapshots until you deregister the image and delete the snapshots.” When things aren’t labeled, knowing what is in use, what is not, and why it was created depends on the fleeting memories and divided attention of whoever is still on the team. Label everything.
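One way to keep labeling honest is to audit for it. The sketch below checks an inventory for resources that are missing required labels. The tag names (`owner`, `purpose`, `created`) and the inventory records are made up for illustration; in practice the list would come from your cloud provider's inventory or billing export.

```python
from typing import Dict, List

# Hypothetical tagging policy; adjust the keys to your own standard.
REQUIRED_TAGS = ("owner", "purpose", "created")


def find_unlabeled(resources: List[dict], required=REQUIRED_TAGS) -> Dict[str, List[str]]:
    """Map each resource id to the required tags it is missing or left blank."""
    report = {}
    for res in resources:
        tags = res.get("tags", {})
        missing = [key for key in required if not str(tags.get(key) or "").strip()]
        if missing:
            report[res["id"]] = missing
    return report


# Made-up inventory covering a snapshot, a disk, and a NIC.
inventory = [
    {"id": "snap-001", "type": "snapshot",
     "tags": {"owner": "dba-team", "purpose": "pre-upgrade backup", "created": "2024-01-15"}},
    {"id": "vol-002", "type": "disk", "tags": {"owner": "dba-team"}},
    {"id": "nic-003", "type": "nic", "tags": {}},
]

for res_id, missing in find_unlabeled(inventory).items():
    print(f"{res_id}: missing {', '.join(missing)}")
```

Running a check like this on a schedule turns "label everything" from a good intention into something the team can verify.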

5. Prune clones and snapshots often (cost savings and headache savings)

Pruning old snapshots and clones saves money, and it also saves headaches. Older snapshots risk reintroducing vulnerabilities that have been addressed or resolved in newer copies. As VP of Customer Experience at SIOS Technology Corp., I saw the consequences firsthand when we worked with a customer who restored from a snapshot. After restarting the application, they ran into several problems. After troubleshooting, we determined that the clone was running an older version of security software, and that the cached credentials and metadata stored in the user profile were no longer in sync with the actual application data stored on the externally mounted data drives.
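A pruning policy can be as simple as "delete anything older than N days, but always keep the newest few." The sketch below only selects candidates and deletes nothing. The `Snapshot` record, the 30-day retention window, and the keep-latest count are assumptions for illustration, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Snapshot:
    snapshot_id: str
    created: datetime


def select_for_pruning(snapshots: List[Snapshot], now: datetime,
                       max_age_days: int = 30, keep_latest: int = 3) -> List[Snapshot]:
    """Return snapshots older than max_age_days, always sparing the newest keep_latest."""
    newest_first = sorted(snapshots, key=lambda s: s.created, reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    return [s for s in newest_first[keep_latest:] if s.created < cutoff]


# Made-up snapshot history for illustration.
history = [
    Snapshot("snap-a", datetime(2024, 5, 30)),
    Snapshot("snap-b", datetime(2024, 5, 1)),
    Snapshot("snap-c", datetime(2024, 3, 1)),
    Snapshot("snap-d", datetime(2024, 1, 1)),
    Snapshot("snap-e", datetime(2023, 1, 1)),
]

for snap in select_for_pruning(history, now=datetime(2024, 6, 1)):
    print(f"candidate for deletion: {snap.snapshot_id} ({snap.created:%Y-%m-%d})")
```

Keeping selection separate from deletion lets someone review the candidate list, and check its labels, before anything is removed.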

6. Limit or restrict cloning of clones in the cloud

Lastly, not everything you do in the cloud needs to be cloned. Consider limiting the types of workloads that you will clone, and restrict the number of roles that can create clones in your environment.
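The restrictions above can be sketched as a simple policy check run before any clone request is honored. Everything here is hypothetical: the role names, the workload types, and the `cloned_from` lineage field are illustrative assumptions. In a real environment these rules would normally live in your cloud provider's access policies rather than in application code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical allow-lists; substitute your own roles and workload types.
ALLOWED_ROLES = {"platform-admin", "backup-operator"}
CLONEABLE_WORKLOADS = {"database", "app-server"}


@dataclass
class Instance:
    name: str
    workload: str
    cloned_from: Optional[str] = None  # None means this is an original


def may_clone(role: str, source: Instance) -> Tuple[bool, str]:
    """Return (allowed, reason) for a request by `role` to clone `source`."""
    if role not in ALLOWED_ROLES:
        return False, f"role '{role}' may not create clones"
    if source.workload not in CLONEABLE_WORKLOADS:
        return False, f"workload '{source.workload}' is not approved for cloning"
    if source.cloned_from is not None:
        return False, "source is itself a clone; clone the original instead"
    return True, "ok"


original = Instance("db-prod-1", "database")
copy_of_copy = Instance("db-prod-1-clone", "database", cloned_from="db-prod-1")

print(may_clone("backup-operator", original))
print(may_clone("backup-operator", copy_of_copy))
print(may_clone("developer", original))
```

The lineage check is the Multiplicity rule in code: copies are made from the original, not from other copies.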

Cassius leads the Customer Experience team at SIOS Technology responsible for customer success.
