How to clone availability in the cloud with better outcomes
Tips from the movies — Multiplicity
Multiplicity is a 1996 American science fiction comedy film starring Michael Keaton as Doug Kinney, a busy construction worker struggling to make time for his family and his demanding job. When a scientist offers to clone him, Doug agrees to just make meeting his schedule and commitments easier. But then the copies of him begin making copies of themselves. By the time the last copy is made the point is clear, cloning may not be all it’s cracked up to be, or at the very least comes with some strong warnings, challenges and side effects. The famous original Star Trek episode “Trouble with Tribbles” illustrates a similar point.
Like cloning on the big screen (or small) cloning in the cloud is a great tool, but not without its challenges.
Tips for how to get better outcomes when you clone your availability system
1. Clone operational systems
This sounds obvious, but I have seen it happen more than once in real enterprise environments. If you clone your non-functional system, the clone will be equally non-functional and problematic when you restore it. Be sure that the clone you make was from an operational and functional system.
2. Sync data to disk and resync on restore
File system integrity is critical. If you don’t ensure your application and/or VM are in a consistent state, most vendors will not guarantee the resulting created image. Since snapshots only capture data that has been written to your volume at the time the snapshot command is issued, this might exclude any data that has been cached by any applications or the operating system. Making sure data has been properly synced to the file system is an important step, and absolutely critical in a cluster environment.
File system integrity is also critical to keep in mind when you restore from an image. If you are using data replication and you restore an image as source or target in the cluster, making sure the two nodes are in sync is paramount. Failing to do so may lead to file system errors on failover or switchover, or even potential data loss.
3. Stop your instance
Many environments do not require you to stop an instance to create an image, and some, such as AWS will do the step of powering down the node before making the copy. However, many tools and sites recommend making sure applications are stopped and file system access is properly synced to avoid damage, loss of integrity, or creating images that have trouble starting, stopping, or running installed applications.
4. Label everything in the cloud (nodes, disks, NICs, everything)
While creating a clone is a free operation, the resulting disks and components typically are not. AWS states, for example, that you are “charged for the snapshots until you deregister the image and delete the snapshots.” When things aren’t labeled, knowing what is in use or not in use and why it was created can become problematic and subject to the fleeting memories or poor concentration of existing team members. Label everything.
5. Prune clones and snapshots often (cost savings and headache savings)
Pruning old snapshots and clones is not only good for the cost savings, but it is also good for reducing headaches. Older snapshots run the risk of reintroducing vulnerabilities that have been addressed or resolved in newer copies. As VP of Customer Experience at SIOS Technology Corp., I saw the consequences firsthand when we worked with a customer who restored from a snapshot. After restarting the application, they ran into several problems. After troubleshooting, we determined that the clone was running an older version of security software and the cached credentials and metadata stored in the user profile were no longer in sync with the actual application data stored on the externally mounted data drives.
6. Limit or restrict cloning of clones in the cloud
Lastly, not everything you do in the cloud needs to be cloned. Consider limiting the types of workloads that you will clone and restrict the number or roles who can create clones in your environment.
In the movie, when Doug’s clones sparked their own series of duplications, an already overwhelmed Doug (Michael Keaton) is forced to exert extra energy to manage his many clones while trying to hide the mess he created from his wife. Clone carefully to avoid making more work and adding risk from a tool that was supposed to make your work easier and your environment safer.