CycleCloud ships with a repository of pre-created images that are recommended for use by most users.
Some users may need to use custom images to fulfill business or security requirements. CycleCloud supports building clusters from images created and owned entirely by the user.
Amazon Machine Images
When creating cluster template files, an Image, ImageName, or ImageId must be specified for each node. Bear in mind that in EC2 each image is specific to both the OS and the Amazon Region the machine is in.
Azure Custom Images
Currently, Azure GPU instances are available with Ubuntu 16 only, which Jetpack does not support.
Private custom Azure images can be specified in the template file with the ImageId attribute. This ID can be found in the Azure portal as the Resource ID for the image. In addition, the ImageOS attribute must be set to either windows or linux:
    [[node demo]]
    ImageId = /subscriptions/xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/MyResourceGroup/providers/Microsoft.Compute/images/MyCustomImage
    ImageOS = linux
Public Azure images can be directly specified as well. The following attributes must be set to identify the image: Azure.Publisher, Azure.Sku, Azure.Offer, Azure.ImageVersion and Azure.OS:
    [[node demo]]
    Azure.Publisher = Canonical
    Azure.Offer = UbuntuServer
    Azure.ImageVersion = latest
    Azure.Sku = 14.04-LTS
    Azure.OS = linux
As an alternative to a specific version, most publishers support referencing the latest image with a label: Azure.ImageVersion = latest.
CycleCloud can launch instances using any Machine Image. However, to build clusters using CycleCloud and to benefit from CycleCloud’s orchestration layer, Cycle Computing’s Jetpack package must be installed on the image. If the image does not have Jetpack installed, many CycleCloud features will be unavailable.
Users have two options for installing Jetpack:
- Use images provided by Cycle Computing that already have Jetpack installed
- Install Jetpack on an existing image to create a custom Jetpack image.
Jetpack packages are currently built for the following operating systems:
- Enterprise Linux 6, 7 (CentOS, RHEL)
- Ubuntu 12.04, 14.04
- Windows 2008, 2012
CycleCloud Image Requirements
- Jetpack must be installed on the Image.
- Jetpack uses Chef to configure instances launched using the Image. In order for Chef to function, other infrastructure configuration management tools (such as Cloud-Init in AWS) should be disabled.
If a configuration management tool, such as Chef or Puppet, is already in use, it is generally possible to configure the systems to work together. Please contact Cycle Computing support for more help in this case.
- Management ports, such as TCP port 22 for SSH (Secure Shell) and TCP port 3389 for RDP (Remote Desktop Protocol), should be open in the security group and on the instance’s firewall during the image baking process.
Building a Custom Image using the AWS Console
Users accustomed to building AMIs via the AWS Console can start building custom images for CycleCloud either by continuing to use their current image-building method or by following the AWS Console method described here.
If you intend to build many AMIs or update them regularly, then you may eventually want to switch to using CycleCloud to build images directly.
Select a Base AMI
The first step in building a custom image in AWS is selecting the base AMI and making note of its AMI ID. Your organization may have an approved list of base AMIs. CycleCloud also provides a set of base AMIs, or you can select any AMI of a supported platform available from within the AWS Console.
This guide refers to AMI ID ami-f0b23b98, but this should be replaced with the correct base AMI ID.
Launch the Instance
- Go to the Services -> EC2 dashboard in the AWS Console and select the AMIs page.
- Select the AMI you wish to use as your base AMI, and click Launch to start an instance with the base AMI. For details on configuring and launching an instance via the AWS Console, see the AWS User Guide.
- For most users, an EBS-backed instance with a root volume of at least 8 GB is sufficient.
- If the base image is configured to use Cloud-Init, do not attach an ephemeral drive to the instance at launch; otherwise, Cloud-Init will attempt to mount the drive and add it to the fstab. If you do need the ephemeral drive to build the image, be sure to clear out the fstab as described in the Clean Up section below.
- Be sure to select a security group with the management ports opened (e.g. SSH, Remote Desktop), and select a security key-pair for which you have access to the private key (this guide will assume that you have used a keypair named cyclecloud as described in the CycleCloud Quickstart Guide).
- After launching the instance, collect the instance ID and hostname of the image builder instance.
Once the new instance has started, use the private key to log in to the instance. For the example in this guide, the keypair allows direct access to the instance as root. If your base image uses a different default user, be sure that the user has sudo access.
Next, install any custom software and configurations that your cluster requires.
It is recommended to create a shell script which automates the software download and installation. Such a script can later be used with CycleCloud to orchestrate image rebuilding.
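As a starting point, such a script might look like the skeleton below. It is only a sketch: the `run_step` helper, the step names, the package, and the `example.com` URL in the commented usage are all hypothetical placeholders, not part of CycleCloud.

```shell
#!/usr/bin/env bash
# Skeleton for a repeatable image-customization script.
# run_step is an illustrative helper, not a CycleCloud command.

# Run one named step, print its name first, and report a failure clearly.
run_step() {
  local name=$1
  shift
  echo "==> $name"
  "$@" || { echo "step failed: $name" >&2; return 1; }
}

# Example steps (placeholders; replace with your own installs and downloads):
# run_step "install compiler" sudo yum install -y gcc || exit 1
# run_step "fetch installer" curl -fsSLo /tmp/myapp.run https://example.com/myapp.run || exit 1
```

Keeping each install behind a named step makes it easier to see which part of the build failed when the script is re-run for a later image.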
Chef/Cluster-Init vs. Image Baking
The default images provided by CycleCloud are very close to minimal installation images for supported platforms. All user-level software is installed and configured using Chef and/or Cluster-Init at cluster startup. This makes Cycle Computing provided images very flexible and usable by nearly any CycleCloud user; however, this also shifts most software installation to launch time.
For your own images, you may be able to reduce launch times by pre-installing some software or pre-downloading the packages for software that can only be installed after launch. When deciding what to bake into the image, here are some questions to think about:
- Do all users of this image need this software? Or is it required by policy?
  - A great candidate for baking into the image would be software such as anti-virus programs that are required by policy and may be updated dynamically on each launch.
- Can this software be updated when instances are launched?
  - If not, then you may end up baking new images every time a new version is released.
- Does this software require customization at install time that can only be done after instance launch?
  - Grid-enabled software often needs to be configured with hostnames and IPs of actual, running instances based on cluster search results. (You may still be able to install the software prior to baking the image, and use Chef to re-configure it at instance start-up.)
- Does this software package belong on the root volume of the instance?
  - If the software should be installed on EBS or the ephemeral drive, then it cannot be installed directly on the image.
If the software cannot easily be installed prior to image baking, consider including a copy of the installer for use by CycleCloud’s Thunderball cookbook to avoid the download at instance startup until the installer is updated.
Jetpack has no external dependencies, and it includes an easy-to-use installer for all supported platforms.
On Linux, Jetpack will be installed at: /opt/cycle/jetpack
On Windows, Jetpack will be installed at: C:\cycle\jetpack
First, contact Cycle Computing support to request a copy of Jetpack for your platform, then follow the instructions below to install it in your instance.
To install Jetpack on Windows, run the following commands from a PowerShell session as Administrator:
    PS> unzip jetpack.zip
    PS> cd jetpack
    PS> .\install.cmd
If you do not have 7zip or another unzip command available on the command line, you can extract jetpack.zip using Windows Explorer.
Should the installation fail for any reason, error output will be saved in install.log in the same directory as install.cmd.
To install Jetpack on a supported Linux distribution:
    $ tar xvzf jetpack.tar.gz
    $ cd jetpack
    $ chmod +x install.sh
    $ sudo ./install.sh # if not running as root
Should the installation fail for any reason, error output will be saved in install.log in the same directory as install.sh.
The Linux installer requires root privileges.
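When the install is scripted, it can help to print install.log automatically on failure. The function below is a minimal sketch of that idea. The name `install_jetpack` is illustrative, and it assumes the archive has already been extracted and that the caller is root (or invokes the function through sudo).

```shell
# Run the Jetpack installer from an extracted directory and print
# install.log if the installer exits with an error.
# install_jetpack is an illustrative name; run as root.
install_jetpack() {
  local dir=${1:-jetpack}
  if ( cd "$dir" && chmod +x install.sh && ./install.sh ); then
    echo "Jetpack installer finished"
  else
    echo "Jetpack install failed; contents of $dir/install.log:" >&2
    # The log sits next to install.sh, as described above.
    [ -f "$dir/install.log" ] && cat "$dir/install.log" >&2
    return 1
  fi
}

# Usage (after: tar xvzf jetpack.tar.gz):
# install_jetpack jetpack
```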
An important part of building a new image is ensuring that only the files you want on every instance launched from the saved AMI exist on the instance when the AMI is baked.
Prior to baking the image it is a good idea to remove the installer and all other temporary files from the instance. Otherwise, those files will be baked into the image and permanently clutter it.
Here are a few common clean-up steps (in order):
1. Disable password-based login for all users:
    $ passwd -l root
2. (Optional) Configure sshd_config according to your policies.
3. (Optional) Configure the instance-level firewall according to your policies.
4. Remove any temporary files and installers for your custom installations.
5. Remove the Jetpack installer and install dir:
    $ cd /tmp
    $ rm -rf jetpack*
6. Remove any system logs that may contain sensitive data.
7. If you mounted and formatted the ephemeral drive for the instance (or let Cloud-Init do it for you), then be sure to remove the mount configuration from the /etc/fstab file.
8. Clear the bash history for all users, but in particular for root:
    $ sudo su -
    $ history -c
    $ history -w
9. Remove the authorized key for the key pair you used to log in. (Do this last: once you perform this step, you won’t be able to log back into the instance.)
    $ rm ~/.ssh/authorized_keys
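Before baking, it can be useful to confirm from the still-open session that the clean-up actually took effect. The function below is a rough sketch of such a check and covers only the Jetpack installer, root's authorized keys, and root's bash history. The name `check_clean` and its optional root-prefix argument are hypothetical, and it assumes the build user is root, as in this guide.

```shell
# Pre-bake check: list leftovers that the clean-up steps should have removed.
# The optional argument is a hypothetical root prefix so the check can be
# tried against a scratch directory; leave it empty for the live system.
check_clean() {
  local root=${1:-} status=0 f
  # Jetpack installer files left in /tmp
  for f in "$root"/tmp/jetpack*; do
    if [ -e "$f" ]; then
      echo "leftover installer: $f"
      status=1
    fi
  done
  # Login key of the build user (assumed to be root)
  if [ -e "$root/root/.ssh/authorized_keys" ]; then
    echo "authorized_keys still present"
    status=1
  fi
  # Saved shell history for root
  if [ -s "$root/root/.bash_history" ]; then
    echo "root bash history is not empty"
    status=1
  fi
  return "$status"
}
```

The function prints one line per leftover and returns non-zero if anything was found, so it can gate a scripted bake.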
Bake the New Image
The new image is ready for baking.
Return to the AWS Console and locate the running instance in the EC2 instances list. Select the instance and select Action -> Create Image. Give the image an appropriate name and description, and then click the Create Image button.
The image creation process will take several minutes. Once the process has completed, the AMI will be ready for use in CycleCloudStore, and this new AMI ID can be used in your cluster templates.
Import the Image
The image IDs generated above can be automatically added to the image registry in CycleCloud with the cyclecloud image add command:
cyclecloud image add --name custom.image --label "My Image" --os linux ami-123 ubuntu14_rstudio
This would attempt to find images with an id or name of ami-123 and ubuntu14_rstudio in all the cloud accounts you have configured, and save the resulting image package and artifacts. If that command succeeds, you can then use either Image = My Image or ImageName = custom.image in your cluster templates, and My Image will appear in the image dropdown for a cluster-creation form. The --os option should be specified so that the resulting package is correctly labeled as a linux or windows image.
The above command will version the package with the latest version of custom.image that is stored, or 1.0 if there is not currently an image package named custom.image. To automatically increment to the next version, include --bump-version:
cyclecloud image add --name custom.image --label "My Image" --bump-version minor ami-123 ubuntu14_rstudio
The argument to --bump-version can be one of major, minor, or patch, which will increment the first, second, or third part of the version number, respectively. You can also set the version directly with --package-version (for instance, --package-version 2.0).
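To make the three bump levels concrete, the function below imitates the increment in plain shell. It is an illustration, not CycleCloud's implementation. It assumes that a missing part counts as 0 and that parts to the right of the bumped one reset to 0 (the usual semantic-versioning convention), and it always prints three parts. CycleCloud's actual output format and reset behavior may differ.

```shell
# Illustration of major/minor/patch bumps on a dotted version string.
# Assumptions (not confirmed by CycleCloud): missing parts are 0, and
# parts to the right of the bumped part are reset to 0.
bump_version() {
  local version=$1 part=$2 major minor patch
  # Split "X.Y.Z" on dots (handles up to three parts).
  IFS=. read -r major minor patch <<EOF
$version
EOF
  major=${major:-0}
  minor=${minor:-0}
  patch=${patch:-0}
  case "$part" in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
    *) echo "unknown part: $part" >&2; return 1 ;;
  esac
  echo "$major.$minor.$patch"
}
```

Under these assumptions, `bump_version 1.0 minor` prints `1.1.0` and `bump_version 1.4.2 major` prints `2.0.0`.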
The command prints out a summary like the following:
    Image name/id           Description
    ----------------------- --------------------------------------------------------------------
    ami-123                 AWS image in account prod-aws, region us-east-1 (standard HVM), 8 GB
    ubuntu14_rstudio        GCP image in account prod-google, 10 GB

    Added image custom.image, v1.0 with 2 artifacts from 2 accounts (ubuntu14_rstudio, prod-google)
If it cannot match all the images, the command will fail. To test out what it would find, include the --dry-run option, which prints out the same summary but does not store anything.
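One way to combine the two modes is to preview first and store the package only if the preview succeeds. The wrapper below is a small sketch of that pattern. The name `add_image_checked` is illustrative, and it assumes that --dry-run exits with a non-zero status when an image cannot be matched, which the text above implies but does not state outright.

```shell
# Preview the match with --dry-run, then store the image package only if
# the preview succeeded. add_image_checked is an illustrative wrapper,
# not a CycleCloud command. Assumes --dry-run exits non-zero on a failed match.
add_image_checked() {
  cyclecloud image add --dry-run "$@" || return 1
  cyclecloud image add "$@"
}

# Usage, with the arguments from the example above:
# add_image_checked --name custom.image --label "My Image" --os linux ami-123 ubuntu14_rstudio
```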