Deploying independent appliance pools using PlanetLab bootstrap
Welcome to the tutorial on creating your own Grid appliance pool. This tutorial explains the steps needed to deploy an independent Grid appliance pool on your own domain (or across domains), reusing our public PlanetLab bootstrap infrastructure. It assumes you have already completed the introductory Grid appliance tutorial and reviewed the presentation slides "Deploying local Grid appliance pools 1: Bootstrapping from shared PlanetLab infrastructure".
Introduction
The Grid appliance can be used to create your own pool of resources, which is logically isolated from other users of the system. This is easy to do by reusing our existing infrastructure, which runs world-wide on the PlanetLab system. The basic idea is to create your own unique "namespace", a string which tells your appliances which other appliances they should connect to. The process is as follows: using our Web interface, you create virtual floppy disks which are configured with your own namespace information. These floppy disks replace the default floppy disk which is packaged with the baseline appliance. There are three floppy configurations you can use:
- Server: runs the Condor scheduler daemons (i.e. negotiator and collector). You need at least one VM configured as a server in your pool.
- Client: can submit and run jobs (i.e. Condor startd and schedd)
- Worker: can only run jobs (i.e. Condor startd)
Creating floppy disks for your namespace
You first need to create an account with the Grid appliance portal in order to create your virtual floppy disk images.
- Once you are able to successfully log in, click on the "GroupVPN" link (under the "User menu" box to the left). Click on "Create a new group" to create your VPN group. For more information on how to configure the GroupVPN, please see the videos available in the ACIS P2P YouTube channel, and for more information about GroupVPN, follow this link.
- Enter a group name and description for your pool. Select "ufl_test0" under "Use a Managed P2P pool" in order to connect to our PlanetLab overlay. Select the virtual IP address range/netmask for the appliance pool (e.g. 10.0.0.0/255.0.0.0); DHCP virtual IP addresses will be assigned from that range. Select a unique name for your IP namespace, e.g. the group name you entered in the first box. Check "end-to-end security" to ensure all traffic within your pool is authenticated and encrypted.
- Submit your group creation request
- Once you have a GroupVPN created, you need to configure an appliance group. Click on the "GroupAppliances" link (under the "User menu" box to the left).
- Enter a group name and a group description in the text boxes at the bottom. From the drop-down menu, select the GroupVPN that you created in the previous steps.
- Create a group
- Once the appliance group is created, you can click on the group's name and an interface allowing you to generate and download a virtual floppy for your appliances will be displayed.
- Select the kind of floppy (the following section describes the different roles for the client, server and worker floppy configurations), the appliance architecture (32-bit, or 64-bit for VMs with 4GB or more) and download your floppies.
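Before entering a virtual IP address range/netmask in the GroupVPN form, it can help to check that the pair is well formed and to see how many addresses it covers. The following is a minimal sketch using Python's standard ipaddress module; the function name describe_pool_range is hypothetical, and the sketch says nothing about how the portal itself validates the range.

```python
import ipaddress


def describe_pool_range(address: str, netmask: str) -> dict:
    """Summarize a virtual IP range given as an address plus dotted netmask."""
    # strict=True raises ValueError if the address has host bits set
    # for the given netmask (e.g. 10.0.0.1 with 255.0.0.0).
    network = ipaddress.ip_network(f"{address}/{netmask}", strict=True)
    return {
        "cidr": str(network),
        "is_private": network.is_private,
        "num_addresses": network.num_addresses,
    }


if __name__ == "__main__":
    # The example range used in the steps above.
    print(describe_pool_range("10.0.0.0", "255.0.0.0"))
```

A range that fails to parse here (for example, an address with host bits set) is worth correcting before submitting the group creation request.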
Deploying Condor nodes: clients, servers, workers
In order to deploy your resource pool, you need to start by deploying one or more appliances configured as a "Server". Our experience in a local-area network has been that the Condor manager can handle on the order of a couple hundred clients/workers, though this depends on the characteristics of the applications you plan to run on your pool. If you have several hundred resources in your pool, you may need to monitor and fine-tune your deployment to determine and deploy an appropriate number of servers for your system.
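As a rough planning aid, the sizing guidance above can be turned into a first estimate. This is only a sketch: the default of 200 nodes per server is an assumed stand-in for "a couple hundred" and is not a measured limit, and the function name estimate_server_count is hypothetical. Monitoring the actual deployment remains the real guide.

```python
import math


def estimate_server_count(num_nodes: int, nodes_per_server: int = 200) -> int:
    """Return a first-guess number of server appliances for a pool.

    nodes_per_server is an assumption standing in for "a couple hundred";
    adjust it to match what you observe for your own applications.
    """
    if num_nodes < 0:
        raise ValueError("num_nodes must not be negative")
    if nodes_per_server <= 0:
        raise ValueError("nodes_per_server must be positive")
    # A pool always needs at least one server, even if it is small.
    return max(1, math.ceil(num_nodes / nodes_per_server))


if __name__ == "__main__":
    for nodes in (50, 200, 650):
        print(nodes, "clients/workers ->", estimate_server_count(nodes), "server(s)")
```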
- Select one or more resources to run your "Server" appliances. Uncompress the Grid appliance image first, and then the server floppy .zip archive. Replace the floppy.img in the "Grid Appliance" directory with the floppy.img you just downloaded.
- Use a configuration tool for your VM monitor (VMware, KVM, VirtualBox) to select the memory size and number of CPUs for your VM (if applicable). Suggested values are 512-1024MB of RAM and 2 CPUs (not all VM monitors allow multiple virtual CPUs). If you would like to use more than 4GB of memory or more than 2 virtual CPUs, it is recommended that you use a recent version of KVM.
- Instantiate the server VM(s). If your VMM allows for scripting the startup of VMs upon server boot (e.g. VMware Server), it is a good idea to register the VM to automatically start upon bootup.
- Select one or more resources to run worker and/or client VMs. Repeat the steps above, using the appropriate floppy image. Worker images are useful when the VM will not be used interactively (e.g. in a server, or in a desktop PC lab), while client images should be distributed to end users.
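When many appliances are deployed, replacing floppy.img by hand becomes repetitive. The sketch below automates that one step under stated assumptions: the downloaded archive contains exactly one file named floppy.img, and the appliance image has already been uncompressed into a directory that holds the default floppy.img. The function name install_floppy and the floppy.img.orig backup name are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path


def install_floppy(floppy_zip: Path, appliance_dir: Path,
                   keep_backup: bool = True) -> Path:
    """Extract floppy.img from the downloaded archive into appliance_dir."""
    target = appliance_dir / "floppy.img"
    with zipfile.ZipFile(floppy_zip) as archive:
        # Assumption: the archive holds exactly one member named floppy.img,
        # possibly inside a subdirectory.
        members = [name for name in archive.namelist()
                   if Path(name).name == "floppy.img"]
        if len(members) != 1:
            raise ValueError(
                f"expected exactly one floppy.img in {floppy_zip}, "
                f"found {len(members)}")
        # Keep a copy of the default floppy so the change can be undone.
        if keep_backup and target.exists():
            shutil.copy2(target, target.with_name("floppy.img.orig"))
        # Overwrite the floppy in the appliance directory.
        with archive.open(members[0]) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

The same function can be called once per appliance directory with the server, worker, or client archive as appropriate.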
Dealing with passwords and ssh
The Grid appliance image comes with a default password for the griduser account to simplify the first login. Users are encouraged to change this password on the nodes they interact with, and when you deploy a pool with workers/servers, the administrator of the virtual pool may want to change the passwords on those nodes as well. Note that sshd is enabled in the appliances, but only accepts requests from the host-only network, so for any user to log in using the default password they must first have access to the host. If you would like to change the appliance password, you can ssh from the host into an appliance and use the passwd command.

Optional: the virtual appliance floppy contains an ssh authorized_keys file that is loaded for the "root" user. You can configure this file to add a public key that will allow you to log in remotely as root using private/public key authentication. If you would like to use this feature, you need to loop-back mount the floppy (mount -o loop floppy.img /mnt/floppy or the equivalent on your Unix system), replace the authorized_keys file (ensure its permissions are set with chmod 600), and unmount the floppy.
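The file replacement in the optional authorized_keys procedure can be sketched as follows, assuming the floppy has already been loop-back mounted (which typically requires root) and will be unmounted afterwards. The function name install_authorized_keys is hypothetical, and the location of authorized_keys inside the floppy is an assumption exposed as the relative_path parameter; check the mounted floppy for the actual path.

```python
import os
import shutil
from pathlib import Path


def install_authorized_keys(mount_point: Path, public_key_file: Path,
                            relative_path: str = "authorized_keys") -> Path:
    """Replace authorized_keys on a mounted floppy with the given public key.

    relative_path is an assumption about where the file lives on the floppy.
    """
    target = mount_point / relative_path
    if not target.parent.is_dir():
        raise FileNotFoundError(f"{target.parent} is not a directory")
    # Overwrite the existing authorized_keys with the new public key.
    shutil.copyfile(public_key_file, target)
    # Request owner read/write only, matching the chmod 600 requirement.
    # Some filesystems may not store Unix permission bits.
    os.chmod(target, 0o600)
    return target
```

After the function returns, unmount the floppy before starting the appliance so the change is written to floppy.img.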

