Grid Appliance Tutorial
From Grid-Appliance Wiki
Welcome to the Grid Appliance Tutorial! The purpose of this tutorial is to show you how to join our Condor pool in minutes. This tutorial is simple and user-friendly and contains plenty of visual aids to help you along the way.
Introduction
The Grid Appliance is a self-configuring virtual machine appliance that is used to create ad-hoc pools of computer resources over wide area networks to execute high-throughput, long-running jobs. Appliances are connected to each other through a peer-to-peer virtual network using private IP addresses. When the appliance starts, it automatically connects to a pool of resources and can submit and execute jobs using the Condor Grid scheduler.
The purpose of this tutorial is to go into the details of the features provided by the appliance. This involves installing VMware, submitting Condor jobs, and accessing files on the appliance. Documents describing the design and configuration of the appliance and the Condor Manual are packaged alongside the image of the appliance. More information about the appliance and other related projects, including links to the latest VM image and detailed documentation can be found at: http://www.grid-appliance.org
Users are advised to visit this page for the most up-to-date information and available updates.
We also have a group for discussions/questions and answers among users and developers. Follow this link to join.
System Requirements
- Standard Intel/AMD x86-compatible PC
- 1 GHz or faster CPU
- 512 MB of RAM
- 2 GB of free disk space
- Internet connection
- A virtual machine monitor. VMware, KVM, Qemu, and VirtualBox are currently supported.
Running the Grid Appliance
Before you can run the Grid appliance, you will need to complete the following steps:
- Download and install virtualization software (e.g. VMware, KVM, Qemu). VMware Player is recommended.
- Download the Grid Appliance
- Extract the zipped file
- Start the Grid Appliance
- For VMware in Windows, browse to the grid_appliance directory and double-click on GridAppliance.vmx
- For VMware in Linux, cd to the extracted directory. Then type "vmplayer GridAppliance.vmx" if you are using VMware player, or type "vmware GridAppliance.vmx" if you are using VMware server.
- For Qemu on Windows, browse to the extracted directory and double-click on qemu.bat
- For Qemu on Linux, cd to the extracted directory, and type "qemu.sh"
- (For Qemu and KVM users only: when the Grid Appliance boots up, a boot menu will show up; use the keyboard down arrow to select the option Qemu Release)
- For VirtualBox, click here
Upon successful completion of the previous steps, the Grid Appliance will boot and display a terminal - see below
This is what shows up after a successful boot
There are some key things to notice about this boot page:
- xmessage box (top left): shows the local IP address (which can be used to access files in the appliance, as explained later in this tutorial) and the default user name and password for logging into the system. It is highly recommended that you change your password for future use; how to do this is explained later in the tutorial.
- Terminal window: The current Grid Appliance is based on Ubuntu 9.10. The terminal window provides a Linux shell, where you can execute typical Unix commands (such as ls, top, df).
The username (griduser)
The Grid Appliance tailors the experience around the user. If you download the default Grid Appliance, your user name will be griduser; however, if you download a floppy and join a GroupAppliance, it will be the name with which you logged into the website. For the remainder of this document, griduser is used. If you are uncertain of your user name, type "whoami" at a terminal inside the appliance.
Checking your state
After you have started the VM, a message window at the top left shows the state of both the VPN and Condor. When both are running, you should be connected to the Condor pool. To verify this, type "condor_status" in the terminal window and you should see the following:
In the terminal window, you will see a list of machines (which are grid appliances just like yours) connected to form a Condor pool. These machines are connected through a virtual network. With Condor and the Grid Appliance, all of these machines work together to provide more computing power to each grid appliance user. More information about Condor can be found here.
Note: At times it may take several minutes to properly connect to the Condor pool, so if a warning message appears as a result of the "condor_status" command, wait a few moments, and try again.
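The wait-and-retry advice in the note above can be automated with a small shell helper. This is only a sketch: retry_until_ok is a hypothetical function, not something shipped with the appliance, and it assumes that "condor_status" exits with a non-zero status while the pool is still unreachable.

```shell
# retry_until_ok is a hypothetical helper (not part of the appliance).
# It reruns a command until it exits successfully or the attempt limit
# is reached.
# Usage: retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    max_attempts=$1
    delay_seconds=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the command exactly as given; stop on the first success.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed; retrying in ${delay_seconds}s" >&2
        sleep "$delay_seconds"
        attempt=$((attempt + 1))
    done
    # Every attempt failed.
    return 1
}
```

Inside the appliance, "retry_until_ok 10 30 condor_status" would try up to ten times, pausing 30 seconds after each failed attempt.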
Submitting a Condor job through the Grid Appliance
Most users will primarily interface with the job submission portion of Condor. The primary commands that users need to familiarize themselves with are "condor_submit", "condor_q" and "condor_status".
To explain their purpose, we will use one of the examples provided on the Grid Appliance. This demonstration uses a Monte Carlo method to estimate the value of Pi. It illustrates a typical use of high-throughput computing to execute many different tasks using the Condor scheduler. The application takes a single argument (how many data points to generate). Based on this parameter, it generates random points in a unit square and computes the ratio of points that fall within a unit circle to estimate Pi. Check the file montepi.c within examples/montepi for the source code.
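To make the method concrete, here is a minimal shell sketch of the same idea using awk. It is not the code in montepi.c: estimate_pi is a hypothetical helper, and it assumes the common quarter-circle formulation, in which points are drawn uniformly from the unit square and the fraction satisfying x*x + y*y <= 1 approximates Pi/4.

```shell
# estimate_pi is a hypothetical helper, not the appliance's montepi program.
# Usage: estimate_pi NUM_POINTS [SEED]
estimate_pi() {
    n=$1
    seed=${2:-1}
    awk -v n="$n" -v seed="$seed" 'BEGIN {
        srand(seed)
        inside = 0
        for (i = 0; i < n; i++) {
            # Draw a point uniformly from the unit square.
            x = rand()
            y = rand()
            # Count it if it lies inside the quarter circle of radius 1.
            if (x * x + y * y <= 1) inside++
        }
        # The fraction of points inside approximates Pi/4, so scale by 4.
        printf "%.6f\n", 4 * inside / n
    }'
}
```

Running "estimate_pi 10000" prints one estimate. More points give a tighter estimate, which is why the example jobs differ only in their argument.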
Type the following commands on a shell terminal:
- sudo apt-get install gcc
- cd examples/montepi
- gcc -o montepi montepi.c -lm
- condor_submit submit_montepi_vanilla
- condor_q
- condor_status
In step 1, we install the GNU C compiler. After entering this command, you will be asked for your password; then press Y to continue. In step 2, we change to the example directory. In step 3, the gcc command compiles the montepi.c source code. In step 4, we use "condor_submit", which takes a Condor job configuration file as its input. Use "more submit_montepi_vanilla" to inspect this file. Condor submit files can be much more complex than this one, but this example illustrates a configuration that is useful for many simple applications. In step 5, we check the progress of our jobs: each job is assigned an identifier, and the status "I" means a job is idle, while "R" means a job is running. In step 6, we check the status summary of the entire Condor pool and other jobs running in the system.
While the jobs run, take a look at the Condor submit configuration file:
more submit_montepi_vanilla
You will notice it specifies the Condor "Universe" used (vanilla supports unmodified executables), the name of the application executable (montepi), and the arguments used for the application. In this example, 50 jobs are queued to run with argument 10000, then 50 additional jobs are queued with argument 20000.
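For orientation, a submit file matching this description might look roughly like the one written by the sketch below. This is an illustration, not the contents of submit_montepi_vanilla: the function name write_submit_sketch and the output, error and log file names are assumptions, while "universe", "executable", "arguments", "queue" and the $(Cluster) and $(Process) macros are standard Condor submit syntax.

```shell
# write_submit_sketch is a hypothetical helper that writes an illustrative
# submit description to the given path. The output/error/log file names
# are assumptions, not those used by the appliance's example.
write_submit_sketch() {
    # The quoted 'EOF' keeps $(Cluster) and $(Process) literal, so Condor
    # (not the shell) expands them for each job.
    cat > "$1" <<'EOF'
universe   = vanilla
executable = montepi
output     = montepi.$(Cluster).$(Process).out
error      = montepi.$(Cluster).$(Process).err
log        = montepi.$(Cluster).$(Process).log

arguments  = 10000
queue 50

arguments  = 20000
queue 50
EOF
}
```

Each "queue 50" statement queues 50 jobs with the most recently set arguments, which yields the two batches of 50 described above.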
Once the 100 jobs have completed, the results will be placed in files with extensions "*.out". Type the following command to view all 100 estimates of Pi which have been computed:
grep pi *.out
The files with extension "*.log" show where each Condor job executed. Type the following command to see the virtual IP address of each appliance which executed a job:
grep executing *.log
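To see how the jobs were spread across the pool, the same log lines can be tallied per host. The sketch below assumes each matching log line has the form "... Job executing on host: <address:port>", which is the usual Condor user log format but is not shown in this tutorial. jobs_per_host is a hypothetical helper.

```shell
# jobs_per_host is a hypothetical helper. It assumes log lines of the form
#   001 (012.000.000) 03/10 14:22:05 Job executing on host: <10.128.0.5:40123>
# Usage: jobs_per_host FILE...
jobs_per_host() {
    # Extract the address between "<" and the first ":", then count how
    # often each address occurs and print the busiest host first.
    sed -n 's/.*executing on host: <\([0-9.]*\):.*/\1/p' "$@" |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}
```

Running "jobs_per_host *.log" in the examples/montepi directory would then print one "count address" line per appliance that executed a job.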
Changing your password in the Grid Appliance
Changing your password is straightforward. Simply type:
passwd
at the command line. You will be asked to provide the current default password, then to set your own password.
Gaining root access in the Grid Appliance
Since the Grid Appliance is based on Ubuntu (a Debian derivative), it uses the sudo tool to provide a subset of root privileges to a user. There are two ways to get root access:
- Use sudo before every command (e.g. sudo ifconfig), or
- Type "sudo bash" to gain root access
Remote Access to the Grid Appliance
There are two ways to transfer files to and from the Grid Appliance (over Samba or over SCP/SFTP). Samba is the recommended approach since it's much more user friendly, faster, and is built into Windows. SCP/SFTP is recommended for Linux users that do not have access to a Samba client.
SAMBA
To access files using Samba, type the following address into your Internet Explorer address bar: file://gridappliance/griduser. A new window will pop up showing the files in your home directory inside the virtual appliance (/home/griduser). You can then drag and drop content to and from the Grid Appliance in the manner shown below:
If that does not work, use the address "\\ipaddress\griduser", where ipaddress is the IP address given in the top left xmessage box.
In the screenshot example shown above, this would be \\192.168.159.149\griduser
How to SSH into the Grid Appliance
The Grid Appliance allows SSH connections from the host machine only. The IP address can be obtained from the xmessage box displayed when the Grid Appliance boots. Name resolution sometimes fails, in which case you must use the IP address instead of the host name.
- From Windows:
- Download PuTTY (or PSCP / PSFTP for scp or sftp access)
- Double-click on the executable.
- Type "gridappliance" or the actual IP address obtained from the display window, and click Open.
- A login window will show up. Use "griduser" as the login name and "password" for the password (unless you changed it).
- From Linux:
- From the command line, type "sftp griduser@ipaddress" (e.g. sftp griduser@192.168.159.149 )
- Follow the command line instructions and provide your password and you should have access.
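For one-off, non-interactive transfers from a Linux host, scp can be used instead of an sftp session. The helpers below are a sketch only: push_to_appliance, pull_from_appliance and the APPLIANCE_USER and APPLIANCE_HOST variables are hypothetical names, and the default address is the example IP address used in this tutorial, so replace it with the one shown in your xmessage box.

```shell
# Hypothetical settings; override them in the environment as needed.
APPLIANCE_USER=${APPLIANCE_USER:-griduser}
APPLIANCE_HOST=${APPLIANCE_HOST:-192.168.159.149}

# Copy a local file into the user's home directory on the appliance.
push_to_appliance() {
    scp "$1" "${APPLIANCE_USER}@${APPLIANCE_HOST}:"
}

# Copy a file (path relative to the appliance home directory) into the
# current directory on the host.
pull_from_appliance() {
    scp "${APPLIANCE_USER}@${APPLIANCE_HOST}:$1" .
}
```

For example, "push_to_appliance data.txt" uploads data.txt to /home/griduser, and "pull_from_appliance examples/montepi/montepi.c" fetches the example source. scp will prompt for the password as usual.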



