IOOS:Index
From Grid-Appliance Wiki
IOOS Testbed Virtual Grid
This page will provide pointers and reference material on how to access and use the UF/UNC/NCSU/USF IOOS Testbed virtual Grid.
The Testbed Grid consists of a pool of virtual machines running the Linux operating system and the Condor batch job scheduler. Cluster resources at the University of Florida support the execution of models, while researchers at the testbed institutions can conveniently access the infrastructure from their own desktops by downloading a pre-packaged system that we call the "Grid appliance". Grid appliances run on virtual machine software that is freely available and simple to install on desktops running Windows, Linux or MacOS (with Intel processors).
Virtual Grid self-training
In order to get acquainted with the virtual Grid infrastructure, we have prepared a tutorial that can be followed by users in a self-paced manner, with questions/milestones to help you assess your progress. If you cannot successfully complete a step for any reason, or if you have questions that prevent you from continuing, please stop and send an email with your question to renato at acis dot ufl dot edu.
- Go through steps 1-3 of the Grid appliance quick start guide to learn about the basics of the Grid Appliance.
- Have you successfully installed the VM software?
- Have you successfully started up the Grid appliance?
- Did your machine connect to the Condor pool? (That is, did condor_status return a list of nodes? If it did not, please wait a few minutes and retry.)
- Were you able to successfully run the demonstration MontePi job?
- Were you able to transfer files from/to your host to the appliance using drag-and-drop (and/or sftp)?
- Try to execute a sample "canned" model (CH3D) from your appliance. Click on the following link for step-by-step instructions: IOOS:CH3D sample
- Were you able to successfully run the demonstration WW3 job?
- Try to execute your own model within the appliance, locally. Copy binaries (statically-linked Linux executables are the simplest to work with; dynamically-linked ones may require libraries to be installed) and input files using the mechanisms described earlier.
- Were you able to successfully run your model within the appliance?
- Try to execute your model remotely using Condor. Create a Condor submit script based on the examples above; do not worry about linking with Condor libraries, instead use the "vanilla" universe, which supports unmodified binaries. Make sure you specify the input files to be transferred and any environment variables needed to run the job.
- Were you able to successfully run your model remotely through Condor?
- (Optional) Go through the Condor compilation tutorial to learn how to link an application to Condor. This enables transparent checkpoint/restart functionality, which improves reliability for long-running jobs. Your applications do not have to be condor-compiled to run in the testbed, but you may want to use this feature if your jobs run for a long time.
- Were you able to condor-compile the test application?
- Were you able to successfully submit and view results from the test application?
- (Optional) Go through the Condor DAGman tutorial. This describes how you can schedule workflows in Condor; it is not required for the IOOS testbed, but it may help you to schedule a batch of model runs with dependencies.
- Were you able to compile the application?
- Were you able to successfully submit and view results from the DAGman application?
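As a rough illustration of the remote-execution step above, the following sketch writes a minimal "vanilla" universe submit file. The names my_model, input.dat, MODEL_CONFIG and mymodel.submit are placeholders chosen for this example and do not come from the testbed; adapt them to your own model.

```shell
#!/bin/sh
# Sketch only: my_model, input.dat and MODEL_CONFIG are hypothetical names.
# Writes a minimal vanilla-universe submit description file to the
# current directory; it does not submit anything by itself.
cat > mymodel.submit <<'EOF'
# Run an unmodified binary (no linking against Condor libraries)
universe                = vanilla
executable              = my_model
arguments               = input.dat

# Ship the input file to the worker and bring results back on exit
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files    = input.dat

# Environment variables needed by the job (placeholder value)
environment             = "MODEL_CONFIG=input.dat"

# Where stdout, stderr and the Condor job log are written
output                  = mymodel.out
error                   = mymodel.err
log                     = mymodel.log
queue
EOF

echo "Wrote mymodel.submit; submit it with: condor_submit mymodel.submit"
```

After submitting the file with condor_submit, condor_q can be used to watch the job's progress.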
Installing IOOS Testbed Appliance
The following instructions describe how to deploy new resources that connect to the IOOS pool; you do not need to go through this material if you are just following the self-training tutorial.
We have created a virtual grid appliance for the IOOS project which runs on a private, independent resource pool. If you plan to run an IOOS appliance, please send an email request to ptony82@ufl.edu to obtain IOOS-specific floppy images.
Quick instructions for VMware desktop appliances
Click here for instructions on how to set up a desktop with VMware and the IOOS appliance.
Click here for instructions on how to upgrade your IOOS appliance.
General instructions for VMware and KVM
- Download the Grid Appliance for VMware from this link
- Extract the zipped file
- Send an email request to ptony82@ufl.edu to obtain IOOS-specific floppy images (all configurations for the virtual machines are stored on floppy images, so floppy images for the IOOS project have to be acquired before running the virtual appliance)
- Once you've obtained the zipped floppy images, extract the file.
- The file contains three floppy images (Server.img, Client.img, Worker.img)
- To create a server virtual machine (runs Condor Manager):
- Delete the existing fdb.img, rename Server.img to fdb.img, and move it into the grid-appliance folder
- To create a client virtual machine (able to submit and run jobs):
- Repeat the steps above, but use the Client.img file instead of the Server.img file
- To create a worker virtual machine (able to only run jobs):
- Repeat the steps above, but use the Worker.img file instead of the Server.img file
- If needed, change the amount of memory (RAM) allocated to your virtual machine (the default is 256 MB)
- For VMware:
- Start your node through the vmware console
- For KVM:
- You can start up the appliance with the following command:
kvm -M pc -m 256 -smp 1 -hda ga-flat.vmdk -hdb opt.vmdk -hdd home.vmdk -fda fdb.img -net user,vlan=0 -net nic,vlan=0,model=pcnet -no-acpi -boot a
Please review kvm --help for more information. Note: on some machines kvm is named qemu-kvm.
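The floppy-image and KVM steps above can be sketched as a few shell helpers. This is only an illustration under stated assumptions: the function names install_floppy, find_kvm and kvm_command are made up for this example and are not part of the Grid Appliance distribution, and the helper copies the floppy image instead of renaming it so the original stays available.

```shell
#!/bin/sh
# Sketch only: install_floppy, find_kvm and kvm_command are hypothetical
# helper names, not tools shipped with the Grid Appliance.

# Copy the floppy image for ROLE (Server, Client or Worker) over fdb.img
# in the grid-appliance folder, replacing any existing fdb.img.
# usage: install_floppy ROLE IMAGES_DIR APPLIANCE_DIR
install_floppy() {
    role="$1"; images_dir="$2"; appliance_dir="$3"
    case "$role" in
        Server|Client|Worker) ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    if [ ! -f "$images_dir/$role.img" ]; then
        echo "missing $images_dir/$role.img" >&2
        return 1
    fi
    cp "$images_dir/$role.img" "$appliance_dir/fdb.img"
}

# Print the name of the first KVM binary found in PATH
# (on some machines kvm is named qemu-kvm).
find_kvm() {
    for candidate in kvm qemu-kvm; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "neither kvm nor qemu-kvm found in PATH" >&2
    return 1
}

# Print (but do not run) the start-up command, with memory in MB
# as the optional second argument (default 256).
# usage: kvm_command KVM_BINARY [MEMORY_MB]
kvm_command() {
    kvm_bin="$1"; mem_mb="${2:-256}"
    echo "$kvm_bin -M pc -m $mem_mb -smp 1 -hda ga-flat.vmdk -hdb opt.vmdk -hdd home.vmdk -fda fdb.img -net user,vlan=0 -net nic,vlan=0,model=pcnet -no-acpi -boot a"
}
```

For example, after calling install_floppy Worker with the folder holding the extracted floppy images and the grid-appliance folder, kvm_command "$(find_kvm)" 512 prints a command line with 512 MB of memory that you can review and then run from the grid-appliance folder.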
Verify your installation
You can verify that your installation is correct by running the command below:
condor_status
You should see a list of machines running Condor. If the command fails, run the following commands:
sudo bash
dhclient tapipop
condor_status
Once you are able to see this list, then you are ready to use your system.
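Because a freshly started appliance may need a few minutes to join the pool, the check can be repeated automatically. The helper below is a minimal sketch; wait_for_pool is a hypothetical name chosen for this example and is not a Condor tool. It treats an attempt as successful only when the command exits cleanly and prints at least one line.

```shell
#!/bin/sh
# Sketch only: wait_for_pool is a hypothetical helper, not a Condor tool.
# Runs COMMAND up to ATTEMPTS times, pausing DELAY seconds between tries.
# Succeeds once COMMAND exits with status 0 and produces some output.
# usage: wait_for_pool ATTEMPTS DELAY COMMAND [ARGS...]
wait_for_pool() {
    attempts="$1"; delay="$2"; shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        if output=$("$@" 2>/dev/null) && [ -n "$output" ]; then
            printf '%s\n' "$output"
            return 0
        fi
        # Pause before the next attempt, but not after the last one
        if [ "$try" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    echo "no machines listed after $attempts attempts" >&2
    return 1
}
```

For example, wait_for_pool 5 60 condor_status tries up to five times, one minute apart. If it still fails, fall back to the manual commands listed above.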

