FAQ
From Grid-Appliance Wiki
Grid Appliance
I've started a Grid appliance but it does not seem to be connected to the resource pool. What is wrong?
There could be many reasons. Here is a troubleshooting list to guide you in debugging typical problems.
- It may take up to a few minutes for an appliance to become properly connected.
- This is especially likely if you are behind a firewall that blocks the UDP port that the Grid appliance uses, or if the virtual machine you use has a NAT that is difficult to traverse. We have observed that the VirtualBox NAT can be difficult to traverse, so you may see delays in getting connected.
- Leave your appliance running for a few minutes and see if it eventually gets connected.
- It may be that the VM's virtual network is not working properly
- First, check that your VM can resolve names and ping Internet hosts
- Check that you can ping a well-known Internet node (e.g. www.google.com). You need to be root to do this, with the following command:
sudo ping www.google.com
- If you cannot ping a well-known Internet node, it is possible that your host or your VM software is not properly configured. Check the following:
- We have seen cases where the VMM installation was not clean, and reinstalling the VMM and rebooting the system solved the problem. This has been observed with VMware Player and Server.
- Are you using bridged networking? If so, you need to ensure an address is allocated to your VM (static or DHCP). A NAT interface is recommended for most uses.
- If the above troubleshooting steps are not successful, please contact us and provide the following information:
- Which VMM you are running (VMware, VirtualBox, KVM) and which version
- Which virtual network model you are using (NAT, bridged)
- Whether you checked that your machine has Internet connectivity, and with which command (e.g. sudo ping www.google.com from within the VM)
- Please collect information from your virtual machine and post a message to the users group for further assistance. If you can, package and send the files under /opt/ipop/var and /opt/ipop/etc; they are useful for debugging purposes. You can package them as follows:
cd /opt/ipop
tar -czf ipopvaretc.tgz var etc/node.config etc/dhcp.config
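If you need to package these files more than once, the packaging command above can be wrapped in a small shell function. This is only a sketch: the name package_ipop_debug and its two optional arguments are made up for illustration, and it assumes the layout described in this FAQ (var, etc/node.config and etc/dhcp.config under /opt/ipop).

```shell
# Hypothetical helper, not shipped with the appliance.
# $1: IPOP base directory (defaults to /opt/ipop)
# $2: output archive (defaults to ipopvaretc.tgz in the current directory)
package_ipop_debug() {
    _base=${1:-/opt/ipop}
    _out=${2:-ipopvaretc.tgz}
    # -C switches to the base directory before adding files,
    # so the archive contains relative paths (var/..., etc/...).
    tar -czf "$_out" -C "$_base" var etc/node.config etc/dhcp.config || return 1
    # List the archive so you can see exactly what you are about to send.
    tar -tzf "$_out"
}
```

For example, package_ipop_debug /opt/ipop /tmp/ipopvaretc.tgz writes the archive to /tmp. Run it as root if the files are not readable by your user.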
Are there known issues with Grid appliances running on the virtual machine technology I'm using?
VMware
We have noticed that in some installations, VMware VMs connected through NAT do not properly resolve names. You can check this by trying to ping a well-known public host from within the appliance (see topic above). This is a VMM issue outside our control, unfortunately, but restarting or reinstalling VMware often fixes this issue.
VirtualBox
We have noticed in some VirtualBox versions that NAT traversal does not succeed properly, which results in non-ideal networking performance. We are working on identifying which versions and why this may be occurring.
I'm having connectivity problems that may be due to my router/NAT/DMZ. Is there anything I can do to improve this?
The Grid appliance uses a virtual network that supports transparent traversal of NATs (network address translators), which are often found in residential networks and some academic institutions. NAT traversal is important in the appliance because it allows two VMs to communicate directly with each other, even if both are behind NATs. In our experience, most users are behind NATs that are amenable to our NAT traversal. Unfortunately, in some cases the NAT is configured in ways that prevent traversal from taking place. Your appliance should still be able to work without NAT traversal: we have a technique to autonomously select a relay proxy for appliances behind such NATs, but there is a hit in network performance. If you would like to further improve the performance of your appliance, you may try the following steps:
- How do I know if NAT traversal is working for me?
- This is tricky. We don't have a simple test or tool yet to determine this. Our experience suggests that if you ping another appliance for a hundred seconds or so, and the round-trip delay is 500ms or more, chances are you might be behind a NAT that we can't traverse. There is a generic NAT check tool that can help determine your NAT type. If it is a "cone" NAT, it's likely that we can traverse it.
- In the typical case, Grid appliances communicate over UDP, and a single UDP port inside the Archer VM multiplexes all traffic. By default this port is 44392; you can find it by inspecting the file:
cat /opt/ipop/var/node.config
- When you are behind a NAT, appliance UDP packets go through the NAT and are translated to an IP/port in the public Internet. The general approach is to configure your NAT/router to a) map the appliance port to a fixed public port, and b) allow incoming traffic to this port. How to do this depends on your NAT/router, so it is difficult to provide a one-size-fits-all solution here, but here are some pointers that can help:
- Try running the VM in bridged interface mode. If you are behind a NAT in a residential network, you may be able to have the VM run with a bridged network interface (eth0) instead of NAT. This is suggested, as you will not need to deal with another level of NAT (the VMware NAT).
- Search the Internet for router port configuration FAQs and see if you find a guide for your NAT. Here are some examples: the Wiki entry for port forwarding and PortForward.com.
- Share your experience
If you've been able to configure your router successfully to deal with this problem, feel free to share your experiences in this FAQ.
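The rule of thumb given earlier in this answer (ping another appliance for a hundred seconds or so and treat a round-trip delay of 500ms or more as suspicious) can be scripted. The sketch below is not an official test tool. The function names are made up for illustration, and it assumes your ping prints the usual "min/avg/max" summary line, as ping does on Linux.

```shell
# Read ping output on stdin and print the average round-trip time in ms.
# Expects a summary line such as:
#   rtt min/avg/max/mdev = 10.1/20.2/30.3/5.0 ms
# Returns non-zero if no summary line is found (e.g. all packets were lost).
avg_rtt_ms() {
    awk -F'/' '/^(rtt|round-trip)/ { print $5; found = 1 } END { exit !found }'
}

# Succeed (exit 0) if the average RTT is at or above the threshold.
# $1: average RTT in ms, $2: threshold in ms (defaults to 500)
nat_traversal_suspect() {
    awk -v rtt="$1" -v limit="${2:-500}" 'BEGIN { exit !(rtt + 0 >= limit + 0) }'
}

# Usage, with the other appliance's virtual IP address filled in by you
# (-c 100 sends one ping per second, so this takes about 100 seconds):
#   avg=$(ping -c 100 <other-appliance-ip> | avg_rtt_ms) &&
#       nat_traversal_suspect "$avg" &&
#       echo "average RTT ${avg} ms: NAT traversal may not be working"
```

A high average is only a hint, not proof. A slow or distant link produces the same symptom.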
IPOP and Brunet
What are the software requirements to run IPOP?
Mono or .NET 2.0. For Mono, 2.0 and higher are recommended, though some users have run 1.9.x successfully. Linux distributions don't necessarily include all the libraries required for IPOP/Brunet when installing the base Mono, so if something doesn't work, please make sure the library it references does indeed exist on your system. If that fails, Mono provides binaries and source that will produce a full working Mono installation: link.
How do I get a list of connected IPOP nodes?
Crawling the DHT may be unnecessary. One approach could be to have one key store all the values of the different nodes. In the Grid Appliance, we have a single node (the Condor master/server) that everyone knows about, and it lets everyone else know about each other. How you go about dealing with this depends on your application and its needs. In all honesty, we don't have a good mechanism at this point in time to do a distributed search for nodes in the system. So you can do a crawl of the network (which is what you may have meant by crawling the DHT), use one of the approaches above, or blaze your own trail.
Are there any additional testing scripts?
Not really. There are some for the Grid Appliance that could easily be ported to IPOP. If you'd like to try your hand at it, they are available here under tests.
Could you give concise directions on setting up a pool?
- Start MultiNode
- Make sure the machine running IPOP has direct connectivity to the machine running MultiNode
- Run bput.py with your DHCP configuration every 30 seconds or so until it returns true
- Run your DHCP client (dhc*) on tapipop and confirm that it gets an IP address
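The "every 30 seconds or so until it returns true" step can be automated with a small retry loop. This is a sketch under two assumptions that this FAQ does not confirm. The helper name retry_every is made up, and bput.py is assumed to signal success through its exit status. If your bput.py prints true instead, wrap it in a command that checks its output. The arguments to bput.py are not shown here, so use whatever you already pass by hand.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds.
# $1: seconds to wait between attempts
# $2: maximum number of attempts
# remaining arguments: the command to run
retry_every() {
    _interval=$1
    _attempts=$2
    shift 2
    _n=1
    while ! "$@"; do
        if [ "$_n" -ge "$_attempts" ]; then
            return 1          # give up after the last allowed attempt
        fi
        _n=$((_n + 1))
        sleep "$_interval"
    done
    return 0
}

# Usage (fill in your own bput.py arguments):
#   retry_every 30 20 python bput.py <your arguments>
```

Limiting the number of attempts keeps the loop from running forever when MultiNode is unreachable.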
I get this message: Unhandled Exception: System.DllNotFoundException: libtuntap. What can I do?
Navigate to a directory of the form drivers/c-lib or src/c-lib, run make.sh, and replace the libtuntap.so in your IPOP folder with this one.
In some cases, a user may choose to install a 32-bit Mono in a 64-bit environment. In this case, you'll need to compile libtuntap for a 32-bit system instead of the default 64-bit (gcc needs the -m32 parameter).
You must be in the same directory as the libtuntap.so when executing IPOP. An alternative is to add the path to the library to LD_LIBRARY_PATH.
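If you go the LD_LIBRARY_PATH route, a small function like the one below avoids adding the same directory twice. It is a sketch only: the name prepend_library_path is made up, and /path/to/ipop in the usage line stands for wherever your libtuntap.so actually lives.

```shell
# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH
# unless it is already listed.
# $1: directory that contains libtuntap.so
prepend_library_path() {
    _dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$_dir:"*)
            ;;  # already present, nothing to do
        *)
            # Only add the ":" separator when the variable is non-empty
            LD_LIBRARY_PATH="$_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
            ;;
    esac
    export LD_LIBRARY_PATH
}

# Usage, in the shell from which you then start IPOP:
#   prepend_library_path /path/to/ipop
```

The setting only applies to the current shell and the programs started from it, so run it in the same session that launches IPOP.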
DHCP isn't working, even though I was successful with a bput.py.
Check to make sure bget.py works. If it fails but bput.py worked, your DHCP configuration file may be broken. Review it for proper XML syntax, and make sure the namespaces in ipop.config and dhcp.config match.
I get this message: Unhandled Exception: System.Net.Sockets.SocketException: No such host is known at System.Net.Dns.GetHostByName
Most likely you are using Linux and are not using openresolv / resolvconf. Your machine has been configured to use a hostname that is not locally resolvable, so it relies on a remote DNS server, which is specified in your /etc/resolv.conf. When you run DHCP on an IPOP interface (and in general), it will erase all entries in your current /etc/resolv.conf. Besides using openresolv/resolvconf, you can also run this command as root:
echo 127.0.0.1 `hostname` >> /etc/hosts
This will allow you to resolve your name locally. Additionally, you could modify your DHCP client to not add domain names or to append your local DNS.
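Running the echo command more than once leaves duplicate lines in /etc/hosts. The sketch below adds the entry only when the hostname is not already listed. The function name ensure_hostname_resolves and its optional arguments are made up for illustration; it mirrors the 127.0.0.1 entry suggested above and must be run as root when it targets /etc/hosts.

```shell
# Hypothetical helper: add "127.0.0.1 <hostname>" to a hosts file once.
# $1: hosts file (defaults to /etc/hosts)
# $2: hostname to add (defaults to the output of `hostname`)
ensure_hostname_resolves() {
    _hosts=${1:-/etc/hosts}
    _name=${2:-$(hostname)}
    # Look for the name as a whole field after the address column,
    # skipping comment lines.
    if awk -v n="$_name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$_hosts" 2>/dev/null; then
        return 0              # already present, leave the file alone
    fi
    printf '127.0.0.1 %s\n' "$_name" >> "$_hosts"
}

# Usage (as root):
#   ensure_hostname_resolves
```

Matching whole fields rather than substrings means a host called node is not mistaken for an existing entry called node2.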
XmlRpc applications aren't working, like bput and bget
In short, Mono 2.0 ships with a bug in the HTTP Remoting code. A few solutions are to download a 2.2+ version of Mono, downgrade to pre-2.0 (1.9), or extract the contents of this to your IPOP/BasicNode folder. See this for more information. And PLEASE file a bug report for your distribution.
I lost my IP Address, what can I do to get it back?
In the Grid Appliance:
- edit /usr/local/ipop/var/ipop.config and replace the IP address with the one you want
- rm /var/lib/dhcp*/*
- /etc/init.d/ipop.sh restart
IPOP:
- edit your ipop.config and replace the IP address with the one you want
- restart IPOP
How do I use a static IP address in IPOP on Windows?
You can explicitly change the dynamic IP address of "tapipop" to a static one in the Windows GUI, that is:
- Have IPOP running on Windows.
- Make sure the IP address you want to use does not conflict with an existing one (by pinging it, for example).
- Open "Start", go to "Control Panel".
- Open "Network and Internet Connections", and then "Network Connections".
- Right-click "tapipop" interface, select "Properties".
- Select "Internet Protocol (TCP/IP)".
- Check "Use the following IP address", and then fill in the IP address and the subnet mask of your virtual network (for example, if you are using the IP space 5.0.0.0/8, put "5.X.X.X" in "IP address" and "255.0.0.0" in "Subnet mask").
- Click on "OK", and then "OK".
Then the static IP should be working.
I have a question not listed here, what can I do?
Review documentation here and here. Still no luck? Join the ACIS P2P Users Group and ask questions there.

