Archer:Simics
From Grid-Appliance Wiki
Introduction
Overview: This tutorial guides the user through using Simics on Archer. It assumes that the user is already familiar with Simics; it is not intended to teach Simics, but rather to show how to run a Simics simulation in Archer.
Simics on Archer
Virtutech® Simics is a full system simulation platform, capable of simulating high-end target systems with sufficient fidelity and speed to boot and run operating systems and commercial workloads. Typical Simics users develop their own modules, disk images and checkpoints, plug them into Simics and run simulations. Simics is also used as an "instruction generator" in conjunction with timing models like FeS2 and GEMS.
Archer is an open infrastructure with hundreds of cores for computer architecture simulation, available to academic users around the globe. Archer currently has a pool of approximately 450 CPUs readily accessible from an easy-to-install virtual machine appliance. Archer has system-wide support for sharing large datasets using the well-known Network File System (NFS) interface. This setup is especially useful for simulators, like Simics, that require large checkpoint images. Users can easily create NFS repositories with their images on their own resources, and have them accessible (read-only) by remote Archer nodes.
There are several modes in which Simics can be run on Archer. Simulations can be run on the local Grid Appliance, with the checkpoints either stored on the same appliance or accessed via NFS from other appliances in the Archer pool. Simulations may also be submitted as batch jobs, which will be scheduled on the available machines in the Archer pool. User-created modules and devices, like caches or TLBs, may be used in such simulations, both those running on local appliances and those submitted as batch jobs. This tutorial provides illustrative examples for each of the above scenarios.
Getting Started
Obtaining the Grid Appliance
Follow the Registering_with_Archer tutorial, specifically
- Create an account with the Grid appliance portal
- Register with the Archer GroupVPN
- Create or Join a GroupAppliance.
- Connecting_to_Archer_Global. In this step, ensure that you download the Grid appliance image which has the Simics module.
- Running the Appliance. The Grid appliance should start up and log in automatically.
- Check your User Name and state. Check that $HOME points correctly to your home directory /home/your_user_name
- Ensure Simics is installed in /opt/virtutech/simics3/
Creating a workspace and testing Simics
Set up a workspace for running your simulations and start up a simple Simics simulation to ensure that it works properly.
cd $HOME
mkdir -p workspace
/opt/virtutech/simics3/bin/workspace-setup workspace
cd workspace
./simics targets/ebony/ebony-linux-firststeps.simics
You should see a Simics simulation start up. If there are any issues with checking out the license, refer to the Licensing Details section.
Simics simulations - checkpoints and user-created modules
In typical Simics simulations, a system is booted and run until the point of interest has been reached. The system is then checkpointed, and further simulations use this checkpoint as the starting point. These simulations may use standard modules which are provided with Simics or customized user-created modules.
In this tutorial, a checkpoint has been prepared which has:
- Booted Xen 3.0.2 with two virtual machines ("domains" in the Xen terminology)
- A dom0 (management domain) with domid 0, virtual IP 10.10.0.12 and 768 MB memory
- A domU (user domain) with domid 2, virtual IP 10.10.0.14 and 256 MB memory
- SSH with RSA keys (no need of passwords) set up between the two domains
- Two workloads in each domain
Xen has been compiled with Simics-specific hooks to identify domain switches, TLB flushes and page faults. A Python script to catch these hooks has also been provided.
Running Simics on your local Grid Appliance
The user-created checkpoints, scripts and modules for Simics simulations may be located on the local appliance or accessed via the Archer NFS from another appliance in the Archer pool. Both of these options are illustrated.
Running on the local appliance with checkpoints on the same appliance
The checkpoint and the scripts mentioned in the previous section can be downloaded from http://www.grid-appliance.org/files/archer/tutorials/simics/simics_tutorial_files.tgz. Download and unpack this file within the appliance as follows:
cd $HOME
sudo wget http://www.grid-appliance.org/files/archer/tutorials/simics/simics_local_tutorial_files.tgz
mkdir -p test_simics_local
cd test_simics_local
tar -xzf ../simics_local_tutorial_files.tgz
mkdir -p new-workspace
/opt/virtutech/simics3/bin/workspace-setup new-workspace
cd new-workspace
Start up the simulation (on your local Grid Appliance). The steps below accomplish the following goals:
- start Simics and load the checkpointed system image with the Xen hypervisor and two domains pre-booted and ready to run applications
- load the Python script magic-hap-handler.py, which contains the hooks described above
- force a keyboard entry to the console that starts the execution of the nbench benchmark
- run the simulation for one million processor cycles (in this mode, Simics executes one instruction per cycle, so this is equivalent to running the simulation for 1M instructions)
- quit the simulation
./simics ../xen-dom0-domu
simics > run-python-file ../magic-hap-handler.py
simics > con0.input "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT \n"
simics > continue 1_000_000
simics > quit
(for details about con0.input, refer to the gfx-console in the Simics Reference Manual)
The details of the context switches will be displayed in the Simics console and will look something like:
Switching from 0 to 32767 at 226982799362
Switching from 32767 to 0 at 226982802018
Switching from 0 to 2 at 226983022800
Switching from 2 to 0 at 226983049827
[cpu0] cs:0xff108c5e p:0x00108c5e mov ebx,dword ptr [ecx]
By uncommenting different portions of the magic-hap-handler.py script, different events, such as TLB flushes, can be tracked. The Python script can also be modified so that these messages are written to a file.
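Once the switch messages have been saved to a file, they can be tallied offline. The following is a minimal Python sketch, assuming the messages keep the "Switching from A to B at T" format of the sample output; the helper name count_domain_switches is hypothetical and is not part of the tutorial package.

```python
import re
from collections import Counter

# Line format assumed from the sample console output in this tutorial.
SWITCH_RE = re.compile(r"Switching from (\d+) to (\d+) at (\d+)")


def count_domain_switches(lines):
    """Count (from_domid, to_domid) transitions found in console output lines."""
    counts = Counter()
    for line in lines:
        match = SWITCH_RE.search(line)
        if match:
            src, dst = int(match.group(1)), int(match.group(2))
            counts[(src, dst)] += 1
    return counts


sample = [
    "Switching from 0 to 32767 at 226982799362",
    "Switching from 32767 to 0 at 226982802018",
    "Switching from 0 to 2 at 226983022800",
    "Switching from 2 to 0 at 226983049827",
    "[cpu0] cs:0xff108c5e p:0x00108c5e mov ebx,dword ptr [ecx]",
]
for (src, dst), n in sorted(count_domain_switches(sample).items()):
    print(f"{src} -> {dst}: {n}")
```

Lines that do not match the pattern, such as the disassembly line, are simply skipped.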
Running on the local appliance using scripts
To automate Simics in batch mode, we need two files: a wrapper script and a batch script. These are part of the package you extracted at the beginning of this tutorial; their contents are shown here for your reference:
- Wrapper script. This script is the top-level script which sets up the environment for the Simics simulation and starts the simulation. There are many ways to create such a wrapper script, and this is dependent upon the application you are running. The following is a template using Bash that works with Simics (simics_wrapper.sh):
#################################################################################
# simics_wrapper.sh to demo running Simics on Archer
# sets up a workspace called new-workspace inside the current folder
mkdir -p new-workspace
cd new-workspace
tgt_wrk_spc=`pwd`
cd /opt/virtutech/simics3/bin/
./workspace-setup $tgt_wrk_spc
cd $tgt_wrk_spc
#invoking Simics
./simics ../xen-dom0-domu -no-win -batch-mode ../batch_script.simics > ../screen_dump.out
#################################################################################
- Batch script. This script provides the Simics commands that drive the batch simulation. Here is a sequence of commands for the simulation of Nbench for 1 million cycles (batch_script.simics):
#####################################
#batch_script.simics
run-python-file ../magic-hap-handler.py
con0.input "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT \n"
continue 1_000_000
quit
#####################################
Go back to the test_simics_local directory, clean up the new-workspace directory, and run the simics_wrapper.sh script:
cd $HOME/test_simics_local
rm -rf new-workspace
./simics_wrapper.sh
Once the job completes, we can examine $HOME/test_simics_local/screen_dump.out for the results. It will look something like:
Checking out a license... done: academic license.
The user is deemed to have read and complied with the SLA at http://www.grid-appliance.org/files/archer/tutorials/simics/NON_COMMERCIAL_SLA
Use of this software is subject to appropriate license.
Type 'copyright' for details on copyright.
Type 'help help' for info on the on-line documentation.
[con0 info] Graphics subsystem not initialized
Switching from 0 to 32767 at 226982799362
Switching from 32767 to 0 at 226982802018
Switching from 0 to 2 at 226983022800
Switching from 2 to 0 at 226983049827
Simics license checked in!
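One way to use screen_dump.out is to estimate how many cycles each domain ran between switches. The sketch below assumes that each "Switching from A to B at T" line marks the cycle T at which domain A stops and domain B starts running; cycles_per_domain is a hypothetical helper, not something shipped with the tutorial files. The domid 32767 most likely corresponds to Xen's idle domain.

```python
import re
from collections import defaultdict

# Line format assumed from the sample output in this tutorial.
SWITCH_RE = re.compile(r"Switching from (\d+) to (\d+) at (\d+)")


def cycles_per_domain(text):
    """Sum the cycles each domain ran between consecutive switch messages.

    Only intervals bounded by two switch lines are counted, so the time
    before the first switch and after the last one is ignored.
    """
    totals = defaultdict(int)
    running, since = None, None
    for match in SWITCH_RE.finditer(text):
        src, dst, cycle = (int(g) for g in match.groups())
        if running is not None and running == src:
            totals[src] += cycle - since
        running, since = dst, cycle
    return dict(totals)


dump = """\
[con0 info] Graphics subsystem not initialized
Switching from 0 to 32767 at 226982799362
Switching from 32767 to 0 at 226982802018
Switching from 0 to 2 at 226983022800
Switching from 2 to 0 at 226983049827
Simics license checked in!
"""
print(cycles_per_domain(dump))
```

For the sample above this attributes 220782 cycles to dom0, 27027 cycles to the domU (domid 2) and 2656 cycles to domid 32767.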
Running on local appliance using checkpoints over Archer NFS
The checkpoint and the script used in the section Running on the local appliance with checkpoints on the same appliance can also be accessed by mounting the Archer NFS shared directory exported by appliance C001001250 to a local mount point on your Grid Appliance, as follows:
cd $HOME
mkdir -p workspace_nfs
/opt/virtutech/simics3/bin/workspace-setup $HOME/workspace_nfs
cd workspace_nfs
ln -s /mnt/ganfs/C001001250 local_mount
Start up the simulation (on your local Grid Appliance), as in the section Running on the local appliance with checkpoints on the same appliance.
./simics local_mount/homework2/xen-dom0-domu
simics > run-python-file local_mount/homework2/magic-hap-handler.py
simics > con0.input "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT \n"
simics > continue 1_000_000
simics > quit
The output should be similar to that in the section Running on the local appliance with checkpoints on the same appliance.
This mechanism may be used to create repositories of checkpoints or scripts on a Grid Appliance and to use them for running simulations.
Running a Simics job through Condor
In this example, we will submit Simics simulations as batch jobs to the Archer pool using Condor. These jobs will be scheduled and executed on the Archer nodes. The example simulation used here is similar to the simulations in the section Running on the local appliance using scripts, with one significant difference: a cache model is used to study the cache behaviour of Nbench.
To run a Simics job on Archer through Condor, you need three files: a wrapper script, a batch script and a condor submit file. Examples of these files are described below:
- The wrapper script, similar to the one in the section Running on the local appliance using scripts, is used to set up the simulation environment and start the simulation. This script is the first thing executed when a job submitted through Condor finds a destination host to run on.
#!/bin/sh
export SIMICS_HOST=x86-linux # for 32-bit Simics simulations
# export SIMICS_HOST=amd64-linux for 64-bit Simics
mkdir -p new-workspace
cd new-workspace
tgt_wrk_spc=`pwd` # create a workspace directory and store path in tgt_wrk_spc
cd /opt/virtutech/simics3/bin
./workspace-setup $tgt_wrk_spc # setup workspace
cd $tgt_wrk_spc
cp ../setup-gcache.simics . # copy Simics configuration files into workspace
cp ../batch_script.simics .
# Run Simics; -no-win and -batch-mode are required
# for batch execution, and -stall for cache simulation
# Follow commands from batch_script.simics and write standard
# out to screen_dump.out
./simics /mnt/ganfs/C001001250/homework2/xen-dom0-domu -no-win -batch-mode -stall \
batch_script.simics > ../screen_dump.out
- The batch script is used to provide the commands to the simulation. Unlike the one in the section Running on the local appliance using scripts, this batch_script.simics file contains a sequence of commands that simulates Nbench for 2 million cycles using the cache model and prints out cache statistics at the end (batch_script.simics):
#batch_script.simics
run-command-file setup-gcache.simics
con0.input "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT\n"
c 2_000_000
cpu0_l1_uc.statistics
quit
- A condor submit file. This file tells Condor which files to transfer before and after running a job, which binary to execute, and requirements for the machines that will run the job. Here is a template Condor submit file for this exercise (condor_script):
# specify the executable and job type (Simics is always vanilla in Condor)
universe = vanilla
executable = simics_wrapper.sh
# specify requirements for resource running job. In this case, resource must
# have memory >= 512MB and Simics installed
requirements = HasSimicsX86 == TRUE && Memory >= 512
# Use requirements = HasSimicsAmd64 == TRUE && Memory >= 512 for 64-bit Simics simulations
# Provide names for the output files. $(Cluster) and $(Process) are substituted
# by unique Condor IDs
Log = simics.$(Cluster).$(Process).log
Error = simics.$(Cluster).$(Process).err
Output = simics.$(Cluster).$(Process).out
# Tell Condor which files to transfer: send the Simics input files, retrieve output
should_transfer_files = yes
when_to_transfer_output = ON_EXIT
transfer_input_files = setup-gcache.simics, batch_script.simics
transfer_output_files = screen_dump.out
# Submit job
queue
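The requirements line of the submit file has to agree with the SIMICS_HOST value exported in the wrapper script. As an illustration only, the following Python sketch renders the submit file for either host type; the helper name submit_file and the dictionary REQUIREMENT_ATTR are hypothetical, while the attribute names HasSimicsX86 and HasSimicsAmd64 are taken from the template above.

```python
# ClassAd attribute names as used in the submit file of this tutorial.
REQUIREMENT_ATTR = {
    "x86-linux": "HasSimicsX86",
    "amd64-linux": "HasSimicsAmd64",
}


def submit_file(simics_host, min_memory_mb=512):
    """Return Condor submit-file text for the given SIMICS_HOST value."""
    attr = REQUIREMENT_ATTR[simics_host]  # raises KeyError for unknown hosts
    lines = [
        "universe = vanilla",
        "executable = simics_wrapper.sh",
        f"requirements = {attr} == TRUE && Memory >= {min_memory_mb}",
        "Log = simics.$(Cluster).$(Process).log",
        "Error = simics.$(Cluster).$(Process).err",
        "Output = simics.$(Cluster).$(Process).out",
        "should_transfer_files = yes",
        "when_to_transfer_output = ON_EXIT",
        "transfer_input_files = setup-gcache.simics, batch_script.simics",
        "transfer_output_files = screen_dump.out",
        "queue",
    ]
    return "\n".join(lines) + "\n"


print(submit_file("amd64-linux"))
```

Generating the file from one place keeps the 32-bit and 64-bit variants from drifting apart when other settings change.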
Example 1: submitting a sample job
- In the appliance, create a folder named "condor1" and copy the files from /mnt/ganfs/C001001250/homework3/ into this folder
mkdir -p $HOME/condor1
cd $HOME/condor1
tar -xzf /mnt/ganfs/C001001250/homework3/hw3_cache.tgz
- To submit the job:
condor_submit condor_script
- To track the progress of the job:
condor_q
When the job finishes executing, you should see the screen_dump.out file with the outputs of this simulation:
more screen_dump.out
Example 2: submitting multiple jobs
The instructions above show how to run a single job. In many realistic scenarios, however, you need to run many simulations. This is where Condor is particularly useful: among other features, it ensures that your jobs are queued for execution if there are no resources available, and it deals with failures by retrying. However, one needs to be careful when preparing multiple jobs, to avoid files being overwritten and to keep appropriate bookkeeping.
Given the way the Condor and Simics configuration files are set up, one approach here is to create multiple directories, one per configuration, and submit your jobs from these directories. For example, to set up two different configurations where nbench runs on dom0 and domU:
mkdir -p $HOME/nbench-dom0 $HOME/nbench-domU
cp /mnt/ganfs/C001001250/homework3/hw3_cache.tgz $HOME/nbench-dom0
cp /mnt/ganfs/C001001250/homework3/hw3_cache.tgz $HOME/nbench-domU
cd $HOME/nbench-dom0
tar -xzf hw3_cache.tgz
cd $HOME/nbench-domU
tar -xzf hw3_cache.tgz
# optionally, edit nbench-dom0/batch_script.simics to set up con0.input appropriately, see examples below
# optionally, edit nbench-domU/batch_script.simics to set up con0.input appropriately, see examples below
cd $HOME/nbench-dom0
condor_submit condor_script
cd $HOME/nbench-domU
condor_submit condor_script
The output files screen_dump.out will be stored in the corresponding subdirectories at the end of the job execution.
In general, if you want to run many simulations and they all generate an output file with the same name, you will need to place each simulation setup in a separate directory. It is also possible to change the name of each output file generated by tweaking the Condor wrapper script. Check out the Condor documentation/user's mailing lists for ideas on how to do this.
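The per-directory layout can also be prepared programmatically. The following Python sketch writes one batch_script.simics per configuration directory, assuming the batch script has the same command sequence as the cache example in this tutorial; write_batch_script and the JOBS table are hypothetical names, and a temporary directory stands in for $HOME. The remaining files (wrapper script, submit file, setup-gcache.simics) would still come from hw3_cache.tgz.

```python
import os
import tempfile


def write_batch_script(job_dir, con0_line, cycles=2_000_000):
    """Create job_dir and write a batch_script.simics that runs con0_line."""
    os.makedirs(job_dir, exist_ok=True)
    script = "\n".join([
        "run-command-file setup-gcache.simics",
        con0_line,
        f"c {cycles}",
        "cpu0_l1_uc.statistics",
        "quit",
    ]) + "\n"
    path = os.path.join(job_dir, "batch_script.simics")
    with open(path, "w") as handle:
        handle.write(script)
    return path


# con0.input lines as listed in the hints section of this tutorial.
JOBS = {
    "nbench-dom0": r'con0.input "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT\n"',
    "nbench-domU": r'''con0.input "ssh 10.10.0.14 'cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT'\n"''',
}

root = tempfile.mkdtemp()  # stand-in for $HOME in this sketch
paths = [write_batch_script(os.path.join(root, name), line)
         for name, line in JOBS.items()]
print(paths)
```

Because each job lives in its own directory, every screen_dump.out lands next to the batch script that produced it.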
Example 3: Customizing simulation by changing the cache size
The architecture of the cache that is being simulated can be changed by modifying the setup-gcache.simics file. This file is shown below, annotated with explanations of the different parameters:
#########################################################
# To set up 1 level of unified cache
@from configuration import *
$tsc = cpu0->ia32_time_stamp_counter
@SIM_set_configuration([
OBJECT("cpu0_l1_uc", "g-cache",
cpus = OBJ("cpu0"), - This specifies the CPU to which this cache is connected
config_line_number = 512, - The number of lines (entries) in the cache
config_line_size = 64, - The size of each cache line in bytes
config_assoc = 2, - Associativity of the cache
config_virtual_index = 0, - Specifies whether the cache is indexed using
virtual address or physical address.
config_virtual_tag = 0, - Specifies whether the cache is tagged using
virtual address or physical address.
config_replacement_policy = "lru", - Cache replacement policy
penalty_read = 0, - Read/write latencies
penalty_write = 0,
penalty_read_next = 0, - Latencies for read/write communication with the
next unit in the memory hierarchy
penalty_write_next = 0)])
# plug the hierarchy
@conf.cpu0_mem.timing_model = conf.cpu0_l1_uc
# Send instruction fetches to the cache - NOTE MUST START IN STALL MODE
cpu0.ifm "instruction-fetch-trace"
cpu0->ia32_time_stamp_counter = $tsc
#########################################################
By changing config_line_number, the size of the cache can be varied: the total capacity is the number of lines multiplied by the line size.
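As a worked example of how the parameters relate, the sketch below computes the capacity and the number of sets for a given configuration, and the line count needed for a target capacity. It assumes that config_line_size is expressed in bytes (worth checking against the g-cache reference for your Simics version); the function names are hypothetical.

```python
def cache_geometry(line_number, line_size_bytes, assoc):
    """Return (total_bytes, number_of_sets) for a set-associative cache."""
    if line_number % assoc != 0:
        raise ValueError("line_number must be a multiple of assoc")
    return line_number * line_size_bytes, line_number // assoc


def lines_for_size(total_bytes, line_size_bytes):
    """Return the config_line_number value that yields total_bytes of capacity."""
    if total_bytes % line_size_bytes != 0:
        raise ValueError("total_bytes must be a multiple of the line size")
    return total_bytes // line_size_bytes


size, sets = cache_geometry(line_number=512, line_size_bytes=64, assoc=2)
print(size // 1024, "KiB in", sets, "sets")
print(lines_for_size(64 * 1024, 64))
```

Under that assumption, the configuration shown above (512 lines of 64 bytes, 2-way) is a 32 KiB cache with 256 sets, and doubling the capacity to 64 KiB means setting config_line_number to 1024.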
Example 4: Using Custom TLB modules
In the previous examples, the default 64-entry x86-tlb was being used. This example will demonstrate how to use a TLB module that the user has compiled. This approach can be used to run Simics simulations with user-created modules and devices.
Simulations for different TLB configurations require additional files to be sent along with the job, as well as different checkpoint images. The following files provide templates for the execution of TLB simulations of 2 million instructions for OpenAFS, with TLBs configured with 64 and 256 entries, respectively:
mkdir -p $HOME/openafs-64
cd $HOME/openafs-64
tar -xzf /mnt/ganfs/C001001250/homework3/hw3_tlb64.tgz
# edit batch_script.simics to set up con0.input appropriately
condor_submit condor_script
cd ..
mkdir -p $HOME/openafs-256
cd $HOME/openafs-256
tar -xzf /mnt/ganfs/C001001250/homework3/hw3_tlb256.tgz
# edit batch_script.simics to set up con0.input appropriately
condor_submit condor_script
Simics with GEMS
Getting GEMS modules working with Simics
The current version of GEMS supports both Simics 2.2.x and 3.0.x. However, license files are no longer provided for the 2.2.x versions of Simics, and Archer has 3.0.31 installed by default.
If you want to install the same Simics 3.0.x version as on your local system, doing so is straightforward, and the Archer license server works with it. Once you have Simics working, install the prerequisite packages from the Installation section before you follow the steps at Setting up your GEMS environment and Quickstart.
Installation
You need a few pre-requisite packages installed on your Grid Appliance. Those are as follows:
sudo apt-get update
sudo apt-get install csh bison flex
After the compilation steps in the Quickstart guide of GEMS, you can continue following the instructions to submit Simics jobs.
Simics with FeS2
FeS2 is a timing-first, multiprocessor, x86 simulator, implemented as a module for Virtutech Simics. The highlights of FeS2 are
- Full System simulation with an accurate execution-driven timing-model that includes a cache hierarchy, branch predictors and a superscalar out-of-order core for x86 ISA
- Provides multiprocessor (Ruby from GEMS) support
- Uses PTLSim model to decode the x86 instructions to micro-operations
More about FeS2 can be found at the FeS2 website.
Installing Prerequisites
FeS2 builds without any issues using g++-4.1. The following instructions install g++-4.1 and link /usr/bin/g++ to g++-4.1. Other versions of gcc or g++ installed on the Grid Appliance may cause conflicts and errors during the build. Hence, it is advisable to install FeS2 on a fresh appliance which does not have gcc and g++ installed.
sudo iptables --flush
sudo apt-get update
sudo apt-get install g++-4.1 bison m4 flex scons subversion make libqt3-mt qt3-dev-tools
cd /usr/bin
sudo ln -s gcc-4.1 gcc
sudo ln -s g++-4.1 g++
Checking Out FeS2 code and building it
Open a command terminal and type the following commands to check out and build FeS2.
cd $HOME
svn co http://subversion.cs.uiuc.edu/pub/svn/FeS2/trunk FeS2
cd FeS2
cd $HOME/FeS2
export SIMICS_HOST=x86-linux # or export SIMICS_HOST=amd64-linux for 64-bit appliance
export FES2_HOME=${PWD}
export PACG_HOME=${PWD}/external
export PYTHONPATH=${PACG_HOME}:${FES2_HOME}/python_lib:${PYTHONPATH}
export SIMICS_INSTALL_DIR=/opt/virtutech/simics3
export LD_LIBRARY_PATH=/usr/share/qt3/lib/
cd $HOME/FeS2
./install.py
make
Testing the Build
In order to test the build the user has to create a Simics checkpoint. Archer users can use the checkpoint at /mnt/ganfs/C001001250/fes2-checkpoint/. A Simics script file, for running the simpletest binary inside the checkpoint, and some related files can be found in the $HOME/FeS2/test/example/simpletest directory. To use them, copy the files into your FeS2-workspace directory and build the test program.
cd $HOME/FeS2/
cp test/example/simpletest/* simics/FeS2-workspace
cd $HOME/FeS2/simics/FeS2-workspace
./simics -quiet /mnt/ganfs/C001001250/fes2-checkpoint/tango-booted_standalone -x simpletest_notrace.simics
Two statistic files will be generated: one for the warmup (named simpletest.stats.warmup) and one for the timing run (named simpletest.stats). These statistics files are formatted such that they can be directly parsed by a python script for analysis, as demonstrated by the included example script.
cd $HOME/FeS2/simics/FeS2-workspace
./simpletest_stats_parser.py simpletest.stats
You will see an output similar to what is shown below.
--------------------------------------------------------------------------------
infile: simpletest.stats
total cycles: 42293
total UOps: 123449 ( UPC: 2.919 ),
total X86 Ops: 71780 ( IPC: 1.697 ),
error rate: 0.0502% ,
branch prediction: 99.26% ( 10423 / 10501 ),
L1 cache hit rate: 98.59% ( 70129 / 71132 ),
L2 cache hit rate: 99.60% ( 999 / 1003 ),
--------------------------------------------------------------------------------
This verifies the build.
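The derived figures in this output follow directly from the raw counts, which gives a quick sanity check on any stats file. The Python sketch below recomputes them from the sample numbers; it is not the bundled simpletest_stats_parser.py, and the variable names are chosen here for illustration.

```python
def ratio(numerator, denominator):
    """Plain division, kept separate so every metric is computed the same way."""
    return numerator / denominator


# Raw counts copied from the sample output in this tutorial.
cycles, uops, x86_ops = 42293, 123449, 71780
branches_ok, branches = 10423, 10501
l1_hits, l1_accesses = 70129, 71132
l2_hits, l2_accesses = 999, 1003

print(f"UPC: {ratio(uops, cycles):.3f}")
print(f"IPC: {ratio(x86_ops, cycles):.3f}")
print(f"branch prediction: {100 * ratio(branches_ok, branches):.2f}%")
print(f"L1 hit rate: {100 * ratio(l1_hits, l1_accesses):.2f}%")
print(f"L2 hit rate: {100 * ratio(l2_hits, l2_accesses):.2f}%")
print("L1 misses:", l1_accesses - l1_hits)
```

UPC and IPC are micro-operations and x86 instructions per cycle, respectively. In this sample the number of L1 misses (1003) equals the number of L2 accesses, which is what one would expect if every L1 miss is forwarded to the L2.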
Some useful hints
Listed below are tips for some common Simics tasks.
- For setting memory, number of CPUs, etc., the top-level Simics script (enterprise-archer-common.simics) can be used
- Disk image diff files may be added as the very first command in batch_script.simics
- It is recommended that any logging or tracing write its output to files, which can then be transferred back using Condor (similar to screen_dump.out)
- Cache states, TLB states, etc. can be written to files by using Simics-provided commands (like tlb.status) to display the contents and redirecting the output to files (like screen_dump.out)
- Checkpoints may be loaded by using the read-configuration command
- If a checkpoint is created at the end of the batch job using the write-configuration command, it may be a good idea to compress it into craff format before transferring it back from the target machine.
- In some cases, the customized modules may be quite large, and submitting them with the other files using the submit script may not be efficient. In such cases, copy the customized modules to /mnt/local/myexport and modify the bash script so that it copies the module from /mnt/ganfs/Cxxxyyyzzz/myexport/. Also remove the customized module from the list of files to be transferred in the submit script.
- con0 input strings for running benchmarks
# nBench
# In dom0:
con0.input "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT\n"
# In domU:
con0.input "ssh 10.10.0.14 'cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT'\n"
# In both dom0 and domU:
con0.input "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT &\n"
con0.input "ssh 10.10.0.14 'cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT'\n"
# OpenAFS make
# In dom0:
con0.input "cd /root/openafs-1.4.7; make\n"
# In domU:
con0.input "ssh 10.10.0.14 'cd /root/openafs-1.4.7; make'\n"
# In both dom0 and domU:
con0.input "cd /root/openafs-1.4.7; make &\n"
con0.input "ssh 10.10.0.14 'cd /root/openafs-1.4.7; make'\n"
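The strings above follow one pattern: a command is typed directly on the dom0 console, or wrapped in an ssh call to reach the domU, with an optional trailing & to leave the console free for a second command. A minimal Python sketch of that pattern follows; con0_input is a hypothetical helper, and it does no quoting of its own, so commands containing quote characters would need extra care.

```python
DOMU_IP = "10.10.0.14"  # virtual IP of the domU in the tutorial checkpoint


def con0_input(command, target="dom0", background=False):
    """Build a Simics con0.input line that types `command` on the dom0 console.

    For target="domU" the command is wrapped in an ssh call to the domU,
    mirroring the examples listed in this tutorial.
    """
    if target == "domU":
        command = f"ssh {DOMU_IP} '{command}'"
    elif target != "dom0":
        raise ValueError("target must be 'dom0' or 'domU'")
    suffix = " &" if background else ""
    # "\\n" emits a literal backslash-n, which Simics turns into Enter.
    return f'con0.input "{command}{suffix}\\n"'


nbench = "cd /root/nbench-byte-2.2.3; ./nbench -cCOM.DAT"
print(con0_input(nbench, background=True))
print(con0_input(nbench, target="domU"))
```

The two printed lines reproduce the "In both dom0 and domU" nBench entry from the list above.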
Simics Licensing Details
Virtutech® issues floating-node academic licenses to academic users, which provides all Simics binaries and the source code for some modules, free of charge. The Software License Agreement for this license can be found at http://www.grid-appliance.org/files/archer/tutorials/simics/NON_COMMERCIAL_SLA. All users who use Simics via the Archer infrastructure are deemed to have read and complied with this SLA. More about Simics can be found at the Simics Website.
Whenever a Simics simulation is initiated on the Archer Grid Appliance, the Archer license server (which handles Simics licenses for the Archer infrastructure) is contacted and one of 300 floating licenses gets checked out for the simulation. Once the simulation is completed, the license gets returned to the license server.
If the license does not get checked out properly, the simulation will not proceed. Such cases may happen because
- The Grid Appliance is not part of the Archer pool. In such cases, the error message will be something like
Checking out a license... *** Failed to checkout feature "p_developer"
Server node is down or not responding
See the system adminstrator about starting the server, or
make sure the you're referring to the right host (see LM_LICENSE_FILE)
Feature: p_developer
Hostname: 5.1.1.251
License path: /opt/virtutech/simics3/licenses/license.lic
FLEXlm error: -96,482
For further information, refer to the FLEXlm End User Manual,
available at "www.macrovision.com".
Simics license checked in!
In such cases, ensure that you are connected to the Archer pool. If you have just started the appliance, wait for a few minutes and try again.
- Sometimes, the time on the node on which the job gets scheduled may be different from the time on the Simics license server. If this difference is large, the license server will refuse to check out the license and the Simics job will terminate without running the simulation. The error message in this case will be:
Checking out a license... *** Failed to checkout feature "p_developer"
Clock difference too large between client and server
Feature: p_developer
License path: /opt/virtutech/simics3/licenses/license.lic
FLEXlm error: -34,147
For further information, refer to the FLEXlm End User Manual,
available at "www.macrovision.com".
Simics license checked in!
In such cases, correct the time on your appliance using ntpdate-debian and try again.
- Sometimes, all 300 available licenses will have been checked out by other users. In such cases, the error message will be:
Checking out a license... *** Failed to checkout feature "p_developer"
Licensed number of users already reached
Feature: p_developer
License path: @10.227.56.131:@128.227.56.122:/opt/virtutech/simics-3.0.31/licenses/u_florida-x2009-07-03-acis.lic
FLEXlm error: -4,132
For further information, refer to the FLEXlm End User Manual,
available at "www.macrovision.com".
Simics license checked in!
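Since each failure mode carries a distinct FLEXlm error code, a batch workflow can classify a failed run from screen_dump.out automatically. The following Python sketch maps the three codes quoted in this section to the remedies described here; diagnose_license_failure and the HINTS table are hypothetical, and other FLEXlm codes are reported as unrecognized rather than guessed at.

```python
import re

# Major FLEXlm error codes as they appear in the messages quoted in this tutorial.
HINTS = {
    -96: "license server unreachable: check the connection to the Archer pool",
    -34: "clock difference too large: correct the appliance time and retry",
    -4: "all floating licenses are in use: retry later",
}

ERROR_RE = re.compile(r"FLEXlm error:\s*(-?\d+),\s*(\d+)")


def diagnose_license_failure(output):
    """Return a hint for the first FLEXlm error in `output`, or None if absent."""
    match = ERROR_RE.search(output)
    if match is None:
        return None
    major = int(match.group(1))
    return HINTS.get(major, f"unrecognized FLEXlm error {major}")


failed_run = (
    'Checking out a license... *** Failed to checkout feature "p_developer"\n'
    "Clock difference too large between client and server\n"
    "FLEXlm error: -34,147\n"
)
print(diagnose_license_failure(failed_run))
```

A wrapper script could call such a check after Simics exits and resubmit only the jobs that failed for a transient reason, such as exhausted licenses.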

