EEL6892 Class Project 3
From Grid-Appliance Wiki
Introduction
In this project you will apply simulation-based analysis techniques to investigate the behavior of virtual machine environments on x86 architectures. You will use the Simics x86 simulator, the pre-configured Xen checkpoints and the nbench and openAFS compilation benchmarks that were used in the previous assignments.
Below are two possible approaches for directions you can take in your project. Other ideas are encouraged - if you have an approach of your own that you would like to propose, please discuss it with the instructor.
Project report
Your project can be done individually or in groups of 2. You will turn in a report of at most 6 pages, with the following sections: introduction, approach, experimental methodology, data collected, data analysis, and conclusion. The report is due Dec. 9th.
Approach 1: study of performance isolation under different VMM scheduler parameters
Xen 3.0 has a scheduler for VMs called sEDF, which is based on the EDF (earliest deadline first) policy. With this scheduler, you can allocate different fractions of CPU time to different virtual machines. This is intended to give domains reserved CPU time allocations, but it does not support reservations of other architectural structures, such as the TLB and caches.
In this project, your goal is to:
- Learn about the sEDF scheduling policy
- Conceive experiments that consider different sEDF parameters for dom0/domU, and different combinations of benchmarks running in these domains
- Analyze the relationship between performance metrics collected at the simulation layer (e.g. miss rates) and the scheduler configuration
See the "Hints" section below for an example of how to configure sEDF.
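When planning the experiments above, it can help to generate the scheduler commands for a parameter sweep up front. The following is only a sketch: the helper names (`sedf_command`, `sweep_commands`), the example periods and CPU fractions, and the assumption that domU has domain number 1 are all illustrative and not part of the assignment. It uses only the `-p`, `-s` and `-e` flags shown in the "Hints" section.

```python
from itertools import product


def sedf_command(dom_id, period_ms, slice_ms, extra):
    """Build one 'xm sched-sedf' command line (hypothetical helper)."""
    if not 0 < slice_ms <= period_ms:
        raise ValueError("slice must be positive and no larger than the period")
    return "xm sched-sedf %d -p %d -s %d -e %d" % (
        dom_id, period_ms, slice_ms, 1 if extra else 0)


def sweep_commands(dom_id, periods_ms, cpu_fractions, extras):
    """Yield one command per (period, CPU fraction, extra flag) combination."""
    for period, fraction, extra in product(periods_ms, cpu_fractions, extras):
        # The slice is the reserved share of each period, rounded to whole ms.
        slice_ms = int(round(period * fraction))
        yield sedf_command(dom_id, period, slice_ms, extra)


if __name__ == "__main__":
    # Assumed example values: domU is domain 1, 20 ms and 100 ms periods.
    for cmd in sweep_commands(1, [20, 100], [0.25, 0.5, 0.75], [False, True]):
        print(cmd)
```

Each printed line would then be run in dom0 before starting the benchmarks for that configuration.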
Approach 2: study of per-domain architecture statistics
In this project, your goal is to study and characterize performance metrics at the architecture level (e.g. miss rates) on a per-domain basis. Using and extending the Simics simulator, you will be able to differentiate the TLB and cache statistics due to accesses issued by dom0, domU and the hypervisor itself, e.g. by "binning" the number of accesses and hits/misses from dom0, domU and the hypervisor.
To accomplish this, you will need to extend Simics "hap" handlers, which are Python scripts triggered on events of interest (e.g. domain switches, or CPU privilege-mode switches). We already used one such hap for domain switches in a previous assignment; under the "Hints" section below you will find a template Python script for catching changes in the CPU privilege mode (which will help you gather statistics for the hypervisor itself).
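One possible way to organize the binning is sketched below. It is plain Python with no Simics calls: the class name `DomainBinner`, the owner labels and the counter names are assumptions made for illustration. The idea is that each hap handler passes in the current cumulative counter values (however you read them from your cache/TLB models), and the difference since the previous switch is charged to whoever was running.

```python
from collections import defaultdict


class DomainBinner:
    """Attribute cumulative counter growth to the owner that was running."""

    def __init__(self, initial_owner, initial_counters):
        self.owner = initial_owner
        self.snapshot = dict(initial_counters)
        # bins[owner][counter_name] -> events charged to that owner
        self.bins = defaultdict(lambda: defaultdict(int))

    def switch(self, new_owner, counters):
        """Call on every domain or privilege switch with current totals."""
        for name, value in counters.items():
            self.bins[self.owner][name] += value - self.snapshot.get(name, 0)
        self.snapshot = dict(counters)
        self.owner = new_owner

    def miss_rate(self, owner, miss_key, access_key):
        accesses = self.bins[owner][access_key]
        return self.bins[owner][miss_key] / float(accesses) if accesses else 0.0


if __name__ == "__main__":
    binner = DomainBinner("dom0", {"accesses": 0, "misses": 0})
    binner.switch("hypervisor", {"accesses": 100, "misses": 10})
    binner.switch("domU", {"accesses": 150, "misses": 30})
    binner.switch("dom0", {"accesses": 350, "misses": 40})
    for owner in ("dom0", "hypervisor", "domU"):
        print(owner, dict(binner.bins[owner]))
```

Keeping the accounting separate from the hap callbacks makes it possible to test the logic outside the simulator before submitting long runs.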
Useful hints
- Develop and test locally in your own appliance with the Simics interactive interface. Then try running a job in batch mode in your own appliance. Only then start submitting jobs to Condor, so that you are sure things are running properly.
- Be careful not to generate extremely verbose output from your simulation. When you submit a job for remote execution, the output files need to be transferred back to you. If they are too large, and if you have many jobs, it can take a long time to transfer them back.
- If you have large outputs, consider compressing them before sending them back. You can add a gzip statement at the end of the wrapper script; make sure you also change the Condor config file to transfer the gzipped file (screen_dump.out.gz).
- START EARLY
- To configure the Xen sEDF scheduler, you can use the following commands in dom0:
xm sched-sedf    # This displays the current scheduler settings
xm sched-sedf <dom_number> -p <period> -s <slice of that period reserved> -e 1
Period and slice are in milliseconds unless mentioned otherwise.
The extra flag (-e) specifies whether it is OK (1) or not OK (0) to allocate any extra time to this domain. For instance, suppose both dom0 and domU have equal slices (10 ms of a 20 ms period), but domU is not running anything. If the extra flag is set for dom0, dom0 will run for all 20 ms; otherwise it will run for only 10 ms.
- To add a Simics hook to catch changes in CPU mode, the following Python script can be used:
# HAP to catch core mode change
def hap_core_mode_change(user_arg, trigger_obj, old_mode, new_mode):
    cpu = trigger_obj.processor_number
    cycle = SIM_cycle_count(trigger_obj)
    if old_mode == Sim_CPU_Mode_Supervisor and new_mode == Sim_CPU_Mode_User:
        value = 'CPU %d switching from Supervisor to User at %d' % (cpu, cycle)
    elif old_mode == Sim_CPU_Mode_User and new_mode == Sim_CPU_Mode_Supervisor:
        value = 'CPU %d switching from User to Supervisor at %d' % (cpu, cycle)
    else:
        # Any other transition is unexpected; report both modes
        value = 'Error CPU %d old_mode %d new_mode %d at %d' % (cpu, old_mode, new_mode, cycle)
    print(value)

# Register the handler so that it runs on every CPU mode change
SIM_hap_add_callback("Core_Mode_Change", hap_core_mode_change, None)
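Printing a line on every mode change can produce very large output files. As an alternative, the handler could accumulate a summary and print it once at the end. The sketch below shows one way to do that; the class `ModeTimeAccumulator` is a hypothetical helper written in plain Python so it can be tested outside Simics, and the wiring into the hap callback appears only as a comment that reuses the calls from the template above.

```python
class ModeTimeAccumulator:
    """Sum the cycles each CPU spends in each mode between mode changes."""

    def __init__(self):
        self.cycles = {}       # mode -> total cycles spent in that mode
        self.last_switch = {}  # cpu number -> cycle count at its last switch

    def record(self, cpu, old_mode, cycle_now):
        """Charge the time since this CPU's previous switch to old_mode."""
        start = self.last_switch.get(cpu)
        if start is not None:
            self.cycles[old_mode] = (
                self.cycles.get(old_mode, 0) + (cycle_now - start))
        # The first event for a CPU only sets the reference point.
        self.last_switch[cpu] = cycle_now


# Inside Simics, the template callback could call (not executed here):
#   acc.record(trigger_obj.processor_number, old_mode,
#              SIM_cycle_count(trigger_obj))

if __name__ == "__main__":
    acc = ModeTimeAccumulator()
    # Made-up events for one CPU: (mode being left, cycle count)
    for mode, cycle in [("user", 100), ("supervisor", 250), ("user", 400)]:
        acc.record(0, mode, cycle)
    print(acc.cycles)
```

The time before the first observed switch on each CPU is not attributed to any mode, which is a simplification you may want to revisit depending on where your checkpoint starts.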

