Summary:
Virtual machines (VMs) provide greater flexibility and protection but require the ability to run one operating system (OS) under the control of another. In the absence of virtualization hardware, VMs are typically built by porting the OS to run in user mode, by using a special kernel-level environment, or by running the OS under a system-level simulator. ``Partial Emulation'' or a ``Lightweight Virtual Machine'' is an augmentation-based approach to system-level simulation: directly execute most instructions, and statically rewrite and virtualize those instructions which are ``tricky'' due to running in a VM environment. Compared to the other approaches, partial emulation requires fewer OS modifications than user-mode execution (user-mode Linux requires a machine description around 33,000 lines) and offers higher performance than a full (all instructions) simulator (Bochs is about 10x slower than native execution).
The implementation described here emulates all privileged instructions and some non-privileged instructions. One approach replaces each ``interesting'' instruction with an illegal instruction trap. A second approach replaces it with a call to an emulation subroutine. ``Rewriting'' is done during compilation, and the current implementation requires OS source code [EY 03].
The approach here must: detect and emulate privileged and some non-privileged instructions; redirect system calls and page faults to the user-level OS; emulate an MMU; emulate devices.
The implementation with illegal instruction traps uses a companion process and debugger-type accesses to simulate interesting instructions. Otherwise, the user-level OS and its processes are executed in a single host process. The ``illegal instruction trap'' approach inserts an illegal instruction before each ``interesting'' instruction. The companion process then skips the illegal instruction, simulates the ``interesting'' instruction, then restarts the process. This implementation is about 1,500 lines of C code. The ``procedure call'' approach is about 1,400 lines but is faster. There are still out-of-process traps due to, e.g., MMU emulation (a la SimOS).
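The skip, simulate, restart cycle can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's code: the real companion uses debugger-type accesses on a separate host process, whereas here a Python exception stands in for the illegal-instruction trap, and the tuple-based program format, the `add` instruction, and all function names are hypothetical.

```python
class IllegalInstruction(Exception):
    """Stands in for the host's illegal-instruction trap (assumption)."""


def run_directly(program, pc, regs):
    """Toy guest: execute ordinary instructions until a trap or the end."""
    while pc < len(program):
        op, arg = program[pc]
        if op == "ud2":
            # The marker inserted before an "interesting" instruction.
            raise IllegalInstruction(pc)
        if op == "add":
            regs["acc"] += arg
        else:
            raise ValueError(f"unguarded instruction reached: {op}")
        pc += 1
    return pc


def companion(program, regs, vcpu):
    """Toy companion: on each trap, skip the marker, emulate, resume."""
    pc, traps = 0, 0
    while pc < len(program):
        try:
            pc = run_directly(program, pc, regs)
        except IllegalInstruction as trap:
            traps += 1
            pc = trap.args[0] + 1      # skip the illegal instruction
            op, _ = program[pc]        # the "interesting" instruction behind it
            if op == "cli":
                vcpu["interrupts_enabled"] = False
            elif op == "sti":
                vcpu["interrupts_enabled"] = True
            pc += 1                    # restart after the emulated instruction
    return traps
```

Every ``interesting'' instruction costs one round trip through the companion in this model, which matches the intuition for why the trap-based variant is the slower of the two.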
For IA-32, the ``interesting'' instructions are mov, push, and pop instructions that manipulate segment registers; call, jmp, and ret instructions that cross segment boundaries; iret; instructions that manipulate special registers; and instructions that read and write (privileged bits of) the flag register.
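As a rough illustration of how a rewriting pass might recognize these instructions, here is a minimal sketch that works on AT&T-syntax assembly text. The choice of text-level filtering, the exact mnemonic sets (for example treating `lgdt`, `cli`, and `pushf` as members of the listed categories), the use of `ud2` as the illegal instruction, and the `emulate_insn` subroutine name are all assumptions made for the example, not details taken from the paper.

```python
import re

# Segment register used as an operand, not as a "%fs:" override prefix.
SEGMENT_REG = re.compile(r"%(cs|ds|es|fs|gs|ss)\b(?!:)")
# Control and debug registers (assumed reading of "special registers").
SPECIAL_REG = re.compile(r"%(cr[0-4]|dr[0-7])\b")
FAR_TRANSFER = {"lcall", "ljmp", "lret"}
DESCRIPTOR_OPS = {"lgdt", "lidt", "lldt", "ltr", "sgdt", "sidt", "sldt", "str"}
FLAG_OPS = {"pushf", "pushfl", "popf", "popfl", "cli", "sti"}


def is_interesting(line):
    """Return True if an assembly line needs emulation in this sketch."""
    code = line.split("#", 1)[0].strip()
    if not code or code.endswith(":") or code.startswith("."):
        return False  # blank line, label, or assembler directive
    parts = code.split(None, 1)
    mnemonic = parts[0].lower()
    operands = parts[1] if len(parts) > 1 else ""
    if mnemonic in FAR_TRANSFER | DESCRIPTOR_OPS | FLAG_OPS:
        return True
    if mnemonic.startswith("iret"):
        return True
    if mnemonic.startswith("mov"):
        return bool(SEGMENT_REG.search(operands) or SPECIAL_REG.search(operands))
    if mnemonic.startswith(("push", "pop")):
        return bool(SEGMENT_REG.search(operands))
    return False


def rewrite(lines, mode="trap"):
    """Guard (mode="trap") or replace (mode="call") interesting instructions."""
    out = []
    for line in lines:
        if not is_interesting(line):
            out.append(line)
        elif mode == "trap":
            out.append("\tud2")  # illegal instruction placed before the original
            out.append(line)
        else:
            # Hypothetical subroutine name; operand passing is omitted.
            out.append("\tcall emulate_insn\t# was: " + line.strip())
    return out
```

A real pass would also have to handle labels and instructions that share a line, operand passing to the emulation subroutines, and hand-written assembly in the kernel source.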
Not all host OSs have the right facilities to implement a partial emulator.
Some target OS changes were needed. For NetBSD, six address constants were changed to avoid host OS conflicts, and device drivers were removed. For FreeBSD, BIOS calls were also replaced with code that returns the needed values; had the authors tried to run the BIOS itself, the system would have needed to execute in virtual 8086 mode.
User-level execution speed was similar to native. For OS-intensive microbenchmarks, the ``illegal instruction trap'' implementation was at least 100x slower than native (non-virtual) execution and slower than Bochs. The ``procedure call'' approach was 3-5x faster, but still a little slower than Bochs and 10x slower than VMware, which was in turn 4x-10x slower than native. A test benchmark (patch) was 15x slower using illegal instruction traps and about 5x slower using procedure calls. For comparison, VMware was about 1.1x slower.
The paper proposes using a separate host process for each page table base register value in order to reduce overhead for MMU emulation.
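One way to picture this proposal is a cache keyed by the page table base value, so that switching address spaces becomes a lookup rather than a rebuild of the emulated mappings. The sketch below is illustrative only: the class name, the `spawn` callback, and the use of plain objects in place of real host processes are assumptions, and the paper's proposed design may differ.

```python
class AddressSpaceCache:
    """Toy model: one host process per page table base register value."""

    def __init__(self, spawn):
        self._spawn = spawn        # hypothetical factory for a host process
        self._by_base = {}         # page table base value -> host process
        self.current = None

    def switch_to(self, page_table_base):
        """Called when the guest loads a new page table base value."""
        proc = self._by_base.get(page_table_base)
        if proc is None:
            # First time this address space is seen: create its process once.
            proc = self._spawn(page_table_base)
            self._by_base[page_table_base] = proc
        self.current = proc
        return proc
```

In this model, returning to a previously used base value reuses the existing host process, so its mappings would not need to be set up again. Invalidation when the guest edits its page tables is not modeled.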
Categories:
Further reading: ``Running BSD Kernels as User Processes by Partial Emulation and Rewriting of Machine Instructions'' [EY 03].