Longer Writeups

Longer writeups and cross-references. Some of the tools here have bibliographic entries, home pages or online papers, noted with ``See: ...''. Many are also described and referenced in the 1994 SIGMETRICS Shade paper, noted with ``See: Shade''.

See here for a list of tools.

Accelerator

???

See:

Atari Emulators

???

The listed tools include:

See:

Apple II Emulators

???

The listed tools include Apple II emulators:

See:

Apple Macintosh Emulators

???

The listed tools include Macintosh emulators:

See:

ATOM

???

ATOM is built on top of OM.

See:

ATUM

???

See:

BEaT (Binary Emulation and Translation)

???

See:

Cerberus

???

See: bib cite, Shade

As of 1994, Cerberus was being actively used and updated by <csa@transmeta.com>, who might be willing to provide information and/or code.

Commodore Emulators

???

See:

Amiga

PET VIC20

See:

CRISP

???

See:

Crusoe

Crusoe is an x86 emulator. It both interprets x86 instructions and translates x86 instructions to a host VLIW instruction set; translations are cached for reuse. The host instruction set is not exported; only target instructions may be executed. A demonstration Crusoe executed both x86 and Java instructions.

Categories:

See:

Cygnus

???

See:

dcc

???

A prototype/research vehicle for decompiling DOS EXE binary files. It uses digital signatures to determine library function calls and the original compiler.

See:

DEC PDP-8 Simulators

???

See:

DEC PDP-11 Simulators

???

See:

Decomp

???

See:

dis+mod+run

???

See:

Dynascope

???

See:

Dynascope-II

???

See:

EDSAC Debug

The EDSAC Debugger uses a tracing simulator that operates by: fetching the simulated instruction; decoding it to save trace information; checking to see if the instruction is a branch, and updating the simulated program counter if it is; else placing the instruction in the middle of the simulator loop and executing it directly; and then returning to the top of the simulator loop.
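The loop described above can be sketched as follows. This is only an illustration in Python, not the EDSAC code: the tuple instruction format, the opcode names, and the DIRECT table (standing in for ``executing the instruction directly'') are all assumptions.

```python
# Hypothetical toy machine: each instruction is an (opcode, argument) tuple.
# DIRECT stands in for "place the instruction in the loop and execute it directly".
DIRECT = {
    "add": lambda acc, n: acc + n,
    "sub": lambda acc, n: acc - n,
}

def run_traced(program, acc=0, max_steps=1000):
    """Tracing simulator loop: fetch, record, handle branches, else execute."""
    pc = 0
    trace = []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, arg = program[pc]            # fetch the simulated instruction
        trace.append((pc, op, arg))      # decode it far enough to save trace information
        if op == "jmp":                  # branch: update the simulated program counter
            pc = arg
        elif op == "jneg":               # conditional branch on a negative accumulator
            pc = arg if acc < 0 else pc + 1
        elif op == "halt":
            break
        else:                            # not a branch: execute it directly
            acc = DIRECT[op](acc, arg)
            pc += 1
    return acc, trace
```

Only branches are simulated in full; everything else is handed to the host, which is what keeps the loop short.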

As an aside, the 1951 paper on the EDSAC debugger contains a pretty complete description of a modern debugger...

Categories:

See:

EEL

EEL reads object files and executables and allows tools built on top of EEL to modify the machine code without needing details of the underlying architecture or operating system and without being concerned with the consequences of adding or deleting code.

EEL appears as a C++ class. EEL is provided with an executable, which it analyzes, creating abstractions such as executable (whole program), routines, CFGs, instructions and snippets. A tool built on EEL then edits the executable by performing structured rewrites of the EEL constructs; EEL ensures that details of register allocation, branches, etc. are updated correctly in the final code.

Categories:

See:

Executor

???

See:

FAST

???

See:

FlashPort

???

See:

FLEX-ES

FLEX-ES (formerly OPEN/370) provides a System/390 on a Pentium. It includes system-mode operation and runs 8 popular S/370 OS's. On a 2-processor Pentium-II/400MHz, it provides 7 to 8 MIPS on one processor and I/O functions on the other processor. They also sell installed systems (hardware/software turnkey systems).

Categories:

FLEX-ES home page.

FreePort Express

??? FreePort Express is a tool for converting Sun SPARC binaries to DEC Alpha AXP binaries.

See: FreePort Express web page

g88

g88 is a portable simulator that simulates both user and system-mode code. It uses threaded code to achieve performance on the order of a few tens of instructions per simulated instruction.
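As a rough illustration of the threaded-code idea (not g88's actual implementation), the Python sketch below predecodes a program once so that each slot holds a direct reference to its handler; the main loop then dispatches without decoding opcodes again. The stack machine, the opcode names, and the handler names are all hypothetical.

```python
def op_push(state, arg):
    state["stack"].append(arg)
    return state["pc"] + 1

def op_add(state, arg):
    b = state["stack"].pop()
    a = state["stack"].pop()
    state["stack"].append(a + b)
    return state["pc"] + 1

def op_jump_if_zero(state, arg):
    return arg if state["stack"].pop() == 0 else state["pc"] + 1

def op_halt(state, arg):
    return -1                                   # negative pc stops the loop

HANDLERS = {"push": op_push, "add": op_add, "jz": op_jump_if_zero, "halt": op_halt}

def thread(program):
    """Predecode once: replace each opcode name with a reference to its handler."""
    return [(HANDLERS[op], arg) for op, arg in program]

def run(threaded):
    state = {"pc": 0, "stack": []}
    while state["pc"] >= 0:
        handler, arg = threaded[state["pc"]]    # no opcode decoding in the hot loop
        state["pc"] = handler(state, arg)
    return state["stack"]
```

For example, run(thread([("push", 2), ("push", 3), ("add", None), ("halt", None)])) leaves the single value 5 on the stack.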

See:

g88 was written by Robert Bedichek.

GNU Simulators

???

See:

Hiprof

???

Built on top of OM.

See:

IDtrace

???

See:

IMS

???

See:

The Interpreter

???

``The Interpreter'' is a micro-architecture that is intended for a variety of uses including emulation of existing or hypothetical machines and program profiling. An emulator is written in microcode, and the microinstructions that are executed from the microstore give both parallelism and fast execution.

Categories:

More detailed review:

See:

This review/summary by Pardo.

Kx10

???

See:

Mable

???

See:

Migrant

???

See:

Mime

???

See:

Mimic

???

See:

MINT

???

See:

Moxie

???

See:

MPtrace

MPtrace statically augments parallel programs written for the i386-based Sequent Symmetry multiprocessor. The instrumented programs are then run to generate multiprocessor address traces.

See:

MPtrace was written by David Keppel and Eric J. Koldinger under the supervision of Susan J. Eggers and Henry M. Levy.

MSX

???

Emulators:

See:

New Jersey Machine Code Toolkit (NJMCT)

The New Jersey Machine Code Toolkit lets programmers decode and encode machine instructions symbolically, guided by machine specifications that describe mappings between symbolic and machine (binary) forms. It thus helps programmers write applications such as assemblers, disassemblers, linkers, run-time code generators, tracing tools, and other tools that consume or produce machine code.

Questions and comments can be sent to `toolkit@cs.princeton.edu'.

See:

OM

???

See:

Partial Emulation

Summary:

Virtual machines (VMs) provide greater flexibility and protection but require the ability to run one operating system (OS) under the control of another. In the absence of virtualization hardware, VMs are typically built by porting the OS to run in user mode, using a special kernel-level environment or as a system-level simulator. ``Partial Emulation'' or a ``Lightweight Virtual Machine'' is an augmentation-based approach to system-level simulation: directly execute most instructions, statically rewrite and virtualize those instructions which are ``tricky'' due to running in a VM environment. Compared to the other approaches, partial emulation offers fewer OS modifications than user-mode execution (user-mode Linux requires a machine description around 33,000 lines) and higher performance than a full (all instructions) simulator (Bochs is about 10x slower than native execution).

The implementation described here emulates all privileged instructions and some non-privileged instructions. One approach replaces each ``interesting'' instruction with illegal instruction traps. A second approach is to call emulation subroutines. ``Rewriting'' is done during compilation, and the current implementation requires OS source code [EY 03].

The approach here must: detect and emulate privileged and some non-privileged instructions; redirect system calls and page faults to the user-level OS; emulate an MMU; emulate devices.

The implementation with illegal instruction traps uses a companion process and debugger-type accesses to simulate interesting instructions. Otherwise, the user-level OS and its processes are executed in a single host process. The ``illegal instruction trap'' approach inserts an illegal instruction before each ``interesting'' instruction. The companion process then skips the illegal instruction, simulates the ``interesting'' instruction, then restarts the process. It is about 1,500 lines of C code. The ``procedure call'' approach is about 1,400 lines but is faster. There are still out-of-process traps due to, e.g., MMU emulation (à la SimOS).

For IA-32, the ``interesting'' instructions are mov, push, and pop instructions that manipulate segment registers; call, jmp, and ret instructions that cross segment boundaries; iret; instructions that manipulate special registers; and instructions that read and write (privileged bits of) the flag register.
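The ``illegal instruction trap'' scheme can be sketched as a toy in Python. This is not the code from [EY 03]: instructions are plain strings, the INTERESTING set is a small hypothetical stand-in for the IA-32 list above, and the TRAP marker plays the role of the inserted illegal instruction.

```python
INTERESTING = {"iret", "cli", "sti"}   # hypothetical stand-ins for the "interesting" set
TRAP = "ud"                            # hypothetical marker acting as the illegal instruction

def rewrite(instructions):
    """Static pass: insert a trap marker before every "interesting" instruction."""
    out = []
    for insn in instructions:
        if insn.split()[0] in INTERESTING:
            out.append(TRAP)
        out.append(insn)
    return out

def run(rewritten):
    """Walk the rewritten stream; on a trap the "companion" emulates the next instruction."""
    direct, emulated = [], []
    pc = 0
    while pc < len(rewritten):
        if rewritten[pc] == TRAP:
            emulated.append(rewritten[pc + 1])  # skip the marker, simulate the instruction
            pc += 2                             # resume after the emulated instruction
        else:
            direct.append(rewritten[pc])        # everything else runs directly
            pc += 1
    return direct, emulated
```

The ``procedure call'' variant would replace the marker with a call to an emulation routine, avoiding the round trip through the companion process.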

Not all host OSs have the right facilities to implement a partial emulator.

Some target OS changes were needed. For NetBSD, six address constants were changed to avoid host OS conflicts, and device drivers were removed. For FreeBSD, they also replaced BIOS calls with code that returned the needed values; had they tried to run the BIOS, the system would have needed to execute virtual 8086 mode.

User-level execution speed was similar to native. For OS-intensive microbenchmarks, the ``illegal instruction trap'' implementation was at least 100x slower than native (non-virtual) execution and slower than Bochs. The ``procedure call'' approach was 3-5x faster, but still a little slower than Bochs and 10x slower than VMware, which was in turn 4x-10x slower than native. A test benchmark (patch) was 15x slower using illegal instruction traps and about 5x slower using procedure calls. For comparison, VMware was about 1.1x slower.

The paper proposes using a separate host process for each page table base register value in order to reduce overhead for MMU emulation.

Categories:

Further reading: ``Running BSD Kernels as User Processes by Partial Emulation and Rewriting of Machine Instructions'' [EY 03].

Pixie

???

See:

Pixie-II

???

See:

Proteus

???

See:

Purify

???

See:

qp/qpt

???

See:

RPPT

???

See:

RSIM

???

Simulates pipeline-level parallelism and memory system behavior.

See:

SELF

???

See:

Shade

Shade combines efficient instruction-set simulation with a flexible, extensible trace generation capability. Efficiency is achieved by dynamically compiling and caching code to simulate and trace the application program; the cost is as low as two instructions per simulated instruction. The user may control the extent of tracing in various ways; arbitrarily detailed application state information may be collected during the simulation, but tracing less translates directly into greater efficiency. Current Shade implementations run on SPARC systems and simulate the SPARC (Versions 8 and 9) and MIPS I instruction sets.
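The compile-and-cache idea can be illustrated with a toy in Python. This is not Shade's code: the three-opcode instruction set, the translate and run names, and the use of generated Python source as the ``host code'' are all assumptions made for the sketch. Each straight-line run is translated once, cached by its starting address, and reused; tracing code is emitted only when requested, so tracing less means less generated work.

```python
def translate(program, start, trace_on):
    """Generate and compile host code for the straight-line run starting at `start`."""
    lines = ["def block(state, trace):"]
    pc = start
    while True:
        op, arg = program[pc]
        if trace_on:
            lines.append(f"    trace.append({pc})")     # tracing code only if requested
        if op == "add":
            lines.append(f"    state['acc'] += {arg}")
        elif op == "jlt":                               # branch if acc < limit
            limit, target = arg
            lines.append(f"    return {target} if state['acc'] < {limit} else {pc + 1}")
            break
        elif op == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    namespace = {}
    exec("\n".join(lines), namespace)                   # "dynamic compilation" of the block
    return namespace["block"]

def run(program, trace_on=False):
    cache, state, trace = {}, {"acc": 0}, []
    translations = 0
    pc = 0
    while pc is not None:
        if pc not in cache:                             # translate on first use only
            cache[pc] = translate(program, pc, trace_on)
            translations += 1
        pc = cache[pc](state, trace)                    # run cached code, get next pc
    return state["acc"], trace, translations
```

In a loop that iterates many times, the loop body is translated once and every later iteration pays only for the cached code.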

See:

Shade was written by Bob Cmelik, with help from David Keppel.

SimICS

SimICS is a multiprocessor simulator. SimICS simulates both the user and system modes of 88000 and SPARC processors and is used for simulation, debugging, and prototyping.

See:

SimICS should soon be available under license. Contact Peter Magnusson.

SimICS is a rewrite of gsim, which, in turn, was derived from g88. SimICS was written by Peter Magnusson, David Samuelsson, Bengt Werner and Henrik Forsberg.

Sinclair ZX Spectrum Emulators

???

See:

Shadow

???

See:

Simon

???

See:

SimOS

SimOS emulates both user-mode and system-mode code for a MIPS-based multiprocessor. It uses a combination of direct-execution (some OS rewrites may be required) and dynamic cross-compilation (no rewrites needed) in order to emulate and, to some degree, instrument.

Categories:

See:

Sleipnir

Sleipnir is an instruction-level simulator generator in the style of yacc. The configuration file is extended C, with special constructs to describe bit-level encodings and common code and support for generation of a threaded-code simulator.

For example, 0b_10ii0sss_s0iidddd specifies a 16-bit pattern with constant values which must match and named ``don't care'' fields i (split over two locations), s, and d. Sleipnir combines the various patterns to create an instruction decoder. Named fields are substituted in action rules for an instruction. For example, add 0b_10ii0sss_s0iidddd { GP(reg[$d]) = GP(reg[$s]) + $^c }. Here, ^ indicates sign-extension. Threaded-code dispatch is implied.
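To make the pattern notation concrete, here is a small Python sketch of how such a pattern could be turned into a mask/value test plus field extractors. It is only an illustration of the notation, not Sleipnir's generated code; compile_pattern, decode, and sign_extend are hypothetical names.

```python
def compile_pattern(pattern):
    """Turn a pattern such as '0b_10ii0sss_s0iidddd' into (mask, value, fields)."""
    bits = pattern[2:] if pattern.startswith("0b") else pattern
    bits = bits.replace("_", "")
    width = len(bits)
    mask = value = 0
    fields = {}
    for index, ch in enumerate(bits):
        position = width - 1 - index             # leftmost character is the top bit
        if ch in "01":                           # constant bit: must match exactly
            mask |= 1 << position
            value |= int(ch) << position
        else:                                    # named field bit, possibly split up
            fields.setdefault(ch, []).append(position)
    return mask, value, fields

def decode(word, compiled):
    """Return the named field values if `word` matches the pattern, else None."""
    mask, value, fields = compiled
    if (word & mask) != value:
        return None
    result = {}
    for name, positions in fields.items():
        v = 0
        for p in positions:                      # gather bits from most to least significant
            v = (v << 1) | ((word >> p) & 1)
        result[name] = v
    return result

def sign_extend(v, width):
    """Interpret a `width`-bit field as a two's-complement signed value."""
    return v - (1 << width) if v & (1 << (width - 1)) else v
```

Note how the split field i is reassembled from its two locations before any sign extension is applied.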

For simple machines, Sleipnir can generate cycle-accurate simulators. For more complex machines, it generates ISA-level simulators. Threaded-code simulators are typically weak at VLIW simulation and machines with some kinds of exposed latencies. Threaded-code simulators typically simulate one instruction entirely before starting the next, but with VLIW and exposed latencies, the effects of a single instruction are spread over the execution of several instructions. Sleipnir supports some kinds of exposed latencies by running an after() function after each instruction. Simulator code that creates values writes them into buffers, and code in after() can copy the values as needed to memory, the PC, and so on.
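The buffering scheme can be sketched in Python as follows. Only the name after() comes from the description above; the Machine class, the load_delayed instruction, and the one-instruction delay are assumptions chosen to show a single exposed-latency case.

```python
class Machine:
    def __init__(self):
        self.regs = [0] * 4
        self.pending = []                 # buffered writes: [instructions_left, reg, value]

    def load_delayed(self, reg, value, delay=1):
        """Instruction body: produce a value but only buffer the register write."""
        self.pending.append([delay, reg, value])

    def after(self):
        """Run after every instruction: commit buffered writes whose latency has elapsed."""
        still_waiting = []
        for entry in self.pending:
            if entry[0] == 0:
                self.regs[entry[1]] = entry[2]
            else:
                entry[0] -= 1
                still_waiting.append(entry)
        self.pending = still_waiting

    def step(self, instruction):
        instruction(self)                 # simulate one instruction entirely...
        self.after()                      # ...then apply any effects that are now due
```

With delay=1, the instruction immediately following the load still sees the old register value, and the one after that sees the new value.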

Reported machine description sizes, speeds, and level of accuracy include the following. ``Speed'' is based on a 250 MHz MIPS R10000-based machine.

Architecture       MD lines   Sim. speed   Accuracy
MIPS-I (integer)        700   5.1 MIPS     ISA
M*Core                  970   6.4 MIPS     Cycle
ARM/Thumb             2,812   3.6 MIPS     ISA
TI C6201              5,231   3.4 MIPS     Cycle
Lucent DSP1600        3,903   3.7 MIPS     Cycle

In Norse mythology, ``Sleipnir'' is an eight-legged horse that could travel over land and sea and through the air.

See:

SoftPC

SoftPC is an 8086/80286 emulator which runs on a variety of host machines. The first version implemented an 8086 processor core using an interpreter. It provided device emulators for EGA/VGA and Hercules graphics, hard disks, floppies, and an interrupt controller.

In about 1986, Steve Chamberlain developed a dynamic cross-compiler for the Sun 3/260. The basic emulation structure is an array of bytes for simulated memory and an ``action'' array, which is a same-size array of bytes. There are then three arrays R, W, and X for reads, writes, and execution; each is subscripted by the ``action'' byte and contains a pointer to the corresponding read, write, or execute action. For example, a read of location 17 is implemented by reading a = action[17], then branching to R[a]. Similarly, executing location 17 is implemented by reading a = action[17], then branching to X[a]. The default action is that each instruction is interpreted.

Each branch invokes the translator. The translator (dynamic cross-compiler) generates a translation that starts at the last branch and goes through the current branch. SoftPC then records the current branch target, which is the starting place for the next branch's translation. SoftPC ``installs'' the translation by allocating a byte subscript a, then it fills in the action table with the value a and sets R[a] to act as a normal read; W[a] to invalidate the corresponding translation; and X[a] to point to the new translation. For each byte ``covered'' by the translation, the action table is set to a byte value that will invalidate the translation. For each translation, SoftPC also sets a back-pointer in a 256-entry table so that when a particular translation is being invalidated it is easy to find the location in the ``action'' table which currently uses that translation.
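The action-table structure can be sketched in Python as below. It is a simplified illustration, not SoftPC's code: the helper names, the 64-byte memory, and the back table layout are assumptions. One deliberate simplification is that every covered byte shares the translation's action byte, and X[a] falls back to interpretation unless the address is the translation's start; the description above instead uses a separate invalidating byte value for covered bytes.

```python
MEM_SIZE = 64
memory = bytearray(MEM_SIZE)          # simulated memory
action = bytearray(MEM_SIZE)          # same-size array; 0 is the default action
log = []                              # records what "ran", for illustration only

def normal_read(addr):
    return memory[addr]

def normal_write(addr, value):
    memory[addr] = value

def interpret(addr):
    log.append(("interpret", addr))

R = {0: normal_read}                  # tables subscripted by the action byte
W = {0: normal_write}
X = {0: interpret}
back = {}                             # back-pointers: action byte -> (start, length)

def install(start, length, translation):
    """Allocate an action byte for a translation covering [start, start + length)."""
    a = next(n for n in range(1, 255) if n not in back)   # fails if none are free
    back[a] = (start, length)
    R[a] = normal_read
    def invalidating_write(addr, value, a=a):
        s, n = back.pop(a)            # use the back-pointer to find the covered bytes
        for i in range(s, s + n):
            action[i] = 0             # revert to the default (interpreted) action
        memory[addr] = value
    W[a] = invalidating_write
    X[a] = lambda addr, a=a: translation() if addr == back[a][0] else interpret(addr)
    for i in range(start, start + length):
        action[i] = a
    return a

def read(addr):
    return R[action[addr]](addr)

def write(addr, value):
    W[action[addr]](addr, value)

def execute(addr):
    X[action[addr]](addr)
```

A write into any covered byte therefore discards the translation, and the next execution of that location is interpreted again until a new translation is installed.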

There are thus a maximum of 256 translations at any time (actually 254 due to reserved byte values). The simulated system had up to 1MB RAM. In about 1988 Henry ??? extended the system to use the low bit of the address as part of the subscript, in order to expand the table to 512 translations. This was used in the first Apple Macintosh target of SoftPC.

SoftPC emulates many devices, including EGA, VGA, and Hercules video; disks, including floppies and hard disks; the interrupt controller; and so on. In about 1987, Steve Chamberlain implemented an 8087 (FP coprocessor) that was not a faithful 8087 (e.g., did not provide full 80-bit FP) but which provided sufficient accuracy to run common applications.

Categories:

See:

Spa

???

See:

SPIM

???

See:

Spix

???

See:

ST-80

???

See:

STonX

???

An Atari ST emulator that runs on (at least) a Sun SPARC IPC under SunOS 4.1; it emulates an MC68000, RAM, ROM, Atari ST graphics, keyboard, BIOS, clock and maybe some other stuff. On a SPECint=13.8 machine it runs on average at half the speed of a real ST.

See:

By: Marinos "nino" Yannikos.

T2

T2 is a SPARCle/Fugu simulator that is implemented by dynamically cross-compiling SPARCle code to SPARC code. It simulates both user and system mode code and was used for doing program development before the arrival of SPARCle hardware.

The name T2 is short for ``Talisman-2''. Note that, despite the similarity in names, Talisman and T2 share little in implementation or core features: the former uses a threaded code implementation and provides timing simulation of an m88k, while the latter uses dynamic cross-compilation and provides fast simulation of a SPARCle.

Tango Lite

???

See:

Talisman

Talisman is a fast timing-accurate simulator for an 88000-based multiple-processor machine. Talisman provides both user-mode and system mode simulation and can boot OS kernels. Simulation is reasonably fast, on the order of a hundred instructions per simulated instruction. Talisman also does low-level timing simulation and typically produces estimated running times that are within a few percent of running times on real hardware. Note that e.g. turning off dynamic RAM refresh simulation makes the timing accuracy substantially worse!

See:

Tapeworm II

???

See:

Third Degree

???

Built on top of OM.

See:

Titan tracing

???

See:

TRAPEDS

???

See:

VAX-11 RSX Emulator

???

See:

Vest and mx

???

See:

Windows x86

???

According to a Microsoft information release, "Windows x86" is a user-space x86 emulator with an OS interface to 32-bit Microsoft Windows (tm).

Windows on Windows (WOW)

???

According to a Microsoft information release, "Windows on Windows" is a user-space x86 emulator with an interface to 16-bit Microsoft Windows (tm).

Wabi

???

See:

Wine

???

Wine is a Microsoft Windows(tm) OS emulator for i*86 systems. Most of the application's code runs native, but calls to ``OS'' functions are transformed into calls into Unix/X. Some programs require enhanced mode device drivers and will (probably) never run under Wine. Wine is neither a processor emulator nor a tracing tool.

See:

WWT

???

See:

xtrs

???

See:

Z-80 Simulators

Z80MU

???

See:

8051 Emulators

???

    Simulators
        - 2500 A.D.
        - Avocet Systems
          (also compilers and assemblers).
        - ChipTools
             on a 33 MHz 486 matches the speed of a 12 MHz 8051
        - Cybernetic Micro Systems
        - Dunfield Development Systems
             Low cost $50.00
             500,000+ instructions/second on 486/33
             Can interface to target system for physical I/O
             Includes PC hosted "on chip" debugger with identical user
                interface
        - HiTech Equipment Corp.
        - Iota Systems, Inc.
        - J & M Microtek, Inc.
        - Keil Electronics
        - Lear Com Company
        - Mandeno Granville Electronics, Ltd
        - Micro Computer Control Corporation
             Simulator/source code debugger ($79.95)
        - Microtek Research
        - Production Languages Corp.
        - PseudoCorp

    Emulators ($$$ - high, $$ - medium, $ - low priced)
        - Advanced Micro Solutions  $$
        - Advanced Microcomputer Systems, Inc.  $
        - American Automation  $$$  $$
        - Applied Microsystems  $$
        - ChipTools (front end for Nohau's emulator)
        - Cybernetic Micro Systems  $
        - Dunfield Development Systems $
             plans for pseudo-ice using Dallas DS5000/DS2250
             used together with their resident monitor and host debugger
        - HBI Limited  $
        - Hewlett-Packard  $$$
        - HiTech Equipment Corp.
        - Huntsville Microsystems  $$
        - Intel Corporation  $$$
        - Kontron Electronics  $$$
        - Mandeno Granville Electronics, Ltd
             full line covering everything from the Atmel flash to the
                Siemens powerhouse 80c517a
        - MetaLink Corporation  $$  $
        - Nohau Corporation  $$
        - Orion Instruments  $$$
        - Philips $
             DS-750 pseudo-ICE developed by Philips and CEIBO
             real-time emulation and simulator debug mode
             source-level debugging for C, PL/M, and assembler
             programs 8xC75x parts
             low cost - only $100
             DOS and Windows versions available
        - Signum Systems  $$
        - Sophia Systems  $$$
        - Zax Corporation
        - Zitek Corporation  $$$
(Contacts listed in FAQ below).

See:




From instruction-set simulation and tracing