[Rowson 94]
@InProceedings{Rowson:94,
author = "James A. Rowson",
title = "Hardware/Software Co-Simulation",
booktitle = "Proc.~of the 31st Design Automation Conference (DAC~'94)",
year = "1994",
organization = "ACM",
address = "San Diego, CA",
OPTmonth = "June",
note = "(Tutorial)",
OPTannote = ""
}
@InProceedings{Rogers:92,
author = "Anne Rogers and Kai Li",
title = "Software Support for Speculative Loads",
pages = "38--50",
booktitle = "Proc.~of the 5th International Conference on Architectural
Support for Programming Languages and Operating Systems",
year = "1992",
month = "October"
}
Evidently contains information about a cycle-level simulator.
More in
@TechReport{Rogers:93,
author = "Anne Rogers and Scott Rosenberg",
title = "Cycle Level {SPIM}",
institution = "Department of Computer Science, Princeton
University",
year = 1993,
address = "Princeton, NJ",
month = "October"
}
Tracing with Pixie
Michael D. Smith
Center for Integrated Systems, Stanford University
April 91

ssim: A Superscalar Simulator
Mike Johnson (AMD), M. D. Smith (Stanford Univ.)

Pixie front ends in ftp://velox.stanford.edu/pub
Johnson, Mike: Superscalar Microprocessor Design. Englewood Cliffs, NJ: Prentice Hall, 1991. xxiv, 288 pp., illustrations. (Prentice-Hall Series in Innovative Technology.) Bibliography: pp. 273-278. ISBN 0-13-875634-1
@Book{Huck:89,
author = {Jerome C. Huck and Michael J. Flynn},
title = {Analyzing Computer Architectures},
publisher = {IEEE Computer Society Press},
year = 1989,
address = {Washington, DC}
}
%A Max Copperman
%A Jeff Thomas
%T Poor Man's Watchpoints
%J ACM SIGPLAN Notices
%V 30
%N 1
%D January 1995
%P 37-44

Pardo has a copy. Executive summary: debugging tool; statically patches loads and stores with code to check for data breakpoints.
Amusing story: The processor they were running on has load delay slots and does not have pipeline interlocks. Their tool replaces each load or store with several instructions; it patched a piece of user-mode code of the form
    load addr -> r5
    store r5 -> addr2

Before patching, the store executed in the load's delay slot, so the code saved the old value of r5 to addr2. After patching, it saved the new value. Technically, this code was already broken, because an interrupt or exception between the load and the store could have produced the same symptom.
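The behaviour in this story can be reproduced with a toy model. The following is only a sketch under assumptions not taken from the paper: a one-instruction load delay, no interlocks, a tuple-based instruction encoding, and a single `nop` standing in for the checking code the tool inserts. The names `run`, `original`, and `patched` are hypothetical.

```python
def run(program, regs, mem):
    """Toy machine with one load delay slot and no pipeline interlocks.

    A loaded value reaches its register only after the instruction that
    follows the load has executed, so that instruction sees the old value.
    """
    regs, mem = dict(regs), dict(mem)
    pending = None  # (register, value) produced by the previous load
    for op in program:
        landing, pending = pending, None
        if op[0] == "load":
            _, addr, reg = op
            pending = (reg, mem[addr])
        elif op[0] == "store":
            _, reg, addr = op
            mem[addr] = regs[reg]
        # ("nop",) does nothing; it stands in for inserted check code.
        if landing is not None:
            regs[landing[0]] = landing[1]
    if pending is not None:
        regs[pending[0]] = pending[1]
    return regs, mem


# The two-instruction sequence from the story.
original = [("load", "addr", "r5"), ("store", "r5", "addr2")]
# Hypothetical patched form: extra work now separates load and store.
patched = [("load", "addr", "r5"), ("nop",), ("store", "r5", "addr2")]

start_regs = {"r5": 7}
start_mem = {"addr": 111, "addr2": 0}

print(run(original, start_regs, start_mem)[1]["addr2"])  # old r5
print(run(patched, start_regs, start_mem)[1]["addr2"])   # loaded value
```

In this model the original sequence stores the stale register value, while any instruction placed between the load and the store (patch code, or equally an interrupt handler) lets the load complete first and changes what is stored.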
``Spike was built inside GNU GCC by Michael Golden and myself. It includes a lot of features that have appeared in ATOM, including the simulator with the benchmark into a single ``self-tracing'' binary. The instruction trace was based on an abstract machine model distilled from GCC's RTL; it had both a high-level and a low-level form. Spike is still in occasional use, but has never been released.''
Basic summary: Wanted to profile. -p/-pg code is larger and slower by enough to make it hard to justify profiling as the default. Assumes the entire source is available. For these and other reasons, wrote jprof which operates with disassembly, analysis and rewriting. Discusses sampling errors, expected accuracy, stability, randomness, etc. Describes jprof: counters and stopwatches; subroutine call graph. Domain/OS on HP/Apollo using 68030. Discusses shared libraries. Can also use page-fault clock. 4-microsecond clocks. Some lessons/observations. Doesn't explain how program running time is affected by jprof.
Summary: DEC is running Win32 application binaries on Alpha by a new combination of interpreter and static translator. The static translator runs in the background, between the first and second executions of the application. It uses info collected by the interpreter during the 1st run, to reliably distinguish active code paths from r/o data and work out the effects of indirect jumps. Static analysis can't do this automatically on its own, for typical x86 binaries.
%A P. J. Brown
%T Re-creation of Source Code from Reverse Polish Form
%J Software \- Practice and Experience
%V 2
%N 3
%P 275-278
%D 1972

Note: there's a slightly later SPE that has a follow-up article explaining how to do it faster/more efficiently.
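For readers unfamiliar with the problem the title names, here is a minimal sketch of the general idea: rebuilding an infix expression from reverse Polish tokens with a stack. It is not Brown's algorithm, which is not described in these notes. The function name `rpn_to_infix`, the operator table, and the parenthesization rule are assumptions chosen for illustration.

```python
# Binary operators and their precedence; operands get a higher value.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPERAND = 3


def rpn_to_infix(tokens):
    """Rebuild an infix string from a list of reverse Polish tokens."""
    stack = []  # entries are (text, precedence of the top-level operator)
    for tok in tokens:
        if tok in PRECEDENCE:
            prec = PRECEDENCE[tok]
            right_text, right_prec = stack.pop()
            left_text, left_prec = stack.pop()
            # Left operand needs parentheses only if it binds more loosely.
            if left_prec < prec:
                left_text = f"({left_text})"
            # Right operand also needs them at equal precedence, because
            # "-" and "/" are not associative: a - (b - c) != a - b - c.
            if right_prec <= prec:
                right_text = f"({right_text})"
            stack.append((f"{left_text} {tok} {right_text}", prec))
        else:
            stack.append((tok, OPERAND))
    if len(stack) != 1:
        raise ValueError("malformed reverse Polish expression")
    return stack[0][0]


print(rpn_to_infix("a b + c *".split()))
print(rpn_to_infix("a b c - -".split()))
```

The sketch is conservative about parentheses on the right operand, so it may emit a pair that is not strictly needed for `+` and `*`, but it never changes the meaning of the expression.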
%A Ariel Pashtan
%T A Prolog Implementation of an Instruction-Level Processor Simulator
%J Software \- Practice and Experience
%V 17
%N 5
%P 309-318
%D May 1987
From: bchen@eecs.harvard.edu (Brad Chen)
Newsgroups: comp.arch
Subject: Windows x86 Address Traces Available
Date: 7 Oct 1996 22:20:30 GMT
Organization: Harvard University EECS
Lines: 15
Message-ID: <53bvne$5lb@necco.harvard.edu>
NNTP-Posting-Host: steward.harvard.edu
Keywords: Windows x86 address traces
A collection of x86 memory reference traces from Win32 applications is now available from the following URL: http://etch.eecs.harvard.edu/traces/index.html. The collection includes traces from both commercial and public-domain applications. The collection currently includes:
- Perl
- MPeg Play
- Borland C++
- Microsoft Visual C
- Microsoft Word

These traces were created using Etch, an instrumentation and optimization tool for Win32 executables. For more information on Etch see the above URL.
(etch-info@cs.washington.edu)
Peter Kuhn
Institute for Integrated Circuits (LIS)
Technical University of Munich
Arcisstr. 21, D-80290 Munich, Germany
voice: +49-89-289-23092
fax1: +49-89-289-28323
fax2: +49-89-289-25304
email: P_Kuhn@lis.e-technik.tu-muenchen.de
http://www.lis.e-technik.tu-muenchen.de/people/kp.html
From: Harish Patil
Newsgroups: comp.compilers
Subject: Thesis available: Program Monitoring
Date: 29 Jan 1997 11:21:02 -0500
Organization: Compilers Central
Lines: 59
Sender: johnl@iecc.com
Approved: compilers@ivan.iecc.com
Message-ID: <97-01-223@comp.compilers>
Reply-To: Harish Patil
NNTP-Posting-Host: ivan.iecc.com
Keywords: report, available, performance

Hello everyone:

I am glad to announce that my Ph.D. thesis, titled "Efficient Program Monitoring Techniques", is available on-line. This thesis was completed under the supervision of Prof. Charles Fischer at the department of Computer Sciences, University of Wisconsin--Madison. The thesis is available as technical report # 1320. Please check it out at the URL:

http://www.cs.wisc.edu/Dienst/UI/2.0/Describe/ncstrl.uwmadison%2fCS-TR-96-1320

An abstract of the thesis follows.

Regards,
-Harish

Efficient Program Monitoring Techniques
---------------------------------------

Programs need to be monitored for many reasons, including performance evaluation, correctness checking, and security. However, the cost of monitoring programs can be very high. This thesis contributes two techniques for reducing the high execution time overhead of program monitoring: 1) customization and 2) shadow processing. These techniques have been tested using a memory access monitoring system for C programs.

"Customization" reduces the cost of monitoring programs by decoupling monitoring from original computation. A user program can be customized for any desired monitoring activity by deleting computation not relevant for monitoring. The customized program is smaller, easier to analyze, and almost always faster than the original program. It can be readily instrumented to perform the desired monitoring. We have explored the use of program slicing technology for customizing C programs. Customization can cut the overhead of memory access monitoring by up to half.
"Shadow processing" hides the cost of on-line monitoring by using idle processors in multiprocessor workstations. A user program is partitioned into two run-time processes. One is the main process executing as usual, without any monitoring code. The other is a shadow process following the main process and performing the desired monitoring. One key issue in the use of shadow process is the degree to which the main process is burdened by the need to synchronize and communicate with the shadow process. We believe the overhead to the main process must be very modest to allow routine use of shadow processing for heavily-used production programs. We therefore limit the interaction between the two processes to communicating certain irreproducible values. In our experimental shadow processing system for memory access checking the overhead to the main process is very low - almost always less than 10%. Further, since the shadow process avoids repeating some of the computations from the main program, it runs much faster than a single process performing both the computation and monitoring.

==========================================================================
Harish Patil: Massachusetts Language Lab - Hewlett Packard
Mail Stop CHR02DC, 300 Apollo Drive, Chelmsford MA 01824
Phone: 508 436 5717 Fax: 508 436 5135 Email: patil@apollo.hp.com