Version I.5, September 1978
The disassembler reads a standard UCSD code file and outputs symbolic pseudo-assembly (P-Code) along with statistics on opcode frequency, procedure calls, and data segment references. It was originally written to collect such statistics as an aid in making architecture improvements. It has since been found helpful in debugging interpreters and optimizing programs, and it provides further information on some of the subtleties of our implementation of Pascal. All statistics are static: they are gathered by making a pass through the code file, not by monitoring the code file while it is running.
The Disassembler reads a code file that has been generated by the UCSD Pascal Compiler. If a program USES a UNIT, the disassembly will include the UNIT only if the code file has been linked. Assembly routines linked into a Pascal host are never included in the disassembly.
The Disassembler is invoked by executing DISASM.I5 and requires the file OPCODES.I5 to be on the system disk. It first prompts for an input code file; the suffix .CODE is supplied automatically and need not be typed.
The next question refers to the byte sex of the machine the code file is intended to run on, that is whether the first physical byte (byte 0) of a machine word is the most significant byte of the word. For more information, see section 3.6 Byte-Swapping. For the PDP-11 and the 8080 families, physical byte 0 is the least significant byte.
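The effect of byte sex on a single word can be illustrated with a short Python sketch. The function name and the sample bytes are hypothetical and are not part of the Disassembler; the sketch only shows how the same two physical bytes yield different word values under the two conventions.

```python
def word_from_bytes(byte0: int, byte1: int, byte0_is_msb: bool) -> int:
    """Combine two physical bytes into one 16-bit word.

    byte0 is the first physical byte of the word (byte 0).
    byte0_is_msb answers the Disassembler's byte-sex question.
    """
    if byte0_is_msb:
        return (byte0 << 8) | byte1
    return (byte1 << 8) | byte0


# Hypothetical sample: the byte pair 2A 00 read under both conventions.
print(word_from_bytes(0x2A, 0x00, byte0_is_msb=False))  # PDP-11 / 8080 style
print(word_from_bytes(0x2A, 0x00, byte0_is_msb=True))   # byte 0 most significant
```

Answering the byte-sex question incorrectly would cause every multi-byte quantity in the code file to be read with its bytes exchanged, as the two results above suggest.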
Next the prompt will be for an output file for the disassembled output. Since the output file is untyped, CONSOLE: or PRINTER: (if it is on-line) may be used.
The final question at this stage is whether the user wishes to take control of the disassembly, i.e. decide which procedures are disassembled as opposed to all the procedures in the file.
The following question regards the collection of statistics on references to a particular procedure's data segment. If you choose to control the disassembly, you will be warned that statistics are gathered only on those procedures that are disassembled.
Next you will be taken into the Segment Guide. This level displays your segments by name and lets you choose the one you are interested in. The Procedure Guide follows and lets you choose the particular procedure(s) that you wish to disassemble. Typing an “L” at this point will list the procedure(s) contained in this segment. A more complete description of this step occurs in the next section. The Segment Guide may be re-entered by typing “Q” in the Procedure Guide. In this manner you may disassemble several procedures in several different segments without disassembling the entire file. The Segment Guide is exited by typing “Q”.
     1   1   1:D    0 (*$L CONSOLE:*)
     2   1   1:D    1 PROGRAM DISASMDEMO;
     3   1   1:D    3 VAR I: INTEGER;
     4   1   1:D    4     TOMORROW: BOOLEAN;
     5   1   1:D    5     COMMENT: STRING;
     6   1   1:C    0 BEGIN
     7   1   1:C    0   I:=0;
     8   1   1:C    5   TOMORROW:=FALSE;
     9   1   1:C    8   REPEAT
    10   1   1:C    8     I:=I+1;
    11   1   1:C   13     WRITELN('Disassembly - a step backwards...');
    12   1   1:C   74   UNTIL TOMORROW;
    13   1   1:C   77 END.
Figure 1: Sample Pascal Program
BLOCK # 1   OFFSET IN BLOCK= 0
SEGMENT  PROC  OFFSET#                        HEX CODE
   1       1     0(000):  BPT   7             D507
   1       1     2(002):  SLDC  0             00
   1       1     3(003):  SRO   3             AB03
   1       1     5(005):  SLDC  0             00
   1       1     6(006):  SRO   4             AB04
   1       1     8(008):  SLDO  3             EA
   1       1     9(009):  SLDC  1             01
   1       1    10(00A):  ADI                 82
   1       1    11(00B):  SRO   3             AB03
   1       1    13(00D):  LOD   1 3           B60103
   1       1    16(010):  LCA   42 'Disassembly - a step backwards...'
   1       1    60(03C):  SLDC  0             00
   1       1    61(03D):  CXP   WRITESTR      CD0013
   1       1    64(040):  CSP   IOCHECK       9E00
   1       1    66(042):  LOD   1 3           B60103
   1       1    69(045):  CXP   WRITELN       CD0016
   1       1    72(048):  CSP   IOCHECK       9E00
   1       1    74(04A):  SLDO  4             EB
   1       1    75(04B):  FJP   8             A1F6
   1       1    77(04D):  RBP   0             C100
Figure 2: Sample Program Disassembled
Figure 1 displays a sample Pascal program that has been listed during compilation. Figure 2 displays the disassembled code of the file generated by the compiler. The left 3 columns in figure 2 correspond to the 3 columns to the right of the line number in figure 1. They are segment number, procedure number, and offset within procedure, respectively. The offset is also given in hex in parentheses. A complete description of UCSD P-Code mnemonics is given in section 3.4. The actual code that exists in the file is given in hex in the rightmost column.

The parameters to CXPs and CSPs are converted to the procedure name when the target is a known system procedure or function; WRITESTR, WRITELN, and IOCHECK are examples. The string operand for LCA is printed as a string, as shown by the line with offset 16. Jumps have their operand(s) converted to an offset from the start of the procedure so that the offset may act as a label. Thus the 8 displayed in the operand field of the FJP at offset 75 really means a jump to the SLDO at offset 8. This is also true of case jumps (XJPs).

The block number and byte offset of the start of the procedure are given relative to the start of the code file. Thus this procedure starts at block 1, offset 0 of the code file. The segment dictionary resides in block 0 for all code files.
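The way a displayed jump operand acts as a label can be sketched in Python. The rows below are transcribed by hand from a few lines of figure 2; the tuple layout and the function names are assumptions made for illustration and do not describe the Disassembler's internals.

```python
# A few rows from figure 2 as (offset, mnemonic, operand text).
LISTING = [
    (8, "SLDO", "3"),
    (9, "SLDC", "1"),
    (10, "ADI", ""),
    (11, "SRO", "3"),
    (74, "SLDO", "4"),
    (75, "FJP", "8"),
    (77, "RBP", "0"),
]


def format_offset(offset: int) -> str:
    """Render an offset as decimal followed by three hex digits in parentheses."""
    return f"{offset}({offset:03X})"


def jump_target(listing, jump_offset: int):
    """Return the row that the jump at jump_offset refers to.

    The displayed operand is already an offset from the start of the
    procedure, so it can be used directly as a lookup key.
    """
    by_offset = {row[0]: row for row in listing}
    operand = by_offset[jump_offset][2]
    return by_offset[int(operand)]


print(format_offset(74))         # 74(04A)
print(jump_target(LISTING, 75))  # (8, 'SLDO', '3')
```

Because the operand has already been converted, no arithmetic on the raw jump bytes is needed to follow a jump when reading the listing.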
Data Segment size: 45, Data references: 5, Lex level 0
For segment DISASMDE Procedure # 1

Offset(word)    Total      %
     3            3      60.00
     4            2      40.00
Figure 3: Sample Program's Data Segment Statistics
Figure 3 shows the data segment statistics for our sample program. Clearly there is little to be gained from optimizing such a small program, but the general idea can still be presented. Using the compiled listing shown in figure 1, one can match offsets to variables as follows:
variable    offset
I             3
TOMORROW      4
COMMENT       5
Now, using the figures in figure 3, one can see that offset 3 (the variable I) is referenced most frequently and thus deserves its position. The same idea carried out on a large program may result in substantial size savings. Notice that offset 5 (the variable COMMENT) is never referenced and thus is not included in the statistics in figure 3.
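The totals in figure 3 can be reproduced by hand from figure 2. The Python sketch below assumes, although the text does not say so explicitly, that the references counted are the operands of the SRO and SLDO instructions in the procedure; that assumption gives the five data references reported. The function name is hypothetical.

```python
from collections import Counter


def tally(references):
    """Return (offset, total, percent) rows, ordered by offset."""
    counts = Counter(references)
    grand_total = sum(counts.values())
    return [
        (offset, counts[offset], 100.0 * counts[offset] / grand_total)
        for offset in sorted(counts)
    ]


# Operands of SRO 3, SRO 4, SLDO 3, SRO 3, SLDO 4, in listing order.
references = [3, 4, 3, 3, 4]

for offset, total, percent in tally(references):
    print(f"{offset:>6} {total:>8} {percent:>9.2f}")
```

An offset that is never referenced, such as that of COMMENT, simply never enters the tally, which is why it has no row in the output.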
The prompt for the output file for these statistics occurs after the disassembly has been completed. If you elect to collect these statistics you will be taken into the Segment and Procedure Guides as described in the previous section except that the prompt requests the selection of a data segment on which to collect statistics. In the Procedure Guide, “L” gives a listing of all the procedures in the selected segment by number, lex level, and data segment size. After the selection of a data segment, processing continues, as described in the previous section, from the point after the data segment question.
Each opcode is given with a complete breakdown of which bit was most significant for each operand on any given occurrence of the opcode. These are presented in terms of totals and percentages of the number of occurrences of the opcode. In addition a histogram of the opcode occurrence as a percentage of the total number of opcodes disassembled runs along the right-hand margin. There is also a table of jumps in terms of the number of bits required to represent the distance of the jump for both positive and negative jumps. Finally there are counts of all procedure calls listed by segment and procedure number.
The last prompt of the program is the file to which these statistics are to be written.
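The table of jumps by number of bits can be illustrated with a short Python sketch. The distances below are hypothetical and are not taken from a real code file, and the function name is an assumption; the sketch only shows one way to bucket jump distances by the number of bits needed for their magnitude, separately for positive and negative jumps.

```python
from collections import Counter


def jump_bit_table(distances):
    """Count jumps by the bits needed for the magnitude of the distance.

    Returns two Counters keyed by bit count: one for positive (forward)
    jumps and one for negative (backward) jumps.
    """
    forward, backward = Counter(), Counter()
    for distance in distances:
        bits = abs(distance).bit_length()
        if distance >= 0:
            forward[bits] += 1
        else:
            backward[bits] += 1
    return forward, backward


# Hypothetical jump distances in bytes.
forward, backward = jump_bit_table([5, 12, 100, -67, -3])
print(dict(forward))
print(dict(backward))
```

A table of this kind shows how many jumps would fit in an operand field of a given width, which is the sort of question the statistics were originally collected to answer.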