4.9. P-Code Disassembler

Version I.5, September 1978

The disassembler reads a standard UCSD code file and outputs symbolic pseudo-assembly (P-Code) along with statistics on opcode frequency, procedure calls, and data segment references. It was originally written to collect such statistics as an aid in making architecture improvements. It has since proved helpful in debugging interpreters and optimizing programs, and it provides further information on some of the subtleties of our implementation of Pascal. All statistics are gathered statically, by making a pass through the code file, rather than while the code file is actually running.

4.9.1. Disassembly

The Disassembler reads a code file that has been generated by the UCSD Pascal Compiler. If a program USES a UNIT, the disassembly will include the UNIT only if the code file has been linked. Assembly routines linked into a Pascal host are never included in the disassembly.

The Disassembler is invoked by executing DISASM.I5 and requires the file OPCODES.I5 to be on the system disk. The Disassembler first prompts for an input code file; the suffix .CODE is supplied automatically and need not be typed.

The next question concerns the byte sex of the machine the code file is intended to run on, that is, whether the first physical byte (byte 0) of a machine word is the most significant byte of the word. For the PDP-11 and the 8080 families, physical byte 0 is the least significant byte. For more information, see section 3.6, Byte-Swapping.
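As a rough illustration of what the byte-sex answer controls, the Python sketch below assembles a 16-bit word from two physical bytes under either convention. The function name `word_from_bytes` and its `byte0_is_msb` flag are invented for this example; they are not part of the Disassembler.

```python
def word_from_bytes(byte0: int, byte1: int, byte0_is_msb: bool) -> int:
    """Combine two physical bytes into one 16-bit word.

    byte0 is physical byte 0 of the word (the byte at the lower address).
    byte0_is_msb mirrors the Disassembler's byte-sex question.
    """
    if not (0 <= byte0 <= 0xFF and 0 <= byte1 <= 0xFF):
        raise ValueError("each byte must be in the range 0..255")
    if byte0_is_msb:
        return (byte0 << 8) | byte1
    return (byte1 << 8) | byte0


# PDP-11 and 8080 style: physical byte 0 is the least significant byte.
print(hex(word_from_bytes(0x34, 0x12, byte0_is_msb=False)))  # 0x1234
# Opposite byte sex: the same two bytes read as a different word.
print(hex(word_from_bytes(0x34, 0x12, byte0_is_msb=True)))   # 0x3412
```

Answering the byte-sex question incorrectly would cause every multi-byte quantity in the code file to be read as in the second call rather than the first.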

The next prompt asks for an output file for the disassembled listing. Since the output file is untyped, CONSOLE: or PRINTER: (if it is on-line) may be used.

The final question at this stage is whether the user wishes to take control of the disassembly, i.e. to decide which procedures are disassembled rather than disassembling every procedure in the file.

The following question regards the collection of statistics on references to a particular procedure's data segment. Should you decide to control the disassembly, you will be warned that statistics are gathered only on those procedures which are disassembled.

Next you will be taken into the Segment Guide. This level displays the segments by name and lets you choose the one you are interested in. The Procedure Guide follows and lets you choose the particular procedure(s) that you wish to disassemble. Typing an “L” at this point will list the procedure(s) contained in this segment. A more complete description of this step occurs in the next section. The Segment Guide may be re-entered by typing “Q” in the Procedure Guide. In this manner you may disassemble several procedures in several different segments without disassembling the entire file. The Segment Guide is exited by typing “Q”.

 1 1 1:D  0 (*$L CONSOLE:*)
 2 1 1:D  1 PROGRAM DISASMDEMO;
 3 1 1:D  3 VAR  I: INTEGER;
 4 1 1:D  4      TOMORROW: BOOLEAN;
 5 1 1:D  5      COMMENT: STRING;
 6 1 1:C  0 BEGIN
 7 1 1:C  0   I:=0;
 8 1 1:C  5   TOMORROW:=FALSE;
 9 1 1:C  8   REPEAT
10 1 1:C  8     I:=I+1;
11 1 1:C 13     WRITELN('Disassembly - a step backwards...');
12 1 1:C 74   UNTIL TOMORROW;
13 1 1:C 77 END.

Figure 1: Sample Pascal Program

          BLOCK #  1      OFFSET IN BLOCK=   0
SEGMENT PROC     OFFSET#                            HEX CODE
     1    1     0(000):    BPT          7           D507
     1    1     2(002):    SLDC         0           00
     1    1     3(003):    SRO          3           AB03
     1    1     5(005):    SLDC         0           00
     1    1     6(006):    SRO          4           AB04
     1    1     8(008):    SLDO         3           EA
     1    1     9(009):    SLDC         1           01
     1    1    10(00A):    ADI                      82
     1    1    11(00B):    SRO          3           AB03
     1    1    13(00D):    LOD          1  3        B60103
     1    1    16(010):    LCA         42 'Disassembly - a step backwards...'
     1    1    60(03C):    SLDC         0           00
     1    1    61(03D):    CXP     WRITESTR         CD0013
     1    1    64(040):    CSP     IOCHECK          9E00
     1    1    66(042):    LOD          1  3        B60103
     1    1    69(045):    CXP     WRITELN          CD0016
     1    1    72(048):    CSP     IOCHECK          9E00
     1    1    74(04A):    SLDO         4           EB
     1    1    75(04B):    FJP          8           A1F6
     1    1    77(04D):    RBP          0           C100

Figure 2: Sample Program Disassembled

Figure 1 displays a sample Pascal program that has been listed during compilation. Figure 2 displays the disassembled code of the file generated by the compiler. The left three columns in figure 2 correspond to the three columns to the right of the line number in figure 1: segment number, procedure number, and offset within the procedure, respectively. The offset is also given in hex in parentheses. The actual code that exists in the file is given in hex in the rightmost column. A complete description of UCSD P-Code mnemonics is given in section 3.4.

The parameters to CXPs and CSPs are converted to the procedure name when the target is a known system procedure or function; WRITESTR, WRITELN, and IOCHECK are examples. The string operand of LCA is printed as a string, as shown by the line at offset 16. Jumps have their operand(s) converted to an offset from the start of the procedure so that the offset may act as a label. Thus the 8 displayed in the operand field of the FJP at offset 75 really means a jump to the SLDO at offset 8. This is also true of case jumps (XJPs).

The block number and byte offset of the start of the procedure are given relative to the start of the code file. Thus this procedure starts at block 1, offset 0 of the code file. The segment dictionary resides in block 0 for all code files.
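The offset column of figure 2 can be checked by hand: each offset is the previous offset plus the number of bytes in the previous instruction. The Python sketch below does this bookkeeping. The instruction lengths are counted from the hex code column; the 44-byte length used for LCA is an assumption inferred from the gap between offsets 16 and 60, and the helper names are invented for the example.

```python
# (mnemonic, instruction length in bytes), in the order shown in figure 2.
# Lengths are counted from the hex code column. The LCA length of 44 is an
# assumption inferred from the gap between offsets 16 and 60.
INSTRUCTIONS = [
    ("BPT", 2), ("SLDC", 1), ("SRO", 2), ("SLDC", 1), ("SRO", 2),
    ("SLDO", 1), ("SLDC", 1), ("ADI", 1), ("SRO", 2), ("LOD", 3),
    ("LCA", 44), ("SLDC", 1), ("CXP", 3), ("CSP", 2), ("LOD", 3),
    ("CXP", 3), ("CSP", 2), ("SLDO", 1), ("FJP", 2), ("RBP", 2),
]


def offset_label(offset: int) -> str:
    """Format an offset as decimal followed by three hex digits."""
    return f"{offset}({offset:03X})"


def running_offsets(instructions):
    """Return the starting offset of each instruction within the procedure."""
    offsets, position = [], 0
    for _mnemonic, length in instructions:
        offsets.append(position)
        position += length
    return offsets


offsets = running_offsets(INSTRUCTIONS)
for (mnemonic, _length), offset in zip(INSTRUCTIONS, offsets):
    print(f"{offset_label(offset):>8}:  {mnemonic}")
```

Because jump operands are shown as procedure-relative offsets, the 8 in the FJP operand field can be looked up directly in this list, where it lands on the SLDO instruction.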

4.9.2. Data Segment Reference Statistics

The fourth prompt the Disassembler provides asks whether you would like to keep track of all references to a particular procedure's data segment. The most common use of these statistics is in optimizing the code of a given procedure. By re-arranging the order in which variables are declared, one may change the offset within the data segment that applies to a given variable. For p-machine architecture reasons, references to the first 16 words of the data segment are the fastest and have optimized 1-byte instructions. Offsets from 17 to 127 result in instructions at least 2 bytes long, while references to offsets greater than 127 require at least 3 bytes. By giving the most frequently used variables the smaller offsets, one may save considerable code file space and possibly time during execution.
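The size rule above can be written down as a small Python sketch. The thresholds come directly from the preceding paragraph; the function names are invented for the example, and the result is only a lower bound, since the text says "at least" for the larger offsets.

```python
def min_reference_bytes(word_offset: int) -> int:
    """Smallest instruction size, in bytes, for one data segment reference.

    Thresholds follow the text: offsets up to 16 have 1-byte instructions,
    offsets 17..127 need at least 2 bytes, larger offsets at least 3.
    """
    if word_offset < 0:
        raise ValueError("word offset must not be negative")
    if word_offset <= 16:
        return 1
    if word_offset <= 127:
        return 2
    return 3


def min_total_bytes(reference_counts: dict) -> int:
    """Lower bound on code bytes spent on the given references.

    reference_counts maps a word offset to the number of references to it.
    """
    return sum(min_reference_bytes(offset) * count
               for offset, count in reference_counts.items())


# Three references to offset 3 and two to offset 4.
print(min_total_bytes({3: 3, 4: 2}))      # 5
# The same reference counts if the variables sat beyond offset 127.
print(min_total_bytes({130: 3, 131: 2}))  # 15
```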

Data Segment size: 45,  Data references: 5,  Lex level 0

For segment DISASMDE Procedure # 1
Offset(word)     Total      %
       3            3    60.00
       4            2    40.00

Figure 3: Sample Program's Data Segment Statistics

Figure 3 shows the data segment statistics for our sample program. Clearly there is little to be gained from optimizing such a small program, but the general idea can still be presented. Using the compiled listing shown in figure 1, one can match offsets to variables as follows:

variable    offset
I              3
TOMORROW       4
COMMENT        5

Using the counts in figure 3, one can see that offset 3, the variable I, is referenced most frequently and thus deserves its position. The same idea carried out on a large program may result in substantial size savings. Notice that offset 5 (COMMENT) is never referenced and thus is not included in the statistics in figure 3.
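The matching of offsets to variables can also be done mechanically. The Python sketch below takes the counts from figure 3 and the offsets from figure 1, recomputes the percentages, and ranks the variables by how often they are referenced. The function and constant names are invented for the example; the Disassembler itself reports offsets only.

```python
# Reference counts per word offset, copied from figure 3.
REFERENCE_COUNTS = {3: 3, 4: 2}
# Offsets matched to variables using the compiled listing in figure 1.
VARIABLE_AT_OFFSET = {3: "I", 4: "TOMORROW", 5: "COMMENT"}


def usage_report(counts, names):
    """Return (name, offset, count, percent) rows, most referenced first."""
    total = sum(counts.values())
    rows = []
    for offset, name in names.items():
        count = counts.get(offset, 0)
        percent = 100.0 * count / total if total else 0.0
        rows.append((name, offset, count, percent))
    # The sort is stable, so ties keep their declaration order.
    rows.sort(key=lambda row: -row[2])
    return rows


for name, offset, count, percent in usage_report(REFERENCE_COUNTS,
                                                 VARIABLE_AT_OFFSET):
    print(f"{name:<10} offset {offset:>3}  {count:>3} refs  {percent:6.2f}%")
```

Variables near the top of such a report are the candidates for the earliest declarations, and therefore the smallest offsets.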

The prompt for the output file for these statistics occurs after the disassembly has been completed. If you elect to collect these statistics you will be taken into the Segment and Procedure Guides as described in the previous section except that the prompt requests the selection of a data segment on which to collect statistics. In the Procedure Guide, “L” gives a listing of all the procedures in the selected segment by number, lex level, and data segment size. After the selection of a data segment, processing continues, as described in the previous section, from the point after the data segment question.

4.9.3. Opcode, Procedure Call, and Jump Statistics

These statistics are collected as an aid in optimizing the architecture of P-Code. Although they are interesting to look at, they are of no real use to the typical user, and for this reason they are described only superficially.

Each opcode is given with a complete breakdown of which bit was most significant for each operand on any given occurrence of the opcode. These are presented in terms of totals and percentages of the number of occurrences of the opcode. In addition a histogram of the opcode occurrence as a percentage of the total number of opcodes disassembled runs along the right-hand margin. There is also a table of jumps in terms of the number of bits required to represent the distance of the jump for both positive and negative jumps. Finally there are counts of all procedure calls listed by segment and procedure number.
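To make the jump table concrete, the Python sketch below buckets jump distances by the number of bits needed to represent their magnitude, separately for forward and backward jumps. This only illustrates the idea; the exact bucketing rules used by the Disassembler are not documented here, and the function names and sample distances are hypothetical.

```python
from collections import Counter


def bits_required(distance: int) -> int:
    """Number of bits needed to represent the magnitude of a jump distance."""
    return abs(distance).bit_length()


def jump_table(distances):
    """Count forward and backward jumps by the bits their distance needs."""
    forward, backward = Counter(), Counter()
    for distance in distances:
        bucket = backward if distance < 0 else forward
        bucket[bits_required(distance)] += 1
    return forward, backward


# Hypothetical jump distances in bytes; not taken from any real code file.
forward, backward = jump_table([5, 12, -67, 130, -3])
print(sorted(forward.items()))   # [(3, 1), (4, 1), (8, 1)]
print(sorted(backward.items()))  # [(2, 1), (7, 1)]
```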

The last prompt of the program asks for the file to which these statistics are to be written.
