3.5. Introduction to the Pascal Pseudo-Machine

Version II.0, February 1979

page203 UCSD uses an interpreter based implementation of Pascal. This means that the compiler emits code for a pseudo-machine which is emulated at run time by a program written in the machine language of the host. The compiler, program editor, stand-alone operating system, and various utilities are themselves written in Pascal and run on the same interpreter. Thus the entire system can be moved to a new host machine by rewriting the interpreter for the new host. This document describes the Pseudo-machine code files as they were in version I.3. Many of the segments mentioned are no longer resident in the code file used as an example. This does not affect the functionality of the description of the mechanisms put forth by this document.

Figure 3.5.10 (the last page of this document) is a skeleton version of a large Pascal program, here-in-after referred to as "The Program". This document is a top-down description of the realization of that program on the UCSD Pascal system. We will make occasional use of a helpful coincidence: The Program is the framework of the portion of the UCSD Pascal environment that's written in Pascal.

If The Program ware expanded to a complete Pascal system, it would consist of at least 6000 lines of Pascal and compile to more than 50,000 bytes of code — too big to fit all at once into the memory of a small machine (by our current definition of small). We have therefore extended Pascal so that a programmer can explicitly partition a program into segments; only some of which need be resident in main memory at a time. The syntax of this extension is shown in figure 3.5.1. (Any syntactic objects not defined explicitly there retain their standard interpretation as defined by Jensen & Wirth: Pascal User Manual and Report.) See Section 5.9 for revised syntax diagrams.

<program> ::= <program heading> <segment block> .

<segment block> ::= <label declaration part>
    <constant declaration part> <type definition part>
    <variable declaration part> <segment declaration part>
    <segment body>

<segment declaration part> ::= SEGMENT <procedure heading>
    <segment block>; \ SEGMENT <function heading>
    <segment block>;

<segment body> ::= <procedure and function declaration part>
    <statement part>

Figure 3.5.1. Segment Declaration Syntax.

page204 Segment declaration syntax (figure 3.5.1) requires that all nested segments be declared before the ordinary procedures or functions of the segment body. Thus, a code segment can be completely generated before processing of code for the next segment starts. This is not a functional limitation, since forward declarations can be used to allow nested segments (COMPILER in The Program) to reference procedures in an outer segment body (CLEARSCREEN). Similarly, segment procedures and functions can themselves be declared forward.

Segmenting a program does not change its meaning in any fundamental sense. When a segment is called (e.g. the COMPILER segment in line A), the interpreter checks to see if it is present in memory due to a previous invocation. If it is, control is transferred and execution proceeds: if not, the appropriate code segment must be loaded from disk before the transfer of control takes place. When no more active invocations of the segment exist, its code is removed from memory. For instance, in The Program, the code for the COMPINIT segment is not present in memory either before or after the execution of line A. Clearly, a program should be segmented in such a way that (non-recursive) segment calls are infrequent; otherwise, much time could be lost in unproductive thrashing (particularly on a system with low performance disk).

                                                  high address
             +-----------------------------------------------+
        ,--> |                    DEBUGGER               10  |
    not |    +-----------------------------------------------+
        |--> |                      FILER                17  |
  shown |    +-----------------------------------------------+
        |    |                     EDITOR                12  |
     in |    +-----------------------------------------------+
        |    |                    COMPINIT                7  |
    the |    +-----------------------------------------------+
        |    |                    COMPILER               41  |
program |    +-----------------------------------------------+
        `--> |                   INITIALIZE               3  |
             +-----------------------------------------------+
             |                  USER PROGRAM              1  |
             +-----------------------------------------------+
             |                  PASCAL SYSTEM            17  |
             +-----------------------------------------------+
             |               SEGMENT DICTIONARY           1  |
             +-----------------------------------------------+
                                                   low address
Figure 3.5.2. Pascal System Codefile.

page205 The code file resulting from compilation of The Program is diagrammed in figure 3.5.2*. The file is a sequence of code segments preceded by a segment dictionary. The size of each segment is noted in blocks, the 512-byte disk allocation quantum used on most PDP-11 operating systems. The sizes indicated are representative of a full Pascal system. Each code segment begins on a block boundary. The ordering (from low address to high address) is determined by the order that one encounters segment procedure bodies in passing through The Program.

* An overview of the relationship between figures 3.5.2 through 3.5.8 (to be discussed in the following pages) is given in figure 3.5.9 at the end of this section. It is helpful to study figure 3.5.9 at this point for a better understanding of the section.

The segment dictionary in the first block of a code file contains an entry for each code segment in the file. The entry includes the disk location and size (in bytes) for the segment. The disk location is given as relative to the beginning of the segment dictionary (which is also the beginning of the code file) and is given in number of blocks. This information is kept in the system communications area (also called SYSCOM) during the execution of the code file, and is used in the loading of non-present segments when they are needed. Figure 3.5.3 details the layout of the table and shows representative contents for the Pascal system code file.

            +-------------------------+
location    |           1             |
            | - - - - - - - - - - - - |  PASCAL SYSTEM
size        |         8500            |
            +-------------------------+
            |          18             |
            | - - - - - - - - - - - - |  USER PROGRAM
            |       variable          |
            +-------------------------+
            |          22             |
            | - - - - - - - - - - - - |  COMPILER
            |        20932            |
            +-------------------------+
            |          63             |
            | - - - - - - - - - - - - |  COMPINIT
            |        3480             |
            +-------------------------+
            |          70             |
            | - - - - - - - - - - - - |  DEBUGGER
            |        5880             |
            +-------------------------+
            |                         |
Figure 3.5.3. The Segment Dictionary

page206 A code segment contains the code for the body of each of its procedures, including the segment procedure itself. Figure 3.5.4 is a detailed diagram of the code segment of The Program (Pascalsystem). Each of a code segment's procedure are assigned a procedure number, starting at 1 for the segment procedure, and ranging as high as 255 (current temporary limit of 127). All references to a procedure are made via its number. Translation from procedure number to location in the code segment is accomplished with the procedure dictionary at the end of the segment. This dictionary is an array indexed by the procedure number. Each array element is a self-relative pointer to the code for the corresponding procedure. Since zero is not a valid procedure number, the zero'th entry of the dictionary is used to store the segment number (even byte) and number of procedures (odd byte). Observe that CLEARSCREEN is the first procedure for which code is generated and that it appears at the beginning of the segment. The outer block code is generated and appears last.

                                 high addresses
       odd                     even
     +-----------------------------------------+
     | Number of procedures | Segment Number   |
     |      in dictionary   |                  |
     +-----------------------------------------+
     | Procedure #1              PASCALSYSTEM  |--.
     | - - - - - - - - - - - - - - - - - - - - |  |
,----| Procedure #2               CLEARSCREEN  |  |
|    | - - - - - - - - - - - - - - - - - - - - |  |
|    |               rest of                   |  |
| ,--|         procedure  dictionary           |  |
| |  +-----------------------------------------+  |
| |  |                                         |  |
| |  |                                         |  |
| |  |      PASCALSYSTEM'S outer block code    |<-'
| |  |                                         |
| |  |                                         |
| |  |-----------------------------------------|
| |  |  other procedures of the Pascal system  |
| |  +-----------------------------------------+
| `->|   PROCEDURE #3                 code     |
|    +-----------------------------------------+
`--->|   PROCEDURE #2  (clearscreen)  code     |
     +-----------------------------------------+
                                  low addresses
Figure 3.5.4. A Code Segment

page207 A more detailed diagram of a single procedure code section is seen in figure 3.5.5. It consists of two parts: the procedure code itself in the lower portion of the section) and a table of attributes of the procedure. These attributes are:

LEX LEVEL
This odd byte is the depth of absolute lexical nesting for the procedure, (i.e. Lex Level (LL) Pascalsystem=-1, LL COMPILER or CLEARSCREEN=O, LL COMPINIT=l, etc.).

PROCEDURE NUMBER
This even byte refers to the number given in the procedure dictionary of the parent segment procedure. For example, the procnum of CLEARSCREEN is 2. (see figure 3.5.4).

ENTER IC
This is a self-relative pointer to the first instruction to be executed far this procedure.

EXIT IC
This is a self-relative pointer to the beginning of the block of procedure instructions which must be executed to terminate procedure properly.

PARAMETER SIZE
The parameter size is the number of bytes of parameters passed to a procedure from its caller.

DATA SEGMENT SIZE
The data size is the size of the data segment (See below) in bytes, excluding the mark stack and PARAM SIZE.

Between these attributes and the procedure code there may be an optional section of memory called the “jump table”. Its entries are addresses within the procedure code. JTAB is a term commonly applied to the six attributes just discussed and the jump table itself.

           high addresses
    odd                        even
   +-------------------------------+
   |   Lex Level   |  Procedure #  |<------+------------------.
   +-------------------------------+       |  PASCALSYSTEM's  |
   |           Enter IC            |--.    |  Procedure       |
   +-------------------------------+  |    |  Dictionary      |
,--|           Exit IC             |  |    |  Pointer         |
|  +-------------------------------+  |    `------------------'
|  |        Parameter Size         |  |
|  +-------------------------------+  |
|  |      Data  Segment  Size      |  |
|  +-------------------------------+  |
|  |- - - - - Jump Table - - - - - |  |
|  +-------------------------------+  |
`->|                               |  |
   |                               |  |
   |                               |  |
   |           CLEARSCREEN         |  |
   |              CODE             |  |
   |                               |<-'
   +-------------------------------+
            low addresses
Figure 3.5.5. Procedure Code Section (of Clearscreen)

page208

        high addresses

|  System Resident Segment  |
+---------------------------+
|                           |
|    System Data Segment    |
|                           |
|- - - - - - - - - - - - - -|
|      mark stack           |
+---------------------------+
|   Compiler Code Segment   |
+---------------------------+
|   Compile Data Segment    |
|- - - - - - - - - - - - - -|
|      mark stack           |
+---------------------------+
|   Compinit Code Segment   |
+---------------------------+
|   Compinit Data Segment   |
|- - - - - - - - - - - - - -|
|      mark stack           |
+---------------------------+
|  CLEARSCREEN Data Segment |
|- - - - - - - - - - - - - -|
|      mark stack           |
+---------------------------+
|      temporaries          |
|- - - - - - - - - - - - - -|
|                           |
|                           |
|                           |
|                           |
|                           |
|                           |
+---------------------------+
|                           |
|         H E A P           |
|                           |
+---------------------------+
|        Interpreter        |
|- - - - - - - - - - - - - -|
|        S Y S C O M        |<-- <segment dictionary>
|- - - - - - - - - - - - - -|
+---------------------------+

        low addresses

Figure 3.5.6. System Memory During Clearscreen Execution

page209 Figure 3.5.6 is a snapshot of system memory during the execution of a call to procedure CLEARSCREEN from line C in COMPINIT. The Pascal interpreter occupies the lowest area in memory. In it is the system communications area (also called SYSCOM) which is accessible both to assembly language routines in the interpreter and (as if it were part of the heap) to system routines coded in Pascal. It serves as an important communication link between these two levels of the system. The Pascal heap is next in the memory layout; it grows toward high memory. The single stack growing down from high memory is used for three types of items: 1) temporary storage needed during expression evaluation; 2) a data segment containing local variables and parameters for each procedure activation; and 3) a code segment for each active segment procedure. (See figure 3.5.6)

Consider the status of operations just before COMPINIT is called in line B. Conceptually, there are six pseudo-variables which point to locations in memory:

a STACK POINTER(SP):
which points to the current top of the stack,

a MARK STACK POINTER(MP):
which points to the "topmost" mark stack in the stack,(remember that the the stack grows down!),

a SEGMENT (SEG) variable:
which points to the base of the procedure dictionary for the currently active segment procedure. For example, just before COMPINIT is called, SEG points to the COMPILER segment's procedure dictionary,

an INTERPRETER PROGRAM COUNTER (IPC):
which contains the address of' the next instruction to be executed in the code segment of the current procedure,

page210 a JTAB pointer:
which points to the collection of procedure attributes and jump table entries in the body of the current procedure code section,

and a NEW POINTER (NP):
which points to the current top of the heap.

When segment procedure COMPINIT is called in line B, its code segment (including all compiler initialization procedures) is loaded on the stack. The COMPINIT data segment is built on top of the stack. Figure 3.5.7 is a diagram of the data segment for COMPINIT.

                    high addresses
            |----------------------------|
            |  Other COMPINIT variables  |
            |----------------------------|
            |          BOOL              |
            |----------------------------|
            |            I               |
            |----------------------------|
            |            J               |
            |----------------------------|
            |          MSSP              |
            |----------------------------|
            |          MSIPC             |
            |----------------------------|
            |          MSSEG             |
            |----------------------------|
            |          MSJTAB            |
            |----------------------------|
            |          MSDYN             |
            |----------------------------|
,------.    |          MSSTAT            |
|  MP  |--> |----------------------------|
`------'    |                            |
                     low addresses
Figure 3.5.7. A data segment,

In the upper portion of the data segment, space is allocated for variables local to the new procedure. For example, COMPINIT's data segment allocates space for integer variables I and J, as well as boolean BOOL.

page211 In the lower portion of the data segment is a “mark stack”. When a call to any procedure is made, the current values of the pseudo-variables, which characterize the operating environment of the calling procedure, are stored in the mark stack of the called procedure. This is so that the pseudo-variables may be restored to pre-call conditions when control is returned to the calling procedure.

For example, the call to COMPINIT causes conditions in COMPILER just before the call to be stored in COMPINIT’s mark stack in the following manner:

MarkStackDYNamic link (MSDYN) ← MP
"" IPC (MSIPC)← IC
"" SEGment Pointer (MSSEG)← SEG
"" Jump TABle (MSJTAB)← JTAB
"" Stack Pointer (SP)← SP

In addition a Static Link field becomes a pointer to the data segment of the lexical parent of the called procedure. In particular, it points to the Static Link field of parent's mark stack. After the building of the data segment new values for IC, SEG, SP, MP, and JTAB are established for the new procedure.

When the call to CLEARSCREEN is made on line C, another data segment is added to the stack and again the pseudo-variables are stored in the new mark stack, as well as the appropriate Static Link, and updated. Note that now the SEG no longer pints to the COMPINIT procedure dictionary, but to the Pascal system dictionary.

No code segment for CLEARSCREEN is added to the stack before the data segment since the code for CLEARSCREEN is already present in segment Pascalsystem. Its invocation causes only a data segment to be added to the stack. When CLEARSCREEN and INIT are completed, the COMPILER data segment will again be the top element on the stack.

Figure 3.5.8 is a detailed diagram of the stack during execution of an instruction in CLEARSCREEN, including appropriate pointers for static, dynamic, etc. links of CLEARSCREEN’s mark stack. Note where the pseudo-variables point in the stack. In particular, JTAB points inside CLEARSCREEN code section which is in the Pascalsystem code segment, IC points inside that CLEARSCREEN code, and SEG points to the base of the Pascalsystem code segment.

page212

                                        to PASCALSYSTEM resident cede segment
to PASCALSYSTEM resident data segment                 ^
    ^                                                 | SEG  | <-     in
    |          |--------  high addresses  ---------|  | JTAB | <- PASCALSYSTEM
    |          |-----------------------------------|  | IPC  | <- code segment
    |          |      COMPILER  code  segment      |
    |          |-----------------------------------|
    |          |      COMPILER  data  segment      |
    |          |- - - - - - - - - - - - - - - - - -|
    |          |            markstack              |
    |          |-----------------------------------|
    |   ,----->|       20        |        4        |      -------.
    |   |      |-----------------------------------|             |
    |   |      |      Pointer to COMPINIT code     |--.          |
    |   |      |-----------------------------------|  |      ---------
    |   |      |      Pointer to Procedure #2      |--+-.       code
    |   |      |-----------------------------------|  | |     segment
    | ,-+----->|-----------------------------------|  | |        of
    | | | ,--->|      COMPINT        code          |<-' |     COMPINIT
    | | | |    |-----------------------------------|    |    ---------
    | | | |    |      Procedures  of  COMPINIT     |<---'        |
    | | | |    |-----------------------------------|      -------+
    | | | |    |      COMPINIT    variables        |             |
    | | | |    |-----------------------------------|             |
    | | | |    |              MSSP                 |             |
    | | | |    |-----------------------------------|             |
    | | | |    |             MSIPC                 |         ---------
    | | | |    |-----------------------------------|            data
    | | | |    |             MSSEG                 |          segment
    | | | |    |-----------------------------------|            of
    | | | |    |             MSJTAG                |          COMPINIT
    | | | |    |-----------------------------------|         ---------
    | | | |    |             MSDYN                 |             |
    | | | |    |-----------------------------------|             |
    | | | |    |             MSSTAT                |<-.          |
    | | | |    |-----------------------------------|  |   -------'
    | | | |    |         evaluation  stack         |  |
    | | | | .->|-----------------------------------|  |   -------.
    | | | | |  |     CLEARSCREEN    variables      |  |          |
    | | | | |  |-----------------------------------|  |          |
    | | | | '--|              MSSP                 |  |          |
    | | | |    |-----------------------------------|  |          |
    | | | '----|              MSIPC                |  |      ---------
    | | |      |-----------------------------------|  |         data
    | | '------|              MSSEG                |  |       segment
    | |        |-----------------------------------|  |          of
    | '--------|              MSJTAB               |  |     CLEARSCREEN
    |          |-----------------------------------|  |      ---------
    |          |              MSDYN                |--'          |
    |          |-----------------------------------|             |
    '----------|              MSSTAT               |<----|MP|    |
               |-----------------------------------|      -------'
               |         evaluation  stack         |
               |               v                   | top of stack
               |- - - - - - - - - - - - - - - - - -|<----|SP|
               |                                   |
               |- - - - - - - - - - - - - - - - - -|<----|NP|
               |            H E A P                |
               |-----------------------------------| top of heap
Figure 3.5.8 The stack during CLEARSCREEN

page213

Figure 3.5.9 illustrates a top-down process by showing the relationships among diagrams 2 through 7.
  code file
  figure 3.5.2
|--------------|
| PASCALSYSTEM |--->| figure 3.5.4|
|--------------|    | CLEARSCREEN |
|              |    | code detail |--->| figure 3.5.5 |
|              |                       | proc. code   |
|--------------|                       | detail       |
|   segment    |
|  dictionary  |--->| figure 3.5.3              |
|--------------|    | segment dictionary detail |
  system memory
  figure 3.5.8
| code segment |--->| figure 3.5.4 |
|--------------|
|              |
|--------------|
|   COMPINIT   |
| data segment |--->| figure 3.5.7        |
|--------------|    | data segment detail |
Figure 3.5.9. Relationship of Document Figures

Figure 3.5.10. The Program
PROGRAM PASCALSYSTEM;
VAR
  SYSCOM: SYSCOMREC;
  CH: CHAR;
PROCEDURE CLEARSCREEN: FORWARD;

SEGMENT PROCEDURE USER PROGRAM;
  BEGIN
  END;
SEGMENT PROCEDURE COMPILER;
VAR
  SY, OP: INTEGER;
  SYMCURSOR: INTEGER;

    PROCEDURE INSYMBOL; FORWARD;

    SEGMENT PROCEDURE COMPINIT;
    VAR
      I, J: INTEGER;
      BOOL: BOOLEAN;
    BEGIN
      ...
      I := 1;
      CLEARSCREEN; (* ← line C *)
      INSYMBOL;
      ...
    END;

    PROCEDURE INSYMBOL;
    BEGIN ... END;

    PROCEDURE BLOCK;
    BEGIN ... END;
  BEGIN (* COMPILER *)
    ...
    COMPINIT; (* ← line B *)
    INSYMBOL;
  END; (* COMPILER *)

  SEGMENT PROCEDURE EDITOR;
  BEGIN ... END;

  PROCEDURE CLEARSCREEN
    BEGIN
      ...
      WRITE( .. );
      ...
    END;

BEGIN (*PASCALSYSTEM *)
  REPEAT
    READ(CH);
    CASE CH OF
      C: COMPILER; (* ← line A *)
page214
      E: EDITOR;
      U: USERPROGRAM;
      ...
    END( *CASE*)
  UNTIL CH = 'H'
END.
page215

page216
This page last regenerated Sun Jul 25 01:09:47 2010.