Version II.0, February 1979
UCSD uses an interpreter based implementation of Pascal. This means that the compiler emits code for a pseudo-machine which is emulated at run time by a program written in the machine language of the host. The compiler, program editor, stand-alone operating system, and various utilities are themselves written in Pascal and run on the same interpreter. Thus the entire system can be moved to a new host machine by rewriting the interpreter for the new host. This document describes the Pseudo-machine code files as they were in version I.3. Many of the segments mentioned are no longer resident in the code file used as an example. This does not affect the functionality of the description of the mechanisms put forth by this document.
Figure 3.5.10 (the last page of this document) is a skeleton version of a large Pascal program, here-in-after referred to as "The Program". This document is a top-down description of the realization of that program on the UCSD Pascal system. We will make occasional use of a helpful coincidence: The Program is the framework of the portion of the UCSD Pascal environment that's written in Pascal.
If The Program ware expanded to a complete Pascal system, it would consist of at least 6000 lines of Pascal and compile to more than 50,000 bytes of code — too big to fit all at once into the memory of a small machine (by our current definition of small). We have therefore extended Pascal so that a programmer can explicitly partition a program into segments; only some of which need be resident in main memory at a time. The syntax of this extension is shown in figure 3.5.1. (Any syntactic objects not defined explicitly there retain their standard interpretation as defined by Jensen & Wirth: Pascal User Manual and Report.) See Section 5.9 for revised syntax diagrams.
<program> ::= <program heading> <segment block> . <segment block> ::= <label declaration part> <constant declaration part> <type definition part> <variable declaration part> <segment declaration part> <segment body> <segment declaration part> ::= SEGMENT <procedure heading> <segment block>; \ SEGMENT <function heading> <segment block>; <segment body> ::= <procedure and function declaration part> <statement part>Figure 3.5.1. Segment Declaration Syntax.
Segment declaration syntax (figure 3.5.1) requires that all nested segments be declared before the ordinary procedures or functions of the segment body. Thus, a code segment can be completely generated before processing of code for the next segment starts. This is not a functional limitation, since forward declarations can be used to allow nested segments (COMPILER in The Program) to reference procedures in an outer segment body (CLEARSCREEN). Similarly, segment procedures and functions can themselves be declared forward.
Segmenting a program does not change its meaning in any fundamental sense. When a segment is called (e.g. the COMPILER segment in line A), the interpreter checks to see if it is present in memory due to a previous invocation. If it is, control is transferred and execution proceeds: if not, the appropriate code segment must be loaded from disk before the transfer of control takes place. When no more active invocations of the segment exist, its code is removed from memory. For instance, in The Program, the code for the COMPINIT segment is not present in memory either before or after the execution of line A. Clearly, a program should be segmented in such a way that (non-recursive) segment calls are infrequent; otherwise, much time could be lost in unproductive thrashing (particularly on a system with low performance disk).
high address +-----------------------------------------------+ ,--> | DEBUGGER 10 | not | +-----------------------------------------------+ |--> | FILER 17 | shown | +-----------------------------------------------+ | | EDITOR 12 | in | +-----------------------------------------------+ | | COMPINIT 7 | the | +-----------------------------------------------+ | | COMPILER 41 | program | +-----------------------------------------------+ `--> | INITIALIZE 3 | +-----------------------------------------------+ | USER PROGRAM 1 | +-----------------------------------------------+ | PASCAL SYSTEM 17 | +-----------------------------------------------+ | SEGMENT DICTIONARY 1 | +-----------------------------------------------+ low addressFigure 3.5.2. Pascal System Codefile.
The code file resulting from compilation of The Program is diagrammed in figure 3.5.2*. The file is a sequence of code segments preceded by a segment dictionary. The size of each segment is noted in blocks, the 512-byte disk allocation quantum used on most PDP-11 operating systems. The sizes indicated are representative of a full Pascal system. Each code segment begins on a block boundary. The ordering (from low address to high address) is determined by the order that one encounters segment procedure bodies in passing through The Program.
* An overview of the relationship between figures 3.5.2 through 3.5.8 (to be discussed in the following pages) is given in figure 3.5.9 at the end of this section. It is helpful to study figure 3.5.9 at this point for a better understanding of the section.
The segment dictionary in the first block of a code file contains an entry for each code segment in the file. The entry includes the disk location and size (in bytes) for the segment. The disk location is given as relative to the beginning of the segment dictionary (which is also the beginning of the code file) and is given in number of blocks. This information is kept in the system communications area (also called SYSCOM) during the execution of the code file, and is used in the loading of non-present segments when they are needed. Figure 3.5.3 details the layout of the table and shows representative contents for the Pascal system code file.
+-------------------------+ location | 1 | | - - - - - - - - - - - - | PASCAL SYSTEM size | 8500 | +-------------------------+ | 18 | | - - - - - - - - - - - - | USER PROGRAM | variable | +-------------------------+ | 22 | | - - - - - - - - - - - - | COMPILER | 20932 | +-------------------------+ | 63 | | - - - - - - - - - - - - | COMPINIT | 3480 | +-------------------------+ | 70 | | - - - - - - - - - - - - | DEBUGGER | 5880 | +-------------------------+ | |Figure 3.5.3. The Segment Dictionary
A code segment contains the code for the body of each of its procedures, including the segment procedure itself. Figure 3.5.4 is a detailed diagram of the code segment of The Program (Pascalsystem). Each of a code segment's procedure are assigned a procedure number, starting at 1 for the segment procedure, and ranging as high as 255 (current temporary limit of 127). All references to a procedure are made via its number. Translation from procedure number to location in the code segment is accomplished with the procedure dictionary at the end of the segment. This dictionary is an array indexed by the procedure number. Each array element is a self-relative pointer to the code for the corresponding procedure. Since zero is not a valid procedure number, the zero'th entry of the dictionary is used to store the segment number (even byte) and number of procedures (odd byte). Observe that CLEARSCREEN is the first procedure for which code is generated and that it appears at the beginning of the segment. The outer block code is generated and appears last.
high addresses odd even +-----------------------------------------+ | Number of procedures | Segment Number | | in dictionary | | +-----------------------------------------+ | Procedure #1 PASCALSYSTEM |--. | - - - - - - - - - - - - - - - - - - - - | | ,----| Procedure #2 CLEARSCREEN | | | | - - - - - - - - - - - - - - - - - - - - | | | | rest of | | | ,--| procedure dictionary | | | | +-----------------------------------------+ | | | | | | | | | | | | | | PASCALSYSTEM'S outer block code |<-' | | | | | | | | | | |-----------------------------------------| | | | other procedures of the Pascal system | | | +-----------------------------------------+ | `->| PROCEDURE #3 code | | +-----------------------------------------+ `--->| PROCEDURE #2 (clearscreen) code | +-----------------------------------------+ low addressesFigure 3.5.4. A Code Segment
A more detailed diagram of a single procedure code section is seen in figure 3.5.5. It consists of two parts: the procedure code itself in the lower portion of the section) and a table of attributes of the procedure. These attributes are:
Between these attributes and the procedure code there may be an optional section of memory called the “jump table”. Its entries are addresses within the procedure code. JTAB is a term commonly applied to the six attributes just discussed and the jump table itself.
high addresses odd even +-------------------------------+ | Lex Level | Procedure # |<------+------------------. +-------------------------------+ | PASCALSYSTEM's | | Enter IC |--. | Procedure | +-------------------------------+ | | Dictionary | ,--| Exit IC | | | Pointer | | +-------------------------------+ | `------------------' | | Parameter Size | | | +-------------------------------+ | | | Data Segment Size | | | +-------------------------------+ | | |- - - - - Jump Table - - - - - | | | +-------------------------------+ | `->| | | | | | | | | | CLEARSCREEN | | | CODE | | | |<-' +-------------------------------+ low addressesFigure 3.5.5. Procedure Code Section (of Clearscreen)
high addresses | System Resident Segment | +---------------------------+ | | | System Data Segment | | | |- - - - - - - - - - - - - -| | mark stack | +---------------------------+ | Compiler Code Segment | +---------------------------+ | Compile Data Segment | |- - - - - - - - - - - - - -| | mark stack | +---------------------------+ | Compinit Code Segment | +---------------------------+ | Compinit Data Segment | |- - - - - - - - - - - - - -| | mark stack | +---------------------------+ | CLEARSCREEN Data Segment | |- - - - - - - - - - - - - -| | mark stack | +---------------------------+ | temporaries | |- - - - - - - - - - - - - -| | | | | | | | | | | | | +---------------------------+ | | | H E A P | | | +---------------------------+ | Interpreter | |- - - - - - - - - - - - - -| | S Y S C O M |<-- <segment dictionary> |- - - - - - - - - - - - - -| +---------------------------+ low addressesFigure 3.5.6. System Memory During Clearscreen Execution
Figure 3.5.6 is a snapshot of system memory during the execution of a call to procedure CLEARSCREEN from line C in COMPINIT. The Pascal interpreter occupies the lowest area in memory. In it is the system communications area (also called SYSCOM) which is accessible both to assembly language routines in the interpreter and (as if it were part of the heap) to system routines coded in Pascal. It serves as an important communication link between these two levels of the system. The Pascal heap is next in the memory layout; it grows toward high memory. The single stack growing down from high memory is used for three types of items: 1) temporary storage needed during expression evaluation; 2) a data segment containing local variables and parameters for each procedure activation; and 3) a code segment for each active segment procedure. (See figure 3.5.6)
Consider the status of operations just before COMPINIT is called in line B. Conceptually, there are six pseudo-variables which point to locations in memory:
When segment procedure COMPINIT is called in line B, its code segment (including all compiler initialization procedures) is loaded on the stack. The COMPINIT data segment is built on top of the stack. Figure 3.5.7 is a diagram of the data segment for COMPINIT.
high addresses |----------------------------| | Other COMPINIT variables | |----------------------------| | BOOL | |----------------------------| | I | |----------------------------| | J | |----------------------------| | MSSP | |----------------------------| | MSIPC | |----------------------------| | MSSEG | |----------------------------| | MSJTAB | |----------------------------| | MSDYN | |----------------------------| ,------. | MSSTAT | | MP |--> |----------------------------| `------' | | low addressesFigure 3.5.7. A data segment,
In the upper portion of the data segment, space is allocated for variables local to the new procedure. For example, COMPINIT's data segment allocates space for integer variables I and J, as well as boolean BOOL.
In the lower portion of the data segment is a “mark stack”. When a call to any procedure is made, the current values of the pseudo-variables, which characterize the operating environment of the calling procedure, are stored in the mark stack of the called procedure. This is so that the pseudo-variables may be restored to pre-call conditions when control is returned to the calling procedure.
For example, the call to COMPINIT causes conditions in COMPILER just before the call to be stored in COMPINIT’s mark stack in the following manner:
Mark | Stack | DYNamic link (MSDYN) | ← MP |
" | " | IPC (MSIPC) | ← IC |
" | " | SEGment Pointer (MSSEG) | ← SEG |
" | " | Jump TABle (MSJTAB) | ← JTAB |
" | " | Stack Pointer (SP) | ← SP |
In addition a Static Link field becomes a pointer to the data segment of the lexical parent of the called procedure. In particular, it points to the Static Link field of parent's mark stack. After the building of the data segment new values for IC, SEG, SP, MP, and JTAB are established for the new procedure.
When the call to CLEARSCREEN is made on line C, another data segment is added to the stack and again the pseudo-variables are stored in the new mark stack, as well as the appropriate Static Link, and updated. Note that now the SEG no longer pints to the COMPINIT procedure dictionary, but to the Pascal system dictionary.
No code segment for CLEARSCREEN is added to the stack before the data segment since the code for CLEARSCREEN is already present in segment Pascalsystem. Its invocation causes only a data segment to be added to the stack. When CLEARSCREEN and INIT are completed, the COMPILER data segment will again be the top element on the stack.
Figure 3.5.8 is a detailed diagram of the stack during execution of an instruction in CLEARSCREEN, including appropriate pointers for static, dynamic, etc. links of CLEARSCREEN’s mark stack. Note where the pseudo-variables point in the stack. In particular, JTAB points inside CLEARSCREEN code section which is in the Pascalsystem code segment, IC points inside that CLEARSCREEN code, and SEG points to the base of the Pascalsystem code segment.
to PASCALSYSTEM resident cede segment to PASCALSYSTEM resident data segment ^ ^ | SEG | <- in | |-------- high addresses ---------| | JTAB | <- PASCALSYSTEM | |-----------------------------------| | IPC | <- code segment | | COMPILER code segment | | |-----------------------------------| | | COMPILER data segment | | |- - - - - - - - - - - - - - - - - -| | | markstack | | |-----------------------------------| | ,----->| 20 | 4 | -------. | | |-----------------------------------| | | | | Pointer to COMPINIT code |--. | | | |-----------------------------------| | --------- | | | Pointer to Procedure #2 |--+-. code | | |-----------------------------------| | | segment | ,-+----->|-----------------------------------| | | of | | | ,--->| COMPINT code |<-' | COMPINIT | | | | |-----------------------------------| | --------- | | | | | Procedures of COMPINIT |<---' | | | | | |-----------------------------------| -------+ | | | | | COMPINIT variables | | | | | | |-----------------------------------| | | | | | | MSSP | | | | | | |-----------------------------------| | | | | | | MSIPC | --------- | | | | |-----------------------------------| data | | | | | MSSEG | segment | | | | |-----------------------------------| of | | | | | MSJTAG | COMPINIT | | | | |-----------------------------------| --------- | | | | | MSDYN | | | | | | |-----------------------------------| | | | | | | MSSTAT |<-. | | | | | |-----------------------------------| | -------' | | | | | evaluation stack | | | | | | .->|-----------------------------------| | -------. | | | | | | CLEARSCREEN variables | | | | | | | | |-----------------------------------| | | | | | | '--| MSSP | | | | | | | |-----------------------------------| | | | | | '----| MSIPC | | --------- | | | |-----------------------------------| | data | | '------| MSSEG | | segment | | |-----------------------------------| | of | '--------| MSJTAB | | CLEARSCREEN | |-----------------------------------| | --------- | | MSDYN |--' | | |-----------------------------------| | '----------| MSSTAT |<----|MP| | |-----------------------------------| -------' | evaluation stack | | v | top of stack |- - - - - - - - - - - - - - - - - -|<----|SP| | | |- - - - - - - - - - - - - - - - - -|<----|NP| | H E A P | |-----------------------------------| top of heap |
Figure 3.5.8 The stack during CLEARSCREEN |
Figure 3.5.9 illustrates a top-down process by showing the relationships among diagrams 2 through 7. |
code file figure 3.5.2 |--------------| | PASCALSYSTEM |--->| figure 3.5.4| |--------------| | CLEARSCREEN | | | | code detail |--->| figure 3.5.5 | | | | proc. code | |--------------| | detail | | segment | | dictionary |--->| figure 3.5.3 | |--------------| | segment dictionary detail | |
system memory figure 3.5.8 | code segment |--->| figure 3.5.4 | |--------------| | | |--------------| | COMPINIT | | data segment |--->| figure 3.5.7 | |--------------| | data segment detail | |
Figure 3.5.9. Relationship of Document Figures |