Version II.0, February 1979
Users of UCSD Pascal occasionally need to write and execute small assembly routines in the language of the host machine. These routines are used within a Pascal program to provide low-level or time-critical facilities. The UCSD Adaptable Assembler (in conjunction with the UCSD Linker) has been designed to meet those needs. The UCSD Pascal Project will be maintaining all of its Pascal interpreters using this assembler in the near future, which will make users of the UCSD Pascal system independent of any manufacturer's system software.
This assembler was modeled after The Last Assembler (TLA), developed at the University of Waterloo. The basic concept behind both TLA and the UCSD Adaptable Assembler is a central machine-independent core that is common to all versions of the assembler. This core is augmented with machine-specific code to handle the peculiarities of each individual machine.
This document is intended for a reader who is already fluent in at least one assembly language.
1.9.1 Usage
Before executing the assembler for a specific machine, the corresponding opcodes file (Z80.OPCODES or 11.OPCODES) must be present on the system disk. The errors file (Z80.ERRORS or 11.ERRORS) contains the error messages used for error flagging during the assembly. This file is optional; if used, it must also be on the system disk.
To use the UCSD assembler, type A(ssem from the Command line. This will execute SYSTEM.ASSMBLER. (The user should arrange that the right version of the assembler (PDP-11 or Z80) have that title.)
The program displays the version of the assembler being executed and assumes that the current workfile is the one to be assembled. If there is no current workfile, the program asks which file is to be assembled. It then prompts for a listing file:
Output file for the assembled listing (<CR> for none):
As usual, for console or printer output the word CONSOLE or PRINTER must be followed by a colon, e.g. CONSOLE:. If the colon is omitted, the output is sent to a file of the name given. At this point, the program reports whether or not the output device (if any) is on line. The assembled code is written to a file called *SYSTEM.WRK.CODE, which cannot be executed by itself but must be linked in with a host file.
The program then starts assembling the workfile, flagging errors as they are found. If an error other than an I/O error is found, a general message indicates the nature of the error and gives the option to continue or exit. The error message is taken from the ERRORS file if possible. If that is not possible, because of space limitations or the absence of the errors file, the error message number is given instead. If an I/O error is encountered that is not due to data typed in by the user, the assembly is aborted; otherwise the user is prompted to try again. (See the complete list of Assembler syntax errors and machine specific errors in Table 6.)
The console displays, on the left hand side of the screen, one dot for each line of code assembled and a line counter every 50 lines. When an include file is started, the console displays:
    .INCLUDE <file.id>
indicating which file has been included.
At the end of the assembly the assembler program indicates that it is finished and tells the user how many errors were found. In addition an alphabetic symbol table is generated.
The symbol table has three columns: the symbol identifier, the symbol type, and the location at which the symbol is defined or the value it has. Actual values are given for symbols representing absolutes, and definition locations are given for symbols representing labels. The location is given as a hi-byte first number and corresponds to the index numbers on the left hand side of the listing. Only symbols that have definition locations or absolute values have numbers in the third column; other types have dashes.
Following is an example of an assembled listing with symbol table.
    PAGE - 1  PRIMARYZ  FILE: #5:PRIMARY.Z

    0000|                      .PROC   PRIMARYZ
    Memory after initialization: 6068
    0000|
    0000|             FLOPPY   .EQU    0BFDH         ;Rom-based floppy driver.
    0000|             SECMEM   .EQU    9000H         ;First location in memory
    0000|             SECENT   .EQU    9000H         ;Entry point of bootstrap
    0000|             SECDSK   .EQU    08H + 1700H   ;Sector start of 2nd bootstrap
    0000|             B1DSK    .EQU    10H + 1700H   ;Sector start of BIOS part 1
    0000|             B2DSK    .EQU    18H + 1700H   ;Sector start of BIOS part 2
    0000|
    0000|                      .ORG    1000H         ;Primary boot for ZILOG DOS
    1000|
    1000| FD 21 ****  PRIMARY  LD      IY,SECREAD    ;Get block for second bootstrap
    1004| CD FD0B              CALL    FLOPPY
    1007| FD 21 ****           LD      IY,B1READ     ;Get block for part 1 of BIOS
    100B| CD FD0B              CALL    FLOPPY
    100E| FD 21 ****           LD      IY,B2READ     ;Get block for part 2 of BIOS
    1012| CD FD0B              CALL    FLOPPY
    1015| C3 0090              JP      SECENT        ;Jump into second bootstrap
    1018|
    1002* 1810
    1018|             SECREAD
    1018| 00                   .BYTE   $-$           ;Unused
    1019| 0A                   .BYTE   0AH           ;Read command
    101A| 0090                 .WORD   SECMEM        ;Memory loc. for second boot
    101C| 0002                 .WORD   200H          ;Number of bytes in boot
    101E| 0000                 .WORD   $-$           ;Completion return address
    1020| 0010                 .WORD   PRIMARY       ;Error in return address
    1022| 00                   .BYTE   $-$           ;Completion result code
    1023| 0817                 .WORD   SECDSK        ;Disk block of second boot
    1025|
    1009* 2510
    1025|             B1READ
    1025| 00                   .BYTE   $-$           ;Unused
    1026| 0A                   .BYTE   0AH           ;Read command
    1027| 0093                 .WORD   SECMEM+300H   ;Memory location of BIOS part 1
    1029| 0002                 .WORD   200H          ;Number of bytes in BIOS part 1
    102B| 0000                 .WORD   $-$           ;Completion return address
    102D| 0010                 .WORD   PRIMARY       ;Error return address
    102F| 00                   .BYTE   $-$           ;Completion result code
    1030| 1017                 .WORD   B1DSK         ;Disk block of BIOS part 1
    1032|
    1010* 3210
    1032|             B2READ
    1032| 00                   .BYTE   $-$           ;Unused
    1033| 0A                   .BYTE   0AH           ;Read command
    1034| 0095                 .WORD   SECMEM+500H   ;Memory location of BIOS part 2
    1036| 0002                 .WORD   200H          ;Number of bytes in BIOS part 2
    1038| 0000                 .WORD   $-$           ;Completion return address
    103A| 0010                 .WORD   PRIMARY       ;Error return address
    103C| 00                   .BYTE   $-$           ;Completion result code
    103D| 1817                 .WORD   B2DSK         ;Disk block of BIOS part 2
    103F|
    103F|                      .END
    PAGE - 2  PRIMARYZ  FILE: #5:PRIMARY.Z  SYMBOL TABLE DUMP

    AB - Absolute    LB - Label      UD - Undefined   MC - Macro
    RF - Ref         DF - Def        PR - Proc        FC - Func
    PB - Public      PV - Private    CS - Constant

    B1DSK    AB 1710| B1READ   LB 1025| B2DSK    AB 1718| B2READ   LB 1032
    FLOPPY   AB 0BFD| PRIMARY  LB 1000| PRIMARYZ PR ----| SECDSK   AB 1708
    SECENT   AB 9000| SECMEM   AB 9000| SECREAD  LB 1018
Notes:
The location values in the symbol table dump refer to the locations in the listing.
The ****s in the listing call attention to the use of a label not yet defined.
If a star (*) appears after the location number at the left of the listing, it indicates that a forward reference occurring earlier in the assembly has been resolved. The number to the left of the ‘*’ is the location where the reference occurred while the number to the right is the new contents of that location.
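The same 16-bit value is printed in two byte orders in the example above: the symbol table gives SECREAD as 1018 (hi-byte first), while the resolved forward reference at location 1002 shows 1810. The following Python sketch models that reading, assuming the code column of the Z80 listing shows a word in storage order, low byte first. The function names `listing_word` and `symbol_value` are hypothetical and are not part of the assembler.

```python
def listing_word(value: int) -> str:
    """Format a 16-bit value as the code column appears to show it (low byte first)."""
    value &= 0xFFFF
    low, high = value & 0xFF, value >> 8
    return f"{low:02X}{high:02X}"


def symbol_value(value: int) -> str:
    """Format a 16-bit value as the symbol table prints it (hi-byte first)."""
    return f"{value & 0xFFFF:04X}"


if __name__ == "__main__":
    secread = 0x1018  # location of the label SECREAD in the example listing
    print(symbol_value(secread), listing_word(secread))
```

Read this way, the three starred lines in the listing (1810, 2510, 3210) are the locations of SECREAD, B1READ and B2READ as they are stored in memory.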
1.9.2 High-Level Syntax
All objects declared before the first .PROC or .FUNC are available for use throughout the assembly. No code may be generated before the first .PROC or .FUNC. The symbol table is reduced at the beginning of each .PROC or .FUNC to the point where it was at the start of the first .PROC or .FUNC.
Only labels may begin in the first column and may optionally be followed by a colon. Local labels must have ‘$’ in the first column and may be up to 8 digits long. If the statement has no label, the first column must contain a space.
All assemblies must end with a .END. However, an individual .PROC or .FUNC need not end with one, because each is ended by the occurrence of the next .PROC or .FUNC. Only the last one needs a .END.
A general railroad diagram for all assembly files looks like:
The non-code generating operations are:
.EQU, .DEF, .REF, .PAGE, .TITLE, .LIST, .MACRO, .IF
The code generating operations are any other pseudo-ops and all assembly code for the program.
Labels may be equated to an expression containing labels and/or absolutes. A label must be defined before it is used unless it is simply equated to another label. Local labels may not occur on the left hand side of an equate (.EQU).
Local labels are mainly used to jump around within a small segment of code without using up the storage needed by regular labels. The local label stack may hold up to 21 labels. The stack is cut back every time a regular label is encountered, which renders the local labels on it invalid. An example of the use of local labels is shown below; the jump to label $04 is illegal.
    $03     STA     4           ; legal use of local label
            .
            .
            JP      NZ, $03
            .
            .
            JP      NZ, $04     ; illegal use of local label
    REALLAB .EQU    $
    $04     .EQU    $
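The rule that a regular label invalidates local labels can be modeled in a few lines. The Python sketch below is only an illustration of the rule as described above, not the assembler's implementation: the event format and the name `illegal_local_refs` are assumptions, and applying the 21-label limit to definitions alone is a guess.

```python
MAX_LOCAL_LABELS = 21  # capacity of the local label stack, per the text


def illegal_local_refs(events):
    """Return the local-label references that can never be resolved.

    `events` is a list of (kind, name) pairs in source order, where kind is
    "def" (local label defined), "ref" (local label used) or "label"
    (regular label encountered).
    """
    defined, pending, illegal = set(), [], []
    for kind, name in events:
        if kind == "def":
            if name not in defined and len(defined) >= MAX_LOCAL_LABELS:
                raise OverflowError("local label stack is full")
            defined.add(name)
            # Forward references to this label are now resolved.
            pending = [p for p in pending if p != name]
        elif kind == "ref":
            if name not in defined:
                pending.append(name)  # forward reference, not yet resolved
        elif kind == "label":
            illegal.extend(pending)   # unresolved references die with the stack
            defined.clear()
            pending = []
        else:
            raise ValueError(f"unknown event kind: {kind}")
    illegal.extend(pending)           # still unresolved at the end
    return illegal
```

Fed the sequence from the example (define $03, use $03, use $04, regular label REALLAB, define $04), the sketch reports only $04, matching the comment in the listing.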
Identifiers are character strings starting with an alpha character. Other characters must be alphanumeric or the ASCII underline ‘_’. Only the first 8 characters are meaningful to the assembler even though more may be entered.
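Because only the first 8 characters are significant, two long identifiers that share an 8-character prefix name the same symbol. The Python sketch below illustrates the rule; the names `is_identifier` and `symbol_key` are hypothetical, and the sketch assumes that "alpha" means the ASCII letters and does not fold case, which the text does not specify.

```python
import re

SIGNIFICANT = 8  # only the first 8 characters are meaningful to the assembler

# A letter followed by letters, digits or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*\Z")


def is_identifier(text: str) -> bool:
    """True if `text` has the form of an identifier as described above."""
    return _IDENTIFIER.match(text) is not None


def symbol_key(text: str) -> str:
    """Return the part of an identifier that distinguishes it from others."""
    if not is_identifier(text):
        raise ValueError(f"not a legal identifier: {text!r}")
    return text[:SIGNIFICANT]
```

For instance, `symbol_key("PRIMARYZ_BOOT")` and `symbol_key("PRIMARYZ_LOAD")` both give `PRIMARYZ`, so the two names would collide.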
The following operators can be used in expressions processed by this assembler.
The default radix is Hex for the Z80 version and octal for the PDP-11.
For unary operations:
    ‘+’   plus
    ‘-’   minus
    ‘~’   ones complement
For binary operations:
    ‘+’   plus
    ‘-’   minus
    ‘~’   exclusive or
    ‘*’   multiplication
    ‘/’   truncating division (DIV)
    ‘%’   remainder division (MOD)
    ‘|’   bit wise OR
    ‘&’   bit wise AND
    ‘=’   equal (valid only in .IF)
    ‘<>’  not equal (valid only in .IF)
All constants must start with a digit (0-9). All operations are applied to whole words.
Assembler directives (also referred to as “pseudo-ops”) allow the programmer to instruct the assembler to do various functions other than provide direct executable code. The following directives are common to all UCSD versions but may differ from manufacturer's standard syntax.
In the following pseudo-op descriptions square brackets, [], are used to denote optional elements. If an element type is not listed it cannot be used in that situation. Angle brackets, <>, denote meta symbols.
The following terms represent general concepts in the explanation of each directive:
    value = any numerical value, label, constant, or expression.
    valuelist = a list of one or more values separated by commas.
    idlist = a list of one or more identifiers separated by commas.
    expression = any legal expression as defined in Section 1.9.3.
    identifier:integer list = a list of one or more identifier-integer pairs separated by commas. The colon-integer is optional in each pair and the default is 1.
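As an illustration of the identifier:integer list form, the Python sketch below splits such a list into pairs and applies the default of 1. The function name `parse_id_int_list` is hypothetical, and reading the integer in a caller-supplied radix (16 by default here) is an assumption; the text does not say how the assembler reads this integer.

```python
def parse_id_int_list(text: str, radix: int = 16):
    """Split 'ID[:n], ID[:n], ...' into (identifier, integer) pairs.

    The integer defaults to 1 when the colon-integer part is omitted.
    The radix is an assumption of this sketch.
    """
    pairs = []
    for item in text.split(","):
        name, colon, count = item.strip().partition(":")
        name = name.strip()
        if not name:
            raise ValueError("missing identifier in list")
        pairs.append((name, int(count, radix) if colon else 1))
    return pairs
```

With this reading, the list `PRT, LST2:9` yields PRT with the default of 1 and LST2 with 9.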
Small examples are included after each pseudo-op definition to supply the user with a reference to the specific syntax and form of that directive. The larger example, included in section 3.3.2, is used to show the combined use and detailed examples of directive operations.
References to a .PROC or .FUNC are made in the Pascal host by use of EXTERNAL declarations. At the time of this declaration the actual parameter names must be given. For example, if the Pascal declaration is:
PROCEDURE FARKLE(X, Y: REAL); EXTERNAL;
the associated declaration for the .PROC would be
.PROC FARKLE
A .PROC, .FUNC, or any assembly routine should be inserted into the *SYSTEM.LIBRARY (execute LIBRARIAN) so that it can be referenced by the *SYSTEM.LINKER and linked in at run time. An alternate method would be to execute the LINKER and tell it what files to link in. Either method works. However, if the Pascal host is updated and the assembly routines aren't in the *SYSTEM.LIBRARY, the linker will have to be executed after each update. Therefore, we suggest that the routines be inserted into the *SYSTEM.LIBRARY to avoid this repetition. If the linker is called automatically using the Run command, it will search the *SYSTEM.LIBRARY for the appropriate definition of the assembly routine and link the two together.
.PROC
Identifies a procedure that returns no value. A .PROC is ended by the occurrence of a new .PROC, .FUNC, or .END.
Form:
    .PROC <identifier> [ , expression ]
[ expression ] indicates the number of words of parameters expected by this routine. The default is 0.
Example:
    .PROC DLDRIVE, 2
.FUNC
Identifies a function that returns a value. Two words of space to be used for the function value will be placed on the stack after any parameters. A .FUNC is ended the same way as the .PROC.
Form:
    .FUNC <identifier> [ , expression ]
[ expression ] indicates the number of words of parameters expected by this routine. The default is 0.
Example:
    .FUNC RANDOM, 4
.END
Used to denote the physical end of an assembly.
1.9.4.2. Label Definitions and Space Allocation Directives
.ASCII
Converts character values to ASCII equivalent byte constants and places the equivalents into the code stream.
Form:
    [ label ] .ASCII "<character string>"
where <character string> is any string of printable ASCII characters, including spaces. The length of the string must be less than 80 characters. The double quotes are used as delimiters for the characters to be converted. If a double quote is desired in the string, it must be specifically inserted using a .BYTE pseudo-op.
Example:
    .ASCII "HELLO"
For the insertion of AB"CD the code must be constructed as:
    .ASCII "AB"
    .BYTE 22        ; 42 octal
    .ASCII "CD"
Note: The 22 is the ASCII code for a double quote in hex. The representation actually used will depend on the default radix of the particular machine in use.
.BYTE
Allocates a byte of space in the code stream for each value listed. Assigns the associated label, if any, to the address at which the byte was stored. Each value must be between -128 and +255. If a value is outside this range an error will be flagged.
Form:
    [ label ] .BYTE [ valuelist ]
The default for no stated value is 0.
Example:
    TEMP .BYTE 4
The associated output would be:
    04
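The range check and the default can be sketched as follows. This is a minimal Python model, not the assembler's code: the names `encode_byte` and `byte_directive` are hypothetical, and storing negative values in two's complement form is an assumption suggested by the -128 lower bound.

```python
def encode_byte(value: int) -> int:
    """Return the stored byte for one .BYTE value, or raise if out of range."""
    if not -128 <= value <= 255:
        raise ValueError(f".BYTE value {value} is outside -128..+255")
    return value & 0xFF  # assumption: negative values use two's complement


def byte_directive(values=()):
    """Return the bytes emitted by a .BYTE with the given value list."""
    values = list(values) or [0]  # no stated value means a single 0 byte
    return [encode_byte(v) for v in values]
```

Here `byte_directive([4])` gives the single byte 04 from the example, and an empty value list gives one byte of 0.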
.BLOCK
Allocates a block of space in the code stream. The amount allocated is in bytes. Associates the label (if present) with the starting address of the block allocated.
Form:
    [ label ] .BLOCK <length> [ , value ]
<length> is the number of bytes to allocate; each byte holds the <value> specified. The default for no stated value is 0.
Example:
    TEMP .BLOCK 4, 6
The associated output would be:
    06
    06      (* four bytes with the hex value 06 *)
    06
    06
.WORD
Allocates a word of space in the code stream for each value in the valuelist. Associates the declaration label with the word space allocation.
Form:
    [ label ] .WORD <valuelist>
Example:
    TEMP .WORD 0, 2, 4, ...
The associated hex output would be:
    0000
    0002
    0004    (* words with these values in them *)
Example:
    L1  .WORD L2
        .
        .
    L2  .EQU $      ; $ represents the LC on the Z80
        .WORD 5
If LC was 50 at the .EQU, the associated hex output would be:
    0050    (* assignment due to the L2 value *)
    .
    .
    0005    (* assignment due to the .WORD 5 *)
.EQU
Assigns a value to a label. Labels may be equated to an expression containing labels and/or absolutes. A label must be defined before it is used unless it is simply equated to another label. A local label may not appear on the left hand side of an equate (.EQU).
Form:
    label .EQU <value>
Example:
    BASE .EQU R6
.ORG
Sets the current location counter (LC) to the value of the .ORG. It would normally be used in a stand-alone program. For example, there is one .ORG in the 8080/Z80 interpreter. The current implementation allows one to .ORG only in the forward direction.
Form:
    [ label ] .ORG <expression>
Example:
    .ORG 0
1.9.4.3. Macro Facility Directives
A macro is a named section of text that can be defined once and repeated in other places simply by using its name. The text of the macro may be parameterized, so that each invocation results in a different version of the macro contents. The parameters to the macro are separated by commas.
At the invocation point, the macro name is followed by a list of parameters which are delimited by commas or spaces (except for the last one, which is terminated by the end of the line or the comment indicator, ‘;’). At invocation time, the text of the macro is inserted (conceptually speaking) by the assembler after being modified by parameter substitution. Whenever %n (where n is a single decimal digit greater than zero) occurs in the macro definition, the text of the nth parameter is substituted. Leading and trailing blanks are stripped from the parameter before the substitution. If the macro definition refers to a parameter not provided in a particular invocation, a null string is substituted.
A macro definition may not contain another macro definition. A definition can certainly, however, include macro invocations. This “nesting” of macro invocations is limited to five levels deep.
The expanded macro is always included in the listing file (if listing is enabled at the point of invocation). Macro expansion text is flagged in the listing by a ‘#’ just left of each expanded line. Comments occurring in the macro definition are not repeated in the expansion.
.MACRO
Indicates the start of a macro and gives it an identifier.
.ENDM
Indicates the end point of a .MACRO.
Form:
    .MACRO <identifier>
    (macro body)
    .ENDM
Example:
    .MACRO HELP
    STA %1          ; < comment >
    LDA %2          ; < comment >
    .ENDM
The listing where the macro call is made may look like:
    HELP FIRST, SECOND
    #   STA FIRST
    #   LDA SECOND
The statement HELP calls the macro and sends it two parameters, FIRST and SECOND. These parameters are in turn referenced inside the macro using the identifiers %1 for the variable FIRST, and %2 for the variable SECOND.
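The substitution rules for macro parameters can be modeled with a short Python sketch. It is a simplified illustration rather than the assembler's algorithm: the name `expand_macro` is hypothetical, runs of commas and spaces are treated as a single delimiter, nested invocations are not handled, and a ‘;’ is always taken as the start of a comment, even inside a quoted string.

```python
import re


def expand_macro(body_lines, args_text):
    """Expand one macro invocation.

    `body_lines` are the lines between .MACRO and .ENDM; `args_text` is the
    text that follows the macro name at the point of invocation.
    """
    # The last parameter ends at the comment indicator or the end of line.
    args_text = args_text.split(";", 1)[0]
    params = [p for p in re.split(r"[,\s]+", args_text.strip()) if p]

    def substitute(match):
        index = int(match.group(1))
        # A parameter that was not supplied becomes a null string.
        return params[index - 1] if index <= len(params) else ""

    expanded = []
    for line in body_lines:
        code = line.split(";", 1)[0].rstrip()  # comments are not repeated
        expanded.append(re.sub(r"%([1-9])", substitute, code))
    return expanded
```

Applied to the body of HELP with the invocation text `FIRST, SECOND`, the sketch produces `STA FIRST` and `LDA SECOND`, matching the listing above.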
1.9.4.4. Conditional Assembly Directives
Conditionals are used to selectively exclude or include sections of code at assembly time. When the assembler encounters an .IF directive, it evaluates the associated expression. In the simplest case, if the expression is false, the assembler simply discards the text until a .ENDC is reached. If there is an .ELSE directive between the .IF and .ENDC directives, the text before the .ELSE is selected if the expression is true, and the text after the .ELSE if the expression is false. The unassembled part of the conditional will not be included in any listing. Conditionals may be nested.
The conditional expression takes one of two forms. The first is the normal arithmetic / logical expression used elsewhere in the assembler. This type of expression is considered false if it evaluates to zero; true otherwise. The second form of conditional expression is comparison for equality or inequality (indicated by ‘=’ and ‘<>’, respectively). One may compare strings, characters, or arithmetic / logical expressions.
.IF
Identifies the beginning of the conditional.
.ENDC
Identifies the end of a conditional .IF.
.ELSE
Identifies the alternative to the .IF. If the conditional expression is false (evaluates to 0), the text following the .ELSE is assembled.
Form:
    [ label ] .IF <expression>
              stuff
              .ELSE         (* only if there is an else *)
              other stuff
              .ENDC
where the expression is the conditional expression to be met.
Example:
    .IF LABEL1 - LABEL2     ; arithmetic expression
    ; This text assembled only if subtraction
    ; result is not zero.
    .IF "%1" = "STUFF"      ; comparison expression
    ; This text assembled if subtraction above
    ; was true and if text of first parameter
    ; (assume we are in macro) is equal to "STUFF"
    .ENDC                   ; terminate nested condition.
    .ELSE
    ; This text assembled if subtraction result
    ; was zero.
    .ENDC                   ; terminate outer level conditional
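The selection of text by nested conditionals can be sketched in Python as a small filter. This is a model of the behaviour described in this section, not the assembler's code: the name `select_lines` is hypothetical, expression evaluation is left to a caller-supplied function, and the sketch assumes the directive is the first token on its line (so a label in front of .IF is not handled).

```python
def select_lines(lines, evaluate):
    """Return the lines that would be assembled.

    `evaluate` maps the text after .IF to a number; zero means false and
    any other value means true.
    """
    selected = []
    stack = []     # one (enclosing_active, condition) pair per open .IF
    active = True  # whether the current text is being assembled
    for line in lines:
        words = line.split(None, 1)
        op = words[0] if words else ""
        if op == ".IF":
            expr = words[1] if len(words) > 1 else ""
            # A condition inside discarded text is never evaluated.
            condition = active and evaluate(expr) != 0
            stack.append((active, condition))
            active = condition
        elif op == ".ELSE":
            if not stack:
                raise ValueError(".ELSE without .IF")
            enclosing, condition = stack[-1]
            active = enclosing and not condition
        elif op == ".ENDC":
            if not stack:
                raise ValueError(".ENDC without .IF")
            active, _ = stack.pop()
        elif active:
            selected.append(line)
    if stack:
        raise ValueError(".IF without matching .ENDC")
    return selected
```

Keeping the enclosing state on the stack is what makes nesting work: text inside an inner conditional is selected only if every enclosing conditional is also selecting text.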
1.9.4.5. Pascal Host Communication Directives
The directives .CONST, .PUBLIC, and .PRIVATE allow the sharing of information and data space between an assembly routine and a Pascal host. These external references must eventually be resolved by the Linker. Refer to Section 1.8, Linker, for further details.
.CONST
Allows access of globally declared constants in the Pascal host by the assembly routine. .CONST can only be used in a program to replace 16 bit relocatable objects.
Form:
    .CONST <id-list>
Example:
    (* see example after .PRIVATE *)
.PUBLIC
Allows a variable declared in the global data segment of the Pascal host to be used by both an assembly language routine and the host program.
Form:
    .PUBLIC <id-list>
Example:
    (* see example after .PRIVATE *)
.PRIVATE
Allows variables of the assembly routine to be stored in the global data segment and yet be inaccessible to the Pascal host. These variables retain their values for the entire execution of the program.
Form:
    .PRIVATE <identifier:integer list>
The integer is used to communicate the number of words to be allocated to the identifier.
Example:
    (* for .CONST, .PRIVATE and .PUBLIC *)
Given the following Pascal host program:
    PROGRAM EXAMPLE;
    CONST SETSIZE = 50; LENGTH = 50;
    VAR I, J, F, HOLD, COUNTER, LDC: INTEGER;
        LST1: ARRAY[0..9] OF CHAR;
    BEGIN
      blah blah
    END.
and the following section of an assembly routine:
    .CONST LENGTH
    .PRIVATE PRT, LST2:9
    .PUBLIC LDC, I, J
This allows the constant LENGTH to be used in the assembly routine almost as if the line LENGTH .EQU 50 had been written. (Recall the limitation mentioned above for the use of .CONST identifiers.) It also allows the variables LDC, I and J to be used by both the Pascal host and the assembly routine, and the variables PRT and LST2 to be used only by the assembly routine. Further, the LST2:9 causes the variable LST2 to correspond with the beginning of a 9 word block of space in the global data segment.
1.9.4.6. External Reference Directives
The use of .DEF and .REF is similar to that of .PUBLIC. .DEFs and .REFs associate labels between assembly language routines rather than between an assembly routine and a Pascal host program. Just as with .PRIVATE and .PUBLIC, these external references must eventually be resolved by the Linker. If such resolution cannot be accomplished, the Linker will indicate the offending label. Naturally, the assembler cannot be expected to flag these errors, since it has no knowledge of other assemblies.
.DEF
Identifies a label that is defined in the current routine and available to be used in other .PROCs or .FUNCs.
Form:
    .DEF <identifier-list>
Example:
    (* see listing in section 3.3.2.3 for example *)
.REF
Identifies a label used in this routine which has been declared in an external .PROC or .FUNC with a .DEF. During the linking process, corresponding .DEFs and .REFs are matched.
Form:
    .REF <identifier-list>
Example:
    (* see listing in section 3.3.2.3 for example *)
Note: The .PROC and .FUNC directives also generate a .DEF with the same name. This allows assembly procedures to call .PROCs and .FUNCs if they have been defined in a .REF.
1.9.4.7. Listing Control Directives
If no listing output file is specified then all .LIST and .NOLIST directives are simply ignored.
.LIST and .NOLIST
Allow selective listing of assembly routines. If no output file is declared then the default is CONSOLE: when a .LIST is encountered. The .NOLIST is used to turn off the .LIST option. Listing may be turned on and off repeatedly within an assembly.
Form:
    .LIST
    .NOLIST
.PAGE
Allows the programmer to explicitly ask for top of form page breaks in the listing.
Form:
    .PAGE
The title is only cleared at the start of the file. In section 1.9.1 the title SYMBOL TABLE DUMP was not set by a .TITLE directive. That heading is always used on pages containing symbol table dumps. Upon assembling a further procedure the heading printed returns to what it was before the symbol table dump.
.TITLE
Allows the titling of each page if desired. The title may be up to 80 characters in length. At the start of each procedure the title is set to blanks and must be reset if a title is desired.
Note: The title INTERP SYMBOL TABLE DUMP shown in Section 1.9.1 was not caused by a .TITLE directive.
Form:
    .TITLE <title>
where <title> is a string. It doesn't need quotes. It may contain spaces.
Example:
    .TITLE QRC12 interpreter
1.9.4.8. File Directives
.INCLUDE
Causes the indicated source file to be included at that point.
Form:
    .INCLUDE <file identifier.TEXT>
where the file identifier is any file to be included. Only spaces are allowed between the end of the file name and the end of the Include line.
Example:
    .INCLUDE RIGHT.TEXT
    .INCLUDE WRONG.TEXT ; syntax error here
For a list of general errors and also notes on the Z80 and PDP-11 based machines see section 5.6 Assembler Syntax Errors.