MACLNKUM.DOC MACREL/LINK User's Manual Order No. AA-5664B-TA DISCLAIMER This document file was created by scanning the original document and then editing the scanned text. As much as possible, the original text format was restored. Some format changes were made to insure this document would print on current laser printers using 60 lines per page. The original spelling and grammar has been preserved. 1-Mar-1997 MACREL/LINK User's Manual Order No. AA-5664B-TA January 1979 This document is the user's manual for MACREL/LINK, a PDP-8 macro assembly language and linking loader. The manual completely describes all operating instructions, language elements, directives and options. Examples and demonstration programs are contained throughout the manual. SUPERSESSION/UPDATE INFORMATION: This document completely supersedes its predecessor AA-5664A-TA. OPERATING SYSTEM AND VERSION: OS/8 V3D and subsequent versions. SOFTWARE AND VERSION: MACREL Version 2D LINK Version 2B CREF Version 2A OVRDRV Version 2A --------------------------------------------------------------------- | To order additional copies of this manual, contact the Software | | Distribution Center, Digital Equipment Corporation, Maynard, | | Massachusetts 01754 | --------------------------------------------------------------------- digital equipment corporation - maynard. massachusetts First Printing, October 1977 Revised: January 1979 The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document. The software described in this document is furnished under a license and may only be used or copied in accordance with the terms of such license. No responsibility is assumed for the use or reliability of software on equipment that is not supplied by DIGITAL or its affiliated companies. 
Copyright (c) 1977, 1979 by Digital Equipment Corporation The postage-prepaid READER'S COMMENTS form on the last page of this document requests the user's critical evaluation to assist us in pre- paring future documentation. The following are trademarks of Digital Equipment Corporation: DIGITAL DECsystem-10 MASSBUS DEC DECtape OMNIBUS PDP DIBOL OS/8 DECUS EDUSYSTEM PHA UNIBUS FLIP CHIP RSTS COMPUTER LABS FOCAL RSX COMTEX INDAC TYPESET-8 DDT LAB-8 TYPESET-11 DECCOMM DECSYSTEM-20 TMS-11 ASSIST-11 RTS-8 ITPS-10 VAX VMS SBI DECnet IAS CONTENTS Page PREFACE xi CHAPTER 1 INTRODUCTION 1-1 1.1 MACREL FEATURES 1-1 1.1.1 Relocation 1-1 1.1.2 Macros 1-2 1.1.3 Directives 1-3 1.2 OVERVIEW OF ASSEMBLY AND RELOCATABLE LOADING 1-3 1.2.1 Assembling with MACREL 1-3 1.2.2 Linking with LINK 1-4 1.3 COMPATIBILITY OF MACREL WITH PAL8 1-4 1.3.1 MACREL: Differences from PAL8 1-5 1.3.1.1 Dollar Sign ($) 1-5 1.3.1.2 DTORG 1-5 1.3.1.3 MACREL and Literals 1-5 1.3.1.4 PAUSE 1-5 1.3.1.5 Closing Literal Expressions 1-5 1.3.1.6 Terminal Support 1-6 1.3.1.7 PAL8 Run-Time Options 1-6 1.4 ASSEMBLING EXISTING PAL8 PROGRAMS WITH MACREL 1-6 1.5 INTRODUCTION TO MACREL RELOCATION 1-7 1.5.1 Source Modules 1-7 1.5.2 Program Sections 1-7 1.5.3 Fundamentals of Relocatable Programming 1-10 1.5.4 Example of Communication Between Program Sections 1-13 1.5.4.1 A Sample Program 1-13 1.5.4.2 Program Operation 1-16 1.5.4.3 Effects of Relocation on the Program 1-17 1.5.4.4 The Symbol Table 1-18 1.5.5 Relocation Type 1-18 1.5.5.1 Absolute Expressions 1-19 1.5.5.2 Simple Relocation Expressions 1-19 1.5.5.3 CDF/CIF Relocation Expressions 1-19 1.5.5.4 .FSECT Relocation Expressions 1-19 1.5.5.5 Complex Relocation Expressions 1-19 1.6 THE ASSEMBLY PROCESS 1-20 1.6.1 Pass One -- Symbol Definition Pass 1-20 1.6.2 Pass Two -- Binary Code Generation Pass 1-21 1.6.3 Pass Three -- Listing Pass 1-21 1.6.4 Pass Four -- Cross-Reference (KREF) Listing Pass 1-21 1.7 THE LINKING PROCESS 1-22 1.7.1 Pass One -- Linking 1-22 1.7.2 Pass 
Two -- Loading 1-22 iii CONTENTS (Cont.) Page CHAPTER 2 MACREL SOURCE PROGRAM FORMAT 2-1 2.1 MACREL STATEMENTS 2-1 2.1.1 Labels 2-1 2.1.2 Instructions, Directives or Data 2-1 2.1.3 Comments 2-2 2.2 FORMAT EFFECTORS 2-2 2.2.1 Form Feed 2-2 2.2.2 Tabulations 2-2 2.2.3 Statement Terminators 2-2 CHAPTER 3 THE PDP-8 MACHINE INSTRUCTION SET 3-1 3.1 MEMORY REFERENCE INSTRUCTIONS 3-1 3.1.1 Addressing Modes 3-2 3.2 MICROINSTRUCTIONS 3-3 3.2.1 Operate Microinstructions 3-3 3.2.2 Input/Output Transfer Microinstructions 3-5 3.3 AUTOINDEXING 3-5 3.4 STANDARD INSTRUCTION SET 3-6 3.5 CONSTANTS 3-10 CHAPTER 4 EXPRESSIONS AND THEIR COMPONENTS 4-1 4.1 MACREL CHARACTER SET 4-1 4.1.1 Alphanumeric Characters 4-1 4.1.2 Special Characters and Operators 4-1 4.2 SYMBOLS 4-2 4.2.1 Permanent Symbols 4-3 4.2.2 Program-Defined Symbols 4-4 4.2.3 Labels 4-4 4.2.4 Local Symbols 4-6 4.2.5 Backslash (\) Special Operator 4-7 4.2.6 Symbols and Relocation 4-9 4.3 DIRECT ASSIGNMENT STATEMENT 4-9 4.4 CURRENT LOCATION COUNTER 4-11 4.5 LITERALS 4-11 4.5.1 Current Page Literals 4-11 4.5.2 Page Zero Literals 4-14 4.6 NUMBER REPRESENTATION 4-15 4.6.1 Uparrow B (^B) -- Binary Representation 4-15 4.6.2 Uparrow D (^D) -- Decimal Representation 4-16 4.6.3 Uparrow O (^O) -- Octal Representation 4-16 4.6.4 Period (.) -- Decimal Representation 4-16 4.7 ASCII DATA REPRESENTATION 4-16 4.7.1 Double Quote (") -- ASCII Representation 4-17 4.7.2 Single Quote (') -- ASCII Pair 4-17 4.7.3 Uparrow Double Quote (^") -- Control Character Representation 4-18 4.8 MACREL ARITHMETIC OPERATORS 4-18 4.8.1 Plus Sign (+) -- Addition 4-18 iv CONTENTS (Cont.) 
Page 4.8.2 Minus Sign (-) -- Subtraction 4-19 4.8.3 Uparrow (^) -- Multiplication 4-19 4.8.4 Percent Sign (%) -- Division 4-19 4.8.5 Ampersand (&) -- Boolean AND 4-20 4.8.6 Exclamation Point (!) -- Inclusive OR (or Shift) 4-20 4.8.7 Space ( ) and Tab -- Inclusive OR 4-20 4.9 SPECIAL OPERATORS 4-21 4.9.1 Memory Reference Instructions 4-22 4.9.2 I -- Indirect Addressing 4-23 4.9.3 Z -- Page Zero Addressing 4-24 4.9.4 CIF and CDF Instructions 4-24 4.9.5 EDF -- Evaluate Data Field 4-25 4.9.6 .FLD 4-26 4.9.7 XEDF 4-26 4.9.8 .LEVEL 4-27 4.10 EXPRESSION EVALUATION AND SYNTAX 4-27 4.10.1 Operator Precedence and Order of Evaluation 4-28 4.11 USES OF EXPRESSIONS 4-29 4.12 EXPRESSIONS AND RELOCATION 4-30 CHAPTER 5 MACREL DIRECTIVES 5-1 5.1 ASSEMBLY LISTING AND BINARY OUTPUT CONTROL DIRECTIVES 5-1 5.1.1 Assembly Listing Control Directives 5-2 5.1.1.1 .LIST and .NOLIST 5-2 5.1.1.2 .TITLE 5-3 5.1.1.3 .SBTTL 5-4 5.1.1.4 .LISTWD Special Variable 5-4 5.1.2 Other PAL8 Directives 5-5 5.1.2.1 XLIST 5-5 5.1.2.2 EJECT 5-5 5.1.2.3 NOPUNCH and ENPUNCH 5-6 5.2 RADIX CONTROL DIRECTIVES 5-6 5.2.1 .RADIX 5-6 5.2.2 Other PAL8 Directives 5-7 5.3 DATA STORAGE DIRECTIVES 5-7 5.3.1 ZBLOCK 5-7 5.3.2 TEXT 5-8 5.3.2.1 Simple Form of the TEXT Directive 5-8 5.3.2.2 Complex Form of the TEXT 5-10 5.3.2.3 TEXT Options 5-11 5.4 CODE LOCATION DIRECTIVES 5-13 5.4.1 Asterisk (*) 5-14 5.4.2 PAGE 5-16 5.4.3 FIELD 5-17 5.4.3.1 FIELD Directive in Named Program Sections 5-17 5.4.3.2 FIELD Directive in Unnamed Program Sections 5-18 5.4.4 RELOC 5-19 5.5 CONDITIONAL ASSEMBLY DIRECTIVE 5-20 v CONTENTS (Cont.)
Page 5.5.1 Nested Conditional Assembly Directives 5-23 5.5.2 Other PAL8 Conditional Assembly Directives 5-24 5.6 ASSEMBLY CHAINING DIRECTIVES 5-24 5.6.1 .INCLUDE 5-24 5.6.2 .CHAIN 5-25 5.7 USER SERVICE ROUTINE (USR) COMMUNICATION DIRECTIVES 5-26 5.7.1 DEVICE 5-26 5.7.2 FILENAME 5-26 5.8 LOADING INFORMATION DIRECTIVES 5-27 5.8.1 .START 5-27 5.8.2 .JSW 5-28 5.8.3 .VERSION 5-29 5.9 SYMBOL TABLE MODIFICATION DIRECTIVES 5-29 5.9.1 EXPUNGE 5-29 5.9.2 FIXMRI 5-31 5.9.3 FIXTAB 5-32 5.10 STACK MANIPULATION DIRECTIVES 5-33 5.10.1 .PUSH 5-33 5.10.2 .POP 5-34 5.11 ASSEMBLY OPTIONS DIRECTIVES 5-34 5.11.1 .ENABLE 5-35 5.11.2 .DISABL 5-37 5.11.3 .ENABWD Special Variable 5-37 CHAPTER 6 MACRO AND REPEAT DIRECTIVES 6-1 6.1 MACRO DEFINITIONS 6-3 6.1.1 .MACRO and .ENDM Directives 6-3 6.2 MACRO CALLS 6-6 6.3 MACRO ARGUMENTS 6-7 6.3.1 Actual Arguments 6-7 6.3.2 Substrings of the Argument 6-9 6.3.2.1 .NCHAR Special Operator 6-10 6.3.3 Symbols and Names in Macros 6-11 6.3.4 Apostrophe (') Special Operator 6-12 6.4 NESTED MACROS 6-13 6.4.1 Nested Macro Definitions 6-13 6.4.2 Nested Macro Calls 6-14 6.4.3 .MEXIT Directive 6-15 6.4.4 Concatenation in Nested Macros 6-16 6.5 CONDITIONAL ASSEMBLY DIRECTIVES IN MACROS 6-16 6.5.1 Nested Conditional Source Code in Macros 6-17 6.5.2 .NARGS Special Operator 6-19 6.6 DEFINING AND CALLING REPEAT BLOCKS 6-19 6.6.1 .REPT and .ENDR Directives 6-20 6.6.2 Nested Repeats 6-20 CHAPTER 7 RELOCATION 7-1 7.1 THE PROGRAM SECTION DIRECTIVES 7-2 vi CONTENTS (Cont.) 
Page 7.1.1 Unnamed Program Sections 7-4 7.1.2 .ASECT 7-4 7.1.3 .RSECT 7-5 7.1.4 .ZSECT 7-6 7.1.5 .XSECT 7-7 7.1.6 .DSECT 7-9 7.1.7 .FSECT 7-10 7.1.8 Summary of Relocation Performed by LINK 7-12 7.1.9 .SECT * Directive 7-13 7.2 INTER-MODULE COMMUNICATION DIRECTIVES AND SPECIAL OPERATORS 7-14 7.2.1 .EXTERNAL 7-14 7.2.2 .ZTERNAL 7-15 7.2.3 .GLOBAL and == 7-16 7.2.4 .SECREF 7-17 7.2.5 .ENTRY 7-17 7.2.6 CDF Special Operator 7-18 7.2.7 CIF Special Operator 7-19 7.3 HOW TO WRITE RELOCATABLE CODE 7-21 CHAPTER 8 USING MACREL AND KREF 8-1 8.1 RUNNING MACREL 8-1 8.1.1 MACREL Command String 8-1 8.1.2 MACREL Command String Examples 8-4 8.1.3 MACREL Terminal Error Messages 8-5 8.1.3.1 Run-Time Control Commands 8-5 8.1.3.2 Default Terminal Conditions 8-6 8.1.3.3 Terminal Message Format 8-6 8.1.4 Listing Error Messages 8-7 8.1.5 Program Listing Format 8-7 8.1.5.1 Line Number Column 8-8 8.1.5.2 Current Location Counter Value Column 8-8 8.1.5.3 Absolute Assembled Value Column 8-9 8.1.6 Symbol Table Format 8-10 8.1.7 Example Program Listing 8-11 8.1.7.1 Assembly Listing 8-11 8.1.8 Symbol Table Listing 8-14 8.2 RUNNING KREF 8-15 8.2.1 KREF Command String 8-15 8.2.2 Description and Example of KREF Listing 8-15 CHAPTER 9 USING LINK 9-1 9.1 RUNNING LINK 9-1 9.1.1 LINK Command String 9-1 9.1.2 LINK Command String Example 9-3 9.2 RULES FOR USING OVERLAY OPTIONS 9-3 9.3 LINK LIBRARIES 9-6 9.4 LINK ERROR MESSAGES 9-7 9.5 LINK LOAD MAP DESCRIPTION AND EXAMPLE 9-11 vii CONTENTS (Cont.) 
Page CHAPTER 10 ADVANCED TECHNICAL TOPICS 10-1 10.1 MACREL SYMBOL TABLE SIZE 10-1 10.2 RELOCATABLE BINARY OBJECT MODULE FORMAT 10-2 10.2.1 LSD Preface 10-2 10.2.2 LSD Description 10-3 10.2.3 Text Description 10-4 10.2.3.1 Flag Field Meaning 10-5 10.2.3.2 Loader Codes Generated by MACREL 10-5 10.3 LINK PERFORMANCE 10-8 10.3.1 LINK Processing of Program Section Definitions 10-8 10.4 LINK LIBRARY 10-8 10.5 WRITING AND USING OVERLAYS 10-9 10.5.1 Writing Overlay Code 10-10 10.5.2 How Overlays Work 10-13 10.5.3 Overlay Driver 10-14 10.6 SAVE FILE FORMAT 10-14 10.7 MACRO LIBRARY 10-17 10.8 SETTING THE CURRENT LOCATION COUNTER TO AN UNKNOWN VALUE 10-18 APPENDIX A ASCII CHARACTER SET A-1 APPENDIX B MACREL-PAL8 COMPATIBILITY SUMMARY B-1 APPENDIX C MACREL PERMANENT SYMBOL TABLE C-1 APPENDIX D MACREL DIAGNOSTIC ERROR MESSAGES D-1 INDEX Index-1 FIGURES FIGURE 1-1 Program Sections and Modules 1-10 1-2 Example of Communication Between Program Sections 1-13 3-1 Memory Reference Instruction Bit Assignments 3-2 3-2 Group 1 Operate Microinstruction Bit Assignments 3-3 3-3 Group 2 Operate Microinstruction Bit Assignments 3-4 3-4 Group 3 Operate Microinstruction Bit Assignments 3-4 3-5 Extended Memory Bit Mapping for CDF and CIF Instructions 3-10 4-1 Extended Memory Field Bit Layout for XEDF 4-27 viii CONTENTS (Cont.) Page FIGURES (Cont.) 
4-2 BIT Assignments for .LEVEL Special Operator 4-27 5-1 Example of TEXT Option Processing 5-12 6-1 Macro Definition and Storage 6-5 6-2 Macro Expansion 6-7 6-3 Example of Nested Conditional Source Code in Macros 6-18 6-4 Example of a Nested Repeat Block 6-22 6-5 Example of the Expansion of a Nested Repeat Block 6-23 8-1 Example Program Listing 8-11 8-2 Example of Symbol Table Listing 8-14 8-3 Example of KREF Listing 8-16 9-1 Example of LINK Load Map 9-12 10-1 Object Module Format 10-2 10-2 LSD Entry Format 10-4 10-3 Flag Words 10-4 10-4 Loader Code 10-5 10-5 Library File Format 10-9 10-6 MACREL/LINK Overlay Structure (numbers in octal) 10-11 10-7 Permissible JMS's Between MAIN and Overlays 10-12 10-8 Transfer Vector Table 10-13 10-9 Memory Control Block 10-15 10-10 Memory Segment Double Words 10-16 10-11 Overlay Storage 10-16 TABLES TABLE 1-1 Types of Sections 1-8 4-1 Special Characters and Operators 4-2 4-2 Processing of CDF/CIF Expressions 4-25 4-3 Relocation Types Resulting from Addition and Subtraction Operations 4-30 4-4 Relocation Types Resulting from Other Arithmetic Operations 4-31 5-1 .LIST and .NOLIST Options 5-2 5-2 .LISTWD Bit Assignments 5-5 5-3 FIELD Directive Processing for Named Program Sections 5-18 5-4 Conditions for the .IF Conditional Assembly Directive 5-22 5-5 .ENABLE Directive Options 5-36 5-6 .DISABL Directive Options 5-37 5-7 .ENABWD Bit Assignments 5-39 7-1 Types of Program Sections 7-3 7-2 Summary of Program Section Relocation 7-12 ix CONTENTS (Cont.) Page TABLES (Cont.) 
8-1 MACREL Command String Options 8-2 8-2 Terminal/Control Commands 8-6 8-3 MACREL Error Codes 8-8 8-4 Symbol Table Descriptor Codes 8-10 9-1 LINK Control Options 9-4 9-2 LINK Error Messages 9-7 9-3 LINK-Detected System Errors 9-9 10-1 MACREL Symbol Table Size 10-1 10-2 LSD Preface 10-3 10-3 MACREL/LINK Loader Codes 10-5 10-4 Extended MACREL/LINK Loader Codes 10-7 10-5 LINK Symbol Table Size 10-8 A-1 ASCII Character Set A-1

PREFACE

ASSUMPTIONS REGARDING READER KNOWLEDGE

To use this manual effectively, you must be familiar both with PDP-8 computer hardware and with assembly-language programming. You should also be familiar with OS/8 system operation and its vocabulary. The manual provides some tutorial information, but assumes you have a general background in the above areas. If necessary, refer to the following documents for more information on any of these topics.

DOCUMENTS REFERENCED IN THE TEXT

The following documents are referenced in the text:

     PDP-8/A Minicomputer Handbook
     PDP-8/E, PDP 8/M & PDP 8/F Small Computer Handbook
     OS/8 Handbook
     OS/8 Software Support Manual
     OS/78 Handbook

ABOUT THIS MANUAL

The MACREL/LINK User's Manual describes the MACREL assembler and the linking loader LINK and discusses how to write programs using MACREL and how to link the resulting object modules together with LINK. If you are an inexperienced PDP-8 programmer, read the manual sequentially, beginning with Chapter 1.

For the experienced PAL8 assembly language programmer, the manual also explains the differences between MACREL and older PDP-8 assemblers. Read the manual straight through, and skip over any material that is obvious (discussion of literals, etc.). In particular, begin with the Introduction, which outlines the new features of MACREL and provides an introduction to relocation.
This introduction is necessary to understand the rest of the manual. Chapter 2, Source Program Format, describes the rules for formatting a MACREL source program, and shows some new features in MACREL assembly listings. Chapter 3, the Instruction Set, may be skipped. Chapter 4, Symbols and Expressions, contains some new MACREL features plus a discussion of how relocation affects symbols, the current location counter, direct assignment statements, and expressions. Chapter 5, Directives, describes the general assembler directives (referred to as pseudo-operators in PAL8). Chapter 6 describes macros and how to use them. Chapter 7, Relocation, describes the relocation features and how to write relocatable programs. Chapters 8 and 9 explain how to run MACREL and LINK, respectively. Chapter 10 discusses certain advanced programming techniques.

For the Inexperienced PDP-8 Assembly Language Programmer

If you are unfamiliar with assembly languages, read the book "Introduction to Programming" first. If you are not experienced in PAL8 programming, read the manual in two passes. First become familiar with assembly language programming in absolute (non-relocatable) program sections, and then reread the manual to become familiar with relocation. You can skim over the relocation sections of the introduction (though some of the information is required to understand the rest of the manual), and read Chapters 2 through 5 carefully, writing sample programs to test out the features mentioned. Skip over the chapters on the macros and relocation and read the chapters on running MACREL and LINK so you can get your programs running. Then you can read the manual again to learn about relocation and macros.

USE OF KEY TERMS

Some terms are used throughout the manual in a precise, technical sense. These words include program section, program, page, field, source module, source file, relocatable binary module, linking, absolute, simply relocatable, and complex relocatable.
Many of these are covered in the introduction. In particular, notice the distinctions among the terms program, program section, and module. The word "segment" is also sometimes used, but in a generic, nontechnical way (that is, a segment of code, meaning a block or group of code).

DOCUMENT CONVENTIONS

Examples consist of actual computer output wherever possible. Any responses that you must enter are printed in red ink to distinguish them from computer output.

The symbols defined below are used throughout this manual.

     Symbol        Definition

     []            Brackets indicate that the enclosed argument is optional.

     ||            Vertical bars indicate that a single choice must be made from a list of arguments.

     ...           Ellipsis indicates optional continuation of an argument list in the form of the last specified argument.

     UPPER-CASE    Upper-case characters indicate elements of the language
     CHARACTERS    that must be used exactly as shown.

     lower-case    Lower-case characters indicate elements of the language
     characters    that are supplied by the programmer.

                   In some instances the symbol (n) is used following a number to indicate the radix. For example, 100(8) indicates that 100 is an octal value, while 100(10) indicates a decimal value.

     (RET)         Indicates that you must enter a carriage return.

CHAPTER 1

INTRODUCTION

MACREL (MACro-RELocatable) is a PDP-8 based assembler that allows you to write a program using macro coding and relocatable program sections. Macros are generalized instruction sequences that can, if desired, be modified by arguments (data) when the macro is used. Relocatable program sections allow you the freedom of writing programs without regard to the location of the assembled code in PDP-8 memory.

The MACREL assembler offers a number of new features to you as a PDP-8 programmer. These are outlined below. The use of these features in assembly-language programming requires some reorientation of thinking. You should read this chapter carefully.
1.1 MACREL FEATURES

The main features of MACREL are relocation, macros, and supporting directives and operators.

1.1.1 Relocation

Relocation refers to the automatic adjustment of memory addresses at program loading time rather than at program assembly time. This allows you to write programs without consideration of actual memory addresses. Relocation occurs in two stages: first, MACREL assembles your source file into a relocatable binary file; second, LINK converts your relocatable binary file into an executable memory image file.

Normally, sections of code begin on page boundaries. MACREL/LINK may relocate these sections to some other page, but generally will not relocate them to a different place on the page. You must still write code with respect to page boundaries.

Relocation provides several advantages, particularly when working with large programs. You can correct, edit, assemble and list small files rather than having to perform all these functions with large, unwieldy files. Then, when you have assembled all the program modules with MACREL, you can link them together using LINK, the linking loader. By observing the simple programming techniques described in this manual, you do not have to know the location of programs in memory. In addition, by splitting the program into discrete sections that perform specific functions, you produce programs that are faster to debug and easier to understand. Finally, relocation makes it easier to construct libraries of routines that you can incorporate into your program.

In practice, you code your programs in much the same way as you always have. If, however, you specify that a given program segment is relocatable using, for example, the .RSECT directive, LINK will add a fixed number to the addresses shown in the listing, causing the program to load at a different set of locations in memory. LINK also automatically handles any reference from one program section or module to another.
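The address adjustment described above can be modelled in a few lines of Python. This is only an illustrative sketch and not LINK's actual algorithm: the function name relocate, the sample addresses, and the chosen load address are all assumptions made for the example. It shows a section whose listing addresses start at 0 being moved to a page boundary by adding one fixed number to every address.

```python
PAGE_SIZE = 0o200      # a PDP-8 memory page holds 128 (200 octal) words
FIELD_SIZE = 0o10000   # a PDP-8 memory field holds 4096 (10000 octal) words


def relocate(section_addresses, load_base):
    """Add one fixed offset to every section-relative address.

    Hypothetical helper: it models the effect described in the text,
    not the real LINK implementation.
    """
    if load_base % PAGE_SIZE != 0:
        raise ValueError("this sketch only loads sections on page boundaries")
    relocated = [addr + load_base for addr in section_addresses]
    if any(addr >= FIELD_SIZE for addr in relocated):
        raise ValueError("section would run past the end of the field")
    return relocated


# Addresses as they might appear in a listing (relative to the section start).
listing = [0o0000, 0o0001, 0o0002, 0o0177]
# Assume the section is placed at location 400 (octal); the value is made up.
loaded = relocate(listing, 0o400)
print([format(a, "04o") for a in loaded])  # ['0400', '0401', '0402', '0577']
```

Because the offset is a whole number of pages, each word keeps its position within its page, which is why code must still be written with respect to page boundaries.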
One additional feature that MACREL and LINK provide is a system of overlays. This allows you to segment large programs so they can run in a small amount of memory. In practice, the whole overlay procedure is almost invisible. For example, if your program does a branch (JMS) to a location that is not in memory, the overlay driver, which is part of the MACREL/LINK software package, brings in the appropriate section of code, and the instruction performs as if the code had always been in memory. Except for your defining the files as overlays, MACREL and LINK automatically handle this entire procedure.

1.1.2 Macros

The macro feature of MACREL allows you to predefine a generalized set of code and insert this code anywhere in a program simply by using the name of the macro. In addition, the coding inserted can be varied according to the arguments supplied. The following is a simple example of a macro definition:

     .MACRO SUB A, B         /Subtract A from B
     CLA
     TAD A
     CIA
     TAD B
     .ENDM SUB

The .MACRO and .ENDM directives tell MACREL that this is a macro definition. In order to call this macro, you merely need to use its name in a statement followed by the two arguments that you want to use. For example, if you want to subtract the contents of N1 from the contents of CNTR, you write the following:

     SUB N1, CNTR

Then, when the program is assembled, the macro code appears as follows:

     CLA
     TAD N1
     CIA
     TAD CNTR

Note that the arguments shown in the definition have been replaced by the arguments specified in the macro call.

Each time you call this macro, all four instructions will be inserted into the code (the calling line does not produce any object code). Thus, unlike a subroutine that is called repeatedly, a macro will not necessarily reduce program size.
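The point that every call inserts a fresh copy of the macro body can be illustrated with a short Python model. This is a sketch under stated assumptions, not a description of how MACREL is implemented: the function expand_macro is hypothetical, it substitutes whole whitespace-separated tokens only, and the second call's arguments X and Y are invented for the example.

```python
def expand_macro(formals, body, actuals):
    """Replace each formal argument in the body with its actual argument.

    Hypothetical model of macro expansion; it substitutes whole
    whitespace-separated tokens only.
    """
    if len(actuals) != len(formals):
        raise ValueError("argument count mismatch")
    binding = dict(zip(formals, actuals))
    expanded = []
    for line in body:
        tokens = [binding.get(tok, tok) for tok in line.split()]
        expanded.append(" ".join(tokens))
    return expanded


# The SUB macro from the text: formal arguments A and B.
SUB_FORMALS = ["A", "B"]
SUB_BODY = ["CLA", "TAD A", "CIA", "TAD B"]

# Two calls produce two complete copies of the four-instruction body.
program = expand_macro(SUB_FORMALS, SUB_BODY, ["N1", "CNTR"])
program += expand_macro(SUB_FORMALS, SUB_BODY, ["X", "Y"])  # X, Y are made up
for line in program:
    print(line)
print(len(program), "instructions from 2 calls")
```

In this model the program grows by four instructions for each call, whereas a subroutine would occupy memory only once.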
Rather, the advantages of macros are that they allow you to predefine commonly used sequences of code, perform complicated text manipulation, and write generalized program sections that assemble in different ways according to the needs of a particular program.

1.1.3 Directives

In addition to relocation and macro processing, MACREL provides you with a comprehensive set of directives (called pseudo-operators in PAL8). These directives provide for:

     o  Repeating portions of code or data (.REPT)
     o  Including one source file in the assembly of another (.INCLUDE)
     o  Conditionally assembling code (.IF)
     o  Formatting listing files (.LIST)
     o  Printing titles and subtitles (.TITLE, .SBTTL)
     o  Manipulating an assembly time stack (.PUSH, .POP)
     o  Controlling data storage and expression evaluation (.ENABLE)

1.2 OVERVIEW OF ASSEMBLY AND RELOCATABLE LOADING

The complete MACREL/LINK cycle consists of the following steps:

     o  Assembly
     o  Linking
     o  Run

These steps are discussed in the next two sections.

1.2.1 Assembling with MACREL

MACREL assembles the source code into the appropriate machine-language instructions and their corresponding addresses. If the code is specified as absolute, the addresses are the actual memory addresses where the instructions are to be loaded. If the code is relocatable, the addresses are relative to the beginning of the program section. LINK alters these addresses by adding a fixed number to each address to determine the actual memory address into which to load the code.

In addition to processing instruction codes, MACREL provides a number of directives. These are instructions to the assembler itself. They specify special assembly processing, such as interpreting numbers in octal or decimal format, formatting assembly listings appropriately, or, in some cases, sending special instructions to LINK that allow it to load the code correctly. Several of the directives pertain to the creation of macros.
A macro allows you to predefine a frequently used sequence of code and to include this sequence anywhere in the program by using the name of the macro (with appropriate arguments). When MACREL processes a macro, it inserts the predefined code into the assembled program in place of the macro call. Thus, the effect of macro processing is to create source code that is ready to be assembled. The macro feature behaves as if there were a separate program that you run to process the macros prior to assembly. (In actuality, it is done in one combined operation.) This allows for complex manipulation of text and data.

Another group of directives pertains to relocation and program modules. They provide information to LINK that allows it to load program sections and modules correctly. Some of these directives tell the assembler how to handle the assembled code and where to load it. Other directives allow for communication between different modules of assembled code.

1.2.2 Linking with LINK

The output of the MACREL assembler is a relocatable binary file. It contains the binary machine-language instructions with their relative loading addresses, information about where to load the program sections, and a list of unresolved symbols (undefined symbols declared to be .EXTERNAL). LINK, in turn, takes the relocatable binary files from one or more assembly runs, and creates a memory-image file ready to load. This memory-image file is, in essence, an exact copy of memory itself. You can load and run your program either by using LINK's /G command string option or by using the OS/8 Keyboard Monitor's R command.

In order to create the memory-image file, LINK resolves all unresolved symbols first. Then, LINK determines where to load the various program sections, depending upon whether they are absolute or relocatable, and creates the memory-image file.

1.3 COMPATIBILITY OF MACREL WITH PAL8

With few exceptions, MACREL is compatible with PAL8.
That is, a program that can be assembled correctly by PAL8 can also be assembled correctly by MACREL. MACREL, however, provides a number of new features (new directives, relocation, macros, new operators, etc.), and programs using these features cannot be assembled by PAL8. The remainder of this section is devoted to a discussion of the differences between MACREL and PAL8.

1.3.1 MACREL: Differences from PAL8

The following PAL8 features either are not available in MACREL or are handled somewhat differently. They are summarized in Appendix B.

1.3.1.1 Dollar Sign ($) - In PAL8, a dollar sign ($) anywhere (except in a comment or a text string) terminates the assembly. Since $ is a legal element of a symbol name in MACREL, this feature is not implemented in MACREL. However, to retain PAL8 compatibility, MACREL treats a symbol that consists only of one or more dollar signs as an end-of-program signal.

1.3.1.2 DTORG - The PAL8 pseudo-operator DTORG is not implemented in MACREL. Its sole function is to output a special code to a piece of typesetting hardware. It has no other function.

1.3.1.3 MACREL and Literals - When a PAL8 program changes the current location counter to return to a previous page, PAL8 remembers the number of literals on that page and does not overwrite them. MACREL, on the other hand, will overwrite literals (except on page zero) any time you set the current location counter to a previous page. For this reason, you should code straight through. (See Section 4.4.)

1.3.1.4 PAUSE - MACREL ignores the PAL8 pseudo-operator PAUSE (i.e., no undefined symbol error is generated). PAUSE is used by PAL8 and is included in MACREL for compatibility.

1.3.1.5 Closing Literal Expressions - PAL8 ignores the right parenthesis ()) or right square bracket (]) in a literal expression. Therefore, the literal expression includes all code to the right of the parenthesis or bracket. In MACREL, the right parenthesis or bracket terminates a literal.
This difference is a problem only when you code a literal expression incorrectly.

1.3.1.6 Terminal Support - In PAL8, if the terminal in use does not support horizontal tab and line feed, the assembler simulates them whenever a listing is output to that terminal. MACREL does not perform these functions. Either the terminal in use supports tabs and line feed, or the OS/8 TTY handler simulates them.

1.3.1.7 PAL8 Run-Time Options - There are a number of differences between PAL8 run-time options and MACREL run-time options. Only two of these directly affect program code generation and are discussed here. The remainder are listed in Appendix B.

In PAL8, the /B command-string option makes the exclamation point operator (!) a 6-bit left shift instead of an inclusive OR. In MACREL, the .ENABLE SHIFT directive replaces this option.

In PAL8, the /F command-string option disables the extra zero fill in TEXT pseudo-ops. In MACREL, the .DISABLE FILL directive replaces this option.

1.4 ASSEMBLING EXISTING PAL8 PROGRAMS WITH MACREL

Except for the few minor incompatibilities mentioned above, PAL8 programs will assemble under MACREL as unnamed absolute sections. However, we do recommend that you name them explicitly as absolute program sections by inserting a line in the following form at the beginning of the program.

     .ASECT name

The directive .ASECT indicates that this is an absolute section, and name is the name you supply for the section. MACREL assumes a loading address of 200 in field zero by default.

There are four types of files associated with MACREL:

     Relocatable Binary File (default extension .RB) - the output file used as input to LINK.

     Listing File (default extension .LS) - the output file which contains the assembly listing.

     KREF File (default extension .KF) - the file used by KREF, MACREL's cross-reference listing program.

     Source File 1, Source File 2, etc. (default extension .MA) - your file(s) that contain the ASCII source code to be assembled.
1.5 INTRODUCTION TO MACREL RELOCATION

The remainder of this chapter provides a somewhat more detailed explanation of MACREL/LINK operation. If you are new to PDP-8 assembly language programming you may wish to read Chapters 2 through 4 before reading this section.

1.5.1 Source Modules

A source module is a continuous file taken as a whole. To illustrate, recall from Section 1.4 that the input file string (Source File 1), (Source File 2), ... (Source File n) defines a continuous sequence of code to MACREL. That is, after the last character of the last line of source file 1, the first character of the first line of source file 2 is read without a break. The assembler treats this sequence of files as one continuous file. In addition, any file may call other files internally if the .INCLUDE or .CHAIN directives have been specified. Again, the result is one continuous file.

This continuous file, or source module, is the input to one assembly operation. The result of the assembly is a relocatable binary module (the relocatable binary file used as input to LINK) plus the listing and KREF files.

To summarize (and it is important that these terms be clearly understood), the input to an assembly is a source module and the output is a relocatable binary module (plus the listing and KREF files). Note that these modules may not comprise the entire program. The program may consist of a large number of source modules that communicate among themselves through symbols that have been defined by the .EXTERNAL, .GLOBAL, .ENTRY, and various program section directives. MACREL assembles each source module separately and produces a corresponding relocatable binary module that contains information for LINK about symbols that are defined as .GLOBAL, .EXTERNAL, etc. LINK resolves all these inter-module references and determines where to load the program sections.
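The resolution of inter-module references can be pictured with a small Python model. It is a sketch only: the data layout, the function name resolve_externals, the module names MAIN and LIB, and the addresses are assumptions invented for this illustration and do not reflect the relocatable binary format or LINK's real procedure. Each module lists the symbols it defines globally and the symbols it expects another module to supply.

```python
def resolve_externals(modules):
    """Match each module's external references to global definitions.

    `modules` maps a module name to a dict with two keys:
      "globals":   {symbol: address} defined in that module
      "externals": [symbol, ...] used there but defined elsewhere
    Hypothetical data layout, invented for this illustration.
    """
    definitions = {}
    for name, module in modules.items():
        for symbol, address in module["globals"].items():
            if symbol in definitions:
                raise ValueError(f"{symbol} defined in more than one module")
            definitions[symbol] = (name, address)

    resolved = {}
    for name, module in modules.items():
        for symbol in module["externals"]:
            if symbol not in definitions:
                raise KeyError(f"{symbol} referenced in {name} but never defined")
            resolved[(name, symbol)] = definitions[symbol]
    return resolved


# Two made-up modules: MAIN calls SUBR, which is defined in LIB.
modules = {
    "MAIN": {"globals": {"START": 0o200}, "externals": ["SUBR"]},
    "LIB": {"globals": {"SUBR": 0o400}, "externals": []},
}
for (user, symbol), (owner, address) in resolve_externals(modules).items():
    print(f"{user} uses {symbol}, defined in {owner} at {address:04o}")
```

The model reports an error for a symbol that no module defines, which corresponds to a reference that cannot be resolved at link time.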
1.5.2 Program Sections

A program section is a segment of code that begins with a program section (.SECT) directive. It is loaded by LINK into a contiguous set of memory locations. LINK may arbitrarily load different program sections into discontinuous areas of memory. Thus, if two program sections follow each other in the source module, there is no guarantee that LINK will load them into consecutive areas of memory.

For PAL8 compatibility, if a sequence of code has no program section directive or name, MACREL defines it as an absolute program section (.ASECT) that loads into the absolute locations indicated on the program listing.

Relocatable program sections, such as .RSECTs, are relocated by LINK at link time. The assembler handles all communication among the various sections of a source module. If, for example, you code a JMS I (SUBR), where SUBR is a label in another section, the assembler generates information to LINK identifying (SUBR) as a literal in another section, and the program executes perfectly even though neither you nor the assembler knows at assembly time where in memory the sections will actually load.

There are six types of program sections described in Table 1-1. Programs typically are coded as any combination of .ASECTs and .RSECTs. You can use .ASECT for programs that make reference to absolute memory locations and .RSECT for programs or sections of programs that will be relocated.

The remaining four section types are more specialized. If you wish to use page zero locations, you may define a .ZSECT that contains the page zero table (or other code) referenced. If you want to use the autoindex registers (locations 10-17), define an .XSECT with the appropriate labels. If you have a block of data (containing no instructions) that can be located anywhere, place it in a .DSECT. Finally, if you have a number of small subroutines that you wish to load together on a page, define each subroutine as an .FSECT.
Such an .FSECT must be shorter than one memory page and LINK may load it anywhere on the page.

                              Table 1-1
                          Types of Sections

----------------------------------------------------------------------
| Type of Section     | Description                                  |
|---------------------|----------------------------------------------|
| .ASECT              | ABSOLUTE SECTION - loaded into memory        |
|   or                | starting at the absolute address given.      |
| .SECT name, A       |                                              |
|---------------------|----------------------------------------------|
| .RSECT              | RELOCATABLE SECTION - may be located         |
|   or                | anywhere except page 0, and is loaded        |
| .SECT name, R       | starting at the beginning of a page.         |
|---------------------|----------------------------------------------|
| .ZSECT              | PAGE ZERO SECTION - will be loaded into      |
|   or                | page zero (locations 20-177).                |
| .SECT name, Z       |                                              |
|---------------------|----------------------------------------------|
| .XSECT              | AUTOINDEX SECTION - will be loaded into      |
|   or                | locations 10-17 on page zero.                |
| .SECT name, X       |                                              |
|---------------------|----------------------------------------------|
| .DSECT              | DATA SECTION - contains data and may be      |
|   or                | relocated anywhere, not necessarily          |
| .SECT name, D       | starting on a page boundary. The section     |
|                     | may flow across pages but not across         |
|                     | fields.                                      |
|---------------------|----------------------------------------------|
| .FSECT              | FLOATING SECTION - contains instructions     |
|   or                | and may be relocated anywhere (not           |
| .SECT name, F       | necessarily beginning on a page boundary).   |
|                     | It must, however, be wholly contained        |
|                     | within a page. Several floating sections     |
|                     | may occupy one page. (It is generally used   |
|                     | for miscellaneous small subroutines.)        |
----------------------------------------------------------------------

LINK can load any program section into any field, but you can specify the memory field where a program section is to be loaded by using a FIELD directive. A program section may not cross field boundaries, and may not contain more than 4095 words of data.

You can define a program section using either of two syntaxes:

     .SECT BPROC, R

This line defines a relocatable program section named BPROC. Another example is:

     .ASECT MAIN

This uses the alternative form .ASECT rather than .SECT NAME,A and defines an absolute section named MAIN.

You can explicitly specify the loading address of a program section by setting the current location counter with the asterisk directive.

     *300

This sets the current location counter to 300. MACREL assumes a default loading address of 200 for absolute programs, and 0 (relative to the beginning of the program section) for relocatable program sections.

Note that source files, source modules, and program sections are completely different entities. Figure 1-1 shows the relationship of these terms. A given program may consist of a number of source modules, each of which is associated with one assembly operation. A module may consist of one or more program sections, each of which is a logical unit at link time: that is, it loads into a contiguous set of memory locations and is identified by an entry in a load map.

Physically, a source module may consist of any number of files. One file might terminate in the middle of a program section and the next one complete that program section and go on to others. This is of no concern to the assembler, however, because it treats all source files as one continuous input stream.

                 -----------                        A program contains
                 |  Final  |                        one or more
                 | Program |                        modules.
                 -----------
                      ^
                      |
     -----------------|.................
     |                |                .
------------     ------------     ------------      A module contains
| Module 1 |     | Module 2 |.....| Module P |      one or more
------------     ------------     ------------      sections.
                      ^
                      |
     -----------------|.................
     |                |                .
-------------    -------------    -------------     A section contains
| Section 1 |    | Section 2 |....| Section M |     one or more memory
-------------    -------------    -------------     locations.
                      ^
                      |
     -----------------|.................
     |                |                .
--------------   --------------   --------------    Memory locations
| Location 1 |   | Location 2 |...| Location N |    contain the source
--------------   --------------   --------------    code.

              Figure 1-1  Program Sections and Modules

MACREL handles communication among program sections of one source module and in general does not require special coding. A JMS I (SUBR), where SUBR is a subroutine in another section, executes correctly without any special effort on your part.

Communication between program sections in different source modules, however, requires that all references to a symbol in another module be identified by the directives .EXTERNAL, .GLOBAL, .ENTRY, or their equivalent. These directives are necessary because the other module is assembled at a different time and hence its symbols are not known to MACREL at this assembly time. For example, a TAD I (B, where B is a label in another source module, results in an undefined symbol error message, unless the symbol B is identified in a statement of the form .EXTERNAL B. Chapter 7 describes these module communication directives.

1.5.3 Fundamentals of Relocatable Programming

In general, the techniques for programming a relocatable program section are the same as those for programming individual pages of an absolute program section. Since an .RSECT always loads starting at page boundaries, the effect of relocation is merely to move the whole section as a unit by some multiple of 200 (octal) memory locations.
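The statement that a move by a multiple of 200 (octal) leaves on-page references undisturbed can be checked with a little arithmetic. The Python sketch below is an illustration only and is not part of MACREL or LINK; the names `page_offset` and `relocate` are hypothetical. It assumes only the page size of 200 (octal) words stated above.

```python
PAGE_SIZE = 0o200  # one memory page is 200 (octal) words


def page_offset(address):
    """Return the position of a 12-bit address within its page."""
    return address & (PAGE_SIZE - 1)


def relocate(address, pages):
    """Move an address by a whole number of pages within one field."""
    return (address + pages * PAGE_SIZE) & 0o7777


# Moving location 0204 up by one page gives 0404; the on-page offset stays 4.
moved = relocate(0o204, 1)
print(format(moved, "04o"), format(page_offset(moved), "04o"))  # prints 0404 0004
```

Because the on-page offset is unchanged, an instruction that refers to a location on its own page needs no adjustment when the section moves; only words that hold a full 12-bit address do.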
Consider the following sequence of code:

             *200
             CLA
             TAD B
             TAD I BPOINT
             JMP .+3
     B,      7
     BPOINT, B

This would assemble (in an absolute section) as:

     1         0200           *200
     2  00200  7200           CLA
     3  00201  1204           TAD B
     4  00202  1605           TAD I BPOINT
     5  00203  5206           JMP .+3
     6  00204  0007   B,      7
     7  00205  0204   BPOINT, B

     SYMBOL TABLE

     B       0204
     BPOINT  0205

Note that the program adds the value contained at location B twice (once through a direct TAD and once through an indirect TAD). It then jumps to some code following this, at location 206.

If this entire piece of code is relocated to start at location 400, it results in the following:

     1         0400           *400
     2  00400  7200           CLA
     3  00401  1204           TAD B
     4  00402  1605           TAD I BPOINT
     5  00403  5206           JMP .+3
     6  00404  0007   B,      7
     7  00405  0404   BPOINT, B

Note that in the code column to the right of the address column only one line of code has changed from the previous example, namely, the indirect address pointer BPOINT. Everything else is the same. For example, TAD B still has a code value of 1204 (i.e. two's complement add the contents of location 4 on the current page). The indirect address pointer BPOINT has been incremented by octal 200, the amount of the relocation factor.

In general, during relocation, instructions and ordinary data are unchanged. Twelve-bit addresses and values that are computed from them are altered by adding to them the address to which they are relocated.

For example, if this segment of code is made into an .RSECT named FLOAT, the code (including the symbol table) looks like this:

     1         0000           .RSECT FLOAT
     2  ?0000  7200           CLA
     3  ?0001  1204           TAD B
     4  ?0002  1605           TAD I BPOINT
     5  ?0003  5206           JMP .+3
     6  ?0004  0007   B,      7
     7  ?0005  0004 + BPOINT, B

     SYMBOL TABLE

     B       0004+ FLOAT
     BPOINT  0005+ FLOAT
     FLOAT   0006  RSECT

Compare this example with the one that loads at absolute address 200 and notice the following changes:

     1.  A question mark (?) replaces the field column of the address to show that LINK may load the program into any field.

     2.  The address of the first line of code is now shown as 0000 rather than 0200; that is, it is relative to the start of the program section.

     3.  Each succeeding address is relative in the same way. The same idea also applies to the 12-bit address stored in BPOINT, which is now 0004 rather than 0204, because it will be relocated at link time.

     4.  The plus sign (+) to the right of the code indicates that this relocation takes place.

In the symbol table, FLOAT shows up as the name of this relocatable program section and has an address of zero relative to the beginning of the section. All other labels in the section, then, are defined relative to FLOAT. Thus, BPOINT is 5+FLOAT. This means that, at link time, the label BPOINT will be evaluated by taking 5 and adding the value of FLOAT, the memory address of the start of the section.

BPOINT consists of an absolute part of 5 and a relocatable part FLOAT. The value of BPOINT, therefore, is the sum of the two parts. The absolute part does not change, but the relocatable part is unknown at assembly time and is determined by LINK after memory is allocated.

These effects of relocation are generally not of major concern to you because MACREL and LINK automatically handle them. However, it is helpful to understand what is happening, and how it affects the program listing.

1.5.4 Example of Communication Between Program Sections

In general, communication between program sections is the same as communication between different pages of an absolute program section. For example, in an absolute section, a TAD I (B) executed on one page where B is a label on another page, generates an on-page literal containing the address of B on the other page. In the same way, a TAD I (B) in a relocatable section, where B is a label in another relocatable section, generates an on-page literal pointer, and after linking, that pointer contains the actual address of B in the other section.
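Returning to the FLOAT example of Section 1.5.3, the following Python sketch shows why only the pointer word changes when the code moves. It is an illustration under stated assumptions, not a description of how MACREL or LINK is implemented: the functions `encode`, `assemble`, and `link` are hypothetical names, and the instruction layout used (a 3-bit operation code, an indirect bit of 0400, a current-page bit of 0200, and a 7-bit on-page address) is the standard PDP-8 memory reference format.

```python
OPCODES = {"AND": 0o0000, "TAD": 0o1000, "ISZ": 0o2000,
           "DCA": 0o3000, "JMS": 0o4000, "JMP": 0o5000}
INDIRECT_BIT = 0o400
CURRENT_PAGE_BIT = 0o200


def encode(opcode, target, location, indirect=False):
    """Encode a memory reference instruction stored at `location`."""
    word = OPCODES[opcode] | (target & 0o177)
    if target & 0o7600 != 0:                      # target is not on page zero
        if target & 0o7600 != location & 0o7600:
            raise ValueError("off-page target needs an indirect reference")
        word |= CURRENT_PAGE_BIT
    if indirect:
        word |= INDIRECT_BIT
    return word


def assemble(origin):
    """Assemble the six-word example at an absolute origin."""
    b, bpoint = origin + 4, origin + 5
    return [
        0o7200,                                            # CLA
        encode("TAD", b, origin + 1),                      # TAD B
        encode("TAD", bpoint, origin + 2, indirect=True),  # TAD I BPOINT
        encode("JMP", origin + 6, origin + 3),             # JMP .+3
        0o0007,                                            # B, 7
        b,                                                 # BPOINT, B
    ]


# The .RSECT FLOAT listing: (code, marked with "+" for relocation).
FLOAT_WORDS = [(0o7200, False), (0o1204, False), (0o1605, False),
               (0o5206, False), (0o0007, False), (0o0004, True)]


def link(words, base):
    """Add the section's load address to every word marked for relocation."""
    return [(code + base) & 0o7777 if marked else code for code, marked in words]


print([format(w, "04o") for w in link(FLOAT_WORDS, 0o400)])
# prints ['7200', '1204', '1605', '5206', '0007', '0404']
```

Linking the relocatable words at 200 or at 400 gives the same words as assembling the absolute version at that origin, which is the behavior the listings in Section 1.5.3 show.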
1.5.4.1 A Sample Program - The example program SKIFOC shown in Figure 1-2 illustrates communication between program sections. The program obtains a character stored in a buffer, tests the character to see if it is an octal digit, and then prints the result of the test. The program consists of three program sections: .RSECT TESTP, .RSECT SUBS and .DSECT DATA. .RSECT TESTP is the main section of the program. It first calls a subroutine SKIFOC in the .RSECT SUBS and then calls PRINT, the second subroutine in SUBS. When both subroutines are complete, TESTP increments a pointer to the buffer in .DSECT DATA and then checks to see if all the characters in the buffer have been tested. .RSECT SUBS has two subroutines: SKIFOC and PRINT. SKIFOC tests a character from the buffer and then returns to the calling program, TESTP, at either of two locations. PRINT then prints on the terminal, either an "N" or an "S" as a result of the test in SKIFOC. The .DSECT DATA contains the buffer that stores the characters being tested. In this example there are four characters: /, 0, 7, and 8. The final location in the buffer contains null (binary zero), which is used to terminate the test. This example is the listing file of the relocatable binary module produced by MACREL. The assembler adds three columns to the source module during assembly. The first column is the source module line number, the second is the relative address in the section and the third and final column is the code. Items in the code column that are marked by an asterisk (*) will be altered at link time. 
  1         0000           .RSECT  TESTP           /TEST SKIFOC ROUTINE
  2         0000           FIELD   0               /LOAD IN FIELD 0
  3
  4
  5  00000  4777   START,  JMS I   (SKIFOC)        /CALL SKIFOC,IS IT AN OCTAL DIGIT
  6  00001  1376           TAD     ("N-"S)         /NO,SET TO PRINT AN ASCII "N"
  7  00002  1375           TAD     ("S)            /YES,GET AN ASCII "S"
  8  00003  4774           JMS I   (PRINT)         /CALL SUB ROUTINE "PRINT"
  9  00004  2211           ISZ     BLOC            /INCREMENT BUFFER POINTER
 10  00005  1611           TAD I   BLOC            /GET NEXT CHARACTER
 11  00006  7640           SZA CLA                 /IS IT ZERO
 12  00007  5200           JMP     START           /NO, TEST IT
 13  00010  5773           JMP I   (7605)          /YES,RETURN TO KEYBOARD MONITOR
 14
 15  00011  0000 * BLOC,   BUFFER                  /STORAGE FOR LOCAL DATA
 16
 17
 18
 19  -----
     00173  7605
     00174  0021 *
     00175  0323
     00176  7773
     00177  0004 *
 20         0000           .RSECT  SUBS            /SECTION OF SUB ROUTINES ,SKIFOC & PRINT
 21         0000           FIELD   0               /ALSO IN FIELD 0
 22
 23
 24                                                /STORE LOCAL DATA FOR THIS SECTION
 25
 26  00000  7520   NEG0,   -"0                     /MINUS ASCII 0
 27  00001  7511   NEG7,   -"7                     /MINUS ASCII 7
 28  00002  0011 * BLOCAD, BLOC                    /THE ADDRESS OF "BLOC"
 29  00003  0000   VBLOC,  0                       /ON PAGE STORAGE FOR "BLOC"
 30
 31
 32
 33  00004  0000   SKIFOC, 0                       /ROUTINE TO TEST THE CHARACTER IN BUFFER
 34                                                /IF IT IS AN OCTAL DIGIT SKIP NEXT
 35                                                /INSTRUCTION (LINE 6) IN THE CALLING
 36                                                /PROGRAM
 37  00005  7200           CLA                     /CLEAR ACCUMULATOR
 38  00006  1602           TAD I   BLOCAD          /GET ADDRESS OF BUFFER
 39  00007  3203           DCA     VBLOC           /STORE IT ON PAGE
 40  00010  1603           TAD I   VBLOC           /GET CHARACTER AT THIS ADDRESS
 41  00011  1200           TAD     NEG0            /SUBTRACT ASCII 0
 42  00012  7710           SPA CLA                 /IS IT > OR =TO 0
 43  00013  5604           JMP I   SKIFOC          /NO, RETURN TO CALLING PROGRAM
 44  00014  1603           TAD I   VBLOC           /YES, GET CHARACTER AND TEST AGAIN
 45  00015  1201           TAD     NEG7            /SUBTRACT ASCII 7
 46  00016  7750           SPA SNA CLA             /IS IT < OR= TO 7
 47  00017  2204           ISZ     SKIFOC          /YES, INCREMENT RETURN ADDRESS OF
 48                                                /CALLING PROGRAM
 49  00020  5604           JMP I   SKIFOC          /NO, RETURN TO CALLING PROGRAM
 50
.RSECT  TESTP   /TEST SKIFOC                                 PAGE 1-1 FILE 1
 51
 52  00021  0000   PRINT,  0                       /A ROUTINE TO PRINT ONE ASCII CHARACTER
 53
 54  00022  6046           TLS                     /PRINT THE CHARACTER
 55  00023  6041           TSF                     /TEST AND SKIP ON FLAG
 56  00024  5223           JMP     .-1             /STILL PRINTING ,TRY
                                                   / AGAIN
 57  00025  7200           CLA
 58  00026  5621           JMP I   PRINT           /RETURN TO CALLING PROGRAM
 59
 60
 61
 62         0000           .DSECT  DATA            /DATA SECTION TO STORE THE TEST
 63                                                /CHARACTERS ,TWO OCTAL DIGITS AND
 64                                                /TWO THAT ARE NOT OCTAL
 65         0000           FIELD   0               /THIS SECTION IS ALSO IN FIELD 0
 66
 67  00000  0257   BUFFER, "/                      /NOT-OCTAL
 68  00001  0260           "0                      /OCTAL
 69  00002  0267           "7                      /OCTAL
 70  00003  0270           "8                      /NOT-OCTAL
 71  00004  0000           0                       /THE .RSECT TESTP TESTS EACH
 72                                                /CHARACTER ,WHEN IT FINDS THE ZERO
 73                                                /IN THE BUFFER ,IT STOPS TESTING AND
 74                                                /RETURNS TO MONITOR. (SEE LINES 11 & 13
                                                   /ABOVE)

     Figure 1-2  Example of Communication Between Program Sections

1.5.4.2 Program Operation - Program operation is described below. All references are to the line numbers in the listing.

Because TESTP calls two subroutines, program execution begins at line 5, jumps to line 33, continues through line 49, and then returns to TESTP. After the return from the first subroutine, execution jumps to line 52, proceeds through line 58, and again returns to the calling program TESTP.
The second return is to line 9 and the program continues through lines 10, 11, 12 and 13.

Line 5 is the start of the program and calls the subroutine SKIFOC. Execution then continues on line 33 which stores the return address for SKIFOC. The subroutine is complete in lines 37 through 49. SKIFOC gets the address of the buffer (TAD I BLOCAD, line 38), stores it at VBLOC (line 39) then gets the contents of that address (line 40 and again line 44) and checks to see if the character (contents of VBLOC) is in the range ASCII 260 to ASCII 267 (lines 41 and 42 and again lines 45 and 46). If the character is in this range SKIFOC increments the return address stored at line 33 and returns to TESTP at line 7. If the character is out of range (i.e., not octal), SKIFOC does a normal return to TESTP at line 6.

If the return to TESTP is to line 6, the accumulator is loaded with the literal at relative location 00176 (-5, the difference between ASCII N and ASCII S) and then summed with line 7 (ASCII S) to produce the ASCII code for N. If the return is to line 7 the accumulator is loaded with only the literal at location 175, the ASCII value of S.

TESTP now jumps to the subroutine PRINT. PRINT is located at relative address 21 (octal) in the .RSECT SUBS (line 52). This address is stored as a literal at address 00174 and marked with an asterisk to show that it will be relocated at link time. PRINT sends the contents of the accumulator to the terminal and then returns to the calling program.

TESTP now increments the pointer to the buffer (line 9), obtains the contents of the buffer (line 10), and checks for zero (line 11). If the buffer now has a zero value (end of test), TESTP exits to the monitor (line 13); if the buffer has another character, TESTP starts over again (at line 5) with the JMP START instruction on line 12.

When run, the program produces the following output:

     NSSN
     .

This indicates that the first and last characters are not octal digits, while the middle two are.
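The logic of the sample program can be modeled in a few lines of Python. The sketch below is a hedged illustration of the behavior described above, not a translation of the PDP-8 code; the names `is_octal_digit` and `run` are hypothetical. The buffer contents and the two literals (0323 and 7773) are taken from the listing in Figure 1-2.

```python
BUFFER = [0o257, 0o260, 0o267, 0o270, 0]  # "/", "0", "7", "8", terminating zero

ASCII_S = 0o323          # literal at relative location 00175
N_MINUS_S = 0o7773       # literal at relative location 00176 (-5 in 12 bits)


def is_octal_digit(code):
    """Model the SKIFOC range test: ASCII 260 through 267 (octal)."""
    return 0o260 <= code <= 0o267


def run(buffer):
    """Model TESTP: test a character, print N or S, stop at the zero word."""
    printed = []
    index = 0
    while True:
        accumulator = 0
        if not is_octal_digit(buffer[index]):
            accumulator = (accumulator + N_MINUS_S) & 0o7777   # line 6
        accumulator = (accumulator + ASCII_S) & 0o7777         # line 7
        printed.append(chr(accumulator & 0o177))               # PRINT
        index += 1                                             # line 9
        if buffer[index] == 0:                                 # lines 10-11
            return "".join(printed)


print(run(BUFFER))  # prints NSSN
```

The 12-bit addition on line 6 followed by line 7 gives 7773 + 0323 = 0316 (octal), the code for N, which matches the description of the two return points.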
The period is printed by the OS/8 Keyboard Monitor upon completion. 1.5.4.3 Effects of Relocation on the Program - Although this sample program consists of three sections, its construction is identical to a program that is located in three pages of one absolute section. In fact, you could replace the three sets of program directives and field statements by three PAGE directives and program operation would be virtually identical. The following is a brief review of the program in terms of communication between program sections. (You can compare this description with interpage communication within a single absolute program section.) On line 5, there is an indirect JMS to an address (SKIFOC) in another section. This is handled by the literal at address 00177 (below line 19). This literal shows up in the listing as 0004. The actual memory address of SKIFOC is entered into this location at LINK time. Because SKIFOC is at relative address 4 in the section SUBS, the actual value of the literal will be computed as 4 (its place in SUBS) plus SUBS (the loading address of the beginning of the section). The same principle applies to line 8, JMS I (PRINT, where PRINT is another subroutine in SUBS. Here the literal is stored in address 174, and has the value of 0021*. This means that the actual value will be altered to 21 (it's the 21st entry in SUBS) plus SUBS (the loading address of the beginning of the section). The location BLOC (line 15) contains a pointer to BUFFER in the DATA section. Again, it contains the address of the beginning of the DATA section at link time. When TESTP jumps to SKIFOC and PRINT, it stores the appropriate return address in the entry word of each routine. The return to the calling program (even though it's in another section) is handled normally by the hardware; the program uses the usual indirect JMP to the entry. On line 28, the section SUBS has a pointer to BLOC in TESTP. The code shows a 0011*. 
Again, this is modified at link time by adding 11 (its position in TESTP) to the value of TESTP. Once the address of BLOC is known, the actual communication between the sections is easily initiated.

Note that although we used only JMS and TAD instructions to communicate between sections in this sample program, the same principles apply to any memory reference instructions used between sections.

1.5.4.4 The Symbol Table - The symbol table for this example contains two types of entries: section names and labels. The numbers next to the section names (DATA, SUBS, and TESTP) indicate the size of the sections. Notice that although TESTP contains fewer lines of code than SUBS, TESTP shows up as a larger section (octal 200 as opposed to octal 27) because it uses literals, which load from the end of the page. Thus the entire page is effectively used. (LINK will not load one section into a gap in another.)

The remaining entries in the symbol table are labels, and in each case, the entry consists of an octal number that shows the relative location in the section, a plus sign, and the name of the section in which they appear.

     / A PROGRAM TO TEST "SKIFOC"

     SYMBOL TABLE

     BLOC    0011+ TESTP
     BLOCAD  0002+ SUBS
     BUFFER  0000+ DATA
     DATA    0006  DSECT
     NEG0    0000+ SUBS
     NEG7    0001+ SUBS
     PRINT   0021+ SUBS
     SKIFOC  0004+ SUBS
     START   0000+ TESTP
     SUBS    0027  RSECT
     TESTP   0200  RSECT
     VBLOC   0003+ SUBS

The symbolic name of the section has a value equal to the amount of its relocation offset calculated at LINK time. Thus, if TESTP loads into location 200, its value at link time will be 200, and BLOC (shown as 0011+ TESTP) will have a value of 11 + TESTP, or 211. That is, the location BLOC has an actual memory address of 211. The same principle applies to all other labels in the symbol table.

1.5.5 Relocation Type

Absolute program sections may only contain expression and symbol values that are known at assembly time.
Relocatable program sections may contain expression and symbol values that are known either at assembly time or at LINK time. There are five types of relocation associated with them:

     o  Absolute

     o  Simple Relocation

     o  CDF/CIF Relocation

     o  .FSECT Relocation

     o  Complex Relocation

1.5.5.1 Absolute Expressions - An absolute expression is evaluated as a fixed 12-bit number during assembly. For example, the expression N= "A+1 causes "A+1 to be evaluated and the result assigned to N. N, then, has a value of 301 (ASCII A) + 1 or an absolute value of 302.

1.5.5.2 Simple Relocation Expressions - A simple relocation expression consists of an absolute part plus one relocatable part that must be evaluated at link time. A label in an .RSECT is a good example. If the following example is part of a relocatable section:

             .RSECT  ANA
             CLA
             TAD     B
             TAD     C
     LOOP,   TAD     A
             .
             .
             .

ANA has an absolute part of zero and a relocatable part to be evaluated at link time. LOOP has an absolute part of 3 (location 0003 in .RSECT ANA) and a relocatable part of ANA. Thus LOOP appears in the symbol table as 0003 + ANA.

1.5.5.3 CDF/CIF Relocation Expressions - An expression is considered CDF/CIF relocatable whenever it uses a value that results from using either the CDF or CIF special operators (see Section 4.9.4).

1.5.5.4 .FSECT Relocation Expressions - An expression is considered .FSECT relocatable whenever it uses a value that results from a relocatable expression residing in an .FSECT, .XSECT, or .ZSECT program section (see Section 7.1).

1.5.5.5 Complex Relocation Expressions - An expression is complex relocatable when it cannot be reduced during assembly to one of the relocation types described in the preceding sections. For example, continuing with the previous program segment, suppose the code LOOP %4 appeared as follows:

             .RSECT  ANA
             CLA
             TAD     B
             TAD     C
     LOOP,   TAD     A
             .
             .
             .
     A,      LOOP %4
             .
             .
             .

The expression at location A is complex relocatable because its value is 0003 + ANA divided by 4, which MACREL cannot reduce to a simple relocation expression at assembly time. The entire expression, LOOP %4, is passed on intact for LINK to evaluate.

An expression need not look complicated to be complex relocatable. For example, B=LOOP+ANA is a complex relocatable expression because it cannot be evaluated at assembly time to an absolute portion and one relocatable part. Here, B would evaluate to 3+<2^ANA> (i.e., 3+ANA+ANA) and twice ANA is not one relocatable part. On the other hand, B=LOOP-ANA is not a complex relocatable expression; in fact, B evaluates to the absolute quantity 3 (3+ANA-ANA).

In general, complex, CDF/CIF, and .FSECT relocatable expressions should be of little concern unless you have unusual code constructions. Relocation is covered in greater detail in Chapter 7.

1.6 THE ASSEMBLY PROCESS

The MACREL assembler performs a maximum of four passes through the source module. The number of passes is determined by the number of output files in the file-specification line. (Pass one is always run -- providing an input file is specified.) See Section 1.4 for a list of the file types.

1.6.1 Pass One -- Symbol Definition Pass

On its first pass, MACREL constructs the symbol table. The symbol table contains both permanent symbols and program-defined symbols. Each symbol stored is identified according to type, whether or not it is defined, and (if defined) its value. Additional codes indicate whether it has been declared as .EXTERNAL, .GLOBAL, .ENTRY, etc. If the symbol is a macro name, the entire text of the macro is stored in the symbol table area as well. Because source code lines are read and immediately processed (i.e., not retained in memory), the symbol table requires a large number of memory locations. Thus, a large program being assembled on a machine with insufficient memory could exceed the available symbol table space.
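The idea of a symbol definition pass can be sketched as a single walk over the source with a location counter. The Python below is a simplified illustration, not MACREL's actual algorithm: the function `pass_one` and the pre-split (label, statement) input form are hypothetical, every statement is assumed to generate exactly one word, and only the asterisk directive is recognized. The input is the absolute example from Section 1.5.3.

```python
SOURCE = [
    (None, "*200"),
    (None, "CLA"),
    (None, "TAD B"),
    (None, "TAD I BPOINT"),
    (None, "JMP .+3"),
    ("B", "7"),
    ("BPOINT", "B"),
]


def pass_one(source, default_origin=0o200):
    """Record the address of every label; generate no code."""
    symbols = {}
    location = default_origin        # default loading address for absolute code
    for label, statement in source:
        if statement.startswith("*"):
            location = int(statement[1:], 8)   # asterisk sets the location counter
            continue
        if label is not None:
            symbols[label] = location
        location += 1                # assumption: one word per statement
    return symbols


for name, value in sorted(pass_one(SOURCE).items()):
    print(name, format(value, "04o"))
# prints:
# B 0204
# BPOINT 0205
```

Forward references such as TAD B on the third line are not looked up during this walk, which is why a complete table is available only at the end of the pass.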
In addition to creating the symbol table, the assembler performs its normal algorithm of processing expressions, directives, etc. Thus, most error messages of a syntax nature are generated on pass one. However, no undefined symbol error messages are printed on pass one since the symbol table is not complete until the end of the pass. 1.6.2 Pass Two -- Binary Code Generation Pass If you specify an output file (relocatable binary file, default extension .RB), the assembler starts generating binary code. MACREL reads a line of source code, looks up any symbols in the symbol table, evaluates expressions, and writes the resultant code into the output file. In the case of directives, it performs the action specified by the directive. If you do not specify any output files, the assembler performs error checking on pass one, and then returns to the OS/8 Keyboard Monitor. On this second pass, all symbols must be defined; any undefined or illegally redefined symbol produces an error message. At the end of the pass, all symbols that have been defined by a declaration such as .ENTRY, .EXTERNAL, .GLOBAL and the like, or that are the names of sections are written out in a block of code. This block of code, called the Loader Symbol Dictionary (LSD), is used by LINK to define all global symbols. In addition to global symbols, the LSD contains the size of each section in the module, which LINK uses to determine loading addresses. 1.6.3 Pass Three -- Listing Pass If you specify the second output file (the listing file, default extension .LS), MACREL performs pass three. This pass is essentially the same as pass two but instead of writing binary code, MACREL produces the listing file. At the end of the listing file, the assembler outputs the symbol table as part of the listing. 1.6.4 Pass Four -- Cross-Reference (KREF) Listing Pass If you specify the third output file (the KREF file, default extension .KF), MACREL makes a fourth pass to create the KREF file (cross-reference listing). 
Note that this is different from PAL8's CREF file. The KREF listing tabulates program-defined symbols alphabetically and, after each symbol, the line numbers of every line that references that symbol.

1.7 THE LINKING PROCESS

The output of one assembly operation is a relocatable binary module in file form. LINK takes up to 128 such files, combines them, and prepares a memory-image file that is ready to load into memory. If desired, LINK also loads the file into memory and starts it. LINK performs this operation in two passes.

1.7.1 Pass One -- Linking

On the first pass, LINK creates a Global Symbol Table (GST), which is a combination of all the Loader Symbol Dictionaries (LSDs). In particular, LINK searches through the LSDs of the input files looking for unresolved references. For every symbol in a source module that is defined by an .EXTERNAL or .ZTERNAL directive, LINK searches other modules for .GLOBAL or .ENTRY declarations of the same symbol. As each match is made, the symbol is defined and the reference is satisfied. This process is called linking.

LINK continues this process until all .EXTERNAL or .ZTERNAL references are resolved. Normally, this will occur by the end of pass one. If, however, reference is made to programs on a library file, several passes may be required to resolve all references. This is because LINK is more selective when referring to a library file. It selects only modules from the file that are actually referenced, rather than loading everything. However, one library program may reference another one, and so on in turn, and a number of passes may be required to resolve all references.

At the end of this process (logical pass one) LINK will have resolved all references unless there is an error. The resulting Global Symbol Table (GST) is then used for pass two.

1.7.2 Pass Two -- Loading

Having defined all symbols, LINK allocates memory.
All of the section references in the table are sorted, primarily by the size of the program section. Memory is allocated, working from the beginning of the table (largest section) to the end. LINK performs an algorithm that ensures that sections are allocated correctly. In some cases, this may mean a considerable change from the original "largest first" order.

The result is that, at the end of this operation, all sections have been allocated memory space. In particular, the actual loading addresses of section names are determined. For example, up until now, in the section .RSECT MAIN, MAIN had a value (address) of 0+MAIN. Now, MAIN will be given an actual numerical value.

Having allocated memory, LINK writes the memory-image file (default extension .SV). LINK also reads the relocatable binary modules again, but this time, as each undefined symbol in the file is referenced, it looks up the symbol in the Global Symbol Table, adds the absolute part, and inserts the correct value. This is the relocation function of LINK. For example:

             .RSECT  MAIN
     MPTR,   NUM1
             .
             .
             .
             TAD I   MPTR
             .
             .
             .
             .RSECT  MATH
             CLA
             TAD     A
             JMP     .+4
     NUM1,   0
             .
             .
             .

The line MPTR, NUM1 in the .RSECT MAIN contains a 12-bit pointer to NUM1 in .RSECT MATH. Prior to this time, this pointer word contained the simply-relocatable value MATH+3. Now, if the section MATH has been determined to load at location 4200 (i.e., MATH=4200), when LINK reads the MATH+3 reference in the relocatable binary module, it looks up MATH in the table, adds 3, and stores the value 4203 into the memory-image file.

LINK continues in this way until the entire file has been written. The memory-image file contains both addresses and the code to be loaded into those addresses. You can load the memory-image file into memory using either the OS/8 Keyboard Monitor's R command, or LINK's /G command string option.
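The two linking passes described in Section 1.7 can be outlined in Python. This is a minimal sketch under stated assumptions, not LINK's implementation: the dictionary layout of a module and the functions `build_gst` and `resolve` are hypothetical, the duplicate-definition check is an assumption, and memory allocation is not modeled (the load address MATH=4200 is simply taken from the example above).

```python
def build_gst(modules):
    """Pass one: merge global definitions and check every external reference."""
    gst = {}
    for module in modules:
        for symbol, definition in module["globals"].items():
            if symbol in gst:   # assumption: a second definition is an error
                raise ValueError(symbol + " is defined more than once")
            gst[symbol] = definition
    unresolved = sorted(
        symbol
        for module in modules
        for symbol in module["externals"]
        if symbol not in gst
    )
    if unresolved:
        raise ValueError("unresolved references: " + ", ".join(unresolved))
    return gst


def resolve(symbol, gst, load_addresses):
    """Pass two: section load address plus the symbol's absolute part."""
    section, absolute_part = gst[symbol]
    return (load_addresses[section] + absolute_part) & 0o7777


# NUM1 is the fourth word of section MATH, that is, MATH+3.
MODULES = [
    {"globals": {"NUM1": ("MATH", 3)}, "externals": []},
    {"globals": {}, "externals": ["NUM1"]},
]
LOAD_ADDRESSES = {"MATH": 0o4200}

gst = build_gst(MODULES)
print(format(resolve("NUM1", gst, LOAD_ADDRESSES), "04o"))  # prints 4203
```

Keeping the symbol as a (section, absolute part) pair until the load addresses are known mirrors the way a simply-relocatable value such as MATH+3 is carried through to pass two.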
CHAPTER 2

MACREL SOURCE PROGRAM FORMAT

2.1 MACREL STATEMENTS

Source programs are usually prepared on the console terminal (using an OS/8 editor) as a sequence of statements. Each statement is written on a single line and is terminated by typing the RETURN key. The MACREL line buffer can store 128 characters including the carriage return.

There are three types of elements in a MACREL statement line. They are identified by the order of their appearance in the statement and by the separating (or delimiting) character following or preceding the element. These are:

     1. Label
     2. Instruction, directive, or data
     3. Comment

A statement must contain at least one of these elements and may contain some combination of them. Any combination must be in the order given, and the elements may be separated from each other by any number of spaces or tabs.

The assembler interprets and processes the statements, generating one or more binary instructions or data words, or performing an assembly process.

2.1.1 Labels

A label is the symbolic name created to identify the location of a statement in the program. If present, the label is written first in a statement. It must be a legal symbol name and be terminated by a comma or colon. Furthermore, there must be no intervening spaces between any of the characters and the comma or colon.

2.1.2 Instructions, Directives or Data

An instruction may be one or more of the mnemonic machine instructions explained in Chapter 3. Directives are direct instructions to the MACREL assembler to perform certain functions. Since directives are instructions to the assembler itself, they generally do not create code.

If this element of the statement contains only an expression, it is evaluated by MACREL and stored in memory as data.

2.1.3 Comments

You may add notes or comments to a statement by separating these from the remainder of the line with a forward slash (/).
Such comments do not affect assembly processing or program execution but are useful in the program listing for later analysis or debugging. The assembler ignores everything from the slash to the next carriage return.

It is also possible to have only a carriage return on a line. This causes a blank line in the assembly listing. No error message is given.

2.2 FORMAT EFFECTORS

You can use the characters described below to control the format of an assembly listing. They allow a neat, readable listing to be produced by providing a means of spacing through the program.

2.2.1 Form Feed

The form feed code causes the assembler to output blank lines in order to skip to a new page in the output listing; this is useful in creating a page-by-page listing. The form feed is generated by typing a CTRL/L on the console terminal.

2.2.2 Tabulations

Tabulations are used in the body of a source program to separate fields into columns. For example, a line written:

     GO,TAD TOTAL/MAIN LOOP

is much easier to read if tabs are inserted to form:

     GO,     TAD     TOTAL   /MAIN LOOP

2.2.3 Statement Terminators

You can use the RETURN key to terminate a statement and to cause a carriage return/line feed combination to occur in the listing. The semicolon (;) may also be used as a statement terminator and is considered identical to a carriage return except that it will not terminate a comment. For example, in the following line:

     TAD A   /THIS IS A COMMENT; TAD B

the entire expression between the slash and the carriage return is considered a comment. Thus, in this case the assembler ignores the TAD B.

CHAPTER 3

THE PDP-8 MACHINE INSTRUCTION SET

This chapter describes the three general classes of computer instructions and the way in which they are used in programs. The first class of instruction operates upon data that is stored in some memory location and must tell the computer where the data is located in memory so that the computer can find it.
This type of instruction is said to reference a location in memory; therefore, it is called a memory reference instruction (MRI).

When speaking of memory locations, it is very important that you make a clear distinction between the address of a location and the contents of that location. A memory reference instruction refers to a location via a 12-bit address. The instruction causes the computer to act on the contents of that location. However, although the address of a specific location in memory remains the same, the contents of the location are subject to change. In summary, a memory reference instruction uses a 12-bit address value to refer to a memory location, and it operates on the 12-bit binary number stored in the referenced memory location.

The second class of instruction consists of the operate microinstructions, which perform a variety of program operations without any need for reference to a memory location. Instructions of this type are used to perform operations such as the following: clear the accumulator, test for negative accumulator, and halt program execution. Many of these operate microinstructions can be combined (microprogrammed) to increase the operating efficiency of the computer.

The third class of instruction consists of the input/output transfer (IOT) instructions. These instructions perform or aid in the transfer of information between a peripheral device and the computer memory.

3.1 MEMORY REFERENCE INSTRUCTIONS

Memory reference instructions take the form shown in Figure 3-1. Bits 0 through 2 contain the operation code of the instruction to be performed. Bit 3 tells the computer if the addressing mode is direct or indirect. Bit 4 tells the computer if the instruction is referencing the current page or page zero. This leaves bits 5 through 11 (seven bits) to specify an address.
In these seven bits, 200 octal (128 decimal) locations can be specified; the page bit increases accessible locations to 400 octal or 256 decimal. A list of the memory reference instructions and their codes is given at the end of the chapter.

           0   1   2   3   4   5   6   7   8   9   10  11
         -------------------------------------------------
         | OPERATION |   |   |                           |
         | CODES 0-5 |   |   |          ADDRESS          |
         |           |   |   |                           |
         ----|---|---|---|---|---|---|---|---|---|---|---|
                       ^   ^
  INDIRECT ADDRESSING -|   |
  MEMORY PAGE -------------|

      Figure 3-1 Memory Reference Instruction Bit Assignments

In MACREL a memory reference instruction must be followed by one or more spaces or tabs, an optional I or Z designation, and any valid expression.

3.1.1 Addressing Modes

The PDP-8 has two addressing modes, direct and indirect. Consider the following:

     TAD 40

This is a direct address statement, where 40 is interpreted as the location on page zero containing the quantity to be added to the accumulator. References to locations on the current page and page zero may be done directly. For compatibility with older paper-tape assemblers the symbol Z is also accepted as a way of indicating a page zero reference, as follows:

     TAD Z 40

This is an optional notation, not differing in effect from the previous example. Thus, if location 40 contains 0432, then 0432 is added to the accumulator.

When the symbol I appears in a statement between a memory reference instruction and an operand, the operand is interpreted as the address (or location) containing the address of the operand to be used in the current statement. Now consider the following:

     TAD I 40

This is an indirect address statement, where the contents of location 40 are used as the address of the location containing the quantity to be added to the accumulator. Thus, if location 40 contains 0432, and location 432 contains 0456, then 0456 is added to the accumulator.
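
The bit layout of Figure 3-1 and the two addressing modes can be illustrated with a short Python sketch. It is a simplified model written for this example, not part of MACREL: the function names decode_mri and operand, and the dictionary used as memory, are assumptions. The sketch ignores autoindexing (Section 3.3) and extended memory fields.

```python
# Illustrative decoder for a PDP-8 memory reference instruction (MRI).
# Bit 0 is the most significant bit of the 12-bit word, as in Figure 3-1.

MNEMONICS = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP"]  # operation codes 0-5

def decode_mri(word, current_address):
    """Return (mnemonic, indirect flag, address named by the instruction)."""
    opcode = (word >> 9) & 0o7            # bits 0-2
    if opcode > 5:
        raise ValueError("not a memory reference instruction")
    indirect = bool(word & 0o400)         # bit 3
    current_page = bool(word & 0o200)     # bit 4
    offset = word & 0o177                 # bits 5-11
    # The page bit selects the page holding the instruction, else page zero.
    page_base = (current_address & 0o7600) if current_page else 0
    return MNEMONICS[opcode], indirect, page_base | offset

def operand(memory, word, current_address):
    """Return the word the instruction finally operates on."""
    _, indirect, address = decode_mri(word, current_address)
    if indirect:
        address = memory[address]         # that location holds the real address
    return memory[address]

# The values used in the text: location 40 holds 0432, location 432 holds 0456.
memory = {0o40: 0o0432, 0o432: 0o0456}
direct = operand(memory, 0o1040, 0o200)    # TAD 40
indirect = operand(memory, 0o1440, 0o200)  # TAD I 40
print(oct(direct), oct(indirect))
```

With these values the direct form fetches the contents of location 40, while the indirect form fetches the contents of the location whose address is stored in location 40.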
                                  NOTE

     Because the letters I and Z indicate indirect addressing and a page zero reference respectively, you cannot use them to name a variable.

3.2 MICROINSTRUCTIONS

Microinstructions are divided into two groups: operate and Input/Output Transfer (IOT) microinstructions. Operate microinstructions are further subdivided into three groups: Group 1, Group 2, and Group 3. Instructions in these groups cannot be intermixed.

3.2.1 Operate Microinstructions

Group 1 microinstructions perform clear, complement, rotate and increment operations, and are designated by the presence of a 0 in bit 3 of the machine instruction word, as shown in Figure 3-2.

           0   1   2   3   4   5   6   7   8   9   10  11
         -------------------------------------------------
         |           |   |   |   |   |   |   |   |   |   |
         | 1   1   1 | 0 |CLA|CLL|CMA|CML|   |   |BSW|IAC|
         |           |   |   |   |   |   |   |   |   |   |
         ----|---|---|---|---|---|---|---|---|---|---|---|
                                           ^   ^   ^
  ROTATE AC AND L RIGHT -------------------|   |   |
  ROTATE AC AND L LEFT ------------------------|   |
  ROTATE 1 POSITION IF A 0, 2 POSITIONS IF A 1 ----|
  (BSW IF BITS 8, 9 ARE 0)

  LOGICAL SEQUENCE:
       1 - CLA, CLL
       2 - CMA, CML
       3 - IAC
       4 - RAR, RAL, RTR, RTL, BSW

      Figure 3-2 Group 1 Operate Microinstruction Bit Assignments

Group 2 microinstructions check the contents of the accumulator and link and, based on the check, continue on to or skip the next instruction. Group 2 microinstructions are identified by the presence of a 1 in bit 3 and a 0 in bit 11 of the machine instruction word. See Figure 3-3.
           0   1   2   3   4   5   6   7   8   9   10  11
         -------------------------------------------------
         |           |   |   |   |   |   |   |   |   |   |
         | 1   1   1 | 1 |CLA|SMA|SZA|SNL|   |OSR|HLT| 0 |
         |           |   |   |   |   |   |   |   |   |   |
         ----|---|---|---|---|---|---|---|---|---|---|---|
                                            ^
REVERSE SKIP SENSING OF BITS 5, 6, 7 IF SET-|

  LOGICAL SEQUENCE:
       1 (BIT 8 IS 0) - SMA OR SZA OR SNL
         (BIT 8 IS 1) - SPA AND SNA AND SZL
       2 - CLA
       3 - OSR, HLT

      Figure 3-3 Group 2 Operate Microinstruction Bit Assignments

Group 3 microinstructions reference the multiplier quotient (MQ) register. They are differentiated from Group 2 instructions by the presence of a 1 in bits 3 and 11; the other bits are part of a hardware arithmetic option. Figure 3-4 shows these bit assignments.

           0   1   2   3   4   5   6   7   8   9   10  11
         -------------------------------------------------
         | OPERATION |   |   |   |   |   |           |   |
         |  CODE 7   |   |CLA|MQA|   |MQL|           |   |
         |           |   |   |   |   |   |           |   |
         ----|---|---|---|---|---|---|---|---|---|---|---|
                       ^          \_/     \_________/  ^
         CONTAINS 1 TO |        \_________________/    |
  SPECIFY GROUP 3 -----|                 ^             |
KE8-E EXTENDED ARITHMETIC ELEMENT -------|             |
  CONTAINS A 1 TO SPECIFY GROUP 3 ---------------------|

      Figure 3-4 Group 3 Operate Microinstruction Bit Assignments

You cannot combine Group 1 and Group 2 microinstructions because bit 3 determines either one or the other. Within Group 2, there are two groups of skip instructions, the OR group and the AND group.

     OR Group     AND Group
     ________     _________

       SMA           SPA
       SZA           SNA
       SNL           SZL

The OR group is designated by a 0 in bit 8 and the AND group by a 1 in bit 8. You cannot combine OR and AND group instructions because bit 8 determines either one or the other.

If you combine legal skip instructions, it is important to note the conditions under which a skip may occur.

     1. OR Group: If you combine these skips in a statement, the inclusive OR of the conditions determines the skip. For example:

             SZA SNL

        The next statement is skipped if the accumulator contains 0000 or the link is a 1 or both.

     2. AND Group: If you combine these skips in a statement, the logical AND of the conditions determines the skip. For example:

             SNA SZL

        The next statement is skipped only if the accumulator differs from 0000 and if the link is 0.

                                  NOTE

     If you specify an illegal combination of microinstructions, the assembler will simply perform an inclusive OR between them. For example, CLL SKP is interpreted as SPA because MACREL ORs 7100 (CLL) with 7410 (SKP) to produce 7510 (SPA).

3.2.2 Input/Output Transfer Microinstructions

These microinstructions initiate operation of peripheral equipment and effect an information transfer between the central processor and the Input/Output device. (See Standard Instruction Set, Section 3.4.)

3.3 AUTOINDEXING

Interpage references are often necessary for obtaining operands when processing large amounts of data. The PDP-8 computers have facilities to ease the addressing of this data. When you indirectly address one of the absolute locations from 10 to 17 (octal), the contents of the location are incremented before being used as an address, and the incremented number is left in the location. This allows you to address consecutive memory locations using a minimum number of statements. Remember that these locations (10 to 17 on page 0) must initially be set to one less than the first desired address. Because of their characteristics, these locations are called autoindex registers. No incrementation takes place when locations 10 to 17 are addressed directly.

For example, if the instruction to be executed next is in location 300 and the data to be referenced is on the page starting at location 5000, you can use autoindex register 10 to address the data as follows:

     0276  1377          TAD C4777   /=5000-1
     0277  3010          DCA 10      /SET UP AUTO INDEX
     0300  1410          TAD I 10    /INCREMENT TO 5000
       .     .             .         /BEFORE USE AS AN
       .     .             .         /ADDRESS
       .     .             .
     0377  4777  C4777,  4777

When the instruction in location 300 is executed, the contents of location 10 will be incremented to 5000 and the contents of location 5000 added to the contents of the accumulator. When the instruction TAD I 10 is executed again, the contents of location 5001 will be added to the accumulator and so on.

3.4 STANDARD INSTRUCTION SET

The following are the most commonly used elements of the PDP-8 instruction set and are found in the permanent symbol table within the PAL8 Assembler. For additional information on these instructions and for a description of the symbols used when programming other optional devices, see The Small Computer Handbook, or the PDP-8A Miniprocessor User's Manual (available from the DIGITAL Software Distribution Center).

----------------------------------------------------------------------
  Mnemonic    Code    Operation                           Cycles

              Memory Reference Instructions

  AND         0000    Logical AND                         2
  TAD         1000    Two's complement add                2
  ISZ         2000    Increment and skip if zero          2
  DCA         3000    Deposit and clear AC                2
  JMS         4000    Jump to subroutine                  2
  JMP         5000    Jump                                1
  IOT         6000    In/Out transfer                     -
  OPR         7000    Operate                             1
----------------------------------------------------------------------

----------------------------------------------------------------------
  Mnemonic    Code    Operation                           Sequence

              Group 1 Operate Microinstructions (1 cycle)

  NOP         7000    No operation                        -
  IAC         7001    Increment AC                        3
 *BSW         7002    Byte swap                           4
  RAL         7004    Rotate AC and link left one         4
  RTL         7006    Rotate AC and link left two         4
  RAR         7010    Rotate AC and link right one        4
  RTR         7012    Rotate AC and link right two        4
  CML         7020    Complement the link                 2
  CMA         7040    Complement the AC                   2
  CLL         7100    Clear link                          1
  CLA         7200    Clear AC                            1
----------------------------------------------------------------------
  * PDP-8/A,E,F,M and VT78 only
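
The way Group 1 codes combine, and the meaning of the Sequence column, can be illustrated with a small Python model. It is a sketch written for this example rather than a description of any DIGITAL software: the names GROUP1, combine and execute_group1 are assumptions. The codes are copied from the Group 1 table above. The carry from IAC into the link is standard PDP-8 behaviour that this section does not spell out.

```python
# Group 1 operate codes (octal), copied from the table above.
GROUP1 = {
    "NOP": 0o7000, "IAC": 0o7001, "BSW": 0o7002, "RAL": 0o7004,
    "RTL": 0o7006, "RAR": 0o7010, "RTR": 0o7012, "CML": 0o7020,
    "CMA": 0o7040, "CLL": 0o7100, "CLA": 0o7200,
}

def combine(*mnemonics):
    """Microprogram several Group 1 mnemonics by ORing their codes."""
    word = 0
    for name in mnemonics:
        word |= GROUP1[name]
    return word

def execute_group1(word, ac=0, link=0):
    """Apply a Group 1 word to (AC, link) in the table's sequence order."""
    if word & 0o200:                       # sequence 1: CLA
        ac = 0
    if word & 0o100:                       # sequence 1: CLL
        link = 0
    if word & 0o040:                       # sequence 2: CMA
        ac ^= 0o7777
    if word & 0o020:                       # sequence 2: CML
        link ^= 1
    if word & 0o001:                       # sequence 3: IAC
        ac += 1
        if ac > 0o7777:                    # carry out of AC complements the link
            ac &= 0o7777
            link ^= 1
    right, left, twice = word & 0o010, word & 0o004, word & 0o002
    if right or left:                      # sequence 4: rotates
        # A word with both rotate bits set is not modelled; it is treated as right.
        for _ in range(2 if twice else 1):
            ring = (link << 12) | ac       # link and AC rotate as 13 bits
            if right:
                ring = (ring >> 1) | ((ring & 1) << 12)
            else:
                ring = ((ring << 1) | (ring >> 12)) & 0o17777
            link, ac = ring >> 12, ring & 0o7777
    elif twice:                            # sequence 4: BSW (bits 8 and 9 are 0)
        ac = ((ac & 0o77) << 6) | (ac >> 6)
    return ac, link

print(oct(combine("CLA", "CLL")))
print(execute_group1(combine("CLA", "CLL", "CML", "RTL")))
```

Because the operations run in sequence order rather than in the order written, CLA CLL CML RTL first clears AC and link, then sets the link, and only then rotates the set link bit into the accumulator.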
----------------------------------------------------------------------
  Mnemonic    Code    Operation                           Sequence

              Group 2 Operate Microinstructions (1 cycle)

  HLT         7402    Halts the computer                  3
  OSR         7404    Inclusive OR SR with AC             3
  SKP         7410    Skip unconditionally                1
  SNL         7420    Skip on non zero link               1
  SZL         7430    Skip on zero link                   1
  SZA         7440    Skip on zero AC                     1
  SNA         7450    Skip on non zero AC                 1
  SMA         7500    Skip on minus AC                    1
  SPA         7510    Skip on positive AC (zero is        1
                      positive)

             *Group 3 Operate Microinstructions

  MQA         7501    Multiplier Quotient OR into AC      2
  MQL         7421    Load Multiplier Quotient            2
  SWP         7521    Swap AC and Multiplier Quotient     3
  CLA         7601    Clear AC                            1
  NOP         7401    No operation                        -
  CAM         7621    Clear AC and MQ                     3
  ACL         7701    Load MQ into AC                     3
  CLA SWP     7721    Load MQ into AC and clear MQ        3
----------------------------------------------------------------------
  * If MQ is available in hardware.

----------------------------------------------------------------------
  Mnemonic    Code    Operation

              Combined Operate Microinstructions

  CIA         7041    Complement and increment AC
  STL         7120    Set link to 1
  GLK         7204    Get link (put link in AC, bit 11)
  STA         7240    Set AC to -1
  LAS         7604    Load AC with SR

              Internal IOT Microinstructions

  SKON        6000    Skip with interrupts on and turn them off
  ION         6001    Turn interrupt processor on
  IOF         6002    Turn interrupt processor off
  GTF         6004    Get flags
  RTF         6005    Restore flag, ION
  SGT         6006    Skip if "Greater Than" flag is set
  CAF         6007    Clear all flags

              Keyboard/Reader (1 cycle)

  KCF         6030    Clear keyboard flags
  KSF         6031    Skip on keyboard/reader flag
  KCC         6032    Clear keyboard/reader flag and AC; set reader run
  KRS         6034    Read keyboard/reader buffer (static)
  KIE         6035    Set/clear interrupt enable
  KRB         6036    Clear AC, read keyboard buffer (dynamic),
                      clear keyboard flags

              Teleprinter/Punch (1 cycle)

  TFL         6040    Set teleprinter flag
  TSF         6041    Skip on teleprinter/punch flag
  TCF         6042    Clear teleprinter/punch flag
  TPC         6044    Load teleprinter/punch and print
  TSK         6045    Skip on keyboard or teleprinter flag
  TLS         6046    Load teleprinter/punch, print, and clear
                      teleprinter/punch flag

              High Speed Perforated Tape Reader

  RPE         6010    Set Reader/Punch interrupt enable
  RSF         6011    Skip if reader flag=1
  RRB         6012    Read reader buffer and clear flag
  RFC         6014    Clear flag and buffer and fetch character

              High Speed Perforated Tape Punch

  PCE         6020    Clear Reader/Punch interrupt enable
  PSF         6021    Skip if punch flag=1
  PCF         6022    Clear flag and buffer
  PPC         6024    Load buffer and punch character
  PLS         6026    Clear flag and buffer, load buffer and punch
                      character

              Memory Extension

 *CDF         62n1    Change to Data Field n (n=00 to 07)
              62n5    Change to Data Field n (n=10 to 17)
              63n1    Change to Data Field n (n=20 to 27)
              63n5    Change to Data Field n (n=30 to 37)
 *CIF         62n2    Change to Instruction Field n (n=00 to 07)
              62n6    Change to Instruction Field n (n=10 to 17)
              63n2    Change to Instruction Field n (n=20 to 27)
              63n6    Change to Instruction Field n (n=30 to 37)
 *CDI         62n3    Change to Data and Instruction Fields n (n=00 to 07)
              62n7    Change to Data and Instruction Fields n (n=10 to 17)
              63n3    Change to Data and Instruction Fields n (n=20 to 27)
              63n7    Change to Data and Instruction Fields n (n=30 to 37)
  RDF         6214    Read Data Field
  RIF         6224    Read Instruction Field
  RIB         6234    Read Interrupt Buffer
  RMF         6244    Restore Memory Field
----------------------------------------------------------------------
  * See Figure 3-5 for mapping of extended memory bits into the instruction word.

            Field Number (0-37) = abcde

               bit 0                 bit 11
     CDF       | 110 | 01a | cde | b01 |
               -------------------------
     CIF       | 110 | 01a | cde | b10 |
               -------------------------
     CDF CIF   | 110 | 01a | cde | b11 |
               -------------------------

      Figure 3-5 Extended Memory Bit Mapping for CDF and CIF Instructions

3.5 CONSTANTS

Occasionally you may find it convenient to load the accumulator with a constant produced by combinations of microinstructions, or to produce a combination of microinstructions by loading a constant. Some common examples follow:

     Constant     Instruction
     ________     ___________

        0         CLA
        1         CLA IAC
        2         CLA CLL CML RTL
        3*        CLA CLL CML IAC RAL
        4*        CLA CLL IAC RTL
        6*        CLA CLL CML IAC RTL
      100*        CLA IAC BSW
     2000         CLA CLL CML RTR
     3777         CLA CLL CMA RAR
     4000         CLA CLL CML RAR
     5777         CLA CLL CMA RTR
     6000*        CLA CLL CML IAC RTR
     7775         CLA CLL CMA RTL
     7776         CLA CLL CMA RAL
     7777         STA (=CLA CMA)

     * Do not use these instructions in software that runs on old (non-Omnibus) PDP-8 computers.

CHAPTER 4

EXPRESSIONS AND THEIR COMPONENTS

This chapter discusses MACREL expressions and their symbolic components. Some parts of this chapter refer to symbols or expressions in relocatable code. You should, therefore, be familiar with Chapter 1 before proceeding with this chapter.

Often a major element of a MACREL statement is an expression. An expression may be composed from one or more of the following components:

     o Symbols
     o Operators
     o Special Operators
     o Numbers

Naturally, there are specific rules for the combination of the elements of statements and expressions. This chapter describes these elements and the rules for their valid combination.

4.1 MACREL CHARACTER SET

4.1.1 Alphanumeric Characters

MACREL accepts as input all of the upper case alphabetic characters A through Z, the numeric characters 0 through 9, and the characters dollar sign ($) and period (.).
Lower case alphabetic characters are permitted within a symbol and are treated as the corresponding upper case character. The first character of a symbol may not be a digit, and only the first 6 characters (not counting an initial $) of a symbol are significant.

4.1.2 Special Characters and Operators

MACREL also accepts as input certain special characters and operators, which are listed alphabetically in Table 4-1 below. All of the special characters and operators are explained and their use illustrated in various sections of this manual.

               Table 4-1 Special Characters and Operators

----------------------------------------------------------------------
  Name                Symbol    Name                          Symbol
----------------------------------------------------------------------
  Angle Brackets      < >       I                             I
  Ampersand           &         Left/Right Parentheses        ( )
  Apostrophe          '         Left/Right Square Brackets    [ ]
  Asterisk            *         Minus Sign                    -
  Back Slash          \         Percent Sign                  %
  Colon               :         Period                        .
  Comma               ,         Plus Sign                     +
  Dollar Sign         $         Semicolon                     ;
  Double Quotes       "         Up Arrow                      ^
  Exclamation Point   !         Up Arrow Double Quote         ^"
  Equal Sign          =         Z                             Z
  Forward Slash       /
----------------------------------------------------------------------

The space and tab are also legal input to MACREL; their use is described in Chapter 2.

4.2 SYMBOLS

A symbol in MACREL is a valid combination of letters, digits, and the characters period (.) and dollar sign ($). The symbol must begin with either a letter, a period, or a dollar sign. If it begins with a dollar sign the second character must not be a digit. A symbol may consist of any number of characters, but only the first six are significant. (If the first character is a dollar sign, a total of seven characters are significant.) The remaining characters are merely scanned for valid characters and ignored. Thus, the symbol NEW.VALUE is exactly equivalent to NEW.VA.
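
The naming rules above can be summarized in a short Python sketch. This is an illustrative model, not MACREL's own scanner: the function names is_valid_symbol and significant_name are assumptions. The sketch does not cover numeric local symbols such as 3$ (Section 4.2.4), and it does not model how a trailing dollar sign interacts with truncation.

```python
import re

# First character: a letter, a period, or a dollar sign not followed by a digit.
# Remaining characters: letters, digits, periods and dollar signs.
_SYMBOL = re.compile(r"(?:[A-Z.]|\$(?!\d))[A-Z0-9.$]*")

def is_valid_symbol(symbol):
    """Check a name against the rules stated in Section 4.2."""
    return _SYMBOL.fullmatch(symbol.upper()) is not None

def significant_name(symbol):
    """Return the leading characters that distinguish one symbol from another."""
    name = symbol.upper()                      # lower case is treated as upper case
    limit = 7 if name.startswith("$") else 6   # an initial $ is not counted
    return name[:limit]

print(significant_name("NEW.VALUE"))
print(is_valid_symbol("$3BUF"))
```

Two names that share the same significant part, such as NEW.VALUE and NEW.VALID, would refer to the same symbol under this model.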
If the last character in a symbol is a dollar sign, that symbol is considered to be a local symbol (see Section 4.2.4).

A symbol is terminated by any character other than those mentioned. Thus, in the expression NEW+1, the plus sign (+) terminates the symbol NEW.

Normally symbols in MACREL are unique. A macro name, for example, may not be used as an ordinary symbol. The one exception is local symbols, which need not be unique across different local symbol blocks.

The following are valid symbol names:

     DATA1
     $POINTR
     .LONGER         only .LONGE is significant
     .A....
     NEW.VL
     .A.1.3
     GET$            local symbol
     HYPOTENUSE      only HYPOTE is significant

The following are not valid symbol names:

     NU MER          symbols cannot contain embedded spaces
     1STNUM          symbols cannot begin with a digit
     $3BUF           a digit may not follow an initial dollar sign
     R:03            invalid character

                                  NOTE

     Though they are legal, you should avoid using symbols starting with period (.) or global symbols that begin with dollar sign ($), as they may be used by future system software released by DIGITAL EQUIPMENT CORP.

Internally, symbol definitions are stored in the assembler's symbol table. Thus, with the exception of a few symbols defined by directives, symbols are evaluated during the assembly process and converted to 12-bit values. LINK completes this process of evaluation, especially in the case of relocatable symbols.

4.2.1 Permanent Symbols

A number of symbols are reserved for use as machine-instruction mnemonics, directives, and special operators. These symbols are maintained in the permanent symbol table. If you attempt to redefine any of these through use of a comma (,), colon (:), or equals sign (=), an error message is produced. For example, the statement

     AND=7000

will produce an error message.

For compatibility with PAL8, the old pseudo-op PAUSE is reserved as a permanent symbol, but is otherwise ignored.
Assembler command string options allow you to add or delete the PDP-8/E symbols and the extended arithmetic instruction mnemonics. Another run-time option deletes certain redundant PAL8 pseudo-ops from the symbol table. The entire permanent symbol table, including those symbols altered by command string options, is listed in Appendix C.

You can also alter the permanent symbol table by using the EXPUNGE, FIXMRI, and FIXTAB directives (see Section 5.9).

4.2.2 Program-Defined Symbols

You must define any symbol that is not a permanent symbol by entering the symbol name as the first item (excluding any spaces or tabs) on a line, followed immediately by a comma (,), colon (:), or equals sign (=). If you refer to a symbol that is not defined in this way, and is not declared as .EXTERNAL, an undefined symbol error message is printed. For example, the code

     DCA ALPHA+5

causes an error message if ALPHA is not defined.

A symbol defined with a comma or a colon is a label and a symbol defined by the equal sign is a directly assigned symbol. Both labels and directly assigned symbols are discussed below.

In addition to these familiar ways of defining symbols, there are three other special ways of defining symbols. The first is by a program section name. The name of a program section is defined when you use it following some form of program section directive. For example:

     .RSECT GVALUE

or

     .SECT DATA1, D

define sections with names of GVALUE and DATA1 respectively. You can use these symbols in expressions, but they cannot be redefined.

Two other ways of defining symbols relate to macros. A macro name is a symbol that stands for a sequence of code, and use of the name causes that sequence of code to be inserted into the program. Thus, a macro name is an entirely different entity from an ordinary symbol. However, the use of symbols as dummy arguments in a macro definition does not define them.
Both of these special cases are discussed in detail in Chapter 6.

4.2.3 Labels

A label is a program-defined symbol terminated by a comma or, optionally, by a colon (for local symbol blocks). The comma or colon must immediately follow the symbol name. The label must be the first entry on a line (excluding spaces and tabs, and except for another label) and generally is placed at the left margin. A label is used to identify an address in memory. It is assigned the value of the current location counter.

The following two lines of code:

     *400
     STRT,   TAD 20

are equivalent to:

     *400
     STRT=.
             TAD 20

Both examples create an entry in the symbol table with the name of STRT and a value of 400 relative to the beginning of the current program section. Further, they both result in a line of code with a relative location of 400 and value of 1020. (See Sections 4.4 and 4.3.)

Since a label is a symbol and has as its value an address, it is also called a symbolic address. In programming it is important to distinguish the address from the contents of that address. Although for expediency you might write a line of code and a comment in the form:

     TAD BUFSZE      /ADD BUFSZE, THE LENGTH
                     /OF THE BUFFER.

remember that you are not actually adding BUFSZE (since that is an address) but rather the contents of BUFSZE. This distinction is especially important in expressions involving indirect addressing, literals, and computed addresses.

You cannot use a label to redefine a symbol; you must use a direct assignment statement instead. In the following code:

     *600
     INIT,   TAD B
             .
             .
             .
     INIT,   NOP

the second use of INIT will result in a redefined symbol error message. The same error message results if you attempt to use a label to redefine a symbol already defined through a direct assignment statement.

More than one label, however, may have the same value.
Thus, the code:

     *300
     NAME1,
     NAME2,
     NAME3,  0

is perfectly legal, and results in NAME1, NAME2, and NAME3 all having a value of 300. The following line:

     NAME1, NAME2, NAME3, 0

is also legal, and also results in the three labels having a value of 300.

4.2.4 Local Symbols

The local symbol feature of MACREL allows you to use a convenient symbol name in one segment of a program, and then, without conflict, use the same symbol name in another segment. You designate a local symbol by using a dollar sign ($) as the last character. Thus, DATA$ is a local symbol.

Local symbols are defined only within a local symbol block. A local symbol block extends from one label defined with a comma to the next label defined with a comma, or as far as the next program section directive or PAGE directive, or as far as the end of the program, whichever comes first. Any labels inside the local symbol block must be defined using a colon (:) rather than a comma. A colon is equivalent to a comma, except that it does not cause termination of the local symbol block.

The following is an example of a local symbol block (the code is hypothetical):

             .
             .
             .
     INR,    DCA NEW
             TAD COUNT$
             CIA
     LOOP$:  TAD OFFSET
             ISZ BCNT$
             JMP LOOP$
             JMP NEXT
     /DATA
     BCNT$:  0
     COUNT$: 0
     NEXT,   0

Notice that the local symbol block starts with the first use of a label with a comma (INR,) and terminates with the next use of a label with a comma (NEXT,). All labels inside the local symbol block must be defined with a colon.

If you refer to a local symbol outside its block and fail to define it as a local symbol in a new local symbol block, the assembler prints an undefined-symbol error message.

Local symbols do not appear in the symbol table but can be included in the cross-reference listing. In effect, they are invisible to the program outside their own local symbol block.

You can also define local symbols with a direct assignment statement.
The equals sign of the direct assignment statement does not terminate the local symbol block; only a label with a comma, a new .SECT directive, or the end of the program delimits the local symbol block.

A local symbol may also consist of one to four decimal digits followed by a dollar sign. For example, 3$ and 1234$ are both local symbols. The value of the digits must be less than 4096 (decimal) and may include leading zeros, which are not significant. The value may be an expression enclosed in angle brackets. Thus, if A=3, an angle-bracketed expression in A that evaluates to 6, followed by a dollar sign, represents the symbol 6$.

4.2.5 Backslash (\) Special Operator

You can insert numeric characters that represent the value of an expression into a symbol name by using the backslash (\) special operator. If MACREL encounters a backslash within a symbol name, it evaluates the term to the right of the backslash and inserts its value into the symbol name. For example:

     DATA\NUMB

If the value of the symbol NUMB is 2, this construction generates a symbol DATA2 that is exactly identical to a symbol DATA2 written in the program source. Moreover, it can be referenced as DATA2 by other segments of programs. Similarly, if NUMB has a value of 12, the symbol will be DATA12.

The following code:

     AP=0
     N\AP,   AP+"0
     AP=AP+1
     N\AP,   AP+"0
     AP=AP+1
     N\AP,   AP+"0

produces three labels with the following contents: N0, which contains ASCII 0; N1, which contains ASCII 1; and N2, which contains ASCII 2.

In evaluating symbols with a backslash, the assembler uses the current radix. If the radix is changed to 2, as in the following example:

     N=7
             .RADIX 2
             TAD B\N
             .
             .
             .
     B1,     0
     B10,    0
     B11,    0
     B100,   0
     B101,   0
     B110,   0
     B111,   0

the assembler will evaluate the symbol N in TAD B\N as a binary value (.RADIX 2), and the instruction causes the program to load the current contents of B111.

In evaluating the number following the backslash, leading zeros are ignored. Thus, N\0001 is evaluated as N1.
Also, if the number to the right of the backslash contains more than six significant digits, only the least significant (rightmost) six are used. This situation might occur, for example, if the current radix were 2 (binary).

You may use the backslash construction anywhere in a program and should find it particularly valuable in creating symbols within macros. Notice that all evaluation of the backslash construction occurs at assembly time. In the example shown above, the value of N must be absolute.

The backslash must be within the symbol and must be followed by a term. An expression can be used after the backslash if it is enclosed in angle brackets. Thus, A\B+C is not a legal symbol, whereas A\<B+C> is. The expression B+C could also be assigned a name by a direct assignment. Thus,

     N=B+C
     A\N

produces a legal symbol.

The backslash may be embedded in a symbol. Thus, if C=3, a construction of the form AB\<expression>D in which the bracketed expression evaluates to 5 represents the symbol AB5D.

The backslash may be nested without limit. For example, A\B\C is identical to A\<B\C>.

4.2.6 Symbols and Relocation

One function that MACREL and LINK perform (as a unit) is the reduction of symbols to 12-bit (or in the case of addresses, 17-bit) numeric values.

Although the function of the assembler is to reduce symbols to numerical values, at the end of the assembly operation many symbols are not yet evaluated. In fact, the assembler cannot evaluate symbols that are to be relocated, because the amount by which they will be relocated is determined by the loading address of the beginning of the program section. This address cannot be known until LINK runs.

For this reason, the assembler stores symbols as an absolute part and a relocatable part. The complete value of the symbol is not known at assembly time. MACREL passes any expression that is more complicated than this to LINK for resolution. For example, a symbol that requires two relocatable parts to evaluate is complex relocatable.

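The split into an absolute part and a relocatable part can be pictured with a short model. The following Python sketch is only an illustration and does not describe the actual data structures of MACREL or LINK; the names Symbol and resolve, the section name SECTA, and the load address 02000 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Symbol:
    """Assembly-time view of a symbol: an absolute part plus at most one section."""
    absolute: int                   # known at assembly time
    section: Optional[str] = None   # relocatable part; None means absolute


def resolve(symbol: Symbol, load_addresses: Dict[str, int]) -> int:
    """Link-time step: add the load address of the symbol's section, if any."""
    base = 0 if symbol.section is None else load_addresses[symbol.section]
    return symbol.absolute + base


# Hypothetical values: a label at offset 0400 in a section named SECTA,
# and a load address of 02000 chosen at link time.
label = Symbol(absolute=0o400, section="SECTA")
constant = Symbol(absolute=0o7)

load_addresses = {"SECTA": 0o2000}
print(oct(resolve(label, load_addresses)))     # 0o2400
print(oct(resolve(constant, load_addresses)))  # 0o7
```

In this model the assembler can do everything except the final addition, which is why a symbol with exactly one relocatable part can still be carried through assembly, while anything needing more than one such part has to be left for the linker.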
Symbols can be classified into five groups: absolute (relocatable part is zero), relocatable (absolute plus one relocatable part), complex relocatable (neither of the above), CDF/CIF relocatable, and .FSECT relocatable.

Up to now, the examples in this chapter have assumed that the program is one absolute section, and the symbols are all absolute. Any label, however, that is defined in a relocatable section will be simply relocatable. If the example used in Section 4.2.3 appeared in a relocatable section named RELOK, as:

     .RSECT RELOK
     *400
     STRT,   TAD 20

then the label STRT will have an absolute value of 400 and a relocatable part of RELOK. STRT will appear in the symbol table as 00400+RELOK. (Note that the assembler directive RELOC cannot be used as a program section name.) In practice, you will adjust to this difference quickly and there will be little change in the way that you write code.

4.3 DIRECT ASSIGNMENT STATEMENT

A direct assignment statement assigns a value to a symbol. The statement has two forms. The first form is:

     Symbol=Expression

There must not be any space between the symbol and the equals sign. The direct assignment statement defines the symbol as having the value of the expression on the right. For example,

     DATA=7

causes DATA to be assigned a value of 7. Notice that this is different from a label, which defines the symbol as having the value of the current address.

The second form of the direct assignment is:

     Global Symbol==Expression

Here, the double equals sign must immediately follow the symbol, and there must be no space between the two equals signs. This form of the direct assignment statement is the same as that of the single equals sign, but in addition it defines the symbol as being a global symbol. Thus,

     BUFSZE==100

is exactly the same as:

     .GLOBAL BUFSZE
     BUFSZE=100

Use of the .GLOBAL directive is discussed in Chapter 7.

When used in any program section, the expression to the right of a direct assignment statement can be any legal expression that is either simply relocatable or absolute. Thus,

     XVALUE=2^-2

is a perfectly legal statement provided that BCNT is defined in the current source module and is absolute. One forward reference to an undefined symbol is allowed, but more than one is illegal. For example,

     A=B
      .
      .
      .
     B=3

is a legal sequence of code, because although B is unknown when the assembler reaches A=B, it is known when the assembler reaches B=3. This is one forward reference. However,

     A=B
      .
      .
      .
     B=C
      .
      .
      .
     C=3

is not a legal sequence of code because there are two forward references: A cannot be known until B is known, and B cannot be known until C is known. This structure may cause your program to assemble incorrectly.

You can use a direct assignment statement to redefine a previously defined symbol, but not to redefine a macro name, a directive, a machine-instruction mnemonic, or any other permanent symbol. Use of a direct assignment statement to redefine any of these results in an error message.

In relocatable program sections, some care is required in the use of direct assignment statements. For example,

     XVALUE=2^-2

will produce a COMPLEX-RELOCATABLE ERROR message if BCNT is a label in a relocatable section, because the value of the expression containing BCNT cannot be determined until link time, though XVALUE needs a value at assembly time.

A direct assignment statement assigns to the symbol to the left of the equals sign all the characteristics of the expression on the right: its absolute part and its relocatable part. An assembly error results if the expression to the right of the equals sign is not simply relocatable, and the symbol to the left of the equals sign is not a global or external symbol.

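The rule that one forward reference works but a chain of two does not can be seen in a simplified two-pass model. The Python sketch below is an assumption-laden illustration, not MACREL's algorithm: it supposes that pass one records every symbol it can define in source order, that pass two re-evaluates each statement using the table left by pass one, and that an expression is either a number or a single symbol name. The function name assemble is hypothetical.

```python
def assemble(assignments):
    """Run two passes over (name, expression) pairs.

    An expression is either an int or the name of another symbol.
    Returns the values recorded during pass two; None marks a symbol
    whose expression was still undefined when pass two reached it.
    """
    def evaluate(expr, table):
        return expr if isinstance(expr, int) else table.get(expr)

    # Pass one: define whatever can be defined, in source order.
    table = {}
    for name, expr in assignments:
        value = evaluate(expr, table)
        if value is not None:
            table[name] = value

    # Pass two: evaluate again, starting from the table left by pass one.
    result = {}
    for name, expr in assignments:
        value = evaluate(expr, table)
        result[name] = value
        if value is not None:
            table[name] = value
    return result


one_forward = assemble([("A", "B"), ("B", 3)])
two_forward = assemble([("A", "B"), ("B", "C"), ("C", 3)])
print(one_forward)  # {'A': 3, 'B': 3}
print(two_forward)  # {'A': None, 'B': 3, 'C': 3}
```

In the second case B is not defined until pass two reaches B=C, which is after A=B has already been evaluated, so A never receives a value in this model. A real assembler may instead assign a wrong value rather than none, which matches the warning that such code may assemble incorrectly.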
4.4 CURRENT LOCATION COUNTER

The assembler maintains an internal register called the current location counter, which contains the address of the next available location. As each instruction or data word is assigned, the counter is increased by one. To set this counter, use the asterisk (*) directive (see Section 5.5).

To obtain the value of the current location counter, use the period (.) operator. You can use this permanent symbol in expressions in the same way as any other symbol. Because it is a symbol, you cannot use it next to another symbol (including an instruction mnemonic); a space or other character must intervene. For example, in the statement DCA.+3, the assembler would assume you are referring to a symbol DCA., and are trying to add 3 to it.

The period always has the value of the current location counter. Like other permanent symbols, it cannot be redefined by a direct assignment statement.

Because it has the value of the current location counter, the period itself is often called the current location counter. One of its common uses is demonstrated by a statement of the form:

     JMP .+3

This means, jump to the current location plus three. This statement assumes that you will never insert additional code between JMP .+3 and the location jumped to. If future insertion is a possibility, it is better to jump to a local symbol.

4.5 LITERALS

Literals are expressions defined by parentheses (current page literals) or square brackets (page zero literals). They are evaluated and assigned locations by the assembler.

4.5.1 Current Page Literals

Parentheses () define a current page literal. The assembler assigns such literals locations beginning at the highest location on the current page and working downward. For example, in an absolute section the following code,

     *400
             TAD (-4)
             JMS I (DECODE)
              .
              .
              .
results in two literals being generated, and if DECODE is a label at address 600, the assembly listing appears as,

        1         0400           *400
        2  00400  1377           TAD (-4)
        3  00401  4776           JMS I (DECODE)
                  -----
           00576  0600
           00577  7774

The literals are listed after the five hyphens in a program listing. The TAD (-4) line results in generating a literal at address 577 (the top of the page) that contains a minus four (7774). The TAD instruction addresses this location. The JMS I (DECODE) line generates a pointer (stored at location 576) to DECODE (which has an address 0600). The JMS instruction uses location 576 as an indirect pointer to location 0600.

In evaluating a literal, the assembler first evaluates the expression contained within the parentheses and then stores this value in the l