C for The Microprocessor Engineer P2

Figure 2.2 Moving 16-bit data 'at one go'.

system operations of jumping to a subroutine and implementing an interrupt, decrement the relevant Stack Pointer before moving data.

As mentioned earlier, the Push and Pull operations allow any register or set of registers to be pushed onto or pulled off a stack at one go. This facilitates the passing of arguments to and from subroutines, and allows called subroutines to use registers without corrupting register-held data in the calling program (see Section 5.2).

Figure 2.1 shows how the post-byte is calculated for a Push or a Pull. Specifically the System stack is shown; if the User stack is being employed then U is replaced by S. Figure 2.3 shows a snapshot of memory after a Push onto the System stack. If only a subset of registers is saved, then the same order is preserved as in the diagram. The time taken for a Push or Pull is five cycles plus one cycle per byte moved. In Fig. 2.3 this adds up to 17 cycles.

The 6809 implements the normal Add and Subtract operations, as shown in Table 2.2, both with and without carry, targeted on an 8-bit Accumulator. An Accumulator_D-based 16-bit Add and Subtract instruction is also provided, but unfortunately not with a carry. An unsigned addition of Accumulator_B to the 16-bit X Index register (ABX) can also be classed as double, but the 8-bit addend is promoted to 16-bit at addition time, by assuming an upper byte of zero, hence the terminology unsigned. Thus, for example, with 56h in Accumulator_B, ABX actually adds 0056h to X.

It is possible to promote a signed number in Accumulator_B to its 16-bit equivalent in Accumulator_D by using the Sign EXtension (SEX) instruction. This zeros Accumulator_A if bit 7 of B is 0 and fills A with ones if bit 7 of B is 1.

Figure 2.3 Stacking registers in memory using PSH and PUL. Also applicable to IRQ and NMI interrupts.
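The timing rule quoted above (five cycles plus one per byte moved) can be checked with a small C sketch. The bit assignments are the standard 6809 PSHS/PULS post-byte layout as I understand it, not a reproduction of Fig. 2.1, and the names `PB_*`, `stack_bytes` and `stack_cycles` are hypothetical, introduced only for illustration.

```c
#include <stdint.h>

/* Assumed post-byte layout for PSHS/PULS: one bit per register.
   For PSHU/PULU, bit 6 selects S instead of U. */
enum {
    PB_CC = 0x01, PB_A = 0x02, PB_B = 0x04, PB_DP = 0x08,  /* 8-bit registers  */
    PB_X  = 0x10, PB_Y = 0x20, PB_US = 0x40, PB_PC = 0x80  /* 16-bit registers */
};

/* Number of bytes moved for a given post-byte (hypothetical helper). */
unsigned stack_bytes(uint8_t postbyte)
{
    unsigned bytes = 0;
    for (unsigned bit = 0; bit < 8; bit++) {
        if (postbyte & (1u << bit)) {
            /* Bits 0-3 name single-byte registers, bits 4-7 double-byte ones. */
            bytes += (bit < 4) ? 1u : 2u;
        }
    }
    return bytes;
}

/* Five cycles of overhead plus one cycle per byte moved. */
unsigned stack_cycles(uint8_t postbyte)
{
    return 5u + stack_bytes(postbyte);
}
```

With every bit set the sketch counts 12 bytes, giving the 17 cycles quoted for Fig. 2.3; saving only A and X (`PB_A | PB_X`) would move 3 bytes and take 8 cycles.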
Table 2.2 Arithmetic operations.

  Operation         Mnemonic      V  N  Z  C   Description
  Add                             √  √  √  √   Binary addition
    to A; to B      ADDA; ADDB    √  √  √  √   [A] ← [A]+[M]; [B] ← [B]+[M]

calculates the effective address as [X] + 1 and loads it into the X Index register ([X] ← [X] + 1).

Table 2.3 Shifting instructions.

  Operation                          Mnemonic      V  N  Z  C    Description
  Shift left, arithmetic or logic                                Linear shift left into carry
    memory                           ASL           1  √  √  b7   C ← b7...b0 ← 0
    A; B                             ASLA; ASLB    1  √  √  b7
  Shift right, logic                                             Linear shift right into carry
    memory                           LSR           •  √  √  b0   0 → b7...b0 → C
    A; B                             LSRA; LSRB    •  √  √  b0
  Shift right, arithmetic                                        As above but keeps sign bit
    memory                           ASR           •  √  √  b0   b7 → b7...b0 → C
    A; B                             ASRA; ASRB    •  √  √  b0
  Rotate left                                                    Circular shift left into carry
    memory                           ROL           1  √  √  b7   C ← b7...b0 ← C
    A; B                             ROLA; ROLB    1  √  √  b7
  Rotate right                                                   Circular shift right into carry
    memory                           ROR           •  √  √  b0   C → b7...b0 → C
    A; B                             RORA; RORB    •  √  √  b0

  Note 1: V = b7⊕b6 before shift.

Circular or Rotate Shift instructions are similar to Add with Carry, in that they can be used for multiple-precision operations. A Rotate takes in the Carry from any previous Shift and in turn saves its ejected bit in the C flag. As an example, a 24-bit word stored with bits 23 to 16 in M, bits 15 to 8 in M+1 and bits 7 to 0 in M+2 can be shifted right once by the sequence [4]:

      LSR  M      ; 0 → [M] → C, so b16 ends up in C
      ROR  M+1    ; C (b16) → [M+1] → C, so b8 ends up in C
      ROR  M+2    ; C (b8) → [M+2] → C, so b0 ends up in C

In all types of Left Shifts, the oVerflow flag is set when bits 7 and 6 differ before the shift (i.e. b7⊕b6), meaning that the (apparent) sign will change after the shift.

The logic operations of AND, OR, Exclusive-OR and NOT (Complement) are provided, as shown in Table 2.4. The only unusual feature here is the special instructions of ANDCC and ORCC for clearing or setting flags in the Condition Code register (CCR). Thus to clear the I mask (see Fig.
1.1) we have:

      ANDCC #11101111b ; Coded as 1C-EFh (equivalent to CLI)

and to set it:

      ORCC  #00010000b ; Coded as 1A-10h (equivalent to SEI)

This saves having to provide a series of separate instructions targeted at each of the CCR flags and masks, such as the 6800's CLI and SEI (CLear and SEt Interrupt mask), and also allows more than one flag to be set or cleared in a single instruction.

Table 2.4 Logic instructions.

  Operation         Mnemonic      V  N  Z  C   Description
  AND                                          Logic bitwise AND
    A; B            ANDA; ANDB    0  √  √  •   [A] ← [A]·[M]; [B] ← [B]·[M]

Table 2.5 Data test operations.

  Operation                 Mnemonic      V  N  Z  C   Description
  Bit Test                                             Non-destructive AND
    A; B                    BITA; BITB    0  √  √  •   [A]·[M]; [B]·[M]
  Compare                                              Non-destructive subtract
    with A; B               CMPA; CMPB    √  √  √  √   [A]−[M]; [B]−[M]
    with D                  CMPD          √  √  √  √   [D]−[M:M+1]
    with X; Y               CMPX; CMPY    √  √  √  √   [X]−[M:M+1]; [Y]−[M:M+1]
    with S; U               CMPS; CMPU    √  √  √  √   [S]−[M:M+1]; [U]−[M:M+1]
  Test for Zero or Minus                               Non-destructive subtract from zero
    memory                  TST           0  √  √  •   [M]−00
    A; B                    TSTA; TSTB    0  √  √  •   [A]−00; [B]−00

      ANDB #00100000b ; Clear all Accumulator B bits except 5 {C4-20h}

will set the Z flag if bit 5 is 0, otherwise Z will be cleared. Once again this is a destructive examination, and the equivalent from Table 2.5 is BIT test; thus:

      BITB #00100000b ; Coded as C5-20h

does the same thing, but with the contents of Accumulator_B remaining unchanged; and more tests can subsequently be carried out without reloading.

Comparison of the magnitude of data in an Accumulator with either a constant or data in memory requires a different approach. Mathematically this can be done by subtracting [M] from [A] and checking the state of the flags. Which flags are relevant depends on whether the numbers are to be treated as unsigned (magnitude only) or signed.
Taking the former first gives:

  [A] Higher than [M] : [A]−[M] gives no Carry and non-Zero; C=0, Z=0 ($\overline{C+Z}=1$)
  [A] Equal to [M]    : [A]−[M] gives Zero (Z=1)
  [A] Lower than [M]  : [A]−[M] gives a Carry (C=1)

The signed situation is more complex, involving both the Negative and oVerflow flags. Where a subtraction occurs and the difference is positive, then either bit 7 will be 0 and there will be no overflow (both N and V are 0) or else an overflow will occur with bit 7 at logic 1 (both N and V are 1). Logically, this is detected by the function $\overline{N \oplus V}$. A negative difference is signalled whenever there is no overflow and the sign bit is 1 (N is 1 and V is 0) or else an overflow occurs together with a positive sign bit (N is 0 and V is 1). Logically, this is $N \oplus V$. Based on these outcomes we have:

  [A] Greater than [M] : [A]−[M] → non-zero +ve result ($\overline{N \oplus V}\cdot\overline{Z} = 1$ or $(N \oplus V)+Z = 0$)
  [A] Equal to [M]     : [A]−[M] → zero (Z=1)
  [A] Less than [M]    : [A]−[M] → a negative result ($N \oplus V = 1$)

Subtraction is a destructive test operation and Comparison is its non-destructive counterpart. It is the most powerful of the Data Testing operations, as it can be applied to both Index and Stack Pointer registers as well as 8- and 16-bit Accumulators.

Table 2.6 Operations which affect the Program Counter.
  Operation                 Mnemonic        Description
  Branch                    Bcc; LBcc       cc is the logical condition tested
    Always (True)           BRA; LBRA       Always affirmed regardless of flags
    Never (False)           BRN; LBRN       Never carried out
    Equal                   BEQ; LBEQ       Z flag set (Zero result)
    not Equal               BNE; LBNE       Z flag clear (Non-zero result)
    Carry Set               BCS; LBCS (1)   [Acc] Lower Than (Carry = 1)
    Carry Clear             BCC; LBCC (2)   [Acc] Higher or Same as (Carry = 0)
    Lower or Same           BLS; LBLS       [Acc] Lower or Same as (C+Z = 1)
    Higher Than             BHI; LBHI       [Acc] Higher Than (C+Z = 0)
    Minus                   BMI; LBMI       N flag set (Bit 7 = 1)
    Plus                    BPL; LBPL       N flag clear (Bit 7 = 0)
    Overflow Set            BVS; LBVS       V flag set
    Overflow Clear          BVC; LBVC       V flag clear
   2's complement Branches:
    Greater Than            BGT; LBGT       [Acc] Greater Than ($\overline{N \oplus V}\cdot\overline{Z} = 1$)
    Less Than or Equal      BLE; LBLE       [Acc] Less Than or Equal ($\overline{N \oplus V}\cdot\overline{Z} = 0$)
    Greater Than or Equal   BGE; LBGE       [Acc] Greater Than or Equal ($\overline{N \oplus V} = 1$)
    Less Than               BLT; LBLT       [Acc] Less Than ($\overline{N \oplus V} = 0$)
  Jump                      JMP             Absolute unconditional goto
  No Operation              NOP             Only increments Program Counter

  Note 1: Some assemblers allow the alternative BLO.
  Note 2: Some assemblers allow the alternative BHS.

All Conditional operations in the 6809 are in the form of a Branch instruction. These cause the Program Counter to skip xx places forward or backwards, usually based on the state of the CCR flags. Excluding Branch to SubRoutine (see Section 5.1), there are 16 Branches provided, which can be considered as the True or False outcome of eight flag combinations. Thus Branch if Carry Set (BCS) and Branch if Carry Clear (BCC) are based on the one test (C =?). If the test is True, the offset following the Branch op-code is added to the Program Counter. Thus if the Carry flag is zero:

      E100:1  BCC 08h ; Coded as 24-08h

will add 0008h to the Program Counter state E102h to give PC = E10Ah. Note that the PC is already pointing to the following instruction when execution occurs, giving an effective destination of ten places on from the Branch location.
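The flag tests of Table 2.6 and the Branch destination arithmetic can be sketched in C as follows. This is only an illustrative model, not processor documentation: the `Flags` structure and the functions `cmp8`, `bhi`, `bls`, `bge`, `blt`, `bgt`, `ble` and `short_branch_target` are hypothetical names, and `cmp8` models the flags left by an 8-bit Compare under the usual 2's complement rules.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four CCR flags used by the Branches. */
typedef struct { bool n, z, v, c; } Flags;

/* Flags left by an 8-bit Compare, i.e. the subtraction a - m with the
   difference thrown away. */
Flags cmp8(uint8_t a, uint8_t m)
{
    uint8_t r = (uint8_t)(a - m);
    Flags f;
    f.n = (r & 0x80) != 0;                     /* sign bit of the difference   */
    f.z = (r == 0);
    f.c = (a < m);                             /* Carry acts as a borrow       */
    f.v = (((a ^ m) & (a ^ r)) & 0x80) != 0;   /* signed overflow on subtract  */
    return f;
}

/* Unsigned tests. */
bool bhi(Flags f) { return !(f.c || f.z); }    /* C+Z = 0 */
bool bls(Flags f) { return  (f.c || f.z); }    /* C+Z = 1 */

/* Signed (2's complement) tests; N xor V is 0 when n == v. */
bool bge(Flags f) { return f.n == f.v; }
bool blt(Flags f) { return f.n != f.v; }
bool bgt(Flags f) { return (f.n == f.v) && !f.z; }
bool ble(Flags f) { return (f.n != f.v) ||  f.z; }

/* Destination of a taken short Branch: the PC already points two bytes past
   the op-code, and the 8-bit offset is treated as a signed quantity. */
uint16_t short_branch_target(uint16_t opcode_addr, uint8_t offset)
{
    int16_t disp = (offset & 0x80) ? (int16_t)(offset - 256) : (int16_t)offset;
    return (uint16_t)(opcode_addr + 2u + (uint16_t)disp);
}
```

For instance, comparing 7Fh with 80h makes `bgt` true (+127 is greater than −128 as signed numbers) while `bhi` is false (127 is lower than 128 as unsigned numbers), and `short_branch_target(0xE100, 0x08)` gives E10Ah as in the BCC example.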
The Branch offset is sign extended before addition to the Program Counter; thus if the N flag is zero:

      E100:1  BPL F8h ; Coded as 2A-F8h

gives PC = E102h + FFF8h = E0FAh.

Table 2.7: (a) The M6809 instruction set (continued next page).
Table 2.7: (b) The M6809 instruction set (continued next page).
Table 2.7: (c) (continued) The M6809 instruction set. Reproduced by courtesy of Motorola Semiconductor Products Ltd.

inform the MPU's Control registers where this data is being held. There are a few exceptions to this, the so-called Inherent operations, such as NOP (No OPeration) and RTS (ReTurn from Subroutine). Single-byte instructions whose operand is a single register, for example INCA (INCrement accumulator A), are also sometimes classified as Inherent.

With the exception of Inherent instructions, the bytes following the op-code are either the (constant) operand itself, or more usually a pointer to where the operand can be found. We have already met the simplest of these, where the absolute address itself follows, as in:

      LDA 2000h ; [A] ← [2000h]

program would take 3072 cycles, whilst the loop equivalent takes considerably longer at 4867 cycles to execute.

In the remainder of this section, we will look at the 6809 address modes. In this catalog, op-code may be one or two bytes.

Inherent
      op-code

All the operand information is contained in the op-code, with no specific address-related bytes following. All of the 6809 inherent operations are one byte long except SoftWare Interrupts 2 and 3 (SWI2 and SWI3). An example is NOP (No OPeration). Motorola also classify most Register-Direct instructions as inherent, for example INCA (INCrement A). Table 2.7 gives the Inherent instructions.
Register Direct, R
      op-code | post-byte

Information concerning the source register(s) and/or destination register(s) is contained in a post-byte. For example TFR A,B (TransFeR the contents of A to B) is coded as 0001 1111 1000 1001b (1F-89h). The post-byte here is divided into two fields. The left field specifies the source register, and the right the destination. Each register is encoded as a 4-bit code. Thus 1000b is A and 1001b is B. A list of codes is given on page 20. The Transfer, Exchange, Push, and Pull operations come under this category. In Table 2.7 these are classified as Immediate.

Immediate, #kk
      op-code | constant (8-bit)
      op-code | constant (16-bit)

With Immediate addressing, the byte or bytes following the op-code are constant data and not a pointer to data. We have used this form of addressing before, in the array argument routine in Table 2.8. Some examples are:

      ADDB #30h   ; Add the constant 30h to Acc. B {Coded as CB-30h}
      LDX  #2000h ; Put the constant 2000h in X {Coded as 8E-20-00h}
      CMPY #21FFh ; Compare [Y] with the constant 21FFh {Coded as 10-8C-21-FFh}

The pound (hash) symbol # is commonly used to indicate a constant number.

Absolute, M
      op-code | DP offset     Short (Direct)
      op-code | Address       Long (Extended Direct)

In Absolute addressing, the address itself, either in whole or in part, follows the op-code. Motorola terms the long 16-bit address version Extended Direct. There is a short version just called Direct, where the effective address (ea) is the concatenation of the Direct Page register with the byte following the op-code. Thus if this register is set at, say, 80h, then the instruction LDA 08h, coded as 96-08h, effectively brings down the byte from address 8008h. Some assemblers have difficulty in deciding which of these forms to use. For example, in the fragment above, should the assembler generate the code B6-80-08 (LDA 8008) or 96-08 (LDA 08)?
After all, the setting of the DP register may have been altered in a call to a subroutine yet to be linked in. There are ways around this, but none is entirely satisfactory.

Absolute Indirect, [M]
      op-code | 9Fh | Pointer to address

Here the op-code is followed by a post-byte 9Fh and then a 16-bit address. This is not the address of the operand but a pointer to where the operand address is stored in memory. Thus, if the locations 2000:2001h hold the address 80-08h, then the instruction:

      LDA [2000h] ; [A] ← [[2000h]], i.e. the byte at 8008h

Branch Relative
      op-code | offset (8-bit)     Short
      op-code | offset (16-bit)    Long

We have already discussed this form of address mode in the previous section. Regular (or short) Branches sign extend the following 8-bit offset, and add this to the Program Counter. Effectively this means that offsets between 80h and FFh are treated as negative. For example the instruction BRA -06, coded as 20-FAh (FAh is the 2's complement of 06h), when the PC is at E108h, is implemented as:

        1110 0001 0000 1000   (PC) = E108h
      + 1111 1111 1111 1010   (offset) = FFFAh = −6
      ---------------------
      1 1110 0001 0000 0010   (E102h, which is E108h − 0006h)

In calculating this offset, it must be remembered that the PC is already pointing to the next instruction. Thus the maximum forward point is (00)7Fh + 2 = 127 + 2 = 129 bytes from the op-code and (FF)80h + 2 = −128 + 2 = 126 bytes back.

Long Branches have a 16-bit offset and can range from +32,767 to −32,768 bytes from the following op-code, effectively anywhere in the full 64 kbyte address space of memory that the processor can address at one time. Of course Long Branch code is bigger and slower to execute (see Table 2.7(c) under the column ~).

Indexed

The Absolute address modes are used where operands lie in fixed locations. In many cases, this places an unacceptable restriction on the data structures which can easily be processed.
Compilers, for example, like to pass parameters in a stack, and these should then be capable of being retrieved in locations relative to the Stack Pointer. The 6800 MPU has a primitive form of computed effective address (ea), where this could be up to +FFh (+255) bytes from the contents of one Index register thus:

      LDAA 8,X ; [A] ← [[X]+8]

      op-code | post-byte ±n          ±n,R (5-bit)
      op-code | post-byte | ±n        ±n,R (8-bit)
      op-code | post-byte | ±n        ±n,R (16-bit)

Here the effective address is R ± n where R is X, Y, S or U. The actual machine code produced depends on the size of n, with a single post-byte capable of integrally handling a 5-bit offset of −16 to +15. This complex encoding scheme is worthwhile, as most offsets are small; for example, an analysis has shown that 40% of this type of indexing uses a zero offset [1]. Indirect Constant Offset Index does not have a 5-bit offset version, the 8-bit variety being used. Fortunately the task of evaluating the post-byte and following bytes is handled automatically by the assembler.

Post-Auto-Increment / Pre-Auto-Decrement from Register, ,R+ / ,R++ / ,-R / ,--R
      op-code | post-byte

As we saw in the listing of Table 2.8(b), indexing comes into its own when stepping through blocks of memory, arrays and related structures. To avoid having to follow (or lead) the use of the Index register with an Increment or Decrement, this mode provides for automatic advance or retard; thus:

      LDA ,R+  ; Bring down data byte and then increment Index register R
      LDA ,R++ ; Bring down data byte and then increment Index register R twice
      LDA ,-R  ; Decrement Index register R and then bring down data byte
      LDA ,--R ; Decrement Index register R twice and then bring down data byte

where R is X, Y, S or U. Notice that incrementing is done after and decrementing before the Index register is used. Double Increment/Decrement modes are useful when the arrays contain addresses or other double-byte data.
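The ordering rule for these modes (use the register and then advance it, or retard it and then use it) can be modelled in C. This is a minimal sketch under stated assumptions: `mem` is a hypothetical 64 kbyte memory image, and `lda_post_inc` and `lda_pre_dec` are illustrative names, with `step` being 1 for the single forms (,R+ and ,-R) and 2 for the double forms (,R++ and ,--R).

```c
#include <stdint.h>

uint8_t mem[65536];   /* hypothetical 64 kbyte memory image */

/* Model of LDA ,R+ (step = 1) and LDA ,R++ (step = 2):
   the register is used as the address first and advanced afterwards. */
uint8_t lda_post_inc(uint16_t *r, unsigned step)
{
    uint8_t value = mem[*r];
    *r = (uint16_t)(*r + step);
    return value;
}

/* Model of LDA ,-R (step = 1) and LDA ,--R (step = 2):
   the register is retarded first and then used as the address. */
uint8_t lda_pre_dec(uint16_t *r, unsigned step)
{
    *r = (uint16_t)(*r - step);
    return mem[*r];
}
```

Because one mode increments after use and the other decrements before use, a value read with `lda_post_inc` can be re-read by a following `lda_pre_dec` with the same step, which is the behaviour a stack needs.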
Indirection is only available for the double forms (,R++ and ,--R), as by its nature addresses are likely to be being accessed.

As an example of these modes, consider the problem of multiplying two 256-byte arrays to give a 256 double-byte array. If array_1 begins at 2000h with the second array following directly, and the product array commences at 3000h, then we have:

            LDX  #2000h ; Point IX to array_1[0]
            LDY  #3000h ; Point IY to array_3[0]
      LOOP: LDA  256,X  ; Get array_2[i]
            LDB  ,X+    ; Get array_1[i]; increment i
            MUL         ; Multiply them
            STD  ,Y++   ; Put it away and move on twice
            CMPX #20FFh ; Last element yet?
            BLS  LOOP   ; IF not past it THEN repeat
            RTS         ; ELSE finished

Accumulator Offset from Register, A,R / B,R / D,R
      op-code | post-byte

As an alternative to a constant offset, any Accumulator can hold a variable offset to an Index register, for example:

      LDA B,X ; [A] ← [[X]+[B]]

One of the major advantages of the Relative address mode is that it produces position independent code (PIC). Thus a Branch is relative to where the program is at the time the decision is taken. If the program is moved to a different part of memory, all the offsets move with it unchanged. This is what differentiates a Branch from a Jump operation.

The Program Counter Offset mode extends the PIC capability to any instruction which has an Indexed address mode. This is similar to the Constant Offset from Register mode, but with the Program Counter being the Index register. For example in:

      LDA 200h,PC ; [A] ← [[PC]+200h]

We first met the Load Effective Address (LEA) instruction in Table 2.2. Here we observed that it could be used to perform simple arithmetic on the X, Y, U or S registers. Essentially, any effective address computed by any of the Direct Index address modes, except Post-Increment/Pre-Decrement, can be loaded into one of these four registers.
A few examples are:

      LEAX +2,X  ; The EA of X+2 is put into X, effectively incrementing X by 2
      LEAY D,X   ; Adds [D] to [X] and puts sum in Y
      LEAS -20,S ; Moves the Stack Pointer down 20 bytes

2.3 Example Programs

Previously we have used program fragments to illustrate various instruction/address mode combinations. Here we conclude our look at 6809 assembly-level software by developing three programs of a slightly more elaborate nature. This will serve to integrate at least some of the concepts we have discussed, and provide for a comparison with equivalent software using 68000 code in Chapter 4. Each program module is written in the form of a complete subroutine; that is, data is assumed present on entry in some place, usually in a register, and the module is terminated by a ReTurn from Subroutine (RTS) instruction. Subroutine structure is the subject of Chapter 5.

Implementing a software function involves developing an appropriate algorithm, writing code in a suitable language, testing and debugging. There is little that can be done to mechanize the former, as algorithms are an expression of human creativity. Once this has been done, a range of software tools, such as assemblers, linkers, compilers and simulators, exist to aid in the production of the latter phases. We will look at these in some detail in Part 2.

The most fundamental software tool is the assembler. An assembler is a program that translates, on a line for line basis, symbolically-coded native language to machine code for the target processor. This saves the error-prone tedium of working out op-codes and relative offsets. Nearly as important is the use of mnemonics for instructions and names for locations (labels). These, together with the use of comments, provide superior documentation compared to strings of hexadecimal digits (see page 168).
At this point in the text, we are only concerned to provide sufficient background to allow the reader to follow program syntax as presented in the remainder of the text. Assemblers, like any other commercial package (such as a word processor), have their own peculiar rules and peccadilloes, which have to be learnt. One common denominator is the virtually unanimous use of the processor manufacturers' standard instruction mnemonics, with minor variations. Most of the variations lie in the layout of the source code and the directives (or pseudo operators) used to pass information from the programmer to the assembler. A line of source code comprises four fields: an optional label, the essential
