Links to other weeks notes [ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ] ( 8 )[ 10 ][ 11 ][ 12 ]
Booth's algorithm is a multiplication algorithm which worked for two's complement numbers. It is similar to our paper-pencil method, except that it looks for the current as well as previous bit in order to decided what to do. Here are steps
Let's look at few examples.
4 bits 0110 <- 6 x 0010 <- 2 ------------- 00000000 - 0110 -------------- 11110100 + 0110 -------------- (1) 00001100 <- 12 (overflow bit ignored) 8 bits
In Booth's algorithm, if the multiplicand and multiplier are n-bit two's complement numbers, the result is considered as 2n-bit two's complement value. The overflow bit (outside 2n bits) is ignored.
The reason that the above computation works is because
0110 x 0010 = 0110 x (-0010 + 0100) = -01100 + 011000 = 1100.
Example 2:
0010 x 0110 ------------ 00000000 - 0010 ------------- 11111100 + 0010 ------------- (1) 00001100
In this we have computed
0010 x 0110 = 0010 x ( -0010 + 1000) = - 00100 + 0010000 = 1100
Example 3, (-5) x (-3):
1011 -> -5 (4-bit two's complement) x 1101 -> -3 ----------- 00000000 - 11111011 (notice the sign extension of multiplicand) ------------ 00000101 + 1111011 ------------- 11111011 - 111011 ------------- 00001111 -> +15
A long example:
10011100 <- -100 x 01100011 <- 99 -------------------- 00000000 00000000 - 11111111 10011100 -------------------- 00000000 01100100 + 11111110 011100 -------------------- 11111110 11010100 - 11110011 100 -------------------- 00001011 01010100 + 11001110 0 -------------------- 11011001 01010100 <- -9900
Note that the multiplicand and multiplier are 8-bit two's complement number, but the result is understood as 16-bit two's complement number. Be careful about the proper alignment of the columns. 10 pair causes a subtraction, aligned with 1, 01 pair causes an addition, aligned with 0. In both cases, it aligns with the one on the left. The algorithm starts with the 0-th bit. We should assume that there is a (-1)-th bit, having value 0.
Repeat 32 times:
shift right logical (srl) 0100 ... 111 -> 00100 ... 11 1100 ... 111 -> 01100 ... 11
Arithmetic shift preserves the sign of a two's complement number, thus
shift right arithmetic (sra) 0100 ... 111 -> 00100 ... 11 1100 ... 111 -> 11100 ... 11
Shift right arithmetic performed on P is equivalent to shift the multiplicand left with sign extension of the paper-pencil calculation of earlier examples.
An example of 4-bit two's complement Booth's algorithm in hardware. Compute 2 x (-3) = - 6 or 0010 x 1101.
Iteration Step Multiplicand Product C 0 initial value 0010 (always) 0000 1101 0 1 1 Ph = Ph-M 1110 1101 0 2 arithmetic shift 1111 0110 1 2 1 Ph = Ph+M 0001 0110 1 2 arithmetic shift 0000 1011 0 3 1 Ph = Ph-M 1110 1011 0 2 arithmetic shift 1111 0101 1 4 1 do nothing 1111 0101 1 2 arithmetic shift 1111 1010 1
The result 1111 1010 is 8-bit two's complement value for -6.
In two's complement multiplication b x a, the value a is
a = -2^{31} a_31 + 2^{30} a_30 + ... + 2 a_1 + a0.The pair (a_i, a_{i-1}) and their difference, and operation are as follows.
a_i a_{i-1} (a_{i-1} - a_i) action 1 0 -1 subtract b (shifted) 0 1 +1 add b (shifted) 0 0 0 do nothing 1 1 0 do nothing
So the value computed by Booth's algorithm is
(0 - a_0) * b + (a_0 - a_1) * b * 2 + (a_1 - a_2) * b * 2^2 ... + (a_29 - a_30) * b 2^30 + (a_30 - a_31) * b * 2^31,
After some simplification, the above expression reduce to
b * (a_0 + 2 * a_1 + ... + 2^30 * a_30 - 2^31 * a_31) = b * a.
which is exactly the product of a and b.
mult $4, $5performs a 32-bit signed multiplication in hardware and put the result in two special registers hi and lo. Since the result is 64-bit in general, hi contains the upper part, and lo contains the lower part of the 64-bit result. The values can be moved to other registers by the instruction:
mflo $2 # copy content in lo to $2, lower half of the product mfhi $3 # move from hi to $3, the upper halfYou have seen
mul $2, $3, $4 # $2 = $3 * $4
This is a pseudo-instruction. It is equivalent to the two machine instructions:
mult $3, $4 # multiply $3 with $4, result in (hi, lo) mflo $2 # move lo to $2
The upper part hi is ignored. The content in hi can be used to check for overflow of multiplication. Overflow occurs if result is more than 32 bits. Overflow does not occur if $2 >=0 and hi = 0, or $2 < 0, and hi = -1.
The unsigned version of multiplication is multu
.
We begin again with the primary school division method:
1001 <- Quotient --------- Divisor -> 1000 | 1001010 <- Dividend - 1000 --------- 1010 - 1000 ------- 10 <- Remainder
The relation between those quantities is
Dividend = Quotient * Divisor + RemainderAgain, the above example can be thought of as in decimal notation, or in binary (74/8 = 9 + 2/8).
Example: Consider 4-bit division, 0111/0010 or 7 / 2 = ?
0011 ----------- 0010 | 0000 0111 ? - 001 0 ------------- 0000 0111 ? - 00 10 ------------- 0000 0111 - 0 010 --------- 0000 0011 - 0010 --------- 0000 0001
Problem Try 1011101101/1011 (binary).
This is very similar to what we do with pencil and paper. Four registers and a 64-bit ALU are needed. The initial values are as follows: The upper 32 bits (Dh) of the 64-bit register D contains the 32-bit divisor; the lower 32 bits are zero. The 64-bit remainder register R contains the dividend (the upper half is zero). Q, the quotient, contains zero. Here is the division algorithm
Repeat 32 times:
Note that quotient gets a bit 1 if the subtraction is successful. Otherwise, quotient gets a bit 0 and the subtraction has to be undone.
Similar to the case of multiplication, the divisor takes only 32 bits. Instead of shifting down the divisor D, we shift up the remainder R.
When the remainder shifts up, the newly created 0 bits is not utilized in the second version. The last modification is to put the result here instead of a separate register. So the final division algorithm looks like this.
Initial values: D contains divisor, R contains dividend (upper half of R (Rh) is zero).
Repeat 32 times:
The notation R0 denote the 0-th bit of R. Rh denotes the upper half of the 64-bit register R. After the operation, the lower half of R contains quotient, and upper half of R contains remainder.
The algorithm can be implement in software see the program divide.s.
div $4, $5which compute the quotient $4/$5, and remainder $4%$5. The quotient is put in the special register lo, and the remainder in hi. We can copy from lo or hi to general registers by
mflo $2 # $2 = $4/$5, the quotient, move from lo mfhi $3 # $3 = $4%$5, the remainder, move from hi
The unsigned version of division is divu
. It is more
convenient to use the assembly instruction in program as, e.g.,
div $2, $3, $4 # $2 = $3/$4But this is equivalent to MIPS machine instructions:
div $3, $4 # machine instruction for division mflo $2 # quotient
Note that div is used for both assembly and machine instruction depending on the number of operands.
Textbook, Computer Organization & Design, pages 205-224.