Links to other weeks notes [ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ] ( 8 )[ 10 ][ 11 ][ 12 ]

## Week 8, 1-6 September, 1997

### Booth's Algorithm

Booth's algorithm is a multiplication algorithm which worked for two's complement numbers. It is similar to our paper-pencil method, except that it looks for the current as well as previous bit in order to decided what to do. Here are steps

• If the current multiplier digit is 1 and earlier digit is 0 (i.e. a 10 pair) shift and sign extend the multiplicand, subtract with previous result.
• If it is a 01 pair, add to the previous result.
• If it is a 00 pair, or 11 pair, do nothing.

Let's look at few examples.

```           4 bits
0110   <- 6
x   0010   <- 2
-------------
00000000
-     0110
--------------
11110100
+    0110
--------------
(1) 00001100  <- 12     (overflow bit ignored)
8 bits
```

In Booth's algorithm, if the multiplicand and multiplier are n-bit two's complement numbers, the result is considered as 2n-bit two's complement value. The overflow bit (outside 2n bits) is ignored.

The reason that the above computation works is because

```      0110 x 0010 = 0110 x (-0010 + 0100) = -01100 + 011000 = 1100.
```

Example 2:

```            0010
x  0110
------------
00000000
-      0010
-------------
11111100
+    0010
-------------
(1) 00001100
```

In this we have computed

```    0010 x 0110 = 0010 x ( -0010 + 1000) = - 00100 + 0010000 = 1100
```

Example 3, (-5) x (-3):

```           1011   ->  -5  (4-bit two's complement)
x   1101   ->  -3
-----------
00000000
-  11111011     (notice the sign extension of multiplicand)
------------
00000101
+  1111011
-------------
11111011
-  111011
-------------
00001111   -> +15
```

A long example:

```                    10011100     <- -100
x     01100011     <- 99
--------------------
00000000 00000000
-  11111111 10011100
--------------------
00000000 01100100
+  11111110 011100
--------------------
11111110 11010100
-  11110011 100
--------------------
00001011 01010100
+  11001110 0
--------------------
11011001 01010100      <- -9900
```

Note that the multiplicand and multiplier are 8-bit two's complement number, but the result is understood as 16-bit two's complement number. Be careful about the proper alignment of the columns. 10 pair causes a subtraction, aligned with 1, 01 pair causes an addition, aligned with 0. In both cases, it aligns with the one on the left. The algorithm starts with the 0-th bit. We should assume that there is a (-1)-th bit, having value 0.

#### Booth's algorithm in hardware

The hardware consists of 32-bit register M for the multiplicand, 64-bit product register P, and a 1-bit register C, 32-bit ALU and control. Initially, M contains multiplicand, P contains multiplier (the upper half Ph = 0), and C contains bit 0. The algorithm is the following steps.

Repeat 32 times:

1. If (P0, C) pair is:
• 10: Ph = Ph - M,
• 01: Ph = Ph + M,
• 00: do nothing,
• 11: do nothing.
2. Arithmetic shift P right 1 bit. The shift-out bit gets into C.

#### Logical shift vs. arithmetic shift

The above mentioned shift is arithmetic shift. We have learned the logical shift. For example,
```   shift right logical (srl) 0100 ... 111    ->  00100 ... 11
1100 ... 111    ->  01100 ... 11
```

Arithmetic shift preserves the sign of a two's complement number, thus

```   shift right arithmetic (sra) 0100 ... 111    ->  00100 ... 11
1100 ... 111    ->  11100 ... 11
```

Shift right arithmetic performed on P is equivalent to shift the multiplicand left with sign extension of the paper-pencil calculation of earlier examples.

An example of 4-bit two's complement Booth's algorithm in hardware. Compute 2 x (-3) = - 6 or 0010 x 1101.

```   Iteration       Step       Multiplicand         Product   C

0           initial value     0010 (always)     0000 1101 0
1           1 Ph = Ph-M                         1110 1101 0
2 arithmetic shift                  1111 0110 1

2           1 Ph = Ph+M                         0001 0110 1
2 arithmetic shift                  0000 1011 0

3           1 Ph = Ph-M                         1110 1011 0
2 arithmetic shift                  1111 0101 1

4           1 do nothing                        1111 0101 1
2 arithmetic shift                  1111 1010 1
```

The result 1111 1010 is 8-bit two's complement value for -6.

#### Why Booth's algorithm works?

In two's complement multiplication b x a, the value a is

```   a = -2^{31} a_31 + 2^{30} a_30 + ... + 2 a_1 + a0.
```
The pair (a_i, a_{i-1}) and their difference, and operation are as follows.
```   a_i     a_{i-1}       (a_{i-1} - a_i)       action

1          0                -1              subtract b (shifted)
0          1                +1              add b (shifted)
0          0                 0              do nothing
1          1                 0              do nothing
```

So the value computed by Booth's algorithm is

```      (0 - a_0) * b
+  (a_0 - a_1) * b * 2
+  (a_1 - a_2) * b * 2^2
...
+  (a_29 - a_30) * b 2^30
+  (a_30 - a_31) * b * 2^31,
```

After some simplification, the above expression reduce to

```     b * (a_0 + 2 * a_1 + ... + 2^30 * a_30 - 2^31 * a_31)
= b * a.
```

which is exactly the product of a and b.

### Multiplication Instructions on MIPS

The MIPS machine instruction
```   mult \$4, \$5
```
performs a 32-bit signed multiplication in hardware and put the result in two special registers hi and lo. Since the result is 64-bit in general, hi contains the upper part, and lo contains the lower part of the 64-bit result. The values can be moved to other registers by the instruction:
```   mflo \$2   # copy content in lo to \$2, lower half of the product
mfhi \$3   # move from hi to \$3, the upper half
```
You have seen
```   mul \$2, \$3, \$4      # \$2 = \$3 * \$4
```

This is a pseudo-instruction. It is equivalent to the two machine instructions:

```   mult \$3, \$4   # multiply \$3 with \$4, result in (hi, lo)
mflo \$2       # move lo to \$2
```

The upper part hi is ignored. The content in hi can be used to check for overflow of multiplication. Overflow occurs if result is more than 32 bits. Overflow does not occur if \$2 >=0 and hi = 0, or \$2 < 0, and hi = -1.

The unsigned version of multiplication is `multu`.

### Division

We begin again with the primary school division method:

```                      1001      <- Quotient
---------
Divisor -> 1000  | 1001010      <- Dividend
- 1000
---------
1010
- 1000
-------
10      <- Remainder
```

The relation between those quantities is

```   Dividend = Quotient * Divisor + Remainder
```
Again, the above example can be thought of as in decimal notation, or in binary (74/8 = 9 + 2/8).

Example: Consider 4-bit division, 0111/0010 or 7 / 2 = ?

```                0011
-----------
0010  | 0000 0111
? -  001 0
-------------
0000 0111
? -   00 10
-------------
0000 0111
-    0 010
---------
0000 0011
-      0010
---------
0000 0001
```

Problem Try 1011101101/1011 (binary).

#### First version of division algorithm and hardware

This is very similar to what we do with pencil and paper. Four registers and a 64-bit ALU are needed. The initial values are as follows: The upper 32 bits (Dh) of the 64-bit register D contains the 32-bit divisor; the lower 32 bits are zero. The 64-bit remainder register R contains the dividend (the upper half is zero). Q, the quotient, contains zero. Here is the division algorithm

Repeat 32 times:

1. Divisor shift right 1 bit, D = D >> 1.
2. Subtract D from R, R = R - D.
3. Quotient shift up, Q = Q << 1; If(R>=0) Q0 = 1, else R = R + D.

Note that quotient gets a bit 1 if the subtraction is successful. Otherwise, quotient gets a bit 0 and the subtraction has to be undone.

#### Second version of the division hardware

Similar to the case of multiplication, the divisor takes only 32 bits. Instead of shifting down the divisor D, we shift up the remainder R.

#### Third version of the division hardware

When the remainder shifts up, the newly created 0 bits is not utilized in the second version. The last modification is to put the result here instead of a separate register. So the final division algorithm looks like this.

Initial values: D contains divisor, R contains dividend (upper half of R (Rh) is zero).

Repeat 32 times:

1. Shift remainder left 1 bit, R = R << 1.
2. Subtract divisor from the upper half of R, Rh = Rh - D.
3. If (R>=0) R0 = 1, else Rh = Rh + D.

The notation R0 denote the 0-th bit of R. Rh denotes the upper half of the 64-bit register R. After the operation, the lower half of R contains quotient, and upper half of R contains remainder.

The algorithm can be implement in software see the program divide.s.

#### Signed Division

For division, there is no algorithm similar to Booth's algorithm for signed integers. For two's complement numbers, we need first convert to positive numbers then apply the unsigned division algorithm And determine the sign of results afterwards.

### MIPS instructions for Division

The machine division instruction is
```   div \$4, \$5
```
which compute the quotient \$4/\$5, and remainder \$4%\$5. The quotient is put in the special register lo, and the remainder in hi. We can copy from lo or hi to general registers by
```   mflo \$2   # \$2 = \$4/\$5, the quotient, move from lo
mfhi \$3   # \$3 = \$4%\$5, the remainder, move from hi
```

The unsigned version of division is `divu`. It is more convenient to use the assembly instruction in program as, e.g.,

```   div \$2, \$3, \$4   # \$2 = \$3/\$4
```
But this is equivalent to MIPS machine instructions:
```   div \$3, \$4   # machine instruction for division
mflo \$2      # quotient
```

Note that div is used for both assembly and machine instruction depending on the number of operands.