Links to other weeks notes [ 2 ][ 3 ][ 4 ][ 5 ]( 6 )[ 7 ][ 8 ][ 10 ]

# Chapter 4, Arithmetic for Computers

In this chapter we shall learn
• Binary representation of integers, in particular negative numbers.
• Addition and subtraction, how they are computed by the computer.
• Arithmetic logic unit, part of the CPU responsible for computing.
• Multiplication and division, how they are implement in hardware.
• Floating point numbers, the IEEE standards.

### Base of number representation

#### Base 10 (decimal)

We use a sequence of digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 to represents numbers in our daily work, for example 1996. The number 1996 means one thousand plus nine hundred plus ninety and plus six, as you know. Written formally

```   1996 = 1 * 1000 + 9 * 100 + 9 * 10 + 6 * 1
= 1 * 10^3 + 9 * 10^2 + 9 * 10^1 + 6 * 10^0
```

Each earlier digit contributes to the value ten times larger than the next digit. The number 10 is said to be the base or radix of the representation. Such system is called a positional notation system.

#### Base 2 (binary)

It is not necessary that each digit contributes to a power of 10 according to its position in the number. We can use power of 2 instead, and with only two (binary) digits 0 and 1. So in binary 10_two is the value 2_ten. (I'll use _two to indicate binary number and _ten for decimal number). Thus

```   1011_two = 1 * 2^3 + 0 * 2^2 + 1 * 2^1 + 1 * 2^0
= 8 + 0 + 2 + 1
= 11_ten
```

It's good to know the first few values of powers of two for binary to decimal conversion:

```   2^0 = 1;           2^6 = 64;
2^1 = 2;           2^7 = 128;
2^2 = 4;           2^8 = 256;
2^3 = 8;           2^9 = 512;
2^4 = 16;          2^10 = 1024;
2^5 = 32;          2^11 = 2048.
```

#### Base 16 (hexadecimal)

Hexadecimal numbers need digits from 0 to 15, they are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F where A for 10, B for 11, and so on. In hexadecimal, each digit represents a power of 16. So

```   10FA_sixten = 1 * 16^3 + 0 * 16^2 + 15 * 16^1 + 10 * 16^0
= 1 * 4096 + 0 * 256 + 15 * 16 + 10 * 1
= 4096 + 0 + 240 + 10
= 4346_ten
```

Clearly, one can take any integer base>0 as a radix, then the i-th digit (counting from 0) contributes to the total sum by

```   d * base^i
```
where the digit is a number between 0 and base-1.

### Conversion between Representations

#### Binary to decimal

Here is one more example

```   10010011 = 2^7 + 0 + 0 + 2^4 + 0 + 0 + 2^1 + 2^0
= 128 + 16 + 2 + 1
= 147
```

Whenever there is a 1, there is a contribution of the form 2^i, i is the location of the digits counting from right to left, starting from 0.

#### Decimal to binary

Here is the rule, you may convence yourself why it works.
```   147/2 = 73 + 1/2   -> 1  (right most, least significant digit)
73/2 = 36 + 1/2   -> 1
36/2 = 18 + 0/2   -> 0
18/2 =  9 + 0/2   -> 0
9/2 =  4 + 1/2   -> 1
4/2 =  2 + 0/2   -> 0
2/2 =  1 + 0/2   -> 0
1/2 =  0 + 1/2   -> 1  (left, most significant digit)
```

So the binary representation is (reading backwards)

```   10010011
```

In summary, the number is divided by 2, and the results are the quotient and reminder; the digit in the reminder is the binary digit; the quotient is divided by 2 again, until the quotient become zero. The binary digits from least significant to most significant are found in the reminders.

#### Exercises

Convert decimal number 183 into binary; and convert binary number 1001101 into decimal.

#### Binary to hexadecimal, hexadecimal to binary

Due to the fact that 2^4 = 16, the conversion is very easy. Simply divide the binary digits into group of four (starting from the least significant bit) and use the correspondence between decimal number from 0 to 15, hexadecimal number from 0 to F, and binary four digits number 0000 to 1111,

``` decimal  hex    binary

0      0      0000
1      1      0001
2      2      0010
3      3      0011
4      4      0100
5      5      0101
6      6      0110
7      7      0111
8      8      1000
9      9      1001
10      A      1010
11      B      1011
12      C      1100
13      D      1101
14      E      1110
15      F      1111
```

For example, binary number representation <-> hexadecimal representation:

```   0001 0000 1111 1010   <->   10FA
```

### Unsigned int in MIPS

Since each bit has two possible states 0, or 1, two bits can have four different patterns, 00, 01, 10, 11, and three bits can have 8 possible values, 000, 001, 010, 011, 100, 101, 110, and 111. In general, n-bit patterns can take 2^n different forms. A 32-bit pattern (word) can be one of the 2^{32}. One way to represent 0 and positive integers (unsigned) is to use the correspondence

```   00000000 00000000 00000000 00000000 _two = 0_ten
00000000 00000000 00000000 00000001 _two = 1_ten
00000000 00000000 00000000 00000010 _two = 2_ten
00000000 00000000 00000000 00000011 _two = 3_ten
00000000 00000000 00000000 00000100 _two = 4_ten
...
01011100 10000110 00001101 00100011 _two = 1552289059_ten
...
01111111 11111111 11111111 11111111 _two = 2^31-1 = 2147483647_ten
10000000 00000000 00000000 00000000 _two = 2^31   = 2147483648_ten
10000000 00000000 00000000 00000001 _two = 2^31+1 = 2147483649_ten
...
11011100 10000110 00001101 00100011 _two = 3699772707_ten
...
11111111 11111111 11111111 11111101 _two = 2^32-3 = 4294967293_ten
11111111 11111111 11111111 11111110 _two = 2^32-2 = 4294967294_ten
11111111 11111111 11111111 11111111 _two = 2^32-1 = 4294967295_ten
```
Notice that each number is obtained from the previous one by adding 1. This is the representation of the unsigned integers. 32-bit unsigned integers are limited in the range 0 to 2^32-1. There are exact 2^{32} numbers.

### Negative numbers

Negative numbers are important in calculations. In order to represent negative numbers, half of the bit patterns which were used to represent large positive numbers are used for negative numbers. Specifically, if the bit position 31 (left most bit) is 1, the number will be considered negative, with the following correspondence between bit patterns and values:
```   00000000 00000000 00000000 00000000 _two = 0_ten
00000000 00000000 00000000 00000001 _two = 1_ten
00000000 00000000 00000000 00000010 _two = 2_ten
00000000 00000000 00000000 00000011 _two = 3_ten
00000000 00000000 00000000 00000100 _two = 4_ten
...
01011100 10000110 00001101 00100011 _two = 1552289059_ten
...
01111111 11111111 11111111 11111101 _two = 2^31-3 = 2147482645_ten
01111111 11111111 11111111 11111110 _two = 2^31-2 = 2147482646_ten
01111111 11111111 11111111 11111111 _two = 2^31-1 = 2147483647_ten

10000000 00000000 00000000 00000000 _two = -2^31  = -2147483648_ten
10000000 00000000 00000000 00000001 _two = -2^31+1 = -2147483647_ten
10000000 00000000 00000000 00000010 _two = -2^31+2 = -2147483646_ten
10000000 00000000 00000000 00000011 _two = -2^31+3 = -2147483645_ten
...
11011100 10000110 00001101 00100011 _two = -595194589_ten
...
11111111 11111111 11111111 11111101 _two = -3_ten
11111111 11111111 11111111 11111110 _two = -2_ten
11111111 11111111 11111111 11111111 _two = -1_ten
```
This way of representing negative number is called two's complement. Note that the same bit pattern can represent a positive number or negative number depending on our interpretation. The pattern
```   11111111 11111111 11111111 11111111
```
is -1 in two's complement interpretation, but is 4294967295 if it is interpreted as unsigned integer.

In general, the pattern

```     b_{31}  b_{30}   ....  b_3   b_2   b_1  b_0
```
has the value in two's complement interpretation
```      - b_{31} * 2^{31} + b_{30} * 2^{30} + ... + b_2 * 2^2 + b_1 * 2 + b_0
```
Same as unsigned int, expect that the first term contribution is negative.

#### Example

What is the decimal value of this 32-bit two's complement number?

```   11111111 11111111 111111111 11111100
```

```     1 * (-2)^31 + (1*2^30) + (1*2^29) + ... + (1*2^2) + (0*2) + (0*1)
= -2^31 + 2^30 + 2^29 + 2^28 + ... + 2^3 + 2^2
= -2^31 + (2^31 - 2^2)/(2-1)
= -4
```

#### Summary

• In C, `int` is signed integer represented by two's complement in the range -2^31 to 2^31 - 1.
• In C, `unsigned int` is the unsigned representation from 0 to 2^32 - 1.
• Memory addresses are unsigned numbers.
• From machine point of view, signed and unsigned are simply a matter of interpretation of the bit patterns.

### Comparison for signed and unsigned numbers

The comparison of "less than" clearly depends on the interpretation of the numbers, MIPS uses slt, or slti, for signed comparison, and sltu, and sltiu for unsigned comparison.

#### Example

Suppose register \$16 has the pattern 11111111 11111111 11111111 11111111 and that register \$17 has the binary number 00000000 00000000 00000000 00000001. What are the values of registers \$8 and \$9 after these two instructions?

```   slt  \$8, \$16, \$17   # signed comparison
sltu \$9, \$16, \$17   # unsigned comparison
```

Answer: \$16 has value -1 if the number is interpreted as signed or 2^32-1 = 4294967295, if it is interpreted as unsigned. \$17 has value +1 in both cases. Thus \$8 has value 1, since -1 < 1; and \$9 has value 0, since 4294967295 > 1.

### Shortcuts: negating and sign extending

#### Negating

A simple rule to negate a number is: invert every 0 to 1 and 1 to 0, then add 1. For example, Negate 2,
```           00000000 00000000 00000000 00000010    (+2)
11111111 11111111 11111111 11111101    (invert 1 <-> 0)
+                                   1    (add 1)
---------------------------------------
11111111 11111111 11111111 11111110    (-2)
```

Convert -2 to 2, follow exactly the same step,

```           11111111 11111111 11111111 11111110    (-2)
00000000 00000000 00000000 00000001    (invert 1 <-> 0)
+                                   1    (add 1)
---------------------------------------
00000000 00000000 00000000 00000010    (+2)
```

Also, we can check (+2) + (-2) = 0.

```           11111111 11111111 11111111 11111110    (-2)
+ 00000000 00000000 00000000 00000010    (+2)
---------------------------------------
1 00000000 00000000 00000000 00000000    (+2)
```
The last carry bit can not be stored in the 32-bit field, so it is lost, But it is actually good for us, because we get the result 0 in the 32-bit pattern.

#### Sign extension

When the size of the signed integer gets larger, fill the rest of the left-hand-side by the sign bit. This is called a sign extension. 16-bit two's complement number is called short int in C. Immediate values in instructions are 16 bit long.

Example: Convert 16-bit binary version of 2 and -2 to 32-bit binary numbers.

Answer: The 16-bit binary version of 2 is

```   00000000 00000010
```
The 32-bit version is just
```   00000000 00000000 00000000 00000010
```
The 16-bit -2:
```   00000000 00000010  -> 2
11111111 11111101  -> reverse all bits
11111111 11111110  -> -2
```
The 32-bit number is
```   11111111 11111111 11111111 11111110
```

#### Addition and Subtraction

Binary addition and subtraction are very similar to what we do in decimal. If the sum gets greater than 1, we carry a bit 1 to the next digit. here is an example,

```           00000000 00000000 00000000 00000111    (7)
+ 00000000 00000000 00000000 00000110    (6)
---------------------------------------
00000000 00000000 00000000 00001101    (13)
```
Subtraction:
```           00000000 00000000 00000000 00000111    (7)
- 00000000 00000000 00000000 00000110    (6)
---------------------------------------
00000000 00000000 00000000 00000001    (1)
```
Borrow a 1 from previous digit means 2 in the current position. The subtraction is the addition of the negated number:
```           00000000 00000000 00000000 00000111    (7)
+ 11111111 11111111 11111111 11111010    (-6)
---------------------------------------
1 00000000 00000000 00000000 00000001    (1)
```
Again, a bit is carried outside the 32 bit field. But we don't care.

Exercise: Compute in binary

```         10111001
+    1001011
---------------
?
```

Remember that A to F corresponds to 10 to 15 in decimal. So

```        10A0FD
+    57B0A2
------------
68519F
```

In hexadecimal addition, every 16 caries a 1 to the next digit. In decimal, every 10 carries a 1 to the next digit. And in binary every 2 carries 1 to the next digit.

#### Overflow in two's complement addition and subtraction

Consider the addition of 2^31-1 = 2147483647 with 2,

```           01111111 11111111 11111111 11111111    (2147483647)
+ 00000000 00000000 00000000 00000010    (2)
---------------------------------------
10000000 00000000 00000000 00000001    (-2147483647)
```
The result is clearly incorrect, since we know that
```      2147483647 + 2 = 2147483649
```
But the 32-bit result is negative if it is interpreted as two's complement number. This error is expected, since the value of the result is outside the representable range of the two's complement numbers (from -2^31 to 2^31 -1). Similarly
```           10000000 00000000 00000000 00000001    (-2147483647)
+ 11111111 11111111 11111111 11111110    (-2)
---------------------------------------
(1) 01111111 11111111 11111111 11111111    (2147483647)
```

The sum of two negative numbers is positive!

This behavior of computer arithmetic is called (integer) overflow. Now, the question is, when overflow can occur?

• Overflow cannot occur when adding a positive number with a negative number. Why?
• Overflow occurs if the sum of the two positive numbers is negative; or sum of two negative numbers gives positive result. In such cases, the sign bit is over written by the value of the result, rather than representing signs.

#### Example, 8-bit (byte) number

In C, the data type char can be used as small 2's complement integers. The relation between bit pattern and decimal values are as follows:

```   00000000    ->  0
00000001    ->  1
00000010    ->  2
00000011    ->  3
00000100    ->  4
...            ...
01111111    ->  127 (largest positive number)
10000000    -> -128 (the most negative number)
10000001    -> -127
...
11111110    -> -2
11111111    -> -1
```

Let's do some arithmetic with byte numbers. Compute -6 + 44 = 38:

```      1 1 1 1 1 0 1 0
0 0 1 0 1 1 0 0
. . . . .
+ -----------------
(1) 0 0 1 0 0 1 1 0
```
Dot represents a carry to that bit.

Compute 91 + 61

```      0 1 0 1 1 0 1 1
0 0 1 1 1 1 0 1
+   . . . . . . .
--------------------
1 0 0 1 1 0 0 0
```
The result is a negative number -104. Overflow occurred, since 91 + 61 = 152, which is larger than the largest value within 8 bit two's complement range.

Compute 11 - 13. We'll use 11 + (-13) instead of subtracting.

```   00001101  -> 13
11110010  -> complement all bits
11110011  -> -13

0 0 0 0 1 0 1 1  -> 11
1 1 1 1 0 0 1 1  -> -13
+             . .
--------------------
1 1 1 1 1 1 1 0  -> -2
```

#### Hardware software interface

MIPS detects overflow with an exception. When overflow exception occurs, control is transferred to a procedure to handle the exceptional situation. If you have not done anything special, you'll see the error message: "Floating Exception", followed by a core dump, and the program stops. However, this overflow is detected only for the signed integer calculations. Here is the rule

• add, addi, sub, cause exceptions on overflow (i.e. overflow is reported).
• addu, addiu, subu do not cause exception on overflow.

The pairs (add, addu), (addi, addiu), and (sub, subu) do exactly the same computation, except that the former detects overflow and the later does not. Addresses are unsigned integers. In order not to get an overflow error in address calculations, it is recommended to use the unsigned version, addu, and subu.

[ previous week ][ next week ]