Links to other weeks notes [ 2 ][ 3 ][ 4 ][ 5 ]( 6 )[ 7 ][ 8 ][ 10 ]

Week 6, 18/21 August 1997

Chapter 4, Arithmetic for Computers

In this chapter we shall learn

Binary representation of integers, in particular negative numbers.
Addition and subtraction, how they are computed by the computer.
Arithmetic logic unit, part of the CPU responsible for computing.
Multiplication and division, how they are implement in hardware.
Floating point numbers, the IEEE standards.

Base of number representation

Base 10 (decimal)

We use a sequence of digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 to represents numbers in our daily work, for example 1996. The number 1996 means one thousand plus nine hundred plus ninety and plus six, as you know. Written formally

   1996 = 1 * 1000 + 9 * 100 + 9 * 10 + 6 * 1
        = 1 * 10^3 + 9 * 10^2 + 9 * 10^1 + 6 * 10^0

Each earlier digit contributes to the value ten times larger than the next digit. The number 10 is said to be the base or radix of the representation. Such system is called a positional notation system.

Base 2 (binary)

It is not necessary that each digit contributes to a power of 10 according to its position in the number. We can use power of 2 instead, and with only two (binary) digits 0 and 1. So in binary 10_two is the value 2_ten. (I'll use _two to indicate binary number and _ten for decimal number). Thus

   1011_two = 1 * 2^3 + 0 * 2^2 + 1 * 2^1 + 1 * 2^0
            = 8 + 0 + 2 + 1
            = 11_ten

It's good to know the first few values of powers of two for binary to decimal conversion:

   2^0 = 1;           2^6 = 64;
   2^1 = 2;           2^7 = 128;
   2^2 = 4;           2^8 = 256;
   2^3 = 8;           2^9 = 512;
   2^4 = 16;          2^10 = 1024;
   2^5 = 32;          2^11 = 2048.

Base 16 (hexadecimal)

Hexadecimal numbers need digits from 0 to 15, they are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F where A for 10, B for 11, and so on. In hexadecimal, each digit represents a power of 16. So

   10FA_sixten = 1 * 16^3 + 0 * 16^2 + 15 * 16^1 + 10 * 16^0
               = 1 * 4096 + 0 * 256 + 15 * 16 + 10 * 1 
               = 4096 + 0 + 240 + 10
               = 4346_ten

Clearly, one can take any integer base>0 as a radix, then the i-th digit (counting from 0) contributes to the total sum by

   d * base^i

where the digit is a number between 0 and base-1.

Conversion between Representations

Binary to decimal

Here is one more example

   10010011 = 2^7 + 0 + 0 + 2^4 + 0 + 0 + 2^1 + 2^0
            = 128 + 16 + 2 + 1 
            = 147

Whenever there is a 1, there is a contribution of the form 2^i, i is the location of the digits counting from right to left, starting from 0.

Decimal to binary

Here is the rule, you may convence yourself why it works.

   147/2 = 73 + 1/2   -> 1  (right most, least significant digit)
    73/2 = 36 + 1/2   -> 1
    36/2 = 18 + 0/2   -> 0
    18/2 =  9 + 0/2   -> 0
     9/2 =  4 + 1/2   -> 1
     4/2 =  2 + 0/2   -> 0
     2/2 =  1 + 0/2   -> 0
     1/2 =  0 + 1/2   -> 1  (left, most significant digit)

So the binary representation is (reading backwards)

10010011

In summary, the number is divided by 2, and the results are the quotient and reminder; the digit in the reminder is the binary digit; the quotient is divided by 2 again, until the quotient become zero. The binary digits from least significant to most significant are found in the reminders.

Exercises

Convert decimal number 183 into binary; and convert binary number 1001101 into decimal.

Answers

Binary to hexadecimal, hexadecimal to binary

Due to the fact that 2^4 = 16, the conversion is very easy. Simply divide the binary digits into group of four (starting from the least significant bit) and use the correspondence between decimal number from 0 to 15, hexadecimal number from 0 to F, and binary four digits number 0000 to 1111,

 decimal  hex    binary

    0      0      0000
    1      1      0001
    2      2      0010
    3      3      0011
    4      4      0100
    5      5      0101
    6      6      0110
    7      7      0111
    8      8      1000
    9      9      1001
   10      A      1010
   11      B      1011
   12      C      1100
   13      D      1101
   14      E      1110
   15      F      1111

For example, binary number representation <-> hexadecimal representation:

   0001 0000 1111 1010   <->   10FA

Unsigned int in MIPS

Since each bit has two possible states 0, or 1, two bits can have four different patterns, 00, 01, 10, 11, and three bits can have 8 possible values, 000, 001, 010, 011, 100, 101, 110, and 111. In general, n-bit patterns can take 2^n different forms. A 32-bit pattern (word) can be one of the 2^{32}. One way to represent 0 and positive integers (unsigned) is to use the correspondence

   00000000 00000000 00000000 00000000 _two = 0_ten
   00000000 00000000 00000000 00000001 _two = 1_ten
   00000000 00000000 00000000 00000010 _two = 2_ten
   00000000 00000000 00000000 00000011 _two = 3_ten
   00000000 00000000 00000000 00000100 _two = 4_ten
                   ...
   01011100 10000110 00001101 00100011 _two = 1552289059_ten
                   ...
   01111111 11111111 11111111 11111111 _two = 2^31-1 = 2147483647_ten
   10000000 00000000 00000000 00000000 _two = 2^31   = 2147483648_ten
   10000000 00000000 00000000 00000001 _two = 2^31+1 = 2147483649_ten
                   ...
   11011100 10000110 00001101 00100011 _two = 3699772707_ten
                   ...
   11111111 11111111 11111111 11111101 _two = 2^32-3 = 4294967293_ten
   11111111 11111111 11111111 11111110 _two = 2^32-2 = 4294967294_ten
   11111111 11111111 11111111 11111111 _two = 2^32-1 = 4294967295_ten

Notice that each number is obtained from the previous one by adding 1. This is the representation of the unsigned integers. 32-bit unsigned integers are limited in the range 0 to 2^32-1. There are exact 2^{32} numbers.

Negative numbers

Negative numbers are important in calculations. In order to represent negative numbers, half of the bit patterns which were used to represent large positive numbers are used for negative numbers. Specifically, if the bit position 31 (left most bit) is 1, the number will be considered negative, with the following correspondence between bit patterns and values:

   00000000 00000000 00000000 00000000 _two = 0_ten
   00000000 00000000 00000000 00000001 _two = 1_ten
   00000000 00000000 00000000 00000010 _two = 2_ten
   00000000 00000000 00000000 00000011 _two = 3_ten
   00000000 00000000 00000000 00000100 _two = 4_ten
                   ...
   01011100 10000110 00001101 00100011 _two = 1552289059_ten
                   ...
   01111111 11111111 11111111 11111101 _two = 2^31-3 = 2147482645_ten
   01111111 11111111 11111111 11111110 _two = 2^31-2 = 2147482646_ten
   01111111 11111111 11111111 11111111 _two = 2^31-1 = 2147483647_ten

   10000000 00000000 00000000 00000000 _two = -2^31  = -2147483648_ten
   10000000 00000000 00000000 00000001 _two = -2^31+1 = -2147483647_ten
   10000000 00000000 00000000 00000010 _two = -2^31+2 = -2147483646_ten
   10000000 00000000 00000000 00000011 _two = -2^31+3 = -2147483645_ten
                   ...
   11011100 10000110 00001101 00100011 _two = -595194589_ten
                   ...
   11111111 11111111 11111111 11111101 _two = -3_ten
   11111111 11111111 11111111 11111110 _two = -2_ten
   11111111 11111111 11111111 11111111 _two = -1_ten

This way of representing negative number is called two's complement. Note that the same bit pattern can represent a positive number or negative number depending on our interpretation. The pattern

   11111111 11111111 11111111 11111111

is -1 in two's complement interpretation, but is 4294967295 if it is interpreted as unsigned integer.

In general, the pattern

     b_{31}  b_{30}   ....  b_3   b_2   b_1  b_0

has the value in two's complement interpretation

      - b_{31} * 2^{31} + b_{30} * 2^{30} + ... + b_2 * 2^2 + b_1 * 2 + b_0

Same as unsigned int, expect that the first term contribution is negative.

Example

What is the decimal value of this 32-bit two's complement number?

   11111111 11111111 111111111 11111100

Answer:

     1 * (-2)^31 + (1*2^30) + (1*2^29) + ... + (1*2^2) + (0*2) + (0*1)
   = -2^31 + 2^30 + 2^29 + 2^28 + ... + 2^3 + 2^2
   = -2^31 + (2^31 - 2^2)/(2-1)
   = -4

Summary

In C, int is signed integer represented by two's complement in the range -2^31 to 2^31 - 1.
In C, unsigned int is the unsigned representation from 0 to 2^32 - 1.
Memory addresses are unsigned numbers.
From machine point of view, signed and unsigned are simply a matter of interpretation of the bit patterns.

Comparison for signed and unsigned numbers

The comparison of "less than" clearly depends on the interpretation of the numbers, MIPS uses slt, or slti, for signed comparison, and sltu, and sltiu for unsigned comparison.

Example

Suppose register $16 has the pattern 11111111 11111111 11111111 11111111 and that register $17 has the binary number 00000000 00000000 00000000 00000001. What are the values of registers $8 and $9 after these two instructions?

   slt  $8, $16, $17   # signed comparison
   sltu $9, $16, $17   # unsigned comparison

Answer: $16 has value -1 if the number is interpreted as signed or 2^32-1 = 4294967295, if it is interpreted as unsigned. $17 has value +1 in both cases. Thus $8 has value 1, since -1 < 1; and $9 has value 0, since 4294967295 > 1.

Shortcuts: negating and sign extending

Negating

A simple rule to negate a number is: invert every 0 to 1 and 1 to 0, then add 1. For example, Negate 2,

           00000000 00000000 00000000 00000010    (+2)
           11111111 11111111 11111111 11111101    (invert 1 <-> 0)
         +                                   1    (add 1)
        ---------------------------------------
           11111111 11111111 11111111 11111110    (-2)

Convert -2 to 2, follow exactly the same step,

           11111111 11111111 11111111 11111110    (-2)
           00000000 00000000 00000000 00000001    (invert 1 <-> 0)
         +                                   1    (add 1)
        ---------------------------------------
           00000000 00000000 00000000 00000010    (+2)

Also, we can check (+2) + (-2) = 0.

           11111111 11111111 11111111 11111110    (-2)
         + 00000000 00000000 00000000 00000010    (+2)
        ---------------------------------------
         1 00000000 00000000 00000000 00000000    (+2)

The last carry bit can not be stored in the 32-bit field, so it is lost, But it is actually good for us, because we get the result 0 in the 32-bit pattern.

Sign extension

When the size of the signed integer gets larger, fill the rest of the left-hand-side by the sign bit. This is called a sign extension. 16-bit two's complement number is called short int in C. Immediate values in instructions are 16 bit long.

Example: Convert 16-bit binary version of 2 and -2 to 32-bit binary numbers.

Answer: The 16-bit binary version of 2 is

   00000000 00000010

The 32-bit version is just

   00000000 00000000 00000000 00000010

The 16-bit -2:

   00000000 00000010  -> 2
   11111111 11111101  -> reverse all bits
   11111111 11111110  -> -2

The 32-bit number is

   11111111 11111111 11111111 11111110

Addition and Subtraction

Binary addition and subtraction are very similar to what we do in decimal. If the sum gets greater than 1, we carry a bit 1 to the next digit. here is an example,

           00000000 00000000 00000000 00000111    (7)
         + 00000000 00000000 00000000 00000110    (6)
        ---------------------------------------
           00000000 00000000 00000000 00001101    (13)

Subtraction:

           00000000 00000000 00000000 00000111    (7)
         - 00000000 00000000 00000000 00000110    (6)
        ---------------------------------------
           00000000 00000000 00000000 00000001    (1)

Borrow a 1 from previous digit means 2 in the current position. The subtraction is the addition of the negated number:

           00000000 00000000 00000000 00000111    (7)
         + 11111111 11111111 11111111 11111010    (-6)
        ---------------------------------------
         1 00000000 00000000 00000000 00000001    (1)

Again, a bit is carried outside the 32 bit field. But we don't care.

Exercise: Compute in binary

         10111001
     +    1001011
  ---------------
  ?

Addition in hexadecimal notation

Remember that A to F corresponds to 10 to 15 in decimal. So

        10A0FD
   +    57B0A2
  ------------
        68519F

In hexadecimal addition, every 16 caries a 1 to the next digit. In decimal, every 10 carries a 1 to the next digit. And in binary every 2 carries 1 to the next digit.

Overflow in two's complement addition and subtraction

Consider the addition of 2^31-1 = 2147483647 with 2,

           01111111 11111111 11111111 11111111    (2147483647)
         + 00000000 00000000 00000000 00000010    (2)
        ---------------------------------------
           10000000 00000000 00000000 00000001    (-2147483647)

The result is clearly incorrect, since we know that

      2147483647 + 2 = 2147483649

But the 32-bit result is negative if it is interpreted as two's complement number. This error is expected, since the value of the result is outside the representable range of the two's complement numbers (from -2^31 to 2^31 -1). Similarly

           10000000 00000000 00000000 00000001    (-2147483647)
         + 11111111 11111111 11111111 11111110    (-2)
        ---------------------------------------
       (1) 01111111 11111111 11111111 11111111    (2147483647)

The sum of two negative numbers is positive!

This behavior of computer arithmetic is called (integer) overflow. Now, the question is, when overflow can occur?

Overflow cannot occur when adding a positive number with a negative number. Why?
Overflow occurs if the sum of the two positive numbers is negative; or sum of two negative numbers gives positive result. In such cases, the sign bit is over written by the value of the result, rather than representing signs.

Example, 8-bit (byte) number

In C, the data type char can be used as small 2's complement integers. The relation between bit pattern and decimal values are as follows:

   00000000    ->  0
   00000001    ->  1
   00000010    ->  2
   00000011    ->  3
   00000100    ->  4
   ...            ...
   01111111    ->  127 (largest positive number)
   10000000    -> -128 (the most negative number)
   10000001    -> -127
   ...
   11111110    -> -2
   11111111    -> -1

Let's do some arithmetic with byte numbers. Compute -6 + 44 = 38:

      1 1 1 1 1 0 1 0
      0 0 1 0 1 1 0 0
    . . . . .
   + ----------------- 
  (1) 0 0 1 0 0 1 1 0

Dot represents a carry to that bit.

Compute 91 + 61

      0 1 0 1 1 0 1 1
      0 0 1 1 1 1 0 1
  +   . . . . . . . 
  --------------------
      1 0 0 1 1 0 0 0

The result is a negative number -104. Overflow occurred, since 91 + 61 = 152, which is larger than the largest value within 8 bit two's complement range.

Compute 11 - 13. We'll use 11 + (-13) instead of subtracting.

   00001101  -> 13
   11110010  -> complement all bits
   11110011  -> -13

      0 0 0 0 1 0 1 1  -> 11
      1 1 1 1 0 0 1 1  -> -13
  +             . .   
 --------------------
      1 1 1 1 1 1 1 0  -> -2

Hardware software interface

MIPS detects overflow with an exception. When overflow exception occurs, control is transferred to a procedure to handle the exceptional situation. If you have not done anything special, you'll see the error message: "Floating Exception", followed by a core dump, and the program stops. However, this overflow is detected only for the signed integer calculations. Here is the rule

add, addi, sub, cause exceptions on overflow (i.e. overflow is reported).
addu, addiu, subu do not cause exception on overflow.

The pairs (add, addu), (addi, addiu), and (sub, subu) do exactly the same computation, except that the former detects overflow and the later does not. Addresses are unsigned integers. In order not to get an overflow error in address calculations, it is recommended to use the unsigned version, addu, and subu.

[ previous week ][ next week ]