Back to basics: two's complement

Warning: This blogpost has been posted over two years ago. That is a long time in development-world! The story here may not be relevant, complete or secure. Code might not be complete or obsoleted, and even my current vision might have (completely) changed on the subject. So please do read further, but use it with caution.
Posted on 26 Nov 2010
Tagged with: [ b2b

I occasionally get into discussions where I find that other lack some basic understanding of the elementary systems which he or she has to work with everyday. Today was such a day: we went into a discussion about that it would be so much nicer to have unsigned tinyint (in mysql) range from -127..128 instead of -128..127. Although it COULD be changed, it would go against almost all rudimentary principles of numeric systems used inside computers, but this not always known to PHP-programmers.

So today I introduce the “back-to-basics” posts, which talks about all kind of rudimentary principles in computer technology. These principles found the basics of that what you use each and every day, and even though you probably not aware of them, they still are there..

The first b2b post is about the two’s complement signing system… let’s take a look:

Meet our binary system

I assume you know the binary (base-2) system. If not, please read up on it now. As a refresher: a number (for instance: 6) can be represented in binary like: 00001010b. The least significant bit (the lowest value) is on the right, the most significant bit is on the right. This example shows you how to convert a binary number into a decimal number:

00000110b
|||||||+- 0 *   1 = 0
||||||+-- 1 *   2 = 2
|||||+--- 1 *   4 = 4
||||+---- 0 *   8 = 0
|||+----- 0 *  16 = 0
||+------ 0 *  32 = 0
|+------- 0 *  64 = 0
+-------- 0 * 128 = 0 +
                   ---
                    6

So we can convert any positive number in binary, the higher the number, the more bits you need. But: how do we present a negative number in binary? How would -6 looks like? In the decimal system, we have decided we use a ‘sign’ symbol to represent if the number is negative or positive.

+6 = positive
-6 = negative

When no sign is given, we always assume it’s a positive number because humans are lazy and don’t like to add a sign for every number they type or write, but we really should to take away the ambiguity. Now, computers REALLY don’t like ambiguity, so we MUST provide it a sign but there are different ways to do it.

simple-sign system

The first system is the easiest: assume we have 8 bit digits, which means we can store any positive number from 0 (00000000b) to 255 (11111111b). In order to store negative numbers, we take away the highest bit (bit 7, the most LEFT bit) and use that as the sign bit. If that bit is 0, the number is positive. When it’s 1, the number is negative. Instead of being able to address 0..255, we can address -127..+127.

This looks like our “human” way of signing, except we have to sacrifice a bit. It would mean we only have 7 bits left to store our number, so we can only store half the values, although we gain the negative numbers.

10000001b
|---+---
|   +----  7 bits for the digit
+--------  Sign bit (1 = negative)

00000001b = +1
01111111b = +127
10000001b = -1
11111111b = -127

Although this is by far the easiest method, it has a few drawbacks:

First: there are 2 ways of representing the digit 0:

00000000b = (+)0
100000000b = (-)0

Second: a computer cannot (simply) add or subtract positive and negative numbers:

    11      (carry over)
00000110b   +6
00000011b   +3
---------   -- +
00001001b   +9

    11      (carry over)
00000110b   +6
10000011b   -3
---------   -- +
10001001b   -9 (NOT GOOD)

One’s complement

Ok, so although a nice system: it’s not really workable for adding and subtracting (same goes multiplication and dividing). Let’s try another - but similar - system. The same sign-bit (the most left bit) is still used, BUT when a negative number is represented ALL BITS EXCEPT THE SIGN BIT ARE STORED NEGATIVELY.

An example:

00000101b = +5
10000101b = -5 (simple signed)
11111010b = -5 (one's complement)

As you can see in the example between the simplesigned -5 and the one’s complement -5: all the 7 “digit” bits are negative: when it’s a 1 in simplesigned, it’s 0 in one’s complement and vice versa. But why go through all this hassle of negating bits and all? Let’s find out by adding some numbers:

    11      (carry over)
00000110b   +6
00000011b   +3
---------   -- +
00001001b   +9

11111       (carry over)
00000110b   +6
11111100b   -3 (ones complement!)
---------   -- +
00000010b   +3 (good!) 

     11    (carry over)
11111001b   -6 (ones complement)
00000011b   +3
---------   -- +
11111100b   -3 (ones complement)

Looks like this works! A computer can easily add those values together and the signing will automatically be correct. Sweet!

However, let’s look at this case:

 111111      (carry over)
 11111100b   -3 (ones complement)
 00000111b   +7
 ---------   -- +
 00000011b   +3 (NOT GOOD)

This addition isn’t right. (-3)+(+7) = +4 according almost all math-books. So what went wrong? Actually, nothing went wrong, but we just haven’t finished yet. As you can see, at the most-right bit we triggered a “carry” to the “9th” bit. Off course, in a 8-bit digit, there is no 9th bit BUT your computer will actually save this information. What you have to do is add the carry-bit to the result:

111111      (carry over)
|11111100b   -3 (ones complement)
|00000111b   +7
|---------   -- +
|00000011b   +3 (NOT GOOD)
+->     1b   +1 (the '9th' carry bit)
 ---------
 00000100b   +4 (The correct answer)

Ok, we’re almost there. Although one’s complement solved the “calculation” problems, it still is possible to represent 0 with 2 different values:

00000000b = (+)0
11111111b = (-)0

but adding and subtracting should not cause any problems. But still, ambiguity, so not really the BEST way (although a lot of systems use one’s complement, including TCP/IP).

Two’s complement

This system is almost the same as one’s complement, except we add a “1” to any negative number.

00001100b  +12
11110011b  -12 (ones complement)

     11    (carry over)
11110011b  -12 (one's complement)
00000001b  + 1 (adding 1 to make it a two's complement number)
---------
11110101b  -12 (two's complement)

11111111    (carry over)
11111111b  -0 (one's complement)
00000001b  +1
---------
00000000b  +0 (two's complement)

As you can see, converting a one’s complement (-0) to two’s complement result in a +0. This means that we only have ONE kind of zero instead of two.

Note that in two’s complement we do not carry over the 9th bit, which means the calculations are a bit easier as well:

 11111111    (carry over)
 11111101b   -3 (twos complement)
 00000111b   +7
 ---------   -- +
 00000100b   +4 (The correct answer)

As you can see, going from simplesign to one’s complement solved the calculation problem, and going from one’s to two’s complement solved the double-zero problem. Two’s complement has some other nifty features like detecting overflows (when adding two numbers will be larger than +127 or less than -128)

Roundup:

This results in the following minimum and maximum values for 8 bits in the different sign methods:

No sign (unsigned):     Simplesign method:
00000000b = +0          00000000b = +0
01111111b = +127        01111111b = +127
10000000b = +128        10000000b = -0
11111111b = +255        11111111b = -127

One's complement:       Two's complement:
00000000b = +0          00000000b = +0
01111111b = +127        01111111b = +127
10000000b = -127        10000000b = -128
11111111b = +0          11111111b = -1

You (should) instantly see that when using two’s complement you have gained an extra number, since we don’t have (or can) store the zero twice but still remain the nice way of adding and subtracting binary numbers.

Conclusion:

These things are not used everyday. And like anything else you don’t use every day, it tends to get rusty. An occasional brush up on basic is a good thing. You not only discover HOW things work, but also discover the sometimes ingenious systems and algorithms the “founding fathers” of computer science have discovered, most likely many years before you were born. Values like -2147483648 or even -9223372036854775808 are not taken out of the clear blue sky but are there for a reason. At least you (should) know why..