How do computers represent negative numbers? A gentle introduction
Did you know that there are multiple ways for computers to store negative numbers? Have you ever heard of signed magnitude? Do you know which properties of two’s complement make it the standard way of representing signed numbers?
Well, let me try to answer all these questions :)
What is this? And what isn’t it?
This is a gentle introduction. The goal is to give you some intuition into how different representations work and learn how to work with them. I expect that you are at least a bit familiar with programming.
We’ll be talking only about integers. Floats are very special beasts and are out of the scope of this article.
So, how do you store a number?
If you are familiar with words like bit, byte, and binary representation, feel free to skip the first two paragraphs.
Let’s say you are a computer (brr brr) and a programmer asks you to store some nice number for her, let’s say 5. How do you remember it? Obviously, there is no paper for you to write it down on, so you need to store it in your memory. However, your world is composed only of so-called bits: ones and zeros. There is no place in your memory where you can write 5; instead, you need to store it in its binary representation.
The binary representation of 5 is 101. So you might be tempted to write 101 to some address in your memory. But can you? It turns out that isn’t possible either. Your memory is fragmented into bytes — 8-bit segments — and you cannot possibly work with anything smaller than one byte. So you are bound to pad your 101 with zeros so its length is 8 and then you can store it!
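If you want to see this in code, here is a tiny Python sketch (just an illustration of the idea, of course, not how hardware actually works):
n = 5
print(bin(n))            # prints 0b101, the raw binary representation
print(format(n, "08b"))  # prints 00000101, padded to a full byte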
So, now you have 0000 0101 somewhere in your memory (the middle space is there just for readability). Your programmer is happy. You can remember stuff! So she tries to make it harder and asks you to store -5.
D’oh! There is no minus sign in your binary world. You only know 0/1. Nonetheless, you think a little. And come up with an idea! What about reserving another byte? This byte can indicate whether the number in the byte immediately next to it is negative or positive!
Therefore, you store something like: minusbyte, 0000 0101 in your memory. Now you need two bytes for your -5.
But wait a second. Do you really need a whole byte for something that can take just two values? Not really: we can include the sign in the same byte as the 5. We can interpret a 1 in the position of the most significant bit as a minus, and a 0 as a plus.
So we end up with 1000 0101 for our -5.
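In Python, this idea looks something like the sketch below (the names SIGN_BIT and five are mine, just for illustration):
SIGN_BIT = 0b10000000             # the most significant bit of a byte
five = 0b00000101
minus_five = SIGN_BIT | five      # setting the sign bit turns 5 into -5
print(format(minus_five, "08b"))  # prints 10000101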
Whoa! We just discovered a new representation of negative numbers: signed magnitude!
Signed magnitude
Let’s analyze this new beast. What is the greatest number we can represent? From the 8 bits in a byte, we use one to store the sign and the rest to store the magnitude. From this, we obtain that the greatest magnitude is 2⁷ - 1 = 127. If you don’t see why, I encourage you to take a piece of paper and write down all the powers corresponding to different bits (1, 2, 4, 8, …) and sum the first seven.
Because the greatest magnitude is 127, we can represent any integer between -127 and 127.
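If you’d rather let Python do the paper work for you, a quick check:
print(sum(2**i for i in range(7)))  # prints 127, the sum 1 + 2 + ... + 64
print((1 << 7) - 1)                 # prints 127 again, written as a shift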
Now observe that this range is symmetric: we have the same number of positive and negative numbers. This might not look unusual, but it is! Two’s complement, which we will introduce shortly, doesn’t have this property.
How do we represent zero? If you’ve never seen signed magnitude representation before, please, pause and think about this. An interesting phenomenon occurs. There are two ways to represent 0. One is 0000 0000, the other 1000 0000.
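To make the two zeros concrete, here is a little decoder sketch of my own (decode_signed_magnitude is not a library function, just an illustration):
def decode_signed_magnitude(byte):
    magnitude = byte & 0b01111111          # the lower seven bits
    sign = -1 if byte & 0b10000000 else 1  # the most significant bit
    return sign * magnitude

print(decode_signed_magnitude(0b00000000))  # prints 0
print(decode_signed_magnitude(0b10000000))  # prints 0 as well, the "negative" zero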
How big a problem is this?
Surprisingly often, a computer needs to check something against zero. But now that we have two zeros, which zero should we check against? Soon we might start wondering whether one zero is superior to the other. And although the problem of two zeros is solvable, it can be quite a headache.
One last problem with this representation is the way we do subtraction. If you think about it, you quickly find that there are multiple questions to answer and cases to handle: subtracting two positive numbers, a negative from a positive, a positive from a negative, and so on. Most importantly, we need a special circuit to do the subtraction for us.
However, what if we can’t have a circuit for subtracting? Either we don’t have enough money or physical space. Is it possible to come up with something that needs only an adder?
It is :)
Two’s complement
Let’s return to the size of a byte again. We already observed that the amount of information we can store in one byte is limited: we cannot store arbitrarily huge numbers. Let’s see what happens when we try to add 100 and 200.
(100)_10 = (0110 0100)_2
(200)_10 = (1100 1000)_2
0110 0100 + 1100 1000 = 1 0010 1100
Ouch. We cannot store 9 bits. What usually happens is that only the 8 least significant bits get stored, so we end up with 0010 1100, and if we convert this to decimal, we get 100 + 200 = 44. We call this an overflow. That looks quite annoying. But what if there is a way to use this property to simulate subtraction with addition? After all, we were adding two big numbers and ended up with something small.
Note: you can think about the previous addition in terms of modular arithmetic. In our case, we are calculating everything mod 256, which means we keep the remainder of the result of the addition when divided by 256. A simple way to interpret this is to think about a standard 12-hour clock: when you are working on a 4-hour homework assignment and start at 10 PM, you finish it at 2 AM, not at 14 PM. The same thing happens here, except we have 256 hours instead of 12 :).
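You can simulate this overflow in Python, whose integers otherwise happily grow, by keeping only the remainder mod 256, or equivalently the 8 least significant bits:
print((100 + 200) % 256)   # prints 44, what an 8-bit register would keep
print((100 + 200) & 0xFF)  # prints 44 again, keeping only the 8 lowest bits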
Now, the most difficult part follows. Brace yourself!
For any number k, it surely holds that k + (-k) = 0 = 0000 0000. However, in our world, we are also completely happy with the result k + (-k) = 1 0000 0000, because the ninth bit doesn’t fit into our byte and gets thrown away, leaving 0000 0000.
This simple fact leads to the most elegant representation of negative numbers one can think of. For each positive number P we will create a negative number N with the property that P + N = 1 0000 0000.
If you are not sure how this works, try thinking in decimal. Let’s say that you have a memory of only one digit, so you can remember anything from 0 to 9. Now, how do you find -3? You would like P + N = 10. For P = 3, the only satisfying N is 7.
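The same decimal analogy in a few lines of Python (my own toy example):
P = 3
N = 10 - P           # 7, the digit that will act as -3
print((P + N) % 10)  # prints 0, so N really behaves like -3
print((8 + N) % 10)  # prints 5, which is 8 - 3 computed with addition only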
Now, let’s get back to the world of ones and zeros. We need a way to compute the negative number corresponding to a given positive one.
We know that P + N = 1 0000 0000, so we could rearrange this to get N = 1 0000 0000 - P.
However, this requires subtraction. And I promised that we could do it without subtracting.
Let’s observe what happens when we negate bits (we swap ones for zeros and zeros for ones). We have (5)_10 = (0000 0101)_2 and after negating: (1111 1010)_2.
Let’s add those two numbers: 0000 0101 + 1111 1010 = 1111 1111.
This is still not 1 0000 0000 but it’s very, very close. All we need to do is add one. We will therefore say that N = negation(P) + 1. And surprisingly this is all we need.
So, to get N from P, we have to negate the bits and add 1. Both of these operations are super simple for a computer, so we can convert numbers really quickly.
One nice thing to note is that negation(negation(x) + 1) + 1 = x. This allows us to use the same operation no matter which direction we are converting in. It is not so easy to see why this works, so I encourage you to try it with (5)_10 = (0000 0101)_2: negate the bits, add one, and then repeat it one more time.
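Here is the whole trick as a small Python helper; a minimal sketch, assuming 8-bit numbers, with negate being my own name for it:
def negate(x):
    # flip the bits (XOR with 1111 1111), then add one; the final mask
    # keeps Python's unbounded integers within a single byte
    return ((x ^ 0xFF) + 1) & 0xFF

five = 0b00000101
minus_five = negate(five)
print(format(minus_five, "08b"))   # prints 11111011
print(negate(minus_five) == five)  # prints True, negation undoes itself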
Properties of two’s complement
We no longer need a separate circuit for subtraction. It is the same as addition!
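A quick demonstration in Python, reusing the flip-the-bits-and-add-one trick (again just a sketch of mine):
def subtract(a, b):
    minus_b = ((b ^ 0xFF) + 1) & 0xFF  # the two's complement of b
    return (a + minus_b) & 0xFF        # subtraction done purely by adding

print(subtract(7, 5))  # prints 2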
We can still quickly see if a given number is positive or negative. All numbers with the most significant bit 0 are still positive (or zero) and those with 1 are negative. For positive numbers, we have the range from 1 to 127.
For negative numbers we have the range from -128 to -1.
Wait, why isn’t it -127?
Well, in this representation we no longer have a negative zero. The only way to represent zero is as 0000 0000. This zero now “occupies” space in our positive range. That’s why we can represent more negative numbers than positive ones.
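You can verify the whole range with a small decoder; decode_twos_complement is my own sketch, not a standard function:
def decode_twos_complement(byte):
    # bytes with the most significant bit set represent negative numbers
    return byte - 256 if byte & 0b10000000 else byte

print(decode_twos_complement(0b10000000))  # prints -128, the extra negative number
print(decode_twos_complement(0b11111111))  # prints -1
print(decode_twos_complement(0b01111111))  # prints 127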
To wrap it up, we have a representation that:
- allows us to use an adder for subtraction
- is asymmetric
- has just one zero
- is elegant
Conclusion
You probably noticed that computers can store greater numbers than those that fit into one byte. Nowadays, you will most often encounter 4-byte integers, which can store numbers up to a few billion. However, you can straightforwardly extend what was described here to any number of bytes.
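For instance, the flip-and-add-one trick from before works at 32 bits too; you only need a wider mask (again, a sketch of my own):
MASK32 = 0xFFFFFFFF  # 32 ones instead of 8

def negate32(x):
    return ((x ^ MASK32) + 1) & MASK32

print(format(negate32(5), "032b"))  # prints 11111111111111111111111111111011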
If you are coming from Python, you might have noticed that there is nothing like an upper limit on the size of your integers. This is because Python has a rather intriguing (and inefficient) way of storing numbers that allows it to allocate as much memory as the number requires.
There are also other representations worth knowing. One should definitely get familiar with the way floating-point numbers are represented. One archaic representation similar to two’s complement is one’s complement, which has a nice Wikipedia article.
Hope you learned something new. Have a nice day!