Bit manipulation in PHP
Although you probably never need it as much as a C programmer would, it’s not a bad idea to know how bit manipulation works. This post will tell you a bit about what bit manipulation is, why you could use it and how you are using it already (with or without knowing).
As you probably know, computers work with a base-2 system called the binary system. A bit (which stands for binary digit) is simply either a 0 or a 1. A group of 8 bits is called a byte. 1024 bytes is called a kilobyte, 1024 kilobytes is 1 megabyte, and so on.
A byte can be represented in many ways:
Decimal: 65
Hexadecimal: 0x41
ASCII character: 'A'
Binary: 01000001
As you can see, it’s all the same value, but written differently.
Each bit in a byte has a value:
bit 0: 1
bit 1: 2
bit 2: 4
bit 3: 8
bit 4: 16
bit 5: 32
bit 6: 64
bit 7: 128
You can see that binary 01000001 means that both bit 6 (second from the left; we count from right to left!) and bit 0 are set to 1.
Bit 6 = 64, bit 0 = 1, and 64 + 1 = 65, which happens to be the decimal value of the byte we are working on. It all seems to fit perfectly :)
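The different notations and the bit values can be checked from PHP itself. This is a minimal sketch using the built-in functions decbin(), dechex() and chr(); the variable names are only chosen for illustration.

```php
<?php
// One byte, written in several ways.
$value = 65;

$binary = str_pad(decbin($value), 8, '0', STR_PAD_LEFT); // "01000001"
$hex    = '0x' . dechex($value);                          // "0x41"
$char   = chr($value);                                    // "A"

// Collect the numbers of the bits that are set, starting at bit 0 (the rightmost bit).
$set_bits = [];
for ($bit = 0; $bit < 8; $bit++) {
    if ($value & (1 << $bit)) {
        $set_bits[] = $bit;
    }
}
// $set_bits is now [0, 6]: 1 + 64 = 65

echo "$binary $hex $char\n";
```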
Now, so much for the basics..
Meet your friends: or, xor, and
There are a few basic bit manipulation commands we can use:
- or
When or-ing two bits, the outcome will be 1 if at least one bit is 1.
- xor
When xor-ing two bits, the outcome will be 1 if both bits are different.
- and
When and-ing two bits, the outcome will be 1 if both bits are 1.
There are some more:
- not
If the bit you are not-ting is 1, the outcome will be 0, otherwise 1 (basically reversing the bit).
- shift left <<
The bits in the variable will be shifted X places to the left (more later).
- shift right >>
The bits in the variable will be shifted X places to the right (again, more later).
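As a quick sketch of these last three operations in PHP (the value 20 is just an assumed example; binary literals like 0b00010100 need PHP 5.4 or later): note that ~ flips every bit of the whole integer, not only the lowest eight, so the result is masked with 0xFF here to keep a single byte.

```php
<?php
$value = 0b00010100;        // 20

$not   = ~$value & 0xFF;    // 11101011 = 235 (masked down to one byte)
$left  = $value << 2;       // 01010000 = 80
$right = $value >> 2;       // 00000101 = 5

echo "$not $left $right\n";
```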
Some basic math
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1
0 OR 0 = 0
0 OR 1 = 1
1 OR 0 = 1
1 OR 1 = 1
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
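The tables above can be reproduced with PHP's bitwise operators &, | and ^. Keep in mind that the keywords and, or and xor in PHP are logical operators, not bitwise ones. The $table array in this sketch is just an illustrative name.

```php
<?php
// Build the three truth tables with the bitwise operators.
$table = [];
foreach ([0, 1] as $a) {
    foreach ([0, 1] as $b) {
        $table[] = [
            'a'   => $a,
            'b'   => $b,
            'and' => $a & $b,
            'or'  => $a | $b,
            'xor' => $a ^ $b,
        ];
    }
}

foreach ($table as $row) {
    printf("%d %d | AND %d | OR %d | XOR %d\n",
        $row['a'], $row['b'], $row['and'], $row['or'], $row['xor']);
}
```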
Great, but why do I care?
Suppose you have a variable that holds 16 different bits. Each bit is a “flag” that stands for a special case.
$error_level = 0;
Here, the variable is called $error_level and is set to 0. This means all the bits (flags) are 0 as well.
Now, we want to set the 3rd flag (bit 2). We know bit 2 has a value of 4, so we could just say:
$error_level = 4;
but this causes a problem: if there are already some bits set, they will be unset. So we solve this by OR-ing the flag:
$error_level = $error_level | 4;
Now, suppose error level is 131 (bits 0, 1 and 7 are set) and we are setting bit 2:
10000011
00000100 |
--------
10000111
Since we are dealing with bits, we only want to manipulate the bit in question; we do not care about any other bits. OR-ing lets us do exactly that: we only tell PHP which bit (or bits) we want to manipulate. No other bits will be harmed in the process.
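The difference between assigning and OR-ing can be sketched like this, using the bit pattern from the diagram as the starting value. The names $assigned and $or_ed are only there for the comparison.

```php
<?php
$error_level = 0b10000011;      // the pattern from the diagram: bits 0, 1 and 7 are set

$assigned = 4;                  // plain assignment: bits 0, 1 and 7 are lost
$or_ed    = $error_level | 4;   // OR: bit 2 is added, the rest is untouched

echo str_pad(decbin($assigned), 8, '0', STR_PAD_LEFT), "\n"; // 00000100
echo str_pad(decbin($or_ed), 8, '0', STR_PAD_LEFT), "\n";    // 10000111
```

In day-to-day code the shorthand $error_level |= 4; does the same as $error_level = $error_level | 4;.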
Suppose we want to make sure the 5th flag (bit 4, which has a value of 16) is not set (we don’t know if it’s currently set or not):
$error_level = $error_level & ~16;
Looks complicated, but let’s take a look:
The ~16 means: not 16. So every bit that is not part of 16 will be set to 1, and the bit that makes up 16 will be set to 0. The resulting value is called a ‘mask’. That gives us this:
16: 00010000
~16: 11101111
Now, we are going to AND this value with our variable:
10000111 <- random value that is in $error_level, could be anything
11101111& <- our ~16
---------
10000111
As you can see, nothing has changed. This is because bit 4 wasn’t set in the first place. Now, let’s try it with a $error_level value where bit 4 IS set:
10111101 <- almost all bits are set, including bit 4
11101111& <- our ~16
---------
10101101
As you can see, the result is the same as the original, except for bit 4 which is set to 0.
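The two diagrams above can be sketched in PHP as follows, using the same bit patterns. The variable names are assumptions for illustration, and $mask is cut down to one byte with & 0xFF purely so it matches the diagram.

```php
<?php
$mask = ~16 & 0xFF;             // 11101111

$without_flag = 0b10000111;     // the flag with value 16 is not set
$with_flag    = 0b10111101;     // the flag with value 16 is set

$a = $without_flag & ~16;       // unchanged:    10000111
$b = $with_flag & ~16;          // flag cleared: 10101101

echo str_pad(decbin($a), 8, '0', STR_PAD_LEFT), "\n";
echo str_pad(decbin($b), 8, '0', STR_PAD_LEFT), "\n";
```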
You can also set multiple bits at the same time:
10000000
10001001|
---------
10001001
Now, attentive readers may have noticed that instead of OR-ing values you might as well add them up with +. Be very careful with this: it only works as long as none of the bits you add are already set. If one of them is set already, the addition carries over into the next bit and changes a different flag. This is easy to get wrong, especially with flags that are made of several bits. When dealing with bit manipulation, use the bitwise operators, not the arithmetic ones.
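A small sketch of where + goes wrong, assuming the flag with value 4 is already set before we try to set it again:

```php
<?php
$flags = 0b00000100;                  // flag 4 is already set

$with_or   = $flags | 4;              // still 4: setting a bit that is set changes nothing
$with_plus = $flags + 4;              // 8: the addition carried into bit 3, flag 4 is now off

$several = 0b10000000 | 0b10001001;   // 10001001: several bits set in one go

echo "$with_or $with_plus $several\n";
```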
Neat tricks with bits
$value = $value ^ $value;
In assembly this is (was) one of the quickest ways to set a variable to zero. It makes sense:
10011011
10011011^
---------
00000000
Since all bits are equal to each other, every bit will produce a zero.
$value = $value >> 1;
This will divide the value by 2.
00010100 >> 1 (20 decimal)
00001010 bits shifted right 1 place (adding a 0 at the front). 10 decimal.
Works for everything (odd numbers are rounded downwards).
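Both tricks can be tried out directly. This is a minimal sketch with assumed example values; the left shift is included only as the mirror image of the right shift, and it doubles the value as long as no bits fall off the left end.

```php
<?php
$value = 20;

$zero  = $value ^ $value;   // 0: every bit is xor-ed with itself
$half  = $value >> 1;       // 10
$odd   = 21 >> 1;           // 10: bit 0 is dropped, so the result is rounded down
$twice = $value << 1;       // 40

echo "$zero $half $odd $twice\n";
```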
Check quickly if a number is even or odd (without doing a divide):
$is_odd = $value & 1;
This masks out everything except the first bit (bit 0). When this bit is 1, the value is odd (check it out yourself!)
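Wrapped in a function, the check could look like the sketch below. is_odd() is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: returns true when bit 0 of $value is set.
function is_odd($value)
{
    return ($value & 1) === 1;
}

var_dump(is_odd(7));    // bool(true)
var_dump(is_odd(10));   // bool(false)
```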
Why PHP does not do bits
You assign a variable in PHP probably this way:
$var = 1;
This will assign the value ‘1’ to $var. Note that you don’t specify what kind of variable it is. You assign it as an integer, but you can also use it as a string, or as a float. Doesn’t matter for PHP. Its internal structure (the ZVAL) does all the hard work for you, converting things the way you want it to. That’s one of the strengths (some say weaknesses) of PHP. As said: since you don’t specify the type, you cannot tell PHP that $var is a single bit. PHP simply does not work this way.
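The loose typing described above can be sketched as follows. The derived variable names are only for illustration; the last line already hints that the same integer can be treated as a set of bits.

```php
<?php
$var = 1;                   // assigned as an integer

$as_string = $var . '';     // "1"
$as_float  = $var + 0.5;    // 1.5
$as_bits   = $var | 2;      // 3: the same value used as a set of flags

var_dump($as_string, $as_float, $as_bits);
```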
Why PHP DOES do bits
Ok, so I lied.. you can do bit manipulation in PHP, but not really the way you’d expect. What we can do is use ANY variable as a value that stores bits. For instance, we can use strings, integers, even arrays. Doesn’t matter. Bitwise manipulation can come in handy even in PHP from time to time. It’s a simple way of dealing with on-off flags inside your code.
Suppose we have this:
Instead of all these options (including the getters/setters), you could have one method / property:
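As a rough sketch of that idea: the class below keeps all its on-off options in a single integer property and uses a few generic methods instead of one getter/setter pair per option. The class name Article and the three flag names are hypothetical and purely illustrative.

```php
<?php
// Hypothetical example class: all names here are made up for illustration.
class Article
{
    const FLAG_PUBLISHED = 1;   // bit 0
    const FLAG_FEATURED  = 2;   // bit 1
    const FLAG_COMMENTS  = 4;   // bit 2

    private $flags = 0;         // every option starts switched off

    public function setFlag($flag)   { $this->flags |= $flag; }
    public function clearFlag($flag) { $this->flags &= ~$flag; }
    public function hasFlag($flag)   { return ($this->flags & $flag) === $flag; }
}

$article = new Article();
$article->setFlag(Article::FLAG_PUBLISHED | Article::FLAG_COMMENTS);
$article->clearFlag(Article::FLAG_COMMENTS);

var_dump($article->hasFlag(Article::FLAG_PUBLISHED));   // bool(true)
var_dump($article->hasFlag(Article::FLAG_COMMENTS));    // bool(false)
```

Adding a new option then only takes a new constant with the next free bit value (8, 16, and so on).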
Conclusion
Bit manipulation comes in very handy from time to time. It can save space, improve speed and CAN increase readability when used correctly. An example of bit manipulation is very easy to spot: just look at the error_reporting function in PHP: http://nl.php.net/manual/en/function.error-reporting.php. Even if you didn’t understand what was going on there before, I hope you do now…
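For instance, a common way of calling error_reporting() combines the predefined E_* constants with the same AND and NOT operations used throughout this post. A minimal sketch (the variable names are only illustrative):

```php
<?php
// Report everything except notices: start from E_ALL and clear the E_NOTICE bit.
$level = E_ALL & ~E_NOTICE;
error_reporting($level);

// Called without arguments, error_reporting() returns the current level,
// so a single flag can be tested with AND.
$notices_on  = (error_reporting() & E_NOTICE) !== 0;    // false
$warnings_on = (error_reporting() & E_WARNING) !== 0;   // true

var_dump($notices_on, $warnings_on);
```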