Bits, Bytes and Bases First posted: June 26, 2001 Last Modified: June 26, 2001 As a car moves along the road, its odometer counts up tenths of a kilometer (or mile) in the right-most column. The digits go from zero to nine on an internal wheel, with only one digit visible through the odometer window. Once this rightmost wheel completes a full rotation, the next wheel to the left advances a notch, showing one complete kilometer (or mile) has been covered. This second wheel, upon completing a revolution, likewise triggers a "carry operation" causing the next wheel over to move up a notch, and so on, until the odometer's maximum readout has been attained. When the wheels read 999999... -- nines all the way across -- it next rolls around to all zeros, and reinitializes the counting cycle again. This odometer is using a base-10 counting system. Every wheel's notches or digits represent ten times more than the notches on the wheel to its right. An abacus works the same way: single beads in successive columns represent successive powers of a base. Which column a bead is in determines its weight or meaning. When this position-sensitive system is converted to written symbols, it's called positional notation. By way of contrast, consider a notation system that is not positional: the Roman numerals. The symbol I stands for 1, V for 5, X for 10, C for 100 and M for 1000. The rules specify some positional features, such that IV means 4 and VI means 6 -- it matters whether the I comes before or after the V. However there is no strict powering of a base with each successive column to the left. CCC = 300. CCCXXXIV = 334. In a different base, the wheels have a different period of revolution. Nor must the wheels always have the same number of notches. In a mixed base system, some will have more notches than others. For example, consider the standard time-keeping system: 60 seconds equal a minute, and 60 minutes equal an hour, but 24 hours make a day and 365 or 364 days make a year, depending on whether it's a leap year or not (leap seconds are also inserted from time to time, to keep clocks synchronized with Earth's cycles). So in our time-keeping system, we have an excellent example of mixed-base accounting: ``` >>> import time >>> time.localtime() (2001, 6, 26, 20, 3, 53, 1, 177, 1) localtime(...) localtime([seconds]) -> tuple Convert seconds since the Epoch to a time tuple expressing local time. When 'seconds' is not passed in, convert the current time instead. The tuple items are: year (four digits, e.g. 1998) month (1-12) day (1-31) hours (0-23) minutes (0-59) seconds (0-59) weekday (0-6, Monday is 0) Julian day (day in the year, 1-366) DST (Daylight Savings Time) flag (-1, 0 or 1) If the DST flag is 0, the time is given in the regular time zone; if it is 1, the time is given in the DST time zone; if it is -1, mktime() should guess based on the date and time. ``` In a base-2 system, the wheels have only two values, one and zero. Each digit, either one or zero, is called a bit. If this were an odometer, the miles or kilometers would accumulate as per the first column in the table below. The eight bits shown might represent the eight positions on an odometer. ``` Base 2 Base 10 Base 16 Value Equivalent Equivalent ========================================== 00000000 000 00 00000001 001 01 00000010 002 02 00000011 003 03 00000100 004 04 00000101 005 05 00000110 006 06 00000111 007 07 00001000 008 08 00010000 016 10 00100000 032 20 01000000 064 40 01111111 127 EF 10000000 128 F0 11111111 255 FF ``` In a base-16 or hexadecimal system, the digits 0-9 would be insufficient to mark all the notches on each wheel. By convention, we add the letters A-F, so that the sequence goes 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F -- at which point the next wheel to the left turns one notch and the sequence begins again. The base 16 system is intimately related to the base 2 system. We can break a string of eight bits into two groups of four, pairing each four-bit group with the corresponding hexadecimal value, as per the last column in the above table. These eight-bit strings, known as bytes, comprise a logical unit in many computer applications. Dr. Donald Knuth of Standford University has coined another term for sixteen-bit, or two-byte strings: the wyde. Every time a number of bits in a string is extended by one, the number of possible permutations of bits doubles. Just one bit has two possibilities: 0 or 1. Another bit in front allows for a first doubling: all the previous possibilities with a 0 in front, or a 1: 00 01 10 11. Adding a third bit doubles the possibilities again, by putting either a 1 or 0 in front of all the permutations made of two bits: 000 001 010 011 100 101 110 111. So the way to figure out the total number of possible permutations of 1s and 0s in a string N bits long, is to multiply 2 by itself N times. A short way of writing this is 2^N or 2**N (computer languages) or 2N (math typography). In cryptography, a key is often used to encode a message, and the length of this key is often expressed in bits. The total range of possibilities of an N-bit key is called the keyspace. Suppose N=512. That means any given key is one of a huge number of keys, 2 multiplied by itself 512 times (512 consecutive doublings). Expressed as a base-10 number, the size of this keyspace is: ``` 2**512 = 1340780792994259709957402499820584612747936 5820592393377723561443721764030073546976801 8742981669034276900318581864860508537538828 11946569946433649006084096 ``` And yet even though that looks like a very big number, many encryption methods use keys that are even longer.