
5.1 DATA REPRESENTATIONS

This section under construction.

All data on digital computers is represented as a sequence of 0s and 1s. This
includes numeric data, text, executable files, images, audio, and video. The ASCII
standard associates a 7-bit binary number with each of 128 distinct characters.
The MP3 file format rigidly specifies how to encode each raw audio file as a
sequence of 0s and 1s. All data are numbers, and all numbers are data.

In this section we describe how to represent integers in binary, decimal, and
hexadecimal and how to convert between different representations. We also
describe how to represent negative integers.

Number systems.

There are many ways to represent integers: the number of days in the month of
October can be represented as 31 in decimal, 11111 in binary, 1F in hexadecimal,
or XXXI in Roman Numerals. It is important to remember that an integer is an
integer, no matter whether it is represented in decimal or with Roman Numerals.

Decimal numbers. We are most familiar with performing arithmetic in the decimal
(base 10) number system. This number system has been widely adopted, in large
part because we have 10 fingers. However, other number systems still persist in
modern society.

Sexagesimal numbers. The Sumerians used a sexagesimal (base 60) number system.
We speculate that 60 was chosen since it is divisible by many integers: 1, 2, 3,
4, 5, 6, 10, 12, 15, 20, and 30. Most clocks are based on the sexagesimal system.
The Babylonians inherited sexagesimal numbers from the Sumerians. They divided a
circle into 360 degrees since they believed the Sun rotated around the Earth in
about 360 days. Ptolemy tabulated trigonometric tables using base 360, and, even
today, we still often use degrees instead of radians when doing geometry.

Binary numbers. Computers are based on the binary (base 2) number system because
each wire can be in one of two states (on or off).

Hexadecimal numbers. Writing numbers in binary is tedious since this
representation uses between 3 and 4 times as many digits as the decimal
representation. The hexadecimal (base 16) number system is often used as a
shorthand for binary. Base 16 is useful because 16 is a power of 2, and numbers
have roughly as many digits as in the corresponding decimal representation.

Beginning with the Babylonians, we represented numbers in these systems using
positional notation. A sequence of digits x_n x_(n-1) ... x_1 x_0 in base b
represents the integer

x = x_n b^n + x_(n-1) b^(n-1) + ... + x_1 b^1 + x_0 b^0.

The x_i terms are the positional digits, and each digit is required to be an integer
between 0 and b - 1. In binary, the two digits (also referred to as bits) are 0 and 1;
in decimal, the ten digits are 0 through 9; in hexadecimal, the sixteen digits are 0
through 9 and the letters A through F. Every nonnegative integer can be expressed
using positional notation, and the representation is unique (up to an arbitrary
number of leading 0s). As an example, the number of days in a leap year is
366 (base 10) = 101101110 (base 2) = 16E (base 16), since:
366 = 3*10^2 + 6*10^1 + 6*10^0
366 = 1*2^8 + 0*2^7 + 1*2^6 + 1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 1*2^1 + 0*2^0
366 = 1*16^2 + 6*16^1 + E*16^0

The following table gives the binary, decimal, and hexadecimal representations of
the first 48 integers.

BIN     DEC  HEX      BIN     DEC  HEX      BIN      DEC  HEX
0         0  0        10000    16  10       100000    32  20
1         1  1        10001    17  11       100001    33  21
10        2  2        10010    18  12       100010    34  22
11        3  3        10011    19  13       100011    35  23
100       4  4        10100    20  14       100100    36  24
101       5  5        10101    21  15       100101    37  25
110       6  6        10110    22  16       100110    38  26
111       7  7        10111    23  17       100111    39  27
1000      8  8        11000    24  18       101000    40  28
1001      9  9        11001    25  19       101001    41  29
1010     10  A        11010    26  1A       101010    42  2A
1011     11  B        11011    27  1B       101011    43  2B
1100     12  C        11100    28  1C       101100    44  2C
1101     13  D        11101    29  1D       101101    45  2D
1110     14  E        11110    30  1E       101110    46  2E
1111     15  F        11111    31  1F       101111    47  2F

Number conversion.

You need to know how to convert from a number represented in one system to
another.
Converting from base b to decimal. To convert
from an integer represented in base b to decimal,
multiply the ith digit by the ith power of b, and sum
up the results. For example, the binary number
101101110 is 366 in decimal.

  1    0    1    1    0    1    1    1    0   (binary)
256  128   64   32   16    8    4    2    1   (powers of 2)
-------------------------------------------
256 + 64 + 32 + 8 + 4 + 2                     (multiplied by corresponding digit)
366

The hexadecimal number 16E is 366 in decimal.


  1    6    E   (hex)
256   16    1   (powers of 16)
-------------
256 + 96 + 14   (multiplied by corresponding digit)
366
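
The same computation can be sketched in Java. This is a minimal illustration, not the BinaryConverter.java program mentioned later: the class name BaseToDecimal and the method fromBase are hypothetical, and the loop uses Horner's rule (multiply the running value by b, then add the next digit), which gives the same result as summing each digit times its power of b. It assumes a nonnegative input that fits in an int and does not check for overflow.

```java
class BaseToDecimal {
    // Hypothetical helper: interprets s as a nonnegative integer written in base b (2 <= b <= 36).
    static int fromBase(String s, int b) {
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            // Character.digit returns -1 if the character is not a valid digit in base b.
            int digit = Character.digit(s.charAt(i), b);
            if (digit < 0) throw new NumberFormatException("bad digit: " + s.charAt(i));
            value = value * b + digit;   // Horner's rule; no overflow check in this sketch
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fromBase("101101110", 2));   // 366
        System.out.println(fromBase("16E", 16));        // 366
    }
}
```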
Converting from decimal to base b. It is
slightly more difficult to convert an integer
represented in decimal to one in base b
because we are accustomed to performing
arithmetic in base 10. The easiest way to
convert from decimal to base b by hand is to
repeatedly divide by the base b, and read the
remainders from bottom to top. For example, the
calculations below convert from the decimal
integer 366 to binary (101101110) and to
hexadecimal (16E).
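
The repeated-division procedure can be sketched as follows. The class name DecimalToBase, the method toBase, and the DIGITS string are hypothetical names chosen for illustration; the sketch assumes n is nonnegative and 2 <= b <= 36. Each remainder is the next digit, least significant first, so the digits are reversed at the end (the code equivalent of reading the remainders upwards).

```java
class DecimalToBase {
    static final String DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Hypothetical helper: returns the base-b representation of a nonnegative int n.
    static String toBase(int n, int b) {
        if (n == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            sb.append(DIGITS.charAt(n % b));   // remainder is the next digit, least significant first
            n /= b;                            // quotient carries on to the next step
        }
        return sb.reverse().toString();        // most significant digit first
    }

    public static void main(String[] args) {
        System.out.println(toBase(366, 2));    // 101101110
        System.out.println(toBase(366, 16));   // 16E
    }
}
```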

Converting between base b1 and b2. One way
is to convert the base b1 integer to decimal
(using the first algorithm described above)
and then to convert the resulting decimal
integer to base b2 (using the second algorithm
described above).
Converting between binary and hexadecimal.
We describe a fast and elegant way to convert
directly from the binary to hexadecimal
representation of an integer: First, group the digits 4
at a time starting from the right; then convert each
group to a single hexadecimal digit, padding the
leftmost group with leading 0s if necessary. For example, the
hexadecimal representation of
111010111001110001 is 3AE71.

0011 1010 1110 0111 0001
   3    A    E    7    1

To convert from hexadecimal to binary: convert each
hexadecimal digit individually into its corresponding
4-digit binary number, then remove any leading 0s from the result.

   9    F    0    3
1001 1111 0000 0011

This works because one base (16) is a power
of the other (2). Likewise, it would be easy to
convert between the base 125 and base 5
representations.
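
The grouping trick can be sketched in Java as below. The class BinaryHex and its two methods are hypothetical names used only for illustration; hexToBinary keeps every 4-bit group intact, so any leading 0s are left for the caller to strip.

```java
class BinaryHex {
    // Hypothetical helper: groups bits 4 at a time from the right and maps each group to one hex digit.
    static String binaryToHex(String bits) {
        StringBuilder padded = new StringBuilder(bits);
        while (padded.length() % 4 != 0) padded.insert(0, '0');   // pad the leftmost group
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < padded.length(); i += 4) {
            int group = Integer.parseInt(padded.substring(i, i + 4), 2);
            hex.append(Character.toUpperCase(Character.forDigit(group, 16)));
        }
        return hex.toString();
    }

    // Hypothetical helper: expands each hex digit into its 4-bit binary group.
    static String hexToBinary(String hex) {
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < hex.length(); i++) {
            int d = Character.digit(hex.charAt(i), 16);
            if (d < 0) throw new NumberFormatException("bad hex digit: " + hex.charAt(i));
            String group = Integer.toBinaryString(d);
            for (int k = group.length(); k < 4; k++) bits.append('0');   // left-pad to 4 bits
            bits.append(group);
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(binaryToHex("111010111001110001"));   // 3AE71
        System.out.println(hexToBinary("9F03"));                 // 1001111100000011
    }
}
```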
Number conversion in Java. Converting a
string to another type is called parsing, and
can be a significant computational burden,
since there are normally numerous cases to
consider. We have already used one such
built-in static method extensively:
Integer.parseInt() converts from a string
of decimal digits to an integer. Program
BinaryConverter.java defines a static method
fromBinaryString(s) that converts a string
s of binary digits to an integer, and another
static method toBinaryString(n) that
converts an int n to a binary string. We note
that Java's library calls
Integer.parseInt(s, 2) and
Integer.toBinaryString(n) accomplish the
same goals as fromBinaryString(s) and
toBinaryString(n), respectively.
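
As a rough illustration of the library calls named above, the sketch below converts a string from one radix to another by going through an int. The class LibraryConversions and the method convert are hypothetical; Integer.parseInt(s, radix) and Integer.toString(n, radix) are standard library methods. Note that the library produces lowercase letters for digits above 9.

```java
class LibraryConversions {
    // Hypothetical helper: parse s in base `from`, then print the value in base `to`.
    static String convert(String s, int from, int to) {
        int n = Integer.parseInt(s, from);   // string of base-`from` digits -> int
        return Integer.toString(n, to);      // int -> string of base-`to` digits
    }

    public static void main(String[] args) {
        System.out.println(Integer.parseInt("101101110", 2));   // 366
        System.out.println(Integer.toBinaryString(366));        // 101101110
        System.out.println(Integer.toHexString(366));           // 16e
        System.out.println(convert("16E", 16, 2));              // 101101110
    }
}
```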

Program HexInOut.java reads in a
hexadecimal integer from standard input and
prints it out in hexadecimal to standard
output.

Arithmetic in other number systems.

One way to perform arithmetic is to convert all of the numbers to base 10, perform
arithmetic as usual, and then convert back. In many cases, it is easier to perform
the arithmetic directly in the given number system.
Addition. In grade school you learned how to
add two decimal integers: add the two least
significant digits (rightmost digits); if the sum
is 10 or more, then carry a 1 and write
down the sum modulo 10. Repeat with the
next digit, but this time include the carry in
the addition. The same procedure generalizes
to base b by replacing the 10 with the base b.
For example, if you are working in base 16
and the two summand digits are 7 and E, then
you should carry a 1 and write down a 5
because 7 + E = 15 (base 16). Below, we compute
4567 + 366 = 4933 (all decimal) in binary (left), decimal
(middle) and hexadecimal (right).

  1 0 0 0 1 1 1 0 1 0 1 1 1        4 5 6 7        1 1 D 7
+ 0 0 0 0 1 0 1 1 0 1 1 1 0      +   3 6 6      +   1 6 E
---------------------------      ---------      ---------
  1 0 0 1 1 0 1 0 0 0 1 0 1        4 9 3 3        1 3 4 5
Multiplication. One compelling reason to use
positional number systems is to facilitate
multiplication. Multiplying two Roman
Numerals is awkward and slow. In contrast,
the grade school algorithm for multiplying two
decimal integers is straightforward and
reasonably efficient. As with addition, it easily
generalizes to handle base b integers. All
intermediate single-digit multiplications and
additions are done in base b. Below, we multiply
the decimal integers 4,567 and 366 (left) and then
again in hex (right).

      4 5 6 7            1 1 D 7
    *   3 6 6          *   1 6 E
    ---------          ---------
    2 7 4 0 2            F 9 C 2
  2 7 4 0 2            6 B 0 A
1 3 7 0 1            1 1 D 7
-------------        -----------
1 6 7 1 5 2 2        1 9 8 1 6 2

Negative integers.

On a machine with 16-bit words, there are 2^16 = 65,536 possible integers that can
be stored in one word of memory. By interpreting the 16 bits as a binary number,
we obtain an unsigned integer in the range 0 through 65,535. Instead, we can
interpret the leading bit as the sign of the number, using two's complement
notation. This allows us to interpret the 16 bits as a signed integer in the range
-32,768 through +32,767, as described in the table below. As with binary integers, it is often
convenient to express 16-bit two's complement integers using hexadecimal notation.

BINARY                HEX    DECIMAL
0000 0000 0000 0000   0000         0
0000 0000 0000 0001   0001        +1
0000 0000 0000 0010   0002        +2
0000 0000 0000 0011   0003        +3
...
0111 1111 1111 1110   7FFE   +32,766
0111 1111 1111 1111   7FFF   +32,767
1000 0000 0000 0000   8000   -32,768
1000 0000 0000 0001   8001   -32,767
1000 0000 0000 0010   8002   -32,766
...
1111 1111 1111 1101   FFFD        -3
1111 1111 1111 1110   FFFE        -2
1111 1111 1111 1111   FFFF        -1

Negating a two's complement integer. To
negate (change from positive to negative or
negative to positive) a two's complement
integer, first complement all of the bits, then
add 1. By complement, we mean replace all of
the 0s with 1s, and the 1s with 0s. (One way
to think about it is that numbers in the range
-2^31 to 2^31 - 1 are stored modulo 2^32; for the
16-bit words used here, the range is -2^15 to
2^15 - 1 and the modulus is 2^16.) The table
below illustrates a few examples.

DECIMAL   BINARY                COMPLEMENT            INCREMENT             DECIMAL
     +3   0000 0000 0000 0011   1111 1111 1111 1100   1111 1111 1111 1101        -3
     -3   1111 1111 1111 1101   0000 0000 0000 0010   0000 0000 0000 0011        +3
    +40   0000 0000 0010 1000   1111 1111 1101 0111   1111 1111 1101 1000       -40
    -40   1111 1111 1101 1000   0000 0000 0010 0111   0000 0000 0010 1000       +40
   +366   0000 0001 0110 1110   1111 1110 1001 0001   1111 1110 1001 0010      -366
      0   0000 0000 0000 0000   1111 1111 1111 1111   0000 0000 0000 0000         0
-32,768   1000 0000 0000 0000   0111 1111 1111 1111   1000 0000 0000 0000   -32,768
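
The flip-then-increment rule can be tried out in Java with the 16-bit short type. This is a small sketch under the assumption that short stands in for the 16-bit word discussed here; the class TwosComplementNegate and its methods are hypothetical names.

```java
class TwosComplementNegate {
    // Negate a 16-bit two's complement value: complement all bits, then add 1.
    static short negate(short x) {
        return (short) (~x + 1);   // the cast keeps only the low 16 bits
    }

    // Hypothetical helper: the 16-bit pattern of x as a string of 0s and 1s.
    static String bits16(short x) {
        String s = Integer.toBinaryString(x & 0xFFFF);   // mask off sign extension
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < 16; i++) sb.append('0');
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(negate((short) 366));                // -366
        System.out.println(bits16(negate((short) 366)));        // 1111111010010010
        System.out.println(negate(Short.MIN_VALUE));            // -32768 (it is its own negation)
    }
}
```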


Converting a hexadecimal two's complement
integer into decimal. To convert the 16-bit
two's complement integer FE92 into decimal,
we start by writing down its binary
representation: 1111 1110 1001 0010. We
recognize it as a negative integer since the
most significant bit is 1. Then, we negate it
(flip bits and add 1) to obtain: 0000 0001
0110 1110. We convert 101101110 (binary)
to 366 (decimal). After putting back in the
negative sign, we obtain the final answer of
-366 (decimal).
Converting a negative decimal integer into its
hexadecimal two's complement
representation. To convert from the decimal
integer -366 to its 16-bit two's complement
representation, we can apply the above steps
in reverse. First we convert 366 (decimal) into
101101110 (binary). Next, we negate it (flip
bits and add one) to obtain 1111 1110 1001
0010. It is important that we fill in all 16 bits.
If desired, we can convert this to hexadecimal:
FE92.
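
Both conversions can be checked mechanically. The sketch below is one possible way to do it in Java, again assuming that short models the 16-bit word; HexTwosComplement, fromHex16, and toHex16 are hypothetical names. Narrowing an int to a short keeps the low 16 bits, so the top bit of the hex value becomes the sign bit.

```java
class HexTwosComplement {
    // Interpret up to four hex digits as a 16-bit two's complement integer.
    static short fromHex16(String hex) {
        return (short) Integer.parseInt(hex, 16);   // narrowing keeps the low 16 bits
    }

    // Format a 16-bit two's complement integer as exactly four hex digits.
    static String toHex16(short x) {
        return String.format("%04X", x & 0xFFFF);   // mask so all 16 bits are shown, no more
    }

    public static void main(String[] args) {
        System.out.println(fromHex16("FE92"));        // -366
        System.out.println(toHex16((short) -366));    // FE92
    }
}
```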
Adding two's complement integers. Adding
two's complement integers is straightforward:
add the numbers as if they were unsigned
integers, ignoring any overflow. Below we
compute 4567 + (-366) = 4201 in decimal
(left) and again in binary using 16-bit two's
complement integers (right). Note that the second
binary integer represents a negative integer using
two's complement notation.

    4 5 6 7        0 0 0 1 0 0 0 1 1 1 0 1 0 1 1 1
+    -3 6 6      + 1 1 1 1 1 1 1 0 1 0 0 1 0 0 1 0
-----------      ---------------------------------
    4 2 0 1        0 0 0 1 0 0 0 0 0 1 1 0 1 0 0 1

In the example above, we carry a one into the
most significant bit (leftmost bit), and we
carry a one out, which we subsequently
discard. Despite the apparent overflow, we are
left with the correct result.
Overflow. However, there is one situation
where this addition produces an "incorrect"
answer: if we do not carry a one into the most
significant bit, but do carry a one out. (The
mirror-image case, carrying a one in but not out,
is also an overflow; it arises when the sum of two
positive integers is too large.) Below
we compute -32,766 + (-366) = -33,132
using 16-bit two's complement integers. At first, we
might be surprised to see that the result of adding
two negative integers is a positive integer (to see this
quickly, look at the most significant bit of the result).
This occurs because there is a carry out of the most
significant bit, but no carry in to it. In hindsight, we
should not be surprised because the true answer
-33,132 cannot be represented as a 16-bit two's
complement integer.

   -3 2 7 6 6        1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
+     -3 6 6      + 1 1 1 1 1 1 1 0 1 0 0 1 0 0 1 0
-------------      ---------------------------------
   -3 3 1 3 2        0 1 1 1 1 1 1 0 1 0 0 1 0 1 0 0
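
The two additions above can be reproduced with Java's 16-bit short type. The sketch below is illustrative only: Overflow16, add16, and overflows are hypothetical names, and the overflow test simply compares the exact int sum against the range of a short rather than inspecting carries.

```java
class Overflow16 {
    // Add as 16-bit two's complement integers, discarding any carry out.
    static short add16(short a, short b) {
        return (short) (a + b);   // a + b is computed as an int, then truncated to 16 bits
    }

    // True if the exact sum does not fit in 16 bits.
    static boolean overflows(short a, short b) {
        int exact = a + b;
        return exact < Short.MIN_VALUE || exact > Short.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(add16((short) 4567, (short) -366));        // 4201
        System.out.println(overflows((short) 4567, (short) -366));    // false
        System.out.println(add16((short) -32766, (short) -366));      // 32404 (bit pattern 7E94)
        System.out.println(overflows((short) -32766, (short) -366));  // true
    }
}
```

For 32-bit int values, the standard library method Math.addExact(a, b) throws an ArithmeticException instead of silently wrapping around.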

Bit-whacking operators in Java.


In Java, an int is a 32-bit two's complement integer. Java supports a number of
operators to manipulate the bits of integer types, as summarized below. These
bit-whacking operators are especially useful when performing low-level data
processing such as cryptography, data compression, error correction, and
transmitting email.

NAME                   OPERATOR   PURPOSE
bitwise NOT            ~x         Flip all the bits of x
bitwise AND            x & y      Take the AND of each pair of bits
bitwise OR             x | y      Take the OR of each pair of bits
bitwise XOR            x ^ y      Take the XOR of each pair of bits
left shift             x << y     Move the bits of x to the left y positions, filling with 0s
right shift            x >> y     Move the bits of x to the right y positions, filling with copies of the sign bit
unsigned right shift   x >>> y    Move the bits of x to the right y positions, filling with 0s
Program BitWhacking.java reads in two integers a and b from the command line,
applies the bit-whacking operations, and prints the results.
Bitwise not.

     43 = 0000 0000 0000 0000 0000 0000 0010 1011
   ~ 43 = 1111 1111 1111 1111 1111 1111 1101 0100 = -44

Bitwise and.

     43 = 0000 0000 0000 0000 0000 0000 0010 1011
     17 = 0000 0000 0000 0000 0000 0000 0001 0001
43 & 17 = 0000 0000 0000 0000 0000 0000 0000 0001 = 1

Bitwise or.

     43 = 0000 0000 0000 0000 0000 0000 0010 1011
     17 = 0000 0000 0000 0000 0000 0000 0001 0001
43 | 17 = 0000 0000 0000 0000 0000 0000 0011 1011 = 59

Bitwise xor.

     43 = 0000 0000 0000 0000 0000 0000 0010 1011
     17 = 0000 0000 0000 0000 0000 0000 0001 0001
43 ^ 17 = 0000 0000 0000 0000 0000 0000 0011 1010 = 58

Left shift. Left shifting an integer by i bits
means that each of the bits in the integer is
moved i positions to the left. Zeros are padded
to the right end. This is equivalent to
multiplying the integer by 2^i (provided the
result does not overflow). For example 43
<< 3 is 344 and -43 << 3 is -344.

      43 = 0000 0000 0000 0000 0000 0000 0010 1011
 43 << 3 = 0000 0000 0000 0000 0000 0001 0101 1000 = 344
     -43 = 1111 1111 1111 1111 1111 1111 1101 0101
-43 << 3 = 1111 1111 1111 1111 1111 1110 1010 1000 = -344

Right shift. Right shifting an integer by i bits means
that you shift all of the bits i positions to the
right, padding 0s to the beginning if the
integer is nonnegative and 1s if the integer is
negative. It is equivalent to dividing the
integer by 2^i and rounding the result down
(towards -infinity). For example 43 >> 3 is 5
and -43 >> 3 is -6. For positive integers, this is
equivalent to integer division by 2^i, but for negative
integers the results can differ, because Java's integer
division rounds toward zero: -43 / 8 is -5.

      43 = 0000 0000 0000 0000 0000 0000 0010 1011
 43 >> 3 = 0000 0000 0000 0000 0000 0000 0000 0101 = 5
  43 / 8 = 0000 0000 0000 0000 0000 0000 0000 0101 = 5
     -43 = 1111 1111 1111 1111 1111 1111 1101 0101
-43 >> 3 = 1111 1111 1111 1111 1111 1111 1111 1010 = -6
 -43 / 8 = 1111 1111 1111 1111 1111 1111 1111 1011 = -5

Unsigned right shift. An unsigned right shift is
identical to a signed right shift when applied to
positive integers, but differs when applied to
negative integers. An unsigned right shift
always pads 0s to the left, regardless of
whether the integer is positive or negative. For
example, 43 >>> 3 is 5 but -43 >>> 3 is
536870906. To see why, observe that

       43 = 0000 0000 0000 0000 0000 0000 0010 1011
 43 >>> 3 = 0000 0000 0000 0000 0000 0000 0000 0101 = 5
      -43 = 1111 1111 1111 1111 1111 1111 1101 0101
-43 >>> 3 = 0001 1111 1111 1111 1111 1111 1111 1010 = 536870906
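
The three shift operators can be compared side by side with a short program. This is a sketch rather than the BitWhacking.java program referred to above: ShiftDemo and its helper bits are hypothetical names, and bits prints the 32-bit pattern in groups of four as in the examples here.

```java
class ShiftDemo {
    // Hypothetical helper: the 32-bit pattern of x, in groups of 4 bits.
    static String bits(int x) {
        StringBuilder sb = new StringBuilder();
        for (int i = 31; i >= 0; i--) {
            sb.append((x >>> i) & 1);                  // bit i of x
            if (i % 4 == 0 && i != 0) sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int x = -43;
        System.out.println(bits(x << 3)  + " = " + (x << 3));    // -344
        System.out.println(bits(x >> 3)  + " = " + (x >> 3));    // -6, rounds toward -infinity
        System.out.println(bits(x >>> 3) + " = " + (x >>> 3));   // 536870906
        System.out.println(x / 8);                               // -5, rounds toward zero
        System.out.println(Math.floorDiv(x, 8));                 // -6, agrees with x >> 3
    }
}
```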

Java library functions for bit-whacking.


The library Integer contains a number of useful functions for bit-whacking. The
table below refers to the two's complement representation of the int variable x.

FUNCTION                           RETURN VALUE
Integer.bitCount(x)                Number of one-bits in x.
Integer.highestOneBit(x)           Zero out all but the leftmost one bit of x.
Integer.lowestOneBit(x)            Zero out all but the rightmost one bit of x.
Integer.numberOfLeadingZeros(x)    Number of zero bits preceding highest one bit.
Integer.numberOfTrailingZeros(x)   Number of zero bits following lowest one bit.
Integer.rotateLeft(x, i)           Rotate x by circularly shifting i bits to the left.
Integer.rotateRight(x, i)          Rotate x by circularly shifting i bits to the right.
Integer.reverse(x)                 Reverse the bits of x.
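
A few of these functions in action are sketched below. The Integer methods are standard library calls; the class IntegerLibraryDemo and the two small helpers built on top of them (isPowerOfTwo and log2Floor) are hypothetical examples of how the functions might be combined.

```java
class IntegerLibraryDemo {
    // A positive int is a power of 2 exactly when it has a single one-bit.
    static boolean isPowerOfTwo(int x) {
        return x > 0 && Integer.bitCount(x) == 1;
    }

    // Position of the highest one-bit, i.e., floor(log2(x)) for x > 0.
    static int log2Floor(int x) {
        return 31 - Integer.numberOfLeadingZeros(x);
    }

    public static void main(String[] args) {
        System.out.println(Integer.bitCount(43));                // 4  (43 = 101011)
        System.out.println(Integer.highestOneBit(43));           // 32
        System.out.println(Integer.lowestOneBit(40));            // 8  (40 = 101000)
        System.out.println(Integer.numberOfLeadingZeros(43));    // 26
        System.out.println(Integer.numberOfTrailingZeros(40));   // 3
        System.out.println(Integer.rotateLeft(43, 4));           // 688
        System.out.println(Integer.reverse(1));                  // -2147483648
        System.out.println(isPowerOfTwo(64) + " " + log2Floor(43));   // true 5
    }
}
```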

Big Endian, little endian.

Computers differ in the way in which they store multi-byte chunks of information,
e.g., the 16-bit short integer 0111000011110010 = 70F2. This consists of the two
bytes 70 and F2, where each byte encodes 8 bits. There are two primary formats,
and they differ only in the order or "endianness" in which they store the bytes.

Big endian systems store the most significant
bytes first, e.g., they store the integer above
in its natural order 70F2. Java uses this
format, as do Apple Mac, IBM PowerPC G5,
Cray, and Sun Sparc.

Little endian systems store the least significant
bytes first, e.g., they store the integer above
in reverse byte-order F270. This format is
more natural when manually performing
arithmetic, since for addition you work
from least significant byte to most significant
byte. Intel 8086, Intel Pentium, and Intel Xeon use
this format.

Computer scientists occasionally engage in religious wars about which is better.
Fortunately, Java hides the endianness from the end user, so if you create binary
data files in Java, you won't need to worry about endianness when sharing them
over the Internet with Java users on Mac, PC, and Solaris platforms. Unless you
need to read in a legacy binary file (e.g., written in C on a PC), you shouldn't have
to directly confront these details.
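
If you do need to read or write data in a specific byte order, one option is the standard java.nio.ByteBuffer class, whose byte order can be set explicitly. The sketch below is an assumption-laden illustration, not something the programs in this section use: EndianDemo and its methods are hypothetical names, and it simply shows the two byte sequences for the 16-bit value 70F2.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Hypothetical helper: the two bytes of value, in the given byte order, as hex.
    static String hex(short value, ByteOrder order) {
        byte[] bytes = ByteBuffer.allocate(2).order(order).putShort(value).array();
        return String.format("%02X %02X", bytes[0] & 0xFF, bytes[1] & 0xFF);
    }

    static String bigEndian(short value)    { return hex(value, ByteOrder.BIG_ENDIAN); }
    static String littleEndian(short value) { return hex(value, ByteOrder.LITTLE_ENDIAN); }

    public static void main(String[] args) {
        short x = (short) 0x70F2;
        System.out.println(bigEndian(x));      // 70 F2
        System.out.println(littleEndian(x));   // F2 70
    }
}
```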

Historical note: big endian and little endian derive from Gulliver's Travels. How to
crack an egg? At little end or big end?

Dangers of not recognizing overflow. In September 2004, a computer glitch
grounded air traffic in southern California for several hours and left planes without
radio communication. The software was running on Windows 2000 Advanced
Server. The software relied on a Windows function named GetTickCount(), which
returns the number of milliseconds since the system was started. Unfortunately,
this value is represented using a 32-bit integer. After approximately 49.7 days
(2^32 milliseconds), the value overflows. This glitch was known when the system
was launched. To avoid the ensuing problems, the documentation recommended
manually rebooting the system once a month.

Q+A

Q. How much new information is created each year?

A. Here is a relatively recent study.

Q. Why base 10?

A. Largely because we have 10 fingers. Although 10 fingers enable you to count in
base 11 (from 0 to 10), the decimal system was introduced before mathematicians
accepted zero as a number.

Q. Do all programming languages use 32 bit two's complement integers?

A. No. Java is unusual in that it completely specifies the representation of an int.
In C, there is no requirement that an int be 32 bits or that it uses two's
complement notation to represent negative integers. Different C compilers can
represent integers in different ways, and this can lead to incompatibilities when
trying to use the same program with different compilers.

Q. My program only needs integers between -32,768 and 32,767. Does using a 16-
bit short make my program faster? Does it save space?

A. Typically a single short variable is internally stored using 32 bits (especially if
the underlying hardware architecture is 32 bit), in which case it does not save any
space to declare a single short. Moreover, it can take longer because the Java
system must make the 32 bits behave as if they were representing a 16-bit two's
complement integer. On the other hand, if you declare a huge array of type short,
then the elements will be packed two to an int, and the short array will use
approximately half as much memory as an int array of the same length.

Q. Why do I need to cast to add two variables of type short?

A. Java converts the results of most integer operations to be of type int. If a
and b are of type short, then a + b is promoted to type int, and assigning this
sum to a short variable c requires an explicit cast. One exception to this rule is
if you use +=, in which case the cast is performed automatically.

Q. Can I apply bitwise & and bitwise | to boolean values? If so, is there any
difference between the corresponding logical operators && and ||?

A. Yes. The difference is that the logical operators are subject to short circuiting:
the expression (f(x) && g(x)) will not evaluate g(x) if f(x) is false. This also
explains why there is no ^^ operator for logical XOR: it is never possible to short-
circuit an XOR, so it would always be identical to bitwise ^.

Q. What happens if I shift an int more than 31 places?

A. You can't. Java only uses the five low-order bits of the second operand. This has
the effect of reducing the shift count mod 32. Another consequence of this is
that left shifting by a negative integer does not right shift the number. This
behavior coincides with the physical hardware on many microprocessors.
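
A quick experiment makes the rule concrete. In the sketch below, the hypothetical method shiftLeft (in the hypothetical class ShiftCount) spells out the masking that Java applies implicitly to the shift count of an int.

```java
class ShiftCount {
    // What Java does implicitly for an int: only the five low-order bits of count are used.
    static int shiftLeft(int x, int count) {
        return x << (count & 31);
    }

    public static void main(String[] args) {
        int one = 1;
        System.out.println(one << 32);    // 1, since 32 mod 32 is 0
        System.out.println(one << 33);    // 2, since 33 mod 32 is 1
        System.out.println(one << -1);    // -2147483648, since the low five bits of -1 are 31
        System.out.println(shiftLeft(one, 33) == (one << 33));   // true
    }
}
```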

Q. I need an unsigned 32-bit integer, but Java only has signed 32-bit integers.
What should I do?

A. First, are you sure that you really need an unsigned type? Signed and unsigned
integers behave identically on the bitwise operators (except >>), addition,
subtraction, and multiplication. In many applications, these are sufficient, assuming
you replace >> with >>>. Comparison operators are easy to simulate by checking
the sign bit. Division and remainder are the trickiest: the easiest solution is to
convert to type long.
long MASK = (1L << 32) - 1; // 0x00000000FFFFFFFF;
int quotient = (int) ((a & MASK) / (b & MASK));
int remainder = (int) ((a & MASK) % (b & MASK));
Program UnsignedDivision.java uses this trick, and also does it directly using 32-bit
operations.
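
Since Java 8, the Integer class also provides static methods that treat an int as unsigned. The sketch below compares them with the long-based trick above; UnsignedOps and divideViaLong are hypothetical names, while Integer.divideUnsigned, Integer.remainderUnsigned, Integer.compareUnsigned, and Integer.toUnsignedString are standard library methods.

```java
class UnsignedOps {
    // The long-based trick: widen both operands to their unsigned values, then divide.
    static int divideViaLong(int a, int b) {
        long MASK = (1L << 32) - 1;   // 0x00000000FFFFFFFF
        return (int) ((a & MASK) / (b & MASK));
    }

    public static void main(String[] args) {
        int a = -1;    // bit pattern FFFFFFFF, i.e., 4294967295 when read as unsigned
        int b = 10;
        System.out.println(Integer.toUnsignedString(a));       // 4294967295
        System.out.println(divideViaLong(a, b));               // 429496729
        System.out.println(Integer.divideUnsigned(a, b));      // 429496729
        System.out.println(Integer.remainderUnsigned(a, b));   // 5
        System.out.println(Integer.compareUnsigned(a, b) > 0); // true: a is larger as unsigned
    }
}
```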

Q. I need an unsigned 8-bit integer, but Java only has signed 8-bit integers (bytes).
What should I do?

A. Same advice as previous question. One place where it's nice to have unsigned
integers is for a lookup table, indexed by the byte. With signed integers the index
can be negative. Also, if b is a byte, then b << 4 automatically casts b to an int.
This could be undesirable since b is signed. In many applications you need to
remove the signed extended bits via (b << 4) & 0xff.

Q. Why does (b << i) give weird results when b is a byte?

A. In Java, byte is an 8-bit signed integer. Before the shift, b is converted to
an int, with its sign bit extended. You may want ((b & 0xff) << i) instead.
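
The difference shows up as soon as the top bit of the byte is set. A minimal sketch, with hypothetical names ByteShift, shiftSigned, and shiftUnsigned:

```java
class ByteShift {
    // b is sign-extended to an int before shifting.
    static int shiftSigned(byte b, int i) {
        return b << i;
    }

    // Masking with 0xff first treats b as an unsigned value from 0 to 255.
    static int shiftUnsigned(byte b, int i) {
        return (b & 0xff) << i;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF0;                      // -16 as a signed byte, 240 as unsigned
        System.out.println(shiftSigned(b, 4));     // -256  (FFFFFF00)
        System.out.println(shiftUnsigned(b, 4));   // 3840  (00000F00)
    }
}
```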

Exercises

1. Convert the decimal number 92 to binary.

Answer: 1011100.
2. Convert the hexadecimal number BB23A to
octal.

Answer: first convert to binary 1011 1011
0010 0011 1010, then consider the bits three
at a time 10 111 011 001 000 111 010,
and convert to octal 2731072.
3. Add the two hexadecimal numbers 23AC and
4B80 and give the result in hexadecimal. Hint:
add directly in hex instead of converting to
decimal, adding, and converting back.
4. Assume m and n are positive integers. How
many 1 bits are there in the binary
representation of 2^(m+n)?
5. What is the only two-digit decimal integer whose
digits are reversed when it is written in hexadecimal?

Answer: 53.
6. How many bits are in the binary
representation of 2^2^2^2^17?
7. IPv4 is the protocol developed in the 1970s
that dictates how computers on the Internet
communicate. Each computer on the Internet
needs its own Internet address. IPv4 uses 32
bit addresses. How many computers can the
Internet handle? Is this enough for every
human being to have their own? Every cell
phone and toaster?
8. IPv6 is an Internet protocol in which each
computer has a 128 bit address. How many
computers would the Internet be able to
handle if this standard is adopted? Is this
enough? Answer: 2^128. That is at least enough
for the short term - 5000 addresses per
square micrometer of the Earth's surface!
9. When you buy a hard drive, 1 GB means
1,000 MB (megabytes) and 1 MB means 1,000
KB (kilobytes) and 1 KB means 1,000 bytes.
But when you buy memory, 1 GB means 1,024
MB, 1 MB means 1,024 KB, and 1 KB means
1,024 bytes. What percentage difference is
there in the amount of storage in a 100 MB
hard drive vs. 100 MB memory? 1 GB hard
drive vs. 1 GB memory? Answer: (1024/1000)^2 = 1.049,
a 4.9% difference, and (1024/1000)^3 = 1.074, a 7.4%
difference.
10. Why does the following code fragment fail?

short a = 4;
short b = 5;
short c = a + b;
Answer: Java automatically promotes the sum
to be of type int. To assign the result to a
short, you need to explicitly cast it back c =
(short) (a + b). Yes, this is rather quirky.

11. What does the following code fragment from
program Overflow.java print out?

int a = 2147483647; // 2^31 - 1
int b = a + 1;
System.out.println("a = " + a);
System.out.println("b = " + b);
12. What does the following code fragment print out?

int a = -5 >> 3;
int b = -5 >>> 3;
System.out.println(a);
System.out.println(b);
13. List all values a of type int for which (a ==
(a >> 1)). Hint: there is more than one.
14. Suppose a is a variable of type int. Find two
values of a for which (a == -a) is true.
Answer: 0 and -2147483648.
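15. What is the result of a = -1 * -2147483648?
Answer: -2147483648. The true product, 2147483648 = 2^31,
is one more than the largest int, so it overflows and
wraps around to -2^31.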
16. What does the following code fragment print out?

int a = 11 & 17;
int b = 11 ^ 17;
int c = 11 | 17;
int d = ~11;
System.out.println(a);
System.out.println(b);
System.out.println(c);
System.out.println(d);
17. Given two positive integers a and b, what
result does the following Java code fragment
leave in c?

c = 0;
while (b > 0) {
    // Parentheses are required: == binds more tightly than & in Java.
    if ((b & 1) == 1) c = c + a;
    b = b >> 1;
    a = a << 1;
}

Answer: a * b.
18. What does the following code do to the integers
stored in two different variables a and b?

a = a ^ b;
b = a ^ b;
a = a ^ b;
19. Repeat the previous question, but assume a
and b are the same variable.
20. What does the following code do to the integers
stored in two different variables a and b? Any
problems with overflow?

a = a + b;
b = a - b;
a = a - b;
21. What does each of the following statements do?

x = - ~x;
x = ~ -x;

Answer: the first increments x and the second decrements x,
since ~x equals -x - 1.
22. Modify Binary.java so that it converts from
base 7 to decimal and vice versa.
23. What does the following do?
public static boolean parity(int a) {
    a ^= a >>> 16;
    a ^= a >>> 8;
    a ^= a >>> 4;
    a ^= a >>> 2;
    a ^= a >>> 1;
    return (a & 1) == 1;
}
Answer: computes the parity of the number of
1 bits set in the binary representation of a
using divide-and-conquer. It returns true if the
number of 1 bits is odd.

24. What is the value of cnt after the following loop?

int cnt = 0;
for (int i = 1; i != 0; i = 2 * i) {
cnt++;
}
Hint: it's not an infinite loop.

25. Explain why the following Java code fragment
correctly determines whether the positive integer n is a
power of 2.

boolean isPowerOfTwo = (n & -n) == n;

Creative Exercises

1. Linear feedback shift register. Rewrite
LFSR.java to simulate the linear feedback shift
register from Chapter 1 using bit-whacking
operations.
2. Linear feedback shift register cycle
length. Modify the program from the previous
exercise to compute the cycle length of the
LFSR using Floyd's method. What if you
change the taps to xyz?
3. IP addresses and IP numbers. An IP
address (IPV4) is comprised of integers w, x,
y, and z and is typically written as the string
w.x.y.z. The corresponding IP number is
given by
IP number = 16777216*w + 65536*x +
256*y + z. Given an IP number, the
corresponding IP address is w = (ipnum /
16777216 ) % 256, x = (ipnum / 65536) %
256, y = (ipnum / 256) % 256, z = (ipnum) %
256. [Or use shifting and masking.]

Write a function that takes an IP number and
returns a String corresponding to the IP
address. Write a function that takes an IP
address and returns a long corresponding to
the IP number (the example below does not fit
in a signed 32-bit int). For example, if the IP number
is 3401190660, then the function should
return "202.186.13.4".
4. IP address. Write a program that takes a 32
bit string as a command line argument, and
prints out the corresponding IP address in
dotted decimal form. That is, take the bits 8 at
a time, convert each group to decimal, and
separate each group with a dot. For example,
the binary IP address
01010000000100000000000000000001
should be converted to 80.16.0.1.
5. Base64 encoding. Base64 encoding is a
popular method for sending binary data over
the Internet. It converts arbitrary data to
ASCII text, which can be emailed
between systems without problems. Write a
program to read in an arbitrary binary file and
encode it using Base64.
6. Counting in base -2. Use the definition of the
positional notation to define the base -2 number
system. There are two digits 0 and 1. Count from -7
to 7 in this system.

0 = 0                             -1 = 11     (-2 + 1)
1 = 1                             -2 = 10
2 = 110     (4 + -2)              -3 = 1101   (-8 + 4 + 1)
3 = 111                           -4 = 1100
4 = 100                           -5 = 1111
5 = 101                           -6 = 1110
6 = 11010   (16 + -8 + -2)        -7 = 1001
7 = 11011
7. RGBA color format. Some of Java's classes
(BufferedImage, PixelGrabber) use a special encoding
called RGBA to store the color of each pixel. The
format consists of four integers, representing the
red, green, and blue intensities from 0 (not present)
to 255 (fully used), and also the alpha transparency
value from 0 (transparent) to 255 (opaque). The four
8-bit integers are compacted into a single 32-bit
integer. Write a code fragment to extract the four
components from the RGBA integer, and to go the
other way.

// extract
int alpha = (rgba >> 24) & 0xff;
int red = (rgba >> 16) & 0xff;
int green = (rgba >> 8) & 0xff;
int blue = (rgba >> 0) & 0xff;

// write back
rgba = (alpha << 24) | (red << 16) | (green << 8) | (blue << 0);
System.out.println(rgba);
8. Min and max. One of the following computes
min(a, b), the other computes max(a, b) without
branching. Which is which? Explain how it works.

f = b + ((a - b) & -(a < b)); // min(a, b)
g = a - ((a - b) & -(a < b)); // max(a, b)
9. Find the missing value. Suppose you have
an array consisting of 2^32 - 1 integers of type
int such that no integer appears more
than once. Since there are 2^32 possible values,
exactly one integer is missing. Write a code
fragment to find the missing integer using as
little extra storage as possible.

Hint: this is a popular interview question. It's


possible to do it using only one extra int. Use
either properties of integer overflow on two's
complement integers or use the XOR function.
10. Cyclic redundancy check. Write programs
CRC16.java and CRC32.java that read in data
from standard input and compute its 16 or
32-bit CRC. Write a program
CRC16CCITT.java for 16-bit CRC in CCITT
format.
11. Number conversion. Write a program
Converter.java that converts between base b
and decimal for any 2 ≤ b ≤ 36. You should
have a static method toString(int n, int b) that
converts n to a base b string and a static
method fromString(String s, int b) that
converts from a base b string to an integer. Consider
defining and using
String digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
12. Excel column numbering. Write a function
that takes a nonnegative integer and converts
it into the corresponding Excel column name
(0 = A, 1 = B, ..., 25 = Z, 26 = AA, ..., 702 =
AAA).
13. Elias Gamma coding. Write a function elias
that takes as input an integer N and returns
the Elias Gamma code as a string. The Elias
Gamma code is a scheme to encode the positive
integers. To generate the code for an integer N, write
the integer N in binary, subtract 1 from the number
of bits in the binary encoding, and prepend that
many zeros. For example, the code for the first 10
positive integers is given below.

 N  code        N  code
 1  1           6  00110
 2  010         7  00111
 3  011         8  0001000
 4  00100       9  0001001
 5  00101      10  0001010
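The encoding rule above translates almost directly into code. The following is a minimal sketch using Integer.toBinaryString(); the class name EliasGamma and the exception for non-positive input are assumptions.

```java
public class EliasGamma {

    // Elias Gamma code of a positive integer n, as a string of 0s and 1s
    public static String elias(int n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        String binary = Integer.toBinaryString(n);
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < binary.length(); i++)
            sb.append('0');                 // (number of bits - 1) leading zeros
        return sb.append(binary).toString();
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++)
            System.out.println(n + " " + elias(n));
    }
}
```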
14. Bit reversal. Write a function that takes an integer
input, reverses its n low-order bits, and returns that integer. For
example if n = 8, and the input is 13 (00001101),
then its reversal is 176 (10110000).

// reverse the n low-order bits of input (n is the word size, e.g., 8)
public static int bitReverse(int input, int n) {
    int ans = 0;
    for (int i = 0; i < n; i++) {
        ans = (ans << 1) + (input & 1);   // append the lowest bit of input to ans
        input = input >> 1;
    }
    return ans;
}
15. Bit-reversal sorting. Use the previous
algorithm to "sort" an array of N = 2^n elements
into their bit-reversed order. Swap elements i and j if
i and j are bit reversals of each other. Such
permutations arise in the Fast Fourier Transform.

   0    1    2    3    4    5    6  ...   12   13   14   15
0000 0001 0010 0011 0100 0101 0110  ... 1100 1101 1110 1111

0000 1000 0100 1100 0010 1010 0110  ... 0011 1011 0111 1111
   0    8    4   12    2   10    6  ...    3   11    7   15
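A possible in-place version is sketched below. It assumes a two-argument bitReverse(input, n) that takes the bit width n explicitly, and it swaps each pair only once (when i < j) so that the swaps do not undo each other. The class name BitReversalPermutation and the method permute() are hypothetical.

```java
public class BitReversalPermutation {

    // reverse the n low-order bits of input
    static int bitReverse(int input, int n) {
        int ans = 0;
        for (int i = 0; i < n; i++) {
            ans = (ans << 1) + (input & 1);
            input = input >> 1;
        }
        return ans;
    }

    // rearrange a[] (of length 2^n) into bit-reversed order, in place
    public static void permute(int[] a, int n) {
        if (a.length != (1 << n)) throw new IllegalArgumentException("length must be 2^n");
        for (int i = 0; i < a.length; i++) {
            int j = bitReverse(i, n);
            if (i < j) {                    // swap each pair exactly once
                int t = a[i];
                a[i] = a[j];
                a[j] = t;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = new int[16];
        for (int i = 0; i < a.length; i++) a[i] = i;
        permute(a, 4);
        // prints 0 8 4 12 2 10 6 14 1 9 5 13 3 11 7 15
        System.out.println(java.util.Arrays.toString(a));
    }
}
```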
16. Swap without temporary storage. What do
the following two code fragments do given integers a
and b?

a = a + b;
b = a - b;
a = a - b;

a = a ^ b;
b = a ^ b;
a = a ^ b;
Answer: each 3-line fragment swaps a and b.
It works provided a and b are not the same
variable (for example, array entries a[i] and a[j]
with i = j), in which case the value is zeroed out.

17. Find the unique integer. Suppose you have
an array of 2N + 1 integers, and you know
that each of N integers appears exactly twice.
Describe an elegant and efficient algorithm to
identify the integer that appears only once.
Hint: xor.
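The hint relies on two facts: x ^ x = 0 and x ^ 0 = x. XORing every entry together therefore cancels each pair and leaves the unpaired integer. A minimal sketch follows; the class name UniqueInteger and the method findUnique() are hypothetical.

```java
public class UniqueInteger {

    // every value in a[] appears exactly twice, except one that appears once
    public static int findUnique(int[] a) {
        int x = 0;
        for (int v : a)
            x ^= v;        // pairs cancel, since v ^ v = 0
        return x;
    }

    public static void main(String[] args) {
        int[] a = { 7, 3, 5, 3, 7 };
        System.out.println(findUnique(a));   // 5
    }
}
```

This takes one pass over the array and one extra int, regardless of the order of the entries.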
18. Bit-whacking version of Gray codes. Use
bit-whacking operations and iteration instead
of recursion to generate a Gray code. Name
your program BitWhackingGrayCode.java.
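One well-known bit-whacking formula gives the i-th codeword of the binary reflected Gray code as i ^ (i >>> 1). The sketch below uses it; whether this matches the exact ordering produced by a given recursive version is an assumption, and the helper codeword() is hypothetical.

```java
public class BitWhackingGrayCode {

    // the i-th codeword of the n-bit binary reflected Gray code, as a string
    public static String codeword(int i, int n) {
        int gray = i ^ (i >>> 1);
        StringBuilder sb = new StringBuilder();
        for (int bit = n - 1; bit >= 0; bit--)
            sb.append((gray >>> bit) & 1);   // most significant bit first
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 3;
        // prints 000 001 011 010 110 111 101 100, one per line
        for (int i = 0; i < (1 << n); i++)
            System.out.println(codeword(i, n));
    }
}
```

Consecutive codewords differ in exactly one bit, which is the defining property of a Gray code.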
19. Free the prisoners I. A warden meets with
17 new prisoners when they arrive. The
warden tells them that they may meet today
and plan a strategy, but after the meeting,
the prisoners will be in solitary confinement
and will not be able to communicate with one
another. The prison has a switch room with 17
switches that can be on or off, although the
initial configuration is not revealed. There is
one special setting of the 17 switches that if it
is ever achieved will enable the prisoners to go
free. Each hour the warden escorts one
prisoner to the switch room, where the
prisoner can flip at most one switch (from on
to off or off to on). The warden can choose
prisoners in arbitrary order, so one prisoner
may be chosen four times in a row, or not at
all. Design a strategy for the 17 prisoners so
that they are guaranteed to be set free after
some finite amount of time.
20. Free the prisoners II. Same premise as
above, except that the switch room has 2
switches (initially both off), and a prisoner
must flip exactly one of the two switches upon
entering the switch room. At any time, a
prisoner may declare "all 17 of us have visited
the switch room." If it is true, all prisoners are
freed; otherwise they are all executed. The
warden can choose prisoners in arbitrary
order, so one prisoner may be chosen four
times in a row, but each prisoner will be
chosen infinitely often (assuming they are
never freed). Design a strategy for the 17
prisoners so that they are guaranteed to be
set free after some finite amount of time.
Extra credit: don't assume the initial
configuration is known.
21. Count the number of 1 bits. Write a function
that takes an integer input and returns the
number of 1's in its binary representation.

Answer: here are an iterative and a recursive
solution.

public static int bitCount(int input) {
    int count = 0;
    for (int i = 0; i < 32; i++)
        count = count + ((input >>> i) & 1);
    return count;
}

public static int bitCount(int x) {
    if (x == 0) return 0;
    return (x & 1) + bitCount(x >>> 1);
}
This is how Integer.bitCount() is implemented by
Sun. See if you can figure out how it works.

public static int bitCount(int i) {
    i = i - ((i >>> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
    i = (i + (i >>> 4)) & 0x0f0f0f0f;
    i = i + (i >>> 8);
    i = i + (i >>> 16);
    return i & 0x3f;
}
22. Sparse bit-counting. Explain why the following
function (that often appears in job interviews for
programmers) correctly counts the number of 1 bits
in the binary representation of its input. If the input
has k 1's, how many times does the while loop
iterate?

public static int bitCount(int input) {
    int count = 0;
    while (input != 0) {
        count++;
        input = input & (input - 1);
    }
    return count;
}
23. Table lookup bit-counting. Repeat the
previous exercise, but pre-compute a table to
speed up the computation.

Answer: this one assumes you have a
precomputed table of size 256, with bits[i]
storing the number of 1 bits in the binary
representation of i. You can use the bit counting
function from the previous exercise to initialize it.

public static int bitCount(int input) {
    return bits[(input >>  0) & 0xff]
         + bits[(input >>  8) & 0xff]
         + bits[(input >> 16) & 0xff]
         + bits[(input >> 24) & 0xff];
}
Increasing the table size to 2^16 = 65,536 will
make things faster assuming you have
sufficient memory. A table of size 2^32 is likely
prohibitive.
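For completeness, here is one way the 256-entry table might be built. Instead of calling the bit-counting function from the previous exercise, this sketch fills the table with the recurrence bits[i] = bits[i / 2] + (i mod 2), which is a substitution on my part; the class name TableBitCount is hypothetical.

```java
public class TableBitCount {
    // bits[i] = number of 1 bits in the binary representation of i, 0 <= i < 256
    private static final int[] bits = new int[256];

    static {
        // dropping the lowest bit of i gives i >> 1, whose count is already known
        for (int i = 1; i < 256; i++)
            bits[i] = bits[i >> 1] + (i & 1);
    }

    // add up the counts of the four bytes of input
    public static int bitCount(int input) {
        return bits[(input >>  0) & 0xff]
             + bits[(input >>  8) & 0xff]
             + bits[(input >> 16) & 0xff]
             + bits[(input >> 24) & 0xff];
    }

    public static void main(String[] args) {
        System.out.println(bitCount(13));   // 3
        System.out.println(bitCount(-1));   // 32
    }
}
```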
24. Dictionary attack. One method that sleazy
spammers use to auto-generate email
addresses is by enumerating all possible email
addresses at a given domain, e.g.,
hotmail.com. This annoying tactic is called a
dictionary or Rumpelstiltskin attack and
explains why you sometimes receive spam on
a new email address that you haven't
given to anybody. Use Converter.java to
design such a program. Your program
Rumpelstiltskin.java should take a command
line parameter N and print out all 36^N possible
passwords of N or fewer characters involving
numbers and uppercase letters.
25. Breaking a gold chain. You have a gold
chain with 15 links that you are going to use
to pay a worker for 15 days at a fee of 1 gold
link per day. It's possible to split the chain into
15 pieces by cutting 14 times. Your goal is to
pay the worker while only breaking the chain 3
times. The worker must receive exactly the
right fraction of total payment after each day
of work. Hint: break the chain so there are
pieces of 1 section, 2 sections, 4 sections, and
8 sections.
26. Hamming encoder. Write a Java program
HammingEncoder.java that reads in a
sequence of 0s and 1s, 4 bits at a time, and
encodes them using Hamming codes.
27. Hamming decoder. Write a Java program
HammingDecoder.java that reads in a
sequence of 0s and 1s encoded using
Hamming codes, 7 bits at a time, and decodes
and corrects them.
28. Hamming codes. Modify your solutions to the
previous two exercises so that the input bits
are packed 8 to the byte.
29. Absolute value. The constant
Integer.MIN_VALUE is the most negative 32-
bit two's complement integer. What is
Math.abs(Integer.MIN_VALUE)?
30. Prove that a k-digit decimal number can be
represented in binary with no more than 4k
bits.
31. Sum of powers of 2. Compute the sum of
powers of 2. What value do you end up with
on a two's complement machine?
32. CD Database. CDDB and freedb are
databases that allow you to look up CD
information on the Web and display the artist,
title, and song name. Each CD has a (nearly)
unique disc ID number which is used to query
the database.

a. Write a static method sumDigits() that
takes an integer parameter and returns
the sum of the decimal digits in the
integer. For example, sumDigits(6324)
returns 15 since 6 + 3 + 2 + 4 = 15.
b. Write a program CDDB.java that
computes the disc ID from a list of
track lengths. The 32-bit (8 hex digit) ID
number is computed from the length of the
tracks on the CD and the number of tracks as
follows:

XXYYYYZZ
XX   = checksum of track offsets in seconds, taken mod 255
YYYY = length of the CD in seconds
ZZ   = number of tracks on the CD
2. True or false. If a xor b = c, then c xor a = b
and c xor b = a.
3. Explain why the following code fragment does
not leave ABCD in variable c. How would you fix
it?

byte b0 = (byte) 0xAB;
byte b1 = (byte) 0xCD;
int c = (b0 << 8) | b1;
Answer. In Java, byte is a signed 8-bit
integer. The left shift and the bitwise or promote
b0 and b1 to (negative) integers by sign extension.
To fix the problem, use c
= ((b0 & 0xff) << 8) | (b1 & 0xff);.
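The difference between the two expressions is easy to see by printing both results in hexadecimal. The following sketch does so; the class name ByteCombine and the method names broken() and fixed() are hypothetical.

```java
public class ByteCombine {

    // sign extension turns both bytes into negative ints before they are combined
    public static int broken(byte b0, byte b1) {
        return (b0 << 8) | b1;
    }

    // masking with 0xff keeps only the low 8 bits of each promoted value
    public static int fixed(byte b0, byte b1) {
        return ((b0 & 0xff) << 8) | (b1 & 0xff);
    }

    public static void main(String[] args) {
        byte b0 = (byte) 0xAB;
        byte b1 = (byte) 0xCD;
        System.out.println(Integer.toHexString(broken(b0, b1)));   // ffffffcd
        System.out.println(Integer.toHexString(fixed(b0, b1)));    // abcd
    }
}
```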

4. Poisonous wine. "You are the ruler of an
empire and you are about to have a celebration
tomorrow. The celebration is the most
important party you have ever hosted. You've
got 1000 bottles of wine you were planning to
open for the celebration, but you find out that
one of them is poisoned. The actual poison
exhibits no symptoms until somewhere around
the 23rd hour, then results in sudden death.
You have thousands of prisoners at your
disposal. What is the smallest number of
prisoners you must have to drink from the
bottles to find the poisoned bottle?"

Hint: you can represent the number 1,000
using 10 bits.
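The hint suggests a binary labeling scheme, which can be checked with a short simulation. The sketch below assumes the bottles are numbered 0 through 999 and that prisoner p sips from every bottle whose number has bit p set; the class name PoisonousWine and its methods are hypothetical.

```java
public class PoisonousWine {
    static final int PRISONERS = 10;   // 2^10 = 1024 >= 1000 bottles
    static final int BOTTLES = 1000;

    // prisoner p sips from every bottle whose number has bit p set
    public static boolean drinks(int prisoner, int bottle) {
        return ((bottle >> prisoner) & 1) == 1;
    }

    // bit mask of the prisoners who die when the given bottle is the poisoned one
    public static int deaths(int poisoned) {
        int dead = 0;
        for (int p = 0; p < PRISONERS; p++)
            if (drinks(p, poisoned)) dead |= 1 << p;
        return dead;
    }

    public static void main(String[] args) {
        // the pattern of deaths, read as a binary number, names the poisoned bottle
        boolean ok = true;
        for (int bottle = 0; bottle < BOTTLES; bottle++)
            if (deaths(bottle) != bottle) ok = false;
        System.out.println(ok);   // true
    }
}
```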

Copyright 2007 Robert Sedgewick and Kevin Wayne. All rights reserved.
