Instructors:
Randal E. Bryant and David R. O'Hallaron
Carnegie Mellon
Everything is bits
Each bit is 0 or 1
By encoding/interpreting sets of bits in various ways
Computers determine what to do (instructions)
... and represent and manipulate numbers, sets, strings, etc.
Byte = 8 bits
Binary: $00000000_2$ to $11111111_2$
Decimal: $0_{10}$ to $255_{10}$
Hexadecimal: $00_{16}$ to $FF_{16}$
Hex Decimal Binary
0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
A 10 1010
B 11 1011
C 12 1100
D 13 1101
E 14 1110
F 15 1111
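The table above can be regenerated for any byte value. The following is a minimal sketch; the helper names `byte_to_binary` and `print_byte_row` are illustrative and not part of the slides.

```c
#include <stdio.h>

/* Write the 8 bits of b into out, most significant bit first.
   out must have room for 9 chars (8 digits plus the terminator). */
void byte_to_binary(unsigned char b, char out[9]) {
    for (int i = 0; i < 8; i++)
        out[i] = (b & (0x80u >> i)) ? '1' : '0';
    out[8] = '\0';
}

/* Print one row in the order used by the table: hex, decimal, binary. */
void print_byte_row(unsigned char b) {
    char bits[9];
    byte_to_binary(b, bits);
    printf("%02X %3u %s\n", (unsigned)b, (unsigned)b, bits);
}
```

Calling `print_byte_row` for each value from 0 to 255 lists every byte in all three notations.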
Example Data Representations
Sizes in bytes
C Data Type   Typical 32-bit   Typical 64-bit   x86-64
char          1                1                1
short         2                2                2
int           4                4                4
long          4                8                8
float         4                4                4
double        8                8                8
long double   -                -                10/16
pointer       4                8                8
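Sizes of C data types depend on the platform, so it can help to check them on the machine at hand. A small sketch using `sizeof`; the function name `print_type_sizes` is illustrative.

```c
#include <stdio.h>

/* Print the size in bytes of each C data type on this platform. */
void print_type_sizes(void) {
    printf("char        %zu\n", sizeof(char));
    printf("short       %zu\n", sizeof(short));
    printf("int         %zu\n", sizeof(int));
    printf("long        %zu\n", sizeof(long));
    printf("float       %zu\n", sizeof(float));
    printf("double      %zu\n", sizeof(double));
    printf("long double %zu\n", sizeof(long double));
    printf("pointer     %zu\n", sizeof(void *));
}
```

Only `sizeof(char) == 1` is guaranteed by the C standard; every other value is up to the compiler and target.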
Boolean Algebra
And: A&B = 1 when both A=1 and B=1
Or: A|B = 1 when either A=1 or B=1
Not: ~A = 1 when A=0
Exclusive-Or (Xor): A^B = 1 when either A=1 or B=1, but not both
Operate on bit vectors (operations applied bitwise)
01101001 & 01010101 = 01000001
01101001 | 01010101 = 01111101
01101001 ^ 01010101 = 00111100
~ 01010101 = 10101010
Representation
Width w bit vector represents subsets of {0, ..., w-1}
$a_j = 1$ if $j \in A$
01101001   { 0, 3, 5, 6 }
76543210
01010101   { 0, 2, 4, 6 }
76543210
Operations
&   Intersection           01000001   { 0, 6 }
|   Union                  01111101   { 0, 2, 3, 4, 5, 6 }
^   Symmetric difference   00111100   { 2, 3, 4, 5 }
~   Complement             10101010   { 1, 3, 5, 7 }
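The set operations above map directly onto C bitwise operators. A minimal sketch for w = 8; the type name `set8` and the function names are illustrative, not from the slides.

```c
#include <stdint.h>

typedef uint8_t set8;   /* hypothetical name: a subset of {0, ..., 7} */

/* Element j is in the set when bit j is 1. */
int  set_contains(set8 s, int j)      { return (s >> j) & 1; }
set8 set_add(set8 s, int j)           { return (set8)(s | (1u << j)); }
set8 set_intersection(set8 a, set8 b) { return (set8)(a & b); }
set8 set_union(set8 a, set8 b)        { return (set8)(a | b); }
set8 set_symdiff(set8 a, set8 b)      { return (set8)(a ^ b); }
set8 set_complement(set8 a)           { return (set8)~a; }
```

With a = 0x69 (01101001) and b = 0x55 (01010101) these reproduce the four result rows listed above.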
Bit-Level Operations in C
Contrast: Logic Operations in C (&&, ||, !)
View 0 as "False"
Anything nonzero as "True"
Always return 0 or 1
Early termination
Example: !!0x41 == 0x01
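The difference between logic operators and bit-level operators is easy to miss. A short sketch; the function names `to_bool` and `safe_nonzero` are illustrative.

```c
/* !!x maps any nonzero value to 1 and zero to 0. */
int to_bool(int x) { return !!x; }

/* Early termination: when p is null, && never evaluates *p,
   so the null pointer is not dereferenced. */
int safe_nonzero(const int *p) { return p && *p; }

/* Same operands, different operators:
   0x69 && 0x55 is 1 (both nonzero), 0x69 & 0x55 is 0x41 (bitwise). */
int logical_and(int a, int b) { return a && b; }
int bitwise_and(int a, int b) { return a & b; }
```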
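Shift Operations
Left Shift: x << y
Shift bit vector x left y positions, fill with 0s on right
Right Shift: x >> y
Logical shift: fill with 0s on left
Arithmetic shift: replicate most significant bit on left
Argument x     01100010
x << 3         00010000
Log. x >> 2    00011000
Arith. x >> 2  00011000
Argument x     10100010
x << 3         00010000
Log. x >> 2    00101000
Arith. x >> 2  11101000
Undefined Behavior
Shift amount < 0 or >= word size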
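The 8-bit shift examples can be checked in C. This is a sketch assuming shift amounts 0 <= k <= 7; the arithmetic shift is built from a logical shift so that it does not depend on how a compiler shifts negative values. Function names are illustrative.

```c
#include <stdint.h>

/* Left shift on an 8-bit value: bits shifted past bit 7 are discarded. */
uint8_t shl8(uint8_t x, unsigned k) { return (uint8_t)(x << k); }

/* Logical right shift: vacated high bits are filled with 0. */
uint8_t shr8_logical(uint8_t x, unsigned k) { return (uint8_t)(x >> k); }

/* Arithmetic right shift: vacated high bits copy the original bit 7. */
uint8_t shr8_arith(uint8_t x, unsigned k) {
    uint8_t shifted = (uint8_t)(x >> k);
    if (x & 0x80u)                                 /* sign bit set */
        shifted = (uint8_t)(shifted | (0xFFu << (8 - k)));
    return shifted;
}
```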
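Encoding Integers
Unsigned
$B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$
Two's Complement
$B2T(X) = -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i$
Sign Bit
For two's complement, the most significant bit $x_{w-1}$ indicates sign
0 for nonnegative
1 for negative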
Two's-complement Encoding Example (Cont.)
x =  15213: 00111011 01101101
y = -15213: 11000100 10010011
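The two encoding formulas can be evaluated term by term. A minimal sketch for w = 16; the names `b2u16` and `b2t16` are illustrative.

```c
#include <stdint.h>

/* B2U for w = 16: sum of x_i * 2^i over all 16 bits. */
long b2u16(uint16_t x) {
    long v = 0;
    for (int i = 0; i < 16; i++)
        v += (long)((x >> i) & 1) << i;
    return v;
}

/* B2T for w = 16: the sign bit has weight -2^15, the rest as in B2U. */
long b2t16(uint16_t x) {
    long v = -((long)((x >> 15) & 1) << 15);
    for (int i = 0; i < 15; i++)
        v += (long)((x >> i) & 1) << i;
    return v;
}
```

For the example patterns, 0x3B6D gives 15213 under both functions, while 0xC493 gives 50323 under `b2u16` and -15213 under `b2t16`.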
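Numeric Ranges
Unsigned Values
UMin = 0 (000...0)
UMax = $2^w - 1$ (111...1)
Two's Complement Values
TMin = $-2^{w-1}$ (100...0)
TMax = $2^{w-1} - 1$ (011...1)
Other Values
Minus 1: 111...1
Values for W = 16
UMax = 65535 (FF FF), TMax = 32767 (7F FF), TMin = -32768 (80 00), -1 (FF FF), 0 (00 00)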
Observations
|TMin| = TMax + 1
Asymmetric range
UMax = 2 * TMax + 1
C Programming
#include <limits.h>
Declares constants, e.g., ULONG_MAX, LONG_MAX, LONG_MIN
Values platform specific
Unsigned & Signed Numeric Values
X      B2U(X)   B2T(X)
0000   0        0
0001   1        1
0010   2        2
0011   3        3
0100   4        4
0101   5        5
0110   6        6
0111   7        7
1000   8        -8
1001   9        -7
1010   10       -6
1011   11       -5
1100   12       -4
1101   13       -3
1110   14       -2
1111   15       -1
Uniqueness
Every bit pattern represents unique integer value
Each representable integer has unique bit encoding
Can Invert Mappings
$U2B(x) = B2U^{-1}(x)$: bit pattern for unsigned integer
$T2B(x) = B2T^{-1}(x)$: bit pattern for two's complement integer
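Mapping Between Signed & Unsigned
T2U: Two's Complement x -> T2B -> bit pattern X -> B2U -> Unsigned ux
U2T: Unsigned ux -> U2B -> bit pattern X -> B2T -> Two's Complement x
Both mappings keep the bit pattern and reinterpret it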
Mapping Signed ↔ Unsigned
Bits   Signed   Unsigned
0000   0        0
0001   1        1
0010   2        2
0011   3        3
0100   4        4
0101   5        5
0110   6        6
0111   7        7
1000   -8       8
1001   -7       9
1010   -6       10
1011   -5       11
1100   -4       12
1101   -3       13
1110   -2       14
1111   -1       15
Rows 0000 to 0111: signed and unsigned values are equal
Rows 1000 to 1111: values differ by 16 (T2U adds 16, U2T subtracts 16)
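Relation between Signed & Unsigned
T2U: Two's Complement x -> T2B -> B2U -> Unsigned ux (same bit pattern)
Bit weights, from bit w-1 down to bit 0:
ux:  + + + ... + + +
x:   - + + ... + + +
The weight of bit w-1 changes from large negative to large positive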
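Conversion Visualized
2's Comp. -> Unsigned
Ordering Inversion
Negative -> Big Positive
[Figure: the two's complement range (TMin ... -1, 0 ... TMax) drawn next to the unsigned range (0 ... TMax, TMax + 1 ... UMax); nonnegative values map to themselves and negative values map to the upper half, with -1 mapping to UMax.]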
Constants
By default are considered to be signed integers
Unsigned if have "U" as suffix
0U, 4294967259U
Casting
Explicit casting between signed & unsigned same as U2T and T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;
Implicit casting also occurs via assignments
tx = ux;
uy = ty;
Bryant and O'Hallaron, Computer Systems: A Programmer's Perspective, Third Edition
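A cast from signed to unsigned keeps the bit pattern, which amounts to adding 2^w to negative values. The sketch below assumes a 32-bit `int` (typical, but platform specific); the function name `t2u32` is illustrative.

```c
/* T2U for w = 32, computed arithmetically:
   x + 2^32 when x is negative, x otherwise.
   Assumes int is 32 bits wide. */
long long t2u32(int x) {
    return x < 0 ? (long long)x + 4294967296LL : (long long)x;
}

/* The same mapping, done by the cast itself. */
unsigned cast_to_unsigned(int x) { return (unsigned)x; }
```

On such a platform `cast_to_unsigned(-15213)` yields 4294952083, which equals 4294967296 - 15213.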
Casting Surprises
Expression Evaluation
If there is a mix of unsigned and signed in single expression, signed values implicitly cast to unsigned
Including comparison operations <, >, ==, <=, >=
Examples for W = 32: TMIN = -2,147,483,648, TMAX = 2,147,483,647
Constant1       Constant2           Relation   Evaluation
0               0U                  ==         unsigned
-1              0                   <          signed
-1              0U                  >          unsigned
2147483647      -2147483647-1       >          signed
2147483647U     -2147483647-1       <          unsigned
-1              -2                  >          signed
(unsigned)-1    -2                  >          unsigned
2147483647      2147483648U         <          unsigned
2147483647      (int) 2147483648U   >          signed
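A few rows of the table can be reproduced directly. This sketch assumes a 32-bit `int`; the function names are illustrative, and most compilers will warn about the mixed comparison, which is the point of the example.

```c
/* Both operands signed: ordinary signed comparison. */
int less_signed(int a, int b) { return a < b; }

/* Mixed operands: a is implicitly converted to unsigned before comparing. */
int less_mixed(int a, unsigned b) { return a < b; }
```

`less_signed(-1, 0)` is 1, but `less_mixed(-1, 0U)` is 0 because -1 converts to UMax.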
Summary
Casting Signed ↔ Unsigned: Basic Rules
Sign Extension
Task:
Given w-bit signed integer x
Convert it to w+k-bit integer with same value
Rule:
Make k copies of sign bit:
$X' = x_{w-1}, \ldots, x_{w-1}, x_{w-1}, x_{w-2}, \ldots, x_0$
The first k entries of $X'$ are copies of the MSB $x_{w-1}$
Sign Extension Example
      Decimal   Hex           Binary
x     15213     3B 6D         00111011 01101101
ix    15213     00 00 3B 6D   00000000 00000000 00111011 01101101
y     -15213    C4 93         11000100 10010011
iy    -15213    FF FF C4 93   11111111 11111111 11000100 10010011
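The rule of copying the sign bit can be written out by hand and compared with what a C conversion does. A sketch for w = 16 and k = 16; the function name is illustrative.

```c
#include <stdint.h>

/* Extend a 16-bit two's complement pattern to 32 bits
   by making 16 copies of the sign bit. */
uint32_t sign_extend_16_to_32(uint16_t x) {
    uint32_t wide = x;              /* upper 16 bits start as 0 */
    if (x & 0x8000u)                /* sign bit is 1 */
        wide |= 0xFFFF0000u;        /* fill the upper 16 bits with 1s */
    return wide;
}
```

Converting an `int16_t` to `int32_t` in C produces the same bit pattern, which is why `ix` and `iy` in the table keep the values of `x` and `y`.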
Summary: Expanding, Truncating: Basic Rules
Unsigned Addition
Operands: w bits: u + v
True Sum: w+1 bits: u + v
Discard Carry: w bits: UAddw(u, v)
Standard Addition Function
Ignores carry output
Implements Modular Arithmetic
$s = \mathrm{UAdd}_w(u, v) = (u + v) \bmod 2^w$
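Modular addition for a small word size can be simulated with a mask. A minimal sketch assuming 1 <= w <= 31 and operands already in range; the function names are illustrative.

```c
/* UAdd for w-bit operands: keep only the low w bits of the true sum. */
unsigned uadd_w(unsigned u, unsigned v, unsigned w) {
    unsigned mask = (1u << w) - 1;
    return (u + v) & mask;
}

/* Overflow happened exactly when the truncated sum is smaller than u. */
int uadd_overflows(unsigned u, unsigned v, unsigned w) {
    return uadd_w(u, v, w) < u;
}
```

For w = 4, `uadd_w(9, 12, 4)` is 5, since the true sum 21 wraps around once past 16.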
Visualizing (Mathematical) Integer Addition
Integer Addition
4-bit integers u, v
Compute true sum Add4(u, v)
Values increase linearly with u and v
Forms planar surface
[Figure: surface plot of Add4(u, v) against u and v.]
Visualizing Unsigned Addition
Wraps Around
If true sum >= $2^w$
At most once
[Figure: true sum axis from 0 to $2^{w+1}$, with the part above $2^w$ marked Overflow and folded back onto the modular sum; surface plot of UAdd4(u, v) against u and v.]
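Two's Complement Addition
Operands: w bits: u + v
True Sum: w+1 bits: u + v
Discard Carry: w bits: TAddw(u, v)
TAdd and UAdd have identical bit-level behavior
Signed vs. unsigned addition in C:
int s, t, u, v;
s = (int) ((unsigned) u + (unsigned) v);
t = u + v;
Will give s == t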
TAdd Overflow
Functionality
True sum requires w+1 bits
Drop off MSB
Treat remaining bits as 2's comp. integer
True Sum (w+1 bits)   Value              TAdd Result
0 111...1             $2^w - 1$          PosOver
0 100...0             $2^{w-1}$          PosOver
0 011...1             $2^{w-1} - 1$      011...1
0 000...0             0                  000...0
1 100...0             $-2^{w-1}$         100...0
1 011...1             $-2^{w-1} - 1$     NegOver
1 000...0             $-2^w$             NegOver
Visualizing 2's Complement Addition
Values
4-bit two's complement, range from -8 to +7
Wraps Around
If sum >= $2^{w-1}$: becomes negative, at most once
If sum < $-2^{w-1}$: becomes positive, at most once
[Figure: surface plot of TAdd4(u, v) against u and v, with the NegOver and PosOver regions marked.]
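The wrap-around behavior for a small word size can be simulated by computing the true sum in a wider type. A sketch assuming 2 <= w <= 30 and operands within the w-bit range; the function names are illustrative.

```c
/* TAdd for w-bit two's complement operands, computed from the true sum. */
int tadd_w(int u, int v, int w) {
    int half = 1 << (w - 1);        /* 2^(w-1) */
    int sum = u + v;                /* true sum fits easily in int */
    if (sum >= half)
        return sum - 2 * half;      /* positive overflow: subtract 2^w */
    if (sum < -half)
        return sum + 2 * half;      /* negative overflow: add 2^w */
    return sum;
}

/* Returns 1 for positive overflow, -1 for negative overflow, 0 otherwise. */
int tadd_overflow(int u, int v, int w) {
    int half = 1 << (w - 1);
    int sum = u + v;
    if (sum >= half) return 1;
    if (sum < -half) return -1;
    return 0;
}
```

For w = 4, `tadd_w(5, 4, 4)` is -7 and `tadd_w(-8, -1, 4)` is 7, matching the PosOver and NegOver regions.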
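Multiplication
Exact product of two w-bit numbers can need up to 2w bits
Two's complement max (positive): $x \cdot y \le (-2^{w-1})^2 = 2^{2w-2}$
Maintaining exact results would need to keep expanding word size with each product computed
Is done in software, if needed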
Unsigned Multiplication in C
Operands: w bits: u * v
True Product: 2*w bits: u · v
Discard w bits: w bits: UMultw(u, v)
Standard Multiplication Function
Ignores high order w bits
Implements Modular Arithmetic
$\mathrm{UMult}_w(u, v) = (u \cdot v) \bmod 2^w$
Signed Multiplication in C
Operands: w bits: u * v
True Product: 2*w bits: u · v
Discard w bits: w bits: TMultw(u, v)
Standard Multiplication Function
Ignores high order w bits
Some of which are different for signed vs. unsigned multiplication
Lower bits are the same
Power-of-2 Multiply with Shift
Operation
u << k gives u * $2^k$
Both signed and unsigned
Operands: w bits: u * $2^k$
True Product: w+k bits: u · $2^k$
Discard k bits: w bits: UMultw(u, $2^k$), TMultw(u, $2^k$)
Examples
u << 3 == u * 8
(u << 5) - (u << 3) == u * 24
Most machines shift and add faster than multiply
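The shift identities can be checked directly on unsigned values, where the arithmetic is modular and well defined. A small sketch; the function names are illustrative.

```c
/* u * 8 as a single shift. */
unsigned long mul8(unsigned long u)  { return u << 3; }

/* u * 24 as 32u - 8u, using two shifts and a subtraction. */
unsigned long mul24(unsigned long u) { return (u << 5) - (u << 3); }
```

Because unsigned arithmetic wraps modulo 2^w, both functions agree with `u * 8` and `u * 24` even when the true product does not fit.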
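Unsigned Power-of-2 Divide with Shift
Quotient of unsigned by power of 2
u >> k gives $\lfloor u / 2^k \rfloor$
Uses logical shift
Operands: u / $2^k$
Division: u / $2^k$ (bits to the right of the binary point are the fraction)
Result: $\lfloor u / 2^k \rfloor$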
Addition:
Unsigned/signed: Normal addition followed by truncate, same operation on bit level
Unsigned: addition mod $2^w$
Mathematical addition + possible subtraction of $2^w$
Signed: modified addition mod $2^w$ (result in proper range)
Mathematical addition + possible addition or subtraction of $2^w$
Multiplication:
Unsigned/signed: Normal multiplication followed by truncate, same operation on bit level
Unsigned: multiplication mod $2^w$
Even better
size_t i;
for (i = cnt-2; i < cnt; i--)
    a[i] += a[i+1];
Data type size_t defined as unsigned value with length = word size
Code will work even if cnt = UMax
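The loop terminates because decrementing an unsigned 0 wraps to the maximum value, which is no longer less than cnt. A sketch wrapping the loop in a function; the name `suffix_sums` is illustrative.

```c
#include <stddef.h>

/* Replace each a[i] with a[i] + a[i+1] + ... + a[cnt-1], in place.
   The test i < cnt fails once i wraps around past 0,
   and also fails immediately when cnt is 0 or 1. */
void suffix_sums(int *a, size_t cnt) {
    for (size_t i = cnt - 2; i < cnt; i--)
        a[i] += a[i + 1];
}
```

With cnt = 0 the start value `cnt - 2` wraps to a huge number, the condition is false at once, and the array is never touched.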
Byte-Oriented Memory Organization
Machine Words
Any given computer has a "word size": nominal size of integer-valued data and of addresses
Word-Oriented Memory Organization
Addresses Specify Byte Locations
Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
32-bit Words   64-bit Words   Bytes Addr.
Addr = 0000    Addr = 0000    0000 to 0003
Addr = 0004                   0004 to 0007
Addr = 0008    Addr = 0008    0008 to 0011
Addr = 0012                   0012 to 0015
Example Data Representations
Sizes in bytes
C Data Type   Typical 32-bit   Typical 64-bit   x86-64
char          1                1                1
short         2                2                2
int           4                4                4
long          4                8                8
float         4                4                4
double        8                8                8
long double   -                -                10/16
pointer       4                8                8
Byte Ordering
Example
Variable x has 4-byte value of 0x01234567
Address given by &x is 0x100
Address         0x100   0x101   0x102   0x103
Big Endian      01      23      45      67
Little Endian   67      45      23      01
Representing Integers
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D
int A = 15213;
IA32, x86-64: 6D 3B 00 00
Sun: 00 00 3B 6D
int B = -15213;
IA32, x86-64: 93 C4 FF FF
Sun: FF FF C4 93
long int C = 15213;
IA32: 6D 3B 00 00
x86-64: 6D 3B 00 00 00 00 00 00
Sun: 00 00 3B 6D
Examining Data Representations
Printf directives:
%p: Print pointer
%x: Print hexadecimal
show_bytes Execution Example
int a = 15213;
printf("int a = 15213;\n");
show_bytes((pointer) &a, sizeof(int));
Result (byte values, lowest address first):
6d
3b
00
00
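The definition of `show_bytes` does not appear in the text above. The following is one plausible sketch consistent with the call shown, assuming `pointer` is a typedef for `unsigned char *` and using the %p and %x directives; `is_little_endian` is an added illustrative helper.

```c
#include <stdio.h>
#include <stddef.h>

typedef unsigned char *pointer;   /* assumed from the cast in the call */

/* Print the address and value of each byte, lowest address first. */
void show_bytes(pointer start, size_t len) {
    for (size_t i = 0; i < len; i++)
        printf("%p\t0x%.2x\n", (void *)(start + i), start[i]);
}

/* Returns 1 when the least significant byte is stored first. */
int is_little_endian(void) {
    unsigned x = 1;
    return *(unsigned char *)&x == 1;
}
```

On a little-endian machine such as x86-64 the bytes of 15213 print as 6d 3b 00 00; a big-endian machine prints them in the reverse order.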
Representing Pointers
int B = -15213;
int *P = &B;
Sun: EF FF FB 2C
IA32: AC 28 F5 FF
x86-64: 3C 1B FE 82 FD 7F 00 00
Representing Strings
Strings in C
char S[6] = "18213";
Compatibility
Byte ordering not an issue
Sun (same bytes on any machine): 31 38 32 31 33 00
Integer C Puzzles
Initialization
int x = foo();
int y = bar();
unsigned ux = x;
unsigned uy = y;
For each statement, decide whether it is always true:
x < 0 => ((x*2) < 0)
ux >= 0
x & 7 == 7 => (x<<30) < 0
ux > -1
x > y => -x < -y
x * x >= 0
x > 0 && y > 0 => x + y > 0
x >= 0 => -x <= 0
x <= 0 => -x >= 0
(x|-x)>>31 == -1
ux >> 3 == ux/8
x >> 3 == x/8
x & (x-1) != 0
Bonus extras
Application of Boolean Algebra
[Figure: network of switches with two parallel paths, one through A and ~B, the other through ~A and B.]
Connection when
A&~B | ~A&B
= A^B
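Binary Number Property
Claim
$1 + 1 + 2 + 4 + 8 + \cdots + 2^{w-1} = 2^w$
$1 + \sum_{i=0}^{w-1} 2^i = 2^w$
w = 0:
$1 = 2^0$
Assume true for w; then for w+1:
$1 + \sum_{i=0}^{w} 2^i = \left(1 + \sum_{i=0}^{w-1} 2^i\right) + 2^w = 2^w + 2^w = 2^{w+1}$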
Typical Usage
#include <stdio.h>
#include <string.h>

/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */
int copy_from_kernel(void *user_dest, int maxlen) {
    /* Byte count len is minimum of buffer size and maxlen */
    int len = KSIZE < maxlen ? KSIZE : maxlen;
    memcpy(user_dest, kbuf, len);
    return len;
}

#define MSIZE 528
void getstuff() {
    char mybuf[MSIZE];
    copy_from_kernel(mybuf, MSIZE);
    printf("%s\n", mybuf);
}
Malicious Usage
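The body of this slide is missing from the text, so the following is an illustrative sketch of one way the length check in copy_from_kernel can be defeated, not a reconstruction of the original example. It isolates the length computation and performs no copy; all function names are hypothetical.

```c
#include <stddef.h>

#define KSIZE 1024

/* The length computation from copy_from_kernel, on its own. */
int checked_len(int maxlen) {
    return KSIZE < maxlen ? KSIZE : maxlen;
}

/* memcpy takes its length as size_t, which is unsigned,
   so a negative len is converted to a very large count. */
size_t len_seen_by_memcpy(int maxlen) {
    return (size_t)checked_len(maxlen);
}

/* Hypothetical fix: refuse negative requests before taking the minimum. */
int checked_len_safe(int maxlen) {
    if (maxlen < 0)
        return 0;
    return KSIZE < maxlen ? KSIZE : maxlen;
}
```

A negative `maxlen` passes the signed minimum test unchanged, yet the copy routine would see a count far larger than KSIZE.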
Mathematical Properties
Commutative
UAddw(u, v) = UAddw(v, u)
Associative
UAddw(t, UAddw(u, v)) = UAddw(UAddw(t, u), v)
0 is additive identity
UAddw(u, 0) = u
Every element has additive inverse
Let UCompw(u) = $2^w - u$
UAddw(u, UCompw(u)) = 0
Mathematical Properties of TAdd
Every element has additive inverse
$\mathrm{TComp}_w(u) = -u$ if $u \ne TMin_w$
$\mathrm{TComp}_w(u) = TMin_w$ if $u = TMin_w$
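Characterizing TAdd
Functionality
True sum requires w+1 bits
Drop off MSB
Treat remaining bits as 2's comp. integer
[Figure: TAdd(u, v) over the u, v plane; Positive Overflow where u > 0 and v > 0, Negative Overflow where u < 0 and v < 0.]
$\mathrm{TAdd}_w(u, v) = u + v + 2^w$ if $u + v < TMin_w$ (NegOver)
$\mathrm{TAdd}_w(u, v) = u + v$ if $TMin_w \le u + v \le TMax_w$
$\mathrm{TAdd}_w(u, v) = u + v - 2^w$ if $TMax_w < u + v$ (PosOver)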
Complement
Observation: ~x + x == 1111...111 == -1
  x    10011101
+ ~x   01100010
  -1   11111111
Complete Proof?
x=0
[Figure: ele_cnt objects, each of ele_size bytes, copied from the locations in ele_src into a buffer obtained from malloc(ele_cnt * ele_size).]
XDR Code
void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size) {
    /*
     * Allocate buffer for ele_cnt objects, each of ele_size bytes
     * and copy from locations designated by ele_src
     */
    void *result = malloc(ele_cnt * ele_size);
    if (result == NULL)
        /* malloc failed */
        return NULL;
    void *next = result;
    int i;
    for (i = 0; i < ele_cnt; i++) {
        /* Copy object i to destination */
        memcpy(next, ele_src[i], ele_size);
        /* Move pointer to next memory region */
        next += ele_size;
    }
    return result;
}
XDR Vulnerability
malloc(ele_cnt * ele_size)
What if:
ele_cnt = $2^{20} + 1$
ele_size = 4096 = $2^{12}$
Allocation = ??
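The question can be answered by carrying out the multiplication at a fixed width. This sketch assumes the product is truncated to 32 bits, as on a machine with a 32-bit size_t; `alloc_size_32` and `mul_overflows_32` are illustrative names, and the guard is one possible check rather than the library's actual fix.

```c
#include <stdint.h>

/* Product truncated to 32 bits, as a 32-bit size_t multiplication would give. */
uint32_t alloc_size_32(uint32_t ele_cnt, uint32_t ele_size) {
    return (uint32_t)((uint64_t)ele_cnt * ele_size);
}

/* Hypothetical guard: true when ele_cnt * ele_size does not fit in 32 bits. */
int mul_overflows_32(uint32_t ele_cnt, uint32_t ele_size) {
    return ele_size != 0 && ele_cnt > UINT32_MAX / ele_size;
}
```

The true product is 2^32 + 2^12, so only 4096 bytes would be allocated while the copy loop still writes 2^20 + 1 elements of 4096 bytes each.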
Compiled Multiplication Code
C Function
long mul12(long x)
{
    return x*12;
}
Explanation
t <- x+x*2
return t << 2;
Compiled Unsigned Division Code
C Function (unsigned division by 8)
Explanation
# Logical shift
return x >> 3;
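Signed Power-of-2 Divide with Shift
Quotient of signed by power of 2
x >> k gives $\lfloor x / 2^k \rfloor$
Uses arithmetic shift
Rounds in the wrong direction when x < 0
Operands: x / $2^k$
Division: x / $2^k$ (bits to the right of the binary point are the fraction)
Result: RoundDown(x / $2^k$)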
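Correct Power-of-2 Divide
Want $\lceil x / 2^k \rceil$ (round toward 0) when x < 0
Compute as $\lfloor (x + 2^k - 1) / 2^k \rfloor$
In C: (x + (1<<k)-1) >> k
Biases dividend toward 0
Case 1: No rounding
Dividend: u + $2^k - 1$; the low-order k bits of u are 0, so they become all 1s
Divisor: / $2^k$
Result: $\lceil u / 2^k \rceil$; the fraction bits .111...1 are discarded
Biasing has no effect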
Case 2: Rounding
Dividend: x + $2^k - 1$; the low-order k bits of x are not all 0, so the carry increments the upper bits by 1
Divisor: / $2^k$
Result: $\lceil x / 2^k \rceil$, incremented by 1 relative to the unbiased shift
Biasing adds 1 to final result
Compiled Signed Division Code
C Function
long idiv8(long x)
{
    return x/8;
}
Explanation
if x < 0
    x += 7;
# Arithmetic shift
return x >> 3;
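The bias-then-shift recipe generalizes to any power of 2. A sketch assuming that `>>` on a negative `long` is an arithmetic shift, which C leaves implementation-defined but which common compilers such as GCC and Clang provide; the function name `div_pow2` is illustrative.

```c
/* Divide x by 2^k, rounding toward zero like the C / operator.
   Assumes 0 <= k < 62 and an arithmetic right shift for negative values. */
long div_pow2(long x, int k) {
    long bias = (x < 0) ? (1L << k) - 1 : 0;   /* bias only negative dividends */
    return (x + bias) >> k;
}
```

With k = 3 this matches the explanation above: 7 is added when x < 0, then x is shifted right by 3.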
Left shift
Unsigned/signed: multiplication by $2^k$
Always logical shift
Right shift
Unsigned: logical shift, div (division + round to zero) by $2^k$
Signed: arithmetic shift
Positive numbers: div (division + round to zero) by $2^k$
Negative numbers: div (division + round away from zero) by $2^k$
Use biasing to fix
Properties of Unsigned
Arithmetic
Multiplication Commutative
UMultw(u , v)= UMultw(v , u)
Multiplication is Associative
UMultw(t, UMultw(u , v))= UMultw(UMultw(t, u ), v)
1 is multiplicative identity
UMultw(u , 1)=u
Reading Byte-Reversed Listings
Disassembly
Text representation of binary machine code
Generated by program that reads the machine code
Example Fragment
Address    Instruction Code       Assembly Rendition
8048365:   5b                     pop %ebx
8048366:   81 c3 ab 12 00 00      add $0x12ab,%ebx
804836c:   83 bb 28 00 00 00 00   cmpl $0x0,0x28(%ebx)
Deciphering Numbers
Value:              0x12ab
Pad to 32 bits:     0x000012ab
Split into bytes:   00 00 12 ab
Reverse:            ab 12 00 00