Vous êtes sur la page 1sur 6

Giving credit where credit is due

JDEP 284H
Foundations of Computer Systems

Most

of slides for this lecture are based on


slides created by Drs. Bryant and
OHallaron, Carnegie Mellon University.

Bits and Bytes

have modified them and added new


slides.

Dr. Steve Goddard


goddard@cse.unl.edu

http://cse.unl.edu/~goddard/Courses/JDEP284

Topics

Why Dont Computers Use Base 10?

Why

bits?
Representing information as bits

Base 10 Number Representation

zBinary/Hexadecimal
zByte representations

Thats why fingers are known as digits


Natural representation for financial transactions

Even carries through in scientific notation

z Floating point number cannot exactly represent $1.20


z 1.5213 X 104

numbers
characters and strings
Instructions
Bit-level

Implementing Electronically

Hard to store

Hard to transmit

Messy to implement digital logic functions

z ENIAC (First electronic computer) used 10 vacuum tubes / digit

manipulations

z Need high precision to encode 10 signal levels on single wire

zBoolean algebra
zExpressing in C

z Addition, multiplication, etc.

Binary Representations

Byte-Oriented Memory Organization

Base 2 Number Representation

Programs Refer to Virtual Addresses

Represent 1521310 as 111011011011012


Represent 1.2010 as 1.0011001100110011[0011]2
Represent 1.5213 X 104 as 1.11011011011012 X 213

z SRAM, DRAM, disk

Electronic Implementation

z Only allocate for regions actually used by program

Easy to store with bistable elements


Reliably transmitted on noisy and inaccurate wires
0

Conceptually very large array of bytes


Actually implemented with hierarchy of different memory
types

In Unix and Windows NT (and 2000), address space is private


to a particular process
z Program being executed

z Program can clobber its own data, but not that of others

3.3V

Compiler + RunRun-Time System Control Allocation

2.8V

0.5V

0.0V

Straightforward implementation of arithmetic functions

Where different program objects should be stored


Multiple mechanisms: static, stack, and heap
In any case, all allocation within single virtual address space
6

Page 1

Encoding Byte Values

Machine Words

Byte = 8 bits

Binary 000000002
Decimal:
010
Hexadecimal
0016

to
to
to

111111112
25510
FF16

z Base 16 number representation


z Use characters 0 to 9 and A to F
z Write FA1D37B16 in C as 0xFA1D37B

Or 0xfa1d37b

al
m ry
x
ci na
He De Bi
0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
A 10 1010
B 11 1011
C 12 1100
D 13 1101
E 14 1110
F 15 1111

Machine Has Word Size

Nominal size of integer-valued data

Most current machines are 32 bits (4 bytes)

z Including addresses
z Limits addresses to 4GB
z Becoming too small for memory-intensive applications

High-end systems are 64 bits (8 bytes)

Machines support multiple data formats

z Potentially address 1.8 X 1019 bytes


z Fractions or multiples of word size
z Always integral number of bytes

Word-Oriented Memory
Organization
32-bit 64-bit
Words Words

Addresses Specify Byte


Locations

Address of first byte in


word
Addresses of successive
words differ by 4 (32-bit) or
8 (64-bit)

Data Representations
Bytes Addr.

Addr
=
0000
??
Addr
=
0000
??
Addr
=
0004
??

Addr
=
0008
??

Addr
=
0008
??

Addr
=
0012
??

Sizes of C Objects (in Bytes)


0000
0001
0002
0003
0004
0005
0006
0007
0008
0009
0010
0011
0012
0013
0014
0015

C Data Type Compaq Alpha


z int

Typical 32-bit

Intel IA32

4
4
1
2
4
8
8
4

4
4
1
2
4
8
10/12
4

4
8
1
2
4
8
8
8

z long int
z char
z short
z float
z double
z long double
z char *

Or any other pointer

10

Byte Ordering

Byte Ordering Example

How should bytes within multimulti-byte word be


ordered in memory?

Big Endian

Conventions

Little Endian

Suns, Macs are Big Endian machines

Least significant byte has lowest address

Example

z Least significant byte has highest address

Least significant byte has highest address

Alphas, PCs are Little Endian machines

z Least significant byte has lowest address

Variable x has 4-byte representation 0x01234567


Address given by &x is 0x100

Big Endian

0x100 0x101 0x102 0x103

Little Endian

0x100 0x101 0x102 0x103

01

67

11

23

45

45

23

67

01

12

Page 2

Reading Byte-Reversed Listings

Examining Data Representations

Disassembly

Code to Print Byte Representation of Data

Text representation of binary machine code


Generated by program that reads the machine code

Casting pointer to unsigned char * creates byte array


typedef unsigned char *pointer;

Example Fragment
Address
8048365:
8048366:
804836c:

Instruction Code
5b
81 c3 ab 12 00 00
83 bb 28 00 00 00 00

void show_bytes(pointer start, int len)


{
int i;
for (i = 0; i < len; i++)
printf("0x%p\t0x%.2x\n",
start+i, start[i]);
printf("\n");
}

Assembly Rendition
pop
%ebx
add
$0x12ab,%ebx
cmpl
$0x0,0x28(%ebx)

Deciphering Numbers

Value:
Pad to 4 bytes:
Split into bytes:
Reverse:

0x12ab
0x000012ab
00 00 12 ab
ab 12 00 00

Printf directives:
%p: Print pointer
%x: Print Hexadecimal
13

14

show_bytes Execution Example

Representing Integers
int A = 15213;
int B = -15213;
long int C = 15213;

int a = 15213;
printf("int a = 15213;\n");

Linux/Alpha A

show_bytes((pointer) &a, sizeof(int));

6D
3B
00
00

Result (Linux):

Decimal: 15213
Binary:

0011 1011 0110 1101

Hex:

0x6d

0x11ffffcb9

0x3b

0x11ffffcba

0x00

0x11ffffcbb

0x00

Linux/Alpha B
93
C4
FF
FF

Linux C

Alpha C

Sun C

00
00
3B
6D

6D
3B
00
00

6D
3B
00
00
00
00
00
00

00
00
3B
6D

int a = 15213;
0x11ffffcb8

Sun A

Sun B
FF
FF
C4
93

Twos complement representation


(Covered next lecture)

15

Alpha P

Representing Pointers
int B = -15213;
int *P = &B;
Alpha Address
Hex:

Binary:
Sun P
EF
FF
FB
2C

16

0001 1111 1111 1111 1111 1111 1100 1010 0000

Representing Floats

A0
FC
FF
FF
01
00
00
00

Float F = 15213.0;
Linux/Alpha F
00
B4
6D
46

Sun F
46
6D
B4
00

Sun Address
Hex:
Binary:

E
F
F
F
F
B
2
C
1110 1111 1111 1111 1111 1011 0010 1100

Linux Address
Hex:
Binary:

B
F
F
F
F
8
D
4
1011 1111 1111 1111 1111 1000 1101 0100

IEEE Single Precision Floating Point Representation


Linux P

Hex:
Binary:

D4
F8
FF
BF

15213:

4
6
6
D
B
4
0
0
0100 0110 0110 1101 1011 0100 0000 0000
1110 1101 1011 01

Not same as integer representation, but consistent across machines


Different compilers & machines assign different locations to objects
Can see some relation to integer representation, but not obvious
17

18

Page 3

Representing Strings
Strings in C

Machine-Level Code Representation


char S[6] = "15213";

Represented by array of characters


Each character encoded in ASCII format

z Standard 7-bit encoding of character set


z Other encodings exist, but uncommon
z Character 0 has code 0x30

Digit i has code 0x30+i

Encode Program as Sequence of Instructions


Each instruction is a simple operation

Linux/Alpha S Sun S

String should be null-terminated


z Final character = 0

31
35
32
31
33
00

z Arithmetic operation
z Read or write memory

31
35
32
31
33
00

z Conditional branch

Instructions encoded as bytes


z Alphas, Suns, Macs use 4 byte instructions

Reduced Instruction Set Computer (RISC)


z PCs use variable length instructions

Complex Instruction Set Computer (CISC)

Compatibility

Byte ordering not an issue

Text files generally platform independent

Different instruction types and encodings for different


machines

z Data are single byte quantities

z Most code not binary compatible

Programs are Byte Sequences Too!

z Except for different conventions of line termination character(s)!


19

20

Representing Instructions
int sum(int x, int y)
{
return x+y;
}

For this example, Alpha &


Sun use two 4-byte
instructions
z Use differing numbers of

instructions in other cases

Boolean Algebra
Developed by George Boole in 19th Century

Alpha sum
00
00
30
42
01
80
FA
6B

Sun sum

PC sum

81
C3
E0
08
90
02
00
09

55
89
E5
8B
45
0C
03
45
08
89
EC
5D
C3

PC uses 7 instructions with


lengths 1, 2, and 3 bytes
z Same for NT and for Linux
z NT / Linux not fully binary

compatible

Algebraic representation of logic


z Encode True as 1 and False as 0

And

Or
A&B = 1 when both A=1 and
B=1
& 0 1

0 0 0
1 0 1
Not

~A = 1 when A=0

0 0 1
1 1 1
ExclusiveExclusive-Or (Xor
(Xor))

~
0 1
1 0

Different machines use totally different instructions and encodings

A|B = 1 when either A=1 or


B=1
| 0 1

A^B = 1 when either A=1 or


B=1, but not both

^ 0 1
0 0 1
1 1 0

21

22

Application of Boolean Algebra

Integer Algebra

Applied to Digital Systems by Claude Shannon

Integer Arithmetic

1937 MIT Masters Thesis


Reason about networks of relay switches
z Encode closed switch as 1, open switch as 0

A&~B
A

~B

~A

Connection when
A&~B | ~A&B

~A&B

= A^B

23

Z, +, *, , 0, 1 forms a ring

Addition is the sum operation

Multiplication is the product operation

is the additive inverse

0 is the identity for sum

1 is the identity for product

24

Page 4

Boolean Algebra Integer Ring

Boolean Algebra

Boolean Algebra

{0,1}, |, &, ~, 0, 1 forms a Boolean algebra

Or is the sum operation

And is the product operation

~ is the complement operation (not


additive inverse)

0 is the identity for sum


1 is the identity for product

Commutativity
A|B = B|A
A&B = B&A
Associativity
(A | B) | C = A | (B | C)
(A & B) & C = A & (B & C)
Product distributes over sum
A & (B | C) = (A & B) | (A & C)
Sum and product identities
A|0 = A
A&1 = A
Zero is product annihilator
A&0 = 0
Cancellation of negation
~ (~ A) = A

A+B = B+A
A*B = B*A
(A + B) + C = A + (B + C)
(A * B) * C = A * (B * C)
A * (B + C) = A * B + B * C
A+0 = A
A*1 =A
A*0 = 0
( A) = A

25

26

Boolean Algebra Integer Ring

{0,1}, ^, &, , 0, 1
Identical to integers mod 2
is identity operation: (A) = A
A^A=0

z A is true or A is true = A is true

A&A = A
Boolean: Absorption
A | (A & B) = A

A *A A

Property

A + (A * B) A

z A is true or A is true and B is true = A is true

A & (A | B) = A
Boolean: Laws of Complements
A | ~A = 1

A * (A + B) A

A + A 1

z A is true or A is false

Properties of & and ^

Boolean Ring

Boolean: Sum distributes over product


A | (B & C) = (A | B) & (A | C) A + (B * C) (A + B) * (B + C)
Boolean: Idempotency
A|A = A
A +A A

Ring: Every element has additive inverse


A | ~A 0
A + A = 0

Boolean Ring

Commutative sum
Commutative product
Associative sum
Associative product
Prod. over sum
0 is sum identity
1 is prod. identity
0 is product annihilator
Additive inverse

A^B = B^A
A&B = B&A
(A ^ B) ^ C = A ^ (B ^ C)
(A & B) & C = A & (B & C)
A & (B ^ C) = (A & B) ^ (B & C)
A^0 = A
A&1 = A
A&0=0
A^A = 0

27

28

Relations Between Operations

General Boolean Algebras

DeMorgans Laws

Operate on Bit Vectors

Express & in terms of |, and vice-versa


Operations applied bitwise

z A & B = ~(~A | ~B)

A and B are true if and only if neither A nor B is false


z A | B = ~(~A & ~B)
A or B are true if and only if A and B are not both
false

01101001
& 01010101
01000001
01000001

01101001
| 01010101
01111101
01111101

01101001
^ 01010101
00111100
00111100

~ 01010101
10101010
10101010

All of the Properties of Boolean Algebra Apply

ExclusiveExclusive-Or using Inclusive Or


z A ^ B = (~A & B) | (A & ~B)

Exactly one of A and B is true


z A ^ B = (A | B) & ~(A & B)

Either A is true, or B is true, but not both

29

30

Page 5

Representing & Manipulating Sets

Bit-Level Operations in C

Representation

Operations &, |, ~, ^ Available in C

Width w bit vector represents subsets of {0, , w1}


aj = 1 if j A
{ 0, 3, 5, 6 }
01101001
76543210

Apply to any integral data type

View arguments as bit vectors


Arguments applied bit-wise

z long, int, short, char

Examples (Char data type)

{ 0, 2, 4, 6 }

01010101
76543210

&
|
^
~

~0x41 -->

0xBE

~010000012

Operations
Intersection
Union
Symmetric difference
Complement

01000001
01111101
00111100
10101010

{ 0, 6 }
{ 0, 2, 3, 4, 5, 6 }
{ 2, 3, 4, 5 }
{ 1, 3, 5, 7 }

~0x00 -->

0x69 & 0x55

--> 101111102

~000000002

0xFF
--> 111111112

-->

0x41

011010012 & 010101012 --> 010000012

0x69 | 0x55

-->

0x7D

011010012 | 010101012 --> 011111012


31

32

Contrast: Logic Operations in C

Shift Operations

Contrast to Logical Operators

Left Shift:

&&, ||, !
z View 0 as False

z Throw away extra bits on left

z Anything nonzero as True

z Fill with 0s on right

z Always return 0 or 1

Right Shift:

z Early termination

Examples (char data type)

!0x41 -->
!0x00 -->
!!0x41 -->

Argument x 01100010

x >> y

Shift bit-vector x right y


positions

<< 3

00010000

Log. >> 2

00011000

Arith. >> 2 00011000

Argument x 10100010

z Throw away extra bits on right

0x00
0x01
0x01

Logical shift
z Fill with 0s on left

x << y

Shift bit-vector x left y positions

-->
-->

Arithmetic shift

<< 3

00010000

Log. >> 2

00101000

Arith. >> 2 11101000

z Replicate most significant bit on

0x69 && 0x55


0x69 || 0x55

0x01
0x01

p && *p (avoids null pointer access)

right
z Useful with twos complement

integer representation
33

34

Cool Stuff with Xor

Main Points
Its All About Bits & Bytes

void funny(int *x, int *y)


{
*x = *x ^ *y;
/* #1 */
*y = *x ^ *y;
/* #2 */
*x = *x ^ *y;
/* #3 */
}

Bitwise Xor is a
form of addition
With extra property
that every value is
its own additive
inverse
A^A=0

Numbers
Programs
Text

Different Machines Follow Different Conventions

*x

*y

Begin

A^B

A^B

(A^B)^B = A

(A^B)^A = B

End

Word size
Byte ordering
Representations

Boolean Algebra is Mathematical Basis


Basic form encodes false as 0, true as 1
General form like bit-level operations in C
z Good for representing & manipulating sets

35

36

Page 6

Vous aimerez peut-être aussi