
cache 1

Mapping of cache

Main memory: 256 B, 8 address bits for byte access, block size 8 B, 5 address bits for block access, 32 blocks in total.
Cache memory: 128 B, cache line size 8 B, 16 cache lines.

Mapping of the CPU-generated logical address in the given cache

Fully associative (16-way set associative), one set only
Address fields: Tag 5 bits | Index 0 bits | Offset 3 bits

Cache contents (Set 0):

CPU-generated address (hex)   Tag      Valid bit   Cache line
48                            0 1001   1           Block 9
F4                            1 1110   1           Block 30
D3                            1 1010   1           Block 26
DC                            1 1011   1           Block 27
06                            0 0000   1           Block 0
E2                            1 1100   1           Block 28
1A                            0 0011   1           Block 3

The remaining nine cache lines have valid bit 0.

Example of a block of 8 bytes

An address consists of the 5-bit block address followed by the 3-bit byte address within the block: 00000 xxx. So for the 0th to the 7th byte, Block 0 is selected:

00000 000
00000 001
00000 010
00000 011
00000 100
00000 101
00000 110
00000 111

Main memory blocks, their 5-bit tag values and their block addresses (hex):

Main memory   5-bit tag value   Block address
Block 0       0 0000            00 (00-07)
Block 1       0 0001            08 (08-0F)
Block 2       0 0010            10 (10-17)
Block 3       0 0011            18 (18-1F)
Block 4       0 0100            20 (20-27)
Block 5       0 0101            28 (28-2F)
Block 6       0 0110            30 (30-37)
Block 7       0 0111            38 (38-3F)
Block 8       0 1000            40 (40-47)
Block 9       0 1001            48 (48-4F)
Block 10      0 1010            50 (50-57)
Block 11      0 1011            58 (58-5F)
Block 12      0 1100            60 (60-67)
Block 13      0 1101            68 (68-6F)
Block 14      0 1110            70 (70-77)
Block 15      0 1111            78 (78-7F)
Block 16      1 0000            80 (80-87)
Block 17      1 0001            88 (88-8F)
Block 18      1 0010            90 (90-97)
Block 19      1 0011            98 (98-9F)
Block 20      1 0100            A0 (A0-A7)
Block 21      1 0101            A8 (A8-AF)
Block 22      1 0110            B0 (B0-B7)
Block 23      1 0111            B8 (B8-BF)
Block 24      1 1000            C0 (C0-C7)
Block 25      1 1001            C8 (C8-CF)
Block 26      1 1010            D0 (D0-D7)
Block 27      1 1011            D8 (D8-DF)
Block 28      1 1100            E0 (E0-E7)
Block 29      1 1101            E8 (E8-EF)
Block 30      1 1110            F0 (F0-F7)
Block 31      1 1111            F8 (F8-FF)
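The split of an 8-bit address into block number and byte offset can be sketched in a few lines of Python. This is only an illustration: the function name split_address and the constant OFFSET_BITS are assumptions made for the sketch, not names used in the notes.

```python
OFFSET_BITS = 3  # 8-byte blocks need 3 offset bits (assumed constant name)


def split_address(addr):
    """Return (block number, byte offset) for an 8-bit byte address."""
    block = addr >> OFFSET_BITS                   # upper 5 bits
    offset = addr & ((1 << OFFSET_BITS) - 1)      # lower 3 bits
    return block, offset


for a in (0x48, 0xF4, 0x1A):
    block, offset = split_address(a)
    # In the fully associative cache the 5-bit tag is the block number itself.
    print(f"{a:02X}: block {block}, tag {block:05b}, byte {offset}")
```

For example, address 48 falls in Block 9 at byte 0, and F4 falls in Block 30 at byte 4.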

Example: the valid bits of all cache lines are 0. The CPU generates the following addresses: 48, F4, 06, E2, F4, DC, 1A, DC, E2.
Because the cache is fully associative, any block can go anywhere in the cache. The six distinct blocks (9, 30, 0, 28, 27 and 3) each cause a compulsory miss on first reference; each block is brought into the cache as a result of the miss and its valid bit is set. The repeated addresses (the second F4, the second DC and the second E2) are hits because the block is already in the cache. The nine references therefore result in 6 misses and 3 hits.

Example: the cache already holds the memory blocks shown in the cache contents table. The CPU generates the same addresses: 48, F4, 06, E2, F4, DC, 1A, DC, E2.
Because the cache is fully associative, any block can go anywhere in the cache. All the requested blocks are present, which results in 9 hits.
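The two traces can be replayed with a small sketch. It assumes 8-byte blocks and 16 cache lines as in the notes, models only which blocks are resident (not the data), and does not model a replacement policy because neither trace fills the cache. The function names are illustrative.

```python
def block_number(addr, offset_bits=3):
    """Block number of a byte address (8-byte blocks by default)."""
    return addr >> offset_bits


def run_fully_associative(addresses, preloaded=(), num_lines=16):
    """Count (hits, misses) for a fully associative cache.

    `preloaded` is the set of block numbers already in the cache.
    """
    cached = set(preloaded)
    hits = misses = 0
    for addr in addresses:
        blk = block_number(addr)
        if blk in cached:
            hits += 1
        else:
            misses += 1
            if len(cached) >= num_lines:
                raise RuntimeError("cache full: replacement is not modelled here")
            cached.add(blk)  # block brought in, valid bit set
    return hits, misses


trace = [0x48, 0xF4, 0x06, 0xE2, 0xF4, 0xDC, 0x1A, 0xDC, 0xE2]
cold = run_fully_associative(trace)                                  # empty cache
warm = run_fully_associative(trace, preloaded={9, 30, 26, 27, 0, 28, 3})
print(cold)  # (3, 6): 3 hits, 6 misses
print(warm)  # (9, 0): 9 hits, no misses
```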

Set associative (8-way set associative), two sets
Address fields: Tag 4 bits | Set (index) 1 bit | Offset 3 bits

Cache contents:

Set     CPU-generated address (hex)   Tag     Valid bit   Cache line
Set 0   77                            0 111   1           Block 14
        47                            0 100   1           Block 8
        03                            0 000   1           Block 0
        F3                            1 111   1           Block 30
        E2                            1 110   1           Block 28
Set 1   D3                            1 101   1           Block 26
        DC                            1 101   1           Block 27
        18                            0 001   1           Block 3
        7A                            0 111   1           Block 15
        99                            1 001   1           Block 19

Valid bits as drawn, line by line: Set 0: 0 1 1 0 0 1 1 1; Set 1: 0 1 1 1 0 0 1 1. Three lines in each set are empty.
Address D3 is drawn in Set 1, but it lies in Block 26, an even block whose index bit is 0; under this address split it belongs to Set 0.

Main memory mapping (the index bit is the least significant bit of the block number):

4-bit tag value   Set 0 (index 0)            Set 1 (index 1)
                  Block      Block address   Block      Block address
0 000             Block 0    00              Block 1    08
0 001             Block 2    10              Block 3    18
0 010             Block 4    20              Block 5    28
0 011             Block 6    30              Block 7    38
0 100             Block 8    40              Block 9    48
0 101             Block 10   50              Block 11   58
0 110             Block 12   60              Block 13   68
0 111             Block 14   70              Block 15   78
1 000             Block 16   80              Block 17   88
1 001             Block 18   90              Block 19   98
1 010             Block 20   A0              Block 21   A8
1 011             Block 22   B0              Block 23   B8
1 100             Block 24   C0              Block 25   C8
1 101             Block 26   D0              Block 27   D8
1 110             Block 28   E0              Block 29   E8
1 111             Block 30   F0              Block 31   F8

Set associative (4-way set associative), four sets
Address fields: Tag 3 bits | Set (index) 2 bits | Offset 3 bits

Cache contents:

Set     Index   CPU-generated address (hex)   Tag    Valid bit   Cache line
Set 0   00      23                            0 01   1           Block 4
                83                            1 00   1           Block 16
                03                            0 00   1           Block 0
Set 1   01      4B                            0 10   1           Block 9
                E9                            1 11   1           Block 29
Set 2   10      F2                            1 11   1           Block 30
                30                            0 01   1           Block 6
                B1                            1 01   1           Block 22
Set 3   11      79                            0 11   1           Block 15
                DC                            1 10   1           Block 27
                BE                            1 01   1           Block 23

Valid bits as drawn, line by line: Set 0: 0 1 1 1; Set 1: 0 1 1 0; Set 2: 0 1 1 1; Set 3: 0 1 1 1.

Main memory mapping (the index is the two least significant bits of the block number; block addresses in hex are given in parentheses):

3-bit tag value   Set 0 (index 00)   Set 1 (index 01)   Set 2 (index 10)   Set 3 (index 11)
0 00              Block 0 (00)       Block 1 (08)       Block 2 (10)       Block 3 (18)
0 01              Block 4 (20)       Block 5 (28)       Block 6 (30)       Block 7 (38)
0 10              Block 8 (40)       Block 9 (48)       Block 10 (50)      Block 11 (58)
0 11              Block 12 (60)      Block 13 (68)      Block 14 (70)      Block 15 (78)
1 00              Block 16 (80)      Block 17 (88)      Block 18 (90)      Block 19 (98)
1 01              Block 20 (A0)      Block 21 (A8)      Block 22 (B0)      Block 23 (B8)
1 10              Block 24 (C0)      Block 25 (C8)      Block 26 (D0)      Block 27 (D8)
1 11              Block 28 (E0)      Block 29 (E8)      Block 30 (F0)      Block 31 (F8)

[Figure: lookup hardware for the 4-way set associative cache. The CPU-generated address is split into TAG, index and word offset. The index drives a decoder that selects a set (Set 0 to Set 3, index 00 to 11). Logic with 4 comparators and 4 AND gates selects the requested cache line and produces HIT. On a hit, the word offset drives a decoder that selects a word, and the word is sent to the CPU. On a miss (tag not matched or valid bit 0), the block is read from the next-level memory. The numbered labels in the figure correspond to the steps listed under Working.]

Working
1. The CPU generates a logical address. It is interpreted as tag, index and word offset according to the cache organization in hardware.
2. The index bits are applied to a decoder to select one set of tags and the corresponding valid bits.
3. The tags of the selected set are compared simultaneously with the input tag wherever the corresponding valid bit is set.
If the requested block is not in the cache:
4. There is no match between the requested tag and any stored tag. This is known as a miss.
5. On a miss, the memory management unit reads the requested block from the next-level memory and places the new block either in a cache line whose valid bit is 0 or in a target cache line found by one of the replacement methods.
Else (the requested block is present in the cache):
4. The comparison results in a match. This is known as a hit.
5. The corresponding cache line, along with the hit signal and the word offset bits, is applied to a decoder.
6. The word offset selects the desired word from the cache line.
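The steps above can be sketched in software. This is a minimal model, not the hardware: the class name SetAssociativeCache is hypothetical, data bytes are not stored, the index is assumed to be the low-order bits of the block number (as in the 8-way, 4-way and 2-way examples), and a simple round-robin victim choice stands in for "one of the replacement methods".

```python
class SetAssociativeCache:
    """Tag/valid-bit model of a set associative cache (data not modelled)."""

    def __init__(self, num_sets, ways, offset_bits=3):
        self.num_sets = num_sets
        self.ways = ways
        self.offset_bits = offset_bits
        # Each cache line is [valid bit, tag].
        self.sets = [[[0, None] for _ in range(ways)] for _ in range(num_sets)]
        # Round-robin pointer per set: an assumed replacement method.
        self.next_victim = [0] * num_sets

    def access(self, addr):
        # Step 1: interpret the address as tag, index and offset.
        block = addr >> self.offset_bits
        index = block % self.num_sets
        tag = block // self.num_sets
        # Step 2: the index selects one set.
        lines = self.sets[index]
        # Step 3: compare the tag with every valid line of the set.
        for valid, stored_tag in lines:
            if valid == 1 and stored_tag == tag:
                return "hit"
        # Miss: use a line with valid bit 0 if there is one ...
        for line in lines:
            if line[0] == 0:
                line[0], line[1] = 1, tag
                return "miss"
        # ... otherwise overwrite the line chosen by the replacement method.
        victim = self.next_victim[index]
        lines[victim][1] = tag
        self.next_victim[index] = (victim + 1) % self.ways
        return "miss"


cache = SetAssociativeCache(num_sets=4, ways=4)
results = [cache.access(a) for a in (0x23, 0x83, 0x03, 0x23)]
print(results)  # ['miss', 'miss', 'miss', 'hit']
```

Addresses 23, 83 and 03 all fall in Set 0 of the 4-way cache with different tags, so the second reference to 23 hits.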

Set associative (2-way set associative), eight sets
Address fields: Tag 2 bits | Set (index) 3 bits | Offset 3 bits

Cache contents:

Set     CPU-generated address (hex)   Tag   Valid bit   Cache line
Set 0   03                            00    1           Block 0
        C7                            11    1           Block 24
Set 1   88                            10    1           Block 17
Set 2   14                            00    1           Block 2
Set 3   DA                            11    1           Block 27
Set 4   65                            01    1           Block 12
        27                            00    1           Block 4
Set 5   AB                            10    1           Block 21
Set 6   B2                            10    1           Block 22
Set 7   FF                            11    1           Block 31
        BC                            10    1           Block 23

The remaining five cache lines have valid bit 0.

Main memory mapping (the index is the three least significant bits of the block number; block addresses in hex are given in parentheses):

Set     3-bit index   Tag 00         Tag 01          Tag 10          Tag 11
Set 0   0 0 0         Block 0 (00)   Block 8 (40)    Block 16 (80)   Block 24 (C0)
Set 1   0 0 1         Block 1 (08)   Block 9 (48)    Block 17 (88)   Block 25 (C8)
Set 2   0 1 0         Block 2 (10)   Block 10 (50)   Block 18 (90)   Block 26 (D0)
Set 3   0 1 1         Block 3 (18)   Block 11 (58)   Block 19 (98)   Block 27 (D8)
Set 4   1 0 0         Block 4 (20)   Block 12 (60)   Block 20 (A0)   Block 28 (E0)
Set 5   1 0 1         Block 5 (28)   Block 13 (68)   Block 21 (A8)   Block 29 (E8)
Set 6   1 1 0         Block 6 (30)   Block 14 (70)   Block 22 (B0)   Block 30 (F0)
Set 7   1 1 1         Block 7 (38)   Block 15 (78)   Block 23 (B8)   Block 31 (F8)

Example: A purely sequential programme occupies the address space 80h-CBh in main memory (assume each instruction takes one byte-sized address and the cache line size is 8 bytes). The CPU executes the programme once (each instruction is executed once, in sequence). Calculate the number of hits and misses.
- The cache is 2-way set associative with eight sets, so two blocks can occupy each set at a time. Assume the cache is initially empty.
- First instruction, address 80h: compulsory miss; loads Block 16 from main memory into Set 0 of the cache. The next 7 addresses (81h to 87h) are hits.
- Address 88h: compulsory miss; loads Block 17 from main memory into Set 1 of the cache (into an empty cache line). The next 7 addresses (89h to 8Fh) are hits.
- Addresses 90h to BFh: the same pattern repeats for Blocks 18 to 23, which load into Sets 2 to 7. Each block gives 1 miss followed by 7 hits.
- Address C0h: miss; loads Block 24 from main memory into Set 0. Block 16 occupies one line of Set 0, so Block 24 goes into the other line and nothing is overwritten. The next 7 addresses (C1h to C7h) are hits.
- Address C8h: miss; loads Block 25 from main memory into the free line of Set 1. The next 3 addresses (C9h to CBh) are hits.

Blocks     Addresses   Misses   Hits
16 to 23   80h-BFh     8        56
24         C0h-C7h     1        7
25         C8h-CBh     1        3
Total      80h-CBh     10       66

Total references: 76, of which 10 are misses and 66 are hits.
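The count for a sequential run can be checked by replaying every byte address from 80h to CBh through a small model of the 2-way, eight-set cache. The sketch assumes an initially empty cache, 8-byte cache lines, the low three bits of the block number as index, and LRU replacement (the policy the example mentions); the function name simulate is illustrative.

```python
def simulate(addresses, num_sets=8, ways=2, offset_bits=3):
    """Return (hits, misses) for a set associative cache with LRU replacement."""
    # Each set is a list of tags, least recently used first.
    sets = [[] for _ in range(num_sets)]
    hits = misses = 0
    for addr in addresses:
        block = addr >> offset_bits
        index, tag = block % num_sets, block // num_sets
        tags = sets[index]
        if tag in tags:
            hits += 1
            tags.remove(tag)          # will be re-appended as most recent
        else:
            misses += 1
            if len(tags) == ways:
                tags.pop(0)           # evict the least recently used tag
        tags.append(tag)
    return hits, misses


hits, misses = simulate(range(0x80, 0xCC))   # addresses 80h..CBh inclusive
print(hits, misses)  # 66 10
```

Under these assumptions each of the ten blocks touched (16 to 25) misses exactly once, and no line is evicted because no set receives more than two blocks.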

Direct mapped (1-way set associative), sixteen sets
Address fields: Set (index) 4 bits | Tag 1 bit | Offset 3 bits

Here the index bits and the tag are interchanged: the index is taken from the four high-order address bits and the tag is the bit that follows. The logical interpretation also changes: each set serves two consecutive main memory blocks, distinguished by the 1-bit tag.

Cache contents and main memory mapping (block addresses in hex are given in parentheses):

Set      4-bit index   Valid bit   Tag   CPU-generated address (hex)   Cache line   Main memory, tag 0   Main memory, tag 1
Set 0    0 0 0 0       1           0     03                            Block 0      Block 0 (00)         Block 1 (08)
Set 1    0 0 0 1       1           1     19                            Block 3      Block 2 (10)         Block 3 (18)
Set 2    0 0 1 0       1           0     23                            Block 4      Block 4 (20)         Block 5 (28)
Set 3    0 0 1 1       0                                                            Block 6 (30)         Block 7 (38)
Set 4    0 1 0 0       0                                                            Block 8 (40)         Block 9 (48)
Set 5    0 1 0 1       1           1     58                            Block 11     Block 10 (50)        Block 11 (58)
Set 6    0 1 1 0       1           1     6B                            Block 13     Block 12 (60)        Block 13 (68)
Set 7    0 1 1 1       0                                                            Block 14 (70)        Block 15 (78)
Set 8    1 0 0 0       1           1     89                            Block 17     Block 16 (80)        Block 17 (88)
Set 9    1 0 0 1       1           1     9A                            Block 19     Block 18 (90)        Block 19 (98)
Set 10   1 0 1 0       1           1     AB                            Block 21     Block 20 (A0)        Block 21 (A8)
Set 11   1 0 1 1       0                                                            Block 22 (B0)        Block 23 (B8)
Set 12   1 1 0 0       0                                                            Block 24 (C0)        Block 25 (C8)
Set 13   1 1 0 1       1           0     D3                            Block 26     Block 26 (D0)        Block 27 (D8)
Set 14   1 1 1 0       0                                                            Block 28 (E0)        Block 29 (E8)
Set 15   1 1 1 1       1           0     F3                            Block 30     Block 30 (F0)        Block 31 (F8)

Please refer to the class notes (Spring 2009).
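The interchanged split used in this direct-mapped example can be sketched as follows. The function name split_index_high is an assumption for illustration; it takes the index from the four high-order address bits and the tag from the next bit, as the table for this example shows.

```python
def split_index_high(addr):
    """Split an 8-bit address as index (4 bits) | tag (1 bit) | offset (3 bits)."""
    offset = addr & 0b111      # low 3 bits: byte within the block
    block = addr >> 3          # 5-bit block number
    tag = block & 0b1          # lowest bit of the block number
    index = block >> 1         # upper 4 bits of the block number select the set
    return index, tag, offset


for a in (0x03, 0x19, 0xD3):
    index, tag, offset = split_index_high(a)
    print(f"{a:02X}: set {index}, tag {tag}, byte {offset}")
```

With this split, address 19 (Block 3) lands in Set 1 with tag 1, and D3 (Block 26) lands in Set 13 with tag 0.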


cache 2
Cache memory
A portion of main memory is copied to a faster memory that is closer to the processor.
Main memory is 64 KB and the cache is 1 KB. The CPU generates an effective address (logical address), say 16 bits wide:

16-bit byte address

If the block size (cache line) is 64 B, then log2 64 = 6 bits are required to select one of the 64 bytes in the block (cache line). These 6 bits are the offset bits:

10-bit block address | 6-bit offset

The cache requires 10 bits to access a byte (1 KB cache), of which 6 are the offset. So there are 2^4 = 16 cache lines, each of 64 bytes.

Logical view of cache: cache line numbers run from 0000 (0th cache line) to 1111 (15th cache line). Within each cache line the byte offset runs from 00 0000 to 11 1111.

There are 2^10 = 1024 blocks of main memory, each of 64 bytes.

Logical view of main memory: block numbers (address bits A15-6) run from 00 0000 0000 (0th block) to 11 1111 1111 (1023rd block). Within each block the byte offset (address bits A5-0) runs from 00 0000 to 11 1111.

Fully associative cache (16-way set associative cache)
The CPU address is interpreted as:

10-bit block address | 6-bit offset

The 10-bit block address is termed the tag; it identifies the block holding the requested byte:

10-bit tag | 6-bit offset

Any 16 blocks of main memory can be stored in the cache, and any block can be stored in any cache line. Because the tag identifies the block, the tag of each block is also stored in the tag line of the respective cache line.

[Figure: the 16 cache lines, numbered 0000 (0th) to 1111 (15th). Each cache line has a 10-bit tag line, a comparator with hit (H) and miss (M) outputs, and 64 bytes at offsets 00 0000 to 11 1111.]

At most one comparator can generate HIT: the one for the cache line on which the block holding the requested byte is mapped. If none of the comparators generates HIT, the block holding the requested byte has not been read from main memory into the cache.
The valid bit and the dirty bit are not included in this example.
There are 16 cache lines where a block can be mapped without any restriction. There are 16 ways a block can be placed in the cache, hence a 16-way set associative cache.

Index bits = log2 (number of cache lines / number of ways)
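As a quick check of the index formula, the sketch below computes the tag, index and offset widths for the 1 KB cache with 64-byte lines and a 16-bit address, for each associativity discussed in these notes. The function name field_widths is an assumption; all sizes are assumed to be powers of two.

```python
def log2_exact(n):
    """log2 of a power of two, as an integer."""
    if n <= 0 or n & (n - 1):
        raise ValueError("expected a power of two")
    return n.bit_length() - 1


def field_widths(address_bits, cache_bytes, line_bytes, ways):
    """Return (tag bits, index bits, offset bits)."""
    offset = log2_exact(line_bytes)
    lines = cache_bytes // line_bytes
    index = log2_exact(lines // ways)     # log2(cache lines / ways)
    tag = address_bits - index - offset
    return tag, index, offset


for ways in (16, 8, 4, 1):
    print(ways, field_widths(16, 1024, 64, ways))
# 16 (10, 0, 6)
# 8 (9, 1, 6)
# 4 (8, 2, 6)
# 1 (6, 4, 6)
```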

Set associative cache (8-way set associative cache)
The CPU address is interpreted as:

9-bit tag (A15-7) | 1-bit index (A6) | 6-bit offset (A5-0)

Of the 10-bit block number, the 9-bit tag and the 1-bit index together identify the block holding the requested byte.
The cache lines are placed in 2^1 = 2 groups. The index bit selects one of these groups; this is implemented through a decoder (with k index bits, the decoder selects one of 2^k groups).

Logical view of main memory (the least significant bit of the 10-bit block number is used as the index bit):

Index bit (A6)   Group     Block numbers                   Tag numbers (A15-7)
0                Group 0   0, 2, 4, 6, ..., 1020, 1022     0 0000 0000 to 1 1111 1111
1                Group 1   1, 3, 5, 7, ..., 1021, 1023     0 0000 0000 to 1 1111 1111

Each group holds 512 blocks. Within every block the 6-bit byte offset runs from 00 0000 to 11 1111.

Blocks with the same index value map to the same group of the cache. Since the tag has 9 bits, 512 blocks go to one group of 8 cache lines.

Set associative cache (4-way set associative cache)
The CPU address is interpreted as:

8-bit tag (A15-8) | 2-bit index (A7-6) | 6-bit offset (A5-0)

Of the 10-bit block number, the 8-bit tag and the 2-bit index together identify the block holding the requested byte.
The cache lines are placed in 2^2 = 4 groups. The index bits select one of these groups; this is implemented through a decoder (with k index bits, the decoder selects one of 2^k groups).

Logical view of main memory (the two least significant bits of the 10-bit block number are used as index bits):

Index bits (A7-6)   Block numbers             Tag numbers (A15-8)
00                  0, 4, ..., 1020           0000 0000 to 1111 1111
01                  1, 5, ..., 1021           0000 0000 to 1111 1111
10                  2, 6, ..., 1022           0000 0000 to 1111 1111
11                  3, 7, ..., 1023           0000 0000 to 1111 1111

Each group holds 256 blocks. Within every block the 6-bit byte offset runs from 00 0000 to 11 1111.

Blocks with the same index value map to the same group of the cache. Since the tag has 8 bits, 256 blocks go to one group of 4 cache lines.
(What if a combination of block address bits is taken as the index bits? ???)


Set associative cache (1-way set associative cache): direct mapped
The CPU address is interpreted as:

6-bit tag (A15-10) | 4-bit index (A9-6) | 6-bit offset (A5-0)

Of the 10-bit block number, the 6-bit tag and the 4-bit index together identify the block holding the requested byte.
The cache lines are placed in 2^4 = 16 groups. The index bits select one of these groups; this is implemented through a decoder (with k index bits, the decoder selects one of 2^k groups).

Logical view of main memory (the four least significant bits of the 10-bit block number are used as index bits; tag numbers run from 0000 00 to 1111 11):

Index bits (A9-6)   Block numbers
0000                0, 16, ..., 1008
0001                1, 17, ..., 1009
0010                2, 18, ..., 1010
0011                3, 19, ..., 1011
0100                4, 20, ..., 1012
...                 ...
1011                11, 27, ..., 1019
1100                12, 28, ..., 1020
1101                13, 29, ..., 1021
1110                14, 30, ..., 1022
1111                15, 31, ..., 1023

Each group holds 64 blocks. Within every block the 6-bit byte offset runs from 00 0000 to 11 1111.

Blocks with the same index value map to one group of the cache. Since the tag has 6 bits, 64 blocks go to one group of 1 cache line.

Cache latency
Latency is the time to return the requested data (assuming a fully associative cache).
T1: time to access the tag array
T2: time to perform the tag comparison
T3: time to access the cache data array
T4: time to return the selected data or report a miss
The tag path (T1 + T2) and the data array access (T3) happen simultaneously, so the latency is T1 + T2 + T4 or T3 + T4, whichever is greater. A hit and a miss therefore both take the same time.
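The latency rule can be written as a one-line formula, max(T1 + T2, T3) + T4. The sketch below evaluates it for two sets of made-up timings; the function name cache_latency and the numbers are purely illustrative assumptions, not measurements.

```python
def cache_latency(t1, t2, t3, t4):
    """Latency when the tag path (t1 + t2) overlaps the data array access (t3)."""
    return max(t1 + t2, t3) + t4


# Hypothetical timings in arbitrary units.
data_bound = cache_latency(t1=1, t2=1, t3=3, t4=1)   # data array is the slower path
tag_bound = cache_latency(t1=2, t2=2, t3=3, t4=1)    # tag path is the slower path
print(data_bound, tag_bound)  # 4 5
```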


cache 3
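Memory access hierarchy

[Figure: flowchart with numbered steps. The CPU sends a request to Level I. Each level reports HIT or MISS; on a miss the request is passed to the next level (Level I, Level II, Level III, Level IV). After a hit at level n, the levels above it are updated in turn (update Level III, update Level II, update Level I) and the data is returned to the CPU.]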
