Vous êtes sur la page 1sur 37

A SEMINAR REPORT ON

iCACHE Submitted in partial fulfillment of Bachelor of Engineering Degree

University Of KOTA, Jaipur


College logo

(2009-10)
Submitted To: Mr. XXXX XXXXXX (H.O.D.) Mr. XXXXX XXXX Submitted By: XXXX XXXXX

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING XXX Engineering College, JAIPUR

CERTIFICATE

This is to clarify that seminar report entitled iCACHE has been submitted by XXXXX XXXXX of final year B.Tech Computer Science, XXX Engineering College, Jaipur. This work is found to be satisfactory and it was observed that he was sincere during course of his work.

Seminar Guide
Mr. XXXXX XXXX

Head Of Department Computer Science Mr. XXXX XXXXX

PREFACE
It gives me immense pleasure to present this paper to the Department of Computer Science and Engineering, XXX Engineering College, University of Kota. In partial fulfillment for the award of Degree of bachelor of Engineering in Computer Science. This report is the crux of the seminar presented on iCACHE at XXX Engineering College, Jaipur. Although, it was not possible for me to describe all the things in detail, I have tried my best to cover all the important concepts.

Acknowledgment
I sincerely acknowledge Mr. XXXX XXXX XXX, Head, Computer engineering Department, XXXXXXXX INSTITUTE OF TECHNOLOGY, JAIPUR who helped us a lot so that project was completed within specified period. We are extremely grateful to him for his kind consent, co-operation and encouragement. We are greatly motivated by his character of looking everything from completely differently angle and his never ending enthusiasm work. I also learned a lot from while attending his sessions. We acknowledge Mr. XXXXXXXXXXX (Seminar Coordinator), who has helped and guided us throughout in the completion of this project as well as the challenges that lie behind us. He is always there to meet and talk about my ideas, to proofread and mark up my project. We are really fortunate to work under this guidance .He was instrumental in making us understand how to implement the project and also provided continuous supervision of the project. This project has been benefited from the many useful comments provided to me by the numerous of my colleagous. In addition many other of my friends have checked it and have offered many suggestions and comments. Besides there are some books and some online helps. Although I cannot mention all these people here, I thank each and everyone who supported me on this. XXXX XXXXXXX B.Tech (IV Year)Comp.Sc. & Engg. XXX, JAIPUR

CONTENTS
S. No.
1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.

Topic
Introduction Application of Cache 1-way Cache The First Biometric Digital wallet Ram set 1 and 2 iCACHE operation Flowchart of Line Load Process Process Diagram of ICACHE CPU Bits for Controller the ICACHE PowerPC Architecture ICACHE iCACHE Register Advantage Conclusion Reference

INTRODUCTION OF INSTRUCTION CACHE The average person carries nine credit, debit and store loyalty cards [CNN Money]. That's a lot of bulk to carry around in your wallet. What if you could replace that wallet full of plastic with a single slim device that would function just like each of your cards? That's exactly what the makers of iCache have in mind. Realizing that the technology behind the credit or debit card hasn't changed much in the last 40 years, the creators of iCache decided to update the way we conduct our everyday business transactions [CNN Money]. The iCache gadget is as thin as a Razr cell phone, and its sleek style resembles an iPod. Basically, the iCache harbors data from every credit, debit, loyalty (for example, your supermarket discount card) or gift card that you own. The gadget's design is meant to appeal to young and technologically savvy consumers as well as those seeking a safe and secure way to carry out transactions. It's the first gadget of its kind -- a digital wallet secured by the owner's fingerprint. The iCache could not only alter day-to-day shopping, it could also change the way people travel. Instead of carrying a wallet full of credit and debit cards while you travel domestically or internationally, with the iCache travel gadget, you could streamline all your cards into one device. So far, the device has only been distributed by banks and other financial institutions to a select few customers. But it's a good glimpse at the potential future of making transactions. In fact, Jonathan Ramaci, CEO of the Cambridge, Mass.-based iCache, anticipates that 7 million units will be sold by the end of 2009 [CNN Money]. But how exactly can one device replace all of your credit cards? Read on to learn about the mechanics of the iCache and how to use this unique travel gadget.

Instruction CacheOn the TMS320VC5510 digital signal processor (DSP), instructions can reside in internal memory or external memory. When instructions reside in external memory, the instruction cache (I-Cache) can improve the overall system performance by buffering the most recent instructions accessed by theCPU. To configure the I-Cache and check its status, the CPU accesses a set of registers in
6

the I-Cache. For storing instructions, the I-Cache contains: One 2-way cache. The 2-way cache uses 2-way set associative mapping and holds up to 16K bytes: 512 sets, two lines per set, four 32-bit words per line. In the 2-way cache, each line is identified by a unique tag. Two RAM sets (1 and 2). These two banks of RAM are available to hold blocks of code. Each RAM set holds up to 4K bytes: 256 lines, four 32-bit words per line. Each RAM set uses a single tag to identify a continuous range of memory addresses that is represented in the RAM set. Before enabling the I-Cache, configure the I-Cache to use zero, one, or both RAM sets. Cache In computer science, a cache is a collection of data duplicating original values stored elsewhere or computed earlier, where of data is expensive to fetch or to compute, compared to the cost of reading the cache. In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, future use can be made by accessing the cached copy rather than re-fatching or recomputing the original data, so the average access time is shorter. History Use of the word cache in the compyter context originated in 1967 during preparation of an article for publication in the IBM systems Journal. The paper concerned an exciting memory improvement in model 85, a latecomer in the IBM system/360 product line. The Journal editor, Lyle R. Johnson, pleaded for a more descriptive term than high speed buffer. When none was forthcoming, he suggested the noun cache, from the French noun meaning a safekeeping or storage place. The paper was published in early 1968, the authors were honored by IBM, their work was widely welcomed and subsequently improved upon, and cache soon became standard usage in computer literature. Operation A cache is a block of memory for temporary storage of data likely to be used again. The CPU and hard drive frequently use a cache, as do web browsers and web servers.

A cache is made up of a pool of entries. Each entry has a datum which is a copy of the datum in some backing store. Each entry also has a tag, which specifies the identity of the datum in the backing store of which the entry is a copy. When the cache client wishes to access a datum presumably in the backing store, it first checks the cache. If an entry can be found with a tag matching that of the desired datum, the datum in the entry is used instead. This situation is known as a cache hit. So, for example, a web browser program might check its local cache on disk to see if it has a local copy of the contents of a web page at a particular URL. In this example, yhe URL is the tag, and the contents of the web page is the datum. The percentage of accesses that result in cache hits is known as hit rate of the cache. The alternate situation, when the cache is consulted and found not to contain a datum with the desired tag, is known as a cache miss. The previously uncased datum fetched from the backing store during miss handling is usually copied into the cache, ready for the next access. During a cache miss, the CPU usually ejects some other entry in other to make room for the previously uncased datum. The heuristic used to select the entry to eject is known as the replacement policy. One popular replacement policy, least recently used, replaces the least recently used entry. More efficient caches compute use frequency against the size of the backing store. While this works well for larger amounts of data, long latencies, and slow throughputs for both the cache and the backing store. While this works well for larger amounts of data, long latencies, and slow throughputs, such as experienced with a hard drive and the internet, its not efficient to use this for cached main memory. When a datum is written to the cache, it must at some point be written to the backing store as well. The timing of this written to the cache, it must at some point be written to the backing store as well. The timing of this write is controlled by what is known as the write is controlled by what is known as the write policy. In a write through cache, every write to the cache causes a synchronous write to the backing store. Alternatively, in a write back cache, writes are not immediately mirrored to the store. Instead, the cache tracks which of its locations have been written over. The data in these locations is written back to the backing store when those data are evicted from the cache, an effect reffered to as a lazy write. For this reason, a read miss in a write back cache will often require two memory accesses to service: one to retrieve the need datum, and one to write replaced data from the cache to the store.
8

Data write back may be triggered by other policies as well. The client may make many changes to a datum in the cache, and then explicitly notify the cache to write back datum. No write allocation is a cache policy where only processor reads are cached, thus avoiding the need for write back or write through when the old value of the datum was absent from the cache prior to the write. The data in the backing store may be changed by entities other than the cache, in which case the copy in the cache may become out of date or stale. Alternatively, when the client updates the data in the cache, copies of the data in other caches will become stale. Communication protocols between the cache manages which keep the data consistent are known as coherency protocols. Applications CPU caches Small memories on or close to the CPU can be made faster than the larger main memory. Most CPUs since the 1980s have used one or more caches, and modern microprocessors inside personal computers may have as half a dozen, each specialized to a different part of the task of executing programs. Disk cache While CPU caches are generally managed entirely by hardware, other caches are managed by a variety of software. The page cache in main memory, which is an example of the disk cache, is usually managed by the operating system kernel. While the hard drives hardware disk buffer is sometimes misleadingly referred to as disk cache, its main functions are write sequencing and read perfecting. Repeated cache hits are relatively rare, due to the small size of the buffer in comparison to HDDs capacity. In turn, fast local hard disk can be used to cache information held on even slower data storage devices, such as remote servers or local tape drives or optical jukeboxes. Such a scheme is the main concept of hierarchical storage management. Web cache Web cache are employed by web browsers and web proxy servers to store previous responses from web servers, such as web pages. Web caches reduce the amount of information that needs to be transmitted across the network, as information previously stored in the cache can often be re-used. This reduces bandwidth and processing requirements of the web server, and helps to improve responsiveness for users of the web.
9

Modern web browsers employ a built in web cache, but some internet service provides or organizations also use a cachinh proxy server, which is a web cache that is shared between all users of that network. Other caches The BIND DNS daemon caches a mapping of domain names to IP addresses, as does a resolver library. Write through operation is common when operating over unreliable networks, because of the enormous complexity of the coherency protocol required between multiple write back caches when communication is unreliable. For instance, Web page caches and client side network file system caches are typically read only or write through specifically to keep the network protocol simple and reliable. Search engine also frequently make web pages they have indexed available from their cache. For example, Google provides a Cached link next to each search result. This is useful when web pages are temporarily inaccessible from a web server. Another type of caching is stored computed results that will likely be needed again, or memorization. An example of this type of caching is cache, a program that caches the output of the copmpilation to speed up the second time compilation. Introduction to the I-CacheWhen the C5510 CPU requests instructions, it requests 32 bits at a time. To initiate an instruction fetch, the CPU sends a fetch request and a fetch address to the I-Cache. If the I-Cache is enabled, it handles the fetch request as follows: If the requested word is in one of the data arrays (a hit), the I-Cache delivers the word to the CPU. If the requested word is not in the I-Cache (a miss), the I-Cache uses the external memory interface (EMIF) to fetch the 4-word external memory block that contains the requested word. As soon as the requested word arrives in the ICache, it is delivered to the CPU. Timing information for I-Cache hits and misses can be found in section 5 on page 22.If the I-Cache is disabled, the data arrays are not checked. Instead, the fetch request and fetch address are passed to the EMIF. Once fetched by the EMIF, the requested 32-bit word is passed directly to the CPU. 1-Way Cache:

10

The 2-way cache has two memory banks. Each memory bank has the same parts: Data array. Each data array contains 512 lines (0 through 511) that the ICache can fill one by one in response to misses in the 2-way cache. Line valid (LV) bit array. Each line has a line valid bit. Once a line has been loaded, its line valid bit is set. Whenever the I-Cache is flushed, all 512 line valid bits are cleared, invalidating all the lines. For more information on flushing the I-Cache. Tag array. Each line has a tag field. When the I-Cache receives a 24-bit fetch address from the CPU, the I-Cache interprets bits 23-13 as a tag. When a line gets filled, the associated tag is stored in the tag field for that line. Introduction to the I-CacheInstruction Cache 12 Across the two memory banks, every two lines with the same number belong to one set. For example, line 0 of memory bank 1 and line 0 of memory bank 2 belong to set 0. When the I-Cache receives a fetch address, the I-Cache finds the set number in bits 12-4. If the I-Cache must replace one of the lines in the set, it uses a least-recently used (LRU) algorithm: The line replaced is the one that has been unused for the longest time. Each set has an LRU bit that is toggled to indicate which line should be replaced. 2-Way Cache: Data LV Tag LRU Tag LV Data Memory bank 1 Memory bank 2 Line 0 Line 1 Line 254 Line 255 Line 510 Line 511 Line 0 Line 1 Line 254 Line 255 Line 510
11

Line 511 Set 0 Set 1 Set 254 Set 255 Set 510 Set 511 ... ... ... ... ... ...

12

13

iCache: The First Biometric Digital Wallet

How long have we been hearing about the digital wallet? I know Bill gates was a mere millionaire back when I heard him spinning yarns, and yet here today is the iCache. Billed as the first biometric digital wallet, this handy little device holds your credit card data, frequent flier numbers, and virtually anything with an magnetic strip on it for safe keeping. Your finger unlocks the card. We Say: If you can brave the beyond-annoying flash-based website, youll see a pretty cool idea coming to life at last. No word on pricing and the product is still in testing, but were intrigued. Too bad it looks like an iPodyou know some test user will demand music and photo capability and well all be waiting another three years for this. Share and Enjoy: These icons link to social book marking sites where readers can share and discover new web pages.

14

RAM Sets 1 and 2: RAM set 1 and RAM set 2 each have the following parts: Data array. Each data array contains 256 lines (0 through 255). Line valid (LV) bit array. Each line has a line valid bit. When a line has been loaded, its line valid bit is set. Whenever the I-Cache is flushed, all 256 line valid bits are cleared, invalidating all the lines. For more information on flushing the I-Cache. Tag field. The RAM set has one 12-bit tag field that indicates which range of external memory addresses are mapped to the RAM set. To select a tag for RAM set n (1 or 2), write to RAM-set tag register n. When you write to the tag register, the I-Cache immediately fills the RAM set with all the 32-bit words in the address range specified by the tag. As each line is loaded, the associated line valid bit is set. Introduction to the I-Cache13 Instruction Cache Tag valid (TV) bit. The RAM set has one tag valid bit. Just before filling the RAM set, the I-Cache clears the tag valid bit. When the filling is complete, the I-Cache sets the tag valid bit. For RAM set n (1 or 2), the tag valid bit is reflected in RAM-set control register n. Note: Once the I-Cache begins to fill a RAM set, the I-Cache will not service CPU instruction-fetch requests until the RAM-set fill operation is complete. For optimal code performance, it is recommended that instruction fetches not be made from external memory during RAM-set fill operations. RAM Sets 1 and 2 Data LV Tag TV Line 0 Line 1 Line 254 Line 255
15

. . . Data LV Tag TV Line 0 Line 1 Line 254 Line 255 .RAM set 2 RAM set 1 I-Cache Operation: Instruction cache 14 When the C5510 CPU requests instructions, it requests 32 bits at a time. With each request, the CPU sends a fetch address that indicates where to read the 32 bit requested word. When a fetch request arrives, the I-Cache performs an instruction presence check; that is, it determines whether the requested word is available in the 2-way cache and/or any RAM sets included in your I-Cache configuration. Because the 2-way cache and RAM-set architectures are different, the I-Cache interprets the fetch address differently when searching the 2-way cache and when searching the RAM-set. Whenever possible, the I-Cache gets the requested word from a RAM set. If the requested word is in a RAM set but not in the 2-way cache, the word is fetched from the RAM set and the 2-way cache is not loaded with that word. Bit Field Description 23-13 Tag Whenever a line of the 2-way cache is loaded from external memory, the tag portion of the fetch address is stored with the line (in the tag array). During an instruction presence check, the I-Cache uses the Index field to find the addressed set and then compares both tags in the set with the tag portion of the fetch address. 12-4 Index This 9-bit value references one of the 512 sets of the 2-way cache. I-Cache Operation15 Instruction Cache Fetch Address Field Descriptions for the 2-Way Cache
16

Bit Description Field: 3-2 Offset When the I-Cache must read a 32-bit word from one of the lines of the 2-way cache, the offset field indicates which of the four 32-bit words in the line should be read. 1-0 Byte This field is not used by the I-Cache but is the part of the fetch address that indicates the specific byte being addressed. Fetch Address Fields for a RAM Set 23 12 11 4 3 2 1 0 Tag Index (line) Offset Byte 12 bits 8 bits 2 bits 2 bits Bit Field Description: 23-12 Tag During an instruction presence check, the I-Cache compares the tag portion of the fetch address with the tag defined in the RAM-se11-4 Index This 8-bit value references one of the 256 lines of the RAM set. 3-2 Offset When the I-Cache must read a 32-bit wordfonofthe lines of the RAM set, the offset field indicates which of the four 32-bit words in the line should be read. iCache Eliminates a Fat Wallet It's impossible to see everything at CES, but I managed to spot the iCache over at Popgadget, which says this little device is a "mini storage for all your credit cards." It's funny I found this because this weekend I bought a new wallet and realized I carry way too many credit cards. I have a card for Borders, Blockbuster, Costco, Safeway, and others stores I wish I didn't need. A concept device predicted to be a big hit among the ladies made its way into CES, and only a few people spotted it. It's called iCache, and it promises to eliminate the bulge from your wallet. The digital wallet consists of a universal card and an iPod-like device that holds all of your credit card information. To activate the iCache digital wallet, you'll have to register all your credit cards to an online account. Anytime you need a particular card, you simply select it from the list on your iCache, insert the universal card to temporarily load it with the data you need, then use it at the store of your choice. The information remains on the card for ten minutes before it's completely wiped out.
17

In terms of security, all the information on the iCache is encrypted, and the device itself requires a scan of your fingerprint before it can work. While most of us may not be ready to use this type of technology, do expect it to become available in the next two years, that is if financial institutions warm up to the idea at all. What do you think of this emerging technology? Guys, would you carry an iCache in your back pocket? The next time you go shopping for a wallet, think about how nice it would be to replace all of that plastic your carrying with one card. Thats precisely what iCache will allow you to do, once its released. The iCache keeps a copy of every card you have programmed into the device and will program the dynamic magnetic strip when you select that particular card.

How It Works: The iCache will most likely be available first through banks, then through retail locations. After you receive your iCache, you have to register your finger print and the cards you would like to have programmed into the device by plugging it into your PC through a USB cable. The software will prompt you for your card numbers and expiration dates which will then be saved on the iCache device itself. Then, when you reach the cash register, place your finger on the print scanner, navigate to the card to want to use and activate it. The magnetic strip will be programmed for that card, will eject out of the iCache card holder and can then be used to swipe at the terminal. Here is a video I found that goes into a little more detail in to how this thing works. Security: You cant talk about the iCache without highlighting the aspect of security. If you lose the device, it will be no good considering you need to use your finger print before it can be activated. According to iCache, if the device is tampered with in anyway, the data will be permanently deleted. If you choose to store all of your credit card data online with iCache, you can simply plug in your iCache device into your PC and the data will automatically be restored. Notice how I said choose. Many people are worried that they have to store their card data online with iCache which is not the case. It will just be more of a pain to restore the data if you choose not to have them store your data.
18

Which Cards Can You Use: iCache states that technically, you should be able to program any card that contains a mag stripe. That is, any card with a black magnetic strip that would use to swipe at pay terminals. This includes credit cards, debit cards, prepaid cards, ATM cards, loyalty cards, gas cards, give cards and gift certificate cards. It will be awesome to finally take all of those LOYALTY cards off of my keyring and put them into this device. When Will It Be Available: iCache is aiming towards a soft release towards the 2nd quarter of 2008 with a more widespread release near the end of the year. There is no word yet on how much this device will cost but if its $100.00 or cheaper, Im going to pick one up.

Instruction Presence Check and the Corresponding I-Cache ResponseWhen a fetch request arrives, the I-Cache performs an instruction presence check to determine whether the 32-bit requested word is available in the ICache. During the instruction presence check, the I-Cache performs these two operations on both the 2-way cache and the RAM sets:
19

1) Compares the tag portion of the fetch address with the tag in the data array at the location referenced by the Index portion of the fetch address. 2) Checks the line valid bit at the referenced location, to determine whether the line associated with the tag is valid. I-Cache OperationInstruction Cache 16 If the tag comparison fails and/or the line valid bit is 0, this qualifies as a miss. If the instruction presence check finds a tag match and the line valid bit is 1, this qualifies as a hit. Whenever a line in the I-Cache must be loaded from external memory (cases 1, 2, and 5), the I-Cache uses the line load process. Case 2-Way Cache RAM Sets Presence I-Cache Response 1 Miss Miss (no tag match) False 2-way cache line loaded from external memory, requested 32-bit word delivered to CPU 2 Miss Miss but tag match False RAM set line loaded from external memory, requested 32-bit word delivered to CPU 3 Miss Hit True Requested 32-bit word taken directly from RAM set; 2-way cache line not loaded 4 Hit Miss (no tag match) True Requested 32-bit word taken directly from 2-way cache 5 Hit Miss but tag match True RAM set line loaded from external memory, requested 32-bit word delivered to CPU 6 Hit Hit True Requested 32-bit word taken directly from RAM set
20

Line Load Process: When an instruction presence check results in a fetch from the external memory, the 4-word external memory block that contains the requested word is fetched and loaded into a line in the I-Cache. The I-Cache uses the external memory interface (EMIF) to fetch the 4-word block that contains the requested word. These four 32-bit words are written to the line in the I-Cache one word at a time. The I-Cache delivers the requested word to the CPU as soon as the word arrives in the data array, even if the rest of the line is still being loaded. When the entire line is loaded in the data array, the corresponding tag is written to the tag array and the line valid bit is set to validate the line. I-Cache Operation17 Instruction Cache Flow Chart of the Line Load Process I-Cache must load 2-way cache line or RAM set line Command EMIF to read four 32-bit words from external memory Is word received ? No Write word to line Yes it the requested word Is ? Yes Deliver word to
21

I unit of CPU load done ? Line No Wait for next word No

Process Diagram

INCLUSION FILTER The inclusion filter enables detection of instructions in the processor pipeline, mainly to support self-modifying code (SMC). To reduce design impact on this timing-sensitive area of the processor pipeline, detection techniques are architecturally minimized to provide pessimistic estimates. The goal is a
22

logically optimized solution in which false SMC detection is sufficiently uncommon such that the resulting performance loss is negligible. Motivation As the processor pipeline capacity increases, the inclusion detection solution needed to be re-examined. The pipeline capacity includes instructions in all stages and structures between instruction fetch and retirement, which increased substantially when the issue width was increased for the Merom architecture from 3 to 4. The increased instruction capacity resulted in increased false SMC detection conditions during Merom silicon testing, which tended to have a more limiting impact on server performance. For example, Transaction Processing Performance Council Benchmark C performance increased by 2 percent with inclusion checking disabled. The Inclusion Filter in the processor significantly reduces false SMC detection by using an alternative technique to filter from the existing detection mechanism the most common false detection scenarios. Solution Most instructions in the pipeline will also naturally exist in the Instruction Cache (ICache), so the Inclusion Filter monitors ICache activity to algorithmically identify states in which this common property is guaranteed ((Figure 9) ). Snoops in this state can then be filtered from the existing inclusion-detection mechanism, and this combination virtually eliminates false SMC detection.

Figure 9: The Inclusion Filter reduces false SMC detection


23

To increase confidence in this new micro architectural solution, it was essential to minimize design complexity. To reduce logic risk and validation requirements, the Inclusion Filter has a single functional output. To avoid the risk of frequency degradation, it is logically separated from the existing ICache structure (which is ideal for separating logic vs. process-related debug) and uses only non-timing-critical signals, such as the ICache LRU bits. For example, a property of the iCache pseudo-LRU algorithm is that for an Xway configuration, an accessed entry will not be evicted until at least log 2 (x) different entries in the same set have been accessed. Therefore, for an 8-ay cache, each set is allowed to filter at least three iCache evictions prior to resorting to the existing inclusion detection mechanism. Determining a different entry can be accomplished without additional storage by detecting a change of LRU bits. Monitoring changes to specific LRU bits and other control logic can increase the limit substantially using other similar properties. When the Inclusion Filter is saturated and finally allows the existing mechanism to be used, it is more beneficial to have it return to its reset state than to continue filtering. From a reset, the average cycles needed for the Inclusion Filter to resort to the existing inclusion mechanism is 50 times greater than the cycles needed to ensure that a fetched instruction is no longer in the pipeline. Therefore, when the Inclusion Filter is finally saturated, it takes the opportunity to completely reset its state, but it disables filtering until it is certain that all instructions in the pipeline at the time of this reset have been retired. In effect, a small window is opened during which the existing detection method is used, then it is closed for a very long time (98 percent closed on average). This translates to a 98 percent reduction in false SMC detection, and near optimal performance. CPU Bits for Controlling the I-Cache: Control of the I-Cache is maintained not only through the I-Cache registers but also through three bits located in status register ST3_55 of the CPU. CAEN Bit to Enable or Disable the I-Cache

24

To enable the I-Cache, set the cache enable (CAEN) bit of ST3_55. To disable the I-Cache, clear the CAEN bit. When disabled, the lines of the ICache data arrays are not checked; instead, the I-Cache forwards instructionfetch requests directly to the external memory interface (EMIF). For proper operation of the I-Cache, configure the I-Cache before enabling it and disable the I-Cache before making any changes to its configuration. A DSP reset forces CAEN = 0 (I-Cache disabled). CACLR Bit to Flush the I-Cache The flush operation is defined as the invalidation of all of the lines in the ICache. To flush the I-Cache, write 1 to the cache clear (CACLR) bit of ST3_55. In response, all the line valid bits of the 2-way cache and of the RAM sets are cleared. In addition, the tag valid bit of each RAM set is cleared. The CACLR bit remains 1 until the flush process is complete, at which time CACLR is automatically reset to 0. A DSP reset forces CACLR = 0 (no flush in process). CPU Bits for Controlling the I-Cache 19 Instruction Cache CAFRZ Bit to Freeze the Contents of the I-Cache When you write 1 to the cache freeze (CAFRZ) bit of ST3_55, the contents of the I-Cache are locked. Instruction words that were cached prior to the freeze are still accessible in the case of an I-Cache hit, but the data arrays are not updated in response to an I-Cache miss. To re-enable updates, write 0 to CAFRZ. A DSP reset forces CAFRZ = 0 (I-Cache not frozen). Note: When the I-Cache is frozen (CAFRZ = 1), each I-Cache miss still causes a 4word (16-byte) fetch cycle in the EMIF. It is recommended that you profile your code to minimize the number of misses during an I-Cache freeze.s Configuring and Enabling the I-Cache
25

Configuring and Enabling the I-Cache This section gives the procedures for preparing and enabling the I-Cache for the three I-Cache configurations: _ 2-way cache and no RAM sets _ 2-way cache and one RAM set _ 2-way cache and two RAM sets The cache enable (CAEN) bit that is used to enable and disable the I-Cache Notes: 1) Write to the control registers (ICGC, ICWC, ICRC1, and ICRC2) only when the I-Cache is disabled (CAEN = 0 in ST3_55). 2) Write to the RAM-set tag registers (ICRTAG1 and ICRTAG2) only when the I-Cache is enabled (after making CAEN = 1 in ST3_55, wait for IEN = 1 in ICST). To configure with 2-way cache and no RAM sets: 1) Write to the appropriate control registers: _ Write CBFFh to ICGC to indicate no RAM sets. _ Write 000Fh to ICWC to initialize the logic for the 2-way cache. 2) Set the cache enable bit (CAEN) bit of CPU status register ST3_55 to send an enable request to the I-Cache. 3) Poll the I-Cache-enabled (IEN) bit of ICST until IEN = 1. (The I-Cache is not instantaneously enabled.) To configure with 2-way cache and one RAM set: 1) Write to the appropriate control registers: _ Write CE1Fh to ICGC to indicate one RAM set. _ Write 000Fh to ICWC to initialize the logic for the 2-way cache. _ Write 000Fh to ICRC1 to initialize the logic for RAM set 1. 2) Set the cache enable bit (CAEN) bit of CPU status register ST3_55 to send an enable request to the I-Cache. 3) Poll the I-Cache-enabled (IEN) bit of ICST until IEN = 1. (The I-Cache is not instantaneously enabled.) Configuring and Enabling the I-Cache
26

21 Instruction 4) Write the desired tag to ICRTAG1. When you write to the tag register, the tag is used to immediately fill RAM set 1 from external memory. While the ICache is enabled, you can write to the tag register at any time to change the RAM-set address range. Each time you load the tag register, RAM set 1 is immediately filled from the selected address range. 5) To monitor the RAM-set filling, poll the tag-valid bit: When R1TVALID = 1 in ICRC1, the I-Cache has finished filling RAM set 1. To configure with 2-way cache and two RAM sets: 1) Write to the appropriate control registers: _ Write CFFFh to ICGC to indicate two RAM sets. _ Write 000Fh to ICWC to initialize the logic for the 2-way cache. _ Write 000Fh to ICRC1 to initialize the logic for RAM set 1. _ Write 000Fh to ICRC2 to initialize the logic for RAM set 2. 2) Set the cache enable bit (CAEN) bit of CPU status register ST3_55 to sendan enable request to the I-Cache. 3) Poll the I-Cache-enabled (IEN) bit of ICST until IEN = 1, indicating that theI-Cache is enabled. (The I-Cache is not instantaneously enabled.) 4) Write to the RAM-set tag registers: _ Write the desired tag to ICRTAG1. When you write to the tag register, the tag is used to immediately fill RAM set 1 from external memory. _ Write the desired tag to ICRTAG2. When you write to the tag register, the tag is used to immediately fill RAM set 2 from external memory. While the ICache is enabled, you can write to a tag register at any time to change the address range as necessary. Each time you load a tag register, the corresponding RAM set is immediately filled from the selected address range. 5) To monitor the RAM-set filling, poll the tag-valid bits: _ When R1TVALID = 1 in ICRC1, the I-Cache is done filling RAM set 1. _ When R2TVALID = 1 in ICRC2, the I-Cache is done filling RAM set 2.

Timing Considerations Instruction Cache 22


27

Timing Considerations: As the I-Cache fetches and returns 32-bit words requested by the CPU, two key time periods affect the speed of the I-Cache: _ Hit time _ Miss penalty Hit Time The hit time is the time required for the I-Cache to deliver the 32-bit requested word to the CPU in the case of a hit (when the word is present in the I-Cache). The hit time is either 1 or 2 CPU clock cycles: _ An initial request (a request that follows a period of inactivity) has a hit time of 2 cycles. _ Subsequent requests have a hit time of 1 cycle if: _ The requests are consecutive (no inactivity in between) and _ The requests are to sequential addresses _ Subsequent requests have a hit time of 2 cycles if: _ The requests are not consecutive or _ The requests are to non sequential addresses Timing Considerations 23 Instruction Cache Miss Penalty The miss penalty is the time required for the I-Cache to deliver the 32-bit requested word to the CPU in the case of a miss (when the word must be fetched from external memory). In response to a miss, the I-Cache requests four words from the external memory interface (EMIF) to load the appropriate line. The miss penalty due to an initial request to the EMIF is: 1) Four cycles for the I-Cache to receive the fetch request, detect an I-Cache miss, and forward the fetch request to the EMIF. 2) X cycles for the EMIF to get the requested word to the I-Cache, where X depends on factors such as: _ The initial access latency of the type of external memory that is used _ The position of the requested word in the I-Cache line. For example, if the requested word is the third word of the line, two words are fetched
28

before the requested word. _ Whether the four words are fetched in a burst access (if synchronous memory is used) 3) Three cycles for the I-Cache to get the requested 32-bit word to the instruction fetch unit (I unit) of the CPU. Subsequent requests can incur a smaller miss penalty if the external memory is synchronous. After accessing the first word from synchronous memory, the EMIF can return each of the remaining words in a single cyc PowerPC Architecture Instruction Cache (I-Cache)

The PowerPC 405 accesses memory through the Instruction Cache Unit (ICU) and Data Cache Unit (DCU). Each cache unit includes a Processor Local Bus (PLB) Interface, Cache Arrays, and a Cache Controller. Hits into the instruction cache and data cache appear to the CPU as single-cycle memory accesses The PowerPC 405 implements separate instruction cache and data cache arrays. Each is 16 KB in size, two-way set-associative, and operates using 8-word (32 byte) cache lines. The caches are non-blocking to allow the PowerPC 405 to overlap instruction execution with reads over the PLB. The cache controllers replace cache lines according to a least-recently used (LRU) replacement policy. When a cache line fill occurs, the most-recently accessed line in the cache set is retained and the other line is replaced. The cache controller updates the LRU during a cache line fill. The ICU supplies up to two instructions every cycle to the Fetch and Decode unit. The ICU can also forward instructions to the Fetch and Decode unit during a cache line fill, minimizing execution stalls caused by instruction-cache misses. When the ICU is accessed, four instructions are read from the appropriate cache line and placed temporarily in a line buffer. Subsequent ICU accesses check this line buffer for the requested instruction prior to accessing the cache array. This allows the ICU cache array to be accessed as little as once every four instructions, significantly reducing ICU power consumption. Power, Emulation, and Reset Considerations Instruction Cache 24 Power, Emulation, and Reset Considerations
29

Idle Mode for Reducing Power Consumed If you want to temporarily halt the I-Cache to reduce power, you can place it in its idle mode: 1) Select the idle mode for the I-Cache by making CACHEI = 1 in the idle configuration register (ICR) of the DSP. 2) Execute the IDLE instruction. When the I-Cache is in its idle mode or is disabled, instruction-fetch requests are handled by the external memory interface (EMIF). To wake the I-Cache from its idle mode: 1) Deselect the idle mode by making CACHEI = 0 in ICR. 2) Execute the IDLE instruction. Emulator Access The software emulator can read the contents of the I-Cache during the debug mode. The contents of the I-Cache are not modified by emulator read operations. Effect of Setting a Software Breakpoint During emulation, If you set or remove a software breakpoint at an instruction, the corresponding line in the I-Cache is automatically invalidated. Reconfiguration Required After a DSP Reset After a DSP reset, the I-Cache is not automatically reconfigured for use. Make sure that your initialization code configures the I-Cache I-Cache Registers 25 Instruction Cache I-Cache Registers: Control of the I-Cache is maintained through a set of registers within the ICache. These registers are accessible at addresses in the I/O space of the DSP.
30

I-Cache Global Control Register (ICGC) The I-Cache supports one 2-way cache and zero, one, or two RAM sets. Before enabling the I-Cache, use the global control register (ICGC) to select the desired RAM-set mode. Do not write other values to this register. Legend: R = Read; W = Write; -x = Value after reset is not defined I-Cache Registers Instruction Cache 26 I-Cache Global Control Register (ICGC) Bits Bit Field Value Description 15-0 RMODE RAM-set mode bits CBFFh No RAM sets CE1Fh 1 RAM set. Only RAM set 1 is available. CFFFh 2 RAM sets. RAM set 1 and RAM set 2 are available. I-Cache Way Control Register (ICWC) You must initialize the logic for the 2-way cache before you enable the ICache. To perform the initialization, write 000Fh to the way control register (ICWC). Do not write any value other than 0Fh to the WINIT field of ICWC. I-Cache Way Control Register (ICWC) 15 5 4 0 Reserved WINIT R-0 R/W-x Legend: R = Read; W = Write; -n = Value after reset; -x = Value after reset is not defined I-Cache Way Control Register (ICWC) Bits Bit Field Value Description 15-5 Reserved These read-only bits are not used.
31

4-0 WINIT 0Fh Way initialization bits. Make WINIT = 0Fh to initialize the logic for the 2-way cache. I-Cache Registers 27 Instruction Cache I-Cache RAM Set Control Registers (ICRC1 and ICRC2) Each RAM set control register contains an initialization field and a tag-valid bit. Initialization field (RxINIT). If you have selected one or two RAM sets with the global control register, you must initialize the logic for RAM set 1 before you enable the I-Cache. If you have selected two RAM sets with the global control register, you must also initialize the logic for RAM set 2. To perform the initialization for each RAM set, write 000Fh to its RAM set control register. Do not write any value other than 11b to R1INIT and R2INIT. Tag-valid bit (RxTVALID). When the I-Cache completes the process of filling RAM set 1, the I-Cache sets R1TVALID. You can poll this bit to determine when the RAM set is ready. I-Cache RAM Set Control Registers (ICRC1 and ICRC2) ICRC1 15 14 2 1 0 R1TVALID Reserved R1INIT R-0 R-0003h R/W-x ICRC2 15 14 2 1 0 R2TVALID Reserved R2INIT R-0 R-0003h R/W-x Legend: R = Read; W = Write; -n = Value after reset; -x = Value after reset is not defined I-Cache RAM Set 1 Control Register (ICRC1) Bits Bit Field Value Description 15 R1TVALID RAM set 1 tag-valid bit. Check this bit to determine when the ICache has completed the process of filling RAM set 1. 0 The fill is not started or is not complete. 1 The fill is complete. 14-2 Reserved These read-only bits are not used.
32

1-0 R1INIT 11b RAM set 1 initialization bits. Make R1INIT = 11b to initialize the logic for the RAM set 1. I-Cache Registers Instruction Cache 28 I-Cache RAM Set 2 Control Register (ICRC2) Bits Bit Field Value Description 15 R2TVALID RAM set 2 tag-valid bit. When the I-Cache completes the process of filling RAM set 2, the I-Cache sets R2TVALID. 0 The fill is not started or is not complete. 1 The fill is complete. 14-2 Reserved These read-only bits are not used. 1-0 R2INIT 11b RAM set 2 initialization bits. Make R2INIT = 11b to initialize the logic for the RAM set 2. I-Cache RAM Set Tag Registers (ICRTAG1 and ICRTAG2): For each active RAM set (selected with the global control register), you must give the I-Cache a 12-bit tag that defines the range of addresses assigned to that RAM set. Load the tag into the appropriate RAM set tag register. Write a value with zeros in bits 15-12 and the tag in bits 11-0. I-Cache RAM Set Tag Registers (ICRTAG1 and ICRTAG2) ICRTAG1 15 0 R1TAG R/W-0 ICRTAG2 15 0 R2TAG R/W-0 Legend: R = Read; W = Write; -n = Value after reset I-Cache RAM Set 1 Tag Register (ICRTAG1) Bits Bit Field Value Description
33

15-0 R1TAG 0000h-0FFFh RAM set 1 tag bits. Write a value with zeros in bits 15-12 and the tag in bits 11-0. This register is only applicable if you have selected one or two RAM sets with the global control register. I-Cache Registers Instruction cache 29 I-Cache RAM Set 2 Tag Register (ICRTAG2) Bits Bit Field Value Description 15-0 R2TAG 0000h-0FFFh RAM set 2 tag bits. Write a value with zeros in bits 15-12 and the tag in bits 11-0. This register is only applicable if you have selected two RAM sets with the global control register. I-Cache Status Register (ICST) The status register contains the IEN bit to indicate when the I-Cache is enabled. When you send an enable request to the I-Cache (CAEN = 1 in the CPU status register ST3_55), poll for IEN = 1 before writing to either of the RAM set tag registers. I-Cache Status Register (ICST) Bits Bit Field Value Description 15-3 Reserved These read-only bits are not used. 2 IEN I-Cache-enabled bit. When you send an enable request to the I-Cache, poll for IEN = 1 before writing to either of the RAM set tag registers. 0 The I-Cache is disabled. 1 The I-Cache is enabled. 1-0 Reserved These read-only bits are not used.

34

CONCLUSION The instruction cache (I-Cache) can improve the overall system performance by buffering the most recent instructions accessed by the CPU and to reduse a lot of bulk to carry around in your wallet that's exactly what the makers of iCache have in mind. We have been exlplained how iCache works. The iCache will most likely be available first through banks, then through retail locations. After you receive your iCache, you have to register your finger print and the cards you would like to have programmed into the device by plugging it into your PC through a USB cable. The software will prompt you for your card numbers and expiration dates which will then be saved on the iCache device itself. Then, when you reach the cash register, place your finger on the print scanner, navigate to the card to want to use and activate it. We cant talk about the iCache without highlighting the aspect of security. In terms of security, all the information on the iCache is encrypted, and the device itself requires a scan of your fingerprint before it can work. The PowerPC 405 accesses memory through the Instruction Cache Unit (ICU) and Data Cache Unit (DCU). We show that, contrary to popular belief, strong cache consistency can be maintained for the Web with little or no extra cost than the current weak consistency approaches, and it should be maintained using an invalidation-based protocol.

35

REFERENCES

1. http:/www.cl.cam.ac.uk/~jgd1000/

2. http://en.wikipedia.org/wiki/Iris_recoginition

3. http://www.google.com

36

37

Vous aimerez peut-être aussi