
POWER-EFFICIENT TCAMS FOR BURSTY ACCESS PATTERNS
TERNARY CONTENT ADDRESSABLE MEMORIES (TCAMS) HAVE BECOME A POPULAR HARDWARE DEVICE FOR FAST ROUTING-TABLE LOOKUPS. HOWEVER, THE HIGH POWER CONSUMPTION IN TCAMS INCREASES POWER SUPPLY AND COOLING COSTS, AND LIMITS THE ROUTER DESIGN TO FEWER PORTS. BASED ON AN ANALYSIS OF THE PREFIX HIERARCHY IN ROUTING TABLES, THE AUTHORS PROPOSE AN OPTIMIZATION MODEL TO REDUCE THE POWER CONSUMPTION FOR BURSTY ACCESS PATTERNS. THIS MODEL CAN REDUCE THE AVERAGE POWER CONSUMPTION BY A FACTOR OF 11.
Weidong Wu, Jian Shi, Ling Zuo, and Bingxin Shi
Huazhong University of Science and Technology
The inherent parallelism of ternary content addressable memory (TCAM) is attractive for routing-table lookup. Each cell in a TCAM can take one of three logic states: 0, 1, or don't-care. Don't-cares act as wildcards during a search. The TCAM compares the destination address of an incoming packet with all the prefixes in parallel. Several prefixes can match the destination address; priority encoder logic then selects the longest matching prefix. In contrast, conventional ASIC-based designs require multiple memory accesses for a single route lookup. Therefore, the TCAM-based solution is much faster than ASIC-based solutions for packet forwarding.

Despite these advantages, router vendors have been slow to adopt TCAM devices in packet forwarding engines; high power consumption is a main reason. Current high-density TCAM devices consume as much as 12 to 15 W each when all the entries are enabled for search. Moreover, a single line card in a router might require multiple TCAMs to handle route lookup on large forwarding tables. This high power consumption affects costs in two ways. First, it increases power supply and cooling costs. Second, it reduces port density, since higher power consumption implies that designers can pack fewer ports into the same space (or router rack) because of cooling constraints. Therefore, it is important to minimize the power budget for TCAM-based forwarding engines to make them economically viable.

Several TCAM vendors, such as IDT (http://www.idt.com/products/), provide mechanisms for searching only a part of the TCAM device to reduce power consumption during a lookup operation. The parts of the table differ in access frequency. Searches access certain prefixes more than other prefixes, and the access frequencies remain relatively stable over time [1] (as with Google and Yahoo, for example). Searches also access a few prefixes frequently for a while, and these prefixes become hot for short periods of time. Power consumption relates directly to access frequency.
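The ternary matching and longest-prefix selection described at the start of this section can be sketched in software. The sketch below is only an illustration, not the hardware mechanism: it assumes prefixes are given as bit strings, pads them with a hypothetical wildcard character `x`, and scans the entries one after another where a real TCAM compares them all in parallel. The names `ternary_match`, `prefix_to_pattern`, and `tcam_lookup` are made up for this example.

```python
def ternary_match(pattern: str, key: str) -> bool:
    """Return True if every non-wildcard cell of the pattern equals the key bit."""
    return len(pattern) == len(key) and all(
        p in ("x", k) for p, k in zip(pattern, key)
    )


def prefix_to_pattern(prefix: str, width: int) -> str:
    """Pad a prefix with don't-care cells up to the key width."""
    return prefix + "x" * (width - len(prefix))


def tcam_lookup(entries, key):
    """Return the next hop of the longest matching prefix, or None.

    entries is a list of (prefix_bits, next_hop) pairs. The max() call
    stands in for the priority encoder.
    """
    width = len(key)
    matches = [
        (len(prefix), hop)
        for prefix, hop in entries
        if ternary_match(prefix_to_pattern(prefix, width), key)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0])[1]


# Toy 8-bit table; prefixes and next-hop labels are illustrative only.
entries = [("0", "A"), ("010", "B"), ("01001", "C"), ("101", "D")]
print(tcam_lookup(entries, "01001111"))  # matches 0, 010, and 01001
```

In this toy table, the key `01001111` matches three entries, and the longest one (`01001`) determines the result.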
Published by the IEEE Computer Society
0272-1732/05/$20.00 © 2005 IEEE
IEEE Micro, July-August 2005

In this article, we focus on the problem of making TCAM-based forwarding engines more power efficient based on the access frequency. The main contributions are as follows. We analyze the relation between prefixes in a routing table and propose a concept for the routing table structure; construct the forwarding table for a TCAM based on level-partitioning techniques, and propose a power optimization model based on the prescient access frequency; and design a dynamic algorithm based on the bursty access frequency to stabilize the power consumption.

Related works

Recently, researchers have proposed a few approaches for reducing TCAM power consumption, including routing-table compaction and partitioning techniques.

Liu presents a technique to eliminate redundancies in the routing table [2]. However, this technique takes excessive time for updates, and all prefixes are still matched against the destination address of an incoming packet in parallel. The power savings is only 42.7 to 48 percent.

Panigrahy and Sharma partition the set of prefixes in the routing table into small groups, and each route lookup uses only one group; the others remain inactive [3]. They introduce the concept of paged TCAM to achieve significantly lower power consumption. However, these results depend on the distribution of traffic among IP addresses. If the distribution changes, this method must reconstruct the forwarding table.

Zane et al. propose bit-selection and trie-based architectures that result in a power budget of under 1.2 W [4], which is in the same range as that of SRAM-based ASIC designs [1]. But both architectures in practice require recomputation for route updates.

Ravikumar and Mahapatra propose a two-level pipelined architecture that reduces power consumption through prefix compaction and partitioning [5]. The size of the largest page defines the worst-case power consumption. So, if the largest page has a high access frequency, power consumption increases quickly. This architecture is unsuitable for bursty access patterns.

Current solutions mainly use the two-level architecture, shown in Figure 1, to reduce power consumption in TCAM. We call the first-level TCAM the index TCAM and the second-level TCAM the sub-TCAM. The key to reducing the power consumption is the routing table construction in this two-level architecture. The main component of power consumption in TCAM is proportional to the number of searched entries (prefixes). So, for simplicity, this article uses this number to represent power consumption. For a route lookup, all prefixes in the index TCAM and the prefixes in one bucket of the sub-TCAM are searched. Our basic idea is to keep the prefixes in the index TCAM as few as possible and to give the prefixes in the sub-TCAM the lowest possible access frequencies.

Figure 1. Two-level TCAM architecture. (Labels: destination IP address, index TCAM, sub-TCAM, searched blocks, next hop.)

Bursty access patterns

Internet traffic has received extensive analysis for several years, revealing traffic characteristics. The distribution of packets per destination IP address has a heavy, Pareto-like tail, whereby some flows contain vastly more packets than others [6]. There are two fundamentally important access patterns [7]:

Independently skewed access patterns. Certain prefixes are accessed more often than others, and the access frequencies remain relatively stable over time.

Bursty access patterns. A few prefixes become hot for short periods of time and are accessed more frequently. Several studies have documented such patterns in various applications [8].

Researchers have proposed software solutions for the bursty access pattern. Cheung and McCanne analyzed the access times and the sizes of the processor's hierarchical memory structure, and proposed a theoretical approach to minimizing the average lookup time [9]. Gupta et al. proposed an efficient data structure to minimize the average lookup time while keeping the worst-case lookup time within a fixed bound [1]. However, it requires periodic rebuilding of the data structure to reflect changes in access patterns; this is inefficient for bursty environments. Ergun et al. introduce a dynamic data structure, called the biased skip list (BSL), to exploit biases in the access pattern, which tend to change dynamically [7]. The data structure has a self-update mechanism that reflects changes in the access patterns efficiently and immediately, without any need for rebuilding. It improves throughput while preserving the worst-case lookup time. Sahni and Kim extend the BSL structure to find longest matching prefixes as well as to insert and delete prefixes in O(log n) expected time, where n is the number of prefixes in the routing table [10].
Hierarchy of prefixes

We usually represent an IP address as a string of 32 bits, $D = t_1 t_2 t_3 \ldots t_{32}$ ($t_i = 0$ or $1$, $i = 1, 2, 3, \ldots, 32$). An IP address is cut into two parts, $D = PH$ ($P = t_1 t_2 t_3 \ldots t_l$, $H = t_{l+1} t_{l+2} t_{l+3} \ldots t_{32}$). The first part, $P$, identifies the network on which the host resides and is called the prefix. The second part, $H$, identifies the particular host on the given network. The prefixes are stored in routers. $P = t_1 t_2 t_3 \ldots t_l$ represents the host address range from $D_1$ to $D_2$, where $D_1 = t_1 t_2 t_3 \ldots t_l x$ and $x$ is a string of $32 - l$ zeros, and $D_2 = t_1 t_2 t_3 \ldots t_l y$ and $y$ is a string of $32 - l$ ones. The number of IP addresses in this range is $2^{32-l}$. Prefix $P$ is thus a set of host addresses. If $P$ matches IP address $D$, we write $D \in P$.

Proposition 1

Consider two prefixes, $P_1$ and $P_2$, and IP address $D$. If $D \in P_1$ and $D \in P_2$, then $D \in P_2 \subseteq P_1$ (or $D \in P_1 \subseteq P_2$).

Proof: Consider two prefixes, $P_1 = t_1 t_2 t_3 \ldots t_m$ and $P_2 = s_1 s_2 s_3 \ldots s_n$, for $1 \le m \le n \le 32$. If $D = d_1 d_2 d_3 \ldots d_{32} \in P_1$, then $d_i = t_i$ ($1 \le i \le m$). If $D = d_1 d_2 d_3 \ldots d_{32} \in P_2$, then $d_j = s_j$ ($1 \le j \le n$). Based on these relationships,

$$t_i = s_i \quad (1 \le i \le m). \qquad (1)$$

Any $D_1 \in P_2$ has the form $D_1 = s_1 s_2 s_3 \ldots s_m s_{m+1} \ldots s_n x_{n+1} x_{n+2} \ldots x_{32}$. From Equation 1, $D_1 = t_1 t_2 t_3 \ldots t_m s_{m+1} \ldots s_n x_{n+1} x_{n+2} \ldots x_{32} \in P_1$, so $P_2$ is a subset of $P_1$, and $D \in P_2 \subseteq P_1$.

From Proposition 1, any two prefixes, $P_1$ and $P_2$, are either disjoint or one is a subset of the other. We can categorize the prefixes in a routing table according to this inclusion relation.

Definitions

We use the following definitions for prefixes in routing tables: stand-alone prefixes have no subsets or supersets; subroot prefixes have at least one subset and no supersets; and more-specific prefixes have at least one superset. If we represent the prefixes in a routing table as a trie data structure, every prefix has a corresponding node in the trie. If n prefixes are on the path from the root to prefix P, we say P is in level n. For example, given prefixes 0/1, 101/3, 0110/4, 010/3, and 01001/5 in a routing table, the trie structure appears in Figure 2a. Of these, 101/3 is a stand-alone prefix, 0/1 is a subroot prefix, and 010/3, 0110/4, and 01001/5 are more-specific prefixes. Prefixes 0/1 and 101/3 are in level 0, 010/3 and 0110/4 are in level 1, and 01001/5 is in level 2, as Figure 2b shows. To compute the prefix levels, we first sort the prefixes in ascending order and then compare pairs of prefixes. If $P_{i+1} \subset P_i$, $P_{i+1}$ is in the level after that of $P_i$. Figure 3 shows the level-partitioning algorithm.

Let T be the number of prefixes in a routing table. The level-partitioning algorithm's complexity is O(TW), where W is the length of the IP address, because each IP address has at most W more-specific prefixes.

We analyze six routing tables (ftp.routeviews.org/bgpdata). In Table 1, stand-alone, subroot, and more-specific prefixes make up about 40, 5, and 55 percent of the prefixes, respectively. The maximum level is 5. The number of prefixes in level 1 is about 10 times that in level 2; the number of prefixes in level 2 is about 10 times that in level 3; and so on. About 98 percent of the more-specific prefixes are in level 1 and level 2. When we search for an IP address in a routing table with the longest-prefix matching algorithm, the maximum searching depth is 5, and about 98 percent of searching depths are no more than 2.
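The definitions and the level computation above can be illustrated with a small sketch. It assumes prefixes are unique bit strings and uses a stack of enclosing prefixes instead of the nested loops of Figure 3, so it is a variant of the level-partitioning idea rather than the authors' algorithm. The function name `level_partition` and the category labels are chosen for this example.

```python
def level_partition(prefixes):
    """Return (levels, kinds) for a collection of unique bit-string prefixes.

    Sorting bit strings lexicographically places every prefix directly
    before all of its subsets, so a stack of enclosing prefixes is enough
    to count how many prefixes lie on the path from the root.
    """
    levels, kinds, chain = {}, {}, []
    for p in sorted(prefixes):
        # Drop prefixes from the chain that do not enclose p.
        while chain and not p.startswith(chain[-1]):
            chain.pop()
        levels[p] = len(chain)
        if chain:
            kinds[p] = "more-specific"
            kinds[chain[0]] = "subroot"  # the level-0 ancestor has a subset
        else:
            kinds[p] = "stand-alone"  # may become "subroot" later
        chain.append(p)
    return levels, kinds


# The example routing table from the text: 0/1, 101/3, 0110/4, 010/3, 01001/5.
levels, kinds = level_partition(["0", "101", "0110", "010", "01001"])
for prefix in sorted(levels):
    print(prefix, levels[prefix], kinds[prefix])
```

For the example table, this reproduces the classification given in the text: 0/1 is a subroot prefix in level 0, 101/3 is stand-alone, and 01001/5 is a more-specific prefix in level 2.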
Figure 2. Example trie (a) and level partitioning (b) for a routing table.

    P[ ];   // prefixes
    L[ ];   // the level of each prefix
    For (i from 0 to T-1)    // T is the number of prefixes
        L[i] = 0;            // set initial value
    Sort(P[ ]);              // sort prefixes in ascending order
    For (i from 0 to T-2) {
        j = i + 1;
        While (j < T AND P[j] is a subset of P[i]) {   // test the bound before reading P[j]
            L[j] = L[i] + 1;
            j++;
        }
    }

Figure 3. Level-partitioning algorithm.

Our approach

For our approach, we provide the following definitions:

T is the number of prefixes in a routing table.
N is the number of stand-alone prefixes.
$AP_i$ is a stand-alone prefix, $1 \le i \le N$.
$Af_i$ is the match frequency of prefix $AP_i$.
M is the number of subroot prefixes.
$SP_j$ is a subroot prefix, $1 \le j \le M$.
$m_j$ is the number of more-specific prefixes of $SP_j$.
$Sf_j$ is the match frequency of prefix $SP_j$.
R is the power reduction factor, R = T/(the number of searched prefixes for a lookup).

Static architecture

The prefixes in level 0 go into the index TCAM with the next-hop and index fields. Because any two prefixes in the index TCAM have no intersection, there is at most one search result for any IP address, and the prefixes can be stored in any order. A new level-0 prefix can go into any free space in the index TCAM. The sub-TCAM is partitioned into buckets. All more-specific prefixes of a subroot prefix are stored in one bucket in chain-ancestor ordering [11].
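As a rough illustration of how the static forwarding table could be laid out, the sketch below builds an index table and a list of buckets from a routing table given as a dictionary of bit-string prefixes. It is an assumption-laden model, not the authors' implementation: the name `build_static_table`, the use of a bucket number as the index field, and the longest-first order inside each bucket (a simple stand-in for chain-ancestor ordering) are all choices made for this example.

```python
def build_static_table(routes):
    """Split routes into an index table and sub-TCAM buckets.

    routes maps a bit-string prefix to its next hop. The result is
    (index_tcam, buckets), where index_tcam maps each level-0 prefix to
    (next_hop, bucket_number or None) and each bucket lists the
    more-specific prefixes of one subroot prefix, longest first.
    """
    index_tcam = {}
    buckets = []
    root = None  # current level-0 prefix
    # Lexicographic order places each level-0 prefix right before its subsets.
    for p in sorted(routes):
        if root is not None and p.startswith(root):
            hop, bucket_number = index_tcam[root]
            if bucket_number is None:
                # First subset seen: the level-0 prefix is a subroot prefix.
                bucket_number = len(buckets)
                buckets.append([])
                index_tcam[root] = (hop, bucket_number)
            buckets[bucket_number].append((p, routes[p]))
        else:
            root = p
            index_tcam[p] = (routes[p], None)
    for bucket in buckets:
        bucket.sort(key=lambda entry: len(entry[0]), reverse=True)
    return index_tcam, buckets


# Example table from the text with made-up next-hop labels.
routes = {"0": "A", "101": "B", "0110": "C", "010": "D", "01001": "E"}
index_tcam, buckets = build_static_table(routes)
print(index_tcam)
print(buckets)
```

With the example prefixes, the index table holds only 0/1 and 101/3, and the single bucket holds the three more-specific prefixes of 0/1.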
Table 1. Prefix distribution.

                     No. of        No. of     No. of more-specific prefixes
Routing table no.    stand-alone   subroot    Level 1   Level 2   Level 3   Level 4   Level 5   Total
19971108             13,751        963        10,443    1,369     139       4         0         11,955
19981101             26,970        2,208      19,736    4,786     308       18        2         24,850
19991031             29,596        3,170      31,775    6,563     487       17        3         38,845
20001101             35,262        4,543      45,616    7,034     635       20        1         53,306
20011101             43,580        5,823      48,758    9,201     1,042     114       2         59,117
20021101             49,431        6,675      54,570    9,071     945       46        3         64,635
Figure 4. Static architecture. (Index-TCAM entries hold prefix, next-hop, and index fields; sub-TCAM entries hold prefix and next-hop fields and feed a priority encoder; an index of Null marks a prefix without a bucket.)

Figure 4 shows the static architecture. Any destination IP address D is matched against all prefixes in the index TCAM in parallel. If no prefix matches D, the packet with D cannot be forwarded. If a prefix P matches D and the index field of P is Null, the packet is forwarded to P's next hop. If a prefix P matches D and the index field of P is not Null, the search compares D with P's more-specific prefixes in the sub-TCAM, and the priority encoder gives the longest matching prefix and its next hop. Figure 5 shows the algorithm, which has complexity O(1).

    D;              // the destination IP address
    P[ ].prefix;    // the prefix
    P[ ].nexthop;   // the next hop of prefix P in the index TCAM
    P[ ].index;     // the pointer to a bucket in the sub-TCAM
    P[i] = Search(D) in index TCAM;
    If (no prefix matches D) Return Null;      // the packet cannot be forwarded
    Next_hop = P[i].nexthop;
    If (P[i].index == Null) Return Next_hop;
    Search(D) in the sub-TCAM bucket that P[i].index points to;
    If (a prefix in the bucket matches D)
        Next_hop = Priority-Encoder( );        // find the longest matching prefix
    Return Next_hop;

Figure 5. Search algorithm in the static architecture.

Proposition 2

In the static architecture,

$$T = N + M + \sum_{j=1}^{M} m_j .$$

So the minimum number of searched prefixes is $T_{min} = N + M$; the maximum number of searched prefixes is $T_{max} = N + M + \max(m_j)$, where $1 \le j \le M$. The average number of searched prefixes is

$$T_{average} = N + M + \sum_{j=1}^{M} Sf_j m_j .$$

Table 2. Power consumption for static architecture: number of searched prefixes, with the power reduction factor R in parentheses.

Routing table no.    T          Tmin (R)        Tmax (R)        Taverage (R)
971108               26,669     14,714 (1.8)    15,554 (1.7)    14,716 (1.8)
981101               54,028     29,177 (1.8)    29,809 (1.8)    29,185 (1.8)
991031               71,611     32,766 (2.1)    34,524 (2.1)    32,784 (2.2)
001101               93,111     39,804 (2.3)    42,678 (2.2)    39,831 (2.3)
011101               108,520    49,402 (2.2)    50,626 (2.1)    49,410 (2.2)
021101               120,741    56,106 (2.1)    57,120 (2.1)    56,113 (2.2)

We analyze real routing tables and trace data with 1 million packets (ftp.routeviews.org/bgpdata) and show the results in Table 2. Tmin equals the number of prefixes in level 0. This data does not show much difference between Tmax and Taverage. The power reduction factor R is about 2. From Table 1, the number of stand-alone prefixes in the index TCAM is about 90 percent of Tmin, 85 percent of Tmax, and 95 percent of Taverage. We next describe a scheme for reducing the number of stand-alone prefixes in the index TCAM.
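The counts in Proposition 2 are simple enough to check by hand or in a few lines of code. The sketch below is illustrative: `static_counts` is a name invented here, the small input lists are toy values, and the only real data used are the 1997 row totals from Table 1 (13,751 stand-alone, 963 subroot, and 11,955 more-specific prefixes).

```python
def static_counts(N, M, m, Sf):
    """Return (T, T_min, T_max, T_average) as defined in Proposition 2.

    m[j] is the number of more-specific prefixes of subroot prefix j and
    Sf[j] is that subroot prefix's match frequency.
    """
    T = N + M + sum(m)
    t_min = N + M
    t_max = N + M + max(m)
    t_average = N + M + sum(f * mj for f, mj in zip(Sf, m))
    return T, t_min, t_max, t_average


# Toy input: two stand-alone prefixes and two subroot prefixes.
print(static_counts(2, 2, [3, 1], [0.5, 0.25]))

# 1997 routing table, using the totals from Table 1.
T_1997 = 13751 + 963 + 11955      # total number of prefixes
T_MIN_1997 = 13751 + 963          # all level-0 prefixes
print(T_1997, T_MIN_1997, round(T_1997 / T_MIN_1997, 1))
```

The 1997 totals give T = 26,669 and Tmin = 14,714, which agree with the first row of Table 2, and a reduction factor of about 1.8.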
Dynamic architecture

The key to reducing the power consumption is to reduce the number of stand-alone prefixes in the index TCAM. We propose adding a stand-alone bucket to the sub-TCAM and moving some stand-alone prefixes from the index TCAM into this bucket. Figure 6 shows the dynamic architecture that supports this scheme. The search algorithm is the same as the search algorithm in the static architecture, with one modification: if there is no search result in the index TCAM for destination IP address D, the search now compares D with all the prefixes in the stand-alone bucket. The power consumption follows from Proposition 3.

Figure 6. Dynamic architecture. (The index TCAM and sub-TCAM of Figure 4, with a stand-alone bucket added to the sub-TCAM.)

Proposition 3

In the dynamic architecture, let n be the number of stand-alone prefixes in the index TCAM, so that $N - n$ is the number of prefixes in the stand-alone bucket. The minimum number of searched prefixes is then $T_{min} = n + M$; the maximum number is $T_{max} = n + M + \max[N - n, \max(m_j)]$, where $1 \le j \le M$. The average number of searched prefixes is

$$T_{average} = n + M + \sum_{j=1}^{M} Sf_j m_j + (N - n) \sum_{i=n+1}^{N} Af_i .$$

The value of $T_{min}$ increases with n. If $N - n < \max(m_j)$, $T_{max} = n + M + \max(m_j)$, where $1 \le j \le M$. If $N - n > \max(m_j)$, $T_{max} = M + N$. From Table 1, $T_{max}$ is about half the number of all prefixes in the routing table.

We can group some prefixes in the stand-alone bucket to reduce the value of $T_{max}$. Given prefixes $AP_i, AP_{i+1}, AP_{i+2}, \ldots, AP_{i+k}$ in the stand-alone bucket, assume prefix AP is a subroot prefix of them and has no intersection with any prefix in the index TCAM. We can store AP in the index TCAM, and $AP_i, AP_{i+1}, AP_{i+2}, \ldots, AP_{i+k}$ move into a sub-TCAM bucket. Thus, $T_{min}$ increases by 1, and $T_{max}$ decreases by $k - 1$. $T_{average}$ varies with n and the access frequency of the prefixes in a bucket. Now, we have the following problems: How many stand-alone prefixes should be in the index TCAM? And which stand-alone prefixes should they be?

For independently skewed access patterns, the search accesses certain stand-alone prefixes more frequently for long periods of time. We can store some of them in the index TCAM and identify a number of stand-alone prefixes, $n_0$, that minimizes the average number of searched prefixes.

Proposition 4

Let stand-alone prefixes $\{AP_i\}$ be in descending order of access frequencies $\{Af_i\}$. The average number of searched prefixes is

$$F(n) = n + M + \sum_{j=1}^{M} Sf_j m_j + (N - n) \sum_{i=n+1}^{N} Af_i . \qquad (2)$$

Then, there is $n_0$ such that $F(n_0) = \min[F(n)]$, where $0 \le n \le N$.

To prove this, we use Equation 2 to write the following:

$$F(n+1) - F(n) = 1 - (N - n) Af_{n+1} - \sum_{i=n+1}^{N} Af_i .$$

$F(n+1) - F(n)$ is a monotonically increasing function of n. Because $Af_N = \min(Af_i) < 1/2$, where $1 \le i \le N$, $F(N) - F(N-1) = 1 - 2Af_N > 0$.

If we look at the case where $n = 0$, we have

$$F(1) - F(0) = 1 - N Af_1 - \sum_{i=1}^{N} Af_i .$$

If $F(1) - F(0) > 0$, then $F(n+1) > F(n)$, and

$$F(0) = \min_{0 \le n \le N} F(n) = M + \sum_{j=1}^{M} Sf_j m_j + N \sum_{i=1}^{N} Af_i .$$

If $F(1) - F(0) < 0$, then $F(n_0) = \min[F(n)]$, where $0 < n_0 < N$.

From Proposition 4, we can store the stand-alone prefixes with higher access frequency in the index TCAM to minimize $T_{average}$; Figure 7 shows the optimization algorithm. Its complexity is O(N), because it computes $T_{average}$ at most N times, where N is the number of stand-alone prefixes.

    AP[ ];  // stand-alone prefixes
    Af[ ];  // access frequency
    N;      // number of stand-alone prefixes
    F0 = M + sum over j = 1..M of Sf[j] * m[j];   // part of Taverage that does not depend on n
    Sort AP[ ] in descending order of the access frequency;
    S = sum over i = 1..N of Af[i];   // total frequency of the prefixes in the stand-alone bucket
    n = 0; F = F0 + N * S;            // F(0)
    While (n < N) {
        F1 = F0 + (n + 1) + (N - n - 1) * (S - Af[n + 1]);   // to compute Taverage = F(n + 1)
        If (F1 >= F) break;           // stop as soon as Taverage no longer decreases
        F = F1; S = S - Af[n + 1]; n++;
    }
    AP[1], AP[2], ..., AP[n] are in index TCAM;

Figure 7. Optimization algorithm for independently skewed access patterns.

We used a real trace that contained 1 million Transmission Control Protocol packets that had flowed between Lawrence Berkeley Labs and the Internet. Table 3 shows the power consumption for this traffic. The power reduction factors of Tmin and Taverage are more than 17 and 10, respectively.

Table 3. Power consumption for the dynamic architecture: number of searched prefixes, with the power reduction factor R in parentheses.

Routing table no.    T          Tmin (R)        Tmax (R)        Taverage (R)
971108               26,669     1,124 (23.7)    14,714 (1.8)    5,637.8 (4.1)
981101               54,028     2,542 (21.3)    29,177 (1.8)    4,824.4 (11.2)
991031               71,611     3,454 (20.7)    32,766 (2.2)    6,013.3 (11.9)
001101               93,111     4,832 (19.3)    39,804 (2.3)    7,912.7 (11.8)
011101               108,520    6,126 (17.7)    49,402 (2.2)    9,870.8 (11.0)
021101               120,741    6,991 (17.2)    56,106 (2.2)    11,233.1 (10.7)

Bursty access patterns have a few stand-alone prefixes that the search accesses more frequently for short periods of time. We can move stand-alone prefixes between the index TCAM and the stand-alone bucket to reduce the average power consumption. Assume stand-alone prefix $AP_i$ in the stand-alone bucket is accessed more frequently than stand-alone prefix $AP_j$ in the index TCAM ($Af_i > Af_j$). If we move $AP_i$ into the index TCAM and $AP_j$ into the stand-alone bucket, the values of $T_{min}$ and $T_{max}$ do not change because the number of prefixes in the index TCAM does not change. But $T_{average}$ can change.

Proposition 5

Let n be the number of stand-alone prefixes $\{AP_i\}$ in the index TCAM. The average number of searched prefixes is

$$F(n) = n + M + \sum_{j=1}^{M} Sf_j m_j + (N - n) \sum_{i=n+1}^{N} Af_i .$$

If $AP_k$ is a stand-alone prefix in the sub-TCAM, $AP_n$ is a prefix in the index TCAM, and $Af_k > Af_n$, we move $AP_n$ into the stand-alone bucket and $AP_k$ into the index TCAM. Then, the average number of searched prefixes is

$$F'(n) = n + M + \sum_{j=1}^{M} Sf_j m_j + (N - n) \Bigl( \sum_{i=n+1,\, i \ne k}^{N} Af_i + Af_n \Bigr) .$$

Therefore, $F(n) - F'(n) = (N - n)(Af_k - Af_n) > 0$.

From Proposition 5, for the bursty access pattern, we can exchange prefixes between the index TCAM and the stand-alone bucket to keep the average power consumption stable. Figure 8 shows the optimization algorithm for the bursty access pattern.
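Equation 2 and the swap argument of Proposition 5 can be checked numerically on a toy input. The sketch below is a brute-force illustration, not the O(N) procedure of Figure 7: it evaluates the average for every n and picks the smallest. The function names and the toy frequencies (chosen as binary fractions so that the floating-point arithmetic is exact) are assumptions made for this example.

```python
def average_searched(index_set, M, Sf, m, Af):
    """Average number of searched prefixes for a given placement.

    index_set holds the positions (0-based) of the stand-alone prefixes kept
    in the index TCAM; all other stand-alone prefixes sit in the stand-alone
    bucket. With the n most frequent prefixes in index_set, this is F(n).
    """
    N, n = len(Af), len(index_set)
    bucket_frequency = sum(f for i, f in enumerate(Af) if i not in index_set)
    subroot_term = sum(f * mj for f, mj in zip(Sf, m))
    return n + M + subroot_term + (N - n) * bucket_frequency


def F(n, M, Sf, m, Af):
    """Equation 2: keep the n most frequently accessed prefixes in the index TCAM."""
    order = sorted(range(len(Af)), key=lambda i: Af[i], reverse=True)
    return average_searched(set(order[:n]), M, Sf, m, Af)


def best_n(M, Sf, m, Af):
    """Brute-force search for the n that minimizes F(n)."""
    return min(range(len(Af) + 1), key=lambda n: F(n, M, Sf, m, Af))


# Toy input: four stand-alone prefixes and one subroot prefix with two subsets.
Af = [0.5, 0.25, 0.125, 0.0625]
M, Sf, m = 1, [0.0625], [2]
print([F(n, M, Sf, m, Af) for n in range(len(Af) + 1)])
print(best_n(M, Sf, m, Af))

# Proposition 5: swapping a cold index prefix (position 2) for a hot bucket
# prefix (position 0) lowers the average by (N - n) * (Af_k - Af_n).
before = average_searched({2}, M, Sf, m, Af)
after = average_searched({0}, M, Sf, m, Af)
print(before - after)
```

For this toy input the minimum is at n = 1, and the swap lowers the average by 3 × (0.5 − 0.125) = 1.125, in line with Proposition 5.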
Route update

Because there is no intersection between the subroot prefixes and the stand-alone prefixes, all prefixes in the index TCAM and the stand-alone bucket can be stored in any order. If the new prefix is a stand-alone prefix, we can insert it into free space in the index TCAM or in the stand-alone bucket in the sub-TCAM. If the new prefix is a subroot prefix, we can insert it into free space in the index TCAM, and its child prefix in the index TCAM moves into the bucket that stores its more-specific prefixes. If the new prefix is a more-specific one, it should go into the bucket to which its subroot prefix points. Because there might be more than one matching prefix for any destination IP address, all prefixes in each bucket should be in chain-ancestor ordering [11] so that the longest matching prefix is selected. Shah and Gupta proposed a fast updating algorithm, CAO_OPT [11]. If adding prefixes causes a bucket to overflow, all prefixes in the bucket can be moved into a larger bucket.
    AP[ ];  // stand-alone prefixes in index TCAM
    Af[ ];  // access frequency of AP[ ]
    n;      // number of stand-alone prefixes in index TCAM
    BP[ ];  // stand-alone prefixes in sub-TCAM
    Bf[ ];  // access frequency of BP[ ]
    Sort AP[ ] and BP[ ] in descending order of their access frequency;
    k = 1;
    While (k <= N - n AND Bf[k] > Af[n]) {   // stop at the end of BP[ ]
        Move BP[k] into index TCAM;
        Move AP[n] into sub-TCAM;
        Sort(AP[ ]);
        k++;
    }

Figure 8. Optimization algorithm for bursty access patterns.
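The exchange step of Figure 8 can be sketched as follows. This is a simplified model under stated assumptions: recent access frequencies are kept in two dictionaries, the hottest bucket prefix and the coldest index prefix are found by scanning rather than by re-sorting, and the function name `rebalance` is invented for this example. It does not model the TCAM writes that a real move would require.

```python
def rebalance(index_freq, bucket_freq):
    """Swap hot bucket prefixes with cold index prefixes, in place.

    index_freq and bucket_freq map stand-alone prefixes to their recent
    access frequencies. The number of prefixes on each side is unchanged,
    so T_min and T_max stay the same. Returns the list of swaps as
    (moved_into_index, moved_into_bucket) pairs.
    """
    moves = []
    while index_freq and bucket_freq:
        hot = max(bucket_freq, key=bucket_freq.get)
        cold = min(index_freq, key=index_freq.get)
        if bucket_freq[hot] <= index_freq[cold]:
            break  # no remaining swap would lower the average
        index_freq[hot] = bucket_freq.pop(hot)
        bucket_freq[cold] = index_freq.pop(cold)
        moves.append((hot, cold))
    return moves


# Toy frequencies; the prefix names are placeholders.
index_freq = {"a": 0.05, "b": 0.3}
bucket_freq = {"c": 0.4, "d": 0.1, "e": 0.01}
print(rebalance(index_freq, bucket_freq))
print(sorted(index_freq), sorted(bucket_freq))
```

Each swap strictly increases the total frequency held in the index side, so the loop ends after a finite number of moves.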
TCAM and cache

Replacing the index TCAM with a cache would also reduce the power consumption. However, a cache needs a complex data structure, such as a hash table, to store the prefixes. In contrast, the index TCAM stores all prefixes without any auxiliary data structure, and updating a route is very simple.
Implementation issues
The level-partitioning algorithm must maintain the parent pointers between prefixes. The optimization algorithms for independently skewed and bursty access patterns also need the access frequencies. To accomplish these two tasks, we introduce an auxiliary array with the following fields: P, the prefix in the routing table; T, a pointer to its parent; L, its location in TCAM; and f, its access frequency.
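One possible shape for such an auxiliary array is sketched below. The field names spell out P, T, L, and f; the record layout, the use of a list index as the parent pointer, the hit counter standing in for the access frequency, and the helper `level_of` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrefixRecord:
    prefix: str             # P: the prefix, as a bit string
    parent: Optional[int]   # T: position of the parent record, or None at level 0
    location: str           # L: where the prefix is stored in TCAM
    hits: int = 0           # f: access counter used to estimate the frequency


def level_of(records, i):
    """Follow parent pointers to count the prefixes above record i."""
    level = 0
    while records[i].parent is not None:
        i = records[i].parent
        level += 1
    return level


# Three records from the example table; locations are placeholder labels.
records = [
    PrefixRecord("0", None, "index"),
    PrefixRecord("010", 0, "bucket 0"),
    PrefixRecord("01001", 1, "bucket 0"),
]
records[2].hits += 1
print(level_of(records, 2), records[2].hits)
```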
The level-partitioning algorithm and the optimization algorithm for independently skewed access patterns are offline; they have no impact on the search algorithm. The optimization algorithm for bursty access patterns is online. If many prefixes must be moved between the index TCAM and the sub-TCAM, the search algorithm might be suspended. This is the deficiency of our approach.
Our architecture partitions prefixes in a routing table into three categories: stand-alone, subroot, and more-specific. This partitioning works with our proposed TCAM-based forwarding engine for bursty access patterns. This engine has a two-level architecture, consisting of an index TCAM and a sub-TCAM. The engine handles subroot prefixes in the index TCAM and more-specific prefixes in the sub-TCAM. Stand-alone prefixes can be in the index TCAM or the sub-TCAM, depending on their access frequencies. We discuss two fundamentally important access patterns [7] in Internet traffic: independently skewed and bursty. For the independently skewed access pattern, we propose an optimization algorithm to reduce power consumption. The algorithm can reduce the average power consumption by a factor of 11. For the bursty access pattern, the access frequencies of some prefixes in the routing table are very high for short periods of time; we propose a dynamic algorithm to move the hot prefixes in the sub-TCAM into the index TCAM. The algorithm also moves lower-access-frequency prefixes in the index TCAM into the sub-TCAM. This makes the average power consumption relatively stable. A lot of prefix movement between the two TCAMs might have an impact on the search algorithm. Our future work will investigate more-efficient schemes.

References
1. P. Gupta, B. Prabhakar, and S. Boyd, "Near-Optimal Routing Lookups with Bounded Worst Case Performance," Proc. 19th Ann. IEEE Infocom (Infocom '00), IEEE Press, 2000, pp. 1,184-1,192.
2. H. Liu, "Routing Table Compaction in Ternary CAM," IEEE Micro, vol. 22, no. 1, Jan.-Feb. 2002, pp. 58-64.
3. R. Panigrahy and S. Sharma, "Reducing TCAM Power Consumption and Increasing Throughput," Proc. 10th Symp. High-Performance Interconnects (HOTI '02), IEEE CS Press, 2002, pp. 107-112.
4. F. Zane, G. Narlikar, and A. Basu, "CoolCAMs: Power-Efficient TCAMs for Forwarding Engines," Proc. 22nd Ann. IEEE Infocom (Infocom '03), IEEE Press, 2003, pp. 42-52.
5. V.C. Ravikumar and R.N. Mahapatra, "TCAM Architecture for IP Lookup Using Prefixes Properties," IEEE Micro, vol. 24, no. 2, Mar.-Apr. 2004, pp. 60-69.
6. E. Kohler et al., "Observed Structure of Addresses in IP Traffic," Proc. 2nd Ann. ACM SIGCOMM Workshop on Internet Measurement, ACM Press, 2002, pp. 253-266.
7. F. Ergun et al., "A Dynamic Lookup Scheme for Bursty Access Patterns," Proc. 20th Ann. IEEE Infocom (Infocom '01), IEEE Press, pp. 1,444-1,453.
8. S. Lin and N. McKeown, "A Simulation Study of IP Switching," Proc. ACM SIGCOMM '97, ACM Press, 1997, pp. 15-24.
9. G. Cheung and S. McCanne, "Optimal Routing Table Design for IP Address Lookups Under Memory Constraints," Proc. 16th Ann. IEEE Infocom (Infocom '99), IEEE Press, 1999, pp. 1,437-1,444.
10. S. Sahni and K. Kim, "Efficient Dynamic Lookup for Bursty Access Patterns," Int'l J. Foundations of Computer Science, vol. 15, no. 4, Aug. 2004, pp. 567-592.
11. D. Shah and P. Gupta, "Fast Updating Algorithms for TCAMs," IEEE Micro, vol. 21, no. 1, Jan.-Feb. 2001, pp. 36-47.
Weidong Wu is a PhD candidate in the Department of Electronics and Information Engineering at Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China. His research interests include algorithms to improve Internet router performance and network management. Wu has a BS in mathematics from Central China Normal University. He is a member of the China Mathematics Association.

Jian Shi is an associate professor in the Department of Electronics and Information Engineering at Huazhong University of Science and Technology. His research interests include quality of service (QoS) routing protocols and optimization in wireless/mobile networks, distributed network management and security technologies, and wireless sensor network architectures and protocols. Shi has a PhD in electronic science and technology from Huazhong University of Science and Technology. He is a member of the China Communication Association.

Ling Zuo is an associate professor in the Department of Electronics and Information Engineering at Huazhong University of Science and Technology. Her research interests include QoS guarantees in wireless and mobile networks, network traffic modeling and performance analysis, and flow and congestion control mechanisms. Zuo has a PhD in electronic science and technology from Huazhong University of Science and Technology. She is a member of the China Communication Association.

Bingxin Shi is a professor in the Department of Electronics and Information Engineering at Huazhong University of Science and Technology. His research interests include computer network management, network simulation and performance analysis, mobile IP, and network performance monitoring. Shi has a BS from Huazhong University of Science and Technology. He is a member of the Expert Group of China Education and Research Network.
Direct questions and comments about this article to Weidong Wu, Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan, 430074, P. R. China; wwdtylwt@public.wh.hb.cn.
For further information on this or any other computing topic, visit our Digital Library at http://www.computer.org/publications/dlib.
