
Friday, January 2, 2009

Intro to Caching, Caching Algorithms and Caching Frameworks, Part 1


Introduction:
Many of us have heard the word cache, and when you ask people about caching they give you a perfect answer, but
they don't know how a cache is built, or on which criteria to favor one caching framework over another. In
this article we are going to talk about caching, caching algorithms and caching frameworks, and how they compare
with each other.
The Interview:
"Caching is a temp location where I store data in (data that I need it frequently) as the original data is expensive
to be fetched, so I can retrieve it faster. "
That is what programmer 1 answered in the interview (one month earlier he had submitted his resume to a company that
wanted a Java programmer with strong experience in caching, caching frameworks and extensive data
manipulation).
Programmer 1 had written his own cache implementation using a hashtable, and that was all he knew about
caching. His hashtable contained about 150 entries, which he considered extensive data (caching = hashtable: load
the lookups into a hashtable and everything will be fine, nothing else). So let's see how the interview went.
Interviewer: Nice, and based on what criteria do you choose your caching solution?
Programmer 1: Huh, (thinking for 5 minutes), mmm, based on, on, on the data (coughing)
Interviewer: Excuse me! Could you repeat what you just said?
Programmer 1: Data?!
Interviewer: Oh, I see. OK, list some caching algorithms and tell me which is used for what.
Programmer 1: (staring at the interviewer and making strange expressions with his face, expressions that no one
knew a human face could make :D )
Interviewer: OK, let me ask it another way: how will a cache behave when it reaches its capacity?
Programmer 1: Capacity? Mmm (thinking: a hashtable is not limited by capacity, I can add what I want and it will
extend its capacity) (that was in programmer 1's mind; he didn't say it)
The interviewer thanked programmer 1 (the interview lasted only 10 minutes). After that a woman came and
said: "Oh, thanks for your time, we will call you back. Have a nice day."
This was the worst interview programmer 1 had ever had (he hadn't read the part of the job description which
stated that the candidate should have a strong caching background; in fact he only saw the line talking about
an excellent package ;) )
Talk the talk and then walk the walk
After programmer 1 left, he wanted to know what the interviewer had been talking about and what the answers to
his questions were, so he started to surf the net. Programmer 1 didn't know anything about caching except: "when I
need a cache I will use a hashtable."
Using his favorite search engine he found a nice caching article and started to read.
Why do we need cache?
A long time ago, before the age of caching, a user would request an object and this object was fetched from a storage place.
As the objects grew bigger and bigger, the user had to wait longer for each request to be fulfilled. It also made the
storage place suffer a lot, because it had to work the whole time. This made both the user and the database
angry, and there were two possible outcomes:
1- The user gets upset, complains and even stops using the application (that was almost always the case).
2- The storage place packs up its bags and leaves your application, which causes big problems (no place to
store data) (this happened in rare situations).
Caching is a godsend:
A few years later, researchers at IBM (in the 60s) introduced a new concept and named it "cache".
What is Cache?
A cache is a temporary location where I store data that I need frequently, because the original data is expensive
to fetch, so I can retrieve it faster.
A cache is made of a pool of entries. Each entry is a copy of real data that lives in storage (a database, for
example) and is tagged with a tag (key identifier) used for retrieval.
Great, programmer 1 already knew this, but what he didn't know was the caching terminology, which is as follows:
Cache Hit:
When the client invokes a request (let's say he wants to view product information) and our application gets the
request, it needs to access the product data in our storage (database), but it first checks the cache.
If an entry can be found with a tag matching that of the desired data (say, the product ID), the entry is used instead.
This is known as a cache hit (cache hits are the primary measurement of caching effectiveness; we will discuss
that later on).
The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache.
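To make the hit ratio concrete, here is a minimal sketch of a counter that a cache could update on every lookup. The class name CacheStats and its methods are made up for illustration; they do not come from any framework.

```java
/** Hypothetical helper (not from any library) that counts lookups and reports the hit ratio. */
class CacheStats {
    private long hits;
    private long misses;

    void recordHit()  { hits++; }
    void recordMiss() { misses++; }

    /** Fraction of lookups answered from the cache, or 0.0 before any lookup. */
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

With three hits and one miss recorded, hitRatio() returns 0.75, i.e. a 75% hit ratio.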
Cache Miss:
On the contrary, when the tag isn't found in the cache (no match was found), this is known as a cache miss. A request to
the back storage is made, the data is fetched and placed in the cache, so that future requests will find it
and result in a cache hit.
When we encounter a cache miss, one of two scenarios applies:
First scenario: there is free space in the cache (the cache hasn't reached its limit), so in this
case the object that caused the cache miss is retrieved from our storage and inserted into the cache.
Second scenario: there is no free space in the cache (the cache has reached its capacity), so the object that caused the cache
miss is fetched from the storage and then we have to decide which object in the cache to remove
in order to place the newly retrieved object. This is done by the replacement policy (caching
algorithm), which decides which entry will be removed to make room, and which is discussed below.
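The hit, miss and two miss scenarios described above can be sketched roughly as follows. This is only an illustration: ReadThroughCache is a hypothetical name, the back storage is represented by a plain function, and the "replacement policy" here just drops an arbitrary entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical read-through cache: on a miss it loads the value from the back storage. */
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final int capacity;
    private final Function<K, V> backStorage; // stands in for a database query

    ReadThroughCache(int capacity, Function<K, V> backStorage) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
        this.backStorage = backStorage;
    }

    V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            return value;                   // cache hit
        }
        value = backStorage.apply(key);     // cache miss: go to the back storage
        if (entries.size() >= capacity) {
            // Second scenario: no free space. A real replacement policy would
            // choose the victim; this sketch simply drops an arbitrary entry.
            K victim = entries.keySet().iterator().next();
            entries.remove(victim);
        }
        entries.put(key, value);            // first scenario, or after eviction
        return value;
    }

    int size() { return entries.size(); }
}
```

The interesting part is the line that picks the victim; the algorithms discussed later are different answers to exactly that question.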
Storage Cost:
When a cache miss occurs, the data is fetched from the back storage, loaded and placed in the cache. But how
much space does the data we just fetched take in the cache memory? This is known as the storage cost.
Retrieval Cost:
When we need to load the data, we need to know how long it takes to load it. This is known as
the retrieval cost.
Invalidation:
When an object that resides in the cache is updated in the back storage, the cached copy needs to be updated as well.
Keeping the cache up to date in this way is known as invalidation.
The entry is invalidated in the cache and fetched again from the back storage to get the updated version.
Replacement Policy:
When a cache miss happens, the cache ejects some other entry in order to make room for the previously uncached
data (in case we don't have enough room). The heuristic used to select the entry to eject is known as the
replacement policy.
Optimal Replacement Policy:
The theoretically optimal page replacement algorithm (also known as OPT or Belady's optimal page replacement
policy) tries to achieve the following: when a new object needs to be placed in the cache,
the cache algorithm should replace the entry which will not be used for the longest period of time.
For example, a cache entry that is not going to be used for the next 10 seconds will be replaced by an entry that
is going to be used within the next 2 seconds.
Thinking about the optimal replacement policy, we can say it is impossible to achieve in practice, because it requires
knowing future requests, but some algorithms come close to it using heuristics.
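Although the optimal policy can't be used in a live cache, it can be simulated when the whole request sequence is known in advance, which is how it serves as a yardstick. A minimal sketch, assuming integer keys and a recorded trace (BeladyPolicy and chooseVictim are illustrative names):

```java
import java.util.Set;

/** Sketch of Belady's choice over a recorded trace; names are illustrative. */
class BeladyPolicy {
    /**
     * Returns the cached key whose next use in the trace (from position 'from'
     * onward) lies farthest in the future. A key that is never used again
     * counts as farthest. Returns -1 if nothing is cached.
     */
    static int chooseVictim(Set<Integer> cached, int[] trace, int from) {
        int victim = -1;
        int farthest = -1;
        for (int key : cached) {
            int nextUse = Integer.MAX_VALUE;   // assume "never used again"
            for (int i = from; i < trace.length; i++) {
                if (trace[i] == key) {
                    nextUse = i;
                    break;
                }
            }
            if (nextUse > farthest) {
                farthest = nextUse;
                victim = key;
            }
        }
        return victim;
    }
}
```

For cached keys {1, 2, 3} and upcoming requests 1, 2, 1, 2, 3, the victim is 3, because it is the key needed last.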
So everything is based on heuristics. What, then, makes one algorithm better than another? And what do they use for
their heuristics?
Nightmare at Java Street:
While reading the article programmer 1 fell asleep and had a nightmare (the scariest nightmare one can ever have).
Programmer 1: Nihahha, I will invalidate you. (Talking in a mad way)
Cached Object: No, no, please let me live, they still need me, I have children.
Programmer 1: All cached entries say that before they are invalidated, and since when do you have children? Never
mind, now vanish forever.
"Buhaaahaha," laughed programmer 1 in a scary way. Silence took over the place for a few minutes, and then a
police siren broke the silence. The police caught programmer 1; he was accused of invalidating an entry that was
still needed by a cache client, and he was sent to jail.
Programmer 1 woke up really scared. He looked around and realized that it was just a dream,
then he continued reading about caching and tried to get rid of his fears.
Caching Algorithms:
No one can talk about caching algorithms better than the caching algorithms themselves
Least Frequently Used (LFU):
I am Least Frequently Used; I count how often an entry is needed by incrementing a counter associated with each
entry.
I remove the entry with the lowest counter first. I am not that fast and I am not that good at adapting
(adapting means keeping the entries that are really needed and discarding the ones that haven't been needed for
the longest period, based on the access pattern or, in other words, the request pattern).
Least Recently Used (LRU):
I am the Least Recently Used cache algorithm; I remove the least recently used items first, the ones that haven't been used
for the longest time.
I require keeping track of what was used when, which is expensive if one wants to make sure that I always
discard the least recently used item.
Web browsers use me for caching. New items are placed at the top of the cache. When the cache exceeds its
size limit, I discard items from the bottom. The trick is that whenever an item is accessed, I move it to the top,
so items which are frequently accessed tend to stay in the cache. There are two ways to implement me: either an
array or a linked list (which has the least recently used entry at the back and the most recently used at the front).
I am fast and I am adaptive; in other words, I can adapt to the data access pattern. I have a large family which
completes me, and they are even better than me (I do feel jealous sometimes, but it is OK). Some of my family
members are LRU2 and 2Q (they were implemented in order to improve LRU caching).
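In Java, a quick way to experiment with LRU behaviour is java.util.LinkedHashMap in access order with removeEldestEntry overridden. The sketch below is one possible shortcut, not the implementation discussed later in this series; the class name LruCache is made up here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** LRU sketch built on java.util.LinkedHashMap in access order. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // true = order entries by access, not by insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a" and then putting "c" evicts "b", because "a" was used more recently. Note that LinkedHashMap is not thread-safe, so a shared cache would need external synchronization.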
Least Recently Used 2(LRU2):
I am Least Recently Used 2; some people call me least recently used twice, which I like more. I add entries to
the cache the second time they are accessed (it takes two accesses to place an entry in the cache); when
the cache is full, I remove the entry whose second most recent access is the oldest. Because of the need to track the two
most recent accesses, access overhead increases with cache size, so applying me to a big cache
can be a problem. In addition, I have to keep track of some items that are not yet in the cache
(they haven't been requested twice yet). I am better than LRU and I am also adaptive to access patterns.
-Two Queues:
I am Two Queues; I add entries to an LRU cache as they are accessed. If an entry is accessed again, I move it
to a second, larger LRU cache.
I remove entries so as to keep the first cache at about 1/3 the size of the second. I provide the advantages of
LRU2 while keeping cache access overhead constant, rather than having it increase with cache size, which makes
me better than LRU2. Like the rest of my family, I am adaptive to access patterns.
Adaptive Replacement Cache (ARC):
I am Adaptive Replacement Cache; some people say that I balance between LRU and LFU to improve the combined
result. Well, that's not 100% true: actually I am made of 2 LRU lists. One list, say L1, contains entries that have
been seen only once recently, while the other list, say L2, contains entries that have been seen at least twice
recently.
The items that have been seen twice within a short time have a short inter-arrival time, and hence are thought of
as high-frequency. Hence, we think of L1 as capturing recency and L2 as capturing frequency, so most
people think I am a balance between LRU and LFU, but that is OK, I am not angry about that.
I am considered one of the best-performing replacement algorithms: a self-tuning, low-overhead
replacement cache. I also keep a history of evicted entries equal in size to the cache itself; this is to remember the
entries that were removed, and it allows me to see whether a removed entry should have stayed and we should have
chosen another one to remove (I really have a bad memory). And yes, I am fast and adaptive.
Most Recently Used (MRU):
I am Most Recently Used; in contrast to LRU, I remove the most recently used items first. You will ask me why, for
sure. Well, let me tell you something: when access is unpredictable, and determining the least recently used
entry in the cache is a high time complexity operation, I am the best choice. That's why.
I am common in database memory caches: whenever a cached record is used, I move it to the top of the
stack, and when there is no room, guess what? I replace the topmost entry
with the new entry.
First in First out (FIFO):
I am First In First Out; I am a low-overhead algorithm and I require little effort for managing the cache entries. The
idea is that I keep track of all the cache entries in a queue, with the most recent entry at the back and the
earliest entry at the front. When there is no room and an entry needs to be replaced, I remove the entry at
the front of the queue (the oldest entry) and replace it with the newly fetched entry. I am fast but I am not
adaptive.
-Second Chance:
Hello, I am Second Chance. I am a modified form of the FIFO replacement algorithm, known as the Second Chance
replacement algorithm, and I do better than FIFO at little extra cost. I work by looking at the front of
the queue as FIFO does, but instead of immediately replacing the cache entry (the oldest one), I check whether its
referenced bit is set (a bit that tells me whether this entry has been used or requested before). If it is
not set, I replace this entry. Otherwise, I clear the referenced bit and then insert this entry at the back of
the queue (as if it were a new entry). I keep repeating this process; you can think of it as a circular queue.
The second time I encounter an entry whose bit I cleared before, I replace it, as its referenced bit is now
cleared (unless it has been used again in the meantime). In practice I usually achieve a better hit ratio than FIFO.
-Clock:
I am Clock, and I am a more efficient version of FIFO than Second Chance because I don't push the cached entries
to the back of the list like Second Chance does, but I perform the same general function as Second Chance.
I keep a circular list of the cached entries in memory, with the "hand" (something like an iterator) pointing to the
oldest entry in the list. When a cache miss occurs and no empty place exists, I consult the R (referenced) bit at
the hand's location to know what I should do. If R is 0, I place the new entry at the "hand" position;
otherwise I clear the R bit. Then I advance the hand (iterator) and repeat the process until an entry is
replaced. I am even faster than Second Chance.
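The hand and the R bit are easier to follow in code. Below is a minimal, single-threaded sketch of Clock. The class ClockCache and its fields are illustrative names, and starting a new entry with its R bit cleared is an assumption of this sketch (some variants set it on insertion).

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative Clock cache; new entries start with R = 0 (an assumption of this sketch). */
class ClockCache<K, V> {
    private final Object[] keys;
    private final Object[] values;
    private final boolean[] referenced;              // the R bit of each slot
    private final Map<K, Integer> slotOf = new HashMap<>();
    private int hand = 0;                            // points at the oldest slot
    private int used = 0;

    ClockCache(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        keys = new Object[capacity];
        values = new Object[capacity];
        referenced = new boolean[capacity];
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        Integer slot = slotOf.get(key);
        if (slot == null) return null;               // cache miss
        referenced[slot] = true;                     // mark as used
        return (V) values[slot];
    }

    void put(K key, V value) {
        Integer slot = slotOf.get(key);
        if (slot != null) {                          // already cached: update in place
            values[slot] = value;
            referenced[slot] = true;
            return;
        }
        int target;
        if (used < keys.length) {
            target = used++;                         // a free slot is still available
        } else {
            while (referenced[hand]) {               // R = 1: clear it and move on
                referenced[hand] = false;
                hand = (hand + 1) % keys.length;
            }
            target = hand;                           // R = 0: replace this entry
            slotOf.remove(keys[target]);
            hand = (hand + 1) % keys.length;
        }
        keys[target] = key;
        values[target] = value;
        referenced[target] = false;
        slotOf.put(key, target);
    }

    boolean contains(K key) { return slotOf.containsKey(key); }
}
```

With a capacity of 2, after put("a"), put("b") and get("a"), adding "c" skips "a" (clearing its bit) and replaces "b". Adding "d" next replaces "a", whose bit is now cleared.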
Simple time-based:
I am simple time-based caching; I invalidate entries in the cache based on absolute time periods. I add items to
the cache, and they remain in the cache for a specific amount of time. I am fast but not adaptive to access
patterns.
Extended time-based expiration:
I am extended time-based expiration; I invalidate the items in the cache based on relative time periods. I
add items to the cache and they remain there until I invalidate them at certain points in time, such as every
five minutes or each day at 12:00.
Sliding time-based expiration:
I am sliding time-based expiration; I invalidate entries in the cache by specifying the amount of time an item is
allowed to be idle in the cache after its last access. After that time I invalidate it. I am fast but not adaptive
to access patterns.
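As an illustration of the sliding variant, here is a small sketch in which every successful read pushes the expiry forward. SlidingExpirationCache is a hypothetical name, and the clock is passed in as a LongSupplier (for example System::currentTimeMillis) so the behaviour can be tried out without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Illustrative sliding-expiration cache; entries expire after being idle for too long. */
class SlidingExpirationCache<K, V> {
    private static final class Timed<T> {
        final T value;
        long lastAccessMillis;
        Timed(T value, long now) { this.value = value; this.lastAccessMillis = now; }
    }

    private final Map<K, Timed<V>> entries = new HashMap<>();
    private final long maxIdleMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis

    SlidingExpirationCache(long maxIdleMillis, LongSupplier clock) {
        this.maxIdleMillis = maxIdleMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Timed<>(value, clock.getAsLong()));
    }

    V get(K key) {
        Timed<V> entry = entries.get(key);
        if (entry == null) return null;
        long now = clock.getAsLong();
        if (now - entry.lastAccessMillis > maxIdleMillis) {
            entries.remove(key);        // idle for too long: invalidate
            return null;
        }
        entry.lastAccessMillis = now;   // each access slides the window forward
        return entry.value;
    }
}
```

Simple (absolute) time-based expiration would look the same, except that the timestamp is set once in put and never refreshed in get.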
OK, now that we have listened to some (famous) replacement algorithms talking about themselves, note that some other
replacement algorithms take other criteria into consideration, like:
Cost: if items have different costs, keep those items that are expensive to obtain, e.g. those that take a long time
to get.
Size: if items have different sizes, the cache may want to discard a large item to store several smaller ones.
Time: some caches keep information that expires (e.g. a news cache, a DNS cache, or a web browser cache). The
computer may discard items because they have expired. Depending on the size of the cache, no further caching
algorithm to discard items may be necessary.
The E-mail!
After programmer 1 had read the article, he thought for a while and decided to send an e-mail to the author of the
caching article. He felt like he had heard the author's name before, but he couldn't remember who this person was.
Anyway, he sent him an e-mail asking: what if I have a distributed environment? How will the cache behave?
The author of the caching article got his e-mail and, ironically, it was the man who had interviewed programmer 1 :D. The
author replied and said:
Distributed caching:
*Cached data can be stored in a memory area separate from the caching directory itself (which handles the caching
entries and so on); it can be across the network or on disk, for example.
*Distributing the cache allows an increase in the cache size.
*In this case the retrieval cost will also increase, due to network request time.
*This will also lead to a hit ratio increase, due to the larger size of the cache.
But how will this work?
Let's assume that we have 3 servers: 2 of them handle the distributed caching (they hold the cache entries), and
the 3rd one handles all incoming requests (which ask for cached entries):
Step 1: the application requests the keys entry1, entry2 and entry3. The hash values of these keys are computed,
and based on each hash value it is decided which server the request is forwarded to.
Step 2: the main node sends parallel requests to all relevant servers (the ones that hold the cache entries we are looking
for).
Step 3: the servers send responses to the main node (which sent the requests in the first place, asking for the cached
entries).
Step 4: the main node sends the responses to the application (the cache client).
*In case a cache entry is not found, the hash value for the entry is still computed and
directs the request to either server 1 or server 2. If, for example, it maps to server 1 and our entry isn't found there, it is
fetched from the DB and added to server 1's caching list.
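Step 1, choosing a server from the key's hash value, could look roughly like this. It is a deliberately simple sketch: CacheRouter and the server names are invented for illustration, and the modulo scheme is an assumption, not necessarily what any given framework does.

```java
import java.util.List;

/** Illustrative router that maps a cache key to one of the cache servers by hash. */
class CacheRouter {
    private final List<String> servers;   // hypothetical server names or addresses

    CacheRouter(List<String> servers) {
        if (servers.isEmpty()) throw new IllegalArgumentException("need at least one server");
        this.servers = servers;
    }

    String serverFor(Object key) {
        // floorMod keeps the index non-negative even when hashCode() is negative
        int index = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(index);
    }
}
```

The same key always maps to the same server, which is what lets a miss on server 1 be repaired by loading the entry from the DB into server 1. One known drawback of plain modulo routing is that adding or removing a server remaps most keys.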
Measuring Cache:
Most caches can be evaluated by measuring the hit ratio and comparing it to the theoretical optimum. This is
usually done by generating a list of cache keys with no real data. But here the hit ratio measurement assumes
that all entries have the same retrieval cost, which is not true: for example, in web caching the number of bytes the
cache can serve is more important than the hit ratio (for example, I can replace one big entry with 10
small entries, which is more effective on the web).
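To see how the two measurements can disagree, the sketch below tracks both the plain hit ratio and the fraction of bytes served from the cache (often called the byte hit ratio). WebCacheStats is a made-up name for illustration.

```java
/** Illustrative counters for hit ratio and byte hit ratio. */
class WebCacheStats {
    private long requests;
    private long hits;
    private long bytesRequested;
    private long bytesFromCache;

    void record(long sizeInBytes, boolean servedFromCache) {
        requests++;
        bytesRequested += sizeInBytes;
        if (servedFromCache) {
            hits++;
            bytesFromCache += sizeInBytes;
        }
    }

    /** Fraction of requests answered from the cache. */
    double hitRatio() {
        return requests == 0 ? 0.0 : (double) hits / requests;
    }

    /** Fraction of requested bytes answered from the cache. */
    double byteHitRatio() {
        return bytesRequested == 0 ? 0.0 : (double) bytesFromCache / bytesRequested;
    }
}
```

Three 100-byte hits and one 700-byte miss give a hit ratio of 0.75 but a byte hit ratio of only 0.3, so the two numbers can tell quite different stories.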
Conclusion:
We have seen some of the popular algorithms that are used in caching. Some of them are based on time or cache object
size, and some are based on frequency of usage. In the next part we are going to talk about caching frameworks and
how they make use of these caching algorithms, so stay tuned ;)
Related Articles:
Part 2 (Algorithm Implementation)
Part 3 (Algorithm Implementation)
Part 4 (Frameworks Comparison)
Part 5 (Frameworks Comparison)
Posted by Ahmed Ali at 1:09 PM
Labels: Algorithms, Framework
Intro to Caching, Caching Algorithms and Caching Frameworks, Part 2
Introduction:
In this part we are going to show how to implement some of the famous replacement algorithms mentioned
in part 1. The code in this article is for demonstration purposes only, which means you will have to put in some extra
effort if you want to make use of it in your application (if you are going to build your own implementation and
not use a caching framework).
The Leftover policy:
After programmer 1 read the article, he proceeded to review the comments on it. One of these comments
was talking about a leftover policy, named Random Cache.
Random Cache:
I am Random Cache; I replace any cache entry I want, I just do that and no one can complain about it. You can
say I evict the unlucky entry. By doing this I remove any overhead of tracking references and the like. I am better than the FIFO
policy, and in some cases I perform even better than LRU, but in general LRU is better than me.
It is comment time:
While programmer 1 was reading the rest of the comments, he found a very interesting comment about
implementations of some of the famous replacement policies. Actually it was a link to the commenter's site, which has
the actual implementation, so programmer 1 clicked the link and here is what he got:
Meet the Cache Element:
public class CacheElement {
    private Object objectValue;
    private Object objectKey;
    private int index;
    private int hitCount;
    .
    . // getters and setters
    .
}
This is the cache entry which we will use to hold the key and the value; it is used in all the cache algorithm
implementations.
Common Code for All Caches:
public final synchronized void addElement(Object key, Object value) {
    int index;
    Object obj;
    // get the entry from the table
    obj = table.get(key);
    // If we have the entry already in our table
    // then get it and replace only its value.
    if (obj != null) {
        CacheElement element;
        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);
        return;
    }
}
The above code is common to all our implementations; it checks whether the CacheElement already exists
in our cache. If so, we just need to replace its value and don't need to do anything else. But what if we
don't find it? Then we will have to dig deeper and see what happens below.
The Talk Show:
Today's episode is a special episode: we have special guests, who are in fact competitors. We are going to hear
what everyone has to say, but first let's introduce our guests:
Random Cache, FIFO Cache
Let's start with the Random Cache.
Meet Random Cache implementation:
public final synchronized void addElement(Object key, Object value) {
    int index;
    Object obj;
    obj = table.get(key);
    if (obj != null) {
        CacheElement element;
        // Just replace the value.
        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);
        return;
    }
    // If we haven't filled the cache yet, put it at the end.
    if (!isFull()) {
        index = numEntries;
        ++numEntries;
    } else {
        // Otherwise, replace a random entry.
        index = (int) (cache.length * random.nextFloat());
        table.remove(cache[index].getObjectKey());
    }
    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    table.put(key, cache[index]);
}
Analyzing Random Cache Code (Talk show):
In today's show Random Cache is going to explain the code line by line, and here we go.
I will go straight to the main point: if I am not full, I place the new entry that the client requested at the
end of the cache (in case there is a cache miss).
I do this by taking the number of entries that reside in the cache and assigning it to index (which will be the index
of the entry the client is adding); after that I increment the number of entries.
if (!isFull()) {
    index = numEntries;
    ++numEntries;
}
If I don't have enough room for the new entry, I have to kick out a random entry (totally random, bribing
isn't allowed).
In order to pick the random entry, I use the random utility shipped with Java to generate a random index and
remove from the table the entry whose index equals the generated index.
else {
    // Otherwise, replace a random entry.
    index = (int) (cache.length * random.nextFloat());
    table.remove(cache[index].getObjectKey());
}
At the end I just place the entry in the cache, whether the cache was full or not.
cache[index].setObjectValue(value);
cache[index].setObjectKey(key);
table.put(key, cache[index]);
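The snippets above rely on fields that are not shown (table, cache, numEntries, random and isFull()). For readers who want something they can run, here is a self-contained sketch of how those pieces might fit together. The class RandomCacheSketch, its inner Slot class and the getElement and size methods are assumptions made for this example, and it uses Random.nextInt(bound) to pick the slot instead of scaling nextFloat().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Self-contained sketch of a random-replacement cache; the field layout is assumed. */
class RandomCacheSketch {
    private static final class Slot { Object key; Object value; }

    private final Slot[] cache;                       // fixed-size array of entries
    private final Map<Object, Slot> table = new HashMap<>();
    private final Random random;
    private int numEntries = 0;

    RandomCacheSketch(int capacity, Random random) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        cache = new Slot[capacity];
        for (int i = 0; i < capacity; i++) {
            cache[i] = new Slot();                    // pre-allocate every slot
        }
        this.random = random;
    }

    boolean isFull() { return numEntries >= cache.length; }

    synchronized void addElement(Object key, Object value) {
        Slot existing = table.get(key);
        if (existing != null) {                       // already cached: replace the value
            existing.value = value;
            return;
        }
        int index;
        if (!isFull()) {
            index = numEntries++;                     // next free slot
        } else {
            index = random.nextInt(cache.length);     // uniform index in [0, length)
            table.remove(cache[index].key);           // forget the unlucky entry
        }
        cache[index].key = key;
        cache[index].value = value;
        table.put(key, cache[index]);
    }

    synchronized Object getElement(Object key) {
        Slot slot = table.get(key);
        return slot == null ? null : slot.value;
    }

    synchronized int size() { return table.size(); }
}
```

Passing the Random instance in (for example new Random(42)) makes the eviction sequence repeatable when experimenting.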
Magnifying the Code:
It is said that when you look at things up close you understand them better, so that's why we have a
magnifying glass, and we are going to magnify the code to get closer to it (and maybe understand it more).
Cache entries in one voice: hi ho, hi ho, into cache we go.
New cache entry: Excuse me, I have a question! (Asking a singing old cache entry near him)
Old cache entry: Go ahead.
New cache entry: I am new here and I don't understand my role exactly. How will the algorithm handle us?
Old cache entry: Cache! (Instead of man!) You remind me of myself when I was new (the first time I was added to the
cache). I used to ask questions like that. Let me show you what will happen.
Meet FIFO Cache Implementation:
public final synchronized void addElement(Object key, Object value) {
    int index;
    Object obj;
    obj = table.get(key);
    if (obj != null) {
        CacheElement element;
        // Just replace the value.
        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);
        return;
    }
    // If we haven't filled the cache yet, put it at the end.
    if (!isFull()) {
        index = numEntries;
        ++numEntries;
    } else {
        // Otherwise, replace the entry at the current pointer with the new one.
        index = current;
        // Wrap around in order to make a circular FIFO.
        if (++current >= cache.length)
            current = 0;
        table.remove(cache[index].getObjectKey());
    }
    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    table.put(key, cache[index]);
}
Analyzing FIFO Cache Code (Talk show):
After Random Cache spoke, the audience went crazy for it, which made FIFO a little bit jealous, so FIFO started
talking and said:
When there is no more room for the new cache entry, I have to kick out the entry at the front (the one that
came first). I work in a circular-queue-like manner; by default the current position is at the beginning of the
queue.
I assign the value of current to index (the index of the entry to replace) and then check whether the incremented current is
greater than or equal to the cache length (because I want to reset the current pointer to the beginning of the
queue); if so, I set current to zero again. After that I just kick out the entry at the index position (which is
the first entry in the queue now) and place the new entry.
else {
    // Otherwise, replace the entry at the current pointer,
    // which takes care of FIFO in a circular fashion.
    index = current;
    if (++current >= cache.length)
        current = 0;
    table.remove(cache[index].getObjectKey());
}
cache[index].setObjectValue(value);
cache[index].setObjectKey(key);
table.put(key, cache[index]);
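As with Random Cache, the surrounding class is not shown, so here is a small self-contained sketch of the circular FIFO idea. FifoCacheSketch and its fields are assumed names; it stores only the keys in the circular array and keeps the values in the map.

```java
import java.util.HashMap;
import java.util.Map;

/** Self-contained sketch of a circular FIFO cache; the layout is assumed for illustration. */
class FifoCacheSketch {
    private final Object[] keys;                 // circular buffer of cached keys
    private final Map<Object, Object> table = new HashMap<>();
    private int numEntries = 0;
    private int current = 0;                     // position of the oldest entry

    FifoCacheSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        keys = new Object[capacity];
    }

    synchronized void addElement(Object key, Object value) {
        if (table.containsKey(key)) {            // already cached: replace the value only
            table.put(key, value);
            return;
        }
        int index;
        if (numEntries < keys.length) {
            index = numEntries++;                // cache not full yet: append
        } else {
            index = current;                     // oldest entry gets replaced
            current = (current + 1) % keys.length;
            table.remove(keys[index]);
        }
        keys[index] = key;
        table.put(key, value);
    }

    synchronized Object getElement(Object key) { return table.get(key); }

    synchronized boolean contains(Object key) { return table.containsKey(key); }
}
```

With a capacity of 2, adding a, b, c evicts a, and adding d then evicts b. Reading or updating an entry never changes its place in the queue, which is exactly why FIFO is not adaptive.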
Magnifying the Code:
Back to our magnifying glass we can observe the following actions happening to our entries
Conclusion:
In this article we have seen how to implement the FIFO replacement policy and also the Random replacement
policy. In the upcoming articles we will take our magnifying glass and magnify the LFU and LRU replacement policies;
till then stay tuned ;)
Posted by Ahmed Ali at 11:05 PM
Labels: Algorithms, Framework
Intro to Caching, Caching Algorithms and Caching Frameworks, Part 3
Introduction:
In part 1 we talked about the basics and terminology of caching and we also presented replacement policies. In
part 2 we implemented some of these famous replacement policies, and now in this part we will continue
with the implementation of two famous algorithms, LFU and LRU. Again, the implementation in this
article is for the sake of demonstration (we concentrate on the replacement algorithm and
skip other things like loading data and so on), so in order to use it you will have to do some extra work, but you can base your
implementation on it.
Meet LFU Cache Implementation:
public synchronized Object getElement(Object key) {
    Object obj;
    obj = table.get(key);
    if (obj != null) {
        CacheElement element = (CacheElement) obj;
        // Count the hit so the least frequently used entry can be found later.
        element.setHitCount(element.getHitCount() + 1);
        return element.getObjectValue();
    }
    return null;
}

public final synchronized void addElement(Object key, Object value) {
    int index;
    Object obj;
    obj = table.get(key);
    if (obj != null) {
        CacheElement element;
        // Just replace the value.
        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);
        return;
    }
    if (!isFull()) {
        index = numEntries;
        ++numEntries;
    } else {
        // Replace the least frequently used entry.
        CacheElement element = removeLfuElement();
        index = element.getIndex();
        table.remove(element.getObjectKey());
    }
    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    cache[index].setIndex(index);
    table.put(key, cache[index]);
}

public CacheElement removeLfuElement() {
    CacheElement[] elements = getElementsFromTable();
    CacheElement leastElement = leastHit(elements);
    return leastElement;
}

public static CacheElement leastHit(CacheElement[] elements) {
    CacheElement lowestElement = null;
    for (int i = 0; i < elements.length; i++) {
        CacheElement element = elements[i];
        if (lowestElement == null) {
            lowestElement = element;
        } else {
            if (element.getHitCount() < lowestElement.getHitCount()) {
                lowestElement = element;
            }
        }
    }
    return lowestElement;
}
Analyzing LFU Cache Code (Talk Show):
Presenter: It is getting hotter and hotter now. Our next contestant is LFU Cache, please make some noise for it.
The audience began to scream for LFU, which made LFU hesitant.
Hello, I am LFU. When the cache client wants to add a new element and the cache is full (not enough room for the new
entry), I have to kick out the least frequently used entry. I do this with the help of the removeLfuElement method,
which gives me the least frequently used element; after I get it, I remove this entry and place the
new entry.
else {
    CacheElement element = removeLfuElement();
    index = element.getIndex();
    table.remove(element.getObjectKey());
}
If we dive into this method... I am saying, if we dive into this method... (still nothing happened)
LFU tried pressing the next button on the presentation remote control (to get the next presentation slide) but it
didn't work.
Ahh, now we are talking. OK, if we dive into this method we will see that it just gets all the
elements in the cache by calling the getElementsFromTable method and then returns the element with the fewest hits.
public CacheElement removeLfuElement() {
    CacheElement[] elements = getElementsFromTable();
    CacheElement leastElement = leastHit(elements);
    return leastElement;
}
It does so by calling the leastHit method, which loops over the cache elements and checks whether the current element has the fewest hits;
if so, I make it my lowestElement, which I am going to replace with the new entry.
public static CacheElement leastHit(CacheElement[] elements) {
    CacheElement lowestElement = null;
    for (int i = 0; i < elements.length; i++) {
        CacheElement element = elements[i];
        if (lowestElement == null) {
            lowestElement = element;
        } else {
            if (element.getHitCount() < lowestElement.getHitCount()) {
                lowestElement = element;
            }
        }
    }
    return lowestElement;
}
LFU stopped talking and waited for a reaction from the audience, and the only reaction it got was scratching heads
(the audience didn't get some of it).
One of the production team whispered to LFU Cache and said: you didn't mention how the lowest element is
distinguished from the other elements.
Then LFU Cache started talking again and said: By default, when you add an element to the cache its hitCount will
be the same as the previous element's, so how do we handle the hit count?
Every time I encounter a cache hit I increment the hit count of the entry and then return the entry the cache
client asked for, which looks something like this:
public synchronized Object getElement(Object key) {
    Object obj;
    obj = table.get(key);
    if (obj != null) {
        CacheElement element = (CacheElement) obj;
        element.setHitCount(element.getHitCount() + 1);
        return element.getObjectValue();
    }
    return null;
}
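To try LFU end to end without the missing pieces (the table, the cache array and getElementsFromTable), here is a compact, self-contained sketch of the same idea. LfuCacheSketch and its inner Counted class are invented for this example. One deliberate difference from the code above: every new entry starts at zero hits, which is an assumption of this sketch rather than what the article's code does.

```java
import java.util.HashMap;
import java.util.Map;

/** Self-contained LFU sketch; new entries start with a hit count of zero (an assumption). */
class LfuCacheSketch<K, V> {
    private static final class Counted<T> {
        T value;
        int hitCount;
    }

    private final Map<K, Counted<V>> table = new HashMap<>();
    private final int capacity;

    LfuCacheSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
    }

    synchronized V getElement(K key) {
        Counted<V> entry = table.get(key);
        if (entry == null) return null;
        entry.hitCount++;                        // every cache hit bumps the counter
        return entry.value;
    }

    synchronized void addElement(K key, V value) {
        Counted<V> existing = table.get(key);
        if (existing != null) {                  // already cached: just replace the value
            existing.value = value;
            return;
        }
        if (table.size() >= capacity) {
            // Linear scan for the entry with the fewest hits, like leastHit above.
            K victim = null;
            int lowest = Integer.MAX_VALUE;
            for (Map.Entry<K, Counted<V>> candidate : table.entrySet()) {
                if (candidate.getValue().hitCount < lowest) {
                    lowest = candidate.getValue().hitCount;
                    victim = candidate.getKey();
                }
            }
            table.remove(victim);
        }
        Counted<V> fresh = new Counted<>();
        fresh.value = value;
        table.put(key, fresh);
    }

    synchronized boolean contains(K key) { return table.containsKey(key); }
}
```

The linear scan on every eviction is what makes this approach slow for large caches, which matches what LFU said about itself in part 1.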
Magnifying the Code:
Did anyone say magnification?
Meet LRU Cache Implementation:
private void moveToFront(int index) {
    int nextIndex, prevIndex;
    if (head != index) {
        nextIndex = next[index];
        prevIndex = prev[index];
        // Only the head has a prev entry that is an invalid index so
        // we don't check.
        next[prevIndex] = nextIndex;
        // Make sure index is valid. If it isn't, we're at the tail
        // and don't set prev[next].
        if (nextIndex >= 0)
            prev[nextIndex] = prevIndex;
        else
            tail = prevIndex;
        prev[index] = -1;
        next[index] = head;
        prev[head] = index;
        head = index;
    }
}
public final synchronized void addElement(Object key, Object value) {
int index;
Object obj;
obj = table.get(key);
if(obj != null) {
CacheElement entry;
// Just replace the value, but move it to the front.
entry = (CacheElement)obj;
entry.setObjectValue(value);
entry.setObjectKey(key);
moveToFront(entry.getIndex());
return;
}
// If we haven't filled the cache yet, place in next available spot
// and move to front.
if(!isFull()) {
if(numEntries > 0) {
prev[numEntries] = tail;
next[numEntries] = -1;
moveToFront(numEntries);
}
++numEntries;
} else {
// We replace the tail of the list.
table.remove(cache[tail].getObjectKey());
moveToFront(tail);
}
cache[head].setObjectValue(value);
cache[head].setObjectKey(key);
table.put(key, cache[head]);
}
Analyzing LRU Cache Code (Talk show):
After LFU finished talking there was not much screaming; the audience didn't like the presentation, and LFU had
been hesitating while talking. This gave a big push to LRU, which started by saying:
This time I will also consider the case when the cache is not full. I am a little more complex than those other
algorithms. When the cache isn't full and it is the first entry, I just increment numEntries, which represents
the number of entries in the cache.
After adding a second entry I need to move it to the front by calling the moveToFront method (we will talk about
it soon). I didn't do this for the first entry because it is for sure the first element.
So let's see some action.
As you can see, I set the prev of the new entry to the tail and its next to -1 (undefined, in other words); these
are just initial values.
After adding the new entry (which isn't the first entry) I move it to the front.
if(!isFull()) {
if(numEntries > 0) {
prev[numEntries] = tail;
next[numEntries] = -1;
moveToFront(numEntries);
}
++numEntries;
}
The moveToFront method moves an entry to the head of the list so that the least recently used elements end up at
the tail (the entries stay where they are in the array; only the next/prev links change).
Before I do any move I check that the head is not equal to the current index (they are equal when we only have 1
entry, and then there is nothing to move). If they differ, I assign the next of the current entry (a pointer to
the next entry, as in a linked list) to nextIndex and the prev of the current entry (a pointer to the previous
entry, as in a linked list) to prevIndex:
int nextIndex, prevIndex;
if(head != index) {
nextIndex = next[index];
prevIndex = prev[index];
Then I make the previous entry skip over the current one by assigning nextIndex to the next of the previous entry:
// Only the head has a prev entry that is an invalid index so
// we don't check.
next[prevIndex] = nextIndex;
After that I check nextIndex: if it is greater than or equal to 0, the prev of the next entry gets the value of
prevIndex; otherwise the current entry was the tail, so the tail becomes prevIndex:
// Make sure index is valid. If it isn't, we're at the tail
// and don't set prev[next].
if(nextIndex >= 0)
prev[nextIndex] = prevIndex;
else tail = prevIndex;
And because I moved this entry to the front there won't be any previous entry for it, so I assign -1 to its prev.
The next of the current entry (now the top one) is the old head, the prev of the old head gets the index of the
current entry, and finally the head is set to the current index:
prev[index] = -1;
next[index] = head;
prev[head] = index;
head = index;
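Putting the pieces of the walk-through together, here is a self-contained sketch of an array-backed LRU cache.
It follows the same head/tail/next/prev idea, but the class name ArrayLruSketch, the parallel keys/values arrays
and the get method are assumptions made for illustration; they are not the original implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** LRU cache sketch: a doubly linked list stored in int arrays, most recently used at the head. */
class ArrayLruSketch {
    private final Object[] keys;
    private final Object[] values;
    private final int[] next;   // index of the next (older) entry, -1 at the tail
    private final int[] prev;   // index of the previous (newer) entry, -1 at the head
    private final Map<Object, Integer> table = new HashMap<>();
    private int head = 0, tail = 0, numEntries = 0;

    ArrayLruSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        keys = new Object[capacity];
        values = new Object[capacity];
        next = new int[capacity];
        prev = new int[capacity];
    }

    /** Unlinks the entry at index and relinks it as the new head. */
    private void moveToFront(int index) {
        if (head == index) return;
        int nextIndex = next[index];
        int prevIndex = prev[index];
        next[prevIndex] = nextIndex;            // only the head has prev == -1, and we are not the head
        if (nextIndex >= 0) prev[nextIndex] = prevIndex;
        else tail = prevIndex;                  // we were the tail
        prev[index] = -1;
        next[index] = head;
        prev[head] = index;
        head = index;
    }

    /** A hit makes the entry the most recently used one. */
    synchronized Object get(Object key) {
        Integer index = table.get(key);
        if (index == null) return null;
        moveToFront(index);
        return values[index];
    }

    synchronized void put(Object key, Object value) {
        Integer existing = table.get(key);
        if (existing != null) {                 // replace the value and refresh its position
            values[existing] = value;
            moveToFront(existing);
            return;
        }
        if (numEntries < keys.length) {         // not full: take the next free slot
            if (numEntries > 0) {
                prev[numEntries] = tail;
                next[numEntries] = -1;
                moveToFront(numEntries);
            } else {                            // very first entry is both head and tail
                prev[0] = -1;
                next[0] = -1;
            }
            numEntries++;
        } else {                                // full: reuse the tail slot (least recently used)
            table.remove(keys[tail]);
            moveToFront(tail);
        }
        keys[head] = key;
        values[head] = value;
        table.put(key, head);
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" reuses the slot of "b", since "b" is
the least recently used entry at that point.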
Magnifying the Code:
It is magnifying time! Get your magnifying glass we are going to see some interesting stuff here
It is Confession Time!:
LRU didn't mention that it is possible to implement the LRU algorithm in a simpler way. Our previous
implementation is based on arrays; the other implementation, which LRU cache didn't mention, is through
LinkedHashMap, which was introduced in JDK 1.4.
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUCache2 extends LinkedHashMap
{
    private static final int MAX_ENTRIES = 3;

    public LRUCache2()
    {
        // third argument true = access order, so the eldest entry is the least recently used one
        super(MAX_ENTRIES + 1, .75F, true);
    }

    // This method is invoked by put and putAll after inserting a new entry into
    // the map. It allows the map to have up to 3 entries and then
    // deletes the oldest entry each time a new entry is added.
    protected boolean removeEldestEntry(Map.Entry eldest)
    {
        return this.size() > MAX_ENTRIES;
    }
}
For sure, the LinkedHashMap solution takes less time to write than the array solution, and it is more convenient
because you leave the handling of ordering, deletion and so on to the JDK itself, so you won't have to bother
implementing such stuff.
OSCache uses such an implementation in its LRU caching implementation.
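To watch the LinkedHashMap approach evict something, here is a small usage sketch. The LruMapDemo class and its
lruMap helper are made-up names; the only JDK pieces relied on are the access-order constructor of
LinkedHashMap and its removeEldestEntry hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruMapDemo {
    /** Builds an access-ordered map that drops its eldest entry once maxEntries is exceeded. */
    static <K, V> Map<K, V> lruMap(final int maxEntries) {
        return new LinkedHashMap<K, V>(maxEntries + 1, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = lruMap(3);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        cache.get("a");        // "a" becomes the most recently used entry
        cache.put("d", 4);     // over the limit: "b", now the eldest, is removed
        System.out.println(cache.keySet()); // prints [c, a, d]
    }
}
```

Note that the map returned here is not synchronized, so a cache shared between threads would still need external
locking or a wrapper such as Collections.synchronizedMap.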
Conclusion:
We have seen how to implement the LFU and LRU algorithms, and the two ways to implement LRU. It is up to you to
choose which way to use, arrays or LinkedHashMap; as for me, I would recommend arrays for a small number of
entries and LinkedHashMap for a large number of entries.
In the next part we will be talking about caching frameworks, a comparison between them, and which caching
algorithm is employed by which caching framework. Stay tuned till then ;)
Posted by Ahmed Ali at 10:55 PM
Intro to Caching,Caching algorithms and caching frameworks part 4
Introduction:
In part 1 we talked about caching and some caching terminology, and in part 2 and part 3 we saw some
implementations of the famous cache replacement algorithms. In this part we will see a comparison between open
source Java caching frameworks, as I am not rich enough to buy commercial frameworks :D.
In this part we will be talking about OSCache, Ehcache, JCS and Cache4J, and we are going to concentrate on memory
caching only. There will be a performance comparison of in-memory caching using the JBoss cache benchmark
framework and other cache test cases.
The Task:
Programming Mania is a famous programming magazine from geeks to geeks. Every issue of the magazine has a section
specialized in framework comparisons such as MVC, ORM and so on, and this month they decided to make a comparison
of caching frameworks.
And as we know, the editors have a programming background; in fact they are real programmers (not fake ones).
Head of Editors: this time we want to make our comparison article about caching frameworks, so we need to
investigate the caching frameworks already in use, and I don't need to remind you that the economic crisis
affected us as well, so we will only care about open source frameworks.
Programmer 1: oh, okay no problem in that.
Head of Editors: excellent, oh and by the way, we will make it in two parts so try getting as much information as
you can.
Programmer 1: ok, no problem.
Head of Editors: oh yeah, one more thing, I am expecting you to be done by the day after tomorrow, as we are
going to release the article this week.
Programmer 1: !!! : (Shocked)
First few lines!
In order for programmer 1 to make the right comparison he needs to know what types of objects caching frameworks
cache. Some caching frameworks cache just normal POJOs while others cache portions of JSPs and so on. Below is a
list of common things that caching frameworks cache:
1-POJO Caching
2-HTTP Response Caching
3-JSP Caching
4-ORM Data Access Caching
The Checklist:
After Programmer 1 read a lot about caching he made a check list which enables him to compare the different
frameworks; he will validate each item of the check list against all the caching frameworks.
The check list is as follows:
Programmer 1 decided to list the famous caching frameworks he is going to compare, so he selected the
following frameworks:
Java Caching System (JCS)
Ehcache
OSCache
Cache4J
ShiftOne
WhirlyCache
SwarmCache
JBoss Cache
As soon as he finished listing the frameworks he started to write the first few lines of the 1st part.
Java Caching System (JCS):
JCS is a distributed caching system written in java for server-side java applications. It is intended to speed up
dynamic web applications by providing a means to manage cached data of various dynamic natures. Like any
caching system, the JCS is most useful for high read, low put applications.
The foundation of JCS is the Composite Cache, which is the pluggable controller for a cache region. Four types of
caches can be plugged into the Composite Cache for any given region: Memory, Disk, Lateral, and Remote.
The JCS jar provides production ready implementations of each of the four types of caches. In addition to the core
four, JCS also provides additional plug-ins of each type.
JCS provides a framework with no point of failure, allowing for full session failover (in clustered environments),
including session data across up to 256 servers.
JCS has quick nested categorical removal, data expiration (idle time and max life), an extensible framework, fully
configurable runtime parameters, remote synchronization, remote store recovery, and a non-blocking "zombie"
(balking facade) pattern.
Balking facade pattern: if a method is invoked on an object and that object is not in an appropriate state to
execute that method, have the method return without doing anything, or even throw an exception, for example
'IllegalStateException'.
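The balking idea can be sketched in a few lines. This is only an illustration of the pattern as described above,
not JCS code: the BalkingCacheFacade class, its State enum and its method names are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Balking sketch: once disposed, operations return immediately instead of blocking or failing. */
class BalkingCacheFacade {
    private enum State { ALIVE, DISPOSED }

    private State state = State.ALIVE;
    private final Map<String, Object> store = new HashMap<>();

    /** Returns false (balks) without doing anything when the cache is not in a usable state. */
    synchronized boolean put(String key, Object value) {
        if (state != State.ALIVE) return false;
        store.put(key, value);
        return true;
    }

    /** Balks by reporting a miss when the cache is not in a usable state. */
    synchronized Object get(String key) {
        if (state != State.ALIVE) return null;
        return store.get(key);
    }

    /** Stricter variant of the same pattern: throw instead of returning silently. */
    synchronized void putOrThrow(String key, Object value) {
        if (state != State.ALIVE) throw new IllegalStateException("cache is disposed");
        store.put(key, value);
    }

    synchronized void dispose() {
        state = State.DISPOSED;
        store.clear();
    }
}
```

The silent variant suits callers that treat the cache as optional; the throwing variant is for callers that need
to know the write was dropped.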
The configuration of JCS is set in a properties file named cache.ccf.
-Memory Cache:
JCS supports LRU and MRU. The LRU Memory Cache is an extremely fast, highly configurable memory cache. It
uses a Least Recently Used algorithm to manage the number of items that can be stored in memory. The LRU
Memory Cache uses its own LRU Map implementation that is significantly faster than both the commons LRUMap
implementation and the LinkedHashMap that is provided with JDK 1.4 and up. (At least that is what JCS claims,
which we will test below.)
-Disk Cache:
The Indexed Disk Cache is a fast, reliable, and highly configurable swap for cached data. The indexed disk cache
follows the fastest pattern for disk swapping.
-Lateral Cache:
The TCP Lateral Cache provides an easy way to distribute cached data to multiple servers. It comes with a UDP
discovery mechanism, so you can add nodes without having to reconfigure the entire farm. The TCP Lateral is
highly configurable.
-Remote Cache:
JCS also provides an RMI based Remote Cache Server. Rather than having each node connect to every other
node, you can use the remote cache server as the connection point.
JCS and Check List:
JCS in Action:
Our programmer 1 was checking the JCS site, and on the site they claim that its LRU Map caching algorithm is
faster than the LinkedHashMap that is shipped with JDK 1.4 and up. So our newbie ran the following test against
JCS (1.3) and LinkedHashMap on JDK 1.4 and 1.6.
The above is the specification of the PC that we are going to run our test on.
In order to check what JCS claims we used their own test case from the JCS site (I will be using this test case
for the rest of our framework testing).
The following configuration file was used during the test:
JCS
After using this test case for LinkedHashMap and JCS we got the following results:
JCS vs. LinkedHashMap
Ehcache:
Ehcache is a Java distributed cache for general purpose caching, J2EE and light-weight containers, tuned for large
cache objects. It features memory and disk stores, replication by copy and invalidate, listeners, and a gzip
caching servlet filter, and it is fast and simple.
Ehcache acts as a pluggable cache for Hibernate 2.1, with a small footprint, minimal dependencies, full
documentation and production testing.
It is used in a lot of Java frameworks such as Alfresco, Cocoon, Hibernate, Spring, JPOX, Jofti, Acegi, Kosmos,
Tudu Lists and Lutece.
One of its features is caching domain objects that map to database entities. As the domain objects that map to
database entities are the core of any ORM system, Ehcache is the default cache for Hibernate. With Ehcache you
can cache both Serializable and non-serializable objects.
Non-serializable objects can use all parts of Ehcache except for the Disk Store and replication. If an attempt is
made to persist or replicate them they are discarded and a WARNING level log message is emitted.
Another feature of Ehcache is that an admin can monitor cache statistics, change the configuration and manage the
cache through JMX, as Ehcache supports it (which is a really nice feature).
The configuration of Ehcache is set in an XML file named ehcache.xml.
-Memory Cache:
Ehcache supports LRU, LFU and FIFO.
-Disk Cache:
Ehcache can store up to 100G of data on disk and access it quickly.
Ehcache and Check List:
OSCache:
OSCache is a caching solution that includes a JSP tag library and a set of classes to perform fine-grained
dynamic caching of JSP content, servlet responses or arbitrary objects. It provides both in-memory and persistent
on-disk caches, and can allow your site to continue functioning normally even if the data source is down (for
example, if an error occurs such as your DB going down, you can serve the cached content so people can still surf
the site).
When dealing with static HTML pages, the page response can be cached indefinitely in memory, thus avoiding
reprocessing of the page. OSCache does so by using the URI and query parameters to form a unique key.
This key is used to store the page content. HttpResponse caching is implemented as a ServletFilter; thus, the
cache filter abstracts the API usage from the client.
By default, the cache filter holds the page response in 'Application' scope and refreshes the cache every hour.
These default values can be changed.
In the case of dynamic pages (JSPs), OSCache provides tags that surround the static part of the page; thus, only
the static part of the page is cached.
OSCache can be configured as a persistent cache. When the memory capacity is reached, objects are evicted from
memory and stored on a hard disk. Objects are evicted from memory based on the configured cache algorithm.
For other storage places (a DB, for example) you can implement your own custom PersistenceListener (to persist
anywhere you want).
OSCache supports distributed caching.
When an application is deployed in a cluster of application servers, the local cache is kept in sync by
communication amongst all the caches in the cluster; this is achieved either by JMS or by JGroups.
Multiple caches can be created, each with its own unique configuration.
Another feature of OSCache is that an admin can monitor cache statistics, change the configuration and manage
the cache through JMX, but this is only available via the Spring framework (while Ehcache supports this feature
without the need for any other framework).
OSCache is also used by many projects, such as Jofti, Spring and Hibernate.
OSCache is also used by many sites, such as TheServerSide, JRoller and JavaLobby.
The configuration of OSCache is set in a properties file named oscache.properties.
-Memory Cache:
OSCache supports LRU and FIFO, and any other custom replacement algorithm.
-Disk Cache:
OSCache supports a disk cache. When using memory and disk together, once capacity is reached an item is removed
from memory but not from disk. Therefore, if that item is needed again, it will be found on disk and brought back
into memory. You get behavior similar to a browser cache. However, you still need to do some administrative
tasks to clean the disk cache periodically, since this has not been implemented in OSCache.
OSCache and Check List:
Cache4J:
Cache4j is a cache for Java objects that stores objects only in memory (suitable for Russian-speaking guys only,
as there is no documentation in English and the JavaDoc is in Russian too :D).
It is mainly useful for caching POJOs only.
In the wish list they stated that they want to support disk caching and distributed handling as well, but that
was a long time ago, in 2006, and nothing has happened since.
It supports the LRU, LFU, and FIFO caching algorithms. For storing objects in its cache, cache4j offers hard and
soft references (a common practice for caching frameworks is to use weak and soft references, because if the
JVM needs to garbage collect some objects to make room in memory, the cached objects will be the first ones
to be removed).
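Here is a minimal sketch of what caching through soft references can look like. It is not Cache4j code; the
SoftValueCache class is a made-up name, and the only JDK facility it leans on is java.lang.ref.SoftReference,
whose referent the garbage collector may clear when memory runs low.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Cache sketch whose values are only softly reachable, so the GC may reclaim them under memory pressure. */
class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    /** Returns the value, or null if it was never cached or has been reclaimed by the GC. */
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) return null;
        V value = ref.get();
        if (value == null) {
            map.remove(key, ref);   // drop the stale reference, but only if nobody replaced it meanwhile
        }
        return value;
    }

    int size() { return map.size(); }
}
```

A caller has to treat every get as a possible miss, because an entry can disappear at any time without an
explicit remove.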
Cache4j is implemented in a way that multiple application threads can access the cache simultaneously. It also
provides easy-to-use programming APIs.
-Memory Cache:
Cache4J supports LRU, LFU and FIFO.
Cache4J Check List:
Performance in action:
OK, now it is show time. For this performance testing programmer 1 used 3 different test cases, which are as
follows:
1-Test Case from JCS site (applied on all caching frameworks)
2-JBoss Cache benchmark framework (which is really a very nice cache benchmark framework)
3-Test Case from Cache4J site (applied on all caching frameworks)
The 1st and 3rd test cases are just simple tests of populating the cache and retrieving from it, while the JBoss
cache benchmark ships with a lot of test cases, from replication to distributed and clustering testing.
All the tests here were performed on a single machine (no distributed testing was performed) and all of them
were performed in memory.
The versions of the frameworks we are going to test now are as follows:
OSCache: 2.4.1
Ehcache: 1.6.0
JCS: 1.3
Cache4j: 0.4
Configurations Used:
OSCache
Ehcache
JCS
Cache4J:
SynchronizedCache cache = new SynchronizedCache();
cache.setCacheConfig(new CacheConfigImpl("cacheId", null, 0, 0, 0, 1000000, null, "lru", "strong"));
JBoss cache benchmark:
We can see here that nearly 8 million get operations were invoked on the different cache frameworks; JCS took
the least time while OSCache took the most.
We see here that nearly 2 million put operations were invoked on the different cache frameworks; cache4j took
the least time while OSCache took the most.
The cache test performed here was an in-memory test with 25 threads accessing the cache, but we will not depend
on this alone and will continue with our testing.
JCS Test Case:
OScache vs. Ehcache
OSCache vs. JCS
OScache vs. Cache4J
Ehcache vs. JCS
Ehcache vs. Cache4J
JCS vs. Cache4J
The winner in this test is Ehcache, which achieved outstanding results against all the other frameworks. This
test just adds 50,000 items to the cache, then retrieves them, and measures the time taken for adding the items
to the cache and getting them back.
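The shape of such a put-then-get timing test can be sketched as follows. This is not the test case from the JCS
site: the PutGetTimingSketch class, the key and value formats and the use of a plain Map as the "cache" are
assumptions, and a single unwarmed run like this is only a rough indication, not a proper benchmark.

```java
import java.util.HashMap;
import java.util.Map;

class PutGetTimingSketch {
    /** Returns { put time in ms, get time in ms, number of hits } for the given cache and item count. */
    static long[] run(Map<String, String> cache, int items) {
        long start = System.nanoTime();
        for (int i = 0; i < items; i++) {
            cache.put("key:" + i, "data:" + i);
        }
        long afterPuts = System.nanoTime();

        int hits = 0;
        for (int i = 0; i < items; i++) {
            if (cache.get("key:" + i) != null) hits++;
        }
        long afterGets = System.nanoTime();

        return new long[] {
            (afterPuts - start) / 1_000_000,
            (afterGets - afterPuts) / 1_000_000,
            hits
        };
    }

    public static void main(String[] args) {
        long[] result = run(new HashMap<String, String>(), 50_000);
        System.out.println("put ms=" + result[0] + ", get ms=" + result[1] + ", hits=" + result[2]);
    }
}
```

Because every key that is read was just written, a test of this shape only ever measures cache hits.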
Cache4j Test Case:
---------------------------------------------------------------
java.version=1.6.0_10
java.vm.name=Java HotSpot(TM) Client VM
java.vm.version=11.0-b15
java.vm.info=mixed mode, sharing
java.vm.vendor=Sun Microsystems Inc.
os.name=Windows XP
os.version=5.1
os.arch=x86
---------------------------------------------------------------
This test can take about 5-10 minutes. Please wait ...
---------------------------------------------------------------
GetPutRemoveT GetPutRemove Get
---------------------------------------------------------------
cache4j 0.4 2250 2125 1703
oscache 2.4.1 4032 4828 1204
ehcache 1.6 1860 1109 703
jcs 1.3 2109 1672 766
---------------------------------------------------------------
As we can see, OSCache again took the longest time in the first two columns (Cache4j was the slowest in the Get
column), while Ehcache took the shortest time in all three.
This test also only adds and retrieves cache items, which means there is no cache miss (like the test
cases in the JBoss cache benchmark).
And the gold medal goes to!
Our candidate framework in this part is Ehcache, which achieved the best time in most of the tests and the best
performance for cache misses and cache hits, and not only that, it also provides very good features, from
monitoring statistics to distributed functionality.
2nd place goes to JCS and OSCache. JCS is really a great caching framework; it won't serve the need of caching
responses and JSP portions, but it is a great choice for caching POJOs. OSCache has nice features, but
unfortunately its performance is not that good, because an exception is thrown when there is a cache miss,
which affects performance; most of the cache frameworks introduced here just return null when a cache miss
is encountered.
Finally, in last place comes Cache4j, which did a really great job in caching but isn't feature rich, and its
documentation is in Russian, so it won't be helpful when you face a problem with it :D, but it still achieved
outstanding results.
Conclusion:
In this part we have seen different cache frameworks and compared them, but that's not the end; we still have
more open source caching frameworks to check, so stay tuned ;)
Posted by Ahmed Ali at 10:55 PM
Labels: Algorithms, Framework
Intro to Caching,Caching algorithms and caching frameworks part 5
Introduction:
In part 1 we talked about caching and some caching terminology, in part 2 and part 3 we saw some implementations
of the famous cache replacement algorithms, and in part 4 we saw comparisons between some famous caching
frameworks. In this part we are going to continue what we started in part 4, and as in part 4 we will concentrate
only on memory caching.
The Task:
After programmer 1 released the caching article in Programming Mania, the geek to geek magazine, he got a lot
of threatening mails and terrible messages from caching geeks defending their beloved caching frameworks and
warning him that if he didn't make their beloved caching framework win the contest, he would regret the day he
became a programmer.
That didn't scare our programmer, and he went on to complete the second part of the comparison:
ShiftOne
WhirlyCache
SwarmCache
JBoss Cache
ShiftOne:
ShiftOne, or JOCache as they call it, is a lightweight caching framework that implements several strict object
caching policies and comes with a set of cache algorithm implementations that support in-memory caching.
ShiftOne cache enforces two rules for every cache:
Max Size - each cache has a hard limit on the number of elements it will contain. When this limit is exceeded,
the least valuable element is evicted. This happens immediately, on the same thread. This prevents the cache
from growing uncontrollably
Element Timeout - each cache has a maximum time that its elements are considered valid. No element will
ever be returned that exceeds this time limit. This ensures a predictable data freshness.
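The two rules can be sketched together in one small class. This is not ShiftOne code: the BoundedTimeoutCache
class is a made-up name, "least valuable" is simplified here to "oldest inserted" (FIFO), and the clock is
passed in as a LongSupplier purely so that the timeout can be exercised without sleeping.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.function.LongSupplier;

/** Sketch of a cache with a hard size limit and a per-element timeout. */
class BoundedTimeoutCache<K, V> {
    /** A value together with the time at which it was stored. */
    private static final class Timed<V> {
        final V value;
        final long storedAt;
        Timed(V value, long storedAt) { this.value = value; this.storedAt = storedAt; }
    }

    private final int maxSize;
    private final long timeoutMillis;
    private final LongSupplier clock;   // supplies "now" in milliseconds
    private final LinkedHashMap<K, Timed<V>> map = new LinkedHashMap<>(); // insertion order

    BoundedTimeoutCache(int maxSize, long timeoutMillis, LongSupplier clock) {
        this.maxSize = maxSize;
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
    }

    /** Max Size rule: eviction happens immediately, on the calling thread. */
    synchronized void put(K key, V value) {
        map.remove(key);                // re-inserting moves the key to the newest position
        map.put(key, new Timed<>(value, clock.getAsLong()));
        if (map.size() > maxSize) {
            Iterator<K> oldest = map.keySet().iterator();
            oldest.next();
            oldest.remove();
        }
    }

    /** Element Timeout rule: an element older than the timeout is never returned. */
    synchronized V get(K key) {
        Timed<V> timed = map.get(key);
        if (timed == null) return null;
        if (clock.getAsLong() - timed.storedAt > timeoutMillis) {
            map.remove(key);
            return null;
        }
        return timed.value;
    }

    synchronized int size() { return map.size(); }
}
```

In real use the clock argument would typically be System::currentTimeMillis.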
ShiftOne uses the decorator pattern to make it more flexible for the user to plug in any underlying caching
product to maintain the cache.
The following caching products can be plugged into ShiftOne:
EHCache
SwarmCache
JCS Cache
Oro Cache
ShiftOne enables the client to gather statistics (hit/miss) about the cache by using JMX, and not only that, it
also enables integration with Hibernate ORM through adaptors.
When it comes to in-memory caching (which is the only thing JOCache supports), JOCache uses soft references for
the cache entries.
JOCache was originally implemented as part of the ExQ project to support ResultSet caching. It was later split out
for use by other projects. It was designed to cache large expensive database query results.
-Memory Cache:
ShiftOne cache supports LRU, LFU, FIFO, Single, Zero
ShiftOne and Check List:
WhirlyCache:
WhirlyCache is a fast, configurable in-memory object cache for Java. It can be used to speed up a website or an
application by caching objects that would otherwise have to be created by querying a database or by another
expensive procedure.
WhirlyCache runs a separate thread to prune the cache; in other words, cache maintenance is not performed on the
application thread that the client uses. Thus, there is less burden on the application thread.
Whirlycache is built around several design principles that differ from other cache implementations:
Require synchronization as infrequently as possible
Do as little as possible in the insertion and retrieval operations
Soft limits are acceptable for many applications
Disk overflow becomes a bad idea very quickly
Many attributes of Whirlycache are configurable in an XML file, but the most important components of the cache
are the Backend, the Tuner, and the Policy.
WhirlyCache supports pluggable backend implementations that need to implement the ManagedCache interface
(which is a subinterface of java.util.Map, although not all the methods of Map need to be implemented).
WhirlyCache currently supports two backends: ConcurrentHashMap and FastHashMap. You can even implement your
own backend by implementing the ManagedCache interface.
The Tuner is a background thread that performs cache maintenance activities specified in the configured Policy
implementation. One Tuner thread per cache is created and it is configured to run every n seconds. It depends on
your application, but you definitely don't want to run the Tuner too often since it will only serve to burden the
system unnecessarily.
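The idea of a background tuner with a soft limit can be sketched with the JDK's own scheduler. This is not
WhirlyCache code: the TunedCache class is a made-up name, and the "policy" here simply removes arbitrary
entries until the cache is back under its soft limit.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Sketch of a cache pruned by a background thread instead of on every put. */
class TunedCache<K, V> implements AutoCloseable {
    private final ConcurrentHashMap<K, V> backend = new ConcurrentHashMap<>();
    private final int softMaxSize;
    private final ScheduledExecutorService tuner;

    TunedCache(int softMaxSize, long periodSeconds) {
        this.softMaxSize = softMaxSize;
        // One daemon thread per cache, so the tuner never keeps the JVM alive on its own.
        this.tuner = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cache-tuner");
            t.setDaemon(true);
            return t;
        });
        tuner.scheduleWithFixedDelay(this::prune, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** No eviction work here: between tuner runs the cache may exceed its soft limit. */
    void put(K key, V value) { backend.put(key, value); }

    V get(K key) { return backend.get(key); }

    int size() { return backend.size(); }

    /** Maintenance pass; removes arbitrary entries until the size is within the soft limit. */
    void prune() {
        Iterator<K> it = backend.keySet().iterator();
        while (backend.size() > softMaxSize && it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    @Override
    public void close() { tuner.shutdownNow(); }
}
```

The put and get paths do no bookkeeping at all, which is the trade-off the design principles above describe:
cheaper operations in exchange for a limit that is only enforced every n seconds.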
-Memory Cache:
Currently, WhirlyCache offers FIFO, LFU and LRU. You can specify a different Policy implementation per named
cache in the whirlycache.xml configuration file
WhirlyCache and Check List:
SwarmCache:
SwarmCache is an in-memory cache intended more for caching domain objects on the data access layer. It offers
support for a distributed cache in a clustered environment.
SwarmCache supports the LRU caching algorithm. However, SwarmCache is essentially an in-memory cache. When
LRU is set as the caching algorithm and the memory capacity is reached, SwarmCache evicts the memory objects
as per LRU logic from its memory.
SwarmCache uses soft references to the cached objects. So, if LRU is not set as the caching algorithm, it relies
on the garbage collector to sweep through its memory and clean up objects that are least frequently accessed.
However, SwarmCache recommends setting a combination of the above two as the caching algorithm.
SwarmCache provides a wrapper in order to be used with Hibernate ORM and DataNucleus.
When used in a clustered environment, each server instantiates its own manager. For each type of object that the
server wishes to cache, it instantiates a cache and adds it to the manager. The manager joins a multicast group
and communicates with other managers in the group. Whenever an object is removed from a cache, the manager
notifies all other managers in the group. Those managers then ensure that the object is removed from their
respective caches. The result is that a server will not have in its cache a stale version of an object that has been
updated or deleted on another server.
Note that the managers only need to communicate when an object is removed from a cache. This only happens
when an object is updated or deleted. The managers do not co-operate beyond this. This means that the amount
of inter-server communications is proportional to the amount of updates/deletes of the application. Also notice that
there is no "server"; all hosts are equal peers and they can come and go from the cache group as they please
without affecting other group members. Thus the operation of the distributed cache is very robust
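The remove-only notification scheme can be sketched in a single process. This is not SwarmCache code: the
SwarmStyleSketch class and its nested Group and Peer classes are made-up names, and a plain in-memory list
stands in for the multicast group.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class SwarmStyleSketch {
    /** Stands in for the multicast group; there is no central server, just a list of equal peers. */
    static final class Group {
        private final List<Peer> members = new CopyOnWriteArrayList<>();

        void join(Peer peer) { members.add(peer); }

        void leave(Peer peer) { members.remove(peer); }

        void broadcastRemoval(Peer sender, String key) {
            for (Peer peer : members) {
                if (peer != sender) peer.removeLocal(key);
            }
        }
    }

    /** One server's local cache plus its link to the group. */
    static final class Peer {
        private final Map<String, Object> local = new ConcurrentHashMap<>();
        private final Group group;

        Peer(Group group) {
            this.group = group;
            group.join(this);
        }

        /** Puts stay local: nothing is sent to the other peers. */
        void put(String key, Object value) { local.put(key, value); }

        Object get(String key) { return local.get(key); }

        /** Called when the underlying object was updated or deleted: tell everyone to drop it. */
        void remove(String key) {
            local.remove(key);
            group.broadcastRemoval(this, key);
        }

        void removeLocal(String key) { local.remove(key); }
    }
}
```

Since only removals are broadcast, a peer that has dropped a stale entry simply reloads it from the data source
on its next miss.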
-Memory Cache:
LRU, Timeout, Automatic and Hybrid
SwarmCache and Check List:
JBoss Cache:
JBoss offers two kinds of cache flavors, namely CoreCache and PojoCache.
JBoss Core Cache is a tree-structured, clustered, transactional cache. It can be used in a standalone, non-
clustered environment, to cache frequently accessed data in memory thereby removing data retrieval or
calculation bottlenecks while providing "enterprise" features such as JTA compatibility, eviction and persistence.
JBoss Cache is also a clustered cache, and can be used in a cluster to replicate state providing a high degree of
failover. A variety of replication modes are supported, including invalidation and buddy replication, and network
communications can either be synchronous or asynchronous.
JBoss Cache can - and often is - used outside of JBoss AS, in other Java EE environments such as Spring, Tomcat,
Glassfish, BEA WebLogic, IBM WebSphere, and even in standalone Java programs thanks to its minimal
dependency set
POJO Cache is an extension of the core JBoss Cache API. POJO Cache offers additional functionality such as:
maintaining object references even after replication or persistence.
fine grained replication, where only modified object fields are replicated.
"API-less" clustering model where POJOs are simply annotated as being clustered.
In addition, JBoss Cache offers a rich set of enterprise-class features:
being able to participate in JTA transactions (works with most Java EE compliant transaction managers).
Attach to JMX consoles and provide runtime statistics on the state of the cache.
Allow client code to attach listeners and receive notifications on cache events.
Allow grouping of cache operations into batches, for efficient replication
The cache is organized as a tree, with a single root. Each node in the tree essentially contains a map, which acts
as a store for key/value pairs. The only requirement placed on objects that are cached is that they implement
java.io.Serializable.
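A tree of nodes where each node carries its own key/value map can be sketched like this. It is not the JBoss
Cache API: the TreeCacheSketch class is a made-up name, slash-separated path strings stand in for however the
real cache addresses its nodes, and there is no replication, transaction or eviction logic here.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a tree-structured cache: a single root, and every node holds a map of key/value pairs. */
class TreeCacheSketch {
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        final Map<String, Object> data = new HashMap<>();
    }

    private final Node root = new Node();

    /** Walks a path such as "/users/42"; optionally creates the missing nodes along the way. */
    private Node find(String path, boolean create) {
        Node current = root;
        for (String part : path.split("/")) {
            if (part.isEmpty()) continue;       // skips the leading slash
            Node child = current.children.get(part);
            if (child == null) {
                if (!create) return null;
                child = new Node();
                current.children.put(part, child);
            }
            current = child;
        }
        return current;
    }

    synchronized void put(String path, String key, Object value) {
        find(path, true).data.put(key, value);
    }

    synchronized Object get(String path, String key) {
        Node node = find(path, false);
        return node == null ? null : node.data.get(key);
    }

    /** Removes a node together with its whole subtree; returns false if the node did not exist. */
    synchronized boolean removeNode(String path) {
        int slash = path.lastIndexOf('/');
        String parentPath = slash < 0 ? "" : path.substring(0, slash);
        String name = path.substring(slash + 1);
        Node parent = find(parentPath, false);
        return parent != null && parent.children.remove(name) != null;
    }
}
```

Removing "/users/42" also discards everything stored under "/users/42/orders", which is the kind of grouped
removal a flat key/value cache cannot do in one step.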
JBoss Cache works out of the box with most popular transaction managers, and even provides an API where
custom transaction manager lookups can be written.
The cache is completely thread-safe. It employs multi-versioned concurrency control (MVCC) to ensure thread
safety between readers and writers, while maintaining a high degree of concurrency. The specific MVCC
implementation used in JBoss Cache allows for reader threads to be completely free of locks and synchronized
blocks, ensuring a very high degree of performance for read-heavy applications. It also uses custom, highly
performant lock implementations that employ modern compare-and-swap techniques for writer threads, which are
tuned to multi-core CPU architectures.
Multi-versioned concurrency control (MVCC) is the default locking scheme since JBoss Cache 3.x.
-Memory Cache:
JBoss Cache supports LRU, LFU, MRU, Expiration, ElementSize and FIFO.
JBoss Check List:
Performance in action:
OK, now it is show time. For this performance testing programmer 1 used 3 different test cases, which are as
follows:
Test Case from JCS site (applied on all caching frameworks)
JBoss Cache benchmark framework (which is really a very nice cache benchmark framework)
Test Case from Cache4J site (applied on all caching frameworks)
The 1st and 3rd test cases are just simple tests of populating the cache and retrieving from it, while the JBoss
cache benchmark ships with a lot of test cases, from replication to distributed and clustering testing.
All the tests here were performed on a single machine (no distributed testing was performed) and all of them
were performed in memory.
The versions of the frameworks we are going to test now are as follows:
OSCache: 2.4.1
Ehcache: 1.6.0
JCS: 1.3
Cache4j: 0.4
JBoss: 3.0.0
Whirly: 1.0.1
Swarm: 1.0
ShiftOne: 2.0b
JBoss cache benchmark:
We can see here that nearly 8 million get operations were invoked on the different cache frameworks;
WhirlyCache took the least time (followed by JBoss Cache) while OSCache took the most.
We see here that nearly 2 million put operations were invoked on the different cache frameworks; WhirlyCache
took the least time while OSCache took the most.
The cache test performed here was an in-memory test with 25 threads accessing the cache.
JCS Test Case:
Cache4j vs. JBoss
EhCache vs. JBoss
JCS vs. JBoss
OSCache vs. JBoss
ShiftOne vs. Cache4J
ShiftOne vs. EhCache
ShiftOne vs. JCS
ShiftOne vs. OSCache
ShiftOne vs. Swarm
ShiftOne vs. JBoss
Swarm vs. Cache4J
Swarm vs. EhCache
Swarm vs. JBoss
Swarm vs. JCS
Swarm vs. OSCache
Whirly vs. Cache4J
Whirly vs. EhCache
Whirly vs. JBoss
Whirly vs. JCS
Whirly vs. OSCache
Whirly vs. ShiftOne
Whirly vs. Swarm
The winner in this test is Ehcache, which achieved outstanding results against all the other frameworks; in 2nd
place comes WhirlyCache and in 3rd place comes JBoss Cache.
Cache4j Test Case:
Cache4J Test With Remove
As we can see, SwarmCache took the most time while EhCache and WhirlyCache took the least.
This test also adds and retrieves cache items, which means there are no cache misses (like the test
cases in the JBoss Cache benchmark).
But this test performs an extra step, which is removing entries from the cache; if we omit this operation
(concentrating on the put and get operations only), we get the following results:
Cache4J Test Without Remove
As we can see, the JBoss and Swarm times are heavily reduced, which means that the remove operation takes a lot of
time in these two cache frameworks. But let's not forget that JBoss is not a flat cache (it is a structured cache), which
might be the reason for the delay; it also uses a transaction-like mechanism for caching, which would affect
its performance too, but it is still a great feature (and for sure we won't invoke the remove method that often).
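The with/without-remove comparison can be sketched as a single loop with a switch for the remove step. This is not the actual Cache4J test case, whose details are not shown here: the class `RemoveCostTiming`, its `run` method and the `HashMap` stand-in are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: times put + get, optionally followed by remove.
class RemoveCostTiming {

    // Returns {elapsed nanos, hits, entries left in the cache}.
    static long[] run(Map<Integer, String> cache, int entries, boolean withRemove) {
        long hits = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < entries; i++) {
            cache.put(i, "value-" + i);
            if (cache.get(i) != null) {
                hits++; // the key was just added, so there is never a miss
            }
            if (withRemove) {
                cache.remove(i); // the extra step whose cost is being isolated
            }
        }
        long elapsed = System.nanoTime() - t0;
        return new long[] { elapsed, hits, cache.size() };
    }

    public static void main(String[] args) {
        long[] with = run(new HashMap<>(), 1_000_000, true);
        long[] without = run(new HashMap<>(), 1_000_000, false);
        System.out.printf("with remove: %d ms, without remove: %d ms%n",
                with[0] / 1_000_000, without[0] / 1_000_000);
    }
}
```

Running the same loop twice per framework, once with each setting, gives the difference attributable to remove, which is how the gap for JBoss and Swarm shows up above.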
And the gold medal goes to!
Our candidate frameworks in this part are WhirlyCache and JBoss Cache; both of them achieved very good
performance in cache hits and misses. But let's not forget that Whirly is not a distributed cache, which is a bad
thing; besides, JBoss offers a structured cache, as we discussed before, as well as its transaction mechanism.
WhirlyCache is really nice as an in-memory cache in either single- or multi-threaded applications. SwarmCache,
on the contrary, performs really badly in multi-threaded applications: it threw an out-of-memory exception
more than once while it was being tested.
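To illustrate what "structured" means as opposed to a flat key/value map, here is a minimal sketch of a tree-shaped cache in which data hangs off path-addressed nodes and removing a node drops its whole subtree. It is not JBoss Cache's API: the class `TreeCacheSketch` and its methods are hypothetical, and the sketch says nothing about how JBoss implements removal or transactions internally.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a tree-structured cache (not JBoss Cache's API).
class TreeCacheSketch {

    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        final Map<String, Object> data = new HashMap<>();
    }

    private final Node root = new Node();

    // Splits "/a/b" into ["a", "b"]; "/" or "" addresses the root.
    private static String[] parts(String path) {
        String trimmed = path.replaceAll("^/+|/+$", "");
        return trimmed.isEmpty() ? new String[0] : trimmed.split("/+");
    }

    // Stores a key/value pair in the node at `path`, creating nodes as needed.
    void put(String path, String key, Object value) {
        Node n = root;
        for (String p : parts(path)) {
            n = n.children.computeIfAbsent(p, k -> new Node());
        }
        n.data.put(key, value);
    }

    // Walks the path and returns the value, or null if node or key is absent.
    Object get(String path, String key) {
        Node n = root;
        for (String p : parts(path)) {
            n = n.children.get(p);
            if (n == null) {
                return null;
            }
        }
        return n.data.get(key);
    }

    // Removes the node at `path` together with everything beneath it.
    boolean removeNode(String path) {
        String[] ps = parts(path);
        if (ps.length == 0) {
            root.children.clear();
            root.data.clear();
            return true;
        }
        Node parent = root;
        for (int i = 0; i < ps.length - 1; i++) {
            parent = parent.children.get(ps[i]);
            if (parent == null) {
                return false;
            }
        }
        return parent.children.remove(ps[ps.length - 1]) != null;
    }
}
```

Every operation walks a path before it touches any data, which is extra work a flat cache does not do; in exchange, related entries can be grouped and evicted together.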
Second place goes to ShiftOne, which is really nice but suffers from a lack of support, documentation and even
configuration.
If we consider the caches we introduced in the previous part, we get the following order:
First place: EhCache (still the best) along with Whirly and JBoss
Second place: ShiftOne and JCS
Third place: Cache4J and OSCache
The worst performance was achieved by SwarmCache (I guess it would be faster not to cache your objects than
to cache them with SwarmCache :D )
Conclusion:
In this part we have seen a comparison of different open source cache frameworks and concluded that
EhCache is one of the best choices (besides JBoss and Whirly cache), while Swarm is one of the poorest
choices you will ever make.
Posted by Ahmed Ali at 9:34 PM
Labels: Framework