Front End Cache
Jonas Bonér of Terracotta fame has a nice blog entry that sums up the way we like to do our caching. But this wasn’t always the case for us. Millions of years ago (ok, maybe only 3 or 4 years ago), while I was working on a consulting project at Bank of America, I had my first major exposure to large-scale distributed cache implementations. The application was a very large account management system in the client’s investment services group, and it coordinated huge amounts of data in a multi-step BPM system. When we realized that the whole process was completed by multiple users in less than an hour, we knew we needed to cache the data better, since each step was loading data that someone else had just loaded a few minutes earlier.
OK, deciding to cache was the easy part. The next thing we did was start looking at WHERE to cache it. We were using Hibernate at the time, and the obvious (and very amateurish) choice was to just plug a distributed cache into Hibernate's Level 2 cache, right? The problem was that the data we retrieved from the DB (and Hibernate) wasn’t in the form that we were displaying to the users. There was a fair amount of transformation and formatting, which also gets expensive when it happens several times an hour across thousands of accounts. When we talked to one of the Tangosol guys about it, he was the first to tell us that this is a very common stop on the road to caching “enlightenment”.
So in the end, we implemented what I have been calling a “front end cache” ever since. The cache is plugged directly into the data access facade, even in front of the DAO (of course this varies somewhat, but for the sake of clarity…). This way we are caching almost the end result of the data retrieval, and there is no transformation required before display. The data in the cache was correlated to the back end data so that distributed cache updates were still propagated to the front end cache (well, at least invalidations were). Moving from plugging a cache into existing frameworks at the back of the stack to caching up front, after the data has been manipulated, brought huge performance increases over just deferring to the L2 cache on the back end.
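To make the idea concrete, here is a minimal sketch of a front end cache in Java. It is not the code from that project: the `FrontEndCache` class, its `loader` and `transformer` parameters, and the `invalidate` method are all names I made up for illustration, and a plain in-memory `ConcurrentHashMap` stands in for the distributed cache. The point is only that the cache holds the already-transformed, display-ready value, and that a back end update can knock an entry out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical facade-level cache: K is the lookup key, R the raw record
// coming out of the DAO, V the display-ready value shown to the user.
class FrontEndCache<K, R, V> {
    private final Function<K, R> loader;       // stands in for the DAO / Hibernate call
    private final Function<R, V> transformer;  // the expensive transformation and formatting
    // Assumption: a local map replaces the distributed cache for this sketch.
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    FrontEndCache(Function<K, R> loader, Function<R, V> transformer) {
        this.loader = loader;
        this.transformer = transformer;
    }

    // On a miss, load AND transform, then cache the finished result.
    // On a hit, neither the DAO nor the transformation runs.
    V get(K key) {
        return cache.computeIfAbsent(key, k -> transformer.apply(loader.apply(k)));
    }

    // Called when the back end data changes, so the next get() reloads.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

A facade would call `get(accountId)` wherever it used to call the DAO and then format the result, and whatever listens for back end changes would call `invalidate(accountId)`. In a real distributed setup the invalidation would arrive from the cache product itself rather than from a direct method call.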

