Clearing cache in JPA 2.0

Managing a data cache effectively is important for application or service performance, but it is not necessarily easy to do. One simple, common design that seems to cause problems with a data cache is the following:

  • a process that continually commits new data to a database
  • another process executing in a separate virtual machine, typically a web service, marshaling data from the database
The web service needs to pull the most current data from the database each time there is a client request to access data. Caching is implemented differently from one JPA product to another, so not all the JPA calls seem to behave the same way. There can be multiple levels of caching as well. The clear() and flush() methods of EntityManager don’t clear every level of caching in all cases, even when caching is turned off through persistence properties.
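For reference, the standard JPA 2.0 properties for turning caching off look roughly like the sketch below. The class and method names are made up for illustration, and these may not be the exact settings behind the behavior described here. String values are used only to keep the sketch free of JPA imports; with the JPA API on the classpath, the SharedCacheMode, CacheRetrieveMode and CacheStoreMode enum constants are the typed equivalents.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: collects the standard JPA 2.0 cache-related property names.
final class CacheOffProperties {
    private CacheOffProperties() {}

    // Properties intended for the persistence unit as a whole.
    static Map<String, Object> factoryProperties() {
        Map<String, Object> props = new HashMap<>();
        // Ask the provider not to use the shared (second-level) cache at all.
        props.put("javax.persistence.sharedCache.mode", "NONE");
        return props;
    }

    // Properties intended for an individual EntityManager.
    static Map<String, Object> entityManagerProperties() {
        Map<String, Object> props = new HashMap<>();
        // Read from the database instead of the shared cache.
        props.put("javax.persistence.cache.retrieveMode", "BYPASS");
        // Do not put freshly read entities back into the shared cache.
        props.put("javax.persistence.cache.storeMode", "BYPASS");
        return props;
    }
}
```

The first map would be passed to Persistence.createEntityManagerFactory(unitName, map) and the second to createEntityManager(map) on the factory. How strictly these are honored, and whether string values are accepted, depends on the provider.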
Every time there is a request to pull data from the service, the response needs to reflect the current set of data in the database, not what is in the cache, since the cache may not be in sync with the database. It seems like this should be simple, but it took some experimenting on my part to get this working as needed. There also doesn’t appear to be much information about handling this particular scenario. There are probably solutions posted somewhere, but I am adding one solution here to make it easier to find.
Before reading any data, make the following call:

em.getEntityManagerFactory().getCache().evictAll();
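
To make the ordering explicit, here is a minimal sketch of a helper that always evicts before it reads. The helper name and its shape are hypothetical. It takes the eviction and the read as callbacks so that the sketch compiles without a JPA provider on the classpath, and the JPA wiring shown in the comments assumes an EntityManager named em and a made-up entity called Reading.

```java
import java.util.function.Supplier;

// Hypothetical helper that enforces "evict first, then read".
final class FreshReads {
    private FreshReads() {}

    // Runs the eviction callback, then the loader, and returns the loader's result.
    static <T> T evictThenLoad(Runnable evictAll, Supplier<T> loader) {
        evictAll.run();
        return loader.get();
    }
}

// Possible wiring inside a service method (assumes em and a Reading entity exist):
//
//   List<Reading> rows = FreshReads.evictThenLoad(
//       () -> em.getEntityManagerFactory().getCache().evictAll(),
//       () -> em.createQuery("SELECT r FROM Reading r", Reading.class).getResultList());
```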

This seems to work for all cache settings and forces the caches to be cleared, including the shared (second-level) cache that clear() leaves untouched. Subsequent calls to get data return the latest committed data in the database. This may not be the most efficient way, but it always gets the latest data. For systems requiring high performance, this won’t work very well, because every request has to reload its data from the database. It would be better to refresh the cache periodically and have all clients just get the latest cached values, but that still leaves the issue of refreshing the entire cache.
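
The periodic approach could be sketched roughly as follows. This is only an illustration under assumptions: the class name is hypothetical, and the eviction itself is passed in as a Runnable so that the sketch does not depend on a JPA provider. Note that it only evicts on a timer; entities are reloaded from the database the next time they are read.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a cache-eviction action at a fixed rate on a background thread.
final class PeriodicCacheEvictor implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    PeriodicCacheEvictor(Runnable evictAction, long period, TimeUnit unit) {
        // Daemon thread, so a forgotten evictor does not keep the JVM alive.
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cache-evictor");
            t.setDaemon(true);
            return t;
        });
        // scheduleAtFixedRate cancels all later runs if one run throws,
        // so failures are swallowed here to keep the schedule alive.
        Runnable safe = () -> {
            try {
                evictAction.run();
            } catch (RuntimeException e) {
                // A real service would log this.
            }
        };
        scheduler.scheduleAtFixedRate(safe, period, period, unit);
    }

    // Stops the background eviction.
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In a JPA service the action might be () -> emf.getCache().evictAll(), where emf is the application's EntityManagerFactory. To avoid throwing away the entire cache, the action could instead call emf.getCache().evict(Reading.class) for a single entity class (Reading being a made-up entity). Clients would then see data that is at most one period old rather than strictly current.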

Any suggestions on doing this better are appreciated. But for now, this works consistently across platforms and is reasonably quick for small to moderate amounts of data.