Thursday, May 21, 2009

Intro to OpenJPA Caching

Part of the JPA 2.0 spec (JSR317) defines the javax.persistence.Cache interface [1], which exposes a provider's second-level (L2) cache. While exploring this new interface, I learned a few things about OpenJPA caching that were not entirely obvious to me at the time. Hopefully I can lay out some of what I've learned and save at least one person some pain.

First off, I'm going to discuss the caching that wasn't obvious to me. Take a quick look through the semi-pseudo code below, which models the scenario I was running.

EntityManager em = emf.createEntityManager(); // emf is an EntityManagerFactory
Entity e = em.find(Entity.class, Long.valueOf(1));
updateEntityViaJDBC(e.getId(), "new data"); // a worker method that updates the row using JDBC
e = em.find(Entity.class, Long.valueOf(1));
if (!e.getData().equals("new data")) {
    // Whoops, where did my updated data go!?!
}

As you can see in the example above, I find an Entity from the database and then update the database using JDBC. Since my database was updated, I figured I needed to search again to update my Entity, and finally I validated that the Entity was holding onto the data I *thought* it should. The last part of my logic is where I went astray.

Once an Entity is loaded by OpenJPA, it is characterized as a managed Entity. When an Entity is managed by the JPA runtime, the spec says that "Synchronization to the database does not involve a refresh of any managed entities unless the refresh operation is explicitly invoked on those entities" [2]. The more I dug into this, the more I found that OpenJPA tries to cache/optimize wherever the spec allows. If you were to create an EntityManager and then call em.find() on the same object 1000 times in a row, OpenJPA would only hit the DB once. I didn't expect that to happen, but I can swallow it now that I know that it is happening! This caching is sometimes referred to as the EntityManager L1 cache, and it is scoped to the life of an EntityManager. In short, when an EntityManager falls out of scope or is closed, the L1 cache is cleared. I'm not going to lie, this stuff is complicated and I have only discussed a small part of the picture. If you want/need more details, please see [3] for the entire description.
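To make the L1 behavior concrete, here is a minimal sketch in plain Java. It is a toy model, not OpenJPA's implementation: ToyDatabase, Item and ToyEntityManager are hypothetical names made up for illustration, and the "database" is just a map with a read counter. The point is only that a second find() for a managed instance never reaches the database, while an explicit refresh does.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a persistence context (L1 cache). Not OpenJPA's implementation. */
final class L1CacheSketch {

    /** Hypothetical stand-in for a database table mapping id to a data column. */
    static final class ToyDatabase {
        private final Map<Long, String> rows = new HashMap<>();
        int reads = 0; // number of simulated SELECTs

        void write(long id, String data) {
            rows.put(id, data);
        }

        String read(long id) {
            reads++;
            return rows.get(id);
        }
    }

    /** Hypothetical mutable entity. */
    static final class Item {
        final long id;
        String data;

        Item(long id, String data) {
            this.id = id;
            this.data = data;
        }
    }

    /** Hypothetical stand-in for an EntityManager and its persistence context. */
    static final class ToyEntityManager {
        private final ToyDatabase db;
        private final Map<Long, Item> managed = new HashMap<>();

        ToyEntityManager(ToyDatabase db) {
            this.db = db;
        }

        /** Returns the managed instance if there is one; otherwise loads the row once. */
        Item find(long id) {
            Item cached = managed.get(id);
            if (cached != null) {
                return cached; // no database access
            }
            String data = db.read(id);
            if (data == null) {
                return null; // no such row
            }
            Item loaded = new Item(id, data);
            managed.put(id, loaded);
            return loaded;
        }

        /** Re-reads the row into the managed instance, in the spirit of em.refresh(). */
        void refresh(Item item) {
            item.data = db.read(item.id);
        }
    }

    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        db.write(1L, "old data");
        ToyEntityManager em = new ToyEntityManager(db);

        Item e = em.find(1L);
        db.write(1L, "new data"); // plays the role of the out-of-band JDBC update
        e = em.find(1L);
        System.out.println(e.data + ", reads=" + db.reads); // old data, reads=1

        em.refresh(e);
        System.out.println(e.data + ", reads=" + db.reads); // new data, reads=2
    }
}
```

In real JPA code the equivalent of the last step would be an explicit em.refresh(e) on the managed Entity, which is exactly the operation the spec quote above calls out.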

One thing to note: everything above is about the EntityManager L1 cache, which is defined by the spec. It shouldn't be confused with the L2 cache, which is covered next.

The javax.persistence.Cache interface that is being introduced as part of JSR317 essentially exposes some of the functionality from org.apache.openjpa.persistence.StoreCache, which has been in existence since the early days of OpenJPA. The interface itself is not that interesting, but the results from enabling the OpenJPA data cache are pretty impressive. The OpenJPA user manual states that "This cache is designed to significantly increase performance while remaining in full compliance with the JPA standard. This means that turning on the caching option can transparently increase the performance of your application, with no changes to your code." I hate to say this, but this is a case where you can get something for free. Enabling the data cache is as simple as adding the following property to your persistence.xml.

<property name="openjpa.DataCache" value="true"/>
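To show how a shared L2 cache differs from the per-EntityManager L1 cache, here is another toy sketch in plain Java. Again, this is not how OpenJPA implements its data cache: ToyDatabase, ToyFactory and ToyEntityManager are hypothetical names, and the contains/evict methods only loosely mirror the methods of the same name on javax.persistence.Cache. The assumption being modeled is simply that the L2 cache lives with the factory, so a brand new EntityManager can be served without a database read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of an L2 cache shared by all EntityManagers of one factory. */
final class L2CacheSketch {

    /** Hypothetical stand-in for a database table mapping id to a data column. */
    static final class ToyDatabase {
        private final Map<Long, String> rows = new HashMap<>();
        int reads = 0; // number of simulated SELECTs

        void write(long id, String data) {
            rows.put(id, data);
        }

        String read(long id) {
            reads++;
            return rows.get(id);
        }
    }

    /** Hypothetical stand-in for an EntityManagerFactory that owns the L2 cache. */
    static final class ToyFactory {
        private final ToyDatabase db;
        private final boolean dataCacheEnabled;
        private final Map<Long, String> dataCache = new ConcurrentHashMap<>();

        ToyFactory(ToyDatabase db, boolean dataCacheEnabled) {
            this.db = db;
            this.dataCacheEnabled = dataCacheEnabled;
        }

        ToyEntityManager createEntityManager() {
            return new ToyEntityManager(this);
        }

        boolean contains(long id) {
            return dataCache.containsKey(id);
        }

        void evict(long id) {
            dataCache.remove(id);
        }
    }

    /** Hypothetical stand-in for an EntityManager with its own L1 cache. */
    static final class ToyEntityManager {
        private final ToyFactory factory;
        private final Map<Long, String> l1 = new HashMap<>();

        ToyEntityManager(ToyFactory factory) {
            this.factory = factory;
        }

        /** Lookup order: L1 first, then L2 (if enabled), then the database. */
        String find(long id) {
            String data = l1.get(id);
            if (data != null) {
                return data;
            }
            if (factory.dataCacheEnabled) {
                data = factory.dataCache.get(id);
            }
            if (data == null) {
                data = factory.db.read(id);
                if (data != null && factory.dataCacheEnabled) {
                    factory.dataCache.put(id, data);
                }
            }
            if (data != null) {
                l1.put(id, data);
            }
            return data;
        }
    }

    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        db.write(1L, "row");
        ToyFactory emf = new ToyFactory(db, true);

        emf.createEntityManager().find(1L); // misses L1 and L2, reads the database
        emf.createEntityManager().find(1L); // new L1, but served from L2
        System.out.println("reads=" + db.reads); // reads=1
    }
}
```

Keep in mind that a cache like this knows nothing about changes made behind its back, so an out-of-band JDBC update like the one in my first example can leave stale data in the L2 cache as well until the entry is evicted or expires.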

For more information regarding the OpenJPA data cache see [4].
-Rick

JSR317 download page.
[1] See 6.10 of JSR317.
[2] See 3.2.4 of JSR317.
[3] Chapter 6 of JSR317.


Brian said...

I am using IBM RAD 7.5.x with IBM WAS 7.x, so I am relying on the OpenJPA provided by the WAS 7.x runtime. I suppose it is OpenJPA 2.x.

I have the configuration below:

<property name="openjpa.DataCache" value="true(EnableStatistics=true)"/>
<property name="openjpa.DataCache" value="true(CacheSize=50000, SoftReferenceSize=0)"/>
<property name="openjpa.QueryCache" value="true(CacheSize=1000, SoftReferenceSize=0)"/>
<property name="openjpa.QuerySQLCache" value="true"/>

Entity Code Snippet

@DataCache(timeout=3600000) // 1hr

public class LookupCodes implements Serializable {

With the above configuration I can see that caching is enabled and working, and I can also see caching-related log entries in my WAS server log files if I enable OpenJPA logging.

The problem is that I am unable to print cache statistics using the code below:

EntityManagerFactory emf = (EntityManagerFactoryImpl)Persistence.createEntityManagerFactory("MyDomain");
OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
StoreCache storeCache = oemf.getStoreCache();

boolean contains = storeCache.contains(LookupCodes.class, "7"); // this statement returns "false" and I wonder why?

then I get below response:


Even though the Caching is working and enabled, why am I not able to print the Cache statistics?

Kindly request you to clarify.

mikedd said...

Hi Brian,

WAS 7.x (and RAD 7.5.x AFAIK) come with OpenJPA 1.2 which doesn't have the CacheStatistics interface.

You can install the JPA 2.0 feature pack for WAS 7.0 - which includes OpenJPA 2.0.

Originally I thought you could also install OpenJPA 2.0 as a third party provider, but it turns out there are other issues with that approach. You'll want to stick with the feature pack if you need cache statistics.

Hope this helps

Brian said...

Thank you Mike for your time.

Chintan Parekh said...

I am trying to enable logging for OpenJPA in WAS 7.0 but it's not working.

Here is my configuration in persistence.xml

It's not printing SQL statements or any JPA logging in the RSA console window.

Can you please tell me if I am missing something?


Rick Curtis said...

@Chintan -

Please post your question to the OpenJPA users mailing list.

Sorin C said...


From the article:

If you were to create an EntityManager and then call em.find() on the same object 1000 times in a row, OpenJPA would only hit the DB once. I didn't expect that to happen, but I can swallow it now that I know that it is happening! This caching is sometimes referred to as the EntityManager L1 cache and it is scoped to the life of an EntityManager. In short, when an EntityManager falls out of scope or is closed, the L1 cache is cleared.

From the above fragment I understand that even if the data in the DB pertaining to the entity bean was modified in the meantime, each call to find() will still return the old values. Is my understanding correct? If yes, what is the best way to make sure that when find() is called, the returned values are the ones from the DB?

Kevin Sutter said...

Hi Sorin,
As long as a given Entity instance is managed by the Persistence Context (L1 cache), that instance will be returned by em.find(), even if an external application has modified the data in the backend database.

But, if you attempted to modify and commit this Entity instance, then the changes in the database would be detected and you'd end up with an OptimisticLockException.

To force the re-loading of the Entity instance, you first need to remove that instance from the L1 cache. The easiest way to do this is to call em.clear(). This removes all Entity instances from the Persistence Context and allows you to start fresh.

If that is too much, then you can also remove the specific entity instance via the em.detach() method. Since JPA 2.0, this method is part of the standard JPA API. Prior to JPA 2.0, you would have to use the OpenJPA-specific em.detach method. Details on this can be found here [1].

Hope this helps,

