Thursday, May 21, 2009

Intro to OpenJPA Caching

Intro to OpenJPA Caching

A part of the JPA 2.0 spec (JSR317) defines the javax.persistence.Cache Interface [1] which exposes a providers' second level(L2) cache. While exploring this new interface, I learned a few things about OpenJPA caching that were not entirely obvious to me at the time. Hopefully I can lay out some of what I've learned to save at least one person some pain.

First off, I'm going to discuss the caching that wasn't obvious to me. Take a quick look through the semi-pseudo code that models the scenario I was running below.

EntityManager em = EntityManagerFactory.createEntityManager()
Entity e = em.findEntity(Entity.class, Long.valueOf(1))
updateEntityViaJDBC(e.getId(), "new data")// a worker method to insert data using JDBC
e = em.findEntity(Entity.class, Long.valueOf(1))
if(e.getData().equals("new data")==false){
//Whoops, where did my updated data go!?!
}

As you can see in the example above, I find an Entity from the database and then update the database using JDBC. Since my database was updated I figured I needed search again to update my Entity and finally I validated that the Entity was holding onto the data I *thought* it should. The last part of my logic is where I went astray.

Once an Entity is loaded by OpenJPA, it is characterized as a managed Entity. When an Entity is managed by the JPA runtime, the spec says that "Synchronization to the database does not involve a refresh of any managed entities unless the refresh operation is explicitly invoked on those entities" [2]. The more I dug into this one, I found that OpenJPA tries to cache/optimize where ever the spec allows. If you were to create an EntityManager and then call em.find() on the same object 1000 times in a row, OpenJPA would only hit the DB once. I didn't expect that to happen, but I can swallow it now that I know that it is happening! This caching is sometimes referred to as the EntityManager L1 cache and it is scoped to the life of an EntityManager. In short, when an EntityManger falls out of scope or is closed, the L1 cache is cleared. I'm not going to lie, this stuff is complicated and I only discussed a small part of the entire picture. If you want/need more details, please see [3] for the entire description.

One thing to note is that the previous paragraph talked only about the EntityManager L1 cache which is defined by the spec and it shouldn't be confused with the following paragraph which pertains to the L2 cache.

The javax.persistence.Cache interface that is being introduced as part of JSR317 essentially exposes some of the functionality from org.apache.openjpa.persistence.StoreCache that has been in existence since the early days of OpenJPA. The interface itself is not that interesting, but the results from enabling the OpenJPA data cache are pretty impressive. The OpenJPA user manual states that "This cache is designed to significantly increase performance while remaining in full compliance with the JPA standard. This means that turning on the caching option can transparently increase the performance of your application, with no changes to your code." I hate to say this, but it is a case of where you can get something for free. Enabling the data cache is as simple as adding the following property to your persistence.xml.

< name="openjpa.DataCache" value="true">

For more information regarding the OpenJPA data cache see [4].
-Rick

http://jcp.org/aboutJava/communityprocess/pfd/jsr317/index.html --JSR317 download page.
[1] See 6.10 of JSR317.
[2] See 3.2.4 of JSR317.
[3] Chapter 6 of JSR317.
[4] http://openjpa.apache.org/builds/1.0.2/apache-openjpa-1.0.2/docs/manual/ref_guide_caching.html