Thursday, January 22, 2009

JPA Connection Pooling

OpenJPA does not do any connection pooling out of the box, but that is no problem if you run OpenJPA on WebSphere: the application server provides production-quality connection pooling. That is great if your application runs on WebSphere Application Server (WAS), not so great otherwise. Since I haven't found a *good* reason why anyone would run without connection pooling, it seems like a good topic for a first post.

Modifying an existing application to use connection pooling is straightforward. First, download the Apache Commons Pool and DBCP components and add them to your application classpath. Next, update the application's database connection properties. Below is an example OpenJPA configuration using DBCP; adjust the connection info if you are using a different database. Note that these properties are set here in the META-INF/persistence.xml file, but you can also set them as JVM system properties.

<property name="openjpa.ConnectionDriverName" value="org.apache.commons.dbcp.BasicDataSource"/>
<property name="openjpa.ConnectionProperties" value="DriverClassName=com.mysql.jdbc.Driver,Url=jdbc:mysql://localhost/test,Username=root,Password=password"/>
Configuring OpenJPA to use connection pooling is as simple as that. I created a small Java SE application for this post that uses both pooled and non-pooled connections to run 1000 selects against an empty table. Even though the example is very simple, the results are pretty impressive:
  • 1000 selects with no connection pooling took ~23.469 seconds.
  • 1000 selects with connection pooling took ~1.234 seconds.
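
For applications that build their configuration in code, the two properties from the persistence.xml example above can be assembled as a plain map. This is only a sketch: the class and method names are made up for illustration, and the values are the ones from the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper (hypothetical name): builds the two properties shown above as a map. */
final class PooledConnectionProperties {

    private PooledConnectionProperties() {}

    /** Joins the DBCP BasicDataSource settings into the comma-separated form used in the example. */
    static String connectionProperties(String driver, String url, String user, String password) {
        return "DriverClassName=" + driver
                + ",Url=" + url
                + ",Username=" + user
                + ",Password=" + password;
    }

    /** Same keys and values as the persistence.xml snippet, with the MySQL driver from the post. */
    static Map<String, String> forMySql(String url, String user, String password) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("openjpa.ConnectionDriverName", "org.apache.commons.dbcp.BasicDataSource");
        props.put("openjpa.ConnectionProperties",
                connectionProperties("com.mysql.jdbc.Driver", url, user, password));
        return props;
    }
}
```

A map like this can be handed to the standard javax.persistence.Persistence.createEntityManagerFactory(String, Map) call, which takes properties alongside those in persistence.xml. The persistence unit name you pass is whatever your own application defines.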
-Rick

Tuesday, January 13, 2009

Auditing with OpenJPA

A transactional application often needs to audit its persistent entities. In this brief post, I present a small code example that outlines how an application can audit changes via the life cycle callback methods. The JPA specification prescribes how an entity, or a separate stateless entity listener instance, can receive callback notifications during the various life cycle state transitions of an entity. Seven transition events are defined: PrePersist, PostPersist, PreRemove, PostRemove, PreUpdate, PostUpdate and PostLoad. To receive these callbacks, annotate the appropriate methods either a) on the entity class itself or b) on a separate entity listener class. A callback method on the entity class itself should have the signature

void <METHOD>()
whereas a callback method defined on an entity listener class has the signature
void <METHOD>(Object pc)
In the latter case, the managed entity is passed as the argument.

Given this basic framework, how can we implement an audit facility so that an application can track any update made to a persistent entity? Of course, to compute this change, the application must be able to access the entity state before and after the update. The central question is:


how can an application access the previous state of an entity inside a callback method?

The answer lies in the fact that OpenJPA can store the state of an entity as it enters the managed lifecycle. OpenJPA uses this internal copy when a transaction is rolled back: the state of the entities participating in the failed transaction is restored to the original state, i.e. as it was before the transaction started. The simple example presented here shows how to access that copy. Of course, a real audit facility has to figure out how to compute the difference between the current and previous state of an entity and where to record those differences.

Let us consider a simple persistent class, audit.PObject.
1: package audit;
2:
3: import javax.persistence.Entity;
4: import javax.persistence.Id;
5: import javax.persistence.PostUpdate;
6:
7: import org.apache.openjpa.enhance.PersistenceCapable;
8: import org.apache.openjpa.kernel.SaveFieldManager;
9: import org.apache.openjpa.kernel.StateManagerImpl;
10:
11: @Entity
12: public class PObject {
13: @Id
14: private long id;
15:
16: private String name;
17:
18: public long getId() {
19: return id;
20: }
21:
22: public void setId(long id) {
23: this.id = id;
24: }
25:
26: public String getName() {
27: return name;
28: }
29:
30: public void setName(String name) {
31: this.name = name;
32: }
33:
34: @PostUpdate
35: public void audit() {
36: PersistenceCapable currentState = (PersistenceCapable)this;
37: StateManagerImpl sm = (StateManagerImpl)currentState.pcGetStateManager();
38: SaveFieldManager sfm = sm.getSaveFieldManager();
39: PersistenceCapable oldState = sfm.getState();
40: PObject old = (PObject)oldState;
41:
42: System.err.println("old : " + old);
43: System.err.println("current : " + this);
44: }
45:
46: public String toString() {
47: return this.getClass().getName()+"@"
48: + Integer.toHexString(System.identityHashCode(this))
49: + "[" + name +"]";
50: }
51: }
The meat is in the audit() method, annotated with @PostUpdate to receive the callback notification. Within the method body, the following steps are carried out to access the previous state of the instance.

Line 36: Cast this instance to the org.apache.openjpa.enhance.PersistenceCapable interface. The cast is safe because OpenJPA ensures that every persistent class implements PersistenceCapable: OpenJPA actually modifies the original bytecode of the audit.PObject class. This bytecode modification process is called enhancement and is described in detail in the OpenJPA documentation.

Line 37: Get the StateManager. Every PersistenceCapable instance is managed by a StateManager. It is the StateManager that intercepts every access and mutation of entity state and calls the underlying OpenJPA infrastructure to load or store the state of an entity, which in turn may access the database via JDBC.

Line 38: Get the SaveFieldManager. A StateManager may refer to a SaveFieldManager, to which it delegates the task of maintaining a copy of the managed instance.

Line 39: Get the old state. The old state is maintained by the SaveFieldManager.

Line 40: Cast it back to audit.PObject. This instance holds the state of the PObject as it entered the transaction.

At this point, a real audit application would probably do some sort of state comparison to determine what has changed between the current and the previous state of this instance. I am just printing their values on the console.
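
As a rough idea of what such a state comparison could look like, here is a minimal sketch that uses plain reflection to list the fields whose values differ between two instances of the same class. It is not part of OpenJPA; StateDiff and SamplePObject are hypothetical names, and SamplePObject is a plain stand-in for audit.PObject so the snippet runs without a JPA provider. Static and transient fields are skipped on the assumption that bookkeeping fields added by enhancement should not show up in an audit record.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical helper: compares two instances of the same class field by field. */
final class StateDiff {

    private StateDiff() {}

    /** Returns fieldName -> "old -> new" for every declared field whose value differs. */
    static Map<String, String> changedFields(Object oldState, Object newState) {
        if (oldState.getClass() != newState.getClass()) {
            throw new IllegalArgumentException("both states must be of the same class");
        }
        Map<String, String> changes = new TreeMap<>(); // sorted for stable output
        for (Field field : oldState.getClass().getDeclaredFields()) {
            int mods = field.getModifiers();
            // Assumption: static, transient and synthetic fields are not persistent state.
            if (field.isSynthetic() || Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
                continue;
            }
            try {
                field.setAccessible(true);
                Object before = field.get(oldState);
                Object after = field.get(newState);
                if (!Objects.equals(before, after)) {
                    changes.put(field.getName(), before + " -> " + after);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + field.getName(), e);
            }
        }
        return changes;
    }
}

/** Plain stand-in for audit.PObject (same two fields, no JPA annotations). */
final class SamplePObject {
    private final long id;
    private final String name;

    SamplePObject(long id, String name) {
        this.id = id;
        this.name = name;
    }
}
```

Calling StateDiff.changedFields(new SamplePObject(1001, "X"), new SamplePObject(1001, "Y")) yields a map that prints as {name=X -> Y}. Inside the audit() callback the two arguments would be the old and the current instance. The sketch only looks at fields declared on the class itself, so an entity hierarchy would need to walk the superclasses as well.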

Now let us write a simple "Hello JPA World" style application to check that the callback is received. I am omitting the scaffolding code that obtains a JPA EntityManagerFactory and so on, and just listing the transactional method.


    public void testPostPersistCallback() {
        // First transaction: insert a new PObject named "X".
        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();
        PObject pc = new PObject();
        pc.setId(1001);
        pc.setName("X");
        em.persist(pc);
        em.getTransaction().commit();
        Object pid = pc.getId();
        em.clear();

        // Second transaction: load the instance and rename it to "Y".
        // The commit issues the UPDATE and triggers the @PostUpdate callback.
        em = emf.createEntityManager();
        em.getTransaction().begin();
        PObject audit = em.find(PObject.class, pid);
        audit.setName("Y");
        em.getTransaction().commit();
    }
When this code runs on my laptop against a MySQL database, this is what it prints (hand-edited to show only the generated SQL and the System.err.println() output):
203  test  INFO   [main] openjpa.jdbc.JDBC - Using dictionary class "org.apache.openjpa.jdbc.sql.MySQLDictionary".
1078 test TRACE [main] openjpa.jdbc.SQL - <t 21933769, conn 11875256> executing prepstmnt 7245716 INSERT INTO PObject (id, name) VALUES (?, ?) [params=(long) 1001, (String) X]
1156 test TRACE [main] openjpa.jdbc.SQL - <t 21933769, conn 23978087> executing prepstmnt 1440568 SELECT t0.name FROM PObject t0 WHERE t0.id = ? [params=(long) 1001]
1172 test TRACE [main] openjpa.jdbc.SQL - <t 21933769, conn 14837200> executing prepstmnt 30702379 UPDATE PObject SET name = ? WHERE id = ? [params=(String) Y, (long) 1001]

old : audit.PObject@1581e80[X]
current : audit.PObject@3a9d95[Y]
As the last two lines of the log output show, the audit() callback has been invoked and the method has printed the state of the instance as it was before its name was changed from "X" to "Y".
Two critical points before we close:
  • This technique only works with build-time enhancement.
  • OpenJPA must be configured to record the state of persistent entities participating in a transaction. This is specified by the following property:
            <property name="openjpa.RestoreState" value="all"/>
 
 

Tuesday, October 28, 2008

Persisting an Array with OpenJPA

One of the key features of JPA is the ability to add metadata annotations to a POJO so that it can be used interchangeably in persistence-enabled and non-persistent environments.  The JPA 1.0 specification provides persistence metadata for most of the common data structures and object/entity relationships.  The specification clearly states that collection-valued persistent fields and properties must be defined as one of: java.util.Collection, java.util.Set, java.util.List, or java.util.Map.  Notice there is no mention of the standard array type (e.g. MyObject[]).  That means if your POJO contains a field that is an array, you'll typically need to either wrap that array with one of the collection classes above, tag it as a relationship (most likely a OneToMany) and expose that collection type as a persistent property, or change the field to use one of the supported collection classes.  While wrapping an array is fairly simple, a) it is a logic change (vs. adding metadata), which may be undesirable, and b) it can have performance impacts due to copying references from one structure to another.
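
To make the cost of wrapping concrete, here is a minimal plain-Java sketch of the extra accessor pair such a POJO ends up with. WrappedDeck is a hypothetical class, cards are reduced to strings to keep it short, and the JPA annotations are left out so it runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical POJO: keeps its array, but exposes a List view for the persistence layer. */
final class WrappedDeck {

    // The field the rest of the application works with.
    private String[] cards = new String[0];

    String[] getCards() {
        return cards;
    }

    void setCards(String[] cards) {
        this.cards = cards;
    }

    // Collection-typed property of the kind a JPA 1.0 mapping would require.
    // Every read copies all references into a new list.
    List<String> getCardList() {
        return new ArrayList<>(Arrays.asList(cards));
    }

    // Every write copies the references back into a new array.
    void setCardList(List<String> cardList) {
        this.cards = cardList.toArray(new String[0]);
    }
}
```

The two extra methods are the logic change, and the copies in each direction are the performance concern mentioned above.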

If you are like me and that doesn't sit well with you, OpenJPA provides an extension, @org.apache.openjpa.persistence.PersistentCollection, which can be used to persist an array.  OpenJPA does this by using a separate "container table" to store the array elements.  If maintaining the order of the array is also important (which is typically the case), OpenJPA provides the @org.apache.openjpa.persistence.jdbc.OrderColumn extension.  OrderColumn specifies the column in the container table that will be used to store the index of the array entries.  If you don't like the default container table, you can also customize it using the @org.apache.openjpa.persistence.jdbc.ContainerTable extension.

Here's a bit of incomplete code which uses PersistentCollection and OrderColumn to persist a deck of cards stored as an array.  Also, note that PersistentCollection supports cascade and fetch configuration, similar to the JPA relationship annotations.

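import javax.persistence.Basic;
import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.FetchType;

import org.apache.openjpa.persistence.PersistentCollection;
import org.apache.openjpa.persistence.jdbc.OrderColumn;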

@Entity
public class Deck {
    @PersistentCollection(elementCascade=CascadeType.PERSIST,
            fetch=FetchType.EAGER)
    @OrderColumn(name="cardorder")
    private Card[] cards;
// ...
}

@Entity
public class Card {
    @Basic
    private Suit suit;  // Suit is an enum Diamonds, Spades, ...
    
    @Basic
    private Rank rank;  // Rank is an enum Two - Ace
//...
}

If you'd like full source for a simple application based on the entities above email me at techhusky@gmail.com.  For more examples search the OpenJPA unit test source code for the @PersistentCollection annotation.  You'll find that OpenJPA persistent collection support is very configurable via a host of other OpenJPA annotations and can be used to persist other collection types in addition to the array.

-Jeremy

Wednesday, September 24, 2008

Support for OpenJPA?

One of the most common questions I get is how does IBM provide support for OpenJPA? Since the WebSphere JPA solution is built on top of the Apache OpenJPA project, how do customers get fixes to problems that reside in the OpenJPA code base?

The answer is quite simple -- via a WebSphere PMR. The WebSphere JPA team is active with the Apache OpenJPA project as contributors, committers, and members of the PMC. As problems are reported through the normal WebSphere PMR process, we will address the problems just like any other WebSphere problem. It doesn't matter whether the problem resides in the WebSphere JPA code base (extensions) or the Apache OpenJPA code base. Once the problem is discovered, a fix can be integrated into the appropriate Apache OpenJPA service stream and the resolution can be delivered as an iFix or as part of the next fixpack.

Bottom line, WebSphere will support the Apache OpenJPA code base that we ship just as if we had written it ourselves and, in many cases, we have. :-)

Kevin

Saturday, September 20, 2008

Slice : OpenJPA for Distributed Persistence

Slice is now available as an integral part of the released OpenJPA 1.2.0. Slice extends OpenJPA to transact and query against distributed, horizontally-partitioned, possibly heterogeneous databases. Using OpenJPA's excellent feature derivation framework, Slice lets any existing OpenJPA-based application originally developed for a single database upgrade transparently to a configuration where data is partitioned among multiple databases, without any change to the existing application.

Data partitioning is an effective scaling strategy against growing data volume. Many data sets are naturally amenable to partitioning by geographical region (e.g. Homes in each State), by temporal interval (e.g. Orders in each Month) or by the very nature of the application, such as multi-tenant, Software-as-a-Service hosting platforms. Because the data is distributed by month or state across different databases, and Slice executes all critical database operations such as flush and commit in parallel, the scaling characteristics are determined by the size of the largest database partition rather than the size of the entire data set. Moreover, Slice supports aggregate query operations such as SUM or MAX, so that a standard JPQL query such as

select MAX(h.price) from Home h

will issue identical queries in parallel across multiple databases, each storing data on Homes in an individual state, then find the maximum of the per-database results and return that single maximum value as the result of the query.
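
The idea of combining per-partition aggregates can be sketched in plain Java. This is not how Slice is implemented. It is a toy model in which each partition is an in-memory list of prices, and PartitionedMax is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.concurrent.CompletableFuture;

/** Toy model of a partitioned MAX: one partial result per partition, then a final reduction. */
final class PartitionedMax {

    private PartitionedMax() {}

    /** Computes MAX(price) per partition in parallel and returns the largest partial result. */
    static OptionalDouble maxPrice(Map<String, List<Double>> pricesByPartition) {
        List<CompletableFuture<OptionalDouble>> partials = new ArrayList<>();
        for (List<Double> prices : pricesByPartition.values()) {
            // Stands in for sending the same query to one database partition.
            partials.add(CompletableFuture.supplyAsync(
                    () -> prices.stream().mapToDouble(Double::doubleValue).max()));
        }
        // Wait for every partition, drop empty ones, and take the maximum of the maxima.
        return partials.stream()
                .map(CompletableFuture::join)
                .filter(OptionalDouble::isPresent)
                .mapToDouble(OptionalDouble::getAsDouble)
                .max();
    }
}
```

MAX and SUM combine cleanly this way because the aggregate of the partial results equals the aggregate over the whole data set. An aggregate such as AVG would need more than the partial averages (for example, per-partition sums and counts).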

But what about newly created instances? Which database partition will store a new record? This is, of course, specified by the application itself by implementing a single method of the DistributionPolicy interface. The contract is simple: given a new persistent object as the input argument, the method should return the name of the database partition. When Slice encounters a new object during commit, it calls the user-defined DistributionPolicy implementation and stores the new record in the appropriate partition. Slice also tracks the database origin of each persistent instance as it is loaded from a database partition. So when the application modifies an instance in a transaction and commits, Slice knows exactly which database partition will receive the update.
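
The contract described above can be illustrated with a small stand-alone sketch. PartitionChooser below is a hypothetical stand-in interface written for this example, not the Slice DistributionPolicy interface itself, whose exact method signature should be taken from the OpenJPA documentation. SampleHome and HomeByStatePolicy are likewise made-up names.

```java
import java.util.List;

/** Hypothetical stand-in for the contract: new object in, partition name out. */
interface PartitionChooser {
    String choosePartition(Object newInstance, List<String> partitionNames);
}

/** Plain stand-in for a Home entity partitioned by state. */
final class SampleHome {
    final String state;
    final double price;

    SampleHome(String state, double price) {
        this.state = state;
        this.price = price;
    }
}

/** Routes each new home to the partition named after its state. */
final class HomeByStatePolicy implements PartitionChooser {

    @Override
    public String choosePartition(Object newInstance, List<String> partitionNames) {
        if (newInstance instanceof SampleHome) {
            String state = ((SampleHome) newInstance).state;
            if (partitionNames.contains(state)) {
                return state;
            }
        }
        // Assumption for this sketch: anything unrecognized goes to the first partition.
        return partitionNames.get(0);
    }
}
```

With partitions named "Ohio" and "Texas", a new SampleHome in "Texas" is routed to the "Texas" partition. The fallback to the first partition is an arbitrary choice, and a real policy might prefer to reject objects it cannot place.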

Friday, September 19, 2008

Welcome!

As the page header indicates, this blog will be used to discuss Java Persistence as it relates to the WebSphere Application Server. Specifically, we will be focusing on the Java Persistence API (JPA) and WebSphere's JPA solution. The WebSphere JPA solution is built on top of the Apache OpenJPA project. All of the authors on this blog are active in the development of the complete WebSphere JPA solution (including OpenJPA). But we will adhere to the standard disclaimer that the views posted will be from us as individuals. :-)

Thanks for visiting and watch for new postings very soon.