Sunday, February 22, 2009

Dynamic Fetch Planning

Now I have heard this before:

    "Hibernate can do X, can your thing do X?"

I mumble in a me-too manner. Because Hibernate, as the world knows, is 42 -- the answer to the eternal question. Also I have seen many postings in OpenJPA's own mailing list that can be paraphrased as: "We are migrating our Hibernate application to OpenJPA, but we could do X with Hibernate, OpenJPA does not seem to be able to do X". Or a slight variation on the theme: "This test fails with OpenJPA. But it passes with Hibernate".

As a contributor to OpenJPA, this implied second-class status irks me at times (I also wonder: why are they migrating away from Hibernate anyway?). So when the following email from an unknown sender reached my mailbox -- I smiled. The mail says:

"We are working in investment banking's IT sector. We have a requirement X that Hibernate can not support. Can OpenJPA do X?"

Firstly, I was happy to note that OpenJPA can do X very well.

Secondly, the proposal to include feature X in the JPA 2.0 specification was simply ignored.

Thirdly, investment banking's IT is the bellwether of enterprise technology trends and demands.

I must admit that I do not know Hibernate deeply enough to ascertain whether it can or cannot do X -- I am just going by the sender's comments.

So what is this feature X? It is called Dynamic FetchPlan.

The conventional wisdom is to mix the fetch plan with the query. This wisdom is rooted in the history of SQL -- where projection and selection criteria appear together in an all-encompassing SQL statement (I have heard stories of a single SQL statement being 2000 lines long). JPA took the same route -- its query language JPQL introduced FETCH JOIN, which lets you specify which instances (that are not directly part of the projected result) will be brought into memory as a side-effect of query execution.

While OpenJPA supports fetch join as per the JPA specification, it prefers to separate these two concerns, namely: i) what is selected and ii) what is fetched. Because the fetch plan is specified independently, you can issue a simple EntityManager.find(Company.class, "acme.org") -- and as a result the persistence context may be populated either with a single Company instance or with all the Departments and all the Employees of those Departments with their Addresses and Spouses' names. It all depends on which fetch plan is active when you invoke EntityManager.find(). At one end of the spectrum, you can use the basic default fetch plan, which fetches the fields of primitive types but no relations. At the other end, you can be as creative as you wish in specifying a sub-graph of the complete closure starting from a root Company instance. And you can specify these fetch plans at design time, then use them and edit them at runtime.
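
To make the separation of "what is selected" from "what is fetched" a little more concrete, here is a toy in-memory sketch. It is not OpenJPA's API: the class FetchPlanSketch, its nested Row and FetchPlan types, and the sample data are all hypothetical names made up for illustration. The only point it tries to show is that the same find() call can load a different sub-graph depending on the plan that is active.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy in-memory model of a fetch plan. Illustration only: this is NOT OpenJPA's API. */
class FetchPlanSketch {

    /** A stored record: an id plus named relations pointing at other record ids. */
    static final class Row {
        final String id;
        final Map<String, List<String>> relations = new HashMap<>();

        Row(String id) { this.id = id; }

        Row relate(String field, String... targetIds) {
            relations.put(field, Arrays.asList(targetIds));
            return this;
        }
    }

    /** Hypothetical fetch plan: the set of relation names to traverse while loading. */
    static final class FetchPlan {
        private final Set<String> fields = new HashSet<>();

        FetchPlan addField(String name) { fields.add(name); return this; }
        FetchPlan removeField(String name) { fields.remove(name); return this; }
        boolean includes(String name) { return fields.contains(name); }
    }

    private final Map<String, Row> store = new HashMap<>();

    void save(Row row) { store.put(row.id, row); }

    /** The selection is always just the root id; the plan alone decides what else is loaded. */
    Set<String> find(String rootId, FetchPlan plan) {
        Set<String> loaded = new LinkedHashSet<>();
        load(rootId, plan, loaded);
        return loaded;
    }

    private void load(String id, FetchPlan plan, Set<String> loaded) {
        Row row = store.get(id);
        if (row == null || !loaded.add(id)) {
            return; // unknown id, or already loaded (guards against cycles)
        }
        for (Map.Entry<String, List<String>> relation : row.relations.entrySet()) {
            if (plan.includes(relation.getKey())) {
                for (String targetId : relation.getValue()) {
                    load(targetId, plan, loaded);
                }
            }
        }
    }

    /** Made-up sample data mirroring the Company/Department/Employee example. */
    static FetchPlanSketch sample() {
        FetchPlanSketch db = new FetchPlanSketch();
        db.save(new Row("acme.org").relate("departments", "sales", "eng"));
        db.save(new Row("sales").relate("employees", "alice"));
        db.save(new Row("eng").relate("employees", "bob"));
        db.save(new Row("alice").relate("address", "alice-home"));
        db.save(new Row("bob"));
        db.save(new Row("alice-home"));
        return db;
    }

    public static void main(String[] args) {
        FetchPlanSketch db = sample();
        FetchPlan plan = new FetchPlan();
        System.out.println(db.find("acme.org", plan)); // default plan: root only
        plan.addField("departments").addField("employees");
        System.out.println(db.find("acme.org", plan)); // same call, larger sub-graph
    }
}
```

Notice that the find() call never changes; only the plan does. That is the design choice in miniature: the caller who asks for a Company does not have to know, or encode in a query, how much of the graph the current use case needs.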

But why should you care to carve out a sub-graph from a root entity? Why does OpenJPA use the fetch plan as a pervasive notion throughout its internal design? Why might the question even be worth pondering?

My take on it will take a separate post -- more importantly, the Oscars are starting on TV....

Thursday, February 12, 2009

OpenJPA Enhancement

What is Enhancement Anyway?

I have cruised around the OpenJPA forums enough to notice that numerous new developers have posted questions regarding enhancement. Since I'm new to OpenJPA myself, I figured I'd dive into enhancement and see what all the fuss is about. This post is directed mainly at developers who are taking a first look at OpenJPA and having problems getting started. No worries for those of you deploying an application to run in a Java EE 5 compliant application server such as WebSphere Application Server, because your Entities will be automatically enhanced at runtime.

As I was exploring how to enhance, I started to wonder: why do I need to enhance at all? As it turns out, the JPA spec requires some type of monitoring of Entity objects, but the spec does not define how to implement this monitoring. Some JPA providers auto-generate new subclasses or proxy objects that front the user's Entity objects, while others like OpenJPA use byte-code weaving technologies to enhance the actual Entity class objects. Now that we've got that out of the way, let's get back to how to do the enhancement. I'm going to give runtime examples first, as I feel those are the easiest from a developer's point of view.

Runtime enhancement -
When running in a Java SE environment or in a container that is not Java EE 5 compliant, OpenJPA defaults to using subclassing enhancement. The subclassing enhancement support was added originally as a convenience to new developers, to reduce the amount of work needed to get a 'HelloWorld-ish' OpenJPA application working out of the box. It was never meant to run in production. So you're probably thinking that this sounds great! OpenJPA handles enhancement automatically for me and I can stop reading this post. Wrong! Subclassing has two major drawbacks. First, it isn't nearly as fast as byte-code enhancement. Second, there are some documented functional problems when using the subclassing support. The moral of the story is: don't use this method of enhancement. I'm not the only one making this recommendation, as I've come across forum posts about removing subclassing enhancement as the default OpenJPA behavior. Additional information regarding the subclassing enhancement can be found here.

The second and recommended way to get runtime enhancement is to provide a javaagent when launching the JVM that OpenJPA is going to run in. This is the method that I use in most of my test environments because it is painless. All that is required to get runtime enhancement in Eclipse is to specify -javaagent:[open_jpa_jar_location] as a VM argument on the Run Configuration page. That is it.

Another simple way to get runtime enhancement is to specify the -javaagent when launching an application from ant. Below is a snippet from my build.xml that shows how to pass the -javaagent when launching a Java SE application that uses OpenJPA.
    <path id="jpa.enhancement.classpath">
        <pathelement location="bin"/>
        <!-- lib contains all of the jars that came with the OpenJPA binary download -->
        <fileset dir="lib">
            <include name="**/*.jar"/>
        </fileset>
    </path>
...
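    <target name="drive" depends="clean, build">
        <echo message="Running test with run time enhancement."/>
        <java classname="main.Driver" failonerror="true" fork="yes">
            <!-- openJPA-jar is a property, assumed to be defined elsewhere in the build file, that points at the OpenJPA jar -->
            <jvmarg value="-javaagent:${openJPA-jar}"/>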
            <classpath refid="jpa.enhancement.classpath"/>
        </java>
    </target>
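
To confirm that the agent actually touched a class, one rough check is to look for the org.apache.openjpa.enhance.PersistenceCapable interface, which, to my knowledge, OpenJPA's byte-code enhancer adds to the classes it enhances. The sketch below is only that, a sketch: the class name EnhancementCheck and its methods are made up for illustration, and the interface is compared by name so that the code compiles without OpenJPA on the classpath.

```java
/** Sketch: report whether a loaded class carries OpenJPA's enhancement marker interface. */
class EnhancementCheck {

    // Interface that OpenJPA's byte-code enhancer is assumed to add to entity classes.
    static final String PC_INTERFACE = "org.apache.openjpa.enhance.PersistenceCapable";

    static boolean isEnhanced(Class<?> type) {
        return implementsInterface(type, PC_INTERFACE);
    }

    /** True if type, or any of its superclasses, implements the named interface directly or indirectly. */
    static boolean implementsInterface(Class<?> type, String interfaceName) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            if (c.getName().equals(interfaceName)) {
                return true; // reached the interface itself while walking super-interfaces
            }
            for (Class<?> iface : c.getInterfaces()) {
                if (implementsInterface(iface, interfaceName)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass the fully qualified entity class name; java.lang.String is just a placeholder default.
        Class<?> type = Class.forName(args.length > 0 ? args[0] : "java.lang.String");
        System.out.println(type.getName() + " enhanced: " + isEnhanced(type));
    }
}
```

Run it with the same -javaagent argument and classpath as the application, passing an entity class name as the argument. With the subclassing fallback I would expect the entity class itself to report false, since in that mode the user's class is not modified.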

Buildtime enhancement -
In this section I'm going to cover how to invoke the build time enhancer ant task that is packaged with OpenJPA, from both ant and maven. Yes, there are many different paths to buildtime enhancement bliss, but I'm only going to talk about using the ant task. Note that this entire section assumes you're running the ant/maven scripts from a command line. I can't say for certain how this works inside Eclipse. Warning: I'm not a maven wizard, so please be nice if I've committed some cardinal maven sin. :-)

I'll start by showing how to define an OpenJPA enhancer task and how to invoke the task in ant. I had to fumble a little to get it working, but overall it was pretty straightforward. First you'll need to compile the Entities. (Note: as a prereq to running the enhance task, I copied my persistence.xml file to my /build directory. You might not need to do this, but the persistence.xml has to be in the classpath.) Next you'll need to configure the enhancer task and a classpath where the task can be found. (Adding the classpath is the part that the OpenJPA manual missed.) The final step is to call the enhance task. A snippet is provided below.
    <path id="jpa.enhancement.classpath">
        <pathelement location="bin"/>

        <!-- lib contains all of the jars that came with the OpenJPA binary download -->
        <fileset dir="lib">
            <include name="**/*.jar"/>
        </fileset>
    </path>


    <target name="enhance" depends="build">
        <!-- This is a bit of a hack, but I needed to copy the persistence.xml file from my src dir
             to the build dir before running enhancement -->
        <copy includeemptydirs="false" todir="bin">
            <fileset dir="src" excludes="**/*.launch, **/*.java"/>
        </copy>

        <!-- define the openjpac task -->
        <taskdef name="openjpac" classname="org.apache.openjpa.ant.PCEnhancerTask">
            <classpath refid="jpa.enhancement.classpath"/>
        </taskdef>

        <!-- invoke the enhancer -->
        <openjpac>
            <classpath refid="jpa.enhancement.classpath"/>
        </openjpac>
        <echo message="Enhancing complete."/>
    </target>
The second (different) path to build time enhancement is to use the maven antrun plug-in to launch the OpenJPA enhancer task. Since I'm new to maven this road was pretty clunky, but the steps were nearly identical to running directly in ant. I'll include the parts from my pom.xml that took the most time to figure out. Again, I'm not sure if you will need to move the persistence.xml file to the build directory, but I did (I assume I'm not doing something right).
  <build>
    <!-- Copy the persistence.xml file to the build dir -->
    <!-- You can skip this step if you put the persistence.xml in src/main/resources/META-INF instead of src/main/java/META-INF -->
    <resources>
      <resource>
        <directory>src/main/java</directory>
        <includes>
          <include>**/*.xml</include>
        </includes>
      </resource>
    </resources>
    <plugins>
.....           
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-antrun-plugin</artifactId>
        <version>1.2</version>
        <executions>
          <execution>
            <phase>process-classes</phase>
            <configuration>
              <tasks>
                  <taskdef name="openjpac" classname="org.apache.openjpa.ant.PCEnhancerTask" classpathref="maven.compile.classpath"/>
                  <openjpac>
                      <classpath refid="maven.compile.classpath"/>
                  </openjpac>
              </tasks>
            </configuration>
            <goals>
              <goal>run</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
....
  </build>
Hopefully this helps someone out there! Feel free to contact me if you have any questions/comments/suggestions.

-Rick

Tuesday, February 3, 2009

The problem with JPA and Java persistence in general...

Hi,
I just came across another interesting article from a colleague of mine (Billy Newport). Billy has a very active blog on lots of different topics related to Java Development (and a few items not related to programming in the least). But, this one caught my eye due to the title, "The problem with JPA and Java persistence in general..." Surprised by my attention? :-)

I'm not going to try to steal Billy's thunder. If you are interested in debating Billy's viewpoints, you are more than welcome to do so via his blog. I just thought that the content may be of interest to the readers of this blog, so I thought I would cross-post it.

But... I would like to comment that I think the Java EE community has made some great strides in the area of persistence. Not all of the attempts have been welcomed with open arms (e.g., CMP beans), but we really seem to be getting acceptance with the JPA approach. There are several fully-functional implementations of the JPA 1.0 specification, and the JPA 2.0 expert group is extremely active finalizing the next revision of the spec.

So, let me just ask for general comments or suggestions for a Java persistence solution. Is JPA satisfying your persistence needs? What about WebSphere's JPA solution based on OpenJPA? If you are not using it, why not? And, what are you using instead? CMP beans? Straight JDBC? Hibernate? Something else?

Thanks,
Kevin



Monday, February 2, 2009

OpenJPA and pureQuery

There's a hidden gem in WebSphere V7 that many of you may not know about. If you're looking for a way to improve performance of database-intensive applications that access DB2, you should probably take a look, especially if you're using z/OS and are looking to drive down CPU costs.

In WebSphere Application Server V7, we deliver an enhanced Java Persistence API (JPA) implementation that supports static SQL access to DB2. This may not mean much to you, but your DB2 DBAs will certainly understand the implications. In general, Java database access, whether through JDBC or OpenJPA, is all dynamic SQL. From a security and performance perspective, static SQL can be better. For some applications, it can be significantly better.

For those of you who have never heard of static SQL, Figure 1 says it all - static SQL is basically preprocessed so all the database access processing does not have to happen when real users are using the application. As well as reducing processing time, it also means that performance stays more consistent. There is no up and down of performance based on whether the query is in the SQL statement cache or not.

To take advantage of this feature, you don't need to change anything about your JPA application; however, you do need to have the pureQuery Runtime. WebSphere AS provides the utility that generates the SQL for the persistence unit and for any JPA named queries. For more on how to actually do this, see this article in the WebSphere Tech journal.

You can see some performance results of using static SQL with pureQuery in this article. More background material on static SQL can also be found in this article.

To find out more about how pureQuery can work together with WebSphere to reduce costs and improve quality of service, be sure to tune in to this webcast, presented by my colleague in pureQuery-land, Steve Brodsky. You can also read his blog entry on the pureQuery and OpenJPA integration.

Enjoy,
Kevin