Friday, April 25, 2008

OracleAS Java Object Cache

by Fábio Souza

Before I start, I would like to say that many of the texts and observations were taken from the Oracle® Containers for J2EE Services Guide 10g (10.1.3.1.0), Chapter 7.

What is it?
The Java Object Cache (JOC) is an OC4J service that caches (in memory or on disk) any kind of Java object.
Use it:
  • To store frequently used objects.

  • To store objects that are costly to create/retrieve.

  • To share objects between applications.
Characteristics:
  • By default, cached objects are stored in memory but disk persistence can be configured.

  • By default, cached objects are local. The DISTRIBUTE mode can be configured.

  • By default, the "write lock" is disabled.

  • Cached objects don't have a "read lock".

  • Cached objects are organized into namespaces.

  • Cached objects are invalidated based on time or on an explicit request (see the sketch after this list).

  • Cached objects can be invalidated by group or individually.

  • Each cached Java object has a set of associated attributes that control how the object is loaded into the cache, where the object is stored, and how the object is invalidated.

  • When an object is invalidated or updated, the invalid version of the object remains in the cache as long as there are references to that particular version of the object.
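
As a small illustration of the last few bullets (this sketch is mine, not from the Services Guide), the method below sets a time-to-live attribute and then invalidates an object explicitly. It assumes a region named "LocalRegion" has already been defined (region definition is shown in the defineRegions() example further down) and that the oracle.ias.cache classes are imported; the exact semantics of setTimeToLive() and invalidate() should be checked against the Attributes and CacheAccess javadoc.

public void invalidationSketch(Object report) {
   CacheAccess access = null;
   try {
      // Accessing a region that is assumed to be already defined.
      access = CacheAccess.getAccess("LocalRegion");

      // Attribute-driven invalidation: the object expires after its time-to-live.
      Attributes ttlAttr = new Attributes();
      ttlAttr.setTimeToLive(300); // check the unit (seconds vs. milliseconds) in the javadoc
      access.put("DailyReport", ttlAttr, report);

      // Explicit, individual invalidation on request.
      access.invalidate("DailyReport");
   } catch (Exception e) {
      e.printStackTrace();
   } finally {
      if (access != null) {
         access.close();
      }
   }
}
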
The Java Object Cache organization:
  • Cache Environment. The cache environment includes cache regions, subregions, groups, and attributes. Cache regions, subregions, and groups associate objects and collections of objects. Attributes are associated with cache regions, subregions, groups, and individual objects. Attributes affect how the Java Object Cache manages objects.

  • Cache Object Types. The cache object types include memory objects, disk objects, pooled objects, and StreamAccess objects.

    The Services Guide (Chapter 7) describes the cache environment and each object type in more detail.
JOC API:
Only "cache.jar" must be added to application's classpath to start with JOC API. This archive is located in $ORACLE_HOME\javacache\lib\cache.jar. A project in JDeveloper just needs to import the built-in "Java Cache" library.
Distributed Cache Characteristics:
  • Cache management is not centralized. Cache updates and invalidations are propagated to all application server nodes.

  • The JOC configuration of each OAS node is not propagated to the other nodes.

  • Distributed objects must use the same namespace on each node.
Distributed Cache Configurations:
There are two ways to configure JOC: programmatically (using the oracle.ias.cache.Cache class) and through the javacache.xml file. This post only covers javacache.xml.

Configuring the javacache.xml file:

The javacache.xml location is specified in the server.xml OC4J configuration file (in the "<javacache-config>" tag). To work with javacache.xml, it is necessary to start the container with the -Doracle.ias.jcache=true system property.
To run JOC in "distributed mode", the <communication> element must be configured in javacache.xml as in the example below:

<communication>
   <!-- "isdistributed" must be "true" for JOC to work with DISTRIBUTE-marked objects -->
   <isdistributed>true</isdistributed>
   <!-- Each JOC node must be listed with its own "discoverer" element. -->
   <discoverer ip="192.168.0.2" port="7000"/>
   <discoverer ip="192.168.0.3" port="7000"/>
</communication>

Configuring the distributed cached objects:
There are three important attributes that we can use when configuring objects:
  • DISTRIBUTE: used to mark objects that will be shared between applications.

  • SYNCHRONIZE: used to allow "write locking" on cached objects.

  • SYNCHRONIZE_DEFAULT: has the same effect as SYNCHRONIZE, but it marks a whole region or group; when a region or group is marked, all of its objects are allowed to use the write lock.
The example below shows how a distributed cache is used:

public void defineRegions() {
   // Creating attributes
   Attributes remoteRegionAttr = new Attributes();

   /*
    * Extracted from Attributes.setFlags() javadoc:
    * specifies which of the listed attributes should be set in the Attributes object.
    * The flags may be OR'ed together, i.e., Attributes.DISTRIBUTE | Attributes.SPOOL.
    * Any previous settings will be disregard.
    */
   remoteRegionAttr.setFlags(Attributes.SYNCHRONIZE_DEFAULT | Attributes.DISTRIBUTE);

   try {
      // Creates the "RemoteRegion" with the specified attributes.
      CacheAccess.defineRegion("RemoteRegion", remoteRegionAttr);
   } catch (Exception e) {
      e.printStackTrace();
   }
}

public void createCachedObject(Object object) {
   CacheAccess access = null;
   try {
      // Accessing the created region.
      access = CacheAccess.getAccess("RemoteRegion");
      // Acquiring the cached object's write lock (5-second timeout)
      access.getOwnership("CachedObject", 5000);
      // Caching the object
      access.put("CachedObject", object);
      // Releasing the cached object's write lock (5-second timeout)
      access.releaseOwnership("CachedObject", 5000);
   } catch (Exception ex) {
      ex.printStackTrace();
   } finally {
      if (access != null) {
         /*
          * Extracted from CacheAccess.close() javadoc:
          * releases the resource used by current CacheAccess object.
          * Application is required to make this call when it no longer need this CacheAccess instance.
          */
         access.close();
      }
   }
}

public Object retrieveCachedObject() {
   Object object = null;
   CacheAccess access = null;
   try {
      access = CacheAccess.getAccess("RemoteRegion"); // Accessing the created region.
      /*
       * Extracted from CacheAccess.get() javadoc:
       * returns a reference to the object associated with name.
       * If the object is not currently in the cache and a loader object has been
       * registered to the name of the object, then the object will be loaded into
       * the cache. If a loader object has not been defined for this name, then, for
       * DISTRIBUTE object, the default load method will do a netSearch for the object
       * to see if the object exist in any other cache in the system; for a non
       * DISTRIBUTE object, an ObjectNotFoundException will be thrown. The object
       * returned by get is always a reference to a shared object. Get will always
       * return the latest version of the object. A CacheAccess object will only
       * maintain a reference to one cached object at any given time. If get is called
       * multiple times, the object accessed previously will be released.
       */
      object = access.get("CachedObject");
      // It's not necessary to own the write lock to read cached objects.
   } catch (Exception ex) {
      ex.printStackTrace();
   } finally {
      if (access != null) {
         /*
          * Extracted from CacheAccess.close() javadoc:
          * releases the resource used by current CacheAccess object.
          * Application is required to make this call when it no longer need this CacheAccess instance.
          */
         access.close();
      }
   }
   return object;
}
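
The get() javadoc quoted above mentions loader objects. A loader can be registered through the object's attributes so that a cache miss creates the object instead of throwing ObjectNotFoundException. The sketch below illustrates the idea; it is based on the CacheLoader mechanism described in the Services Guide, so the class and method names used here (CacheLoader.load(), Attributes.setLoader(), CacheAccess.defineObject()) should be double-checked against the javadoc of your release.

public void registerLoader() {
   CacheAccess access = null;
   try {
      access = CacheAccess.getAccess("RemoteRegion");
      Attributes loaderAttr = new Attributes();
      // The loader is called whenever "CachedObject" is requested but not found in the cache.
      loaderAttr.setLoader(new CacheLoader() {
         public Object load(Object handle, Object args) {
            // Create (or fetch from the real data source) the object's contents.
            return "Loaded on demand";
         }
      });
      access.defineObject("CachedObject", loaderAttr);
      // From now on, access.get("CachedObject") loads the object on a cache miss.
   } catch (Exception ex) {
      ex.printStackTrace();
   } finally {
      if (access != null) {
         access.close();
      }
   }
}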


Applications that are sensitive to concurrent access may need both read and write synchronization. Because JOC only offers a write lock, I had to "force" a read lock. This is the solution I found:

/**
 * This method reads and updates a cached object, forcing synchronization for both
 * reading and writing.
 * If it is executed by multiple threads, each thread will only be able to read the
 * object after another thread releases its write lock. The write lock is released
 * after a thread finishes its read and update.
 */
public void synchronizedReadUpdateCachedObject() {
   CacheAccess access = null;
   try {
      access = CacheAccess.getAccess("RemoteRegion");
      access.getOwnership("CachedObject", 5000);
      Object object = access.get("CachedObject");
      System.out.println(object.toString());
      Object newCachedObject = "New Cached Object";
      // This is the method used to update cached objects.
      access.replace("CachedObject", newCachedObject);
      access.releaseOwnership("CachedObject", 5000);
   } catch (Exception ex) {
      ex.printStackTrace();
   } finally {
      if (access != null) {
         access.close();
      }
   }
}

Declarative Cache:
The JOC offers a way to configure its regions, subregions, groups and objects using XML files. To use this feature, the "preload-file" tag must be added to javacache.xml; it points to the XML file that contains the environment definitions. More information at: http://download.oracle.com/docs/cd/B32110_01/web.1013/b28958/joc.htm#i1085809

CacheWatchUtil:
By default, the Cache Service provides the CacheWatchUtil cache monitoring utility, which can display the current caches in the system, display a list of cached objects, display cache attributes, reset the cache logger severity, dump cache contents to the log, and so on. It depends on the dms library (dms.jar). To execute it:

java -classpath $ORACLE_HOME\lib\dms.jar;$ORACLE_HOME\javacache\lib\cache.jar oracle.ias.cache.CacheWatchUtil -config=<path_to_javacache.xml>
More Oracle Application Server 10g caches (entirely taken from the OC4J documentation):
  • Oracle Application Server Web Cache. The Web Cache sits in front of the application servers (Web servers), caching their content and providing that content to Web browsers that request it. When browsers access the Web site, they send HTTP requests to the Web Cache. The Web Cache, in turn, acts as a virtual server to the application servers. If the requested content has changed, the Web Cache retrieves the new content from the application servers.

    The Web Cache is an HTTP-level cache, maintained outside the application, providing fast cache operations. It is a pure, content-based cache, capable of caching static data (such as HTML, GIF, or JPEG files) or dynamic data (such as servlet or JSP results). Given that it exists as a flat content-based cache outside the application, it cannot cache objects (such as Java objects or XML DOM—Document Object Model—objects) in a structured format. In addition, it offers relatively limited postprocessing abilities on cached data.

  • Web Object Cache. The Web Object Cache is a Web-application-level caching facility. It is an application-level cache, embedded and maintained within a Java Web application. The Web Object Cache is a hybrid cache, both Web-based and object-based. Using the Web Object Cache, applications can cache programmatically, using application programming interface (API) calls (for servlets) or custom tag libraries (for JSPs). The Web Object Cache is generally used as a complement to the Web cache. By default, the Web Object Cache uses the Java Object Cache as its repository.

    A custom tag library or API enables you to define page fragment boundaries and to capture, store, reuse, process, and manage the intermediate and partial execution results of JSP pages and servlets as cached objects. Each block can produce its own resulting cache object. The cached objects can be HTML or XML text fragments, XML DOM objects, or Java serializable objects. These objects can be cached conveniently in association with HTTP semantics. Alternatively, they can be reused outside HTTP, such as in outputting cached XML objects through Simple Mail Transfer Protocol (SMTP), Java Message Service (JMS), Advanced Queueing (AQ), or Simple Object Access Protocol (SOAP).
References:
Oracle Application Server Containers for J2EE Services Guide 10g Release 3 (10.1.3)
Oracle Application Server 10g (10.1.2) JOC Tutorial
Oracle Application Server 10g (10.1.2) JOC javadoc

Wednesday, April 23, 2008

Diagnostics beyond OracleAS Control Console

by Eduardo Rodrigues


One of the many companies recently acquired by Oracle is a small one called "Auptyma", whose founder and former CEO, Mr. Virag Saksena, was previously Director of the CRM Performance Group at Oracle. Its main contribution to Oracle's fast-growing product line was its "Java Application Monitor", which is now part of the "Oracle Fusion Middleware Management Packs" (that's where http://www.auptyma.com now takes us) and was renamed "Oracle Application Diagnostics for Java", or simply Oracle AD4J (although it's also referred to as JADE, as in Java Application Diagnostics Expert). Holy alphabet soup!

The "new" component integrates the Oracle Enterprise Manager 10g Grid Control suite but it also lives as a separate product, which is great, specially for us developers. It can be downloaded from http://www.oracle.com/technology/software/products/oem/htdocs/jade.html. The installer is very small and also very easy and intuitive.

AD4J may seem awkward at first, once you notice that it also installs and uses the old Apache JServ and a PostgreSQL database. But it's worth it!

I've tried AD4J with my OC4J 10.1.3.3 standalone and I can say it works pretty well and is certainly a much more interesting and useful diagnostics tool than the OracleAS 10g Control Console with JVM metrics enabled. My only extra work was to reinstall JDK 1.5.0_14 and reconfigure OC4J to use it (I was already on update 15), because _14 is the most recent JDK 1.5.0 update currently supported by AD4J.

Its main features include:
  • Production diagnostics with no application instrumentation, saving time in reproducing problems.

  • Visibility into Java activity including in-flight transactions, allowing administrators to proactively identify issues rather than diagnosing after-the-fact (application hangs, crashes, memory leaks, locks).

  • Tracing of transactions from Java to Database and vice-versa, enabling faster resolution of problems that span different tiers.

  • Differential heap analysis in production applications.

  • Ability to set up alerts based on configured thresholds and forward them via SNMP (which means potential integration with other enterprise monitoring products, like OpenView, for instance).

As you can see, it's a pretty interesting, yet small and simple, diagnostic and monitoring tool for Java applications. You can also check the product page linked above for further info.
Cheers and... keep reading!

Sunday, April 20, 2008

A comprehensive XML processing benchmark

by Eduardo Rodrigues


Introduction


I think I've already mentioned it here but, anyway, I'm currently leading a very interesting and challenging project for a big telecom company here in Brazil. The project is basically a complete reconstruction of the current data-loading system, which processes, validates and loads all cellphone statements, stored as XML files, into an Oracle CMSDK 9.0.4.2.2 repository. For those who aren't familiar with it, Oracle CMSDK is an old content management product that succeeded the even older Oracle iFS (Internet File System). Because it's not an open repository, we are required to use its Java API to programmatically load data into, or retrieve data from, the repository. That, obviously, prevents us from taking advantage of some of the newest tools available, like Oracle's XML DB or even the recent Oracle Data Integrator.

Motivation


One of our biggest concerns in this project is the performance the new system must deliver. The SLA is really aggressive. So we decided to do some research to find the newest XML processing technologies available, then try and compare them to determine which ones would really help us in the most efficient way. The only constraints: we must not consider any non-industry-standard solution nor any non-production (or non-stable) release.

Test Scenarios


That said, based on research and also on previous experience, I chose a set of technologies to test and compare (the exact combinations are listed further below).

I initially discarded DOM parsers because of the large average size of the XML files we'll be dealing with; we most certainly can't afford the excessive memory consumption involved. I also discarded the Oracle StAX Pull Parser, because it was still a preview release, and the J2SE 5.0 built-in XML parsers, since I know they're a proprietary implementation of Apache Xerces based on a version certainly older than 2.9.1.

The test scenario was very simple and was intended only to measure and compare performance and memory consumption. The test job was just to parse a real-world XML file containing one phone statement, retrieving and counting a predefined set of elements and attributes. In summary, the rules were (for privacy's sake, the real XML structure won't be revealed):
  1. Parse all occurrences of "/root/child1/StatementPage" element
  2. For each <StatementPage> do:
    1. Store and print out value of attribute "/root/child1/StatementPage/PageInfo/@pageNumber"
    2. Store and print out value of attribute "/root/child1/StatementPage/PageInfo/@customerCode"
    3. Store any occurrence of element <ValueRecord>, along with all its attributes, within page's subtree
    4. Print out the number of <ValueRecord> elements stored
  3. Print out the total number of <StatementPage> elements parsed
  4. Print out the total number of <ValueRecord> elements parsed

Also, every test should be performed with 2 different XML files: a small one (6.5MB), containing a total of 420 statement pages and 19,133 value records, and a large one (143MB), with 7,104 pages and 464,357 value records.

Based on the rules above, I then tested and compared the following technology sets:
  1. Apache Digester using Apache Xerces2 SAX2 parser
  2. Apache Digester using Oracle SAX2 parser
  3. Sun JAXB2 using Xerces2 SAX2 parser
  4. Sun JAXB2 using Oracle SAX2 parser
  5. Sun JAXB2 using Woodstox StAX1 parser
  6. Pure Xerces2 SAX2 parser
  7. Pure Oracle SAX2 parser
  8. Pure Woodstox StAX1 parser

Based on this tutorial fragment from Sun: http://java.sun.com/webservices/docs/1.6/tutorial/doc/SJSXP3.html and considering that performance is our primary goal, I've chosen StAX's cursor API (XMLStreamReader) over the iterator API. Still aiming for performance, all tested parsers have been configured as non-validating.
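
For reference (this code is not part of the benchmark itself), a cursor-style pass that follows the rules above looks roughly like the sketch below; the element and attribute names are the placeholders used in the rules, not the real (undisclosed) structure:

import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class CursorCount {
   public static void main(String[] args) throws Exception {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      // Non-validating, as in the tests.
      factory.setProperty(XMLInputFactory.IS_VALIDATING, Boolean.FALSE);
      XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream(args[0]));
      int pages = 0;
      int valueRecords = 0;
      while (reader.hasNext()) {
         if (reader.next() == XMLStreamConstants.START_ELEMENT) {
            String name = reader.getLocalName();
            if ("StatementPage".equals(name)) {
               pages++;
            } else if ("PageInfo".equals(name)) {
               // pageNumber and customerCode are attributes of PageInfo in the rules above.
               System.out.println(reader.getAttributeValue(null, "pageNumber") + " / "
                     + reader.getAttributeValue(null, "customerCode"));
            } else if ("ValueRecord".equals(name)) {
               valueRecords++; // attributes could be copied here via getAttributeCount()/getAttributeValue(i)
            }
         }
      }
      reader.close();
      System.out.println(pages + " pages, " + valueRecords + " value records");
   }
}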

For the record: all tests were executed on a Dell Latitude D620 notebook, with an Intel Centrino Duo T2400 CPU @ 1.83GHz, running Windows XP Professional SP2 and Sun's Java VM 1.5.0_15 in client mode.

Results


These were the performance results obtained after parsing the small XML file (for obvious reasons, I decided to measure heap usage only when the large file was parsed):

Performance results for small XML file
As you can see, Apache Digester's performance was extremely and surprisingly poor, despite all my efforts to improve it. So I had no choice but to discard it for the next tests with the large XML file, whose results are presented below:

Performance results for large XML file
Notice that the tendency toward better performance when the <!DOCTYPE> tag is removed from the XML document has been clearly confirmed here.

As for the memory allocation comparison, I once again narrowed the tests to the worst case from the performance tests above: the large XML file including the <!DOCTYPE> tag. The results obtained from JDeveloper's memory profiler were:

Memory allocation for large XML file
Another interesting piece of information we can extract from these tests is how much overhead XML binding represents when compared to a plain parser:

Overhead charts

Conclusion


After a careful and thorough review and confirmation of all the results obtained from the tests described here, I tend to recommend a mixed solution. Considering the nearly 12MB/s throughput verified here, I'd certainly choose the pure Woodstox StAX parser whenever I have to deal with medium to large XML sources; but, for convenience, I'd also choose JAXB 2 whenever there's an XML schema available to compile its classes from and the size of the source XML is not a concern.

As for complexity, I really can't say that any of the tested technologies turned out to be considerably more complex to implement than the others. In fact, I don't think this would be an issue for anybody with average experience in XML processing.

Important Note


Just out of curiosity, I've also tested Codehaus StaxMate 1.1 along with the Woodstox StAX parser. It's a helper library built on top of StAX to provide an easier-to-use abstraction layer for the StAX cursor API. I can confirm the implementors' claim that StaxMate doesn't add any significant performance overhead: the results were identical to pure Woodstox StAX when parsing the large XML file. I can also say that it really made my job much easier. The only reason I won't consider StaxMate is that it depends on a non-standard extension of the StAX 1.0 API, which the folks at Codehaus call "StAX2".

That's all for now.

Enjoy and... keep reading!

Saturday, April 19, 2008

Fixed blog's appearance on IE6

Great news folks!

We've just fixed our blog's template so it is displayed 100% correctly on the Internet Explorer 6 browser. Now we expect this blog to render identically on both Internet Explorer and Firefox (and hopefully on all other browsers too).

Please let us know if there are still any visualization issues.

Cheers and... keep reading!

Wednesday, April 16, 2008

Is there a cook in JDev's team?

This week I found something curious, to say the least, when I launched my JDeveloper 10.1.3.3, as I do almost every morning. This was the "Tip of the Day" it showed me:



Well, I don't know what this means but, anyway, here is a full recipe, just in case: http://www.foodnetwork.com/food/recipes/recipe/0,1977,FOOD_9936_15602,00.html

:)