Thursday, April 29, 2010

Constant, Synchronized or ThreadLocal for Utility Objects?

A subtle mistake I have made previously is to declare certain "utility" objects as constants in my web app without understanding that they are not thread safe. For example, in my current application I need a SimpleDateFormat object to do some work translating String to Date and vice versa. My first thought is to write the following.

private static final SimpleDateFormat DATE_FORMAT =
      new SimpleDateFormat(EXPIRY_TIME_FORMAT);

For a batch job or stand-alone application, this would probably be ok, because only one thread would ever access this object at once.

Some Ojbects are not Thread Safe

The problem with the above code is that SimpleDateFormat is not thread safe. That means that in a web app (for example) where multiple threads hit the same piece of code simultaneously (representing multiple users doing the same thing at once), SimpleDateFormat might output unexpected results. So user A inputting "12th Feb" might get "18th of July" that user B input because calls to parse() from their respective threads happened at the same time - and since SimpleDateFormat uses instance variables to store data about the date it is transforming, user A's data got over-written by user B's data while parse() was being executed simultaneously by two threads.

This won't happen every time: in fact it will be extremely rare because calls to the object finish so quickly. This makes it all the harder to debug. You might only see the error when the application is under extreme load and there won't be a stack trace to show you which line of code is playing up: the application will just continue with the wrong data.

There are two ways around this problem: synchronize access to the object or have a copy of it for each thread. I prefer the latter and will discuss why now.

Synchronized

It's not too hard to synchronize access to the object. Consider the code below.

private static final SimpleDateFormat DATE_FORMAT =
    new SimpleDateFormat(EXPIRY_TIME_FORMAT.toString());
private synchronized Date parseDate(String date) throws ParseException {
  return DATE_FORMAT.parse(date);
}
private synchronized String formatDate(Date date) {
  return DATE_FORMAT.format(date);
}

This code ensures that only one thread at a time will be able to execute formatDate() or parseDate(). Problem solved right? Yes, but there are two issues to consider: trust and speed. Trust. You have to trust that every future developer touching this class (yourself included) will remember to call those methods. I.e. it is still possible to call DATE_FORMAT.parse() or DATE_FORMAT.format() directly and completely avoid the synchronized methods. Most developers are diligent about this sort of thing, but ultimately it is just one more potential source of error. Speed. Synchronization can slow things down because threads have to queue up and wait for their turn to run the synchronized method. This is not usually an issue if the synchronized code is fast (and SimpleDateFormat is).

Thread Local

Granted, neither of the objections above (trust and speed) are out and out show-stoppers for most applications, but they are risks that can be avoided easily. Using ThreadLocal, you can create an instance of the object for each thread in your application. This means that threads don't have to wait for each other to finish calling synchronized methods: they each have their own copy of the object. In fact, you don't even need the synchronization anymore because you have just removed the reason you considered it in the first place: multiple threads sharing the same instance of an object.

Here is how I do it for my example.

/** The expiry time format. */
private static ThreadLocal expiryTimeFormat =
    new ThreadLocal() {
  protected synchronized SimpleDateFormat initialValue() {
    return new SimpleDateFormat(EXPIRY_TIME_FORMAT.toString());
  }
};
private void methodParsingDate(String date) throws ParseException {
  Date parsed = expiryTimeFormat.get().parse(date);
}
private void methodFormattingDate(Date date) {
  String formatted = expiryTimeFormat.get().format(date);
}

Conclusion

When considering how to control access to some utility object within a class, declaring it as a constant (static final) is ok if you know the object is thread safe or if you are writing a single threaded application such as a simple batch job or "stand alone" application used by a single user at a time. If the object is not thread safe, prefer a ThreadLocal "container" over synchronized methods unless the object absolutely must be a singleton or if it is such a big object that multiple copies are not feasible. /p>

How do you know if the object is thread safe in the first place? You have to rely on good javadocs. For example, SimpleDateFormat's documentation states: Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.