Leaving a Java legacy behind

by Thomas Zeeman, March 4, 2019

Like a house, a software application needs maintenance. Without it, you risk the digital equivalent of rot: bit rot. Often that maintenance is superficial, just as with a house: a new look and feel, a new colour scheme (a paint job) and a few new stock photos to make it look nice again. But every once in a while you also need to replace the kitchen or redo the living room. The technical equivalent, upgrading your application to a new version of Java, is far less common.

Often, an application that has been written in Java 5, 6 or 7 will still run on that version. With the new release schedule of Java since version 9 in 2017, that means you may be looking at a lot of ground to cover to catch up. At the time of writing, Java 11 is the latest stable release and version 12 is around the corner.

Here at Trifork, we have been developing software in Java for a long time. During that time we have been asked to modify some of our own work (either under an SLA or ad-hoc), but also to do a scan of, or maintenance on, the work of others. Based on that experience, I would like to take you on a bit of a deep dive into what it means to work with those older versions of Java and why and how you can bring them up to date again.

The most obvious reason to do something about an application written in an older version of Java is the fact that at some point, support for that version will stop. That would be enough reason for some managers to freak out, but developers probably won’t worry too much. They are more amenable to other reasons, like improved performance, code readability, new algorithms or more efficient data structures.

Collections
Now, this blog is for developers and we’ve not yet seen a line of code. So let’s dive into some examples. First one for the old greybeards among us, like myself: the Collections API. Where we would originally store objects in an array, a Vector or a Hashtable, there are now the List, Set and Map interfaces with various implementations and different characteristics to choose from. So, where we used to create some collection of data like this:

Vector vector = new Vector();
vector.addElement("Element 1");
vector.addElement("Element 2");

We can now do it like this:

List list = new ArrayList();
list.add("Element 1");
list.add("Element 2");

Although the changes look minimal at first glance, this was a huge deal when it was introduced. There was now a standard way of dealing with collections, and if we find out that our application benefits more from a LinkedList, then all we need to do is change the first line there from ArrayList to LinkedList.

And since Java 9:

List<String> list = List.of("Element 1", "Element 2");
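
To make the two points above concrete, here is a minimal sketch, assuming Java 9 or later. The class and method names (CollectionsSketch, buildList, isUnmodifiable) are made up for illustration. It uses generics, which arrived in Java 5, declares variables against the List interface so the implementation can be swapped, and shows that a list created with List.of cannot be modified afterwards.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical example class, not part of any library.
class CollectionsSketch {

    // Callers only see the List interface, so the implementation can be swapped freely.
    static List<String> buildList(boolean linked) {
        List<String> list;
        if (linked) {
            list = new LinkedList<>();
        } else {
            list = new ArrayList<>();
        }
        list.add("Element 1");
        list.add("Element 2");
        return list;
    }

    // Tries to add an element; lists from List.of (Java 9+) reject this.
    // Note: a modifiable list passed in here really does get the extra element.
    static boolean isUnmodifiable(List<String> list) {
        try {
            list.add("one more");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(buildList(false));
        System.out.println(buildList(true));
        System.out.println(isUnmodifiable(List.of("Element 1", "Element 2")));
    }
}
```

So List.of is a good fit for fixed data, while ArrayList or LinkedList remain the choice when the content has to change.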

Iterating
With collections comes the need to iterate over their content. For that we have always had the perennial for loop (and its lesser-used siblings while and do-while):

for (int i = 0; i < list.size(); i++) {
    // do stuff
}

A pretty simple example, but as the components become a bit more involved, an off-by-one error (OB1) is easy to make and not always easily noticed.
That one got pretty much covered by the sugar-coated version from Java 5, the enhanced for loop:

for (E e : list) {
    // do stuff
}

Java 8 then introduced a feature that not only did away with the OB1 but also opened up a host of other opportunities: the Stream API. The for-each equivalent would be:

list.stream().forEach(e -> {
    // do stuff
});

Besides the forEach(), there are also other options like mapping, filtering and sorting. And all you have to do is provide the operation you want to do:

list.parallelStream().forEach(ElementType::increment);

The above example runs in parallel under the hood on the common fork/join pool, which builds on the ExecutorService and the Fork/Join framework introduced in Java 5 and 7 respectively.

Do be careful with applying this to everything though. You are still responsible for making sure the operation is able to work correctly while running in parallel. It can make you run into trouble so much faster, as Fokke & Sukke commented on a while ago…
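
As a rough illustration of the mapping, filtering and sorting mentioned earlier, and of a parallel operation that is safe, consider the sketch below. It assumes Java 8 or later; the class and method names are invented for this example. The parallel sum is safe because it is a reduction without shared mutable state. Adding elements to a plain ArrayList from inside a parallel forEach, by contrast, would not be.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, not part of any library.
class StreamSketch {

    // filter, map and sorted chained into a single pipeline.
    static List<String> shortNamesUpperCased(List<String> names) {
        return names.stream()
                .filter(name -> name.length() <= 5)   // keep names of at most 5 characters
                .map(String::toUpperCase)             // transform each remaining element
                .sorted()                             // natural (alphabetical) order
                .collect(Collectors.toList());
    }

    // Safe to run in parallel: sum() is a reduction, there is no shared mutable state.
    static int parallelSum(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        // prints [BOB, FOKKE, SUKKE]
        System.out.println(shortNamesUpperCased(Arrays.asList("Sukke", "Fokke", "Thomas", "Bob")));
        // prints 10
        System.out.println(parallelSum(Arrays.asList(1, 2, 3, 4)));
    }
}
```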

Lambdas
The last code snippet also brings me to another fairly recent language feature: lambda expressions (λ-functions), added in Java 8. In the beginning you needed an anonymous inner class for some highly specific things, like a Runnable implementation:

executorService.submit(new Runnable() {
    @Override
    public void run() {
        // do some stuff
    }
});

It is now possible to do this with a lot less boilerplate:

executorService.submit(() -> {
    // do some stuff
});

In some cases, your IDE may be able to help you refactor this code with minimal effort.

If your run implementation just invokes some service method, you may even be able to shorten it to a method reference:

executorService.submit(SomeService::doStuff);

How is that for reducing the amount of boilerplate?
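
For comparison, the three styles can be put side by side in one runnable sketch, assuming Java 8 or later. The class LambdaSketch and its static doStuff method are hypothetical stand-ins for a real service. Each submitted task increments a counter, so the result shows that all three forms do the same work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not part of any library.
class LambdaSketch {

    static final AtomicInteger counter = new AtomicInteger();

    // Stand-in for a service method; here it only counts invocations.
    static void doStuff() {
        counter.incrementAndGet();
    }

    // Submits the same work in three styles and returns how many tasks ran.
    static int runAllThreeStyles() {
        counter.set(0);
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        try {
            // 1. Anonymous inner class (works on any Java version)
            executorService.submit(new Runnable() {
                @Override
                public void run() {
                    doStuff();
                }
            });
            // 2. Lambda expression (Java 8+)
            executorService.submit(() -> doStuff());
            // 3. Method reference (Java 8+)
            executorService.submit(LambdaSketch::doStuff);
        } finally {
            // Already submitted tasks still run after shutdown()
            executorService.shutdown();
        }
        try {
            executorService.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runAllThreeStyles());
    }
}
```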

Strings
I would like to end with some examples around the String class, arguably the most used non-primitive ‘primitive’ in Java development, if not in any business programming setting.

Strings have always been a bit of a strange one in Java. String is the only class whose objects you can apply an operator to, as in "foo" + "bar", and for a long time it was also the only class with more than one way to create an instance. Written in a more ‘true’ OO fashion, that example would look like:

String fooBar = new String("foo").concat(new String("bar"));

Can you spot the significant characters in that letter soup?

On the other hand, the shorter notation with operators was allegedly not the most performant code, whether in terms of computation or memory. Some people may remember errors about the PermGen space being full due to too many String literals being used.
To offset that there was the StringBuffer class. That would allow you to do concatenation without building up a large set of new object instances that would be eligible for garbage collection within milliseconds after creation:

String fooBar = new StringBuffer("foo").append("bar").toString();

Unfortunately, it suffered from the same cautious programming model as the previously mentioned Vector and Hashtable classes. By having all operations synchronised, any operation required an expensive lock to be acquired first.

To remedy that, there is the StringBuilder class. The API is the same as the StringBuffer, so it is a drop-in replacement, but it is no longer synchronised.
Quite an easy change to make!
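
Where a builder tends to pay off is concatenation in a loop, since a single "foo" + "bar" expression is already handled by the compiler. The following is a minimal sketch of that case; the class and method names are made up for illustration.

```java
// Hypothetical example class, not part of any library.
class StringBuilderSketch {

    // Joins the parts with a separator, reusing one StringBuilder
    // instead of creating a new String on every iteration.
    static String joinWithBuilder(String[] parts, String separator) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                builder.append(separator);
            }
            builder.append(parts[i]);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // prints foo, bar
        System.out.println(joinWithBuilder(new String[] {"foo", "bar"}, ", "));
    }
}
```

Because the builder here is a local variable that never leaves the method, no other thread can touch it, which is why the missing synchronisation is not a problem.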

One thing that wasn’t really covered by this was an equivalent of the venerable C standard library function printf. Some logging libraries already had placeholders for things to inject into standard log messages, but for other purposes, there wasn’t really a standard way to do that.
This changed with the introduction of the String.format method in Java 5. Looking at the documentation, it appears they wanted to outdo the C version in capabilities, at the cost of increased complexity. And according to some, performance suffers with this method as well.
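
A small sketch of what String.format looks like in practice. The class, method and message text are invented for this example. Passing Locale.ROOT is a choice made here so the decimal separator does not depend on the machine the code runs on.

```java
import java.util.Locale;

// Hypothetical example class, not part of any library.
class FormatSketch {

    // %s = string, %d = integer, %.2f = floating point with two decimals.
    static String describe(String name, int count, double price) {
        return String.format(Locale.ROOT, "%s ordered %d items for %.2f euro", name, count, price);
    }

    public static void main(String[] args) {
        // prints Thomas ordered 3 items for 12.50 euro
        System.out.println(describe("Thomas", 3, 12.5));
    }
}
```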

Luckily the powers that be didn’t stop at that. More improvements have been made, like the joining methods in Java 8 (String.join, and Collectors.joining for streams). The String constant pool was moved out of the PermGen space and into the heap in Java 7; that at least removes the weird situation where an app has enough memory available but could still fail with an OutOfMemoryError.
Further memory improvements were made in Java 9 with Compact Strings, which significantly reduce memory consumption, and hence the need for garbage collection, in many apps. And all of that goodness comes just by running it on a newer version of the JVM!
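
The joining methods mentioned above could be used along these lines. This is a sketch assuming Java 8 or later, with made-up class and method names. String.join covers the simple case, while Collectors.joining fits at the end of a stream pipeline and also accepts a prefix and a suffix.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, not part of any library.
class JoinSketch {

    // Simple case: join the elements with a separator.
    static String joinPlain(List<String> parts) {
        return String.join(", ", parts);
    }

    // Stream case: trim each element first, then join with a prefix and suffix.
    static String joinStream(List<String> parts) {
        return parts.stream()
                .map(String::trim)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        // prints foo, bar
        System.out.println(joinPlain(Arrays.asList("foo", "bar")));
        // prints [foo, bar]
        System.out.println(joinStream(Arrays.asList(" foo ", "bar")));
    }
}
```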

Most of these improvements do not, however, release you from the effort of thinking about how a String (or any object for that matter) will be used in your application as most of the measures may be good for some use cases, but can also be very bad for others.

Conclusion
We’ve gone through a lot of examples of how you can gain better performance, improve readability, and strengthen the robustness of a Java-based application, by keeping up with the times and leaving a better legacy behind you.

All of the above are just examples of what has improved over time. There are plenty more to mention, like finally having a working Date & Time API in Java, the removal of the PermGen space, the module system and the new reactive Flow API in version 9, and local variable type inference with var in Java 10.

Beyond that, the ecosystem also provides plenty of opportunities, be it from libraries and frameworks or from integrating with pieces of code written in alternative JVM-based languages (Kotlin for data classes, anyone?).

I hope I’ve provided you with some insights into what is available and what could benefit your work. Java may be getting older, but it has by no means stopped improving. You do have to keep up with it, though, now more than ever before!