The case of the lost id field

by Joris Kuipers, September 4, 2021

Chapter 1: a new case

Morning arrived like a Windows Update you couldn’t postpone anymore, forcing you to accept a harsh new reality you didn’t ask for. As I biked by the Amstel river I wondered what the day would bring: I was visiting a client that was porting their old Spring-based monolith over to Spring Boot per my advice, to make it easier to run it on Kubernetes.

They had considered just using a Tomcat container image and stuffing their WAR file in there through a Dockerfile. After more than 20 years of doing Enterprise Java I might have become an old cynical bastard, but even I still have my pride: lift and shift is below even my standards, which I tend to sell as “pragmatism” as a consultant nowadays.

Their system wasn’t in bad shape, given its age. Some people claimed I was involved with its inception, but given the number of developers around in 2013 I say the father could’ve been anyone, really. Most dependencies had been kept up-to-date, so moving to Boot was mostly straightforward. The Spring Data version was pretty old though, with an API that overly optimistically promised to always return an entity when you called “findOne”.
Like me, the Spring Data team learned the hard way that life doesn’t always give you what you look for. In response, the CrudRepository now offered a “findById” returning an Optional, reflecting the fact that the framework will do its best but can only do so much, struggling with the unexpectedness and inconsistencies of existence like we all do.

I advised the client to wise up as well, bite the bullet and use the version provided by Boot’s managed dependencies: there aren’t many certainties in life, but at least using an up-to-date and supported set of dependencies shines a little light in all this darkness.

As they set about their Sisyphean task they discovered a weird problem: when JPA entities were stored in the HTTP Session that was managed using Spring Session Redis, they would be returned with a null id field on deserialization.
It was clear that the problem had been introduced by updating several dependencies, but what was happening exactly?

Chapter 2: the investigation

I know detectives who try to solve cases by talking to people: interviewing witnesses, interrogating suspects. Me, I don’t trust people anymore. I’ve seen too many developers claiming they really tested that code before pushing it, that they had read the documentation as I told them to, that they really formed a sound mental image of the system they’re working on so that they can reason about it.
Me? I trust only my debugger.

Firing it up, I could see the problem playing out right in front of me: the entity, freshly loaded from the database, would be given to a RedisSerializer, holding on to its ID field that provided it with its identity like it was clinging to life itself. After deserializing it though, it had been robbed, leaving only null where its ID had once been. All other state was restored correctly, yet the entity was now transient once again, its identity lost, uncertain what the future would hold.

The case was clear, but what suspect was responsible for taking the ID value?

I noticed that the application was using a JdkSerializationRedisSerializer, which uses Java Serialization for marshalling and unmarshalling the objects in the Session. I hadn’t used Java Serialization for probably a decade or so: I had buried my knowledge of it with plenty of hard liquor. However, as I was looking into the problem my prior experience drifted back to the surface, together with long-suppressed memories of RMI over IIOP, Spring HTTP Invokers and other bile that is best left unmentioned.

I wrote a small test case using an ObjectOutputStream and ObjectInputStream to see if I could reproduce the problem. Maybe it was because it was a Friday, but the first attempt resulted in exactly the same problem: I caught serialization red-handed.
Now, what I didn’t tell you yet is that these entity classes were all extending Spring Data JPA’s AbstractPersistable, a convenient mapped superclass providing an id property and a sensible hashCode/equals (unlike all the entities written using Lombok’s @Data annotation, which always makes me throw up in my mouth a little while mumbling that entity identity isn’t value identity).

On inspection I noticed that this type didn’t implement Serializable, neither by itself nor via the Persistable interface it implements. The team had noticed some new errors regarding their types not implementing Serializable, so they had added that to their own abstract subclass of AbstractPersistable.
I had found the smoking gun, but the case still didn’t make full sense. I thought about writing a custom read- and writeObject method to handle the id field, but I knew I wouldn’t have been able to live with myself after committing such an atrocity.
Staying in my debugger wouldn’t solve this. I had no choice, I had to go out and do what I despise most: Google for something I don’t just know by heart already…

As I was reading through the ancient scrolls describing the workings of Java serialization with inheritance, things slowly started to make sense: when your parent class isn’t Serializable but your concrete subclasses are, Java serialization will ignore the values of all the fields defined by the superclass on serializing, and will restore them on deserializing by running the no-arg constructor of that non-serializable superclass, which leaves them at whatever that constructor sets… For an ID field that means simply setting it to null.
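For those who trust only what they can run themselves, here is a minimal sketch of the heist in plain Java, without Spring, JPA or Redis. BaseEntity and Customer are hypothetical stand-ins I made up for AbstractPersistable and the client’s entities, not the real types; the round trip through an ObjectOutputStream and ObjectInputStream is roughly what a JDK-based serializer would do under the hood.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class LostIdDemo {

    // Hypothetical stand-in for a mapped superclass that is NOT Serializable.
    public static abstract class BaseEntity {
        private Long id;

        // Deserialization of a Serializable subclass runs this no-arg
        // constructor, so 'id' comes back as null instead of the stored value.
        protected BaseEntity() {
        }

        public Long getId() {
            return id;
        }

        public void setId(Long id) {
            this.id = id;
        }
    }

    // Hypothetical entity: only the subclass declares Serializable.
    public static class Customer extends BaseEntity implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String name;

        public Customer(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    // Serializes the value to a byte array and reads it back again.
    public static <T> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        Customer original = new Customer("Sam Spade");
        original.setId(42L);

        Customer copy = roundTrip(original);

        System.out.println("name = " + copy.getName()); // state of the Serializable subclass survives
        System.out.println("id   = " + copy.getId());   // state of the non-serializable parent is null
    }
}
```

Declaring BaseEntity itself as implementing Serializable should make the id survive the round trip in this sketch, which is in effect what the old Persistable interface used to do for its implementations.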

What didn’t make sense though is why this would’ve changed when updating Spring Data; the case was cracked, but I couldn’t shake this nagging feeling that I hadn’t gotten to the root cause yet… I had to go outside the book to get to the bottom of this.

Chapter 3: the snitch

By now I realized that in all likelihood, the Persistable interface had stopped extending Serializable along the way. It probably lost the marker interface in one of the sleazy bars near the Dockers it would visit to celebrate its infamous release parties. Someone had to start singing, so I went to what I now realized was the real crime scene: Spring Data Commons’ GitHub repository.

Sure enough, after digging through its long but rather uneventful history, I found this commit:

https://github.com/spring-projects/spring-data-commons/commit/727ab8384cac21e574526f325e19d6af92b8c8df#diff-2ffea8def1a1e1da09e729655a1347baed969c777687eb4c4396852c27cd2b1b

Its commit message mentioned the intention of dropping Serializable for the ID generic type, but it seemed like someone dropped the ball on this one and what was probably a global search-and-replace action removed Serializable from the interface type itself as well…

It all made sense now. Once again Hanlon’s razor had been the weapon of choice. The perpetrator even tried to erase his tracks by changing his last name some years ago, but Git gave him away.

Case closed. Time to shadow the Persistable interface with a copy that extends Serializable again, and time for a much-needed drink…