While everyone else is hyping and raving about LLMs, I am looking at old boring essential technologies that matter greatly in daily work.
Object Relational Mappers. Sexy.
Love or hate ORM solutions, it is good to know how they work and where the hairy parts exist before they become a cognitive or technical problem.
TL;DR:
JPA is good, but Spring Data JDBC is simpler. If you can live without JPA, it can be worth it.
To some extent, this post continues the theme from the previous post as Object Relational Mapping is one of those places where your decisions in one area affect how much pain you will have in other areas.
Over the years, this has troubled me and caused annoying bugs in code whenever my mental model and the way the technology actually works have clashed. Back in the day, I had massive problems with Hibernate when the persistence model surprised me with cumbersome and slow queries. I was not the only one with issues.
When starting a new project and thinking about data and persistence, there are many questions to consider.
Do you start with the database or the object model? Are you doing a fully normalised database or something denormalised with arrays or JSON columns?
How will you query the data or objects: through ORM query language, SQL or both?
Can you trust the mapping and query technology to be around and evolve with your needs?
How many different abstractions and concepts do you want to juggle in your head?
Over the years, numerous different technologies in the JVM world have been used to scratch these itches. For example, Hibernate, iBATIS/MyBatis, jOOQ and now JPA.
These days the JPA specification and Hibernate are proven and solid technologies, and good tooling exists to make creating and managing the JPA persistence layer fun. If you don’t believe it, check out JPA Buddy.
There are good books available and even Thorben Janssen’s commercial help community Persistence Hub. So there is plenty of well-paved road and help available.
The great thing with JPA is that it does not tie you down to one framework. You can use the same standard, and usually the same implementation too, in whatever compatible framework you decide to use for your app, for example Micronaut, Quarkus or Spring Boot.
In many ways, JPA is good, and there are good reasons to use it in many situations.
But then again, if I don’t need the capabilities that JPA and managed entities provide, I can avoid using JPA and keep things simple.
Let me elaborate.
I am firmly in the database corner and want to start design first with the data as it will be persisted. I want to use the powers and capabilities that the highly tuned and perfected database engine provides to ensure that the data is kept in good order and queried effectively. So, if there is tension between the object and database models, the database always wins.
PostgreSQL is an excellent platform and such proven technology that whenever there isn’t a better managed cloud-native persistence option, it is my go-to data storage tool. I am still amazed at how the Postgres community creates new capabilities for the platform, including the ability to do bitemporal data modelling.
What previously tripped me up and caused confusion was trying to think everything through Hibernate or some other ORM abstraction on top of the relational database. To get everything to work, you had to be good at two different abstractions: the relational database and the ORM framework.
And from time to time, the persistence manager did clever things with caching and transactions, which meant that you had to be careful if you wanted to use and query the database outside the ORM.
All the intelligent things that JPA does and enables come with a cost, both technical and cognitive. And even though frameworks, tooling and knowledge resources help, you need to be aware of those potential problems. Even the makers of JPA Buddy have noticed that some JPA users run into trouble and consider simpler alternatives because they do not fully understand JPA's capabilities and their own responsibilities.
For many use cases, the Spring Data JDBC model fits my brain well with its simpler model and capabilities compared to JPA and Hibernate.
I can start to model things with Domain Driven Design type thinking and identify the domain model aggregates and their boundaries.
Instead of creating a perfect traversable object graph with one-to-many and many-to-many relationships, I focus on the data and how it is used and model references between aggregates as AggregateReferences. The outcome is that there are fewer surprises in loading and updating data, as I always know what unit I am updating.
Nothing is stopping me from doing the same in JPA, but the constraints of Spring Data JDBC guide me to think about the problem in simpler terms.
And simple is good.
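To make the aggregate idea concrete, here is a minimal plain-Java sketch with no Spring dependency. The names (Customer, Order, OrderLine, CustomerRef) are made up for illustration, and CustomerRef is only a stand-in for the role that AggregateReference plays in Spring Data JDBC, not the real type. The point is that an order owns its lines but only holds the id of the customer, so crossing the aggregate boundary is always an explicit lookup.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of two aggregates; all names are hypothetical.
class AggregateSketch {

    // Aggregate root #1.
    record Customer(long id, String name) {}

    // Id-only pointer to another aggregate (stand-in for AggregateReference).
    record CustomerRef(long id) {}

    // Entity that lives and dies with its Order.
    record OrderLine(String product, int quantity) {}

    // Aggregate root #2: owns its lines, references the customer by id only.
    record Order(long id, CustomerRef customer, List<OrderLine> lines) {
        Order {
            lines = List.copyOf(lines); // defensive, unmodifiable copy
        }

        int totalQuantity() {
            return lines.stream().mapToInt(OrderLine::quantity).sum();
        }
    }

    public static void main(String[] args) {
        // A map plays the role of the customer repository here.
        Map<Long, Customer> customers = new HashMap<>();
        customers.put(1L, new Customer(1L, "Ada"));

        Order order = new Order(10L, new CustomerRef(1L),
                List.of(new OrderLine("book", 2), new OrderLine("pen", 3)));

        // Crossing the aggregate boundary is an explicit lookup, never a hidden lazy load.
        Customer customer = customers.get(order.customer().id());
        System.out.println(customer.name() + " ordered " + order.totalQuantity() + " items");
    }
}
```

Because the order never carries a full Customer object, saving an order cannot accidentally update a customer, which is the "I always know what unit I am updating" property in practice.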
I want to avoid reinventing the wheel or using fancy patterns if I don’t need them. Rather I want to keep things simple in programming infrastructure and focus my time on writing good business logic.
Simple tools, simple patterns.
Database structure and migrations are done with Flyway, even though persistence frameworks already embed simple schema-management features.
Spring Data JDBC is used to model the domain with DDD principles and provide capabilities to query and project the data when only partial views of the aggregate are needed.
The key concept here is to consider the aggregates that get updated and accessed. Embedded and related entities work as expected when loaded and managed with the aggregate.
Queries and projections give close-to-the-database control over how data is queried and mapped, either to an aggregate or to a projected data structure.
And though everything happens through reflection and proxy magic, the model is very structured and easy to follow, and something which you could then rewrite manually if the magic fails to work sufficiently.
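As a rough idea of what "rewriting it manually" could look like for a projection, here is a plain-Java sketch. It is not the Spring Data JDBC API: the OrderSummary record, the table and column names in the SQL, and the map-based row are all assumptions made for illustration, with a Map standing in for one row of a JDBC result set.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hand-written projection mapping; every name here is hypothetical.
class ProjectionSketch {

    // Projection: only the columns one screen or report needs, not the whole aggregate.
    record OrderSummary(long orderId, String customerName, int lineCount) {}

    // The kind of SQL a custom query could carry (assumed schema).
    static final String SUMMARY_SQL = """
            SELECT o.id AS order_id, c.name AS customer_name, COUNT(l.id) AS line_count
            FROM orders o
            JOIN customer c ON c.id = o.customer_id
            LEFT JOIN order_line l ON l.order_id = o.id
            GROUP BY o.id, c.name
            """;

    // Maps one result row (column label -> value) to the projection.
    static final Function<Map<String, Object>, OrderSummary> ROW_MAPPER = row ->
            new OrderSummary(
                    ((Number) row.get("order_id")).longValue(),
                    (String) row.get("customer_name"),
                    ((Number) row.get("line_count")).intValue());

    static List<OrderSummary> mapRows(List<Map<String, Object>> rows) {
        return rows.stream().map(ROW_MAPPER).toList();
    }

    public static void main(String[] args) {
        // A fake row in place of a real database call.
        List<Map<String, Object>> rows = List.of(
                Map.of("order_id", 10L, "customer_name", "Ada", "line_count", 2L));
        System.out.println(mapRows(rows));
    }
}
```

The mapping is one small function per projection, which is why falling back to hand-written code stays manageable when the framework's conventions do not fit.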
Additional data mapping with ShapeShift or ModelMapper can be needed in a layered architecture if data is served in different formats or types to other use cases.
Nothing fancy. Just simple things.
The good thing here is that with this setup, I can quickly evolve things database first and complement querying data with pure SQL, and even use type-safe tools like jOOQ to build dynamic queries at runtime.
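To show what "dynamic queries" means here, the following is a plain-Java sketch of the problem, not jOOQ code. The orders table, its columns and the findOrders helper are assumptions for illustration. It assembles a parameterised WHERE clause from optional filters by hand, which is the sort of string juggling a type-safe query builder is meant to replace.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-rolled dynamic query; table, columns and method names are hypothetical.
class DynamicQuerySketch {

    // SQL text and bind values are kept apart so values never end up inside the SQL string.
    record Query(String sql, List<Object> params) {}

    // A null filter means "do not filter on this column".
    static Query findOrders(Long customerId, String status) {
        StringBuilder sql = new StringBuilder("SELECT id, customer_id, status FROM orders");
        List<String> conditions = new ArrayList<>();
        List<Object> params = new ArrayList<>();

        if (customerId != null) {
            conditions.add("customer_id = ?");
            params.add(customerId);
        }
        if (status != null) {
            conditions.add("status = ?");
            params.add(status);
        }
        if (!conditions.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        return new Query(sql.toString(), List.copyOf(params));
    }

    public static void main(String[] args) {
        Query query = findOrders(1L, "OPEN");
        System.out.println(query.sql());
        System.out.println(query.params());
    }
}
```

This works for two filters, but nothing checks column names or value types at compile time, which is where a tool like jOOQ earns its place.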
Spring Data JDBC provides a simple foundation to create and query simple aggregates without anything extra. I can understand what happens and reason why it happens.
So I agree with the Spring Data JDBC documentation on why it was created:
"Spring Data JDBC aims to be much simpler conceptually, by embracing the following design decisions:
If you load an entity, SQL statements get run. Once this is done, you have a completely loaded entity. No lazy loading or caching is done.
If you save an entity, it gets saved. If you do not, it does not. There is no dirty tracking and no session.
There is a simple model of how to map entities to tables. It probably only works for rather simple cases. If you do not like that, you should code your own strategy. Spring Data JDBC offers only very limited support for customizing the strategy with annotations."
https://docs.spring.io/spring-data/jdbc/docs/current/reference/html/#jdbc.why
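The "no dirty tracking and no session" point can be illustrated with a toy in-memory repository. This is a plain-Java sketch under stated assumptions, not Spring Data JDBC itself: Account and InMemoryAccountRepository are hypothetical names, and a HashMap plays the role of the table. The repository stores and returns copies, so a change to a loaded object is only persisted when save is called.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of "if you save it, it gets saved; if you do not, it does not".
class ExplicitSaveSketch {

    // Mutable on purpose, so the effect of a missing save() is visible.
    static final class Account {
        final long id;
        long balance;

        Account(long id, long balance) {
            this.id = id;
            this.balance = balance;
        }

        Account copy() {
            return new Account(id, balance);
        }
    }

    // Stand-in for a table: it never hands out the stored instance itself.
    static final class InMemoryAccountRepository {
        private final Map<Long, Account> rows = new HashMap<>();

        Account save(Account account) {
            rows.put(account.id, account.copy());
            return account;
        }

        Optional<Account> findById(long id) {
            return Optional.ofNullable(rows.get(id)).map(Account::copy);
        }
    }

    public static void main(String[] args) {
        InMemoryAccountRepository repository = new InMemoryAccountRepository();
        repository.save(new Account(1L, 100));

        Account loaded = repository.findById(1L).orElseThrow();
        loaded.balance = 250; // changed in memory only, nothing tracks this

        System.out.println(repository.findById(1L).orElseThrow().balance); // prints 100

        repository.save(loaded); // the explicit save is what persists the change
        System.out.println(repository.findById(1L).orElseThrow().balance); // prints 250
    }
}
```

With a JPA persistence context, the modified entity would typically be flushed at transaction commit without an explicit call. Here the only way a change reaches storage is the save call you can see in the code.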
What are your preferred tools and components when persisting and handling data in JVM applications?
Relevant JPA resources:
https://www.manning.com/books/java-persistence-with-spring-data-and-hibernate
https://thorben-janssen.com/jpa-native-queries/
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#core.extensions.querydsl
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#projections
Spring Data JDBC resources:
https://thorben-janssen.com/spring-data-jdbc-aggregates/
https://thorben-janssen.com/spring-data-jdbc-custom-queries-and-projections/
https://www.youtube.com/watch?v=SJlKBkZ2yAU
https://github.com/schauder/talk-beyond
jOOQ resources:
https://blog.jooq.org/when-to-use-jooq-and-when-to-use-native-sql/