ORM is a disgusting anti-pattern

From the translator: The text below may not reflect the translator’s opinion. All statements belong to the original author, so please refrain from unjustified downvotes. The original article was published in 2014, so some code snippets may be outdated or deprecated.

The content of the article:

  1. Introduction

  2. How ORMs work

  3. What is wrong with ORM

  4. SQL speaking objects

  5. How about performance?

  6. Regarding transactions

Introduction

ORM is a terrible anti-pattern that violates all the principles of object-oriented programming by taking objects apart and turning them into dumb and passive data packets. There is no justification for the existence of an ORM in any application, be it a small web application or an enterprise-sized system with thousands of tables and CRUD manipulation of them. What is the alternative? Objects that speak SQL (SQL speaking objects).

How ORMs work

Object-relational mapping (ORM) is a technique (aka design pattern) for accessing a relational database from an object-oriented language (like Java). Almost every language has several ORM implementations, for example: Hibernate for Java, ActiveRecord for Ruby on Rails, Doctrine for PHP, and SQLAlchemy for Python. In Java, ORM is even standardized as JPA.

First, let’s look at an example of how ORM works. Let’s use Java, PostgreSQL and Hibernate. Let’s say we have a single table in the database called post:

+-----+------------+--------------------------+
| id  | date       | title                    |
+-----+------------+--------------------------+
|   9 | 10/24/2014 | How to cook a sandwich   |
|  13 | 11/03/2014 | My favorite movies       |
|  27 | 11/17/2014 | How much I love my job   |
+-----+------------+--------------------------+

Now we want to manipulate this table with CRUD methods from our Java application (CRUD stands for create, read, update and delete). First we need to create a Post class (sorry it’s so long, but this is the best I can do):

@Entity
@Table(name = "post")
public class Post {
	private int id;
	private Date date;
	private String title;
	@Id
	@GeneratedValue
	public int getId() {
		return this.id;
	}
	@Temporal(TemporalType.TIMESTAMP)
	public Date getDate() {
		return this.date;
	}
	public String getTitle() {
		return this.title;
	}
	public void setDate(Date when) {
		this.date = when;
	}
	public void setTitle(String txt) {
		this.title = txt;
	}
}

Before any operation with Hibernate, we must create a SessionFactory:

SessionFactory factory = new AnnotationConfiguration()
	.configure()
	.addAnnotatedClass(Post.class)
	.buildSessionFactory();

This factory will give us “sessions” every time we want to work with Post objects. Every session manipulation must be wrapped in this block of code:

Session session = factory.openSession();
// Declared outside the try block so that the catch block can see it
Transaction txn = null;
try {
  txn = session.beginTransaction();
  // your manipulations with the ORM, see below
  txn.commit();
} catch (HibernateException ex) {
  if (txn != null) {
    txn.rollback();
  }
} finally {
  session.close();
}

When the session is ready, this is how we get a list of all records from this table:

List posts = session.createQuery("FROM Post").list();
for (Post post : (List<Post>) posts) {
  System.out.println("Title: " + post.getTitle());
}

I think you understand what is going on here. Hibernate is a big, powerful engine that establishes a database connection, executes the necessary SELECT queries, and fetches the data. It then creates instances of the Post class and fills them with data. When an object reaches us, it is already filled with data, and to access that data we have to use getters, such as getTitle() above.

When we want to reverse the operation and send an object to the database, we do the same thing, but in reverse order. We create an instance of the Post class, fill it with data, and ask Hibernate to save it:

Post post = new Post();
post.setDate(new Date());
post.setTitle("How to cook an omelette");
session.save(post);

This is how almost every ORM works. The basic principle is always the same: ORM objects are anemic envelopes with data. We talk to the ORM framework, and the framework talks to the database. Objects only help us send requests to the ORM framework and understand its response. Besides getters and setters, these objects have no other methods. They don’t even know which database they came from.

This is how object-relational mapping works.

What’s wrong with that, you ask? Everything!

What’s wrong with ORM?

Seriously, what’s wrong? Hibernate has been one of the most popular Java libraries for over 10 years. Almost every SQL-intensive application in the world uses it. Every Java tutorial mentions Hibernate (or maybe some other ORM like TopLink or OpenJPA) for a database-connected application. This is the de facto standard, and yet I say it’s wrong? Yes.

I contend that the whole idea behind ORM is wrong. Its invention was perhaps the second big mistake in OOP, after the NULL reference.

In fact, I’m not the only one saying something like this, and certainly not the first. A lot has already been published on this subject by highly respected authors, including OrmHate by Martin Fowler (not against ORM, but worth mentioning anyway), Object-Relational Mapping Is the Vietnam of Computer Science by Jeff Atwood, The Vietnam of Computer Science by Ted Neward, ORM Is an Anti-Pattern by Laurie Voss, and many others.

However, my argument is different from theirs. Even though their reasons are practical and valid, such as “ORM is slow” or “database updates are difficult”, they miss the main point. You can find a very good, practical answer to these practical arguments in Bozhidar Bozhanov’s blog post ORM Haters Don’t Get It.

The bottom line is that ORM, instead of encapsulating interaction with the database inside the object, extracts it, literally tearing apart a strong and cohesive living organism. One part of the object keeps the data, while the other part, implemented inside the ORM engine (SessionFactory), knows how to handle this data and passes it to the relational database. Look at this picture; it illustrates what an ORM does.

I, as a reader of posts, have to deal with two components: 1) the ORM and 2) the “truncated” object returned to me. The behavior I’m interacting with is supposed to be exposed through a single entry point, which in OOP is an object. In the case of an ORM, I get this behavior through two entry points: the ORM engine and a “thing” that we can’t even call an object.

Because of this terrible and insulting violation of the object-oriented paradigm, we have many practical problems already mentioned in respected publications. I can only add a few more.

SQL is not hidden. ORM users must speak SQL (or a dialect of it, such as HQL). See the example above; we call session.createQuery("FROM Post") to get all posts. Even though this is not SQL, it is very similar to it. Thus, the relational model is not encapsulated inside objects. Instead, it is exposed to the entire application. Everyone, with every object, inevitably has to deal with the relational model in order to fetch or save something. Thus, the ORM does not hide or wrap SQL, but pollutes the entire application with it.

Difficult to test. When an object works with a list of posts, it has to deal with an instance of SessionFactory. How can we mock this dependency? Do we have to create a mock of it? How complex is that task? Look at the code above and you will see how verbose and cumbersome such a unit test would be. Instead, we can write integration tests and connect the whole application to a test version of PostgreSQL. In that case there is no need to mock SessionFactory, but such tests will be rather slow, and, more importantly, our objects that have nothing to do with the database will be tested against a database instance. A terrible idea.

Let me reiterate. The practical problems of ORM are just consequences. The fundamental flaw is that the ORM tears objects apart, terribly and offensively violating the very idea of what an object is.

SQL speaking objects

What is the alternative? Let me show you with an example. Let’s try to design the Post class. We will have to split it into two classes: Post and Posts, singular and plural. I already mentioned in one of my previous articles that a good object is always an abstraction of a real-life entity. Here’s how this principle works in practice. We have two entities: a database table and a table row. That’s why we’ll create two classes. Posts will represent the table, and Post will represent a row.

As I also mentioned in that article, each object must work by contract and implement an interface. Let’s start our design with two interfaces. Of course, our objects will be immutable. Here’s what Posts would look like:

interface Posts {
  Iterable<Post> iterate();
  Post add(Date date, String title);
}

Here’s what a single Post would look like:

interface Post {
  int id();
  Date date();
  String title();
}

This is how we will list all the records in the database table:

Posts posts = // we'll discuss this right now
for (Post post : posts.iterate()) {
  System.out.println("Title: " + post.title());
}

This is how we create a new Post:

Posts posts = // we'll discuss this right now
posts.add(new Date(), "How to cook an omelette");

As you can see, we now have real objects. They are in charge of all operations, and they perfectly hide the details of their implementation. There are no transactions, sessions, or factories. We don’t even know whether these objects actually talk to PostgreSQL or keep all their data in text files. All we need from Posts is the ability to list all posts for us and to create a new one. Implementation details are perfectly hidden inside. Now let’s see how we can implement these two classes.
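To make the point about hidden storage concrete, here is a minimal sketch of an implementation that keeps everything in memory. The names MemPost, MemPosts, and MemPostsDemo are hypothetical and do not come from the original article; the two interfaces are repeated only so that the sketch compiles on its own. Something like this could also stand in for the database in unit tests.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

// The two contracts from the article, repeated so the sketch is self-contained.
interface Post {
  int id();
  Date date();
  String title();
}

interface Posts {
  Iterable<Post> iterate();
  Post add(Date date, String title);
}

// Hypothetical in-memory post: an immutable holder of its own data.
final class MemPost implements Post {
  private final int number;
  private final Date when;
  private final String text;
  MemPost(int id, Date date, String title) {
    this.number = id;
    this.when = new Date(date.getTime()); // defensive copy, Date is mutable
    this.text = title;
  }
  public int id() {
    return this.number;
  }
  public Date date() {
    return new Date(this.when.getTime());
  }
  public String title() {
    return this.text;
  }
}

// Hypothetical in-memory "table". The list reference never changes, but,
// like a real table, its content grows when add() is called.
final class MemPosts implements Posts {
  private final List<Post> rows = new ArrayList<>();
  public synchronized Iterable<Post> iterate() {
    // Return a read-only snapshot so callers cannot modify the storage.
    return Collections.unmodifiableList(new ArrayList<>(this.rows));
  }
  public synchronized Post add(Date date, String title) {
    Post post = new MemPost(this.rows.size() + 1, date, title);
    this.rows.add(post);
    return post;
  }
}

final class MemPostsDemo {
  public static void main(String[] args) {
    Posts posts = new MemPosts();
    posts.add(new Date(), "How to cook an omelette");
    for (Post post : posts.iterate()) {
      System.out.println("Title: " + post.title());
    }
  }
}
```

The client code in main() is the same loop as in the article; it cannot tell whether it is talking to PostgreSQL or to a list in memory.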

I’m going to use jcabi-jdbc as a JDBC wrapper, but you can use something else like jOOQ, or just JDBC if you prefer. It doesn’t really matter. The important thing is that your interactions with the database are hidden inside the objects. Let’s start with Posts and implement it in the class PgPosts (“pg” stands for PostgreSQL):

final class PgPosts implements Posts {
  private final DataSource dbase;
  public PgPosts(DataSource data) {
    this.dbase = data;
  }
  public Iterable<Post> iterate() {
    return new JdbcSession(this.dbase)
      .sql("SELECT id FROM post")
      .select(
        new ListOutcome<Post>(
          new ListOutcome.Mapping<Post>() {
            @Override
            public Post map(final ResultSet rset) throws SQLException {
              return new PgPost(
                PgPosts.this.dbase,
                rset.getInt(1)
              );
            }
          }
        )
      );
  }
  public Post add(Date date, String title) {
    return new PgPost(
      this.dbase,
      new JdbcSession(this.dbase)
        .sql("INSERT INTO post (date, title) VALUES (?, ?)")
        .set(new Utc(date))
        .set(title)
        .insert(new SingleOutcome<Integer>(Integer.class))
    );
  }
}

Next, let’s implement the interface Post in class PgPost:

final class PgPost implements Post {
  private final DataSource dbase;
  private final int number;
  public PgPost(DataSource data, int id) {
    this.dbase = data;
    this.number = id;
  }
  public int id() {
    return this.number;
  }
  public Date date() {
    return new JdbcSession(this.dbase)
      .sql("SELECT date FROM post WHERE id = ?")
      .set(this.number)
      .select(new SingleOutcome<Utc>(Utc.class));
  }
  public String title() {
    return new JdbcSession(this.dbase)
      .sql("SELECT title FROM post WHERE id = ?")
      .set(this.number)
      .select(new SingleOutcome<String>(String.class));
  }
}

Here’s what a full database interaction script would look like using the classes we just created:

Posts posts = new PgPosts(dbase);
for (Post post : posts.iterate()){
  System.out.println("Title: " + post.title());
}
Post post = posts.add(
  new Date(), "How to cook an omelette"
);
System.out.println("Just added post #" + post.id());

You can see a complete practical example here. This is an open source web application that works with PostgreSQL using the exact approach described above – SQL speaking objects.

How about performance?

I hear you ask: “What about performance?” In the script a few lines above, we make a lot of redundant round trips to the database. First, we retrieve the post IDs with SELECT id, and then, to get their titles, we make an extra SELECT title call for each post. This is inefficient or, simply put, too slow.

Don’t worry, this is object-oriented programming, which means it’s flexible! Let’s create a decorator of PgPost that accepts all the data in its constructor and caches it internally forever:

final class ConstPost implements Post {
  private final Post origin;
  private final Date dte;
  private final String ttl;
  public ConstPost(Post post, Date date, String title) {
    this.origin = post;
    this.dte = date;
    this.ttl = title;
  }
  public int id() {
    return this.origin.id();
  }
  public Date date() {
    return this.dte;
  }
  public String title() {
    return this.ttl;
  }
}

Note that this decorator doesn’t know anything about PostgreSQL or JDBC. It just decorates an object of type Post and pre-caches the date and title. As usual, this decorator is also immutable.

Now let’s create another implementation of Posts that returns the “constant” objects:
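A small sketch may help show the effect of this decorator. CountingPost below is a hypothetical stand-in for PgPost (it is not part of the original article): instead of running a SELECT, it increments a counter each time its date or title is requested. ConstPost is copied from above so that the example compiles on its own.

```java
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

interface Post {
  int id();
  Date date();
  String title();
}

// Same decorator as in the article.
final class ConstPost implements Post {
  private final Post origin;
  private final Date dte;
  private final String ttl;
  public ConstPost(Post post, Date date, String title) {
    this.origin = post;
    this.dte = date;
    this.ttl = title;
  }
  public int id() {
    return this.origin.id();
  }
  public Date date() {
    return this.dte;
  }
  public String title() {
    return this.ttl;
  }
}

// Hypothetical stand-in for PgPost: counts "queries" instead of running SQL.
final class CountingPost implements Post {
  private final AtomicInteger calls;
  CountingPost(AtomicInteger counter) {
    this.calls = counter;
  }
  public int id() {
    return 9; // like PgPost.id(), this does not touch the "database"
  }
  public Date date() {
    this.calls.incrementAndGet();
    return new Date(0L);
  }
  public String title() {
    this.calls.incrementAndGet();
    return "How to cook a sandwich";
  }
}

final class ConstPostDemo {
  public static void main(String[] args) {
    AtomicInteger queries = new AtomicInteger();
    Post post = new ConstPost(
      new CountingPost(queries), new Date(0L), "How to cook a sandwich"
    );
    post.title();
    post.title();
    post.date();
    // The decorated post answered from its own fields every time.
    System.out.println("Queries issued: " + queries.get());
  }
}
```

However many times title() and date() are called, the counter stays at zero, because the origin is only consulted for id().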

final class ConstPgPosts implements Posts {
  // ...
  public Iterable<Post> iterate() {
    return new JdbcSession(this.dbase)
      .sql("SELECT * FROM post")
      .select(
        new ListOutcome<Post>(
          new ListOutcome.Mapping<Post>() {
            @Override
            public Post map(final ResultSet rset) throws SQLException {
              return new ConstPost(
                new PgPost(
                  ConstPgPosts.this.dbase,
                  rset.getInt(1)
                ),
                Utc.getTimestamp(rset, 2),
                rset.getString(3)
              );
            }
          }
        )
      );
  }
}

Now all posts returned by iterate() of this new class are pre-loaded with the dates and titles fetched in a single round trip to the database.

By using decorators and multiple implementations of the same interface, you can create any functionality you wish. What is most important is that while the functionality is expanding, the complexity of the design is not increasing because the classes are not increasing in size. Instead, we’re introducing new classes that stay cohesive and solid because they’re small.
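As one more illustration of the same idea, here is a sketch of a different decorator, one that loads each value on first use and then remembers it. CachedPost and CountingPost are hypothetical names that do not appear in the original article. Unlike ConstPost, this decorator gives up strict immutability in exchange for lazy loading, and it assumes the origin never returns null.

```java
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

interface Post {
  int id();
  Date date();
  String title();
}

// Hypothetical lazy decorator: asks the origin at most once per value.
final class CachedPost implements Post {
  private final Post origin;
  private Date dte;   // null until date() is called for the first time
  private String ttl; // null until title() is called for the first time
  CachedPost(Post post) {
    this.origin = post;
  }
  public int id() {
    return this.origin.id();
  }
  public synchronized Date date() {
    if (this.dte == null) {
      this.dte = this.origin.date();
    }
    return this.dte;
  }
  public synchronized String title() {
    if (this.ttl == null) {
      this.ttl = this.origin.title();
    }
    return this.ttl;
  }
}

// Hypothetical stand-in for PgPost: counts "queries" instead of running SQL.
final class CountingPost implements Post {
  private final AtomicInteger calls;
  CountingPost(AtomicInteger counter) {
    this.calls = counter;
  }
  public int id() {
    return 13;
  }
  public Date date() {
    this.calls.incrementAndGet();
    return new Date(0L);
  }
  public String title() {
    this.calls.incrementAndGet();
    return "My favorite movies";
  }
}

final class CachedPostDemo {
  public static void main(String[] args) {
    AtomicInteger queries = new AtomicInteger();
    Post post = new CachedPost(new CountingPost(queries));
    post.title();
    post.title();
    post.title();
    // Only the first title() call reached the origin.
    System.out.println("Queries issued: " + queries.get());
  }
}
```

Neither PgPost nor the client code has to change; the new behavior lives in one more small class.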

Regarding transactions

Each object should deal with its own transactions and encapsulate them just like SELECT or INSERT queries. This will lead to nested transactions, which is fine as long as the database server supports them. If there is no such support, create a session-wide transaction object that accepts a “callable” class. For example:

final class Txn {
  private final DataSource dbase;
  public Txn(DataSource data) {
    this.dbase = data;
  }
  // Declares Exception because the callable's failure is rethrown as is
  public <T> T call(Callable<T> callable) throws Exception {
    JdbcSession session = new JdbcSession(this.dbase);
    try {
      session.sql("START TRANSACTION").exec();
      T result = callable.call();
      session.sql("COMMIT").exec();
      return result;
    } catch (Exception ex) {
      session.sql("ROLLBACK").exec();
      throw ex;
    }
  }
}

Then when you want to wrap multiple object manipulations in a single transaction, do it like this:

new Txn(dbase).call(
  new Callable<Integer>() {
    @Override
    public Integer call() {
      Posts posts = new PgPosts(dbase);
      Post post = posts.add(
        new Date(), "How to cook an omelette"
      );
      post.comments().post("This is my first comment!");
      return post.id();
    }
  }
);

This code will create a new post and post a comment on it. If one of the calls fails, the entire transaction will be rolled back.
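The commit-or-rollback logic can be checked without a database. The sketch below is only an illustration under stated assumptions: Statements is a hypothetical interface standing in for "something that can execute SQL", SketchTxn mirrors the structure of Txn above, and RecordingStatements is a fake that writes the statements to a list instead of sending them to a server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical abstraction over anything that can execute a SQL statement.
interface Statements {
  void exec(String sql) throws Exception;
}

// Same shape as Txn in the article, but written against Statements.
final class SketchTxn {
  private final Statements session;
  SketchTxn(Statements stmts) {
    this.session = stmts;
  }
  public <T> T call(Callable<T> callable) throws Exception {
    this.session.exec("START TRANSACTION");
    try {
      T result = callable.call();
      this.session.exec("COMMIT");
      return result;
    } catch (Exception ex) {
      // Any failure inside the callable undoes the whole unit of work.
      this.session.exec("ROLLBACK");
      throw ex;
    }
  }
}

// Fake that records statements instead of sending them to a database.
final class RecordingStatements implements Statements {
  private final List<String> log = new ArrayList<>();
  public void exec(String sql) {
    this.log.add(sql);
  }
  public List<String> log() {
    return new ArrayList<>(this.log);
  }
}

final class SketchTxnDemo {
  public static void main(String[] args) throws Exception {
    RecordingStatements fake = new RecordingStatements();
    SketchTxn txn = new SketchTxn(fake);
    Integer id = txn.call(() -> 42);
    System.out.println("Returned " + id + ", statements: " + fake.log());
    try {
      txn.call(() -> {
        throw new IllegalStateException("comment failed");
      });
    } catch (IllegalStateException ex) {
      System.out.println("After failure: " + fake.log());
    }
  }
}
```

A successful callable is followed by COMMIT, a failing one by ROLLBACK, and the original exception still reaches the caller.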

This approach looks object-oriented to me. I call these “SQL speaking objects” because they know how to speak SQL to the database server. This is their skill, perfectly encapsulated within their boundaries.
