Repository antipattern in Android

This translation was prepared ahead of the start of the course “Android Developer. Professional”.

The official Android Application Architecture Guide recommends using repository classes to “provide a clean API so the rest of the application can easily retrieve data.” However, in my opinion, if you use this pattern in your project, you are guaranteed to get bogged down in messy spaghetti code.

In this article, I will tell you about the “Repository pattern” and explain why it is actually an anti-pattern for Android applications.

Repository

The Application Architecture Guide mentioned above recommends the following structure for organizing presentation-tier logic:

The role of the repository object in this structure is as follows:

Repository modules handle data operations. They provide a clean API so the rest of the application can retrieve this data easily. They know where to get the data and what API calls to make when it is updated. You can think of repositories as intermediaries between different data sources such as persistent models, web services, and caches.

Basically, the guide recommends using repositories to abstract the data source in your application. Sounds very reasonable and even useful, doesn’t it?
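
To make the idea concrete, here is a minimal sketch of a repository that mediates between a remote source and a local cache. All names (TasksRemoteSource, TasksLocalSource, SimpleTasksRepository and the fakes) are hypothetical and do not come from the guide, and plain synchronous functions are used instead of coroutines to keep the example free of dependencies.

```kotlin
// Hypothetical types for illustration only; none of them come from the guide.
data class Task(val id: String, val title: String, val isCompleted: Boolean = false)

interface TasksRemoteSource {
    fun fetchTasks(): List<Task>
}

interface TasksLocalSource {
    fun loadTasks(): List<Task>
    fun replaceAll(tasks: List<Task>)
}

// The repository decides where the data comes from; callers only see getTasks().
class SimpleTasksRepository(
    private val remote: TasksRemoteSource,
    private val local: TasksLocalSource
) {
    fun getTasks(forceUpdate: Boolean = false): List<Task> {
        // Hit the remote source only when asked to, or when nothing is cached yet.
        if (forceUpdate || local.loadTasks().isEmpty()) {
            local.replaceAll(remote.fetchTasks())
        }
        return local.loadTasks()
    }
}

// In-memory stand-ins for a web service and a database.
class FakeRemoteSource(private val tasks: List<Task>) : TasksRemoteSource {
    var fetchCount = 0
        private set

    override fun fetchTasks(): List<Task> {
        fetchCount++
        return tasks
    }
}

class InMemoryLocalSource : TasksLocalSource {
    private var tasks: List<Task> = emptyList()

    override fun loadTasks(): List<Task> = tasks

    override fun replaceAll(tasks: List<Task>) {
        this.tasks = tasks
    }
}
```

In this sketch the first call to getTasks() populates the cache from the remote source, later calls are served locally, and forceUpdate = true triggers another remote fetch.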

However, talk is cheap compared with writing code, and discussing architecture through UML diagrams is cheaper still. The real test of any architectural pattern is implementing it in code and then identifying its advantages and disadvantages. So let’s find something less abstract to review.

Repository in Android Architecture Blueprints v2

About two years ago, I reviewed the “first version” of Android Architecture Blueprints. In theory, they were supposed to be a clean MVP example, but in practice these blueprints produced a rather dirty code base. They did contain interfaces named View and Presenter, but did not establish any architectural boundaries, so the result was not really MVP. You can see this code review here…

Since then, Google has updated architectural blueprints using Kotlin, ViewModel, and other “modern” practices, including repositories. These updated blueprints have been prefixed with v2.

Let’s take a look at the interface TasksRepository from blueprints v2:

interface TasksRepository {
   fun observeTasks(): LiveData<Result<List<Task>>>
   suspend fun getTasks(forceUpdate: Boolean = false): Result<List<Task>>
   suspend fun refreshTasks()
   fun observeTask(taskId: String): LiveData<Result<Task>>
   suspend fun getTask(taskId: String, forceUpdate: Boolean = false): Result<Task>
   suspend fun refreshTask(taskId: String)
   suspend fun saveTask(task: Task)
   suspend fun completeTask(task: Task)
   suspend fun completeTask(taskId: String)
   suspend fun activateTask(task: Task)
   suspend fun activateTask(taskId: String)
   suspend fun clearCompletedTasks()
   suspend fun deleteAllTasks()
   suspend fun deleteTask(taskId: String)
}

Even before reading the code, notice the size of this interface: that alone is a warning sign. This many methods in one interface would raise questions even in a large Android project, but we are talking about a ToDo application with only 2000 lines of code. Why does this rather trivial application need a class with such a huge API surface?

Repository as a God Object

The answer to the question from the previous section lies in the names of the TasksRepository methods. I can roughly divide the methods of this interface into three non-overlapping groups.

Group 1:

   fun observeTasks(): LiveData<Result<List<Task>>>
   fun observeTask(taskId: String): LiveData<Result<Task>>

Group 2:

   suspend fun getTasks(forceUpdate: Boolean = false): Result<List<Task>>
   suspend fun refreshTasks()
   suspend fun getTask(taskId: String, forceUpdate: Boolean = false): Result<Task>
   suspend fun refreshTask(taskId: String)
   suspend fun saveTask(task: Task)
   suspend fun deleteAllTasks()
   suspend fun deleteTask(taskId: String)

Group 3:

   suspend fun completeTask(task: Task)
   suspend fun completeTask(taskId: String)
   suspend fun clearCompletedTasks()
   suspend fun activateTask(task: Task)
   suspend fun activateTask(taskId: String)

Now let’s define the areas of responsibility of each of the above groups.

Group 1 is basically an implementation of the Observer pattern using LiveData. Group 2 is a gateway to the data store, plus the two refresh methods, which are needed because the remote data store is hidden behind the repository. Group 3 contains methods that implement two pieces of the application’s domain logic (task completion and activation).

So this one interface has three different responsibilities. No wonder it’s so big. And although it can be argued that having the first and second groups in a single interface is acceptable, adding the third is unjustified. If this project were developed further into a real Android application, the third group would grow in direct proportion to the number of domain flows in the project. Hmm.

We have a special term for classes that take on so many responsibilities: God objects. This is a widespread anti-pattern in Android applications. Activities and Fragments are the usual suspects in this context, but other classes can degenerate into God objects too. Especially if their names end in “Manager”, right?

Wait … I think I found a better name for TasksRepository:

interface TasksManager {
   fun observeTasks(): LiveData<Result<List<Task>>>
   suspend fun getTasks(forceUpdate: Boolean = false): Result<List<Task>>
   suspend fun refreshTasks()
   fun observeTask(taskId: String): LiveData<Result<Task>>
   suspend fun getTask(taskId: String, forceUpdate: Boolean = false): Result<Task>
   suspend fun refreshTask(taskId: String)
   suspend fun saveTask(task: Task)
   suspend fun completeTask(task: Task)
   suspend fun completeTask(taskId: String)
   suspend fun activateTask(task: Task)
   suspend fun activateTask(taskId: String)
   suspend fun clearCompletedTasks()
   suspend fun deleteAllTasks()
   suspend fun deleteTask(taskId: String)
}

Now the name of this interface reflects its responsibilities much better!

Anemic repositories

Here you may ask: “If I pull the domain logic out of the repository, will that solve the problem?” Well, let’s go back to the “architecture diagram” from Google’s guide.

If you want to extract, say, the completeTask methods from TasksRepository, where would you put them? According to the “architecture” Google recommends, you would need to move this logic into one of your ViewModels. It doesn’t seem like such a bad decision, but it really is.

For example, imagine you put this logic into one ViewModel. Then, a month later, your account manager wants to allow users to complete tasks from multiple screens (this is relevant to all ToDo managers I’ve ever used). The logic inside the ViewModel cannot be reused, so you need to either duplicate it or move it back into TasksRepository. Obviously, both approaches are bad.

A better approach would be to extract this domain flow into a dedicated object and then put it between the ViewModel and the repository. Then different ViewModels will be able to reuse that object to execute that particular flow. These objects are known as “use cases” or “interactors”. However, if you add use cases to your codebase, repositories become essentially useless boilerplate. Whatever they do would fit better inside the use cases. Gabor Varadi has already covered this topic in this article, so I won’t go into details. I subscribe to almost everything he said about “anemic repositories”.

But why are use cases so much better than repositories? The answer is simple: use cases encapsulate individual flows. Hence, instead of one repository (for each domain concept) that gradually grows into a God object, you will have several narrowly focused use-case classes. If a flow depends on both the network and stored data, you can pass the appropriate abstractions to the use-case class and it will “arbitrate” between these sources.
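
As a rough illustration of this idea, the sketch below shows one use-case class for the task completion flow, reused by two screens. Everything here is an assumption made for the example: TasksDao, TasksApi, CompleteTaskUseCase and both ViewModel classes are hypothetical names, the ViewModels are plain classes rather than androidx ViewModels, and the functions are synchronous rather than suspend to avoid a coroutine dependency.

```kotlin
data class Task(val id: String, val title: String, val isCompleted: Boolean = false)

// Hypothetical narrow abstractions over the two data sources.
interface TasksDao {
    fun getTask(taskId: String): Task?
    fun saveTask(task: Task)
}

interface TasksApi {
    fun markCompleted(taskId: String)
}

// One class per domain flow: completing a task.
class CompleteTaskUseCase(private val dao: TasksDao, private val api: TasksApi) {
    // Returns false if the task does not exist, true otherwise.
    fun completeTask(taskId: String): Boolean {
        val task = dao.getTask(taskId) ?: return false
        if (task.isCompleted) return true          // nothing to do
        api.markCompleted(taskId)                  // tell the server first
        dao.saveTask(task.copy(isCompleted = true)) // then update local storage
        return true
    }
}

// Two screens reuse the same flow without duplicating it.
class TaskListViewModel(private val completeTask: CompleteTaskUseCase) {
    fun onTaskChecked(taskId: String): Boolean = completeTask.completeTask(taskId)
}

class TaskDetailsViewModel(private val completeTask: CompleteTaskUseCase) {
    fun onCompleteClicked(taskId: String): Boolean = completeTask.completeTask(taskId)
}

// In-memory stand-ins so the sketch can be exercised without Android.
class InMemoryTasksDao(initial: List<Task>) : TasksDao {
    private val tasks = initial.associateBy { it.id }.toMutableMap()

    override fun getTask(taskId: String): Task? = tasks[taskId]

    override fun saveTask(task: Task) {
        tasks[task.id] = task
    }
}

class RecordingTasksApi : TasksApi {
    val completedIds = mutableListOf<String>()

    override fun markCompleted(taskId: String) {
        completedIds += taskId
    }
}
```

Note that the use case itself decides how to combine the network call and local storage; no repository sits in between.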

In general, it looks like the only way to prevent repositories from degrading into God classes while avoiding unnecessary abstractions is to get rid of the repositories.

Repositories outside of Android

Now you may be wondering if repositories are a Google invention. No, they are not. The repository pattern was described long before Google decided to use it in its architecture guide.

For example, Martin Fowler described repositories in his book, Patterns of Enterprise Application Architecture. His blog also has a guest article describing the same concept. According to Fowler, a repository is just a wrapper around the storage tier that provides a higher-level query interface and possibly in-memory caching. I would say that from Fowler’s point of view, repositories behave like ORMs.

Eric Evans also described repositories in his book Domain-Driven Design. He wrote:

Clients request objects from the repository using query methods that select objects based on criteria specified by the client — usually the values of certain attributes. The repository retrieves the requested object by encapsulating the database query and metadata mapping engine. Repositories can implement various queries that select objects based on whatever criteria the client requires.

Note that you can replace the “repository” in the above quote with “Room ORM” and it still makes sense. So, in the context of Domain Driven Design, a repository is an ORM (implemented by hand or using a third party framework).
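
A repository in this classic sense might look roughly like the sketch below: clients select objects by criteria and the repository encapsulates the storage. The names TaskRepository and InMemoryTaskRepository are hypothetical, and a map stands in for the database; a real implementation would translate these queries into database calls, by hand or through an ORM.

```kotlin
data class Task(val id: String, val title: String, val isCompleted: Boolean)

// Query methods select objects by criteria supplied by the client.
interface TaskRepository {
    fun findById(id: String): Task?
    fun findByCompletion(isCompleted: Boolean): List<Task>
    fun findMatching(criteria: (Task) -> Boolean): List<Task>
    fun save(task: Task)
}

// A map stands in for the database and the mapping layer.
class InMemoryTaskRepository : TaskRepository {
    private val storage = LinkedHashMap<String, Task>()

    override fun findById(id: String): Task? = storage[id]

    override fun findByCompletion(isCompleted: Boolean): List<Task> =
        findMatching { it.isCompleted == isCompleted }

    override fun findMatching(criteria: (Task) -> Boolean): List<Task> =
        storage.values.filter(criteria)

    override fun save(task: Task) {
        storage[task.id] = task
    }
}
```

Nothing in this interface mentions the network: it is purely a query layer over one store.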

As you can see, the repository was not invented in the Android world. This is a very sane design pattern that all ORM frameworks are built on. Note, however, what repositories are not: none of the “classics” ever argued that repositories should try to abstract away the distinction between network access and database access.

In fact, I’m pretty sure they would find this idea naive and self-defeating. To understand why, you can read another article, this time by Joel Spolsky (co-founder of Stack Overflow), titled “The Law of Leaky Abstractions”. Simply put: networking is too different from database access to abstract without significant leaks.
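
One way such a leak can show up is sketched below, under the assumption of a hypothetical LeakyTasksRepository that hides a network fetch behind the same method as a cache read. The caller still has to know about forceUpdate and still has to handle a network failure, so the distinction the abstraction was meant to hide remains visible.

```kotlin
import java.io.IOException

data class Task(val id: String, val title: String)

// Hypothetical repository: one method for both cached and remote data.
class LeakyTasksRepository(
    private val fetchFromNetwork: () -> List<Task>,
    private val cache: MutableList<Task> = mutableListOf()
) {
    fun getTasks(forceUpdate: Boolean): List<Task> {
        if (forceUpdate) {
            // Leak 1: the parameter only makes sense if a network exists.
            // Leak 2: this call can fail in ways a local read cannot.
            val fresh = fetchFromNetwork()
            cache.clear()
            cache.addAll(fresh)
        }
        return cache.toList()
    }
}

// Callers end up writing network-aware code anyway.
fun loadTasksSafely(repository: LeakyTasksRepository): List<Task> =
    try {
        repository.getTasks(forceUpdate = true)
    } catch (e: IOException) {
        repository.getTasks(forceUpdate = false) // fall back to cached data
    }
```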

How the repository became an anti-pattern in Android

So has Google misinterpreted the repository pattern and introduced the naive idea of abstracting network access into it? I doubt it.

I found the oldest reference to this antipattern in this repository on GitHub, which, unfortunately, is a very popular resource. I don’t know if this particular author invented this antipattern, but it looks like it was this repo that popularized the general idea within the Android ecosystem. Google developers probably got it from there or from one of the secondary sources.

Conclusion

So, the repository in Android has become an anti-pattern. It looks good on paper, but it becomes problematic even in trivial applications and can lead to real problems in larger projects.

For example, in another Google blueprint, this time for Architecture Components, the use of repositories eventually led to gems such as NetworkBoundResource. Keep in mind that the GitHub browser sample is still a tiny ~2 KLOC app.

As far as I can tell, the “repository pattern” as defined in the official docs is incompatible with clean and maintainable code.

Thanks for reading and as usual you can leave your comments and questions below.