Swift and Core Data, or how to build a Swift ORM on top of an Objective-C ORM

Hello, Habr! My name is Geor, and I develop iOS projects at Prisma Labs. As you have probably guessed, today we will talk about Core Data, and many of you are already bored. But do not despair: we will mostly talk about the magic of Swift and about Metal. Just kidding, Metal is for another time. This is the story of how we defeated the NSManaged boilerplate, reinvented migrations and made Core Data great again.

Developers, let's go.

A few words about motivation

Working with Core Data is hard, especially in the Swift era. It is a very old framework, designed as a data layer with an emphasis on I/O optimization, which by default makes it more complex than other ways of storing data. Hardware performance has long since stopped being the bottleneck, but the complexity of Core Data, alas, has not gone anywhere. In modern applications many people prefer other frameworks to Core Data: Realm, GRDB (top-notch), and so on. Or they just use files (why not). Even Apple uses Codable serialization/deserialization for persistence in its newer tutorials.

Although the Core Data API has periodically gained convenient abstractions (NSPersistentContainer, for example), developers still have to track the life cycle of NSManaged objects, remember to read and write on the queue of the context those objects belong to and, of course, swear every time something goes wrong. And many projects surely contain a duplicate set of domain-level models plus the code that converts between them and their NSManaged counterparts.

But Core Data also has many advantages: a powerful visual editor for the data schema, automatic migrations, a query system that is simpler than SQL, safe multithreaded access to data, and so on.

At Prisma Labs, we wrote a simple and powerful framework that lets you forget about the drawbacks of Core Data while still using all the power of its light side.

Meet Sworm

I will not describe the whole structure of the framework, there is a repository for that. Instead I will focus on the main ideas and implementation tricks.

Dropping NSManagedObject inheritance and built-in Core Data code generation

Instead, NSManagedObjects are used directly as key-value (KV) containers. The idea is not new; the difficulty lies in automating the conversion between the KV container and the domain model. To solve this, we need to build three bridges:

  1. name

  2. attributes

  3. relations

The name is simple: by putting a string with the entity name into the model type, you unambiguously associate that type with the entity in the data model:

struct Foo {
    static let entityName: String = "FooEntity"
}

The relations bridge is a more complex construction. In the case of the name, a static field declared inside a type is automatically associated with that type:

Foo.entityName

But to define a relationship we need, in addition to its name, the type of the destination model, which in turn must carry the name of its own entity. This suggests two things. First, all models that are converted to NSManagedObject must follow the same set of rules, so it is time for a protocol. Second, we need an additional data type, Relation(name: String), that binds the name of a relationship in the data model to the corresponding type in the domain model. For now, let's ignore the fact that relationships come in different kinds; it does not matter at this stage. So, protocol version 1:

protocol ManagedObjectConvertible {
    static var entityName: String { get }
}

and the type for the relationship:

struct Relation<T: ManagedObjectConvertible> { let name: String }

We apply:

struct Foo: ManagedObjectConvertible {
    static let entityName: String = "FooEntity"

    static let relation1 = Relation<Bar1>(name: "bar1")
    static let relation2 = Relation<Bar2>(name: "bar2")
}

The obvious next step is to make relations part of our protocol, but how can we do that when every model has a different number of them? A collection of relations will not work, for several reasons. First, generics in Swift are invariant. Second, sooner or later we will have to remember that Relation splits into several kinds (one, many, ordered many), which leads straight to making the collection homogeneous through type erasure, and that does not suit us. In fact, though, we are not interested in the relations themselves and do not even need to think about how many there are. So instead of adding a concrete relation type to the protocol, we add an associated type that holds the relations. It sounds strange and is not obvious at first glance, but hold my beer. Protocol version 2:

protocol ManagedObjectConvertible {
    associatedtype Relations

    static var entityName: String { get }
    static var relations: Relations { get }
}

Still weird? Keep holding the beer:

struct Foo: ManagedObjectConvertible {
    static let entityName: String = "FooEntity"

    struct Relations {
        let relation1 = Relation<Bar1>(name: "bar1")
        let relation2 = Relation<Bar2>(name: "bar2")
    }

    static let relations = Relations()
}

And now it will become clear – with the help of such an implementation, you can easily get the name of the relationship:

extension ManagedObjectConvertible {
    func relationName<T: ManagedObjectConvertible>(
        keyPath: KeyPath<Self.Relations, Relation<T>>
    ) -> String {
        Self.relations[keyPath: keyPath].name
    }
}
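Putting the pieces above together, a minimal self-contained sketch might look like the following. The Bar model and the relation name "bar" are invented for illustration, and relationName is declared static here so that it can be called without an instance; this is not necessarily how Sworm spells it.

```swift
// Protocol version 2 from the article.
protocol ManagedObjectConvertible {
    associatedtype Relations

    static var entityName: String { get }
    static var relations: Relations { get }
}

// Binds a relationship name in the data schema to a destination model type.
struct Relation<T: ManagedObjectConvertible> {
    let name: String
}

extension ManagedObjectConvertible {
    // Resolves the relationship name through a type-checked key path.
    static func relationName<T: ManagedObjectConvertible>(
        keyPath: KeyPath<Relations, Relation<T>>
    ) -> String {
        relations[keyPath: keyPath].name
    }
}

// Hypothetical destination model with no relations of its own.
struct Bar: ManagedObjectConvertible {
    static let entityName = "BarEntity"

    struct Relations {}

    static let relations = Relations()
}

struct Foo: ManagedObjectConvertible {
    static let entityName = "FooEntity"

    struct Relations {
        let bar = Relation<Bar>(name: "bar")
    }

    static let relations = Relations()
}

// The key path is checked by the compiler: a misspelled relation will not build.
let barRelationName = Foo.relationName(keyPath: \.bar)
```

The design choice worth noting is that the number and kinds of relations never appear in the protocol: each model exposes them as ordinary stored properties of its own Relations type.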

You can hand the beer back now 🙂

The final bridge: attributes

As you know, any boss has weak points and this one is no exception.

At first glance the task looks similar to the relations bridge, but here we need to know about all available attributes, so an associated type alone will not do. We need a complete collection of attributes, each of which can do two things: encode a value from the model into the container and decode it from the container back into the model. Clearly, an attribute is a pairing of a WritableKeyPath and a String key. But, as with relations, there is a problem to solve: how to keep the type information, given that generics are invariant and that we need a homogeneous collection of attributes.

Let a special object of type Attribute<T> act as an attribute, where T is the domain model. The collection of attributes is then [Attribute<T>], and in our protocol T becomes Self. So, protocol version 3:

public protocol ManagedObjectConvertible {
    associatedtype Relations

    static var entityName: String { get }
    static var attributes: [Attribute<Self>] { get }
    static var relations: Relations { get }
}

Now let's implement the Attribute class itself. As a reminder, it is responsible for serializing and deserializing a field between the model and the KV container. First, let's forget for a moment about the need for a homogeneous type and do it head-on:

final class Attribute<T: ManagedObjectConvertible, V> {
    let keyPath: WritableKeyPath<T, V>
    let key: String

    ...

    func update(container: NSManagedObject, model: T) {
        container.setValue(model[keyPath: keyPath], forKey: key)
    }

    func update(model: inout T, container: NSManagedObject) {
        model[keyPath: keyPath] = container.value(forKey: key) as! V
    }
}

An attribute could be implemented like this, but [Attribute<T, V>] is not what we want. How can we get rid of V in the class signature while keeping the information about that type? Not everyone knows this, but in Swift an initializer can have generic parameters of its own:

final class Attribute<T: ManagedObjectConvertible> {
    ...

    init<V>(
        keyPath: WritableKeyPath<T, V>,
        key: String
    ) { ... }

    ...
}

Now the information about V is available when the attribute is initialized. To keep it from getting lost afterwards, we bring out Swift's answer to the BFG: closures.

final class Attribute<T: ManagedObjectConvertible> {
    let encode: (T, NSManagedObject) -> Void
    let decode: (inout T, NSManagedObject) -> Void

    init<V>(keyPath: WritableKeyPath<T, V>, key: String) {
        self.encode = {
            $1.setValue($0[keyPath: keyPath], forKey: key)
        }

        self.decode = {
            $0[keyPath: keyPath] = $1.value(forKey: key) as! V
        }
    }
}
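To see the erasure at work without a Core Data stack, here is a minimal sketch. The Container class is a made-up stand-in for NSManagedObject, the ManagedObjectConvertible constraint is dropped for brevity, and decode uses a conditional cast instead of as! so that a missing value is skipped rather than crashing; all three are assumptions of this sketch, not Sworm's implementation.

```swift
// Stand-in for NSManagedObject: a reference type with key-value storage.
final class Container {
    private var storage: [String: Any] = [:]

    func setValue(_ value: Any?, forKey key: String) { storage[key] = value }
    func value(forKey key: String) -> Any? { storage[key] }
}

final class Attribute<T> {
    let encode: (T, Container) -> Void
    let decode: (inout T, Container) -> Void

    // V exists only in the initializer signature; the closures capture it.
    init<V>(keyPath: WritableKeyPath<T, V>, key: String) {
        encode = { model, container in
            container.setValue(model[keyPath: keyPath], forKey: key)
        }
        decode = { model, container in
            // Skip silently if the stored value is missing or has another type.
            if let value = container.value(forKey: key) as? V {
                model[keyPath: keyPath] = value
            }
        }
    }
}

struct Foo {
    var x: Int = 0
    var title: String = ""
}

// An Int attribute and a String attribute share one homogeneous array.
let attributes: [Attribute<Foo>] = [
    Attribute(keyPath: \Foo.x, key: "x"),
    Attribute(keyPath: \Foo.title, key: "title"),
]

let container = Container()
let source = Foo(x: 42, title: "hello")
for attribute in attributes { attribute.encode(source, container) }

var restored = Foo()
for attribute in attributes { attribute.decode(&restored, container) }
```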

There is one more gap left in our protocol. We know how to create an NSManagedObject and fill it with data from the model, and we know how to fill a model from an NSManagedObject, but we do not know how to create an instance of the model when we need one.

Protocol – version 4, final:

protocol ManagedObjectConvertible {
    associatedtype Relations

    static var entityName: String { get }
    static var attributes: Set<Attribute<Self>> { get }
    static var relations: Relations { get }

    init()
}

That’s it – we have defeated the inheritance from NSManagedObjects, replacing it with the implementation of the protocol.
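The init() requirement is what lets generic code build a model out of a container. The sketch below shows one way this could look; the Container stand-in and the helper functions makeContainer and makeModel are hypothetical names, the Relations associated type is left out, and attributes is an array here (a Set would additionally require Attribute to be Hashable).

```swift
// Stand-in for NSManagedObject (an assumption of this sketch).
final class Container {
    private var storage: [String: Any] = [:]

    func setValue(_ value: Any?, forKey key: String) { storage[key] = value }
    func value(forKey key: String) -> Any? { storage[key] }
}

protocol ManagedObjectConvertible {
    static var entityName: String { get }
    static var attributes: [Attribute<Self>] { get }

    init()
}

final class Attribute<T: ManagedObjectConvertible> {
    let encode: (T, Container) -> Void
    let decode: (inout T, Container) -> Void

    init<V>(keyPath: WritableKeyPath<T, V>, key: String) {
        encode = { model, container in
            container.setValue(model[keyPath: keyPath], forKey: key)
        }
        decode = { model, container in
            if let value = container.value(forKey: key) as? V {
                model[keyPath: keyPath] = value
            }
        }
    }
}

// Model -> container: every attribute writes its own field.
func makeContainer<T: ManagedObjectConvertible>(from model: T) -> Container {
    let container = Container()
    for attribute in T.attributes { attribute.encode(model, container) }
    return container
}

// Container -> model: possible only because the protocol requires init().
func makeModel<T: ManagedObjectConvertible>(_ type: T.Type, from container: Container) -> T {
    var model = T()
    for attribute in T.attributes { attribute.decode(&model, container) }
    return model
}

struct Foo: ManagedObjectConvertible {
    static let entityName = "FooEntity"
    static let attributes: [Attribute<Foo>] = [
        Attribute(keyPath: \Foo.x, key: "x"),
        Attribute(keyPath: \Foo.title, key: "title"),
    ]

    var x: Int = 0
    var title: String = ""
}

let stored = makeContainer(from: Foo(x: 7, title: "seven"))
let loaded = makeModel(Foo.self, from: stored)
```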

Next, let’s look at how you can make the attribute system more flexible and wider.

Flexible attribute system

Core Data supports a set of primitive attributes: bool, int, double, string, data, and so on. Besides them there is the little-used Transformable, which lets you store data of various types in Core Data. The idea is great, and we decided to breathe new life into it with the help of the Swift type system.

Let’s define the following set of primitive attributes:

Bool, Int, Int16, Int32, Int64, Float, Double, Decimal, Date, String, Data, UUID, URL

And adopt a rule: an attribute type is valid if its data can be serialized into one of the primitives and deserialized back.

This can be easily expressed in two protocols:

protocol PrimitiveAttributeType {}

protocol SupportedAttributeType {
    associatedtype P: PrimitiveAttributeType

    func encodePrimitive() -> P

    static func decode(primitive: P) -> Self
}

By applying SupportedAttributeType in our Attribute implementation

final class Attribute<T: ManagedObjectConvertible> {
    let encode: (T, NSManagedObject) -> Void
    let decode: (inout T, NSManagedObject) -> Void

    init<V: SupportedAttributeType>(keyPath: WritableKeyPath<T, V>, key: String) {
        self.encode = {
            $1.setValue($0[keyPath: keyPath].encodePrimitive(), forKey: key)
        }

        self.decode = {
            $0[keyPath: keyPath] = V.decode(primitive: $1.value(forKey: key) as! V.P)
        }
    }
}

we will be able to store data of any type in Core Data, much like Transformable, but without the Objective-C legacy.
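As an illustration of the rule, here is how a custom type could be made storable. The Status enum and its fallback to .draft are invented for this sketch, and only Int and String are marked as primitives to keep it short.

```swift
protocol PrimitiveAttributeType {}

protocol SupportedAttributeType {
    associatedtype P: PrimitiveAttributeType

    func encodePrimitive() -> P

    static func decode(primitive: P) -> Self
}

// Two of the primitives from the list above.
extension Int: PrimitiveAttributeType {}
extension String: PrimitiveAttributeType {}

// A primitive is trivially its own representation.
extension Int: SupportedAttributeType {
    func encodePrimitive() -> Int { self }

    static func decode(primitive: Int) -> Int { primitive }
}

// Hypothetical domain type that is stored as a String.
enum Status: String {
    case draft
    case published
}

extension Status: SupportedAttributeType {
    func encodePrimitive() -> String { rawValue }

    static func decode(primitive: String) -> Status {
        // Unknown raw values fall back to .draft; a real model may prefer to fail.
        Status(rawValue: primitive) ?? .draft
    }
}

let encoded = Status.published.encodePrimitive()
let decoded = Status.decode(primitive: encoded)
```

With such a conformance in place, a Status property can be passed to the Attribute initializer just like an Int or a String.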

By combining flexible attributes and replacing NSManagedObject inheritance with the implementation of the protocol, you can greatly reduce the code base – remove a lot of boilerplate associated with duplicating models, copying the serialization code of composite attributes, and so on.

Thanks to ManagedObjectConvertible, we have uniquely linked our model types and data schema information. But in order to be able to perform operations with data based on this information, we need a layer of data access objects or DAOs, since domain models usually act as DTOs – data transfer objects.

Hiding NSManaged under the hood

If we look at the NSManaged layer in terms of DAO and DTO, then context + objects equals DAO + DTO. The sums are equal but the individual terms are not: besides transferring data, an NSManagedObject can also update it, although only with the help of its context. Let's redistribute the functionality between the NSManaged entities and our domain models. Our models are DTOs plus meta-information about the data schema (the ManagedObjectConvertible implementation). Let's write a pseudo-equation:

domain models + raw NSManaged objects + X = DAO + DTO

I call the NSManaged objects raw because, from the compiler's point of view, we have taken the data schema information away from them and moved it into the domain models.

And X is the missing piece that will link information about the data schema, information about the types of models with the NSManaged layer.

The solution to our pseudo-equation will be a new entity:

final class ManagedObject<T: ManagedObjectConvertible> {
    let instance: NSManagedObject

    ...
}

This class will serve as a facade for the NSManaged layer, using the generic in the type signature to access the data schema.

I will not go into the details of the final implementation because of the size of the framework, but I would like to demonstrate the power of dynamicMemberLookup in Swift, using relationships between models as an example.

As you may recall, ManagedObjectConvertible provides the entity name in the data schema, the attribute converters, and the relationships between models. I deliberately pointed out earlier how key paths can be used to get the name of a relationship. Let's adapt that code to the needs of ManagedObject:

final class ManagedObject<T: ManagedObjectConvertible> {
    ...

    subscript<D: ManagedObjectConvertible>(
        keyPath: KeyPath<T.Relations, Relation<D>>
    ) -> ManagedObject<D> {
        let destinationName = T.relations[keyPath: keyPath].name

        // fetch the related object through the NSManaged API

        return .init(instance: ...)
    }
}

And, accordingly, the use:

managedObject[keyPath: \.someRelation]

Simple enough, but Swift has a special spell for this, dynamicMemberLookup:

@dynamicMemberLookup
final class ManagedObject<T: ManagedObjectConvertible> {
    ...

    subscript<D: ManagedObjectConvertible>(
        dynamicMember keyPath: KeyPath<T.Relations, Relation<D>>
    ) -> ManagedObject<D> { ... }
}

and make our code simpler and more readable:

managedObject.someRelation
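A runnable sketch of this trick is shown below. It assumes a made-up Container class in place of NSManagedObject, invented Book and Author models, and a subscript that returns an optional when the relationship is not set; the real framework may differ on all of these points.

```swift
protocol ManagedObjectConvertible {
    associatedtype Relations

    static var entityName: String { get }
    static var relations: Relations { get }
}

struct Relation<T: ManagedObjectConvertible> {
    let name: String
}

// Stand-in for NSManagedObject: keeps related objects by relationship name.
final class Container {
    let entityName: String
    var related: [String: Container] = [:]

    init(entityName: String) { self.entityName = entityName }
}

@dynamicMemberLookup
final class ManagedObject<T: ManagedObjectConvertible> {
    let instance: Container

    init(instance: Container) { self.instance = instance }

    // The compiler rewrites `object.author` into
    // `object[dynamicMember: \T.Relations.author]`.
    subscript<D: ManagedObjectConvertible>(
        dynamicMember keyPath: KeyPath<T.Relations, Relation<D>>
    ) -> ManagedObject<D>? {
        let name = T.relations[keyPath: keyPath].name
        return instance.related[name].map { ManagedObject<D>(instance: $0) }
    }
}

struct Author: ManagedObjectConvertible {
    static let entityName = "AuthorEntity"

    struct Relations {}

    static let relations = Relations()
}

struct Book: ManagedObjectConvertible {
    static let entityName = "BookEntity"

    struct Relations {
        let author = Relation<Author>(name: "author")
    }

    static let relations = Relations()
}

let authorContainer = Container(entityName: Author.entityName)
let bookContainer = Container(entityName: Book.entityName)
bookContainer.related["author"] = authorContainer

let book = ManagedObject<Book>(instance: bookContainer)
let author = book.author // typed as ManagedObject<Author>?
```

Note that the destination type is inferred from the Relation stored in Book.Relations, so a typo in the member name or a wrong destination type is a compile-time error.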

The last point that I would like to show you today is how you can make a convenient predicate system in queries using our attributes.

Typed predicates

The idea is to replace string queries for Core Data with typed Swift expressions:

Instead of writing "foo.x > 9 AND foo.y = 10", we write \Foo.x > 9 && \Foo.y == 10 and get "foo.x > 9 AND foo.y = 10" back from that expression.

It is quite easy to do this with information from the Attribute entity and the Equatable and Comparable protocols. We will need to implement a set of comparison and logical operators.

Let's take the comparison operator > as an example. On its left side is the KeyPath of the attribute, and on its right side a value of the corresponding type. Our task is to turn the expression \Foo.x > 9 into the string "x > 9". The operator sign is the easy part: we simply hard-code the literal ">" in the implementation of the operator function. To get the name from the key path, we turn to the ManagedObjectConvertible implementation of Foo and look through its list of attributes for the one that matches our key path. Right now the attribute object stores neither the container key nor the key path, but nothing prevents us from adding them:

final class Attribute<T: ManagedObjectConvertible> {
    let key: String
    let keyPath: PartialKeyPath<T>

    let encode: (T, NSManagedObject) -> Void
    let decode: (inout T, NSManagedObject) -> Void

    ...
}

Note that WritableKeyPath has become PartialKeyPath. Most importantly, we can compare key paths with each other at run time, because they implement Hashable. This is an extremely interesting point: key paths matter not only at compile time but also at run time.

With the new implementation, given a key path we can go through the list of attributes and easily pull out the corresponding key in the KV container.
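The lookup can be sketched as follows. AttributeKey is a hypothetical, trimmed-down version of Attribute that keeps only the two fields needed here, and the "_column" keys are invented to show that the key does not have to match the property name.

```swift
struct Foo {
    var x: Int = 0
    var y: Int = 0
}

// Hypothetical reduced attribute: only what the lookup needs.
struct AttributeKey<T> {
    let key: String
    let keyPath: PartialKeyPath<T>
}

let attributes: [AttributeKey<Foo>] = [
    AttributeKey(key: "x_column", keyPath: \Foo.x),
    AttributeKey(key: "y_column", keyPath: \Foo.y),
]

// Key paths are Hashable, and therefore Equatable, so they can be compared at run time.
func key<T>(for keyPath: PartialKeyPath<T>, in attributes: [AttributeKey<T>]) -> String? {
    attributes.first(where: { $0.keyPath == keyPath })?.key
}

let foundKey = key(for: \Foo.y, in: attributes)
```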

We also need to decide which attributes can take part in comparisons. Obviously, not all types implement Equatable and/or Comparable. But in fact we are interested not in the type of the attribute itself but in the type of its underlying primitive (see SupportedAttributeType).

Since Core Data operates on primitives, the attributes that suit us are those whose primitives implement Equatable and/or Comparable:

func == <T: ManagedObjectConvertible, V: SupportedAttributeType>(
    keyPath: KeyPath<T, V>,
    value: V
) -> Predicate where V.P: Equatable {
    return .init(
        key: T.attributes.first(where: { $0.keyPath == keyPath })!.key,
        value: value.encodePrimitive(),
        operator: "="
    )
}

where Predicate is a special type that abstracts the individual fragments of a whole expression.

For completeness, we still need a logical operator, AND for example. Its implementation essentially glues two fragments into one expression, which at the top level can be written as "(\(left)) AND (\(right))".
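The whole pipeline, from typed expression to query string, can be sketched in a few lines. Everything here is a simplification made for illustration: Queryable and its keys table stand in for ManagedObjectConvertible and its attributes, Predicate holds a ready-made string, and values are limited to integers so that they can be interpolated without quoting. A real implementation would pass values to NSPredicate as arguments instead.

```swift
// Hypothetical replacement for the attribute list: key path -> container key.
protocol Queryable {
    static var keys: [PartialKeyPath<Self>: String] { get }
}

// One fragment of a query expression.
struct Predicate {
    let query: String
}

// Finds the container key registered for a key path.
func containerKey<T: Queryable, V>(_ keyPath: KeyPath<T, V>) -> String {
    guard let key = T.keys[keyPath] else {
        preconditionFailure("Key path is not registered in \(T.self).keys")
    }
    return key
}

func == <T: Queryable, V: BinaryInteger>(keyPath: KeyPath<T, V>, value: V) -> Predicate {
    Predicate(query: "\(containerKey(keyPath)) = \(value)")
}

func > <T: Queryable, V: BinaryInteger>(keyPath: KeyPath<T, V>, value: V) -> Predicate {
    Predicate(query: "\(containerKey(keyPath)) > \(value)")
}

// AND simply glues two fragments together, wrapping each in parentheses.
func && (left: Predicate, right: Predicate) -> Predicate {
    Predicate(query: "(\(left.query)) AND (\(right.query))")
}

struct Foo: Queryable {
    var x: Int = 0
    var y: Int = 0

    static let keys: [PartialKeyPath<Foo>: String] = [\Foo.x: "x", \Foo.y: "y"]
}

let predicate: Predicate = \Foo.x > 9 && \Foo.y == 10
```

Because the right-hand value must have the same type as the property behind the key path, an expression such as \Foo.x == "nine" does not compile.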

So the main idea is that, knowing the relationship between the types and the data schema, you can build a correct query string from typed Swift expressions and, as a result, make fewer mistakes.

Conclusion

I tried to highlight the main and most interesting points of the implementation without going through every line of our framework. I did not touch on some important topics, progressive migrations for example, but these and much more are described in detail in the repository.

I hope Sworm makes your life a little easier, as it has been helping us for over a year.

All the best!
