Customer loyalty is a giant responsibility, not just technology
Hi all. I’m Igor, a team lead on the team responsible for the loyalty system at CSI. In this post I’ll cover how retail loyalty systems work and are structured, how we designed the new architecture of our Set Loyalty system, and which frameworks and tools we use.
Few people realize what a modern loyalty system in large federal retail actually involves: tens of millions of customer profiles, billions of receipt transactions per year, and hundreds of RPS (requests per second) coming from different purchase channels – from the checkout to a mobile application – with sub-second response requirements. All of this runs deployed on the customer’s own servers and must stay maximally available, since any outage translates into very large losses.
What is the loyalty system in retail
Our company develops front-office systems for retail. Our software is what the shopper actually interacts with in the store: sometimes directly, sometimes through the cashier. The area I work in develops what is commonly called a “customer loyalty system” – discounts, coupons, personal offers, bonuses, and everything else that motivates a shopper to come back and stay loyal to a particular retail chain. Nearly every store now has a loyalty system, with the exception of discounters and very small brick-and-mortar shops – and even there it exists, just in a simpler form.
A special notebook where loyal customers are written down 🙂
A customer loyalty system is a specialized IT system: a ready-made set of tools and mechanics, a platform a retailer uses to build and launch a loyalty program in its stores. Separate loyalty modules (processings) are each responsible for a specific block of data. Today almost any chain retailer has the following in its arsenal (modules are named differently in different systems; here I use the names from our ecosystem as an example):
Physical/electronic loyalty cards and chain-wide customer profiles are the core and foundation of a loyalty program. This data lives in the “CRM: Buyers” module.
Based on personal data and purchase history, customer segments are formed by various criteria, and each segment gets its own preferences and personal offers (the “Segments” module).
Coupons, bonuses, and all kinds of promotions and discounts are launched through the promotion-management modules (“Bonuses”, “Coupons”). Each mechanic must work across the entire chain and have its own settings, conditions, and restrictions – so there are also counter and restriction services.
All of this is complemented by a notification subsystem (the “Communications” module) that reaches customers via SMS, messengers, email, and mobile apps – also part of the loyalty system, albeit an auxiliary one.
What happens at the checkout
So, the client has cash register software installed – for example, Set Retail – together with a loyalty system (in our case, Set Loyalty). A typical checkout scenario looks like this: items are scanned, age is verified (if required), and the “Calculate” button is pressed. Then the magic begins:
Presenting a customer’s discount card is the entry point into the Set Loyalty system:
Authorization happens by looking up the discount card or the buyer’s phone number in the “CRM: Buyers” database.
During authorization, requests are also sent to the “Segments” and “Bonuses” databases.
As a result, by the time of payment the checkout has all the information it needs: the buyer’s profile, the segments the buyer falls into, and their bonus account.
Now you can add a coupon and repeat the whole cycle.
Based on the receipt and the data received from the loyalty system, the buyer is given discounts on goods – both universal ones for all cardholders and unique ones: for a segment of buyers, or even personal ones for a specific buyer.
Purchase conditions are calculated in roughly the same way across all purchase channels: in the online store, at self-checkout kiosks, and so on. And all of it must work as fast as possible, so the buyer doesn’t even notice a delay in service.
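To make the calculation concrete, here is a minimal sketch of how segment-based and universal discounts might be combined for a receipt. The class names, rule shapes, and percentages below are invented for illustration and are not the real Set Loyalty data model:

```java
import java.util.List;
import java.util.Map;

// Illustrative only: a toy model of applying loyalty discounts to a receipt.
// The segment names and percentages are invented for this sketch.
public class DiscountSketch {

    record Item(String sku, long priceKopecks) {}

    // Percent discount per segment; "ALL" applies to every cardholder.
    static final Map<String, Integer> SEGMENT_DISCOUNT = Map.of(
            "ALL", 2,           // universal discount for all cardholders
            "COFFEE_LOVERS", 10 // segment-specific discount
    );

    // Pick the best applicable discount among the buyer's segments.
    static long discountedTotal(List<Item> receipt, List<String> buyerSegments) {
        int best = SEGMENT_DISCOUNT.getOrDefault("ALL", 0);
        for (String s : buyerSegments) {
            best = Math.max(best, SEGMENT_DISCOUNT.getOrDefault(s, 0));
        }
        long total = receipt.stream().mapToLong(Item::priceKopecks).sum();
        return total * (100 - best) / 100;
    }

    public static void main(String[] args) {
        var receipt = List.of(new Item("espresso", 15000), new Item("bun", 5000));
        // During authorization the buyer fell into the COFFEE_LOVERS segment.
        System.out.println(discountedTotal(receipt, List.of("COFFEE_LOVERS")));
    }
}
```

In the real system the rules are far richer (conditions, counters, restrictions), but the shape of the problem is the same: given the buyer’s profile and segments, pick and apply the right offers before the receipt is closed.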
How did we get there
Initially, Set Retail had (and still has) a separate built-in loyalty module with a huge number of mechanics: promotions and discounts calculated directly at the checkout. These promotions and discounts can even be created in the ERP system and pushed down to the target store through the head office, but in the end they reach the checkout – the endpoint where promotions and discounts are sold and calculated. Information about the results of applying promotions at the checkouts then “rises” back to the stores, and from there to the head office. As a result, data exchange follows this scheme.
At the same time, retailers wanted to use new opportunities in their loyalty programs, for example, personal discounts.
With the existing approach, this would require every personal discount to be available at every checkout, since a customer can walk up to any of them. The result of applying a promotion would have to go up to the head office and then propagate to all remaining servers and checkouts, so that other checkouts could account for already-applied personal discounts and refresh the customer’s data. And that is hundreds of millions of active offers whose state must be up to date at any moment at any point of purchase. The same applies to bonus-account data and one-time coupons – there was simply too much data. We needed services that would process all of it centrally.
Decision made – we’re building a new loyalty processing system. We called the product Set Loyalty.
In the central loyalty server we use many microservices that can be scaled (several instances of the same type run at once). This removes single points of failure: if something happens to one service, it doesn’t block the others. We chose Nomad to manage deployment (why exactly is covered in an article by my colleague), added Consul to it for storing settings and for Service Discovery, and Traefik as a routing proxy. All three components – Nomad, Consul, Traefik – interact with each other flawlessly.
We didn’t experiment with the framework choice: we planned to write services on Spring Boot, which let us start quickly and ship the first services fast. We took PostgreSQL as the database, with Patroni on top for automatic failover.
The problem of many communication channels is solved, on one hand, by concentrating information on the loyalty side: we don’t push data to the checkout that doesn’t need to be there – the checkout requests it on demand. On the other hand, where asynchronous delivery is still needed (in particular for receipts), data flows through a highly scalable system: Kafka.
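The asynchronous side can be sketched like this, with an in-memory queue standing in for the Kafka topic (the event shape, topic role, and accrual rule are my assumptions for illustration): the checkout publishes receipts fire-and-forget, and a loyalty consumer processes them later.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Illustration only: a BlockingQueue stands in for a Kafka topic of receipts.
// In the real system the checkout produces to Kafka and a loyalty service consumes.
public class ReceiptStreamSketch {

    record Receipt(String customerId, long totalKopecks) {}

    static final BlockingQueue<Receipt> topic = new LinkedBlockingQueue<>();
    static final Map<String, Long> bonusAccounts = new ConcurrentHashMap<>();

    // "Producer": the checkout sends the receipt and moves on without waiting.
    static void publish(Receipt r) {
        topic.add(r);
    }

    // "Consumer": the loyalty side accrues 1 bonus point per 100 kopecks,
    // asynchronously, whenever it gets to the backlog.
    static void drainOnce() {
        Receipt r;
        while ((r = topic.poll()) != null) {
            bonusAccounts.merge(r.customerId(), r.totalKopecks() / 100, Long::sum);
        }
    }
}
```

The point of the pattern is decoupling: the checkout never waits on bonus accrual, and the consumer can be scaled out independently when the receipt stream grows.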
We standardized all integrations (service-to-service calls, requests from checkouts) on JSON with synchronous REST-style messaging. Previously we had used various flavors of RPC over HTTP, SOAP in particular, so we had to retrain our thinking when designing service APIs. The result looks like this:
Now the checkout requests all the information it needs to work with the buyer from the POS Gateway (an instance of the API Gateway microservice pattern), which aggregates information from all the services. This is where reactive programming found its application: we brought in WebFlux.
Here is what happens at the checkout. When a discount card is presented, we must first query the “CRM: Buyers” service. We receive the buyer’s profile along with a unique buyer identifier, which we then use to query the segments, coupons, bonuses, and other services. Naturally, the response for the checkout is produced faster if the requests keyed by customer ID are made in parallel. WebFlux makes it convenient to run such queries in parallel and aggregate the results.
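In the real gateway this fan-out is written reactively with WebFlux; the same pattern can be sketched with plain CompletableFuture from the JDK. The service calls and the BuyerView shape below are stubs invented for illustration, not our actual API:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustration of the POS Gateway fan-out: first resolve the buyer by card,
// then query the dependent services in parallel and aggregate one response.
// In production this is done with WebFlux; the service calls are stubbed here.
public class GatewaySketch {

    record BuyerView(String customerId, List<String> segments, long bonusBalance) {}

    // Stubs standing in for HTTP calls to the loyalty services.
    static CompletableFuture<String> lookupCustomer(String cardNumber) {
        return CompletableFuture.supplyAsync(() -> "customer-42");
    }
    static CompletableFuture<List<String>> fetchSegments(String customerId) {
        return CompletableFuture.supplyAsync(() -> List.of("COFFEE_LOVERS"));
    }
    static CompletableFuture<Long> fetchBonusBalance(String customerId) {
        return CompletableFuture.supplyAsync(() -> 350L);
    }

    static BuyerView authorize(String cardNumber) {
        return lookupCustomer(cardNumber)
                .thenCompose(id -> {
                    var segments = fetchSegments(id);    // these two run
                    var bonuses = fetchBonusBalance(id); // in parallel
                    return segments.thenCombine(bonuses,
                            (s, b) -> new BuyerView(id, s, b));
                })
                .join();
    }
}
```

With WebFlux the same composition is expressed with Monos zipped together, and no thread blocks while waiting – which matters when the gateway serves hundreds of checkouts concurrently.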
The “CRM: Buyers” service is responsible for storing customer profiles and discount cards. It turned out to be the most complex service – clearly without the “micro” prefix – since it holds entities that either could not be split apart or were very hard to split. “CRM: Buyers” must also support many integrations (beyond the checkouts): self-service kiosks, the customer’s CRM systems, customer personal accounts, and so on.
The result is a flexible and scalable system: each retail chain can run its own subset of services – its own set of loyalty modules (for example, a chain may not need bonuses at all). You can change the number of service and database instances, co-locate databases of different domains on one host or spread them across many, use replicas, and otherwise adapt the system to business needs depending on the size of the customer audience. This lets the solution fit the customer: a large chain gets one infrastructure with high-availability requirements; a mid-size chain gets something smaller and cheaper.
Service API Development
As I wrote above, moving from RPC to REST as the principle for building service APIs brought certain difficulties. With Spring MVC there are two approaches: either you write controllers and DTO classes, add Swagger annotations to them, and generate a spec; or the other way around: you write the spec first and then generate the controllers and DTOs from it (also known as API First).
In our case, it was the second approach that turned out to be more productive:
1. It lets you immediately get a cleanly written document that can be handed over for third-party integrations (which may not be written in Java at all).
2. Such a document can be used even before the service is developed, to parallelize the work.
3. Less code in the repository, since most of it is generated at build time.
4. Importantly, it makes you “think” in REST categories from the start: resources and GET/POST/PUT verbs, instead of stuffing method names into the HTTP path.
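As an illustration of the API-first flow, here is a fragment of what such a spec might look like. The path, operation, and field names below are invented for this sketch, not our actual API:

```yaml
openapi: "3.0.3"
info:
  title: CRM Buyers API (illustrative fragment)
  version: "1.0"
paths:
  /customers/{customerId}:
    get:
      summary: Fetch a buyer profile by its unique identifier
      parameters:
        - name: customerId
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: Buyer profile
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Customer"
components:
  schemas:
    Customer:
      type: object
      properties:
        customerId:
          type: string
        phone:
          type: string
```

From a spec like this, controller interfaces and DTOs are generated at build time, and the very same document is what third-party integrators receive.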
In the beginning we kept the usual approach to tests: take each class – controller, service, or DAO – mock out its external dependencies with Mockito (only DAOs ran against embedded databases), and cover the class with tests.
But this approach has a number of problems:
Each such class test relies on particular behavior of the classes it depends on; that behavior may change, yet the test will keep passing.
Any refactoring requires rewriting a huge number of tests, even though the behavior of the service as a whole hasn’t changed.
A common situation: all the tests seem to be written, yet the service as a whole doesn’t work because of some very simple mistakes.
In the new solution, mockMvc (end-to-end) tests turned out to be a great find: all Spring beans are brought up, a real embedded database and Kafka are used, tests boil down to REST requests and reads from Kafka, and integrations with other services are mocked out with Mockito. Such tests are harder to write, but they effectively cover the bulk of the service code, and with such a test as a starting point it is easy to reproduce almost any defect found in production. So the developer focuses on end-to-end tests of the main scenarios: mockMvc tests are written from the spec first of all, and only then are individual services, the DAO layer, and so on covered with tests – top to bottom, as it were.
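The shape of such a test – exercise the whole service through its HTTP surface and assert on the response, rather than testing class by class – can be sketched self-containedly with the JDK’s built-in server and client standing in for the Spring context. The endpoint and payload here are invented; in our code Spring’s MockMvc plays this role against a fully raised application context:

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of the end-to-end testing idea: start the whole service, hit its REST
// API, assert on the response. The JDK's HttpServer stands in for the real app.
public class End2EndSketch {

    public static String callProfileEndpoint() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/customers/42", exchange -> {
            byte[] body = "{\"customerId\":\"42\"}".getBytes();
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            var request = HttpRequest.newBuilder(URI.create(
                    "http://localhost:" + server.getAddress().getPort()
                            + "/customers/42")).build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            return response.body(); // the test asserts on this, not on internals
        } finally {
            server.stop(0);
        }
    }
}
```

The key property is that the assertion only touches the public contract (the spec), so internal refactoring doesn’t invalidate the test.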
Market needs motivated us to adopt new approaches and technologies. The solution turned out to be in demand, satisfying not only retail chains and their customers, but also us as developers, who always want to try something new.
We are completely satisfied with the result and the chosen architecture, and we’re confident we can handle the requirements of even the largest federal chains. The services show consistently high availability and response times measured in milliseconds, even at peak RPS.
Below is an example from Grafana for one of our clients, with a customer base of over 30 million customers. Our customers’ own technical teams can monitor the “health” indicators of their loyalty program if they wish.
This article turned out to be an overview; if there is interest in specific details, I’ll be happy to answer in the comments.