Algorithm for creating a seamless list of data

How can you optimize data migration from different sources and still show the user a seamless list? This article describes a front-end approach implemented in the cross-platform Flutter framework. Flutter is a popular framework for building mobile and web applications, and it offers a wide range of features and tools that make development easier and faster. The article is especially useful for those who write mobile applications in Dart.

The problem the algorithm solves

As a product grows, the question of scaling arises: moving to a different architecture, for example, from a monolithic solution to microservices. When you have hundreds of thousands of clients, you cannot simply move them all at once.

The migration process is multidimensional and lengthy:

Firstly, there are a dozen products that may be at different stages of migration between architectures;

Secondly, customers are transferred in several stages: first, a dozen selected customers, then a hundred loyal customers.

Therefore, the period during which clients are spread across both architectures is quite long and can last for months. But the customers’ business cannot stop in the meantime, which means that data must be available from two sources at the same time and look seamless.

Description of the algorithm

There are two approaches to seamless migration:

  • build a facade on the back end that combines data from the monolith and the microservice platform;

  • entrust this to the front end, in this case a mobile Flutter application written in Dart.

Let’s consider the second option. We take “Payments” as the basic product, but keep in mind that it is only the first in line: a dozen more subsystems, modules, and microservices will follow.

To begin with, the user should see a list of payments grouped by date and sorted by date in descending order. Keep in mind that a client, which is a legal entity, has several actors (users), each of whom sees the transactions of that legal entity. At the same time, some actors may already be switched to microservices while others are still on the monolith.

That is, we cannot simply take payments up to a certain point from the monolith and everything after it from the microservice: the records are interleaved and live in both sources at the same time (without duplication).

Work algorithm: description and implementation

Here’s how to build a seamless list:

  • During the initial load (page 0), we request a portion of Size records from both services, the monolith (Mo) and the microservice (Mc); while scrolling, we determine from which record to continue loading from Mo and from which from Mc;

  • Join the lists and sort: ( Mo + Mc ).sort();

  • Add only the first Size records to the resulting list:
    Res = ( Mo + Mc ).sort().take(Size);

```dart
class UtilsUnitedRecords<T> {
  /// Merges two lists and returns the part to append to the resulting list.
  List<T> unitedRecords(
    List<T> setA,
    List<T> setB,
    int Function(T a, T b) compare,
  ) {
    final union = [...setA, ...setB];
    union.sort(compare);
    return union.take(Environment.sizePage).toList();
  }

  /// Returns the position from which loading should continue: the number
  /// of already loaded records that satisfy [check].
  int getStartPosition(List<T> setA, bool Function(T a) check) {
    return setA.where(check).length;
  }
}

// Environment, IStore, IApi, IPermissionRepository, the search-parameters
// type S and the private helpers _search, _isMono, _isMs and _compare are
// defined elsewhere in the project and omitted here.
class Repository<T> {
  Repository({
    required this.api,
    required this.apiMs,
    required this.store,
    required this.permissionRepository,
  });

  final IStore store;
  final IApi api;
  final IApi apiMs;
  final IPermissionRepository permissionRepository;

  final utils = UtilsUnitedRecords<T>();

  Future<void> search({
    required S searchParams,
    bool isFirst = true,
    int pageSize = Environment.sizePage,
  }) async {
    final userCanUseMS = await permissionRepository.getMicroservicePermission();

    // Count how many records are already loaded from the microservice (MS)
    // and from the monolith (MONO) to get the start position for each service.
    // store.letters is an assumed getter for the records already in the store.
    final startPositionMono =
        isFirst ? 0 : utils.getStartPosition(store.letters, _isMono);
    final startPositionMS = (isFirst || !userCanUseMS)
        ? 0
        : utils.getStartPosition(store.letters, _isMs);

    // Put the start position into the search parameters and query each service.
    final responseMono = await _search(
      api: api,
      searchParams: searchParams.copyWith(startPosition: startPositionMono),
    );
    final responseMs = userCanUseMS
        ? await _search(
            api: apiMs,
            searchParams: searchParams.copyWith(startPosition: startPositionMS),
          )
        : <T>[];

    // Join the lists, sort the result, and keep the part to add to the store.
    final insertList = utils.unitedRecords(responseMs, responseMono, _compare);

    // Set the list on the first load, append to it on subsequent loads.
    final action = isFirst ? store.setLetters : store.addLetters;
    action(letters: insertList);
  }
}
```
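To see the paging logic run end to end without a back end, here is a minimal self-contained sketch. It is not the production repository: `Payment`, `Origin`, `fetch`, `loadNextPage` and `loadAll` are hypothetical names introduced for illustration, the two sources are plain in-memory lists, and each source is assumed to return records already sorted newest first with unique dates.

```dart
/// Hypothetical source marker and record type, used only in this sketch.
enum Origin { mono, ms }

class Payment {
  const Payment(this.id, this.date, this.origin);
  final int id;
  final DateTime date;
  final Origin origin;
}

/// Comparator for "newest first" ordering.
int byDateDesc(Payment a, Payment b) => b.date.compareTo(a.date);

/// Simulates a paged back-end call: up to [size] records starting at [start].
List<Payment> fetch(List<Payment> source, int start, int size) =>
    source.skip(start).take(size).toList();

/// Loads the next page into [store] and returns the records that were added.
List<Payment> loadNextPage(
  List<Payment> store,
  List<Payment> mono,
  List<Payment> ms,
  int pageSize,
) {
  // The number of records of each origin already in the store is the
  // position from which that source should be read next.
  final startMono = store.where((p) => p.origin == Origin.mono).length;
  final startMs = store.where((p) => p.origin == Origin.ms).length;

  // Request one page from each source, merge, sort, keep one page.
  final union = [
    ...fetch(mono, startMono, pageSize),
    ...fetch(ms, startMs, pageSize),
  ]..sort(byDateDesc);

  final page = union.take(pageSize).toList();
  store.addAll(page);
  return page;
}

/// Pages through both sources until neither returns anything new.
List<Payment> loadAll(List<Payment> mono, List<Payment> ms, int pageSize) {
  final store = <Payment>[];
  while (loadNextPage(store, mono, ms, pageSize).isNotEmpty) {}
  return store;
}
```

Records that were fetched but did not fit into the page are simply requested again on the next call, because the start position only advances by what actually reached the store.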

The process of testing a new algorithm on real data, evaluating its effectiveness and accuracy

Let’s check the algorithm on a test dataset in Excel, assuming that each page loaded while scrolling holds two records. Indeed, the result is a seamless list built from different sources.

Example of a seamless list


The first thing you notice during testing is over-fetching. Sometimes, when the resulting insertList contains data from only one source, the same page has to be requested from the second source again. The worst case is when all users of a client have been moved to the microservice: after a while the monolith data is so old that it does not show up even after scrolling through several screens, yet it keeps being requested and never gets into the resulting data set.

The remedy is to cache a response when none of its records made it into the selection.
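The caching rule can be tried in isolation with a small sketch. The names `isWhollyUnused` and `CachedSource` are hypothetical and not part of the article's repository; the wrapper only illustrates the idea of keeping a response that contributed nothing to the page and handing it back on the next call instead of repeating the request.

```dart
/// True when no record of [response] made it into [page], so the whole
/// response can be reused for the next page.
bool isWhollyUnused<T>(List<T> response, List<T> page) =>
    response.isNotEmpty && !response.any(page.contains);

/// Hypothetical wrapper around one source that remembers an unused response.
class CachedSource<T> {
  CachedSource(this.fetch);

  /// Simulated back-end call: up to `size` records starting at `start`.
  final List<T> Function(int start, int size) fetch;

  List<T>? _cached;

  /// Number of real requests made, kept only to make the effect visible.
  int requests = 0;

  /// Returns the cached response if there is one, otherwise fetches.
  List<T> next(int start, int size) {
    final cached = _cached;
    if (cached != null) return cached;
    requests++;
    return fetch(start, size);
  }

  /// Call after the page is built: keep [response] only if it was not used.
  void remember(List<T> response, List<T> page) {
    _cached = isWhollyUnused(response, page) ? response : null;
  }
}
```

Since nothing from a cached response reached the store, the start position for that source does not move, so the cached page is still the right one to merge on the next call.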

```dart
enum ServerDef { MONO, MS }

class UtilsUnitedRecords<T> {
  /// Merges two lists and returns the part to append to the resulting list.
  List<T> unitedRecords(
    List<T> setA,
    List<T> setB,
    int Function(T a, T b) compare,
  ) {
    final union = [...setA, ...setB];
    union.sort(compare);
    return union.take(Environment.sizePage).toList();
  }

  /// Returns the position from which loading should continue: the number
  /// of already loaded records that satisfy [check].
  int getStartPosition(List<T> setA, bool Function(T a) check) {
    return setA.where(check).length;
  }

  /// Returns the set to cache when none of its records made it into [union];
  /// returns null when both sets contributed.
  MapEntry<ServerDef, List<T>>? needCaching({
    required List<T> union,
    required MapEntry<ServerDef, List<T>> setA,
    required MapEntry<ServerDef, List<T>> setB,
  }) {
    // Check both sets: with equal page sizes either one may be left out.
    for (final set in [setA, setB]) {
      if (!set.value.any(union.contains)) return set;
    }
    return null;
  }
}

// Environment, IStore, IApi, IPermissionRepository, the search-parameters
// type S and the private helpers _search, _isMono, _isMs and _compare are
// defined elsewhere in the project and omitted here.
class Repository<T> {
  Repository({
    required this.api,
    required this.apiMs,
    required this.store,
    required this.permissionRepository,
  });

  final IStore store;
  final IApi api;
  final IApi apiMs;
  final IPermissionRepository permissionRepository;

  final utils = UtilsUnitedRecords<T>();
  MapEntry<ServerDef, List<T>>? _cacheData;

  Future<void> search({
    required S searchParams,
    bool isFirst = true,
    int pageSize = Environment.sizePage,
  }) async {
    final userCanUseMS = await permissionRepository.getMicroservicePermission();

    // Count how many records are already loaded from the microservice (MS)
    // and from the monolith (MONO) to get the start position for each service.
    // store.letters is an assumed getter for the records already in the store.
    final startPositionMono =
        isFirst ? 0 : utils.getStartPosition(store.letters, _isMono);
    final startPositionMS = (isFirst || !userCanUseMS)
        ? 0
        : utils.getStartPosition(store.letters, _isMs);
    if (isFirst) {
      _cacheData = null;
    }

    // Reuse the cached response if there is one; otherwise put the start
    // position into the search parameters and query the service.
    final cache = _cacheData;
    final responseMono = (cache != null && cache.key == ServerDef.MONO)
        ? cache.value
        : await _search(
            api: api,
            searchParams:
                searchParams.copyWith(startPosition: startPositionMono),
          );
    final responseMs = !userCanUseMS
        ? <T>[]
        : (cache != null && cache.key == ServerDef.MS)
            ? cache.value
            : await _search(
                api: apiMs,
                searchParams:
                    searchParams.copyWith(startPosition: startPositionMS),
              );

    // Join the lists, sort the result, and keep the part to add to the store.
    final insertList = utils.unitedRecords(responseMs, responseMono, _compare);

    // Set the list on the first load, append to it on subsequent loads.
    final action = isFirst ? store.setLetters : store.addLetters;
    action(letters: insertList);

    // Decide whether a response has to be cached for the next page.
    _cacheData = utils.needCaching(
      union: insertList,
      setA: MapEntry(ServerDef.MONO, responseMono),
      setB: MapEntry(ServerDef.MS, responseMs),
    );
  }
}
```

Results and conclusions about the significance of the algorithm and further prospects for its development

The algorithm solves the problem of obtaining seamless data from different sources with the following consequences:

  • doing the merge on the front end (in this case, a mobile application) makes it possible to keep a local cache (per user, on the device) at the repository level. A back-end implementation would need additional storage such as Redis or an equivalent, with user session tracking and forced cleanup;

  • in this case we have only two sources, but the repository and the merge function are easy to generalize to an array of sources;

  • a dozen more products are waiting to be moved to microservices, and the auxiliary functions (utils) are generic, so nothing in them will have to be rewritten;

  • the search function and the functions that determine where an object comes from (_isMs and _isMono) in the repository are fairly universal; only the comparison function _compare used for sorting will need to be updated for the objects of other products.
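As a rough illustration of the point about several sources, the same merge step can be written over a list of sources instead of two fixed ones. This is only a sketch under assumptions: `PagedSource` and `nextPage` are hypothetical names, each source is described by a fetch callback plus a predicate that recognises its own records, and every source is assumed to return records already ordered by the same comparator.

```dart
/// Hypothetical description of one source: how to fetch a page from it and
/// how to recognise its records among those already in the store.
class PagedSource<T> {
  const PagedSource({required this.fetch, required this.owns});

  final Future<List<T>> Function(int start, int size) fetch;
  final bool Function(T record) owns;
}

/// Builds the next page from any number of sources.
Future<List<T>> nextPage<T>({
  required List<T> store,
  required List<PagedSource<T>> sources,
  required int Function(T a, T b) compare,
  required int pageSize,
}) async {
  final union = <T>[];
  for (final source in sources) {
    // Records of this source already in the store give its start position.
    final start = store.where(source.owns).length;
    union.addAll(await source.fetch(start, pageSize));
  }
  union.sort(compare);
  return union.take(pageSize).toList();
}
```

The caller appends the returned page to the store and calls `nextPage` again while scrolling; an empty page means every source is exhausted.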

There are a couple more ideas for improvement, for example:

  • run the two data queries concurrently (in Dart, network requests can be awaited together with Future.wait; isolates are only needed for CPU-bound work);

  • optimize the calculation of the starting position;

  • cache unused data more aggressively: at present a response is kept instead of being re-requested only if none of its records was used.
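For the first idea, querying both sources at once, a minimal sketch with `Future.wait` from Dart's core library may be enough, since network calls are asynchronous and do not block the main isolate. The function name `fetchBoth` and its two callbacks are hypothetical; they stand in for the monolith and microservice requests.

```dart
/// Starts both requests before awaiting either of them and returns the
/// responses in the same order: monolith first, microservice second.
Future<List<List<T>>> fetchBoth<T>(
  Future<List<T>> Function() fetchMono,
  Future<List<T>> Function() fetchMs,
) {
  // Both callbacks are invoked immediately, so the requests run
  // concurrently; Future.wait completes when both have finished.
  return Future.wait([fetchMono(), fetchMs()]);
}
```

`Future.wait` preserves the order of the futures it was given, regardless of which one completes first, and completes with an error if either request fails.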

Thus, with the help of simple manipulations, it is possible to optimize the process and solve the problem of obtaining seamless data from different sources.
