Divide and rule. Analysis of tasks

Solving problems using the method “Divide and Conquer” or in English “Divide and Conquer” is one of the basic methods for speeding up algorithms. An example of this is going from the quadratic complexity of bubble sort or insertion sort to complexity.  inline O (n  log {n}) in merge sort. Or the transition from linear to logarithmic complexity, when searching for an element in a sorted array (see binary search).

In this article, we’ll look at two sample tasks with explanation and code that will use this approach.

To begin with, let’s formulate division and dominion. The solution to the problem using this approach has the following three properties:

  1. Divide the input into smaller subsets.
  2. Solve subproblems recursively.
  3. Combine solutions to subproblems to solve the original problem.

Let’s try to learn these properties in practice and try to solve the following two problems.

Problem number 1

Dan unimodal an array consisting of  inline n unique elements. Let’s call the array unimodal, if its elements standing up to the maximum are in ascending order, and the elements after the maximum are in descending order. It is required to provide an algorithm that returns the maximum element of such an array for the logarithmic ( inline O  left ( log n  right)) time.

Solution

The obvious way to find the maximum in an arbitrary array is to iterate over all its elements, keeping the current largest element.

max = arr[0]
for item in arr[1:]:
  if item > max:
    max = item

When the array is exhausted, in the variable max the largest of the items will be recorded. The complexity of this approach is equivalent to linear, since it is necessary to view each element from  inline n possible.

A unimodal array has two remarkable properties that allow you to optimize your search.

  1. It is known that in such an array the maximum is in the vicinity of smaller elements, i.e. the previous and next elements are guaranteed to be less than the required one. We will use this property to set the main base casein which we complete the search for the largest element.
  2. If we look at an arbitrary element of the array and see that the elements after it are decreasing, then it makes no sense to search among them for the maximum, since it is found earlier. The same is true for elements increasing to the one under consideration. This point helps to reduce the number of array elements considered.

The above properties answer two questions: “what are we looking for in the array” and “in which subset of the array we are looking for this”. To answer the question: “how do we choose the next element under consideration”, we will use the fact that the maximum lies somewhere inside the array, and we will take into consideration the middle element in the next subset.

def unimodal_max(arr, low, high):
    # если в результате разбиения подмножество состоит из одного элемента, 
    # то возвращаем этот элемент
    if low == high:
        return arr[low]
    
    # если в результате разбиения подмножество состоит из двух элементов, 
    # то возвращаем максимум из них
    if high - low == 1:
        if arr[low] >= arr[high]:
            return arr[low]
        else:
            return arr[high]
    
    # "базовый случай": при длине подмножества большем двух, 
    # рассматриваем серединный элемент (см. свойство 1)
    mid = (low + high) // 2
    if arr[mid] > arr[mid + 1] and arr[mid] > arr[mid - 1]:
        return arr[mid]
    
    # если серединный элемент не оказался искомым, 
    # то продолжаем поиск максимума только, 
    # где имеет смысл (см. свойство 2)
    if arr[mid - 1] > arr[mid] > arr[mid + 1] :
        return unimodal_max(arr, low, mid-1)
    else:
        return unimodal_max(arr, mid + 1, high)

# пример вызова: 
# assert(unimodal_max([0,1,2,3,4,5,10,9,8,7,6]) == 10)

Complexity assessment

At the first run of this algorithm, if its length is more than three, and if the middle element is not the desired one, then no matter how large  inline n, the next call will only work with  inline  frac {n} {2} elements, among which the maximum must be found by property 2. During the second and all subsequent calls of this algorithm, the number of elements accepted for consideration is halved. This process continues until the base case by the number of elements (two or less elements), or the base case from property 1, is fulfilled.

Dividing an array in two until reaching the base case cannot take longer than  inline  log_ {2} {n} times, and in each of these calls a constant is made, independent of  inline n number of operations. Hence it follows that the complexity of the presented algorithm  inline O  left ( log n  right)

Problem number 2

Given an unordered array of  inline n elements where  inline n is a power of two. It is required to provide an algorithm that finds the second largest array element, performing at most  inline n +  log_ {2} {n} - 2 comparison operations

Solution

If the first problem can be solved knowing how binary search works, then in this one we need a little more ingenuity. Let’s think about how we can solve this problem in a straightforward way and how many comparison operations we use in this case.

Let’s loop through the array and maintain two numbers max and second_largest

# будем считать, что в массиве более 2х элементов
max = arr[0]
if arr[1] > max:
  max = arr[1]
  second_largest = arr[0]
else:
  second_largest = arr[1]
for item in arr[2:]:
  # если очередной элемент больше обоих максимумов,
  # обновляем их оба
  if item > max:
    second_largest = max
    max = item
  # если больше только второго максимума,
  # то обновляем его
  elif item > second_largest:
    second_largest = item

Let’s count the number of comparisons

The best possible option might be  inline n-1 comparison. This happens when the condition of the first if is executed at each iteration of the loop (this gives  inline n-2 comparisons), and one comparison is added before the loop. This behavior can be achieved on an ascending ordered array.

Consider a decreasing array. In this case, the comparison in if every iteration is executed, but control is transferred to the branch elifwhere another comparison is also performed. It turns out that the total of comparison operations in the case of a decreasing array, taking into account the first comparison before the loop, will equal  inline 2n-3

How can you shorten this close to  inline 2n operations number up to  inline n +  log_ {2} {n} - 2? Let’s try to search for the maximum by dividing the available elements into two parts and then searching for the largest values ​​in them.

Consider an array of elements

9, 18, 3, 5, 25, 1, 5, 19.

The maximum of these numbers is highlighted in green on the diagram. Looking from the bottom up, you can see that the maximum is found by comparing the two largest numbers in their subset.

https://habr.com/ru/company/otus/blog/599309/image

In this case, the number of comparisons to find the maximum is equal to the sum:  inline n / 2 + n / 4 + n / 8 + ... + 1 = n-1… To find second_largest, you will need to consider all the elements with which it was compared max (they are highlighted in yellow-orange on the diagram). Finding the maximum among them will require  inline  log_ {2} {n} - 1 operation, since there are no more items in this small list than the tree depth minus one.

It remains only to implement this algorithm:

def second_largest(arr):
    # функция find_max_comp выполняет поиск максимума в массиве,
    # по принципу на диаграмме
    # кроме этого, в ней осуществляется поддержка списка элементов,
    # с которыми максимум сравнивался
    def find_max_comp(low, high, arr):
            if low >= high:
                return arr[low], []
            mid = (high + low) // 2
            x, lst_x = find_max_and_comp(low, mid, arr)
            y, lst_y = find_max_and_comp(mid + 1, high, arr)
            if x > y:
                lst_x.append(y)
                return x, lst_x
            else:
                lst_y.append(x)
                return y, lst_y
    
    # для определения второго по величине элемента, 
    # требуется вычислить список рассмотренных с наибольшим элементов,
    # а затем найти максимум среди них
    comparisons = find_max_and_comp(0, len(arr) - 1, arr)[1]
    return find_max_and_comp(0, len(comparisons) - 1, comparisons)[0]

The number of comparison operations of the algorithm above is just equal to  inline n +  log_ {2} {n} - 2… This fact can be shown in numbers if you create a global variable – a counter.

Conclusion

The Divide and Conquer approach to problem solving is not the easiest concept in programming, but it is easy to identify. Often, when you need to solve the problem of finding an object among other objects with a given logarithmic complexity, most likely you are expected to use this approach, based on recursive calls on two subsets of the data. This is not a rule, but we have analyzed two confirmations of this hypothesis in this article.

Information

[1] The problem conditions are taken from the book “Algorithms Illuminated: Part 1: The Basics” by Tim Roughgarden. [2] Copyright solutions.

I prepared this article on the eve of the start of the course “Algorithms and Data Structures” from OTUS. Learn more about the course can be here.

Similar Posts

Leave a Reply