Difference between pool.map and pool.map_async in Python

Another no-fuss cheat sheet on Python's multiprocessing module, written by a beginner for beginners in multiprocess programming.

pool.map and pool.map_async are methods of the multiprocessing.Pool class in Python that run a function in parallel across multiple worker processes.

  1. pool.map: This method blocks program execution until all tasks have completed. It takes a function and an iterable of arguments, applies the function to each item, and returns a list of results in the same order as the inputs. This method is suitable when you need all the results before the program can continue.

  2. pool.map_async: This method also takes a function and an iterable of arguments, but does not block program execution. Instead, it immediately returns a multiprocessing.pool.AsyncResult object that represents the whole batch of tasks: ready() reports whether all of them have finished, wait() blocks until they have, and get() returns the complete list of results in input order. This method is useful when the program has other work to do while the tasks run.

So the main difference between pool.map and pool.map_async is that the former blocks the calling code until every task has finished, while the latter returns immediately and lets you collect the results later.
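To make the non-blocking behaviour visible, here is a minimal sketch in which the main process keeps counting while the pool works. The names slow_square and run_without_blocking, the sleep durations, and the pool size of 2 are assumptions chosen for illustration; only map_async, ready() and get() come from the library.

```python
import multiprocessing
import time


def slow_square(x):
    """Hypothetical worker: the sleep simulates a slow computation."""
    time.sleep(0.2)
    return x * x


def run_without_blocking(numbers, processes=2):
    """Start map_async, keep the main process busy, then collect the results."""
    polls = 0
    with multiprocessing.Pool(processes=processes) as pool:
        # Returns an AsyncResult right away; the work runs in the background.
        async_result = pool.map_async(slow_square, numbers)

        # ready() becomes True only once the whole batch has finished.
        while not async_result.ready():
            polls += 1          # stand-in for useful work in the main process
            time.sleep(0.05)

        # The batch is already done here, so get() returns without waiting.
        results = async_result.get()
    return results, polls


if __name__ == "__main__":
    results, polls = run_without_blocking([1, 2, 3, 4])
    print(results)
    print(f"main process polled {polls} times while the pool was working")
```

With pool.map in place of map_async, the loop would never run, because the call would not return until the results were ready.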

Here are examples of using pool.map and pool.map_async in Python:

Example of using pool.map:

import multiprocessing

def square(x):
    return x * x

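if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]

    # The with block shuts the pool down once the work is done.
    with multiprocessing.Pool() as pool:
        # Blocks here until every number has been processed.
        results = pool.map(square, numbers)

    print(results)  # [1, 4, 9, 16, 25]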

In this example, we define a function called square that returns the square of a number, a list called numbers, and a process pool called pool. The call pool.map(square, numbers) applies square to each element of numbers in parallel across multiple processes and returns the results in the same order as the inputs.
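The ordering guarantee holds even when tasks finish out of order. The following sketch is one way to check this, assuming a hypothetical worker delayed_identity that sleeps for the given number of seconds; the delays and the pool size of 3 are arbitrary illustration values.

```python
import multiprocessing
import time


def delayed_identity(delay):
    """Hypothetical worker: sleep for `delay` seconds, then return `delay`."""
    time.sleep(delay)
    return delay


def map_with_uneven_tasks(delays, processes=3):
    """Run tasks of different durations and return pool.map's result list."""
    with multiprocessing.Pool(processes=processes) as pool:
        # chunksize=1 hands items to the workers one at a time, so the
        # shorter tasks really do finish before the longer ones.
        return pool.map(delayed_identity, delays, chunksize=1)


if __name__ == "__main__":
    delays = [0.3, 0.1, 0.2]
    # The 0.1 s task finishes first, yet the list follows the input order.
    print(map_with_uneven_tasks(delays))
```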

Example of using pool.map_async:

import multiprocessing

def cube(x):
    return x * x * x

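if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]

    with multiprocessing.Pool() as pool:
        # Returns an AsyncResult immediately; the tasks run in the background.
        async_result = pool.map_async(cube, numbers)

        # Continue running the program without waiting for the tasks to finish

        async_result.wait()  # Wait for all tasks to finish

        # Fetch the results before the with block shuts the pool down.
        results = async_result.get()

    print(results)  # [1, 8, 27, 64, 125]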

In this example, we also create a process pool called pool, define a cube function and a list of numbers, and call pool.map_async(cube, numbers) to apply cube to each element of the list asynchronously. The call returns immediately, so the program can do other work before calling async_result.wait() to wait for all tasks to complete and async_result.get() to retrieve the results. Because get() itself blocks until the results are ready, the explicit wait() is optional.
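Instead of calling get(), map_async can also hand the results to a callback. The sketch below assumes two hypothetical handlers, on_done and on_error, passed through the callback and error_callback parameters of map_async; the wrapper cube_all_with_callback and the pool size are illustration choices as well.

```python
import multiprocessing


def cube(x):
    return x * x * x


def cube_all_with_callback(numbers, processes=2):
    """Collect map_async results through a callback instead of get()."""
    collected = []

    def on_done(results):
        # Runs once in the parent process, with the full list of results.
        collected.extend(results)

    def on_error(exc):
        # Runs in the parent process if a task raised an exception.
        print(f"a task failed: {exc!r}")

    with multiprocessing.Pool(processes=processes) as pool:
        async_result = pool.map_async(
            cube, numbers, callback=on_done, error_callback=on_error
        )
        # Keep the pool alive until the batch has finished.
        async_result.wait()
    return collected


if __name__ == "__main__":
    print(cube_all_with_callback([1, 2, 3, 4, 5]))
```

The callbacks run in the main process, so they should return quickly; a slow callback delays the handling of other results.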

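Both of these methods spread the work across several processes, which can speed up CPU-bound tasks. For very small tasks, the cost of starting processes and passing data between them can outweigh the gain.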
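Whether a pool actually helps depends on the workload, so it is worth measuring. Here is a rough sketch that times a plain loop against pool.map; slow_square, timed and compare are hypothetical helpers, and the sleep is only a stand-in for a slow computation, so the printed timings will vary from machine to machine.

```python
import multiprocessing
import time


def slow_square(x):
    """Hypothetical worker: the sleep stands in for a slow computation."""
    time.sleep(0.1)
    return x * x


def timed(label, func):
    """Run func(), print how long it took, and return its result."""
    start = time.perf_counter()
    result = func()
    print(f"{label}: {time.perf_counter() - start:.2f} s")
    return result


def compare(numbers, processes=4):
    """Compute the same results serially and with pool.map, timing both."""
    serial = timed("serial loop", lambda: [slow_square(n) for n in numbers])
    with multiprocessing.Pool(processes=processes) as pool:
        parallel = timed("pool.map", lambda: pool.map(slow_square, numbers))
    return serial, parallel


if __name__ == "__main__":
    serial, parallel = compare(list(range(8)))
    print(serial == parallel)
```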

For a more detailed but equally succinct look at the multiprocessing module, see this article.
