4 Methods for Testing Python Command-Line (CLI) Applications

You have just finished building your first Python command-line application. Or perhaps it's already your second or third. Maybe you've been learning Python for a while and are now ready to build something bigger and more complex, but still meant to be run from the command line. Or you are used to developing and testing web or GUI applications and are now starting to build command-line interface (CLI) applications.

In all these situations and more, you will need to learn and master various methods for testing Python CLI applications.

While tool selection may seem daunting, the main thing to remember is that you are essentially comparing the results your code produces with the results you expect. Everything follows from that.

In this tutorial, you’ll learn four practices for testing Python command-line applications:

  • Debugging “Lo-Fi” with print()

  • Using the Visual Python Debugger

  • Unit testing with pytest and mocks

  • Integration testing

Everything will be based on a basic Python CLI application: the user interacts with it through the command line, entering commands and getting the corresponding results back. The application receives its data as a multi-level dictionary, which can contain several layers of nesting, such as dictionaries nested inside lists. This data is passed to two functions that transform it according to the given logic, which may include processing, filtering, sorting, or any other operation the data requires, and then return their results to the application. The results can take various forms, such as text messages or structured data. Finally, the application displays the results to the user through the command-line interface.

So the whole process will involve entering data through the command line, passing and processing the data with functions, and then outputting the results back to the user.

In the example code below, we’ll look at a few different methods to help you with your testing. While this tutorial is certainly not exhaustive, I hope it will provide you with enough knowledge so that you can confidently create effective tests in the main areas of testing.

I’ve added a few bugs to the source code that we’ll be looking for through testing methods.

Note: For simplicity, some basic safeguards are left out of this code, such as checking whether a key exists in a dictionary.
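If you ever want to add such a check yourself, it usually amounts to a dictionary lookup with a default value; here is a quick, hypothetical sketch:

# Hypothetical defensive lookup: fall back to an empty list if the key is missing
parents = person_data.get('parents', [])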

First of all, let's look at what our objects should look like at each stage of the application. Let's start with a structure that describes John Q. Public:

JOHN_DATA = {
    'name': 'John Q. Public',
    'street': '123 Main St.',
    'city': 'Anytown',
    'state': 'FL',
    'zip': 99999,
    'relationships': {
        'siblings': ['Michael R. Public', 'Suzy Q. Public'],
        'parents': ['John Q. Public Sr.', 'Mary S. Public'],
    }
}

We then simplify the other dictionaries by flattening their multi-level structures into a more linear form that is easier to work with. This is what we expect after calling our first transform function, initial_transform:

JOHN_DATA = {
    'name': 'John Q. Public',
    'street': '123 Main St.',
    'city': 'Anytown',
    'state': 'FL',
    'zip': 99999,
    'siblings': ['Michael R. Public', 'Suzy Q. Public'],
    'parents': ['John Q. Public Sr.', 'Mary S. Public'],
}

Then, using the function final_transform, we collect all of the address information into a single address record:

JOHN_DATA = {
    'name': 'John Q. Public',
    'address': '123 Main St. \nAnytown, FL 99999',
    'siblings': ['Michael R. Public', 'Suzy Q. Public'],
    'parents': ['John Q. Public Sr.', 'Mary S. Public'],
}

And when print_person is called, the following will be written to the console:

Hello, my name is John Q. Public, my siblings are Michael R. Public 
and Suzy Q. Public, my parents are John Q. Public Sr. and Mary S. Public, 
and my mailing address is:
123 Main St. 
Anytown, FL 99999

testapp.py:

def initial_transform(data):
    """
    Flatten nested dicts
    """
    for item in list(data):
        if type(item) is dict:
            for key in item:
                data[key] = item[key]

    return data


def final_transform(transformed_data):
    """
    Transform address structures into a single structure
    """
    transformed_data['address'] = str.format(
        "{0}\n{1}, {2} {3}", transformed_data['street'], 
        transformed_data['state'], transformed_data['city'], 
        transformed_data['zip'])

    return transformed_data


def print_person(person_data):
    parents = "and".join(person_data['parents'])
    siblings = "and".join(person_data['siblings'])
    person_string = str.format(
        "Hello, my name is {0}, my siblings are {1}, "
        "my parents are {2}, and my mailing"
        "address is: \n{3}", person_data['name'], 
        parents, siblings, person_data['address'])
    print(person_string)


john_data = {
    'name': 'John Q. Public',
    'street': '123 Main St.',
    'city': 'Anytown',
    'state': 'FL',
    'zip': 99999,
    'relationships': {
        'siblings': ['Michael R. Public', 'Suzy Q. Public'],
        'parents': ['John Q. Public Sr.', 'Mary S. Public'],
    }
}

suzy_data = {
    'name': 'Suzy Q. Public',
    'street': '456 Broadway',
    'apt': '333',
    'city': 'Miami',
    'state': 'FL',
    'zip': 33333,
    'relationships': {
        'siblings': ['John Q. Public', 'Michael R. Public', 
                    'Thomas Z. Public'],
        'parents': ['John Q. Public Sr.', 'Mary S. Public'],
    }
}

inputs = [john_data, suzy_data]

for input_structure in inputs:
    initial_transformed = initial_transform(input_structure)
    final_transformed = final_transform(initial_transformed)
    print_person(final_transformed)

At the moment the code does not live up to these expectations, so we will keep working on it as we study effective ways to apply these four methods. Along the way you will gain practical experience with them, expand your comfort zone, and start to understand which tasks each of them suits best.


“Lo-Fi” Debugging with print()

This is one of the easiest ways to test. All you have to do is print the variable or object you are interested in – before the function call, after the function call, or inside the function.

Accordingly, this lets you check the function's input data, its output data, and the logic inside it.

If you save the above code as testapp.py and try to run it with python testapp.py, you might see an error like this:

Traceback (most recent call last):
  File "testapp.py", line 60, in <module>
    print_person(final_transformed)
  File "testapp.py", line 23, in print_person
    parents = "and".join(person_data['parents'])
KeyError: 'parents'

The person_data passed to print_person is missing a key. The first step is to check the input of print_person and find out why the expected output (the printed message) is not being produced. We'll simply add a print call right before the call to print_person:

final_transformed = final_transform(initial_transformed)
print(final_transformed)
print_person(final_transformed)

print copes with this task, showing in the output that we have neither a top-level parents key nor a siblings key. But in the interest of readability, I'll show you the pprint function, which renders nested objects in a more readable way. To use it, add from pprint import pprint at the top of your script.

Instead of print(final_transformed), we call pprint(final_transformed) to inspect our object:

{'address': '123 Main St.\nFL, Anytown 99999',
 'city': 'Anytown',
 'name': 'John Q. Public',
 'relationships': {'parents': ['John Q. Public Sr.', 'Mary S. Public'],
                   'siblings': ['Michael R. Public', 'Suzy Q. Public']},
 'state': 'FL',
 'street': '123 Main St.',
 'zip': 99999}

Compare this to the expected end result we described above.
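For reference, the temporary debugging lines at the bottom of the script now look something like this (a sketch; the pprint call is meant to be removed once the bug is found):

from pprint import pprint  # added at the top of testapp.py

# ...

for input_structure in inputs:
    initial_transformed = initial_transform(input_structure)
    final_transformed = final_transform(initial_transformed)
    pprint(final_transformed)  # temporary debug output
    print_person(final_transformed)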

Since we know that final_transform doesn't affect the relationships dictionary, it's time to figure out what's going on inside initial_transform. Usually in these situations I would use a traditional debugger to step through the code. However, right now I want to show you another way to debug with print output.

We can display the state of objects, but it doesn't end there. You can print anything you like and use it to follow the progress of the program. You can also print markers to see which branches of logic are executed and when.

Since initial_transform mostly consists of loops, with the inner dictionaries handled by a nested for loop, it would be useful to see what is going on inside it and find out whether anything important or interesting is happening there.

def initial_transform(data):
    """
    Flatten nested dicts
    """
    for item in list(data):
        if type(item) is dict:
            print "item is dict!"
            pprint(item)
            for key in item:
                data[key] = item[key]

    return data

If a dictionary is encountered in our input data, a message will be printed to the console, and then we will see what that element looks like.

However, after running the program, our console output remains unchanged. This suggests that our if statement is not working as expected. Instead of adding more print output to hunt for the error, now is a great time to demonstrate the benefits of using a debugger.

However, as an exercise, I recommend looking for errors in this code using only print debugging. This is a good practice and will get you thinking about how to use the console to alert you to various situations happening within a program.

Summarizing

When to Use Print Debugging:

  • Simple objects

  • Short scripts

  • Mistakes that seem simple

  • Quick checks

Useful tools:

  • pprint – giving a more visually pleasing or readable appearance to the data output. Output using pprint will be nicely formatted, with indentation, delimiters, and line breaks, making the data structure easy to read.

Pros:

  • Quick and easy to use

  • Requires nothing beyond print() and pprint()

Cons:

  • Often you need to run the entire program; if not:

  • You need to add extra code to control the flow of execution manually

  • There is a risk of leaving debug code behind, especially in complex programs


Using the debugger

Debuggers are great when you want to step through the code and examine the state of the entire application. They help when you know roughly where the errors occur but cannot understand the cause. Debuggers also give you an overview of everything that's going on in your application.

There are many debuggers, and they are often included in integrated development environments (IDEs). Python also ships with the pdb module, which can be used to debug code in an interactive REPL (Read-Eval-Print Loop). Instead of going into the implementation details of every available debugger, in this section I'll show you how to use the features debuggers have in common, such as setting breakpoints and watches.

Breakpoints are markers or instructions in your code that tell the debugger where to pause program execution so you can examine the current state of your application. Watches are expressions that you can add during a debugging session to monitor the value of variables (and more).

But let's get back to breakpoints. They are added wherever you want to start or continue a debugging session. Since we are debugging initial_transform, we want to put one right there. I'll mark the breakpoint with an asterisk (*):

def initial_transform(data):
    """
    Flatten nested dicts
    """
(*) for item in list(data):
        if type(item) is dict:
            for key in item:
                data[key] = item[key]

    return data

When we start debugging, program execution will pause at this line, and you will be able to see the variables and their types at that particular point in the program. We have several options for navigating the code; the most common are step over, step in, and step out.

step over is the command you will use most often. It simply moves on to the next line of code.

step in goes deeper into the code. Use it when you come across a function call that you want to explore in more detail: the debugger jumps into that function's code and lets you examine the state there. It is also easy to hit by accident when you meant to step over. Fortunately, step out comes to the rescue, taking you back to the calling function.

We can also set a watch here, for example type(item) is dict. In most IDEs you do this with an 'add watch' button during a debugging session. The expression will now display True or False no matter where you are in the code.

Set the watch, then step over until you stop at the line if type(item) is dict:. You should now see the state of the watch, the new variable item, and the data object.
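If you are not working in an IDE, the built-in pdb module gives you the same workflow from the command line. Here is a minimal sketch, assuming Python 3.7+ for the breakpoint() built-in:

def initial_transform(data):
    """
    Flatten nested dicts
    """
    for item in list(data):
        breakpoint()  # execution pauses here and drops into the pdb prompt
        if type(item) is dict:
            for key in item:
                data[key] = item[key]

    return data

# Useful commands at the (Pdb) prompt:
#   p item                        print the current value of item
#   display type(item) is dict    re-evaluate this expression after every step (a watch)
#   n                             step over (run the next line)
#   s                             step into a function call
#   r                             step out (run until the current function returns)
#   c                             continue to the next breakpoint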

Even without the watch, we can spot the problem: instead of type() checking the type of the value that item points to, it checks the type of the variable item itself, which is a string. Ultimately, computers do only what we tell them. Thanks to the debugger we can see the error in our code and fix it as follows:

def initial_transform(data):
    """
    Flatten nested dicts
    """
    for item in list(data):
        if type(data[item]) is dict:
            for key in data[item]:
                data[key] = data[item][key]

    return data

Let's run the code through the debugger again to make sure it follows the intended path. Except it doesn't, and the structure now looks like this:

john_data = {
    'name': 'John Q. Public',
    'street': '123 Main St.',
    'city': 'Anytown',
    'state': 'FL',
    'zip': 99999,
    'relationships': {
        'siblings': ['Michael R. Public', 'Suzy Q. Public'],
        'parents': ['John Q. Public Sr.', 'Mary S. Public'],
    },
    'siblings': ['Michael R. Public', 'Suzy Q. Public'],
    'parents': ['John Q. Public Sr.', 'Mary S. Public'],
}

We have now covered how to use a visual debugger and put it to work on our code. Like the other approaches, this technique has its pros and cons, which you can review in the summary below.

Summarizing

When to use the Python debugger:

  • More complex projects

  • Hard-to-find errors

  • More than one object needs to be checked

  • You roughly know where an error occurs but need to pinpoint it

Useful tools:

  • The built-in pdb module and the debuggers included in most IDEs

Pros:

  • Program flow control

  • Bird’s-eye view of application status

  • No need to know exactly where the error occurred

Cons:


Unit testing with Pytest and Mocks

When it comes to testing, the previous methods can be tedious and may even require changing your code, especially if you want to exhaustively test combinations of inputs and outputs and cover every possible branch of your code. In our example, the result of running initial_transform still doesn't look the way we want it to.

Although the logic of our code is quite simple, over time it can become larger and more complex, or become the responsibility of an entire team. How can you test your application in a more streamlined, granular, and automated way?

Unit tests come to the rescue.

Unit testing is a technique in which source code is broken down into smaller, more understandable blocks (units), usually methods or functions, which are then tested separately from each other.

The idea is that you create a set of test cases that exercise each method with different inputs, making sure that every logical branch within every method is tested. How much of the code your tests exercise is called code coverage, and in general everyone aims for 100% coverage. That is not always practical or necessary, but it is a topic for a separate article (or tutorial).
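As a tiny illustration of what covering every branch means, a function with a single if statement already needs at least two test cases. A hypothetical sketch:

def shout_if_long(name):
    # Hypothetical example: one if statement means two branches to cover
    if len(name) > 10:
        return name.upper()
    return name


def test_shout_if_long_short_name():
    assert shout_if_long("John") == "John"  # covers the False branch


def test_shout_if_long_long_name():
    assert shout_if_long("John Q. Public") == "JOHN Q. PUBLIC"  # covers the True branch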

During a test, each method is considered in isolation: external calls are replaced using a technique called mocking so that they produce reliable return values, and any temporary objects or state are cleaned up after the test runs. These and other techniques ensure the independence and isolation of the unit under test.

Repeatability and isolation are key to these kinds of tests, although we are still comparing expected results with actual ones. Now that you have a general idea of unit testing, you can take a quick detour and see how to unit test Flask applications with a minimum viable set of tests.

Pytest

Now that we have perhaps gone too deep into the theory, let's see how it works in practice. Python has a built-in unittest module, but I believe pytest makes unit testing even more convenient. Either way, I will only show you the basics of unit testing, since covering this topic in detail would take too long.

It's good practice to put all tests in a test directory inside your project. For our small script, a test_testapp.py file next to testapp.py will do.

We will write a unit test for the initial_transform function to show how to define a set of expected inputs and outputs and make sure they match. The main approach I use with pytest is to create fixtures that take parameters and use them to generate the test inputs and expected outputs I need.

Let's start by setting up the fixture. Look at the code and think about the test cases that would be needed to cover every possible branch of logic in initial_transform:

import pytest
import testapp as app

@pytest.fixture(params=['nodict', 'dict'])
def generate_initial_transform_parameters(request):

Before we generate the input data, let’s understand what’s going on here so as not to get confused.

First, we use the @pytest.fixture decorator to declare that the following function definition is a fixture. We also pass a params keyword argument, whose values will be handed to generate_initial_transform_parameters.

An interesting feature is that every time the decorated fixture is used, it is invoked once per parameter. So any test that uses generate_initial_transform_parameters will run twice: first with 'nodict' as the parameter, and then with 'dict'.

To access these parameters, we add pytest's special request object to the fixture's signature.

Now let’s collect our input data and what we expect to get as a result:

@pytest.fixture(params=['nodict', 'dict'])
def generate_initial_transform_parameters(request):
    test_input = {
        'name': 'John Q. Public',
        'street': '123 Main St.',
        'city': 'Anytown',
        'state': 'FL',
        'zip': 99999,
    }
    expected_output = {
        'name': 'John Q. Public',
        'street': '123 Main St.',
        'city': 'Anytown',
        'state': 'FL',
        'zip': 99999,
    }

    if request.param == 'dict':
        test_input['relationships'] = {
            'siblings': ['Michael R. Public', 'Suzy Q. Public'],
            'parents': ['John Q. Public Sr.', 'Mary S. Public'],
        }
        expected_output['siblings'] = ['Michael R. Public', 'Suzy Q. Public']
        expected_output['parents'] = ['John Q. Public Sr.', 'Mary S. Public']

    return test_input, expected_output

There is nothing surprising here: we set the input data and the expected result. If the parameter is 'dict', we modify the input and the expected result, which gives us the opportunity to test the if block.

Then we write the test itself. To access the fixture, we pass it as a parameter to the test function:

def test_initial_transform(generate_initial_transform_parameters):
    test_input = generate_initial_transform_parameters[0]
    expected_output = generate_initial_transform_parameters[1]
    assert app.initial_transform(test_input) == expected_output

There are a few key things to keep in mind when writing test functions. First, the name of a test function must begin with the test_ prefix; this signals that the function is a test and will check some aspect of the code. Second, test functions are built around assert statements: with them, we assert that the expected result is the same as what we get by running the input through our function.

When you run the test functions, either in your development environment with a test run configuration or with the pytest tool from the command line, you may run into failures. This is a necessary part of the process: your results may not be what you expect, and that's perfectly normal.

Nothing replaces hands-on experience: applying what you read as you go will help you absorb the information better and recall it more easily later. So don't be afraid of mistakes, and keep practicing to consolidate what you have learned.

Mocks

Mocks are a concept used in software testing, especially unit testing. They are special objects or components created to mimic the behavior of real objects or functions in your code. Since we are testing one single block (unit) of code, we only care about its behavior and pay little attention to how other function calls work. What matters is getting reliable results from exactly the part of the code that we are testing.

Let’s add an external function call to initial_transform:

def initial_transform(data):
    """
    Flatten nested dicts
    """
    for item in list(data):
        if type(data[item]) is dict:
            for key in data[item]:
                data[key] = data[item][key]
            data.pop(item)

    outside_module.do_something()
    return data

We don't want to make real calls to do_something(), so instead we will create a mock in our test script. The mock will intercept the call and return whatever result we configure for it. I prefer to set up these mocks in fixtures, since this is part of test preparation and it keeps all of the setup code in one place:

@pytest.fixture(params=['nodict', 'dict'])
def generate_initial_transform_parameters(request, mocker):
    [...]
    do_something_mock = mocker.patch.object(outside_module, 'do_something')
    do_something_mock.return_value = 1
    [...]

Now, every time initial_transform is called, the call to do_something will be intercepted and will return 1 instead of making the actual call. You can also use the fixture's parameters to control what your mock returns, which is useful when a branch in the code depends on the result of an external call.
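For example, here is a sketch of driving the mock's return value from the fixture's parameters (outside_module is the same placeholder used above, and the parameter values are hypothetical):

import pytest


@pytest.fixture(params=[1, 2])
def mocked_do_something(request, mocker):
    # The fixture parameter becomes the mock's return value, so any test using
    # this fixture runs once per value and exercises both branches of code
    # that depends on the result of do_something().
    do_something_mock = mocker.patch.object(outside_module, 'do_something')
    do_something_mock.return_value = request.param
    return do_something_mock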

Another interesting trick is to use side_effect. Among other things, this lets you simulate different return values on successive calls to the same function:

def initial_transform(data):
    """
    Flatten nested dicts
    """
    for item in list(data):
        if type(data[item]) is dict:
            for key in data[item]:
                data[key] = data[item][key]
            data.pop(item)

    outside_module.do_something()
    outside_module.do_something()
    return data

We set up our mock by passing a list of return values (one for each successive call) as side_effect:

@pytest.fixture(params=['nodict', 'dict'])
def generate_initial_transform_parameters(request, mocker):
    [...]
    do_something_mock = mocker.patch.object(outside_module, 'do_something')
    do_something_mock.side_effect = [1, 2]
    [...]

Mocking is a very powerful tool. It is so powerful that you can even create mock servers to test third-party APIs. Again, I encourage you to explore mocking further with the mocker fixture (provided by the pytest-mock plugin).
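To give you a taste, here is a minimal sketch of faking a third-party HTTP call with mocker; the get_user_name function and the API URL are hypothetical, while requests.get and the mocker API are real:

import requests


def get_user_name(user_id):
    # Hypothetical code under test: fetches a person's name from a third-party API
    response = requests.get(f"https://api.example.com/users/{user_id}")
    return response.json()["name"]


def test_get_user_name(mocker):
    # Build a fake response object and make requests.get return it
    # instead of touching the network
    fake_response = mocker.Mock()
    fake_response.json.return_value = {"name": "John Q. Public"}
    mocker.patch("requests.get", return_value=fake_response)

    assert get_user_name(42) == "John Q. Public"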

Summarizing

When to use Python unit testing frameworks:

  • For large and complex projects.

  • In the case of open source projects (OSS).

  • When you want to automate the testing process.

Useful tools:

  • pytest, the built-in unittest module, and the pytest-mock plugin for mocking

Advantages of using frameworks:

  • Test run automation.

  • Ability to detect different types of errors.

  • Easy setup and modification for the development team.

Cons of using frameworks:

  • The need to write additional code (tests).

  • Requiring tests to be updated when the code changes.

  • Tests cannot fully recreate the actual execution of the application.


Integration testing

Integration testing is one of the simplest verification methods, but perhaps the most important. Its essence is to fully run your application with real data in an environment that resembles production.

This could be your personal computer, a test server, a duplicate of the production server, or simply switching the connection to a test database instead of the production one. This lets you make sure that your changes will work after deployment.

As with the other testing methods, you check that your application produces the expected results for certain inputs. But this time you are using real components (as opposed to unit testing, where they are mocked), and perhaps writing data to real databases or files. In the case of large applications, you also make sure that your code integrates well with the overall system.

How you do integration testing depends largely on your application. For example, our test application can simply be run on its own with python testapp.py. However, if your code is part of a complex distributed application, such as an ETL pipeline, you will need to run the whole system on test servers with your new code, push data through it, and make sure it flows through the entire system in the correct form. Beyond the command line, you can also use tools such as pyVows for integration testing, for example if your application is based on Django.
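For our small script, an integration test can be as simple as launching it with Python's subprocess module and checking what it prints. A minimal sketch, assuming the bugs found earlier have been fixed and testapp.py sits next to the test file:

import subprocess
import sys


def test_testapp_prints_john():
    # Run the application exactly as a user would, from the command line
    result = subprocess.run(
        [sys.executable, "testapp.py"],
        capture_output=True,
        text=True,
        check=True,  # fail the test if the script exits with an error
    )

    # Compare the real output with what we expect to see for John
    assert "Hello, my name is John Q. Public" in result.stdout
    assert "123 Main St." in result.stdout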

Summarizing

When to Use Python Integration Testing:

Useful tools:

Pros of using:

Cons of using:

  • In the case of large applications, it is difficult to accurately track the flow of data

  • Requires test environments that are as close as possible to production environments


Let’s sum up all of the above

In conclusion, all command-line testing comes down to comparing expected results with what actually happens when you feed in certain inputs. We have looked at several methods for doing this, and in many cases they complement one another. These methods will be helpful when developing Python CLI applications, but this tutorial is just a starting point.

Python has a rich ecosystem of testing tools, and it's worth exploring. Feel free to get familiar with other tools and techniques: you may find something I did not mention here that turns out to be useful to you. If so, be sure to share your experience in the comments!

As a quick reminder, here are the techniques we talked about today and how to apply them:

  • Debugging with print: printing variables and markers in your code to see how your program is executing

  • Debuggers: controlling the execution of the program to get a picture of the application's state and flow

  • Unit testing: splitting an application into independent blocks (units) and testing every possible execution path within those blocks

  • Integration testing: testing changes in the context of the whole application

So, start testing! As you work through these methods, don’t forget to share in the comments how you use them and which ones work best for you.

