Building Scalable and Clean Applications in Python (Tutorial)

When it comes to building scalable and maintainable applications, understanding important concepts such as clean code principles, architectural patterns, and SOLID design practices is critical. By learning these principles, beginners will gain an understanding of how to build robust, flexible, and easily testable applications, allowing them to keep their code base clear and maintainable as their projects grow.

A bit of clean code theory

Before diving into the architecture, I'd like to answer a few frequently asked questions:

  • What are the benefits of specifying types in Python?

  • What are the reasons for dividing an application into layers?

  • What are the benefits of using OOP?

  • What are the disadvantages of using global variables or singletons?

Feel free to skip the theory sections if you already know the answers and go directly to the “Creating a Program” section.

Always specify types

Type annotation significantly improves the code, increasing its clarity, reliability, and maintainability:

  1. Type safety: Type annotations help identify type mismatches early on, which reduces errors and ensures that your code behaves as expected.

  2. Self-documenting code: Type hints improve code readability and act as inline documentation by clarifying the expected types of function inputs and outputs.

  3. Improving code quality: The use of type hints improves design and architecture by promoting thoughtful planning and implementation of data structures and interfaces.

  4. Improved tool support: Tools such as mypy use type annotations for static type checking, identifying potential errors before execution begins, thereby simplifying the development and testing process.

  5. Support for modern libraries: FastAPI, Pydantic and other libraries use type annotations to automate data validation, generate documentation, and reduce code duplication.

  6. Advantages of typed dataclasses over simple data structures: Typed dataclasses improve readability, structure, and type safety compared to plain tuples and dicts. They use named attributes instead of string keys or positional indices, which minimizes typo-related errors and improves code completion. Dataclasses also define data structures clearly, support default values, and simplify code maintenance and debugging.
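
For illustration, here is a minimal sketch (not from the article's repository) contrasting a bare tuple with a typed dataclass:

from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str
    email: str = "unknown@example.com"  # default values are supported

# tuple: positional and unlabeled, easy to misuse, invisible to type checkers
user_tuple = (1, "John Doe", "john@example.com")
name = user_tuple[1]  # a typo like user_tuple[2] would go unnoticed

# dataclass: named attributes, checked by mypy, friendly to code completion
user = User(id=1, name="John Doe", email="john@example.com")
name = user.name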

Why do we need to divide the application into layers

Separating an application into layers improves maintainability, scalability, and flexibility. Key reasons for this strategy include:

  • Separation of concerns: Each layer focuses on a specific aspect, making it easier to develop, debug, and maintain.

  • Reusability: Well-separated layers and their components can be reused in other projects or contexts.

  • Scalability: Individual layers can be scaled or replaced independently as the system grows.

  • Ease of maintenance: Changes stay localized to the layer they concern.

  • Improved collaboration: Different people or teams can work on different layers in parallel.

  • Flexibility and adaptability: Changes in technology or design can be implemented in specific layers; only the affected layers need adaptation, the rest remain untouched.

  • Testability: Each layer can be tested in isolation against the abstractions it depends on.

Using a layered architecture provides significant benefits in development speed, operational management, and long-term maintenance, making systems more reliable, manageable, and adaptable to change.
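
As a minimal sketch of what a layer boundary can look like in Python (illustrative names, not from the article's repository), the business layer below depends only on a protocol, not on any concrete database:

from typing import Protocol

class OrderRepository(Protocol):  # contract exposed by the data-access layer
    def save(self, order_id: int) -> None: ...

class OrderService:  # business-logic layer
    def __init__(self, repository: OrderRepository):
        self.repository = repository  # depends on the contract, not on a database

    def place_order(self, order_id: int) -> None:
        # validation and business rules would live here
        self.repository.save(order_id)  # the presentation layer calls place_order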

Global constants vs. injected parameters

When developing software, the choice between using global constants and using dependency injection (DI) can have a significant impact on the flexibility, maintainability, and scalability of applications. This analysis examines the disadvantages of global constants and contrasts them with the advantages provided by dependency injection.

Global Constants

  1. Fixed configuration: Global constants are static and cannot dynamically adapt to different environments or requirements without changing the code base. This rigidity limits their use in various operating scenarios.

  2. Limited scope of testing: Testing becomes difficult when using global constants because they are not easily overridden. Developers may need to change global state or use complex workarounds to accommodate different test scenarios, increasing the risk of bugs.

  3. Reduced modularity: Relying on global constants reduces modularity because components become dependent on specific values set globally. This dependency reduces the ability to reuse components across projects or contexts.

  4. Tight coupling: Global constants bake specific behavior and configuration directly into the codebase, making it difficult to adapt or evolve the application without significant changes.

  5. Hidden dependencies: Like global variables, global constants hide dependencies within an application. It becomes unclear which parts of the system depend on these constants, making the code difficult to understand and maintain.

  6. Difficulties of maintenance and refactoring: Over time, using global constants can lead to maintenance problems. Refactoring such a codebase is risky because changes to constants may accidentally affect different parts of the application.

  7. Duplicated state at the module level: In Python, module-level code can be executed more than once if the module is imported via different paths (for example, absolute and relative). This can lead to duplicated global instances and hard-to-track bugs.

Injected parameters

  1. Dynamic flexibility and customizability: Dependency injection allows you to dynamically configure components, making applications adaptable to changing conditions without the need to change code.

  2. Improved testability: DI improves testability by allowing mocks or alternative configurations to be introduced during testing, effectively isolating components from external dependencies and providing more reliable test results.

  3. Increased modularity and reusability: Components become more modular and reusable since they are designed to work with any injected parameters corresponding to the expected interfaces. This separation of concerns increases the portability of components across different parts of the application or even across different projects.

  4. Loose coupling: Injected parameters promote loose coupling by decoupling the system's logic from its configuration. This approach makes it easier to update the application and make changes.

  5. Explicitly declaring dependencies: In DI, components explicitly declare their dependencies, usually through constructor parameters or setters. This clarity makes the system easier to understand, maintain, and expand.

  6. Scalability and complexity management: As applications grow, DI helps manage complexity by isolating problems and decoupling configuration from usage, enabling efficient scaling and maintenance of large systems.
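
As a small illustration (the names here are invented for this sketch), an injected parameter gives each caller and each test its own value, while a module-level constant would fix it for everyone:

class ReportGenerator:
    def __init__(self, page_size: int = 50):
        # injected parameter: each environment or test supplies its own value
        self.page_size = page_size

    def pages(self, total_rows: int) -> int:
        return -(-total_rows // self.page_size)  # ceiling division

# production code uses the default; a test injects a tiny page size
assert ReportGenerator().pages(120) == 3
assert ReportGenerator(page_size=10).pages(120) == 12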

Procedural programming vs OOP

Using object-oriented programming (OOP) and dependency injection (DI) can significantly improve the quality and maintainability of code compared to a procedural approach with global variables and functions. Here's a simple comparison demonstrating these benefits:

Procedural Approach: Global Variables and Functions

# Global configuration
database_config = {
    'host': 'localhost',
    'port': 3306,
    'user': 'user',
    'password': 'pass'
}

def connect_to_database():
    print(f"Connecting to database on {database_config['host']}...")
    # Assume connection is made
    return "database_connection"

def fetch_user(database_connection, user_id):
    print(f"Fetching user {user_id} using {database_connection}")
    # Fetch user logic
    return {'id': user_id, 'name': 'John Doe'}

# Usage
db_connection = connect_to_database()
user = fetch_user(db_connection, 1)

Problems with this approach:

  • Code duplication: database_config must be passed around or accessed globally across multiple functions.

  • Testing difficulties: Simulating a database connection or configuration involves manipulating global state, which is error-prone.

  • Tight coupling: Functions directly depend on global state and specific implementations.

OOP + DI approach

from typing import Any, Dict
from abc import ABC, abstractmethod

class DatabaseConnection(ABC):
    @abstractmethod
    def connect(self):
        pass

    @abstractmethod
    def fetch_user(self, user_id: int) -> Dict:
        pass

class MySQLConnection(DatabaseConnection):
    def __init__(self, config: Dict[str, Any]):  # values are mixed (port is an int)
        self.config = config

    def connect(self):
        print(f"Connecting to MySQL database on {self.config['host']}...")
        # Assume connection is made

    def fetch_user(self, user_id: int) -> Dict:
        print(f"Fetching user {user_id} from MySQL")
        return {'id': user_id, 'name': 'John Doe'}

class UserService:
    def __init__(self, db_connection: DatabaseConnection):
        self.db_connection = db_connection

    def get_user(self, user_id: int) -> Dict:
        return self.db_connection.fetch_user(user_id)

# Configuration and DI
config = {
    'host': 'localhost',
    'port': 3306,
    'user': 'user',
    'password': 'pass'
}
db = MySQLConnection(config)
db.connect()
user_service = UserService(db)
user = user_service.get_user(1)

Benefits of this approach:

  • Reduced code duplication: The database configuration is encapsulated in a connection object.

  • DI capabilities: It is easy to swap MySQLConnection for another database connection class, for example PostgresConnection, without changing the UserService code.

  • Encapsulation and abstraction: The details of how users are retrieved or how the database is connected are hidden from the caller.

  • Convenient mocking and testing: UserService can be easily tested by injecting a stub DatabaseConnection, as shown below.

  • Object lifetime management: The lifecycle of database connections can be controlled more precisely (for example, using context managers).

  • Use of OOP principles: Demonstrates inheritance (abstract base class), polymorphism (implementation of abstract methods), and interfaces (the contract defined by DatabaseConnection).
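
For example, the mocking point above might look like this in a test; a minimal sketch reusing the classes defined earlier:

# a stub that satisfies the DatabaseConnection contract without a real database
class StubConnection(DatabaseConnection):
    def connect(self):
        pass  # nothing to connect to

    def fetch_user(self, user_id: int) -> Dict:
        return {'id': user_id, 'name': 'Test User'}

def test_get_user():
    service = UserService(StubConnection())  # inject the stub
    assert service.get_user(42)['name'] == 'Test User'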

By structuring an application using OOP and DI, the code becomes more modular, easier to test, and flexible to changes such as replacing dependencies or changing configuration.

Creating a program

All the examples, along with more detailed comments, can be found in the accompanying repository.

Start of a new project

A small checklist:

1. Manage Projects and Dependencies with Poetry

poetry new python-app-architecture-demo

This command will create a minimal directory structure: separate folders for the application and tests, a pyproject.toml project metadata file, local dependency files, and git configuration.
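
With a recent Poetry version, the generated layout looks roughly like this (details vary between versions):

python-app-architecture-demo
├── pyproject.toml
├── README.md
├── python_app_architecture_demo
│   └── __init__.py
└── tests
    └── __init__.py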

2. Version Control with Git

Initialize git:

git init

Add a .gitignore file to exclude unnecessary files from your repository. Use the standard Python .gitignore provided by GitHub and add any remaining exclusions, such as .DS_Store for macOS and editor folders (.idea, .vscode, .zed, etc.):

wget -O .gitignore https://raw.githubusercontent.com/github/gitignore/main/Python.gitignore
echo .DS_Store >> .gitignore

3. Dependency management

Install your project's dependencies using poetry:

poetry add fastapi pytest aiogram

You can install all dependencies later using:

poetry install

Consult each library's official documentation if you need more specific instructions.

4. Configuration files

Creating a config.py file to centralize application settings is a common and effective approach.

Set environment variables for secrets and settings:

touch .env example.env

.env contains sensitive data and should be git-ignored, while example.env contains placeholder or default values and is committed to the repository.
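
A minimal config.py sketch, assuming the python-dotenv package (not in the dependency list above) and plain environment access; the repository may organize settings differently:

# config.py – a minimal sketch, not the repository's exact file
import os
from dotenv import load_dotenv  # assumed extra dependency: python-dotenv

load_dotenv()  # pull variables from .env into the process environment

mail_token: str = os.environ['MAIL_TOKEN']           # fail fast if a secret is missing
telegram_token: str = os.environ['TELEGRAM_TOKEN']
database_url: str = os.getenv('DATABASE_URL', 'sqlite:///app.db')  # safe default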

5. Application entry point

Define your application's entry point in main.py:

python_app_architecture/main.py:

def run():
    print('Hello, World!')

if __name__ == '__main__': # avoid running when imported
    run()

Make your project usable as a library and allow programmatic access by importing the run function in __init__.py:

python_app_architecture/__init__.py:

from .main import run

Enable direct project execution with Poetry by adding a shortcut in __main__.py. This allows you to run poetry run python -m python_app_architecture instead of the longer poetry run python python_app_architecture/main.py (the -m flag is needed so that the package's relative imports resolve).

python_app_architecture/__main__.py:

from .main import run
run()

Defining Directories and Layers

Disclaimer:
Of course, every application is different, and architectures will differ depending on goals and objectives. I'm not saying this is the only correct option, but it is a reasonable middle ground suitable for a large share of projects. Try to focus on the basic approaches and ideas rather than the specific examples.

Now let's set up directories for the different layers of the application.

It generally makes sense to version the API (for example, by creating subdirectories like api/v1), but we will keep things simple for now and skip this step.

.
├── python_app_architecture_demo
│   ├── coordinator.py
│   ├── entities
│   ├── general
│   ├── mappers
│   ├── providers
│   ├── repository
│   │   └── models
│   └── services
│       ├── api_service
│       │   └── api
│       │       ├── dependencies
│       │       ├── endpoints
│       │       └── schemas
│       └── telegram_service
└── tests
  • app

    • entities – data structures of the entire application. Purely data carriers without logic.

    • general – the toolbox: a folder for common utilities, helpers, and library wrappers.

    • mappers – specialists in transforming data, such as database models into entities, or between other data formats. It is good practice to encapsulate mappers within their scope rather than keeping them global. For example, the models-to-entities mapper can be part of the repository module; likewise, the schemas-to-entities mapper should remain inside the API service as its private tool.

    • providers – the basis of business logic. Providers implement core application logic but remain independent of interface details, keeping their operations abstract and isolated.

    • repositories – librarians. Guardians of data access, abstracting the complexities of interaction with the database.

    • services – each service acts as an (almost) autonomous sub-application, organizing its own specific area of business logic and delegating the main tasks to providers. This configuration ensures centralized and consistent logic throughout the application.

      • api_service – manages external communication over HTTP(S), structured around the FastAPI framework.

        • dependencies – the core tools and helpers needed by the various parts of your API, integrated via FastAPI's DI system

        • endpoints – HTTP interface endpoints

        • schemas – definitions of the data structures for API requests and responses

      • telegram_service – works similarly to the API service, providing the same functionality through a different interface, but without duplicating business-logic code: it calls the same providers that the API service uses.

  • tests – the directory is intended exclusively for testing and contains all test code, maintaining a clear separation from the application logic.

The connection between the layers looks roughly like this. Note that entities are not active components, but only data structures that are transferred between layers. The layers themselves are not directly related; they depend only on abstractions, and implementations are passed in via dependency injection. Such a flexible structure makes it easy to add functionality, swap the database, create a new service, or connect a new interface without unnecessary changes or code duplication, since the logic of each module lives in its own layer. At the same time, all the logic of an individual service is encapsulated inside it.

Writing the code

Endpoint

Let's start with the endpoint:

# api_service/api/endpoints/user.py

from typing import Annotated
from fastapi import APIRouter, Depends, HTTPException, status
from entities.user import UserCreate
from ..dependencies.providers import (
	user_provider, # 1
	UserProvider # 2
)

router = APIRouter()

@router.post("/register")
async def register(
    user: UserCreate, # 3
    provider: Annotated[UserProvider, Depends(user_provider)] # 4
):
    provider.create_user(user) # 5
    return {"message": "User created!"}

  1. Import the dependency-injection helper function (we'll look at it in a minute).

  2. Import the UserProvider protocol for type annotation.

  3. The endpoint requires the request body to contain a UserCreate schema in JSON format.

  4. The provider parameter of the register function receives a UserProvider implementation instance, injected by FastAPI via the Depends mechanism.

  5. The parsed user data is passed to the provider's create_user method. This demonstrates a clear separation of concerns: the API layer delegates business logic to the provider layer, adhering to the principle that front-end layers should not contain business logic.
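
UserCreate is imported from the entities layer; the article doesn't show its definition, but for FastAPI to parse the request body it is presumably a Pydantic model, roughly:

# entities/user.py – a hypothetical sketch; see the repository for the real definition
from pydantic import BaseModel

class UserCreate(BaseModel):
    name: str
    email: str  # pydantic's EmailStr could validate the format (extra dependency)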

UserProvider

Now let's look at the business logic:

# providers/user_provider.py

from typing import Protocol, runtime_checkable, Callable
from repository import UserRepository
from providers.mail_provider import MailProvider
from entities.user import UserCreate


@runtime_checkable
class UserProvider(Protocol): # 1
    def create_user(self, user: UserCreate): ...

@runtime_checkable
class UserProviderOutput(Protocol): # 2
    def user_provider_created_user(self, provider: UserProvider, user: UserCreate): ...

class UserProviderImpl: # 3

    def __init__(self,
        repository: UserRepository,  # 4 
        mail_provider: MailProvider, # 4
        output: UserProviderOutput | None, # 5
        on_user_created: Callable[[UserCreate], None] | None # 6
    ):
        self.repository = repository
        self.mail_provider = mail_provider
        self.output = output
        self.on_user_created = on_user_created

    # Implementation

    def create_user(self, user: UserCreate): # 7
    
        self.repository.add_user(user) # 8
        self.mail_provider.send_mail(user.email, f"Welcome, {user.name}!") # 9

        if output := self.output: # unwrapping the optional
            output.user_provider_created_user(self, user) # 10

        # 11
        if on_user_created := self.on_user_created:
            on_user_created(user)

  1. Interface definition: UserProvider is a protocol that defines the create_user method, which any class adhering to this protocol must implement. It serves as the formal contract for the user-creation functionality.

  2. Observer protocol: UserProviderOutput serves as an observer (or delegate) that is notified when a user is created. This protocol keeps components loosely coupled and supports the event-driven side of the application.

  3. Protocol implementation: UserProviderImpl implements the user-creation logic, but does not need to explicitly declare its conformance to UserProvider thanks to Python's duck typing.

  4. Core dependencies: The constructor accepts UserRepository and MailProvider – both defined as protocols – as parameters. Relying solely on these protocols, UserProviderImpl stays decoupled from specific implementations, illustrating dependency injection: the provider is independent of the underlying details and interacts only through explicit contracts.

  5. Optional output delegate: The constructor accepts an optional UserProviderOutput instance which, if provided, is notified when user creation completes.

  6. Callback function: As an alternative to the output delegate, a callable on_user_created can be passed to handle additional actions after user creation, providing flexibility in responding to events.

  7. Central business logic: The create_user method encapsulates the core business logic for adding a user, demonstrating separation from API processing.

  8. Interacting with the repository: Uses UserRepository to abstract database operations (such as adding a user), ensuring that the provider does not manipulate the database directly.

  9. Extended business logic: Sending email via MailProvider illustrates that a provider's responsibilities may extend beyond simple CRUD operations.

  10. Event notification: If an output delegate is provided, the provider notifies it of the user-creation event, using the observer pattern for a modular response to events.

  11. Executing a callback: Optionally invokes the callback function, providing a simple way to extend functionality without complex class hierarchies or dependencies.
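
The UserRepository and MailProvider dependencies referenced above are defined elsewhere in the repository; hypothetical minimal versions consistent with how they are used in this article might look like:

# hypothetical sketches – see the repository for the real implementations
from entities.user import UserCreate

class UserRepository:
    def __init__(self, session):
        self.session = session

    def add_user(self, user: UserCreate):
        print(f"Persisting {user.name} via {self.session}")  # real code would use the ORM session

class MailProvider:
    def __init__(self, token: str):
        self.token = token

    def send_mail(self, to: str, body: str):
        print(f"Sending mail to {to}: {body}")  # real code would call a mail API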

FastAPI Dependencies

Okay, but how do we instantiate the provider and inject it? Let's look at the injection code, implemented using the FastAPI DI engine:

# services/api_service/api/dependencies/providers.py
from typing import Annotated
from fastapi import Request, Depends
from repository import UserRepository
from providers.user_provider import UserProvider, UserProviderImpl
from providers.mail_provider import MailProvider
from coordinator import Coordinator
from .database import get_session, Session
import config


def _get_coordinator(request: Request) -> Coordinator:
    # private helper function
    # NOTE: You can pass the DIContainer in the same way
    return request.app.state.coordinator

def user_provider(
    session: Annotated[Session, Depends(get_session)], # 1
    coordinator: Annotated[Coordinator, Depends(_get_coordinator)] # 2
) -> UserProvider: # 3
    # UserProvider's lifecycle is bound to short endpoint's lifecycle, so it's safe to use strong references here
    return UserProviderImpl( # 4
        repository=UserRepository(session), # 5
        mail_provider=MailProvider(config.mail_token), # 6
        output=coordinator, # 7
        on_user_created=coordinator.on_user_created # 8
        # on_user_created=lambda user: coordinator.on_user_created(user) # wrap in a lambda if the method's signature is not compatible
    )

  1. Obtain a database session through FastAPI's dependency-injection system, ensuring that every request gets a fresh session.

  2. Retrieve the Coordinator instance from the application state; it is responsible for broader application-level tasks and acts as an event manager.

  3. Note: the function returns the protocol, not the concrete implementation.

  4. Construct a UserProviderImpl instance by injecting all the necessary dependencies – a practical application of dependency injection for assembling complex objects.

  5. Initialize UserRepository with the session received from the FastAPI DI system. The repository handles all data-persistence operations, abstracting database interaction away from the provider.

  6. Configure MailProvider using a configuration token.

  7. Inject Coordinator as the output delegate. It is assumed that Coordinator implements the UserProviderOutput protocol, allowing it to receive notifications when a user is created.

  8. Assign a Coordinator method as a callback to be executed when the user is created. This allows additional operations or notifications to be triggered as a side effect of the user-creation process.

This structured approach ensures that UserProvider is equipped with all the tools it needs to perform its tasks effectively, while adhering to the principles of loose coupling and high cohesion.
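
The get_session dependency imported above is not shown in the article; here is a minimal sketch assuming SQLAlchemy (the real repository may differ):

# services/api_service/api/dependencies/database.py – a hypothetical sketch
from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker
import config

engine = create_engine(config.database_url)  # config.database_url is an assumption
SessionLocal = sessionmaker(bind=engine)

def get_session():
    session = SessionLocal()
    try:
        yield session  # FastAPI finalizes generator dependencies after the response
    finally:
        session.close()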

Coordinator

The Coordinator class acts as the main orchestrator of the application: it manages services, interactions, and events, establishes the initial state, and injects dependencies. Here is a detailed description of its roles and functionality based on the code below:

# coordinator.py

from threading import Thread
import weakref
import uvicorn
import config
from services.api_service import get_app as get_fastapi_app
from entities.user import UserCreate
from repository.user_repository import UserRepository
from providers.mail_provider import MailProvider
from providers.user_provider import UserProvider, UserProviderImpl
from services.report_service import ReportService
from services.telegram_service import TelegramService


class Coordinator:

    def __init__(self):
        self.users_count = 0 # 1

        self.telegram_service = TelegramService( # 2
            token=config.telegram_token,
            get_user_provider=lambda session: UserProviderImpl(
                repository=UserRepository(session),
                mail_provider=MailProvider(config.mail_token),
                output=self,
                on_user_created=self.on_user_created
            )
        )

        self.report_service = ReportService(
            get_users_count = lambda: self.users_count # 3
        )

    # Coordinator's Interface

    def setup_initial_state(self):
        fastapi_app = get_fastapi_app()

        fastapi_app.state.coordinator = self # 4

        # 5
        fastapi_thread = Thread(target=lambda: uvicorn.run(fastapi_app))
        fastapi_thread.start()

        # 6
        self.report_service.start()
        self.telegram_service.start()

    # UserProviderOutput Protocol Implementation

    def user_provider_created_user(self, provider: UserProvider, user: UserCreate):
        self.on_user_created(user)

    # Event handlers

    def on_user_created(self, user: UserCreate):
        print("User created: ", user)
        self.users_count += 1

        # 7 (note: the larger threshold must be checked first)
        if self.users_count >= 10_000_000:
            self.report_service.stop() # 8
        elif self.users_count >= 10_000:
            self.report_service.interval_seconds *= 10

  1. Some state may be shared across providers, services, layers, and the entire application.

  2. Building implementations and injecting dependencies.

  3. Be aware of circular references, deadlocks, and memory leaks here; see the full code for details.

  4. Pass the coordinator instance into the FastAPI application state so that endpoints can access it through the FastAPI DI system.

  5. Run the FastAPI application in a separate thread.

  6. These services already run in their own threads internally, so they are simply started.

  7. Some cross-service logic, just as an example.

  8. An example of managing services from the coordinator.

This orchestrator centralizes control and communication between the various components, increasing the application's manageability and scalability. It coordinates actions between services, ensuring that the application responds appropriately to state changes and user interactions. This design pattern is important for maintaining a clean separation of concerns and providing more robust and flexible application behavior.

DI Containers

However, in large-scale applications, wiring dependencies manually can produce a significant amount of boilerplate. That's where a DI container comes to the rescue. DI containers (dependency injection containers) are tools for managing dependencies in an application: they serve as a central place where objects and their dependencies are registered and managed. When an object requires a dependency, the container handles the instantiation and provisioning of that dependency automatically, ensuring that objects receive all the components they need to function.

This approach promotes loose coupling and improves the testability and overall maintainability of the codebase by abstracting dependency-management logic away from the business logic. DI containers simplify development by automating and centralizing the configuration of component dependencies.

There are many Python libraries providing different implementations of a DI container. I have looked through almost all of them and noted the best ones, in my opinion:

  • python-dependency-injector – automated, class based, has different lifecycle options such as Singleton or Factory

  • lagom – dictionary interface with automatic resolution

  • dishka – good scope control through the context manager

  • that-depends – support for context managers (objects must be closed at the end), built-in FastAPI integration

  • punq – a more classic approach with register and resolve methods

  • rodi – classic, simple, automatic
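
As a taste of the first option, here is a minimal sketch using python-dependency-injector (verify against its current documentation; the classes are illustrative):

from dependency_injector import containers, providers

class Engine:
    def __init__(self, url: str):
        self.url = url

class Service:
    def __init__(self, engine: Engine):
        self.engine = engine

class Container(containers.DeclarativeContainer):
    config = providers.Configuration()
    engine = providers.Singleton(Engine, url=config.db_url)  # one shared instance
    service = providers.Factory(Service, engine=engine)      # new instance per resolution

container = Container()
container.config.from_dict({'db_url': 'sqlite:///app.db'})
service = container.service()  # the container wires dependencies automatically
print(service.engine.url)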

main.py

Finally, update the file main.py:

# main.py
from coordinator import Coordinator


def run(): # entry point, no logic here, only run the coordinator
    coordinator = Coordinator()
    coordinator.setup_initial_state()

if __name__ == '__main__':
    run()

Conclusion

To get a complete understanding of the architectural and implementation strategies discussed, it is useful to review all the files in the repository. Despite the limited amount of code, each file contains meaningful comments and additional details that deepen understanding of the application's structure and functionality. Studying them will improve your familiarity with the system and prepare you to adapt or extend the application effectively.

This approach is universal for various Python applications. It is effective for stateless backend servers, such as those built with FastAPI, but its benefits are especially pronounced in frameworkless and stateful applications. This includes desktop applications (both GUI and command line) as well as systems that control physical devices such as IoT devices, robotics, drones, and other hardware-centric technologies.

Also, I recommend reading the book Clean Code by Robert Martin. You can find a summary of its main conclusions online.


The approach shown has been tested in practice: it is used in the main programs of the hub and speaker of the MajorDom smart home system, about which I periodically write on Telegram.

