🐍 Python Backend Project Advanced Setup (FastAPI Example)

Dmytro Parfeniuk
Python in Plain English
11 min read · Jul 23, 2023

🔗 🌶️ FastAPI & Pydantic 2.4 & SQLAlchemy 2.0 & More

👋 Hola! If you are here, you probably know something about Python, and about Python web frameworks in particular. One thing really annoys me when using Django, for example: the project structure it imposes.

You could ask why that is a problem, right? After all, you just follow the official documentation and end up with code that everybody who has read that documentation understands.

But once you start writing “better” applications, you run into well-known design approaches such as DDD and its layered architecture, and some time later you complicate the system even more with CQRS. Personally, I found it harder to maintain a code base that follows all those principles when the framework is the CENTRAL part of the whole application. You can’t even get away from it if you later decide to change the framework…

✅ In this article I will try to raise the issue and then solve it.

🤚 Disclaimer: let’s limit the scope to a backend API project for an online marketplace.

🐛 Issues

The code and project configuration files are not divided

In some projects (especially Django ones) you might see that the “applications”, or let’s say the main components, are placed right in the project root.

└─ backend/
    ├─ .gitignore
    ├─ .env.default
    ├─ .env
    ├─ .alembic.ini
    ├─ Pipfile
    ├─ Pipfile.lock
    ├─ pyproject.toml
    ├─ README.md
    ├─ config/
    ├─ users/
    ├─ authentication/
    ├─ products/
    ├─ shipping/
    ├─ http/
    ├─ mock/
    ├─ seed/
    └─ static/

Well, that’s fine as long as there are not 100 folders inside backend/, but the problem here is readability. Code is read far more often than it is written, so it is better to keep the parts separate from each other. Let’s transform the example above:

└─ backend/
    ├─ .gitignore
    ├─ .env.default
    ├─ .env
    ├─ .alembic.ini
    ├─ Pipfile
    ├─ Pipfile.lock
    ├─ pyproject.toml
    ├─ README.md
    ├─ http/
    ├─ mock/
    ├─ seed/
    ├─ static/
    └─ src/
        ├─ config/
        ├─ users/
        ├─ authentication/
        ├─ products/
        └─ shipping/

🎉 Much better! Now a developer can see that the src/ folder contains the source code, so the structure is grouped better.

🧱 Logical component

What is inside each folder within the src/ folder? Usually, we have something like data models, services, constants, …

But here is the problem with that approach. The authentication/ folder does not have the User data model, so it depends on the users/ folder. The shipping/ folder behaves in exactly the same way: it depends on products/.

Then you might create a component dependency tree so that developers understand which component depends on which.
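A dependency tree like this can also live next to the code instead of in a diagram. The sketch below is only an illustration: the `DEPENDENCIES` mapping and the `build_order` helper are hypothetical names, and the edges are just the two mentioned above (authentication depends on users, shipping depends on products).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical map: component -> components it depends on.
DEPENDENCIES: dict[str, set[str]] = {
    "users": set(),
    "products": set(),
    "authentication": {"users"},
    "shipping": {"products"},
}


def build_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the components so that every dependency comes first.

    Raises ValueError if components depend on each other in a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as error:
        # The second element of args holds the detected cycle.
        raise ValueError(f"Circular dependency: {error.args[1]}") from error


order = build_order(DEPENDENCIES)
print(order)
```

A check like this could run in CI, so a circular dependency between components fails fast instead of hiding in an outdated diagram.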

The main difficulty here is maintaining this code base 😫. Everybody has to keep that diagram in mind and update it with each new component. Another problem: each component has a different structure, which makes the project’s file layout inconsistent. The authentication component has no database table of its own if it is JWT-based, and it depends on the users component, which represents the data model and the database interaction.

💁‍♂️ On the other hand, the shipping component gathers all of its logic itself. It depends on the order information, which in turn depends on the product and the user information. Maintaining that code might get a little tricky after a while.

Why? Because let’s say the client now wants a new feature that involves two or more components. Say we have to create a page that gives admins analytics per product. First, given the product id, we have to return the number of current orders; second, the number of “in progress” deliveries. Where should we place these controllers and the business logic? In orders, in shipping, or in products?

Well, it would be better to create a separate module that works with all of them together, and to update the diagram above. This component does not have its own data models. It just works with other components in the system.

└─ backend/
    ├─ .gitignore
    ├─ ...
    └─ src/
        ├─ config/
        ├─ orders/
        ├─ products/
        ├─ shipping/
        └─ analytics/  # new component

🚪 Then, our controllers would be:

1. HTTP GET /analytics/products/<id>/orders

2. HTTP GET /analytics/products/<id>/shipping?status=inProgress
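Here is a minimal sketch of the logic behind these two endpoints, assuming in-memory lists in place of the real orders and shipping components. All names in it (`Order`, `Delivery`, `count_orders`, `count_deliveries`) are hypothetical. The only point is that analytics owns no data model and just reads from the other components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    id: int
    product_id: int


@dataclass(frozen=True)
class Delivery:
    id: int
    order_id: int
    status: str  # e.g. "inProgress" or "delivered"


# Stand-ins for data owned by the orders and shipping components.
ORDERS = [Order(1, product_id=10), Order(2, product_id=10), Order(3, product_id=20)]
DELIVERIES = [
    Delivery(1, order_id=1, status="inProgress"),
    Delivery(2, order_id=2, status="delivered"),
    Delivery(3, order_id=3, status="inProgress"),
]


def count_orders(product_id: int) -> int:
    """Logic for GET /analytics/products/<id>/orders"""
    return sum(1 for order in ORDERS if order.product_id == product_id)


def count_deliveries(product_id: int, status: str) -> int:
    """Logic for GET /analytics/products/<id>/shipping?status=<status>"""
    order_ids = {order.id for order in ORDERS if order.product_id == product_id}
    return sum(
        1
        for delivery in DELIVERIES
        if delivery.order_id in order_ids and delivery.status == status
    )
```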

✅ And actually, it solves all our issues. There is only one catch: the architecture is not transparent from the structure. We still have to maintain a scheme that tells us about the dependencies.

🏗️ So using the layered architecture described by Eric Evans (DDD) would be a great idea here. It tells us to separate logical components into a few layers:

1. The “presentation” layer corresponds to the API gateway of the application.

2. The “application/operation” layer holds the complex business operations. It delegates the work to smaller components.

3. The “domain” layer holds the individual business logic units.

4. The “infrastructure” layer encapsulates the code that is used for building all the components above (all installed libraries are part of the infrastructure layer).

For example, users, products, and orders each represent their own data models and standalone services, and implement database interaction. It is recommended to place these sources in the “domain” layer.

💁‍♂️ On the other hand, placing an order is a user’s operation (which depends on the product and the user), so it should be placed in the “application” layer.

🗃️ Database tables are usually used by all components in the system, and we cannot guarantee that the users subdomain won’t access the orders table at some point in the future. It means that database tables should be placed in the infrastructure layer.

Then we have something like this:

└─ backend/
    ├─ .gitignore
    ├─ .env.default
    ├─ .env
    ├─ .alembic.ini
    ├─ Pipfile
    ├─ Pipfile.lock
    ├─ pyproject.toml
    ├─ README.md
    ├─ http/
    ├─ mock/
    ├─ seed/
    ├─ static/
    └─ src/
        ├─ main.py  # application entrypoint
        ├─ config/  # application configuration
        ├─ presentation/
        │   ├─ rest/
        │   │   ├─ orders/
        │   │   ├─ shipping/
        │   │   └─ analytics/
        │   └─ graphql/
        ├─ application/
        │   ├─ authentication/
        │   ├─ orders/
        │   └─ analytics/
        ├─ domain/
        │   ├─ authentication/
        │   ├─ users/
        │   ├─ orders/
        │   ├─ shipping/
        │   └─ products/
        └─ infrastructure/
            ├─ database/
            │   └─ migrations/
            ├─ errors/
            └─ application/
                └─ factory.py

Basically, the idea is as follows: the API is represented in the presentation layer. It calls the application layer if the logic is complex, or the domain layer directly if not. The infrastructure layer, in turn, includes the factory for creating the application.

👉 This structure fits most needs and it scales very easily.

Frameworks make you write 💩

All frameworks have documentation that describes the features they provide, and for simplicity they also use PoC examples, don’t they? To explain a feature better, the framework feature becomes the central idea of the code that demonstrates it.

So how can we take a framework and use it as infrastructure for writing the application, instead of making it the central brain of the whole application?

🔨 Real-world example

👉 The whole code is available on 🔗 GitHub

Disclaimer: I am not going to create an MVP that makes any sense. Some complicated components, like shipping, are skipped.

Disclaimer: The models.py files gather the following types of components: entities, value objects, and aggregates.

First, let’s create a minimal project setup with the following technologies:

Programming language:

  • Python

Running tools:

  • Gunicorn: WSGI server
  • Uvicorn: ASGI server

Additional tools:

  • FastAPI: web framework
  • Pydantic: data models and validation
  • SQLAlchemy: ORM
  • Alembic: database migration tools
  • Loguru: logging engine

Code quality tools:

  • pytest, hypothesis, coverage
  • ruff, mypy
  • black, isort

After completing the configuration files, let’s start by adding the application entrypoint. This file is usually called main.py or run.py; I prefer to stay with main.py.

from fastapi import FastAPI
from loguru import logger

from src.config import settings
from src.infrastructure import application
from src.presentation import rest

# Adjust the logging
# -------------------------------
logger.add(
    "".join(
        [
            str(settings.root_dir),
            "/logs/",
            settings.logging.file.lower(),
            ".log",
        ]
    ),
    format=settings.logging.format,
    rotation=settings.logging.rotation,
    compression=settings.logging.compression,
    level="INFO",
)


# Adjust the application
# -------------------------------
app: FastAPI = application.create(
    debug=settings.debug,
    rest_routers=(rest.products.router, rest.orders.router),
    startup_tasks=[],
    shutdown_tasks=[],
)

As you can see, we just import the main user-oriented components into the entrypoint file to build the application. It helps to keep things transparent for the developer who maintains the software.

First of all, we adjust the logging, and then we build the application using the factory.
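The settings object itself is not shown in this article, so here is a rough sketch of the shape that the entrypoint expects. This is an assumption inferred only from the attributes main.py reads (`debug`, `root_dir`, `logging.file`, `logging.format`, `logging.rotation`, `logging.compression`). The class names and all values are made up, and the real project may well build its settings differently.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class LoggingSettings:
    # Example values only; the real defaults are not shown in the article.
    file: str = "Backend"
    format: str = "{time} | {level} | {message}"
    rotation: str = "10 MB"
    compression: str = "zip"


@dataclass(frozen=True)
class Settings:
    debug: bool = True
    root_dir: PurePosixPath = PurePosixPath("/app")
    logging: LoggingSettings = field(default_factory=LoggingSettings)


settings = Settings()

# The same path expression that main.py passes to logger.add().
log_path = "".join(
    [str(settings.root_dir), "/logs/", settings.logging.file.lower(), ".log"]
)
print(log_path)
```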

</> Let’s dig into the application factory:

import asyncio
from functools import partial
from typing import Callable, Coroutine, Iterable

from fastapi import APIRouter, FastAPI
from fastapi.exceptions import RequestValidationError
from pydantic import ValidationError

from src.infrastructure.errors import (
    BaseError,
    custom_base_errors_handler,
    pydantic_validation_errors_handler,
    python_base_error_handler,
)

__all__ = ("create",)


def create(
    *_,
    rest_routers: Iterable[APIRouter],
    startup_tasks: Iterable[Callable[[], Coroutine]] | None = None,
    shutdown_tasks: Iterable[Callable[[], Coroutine]] | None = None,
    **kwargs,
) -> FastAPI:
    """The application factory using FastAPI framework.
    🎉 Only passing routes is mandatory to start.
    """

    # Initialize the base FastAPI application
    app = FastAPI(**kwargs)

    # Include REST API routers
    for router in rest_routers:
        app.include_router(router)

    # Extend FastAPI default error handlers
    app.exception_handler(RequestValidationError)(
        pydantic_validation_errors_handler
    )
    app.exception_handler(BaseError)(custom_base_errors_handler)
    app.exception_handler(ValidationError)(pydantic_validation_errors_handler)
    app.exception_handler(Exception)(python_base_error_handler)

    # Define startup tasks that are running asynchronous using FastAPI hook
    if startup_tasks:
        for task in startup_tasks:
            coro = partial(asyncio.create_task, task())
            app.on_event("startup")(coro)

    # Define shutdown tasks using FastAPI hook
    if shutdown_tasks:
        for task in shutdown_tasks:
            app.on_event("shutdown")(task)

    return app

🗣️ Discussion:

Using a factory like this helps developers avoid mistakes. You only see a few parameters that you have to fill in: routers, startup tasks, and shutdown tasks, which makes the whole code base easier to understand. You read this code like this: ‘Okay, I am creating the application and passing the routers and tasks as arguments. I assume that if I remove a router or a task from here, it will not be used anymore…’ and you are right! Remember that code is read more often than written.

🗃️ Repository pattern

Usually, the repository pattern is implemented by the ORM used in the project (currently SQLAlchemy). It implements the data mapper for database access.

But when you start using the class that represents the data mapper throughout the whole application, it becomes way harder to migrate to another ORM in the future or, let’s say, to replace the ORM with your own data mapper.

Creating a simple abstraction layer on top of it may help you a lot. Let’s have a look at src/infrastructure/database/repository.py

from typing import Any, AsyncGenerator, Generic, Type

from sqlalchemy import Result, asc, delete, desc, func, select, update

from src.infrastructure.database.session import Session
from src.infrastructure.database.tables import ConcreteTable
from src.infrastructure.errors import (
    DatabaseError,
    NotFoundError,
    UnprocessableError,
)

__all__ = ("BaseRepository",)


# Mypy error: https://github.com/python/mypy/issues/13755
class BaseRepository(Session, Generic[ConcreteTable]):  # type: ignore
    """This class implements the base interface for working with database
    and makes it easier to work with type annotations.
    """

    schema_class: Type[ConcreteTable]

    def __init__(self) -> None:
        super().__init__()

        # getattr() is used because a bare annotation does not create
        # the attribute, so a plain access would raise AttributeError
        if not getattr(self, "schema_class", None):
            raise UnprocessableError(
                message=(
                    "Can not initiate the class without schema_class attribute"
                )
            )

    async def _get(self, key: str, value: Any) -> ConcreteTable:
        """Return only one result by filters"""

        query = select(self.schema_class).where(
            getattr(self.schema_class, key) == value
        )
        result: Result = await self.execute(query)

        if not (_result := result.scalars().one_or_none()):
            raise NotFoundError

        return _result

    async def count(self) -> int:
        result: Result = await self.execute(func.count(self.schema_class.id))
        value = result.scalar()

        if not isinstance(value, int):
            raise UnprocessableError(
                message=(
                    "For some reason count function returned not an integer. "
                    f"Value: {value}"
                ),
            )

        return value

    async def _save(self, payload: dict[str, Any]) -> ConcreteTable:
        try:
            schema = self.schema_class(**payload)
            self._session.add(schema)
            await self._session.flush()
            await self._session.refresh(schema)
            return schema
        except self._ERRORS:
            raise DatabaseError

    async def _all(self) -> AsyncGenerator[ConcreteTable, None]:
        result: Result = await self.execute(select(self.schema_class))
        schemas = result.scalars().all()

        for schema in schemas:
            yield schema

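It depends on src/infrastructure/database/session.py

from sqlalchemy import Result
from sqlalchemy.exc import IntegrityError, PendingRollbackError
from sqlalchemy.ext.asyncio import AsyncSession

from src.infrastructure.errors import DatabaseError

# NOTE: CTX_SESSION is the ContextVar that holds the current AsyncSession.
#       It is importable from this module; its definition is omitted here.


class Session:
    # All sqlalchemy errors that can be raised
    _ERRORS = (IntegrityError, PendingRollbackError)

    def __init__(self) -> None:
        self._session: AsyncSession = CTX_SESSION.get()

    async def execute(self, query) -> Result:
        try:
            result = await self._session.execute(query)
            return result
        except self._ERRORS:
            raise DatabaseError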
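Why read the session from a context variable instead of passing it around? A small standard-library sketch may help. Here `CTX_SESSION` is a hypothetical stand-in that stores a plain string instead of an AsyncSession, and `handle_request` is a made-up coroutine. It shows that two concurrently running tasks each see only the value they stored themselves.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical stand-in for CTX_SESSION; a string plays the role of a session.
CTX_SESSION: ContextVar[str] = ContextVar("session")


class Session:
    def __init__(self) -> None:
        # Reads whatever the current task stored, as in the class above.
        self.session = CTX_SESSION.get()


async def handle_request(name: str) -> str:
    CTX_SESSION.set(f"session-for-{name}")
    await asyncio.sleep(0)  # give the other task a chance to run
    return Session().session


async def main() -> list[str]:
    # gather() wraps each coroutine in a task with its own copy of the context.
    return await asyncio.gather(handle_request("a"), handle_request("b"))


results = asyncio.run(main())
print(results)  # ['session-for-a', 'session-for-b']
```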

🗣️ Discussion

First of all the BaseRepository could be named BaseCRUD (create/read/update/delete), or BaseDAL (data access layer). It does not matter that much.

1. It implements the interface for creating concrete classes that represent access to the database layer for the concrete table.

2. This class provides pretty good manipulation of generic types in the project.

A small example from src/domain/products/repository.py

class ProductRepository(BaseRepository[ProductsTable]):
    schema_class = ProductsTable

    async def all(self) -> AsyncGenerator[Product, None]:
        async for instance in self._all():
            yield Product.from_orm(instance)

    async def get(self, id_: int) -> Product:
        instance = await self._get(key="id", value=id_)
        return Product.from_orm(instance)

    async def create(self, schema: ProductUncommited) -> Product:
        instance: ProductsTable = await self._save(schema.dict())
        return Product.from_orm(instance)

Note that schema_class is what BaseRepository uses behind the scenes, since the concrete table class cannot be obtained directly from the generic type parameter at runtime.

3. The async for operation lets callers consume rows one at a time instead of building their own intermediate structures (lists, tuples, …) that take up a lot of RAM for select queries. Keep in mind that `_all()` as written still loads every row with `.all()` before yielding, so the saving applies to the calling side only.

4. All generic methods start with an underscore for flexibility. The reasoning is that the orders’ `get()` database operation could differ from the products’ `get()`, so it is better to keep them separate. On the other hand, the count method, which returns a primitive, can be shared by all database tables.

⚠️ If you need information that can’t be represented by one table, you can easily create a Session() instance, which gives you the lowest-level database access interface.
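To play with the shape of this pattern without a database, here is a sketch where a list stands in for the table. `InMemoryRepository`, `ProductRow`, and the method bodies are all hypothetical and much simpler than the SQLAlchemy version above. What carries over is the split between shared underscore methods in the base class and the public methods that each concrete repository defines for itself.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncGenerator, Generic, Type, TypeVar

Row = TypeVar("Row")


class InMemoryRepository(Generic[Row]):
    """Hypothetical base class: a list stands in for a database table."""

    schema_class: Type[Row]

    def __init__(self) -> None:
        if getattr(self, "schema_class", None) is None:
            raise TypeError("schema_class must be set on the subclass")
        self._rows: list[Row] = []

    async def _save(self, payload: dict[str, Any]) -> Row:
        row = self.schema_class(**payload)
        self._rows.append(row)
        return row

    async def _all(self) -> AsyncGenerator[Row, None]:
        for row in self._rows:
            yield row

    async def count(self) -> int:
        return len(self._rows)


@dataclass
class ProductRow:
    id: int
    name: str


class ProductRepository(InMemoryRepository[ProductRow]):
    schema_class = ProductRow

    async def create(self, name: str) -> ProductRow:
        return await self._save({"id": await self.count() + 1, "name": name})

    async def all(self) -> AsyncGenerator[str, None]:
        # The public method decides what callers see; here, only the names.
        async for row in self._all():
            yield row.name


async def main() -> list[str]:
    repository = ProductRepository()
    await repository.create("Laptop")
    await repository.create("Phone")
    return [name async for name in repository.all()]


names = asyncio.run(main())
print(names)
```

Because callers only touch `create()` and `all()`, the storage behind the base class can change without touching them.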

✨ ‘Create order’ feature roadmap

Let’s have a look at the ‘create order’ actions pipeline and its dependencies.

And from the code perspective:

The presentation/orders.py file:

from fastapi import APIRouter, Depends, Request, status

from src.application import orders
from src.application.authentication import get_current_user
from src.domain.orders import (
    Order,
    OrderCreateRequestBody,
    OrderPublic,
)
from src.domain.users import User
from src.infrastructure.database.transaction import transaction
from src.infrastructure.models import Response

router = APIRouter(prefix="/orders", tags=["Orders"])


@router.post("", status_code=status.HTTP_201_CREATED)
async def order_create(
    request: Request,
    schema: OrderCreateRequestBody,
    user: User = Depends(get_current_user),
) -> Response[OrderPublic]:
    """Create a new order."""

    # Save the order to the database
    order: Order = await orders.create(payload=schema.dict(), user=user)
    order_public = OrderPublic.from_orm(order)

    return Response[OrderPublic](result=order_public)

The application/orders.py

from src.domain.orders import Order, OrdersRepository, OrderUncommited
from src.domain.users import User
from src.infrastructure.database.transaction import transaction


@transaction
async def create(payload: dict, user: User) -> Order:
    payload.update(user_id=user.id)

    order = await OrdersRepository().create(OrderUncommited(**payload))

    # Do some other stuff...

    return order

And the @transaction decorator implementation

from functools import wraps

from loguru import logger
from sqlalchemy.exc import IntegrityError, PendingRollbackError
from sqlalchemy.ext.asyncio import AsyncSession

from src.infrastructure.database.session import CTX_SESSION, get_session
from src.infrastructure.errors import DatabaseError


def transaction(coro):
    """This decorator should be used with all coroutines
    that want to access the database for saving new data.
    """

    @wraps(coro)
    async def inner(*args, **kwargs):
        session: AsyncSession = get_session()
        CTX_SESSION.set(session)

        try:
            result = await coro(*args, **kwargs)
            await session.commit()
            return result
        except DatabaseError as error:
            # NOTE: If any issues occur in the code, they are handled
            #       on the BaseCRUD level and raised as a DatabaseError.
            #       If the DatabaseError is handled within the
            #       domain/application levels, it is possible that
            #       `await session.commit()` would raise an error.
            logger.error(f"Rolling back changes.\n{error}")
            await session.rollback()
            raise DatabaseError
        except (IntegrityError, PendingRollbackError) as error:
            # NOTE: Since the session commit happens on this level,
            #       its errors have to be handled here as well.
            logger.error(f"Rolling back changes.\n{error}")
            await session.rollback()
            # Re-raise so the caller does not silently receive None
            raise DatabaseError from error
        finally:
            await session.close()

    return inner

🗣️ Discussion

  1. All PublicModels could also be placed in src/presentation/orders/contracts.py. They are kept together in this example for simplicity.
  2. The transaction uses a ContextVar, which is great for tracking state that belongs to the async task that is executing at the moment. 🔗 Python official documentation
  3. The transaction decorator can be applied to any coroutine in the code, which is so special… like Komodo dragons 🦎…
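To see the commit, rollback, and close order without a real database, here is a simplified sketch. `FakeSession` is a hypothetical class that only records the calls made on it, and this version of the decorator catches any exception, which is a simplification compared to the one in the article.

```python
import asyncio
from contextvars import ContextVar
from functools import wraps


class FakeSession:
    """Hypothetical session that only records what was called on it."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def commit(self) -> None:
        self.calls.append("commit")

    async def rollback(self) -> None:
        self.calls.append("rollback")

    async def close(self) -> None:
        self.calls.append("close")


CTX_SESSION: ContextVar[FakeSession] = ContextVar("session")


def transaction(coro):
    @wraps(coro)
    async def inner(*args, **kwargs):
        session = FakeSession()
        CTX_SESSION.set(session)
        try:
            result = await coro(*args, **kwargs)
            await session.commit()
            return result
        except Exception:
            # Simplification: roll back on any error, then re-raise it.
            await session.rollback()
            raise
        finally:
            await session.close()

    return inner


@transaction
async def create_order(valid: bool) -> FakeSession:
    session = CTX_SESSION.get()  # the same object the decorator created
    if not valid:
        raise ValueError("invalid order")
    return session


async def main() -> tuple[list[str], list[str]]:
    ok_calls = (await create_order(True)).calls
    failed_calls: list[str] = []
    try:
        await create_order(False)
    except ValueError:
        failed_calls = CTX_SESSION.get().calls
    return ok_calls, failed_calls


ok_calls, failed_calls = asyncio.run(main())
print(ok_calls, failed_calls)
```

The decorated coroutine never receives the session as an argument. It picks it up from the context variable, which is how the repositories get their session too.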

This structure comes from personal experience, and I do not claim the “best project structure” award 😁.

🤔 But remember that following at least some architectural style is better than not following any.
