Python Formatter-as-a-Service with FastAPI & Black | Web Formatter Blog

Python Formatter-as-a-Service with FastAPI & Black
Learn how to build a robust Python code formatting API using FastAPI and Black formatter.
Introduction
Code formatting is an essential aspect of software development that ensures consistency, readability, and maintainability. In the Python ecosystem, Black has emerged as a popular "uncompromising code formatter" that automatically formats code according to a consistent style, eliminating debates about formatting preferences.
In this comprehensive guide, we'll explore how to build a Python code formatting API using FastAPI and Black. This API will allow developers to format Python code programmatically through HTTP requests, creating a "formatter-as-a-service" that can be integrated into various development workflows, IDEs, or CI/CD pipelines.
Why Build a Formatting API?
You might wonder why you'd need a formatting API when developers can simply install Black locally. Here are several compelling reasons:
- Consistent formatting across teams: Ensure all team members use identical formatting settings
- CI/CD integration: Format code automatically during continuous integration processes
- Web-based code editors: Provide formatting capabilities in browser-based IDEs
- Cross-platform compatibility: Avoid installation issues across different operating systems
- Version pinning: Keep formatting consistent by controlling which Black version every client uses
- Resource constraints: Format code on resource-limited devices by offloading processing
- Centralized configuration: Manage formatting rules from a single source of truth
By building a formatting API, you create a centralized service that can be leveraged across your entire development ecosystem.
Technologies Overview
FastAPI Overview
FastAPI is a modern, high-performance web framework for building APIs with Python. It's built on top of Starlette for the web parts and Pydantic for data validation. Key features that make FastAPI ideal for our formatting service include:
- Speed: Among the faster Python frameworks; the FastAPI documentation describes its performance as on par with NodeJS and Go
- Automatic API documentation: Built-in Swagger UI and ReDoc support
- Data validation: Automatic request validation using Python type hints
- Asynchronous support: Native async/await syntax for high concurrency
- Standards-based: Based on open standards like OpenAPI and JSON Schema
Black Formatter Overview
Black is a Python code formatter that reformats entire files in a deterministic way, adhering to PEP 8 guidelines with a few exceptions. Key characteristics include:
- Uncompromising: Limited configuration options to prevent formatting debates
- Deterministic: Always produces the same output for the same input
- Fast: Optimized for performance, making it suitable for large codebases
- Widely adopted: Used by many prominent Python projects and companies
- Preserves AST: By default, checks that the formatted code produces an abstract syntax tree equivalent to the original
By combining FastAPI's performance with Black's formatting capabilities, we can create a robust and efficient formatting service.
Project Setup
Environment Setup
Let's start by setting up a virtual environment for our project:
# Create a new directory for the project
mkdir python-formatter-api
cd python-formatter-api
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Installing Dependencies
Next, let's install the required packages:
pip install fastapi uvicorn black pydantic python-multipart
Here's what each package does:
- fastapi: The web framework for building our API
- uvicorn: ASGI server for running our FastAPI application
- black: The Python code formatter we'll integrate
- pydantic: Data validation library (included with FastAPI but listed for clarity)
- python-multipart: For handling form data in requests
Let's create a requirements.txt file to track our dependencies:
pip freeze > requirements.txt
Project Structure
Let's organize our project with the following structure:
python-formatter-api/
├── app/
│ ├── __init__.py
│ ├── main.py # FastAPI application entry point
│ ├── models.py # Pydantic models for request/response
│ ├── formatter.py # Black integration logic
│ └── utils.py # Utility functions
├── tests/
│ ├── __init__.py
│ ├── test_api.py # API tests
│ └── test_formatter.py # Formatter tests
├── .env # Environment variables
├── Dockerfile # For containerization
├── docker-compose.yml # For local development with Docker
└── requirements.txt # Project dependencies
Let's create these directories and files:
mkdir -p app tests
touch app/__init__.py app/main.py app/models.py app/formatter.py app/utils.py
touch tests/__init__.py tests/test_api.py tests/test_formatter.py
touch .env Dockerfile docker-compose.yml
Building the Basic API
Creating the FastAPI App
Let's start by creating our FastAPI application in app/main.py:
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware

from app.models import FormatRequest, FormatResponse
from app.formatter import format_python_code

# Create FastAPI app
app = FastAPI(
    title="Python Code Formatter API",
    description="API for formatting Python code using Black",
    version="1.0.0",
)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, replace with specific origins
    allow_credentials=False,  # Wildcard origins cannot be combined with credentials
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/")
async def root():
    """Root endpoint returning API information."""
    return {
        "message": "Python Code Formatter API",
        "docs": "/docs",
        "version": "1.0.0",
    }

# We'll add our formatting endpoint next
This sets up our FastAPI application with a title, description, and version. We've also added CORS middleware to allow cross-origin requests, which is important if you want to call this API from a web application hosted on a different domain.
Implementing the Formatting Endpoint
Now, let's define our Pydantic models in app/models.py for request and response validation:
from pydantic import BaseModel, Field
from typing import Optional, Dict, Any

class FormatRequest(BaseModel):
    """Request model for code formatting."""
    code: str = Field(..., description="Python code to format")
    line_length: Optional[int] = Field(88, description="Maximum line length")
    fast: Optional[bool] = Field(
        False, description="If True, skip Black's post-format safety checks"
    )
    skip_string_normalization: Optional[bool] = Field(
        False, description="If True, skip string normalization"
    )

    # Pydantic v1 style; on Pydantic v2 use
    # model_config = {"json_schema_extra": {...}} instead.
    class Config:
        schema_extra = {
            "example": {
                "code": "def hello_world():\n    print('Hello, world!')",
                "line_length": 88,
                "fast": False,
                "skip_string_normalization": False,
            }
        }

class FormatResponse(BaseModel):
    """Response model for code formatting."""
    formatted_code: str = Field(..., description="Formatted Python code")
    error: Optional[str] = Field(None, description="Error message if formatting failed")
    statistics: Optional[Dict[str, Any]] = Field(
        None, description="Formatting statistics"
    )

    class Config:
        schema_extra = {
            "example": {
                "formatted_code": "def hello_world():\n    print(\"Hello, world!\")",
                "error": None,
                "statistics": {
                    "original_lines": 2,
                    "formatted_lines": 2,
                    "changed": True
                }
            }
        }
Now, let's add the formatting endpoint to app/main.py:
@app.post("/format", response_model=FormatResponse)
async def format_code(request: FormatRequest):
    """Format Python code using Black."""
    try:
        result = format_python_code(
            request.code,
            line_length=request.line_length,
            fast=request.fast,
            skip_string_normalization=request.skip_string_normalization,
        )
        return result
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
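To make the request shape concrete, here is a minimal client sketch that uses only the standard library. The address http://localhost:8000 is an assumption (uvicorn's usual local default), and build_format_request and format_remote are hypothetical helper names introduced for illustration, not part of the API itself.

```python
import json
import urllib.request

# Assumed local development address; adjust to wherever the API is served.
API_URL = "http://localhost:8000/format"


def build_format_request(
    code: str, line_length: int = 88, url: str = API_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /format endpoint."""
    payload = {
        "code": code,
        "line_length": line_length,
        "fast": False,
        "skip_string_normalization": False,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_remote(code: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body of the response."""
    request = build_format_request(code)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, format_remote("x=1") should return a dictionary with the formatted_code, error, and statistics keys defined by FormatResponse.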
Request Validation
One of FastAPI's strengths is automatic request validation using Pydantic models. Our FormatRequest model ensures:
- The code field is required and must be a string
- Optional parameters have sensible defaults
- All fields have descriptions for the auto-generated documentation
If a request doesn't match our model, FastAPI will automatically return a 422 Unprocessable Entity error with details about the validation failure.
Integrating Black
Creating the Formatting Function
Let's implement the format_python_code function in app/formatter.py:
import black
from black.report import NothingChanged

from app.models import FormatResponse

def format_python_code(
    code: str,
    line_length: int = 88,
    fast: bool = False,
    skip_string_normalization: bool = False,
) -> FormatResponse:
    """
    Format Python code using Black.

    Args:
        code: The Python code to format
        line_length: Maximum line length
        fast: If True, skip Black's post-format safety checks
        skip_string_normalization: If True, skip string normalization

    Returns:
        FormatResponse object with formatted code and statistics
    """
    # Count original lines
    original_lines = len(code.splitlines())
    try:
        # Configure Black's formatting mode
        mode = black.Mode(
            line_length=line_length,
            string_normalization=not skip_string_normalization,
            is_pyi=False,
            magic_trailing_comma=True,
            preview=False,
            target_versions={black.TargetVersion.PY38},
        )
        # Format the code. Unlike format_str, format_file_contents honors
        # `fast` and raises NothingChanged when the code is already formatted.
        formatted_code = black.format_file_contents(code, fast=fast, mode=mode)
        # Count formatted lines
        formatted_lines = len(formatted_code.splitlines())
        # Prepare statistics
        statistics = {
            "original_lines": original_lines,
            "formatted_lines": formatted_lines,
            "changed": formatted_code != code,
        }
        return FormatResponse(
            formatted_code=formatted_code,
            statistics=statistics,
        )
    except NothingChanged:
        # Code was already well-formatted
        return FormatResponse(
            formatted_code=code,
            statistics={
                "original_lines": original_lines,
                "formatted_lines": original_lines,
                "changed": False,
            },
        )
    except Exception as e:
        # Handle formatting errors (for example, invalid Python syntax)
        return FormatResponse(
            formatted_code=code,
            error=f"Formatting error: {str(e)}",
        )
This function:
- Takes Python code and Black configuration options as input
- Configures Black's formatting mode based on the provided options
- Attempts to format the code using Black's Python API
- Handles the case where the code is already properly formatted
- Captures any formatting errors and returns them in the response
- Includes statistics about the formatting operation
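The statistics above only report line counts. As one possible extension, the sketch below derives added and removed line counts plus a unified diff using the standard library's difflib. The helper name diff_statistics and the returned keys are assumptions made for illustration; they are not fields of the FormatResponse model defined earlier.

```python
import difflib


def diff_statistics(original: str, formatted: str) -> dict:
    """Summarise the difference between original and formatted source."""
    diff = list(
        difflib.unified_diff(
            original.splitlines(),
            formatted.splitlines(),
            fromfile="original",
            tofile="formatted",
            lineterm="",
        )
    )
    # The first two diff lines are the "---"/"+++" file headers, so skip
    # them before counting added and removed lines.
    body = diff[2:]
    added = sum(1 for line in body if line.startswith("+"))
    removed = sum(1 for line in body if line.startswith("-"))
    return {
        "lines_added": added,
        "lines_removed": removed,
        "diff": "\n".join(diff),
    }
```

A client could display the diff field to show exactly what the formatter changed, rather than only reporting that something changed.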
Black Configuration Options
Black is intentionally designed with limited configuration options to prevent formatting debates. However, it does provide a few options:
Option | Description | Default |
---|---|---|
line_length | Maximum line length | 88 |
string_normalization | Normalize string quotes and prefixes | True |
is_pyi | Format as .pyi stub file | False |
magic_trailing_comma | Treat an existing trailing comma as a signal to keep a collection one item per line | True |
target_versions | Python versions to target | Auto-detected (our API pins PY38) |
In our API, we've exposed the most commonly used options while keeping sensible defaults.
Error Handling
When formatting Python code, several types of errors might occur:
- Syntax errors: The input code contains invalid Python syntax
- Encoding issues: Problems with character encoding
- Memory errors: For extremely large inputs
Our implementation handles these errors gracefully by:
- Catching exceptions during formatting
- Including error messages in the response
- Returning the original code when formatting fails
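For syntax errors specifically, a cheap pre-check can give clients a line number before Black is invoked. The sketch below uses the standard library's ast module; check_syntax is a hypothetical helper, and because it relies on the grammar of the interpreter running the service, it may not agree exactly with what Black accepts for its configured target versions. Treat it as a friendlier error message, not a replacement for Black's own parsing.

```python
import ast
from typing import Optional


def check_syntax(code: str) -> Optional[dict]:
    """Return None if the code parses, otherwise details of the problem."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        # lineno and offset are 1-based and may be None for some errors.
        return {"line": exc.lineno, "column": exc.offset, "message": exc.msg}
    except ValueError as exc:
        # Some Python versions raise ValueError for source with null bytes.
        return {"line": None, "column": None, "message": str(exc)}
    return None
```

An endpoint could call check_syntax first and return the resulting details in the error field when it is not None.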
Let's add a utility function in app/utils.py to handle large inputs safely:
def validate_code_size(code: str, max_size_kb: int = 500) -> bool:
    """
    Validate that the code size is within acceptable limits.

    Args:
        code: The Python code to validate
        max_size_kb: Maximum allowed size in kilobytes

    Returns:
        True if the code size is acceptable, False otherwise
    """
    # Convert to bytes to get accurate size
    code_bytes = code.encode('utf-8')
    size_kb = len(code_bytes) / 1024
    return size_kb <= max_size_kb
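The reason for encoding before measuring is that character count and byte count diverge as soon as the source contains non-ASCII text. A small illustration, where size_in_kb is a hypothetical helper mirroring the calculation above:

```python
def size_in_kb(code: str) -> float:
    """Size of the UTF-8 encoded source in kilobytes."""
    return len(code.encode("utf-8")) / 1024


# Pure ASCII: every character is one byte.
ascii_source = "x = 1\n" * 1000

# "é" takes two bytes in UTF-8, so the byte count exceeds the character count.
accented_source = 'greeting = "café"\n' * 1000

print(len(ascii_source), len(ascii_source.encode("utf-8")))
print(len(accented_source), len(accented_source.encode("utf-8")))
```

The first line prints equal numbers, while the second shows more bytes than characters, which is why a limit expressed in kilobytes should be checked against the encoded form.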
Now, let's update our formatting endpoint in app/main.py to use this validation:
from app.utils import validate_code_size

@app.post("/format", response_model=FormatResponse)
async def format_code(request: FormatRequest):
    """Format Python code using Black."""
    # Validate code size
    if not validate_code_size(request.code):
        raise HTTPException(
            status_code=413,
            detail="Code exceeds maximum size limit (500KB)"
        )
    try:
        result = format_python_code(
            request.code,
            line_length=request.line_length,
            fast=request.fast,
            skip_string_normalization=request.skip_string_normalization,
        )
        return result
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
Advanced Features
Asynchronous Processing
For handling large code files or high request volumes, we can implement asynchronous processing. Let's create a background task system using FastAPI's background tasks:
from fastapi import BackgroundTasks
from uuid import uuid4
import time

# In-memory storage for task results (use Redis or a database in production)
format_tasks = {}

@app.post("/format/async")
async def format_code_async(
    request: FormatRequest,
    background_tasks: BackgroundTasks
):
    """Start an asynchronous formatting task."""
    # Generate a unique task ID
    task_id = str(uuid4())
    # Register the task up front so status polls succeed before it finishes
    format_tasks[task_id] = {"status": "processing", "created_at": time.time()}
    # Add the formatting task to background tasks
    background_tasks.add_task(
        process_formatting_task,
        task_id,
        request.code,
        request.line_length,
        request.fast,
        request.skip_string_normalization
    )
    return {"task_id": task_id, "status": "processing"}

# A plain `def` lets the framework run this CPU-bound work in a thread pool
# instead of blocking the event loop.
def process_formatting_task(
    task_id: str,
    code: str,
    line_length: int,
    fast: bool,
    skip_string_normalization: bool
):
    """Process a formatting task in the background."""
    try:
        # Format the code
        result = format_python_code(
            code,
            line_length=line_length,
            fast=fast,
            skip_string_normalization=skip_string_normalization,
        )
        # Store the result (on Pydantic v2, prefer result.model_dump())
        format_tasks[task_id] = {
            "status": "completed",
            "result": result.dict(),
            "created_at": time.time()
        }
    except Exception as e:
        # Store the error
        format_tasks[task_id] = {
            "status": "failed",
            "error": str(e),
            "created_at": time.time()
        }

@app.get("/format/async/{task_id}")
async def get_format_task(task_id: str):
    """Get the result of an asynchronous formatting task."""
    if task_id not in format_tasks:
        raise HTTPException(status_code=404, detail="Task not found")
    return format_tasks[task_id]
This implementation:
- Creates an endpoint for submitting asynchronous formatting tasks
- Returns a task ID that clients can use to check the status
- Processes the formatting in the background
- Provides an endpoint to retrieve the results
In a production environment, you would want to use a more robust task queue system like Celery or Redis Queue, and store task results in a database or Redis.
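Until a real task store is in place, the in-memory dictionary grows without bound because finished tasks are never removed. The sketch below shows one way to expire old entries using the created_at timestamp each task already stores. The function name purge_expired_tasks and the one-hour default are assumptions for illustration.

```python
import time
from typing import Any, Dict, Optional


def purge_expired_tasks(
    tasks: Dict[str, Dict[str, Any]],
    max_age_seconds: float = 3600.0,
    now: Optional[float] = None,
) -> int:
    """Delete tasks older than max_age_seconds; return how many were removed."""
    current = time.time() if now is None else now
    # Collect IDs first so the dict is not modified while iterating over it.
    expired = [
        task_id
        for task_id, task in tasks.items()
        if current - task.get("created_at", current) > max_age_seconds
    ]
    for task_id in expired:
        del tasks[task_id]
    return len(expired)
```

Calling purge_expired_tasks(format_tasks) at the start of each submission, or from a periodic job, would keep memory usage bounded.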
Response Caching
To improve performance, we can implement caching for formatting results. Let's add a simple in-memory cache:
import hashlib
from functools import lru_cache

# Create a hash of the request parameters to use as a cache key
def get_cache_key(code: str, line_length: int, fast: bool, skip_string_normalization: bool) -> str:
    """Generate a cache key based on formatting parameters."""
    key = f"{code}:{line_length}:{fast}:{skip_string_normalization}"
    return hashlib.md5(key.encode()).hexdigest()

# Cache the formatting function. lru_cache keys on every argument, so the
# full source string is part of the cache key alongside cache_key.
@lru_cache(maxsize=100)
def cached_format_python_code(
    cache_key: str,
    code: str,
    line_length: int = 88,
    fast: bool = False,
    skip_string_normalization: bool = False,
) -> FormatResponse:
    """Cached version of format_python_code."""
    return format_python_code(
        code,
        line_length=line_length,
        fast=fast,
        skip_string_normalization=skip_string_normalization,
    )

# Update the endpoint to use caching
@app.post("/format", response_model=FormatResponse)
async def format_code(request: FormatRequest):
    """Format Python code using Black with caching."""
    # Validate code size
    if not validate_code_size(request.code):
        raise HTTPException(
            status_code=413,
            detail="Code exceeds maximum size limit (500KB)"
        )
    try:
        # Generate cache key
        cache_key = get_cache_key(
            request.code,
            request.line_length,
            request.fast,
            request.skip_string_normalization,
        )
        # Get cached or new result
        result = cached_format_python_code(
            cache_key,
            request.code,
            request.line_length,
            request.fast,
            request.skip_string_normalization,
        )
        return result
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
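If you would rather key the cache purely on a digest, so that lookups work the same way against an external store, a small explicit LRU is one option. The sketch below is an alternative under stated assumptions, not the approach used above: DigestLRUCache and its method names are hypothetical, and it stores only the formatted text.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class DigestLRUCache:
    """Small LRU cache keyed by a SHA-256 digest of the request parameters."""

    def __init__(self, max_entries: int = 100) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    @staticmethod
    def make_key(
        code: str, line_length: int, fast: bool, skip_string_normalization: bool
    ) -> str:
        # Options and source are hashed together, so changing either
        # produces a different key.
        prefix = f"{line_length}:{int(fast)}:{int(skip_string_normalization)}:"
        return hashlib.sha256((prefix + code).encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[str]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: str, formatted_code: str) -> None:
        self._entries[key] = formatted_code
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

An endpoint would call make_key, try get, and fall back to format_python_code followed by put on a miss.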
Best Practices
Security Considerations
When exposing a code formatting API, consider these security best practices:
- Input validation: Validate all input parameters, including code size limits
- Rate limiting: Implement rate limiting to prevent abuse
- Authentication: Add API key authentication for production use
- Sandboxing: Run the formatter in a restricted environment
- Monitoring: Implement logging and monitoring for suspicious activity
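Two of these points, rate limiting and API key checks, can be sketched with the standard library alone. The class and function names below are hypothetical, the limits are placeholder values, and the state lives in one process, so a multi-worker deployment would need a shared store instead.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int = 60, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and current - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(current)
        return True


def is_valid_api_key(presented: str, expected: str) -> bool:
    """Compare keys in constant time to avoid leaking timing information."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

In the endpoint, a failed allow() check would typically map to a 429 response and a failed key check to a 401.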
Monitoring and Logging
For a production-ready formatting API, implement comprehensive monitoring:
- Request metrics: Track request volume, latency, and error rates
- Resource usage: Monitor CPU, memory, and disk usage
- Error tracking: Log and alert on formatting errors
- Health checks: Implement health check endpoints
- Audit logging: Log access patterns for security analysis
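As a starting point for request metrics, the sketch below times an operation and logs its outcome using only the standard library. The logger name, the track_latency helper, and the in-process latencies_ms store are all assumptions for illustration; a real deployment would more likely export these values to a metrics backend.

```python
import logging
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

logger = logging.getLogger("formatter_api")

# Hypothetical in-process store of observed latencies, keyed by operation.
latencies_ms: Dict[str, List[float]] = {}


@contextmanager
def track_latency(operation: str) -> Iterator[None]:
    """Record how long the wrapped block takes and whether it raised."""
    start = time.perf_counter()
    outcome = "ok"
    try:
        yield
    except Exception:
        outcome = "error"
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        latencies_ms.setdefault(operation, []).append(elapsed_ms)
        logger.info(
            "operation=%s outcome=%s elapsed_ms=%.2f", operation, outcome, elapsed_ms
        )
```

Wrapping the call to format_python_code in `with track_latency("format"):` would record one sample per request, including failed ones.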
API Documentation
FastAPI automatically generates API documentation using OpenAPI and Swagger UI:
- Access interactive documentation at /docs
- Alternative ReDoc documentation available at /redoc
- Export OpenAPI schema at /openapi.json
Ensure your endpoint descriptions and parameter annotations are clear and comprehensive to help API consumers.
Conclusion
In this guide, we've built a robust Python code formatting API using FastAPI and Black. This service provides a flexible way to format Python code programmatically, making it easier to maintain consistent code style across teams and projects.
Some key takeaways:
- FastAPI provides a high-performance framework for building APIs with automatic documentation
- Black offers deterministic Python code formatting with minimal configuration
- Combined, they create a powerful formatting service that can be integrated into various workflows
- Advanced features like asynchronous processing and caching improve scalability
- Proper error handling and security measures are essential for production deployment
By following this guide, you can deploy your own formatting API and customize it to meet your specific needs. Whether you're looking to enforce code style across a team, integrate formatting into CI/CD pipelines, or provide formatting as a service, this approach offers a flexible and scalable solution.