Uvicorn & FastAPI Deployment
Production-grade REST API deployment for async Python services
Reference implementations: OCapistaine (app/main.py), Vaettir (FastAPI adapters)
Table of Contents
- Why Uvicorn?
- Architecture
- FastAPI Lifespan Integration
- Configuration
- Production Deployment
- Graceful Shutdown
- Examples
- Troubleshooting
- Performance Tuning
Why Uvicorn?
Uvicorn vs. Flask
Strictly speaking, Uvicorn is an ASGI server rather than a framework, so the comparison below is really between a Flask/WSGI stack and a FastAPI application running on Uvicorn.
| Aspect | Flask | Uvicorn |
|---|---|---|
| Async Support | Limited, workaround via extensions | Native ASGI, built-in async/await |
| Concurrency | Thread-based (werkzeug), 1 worker = 1 process | Event loop-based (uvloop), handles 1000s of concurrent requests |
| Startup/Shutdown | Basic hooks (before_first_request) | Proper lifespan context manager (async) |
| Performance | Slower on concurrent I/O-bound workloads | Optimized for async, low overhead |
| Production Ready | Needs additional servers (Gunicorn wrapper) | Production-ready out of box |
| Scheduler Integration | Awkward (separate threads, locking) | Seamless (same event loop) |
| Framework | Micro-framework (WSGI) | ASGI server (framework-agnostic; pairs with FastAPI, Starlette, etc.) |
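The event-loop row above can be illustrated with plain asyncio: 100 simulated I/O-bound "requests" finish in roughly the time of one, because the loop interleaves them while each one awaits. This is a sketch with no web server involved; the sleep stands in for any non-blocking I/O call.

```python
import asyncio
import time

async def fake_request(i: int) -> int:
    # Stands in for non-blocking I/O (a DB query, an HTTP call, ...)
    await asyncio.sleep(0.1)
    return i

async def main() -> float:
    start = time.perf_counter()
    # 100 "requests" run concurrently on a single event loop
    results = await asyncio.gather(*(fake_request(i) for i in range(100)))
    assert len(results) == 100
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(f"100 concurrent requests in {elapsed:.2f}s")  # ~0.1s, not 10s
```

A thread-per-request model would need 100 threads (or serialize the work) to match this.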
When to Choose Uvicorn
✅ Use Uvicorn when:
- Building REST APIs with async operations
- Running scheduled tasks in same process
- Need low latency (< 50ms)
- Multiple concurrent requests expected (API endpoints, webhooks)
- Using async libraries (aiohttp, asyncpg, motor)
✅ Applies to:
- OCapistaine: REST API + webhook endpoints + scheduler
- Vaettir adapters: FastAPI microservices
- Any new service in Locki ecosystem
Architecture
FastAPI + Uvicorn Stack
┌─────────────────────────────────────────────────────────┐
│ Client Requests │
│ (HTTP, WebSocket, Webhooks) │
└────────────────────────┬────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────┐
│ Uvicorn (ASGI Server) │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Event Loop (async/await runtime) │ │
│ │ - Handles multiple concurrent requests │ │
│ │ - Non-blocking I/O operations │ │
│ │ - Worker threads (if configured) │ │
│ └────────────────────────────────────────────────────┘ │
└────────────────────────┬────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────┐
│ FastAPI Application (`app/main.py`) │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Lifespan Context Manager │ │
│ │ ┌──────────────────────────────────────────────┐ │ │
│ │ │ Startup │ │ │
│ │ │ - Initialize async resources │ │ │
│ │ │ - Connect to databases/services │ │ │
│ │ │ - Start scheduler (APScheduler) │ │ │
│ │ │ - Warm caches │ │ │
│ │ └──────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Application Logic (running) │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────┐ │ │
│ │ │ Shutdown │ │ │
│ │ │ - Stop scheduler │ │ │
│ │ │ - Wait for tasks to complete (timeout) │ │ │
│ │ │ - Close connections │ │ │
│ │ │ - Cleanup resources │ │ │
│ │ └──────────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Route Handlers (REST endpoints) │ │
│ │ - Async def endpoints (non-blocking) │ │
│ │ - Access lifespan resources │ │
│ └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────┐
│ External Services & Resources │
│ - Redis (cache, scheduling) │
│ - Database (async driver) │
│ - LLM APIs (Ollama, OpenAI, Gemini) │
│ - Message queues (if applicable) │
└─────────────────────────────────────────────────────────┘
FastAPI Lifespan Integration
Basic Pattern
The lifespan context manager is the key to managing application startup/shutdown gracefully:
from contextlib import asynccontextmanager

from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup code
    print("⬆️ Application starting...")
    # Initialize resources here
    await initialize_resources()
    # Start scheduler
    await start_scheduler()

    yield  # Application runs here

    # Shutdown code
    print("⬇️ Application shutting down...")
    # Cleanup resources (order matters - reverse of startup)
    await stop_scheduler()
    await cleanup_resources()

app = FastAPI(lifespan=lifespan)
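FastAPI's lifespan parameter is just an async context manager, so its startup/yield/shutdown ordering can be checked with the stdlib alone. A minimal sketch (no FastAPI or server needed; `app=None` stands in for the application object):

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []

@asynccontextmanager
async def lifespan(app):
    events.append("startup")    # runs before any request is served
    yield
    events.append("shutdown")   # runs after the server stops accepting

async def main() -> None:
    async with lifespan(app=None):
        events.append("handling requests")

asyncio.run(main())
print(events)  # ['startup', 'handling requests', 'shutdown']
```

Uvicorn drives exactly this protocol: it enters the context manager before serving and exits it during graceful shutdown.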
OCapistaine Example
File: app/main.py
@asynccontextmanager
async def lifespan(app: FastAPI):
    """Application lifespan handler."""
    # Startup
    logger.info("OCapistaine API starting...")

    # Check Redis connection
    if redis_health_check():
        logger.info("Redis connected")
    else:
        logger.warning("Redis not available - some features may be limited")

    # Start scheduler
    from app.services.scheduler import start_scheduler
    await start_scheduler()

    yield

    # Shutdown
    from app.services.scheduler import stop_scheduler
    await stop_scheduler()
    logger.info("OCapistaine API shutting down...")

app = FastAPI(
    title="OCapistaine API",
    description="AI-powered civic transparency",
    version="0.1.0",
    lifespan=lifespan,
)
Configuration
Development vs. Production
# Development mode (hot-reload, debug output; binds to 127.0.0.1 by default)
uvicorn app.main:app --reload --port 8000
# Production mode (optimized, no reload)
uvicorn app.main:app --workers 4 --host 0.0.0.0 --port 8000 --access-log
Configuration Parameters
| Parameter | Dev | Production | Purpose |
|---|---|---|---|
| `--reload` | ✅ | ❌ | Auto-restart on code changes (development only) |
| `--workers` | 1 | 4-8 | Number of worker processes |
| `--worker-class` | - | `uvicorn.workers.UvicornWorker` | Worker type (a Gunicorn option; only relevant when running uvicorn workers under Gunicorn) |
| `--host` | 127.0.0.1 | 0.0.0.0 | Listen address |
| `--port` | 8000 | 8000 | Port (configurable via env) |
| `--access-log` | ❌ | ✅ | Log HTTP requests |
| `--log-level` | info | warning | Logging level |
| `--ssl-keyfile` | - | /path/to/key | SSL private key (if terminating TLS in uvicorn) |
| `--ssl-certfile` | - | /path/to/cert | SSL certificate (if terminating TLS in uvicorn) |
Environment Variables
Note: the uvicorn CLI does not read custom UVICORN_* variables on its own (it natively honors only a handful, such as WEB_CONCURRENCY for the worker count and FORWARDED_ALLOW_IPS). Treat the names below as project conventions consumed by your own launcher or config module:
# Port configuration
UVICORN_PORT=8050
# SSL (if needed)
UVICORN_SSL_KEYFILE=/etc/ssl/private/key.pem
UVICORN_SSL_CERTFILE=/etc/ssl/certs/cert.pem
# Worker configuration
UVICORN_WORKERS=4
# Logging
UVICORN_LOG_LEVEL=info
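One way to consume these conventions, before reaching for a full config module, is a tiny launcher that maps them onto `uvicorn.run()` keyword arguments. The file name `run.py` and the defaults below are illustrative, not part of uvicorn:

```python
# run.py - hypothetical launcher mapping the UVICORN_* variables above
# onto uvicorn.run() keyword arguments
import os

def settings_from_env() -> dict:
    """Build keyword arguments for uvicorn.run() from environment variables."""
    return {
        "host": os.getenv("UVICORN_HOST", "127.0.0.1"),
        "port": int(os.getenv("UVICORN_PORT", "8000")),
        "workers": int(os.getenv("UVICORN_WORKERS", "1")),
        "log_level": os.getenv("UVICORN_LOG_LEVEL", "info"),
    }

# Entry point of the launcher:
#   import uvicorn
#   uvicorn.run("app.main:app", **settings_from_env())
```

Keeping the env parsing in a plain function makes it trivially unit-testable without starting a server.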
Configuration File (uvicorn_config.py)
For complex setups, put the settings in an importable module (avoid a dot in the filename so Python can import it):
# uvicorn_config.py
import os

# Load environment
from dotenv import load_dotenv
load_dotenv()

# Keyword arguments for uvicorn.run()
config = {
    "app": "app.main:app",
    "host": os.getenv("UVICORN_HOST", "0.0.0.0"),
    "port": int(os.getenv("UVICORN_PORT", "8000")),
    "workers": int(os.getenv("UVICORN_WORKERS", "4")),
    "reload": os.getenv("ENVIRONMENT", "production") != "production",
    "log_level": os.getenv("LOG_LEVEL", "info"),
    "access_log": os.getenv("ENVIRONMENT") == "production",
}

# Usage (the uvicorn CLI has no --config flag; call uvicorn.run() yourself):
#   import uvicorn
#   from uvicorn_config import config
#   uvicorn.run(**config)
Production Deployment
Multi-Worker Setup
In production, run multiple workers to handle concurrent requests:
# 4 workers (a common starting point for async apps is one to two workers per CPU core)
uvicorn app.main:app --workers 4 --host 0.0.0.0 --port 8000 --access-log
# Derive the worker count from the machine (uvicorn also reads the WEB_CONCURRENCY env var)
uvicorn app.main:app --workers "$(nproc)" --host 0.0.0.0 --port 8000 --access-log
Important: With multiple workers, each worker runs the lifespan startup/shutdown independently. If you have a shared scheduler, guard it so only one worker starts it. Note that uvicorn does not set a worker-id variable itself; WORKER_ID below must be exported by your process manager:
# ✅ Good: only start the scheduler in one designated worker
@asynccontextmanager
async def lifespan(app: FastAPI):
    if os.getenv("WORKER_ID", "0") == "0":
        logger.info("Starting scheduler in worker 0")
        await start_scheduler()
        yield
        await stop_scheduler()
    else:
        logger.info(f"Worker {os.getenv('WORKER_ID')} skipping scheduler")
        yield
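Since uvicorn exposes no worker index, another option is an OS-level file lock: whichever worker acquires it first runs the scheduler, the rest skip it. A Unix-only sketch (the lock path is a hypothetical choice; pick one per deployment):

```python
import fcntl
import os

LOCK_PATH = "/tmp/scheduler.lock"  # hypothetical path, one per deployment

def try_acquire_scheduler_lock(path: str = LOCK_PATH):
    """Return an open fd if this process won the lock, else None."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR)
    try:
        # Non-blocking exclusive lock: exactly one process/worker succeeds
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd  # keep the fd open for the process lifetime
    except BlockingIOError:
        os.close(fd)
        return None

# In the lifespan startup, start the scheduler only if the lock was won:
#   lock_fd = try_acquire_scheduler_lock()
#   if lock_fd is not None:
#       await start_scheduler()
```

The kernel releases the lock automatically when the holding process exits, so a crashed worker cannot wedge the scheduler permanently.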
Better approach: Use separate scheduler process (see Process Orchestration)
# Process 1: API workers (no scheduler)
uvicorn app.main:app --workers 4 --port 8000
# Process 2: Scheduler daemon (single instance)
python -m app.services.scheduler.daemon
Reverse Proxy (Nginx)
server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate /etc/ssl/certs/cert.pem;
    ssl_certificate_key /etc/ssl/private/key.pem;

    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support (if needed)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
Docker Deployment
FROM python:3.12-slim
WORKDIR /app

# Install dependencies
COPY pyproject.toml poetry.lock ./
RUN pip install poetry && poetry install --without dev

# Copy application (inject secrets at runtime; avoid baking .env into the image)
COPY app/ app/
COPY src/ src/

# Health check (urllib avoids depending on the requests package)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"

# Run uvicorn
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
Graceful Shutdown
Signal Handling
Uvicorn automatically handles SIGTERM/SIGINT signals and calls the lifespan shutdown handler:
1. SIGTERM received (e.g., `kill -15 PID`)
2. Uvicorn stops accepting new requests
3. Existing requests are allowed to complete (bounded by `--timeout-graceful-shutdown`, if set)
4. Lifespan shutdown code executes
5. Process exits
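Steps 1-5 can be observed in miniature with the stdlib (Unix only, since it uses `add_signal_handler`): a SIGTERM handler sets an event, the shutdown path wakes up, and cleanup runs before the process exits. The sleep/signal timing here is just a simulation of an external `kill -15`:

```python
import asyncio
import signal

shutdown_log: list[str] = []

async def serve() -> None:
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    # Step 1: install the handler uvicorn would normally register for you
    loop.add_signal_handler(signal.SIGTERM, stop.set)
    # Simulate `kill -15 <PID>` arriving shortly after startup
    loop.call_later(0.05, signal.raise_signal, signal.SIGTERM)
    await stop.wait()  # steps 2-3: stop accepting work, drain in-flight tasks
    shutdown_log.append("lifespan shutdown ran")  # step 4: cleanup runs here

asyncio.run(serve())
print(shutdown_log)
```

Uvicorn installs equivalent handlers itself, which is why application code normally never touches the `signal` module.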
Timeout Configuration
# Bound how long uvicorn waits for in-flight requests during shutdown
# (by default it waits for open connections to close on their own)
uvicorn app.main:app --workers 4 --timeout-graceful-shutdown 30
Shutdown Checklist
In your shutdown handler, ensure:
@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup...
    await start_scheduler()

    yield

    # Shutdown (order matters)
    logger.info("Shutting down...")
    # 1. Stop accepting new requests (handled by uvicorn)
    # 2. Wait for running tasks
    await stop_scheduler(wait=True, timeout=20)
    # 3. Close connections
    await close_redis()
    await close_db()
    # 4. Cleanup resources
    await cleanup_temp_files()
    logger.info("Shutdown complete")
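The "order matters" rule (close in reverse of startup) can also be enforced mechanically with `contextlib.AsyncExitStack` instead of hand-ordering the awaits. A sketch with stand-in resources:

```python
import asyncio
from contextlib import AsyncExitStack

closed: list[str] = []

class Resource:
    """Stand-in for a client with an async close() (redis, db pool, scheduler...)."""
    def __init__(self, name: str) -> None:
        self.name = name
    async def close(self) -> None:
        closed.append(self.name)

async def main() -> None:
    async with AsyncExitStack() as stack:
        # Register cleanups in startup order; the stack invokes them in reverse
        for name in ("redis", "db", "scheduler"):
            stack.push_async_callback(Resource(name).close)
        # ... application would run here ...

asyncio.run(main())
print(closed)  # ['scheduler', 'db', 'redis'] - reverse of startup order
```

In a real lifespan you would enter the stack at startup, stash it on `app.state`, and call `aclose()` on it at shutdown.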
Examples
Example 1: Minimal API with Scheduler
# app/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI

from app.services.scheduler import start_scheduler, stop_scheduler

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    await start_scheduler()
    yield
    # Shutdown
    await stop_scheduler()

app = FastAPI(lifespan=lifespan)

@app.get("/")
async def root():
    return {"status": "healthy"}

@app.get("/health")
async def health():
    return {"status": "ok"}

# Usage:
# uvicorn app.main:app --reload --port 8000
Example 2: API with Redis Connection Pool
# app/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI
from redis import asyncio as aioredis  # redis-py >= 4.2; the standalone aioredis package is deprecated

redis = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    global redis
    redis = aioredis.from_url("redis://localhost")  # lazy connection pool
    print("✓ Redis connected")
    yield
    # Shutdown
    await redis.aclose()
    print("✓ Redis closed")

app = FastAPI(lifespan=lifespan)

@app.get("/cache/{key}")
async def get_cache(key: str):
    value = await redis.get(key)
    return {"key": key, "value": value}

@app.post("/cache/{key}")
async def set_cache(key: str, value: str):
    await redis.set(key, value, ex=3600)  # `ex` is the TTL in seconds
    return {"key": key, "status": "set"}
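When Redis is unavailable at startup (the degraded mode the OCapistaine lifespan logs a warning about), one hedged option is a process-local fallback that mimics the two calls the endpoints above rely on. This is a hypothetical sketch, valid for a single worker only:

```python
import asyncio
import time
from typing import Optional

class InMemoryCache:
    """Stand-in for get/set when Redis is down (single-process, no persistence)."""
    def __init__(self) -> None:
        self._data: dict = {}  # key -> (expires_at, value)

    async def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None or entry[0] <= time.monotonic():
            self._data.pop(key, None)  # drop expired entries lazily
            return None
        return entry[1]

    async def set(self, key: str, value: str, ex: int = 3600) -> None:
        # `ex` mirrors redis-py's TTL-in-seconds argument
        self._data[key] = (time.monotonic() + ex, value)

cache = InMemoryCache()
asyncio.run(cache.set("greeting", "hello", ex=60))
print(asyncio.run(cache.get("greeting")))  # hello
```

Because it keeps the same `get`/`set(ex=...)` shape, the lifespan can assign either backend to a single module-level handle without touching the route handlers.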
Example 3: API with Middleware and CORS
# app/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

@asynccontextmanager
async def lifespan(app: FastAPI):
    print("Starting up")
    yield
    print("Shutting down")

app = FastAPI(lifespan=lifespan)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:8502", "https://example.com"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/api/data")
async def get_data():
    return {"data": "value"}
Troubleshooting
Port Already in Use
# Find process using port 8000
lsof -i :8000
# Kill process
kill -9 <PID>
# Or use automatic cleanup script
./scripts/start.sh # See Scripts Standardization
Lifespan Not Running
Problem: Startup code never executes
Solution: Ensure lifespan is passed to FastAPI:
# ✅ Correct
app = FastAPI(lifespan=lifespan)
# ❌ Wrong
app = FastAPI()
# lifespan not passed!
Scheduler Starts Multiple Times
Problem: Each worker starts a separate scheduler (race conditions)
Solution: Use separate scheduler process or worker ID check:
# Option 1: Separate process (recommended)
# Process 1: uvicorn app.main:app --workers 4
# Process 2: python -m app.services.scheduler.daemon
# Option 2: worker ID check (only if your process manager exports a worker id;
# uvicorn itself does not set any such environment variable)
@asynccontextmanager
async def lifespan(app: FastAPI):
    if os.getenv("WORKER_ID", "0") == "0":
        await start_scheduler()
        yield
        await stop_scheduler()
    else:
        yield
Connection Timeouts
Problem: Remote services timeout during requests
Solution: Use connection pooling and timeouts:
# ✅ Shared client with a connection pool (create once, reuse across requests)
import httpx
client = httpx.AsyncClient(limits=httpx.Limits(max_connections=100))

# ✅ Per-request timeout
response = await client.get(url, timeout=10.0)
Memory Leaks
Problem: Memory grows over time
Solution: Ensure proper cleanup in shutdown:
import gc

@asynccontextmanager
async def lifespan(app: FastAPI):
    # ...startup...
    yield
    # Explicit cleanup: close clients first, then collect
    await redis.close()
    await db.close()
    gc.collect()
Performance Tuning
Event Loop Configuration
# Use uvloop (a faster event loop implementation)
pip install uvloop
uvicorn app.main:app --loop uvloop

# Or configure in code, before the loop starts:
import asyncio
import uvloop
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
Connection Pooling
# Reuse connections - don't create a client or pool per request.
# During lifespan startup, park shared resources on app.state:
app.state.redis = aioredis.from_url("redis://localhost")
app.state.db_pool = await asyncpg.create_pool(dsn)

# In a route handler:
# value = await request.app.state.redis.get(key)
Caching
from functools import lru_cache

@lru_cache(maxsize=128)
def get_config(key: str):
    # Cached for the process lifetime
    return config[key]
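`lru_cache` has no expiry and is per-process, which is fine for static config but wrong for data that can go stale. A minimal TTL variant might look like this (stdlib only, single-argument functions, not thread-safe; the decorator name is our own):

```python
import time
from functools import wraps

def ttl_cache(ttl_seconds: float):
    """Cache a single-argument function's results for ttl_seconds (not thread-safe)."""
    def decorator(fn):
        store: dict = {}  # key -> (expires_at, value)
        @wraps(fn)
        def wrapper(key):
            now = time.monotonic()
            hit = store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # fresh cached value
            value = fn(key)
            store[key] = (now + ttl_seconds, value)
            return value
        return wrapper
    return decorator

lookups: list[str] = []

@ttl_cache(ttl_seconds=60)
def get_config(key: str) -> str:
    lookups.append(key)  # count the real (uncached) lookups
    return f"value-for-{key}"

get_config("a"); get_config("a")
print(len(lookups))  # 1 - the second call was served from cache
```

For caches shared across workers, the Redis connection from earlier sections is the natural home instead.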
References
- Uvicorn Docs: https://www.uvicorn.org
- FastAPI Docs: https://fastapi.tiangolo.com
- ASGI Standard: https://asgi.readthedocs.io
- Uvloop: https://github.com/MagicStack/uvloop
Last Updated: 2026-02-22 Branch: valkyria Tested With: Python 3.12, FastAPI 0.104+, Uvicorn 0.24+