Task: Audierne Docs Processing
File: app/services/tasks/task_audierne_docs.py
Schedule: 0 */2 * * * (Every 2 hours - dev mode)
LLM Provider: Ollama (default) with gemini failover
Resource Lock: lock:ollama:global (TTL: 600s)
Purpose
Processes audierne2026 markdown documents to build training datasets for Forseti prompt optimization. Designed for continuous improvement of charter validation.
What It Does
-
Reads Progress File
- Checks
docs/docs/audierne2026/PROCESSING_PROGRESS.md - Identifies next unprocessed document
- Checks
-
Acquires Ollama Lock (if using Ollama)
- Prevents concurrent Ollama usage with other tasks
- TTL: 600 seconds (10 minutes)
- Falls back to gemini if lock unavailable
-
Processes One Document
- Extracts themes using LLM (chunked for large docs: 15k chars, 500 overlap)
- Generates contributions (2 per theme by default)
- Includes violation examples for testing
-
Creates Opik Dataset
- Dataset name:
audierne-{filename}-{date} - Contains input/output pairs for optimization
- Dataset name:
-
Updates Progress
- Marks document as complete
- Logs themes, contributions, dataset name
-
Releases Ollama Lock
- Always releases in finally block
Documents
Located in docs/docs/audierne2026/:
| File | Size | Description |
|---|---|---|
| Mayor-2026-wishes.md | 21KB | Mayor's New Year speech |
| municipal-council-january2026.md | 22KB | Council meeting minutes |
| MVP_meeting.md | 13KB | MVP planning meeting |
| MVP-meeting-satellite.md | 11KB | Satellite meeting notes |
| overview.md | 6KB | Project overview |
Progress Tracking
Progress file: docs/docs/audierne2026/PROCESSING_PROGRESS.md
| File | Status | Processed At | Themes | Contributions | Dataset |
|------|--------|--------------|--------|---------------|---------|
| MVP-meeting-satellite.md | ✅ Done | 2026-02-06 11:29 | 3 | 12 | audierne-MVP-meeting-satellite-20260206 |
| MVP_meeting.md | ⏳ Pending | - | - | - | - |
Configuration
# Default settings in task file
DEFAULT_PROVIDER = "ollama"
FALLBACK_PROVIDER = "gemini" # Used when Ollama unavailable
CONTRIBUTIONS_PER_THEME = 2
INCLUDE_VIOLATIONS = True
OLLAMA_LOCK_TTL = 600 # 10 minutes
Failover Behavior
When Ollama is unavailable or locked by another task:
- Task checks
lock:ollama:globalin Redis - If locked → automatically uses gemini instead
- Result includes
provider_usedto show actual provider
# Task result with failover
{
"status": "success",
"provider_used": "gemini", # Failover occurred
"warnings": ["Ollama busy, using gemini"]
}
Disable Failover:
# Force Ollama only (will skip if locked)
result = task_audierne_docs(
provider="ollama",
enable_failover=False
)
# Returns status="skipped", reason="ollama_locked" if busy
Manual Run
# Run next pending document
python -c "from app.services.scheduler import run_task_now; print(run_task_now('task_audierne_docs'))"
# Force specific document
python -c "
from app.services.tasks.task_audierne_docs import task_audierne_docs
result = task_audierne_docs(force_doc='Mayor-2026-wishes.md')
print(result)
"
Success Keys
Note: Uses skip_success_check=True to allow multiple runs per day (one doc at a time).
success:task_audierne_docs:YYYYMMDD # Set when ALL docs are processed
Output
Each run produces:
- Opik Dataset:
audierne-{filename}-{date}with training data - Progress Update: Markdown table with status
- Redis Storage: Contributions stored for validation testing
Workflow Integration
┌─────────────────────────────────────────────────────────────────┐
│ Continuous Improvement Loop │
├─────────────────────────────────────────────────────────────────┤
│ │
│ task_audierne_docs (every 2h) │
│ ↓ │
│ Generate contributions from real docs │
│ ↓ │
│ Create Opik dataset │
│ ↓ │
│ task_opik_evaluate (every 30min) │
│ ↓ │
│ Evaluate Forseti accuracy │
│ ↓ │
│ Opik Optimizer │
│ ↓ │
│ Improved prompts │
│ ↓ │
│ task_prompt_sync (midnight) │
│ ↓ │
│ Updated prompts in production │
│ │
└─────────────────────────────────────────────────────────────────┘