OCapistaine Scheduler Task Flow Diagram

Last Updated: February 2026
Status: Initial implementation - Core tasks active

Complete Task Chain Flow

┌─────────────────────────────────────────────────────────────────┐
│ SCHEDULER ORCHESTRATION                                         │
│ Runs every 7 minutes (6 AM - 11 PM)                             │
└─────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│ 1. task_contributions_analysis                                  │
│    Validate citizen contributions from GitHub/Vaettir           │
│    - Fetch from audierne2026/participons                        │
│    - Run Forseti validation                                     │
│    - Log results to Opik                                        │
│    Dependencies: None                                           │
└─────────────────────────────────────────────────────────────────┘
                            │
              ┌─────────────┴─────────────┐
              │                           │
              ▼                           ▼
┌──────────────────────────────┐  ┌──────────────────────────────┐
│ (Future) task_rag_indexing   │  │ (Future) task_mockup_gen     │
│ Index validated contributions│  │ Generate test scenarios      │
│ into vector store            │  │ from validated contributions │
│ Dependencies: [contributions]│  │ Dependencies: [contributions]│
└──────────────────────────────┘  └──────────────────────────────┘
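The dependency lines in the boxes above can be read as a small directed graph. The following is a minimal sketch, assuming the dependencies are held in a plain dictionary; `TASK_DEPENDENCIES`, `execution_order`, and `runnable_tasks` are illustrative names, not taken from the scheduler code, and the two "(Future)" tasks are included only to show the shape of the chain.

```python
from graphlib import TopologicalSorter

# Dependencies as drawn in the diagram (task -> tasks it waits for).
# The "(Future)" tasks are listed for illustration; they are not implemented yet.
TASK_DEPENDENCIES = {
    "task_contributions_analysis": [],
    "task_rag_indexing": ["task_contributions_analysis"],
    "task_mockup_gen": ["task_contributions_analysis"],
}


def execution_order(dependencies):
    """Return one valid run order; raises graphlib.CycleError on a circular dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def runnable_tasks(dependencies, completed):
    """Return tasks that are not completed and whose dependencies have all completed."""
    return sorted(
        task
        for task, deps in dependencies.items()
        if task not in completed and all(dep in completed for dep in deps)
    )
```

With nothing completed, only `task_contributions_analysis` is runnable; once it completes, both downstream tasks become runnable at the same time, which matches the fan-out in the diagram.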

Standalone Scheduled Tasks

┌─────────────────────────────────────────────────────────────────┐
│ task_firecrawl                                                  │
│ Runs: Daily at 3 AM                                             │
│ Crawl municipal documents from configured sources               │
│ - mairie_arretes (~4010 documents)                              │
│ - mairie_deliberations                                          │
│ - commission_controle                                           │
│ Dependencies: None (standalone cron job)                        │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ task_opik_experiment                                            │
│ Runs: Daily at 5 AM                                             │
│ Run LLM evaluation experiments                                  │
│ - Forseti validation accuracy                                   │
│ - Category classification precision/recall                      │
│ - Wording correction quality                                    │
│ Dependencies: None (standalone cron job)                        │
└─────────────────────────────────────────────────────────────────┘
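To make the two daily schedules concrete, here is a small sketch that computes the next start time of each standalone task. The start hours come from the boxes above; the `DAILY_START_HOUR` table, the `next_daily_run` helper, and the use of naive local time are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Start hours taken from the boxes above (local time is an assumption).
DAILY_START_HOUR = {
    "task_firecrawl": 3,
    "task_opik_experiment": 5,
}


def next_daily_run(task, now):
    """Return the next start time of a daily task strictly after `now`."""
    candidate = now.replace(
        hour=DAILY_START_HOUR[task], minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

At 04:00, for example, the crawl slot for the day has passed while the experiment slot is still ahead.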

OCapistaine Workflow Decision Tree

The scheduler implements a priority-based workflow for continuous improvement:

                    ┌─────────────────────────┐
                    │ Start Daily Workflow    │
                    └───────────┬─────────────┘
                                │
                                ▼
                    ┌─────────────────────────┐
                    │ Check GitHub Issues     │
                    │ audierne2026/participons│
                    └───────────┬─────────────┘
                                │
                    ┌───────────┴───────────┐
                    │                       │
            Has new issues?           No new issues
                    │                       │
                    ▼                       ▼
        ┌────────────────────┐   ┌───────────────────┐
        │ task_contributions │   │ Check Mockup Queue│
        │ _analysis          │   │ (Redis)           │
        │                    │   └─────────┬─────────┘
        │ • Forseti validate │             │
        │ • Categorize       │   ┌─────────┴─────────┐
        │ • Log to Opik      │   │                   │
        └────────┬───────────┘   Has mockups?   No mockups
                 │                   │               │
                 ▼                   ▼               ▼
        ┌───────────────────┐ ┌───────────────┐ ┌───────────────┐
        │ Update experiment │ │ task_mockup   │ │ task_firecrawl│
        │ dataset           │ │ _experiment   │ │ (if scheduled)│
        └───────────────────┘ │               │ └───────────────┘
                              │ • Run Opik    │
                              │   evaluation  │
                              │ • Try new     │
                              │   prompts     │
                              └───────────────┘
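The decision tree above reduces to three checks evaluated in priority order. A minimal sketch, assuming the three conditions are already available as booleans; the `choose_next_task` function and its parameter names are hypothetical.

```python
def choose_next_task(has_new_issues, has_mockups, firecrawl_scheduled):
    """Pick the next task following the priority order of the decision tree."""
    if has_new_issues:
        # Followed by the "Update experiment dataset" step in the diagram.
        return "task_contributions_analysis"
    if has_mockups:
        return "task_mockup_experiment"
    if firecrawl_scheduled:
        return "task_firecrawl"
    # Nothing to do on this pass.
    return None
```

Because the checks short-circuit, new GitHub issues always win over queued mockups, and crawling only happens when both higher-priority branches are empty.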

Data Sources and Triggers

Priority 1: Live Contributions (GitHub)

audierne2026/participons repository
        │
        │ New issue created
        ▼
┌─────────────────────────────────────────────────────────────────┐
│ N8N Webhook → Redis Queue → task_contributions_analysis         │
│                                                                 │
│ OR: Periodic polling via orchestrate_task_chain (every 7 min)   │
└─────────────────────────────────────────────────────────────────┘
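Since both the webhook path and the periodic poll can deliver the same issue, some form of deduplication is likely needed before validation. The document does not say how this is handled, so the following is only a sketch under that assumption, with an in-memory `ContributionQueue` class (a hypothetical name) standing in for the Redis queue.

```python
from collections import deque


class ContributionQueue:
    """In-memory stand-in for the Redis queue (hypothetical helper)."""

    def __init__(self):
        self._pending = deque()
        self._seen = set()

    def enqueue(self, issue_id):
        """Queue an issue once; return False if it was already queued or processed."""
        if issue_id in self._seen:
            return False
        self._seen.add(issue_id)
        self._pending.append(issue_id)
        return True

    def drain(self):
        """Yield queued issue IDs in arrival order, emptying the queue."""
        while self._pending:
            yield self._pending.popleft()
```

With this shape, a webhook delivery followed by a poll that finds the same issue results in a single validation run.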

Priority 2: Mockup Testing (Streamlit UI)

app/front.py → Mockup Tab
        │
        │ User creates test contribution
        ▼
┌─────────────────────────────────────────────────────────────────┐
│ Streamlit → ForsetiAgent.validate() → Log to Opik               │
│                                                                 │
│ If interesting case: Save to Redis → task_opik_experiment       │
└─────────────────────────────────────────────────────────────────┘

Priority 3: Document Crawling (Scheduled)

┌─────────────────────────────────────────────────────────────────┐
│ task_firecrawl (3 AM daily)                                     │
│                                                                 │
│ Sources:                                                        │
│ ├── mairie_arretes (audierne.bzh/publications-arretes/)         │
│ ├── mairie_deliberations (audierne.bzh/deliberations-...)       │
│ └── commission_controle (audierne.bzh/documentheque/...)        │
│                                                                 │
│ Output: ext_data/{source}/*.md                                  │
└─────────────────────────────────────────────────────────────────┘

Task Implementation Status

| Task | Uses Workflows? | Status | Purpose |
|------|-----------------|--------|---------|
| task_contributions_analysis | ✅ Yes | ✅ Active | Validate GitHub contributions |
| task_opik_experiment | ✅ Yes | ✅ Active | Run LLM evaluations |
| task_firecrawl | ❌ No (crawler) | ✅ Active | Crawl municipal documents |
| task_rag_indexing | ✅ Yes | 🔴 Planned | Index to vector store |
| task_mockup_experiment | ✅ Yes | 🔴 Planned | Process mockup test cases |
| task_prompt_optimization | ✅ Yes | 🔴 Planned | Run Opik optimization |

Cache Keys and Data Flow

Contribution Processing

GitHub Issue
    │
    ▼
task_contributions_analysis
    │
    ├── Redis: contribution:{issue_id}
    │   └── { title, body, category, validation_result, timestamp }
    │
    ├── Redis: contributions:validated:{date}
    │   └── List of validated issue IDs
    │
    └── Opik: forseti_validation trace
        └── { input, output, latency, model, tokens }
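The key patterns above can be built with two small helpers. This is a sketch, not the project's code: the helper names are hypothetical, and both the ISO date format in the `contributions:validated:{date}` key and the JSON encoding of the stored value are assumptions.

```python
import json
from datetime import date


def contribution_key(issue_id):
    """Key for a single contribution record."""
    return f"contribution:{issue_id}"


def validated_key(day):
    """Key for the per-day list of validated issue IDs (ISO date is an assumption)."""
    return f"contributions:validated:{day.isoformat()}"


def contribution_record(title, body, category, validation_result, timestamp):
    """Serialize the fields listed in the diagram as a JSON string."""
    return json.dumps(
        {
            "title": title,
            "body": body,
            "category": category,
            "validation_result": validation_result,
            "timestamp": timestamp,
        },
        ensure_ascii=False,  # keep accented French text readable in the stored value
    )
```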

Experiment Data

task_opik_experiment
    │
    ├── Redis: experiment:latest
    │   └── { experiment_id, date, metrics, status }
    │
    └── Opik: experiment traces
        └── { dataset, evaluations, scores }

Crawl Status

task_firecrawl
    │
    ├── Redis: crawl:{source}
    │   └── { last_crawl, documents, status }
    │
    └── Filesystem: ext_data/{source}/
        └── *.md, *.html, *_metadata.json
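For a quick check of what a crawl left on disk, the layout above can be inspected directly. A minimal sketch, assuming the three source names listed earlier in this document and the `ext_data/{source}/` layout; `crawl_key` and `count_markdown_documents` are hypothetical helpers.

```python
from pathlib import Path

EXT_DATA = Path("ext_data")
SOURCES = ("mairie_arretes", "mairie_deliberations", "commission_controle")


def crawl_key(source):
    """Redis key holding the crawl status for one source."""
    return f"crawl:{source}"


def count_markdown_documents(root=EXT_DATA):
    """Count *.md files per source directory; a missing directory counts as 0."""
    counts = {}
    for source in SOURCES:
        directory = root / source
        counts[source] = len(list(directory.glob("*.md"))) if directory.is_dir() else 0
    return counts
```

Comparing these counts with the `documents` field stored under `crawl:{source}` is one way to spot a crawl that reported success but wrote nothing.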

Execution Timeline Example

Typical Daily Run:

03:00:00 - task_firecrawl starts (cron)
           └── Crawl municipal documents
03:30:00 - task_firecrawl completes (estimated)

05:00:00 - task_opik_experiment starts (cron)
           └── Run daily LLM evaluations
05:15:00 - task_opik_experiment completes (estimated)

06:00:00 - orchestrate_task_chain starts (every 7 min)
06:00:01 - task_contributions_analysis starts
           └── Check GitHub for new issues
           └── Validate with Forseti
           └── Log to Opik
06:02:00 - task_contributions_analysis completes
06:07:00 - orchestrate_task_chain runs (skips - already completed)
...
23:00:00 - orchestrate_task_chain stops (outside active hours)
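The "skips - already completed" and "outside active hours" entries in the timeline suggest two guards on every orchestrator tick. The sketch below illustrates that logic with a plain dictionary standing in for Redis. The `success:{task}:{date}` key layout is an assumption (the document only shows the `success:*` prefix), as is treating the 23:00 tick as outside the window.

```python
from datetime import datetime

ACTIVE_START_HOUR = 6   # 6 AM, from the orchestration box
ACTIVE_END_HOUR = 23    # 11 PM; the 23:00 tick is treated as outside the window


def success_key(task, now):
    # Hypothetical key layout; only the "success:*" prefix appears in the document.
    return f"success:{task}:{now.date().isoformat()}"


def tick(task, now, store):
    """Decide what one orchestrator tick does; `store` is a dict standing in for Redis."""
    if not ACTIVE_START_HOUR <= now.hour < ACTIVE_END_HOUR:
        return "inactive"
    key = success_key(task, now)
    if key in store:
        return "skipped"
    # A real task would run here and be marked done only after it succeeds.
    store[key] = now.isoformat()
    return "ran"
```

Replaying the timeline: the 06:00 tick runs the task, the 06:07 tick is skipped, the 23:00 tick is inactive, and the first tick of the next day runs again because the date in the key has changed.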

Continuous Improvement Loop

┌─────────────────────────────────────────────────────────────────┐
│                     CONTINUOUS IMPROVEMENT                      │
└─────────────────────────────────────────────────────────────────┘

     ┌──────────────┐
     │ 1. COLLECT   │ ← Live contributions from citizens
     │    Data      │ ← Mockup tests from admin
     └──────┬───────┘ ← Crawled documents
            │
            ▼
     ┌──────────────┐
     │ 2. VALIDATE  │ ← Forseti agent validation
     │    & Log     │ ← Opik tracing
     └──────┬───────┘
            │
            ▼
     ┌──────────────┐
     │ 3. ANALYZE   │ ← task_opik_experiment
     │    Results   │ ← Accuracy metrics
     └──────┬───────┘ ← Error patterns
            │
            ▼
     ┌──────────────┐
     │ 4. OPTIMIZE  │ ← (Future) Opik Optimizer
     │    Prompts   │ ← A/B testing
     └──────┬───────┘
            │
            ▼
     ┌──────────────┐
     │ 5. DEPLOY    │ ← Update prompt registry
     │    & Monitor │ ← Monitor production
     └──────┬───────┘
            │
            └────────────────────────────────────────┐
                                                     │
                                              (Loop back)

Monitoring Points

Critical Success Indicators

✅ task_contributions_analysis validates new contributions daily
✅ task_opik_experiment runs evaluations without errors
✅ task_firecrawl crawls documents as scheduled
✅ No task deadlocks or circular dependencies

Key Redis Keys to Monitor

# Scheduler locks (db=6)
redis-cli -n 6 KEYS "lock:*"
redis-cli -n 6 KEYS "success:*"

# Application data (db=5)
redis-cli -n 5 KEYS "contribution:*"
redis-cli -n 5 KEYS "crawl:*"

Performance Metrics

- Task execution times (each task should complete within expected time)
- Redis memory usage (should remain stable)
- Error rates (should be <5% per task)
- Validation accuracy (tracked in Opik)
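The error-rate target can be checked mechanically once failure and run counts are available per task. A minimal sketch, assuming those counts have been collected elsewhere; the `stats` mapping and both helper names are hypothetical.

```python
ERROR_RATE_THRESHOLD = 0.05  # "<5% per task" from the list above


def error_rate(failures, runs):
    """Fraction of failed runs; a task with no runs is reported as 0.0."""
    return failures / runs if runs else 0.0


def tasks_over_threshold(stats, threshold=ERROR_RATE_THRESHOLD):
    """`stats` maps task name -> (failures, runs); returns offending task names, sorted."""
    return sorted(
        task
        for task, (failures, runs) in stats.items()
        if error_rate(failures, runs) >= threshold
    )
```

A task counts as over the threshold at exactly 5% or above, since the target is stated as strictly below 5%.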

Related Documentation

- README.md - Scheduler architecture overview
- USAGE_EXAMPLES.md - How-to guide
- TASK_BOILERPLATE.md - Task implementation guide
