# Theory of Constraints (ToC)

## Overview
The Theory of Constraints is a management philosophy that focuses on identifying and managing the most limiting factor (constraint) that stands in the way of achieving a goal.
## Core Concept

> "A chain is no stronger than its weakest link."
In any system, exactly ONE constraint limits output at any given time. Improving anything other than the constraint is a waste of resources.
## The Five Focusing Steps

1. IDENTIFY the constraint
2. EXPLOIT the constraint (maximize its efficiency)
3. SUBORDINATE everything else to the constraint
4. ELEVATE the constraint (add capacity)
5. REPEAT: find the new constraint
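The loop structure of the steps can be sketched as a toy model (station names, capacities, and the `boost` amount are all invented for illustration, not project code): system output is the minimum station capacity, and each cycle elevates whichever station is currently the constraint.

```python
def throughput(stations):
    """System output is limited by the slowest station (the constraint)."""
    return min(stations.values())

def identify(stations):
    """Step 1 (IDENTIFY): the constraint is the station with least capacity."""
    return min(stations, key=stations.get)

def five_focusing_steps(stations, cycles=2, boost=5):
    """Each cycle targets the current constraint, then repeats."""
    for _ in range(cycles):
        constraint = identify(stations)       # 1. IDENTIFY
        # 2. EXPLOIT and 3. SUBORDINATE would tune scheduling around the
        # constraint without new capacity; this toy jumps to 4. ELEVATE:
        stations[constraint] += boost
        # 5. REPEAT: the next iteration re-identifies the constraint,
        # which may have moved to a different station.
    return stations

line = {"ocr": 10, "extract": 25, "index": 18}
five_focusing_steps(line)
# "ocr" stays the constraint through both cycles (10 -> 15 -> 20);
# afterwards "index" (18) is the new constraint.
```

Note that improving "extract" (the fastest station) would never change `throughput` — the core ToC claim in miniature.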
## Application to OCapistaine

### Constraint: Data Acquisition

In our project, the primary constraint is data acquisition from municipal sources:
| Constraint | Impact | Resolution |
|---|---|---|
| Firecrawl API rate limits | Slows document collection | Batch processing, caching |
| OCR processing time | Bottleneck for PDFs | Parallel processing |
| Municipal website structure | Requires custom extraction | Adaptive scraping |
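The batch-processing and caching resolutions can be sketched as follows. This is an illustration under stated assumptions: `scrape` is a stub standing in for the real Firecrawl client, and the quota value is invented.

```python
import time
from functools import lru_cache

RATE_LIMIT_PER_MIN = 10     # assumed quota; check the real Firecrawl plan

def scrape(url):
    """Stub for the real Firecrawl call; replace with the actual client."""
    return f"<content of {url}>"

@lru_cache(maxsize=1024)
def fetch_document(url):
    """Caching: a repeated URL never spends quota twice."""
    return scrape(url)

def fetch_batch(urls):
    """Batch processing: dedupe first, then pace calls to stay under quota."""
    results = {}
    for i, url in enumerate(dict.fromkeys(urls)):   # dedupe, keep order
        if i and i % RATE_LIMIT_PER_MIN == 0:
            time.sleep(60)                          # wait out the quota window
        results[url] = fetch_document(url)
    return results
```

This is the EXPLOIT step applied to an API quota: squeeze more documents out of the same number of calls before paying for more.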
### Budget Constraints
For civic projects with limited budgets:
| Resource | Constraint | Strategy |
|---|---|---|
| API Credits | Limited Firecrawl calls | Prioritize high-value sources |
| Compute | Local processing limits | Cloud bursting for peaks |
| Time | 4-week hackathon deadline | Focus on MVP features |
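The "prioritize high-value sources" strategy amounts to spending a fixed credit budget greedily by value per credit. A minimal sketch; the source names, values, and costs are invented:

```python
import heapq

def prioritize_sources(sources, credit_budget):
    """Greedy knapsack-style spend: highest value-per-credit first.
    `sources` is a list of (name, value, cost_in_credits) tuples."""
    heap = [(-value / cost, cost, name) for name, value, cost in sources]
    heapq.heapify(heap)                 # max-heap via negated density
    chosen, remaining = [], credit_budget
    while heap:
        _, cost, name = heapq.heappop(heap)
        if cost <= remaining:           # skip sources we can no longer afford
            chosen.append(name)
            remaining -= cost
    return chosen, remaining
```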
### Constraint: Provider Inference Speed (Chunking Case Study)

Date: 2026-03-04 | Component: app/mockup/field_input.py

A 58,880-character municipal meeting transcript failed to process: all 5 chunks timed out with empty error messages when sent to Ollama deepseek-r1:7b for theme extraction.
Applying the Five Focusing Steps:
| Step | Analysis | Action |
|---|---|---|
| 1. IDENTIFY | The constraint is local model inference speed — not the code, not the data, not the network. A 7B model with json_mode=True cannot process 12k char chunks within 120s. | Profiled logs: exactly 120s per chunk = timeout |
| 2. EXPLOIT | Reduce chunk size (15k → 8k) to fit within the existing 120s budget. No new hardware, no timeout increase, no model change. | Changed CHUNK_SIZE constant |
| 3. SUBORDINATE | The Feature layer (chunking) adapts to Provider constraints, not the other way around. The Provider stays chunk-unaware — it only sees single requests. | ABC pattern |
| 4. ELEVATE | Documented chunk size guidelines by provider class. When a faster model is deployed, chunk sizes can safely increase. | Guidelines table in AGENTS_FRAMEWORK.md |
| 5. REPEAT | Next constraint identified: anonymization sends full 58k text in one shot to Ollama (no chunking). Same bottleneck, different feature. | Resolved — see below |
What we did NOT do (and why):
| Temptation | Why resisted |
|---|---|
| Increase timeout to 300s | Throwing resources at the constraint — masks the problem, degrades UX |
| Require a bigger model | Premature elevation — 7B is the deployment target |
| Skip theme extraction on large docs | Subordinating the goal to the constraint instead of the other way around |
TRIZ connection: This is a Segmentation resolution (TRIZ Principle #1). The contradiction is: we want large documents processed BUT we have small context windows. Segmentation resolves it by dividing the input into independent parts that each fit the constraint, then deduplicating results across chunks.
See: Provider-Aware Chunking for the full implementation pattern and chunk size guidelines by provider class.
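The Segmentation resolution described above can be sketched as follows. This is a simplified illustration, not the project implementation: the chunk size matches the case study, but `extract_chunk` is a stand-in for the provider call and the deduplication rule is invented.

```python
CHUNK_SIZE = 8_000   # the EXPLOIT step: sized to fit the 120s inference budget

def chunk_text(text, size=CHUNK_SIZE):
    """Segmentation (TRIZ #1): split the document into parts that each
    fit the provider's constraint and can be processed independently."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def extract_themes(text, extract_chunk):
    """Run theme extraction per chunk, then deduplicate across chunks.
    `extract_chunk` stands in for the single-request provider call,
    keeping the Provider chunk-unaware as the SUBORDINATE step requires."""
    seen, themes = set(), []
    for chunk in chunk_text(text):
        for theme in extract_chunk(chunk):
            key = theme.strip().lower()     # illustrative dedup rule
            if key not in seen:
                seen.add(key)
                themes.append(theme)
    return themes
```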
### Constraint: Anonymization on Large Documents (PII-First Resolution)

Date: 2026-03-04 | Component: app/mockup/anonymizer.py, app/mockup/field_input.py

The REPEAT step above identified anonymization as the next bottleneck. The AnonymizationFeature sent the full 58k-character document to Ollama 7B in a single shot — same 120s timeout, same failure.
Why Segmentation was wrong here: Unlike theme extraction, anonymization requires entity consistency across the entire document. Chunking would require a shared state machine or merge pass to ensure "Jean Dupont" gets the same placeholder in every chunk. The complexity and failure modes (inconsistent anonymization) outweighed the benefit.
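To see why consistency matters, here is a sketch of document-wide placeholder mapping. One map shared across the whole document gives "Jean Dupont" the same placeholder everywhere; per-chunk processing would only reproduce this with a shared state machine or merge pass. The placeholder format and matching rule are illustrative, not the project's.

```python
def anonymize(text, entities):
    """Replace each detected entity with a stable placeholder. The single
    `placeholders` map spans the WHOLE document, so every mention of an
    entity maps to the same token."""
    placeholders = {}
    for entity in entities:
        if entity not in placeholders:
            placeholders[entity] = f"[PERSON_{len(placeholders) + 1}]"
        text = text.replace(entity, placeholders[entity])
    return text, placeholders
```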
Applying the Five Focusing Steps (second cycle):
| Step | Analysis | Action |
|---|---|---|
| 1. IDENTIFY | Same constraint: local model inference cannot handle 58k chars within 120s | Confirmed via logs |
| 2. EXPLOIT | Use existing Opik Presidio NLP guardrail (validate_no_pii) as a detection pass — no LLM, no context limit, 200ms | Refactored validate_no_pii into detect_pii_entities + apply_pii_replacements |
| 3. SUBORDINATE | LLM enrichment becomes conditional: skip if NLP found entities AND text >15k chars | skip_llm heuristic in _apply_anonymization() |
| 4. ELEVATE | For small docs, LLM still runs (extracts keywords, distinguishes institutions from PII) | anonymization_type: "pii+llm" for small docs |
| 5. REPEAT | Next constraint: keyword extraction quality (NLP can't distinguish semantic keywords like "Mairie" from PII) | Backlog item |
TRIZ connection: This is a Prior Action resolution (TRIZ Principle #10), not Segmentation. The contradiction is: the system must be intelligent (context-aware PII detection) AND fast (no timeout on large docs). Prior Action resolves it by performing the bulk of the work in advance via NLP, making the LLM optional.
See: When the Bottleneck Moves for the full narrative.
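The PII-first flow above can be sketched as follows: the fast NLP pass always runs, and LLM enrichment is skipped when NLP already found entities and the document is large. The callables here are illustrative stand-ins for the project's `detect_pii_entities`, `apply_pii_replacements`, and Ollama enrichment; the threshold matches the case study but the function shape is an assumption.

```python
LLM_SIZE_LIMIT = 15_000   # above this, the 7B model risks the 120s timeout

def anonymize_document(text, detect_pii, apply_replacements, llm_enrich=None):
    """Prior Action (TRIZ #10): the fast NLP pass does the bulk of the work
    up front, so the LLM pass becomes optional for large documents."""
    entities = detect_pii(text)                   # cheap, no context limit
    result = apply_replacements(text, entities)
    # skip_llm heuristic: NLP already found PII AND the doc is too large
    skip_llm = bool(entities) and len(text) > LLM_SIZE_LIMIT
    if llm_enrich is not None and not skip_llm:
        result = llm_enrich(result)               # small docs get "pii+llm"
        return result, "pii+llm"
    return result, "pii"
```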
## Integration with TRIZ

The Theory of Constraints identifies *what* to solve; TRIZ provides *how* to solve it.
When facing a constraint that seems unsolvable, apply TRIZ contradiction resolution:
- If constraint is "resource vs. ambition" → See TRIZ patterns
- If constraint is "input size vs. model capacity" → TRIZ Principle #1 (Segmentation) + #24 (Intermediary)
## Resources

- Goldratt, E. M. (1984). *The Goal: A Process of Ongoing Improvement*
- ToC Institute