
Theory of Constraints (ToC)

Overview

The Theory of Constraints is a management philosophy that focuses on identifying and managing the most limiting factor (constraint) that stands in the way of achieving a goal.

Core Concept

"A chain is no stronger than its weakest link."

In any system, there is always ONE constraint that limits the system's output. Improving anything other than the constraint is a waste of resources.

The Five Focusing Steps

  1. IDENTIFY the constraint
  2. EXPLOIT the constraint (maximize its efficiency)
  3. SUBORDINATE everything else to the constraint
  4. ELEVATE the constraint (add capacity)
  5. REPEAT - find the new constraint
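The five steps form a loop, not a one-time checklist: once the constraint moves, the cycle restarts. A toy Python sketch of that cycle (all names and numbers hypothetical, for illustration only):

```python
def throughput(stages):
    """System output is capped by its weakest stage (the constraint)."""
    return min(stages.values())

def five_focusing_steps(stages, cycles=3):
    """Run the ToC improvement cycle on a dict of stage -> capacity."""
    for _ in range(cycles):
        constraint = min(stages, key=stages.get)            # 1. IDENTIFY the weakest link
        stages[constraint] = int(stages[constraint] * 1.2)  # 2. EXPLOIT: better use of existing capacity
        # 3. SUBORDINATE: other stages pace themselves to the constraint (a no-op in this toy model)
        stages[constraint] += 10                            # 4. ELEVATE: invest in new capacity
        # 5. REPEAT: the next iteration re-identifies the (possibly new) constraint
    return stages

stages = {"scraping": 40, "ocr": 25, "indexing": 60}  # docs/hour per stage (illustrative)
five_focusing_steps(stages)
print("system throughput:", throughput(stages))
```

Note that improving "indexing" would never raise `throughput` until it becomes the minimum, which is the core claim of the theory.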

Application to OCapistaine

Constraint: Data Acquisition

In our project, the primary constraint is data acquisition from municipal sources:

| Constraint | Impact | Resolution |
|---|---|---|
| Firecrawl API rate limits | Slows document collection | Batch processing, caching |
| OCR processing time | Bottleneck for PDFs | Parallel processing |
| Municipal website structure | Requires custom extraction | Adaptive scraping |
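The first row's resolution (batch processing plus caching) can be sketched as follows. The scrape function, batch size, and rate window are placeholders, not the actual Firecrawl client or its limits:

```python
import time

CACHE = {}            # url -> document text: repeat scrapes cost no API credits
BATCH_SIZE = 5        # hypothetical rate limit: 5 calls per window
WINDOW_SECONDS = 60   # hypothetical window length

def fetch_batch(urls, scrape_fn):
    """Fetch URLs in rate-limit-sized batches, skipping anything already cached."""
    pending = [u for u in urls if u not in CACHE]
    for i in range(0, len(pending), BATCH_SIZE):
        for url in pending[i:i + BATCH_SIZE]:
            CACHE[url] = scrape_fn(url)
        if i + BATCH_SIZE < len(pending):   # more batches ahead: wait out the rate window
            time.sleep(WINDOW_SECONDS)
    return {u: CACHE[u] for u in urls}
```

On a second call with the same URLs, every request is a cache hit and no API credit is spent.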

Budget Constraints

For civic projects with limited budgets:

| Resource | Constraint | Strategy |
|---|---|---|
| API credits | Limited Firecrawl calls | Prioritize high-value sources |
| Compute | Local processing limits | Cloud bursting for peaks |
| Time | 4-week hackathon deadline | Focus on MVP features |

Constraint: Provider Inference Speed (Chunking Case Study)

Date: 2026-03-04 | Component: app/mockup/field_input.py

A 58,880-character municipal meeting transcript failed to process: all 5 chunks timed out with empty error messages when sent to Ollama deepseek-r1:7b for theme extraction.

Applying the Five Focusing Steps:

| Step | Analysis | Action |
|---|---|---|
| 1. IDENTIFY | The constraint is local model inference speed: not the code, not the data, not the network. A 7B model with `json_mode=True` cannot process 12k-char chunks within 120s. | Profiled logs: exactly 120s per chunk = timeout |
| 2. EXPLOIT | Reduce chunk size (15k → 8k) to fit within the existing 120s budget. No new hardware, no timeout increase, no model change. | Changed `CHUNK_SIZE` constant |
| 3. SUBORDINATE | The Feature layer (chunking) adapts to Provider constraints, not the other way around. The Provider stays chunk-unaware; it only sees single requests. | ABC pattern |
| 4. ELEVATE | Documented chunk-size guidelines by provider class. When a faster model is deployed, chunk sizes can safely increase. | Guidelines table in AGENTS_FRAMEWORK.md |
| 5. REPEAT | Next constraint identified: anonymization sends the full 58k text in one shot to Ollama (no chunking). Same bottleneck, different feature. | Resolved (see below) |

What we did NOT do (and why):

| Temptation | Why resisted |
|---|---|
| Increase timeout to 300s | Throwing resources at the constraint: masks the problem, degrades UX |
| Require a bigger model | Premature elevation: 7B is the deployment target |
| Skip theme extraction on large docs | Subordinating the goal to the constraint instead of the other way around |

TRIZ connection: This is a Segmentation resolution (TRIZ Principle #1). The contradiction is: we want large documents processed BUT we have small context windows. Segmentation resolves it by dividing the input into independent parts that each fit the constraint, then deduplicating results across chunks.
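The Segmentation pattern described above can be sketched in a few lines: split on a chunk-size budget, extract per chunk, deduplicate across chunks. The extractor here is a stand-in, not the actual Ollama call, and the constant values mirror the case study:

```python
CHUNK_SIZE = 8_000   # reduced from 15_000 so a 7B model answers within the 120 s budget

def split_text(text, size=CHUNK_SIZE):
    """Divide the document into independent parts that each fit the constraint."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def extract_themes(text, extract_fn):
    """Run the per-chunk extractor, then deduplicate themes across chunks."""
    themes = []
    for chunk in split_text(text):
        for theme in extract_fn(chunk):
            if theme not in themes:          # dedup, preserving first-seen order
                themes.append(theme)
    return themes
```

The Provider only ever sees single `extract_fn(chunk)` requests, which is the SUBORDINATE step: chunking lives entirely in the Feature layer.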

See: Provider-Aware Chunking for the full implementation pattern and chunk size guidelines by provider class.

Constraint: Anonymization on Large Documents (PII-First Resolution)

Date: 2026-03-04 | Component: app/mockup/anonymizer.py, app/mockup/field_input.py

The REPEAT step above identified anonymization as the next bottleneck. The AnonymizationFeature sent the full 58k document to Ollama 7B in a single shot — same 120s timeout, same failure.

Why Segmentation was wrong here: Unlike theme extraction, anonymization requires entity consistency across the entire document. Chunking would require a shared state machine or merge pass to ensure "Jean Dupont" gets the same placeholder in every chunk. The complexity and failure modes (inconsistent anonymization) outweighed the benefit.
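To make the consistency requirement concrete, here is the kind of shared registry that chunked anonymization would need, so an entity seen in any chunk always maps to the same placeholder. This is a sketch of the rejected approach, not the project's code:

```python
class PlaceholderRegistry:
    """Shared state every chunk must consult: one entity, one placeholder."""
    def __init__(self):
        self.mapping = {}

    def placeholder_for(self, entity, kind="PERSON"):
        if entity not in self.mapping:
            self.mapping[entity] = f"<{kind}_{len(self.mapping) + 1}>"
        return self.mapping[entity]

registry = PlaceholderRegistry()
chunk1 = "Jean Dupont a ouvert la séance."
chunk2 = "M. Jean Dupont a clos la séance."
for name in ["Jean Dupont"]:
    chunk1 = chunk1.replace(name, registry.placeholder_for(name))
    chunk2 = chunk2.replace(name, registry.placeholder_for(name))
```

Even this simple version breaks when a name straddles a chunk boundary or appears as "M. Dupont" in one chunk and "Jean Dupont" in another, which are exactly the failure modes that made Segmentation the wrong tool here.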

Applying the Five Focusing Steps (second cycle):

| Step | Analysis | Action |
|---|---|---|
| 1. IDENTIFY | Same constraint: local model inference cannot handle 58k chars within 120s | Confirmed via logs |
| 2. EXPLOIT | Use the existing Opik Presidio NLP guardrail (`validate_no_pii`) as a detection pass: no LLM, no context limit, ~200ms | Refactored `validate_no_pii` into `detect_pii_entities` + `apply_pii_replacements` |
| 3. SUBORDINATE | LLM enrichment becomes conditional: skip it if NLP found entities AND the text is >15k chars | `skip_llm` heuristic in `_apply_anonymization()` |
| 4. ELEVATE | For small docs, the LLM still runs (extracts keywords, distinguishes institutions from PII) | `anonymization_type: "pii+llm"` for small docs |
| 5. REPEAT | Next constraint: keyword extraction quality (NLP can't distinguish semantic keywords like "Mairie" from PII) | Backlog item |
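The EXPLOIT and SUBORDINATE steps combine into one small decision path. A hedged sketch of that path, where a regex stands in for the Presidio guardrail and the function names mirror, but do not reproduce, `app/mockup/anonymizer.py`:

```python
import re

SKIP_LLM_THRESHOLD = 15_000  # chars; above this, the LLM pass is skipped

def detect_pii_entities(text):
    """Fast NLP-style detection pass: no LLM, no context limit.
    (Stand-in regex; the project uses the Presidio guardrail here.)"""
    return [(m.group(), "PERSON")
            for m in re.finditer(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", text)]

def apply_pii_replacements(text, entities):
    """Replace each detected entity with a stable placeholder."""
    mapping = {}
    for entity, kind in entities:
        if entity not in mapping:
            mapping[entity] = f"<{kind}_{len(mapping) + 1}>"
    for entity, placeholder in mapping.items():
        text = text.replace(entity, placeholder)
    return text

def anonymize(text, llm_enrich=None):
    """PII-first pipeline: NLP detection always runs; the LLM only for small docs."""
    entities = detect_pii_entities(text)
    result = apply_pii_replacements(text, entities)
    skip_llm = bool(entities) and len(text) > SKIP_LLM_THRESHOLD
    if not skip_llm and llm_enrich is not None:
        result = llm_enrich(result)   # small docs: LLM adds keywords, disambiguates institutions
    return result, ("pii" if skip_llm else "pii+llm")
```

Prior Action in miniature: the cheap detection pass has already done the bulk of the work before the expensive LLM call is even considered.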

TRIZ connection: This is a Prior Action resolution (TRIZ Principle #10), not Segmentation. The contradiction is: the system must be intelligent (context-aware PII detection) AND fast (no timeout on large docs). Prior Action resolves it by performing the bulk of the work in advance via NLP, making the LLM optional.

See: When the Bottleneck Moves for the full narrative.

Integration with TRIZ

Theory of Constraints identifies what to solve. TRIZ provides how to solve it.

When facing a constraint that seems unsolvable, apply TRIZ contradiction resolution:

  • If constraint is "resource vs. ambition" → See TRIZ patterns
  • If constraint is "input size vs. model capacity" → TRIZ Principle #1 (Segmentation) + #24 (Intermediary)

Resources

  • Goldratt, E. (1984). The Goal: A Process of Ongoing Improvement
  • ToC Institute