
Theory of Constraints (ToC)

Overview

The Theory of Constraints is a management philosophy that focuses on identifying and managing the most limiting factor (constraint) that stands in the way of achieving a goal.

Core Concept

"A chain is no stronger than its weakest link."

In any system, there is always ONE constraint that limits the system's output. Improving anything other than the constraint is a waste of resources.

The Five Focusing Steps

  1. IDENTIFY the constraint
  2. EXPLOIT the constraint (maximize its efficiency)
  3. SUBORDINATE everything else to the constraint
  4. ELEVATE the constraint (add capacity)
  5. REPEAT - find the new constraint
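The five steps form a loop, not a one-time checklist: once the constraint moves, the cycle restarts. A toy Python sketch of that cycle (all names and numbers hypothetical, for illustration only):

```python
def throughput(stages):
    """System output is capped by its weakest stage (the constraint)."""
    return min(stages.values())

def five_focusing_steps(stages, cycles=3):
    """Run the ToC improvement cycle on a dict of stage -> capacity."""
    for _ in range(cycles):
        constraint = min(stages, key=stages.get)            # 1. IDENTIFY the weakest link
        stages[constraint] = int(stages[constraint] * 1.2)  # 2. EXPLOIT: better use of existing capacity
        # 3. SUBORDINATE: other stages pace themselves to the constraint (a no-op in this toy model)
        stages[constraint] += 10                            # 4. ELEVATE: invest in new capacity
        # 5. REPEAT: the next iteration re-identifies the (possibly new) constraint
    return stages

stages = {"scraping": 40, "ocr": 25, "indexing": 60}  # docs/hour per stage (illustrative)
five_focusing_steps(stages)
print("system throughput:", throughput(stages))
```

Note that improving "indexing" would never raise `throughput` until it becomes the minimum, which is the core claim of the theory.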

Application to OCapistaine

Constraint: Data Acquisition

In our project, the primary constraint is data acquisition from municipal sources:

| Constraint | Impact | Resolution |
|---|---|---|
| Firecrawl API rate limits | Slows document collection | Batch processing, caching |
| OCR processing time | Bottleneck for PDFs | Parallel processing |
| Municipal website structure | Requires custom extraction | Adaptive scraping |
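The first row's resolution (batch processing plus caching) can be sketched as follows. The scrape function, batch size, and rate window are placeholders, not the actual Firecrawl client or its limits:

```python
import time

CACHE = {}            # url -> document text: repeat scrapes cost no API credits
BATCH_SIZE = 5        # hypothetical rate limit: 5 calls per window
WINDOW_SECONDS = 60   # hypothetical window length

def fetch_batch(urls, scrape_fn):
    """Fetch URLs in rate-limit-sized batches, skipping anything already cached."""
    pending = [u for u in urls if u not in CACHE]
    for i in range(0, len(pending), BATCH_SIZE):
        for url in pending[i:i + BATCH_SIZE]:
            CACHE[url] = scrape_fn(url)
        if i + BATCH_SIZE < len(pending):   # more batches ahead: wait out the rate window
            time.sleep(WINDOW_SECONDS)
    return {u: CACHE[u] for u in urls}
```

On a second call with the same URLs, every request is a cache hit and no API credit is spent.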

Budget Constraints

For civic projects with limited budgets:

| Resource | Constraint | Strategy |
|---|---|---|
| API credits | Limited Firecrawl calls | Prioritize high-value sources |
| Compute | Local processing limits | Cloud bursting for peaks |
| Time | 4-week hackathon deadline | Focus on MVP features |

Constraint: Provider Inference Speed (Chunking Case Study)

Date: 2026-03-04 | Component: app/mockup/field_input.py

A 58,880-character municipal meeting transcript failed to process: all 5 chunks timed out with empty error messages when sent to Ollama deepseek-r1:7b for theme extraction.

Applying the Five Focusing Steps:

| Step | Analysis | Action |
|---|---|---|
| 1. IDENTIFY | The constraint is local model inference speed: not the code, not the data, not the network. A 7B model with `json_mode=True` cannot process 12k-char chunks within 120s. | Profiled logs: exactly 120s per chunk = timeout |
| 2. EXPLOIT | Reduce chunk size (15k → 8k) to fit within the existing 120s budget. No new hardware, no timeout increase, no model change. | Changed `CHUNK_SIZE` constant |
| 3. SUBORDINATE | The Feature layer (chunking) adapts to Provider constraints, not the other way around. The Provider stays chunk-unaware; it only sees single requests. | ABC pattern |
| 4. ELEVATE | Documented chunk-size guidelines by provider class. When a faster model is deployed, chunk sizes can safely increase. | Guidelines table in AGENTS_FRAMEWORK.md |
| 5. REPEAT | Next constraint identified: anonymization sends the full 58k text in one shot to Ollama (no chunking). Same bottleneck, different feature. | Resolved (see below) |

What we did NOT do (and why):

| Temptation | Why resisted |
|---|---|
| Increase timeout to 300s | Throwing resources at the constraint: masks the problem, degrades UX |
| Require a bigger model | Premature elevation: 7B is the deployment target |
| Skip theme extraction on large docs | Subordinating the goal to the constraint instead of the other way around |

TRIZ connection: This is a Segmentation resolution (TRIZ Principle #1). The contradiction is: we want large documents processed BUT we have small context windows. Segmentation resolves it by dividing the input into independent parts that each fit the constraint, then deduplicating results across chunks.
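The Segmentation pattern described above can be sketched in a few lines: split on a chunk-size budget, extract per chunk, deduplicate across chunks. The extractor here is a stand-in, not the actual Ollama call, and the constant values mirror the case study:

```python
CHUNK_SIZE = 8_000   # reduced from 15_000 so a 7B model answers within the 120 s budget

def split_text(text, size=CHUNK_SIZE):
    """Divide the document into independent parts that each fit the constraint."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def extract_themes(text, extract_fn):
    """Run the per-chunk extractor, then deduplicate themes across chunks."""
    themes = []
    for chunk in split_text(text):
        for theme in extract_fn(chunk):
            if theme not in themes:          # dedup, preserving first-seen order
                themes.append(theme)
    return themes
```

The Provider only ever sees single `extract_fn(chunk)` requests, which is the SUBORDINATE step: chunking lives entirely in the Feature layer.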

See: Provider-Aware Chunking for the full implementation pattern and chunk size guidelines by provider class.

Constraint: Anonymization on Large Documents (PII-First Resolution)

Date: 2026-03-04 | Component: app/mockup/anonymizer.py, app/mockup/field_input.py

The REPEAT step above identified anonymization as the next bottleneck. The AnonymizationFeature sent the full 58k document to Ollama 7B in a single shot — same 120s timeout, same failure.

Why Segmentation was wrong here: Unlike theme extraction, anonymization requires entity consistency across the entire document. Chunking would require a shared state machine or merge pass to ensure "Jean Dupont" gets the same placeholder in every chunk. The complexity and failure modes (inconsistent anonymization) outweighed the benefit.
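To make the consistency requirement concrete, here is the kind of shared registry that chunked anonymization would need, so an entity seen in any chunk always maps to the same placeholder. This is a sketch of the rejected approach, not the project's code:

```python
class PlaceholderRegistry:
    """Shared state every chunk must consult: one entity, one placeholder."""
    def __init__(self):
        self.mapping = {}

    def placeholder_for(self, entity, kind="PERSON"):
        if entity not in self.mapping:
            self.mapping[entity] = f"<{kind}_{len(self.mapping) + 1}>"
        return self.mapping[entity]

registry = PlaceholderRegistry()
chunk1 = "Jean Dupont a ouvert la séance."
chunk2 = "M. Jean Dupont a clos la séance."
for name in ["Jean Dupont"]:
    chunk1 = chunk1.replace(name, registry.placeholder_for(name))
    chunk2 = chunk2.replace(name, registry.placeholder_for(name))
```

Even this simple version breaks when a name straddles a chunk boundary or appears as "M. Dupont" in one chunk and "Jean Dupont" in another, which are exactly the failure modes that made Segmentation the wrong tool here.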

Applying the Five Focusing Steps (second cycle):

| Step | Analysis | Action |
|---|---|---|
| 1. IDENTIFY | Same constraint: local model inference cannot handle 58k chars within 120s | Confirmed via logs |
| 2. EXPLOIT | Use the existing Opik Presidio NLP guardrail (`validate_no_pii`) as a detection pass: no LLM, no context limit, ~200ms | Refactored `validate_no_pii` into `detect_pii_entities` + `apply_pii_replacements` |
| 3. SUBORDINATE | LLM enrichment becomes conditional: skip it if NLP found entities AND the text is >15k chars | `skip_llm` heuristic in `_apply_anonymization()` |
| 4. ELEVATE | For small docs, the LLM still runs (extracts keywords, distinguishes institutions from PII) | `anonymization_type: "pii+llm"` for small docs |
| 5. REPEAT | Next constraint: keyword extraction quality (NLP can't distinguish semantic keywords like "Mairie" from PII) | Backlog item |
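The EXPLOIT and SUBORDINATE steps combine into one small decision path. A hedged sketch of that path, where a regex stands in for the Presidio guardrail and the function names mirror, but do not reproduce, `app/mockup/anonymizer.py`:

```python
import re

SKIP_LLM_THRESHOLD = 15_000  # chars; above this, the LLM pass is skipped

def detect_pii_entities(text):
    """Fast NLP-style detection pass: no LLM, no context limit.
    (Stand-in regex; the project uses the Presidio guardrail here.)"""
    return [(m.group(), "PERSON")
            for m in re.finditer(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", text)]

def apply_pii_replacements(text, entities):
    """Replace each detected entity with a stable placeholder."""
    mapping = {}
    for entity, kind in entities:
        if entity not in mapping:
            mapping[entity] = f"<{kind}_{len(mapping) + 1}>"
    for entity, placeholder in mapping.items():
        text = text.replace(entity, placeholder)
    return text

def anonymize(text, llm_enrich=None):
    """PII-first pipeline: NLP detection always runs; the LLM only for small docs."""
    entities = detect_pii_entities(text)
    result = apply_pii_replacements(text, entities)
    skip_llm = bool(entities) and len(text) > SKIP_LLM_THRESHOLD
    if not skip_llm and llm_enrich is not None:
        result = llm_enrich(result)   # small docs: LLM adds keywords, disambiguates institutions
    return result, ("pii" if skip_llm else "pii+llm")
```

Prior Action in miniature: the cheap detection pass has already done the bulk of the work before the expensive LLM call is even considered.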

TRIZ connection: This is a Prior Action resolution (TRIZ Principle #10), not Segmentation. The contradiction is: the system must be intelligent (context-aware PII detection) AND fast (no timeout on large docs). Prior Action resolves it by performing the bulk of the work in advance via NLP, making the LLM optional.

See: When the Bottleneck Moves for the full narrative.

Integration with TRIZ

Theory of Constraints identifies what to solve. TRIZ provides how to solve it.

When facing a constraint that seems unsolvable, apply TRIZ contradiction resolution:

  • If constraint is "resource vs. ambition" → See TRIZ patterns
  • If constraint is "input size vs. model capacity" → TRIZ Principle #1 (Segmentation) + #24 (Intermediary)

Resources

  • Goldratt, E. (1984). The Goal: A Process of Ongoing Improvement
  • ToC Institute