Features

Feature 1: Charter Validation

Validates contributions against the charter rules.

| Property | Value |
| --- | --- |
| Class | CharterValidationFeature |
| Prompt | forseti.charter_validation |
| Opik Name | forseti461-user-charter-validation |
| Variables | title, body |
| Output | ValidationResult |
| Registered | Auto (always) |

Checks for violations:

  • Personal attacks or discrimination
  • Spam or advertising
  • Off-topic content (unrelated to Audierne-Esquibien)
  • False information

Identifies encouraged aspects:

  • Concrete proposals
  • Constructive criticism
  • Questions and clarifications
  • Shared expertise

Feature 2: Category Classification

Assigns contributions to one of 7 predefined categories.

| Property | Value |
| --- | --- |
| Class | CategoryClassificationFeature |
| Prompt | forseti.category_classification |
| Opik Name | forseti461-user-category-classification |
| Variables | title, body, current_category_line |
| Output | ClassificationResult |
| Registered | Auto (always) |

Categories:

  • economie - Commerce, tourism, port, fishing
  • logement - Housing, urban planning
  • culture - Heritage, events, associations
  • ecologie - Environment, energy, biodiversity
  • associations - Local organizations
  • jeunesse - Youth, schools, education
  • alimentation-bien-etre-soins - Food, health, wellness
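Because the classifier must return exactly one of these seven slugs, it can help to check model output against a fixed table. The sketch below copies the slugs and descriptions from the list above; the helper name `normalize_category` is hypothetical and is not part of the documented feature.

```python
from typing import Optional

# Slugs and descriptions copied from the category list above.
CATEGORIES = {
    "economie": "Commerce, tourism, port, fishing",
    "logement": "Housing, urban planning",
    "culture": "Heritage, events, associations",
    "ecologie": "Environment, energy, biodiversity",
    "associations": "Local organizations",
    "jeunesse": "Youth, schools, education",
    "alimentation-bien-etre-soins": "Food, health, wellness",
}


def normalize_category(raw: str) -> Optional[str]:
    """Return a known category slug, or None if the value is not one of the 7.

    Hypothetical helper: trims whitespace and lowercases before the lookup.
    """
    slug = raw.strip().lower()
    return slug if slug in CATEGORIES else None
```

Returning `None` rather than a default category leaves the choice of fallback to the caller.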

Feature 3: Wording Correction

Suggests improvements for clarity and constructiveness.

| Property | Value |
| --- | --- |
| Class | WordingCorrectionFeature |
| Prompt | forseti.wording_correction |
| Opik Name | forseti461-user-wording-correction |
| Variables | title, body |
| Output | WordingResult |
| Registered | Optional (enable_wording=True) |

Improvements include:

  • Clarity and readability
  • Grammar corrections
  • More constructive phrasing
  • Removing inflammatory language

Feature 4: Anonymization

PII (Personally Identifiable Information) protection with three complementary modes.

| Property | Value |
| --- | --- |
| Class | AnonymizationFeature |
| Prompt | forseti.anonymization |
| Variables | text |
| Output | AnonymizationResult |
| Registered | Manual (standalone execution) |

Three modes (the Anonymization Trilemma):

Transcript Mode (Deterministic, Regex)

For Plaud AI and timestamped meeting transcripts. Free, instant, deterministic:

00:00:00 Florent Lardic  → 00:00:00 Speaker_1
00:05:23 Malika Redaouia → 00:05:23 Speaker_2

  • Consistent Speaker_N assignment across the document
  • Fuzzy matching for spelling variations (Levenshtein distance)
  • Inline mention replacement
  • Cost: Free | Speed: Instant | Accuracy: High for structured formats
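The behaviour described above (consistent Speaker_N labels, fuzzy name matching, inline mention replacement) can be sketched in a few lines. This is a minimal illustration, not the code in app/mockup/anonymizer.py: the header shape `HH:MM:SS Full Name` alone on a line, the function names, and the `max_distance` threshold of 2 are all assumptions.

```python
import re

# Assumed header shape: "HH:MM:SS Full Name" alone on a line.
SPEAKER_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2})[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def anonymize_transcript(text: str, max_distance: int = 2) -> str:
    """Replace speaker headers and inline mentions with stable Speaker_N labels."""
    labels = {}  # every spelling seen so far -> "Speaker_N"

    def label_for(name: str) -> str:
        # Reuse the label of any known spelling within the edit-distance threshold.
        found = next(
            (label for known, label in labels.items()
             if levenshtein(name.lower(), known.lower()) <= max_distance),
            None,
        )
        label = found or f"Speaker_{len(set(labels.values())) + 1}"
        labels[name] = label
        return label

    out = SPEAKER_LINE.sub(lambda m: f"{m.group(1)} {label_for(m.group(2))}", text)
    # Inline mentions: longest spellings first so partial overlaps are not clobbered.
    for name in sorted(labels, key=len, reverse=True):
        out = out.replace(name, labels[name])
    return out
```

A fixed threshold of 2 can merge very short names that are genuinely different, so a real implementation would likely scale the threshold with name length.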

LLM Mode (Experimental)

For general documents requiring context understanding:

  • Names → [PERSONNE_N], emails → [EMAIL_N], phones → [TELEPHONE_N], addresses → [ADRESSE_N]
  • Extracts non-PII keywords (organizations, places) for theme analysis
  • Cost: ~$0.001-0.003 per doc | Speed: 2-5s | Accuracy: High

Opik PII Guardrail as an alternative: opik.guardrails.guards.pii.PII provides NLP-based entity detection (50-100 ms, free) that may replace LLM mode more efficiently in many cases. It is currently used for post-processing validation, but could serve as the primary anonymizer for simpler documents. See the Anonymization Trilemma blog post for the cost/accuracy/speed tradeoffs.

Auto Mode (Default)

Detects document type and routes automatically:

  • Transcript with speaker names → Regex mode
  • Already anonymous → Skip
  • General document → LLM mode
  • All modes → PII validation (Opik Guardrail)
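The routing rules above amount to a small decision function. The sketch below is only one way such detection could work: the `Route` enum, the `choose_route` name, and the heuristics (a timestamped header line means "transcript", and headers that are all `Speaker_N` mean "already anonymous") are assumptions, not the documented detection logic. The final PII validation step is omitted.

```python
import re
from enum import Enum


class Route(Enum):
    REGEX = "regex"  # transcript with speaker names
    SKIP = "skip"    # already anonymous
    LLM = "llm"      # general document


# Assumed transcript header shape: "HH:MM:SS Name" at the start of a line.
TIMESTAMP_NAME = re.compile(r"^\d{2}:\d{2}:\d{2}[ \t]+(\S.*)$", re.MULTILINE)
ANON_LABEL = re.compile(r"^Speaker_\d+$")


def choose_route(text: str) -> Route:
    """Pick an anonymization mode from the document's shape (hypothetical heuristic)."""
    names = [m.group(1).strip() for m in TIMESTAMP_NAME.finditer(text)]
    if names and all(ANON_LABEL.match(name) for name in names):
        return Route.SKIP
    if names:
        return Route.REGEX
    return Route.LLM
```

Whatever route is chosen, the output would still pass through PII validation, as the last bullet above states.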

Entity types anonymized:

| Entity | Placeholder | Example |
| --- | --- | --- |
| Person | [PERSONNE_N] | Jean Dupont → [PERSONNE_1] |
| Email | [EMAIL_N] | [email protected] → [EMAIL_1] |
| Phone | [TELEPHONE_N] | 06 12 34 56 78 → [TELEPHONE_1] |
| Address | [ADRESSE_N] | 12 rue de la Paix → [ADRESSE_1] |

Preserved as keywords (not anonymized): organizations, public places, institutions.
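To make the numbering scheme concrete, here is a minimal sketch of how detected entities could be turned into numbered placeholders. It assumes detection has already happened (by the LLM or another detector) and yields `(entity_type, surface_text)` pairs. The function name, the entity-type keys, and the input shape are hypothetical; only the placeholder prefixes come from the table above. It requires Python 3.9+ for the built-in generic type hints.

```python
from collections import defaultdict

# Placeholder prefixes from the table above; the dictionary keys are assumed names.
PLACEHOLDER_PREFIX = {
    "person": "PERSONNE",
    "email": "EMAIL",
    "phone": "TELEPHONE",
    "address": "ADRESSE",
}


def apply_placeholders(
    text: str, entities: list[tuple[str, str]]
) -> tuple[str, dict[str, str]]:
    """Replace each detected entity with a numbered placeholder.

    The same surface text always maps to the same placeholder, and numbering
    is per entity type. Organizations and public places are simply not passed
    in, so they stay in the text as keywords.
    """
    counters = defaultdict(int)
    mapping = {}
    for entity_type, surface in entities:
        if surface in mapping:
            continue  # already numbered on first occurrence
        counters[entity_type] += 1
        mapping[surface] = f"[{PLACEHOLDER_PREFIX[entity_type]}_{counters[entity_type]}]"
    # Replace longer strings first so a short entity cannot split a longer one.
    for surface in sorted(mapping, key=len, reverse=True):
        text = text.replace(surface, mapping[surface])
    return text, mapping
```

Returning the mapping alongside the text lets a caller audit each replacement.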

Files:

| File | Description |
| --- | --- |
| app/agents/forseti/features/anonymization.py | LLM-based feature |
| app/mockup/anonymizer.py | Transcript regex, type detection, PII validation |
| app/mockup/field_input.py | Pipeline integration |

Feature 5: Translation (Available, Not Integrated)

Translates French contributions to English.

| Property | Value |
| --- | --- |
| Class | TranslationFeature |
| Variables | French constat + idees |
| Output | TranslationResult |
| Registered | Not integrated |
| Status | Defined, no active usage |

Batch Validation (Experiments)

Batch validation is handled as Opik experiments, not as a feature class. This allows:

  • A/B testing different prompt versions
  • Tracking metrics across validation runs
  • Comparing performance over time

See Prompt Management for experiment setup.

Feature Summary

| # | Feature | Registration | Status |
| --- | --- | --- | --- |
| 1 | Charter Validation | Auto | Active |
| 2 | Category Classification | Auto | Active |
| 3 | Wording Correction | Optional | Active if enabled |
| 4 | Anonymization | Manual | Active (standalone) |
| 5 | Translation | Not integrated | Available |
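The registration column can be read as a simple rule: two features are always on, one is gated by `enable_wording`, and the other two are never part of the automatic pipeline. The sketch below illustrates that rule only. The function `registered_features` is hypothetical, and the real agent presumably registers feature instances rather than class names.

```python
def registered_features(enable_wording: bool = False) -> list[str]:
    """Return the class names that run in the automatic pipeline (illustrative only)."""
    # Auto (always): features 1 and 2.
    features = ["CharterValidationFeature", "CategoryClassificationFeature"]
    # Optional: feature 3, gated by the enable_wording flag.
    if enable_wording:
        features.append("WordingCorrectionFeature")
    # AnonymizationFeature runs standalone and TranslationFeature is not
    # integrated, so neither is registered here.
    return features
```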