VERA
8 signal types · regex · process-type gated

How Compliance Signals Work

We scan NEPA documents for patterns that indicate litigation risk. Every flag shows the exact text that triggered it and where it appears—no black box.

What they are

Compliance signals are named risk flags aligned to real NEPA litigation. Courts often overturn projects for the same kinds of defects: deferred mitigation, missing environmental justice analysis, weak no-action alternative discussion, approval contingent on studies not yet done. We run eight detectors over the project’s main document(s); each hit is stored with a verbatim excerpt and character offset so you can see exactly what triggered it.
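The per-flag record described above (flag type, verbatim excerpt, character offset) can be sketched as a small dataclass. Names here are illustrative, not the actual schema:

```python
import re
from dataclasses import dataclass

# Hypothetical shape of a stored flag; field names are illustrative.
@dataclass
class Flag:
    flag_type: str    # e.g. "deferred_mitigation"
    severity: str     # "high" | "medium" | "info"
    excerpt: str      # verbatim text that matched
    char_offset: int  # position of the match in the document

def run_detector(flag_type: str, severity: str,
                 pattern: re.Pattern, text: str) -> list[Flag]:
    # Every regex hit is stored with its verbatim excerpt and offset,
    # so the UI can show exactly what triggered the flag.
    return [Flag(flag_type, severity, m.group(0), m.start())
            for m in pattern.finditer(text)]
```

Because the excerpt and offset come straight from the match object, nothing is summarized or paraphrased on the way into storage.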

The eight flag types

| Flag | Severity | What it catches |
| --- | --- | --- |
| deferred_mitigation | High | Mitigation pushed to “final design,” “future phases,” or “subsequent documentation.” |
| future_studies_reliance | High | Approval contingent on studies not yet completed. |
| ej_absent | High | Explicit statement that environmental justice is not addressed (EA/EIS only). |
| ej_thin_coverage | Medium | EJ mentioned only in passing or deferred to future NEPA (EA/EIS only). |
| no_action_absent | High | No-action alternative missing or not included (EA/EIS only). |
| no_action_thin | Medium | No-action alternative summarily dismissed or not adequately compared (EA/EIS only). |
| cumulative_impacts_thin | Medium | Cumulative impacts not meaningfully analyzed or deferred (EA/EIS only). |
| tribal_interests | Info | Tribal consultation or tribal interests mentioned—review for completeness. |
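A detector set like this can be pictured as a registry mapping each flag type to a severity and a compiled pattern. The patterns below are simplified illustrations, not the production regexes, which are broader and tuned against the test suite:

```python
import re

# Illustrative patterns only; severities mirror the table above.
DETECTORS = {
    "deferred_mitigation": ("high", re.compile(
        r"mitigation[^.]{0,80}?"
        r"(?:final design|future phases|subsequent documentation)", re.I)),
    "future_studies_reliance": ("high", re.compile(
        r"(?:pending|contingent (?:up)?on|subject to)[^.]{0,60}?"
        r"(?:stud(?:y|ies)|survey)", re.I)),
    "tribal_interests": ("info", re.compile(
        r"tribal (?:consultation|interests?)", re.I)),
}

def scan(text: str):
    """Run every detector and collect (flag_type, severity, excerpt, offset)."""
    hits = []
    for flag_type, (severity, pattern) in DETECTORS.items():
        for m in pattern.finditer(text):
            hits.append((flag_type, severity, m.group(0), m.start()))
    return hits
```

Bounding the middle of each pattern with `[^.]{0,80}?` keeps a match inside a single sentence, which is one way to favor precision.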

Process-type gating (CE vs EA/EIS)

Categorical Exclusions (CE) are approved under a different standard than EAs and EISs. CE templates typically don’t include environmental justice sections, no-action alternatives, or cumulative impacts analysis—by design. If we ran those detectors on CE text, we’d get meaningless noise.

So we gate five flag types to EA and EIS only: ej_absent, ej_thin_coverage, no_action_absent, no_action_thin, cumulative_impacts_thin. On CE documents we only run deferred_mitigation, future_studies_reliance, and tribal_interests. This keeps results relevant and cuts false positives.
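The gating rule is simple enough to state as a helper. This is a sketch, not the actual implementation; the flag names come from the table above:

```python
# Five EA/EIS-only flags are skipped when the process type is CE.
EA_EIS_ONLY = {
    "ej_absent", "ej_thin_coverage",
    "no_action_absent", "no_action_thin", "cumulative_impacts_thin",
}
ALL_FLAGS = EA_EIS_ONLY | {
    "deferred_mitigation", "future_studies_reliance", "tribal_interests",
}

def active_flags(process_type: str) -> set[str]:
    """Return the detectors to run for a given NEPA process type."""
    # process_type is "CE", "EA", or "EIS"
    if process_type == "CE":
        return ALL_FLAGS - EA_EIS_ONLY
    return ALL_FLAGS
```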

Why regex (not an LLM)?

We use deterministic regex patterns for three reasons:

- Determinism: the same document always produces the same flags, so results are reproducible and attestable.
- Auditability: every match traces to a verbatim excerpt and character offset—no black box.
- Testability: patterns and exclusions are tuned against an explicit test suite, so false-positive behavior is visible and fixable.

The tradeoff is recall: we might miss creatively worded risk. We optimize for precision so lenders can trust that a flag is a real signal, not a false alarm.

Exclusions (reducing false positives)

Some phrases look like risk but aren’t. Standard commitments such as “BMPs will be implemented prior to construction” use deferral-sounding language without actually deferring anything, so we explicitly exclude them.

Matches that fall inside these excluded phrases are dropped before creating a flag.
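One way to implement that drop step: compute the spans of every exclusion phrase in the document, then discard any detector match whose span falls entirely inside one. The exclusion phrase below is illustrative, drawn from the should-not-fire test case:

```python
import re

# Hypothetical exclusion list; the real list is longer.
EXCLUSIONS = [
    re.compile(r"BMPs will be implemented prior to construction", re.I),
]

def filter_matches(matches, text):
    """Drop any match whose span lies inside an excluded phrase."""
    excluded_spans = [m.span()
                     for pat in EXCLUSIONS
                     for m in pat.finditer(text)]
    kept = []
    for m in matches:
        start, end = m.span()
        inside = any(s <= start and end <= e for s, e in excluded_spans)
        if not inside:
            kept.append(m)
    return kept
```

Filtering happens before flag creation, so excluded matches never reach the flags table at all.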

Deduplication and scan flow

We deduplicate by (flag_type, normalized excerpt) so the same phrase doesn’t create multiple identical flags. The scan runs only on the project’s main documents (for an EIS we prefer the FEIS or DEIS as the analysis document). Results are stored in the flags table with project_id, document_id, excerpt, and char_offset so the UI (and attestation) can reference them precisely.
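A minimal sketch of that dedup key, assuming normalization means lowercasing and collapsing whitespace (the exact normalization is an assumption):

```python
def dedupe(flags):
    """Keep one flag per (flag_type, normalized excerpt).

    flags: iterable of (flag_type, excerpt, char_offset) tuples.
    """
    seen, unique = set(), []
    for flag_type, excerpt, offset in flags:
        # Normalize: lowercase and collapse runs of whitespace.
        key = (flag_type, " ".join(excerpt.lower().split()))
        if key not in seen:
            seen.add(key)
            unique.append((flag_type, excerpt, offset))
    return unique
```

The first occurrence wins, so the stored offset points at the earliest place the phrase appears.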

Test suite

We maintain tests/test_signals.py with synthetic should-fire and should-not-fire cases for each flag. For example: a sentence like “mitigation measures will be developed prior to final design” should fire deferred_mitigation; “BMPs will be implemented prior to construction” should not. The suite has 18 tests and runs on every change so we don’t regress.
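A should-fire / should-not-fire pair in the style of tests/test_signals.py might look like the sketch below. The pattern, exclusion, and function names are illustrative, and the exclusion check here is simplified to a whole-text search rather than the span-based filtering used in the real pipeline:

```python
import re

DEFERRED = re.compile(
    r"mitigation[^.]{0,80}?(?:prior to )?final design", re.I)
EXCLUDE = re.compile(
    r"BMPs will be implemented prior to construction", re.I)

def fires(text: str) -> bool:
    # Simplified: fire if the pattern matches and no exclusion is present.
    return bool(DEFERRED.search(text)) and not EXCLUDE.search(text)

def test_deferred_mitigation_fires():
    assert fires(
        "Mitigation measures will be developed prior to final design.")

def test_bmp_language_does_not_fire():
    assert not fires(
        "BMPs will be implemented prior to construction.")
```

Each detector gets both kinds of case, so a pattern change that widens recall at the cost of precision fails loudly.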

Key files to look at

In one sentence

Eight regex-based detectors look for litigation-relevant patterns in NEPA text; process-type gating and explicit exclusions keep false positives low; every flag stores the exact excerpt and offset so the result is auditable and attestable.