VERA

Verified Environmental Review & Attestation

NEPA compliance risk for lenders — in seconds, with proof on Solana

61,881 NEPA projects · 8 compliance signals · Actian VectorAI DB · Solana devnet · On-device LLM
1 / 15

The Problem

Every major U.S. infrastructure project must clear NEPA. Lenders need to know: Is this project's environmental review litigation-ready?

  • EIS documents run 500–5,000+ pages — impossible to read at deal volume
  • Courts overturn projects on predictable grounds: deferred mitigation, missing EJ analysis, thin no-action alternative
  • Borrower self-certification is unacceptable; legal review costs $5K–$15K and takes weeks
We built VERA so a lender can go from "nuclear power project" to "2 high-severity flags, here's the proof on Solana" in under 60 seconds — without reading the PDF.
2 / 15

Our Solution — Four Pillars

① Transparent compliance signals

Eight named risk flags aligned to real litigation patterns. Each flag shows the verbatim excerpt and char offset. Regex-based, testable, 18 pytest tests. No black box.

② On-chain attestation

Attest on Solana via SPL Memo. Hash is permanent. Anyone — lender, counterparty, regulator — can verify without trusting our backend.

③ Semantic search

Actian VectorAI DB + OpenAI embeddings. Ask in plain language across 61K+ projects. RAG answers grounded in retrieved excerpts only.

④ Intelligence layer

Project + global chat, AI flag explanations, FAST-41 Stuckness Radar with OPEF Copilot, NEPA Observatory. All LLM inference on-device — no document text leaves your machine.

3 / 15

What We Built

  • Data pipeline: NEPATEC2.0 ingest (CE/EA/EIS), process-type-aware FEIS/DEIS main-doc selection, SHA-256 per document, SQLite FTS5
  • Compliance signals: 8 flag types, process-type gating, false-positive exclusions, 18 pytest tests
  • Actian VectorAI DB: OpenAI embeddings, K-NN search, RAG Q&A; numpy fallback (same API) when Actian unavailable
  • Solana: SHA-256 of flags + doc hashes on-chain via SPL Memo; trustless third-party verify
  • AI chat: Project-scoped + global RAG chat; "Explain with AI" per flag; LLM audit log
  • Observatory: Corpus-level stats, by state, by agency, by process type; cached at startup
  • Stuckness Radar: FAST-41 stuckness scores, choropleth map, OPEF Copilot narration
  • MCP server: VERA data exposed as tools for MCP clients (Claude Desktop, etc.)
  • Explainer pages: In-app how-it-works docs for every major feature
  • API: FastAPI 20+ endpoints; OpenAPI at /docs
4 / 15

Core Flow

From search to tamper-proof compliance record:

Search projects → Open project → Scan → See flags + excerpts → Explain with AI → Attest on Solana → Verify

Side by side with the main flow:

Chat (project or global)

Ask any question about a project's documents. Retrieval-augmented — model answers only from retrieved excerpts.

Semantic ask

Natural-language queries across the full corpus via Actian. "Which EIS documents have weak EJ analysis?"

5 / 15

How it works — Data pipeline

We ingest NEPATEC2.0 (PNNL / HuggingFace): CE, EA, and EIS JSONL — 61,881 projects · ~6.97M pages · 60+ agencies.

  • Main-doc selection for EIS: We prefer FEIS/DEIS as the analysis document. Many EIS projects mark 4–5 files as "main" (RODs, errata); we skip those for scanning.
  • CE category and milestones (ROD, NOI, FONSI, etc.) inferred from metadata or filename when blank.
  • SHA-256 stored per document — used in attestation to prove document integrity.
  • All data in SQLite + FTS5 full-text search; WAL mode; indexes on process_type, agency, state, is_main.
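The ingest step above can be sketched in a few lines. This is a minimal illustration, not VERA's actual schema: the table and column names are simplified stand-ins, and real NEPATEC2.0 records carry far more metadata (agency, state, milestones).

```python
import hashlib
import sqlite3

def ingest(conn, docs):
    """Store each document with its SHA-256 and index its text in FTS5."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS documents (
            doc_id TEXT PRIMARY KEY, sha256 TEXT, process_type TEXT
        );
        CREATE VIRTUAL TABLE IF NOT EXISTS doc_fts USING fts5(doc_id, body);
    """)
    for d in docs:
        # Per-document SHA-256, later reused to prove integrity at attestation.
        digest = hashlib.sha256(d["text"].encode("utf-8")).hexdigest()
        conn.execute("INSERT OR REPLACE INTO documents VALUES (?, ?, ?)",
                     (d["id"], digest, d["process_type"]))
        conn.execute("INSERT INTO doc_fts VALUES (?, ?)", (d["id"], d["text"]))
    conn.commit()

conn = sqlite3.connect(":memory:")
ingest(conn, [{"id": "eis-1", "process_type": "EIS",
               "text": "The no action alternative was considered."}])
hit = conn.execute(
    "SELECT doc_id FROM doc_fts WHERE doc_fts MATCH 'alternative'").fetchone()
```

The stored digest is what later lands in the attestation payload, so the same bytes that were scanned are the bytes that are provable on-chain.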
6 / 15

How it works — Compliance signals

Eight deterministic regex detectors. Every flag: verbatim excerpt, char offset, severity. Auditable, testable, reproducible.

  • deferred_mitigation (High): Mitigation pushed to "future phases" or "final design"
  • future_studies_reliance (High): Approval contingent on incomplete studies
  • ej_absent (High): No EJ analysis present (EA/EIS only)
  • no_action_absent (High): No-action alternative missing (EA/EIS only)
  • ej_thin_coverage (Med): EJ mentioned in passing only (EA/EIS only)
  • no_action_thin (Med): No-action dismissed without analysis (EA/EIS only)
  • cumulative_impacts_thin (Med): Cumulative impacts deferred or minimal (EA/EIS only)
  • tribal_interests (Info): Tribal consultation found; review for completeness

CE documents are exempt from EJ/no-action/cumulative flags (process-type gating). 18 pytest tests enforce this.
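A single detector with gating can be sketched like this. The pattern and field names are illustrative only; VERA's real detectors use a richer pattern set with false-positive exclusions.

```python
import re

# Illustrative pattern for one flag type; the real library is larger.
DEFERRED_MITIGATION = re.compile(
    r"mitigation (?:will be|to be) (?:developed|finalized) (?:in|during) "
    r"(?:future phases|final design)", re.IGNORECASE)

def scan(text, process_type):
    flags = []
    for m in DEFERRED_MITIGATION.finditer(text):
        flags.append({
            "flag": "deferred_mitigation",
            "severity": "high",
            "excerpt": m.group(0),   # verbatim excerpt, auditable
            "offset": m.start(),     # char offset into the document
        })
    # Process-type gating: EJ-absence applies to EA/EIS only, never CE.
    if process_type in ("EA", "EIS") and "environmental justice" not in text.lower():
        flags.append({"flag": "ej_absent", "severity": "high",
                      "excerpt": "", "offset": -1})
    return flags

flags = scan("Mitigation will be developed in future phases.", "CE")
```

Because every detector is a plain function of (text, process_type), each flag is reproducible and can be pinned down in a pytest case, which is exactly what makes the 18-test suite possible.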

7 / 15

How it works — Semantic search

Actian VectorAI DB stores OpenAI text-embedding-3-small vectors for document chunks (600 chars, 80 overlap). Query → embed → K-NN → top chunks.
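The chunking step with those stated parameters (600-char windows, 80-char overlap) can be sketched as a simple sliding window; the real chunker may additionally respect sentence or page boundaries.

```python
def chunk(text, size=600, overlap=80):
    """Slide a fixed window across the text; consecutive chunks share `overlap` chars."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# 1200 chars of varied text -> three chunks of 600, 600, 160 chars.
text = "".join(chr(65 + i % 26) for i in range(1200))
chunks = chunk(text)
```

The overlap means a sentence straddling a chunk boundary still appears whole in at least one chunk, which keeps retrieval from missing boundary-spanning passages.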

Search

K-NN similarity. Filter by process_type, agency, state. Returns ranked passages with scores.

Ask (RAG)

Retrieve top-k chunks → "answer only from these excerpts" prompt → Ollama answers locally. Answers stay grounded in the retrieved excerpts.

Fallback

If Actian unreachable (Apple Silicon, air-gapped), numpy cosine-similarity store activates automatically. Same API — callers can't tell.
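The fallback store can be sketched as below: a cosine-similarity K-NN over an in-memory matrix, exposing a top-k search like the Actian-backed store. The class and method names are illustrative, and the 3-d toy vectors stand in for 1536-d OpenAI embeddings.

```python
import numpy as np

class NumpyVectorStore:
    """In-memory cosine-similarity store; drop-in stand-in for the vector DB."""

    def __init__(self):
        self.ids, self.vecs = [], []

    def add(self, doc_id, vec):
        self.ids.append(doc_id)
        self.vecs.append(np.asarray(vec, dtype=float))

    def search(self, query, k=3):
        # Cosine similarity of the query against every stored chunk vector.
        mat = np.stack(self.vecs)
        q = np.asarray(query, dtype=float)
        sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q))
        order = np.argsort(-sims)[:k]
        return [(self.ids[i], float(sims[i])) for i in order]

store = NumpyVectorStore()
store.add("chunk-a", [1.0, 0.0, 0.0])
store.add("chunk-b", [0.0, 1.0, 0.0])
top = store.search([0.9, 0.1, 0.0], k=1)
```

Because both backends answer the same top-k question, callers never branch on which one is active; only latency and scale differ.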

8 / 15

How it works — Solana attestation

After a scan: SHA-256(project_id + timestamp + flag detail + doc hashes) → write to SPL Memo program on devnet. One instruction, no custom contract.

What's on-chain

Compact JSON memo (≤566 bytes): pid, ts, flag counts, and the sha256 hash. Full detail stays in our API. Anyone can recompute the hash from our flags endpoint.

Trustless verify

A third party can verify without calling our Verify endpoint: fetch the raw tx from any Solana Explorer, decode the memo, call /api/.../flags, recompute SHA-256, compare. No trust in VERA required.

Doc hashes in the payload prove integrity of the underlying documents — if the document changed after attestation, the hash won't match.
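The verification math can be sketched as follows. The field names (pid, ts, flags, doc_hashes) and the canonical-JSON choice are illustrative, not VERA's exact wire format; the point is that any verifier hashing the same public payload gets the same digest.

```python
import hashlib
import json

def attestation_hash(pid, ts, flags, doc_hashes):
    # Canonical JSON (sorted keys, no whitespace) so every verifier
    # serializes the payload to identical bytes before hashing.
    payload = json.dumps(
        {"pid": pid, "ts": ts, "flags": flags, "doc_hashes": doc_hashes},
        sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

onchain = attestation_hash("proj-42", "2025-01-01T00:00:00Z",
                           [{"flag": "ej_absent", "severity": "high"}],
                           ["ab12"])
# A third party recomputing from the public flags endpoint gets a match...
recomputed = attestation_hash("proj-42", "2025-01-01T00:00:00Z",
                              [{"flag": "ej_absent", "severity": "high"}],
                              ["ab12"])
# ...while a tampered document hash breaks it.
tampered = attestation_hash("proj-42", "2025-01-01T00:00:00Z",
                            [{"flag": "ej_absent", "severity": "high"}],
                            ["cd34"])
```

This is why no custom contract is needed: SPL Memo just makes the digest immutable and timestamped, and the comparison itself is plain client-side hashing.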

9 / 15

How it works — Observatory & Stuckness Radar

NEPA Observatory

  • Corpus-level stats: 61K projects across 60+ agencies
  • Choropleth map by state
  • By-agency and by-process-type breakdowns
  • Stats cached at startup for instant load

Stuckness Radar (FAST-41)

  • 22K+ FAST-41 permitting projects with stuckness score [0–1]
  • Map toggle: avg score / % paused / project count
  • Sector & agency bar charts
  • OPEF Copilot: LLM narration + freeform questions ("Which sector has the most paused projects?")
10 / 15

Chat, AI Explanations & Privacy

Project chat

Ask anything about a project's documents. Context: project metadata + document chunks + stored flags. Ollama answers from retrieved excerpts only.

Explain with AI

Per-flag LLM explanation: given the flag type and triggering excerpt, Ollama explains in 1–2 sentences why this matters to a lender.

LLM audit log

Every LLM call is logged: prompt SHA-256, model, tokens, response, timestamp. Full audit trail for every generated output.

Privacy guarantee: All LLM inference (chat, flag explanations, radar narration) runs via Ollama on-device. Document text never reaches an external LLM; the only external calls are the chunk texts sent to OpenAI's embedding API. Safe for sensitive government documents.
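The audit-log shape can be sketched as below. The entry fields are simplified stand-ins; the real log also records token counts per call.

```python
import hashlib
import time

audit_log = []

def log_llm_call(model, prompt, response):
    """Append one audit entry per LLM call: hashed prompt, model, response, timestamp."""
    entry = {
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "model": model,
        "response": response,
        "ts": time.time(),
    }
    audit_log.append(entry)
    return entry

entry = log_llm_call("qwen2.5:3b",
                     "Explain ej_absent to a lender.",
                     "No environmental justice analysis was found in this document.")
```

Hashing the prompt rather than storing it verbatim keeps the trail verifiable (anyone holding the prompt can confirm the hash) without duplicating sensitive document text into the log.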
11 / 15

MCP Server & API

MCP Server

VERA exposes its data as 10 structured tools via the Model Context Protocol — project search, document retrieval, compliance flags, stats. Any MCP-compatible AI assistant (Claude Desktop, etc.) can query VERA as a structured knowledge source. Run with stdio or SSE transport.

REST API

20+ FastAPI endpoints: search, scan, flags, attest, verify, project chat, global chat, semantic search/ask/index, dashboard, radar. Full interactive docs at /docs. All endpoints are JSON; easy to integrate into any lender workflow.

In-app explainer pages for every major feature: /actian.html · /solana.html · /signals.html · /stuckness.html · /data-pipeline.html

12 / 15

Live Demo

Walk through in ~2 minutes:

  1. Search "nuclear" or "Savannah River" → open a project
  2. Scan → see flags (ej_absent, deferred_mitigation) with excerpts + char offset
  3. Click Explain with AI on a flag → Ollama explains why it matters to a lender
  4. Attest on Solana → tx signature → open Solana Explorer (devnet)
  5. Click Verify → confirm on-chain hash matches current scan
  6. Semantic Search → Ask: "Documents with weak environmental justice analysis" → see ranked excerpts + RAG answer
  7. Stuckness Radar → click Narrate → OPEF Copilot explains which sectors are stuck

App: index.html · Docs: /docs

13 / 15

Tech Stack

  • Data: NEPATEC2.0 (HuggingFace · CC0), FAST-41 CSV · 61,881 projects · 6.97M pages
  • Backend: Python 3, FastAPI, SQLite (FTS5 · WAL · triggers)
  • Signals: Regex detectors, process-type gating, 18 pytest tests
  • Vector DB: Actian VectorAI DB (Cortex gRPC) + numpy fallback (same API, auto-failover)
  • Embeddings: OpenAI text-embedding-3-small (1536d)
  • LLM: Ollama · qwen2.5:3b · on-device (chat, explanations, narration; nothing leaves the machine)
  • Blockchain: Solana devnet · SPL Memo · solders + solana Python
  • MCP: FastMCP server · 10 tools · stdio + SSE
  • Frontend: Alpine.js · Tailwind CSS · D3 · Chart.js
14 / 15

Impact & What's Next

Today: Lenders get NEPA compliance risk in seconds. Flags are auditable and testable. Attestations are tamper-proof and verifiable by any third party without trusting us. All LLM inference is on-device — safe for sensitive deal documents.

Next: Mainnet attestations (multisig), expanded signal library (climate, water, species, per-agency tuning), bulk attest for portfolio due diligence, live ePlanning API data feed, embeddings for the full 60K+ project corpus.

VERA = deterministic signals + Actian VectorAI DB + Solana attestation + on-device LLM.
Auditable civic AI with tamper-proof proof — for the infrastructure finance era.

Thank you.

15 / 15