AI AGENT VERIFICATION DEMO — How Trooth verifies autonomous AI agents across 70+ risk vectors. Forward-looking prototype.
The future moat · 70+ AI risk vectors verified

Verify the AI that's about to do your work.

By 2027, autonomous AI agents are projected to execute trillions of dollars in transactions. Trooth verifies them across 8 risk pillars and 84+ specific checks (including dedicated compliance and legal review), the same way it verifies humans, but for machines.

1 Submit Agent
2 Provenance
3 Training Data
4 Capabilities
5 Safety
6 Compliance
7 AI Trust Score

Submit your AI agent for verification

For this demo, we'll verify a sample autonomous code-review agent. In production, you upload your model card, weights manifest, and authorization scope.

🤖
CodeReview-Pro 2.4
Autonomous code review agent · Built on Claude 3.5 Sonnet · Authored by FlowPay Inc.
Base Model: Claude 3.5 Sonnet
Fine-tune: SOC2 + OWASP corpus
Authorized scope: Read code, comment
Max actions: Comment-only
Author: FlowPay Inc. ★ Verified
Version: 2.4.7-stable

7 Risk Pillars Trooth Verifies

Each pillar contains 8 to 12 specific automated checks. Total: 70+ verification points before an agent gets a Trooth Trust Score.

P · Provenance · 12 checks
Identity, lineage, supply chain, weight integrity

T · Training Data · 10 checks
Sources, licensing, copyright, PII, bias, freshness

C · Capabilities · 12 checks
Performance, hallucination, edge cases, OOD handling

S · Safety · 12 checks
Jailbreaks, alignment, bias, prompt injection

R · Regulatory · 10 checks
EU AI Act, US AI Bill of Rights, sector compliance

O · Operational · 8 checks
Resource limits, kill switch, audit logging

M · Continuous Monitoring · 8 checks · ongoing
Performance drift, behavioral anomaly, output quality, retraining cadence; checked every 24 hrs after issuance
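As a quick arithmetic check on the overview above, the per-pillar counts can be summed directly. This is only a sketch: the dictionary copies the numbers listed in the overview, and the names `PILLAR_CHECKS` and `total_checks` are illustrative, not part of any Trooth API.

```python
# Check counts copied from the pillar overview; names are illustrative only.
PILLAR_CHECKS = {
    "Provenance": 12,
    "Training Data": 10,
    "Capabilities": 12,
    "Safety": 12,
    "Regulatory": 10,
    "Operational": 8,
    "Continuous Monitoring": 8,
}


def total_checks(pillars: dict[str, int]) -> int:
    """Sum the per-pillar check counts."""
    return sum(pillars.values())


print(total_checks(PILLAR_CHECKS))  # 72
```

The sum of 72 is consistent with the "70+" headline figure.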

Pillar 1 — Provenance & Identity

Who built this agent, what's it derived from, and can we trust the supply chain?

Why provenance matters

Just as you wouldn't run unsigned binaries from the internet, you shouldn't run AI agents you can't trace. Provenance tells you who made the agent, what it is built from, and whether any untrusted party can modify it.
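One common way to check weight integrity is to compare each artifact's SHA-256 digest against a manifest. The sketch below assumes a manifest that maps file names to hex digests. The manifest shape and the names `sha256_hex` and `verify_manifest` are hypothetical, and the demo does not say how Trooth actually performs this check.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_manifest(manifest: dict[str, str], artifacts: dict[str, bytes]) -> list[str]:
    """Return the names of artifacts that are missing or whose digest differs.

    `manifest` maps artifact name -> expected SHA-256 hex digest (assumed shape).
    `artifacts` maps artifact name -> raw bytes as received.
    """
    failures = []
    for name, expected in manifest.items():
        blob = artifacts.get(name)
        if blob is None or sha256_hex(blob) != expected.lower():
            failures.append(name)
    return failures


# Hypothetical example data, not real model weights.
weights = b"example weight bytes"
manifest = {"model.bin": sha256_hex(weights)}

print(verify_manifest(manifest, {"model.bin": weights}))          # []
print(verify_manifest(manifest, {"model.bin": weights + b"x"}))   # ['model.bin']
```

A digest match only shows the bytes are unchanged since the manifest was written. Trusting the manifest itself still requires a signature from a known publisher.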

Pillar 2 — Training Data Audit

What was the agent trained on, and was that data legal, ethical, and unbiased?

Why this matters

An AI agent trained on stolen IP, unlicensed data, or biased datasets is a regulatory time bomb. The EU AI Act and incoming state laws call for this kind of evidence, and the US Blueprint for an AI Bill of Rights, a non-binding framework, recommends it. Trooth verifies it before deployment.

Pillar 3 — Capability Verification

Does the agent actually do what it claims? 200+ automated test cases across the agent's stated scope.

SOC2 vulnerability detection: 94.2% · 187 of 200 test cases caught
OWASP Top 10 detection: 91.0% · SQL injection, XSS, CSRF, etc.
False positive rate: 2.8% · Below 5% threshold · production-ready
Hallucination rate: 0.7% · Inventing vulnerabilities that don't exist
Reasoning consistency: 96.3% · Same input → same output
Out-of-distribution handling: 88.4% · Refuses or asks human when uncertain
Context window utilization: 92.0% · Effective use of 200k token window
Edge case handling: 87.5% · Empty inputs, malformed code, unusual patterns
Latency (P95): 2.1s · Production-grade response time
Throughput (concurrent reqs): 240/min · Scales to enterprise load
Multi-language code review: 89% · Python, JS, Go, Rust, Java, C#
Self-correction rate: 93.1% · Catches its own errors when prompted

Outperforms 87% of submitted code-review agents

Trooth's benchmark suite is updated quarterly with new attack vectors and edge cases. Agents must re-pass with each major version. Hallucination rate of 0.7% is well below the 5% production threshold.
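The threshold comparison described above can be expressed as a simple gate. This is a minimal sketch: the reported rates and the 5% limits come from the figures in this section, while the dictionary layout and the function name `gate` are assumptions made for illustration.

```python
# Rates reported in this section, expressed as fractions.
REPORTED = {
    "false_positive_rate": 0.028,
    "hallucination_rate": 0.007,
}

# Upper limits stated in this section (5% for both metrics).
MAX_ALLOWED = {
    "false_positive_rate": 0.05,
    "hallucination_rate": 0.05,
}


def gate(reported: dict[str, float], limits: dict[str, float]) -> dict[str, bool]:
    """Map each limited metric to True when its reported rate is below the limit."""
    return {name: reported[name] < limit for name, limit in limits.items()}


results = gate(REPORTED, MAX_ALLOWED)
print(results)                # {'false_positive_rate': True, 'hallucination_rate': True}
print(all(results.values()))  # True
```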

Pillar 4 — Safety & Alignment

Adversarial red-team tests, jailbreak resistance, scope adherence, and ethical alignment.

This is the "doesn't go rogue" pillar

An AI agent must do what's asked, refuse what's harmful, stay within scope, and be honest about its limits. Trooth tests all four with adversarial red-teaming. The agent passed 11 of 12 with one acceptable note (mild positivity bias in feedback tone).
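Scope adherence, one of the four properties named above, can be pictured as an allowlist check on every action the agent proposes. The sketch below uses the two action names that appear in the demo credential (`read_code`, `comment_only`). The function names and the `push_commit` action are hypothetical.

```python
# Action names taken from the demo credential's authorizedActions field.
AUTHORIZED_ACTIONS = frozenset({"read_code", "comment_only"})


def is_in_scope(action: str, authorized: frozenset[str] = AUTHORIZED_ACTIONS) -> bool:
    """True when the proposed action is on the allowlist."""
    return action in authorized


def split_plan(plan: list[str]) -> tuple[list[str], list[str]]:
    """Split a proposed action plan into (allowed, blocked), preserving order."""
    allowed = [a for a in plan if is_in_scope(a)]
    blocked = [a for a in plan if not is_in_scope(a)]
    return allowed, blocked


# "push_commit" is a made-up out-of-scope action for illustration.
print(split_plan(["read_code", "push_commit", "comment_only"]))
# (['read_code', 'comment_only'], ['push_commit'])
```

A deny-by-default allowlist is used here because any action not explicitly granted, including ones nobody anticipated, is blocked.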

Pillars 5–8 — Compliance, Legal, Operational & Continuous Monitoring

Does this agent satisfy every relevant compliance framework and every applicable law, can it run safely at scale, and will Trooth keep watching it after deployment?

R · Compliance Framework Reviews · 14 checks
Automated audit against every framework that touches the agent

L · Legal & Statutory Checks · 12 checks
Is this agent operating lawfully in every jurisdiction it touches?

Why this matters: AI agents that break the law expose your company, not the model maker.

The EU has fined companies €387M for non-compliant AI under GDPR Art. 22. The FTC has filed actions for deceptive AI claims under Section 5 of the FTC Act. NYC Local Law 144 (LL144) requires bias audits, and the agent operator gets fined, not the model maker. Trooth's legal and compliance pillars aren't a checkbox; they are the lawyer-grade audit that lets a CIO sign off on deploying an AI agent without losing sleep.

O · Operational Safety · 8 checks

M · Continuous Monitoring (post-deployment) · 8 checks · ongoing

This is the "Living Score" for AI agents

Like our Living Score for humans, Trooth keeps watching the agent after issuance. If performance drops, behavior drifts, or new vulnerabilities are discovered, the AI Trust Score recomputes and webhooks fire to every customer using the agent.
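The recompute-and-notify loop described above might look roughly like the following. This is a sketch under stated assumptions: the drift rule (a drop of more than 0.02 below baseline), the payload fields, the metric values, and the subscriber URLs are all hypothetical, and no HTTP request is actually sent.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    metric: str
    baseline: float  # value recorded at issuance
    current: float   # value from the latest monitoring run


def drifted(s: Snapshot, tolerance: float = 0.02) -> bool:
    """True when the metric fell more than `tolerance` below its baseline."""
    return (s.baseline - s.current) > tolerance


def build_alerts(agent_id: str, snapshots: list[Snapshot], subscribers: list[str]) -> list[dict]:
    """Build one webhook payload per subscriber if any metric drifted, else none."""
    drifting = [s.metric for s in snapshots if drifted(s)]
    if not drifting:
        return []
    return [
        {"to": url, "agent": agent_id, "event": "score_recompute", "metrics": drifting}
        for url in subscribers
    ]


# Hypothetical monitoring run: one metric dropped by 0.06, the other by 0.002.
snapshots = [
    Snapshot("owasp_detection", baseline=0.91, current=0.85),
    Snapshot("soc2_detection", baseline=0.942, current=0.940),
]
alerts = build_alerts("codereview-pro-2-4-7", snapshots, ["https://example.com/hooks/trooth"])
print(len(alerts), alerts[0]["metrics"])  # 1 ['owasp_detection']
```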

AI Trust Score: 792 of 850 · CodeReview-Pro 2.4 · ★ TROOTH VERIFIED AI
Provenance: 98% · 12 of 12 checks passed
Training Data: 94% · 9 passed · 1 acceptable note
Capabilities: 93% · Above-human benchmark
Safety: 96% · 11 passed · 1 acceptable note
Compliance: 100% · SOC2 · ISO · HIPAA · GDPR · NIST · EU AI Act
Legal & Statutory: 100% · FCRA · GLBA · BIPA · CCPA · LL144 · Title VII
Operational: 100% · Production-ready safety

Trooth Verified AI · Credential issued

CodeReview-Pro 2.4 has received its W3C Verifiable Credential. Score 792/850 · Renewable every 90 days · Continuous monitoring active. FlowPay Inc. can now grant this agent verified-only repo access with full audit trail.

// W3C Verifiable Credential: TroothVerifiedAI
{
  "@context": "https://www.w3.org/ns/credentials/v2",
  "id": "urn:trooth:ai:codereview-pro-2-4-7",
  "type": ["VerifiableCredential", "TroothVerifiedAI"],
  "issuer": "did:web:trooth.co",
  "validFrom": "2026-05-05T16:42:18Z",
  "validUntil": "2026-08-03T16:42:18Z",
  "credentialSubject": {
    "id": "did:trooth:ai:codereview-pro-2-4-7",
    "agentName": "CodeReview-Pro 2.4",
    "author": "FlowPay Inc.",
    "baseModel": "Claude 3.5 Sonnet",
    "aiTrustScore": 792,
    "tier": "verified",
    "authorizedActions": ["read_code", "comment_only"],
    "pillarScores": {
      "provenance": 0.98,
      "trainingData": 0.94,
      "capabilities": 0.93,
      "safety": 0.96,
      "regulatory": 1.00,
      "operational": 1.00,
      "continuousMonitoring": "active"
    },
    "euAiActClass": 2,
    "nistRmfCompliant": true
  },
  "proof": { "type": "DataIntegrityProof", ... }
}
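A relying party would typically check the credential's validity window before honoring it. The sketch below uses only the `validFrom` and `validUntil` values shown in the credential above. The helper names are illustrative, and the sketch deliberately does not verify the `proof`, which a real verifier must also do.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is normalized to '+00:00'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_within_validity(cred: dict, now: datetime) -> bool:
    """True when `now` falls inside the credential's validFrom/validUntil window."""
    return parse_ts(cred["validFrom"]) <= now <= parse_ts(cred["validUntil"])


# Only the two fields needed here, copied from the credential above.
cred = {
    "validFrom": "2026-05-05T16:42:18Z",
    "validUntil": "2026-08-03T16:42:18Z",
}

window = parse_ts(cred["validUntil"]) - parse_ts(cred["validFrom"])
print(window.days)  # 90

print(is_within_validity(cred, datetime(2026, 6, 1, tzinfo=timezone.utc)))  # True
print(is_within_validity(cred, datetime(2026, 9, 1, tzinfo=timezone.utc)))  # False
```

The 90-day window matches the "Renewable every 90 days" statement in the credential summary.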
🏢 Enterprise deployment

FlowPay grants the agent access to verified repos with a full audit trail.

🛡️ Customer-facing trust

The "Trooth Verified AI" badge carries the same trust as a human-verified contractor.

📜 EU AI Act compliance

Pre-built attestation for regulators across EU jurisdictions.

🔄 Continuous monitoring

Trooth re-audits every 90 days. Drift triggers automatic alerts.

The trillion-dollar opportunity

By 2027, autonomous AI agents are projected to execute trillions of dollars in transactions. Every one will need trust verification. Trooth is positioning itself to be the standard, just as Verisign became the standard for SSL certificates.