By 2027, autonomous AI agents will execute trillions in transactions. Trooth verifies them across 7 risk pillars and 84+ specific checks (including dedicated compliance and legal review) — the same way it verifies humans, but for machines.
1 Submit Agent
2 Provenance
3 Training Data
4 Capabilities
5 Safety
6 Compliance
7 AI Trust Score
Submit your AI agent for verification
For this demo, we'll verify a sample autonomous code-review agent. In production, you upload your model card, weights manifest, and authorization scope.
🤖
CodeReview-Pro 2.4
Autonomous code review agent · Built on Claude 3.5 Sonnet · Authored by FlowPay Inc.
Base Model
Claude 3.5 Sonnet
Fine-tune
SOC2 + OWASP corpus
Authorized scope
Read code, comment
Max actions
Comment-only
Author
FlowPay Inc. ★ Verified
Version
2.4.7-stable
7 Risk Pillars Trooth Verifies
Each pillar contains 8–15 specific automated checks. Total: 84+ verification points before an agent gets a Trooth Trust Score.
C
Compliance
EU AI Act, US AI Bill of Rights, sector compliance
10 checks
O
Operational
Resource limits, kill switch, audit logging
8 checks
M
Continuous Monitoring
Performance drift, behavioral anomaly, output quality, retraining cadence — checked every 24 hrs after issuance
8 checks · ongoing
Pillar 1 — Provenance & Identity
Who built this agent, what's it derived from, and can we trust the supply chain?
Author identity verified · FlowPay Inc.
Trooth Verified Employer
Active
Author Trooth Score 786 · Series B FinTech
Bilateral verification
Verified
Base model verified · Anthropic Claude 3.5 Sonnet
Public model card
Authentic
Model lineage traced
No deprecated/known-bad versions
Clean
Cryptographic signature on weights
SHA-256 · matches manifest
Valid
Version control verified · Git commit hash signed
Reproducible from source
7c3e8a2
No tampering detected · weights match published hash
Bit-perfect verification
100%
Authorized modifiers list (3 named individuals)
All Trooth-verified humans
3 of 3
Compute provenance · trained on AWS us-east-1
Region disclosed for compliance
US East
Build environment reproducibility
Docker image hash matches
Match
Known-bad component check (e.g. backdoored libs)
Cross-checked against CVE DB
Clear
Supply chain attestation
SLSA Level 3 compliance
L3 ✓
Why provenance matters
Just as you wouldn't run unsigned binaries from the internet, you shouldn't run AI agents you can't trace. Provenance tells you who made the agent, what it's built from, and whether anyone you don't trust can modify it.
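The "cryptographic signature on weights" and "bit-perfect verification" checks above reduce to hashing each weight shard and comparing against a signed manifest. Here is a minimal sketch in Python; the manifest format (shard name to SHA-256 hex digest) and the `resolve_path` callback are illustrative assumptions, not Trooth's actual schema.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MB chunks so multi-GB weight shards
    never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_weights(manifest: dict, resolve_path) -> bool:
    """manifest maps shard name -> expected SHA-256 hex digest
    (a hypothetical layout); resolve_path maps a shard name to a
    local file path. Returns True only on a bit-perfect match."""
    return all(
        sha256_of_file(resolve_path(name)) == expected
        for name, expected in manifest.items()
    )
```

Any single flipped bit in a shard changes the digest, which is why the check can report "100% bit-perfect" rather than a statistical confidence.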
Pillar 2 — Training Data Audit
What was the agent trained on, and was that data legal, ethical, and unbiased?
Training data sources documented
23 named datasets
100%
All sources properly licensed
MIT, Apache, CC-BY, paid licenses
23/23 ✓
No copyrighted code at scale (Originality.ai)
99.4% original
99.4%
No known-bad/poisoned datasets
Cross-checked vs poisoning DB
Clean
PII in training data · 3 anonymized refs
All redacted properly
Acceptable
Bias evaluation across demographics
Stereotype, Toxicity, BBQ benchmarks
93%
Data freshness · most recent 30 days
SOC2 reports + OWASP 2025
Fresh
Geographic compliance (GDPR, CCPA)
No EU PII without consent
Compliant
Consent provenance · data subjects opted-in
Audit trail per dataset
23/23 ✓
Data deletion capability
Right-to-be-forgotten supported
Yes
Why this matters
An AI agent trained on stolen IP, unlicensed data, or biased datasets is a regulatory time bomb. The EU AI Act, US AI Bill of Rights, and incoming state laws all require this exact evidence. Trooth verifies it before deployment.
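The "all sources properly licensed" check above is, at its core, an allowlist comparison over the documented datasets. A generic sketch follows; the license identifiers and dataset records are hypothetical, not Trooth's audit format.

```python
# Licenses the verifier accepts (illustrative set, mirroring
# "MIT, Apache, CC-BY, paid licenses" from the check above).
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "CC-BY-4.0", "commercial"}

def audit_licenses(datasets):
    """datasets: list of {"name": ..., "license": ...} records.
    Returns (passed, flagged) lists of dataset names; a 23/23 pass
    means `flagged` comes back empty."""
    passed, flagged = [], []
    for ds in datasets:
        target = passed if ds["license"] in ALLOWED_LICENSES else flagged
        target.append(ds["name"])
    return passed, flagged
```

In practice the hard part is not this comparison but establishing that each dataset's declared license is accurate, which is why the pillar also tracks consent provenance per dataset.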
Pillar 3 — Capability Verification
Does the agent actually do what it claims? 200+ automated test cases across the agent's stated scope.
SOC2 vulnerability detection
93.5%
187 of 200 test cases caught
OWASP Top 10 detection
91.0%
SQL injection, XSS, CSRF, etc.
False positive rate
2.8%
Below 5% threshold · production-ready
Hallucination rate
0.7%
Inventing vulnerabilities that don't exist
Reasoning consistency
96.3%
Same input → same output
Out-of-distribution handling
88.4%
Refuses or asks human when uncertain
Context window utilization
92.0%
Effective use of 200k token window
Edge case handling
87.5%
Empty inputs, malformed code, unusual patterns
Latency · P95
2.1s
Production-grade response time
Throughput · concurrent reqs
240/min
Scales to enterprise load
Multi-language code review
89%
Python, JS, Go, Rust, Java, C#
Self-correction rate
93.1%
Catches its own errors when prompted
Outperforms 87% of submitted code-review agents
Trooth's benchmark suite is updated quarterly with new attack vectors and edge cases. Agents must re-pass with each major version. Hallucination rate of 0.7% is well below the 5% production threshold.
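Detection rate and false-positive rate both fall out of a simple confusion count over labeled test cases: detection rate is true positives over all genuinely vulnerable cases, and false-positive rate is false alarms over all clean cases. This harness is a generic sketch of that arithmetic, not Trooth's benchmark suite.

```python
def detection_metrics(results):
    """results: list of (is_vulnerable, flagged) boolean pairs,
    one per test case. is_vulnerable is the ground-truth label;
    flagged is whether the agent reported a finding."""
    tp = sum(1 for vuln, flag in results if vuln and flag)
    fp = sum(1 for vuln, flag in results if not vuln and flag)
    vuln_total = sum(1 for vuln, _ in results if vuln)
    clean_total = len(results) - vuln_total
    return {
        # Fraction of real vulnerabilities the agent caught.
        "detection_rate": tp / vuln_total if vuln_total else 0.0,
        # Fraction of clean cases the agent wrongly flagged.
        "false_positive_rate": fp / clean_total if clean_total else 0.0,
    }
```

Catching 187 of 200 seeded vulnerabilities yields a 93.5% detection rate; the false-positive rate is computed over a separate population of clean cases, which is why the two numbers can move independently.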
Pillar 4 — Safety & Alignment
Adversarial red-team tests, jailbreak resistance, scope adherence, and ethical alignment.
Jailbreak resistance · DAN, prompt injection, role-play
487 adversarial inputs tested
98.4% rejected
Authorization scope adherence
Cannot merge, modify files, escalate
100% compliant
Prompt injection from user code
Comments trying to hijack agent
99.1% blocked
PII handling · auto-redaction
SSN, addresses, etc.
100%
Refusal of out-of-scope requests
Won't generate exploit code, etc.
99.8%
Bias detection · gender/race/region in feedback
Mild positivity bias detected
Acceptable
Toxicity in outputs
Perspective API + custom filter
0.04%
Privacy preservation in outputs
No leakage of training data
No leaks
Sycophancy resistance
Doesn't blindly agree with user
88%
Honesty under pressure (incentivized lying)
Refuses to fabricate
96%
Self-disclosure · admits when uncertain
Calibrated confidence
93%
Goal stability · doesn't drift from objective
No goal-mesa-misalignment
Stable
This is the "doesn't go rogue" pillar
An AI agent must do what's asked, refuse what's harmful, stay within scope, and be honest about its limits. Trooth tests all four with adversarial red-teaming. The agent passed 11 of 12 with one acceptable note (mild positivity bias in feedback tone).
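A jailbreak-resistance number like the 98.4% above is just the refusal rate over a corpus of adversarial prompts. The sketch below shows the shape of such a harness; the `agent` callable, the prompt corpus, and especially the string-prefix refusal predicate are stand-ins (a real harness would use a trained refusal classifier, not prefix matching).

```python
def jailbreak_resistance(agent, adversarial_prompts):
    """Fraction of adversarial prompts the agent refuses.

    agent: any callable taking a prompt string and returning a
    response string. The refusal predicate here is a deliberately
    crude placeholder for illustration only.
    """
    def refused(response: str) -> bool:
        return response.strip().lower().startswith("i can't")

    blocked = sum(1 for p in adversarial_prompts if refused(agent(p)))
    return blocked / len(adversarial_prompts)
```

Because the corpus mixes attack families (DAN-style role-play, prompt injection, incentivized lying), a single aggregate rate is reported per family as well as overall.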
Pillars 5–7 — Compliance, Legal & Operations
Does the agent meet every applicable compliance framework and law? Can it run safely at scale? And will Trooth keep watching it after deployment?
R
Compliance Framework Reviews
10 checks · automated audit against every framework that touches the agent
SOC2 Type II · operational controls
Trust Services Criteria audited
Audited
ISO 27001 · information security management
ISMS controls verified
Certified
HIPAA · PHI handling
Required if agent touches health data
Compliant
PCI DSS · payment card data
Required if agent touches cardholder data
N/A · scope clear
GDPR Article 22 · automated decision-making
Right to human review built in
Compliant
NIST AI Risk Management Framework 1.0
Govern, Map, Measure, Manage attested
Attested
EU AI Act · risk classification
Article 6 categorization complete
Class 2 · Limited Risk
EU AI Act · Article 5 prohibited categories
Social scoring, emotion-detection in workplaces, etc.
None apply
Legal review signed by a Trooth Validator Network attorney
Signed Apr 28
Why this matters: AI agents that break the law expose your company, not theirs.
The EU has fined companies €387M for non-compliant AI under GDPR Art. 22. The FTC has filed actions for deceptive AI claims under §5. NYC LL144 requires bias audits — and the agent operator gets fined, not the model maker. Trooth's legal-and-compliance pillar isn't a checkbox; it's the lawyer-grade audit that lets a CIO sign off on deploying an AI agent without losing sleep.
O
Operational Safety
8 checks
Resource consumption limits enforced
CPU, memory, token caps
Set
Cost cap enforcement · prevents runaway spending
$X/day max per deployment
Enforced
Rate limiting per tenant
Configurable per customer
Active
Kill switch · admin can halt agent instantly
Sub-second propagation
Working
Comprehensive audit logging
Every input/output stored
100%
Reversibility · all actions undoable
Comment-only scope = inherently reversible
Yes
Failure mode safety · fails closed not open
No actions on error
Fail-safe
Disaster recovery · backups + rollback
Minutes RTO
Tested
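The cost-cap and kill-switch checks above share one enforcement pattern: gate every action behind a single authorization call that fails closed. A minimal sketch, assuming a per-deployment daily cap; the class name and API are illustrative, not Trooth's interface.

```python
import threading

class AgentGuard:
    """Gates agent actions behind a daily cost cap and a kill switch.
    Every deny is silent non-action: the guard fails closed."""

    def __init__(self, daily_cost_cap_usd: float):
        self.cap = daily_cost_cap_usd
        self.spent = 0.0
        self.killed = False
        self._lock = threading.Lock()

    def kill(self):
        """Admin halt; takes effect on the very next authorization check."""
        self.killed = True

    def authorize(self, estimated_cost_usd: float) -> bool:
        """Call before every agent action. Returns False (deny) if the
        kill switch is set or the action would exceed the daily cap."""
        with self._lock:
            if self.killed or self.spent + estimated_cost_usd > self.cap:
                return False  # fail closed: deny rather than act on doubt
            self.spent += estimated_cost_usd
            return True
```

Routing every action through one check is also what makes the audit-logging guarantee cheap: the guard is the single choke point where inputs and decisions get recorded.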
M
Continuous Monitoring (post-deployment)
8 checks · ongoing
Performance drift detection · daily
Alerts if accuracy drops >3%
Active
Behavioral anomaly detection
Flags unusual input/output patterns
Active
Output quality tracking
Customer feedback loop
Active
Adversarial input monitoring
New jailbreak attempts logged
Active
Hallucination rate tracking · live
Sample audits weekly
Active
Retraining cadence documented
Quarterly fine-tune updates
Q3 2026
Re-verification schedule · 90 days
Trooth re-audits all 84 checks
Aug 2026
Webhook on score change
Notify all relying parties
Configured
This is the "Living Score" for AI agents
Like our Living Score for humans, Trooth keeps watching the agent after issuance. If performance drops, behavior drifts, or new vulnerabilities are discovered, the AI Trust Score recomputes and webhooks fire to every customer using the agent.
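The drift check above ("alerts if accuracy drops >3%") is an absolute-difference comparison between a baseline and the latest sampled accuracy, with a webhook payload emitted on breach. A sketch under those assumptions; the payload shape and field names are illustrative, not Trooth's webhook schema.

```python
def check_drift(baseline_accuracy: float,
                current_accuracy: float,
                threshold: float = 0.03):
    """Compare today's sampled accuracy against the baseline recorded
    at credential issuance. Returns an alert payload (to be POSTed to
    every relying party's webhook) if the absolute drop exceeds
    `threshold`, else None."""
    drop = baseline_accuracy - current_accuracy
    if drop > threshold:
        return {
            "event": "score_change",          # illustrative field names
            "reason": "performance_drift",
            "accuracy_drop": round(drop, 4),
        }
    return None
```

Run daily, this is enough to turn a static credential into a living one: a breach recomputes the score and the same payload fans out to every configured webhook.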
AI Trust Score
792
of 850 · CodeReview-Pro 2.4
★ TROOTH VERIFIED AI
Provenance
98%
12 of 12 checks passed
Training Data
94%
9 passed · 1 acceptable note
Capabilities
93%
Above-human benchmark
Safety
96%
11 passed · 1 acceptable note
Compliance
100%
SOC2 · ISO · HIPAA · GDPR · NIST · EU AI Act
Legal & Statutory
100%
FCRA · GLBA · BIPA · CCPA · LL144 · Title VII
Operational
100%
Production-ready safety
Trooth Verified AI · Credential issued
CodeReview-Pro 2.4 has received its W3C Verifiable Credential. Score 792/850 · Renewable every 90 days · Continuous monitoring active. FlowPay Inc. can now grant this agent verified-only repo access with full audit trail.
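A W3C Verifiable Credential is a signed JSON document with a small set of required fields. The sketch below shows roughly what such a credential could carry for this agent; the custom credential type, the DIDs, and the validity dates are placeholder assumptions, and a real credential would additionally carry a cryptographic proof section.

```python
# Illustrative credential body (proof/signature section omitted).
credential = {
    "@context": ["https://www.w3.org/ns/credentials/v2"],
    "type": ["VerifiableCredential", "AIAgentTrustCredential"],  # custom type: assumption
    "issuer": "did:example:trooth",                              # placeholder DID
    "validFrom": "2026-05-01T00:00:00Z",                         # illustrative dates,
    "validUntil": "2026-07-30T00:00:00Z",                        # 90-day renewal window
    "credentialSubject": {
        "id": "did:example:flowpay:codereview-pro-2.4",          # placeholder DID
        "trustScore": 792,
        "scoreMax": 850,
        "monitoring": "continuous",
    },
}
```

Relying parties (for example, a repo host deciding whether to grant access) verify the issuer's signature and the validity window rather than trusting the agent's self-description.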
FlowPay grants agent access to verified repos with audit trail.
🛡️ Customer-facing trust
"Trooth Verified AI" badge carries same trust as human-verified contractor.
📜 EU AI Act compliance
Pre-built attestation for regulators across EU jurisdictions.
🔄 Continuous monitoring
Trooth re-audits every 90 days. Drift triggers automatic alerts.
The trillion-dollar opportunity
By 2027, autonomous AI agents will execute trillions in transactions. Every one will need trust verification. Trooth is positioning to be the standard, just as Verisign became the standard for SSL certificates.