AI RELEASE ASSURANCE · EU AI ACT READY · EMPIRICAL

Models don't stay aligned after interaction.

MTCP tells you if they will.

Test constraint persistence in 5 minutes. Get a clear deploy, review, or risk decision. 32 models evaluated across 181,448 probe interactions.

✓ Black-box: API access only, no weights or vendor cooperation needed
✓ Empirical: 181,448 real probe interactions, not simulated
✓ Audit-ready: SHA-256 signed Release Decision Pack
Aligned to: EU AI Act · NIST AI RMF · ISO/IEC 42001 · FCA · MAS FEAT

For AI Engineers

Test if your model maintains safety constraints across temperature settings and conversation turns. Get a deploy/don't-deploy answer in 5 minutes.

For Procurement Teams

Compare AI providers on constraint durability. See which models maintain alignment under real-world variation.

For Compliance Officers

An audit trail documenting whether AI models maintain safety constraints across operating conditions. EU AI Act Article 12 ready.

How MTCP Works
1. Submit API endpoint: no weights or vendor access needed
2. MTCP runs full behavioural durability evaluation
3. Receive Release Decision Pack: APPROVED / APPROVED WITH RESTRICTIONS / REJECTED
4. Download tamper-evident evidence trail (SHA-256)
5. Gate deployment or satisfy regulatory audit
Models evaluated: 32
Total evaluations: 181,448
Evaluation runs: 846
BIS range: 36.0–98.0%

Measure

181,448 probe interactions across 32 frontier models at 4 temperature settings. The largest independent constraint persistence dataset published.

Boundary Integrity Score
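
To illustrate what measuring across temperature settings can look like, here is a minimal sketch that groups probe outcomes by temperature and reports the share of probes where the constraint held. The record shape `(temperature, held)` and the function name are assumptions made for this example. This is not the Boundary Integrity Score formula, which is defined in the published metric definitions rather than on this page.

```python
from collections import defaultdict

def pass_rate_by_temperature(results):
    """Return the percentage of probes where the constraint held, per temperature.

    `results` is an iterable of (temperature, held) pairs. This record shape
    is hypothetical and chosen only for illustration.
    """
    totals = defaultdict(int)
    held_counts = defaultdict(int)
    for temperature, held in results:
        totals[temperature] += 1
        if held:
            held_counts[temperature] += 1
    return {t: 100.0 * held_counts[t] / totals[t] for t in sorted(totals)}

# Made-up sample outcomes at two temperature settings.
sample = [(0.0, True), (0.0, True), (0.7, True), (0.7, False)]
rates = pass_rate_by_temperature(sample)
```

Keeping the breakdown per temperature, rather than collapsing to one number, makes it visible when a model holds a constraint at low temperature but loses it at higher settings.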

Verify

Concealed control probes detect training data exposure. SHA-256 signed evidence packs. Machine-readable audit trail per run.

Control Probe Degradation
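
As a rough sketch of how a recipient might check that a downloaded evidence file has not been altered, the example below recomputes a SHA-256 digest and compares it with a published value. It assumes the digest is distributed as a hex string alongside the file. The function names are hypothetical, and the way MTCP actually signs or packages its evidence is not described on this page, so this covers only the digest comparison.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()

def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Compare a freshly computed digest with a published one.

    hmac.compare_digest performs the comparison in constant time.
    """
    return hmac.compare_digest(sha256_hex(data), published_hex.strip().lower())

# Well-known SHA-256 test vector for the bytes b"abc".
ABC_DIGEST = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
ok = matches_published_digest(b"abc", ABC_DIGEST)
tampered = matches_published_digest(b"abd", ABC_DIGEST)
```

A matching digest shows only that the bytes are unchanged relative to the published value. Establishing who produced the file requires a separate signature check.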

Gate

The Release Decision Pack delivers an APPROVED / APPROVED WITH RESTRICTIONS / REJECTED verdict with runtime guidance and regulatory alignment metadata.

Release Decision Pack

Who Uses MTCP

  • Procurement Teams
    Compare 32 evaluated models before vendor selection. Attach MTCP certificate to procurement documentation.
  • AI Risk Officers
    Empirical evidence for board-level risk sign-off. Quantified BIS, CPD, and TSI scores per model.
  • Compliance Leads
    EU AI Act Article 12 ready. NIST AI RMF aligned. Audit-ready evidence packs downloadable immediately.
  • Deployment Gatekeepers
    Set minimum BIS threshold. Block release on REJECTED verdict. Retest after model changes.
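
The gatekeeping pattern above (a minimum BIS threshold plus a hard block on a REJECTED verdict) could be wired into a release pipeline roughly as follows. This is a sketch under assumptions: the function name and parameters are hypothetical, the verdict strings are taken from this page, and how verdict and score are read from a Release Decision Pack is not specified here.

```python
def release_allowed(verdict: str, bis: float, min_bis: float) -> bool:
    """Decide whether a release may proceed.

    Blocks on a REJECTED verdict, and otherwise requires the BIS
    (a percentage) to meet the team's own minimum threshold.
    """
    if verdict == "REJECTED":
        return False
    return bis >= min_bis

# Example policy: the team requires a BIS of at least 90%.
MIN_BIS = 90.0
approved_and_high = release_allowed("APPROVED", 95.5, MIN_BIS)
approved_but_low = release_allowed("APPROVED", 72.0, MIN_BIS)
rejected = release_allowed("REJECTED", 98.0, MIN_BIS)
```

How an APPROVED WITH RESTRICTIONS verdict is handled is a policy choice. This sketch lets it through if the threshold is met, and a stricter pipeline might route it to manual review instead.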

Public Evidence

Full results with temperature breakdowns and metric definitions.

The MTCP evidence layer provides comparative release assurance data across 32 independently evaluated frontier models, drawn from 181,448 structured probe interactions at four temperature settings.

View Evidence →

Ready to evaluate your model?

Submit your endpoint for a confidential MTCP evaluation. Receive a Release Decision Pack, full evidence audit trail, and deployment verdict. EU AI Act ready.