Compliance evidence triage
Compliance
Classify each uploaded document or screenshot against the right control (SOC 2 / ISO 27001) and flag whether it is sufficient evidence, at scale.
The routing recipe
Job type: Classification / routing
Policy: Cheapest that passes
Constraints: groq
Routes to: GPT-OSS 20B on groq (groq:openai/gpt-oss-20b)
Match score: 80.6
Task fit: 84.6
Cost: 75.4
Speed: 96.4
Why this model from benchmarks & capabilities
- "Classification / routing" leans hardest on instruction following; GPT-OSS 20B scores 78/100 there, ranking #18 of 32 in the catalog.
- Strength on this class of work shows up on IFEval and MMLU (zero-shot), where OpenAI open-weight models are competitive.
- Cost: ~$0.112/Mtok blended ($0.03 in / $0.14 out). Speed: ~1000 tok/s on Groq LPU.
- Chosen over Llama 3.1 8B Instant for higher task fit (84.6 vs 75.6) and higher speed.
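The blended cost figure can be checked with a little arithmetic. The page does not say how it weights input against output tokens; the sketch below assumes a 25% input / 75% output split, which is one weighting that reproduces the quoted ~$0.112/Mtok. The per-document token counts in the batch estimate are hypothetical and should be replaced with measurements from real artifacts.

```python
# Prices per million tokens, taken from the figures quoted above.
PRICE_IN = 0.03
PRICE_OUT = 0.14

# ASSUMPTION: the page does not state its blend; 75% output reproduces ~0.112.
ASSUMED_OUTPUT_SHARE = 0.75


def blended_price(price_in, price_out, output_share):
    """Weighted average price per million tokens for a given output share."""
    return (1.0 - output_share) * price_in + output_share * price_out


def batch_cost(docs, tokens_in, tokens_out, price_in=PRICE_IN, price_out=PRICE_OUT):
    """Total dollar cost of classifying `docs` documents."""
    per_doc = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
    return docs * per_doc


blended = blended_price(PRICE_IN, PRICE_OUT, ASSUMED_OUTPUT_SHARE)

# HYPOTHETICAL workload: 10,000 artifacts, 1,500 prompt tokens and
# 100 completion tokens each.
estimate = batch_cost(docs=10_000, tokens_in=1_500, tokens_out=100)

print(f"blended: ${blended:.4f}/Mtok")        # blended: $0.1125/Mtok
print(f"10,000 documents: ${estimate:.2f}")   # 10,000 documents: $0.59
```

Classification prompts are usually input-heavy, so computing cost from separate input and output token counts, as `batch_cost` does, is likely to be more informative than a single blended rate.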
| Capability the job needs | Weight | GPT-OSS 20B | Catalog rank |
|---|---|---|---|
| instruction | 0.90 | 78/100 | #18 of 32 |
| speed | 0.90 | | #1 of 32 |
| knowledge | 0.30 | | #21 of 32 |
Relevant benchmarks: IFEval, MMLU (zero-shot)
Economics
Cost: ~$0.112/Mtok blended ($0.03 in / $0.14 out).
Speed: ~1000 tok/s on Groq LPU.
vs runner-up: Chosen over Llama 3.1 8B Instant (score 79.9) for higher task fit (84.6 vs 75.6) and higher speed.
Example result
Sample input
Artifact: 'AWS IAM policy export showing MFA enforced on all root and admin accounts, dated this month.'
control: SOC 2 CC6.1 (Logical access — authentication)
evidence_type: configuration export
sufficient: yes
reason: Demonstrates MFA enforcement on privileged accounts with a current date; maps cleanly to access-control criteria. (Also supports ISO A.5.17 / A.8.5.)
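When running this at scale, it helps to validate each model reply before it reaches an audit tracker. The following is a minimal sketch that parses the `key: value` layout shown in the example into a typed record. The `EvidenceVerdict` class and `parse_verdict` function are hypothetical names introduced for illustration, and the sketch assumes the model always replies in exactly this four-field layout, which the page does not guarantee.

```python
from dataclasses import dataclass

# Field names as they appear in the example result above.
REQUIRED_FIELDS = ("control", "evidence_type", "sufficient", "reason")


@dataclass
class EvidenceVerdict:  # hypothetical record type, not part of any API
    control: str
    evidence_type: str
    sufficient: bool
    reason: str


def parse_verdict(text: str) -> EvidenceVerdict:
    """Parse 'key: value' lines; raise ValueError on a malformed reply."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so colons inside the value survive.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in REQUIRED_FIELDS:
            fields[key] = value.strip()

    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    answer = fields["sufficient"].lower()
    if answer not in ("yes", "no"):
        raise ValueError(f"unexpected 'sufficient' value: {answer!r}")

    return EvidenceVerdict(
        control=fields["control"],
        evidence_type=fields["evidence_type"],
        sufficient=(answer == "yes"),
        reason=fields["reason"],
    )


# A shortened copy of the example result.
sample = """control: SOC 2 CC6.1
evidence_type: configuration export
sufficient: yes
reason: Demonstrates MFA enforcement on privileged accounts with a current date."""

verdict = parse_verdict(sample)
print(verdict.control, verdict.sufficient)
```

Rejecting replies with missing fields or an unexpected `sufficient` value, rather than guessing, keeps ambiguous classifications out of the evidence record and makes them easy to route to a human reviewer.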
Test it on your own data