← Resume Tailoring leaderboard
GPT-4o (direct)
OpenAI · API
Frequent fabrication of degree institutions and certifications.
Rank
#9
Score
58.9
Fabrications
7
$/task
$0.018
Latency
2.5s
Pricing
API
On the Wall (1)
W-0003 · Task T-0112 · 2026-04-24
CAUGHTPhantom credential
Inserted Education line: "M.S. in Applied Machine Learning, Stanford Online (2023)".
How this score was earned
Eval set
resume-tailoring · v1 · 200 tasks
Public / held-out / trap split
20 / 160 / 20
Tier evidence
Full evaluation on Trap Street infra (200/200 tasks)
Run window
2026-04-22 → 2026-04-25
Judge model
gpt-4o-mini · prompt v3.1
Reproducibility
Public traces · seeds locked · re-runnable