Help build golden sets and grading rubrics for enterprise tasks.
As an ML Evaluation Intern on our Research team, you'll work alongside engineers, researchers, and domain experts to take AI from prototype to production inside regulated, high-stakes operations. You'll own outcomes end to end: scoping the problem, shipping a governed system, and instrumenting it so impact is measurable against a real baseline.
This is a hands-on, high-ownership role for someone who wants their work to reach real users. You'll help raise the quality bar through evaluation, review, and clear written communication.
We'd love to hear from you. Reach out and tell us why this role is the one for you, even if your background doesn't tick every box.