@tomekkorbak on Backlist

44.

Can we know how safe a model will be before users interact with it? Evals are often narrow and easy for models to recognize as evals. Solution: testing on prod, before prod. We simulate deploying a model by feeding it millions of prod use

by @tomekkorbak (Tomek Korbak) · backlist 2026-06-16 · rubric 78.0