44.
Can we know how safe a model will be before users interact with it? Evals are often narrow and easy for models to recognize as evals. Solution: testing on prod, before prod. We simulate deploying a model by feeding it millions of prod use