47.
Offline sandbox implementations in eval frameworks are all so different. Because the agent runs inside the sandbox, its API calls still need to be online. Some toggle network access on/off for every API call, some install a sidecar, some disable web search tools, so