AI Agents Do Well in Simulations, Falter in Real-World Shopkeeping Test

pymnts.com/news/artificial-intelligence/2025/ai-agents-do-well-in-simulations-falter-in-real-world-shopkeeping-test

In a bid to test whether artificial intelligence (AI) agents can operate autonomously in the real economy, Andon Labs and Anthropic deployed Claude Sonnet 3.7, nicknamed “Claudius,” to run a small, automated vending store at Anthropic’s San Francisco office for a month.

This story appeared on pymnts.com, 2025-07-02 23:15:57.