AI Agents Are Taking on More Work: Why Verification Becomes a Central Problem

Executives see great potential in AI tools. At a recent conference, they discussed current challenges and called for more reliability and transparency. While modern AI models are extremely powerful, they still present numerous challenges in practice.

At this year's "Fortune Brainstorm Tech" conference, executives from various companies shared their experiences with AI agents. Many reported similar issues, especially regarding the traceability of results.

AI Agents Need to Work More Transparently

Tech companies have aggressively pushed the use of AI agents. Nvidia CEO Jensen Huang reportedly said that his employees are "crazy" if they don't use AI for as many tasks as possible. Meta is pursuing a similar approach. This push has consequences, however: researcher Summer Yue wanted an Openclaw agent to manage her inbox, and it deleted all her emails instead. In light of such problems, the traceability and reliability of AI agents were at the top of the agenda at this year's Fortune Technology and Innovation Conference.

"A central question we are grappling with is how to develop a system that works correctly as often as possible," said Edwin Olson, founder and CEO of May Mobility. Since errors are inevitable, transparency plays a crucial role: one must understand why an error occurred in order to avoid it in the future. Thomson Reuters, which offers AI-driven services for legal and tax compliance, also focused on accountability early on. According to Chief Data Officer Caitlin Halferty, transparency is one of the company's four pillars of "trustworthy" product quality, alongside data protection, expertise, and reliable content.

Verification of Results Is Time-Consuming

Several participants also emphasized the importance of self-regulating systems. At May Mobility, this means equipping autonomous vehicles with systems that can simulate and evaluate multiple scenarios simultaneously. Elena Kvochko, founder and CEO of Trustguard AI, describes a similar method where AI systems monitor each other. This is comparable to working in an editorial team: One agent is the author of a text, and the other is the editor, whose sole task is to find errors or inaccuracies. It is crucial that the verification occurs in separate systems: "You don't want AI to evaluate its own work," Kvochko stated.
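The author-and-editor arrangement described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Trustguard AI's implementation: the names `Review`, `run_with_independent_review`, `toy_author`, and `toy_editor` are hypothetical, and the two toy functions merely stand in for calls to two separate AI systems.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    approved: bool
    issues: List[str]


# Hypothetical signatures: in practice each would call a *different* system.
Author = Callable[[str, List[str]], str]   # (task, issues to fix) -> draft
Editor = Callable[[str, str], Review]      # (task, draft) -> review


def run_with_independent_review(
    task: str, author: Author, editor: Editor, max_rounds: int = 3
) -> Tuple[str, Review]:
    """Let one agent write and a separate agent check, for up to max_rounds."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    issues: List[str] = []
    for _ in range(max_rounds):
        draft = author(task, issues)   # the author never grades its own work
        review = editor(task, draft)   # the editor's only job is finding errors
        if review.approved:
            break
        issues = review.issues         # feed the findings back to the author
    return draft, review


# Toy stand-ins (assumptions for illustration only).
def toy_author(task: str, issues: List[str]) -> str:
    # First attempt contains an error; the revision fixes it.
    return "2 + 2 = 4" if issues else "2 + 2 = 5"


def toy_editor(task: str, draft: str) -> Review:
    ok = draft.endswith("= 4")
    return Review(approved=ok, issues=[] if ok else ["sum is wrong"])


draft, review = run_with_independent_review("add 2 and 2", toy_author, toy_editor)
```

If the editor still objects after the last round, the function returns the unapproved draft together with the open issues, so a human can pick up where the agents stopped.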

Such structures are becoming increasingly important as AI takes on more tasks and its output outgrows the capacity to check it. "You end up in a situation where so much work has been done and so much needs to be verified that you can no longer really hold anyone accountable," said Gregor Stewart, Chief AI Officer at SentinelOne. This discrepancy is particularly evident in programming: Waydev CEO Alex Circei told TechCrunch that while AI produces more code, that code needs to be revised more often. The initial acceptance rate is between 80 and 90 percent, but once later corrections are taken into account, it drops to between 10 and 30 percent.

AI Agents Often Create More Work

AI agents often create more work instead of saving time, though assessments vary by position: according to a survey by consulting firm Section, 40 percent of employees report no time savings from AI, while 19 percent of executives said they save more than twelve hours a week. For AI to deliver real value, the problem of time-consuming verification must be solved. Instead of manually checking tens of thousands of lines of code, teams are looking for ways to automate this process. According to Stewart, methods originally developed for safety-critical industries could be applied here.