4 tips for building better AI agents that your business can trust

ZDNET’s key takeaways
- Companies are exploring AI agents in multiple ways.
- Professionals must consider how to exploit these technologies.
- Measurement, collaboration, and experimentation are key.
AI agents will impact every professional role. If your company hasn’t started using agents yet, it will soon, either through off-the-shelf software products or in-house tools that draw on large language models and data sources.
Professionals exploring how to use agents in their roles are well-advised to seek best-practice guidance. One such source of information is Joel Hron, CTO at Thomson Reuters Labs, who is helping the information services company exploit generative AI, machine learning, and agentic technologies.
Hron told ZDNET that Thomson Reuters uses a mix of in-house models and off-the-shelf tools to power its AI innovations. As well as drawing on advances from Big Tech firms' frontier labs, Hron and his team ensure the firm exploits its proprietary knowledge and assets.
“If you look at the core of what we do well, it’s being able to synthesize human expertise and information into judgment that can be served back to professionals,” he said.
“The delivery mechanism for how that expertise is delivered is evolving right now. Traditionally, it’s been delivered via software. But it’s increasingly delivered via agents, or agents plus software.”
Hron points to several key agentic achievements at Thomson Reuters, including the AI-powered legal research tool Westlaw Advantage and the firm’s Deep Research agent that reviews insights and strategizes as a researcher would.
From these explorations, Hron said he’s learned four key lessons that professionals can use to build trustworthy agentic AI systems.
1. Measure your success
Hron said the first area to focus on is evaluations: “You need to know what good looks like.”
While this focus on evaluations sounds like an obvious requirement, Hron said it’s a hard process to get right, to quantify, and to systematize.
“We’ve said that for the last three years that this is one of the most important things for building good AI systems, and it continues to be true today in an era of agents,” he said.
Hron’s team tracks and measures agentic success in several ways. First, they leverage public benchmarks, which he said provide good early indicators of the positive potential performance of new models.
Second, they’ve developed their own internal benchmarks with strong directions for automated evaluations: “Rather than just saying, ‘How close is the generated answer to a good answer?’, our process is about really defining, ‘Well, what makes the answer good?'”
Finally, Thomson Reuters keeps humans in the loop, ensuring evaluations go a step beyond automated assessments.
“Automated evaluations help drive the flywheel faster for our development teams, and they can test a lot of ideas relatively quickly, and that’s good. But before we ship, we still want the confidence of our human experts and their assessment of the performance,” he said.
“The continued reliance on that approach has allowed us to ship great products that perform well in the market. I think human input is a critical ingredient to us being able to do that work well and do it with confidence.”
2. Make experts sit together
Hron advised professionals to understand deeply what agents do and how they operate over time.
“Tightly coupling that awareness to the user experience is increasingly important,” he said. “If you think about these agentic systems like human AI collaborators, then the human and the agent need a common language and a common interface that they work on.”
Hron said this common language and interface should give humans valuable insight into agentic thought processes and vice versa.
“This area is a new and important UI experience, and I think tightly coupling deep technical understanding of the agent with a good user experience is critical.”
While many experts talk about the importance of human/agent coupling, Hron said the key to success is straightforward: bringing teams in the business together.
“This process isn’t scientific — it’s about forcing my designers to sit with data scientists and talk about what’s happening,” he said. “The closer we can make those two sets of people, and the more often they can sit together, the better you have the osmosis of thinking across those two areas.”
3. Develop proven capabilities
Despite any hype that might have you believe otherwise, Hron said professionals must recognize that agents and the models that power them are far from omniscient.
Hron said AI models are improving across three dimensions: writing code, executing plans, and multi-step reasoning. The latest advances allow model capabilities to be extended by other software tools.
“What that development means for us as a company is more positive than negative, because it means that, if we can take all of these hundreds of applications that we’ve sold into the market for many decades, and we can decompose them, then we have proven capabilities for professionals,” he said.
“If we can decompose these elements as tools for the agent, then we’re actually extending the capabilities of these models quite a lot, and that’s really the future of agents.”
Rather than seeing agentic AI as an omniscient model that attempts to do everything under the sun, Hron advised professionals to give agents access to proven capabilities people already use, which is a focus of his team.
“We’re looking at our systems and asking ourselves, ‘OK, we’ve built this for a human user for many, many years. Now, what ergonomics are required for an agent to work with this system? How do you adapt the process to be conducive to working with an agent, versus necessarily a human in all cases? And what does that approach mean for how the tool looks, feels, and performs?'”
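The decomposition Hron describes — taking capabilities built for human users and exposing them with ergonomics an agent can work with — typically means wrapping each capability in a machine-readable tool schema. The sketch below is a generic, hypothetical pattern; the registry, decorator, and stub search function are assumptions for illustration, not Thomson Reuters' API.

```python
# Sketch: exposing an existing, proven application capability as a tool
# an agent can discover and call. Names here are invented for the example.

from typing import Callable

# Registry the agent runtime would hand to the model as available tools.
TOOLS: dict[str, dict] = {}


def tool(description: str):
    """Register a capability with a description an agent can read."""
    def wrap(fn: Callable):
        TOOLS[fn.__name__] = {"fn": fn, "description": description}
        return fn
    return wrap


@tool("Search case law by keyword; returns matching case names.")
def search_cases(query: str) -> list[str]:
    # In a real system this would call the long-proven application;
    # a tiny in-memory stub stands in for it here.
    corpus = ["Smith v. Jones", "Doe v. Acme", "State v. Smith"]
    return [case for case in corpus if query.lower() in case.lower()]


# The agent runtime dispatches a model-issued tool call to the capability:
call = {"name": "search_cases", "args": {"query": "smith"}}
result = TOOLS[call["name"]]["fn"](**call["args"])
```

The design point matches Hron's framing: the model stays general, while trust comes from routing its actions through capabilities the business has already proven with human users.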
4. Look beyond the firewall
Thomson Reuters Labs recently launched the Trust in AI Alliance, a builder-led forum for senior AI researchers from Anthropic, AWS, Google Cloud, OpenAI, and Thomson Reuters to discuss how trust is engineered into agentic systems.
Hron said the Alliance, which shares lessons publicly to inform the broader industry conversation around trustworthy AI, also helps senior members of his team to learn best practices from industry pioneers.
“We’re trying to bring forward a focus for explainability and transparency in terms of how these models operate,” he said.
Hron said the technology pioneers and their models have significantly reduced the time and effort required to get from zero accuracy to 90%.
“But we’re not in the 90% game,” he said. “We’re in the 99% and 99.9% game, and we must consider how we get that extra nine or two nines of accuracy, which is the difference for trust.”
As part of this process, Thomson Reuters is also working with academic institutions. Late last year, the company announced a five-year partnership to create a joint Frontier AI Research Lab at Imperial College London.
“In these initiatives, we’re focused on those last two nines of accuracy, because that’s what people look to buy from us when we release our products to market,” said Hron.
“The frontier technology organizations will continue to push the limits on what’s possible. But for us, the margin is where the competitive edge in the world of law, tax, and compliance is won and lost. And so that’s what we really need to get right.”
