Is OpenAI’s Deep Research amazing or worthless?
2/17/2025
tl;dr: It’s more a hindrance than a help for good analysis.
Deep Research is great at what LLMs do best: processing and synthesizing large quantities of language, far in excess of what humans can do. But we already had that in other models. Deep Research lacks the intelligence that we need to derive insight for our clients. Instead, it marches off in a non-expert direction and returns something that sounds great but is actually devoid of real analysis.
Deep Research has no motivation of its own. Motivation is a consequence of domain expertise: What do I need to know, and why? What is the so-what? Anyone who has watched Deep Research, Operator, or any other foundational AI in action can confirm they struggle mightily to define a goal and follow through the way an expert would. Until Deep Research can come close to an expert’s goal definition and source identification, we don’t want it making those decisions for us; its choices are almost always worse than what we ask our agentic system to do.
Deep Research also doesn’t have instinct. Our instincts about where to look for signal are the product of decades of experience in intelligence. It is easy for an expert to define what constitutes a good source, but far more difficult to rank and prioritize among good sources. Deep Research can do neither.
We asked Deep Research to develop a report on the latest developments in US-China competition, and it relied significantly on a public transcript of a phone call between Secretary of State Marco Rubio and Foreign Minister Wang Yi. That source has the veneer of quality: primary, official government communications that can be corroborated across two accounts. In reality, it is a heavily sanitized PR exercise that can stir up the news media but contains little intelligence value.
For example, Deep Research parroted Chinese talking points from that call, wasting everyone’s time in the process:
From China’s perspective, defending sovereignty (especially on Taiwan) is non-negotiable. In the official readout, Wang Yi stated that Taiwan is “中国领土的一部分” (“part of China’s territory since ancient times”) and that China “绝不允许把台湾从中国分裂出去” (“will never allow Taiwan to be separated from China”). He urged the U.S. to “务必慎重处理” (“handle [the Taiwan issue] with utmost prudence”). Chinese sources claim that Rubio reaffirmed the U.S. does not support Taiwan independence. The Ministry of Foreign Affairs indicated that China welcomes any assurance from Washington that the one-China policy remains intact.
Not only is that riveting analysis unhelpful; these call transcripts consist of nothing but boilerplate and protocol. Experts wouldn’t spend much time on them, but Deep Research leaned on them heavily.
Finally, Deep Research simply isn’t persistent. Along with Operator and the agentic reasoning elements in underlying models like o1 and o3-mini, Deep Research generally gives up when things get hard. Many of the best, most insightful sources we collect are exceedingly difficult to work with, and some actively try to stymie our efforts. When Deep Research meets those challenges, it gives up and moves on to something more readily accessible on the open internet.
All of these things will change. AI models will continue to improve. But motivation, instinct, and persistence span billions of years of evolution; language occupies just a tiny fraction of that time. Engineering those traits into agentic AI systems will be harder and more complicated than building large language models. For today’s analysis, large language models remain an excellent tool in the hands of domain experts, and a vapid party trick otherwise.
This is where Radiant Intel comes in. We have an automated collection infrastructure that pulls only from expert-vetted sources, curated for the task at hand. Our system uses proprietary AI agents to develop analysis that reflects our team’s decades of experience in intelligence. Finally, we have a robust testing and evaluation process to make sure our clients receive rigorous analysis, without translation errors, hallucinations, or empty rhetoric.
As a company built on foundational AI, we get as excited as anyone when the labs roll out genuine advances. This ain’t it.