
DuckDuckGo bot detection turned search into an architecture problem

This started as a small practical problem: an agent was leaning on DuckDuckGo for web search, and DuckDuckGo started leaning back with bot detection. That looks like a provider issue for about five minutes. After that, it becomes a systems question.

Drafted April 2026 - AI and agents - search infrastructure, agent workflows, systems design

The first instinct in situations like this is usually to ask for a better API. Which provider is cheapest? Which one has the best free tier? Which one is least likely to rate-limit me this week? I get the instinct. I had it too. But the more I looked at the search stack available to agents, the less I thought the right answer was "pick the winner" and the more I thought the right answer was "stop pretending search is a single dependency."

Once one search source starts failing, the important question changes. It is no longer "which vendor should replace DuckDuckGo?" It is "what kinds of search failure do I need this system to survive?" That is a much better design question because it forces you to think in terms of failure modes instead of brand names.
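
To make that question concrete, it helps to name the failure modes rather than the vendors. Here is a minimal sketch in Python; the labels are mine, drawn from the failures discussed in this piece, not anything a particular provider exposes.

```python
from enum import Enum, auto

class SearchFailure(Enum):
    """Ways a search dependency can fail, independent of which vendor it is."""
    BLOCKED = auto()             # bot detection or CAPTCHA walls, the DuckDuckGo case here
    RATE_LIMITED = auto()        # quota pressure or throttling on the provider side
    THIN_RESULTS = auto()        # the call succeeds but returns little worth using
    EXTRACTION_FAILED = auto()   # links come back, but page content cannot be cleaned
    TIMEOUT = auto()             # the provider is up but too slow for the agent loop
```

Designing the stack against a list like this, rather than against a vendor list, is the shift the rest of this piece argues for.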

Search providers do not fail the same way

That sounds obvious, but I think a lot of agent tooling still gets built as if all search APIs are interchangeable pipes. They are not. Some are basically SERP infrastructure. Some are better for answer-oriented retrieval. Some are useful because they find links. Others are useful because they find links and immediately give you cleaned page content. If you treat them as substitutes, you end up with a brittle tool and a confused cost model.

This is the split I keep coming back to after reading the provider docs:

| Provider | What it appears optimized for | Why it belongs in a stack |
| --- | --- | --- |
| SerpAPI | High-fidelity SERP access through proxying, browser execution, and CAPTCHA handling. | Useful when you need something close to "what a user would have seen" and are willing to pay for the infrastructure abstraction. |
| Serper | Fast Google-style search results through a simpler developer surface. | A practical default lane when you want speed and volume more than premium fallback behavior. |
| Tavily | Agent-oriented web search with structured result content and controls that look designed for LLM workflows. | Better when the caller wants grounded answer material instead of raw link lists. |
| Exa | Search that leans on embeddings-based relevance instead of strict keyword matching alone. | Helpful when the agent is exploring a research question and exact keyword replication is not the main constraint. |
| Brave Search API | An API backed by Brave's own independent web index rather than a Google-shaped scraping path. | Good for failure-mode diversity because it is not just another wrapper around the same upstream behavior. |
| Firecrawl | Search plus optional scraping and content extraction in one workflow. | Useful when the hard part is not finding links but getting clean downstream content for an agent to read. |

I am being careful here not to turn this into a fake precision ranking. Providers move. Pricing changes. Rate limits change. Product surfaces change. What seems more durable is the category distinction. The important design move is to mix providers that fail differently.

The real design problem is routing, not rotation

I also think "rotation" is a slightly misleading word. It makes the answer sound like round-robin load balancing: query one goes to provider A, query two goes to provider B, and so on. That might spread traffic, but it does not necessarily improve the system. A better mental model is routing.

A routing layer can ask more useful questions:

  1. What is this query actually for: fresh results, deeper research, or search plus content extraction?
  2. Has the system already answered an equivalent query recently enough to reuse it?
  3. What is this call allowed to cost, in both latency and money?
  4. If the primary provider fails, does the fallback actually fail differently, or does it lean on the same upstream?

That last question matters a lot. If provider A and provider B both depend on roughly the same search surface and both fail under the same anti-bot pressure, the system may look redundant while being operationally identical. A real fallback should widen the search posture, not just multiply invoices.
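
One way to make that check concrete is to tag each provider with a rough failure-mode fingerprint and refuse to treat a chain as redundant unless its members actually differ. This is a minimal sketch; the profile fields and example values are illustrative, not data from any vendor.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProviderProfile:
    """Rough failure-mode fingerprint for a search provider (labels are illustrative)."""
    name: str
    upstream: str      # e.g. "google-serp", "independent-index", "embeddings-index"
    blocks_like: str   # which kind of anti-bot pressure tends to take it down

def chain_is_diverse(chain: list[ProviderProfile]) -> bool:
    """A fallback chain only helps if successive providers fail differently."""
    upstreams = {p.upstream for p in chain}
    return len(upstreams) > 1  # all-same-upstream means redundancy in name only

# A SERP-style primary backed by an independent index is diverse;
# two wrappers around the same Google-shaped path are not.
serp_primary = ProviderProfile("serp-style", "google-serp", "captcha-walls")
independent = ProviderProfile("independent-index", "independent-index", "api-quota")
same_again = ProviderProfile("another-serp-wrapper", "google-serp", "captcha-walls")

assert chain_is_diverse([serp_primary, independent])
assert not chain_is_diverse([serp_primary, same_again])
```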

What I would build first

If I were building this from scratch today, I would not start with a giant provider matrix and a hundred config flags. I would start with a thin search control plane (sketched in code right after this list):

  1. A normalized query interface so the rest of the agent is not coupled to one provider's response shape.
  2. A small intent model with lanes like fresh, research, and search_and_extract.
  3. A cache keyed by normalized query plus freshness expectations.
  4. A fallback chain designed around different failure modes, not just different companies.
  5. Basic observability: provider used, latency, empty-result rate, fallback rate, and per-query cost.
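
For concreteness, here is a minimal sketch of that skeleton in Python. It is a sketch under the assumptions of the list above, not a real implementation: the SearchProvider protocol, the lane names, and the log fields are mine, and real vendor clients would sit behind the search method.

```python
import time
from dataclasses import dataclass, field
from typing import Protocol

class SearchProvider(Protocol):
    """Normalized interface so the agent never sees a vendor's raw response shape."""
    name: str
    def search(self, query: str) -> list[dict]: ...

@dataclass
class QueryPlan:
    lane: str              # "fresh" | "research" | "search_and_extract"
    max_age_seconds: int   # freshness expectation; part of the cache decision

@dataclass
class SearchControlPlane:
    routes: dict[str, list[SearchProvider]]   # lane -> ordered fallback chain
    cache: dict[tuple[str, str], tuple[float, list[dict]]] = field(default_factory=dict)
    log: list[dict] = field(default_factory=list)   # stand-in for real observability

    def search(self, raw_query: str, plan: QueryPlan) -> list[dict]:
        query = " ".join(raw_query.lower().split())   # normalized query interface
        key = (query, plan.lane)
        hit = self.cache.get(key)
        if hit is not None and time.time() - hit[0] < plan.max_age_seconds:
            return hit[1]                             # cached, nothing spent

        for i, provider in enumerate(self.routes[plan.lane]):
            start = time.time()
            try:
                results = provider.search(query)
            except Exception:
                results = []                          # blocked or rate-limited: fall through
            self.log.append({
                "provider": provider.name,
                "lane": plan.lane,
                "latency_s": round(time.time() - start, 3),
                "empty": not results,
                "was_fallback": i > 0,
            })
            if results:
                self.cache[key] = (time.time(), results)
                return results
        return []   # degraded, but visibly so: the log records every failed hop
```

Per-query cost is the one item from the list missing here; it depends on each vendor's pricing, so in practice it would ride along as another field in the same log entry.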

That is enough to learn something real. It is also enough to stop an agent from silently degrading the moment one provider gets grumpy.

The sequence matters. I would much rather have two providers, one cache, one content-extraction path, and decent logs than six providers stuffed into a config file with no understanding of which route is actually working. More integrations can make the system stronger, but they can also make it harder to reason about why the agent answered badly, why it got expensive, or why it suddenly started timing out.

Robust search is partly a cost-discipline problem

This is the other part that gets hidden when people talk about "the best search API." Robustness is not free. Every extra provider increases operational surface area. Every premium fallback creates a temptation to spend your way around a design problem. Every search-plus-scrape path can quietly multiply cost if the agent is too eager to fetch full content.

That is why caching and deduplication are not boring add-ons. They are part of the architecture. If the same vague query gets re-issued five times in ten minutes because the agent loop is sloppy, the best provider stack in the world will still look fragile and expensive. Search quality problems are often orchestration problems wearing a provider mask.
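
Here is a small sketch of that deduplication point, reusing the normalized-query idea from the control-plane sketch above. The ten-minute window comes from the example in this paragraph; the function names are mine.

```python
import time

_recent_sends: dict[str, float] = {}   # normalized query -> when it last went to a provider

def normalize(query: str) -> str:
    """Collapse trivially different phrasings so they share one cache entry."""
    return " ".join(query.lower().split())

def should_hit_provider(query: str, window_seconds: int = 600) -> bool:
    """Suppress re-issues of an equivalent query inside the window; serve cache instead."""
    key = normalize(query)
    now = time.time()
    last = _recent_sends.get(key)
    if last is not None and now - last < window_seconds:
        return False   # the sloppy loop pays once, not five times
    _recent_sends[key] = now
    return True
```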

The broader lesson is about agent infrastructure, not search alone

The reason this question stayed interesting to me is that it feels like a small version of a larger pattern in agent design. We keep trying to solve reliability at the edge of the system. Better model. Better prompt. Better tool. Better fallback. Sometimes that is right. But a lot of the real leverage shows up one layer earlier, where you decide how the system classifies work, chooses a path, observes failure, and degrades under pressure.

Search is a good example because the failure is obvious. The agent gets blocked, or the results get thin, or the content extraction path starts collapsing. But the same lesson keeps showing up elsewhere. If the system depends on one perfect provider, one perfect prompt, or one perfect model, it probably is not a system yet. It is still a bet.

My current read is that robust agent search starts when you stop asking for a better search box and start building a better routing layer around imperfect ones. That does not fully settle the question. It does clarify where I would spend the next engineering hour.