(2024-03-20) Webb Who Will Build New Search Engines For New Personal Ai Agents
Matt Webb: Who will build new search engines for new personal AI agents? Short version: We’re going to end up running personal AI agents to do tasks for us. The actual issue here is how exactly to make them personalised. Ultimately this is where new search engines come in, as I’ll explain.
For instance you’ll say, hey go book me a place for drinks tonight in Peckham, it needs to have food and not be insanely busy.
And the agent will go away and browse the web, checking that spots are open/not insanely busy/in accordance with my previous preferences, and then come back to me with a shortlist. Option 1, I’ll say, and the agent will book it.
I’ve been hacking on agents this week and omg I have a lot of opinions haha.
Decent agents have been technically feasible for approx 1 year
You give the AI a goal, then you tell it to choose for itself which tool to use to get it closer to its goal…
…and then you run it again, in a loop, automatically, until the AI says that it’s done.
And it totally works! With today’s technology! It’s really simple to build.
You can embellish the basic looping pattern. Agents can retain context between sessions, e.g. remembering that the user Matt prefers some types of bars and not others. That’s part of what makes an agent personal.
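Here’s a minimal sketch of that loop in Python. Everything in it is illustrative: `call_llm` stands in for whichever model API you use, the tools are stubs, and the JSON protocol is one I’ve made up, not any particular framework’s:

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for your model API of choice (hypothetical helper)."""
    raise NotImplementedError

# Stub tools; a real agent would have web search, booking APIs, etc.
TOOLS = {
    "search_web": lambda q: f"results for {q!r}",
    "check_busyness": lambda venue: f"{venue} is quiet until 21:00",
}

def run_agent(goal: str, memory: dict, max_steps: int = 10) -> str:
    history = []
    for _ in range(max_steps):
        prompt = (
            f"Goal: {goal}\n"
            f"User preferences (persisted between sessions): {json.dumps(memory)}\n"
            f"Steps so far: {json.dumps(history)}\n"
            f"Available tools: {list(TOOLS)}\n"
            'Reply with JSON: {"tool": ..., "arg": ...} or {"done": <answer>}'
        )
        decision = json.loads(call_llm(prompt))
        if "done" in decision:
            return decision["done"]  # the model has declared itself finished
        observation = TOOLS[decision["tool"]](decision["arg"])
        history.append({"tool": decision["tool"], "observation": observation})
    return "Gave up after max_steps."
```

The `memory` dict is doing the “personal” part: persist it between sessions and the agent remembers that Matt prefers some bars and not others.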
I first wrote about AI agents exactly a year ago: The surprising ease and effectiveness of AI in a loop (Mar 2023). ((2023-03-16) Webb The Surprising Ease And Effectiveness Of Ai In A Loop)
There was a TON of excitement about agents at the time. And then… nothing.
What happened? I mean, people went off and raised money and they’re now busy building, that’s one reason. But what makes agent-based products less low-hanging fruit than, say, gen-AI for marketing copy, or call centre chatbots? (These are based on prompting and RAG - retrieval-augmented generation - two other fundamental techniques for using LLMs.)
There are two challenges with agents:
- They’re expensive. Like, fulfilling a task can easily consume 100x more resources than a response in ChatGPT (rough arithmetic below).
- They’re divergent. If you give an agent even a slightly open-ended task, it can end up doing bizarre things.
And these challenges combine to make agent-based products really hard to design.
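To put a rough number on “expensive”: the loop re-sends the growing history to the model on every step, so token usage compounds. The figures below are invented but plausible, purely to show the shape of the arithmetic:

```python
# One ChatGPT-style reply: one model call over a short context.
chat_tokens = 1_000

# One agent task: every step re-sends goal + tools + accumulated history.
steps = 20
avg_context_per_step = 5_000
agent_tokens = steps * avg_context_per_step   # 100,000 tokens

print(agent_tokens / chat_tokens)  # => 100.0, i.e. ~100x a single reply
```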
Imagine an AI agent, lacking common sense but highly autonomous and motivated.
This is entirely plausible! LET ME GIVE YOU A REAL LIFE EXAMPLE:
Agents have started shipping, a year after the original flurry.
Devin is an AI software engineer built by a startup called Cognition.
Here’s Ethan Mollick trying Devin for himself (X/Twitter):
I asked the Devin AI agent to go on reddit and start a thread where it will take website building requests.
It did that, solving numerous problems along the way. It apparently decided to charge for its work.
That’s roughly my takeaway too:
- agents are a fundamental technical pattern to use with LLMs, as I said, like prompting (gen-AI) and RAG
- GPT-3.5-equivalent models aren’t good enough, GPT-4 models are just barely, and GPT-5 definitely will be… but GPT-5-level models haven’t been released yet, and will likely be too expensive for most domains. To begin with.
Hey but what about personal agents? They’re coming too… but have unique challenges. (Intelligent Software Assistant)
Dealing with personal data is a sensitive product issue – do you really want an AI agent looking through your bank account, given the unpredictable divergence issue?
But, you know, trust is a solvable issue with sane design:
- OpenAI is already experimenting with “personal memory” for ChatGPT.
- Instead of running the AI in the cloud, run it on your phone, where it has access to your data without risk of exfiltration.
- Make it so I, the user, get to manually approve all moments of tool use, e.g. I approve the “booking a bar” action and get to see what data is being transferred (a sketch of that gate is below).
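Here’s what such an approval gate could look like, as a minimal Python sketch; the function name, payload shape, and venue are all made up for illustration:

```python
def approved_call(tool_name: str, payload: dict) -> bool:
    """Show the user exactly what would be sent before the tool runs.
    A hypothetical gate: a design sketch, not any real framework's API."""
    print(f"Agent wants to call {tool_name!r} with:")
    for key, value in payload.items():
        print(f"  {key}: {value}")
    return input("Approve? [y/N] ").strip().lower() == "y"

# The agent loop would call this before every side-effecting action:
if approved_call("book_bar", {"venue": "Forza Wine", "time": "19:30", "party": 4}):
    ...  # actually perform the booking here
```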
But but but. There’s a problem I haven’t discussed.
If my personal AI agent is going to use tools, then which tools?
When my AI agent is searching for a restaurant, does it use Google Apps or Yelp or Resy or…
“Restaurant search” is a tool which will be used by future AI agents in answering a user intent.
How will the agent decide?
One answer to this is: the user doesn’t get to choose how a request is fulfilled.
Steve Messer has a great post about Monetising the Rabbit R1. In short, Rabbit-the-device is a one-off purchase with ongoing opex for Rabbit-the-company, so the ongoing revenue has to come from somewhere.
This feels like one all-too-plausible future: when I book a restaurant, it’s based on the kickback that Rabbit (or whoever) will get. Ugh.
So a world we might want to shoot for is: the user gets to choose how every request is fulfilled – and now we’re into an interesting challenge! How will we build that?
For an answer, I think we look at BRAND and SEARCH ENGINES.
My AI agent will use 100 signals to choose which tool to use, and this is a Google-shaped problem
I refuse to believe in a near-term future where AI agents somehow displace my iPhone, or require me to have another device in my pocket
The AI agent chooses to search for a restaurant using The Infatuation rather than Yelp because I have that app installed on my phone.
My personal preferences are expressed, not as a questionnaire given to me piecemeal by the agent, but simply by looking at my existing home screen.
I choose a vibe and build trust based on brand; I discover new tools just like I discover new apps today, via apps and search.
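To make the “Google-shaped problem” concrete, here’s one way the agent could fold those signals into a tool choice. The signals, weights, and numbers are all invented for illustration; a real ranker would learn from far more than three features:

```python
# Candidate tools for the "restaurant search" intent, with illustrative signals.
candidates = [
    {"name": "The Infatuation", "installed": True,  "brand_trust": 0.9, "past_success": 0.7},
    {"name": "Yelp",            "installed": False, "brand_trust": 0.6, "past_success": 0.5},
]

# Invented weights; "installed on the home screen" counts for a lot.
WEIGHTS = {"installed": 0.5, "brand_trust": 0.3, "past_success": 0.2}

def score(tool: dict) -> float:
    return sum(WEIGHTS[signal] * float(tool[signal]) for signal in WEIGHTS)

best = max(candidates, key=score)
print(best["name"])  # => "The Infatuation": my home screen expressed my preference
```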
If my query is something like: “turn on the lights in this Airbnb” and the AI agent on my phone needs to find an app to control the lights, obviously I won’t have that app already, and so of course it’s going to search for it.
So now we need an AI tool search engine for use by AI agents.
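What might the query side of that search engine look like? Below is a toy version that ranks tools by word overlap with the agent’s intent. Everything is hypothetical: the tool names, the manifest format, the matching; a real engine would use embeddings, learned ranking, and a crawler for published tool manifests:

```python
def tokenize(text: str) -> set[str]:
    return set(text.lower().split())

# A toy index; in reality this would be crawled from published tool manifests.
TOOL_INDEX = [
    {"name": "lumina-home",  "description": "control smart lights and switches in rented homes"},
    {"name": "table-finder", "description": "search and book restaurant tables nearby"},
]

def search_tools(intent: str, k: int = 1) -> list[dict]:
    """Rank indexed tools by word overlap with the agent's intent."""
    query = tokenize(intent)
    ranked = sorted(TOOL_INDEX,
                    key=lambda tool: len(query & tokenize(tool["description"])),
                    reverse=True)
    return ranked[:k]

print(search_tools("turn on the lights in this Airbnb"))
# => the lights tool ranks first on shared words like "lights"
```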
There’s ~~an app~~ a patent for that
Oh haha I forgot to say this is not a new idea.
I want to bring in a project I did with the Android group at Google back in 2016.
In the overlap at the centre of that project’s diagram: matching assistants.
Also, a twist! Assistants should be proactive.
If I’m chatting with a friend about going out for a drink (using WhatsApp, say) an AI should be able to jump in and say it can help.
In the app world, designers deal with feature discovery by building in affordances – visual representations of available functionality. But, as I said in my explorations of human-AI interaction (PartyKit blog): [ChatGPT] has no affordance that points the way at exactly how it can help.
AI agents need to be able to jump in
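A crude sketch of that “jumping in”: watch the conversation for an intent that a known tool can serve, then offer help instead of acting silently. The trigger phrases are invented; a real system would classify intent with a model rather than keywords:

```python
# Invented trigger phrases mapped to capabilities.
TRIGGERS = {
    "drink": "bar booking",
    "dinner": "restaurant search",
}

def maybe_offer_help(message: str) -> str | None:
    """Return an offer to show the user, or None to stay quiet."""
    for phrase, capability in TRIGGERS.items():
        if phrase in message.lower():
            return f"I can help with {capability}. Want me to look for options?"
    return None

print(maybe_offer_help("fancy going out for a drink on Friday?"))
# => offers bar booking; the user decides whether the agent acts
```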
Someone ought to start on that index of tools for AI agents, with novel query and ranking technologies. That’s a key enabler.
Other than that, oh my goodness there’s a lot to build and a lot to figure out.