Building AI Products, Part 1: First Principles Thinking

Every week, someone pitches me an AI feature that starts with "What if we used GPT to..." and I have to resist the urge to stop them right there.

Not because the idea is bad. Sometimes it's great. But starting with the model is like starting a house by picking the paint color. You might end up with a beautiful wall, but you haven't asked whether the room needs to exist.

This is the first in a three-part series on building AI products. Part 1 is about the thinking that should happen before any prompt is written or any API is called.

[Image: architect reviewing a blueprint.] Starting with the model is like starting a house by picking the paint color; first principles thinking inverts this. Photo by Daniel McCullough on Pexels.

Start with the problem, not the technology

The most common mistake in AI product development right now is technology-forward thinking: we have access to a powerful model, so let's find something for it to do. The result is features that feel impressive in a demo and useless in daily life. An AI-generated summary nobody reads. A chatbot that answers questions the help center already covers. A "smart" feature that's slower than the manual workflow it replaced.

First principles thinking inverts this. You start with the human problem. What is the user trying to accomplish? Where do they get stuck? What takes too long, requires too much context-switching, or depends on information they don't have? Only after you've mapped the problem clearly do you ask: can AI make this meaningfully better?

"Meaningfully" is the operative word. AI can do a lot of things. The question is whether it does them well enough, reliably enough, and fast enough to be worth the complexity it introduces.

The three questions

Before greenlighting any AI feature, I ask my team three questions:

What would we build if AI didn't exist? This forces you to define the problem without anchoring to a solution. If the answer is "we wouldn't build anything," that's a signal the feature is technology-looking-for-a-problem. If the answer is "we'd build X, but it would be expensive / slow / impossible at scale," then AI is a genuine enabler.

What's the failure mode? AI features fail differently from traditional software. They don't crash; they confidently produce wrong answers. Before building, you need to know what happens when the model is wrong. Is the user equipped to catch the error? Is the cost of a wrong answer low (a bad email draft) or high (a wrong medical recommendation)? This shapes everything from the UX to the guardrails.

What does "good enough" look like? AI outputs exist on a spectrum from perfect to useless. Most land somewhere in the middle: helpful but imperfect. The product question is where on that spectrum the output needs to land to create value. A summarizer that captures 80% of the key points might be wonderful. A code generator that's correct 80% of the time might be dangerous. "Good enough" is use-case-specific, and defining it upfront prevents endless iteration toward an impossible standard.
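The second and third questions can be made concrete in product logic. Here is a minimal sketch of one way to encode them, with entirely hypothetical names (`AiOutput`, `route_output`) and the assumption that the model reports a confidence score in [0, 1]; the real guardrails for any given product will be use-case-specific.

```python
from dataclasses import dataclass

@dataclass
class AiOutput:
    text: str
    confidence: float  # assumed model-reported score in [0, 1]

def route_output(output: AiOutput, error_cost: str,
                 threshold: float = 0.8) -> str:
    """Decide how an AI output reaches the user, based on failure cost.

    error_cost: "low" (e.g. a bad email draft) or "high"
    (e.g. a wrong medical recommendation).
    """
    if error_cost == "high":
        # High-cost failure mode: never auto-ship; a human reviews every output.
        return "human_review"
    if output.confidence < threshold:
        # Low-cost task but low confidence: show it, labeled as a draft,
        # so the user is equipped to catch the error.
        return "show_as_draft"
    # Low-cost task, output clears the "good enough" bar: ship it directly.
    return "auto_ship"
```

The point of the sketch is that "good enough" becomes an explicit threshold chosen per use case, and the cost of a wrong answer decides whether any threshold is acceptable at all.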

AI as capability, not category

The mental model I find most useful is treating AI as a capability rather than a category. You don't build "an AI product." You build a product that uses AI to solve a specific problem better than the alternatives. The AI is the engine. The product is the car. Nobody buys a car because of the engine spec. They buy it because it gets them where they need to go.

This framing keeps the team grounded. It prevents the common trap of shipping an AI feature and then searching for a metric to justify it. If you started with the problem, you already know what success looks like.

What comes next

First principles thinking gets you to the right problem. But AI products have a unique challenge that traditional products don't: the output is non-deterministic. The same input can produce different results. That means you can't just test a feature and ship it. You need a rigorous evaluation practice.

That's Part 2: the discipline of evals.
