Engineering

Why We Stopped Trying to Guess What You Meant


We built an intent classifier to route user inputs in the AI writer. It was a heuristic layer on top of a base model that handled ambiguity better on its own. We deleted it. Here's what we learned.

By Sadok Hasan


We built an intent classifier. This is a normal thing to do when you are building an AI writing product. The user pastes something into the composer. The system needs to know: are they asking for a new post, a rewrite, a repurposing, or are they starting a conversation? Route to the wrong handler and the product does the wrong thing.

The classifier we built used a combination of heuristics: word counts, first-person ratios, action-word detection, checks for revision-signal phrases. It ran before the main generation pipeline and returned one of four intents, which then routed to different system prompts and different tool invocations.
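In pseudocode terms, the classifier looked roughly like this. This is a hypothetical sketch: the thresholds, word lists, and intent names are ours for illustration, not the production values.

```python
# Illustrative heuristics modeled on the classifier described above.
# The phrase lists and cutoffs are assumptions, not the real ones.
ACTION_WORDS = {"write", "create", "draft", "make", "generate"}
REVISION_PHRASES = ("make it", "shorten", "rewrite", "change the tone")

def classify_intent(text: str) -> str:
    lowered = text.lower()
    words = lowered.split()
    word_count = len(words)
    first_person = sum(w in {"i", "my", "me", "we"} for w in words)
    first_person_ratio = first_person / word_count if word_count else 0.0

    # Revision-signal phrases suggest the user wants an edit.
    if any(p in lowered for p in REVISION_PHRASES):
        return "rewrite"
    # The flaw lives here: an article whose opening happens to start
    # with an action word looks like an instruction to this check.
    if words and words[0] in ACTION_WORDS and word_count < 40:
        return "new_post"
    # Long, first-person text is assumed to be content to repurpose.
    if word_count > 150 and first_person_ratio > 0.05:
        return "repurpose"
    return "conversation"
```

Note how every branch is a guess about intent made from surface features alone, which is exactly the weakness the rest of this post describes.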

It was wrong in exactly the cases that mattered most.

This post is part of the How Bloomberry Voice Works series.


The Classification Problem

When a user pastes content into an AI writing tool, there are many kinds of content they might be pasting. A rough outline. A half-finished draft they want to complete. An article they want to repurpose. A thought they have been sitting with that they want to turn into a post. A tweet-length observation they want expanded.

A good classifier needs to handle all of these correctly. The heuristics we built handled the obvious cases: short, directive inputs with action words ("write a post about X") and clear conversational questions. They failed on the interesting cases: long pasted content from external sources, bare URLs with no surrounding context, articles that began with action words in their first sentences.

The failure mode was specific and painful. A user who pasted an article to repurpose would get a conversational response explaining what the article was about. The action words in the article's opening sentences triggered the wrong detection path. The classifier decided this was a writing instruction, not a source to transform, and the system responded accordingly.

For a product whose job is to generate content, there is no worse failure than generating a conversational response instead. It violates the core expectation.


What the Heuristics Could Not Handle

The deeper problem was architectural. The heuristics were trying to classify intent from surface features of the input text. But intent is not reliably present in the surface features of the text. A long, first-person piece of writing could be content to repurpose or a draft to complete or an example to learn from — the surface features alone do not distinguish them. Action words in the first thirty words could be instructions from the user or the opening line of an article they are pasting.

The classifier was solving a hard problem — intent inference from ambiguous text — using rules that could only reliably address the easy version of that problem. The hard version is exactly where the interesting user cases live.

Meanwhile, the base model we were routing to after classification is trained on massive amounts of text and handles ambiguity through learned representations that no rule system can approximate. Asking a rule layer to resolve ambiguity before the model sees the input is asking the weaker system to do the work the stronger system would handle better.


What We Did Instead

We reduced the classifier to three deterministic fast-paths:

Bare URL inputs — if the input is a URL, or a URL with very short surrounding text, it routes to content generation. The user wants a post from that URL. The system fetches the content and generates from it.

Voice training keywords — if the input contains explicit voice-training language ("learn from this," "add to my voice"), it routes to the learning flow rather than generation.

Explicit instruction prefix — the Bloomberry UI allows users to tag their input with an explicit instruction prefix when they want to specify intent. That tag takes priority.

Everything else defaults to generating a post. The model receives the input, framed as a generation request, and produces content from it. The model handles the specifics of what "write a post from this" means when the input is a rough outline versus a full article versus a single observation.
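A minimal sketch of that routing logic. The `/instruct` prefix, phrase list, and route names are hypothetical stand-ins; the real UI tag and keywords may differ.

```python
import re

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
VOICE_PHRASES = ("learn from this", "add to my voice")
INSTRUCTION_PREFIX = "/instruct "  # hypothetical; the real UI tag differs

def route(user_input: str) -> str:
    text = user_input.strip()

    # Fast-path 1: an explicit instruction tag takes priority.
    if text.startswith(INSTRUCTION_PREFIX):
        return "instructed_generation"

    # Fast-path 2: a bare URL (or URL plus a few words) means
    # "fetch this and write a post from it".
    tokens = text.split()
    if tokens and URL_RE.fullmatch(tokens[0]) and len(tokens) <= 4:
        return "generate_from_url"

    # Fast-path 3: explicit voice-training language routes to learning.
    lowered = text.lower()
    if any(p in lowered for p in VOICE_PHRASES):
        return "voice_training"

    # Everything else: assume the user wants a post.
    return "generate"
```

The design choice is that the fallthrough branch is generation, not conversation: ambiguity no longer needs to be resolved before the model sees the input.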

The conversational response path is not the default. It is the exception. For a product built to generate content, generation should be the assumption — not the outcome of a successful classification.


What We Learned from Deleting It

Removing the classifier required accepting that the system would sometimes generate a post from input the user intended as a question. That edge case exists. But it is a much smaller problem than generating conversational responses from input the user intended as source material for a post. The costs are asymmetric.

The second thing we learned is that framing matters more than routing. Whether the system generates useful content from ambiguous input depends much more on how the generation prompt is framed than on which handler the classifier sent the input to. When the prompt tells the model "the user's job is to produce posts and this is source material for one," the model handles the transformation correctly across a wide range of input types. The routing layer was adding complexity without adding accuracy.
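A minimal sketch of what that framing can look like in an OpenAI-style messages format. The wording is ours for illustration, not Bloomberry's actual system prompt.

```python
# Hypothetical framing prompt; illustrative wording only.
SYSTEM_FRAME = (
    "You write LinkedIn posts for the user. Whatever the user provides "
    "(an outline, a half-finished draft, a full article, a single "
    "observation) is source material for a post. Transform it into a "
    "publish-ready post. Do not summarize it, answer it, or explain it."
)

def build_messages(user_input: str) -> list[dict]:
    # The framing does the work the deleted classifier tried to do:
    # the model, not a rule layer, decides what "write a post from
    # this" means for this particular input.
    return [
        {"role": "system", "content": SYSTEM_FRAME},
        {"role": "user", "content": user_input},
    ]
```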

The third thing we learned is about defaults. Every system has a default behavior for ambiguous inputs. The wrong default is to ask a question or return a conversational response: that shifts the burden to the user, who has to clarify, re-submit, and wonder what the product actually does. The right default is to attempt generation and give the user something to react to. An output that is almost right is more useful than a question about what they meant.


The Engineering Principle

A heuristic layer is justified when it is doing something the model genuinely cannot. When the heuristic layer is trying to resolve ambiguity that the model would handle better without the layer, removing it improves the system.

The question to ask of any complexity in an AI product architecture is: what does this layer solve that the model cannot? If the layer exists because someone was not confident the model would do the right thing, and the actual data shows the model would have done the right thing more often than the layer did, the layer is making the system worse.

Intent classification felt necessary when we built it. The user inputs genuinely were ambiguous. But the solution to ambiguity in an AI product is almost never a heuristic filter on top of a capable model. It is a better default assumption and clearer framing for the model to work with.


Frequently Asked Questions

What is intent classification in AI writing tools?

Intent classification is a layer in an AI writing product that tries to determine what the user wants before processing their input. When you paste text or a URL into a writing tool, the classifier decides: are you asking for a new post, a rewrite, a repurposing, or something else? The problem is that classifiers built on heuristics fail on exactly the ambiguous cases that matter most — and for a writing tool, almost every interesting input is ambiguous.

Why did Bloomberry remove its intent classifier?

Because the classifier was failing in the direction that hurt the most: it was routing generation requests to conversational responses instead of generating posts. Users who pasted an article or dropped a URL were getting AI prose explaining topics back at them rather than a generated LinkedIn post. The classifier's heuristics — word counts, action-word detection, first-person ratios — were designed for one class of inputs and misfired on another. The base model, given proper framing, handled the ambiguity better without the classifier layer.

What does Bloomberry do instead of intent classification?

We use three deterministic fast-paths for the clear cases — bare URL inputs route to content generation, voice training keywords route to the learning flow, and inputs prefixed with an explicit instruction tag route directly to generation. Everything else defaults to generating a post. The base model handles what it means to "write from" whatever was provided. We stopped trying to classify intent and started assuming the intent that makes sense for a writing tool: generate.

How does Bloomberry decide whether to generate a post or respond conversationally?

The default is generation. Conversational responses are the exception, not the rule, because the product's job is to produce content. When a user provides input — a URL, an idea, a pasted article — the system assumes they want a post from it. The model's instructions frame the task as generating content, and the model handles the specifics of how to transform the input. Classification only steps in for the unambiguous exceptions.

Is a simpler architecture always better for AI products?

Not always, but often for the right reasons. Complexity in an AI product is justified when a heuristic layer is doing something the model genuinely cannot. When the heuristic layer is trying to second-guess the model's handling of ambiguous input, it usually makes things worse — because the model handles ambiguity through learned representations that a rule-based heuristic cannot replicate. The right question is always: what does this layer solve that the model cannot? If the answer is unclear, remove the layer.


Related reading: How Bloomberry voice works — the full series | Why every AI model writes differently | AI LinkedIn post generator


The product whose job is to generate content should generate content by default. Classification is what you add when the default is wrong. We had the default wrong.
