The Hidden Foundation: Why Taxonomy Decides Whether AI Can Actually Help You

When discussions about proper AI implementation come up, the same words tend to appear rather quickly.

Prompting. Context engineering. Memory. Model choice. Data quality. Governance, if the discussion has reached that point.

All of these are important, and I do not want to be misunderstood here: I am not suggesting that we should forget prompting, or context, or memory, or the need for clean and accessible data. If the data is wrong, the AI will not magically become right. If the prompt is badly written, the model may misunderstand the task. If the system has no memory or context, it will answer as if every question exists in a vacuum.

But there is another word which I think deserves more attention than it often gets.

Taxonomy.

Admittedly, taxonomy does not sound as exciting as AI agents, advanced prompting, or whatever new model has just been released. It sounds like something that belongs to librarians, documentation teams, or that corner of knowledge management people only begin to care about once search has already become painful. But the more I work with practical AI use cases, the more convinced I become that taxonomy is one of the quiet foundations that decide whether AI can actually help people find what they need.

Because AI does not only need access to information. It needs information that has been organized in a way that makes retrieval meaningful.

That point might sound obvious, but I do not think it is treated as obvious in practice. A lot of the current AI enthusiasm seems to rest on the assumption that if we simply connect a model to enough documents, it will somehow find what is needed. Connect it to Google Drive. Connect it to Confluence. Connect it to SharePoint. Connect it to Notion. Let it loose on the knowledge base, and surely it will figure things out.

Sometimes it will.

But often it will not.

And when it does not, I suspect that the problem is not always the model, the prompt, or the user. Sometimes the problem is that the organization never really organized its knowledge in the first place.

The problem: access is not the same as understanding

A common assumption in AI adoption is that the hard part is connecting the AI to the data.

That is, if the AI can technically access the documents, then the central problem has been solved. The model can search, summarize, and answer, and the organization can move on to the more interesting parts of implementation.

I think this is mistaken.

Access is not the same as understanding. And it is certainly not the same as reliable retrieval.

If a knowledge base is messy, AI does not magically clean it by searching through it. It may retrieve the wrong page more quickly. It may summarize outdated information more fluently. It may confidently combine two pieces of guidance that were never meant to be combined. It may find a broad page that contains the answer somewhere, but miss the specific update because the relevant paragraph is buried among unrelated material.

In other words, AI can make weak knowledge structure more visible.

Consider a typical customer support organization. Over several years, it may have accumulated monthly update pages. Each month includes several different updates: a billing reminder, a technical workflow change, a privacy clarification, a mobile issue, a campaign announcement, and perhaps a general operational note.

For a human reader, this structure is not necessarily unreasonable. If someone remembers that a change happened sometime around April, they can open the April update page and scan it. They can recognize the paragraph they need. They can decide, based on their own experience, whether it still seems relevant.

For an AI system, however, that page is a much weaker retrieval target.

The title might simply be:

April 2024 Support Updates

That title tells the AI very little about the specific knowledge inside the page. The page may contain five or ten unrelated updates. Some may still be current. Some may have been superseded. Some may apply only to one team. Some may be reminders rather than changes. Some may be policy guidance, while others are temporary campaign instructions.

If someone asks, “What should agents do when a customer requests an annual renewal refund exception?”, the AI may retrieve the April update page. But that page is not really about annual renewal refund exceptions. It is a container. The actual knowledge object is one update inside it.

That distinction matters.

A human can open a container and look around. AI retrieval works better when the thing being retrieved is already a meaningful unit.

Human-readable is not always AI-retrievable

A lot of internal documentation was built around human habits.

Chronological archives. Department folders. Long pages with multiple topics. Informal naming conventions. Tags created by whoever happened to write the page. Duplicated or overlapping documents. Old guidance that remains visible because deleting it feels risky, or because no one is sure whether it is still needed.

These structures are not necessarily bad. Often they made perfect sense at the time. A monthly update page is easy to create. A team folder is easy to understand. A long page is convenient when someone wants to add information quickly without thinking too much about where each piece should live.

The problem is that AI retrieval changes the cost of ambiguity.

When a person reads a page, they bring context. They know that “payments” and “billing” might be related. They know that a “campaign” is not the same thing as a “workflow update.” They know which internal terms are official and which are just casual shorthand. They may know that one page is outdated because the team quietly stopped using it, even if the page itself never says so.

AI systems do not automatically know these things. They need signals.

Those signals can come from titles, headings, summaries, labels, metadata, effective dates, source authority, content type, and relationships between pages. They can also come from the way the content itself is written. But if none of these signals are clear, then the AI is left to infer too much.

That is where taxonomy becomes important.

A taxonomy is not just a neat filing system. It is a controlled way of saying: these are the categories that matter, these are the terms we use, and this is how knowledge should be described so it can be found again.

In an AI-enabled knowledge environment, taxonomy becomes retrieval infrastructure.
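To make that concrete, consider what explicit retrieval signals can look like when attached to a single piece of knowledge. The sketch below assumes a Python-based pipeline; the field names are illustrative, not a standard.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class KnowledgeObject:
    """One atomic unit of knowledge, carrying explicit retrieval signals."""
    title: str            # predictable, information-bearing title
    summary: str          # short, written in the language users ask questions in
    area: str             # controlled field, e.g. "Billing"
    update_type: str      # controlled field, e.g. "Policy Reminder"
    effective_date: date  # when this became relevant
    status: str           # lifecycle signal: "current", "superseded", "draft"
    authority: str        # source authority, e.g. "official policy"
    keywords: list[str] = field(default_factory=list)  # flexible, reviewed vocabulary
    body: str = ""        # the content itself

update = KnowledgeObject(
    title="2024-04-08 | Billing | Annual Renewal Refund Exceptions | Policy Reminder",
    summary="How agents handle customer requests for refund exceptions on annual renewals.",
    area="Billing",
    update_type="Policy Reminder",
    effective_date=date(2024, 4, 8),
    status="current",
    authority="official policy",
    keywords=["refund", "annual renewal", "exception"],
)
```

Nothing here is exotic. The point is that every field the AI would otherwise have to infer is stated explicitly.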

What the research is now saying

The AI conversation has moved quickly. In 2023 and 2024, a lot of the discussion focused on whether organizations had enough data, whether the data was clean enough, and whether generative AI could be connected to internal systems.

Those questions still matter. But by 2026, the discussion seems to have sharpened. The issue is not only data access. It is context.

Gartner has argued that organizations with successful AI initiatives invest significantly more in foundational areas such as data quality, governance, AI-ready people, and change management. More importantly for this discussion, Gartner frames context — including semantics and metadata — as mission-critical infrastructure for AI agents. In other words, AI success is not only about better models. It is about giving those models governed, contextual access to the right data. [Gartner, April 2026]

That is a crucial shift.

It means that metadata is not decoration. Semantics are not academic overhead. Governance is not merely a bureaucratic obstacle added after the exciting part. They are part of the operating environment that allows AI systems to retrieve and use knowledge responsibly.

Recent retrieval research points in the same direction. A 2026 paper on enterprise knowledge retrieval found that metadata-enriched retrieval approaches consistently outperformed content-only baselines. That matters because it gives technical support to a practical intuition: the labels, summaries, categories, and contextual fields around a document can materially improve whether the right information is retrieved. [Mishra et al., 2026]
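The paper's methods are more elaborate than anything I can show here, but the core intuition can be sketched in a few lines: use controlled metadata to narrow the candidate set before ranking by content. The keyword-overlap scoring below is a deliberately naive stand-in for whatever embedding model a real pipeline would use; the dictionary keys match the fields from the earlier sketch.

```python
def retrieve(query_terms, docs, area=None, status="current", top_k=3):
    """Metadata-enriched retrieval sketch: filter on controlled fields first,
    then rank the surviving documents by naive content overlap."""
    candidates = [
        d for d in docs
        if (area is None or d["area"] == area) and d["status"] == status
    ]
    def score(doc):
        text = " ".join([doc["title"], doc["summary"], doc["body"]]).lower()
        return sum(term.lower() in text for term in query_terms)
    return sorted(candidates, key=score, reverse=True)[:top_k]
```

Even this toy version never retrieves a superseded page, because the metadata filter removes it before scoring begins. A content-only baseline has no such guarantee.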

Enterprise Knowledge has made a similar point from a taxonomy and knowledge-management perspective. Their writing on the role of taxonomists in generative AI emphasizes that AI can help create, validate, and suggest taxonomy components, but human expertise remains necessary for strategy, governance, context, and decision-making. AI can assist taxonomy work. It cannot replace the organizational judgment that decides what the taxonomy should mean. [Enterprise Knowledge]

The development of GraphRAG and ontology-guided retrieval reinforces the same pattern. Traditional RAG systems often treat knowledge as flat chunks of text. That works for some questions, but it becomes fragile when a question requires relationships, dependencies, source authority, or multi-step reasoning. Graph-based and ontology-guided approaches attempt to solve this by modeling entities, relationships, and shared vocabularies explicitly. [Enterprise Knowledge, GraphRAG in the Enterprise]

Of course, most organizations do not need to begin with a full knowledge graph. That would be too much, and probably the wrong starting point for many teams. But the direction is worth noticing: the more we expect AI systems to reason over internal knowledge, the more important structured meaning becomes.

What should a taxonomy consider?

A useful taxonomy does not need to start as a grand enterprise ontology. In many cases, the first step is much simpler: decide which distinctions matter enough to control.

For example, in a support knowledge base, some fields should probably be controlled: team or support area, update type, effective date, policy status, source authority, audience, sensitivity, and product or platform area.

If the organization has six support areas, the AI should not invent a seventh because one page used slightly different wording. If the official areas are Tech, Billing, Privacy, Mobile, Ticketing, and General, then those should be the available choices.

The same applies to update types. A policy change is not the same as a policy reminder. A workflow update is not the same as a campaign. A temporary exception is not the same as permanent guidance. If those distinctions matter operationally, then they should be explicit.
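One lightweight way to make those distinctions explicit is to turn the controlled fields into actual types rather than free text. A minimal sketch in Python, using the support areas named above; the update types are the ones this article has mentioned, not an exhaustive list.

```python
from enum import Enum

class SupportArea(Enum):
    TECH = "Tech"
    BILLING = "Billing"
    PRIVACY = "Privacy"
    MOBILE = "Mobile"
    TICKETING = "Ticketing"
    GENERAL = "General"

class UpdateType(Enum):
    POLICY_CHANGE = "Policy Change"
    POLICY_REMINDER = "Policy Reminder"
    WORKFLOW_UPDATE = "Workflow Update"
    CAMPAIGN = "Campaign"
    TEMPORARY_EXCEPTION = "Temporary Exception"

# Enum lookup by value raises ValueError on anything outside the controlled
# list, so a page cannot quietly invent a seventh support area.
SupportArea("Billing")       # fine
# SupportArea("Payments")    # ValueError: 'Payments' is not a valid SupportArea
```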

Other fields need more flexibility.

Keywords, product names, campaign names, issue-specific tags, and emerging terminology may need to grow over time. A living keyword directory can be valuable, especially in fast-moving environments. But “living” should not mean chaotic. Someone still needs to review whether new keywords are useful, whether they duplicate existing ones, and whether they should become part of a controlled vocabulary later.
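Even a small amount of tooling makes that review realistic. The sketch below flags likely duplicates among proposed keywords using only Python's standard library; the similarity cutoff is a starting guess to tune, not a recommendation.

```python
import difflib

def flag_likely_duplicates(proposed, existing, cutoff=0.8):
    """Map each proposed keyword to existing keywords it probably duplicates,
    so a human reviewer can decide whether to merge or approve it."""
    report = {}
    for keyword in proposed:
        matches = difflib.get_close_matches(
            keyword.lower(), [e.lower() for e in existing], n=3, cutoff=cutoff
        )
        if matches:
            report[keyword] = matches
    return report

print(flag_likely_duplicates(["refunds", "payment-exception"],
                             ["refund", "reimbursement"]))
# {'refunds': ['refund']}
```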

This is the balance I think is often missed.

Some categories should be fixed because they define the structure of the organization’s knowledge. Other terms should remain flexible because language changes, products change, problems change, and users ask questions in ways no taxonomy designer can fully predict in advance.

A practical taxonomy therefore needs both stability and adaptation. Controlled categories provide consistency. Flexible keywords allow the system to learn from new situations. Governance keeps both from drifting into noise.

From monthly archives to atomic knowledge objects

One practical solution is to move from broad container pages to atomic knowledge objects.

Instead of storing several unrelated updates inside one monthly page, each update becomes its own page or entry.

Old model:

April 2024 Support Updates

New model:

2024-04-08 | Billing | Annual Renewal Refund Exceptions | Policy Reminder
2024-04-12 | Mobile | Login Error After App Update | Workflow Update
2024-04-16 | Privacy | Account Deletion Request Wording | Policy Change
2024-04-22 | Ticketing | Escalation Queue Routing | Workflow Update

The point is not to create more pages for the sake of having more pages. That would just be another kind of mess. The point is to create smaller and clearer knowledge units.

Each page should answer, as far as possible: What changed? Who does it affect? What should someone do? When did this become relevant? Is it still current? What team or area owns it? What type of update is it? What keywords or labels describe it? Does it replace or relate to another update?

This structure helps humans, but it also helps AI. The title carries useful information. The metadata narrows retrieval. The page itself has a clear purpose. The AI no longer has to retrieve a broad archive and infer which paragraph matters. It can retrieve the update.
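There is also a practical side effect of predictable titles: they become machine-readable. A minimal sketch, assuming the pipe-delimited convention shown above:

```python
from datetime import date

def parse_title(title: str) -> dict:
    """Split a 'date | area | subject | type' title into retrieval metadata."""
    raw_date, area, subject, update_type = (part.strip() for part in title.split("|"))
    return {
        "date": date.fromisoformat(raw_date),
        "area": area,
        "subject": subject,
        "update_type": update_type,
    }

parse_title("2024-04-08 | Billing | Annual Renewal Refund Exceptions | Policy Reminder")
# {'date': datetime.date(2024, 4, 8), 'area': 'Billing',
#  'subject': 'Annual Renewal Refund Exceptions', 'update_type': 'Policy Reminder'}
```

A title that parses cleanly into metadata is a title that was designed, not improvised.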

The solution is not “more tagging.” It is better knowledge design.

There is a danger in reducing this discussion to tagging.

Tags are useful, but taxonomy is not simply adding more labels to more pages. In fact, uncontrolled tagging can make the problem worse. If every person creates their own terms, the knowledge base may end up with several labels for the same thing:

refund
refunds
billing-refund
reimbursement
payment-exception
annual-renewal-refund

Some variation is useful. Synonyms matter. But if no one defines preferred terms or relationships between terms, AI retrieval may fragment across inconsistent language.
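This is where a small synonym map earns its keep: the variants stay usable, but they all resolve to one preferred term at indexing and query time. A sketch using the refund variants above; whether a given term is a true synonym or a narrower concept is exactly the judgment call taxonomy work involves, so treat this mapping as one defensible reading, not the answer.

```python
# Variant tags resolved to a single preferred term.
PREFERRED_TERMS = {
    "refunds": "refund",
    "billing-refund": "refund",
    "reimbursement": "refund",
    "payment-exception": "refund",
    "annual-renewal-refund": "refund",
}

def normalize_tag(tag: str) -> str:
    """Resolve a tag variant to its preferred term; unknown tags pass through
    so they can surface later in keyword review."""
    cleaned = tag.lower().strip()
    return PREFERRED_TERMS.get(cleaned, cleaned)

normalize_tag("Refunds")   # -> "refund"
normalize_tag("churn")     # -> "churn" (unknown: passes through for review)
```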

The solution is a knowledge design approach.

1. Create atomic content. Each page or entry should represent one meaningful knowledge object.
2. Use predictable titles that contain retrieval signals: date, area, subject, and type.
3. Define controlled fields. Some values should come from fixed lists, especially teams, update types, status, and source authority.
4. Maintain flexible keywords, but review and consolidate them regularly.
5. Add summaries written for retrieval. A short summary should state what the page is about, in language people are likely to use when asking questions.
6. Track lifecycle and authority. AI should not treat outdated drafts and current policies as equal.
7. Model relationships where needed. Some pages supersede others, some apply only to certain teams, some are temporary exceptions.
8. Evaluate retrieval, not just content. It is not enough that the right answer exists somewhere; the question is whether the AI can reliably find it. A minimal version of such a check is sketched below.
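Assuming a `retrieve` function shaped like the earlier sketch, and assuming each record carries an `id` field, the check can be as small as this; the test cases and document ids are hypothetical:

```python
def evaluate_retrieval(retrieve_fn, test_cases, top_k=3):
    """Fraction of test questions whose expected document appears in the top-k.
    `test_cases` is a list of (query_terms, expected_doc_id) pairs."""
    hits = 0
    for query_terms, expected_id in test_cases:
        results = retrieve_fn(query_terms, top_k=top_k)
        if any(doc["id"] == expected_id for doc in results):
            hits += 1
    return hits / len(test_cases)

# Hypothetical usage, binding the earlier retrieve() to a document set:
# cases = [(["annual", "renewal", "refund", "exception"], "2024-04-08-billing-refund")]
# print(evaluate_retrieval(lambda q, top_k: retrieve(q, docs, top_k=top_k), cases))
```

A handful of hand-written question-to-document pairs, re-run after every structural change, tells you more about AI readiness than any amount of content auditing.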

This is not glamorous work. It is not the part of AI implementation that gets the most attention. But it is often the work that determines whether the implementation becomes useful beyond the demo.

Better models do not remove the need for structure

One objection is that models are getting better.

That is true.

Modern AI systems are better at summarization, reasoning, query interpretation, and working with messy input than earlier systems were. Longer context windows and stronger retrieval pipelines can compensate for some weaknesses in documentation.

But compensation is not the same as design.

A better model may be able to work around poor structure some of the time. It may infer that “refund exception” belongs under billing. It may parse a long monthly page and identify the relevant section. It may notice that one page appears newer than another.

But relying on inference where structure should exist creates risk. It makes the system harder to evaluate. It makes failures harder to diagnose. It increases dependence on probabilistic interpretation when the organization could have supplied explicit signals.

If a team knows that an update is a policy reminder, why make the AI infer it? If a page supersedes an older page, why make the AI guess? If only one source is authoritative, why let the AI treat all sources equally?
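Supersession in particular is cheap to make explicit. In the sketch below, each record optionally names the page that replaced it, and resolution follows the chain forward instead of asking the model to guess; the `superseded_by` field name is my own choice, not an established convention.

```python
def resolve_current(doc_id, docs):
    """Follow 'superseded_by' pointers to the current version of a document.
    `docs` maps doc_id -> record."""
    seen = set()
    while docs[doc_id].get("superseded_by"):
        if doc_id in seen:  # guard against an accidental supersession cycle
            raise ValueError(f"supersession cycle at {doc_id}")
        seen.add(doc_id)
        doc_id = docs[doc_id]["superseded_by"]
    return docs[doc_id]

docs = {
    "refund-policy-v1": {"status": "superseded", "superseded_by": "refund-policy-v2"},
    "refund-policy-v2": {"status": "current", "superseded_by": None},
}
resolve_current("refund-policy-v1", docs)  # -> the v2 record
```

A retrieval layer that resolves this chain never has to notice that one page appears newer than another; it is told.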

The point of taxonomy is not to limit intelligence. It is to reduce unnecessary ambiguity so intelligence can be applied where it is actually useful.

Conclusion: AI readiness begins before the prompt

We talk a lot about prompts, memory, model choice, context windows, and which AI tools organizations should adopt. Those conversations are important. But they often start too late.

Before the prompt, there is the knowledge environment.

If that environment is inconsistent, outdated, poorly labeled, and organized around broad human browsing habits, AI will inherit those weaknesses. It may still produce impressive answers. It may still be useful. But it will be harder to trust, harder to govern, and harder to scale.

Taxonomy does not guarantee AI success. It does not solve every problem of knowledge management. It will not capture all tacit knowledge, eliminate all ambiguity, or prevent every hallucination.

But ignoring taxonomy makes reliable AI harder.

The more AI becomes part of organizational work, the more important it becomes to define what things are, how they relate, which terms are official, which sources are authoritative, and what knowledge is current.

AI does not make taxonomy obsolete.

AI makes taxonomy operational.

Before you add AI, fix your taxonomy. Not because taxonomy is the exciting part, but because it is one of the quiet foundations that determines whether the exciting part works.


Research notes: Gartner (April 2026) · Enterprise Knowledge, "Taxonomist Role in the New World of Generative AI" · Enterprise Knowledge, "GraphRAG in the Enterprise" · Mishra et al. (arXiv:2512.05411) · Sun et al., TaSR-RAG (arXiv:2603.09341)
