Heather Hedden: The Accidental Taxonomist on AI Strategy and Automation

March 26, 2026

Taxonomies are proving more and more useful as AI evolves. Lately, AI advocates have become increasingly preoccupied with agentic orchestration. For example, Gerald Friedland, principal scientist at AWS, made this observation in a recent LinkedIn post:

GraphAI

Alan Morrison

“There is a reason every company has a predefined orchestration structure — usually hierarchical. Hierarchies reduce the number of communication paths and keep the constraint system manageable. AI agent orchestration is trying to rediscover the same principle.”

Heather Hedden runs Hedden Information Management, a consultancy focused on making information findable, accessible, interoperable and reusable. She builds and advises on hierarchical and faceted taxonomies, thesauri, ontologies, metadata schemas, and book indexes for both published and enterprise content.

Heather is best known as the author of The Accidental Taxonomist, now in its third edition (2022), which has become a standard reference for practitioners in the field. She also teaches taxonomy through in-person workshops, corporate training, and online courses, and maintains a widely read blog at hedden-information.com.

I had the opportunity to interview Heather in March 2026, and the conversation was full of insight into enterprise best practices for information modeling.

Full disclosure: She’s also done a number of webinars for Graphwise, which also sponsors this Curator blog.

Background on the Accidental Taxonomist

Heather’s career began as a manual indexer at Information Access Company (later acquired by Thomson and merged into Gale), where she assigned index terms from controlled vocabularies across journals and periodicals. That hands-on experience shapes how she approaches every project today.

In this Curator interview, Hedden covers the practical and strategic dimensions of building controlled vocabularies, taxonomies, and thesauri inside real organizations.

She begins by clarifying an important conceptual point: information taxonomies are not classification systems in the Linnaean sense. They are closer to thesauri, and the boundary between the two has blurred further because both now share the SKOS (Simple Knowledge Organization System) data model from the W3C. That shared foundation means shared tooling, and the distinction today is more one of emphasis than kind.

Scoping is where every project starts. Heather stresses the importance of understanding both the short-term use case and the longer-term roadmap before any terms are defined. Organizations rarely have trained taxonomists on staff, and subject matter experts, while knowledgeable, tend to organize content in ways that reflect their own perspective rather than how end users search and browse.

Stakeholder engagement is critical to taxonomy development. Hedden argues that the people who participate in building a taxonomy are far more likely to maintain and govern it after a consultant leaves. That is a stronger form of human in the loop than simply reviewing AI-generated output.

On automation, she takes a measured view. Generative AI is genuinely useful for targeted tasks — suggesting synonyms, clarifying distinctions between terms, or generating initial hierarchies in unfamiliar subject areas. It is less reliable for producing a complete taxonomy from scratch.

During this discussion, Heather also does a quick demo of Graphwise’s Taxonomy Builder feature, which brings AI-assisted term generation directly into the taxonomy management workflow.

Governance remains the hardest gap to fill with automation. A taxonomy without clear ownership, update policies, and maintenance procedures rarely survives past the consulting engagement. That human responsibility, Heather says, will not change.

A YouTube recording of my conversation with Heather follows, accompanied by an edited transcript.

Edited Transcript

Alan Morrison: Hi, we’re online with Heather Hedden and I am the graphRAG curator Alan Morrison. Good to be with you all.

Is this the third edition of The Accidental Taxonomist? So, people have known this book for years and years now. When was it first introduced?

Heather Hedden: 2010. Then 2016 and then 2022 for the third edition.

Alan Morrison: And Heather is somebody who’s presented at any number of conferences. You can go to her site. And what is the site, Heather?

Heather Hedden: I use the business name Hedden Information Management, so it’s https://www.hedden-information.com/. There I have a list of all my conference presentations. It’s a good resource; I go back there too, to check myself. I have PDFs for the usual half-hour or hour-long presentations. If it was a half-day workshop, it’s not up there, and if it was just a panel discussion, it’s not either. But otherwise there are lots of presentations to refer to.

Alan Morrison: And you have your blog. It’s always a pleasure to read what you write because you make such good distinctions. There are a lot of interrelated concepts in taxonomies, with close relationships between them. It’s hard to disambiguate the terms, and you spend a lot of time disambiguating them to the extent possible. We’ve got glossaries, we’ve got controlled vocabularies... You started in indexing for the publishing trade.

Heather Hedden: Yes.

Alan Morrison: I used to use some of those books you were probably indexing. They were really good resources. How did you get started in this area?

Heather Hedden: Yeah, I can explain that. In fact, there are two kinds of indexes. There are the indexes at the back of books, manuals, or monographs, and then there are the indexes of articles or other content that continuously gets added. That’s what I started with: the journal, magazine, and newspaper indexes that libraries subscribe to. If you’re really old like me, you remember the H.W. Wilson Company and those green volumes on the shelves, where you could look things up before anything was digitized. I started with a company called Information Access Company, which was then acquired by Thomson and merged into Gale. Some people in the library field have heard of Gale or Gale Research because it’s a reference publisher.

Alan Morrison: That’s why I remember Gale.

Heather Hedden: Yeah. Gale still exists. This was before it was automated. We were all human manual indexers. We would read or skim articles and then assign index terms from a controlled vocabulary — you need that consistency when you’ve got lots of indexers, lots of journals, and lots of sources. A book is different: one source, one indexer, no controlled vocabulary needed — you just create the index.

I later did some freelance book indexing too, because it’s fun — you’re creating almost a taxonomy at the same time as you’re indexing, a little bit of both, instead of using a controlled vocabulary or taxonomy that someone else created. After a while I decided to move into the group that manages and develops the controlled vocabularies.

We didn’t use the word “taxonomy” then, but there were lots of controlled vocabularies, and the main subject one had broader, narrower, and related terms. It was a thesaurus.

Alan Morrison: The tooling nowadays seems to blend the taxonomy with thesauri in some ways.

Heather Hedden: Definitely. When some people think of a taxonomy, they think of a hierarchy, which it usually has, and then they think of classification systems or schemes, such as the Linnaean taxonomy or library classification systems. But that’s different. Our information taxonomies are really closer to thesauri. They’re not classifications. We’re not putting things into classes, although there is usually a dominant hierarchical structure.

The word “taxonomy” caught on, and that’s what we use. I’ve also been thinking about how thesauri and taxonomies are almost on a continuum — the lines between them have blurred. This is largely because of the SKOS data model from the W3C, one of the semantic web standards.

SKOS stands for Simple Knowledge Organization System. “Knowledge organization system” is the broader category that includes taxonomies, thesauri, terminologies, ontologies, and other term lists. SKOS is particularly well suited for taxonomies and thesauri. Because we’re using the same data model — and therefore the same tooling — the distinction between the two isn’t as clear-cut as it once was. It’s more a matter of emphasis: a taxonomy has a more dominant hierarchical structure, while a thesaurus may have many top terms and makes use of “related term” or “related concept” relationships that a taxonomy doesn’t necessarily include.

Taxonomies are often implemented in a faceted format, and we now see them extended with ontologies. Instead of a “related term,” we get other semantic relationships.
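
The relationships described here can be illustrated with a minimal sketch. It assumes a simplified in-memory model loosely following SKOS (preferred label, alternative labels, broader, narrower, related); the class, the helper functions, and the healthcare labels are all hypothetical and are not drawn from any tool or vocabulary mentioned in the interview.

```python
from dataclasses import dataclass, field


# eq=False keeps comparisons identity-based, which avoids deep comparison
# across the circular broader/narrower links.
@dataclass(eq=False)
class Concept:
    """A simplified SKOS-style concept (hypothetical model)."""
    pref_label: str
    alt_labels: list = field(default_factory=list)
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    related: list = field(default_factory=list)


def add_narrower(parent: Concept, child: Concept) -> None:
    """Record a hierarchical link in both directions (broader and narrower are inverses)."""
    parent.narrower.append(child)
    child.broader.append(parent)


def add_related(a: Concept, b: Concept) -> None:
    """Record an associative link in both directions (related is symmetric)."""
    a.related.append(b)
    b.related.append(a)


# Hypothetical example concepts.
care = Concept("Healthcare services")
emergency = Concept("Emergency care", alt_labels=["Emergency medicine"])
ambulance = Concept("Ambulance services")

add_narrower(care, emergency)       # hierarchy: the backbone of a taxonomy
add_related(emergency, ambulance)   # associative link: more typical of a thesaurus

print([c.pref_label for c in care.narrower])
print([c.pref_label for c in emergency.related])
```

In this sketch, a vocabulary that leans on `add_narrower` reads as a taxonomy, while one that also makes heavy use of `add_related` reads more like a thesaurus; the data model is the same in both cases.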

Alan Morrison:

Historically, there has been some resistance to standards like SKOS — partly because people were reluctant to use URIs, for example. Is that still a problem when you’re helping someone take better advantage of taxonomies, thesauri, and ontologies?

Heather Hedden:

It comes back to the tooling. If you’re building a taxonomy within the confines of a content management system where the focus is content management rather than taxonomy management, that system may not support SKOS and URIs.

But if you want to go further — if you have multiple content management systems and other systems and want a single consistent taxonomy across all of them, eventually linking everything and perhaps moving toward an enterprise-wide knowledge graph — then you have to bring taxonomy management out of those siloed systems into a dedicated taxonomy or taxonomy-and-ontology management tool, which would then be based on SKOS. And then the other standards for ontologies apply as well.

Alan Morrison:

How do you typically get started with a new client?

Heather Hedden:

Even before they become a client, we have to figure out the scope. A taxonomy could be used for a very limited scope — one application, one content management system — and a consultant may still be needed for that. Or the thinking may be broader: connecting different systems, perhaps including some external ones. Often it’s a phased approach: start with the intranet, then move on to something else later.

Understanding the short-term scope and the longer-term roadmap is important, because the taxonomy should be designed in a way that allows it to extend beyond a limited scope and become enterprise-wide over time.

That raises questions even about how you name things in the tools. At the top level there might be something called a knowledge model or a project. You may have multiple projects — or want just one — and then you need to decide what falls under the next level, which in SKOS is called a “concept scheme.” Deciding what should constitute a concept scheme can itself be a challenge.
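
One way to picture the project and concept scheme levels is the sketch below. It assumes a plain dictionary layout in which each concept scheme acts as a separate facet; the project name, scheme names, and concepts are hypothetical, and real taxonomy management tools will structure this differently.

```python
from typing import Optional

# Hypothetical project layout: one project containing several concept schemes,
# each holding its own top concepts.
project = {
    "name": "Enterprise knowledge model",
    "concept_schemes": {
        "Topics": ["Healthcare services", "Medical research"],
        "Content types": ["Policy", "Report", "FAQ"],
        "Audiences": ["Clinicians", "Patients"],
    },
}


def find_scheme(project: dict, label: str) -> Optional[str]:
    """Return the name of the concept scheme that holds the given top concept."""
    for scheme, top_concepts in project["concept_schemes"].items():
        if label in top_concepts:
            return scheme
    return None


print(find_scheme(project, "Report"))
print(find_scheme(project, "Unknown"))
```

Deciding whether "Content types" deserves its own scheme or belongs under "Topics" is the kind of judgment call the scoping conversation has to settle before any terms are added.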

Alan Morrison:

A concept scheme being a grouping of the concepts you develop using SKOS. And I assume that in most cases some work has already been done in the organization. Are the people assigned to this work trained taxonomists, or subject matter experts?

Heather Hedden:

Typically an organization doesn’t have taxonomists. They might have subject matter experts, and a few may have people in knowledge management who know something about taxonomies — but subject matter experts are not thinking beyond their area of expertise when it comes to organization.

If you ask a subject matter expert to build a hierarchy, they’ll approach it in a certain way that may not reflect how other end users think. I can give a specific example: I’ve been working with a manufacturing company on their parts taxonomy. The internal expert organizes things in a particular way, but their customers — who are experts in the industry, though not in those specific devices — may approach the same content quite differently.

Alan Morrison:

That’s a useful example, because there’s a whole supply chain potentially involved, and a lot of volume depending on the manufacturing being done — a lot of potential for the information system to benefit from a more coherent approach to information management.

Ontologists often start with competency questions to scope a domain. How does your scoping process fit into the larger goals of the organization in terms of building its information management capabilities?

Heather Hedden:

That’s a good question, especially since a taxonomy can be limited in scope or used enterprise-wide as part of a big knowledge management effort. And as you mentioned in your introduction, taxonomy touches on many different areas. I have to work with knowledge managers and content strategists as well — taxonomy isn’t done in isolation. It also involves people on the technology and IT side.

Alan Morrison:

Do you think of yourself as an architect in some ways?

Heather Hedden:

I haven’t, no. There is a related field called information architecture, and some information architects do think of themselves in those terms. I don’t consider myself an information architect, though I’m familiar with the basic principles and work alongside them.

The matter of scope is always central. Are we talking about a limited-scope taxonomy or something broader? That depends on how mature the organization is — whether they already have good taxonomies, which often they don’t. They may have some term lists and partial taxonomies that never got implemented, or something sitting in a system as a simple term list. My job is to help get those into better shape.

For the higher-level enterprise strategy, I tend to collaborate with others who work at that level of information and knowledge management.

Alan Morrison:

Once you get those into better shape, is there a way to demonstrate the benefit?

Heather Hedden:

I think it’s often easier to demonstrate what’s wrong first — show a search that isn’t working. But taxonomies can be used for more than search. Once we see they’re working well, they can be applied in further implementations, including your area of expertise, graph RAG.

Alan Morrison:

And it’s all about reusability ultimately: making content more widely used and more discoverable. For example, I’ve been working with Graphwise’s KnowledgeHub recently, and I’m finding it’s actually working the way it’s supposed to. It’s become a shortcut to finding the information I need for my blog posts.

Heather Hedden:

That’s great. And you’ve essentially answered your own question about giving a demo. Building something tangible for an organization — that’s very effective.

Alan Morrison:

There’s always a question of how much commitment an organization is willing to make to reach a valuable result. The organizational context differs project to project. Can you talk about how that varies?

Heather Hedden:

Taxonomy isn’t always owned by the same department. It could sit with marketing, with a knowledge management team, with product, with technical documentation, with customer support, or with the team managing the public website. Whoever first owns it may later realize it needs to become a broader strategy. Developing a taxonomy always involves engaging multiple stakeholders — it impacts several parts of the organization, and that can’t be automated.

Getting those initial conversations right is crucial. Once we get to the point of suggesting terms, that’s where automation becomes more useful.

Alan Morrison:

So there’s a lot of prep work before you even start talking about the actual terms?

Heather Hedden:

Yes. You want to figure out the high-level structure first — what the concept schemes should be. I actually hadn’t thought of using generative AI at that level until a client asked me to suggest what a taxonomy for their industry might look like. We used AI for that early ideation, and it turned out to be a useful application.

Alan Morrison:

That makes sense — the ideation phase is where you’re generating potential concepts. In the past you might have been brainstorming with people; now you can brainstorm with a language model.

Heather Hedden:

Exactly. And even if you brainstorm in other ways — tag clouds, for instance, which I’ve used through a previous consultancy — you could now use generative AI to organize the results. But I wouldn’t automatically default to these tools. I’d ask the client whether they find brainstorming tools useful. If they do, great; if not, we don’t have to use them.

One consultancy I worked with — Enterprise Knowledge — would actually present clients with a range of methods and let them choose some. That approach was effective because it made stakeholders feel genuinely engaged. That’s important, because part of a taxonomy’s success is what happens after the consultant leaves: will the organization continue to maintain and govern it? If stakeholders had a role in building it — rather than just receiving something generated by AI — they’re more likely to stay involved. That’s not the typical human in the loop where someone gives feedback on AI output. It’s engagement at a higher level.

Alan Morrison:

Jessica Talisman, whom I interviewed in February, talks about tacit and explicit knowledge. This exercise of building a taxonomy is really about making tacit knowledge explicit — and once it’s explicit, it can be made reusable through structure and connections to broader concept schemes.

Heather Hedden:

Yes, that’s important in taxonomies just as it is in knowledge management more broadly.

Alan Morrison:

What are some of the typical things that threaten to derail a project once it’s underway?

Heather Hedden:

One common problem is when someone involved has a preconceived idea of what the taxonomy should look like — often because they’re conflating it with a classification scheme, or because they’ve seen a different kind of implementation that wouldn’t fit here.

Someone who knows their subject extremely well might produce something that looks like a book’s table of contents, which isn’t quite the same thing.

Part of my role as a taxonomy consultant is not just building taxonomies but educating clients about different approaches: what generally works and what doesn’t, what role a taxonomy actually plays.

I often describe a taxonomy as a bridge between users and content. It has to be tailored to both — which is why we constantly build new taxonomies rather than simply buying them off the shelf, even though you can start with a purchased one and then modify it.

Alan Morrison:

Each organization is different.

Heather Hedden:

Exactly. Their needs differ. Their scope varies — some more limited, some broader. The subject area varies. The sources for terminology vary. They may have many existing term lists and vocabularies or very few. There may be off-the-shelf taxonomies in their industry, or none. Their subject matter experts may know something about taxonomies, or they might not. They may have varying views on using generative AI — some enthusiastic, some cautious. And there’s always timing, scheduling, and budget to consider. A taxonomy project can be small or large; I can tailor it to the budget. You have time for a small taxonomy or we can do a big project — it really depends.

Alan Morrison:

And I assume that larger projects usually grow out of demonstrated success at a smaller scale — addressing a specific pain point first.

Heather Hedden:

Sometimes, yes. Though it doesn’t always happen that way. I always write documentation and a governance plan explaining the taxonomy, how it’s used, and how it should be maintained. The idea is that the organization continues to maintain it themselves after I’m done. Maybe they figure it out from reading my book, too — I hope so.

Alan Morrison:

Let’s talk about the tooling, because it has been changing. What’s changed from an automation perspective — whether AI or otherwise?

Heather Hedden:

Before generative AI and LLMs, there were already machine learning and rules-based text analytics, which have been very valuable. One source of terms for a taxonomy is running a text analytics tool over a body of content. Graphwise’s predecessor tool, PoolParty, has had a feature called corpus analysis for some time, which customers have used to varying degrees. It identifies prominent terms based on word proximity and frequency, both within individual documents and across a corpus. That’s been useful for a while.
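
The frequency side of this idea can be sketched in a few lines. This is a deliberately simplified illustration and not PoolParty's algorithm: it assumes single-word terms, a tiny hypothetical stopword list, and invented sample documents, and it ignores word proximity entirely.

```python
import re
from collections import Counter

# Hypothetical, very small stopword list for the illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "is", "are"}


def candidate_terms(documents, top_n=5):
    """Rank words by document frequency, then total frequency, then alphabetically."""
    total = Counter()      # occurrences across the whole corpus
    doc_freq = Counter()   # number of documents containing the word
    for text in documents:
        words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
        total.update(words)
        doc_freq.update(set(words))
    ranked = sorted(total, key=lambda w: (-doc_freq[w], -total[w], w))
    return [(w, doc_freq[w], total[w]) for w in ranked[:top_n]]


# Invented sample corpus.
docs = [
    "Emergency care and ambulance services for the region",
    "Ambulance response times in emergency care",
    "Outpatient care for chronic conditions",
]

for term, n_docs, n_total in candidate_terms(docs, top_n=3):
    print(term, n_docs, n_total)
```

The output is only a ranked list of candidates; deciding which of them become concepts, and where they sit in the hierarchy, remains a separate step.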

On the other side — not taxonomy development but implementation — the move toward auto-tagging has been significant. Manual tagging still has a place in lower-volume workflows where editors want hands-on control, but when content volume is high and new items arrive frequently, auto-tagging is very effective. That relies on text analytics and natural language processing. Generative AI is especially useful for generating taxonomy content itself.

Alan Morrison:

Though jumping straight in with generative AI can lead to trouble?

Heather Hedden:

From the beginning, it was more useful for smaller tasks, such as generating synonyms or listing subtypes for a category, than for producing an entire taxonomy at once. A fully AI-generated taxonomy might draw on copyrighted content scraped from the web without your even knowing it, and that content may have been built for a different purpose entirely.

Smaller, targeted tasks work better. I recently used ChatGPT and Claude this way for a project with Northern Light, which provides curated competitive and market intelligence in life sciences and pharmaceuticals. For subject areas like drug types that I wasn’t deeply familiar with, I used the AI to clarify distinctions between terms — and it was genuinely useful.

This isn’t controversy-laden public content; it’s science and technology, where I feel fairly confident about the outputs. And more recently, even without being asked, the AI was giving me two-level hierarchies of drug types that were quite good starting points.

Until now, this required copying AI output out of a chat window and putting it somewhere else. Graphwise’s latest feature, Taxonomy Builder, brings that inside the tool. They had an earlier feature called Taxonomy Advisor for suggesting narrower concepts, alternative labels, and definitions. Now Taxonomy Builder extends the whole workflow from within the tool.

Alan Morrison:

Can we see an example of how Taxonomy Builder works?

Heather Hedden:

I’ll give it a try. Here’s a start of a taxonomy in healthcare — we have some concept schemes, some without content yet. If I right-click on “Healthcare Services,” there’s a “Build Your Taxonomy” option. I’ll keep the instructions simple and click generate. While it’s running you can’t manually create anything at the same time — hence the orange banner. It uses a combination of LLMs, I’m told, from OpenAI, Anthropic, and Google, since different models perform better in different areas.

Good — we now have some results. It’s suggested several narrower concepts across multiple levels, not just one level. I can accept everything, select individually, or ask for more. Let me show the “extend” option. I was at “Specialized Care” — I can add more narrower concepts at the same level. And here, under “Emergency Care Services,” we can add alternative labels. That’s where it starts to look more like a thesaurus — you’re looking for synonyms.

Alan Morrison:

A thesaurus in the original sense of the word, as a synonym dictionary.

Heather Hedden:

Exactly. Synonyms — or “alternative labels” as SKOS calls them, which is more precise, since they don’t have to be exact synonyms. They just need to be sufficiently equivalent to enable tagging where those phrases appear in documents, or to match search strings that users enter. The AI can identify near-equivalent terms well.
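
The role of alternative labels in tagging and search can be shown with a small sketch. It assumes simple case-insensitive phrase matching on whole words; the vocabulary, the function names, and the sample sentence are hypothetical, and production auto-tagging relies on much richer text analytics.

```python
import re

# Hypothetical vocabulary: preferred label -> alternative labels.
VOCABULARY = {
    "Emergency care": ["emergency medicine", "ER services", "emergency room care"],
    "Primary care": ["family medicine", "general practice"],
}


def build_label_index(vocabulary):
    """Map every label (preferred or alternative, lowercased) to its preferred label."""
    index = {}
    for pref, alts in vocabulary.items():
        for label in [pref, *alts]:
            index[label.lower()] = pref
    return index


def tag_text(text, index):
    """Return the preferred labels of all concepts whose labels occur in the text."""
    lowered = text.lower()
    found = set()
    for label, pref in index.items():
        # \b restricts matches to whole words or phrases.
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(pref)
    return sorted(found)


index = build_label_index(VOCABULARY)
print(tag_text("Wait times for ER services rose, while general practice visits fell.", index))
print(index.get("family medicine"))  # a user's search string resolved to a concept
```

Neither phrase in the sample sentence is a preferred label, yet both resolve to the right concept, which is the job alternative labels exist to do.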

We can also add definitions. By default it suggests three to choose from. I can pick the one I like and discard the others. Definitions can be edited too, though I’ll admit I’m not fond of writing definitions; I prefer building the taxonomy itself. But definitions are increasingly valuable when the taxonomy is being used for AI services like graph RAG, where more context is better.

Alan Morrison:

The corpus analysis feature — does the Taxonomy Builder currently use a client’s corpus?

Heather Hedden:

Not yet — that’s planned for a future version. Right now it draws on LLMs using public content from the web. But bringing in the client’s own corpus is definitely planned, and I think it will be very valuable. My one complaint about corpus analysis has always been that you end up with an enormous ranked list of terms and it’s still not obvious how to fit them into the taxonomy. Having the AI organize them contextually would help a great deal.

Alan Morrison:

Where does automation leave things today? What gaps still need to be filled by people?

Heather Hedden:

Governance is still the main one. The taxonomy governance plan — policies, procedures, ownership, update and maintenance processes — usually doesn’t exist when I arrive at an organization, and it remains critical even with automation.

With more automated tools, the governance plan has to address things like how many narrower concepts to generate at each level, how deep or broad to go. These parameters need to be decided thoughtfully based on how the taxonomy will be used and displayed. Will the hierarchy be used so that looking up a subject also retrieves content tagged with narrower concepts? Will users browse the hierarchy, or search? The same fundamental design questions remain.
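
The design question about retrieving content tagged with narrower concepts can be made concrete with a short sketch. It assumes a hierarchy stored as a parent-to-children dictionary and documents tagged with sets of concept labels; all names and data are hypothetical.

```python
# Hypothetical hierarchy: concept -> its direct narrower concepts.
NARROWER = {
    "Healthcare services": ["Emergency care", "Specialized care"],
    "Specialized care": ["Cardiology", "Oncology"],
}

# Hypothetical content tags: document id -> concepts it is tagged with.
TAGS = {
    "doc-1": {"Cardiology"},
    "doc-2": {"Emergency care"},
    "doc-3": {"Medical research"},
}


def with_narrower(concept, narrower):
    """Return the concept plus all of its descendants in the hierarchy."""
    result = {concept}
    stack = [concept]
    while stack:
        current = stack.pop()
        for child in narrower.get(current, []):
            if child not in result:
                result.add(child)
                stack.append(child)
    return result


def search(concept, narrower, tags, expand=True):
    """Find documents tagged with the concept, optionally including narrower concepts."""
    wanted = with_narrower(concept, narrower) if expand else {concept}
    return sorted(doc for doc, doc_tags in tags.items() if doc_tags & wanted)


print(search("Specialized care", NARROWER, TAGS, expand=True))
print(search("Specialized care", NARROWER, TAGS, expand=False))
```

With expansion turned off, a document tagged only with "Cardiology" is invisible to a search on "Specialized care", so the depth and breadth of generated hierarchies directly affect what users retrieve.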

What automation does do is make taxonomy creation more accessible. More people can now build and update taxonomies — not just expert taxonomists. That means the governance plan has to account for a broader group of contributors. Subject matter experts who are building out their own areas need some grounding in taxonomy principles: how hierarchy works, what alternative labels are actually for, whether tagging will be automated or manual — because that affects what alternative labels should be included and whether they’ll be visible to users.

Alan Morrison:

When generative AI first appeared, were you worried it would reduce your role?

Heather Hedden:

I was a little skeptical at first — I’m the expert, I can build taxonomies better, do I really want this automation? But I’ve come around. It does help me in subject areas I’m not familiar with. More importantly, it makes taxonomy creation more accessible to more people inside organizations. In Graphwise’s literature they’ve talked about the cold start problem — the difficulty of getting started with a taxonomy before you can build out a semantic layer or knowledge graph. This addresses that. We’re going to see more taxonomies created, for big and small projects alike, in more places. I’m optimistic. Even if I’m doing smaller consulting engagements — more guiding and educating than building — that’s fine. We still need guidance. And more companies building taxonomies means more companies eventually building knowledge graphs, which is good for everyone.

Alan Morrison:

What does the near future look like — 2026 and 2027?

Heather Hedden:

I think we’ve largely covered it. The increasing recognition of the value of taxonomies and ontologies will continue. The blurring between thesauri and taxonomies that we discussed at the start is also happening between taxonomies and ontologies — again largely because of the tooling. Taxonomy and ontology management systems are converging, and the underlying standards — SKOS for taxonomies and thesauri, RDFS and OWL for ontologies — are all based on RDF and can be combined in the same project. You don’t have to get into editing the triples directly; the tools handle it.
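
Because these standards all rest on RDF, statements from each can sit side by side as subject, predicate, object triples. The sketch below assumes plain Python tuples instead of an RDF library; the W3C namespace URIs are the published ones, while the `https://example.org/` namespace and every concept, class, and property under it are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "https://example.org/"  # hypothetical namespace for this illustration

triples = {
    # Taxonomy layer (SKOS)
    (EX + "Cardiology", RDF + "type", SKOS + "Concept"),
    (EX + "SpecializedCare", RDF + "type", SKOS + "Concept"),
    (EX + "Cardiology", SKOS + "broader", EX + "SpecializedCare"),
    # Ontology layer (RDFS and OWL)
    (EX + "Clinic", RDF + "type", OWL + "Class"),
    (EX + "offersService", RDF + "type", OWL + "ObjectProperty"),
    (EX + "offersService", RDFS + "domain", EX + "Clinic"),
    (EX + "offersService", RDFS + "range", SKOS + "Concept"),
    # Instance data linking the two layers
    (EX + "HeartClinic", RDF + "type", EX + "Clinic"),
    (EX + "HeartClinic", EX + "offersService", EX + "Cardiology"),
}


def objects(subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects(EX + "HeartClinic", EX + "offersService"))
print(objects(EX + "Cardiology", SKOS + "broader"))
```

Here a custom semantic relationship (`offersService`) points from an ontology class instance to a taxonomy concept, which is one reading of treating the ontology as an extension of the taxonomy.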

For me, the focus remains on taxonomy extended into ontology — some classes, attributes, and a few semantic relationships, treating ontology as a natural extension of the taxonomy. Others come from the ontology side and view everything as an ontology. Both perspectives are valid, and working together with someone who takes that other view is probably the best approach — different points of view lead to better results when you’re trying to serve many different users.

Alan Morrison:

It’s been really helpful to get your perspective on all of this. Thanks for taking the time, Heather.

Heather Hedden:

Thank you, Alan, for your questions and direction — you got me thinking about things I hadn’t articulated quite this way before.

For More Information:

Graphwise. “Taxonomy Builder.” Graphwise Help Center. Accessed March 26, 2026. https://help.graphwise.ai/en/graph-modeling/thesaurus/taxonomy-builder.html.

Hedden, Heather. The Accidental Taxonomist (blog). Hedden Information Management. https://www.hedden-information.com/the-accidental-taxonomist/.

Hedden, Heather. “AI in Taxonomy Building.” Webinar for Graphwise. Accessed March 26, 2026. https://graphwise.ai/event/ai-in-taxonomy-building/.

———. “Getting Started with Taxonomies.” Webinar for Graphwise. Accessed March 26, 2026. https://graphwise.ai/resources/on-demand-webinar/getting-started-with-taxonomies/.

———. “Taxonomy in the Age of AI.” Webinar for Graphwise. Accessed March 26, 2026. https://graphwise.ai/event/taxonomy-in-the-age-of-ai/.

Hedden, Heather, and Gary Leicester. “AI-Assisted Taxonomy Creation: Tools, Workflows, and Where Do Humans Fit In?” Webinar for Graphwise. Accessed March 26, 2026. https://graphwise.ai/events/.