Information architecture core concepts

Some core concepts of information architecture, used for a recent client project

At MakerX we rarely do information architecture or content strategy projects, so I was excited to be able to return to my IA roots when one of our clients asked us for recommendations on improving the structure and user experience of their software documentation.

We provided 3 things:

  • information architecture core concepts
  • recommendations for their content
  • a migration plan

We provided the core concepts to help them understand why we recommended specific approaches, and to give them tools to make decisions in situations we didn't anticipate.

Clearly we can't share the detailed recommendations, as they are for a specific context, but we can share the core concepts as these relate to a range of information architecture projects.

Without further ado—some information architecture core concepts...

Known and unknown information seeking

Information-seeking behaviours describe how people go about finding and learning. 

The two main types of information-seeking behaviours are known-item and unknown-item.

Known-item tasks

In a known-item task, users generally:

  • Know what they are looking for
  • May have looked for it before
  • Can describe it (they know the terminology)
  • Know where to start looking
  • Know when they’ve found the answer

Examples might include:

  • Finding out what an error message means
  • Checking arguments for an API call
  • Looking up a parameter

Known-item tasks are well-suited to the following solutions as the user has correct (or close to correct) terminology that will quickly lead them to the result:

  • Search
  • Asking a chatbot for an answer
  • A–Z lists
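
As a rough illustration of why A–Z lists suit known-item tasks, here is a minimal sketch of building one from page titles. The function name and the page titles are hypothetical, and it assumes every title is a non-empty string. A user who already knows the term can jump straight to its letter.

```python
from collections import defaultdict


def build_az_index(titles):
    """Group page titles under their first letter, sorted alphabetically."""
    index = defaultdict(list)
    for title in titles:
        index[title[0].upper()].append(title)
    # Sort the letters, and sort titles within each letter ignoring case
    return {
        letter: sorted(items, key=str.lower)
        for letter, items in sorted(index.items())
    }


# Hypothetical reference page titles
titles = [
    "timeout parameter",
    "Error E1001",
    "retry policy",
    "Authentication errors",
    "encoding parameter",
]
az_index = build_az_index(titles)

for letter, items in az_index.items():
    print(letter, items)
```

This only works when the user's term matches the title, which is exactly the condition that holds for known-item tasks and fails for unknown-item ones.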

Unknown-item tasks

In an unknown-item (also known as exploratory) task, users generally:

  • Have a broad idea of what they are looking for
  • May not be able to describe it and often don’t know the terminology
  • May not be sure where to start looking
  • May try a number of approaches in a number of locations
  • May attempt many times, over a period of time
  • May never actually complete the information task

Examples might include:

  • Finding out if a specific technology is a good solution for an app they might want to build
  • Learning about core concepts
  • Troubleshooting a problem that they’re unfamiliar with

Learning tasks are always unknown-item tasks.

Users typically approach unknown-item tasks in the following ways:

  • Start by searching, but often with the wrong term
  • Navigate to parent, sibling and child pages once they’ve found information that looks useful
  • Start at the broadest concept and gradually go into more detail as they learn
  • Find examples to try to understand the concept by example
  • Ask someone for help

Unknown-item tasks are best supported by hierarchical content. It is essential to group and sequence the content in a way that helps people both find and follow it.

Why this matters

Users of software documentation come to it with known-item and unknown-item tasks and both types must be catered for—content for only one type makes the other significantly harder to do. For example, if all content was ordered in A–Z lists, it would be difficult (probably impossible) to figure out how to learn a new concept. But if reference-style content was organised into hierarchies and written descriptively it would be difficult to quickly find and grab a reference.

Technical content authors have familiarity with their content and may default to organising it in a way that supports known-item tasks. Instead, it is important to consider the range of information tasks that users actually need to do, and prepare content for the full range. This is where undertaking user research, getting a fresh pair of eyes or making use of dedicated content writers can help.

When something looks difficult, people think it is difficult

People unconsciously decide how hard or easy a task will be based on how it looks. This seems illogical—we should determine difficulty by the content and concepts themselves. However, we make snap decisions unconsciously.

Which of these seems easier to read?

[Figure: Two layouts for the same content]

A page looks hard when it has:

  • Poorly sequenced navigation
  • A lot of content
  • Poor heading hierarchy
  • Long paragraphs
  • Similar sized paragraphs with nothing to break the rhythm

A page looks easier when it has:

  • Clean, professional, visual design that uses white space well
  • Clear visual hierarchy of headings  
  • A variety of information structures such as bulleted lists, tables, graphics, code snippets, callouts (e.g. tips) and examples
  • A varied page rhythm with different length paragraphs
  • A manageable amount of content
  • Well-organised navigation and well-sequenced content that users can skim

Why this matters

When people are learning, it’s important that they can look at the page and think “This looks like something I can tackle”. If they are looking up a reference, it’s important that they think they can get the information without breaking their flow. 

Creating content that looks easy (and is easy) requires deliberate effort.

People lose trust when they find one thing wrong

Accurate content is important for the obvious reason that it helps people to do what they need to do. 

Accuracy is also important for the less obvious reason that when someone finds an error or something out of date, they can quickly lose trust in the whole content set. This happens because they are now no longer able to determine what will be wrong or right.

Why this matters

Keeping content up to date takes effort, and maintenance plans need to be incorporated into all content creation processes.

Hierarchy is actually fine

Many content guides say that navigation should contain only 2 levels. The problem with this guideline is that it is impractical for any large documentation set. The only way to achieve it would be to:

  • have a lot of pages at the top level (making very long top-level navigation)
  • have a lot of pages at level 2 (making very long second-level navigation)
  • write very long pages that cover multiple ideas (hard to consume and find information)
  • split the documentation into many ‘sections’, making it difficult as users then need to navigate between sections

There is nothing inherently wrong with hierarchy and hierarchy depth. Hierarchies are a natural aspect of human thinking—we very easily aggregate granular concepts into higher-level concepts and also very easily break broad ideas down into detail. 

Hierarchy allows layered content

Hierarchy allows us to present information in a learnable way. We can start with broad ideas to get people across a concept, then expand them into more detail, then even more detail without overwhelming the user. Hierarchical structure allows people to take in only the level that is relevant to them, and get more detail as they need or are interested.

Hierarchy can be fine in navigation

Sometimes people will say that the problem is that hierarchy becomes overwhelming in navigation. Most (though admittedly not all) documentation tools can represent hierarchical content well, with styling that allows users to easily see:

  • the page position in the hierarchy
  • parents of a page, to understand context (the best tools show only parents, not everything at the same level)
  • siblings of a page, to find related content
  • children of a page, to find more detailed content
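
To make these four elements concrete, here is a minimal sketch of deriving them from a page tree. The page names and the `PARENTS` mapping are hypothetical, and this is not how any particular documentation tool works. It only shows that the navigation context for a page comes directly from the hierarchy.

```python
# Hypothetical documentation tree: each page maps to its parent (None = top level)
PARENTS = {
    "Guides": None,
    "Reference": None,
    "Authentication": "Guides",
    "Deployment": "Guides",
    "API keys": "Authentication",
    "OAuth": "Authentication",
}


def ancestors(page):
    """Return the chain of parents, from the top level down to the direct parent."""
    chain = []
    parent = PARENTS[page]
    while parent is not None:
        chain.append(parent)
        parent = PARENTS[parent]
    return list(reversed(chain))


def siblings(page):
    """Return the other pages that share this page's parent."""
    return [p for p, parent in PARENTS.items() if parent == PARENTS[page] and p != page]


def children(page):
    """Return the pages directly below this page."""
    return [p for p, parent in PARENTS.items() if parent == page]


def nav_context(page):
    """Collect what a navigation panel needs to show for one page."""
    return {
        "parents": ancestors(page),
        "siblings": siblings(page),
        "children": children(page),
    }


print(nav_context("Authentication"))
```

The depth of a page is simply the length of its parent chain, so the position in the hierarchy needs no extra data.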

Groups and categories

Groups (or categories) bring together content that is about a single idea. 

Groups / categories are a fundamental part of human thinking and we create groups from concepts all the time. This makes groups a natural way to structure online content as well. 

Characteristics of groups include:

  • There is an evident reason that a set of pages or concepts are together.
  • The set of content in a group should be mutually exclusive and collectively exhaustive (MECE). Mutually exclusive means that every element appears in only one group, with no duplication. Collectively exhaustive means that every idea that is part of the concept is included, with nothing left out.
  • Groups don’t need to be equal in size—some can be large and some might be a couple of pages. 
  • Nothing should be left outside a group—if there is a single concept that fits with no other concepts, maybe it is its own group and just hasn’t been fleshed out enough. 
  • For a given amount of content, a larger number of small groups is better than a small number of large groups—large groups become more abstract in concept, and more difficult to understand.
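
The MECE characteristic can be checked mechanically once pages are assigned to groups. The sketch below is one possible interpretation, assuming that each page should sit in exactly one group and that a full list of pages is available. The function name, group names and page names are all hypothetical.

```python
def check_mece(groups, all_pages):
    """Return (duplicates, missing).

    duplicates: pages that appear in more than one group (not mutually exclusive)
    missing: pages that appear in no group (not collectively exhaustive)
    """
    seen = {}
    for group, pages in groups.items():
        for page in pages:
            seen.setdefault(page, []).append(group)
    duplicates = {page: found for page, found in seen.items() if len(found) > 1}
    missing = sorted(set(all_pages) - set(seen))
    return duplicates, missing


# Hypothetical grouping with one overlap and one page left out
groups = {
    "Authentication": ["API keys", "OAuth"],
    "Deployment": ["Docker", "OAuth"],
}
all_pages = ["API keys", "OAuth", "Docker", "Rate limits"]

duplicates, missing = check_mece(groups, all_pages)
print("In more than one group:", duplicates)
print("In no group:", missing)
```

A check like this only finds structural overlap and gaps. Whether a group has an evident reason to exist still needs human judgement.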

Grouping methods that appear to work but don’t 

Some grouping methods that are commonly used in websites (and in software documentation in particular) appear to be OK on the surface, but generally don’t work because:

  • They don’t match the way people think about their information task
  • The content doesn’t fit into them easily, often resulting in overlap and duplication

These include:

  • Lifecycle / linear
  • Content types
  • Audience

Lifecycle / linear

Lifecycle-based groups represent a linear sequence that a user will theoretically go through. Examples common on the internet include School / University / Buying a house / Retiring. In software documentation it is common to have 'Get started' as a major category, followed by groups that represent linear steps in learning.

The reasons this grouping method doesn’t work are: 

  • People rarely follow the same kind of linear path that the categories assume. Even 'Get started' doesn’t always work well, as the steps to get started may be different for each individual depending on what they are trying to achieve.
  • Content rarely slots neatly into these categories.

Content types

When people are looking for something, the topic is top of mind. They rarely think about the format it will come in. Once they have found the topic of interest, if there is a range of formats, they may consider which format they would like to learn with.

Examples common in software documentation include:

  • 'Docs' itself: While the concept of software docs is well-known, it’s not well defined or consistent—'docs' cover different things for different products. People don’t come to 'find the docs'—they will come to 'find an answer to a question' or 'learn about a concept'.
  • Tutorials, articles, blog: Because learners don’t think in terms of content types, they will not necessarily look at tutorials or articles.
  • Videos: While most organisations store videos on YouTube, they are best discovered (linked to / embedded) when they are in the context of the topic that people want to learn.

Audience (especially Beginner - Advanced)

Another grouping method that on the surface seems sensible is user characteristics, or audience groups. In this context, the most common is audience levels such as Beginner / Intermediate / Advanced.

People don’t easily fall into these groups. A beginner could be someone new to programming, or someone new to a language but super-experienced with programming more broadly. Their needs are quite different because their background knowledge is different. An experienced developer new to a language isn’t intermediate either, as they still need to understand some specific core concepts. Defining an advanced user is even harder!

Content also doesn’t fit into these groups well.

Further reading

(This wasn't part of our recommendations—this is for you. Some oldies, but goodies...)