In praise of the tag

A manifesto

Oct 31, 2022

Warning: This post is too long to read in your email client, as I’ve used a lot of images to illustrate my points, so I’d recommend opening it in a browser or the app instead.

If you’re using metadata, you’re already tagging. Simple as that.

What you might not be doing is utilising the full power of tags, which is what I want to show you.

This is quite a long post, and is one that I’ve been slowly writing over quite a few weeks now, so settle down with a cup of tea.

Over the past year or so, I’ve noticed a shift against tagging as a method of organising and retrieving information. The current furor over Tana has ameliorated it slightly, due to the introduction of ‘supertags’, but as a general rule tags don’t seem to be the popular flavour.

I don’t quite understand why, as I still believe that tagging is one of the most powerful ways of retrieving what you actually want – and after all, isn’t that the point of personal knowledge management? There’s no point in storing and organising if it doesn’t make it easier to actually get what you need from it.

So this post is to try to show you why you should also be using tags, and to show you my own system.

What you’ll find in here:

What is a tag?

We all instinctively recognise and know how to use tags, but it’s important to think carefully about what a tag actually is.

In a 2014 article for The Indexer, Heather Hedden describes tagging as cataloguing or assigning metadata. I really like this description, as it focuses on the metadata element that is key. Let’s keep this in mind as we continue.

The different types of tag

In a 2018 article, Pauline Rafferty discusses eleven different ‘categories’ of tag. I won’t go into all of them here, but instead I’ll focus on the ones that are particularly relevant for personal knowledge management.

There are obvious categories such as ‘content-based’ tags. This is, as the name suggests, a tag that organises the information by content, e.g. a note about indexes is tagged as #indexes.

There are also ‘attribute’ tags, which are inherent attributes of an object that may not be apparent from the content itself, e.g. a quality such as #funny. (This could also be seen as a ‘subjective’ tag, which is the tagger’s opinion about the content. I note that there is also a tag category of ‘factual’ tags, which is actually a supercategory of content-based, context-based, and attribute tags. It’s all a bit confusing.)

Rafferty also talks about ‘context-based’ tags, which involve things such as time of creation, location of creation, etc. In many PKM apps, you’ll find that this type of ‘context-based’ tagging is embedded in the metadata for any given note.

Metadata as tags

If we return to the original idea that tags are a way of cataloguing or assigning metadata, then you could also see metadata as a form of tag. They’re interchangeable, in a way. You don’t think of ‘time of creation’ or ‘last edited’ metadata as a tag, but it is, really.

Custom metadata can easily be seen as tagging. I personally have custom metadata categories that could equally be tags, such as ‘time period’ (e.g. ‘17th century’), but which I keep as metadata so that they don’t clutter up my tag index. (See a later section for my tag index.)

If you use metadata, you’re already tagging. Expand it some more.

Why tagging is so useful

There are four main reasons, which are admittedly in part Obsidian-based, but I think the principles are relatively transferable.

They are:

First, however, I’ll go through the Knowledge Organisation Model. If you’re not interested in the theoretical framework – which is fair enough – skip to section 1, cross-referencing and combining tags.

The Knowledge Organisation Model

Hope Olson and Lisa Given outline a model of knowledge organisation (and retrieval) in a 2003 article for The Indexer (a publication I’ve had great delight in working through the archives of).

They include a helpful but mildly confusing diagram:

To make sense of this, I’ll talk you through each component.

Coextensiveness is how the information has been sorted into categories, and how useful it is to the finder. For example, think about how you could organise categories of animals. You could organise them based on zoological taxonomy, e.g. cats are felines. Alternatively, you could classify animals based on their relationship to humans – so categories such as ‘domesticated’ and ‘wild’ are more useful than ‘feline’ or ‘amphibian’. You might also want to further subdivide into ‘pets’, ‘livestock’, and so on. This choice depends wholly on the context of your notes.

Skipping over ‘consistency’ (we’ll come back to that), we then have specificity. This is the relative detail within the vocabulary, or the number of hierarchical levels defined. Think of this in terms of how many subfolders you have, or subtags. Staying with the example of animals, you may want to start subdividing ‘cats’ down into breeds – Burmese, Ragdoll, British Shorthair, and so on. Alternatively, maybe you want to provide further categories such as ‘diet’, ‘health’, ‘environment’. Potentially you might then break down ‘health’ by body parts, or by age of the cat.

Precision is a measure of how effectively a system can retrieve the relevant information. This is my favourite element, as it’s the one I consider the most important – and where tags shine. If precision is high, then there is little irrelevant information returned; if it’s low, then you have to sift through a lot of irrelevant information to find what you’re after.

Looking back at that model, you can see that an arrow leads from specificity → precision. This shows that specificity impacts on precision. Makes sense, doesn’t it? If you’re looking to find information about what health problems an elderly cat might have, then you really need that information catalogued in a way you can easily find it.

Exhaustivity is the breadth or comprehensiveness of the cataloguing. With indexing, it’s the question of how often a topic needs to appear in a book before it gets listed in the index. For a folder system, it might be how often something needs to appear before those notes get their own folder. For tagging, it might be how often until a new tag or subtag is created. For a MoC/link system, it might be how often until you create a MoC note.

Recall is, quite simply, how much of the available information is actually retrieved. It’s inversely connected to precision: high recall will bring back as much information as possible, however loosely relevant, resulting in lower precision. High precision will result in lower recall, as fewer instances are returned. If you want better recall without sacrificing precision, you could increase exhaustivity, as per the arrow in the diagram.
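To make the trade-off concrete, here is a minimal Python sketch of how precision and recall could be calculated for a single search. The `precision_recall` helper and the note names are made up for illustration; they aren't from Olson and Given's article.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search over a set of notes."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: what share of the results were actually wanted?
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: what share of the wanted notes actually came back?
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four notes are genuinely about elderly cat health.
relevant = {"arthritis", "kidney-disease", "dental-care", "senior-diet"}

# A broad search on a general 'cats' category returns everything about cats.
broad = relevant | {"kitten-toys", "burmese-breed", "litter-training", "cat-names"}

# A narrow search on a very specific category misses one relevant note.
narrow = {"arthritis", "kidney-disease", "dental-care"}

print(precision_recall(broad, relevant))   # (0.5, 1.0)
print(precision_recall(narrow, relevant))  # (1.0, 0.75)
```

The broad search finds everything but half of what it returns is noise; the narrow search returns nothing irrelevant but leaves a note behind.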

Consistency is the last component, and something I think tags are often criticised for. Consistency is where you use the same rules to organise your categories, so that you implicitly know how to navigate them. With tagging, you can easily fall foul of simple things like using both singular and plural versions of a tag, e.g. #index and #indexes, making it difficult to actually group all of the relevant notes. There are ways of getting around this, however, that are really quite simple.

Cross-referencing and combining tags

Very simply, you can combine tags in absolutely any way. This allows you to retrieve notes in a way that scores high on precision or scores high on recall – or a balance of the two.

This means you don’t need to fix your notes into a hierarchical structure, nor do you have to define connections between notes in any more complex terms than ‘they roughly fit into the same category and I’d use the same key term to look for them in a book index or Google’. My personal tagging rule is to ask ‘where would I look for this?’ and then use the corresponding term.

Keeping to the theme of indexes, since I have an appropriate number of them in my vault to function as a good example, say I want to find all of my notes that relate to both indexes and alphabetical order. This is a search that requires high precision and high recall – I don’t want any notes about indexes but not alphabetical order, or vice versa, but I do want to find all of my notes about both.

In this case, I can simply type into the search bar:

This is a demonstration of traditional tagging. I can immediately see, however, that the results include a secondary source (green) and a primary source (blue), and I’ve decided I want to exclude them.

In this case, I have a custom metadata field called ‘notetype’ added to all of my notes in my vault and they’re separated into folders. This means that if I want to only return notes that are notes, I just need to add ‘notetype: note’ to my search query or ‘path: Notes’. (It’s also worth noting that it’s due to the notetype custom metadata field that the notes are colour-coded and have emojis, thanks to the Supercharged Links plugin - strongly recommended.)
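The logic behind this kind of combined search can be sketched in a few lines of Python. This is only an illustration of the principle, not how Obsidian works internally: the `Note` class, the `search` function, and the note titles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    notetype: str                      # e.g. "note", "primary source"
    tags: set = field(default_factory=set)


def search(notes, all_tags, notetype=None):
    """Titles of notes carrying every tag in all_tags, optionally of one notetype."""
    wanted = set(all_tags)
    return [
        n.title
        for n in notes
        if wanted <= n.tags and (notetype is None or n.notetype == notetype)
    ]


# A made-up vault of four notes.
vault = [
    Note("Early alphabetised indexes", "note", {"indexes", "alphabeticalorder"}),
    Note("A printed concordance", "primary source", {"indexes", "alphabeticalorder"}),
    Note("Index entries as jokes", "note", {"indexes"}),
    Note("Sorting names in ledgers", "note", {"alphabeticalorder"}),
]

# Both tags required: notes with only one of the two are left out.
print(search(vault, ["indexes", "alphabeticalorder"]))

# Same search, narrowed to notes whose notetype is 'note'.
print(search(vault, ["indexes", "alphabeticalorder"], notetype="note"))
```

Requiring every tag keeps precision high, and because nothing depends on where a note is filed, any note with both tags will be found.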

I can also use the Omnisearch plugin, the dataview plugin, or the DB Folder plugin, depending on how I want to use or store my search results. If I want to find all of my index-related notes for the 16th century, as an example, I can get this back using DB Folder:

I have other notetypes (secondary source, primary source, and person), marked up with notetype-specific metadata. Again, this makes for easy retrieval and viewing.

For example, all of my person notes for the 16th century, using a dataview table (sorted by when they were born, if that data is available):

I can narrow or widen search parameters as I please, because all of my notes are thoroughly tagged up. If I wanted to find all my listed printers in the 16th century, that is easily done.

Subtags

In the above example about indexes and alphabetical order, I could theoretically have created a subtag of #indexes/alphabeticalorder…or would that be #alphabeticalorder/indexes?

This effectively turns into a question of hierarchy – or folders. I don’t want that. It doesn’t tick the right box for specificity. I don’t want indexes or alphabetical order to relate to each other hierarchically, and I want them to cluster with other notes about indexes and other notes about alphabetical order. A subtag would be inappropriate here.

Yet there are circumstances where I might want to use a subtag, because it relates to a highly specific type of note with a clear hierarchy.

For example, I want to find all of my notes that contain something funny about indexes. I could theoretically use the attribute tag #funny on all notes where I found the content entertaining – I may still do that – and then combine it with #indexes, but when it comes to my index view (see later) I want to see those funny stories nested as a subheading under indexes.

I have a few notes about funny indexes now, and I had previously had to link them directly – creating a tedious requirement to backlink – so I’m reconsidering the exhaustivity and specificity of my tagging system.

I’ve decided to create a subtag #indexes/humourous:

If I want to rapidly find a particular note about funny indexes, I can do so in the space of seconds.
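One useful property of subtags is that a search for the parent can also pick up anything nested beneath it. Here is a minimal Python sketch of that matching rule, assuming tags are stored without the leading # and nested with a slash; the `matches` and `notes_tagged` helpers and the note titles are invented for the example.

```python
def matches(tag, query):
    """True if tag is the query itself or is nested somewhere beneath it."""
    return tag == query or tag.startswith(query + "/")


def notes_tagged(notes, query):
    """Sorted titles of notes with at least one tag matching the query."""
    return sorted(
        title
        for title, tags in notes.items()
        if any(matches(tag, query) for tag in tags)
    )


# Hypothetical notes, mapped to their tags.
notes = {
    "A sarcastic index entry": {"indexes/humourous"},
    "Alphabetical order in indexes": {"indexes", "alphabeticalorder"},
    "A funny marginal note": {"funny"},
}

# The parent tag finds the nested note too.
print(notes_tagged(notes, "indexes"))

# The subtag finds only the funny index note.
print(notes_tagged(notes, "indexes/humourous"))
```

Appending the slash before comparing stops a tag that merely starts with the same letters from being treated as a subtag.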

Tags → index view

Admittedly this is where my nerdery for indexes (if you hadn’t already noticed it) comes to the fore.

I have a number of notes in my vault that serve as index notes. The main one is creatively called ‘! Index’, which contains every single ‘note’ note in my vault.

This is automatically generated and updated using the dataview plugin:

It’s easy to navigate: cmd + f and I can search for whatever tag I’m thinking of.

And I can see subtags, too:

The value of this view is that I can start to see clusters of notes, and as I scroll through it I can see where I should start breaking tags down into subtags, or go into notes directly and link them to each other.
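If you'd like to see the principle outside Obsidian, the same kind of index can be built from nothing more than a mapping of notes to tags. This Python sketch is not my Dataview query, just a rough equivalent under simple assumptions: `build_index`, `render`, and the sample notes are all hypothetical, and subtags are indented by counting slashes.

```python
from collections import defaultdict


def build_index(notes):
    """Map each tag to a sorted list of note titles, with tags in sorted order."""
    index = defaultdict(set)
    for title, tags in notes.items():
        for tag in tags:
            index[tag].add(title)
    return {tag: sorted(titles) for tag, titles in sorted(index.items())}


def render(index):
    """Plain-text index: one heading per tag, subtags indented by depth."""
    lines = []
    for tag, titles in index.items():
        depth = tag.count("/")
        lines.append("  " * depth + "#" + tag)
        lines.extend("  " * (depth + 1) + "- " + title for title in titles)
    return "\n".join(lines)


# Made-up notes and tags.
notes = {
    "Alphabetical order in early indexes": {"indexes", "alphabeticalorder"},
    "A sarcastic index entry": {"indexes/humourous"},
    "Why indexes exist": {"indexes"},
}

print(render(build_index(notes)))
```

A note with two tags appears under both headings, which is exactly what makes clusters visible as the list grows.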

Where you should link instead of tag

I am also a big fan of linking – I just don’t think you should rely on it as your primary way of navigating your notes.

Instead, I link (and sometimes embed sections of) notes where it develops a line of thought or has a contradictory point of view. I also link to secondary and primary source pages, as well as people.

Here’s an example note:

You can see here that there’s a fair amount going on, but all of my links have a specific purpose that adds meaning and aids navigation. Using the core templates plugin, each type of note takes seconds to create and fill in, yet is incredibly helpful.

Where you should use folders instead of tags

I only use folders to organise very broad categories of notes. I have a folder for ‘bibliography’ (separated by secondary/primary), blog post drafts, notes, templates, people, and readwise.

I really don’t recommend having any more folders than you need to find your proper ‘finding devices’, i.e. static dataview/DB Folder query notes, that sort of thing.

Pitfalls

Quickly, as this is getting ridiculously long, I will address some pitfalls with tagging, and how to avoid them.

Undertagging

Tag more than you think you need to. You don’t know what emergent categories are going to develop as you build your collection of notes – there is nothing more painful than having to sift through hundreds of notes adding a tag over and over. If in doubt, tag it. It does no harm.

Inconsistencies in naming conventions

Decide whether you want to refer to things in the singular or plural, and check your tag index regularly to make sure there are no stray typos or other mistakes. Plugins like Frontmatter Tag Suggest are really helpful in this, but obviously you need to be on top of your custom metadata fields yourself.
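If you can export a list of your tags, a quick script can do some of this checking for you. The Python sketch below is a crude heuristic, not a feature of any plugin: the `find_variants` function is hypothetical, and it only catches tags that differ by capitalisation or by a trailing 's' or 'es', so it will miss typos and irregular plurals.

```python
from collections import defaultdict


def find_variants(tags):
    """Group tags that differ only by case or by a trailing 's' or 'es'."""
    known = {tag.lower() for tag in tags}

    def key(tag):
        lower = tag.lower()
        # Only treat the ending as a plural if the shorter form is also in use.
        for suffix in ("es", "s"):
            stem = lower[: -len(suffix)]
            if lower.endswith(suffix) and stem in known:
                return stem
        return lower

    groups = defaultdict(set)
    for tag in tags:
        groups[key(tag)].add(tag)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# A made-up tag list with a few inconsistencies.
tags = ["#index", "#indexes", "#Indexes", "#printers", "#printer", "#funny"]

for group in find_variants(tags):
    print(group)
```

Each group it prints is a set of tags you probably meant to be one tag, ready to be merged by hand.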

The (relevant) community plugins I use

Bibliography

Hedden, Heather, ‘Tagging versus Indexing’, The Indexer, 32.2 (2014), 81–82

Olson, Hope A., and Lisa M. Given, ‘Indexing and the “organized” Researcher’, The Indexer, 23.3 (2003), 129–33

Rafferty, Pauline, ‘Tagging’, Knowledge Organization, 45.6 (2018), 500–516

ThoughtfulAtlas’ Newsletter
