taxonomy and folksonomy

So in grad school I felt myself leaning pretty far toward the "f" in the folksonomy vs. taxonomy debate, though I don't think the word "folksonomy" had even been invented yet. I remember sitting in cataloging class thinking something pretty close to "we should search, not sort." I was thinking that smart objects (book records with enough information and catalogs with enough power to tell us where they were) could self-describe, a la Google -- that is, let the records describe themselves based on how they are used. Sort of.

This leads us to tagging and folksonomy -- what people call the stuff they use and how they label it. Flickr and Technorati have got it right; they let users describe things in natural language. Tags are intuitive, they build community, and they're fun. Controlled vocabularies are stiff and inert, slow as sloths in butter, and famously limited by their describers' imagination and knowledge.

That said, controlled vocabulary is of course vital. We need prescriptive language, and we depend on it, because it has structure, consistency, and authority. It's not made up by crack-addled circus clowns, after all. It's deliberated on in an ongoing, rolling, decades-enfolding process by specialists and subject experts. That stability is its great strength.

Now what we need to make happen is: records built on strong taxonomy with subfields for folksonomic tags.

A word on tags. Too many of them make tagging meaningless for any given record. If you have to read through a thousand tags for one record, then any given tag in that record is relatively useless. One strength of tags is that they point the user in a direction. So if we're going to exploit tagging for library records, it seems to me that we have to control it. The best way to do this is to keep only the 30 or so most relevant tags -- relevance being determined by repetition, that is, by how many times readers have attached the tag to the record. So if the tag "archaeology" is the fourth most frequently attached tag for your Planetary: Leaving the 20th Century (by Warren Ellis -- and read him) record, it will be the fourth most relevant tag. It warrants inclusion in the subfield.
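
To make the "keep the top 30 by repetition" idea concrete, here's a rough Python sketch. Everything in it is my own assumption for illustration: the record layout, the field names (`subject_headings`, `reader_tags`), the `top_tags` helper, and the sample tags are all made up, and no real catalog system works exactly this way as far as I know.

```python
from collections import Counter

def top_tags(applied_tags, limit=30):
    """Return the `limit` most frequently applied tags, most common first."""
    # Light normalization so "Archaeology" and "archaeology " count as one tag.
    cleaned = (tag.strip().lower() for tag in applied_tags)
    counts = Counter(tag for tag in cleaned if tag)
    return [tag for tag, _ in counts.most_common(limit)]

# Hypothetical record: the controlled fields are set by catalogers,
# while the tag subfield is recomputed from what readers actually attach.
record = {
    "title": "Planetary: Leaving the 20th Century",
    "author": "Warren Ellis",
    "subject_headings": [],  # cataloger-assigned terms would go here
    "reader_tags": [],
}

# Invented sample data: every tag any reader has attached to this record.
applied = (
    ["superheroes"] * 5
    + ["conspiracy"] * 4
    + ["science fiction"] * 3
    + ["Archaeology", "archaeology "]
    + ["pulp"]
)

# Keep only the top 4 here to keep the example small; the default is 30.
record["reader_tags"] = top_tags(applied, limit=4)
print(record["reader_tags"])
```

With this sample data "archaeology" lands fourth and "pulp," attached only once, falls off the end. The controlled part of the record never changes; only the tag subfield moves as readers keep tagging.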

This all leads to reader participation in the organization of information. Librarians no longer have a monopoly on this. Computers and people are finding new, sexier ways to do it for themselves. For librarians to stay in the game, we've got to incorporate self-organizing, bottom-up, grassroots folksonomies into the very careful and rather inert records we create. We need moveable records (or, to clarify: portions of records) that make library materials dynamic for our users. iBistro is a step toward Amazon, but we've got to start stepping faster.

Anyway, the point I meant to get around to is that I want to see tagging incorporated into cataloging in a big way, and soon. If I've missed such developments, please share.


