You can tell a lot about what information is used for by how it is stored. Or at least, you should be able to tell, because it should be organised in such a way that it’s really easy to access. Dictionaries are in alphabetical order by keyword. The Icelandic telephone book is alphabetised by given name, since the Icelandic culture has no family names. Each person’s second name is modelled on the first name of one of their parents.
Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowchart; it’ll be obvious.
— Fred Brooks, The Mythical Man-Month
Organising information badly can be a powerful impediment to efficiency. It’s amazing how much easier some tasks become when the information you require is organised in the right way. A thesaurus isn’t ordered alphabetically, but grouped by theme. Can you imagine how difficult it would be to find a word in a thesaurus if it didn’t also have the alphabetical list at the back?
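For the programmers in the audience, the thesaurus trick is a primary grouping plus a secondary index. Here is a minimal sketch of that idea in Python; the tiny word lists and the names `themes`, `build_index` and `synonyms` are made up for illustration and not taken from any real thesaurus.

```python
# A toy thesaurus: the main body is grouped by theme, like the printed book.
# The words below are invented sample data.
themes = {
    "happiness": ["cheerful", "glad", "joyful"],
    "sadness": ["gloomy", "glum", "sorrowful"],
}


def build_index(themes):
    """Build the alphabetical list at the back: word -> themes it appears under."""
    index = {}
    for theme, words in themes.items():
        for word in words:
            index.setdefault(word, []).append(theme)
    # Sort the keys so the index reads alphabetically, as in the printed book.
    return dict(sorted(index.items()))


def synonyms(word, themes, index):
    """Look the word up in the index, then jump to its theme group(s)."""
    found = []
    for theme in index.get(word, []):
        found.extend(w for w in themes[theme] if w != word)
    return found


index = build_index(themes)
print(synonyms("glad", themes, index))
```

Without the index, finding "glad" would mean reading through every theme group until you stumbled on it. With it, the lookup is a single step followed by a jump to the right group.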
I’m interested in data structures from a computer science perspective. It’s amazing what a difference good data structures can make to the efficiency and comprehensibility of a program. But you don’t have to delve into the world of programming to work this out.
Do you remember Choose Your Own Adventure books? I read a few, but they were always really disappointing. I spent half the time backtracking from abrupt deaths and very little time enjoying the reading. Games which require lots of decisions are really not suited to book format, unless you enjoy spending most of your time hunting for the next page in the story.
There is a natural symmetry between the linear format of a book and the linear format of a plot. A branching tree of decisions, such as a game, doesn’t fit into this system, and the result is an extremely disappointing game/book hybrid.
Organising data for efficient retrieval is, in some cases, a full-time job. You can even get degrees in librarianship, or get paid lots of money as a database analyst.
Smart data structures and dumb code works a lot better than the other way around.
— Eric S Raymond, The Cathedral and the Bazaar
By way of contrast, the best data retrieval system I have ever come across is due to Mike, one of Helen’s old flatmates. The problem is typical: how to organise a CD collection for quick retrieval of good tunes. On reflection, all standard methods of organisation are sub-optimal. Alphabetising doesn’t work, because no-one thinks, “I’d like to listen to some L music today”. Organisation by style doesn’t work unless you can cut tracks out of CDs and move them to other discs.
My problem, especially when faced with a large collection, is two-fold: indecision coupled with the secretary problem. I don’t know what to listen to, and if I do make a decision I’m sure there’s a better album buried in there somewhere. What I do have is a good idea of what I don’t want to listen to. So all you need to do is reorganise the CDs so they’re all in the wrong boxes. If you think “I don’t know what I want, but it’s definitely not Bach”, you can happily put on whatever you find in the box labelled JS Bach: Cello Suites.
Alas, as with all data structures, advantages in one place must be offset by disadvantages elsewhere. Finding good music is O(1). Finding a particular album is O(n). (I suppose it would be n-1 boxes at worst, since it’s never in the box it’s supposed to be in, but O(n-1) is the same thing as O(n).) Oh well!
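For the curious, here is a rough sketch of the wrong-boxes scheme in Python. It is my own illustration and certainly not how Mike did it: the function names (`misfile`, `play_anything_but`, `find_album`) are invented, and I assume every album title is distinct and that there are at least two of them.

```python
import random


def misfile(albums, rng=random):
    """Put every album in the wrong box. Returns {box label: album inside}."""
    if len(albums) < 2:
        raise ValueError("need at least two albums to misfile them")
    order = list(albums)
    rng.shuffle(order)
    # Each box gets the next album along in the shuffled order (wrapping
    # round at the end), so no box can contain its own album.
    return {label: order[(i + 1) % len(order)] for i, label in enumerate(order)}


def play_anything_but(boxes, unwanted):
    """O(1): open the box labelled with the thing you don't want."""
    return boxes[unwanted]


def find_album(boxes, wanted):
    """O(n): open boxes one by one until the wanted album turns up.

    Returns the label of the box it was in and how many boxes were opened.
    """
    opened = 0
    for label, album in boxes.items():
        if label == wanted:
            continue  # it is never in its own box, so skip that one
        opened += 1
        if album == wanted:
            return label, opened
    raise KeyError(wanted)


albums = ["Cello Suites", "Kind of Blue", "Abbey Road", "Blue", "Low"]
boxes = misfile(albums)
print(play_anything_but(boxes, "Cello Suites"))
print(find_album(boxes, "Cello Suites"))
```

The single dictionary lookup is the O(1) half of the bargain. The loop in `find_album` is the O(n) half, and it opens at most n-1 boxes because it skips the one box the album is guaranteed not to be in.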