Why There’s No Single Best Way To Store Information

Just as there’s no single best way to organize your bookshelf, there’s no one-size-fits-all solution to storing information.

Consider the simple situation where you create a new digital file. Your computer needs to rapidly find a place to put it. If you later want to delete it, the machine must quickly find the right bits to erase. Researchers aim to design storage systems, called data structures, that balance the amount of time it takes to add data, the time it takes to later remove it, and the total amount of memory the system needs.

To get a feel for these challenges, imagine you keep all your books in a row on one long shelf. If they’re organized alphabetically, you can quickly pick out any book. But whenever you acquire a new book, it’ll take time to find its proper spot. Conversely, if you place books wherever there’s space, you’ll save time now, but they’ll be hard to find later. This trade-off between insertion time and retrieval time might not be a problem for a single-shelf library, but you can see how it could get cumbersome with thousands of books.
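The shelf trade-off can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and a Python list stands in for the shelf. The alphabetized shelf pays at insertion time, and the unsorted shelf pays at lookup time.

```python
import bisect

def add_sorted(shelf, title):
    """Insert a title into an alphabetized shelf, keeping it in order."""
    # Finding the spot is a quick binary search, but every later
    # book has to slide over by one to make room.
    bisect.insort(shelf, title)

def find_sorted(shelf, title):
    """Binary-search an alphabetized shelf; return the index, or -1."""
    i = bisect.bisect_left(shelf, title)
    return i if i < len(shelf) and shelf[i] == title else -1

def add_unsorted(shelf, title):
    """Put the book wherever there is space: here, at the end."""
    shelf.append(title)

def find_unsorted(shelf, title):
    """Scan the shelf book by book; return the index, or -1."""
    for i, candidate in enumerate(shelf):
        if candidate == title:
            return i
    return -1
```

With a handful of books the difference is invisible. With thousands, the scan in `find_unsorted` and the shifting inside `add_sorted` are where the time goes.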

Instead of a shelf, you could set up 26 alphabetically labeled bins and assign books to bins based on the first letter of the author’s last name. Whenever you get a new book, you can instantly tell which bin it goes in, and whenever you want to retrieve a book, you will immediately know where to look. In certain situations, both insertion and removal can be a lot faster than they would be if you stored items on one long shelf.

Of course, this bin system comes with its own problems. Retrieving books is only instantaneous if you have one book per bin; otherwise, you’ll have to root around to find the right one. In an extreme scenario where all your books are by Asimov, Atwood, and Austen, you’re back to the problem of one long shelf, plus you’ll have a bunch of empty bins cluttering up your living room.
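Here is a minimal sketch of the bin system and its failure mode, assuming each book is represented only by its author's last name. The helper names `bin_for` and `build_bins` are made up for illustration.

```python
def bin_for(last_name):
    """Pick a bin from the first letter of the author's last name."""
    return last_name[0].upper()

def build_bins(last_names):
    """Drop each book into its bin; only bins that get used are created."""
    bins = {}
    for name in last_names:
        bins.setdefault(bin_for(name), []).append(name)
    return bins

# The extreme scenario from the text: every author starts with "A".
bins = build_bins(["Asimov", "Atwood", "Austen"])
print(len(bins["A"]))   # 3 books crowded into one bin
print(26 - len(bins))   # 25 of the 26 bins stay empty
```

Choosing the bin is instant, but finding a book inside the crowded "A" bin is the one-long-shelf problem all over again.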

Computer scientists often study data structures called hash tables that resemble more sophisticated versions of this simple bin system. Hash tables calculate a storage address for each item from a known property of that item, called the key. In our example, the key for each book is the first letter of the author’s last name. But that simple key makes it likely that some bins will be much fuller than others. (Few authors writing in English have a last name that starts with X, for example.) A better approach is to start with the author’s full name, replace each letter in the name with the number corresponding to its position in the alphabet, add up all these numbers, and divide the sum by 26. The remainder is some number between zero and 25. Use that number to assign the book to a bin.
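The letter-sum recipe can be written out directly. The sketch below is one reading of it: the function name `letter_sum_bin` is hypothetical, and skipping spaces and punctuation is an assumption the text does not spell out.

```python
def letter_sum_bin(author, num_bins=26):
    """Map an author's name to a bin number from 0 to num_bins - 1."""
    total = 0
    for ch in author.lower():
        if "a" <= ch <= "z":                  # assumption: ignore spaces, hyphens, etc.
            total += ord(ch) - ord("a") + 1   # a -> 1, b -> 2, ..., z -> 26
    return total % num_bins                   # the remainder picks the bin

for name in ["Asimov", "Atwood", "Austen"]:
    print(name, letter_sum_bin(name))
# Asimov 1
# Atwood 0
# Austen 2
```

The three authors who all landed in the "A" bin under the first-letter rule now fall into three different bins, because every letter of the name contributes to the sum.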
