Skip to content

Instantly share code, notes, and snippets.

@justinmeiners
Last active January 31, 2024 17:54
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save justinmeiners/57f38bddae9029db3c6401fae113bd7c to your computer and use it in GitHub Desktop.
Save justinmeiners/57f38bddae9029db3c6401fae113bd7c to your computer and use it in GitHub Desktop.
Why is std::map typically implemented as a red-black tree?

Why is std::map typically implemented as a red-black tree?

Why not a hash table?

A type only requires < operator (comparison) to be used as a key in a tree. However, hash tables require that each key type has a hash function defined. Keeping type requirements to a minimum is very important for generic programming so you can use it with a wide variety of types and algorithms.

Designing a good hash table requires intimate knowledge of the context it which it will be used. Should it use open addressing, or linked chaining? What levels of load should it accept before resizing? Should it use an expensive hash that avoids collisions, or one that is rough and fast?

Since the STL can't anticipate which is the best choice for your application, the default needs to be more flexible. Trees "just work" and scale nicely.

(C++11 did add hash tables with unordered_map. You can see from the documentation it requires setting policies to configure many of these options.)

What about other trees?

Red Black trees offer fast lookup and are self balancing, unlike BSTs. Another user pointed out its advantages over the self-balancing AVL tree.

Alexander Stepanov (The creator of STL) said that he would use a B* Tree instead of a Red-Black tree if he wrote std::map again, because it is more friendly for modern memory caches.

One of the biggest changes since then has been the growth of caches. Cache misses are very costly, so locality of reference is much more important now. Node-based data structures, which have low locality of reference, make much less sense. If I were designing STL today, I would have a different set of containers. For example, an in-memory B*-tree is a far better choice than a red-black tree for implementing an associative container. - Alexander Stepanov

Should maps always use trees?

Another possible maps implementation would be a sorted vector (insertion sort) and binary search. This would work well for containers which aren't modified often but are queried frequently. I often do this in C as qsort and bsearch are built in.

Do I even need to use map?

Cache considerations mean it rarely makes sense to use std::list or std::deque over std:vector even for those situations we were taught in school (such as removing an element from the middle of the list). Applying that same reasoning, using a for loop to linear search a list is often more efficient and cleaner than building a map for a few lookups.

Of course choosing a readable container is usually more important than performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment