Skip to content

Instantly share code, notes, and snippets.

@hannahwhy
Created January 3, 2015 19:03
Show Gist options
  • Save hannahwhy/5c07ed9d2ba27b5de428 to your computer and use it in GitHub Desktop.
Save hannahwhy/5c07ed9d2ba27b5de428 to your computer and use it in GitHub Desktop.
#include <boost/intrusive/unordered_set.hpp>
#include <chrono>
#include <random>
#include <vector>
#include <iostream>
namespace bi = boost::intrusive;
struct Billboard : public bi::unordered_set_base_hook<>
{
bi::unordered_set_member_hook<> member_hook;
Billboard(unsigned int tag) : tag(tag) {}
unsigned int tag = 0;
float x = 0.0f;
float y = 0.0f;
// The rest of this is boost::intrusive::unordered_set plumbing.
friend bool operator==(const Billboard& a, const Billboard& b) {
return a.tag == b.tag;
}
friend std::size_t hash_value(const Billboard& a) {
std::hash<unsigned int> h;
return h(a.tag);
}
/**
* A functor to find billboards by tag without constructing a billboard.
*/
struct Finder {
bool operator()(unsigned int tag, const Billboard& b) const {
return tag == b.tag;
}
bool operator()(const Billboard& b, unsigned int tag) const {
return b.tag == tag;
}
};
};
typedef bi::unordered_set<Billboard> Billboards;
typedef Billboards::bucket_type bucket_type;
typedef Billboards::bucket_traits bucket_traits;
int main(int argc, char** argv)
{
constexpr unsigned int billboardCount = 100;
constexpr unsigned int iterationCount = 100000;
unsigned int i;
// Construct billboards and set tags.
std::vector<Billboard> bs;
for (i = 0; i < billboardCount; ++i) {
bs.emplace_back(i);
}
// Build the map. We use 100 hash buckets because that's in the
// boost::intrusive documentation.
bucket_type buckets[100];
Billboards bmap(bucket_traits(buckets, 100));
for (auto& b : bs) {
bmap.insert(b);
}
// Access a random billboard and write a position to it. Do this
// iterationCount times.
std::minstd_rand rand;
// Seed value shouldn't matter for this, I hope.
rand.seed(0);
std::hash<unsigned int> hasher;
std::chrono::high_resolution_clock clock;
auto start = clock.now();
for (i = 0; i < iterationCount; ++i) {
auto it = bmap.find(rand() % billboardCount, hasher, Billboard::Finder());
if (it != bmap.end()) {
it->x = static_cast<float>(rand());
it->y = static_cast<float>(i);
}
}
auto end = clock.now();
float total = std::accumulate(bmap.begin(), bmap.end(), 0.0f,
[](float acc, Billboard& b) { acc += b.x; acc += b.y; return acc; });
std::cerr << total << std::endl;
std::cerr << "(map) " << iterationCount << " iterations: ";
std::cerr << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << " us";
std::cerr << std::endl;
// ------------------------------------------------------------------
// Do the same pick-one-and-write-to-it thing, but this time skip the map.
start = clock.now();
for (i = 0; i < iterationCount; ++i) {
auto& billboard = bs[rand() % billboardCount];
billboard.x = static_cast<float>(rand());
billboard.y = static_cast<float>(i);
}
end = clock.now();
total = std::accumulate(bmap.begin(), bmap.end(), 0.0f,
[](float acc, Billboard& b) { acc += b.x; acc += b.y; return acc; });
std::cerr << total << std::endl;
std::cerr << "(vector) " << iterationCount << " iterations: ";
std::cerr << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() << " us";
std::cerr << std::endl;
return 0;
}
@hannahwhy
Copy link
Author

This weird test came about because my game renderer needs a way to return handles to visual objects so that game logic may manipulate them (e.g. change animation). I could use std::shared_ptr/std::weak_ptr too, and I'll check those out in a bit, but I first wanted to see how the opaque-handle thing worked out.

I'm using boost::intrusive just to learn it; I suspect intrusive containers will be helpful for me but have not yet reasoned it out too much.

Turns out that the lookup thing is probably good enough for my needs (I can live with a 2x slowdown), though I don't understand the huge performance difference between -O0 (no optimization) and -Og (optimizations that don't interfere with debugging). objdump time, I guess?

Timings on a Chromebook Pixel, Kubuntu 14.10, g++ Ubuntu 4.9.1-16ubuntu6:

$ g++ -O0 -Wall --std=c++1y b2.cpp -I/usr/include/boost
$ ./a.out 
1.03461e+11
(map) 100000 iterations: 89452 us
9.99388e+10
(vector) 100000 iterations: 3492 us

$ g++ -Og -Wall --std=c++1y b2.cpp -I/usr/include/boost
$ ./a.out 
1.03461e+11
(map) 100000 iterations: 5996 us
9.99388e+10
(vector) 100000 iterations: 2717 us

@hannahwhy
Copy link
Author

clang results, because why not:

$ clang++ --version
Ubuntu clang version 3.5.0-4ubuntu2 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
Target: x86_64-pc-linux-gnu
Thread model: posix

# clang++ has no -Og

$ clang++ -O0 -Wall b2.cpp --std=c++1y -I/usr/include/boost
$ ./a.out 
1.03461e+11
(map) 100000 iterations: 68546 us
9.99388e+10
(vector) 100000 iterations: 7138 us
$ clang++ -O1 -Wall b2.cpp --std=c++1y -I/usr/include/boost
$ ./a.out 
1.03461e+11
(map) 100000 iterations: 47350 us
9.99388e+10
(vector) 100000 iterations: 2981 us
$ clang++ -O2 -Wall b2.cpp --std=c++1y -I/usr/include/boost
$ ./a.out 
1.03461e+11
(map) 100000 iterations: 4277 us
9.99388e+10
(vector) 100000 iterations: 3828 us

@hannahwhy
Copy link
Author

So if you throw 10000 elements into a hash table with 100 buckets, performance goes to hell at all optimization levels. This makes sense, and makes boost::intrusive::unordered_set slightly less attractive of an option because if I use it I'll have to handle rehashing myself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment