@engie
Created September 25, 2013 12:17
Notes on pooling shared ptr's
//Storing shared pointers in boost pool allocators
//tl;dr It just works (tm)
//
//* If a lot of smallish objects are being allocated & freed, a pool allocator
//can speed things up by specialising in allocating exactly that size of
//object.
//
//* If pointer lifetimes need to be managed with reference counting,
//std::shared_ptr can wrap an object to provide safe reference counting.
//
//* std::make_shared can reduce the cost of creating a shared pointer by doing
//a single allocation for the reference counting infrastructure and data.
//
//* Boost pool allocators can be used with shared pointers by passing them into
//std::allocate_shared - see below for more details!
#include <iostream>
#include <memory>
#include <boost/pool/pool_alloc.hpp>
int main(int argc, char** argv)
{
    {
        //Allocations monitored using https://github.com/samsk/log-malloc2

        //This does a single 4 byte allocation
        int* a = new int(5);

        //This allocates 4 bytes (for the data), then 24 bytes for the
        //shared_ptr counter
        auto b = std::shared_ptr<int>(new int(5));

        //This does a single 32 byte allocation for the shared_ptr counter and
        //the data
        auto c = std::make_shared<int>(5);

        //Creating a pool_allocator object doesn't actually do an allocation
        //
        //Really this acts as a reference to a set of singleton allocators
        boost::pool_allocator<int> alloc;

        //This uses malloc to get 272 bytes - 32 chunks of 8 bytes + a header
        //Allocations are 8 bytes, not 4, because while a chunk is waiting
        //to be issued its space is used to hold a pointer (and I'm doing this
        //on a 64 bit machine). The pointers form a linked list of all the
        //free chunks waiting to be issued
        int* d = alloc.allocate(1);

        //This doesn't result in any additional calls to malloc
        for(size_t i=0;i<31;i++)
            d = alloc.allocate(1);

        //This causes the pool allocator to ask malloc for an extra 528 bytes.
        //That's enough to issue another 64 allocations. The pool allocator
        //will keep asking for twice as much memory from malloc until it is
        //able to satisfy all of the requests from its internal store
        d = alloc.allocate(1);

        //std::allocate_shared does a single allocation for data & reference
        //counting information, just like make_shared. The difference is that
        //allocate_shared can use an allocator other than malloc.
        //
        //A pool allocator specialised to dole out 4 byte chunks of memory
        //will not be able to satisfy a request from allocate_shared for 32
        //bytes.
        //
        //When allocate_shared is given the pool allocator it rebinds it
        //(std::allocator_traits<...>::rebind_alloc) to get a *new* allocator
        //that issues 32 bytes at a time. The pool behind this new allocator
        //is a singleton, so it is reused.
        //
        //This call causes a pool_allocator to be created that gets 1040 bytes
        //from malloc - enough for 32 shared_ptrs to integers including the
        //integer.
        auto p = std::allocate_shared<int, boost::pool_allocator<int>>(alloc, 5);

        //This doesn't cause any more memory to be retrieved from malloc
        std::shared_ptr<int> ptrs[31];
        for(size_t i=0;i<31;i++)
        {
            ptrs[i] = std::allocate_shared<int, boost::pool_allocator<int>>(alloc, 5);
        }
    }
}
@sketch34

I was just surprised to find that the pooled allocator version is slower than the default. This is with MSVC / C++23 / Release build using Catch2 benchmarks.

Any thoughts on why? My benchmark might be too naive, and perhaps std::make_shared is doing some small-alloc optimisation under the hood that is better specialized to the use-case.

  BENCHMARK("shared_ptr default allocator") {
    auto p = std::make_shared<char>();
  };

  using char_allocator = boost::pool_allocator<char>;
  char_allocator alloc;
  BENCHMARK("shared_ptr pooled allocator") {
    auto p = std::allocate_shared<char, char_allocator>(alloc);
  };
benchmark name                       samples       iterations    estimated
                                     mean          low mean      high mean
                                     std dev       low std dev   high std dev
-------------------------------------------------------------------------------
shared_ptr default allocator                   100           400        1.4 ms 
                                         37.895 ns    36.2275 ns      41.46 ns 
                                        11.8993 ns    6.46899 ns    19.3755 ns 

shared_ptr pooled allocator                    100           214     1.4124 ms 
                                        67.0748 ns    66.5748 ns    67.8645 ns 
                                        3.13541 ns    2.19626 ns    4.25377 ns 
