- create an empty vector,
push_back()
elements into it. - create an empty vector, allocate some capacity with
reserve()
, thenpush_back()
elements into it. - create a vector of
n
elements, use indexing and copy-assignment. - create an empty vector,
emplace_back()
elements into it. - create an empty vector, allocate some capacity with
reserve()
, thenemplace_back()
elements into it.
There are other ways, e.g. creating the container with a pair of iterators, or filling it via standard library algorithms. I won't consider these here.
The following two considerations are important for the following analysis:- Avoid (re)allocations: Memory allocation is slow. reallocation often involves copying everything that's already in the container to the new location.
- Avoid unnecessary work: It's better to construct the element you want than to default-construct, then assign. It's better to construct an element where you want it than to construct it elsewhere, then copy.
Other factors will also affect the efficiency of the chosen solution, but these are significant factors that we have direct control over. Other factors may become obvious through profiling your code.
Each `push_back()` copy-constructs an element in the vector from the argument to the `push_back()` call. If the vector `size() == capacity()`, a reallocation will be performed. This will usually (but may not always) double the capacity, and may result in copying _all_ of the existing elements into new storage. Using `reserve()` allocates enough memory for the elements before we start. It's always worth doing this if you know (or have a reasonable guess for) the number of elements. If you're guessing, over-estimates are better than under-estimates.The push_back()
call will still use the copy constructor of the element type, but there shouldn't be any allocations, because the space is already provided. You just pay the cost of a single allocation during the reserve()
call. If you do run over the existing capacity, push_back()
will reallocate, often doubling the capacity. This is why an over-estimate for the size is better; you're less likely to get a reallocation, whereas with an under-estimate, not only will you likely reallocate, but you'll waste memory allocating almost twice as much as you needed!
Note that the "doubling capacity" behaviour is not specified by the standard, but it is a common implementation, designed to reduce the reallocation frequency when using push_back()
for data sets of unknown size.
In this example, I fill a vector with 10,000 strings using each of the approaches listed above. I time each one and print the results.
This is similar to your question, but not identical; your results will differ!
Note that the emplace_back()
with reserve()
is the fastest, but the indexing & assignment is also quick here. This is likely because std::string
has an efficient swap()
, and its default constructor doesn't do much. The other approaches are an order of magnitude slower.
#include <chrono>
#include <iostream>
#include <vector>
using Clock = std::chrono::high_resolution_clock;
using time_point = std::chrono::time_point<Clock>;
std::vector<std::string> strings = {"one", "two", "three", "four", "five"};
std::chrono::duration<double> vector_push_back(const size_t n) {
time_point start, end;
start = Clock::now();
std::vector<std::string> v;
for (size_t i = 0; i < n; ++i) {
v.push_back(strings[i % strings.size()]);
}
end = Clock::now();
return end - start;
}
std::chrono::duration<double> vector_push_back_with_reserve(const size_t n) {
time_point start, end;
start = Clock::now();
std::vector<std::string> v;
v.reserve(n);
for (size_t i = 0; i < n; ++i) {
v.push_back(strings[i % strings.size()]);
}
end = Clock::now();
return end - start;
}
std::chrono::duration<double> vector_element_assignment(const size_t n) {
time_point start, end;
start = Clock::now();
std::vector<std::string> v(n);
for (size_t i = 0; i < n; ++i) {
v[i] = strings[i % strings.size()];
}
end = Clock::now();
return end - start;
}
std::chrono::duration<double> vector_emplace_back(const size_t n) {
time_point start, end;
start = Clock::now();
std::vector<std::string> v;
for (size_t i = 0; i < n; ++i) {
v.emplace_back(strings[i % strings.size()]);
}
end = Clock::now();
return end - start;
}
std::chrono::duration<double> vector_emplace_back_with_reserve(const size_t n) {
time_point start, end;
start = Clock::now();
std::vector<std::string> v;
v.reserve(n);
for (size_t i = 0; i < n; ++i) {
v.emplace_back(strings[i % strings.size()]);
}
end = Clock::now();
return end - start;
}
int main() {
const size_t n = 10000;
std::cout << "vector push_back: " << vector_push_back(n).count() << "\n";
std::cout << "vector push_back with reserve: " << vector_push_back(n).count() << "\n";
std::cout << "vector element assignment: " << vector_element_assignment(n).count() << "\n";
std::cout << "vector emplace_back: " << vector_emplace_back(n).count() << "\n";
std::cout << "vector emplace_back with reserve: " << vector_emplace_back_with_reserve(n).count() << "\n";
}
Results:
vector push_back: 0.00205563
vector push_back with reserve: 0.00152464
vector element assignment: 0.000610934
vector emplace_back: 0.00125141
vector emplace_back with reserve: 0.000545451
For most new code, using `reserve()` and `emplace_back()` (or `push_back()` for older code) should give you a good first-approximation for efficiency. If it isn't enough, profile it and find out where the bottleneck is. It probably won't be where you think it is.