vector – Giovanni Dicanio's Blog

How to Access std::vector’s Internal Array

Suppose that you have some data stored in a std::vector, and you need to pass it to a function that takes a pointer to the beginning of the data, and in addition the data size or element count.

Something like this:

// Input data to process
std::vector<int> myData = { 11, 22, 33 };

//
// Do some processing on the above data
//
DoSomething(
    ??? ,         // beginning of data 
    myData.size() // element count
);

You may think of using the address-of operator (&) to get a pointer to the beginning of the data, like this:

// Note: *** Wrong code ***
DoSomething(&myData, myData.size());

But the above code is wrong. If you apply the address-of operator (&) to a std::vector instance, you get the address of the vector’s “control block”, that is, the block that contains the three pointers first, last and end, according to the model discussed in a previous blog post:

[Figure: Taking the address of a std::vector (&v) points to its control block]

Luckily, if you try the above code, it will fail to compile, with a compiler error message like this one produced by the Visual C++ compiler in VS 2019:

Error C2664: 'void DoSomething(const int *,size_t)': 
cannot convert argument 1 
from 'std::vector<int,std::allocator<int>> *' to 'const int *'
Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast

What you really want here is the address of the vector’s elements stored in contiguous memory locations, and pointed to by the vector’s control block.

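To get that address, you can invoke the address-of operator on the first element of the vector (which is the element at index 0): &myData[0]. Note that this requires a non-empty vector, since accessing element 0 of an empty vector is undefined behavior.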

// This code works
DoSomething(&myData[0], myData.size());

As an alternative, you can invoke the std::vector::data method:

DoSomething(myData.data(), myData.size());
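
To see both forms side by side, here is a minimal, self-contained sketch. The body of DoSomething is an assumption made for illustration: the post only shows its parameter list, so this version simply sums the elements and returns the result (rather than returning void as in the compiler error further up). CheckBothApproaches is a hypothetical name used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical body for DoSomething: the post only shows its parameters,
// so summing the elements is an assumption made for illustration.
long long DoSomething(const int* data, std::size_t count)
{
    long long sum = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        sum += data[i];
    }
    return sum;
}

// Pass the vector's internal array using both approaches.
void CheckBothApproaches()
{
    std::vector<int> myData = { 11, 22, 33 };

    // Address of the first element (valid only for non-empty vectors)
    const long long viaIndex = DoSomething(&myData[0], myData.size());

    // std::vector::data()
    const long long viaData = DoSomething(myData.data(), myData.size());

    // Both expressions point to the same contiguous elements
    assert(&myData[0] == myData.data());
    assert(viaIndex == 66 && viaData == 66);
}
```

Calling CheckBothApproaches() from main is enough to run the checks.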

Now, there’s a note I’d like to point out for the case of empty vectors:

According to the documentation on CppReference:

If size() is 0, data() may or may not return a null pointer.
CppReference.com

I would have preferred a well-defined behavior such that, when size is 0 (i.e. the vector is empty), data() must return a null pointer (nullptr). This is the good behavior that is implemented in the C++ Standard Library that comes with VS 2019. I believe the C++ Standard should be fixed to adhere to this intelligent behavior.
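
In the meantime, code that relies on getting a null pointer for empty vectors can normalize the result itself. The following is a small sketch of that idea; DataOrNull and CheckDataOrNull are hypothetical names, not part of the C++ Standard Library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns nullptr for an empty vector,
// otherwise the pointer returned by data().
template <typename T, typename Alloc>
const T* DataOrNull(const std::vector<T, Alloc>& v)
{
    return v.empty() ? nullptr : v.data();
}

void CheckDataOrNull()
{
    std::vector<int> empty;
    std::vector<int> three = { 11, 22, 33 };

    // Empty vector: always nullptr, whatever data() itself returns
    assert(DataOrNull(empty) == nullptr);

    // Non-empty vector: same pointer as data(), pointing to element 0
    assert(DataOrNull(three) == three.data());
    assert(*DataOrNull(three) == 11);
}
```

With a helper like this, the callee can treat a null pointer as "no data" on any Standard Library implementation.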

How to Get the “Size” of a std::vector?

Someone was modernizing some legacy C++ code. They had an array defined like this:

int v[100];

and they needed the size, in bytes, of that array, to pass that value to some function. They used sizeof(v) to get it.

When modernizing their code, they chose to use std::vector instead of the above raw C-style array. And they still used sizeof(v) to retrieve the size of the vector. For example:

// Create a vector storing 100 integers
std::vector<int> v(100);

std::cout << sizeof(v);

The output they got when building their code in release mode with Visual Studio 2019 was 24. They also noted that they always got the same output, 24, regardless of the number of elements in the std::vector!

This is clearly a bug. Let’s try to shed some light on it and show the proper way of getting the total size, in bytes, of the elements stored in a std::vector.

First, to understand this bug, you need to know how a std::vector is implemented. Basically, at least in Microsoft’s STL implementation (in release builds, and when using the default allocator¹), a std::vector is made of three pointers, kind of like this:

template <typename T>
class vector
{
    T* first;
    T* last;
    T* end;

    ...
};

[Figure: Typical implementation of std::vector with three pointers: first, last, end. The valid elements are those between first (inclusive) and last (exclusive).]

first: points to the beginning of the contiguous memory block that stores the vector’s elements
last: points one past the last valid element stored in the vector
end: points one past the end of the allocated memory for the vector’s elements
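
As a rough illustration of how the three pointers relate to each other, both the element count and the capacity can be derived by pointer subtraction. ToyVectorView is a hypothetical, simplified model written for this sketch; it is not the actual Microsoft STL code, and it merely observes a plain array instead of owning memory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of the three-pointer layout.
template <typename T>
struct ToyVectorView
{
    T* first; // beginning of the memory block
    T* last;  // one past the last valid element
    T* end;   // one past the end of the allocated block

    // Number of valid elements
    std::size_t size() const { return static_cast<std::size_t>(last - first); }

    // Number of elements the block has room for
    std::size_t capacity() const { return static_cast<std::size_t>(end - first); }
};

void CheckToyVectorView()
{
    int storage[8] = { 11, 22, 33 }; // room for 8 ints, 3 in use
    ToyVectorView<int> v{ storage, storage + 3, storage + 8 };

    assert(v.size() == 3);
    assert(v.capacity() == 8);

    // The object itself is just three pointers, whatever it points to
    assert(sizeof(v) == 3 * sizeof(int*));
}
```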

Spelunking inside the Microsoft STL implementation, you’ll see that the “real” names for these pointers are _Myfirst, _Mylast, _Myend, as shown for example in this part of the <vector> header:

[Figure: An excerpt of the <vector> header that comes with Microsoft Visual Studio 2019, showing the identifiers of the vector’s three internal pointers]

So, when you use sizeof with a std::vector instance, you are actually getting the size of the internal representation of the vector. In this case, you have three pointers. In 64-bit builds, each pointer occupies 8 bytes, so you have a total of 3*8 = 24 bytes, which is the number that sizeof returned in the above example.
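
A quick way to convince yourself: sizeof is computed at compile time from the type alone, so it cannot depend on how many elements a particular vector holds at run time. The sketch below checks exactly that. The concrete value (24 in the scenario above) depends on the library, the build mode and the platform, so the code deliberately does not hard-code it; CheckSizeofIsConstant is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

void CheckSizeofIsConstant()
{
    std::vector<int> small(1);
    std::vector<int> large(10000);

    // Same type, same sizeof, regardless of the element count
    assert(sizeof(small) == sizeof(large));
    assert(sizeof(small) == sizeof(std::vector<int>));

    // The element count is what actually differs
    assert(small.size() == 1);
    assert(large.size() == 10000);
}
```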

As you can see, this number is independent of the actual number of elements stored in the vector. Whether the vector has one, three, ten or 10,000 elements, the size of the vector’s internal representation, made of those three pointers, is always fixed and given by the above number (at least in Microsoft’s STL implementation that comes with Visual Studio²).

Now that the bug has been analyzed and the mystery explained, let’s see how to fix it.

Well, to get the “size of a vector”, considered as the number of bytes occupied by the elements stored in the vector, you can get the number of elements stored in the vector (returned by the vector::size method), and multiply that by the (fixed) size of each element, e.g.:

//
// v is a std::vector<int>
//
// v.size()    : number of elements in the vector
// sizeof(int) : size, in bytes, of each element
//
size_t sizeOfAllVectorElementsInBytes = v.size() * sizeof(int);

To write more generic code, you can replace sizeof(int) with sizeof(v[0]), which is the size of the vector’s element type. Since the operand of sizeof is not evaluated, v[0] is never actually executed, so this expression is safe even when the vector is empty; only its type matters.

In addition, you could use the vector::value_type member type to get the size of a single element (which also works for empty vectors). For example:

using IntVector = std::vector<int>;

IntVector v(100);

// Print the number of bytes occupied 
// by the (valid) elements stored in the vector:
std::cout << v.size() * sizeof(IntVector::value_type);

To be even more generic, a helper template function like this could be used:

//
// Return the size, in bytes, of all the valid elements
// stored in the input vector
//
template <typename T, typename Alloc>
inline size_t SizeOfVector(const std::vector<T, Alloc> & v)
{
    // 'typename' is required here because value_type is a dependent name
    return v.size() * sizeof(typename std::vector<T, Alloc>::value_type);
}


//
// Sample usage:
//
std::vector<int> v(100);
std::cout << SizeOfVector(v) << '\n'; 
// Prints 400, i.e. 100 * sizeof(int)
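
For completeness, here is a self-contained variant of the helper with a few checks, including an empty vector and a non-int element type. This sketch uses sizeof(T) directly, which is equivalent because value_type is T for std::vector. CheckSizeOfVector is a hypothetical name used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size, in bytes, of all the valid elements stored in the vector
template <typename T, typename Alloc>
std::size_t SizeOfVector(const std::vector<T, Alloc>& v)
{
    return v.size() * sizeof(T);
}

void CheckSizeOfVector()
{
    std::vector<int> ints(100);
    std::vector<double> doubles(10);
    std::vector<int> empty;

    assert(SizeOfVector(ints) == 100 * sizeof(int));
    assert(SizeOfVector(doubles) == 10 * sizeof(double));

    // An empty vector stores no valid elements, so the result is 0
    assert(SizeOfVector(empty) == 0);
}
```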

Bonus Reading

If you want to learn more about the internal implementation of std::vector (including how they represent the default allocator with an “empty class” using the compressed pair trick), you can read these two interesting blog posts on The Old New Thing blog:

¹ In debug builds, or when using a custom allocator, there can be more stuff added to the above simple std::vector representation.
² Of course, that number can change in debug builds, or when using custom allocators, as per the previous note.
