Suppose that in your C++ code base you have a legacy C-interface function:
// Takes a C-style NUL-terminated string pointer as input
void DoSomethingLegacy(const char* s)
{
    // Do something ...
    printf("DoSomethingLegacy: %s\n", s);
}
The above function is called from a C++ function/method, for example:
void DoSomethingCpp(std::string const& s)
{
    // Invoke the legacy C function
    DoSomethingLegacy(s.data());
}
The calling code looks like this:
std::string s = "Connie is learning C++";
// Extract the "Connie" substring
std::string s1{ s.c_str(), 6 };
DoSomethingCpp(s1);
The string that is printed out is “Connie”, as expected.
Then someone who knows about std::string_view, a feature introduced in C++17, “modernizes” the above code by replacing std::string with std::string_view:
// Pass std::string_view instead of std::string const&
void DoSomethingCpp(std::string_view sv)
{
    DoSomethingLegacy(sv.data());
}
The calling code is modified as well:
std::string s = "Connie is learning C++";
// Use string_view instead of string:
//
// std::string s1{ s.c_str(), 6 };
//
std::string_view sv{ s.c_str(), 6 };
DoSomethingCpp(sv);
The code is recompiled and executed. Unfortunately, the output has changed! Instead of the expected “Connie” substring, the entire string is now printed out:
Connie is learning C++
What’s going on here? Where does that “magic string” come from?
Analysis of the Bug
Well, the key to figuring out this bug is understanding that std::string_views are not necessarily NUL-terminated. On the other hand, the legacy C-interface function expects a C-style NUL-terminated string as input (passed via const char*).
In the initial code, a std::string object was created to store the “Connie” substring:
// Extract the "Connie" substring
std::string s1{ s.c_str(), 6 };
This string object was then passed via const& to the DoSomethingCpp function, which in turn invoked the string::data method, and passed the returned C-style string pointer to the DoSomethingLegacy C-interface function.
Since C++11, strings managed via std::string objects are guaranteed to be NUL-terminated, so the string::data method returned a pointer to a NUL-terminated contiguous sequence of characters, which was what the DoSomethingLegacy function expected. Everyone’s happy.
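As a minimal sketch of that guarantee, the check below looks at the character right after the last one in the std::string copy. The names EndsWithNul and CheckSubstringCopy are hypothetical helpers introduced only for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper (not part of the original code): reports whether the
// character right after the last one is the NUL terminator.
bool EndsWithNul(std::string const& s)
{
    return s.data()[s.size()] == '\0';
}

// Illustrative check mirroring the original calling code.
void CheckSubstringCopy()
{
    std::string s = "Connie is learning C++";
    std::string s1{ s.c_str(), 6 };          // independent copy holding "Connie"

    assert(EndsWithNul(s1));                 // guaranteed since C++11
    assert(std::strlen(s1.data()) == 6);     // a C function sees only "Connie"
}
```

Because s1 owns its own buffer, the terminator sits right after the sixth character, no matter what follows “Connie” in the original string s.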
On the other hand, when std::string is replaced with std::string_view in the calling code:
// Use string_view instead of string:
//
// std::string s1{ s.c_str(), 6 };
//
std::string_view sv{ s.c_str(), 6 };
DoSomethingCpp(sv);
you lose the guarantee that the sub-string is NUL-terminated!
In fact, this time when sv.data is invoked inside DoSomethingCpp, the returned pointer points into the buffer of the original string s, which holds the whole string “Connie is learning C++”. The view’s length (6) is not passed along with the const char*, and there is no NUL terminator after “Connie” in that buffer. So the legacy C function keeps reading past “Connie” and prints the whole string, stopping only at the NUL terminator that follows the last character of “Connie is learning C++”.
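The mismatch can be made visible with a small sketch that compares the view’s own size with the length a C function observes through the raw pointer. LengthSeenByC and CheckSubstringView are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper (not in the original code): the length a C function
// would observe through the raw pointer. It is only safe to call when a NUL
// byte exists somewhere in the viewed buffer, as is the case below.
std::size_t LengthSeenByC(std::string_view sv)
{
    return std::strlen(sv.data());
}

// Illustrative check mirroring the modified calling code.
void CheckSubstringView()
{
    std::string s = "Connie is learning C++";
    std::string_view sv{ s.c_str(), 6 };

    assert(sv.size() == 6);                    // the view itself covers "Connie"
    assert(sv.data() == s.c_str());            // but it points into s's buffer
    assert(LengthSeenByC(sv) == s.size());     // C code runs to the end of s (22 chars)
}
```

Here the scan happens to stop at the terminator of s. If the view referred to a buffer with no NUL terminator at all, reading through the raw pointer would be undefined behavior rather than just a longer output.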

So, be careful when replacing std::string const& parameters with string_views! Don’t forget that string_views are not guaranteed to be NUL-terminated! That is very important when writing or maintaining C++ code that interoperates with legacy C or C-style code.
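One possible way to keep the string_view parameter and still hand a NUL-terminated string to the legacy function is to copy the viewed characters into a temporary std::string at the C boundary. This is only a sketch: ToTerminated is a hypothetical helper name, and the copy costs an allocation that the string_view was probably meant to avoid.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <string_view>

// Legacy function from the example above.
void DoSomethingLegacy(const char* s)
{
    std::printf("DoSomethingLegacy: %s\n", s);
}

// Hypothetical helper: copies exactly the viewed characters into an owning
// std::string, which adds the NUL terminator the view does not guarantee.
std::string ToTerminated(std::string_view sv)
{
    std::string copy(sv);
    assert(copy.size() == sv.size());          // same characters, same count
    assert(copy.c_str()[copy.size()] == '\0'); // terminator is now present
    return copy;
}

void DoSomethingCpp(std::string_view sv)
{
    // The temporary std::string lives until the end of the full expression,
    // so the pointer stays valid for the duration of the call.
    DoSomethingLegacy(ToTerminated(sv).c_str());
}
```

With this version, calling DoSomethingCpp on a view of the first six characters of “Connie is learning C++” passes only “Connie” to the legacy function.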
If you want a C string pointer from a std::string, it’s better to call c_str, not data. Not only does the name directly reflect the intention, but since string_view has no c_str method you’ll get a compile error if you do something like the above example, replacing strings with string_views.
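The compile-time difference mentioned here can be checked with a small C++17 detection sketch. HasCStr is a hypothetical trait written for this illustration; it is not a standard library facility.

```cpp
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when a const T has a callable c_str() member.
template <typename T, typename = void>
struct HasCStr : std::false_type {};

template <typename T>
struct HasCStr<T, std::void_t<decltype(std::declval<T const&>().c_str())>>
    : std::true_type {};

// std::string offers c_str(); std::string_view does not, so code that calls
// sv.c_str() on a string_view fails to compile instead of misbehaving at run time.
static_assert(HasCStr<std::string>::value, "std::string provides c_str()");
static_assert(!HasCStr<std::string_view>::value, "std::string_view has no c_str()");
```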
This seems a good suggestion, although std::string::data is guaranteed to return a null-terminated string.
From CppReference:
“The returned array is null-terminated, that is, data() and c_str() perform the same function.”
https://en.cppreference.com/w/cpp/string/basic_string/data
In any case, the problem still exists (and causes nasty bugs) when people pass around string_views and somewhere in the call stack they assume that the string_views are null-terminated, like when they passed std::string const&.