C++ Programming – Giovanni Dicanio's Blog

The IsoCpp.org Process for Suggesting Articles Is Broken and Should Be Fixed

The process of submitting article suggestions to IsoCpp.org can be kind of “frustrating”, with inconsistencies in acceptance timing and a lack of communication. Making suggestions requires some effort, yet the outcomes feel random. I propose some improvements.

I have suggested several articles to the IsoCpp.org Web site. Some suggestions were published just a few hours after I sent them; others after a day or two, others after a week or two, while others seemed to get lost in a “black hole”, handled by anonymous people. Who processed those suggestions? Why were they rejected?

This process seems kind of random and unprofessional, and disrespectful of the time we put into suggesting articles.

In fact, for suggesting an article, it’s not sufficient to copy-and-paste a link to the content and just click a “Suggest” button. You have to prepare a little document, following a pattern and some editorial guides made available from the IsoCpp Web site. It does take some time.

Then, you click the button to make the suggestion… and it’s like a random coin toss! Will the suggestion be accepted? Will the suggestion be discarded? When? Why? By whom?

The process is clearly broken, and should be fixed, out of respect for the time of the people who made a suggestion, and for what should be a quality Web site that lists links to relevant content.

A possible fix to the process could be like this:

Once you make a suggestion, an email is sent to you, saying that the IsoCpp editorial team has received the suggestion, and will reply in a maximum given period of time: one week, two weeks, whatever. But do give a time limit, and don’t just disappear! I think a 15-day time limit for a reply would be acceptable.

Then, do send a reply to the person suggesting the article, be it positive or negative. But do send a reply! If the suggestion is accepted, say thank you and give a link to the Web page containing the suggestion.

On the other hand, if the suggestion is not accepted, say thank you again, and do give a reason for the refusal. And also give the person suggesting the article an option to further discuss that via email with the editor who refused the suggestion, with the option to discuss that with other editors, too.

Moreover, once a person has a certain number of approved suggestions, let the system automatically approve their suggestions by default. This “privilege level” could be revoked if a certain number of unworthy suggestions or suggestions not relevant for the IsoCpp topics are made.

How to Set the C++ Language Standard Version in VS Code

Manually editing the tasks.json to add the desired C++ compiler option.

So, after the previous discussion on that confusing UI design choice, how can you set the C++ language standard version for building your C++ code in VS Code with the MS C/C++ Extension?

One option is to open the tasks.json file, and edit it to add the desired compiler option. In particular, to enable C++20 compilation mode, the option for the MSVC compiler is /std:c++20. So, add this option as the string "/std:c++20" in the "args" property array in tasks.json:

[Figure: Editing the tasks.json file to specify the C++ language standard version as a command line option in the "args" array]

I still think that this modification to tasks.json should have been automatically done by the C/C++ Configurations UI, once the C++20 language standard version is set in there.

VS Code with MS C/C++ Extension: A Confusing UI Design Choice

In VS Code, selecting the C++ language standard is not as intuitive as one would expect.

I have been using Visual Studio for C++ development since it was still called Visual C++ (and was a 100% C++-focused IDE), starting from version 4 (maybe 4.2) on Windows 95. I loved VC++ 6. Even today, Microsoft Visual Studio is still my first choice for C++ development on Windows.

In addition to that, I wanted to use VS Code for C++ development for some course work. Why choose VS Code? Well, in addition to being free to use (as is the Visual Studio Community Edition), another important point in favor of VS Code in that teaching context is that it is cross-platform: it’s available not only for Windows, but also for Linux and Mac, so students using those platforms could easily follow along.

I had VS Code and the MS C/C++ extension already installed on one of my PCs. I wrote some C++ demo code that used some C++20 features. I tried to build that code, and I got some error messages, telling me that I was using features that required at least C++20. Fine, I thought: Maybe the default C++ standard is set to something pre-C++20 (for example, VS 2019 defaults to C++14).

So, I pressed Ctrl+Shift+P, selected C/C++ Edit Configurations (UI), and in the C/C++ Configurations page, selected c++20 for the C++ standard.

Then I pressed F5 to start a debugging session, preceded by a build process, and saw that the build process failed.

I took a look at the error messages in the terminal window, and to my surprise they were telling me that some library features (like std::span from the <span> header) were available only with C++20 or later. But I had just selected the C++20 standard a few minutes earlier!

So, I double-checked, pressing Ctrl+Shift+P and selecting C/C++ Edit Configurations (UI), and in the C/C++ Configurations, the selected C++ standard was c++20, as expected.

[Figure: C++20 selected in the MS C/C++ Extension Configurations UI]

I also took a look at the c_cpp_properties.json, and found that the “cppStandard” property was properly set to “c++20”, as well.

[Figure: C++20 selected in the c_cpp_properties.json file]

Despite these confirmations in the UI, I noted that in the terminal window, on the command line used to build the C++ source code, the option to set the C++20 compilation mode was not passed to the C++ compiler!

[Figure: Surprisingly, the option for the C++20 language standard previously set in the UI was not passed on the command line]

So, basically, the UI was telling me that the C++20 mode was enabled. But the C++ compiler was invoked in a way that did not reflect that, as the flag enabling C++20 was not specified on the command line!

I also tried to close and reopen VS Code, double-checked things one more time, but the results were always the same: C++20 was set in the C/C++ Configurations UI and in the c_cpp_properties.json file, but compilation failed due to the C++20 option not specified on the command line when invoking the C++ compiler.

I thought that this was a bug, and opened an issue on the MS C/C++ Extension GitHub page.

After some time, to my surprise, I noted that the issue was closed as “by design”! Seriously? I mean, what kind of good reasonable intuitive design is the one in which the UI tells you that you have selected a given C++ language standard, but the command line doesn’t compile your code according to that??

This is the comment associated with the closing of the issue:

This is “by design”. The settings in c_cpp_properties.json do not affect the build. You need to set the flags in your tasks.json or other source of build info (CMakeLists.txt etc.).

So, am I supposed to manually set the C++20 flag in the tasks.json, despite having already set it in the C/C++ Configurations UI? Well, I do think that is either a bug, or a bad and confusing design choice. If I set the C++20 option in the UI, that should be automatically reflected on the command line, as well. If a modification is required to tasks.json to enable C++20, that should have been the job of the UI, in which I had already selected the C++20 standard!

Compare that to the sane intuitive behavior of Visual Studio, in which you can simply set the C++ standard option in the UI, and the IDE will invoke the C++ compiler with the proper flags, reflecting that.

Selecting the C++ Language Standard in Visual Studio 2019

C++ Myth-Buster: UTF-8 Is a Simple Drop-in Replacement for ASCII char-based Strings in Existing Code

Let’s bust a myth that is a source of many subtle bugs. Are you sure that you can simply drop UTF-8-encoded text in char-based strings that expect ASCII text, and your C++ code will still work fine?

Several (many?) C++ programmers think that we should use UTF-8 everywhere as the Unicode encoding in our C++ code, stating that UTF-8 is a simple easy drop-in replacement for existing code that uses ASCII char-based strings, like const char* or std::string variables and parameters.

Of course, that UTF-8-simple-drop-in-replacement-for-ASCII thing is wrong and just a myth!

In fact, suppose that you wrote a C++ function whose purpose is to convert a std::string to lower case. For example:

// Code proposed by CppReference:
// https://en.cppreference.com/w/cpp/string/byte/tolower
//
// This code is basically the same found on StackOverflow here:
// https://stackoverflow.com/q/313970
// https://stackoverflow.com/a/313990 (<-- most voted answer)

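#include <algorithm> // std::transform
#include <cctype>    // std::tolower
#include <string>    // std::string

std::string str_tolower(std::string s)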
{
    std::transform(s.begin(), s.end(), s.begin(),
        // wrong code ...
        // <omitted>
 
        [](unsigned char c){ return std::tolower(c); } // correct
    );
    return s;
}

Well, that function works correctly for pure ASCII characters. But as soon as you try to pass it a UTF-8-encoded string, that code will not work correctly anymore! That was already discussed in my previous blog post, and also in this post on The Old New Thing blog.
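To make the failure concrete, here is a minimal sketch of my own (not taken from the linked posts). It assumes the default "C" locale and hard-codes the UTF-8 bytes for È (0xC3 0x88) and è (0xC3 0xA8); the DemoToLower helper is a hypothetical name used only for this illustration. In other locales the non-ASCII bytes may get mangled instead of being left untouched, which is arguably even worse.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Same byte-wise lower-case conversion discussed above.
std::string str_tolower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical demo helper (assumes the default "C" locale).
void DemoToLower()
{
    // Pure ASCII: works as expected.
    assert(str_tolower("HELLO") == "hello");

    // "CAFFÈ" encoded in UTF-8: È (U+00C8) is the two bytes 0xC3 0x88.
    const std::string upper = "CAFF\xC3\x88";

    // The correct lower-case text would end with è (U+00E8): 0xC3 0xA8.
    const std::string expected = "caff\xC3\xA8";

    // The ASCII letters are converted, but the two bytes of È are
    // processed one at a time and left unchanged: the result still
    // ends with an upper-case È.
    assert(str_tolower(upper) != expected);
    assert(str_tolower(upper) == "caff\xC3\x88");
}
```

Calling DemoToLower() from main is enough to run the checks. The code compiles without a single warning, which is exactly the point: the bug only shows up at run time.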

I’ll give you another simple example. Consider the following C++ function, PrintUnderlined(), that receives a std::string (passed by const&) as input, and prints it with an underline below:

#include <iostream> // std::cout
#include <string>   // std::string

// Print the input text string, with an underline below
void PrintUnderlined(const std::string& text)
{
    std::cout << text << '\n';
    std::cout << std::string(text.length(), '-') << '\n';
}

For example, invoking PrintUnderlined("Hello C++ World!"), you’ll get the following output:

Hello C++ World!
----------------

Well, as you can see, this function works fine with ASCII text. But, what happens if you pass UTF-8-encoded text to it?

It may work as expected in some cases, but not in others. For example, what happens if the input string contains non-ASCII characters, like the LATIN SMALL LETTER E WITH GRAVE è (U+00E8)? In this case, the UTF-8 encoding for “è” takes two bytes: 0xC3 0xA8. So, from the viewpoint of the std::string::length() method, that “single character è” counts as two chars. As a result, you’ll get two dash characters in the underline for the single è, instead of the expected one, and the underline will be longer than the text above it. That’s bogus output from the very same PrintUnderlined function that works correctly for ASCII char-based strings.
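As a rough sketch of what a UTF-8-aware fix might look like, the following code counts code points instead of bytes, by skipping UTF-8 continuation bytes (those of the form 10xxxxxx). The function names are hypothetical, and the code assumes the input is valid UTF-8. Note that even this is only an approximation: the number of code points is not the same thing as the number of columns shown on screen (think of combining characters or wide East Asian characters).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the Unicode code points in a string assumed to contain
// valid UTF-8, by counting every byte that is NOT a continuation byte.
std::size_t CountUtf8CodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8)
    {
        if ((byte & 0xC0) != 0x80) // continuation bytes are 10xxxxxx
        {
            ++count;
        }
    }
    return count;
}

// Builds an underline with one dash per code point.
std::string MakeUnderline(const std::string& utf8Text)
{
    return std::string(CountUtf8CodePoints(utf8Text), '-');
}

// Hypothetical demo: "caffè" with è encoded as the bytes 0xC3 0xA8.
void DemoUnderline()
{
    const std::string text = "caff\xC3\xA8";
    assert(text.length() == 6);               // bytes (chars)
    assert(CountUtf8CodePoints(text) == 5);   // code points
    assert(MakeUnderline(text) == "-----");
}
```

The design point is that the fix required knowing the encoding: the original ASCII-only code had no reason to distinguish between bytes and characters.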

So, if you have some existing C++ code that works with const char* or std::string, or similar char-based string types, and assumes ASCII encoding for text, don’t expect to pass UTF-8-encoded strings and have it just automagically work fine! The existing code may still compile fine, but there is a good chance that you have introduced subtle runtime bugs and logic errors!

Spend some time thinking about the exact type of encoding of the const char* and std::string variables and parameters in your C++ code base: Are they pure ASCII strings? Are these char-based strings encoded in some particular ANSI/Windows code pages? Which code page? Maybe it’s an “ANSI” Windows code page like Latin 1 / Western European Windows-1252 code page? Or some other code page?

You can pack many different kinds of stuff in char-based strings (ASCII text, text encoded in various code pages, etc.), and there is no guarantee that code that used to work fine with that particular encoding would automatically continue to work correctly when you pass UTF-8-encoded text.

If we could start everything from scratch today, using UTF-8 for everything would certainly be an option. But, there is a thing called legacy code. And you cannot simply assume that you can just drop UTF-8-encoded strings in the existing char-based strings in existing legacy C++ code bases, and that everything will magically work fine. It may compile fine, but running fine as expected is another completely different thing.

How to Safely Pass a C++ String View as Input to a C-interface API

Use STL string objects like std::string/std::wstring as a safe bridge.

Last time, we saw that passing a C++ std::[w]string_view to a C-interface API (like Win32 APIs) expecting a C-style null-terminated string pointer can cause subtle bugs, as there is a requirement impedance mismatch. In fact:

The C-interface API (e.g. Win32 SetWindowText) expects a null-terminated string pointer
The STL string views do not guarantee null-termination

So, supposing that you have a C++17 (or newer) code base that heavily uses string views, when you need to interface those with Win32 API function calls, or whatever C-interface API, expecting C-style null-terminated strings, how can you safely pass instances of string views as input parameters?

Invoking the string_view/wstring_view’s data method would be dangerous and a source of subtle bugs, as the pointer returned by data is not guaranteed to point to a null-terminated string.

Instead, you can use a std::string/wstring object as a bridge between the string views and the C-interface API. In fact, the std::string/wstring’s c_str method does guarantee that the returned pointer points to a null-terminated string. So it’s safe to pass the pointer returned by std::[w]string::c_str to a C-interface API function that expects a null-terminated C-style string pointer (like PCWSTR/LPCWSTR parameters in the Win32 realm).

For example:

// sv is a std::wstring_view

// C++ STL strings can be easily initialized from string views
std::wstring str{ sv };

// Pass the intermediate wstring object to a Win32 API,
// or whatever C-interface API expecting 
// a C-style *null-terminated* string pointer.
DoSomething( 
    // PCWSTR/LPCWSTR/const wchar_t* parameter
    str.c_str(), // wstring::c_str

    // Other parameters ...    
);

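// Or use a temporary string object to wrap the string view 
// at the call site. The temporary lives until the end of the full
// expression, so the pointer is valid for the duration of the call
// (but not afterwards, should the API store the pointer):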
DoSomething(
    // PCWSTR/LPCWSTR/const wchar_t* parameter
    std::wstring{ sv }.c_str(),

    // Other parameters ...
);
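The following is a small portable sketch of both the problem and the fix. To keep it runnable outside Windows, it uses std::string_view and a made-up stand-in function, CApiStringLength, in place of a real Win32 API; the names are my assumptions for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C-interface API that expects
// a C-style null-terminated string pointer.
std::size_t CApiStringLength(const char* nullTerminated)
{
    return std::strlen(nullTerminated);
}

void DemoStringViewBridge()
{
    // A view over the first five chars ("Hello") of a longer,
    // null-terminated string literal.
    const std::string_view whole{ "Hello World" };
    const std::string_view sv = whole.substr(0, 5);
    assert(sv.size() == 5);

    // Wrong: data() points into the original literal, so the C API
    // keeps reading past the end of the view, up to the literal's
    // terminator. (With a view over a non-terminated buffer,
    // this would be an out-of-bounds read.)
    assert(CApiStringLength(sv.data()) == 11);

    // Right: the intermediate std::string guarantees null-termination
    // exactly at the end of the viewed text.
    assert(CApiStringLength(std::string{ sv }.c_str()) == 5);
}
```

The price of the bridge is a copy (and possibly a heap allocation) per call, which is usually acceptable at an API boundary.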

Comparing Different Methods for Accessing Raw Character Buffers in Strings vs. String Views

Let’s try to shed some light on the different available options.

In a previous blog post, I discussed the reasons why I passed std::wstring objects by const&, instead of using std::wstring_view. The key point was that those strings were passed as input parameters to C-interface API functions that expected null-terminated C-style strings.

In particular, the standard string classes like std::string and std::wstring offer the c_str method, which returns a pointer to a read-only string buffer that is guaranteed to be null-terminated. This method is only available in the const version; there is no non-const overload of c_str that returns a character buffer with read/write access. If you need read-write access to the internal string character buffer, you need to invoke the data method, which is available in both const and non-const overloaded forms (the non-const overload was added in C++17). The string’s data method guarantees that the returned buffer is null-terminated (since C++11).

On the other hand, the string view’s data method does not offer this null-termination guarantee, and it only gives read-only access to the characters. In addition, there is no c_str method available for string views (which makes sense if you think that c_str implies a null-terminated C-style string pointer, and [w]string_views are not guaranteed to be null-terminated).

The properties discussed in the above paragraph can be summarized in the following comparison table:

Method	[w]string	[w]string_view
c_str (const)	Returns null-terminated read-only string buffer	N/A
c_str (non-const)	N/A	N/A
data (const)	Returns null-terminated read-only string buffer	No guarantee for null-termination
data (non-const)	Returns null-terminated read/write string buffer (since C++17)	N/A (string views give read-only access)

Accessing raw character buffer in C++ standard strings vs. string views
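As an illustration of the “data (non-const)” row for strings, here is a minimal sketch (assuming C++17 or later) in which a C-style function, std::snprintf, writes directly into a std::string’s internal buffer. FormatValue is a hypothetical helper name chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats an integer by letting a C-style API write directly
// into the string's internal character buffer.
std::string FormatValue(int value)
{
    // First call: compute the required length (terminator excluded).
    const int length = std::snprintf(nullptr, 0, "value=%d", value);
    assert(length >= 0);

    std::string result(static_cast<std::size_t>(length), '\0');

    // Non-const data() (C++17) returns a writable buffer of size() chars,
    // followed by the null terminator. snprintf rewrites the terminator
    // with another '\0', which is allowed.
    std::snprintf(result.data(), result.size() + 1, "value=%d", value);

    return result;
}
```

For example, FormatValue(42) returns the string "value=42". Nothing similar is possible with a string view, which never gives write access to the characters it observes.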

Suggestion for the C++ Standard Library: Null-terminated String Views?

As a side note, it probably wouldn’t be bad if null-terminated string views were added to the C++ standard library. That would make it possible to pass instances of those null-terminated string views instead of const& to string objects as input strings to C-interface APIs.

Visual Studio 2019 and [[nodiscard]] in its Default C++14 Mode

“Back-porting” and enabling [[nodiscard]] in C++14 code bases is a feature that can come in handy.

[[nodiscard]] is a feature that was added in C++17. However, it’s interesting to note that the Visual C++ compiler that comes with Visual Studio 2019 accepts [[nodiscard]] also when you compile C++ code in the default C++14 mode.

[Figure: Visual Studio 2019 C++ project property page, showing the default C++14 mode selected]

If, for some reason, you can’t compile your C++ code base with C++17 mode enabled, I would still suggest using [[nodiscard]] in your C++ code, at least in places where it can help spot subtle bugs (like in the cases suggested here).

[Figure: C4834 [[nodiscard]]-related warning message emitted by VC++ 2019 in its default C++14 mode, when the return value of a [[nodiscard]] function is discarded]

You could even use some conditional compilation with preprocessor macros to enable [[nodiscard]] when compiling your C++ code with compilers like VC++ 2019 that support this feature, and expand that preprocessor macro to nothing in C++ compilers that don’t support it.
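A possible sketch of such a macro follows. The macro name MYLIB_NODISCARD is made up for this example, and the detection relies on __has_cpp_attribute, which is standard only since C++20 but is offered as an extension in earlier modes by several compilers; when it is not available, the macro simply expands to nothing. Some compilers may emit an extension warning when [[nodiscard]] is used in pre-C++17 modes, so treat this as a starting point to adapt, not as a definitive recipe.

```cpp
#include <cassert>

// Expand to [[nodiscard]] when the compiler reports support for it,
// and to nothing otherwise. (MYLIB_NODISCARD is a hypothetical name.)
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(nodiscard)
#    define MYLIB_NODISCARD [[nodiscard]]
#  endif
#endif
#ifndef MYLIB_NODISCARD
#  define MYLIB_NODISCARD
#endif

// Example function whose return value should not be ignored.
MYLIB_NODISCARD inline int ComputeChecksum(int a, int b)
{
    return a * 31 + b;
}

inline void DemoChecksum()
{
    // ComputeChecksum(1, 2);   // would trigger a warning where supported

    const int checksum = ComputeChecksum(1, 2);
    assert(checksum == 33);
    (void)checksum; // silence "unused" warnings when asserts are disabled
}
```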

Where to Apply [[nodiscard]]?

A couple of suggestions for applying the [[nodiscard]] attribute in C++ code.

In the previous blog post, I shared some thoughts on [[nodiscard]]. As I wrote there, it would be better to have a good default of always assuming [[nodiscard]], and opt out from that good default only in some specific cases. But, unfortunately, the current state of the C++ language standard is different, and actually the opposite of the above.

So, where would I strongly recommend applying [[nodiscard]]?

The Case of Bad Confusing Naming Choices

A common case that comes to my mind is the example of std::vector::empty. I have always found that vector’s method name confusing. I mean: Is “empty” a method to empty the vector, that is to remove its content? No. std::vector::empty is a bool-returning method to check if the vector has no elements.

So, in this case, applying [[nodiscard]] to the vector::empty method will make the C++ compiler generate a warning when you write this kind of C++ code:

// myVector is a std::vector

// Empty the vector content [NOTE: Wrong code]
myVector.empty();

As can be read from the comment above, the intention of the programmer was to empty the vector. But the vector::empty method simply returned a bool to check if the vector had no elements. And, in the above case, that returned value was discarded.

So, the use of the [[nodiscard]] attribute with the vector::empty method will generate a warning in the above bogus case. It’s like the C++ compiler telling the programmer: “Hey, programmer! You discarded the return value of vector::empty!” So the programmer thinks: “Hey, what return value? Wait a minute… I just wanted to empty the vector! Ahhh…Ok. I got it: This vector::empty method returns a bool and is used to check if the vector is empty. So, the method I should call is another one!” And then, after some search, the vector::clear method is invoked and the above code fixed.

// Correct code to empty the vector:
myVector.clear();

Note: The much-maligned MFC, with its CArray class, did choose a much better name for its “empty” method, calling it IsEmpty. With such a good name, you certainly won’t make the mistake of invoking IsEmpty when you actually meant to make the array empty. The designers of the STL made a confusing choice for the vector::empty method: calling it is_empty would have been much better and clearer.

So, this is actually a case in which [[nodiscard]] comes to the rescue for a bad confusing choice for naming methods.

The Case of Returning Raw Owning Resource Handles

Another important case where I strongly suggest applying the [[nodiscard]] attribute is when you return to the caller some raw owning handle or pointer (which you can think of as a “handle” to memory resources).

For example, in my WinReg C++ library, I wrap a raw Windows Registry HKEY handle in the safe boundaries of the RegKey C++ class.

However, for flexibility of design, I chose to have a RegKey::Detach method that transfers the ownership of that HKEY raw handle to the caller:

// Transfer ownership of current HKEY to the caller.
// Note that the caller is responsible for closing the key handle!
[[nodiscard]] HKEY Detach() noexcept;

After invoking that method, the caller becomes the new owner of the returned HKEY. As such, the caller will be responsible for properly tracking and releasing the raw handle when not needed anymore.

In such cases, discarding (by mistake) the returned value of the RegKey::Detach method would cause a resource leak. So, having the C++ compiler speak up in such cases would definitely help in preventing such leaks. If you have to be selective about where you apply the [[nodiscard]] attribute, this is certainly a great place for it!
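To show the idea in a portable and self-contained way, here is a simplified sketch that does not use the Windows Registry at all. RawHandle, OpenRawHandle, CloseRawHandle and SafeHandle are all hypothetical names invented for this example (they are not part of WinReg); a global counter stands in for the operating system’s bookkeeping of open handles.

```cpp
#include <cassert>

namespace demo {

// Hypothetical raw handle type and API, standing in for HKEY & co.
using RawHandle = int;
constexpr RawHandle kInvalidHandle = 0;

int g_openHandles = 0; // number of raw handles currently open

RawHandle OpenRawHandle() noexcept { ++g_openHandles; return 1; }
void CloseRawHandle(RawHandle) noexcept { --g_openHandles; }

// Minimal RAII wrapper that owns a raw handle.
class SafeHandle
{
public:
    SafeHandle() noexcept : m_handle{ OpenRawHandle() } {}

    ~SafeHandle()
    {
        if (m_handle != kInvalidHandle)
        {
            CloseRawHandle(m_handle);
        }
    }

    SafeHandle(const SafeHandle&) = delete;
    SafeHandle& operator=(const SafeHandle&) = delete;

    // Transfer ownership of the raw handle to the caller.
    // Discarding the returned value would leak the handle.
    [[nodiscard]] RawHandle Detach() noexcept
    {
        const RawHandle handle = m_handle;
        m_handle = kInvalidHandle;
        return handle;
    }

private:
    RawHandle m_handle;
};

// Returns the number of handles still open at the end of the demo.
int DemoDetach()
{
    RawHandle raw = kInvalidHandle;
    {
        SafeHandle wrapper;
        raw = wrapper.Detach(); // the caller is the owner now
    } // the wrapper's destructor does not close the detached handle

    assert(g_openHandles == 1); // still open: we are responsible for it
    CloseRawHandle(raw);
    return g_openHandles;
}

} // namespace demo
```

Had the code written just wrapper.Detach(); without storing the result, the handle would have been lost for good, and the [[nodiscard]] warning is the only thing pointing at that line.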

To [[nodiscard]] or Not [[nodiscard]] by Default?

Some reflections on the use (or abuse?) of the [[nodiscard]] attribute introduced in C++17.

An interesting addition made in C++17 has been the [[nodiscard]] attribute. The basic idea is to apply it such that the C++ compiler will emit a warning when the return values of functions or methods are discarded.

Someone said that, even in previous versions of the C++ standard, if your C++ compiler didn’t emit a warning when you discard the return value of functions or methods, you should discard that C++ compiler.

Really?

Well, consider this simple piece of C++ code:

int GetSomething()
{
    return 10;
}

int main()
{
    GetSomething();
}

I tried compiling that code with VS 2019 at warning level 4 (which is a good warning level for VC++), and the code compiles successfully, with no warnings emitted by the VC++ compiler.

Of course, I won’t discard VC++, as I consider it the best C++ compiler for Windows (and Visual Studio is my favorite IDE for Windows).

On the other hand, if I add the [[nodiscard]] attribute to the GetSomething function:

[[nodiscard]] int GetSomething()
{
    ...

then the VC++ compiler does emit a warning message:

warning C4834: discarding return value of function with 'nodiscard' attribute

So, should C++ compilers emit warnings on discarded return values by default?

Well, if you think about it, printf returns an int, which I think a lot of programmers discard 🙂

So, if C++ compilers emitted such warnings by default, compiling lots of code that uses printf to print some text to the standard output would result in very noisy compilations.

Well, you may argue that printf comes from C. But I could reply that it’s used a lot in C++ (at least until the recent std::print and related formatting functions added in C++20/23).

Moreover, even focusing on more C++-specific features, something like an operator= overload returns a reference that is discarded most of the time:

T& operator=(const T& other)
{
    ...
    return *this;
}

Should an innocent assignment like:

a = b;

trigger a warning??

So, assuming a default [[nodiscard]] policy, that would indeed generate lots of noisy compilation sessions!

On the other hand, adding [[nodiscard]] to many functions and methods can result in very noisy and kind of cluttered code, too.

So, we need to find a balance here.

A good alternative could be adding [[nodiscard]] to structs, classes or enumerations that represent error information or something else that shouldn’t be discarded, so the C++ compiler will automatically apply the [[nodiscard]] behavior to functions and methods that return those [[nodiscard]] types by value. Note that this behavior is already implemented in C++17, and available in C++ compilers like Visual C++ 2019.
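For instance, here is a minimal sketch of a [[nodiscard]] enumeration. ParseStatus and ParseDigit are hypothetical names invented for this example; the point is that the attribute is written once on the type, and not on every function that returns it.

```cpp
#include <cassert>
#include <string>

// The attribute is applied to the type itself: any function returning
// a ParseStatus by value gets the [[nodiscard]] behavior automatically.
enum class [[nodiscard]] ParseStatus
{
    Ok,
    EmptyInput,
    NotADigit
};

// Note: no [[nodiscard]] on the function declaration.
ParseStatus ParseDigit(const std::string& text, int& value)
{
    if (text.empty())
    {
        return ParseStatus::EmptyInput;
    }
    if (text.size() != 1 || text[0] < '0' || text[0] > '9')
    {
        return ParseStatus::NotADigit;
    }
    value = text[0] - '0';
    return ParseStatus::Ok;
}

void DemoParseDigit()
{
    int value = 0;

    // ParseDigit("7", value);   // would trigger a discarded-value warning

    const ParseStatus status = ParseDigit("7", value);
    assert(status == ParseStatus::Ok);
    assert(value == 7);
    (void)status; // silence "unused" warnings when asserts are disabled
}
```

This keeps function declarations uncluttered, while still making it hard to ignore an error status by accident.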

Another option I was thinking about is this: Assume [[nodiscard]] behavior as the default, and opt out using another attribute, like [[maybe_discarded]], in cases like the operator= overloads, in which discarding the returned value or reference is perfectly sensible. This would adhere to the philosophy of good defaults. In other words, if you don’t specify anything, you’ll get [[nodiscard]] behavior by default, and warning messages. And in those specific cases when discarding the returned value can make sense, only there you specify an explicit attribute like [[maybe_discarded]].

How to Access std::vector’s Internal Array

Pay attention to the target of the address-of (&) operator!

Suppose that you have some data stored in a std::vector, and you need to pass it to a function that takes a pointer to the beginning of the data, and in addition the data size or element count.

Something like this:

// Input data to process
std::vector<int> myData = { 11, 22, 33 };

//
// Do some processing on the above data
//
DoSomething(
    ??? ,         // beginning of data 
    myData.size() // element count
);

You may think of using the address-of operator (&) to get a pointer to the beginning of the data, like this:

// Note: *** Wrong code ***
DoSomething(&myData, myData.size());

But the above code is wrong. In fact, if you use the address-of operator (&) with a std::vector instance, you get the address of the “control block” of std::vector, that is the block that contains the three pointers first, last, end, according to the model discussed in a previous blog post:

[Figure: Taking the address of a std::vector (&v) returns a pointer to the beginning of its control block]

Luckily, if you try the above code, it will fail to compile, with a compiler error message like this one produced by the Visual C++ compiler in VS 2019:

Error C2664: 'void DoSomething(const int *,size_t)': 
cannot convert argument 1 
from 'std::vector<int,std::allocator<int>> *' to 'const int *'
Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast

What you really want here is the address of the vector’s elements stored in contiguous memory locations, and pointed to by the vector’s control block.

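To get that address, you can invoke the address-of operator on the first element of the vector (which is the element at index 0): &myData[0]. Note that this is valid only if the vector is not empty, as an empty vector has no element at index 0.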

// This code works
DoSomething(&myData[0], myData.size());

As an alternative, you can invoke the std::vector::data method:

DoSomething(myData.data(), myData.size());
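Putting the pieces together, here is a small self-contained sketch. SumElements is a hypothetical stand-in for the DoSomething function used above: a C-style function taking a pointer to the beginning of the data plus an element count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style function: pointer to the data + element count.
int SumElements(const int* data, std::size_t count)
{
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        sum += data[i];
    }
    return sum;
}

int SumVector(const std::vector<int>& v)
{
    // data() can be safely called on empty vectors, too: the returned
    // pointer is simply never dereferenced when the count is zero.
    // &v[0], instead, must not be used on an empty vector.
    return SumElements(v.data(), v.size());
}

void DemoSumVector()
{
    const std::vector<int> myData = { 11, 22, 33 };
    assert(SumVector(myData) == 66);

    const std::vector<int> noData;
    assert(SumVector(noData) == 0);
}
```

For this reason, I would generally prefer data() over &v[0] when the vector might be empty.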

Now, there’s a note I’d like to point out for the case of empty vectors:

According to the documentation on CppReference:

If size() is 0, data() may or may not return a null pointer.
CppReference.com

I would have preferred a well-defined behavior such that, when size is 0 (i.e. the vector is empty), data() must return a null pointer (nullptr). This is the good behavior that is implemented in the C++ Standard Library that comes with VS 2019. I believe the C++ Standard should be fixed to adhere to this intelligent behavior.