Three Pieces of Advice on Using Modern C++ at Win32 API Boundaries

C is widely used as a programming language at API interfaces. But that doesn’t mean that you must stick to C (or old-style C++) in *your own* code!

The previous article on enumerating modules loaded into a process using Win32 API functions and C++ invites/inspires some reflections and pieces of advice on using modern C++ at the Win32 API boundaries.

#1: Raw C Handles Should Be Wrapped in Safe C++ Classes (a.k.a. Raw C Handles Are Radioactive)

Many Win32 API C-interface functions use raw C handles (e.g. represented by the HANDLE type). For example, we saw in the previous article that the CreateToolhelp32Snapshot function returns a HANDLE that we used with other related API functions to enumerate the loaded modules.

When the handle is not needed anymore, for example after the enumeration process is completed (or even if it’s interrupted by an error), the raw handle must be freed calling the CloseHandle Win32 API function. This is a common pattern for lots of Win32 API functions:

HANDLE hSomething = CreateSomething( /* ...various parameters... */ );
// Check that the handle is valid
// (a typical error value is INVALID_HANDLE_VALUE)

// Do some processing with the above handle
DoSomething(hSomething, /* ...various parameters ... */);

// Close the handle at the end of the elaboration
CloseHandle(hSomething);

// Avoid dangling references to handles already closed
hSomething = INVALID_HANDLE_VALUE;

Well, in modern C++ the idea is to wrap this raw C HANDLE in a safe C++ class, such that, when instances of this class go out of scope, the handle will be automatically closed.

That is made possible by the fact that the C++ class destructor will be automatically called when instances of the class go out of scope, so a proper call to CloseHandle can be made by the destructor itself (or by some cleanup helper method invoked by the destructor).

To be safe, the cleanup code should also take into account the case in which the wrapped handle is invalid (case represented by the INVALID_HANDLE_VALUE for the CreateToolhelp32Snapshot API function discussed above).

So, the initial skeleton code for such a wrapper C++ class could look like this:

//----------------------------------------------------
// C++ class that safely wraps a raw C-style HANDLE,
// and releases it when instances of the class
// go out of scope.
//----------------------------------------------------
class ScopedHandle
{
public:
    // Gain ownership of the input raw handle
    explicit ScopedHandle(HANDLE h) noexcept
        : m_handle{h}
    {}

    // Get access to the wrapped raw handle,
    // for example to pass it as an argument
    // to other Win32 API functions
    HANDLE GetHandle() const noexcept
    {
        return m_handle;
    }

    // Safely releases the wrapped handle
    // (if the handle is valid)
    ~ScopedHandle() noexcept
    {
        if (m_handle != INVALID_HANDLE_VALUE)
        {
            ::CloseHandle(m_handle);
        }
    }

private:
    // Wrapped raw handle
    HANDLE m_handle;
};

As I discussed in more details in my course on Practical C++ 14 and C++17 Features (that can be still applied to newer versions of the C++ standard, as well), you can think of the raw handle as something “radioactive”, that should be safely wrapped in RAII boundaries, provided by a C++ class that behaves as a resource manager, like the one shown above.

Moreover, to avoid subtle bugs, it’s important to prevent copies for a class like the one described above:

class ScopedHandle
{
    //
    // Disable Copy
    //
private:
    ScopedHandle(ScopedHandle const&) = delete;
    ScopedHandle& operator=(ScopedHandle const&) = delete;
...

(If you do want to make the class copyable, it’s important that copy operations are well defined and implemented; for example, you could use some form of reference count applied to the wrapped handle.)

It’s also possible to improve this kind of resource manager class, for example adding move semantics. That would make it possible, for example, to return a wrapped handle by some factory function, or store it in containers like std::vector. In such case the class name should be changed to reflect its improved nature (ScopedHandle wouldn’t work anymore); for example, we could name it SafeHandle, or UniqueHandle (if it’s movable but not copyable), or whatever you like best.

If you want to see some C++ compilable code for a resource manager class like that, you can take a look at the winreg::RegKey class of my WinReg C++ library (you can find the code in the header-only WinReg.hpp file). Note that, in this case, the wrapped raw handle is of type HKEY (i.e. a handle to a registry key).

The code can be generalized, as well. For example, you could write a generic SafeHandle<T> template. This could be the topic of some future articles.

Moreover, if you want to reuse something already available, the Microsoft WIL open-source library provides a wil::unique_handle template for that purpose.

Whatever class or template you choose to use or write, the bottom line is: Do not use raw handles in modern C++ code; wrap them in safe “RAII” boundaries provided by C++ resource manager classes.

#2: Use C++ String Classes Instead of Raw C-style Null-terminated Character Arrays

Win32 API functions usually work with C structures that represent strings using either raw C-style null-terminated character pointers, or null-terminated character arrays.

In modern C++, you can do better than that! In fact, you can use safe and convenient C++ string classes instead of working with those more basic raw C-style constructs.

For example, the MODULEENTRY32 structure used in the previous article on module enumeration, has two fields that are WCHAR C-style raw null-terminated character arrays: szModule and szExePath.

// Structure definition from MSDN:
// https://learn.microsoft.com/en-us/windows/win32/api/tlhelp32/ns-tlhelp32-moduleentry32w

typedef struct tagMODULEENTRY32W {
  DWORD   dwSize;
  ...

  // Null-terminated WCHAR arrays representing Unicode UTF-16
  // strings in C:
  WCHAR   szModule[MAX_MODULE_NAME32 + 1];
  WCHAR   szExePath[MAX_PATH];
} MODULEENTRY32W;

Instead of working with those, you can create instances of C++ string classes, like CString or std::wstring, and operate on those much safer and higher level constructs made available by the C++ language and libraries:

MODULEENTRY32 moduleEntry;
...

// Create a string object storing the module name
std::wstring moduleName(moduleEntry.szModule);

// Can use ATL/MFC CString as well:
CString moduleName(moduleEntry.szModule);

Once you have created string objects from those C raw character arrays, forget about the original C character arrays, and use only the C++ string objects in the rest of your modern C++ code.

C++ string classes have many advantages over pure raw C-style arrays of characters, like being easily and safely copyable. They can also be concatenated with a very simple and highly readable syntax, like using the operator+ overload (as in: s1 + s2). And they are properly freed when they go out of scope, as well.

#3: Use C++ Containers Like std::vector Instead of Raw C Arrays

If you take a look at MSDN examples, that are typically written in C, you’ll see lots of uses of raw C arrays to store a set of elements. Typically the code follows this pattern:

SOME_STRUCTURE elements[MAX_COUNT];

// May have another variable representing 
// the actual number of elements stored in the array.
// This is increased when a new element is added.
int elementCount = 0;

In modern C++, you can do better than that: In fact, you can create a std::vector containing instances of the structures, and you can dynamically grow the vector, for example adding new elements to it invoking its push_back method:

// Start creating an empty vector
std::vector<ModuleInfo> loadedModules;

// When a new module is found during the enumeration, 
// add it to the vector container
loadedModules.push_back( ModuleInfo{ /* ... */ } );

Hope you find these suggestions of some interest!

Slide enumerating the three pieces of advice on using modern C++ at Win32 API boundaries, described in details in the article.

C is a great language for the “boundaries”. But you can happily switch gears to modern C++ on your own side of the boundary.

How to Enumerate the Modules Loaded in a Process

Let’s see how to use a convenient Windows API C-interface library from C++ code to enumerate the modules loaded in a given process.

Suppose that you want to enumerate all the modules in a given process in Windows, for example because you want to discover the DLLs loaded into a given process.

You can use the so called Tool Help Library for that (the associated Windows SDK header file is <tlhelp32.h>)

In particular, you can start taking the snapshot of the process’s module list, invoking the CreateToolhelp32Snapshot Windows API function. On success, this function will return a HANDLE that will be used with the following module enumeration functions.

// Create module snapshot for the enumeration.
//
// The first parameter passed to CreateToolhelp32Snapshot
// tells the API what kind of elements are included in the snapshot:
// In this case we are interested in the list of *modules*
// loaded in the process of given ID.
HANDLE hEnum = ::CreateToolhelp32Snapshot(TH32CS_SNAPMODULE,
                                          processID);
if (hEnum == INVALID_HANDLE_VALUE)
{
    // Signal error...
}

// On success, store the raw handle in some safe RAII wrapper,
// so it will be *automatically* closed via CloseHandle, 
// even in case of exceptions.
//
// E.g.: ScopedHandle enumHandle{ hEnum };

Then you can initialize the enumeration loop calling the Module32First API function, which will return information about the first module in the list. The module information is stored as fields of the MODULEENTRTY32 structure. You can retrieve the particular pieces of information that you want (e.g. the module name, or its size, etc.) from this structure’s fields.

Next, you can repeatedly invoke the Module32Next API to get information for the subsequent modules in the list.

In C++ code, the enumeration logic can look like this:

// Initialize this structure with its size field (dwSize).
// Other fields of the structure containing module info 
// will be written by the Module32First and Module32Next APIs.
MODULEENTRY32 moduleEntry = { sizeof(moduleEntry) };

// Start the enumeration loop invoking Module32First
BOOL continueEnumeration = ::Module32First(enumHandle.Get(),
                                           &moduleEntry);
while (continueEnumeration)
{
    // Extract the pieces of information we need 
    // from the MODULEENTRY32 structure
    moduleList.push_back(
        // Here we retrieve the module name (szModule) 
        // and the module size (modBaseSize),
        // pack them in a C++ struct, and push it into
        // a vector<ModuleInfo> that will contain the list
        // of all modules in the given process
        ModuleInfo{ moduleEntry.szModule, 
                    moduleEntry.modBaseSize }
    );

    // Move to the next module (if any)
    continueEnumeration = ::Module32Next(enumHandle.Get(),
                                         &moduleEntry);
}

When there are no more modules to enumerate in the snapshot, Module32Next will return FALSE, and a subsequent call to GetLastError will return ERROR_NO_MORE_FILES.

You can see that in action in some compilable C++ code in this repo of mine on GitHub.

For example, suppose that you want to see a list of DLLs loaded in the Notepad process. You can use your favorite tool to get the process ID (PID) associated to your running instance of Notepad, then pass this PID as the only parameter to the command line tool cited above, and it will print out a list of the loaded modules (DLLs) inside the Notepad process, as shown below.

List of modules in the Notepad process.

Comparing STL vs. ATL/MFC String Usage at the Windows API Boundaries

A comparison between the worlds of STL vs. ATL/MFC string usage at the Windows API boundaries. Plus a small suggestion to improve C++ standard library strings.

In previous articles we saw some options for using STL strings and ATL/MFC CString at the Windows API boundaries. Let’s do a quick refresher and comparison between these two “worlds”.

For the sake of simplicity, let’s assume Unicode builds (which have been the default since VS 2005) and consider std::wstring for the STL side of the comparison.

The Input String Case

When passing strings as input parameters to Windows API C-interface functions, you can invoke the c_str method for STL std::wstring instances; on the other side, you can just pass CString instances, as CString implements an implicit C-style string pointer conversion operator, that will be automatically invoked by the compiler. At first sight, it seems that the CString approach is simpler (i.e. just pass the CString object), although in modern C++ there is a propensity to avoid implicit conversions, so the explicit call to c_str required by STL strings sounds safer. (Anyway, if you prefer explicit method invocations, CString offers a GetString method, as well.)

The Output String Case

Using an External Temporary Buffer – Both STL strings and ATL/MFC CString have constructor overloads that take an input pointer to a raw character buffer that is assumed to be null-terminated, and can build string objects from the content of that raw C-style null-terminated character buffer. This means that you can create an external temporary character buffer, pass a pointer to it as output string parameter to the C-interface Windows API you want to invoke, and then build the result string object (both STL wstring and ATL/MFC CString) using a pointer to that external intermediate buffer. In addition, an explicit buffer length can be passed together with the pointer to the beginning of the buffer, in case you want or need to explicitly pass the string length, and not relying on the null terminator.

Working In-Place – For both STL strings and ATL/MFC CString it’s possible to work with an internal buffer. This can be allocated using the resize method for STL strings, and then can be accessed via the non-const pointer returned by the data method invoked for the same string. If the returned string is shorter than the allocated buffer length, you have to find the position of the null-terminator scribbled in by the invoked Windows API, and call the STL string’s resize method once again to set the proper size (“length”) of the result string object.

On the other hand, with CString you can use the GetBuffer/ReleaseBuffer method pair: You can allocate the internal CString buffer specifying a proper (minimum) size invoking GetBuffer, then pass the pointer it returns on success to Windows C-interface APIs, and finally invoke CString::ReleaseBuffer to let the CString object update its internal state to properly store the null-terminated string written by the called function into the provided buffer.

Summary Table of the Various Cases

The following table summarizes the various cases discussed so far in a compact form:

STL vs. ATL/MFC string usage at the Windows API boundaries – Summary table

I think that for the “working in-place” output sub-case, CString is more convenient than STL strings, as:

  1. You don’t have to specify any initial value for filling the buffer allocated with GetBuffer; on the other hand, with STL strings you must specify some initial value to fill the string buffer when you invoke the string’s resize method (or equivalently the string constructor that takes a count of characters to repeat). So the CString::GetBuffer method is also likely more efficient, as it doesn’t need to fill the allocated buffer (at least in release builds).
  2. It’s possible to allocate a larger-than-needed buffer with GetBuffer (all in all you pass the safe minimum required buffer length to this method), then have the Windows API function write a shorter null-terminated string in that buffer. The ReleaseBuffer method will automatically scan the buffer content for the string’s null-terminator, and will properly update the CString object internal state (e.g. the string length) in accordance to that. This nice feature (scan until the null-terminator and properly set the string size) is not available with STL strings, as there is no such thing as a resize_until_null method.

A Small Suggestion for Improving STL Strings Interoperability with C-Interface Functions (Including Windows APIs)

So, here’s as a small suggestion for improving the C++ standard library strings: It would be nice to have something like get_buffer and release_buffer methods available for STL strings, following the same semantics of CString’s GetBuffer and ReleaseBuffer methods, with:

1. No need to specify an initial character to fill the STL string object when the internal buffer is allocated.

2. Automatically set the size of the final string object based on the null-terminator written into the internal buffer.

How to Use ATL/MFC CString at the Windows API Boundaries

Let’s discuss how CString C++ objects can be used at the boundaries of C-interface Windows APIs, considering both the input and output string cases.

We have two cases to consider: Let’s start with the input read-only string case (which is the easiest one).

The Input String Case

If you have a CString instance and you want to pass it to a Windows API function that expects an input read-only string, you can simply pass the CString instance itself! It’s as simple as this:

//
// *** Input string case ***
// 

// Some string to pass as input parameter
CString myString = TEXT("Connie"); 

// Just pass the CString object to the Windows API function
// that expects an input string, like e.g. SetWindowText
DoSomething(myString);

If you compile your C++ code in Unicode (UTF-16) build mode (which has been the default since Visual Studio 2005): CString is a wchar_t-based string, and the input string parameter of the Windows API function that follows the TCHAR model is typically a PCWSTR, which is basically a null-terminated const wchar_t* C-style string pointer. In this case CString offers an implicit conversion operator to PCWSTR, so you can simply pass the CString object, and the C++ compiler will automatically invoke the PCWSTR conversion operator to pass the given string as read-only string input parameter.

The Output String Case

Now let’s consider the output case, i.e. you are invoking a Windows API function that expects an output string parameter. Usually, in C-interface Windows API functions this is represented by a non-const pointer to a raw C-style character array, in particular a wchar_t* (or PWSTR) in the Unicode UTF-16 form.

You (i.e. the caller) pass a non-const pointer to a writable wchar_t buffer, and the Windows API function will fill this caller-provided buffer with proper characters, including a terminating NUL. This is all C stuff, but at the end of the process what you really want is the result string to be stored in a C++ CString object. So, how can you bridge these two worlds of C-style null-terminated string pointers and C++ CString class instances?

Option 1: Using an Intermediate Buffer

Well, one option would be to allocate an intermediary character buffer, then let the Windows API function fill it with its own result text, and finally create a CString object initialized with the content of that external intermediary buffer. For example:

//
// The Output String Case -- Using a Temporary Buffer
//

// 1. Allocate a temporary buffer to pass to the Windows API function
auto buffer = std::make_unique< wchar_t[] >(bufferLength);

// 2. Call the Windows API function, passing the address of the temporary buffer.
// The Windows API function will write its string into that buffer,
// including a NUL terminator, as per usual C string convention.
GetSomeText(
    buffer.get(), // <-- starting address of the output buffer
    bufferLength, // <-- size of the buffer (e.g. in wchar_t's)
    /* ... other parameters ... */
);

// 3. Create a CString object initialized with the NUL-terminated
// string stored in the temporary buffer previously allocated
CString result(buffer.get());

// NOTE: Since the temporary buffer was created with make_unique,
// it will be *automatically* released.
// Thank you C++ destructors!

Option 2: Working In-place with the CString’s Own Buffer

In addition to that, instead of creating an external temporary buffer, passing its address to the Windows API function to let it write the output string, and then copying the result string into a CString object, CString offers another option.

In fact, you can request CString objects to allocate some internal memory, and get write access to that. In this way, you can directly pass the address of that CString’s own internal memory to the desired Windows API function, and let it write the result string directly inside the CString’s internal buffer, without the need of creating an external intermediary buffer, and making an additional string copy (from the temporary buffer to the CString object).

The method that CString offers to allocate an internal character buffer is GetBuffer. You can specify to it the minimum required buffer length. On success, this method returns a non-const pointer to the beginning of the buffer memory, which you can pass to the Windows API function expecting an output string parameter.

Then, after the called function has written its result string into the provided buffer (including the NUL-terminator), you can invoke the CString::ReleaseBuffer method to let the CString object update its internal state to properly store the NUL-terminated string previously written in its own buffer.

In C++ code, this process looks like that:

//
// The Output String Case -- Using CString's Internal Buffer
//

// This CString object will store the output result string
CString result;

// Allocate a CString buffer of proper size, 
// and return a _non-const_ pointer to it
wchar_t* pBuffer = result.GetBuffer(bufferLength);

// Pass the pointer to the internal CString buffer
// and the buffer length to the Windows API function
// expecting an output string parameter
GetSomeText(
    pBuffer,      // <-- starting address of the buffer
    bufferLength, // <-- size of the buffer (e.g. in wchar_t's)
    /* ... other parameters ... */
);

// We assume that the Windows API function has written
// a properly NUL-terminated string into the provided buffer.
// So we can invoke CString::ReleaseBuffer to update
// the CString object's internal state, 
// and release control of the buffer.
result.ReleaseBuffer();

// It's good practice to clear the buffer pointer to avoid subtle bugs
// caused by referencing it after the ReleaseBuffer call
pBuffer = nullptr;

// Now you can happily use the CString result object in your code!

As you can note, in this second case, working in-place with the CString’s internal buffer, the output string characters are written only once inside the CString object, instead of being first written to an external intermediate buffer, and then being copied inside the result CString object.

Why Don’t You Use String Views (as std::wstring_view) Instead of Passing std::wstring by const&?

Thank you for the suggestion. But *in that context* that would cause nasty bugs in my code, and in code that relies on it.

…Because (in the given context) that would be wrong 🙂

This “suggestion” comes up with some frequency…

The context is this: I have some Win32 C++ code that takes input string parameters as const std::wstring&, and someone suggests me to substitute those wstring const reference parameters with string views like std::wstring_view. This is usually because they have learned from someone in some course/video course/YouTube video/whatever that in “modern” C++ code you should use string views instead of passing string objects via const&. [Sarcastic mode on]Are you passing a string via const&? Your code is not modern C++! You are such an ignorant C++98 old-style C++ programmer![Sarcastic mode off] 😉

(There are also other “gurus” who say that in modern C++ you should always use exceptions to communicate error conditions. Yeah… Well, that’s a story for another time…)

So, Thank you for the suggestion, but using std::wstring_view instead of const std::wstring& in that context would introduce nasty bugs in my C++ code (and in other people’s code that relies on my own code)! So, I won’t do that!

In fact, my C++ code in question (like WinReg) talks to some Win32 C-style APIs. These expect PCWSTR as input parameters representing Unicode UTF-16 strings. A PCWSTR is basically a typedef for a _Null_terminated_ const wchar_t*. The key here is the null termination part.

If you have:

// Input string passed via const&.
//
// Someone suggests me to replace 'const wstring &' 
// with wstring_view:
//
//     void DoSomething(std::wstring_view s, ...)
//
void DoSomething(const std::wstring& s, ...)
{
    // This API expects input string as PCWSTR,
    // i.e. _null-terminated_ const wchar_t*.
    SomeWin32Api(s.data(), ...); // <-- See the P.S. later
}

std::wstring guarantees that the pointer returned by the wstring::data() method points to a null-terminated string.

On the other hand, invoking std::wstring_view::data() does not guarantee that the returned pointer points to a null-terminated string. It may, or may not. But there is no guarantee!

So, since [w]string_views are not guaranteed to be null-terminated, using them with Win32 APIs that expect null-terminated strings is totally wrong and a source of nasty bugs.

So, if your target are Win32 API calls that expect null-terminated C-style strings, just keep passing good old std::wstring by const&.

Bonus Reading: The Case of string_view and the Magic String


P.S. Invoking data() vs. c_str() – To make things clearer (and more bug-resistant), when you need a null-terminated C-style string pointer as input parameter, it’s better to invoke the c_str() method on [w]string (instead of the data() method), as there is no corresponding c_str() method available with [w]string_view.

In this way, if someone wants to “modernize” the existing C++ code and tries to change the input string parameter from [w]string const& to [w]string_view, they get a compiler error when the c_str() method is invoked in the modified code (as there is no c_str() method available for string views). It’s much better to get a compile-time error than a subtle run-time bug!

On the other hand, the data() method is available for both strings and string views, but its guarantees about null-termination are different for strings vs. string views.

So, invoking the string’s c_str() method (instead of the data() method) is what I suggest when passing STL strings to Win32 API calls that expect C-style null-terminated string pointers as input (read-only) parameters. I consider this a best practice.

(Of course, if the C-interface API function needs to write to the provided string buffer, the data() method must be invoked, as it’s overloaded for both the const and non-const cases.)

Embedding (and Extracting) Binary Files like DLLs into an EXE as Resources

A Windows .EXE executable file can contain binary resources, which are basically arbitrary binary data embedded in the file.

In particular, it’s possible to embed one or more DLLs as binary resources into an EXE. In this article, I’ll first show you how to embed a DLL as a binary resource into an EXE using the Visual Studio IDE; then, you’ll learn how to access that binary resource data using proper Windows API calls.

A Windows EXE file can contain one or more DLLs embedded as binary resources.

Embedding a Binary Resource Using Visual Studio IDE

If you are using Visual Studio to develop your Windows C++ applications, from Solution Explorer you can right-click your EXE project node, then choose Add > Resource from the menu.

Menu command to add a resource using Visual Studio.
Adding a resource from the Visual Studio IDE

Then click the Import button, and select the binary resource to embed into the EXE, for example: TestDll.dll.

The Add Resource dialog box in Visual Studio
Click the Import button to add the binary resource (e.g. DLL)

In the Custom Resource Type dialog box that appears next, enter RCDATA as Resource type.

Then click the OK button.

A hex representation of the resource bytes is shown in the IDE. Type Ctrl+S or click the diskette icon in the toolbar to save the new resource data.

You can close the binary hex representation of the resource.

The resource was automatically labeled by the Visual Studio IDE as IDR_RCDATA1. To change that resource ID, you can open Resource View. Then, expand the project node, until you see the RCDATA virtual folder, and then IDR_RCDATA1 inside it. Click the IDR_RCDATA1 item to select it.

In the Properties grid below, you can change the ID field, for example: you can rename the resource ID as IDR_TEST_DLL.

Type Ctrl+S to save the modifications.

The binary resource ID under the RCDATA virtual folder in Resource View
The binary resource ID (IDR_TEST_DLL) under the RCDATA virtual folder in Resource View
The resource properties grid
The Properties grid to edit the resource properties

Don’t forget to #include the resource header (for example: “resource.h”) in your C++ code when you need to refer to the embedded resource by its ID.

In particular, if you open the resource.h file that was created and modified by Visual Studio, you’ll see a #define line that associates the “symbolic” name of the resource (e.g. IDR_TEST_DLL) with an integer number that represents the integer ID of the resource, for example:

#define IDR_TEST_DLL            101

Accessing an Embedded Binary Resource from C/C++ Code

Once you have embedded a binary resource, like a DLL, into your EXE, you can access the resource’s binary data using some specific Windows APIs. In particular:

  1. Invoke FindResource to get the specified resource’s information block (represented by an HRSRC handle).
  2. Invoke LoadResource, passing the above handle to the resource information block. On success, LoadResource will return another handle (declared as HGLOBAL for backward compatibility), that can be used to access the first byte of the resource.
  3. Invoke LockResource passing the resource handle returned by LoadResource, to get access to the first byte of the resource.

To get the size of the resource, you can call the SizeofResource API.

The above “API dance” can be translated into the following C++ code:

#include "resource.h"  // for the resource ID (e.g. IDR_TEST_DLL)


// Locate the embedded resource having ID = IDR_TEST_DLL
HRSRC hResourceInfo = ::FindResource(hModule,
                                     MAKEINTRESOURCE(IDR_TEST_DLL),
                                     RT_RCDATA);
if (hResourceInfo == nullptr)
{
    // Handle error...
}

// Get the handle that will be used to access 
// the first byte of the resource
HGLOBAL hResourceData = ::LoadResource(hModule, hResourceInfo);
if (hResourceData == nullptr)
{
    // Handle error...
}

// Get the address of the first byte of the resource
const void * pvResourceData = ::LockResource(hResourceData);
if (pvResourceData == nullptr)
{
    // Handle error...
}

// Get the size, in bytes, of the resource
DWORD dwResourceSize = ::SizeofResource(hModule, hResourceInfo);
if (dwResourceSize == 0)
{
    // Handle error...
}

I uploaded on GitHub a C++ demo code that extracts a DLL embedded as a resource in the EXE, and, for testing purposes, invokes a function exported from the extracted DLL. In particular, you can take a look at the ResourceBinaryView.h file for a reusable C++ class to get a read-only binary view of a resource.

P.S. An EXE is not the only type of Windows Portable Executable (PE) file that can have embedded resources. For example: DLLs can contain resources, as well.

Unicode Conversions with String Views as Input Parameters

Replacing input STL string parameters with string views: Is it always possible?

In a previous blog post, I showed how to convert between Unicode UTF-8 and UTF-16 using STL string classes like std::string and std::wstring. The std::string class can be used to store UTF-8-encoded text, and the std::wstring class can be used for UTF-16. The C++ Unicode conversion code is available on GitHub as open source project.

The above code passes input string parameters using const references (const &) to STL string objects:

// Convert from UTF-16 to UTF-8
std::string ToUtf8(std::wstring const& utf16)
    
// Convert from UTF-8 to UTF-16
std::wstring ToUtf16(std::string const& utf8)

Since C++17, it’s also possible to use string views for input string parameters. Since string views are cheap to copy, they can just be passed by value (instead of const&). For example:

// Convert from UTF-16 to UTF-8
std::string ToUtf8(std::wstring_view utf16)
    
// Convert from UTF-8 to UTF-16
std::wstring ToUtf16(std::string_view utf8)

As you can see, I replaced the input std::wstring const& parameter above with a simpler std::wstring_view passed by value. Similarly, std::string const& was replaced with std::string_view.

Important Gotcha on String Views and Null Termination

There is an important note to make here. The WideCharToMultiByte and MultiByteToWideChar Windows C-interface APIs that are used in the conversion code can accept input strings in two forms:

  1. A null-terminated C-style string pointer
  2. A counted (in bytes or wchar_ts) string pointer

In my code, I used the second option, i.e. the counted behavior of those APIs. So, using string views instead of STL string classes works just fine in this case, as string views can be seen as a pointer and a “size”, or count of characters.

A representation of string views: they can be seen as a pointer and a size.
A representation of string views: pointer + size

But string views are not necessarily null-terminated, which implies that you cannot safely use string view parameters when passing strings to APIs that expect null-terminated C-style strings. In fact, if the API is expecting a terminating null, it may well run over the valid string view characters. This is a very important point to keep in mind, to avoid subtle and dangerous bugs when using input string view parameters.

The modified code that uses input string view parameters instead of STL string classes passed by const& can be found in this branch of the main Unicode string conversion project on GitHub.

Simplifying Windows Registry Programming with the C++ WinReg Library

A convenient easy-to-use and hard-to-misuse high-level C++ library that wraps the complexity of the Windows Registry C-interface API.

The native Windows Registry API is a C-interface API, that is low-level and kind of hard and cumbersome to use.

For example, suppose that you simply want to read a string value under a given key. You would end up writing code like this:

Complex code to get a string value from the registry, using the Windows native Registry API.
Sample code excerpt to read a string value from the Windows Registry using the native Windows C-interface API.

Note how complex and bug-prone that kind of code that directly calls the Windows RegGetValueW API is. And this is just the part to query the destination string length. Then, you need to allocate a string object with proper size (and pay attention to proper size-in-bytes-to-size-in-wchar_ts conversion!), and after that you can finally read the actual string value into the local string object.

That’s definitely a lot of bug-prone C++ code, and this is just to query a string value!

Moreover, in modern C++ code you should prefer using nice higher-level resource manager classes with automatic resource cleanup, instead of raw HKEY handles that are used in the native C-interface Windows Registry API.

Fortunately, it’s possible to hide that kind of complex and bug-prone code in a nice C++ library, that offers a much more programmer-friendly interface. This is basically what my C++ WinReg library does.

For example, with WinReg querying a string value from the Windows Registry is just a simple one-line of C++ code! Look at that:

Simple one-liner C++ code to get a string value from the Registry using WinReg.
You can query a string value with just one simple line of C++ code using WinReg.

With WinReg you can also enumerate all the values under a given key with simple intuitive C++ code like this:

auto values = key.EnumValues();

for (const auto & [valueName, valueType] : values)
{
    //
    // Use valueName and valueType
    //
    ...
}

WinReg is an open-source C++ library, available on GitHub. For the sake of convenience, I packaged and distribute it as a header-only library, which is also available via the vcpkg package manager.

If you need to access the Windows Registry from your C++ code, you may want to give C++ WinReg a try.

Decoding Windows SDK Types: PSTR and PCSTR

The char-based equivalents of the already decoded wchar_t-based C-style string typedefs

In previous blog posts, we decoded some Windows SDK string typedefs like PWSTR and PCWSTR. As we already saw, they are basically typedefs for C-style null-terminated wchar_t Unicode UTF-16 string pointers. In particular, the “C” in PCWSTR means that the string is read-only (const).

To recap these typedefs in table form, we have:

Windows SDK TypedefC/C++ Underlying Type
PWSTRwchar_t*
PCWSTRconst wchar_t*

Now, as you can imagine, there are also char-based variants of these wchar_t string typedefs. You can easily recognize their names as the char versions do not have the “W” (which stands for WCHAR or equivalently wchar_t): They are PSTR and PCSTR.

As you can see, there is the “STR” part in their names, specifying that these are C-style null-terminated strings. The “P” stands for pointer. So, in table form, they correspond to the following C/C++ underlying types:

Windows SDK TypedefC/C++ Underlying Type
PSTRchar*
PCSTRconst char*

As expected, the const version, which represents read-only strings, has the letter “C” before “STR”.

These char-based string pointers can be used to represent ASCII strings, strings in some kind of multi-byte encoding, and even UTF-8-encoded strings.

Moreover, as we already saw for the wchar_t versions, another common prefix you can find for them is “LP” instead of just “P” for “pointer”, as in LPSTR and LPCSTR. LPSTR is the same as PSTR; similarly, LPCSTR is the same as PCSTR.

In table form:

Windows SDK TypedefC/C++ Underlying Type
PSTRchar*
LPSTRchar*
PCSTRconst char*
LPCSTRconst char*
Windows SDK char-based C-style string typedefs

How Can You Pass STL Strings as PWSTR Parameters at the Windows API Boundaries?

Getting text from Windows C-interface APIs and storing it into STL string objects.

In a previous blog post, we saw that a PWSTR is basically a pointer to a wchar_t array that is typically filled with some Unicode UTF-16 null-terminated text by Windows API calls. In other words, a PWSTR is an output C-style string pointer. You pass it to some Windows API, and, on success, the API will have written some null-terminated text into the caller provided buffer.

Typically, a PWSTR pointer parameter is accompanied by another parameter that represents the size of the output buffer pointed to. In this way, the Windows API knows where to stop when writing the output string, preventing dangerous buffer overflow bugs and security problems.

For example, if you consider the MultiByteToWideChar API prototype1:

int MultiByteToWideChar(
    UINT   CodePage,
    DWORD  dwFlags,
    LPCCH  lpMultiByteStr,
    int    cbMultiByte,
    LPWSTR lpWideCharStr,
    int    cchWideChar
);

The second to last parameter (lpWideCharStr) is a caller-provided pointer to an output buffer, and the last parameter (cchWideChar) is the size, in wchar_ts, of that output buffer.

Now, suppose that you have an STL string and want to pass it as an output PWSTR parameter. How can you do that?

If you have a std::wstring object, it’s ready to store Unicode UTF-16 text on Windows.

Allocating an External Buffer

To store some UTF-16 text returned by a Windows API via PWSTR parameter in a wstring object, you can allocate a wchar_t buffer of proper size, for example using std::unique_ptr and std::make_unique:

// Allocate an output buffer of proper size
auto buffer = std::make_unique< wchar_t[] >(bufferLength);

Then, you can invoke the desired Windows API, passing the buffer pointer and size:

// Call the Windows API to get some text in the allocated buffer 
result = GetSomeTextApi(
    buffer.get(), // output buffer pointer (PWSTR)
    bufferLength, // size of the output buffer, in wchar_ts

    // ...other parameters...
);

// Check 'result' for errors...

Then, since the output buffer is null-terminated, you can use a std::wstring constructor overload to create a std::wstring object from the null-terminated text stored in that buffer:

// Create a wstring that stores the null-terminated text
// returned by the API in the output buffer
std::wstring text(buffer.get());

However, you can do even better than that.

Working Directly with the String’s Internal Buffer

In fact, instead of allocating an external buffer owned by a unique_ptr, and then doing a second string allocation with the std::wstring constructor, you could simply create a wstring of proper size, and then pass to the API the address of the internal string character array. In this way, you work in-place in the wstring character array, without allocating an external buffer. For example:

// Allocate a string of proper size (bufferLength is in wchar_ts)
std::wstring text;
text.resize(bufferLength);

// Get the text from the Windows API
result = GetSomeTextApi(
    &text[0],     // output buffer pointer (PWSTR)
    bufferLength, // size of the output buffer, in wchar_ts

    // ...other parameters...
);

Note that the address of the internal string buffer can be obtained with the &text[0] syntax. In addition, since C++17, the std::wstring class offers a convenient wstring::data method, that you can invoke to get the address of the internal string buffer, as well.

Note also that, sine C++17, it’s became legal to overwrite the terminating NUL character in the internal string buffer with another NUL.

On the other hand, with C++11 and C++14, to fully adhere to the C++ standard, you had to allocate a buffer of larger size to make room for the NUL terminator written by the Windows API, and then you had resize down the string to chop off this additional NUL:

text.resize(bufferLength - 1);

I wrote an article for MSDN Magazine on “Using STL Strings at Win32 API Boundaries”. Note that this article predates C++17. So, if your C++ toolkit is compatible with C++17:

  • You can invoke the wstring::data method (in addition to the C++11/14 compatible &text[0] syntax)
  • You can let Windows APIs overwrite the terminating NUL in the wstring internal buffer with another NUL, instead of making extra room for the additional NUL written by Windows APIs, and then resizing the wstring down to chop it off.2
  1. The MultiByteToWideChar API prototype uses LPWSTR, which is perfectly equivalent to PWSTR, as we already saw in the blog post discussing the PWSTR type. ↩︎
  2. Overwriting the STL string’s terminating NUL with another NUL has worked fine for me even with C++11 and C++14 Visual Studio compilers and library implementations. ↩︎