C – Giovanni Dicanio's Blog

Linus Torvalds and the Supposedly “Garbage Code”

Linus Torvalds criticized a RISC-V Linux kernel contribution from a Google engineer as “garbage code.” The discussion focuses on the helper function make_u32_from_two_u16() versus Linus’s proposed explicit code. Let’s discuss the importance of using proper type casting, bit manipulation, and creating a safer, reusable macro or function for clarity and bug reduction.

Recently, Linus Torvalds publicly dismissed a RISC-V code contribution to the Linux kernel made by a Google engineer as “garbage code”:

https://lkml.org/lkml/2025/8/9/76

First, I think Linus should be more respectful of other people.

In addition, let’s focus on the make_u32_from_two_u16() helper. My understanding is that this is a C preprocessor macro (as the Linux Kernel is mainly written in C). Let’s compare that helper with the explicit code “(a << 16) + b” proposed by Linus.

First, this explicit code is likely wrong, and in fact Linus adds that “maybe you need to add a cast”.

Why should we add a cast? In Linus’s words: “[…] to make sure that ‘b’ doesn’t have high bits that pollutes the end result”. So, what should the explicit code look like according to him? “(a << 16) + (uint16_t)b”?

But let’s take a step back. We should ask ourselves: What are the types of ‘a’ and ‘b’? From the helper’s name, I would think they are two “u16”, so two uint16_t.

If I were asked to write C code that takes two uint16_t values ‘a’ and ‘b’ as input and combines them into a uint32_t, I would write something like this:

  ((uint32_t)a << 16) | (uint32_t)b

I would use the bitwise OR (|) instead of +; I find it more appropriate as we are working at the bit manipulation level here. But maybe that’s just a matter of personal preference and coding style.

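Moreover, I’d use the type casts as shown above, on both ‘a’ and ‘b’. The cast on ‘a’ is not just cosmetic: without it, a uint16_t ‘a’ is promoted to a (typically 32-bit) signed int before the shift, and shifting a value whose top bit is set left by 16 positions overflows that signed int, which is undefined behavior in C.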

I’m not sure what Linus meant by ‘b’ potentially having “high bits that pollutes the end result”. Could ‘b’ be a uint32_t? In that case, I would use a bitmask like 0xFFFF with bitwise AND (&) to clear the high bits of ‘b’.

Moreover, I’d probably use better names for ‘a’ and ‘b’, too, like ‘high’ and ‘low’, to make it clear what is the high 16-bit word and what is the low 16-bit word.

So, the correct explicit code is not something as simple as “(a << 16) + b”. You may need type casts, and you have to apply them correctly, with proper use of parentheses. And, depending on the type of ‘b’, you may also need to clear its high bits with a bitmask.
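To make the concern about “high bits” concrete, here is a small C++ sketch. It assumes, purely for illustration, that both inputs arrive as 32-bit values (I don’t know the actual types used in the kernel patch); combine_naive and combine_masked are hypothetical names that I made up for this comparison.

```cpp
#include <cstdint>

// Naive combination, as in "(a << 16) + b": no masking at all.
// (Hypothetical helper, for illustration only.)
constexpr std::uint32_t combine_naive(std::uint32_t a, std::uint32_t b) noexcept
{
    return (a << 16) + b;
}

// Careful combination: keep only the low 16 bits of each input.
constexpr std::uint32_t combine_masked(std::uint32_t high, std::uint32_t low) noexcept
{
    return ((high & 0xFFFFu) << 16) | (low & 0xFFFFu);
}

// With clean 16-bit inputs, the two versions agree.
static_assert(combine_naive(0x1234u, 0xABCDu) == 0x1234ABCDu, "clean inputs, naive");
static_assert(combine_masked(0x1234u, 0xABCDu) == 0x1234ABCDu, "clean inputs, masked");

// If 'b' carries a stray bit above bit 15, the naive version lets it
// leak into the high word: 0x1234 silently becomes 0x1235.
static_assert(combine_naive(0x1234u, 0x1ABCDu) == 0x1235ABCDu, "polluted result");
static_assert(combine_masked(0x1234u, 0x1ABCDu) == 0x1234ABCDu, "masked result");
```

Note how the + operator makes things worse here: the stray bit is carried into the high word, so the bug shows up as a slightly different value rather than as an obviously broken one.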

And, if this operation of combining two uint16_t into a uint32_t is done in several places, you sure have many opportunities to introduce bugs with the explicit code that Linus advocates for in his email!

So, it would be much better, clearer, nicer, and safer, to raise the semantic level of the code, and write a helper function or macro to do that combination safely and correctly.

A C macro could look like this:

#include <stdint.h>

#define MAKE_U32_FROM_TWO_U16(high, low) \
        ( ((uint32_t)(high) << 16) | (uint32_t)(low) )

Should we take into consideration the case in which ‘low’ has higher bits to clear? Then the macro becomes something like this:

#define MAKE_U32_FROM_TWO_U16(high, low) \
        ( ((uint32_t)(high) << 16) | ((uint32_t)(low) & 0xFFFF) )

As you can see, the type casts, the parentheses, the potential bit-masking, do require attention. But once you get the code right, you can safely and conveniently reuse it every time you need!
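To illustrate why the parentheses around the macro parameters matter, here is a small sketch that compares the macro above with a deliberately sloppy variant. MAKE_U32_SLOPPY and the two demo functions are hypothetical names, introduced only for this comparison.

```cpp
#include <stdint.h>

// Careful version (same shape as the macro shown above):
// every macro parameter is wrapped in parentheses.
#define MAKE_U32_FROM_TWO_U16(high, low) \
        ( ((uint32_t)(high) << 16) | ((uint32_t)(low) & 0xFFFF) )

// Deliberately sloppy version (hypothetical, for illustration only):
// the macro parameters are NOT wrapped in parentheses.
#define MAKE_U32_SLOPPY(high, low) \
        ( ((uint32_t)high << 16) | ((uint32_t)low & 0xFFFF) )

// Pass an expression, not a plain variable, as the high word.
constexpr uint32_t demo_careful(uint16_t a, uint16_t b)
{
    // Expands to ((uint32_t)(a | b) << 16) | ...
    return MAKE_U32_FROM_TWO_U16(a | b, 0);
}

constexpr uint32_t demo_sloppy(uint16_t a, uint16_t b)
{
    // Expands to ((uint32_t)a | b << 16) | ...
    // Since << binds tighter than |, only 'b' gets shifted.
    return MAKE_U32_SLOPPY(a | b, 0);
}

static_assert(demo_careful(0x0001, 0x0002) == 0x00030000u, "both bits land in the high word");
static_assert(demo_sloppy(0x0001, 0x0002) == 0x00020001u, "bit 0 of 'a' stays in the low word");
```

The sloppy version compiles without errors, which is exactly the problem: the mistake only shows up in the computed value.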

So, the real garbage code is the explicit, bug-prone or wrong code, like “(a << 16) + b”, written over and over again; hiding such code in a sane helper macro (or function), as shown above, is not.

Instead of a preprocessor macro, we could use an inline helper function. For example, in C++ we could write something like this:

#include <stdint.h>

inline uint32_t make_u32_from_two_u16(uint16_t high, uint16_t low) 
{
    return (static_cast<uint32_t>(high) << 16) | 
           static_cast<uint32_t>(low);
}

We could even further refine this function, marking it noexcept, as it’s guaranteed to not throw exceptions.

And we could also make the function constexpr, as it can be evaluated at compile-time when the input arguments are constant.

With these additional refinements, we get:

inline constexpr uint32_t make_u32_from_two_u16(
    uint16_t high, 
    uint16_t low)  noexcept 
{
    return (static_cast<uint32_t>(high) << 16) |
           static_cast<uint32_t>(low);
}
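As a quick check of these two refinements, the following sketch repeats the function and verifies a few results at compile time with static_assert. The test values are my own picks, not taken from the kernel discussion.

```cpp
#include <cstdint>

constexpr std::uint32_t make_u32_from_two_u16(
    std::uint16_t high,
    std::uint16_t low) noexcept
{
    return (static_cast<std::uint32_t>(high) << 16) |
           static_cast<std::uint32_t>(low);
}

// constexpr: the results can be verified by the compiler itself.
static_assert(make_u32_from_two_u16(0x1234, 0xABCD) == 0x1234ABCDu,
              "high word goes to bits 16..31, low word to bits 0..15");
static_assert(make_u32_from_two_u16(0xFFFF, 0xFFFF) == 0xFFFFFFFFu,
              "all bits set");
static_assert(make_u32_from_two_u16(0x8000, 0x0000) == 0x80000000u,
              "top bit of the high word is preserved");

// noexcept: the noexcept operator confirms the function is
// declared as non-throwing.
static_assert(noexcept(make_u32_from_two_u16(1, 2)),
              "the helper is declared noexcept");
```

If someone later breaks the helper, for example by swapping ‘high’ and ‘low’, the build fails right away instead of producing wrong values at run time.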

Three Pieces of Advice on Using Modern C++ at Win32 API Boundaries

C is widely used as a programming language at API interfaces. But that doesn’t mean that you must stick to C (or old-style C++) in *your own* code!

The previous article, on enumerating the modules loaded into a process using Win32 API functions and C++, inspires a few reflections and pieces of advice on using modern C++ at the Win32 API boundaries.

#1: Raw C Handles Should Be Wrapped in Safe C++ Classes (a.k.a. Raw C Handles Are Radioactive)

Many Win32 API C-interface functions use raw C handles (e.g. represented by the HANDLE type). For example, we saw in the previous article that the CreateToolhelp32Snapshot function returns a HANDLE that we used with other related API functions to enumerate the loaded modules.

When the handle is not needed anymore, for example after the enumeration process is completed (or even if it’s interrupted by an error), the raw handle must be released by calling the CloseHandle Win32 API function. This is a common pattern for lots of Win32 API functions:

HANDLE hSomething = CreateSomething( /* ...various parameters... */ );
// Check that the handle is valid
// (a typical error value is INVALID_HANDLE_VALUE)

// Do some processing with the above handle
DoSomething(hSomething, /* ...various parameters ... */);

// Close the handle at the end of the processing
CloseHandle(hSomething);

// Avoid dangling references to handles already closed
hSomething = INVALID_HANDLE_VALUE;

Well, in modern C++ the idea is to wrap this raw C HANDLE in a safe C++ class, such that, when instances of this class go out of scope, the handle will be automatically closed.

That is made possible by the fact that the C++ class destructor will be automatically called when instances of the class go out of scope, so a proper call to CloseHandle can be made by the destructor itself (or by some cleanup helper method invoked by the destructor).

To be safe, the cleanup code should also take into account the case in which the wrapped handle is invalid (represented by INVALID_HANDLE_VALUE for the CreateToolhelp32Snapshot API function discussed above).

So, the initial skeleton code for such a wrapper C++ class could look like this:

#include <windows.h>  // HANDLE, INVALID_HANDLE_VALUE, CloseHandle

//----------------------------------------------------
// C++ class that safely wraps a raw C-style HANDLE,
// and releases it when instances of the class
// go out of scope.
//----------------------------------------------------
class ScopedHandle
{
public:
    // Gain ownership of the input raw handle
    explicit ScopedHandle(HANDLE h) noexcept
        : m_handle{h}
    {}

    // Get access to the wrapped raw handle,
    // for example to pass it as an argument
    // to other Win32 API functions
    HANDLE GetHandle() const noexcept
    {
        return m_handle;
    }

    // Safely releases the wrapped handle
    // (if the handle is valid)
    ~ScopedHandle() noexcept
    {
        if (m_handle != INVALID_HANDLE_VALUE)
        {
            ::CloseHandle(m_handle);
        }
    }

private:
    // Wrapped raw handle
    HANDLE m_handle;
};

As I discussed in more detail in my course on Practical C++ 14 and C++17 Features (which can still be applied to newer versions of the C++ standard, as well), you can think of the raw handle as something “radioactive”, that should be safely wrapped in RAII boundaries, provided by a C++ class that behaves as a resource manager, like the one shown above.

Moreover, to avoid subtle bugs, it’s important to prevent copies for a class like the one described above:

class ScopedHandle
{
    //
    // Disable Copy
    //
private:
    ScopedHandle(ScopedHandle const&) = delete;
    ScopedHandle& operator=(ScopedHandle const&) = delete;
...

(If you do want to make the class copyable, it’s important that copy operations are well defined and implemented; for example, you could use some form of reference count applied to the wrapped handle.)

It’s also possible to improve this kind of resource manager class, for example by adding move semantics. That would make it possible, for example, to return a wrapped handle from a factory function, or to store it in containers like std::vector. In that case, the class name should be changed to reflect its improved nature (ScopedHandle would no longer be accurate); for example, we could name it SafeHandle, or UniqueHandle (if it’s movable but not copyable), or whatever you like best.
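Here is a minimal sketch of what a movable-but-not-copyable wrapper could look like. To keep it buildable on any platform, it uses stand-ins of my own invention: RawHandle, kInvalidHandle and CloseRawHandle play the roles of HANDLE, INVALID_HANDLE_VALUE and ::CloseHandle, and OpenSomething is a hypothetical factory function. Take it as one possible shape, not as a reference implementation.

```cpp
#include <utility>
#include <vector>

// Stand-ins so that the sketch builds anywhere (assumptions, not Win32):
using RawHandle = int;
constexpr RawHandle kInvalidHandle = -1;

// Counts the "close" calls, so the behavior can be observed
inline int& CloseCount() noexcept
{
    static int count = 0;
    return count;
}

inline void CloseRawHandle(RawHandle) noexcept { ++CloseCount(); }

class UniqueHandle
{
public:
    explicit UniqueHandle(RawHandle h = kInvalidHandle) noexcept
        : m_handle{h}
    {}

    // Move: steal the handle and leave the source empty
    UniqueHandle(UniqueHandle&& other) noexcept
        : m_handle{other.m_handle}
    {
        other.m_handle = kInvalidHandle;
    }

    UniqueHandle& operator=(UniqueHandle&& other) noexcept
    {
        if (this != &other)
        {
            Close();  // release what we currently own
            m_handle = other.m_handle;
            other.m_handle = kInvalidHandle;
        }
        return *this;
    }

    // Copy is still disabled
    UniqueHandle(UniqueHandle const&) = delete;
    UniqueHandle& operator=(UniqueHandle const&) = delete;

    ~UniqueHandle() noexcept { Close(); }

    RawHandle GetHandle() const noexcept { return m_handle; }
    bool IsValid() const noexcept { return m_handle != kInvalidHandle; }

private:
    void Close() noexcept
    {
        if (m_handle != kInvalidHandle)
        {
            CloseRawHandle(m_handle);
            m_handle = kInvalidHandle;
        }
    }

    RawHandle m_handle;
};

// Hypothetical factory function returning a wrapped handle by value
inline UniqueHandle OpenSomething() { return UniqueHandle{42}; }

// Usage sketch: returns how many handles were closed at scope exit
inline int DemoMoveSemantics()
{
    const int before = CloseCount();
    {
        std::vector<UniqueHandle> handles;
        handles.push_back(OpenSomething());  // moved into the container
        UniqueHandle a = OpenSomething();
        UniqueHandle b = std::move(a);       // 'a' is now empty
    }   // 'b' and the vector element are closed here; 'a' owns nothing
    return CloseCount() - before;
}
```

The key point is that a moved-from object is left holding the invalid value, so its destructor has nothing to close and each raw handle is released exactly once.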

If you want to see some C++ compilable code for a resource manager class like that, you can take a look at the winreg::RegKey class of my WinReg C++ library (you can find the code in the header-only WinReg.hpp file). Note that, in this case, the wrapped raw handle is of type HKEY (i.e. a handle to a registry key).

The code can be generalized, as well. For example, you could write a generic SafeHandle<T> template. This could be the topic of some future articles.
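Just to give an idea of one possible direction, here is a rough sketch in which the template parameter is a small “traits” type describing the raw handle type, its invalid value, and how to release it. This matters because CreateToolhelp32Snapshot reports failure with INVALID_HANDLE_VALUE, while some other Win32 functions signal failure with NULL instead. Everything in this sketch is hypothetical and uses portable stand-ins (plain integers and a counter) rather than real Win32 handles.

```cpp
// Counts released handles (stand-in for real cleanup calls)
inline int& ClosedCount() noexcept
{
    static int count = 0;
    return count;
}

// Hypothetical traits for handles whose invalid value is -1
struct MinusOneInvalidTraits
{
    using Type = long;
    static constexpr Type Invalid() noexcept { return -1; }
    static void Close(Type) noexcept { ++ClosedCount(); }
};

// Hypothetical traits for handles whose invalid value is 0
struct ZeroInvalidTraits
{
    using Type = long;
    static constexpr Type Invalid() noexcept { return 0; }
    static void Close(Type) noexcept { ++ClosedCount(); }
};

// Generic non-copyable wrapper: the traits decide what "invalid"
// means and how the handle is released
template <typename Traits>
class SafeHandle
{
public:
    using RawType = typename Traits::Type;

    explicit SafeHandle(RawType h = Traits::Invalid()) noexcept
        : m_handle{h}
    {}

    SafeHandle(SafeHandle const&) = delete;
    SafeHandle& operator=(SafeHandle const&) = delete;

    ~SafeHandle() noexcept
    {
        if (IsValid())
        {
            Traits::Close(m_handle);
        }
    }

    RawType Get() const noexcept { return m_handle; }
    bool IsValid() const noexcept { return m_handle != Traits::Invalid(); }

private:
    RawType m_handle;
};

// Usage sketch: returns how many handles were released at scope exit
inline int DemoGenericHandle()
{
    const int before = ClosedCount();
    {
        SafeHandle<MinusOneInvalidTraits> a{7};  // valid: released
        SafeHandle<MinusOneInvalidTraits> b;     // invalid: skipped
        SafeHandle<ZeroInvalidTraits> c{0};      // invalid for this kind
        SafeHandle<ZeroInvalidTraits> d{5};      // valid: released
    }
    return ClosedCount() - before;
}
```

Only ‘a’ and ‘d’ hold valid handles here, so only two release calls are made when the scope ends.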

Moreover, if you want to reuse something already available, the Microsoft WIL open-source library provides a wil::unique_handle template for that purpose.

Whatever class or template you choose to use or write, the bottom line is: Do not use raw handles in modern C++ code; wrap them in safe “RAII” boundaries provided by C++ resource manager classes.

#2: Use C++ String Classes Instead of Raw C-style Null-terminated Character Arrays

Win32 API functions usually work with C structures that represent strings using either raw C-style null-terminated character pointers, or null-terminated character arrays.

In modern C++, you can do better than that! In fact, you can use safe and convenient C++ string classes instead of working with those more basic raw C-style constructs.

For example, the MODULEENTRY32 structure used in the previous article on module enumeration has two fields that are raw C-style null-terminated WCHAR arrays: szModule and szExePath.

// Structure definition from MSDN:
// https://learn.microsoft.com/en-us/windows/win32/api/tlhelp32/ns-tlhelp32-moduleentry32w

typedef struct tagMODULEENTRY32W {
  DWORD   dwSize;
  ...

  // Null-terminated WCHAR arrays representing Unicode UTF-16
  // strings in C:
  WCHAR   szModule[MAX_MODULE_NAME32 + 1];
  WCHAR   szExePath[MAX_PATH];
} MODULEENTRY32W;

Instead of working with those, you can create instances of C++ string classes, like CString or std::wstring, and operate on those much safer and higher-level constructs made available by the C++ language and libraries:

MODULEENTRY32 moduleEntry;
...

// Create a string object storing the module name
std::wstring moduleName(moduleEntry.szModule);

// Alternatively, you can use ATL/MFC CString as well
// (different variable name, to avoid a redefinition):
CString moduleNameCStr(moduleEntry.szModule);

Once you have created string objects from those C raw character arrays, forget about the original C character arrays, and use only the C++ string objects in the rest of your modern C++ code.

C++ string classes have many advantages over pure raw C-style arrays of characters, like being easily and safely copyable. They can also be concatenated with a very simple and highly readable syntax, like using the operator+ overload (as in: s1 + s2). And they are properly freed when they go out of scope, as well.
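As a small portable sketch of this idea, the following code copies fixed-size wchar_t arrays into std::wstring objects and then works only with the string objects. FakeModuleEntry is a stand-in that I made up to mimic the two string fields of MODULEENTRY32W (with smaller sizes), and ToWString and DescribeModule are hypothetical helper names. The helper is deliberately defensive: it never reads past the end of the array, even if the null terminator is missing.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Copy a fixed-size wchar_t array into a std::wstring, stopping at
// the first null terminator or at the end of the array, whichever
// comes first.
template <std::size_t N>
std::wstring ToWString(const wchar_t (&buffer)[N])
{
    const wchar_t* const end = std::find(buffer, buffer + N, L'\0');
    return std::wstring(buffer, end);
}

// Stand-in for the string fields of MODULEENTRY32W
// (an assumption for this sketch, not the real structure)
struct FakeModuleEntry
{
    wchar_t szModule[16];
    wchar_t szExePath[32];
};

// From here on, only C++ string objects are used
inline std::wstring DescribeModule(const FakeModuleEntry& entry)
{
    const std::wstring name = ToWString(entry.szModule);
    const std::wstring path = ToWString(entry.szExePath);

    // Simple and readable concatenation with operator+
    return name + L" (" + path + L")";
}
```

The strings returned by these helpers own their memory, can be freely copied or stored in containers, and are released automatically when they go out of scope.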

#3: Use C++ Containers Like std::vector Instead of Raw C Arrays

If you take a look at MSDN examples, which are typically written in C, you’ll see lots of uses of raw C arrays to store a set of elements. Typically, the code follows this pattern:

SOME_STRUCTURE elements[MAX_COUNT];

// May have another variable representing 
// the actual number of elements stored in the array.
// This is increased when a new element is added.
int elementCount = 0;

In modern C++, you can do better than that: you can create a std::vector containing instances of the structures, and you can dynamically grow the vector, for example by adding new elements with its push_back method:

// Start with an empty vector
std::vector<ModuleInfo> loadedModules;

// When a new module is found during the enumeration, 
// add it to the vector container
loadedModules.push_back( ModuleInfo{ /* ... */ } );
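Here is a slightly more complete, portable sketch of the same idea. The fields of ModuleInfo are my own assumption for this example, and RawEntry with its plain array input is a stand-in for whatever the real enumeration loop produces; CollectModules is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Information stored for each module
// (the fields are an assumption for this sketch)
struct ModuleInfo
{
    std::wstring name;
    std::wstring path;
};

// Stand-in for the raw C-style data produced by an enumeration
struct RawEntry
{
    const wchar_t* name;
    const wchar_t* path;
};

// Build a vector that grows as entries are found: no MAX_COUNT limit
// and no separate element counter to keep in sync.
inline std::vector<ModuleInfo> CollectModules(const RawEntry* entries,
                                              std::size_t count)
{
    // Start with an empty vector
    std::vector<ModuleInfo> loadedModules;

    for (std::size_t i = 0; i < count; ++i)
    {
        // The raw C strings are copied into std::wstring objects
        loadedModules.push_back(
            ModuleInfo{ entries[i].name, entries[i].path });
    }

    return loadedModules;
}
```

The number of stored elements is always available via loadedModules.size(), so the separate elementCount variable of the C-style pattern is not needed anymore.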

Hope you find these suggestions of some interest!

[Figure: slide listing the three pieces of advice on using modern C++ at Win32 API boundaries, described in detail in the article.]

C is a great language for the “boundaries”. But you can happily switch gears to modern C++ on your own side of the boundary.

Pluralsight 50% Off Sale Extended Until 5/22

Just a heads up to let you know that Pluralsight has extended the 50% off sale until 11:59 p.m. MT on May 22! You get lots of high-quality courses at a very affordable price!


Pluralsight FREE Week!


Just a heads up to let you know that Pluralsight is running a FREE Week promo: this means that the Pluralsight platform is free for the entire week!

You can watch 7,000+ expert-led video courses for free for the entire week!

For example: Are you interested in an Introduction to Algorithms and Data Structures in C++?

Or in Getting Started with the C Programming Language?

Or would you like to learn more about Object-oriented Programming in Rust?

It’s all free for this entire week!

Enjoy learning!