Integer overflow – Giovanni Dicanio's Blog

How to Prevent Integer Overflows in C++?

A simple solution that works fine in many situations.

Let’s consider the Sum function shown a few posts ago that exhibited bogus results due to signed integer overflow:

// Sum the 16-bit signed integers stored in the values vector
int16_t Sum(const std::vector<int16_t>& values)
{
    int16_t sum = 0;
 
    for (auto num : values)
    {
        sum += num;
    }
 
    return sum;
}

How can you fix the integer overflow problem and associated weird negative results discussed in the above blog post?

Well, one option would be to simply “step up” the integer type. In other words, instead of storing the sum into an int16_t variable, you could use a larger integer type, like int32_t or even int64_t. I mean, the maximum positive integer value that can be represented with a 64-bit (signed) integer is (2^63)-1 = 9,223,372,036,854,775,807: sounds pretty reasonable to represent the sum of some 16-bit samples, doesn’t it?

The updated code using int64_t for the cumulative sum can look like this:

// Sum the 16-bit signed integers stored in the values vector.
// The cumulative sum is stored in a 64-bit integer 
// to prevent integer overflow.
int64_t Sum(const std::vector<int16_t>& values)
{
    int64_t sum = 0;

    for (auto num : values)
    {
        sum += num;
    }

    return sum;
}

Considering the calling code of the initial blog post:

std::vector<int16_t> v{ 10, 1000, 2000, 32000 };
std::cout << Sum(v) << '\n';

this time with the int64_t fix you get the correct result of 35,010. Of course, also a simple shorter int32_t would have worked fine in this example. And, if you want to be super-safe, you can still wrap it in a convenient SafeInt<>.

C++ Mystery Revealed: Why Do I Get a Negative Integer When Adding Two Positive Integers?

Explaining some of the “magic” behind signed integer overflow weird results.

In this blog post, I showed you some “interesting” (apparently weird) results that you can get when adding signed integer numbers. In that example, you saw that, if you add two positive signed integer numbers and their sum overflows (signed integer overflow is undefined behavior in C++), you can get a negative number.

As another example, I compiled this simple C++ code with Visual Studio C++ compiler:

#include <cstdint>  // for int16_t
#include <iostream> // for std::cout

int main()
{
    using std::cout;

    int16_t a = 3000;
    int16_t b = 32000;
    int16_t c = a + b;  // -30536

    cout << " a = " << a << '\n';
    cout << " b = " << b << '\n';
    cout << " a + b = " << c << '\n';
}

and got this output, with a negative sum of -30536.

Two positive integers are added, and a negative sum is returned due to signed integer overflow. — Signed integer overflow causing a *negative sum* when adding *positive* integers

Now, you may ask: Why is that?

Well, to try to answer this question, consider the binary representations of the integer numbers from the above code:

int16_t a = 3000;  // 0000 1011 1011 1000
int16_t b = 32000; // 0111 1101 0000 0000
int16_t c = a + b; // 1000 1000 1011 1000

If you add a and b bitwise, you’ll get the binary sequence shown above for c.

Now, if you interpret the c sum’s binary sequence as a signed integer number in the Two’s complement representation, you’ll immediately see that you have a negative number! In fact, the most significant bit is set to 1, which in Two’s complement representation means that the number is negative.

int16_t c = a + b; 
    //  c: 1000 1000 1011 1000
    //     1xxx xxxx xxxx xxxx
    //     *
    //     ^--- Negative number
    //          Most significant bit = 1

In particular, if you interpret the sum’s binary sequence 1000 1000 1011 1000 in Two’s complement, you’ll get exactly the negative value of -30536 shown in the above screenshot.

Protecting Your C++ Code Against Integer Overflow Made Easy by SafeInt

Let’s discuss a cool open-source C++ library that helps you write nice and clear C++ code, but with safety checks *automatically* added under the hood.

In previous blog posts I discussed the problem of integer overflow and some subtle bugs that can be caused by that (we saw both the signed and the unsigned integer cases).

Now, consider the apparently innocent simple C++ code that sums the integer values stored in a vector:

// Sum the 16-bit signed integers stored in the values vector
int16_t Sum(const std::vector<int16_t>& values)
{
    int16_t sum = 0;
 
    for (auto num : values)
    {
        sum += num;
    }
 
    return sum;
}

As we saw, that code is subject to bogus integer overflow, and may return a negative number even if all positive integer numbers are added together!

To prevent that kind of bugs, we added a safety check before doing the cumulative sum, throwing an exception in case an integer overflow was detected. Better throwing an exception than returning a bogus result!

The checking code was:

//
// Check for integer overflow *before* doing the sum
//
if (num > 0 && sum > std::numeric_limits<int16_t>::max() - num)
{
    throw std::overflow_error("Overflow in Sum function when adding a positive number.");
}
else if (num < 0 && sum < std::numeric_limits<int16_t>::min() - num)
{
    throw std::overflow_error("Overflow in Sum function when adding a negative number.");
}

// The sum is safe
sum += num;

Of course, writing this kind of complicated check code each time there is a sum operation that could potentially overflow would be excruciatingly cumbersome, and bug prone!

It would be certainly better to write a function that performs this kind of checks, and invoke it before adding two integers. That would be certainly a huge step forward versus repeating the above code each time two integers are added.

But, in C++ we can do even better than that!

In fact, C++ offers the ability to overload operators, such as + and +=. So, we could write a kind of SafeInt class, that wraps a “raw” built-in integer type in safe boundaries, and that overloads various operators like +,+=, and so on, and transparently and automatically checks that the operations are safe, and throws an exception in case of integer overflow, instead of returning a bogus result.

This class could be actually a class template, like a SafeInt<T>, where T could be an integer type, like int, int16_t, uint16_t, and so on.

That is a great idea! But developing that code from scratch would certainly require lots of time and energy, and especially we would spend a lot of time debugging and refining it, considering various corner cases and paying attention to the various overflow conditions, and so on.

Fortunately, you don’t have to do all that work! In fact, there is an open source library that does exactly that! This library is called SafeInt. It was initially created in Microsoft Office in 2003, and is now available as open source on GitHub.

To use the SafeInt C++ library in your code, you just have to #include the SafeInt.hpp header file. Basically, the SafeInt class template behaves like a drop-in replacement for built-in integer types; it does however do all the proper integer overflow checks behind the hood of its overloaded operators.

So, considering our previous Sum function, we can make it safe simply replacing the “raw” int16_t type that holds the sum with a SafeInt<int16_t>:

#include "SafeInt.hpp"    // The SafeInt C++ library

// Sum the 16-bit integers stored in the values vector
int16_t Sum(const std::vector<int16_t>& values)
{
    // Use SafeInt to check against integer overflow
    SafeInt<int16_t> sum; // = 0; <-- automatically init to 0

    for (auto num : values)
    {
        sum += num; // <-- *automatically* checked against integer overflow!!
    }

    return sum;
}

Note how the code is basically the same clear and simple code we initially wrote! But, this time, the cumulative sum operation “sum += num” is automatically checked against integer overflow by the SafeInt’s implementation of the overloaded operator +=. The great thing is that all the checks are done automatically and under the hood by the SafeInt’s overloaded operators! You don’t have to spend time and energy writing potential bug-prone check code. It’s all done automatically and transparently. And the code looks very clear and simple, without additional “pollution” of if-else checks and throwing exceptions. This kind of (necessary) complexity is well embedded and hidden under the SafeInt’s implementation.

SafeInt by default signals errors, like integer overflow, throwing an exception of type SafeIntException, with the m_code data member set to a SafeIntError enum value that indicates the reason for the exception, like SafeIntArithmeticOverflow in case of integer overflow. The following code shows how you can capture the exception thrown by SafeInt in the above Sum function:

std::vector<int16_t> v{ 10, 1000, 2000, 0, 32000 };

try
{
    cout << Sum(v) << '\n';
}
catch (const SafeIntException& ex)
{
    if (ex.m_code == SafeIntArithmeticOverflow)
    {
        cout << "SafeInt integer overflow exception correctly caught!\n";
    }
}

Note that also other kinds of errors are checked by SafeInt, like attempts to divide by zero.

Moreover, the default SafeInt exception handler can be customized, for example to throw another exception class, like std::runtime_error, or a custom exception, instead of the default SafeIntException.

So, thanks to SafeInt it’s really easy to protect your C++ code against integer overflow (and divisions by zero) and associated subtle bugs! Just replace “raw” built-in integer types with the corresponding SafeInt<T> wrapper, and you are good to go! The code will still look nice and simple, but safety checks will happen automatically under the hood. Thank you very much SafeInt and C++ operator overloading!

Protecting Your C++ Code Against Unsigned Integer “Overflow”

Let’s explore what happens in C++ when you try to add *unsigned* integer numbers and the sum exceeds the maximum value. You’ll also see how to protect your unsigned integer sums against subtle bugs!

Last time, we discussed signed integer overflow in C++, and some associated subtle bugs, like summing a sequence of positive integer numbers, and getting a negative number as a result.

Now, let’s focus our attention on unsigned integer numbers.

As we did in the previous blog post, let’s start with an apparently simple and bug-free function, that takes a vector storing a sequence of numbers, and computes their sum. This time the numbers stored in the vector are unsigned integers of type uint16_t (i.e. 16-bit unsigned int):

#include <cstdint>  // for uint16_t
#include <vector>   // for std::vector

// Sum the 16-bit unsigned integers stored in the values vector
uint16_t Sum(const std::vector<uint16_t>& values)
{
    uint16_t sum = 0;

    for (auto num : values)
    {
        sum += num;
    }

    return sum;
}

Now, try calling the above function on the following test vector, and print out the result:

std::vector<uint16_t> v{ 10, 1000, 2000, 32000, 40000 };
std::cout << Sum(v) << '\n';

On my beloved Visual Studio 2019 C++ compiler targeting Windows 64-bit, I get the following result: 9474. Well, this time at least we got a positive number 😉 Seriously, what’s wrong with that result?

Well, if you see the sequence of the input values stored in the vector, you’ll note that the result sum is too small! For example, the vector contains the values 32000 and 40000, which are only by themselves greater than the resulting sum of 9474! I mean: this is (apparently…) nonsense. This is indeed a (subtle) bug!

Now, if you compute the sum of the above input numbers, the correct result is 75010. Unfortunately, this value is larger than the maximum (positive) integer number that can be represented with 16 bits, which is 65535.

Side Note: How can you get the maximum integer number that can be represented with the uint16_t type in C++? Simple: You can just invoke std::numeric_limits<uint16_t>::max():

cout << "Maximum value representable with uint16_t: \n";
cout << std::numeric_limits<uint16_t>::max() << '\n';

End of Side Note

So, here you basically have an integer “overflow” problem. In fact, the sum of the input uint16_t values is too big to be represented with an uint16_t.

Before moving forward, I’d like to point out that, while in C++ signed integer overflow is undefined behavior (so, basically the result you get depends on the particular C++ compiler/toolchain/architecture/even compiler switches, like GCC’s -fwrapv), unsigned integer “overflow” is well defined. Basically, what happens in the case of two unsigned integers being added together and exceeding the maximum value is the so called “wrap around“, according to the modulo operation.

To understand that with a concrete example, think of a clock. For example, if you think of the hour hand of a clock, when the hour hand points to 12, and you add 1 hour, the clock’s hour hand will point to 1; there is no “13”. Similarly, if the hour hand points to 12, and you add 3 hours, you don’t get 15, but 3. And so on. So, what happens for the clock is a wrap around after the maximum value of 12:

12 “+” 1 = 1

12 “+” 2 = 2

12 “+” 3 = 3

…

I enclosed the plus signs above in double quotes, because this is not a sum operation as we normally intend. It’s a “special” sum that wraps the result around the maximum hour value of 12.

You would get a similar “wrap around” behavior with a mechanical-style car odometer: When you reach the maximum value of 999’999 (kilometers or miles), the next kilometer or mile brings the counter back to zero.

Adding unsigned integer values follows the same logic, except that the maximum value is not 12 or 999’999, but 65535 for uint16_t. So, in this case you have:

65535 + 1 = 0

65535 + 2 = 1

65535 + 3 = 2

and so on.

You can try this simple C++ loop code to see the above concept in action:

constexpr uint16_t kU16Max = std::numeric_limits<uint16_t>::max(); // 65535
for (uint16_t i = 1; i <= 10; i++)
{
    uint16_t sum = kU16Max + i;
    std::cout << " " << kU16Max << " + " << i << " = " << sum << '\n';
}

I got the following output:

Output of the above C++ loop code, showing unsigned integer wrap around in action. — Sample C++ loop showing unsigned integer wrap around

So, unsigned integer overflow in C++ results in wrap around according to modulo operation. Considering the initial example of summing vector elements: 10, 1000, 2000, 32000, 40000, the sum of the first four elements is 35010 and fits well in the uint16_t type. But when you add to that partial sum the last element 40000, you exceed the limit of 65535. At this point, wrap around happens, and you get the final result of 9474.

How of curiosity, you may ask: Where does that “magic number” of 9474 come from?

The modulo operation comes here to the rescue! Modulo basically means dividing two integer numbers and taking the remainder as the result of the operation.

So, if you take the correct sum value of 75010 and divide it by the number of integer values that can be represented with 16 bits, which is 2^16 = 65536, and you get the remainder of that integer division, the result is 9474, which is the result returned by the above Sum function!

Now, some people like to say that in C++ there is no overflow with unsigned integers, as before the overflow happens, the modulo operation is applied with the wrap around. I think this is more like a “word war”, but the concept should be clear at this point. In any case, when the sum of two unsigned integers doesn’t fit in the given unsigned integer type, the modulo operation is applied, with a consequent wrap around of the result. The key point is that, for unsigned integers, this is well defined behavior. Anyway, this is the reason why I enclosed the word “overflow” in double quotes in the blog title, and somewhere in the blog post text as well.

Coming back to the original sum problem, independently from the mechanics of the modulo operation and wrap around of unsigned integers, the key point is that the Sum function above returned a value that is not what a user would normally expect.

So, how can you prevent that from happening?

Well, just as we saw in the previous blog post on signed integer overflow, before doing the actual partial cumulative sum, we can check that the result does not overflow. And, if it does, we can throw an exception to signal the error.

Note that, while in the case of signed integers we have to check both the positive number and negative number cases, the latter check doesn’t apply here to unsigned integers (as there are no negative unsigned integers):

#include <cstdint>    // for uint16_t
#include <limits>     // for std::numeric_limits
#include <stdexcept>  // for std::overflow_error
#include <vector>     // for std::vector

// Sum the 16-bit unsigned integers stored in the values vector.
// Throws a std::overflow_error exception on integer overflow.
uint16_t Sum(const std::vector<uint16_t>& values)
{
    uint16_t sum = 0;

    for (auto num : values)
    {
        //
        // Check for integer overflow *before* doing the sum.
        // This will prevent bogus results due to "wrap around"
        // of unsigned integers.
        //
        if (num > 0 && sum > std::numeric_limits<uint16_t>::max() - num)
        {
            throw std::overflow_error("Overflow in Sum function.");
        }

        // The sum is safe
        sum += num;
    }

    return sum;
}

If you try to invoke the above function with the initial input vector, you will see that you get an exception thrown, instead of a wrong sum returned:

std::vector<uint16_t> v{ 10, 1000, 2000, 32000, 40000 };

try
{
    std::cout << Sum(v) << '\n';
}
catch (const std::overflow_error& ex)
{
    std::cout << "Overflow exception correctly caught!\n";
    std::cout << ex.what() << '\n';
}

Next time, I’d like to introduce a library that can help writing safer integer code in C++.

Beware of Integer Overflows in Your C++ Code

Summing signed integer values on computers, with a *finite* number of bits available for representing integer numbers (16 bits, 32 bits, whatever) is not always possible, and can lead to subtle bugs. Let’s discuss that in the context of C++, and let’s see how to protect our code against those bugs.

Suppose that you are operating on signed integer values, for example: 16-bit signed integers. These may be digital audio samples representing the amplitude of a signal; but, anyway, their nature and origin is not of key importance here.

You want to operate on those 16-bit signed integers, for example: you need to sum them. So, you write a C++ function like this:

#include <cstdint>  // for int16_t
#include <vector>   // for std::vector

// Sum the 16-bit signed integers stored in the values vector
int16_t Sum(const std::vector<int16_t>& values)
{
    int16_t sum = 0;

    for (auto num : values)
    {
        sum += num;
    }

    return sum;
}

The input vector contains the 16-bit signed integer values to sum. This vector is passed using const reference (const &), as we are observing it inside the function, without modifying it.

Then we use the safe and convenient range-for loop to iterate through each number in the vector, and update the cumulative sum.

Finally, when the range-for loop is completed, the sum is returned back to the caller.

Pretty straightforward, right?

Now, try and create a test vector containing some 16-bit signed integer values, and invoke the above Sum() function on that, like this:

std::vector<int16_t> v{ 10, 1000, 2000, 32000 };
std::cout << Sum(v) << '\n';

I compiled and executed the test code using Microsoft Visual Studio 2019 C++ compiler in 64-bit mode, and the result I got was -30526: a negative number?!

Well, if you try to debug the Sum function, and execute the function’s code step by step, you’ll see that the initial partial sums are correct:

10 + 1000 = 1010

1010 + 2000 = 3010

Then, when you add the partial sum of 3010 with the last value of 32000 stored in the vector, the sum becomes a negative number.

Why is that?

Well, if you think of the 16-bit signed integer type, the maximum (positive) value that can be represented is 32767. You can get this value, for example, invoking std::numeric_limits<int16_t>::max():

cout << "Maximum value representable with int16_t: \n";
cout << std::numeric_limits<int16_t>::max() << '\n';

So, in the above sum example, when 3010 is summed with 32000, the sum exceeds the maximum value of 32767, and you hit an integer overflow.

In C++, signed integer overflow is undefined behavior. In this case of Microsoft Visual C++ 2019 compiler on Windows, we got a negative number as a sum of positive numbers, which, from a “high level” perspective is mathematically meaningless. (Actually, if you consider the binary representation of these numbers, the result kind of makes sense. But, going down to this low binary level is out of the scope of this post; moreover, in any case, from a “high-level” common mathematical perspective, summing positive integer numbers cannot lead to a negative result.)

So, how can we prevent such integer overflows to happen and cause buggy meaningless results?

Well, we could modify the above Sum function code, performing some safety checks before actually calculating the sum.

// Sum the 16-bit signed integers stored in the values vector
int16_t Sum(const std::vector<int16_t>& values)
{
    int16_t sum = 0;

    for (auto num : values)
    {
        //
        // TODO: Add safety checks here to prevent integer overflows 
        //
        sum += num;
    }

    return sum;
}

If you think about it, if you are adding two positive integer numbers, what is the condition such that their sum is representable with the same signed integer type (int16_t in this case)?

Well, the following condition must be satisfied:

a + b <= MAX

where MAX is the maximum value that can be represented in the given type: std::numeric_limits<int16_t>::max() or 32767 in our case.

In other words, the above condition expresses in mathematical terms that the sum of the two positive integer numbers a and b cannot exceed the maximum value MAX representable with the given signed integer type.

So, the overflow condition is the negation of the above condition, that is:

a + b > MAX

Of course, as we just saw above, you cannot perform the sum (a+b) on a computer if the sum value overflows! So, it seems like a snake biting its own tail, right? Well, we can fix that problem simply massaging the above condition, moving the ‘a’ quantity on the right-hand side, and changing its sign accordingly, like this:

b > MAX – a

So, the above is the overflow condition when a and b are positive integer numbers. Note that both sides of this condition can be safely evaluated, as (MAX – a) is always representable in the given type (int16_t in this example).

Now, you can do a similar reasoning for the case that both numbers are negative, and you want to protect the sum from becoming less than numeric_limits::min, which is -32768 for int16_t.

The overflow condition for summing two negative numbers is:

a + b < MIN

Which is equivalent to:

b < MIN – a

Now, let’s apply this knowledge to modify our Sum function to prevent integer overflow. We’ll basically check the overflow conditions above before doing the actual sum, and we’ll throw an exception in case of overflow, instead of producing a buggy sum value.

#include <cstdint>    // for int16_t
#include <limits>     // for std::numeric_limits
#include <stdexcept>  // for std::overflow_error
#include <vector>     // for std::vector

// Sum the 16-bit signed integers stored in the values vector.
// Throws a std::overflow_error exception on integer overflow.
int16_t Sum2(const std::vector<int16_t>& values)
{
    int16_t sum = 0;

    for (auto num : values)
    {
        //
        // Check for integer overflow *before* doing the sum
        //
        if (num > 0 && sum > std::numeric_limits<int16_t>::max() - num)
        {
            throw std::overflow_error("Overflow in Sum function when adding a positive number.");
        }
        else if (num < 0 && sum < std::numeric_limits<int16_t>::min() - num)
        {
            throw std::overflow_error("Overflow in Sum function when adding a negative number.");
        }

        // The sum is safe
        sum += num;
    }

    return sum;
}

Note that if you add two signed integer values that have different signs (so, you are basically making a subtraction of their absolute values), you can never overflow. So you might think of doing an additional check on the signs of the variables num and sum above, but I think that would be a useless complication of the above code, without any real performance benefits, so I would leave the code as is.

So, in this blog post we have discussed signed integer overflow. Next time, we’ll see the case of unsigned integers.