Structure Padding in C++

Patrik Weiskircher

Structs and classes are a fundamental part of C++ and are used to group related data together. This article will look at one aspect of them that’s often overlooked: memory padding. Understanding how data is laid out in memory can help you write more efficient code and optimize the performance of your programs.

Because padding and memory layout work the same way for structs and classes, this post uses the term struct, but everything here applies to classes as well.

What Is a Struct?

A struct is a user-defined data type that allows you to combine data into one group. It looks like this:

#include <string>

struct name {
    std::string first_name;
    std::string last_name;
};

The example above defines a struct called name that contains two members: first_name and last_name. The members of a struct can be of any data type, including other structs or classes. For example, you can use the struct like this:

#include <iostream>

// A function that prints out the members of the struct.
// The parameter is called person so that it doesn't shadow the type name.
void print_name(const name& person) {
    std::cout << "First name: " << person.first_name << std::endl;
    std::cout << "Last name: " << person.last_name << std::endl;
}

int main(int argc, char **argv) {
    // Create an instance of the struct and initialize its members.
    name my_name{.first_name = "Patrik", .last_name = "Weiskircher"};
    // Call the function with the struct as an argument.
    print_name(my_name);
}

There are many ways to initialize and pass a struct around (this post uses C++20’s designated initializers), but that’s not the focus of this article. You can find more information about structs in the C++ documentation.

Memory Padding

Every object occupies bytes in memory. For example, a uint32_t takes up four bytes. Exactly where those bytes are placed depends on many rules, most of which are specific to the compiler and platform. We won’t go into much detail here, since that could fill many blog posts, but one thing that’s important to know about is padding.

In structs, padding is unused space that the compiler adds between the members, and sometimes after the last one, so that the CPU can access the data efficiently. Here’s an example:

struct my_second_struct {
    // bool == 1 byte
    bool first_flag;
    // bool == 1 byte
    bool second_flag;
    // uint32_t == 4 bytes
    uint32_t first_value;
};

The struct above has two Booleans and one 32-bit integer. Together, they take up six bytes. But if padding and memory layout in structs were this easy, we wouldn’t need to read a blog post about it. Let’s see how big this struct really is:

#include <iostream>
#include <cstdint>

struct my_second_struct {
    bool first_flag;
    bool second_flag;
    uint32_t first_value;
};

int main(int argc, char **argv) {
    std::cout << sizeof(my_second_struct) << std::endl;
}

Put this in a file and compile it. On my Apple silicon Mac, it looks like this:

$ clang++ -o padding padding.cpp
$ ./padding
8

The result is eight bytes. Why’s that? Most types have an alignment requirement: a uint32_t, for example, typically has to start at an offset that’s a multiple of four, because the CPU reads data most efficiently in chunks of that size. To satisfy this requirement, the compiler adds padding between the members. In this case, it adds two bytes of padding after the two Boolean members so that the 32-bit integer starts at offset 4.

[Figure: padding graph]

This can also be seen with the clangd plugin in Visual Studio Code.

[Screenshot: clangd showing padding]

Why Should You Know About Padding?

Padding is important to understand because it can affect the size of your data structures and how they’re laid out in memory. This can have an impact on the performance of your program, especially if you’re working with large data structures or need to optimize memory usage.

As an example, say you’re tasked with adding another Boolean to the struct above. Let’s call it bool very_very_important. You now have to decide where to add it. I personally like to add new members at the bottom unless they’re related to existing ones. That would result in this:

struct my_third_struct {
    bool first_flag;
    bool second_flag;
    uint32_t first_value;
    bool very_very_important;
};

How big is the struct now? Let’s look using Visual Studio Code again.

[Screenshot: clangd showing the new size]

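It’s now 12 bytes! The compiler added three bytes of padding after the very_very_important member so that the size of the struct is a multiple of its four-byte alignment, which keeps every element of an array properly aligned. Now imagine this struct is used in an array with 1 million elements. That’s 12 million bytes, of which 3 million are trailing padding and 5 million are padding in total!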

But there’s a better way:

struct my_third_struct {
    bool first_flag;
    bool second_flag;
    bool very_very_important;
    uint32_t first_value;
};

If you add the Boolean next to the other Booleans, where there’s already padding, you get a new Boolean for free: the struct is still eight bytes, with a single byte of padding left before first_value.

[Screenshot: clangd showing the optimized size]

Conclusion

Padding affects both the size and the layout of your data structures. By being aware of where the compiler inserts it, you can make better decisions about how to order the members of your structs, which reduces memory usage and can improve the performance of your programs.

Author
Patrik Weiskircher Core Team Lead

Patrik is the team lead of the Core Team, which oversees the shared codebase between our products. He knows far too many things about PDFs — ask him about fonts!
