Need for speed: C++ unity builds

Published on . Tagged with c++.
mechnical snail

As I type these words, I'm staring at the LLVM compilation screen. It has been running for an hour. What a waste of energy. I really hate long compilation times.

That's why I started using unity builds. A unity build consolidates all code into a single translation unit:

#include "file_1.cc"
#include "file_2.cc"
#include "file_3.cc"
#include "file_4.cc"
#include "file_5.cc"

How much time can I save by organizing code in this manner? I wrote a simple test where I generated 10,000 files:

#include <iostream>
#include <string>
#include <vector>
#include <memory>
#include <map>
#include <set>
#include <chrono>
#include <functional>
#include <random>
#include <fstream>
#include <thread>

// X is replaced by the file's index in each generated file.
class MyTestClassX {
public:
    void func(const std::string& str) {
        std::cout << "Hello from class MyTestClassX: " << str << std::endl;
    }
};
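The actual test script is written in Python and is not reproduced here. As a rough sketch of the generation step only, the same idea could look like this in C++17. The function name `generate_sources`, the `file_N.cc` and `unity.cc` file names, and the `MyTestClassN` class names are assumptions made for illustration.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Standard headers pulled in by every generated file (same list as the sample).
const char* const kHeaders[] = {
    "iostream", "string", "vector", "memory", "map", "set",
    "chrono", "functional", "random", "fstream", "thread"};

// Writes file_1.cc .. file_<count>.cc into `dir`, plus a unity.cc that
// includes each of them. Returns the path of unity.cc.
// Hypothetical helper, not the script used for the measurements.
fs::path generate_sources(const fs::path& dir, std::size_t count) {
    fs::create_directories(dir);
    const fs::path unity_path = dir / "unity.cc";
    std::ofstream unity(unity_path);
    for (std::size_t i = 1; i <= count; ++i) {
        const std::string id = std::to_string(i);
        const std::string file_name = "file_" + id + ".cc";
        std::ofstream src(dir / file_name);
        for (const char* header : kHeaders) {
            src << "#include <" << header << ">\n";
        }
        // Each file gets one uniquely named class so that nothing collides
        // when all files end up in the same translation unit.
        src << "\nclass MyTestClass" << id << " {\n"
            << "public:\n"
            << "    void func(const std::string& str) {\n"
            << "        std::cout << \"Hello from class MyTestClass" << id
            << ": \" << str << std::endl;\n"
            << "    }\n"
            << "};\n";
        unity << "#include \"" << file_name << "\"\n";
    }
    return unity_path;
}
```

Calling `generate_sources("src", 100)` would write `src/file_1.cc` through `src/file_100.cc` plus `src/unity.cc`. Since none of the generated files defines `main`, both build variants would have to compile to object files only (for example with `-c`).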

Then I performed different compilations using normal and unity builds. Here are the results:

  unity-builds-cmp git:(main)  python test_unity_build.py 10000
Cleaning src/ directory
Generating 10000 files...
Compiling unity build... [g++]
    DONE. The process took 20.84 seconds
Compiling unity build using... [clang++]
    DONE. The process took 10.60 seconds
Compiling normal build using 12 cores... [clang++]
    DONE. The process took 916.86 seconds

The single-threaded clang++ unity build (10.60 s) was roughly 86 times faster than the normal clang++ build (916.86 s), even though the normal build was using 12 cores. Of course, the exact results are artificial and depend very much on the files you are compiling, but the sheer difference is astonishing.

I use unity builds for my private projects, which usually do not have more than 100 files. So, here are the results for 100:

  unity-builds-cmp git:(main)  python test_unity_build.py 100
Cleaning src/ directory
Generating 100 files...
Compiling unity build... [g++]
    DONE. The process took 0.64 seconds
Compiling unity build using... [clang++]
    DONE. The process took 0.58 seconds
Compiling normal build using 12 cores... [clang++]
    DONE. The process took 7.48 seconds

About 13 times faster. This is noticeable and definitely worthwhile for me. It compiles in under one second without using precompiled headers, incremental builds, or pimpl. A full rebuild. Each time. In under one second.

So, what about the cons? Certainly, putting everything into a single translation unit can result in a big mess: names and macros that used to be private to one file now share a single scope and can collide.
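To make the mess concrete, here is a minimal sketch, assuming two hypothetical source files pasted one after another the way a unity build would see them. The names `BUFFER_SIZE`, `file_1_buffer_size`, and `file_2_buffer_size` are made up for illustration.

```cpp
// --- contents of a hypothetical file_1.cc ---
#define BUFFER_SIZE 16
int file_1_buffer_size() { return BUFFER_SIZE; }

// --- contents of a hypothetical file_2.cc ---
// Compiled on its own, this file would fall back to 1024.
// In a unity build the macro from file_1.cc is still defined here,
// so the fallback is skipped without any warning.
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 1024
#endif
int file_2_buffer_size() { return BUFFER_SIZE; }
```

In this combined unit, `file_2_buffer_size()` returns 16 instead of 1024. Louder variants of the same problem fail to compile instead: two files that each define a `static` function or an anonymous-namespace helper with the same name and signature produce a redefinition error once they share a translation unit.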

How do I deal with it?

Well, I am using it for my own projects written from scratch. It is easier to write code for unity builds from the beginning than to transform a big, complex code base into one. My ruleset: