A Dive to the Build Process - What Goes On When You Press the Play Button

Ju Hong Kim · April 14, 2021


In my previous post, What does the Build Team Do, I covered various topics about builds and DevOps. Today, I want to go more in depth on what goes on when you press the Run button, which I only briefly touched on in the previous post. This does get quite technical, but don’t fret: I have stopped myself from reading and writing about the topic in too much depth to avoid going down the rabbit hole.

If you prefer looking at the slides, I’ll provide them below:


What is a Build


For the sake of those who did not read my previous post, I’ll repeat some of the content I shared previously.

You may recall that when working with Java, C, or C++, you ran a build by pressing the Build, Run, or Play button

or you “compiled” your code in the terminal

But what does this actually do? As you may know, a build is simply the process of converting your source code (e.g. .c or .java) into a binary such as an object file or an executable (e.g. .o or .exe).

An executable, to put it briefly, is the ready-to-run form of a program (hence the name). It consists of instructions (i.e. machine code) made up of zeros and ones assembled in a way the CPU understands. On Unix-like operating systems, executables are commonly found under /bin, /sbin, /usr/bin, or /usr/local/bin.


Steps In A Build


Turns out, there is actually a lot that goes on in a build. Your IDE or compiler abstracts the build process. Frankly, most developers will never need to understand how your compiler works nor do they need to understand the steps of creating a build. Here’s a diagram of what occurs when you are “compiling” (building) your program:

1. Preprocessor - Expands source code

Before your source code is translated to assembly or machine code, it first goes through a program called the preprocessor. Essentially, the preprocessor rewrites your code by expanding macros and inserting the contents of included files.

The preprocessor gives you the ability to:

  • include header files
  • define object-like macros (i.e. constants) and function-like macros (more on this below)
  • compile code conditionally

For those with little or no C programming experience, here’s an example of a preprocessor macro:

#define MAX_BUFFER_SIZE 10

int main() {
    int arr[MAX_BUFFER_SIZE];
    ...

The output of the preprocessor becomes:

int main() {
    int arr[10];
    ...

Although a macro may look similar to a constant variable, macros are substituted by the preprocessor, while variables are handled by the compiler. There are some things you can do with macros that a variable cannot do, such as the example above, where the array size can be changed in one place. In C89 an array size must be a constant expression, and a variable does not count as one. For example, the code below is illegal in C89:

int max_buffer_size = 10;
int arr[max_buffer_size]; /* this is illegal */

Macros can do more than substitute constant values. The power of macros lies in their ability to control which parts of the source code get compiled (conditional directives) and to express “function”-like macros that do not have the overhead of a function call.

#ifdef DEBUG
printf("File: %s on line %d", __FILE__, __LINE__);
printf("Value of x: %d\n", x);
#endif

If the macro DEBUG is defined (e.g. by passing -DDEBUG to the preprocessor), the resulting source code will contain the block above. If not, the code above will not appear in the resulting program. This is especially important if you are working on software that supports multiple platforms (we’ll get to this later). But here’s a sneak preview (extracted from the source code of the Node.js JS runtime):

The ability to control which parts of the source code get compiled is essential in software development. Not only is it useful for removing debug code from the product, but it also lets programmers control how a particular feature is implemented depending on the target platform. Perhaps you want to take full advantage of the libraries exclusive to each platform, such as DirectX on Windows or Metal on Apple, to render graphics. You can simply use macros to split the code for Apple environments from the code for Windows environments. In addition, perhaps some features are unsupported on some platforms, or you wish to turn them off for the free version. You can simply use conditional macros to turn features on or off as you desire.

As for the “function”-like ability of macros, here’s a short example:

#define PI 3.14
#define AREA(x) (PI*(x)*(x))
double area = AREA(9.5); //area = pi * r^2

The compiler will see the resulting source code below:

double area = (3.14*(9.5)*(9.5));

Side Note: A Quick Metric on Function Overhead Performance

On a side note, if you want to see how big a performance hit function call overhead can be, see the following example:

#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define DIVISOR 3.56
#define DIV(x) ((x)/DIVISOR)

double div(uint32_t x) {
  return x / DIVISOR;
}

int main () {
  double sum = 0;
  clock_t begin = clock();
  for (uint32_t i = 0; i < UINT32_MAX; i++) {
    #ifndef DIV_MACRO
    sum += div(i);
    #else
    sum += DIV(i);
    #endif
  }
  clock_t end = clock();
  double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
  printf("%f", time_spent);
  return 0;
}

To measure the overhead of a function call, I wanted a test where the function gets called many times (hence why the program loops UINT32_MAX times) and performs a somewhat expensive operation (floating-point division) so that the results are very noticeable.

The program was compiled with no optimization (i.e. -O0) because with optimization enabled the compiler can remove the need to create a new stack frame, which would defeat the purpose of demonstrating the penalty of calling a function. You can verify this by looking at the resulting assembly code (i.e. -S) or by running gprof (a profiling tool) with optimization enabled. You’ll see that the number of stack frames created for div drops to 0. Meanwhile, with no optimization, the function div gets called all 4294967295 times. Your compilers are very good at optimization.

Here are the results in seconds on my Raspberry Pi 3B, taking the average of 3 runs:

== WITH MACRO ==
Optimization 0: 143.475339

== NO MACRO - FUNCTION CALLS ==
Optimization 0: 193.950554666667

As you can see, there is a noticeable performance impact when we choose not to use a macro.


Include Directives

In software development, it is normal to break programs down into many parts. This is particularly useful when specific parts of the code are used frequently by many other components. That’s where libraries come in. There’s no point in pasting the same code into various files or re-implementing code that already exists. Libraries make it easy to reuse and update code. If a particular library has a bug, you can simply fix the code that lives in one place or download the latest version. Your compiler may not stop you from calling a function that is neither declared (e.g. via stdio.h) nor defined in the same file, but if no definition can be found at link time you’ll encounter a linker error (which I’ll get to later on):

/tmp/ccVUd15N.o: In function `main':
test.c:(.text+0xf): undefined reference to `print'
collect2: error: ld returned 1 exit status

Let’s look at an example to see what your preprocessor does when you include a header file with the following source code:

#include <stdio.h>
int main() {
  printf("Hello World\n");
  return 0;
}

If we run only the preprocessor (i.e. gcc -E file.c), the source code expands from 6 lines to 720 lines, or from a 75 B file to a 15.6 KB file. The output takes up 208 times more space. So what is going on?

An include directive inserts the contents of a header file into the source code. So in the example above, most of the output comes from expanding the contents of the stdio.h header. That is why you see lines such as these in the resulting source code that gets passed to the compiler:

extern int printf (const char *__restrict __format, ...);
extern int fseek (FILE *__stream, long int __off, int __whence);

A good example to visualize what is going on is expanding the following:

#include "include.h"

#include "declare_main.h"
#include "open_bracket.h"
#include "print_hello.h"
#include "close_bracket.h"

where each file contains a line of C code:

#include <stdio.h>
void main()
{
  printf("Hello World\n");
}

I find this to be a good example because each header file contains a single line of code which makes no sense on its own. The code is only complete when you bring all the header files together. I do understand this is not how you should use header files, but it’s an extreme example to visualize how include directives work.

A gif showing you the preprocessor madness

2. Compile & Assembler

The job of the compilation and assembly stage is to create an object file, which contains machine code but is not complete. The resulting object file is not executable. If we try to execute the object file, we get the following error:

Example:

$ gcc -c test.c -o test.o
$ chmod +x test.o
$ ./test.o
-bash: ./test.o: cannot execute binary file: Exec format error 

The compiler first translates the given high-level source code into assembly, which is passed on to the assembler to be translated into machine language. Typically, each assembly instruction corresponds to exactly one machine code instruction, unlike a high-level language, where a single statement may need many assembly instructions. An object file typically isn’t complete because it contains references to symbols (i.e. variables and functions) that are not defined in the file itself, and it doesn’t have everything the program needs to be executable. To make the program executable, the object files must be linked, which is the linker’s job.

Note: To replicate the compile and assembling stage, run gcc with -c option

Note: To produce the assembly code, run gcc with -S option. You can use as to assemble the assembly code into binary.

3. Linker

A linker is a program that takes one or more object files and data and combines them into an executable file, a library file, or another object file. During compilation (the previous stage), the compiler will not complain if the code references symbols or functions that are not defined in the file, as long as they are declared. To illustrate, here’s an example:

main.c

void honk();
void meow();
int main() {
    honk();
    meow();
    return 0;
}

where main.c contains calls to the functions honk and meow, which are not defined in the file itself. So how does main.c know about the two functions? The simple answer is by declaring to the compiler that the functions exist. The ability to call functions defined in other files is common in programming (run readelf -a main.o and you’ll see that meow and honk are listed in the relocation section, which I probably won’t explain here).

A linker resolves the missing references to symbols and functions not defined in a file by linking them with other files that do contain the implementation. In software development, we like to break programs into many smaller parts (especially in object-oriented programming) and use functions that are defined in other files. Therefore, to compile code referencing symbols or functions defined elsewhere, you need to declare them (provide a signature) so that the build does not break, as seen in main.c. In main.c, the first two lines, void honk(); and void meow();, declare to the compiler that the functions exist. Even something as simple as printf from the standard input and output library has to be declared in your source code. Although you may never have explicitly declared printf in your programs, including the standard input/output header (i.e. stdio.h) at the start of your source code brings in the declaration of printf and of all the other functions the standard I/O library provides. This is done by the preprocessor, which I explained earlier.

One of the benefits of linkers is that they allow compilation to be done separately or in smaller manageable parts. A change in one module or file does not require recompilation of the entire project. All that is needed is to recompile and relink a single or a small set of modules. This is a time saving feature.

a diagram showing you code becomes object that goes to linker to output an executable

It may be confusing, but although gcc is described as a C and C++ compiler, it normally performs all the steps of the build, from preprocessing to linking object files. Here’s the description of gcc from the man pages:

When you invoke GCC, it normally does preprocessing, compilation, assembly and linking. 


Demo


Let’s go over to the terminal and work on building a program step by step. We have the following 3 files:

main.c

void honk();
void meow();
int main() {
    honk();
    meow();
    return 0;
}

goose.c

#include <stdio.h>
#define PI 3.14
#define Square(x) ((x)*(x))
void honk() {
    double area = PI * Square(9); //area = pi * r^2
    printf("The Goose Honks the area of the circular pond: %.2f m^2\n", area);
}

cat.c

#include <stdio.h>
void meow() {
    printf("The Cat Meows at its friend\n");
}

Here’s a diagram, please ignore the fact the diagram may not be a proper UML diagram (I forgot how it works).

Step 1 - Preprocessing

The first step in the build process is preprocessing. To see what the compiler (gcc in our case) sees, we can run gcc -E <file.c>.

If we were to run the preprocessor for the file goose.c, we can see the following:

# 4 "goose.c"
extern int printf (const char *__restrict __format, ...);
void honk() {
  double area = 3.14 * ((9)*(9));
  printf("The Goose Honks the area of the circular pond: %.2f m^2\n", area);
}

(this is just a snippet of the output from the preprocessor and not the entire output)

As expected, all the macros have been expanded and substituted, and the declaration of printf has been pulled in to tell the compiler that the function exists.

a gif demoing running the preprocessor

Step 2 - Compilation and Assembler

Once the source files have been preprocessed such that all external symbols have been declared and all macros have been expanded or evaluated, we need to compile the code to produce the assembly code for our source files. This can be done using the -S option in gcc. Let’s look at the output of compiling main.c (file: main.s):

        .file   "main.c"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $0, %eax
        call    honk
        movl    $0, %eax
        call    meow
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (GNU) 10.2.1 20201125 (Red Hat 10.2.1-9)"
        .section        .note.GNU-stack,"",@progbits

Notice how there are calls to meow and honk but no such labels anywhere in the code. The linker will resolve this issue in a moment (though only after we assemble the code from assembly to binary).


Although assembly is an extremely low-level language, our CPU does not understand ASCII characters. We need to translate the resulting assembly code into ones and zeroes. Luckily, assembly is a mnemonic (symbolic) language, meaning that each instruction in assembly typically has a corresponding opcode (binary instruction). So we need to run the assembler to transform our assembly code into machine language. This can be done using the assembler as. But let’s just use the -c option in gcc to do all the steps we covered so far: the preprocessing, compilation, and assembly stages.

$ gcc -c main.c
$ gcc -c cat.c
$ gcc -c goose.c

If we were to look at the symbols for meow and honk for our newly created object file of main.c (file: main.o) using readelf, we’ll see the following:

$ readelf --syms main.o | grep -E "honk|meow"
     9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND honk 
    10: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND meow 

Notice that UND appears in our object file. The object file has absolutely no clue where honk and meow are defined because they are defined in different object files. This brings us to the most interesting topic in the build: linkers.

Recall how I mentioned that object files are not executable. To clarify, there are three main types of object files:

  1. Relocatable object files

  2. Executable object files

  3. Shared object files

The object files main.o, cat.o, and goose.o are not executable object files, as I have repeated many times, but relocatable object files. A relocatable object file contains binary code and data structured in a way that lets it be combined with other relocatable object files to create an executable object file. Let’s look at one of the relocatable files:

$ file main.o
main.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

Notice how the file utility tells us that main.o is a relocatable object file. (It also gives us other information, such as LSB, which stands for least significant byte first, i.e. little endian, and the x86-64 architecture.)

Here’s a typical ELF (The standard Executable & Linkable Format Linux uses) relocatable format:

We can see the same in our newly created object files as well using objdump:

$ objdump -x main.o

main.o:     file format elf64-x86-64
main.o
architecture: i386:x86-64, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x0000000000000000

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         0000001f  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000000  0000000000000000  0000000000000000  0000005f  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  0000000000000000  0000000000000000  0000005f  2**0
                  ALLOC
  3 .comment      0000002f  0000000000000000  0000000000000000  0000005f  2**0
                  CONTENTS, READONLY
  4 .note.GNU-stack 00000000  0000000000000000  0000000000000000  0000008e  2**0
                  CONTENTS, READONLY
  5 .eh_frame     00000038  0000000000000000  0000000000000000  00000090  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l    df *ABS*    0000000000000000 main.c
0000000000000000 l    d  .text    0000000000000000 .text
0000000000000000 l    d  .data    0000000000000000 .data
0000000000000000 l    d  .bss    0000000000000000 .bss
0000000000000000 l    d  .note.GNU-stack    0000000000000000 .note.GNU-stack
0000000000000000 l    d  .eh_frame    0000000000000000 .eh_frame
0000000000000000 l    d  .comment    0000000000000000 .comment
0000000000000000 g     F .text    000000000000001f main
0000000000000000         *UND*    0000000000000000 honk
0000000000000000         *UND*    0000000000000000 meow


RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE 
000000000000000a R_X86_64_PLT32    honk-0x0000000000000004
0000000000000014 R_X86_64_PLT32    meow-0x0000000000000004


RELOCATION RECORDS FOR [.eh_frame]:
OFFSET           TYPE              VALUE 
0000000000000020 R_X86_64_PC32     .text

Step 3 - Linking

As I stated earlier, producing the object files for the project does not mean you can execute the program. Although the object files contain instructions that your CPU could understand, they are not complete. Object files have undefined references to symbols and functions that are not defined within the files themselves. To illustrate, imagine you are running an errand to deliver a package to a person’s house by bus. However, you are not given the address of the house, you don’t know who the receiver is, and you don’t have the package with you. You cannot complete the task with that information missing. That’s where the linker comes in handy. The information you need to perform some action in software development is often defined elsewhere. The linker combines all the object files and static libraries, performs symbol resolution, and packages them into one executable file.

We will be using ld to link our object files to relocate data and tie up symbol references.

The basic idea is to link all the object files generated for this project like so:

ld cat.o goose.o main.o -o prog

Unfortunately, it’s a lot more complex than that. We also need to link other files which I won’t go into. The actual command is given below:

ld /usr/lib64/crti.o /usr/lib64/crtn.o /usr/lib64/crt1.o /usr/lib64/libc.so cat.o goose.o main.o -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o prog

(Thanks to ZeZNiQ for outlining the crt files needed to execute on amd64 (x86_64))

Once the files have been compiled, assembled, and linked, we can execute the program:

$ ./prog
The Goose Honks the area of the circular pond: 254.34 m^2
The Cat Meows at its friend

Let’s inspect the executable itself using objdump:

If we were to look at the code for main, we see the following:

0000000000401177 <main>:
  401177:       55                      push   %rbp
  401178:       48 89 e5                mov    %rsp,%rbp
  40117b:       b8 00 00 00 00          mov    $0x0,%eax
  401180:       e8 c2 ff ff ff          callq  401147 <honk>
  401185:       b8 00 00 00 00          mov    $0x0,%eax
  40118a:       e8 a7 ff ff ff          callq  401136 <meow>
  40118f:       b8 00 00 00 00          mov    $0x0,%eax
  401194:       5d                      pop    %rbp

Notice how our executable has an address associated with the meow and honk function calls. The linker is responsible for symbol resolution and relocation. In the object files, the code and data sections start at address 0. But when the linker combines various object files, it has to do two things: (1) associate a memory location with each symbol definition and (2) modify all references to those symbols to point to the associated memory location. Anyhow, linkers are complex and would span a few blog posts. I still don’t have a full picture of how linkers work. If you want to learn more, I would suggest reading a lecture I found online that gives a nice explanation of the build process in more depth: Linking. For a full series of blog posts on linkers, try reading the Linkers blog series from the creator of the gold linker himself.


Build Definition - Summary


  • build is the process of converting source code into an executable

  • builds have a few steps such as compilation and linking

  • compilation is the process of converting source code into machine code saved in the form of an object file

  • linking is the process of combining object files into an executable


Quick Overview of Static and Dynamic Libraries


As stated earlier, it’s very normal for different programs and projects to share the same functions in software development. Notable examples are printf, sqrt, cos, and fopen. Libraries allow programmers to reuse common functions across different projects and programs. This saves development time and prevents reinventing the wheel. It also provides a consistent and reliable set of functions with defined behavior that can easily be updated, either by updating the code or by downloading the latest version of the library. Some examples of libraries beyond the C standard library that you may be familiar with are:

  • DirectX SDK (Microsoft’s Multimedia API used frequently by Videogame programmers)
  • Asio (C++ library for network and low-level I/O programming)
  • Boost in C++
  • GTK (GNOME GUI library) or Qt (another GUI library)

Static Libraries


Static Libraries are libraries that are linked during the build process and not during runtime. Static libraries have the extension of .a and are also known as archives. They are known as archives because they are a collection of object files.

Note: The image is incorrect. An archiver creates the static library and not the linker.

For instance, let’s say I generated an archive file named libanimal.a which is an archive that contains cat.o and goose.o. I can simply link it to a program like so:

gcc main.o -L. -lanimal -o prog

This is much easier than typing all the files that composed the static library such as the below:

gcc cat.o goose.o main.o -o prog

You can easily see how scalable this becomes when you are working on larger projects. But that’s not the entire point of static libraries.

Benefits

  • If multiple components or projects use the same set of libraries, no need to recompile
  • Saves compilation time, compile once and link it each time
  • makes distributing object files much easier (since it’s like a zip file where it combines various files into one)

You can also easily distribute your code to your clients and friends saving them the trouble of compiling the code and somewhat protect your code from others by not giving them the source code.

To see the object files that make up the static library, use the tool ar with the option -t:

$ ar -t libanimal.a
cat.o
goose.o

Dynamic Libraries


Dynamic libraries (also known as shared libraries) are linked at runtime, whenever we run the program. This differs from static libraries, which get linked during the build process. Dynamic libraries have the extension *.so on UNIX/Linux systems and *.dll on Windows.

Based on the name dynamic library and shared library we can infer the following:

  • dynamic libraries are loaded and linked during RUNTIME
  • shared library: means multiple programs share the same library in memory by linking to it

Benefits

  • Executable is smaller because the library code isn’t in the executable itself (i.e. the code for the libraries is stored on the computer at some common location such as /usr/lib)
  • All programs using the library just need to link to the definition loaded in memory
  • Easy to update if a bug exists

You can view the dependencies programs have by running ldd. It’s a useful tool to debug weird program behaviors:

$ ldd /bin/ls
	linux-vdso.so.1 (0x00007fff41141000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f11d1688000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f11d1482000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f11d10bf000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f11d0e3b000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f11d0c37000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f11d1ad5000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f11d0a17000)

Static vs. Dynamic Libraries


Cons of Static Libraries:

  • generate bigger executable files since library is integrated inside executable
  • less memory efficient - each program has their own copy of the library
  • Not easy to update - Requires all programs to update their copy of the library if a bug exists

Cons of Dynamic Libraries:

  • Since many programs link to the same library, compatibility issues can arise if the library gets updated or removed
  • Programs can be extremely annoying to install, since they fail to run when a required library is missing
    An error explaining that it failed to execute because it's unable to locate a dll

Image Credits (Not exhaustive)


  • Olivier Pinçon & Sébastien Granjoux - http://library.gnome.org/users/anjuta-build-tutorial/2.26/build-gcc.html.en, GFDL
