Marcin Jahn | Dev Notebook

Compilation

Our source files go through the following tools that act on them:

  1. Preprocessor
  2. Compiler
  3. Assembler
  4. Linker

Let’s explain the process based on a simple solution:

main.cpp:

#include "add.hpp"
int main() {
add(2,3);
return 0;
}

add.cpp:

int add(int a, int b) {
return a + b;
}

add.hpp:

int add(int a, int b);

We can compile the app with:

Terminal window
g++ main.cpp add.cpp
# or to change the binary name
g++ main.cpp add.cpp -o program

The result is an executable file: a.out by default, or program when the -o program option is used.

Make

As our projects grow in size, more source files will be present, and all of them need to be listed for the compiler. To simplify the process, make and a Makefile should be used.

Header Files

First, it’s good to understand the purpose of the header files. These files include declarations of various entities (like functions or global variables). The actual implementations (definitions) of these functions go into the .cpp/.c files (although header files may also contain definitions; it’s not illegal). The implementation might change over time, which requires recompilation. The header files are less likely to change, since they only contain the signatures of functions. In other files, we’re not directly relying on the .cpp/.c files. Instead, we’re relying on the header files.

Header files are then like interfaces that are expected to not change. We are supposed to rely on them instead of on the actual implementations.

In our programs, we cannot refer to symbols that are not defined or declared. The symbol can be defined in the current file, or in some other file that is included into the current file. Additionally, a given entity can only be defined once. This is called the One Definition Rule in the C++ Standard.
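To make the difference between declaring and defining more concrete, here is a minimal single-file sketch. The names add_twice and call_count are hypothetical and exist only for this illustration. In a real project the declarations would live in a header and the definitions in a .cpp file.

```cpp
// Declarations: they only announce that these entities exist somewhere.
// Repeating a declaration is allowed.
int add(int a, int b);
int add(int a, int b);
extern int call_count; // declaration of a global variable (hypothetical name)

// A caller only needs the declarations above in order to compile.
int add_twice(int a, int b) {
    return add(a, b) + add(a, b);
}

// Definitions: exactly one of each is allowed in the whole program.
int call_count = 0;

int add(int a, int b) {
    ++call_count; // count how many times add was called
    return a + b;
}

// A second definition would violate the One Definition Rule:
// int add(int a, int b) { return a - b; } // error: redefinition of 'add'
```

Calling add_twice(2, 3) returns 10 and leaves call_count at 2, even though add_twice was compiled before the body of add was seen.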

Preprocessor

The first step when compiling our program is preprocessing. The Preprocessor handles all the lines that start with # (e.g., #include, #if, or #define with its macro substitution).
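A small sketch of the directives mentioned above might look like this. The macro and function names (BUFFER_SIZE, SQUARE, is_large_buffer, squared_size) are made up for the example.

```cpp
// Object-like macro: every later use of BUFFER_SIZE is replaced by 64.
#define BUFFER_SIZE 64

// Function-like macro: pure text substitution, hence the extra parentheses.
#define SQUARE(x) ((x) * (x))

// Conditional compilation: only one of the two branches reaches the compiler.
#if BUFFER_SIZE > 32
bool is_large_buffer() { return true; }
#else
bool is_large_buffer() { return false; }
#endif

int squared_size() {
    // After preprocessing, this line reads: return ((64) * (64));
    return SQUARE(BUFFER_SIZE);
}
```

None of the macro names survive preprocessing. The compiler only ever sees the substituted text and the branch of the #if that was selected.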

To get the output of the preprocessor, we can execute:

Terminal window
g++ -E main.cpp

Here’s what we get on stdout:

# 1 "main.cpp"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "main.cpp"
# 1 "add.hpp" 1
int add(int a, int b);
# 2 "main.cpp" 2
int main() {
add(2,3);
return 0;
}

If we included some common library like iostream, our file would become huge after the preprocessing stage.

What the preprocessor did in this case was simply paste the content of add.hpp directly into main.cpp.

The files that the preprocessor creates have the .i extension (.ii for C++ sources with GCC). They are sometimes called Translation Units. These files are not kept by default, and we rarely need them.

Compiler

The result of preprocessing is handed over to the Compiler. The Compiler analyzes the text of the code and builds a tree (like an AST, Abstract Syntax Tree).

We can dump GCC’s tree representations with:

Terminal window
g++ -fdump-tree-all-graph -g main.cpp add.cpp

The resulting .dot files can be viewed with a Graphviz-compatible viewer.
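To get a feel for what “building a tree” means, here is a toy sketch of a tree for the expression 2 + 3. This is not how GCC represents code internally. The node types (Node, Literal, Addition) and the helper build_two_plus_three are hypothetical and heavily simplified.

```cpp
#include <memory>
#include <utility>

// Hypothetical node types; real compilers use far richer representations.
struct Node {
    virtual ~Node() = default;
    virtual int evaluate() const = 0;
};

// Leaf node: an integer literal such as 2 or 3.
struct Literal : Node {
    explicit Literal(int v) : value(v) {}
    int evaluate() const override { return value; }
    int value;
};

// Inner node: the '+' operator with its two operand subtrees.
struct Addition : Node {
    Addition(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : lhs(std::move(l)), rhs(std::move(r)) {}
    int evaluate() const override { return lhs->evaluate() + rhs->evaluate(); }
    std::unique_ptr<Node> lhs;
    std::unique_ptr<Node> rhs;
};

// Builds the tree for the expression `2 + 3` by hand.
std::unique_ptr<Node> build_two_plus_three() {
    return std::make_unique<Addition>(std::make_unique<Literal>(2),
                                      std::make_unique<Literal>(3));
}
```

A real compiler would produce such a structure by parsing the source text and would then walk it to generate code, rather than evaluate it directly as this sketch does.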


The next step of the compilation is the generation of the Assembly code. Here’s how we can generate it:

Terminal window
g++ -S add.cpp

Here’s the resulting add.s file:

.arch armv8-a
.file "add.cpp"
.text
.align 2
.global _Z3addii
.type _Z3addii, %function
_Z3addii:
.LFB0:
.cfi_startproc
sub sp, sp, #16
.cfi_def_cfa_offset 16
str w0, [sp, 12]
str w1, [sp, 8]
ldr w1, [sp, 12]
ldr w0, [sp, 8]
add w0, w1, w0
add sp, sp, 16
.cfi_def_cfa_offset 0
ret
.cfi_endproc
.LFE0:
.size _Z3addii, .-_Z3addii
.ident "GCC: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
.section .note.GNU-stack,"",@progbits

This code was generated on an ARM64 machine.

Assembler

The Assembler translates the Assembly code into machine code stored in object files. We can generate object files from our source code with:

Terminal window
g++ -c main.cpp
g++ -c add.cpp

It will produce main.o and add.o files. These are blobs of machine code. They need to be joined together (via the Linker) in a proper way to have the final executable.

We can explore what’s inside of the .o files with the objdump command:

Terminal window
objdump -t add.o

The result is:

add.o: file format elf64-littleaarch64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 add.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text 0000000000000020 _Z3addii

The last line lists our function add under its mangled name, _Z3addii.
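The symbol is called _Z3addii rather than add because the C++ compiler encodes the parameter types into the name (name mangling). As a sketch, assuming GCC or Clang, the abi::__cxa_demangle function from <cxxabi.h> can turn such a name back into a readable signature. The wrapper name demangle is my own.

```cpp
#include <cstdlib>   // std::free
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang only)
#include <string>

// Turns a mangled symbol such as "_Z3addii" back into a readable signature.
// Returns an empty string if the name cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return "";
    }
    std::string result(readable);
    std::free(readable); // the returned buffer was allocated with malloc
    return result;
}
```

Here demangle("_Z3addii") gives add(int, int). On the command line, the c++filt tool or the -C flag of objdump does the same job.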

In general, the .o files contain:

  • data - the actual machine instructions, parts of our program. They contain references to entities defined in other translation units; these are placeholders that will be filled in by the Linker.
  • metadata - information needed by the Linker to combine the object files into an actual executable. An example is the “link” between names of symbols and their addresses in memory.

Linker

Here’s how to link the object files into an executable:

Terminal window
g++ main.o add.o -o program # results in an executable named program

The Linker “glues” together the .o files. It can also link dynamic libraries (.so files) that may come from “outside” of our solution (like the standard libraries). An example is libstdc++, which provides the implementation behind “iostream”.

.so files

The .so shared library might be generated with gcc -shared -fPIC lib.c -o lib.so.

Here’s some information about soname.

#include "add.hpp"
#include <iostream>
int main() {
std::cout << add(2,3) << std::endl;
return 0;
}

In this program, I’m including “iostream” to make use of the std::cout stream.

After compilation, we can have a look at the dynamic libraries being linked to our program:

Terminal window
ldd a.out

The result is:

linux-vdso.so.1 (0x0000ffff8ff98000)
libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff8fd71000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff8fbfe000)
libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff8fb54000)
/lib/ld-linux-aarch64.so.1 (0x0000ffff8ff68000)
libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff8fb30000)

Other OSs

The .so files are Linux dynamic libraries. Windows uses .dll, and macOS uses .dylib.

When compiling programs, we can specify explicitly the dynamic libraries that we want to link, with the -l flag in g++.

Resources

make and g++

© 2023 Marcin Jahn | Dev Notebook | All Rights Reserved. | Built with Astro.