Marcin Jahn | Dev Notebook
  • Home
  • Programming
  • Technologies
  • Projects
  • About
  • An icon of the .NET section .NET
    • HTTPClient
    • Async
      • How Async Works
      • TAP Tips
    • Equality
    • Comparisons
    • Enumerables
    • Unit Tests
    • Generic Host
    • Logging
    • Configuration
    • Records
    • gRPC
    • Platform Invoke
    • ASP.NET Core
      • Overview
      • Middleware
      • Razor Pages
      • Routing in Razor Pages
      • Web APIs
      • Filters
      • Identity
      • Validation
      • Tips
    • Entity Framework Core
      • Overview
      • Testing
      • Tips
  • An icon of the Angular section Angular
    • Overview
    • Components
    • Directives
    • Services and DI
    • Routing
    • Observables (RxJS)
    • Forms
    • Pipes
    • HTTP
    • Modules
    • NgRx
    • Angular Universal
    • Tips
    • Standalone Components
  • An icon of the JavaScript section JavaScript
    • OOP
    • JavaScript - The Weird Parts
    • JS Functions
    • ES Modules
    • Node.js
    • Axios Tips
    • TypeScript
      • TypeScript Environment Setup
      • TypeScript Tips
    • React
      • React Routing
      • MobX
    • Advanced Vue.js Features
  • An icon of the Rust section Rust
    • Overview
    • Cargo
    • Basics
    • Ownership
    • Structures
    • Enums
    • Organization
    • Collections
    • Error Handling
    • Generics
    • Traits
    • Lifetimes
    • Closures
    • Raw Pointers
    • Smart Pointers
    • Concurrency
    • Testing
    • Tips
  • An icon of the C/C++ section C/C++
    • Structs and Classes
    • Compilation
    • Pointers
    • Strings
    • Dynamic Memory
    • argc and argv Visualization
  • An icon of the GTK section GTK
    • Overview
    • GJS
  • An icon of the CSS section CSS
    • Responsive Design
    • CSS Tips
    • CSS Pixel
  • An icon of the Unity section Unity
    • Unity
  • An icon of the Functional Programming section Functional Programming
    • Fundamentals of Functional Programming
    • .NET Functional Features
    • Signatures
    • Function Composition
    • Error Handling
    • Partial Application
    • Modularity
    • Category Theory
      • Overview
      • Monoid
      • Other Magmas
      • Functors
  • An icon of the Algorithms section Algorithms
    • Big O Notation
    • Array
    • Linked List
    • Queue
    • Hash Table and Set
    • Tree
    • Sorting
    • Searching
  • An icon of the Architecture section Architecture
    • What is architecture?
    • Domain-Driven Design
    • ASP.NET Core Projects

Compilation

Our source files pass through the following tools, in this order:

  1. Preprocessor
  2. Compiler
  3. Assembler
  4. Linker

Let’s explain the process based on a simple solution:

main.cpp:

#include "add.hpp"

int main() {
    add(2,3);
    return 0;
}

add.cpp:

int add(int a, int b) {
    return a + b;
}

add.hpp:

int add(int a, int b);

We can compile the app with:

g++ main.cpp add.cpp

# or to change the binary name
g++ main.cpp add.cpp -o program

The first command produces an executable file named a.out (the default). The second one names the executable program.

Make

As our projects grow in size, more source files will be present. All of these files need to be listed for the compiler. To simplify the process, a Makefile and the make tool should be used.

Header Files

First, it’s good to understand the purpose of header files. They contain declarations of various entities (like functions or global variables). The actual implementations (definitions) go into the .cpp/.c files (header files may also contain definitions; it’s not illegal). The implementation might change over time, which requires recompiling that source file. The header files are less likely to change, since they only contain the signatures of functions. Other files do not rely directly on the .cpp/.c files. Instead, they rely on the header files.

Header files are therefore like interfaces that are expected not to change. We are supposed to rely on them instead of on the actual implementations.

In our programs, we cannot refer to symbols that have not been declared. A symbol can be declared in the current file, or in some other file that is included into the current file. Additionally, a given entity can be defined only once in the whole program (inline functions and templates are the main exceptions, as long as every copy is identical). The C++ Standard calls this the One Definition Rule.
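The difference between a declaration and a definition can be sketched as follows. This is a minimal illustration, not the only possible layout: the two files are shown in a single listing, and the include guard name ADD_HPP and the helper twice are assumptions added for the example.

```cpp
// ---- add.hpp (sketch) ----
// Include guard: keeps the header's content from being pasted twice
// into the same translation unit. The name ADD_HPP is an assumption.
#ifndef ADD_HPP
#define ADD_HPP

// Declaration: only the signature. Declarations may be repeated.
int add(int a, int b);
int add(int a, int b);  // a second, identical declaration is legal

// A function defined in a header should be marked inline. Without it,
// every .cpp file that includes the header would emit its own definition
// and the linker would report a multiple-definition error.
// (twice is a hypothetical helper, not part of the original project.)
inline int twice(int x) { return add(x, x); }

#endif  // ADD_HPP

// ---- add.cpp (sketch) ----
// Definition: the body. A non-inline function has exactly one
// definition in the whole program (One Definition Rule).
int add(int a, int b) { return a + b; }
```

Adding a second, non-inline definition of add in another .cpp file would still compile file by file, but linking would fail, because the violation is only visible once the object files are combined.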

Preprocessor

The first step when compiling our program is preprocessing. The preprocessor handles all the lines that start with # (e.g., #include or #if) and performs macro substitution.

To get the output of the preprocessor, we can execute:

g++ -E main.cpp

Here’s what we get on stdout:

# 1 "main.cpp"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "main.cpp"
# 1 "add.hpp" 1
int add(int a, int b);
# 2 "main.cpp" 2

int main() {
 add(2,3);
 return 0;
}

If we included a common library header like iostream, our file would become huge after the preprocessing stage.

What the preprocessor did in this case was just to paste the content of add.hpp directly into main.cpp.

The preprocessed files have the .i extension for C and .ii for C++. The preprocessed source is sometimes called a Translation Unit. These files are not kept by default (the -save-temps flag keeps them), and we rarely need them.
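Besides #include, the preprocessor also expands macros and selects code with conditionals, as mentioned at the start of this section. Here is a small sketch of both. The names SQUARE, VERBOSE, kVerbose and area are made up for this illustration.

```cpp
// Function-like macro: pure text substitution done before compilation.
// The parentheses guard against operator-precedence surprises.
#define SQUARE(x) ((x) * (x))

// Conditional compilation: only one branch survives preprocessing.
// VERBOSE is a hypothetical macro; it is undefined unless the build
// passes it (e.g. with g++ -DVERBOSE).
#ifdef VERBOSE
constexpr bool kVerbose = true;
#else
constexpr bool kVerbose = false;
#endif

// After preprocessing, the body reads: return ((side + 1) * (side + 1));
constexpr int area(int side) { return SQUARE(side + 1); }

// Checked at compile time: (2 + 1) * (2 + 1) == 9.
static_assert(area(2) == 9, "macro expands with correct precedence");
```

Running g++ -E on such a file shows the macro already expanded and only the selected #ifdef branch left in the output.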

Compiler

The result of preprocessing is handed over to the Compiler. The compiler parses the code and builds a tree representation of it (such as an AST, Abstract Syntax Tree).

We can dump GCC’s internal tree representations with:

g++ -fdump-tree-all-graph -g main.cpp add.cpp

The resulting .dot files can be rendered with Graphviz.

The next step of the compilation is the generation of Assembly code. Here’s how we can generate it:

g++ -S add.cpp

Here’s the resulting add.s file:

  .arch armv8-a
  .file  "add.cpp"
  .text
  .align  2
  .global  _Z3addii
  .type  _Z3addii, %function
_Z3addii:
.LFB0:
  .cfi_startproc
  sub  sp, sp, #16
  .cfi_def_cfa_offset 16
  str  w0, [sp, 12]
  str  w1, [sp, 8]
  ldr  w1, [sp, 12]
  ldr  w0, [sp, 8]
  add  w0, w1, w0
  add  sp, sp, 16
  .cfi_def_cfa_offset 0
  ret
  .cfi_endproc
.LFE0:
  .size  _Z3addii, .-_Z3addii
  .ident  "GCC: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
  .section  .note.GNU-stack,"",@progbits

This code was generated on an ARM64 machine.

Assembler

The assembler turns the Assembly code into object files. We can generate them directly from our source code with:

g++ -c main.cpp
g++ -c add.cpp

This produces the main.o and add.o files. These are blobs of machine code. They need to be joined together (by the Linker) in the proper way to get the final executable.

We can explore what’s inside of the .o files with the objdump command:

objdump -t add.o

The result is:

add.o:     file format elf64-littleaarch64

SYMBOL TABLE:
0000000000000000 l    df *ABS*  0000000000000000 add.cpp
0000000000000000 l    d  .text  0000000000000000 .text
0000000000000000 l    d  .data  0000000000000000 .data
0000000000000000 l    d  .bss  0000000000000000 .bss
0000000000000000 l    d  .note.GNU-stack  0000000000000000 .note.GNU-stack
0000000000000000 l    d  .eh_frame  0000000000000000 .eh_frame
0000000000000000 l    d  .comment  0000000000000000 .comment
0000000000000000 g     F .text  0000000000000020 _Z3addii

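The last line lists our function add under its mangled name, _Z3addii. The g flag marks it as a global symbol, F marks it as a function, and .text is the section it lives in.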
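The name _Z3addii comes from name mangling: the compiler encodes the parameter types into the symbol name so that overloads can coexist. The sketch below assumes GCC’s usual (Itanium C++ ABI) mangling scheme. The double overload and the add_c function are hypothetical additions for illustration.

```cpp
// Two overloads share the source-level name "add". The compiler keeps
// them apart by encoding the parameter types into each symbol name.
int add(int a, int b) { return a + b; }           // symbol: _Z3addii
double add(double a, double b) { return a + b; }  // symbol: _Z3adddd

// extern "C" switches mangling off, so the symbol is plain "add_c".
// This is what makes a function callable from C code. Overloading is
// not possible for such functions.
extern "C" int add_c(int a, int b) { return a + b; }
```

A mangled name can be turned back into a readable one with the c++filt tool: c++filt _Z3addii prints add(int, int).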

In general, the .o files contain:

  • data - the actual machine instructions, parts of our program. They include references to entities defined in other translation units. These are placeholders that will be filled in by the Linker.
  • metadata - information needed by the Linker to combine the object files into an actual executable. An example is the “link” between the names of symbols and their addresses in memory.

Linker

Here’s how to link the object files into an executable:

g++ main.o add.o -o program # results in an executable named program

The Linker “glues” together the .o files. It can also link dynamic libraries (.so files) that come from “outside” of our solution (like the standard libraries). An example is the C++ standard library (libstdc++), which implements what the “iostream” header declares.

.so files

A .so shared library can be generated with gcc -shared -fPIC lib.c -o lib.so.

Here’s some information about soname.

#include "add.hpp"
#include <iostream>

int main() {
    std::cout << add(2,3) << std::endl;
    return 0;
}

In this program, I’m including “iostream” to make use of the std::cout stream object.

After compilation, we can have a look at the dynamic libraries being linked to our program:

ldd a.out

The result is:

linux-vdso.so.1 (0x0000ffff8ff98000)
libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff8fd71000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff8fbfe000)
libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff8fb54000)
/lib/ld-linux-aarch64.so.1 (0x0000ffff8ff68000)
libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff8fb30000)

Other OSs

The .so files are Linux dynamic libraries. Windows uses .dll, and macOS uses .dylib.

When compiling programs, we can explicitly specify the dynamic libraries that we want to link with the -l flag in g++ (for example, -lm links libm).

Resources

make and g++

© 2023 Marcin Jahn | Dev Notebook | All Rights Reserved. | Built with Astro.