Programming, the ability to code ideas into existence, is no less than a superpower. Programming languages range from simple to complex and from low-level to high-level. However, compiling the written code is a common step in many high-level languages.
C is a widely popular programming language that works across different machines. It was created in 1972 and has been used for various purposes, from complex games and applications to complete operating systems like Microsoft Windows.
However, without a C compiler, the high-level code that developers wrote to create these products could not be translated into machine code. So, it is safe to say that these products would not exist without compilation! Before we delve deeper into the compilation process for C, let us understand the term in detail.
What does Compilation mean?
Most programming, or ‘coding’, is done in languages that borrow their keywords from natural languages, most commonly English.
These are called high-level programming languages. However, computers cannot execute them directly, as a processor only understands machine language, which is made up of bits (ones and zeroes). Assembly language is a human-readable notation for those machine instructions.
The Compiler is the program that transforms human-written code into machine language. Compiled languages such as C rely on one. A compiler checks the written code for errors in syntax and structure (for example, type mismatches) and only translates it to machine language if the code is valid. It cannot detect mistakes in the program's logic. This process is known as Compilation.
The compilation process consists of four significant steps:
- Preprocessing
- Compiling
- Assembling
- Linking
Let us look at each of them one by one.
1. Preprocessing
The code that a human writes is called the source code. The source code file is saved with a ‘.c’ extension. During the first phase of compilation, the source code goes to the preprocessor, which performs three essential functions on it:
- Cleaning the code
- Processing directives
- Expansion of code
One of the best practices of coding is for developers to add comments to their code so that it is easy for other programmers to understand. While these comments are helpful for humans, they are not understood by machines.
During the first phase, the preprocessor takes the source code file as input and cleans it, which simply means it removes all unnecessary elements, such as comments. This makes the code easily decipherable by machines.
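As a small illustration of this cleaning step, the sketch below places comments in a few unusual spots. The function name `count_after_cleaning` is hypothetical. The point is only that each comment is treated as whitespace before the compiler sees the code, so the function behaves as if the comments had never been written.

```c
/* A block comment like this one is stripped during preprocessing. */
int count_after_cleaning(void)
{
    int/* a comment here acts like a single space */total = 2 + 3;
    // A line comment is dropped as well, up to the end of the line.
    return total; /* the compiler only sees: return total; */
}
```

Calling `count_after_cleaning()` from `main` should give the same result as a version of the function with every comment deleted by hand.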
The next action taken by the preprocessor is to process preprocessor directives. These are simply lines within the source code file that begin with the character ‘#’. Some examples of preprocessor directives are as follows:
- #include
- #define
- #undef
- #if
- #else
The preprocessor interprets the directives and takes appropriate action. For example, if the directive #include<stdio.h> is found in the program, the preprocessor interprets the directive and substitutes it with the contents of the <stdio.h> file.
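To make the directives more concrete, here is a minimal sketch that uses several of them together. The macro names `MAX_USERS`, `VERBOSE`, and `GREETING`, and the two helper functions, are assumptions made up for this illustration and do not come from any real project.

```c
#include <stdio.h>      /* replaced by the contents of the stdio.h header */

#define MAX_USERS 10    /* hypothetical constant: MAX_USERS becomes 10 */
#define VERBOSE 1       /* hypothetical switch tested by #if below */

#if VERBOSE
#define GREETING "Hello (verbose build)"
#else
#define GREETING "Hello"
#endif

#undef VERBOSE          /* from here on, VERBOSE is no longer defined */

/* Returns the text selected by the #if / #else block above. */
const char *selected_greeting(void)
{
    return GREETING;
}

/* Returns the value that replaced MAX_USERS. */
int max_users(void)
{
    return MAX_USERS;
}
```

Changing `VERBOSE` to 0 and recompiling would make the preprocessor keep the other branch, so the compiler would never see the verbose string at all.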
Lastly, the preprocessor expands the code in the source code file by replacing each macro with its definition and inserting the contents of the included files. This expanded code is then passed to the compiler.
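The expansion step can be pictured with a function-like macro. This is only a sketch: `SQUARE` and both function names are hypothetical, chosen to show that the preprocessor substitutes text rather than calling anything.

```c
#define SQUARE(x) ((x) * (x))   /* hypothetical function-like macro */

int area_of_square(int side)
{
    /* After expansion the compiler sees: return ((side) * (side)); */
    return SQUARE(side);
}

int area_of_grown_square(int side)
{
    /* The argument is pasted in as text: ((side + 1) * (side + 1)).
       The parentheses in the macro keep the arithmetic in the right order. */
    return SQUARE(side + 1);
}
```

Without the inner parentheses in the macro definition, the second function would expand to `side + 1 * side + 1`, which is a different calculation. This is why macro bodies are usually written with generous parentheses.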
2. Compiling
As the name suggests, the compiler compiles the expanded code sent by the preprocessor into assembly language. This language is specific to the target processor and is also human-readable.
3. Assembling
The assembler converts the assembly code produced by the compiler into machine code, that is, pure binary instructions (ones and zeroes). The file that the assembler creates is called the object file.
It has the same file name as the source file, but its extension is ‘.obj’ in DOS and ‘.o’ in UNIX. For example, if the source file is named ‘Hello World.c’, the object file would be named ‘Hello World.obj’ in DOS.
4. Linking
The C programming language comes with a standard library of pre-compiled functions. Almost all programs coded in the C language use one or more of these library functions. The compiled object code of these libraries is stored in files with the ‘.lib’ extension in DOS or the ‘.a’ extension in UNIX.
During the linking phase of compilation, the object code of the source file is combined with the object code of the library functions it uses, so that every function call in the program has a definition to run.
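A minimal sketch of code that depends on the library during linking is shown below. The helper name `text_length` is hypothetical. The header only tells the compiler what `strlen` looks like, and the function's compiled body typically comes from the C library when the program is linked.

```c
#include <stddef.h>   /* size_t */
#include <string.h>   /* declares strlen; its compiled body lives in the C library */

/* Hypothetical helper: the call to strlen below is an external reference
   that is normally resolved against the C library during linking. */
size_t text_length(const char *text)
{
    return strlen(text);
}
```

For example, `text_length("Hello CodeQuotient")` counts the 18 characters of the string without the source file ever containing the code for `strlen` itself.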
In some cases, a program is split across multiple source files, one of which can call a function defined in another. In such cases, the linking phase is crucial, as it connects the call in one object file to the function's definition in the other.
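The multi-file case can be sketched as follows. For brevity the three pieces are shown in one listing, and the file names in the comments, along with `add_numbers` and `compute_total`, are hypothetical. In a real project each piece would live in its own file and be compiled into its own object file before linking.

```c
/* ---- math_utils.h (hypothetical header) ---- */
int add_numbers(int a, int b);   /* declaration only: no body here */

/* ---- main_logic.c (hypothetical): calls the function ---- */
int compute_total(void)
{
    /* The compiler only needs the declaration above to accept this call.
       Matching the call to the actual code is left to the linker. */
    return add_numbers(2, 3);
}

/* ---- math_utils.c (hypothetical): defines the function ---- */
int add_numbers(int a, int b)
{
    return a + b;
}
```

If the definition in the last part were missing, each file could still be compiled on its own, but the linking phase would be expected to fail with an unresolved reference to `add_numbers`.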
The resulting file is a fully compiled one that is ready for execution. In DOS, it has the same file name as the source file with an ‘.exe’ extension. In UNIX, it is named ‘a.out’ by default unless another name is specified.
Let us look at compilation in action via an example of a simple program that uses a preprocessor directive and ‘printf’ function.
#include <stdio.h>
int main()
{
    printf("Hello CodeQuotient")
    return 0;
}
To understand things better, here is a flow of the compilation of this program:
C program (Hello CodeQuotient.c) > Preprocessing > Expanded Source file > Compiling > Assembly Code file > Assembling > Object file (Hello CodeQuotient.obj) > Linking > Executable file (Hello CodeQuotient.exe)
Wrapping Up!
After successful compilation, the file is ready for execution. However, compilation would fail for the program above because it contains a syntax error: the semicolon is missing after the call to the ‘printf’ function.
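For reference, here is a sketch of the corrected statement. It is wrapped in a hypothetical helper named `print_greeting` so that it can be checked on its own. In the original program the same two statements sit directly inside `main`.

```c
#include <stdio.h>

/* Hypothetical helper holding the corrected body of the program. */
int print_greeting(void)
{
    printf("Hello CodeQuotient");   /* the semicolon now ends the statement */
    return 0;
}
```

With the semicolon in place, the compiler can tell where the `printf` call ends and where the `return` statement begins, so the compiling phase no longer stops with a syntax error.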