C++: Preprocessor Directives

Preprocessor directives are lines which are not actually part of our application's code, but are instead instructions for the preprocessor, a tool which runs over each source file before the compiler proper and transforms the text that the compiler will see. These lines always begin with a hash symbol ('#'), and there are a number of different things that you can do with them. We will cover some of the common preprocessor directives and their uses in this tutorial.

Include

The first preprocessor directive which we will cover is one which you should already know quite a lot about! We've been using the #include directive ever since our first C++ program, but we've sort of glossed over what it does.

When the preprocessor finds an #include directive, it fetches the file specified and pastes its contents in place of the directive. In the case of <iostream>, the preprocessor looks in the directories where the compiler keeps its standard headers, and then inserts the contents of that header (on most implementations a file named simply iostream, with no '.h' extension) where the directive appeared.

Files specified using angle brackets (e.g. <iostream>) will be looked for in the include directories that the compiler has "noted down". Files specified using double quotes are typically looked for first relative to the file containing the directive (usually your project directory), and then in those same include directories if nothing is found there. As such, you can create your own custom '.h' files and include these using double quotes with the include directive; the matching '.cpp' files are normally compiled separately rather than included. Creating custom header and source files is very important to the creation process of big applications, and so it's probably worth us taking a quick detour to talk about this.

Custom Header and Source Files

Header files contain declarations for symbols such as variables, functions, and classes, which are then usually defined in related source files. Keeping code in separate header and source files has three main advantages:

Break up, organise, and modularise code.
Avoid re-compilation of code which has remained the same.
Break up work between multiple programmers on much larger projects.

Sometimes you might even want to group together some functions and classes/structs you've made into a single header and related source file so that you can share it with others -- for example if you made a set of standard functions for some 3D mathematics POD structs.

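If you create a header file (conventionally a file with the '.h' extension), and use this to declare variables and functions, then the definitions usually go in a source ('.cpp') file with the same name. The matching name is only a convention: the compiler does not pair the two files up for you. Instead, your IDE or build system compiles every source file in the project, and the linker then joins the results together. Functions can be declared without their contents being defined by simply putting a semicolon instead of curly brackets - these are often called prototypes. You might have something like the following in a header file: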

int AddTwoNumbers(int one, int two);

If this was in "numbers.h", for example, we could then give the following function code in "numbers.cpp":

int AddTwoNumbers(int one, int two)
{
	return one+two;
}

This provides a very structured and defined view of the contained symbols (in this case, just a function), which can then all be included into the main .cpp file (i.e. 'main.cpp') via a simple #include "numbers.h" if the files are in the same directory as the project. You should already know how to do the variable declarations in header files too, and class declarations without proper member function definitions can be accomplished just like the regular function declarations that were shown above. To define a member function after its basic 'prototype' declaration, you can use the scope resolution operator, ::, on the class name. So if a class named 'Car' contained the prototype for a 'void' member function called 'FillUp' which took a single 'int' parameter, that function could later be defined using void Car::FillUp(int value){ } -- this is demonstrated in the following code snippet:

class Car { //Class structure, we don't do any actual functionality programming here - may be in a header file.
public:
	void FillUp(int value); //Function prototype
};

void Car::FillUp(int value)
{
	//Actual stuff the function will do goes here - the implementation rather than the structure.
}

In the case above, of course the function implementation and structure could be defined at the same time as we've done previously, but often separating the structure and the implementation can make for a very tidy and clean class (plus, the :: (scope resolution) operator is pretty neat). I suggest trying to create some header and source files of your own and then seeing how well that works for you. As a challenge to help you get used to your IDE and the whole process, try making a struct for a custom data-type which contains two ints, and making a few functions which take this type of struct as a parameter and deal with them (add them together, or whatever). Put these purely in one external header file and related source file, and then #include these files and try using the struct and related functions in your 'main.cpp' file.

Define

After that large (yet important) tangent, let's move onto another preprocessor directive! Define is probably the second most popular preprocessor directive. In its most basic form, it provides a replacement for a certain identifier. Some constants are defined using #define, and these are usually identified using uppercase characters - take for example the following:

#define PI 3.14159

The above would define the constant 'PI' as 3.14159, and so whenever 'PI' is written elsewhere in the code, the preprocessor swaps it for 3.14159 before the compiler ever sees it. In case this didn't make it obvious, the format for #define is always: #define identifier replacement.

You can also create basic macros using #define. You can give the identifier a "function call"-esque format, and then use the parameters that you named in the identifier inside the replacement text. A common use of this is wrapping a ternary operator. To learn more about ternary operators, I'd suggest Google, however the basic format is: condition ? value_if_true : value_if_false. As such, a simple macro which finds the bigger of two numbers (let's call it "getmax") could be written as follows:

#define getmax(a,b) ((a)>(b)?(a):(b))

This function-style macro could then be used anywhere in the program to quickly and easily find the larger of two values. The parentheses around each parameter and around the whole replacement matter: a macro is plain text substitution, so without them the operators in the surrounding code can bind to parts of the macro in ways you did not intend. Note too that an argument with side effects (such as i++) may end up being evaluated twice.

Macros and "constant" definitions last from wherever they were defined until wherever they're undefined. Usually they just run on to the end of the file being compiled, however the #undef directive can be used to undefine values and macros at will - for example: #undef PI.

Conditionals

You can create if-statement type functionality that's interpreted by the preprocessor! This type of functionality is very simple and should be familiar to you from the basic if-statement functionality. So simple in fact, that I'm just going to list you the basic directives for conditionals:

#ifdef - If the value specified is defined
#ifndef - If the value specified is not defined
#if - If the specified condition is true
#elif - Else, if the specified condition is true
#else - Else
#endif - Ends the current 'if' (or 'if' block)

The implementation should be fairly obvious, but just in case you're having difficulty, it's probably worth me providing some examples. Let's say we #defined a value further up in our document for the size of an important array; we might do some basic preprocessor logic like the following:

#ifdef ARRAY_SIZE
	#if ARRAY_SIZE>1000
		#undef ARRAY_SIZE
	#endif
#endif

In the above example we're simply un-defining "ARRAY_SIZE" if it's bigger than 1000, perhaps because we decided that the application would be too slow with a size greater than 1000 or for some other reason. If ARRAY_SIZE was originally defined as a value larger than 1000, any later use of it in the program would then fail to compile, because the name is no longer defined. There is however a better way to throw errors in some situations...

Error

'Error' is the final preprocessor directive that we're going to talk about in this tutorial, and it also happens to be one of the simplest. Usually combined with the conditional directives, it simply stops compilation with an error containing the text specified. So we might want some simple functionality like the following:

#if ARRAY_SIZE>1000
	#error ARRAY_SIZE is too large (> 1000).
#endif

Tying Things Off

With most of the important directives done and dusted (there are a few more which are important, but we don't need to worry about those right now), we can finish off this tutorial by just covering the five macro names which are defined at all times. These are useful in a number of different situations, and I suggest that you tie off this tutorial for yourself by combining some of the directives we've covered with the macros below, and just seeing what extra functionality you can add to some basic programs (I suggest especially experimenting with extra header and source files, these are very important in bigger projects!). The five preset macros are as follows:

__LINE__ - Integer of the current line of code being compiled.
__FILE__ - String of the file name being compiled.
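__DATE__ - String of the date the compilation process began, in the format "Mmm dd yyyy" (e.g. "Jan  1 2024").
__TIME__ - String of the time the compilation process began, in the format "hh:mm:ss".
__cplusplus - Integer constant which all C++ compilers should define; its value identifies the version of the C++ standard in use (e.g. 199711L for C++98, 201103L for C++11).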