The Allegro Wiki is migrating to github at https://github.com/liballeg/allegro_wiki/wiki

Header file

Latest revision as of 06:59, October 1, 2011

Introduction

The concept of a header file is a rather difficult one to grasp given the way that C and C++ are typically taught. We often get asked about how they work and often have to guide people into using them properly. Hopefully this will help to save us duplication of effort.

Declaration vs. Definition

Declarations are things that don't really exist within the final program. They tell the compiler that something exists and what it looks like: data types (enums, classes, structs, and typedefs), function signatures, and extern variable declarations. Definitions are things that actually exist in the program: variables and function bodies.

/***** Examples of declarations: *****/

enum Enum1 { Enum1_Foo, Enum1_Bar };

void func1(void);

Enum1 func2(int, int);

struct Struct1;

struct Struct1
{
    Enum1 e1;

    void foo(void);
    Enum1 bar(int, int);
};

class Class1;

class Class1
{
    static char c1;

    Struct1 s1[2];

    void foo(void);
    Struct1 bar(Enum1);
};

typedef Class1 Class2;

extern int n1;


/***** Examples of definitions: *****/

Enum1 e1;

void func1(void)
{
}

Enum1 func2(int a, int b)
{
    return a > b ? Enum1_Foo : Enum1_Bar;
}

void Struct1::foo(void)
{
}

Enum1 Struct1::bar(int a, int b)
{
    return a > b ? Enum1_Foo : this->e1;
}

Struct1 s1;

char Class1::c1 = 'c';

void Class1::foo(void)
{
}

Struct1 Class1::bar(Enum1 e1)
{
    Struct1 s1;

    s1.e1 = e1;

    this->s1[e1] = s1;

    return s1;
}

Class1 c1;

Class2 c2;

int n1;

Build Stages

When you build a C or C++ program the code goes through a series of stages from text code to machine code. The key to understanding the role of header files is understanding this build process. The basic steps are as follows.

Preprocessing

First, the source file (.c, .cc, or .cpp) is passed through the C preprocessor. This is responsible for processing all of the preprocessor directives, which are the lines whose first non-whitespace character is a pound- or number-sign (#). Examples are #include and #ifndef. The preprocessor removes these lines from the code stream and interprets them in a special way. The #include directive copies the contents of a header file into a source file before compilation. The #ifndef directive is a conditional that operates on the text that follows until an #elif, #else, or #endif is encountered. Together, these directives include or exclude code in the code stream. They control what code the compiler sees.
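As a rough illustration of how conditional directives decide what the compiler sees, here is a minimal sketch. The macro names (USE_FAST_PATH, NOT_DEFINED_ANYWHERE) and function names are made up for this example and do not come from any real codebase.

```cpp
// Hypothetical macro, defined only for this sketch.
#define USE_FAST_PATH

#ifndef USE_FAST_PATH
// USE_FAST_PATH is defined above, so the preprocessor drops this branch.
// The compiler never sees this version of chosen_path().
constexpr int chosen_path() { return 0; }
#else
// Only this definition is left in the code stream.
constexpr int chosen_path() { return 1; }
#endif

#ifndef NOT_DEFINED_ANYWHERE
// This macro was never defined, so the text up to #endif is kept.
constexpr int guarded_value() { return 42; }
#endif
```

If both versions of chosen_path() had reached the compiler, it would have complained about a redefinition. Because the preprocessor removed one branch first, the compiler only ever sees a single function.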

Compiling

Now that all of the preprocessor directives have been processed and removed from the source file, we are ready to compile it. The compiler reads through all of the declarations and definitions and produces an object file (.o) from them. (This step can also be done entirely in memory, i.e., no file is produced.) Only definitions are included in the object file. The declarations are needed for the compiler to know that something exists, because each source file is processed separately. For example, if a.c defines a function, func1, and you want to call it from within b.c, then inside of b.c you need to tell the compiler that func1 exists. Otherwise it won't know and you'll get an error. You can do this manually:

// a.c

void func1(void)
{
}

// b.c

void func1(void);

void func2(void)
{
    func1();
}

Or you can declare func1 in a header file, a.h, and include it in the b.c source file.

// a.h

void func1(void);

// a.c

void func1(void)
{
}

// b.c

#include "a.h"

void func2(void)
{
    func1();
}

Now imagine that you have tens or hundreds of things defined in a.c and you need to use them in b.c and c.c. You could declare them in both b.c and c.c, or you could declare them once in a.h and include that in b.c and c.c. That is the basic idea behind header files.
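To make the copy-and-paste nature of #include more concrete, the sketch below shows roughly what the compiler sees for b.c once the preprocessor has replaced the #include line with the text of a.h. The two source files are merged into one listing only so that the sketch builds on its own, and the func1_calls counter is a hypothetical addition that is not part of the original example.

```cpp
// --- b.c after preprocessing ---
// The #include "a.h" line is gone; the text of a.h sits in its place.
void func1(void);   // this declaration came from a.h

void func2(void)
{
    func1();        // compiles because the declaration above is visible
}

// --- a.c ---
// Normally a separate source file that is compiled on its own.
int func1_calls = 0;   // hypothetical counter, added so the call is observable

void func1(void)
{
    ++func1_calls;
}
```

Note that func2 is compiled before the compiler has seen the body of func1. The declaration alone is enough at this stage; finding the actual definition is left to the linker.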

Linking

The linking stage happens at the end with all of the object files (.o). It is a process of stitching the object files together into a single file, either a library or a program. In addition to the object files, you also link with external libraries. These are your archives (.a on Unix-likes, .lib on Windows), dynamic-link libraries (.dll on Windows), and shared objects (.so on Unix-likes). (There may be others too. Linux kernel modules are a type of library too, I think, but I'm not overly familiar with them.) The linker is responsible for taking all of your object files and libraries and putting them together into another library or your program (e.g., game). One of its jobs in doing so is making sure that a particular thing is only ever defined one time. For example, you can only have one func1(void), Class1::bar(Enum1), or c2. If the linker finds more than one then it will report an error.
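The difference between repeating a declaration (harmless) and repeating a definition (a linker error) can be sketched as follows. The file names counter.h and counter.cpp and the names counter and next_id are hypothetical, and everything is shown in one listing only so that the sketch is self-contained.

```cpp
// --- contents of a hypothetical counter.h ---
extern int counter;   // declaration only: no storage is reserved here
int next_id();        // declaration only: no function body

// Seeing the same declarations again (for example, because a second
// header pulled in counter.h too) is allowed.
extern int counter;
int next_id();

// --- contents of a hypothetical counter.cpp ---
int counter = 0;      // the one and only definition of counter

int next_id()         // the one and only definition of next_id
{
    return ++counter;
}

// If another source file also contained `int counter = 0;`, each file
// would compile on its own, but the linker would find two definitions
// of counter and refuse to build the program.
```

This is why header files normally hold declarations and leave the definitions to exactly one source file.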

Header-Guards

There's one more thing for you to understand in order to use header files properly. The compiler requires declarations to not conflict with one another. For example, you can't define Enum1 or Class1 twice within the same compilation unit, even if both copies are identical. However, header files can be included by other header files, and the whole point of header files is to be reused! Unfortunately, that can lead to redeclaration errors if the same header file is encountered more than once within the same compilation unit. There is a solution, however: header-guards! Header-guards ensure that each source file only ever sees the contents of a header file once, so you'll never redeclare something. A header-guard looks like this:

// c.h

#ifndef C_H
    #define C_H

// Header file contents here...

#endif

The header-guard should surround the entire file and consists of three preprocessor directives. The first is an #ifndef, which tests for the nonexistence of a macro that is named after the header file (and is hopefully unique throughout your entire codebase). In other words, it tests that the macro hasn't been defined. Immediately after the #ifndef directive you should have a #define directive to define that same macro that you checked for. The #ifndef is terminated by an #endif directive at the end of the header file. As a result, the first time the header file is included within a compilation unit the #ifndef check will succeed because the macro won't be defined yet. Immediately after the check, the macro is defined, so the next time the header file is included within that compilation unit the macro will be defined, the #ifndef check will fail, and the header file contents will be skipped.

A few notes on the header-guard macro:

  • It should be in all uppercase (as with other macros), and of course, cannot contain spaces or punctuation, so any non-alphanumeric characters should be substituted with underscores (a.h => A_H).
  • While most people limit the name to the name of the file, I prefer to take a more cautious approach and include a "namespace" prefix to be extra safe (to avoid conflicts with other people's code). This namespace can be the application name (pacman/include/a.h => PACMAN_A_H) or an actual namespace within the application (pacman/include/networking/a.h => PACMAN_NETWORKING_A_H).
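To see the guard at work, the sketch below pastes the contents of a hypothetical header, pacman/include/point.h, into the same compilation unit twice, which is what the preprocessor effectively does when a header is included twice. The Point struct and the point_sum function are invented for this example; the guard macro uses the "namespace" prefix convention described above.

```cpp
// --- first inclusion of the hypothetical point.h ---
#ifndef PACMAN_POINT_H      // not defined yet, so the check succeeds
#define PACMAN_POINT_H      // defined now, for the rest of this compilation unit

struct Point
{
    int x;
    int y;
};

#endif

// --- second inclusion of the same header ---
#ifndef PACMAN_POINT_H      // already defined, so the check fails
#define PACMAN_POINT_H

struct Point                // skipped by the preprocessor
{
    int x;
    int y;
};

#endif

// Code that uses the type sees exactly one definition of Point.
constexpr int point_sum(Point p) { return p.x + p.y; }
```

Removing the guard lines from this listing would leave two definitions of Point in the code stream, and the compiler would reject the second one as a redefinition.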

Pragma Once

There is an alternative preprocessor directive to accomplish the same thing as header-guards: #pragma once. Unfortunately, it is not part of the C or C++ standards, so it is less portable: not all preprocessors support it. It is safer to use header-guards and not worry about portability (it's only a little more typing).