| GCEK.Ac.In | Search | Contact |
This is the GCE Kannur alumni site. Please click the GCEK.Org link above to go to college site.
 
 
 Books
 C/C++
     Think in C++ v1
     Think in C++ v2
 Java
 OOAD
 ASP/ASP.Net
 Zip Version
 .. Back to Ref


  


Roll No.
Password
New User? 
Remember Me 
  3: The C in C++
MindView Inc.
[ Viewing Hints ] [ Exercise Solutions ] [ Volume 2 ] [ Free Newsletter ]
[ Seminars ] [ Seminars on CD ROM ] [ Consulting ]

Thinking in C++, 2nd ed. Volume 1

©2000 by Bruce Eckel

[ Previous Chapter ] [ Table of Contents ] [ Index ] [ Next Chapter ]

3: The C in C++

Since C++ is based on C, you must be familiar with the syntax of C in order to program in C++, just as you
must be reasonably fluent in algebra in order to tackle calculus.

If you’ve never seen C before, this chapter will give you a decent background in the style of C used in C++. If you are familiar with the style of C described in the first edition of Kernighan & Ritchie (often called K&R C), you will find some new and different features in C++ as well as in Standard C. If you are familiar with Standard C, you should skim through this chapter looking for features that are particular to C++. Note that there are some fundamental C++ features introduced here, which are basic ideas that are akin to the features in C or often modifications to the way that C does things. The more sophisticated C++ features will not be introduced until later chapters.

This chapter is a fairly fast coverage of C constructs and introduction to some basic C++ constructs, with the understanding that you’ve had some experience programming in another language. A more gentle introduction to C is found in the CD ROM packaged in the back of this book, titled Thinking in C: Foundations for Java & C++ by Chuck Allison (published by MindView, Inc., and also available at www.MindView.net). This is a seminar on a CD ROM with the goal of taking you carefully through the fundamentals of the C language. It focuses on the knowledge necessary for you to be able to move on to the C++ or Java languages rather than trying to make you an expert in all the dark corners of C (one of the reasons for using a higher-level language like C++ or Java is precisely so we can avoid many of these dark corners). It also contains exercises and guided solutions. Keep in mind that because this chapter goes beyond the Thinking in C CD, the CD is not a replacement for this chapter, but should be used instead as a preparation for this chapter and for the book.

Creating functions

In old (pre-Standard) C, you could call a function with any number or type of arguments and the compiler wouldn’t complain. Everything seemed fine until you ran the program. You got mysterious results (or worse, the program crashed) with no hints as to why. The lack of help with argument passing and the enigmatic bugs that resulted is probably one reason why C was dubbed a “high-level assembly language.” Pre-Standard C programmers just adapted to it.

Standard C and C++ use a feature called function prototyping. With function prototyping, you must use a description of the types of arguments when declaring and defining a function. This description is the “prototype.” When the function is called, the compiler uses the prototype to ensure that the proper arguments are passed in and that the return value is treated correctly. If the programmer makes a mistake when calling the function, the compiler catches the mistake.

Essentially, you learned about function prototyping (without naming it as such) in the previous chapter, since the form of function declaration in C++ requires proper prototyping. In a function prototype, the argument list contains the types of arguments that must be passed to the function and (optionally for the declaration) identifiers for the arguments. The order and type of the arguments must match in the declaration, definition, and function call. Here’s an example of a function prototype in a declaration:

int translate(float x, float y, float z);

You do not use the same form when declaring variables in function prototypes as you do in ordinary variable definitions. That is, you cannot say: float x, y, z. You must indicate the type of each argument. In a function declaration, the following form is also acceptable:

int translate(float, float, float);

Since the compiler doesn’t do anything but check for types when the function is called, the identifiers are only included for clarity when someone is reading the code.

In the function definition, names are required because the arguments are referenced inside the function:

int translate(float x, float y, float z) {
  x = y = z;
  // ...
} 

It turns out this rule applies only to C. In C++, an argument may be unnamed in the argument list of the function definition. Since it is unnamed, you cannot use it in the function body, of course. Unnamed arguments are allowed to give the programmer a way to “reserve space in the argument list.” Whoever uses the function must still call the function with the proper arguments. However, the person creating the function can then use the argument in the future without forcing modification of code that calls the function. This option of ignoring an argument in the list is also possible if you leave the name in, but you will get an annoying warning message about the value being unused every time you compile the function. The warning is eliminated if you remove the name.

C and C++ have two other ways to declare an argument list. If you have an empty argument list, you can declare it as func( ) in C++, which tells the compiler there are exactly zero arguments. You should be aware that this only means an empty argument list in C++. In C it means “an indeterminate number of arguments (which is a “hole” in C since it disables type checking in that case). In both C and C++, the declaration func(void); means an empty argument list. The void keyword means “nothing” in this case (it can also mean “no type” in the case of pointers, as you’ll see later in this chapter).

The other option for argument lists occurs when you don’t know how many arguments or what type of arguments you will have; this is called a variable argument list. This “uncertain argument list” is represented by ellipses (...). Defining a function with a variable argument list is significantly more complicated than defining a regular function. You can use a variable argument list for a function that has a fixed set of arguments if (for some reason) you want to disable the error checks of function prototyping. Because of this, you should restrict your use of variable argument lists to C and avoid them in C++ (in which, as you’ll learn, there are much better alternatives). Handling variable argument lists is described in the library section of your local C guide.

Function return values

A C++ function prototype must specify the return value type of the function (in C, if you leave off the return value type it defaults to int). The return type specification precedes the function name. To specify that no value is returned, use the void keyword. This will generate an error if you try to return a value from the function. Here are some complete function prototypes:

int f1(void); // Returns an int, takes no arguments
int f2(); // Like f1() in C++ but not in Standard C!
float f3(float, int, char, double); // Returns a float
void f4(void); // Takes no arguments, returns nothing

To return a value from a function, you use the return statement. return exits the function back to the point right after the function call. If return has an argument, that argument becomes the return value of the function. If a function says that it will return a particular type, then each return statement must return that type. You can have more than one return statement in a function definition:

//: C03:Return.cpp
// Use of "return"
#include <iostream>
using namespace std;

char cfunc(int i) {
  if(i == 0)
    return 'a';
  if(i == 1)
    return 'g';
  if(i == 5)
    return 'z';
  return 'c';
}

int main() {
  cout << "type an integer: ";
  int val;
  cin >> val;
  cout << cfunc(val) << endl;
} ///:~

In cfunc( ), the first if that evaluates to true exits the function via the return statement. Notice that a function declaration is not necessary because the function definition appears before it is used in main( ), so the compiler knows about it from that function definition.

Using the C function library

All the functions in your local C function library are available while you are programming in C++. You should look hard at the function library before defining your own function – there’s a good chance that someone has already solved your problem for you, and probably given it a lot more thought and debugging.

A word of caution, though: many compilers include a lot of extra functions that make life even easier and are tempting to use, but are not part of the Standard C library. If you are certain you will never want to move the application to another platform (and who is certain of that?), go ahead –use those functions and make your life easier. If you want your application to be portable, you should restrict yourself to Standard library functions. If you must perform platform-specific activities, try to isolate that code in one spot so it can be changed easily when porting to another platform. In C++, platform-specific activities are often encapsulated in a class, which is the ideal solution.

The formula for using a library function is as follows: first, find the function in your programming reference (many programming references will index the function by category as well as alphabetically). The description of the function should include a section that demonstrates the syntax of the code. The top of this section usually has at least one #include line, showing you the header file containing the function prototype. Duplicate this #include line in your file so the function is properly declared. Now you can call the function in the same way it appears in the syntax section. If you make a mistake, the compiler will discover it by comparing your function call to the function prototype in the header and tell you about your error. The linker searches the Standard library by default, so that’s all you need to do: include the header file and call the function.

Creating your own libraries with the librarian

You can collect your own functions together into a library. Most programming packages come with a librarian that manages groups of object modules. Each librarian has its own commands, but the general idea is this: if you want to create a library, make a header file containing the function prototypes for all the functions in your library. Put this header file somewhere in the preprocessor’s search path, either in the local directory (so it can be found by #include "header") or in the include directory (so it can be found by #include <header>). Now take all the object modules and hand them to the librarian along with a name for the finished library (most librarians require a common extension, such as .lib or .a). Place the finished library where the other libraries reside so the linker can find it. When you use your library, you will have to add something to the command line so the linker knows to search the library for the functions you call. You must find all the details in your local manual, since they vary from system to system.

Controlling execution

This section covers the execution control statements in C++. You must be familiar with these statements before you can read and write C or C++ code.

C++ uses all of C’s execution control statements. These include if-else, while, do-while, for, and a selection statement called switch. C++ also allows the infamous goto, which will be avoided in this book.


True and false

All conditional statements use the truth or falsehood of a conditional expression to determine the execution path. An example of a conditional expression is A == B. This uses the conditional operator == to see if the variable A is equivalent to the variable B. The expression produces a Boolean true or false (these are keywords only in C++; in C an expression is “true” if it evaluates to a nonzero value). Other conditional operators are >, <, >=, etc. Conditional statements are covered more fully later in this chapter.

if-else

The if-else statement can exist in two forms: with or without the else. The two forms are:

if(expression)
    statement

or

if(expression)
    statement
else
    statement

The “expression” evaluates to true or false. The “statement” means either a simple statement terminated by a semicolon or a compound statement, which is a group of simple statements enclosed in braces. Any time the word “statement” is used, it always implies that the statement is simple or compound. Note that this statement can also be another if, so they can be strung together.

//: C03:Ifthen.cpp
// Demonstration of if and if-else conditionals
#include <iostream>
using namespace std;

int main() {
  int i;
  cout << "type a number and 'Enter'" << endl;
  cin >> i;
  if(i > 5)
    cout << "It's greater than 5" << endl;
  else
    if(i < 5)
      cout << "It's less than 5 " << endl;
    else
      cout << "It's equal to 5 " << endl;

  cout << "type a number and 'Enter'" << endl;
  cin >> i;
  if(i < 10)
    if(i > 5)  // "if" is just another statement
      cout << "5 < i < 10" << endl;
    else
      cout << "i <= 5" << endl;
  else // Matches "if(i < 10)"
    cout << "i >= 10" << endl;
} ///:~

It is conventional to indent the body of a control flow statement so the reader may easily determine where it begins and ends[30].

while

while, do-while, and for control looping. A statement repeats until the controlling expression evaluates to false. The form of a while loop is

while(expression)
    statement

The expression is evaluated once at the beginning of the loop and again before each further iteration of the statement.

This example stays in the body of the while loop until you type the secret number or press control-C.

//: C03:Guess.cpp
// Guess a number (demonstrates "while")
#include <iostream>
using namespace std;

int main() {
  int secret = 15;
  int guess = 0;
  // "!=" is the "not-equal" conditional:
  while(guess != secret) { // Compound statement
    cout << "guess the number: ";
    cin >> guess;
  }
  cout << "You guessed it!" << endl;
} ///:~

The while’s conditional expression is not restricted to a simple test as in the example above; it can be as complicated as you like as long as it produces a true or false result. You will even see code where the loop has no body, just a bare semicolon:

while(/* Do a lot here */)
 ;

In these cases, the programmer has written the conditional expression not only to perform the test but also to do the work.

do-while

The form of do-while is

do
    statement
while(expression); 

The do-while is different from the while because the statement always executes at least once, even if the expression evaluates to false the first time. In a regular while, if the conditional is false the first time the statement never executes.

If a do-while is used in Guess.cpp, the variable guess does not need an initial dummy value, since it is initialized by the cin statement before it is tested:

//: C03:Guess2.cpp
// The guess program using do-while
#include <iostream>
using namespace std;

int main() {
  int secret = 15;
  int guess; // No initialization needed here
  do {
    cout << "guess the number: ";
    cin >> guess; // Initialization happens
  }   while(guess != secret);
  cout << "You got it!" << endl;
} ///:~

For some reason, most programmers tend to avoid do-while and just work with while.

for

A for loop performs initialization before the first iteration. Then it performs conditional testing and, at the end of each iteration, some form of “stepping.” The form of the for loop is:

for(initialization; conditional; step)
 statement

Any of the expressions initialization, conditional, or step may be empty. The initialization code executes once at the very beginning. The conditional is tested before each iteration (if it evaluates to false at the beginning, the statement never executes). At the end of each loop, the step executes.

for loops are usually used for “counting” tasks:

//: C03:Charlist.cpp
// Display all the ASCII characters
// Demonstrates "for"
#include <iostream>
using namespace std;

int main() {
  for(int i = 0; i < 128; i = i + 1)
    if (i != 26)  // ANSI Terminal Clear screen
      cout << " value: " << i 
           << " character: " 
           << char(i) // Type conversion
           << endl;
} ///:~

You may notice that the variable i is defined at the point where it is used, instead of at the beginning of the block denoted by the open curly brace ‘{’. This is in contrast to traditional procedural languages (including C), which require that all variables be defined at the beginning of the block. This will be discussed later in this chapter.

The break and continue keywords

Inside the body of any of the looping constructs while, do-while, or for, you can control the flow of the loop using break and continue. break quits the loop without executing the rest of the statements in the loop. continue stops the execution of the current iteration and goes back to the beginning of the loop to begin a new iteration.

As an example of break and continue, this program is a very simple menu system:

//: C03:Menu.cpp
// Simple menu program demonstrating
// the use of "break" and "continue"
#include <iostream>
using namespace std;

int main() {
  char c; // To hold response
  while(true) {
    cout << "MAIN MENU:" << endl;
    cout << "l: left, r: right, q: quit -> ";
    cin >> c;
    if(c == 'q')
      break; // Out of "while(1)"
    if(c == 'l') {
      cout << "LEFT MENU:" << endl;
      cout << "select a or b: ";
      cin >> c;
      if(c == 'a') {
        cout << "you chose 'a'" << endl;
        continue; // Back to main menu
      }
      if(c == 'b') {
        cout << "you chose 'b'" << endl;
        continue; // Back to main menu
      }
      else {
        cout << "you didn't choose a or b!"
             << endl;
        continue; // Back to main menu
      }
    }
    if(c == 'r') {
      cout << "RIGHT MENU:" << endl;
      cout << "select c or d: ";
      cin >> c;
      if(c == 'c') {
        cout << "you chose 'c'" << endl;
        continue; // Back to main menu
      }
      if(c == 'd') {
        cout << "you chose 'd'" << endl;
        continue; // Back to main menu
      }
      else {
        cout << "you didn't choose c or d!" 
             << endl;
        continue; // Back to main menu
      }
    }
    cout << "you must type l or r or q!" << endl;
  }
  cout << "quitting menu..." << endl;
} ///:~

If the user selects ‘q’ in the main menu, the break keyword is used to quit, otherwise the program just continues to execute indefinitely. After each of the sub-menu selections, the continue keyword is used to pop back up to the beginning of the while loop.

The while(true) statement is the equivalent of saying “do this loop forever.” The break statement allows you to break out of this infinite while loop when the user types a ‘q.’

switch

A switch statement selects from among pieces of code based on the value of an integral expression. Its form is:

switch(selector) {
    case integral-value1 : statement; break;
    case integral-value2 : statement; break;
    case integral-value3 : statement; break;
    case integral-value4 : statement; break;
    case integral-value5 : statement; break;
    (...)
    default: statement;
} 

Selector is an expression that produces an integral value. The switch compares the result of selector to each integral value. If it finds a match, the corresponding statement (simple or compound) executes. If no match occurs, the default statement executes.

You will notice in the definition above that each case ends with a break, which causes execution to jump to the end of the switch body (the closing brace that completes the switch). This is the conventional way to build a switch statement, but the break is optional. If it is missing, your case “drops through” to the one after it. That is, the code for the following case statements execute until a break is encountered. Although you don’t usually want this kind of behavior, it can be useful to an experienced programmer.

The switch statement is a clean way to implement multi-way selection (i.e., selecting from among a number of different execution paths), but it requires a selector that evaluates to an integral value at compile-time. If you want to use, for example, a string object as a selector, it won’t work in a switch statement. For a string selector, you must instead use a series of if statements and compare the string inside the conditional.

The menu example shown above provides a particularly nice example of a switch:

//: C03:Menu2.cpp
// A menu using a switch statement
#include <iostream>
using namespace std;

int main() {
  bool quit = false;  // Flag for quitting
  while(quit == false) {
    cout << "Select a, b, c or q to quit: ";
    char response;
    cin >> response;
    switch(response) {
      case 'a' : cout << "you chose 'a'" << endl;
                 break;
      case 'b' : cout << "you chose 'b'" << endl;
                 break;
      case 'c' : cout << "you chose 'c'" << endl;
                 break;
      case 'q' : cout << "quitting menu" << endl;
                 quit = true;
                 break;
      default  : cout << "Please use a,b,c or q!"
                 << endl;
    }
  }
} ///:~

The quit flag is a bool, short for “Boolean,” which is a type you’ll find only in C++. It can have only the keyword values true or false. Selecting ‘q’ sets the quit flag to true. The next time the selector is evaluated, quit == false returns false so the body of the while does not execute.

Using and misusing goto

The goto keyword is supported in C++, since it exists in C. Using goto is often dismissed as poor programming style, and most of the time it is. Anytime you use goto, look at your code and see if there’s another way to do it. On rare occasions, you may discover goto can solve a problem that can’t be solved otherwise, but still, consider it carefully. Here’s an example that might make a plausible candidate:

//: C03:gotoKeyword.cpp
// The infamous goto is supported in C++
#include <iostream>
using namespace std;

int main() {
  long val = 0;
  for(int i = 1; i < 1000; i++) {
    for(int j = 1; j < 100; j += 10) {
      val = i * j;
      if(val > 47000)
        goto bottom; 
        // Break would only go to the outer 'for'
    }
  }
  bottom: // A label
  cout << val << endl;
} ///:~

The alternative would be to set a Boolean that is tested in the outer for loop, and then do a break from the inner for loop. However, if you have several levels of for or while this could get awkward.


Recursion

Recursion is an interesting and sometimes useful programming technique whereby you call the function that you’re in. Of course, if this is all you do, you’ll keep calling the function you’re in until you run out of memory, so there must be some way to “bottom out” the recursive call. In the following example, this “bottoming out” is accomplished by simply saying that the recursion will go only until the cat exceeds ‘Z’:[31]

//: C03:CatsInHats.cpp
// Simple demonstration of recursion
#include <iostream>
using namespace std;

void removeHat(char cat) {
  for(char c = 'A'; c < cat; c++)
    cout << "  ";
  if(cat <= 'Z') {
    cout << "cat " << cat << endl;
    removeHat(cat + 1); // Recursive call
  } else
    cout << "VOOM!!!" << endl;
}

int main() {
  removeHat('A');
} ///:~

In removeHat( ), you can see that as long as cat is less than ‘Z’, removeHat( ) will be called from within removeHat( ), thus effecting the recursion. Each time removeHat( ) is called, its argument is one greater than the current cat so the argument keeps increasing.

Recursion is often used when evaluating some sort of arbitrarily complex problem, since you aren’t restricted to a particular “size” for the solution – the function can just keep recursing until it’s reached the end of the problem.

Introduction to operators

You can think of operators as a special type of function (you’ll learn that C++ operator overloading treats operators precisely that way). An operator takes one or more arguments and produces a new value. The arguments are in a different form than ordinary function calls, but the effect is the same.

From your previous programming experience, you should be reasonably comfortable with the operators that have been used so far. The concepts of addition (+), subtraction and unary minus (-), multiplication (*), division (/), and assignment(=) all have essentially the same meaning in any programming language. The full set of operators is enumerated later in this chapter.


Precedence

Operator precedence defines the order in which an expression evaluates when several different operators are present. C and C++ have specific rules to determine the order of evaluation. The easiest to remember is that multiplication and division happen before addition and subtraction. After that, if an expression isn’t transparent to you it probably won’t be for anyone reading the code, so you should use parentheses to make the order of evaluation explicit. For example:

A = X + Y - 2/2 + Z;

has a very different meaning from the same statement with a particular grouping of parentheses:

A = X + (Y - 2)/(2 + Z);

(Try evaluating the result with X = 1, Y = 2, and Z = 3.)

Auto increment and decrement

C, and therefore C++, is full of shortcuts. Shortcuts can make code much easier to type, and sometimes much harder to read. Perhaps the C language designers thought it would be easier to understand a tricky piece of code if your eyes didn’t have to scan as large an area of print.

One of the nicer shortcuts is the auto-increment and auto-decrement operators. You often use these to change loop variables, which control the number of times a loop executes.

The auto-decrement operator is ‘--’ and means “decrease by one unit.” The auto-increment operator is ‘++’ and means “increase by one unit.” If A is an int, for example, the expression ++A is equivalent to (A = A + 1). Auto-increment and auto-decrement operators produce the value of the variable as a result. If the operator appears before the variable, (i.e., ++A), the operation is first performed and the resulting value is produced. If the operator appears after the variable (i.e. A++), the current value is produced, and then the operation is performed. For example:

//: C03:AutoIncrement.cpp
// Shows use of auto-increment
// and auto-decrement operators.
#include <iostream>
using namespace std;

int main() {
  int i = 0;
  int j = 0;
  cout << ++i << endl; // Pre-increment
  cout << j++ << endl; // Post-increment
  cout << --i << endl; // Pre-decrement
  cout << j-- << endl; // Post decrement
} ///:~

If you’ve been wondering about the name “C++,” now you understand. It implies “one step beyond C.”

Introduction to data types

Data types define the way you use storage (memory) in the programs you write. By specifying a data type, you tell the compiler how to create a particular piece of storage, and also how to manipulate that storage.

Data types can be built-in or abstract. A built-in data type is one that the compiler intrinsically understands, one that is wired directly into the compiler. The types of built-in data are almost identical in C and C++. In contrast, a user-defined data type is one that you or another programmer create as a class. These are commonly referred to as abstract data types. The compiler knows how to handle built-in types when it starts up; it “learns” how to handle abstract data types by reading header files containing class declarations (you’ll learn about this in later chapters).

Basic built-in types

The Standard C specification for built-in types (which C++ inherits) doesn’t say how many bits each of the built-in types must contain. Instead, it stipulates the minimum and maximum values that the built-in type must be able to hold. When a machine is based on binary, this maximum value can be directly translated into a minimum number of bits necessary to hold that value. However, if a machine uses, for example, binary-coded decimal (BCD) to represent numbers, then the amount of space in the machine required to hold the maximum numbers for each data type will be different. The minimum and maximum values that can be stored in the various data types are defined in the system header files limits.h and float.h (in C++ you will generally #include <climits> and <cfloat> instead).

C and C++ have four basic built-in data types, described here for binary-based machines. A char is for character storage and uses a minimum of 8 bits (one byte) of storage, although it may be larger. An int stores an integral number and uses a minimum of two bytes of storage. The float and double types store floating-point numbers, usually in IEEE floating-point format. float is for single-precision floating point and double is for double-precision floating point.

As mentioned previously, you can define variables anywhere in a scope, and you can define and initialize them at the same time. Here’s how to define variables using the four basic data types:

//: C03:Basic.cpp
// Defining the four basic data
// types in C and C++

int main() {
  // Definition without initialization:
  char protein;
  int carbohydrates;
  float fiber;
  double fat;
  // Simultaneous definition & initialization:
  char pizza = 'A', pop = 'Z';
  int dongdings = 100, twinkles = 150, 
    heehos = 200;
  float chocolate = 3.14159;
  // Exponential notation:
  double fudge_ripple = 6e-4; 
} ///:~

The first part of the program defines variables of the four basic data types without initializing them. If you don’t initialize a variable, the Standard says that its contents are undefined (usually, this means they contain garbage). The second part of the program defines and initializes variables at the same time (it’s always best, if possible, to provide an initialization value at the point of definition). Notice the use of exponential notation in the constant 6e-4, meaning “6 times 10 to the minus fourth power.”

bool, true, & false

Before bool became part of Standard C++, everyone tended to use different techniques in order to produce Boolean-like behavior. These produced portability problems and could introduce subtle errors.

The Standard C++ bool type can have two states expressed by the built-in constants true (which converts to an integral one) and false (which converts to an integral zero). All three names are keywords. In addition, some language elements have been adapted:

Element

Usage with bool

&& || !

Take bool arguments and produce bool results.

< > <= >= == !=

Produce bool results.

if, for,
while, do

Conditional expressions convert to bool values.

? :

First operand converts to bool value.

Because there’s a lot of existing code that uses an int to represent a flag, the compiler will implicitly convert from an int to a bool (nonzero values will produce true while zero values produce false). Ideally, the compiler will give you a warning as a suggestion to correct the situation.

An idiom that falls under “poor programming style” is the use of ++ to set a flag to true. This is still allowed, but deprecated, which means that at some time in the future it will be made illegal. The problem is that you’re making an implicit type conversion from bool to int, incrementing the value (perhaps beyond the range of the normal bool values of zero and one), and then implicitly converting it back again.

Pointers (which will be introduced later in this chapter) will also be automatically converted to bool when necessary.

Specifiers

Specifiers modify the meanings of the basic built-in types and expand them to a much larger set. There are four specifiers: long, short, signed, and unsigned.

long and short modify the maximum and minimum values that a data type will hold. A plain int must be at least the size of a short. The size hierarchy for integral types is: short int, int, long int. All the sizes could conceivably be the same, as long as they satisfy the minimum/maximum value requirements. On a machine with a 64-bit word, for instance, all the data types might be 64 bits.

The size hierarchy for floating point numbers is: float, double, and long double. “long float” is not a legal type. There are no short floating-point numbers.

The signed and unsigned specifiers tell the compiler how to use the sign bit with integral types and characters (floating-point numbers always contain a sign). An unsigned number does not keep track of the sign and thus has an extra bit available, so it can store positive numbers twice as large as the positive numbers that can be stored in a signed number. signed is the default and is only necessary with char; char may or may not default to signed. By specifying signed char, you force the sign bit to be used.

The following example shows the size of the data types in bytes by using the sizeof operator, introduced later in this chapter:

//: C03:Specify.cpp
// Demonstrates the use of specifiers
#include <iostream>
using namespace std;

int main() {
  char c;
  unsigned char cu;
  int i;
  unsigned int iu;
  short int is;
  short iis; // Same as short int
  unsigned short int isu;
  unsigned short iisu;
  long int il;
  long iil;  // Same as long int
  unsigned long int ilu;
  unsigned long iilu;
  float f;
  double d;
  long double ld;
  cout 
    << "\n char= " << sizeof(c)
    << "\n unsigned char = " << sizeof(cu)
    << "\n int = " << sizeof(i)
    << "\n unsigned int = " << sizeof(iu)
    << "\n short = " << sizeof(is)
    << "\n unsigned short = " << sizeof(isu)
    << "\n long = " << sizeof(il) 
    << "\n unsigned long = " << sizeof(ilu)
    << "\n float = " << sizeof(f)
    << "\n double = " << sizeof(d)
    << "\n long double = " << sizeof(ld) 
    << endl;
} ///:~

Be aware that the results you get by running this program will probably be different from one machine/operating system/compiler to the next, since (as mentioned previously) the only thing that must be consistent is that each different type hold the minimum and maximum values specified in the Standard.

When you are modifying an int with short or long, the keyword int is optional, as shown above.


Introduction to pointers

Whenever you run a program, it is first loaded (typically from disk) into the computer’s memory. Thus, all elements of your program are located somewhere in memory. Memory is typically laid out as a sequential series of memory locations; we usually refer to these locations as eight-bit bytes but actually the size of each space depends on the architecture of the particular machine and is usually called that machine’s word size. Each space can be uniquely distinguished from all other spaces by its address. For the purposes of this discussion, we’ll just say that all machines use bytes that have sequential addresses starting at zero and going up to however much memory you have in your computer.

Since your program lives in memory while it’s being run, every element of your program has an address. Suppose we start with a simple program:

//: C03:YourPets1.cpp
#include <iostream>
using namespace std;

int dog, cat, bird, fish;

void f(int pet) {
  cout << "pet id number: " << pet << endl;
}

int main() {
  int i, j, k;
} ///:~

Each of the elements in this program has a location in storage when the program is running. Even the function occupies storage. As you’ll see, it turns out that what an element is and the way you define it usually determines the area of memory where that element is placed.

There is an operator in C and C++ that will tell you the address of an element. This is the &’ operator. All you do is precede the identifier name with ‘&’ and it will produce the address of that identifier. YourPets1.cpp can be modified to print out the addresses of all its elements, like this:

//: C03:YourPets2.cpp
#include <iostream>
using namespace std;

int dog, cat, bird, fish;

void f(int pet) {
  cout << "pet id number: " << pet << endl;
}

int main() {
  int i, j, k;
  cout << "f(): " << (long)&f << endl;
  cout << "dog: " << (long)&dog << endl;
  cout << "cat: " << (long)&cat << endl;
  cout << "bird: " << (long)&bird << endl;
  cout << "fish: " << (long)&fish << endl;
  cout << "i: " << (long)&i << endl;
  cout << "j: " << (long)&j << endl;
  cout << "k: " << (long)&k << endl;
} ///:~

The (long) is a cast. It says “Don’t treat this as if it’s normal type, instead treat it as a long.” The cast isn’t essential, but if it wasn’t there, the addresses would have been printed out in hexadecimal instead, so casting to a long makes things a little more readable.

The results of this program will vary depending on your computer, OS, and all sorts of other factors, but it will always give you some interesting insights. For a single run on my computer, the results looked like this:

f(): 4198736
dog: 4323632
cat: 4323636
bird: 4323640
fish: 4323644
i: 6684160
j: 6684156
k: 6684152

You can see how the variables that are defined inside main( ) are in a different area than the variables defined outside of main( ); you’ll understand why as you learn more about the language. Also, f( ) appears to be in its own area; code is typically separated from data in memory.

Another interesting thing to note is that variables defined one right after the other appear to be placed contiguously in memory. They are separated by the number of bytes that are required by their data type. Here, the only data type used is int, and cat is four bytes away from dog, bird is four bytes away from cat, etc. So it would appear that, on this machine, an int is four bytes long.

Other than this interesting experiment showing how memory is mapped out, what can you do with an address? The most important thing you can do is store it inside another variable for later use. C and C++ have a special type of variable that holds an address. This variable is called a pointer.

The operator that defines a pointer is the same as the one used for multiplication: ‘*’. The compiler knows that it isn’t multiplication because of the context in which it is used, as you will see.

When you define a pointer, you must specify the type of variable it points to. You start out by giving the type name, then instead of immediately giving an identifier for the variable, you say “Wait, it’s a pointer” by inserting a star between the type and the identifier. So a pointer to an int looks like this:

int* ip; // ip points to an int variable

The association of the ‘*’ with the type looks sensible and reads easily, but it can actually be a bit deceiving. Your inclination might be to say “intpointer” as if it is a single discrete type. However, with an int or other basic data type, it’s possible to say:

int a, b, c;

whereas with a pointer, you’d like to say:

int* ipa, ipb, ipc;

C syntax (and by inheritance, C++ syntax) does not allow such sensible expressions. In the definitions above, only ipa is a pointer, but ipb and ipc are ordinary ints (you can say that “* binds more tightly to the identifier”). Consequently, the best results can be achieved by using only one definition per line; you still get the sensible syntax without the confusion:

int* ipa;
int* ipb;
int* ipc;

Since a general guideline for C++ programming is that you should always initialize a variable at the point of definition, this form actually works better. For example, the variables above are not initialized to any particular value; they hold garbage. It’s much better to say something like:

int a = 47;
int* ipa = &a;

Now both a and ipa have been initialized, and ipa holds the address of a.

Once you have an initialized pointer, the most basic thing you can do with it is to use it to modify the value it points to. To access a variable through a pointer, you dereference the pointer using the same operator that you used to define it, like this:

*ipa = 100;

Now a contains the value 100 instead of 47.

These are the basics of pointers: you can hold an address, and you can use that address to modify the original variable. But the question still remains: why do you want to modify one variable using another variable as a proxy?

For this introductory view of pointers, we can put the answer into two broad categories:

  1. To change “outside objects” from within a function. This is perhaps the most basic use of pointers, and it will be examined here.
  2. To achieve many other clever programming techniques, which you’ll learn about in portions of the rest of the book.

Modifying the outside object

Ordinarily, when you pass an argument to a function, a copy of that argument is made inside the function. This is referred to as pass-by-value. You can see the effect of pass-by-value in the following program:

//: C03:PassByValue.cpp
#include <iostream>
using namespace std;

void f(int a) {
  cout << "a = " << a << endl;
  a = 5;
  cout << "a = " << a << endl;
}

int main() {
  int x = 47;
  cout << "x = " << x << endl;
  f(x);
  cout << "x = " << x << endl;
} ///:~

In f( ), a is a local variable, so it exists only for the duration of the function call to f( ). Because it’s a function argument, the value of a is initialized by the arguments that are passed when the function is called; in main( ) the argument is x, which has a value of 47, so this value is copied into a when f( ) is called.

When you run this program you’ll see:

x = 47
a = 47
a = 5
x = 47

Initially, of course, x is 47. When f( ) is called, temporary space is created to hold the variable a for the duration of the function call, and a is initialized by copying the value of x, which is verified by printing it out. Of course, you can change the value of a and show that it is changed. But when f( ) is completed, the temporary space that was created for a disappears, and we see that the only connection that ever existed between a and x happened when the value of x was copied into a.

When you’re inside f( ), x is the outside object (my terminology), and changing the local variable does not affect the outside object, naturally enough, since they are two separate locations in storage. But what if you do want to modify the outside object? This is where pointers come in handy. In a sense, a pointer is an alias for another variable. So if we pass a pointer into a function instead of an ordinary value, we are actually passing an alias to the outside object, enabling the function to modify that outside object, like this:

//: C03:PassAddress.cpp
#include <iostream>
using namespace std;

void f(int* p) {
  cout << "p = " << p << endl;
  cout << "*p = " << *p << endl;
  *p = 5;
  cout << "p = " << p << endl;
}

int main() {
  int x = 47;
  cout << "x = " << x << endl;
  cout << "&x = " << &x << endl;
  f(&x);
  cout << "x = " << x << endl;
} ///:~

Now f( ) takes a pointer as an argument and dereferences the pointer during assignment, and this causes the outside object x to be modified. The output is:

x = 47
&x = 0065FE00
p = 0065FE00
*p = 47
p = 0065FE00
x = 5

Notice that the value contained in p is the same as the address of x – the pointer p does indeed point to x. If that isn’t convincing enough, when p is dereferenced to assign the value 5, we see that the value of x is now changed to 5 as well.

Thus, passing a pointer into a function will allow that function to modify the outside object. You’ll see plenty of other uses for pointers later, but this is arguably the most basic and possibly the most common use.

Introduction to C++ references

Pointers work roughly the same in C and in C++, but C++ adds an additional way to pass an address into a function. This is pass-by-reference and it exists in several other programming languages so it was not a C++ invention.

Your initial perception of references may be that they are unnecessary, that you could write all your programs without references. In general, this is true, with the exception of a few important places that you’ll learn about later in the book. You’ll also learn more about references later, but the basic idea is the same as the demonstration of pointer use above: you can pass the address of an argument using a reference. The difference between references and pointers is that calling a function that takes references is cleaner, syntactically, than calling a function that takes pointers (and it is exactly this syntactic difference that makes references essential in certain situations). If PassAddress.cpp is modified to use references, you can see the difference in the function call in main( ):

//: C03:PassReference.cpp
#include <iostream>
using namespace std;

void f(int& r) {
  cout << "r = " << r << endl;
  cout << "&r = " << &r << endl;
  r = 5;
  cout << "r = " << r << endl;
}

int main() {
  int x = 47;
  cout << "x = " << x << endl;
  cout << "&x = " << &x << endl;
  f(x); // Looks like pass-by-value, 
        // is actually pass by reference
  cout << "x = " << x << endl;
} ///:~

In f( )’s argument list, instead of saying int* to pass a pointer, you say int& to pass a reference. Inside f( ), if you just say ‘r’ (which would produce the address if r were a pointer) you get the value in the variable that r references. If you assign to r, you actually assign to the variable that r references. In fact, the only way to get the address that’s held inside r is with the ‘&’ operator.

In main( ), you can see the key effect of references in the syntax of the call to f( ), which is just f(x). Even though this looks like an ordinary pass-by-value, the effect of the reference is that it actually takes the address and passes it in, rather than making a copy of the value. The output is:

x = 47
&x = 0065FE00
r = 47
&r = 0065FE00
r = 5
x = 5

So you can see that pass-by-reference allows a function to modify the outside object, just like passing a pointer does (you can also observe that the reference obscures the fact that an address is being passed; this will be examined later in the book). Thus, for this simple introduction you can assume that references are just a syntactically different way (sometimes referred to as “syntactic sugar”) to accomplish the same thing that pointers do: allow functions to change outside objects.

Pointers and references as modifiers

So far, you’ve seen the basic data types char, int, float, and double, along with the specifiers signed, unsigned, short, and long, which can be used with the basic data types in almost any combination. Now we’ve added pointers and references that are orthogonal to the basic data types and specifiers, so the possible combinations have just tripled:

//: C03:AllDefinitions.cpp
// All possible combinations of basic data types, 
// specifiers, pointers and references
#include <iostream>
using namespace std;

void f1(char c, int i, float f, double d);
void f2(short int si, long int li, long double ld);
void f3(unsigned char uc, unsigned int ui, 
  unsigned short int usi, unsigned long int uli);
void f4(char* cp, int* ip, float* fp, double* dp);
void f5(short int* sip, long int* lip, 
  long double* ldp);
void f6(unsigned char* ucp, unsigned int* uip, 
  unsigned short int* usip, 
  unsigned long int* ulip);
void f7(char& cr, int& ir, float& fr, double& dr);
void f8(short int& sir, long int& lir, 
  long double& ldr);
void f9(unsigned char& ucr, unsigned int& uir, 
  unsigned short int& usir, 
  unsigned long int& ulir);

int main() {} ///:~

Pointers and references also work when passing objects into and out of functions; you’ll learn about this in a later chapter.

There’s one other type that works with pointers: void. If you state that a pointer is a void*, it means that any type of address at all can be assigned to that pointer (whereas if you have an int*, you can assign only the address of an int variable to that pointer). For example:

//: C03:VoidPointer.cpp
int main() {
  void* vp;
  char c;
  int i;
  float f;
  double d;
  // The address of ANY type can be
  // assigned to a void pointer:
  vp = &c;
  vp = &i;
  vp = &f;
  vp = &d;
} ///:~

Once you assign to a void* you lose any information about what type it is. This means that before you can use the pointer, you must cast it to the correct type:

//: C03:CastFromVoidPointer.cpp
int main() {
  int i = 99;
  void* vp = &i;
  // Can't dereference a void pointer:
  // *vp = 3; // Compile-time error
  // Must cast back to int before dereferencing:
  *((int*)vp) = 3;
} ///:~

The cast (int*)vp takes the void* and tells the compiler to treat it as an int*, and thus it can be successfully dereferenced. You might observe that this syntax is ugly, and it is, but it’s worse than that – the void* introduces a hole in the language’s type system. That is, it allows, or even promotes, the treatment of one type as another type. In the example above, I treat an int as an int by casting vp to an int*, but there’s nothing that says I can’t cast it to a char* or double*, which would modify a different amount of storage that had been allocated for the int, possibly crashing the program. In general, void pointers should be avoided, and used only in rare special cases, the likes of which you won’t be ready to consider until significantly later in the book.

You cannot have a void reference, for reasons that will be explained in Chapter 11.

Scoping

Scoping rules tell you where a variable is valid, where it is created, and where it gets destroyed (i.e., goes out of scope). The scope of a variable extends from the point where it is defined to the first closing brace that matches the closest opening brace before the variable was defined. That is, a scope is defined by its “nearest” set of braces. To illustrate:

//: C03:Scope.cpp
// How variables are scoped
int main() {
  int scp1;
  // scp1 visible here
  {
    // scp1 still visible here
    //.....
    int scp2;
    // scp2 visible here
    //.....
    {
      // scp1 & scp2 still visible here
      //..
      int scp3;
      // scp1, scp2 & scp3 visible here
      // ...
    } // <-- scp3 destroyed here
    // scp3 not available here
    // scp1 & scp2 still visible here
    // ...
  } // <-- scp2 destroyed here
  // scp3 & scp2 not available here
  // scp1 still visible here
  //..
} // <-- scp1 destroyed here
///:~ 

The example above shows when variables are visible and when they are unavailable (that is, when they go out of scope). A variable can be used only when inside its scope. Scopes can be nested, indicated by matched pairs of braces inside other matched pairs of braces. Nesting means that you can access a variable in a scope that encloses the scope you are in. In the example above, the variable scp1 is available inside all of the other scopes, while scp3 is available only in the innermost scope.

Defining variables on the fly

As noted earlier in this chapter, there is a significant difference between C and C++ when defining variables. Both languages require that variables be defined before they are used, but C (and many other traditional procedural languages) forces you to define all the variables at the beginning of a scope, so that when the compiler creates a block it can allocate space for those variables.

While reading C code, a block of variable definitions is usually the first thing you see when entering a scope. Declaring all variables at the beginning of the block requires the programmer to write in a particular way because of the implementation details of the language. Most people don’t know all the variables they are going to use before they write the code, so they must keep jumping back to the beginning of the block to insert new variables, which is awkward and causes errors. These variable definitions don’t usually mean much to the reader, and they actually tend to be confusing because they appear apart from the context in which they are used.

C++ (not C) allows you to define variables anywhere in a scope, so you can define a variable right before you use it. In addition, you can initialize the variable at the point you define it, which prevents a certain class of errors. Defining variables this way makes the code much easier to write and reduces the errors you get from being forced to jump back and forth within a scope. It makes the code easier to understand because you see a variable defined in the context of its use. This is especially important when you are defining and initializing a variable at the same time – you can see the meaning of the initialization value by the way the variable is used.

You can also define variables inside the control expressions of for loops and while loops, inside the conditional of an if statement, and inside the selector statement of a switch. Here’s an example showing on-the-fly variable definitions:

//: C03:OnTheFly.cpp
// On-the-fly variable definitions
#include <iostream>
using namespace std;

int main() {
  //..
  { // Begin a new scope
    int q = 0; // C requires definitions here
    //..
    // Define at point of use:
    for(int i = 0; i < 100; i++) { 
      q++; // q comes from a larger scope
      // Definition at the end of the scope:
      int p = 12; 
    }
    int p = 1;  // A different p
  } // End scope containing q & outer p
  cout << "Type characters:" << endl;
  while(char c = cin.get() != 'q') {
    cout << c << " wasn't it" << endl;
    if(char x = c == 'a' || c == 'b')
      cout << "You typed a or b" << endl;
    else
      cout << "You typed " << x << endl;
  }
  cout << "Type A, B, or C" << endl;
  switch(int i = cin.get()) {
    case 'A': cout << "Snap" << endl; break;
    case 'B': cout << "Crackle" << endl; break;
    case 'C': cout << "Pop" << endl; break;
    default: cout << "Not A, B or C!" << endl;
  }
} ///:~

In the innermost scope, p is defined right before the scope ends, so it is really a useless gesture (but it shows you can define a variable anywhere). The p in the outer scope is in the same situation.

The definition of i in the control expression of the for loop is an example of being able to define a variable exactly at the point you need it (you can do this only in C++). The scope of i is the scope of the expression controlled by the for loop, so you can turn around and re-use i in the next for loop. This is a convenient and commonly-used idiom in C++; i is the classic name for a loop counter and you don’t have to keep inventing new names.

Although the example also shows variables defined within while, if, and switch statements, this kind of definition is much less common than those in for expressions, possibly because the syntax is so constrained. For example, you cannot have any parentheses. That is, you cannot say:

while((char c = cin.get()) != 'q')

The addition of the extra parentheses would seem like an innocent and useful thing to do, and because you cannot use them, the results are not what you might like. The problem occurs because ‘!=’ has a higher precedence than ‘=’, so the char c ends up containing a bool converted to char. When that’s printed, on many terminals you’ll see a smiley-face character.

In general, you can consider the ability to define variables within while, if, and switch statements as being there for completeness, but the only place you’re likely to use this kind of variable definition is in a for loop (where you’ll use it quite often).

Specifying storage allocation

When creating a variable, you have a number of options to specify the lifetime of the variable, how the storage is allocated for that variable, and how the variable is treated by the compiler.

Global variables

Global variables are defined outside all function bodies and are available to all parts of the program (even code in other files). Global variables are unaffected by scopes and are always available (i.e., the lifetime of a global variable lasts until the program ends). If the existence of a global variable in one file is declared using the extern keyword in another file, the data is available for use by the second file. Here’s an example of the use of global variables:

//: C03:Global.cpp
//{L} Global2
// Demonstration of global variables
#include <iostream>
using namespace std;

int globe;
void func();
int main() {
  globe = 12;
  cout << globe << endl;
  func(); // Modifies globe
  cout << globe << endl;
} ///:~

Here’s a file that accesses globe as an extern:

//: C03:Global2.cpp {O}
// Accessing external global variables
extern int globe;  
// (The linker resolves the reference)
void func() {
  globe = 47;
} ///:~

Storage for the variable globe is created by the definition in Global.cpp, and that same variable is accessed by the code in Global2.cpp. Since the code in Global2.cpp is compiled separately from the code in Global.cpp, the compiler must be informed that the variable exists elsewhere by the declaration

extern int globe;

When you run the program, you’ll see that the call to func( ) does indeed affect the single global instance of globe.

In Global.cpp, you can see the special comment tag (which is my own design):

//{L} Global2

This says that to create the final program, the object file with the name Global2 must be linked in (there is no extension because the extension names of object files differ from one system to the next). In Global2.cpp, the first line has another special comment tag {O}, which says “Don’t try to create an executable out of this file, it’s being compiled so that it can be linked into some other executable.” The ExtractCode.cpp program in Volume 2 of this book (downloadable at www.BruceEckel.com) reads these tags and creates the appropriate makefile so everything compiles properly (you’ll learn about makefiles at the end of this chapter).

Local variables

Local variables occur within a scope; they are “local” to a function. They are often called automatic variables because they automatically come into being when the scope is entered and automatically go away when the scope closes. The keyword auto makes this explicit, but local variables default to auto so it is never necessary to declare something as an auto.

Register variables

A register variable is a type of local variable. The register keyword tells the compiler “Make accesses to this variable as fast as possible.” Increasing the access speed is implementation dependent, but, as the name suggests, it is often done by placing the variable in a register. There is no guarantee that the variable will be placed in a register or even that the access speed will increase. It is a hint to the compiler.

There are restrictions to the use of register variables. You cannot take or compute the address of a register variable. A register variable can be declared only within a block (you cannot have global or static register variables). You can, however, use a register variable as a formal argument in a function (i.e., in the argument list).

In general, you shouldn’t try to second-guess the compiler’s optimizer, since it will probably do a better job than you can. Thus, the register keyword is best avoided.

static

The static keyword has several distinct meanings. Normally, variables defined local to a function disappear at the end of the function scope. When you call the function again, storage for the variables is created anew and the values are re-initialized. If you want a value to be extant throughout the life of a program, you can define a function’s local variable to be static and give it an initial value. The initialization is performed only the first time the function is called, and the data retains its value between function calls. This way, a function can “remember” some piece of information between function calls.

You may wonder why a global variable isn’t used instead. The beauty of a static variable is that it is unavailable outside the scope of the function, so it can’t be inadvertently changed. This localizes errors.

Here’s an example of the use of static variables:

//: C03:Static.cpp
// Using a static variable in a function
#include <iostream>
using namespace std;

void func() {
  static int i = 0;
  cout << "i = " << ++i << endl;
}

int main() {
  for(int x = 0; x < 10; x++)
    func();
} ///:~

Each time func( ) is called in the for loop, it prints a different value. If the keyword static is not used, the value printed will always be ‘1’.

The second meaning of static is related to the first in the “unavailable outside a certain scope” sense. When static is applied to a function name or to a variable that is outside of all functions, it means “This name is unavailable outside of this file.” The function name or variable is local to the file; we say it has file scope. As a demonstration, compiling and linking the following two files will cause a linker error:

//: C03:FileStatic.cpp
// File scope demonstration. Compiling and 
// linking this file with FileStatic2.cpp
// will cause a linker error

// File scope means only available in this file:
static int fs; 

int main() {
  fs = 1;
} ///:~

Even though the variable fs is claimed to exist as an extern in the following file, the linker won’t find it because it has been declared static in FileStatic.cpp.

//: C03:FileStatic2.cpp {O}
// Trying to reference fs
extern int fs;
void func() {
  fs = 100;
} ///:~

The static specifier may also be used inside a class. This explanation will be delayed until you learn to create classes, later in the book.

extern

The extern keyword has already been briefly described and demonstrated. It tells the compiler that a variable or a function exists, even if the compiler hasn’t yet seen it in the file currently being compiled. This variable or function may be defined in another file or further down in the current file. As an example of the latter:

//: C03:Forward.cpp
// Forward function & data declarations
#include <iostream>
using namespace std;

// This is not actually external, but the 
// compiler must be told it exists somewhere:
extern int i; 
extern void func();
int main() {
  i = 0;
  func();
}
int i; // The data definition
void func() {
  i++;
  cout << i;
} ///:~

When the compiler encounters the declaration ‘extern int i’, it knows that the definition for i must exist somewhere as a global variable. When the compiler reaches the definition of i, no other declaration is visible, so it knows it has found the same i declared earlier in the file. If you were to define i as static, you would be telling the compiler that i is defined globally (via the extern), but it also has file scope (via the static), so the compiler will generate an error.

Linkage

To understand the behavior of C and C++ programs, you need to know about linkage. In an executing program, an identifier is represented by storage in memory that holds a variable or a compiled function body. Linkage describes this storage as it is seen by the linker. There are two types of linkage: internal linkage and external linkage.

Internal linkage means that storage is created to represent the identifier only for the file being compiled. Other files may use the same identifier name with internal linkage, or for a global variable, and no conflicts will be found by the linker – separate storage is created for each identifier. Internal linkage is specified by the keyword static in C and C++.

External linkage means that a single piece of storage is created to represent the identifier for all files being compiled. The storage is created once, and the linker must resolve all other references to that storage. Global variables and function names have external linkage. These are accessed from other files by declaring them with the keyword extern. Variables defined outside all functions (with the exception of const in C++) and function definitions default to external linkage. You can specifically force them to have internal linkage using the static keyword. You can explicitly state that an identifier has external linkage by defining it with the extern keyword. Defining a variable or function with extern is not necessary in C, but it is sometimes necessary for const in C++.

Automatic (local) variables exist only temporarily, on the stack, while a function is being called. The linker doesn’t know about automatic variables, and so these have no linkage.

Constants

In old (pre-Standard) C, if you wanted to make a constant, you had to use the preprocessor:

#define PI 3.14159

Everywhere you used PI, the value 3.14159 was substituted by the preprocessor (you can still use this method in C and C++).

When you use the preprocessor to create constants, you place control of those constants outside the scope of the compiler. No type checking is performed on the name PI and you can’t take the address of PI (so you can’t pass a pointer or a reference to PI). PI cannot be a variable of a user-defined type. The meaning of PI lasts from the point it is defined to the end of the file; the preprocessor doesn’t recognize scoping.

C++ introduces the concept of a named constant that is just like a variable, except that its value cannot be changed. The modifier const tells the compiler that a name represents a constant. Any data type, built-in or user-defined, may be defined as const. If you define something as const and then attempt to modify it, the compiler will generate an error.

You must specify the type of a const, like this:

const int x = 10;

In Standard C and C++, you can use a named constant in an argument list, even if the argument it fills is a pointer or a reference (i.e., you can take the address of a const). A const has a scope, just like a regular variable, so you can “hide” a const inside a function and be sure that the name will not affect the rest of the program.

The const was taken from C++ and incorporated into Standard C, albeit quite differently. In C, the compiler treats a const just like a variable that has a special tag attached that says “Don’t change me.” When you define a const in C, the compiler creates storage for it, so if you define more than one const with the same name in two different files (or put the definit