Introduction

This is the first in a tutorial series on C and C++ pointers. It is written for beginning programmers to clarify pointer concepts, but is a great refresher for all programmers. It includes code snippets, diagrams, and discussion text. For a more advanced, under-the-hood deep dive on arrays, pointers, and references, see this post. Let’s dive right in.

What are Pointers in C and C++?

Pointers in C and C++ are variables that point to other variables. They can also point to unnamed objects or regions in computer memory.

Foundations: What is a Variable?

In his book Programming: Principles and Practice Using C++, Edition 2, Bjarne Stroustrup defines a variable this way:

An object is a region of memory with a type that specifies what kind of information can be placed in it. A named object is called a variable.

He further explains:

You can think of an object as a “box” into which you can put a value of the object’s type…

So, we can visualize a variable as a box with a type (like char, int, string, etc.) that also has a name by which we can refer to it, and which contains a value that is stored inside the box (in computer memory):

Variable Box Diagram

char Variables

char c;  // Define a variable whose name is c, type is char
c = 'a'; // assign the character 'a' to variable c

printf("%c", c); // Output: a

In the above variable definition, c is the name of the variable, and char is the type. The type of a variable defines both the size of the variable in memory (for char variables, one byte, the smallest addressable area in memory) and how it is to be handled by the compiler (it can be assigned a letter like 'a' or a small integer; where char is signed, as it is on most desktop compilers, those values typically range from -128 to 127).

int Variables

int i;     // Define a variable whose name is i, type is int
i = 15339; // assign the integer value 15339 to i

printf("%d", i); // Output: 15339

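Above, we have defined a variable named i of type int. On most current desktop and server systems, it is an object that takes up four bytes in memory, and is treated by the compiler as a signed integer variable whose values can go from -2,147,483,648 to 2,147,483,647. (The language standards only guarantee at least the range -32767 to 32767, which fits in a two byte int.)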

Note that the variable names are arbitrary, as long as they are in the proper format and do not collide with other names, such as keywords or names in libraries or other linked-in code.

Variables in Memory

When the program is compiled and run, these variables will be placed at a certain location in memory. For instance, if we have 16 bytes of memory, with the first byte starting at address zero and the last byte starting at address 15 (zero to 15 inclusive is 16), the variable c could be placed at address 2 (the third byte from the bottom), and i at address 4 (the fifth byte from the bottom):

char and int in Memory

Note that in C and C++, addresses and indexes (such as an index into an array - look for next post) are zero based. That is, they start at zero. So, the address (or index) of the 1st byte in memory is 0, the 2nd is 1, …, the nth byte is n - 1.

Pointers Explained

A pointer is exactly what its name implies: a pointer to something, often another variable, which is what we deal with in this tutorial. A pointer is a variable, just like a char or int variable, but its type is pointer to type, where you specify the type:

Defining Pointers

char* pc; // Define a variable named pc of type `pointer to char`
int* pi;  // Define a variable named pi of type `pointer to int`

Above, the * after the type (char and int) is what indicates to the compiler you are defining a pointer to that type. (In a definition, the * is part of the declaration syntax rather than an operator.) The type of pc is pointer to char, and the type of pi is pointer to int.

Assigning Values to Pointers

Since pointers are variables, like char and int, you must assign them a value, and the value you put in them is the address of the variable you want the pointer to point to (you can also assign them the value NULL in C and C++, or nullptr in C++11 and later, which indicates they point to nothing):

char c;   // Define a char variable
int i;    // Define an int variable

char* pc; // Define a variable of type `pointer to char`
int* pi;  // Define a variable of type `pointer to int`

pc = &c;  // Get the address of c and put it in pc
pi = &i;  // Get the address of i and put it in pi

// Print out the addresses stored in pc and pi
printf("%p %p", (void*)pc, (void*)pi); // %p expects a void*
// Output: 02 04

Above, the & prefix operator gets the address of a variable, so &c returns the address of the variable c, and here we assign it as the value for the pointer pc. Since the address of c is 2, that is the value that gets put into pc. Similarly, the value 4 gets put in pi.

Notice that in the printf statement, we use the %p format specifier to format pointer (address) printouts, in this case as 0 padded hexadecimal numbers (see below). The exact text %p produces is up to the implementation, so your output may look different.

Pointers to char and int

You can see above clearly that the value assigned to the pointer (the number put into the pointer) is the address of the variable it points to.

Pointers are Just Variables in Memory Too

Pointers are variables. In the diagram above, I put them to the side to make it easier to see the point being made: that they point to a variable and that the value stored in the pointer variable is the memory address of the variable they point to. However, pointers are variable objects just like any other variable, so they also reside in computer memory:

char, int, and Pointers in Memory

As you can see, I have changed the figure to have the pointers placed in memory like any other variable, and have placed them at their very own respective addresses. Note that for a 16 byte total memory system, a single byte is sufficient to hold all the possible addresses, so that is how I did it in the above diagram to save space:

char c;   // Define a char variable
int i;    // Define an int variable

char* pc; // Define a variable of type `pointer to char`
int* pi;  // Define a variable of type `pointer to int`

pc = &c;  // Get the address of c and put it in pc
pi = &i;  // Get the address of i and put it in pi

// Print out the address of pi and pc and the addresses stored in them
printf("%p %p %p %p", (void*)&pc, (void*)&pi, (void*)pc, (void*)pi);
// Output: 0C 09 02 04

In the printf statement, I use the %p format specifier to print four addresses as 0 padded hexadecimal numbers (in our “make believe” system of 16 bytes of memory). There are sixteen hex digits which include the 10 from decimal (0-9) plus six more: A (10), B (11), C (12), D (13), E (14), and F (15). So, the 0C for the address of pc is just 12, and the 09 for the address of pi is just 9, both of which match the addresses where pc and pi are located in the diagram above.

The first two are the addresses of the pointer variables pc and pi. I use the & operator to get the address of pc as the first number to be printed out. This is not what the pointer pc points to, but the address in memory of the pointer variable pc itself (hence &pc).

A quick look at the diagram above, and we see the pointer variable pc is indeed located at the address 12. Similarly, &pi prints out as the hex value 09, and looking at the diagram above, we see I have located the pointer variable pi at address 9.

The next two 0 padded hex numbers show the contents of, or value stored in the pointer variables. These are the addresses of the char variable c (&c) and the int variable i (&i), which are 02 and 04, respectively. You can see from the diagram above that indeed I placed c at address 2 and i at address 4.

Remember, this is our “make believe” system that only has 16 bytes of memory, pointers only occupy one byte in memory, and I have determined myself the location of all the variables in memory for illustration purposes. Below I discuss what a typical real system would look like, but the “make believe” system is conceptually correct and keeps the diagrams simple and compact.

Using Pointers

Once one has pointed a pointer to another variable, one uses the * prefix operator to retrieve the value that is inside the variable that the pointer points to. This is called dereferencing (a pointer refers to another variable, so de-reference-ing it is to change it from a reference to the variable to the actual value of the variable it references, just as if one directly used the referenced variable itself):

char c = 'a';  // define char variable c and give it the value: a
char* pc = &c; // define "pointer to char" variable pc
               // and give it the value: `address of c`

printf("%c %c", c, *pc); // Output: a a

Note, above I have used the shortcut notation that assigns an initial value right in the definitions of each variable. First, I define a char variable named c, simultaneously assigning it the value 'a'. Then, I define a pointer to char variable pc, simultaneously assigning it the address of the variable c.

Using printf, I first print out the value of c (a character), directly using the variable c itself.

I then print out the value of c again, but this time indirectly through the pointer to it (pc). I use the * prefix operator to dereference pc (hence, *pc), which returns the value of the variable to which it points, in this case the value of c (‘a’). So, we end up just printing ‘a’ twice.

Changing Pointers

Pointers can be changed at any time to point to something else, and you can assign the contents of one pointer to another:

char c = 'a';   // Define char variable c and assign it value: a
char d = 'f';   // Define char variable d and assign it value: f

char* pc = &c;  // Define pointer-to-char variable pc
                // and assign it the value: `address of c`
char* pd = &d;  // Define pointer-to-char variable pd
                // and assign it the value `address of d`

printf("%c %c %p %p %c %c", c, d, (void*)pc, (void*)pd, *pc, *pd);
// Output: a f 02 04 a f

pd = pc; // Assign value in pc to pd

printf("%c %c %p %p %c %c", c, d, (void*)pc, (void*)pd, *pc, *pd);
// Output: a f 02 02 a a

pc = &d; // Assign `address of d` to pc
printf("%c %c %p %p %c %c", c, d, (void*)pc, (void*)pd, *pc, *pd);
// Output: a f 04 02 f a

After defining and assigning values to the two char variables c and d, I define and assign the address of c to pc then the address of d to pd.

The first printf shows what one would expect: c holds 'a', d holds 'f', pc holds the address of c, and pd holds the address of d. In the printout, dereferencing pc (*pc) returns the value of c, and dereferencing pd (*pd) returns the value of d.

Since pointers are just variables that hold values which happen to be the addresses of the variables they point to, I can assign the value (“contents of”) one to another, as long as they are pointers of the same type. I do this (pd = pc), then the following printf shows that pd now points to c, just like pc points to c. We can verify this because the addresses held in pc and pd print out the same (the address of c, which is 2). Also, when dereferenced, they both print out the same value: the value of c (which is ‘a’). Note that the values in c and d remain unchanged, as you can see from the first two values in the printout.

Finally, I can directly put the address of d in pc (pc = &d) without affecting pd. As the following printf shows, we have ended up switching what pc and pd point to: pc points to d and pd points to c. Comparing the addresses stored in pc and pd, they are switched from the first printf to the last, and printing *pc and *pd prints out the values ‘f’ then ‘a’, switched in order from the first printout of ‘a’ then ‘f’. Again, the values of c and d printed using those variable names directly have not changed.

Assigning Values Using Pointers

Finally, we show below how to change the value of the variable a pointer points to:

char c = 'a';  // Define char variable c and assign it the value: a
char* pc = &c; // Define pointer-to-char variable pc
               // and assign it the value: `address of c`

// print out value stored in c directly then with dereferenced pc
printf("%c %c", c, *pc);
// Output: a a

*pc = 'f'; // Using the dereferenced pointer pc, assign c the value: f

// Print out the value stored in c directly then with dereferenced pc
printf("%c %c", c, *pc);
// Output: f f

Just dereference (using the * prefix operator) the pointer in the assignment, and the variable the pointer points to will be updated. Of course, the type of the value you are assigning must be the same type as the variable being pointed to (and not an address): in this case, a char type, not a pointer to char (not a char*) type.

When we assign to a dereferenced pointer (*pc), the address of the pointer itself (&pc) and the value stored in pc (the address of c) do not change. Just the value of the variable c that pc points to changes. So, *pc = 'f' changes the value in c from ‘a’ to ‘f’.

Real Memory Spaces

In a typical desktop or server system, the program might run in a 32 bit environment (for instance x86), so the size of the pointer would need to be 4 bytes (the same size as an int). More commonly today, it could run in a 64 bit environment (for instance x64), so the size would need to be 8 bytes (the size of a long long).

The theoretical maximum addressable memory space in a 32 bit system is 2^32, or 4,294,967,296 bytes (4 GB). In Windows, typically only half of this is available to a user process. The theoretical maximum addressable memory space in a 64 bit system is 2^64, or around 16 exabytes. Only a small portion of this is used by Windows systems. Here is a diagram that illustrates details of an x64 paged virtual address space.

The diagram below shows how the variables are stored in a process (program), illustrating an actual run of the code snippet immediately below it. Scale and proportion have been sacrificed to keep the diagram a workable size, but are good enough to get the point across:

char, int, and Pointers in Memory - Real Implementation

The 0x in front of each 0 padded hex number is the way to specify a hex number literal when writing code in C or C++. In the diagram, as well as in the code below, the simplistic small number addresses have been changed to the actual 32 bit (4 byte) addresses printed out when I ran the instrumentation code. The diagram reflects that these addresses are larger than a char (c) and the same size as an int (i).

The pointers’ addresses match the addresses shown on the right side of the diagram, and the contents of the pointers match the addresses of the variables they point to, also shown on the right.

Note also that the arrangement of the variables is opposite from all the other diagrams - the first defined variable is at the top, the last at the bottom. This reflects what happened in this particular run: the associated code snippet below was instrumented in a function, the compiler placed each local variable on the stack in the order defined, and on x86 the stack grows downward, from higher addresses to lower ones. Other compilers and optimization settings are free to lay out local variables differently.

char c = 'a';  // Define a char variable and assign value: 'a'
int i = 15336; // Define an int variable and assign value: 15336

char* pc; // Define a variable of type `pointer to char`
int* pi;  // Define a variable of type `pointer to int`

pc = &c;  // Get the address of c and put it in pc
pi = &i;  // Get the address of i and put it in pi

// Print out the address of pi and pc and the addresses stored in them
printf("%p %p %p %p", (void*)&pc, (void*)&pi, (void*)pc, (void*)pi);
// Output: 00BAFA74 00BAFA68 00BAFA8F 00BAFA80

The above code snippet is identical to the code above under the heading Pointers are Just Variables in Memory Too, except that instead of showing the addresses from my 16 byte example system in the printf outputs as I have done everywhere else above, I show the real printouts from my sample code running as a 32 bit process which displays 32 bit (4 byte) addresses.

This section is not critical for an introductory understanding of pointers, but is here for completeness and to show in real terms how my 16 byte address space with 1 byte pointers really is just for pedagogical purposes. If it seems a little too complex at this point, don’t worry about it - just try to understand the gist of the section.

Conclusion

I hope I have accomplished my goal of providing an approachable and insightful tutorial on C and C++ pointers. Next up in the series is investigating the close relationship between arrays and pointers along with pointer arithmetic.

Thanks to Pexels for the free image