Memory, Pointers and C under the hood. (full guide)

You should leave this guide with full clarity of pointers in C.

Featured on Hashnode

Introduction

Pointers in C can be confusing. There are many ways to work with pointers and understanding practical use cases can be hard too.

We're gonna start by explaining how memory works in C. That will make it easier to understand the purpose of pointers.

This is a thorough guide to every single detail on pointers.

Memory

What?

In a computer, memory is organized as a large number of bytes.

Each byte is used to store value and has a unique address. The computer can perform read or write operations to these addresses.

Memory block

A memory block is a section of computer memory where each byte is right next to its neighbor, without any gaps or interruptions. This arrangement makes it easier for the computer to read and write multiple pieces of related data.

You often use memory blocks for storing groups of data that belong together, such as the elements in an array or the characters in a string. By keeping them arranged like that, the computer can access and manage the data more efficiently.

Memory address

A memory address is a unique identifier that points to a byte in a computer's memory.

Addresses are not fixed. The same variable can end up at a different address each time you run a program.

Visual explanation

Analogy

To better understand memory in a computer, let's take a look at an analogy.

Bookshelf

Imagine we have a bookshelf in a library. The shelf has slots. Each slot can hold one book. Every slot has a unique name to identify it whenever a librarian wants to look up a book.

We also arrange books in a single row. Sometimes by category or the letter the book starts with.

  • Bookshelf is the memory.

  • Each slot is a byte.

  • Each book is a value a byte stores.

  • Each unique slot name is the memory address.

  • Memory block in this case would be when we arrange books in a single row e.g. when storing all "Fantasy" books in a row, making them easy to find for visitors.

Memory management

C is a low-level language.

With great power comes great responsibility.

Some memory is managed for you. Some you have to manage yourself.

There are two primary memory segments in C you should be aware of:

  • Stack: The memory managed for you.

  • Heap: The memory you have to manage yourself.

Stack

The Stack is a block of memory that grows and shrinks automatically. This is managed by the C compiler. You don't have to do anything here.

Imagine a stack of books on a table. You can add a book to the top of the stack (push), or you can take a book off the top (pop). However, you can't add or remove a book from the middle or the bottom, only from the top.

This is a "Last-In, First-Out" (LIFO) system. The last book you put on the stack is the first one you'll remove.

But, what is it?

The Stack is a computer memory region that automatically manages temporary variables in a Last-In, First-Out system, created and destroyed by the C runtime without manual intervention.

Stack Pointer

The Stack uses a special variable called the Stack Pointer to keep track of the top of the stack. It's like watching the top book on our stack of books.

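When a new variable is added to the Stack, the Stack Pointer moves up. When a variable is removed, the Stack Pointer moves down. (On many real machines the stack actually grows toward lower addresses, but the push and pop picture stays the same.)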

Characteristics of the Stack

  • Fast Allocation/Deallocation: The stack is very fast because all it does is move the stack pointer up and down. We don't have to look for anything when allocating new memory.

  • Fixed Size: The size is determined at the beginning, and it's limited.

  • Last-In, First-Out (LIFO): Data is added or removed from the top of the stack.

  • Automatically Managed: The compiler takes care of adding/removing data.

  • Local Scope: Data on the stack is local to the function that pushed it. This means that variables on the stack are only accessible within the function that created them. Once the function returns, those variables are popped off the stack, and their values are lost.

Code

#include <stdio.h>

void myFunction(int param) {
  int localVar = 10;  // localVar is pushed onto the Stack
  printf("Parameter value: %d\n", param);
  printf("Local variable value: %d\n", localVar);
  // Both param and localVar are popped off the 
  // Stack when myFunction returns
}

int main() {
  int mainVar = 5;  // mainVar is pushed onto the Stack
  myFunction(mainVar);  // mainVar is passed as a parameter to myFunction
  // mainVar remains on the Stack until main() returns
  return 0;
  // mainVar is popped off the Stack here
}

Heap

Imagine you have a large, open field where you can place items anywhere you like. There are no specific rules about where to place things, but you do have a limited amount of space. Every time you want to put an item in this field, you need to tag it with a small flag indicating its location.

You can add or remove items at any time, but if you lose the flag (or forget to remove it), that item will stay there forever, taking up valuable space.

This is how the Heap works. It's a region of your computer's memory that is not managed automatically for you. Unlike the Stack, where variables are managed for you, in the Heap, you have to manage memory yourself.

When you allocate memory on the Heap, it stays there until you or your program explicitly frees it.

Heap Pointer

To keep track of what's where in the Heap, you use pointers. When you allocate a new block of memory, you get back a pointer that "points" to the start of that block.

It's like the small flag you'd stick next to the item you place on the field.

Characteristics of the Heap

  • Dynamic Size: The heap grows and shrinks during runtime as required.

  • Slower Allocation/Deallocation: Compared to the stack, heap allocation and deallocation are slower. Allocating on the heap, for example, involves finding a block of memory large enough to fit the data, which can take time.

  • Explicit Management: You have to explicitly allocate and deallocate memory.

  • Global Scope: Data in the heap is globally accessible as long as you have a pointer to the data.

Code

#include <stdio.h>
#include <stdlib.h>

void myFunction() {
  int *heapVar = malloc(10 * sizeof(int));  // Allocating an array of 10 ints on the Heap

  if (heapVar == NULL) {  // Always check if malloc succeeded
    printf("Memory allocation failed!\n");
    exit(1);
  }

  heapVar[0] = 42;  // Access heap memory like regular array
  printf("First value in heap array: %d\n", heapVar[0]);

  free(heapVar);  // Don't forget to free the memory!
  // heapVar now dangles, do not use it until you point it to a valid memory location again
}

int main() {
  myFunction();  // myFunction allocates and frees memory on the Heap
  return 0;
}

Pointers

What?

Pointers in C are variables holding the address of where data is stored. They don't contain the data, but point to it.

Think of a pointer as a treasure map. It doesn't contain the treasure (data). Instead, it tells you where to find it.

#include <stdio.h>

int main() {
    int x = 10;
    int *ptr = &x;
    printf("Value of x: %d\n", *ptr);
    return 0;
}

Here, ptr doesn't hold the treasure (10), but it tells you where to find it: in the memory location of x.

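You declare a pointer by putting * in front of the variable name in its declaration, as in int *ptr.

If we did int *pointer = x;, we would be trying to store the value of x (10) in the pointer as if it were an address. Compilers warn about or reject this, and it is not what we want.

To get the address of where the data of a variable is stored, you put & in front of it, as in &x.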

Declare an empty pointer

Declaring an empty pointer is like getting an empty treasure map. You don't yet know where the treasure is, but you have something to fill in the details later.

int *ptr1;
float *ptr2;
char *ptr3;

Address and Dereference Operators

The ampersand for variables holding values (&) is like asking, "Where is your home?".

The asterisk when used on a pointer (*) is like going to that home to see what's inside.

Let's take a look at some code:

int x = 20;
int *ptr = &x;  // &x means the "address of x"
int y = *ptr;  // *ptr means "value at the address stored in ptr"

Don't get confused here. We're using * again, but this time on the pointer itself rather than in a declaration. This is called dereferencing. We're grabbing the value stored at the address the pointer holds.

The NULL Pointer

The NULL pointer is a special pointer value. It means the pointer is not pointing to any memory location.

Think of it as a signpost in the middle of the desert that says, "Nothing to see here." Just like you wouldn't dig for treasure in a location where there is a sign that says there is nothing.

You'd typically use NULL pointers when you want to initialize a pointer but don't have a specific address for it to point to yet. This ensures that you don't accidentally dereference garbage values.

int *ptr = NULL;

if (ptr) { // Will not execute because ptr is NULL
    printf("%d", *ptr);
}

Pointer Arithmetic

Pointer arithmetic allows you to move your pointer forwards or backwards in memory, effectively letting you navigate an array or block of memory.

Imagine you're reading a book, and your finger is following the lines as you read. You can move your finger (pointer) forward to the next word or backward to the previous word. This is what pointer arithmetic allows you to do in memory.

Pointer arithmetic is most useful when you are dealing with arrays and buffers. It enables you to traverse an array without using an index.

It's important to note that strings are arrays of characters in C, so pointer arithmetic is useful for strings too.

Arrays

#include <stdio.h>

int main() {
    int arr[5] = {1, 2, 3, 4, 5};
    int *ptr = arr;

    // ptr points to the first element of arr, which is 1
    printf("First element: %d\n", *ptr);  // Output: First element: 1

    // Using pointer arithmetic to move to the next element
    ptr += 1;
    printf("Second element: %d\n", *ptr);  // Output: Second element: 2

    return 0;
}

To clarify: when you assign an array to a pointer, the pointer points to the first element.

Strings

#include <stdio.h>

int main() {
    char str[] = "Hello";
    char *ptr = str;

    // ptr points to the first character of str, which is 'H'
    printf("First character: %c\n", *ptr);  // Output: First character: H

    // Using pointer arithmetic to move to the next character
    ptr += 1;
    printf("Second character: %c\n", *ptr);  // Output: Second character: e

    return 0;
}

As you can see, assigning strings to a pointer points to the first character, the first element in the array of characters.

printf with *?

You may be confused looking at the examples. Why do we use *ptr when we call printf?

We have to dereference the pointer. Otherwise, we would be printing the memory address. But in the examples, we want to print the values stored in those addresses.

What is an array exactly?

An array, as you may have guessed, is a memory block.

It's a section of contiguous bytes holding elements of the same type, one after another.

Strings under the hood

In our previous example, we had the string Hello. Under the hood, it would be an array of characters: {'H', 'e', 'l', 'l', 'o', '\0'}.

You may have expected most of it, except the value at the end: \0.

What's this?

In C, strings are arrays of characters, and the language doesn't have a string data type like in some other programming languages.

Therefore, we have the null-terminator (\0) which serves as a marker for the end of the string, telling functions that process strings where to stop reading the array of characters.

But why do we need it?

  1. Functions processing strings: Functions like strlen(), strcpy(), and strcat() rely on the null-terminator to know when the string ends. Without it, these functions would keep reading memory until they encounter a random zero byte, which could lead to buffer overflows, segmentation faults, or other undefined behavior.

  2. Memory Corruption: If the null-terminator isn't there, functions are very likely to read beyond the memory allocated for the string. Functions that also write, such as strcpy() and strcat(), can then overwrite neighboring data and cause unpredictable software behavior.

  3. Security Risks: The undefined behavior associated with not having a null-terminator can be exploited in security attacks, such as buffer overflow attacks.

Pointers and Functions

Pointers can be passed to functions as arguments, allowing those functions to modify the actual values the pointers point to.

Imagine you gave someone the remote control to your smart house. They can change the thermostat, turn lights on and off, etc., without needing to be physically inside your house.

That's how it is passing pointers to functions.

Pointers are often passed to functions to directly modify variables from calling functions, to return multiple values, or to pass large data structures efficiently.

void modify(int *a) { // function takes a pointer to an int
    *a = 10; // dereference to assign a new value to the location
             // the pointer points to.
}
}

int main() {
    int x = 5;
    modify(&x);  // x is now 10
    return 0;
}

Pointers to Pointers

A variable that points to a pointer. A variable that contains the address of another pointer.

That's a pointer to a pointer.

They're declared by using double asterisks (**).

int x = 5;          // A variable 'x'
int *p = &x;        // A pointer 'p' storing the address of 'x'
int **pp = &p;      // A pointer to a pointer 'pp' storing the address of 'p'

Analogy

Imagine you have a treasure chest (which represents a variable). Inside this chest is a map (pointer) that leads to another chest.

Now, you have a third chest that contains a map to the chest with the first map. This third chest is the pointer-to-pointer. It indirectly leads you to the ultimate treasure (the original variable).

Wtf, why do we need this?

Now that it clicked for you, you are probably like me, thinking it is a useless feature.

This isn't something you'd use often.

The general principle is, whenever you need an extra level of indirection or flexibility, consider using a pointer to a pointer.

But let's take a look at some use cases when you'd want to use a pointer to a pointer.

Fun fact

You could also have a pointer that points to a pointer to a pointer.

int x = 5;          // A variable 'x'
int *p = &x;        // A pointer 'p' storing the address of 'x'
int **pp = &p;      // A pointer to a pointer 'pp' storing the address of 'p'
int ***ppp = &pp;   // A pointer to a pointer to a pointer 'ppp' storing the address of 'pp'

Common use cases

Use cases you may want to consider using pointers to pointers:

  1. Dynamic 2D Arrays: When you want a 2D array (like a table or grid) that can change in size, you'll use pointers to pointers. This gives you an array of arrays, and each row can have a different size.

  2. Command-line Argument Parsing: When your C program starts, it receives its command-line arguments as a list of strings. In C, this list is actually a pointer to a pointer. You can go through this list easily to find out what options or files the user specified.

  3. Dynamic Data Structures: If you're building complex stuff like trees or networks of connected items (also known as graphs), you'll often need pointers to pointers, for example to let a function update the head of a list or relink a node in place.

  4. Changing Pointers in Functions: Sometimes, you're not just changing the data a pointer is pointing to, you're changing where the pointer itself is pointing. To do this in a function, you'll need to pass in a pointer to a pointer.

We're gonna take a look at some of them.

2D Arrays

Let's first take a look at how 2D arrays look:

#include <stdio.h>

int main() {
    // Declare a 3x2 array. This creates a contiguous block of memory.
    int arr[3][2] = {
        {1, 2},  // First row
        {3, 4},  // Second row
        {5, 6}   // Third row
    };

    // Accessing elements: arr[row][column]
    printf("Element at row 2, column 1: %d\n", arr[1][0]); // Indexes start at 0, so this is 3

    // Nested loop to iterate through all elements
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 2; ++j) {
            printf("%d ", arr[i][j]);
        }
        printf("\n");
    }

    return 0;
}

As you can see above, 2D arrays are like rows and columns. The first index is the row you're accessing. The second index is the column you're accessing.

Problems if you want to create dynamically sized 2D arrays:

  1. Fixed Size: The dimensions of a 2D array like the one above are fixed when it is created (here, at compile-time), which can be limiting if you need a dynamically resizable array.

  2. Memory Overhead: If you allocate a large 2D array to accommodate maximum size, but end up using only a fraction of it, you waste memory.

  3. No Variable-Length Rows: All rows must have the same number of columns, which may not always be desirable.

Let's look at how to create dynamically sized 2D arrays using pointers to pointers:

#include <stdio.h>
#include <stdlib.h>

int main() {
    // Declare a pointer to a pointer. It will serve as our 2D array.
    int **arr;

    // Allocate memory for 3 pointers (3 rows).
    // Each of these pointers will point to an array (a row).
    arr = malloc(3 * sizeof(int *));
    if (arr == NULL) {
        return 1;  // Memory allocation failed
    }

    // Allocate memory for each row and set values.
    for (int i = 0; i < 3; ++i) {
        // Allocate memory for 2 integers in each row.
        arr[i] = malloc(2 * sizeof(int));
        if (arr[i] == NULL) {
            return 1;  // Memory allocation failed
        }

        // Initialize the elements: column 0 gets i, column 1 gets i * 10.
        arr[i][0] = i;
        arr[i][1] = i * 10;
    }

    // Accessing elements: arr[row][column]
    // Nested loop to iterate through the dynamically allocated 2D array
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 2; ++j) {
            printf("%d ", arr[i][j]);  // Print each element
        }
        printf("\n");
    }

    // Free each dynamically-allocated row
    for (int i = 0; i < 3; ++i) {
        free(arr[i]);
    }

    // Free the array containing the row pointers
    free(arr);

    return 0;
}

We dynamically allocate an "array of pointers" first (arr), and then for each of those pointers, we dynamically allocate an array of integers.

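This is a dynamic 2D array, fully under your control. You can choose its dimensions at runtime, allocate each row independently, and pass it to functions as a plain int ** without baking the dimensions into the parameter type. You still have to pass the row and column counts along, because the pointer itself doesn't carry them.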

Command-Line argument parsing

When developing a C program that takes command-line arguments, the operating system passes these arguments to your main() function. This is done via the argc and argv parameters in main(). Here, argc is the argument count, and argv is an array of pointers to strings, representing the arguments. However, this array of pointers is actually passed as a pointer to a pointer (char **argv).

With arrays, you need to set their size when you create them, and you can't change that size later. But with a pointer to a pointer, you can point it to different areas of memory as needed. This lets you change the size by pointing to larger or smaller blocks of memory.

#include <stdio.h>

// Our main function which gets command line arguments
int main(int argc, char **argv) {
    // Loop through each command-line argument
    for(int i = 0; i < argc; i++) {
        // Print the argument
        // argv[i] grabs the i-th argument from the command line
        printf("Argument %d is %s\n", i, argv[i]);
    }
    return 0;
}

It may look confusing that we aren't dereferencing argv before we access the value. How is this possible?

The truth is, C does that under the hood for us when indexing an array.

When you write argv[i], it's the same as *(argv + i). The pointer argv is incremented by i positions and then dereferenced to get the value stored at that location.

Now, that may be confusing. To clarify, an array pointer points to the first element of its array. That's why we need to increment argv before we dereference to access the value.

So, when you see *(argv + i), you can read it like this:

  1. "Take the pointer argv".

  2. "Move i steps forward to point to the i-th element in the array."

  3. "Now dereference this new pointer to give me the value stored there."

Writing argv[i] is more convenient however. So stick to that.

Advanced function arguments

If you pass a single-level pointer to a function and reassign it there (for example with malloc), the change won't be visible outside the function. This is because the function works on a local copy of the pointer, not on the original pointer itself.

The solution is to pass a pointer to the pointer as an argument to the function. This way, you're giving the function the ability to modify the pointer itself, not just the data it points to.

This wouldn't work:

#include <stdio.h>
#include <stdlib.h>

// Function to initialize array
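void initializeArray(int *arr, int size) {
    arr = malloc(size * sizeof(int));  // Only the local copy 'arr' changes
    if (arr == NULL) {
        return;
    }
    for (int i = 0; i < size; ++i) {
        arr[i] = i;
    }
    // The caller never sees this memory, so it is leaked
}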

int main() {
    int *myArray = NULL;  // Declare a pointer to int
    initializeArray(myArray, 5);  // Call function to initialize array

    if (myArray == NULL) {
        printf("Memory not allocated.\n");
    }

    return 0;
}

In this example, myArray is still NULL after the function call because the function modifies a local copy of myArray, not myArray itself.

This would work:

#include <stdio.h>
#include <stdlib.h>

// Function to initialize array
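void initializeArray(int **arr, int size) {
    // Allocate memory and store the address in the caller's pointer
    *arr = malloc(size * sizeof(int));
    if (*arr == NULL) {
        return;  // Allocation failed; the caller checks for NULL
    }
    // Initialize array
    for (int i = 0; i < size; ++i) {
        (*arr)[i] = i;
    }
}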

int main() {
    int *myArray = NULL;  // Declare a pointer to int
    initializeArray(&myArray, 5);  // Pass the address of the pointer to initialize array

    // Check if memory was allocated and array was initialized
    if (myArray != NULL) {
        for (int i = 0; i < 5; ++i) {
            printf("%d ", myArray[i]);  // Should print 0 1 2 3 4
        }
        printf("\n");
    }

    // Don't forget to free the memory
    free(myArray);

    return 0;
}

In this example, initializeArray() takes a pointer to a pointer (int **arr). Inside the function, we use *arr to access what arr points to, the original myArray pointer in main(), and allocate memory for it. This way, the changes are also reflected in main().

Function Pointers

Function pointers in C are a way to store the address of a function, so you can call the function later, or even pass it to another function as a parameter.

The syntax for declaring a function pointer: return_type (*function_pointer_name)(parameter_type1, parameter_type2, ...);

Let's say for example we had this function:

int add(int a, int b);

The pointer to the function would be declared as:

int (*add_ptr)(int, int);

You can initialize the function pointer using the address-of operator &:

add_ptr = &add;

To call the function, you dereference the pointer:

int result = (*add_ptr)(5, 3);  // this calls add(5, 3)

C, however, lets you omit the dereferencing when calling through function pointers. This is also valid:

int result = add_ptr(5, 3);  // this also calls add(5, 3)

When you need them

Function pointers in C let you treat functions like variables: you can assign functions to them, pass them to other functions, or even call them, just like you'd use a normal variable. They give you a way to choose and use different functions on-the-fly.

Use cases when they're often used:

  1. Plug-in Architectures: If you are developing a system that allows for plug-ins or modules, function pointers can serve as hooks to extend functionality without modifying existing code.

  2. Event-Driven Programming: In GUIs or other event-driven systems, function pointers can be used to handle events dynamically, such as button clicks or key presses.

  3. Callback Functions: Function pointers are often used as arguments to functions that require a user-defined operation, like custom comparison functions in sorting algorithms.

  4. State Machines: In embedded systems or protocol designs, function pointers can represent states, making the state machine compact and easy to manage.

Realistic example

Let's say we're building a GUI framework.

Typically, GUI frameworks include various widgets like buttons, checkboxes, sliders, etc., that the user can interact with.

In this example, we'll focus on buttons.

When a user clicks a button, some specific action should occur. This action could be anything: opening a new window, saving a file, or even exiting the application.

One of the challenges in building a GUI framework is providing a way to easily customize this behavior for different buttons.

Why function pointer is necessary

Using function pointers allows us to assign custom behavior to each button dynamically.

By attaching a function pointer as an event handler for a button click event, the framework becomes more powerful for the users.

Users can specify what should happen when a button is clicked by passing a function pointer that points to their custom function.

Code

#include <stdio.h>

// Function pointer type definition for handling button clicks.
// We name the type to make the code more readable.
typedef void (*ButtonClickedHandler)(void);

// Button structure, encapsulating the attributes of a button.
typedef struct {
    ButtonClickedHandler onClick;  // Function pointer for the click event.
    const char* label;  // Label text on the button.
} Button;

// Function that simulates clicking a button.
// It takes a pointer to a Button structure.
void button_click(Button* button) {
    // Check if a function pointer has been set.
    if (button->onClick != NULL) {
        printf("Button '%s' clicked. ", button->label);
        button->onClick();  // Call the function through the function pointer.
    } else {
        printf("Button '%s' clicked but no action assigned.\n", button->label);
    }
}

// A custom function that gets called when a button is clicked.
void say_hello(void) {
    printf("Hello, World!\n");
}

// Another custom function for another button.
void say_goodbye(void) {
    printf("Goodbye, World!\n");
}

int main() {
    // Create two buttons.
    Button helloButton = {say_hello, "Say Hello"};
    Button goodbyeButton = {say_goodbye, "Say Goodbye"};

    // Simulate clicking the buttons.
    button_click(&helloButton);  // Output: "Button 'Say Hello' clicked. Hello, World!"
    button_click(&goodbyeButton);  // Output: "Button 'Say Goodbye' clicked. Goodbye, World!"

    return 0;
}

Pointers and Structs

As you probably know by now, we can have pointers pointing to structs too.

struct Person {
    char name[50];
    int age;
};

struct Person *personPtr; // how you would declare a pointer

How you would initialize and access struct pointers (strcpy needs #include <string.h>):

struct Person personInstance;
personPtr = &personInstance;

strcpy(personPtr->name, "Alice");
personPtr->age = 30;

Dynamic memory allocation

How you would dynamically allocate memory:

personPtr = (struct Person *) malloc(sizeof(struct Person));

if (personPtr) {
    strcpy(personPtr->name, "Bob");
    personPtr->age = 25;
}

free(personPtr);  // Don't forget to release memory

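The (struct Person *) before the malloc function call is a type cast. It explicitly tells the compiler to treat the raw pointer returned by malloc as a pointer to struct Person. In C the cast is optional, because the void * that malloc returns converts to any object pointer type implicitly. It is only required if the code must also compile as C++.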

Pointers inside structs

struct Node {
    int data;
    struct Node *next;
};

In the example above, a Node struct contains an integer and a pointer to another Node. This forms the basis for linked lists.

Functions and Structs

void updateAge(struct Person *p, int newAge) {
    if (p) {
        p->age = newAge;
    }
}

Pointers to Array of structs

struct Person people[10];
struct Person *arrayPtr;

arrayPtr = people;

(arrayPtr + 2)->age = 28; // This will set the age of the third person in the array to 28

Generic pointers

Imagine you're a postman delivering mail to houses. Now, instead of knowing the specific details of every house, all you care about is the address. That way, you can deliver to any house, big or small.

A generic pointer is like that postman. It doesn't care about what data type it's pointing to, it just holds an address.

It can hold the address of any data type and doesn't know what type of data it's pointing to.

Declaration

void *pointerName; // this is what it looks like

Using a generic pointer

Say you have a random integer and you want our "postman" (generic pointer) to remember its address:

int randomNum = 42;          // Just a random number
void *genPtr = &randomNum;   // Our postman (generic pointer) holding its address

If you want to see that number again using the postman, you need to tell him it's an integer house. We use type casting to tell the compiler what type:

int retrievedNum = *(int *)genPtr;   // Tell postman, "Hey, that's an integer house!"

Why use generic pointers?

Flexibility

Generic pointers allow code to work with any data type, reducing the need for type-specific implementations.

Type Agnosticism

Functions like malloc() don't need to know your data type, they just allocate the requested memory size. Generic pointers let these functions work with many data types.

Interfacing with Systems

In complex or embedded systems, you often interact directly with memory. Generic pointers let you handle memory without forcing a specific data structure on it.

Emulating Polymorphism

Want some object-oriented-like behaviors in C, such as polymorphism? Generic pointers are your go-to, allowing unified handling of diverse data types.

Optimization

Instead of multiple functions for different types, one function using generic pointers can handle multiple data types. This cuts down on duplicated code and can make the binary smaller, although it doesn't automatically make the code faster.

Compatibility

For library developers, generic pointers can help maintain both backward and forward compatibility, ensuring a consistent interface even if underlying data structures change.

Common mistakes

Let's go over some common mistakes when working with pointers.

Uninitialized Pointers

If you declare a pointer but don't initialize it with a valid address, it holds an indeterminate (garbage) value and may point anywhere. Dereferencing it is undefined behavior.

int *ptr;
*ptr = 5;  // Dereferencing uninitialized pointer

Always initialize pointers either to NULL or to a valid address before using them:

int *ptr = NULL;
// ... Later in the code
ptr = &someIntegerVariable;

Dangling Pointers

A dangling pointer is a pointer that still references a memory location after that memory location has been deallocated.

int *ptr = (int *)malloc(sizeof(int));
free(ptr);
*ptr = 4;  // ptr is now a dangling pointer

After deallocating memory using free(), always set the pointer to NULL. This way, if you accidentally try to dereference it again, it becomes a NULL pointer dereference, which is easier to spot:

free(ptr);
ptr = NULL;

It's easier to spot because dereferencing a NULL pointer almost always causes a program to crash immediately with a segmentation fault or similar error.

Memory Leaks

If you allocate memory using malloc(), calloc(), or realloc(), and then lose the reference to that memory without free()-ing it, you've caused a memory leak.

for(int i = 0; i < 100; i++) {
    int *ptr = (int *)malloc(sizeof(int));
    // Forgot to free ptr
}

The problem here is that you no longer have access to the pointer. Hence you can't free it later in the code. So it takes up unnecessary memory in the heap.

Think of a garage where you can park your car for a few hours. If everyone starts leaving their cars there and never comes back to pick them up, the garage will eventually run out of space. The parked cars are like memory allocations, and the full garage is a system that's run out of memory due to not "picking up" or freeing the used memory.

Buffer Overflows

If you write data past the end of an allocated block of memory, you overwrite whatever data is stored after that block, leading to undefined behavior.

char *ptr = (char *)malloc(5 * sizeof(char));  // Allocate space for 5 chars
strcpy(ptr, "HelloWorld");  // Writes past the allocated memory

Always be certain about the bounds of your memory allocations. If using functions like strcpy(), ensure the source string will fit within the allocated destination.

End

I'll end this post by summarizing some points when you would want to use pointers:

  1. Dynamic Memory Allocation: When you need memory to be allocated during runtime rather than at compile time, such as creating dynamic data structures.

  2. Function Arguments: For modifying the actual values of arguments from within a function (pass-by-reference).

  3. Arrays and Strings: Working with arrays and strings, especially when you want to pass them to functions or return them without copying the entire data.

  4. Data Structures: Implementing complex data structures, where elements need to reference other elements, such as linked lists, trees, and graphs.

  5. Function Pointers: For dynamic function dispatch, callback functions, or tables of functions (like in event-driven programming or custom sort operations).

I think understanding pointers themselves isn't hard. The confusing parts have more to do with the practical use cases and how C works under the hood.