Trees in C

Linked lists are chains. One thing after another. But what if you need to organize data in a hierarchy?

Your computer’s file system has folders inside folders. A company has managers managing managers. A family tree has parents and children and grandchildren.

That’s what trees are for.

What’s a Tree?

A tree is an upside-down tree. (Computer scientists apparently never go outside.)

The root node A sits at the top. It has two children: B and C. Node B has two children of its own: D and E. Node C has one child: F. Nodes D, E, and F have no children - they’re called leaves.

A tree structure with node A at the top (the root). A has two children: B on the left and C on the right. B has two children: D and E. C has one child: F. D, E, and F are leaf nodes with no children.

The terminology comes from family trees:

  • Root: The top node (like the oldest ancestor)
  • Parent: A node with children below it
  • Child: A node connected below another node
  • Leaf: A node with no children (the end of a branch)
  • Siblings: Nodes with the same parent

Node A is the root. B and C are A’s children (and siblings of each other). D and E are B’s children. D, E, and F are leaves.

Binary Trees

A binary tree is a tree where each node has at most two children: a left child and a right child.

typedef struct TreeNode {
    int value;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

Like a linked list, but instead of one next pointer, we have two: left and right. Here’s a binary tree with 10 at the root. Its left child is 5 (which has children 3 and 7), and its right child is 15 (which has only a right child, 20).

A binary tree with 10 at the root. The root's left child is 5, and right child is 15. Node 5 has two children: 3 on the left and 7 on the right. Node 15 has only a right child: 20. Nodes 3, 7, and 20 are leaves.

Creating Tree Nodes

#include <stdio.h>
#include <stdlib.h>

typedef struct TreeNode {
    int value;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

TreeNode *create_node(int value) {
    TreeNode *node = malloc(sizeof(TreeNode));
    if (node != NULL) {
        node->value = value;
        node->left = NULL;
        node->right = NULL;
    }
    return node;
}

Building a Tree by Hand

int main(void) {
    // Create nodes
    TreeNode *root = create_node(10);
    root->left = create_node(5);
    root->right = create_node(15);
    root->left->left = create_node(3);
    root->left->right = create_node(7);
    root->right->right = create_node(20);

    /*
           10
          /  \
         5    15
        / \     \
       3   7    20
    */

    return 0;
}

Tree Traversal

“Traversal” means visiting every node in the tree. But unlike a linked list (where there’s only one way to go), trees have choices. Do you go left first or right first?

There are three main ways to traverse a binary tree:

In-Order Traversal (Left, Root, Right)

Visit the left subtree, then the current node, then the right subtree.

void in_order(TreeNode *node) {
    if (node == NULL) return;

    in_order(node->left);    // Left first
    printf("%d ", node->value);  // Then current
    in_order(node->right);   // Then right
}

For our tree: 3 5 7 10 15 20

Notice: if the tree is a binary search tree (explained below), in-order gives you the values in sorted order!

Pre-Order Traversal (Root, Left, Right)

Visit the current node first, then left, then right.

void pre_order(TreeNode *node) {
    if (node == NULL) return;

    printf("%d ", node->value);  // Current first
    pre_order(node->left);   // Then left
    pre_order(node->right);  // Then right
}

For our tree: 10 5 3 7 15 20

Pre-order is useful for copying a tree or printing its structure.

Post-Order Traversal (Left, Right, Root)

Visit both children first, then the current node.

void post_order(TreeNode *node) {
    if (node == NULL) return;

    post_order(node->left);  // Left first
    post_order(node->right); // Then right
    printf("%d ", node->value);  // Current last
}

For our tree: 3 7 5 20 15 10

Post-order is useful when you need to delete a tree (delete children before parents) or evaluate an expression tree.

Memory Trick

  • In: I - left, N - node, right (alphabetical: L, N, R)
  • Pre: Node first, like a PREface comes first
  • Post: Node last, like a POSTscript comes last

Binary Search Trees

A Binary Search Tree (BST) is a binary tree with a special rule:

  • Everything in the left subtree is smaller than the current node
  • Everything in the right subtree is bigger than the current node

In this example, 10 is the root. All values to the left (5, 3, 7) are less than 10. All values to the right (15, 20) are greater than 10. Within each subtree, the same rule applies: 3 and 7 are both positioned correctly relative to 5, and 20 is correctly to the right of 15.

A binary search tree with 10 at the root. Left subtree contains 5 (with children 3 and 7) - all values less than 10. Right subtree contains 15 (with child 20) - all values greater than 10. The BST property holds at every node: 3 < 5 < 7, and 15 < 20.

This rule makes searching super fast. It’s like the “higher or lower” guessing game. Each step eliminates half the remaining possibilities.

It’s like looking up a word in a dictionary - you don’t start at page 1 and read every word. You open to the middle, see if you need to go earlier or later, and keep halving the search space.

Searching a BST

TreeNode *search(TreeNode *root, int value) {
    // Base case: empty or found
    if (root == NULL || root->value == value) {
        return root;
    }

    // Value is smaller? Go left
    if (value < root->value) {
        return search(root->left, value);
    }

    // Value is bigger? Go right
    return search(root->right, value);
}

In a balanced tree with n nodes, search takes about logâ‚‚(n) steps. For 1 million nodes, that’s only about 20 comparisons!

Inserting into a BST

TreeNode *insert(TreeNode *root, int value) {
    // Empty spot? Create node here
    if (root == NULL) {
        return create_node(value);
    }

    // Smaller? Go left
    if (value < root->value) {
        root->left = insert(root->left, value);
    }
    // Bigger? Go right
    else if (value > root->value) {
        root->right = insert(root->right, value);
    }
    // Equal? Already exists, do nothing

    return root;
}

Building a BST

int main(void) {
    TreeNode *root = NULL;

    root = insert(root, 10);
    root = insert(root, 5);
    root = insert(root, 15);
    root = insert(root, 3);
    root = insert(root, 7);
    root = insert(root, 20);

    printf("In-order: ");
    in_order(root);  // 3 5 7 10 15 20 (sorted!)
    printf("\n");

    // Search
    TreeNode *found = search(root, 7);
    if (found) {
        printf("Found: %d\n", found->value);
    } else {
        printf("Not found\n");
    }

    return 0;
}

Finding Min and Max

In a BST, the minimum value is always in the leftmost node. The maximum is in the rightmost.

TreeNode *find_min(TreeNode *root) {
    if (root == NULL) return NULL;
    while (root->left != NULL) {
        root = root->left;
    }
    return root;
}

TreeNode *find_max(TreeNode *root) {
    if (root == NULL) return NULL;
    while (root->right != NULL) {
        root = root->right;
    }
    return root;
}

Freeing a Tree

Use post-order traversal - free children before freeing the parent:

void free_tree(TreeNode *root) {
    if (root == NULL) return;

    free_tree(root->left);   // Free left subtree
    free_tree(root->right);  // Free right subtree
    free(root);              // Then free this node
}

If you freed the parent first, you’d lose the pointers to the children. Memory leak!

Tree Height

The height of a tree is the longest path from the root to any leaf:

int height(TreeNode *root) {
    if (root == NULL) return -1;  // Empty tree has height -1

    int left_height = height(root->left);
    int right_height = height(root->right);

    // Return the bigger one, plus 1 for this level
    if (left_height > right_height) {
        return left_height + 1;
    } else {
        return right_height + 1;
    }
}

Counting Nodes

int count_nodes(TreeNode *root) {
    if (root == NULL) return 0;

    return 1 + count_nodes(root->left) + count_nodes(root->right);
}

One for this node, plus however many are in each subtree.

Complete Example: Word Dictionary

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct WordNode {
    char word[50];
    char definition[200];
    struct WordNode *left;
    struct WordNode *right;
} WordNode;

WordNode *create_word(const char *word, const char *def) {
    WordNode *node = malloc(sizeof(WordNode));
    if (node != NULL) {
        strncpy(node->word, word, 49);
        node->word[49] = '\0';
        strncpy(node->definition, def, 199);
        node->definition[199] = '\0';
        node->left = NULL;
        node->right = NULL;
    }
    return node;
}

WordNode *insert_word(WordNode *root, const char *word, const char *def) {
    if (root == NULL) {
        return create_word(word, def);
    }

    int cmp = strcmp(word, root->word);
    if (cmp < 0) {
        root->left = insert_word(root->left, word, def);
    } else if (cmp > 0) {
        root->right = insert_word(root->right, word, def);
    }
    // If equal, word already exists

    return root;
}

WordNode *find_word(WordNode *root, const char *word) {
    if (root == NULL) return NULL;

    int cmp = strcmp(word, root->word);
    if (cmp == 0) {
        return root;
    } else if (cmp < 0) {
        return find_word(root->left, word);
    } else {
        return find_word(root->right, word);
    }
}

void print_all(WordNode *root) {
    if (root == NULL) return;

    print_all(root->left);
    printf("%s: %s\n", root->word, root->definition);
    print_all(root->right);
}

void free_dict(WordNode *root) {
    if (root == NULL) return;
    free_dict(root->left);
    free_dict(root->right);
    free(root);
}

int main(void) {
    WordNode *dict = NULL;

    dict = insert_word(dict, "algorithm", "A step-by-step procedure for solving a problem");
    dict = insert_word(dict, "binary", "A number system using only 0 and 1");
    dict = insert_word(dict, "compiler", "A program that translates code into machine language");
    dict = insert_word(dict, "debug", "To find and fix errors in code");
    dict = insert_word(dict, "function", "A reusable block of code");

    printf("All words (alphabetical):\n");
    print_all(dict);

    printf("\nLooking up 'compiler':\n");
    WordNode *found = find_word(dict, "compiler");
    if (found) {
        printf("  %s: %s\n", found->word, found->definition);
    }

    free_dict(dict);
    return 0;
}

The Problem with Unbalanced Trees

If you insert already-sorted data, you get a “degenerate” tree - basically a linked list:

Insert: 1, 2, 3, 4, 5

    [1]
      \
      [2]
        \
        [3]
          \
          [4]
            \
            [5]

Now search takes n steps instead of log(n). Not good.

There are special trees (AVL trees, red-black trees) that automatically rebalance themselves. But those are beyond this lesson - just know they exist.

Try It Yourself

  1. Write a function that checks if a binary tree is a valid BST
  2. Write a function that prints all leaf nodes
  3. Implement deletion from a BST (this is tricky!)
  4. Write a function that finds the lowest common ancestor of two nodes

Common Mistakes

  • Forgetting the base case (NULL check) in recursive functions
  • Freeing parent before children (lose access to children)
  • Not maintaining the BST property when inserting
  • Confusing the traversal orders

Next Up

In Part 19, we’ll learn about hash tables - data structures that can find things in constant time, like magic.


Enjoyed This?

If this helped something click, subscribe to my YouTube channel. More content like this, same approach - making things stick without insulting your intelligence. It’s free, it helps more people find this stuff, and it tells me what’s worth making more of.