Characters and ASCII
Every character you see on screen is secretly a number. Understanding this idea is key to working with text in C - and to understanding security problems that can happen when handling text.
The char Type
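A char holds one character - one letter, digit, or symbol. It takes up one byte of memory (8 bits on virtually every modern system), which can hold 256 different values. Whether those values run from 0 to 255 or from -128 to 127 depends on whether your compiler treats plain char as unsigned or signed. Every ASCII character (0-127) fits either way.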
char letter = 'A';
char digit = '7';
char symbol = '@';
Notice the single quotes. This is important:
- 'A' is a single character (the letter A, stored as the number 65)
- "A" is a string (we will cover this in Part 5 - it is different!)
Characters Are Numbers
Here is the key idea: char is just a small number. When you store 'A', you are actually storing the number 65.
char letter = 'A';
printf("Character: %c\n", letter); // Prints: A
printf("Number: %d\n", letter); // Prints: 65
This means you can do math with characters:
char letter = 'A';
letter = letter + 1; // Now letter is 'B' (66)
letter = letter + 24; // Now letter is 'Z' (66 + 24 = 90)
ASCII: The Character-to-Number Mapping
ASCII (American Standard Code for Information Interchange) is the standard that defines which number represents which character. It covers values 0-127.
Quick Reference: The Important Parts
You do not need to memorize the full ASCII table. Just remember these key ranges - you can always look up the rest:
| Range | Characters | What to Remember |
|---|---|---|
| 0 | '\0' (null) | Marks the end of strings - very important! |
| 48-57 | '0' to '9' | Digits. Note: '0' is 48, not 0! |
| 65-90 | 'A' to 'Z' | Uppercase letters |
| 97-122 | 'a' to 'z' | Lowercase letters (32 more than uppercase) |
That is it! These four ranges cover most of what you will need day-to-day.
Key Ranges to Know
Here is a more detailed breakdown:
| Range | Characters | Notes |
|---|---|---|
| 0-31 | Control characters | Non-printable. Includes \0 (0), \t (9), \n (10), \r (13) |
| 32 | Space | The space character |
| 48-57 | '0' to '9' | Digits. '0' is 48, NOT 0! |
| 65-90 | 'A' to 'Z' | Uppercase letters |
| 97-122 | 'a' to 'z' | Lowercase. Exactly 32 more than uppercase. |
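The ranges in this table translate directly into comparisons. Here is a minimal sketch of that idea. The function name ascii_category and its labels are made up for illustration, and treating 127 (DEL) as a control character is an assumption based on the full table later in this lesson.

```c
/* Hypothetical helper: returns a short label for an ASCII code.
   It takes an int so callers can safely pass values outside 0-127. */
const char *ascii_category(int code) {
    if (code < 0 || code > 127) return "not ASCII";
    if (code <= 31 || code == 127) return "control";
    if (code == 32) return "space";
    if (code >= 48 && code <= 57) return "digit";
    if (code >= 65 && code <= 90) return "uppercase";
    if (code >= 97 && code <= 122) return "lowercase";
    return "symbol"; /* punctuation and other printable characters */
}
```

For example, ascii_category('7') returns "digit" and ascii_category('\n') returns "control".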
The Critical Difference: '\0' vs '0'
This confuses everyone at first, and getting it wrong causes serious bugs:
| Character | Decimal | What It Is |
|---|---|---|
| '\0' | 0 | The null character - invisible, marks end of strings |
| '0' | 48 | The digit zero - a printable character |
char null_char = '\0'; // ASCII 0 - the string terminator
char zero_digit = '0'; // ASCII 48 - the character "0"
// These are completely different!
if (null_char == zero_digit) {
printf("Same\n"); // This will NEVER print
} else {
printf("Different\n"); // This WILL print
}
The null character '\0' marks where strings end in C. We will cover this in detail in Part 5.
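As a small preview of why the difference matters, here is a sketch of a loop that walks through text until it reaches '\0'. The name length_before_nul is made up for this example. It does roughly what the standard strlen function does, and strings themselves are explained properly in Part 5.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters that come before the
   first '\0'. A digit '0' (value 48) does not stop the loop. */
size_t length_before_nul(const char *text) {
    size_t count = 0;
    while (text[count] != '\0') {
        count++;
    }
    return count;
}
```

With this sketch, length_before_nul("a0b") is 3 because '0' is an ordinary character, while length_before_nul("a\0b") is 1 because the null character ends the text early.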
The Full ASCII Table
The tables below list every ASCII character. These are reference tables - you do not need to memorize them. Bookmark this page and come back when you need to look something up.
Control Characters (0-31)
These are invisible characters that tell the computer to do things like start a new line or make a beep sound. You cannot see them on screen, but they still exist in your text.
| Dec | Hex | Oct | Char | Description |
|---|---|---|---|---|
| 0 | 0x00 | 000 | NUL \0 | Null character (string terminator) |
| 1 | 0x01 | 001 | SOH | Start of heading |
| 2 | 0x02 | 002 | STX | Start of text |
| 3 | 0x03 | 003 | ETX | End of text |
| 4 | 0x04 | 004 | EOT | End of transmission |
| 5 | 0x05 | 005 | ENQ | Enquiry |
| 6 | 0x06 | 006 | ACK | Acknowledge |
| 7 | 0x07 | 007 | BEL \a | Bell (makes a beep) |
| 8 | 0x08 | 010 | BS \b | Backspace |
| 9 | 0x09 | 011 | HT \t | Horizontal tab |
| 10 | 0x0A | 012 | LF \n | Line feed (newline) |
| 11 | 0x0B | 013 | VT \v | Vertical tab |
| 12 | 0x0C | 014 | FF \f | Form feed |
| 13 | 0x0D | 015 | CR \r | Carriage return |
| 14 | 0x0E | 016 | SO | Shift out |
| 15 | 0x0F | 017 | SI | Shift in |
| 16 | 0x10 | 020 | DLE | Data link escape |
| 17 | 0x11 | 021 | DC1 | Device control 1 |
| 18 | 0x12 | 022 | DC2 | Device control 2 |
| 19 | 0x13 | 023 | DC3 | Device control 3 |
| 20 | 0x14 | 024 | DC4 | Device control 4 |
| 21 | 0x15 | 025 | NAK | Negative acknowledge |
| 22 | 0x16 | 026 | SYN | Synchronous idle |
| 23 | 0x17 | 027 | ETB | End of transmission block |
| 24 | 0x18 | 030 | CAN | Cancel |
| 25 | 0x19 | 031 | EM | End of medium |
| 26 | 0x1A | 032 | SUB | Substitute |
| 27 | 0x1B | 033 | ESC | Escape (\e works in GCC and Clang but is not standard C) |
| 28 | 0x1C | 034 | FS | File separator |
| 29 | 0x1D | 035 | GS | Group separator |
| 30 | 0x1E | 036 | RS | Record separator |
| 31 | 0x1F | 037 | US | Unit separator |
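Because these characters are invisible, it can help to count them rather than look for them. The following is a minimal sketch. The function name count_control_chars is an assumption for illustration, and it only looks at codes 0-31 from the table above.

```c
#include <stddef.h>

/* Hypothetical helper: counts control characters (codes 0-31) in a
   string. The terminating '\0' itself is not counted. */
size_t count_control_chars(const char *text) {
    size_t count = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        unsigned char code = (unsigned char)text[i];
        if (code <= 31) {
            count++;
        }
    }
    return count;
}
```

For the text "Name:\tAda\n" this returns 2: one tab and one newline.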
Printable Characters (32-126)
These are the characters you can actually see - letters, numbers, and symbols.
| Dec | Hex | Oct | Char | Dec | Hex | Oct | Char | Dec | Hex | Oct | Char | ||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 0x20 | 040 | (space) | 64 | 0x40 | 100 | @ | 96 | 0x60 | 140 | ` | ||
| 33 | 0x21 | 041 | ! | 65 | 0x41 | 101 | A | 97 | 0x61 | 141 | a | ||
| 34 | 0x22 | 042 | " | 66 | 0x42 | 102 | B | 98 | 0x62 | 142 | b | ||
| 35 | 0x23 | 043 | # | 67 | 0x43 | 103 | C | 99 | 0x63 | 143 | c | ||
| 36 | 0x24 | 044 | $ | 68 | 0x44 | 104 | D | 100 | 0x64 | 144 | d | ||
| 37 | 0x25 | 045 | % | 69 | 0x45 | 105 | E | 101 | 0x65 | 145 | e | ||
| 38 | 0x26 | 046 | & | 70 | 0x46 | 106 | F | 102 | 0x66 | 146 | f | ||
| 39 | 0x27 | 047 | ' | 71 | 0x47 | 107 | G | 103 | 0x67 | 147 | g | ||
| 40 | 0x28 | 050 | ( | 72 | 0x48 | 110 | H | 104 | 0x68 | 150 | h | ||
| 41 | 0x29 | 051 | ) | 73 | 0x49 | 111 | I | 105 | 0x69 | 151 | i | ||
| 42 | 0x2A | 052 | * | 74 | 0x4A | 112 | J | 106 | 0x6A | 152 | j | ||
| 43 | 0x2B | 053 | + | 75 | 0x4B | 113 | K | 107 | 0x6B | 153 | k | ||
| 44 | 0x2C | 054 | , | 76 | 0x4C | 114 | L | 108 | 0x6C | 154 | l | ||
| 45 | 0x2D | 055 | - | 77 | 0x4D | 115 | M | 109 | 0x6D | 155 | m | ||
| 46 | 0x2E | 056 | . | 78 | 0x4E | 116 | N | 110 | 0x6E | 156 | n | ||
| 47 | 0x2F | 057 | / | 79 | 0x4F | 117 | O | 111 | 0x6F | 157 | o | ||
| 48 | 0x30 | 060 | 0 | 80 | 0x50 | 120 | P | 112 | 0x70 | 160 | p | ||
| 49 | 0x31 | 061 | 1 | 81 | 0x51 | 121 | Q | 113 | 0x71 | 161 | q | ||
| 50 | 0x32 | 062 | 2 | 82 | 0x52 | 122 | R | 114 | 0x72 | 162 | r | ||
| 51 | 0x33 | 063 | 3 | 83 | 0x53 | 123 | S | 115 | 0x73 | 163 | s | ||
| 52 | 0x34 | 064 | 4 | 84 | 0x54 | 124 | T | 116 | 0x74 | 164 | t | ||
| 53 | 0x35 | 065 | 5 | 85 | 0x55 | 125 | U | 117 | 0x75 | 165 | u | ||
| 54 | 0x36 | 066 | 6 | 86 | 0x56 | 126 | V | 118 | 0x76 | 166 | v | ||
| 55 | 0x37 | 067 | 7 | 87 | 0x57 | 127 | W | 119 | 0x77 | 167 | w | ||
| 56 | 0x38 | 070 | 8 | 88 | 0x58 | 130 | X | 120 | 0x78 | 170 | x | ||
| 57 | 0x39 | 071 | 9 | 89 | 0x59 | 131 | Y | 121 | 0x79 | 171 | y | ||
| 58 | 0x3A | 072 | : | 90 | 0x5A | 132 | Z | 122 | 0x7A | 172 | z | ||
| 59 | 0x3B | 073 | ; | 91 | 0x5B | 133 | [ | 123 | 0x7B | 173 | { | ||
| 60 | 0x3C | 074 | < | 92 | 0x5C | 134 | \ | 124 | 0x7C | 174 | \| | ||
| 61 | 0x3D | 075 | = | 93 | 0x5D | 135 | ] | 125 | 0x7D | 175 | } | ||
| 62 | 0x3E | 076 | > | 94 | 0x5E | 136 | ^ | 126 | 0x7E | 176 | ~ | ||
| 63 | 0x3F | 077 | ? | 95 | 0x5F | 137 | _ | 127 | 0x7F | 177 | DEL |
Delete Character (127)
| Dec | Hex | Oct | Char | Description |
|---|---|---|---|---|
| 127 | 0x7F | 177 | DEL | Delete character |
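The Dec, Hex, and Oct columns are just three ways of writing the same number, and printf can produce all of them. Here is a sketch that formats a single row. The name format_ascii_row and the exact column layout are assumptions for this example, not something from the tables themselves.

```c
#include <stdio.h>

/* Hypothetical helper: writes one row for a printable ASCII code
   (32-126) into out as "decimal hex octal character".
   Returns the number of characters written, or -1 if the code is
   not printable or the buffer is too small. */
int format_ascii_row(int code, char *out, size_t size) {
    if (code < 32 || code > 126) {
        return -1;
    }
    int written = snprintf(out, size, "%3d 0x%02X %03o %c",
                           code, (unsigned)code, (unsigned)code, code);
    if (written < 0 || (size_t)written >= size) {
        return -1;
    }
    return written;
}
```

Calling it with 65 and a large enough buffer produces the text " 65 0x41 101 A", which matches the row for A in the table.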
Useful Patterns
Converting Between Cases
Uppercase and lowercase letters differ by exactly 32:
char upper = 'A'; // 65
char lower = upper + 32; // 97 = 'a'
char c = 'g'; // 103
char C = c - 32; // 71 = 'G'
You can also use bitwise operations (these change individual bits in the number). Bit 5 (value 32) is the one that controls whether a letter is uppercase or lowercase. These tricks only make sense when c is already known to be a letter:
char c = 'G';
c = c | 32; // Force lowercase: 'g' (set bit 5)
c = c & ~32; // Force uppercase: 'G' (clear bit 5)
c = c ^ 32; // Toggle case: 'G' becomes 'g' (flip bit 5)
Checking Character Types
char c = '7';
// Is it a digit?
if (c >= '0' && c <= '9') {
int value = c - '0'; // Convert char '7' to int 7
printf("Digit with value: %d\n", value);
}
// Is it uppercase?
if (c >= 'A' && c <= 'Z') {
printf("Uppercase letter\n");
}
// Is it lowercase?
if (c >= 'a' && c <= 'z') {
printf("Lowercase letter\n");
}
// Is it a letter?
if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
printf("Letter\n");
}
The standard library (built-in C code you can use) provides helpful functions in <ctype.h>:
#include <ctype.h>
char c = 'A';
isdigit(c); // 0 (false) - not a digit
isalpha(c); // non-zero (true) - is a letter
isupper(c); // non-zero (true) - is uppercase
islower(c); // 0 (false) - not lowercase
isspace(c); // 0 (false) - not whitespace
toupper(c); // 'A' (already upper)
tolower(c); // 'a'
// For bytes that might be outside 0-127, cast first: isdigit((unsigned char)c)
Converting Digit Characters to Numbers
The character '5' is not the number 5 - it is the number 53 (its ASCII value). To get the actual number:
char digit = '7';
int value = digit - '0'; // 55 - 48 = 7
// Going the other way:
int num = 3;
char c = num + '0'; // 3 + 48 = 51 = '3'
Escape Sequences
Some characters cannot be typed directly on your keyboard (like “newline” or “tab”). C uses backslash escape sequences to represent them:
| Escape | Dec | Description |
|---|---|---|
| \0 | 0 | Null character (string terminator) |
| \a | 7 | Bell/alert (makes a beep) |
| \b | 8 | Backspace |
| \t | 9 | Horizontal tab |
| \n | 10 | Newline (line feed) |
| \v | 11 | Vertical tab |
| \f | 12 | Form feed |
| \r | 13 | Carriage return |
| \\ | 92 | Backslash |
| \' | 39 | Single quote |
| \" | 34 | Double quote |
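To see the table in action, here is a sketch that goes the other way: given a character, it returns how you would spell it as an escape sequence in C source. The function name escape_spelling is made up for this example, and it only knows the entries from the table above.

```c
#include <stddef.h>

/* Hypothetical helper: returns the escape sequence spelling for a
   character, or NULL if the character is not in the table above.
   Each returned string starts with a literal backslash. */
const char *escape_spelling(char c) {
    switch (c) {
        case '\0': return "\\0";
        case '\a': return "\\a";
        case '\b': return "\\b";
        case '\t': return "\\t";
        case '\n': return "\\n";
        case '\v': return "\\v";
        case '\f': return "\\f";
        case '\r': return "\\r";
        case '\\': return "\\\\";
        case '\'': return "\\'";
        case '"':  return "\\\"";
        default:   return NULL;
    }
}
```

Passing '\n' (or the number 10, which is the same thing in ASCII) returns the two-character text \n, while passing 'A' returns NULL because A needs no escape.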
You can also specify any character by its octal or hex value:
char newline = '\n'; // Using escape sequence
char newline2 = '\012'; // Using octal (012 = 10)
char newline3 = '\x0A'; // Using hex (0A = 10)
// All three are identical
Extended ASCII (128-255)
Standard ASCII only defines characters 0-127 (7 bits). But a char is 8 bits, which gives 256 possible values, so a byte read as an unsigned number can range from 0 to 255. The characters 128-255 are called “Extended ASCII.”
Here’s the problem: There is no single agreed-upon standard for Extended ASCII. Different computers and operating systems used different mappings for these extra characters:
| Code Page | Used By | Characters 128-255 |
|---|---|---|
| ISO-8859-1 (Latin-1) | Western Europe | French, German, Spanish characters |
| Windows-1252 | Windows | Similar to Latin-1, with extras |
| CP437 | DOS | Box-drawing characters, some accents |
| KOI8-R | Russian | Cyrillic alphabet |
This caused chaos. A file written on one system would display garbage on another. The character at position 200 might be È on one system and ╚ on another.
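There is also a C-specific wrinkle with these values. Whether plain char is signed depends on the compiler, so a byte like 200 can show up as a negative number. The sketch below shows one common way to get the 0-255 value regardless. The names byte_value and is_ascii_byte are assumptions made up for this example.

```c
/* Hypothetical helper: returns a byte's value in the range 0-255,
   whether or not plain char is signed on this compiler. */
int byte_value(char c) {
    return (unsigned char)c;
}

/* Hypothetical helper: returns 1 if the byte is standard ASCII
   (0-127), otherwise 0. */
int is_ascii_byte(char c) {
    return byte_value(c) <= 127;
}
```

With this, byte_value('\xC8') is 200 (the position mentioned above), and is_ascii_byte reports that it falls outside standard ASCII.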
Extended ASCII Table (ISO-8859-1 / Latin-1)
This table shows ISO-8859-1, also called “Latin-1” - the most common extended ASCII for Western languages. Again, this is just a reference - no need to memorize it.
| Dec | Hex | Char | Dec | Hex | Char | Dec | Hex | Char | Dec | Hex | Char | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 128 | 0x80 | (ctrl) | 160 | 0xA0 | (nbsp) | 192 | 0xC0 | À | 224 | 0xE0 | à | |||
| 129 | 0x81 | (ctrl) | 161 | 0xA1 | ¡ | 193 | 0xC1 | Á | 225 | 0xE1 | á | |||
| 130 | 0x82 | (ctrl) | 162 | 0xA2 | ¢ | 194 | 0xC2 | Â | 226 | 0xE2 | â | |||
| 131 | 0x83 | (ctrl) | 163 | 0xA3 | £ | 195 | 0xC3 | Ã | 227 | 0xE3 | ã | |||
| 132 | 0x84 | (ctrl) | 164 | 0xA4 | ¤ | 196 | 0xC4 | Ä | 228 | 0xE4 | ä | |||
| 133 | 0x85 | (ctrl) | 165 | 0xA5 | ¥ | 197 | 0xC5 | Å | 229 | 0xE5 | å | |||
| 134 | 0x86 | (ctrl) | 166 | 0xA6 | ¦ | 198 | 0xC6 | Æ | 230 | 0xE6 | æ | |||
| 135 | 0x87 | (ctrl) | 167 | 0xA7 | § | 199 | 0xC7 | Ç | 231 | 0xE7 | ç | |||
| 136 | 0x88 | (ctrl) | 168 | 0xA8 | ¨ | 200 | 0xC8 | È | 232 | 0xE8 | è | |||
| 137 | 0x89 | (ctrl) | 169 | 0xA9 | © | 201 | 0xC9 | É | 233 | 0xE9 | é | |||
| 138 | 0x8A | (ctrl) | 170 | 0xAA | ª | 202 | 0xCA | Ê | 234 | 0xEA | ê | |||
| 139 | 0x8B | (ctrl) | 171 | 0xAB | « | 203 | 0xCB | Ë | 235 | 0xEB | ë | |||
| 140 | 0x8C | (ctrl) | 172 | 0xAC | ¬ | 204 | 0xCC | Ì | 236 | 0xEC | ì | |||
| 141 | 0x8D | (ctrl) | 173 | 0xAD | (shy) | 205 | 0xCD | Í | 237 | 0xED | í | |||
| 142 | 0x8E | (ctrl) | 174 | 0xAE | ® | 206 | 0xCE | Î | 238 | 0xEE | î | |||
| 143 | 0x8F | (ctrl) | 175 | 0xAF | ¯ | 207 | 0xCF | Ï | 239 | 0xEF | ï | |||
| 144 | 0x90 | (ctrl) | 176 | 0xB0 | ° | 208 | 0xD0 | Ð | 240 | 0xF0 | ð | |||
| 145 | 0x91 | (ctrl) | 177 | 0xB1 | ± | 209 | 0xD1 | Ñ | 241 | 0xF1 | ñ | |||
| 146 | 0x92 | (ctrl) | 178 | 0xB2 | ² | 210 | 0xD2 | Ò | 242 | 0xF2 | ò | |||
| 147 | 0x93 | (ctrl) | 179 | 0xB3 | ³ | 211 | 0xD3 | Ó | 243 | 0xF3 | ó | |||
| 148 | 0x94 | (ctrl) | 180 | 0xB4 | ´ | 212 | 0xD4 | Ô | 244 | 0xF4 | ô | |||
| 149 | 0x95 | (ctrl) | 181 | 0xB5 | µ | 213 | 0xD5 | Õ | 245 | 0xF5 | õ | |||
| 150 | 0x96 | (ctrl) | 182 | 0xB6 | ¶ | 214 | 0xD6 | Ö | 246 | 0xF6 | ö | |||
| 151 | 0x97 | (ctrl) | 183 | 0xB7 | · | 215 | 0xD7 | × | 247 | 0xF7 | ÷ | |||
| 152 | 0x98 | (ctrl) | 184 | 0xB8 | ¸ | 216 | 0xD8 | Ø | 248 | 0xF8 | ø | |||
| 153 | 0x99 | (ctrl) | 185 | 0xB9 | ¹ | 217 | 0xD9 | Ù | 249 | 0xF9 | ù | |||
| 154 | 0x9A | (ctrl) | 186 | 0xBA | º | 218 | 0xDA | Ú | 250 | 0xFA | ú | |||
| 155 | 0x9B | (ctrl) | 187 | 0xBB | » | 219 | 0xDB | Û | 251 | 0xFB | û | |||
| 156 | 0x9C | (ctrl) | 188 | 0xBC | ¼ | 220 | 0xDC | Ü | 252 | 0xFC | ü | |||
| 157 | 0x9D | (ctrl) | 189 | 0xBD | ½ | 221 | 0xDD | Ý | 253 | 0xFD | ý | |||
| 158 | 0x9E | (ctrl) | 190 | 0xBE | ¾ | 222 | 0xDE | Þ | 254 | 0xFE | þ | |||
| 159 | 0x9F | (ctrl) | 191 | 0xBF | ¿ | 223 | 0xDF | ß | 255 | 0xFF | ÿ |
Unicode: The Modern Solution
The Extended ASCII mess was a real problem. Imagine writing an email in French on your computer, then your friend in Japan opens it and sees garbage characters. That happened all the time!
Unicode fixed this by creating one big list that gives every character in every language its own unique number - over 150,000 characters and counting. It includes letters from every alphabet, plus things like Chinese characters and even emojis.
What you need to know for now:
- UTF-8 is how most computers store Unicode text today. The good news: characters 0-127 work exactly like ASCII, so all your ASCII code still works.
- Text handling gets tricky. For example, Turkish has four different “I” letters (I, ı, İ, i), not just two like English. Code that assumes uppercase of 'i' is always 'I' will mess up Turkish text.
- C was not designed for Unicode. For now, stick with basic ASCII characters (the ones in this lesson). When you need to handle text from different languages, you will need to learn about special libraries.
If you want to learn more later, see unicode.org.
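Since UTF-8 stores characters 0-127 as the same single bytes ASCII uses, and builds every other character out of bytes with values of 128 and above, a quick scan can tell you whether the ASCII techniques from this lesson are safe for a given piece of text. This is a minimal sketch, and the name is_pure_ascii is made up for illustration.

```c
/* Hypothetical helper: returns 1 if every byte before the terminating
   '\0' is in the ASCII range 0-127, otherwise 0. */
int is_pure_ascii(const char *text) {
    const unsigned char *p = (const unsigned char *)text;
    while (*p != '\0') {
        if (*p > 127) {
            return 0;
        }
        p++;
    }
    return 1;
}
```

For example, is_pure_ascii("cafe") returns 1, while is_pure_ascii("caf\xC3\xA9") returns 0 because the two bytes 0xC3 0xA9 are the UTF-8 encoding of é.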
Try It Yourself
- Write a program that prints the ASCII table from 32-126
- Write a function that converts a lowercase letter to uppercase (without using toupper)
- Write a function that takes a digit character ('0'-'9') and returns its numeric value
- What does 'A' + 'a' equal? Calculate it, then verify with code.
Common Mistakes
- Confusing '\0' and '0': The null character (value 0) vs the digit zero (value 48) - this trips up everyone at first!
- Confusing char and int: They are both numbers, but char is only one byte, so it holds just 256 different values (0-255 if unsigned, -128 to 127 if signed)
- Forgetting single quotes: A is a variable name, 'A' is the character A
- Assuming all text is ASCII: Characters from other languages use different systems
Next Up
In Part 4, we will learn about bitwise operations - how to manipulate individual bits for low-level programming tasks like flags, permissions, and efficient calculations.