11 Most Common Pitfalls in C Programming Language

1. Mixing signed and unsigned integers in arithmetic operations

It is usually not a good idea to mix signed and unsigned integers in arithmetic operations. For example, what will the following program print?

#include <stdio.h>

int main(void)
{ 
    unsigned int a = 1000;
    signed int b = -1;

    if (a > b) puts("a is more than b");
    else puts("a is less or equal than b"); 

    return 0;
}  

Since 1000 is greater than -1, you would expect the output to be "a is more than b"; however, that is not the case.

Arithmetic operations between different integer types are performed in a common type determined by the so-called usual arithmetic conversions.

In this case the "common" type is unsigned int. This means that int operand b will get converted to unsigned int before the comparison.

When -1 is converted to an unsigned int, the result is the largest possible unsigned int value (UINT_MAX), which is greater than 1000, meaning that a > b is false.

2. Overstepping Array Boundaries

Arrays always start at index 0 and end at index (array length - 1).

Wrong:

#include <stdio.h>
int main()
{
    int x = 0;
    int myArray[5] = { 1,2,3,4,5}; //Declaring 5 elements

    for(x=1; x<=5; x++) //Looping from 1 till 5.
       printf("%d\t",myArray[x]);

    printf("\n");
    return 0;
}


//Typical output: 2 3 4 5 GarbageValue (reading myArray[5] is undefined behavior)

Correct:

#include <stdio.h>
int main()
{
    int x = 0;
    int myArray[5] = { 1,2,3,4,5}; //Declaring 5 elements

    for(x=0; x<5; x++) //Looping from 0 till 4.
       printf("%d\t",myArray[x]);

    printf("\n");
    return 0;
}

//Output: 1 2 3 4 5

So, know the array length before working on an array; otherwise you may corrupt adjacent memory or cause a segmentation fault by accessing a location outside the array.

3. Missing out the Base Condition in Recursive Function

Calculating the factorial of a number is a classic example of a recursive function.

Missing the Base Condition:

#include <stdio.h>

int factorial(int n)
{
       return n * factorial(n - 1);
}

int main()
{
    printf("Factorial %d = %d\n", 3, factorial(3));
    return 0;
}
//Typical output: Segmentation fault

The problem with this function is that it recurses without end, which typically exhausts the stack and causes a segmentation fault. It needs a base condition to stop the recursion.

Base Condition Declared:

#include <stdio.h>

int factorial(int n)
{
    if (n <= 1) // Base condition, crucial in every recursive function; <= also stops the recursion for n == 0.
    {
       return 1;
    }
    else
    {
       return n * factorial(n - 1);
    }
}

int main()
{
    printf("Factorial %d = %d\n", 3, factorial(3));
    return 0;
}

//Output :  Factorial 3 = 6

This function terminates as soon as it reaches the base condition, provided the initial value of n is positive and small enough: 12 is the largest n whose factorial fits in a 32-bit int.

Rules to be followed:

  • Initialize the algorithm. Recursive programs often need a seed value to start with. This is accomplished either by using a parameter passed to the function or by providing a gateway function that is non-recursive but that sets up the seed values for the recursive calculation.
  • Check to see whether the current value(s) being processed match the base case. If so, process and return the value.
  • Redefine the answer in terms of a smaller or simpler sub-problem or sub-problems.
  • Run the algorithm on the sub-problem.
  • Combine the results in the formulation of the answer.
  • Return the results.

4. Using character constants instead of string literals, and vice versa

In C, character constants and string literals are different things.

A character surrounded by single quotes like 'a' is a character constant. A character constant is an integer whose value is the character code that stands for the character. How to interpret character constants with multiple characters like 'abc' is implementation-defined.

Zero or more characters surrounded by double quotes like "abc" is a string literal. A string literal is an unmodifiable array whose elements are of type char. The string in the double quotes plus the terminating null character are the contents, so "abc" has 4 elements ({'a', 'b', 'c', '\0'}).

In Example 1, a character constant is used where a string literal should be used. This character constant will be converted to a pointer in an implementation-defined manner, and there is little chance that the converted pointer is valid, so this example will invoke undefined behavior.

#include <stdio.h>

int main(void) {
    const char *hello = 'hello, world'; /* bad */
    puts(hello);
    return 0;
}

In Example 2, a string literal is used where a character constant should be used. The pointer converted from the string literal will be converted to an integer in an implementation-defined manner, and that integer will be converted to char in an implementation-defined manner. (How to convert an integer to a signed type that cannot represent the value is implementation-defined, and whether char is signed is also implementation-defined.) The output will be meaningless.

#include <stdio.h>

int main(void) {
    char c = "a"; /* bad */
    printf("%c\n", c);
    return 0;
}

In almost all cases, the compiler will complain about these mix-ups. If it doesn't, enable more compiler warning options or switch to a better compiler.

5. Floating point literals are of type double by default

Care must be taken when initializing variables of type float to literal values or comparing them with literal values, because regular floating point literals like 0.1 are of type double. This may lead to surprises:

#include <stdio.h>
int main() {
    float  n = 0.1;
    if (n > 0.1) printf("Weird\n");
    return 0;
}
// Prints "Weird" when n is float

Here, n is initialized and rounded to single precision, resulting in the value 0.10000000149011612. Then n is converted back to double precision to be compared with the literal 0.1 (which equals 0.10000000000000001), resulting in a mismatch.

Besides rounding errors, mixing float variables with double literals will result in poor performance on platforms which don't have hardware support for double precision.

6. Forgetting to free memory

You must always remember to free memory that was allocated, either by your own function or by a library function called from your function.

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    char *line = NULL;
    size_t size = 0;

    /* memory implicitly allocated in getline */
    getline(&line, &size, stdin);

    /* uncomment the line below to correct the code */
    /* free(line); */

    return 0;
}

This is a rather innocent mistake in this specific example, because when a process exits, almost all operating systems free all the allocated memory for you. Also note that getline could fail in many different ways, but in whichever way it fails, the memory it has allocated should always be freed (when you've finished using it) if line is not NULL. Memory can be allocated even if the first call to getline() detects EOF (which is reported by a return value of -1, not EOF).

7. Adding a semicolon to a #define

This one happens to me a lot! It is easy to get confused by the C preprocessor and treat it as part of C itself. But that is a mistake, because the preprocessor is just a text substitution mechanism. For example, if you write

// WRONG
#define MAX 100;
int arr[MAX];

The code will be converted to

int arr[100;];

This is a syntax error. The remedy is to remove the semicolon from the #define line.

8. Be careful with semicolons

The following example

if (x > a);
   a = x;

actually means:

if (x > a) {}
a = x;

which means x will be assigned to a in any case, which might not be what you wanted originally.

Sometimes a missing semicolon also causes a problem that is hard to notice:

if (i < 0) 
    return
day = date[0];
hour = date[1];
minute = date[2];

The semicolon after return is missing, so the statement is parsed as return day = date[0]; and the value of that assignment is returned.

9. Mistakenly writing = instead of == when comparing

The = operator is used for assignment.

The == operator is used for comparison.

One should be careful not to mix the two. Sometimes one mistakenly writes

/* assign y to x */
if (x = y) {
     /* logic */
}

when what was really wanted is:

/* compare if x is equal to y */
if (x == y) {
    /* logic */
}

The former assigns the value of y to x and then checks whether that value is nonzero, instead of comparing the two. It is equivalent to:

if ((x = y) != 0) {
    /* logic */
}

This comic shows the same thing: the programmer used = instead of == in an if statement, and that is why the robots are killing humans. :P

10. Copying too much

char buf[8]; /* tiny buffer, easy to overflow */

printf("What is your name?\n");
scanf("%s", buf); /* WRONG */
scanf("%7s", buf); /* RIGHT */

If the user enters a string longer than 7 characters (the buffer size minus 1 for the null terminator), memory past the end of buf will be overwritten. This results in undefined behavior. Malicious hackers often exploit this to overwrite the return address and change it to the address of their own malicious code.

11. Macros are simple string replacements

Macros are simple string replacements. (Strictly speaking, they work with preprocessing tokens, not arbitrary strings.)

#include <stdio.h>

#define SQUARE(x) x*x

int main(void) {
    printf("%d\n", SQUARE(1+2));
    return 0;
}

You may expect this code to print 9 (3*3), but it actually prints 5, because the macro is expanded to 1+2*1+2.

You should wrap each macro argument and the whole macro expression in parentheses to avoid this problem.

#include <stdio.h>

#define SQUARE(x) ((x)*(x))

int main(void) {
    printf("%d\n", SQUARE(1+2));
    return 0;
}