Common mistakes in C++
Published on 2018-04-20
In this post, we will talk about several mistakes that developers (especially beginners) often make while writing programs in C++. The focus is on errors that show up at run time (during execution), not on syntax errors that your compiler catches and reports when a program fails to compile.
This post is a direct result of the many mistakes that the author himself "managed" to make, and of mistakes he noticed while teaching students and observing programming competitions for primary and secondary school students. Remember, it's always better to learn from other people's mistakes than to make them yourself.
Array length
One of the most common mistakes that beginners make is, of course, declaring an array with insufficient length - i.e. an array that does not have enough space to store all the necessary elements. When declaring a new array, after the variable name, we provide the "size of the array", and not the index of the last element. For example, with "int arr[10]", we create an array of 10 integer values - arr[0] to arr[9]. You must not access arr[10], because that would be the eleventh element, and we only created an array with ten elements. C++ does not check the index for you, so such an access may crash the program or silently corrupt other data.
    int arr[100];
    for (int i=0; i<=100; i++) {
        arr[i] = i;    //ERROR, no arr[100]
    }
The best way to avoid these errors is to use vectors (std::vector, a resizable array that knows its own size), to test your programs as much as possible, or to get used to creating arrays that are (slightly) bigger than the minimum required. The thinking behind the last option is that there will always be enough memory for one or two additional elements (whether we create an array of 1001 integers or 1000 won't really change much). That being said, I'm not a big fan of oversized arrays, because they may hide a bug in our code. Some developers do it though, so it's certainly worth mentioning.
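As a minimal sketch of the vector approach (the helper names makeSequence and canRead are made up for this example), the at() member function of std::vector checks the index and throws std::out_of_range instead of silently touching memory outside the array:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Fills a vector with the values 0..n-1 using only valid indices.
std::vector<int> makeSequence(std::size_t n) {
    std::vector<int> v(n);                 // n elements: v[0] .. v[n-1]
    for (std::size_t i = 0; i < v.size(); i++)
        v.at(i) = static_cast<int>(i);     // at() checks the index
    return v;
}

// Returns true if index i can be read from v, false if it is out of range.
bool canRead(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);                     // throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

For a vector created with makeSequence(100), canRead(v, 99) is true and canRead(v, 100) is false. Note that the plain v[i] syntax does not perform this check.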
Strings
This error is related to the previous one - length of arrays. When using character arrays in C++, it is often thought that text with length N requires a declaration of an array of characters of size N, i.e. char[N] - as we discussed in the previous paragraph. This thinking is wrong, because there is actually an element in the array that indicates where the text ends - the so-called "null character" ('\0'). In particular, for text with a maximum length of N, we need to create an array of length (N + 1).
    #include <cstring>
    #include <iostream>
    using namespace std;

    int main() {
        char s1[6], s2[2];
        strcpy(s2, "0 ");       //s2="0 " (ERROR: needs 3 bytes, s2 has 2)
        strcpy(s1, "POINT");    //s1="POINT"
        cout << s2 << endl;
        // on some systems, this program
        // will print "0 POINT" instead of "0 "
        return 0;
    }
The program given above, at least on some computers (like that of the author), will print "0 POINT" - although we try to print only the value of s2. The array s2 has room for the two characters of "0 ", but not for the "null character", so strcpy writes the terminator past the end of s2. During the execution of this particular program, the character string s1 is located immediately after s2 (in memory), so the terminator lands in s1[0] and is then overwritten by the 'P' of "POINT". When printing s2, the program keeps going until it reaches s1[5] (the null character for the string s1). Writing past the end of an array is undefined behavior, so other systems may print something else, or crash.
Avoid these errors by creating a sufficiently large array (N + 1 elements for text of length N), or just use the string container (std::string) rather than an array of chars.
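Both options can be sketched as follows. This is only an illustration, and the function names joinWithSpace and copyPoint are invented for it:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Joins two pieces of text. std::string grows as needed,
// so there is no terminator to account for by hand.
std::string joinWithSpace(const std::string& a, const std::string& b) {
    return a + " " + b;
}

// If a char array is required: text of length 5 needs 5 + 1 elements.
std::size_t copyPoint() {
    char s1[6];                  // 'P','O','I','N','T','\0'
    std::strcpy(s1, "POINT");    // fits exactly, terminator included
    return std::strlen(s1);      // the terminator is not counted
}
```

Here joinWithSpace("0", "POINT") returns "0 POINT" on purpose, and copyPoint() returns 5 even though the array has 6 elements.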
Initialization
All variables must be initialized before we use them in our programs. When we create a local variable of a basic type (int, double, bool, ...) in C++, the variable WILL NOT get a (default) value automatically - we need to store a value ourselves. If a particular variable has no initial value, and it is used in calculations, the behavior of the entire program will be unpredictable - i.e. running the same program with the same input data may produce different results on different computers, or even on the same computer.
    #include <iostream>
    using namespace std;

    int main() {
        int x = 10;              //init - don't forget!

        /*
        int n;                   //good, as value assignment
        cin >> n;                //is in the next line
        */

        int arr[10];
        for (int i=0; i<10; i++)
            arr[i] = 0;          //applies to arrays!!!

        int sum = 0;             //init - don't forget!
        for (int i=0; i<10; i++)
            sum += arr[i];       //sum = 0 (100%)

        cout << sum << endl;     //prints '0'
        return 0;
    }
Variables in which we store basic data like integers or booleans can be initialized in the same line in which they are declared. On the other hand, for something like arrays, there are several ways in which we can initialize them:
- when creating the array itself - for example, int arr[10] = {0};
- manually, with one or more for loops (see previous program)
- with the fill algorithm, from STL - #include <algorithm>
- with the memset function (arr, value, bytesCount) - #include <cstring>
If you want all elements to start at 0, the best way to initialize them is by using the first option (array initialization). Be careful with other values: int arr[10] = {5}; sets only arr[0] to 5, and the remaining elements become 0. Also, sometimes we do not know in advance the exact value that should be assigned to the elements of an array; or we want to rewrite the value multiple times (for example, if we use one array for multiple calculations).
If you are working with one-dimensional arrays, the best way to assign a value during the execution of a program is with the fill algorithm (from STL). If you work with multi-dimensional arrays, using fill is a bit more complicated, as we need to know the location of both the first and the last element in the array. In that case, it is probably simpler to initialize a multi-dimensional array in some other way (see below).
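A minimal sketch of fill for both cases is given below. The sizes N, ROWS and COLS and the function name resetArrays are assumptions made for the example. For the matrix, filling one row at a time is just one possible way to avoid computing the address of the last element by hand:

```cpp
#include <algorithm>
#include <iterator>

const int N = 5, ROWS = 2, COLS = 3;   // sizes picked for the example
int arr[N];
int mat[ROWS][COLS];

void resetArrays(int value) {
    // one-dimensional: from the first element to one past the last
    std::fill(arr, arr + N, value);

    // two-dimensional: every row is itself a one-dimensional array
    for (auto& row : mat)
        std::fill(std::begin(row), std::end(row), value);
}
```

Unlike memset, fill assigns whole elements, so resetArrays(7) really stores 7 in every element.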
Since we have not talked about memset(array, value, bytesCount) yet, here is a demonstration on how to use it:
    #include <iostream>
    #include <cstring>
    using namespace std;

    int main() {
        //array
        int arr[5];
        memset(arr, 0, sizeof(arr));     //ok, arr={0,0,0,0,0}

        //matrix
        int mat[2][2];
        memset(mat, -1, sizeof(mat));    //ok, mat={{-1,-1}, {-1,-1}}

        cout << arr[0] << endl;          //prints '0'
        return 0;
    }
Remember that memset(array, value, bytesCount) writes "value" into each byte of the array (not into each element!). Use memset() only if you are assigning the values 0 or -1. Nothing else, unless you know exactly what you're doing!
Let's explain the values 0 and -1. If the value is 0 and we fill the entire memory with zeroes, then all int values will be 0 (easy to understand, since all bits are 0). Similarly, we can set the value -1. In modern computers, integers are stored in a format (called two's complement) where positive integers are stored as their plain binary value. Negative numbers, on the other hand, are stored in the following way - the negative number (-N) is stored by flipping all bits of the non-negative number (N-1). For example, the negative number -1 is stored by flipping all bits in (N-1) = 1-1 = 0. If we have 8 bits of space (for example, char), by flipping all bits in 0 (00000000), we will get 11111111. So every byte of -1 is 11111111, no matter how many bytes the type has, and that is exactly what memset writes when given -1. This will work for all integer data types - short, int, long, and long long.
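To see why other values are risky, here is a small sketch (fillBytes is a hypothetical helper written for this post) that sets every byte of a single int to the same value and returns the result:

```cpp
#include <cstring>

// Sets every byte of an int to byteValue and returns the resulting int.
int fillBytes(int byteValue) {
    int x;
    std::memset(&x, byteValue, sizeof(x));   // writes sizeof(int) identical bytes
    return x;
}
```

fillBytes(0) returns 0 and fillBytes(-1) returns -1, as explained above. fillBytes(1), however, does not return 1: assuming a 4-byte int, each of the four bytes becomes 00000001, and the resulting value is 16843009.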
Assigning values
Many beginners make errors by swapping the operators '=' (value assignment) and '==' (equality comparison). Because in C/C++, we can use integers as boolean expressions (for example, in "if" statements), the compiler may not issue a warning to us that we have made a mistake.
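    #include <iostream>
    using namespace std;

    int main() {
        int x = 3;
        if (x = 1)                   //now, x=1 and the if passes
            cout << "x=1" << endl;   //prints 'x=1'
        return 0;
    }
Understand the difference between the operators '=' and '=='. The operator '=' is used to assign a value, while '==' is used to compare values. It also helps to compile with warnings enabled (for example, with the -Wall option in GCC or Clang), as the compiler will then point out an assignment used as a condition.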
Range
Basic data types (like int, long, float, double, ...) have a limited range, so they cannot represent all possible values. Many programs, even though they implement a correct algorithm, can print the wrong result for certain input parameters. The belief that, for example, int can store any integer, is wrong.
The following program will not print the correct result if int stores 32-bit numbers (which is the case on most platforms in use today). The sum of all numbers from 1 to 1 000 000 is 500 000 500 000 - a number that cannot be stored in a variable of type int (which has a range of up to 2 147 483 647):
    #include <iostream>
    using namespace std;

    int main() {
        int sum = 0;
        for (int i=1; i<=1000000; i++)
            sum += i;            //overflows: the sum does not fit in int
        cout << sum << endl;     //prints '1784293664'
        return 0;
    }
Because int can store both positive and negative numbers, a certain combination of bits can cause the system to print a negative number - although we calculate a sum of positive numbers (so how can the result be negative?). This is a big problem that needs to be taken seriously.
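One possible fix for the program above is to accumulate the sum in a wider type. A minimal sketch, with sumUpTo as a made-up function name:

```cpp
// Sums the numbers 1..n in a 64-bit variable, which has enough range
// for the result even when n is one million.
long long sumUpTo(int n) {
    long long sum = 0;
    for (int i = 1; i <= n; i++)
        sum += i;
    return sum;
}
```

sumUpTo(1000000) returns 500000500000, the value we expected. The loop counter can stay an int, since 1 000 000 is well within its range.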
It is sufficient to know the maximum number of digits that can be stored correctly, or the bits used (as you can calculate everything from there). For example, int uses 32 bits and can store values between -2^31 and 2^31-1 (a total of 2^32 values), while long long uses 64 bits and can store values between -2^63 and 2^63-1 (a total of 2^64 values). Computer architectures in use today are either 32-bit or 64-bit, so it's very unlikely that there will be a positive effect (memory-wise) from using data types smaller than int (like short).
C/C++ - Basic data types (part 1)
type | range (typical sizes) |
---|---|
char | -128 (-2^7) to 127 (2^7-1), or 0 to 255 (if unsigned char) |
short | -32 768 (-2^15) to 32 767 (2^15-1), or 0 to 65 535 (if unsigned) |
int | -2 147 483 648 (-2^31) to 2 147 483 647 (2^31-1), or 0 to 4 294 967 295 (if unsigned) |
long | at least -2 147 483 648 (-2^31) to 2 147 483 647 (2^31-1), or 0 to 4 294 967 295 (if unsigned); on most 64-bit Linux and macOS systems, long has the same range as long long |
long long | -9 223 372 036 854 775 808 (-2^63) to 9 223 372 036 854 775 807 (2^63-1) |
Before you start to code any program, try to estimate (approximately) what values you will need to store - this should not take you a lot of time, but it's something done by all good developers. Then, use a data type that has sufficient space (range) to represent those values. For example, imagine that we need to make a program that will store data about visitors on a course. It is obvious that the range offered by the data type "int" will be sufficient to count those visitors - as the course will surely have fewer than 2^31 = 2 147 483 648 visitors. For other variables, you can do a similar analysis. Over time, you will become really good (and quick) at estimating these things.
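If you are not sure about the limits on your own system, you can ask the compiler instead of memorizing them. The sketch below uses std::numeric_limits from the <limits> header. The helper name fitsInInt is an assumption made for this example:

```cpp
#include <limits>

// Returns true if `value` can be stored in an int on this platform.
bool fitsInInt(long long value) {
    return value >= std::numeric_limits<int>::min() &&
           value <= std::numeric_limits<int>::max();
}
```

With a 32-bit int, fitsInInt(1000) is true, while fitsInInt(500000500000LL) is false, which tells us that the sum from the earlier example needs a bigger type.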
Floating-point numbers
The most common way to store real numbers (by computers) is with the so-called Floating-Point Arithmetic standard (IEEE 754). In it, numbers are stored with a sign, a normalized value, and an exponent. For example, the number 12345.6789 can be stored as +1.23456789 * 10^4 (this is a simplified example - computers work in the binary number system).
The data type "float" uses 32 bits to store a value, 23 of which are for precision (the normalized value), 1 bit is used to store the sign (positive or negative), while the remaining 8 bits are used to store the exponent (the 10^4 part in the example given above). With "double", we can store numbers more precisely (double uses 64 bits to store a value), but, of course, it also has limited precision.
Here are a few frequently asked questions:
- When I input X, why does the program read (X + 0.0000000001)?
- When comparing two equal numbers, why does the program say that they are not equal?
Because only a limited number of bits is used to store a value, the system often cannot store the exact value. For example, the following program will run forever (i.e. it will be stuck in a loop):
    #include <iostream>
    using namespace std;

    int main() {
        double a=0.1, s=0;
        while (a != 1.0) {
            s += a;
            a += 0.1;
        }
        cout << s << endl;    //will never run
        return 0;
    }
You probably assume that the program will stop running when "a" gets the value 1.0 (since the condition is a != 1.0), but this program will just keep running. Why? Because double has limited precision, the value 0.1 cannot be stored exactly, and "a" will never be (exactly) equal to 1.0.
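One way to make such a loop terminate is to count the steps with an integer and derive the floating-point value from the counter. This is a sketch of that idea (the function name sumOfTenths is made up), not the only possible fix:

```cpp
// Adds 0.1, 0.2, ..., 0.9 using an integer loop counter.
double sumOfTenths() {
    double s = 0;
    for (int i = 1; i < 10; i++) {   // integer comparison: always exact
        double a = i / 10.0;         // a is derived from i, so errors do not accumulate
        s += a;
    }
    return s;                        // approximately 4.5
}
```

The loop runs exactly nine times, because the condition no longer depends on a floating-point comparison.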
C/C++ - Basic data types (part 2)
type | significant digits |
---|---|
float | 7 to 8 significant digits (the first 7 digits are correct) |
double | 15 to 16 significant digits |
long double | architecture dependent, but at least 15 to 16 significant digits |
When you want to convert a floating-point number to an integer, use the function round(num) or floor(num + EPS) - both defined in the <cmath> file (#include <cmath>). Similarly, for non-negative numbers, you can use the expression (int)(num + EPS). EPS is a small value (for example, 0.000001), which helps us avoid these so-called rounding errors.
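The difference between the conversions can be sketched with three tiny helpers (truncated, rounded and flooredWithEps are names invented for this example):

```cpp
#include <cmath>

const double EPS = 0.000001;

// Plain cast: simply drops the fractional part.
int truncated(double x) { return (int)x; }

// Rounds to the nearest integer first.
int rounded(double x) { return (int)std::round(x); }

// Nudges the value up by EPS before rounding down.
int flooredWithEps(double x) { return (int)std::floor(x + EPS); }
```

For example, take double x = 0.29 * 100. We expect 29, but on a typical IEEE 754 system x is stored as 28.999999999999996. So truncated(x) gives 28, while rounded(x) and flooredWithEps(x) both give 29.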
Additionally, since the numbers are rounded to a certain decimal, it is not possible to check whether two floating-point numbers are equal in the same way as integer numbers (remember the result of the program given above). Specifically, there may be two variables that we expect to store the same value, but the program may output that they store different values - due to limited precision. Because of this, you should check if two real (floating-point) numbers are equal or not with the method presented below (EPS is a small value that is used to decide if two numbers should be treated as equal):
    #include <iostream>
    #include <cmath>
    using namespace std;

    int main() {
        double EPS = 0.000001;

        double A = 0.0;
        for (int i=0; i<10; i++)
            A += 0.1;
        double B = 1.0;

        if (fabs(A-B) <= EPS)            //the program (correctly)
            cout << "Equal" << endl;     //prints 'Equal'
        return 0;
    }
In the program given above we used the function fabs(X), which is defined in the file <cmath> (#include <cmath>). It calculates the absolute value of a given real number X. If, for some reason, you do not remember the name of the function, you can write your own (or just learn to use Google):
    #include <iostream>
    using namespace std;

    double ABS(double x) {
        if (x < 0)
            return -x;
        return x;
    }

    int main() {
        double EPS = 0.000001;

        double A = 0.0;
        for (int i=0; i<10; i++)
            A += 0.1;
        double B = 1.0;

        if (ABS(A-B) <= EPS)
            cout << "Equal" << endl;     //prints 'Equal'
        return 0;
    }
If possible, try to avoid floating-point numbers, as integers are always exact (as long as the values stay within their range). Also, in situations where it is necessary to work with floating-point numbers, make sure you avoid the data type "float", as it has too little precision for any serious work.
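A common way to follow this advice is to pick a smaller unit and store whole numbers of it, for example cents instead of euros or dollars. A minimal sketch, where the function name totalInCents is an assumption made for the example:

```cpp
// Price of `count` items, with the unit price given in whole cents.
long long totalInCents(int count, long long unitPriceInCents) {
    return count * unitPriceInCents;   // exact integer arithmetic
}
```

Ten items at 10 cents each give totalInCents(10, 10) = 100 cents, exactly. Compare that with adding 0.1 ten times in a double, which (as we saw above) does not give exactly 1.0.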