Basic Elements of C++

C++ is a modern, general-purpose programming language. It is very powerful, very rich, and comes with a vast library of data structures, algorithms, utilities, and many other features. As programmer Bernie might well say, C++ is a YUUGE language! Alas, this is a very brief introduction to C++. So, we only look at some basic elements of the language and the standard library. We also presume that you are somewhat familiar with the C language.

Hello World! Plus a Non-Trivial Example

You know the drill. Write and compile the following C++ program. Do it!

#include <iostream>

int main() {
    std::cout << "Hello world!" << std::endl;
}

Okay, I know you’ve seen enough Hello world! programs by now, but it is still good to start with a very simple example even just to see if things work at a basic level (editor, compiler, etc.). Plus, I hope you’d also be somewhat intrigued by the use of those left-shift operators. No? Oh, I see, you’ve never seen a left-shift operator. That’s okay. No worries. Maybe we’ll talk about those at some point. But for now, the main goal is to make sure you have a working program that prints “Hello world!”. Did that work for you? If not, you’d better ask your good hacker friend to help you sort things out. If and when everything works as expected, let’s try a more involved and also more interesting example.

#include <iostream>
#include <vector>
#include <algorithm>

int main() {
    int i;
    std::vector<int> A;
    while (std::cin >> i)
        A.push_back(i);

    std::sort(A.begin(), A.end());

    for (std::vector<int>::iterator itr = A.begin(); itr != A.end(); ++itr)
        std::cout << *itr << std::endl;
}

The above code exemplifies many features of C++. We will examine many of these features in detail at some point, but let’s just go through the code intuitively for now. Let’s start with the first three lines:

#include <iostream>
#include <vector>
#include <algorithm>

Including header files works like in C. The same is true for the preprocessor in general. Now, let’s look at the main function. As you can see, that is also similar to C. In fact, its general parameters are the same as in C. For example, we could write the following code to print the command-line arguments:

#include <iostream>

int main(int argc, char * argv[]) {
    for (int i = l; i < argc; ++i)
        std::cout << argv[i] << std::endl;
}

Going back to the example (the second one, not the Hello world! one), in the second line of the body of the main function, we find a first truly new feature that distinguishes C++ from C:

std::vector<int> A;

This is a declaration of a vector of int objects. As one can intuitively figure out, a vector is a linear data structure (a sequence) that also supports access by index, just like an ordinary array in C. Except that it is dynamic, meaning that it can grow and shrink as needed, which is what happens two lines down with A.push_back(i). And there is so much more in the declaration and use of this vector. We can not possibly cover those concepts and ideas in this introductory example, but don’t worry, we will return to them later.

Let’s now consider the head of the while loop:

while (std::cin >> i)

Intuitively, this means: read an integer from the standard “console” input stream std::cin and run the loop as long as the read operation is successful, otherwise terminate the loop when the read operation fails or there is nothing to read. That is the intent, again intuitively. However, here too there is a lot to unpack. What is the deal with the right-shift (>>) operator? And how is it possible that the right-shift expression std::cin >> i would store a value into object i? Yet, that is exactly what happens.

Let’s move on. After we fill the array with integer values read from the standard input, we call the sort function provided by the standard library:

std::sort(A.begin(), A.end());

As expected, this sorts the vector. Simple! Needless to say, there is a lot to unpack here, too. First, notice that the vector supports the two methods begin() and end(). In general, C++ supports user-defined types, like vector, in the form of classes that define “objects” in the sense of object-oriented programming. So, a vector is an object that encapsulates some data, and that provides some associated member functions, also referred to as methods. We will discuss all these concepts in detail later. Here in particular, we see two methods, begin() and end(), that return a kind of pointer to the beginning and the end of the vector, respectively. These pointer-like objects are called iterators, and again there is a lot to say about iterators, as we will see later.

The nice thing about this sort implementation, is that it is generic. In fact, a vector is also generic, and this form of generic programming is a fundamental and pervasive feature of C++ and its standard library. We will see about this later, but for now let’s see what it means for sort to be generic through another example. The code below shows how one can use the same sort on, say, an array of float objects:

float X[100];
for (int i = 0; i < 100; ++i)
    std::cin >> X[i];

std::sort(X, X + 100);

Here std::sort works on an array of floating-point numbers. This is a traditional C-style array, that is, a sequence of contiguous float objects in memory. In this case sort takes two pointers: X points to the first element of the array, and X+100 points to the element immediately after the last one. In the previous example, sort(A.begin(),A.end()), takes two user-defined pointer-like “iterators” whose semantics is the same as pointers to the first and one-past-last elements in the vector container, respectively. But the point here is, it’s the same sort!

In fact, it isn’t just sort that is generic here. Notice that std::cin >> X[i] is also the same kind of expression to read a floating-point number as the previous std::cin >> i to read an integer. In C you would use scanf("%d",&i) to read an integer or scanf("%f",&x) to read a floating-point number. So you use the same function scanf but you need to specify whether you want to read an integer or a floating-point number, and you do that with the format strings "%d" or "%f". C++ is more direct, and less redundant and error-prone. The compiler knows that i is an integer, so the expression std::cin >> i reads an integer, similarly X[i] is a floating-point number, so std::cin >> X[i] just does the right thing and reads a floating-point number from the input. And you don’t have to bother adding a format string filled with those pesky input or formatting directives.

In the last part of the example, we print out the content of the vector, and by now you should intuitively figure out what is going on with the iteration.

for (std::vector<int>::iterator itr = A.begin(); itr != A.end(); ++itr)
    std::cout << *itr << std::endl;

Some might lament that this code is complicated and clunky. Perhaps it looks that way to some, but a good C++ programmer would immediately recognize the pattern and would not be bothered by the apparent complexity. Still, if not complex, that is certainly a long instruction for a simple for-loop. But look, you can write the same loop more compactly:

for (auto itr = A.begin(); itr != A.end(); ++itr)
    std::cout << *itr << std::endl;

Or even more simply:

for (auto a : A)
    std::cout << a << std::endl;

C++ for the Impatient: Examples

If you are a C programmer and you are approaching C++ for the first time, you might be turned off a bit. If you are a beginner in C, you might already feel uneasy about the complexities of keeping track of objects and using them correctly through pointers. Why would you want to get tangled up in a language like C++ that is presumably way more complex than C? Even some very experienced C programmers are skeptical if not even harshly critical of C++ for various reasons.

The thing is, C++ is indeed a rich and therefore complex language. And it may or may not be appropriate for some application domains, such as programming operating systems. However, many of its rich features can provide a significant and almost immediate pay-off. In other words, C++ simplifies programming a whole lot in many cases. Let’s see some examples.

Suppose you have to write a program or a library to deal with a network of dependencies. An element \(x\) might depend on other elements \(y_1, y_2, \ldots\), which in turn might depend on other elements. Suppose you are reading the dependencies from an input stream such that each line has a string representing an element \(x\) followed by zero or more strings representing its dependencies \(y_1, y_2, \ldots\). Suppose also that you’d have to compute the set of direct and indirect dependencies for a given element \(x\). This amounts to representing a directed graph and writing a simple exploration of the graph (e.g., BFS). Conceptually, this is quite straightforward. However, in C you would have to deal with the complexities of dynamic data structures, including strings to represent elements, dynamic arrays for the adjacency list, and also some kind of associative data structure to have an efficient access to the adjacency list.

In C++, all of that is very straightforward:

#include <iostream>
#include <vector>
#include <map>
#include <string>
#include <sstream>
#include <deque>

std::vector <std::string>               V;      // vertex id --> element name
std::map <std::string, unsigned>        V_idx;  // element name --> vertex id
std::vector <std::vector <unsigned> >   G;      // adjacency list: vertex id --> list of vertex id

static unsigned add_vertex (const std::string & name) {
    auto i = V_idx.insert({name, G.size()});
    if (i.second) {             // name did not exist, so insert actually inserted in V_idx
        V.push_back(name);
        G.push_back({});
    }
    return i.first->second;     // return the index in G
}

void read_dependencies (std::istream & input) {
    std::string line;
    while (std::getline(input, line)) {
        std::istringstream line_input (line);
        std::string u, v;
        line_input >> u;
        unsigned u_id = add_vertex(u);
        while (line_input >> v) {
            unsigned int v_id = add_vertex(v);
            G[u_id].push_back(v_id);
        }
    }
}

void print_all_dependencies (const std::string & x, std::ostream & output) {
    unsigned int u;
    auto itr = V_idx.find(x);
    if (itr == V_idx.end())
        return;                 // error: element not found
    u = itr->second;
    std::vector <bool> visited (G.size());
    std::deque <unsigned int> Q;
    visited[u] = true;
    Q.push_back(u);
    while (! Q.empty()) {
        u = Q.front ();
        Q.pop_front ();
        for (auto v : G[u])
            if (! visited[v]) {
                output << V[v] << " ";
                visited[v] = true;
                Q.push_back(v);
            }
    }
}

In this program, we use a number of container data structures, including strings, from the C++ standard library. The string class provides us with very flexible means to represent strings, which we use to store the names of elements in the application, as well as to read entire lines from the input through the getline function. Notice that in all those cases, the storage allocation is handled transparently by the string object. Indeed, there isn’t a single explicit memory allocation operation in the whole program!

Another extremely useful data structure we use is a vector, which we already introduced with the first example. What is particularly interesting in this case is that we use a vector of vectors to represent the adjacency list of the dependency graph. In particular, we declare and use the adjacency list as a vector<vector<unsigned>>. As we mentioned in the case of the first example, a vector is a generic container, and this example illustrates this notion quite clearly.

We then use a double-ended queue to implement the breadth-first search over the dependency graph. That data structure is directly provided by the C++ standard library through the deque class, which is also a generic container. Last but not least, we use a map as an index from element names to vertex ids. The “id” of a vertex is simply its position within the adjacency-list (vector G). So, to access vertexes within the graph by name, we use V_idx, which maps element names (string) to ids (unsigned) and it is in fact declared as a map<string,unsigned>.

Notice also how we read and parse the graph from the input. We use getline to read each line from the input stream into a string variable (line), and then we use the istringstream feature that conveniently turns that string into an input stream.

Some Differences with C

In the same spirit of starting with intuitions, we list here a few differences between C and C++.

Headers

Standard headers do not have a “.h” suffix. We’ve seen this already:

#include <iostream>

All the C standard headers are available with a “c” prefix and without the “.h” suffix.

#include <cstdio>
#include <cassert>

Of course you may still include files with any name (compatible with your system) and it is still common practice to use the “.h” suffix for C++ headers.

#include <iostream>
#include "my_library.h"

Null Pointers

C++ has a proper literal value for the null pointer:

int * p = nullptr;

You can still write int * p = 0; but nullptr is preferable for clarity in the code—for the same reason some programmers might prefer to write int * p = NULL—but also for other, technical reasons.

Boolean Type and Values

C++ has a specific Boolean type, bool, with its literal values true and false.

bool hungry = false;
bool happy = true;

Namespaces

We have already seen a namespace in our first example. When we write std::vector, or std::cin, or std::sort, we are referring to classes, objects, or functions of the std namespace, which is where you find the standard library of C++.

And of course, users can define and then use their own namespaces to support basic modularization of identifiers.

namespace usi {

namespace math_lib {

int sum(int a, int b) {
    return a + b;
}

} // namespace math_lib

} // namespace usi

int vector_sum(int * begin, int * end) {
    int s;
    for(s = 0; begin != end; ++begin)
        s = usi::math_lib::sum(s, *begin); // qualified name for function sum()
    return s;
}

Namespaces are what they say they are: confined spaces for names to modularize identifiers and also to avoid conflicts.

namespace usi {

namespace math_lib {

int sum(int a, int b) {
    return a + b;
}
} // namespace math_lib
} // namespace usi

namespace mod_7 {
double sum(int a, int b) {
    return (a + b) % 7;
}
} // namespace mod_7

using mod_7::sum;

int main() {
    int i;
    int j;
    i = sum(1,2); // uses mod_7::sum
    j = usi::math_lib::sum(1,2);
}

The C programming language does not have namespaces, and therefore a similar modularization is achieved through name prefixes:

/* namespace "usi" { print(); set_name(); get_name() */
void usi_print();
void usi_set_name(const char * name);
const char * usi_get_name();

It might be convenient to use specific identifiers without having to refer to their namespace. This can be done with using directives, as in the example below:

#include <iostream>

using std::cout;
using std::endl;

int main() {
    cout << "ciao!" << endl;
};

You might even do that for an entire namespace:

#include <iostream>

using namespae std;

int main() {
    cout << "ciao!" << endl;
};

References

A reference is an alias to an existing object.

int i = 1;
int & ref = i;                  // reference initialization

The code above declares ref as an alias of object i. Notice that the reference is not itself an object. In fact, in this case the compiler might implement the reference without allocating any memory, conceptually replacing every use of ref with i. Sometimes the compiler might need to store a reference, in which case the underlying implementation will probably consist of a pointer object.

Notice that there is a fundamental difference between the initialization and an assignment of a reference.

int i = 1;
int j = 2;
int & ref = i;                  // reference initialization
ref = j;                        // assignment.  Of what object?
j = j + 1;
std::cout << ref << std::endl;

Since a reference is an alias and not an object, every reference must be immediately initialized.

int & ref;                      // error: makes no sense

Also, again since a reference is not an object, it makes no sense to declare arrays of references, pointers to references, and references to references.

int i;                          // i is the only object here
int & ref = i;                  // ref is now an alias of i
int & * p;                      // error: makes no sense
int & A[10];                    // error: makes no sense
int & & double_ref;             // error: makes no sense

However, even though the reference itself is not an object, as an expression a reference does identify an object—that is, it is an lvalue. So, among other things, you can take the address of a reference. And you get the address of the object that the reference refers to. Try the following code to see that yourself.

int i;
int & ref = i;
std::cout << "i is here: " << &i << "; ref is here: " << &ref << std::endl;

Since references are lvalues (they identify an object), the initialization of a reference must evaluate to an lvalue.

int i;
int & r = i;                    // okay
int & s = i + 1;                // Error!

However, the initialization of a const reference—meaning, a reference to a const object—does not have to be an l-value. Do you see why? Consider the following code as an example.

int sum(const int & a, const int & b) {
    return a + b;
}

int main() {
    int i = 7;
    int j = 2;
    std::cout << sum(i, j + 1) << std::endl;
}

We insist that a reference must be initialized with an lvalue, but that is only true for lvalue references, which are the references we have seen so far. There are also rvalue references, but we won’t talk about them here. Instead, let’s talk about using references. References can and are often used as formal function parameters. What is the output of the following program?

void swap(int & a, int & b) {
    int tmp = a;
    a = b;
    b = tmp;
}

int main() {
    int i = 7;
    int j = 2;
    swap(i, j);
    std::cout << "i = " << i << std::endl << "j = " << j << std::endl;
}

One reason to use references is to obtain side-effects, as in the example above. Another reason is to avoid expensive copy operations when we pass complex objects as parameters.

void print_size_slow(vector<int> v) {
  std::cout << v.size() << std::endl;
}

void print_size_fast(vector<int> & v) {
  std::cout << v.size() << std::endl;
}

int main() {
    vector<int> A(1000);        // a vector of 1000 integers (all 0)

    print_size_slow(A);         // makes a copy of the entire vector
    print_size_fast(A);         // operates directly on A
}

In fact, the print functions in the example above do not need to change the input vector, so the best way to define them is as taking a constant reference (const vector<int> &) to the vector.

It is perhaps less intuitive but certainly equally important, perhaps even more important to return a reference from a function. This way, function calls can be used as lvalue expressions. As an example, consider a simplistic implementation of a matrix:

int M[2000];
int height = 40;
int width = 30;

int & m(int col, int row) {
  return M[col + row*width];
}

int main() {
    for (int i = 0; i < height; ++i)
        for (int j = 0; j < width; ++j)
            std::cin >> m(i,j); // stores a value into M
    m(0,0) = 1;                 // stores a value into M
}

This feature is even more important in combination with another fundamental feature of C++, namely operator overloading.

Exceptions

C++ provides exceptions as a powerful error-handling mechanism. As in other languages, such as Java, the code that notices an error might raise an exception by throwing an exception object with a throw expression. The exception is then caught by the first matching catch block of a try block that directly or indirectly led to the execution of the throw clause. This means that the control flow goes back up through the execution stack. A matching catch block is one where the catch-clause matches the type of the exception object defined by the throw expression.

#include <iostream>
#include <new>

int * ramp(int n) {
    int * R = new int[n];
    for (int i = 0; i < n; ++i)
        R[i] = i;
    return R;
}

int main() {
    try {
        int * A = ramp(1000000);
        // ...
        delete [](A);
    } catch (const std::bad_alloc& e) {
        std::cout << "Allocation failed: " << e.what() << '\n';
    }
}

The exceptions thrown by functions of the standard library are typically derived from (subtype of) std::exception. However, any value can be thrown as an exception. For example:

#include <string>
#include <iostream>

const char * small_error = "okay";
const char * big_error = "not okay";
const char * nasty_error = "AAAHHhhhh!";

int f(int x) {
    if (x > 7)
        return x - 2;

    switch(x) {
    case 0: throw 10;
    case 1: throw 20;
    case 2: throw small_error;
    case 3: throw big_error;
    default: throw nasty_error;
    }
}

int main() {
    int i;
    std::cout << "Input: ";
    std::cin >> i;
    try {
        std::cout << "Output: " << f(i) << std::endl;
    } catch (int ex) {
        std::cout << "error number " << ex << std::endl;
    } catch (const char * error_msg) {
        std::cout << error_msg << std::endl;
    }
}

Notice that f(int x) throws some exceptions but those exceptions are not specified as part of the function declaration. In C++, exceptions are not part of a function type. A function or method is either potentially-throwing, meaning that it may throw exceptions of any type, or non-throwing, meaning that it will never throw exceptions. In any case, there is no static type checking for exceptions. That is, the compiler does not prevent a non-throwing function from directly or indirectly throwing exceptions. Instead, those restrictions are enforced at run-time. Let’s see how all this works.

By default, without an exception specification, a function or method is potentially-throwing. Destructors and other special default member functions and deallocation functions are a special case: they are non-throwing by default. You may declare a function as non-throwing with the noexcept specifier. And you may also declare a function as potentially-throwing with noexcept(false). For example, the functions int f(int x) noexcept and void ciao() noexcept defined in the example below are guaranteed to never throw exceptions. Let’s see what that means.

#include <string>
#include <iostream>
#include <exception>

const char * small_error = "okay";
const char * big_error = "not okay";
const char * nasty_error = "AAAHHhhhh!";

void abandon_ship() {
    std::cout << "Mayday mayday mayday!" << std::endl;
}

void ciao() noexcept {
    std::cout << "ciao!" << std::endl;
}

void goodbye() {
    std::cout << "goodbye!" << std::endl;
}

int f(int x) noexcept {
    if (x > 7)
        return x - 2;

    switch(x) {
    case 0: throw 10;
    case 1: throw 20;
    case 2: throw small_error;
    case 3: throw big_error;
    default: throw nasty_error;
    }
}

int main() {
    std::set_terminate(abandon_ship);

    int i = 1;
    std::cout << "Input: ";
    try {
        std::cout << "Output: " << f(i) << std::endl;
    } catch (int ex) {
        std::cout << "error number " << ex << std::endl;
    } catch (const char * error_msg) {
        std::cout << error_msg << std::endl;
    }
    try {               // the compiler can ignore the whole try-catch
        ciao();         // because ciao() is declared as "noexcept"
    } catch (int ex) {  // which means that the catch block is dead code
        std::cout << "error number " << ex << std::endl;
    }
    try {               // the compiler must handle the try-catch block
        goodbye();      // because goodbye() may throw any exception
    } catch (int ex) {
        std::cout << "error number " << ex << std::endl;
    }
}

Notice that in the example code above, function f is declared as non-throwing but its code may well throw various exceptions. The ciao() function is also declared as non-throwing, and you might consider that to be okay, since the code of the function is simple enough and does not itself throw exceptions. However, notice that the code of ciao() may in fact throw exceptions, since the functions and methods used therein, which are hidden behind the use of the left-shift operator (<<) are potentially-throwing. You can check that yourself by reading the documentation of the operator<< of the std::ostream class of the standard library. I’m telling you, that makes for excellent bathroom reading!

So, is this code correct? As already mentioned above, the noexcept specification does not prevent that function or any one of the functions called directly or indirectly by that function to throw exceptions. So, the code is correct, and therefore the compiler should dutifully and correctly compile that code. On the other hand, the compiler will also ensure that the semantics of the noexcept specification is realized.

The question, then, is, what happens if one of the non-throwing functions actually throws an exception? You should definitely try running the program and make sense of it intuitively. (Do it!) What happens is that the exception is never passed to the caller, and instead the exception triggers the invocation of std::terminate(), which in turn invokes the default termination function std::abort() or a user-defined termination handler. In the case of the example above, the code defines that termination handler to be the abandon_ship() function with std::set_terminate(abandon_ship). Notice that the execution actually terminates right after the execution of the termination handler in std::terminate.

Classes

A class is a user-defined type. As you might expect, a class may have data members and member functions (or “methods”). A C++ class may also define nested classes and other type definitions. Here is an example of a String class.

class String {
public:
    // constructors (see below)
    String();
    String(const char *);
    String(const char * begin, const char * end);
    String(const char * begin, unsigned int len);
    String(const String &);

    // destructor
    ~String();

    // other, non-static member functions
    const char * c_string() const;
    unsigned int size() const;
    unsigned int append(const char *);

    // a static member function
    static unsigned int max_size();

private:
    // data members, all private
    char * data;
    unsigned int string_size;
    unsigned int allocated_size;
};

Notice that a class can be defined almost identically by either the class or struct keywords. The only difference is that the default member access is private for class while it is public for struct. For example, the following two classes are identical.

struct vector2d_struct {
    // the default member access is *public*, so this section is public
    vector2d_struct ();
    vector2d_struct (double, double);

    void set_cartesian (double x, double y);
    void set_polar (double radius, double angle);

    double get_x() const;
    double get_y() const;
    double get_radius() const;
    double get_angle() const;

private:
    double x, y;
};

class vector2d_class {
    // the default member access is *private*, so this section is private
    double x, y;
public:
    vector2d_class ();
    vector2d_class (double, double);

    void set_cartesian (double x, double y);
    void set_polar (double radius, double angle);

    double get_x() const;
    double get_y() const;
    double get_radius() const;
    double get_angle() const;
};

Constructors and Initialization

A constructor is a special member function used to initialize an object. A constructor has code just like any other member function. A constructor may also specify a list of initialization expressions for the data members. For example:

#include <string>

class named_object {
    std::string name;

public:
    named_object();
    named_object(const char * n);

    std::ostream & print(std::ostream & output);
};

named_object::named_object()
    : name("Nemo") {};

named_object::named_object(const char * n)
    : name(n) {};

Notice that the list defines initializations of the data members in terms of their specific constructors. In this case, name is of type std::string, which has a constructor that takes a const char * and that is invoked in both cases.

The direct base classes of a class can be considered data members, and therefore can also be initialized in the same way. For example:

class pixmap {
public:
    pixmap(const char * n): name(n) {};

private:
    const char * name;
};

class ascii_pixmap : public pixmap { // class ascii_pixmap extends pixmap
public:
    ascii_pixmap(unsigned int width, unsigned int height);

private:
    unsigned int width;
    unsigned int height;
    char * data;
    char fg;
    char bg;
};

ascii_pixmap::ascii_pixmap(unsigned int w, unsigned int h)
    : pixmap("ascii"), width(w), height(h), fg('*'), bg(' ') {
    data = new char[width*height]; // WARNING: check allocation
    std::cout << "ascii_pixmap::ascii_pixmap() called." << std::endl;
}

Copy Constructor

A copy constructor for a class T is a constructor that is declared to take a reference to an object or type T as parameter. More specifically T &, const T & (and other cases).

class vec2d {
public:
    double x;
    double y;

    vec2d() : x(0), y(0) {};                    // "default" constructor
    vec2d(double a, double b) : x(a), y(b) {};  // other constructor
    vec2d(const vec2d & v) : x(v.x), y(v.y) {}; // copy constructor
};

A copy constructor is invoked every time you initialize an object by copying another object (or r-value expression) of the same type. This happens explicitly in the examples below:

vec2d v1;                       // invokes default constructor
vec2d v2(10, 3.2);              // invokes the second constructor
vec2d v3(-2, 8);                // invokes the second constructor
vec2d v4 = v2;                  // invokes the copy constructor
vec2d v5(v3);                   // also invokes the copy constructor

The copy constructor is also invoked every time an object is initialized implicitly. This happens, for example, with the formal parameters of a function or method. For example, if one defines the method:

bool orthogonal(vec2d u, vec2d v) { // u and v are copy-constructed
    return (u.x*v.x + u.y*v.y == 0);
}

the following invocation of orthogonal implicitly calls the copy constructor:

if (orthogonal(v1, v3))
    std::cout << "v1 and v3 are orthogonal." << std::endl;
else
    std::cout << "v1 and v3 are not orthogonal." << std::endl;

Constructors and Assignments

There is a difference between an assignment and an initialization. For example:

int i = 7;                      // this is an initialization
int j;
j = 8;                          // this is an assignment

The example above is based on int objects. However, the same is true for user-defined types—consistent with the principle that user-defined types should be used exactly like basic, predefined types. For example:

std::string s = "ciao";         // this is an initialization (constructor)
std::string t;                  // default constructor
t = "mamma";                    // assignment

And of course C++ allows the programmer to define class-specific assignment operators. This is what happens with the assignment of the standard string objects in the example above. For our vec2d example, we could define various operators, including the assignment operator. This is sometimes called operator overloading. We will discuss operator overloading in more detail later.

Complete Example: Constructors, Copy-constructors, and Assignment

The example below illustrates the syntax and semantics of constructors, including copy-constructors, and assignments (and other operators).

#include <iostream>

class vec2d {
public:
    double x;
    double y;

    vec2d() : x(0), y(0) {
        std::cout << "default constructor." << std::endl;
    };

    vec2d(double a, double b) : x(a), y(b) {
        std::cout << "constructor with a=" << a << " b=" << b << std::endl;
    };

    vec2d(const vec2d & v) : x(v.x), y(v.y){
        std::cout << "copy constructor with v.x=" << v.x << " v.y=" << v.y << std::endl;
    };

    vec2d & operator = (const vec2d & v) {
        // (*this) is the implicit left-hand side operand of the assignment
        x = v.x;
        y = v.y;
        std::cout << "assignment with v.x=" << v.x << " v.y=" << v.y << std::endl;
        return *this;
    };  

    vec2d operator - (const vec2d & v2) {
        return vec2d(x - v2.x, y - v2.y);
    }
};

std::ostream & operator << (std::ostream & output, const vec2d & v) {
    output << "(" << v.x << "," << v.y << ")";
    return output;
}

vec2d operator - (const vec2d & v) {
    return vec2d(-v.x, -v.y);
}

vec2d operator - (const vec2d & v1, const vec2d & v2) {
    return vec2d(v1.x-v2.x, v1.y-v2.y);
}

bool orthogonal(vec2d u, vec2d v) { // u and v are also copy-constructed
    return (u.x*v.x + u.y*v.y == 0);
}

int main() {
    vec2d v1;                   // invokes default constructor
    vec2d v2(10, 3.2);          // invokes the second constructor
    vec2d v3(-2, 8);            // invokes the second constructor
    vec2d v4 = v2;              // invokes the copy constructor
    vec2d v5(v3);               // also invokes the copy constructor

    v1 = -v4;
    std::cout << "v1 = " << v1 << std::endl;
    std::cout << "v4 = " << v4 << std::endl;
    if (orthogonal(v1, v3))
        std::cout << "v1 and v3 are orthogonal." << std::endl;
    else
        std::cout << "v1 and v3 are not orthogonal." << std::endl;
};

The output of the program above is:

default constructor.
constructor with a=10 b=3.2
constructor with a=-2 b=8
copy constructor with v.x=10 v.y=3.2
copy constructor with v.x=-2 v.y=8
constructor with a=-10 b=-3.2
assignment operator with v.x=-10 v.y=-3.2
v1 = (-10,-3.2)
v4 = (10,3.2)
copy constructor with v.x=-2 v.y=8
copy constructor with v.x=-10 v.y=-3.2
v1 and v3 are not orthogonal.

Virtual Methods

The following code exemplifies the syntax and semantics of virtual methods.

class A {
public:
    virtual int f();
    virtual int g(int x);

    int a;
};

int A::f() {
    std::cout << "A::f() called" << std::endl;
    return 0;
}

int A::g(int x) {
    std::cout << "A::g(" << x << ") called" << std::endl;
    return x + 1;
}

class B : public A {
public:
    virtual int f();

    int b;
};

int B::f() {
    std::cout << "B::f() called" << std::endl;
    return 10;
}

int main() {
    A * a;
    B * b = new B();

    a = b;

    std::cout << a->f() << std::endl;
    std::cout << b->g(10) << std::endl;
}

Operator Overloading

Imagine you are defining a matrix class to represent matrices in a mathematical library. In C you might do that as follows:

struct matrix {
    unsigned int cols;
    unsigned int rows;
    float * values;
};

You might then define and then use an addition operation as follows:

void matrix_add(struct matrix * A, const struct matrix * B) {
    /* computes A += B */
    if (A->cols != B->cols || A->rows != B->rows)
        return;                 /* error */
    int * a = A->data;
    const int * a = B->data;
    for (unsigned int i = 0; i < A->rows; ++i)
        for (unsigned int j = 0; j < A->cols; ++j)
            *a++ += *b++;
}

int main() {
    struct matrix A;
    struct matrix B;

    /* ... */
    matrix_add(&A, &B);
}

This solution would be pretty much identical in Java. However, in C++ you can more naturally define an addition operation by defining the += operator. You can do that on the same exact struct matrix defined above:

matrix & operator += (matrix & A, const matrix & B) {
    /* computes A += B and returns A */
    if (A.cols != B.cols || A.rows != B.rows)
        return A;
    int * a = A.data;
    const int * a = B.data;
    for (unsigned int i = 0; i < A.rows; ++i)
        for (unsigned int j = 0; j < A.cols; ++j)
            *a++ += *b++;
    return A;
}

int main() {
    struct matrix A;
    struct matrix B;

    /* ... */
    A += B;
}

Container Library

One of the great advantages of C++ over C is the utility of the container library. It is fair to say that the C standard library is quite limited in that it lacks even basic data structures and algorithms. On the contraty, the standard library of C++ is very rich and also versatile.

Let’s look at some basic library features. The library covers strings.

#include <string>
#include <iostream>

int main() {
    std::string s;
    while (std::cin >> s) {
        std::string::size_type pos = s.find("ciao");
        if (pos == std::string::npos) {
            std::cout << s << " does not contains the string `ciao'." << std::endl;
        } else {
            std::cout << s << " contains the string `ciao' at position " << pos << std::endl;
        }
    }
}

As it turns out, strings are a lot like a container of the container library of the C++ standard library. So, let’s look at perhaps the most basic container, namely the vector:

#include <vector>
#include <iostream>

int main() {
    std::vector<int> A;
    int x;

    while (std::cin >> x)
        A.push_back(x);

    std::cout << "I just read " << A.size() << " numbers:" << std::endl;
    for (std::vector<int>::iterator itr = A.begin(); itr != A.end(); ++i)
        std::cout << *itr << std::endl;
}

The vector class defines a generic vector. Generic here means with respect to the type of the elements in the vector, which in this example is int. Thus vector is a template class, just like practically all container classes of the C++ Standard library. We now look a the idea of a template container a bit more closely.

A Template Library

How would you implement a container library in C? For example, how would you implement a list in C? You should first ask a fundamental question: a list of what? Let’s say we start with a list of integers. You would then write something like the following:

/* a single-link list of integers */
struct list_int {
    int value;
    struct list_int * next;
};

struct list_int * list_int_add(struct list_int * l, int v) {
    struct list_int * res = malloc(sizeof(struct list_int));
    if (res) {
        res->value = v;
        res->next = l;
    }
    return res;
}

int list_int_find(struct list_int * l, int v) {
    for (; l; l = l ->next)
        if (l->value == v)
            return 1;
    return 0;
}

Now, imagine you need another list, say, a list of double. What would you do? You would probably write the same exact code, except that you would change the value type to double:

/* a single-link list of doubles */
struct list_double {
    double value;
    struct list_double * next;
};

struct list_double * list_double_add(struct list_double * l, double v) {
    struct list_double * res = malloc(sizeof(struct list_double));
    if (res) {
        res->value = v;
        res->next = l;
    }
    return res;
}

int list_double_find(struct list_double * l, double v) {
    for (; l; l = l ->next)
        if (l->value == v)
            return 1;
    return 0;
}

An alternative design would be to have a list of pointers to objects of any type. You can do that with a list of pointers to void.

/* a single-link list of doubles */
struct list_generic {
    void * value;
    struct list_generic * next;
};

struct list_generic * list_generic_add(struct list_generic * l, void * v) {
    struct list_generic * res = malloc(sizeof(struct list_generic));
    if (res) {
        res->value = v;
        res->next = l;
    }
    return res;
}

This works. However, notice that each element in the list requires two separate allocations: one for the structural element of the list (struct list_generic), and one for the element object itself, which in the example above is delegated and therefore owned by the user.

Notice also that looking for an element is not as immediate either. Copying the same code as above would result in a comparison between pointers, which is typically not semantically meaningful:

int list_generic_find(struct list_generic * l, void * v) {
    for (; l; l = l ->next)
        if (l->value == v)
            return 1;
    return 0;
}

Instead, we would have to also define a comparison function. We could do it like so:

int list_generic_find(struct list_generic * l, void * v, int (*compare)(void *, void *)) {
    for (; l; l = l ->next)
        if (compare(l->value, v))
            return 1;
    return 0;
}

Fortunately, C++ also allows to define generic classes that can be then specialized, or instantiated for a specific type. So, in C++ you could write something like this:

/* a single-link list of doubles */
template <typename T>
struct list {
    T  value;
    struct list * next;
};

template <typename T>
struct list<T> * list_add(struct list<T> * l, const T & v) {
    struct list<T> * res = malloc(sizeof(struct list<T>));
    if (res) {
        res->value = v;
        res->next = l;
    }
    return res;
}

template <typename T>
int list_find(struct list<T> * l, const T & v) {
    for (; l; l = l ->next)
        if (l->value == v)
            return 1;
    return 0;
}

Or better yet, we can define everything within a template class:

/* a single-link list of doubles */
template <typename T>
class list {
private:
    struct list_elem {
        T value;
        list_elem * next;
        list_elem(const T & v, list_elem * n)
            : value(v), next(n) {};
    };

    list_elem * first;

public:
    list()
        : first(nullptr) {};

    void list_add(const T & v) {
        first = new list_elem(v, first);
    }

    bool list_find(const T & v) {
        for (const list_elem * i = first; i != nullptr; i = i->next)
            if (i->value == v)
                return true;
        return false;
    }
};

Notice that this code requires that the template type parameter (T) support an equality comparison operator (==). This is quite natural in C++, and of course in any case we can also support alternative designs with ad-hoc comparator functions or objects. In fact, all of this is already implemented in the C++ container library.

Another fundamental concept in the container library is the concept of iterator. An iterator in a linear data structure is an abstraction of a pointer in an array in C. One way to iterate over the elements of an array in C is as follows:

int A[100];

int print_array() {
    for (const int * i = A; i != A+100; ++i)
        printf("%d\n", *i);
}

int print_range(const int * begin, const int * end) {
    for (; begin != end; ++begin)
        printf("%d\n", *begin);
}

int main() {
    /* ... */
    print_range(A, A + 50);
    print_range(A + 50, A + 100);
}

The iterations in print_array and print_range use two pointers: begin is a pointer to the first element of the iteration, and end is the first element after the last element of the iteration. Thus the iteration starts with an “iterator” pointer set at begin and increments that pointer until it is equal to end. The body of the iteration then accesses each object by dereferencing the pointer.

The same pattern is pervasive in the use of the C++ standard library. For example:

#include <vector>

std::vector <int> A(100);

int print_array() {
    for (std::vector<int>::const_iterator i = A.begin(); i != A.end(); ++i)
        printf("%d\n", *i);
}

int print_range(std::vector<int>::const_iterator begin, std::vector<int>::const_iterator end) {
    for (; begin != end; ++begin)
        printf("%d\n", *begin);
}

int main() {
    /* ... */
    print_range(A.begin(), A.begin() + 50);
    print_range(A.begin() + 50, A.end());
}

Notice that in the example above we use iterators exactly like pointers also in the same form of pointer arithmetic. That is, for a pointer to an integer p, we can write p+10 to point to the tenth integer in the array of integers pointed to by p. Similarly, using the vector class of the standard C++ library, we can take an iterator to the beginning of the vector, we can increment an iterator, and we can de-reference an iterator to obtain the element. We can also move an iterator by a number of positions using classic pointer arithmetic. However, this latter operation is not available for all data structures. For example, we can not use it in a linked list. Still, in essence, the concept of iterator is identical also for a list:

#include <list>

std::list <int> L;

int print_list() {
    for (std::list<int>::const_iterator i = L.begin(); i != L.end(); ++i)
        printf("%d\n", *i);
}

int print_range(std::list<int>::const_iterator begin, 
                std::list<int>::const_iterator end) {
    for (; begin != end; ++begin)
        printf("%d\n", *begin);
}

Templates

A template is code that is parameterized by a type. Or by two types, or by a value, or by a value and a type, or… yeah, we could go on forever. But let’s keep things simple for now. Here the term “code” refers to class and function definitions. For example, a function template defines a family of functions, each specialized for a specific type. The following code defines a family of functions that returns the size of an array of elements of a given type.

template <typename T>
unsigned int array_size(unsigned int n) {
    return n * sizeof(T);
}

Notice that this definition per-se is what defines the family of functions, but does not in itself define any specific function within that family. The specific functions in the family are defined, implicitly or explicitly, by instantiating the template. For example:

std::cout << array_size<char>(100) << endl;

The compiler can also deduce the type parameter directly from an instantiation. For example:

#include <fstream>
#include <chrono>

template <typename T>
void log_message(const T & x) {
    std::ofstream log_stream("LOG.TXT", std::app);
    log_stream << std::chrono::system_clock::now() << ": " << x << std::endl;
}