Basic Elements of C++
C++ is a modern, general-purpose programming language. It is very powerful, very rich, and comes with a vast library of data structures, algorithms, utilities, and many other features. As programmer Bernie might well say, C++ is a YUUGE language! Alas, this is a very brief introduction to C++. So, we only look at some basic elements of the language and the standard library. We also presume that you are somewhat familiar with the C language.
Hello World! Plus a Non-Trivial Example
You know the drill. Write and compile the following C++ program. Do it!
#include <iostream> int main() { std::cout << "Hello world!" << std::endl; }
Okay, I know you’ve seen enough Hello world! programs by now, but it is still good to start with a very simple example even just to see if things work at a basic level (editor, compiler, etc.). Plus, I hope you’d also be somewhat intrigued by the use of those left-shift operators. No? Oh, I see, you’ve never seen a left-shift operator. That’s okay. No worries. Maybe we’ll talk about those at some point. But for now, the main goal is to make sure you have a working program that prints “Hello world!”. Did that work for you? If not, you’d better ask your good hacker friend to help you sort things out. If and when everything works as expected, let’s try a more involved and also more interesting example.
#include <iostream> #include <vector> #include <algorithm> int main() { int i; std::vector<int> A; while (std::cin >> i) A.push_back(i); std::sort(A.begin(), A.end()); for (std::vector<int>::iterator itr = A.begin(); itr != A.end(); ++itr) std::cout << *itr << std::endl; }
The above code exemplifies many features of C++. We will examine many of these features in detail at some point, but let’s just go through the code intuitively for now. Let’s start with the first three lines:
#include <iostream> #include <vector> #include <algorithm>
Including header files works like in C. The same is true for the
preprocessor in general. Now, let’s look at the main
function. As
you can see, that is also similar to C. In fact, its general
parameters are the same as in C. For example, we could write the
following code to print the command-line arguments:
#include <iostream> int main(int argc, char * argv[]) { for (int i = l; i < argc; ++i) std::cout << argv[i] << std::endl; }
Going back to the example (the second one, not the Hello world!
one), in the second line of the body of the main
function, we find a
first truly new feature that distinguishes C++ from C:
std::vector<int> A;
This is a declaration of a vector of int
objects. As one can
intuitively figure out, a vector is a linear data structure (a
sequence) that also supports access by index, just like an ordinary
array in C. Except that it is dynamic, meaning that it can grow and
shrink as needed, which is what happens two lines down with
A.push_back(i)
. And there is so much more in the declaration and
use of this vector. We can not possibly cover those concepts and
ideas in this introductory example, but don’t worry, we will return to
them later.
Let’s now consider the head of the while
loop:
while (std::cin >> i)
Intuitively, this means: read an integer from the standard “console”
input stream std::cin
and run the loop as long as the read operation
is successful, otherwise terminate the loop when the read operation
fails or there is nothing to read. That is the intent, again
intuitively. However, here too there is a lot to unpack. What is the
deal with the right-shift (>>
) operator? And how is it possible
that the right-shift expression std::cin >> i
would store a value
into object i
? Yet, that is exactly what happens.
Let’s move on. After we fill the array with integer values read from
the standard input, we call the sort
function provided by the
standard library:
std::sort(A.begin(), A.end());
As expected, this sorts the vector. Simple! Needless to say, there
is a lot to unpack here, too. First, notice that the vector supports
the two methods begin()
and end()
. In general, C++ supports
user-defined types, like vector
, in the form of classes that
define “objects” in the sense of object-oriented programming. So, a
vector is an object that encapsulates some data, and that provides
some associated member functions, also referred to as methods. We
will discuss all these concepts in detail later. Here in particular,
we see two methods, begin()
and end()
, that return a kind of
pointer to the beginning and the end of the vector, respectively.
These pointer-like objects are called iterators, and again there is
a lot to say about iterators, as we will see later.
The nice thing about this sort
implementation, is that it is
generic. In fact, a vector
is also generic, and this form of
generic programming is a fundamental and pervasive feature of C++ and
its standard library. We will see about this later, but for now let’s
see what it means for sort
to be generic through another example.
The code below shows how one can use the same sort
on, say, an array
of float
objects:
float X[100]; for (int i = 0; i < 100; ++i) std::cin >> X[i]; std::sort(X, X + 100);
Here std::sort
works on an array of floating-point numbers. This is
a traditional C-style array, that is, a sequence of contiguous float
objects in memory. In this case sort
takes two pointers: X
points
to the first element of the array, and X+100
points to the element
immediately after the last one. In the previous example,
sort(A.begin(),A.end())
, takes two user-defined pointer-like
“iterators” whose semantics is the same as pointers to the first and
one-past-last elements in the vector container, respectively. But the
point here is, it’s the same sort
!
In fact, it isn’t just sort
that is generic here. Notice that
std::cin >> X[i]
is also the same kind of expression to read a
floating-point number as the previous std::cin >> i
to read an
integer. In C you would use scanf("%d",&i)
to read an integer or
scanf("%f",&x)
to read a floating-point number. So you use the same
function scanf
but you need to specify whether you want to read an
integer or a floating-point number, and you do that with the format
strings "%d"
or "%f"
. C++ is more direct, and less redundant and
error-prone. The compiler knows that i
is an integer, so the
expression std::cin >> i
reads an integer, similarly X[i]
is a
floating-point number, so std::cin >> X[i]
just does the right thing
and reads a floating-point number from the input. And you don’t have
to bother adding a format string filled with those pesky input or
formatting directives.
In the last part of the example, we print out the content of the vector, and by now you should intuitively figure out what is going on with the iteration.
for (std::vector<int>::iterator itr = A.begin(); itr != A.end(); ++itr) std::cout << *itr << std::endl;
Some might lament that this code is complicated and clunky. Perhaps it looks that way to some, but a good C++ programmer would immediately recognize the pattern and would not be bothered by the apparent complexity. Still, if not complex, that is certainly a long instruction for a simple for-loop. But look, you can write the same loop more compactly:
for (auto itr = A.begin(); itr != A.end(); ++itr) std::cout << *itr << std::endl;
Or even more simply:
for (auto a : A) std::cout << a << std::endl;
C++ for the Impatient: Examples
If you are a C programmer and you are approaching C++ for the first time, you might be turned off a bit. If you are a beginner in C, you might already feel uneasy about the complexities of keeping track of objects and using them correctly through pointers. Why would you want to get tangled up in a language like C++ that is presumably way more complex than C? Even some very experienced C programmers are skeptical if not even harshly critical of C++ for various reasons.
The thing is, C++ is indeed a rich and therefore complex language. And it may or may not be appropriate for some application domains, such as programming operating systems. However, many of its rich features can provide a significant and almost immediate pay-off. In other words, C++ simplifies programming a whole lot in many cases. Let’s see some examples.
Suppose you have to write a program or a library to deal with a network of dependencies. An element \(x\) might depend on other elements \(y_1, y_2, \ldots\), which in turn might depend on other elements. Suppose you are reading the dependencies from an input stream such that each line has a string representing an element \(x\) followed by zero or more strings representing its dependencies \(y_1, y_2, \ldots\). Suppose also that you’d have to compute the set of direct and indirect dependencies for a given element \(x\). This amounts to representing a directed graph and writing a simple exploration of the graph (e.g., BFS). Conceptually, this is quite straightforward. However, in C you would have to deal with the complexities of dynamic data structures, including strings to represent elements, dynamic arrays for the adjacency list, and also some kind of associative data structure to have an efficient access to the adjacency list.
In C++, all of that is very straightforward:
#include <iostream> #include <vector> #include <map> #include <string> #include <sstream> #include <deque> std::vector <std::string> V; // vertex id --> element name std::map <std::string, unsigned> V_idx; // element name --> vertex id std::vector <std::vector <unsigned> > G; // adjacency list: vertex id --> list of vertex id static unsigned add_vertex (const std::string & name) { auto i = V_idx.insert({name, G.size()}); if (i.second) { // name did not exist, so insert actually inserted in V_idx V.push_back(name); G.push_back({}); } return i.first->second; // return the index in G } void read_dependencies (std::istream & input) { std::string line; while (std::getline(input, line)) { std::istringstream line_input (line); std::string u, v; line_input >> u; unsigned u_id = add_vertex(u); while (line_input >> v) { unsigned int v_id = add_vertex(v); G[u_id].push_back(v_id); } } } void print_all_dependencies (const std::string & x, std::ostream & output) { unsigned int u; auto itr = V_idx.find(x); if (itr == V_idx.end()) return; // error: element not found u = itr->second; std::vector <bool> visited (G.size()); std::deque <unsigned int> Q; visited[u] = true; Q.push_back(u); while (! Q.empty()) { u = Q.front (); Q.pop_front (); for (auto v : G[u]) if (! visited[v]) { output << V[v] << " "; visited[v] = true; Q.push_back(v); } } }
In this program, we use a number of container data structures,
including strings, from the C++ standard library. The string
class
provides us with very flexible means to represent strings, which we
use to store the names of elements in the application, as well as to
read entire lines from the input through the getline
function.
Notice that in all those cases, the storage allocation is handled
transparently by the string object. Indeed, there isn’t a single
explicit memory allocation operation in the whole program!
Another extremely useful data structure we use is a vector
, which we
already introduced with the first example. What is particularly
interesting in this case is that we use a vector of vectors to
represent the adjacency list of the dependency graph. In particular,
we declare and use the adjacency list as a vector<vector<unsigned>>
.
As we mentioned in the case of the first example, a vector
is a
generic container, and this example illustrates this notion quite
clearly.
We then use a double-ended queue to implement the breadth-first search
over the dependency graph. That data structure is directly provided
by the C++ standard library through the deque
class, which is also a
generic container. Last but not least, we use a map as an index
from element names to vertex ids. The “id” of a vertex is simply its
position within the adjacency-list (vector G
). So, to access
vertexes within the graph by name, we use V_idx
, which maps element
names (string
) to ids (unsigned
) and it is in fact declared as a
map<string,unsigned>
.
Notice also how we read and parse the graph from the input. We use
getline
to read each line from the input stream into a string
variable (line
), and then we use the istringstream
feature that
conveniently turns that string
into an input stream.
Some Differences with C
In the same spirit of starting with intuitions, we list here a few differences between C and C++.
Headers
Standard headers do not have a “.h” suffix. We’ve seen this already:
#include <iostream>
All the C standard headers are available with a “c” prefix and without the “.h” suffix.
#include <cstdio> #include <cassert>
Of course you may still include files with any name (compatible with your system) and it is still common practice to use the “.h” suffix for C++ headers.
#include <iostream> #include "my_library.h"
Null Pointers
C++ has a proper literal value for the null pointer:
int * p = nullptr;
You can still write int * p = 0;
but nullptr
is preferable for
clarity in the code—for the same reason some programmers might
prefer to write int * p = NULL
—but also for other, technical
reasons.
Boolean Type and Values
C++ has a specific Boolean type, bool
, with its literal values
true
and false
.
bool hungry = false; bool happy = true;
Namespaces
We have already seen a namespace in our first example. When we write
std::vector
, or std::cin
, or std::sort
, we are referring to
classes, objects, or functions of the std
namespace, which is where
you find the standard library of C++.
And of course, users can define and then use their own namespaces to support basic modularization of identifiers.
namespace usi { namespace math_lib { int sum(int a, int b) { return a + b; } } // namespace math_lib } // namespace usi int vector_sum(int * begin, int * end) { int s; for(s = 0; begin != end; ++begin) s = usi::math_lib::sum(s, *begin); // qualified name for function sum() return s; }
Namespaces are what they say they are: confined spaces for names to modularize identifiers and also to avoid conflicts.
namespace usi { namespace math_lib { int sum(int a, int b) { return a + b; } } // namespace math_lib } // namespace usi namespace mod_7 { double sum(int a, int b) { return (a + b) % 7; } } // namespace mod_7 using mod_7::sum; int main() { int i; int j; i = sum(1,2); // uses mod_7::sum j = usi::math_lib::sum(1,2); }
The C programming language does not have namespaces, and therefore a similar modularization is achieved through name prefixes:
/* namespace "usi" { print(); set_name(); get_name() */ void usi_print(); void usi_set_name(const char * name); const char * usi_get_name();
It might be convenient to use specific identifiers without having to
refer to their namespace. This can be done with using
directives,
as in the example below:
#include <iostream> using std::cout; using std::endl; int main() { cout << "ciao!" << endl; };
You might even do that for an entire namespace:
#include <iostream> using namespae std; int main() { cout << "ciao!" << endl; };
References
A reference is an alias to an existing object.
int i = 1; int & ref = i; // reference initialization
The code above declares ref
as an alias of object i
. Notice
that the reference is not itself an object. In fact, in this case the
compiler might implement the reference without allocating any memory,
conceptually replacing every use of ref
with i
. Sometimes
the compiler might need to store a reference, in which case the
underlying implementation will probably consist of a pointer object.
Notice that there is a fundamental difference between the initialization and an assignment of a reference.
int i = 1; int j = 2; int & ref = i; // reference initialization ref = j; // assignment. Of what object? j = j + 1; std::cout << ref << std::endl;
Since a reference is an alias and not an object, every reference must be immediately initialized.
int & ref; // error: makes no sense
Also, again since a reference is not an object, it makes no sense to declare arrays of references, pointers to references, and references to references.
int i; // i is the only object here int & ref = i; // ref is now an alias of i int & * p; // error: makes no sense int & A[10]; // error: makes no sense int & & double_ref; // error: makes no sense
However, even though the reference itself is not an object, as an expression a reference does identify an object—that is, it is an lvalue. So, among other things, you can take the address of a reference. And you get the address of the object that the reference refers to. Try the following code to see that yourself.
int i; int & ref = i; std::cout << "i is here: " << &i << "; ref is here: " << &ref << std::endl;
Since references are lvalues (they identify an object), the initialization of a reference must evaluate to an lvalue.
int i; int & r = i; // okay int & s = i + 1; // Error!
However, the initialization of a const
reference—meaning, a
reference to a const
object—does not have to be an l-value. Do
you see why? Consider the following code as an example.
int sum(const int & a, const int & b) { return a + b; } int main() { int i = 7; int j = 2; std::cout << sum(i, j + 1) << std::endl; }
We insist that a reference must be initialized with an lvalue, but that is only true for lvalue references, which are the references we have seen so far. There are also rvalue references, but we won’t talk about them here. Instead, let’s talk about using references. References can and are often used as formal function parameters. What is the output of the following program?
void swap(int & a, int & b) { int tmp = a; a = b; b = tmp; } int main() { int i = 7; int j = 2; swap(i, j); std::cout << "i = " << i << std::endl << "j = " << j << std::endl; }
One reason to use references is to obtain side-effects, as in the example above. Another reason is to avoid expensive copy operations when we pass complex objects as parameters.
void print_size_slow(vector<int> v) { std::cout << v.size() << std::endl; } void print_size_fast(vector<int> & v) { std::cout << v.size() << std::endl; } int main() { vector<int> A(1000); // a vector of 1000 integers (all 0) print_size_slow(A); // makes a copy of the entire vector print_size_fast(A); // operates directly on A }
In fact, the print functions in the example above do not need to
change the input vector, so the best way to define them is as taking a
constant reference (const vector<int> &
) to the vector.
It is perhaps less intuitive but certainly equally important, perhaps even more important to return a reference from a function. This way, function calls can be used as lvalue expressions. As an example, consider a simplistic implementation of a matrix:
int M[2000]; int height = 40; int width = 30; int & m(int col, int row) { return M[col + row*width]; } int main() { for (int i = 0; i < height; ++i) for (int j = 0; j < width; ++j) std::cin >> m(i,j); // stores a value into M m(0,0) = 1; // stores a value into M }
This feature is even more important in combination with another fundamental feature of C++, namely operator overloading.
Exceptions
C++ provides exceptions as a powerful error-handling mechanism.
As in other languages, such as Java, the code that notices an error
might raise an exception by throwing an exception object with a
throw
expression. The exception is then caught by the
first matching catch
block of a try
block that
directly or indirectly led to the execution of the throw
clause. This means that the control flow goes back up through the
execution stack. A matching catch
block is one
where the catch-clause matches the type of the exception object
defined by the throw
expression.
#include <iostream> #include <new> int * ramp(int n) { int * R = new int[n]; for (int i = 0; i < n; ++i) R[i] = i; return R; } int main() { try { int * A = ramp(1000000); // ... delete [](A); } catch (const std::bad_alloc& e) { std::cout << "Allocation failed: " << e.what() << '\n'; } }
The exceptions thrown by functions of the standard library are
typically derived from (subtype of) std::exception
. However, any
value can be thrown as an exception. For example:
#include <string> #include <iostream> const char * small_error = "okay"; const char * big_error = "not okay"; const char * nasty_error = "AAAHHhhhh!"; int f(int x) { if (x > 7) return x - 2; switch(x) { case 0: throw 10; case 1: throw 20; case 2: throw small_error; case 3: throw big_error; default: throw nasty_error; } } int main() { int i; std::cout << "Input: "; std::cin >> i; try { std::cout << "Output: " << f(i) << std::endl; } catch (int ex) { std::cout << "error number " << ex << std::endl; } catch (const char * error_msg) { std::cout << error_msg << std::endl; } }
Notice that f(int x)
throws some exceptions but those exceptions are
not specified as part of the function declaration. In C++, exceptions
are not part of a function type. A function or method is either
potentially-throwing, meaning that it may throw exceptions of any
type, or non-throwing, meaning that it will never throw exceptions.
In any case, there is no static type checking for exceptions. That
is, the compiler does not prevent a non-throwing function from
directly or indirectly throwing exceptions. Instead, those
restrictions are enforced at run-time. Let’s see how all this works.
By default, without an exception specification, a function or method
is potentially-throwing. Destructors and other special default member
functions and deallocation functions are a special case: they are
non-throwing by default. You may declare a function as non-throwing
with the noexcept
specifier. And you may also declare a function as
potentially-throwing with noexcept(false)
. For example, the
functions int f(int x) noexcept
and void ciao() noexcept
defined
in the example below are guaranteed to never throw exceptions. Let’s
see what that means.
#include <string> #include <iostream> #include <exception> const char * small_error = "okay"; const char * big_error = "not okay"; const char * nasty_error = "AAAHHhhhh!"; void abandon_ship() { std::cout << "Mayday mayday mayday!" << std::endl; } void ciao() noexcept { std::cout << "ciao!" << std::endl; } void goodbye() { std::cout << "goodbye!" << std::endl; } int f(int x) noexcept { if (x > 7) return x - 2; switch(x) { case 0: throw 10; case 1: throw 20; case 2: throw small_error; case 3: throw big_error; default: throw nasty_error; } } int main() { std::set_terminate(abandon_ship); int i = 1; std::cout << "Input: "; try { std::cout << "Output: " << f(i) << std::endl; } catch (int ex) { std::cout << "error number " << ex << std::endl; } catch (const char * error_msg) { std::cout << error_msg << std::endl; } try { // the compiler can ignore the whole try-catch ciao(); // because ciao() is declared as "noexcept" } catch (int ex) { // which means that the catch block is dead code std::cout << "error number " << ex << std::endl; } try { // the compiler must handle the try-catch block goodbye(); // because goodbye() may throw any exception } catch (int ex) { std::cout << "error number " << ex << std::endl; } }
Notice that in the example code above, function f
is declared as
non-throwing but its code may well throw various exceptions. The
ciao()
function is also declared as non-throwing, and you might
consider that to be okay, since the code of the function is simple
enough and does not itself throw exceptions. However, notice that the
code of ciao()
may in fact throw exceptions, since the functions and
methods used therein, which are hidden behind the use of the
left-shift operator (<<
) are potentially-throwing. You can check
that yourself by reading the documentation of the operator<<
of the
std::ostream
class of the standard library. I’m telling you, that
makes for excellent bathroom reading!
So, is this code correct? As already mentioned above, the noexcept
specification does not prevent that function or any one of the
functions called directly or indirectly by that function to throw
exceptions. So, the code is correct, and therefore the compiler
should dutifully and correctly compile that code. On the other hand,
the compiler will also ensure that the semantics of the noexcept
specification is realized.
The question, then, is, what happens if one of the non-throwing
functions actually throws an exception? You should definitely try
running the program and make sense of it intuitively. (Do it!) What
happens is that the exception is never passed to the caller, and
instead the exception triggers the invocation of std::terminate()
,
which in turn invokes the default termination function std::abort()
or a user-defined termination handler. In the case of the example
above, the code defines that termination handler to be the
abandon_ship()
function with std::set_terminate(abandon_ship)
.
Notice that the execution actually terminates right after the
execution of the termination handler in std::terminate
.
Classes
A class is a user-defined type. As you might expect, a class may
have data members and member functions (or “methods”). A C++
class may also define nested classes and other type definitions. Here
is an example of a String
class.
class String { public: // constructors (see below) String(); String(const char *); String(const char * begin, const char * end); String(const char * begin, unsigned int len); String(const String &); // destructor ~String(); // other, non-static member functions const char * c_string() const; unsigned int size() const; unsigned int append(const char *); // a static member function static unsigned int max_size(); private: // data members, all private char * data; unsigned int string_size; unsigned int allocated_size; };
Notice that a class can be defined almost identically by either the
class
or struct
keywords. The only difference is that the default
member access is private
for class
while it is public
for
struct
. For example, the following two classes are identical.
struct vector2d_struct { // the default member access is *public*, so this section is public vector2d_struct (); vector2d_struct (double, double); void set_cartesian (double x, double y); void set_polar (double radius, double angle); double get_x() const; double get_y() const; double get_radius() const; double get_angle() const; private: double x, y; }; class vector2d_class { // the default member access is *private*, so this section is private double x, y; public: vector2d_class (); vector2d_class (double, double); void set_cartesian (double x, double y); void set_polar (double radius, double angle); double get_x() const; double get_y() const; double get_radius() const; double get_angle() const; };
Constructors and Initialization
A constructor is a special member function used to initialize an object. A constructor has code just like any other member function. A constructor may also specify a list of initialization expressions for the data members. For example:
#include <string> class named_object { std::string name; public: named_object(); named_object(const char * n); std::ostream & print(std::ostream & output); }; named_object::named_object() : name("Nemo") {}; named_object::named_object(const char * n) : name(n) {};
Notice that the list defines initializations of the data members in
terms of their specific constructors. In this case, name
is of type
std::string
, which has a constructor that takes a const char *
and
that is invoked in both cases.
The direct base classes of a class can be considered data members, and therefore can also be initialized in the same way. For example:
class pixmap { public: pixmap(const char * n): name(n) {}; private: const char * name; }; class ascii_pixmap : public pixmap { // class ascii_pixmap extends pixmap public: ascii_pixmap(unsigned int width, unsigned int height); private: unsigned int width; unsigned int height; char * data; char fg; char bg; }; ascii_pixmap::ascii_pixmap(unsigned int w, unsigned int h) : pixmap("ascii"), width(w), height(h), fg('*'), bg(' ') { data = new char[width*height]; // WARNING: check allocation std::cout << "ascii_pixmap::ascii_pixmap() called." << std::endl; }
Copy Constructor
A copy constructor for a class T
is a constructor that is declared
to take a reference to an object or type T
as parameter. More
specifically T &
, const T &
(and other cases).
class vec2d { public: double x; double y; vec2d() : x(0), y(0) {}; // "default" constructor vec2d(double a, double b) : x(a), y(b) {}; // other constructor vec2d(const vec2d & v) : x(v.x), y(v.y) {}; // copy constructor };
A copy constructor is invoked every time you initialize an object by copying another object (or r-value expression) of the same type. This happens explicitly in the examples below:
vec2d v1; // invokes default constructor vec2d v2(10, 3.2); // invokes the second constructor vec2d v3(-2, 8); // invokes the second constructor vec2d v4 = v2; // invokes the copy constructor vec2d v5(v3); // also invokes the copy constructor
The copy constructor is also invoked every time an object is initialized implicitly. This happens, for example, with the formal parameters of a function or method. For example, if one defines the method:
bool orthogonal(vec2d u, vec2d v) { // u and v are copy-constructed return (u.x*v.x + u.y*v.y == 0); }
the following invocation of orthogonal
implicitly calls the copy
constructor:
if (orthogonal(v1, v3)) std::cout << "v1 and v3 are orthogonal." << std::endl; else std::cout << "v1 and v3 are not orthogonal." << std::endl;
Constructors and Assignments
There is a difference between an assignment and an initialization. For example:
int i = 7; // this is an initialization int j; j = 8; // this is an assignment
The example above is based on int
objects. However, the same is
true for user-defined types—consistent with the principle that
user-defined types should be used exactly like basic, predefined
types. For example:
std::string s = "ciao"; // this is an initialization (constructor) std::string t; // default constructor t = "mamma"; // assignment
And of course C++ allows the programmer to define class-specific
assignment operators. This is what happens with the assignment of the
standard string objects in the example above. For our vec2d
example, we could define various operators, including the assignment
operator. This is sometimes called operator overloading. We will
discuss operator overloading in more detail later.
Complete Example: Constructors, Copy-constructors, and Assignment
The example below illustrates the syntax and semantics of constructors, including copy-constructors, and assignments (and other operators).
#include <iostream> class vec2d { public: double x; double y; vec2d() : x(0), y(0) { std::cout << "default constructor." << std::endl; }; vec2d(double a, double b) : x(a), y(b) { std::cout << "constructor with a=" << a << " b=" << b << std::endl; }; vec2d(const vec2d & v) : x(v.x), y(v.y){ std::cout << "copy constructor with v.x=" << v.x << " v.y=" << v.y << std::endl; }; vec2d & operator = (const vec2d & v) { // (*this) is the implicit left-hand side operand of the assignment x = v.x; y = v.y; std::cout << "assignment with v.x=" << v.x << " v.y=" << v.y << std::endl; return *this; }; vec2d operator - (const vec2d & v2) { return vec2d(x - v2.x, y - v2.y); } }; std::ostream & operator << (std::ostream & output, const vec2d & v) { output << "(" << v.x << "," << v.y << ")"; return output; } vec2d operator - (const vec2d & v) { return vec2d(-v.x, -v.y); } vec2d operator - (const vec2d & v1, const vec2d & v2) { return vec2d(v1.x-v2.x, v1.y-v2.y); } bool orthogonal(vec2d u, vec2d v) { // u and v are also copy-constructed return (u.x*v.x + u.y*v.y == 0); } int main() { vec2d v1; // invokes default constructor vec2d v2(10, 3.2); // invokes the second constructor vec2d v3(-2, 8); // invokes the second constructor vec2d v4 = v2; // invokes the copy constructor vec2d v5(v3); // also invokes the copy constructor v1 = -v4; std::cout << "v1 = " << v1 << std::endl; std::cout << "v4 = " << v4 << std::endl; if (orthogonal(v1, v3)) std::cout << "v1 and v3 are orthogonal." << std::endl; else std::cout << "v1 and v3 are not orthogonal." << std::endl; };
The output of the program above is:
default constructor. constructor with a=10 b=3.2 constructor with a=-2 b=8 copy constructor with v.x=10 v.y=3.2 copy constructor with v.x=-2 v.y=8 constructor with a=-10 b=-3.2 assignment operator with v.x=-10 v.y=-3.2 v1 = (-10,-3.2) v4 = (10,3.2) copy constructor with v.x=-2 v.y=8 copy constructor with v.x=-10 v.y=-3.2 v1 and v3 are not orthogonal.
Virtual Methods
The following code exemplifies the syntax and semantics of virtual methods.
class A { public: virtual int f(); virtual int g(int x); int a; }; int A::f() { std::cout << "A::f() called" << std::endl; return 0; } int A::g(int x) { std::cout << "A::g(" << x << ") called" << std::endl; return x + 1; } class B : public A { public: virtual int f(); int b; }; int B::f() { std::cout << "B::f() called" << std::endl; return 10; } int main() { A * a; B * b = new B(); a = b; std::cout << a->f() << std::endl; std::cout << b->g(10) << std::endl; }
Operator Overloading
Imagine you are defining a matrix class to represent matrices in a mathematical library. In C you might do that as follows:
struct matrix { unsigned int cols; unsigned int rows; float * values; };
You might then define and then use an addition operation as follows:
void matrix_add(struct matrix * A, const struct matrix * B) { /* computes A += B */ if (A->cols != B->cols || A->rows != B->rows) return; /* error */ int * a = A->data; const int * a = B->data; for (unsigned int i = 0; i < A->rows; ++i) for (unsigned int j = 0; j < A->cols; ++j) *a++ += *b++; } int main() { struct matrix A; struct matrix B; /* ... */ matrix_add(&A, &B); }
This solution would be pretty much identical in Java. However, in C++
you can more naturally define an addition operation by defining the
+=
operator. You can do that on the same exact struct matrix
defined above:
matrix & operator += (matrix & A, const matrix & B) { /* computes A += B and returns A */ if (A.cols != B.cols || A.rows != B.rows) return A; int * a = A.data; const int * a = B.data; for (unsigned int i = 0; i < A.rows; ++i) for (unsigned int j = 0; j < A.cols; ++j) *a++ += *b++; return A; } int main() { struct matrix A; struct matrix B; /* ... */ A += B; }
Container Library
One of the great advantages of C++ over C is the utility of the container library. It is fair to say that the C standard library is quite limited in that it lacks even basic data structures and algorithms. On the contraty, the standard library of C++ is very rich and also versatile.
Let’s look at some basic library features. The library covers strings.
#include <string> #include <iostream> int main() { std::string s; while (std::cin >> s) { std::string::size_type pos = s.find("ciao"); if (pos == std::string::npos) { std::cout << s << " does not contains the string `ciao'." << std::endl; } else { std::cout << s << " contains the string `ciao' at position " << pos << std::endl; } } }
As it turns out, strings are a lot like a container of the container
library of the C++ standard library. So, let’s look at perhaps the
most basic container, namely the vector
:
#include <vector> #include <iostream> int main() { std::vector<int> A; int x; while (std::cin >> x) A.push_back(x); std::cout << "I just read " << A.size() << " numbers:" << std::endl; for (std::vector<int>::iterator itr = A.begin(); itr != A.end(); ++i) std::cout << *itr << std::endl; }
The vector
class defines a generic vector. Generic here means
with respect to the type of the elements in the vector, which in
this example is int
. Thus vector
is a template class, just
like practically all container classes of the C++ Standard library.
We now look a the idea of a template container a bit more closely.
A Template Library
How would you implement a container library in C? For example, how would you implement a list in C? You should first ask a fundamental question: a list of what? Let’s say we start with a list of integers. You would then write something like the following:
/* a single-link list of integers */ struct list_int { int value; struct list_int * next; }; struct list_int * list_int_add(struct list_int * l, int v) { struct list_int * res = malloc(sizeof(struct list_int)); if (res) { res->value = v; res->next = l; } return res; } int list_int_find(struct list_int * l, int v) { for (; l; l = l ->next) if (l->value == v) return 1; return 0; }
Now, imagine you need another list, say, a list of double
.
What would you do? You would probably write the same exact code,
except that you would change the value type to double
:
/* a single-link list of doubles */ struct list_double { double value; struct list_double * next; }; struct list_double * list_double_add(struct list_double * l, double v) { struct list_double * res = malloc(sizeof(struct list_double)); if (res) { res->value = v; res->next = l; } return res; } int list_double_find(struct list_double * l, double v) { for (; l; l = l ->next) if (l->value == v) return 1; return 0; }
An alternative design would be to have a list of pointers to objects
of any type. You can do that with a list of pointers to void
.
/* a single-link list of doubles */ struct list_generic { void * value; struct list_generic * next; }; struct list_generic * list_generic_add(struct list_generic * l, void * v) { struct list_generic * res = malloc(sizeof(struct list_generic)); if (res) { res->value = v; res->next = l; } return res; }
This works. However, notice that each element in the list requires
two separate allocations: one for the structural element of the list
(struct list_generic
), and one for the element object itself, which
in the example above is delegated and therefore owned by the user.
Notice also that looking for an element is not as immediate either. Copying the same code as above would result in a comparison between pointers, which is typically not semantically meaningful:
int list_generic_find(struct list_generic * l, void * v) { for (; l; l = l ->next) if (l->value == v) return 1; return 0; }
Instead, we would have to also define a comparison function. We could do it like so:
int list_generic_find(struct list_generic * l, void * v, int (*compare)(void *, void *)) { for (; l; l = l ->next) if (compare(l->value, v)) return 1; return 0; }
Fortunately, C++ also allows to define generic classes that can be then specialized, or instantiated for a specific type. So, in C++ you could write something like this:
/* a single-link list of doubles */ template <typename T> struct list { T value; struct list * next; }; template <typename T> struct list<T> * list_add(struct list<T> * l, const T & v) { struct list<T> * res = malloc(sizeof(struct list<T>)); if (res) { res->value = v; res->next = l; } return res; } template <typename T> int list_find(struct list<T> * l, const T & v) { for (; l; l = l ->next) if (l->value == v) return 1; return 0; }
Or better yet, we can define everything within a template class:
/* a single-link list of doubles */ template <typename T> class list { private: struct list_elem { T value; list_elem * next; list_elem(const T & v, list_elem * n) : value(v), next(n) {}; }; list_elem * first; public: list() : first(nullptr) {}; void list_add(const T & v) { first = new list_elem(v, first); } bool list_find(const T & v) { for (const list_elem * i = first; i != nullptr; i = i->next) if (i->value == v) return true; return false; } };
Notice that this code requires that the template type parameter
(T
) support an equality comparison operator
(==
). This is quite natural in C++, and of course in any
case we can also support alternative designs with ad-hoc comparator
functions or objects. In fact, all of this is already implemented in
the C++ container library.
Another fundamental concept in the container library is the concept of iterator. An iterator in a linear data structure is an abstraction of a pointer in an array in C. One way to iterate over the elements of an array in C is as follows:
int A[100]; int print_array() { for (const int * i = A; i != A+100; ++i) printf("%d\n", *i); } int print_range(const int * begin, const int * end) { for (; begin != end; ++begin) printf("%d\n", *begin); } int main() { /* ... */ print_range(A, A + 50); print_range(A + 50, A + 100); }
The iterations in print_array
and print_range
use two pointers: begin
is a pointer to the first element of
the iteration, and end
is the first element after the
last element of the iteration. Thus the iteration starts with an
“iterator” pointer set at begin
and increments that pointer
until it is equal to end
. The body of the iteration then
accesses each object by dereferencing the pointer.
The same pattern is pervasive in the use of the C++ standard library. For example:
#include <vector> std::vector <int> A(100); int print_array() { for (std::vector<int>::const_iterator i = A.begin(); i != A.end(); ++i) printf("%d\n", *i); } int print_range(std::vector<int>::const_iterator begin, std::vector<int>::const_iterator end) { for (; begin != end; ++begin) printf("%d\n", *begin); } int main() { /* ... */ print_range(A.begin(), A.begin() + 50); print_range(A.begin() + 50, A.end()); }
Notice that in the example above we use iterators exactly like
pointers also in the same form of pointer arithmetic. That is, for a
pointer to an integer p
, we can write p+10
to point to the tenth
integer in the array of integers pointed to by p
. Similarly, using
the vector
class of the standard C++ library, we can take an
iterator to the beginning of the vector, we can increment an iterator,
and we can de-reference an iterator to obtain the element. We can
also move an iterator by a number of positions using classic pointer
arithmetic. However, this latter operation is not available for all
data structures. For example, we can not use it in a linked list.
Still, in essence, the concept of iterator is identical also for a
list:
#include <list> std::list <int> L; int print_list() { for (std::list<int>::const_iterator i = L.begin(); i != L.end(); ++i) printf("%d\n", *i); } int print_range(std::list<int>::const_iterator begin, std::list<int>::const_iterator end) { for (; begin != end; ++begin) printf("%d\n", *begin); }
Templates
A template is code that is parameterized by a type. Or by two types, or by a value, or by a value and a type, or… yeah, we could go on forever. But let’s keep things simple for now. Here the term “code” refers to class and function definitions. For example, a function template defines a family of functions, each specialized for a specific type. The following code defines a family of functions that returns the size of an array of elements of a given type.
template <typename T> unsigned int array_size(unsigned int n) { return n * sizeof(T); }
Notice that this definition per-se is what defines the family of functions, but does not in itself define any specific function within that family. The specific functions in the family are defined, implicitly or explicitly, by instantiating the template. For example:
std::cout << array_size<char>(100) << endl;
The compiler can also deduce the type parameter directly from an instantiation. For example:
#include <fstream> #include <chrono> template <typename T> void log_message(const T & x) { std::ofstream log_stream("LOG.TXT", std::app); log_stream << std::chrono::system_clock::now() << ": " << x << std::endl; }