C++: Variable Scope and 'this'

"Scope" is a concept which can be applied to many things in C++, and generally refers to the region of code in which something is accessible.

A variable is only accessible after the point in the code at which it has been declared. Variables defined inside a "block", which generally means a section of code between curly brackets, are said to have local scope: they are only accessible from inside the block in which they were declared. In the following example, the variable "x" can only be accessed from the main function:

void Function()
{
	//We can't access 'x' from here.
}

int main()
{
	int x;
	//We can access 'x' here!
	return 0;
}

So far, we've learnt about many kinds of "blocks". As alluded to earlier, a block is generally a section of code surrounded by curly brackets: the body of a function, while loop, for loop, if-statement, and so on. Nested blocks (blocks inside blocks) also have access to the local variables of their parent blocks. Take, for example, the following:

int main()
{
	int x;

	if(true)
	{
		//We can access 'x' here!
	}

	return 0;
}

Note that when an "if" or a loop only needs to control a single statement, we can leave out the curly brackets. The indentation is only there for readability; the compiler ignores it, and only the first statement after the "if" is controlled by it. Take, for example, the following:

int main()
{
	int x;
	cin >> x; //Input uses the extraction operator, '>>'

	if(x > 5)
		cout << x << endl;

	return 0;
}

In C++, we can actually create general-purpose blocks without any special keywords like "for" or "if", just by surrounding a section of code in curly brackets. This isn't needed very often, but in some cases it provides a good way to isolate "more local" variables:

int main()
{
	int x = 5;

	{ //A regular 'block'
		int x = 10;
		cout << x; //=> 10
	}

	return 0;
}

On running the above example you can see that when two variables share a name, the one in the innermost scope is chosen over those of a wider scope. In the example, the local variable hides the variable in the wider 'main' scope, and as such there is no easy way of accessing the 'x' in 'main' from inside the block. This is why naming in this fashion should be avoided if possible.

If a variable (or anything else for that matter) is declared outside of any blocks, it is accessible from anywhere in the code after its declaration. Variables declared this way are said to have global scope. As convenient as these may be, they are often considered bad practice, partly because any code can change them and partly because of naming conflicts. Take for example the following situation:

int x = 3;

int main()
{
	int x = 5;

	cout << x; //=> 5

	return 0;
}

Once again the local variable takes precedence over the global, and hence 5 is output. In this case, we could actually use the scope resolution operator (::) to target the variable in the global scope by simply not specifying anything on the left side of the operator:

int x = 3;

int main()
{
	int x = 5;

	cout << ::x; //=> 3

	return 0;
}

Generally it's considered bad practice to give multiple variables the same name in situations where scope could cause naming conflicts. With classes in particular, some people like to mark member variables with a special notation to avoid such conflicts. A popular convention, often associated with "Hungarian Notation", prefixes class member variables with "m_", for example:

class A
{
public:
	A(int age, string name)
	{
		m_age = age;
		m_name = name;
	}
	int m_age;
	string m_name;
};

Classes also offer another way around scope naming conflicts: there is a hidden pointer passed behind the scenes to every class member function (or at least those that aren't static, but we haven't learnt about that yet!). To see why this is useful, take the above code snippet without the naming convention:

class A
{
public:
	A(int age, string name)
	{
		age = age;
		name = name;
	}
	int age;
	string name;
};

It's clear what the programmer wants to do here: set the member variables to the values of the parameters. The problem is that inside the constructor the names "age" and "name" refer to the parameters, which hide the member variables, so each parameter is simply being assigned to itself! If we add an 'output' member function and call it after constructing an object with the two parameters, we can see that the member variables were never set: "name" is an empty string, and "age" is uninitialised, so it could show any value:

class A
{
public:
	A(int age, string name)
	{
		age = age;
		name = name;
	}
	void output()
	{
		cout << age << " " << name;
	}
	int age;
	string name;
};

int main()
{
	A joe(17, "Joe");
	joe.output();

	return 0;
}

The best way to solve this would be to change the parameter or member variable names. There is, however, another way to accomplish this, using the hidden pointer mentioned earlier. The hidden pointer is named "this", and points to the object that the member function is being called on. As such, we can dereference the pointer and use the dot operator to get the member variable of the object instead of the parameter with the same name. As we covered previously, the (*a).b syntax is identical to the a->b syntax, and as such using the "arrow" operator makes a lot of sense here. Although it's a bit messy, we could use the following:

class A
{
public:
	A(int age, string name)
	{
		this->age = age;
		this->name = name;
	}
	void output()
	{
		cout << age << " " << name;
	}
	int age;
	string name;
};

There are actually an awful lot of nice things that you can do with access to this hidden pointer. A nice idea is making "chain-able" member functions. If related functions of a class return a reference to the object itself (the dereference of the 'this' pointer), then member functions can be chained up in an object.A().B().C() syntax! Note that if we don't set the return type of the function to a reference to the object type, each call returns a copy of the object instead. The code still compiles, but every call after the first in the chain operates on a temporary copy, so the original object is only changed by the first call.

#include <iostream>
#include <string>

using namespace std;

class Number
{
public:
	int number;

	Number () { number = 0; }
	Number& add(int step) //"Number" object reference return type
	{
		number += step;
		return *this;
	}
	Number& minus(int step)
	{
		number -= step;
		return *this;
	}
};

int main()
{
	Number one;
	one.add(5).minus(2).add(7);
	cout << one.number;

	return 0;
}

Chaining member functions is pretty neat, although in this case there is a much cooler solution available if we overload some operators - but we haven't learnt about that yet.