Ruby: Arrays and Hashes

In this tutorial we're going to talk about two closely related data types that we've avoided up until now: arrays and hashes.

Arrays are essentially just lists of items. I usually think of them in a tabular sense, like a table with one (or two) row(s) but many columns. The reason I say one or two rows is that each value in the list has an associated integer called the array index. This index counts up from 0 for every new list item, so the first item in the list has an index of 0, the second has an index of 1, and the third has an index of 2. This is important because when we want to get or set the value of a specific list element, we reference it by its index number.

With this in mind, we can create arrays in Ruby in a number of ways. The first is to use square brackets. If you type a variable name, then the equals operator, then a list of comma-separated items contained in square brackets, an array will be created with these items! So something like the following:

salutations = ['hi', 'hello', 'howdy']

In the above case, the items occupy indices 0, 1, and 2. Empty arrays can be created in the same way by leaving the square brackets empty:

data = []

An equivalent way to create an empty array is to use the new method on the Array class (which is of course the class of all arrays, as can be seen by running data.class after creating the array shown in the above snippet). So the following would create the same array shown in the snippet above (if you don't believe me, try outputting them or checking the return values in IRB):

data = Array.new
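If you'd rather check that claim in a script than in IRB, a minimal sketch along these lines should do it. The variable names literal and constructed are just illustrative and not part of the tutorial's running example:

```ruby
literal = []              # empty array via square brackets
constructed = Array.new   # empty array via the new method on the Array class

puts literal.class            # Array
puts constructed.class        # Array
puts literal == constructed   # true, both are empty arrays
puts constructed.empty?       # true
```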

The sheer fact that we can create empty arrays would tend to indicate that we can add items to an array. This can be accomplished in numerous ways, one of which is simply using the equals operator on the array element at the next index. So if we wanted to add one to the "salutations" array, we would set the value of the element with the index of 3. This obviously raises the question of how we target array elements, and the answer is really quite simple. Array elements are targeted by writing the name of the array followed by the index number in square brackets. So 'hello' in our 'salutations' array would be referenced by salutations[1]. We can use the equals operator on an array element either to change the value of that element (e.g. replacing 'hello' with something like salutations[1] = 'sup?'), or to create a new element. So in the case of the 'salutations' array, in which we decided index 3 would be the next element, we could add another element with something like the following:

salutations[3] = 'alright?'
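Putting the reading, changing, and adding together, a short sketch might look like this. The final line, which reads index 10, is an extra illustration and not something the text above relies on:

```ruby
salutations = ['hi', 'hello', 'howdy']

puts salutations[1]           # hello

salutations[1] = 'sup?'       # change the value of an existing element
salutations[3] = 'alright?'   # index 3 is the next free index, so this adds an element

p salutations                 # ["hi", "sup?", "howdy", "alright?"]
p salutations[10]             # nil, reading past the end does not raise an error
```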

Array elements can also be targeted via a few helpful array methods. salutations.first will reference the first element in the array ('hi' in this case), and salutations.last will reference the last element in the array (which should be 'alright?' if we added an element with index 3). While we're mentioning a few methods, the number of elements an array has can be found by using the length method. So with the extra element added, salutations.length should return 4.

Another way of adding elements to arrays is to use the << operator with the array name on the left side and the new element's value on the right side. So if we wanted to add 'alright?' to our array as we did using the equals operator in the snippet above, we could alternatively write the following:

salutations << 'alright?'

The << method is often favoured for adding elements as it does not require an index number (although neither does salutations[salutations.size] = 'alright?' - the former is just a bit simpler/nicer), and the = method is usually used for changing the value of elements.
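As a rough sketch of the methods mentioned so far, starting again from the original three-element array (the extra 'yo' value is made up for the example):

```ruby
salutations = ['hi', 'hello', 'howdy']

salutations << 'alright?'              # add an element without naming an index

puts salutations.first                 # hi
puts salutations.last                  # alright?
puts salutations.length                # 4

salutations[salutations.size] = 'yo'   # same effect as <<, just wordier

p salutations                          # ["hi", "hello", "howdy", "alright?", "yo"]
```

Note that size and length are two names for the same thing on arrays, so either can be used here.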

The point of arrays should be fairly obvious. Common basic examples include the scores of a class in a test, shopping lists, and lists of random words. In cases like these it may be useful to be able to create an array from a space-separated list, and this can be done when the array is initialised by using the shorthand %w notation. Something like the following would create an array with an element for each word:

shopping_list = %w(milk cheese butter bread DVDs)
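To see what %w actually produces, here is a small sketch comparing it with the longhand version. Every element comes out as a string:

```ruby
shopping_list = %w(milk cheese butter bread DVDs)

p shopping_list          # ["milk", "cheese", "butter", "bread", "DVDs"]
puts shopping_list.length   # 5

# The longhand equivalent using square brackets and quotes
longhand = ['milk', 'cheese', 'butter', 'bread', 'DVDs']
puts shopping_list == longhand   # true
```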

Moving on a little bit from basic lists of data, hashes (objects of the Hash class) are basically arrays but with arbitrary identifiers (called keys) rather than integer indices. Much like arrays, there are two main ways of creating a hash. The first behaves much like the square bracket initialisation of an array, but with curly brackets instead. Inside the curly brackets each key/value pair is separated by a comma, and each pair is written as the key, followed by the => operator, followed by the value. We talked about symbols in the tutorial about basic variable types, and it's very common for symbols to be used as keys in hashes, as the keys are commonly short strings which are re-used throughout an application. Remembering that symbols are denoted using a colon, a basic hash which contains some data (arbitrary in this case) could be created using something like the following:

colours = { :logo => 'white', :banner => 'blue', :contrast => 'orange' }

In a similar way to arrays, a new hash can be created by using empty curly brackets:

data = {}

Once the hash has been created, its elements are referenced by writing the key in square brackets (just like an array reference). Using the 'colours' hash created above, 'blue' would be returned from colours[:banner]. Key/value pairs can then be modified or created by using the equals operator on a targeted element. The value for colours[:logo] could be changed using something as simple as colours[:logo] = 'green', and a new element could be added by simply using a key that hasn't already been used in the hash. Something like the following would work fine:

colours[:footer] = 'blue'
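A quick sketch of reading, changing, and adding hash elements in one go. The :sidebar key is made up for the example, and the order of the keys shown assumes Ruby 1.9 or later, where hashes remember insertion order:

```ruby
colours = { :logo => 'white', :banner => 'blue', :contrast => 'orange' }

puts colours[:banner]       # blue

colours[:logo] = 'green'    # change the value for an existing key
colours[:footer] = 'blue'   # a key that isn't in the hash yet, so a new pair is added

p colours.keys              # [:logo, :banner, :contrast, :footer]
puts colours.length         # 4
p colours[:sidebar]         # nil, as this key has never been set
```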

As can be done with arrays (and a whole bunch of other classes, actually), a new hash can also be created by using the new method on the class (data = Hash.new), and creating a hash this way has an extra advantage. If a value is given to the new method (either after a space, as in Hash.new 0, or in brackets, as in Hash.new(0)), it becomes the hash's default value: the value returned when you look up a key that hasn't been set. Note that the lookup only returns the default; it doesn't add the key to the hash. The default can also be changed later by assigning to the hash's default= method (for example, data.default = 0). If no value is given, the default is nil, but sometimes something else is more useful. If we were to use the "scores in a class" example for a hash which stored students' names alongside their scores, it might be useful to set the default value to 0, as students who were not assigned a score were likely ill for the test and thus scored 0. This could be accomplished as shown below:

scores = Hash.new(0)

The use of this can be seen in the commented code below:

scores = Hash.new(0) #The hash to hold our scores

scores[:dave] = 55 #Set Dave's score
scores[:chris] = 66 #Set Chris' score

puts "Dave scored #{ scores[:dave] }" #Output Dave's score
puts "Chris scored #{ scores[:chris] }" #Output Chris' score
puts "Lea scored #{ scores[:lea] }" #Output Lea's score (which we haven't set!)
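Running the code above should print 55 for Dave, 66 for Chris, and 0 for Lea. One detail worth seeing for yourself is that looking up a missing key only hands back the default and doesn't store anything in the hash. A minimal sketch, where the -1 default is just an arbitrary value for illustration:

```ruby
scores = Hash.new(0)   # missing keys will return 0

scores[:dave] = 55

puts scores[:lea]      # 0, the default value
p scores.key?(:lea)    # false, the lookup did not create the key
p scores.keys          # [:dave]

scores.default = -1    # change the default after the hash was created
puts scores[:lea]      # -1
puts scores[:dave]     # 55, keys that were set are unaffected
```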