Ruby Stripped, Part 3: Arrays

1. Introduction

It can’t be said often enough: everything in Ruby is an object! No, this isn’t déjà vu. When it comes to arrays in Ruby, this is more true than ever. But let us start with a look at the arrays Flowstone provides. There are float, integer and string arrays. A float array treats every value as a float, an integer array as an integer, and a string array, of course, as a string.
That’s actually the nature of an array: it is a collection of values of the same type. In Ruby this is also true. A Ruby array is a collection of the type object, and that’s what makes Ruby’s arrays so versatile. You can have a string, a float and an integer in just one array, because all of those are actually objects. You can build your own class and store its instances in an array, because they are objects too. The price is more responsibility for the developer: Ruby won’t check what kind of object you put into an array.

2. Declaring Arrays

But let’s start at the beginning. The easiest way to tell Ruby that you want something to be an array is by enclosing it in square brackets:

#this
a = []
#is equivalent to
a = Array.new

#this
a = [1, 2]
#is equivalent to
a = 1, 2
#and equivalent to
a = Array.new(2){|index| index + 1}

The class method new can also create a copy of an array:

a = [1, 2]
b = Array.new(a)
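
A small sketch of what this copy is, with values and variable names chosen only for illustration: Array.new(a) builds a new array object, but that new array holds the very same elements as a.

```ruby
a = ["x", "y"]
b = Array.new(a)

same_content = (b == a)           # true: both arrays hold equal elements
same_array   = b.equal?(a)        # false: b is a separate array object
shared_item  = b[0].equal?(a[0])  # true: the elements themselves are shared

b << "z"                          # appending to b leaves a untouched
```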

3. Object Reference

When you create an array, its contents are referenced rather than copied. This is important because it can lead to behaviour you might not expect.

unexpected changes in an array
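
The screenshot is not reproduced here. A minimal sketch of the kind of code it illustrates, with values assumed for illustration:

```ruby
a = [1, 2]
b = [3, 4]
c = [a, b]        # c stores references to a and b, not copies

a[0] = 5          # change the contents of a ...
b << 6            # ... and of b, after c was built

c                 # => [[5, 2], [3, 4, 6]]
c[0].equal?(a)    # => true, c[0] and a are the very same object
```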

In this case, the contents of a and b are changed after they were assigned to the array c. Still, array c shows the new content. This is because a and b just point to the numbers in memory, and c just points to a and b. This can be solved with one of two methods that every object has: clone or dup. To avoid further issues, I advise using clone unless you know exactly what dup does. clone creates a copy of the object it is called on. Note that this is a shallow copy: objects nested inside the copied object are still shared.

A clone prevents changes in the array
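
The screenshot is not reproduced here. A minimal sketch under the same assumed values, cloning only a so that the difference to b stays visible:

```ruby
a = [1, 2]
b = [3, 4]
c = [a.clone, b]  # c[0] is a copy of a; c[1] is still b itself

a[0] = 5          # only changes the original a
b[0] = 7          # still visible through c, because b was not cloned

c                 # => [[1, 2], [7, 4]]
```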

As you can see, the changes to a are not reflected in c anymore. c now points to a copy of a and not a itself. Keep that in mind whenever you build arrays from sources that you don’t want to change directly. On the other hand, there are many occasions where the referencing behaviour is wanted. Here’s an example where an array gets sorted and then stays that way, no matter how we change the contents.

A one-time sort
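
The screenshot is not reproduced here, so the following is only a guess at the idea, with assumed values: the sorted array keeps references to the original objects, and its order is decided once, at the time of the sort.

```ruby
a = [2]
b = [3]
c = [1]
sorted = [a, b, c].sort   # => [[1], [2], [3]], which is [c, a, b]

c[0] = 99                 # change the contents after sorting

sorted                    # => [[99], [2], [3]], the order is not re-evaluated
```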

4. Enumerator

Arrays work hand in hand with enumerators. Enumerator is a class whose objects allow iteration. You can use an enumerator on its own, as shown below.

An enumerator used to iterate
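
The screenshot is not reproduced here. A minimal sketch of a standalone enumerator, assuming a powers-of-two generator; the names powers and yielder are chosen for illustration:

```ruby
# An enumerator that could go on forever: 1, 2, 4, 8, ...
powers = Enumerator.new do |yielder|
  n = 1
  loop do
    yielder << n   # hand one value to whoever is iterating
    n *= 2
  end
end

first_three = [powers.next, powers.next, powers.next]   # => [1, 2, 4]
```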

You define how to iterate by using the provided methods. In the last image we used .next, which doesn’t need any arguments and simply returns the result of the next iteration. If there isn’t any, StopIteration is raised. Here are two other methods of the enumerator class.

Using provided methods of the enumerator class
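
The screenshot is not reproduced here. A sketch of the two calls, assuming the same hypothetical powers-of-two enumerator and the string names s1 and s2:

```ruby
powers = Enumerator.new do |yielder|
  n = 1
  loop do
    yielder << n
    n *= 2
  end
end

s1 = ""
powers.each do |n|
  s1 += n.to_s + ","
  break if n >= 16384          # leave the otherwise infinite iteration
end

s2 = ""
powers.each_with_index do |n, i|
  s2 += "2**#{i}=#{n},"        # value first, index second
  break if n >= 16384
end
```

s1 then starts with "1,2,4,8," and s2 with "2**0=1,2**1=2,".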

The first one, .each, simply iterates by yielding to the block, using whatever behaviour we defined (in this case computing the powers of two). To avoid infinite iteration, I added a break from the loop once 16384 is reached. The second one, .each_with_index, iterates by yielding to the block as well, passing each result together with its index.
Both of them are passed a block. In this block you set up a variable to receive the yielded result. This “declaration” is done with a piped list of variables (“|n|”). After that follow the instructions for what to do with the current yielded result. In this case n is converted to a string, a comma is appended, and both are added to the string s1.
.each_with_index has a fixed order, shown by the method’s name: it yields the result first and then its index. So our pipe reads |n, i| for number and index. We then use i to generate the power of two string (“2**i”) and n for the string representation of the yielded result. Straightforward.
But you can also call these methods without passing a block. You will then just get an enumerator object. This behaviour is useful for chaining enumerators. Here is an example that uses methods from the enumerator class, but chains them.

Chaining enumerators
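
The screenshot is not reproduced here. A sketch of such a chain, where the source array, the formatting strings in a and the result array lines are assumptions made for illustration:

```ruby
a = ["%d. ", "value %d"]   # formatting strings, passed along as the object
lines = []

[10, 20, 30].each.with_index(1).with_object(a) do |(n, i), o|
  # n: element, i: index starting at 1, o: the object given to with_object
  lines << (o[0] % i) + (o[1] % n)
end

lines   # => ["1. value 10", "2. value 20", "3. value 30"]
```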

Now .each returns an enumerator that is enumerated with .with_index(1) (again returning an enumerator) and finally enumerated with .with_object(a), which is passed a block (do-end is just another form of a block that can span several lines). The block receives the first two results as an array, “declared” as (n, i), and the third result as o. After that, those variables are used just as you would use them anywhere else in code.
The with_object method just passes the given object to each iteration. It doesn’t need to be a statically defined object as in this example. Remember part 2 of this series: everything is an object. For example, passing the result of another block or another enumerator is just as valid. Here, however, we pass an array that contains all our formatting strings.
with_index(1) means that we want the index to be offset by 1, returning 1 instead of 0, 2 instead of 1, etc.

You can also convert arrays into enumerators:

a = ["Yo", 1, [3,4]]
e = a.to_enum
e.next #returns "Yo" (class String)
e.next #returns 1 (class Fixnum; Integer since Ruby 2.4)
e.next #returns [3,4] (class Array)
e.next #raises StopIteration
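
If you would rather not handle StopIteration yourself, one option is Kernel#loop, which rescues it and simply ends. A short sketch; the name collected is chosen for illustration:

```ruby
a = ["Yo", 1, [3, 4]]
e = a.to_enum

collected = []
loop do
  collected << e.next   # raises StopIteration after the last element
end                     # loop rescues StopIteration and ends quietly

e.rewind                # start over from the first element
first_again = e.next    # => "Yo"
```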

5. Iterating over Arrays

After the last chapter you should now understand that iterating over an array works the same way as iterating with an enumerator: called without a block, the iteration methods of an array return an enumerator. Everything explained in the last chapter is therefore valid for arrays as well. I won’t list every method that works this way. Here are just a few names: each, map, collect, delete_if, keep_if, select and many more.

There was an explicit request to explain some of the methods. That’s also why I explained the enumerator class first. Without that knowledge it is not so easy to understand them.

array.each            iterates over the whole array, yielding each element to the block in order. [1, 2, 3].each yields 1 first, then 2 and finally 3

array.reverse_each    iterates over the whole array, but in reverse order. [1, 2, 3].reverse_each yields 3 first, then 2 and finally 1

array.each_index      iterates over the whole array, yielding the index of each element instead of the element itself. [1, 2, 3].each_index yields 0 first, then 1 and finally 2

array.map             iterates over the whole array, using the result of the block per iteration to build a new array. [1, 2, 3].map {|n| n*2} returns a new array with the content [2, 4, 6]

array.collect         same as .map. You can also use the destructive versions .map! or .collect!, which change the content of the source array instead of returning a new array.

array.cycle           iterates over the whole array as many times as stated. cycle without a number means infinite. cycle(3) iterates 3 times over the array, etc. a = 0; [1, 2, 3].cycle(2) {|n| a += n}; a returns 12 (0 + 1 = 1, 1 + 2 = 3, 3 + 3 = 6, 6 + 1 = 7, 7 + 2 = 9, 9 + 3 = 12)

array.index           iterates over the array until the block returns a true value and returns the index of that element. Useful to find a specific index. [1, 2, 3, 4, 2].index {|obj| obj > 2} returns 2 (we find the number 3 at index 2). The counterpart of this method is .rindex, which iterates in reverse, starting from the end of the array. [1, 2, 3, 4, 2].rindex {|obj| obj > 2} returns 3 (we find the number 4 at index 3).
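
The methods above, collected into one runnable sketch. The helper variable names are chosen for illustration; the values are the ones from the list.

```ruby
numbers = [1, 2, 3]

forward = []
numbers.each { |n| forward << n }            # forward  => [1, 2, 3]

backward = []
numbers.reverse_each { |n| backward << n }   # backward => [3, 2, 1]

indices = []
numbers.each_index { |i| indices << i }      # indices  => [0, 1, 2]

doubled = numbers.map { |n| n * 2 }          # doubled  => [2, 4, 6], numbers is unchanged

copy = numbers.clone
copy.map! { |n| n * 2 }                      # copy itself is now [2, 4, 6]

total = 0
numbers.cycle(2) { |n| total += n }          # total    => 12

first_hit = [1, 2, 3, 4, 2].index  { |obj| obj > 2 }   # => 2
last_hit  = [1, 2, 3, 4, 2].rindex { |obj| obj > 2 }   # => 3
```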

Be careful when using loops of any kind. In Ruby, infinite loops are just fine (and don’t raise an error), while Flowstone stops execution if it thinks the Ruby instance computes for too long without giving control back. Once stopped, you can only save and exit Flowstone, because Ruby is only reset by restarting Flowstone. Remember that a Ruby instance is interactive: the whole code is executed after each new keystroke. As soon as you have typed .cycle, for example, Ruby will execute an infinite loop, although you might have wanted to add (2) or something. It is better to turn off the instance while typing code for loops.

6. Versatility

Ruby’s arrays are the most comfortable I have ever worked with. It’s their versatility that makes them so special. If I were asked to name just one killer feature of Ruby, I’d probably say arrays (but almost head-to-head with blocks). The methods added to the Array class are just great and meaningful, and they take away a lot of the usual code editing.

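#Try these (flatten is used instead of flatten!, which returns nil when nothing was nested):
a = [1, 2, 3].push(4, 5, 6).push("end") #returns [1, 2, 3, 4, 5, 6, "end"]
a = [1, 2, 3].zip(["one", "two", "three"]).flatten #returns [1, "one", 2, "two", 3, "three"]
a = [1, 2, 3] + ["one", "two", "three"] #returns [1, 2, 3, "one", "two", "three"]
a = [1, 2, 3] - [2] #returns [1, 3]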

Also, you are not bound to the whole array. There are plenty of ways to access parts of it.

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

#explicit index
a[1] #returns 2
a[-1] #returns 9

#starting index, length (equivalent to method slice)
a[1, 3] #returns [2, 3, 4]
a[-3, 3] #returns [7, 8, 9]

#range
a[2..4] #returns [3, 4, 5]
a[2...4] #returns [3, 4]
a[1..-2] #returns [2, 3, 4, 5, 6, 7, 8]
a[-6..-4] #returns [4, 5, 6]

#methods take, first and last
a.take(3) #returns [1, 2, 3]
a.first(3) #returns [1, 2, 3]
a.last(3) #returns [7, 8, 9]

#conditional take
a.take_while {|o| o % 4 != 0} #returns until block is false => [1, 2, 3]

#picking
a.values_at(1..3, 5, 8) # returns [2, 3, 4, 6, 9]

I’m sure I didn’t cover everything related to arrays. Feel free to comment, and I’ll try to answer!

tulamide

About the author:

All kinds of script languages, Python, Ruby, Construct, Construct 2. Writing (if you understand German, have a look at 'Wortheim' via http://tulamide.tumblr.com/ ), electronic music, UI design. Too much for one life ;)

5 Comments

  1. kohugaly - April 11, 2015 - 3:11 pm

    Thanks tulamide! It is very much appreciated! I was always puzzled about what the enumerator is and how it’s related to arrays. Using “.each”-type methods makes so much more sense now!
    However, I have one more question:
    How do I make a hard copy of a nested array, including all subarrays? When I simply use .clone or .dup in such a situation, the individual subarrays still contain references to the original objects. Example:
    a=1,2
    b=3,4

    array1=[a,b]
    array2=array1.clone
    a[0]=5
    watch array2 #shows [[5,2],[3,4]]
    #I expect array2 to have [[1,2],[3,4]]

    The only way I’ve found to produce such an effect is “Marshal.load(Marshal.dump(array))”, which looks impractical syntax-wise. Is there a shortcut for that, or will I have to add a new method to the Array class?

    • tulamide - April 11, 2015 - 6:29 pm

      Well, it’s all about what you will change afterwards. It is of no use to clone array1, because the clone is a 1:1 copy of array1, which points to the arrays a and b. array1.clone therefore also just points to a and b. So, array1 = a.clone, b.clone would have created the independent copies you expected.

      Marshaling is used to serialize data, for example to save them to a file and restore them later. With .dump the data (with all dependencies if you don’t use ‘limit’) is converted to a binary stream. And .load converts the stream back to structures. Just cloning should be quicker (no conversion done). Also, according to ruby-doc.org, you can’t marshal everything: “if the objects to be dumped include bindings, procedure or method objects, instances of class IO, or singleton objects, a TypeError will be raised.” Using marshal therefore is only safe if you know for sure that the array will never contain objects of those kinds.

      But for a final answer to your question it would be good to have an explicit example where you need that independent-copy behaviour. There might be a better or simpler way, or such copies might not even be needed.
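
A minimal sketch of the three options discussed in this exchange, using the values from the question. The names shallow, per_item and deep are chosen for illustration.

```ruby
a = [1, 2]
b = [3, 4]
array1 = [a, b]

shallow  = array1.clone                          # new outer array, same inner arrays
per_item = array1.map { |inner| inner.clone }    # clones one level deeper
deep     = Marshal.load(Marshal.dump(array1))    # full copy via serialization

a[0] = 5

shallow    # => [[5, 2], [3, 4]]
per_item   # => [[1, 2], [3, 4]]
deep       # => [[1, 2], [3, 4]]
```

Cloning per element only helps for one level of nesting; deeper structures need either recursion or the Marshal round trip, with the restrictions quoted above.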

  2. kohugaly - April 12, 2015 - 9:04 pm

    Here’s a real example of an issue I’ve run into:

    In the DSPcode2 compiler the code is converted to a syntactic tree, which is basically an array where the first element is the name of an operation and the other elements are the arguments and parameters. Each of those parameters can be a syntactic tree itself.
    simple example: (a+b*11) -> ["add", ["variable","a"], ["multiply",["variable","b"],["constant","11"]] ]

    Now let’s say I want to make a hard copy of the tree, modify it and then do various tests to compare the original and the modified copy (things like comparing the CPU load of branches, scheduling the branches in the most efficient order, replacing them or deleting them).

    • tulamide - April 12, 2015 - 11:09 pm

      Ah, now I see! A tree class recursively calling itself would be the best to achieve that. I’ll try to build an example tomorrow. It will make your life a lot easier, if I can realize it exactly as it is in my mind right now.

    • tulamide - April 16, 2015 - 11:03 am

      Check your pm on DSPr forums, please ;)
