I spotted a challenge that was something like this online, and I adapted it. There was something that I wanted to understand a little bit better. The interesting thing here is about the data structures used, to increase efficiency.
Find Contiguous Squares
Imagine you have a grid of n
width and height, and it’s populated with random
numbers. It could be represented something like this:
[
[1, 1, 2, 2, 1, 1, 1, 2, 1, 2],
[1, 0, 0, 2, 1, 1, 2, 2, 0, 2],
[0, 1, 0, 1, 2, 1, 0, 1, 2, 2],
[2, 1, 2, 1, 0, 0, 0, 2, 1, 0],
[0, 2, 1, 2, 1, 1, 2, 1, 2, 0],
[1, 1, 1, 2, 1, 1, 0, 0, 2, 1],
[1, 2, 1, 2, 2, 2, 2, 1, 2, 1],
[2, 1, 1, 2, 2, 1, 2, 0, 1, 1],
[0, 1, 1, 2, 2, 0, 1, 0, 2, 2],
[2, 2, 2, 1, 0, 1, 1, 0, 2, 2]
]
In fact, that’s the exact grid that the image is of above.
Now, you need to write a program that accepts the grid and returns the greatest number of contiguous squares that have the same value. In this case, the function should return 11
. If you look for a moment you’ll find the cluster of 11 orange squares, it’s pretty trivial for the human brain to do.
But, let’s expand this to say, a million squares. The following program can do this on my computer in about 5 seconds.
class GridChecker
def initialize(grid)
@assessed = {}
@grid = grid
end
def run
greatest = 0
@grid.each_with_index do |row, ri|
row.each_with_index do |_col, ni|
coords = [ri, ni]
next if @assessed.key?(coords)
cluster_size = check_square(coords)
greatest = cluster_size if cluster_size > greatest
end
end
greatest
end
def check_square(coords)
@assessed[coords] = nil
find_cluster(coords)
end
def find_cluster(coords)
total = 1
adjascent_squares(coords).map do |adjascent_coords|
next if @assessed.key?(adjascent_coords)
@assessed[adjascent_coords] = nil
total += find_cluster(adjascent_coords)
end
total
end
def adjascent_squares(coords)
neighbor_coords(*coords).filter do |x|
within_grid(*x) &&
@grid[x[0]][x[1]] == @grid[coords[0]][coords[1]]
end
end
def within_grid(row, col)
row >= 0 &&
row < grid_height &&
col >= 0 &&
col < grid_width
end
def grid_width
@grid_width ||= @grid[0].length
end
def grid_height
@grid_height ||= @grid.length
end
def neighbor_coords(row, col)
[
[row - 1, col],
[row, col - 1],
[row + 1, col],
[row, col + 1]
]
end
end
The Twist
Here’s the thing that surprised me a little bit, and took some learning. I’ve always thought about arrays as being simpler data structures than hashes (or objects or libraries depending on your language), and so in my mind I had just taken for granted that using them was more efficient for many things. But this simple task very quickly demonstrats that this is not the case.
The key here is that in the @assessed
instance variable, I’ve used a hash
rather than an Array. It’s easy to use either in this case. It feels weird to be
assigning arbitrary nil
values that aren’t even used for anything. But the
thing is that the look up time on this ever growing variable is so much faster
when using a hash, and it doesn’t slow down the way it does with an array. Using
an array for the same thing means that you have to iterate over it repeatedly,
leading to a huge slow down as you add more data.
Something Else to Play with
Here’s a quick script to generate a grid of n dimensions to play with. It’s a quick way to get started with your own implementation, or to play with performance.
require 'json'
def gen_grid(n)
n.times.map do
n.times.map do
rand(3)
end
end
end
File.write('grid.json', gen_grid(1000).to_json)