The problem asks you to find the number with the longest Collatz chain. The Collatz sequence takes a number and applies the following based on it being even/odd:
if n.even?
n / 2
else
3 * n + 1
end
We’ll be testing the range 1 to 1 million
.
The brute force method
The first thing that may come to mind is looping through all the numbers in the range 1:100 million
. Then you do another loop on each number that determines its Collatz chain. Here’s what that look’s like in Ruby code:
def collatz_num_with_array
counter = 1
collatz_chain = [1]
until counter > 1_000_000
collatz_sequence = collatz_chains(counter)
if collatz_sequence.length > collatz_chain.length
collatz_chain = collatz_sequence
end
counter += 1
end
collatz_chain
end
def collatz_chains(num)
sequence = [num]
until num <= 1
num.even? ? num = num / 2 :
num = 3 * num + 1
sequence << num
end
sequence
end
This approach is very computationally intense. Not only are you redoing all the math operations for every number from 1 to 1 million
, but you are also creating new arrays for each loop. Though this is still an open problem, the upper limit on computations approach O(n^2)
. Looping through each array twice on average, sometimes much more!
A touch of elegance
One mistake worth noting in the solution above is that we are actually creating an array to save each sequence entry. If we look closer at the problem, this is not what it is asking for. We can quickly eliminate that array and just use a counter instead. Upon closer inspection there is a mathematical pattern that we can use to our benefit as well. Each time a number is odd, we apply (3*num + 1)
to it. This immediately tells us the result will be an even number. We can use that and cut out a step every time we get an odd result. Changes are in bold below.
def collatz_num_without_array
counter = 1
collatz_number = 1
collatz_chain_max = 0
until counter > 1_000_000
num_terms_in_chain = collatz_chains(counter)
if num_terms_in_chain >= collatz_chain_max
collatz_number = counter
collatz_chain_max = num_terms_in_chain
end
counter += 1
end
collatz_number
end
def collatz_chains(num)
sequence_steps = 1
until num <= 1
if num.even?
num = num / 2 :
else
num = (3 * num + 1) / 2
sequence_steps += 1
end
sequence_steps += 1
end
sequence_steps
end
As a side note, if you desire the sequence, just build a helper function that will determine the sequence for a specific number you give it. This is nice and all and will shave off a nice chunk of operations, but there has to be more because this is still very computationally complex.
Time to cache in
Since we are scaling numbers from small -> large
, there is a very good chance that a number may repeat itself inside the collatz_chains
method. Let’s draw up an example in our minds. Numbers 4 & 8
should give us a good visual hook to wrap our minds around.
collatz_chains(4) = collatz_chains(2) + 1 sequence_steps
collatz_chains(2) = 1 sequence_steps
collatz_chains(4) = 1 sequence_steps + 1 sequence_steps
= 2 sequence_steps
collatz_chains(8) = collatz_chains(4) + 1 sequence_steps
= 2 sequence_steps + 1 sequence_steps
= 3 sequence_steps
Starting at 1
and increasing our collatz_chains
calculations by 1
each step will give us a key advantage to cut down on the number of operations. We have already calculated the number of steps for 2
in the example above. Let’s reuse that info like we did in the short example. We will cache the number of sequence_steps
for each number so that when we get that number again, we can just add in the steps we’ve already calculated. Let’s see what this looks like in code. Changes are in bold below.
@num_terms_cache = {2 => 2}
...
def collatz_num_without_array
counter = 1
largest_collatz_num = 2
collatz_chain_max = 0
until counter > 1_000_000
num_terms_in_chain = 1
collatz_num = counter
until collatz_num < 2
num_terms_in_chain = collatz_chains(collatz_num)
end
@num_terms_cache[counter] = num_terms_in_chain
if num_terms_in_chain >= longest_collatz_iterations
collatz_chain_max = num_terms_in_chain
longest_collatz_num = counter
end
counter += 1
end
longest_collatz_num
end
def collatz_chains(num)
sequence_steps = 1
until num <= 1
if @num_terms_cache.include? num
sequence_steps += @num_terms_cache[num]
break
end
if num.even?
num = num / 2 :
else
num = (3 * num + 1) / 2
sequence_steps += 1
end
sequence_steps += 1
end
sequence_steps
end
Instead of tossing out the calculations you did in each iteration, we’ll go and save it into a hash. Each time that numbers comes up again we can break our hash and add in the number of sequence_steps the previous calculation had done. At a number like 800,000
, this will become a god send. The algorithm approaches O(n)
bounds now.
Key takeaway: Cacheing is our friend when we don’t want to redo all the operations from the ground up!