WIP: good_size: optimise by cost heuristic
As per your suggestion in !17 (merged), this chooses
good_size by minimising the cost heuristic length*sum(prime factors).
In my benchmarks this implementation is still faster than the original
good_size. However, in testing I've found that the FFTs at the resulting sizes aren't always faster. Maybe we could improve the heuristic further?
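
For discussion, here is a minimal Python sketch of the heuristic as described above. It is not the code in this MR. It assumes that candidates are restricted to 11-smooth lengths (factors 2, 3, 5, 7, 11), that prime factors are summed with multiplicity, and that the search can stop at the next power of two; the function names are hypothetical.

```python
# Illustrative sketch only; names and search bounds are assumptions,
# not the actual implementation in this merge request.

SMOOTH_PRIMES = (2, 3, 5, 7, 11)  # assumed set of "cheap" radices


def prime_factor_sum(n):
    """Sum of prime factors of n with multiplicity, or None if n has a factor > 11."""
    total = 0
    for p in SMOOTH_PRIMES:
        while n % p == 0:
            n //= p
            total += p
    return total if n == 1 else None


def cost(n):
    """Cost heuristic: length * sum(prime factors). None for non-smooth lengths."""
    s = prime_factor_sum(n)
    return None if s is None else n * s


def good_size_by_cost(n):
    """Return the smooth length >= n that minimises the cost heuristic.

    Assumption: the search is bounded by the next power of two >= n,
    which is always a valid candidate.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    upper = 1
    while upper < n:
        upper *= 2
    best, best_cost = upper, cost(upper)
    for m in range(n, upper):
        c = cost(m)
        if c is not None and c < best_cost:
            best, best_cost = m, c
    return best
```

Under these assumptions, `good_size_by_cost(13)` returns 15 (cost 15 * (3 + 5) = 120) rather than 14 (cost 126) or 16 (cost 128). Whether a length-15 transform is actually faster than a length-14 or length-16 one is exactly the kind of case the heuristic may get wrong.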