Reducing plan size to O(sqrt(N))?
Currently, the size of a plan is roughly comparable to that of the input/output arrays (for 1D transforms); for Bluestein transforms it is even a few times larger. Given that we cache plans, this could be considered a bit wasteful.
It is possible to store a reduced set of roughly 2*sqrt(N) twiddle factors and to compute the work arrays (WA) from them quite efficiently (one if-statement, two complex loads, and one complex multiplication per entry) at the beginning of every pass. This will cause an overall slowdown (I expect less than 10 percent), but has two advantages:
- much smaller plan size
- only one kind of plan needed for real and complex FFTs
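To illustrate the idea, here is a minimal sketch of such a two-level twiddle table. It assumes the standard factorization exp(-2πik/N) = exp(-2πi(j·s)/N) · exp(-2πii/N) with s ≈ sqrt(N), k = j·s + i, so only two tables of about sqrt(N) entries each are stored; the names (`TwiddleTable`, `fine`, `coarse`) are made up for this example and are not pocketfft's actual internals:

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cmplx = std::complex<double>;
static const double PI = std::acos(-1.0);

// Two-level twiddle table: stores roughly 2*sqrt(N) roots of unity
// instead of N.
//   fine[i]   = exp(-2*pi*I*i/N)      for i < s
//   coarse[j] = exp(-2*pi*I*(j*s)/N)  for j < ceil(N/s)
struct TwiddleTable {
  std::size_t N, s;
  std::vector<cmplx> fine, coarse;

  explicit TwiddleTable(std::size_t N_) : N(N_) {
    s = std::size_t(std::lround(std::sqrt(double(N))));
    if (s == 0) s = 1;
    std::size_t ncoarse = (N + s - 1) / s;
    fine.resize(s);
    coarse.resize(ncoarse);
    for (std::size_t i = 0; i < s; ++i)
      fine[i] = std::polar(1.0, -2.0 * PI * double(i) / double(N));
    for (std::size_t j = 0; j < ncoarse; ++j)
      coarse[j] = std::polar(1.0, -2.0 * PI * double(j * s) / double(N));
  }

  // Reconstructs exp(-2*pi*I*k/N): one branch, at most two complex
  // loads, and at most one complex multiplication.
  cmplx operator()(std::size_t k) const {
    std::size_t j = k / s, i = k % s;
    if (i == 0) return coarse[j];   // branch skips a redundant multiply by 1
    return coarse[j] * fine[i];
  }
};
```

Regenerating a WA entry this way at the start of each pass trades one complex multiplication (plus a predictable branch) per twiddle factor for an O(sqrt(N)) instead of O(N) memory footprint per cached plan.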
@peterbell10, do you think this is worth trying?
Providing this as an option would most likely cause too much code growth; I think we should either keep the current state or make the full switch.