Look into deallocating temp memory used for rotating views in FFTOffsetViewPairs
It isn't clear to me how expensive allocating and deallocating memory on the GPU is. Currently, the temporary array used for the FFTs stays allocated for as long as the OffsetViewFFTPair exists. However, we only need it inside the FFT wrapper. We might be better off allocating it at the start of the FFT wrapper and deallocating it at the end. We would then have only a single temporary array alive at a time, which could save a lot of memory.
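
To make the trade-off concrete, here is a minimal sketch of the two strategies in plain Python. It is not the project's code: `TrackingAllocator`, `scratch`, `PersistentPair`, `ScopedPair`, and `peak_scratch_bytes` are all hypothetical names, the "allocator" is a CPU-side toy that only counts bytes, and it says nothing about how long a real GPU allocation takes.

```python
from contextlib import contextmanager


class TrackingAllocator:
    """Toy stand-in for a GPU allocator (assumption): records live and peak bytes."""

    def __init__(self):
        self.live = 0
        self.peak = 0

    def alloc(self, nbytes):
        self.live += nbytes
        self.peak = max(self.peak, self.live)
        return bytearray(nbytes)

    def free(self, buf):
        self.live -= len(buf)


@contextmanager
def scratch(allocator, nbytes):
    """Allocate a temporary buffer on entry and release it on exit."""
    buf = allocator.alloc(nbytes)
    try:
        yield buf
    finally:
        allocator.free(buf)


class PersistentPair:
    """Current strategy: the scratch buffer lives as long as the pair."""

    def __init__(self, allocator, nbytes):
        self.tmp = allocator.alloc(nbytes)

    def fft(self):
        pass  # the transform would use self.tmp here


class ScopedPair:
    """Proposed strategy: the scratch buffer lives only inside fft()."""

    def __init__(self, allocator, nbytes):
        self.allocator = allocator
        self.nbytes = nbytes

    def fft(self):
        with scratch(self.allocator, self.nbytes) as tmp:
            pass  # the transform would use tmp here


def peak_scratch_bytes(pair_type, n_pairs=8, nbytes=1024):
    """Create n_pairs pairs, run one FFT on each, and report peak scratch usage."""
    allocator = TrackingAllocator()
    pairs = [pair_type(allocator, nbytes) for _ in range(n_pairs)]
    for pair in pairs:
        pair.fft()
    return allocator.peak
```

In this model, `peak_scratch_bytes(PersistentPair)` grows with the number of pairs, while `peak_scratch_bytes(ScopedPair)` stays at one buffer. The scoped version pays one allocation and one release per FFT call, which is exactly the cost that would need to be measured on the GPU before switching.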