Combining multiple FFTs

Suppose there was some data of length $N$ that had been split into $A$ chunks (each of length $N/A$) and each of the chunks had been FFT’d. What I wanted to figure out was how could you recombine these individual FFTs to recreate an FFT of the whole N-length data stretch. Now there’s a simple way to do this in Matlab (see here, also linked in struck through stuff below), but I wanted to work out how to do it “properly”, which after a lot of Googling I finally figured out – my method generalises the definition of the decimation-in-frequency radix-4 transform (see e.g. page 32 of this) via the method used for a sequence of two FFTs in this paper.

So, if we start with the definition of the Discrete Fourier Transform and its inverse:
$$X_k = \sum_{n=0}^{N-1}x_nW_N^{nk}$$ and
$$x_k = (1/N)\sum_{n=0}^{N-1}X_nW_N^{-nk}$$
where $W_N = e^{-2\pi i /N}$.

The decimation-in-frequency FFT for an arbitrary radix (e.g. radix-A) can be defined as:
$$X_{Ak+m} = \sum_{n=0}^{M-1}\left[\sum_{l=0}^{A-1}(W_A)^{lm}x_{n+lM}W_N^{nm}\right]W_M^{nk},$$
where $M = N/A$, $k = 0, 1, …, M-1$ and $m=0,1,…,A-1$. The $x_{n+lM}$ vectors are the sequential chunks of the original time series. Since we have the FFTs of these chunks we can use the definition of the inverse FFT above and substitute it in for $x$ giving
$$X_{Ak+m} = \sum_{n=0}^{M-1} \frac{1}{M} \left[\sum_{j=0}^{M-1}\left(\sum_{l=0}^{A-1}W_A^{lm}F_{lj}\right)W_M^{-jn}W_N^{nm}\right]W_M^{nk},$$
where $F_{lj} = {\rm FFT}(\{x_{jM, jM+1, …, (j+1)M-1}\})_{l}$.

We can simplify this expression with a couple of substitutions, $W_N = W_M^{1/A}$ and $d_{jm} = \sum_{l=0}^{A-1} W_A^{lm}F_{lj}$, giving
$$X_{Ak+m} = \frac{1}{M}\sum_{n=0}^{M-1}W_M^n\sum_{j=0}^{M-1}d_{jm}W_M^{(m/A)-j+k},$$
which, using the (I assume standard) identity used in here to get rid of one of the summations), gives
$$X_{Ak+m} = \frac{1}{M}\sum_{j=0}^{M-1}d_{jm}\frac{(1-W_M^{((m/A)-j+k)M})}{(1-W_M^{((m/A)-j+k)})}.$$

And, voila, that’s the general solution to the problem – although note that for the $Ak+0$ values you should just use
$$X_{Ak+0} = d_{j0},$$
(strangely the general solution doesn’t seem to work for these values, but they may be a problem with what I did).

Below’s a short Matlab code that does this if the $A$ FFTs have been put into an $M\times A$ array called X:

% get size of X
sx = size(X);
Nbins = sx(1);
Nffts = sx(2);

% get the length of the final FFT
Ntot = Nbins*Nffts;

% create output Y
Y = zeros(Ntot,1);

m2pi = -2*pi*1i;
tw = exp(m2pi/Nffts);
W = exp(m2pi/Nbins);

jbins = (0:(Nbins-1))';

% get the i*Nbins + 1 full FFT values (for i=0, Nffts)
for i=1:Nffts
    Y(1:Nffts:end) = Y(1:Nffts:end) + X(:,i);
end

% get the rest
for j=1:Nffts-1
    for k=0:Nbins-1
        % get sums of ffts
        ds = zeros(Nbins,1);
    
        for i=0:Nffts-1
            ds = ds + tw^(j*i)*X(:,i+1);
        end
        
        % get extra twiddles
        T = (1 - W.^(((j/Nffts) - jbins + k)*Nbins)) ./ ...
            (1 - W.^((j/Nffts) - jbins + k));
        
        Y(j+1 + Nffts*k) = (1/Nbins)*sum(ds.*T);
    end
end

There are probably quicker ways of implementing this in Matlab. In reality I expect this will always be slower (it seems to be $\propto N^2/A$ operations) than just inverse FFTing each of the individual FFTs and recombining them to create the full spectrum!

Today I’ve been searching the web trying to find out how you can combine multiple FFT of short chunks of data into an equivalent FFT of the whole stretch of data (e.g. if you have some data that’s 128 point long [let’s keep it in powers of two!] and I split it into 4 chunks of 32 points and FFT each of those, how do I combine those FFTs to make the equivalent of an FFT of the whole 128 point long stretch?). After learning a lot about how FFT’s actually work (radixes, butterflies, twiddle factors and all that) I found a nice simple Matlab-fied answer here (See Greg Heath’s reply at – 2010-06-18 22:54:00).

[Update: The method in the link is actually a bit of a cheat as it assumes that the short FFTs have been FFT’d with zero-padding to the full sample length. In reality (in the situation that I want this method for) the short FFTs will not have been zero padded. So, to duplicate the effect of zero padding the short FFTs need to be interpolated with a sinc function to the full sample rate. In Matlab this can be done with the interpft function, although what this really does is (i)FFT the data zero pad it and the re-FFTs it (it’s therefore quick), but information on how to do this more formally (using the sinc interpolation) can be found here.]

4 Replies to “Combining multiple FFTs”

I think you left some Anonymous comments at my sinc interpolation page over at:

http://phaseportrait.blogspot.com/2008/06/sinc-interpolation-in-matlab.html

It took a while, but I finally got around to responding to your comments about the differences (or lack of) between zero-padding in the frequency domain and doing a sinc interpolation in the time domain.

As discussed there, simple sinc interpolation in the time domain assumes a bandlimited signal (i.e., finite frequency support) that is zero outside of the given spectral data. Consequently, sinc interpolation (regardless of how it’s implemented) describes a time-domain signal that itself cannot have finite support. Sinc interpolation and zero-padding in the frequency domain are truly identical. Sinc interpolation provides intermediate time-domain samples that are consistent with the given spectral data.

Your comments about ever-better approximations make me think you were thinking of padding in the time domain, not the frequency domain (which, by duality, is mathematically identical but has a different physical interpretation).

Two questions, one trivial and one not so.

1) I got here via a Google search and don’t know what to do with what I think is Latex in the article. What need I do to see the formatted formulas?

2) Have you considered the inverse that would take a long FFT and decompose it into segments of shorter length? The reason for wanting this is that I want to use it inside a VST plugin development platform, ReaJS from the company behind Reactor, that has a hard coded upper limit of 16384 complex samples (32768 if there is no need to de-permute the permuted result) for both FFT and iFFT but is very, very fast. I’ve transcoded your Matlab to the much simplified array-free C based language that it uses and as I was beginning to debug it I realized that without the inverse of your code to capitalize on the speed of the really fast, but limited, built in iFFT there was no point. I don’t begin to understand the logic behind your algorithm from the code and would greatly appreciate it if you could give a bit of attention to providing the solution to the decomposition.

It just occurred to me that now that I’ve unrolled all the Matlab implicit array loops, my transcoding could serve as a nice template for a native machine code implementation. If you like, I’ll be happy to provide you a link to it if and when I succeed in debugging it.

Many Thanks,

Don Gateley

Hmm. I found this to be extremely slow so did a loop analysis and found the the innermost loop, calculating each element of the vector ds, executes

(Ntot ^ 2)(1 – 1 / Nffts)

times. This is O(Ntot ^2) which is the same as the not-fast Fourier transform of the final result. I.e. no benefit remains from the Nfft fast transforms. Placing a counter inside the ds calculation loop incrementing each time through it by Nbin, the length of ds, verifies that formula.

matthew says:

April 21, 2014 at 5:55 pm

Hi Don,

Sorry for the extremely slow response. Thanks for the comments. Regarding the LaTeX not displaying that was a problem at my end – I’d updated the wordpress theme, but not fixed it so that MathJax worked. I’ve sorted it out now.

Regarding the speed of the algorithm. I agree that this is never going to outperform any method making use of FFTs – when I was initially thinking about this I’d hoped that it would, but in the end it just became an exercise for me in learning how FFTs worked.

Reply

Ted Pavlic says:

November 21, 2011 at 7:38 pm

I think you left some Anonymous comments at my sinc interpolation page over at:

http://phaseportrait.blogspot.com/2008/06/sinc-interpolation-in-matlab.html

It took a while, but I finally got around to responding to your comments about the differences (or lack of) between zero-padding in the frequency domain and doing a sinc interpolation in the time domain.

As discussed there, simple sinc interpolation in the time domain assumes a bandlimited signal (i.e., finite frequency support) that is zero outside of the given spectral data. Consequently, sinc interpolation (regardless of how it’s implemented) describes a time-domain signal that itself cannot have finite support. Sinc interpolation and zero-padding in the frequency domain are truly identical. Sinc interpolation provides intermediate time-domain samples that are consistent with the given spectral data.

Your comments about ever-better approximations make me think you were thinking of padding in the time domain, not the frequency domain (which, by duality, is mathematically identical but has a different physical interpretation).

Don Gateley says:

March 2, 2014 at 2:37 am

Two questions, one trivial and one not so.

1) I got here via a Google search and don’t know what to do with what I think is Latex in the article. What need I do to see the formatted formulas?

2) Have you considered the inverse that would take a long FFT and decompose it into segments of shorter length? The reason for wanting this is that I want to use it inside a VST plugin development platform, ReaJS from the company behind Reactor, that has a hard coded upper limit of 16384 complex samples (32768 if there is no need to de-permute the permuted result) for both FFT and iFFT but is very, very fast. I’ve transcoded your Matlab to the much simplified array-free C based language that it uses and as I was beginning to debug it I realized that without the inverse of your code to capitalize on the speed of the really fast, but limited, built in iFFT there was no point. I don’t begin to understand the logic behind your algorithm from the code and would greatly appreciate it if you could give a bit of attention to providing the solution to the decomposition.

It just occurred to me that now that I’ve unrolled all the Matlab implicit array loops, my transcoding could serve as a nice template for a native machine code implementation. If you like, I’ll be happy to provide you a link to it if and when I succeed in debugging it.

Many Thanks,

Don Gateley

Don Gateley says:

March 2, 2014 at 8:39 pm

Hmm. I found this to be extremely slow so did a loop analysis and found the the innermost loop, calculating each element of the vector ds, executes

(Ntot ^ 2)(1 – 1 / Nffts)

times. This is O(Ntot ^2) which is the same as the not-fast Fourier transform of the final result. I.e. no benefit remains from the Nfft fast transforms. Placing a counter inside the ds calculation loop incrementing each time through it by Nbin, the length of ds, verifies that formula.

1. matthew says:
  
  April 21, 2014 at 5:55 pm
  
  Hi Don,
  
  Sorry for the extremely slow response. Thanks for the comments. Regarding the LaTeX not displaying that was a problem at my end – I’d updated the wordpress theme, but not fixed it so that MathJax worked. I’ve sorted it out now.
  
  Regarding the speed of the algorithm. I agree that this is never going to outperform any method making use of FFTs – when I was initially thinking about this I’d hoped that it would, but in the end it just became an exercise for me in learning how FFTs worked.

Combining multiple FFTs

Related

4 Replies to “Combining multiple FFTs”

Leave a Reply Cancel reply

Share this:

Related

4 Replies to “Combining multiple FFTs”

Leave a Reply Cancel reply