Fastest Fourier Transform in the West (FFTW)
FFTW Notes
FFTW is a C library (with Fortran support) for computing discrete Fourier transforms (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data.
Compiling with FFTW
Load the FFTW Module
module load fftw/3.1.1-pgi
module load fftw/2.1.5-pgi
Use module show fftw/3.1.1-pgi to see the environment variables the module sets to make compiling simpler.
FFTW/3.x: pgcc fftw3.c $FFTW_INC $FFTW_LINK -lfftw3
FFTW/2.x: pgcc fftw2.c $FFTW_INC $FFTW_LINK -lfftw
$FFTW_INC: Points to the location of the FFTW header and include files.
$FFTW_LINK: Points to the location of the FFTW libraries.
-lfftw3: Command to tell the compiler to link libfftw3.a from the location $FFTW_LINK.
-lrfftw -lfftw: Command to tell the compiler to link libfftw.a from the location $FFTW_LINK. The command -lfftw only supports complex transforms; -lrfftw will do real-to-complex transforms. FFTW/3.x does not have this limitation.
FFTW Notes
Start by reading the fftw/3.x tutorial.
For Fortran, read the fftw page on Fortran.
You can use the module's environment variables in Makefiles by loading the module and wrapping the variable names in parentheses: $(FFTW_INC) $(FFTW_LINK)
Always use fftw/3.x if you can. Performance is better, and it is the only version of fftw still under development. The exception is MPI-enabled fftw, which is currently stable only in fftw/2.x.
Keep your number of samples to a product of small primes. For example, computing 19999999 samples (a prime number) took 2m42s, while computing 20000000 samples (prime factors 2 and 5 only) took 1m27s. See the Cooley-Tukey algorithm for details on why.
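Always use fftw_malloc() to allocate your input and output arrays when using fftw, and release them with fftw_free(). In fftw/3.x this gives the arrays the alignment the library needs for its fastest code paths.
fftw/3.x should work with complex.h from C99. If complex.h is included before fftw3.h, fftw_complex is defined as the native double-precision complex type.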
You should not mix fftw/2.x and fftw/3.x - they are very different, but some names match.
fftw_complex and fftwf_complex are the double- and single-precision (float) complex types, respectively.
Unless you are saving your wisdom for later use, or reusing the same plan many times, use only FFTW_ESTIMATE in your planner. The other flags take much longer to plan for small gains.
Using any planner flag other than FFTW_ESTIMATE will overwrite the input and output buffers while the plan is being created. This can be confusing if your input data is already in the *in buffer, so create the plan before you fill it.
Threaded (shared-memory parallel) fftw is available, as is MPI-enabled fftw. Contact cac-support@umich.edu for help.
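The note on sample counts can be turned into a small check that runs before any buffers are allocated. The sketch below is plain C with no FFTW calls; the function names has_only_small_factors and next_fast_size are hypothetical, and the choice of 2, 3, 5 and 7 as the "small" primes is an assumption made for this example rather than a rule stated on this page.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if n has no prime factor other than
 * 2, 3, 5 or 7 (an assumed definition of "small primes"), else 0. */
int has_only_small_factors(long n)
{
    static const int primes[] = {2, 3, 5, 7};
    int i;

    assert(n > 0);

    /* Divide out every small prime; whatever is left is the product
     * of the remaining (larger) prime factors. */
    for (i = 0; i < 4; i++) {
        while (n % primes[i] == 0) {
            n /= primes[i];
        }
    }
    return n == 1;
}

/* Hypothetical helper: smallest size >= n accepted by the check above.
 * A simple linear search, which is adequate for a one-off size choice. */
long next_fast_size(long n)
{
    assert(n > 0);

    while (!has_only_small_factors(n)) {
        n++;
    }
    return n;
}
```

With the sizes from the timing example, has_only_small_factors(19999999) returns 0 and next_fast_size(19999999) returns 20000000. Whether padding the data up to the larger size is acceptable depends on the application, since padding changes the transform being computed.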



