My third neural network experiment (the second was an FIR filter). The DFT output is just a linear combination of the inputs, so it should be implementable as a single layer with no activation function.
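A minimal sketch of the idea (not the original gist code): a single linear layer with no activation, trained by plain gradient descent on random signals, converges to the DFT matrix. The sizes, learning rate, and step count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 32                                        # signal length
W = rng.normal(scale=0.01, size=(N, 2 * N))   # weights: N inputs -> [Re | Im] outputs

for step in range(20000):
    x = rng.normal(size=(64, N))              # batch of random input signals
    X = np.fft.fft(x, axis=1)                 # target DFT of each signal
    y_true = np.hstack([X.real, X.imag])      # stack real and imaginary parts
    y_pred = x @ W                            # the "network": one linear layer, no activation
    grad = x.T @ (y_pred - y_true) / len(x)   # gradient of mean squared error
    W -= 0.01 * grad

# The learned weights should match the DFT basis (cosines and negated sines):
n = np.arange(N)
print(np.max(np.abs(W[:, :N] - np.cos(2 * np.pi * np.outer(n, n) / N))))
print(np.max(np.abs(W[:, N:] + np.sin(2 * np.pi * np.outer(n, n) / N))))
```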
Animation of weights being trained:
Red weights are positive, blue are negative. The black squares (2336 out of 4096) are unused and could be pruned away to save computation time (if I knew how to do that).
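One hedged way such pruning could be done (not something from the gist itself): zero out any weight below a threshold and store the matrix in sparse form, so the multiply skips the zeros. Here `W` stands for a trained (N × 2N) weight matrix like the one above, and the threshold is arbitrary.

```python
import numpy as np
from scipy.sparse import csr_matrix

def prune_to_sparse(W, threshold=1e-6):
    W_pruned = np.where(np.abs(W) < threshold, 0.0, W)  # force tiny weights to exactly zero
    return csr_matrix(W_pruned)                          # sparse storage; zeros are skipped

# W_sparse = prune_to_sparse(W)
# y = W_sparse.T @ x            # same linear layer, computed sparsely
```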
Even with pruning, it would still be less efficient than an FFT, so if the frequency-domain output is useful, it's probably best to compute it externally and provide it to the network as separate inputs.
Still, this at least demonstrates that a neural network can figure out frequency content on its own if it's useful to the problem.
The loss goes down for a while but then goes up. I don't know why:
@msaad1311
Yes, it's highly inefficient, as I said in the description and the comments, even more so than a direct DFT, because all of the zero weights are still calculated unnecessarily. This isn't something you should actually be doing; it was just an experiment while teaching myself neural nets. If your neural net would benefit from frequency-domain information, it's better to just do a numpy FFT and pass the output to the net (possibly adding a magnitude function afterward, since that nonlinearity was much harder to learn in my tests).
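A hedged sketch of that suggestion (the function name and shapes are made up): do the FFT with numpy outside the network, take the magnitude there too, and feed the result in as extra input features instead of making the net learn the transform.

```python
import numpy as np

def add_spectral_features(signals):
    """signals: (batch, N) array of real-valued time-domain inputs."""
    spectrum = np.fft.rfft(signals, axis=1)    # one-sided FFT of each signal
    magnitude = np.abs(spectrum)               # magnitude nonlinearity, computed outside the net
    return np.hstack([signals, magnitude])     # original samples plus spectral features

# features = add_spectral_features(x_train)    # then train the network on `features`
```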
If you need the net to learn FFT-like transforms in general, look at the comment above about butterfly networks https://gist.github.com/endolith/98863221204541bf017b6cae71cb0a89#gistcomment-3461984