Replace direct smoothing code with pure Python
This removes smooth_util.pyx and replaces it with a pure Python implementation, which seems to be slightly faster in the tests I have run so far.
Since smooth_util.pyx was the last Cython code in Nifty, I have also removed all mentions of Cython from the documentation, requirements files, etc.