Fix bug in GPU version

The association of CUDA memory with Fortran arrays was sometimes
done incorrectly: instead of the number of array elements, the
number of array elements times sizeof(datatype) was used.
This was a mix-up between the C allocation and the Fortran reshape.
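
A minimal sketch of the distinction, not the actual code touched by
this commit. The helper names alloc_bytes and fortran_extent are
hypothetical and only illustrate which quantity belongs where: a
C-style allocation takes a size in bytes, while the shape of the
Fortran array associated with that buffer must be given in elements.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes to request from a C-style allocator
 * (hypothetical helper, for illustration only). */
static size_t alloc_bytes(size_t n_elements, size_t elem_size)
{
    return n_elements * elem_size;
}

/* Extent to use when viewing the same buffer as a Fortran array:
 * a count of elements, never a count of bytes
 * (hypothetical helper, for illustration only). */
static size_t fortran_extent(size_t n_bytes, size_t elem_size)
{
    assert(elem_size > 0);
    assert(n_bytes % elem_size == 0);
    return n_bytes / elem_size;
}
```

Passing the result of alloc_bytes as the array extent would describe
an array sizeof(datatype) times larger than the buffer behind it.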

This could lead to memory corruption, because the Fortran array
then appeared larger than the memory actually allocated for it.
