Lift the restriction bandwidth == nblk in the GPU version of ELPA 2
In the GPU version of ELPA 2, the intermediate bandwidth is always set to the ScaLAPACK block size (nblk). It would be better if the optimal value could be selected, as in the CPU version, since the bandwidth is very important for performance. However, the assumption bandwidth == nblk is hard-coded somewhere in the band reduction step.
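To make the request concrete, here is a minimal sketch of the desired selection logic. It is an illustration only, not ELPA code: the function name `choose_bandwidth`, the `gpu_restricted` flag, and the default target of 64 are all hypothetical. It also assumes that the bandwidth must remain a multiple of nblk.

```python
def choose_bandwidth(nblk, use_gpu, target=64, gpu_restricted=True):
    """Pick an intermediate bandwidth for the band reduction step.

    Hypothetical sketch: `target` stands in for whatever value is
    optimal on a given machine; it is not taken from ELPA.
    Assumes the bandwidth must be a multiple of the block size nblk.
    """
    if nblk <= 0:
        raise ValueError("nblk must be positive")
    if target <= 0:
        raise ValueError("target must be positive")

    # Current GPU behaviour described in this issue: bandwidth == nblk.
    if use_gpu and gpu_restricted:
        return nblk

    # Requested behaviour: smallest multiple of nblk that is >= target.
    return ((target - 1) // nblk + 1) * nblk


if __name__ == "__main__":
    for nblk in (16, 32, 48, 64, 128):
        current = choose_bandwidth(nblk, use_gpu=True)
        wanted = choose_bandwidth(nblk, use_gpu=True, gpu_restricted=False)
        print(f"nblk={nblk}: current GPU bandwidth={current}, requested={wanted}")
```

With the restriction lifted, the GPU path would return the same value as the CPU path for a given nblk, so the bandwidth could be tuned independently of the block size.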