copy a_dev to a_mat at the end of bandred instead of later
Previously, matrix a was kept on the device at the end of bandred and copied to the host later in redist_band (called from tridiag_band). Since the copy has to be done anyway, it is more convenient to do it at the end of bandred. As a result, band to tridi no longer uses a on the device at all. However, we keep it in the interface, since it might be useful in the future.