particle code hangs for certain initial conditions
The particle code sometimes hangs for no clear reason, and the behavior is very inconsistent. I see it on my laptop with gcc 8.1 and mpich 3.3, but not on our local cluster. Also, if I run the code several times it does not always fail, hence my belief that it is somehow related to non-deterministic MPI behavior.
A possible fix is now implemented in branch bugfix/particle_distribution.
I put in a DEBUG_MSG_WAIT and noticed that the code suddenly stopped hanging. It never hung while this debug message was in place. I believe some of the asynchronous transfers get their tags confused between time-steps, and the MPI_Barrier() from DEBUG_MSG_WAIT is a heavy-handed fix for that.
So for now I added an MPI_Barrier() that works around the problem until we have time to figure out if there's a cleaner fix.
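
One direction for a cleaner fix, if the tag-confusion guess above is right, would be to make the tag depend on the time-step, so a message from step n cannot be matched by a receive posted for step n+1. This is only a sketch: step_tag, the MSG_* kinds and TAG_LIMIT are hypothetical names, not taken from the actual code. The limit of 32768 relies on the MPI standard guaranteeing that tags up to at least 32767 are valid (the real upper bound is the MPI_TAG_UB attribute).

```c
#include <assert.h>

/* Number of usable tag values. MPI guarantees tags 0..32767 are valid;
   the actual limit (MPI_TAG_UB) may be larger. Assumption for this sketch. */
#define TAG_LIMIT 32768

/* Hypothetical message kinds exchanged per time-step (illustrative only). */
enum { MSG_PARTICLE_COUNT = 0, MSG_PARTICLE_DATA = 1, MSG_KINDS = 2 };

/* Return a tag that differs between consecutive time-steps and between
   message kinds. Tags repeat only after TAG_LIMIT / MSG_KINDS steps, which
   is harmless as long as transfers complete long before the wrap-around. */
static int step_tag(long step, int kind)
{
    assert(step >= 0 && kind >= 0 && kind < MSG_KINDS);
    long slots = TAG_LIMIT / MSG_KINDS;  /* distinct steps before tags wrap */
    return (int)((step % slots) * MSG_KINDS + kind);
}
```

Both the sending and the receiving side would pass step_tag(step, kind) as the tag argument of their MPI_Isend/MPI_Irecv calls. Whether this removes the need for the barrier would still have to be checked on the laptop setup where the hang shows up.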