particle code hangs for certain initial conditions
The particle code sometimes hangs for no clear reason. The behavior is very inconsistent: I see it on my laptop with gcc 8.1 and mpich 3.3, but not on our local cluster. Also, when I run the code several times it does not always fail, hence my belief that it is somehow related to non-deterministic MPI behavior.
A possible fix is now implemented in branch
I put in a DEBUG_MSG_WAIT and noticed that the code suddenly stopped hanging. It never hung while this debug message was in place. I believe some of the asynchronous transfers get their tags confused between time-steps, and the DEBUG_MSG_WAIT is a brute-force fix for that.
So for now I have added an MPI_Barrier() that fixes the problem until we have time to figure out whether there's a cleaner fix.
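If the tag-confusion hypothesis turns out to be right, one candidate for a cleaner fix might be to encode the time-step in the message tag, so that a transfer posted in one step cannot be matched against a transfer from a neighbouring step. The helper below is only a sketch of that idea and is not taken from the particle code: `step_tag`, `N_MSG_KINDS` and `STEP_WINDOW` are hypothetical names, and the number of message kinds per step is an assumption. The limit of 32767 is the smallest tag upper bound the MPI standard guarantees; real code would query `MPI_TAG_UB` instead.

```c
#include <assert.h>

/* Smallest tag upper bound guaranteed by the MPI standard.
   Real code should query the MPI_TAG_UB attribute at runtime. */
#define TAG_LIMIT 32767

/* Assumption: number of distinct message kinds exchanged per time-step. */
#define N_MSG_KINDS 8

/* Number of consecutive time-steps whose tags are guaranteed distinct. */
#define STEP_WINDOW (TAG_LIMIT / N_MSG_KINDS)

/* Hypothetical helper: build a tag that encodes both the message kind
   and the time-step. Tags repeat only after STEP_WINDOW steps. */
static int step_tag(long step, int kind)
{
    assert(step >= 0);
    assert(kind >= 0 && kind < N_MSG_KINDS);
    return (int)(step % STEP_WINDOW) * N_MSG_KINDS + kind;
}
```

The returned value would be passed as the tag argument of the matching MPI_Isend and MPI_Irecv calls in place of a fixed tag. Whether this actually removes the hang is untested; it only addresses the case where messages from different time-steps are matched against each other.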