Threadpool fixes
Fixes #14 (closed)
I'm a bit disappointed that std::hardware_destructive_interference_size isn't properly supported, but none of the major standard libraries seem to implement it, so there's no point even trying.
The deadlocks you were seeing might be related to the race condition I tried to fix here. There is a window in which all the worker threads have checked the shared work queue and found nothing to do, while the producer thread is just about to push a work item onto the queue. If the workers go to sleep in that window, they never check the queue again and the thread pool deadlocks. That earlier fix obviously wasn't good enough, so I've added an extra atomic variable to track this situation and guarantee the workers won't go to sleep in that window.
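For anyone reading along, here is a rough sketch of the general idea, not the actual code in this PR. It assumes a queue guarded by its own mutex and a separate mutex/condition variable pair for sleeping. Every name in it (SketchPool, pending_, sleep_mutex_ and so on) is made up for illustration. The producer bumps an atomic counter before it pushes, and workers refuse to sleep while that counter is non-zero, so a worker that has just seen an empty queue still notices that work is on its way.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch only: none of these names come from the PR.
class SketchPool {
public:
    explicit SketchPool(std::size_t workers) {
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }

    ~SketchPool() {
        {
            std::lock_guard<std::mutex> lock(sleep_mutex_);
            stop_ = true;
        }
        wake_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void submit(std::function<void()> job) {
        // Announce the work item BEFORE it is pushed, so a worker that
        // has just found the queue empty will not go to sleep.
        pending_.fetch_add(1);
        {
            std::lock_guard<std::mutex> lock(queue_mutex_);
            queue_.push_back(std::move(job));
        }
        // Take and release the sleep mutex so the notify below cannot fall
        // between a worker's predicate check and its actual wait.
        { std::lock_guard<std::mutex> lock(sleep_mutex_); }
        wake_.notify_one();
    }

private:
    bool try_pop(std::function<void()>& job) {
        std::lock_guard<std::mutex> lock(queue_mutex_);
        if (queue_.empty()) return false;
        job = std::move(queue_.front());
        queue_.pop_front();
        pending_.fetch_sub(1);  // the item is now owned by this worker
        return true;
    }

    void run() {
        std::function<void()> job;
        for (;;) {
            if (try_pop(job)) { job(); continue; }
            // Queue looked empty: sleep only if nothing has been announced.
            std::unique_lock<std::mutex> lock(sleep_mutex_);
            wake_.wait(lock, [this] { return stop_ || pending_.load() > 0; });
            if (stop_ && pending_.load() == 0) return;  // drained, shut down
        }
    }

    std::mutex queue_mutex_;                    // guards queue_
    std::deque<std::function<void()>> queue_;
    std::mutex sleep_mutex_;                    // guards stop_ and the wait
    std::condition_variable wake_;
    std::atomic<std::size_t> pending_{0};       // announced but not yet popped
    bool stop_ = false;
    std::vector<std::thread> threads_;
};
```

In this sketch a worker that sees pending_ > 0 while the queue is still empty simply loops and retries, which costs a brief spin during the producer's push but avoids the missed wakeup. The destructor lets the workers drain everything that was submitted before it joins them.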
I think Travis is more vulnerable to this race because its builds run on VMs with only 2 cores. The more threads there are, the less likely it should be that all the workers are in this in-between stage at once. I've now had quite a few Travis passes in a row with this addition, so I'm hopeful that it's really fixed this time.