parallelization for mirrored KL

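The KL with mirror_samples=True can now be parallelized by an additional factor of 2, because a sample and its mirrored counterpart can be assigned to different threads. Where possible, a thread reuses a sample it has already drawn to produce the mirrored one instead of drawing it again.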
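
The following is a minimal sketch of the idea, not the code in this merge request. It assumes that the 2 * n_samples work items (each sample plus its mirror) are split into contiguous blocks across workers, and that sample k can be regenerated deterministically from a seed. All names (`local_indices`, `draw`, `local_samples`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def local_indices(n_items, n_workers, rank):
    """Contiguous block of work-item indices assigned to worker `rank`."""
    base, extra = divmod(n_items, n_workers)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return range(lo, hi)


def draw(seed, k, dim):
    """Draw sample k from its own deterministic stream (illustrative only)."""
    return np.random.default_rng([seed, k]).standard_normal(dim)


def local_samples(n_samples, n_workers, rank, seed=0, dim=3):
    """Return this worker's samples and the number of draws it performed.

    Work item i maps to sample k = i // 2; odd i is the mirrored version.
    """
    out = {}
    cache = {}  # samples already drawn by this worker
    n_draws = 0
    for i in local_indices(2 * n_samples, n_workers, rank):
        k, mirrored = divmod(i, 2)
        if k not in cache:
            cache[k] = draw(seed, k, dim)
            n_draws += 1
        # Reuse the cached draw when the mirror lives on the same worker.
        out[i] = -cache[k] if mirrored else cache[k]
    return out, n_draws
```

In this sketch, 3 samples can occupy up to 6 workers rather than 3. A worker that holds both a sample and its mirror draws only once, while a worker that holds only the mirror regenerates the sample from the shared seed and negates it.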
