Incorrect number of samples in optimize_kl
After updating my script to the latest changes in `re.optimize_kl`, it seems that when I provide a varying number of samples (i.e. `n_samples` is a callable), `optimize_kl()` returns only as many samples as `n_samples` specifies for the first iteration. I'm not sure whether the optimization itself uses the right number of samples at each iteration, though it seems to.
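To make the expectation concrete, here is a minimal sketch of the check I have in mind. It does not call `re.optimize_kl` itself. The schedule, the iteration count, and the helper `diagnose_sample_count` are all hypothetical, and I am assuming the callable receives a 0-based iteration index.

```python
from typing import Callable


def n_samples(iteration: int) -> int:
    """Hypothetical schedule: few samples early on, more later."""
    return 2 if iteration < 3 else 8


def diagnose_sample_count(
    returned: int, schedule: Callable[[int], int], n_iterations: int
) -> str:
    """Compare a returned sample count against the schedule.

    Assumes 0-based iteration indices (an assumption, not confirmed
    against the library).
    """
    first = schedule(0)
    last = schedule(n_iterations - 1)
    if returned == last:
        return "last"   # matches the final iteration (what I would expect)
    if returned == first:
        return "first"  # matches the first iteration (what I observe)
    return "other"


n_total_iterations = 5  # hypothetical value
print(diagnose_sample_count(2, n_samples, n_total_iterations))
print(diagnose_sample_count(8, n_samples, n_total_iterations))
```

Here `returned` stands for however many samples the result of `optimize_kl()` ends up containing. The check is only informative when the schedule gives different values for the first and last iteration.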