Finding mono and stereo events takes too much time when loading events
While profiling ctapipe_io_magic (as of commit c676ef1c) on one run (composed of multiple subruns), I found that more than 50% of the time is spent finding the indices of mono and stereo events (the _find_mono_events and _find_stereo_events methods of the MarsRun class). See the attached screenshot (produced with snakeviz) and the attached profiling file.
From inspecting the code of the two methods, the bottleneck appears to be the several for loops, which iterate over a considerable number of events. Replacing those loops with numpy functions should give a large speedup, since the underlying data structures are already numpy arrays.
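To illustrate the kind of replacement I have in mind, here is a minimal sketch. It assumes that stereo events are those whose event number appears in both telescopes' arrays and that mono events are those appearing in only one. The actual matching criteria in MarsRun may differ, and all function and variable names below are hypothetical.

```python
import numpy as np


def find_stereo_loop(ids_m1, ids_m2):
    """Nested-loop matching: O(N*M) comparisons done in pure Python."""
    pairs = []
    for i, ev1 in enumerate(ids_m1):
        for j, ev2 in enumerate(ids_m2):
            if ev1 == ev2:
                pairs.append((i, j))
    return pairs


def find_stereo_numpy(ids_m1, ids_m2):
    """Vectorized matching: indices of event numbers present in both arrays."""
    _, idx_m1, idx_m2 = np.intersect1d(ids_m1, ids_m2, return_indices=True)
    return idx_m1, idx_m2


def find_mono_numpy(ids_m1, ids_m2):
    """Indices of event numbers present in only one of the two arrays."""
    mono_m1 = np.flatnonzero(~np.isin(ids_m1, ids_m2))
    mono_m2 = np.flatnonzero(~np.isin(ids_m2, ids_m1))
    return mono_m1, mono_m2


# Toy event numbers (assumed unique within each telescope).
ids_m1 = np.array([10, 11, 12, 14, 15])
ids_m2 = np.array([11, 13, 14, 15, 16])

idx_m1, idx_m2 = find_stereo_numpy(ids_m1, ids_m2)
mono_m1, mono_m2 = find_mono_numpy(ids_m1, ids_m2)

print("stereo index pairs:", list(zip(idx_m1.tolist(), idx_m2.tolist())))
print("mono indices M1:", mono_m1.tolist())
print("mono indices M2:", mono_m2.tolist())
```

The numpy versions rely on sorting and set operations implemented in compiled code, so they avoid per-event Python overhead. Whether they can replace the current loops directly depends on what the existing methods actually compare.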
The script I tested (called test_ctapipe_io_magic.py) is very simple:
from ctapipe_io_magic import MarsRun, MAGICEventSource
onerun = MarsRun("/storage/gpfs_data/ctalocal/aberti/MAGIC_LST/Analysis/Crab_campaign/CrabNebula/2020-01-19/Calibrated_ON/20200119*05088541*.root")
To create the profile file, I ran the script with:
python -m cProfile -o profile.prof test_ctapipe_io_magic.py
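For anyone who wants to inspect a profile file without snakeviz, the standard-library pstats module can print the most expensive calls. The sketch below is self-contained: it profiles a hypothetical stand-in function (slow_sum) and writes a temporary file instead of using the attached profile.prof, but the same pstats calls apply to any file produced by cProfile.

```python
import cProfile
import io
import os
import pstats
import tempfile


def slow_sum(n):
    """Hypothetical stand-in for a loop-heavy function."""
    total = 0
    for i in range(n):
        total += i
    return total


# Profile the function and dump the statistics to a file,
# similar to what "python -m cProfile -o <file>" does for a whole script.
profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

with tempfile.TemporaryDirectory() as tmpdir:
    prof_path = os.path.join(tmpdir, "example.prof")
    profiler.dump_stats(prof_path)

    # Load the file and report the top entries sorted by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(prof_path, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    report = stream.getvalue()

print(report)
```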