Add support for NVTX profiling
When profiling the GPU version, NVTX can be used to mark regions of the code so that they show up as named ranges in the timeline of the profiling tool (nvvp or Nsight Systems). This makes it easy to correlate GPU activity with the part of the code that is executing at that moment.
Currently, the regions are only defined in the elpa1 solver.
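
For illustration only, here is a minimal C sketch of how a code region can be wrapped in an NVTX range. It is not the code added by this change. The build flag WITH_NVTX, the helpers region_begin/region_end, and the two demo functions are hypothetical names chosen for the example. Only nvtxRangePushA and nvtxRangePop are actual NVTX calls. Without WITH_NVTX the helpers fall back to no-ops, so the example builds on machines without the CUDA toolkit.

```c
#include <assert.h>

#ifdef WITH_NVTX            /* hypothetical build flag, assumed for this sketch */
#include <nvToolsExt.h>     /* NVTX C API; link with -lnvToolsExt */
#endif

/* Nesting depth, used to catch an unbalanced begin/end pair. */
static int region_depth = 0;

/* Open a named range; it appears in the profiler timeline under this name. */
static void region_begin(const char *name)
{
#ifdef WITH_NVTX
    nvtxRangePushA(name);
#else
    (void)name;             /* no-op when NVTX is not enabled */
#endif
    region_depth++;
}

/* Close the most recently opened range. */
static void region_end(void)
{
    assert(region_depth > 0);
    region_depth--;
#ifdef WITH_NVTX
    nvtxRangePop();
#endif
}

/* Hypothetical stand-in for one phase of a solver. */
static double sum_of_squares(int n)
{
    double acc = 0.0;
    region_begin("sum_of_squares");
    for (int i = 1; i <= n; i++) {
        acc += (double)i * (double)i;
    }
    region_end();
    return acc;
}

/* Hypothetical top-level routine; ranges nest, so the inner range is
 * drawn inside the outer one in the timeline. */
static double demo_solver(int n)
{
    region_begin("demo_solver");
    double result = sum_of_squares(n);
    region_end();
    return result;
}
```

Routing all calls through two small helpers keeps the instrumented code free of preprocessor conditionals and lets a build without NVTX compile unchanged.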