- 25 Feb, 2015 1 commit
-
-
Lorenz Huedepohl authored
-
- 02 Feb, 2015 1 commit
-
-
Lorenz Huedepohl authored
There is currently no reliable way to measure RAM accesses with PAPI. The previous approach of counting load and store instructions is not very useful, as it is unknown how many bytes each instruction transfers.

On certain CPUs this can be measured reliably via an "uncore" performance counter. To check whether your CPU (and/or Linux kernel version) supports this, see if the files

  /sys/devices/uncore_imc/events/data_reads
  /sys/devices/uncore_imc/events/data_writes

exist.

To access these counters from an unprivileged program, the "paranoia" level of the perf subsystem has to be set to at most 0, adjustable via the file

  /proc/sys/kernel/perf_event_paranoid

Along with this change there is a small API/ABI breakage, as some keyword arguments related to the memory measurement have been renamed or split up.
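The two checks described in this message can be scripted. The following is a minimal sketch, not part of the library: the function name and its `root` parameter are hypothetical and exist only so the check can be pointed at a test directory instead of the real `/sys` and `/proc` trees.

```python
from pathlib import Path

# Paths taken from the commit message, relative to the filesystem root.
UNCORE_EVENT_FILES = (
    "sys/devices/uncore_imc/events/data_reads",
    "sys/devices/uncore_imc/events/data_writes",
)
PARANOID_FILE = "proc/sys/kernel/perf_event_paranoid"


def uncore_memory_counters_usable(root="/"):
    """Return (events_present, paranoid_level, usable_unprivileged).

    `root` is a hypothetical parameter for testing; on a real system
    leave it at "/".
    """
    root = Path(root)
    # Both event files must exist for the uncore counters to be available.
    events_present = all((root / f).exists() for f in UNCORE_EVENT_FILES)
    try:
        paranoid_level = int((root / PARANOID_FILE).read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not an integer.
        paranoid_level = None
    # Unprivileged access needs a paranoia level of at most 0.
    usable = (
        events_present
        and paranoid_level is not None
        and paranoid_level <= 0
    )
    return events_present, paranoid_level, usable


if __name__ == "__main__":
    present, level, usable = uncore_memory_counters_usable()
    print(f"uncore_imc events present: {present}")
    print(f"perf_event_paranoid: {level}")
    print(f"usable without privileges: {usable}")
```

If the last line reports False while the event files are present, the paranoia level is most likely above 0 and has to be lowered by an administrator.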
-
- 21 Jul, 2014 1 commit
-
-
Lorenz Hüdepohl authored
-
- 28 May, 2014 2 commits
-
-
Andreas Marek authored
-
Andreas Marek authored
-
- 13 May, 2014 1 commit
-
-
Lorenz Huedepohl authored
Additionally, one can now also measure loads and stores, and thus the memory bandwidth and, from that, the arithmetic intensity.

One caveat, though: the user is responsible for providing a meaningful value for the number of bytes transferred in one load/store, via the "bytes_per_ldsr" parameter of the new function %set_print_options. So far, I have no way of obtaining this value programmatically, and it can and will vary between different sections of a program. For example, an SSE movapd instruction loads/stores 16 bytes, but is still counted as one "load and store" instruction, just like a 1-byte mov.

Feel free to advise me on a better set of machine counters.

Also, somewhat updated documentation.
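To illustrate how strongly the reported numbers depend on "bytes_per_ldsr", here is a small sketch using the usual definitions (bandwidth as bytes moved per second, arithmetic intensity as FLOPs per byte moved). The function name, its arguments, and the counter values are made up for illustration, and 1 MB is assumed to be 10^6 bytes; the library's own formulas may differ in detail.

```python
def memory_metrics(flop_count, ldsr_count, bytes_per_ldsr, seconds):
    """Derive rates from raw counter values (hypothetical helper).

    flop_count     -- floating point operations counted
    ldsr_count     -- load and store instructions counted
    bytes_per_ldsr -- user-supplied guess of bytes moved per load/store
    seconds        -- wall-clock time of the measured section
    """
    bytes_moved = ldsr_count * bytes_per_ldsr
    return {
        "mflops": flop_count / seconds / 1e6,
        "bandwidth_mb_s": bytes_moved / seconds / 1e6,
        "arithmetic_intensity": flop_count / bytes_moved,  # FLOP per byte
    }


if __name__ == "__main__":
    # Same made-up counter values, two different guesses for bytes_per_ldsr:
    # 16 bytes (e.g. movapd) versus 8 bytes (one double precision value).
    for guess in (16, 8):
        m = memory_metrics(2_000_000_000, 1_000_000_000, guess, 2.0)
        print(guess, m)
```

Halving the guess halves the reported bandwidth and doubles the arithmetic intensity, while the MFlop/s figure is unaffected.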
-
- 06 May, 2014 1 commit
-
-
Lorenz Huedepohl authored
One can now select which kinds of measurements are taken by calling the member functions %measure_flops and %measure_memory of a timer_t object. For example:

  type(timer_t) :: timer

  call timer%measure_flops(.true.)
  call timer%measure_memory(.true.)

  call timer%enable()

An explicit ftimings_init() call is no longer necessary; PAPI will be initialized on the first %measure_flops(.true.) call.
-
- 02 May, 2014 2 commits
-
-
Lorenz Huedepohl authored
-
Lorenz Huedepohl authored
Besides some refactoring into four separate source files, support for recording values of performance counters via the PAPI library was added. At the moment, a FLOP count is measured and results are presented in timer_print as MFlop/s.
-