... | ... | @@ -21,18 +21,18 @@ Table of contents |
|
|
# Diagnostic output
|
|
|
|
|
|
|
|
|
Arepo will not only output the simulation snapshot and reduced data via
|
|
|
the halo-finder files, but also a number of (mostly ascii) diagnostic log-
|
|
|
files which contain important information about the code performance and
|
|
|
runtime behavior.
|
|
|
|
|
|
In practice, to quickly check the performance of large
|
|
|
production runs, it is useful to check the ``timebins.txt``
|
|
|
and ``cpu.txt`` files. The former will give information how many simulation
|
|
|
elements are on which timestep, i.e. characteristics of the system
|
|
|
simulated, the latter provides information about the computational
|
|
|
time spent in each part of the code, which can be influenced to some
|
|
|
degree by the values of the code parameters.
|
|
|
Arepo will not only output the simulation snapshot and reduced data
|
|
|
via the halo-finder files, but also a number of (mostly ascii)
|
|
|
diagnostic log files which contain important information about the
|
|
|
code performance and runtime behavior.
|
|
|
|
|
|
In practice, to quickly check the performance of large production
|
|
|
runs, it is useful to check the ``timebins.txt`` and ``cpu.txt``
|
|
|
files. The former gives information on how many simulation elements
|
|
|
are evolved with which timestep sizes, i.e. characteristics of the
|
|
|
system simulated; the latter provides information about the
|
|
|
computational time spent in each part of the code, which can be
|
|
|
influenced to some degree by the values of the code parameters.
|
|
|
|
|
|
For ongoing simulations, these can be checked via
|
|
|
|
... | ... | @@ -43,18 +43,18 @@ For ongoing simulations, these can be checked via |
|
|
stdout
|
|
|
======
|
|
|
|
|
|
The standard output contains general information about the simulation status
|
|
|
and many of the main routines will print general information in it.
|
|
|
The output itself is mainly relevant for reconstructing what the simulation
|
|
|
did, which is needed e.g. for debugging purposes.
|
|
|
The standard output contains general information about the simulation
|
|
|
status and many of the main routines will print general information in
|
|
|
it. The output itself is mainly relevant for reconstructing what the
|
|
|
simulation did, which is needed e.g. for debugging purposes.
|
|
|
|
|
|
balance.txt
|
|
|
===========
|
|
|
|
|
|
Output of fractional cpu time used in each individual step, optimized to be
|
|
|
machine readable (while cpu.txt is more human readable).
|
|
|
Output of fractional cpu time used in each individual step, optimized
|
|
|
to be machine readable (while cpu.txt is more human readable).
|
|
|
|
|
|
Symbol key:
|
|
|
Symbol key
|
|
|
|
|
|
total = '-' / '-'
|
|
|
treegrav = 'a' / ')'
|
... | ... | @@ -116,21 +116,24 @@ example: |
|
|
cpu.txt
|
|
|
=======
|
|
|
|
|
|
Each sync-point, such a block is written. This file
|
|
|
reports the result of the different timers built into Arepo. Each
|
|
|
computationally expensive operation has a different timer attached to it and
|
|
|
this way allows to closely monitor what the computational time is spent on.
|
|
|
Some of the timers (e.g. treegrav) have sub-timers for individual operations.
|
|
|
This is denoted by the indentation hierarchy in the first column.
|
|
|
The distribution of these timings is highly problem dependent, but it is
|
|
|
possible to identify inefficient parts of the overall algorithm and optimize
|
|
|
only the most time-consuming parts of the code. There is the option
|
|
|
``OUTPUT_CPU_CSV`` which also returns this data as a ``cpu.csv`` file.
|
|
|
|
|
|
The different columns are:
|
|
|
name; wallclock time (in s) this step; percentage this step; wallclock time
|
|
|
(in s) cumulative; percentage up to this step. A typical block of cpu.txt looks
|
|
|
the following (here a gravity-only, tree-only run):
|
|
|
For each sync-point, such a block is written. This file reports
|
|
|
measurements of the different timers built into Arepo. Each
|
|
|
computationally expensive operation has a different timer attached to
|
|
|
it, which makes it possible to closely monitor where the computational time is
|
|
|
spent. Some of the timers (e.g. treegrav) have sub-timers for
|
|
|
individual operations. This is denoted by the indentation hierarchy
|
|
|
in the first column. The fraction of time spent in different code
|
|
|
parts, as well as the absolute amount, is highly problem
|
|
|
dependent. The timers make it possible to identify inefficient parts
|
|
|
of the overall algorithm and concentrate on the most time-consuming
|
|
|
parts of the code. There is also the option ``OUTPUT_CPU_CSV`` which
|
|
|
returns the same data as a more easily machine-readable ``cpu.csv``
|
|
|
file.
|
|
|
|
|
|
The different columns are: name; wallclock time (in s) this step;
|
|
|
percentage this step; wallclock time (in s) cumulative; percentage up
|
|
|
to this step. A typical block of cpu.txt looks like the following (here a
|
|
|
gravity-only, tree-only run):
|
|
|
|
|
|
Step 131, Time: 0.197266, CPUs: 1, MultiDomains: 8, HighestActiveTimeBin: 20
|
... | ... | @@ -180,13 +183,13 @@ the following (here a gravity-only, tree-only run): |
|
|
misc 0.00 0.0% 0.02 0.0%
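
The five columns described above can be read back with a few lines of Python. The following is only a minimal sketch: the function name `parse_cpu_row` and the regular expression are illustrative and not part of Arepo, and the sketch assumes that every timer row has exactly the layout shown in the block above (name, seconds, percentage, seconds, percentage).

```python
import re

# A non-negative decimal number such as "0.02" or "12".
NUM = r"(\d+(?:\.\d+)?)"

# One timer row: name, wallclock seconds this step, percentage this step,
# cumulative wallclock seconds, cumulative percentage.
ROW = re.compile(
    r"^(\s*)(\S.*?)\s+" + NUM + r"\s+" + NUM + r"%\s+" + NUM + r"\s+" + NUM + r"%\s*$"
)


def parse_cpu_row(line):
    """Return a dict for a timer row, or None for any other line."""
    m = ROW.match(line)
    if m is None:
        return None  # e.g. the "Step ..., Time: ..." header line
    indent, name, step_s, step_pct, cum_s, cum_pct = m.groups()
    return {
        "name": name,
        "depth": len(indent),  # leading spaces encode the sub-timer hierarchy
        "step_s": float(step_s),
        "step_pct": float(step_pct),
        "cum_s": float(cum_s),
        "cum_pct": float(cum_pct),
    }


row = parse_cpu_row("   misc   0.00   0.0%   0.02   0.0%")
header = parse_cpu_row(
    "Step 131, Time: 0.197266, CPUs: 1, MultiDomains: 8, HighestActiveTimeBin: 20"
)
```

Lines that do not match the row pattern return `None`, so a whole file can be filtered line by line. If ``OUTPUT_CPU_CSV`` is enabled, reading ``cpu.csv`` is likely the simpler route.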
|
|
|
|
|
|
|
|
|
domain.txt
|
|
|
|
|
|
==========
|
|
|
|
|
|
The load-balancing (cpu work and memory) both in gravity and hydro calculation
|
|
|
are reported for each timebin individually. Reported every sync-point.
|
|
|
Ideally balanced runs have the value 1, the higher the value, the more
|
|
|
imbalanced the simulation.
|
|
|
The load-balancing (cpu work and memory) both for gravity and hydro
|
|
|
calculations is reported for each timebin individually. Reported every
|
|
|
sync-point. Ideally balanced runs have a value of 1; the higher the
|
|
|
value, the more imbalanced the simulation.
|
|
|
|
|
|
DOMAIN BALANCE, Sync-Point 13314, Time: 0.997486
|
|
|
Timebins: Gravity Hydro cumulative grav-balance
|
... | ... | @@ -204,15 +207,15 @@ imbalanced the simulation. |
|
|
-------------------------------------------------------------------------------------
|
|
|
|
|
|
energy.txt
|
|
|
|
|
|
==========
|
|
|
|
|
|
In specified intervals (in simulation time, specified by the parameter
|
|
|
`TimeBetStatistics`) the total energy and its components are computed and
|
|
|
written into `energy.txt`. This file also contains the cumulative energy
|
|
|
that had to be injected into the system to ensure positivity in thermal energy.
|
|
|
All output in code units. Note: this only works with up to 6 particle types.
|
|
|
The columns are
|
|
|
In specified intervals (in simulation time, specified by the parameter
|
|
|
`TimeBetStatistics`) the total energy and its components are computed
|
|
|
and written into the file `energy.txt`. This file also contains the
|
|
|
cumulative energy that had to be injected into the system to ensure
|
|
|
positivity in thermal energy. All output is in code units. Note: this
|
|
|
only works with up to 6 particle types. The columns are
|
|
|
|
|
|
1. simulation time / scale factor
|
|
|
2. total thermal energy
|
... | ... | @@ -245,7 +248,7 @@ The columns are |
|
|
29. total injected energy due to positivity enforcement of thermal energy
|
|
|
|
|
|
Two example lines
|
|
|
Two example lines:
|
|
|
|
|
|
0.96967 3.29069e+06 0 4.27406e+07 3.29069e+06 0 1.65766e+06 0 0 3.9302e+07 0 0 0 0 0 0 0 0 1.78097e+06 0 0 0 503.489 3047.89 0 0 65.5756
|
... | ... | @@ -254,11 +257,11 @@ Two example lines |
|
|
4203e+07 0 0 0 0 0 0 0 0 1.76774e+06 0 0 0 503.306 3047.89 0 0 65.7586 0 7.71477
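
To monitor the energy budget, a line of `energy.txt` can be split on whitespace and indexed by column. The sketch below is an illustration under stated assumptions, not Arepo code. It assumes 29 whitespace-separated columns per line, as in the list above, and it reads only the columns that list names explicitly: time or scale factor (1), total thermal energy (2) and total injected energy (29). The helper name `parse_energy_line` and the example values are made up.

```python
N_COLUMNS = 29  # assumption: one value per entry of the column list above


def parse_energy_line(line):
    """Extract time, total thermal energy and injected energy from one line."""
    values = [float(x) for x in line.split()]
    if len(values) != N_COLUMNS:
        raise ValueError(f"expected {N_COLUMNS} columns, got {len(values)}")
    return {
        "time": values[0],       # column 1: simulation time / scale factor
        "thermal": values[1],    # column 2: total thermal energy
        "injected": values[28],  # column 29: energy injected to keep u > 0
    }


# Synthetic line for illustration only (not real simulation output).
example = " ".join(["0.5", "2.0"] + ["0"] * 26 + ["7.5"])
record = parse_energy_line(example)
```

All values are in code units, so comparing `injected` with `thermal` gives a rough, unit-free sense of how much the positivity enforcement contributes.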
|
|
|
|
|
|
info.txt
|
|
|
|
|
|
========
|
|
|
|
|
|
Every sync-point, the time-bins, time, timestep and number of active particles
|
|
|
are written into this file, e.g.
|
|
|
Every sync-point, the time-bins, time, timestep and number of active
|
|
|
particles are written into this file, e.g.
|
|
|
|
|
|
Sync-Point 13327, TimeBin=16, Time: 0.999408, Redshift: 0.000592464, Systemstep: 0.000147974, Dloga: 0.000148072, Nsync-grv: 17679,
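
A line of this form consists of `key: value` and `key=value` pairs, so it can be turned into a dictionary with one regular expression. This is a minimal sketch; the function name `parse_info_line` and the pattern are illustrative assumptions, and only the fields visible in the excerpt above have been checked against it.

```python
import re

# A key (letters, digits, '_' or '-') followed by ':' or '=' and a number.
PAIR = re.compile(
    r"([A-Za-z][\w-]*)\s*[:=]\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)"
)


def parse_info_line(line):
    """Collect all numeric key/value pairs of one info.txt line."""
    fields = {key: float(value) for key, value in PAIR.findall(line)}
    # The sync-point counter has no ':' or '=' and is picked up separately.
    sync = re.search(r"Sync-Point\s+(\d+)", line)
    if sync:
        fields["Sync-Point"] = int(sync.group(1))
    return fields


info = parse_info_line(
    "Sync-Point 13327, TimeBin=16, Time: 0.999408, Redshift: 0.000592464, "
    "Systemstep: 0.000147974, Dloga: 0.000148072, Nsync-grv: 17679,"
)
```

Collecting `Time` and `Systemstep` over all lines gives a quick view of how the system timestep evolves during a run.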
|
... | ... | @@ -267,14 +270,15 @@ are written into this file, e.g. |
|
|
memory.txt
|
|
|
==========
|
|
|
|
|
|
Arepo internally uses an own memory manager. This means that one large chunk of
|
|
|
memory is reserved initially for Arepo (specified by the parameter
|
|
|
`MaxMemSize`) and allocation for individual arrays is handled internally.
|
|
|
The reason for introducing this was to avoid memory fragmentation during
|
|
|
runtime on some machines, but also to have detailed information about how much
|
|
|
memory Arepo actually needs and to terminate if this exceeds a pre-defined
|
|
|
threshold. ``memory.txt`` reports this internal memory usage, and how much memory
|
|
|
is actually needed by the simulation.
|
|
|
Arepo uses its own internal memory manager. This means that one large
|
|
|
chunk of memory is reserved initially for Arepo (specified by the
|
|
|
parameter `MaxMemSize`), and the allocation for individual arrays is
|
|
|
then handled internally from this pool. The reason for introducing
|
|
|
this was to avoid memory fragmentation during runtime on some
|
|
|
machines, but also to have detailed information about how much memory
|
|
|
Arepo actually needs and to terminate if this exceeds a pre-defined
|
|
|
threshold. ``memory.txt`` reports this internal memory usage, and how
|
|
|
much memory is actually needed by the simulation.
|
|
|
|
|
|
MEMORY: Largest Allocation = 816.742 Mbyte | Largest Allocation Without Generic = 132.938 Mbyte
|
... | ... | @@ -462,41 +466,45 @@ is actually needed by the simulation. |
|
|
sfr.txt
|
|
|
=======
|
|
|
|
|
|
In case ``USE_SFR`` is active, Arepo will create a ``sfr.txt`` file, which reports
|
|
|
the stars created in every call of the star-formation routine.
|
|
|
In case ``USE_SFR`` is active, Arepo will create a ``sfr.txt`` file,
|
|
|
which reports the stars created in every call of the star-formation
|
|
|
routine.
|
|
|
|
|
|
The individual columns are:
|
|
|
|
|
|
* time (code units or scale factor)
|
|
|
* total stellar mass to be formed in timestep prior to stochastic sampling (code units),
|
|
|
* total stellar mass to be formed in timestep prior to stochastic
|
|
|
sampling (code units),
|
|
|
* instantaneous star formation rate of all cells (Msun/yr),
|
|
|
* instantaneous star formation rate of active cells (Msun/yr),
|
|
|
* total mass in stars formed in this timestep (after sampling) (code units),
|
|
|
* cumulative stellar mass formed (code units).
|
|
|
|
|
|
Example:

4.373019e-01 9.714635e-03 1.100743e+02 1.405136e+02 2.280941e-02 2.752464e+01
4.373667e-01 4.007455e-04 1.104648e+02 5.795346e+00 0.000000e+00 2.752464e+01
4.374315e-01 2.009357e-02 1.104276e+02 2.905270e+02 0.000000e+00 2.752464e+01
4.374962e-01 3.904148e-04 1.103389e+02 5.643836e+00 0.000000e+00 2.752464e+01
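
Since `sfr.txt` is a plain table with six columns, it is easy to load for plotting a star formation history. The sketch below parses the four example lines shown above. The field names of `SfrRow` are chosen here for readability and are not Arepo identifiers.

```python
from typing import NamedTuple


class SfrRow(NamedTuple):
    time: float             # code units or scale factor
    mass_expected: float    # mass to be formed before sampling (code units)
    sfr_all: float          # SFR of all cells (Msun/yr)
    sfr_active: float       # SFR of active cells (Msun/yr)
    mass_formed: float      # mass formed after sampling (code units)
    mass_cumulative: float  # cumulative stellar mass formed (code units)


def parse_sfr(text):
    """Parse the contents of sfr.txt, skipping lines without six columns."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue
        rows.append(SfrRow(*map(float, fields)))
    return rows


SAMPLE = """\
4.373019e-01 9.714635e-03 1.100743e+02 1.405136e+02 2.280941e-02 2.752464e+01
4.373667e-01 4.007455e-04 1.104648e+02 5.795346e+00 0.000000e+00 2.752464e+01
4.374315e-01 2.009357e-02 1.104276e+02 2.905270e+02 0.000000e+00 2.752464e+01
4.374962e-01 3.904148e-04 1.103389e+02 5.643836e+00 0.000000e+00 2.752464e+01
"""

rows = parse_sfr(SAMPLE)
formed = sum(r.mass_formed for r in rows)  # mass formed over these four calls
```

For a real run, the same function can be applied to the file contents, e.g. `parse_sfr(open("sfr.txt").read())`, with the path adjusted to the output directory.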
|
|
|
|
|
|
|
|
|
timebins.txt
|
|
|
============
|
|
|
|
|
|
Arepo is optimized for time-integrating both hydrodynamical as well as
|
|
|
gravitational interactions on the largest possible timestep that is allowed by
|
|
|
the timestep criterion and allowed by the binary hierarchy of time steps.
|
|
|
For each timestep, a linked list of particles on this particular
|
|
|
integration step exists, and their statistics are reported in `timebins.txt`.
|
|
|
In this file, the number of gas cells and collisionless particles in each
|
|
|
timebin (i.e. integration timestep) is reported for each sync-point, as well
|
|
|
as the cpu time and fraction spent on each timebin. A typical block looks like
|
|
|
Arepo is optimized for time-integrating both hydrodynamical as well as
|
|
|
gravitational interactions on the largest possible timestep that is
|
|
|
allowed by the timestep criterion and allowed by the binary hierarchy
|
|
|
of time steps. For each timestep, a linked list of particles on this
|
|
|
particular integration step exists, and their statistics are reported
|
|
|
in `timebins.txt`. In this file, the number of gas cells and
|
|
|
collisionless particles in each timebin (i.e. integration timestep) is
|
|
|
reported for each sync-point, as well as the cpu time and the fraction
|
|
|
of the total cost contributed by each timebin. A typical block looks
|
|
|
like
|
|
|
|
|
|
Sync-Point 2658, Time: 0.307419, Redshift: 2.25289, Systemstep: 9.1027e-05, Dloga: 0.000296144
|
... | ... | @@ -513,12 +521,13 @@ as the cpu time and fraction spent on each timebin. A typical bock looks like |
|
|
------------------------
|
|
|
Total active: 185 143
|
|
|
|
|
|
timings.txt
|
|
|
|
|
|
|
|
|
===========
|
|
|
|
|
|
The performance of the gravitational tree algorithm is reported in
|
|
|
`timings.txt` for each sync-point. An example of a single sync-point looks
|
|
|
the following:
|
|
|
The performance of the gravitational tree algorithm is reported in
|
|
|
`timings.txt` for each sync-point. An example of a single sync-point
|
|
|
looks like the following:
|
|
|
|
|
|
Step(*): 372, t: 0.0455302, dt: 0.000215226, highest active timebin: 19 (lowest active: 19, highest occupied: 19)
|
... | ... | @@ -529,3 +538,5 @@ the following: |
|
|
maximum number of nodes: 31091, filled: 0.677246
|
|
|
avg times: all=0.519064 tree1=0.515797 tree2=0 commwait=1.00136e-05 sec
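
The `avg times` line is convenient for tracking how much of the tree gravity time is spent waiting on communication. The following sketch extracts its `key=value` pairs. The function name `parse_avg_times` is hypothetical, and the ratio computed at the end is just one possible diagnostic, not something Arepo reports itself.

```python
import re

# key=value pairs with a plain or exponent-format number as the value.
KEY_VALUE = re.compile(r"(\w+)=([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)")


def parse_avg_times(line):
    """Return the timers of an 'avg times' line in seconds, else None."""
    if not line.strip().startswith("avg times:"):
        return None
    return {key: float(value) for key, value in KEY_VALUE.findall(line)}


times = parse_avg_times(
    "avg times: all=0.519064 tree1=0.515797 tree2=0 commwait=1.00136e-05 sec"
)
# Fraction of the averaged time spent waiting for communication.
wait_fraction = times["commwait"] / times["all"]
```

A wait fraction that grows over the course of a run may hint at a load imbalance, which can then be cross-checked against `domain.txt`.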
|
|
|
|
|
|
|