Update diagnosticfiles
authored
Sep 07, 2019
by
Rainer Weinberger
userguide/diagnosticfiles.md
View page @
3500e34e
...
@@ -21,18 +21,18 @@ Table of contents
# Diagnostic output

Arepo will not only output the simulation snapshot and reduced data
via the halo-finder files, but also a number of (mostly ASCII)
diagnostic log files which contain important information about the
code performance and runtime behavior.

In practice, to quickly check the performance of large production
runs, it is useful to check the ``timebins.txt`` and ``cpu.txt``
files. The former reports how many simulation elements are evolved
with which timestep sizes, i.e. characteristics of the simulated
system; the latter reports the computational time spent in each part
of the code, which can be influenced to some degree by the values of
the code parameters.

For ongoing simulations, these can be checked via
...
@@ -43,18 +43,18 @@ For ongoing simulations, these can be checked via
stdout
======

The standard output contains general information about the simulation
status, and many of the main routines print general information to
it. The output itself is mainly relevant for reconstructing what the
simulation did, which is needed e.g. for debugging purposes.

balance.txt
===========

Output of the fractional cpu time used in each individual step,
optimized to be machine readable (while cpu.txt is more human
readable).

Symbol key

    total = '-' / '-'
    treegrav = 'a' / ')'
...
@@ -116,21 +116,24 @@ example:
cpu.txt
=======

One block is written to this file for each sync-point. It reports
the measurements of the different timers built into Arepo. Each
computationally expensive operation has a different timer attached to
it, which makes it possible to monitor closely where the computational
time is spent. Some of the timers (e.g. treegrav) have sub-timers for
individual operations. This is denoted by the indentation hierarchy
in the first column. The fraction of time spent in different code
parts, as well as the absolute amount, is highly problem
dependent. The timers make it possible to identify inefficient parts
of the overall algorithm and concentrate on the most time-consuming
parts of the code. There is also the option ``OUTPUT_CPU_CSV``, which
returns the same data as a more easily machine-readable ``cpu.csv``
file.

The different columns are: name; wallclock time (in s) this step;
percentage this step; wallclock time (in s) cumulative; percentage up
to this step. A typical block of cpu.txt looks like the following
(here a gravity-only, tree-only run):

    Step 131, Time: 0.197266, CPUs: 1, MultiDomains: 8, HighestActiveTimeBin: 20
...
@@ -183,10 +186,10 @@ the following (here a gravity-only, tree-only run):
domain.txt
==========

The load balance (cpu work and memory) of both the gravity and the
hydro calculations is reported for each timebin individually, at
every sync-point. A perfectly balanced run has a value of 1; the
higher the value, the more imbalanced the simulation.

    DOMAIN BALANCE, Sync-Point 13314, Time: 0.997486
    Timebins: Gravity Hydro cumulative grav-balance
...
@@ -208,11 +211,11 @@ energy.txt
==========

At specified intervals (in simulation time, set by the parameter
`TimeBetStatistics`) the total energy and its components are computed
and written into the file `energy.txt`. This file also contains the
cumulative energy that had to be injected into the system to ensure
positivity of the thermal energy. All output is in code units. Note:
this only works with up to 6 particle types. The columns are

1. simulation time / scale factor
2. total thermal energy
...
@@ -245,7 +248,7 @@ The columns are
29. total injected energy due to positivity enforcement of thermal energy

Two example lines:

    0.96967 3.29069e+06 0 4.27406e+07 3.29069e+06 0 1.65766e+06 0 0 3.9302e+07 0 0 0 0 0 0 0 0 1.78097e+06 0 0 0 503.489 3047.89 0 0 65.5756
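Since `energy.txt` is a plain whitespace-separated table, a line can be split into numbers and addressed by the column numbers listed above. The following is a minimal sketch, not part of Arepo; the function name is hypothetical, and the sample string holds only the leading values of the example line that are visible in this excerpt.

```python
# Leading values of one energy.txt line (the excerpt truncates the rest).
SAMPLE_PREFIX = (
    "0.96967 3.29069e+06 0 4.27406e+07 3.29069e+06 0 1.65766e+06 0 0 "
    "3.9302e+07 0 0 0 0 0 0 0 0 1.78097e+06 0 0 0 503.489 3047.89 0 0 65.5756"
)


def parse_energy_line(line):
    """Split one whitespace-separated energy.txt line into floats."""
    return [float(token) for token in line.split()]


values = parse_energy_line(SAMPLE_PREFIX)

# Column numbers in the documentation are 1-based, Python lists are 0-based.
time_or_scale_factor = values[0]   # column 1
total_thermal_energy = values[1]   # column 2
# On a complete line, column 29 (injected energy) would be values[28].
```

All values are in code units, as stated above.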
...
@@ -257,8 +260,8 @@ Two example lines
info.txt
========

At every sync-point, the time bin, time, timestep and number of
active particles are written into this file, e.g.

    Sync-Point 13327, TimeBin=16, Time: 0.999408, Redshift: 0.000592464,
    Systemstep: 0.000147974, Dloga: 0.000148072, Nsync-grv: 17679,
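For quick checks it can help to turn such an entry into a dictionary. The sketch below is not part of Arepo; it assumes only the `key: value` and `key=value` layout visible in the excerpt above, and the function name is hypothetical.

```python
import re

SAMPLE = (
    "Sync-Point 13327, TimeBin=16, Time: 0.999408, Redshift: 0.000592464, "
    "Systemstep: 0.000147974, Dloga: 0.000148072, Nsync-grv: 17679,"
)

# A key, then ':' or '=', then a number (integer, decimal or exponent form).
_FIELD = re.compile(
    r"([A-Za-z][\w-]*)\s*[:=]\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)"
)


def parse_info_entry(text):
    """Return the fields of one info.txt entry as a dict of numbers."""
    fields = {}
    # The sync-point counter has no ':' or '=', so it is handled separately.
    match = re.search(r"Sync-Point\s+(\d+)", text)
    if match:
        fields["Sync-Point"] = int(match.group(1))
    for key, value in _FIELD.findall(text):
        try:
            fields[key] = int(value)
        except ValueError:
            fields[key] = float(value)
    return fields


info = parse_info_entry(SAMPLE)
```

Here `info["TimeBin"]` is the integer 16 and `info["Time"]` the float 0.999408. Fields that the excerpt truncates are simply not present in the result.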
...
@@ -267,14 +270,15 @@ are written into this file, e.g.
memory.txt
==========

Arepo uses its own internal memory manager. This means that one large
chunk of memory is reserved initially for Arepo (specified by the
parameter `MaxMemSize`), and the allocation for individual arrays is
then handled internally from this pool. The reason for introducing
this was to avoid memory fragmentation during runtime on some
machines, but also to have detailed information about how much memory
Arepo actually needs and to terminate if this exceeds a pre-defined
threshold. ``memory.txt`` reports this internal memory usage, and how
much memory is actually needed by the simulation.

    MEMORY: Largest Allocation = 816.742 Mbyte | Largest Allocation Without Generic = 132.938 Mbyte
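The summary line of a `memory.txt` block can be read programmatically, for example to track the largest allocation over a run. This is a minimal sketch that assumes only the `label = value Mbyte | ...` layout of the line shown above; the function name is hypothetical and not part of Arepo.

```python
import re

SAMPLE = (
    "MEMORY: Largest Allocation = 816.742 Mbyte | "
    "Largest Allocation Without Generic = 132.938 Mbyte"
)


def parse_memory_summary(line):
    """Map each label on the MEMORY summary line to its size in Mbyte."""
    _, _, body = line.partition(":")
    sizes = {}
    for part in body.split("|"):
        match = re.match(r"\s*(.+?)\s*=\s*([0-9.]+)\s*Mbyte", part)
        if match:
            sizes[match.group(1)] = float(match.group(2))
    return sizes


sizes = parse_memory_summary(SAMPLE)
largest = sizes["Largest Allocation"]
```

Comparing `largest` against the configured `MaxMemSize` then indicates how much headroom the run has, provided both are expressed in the same unit.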
...
@@ -462,13 +466,15 @@ is actually needed by the simulation.
sfr.txt
=======

If ``USE_SFR`` is active, Arepo will create a ``sfr.txt`` file,
which reports the stars created in every call of the star-formation
routine. The individual columns are:

* time (code units or scale factor),
* total stellar mass to be formed in this timestep prior to stochastic
  sampling (code units),
* instantaneous star formation rate of all cells (Msun/yr),
* instantaneous star formation rate of active cells (Msun/yr),
* total mass in stars formed in this timestep (after sampling) (code units),
...
@@ -490,13 +496,15 @@ timebins.txt
============

Arepo is optimized for time-integrating both hydrodynamical and
gravitational interactions on the largest timestep that is allowed by
both the timestep criterion and the binary hierarchy of time steps.
For each timestep, a linked list of the particles on this particular
integration step exists, and their statistics are reported in
`timebins.txt`. In this file, the number of gas cells and
collisionless particles in each timebin (i.e. integration timestep) is
reported for each sync-point, as well as the cpu time and the fraction
of the total cost contributed by each timebin. A typical block looks
like

    Sync-Point 2658, Time: 0.307419, Redshift: 2.25289, Systemstep: 9.1027e-05, Dloga: 0.000296144
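The idea of a binary hierarchy of time steps can be illustrated with a short sketch: starting from a maximum step, the step is halved until it satisfies the timestep criterion. This is an illustration of the general principle only, not Arepo's implementation; the function name, its arguments and the `level` counter are hypothetical and do not correspond to Arepo's timebin numbering.

```python
from typing import Tuple


def power_of_two_step(dt_max: float, dt_criterion: float) -> Tuple[int, float]:
    """Return (level, dt) where dt = dt_max / 2**level is the largest
    power-of-two subdivision of dt_max that does not exceed dt_criterion."""
    if dt_max <= 0.0 or dt_criterion <= 0.0:
        raise ValueError("time steps must be positive")
    level = 0
    dt = dt_max
    while dt > dt_criterion:
        dt /= 2.0   # descend one level in the binary hierarchy
        level += 1
    return level, dt


# A criterion of 0.3 with a maximum step of 1.0 yields the step 0.25.
level, dt = power_of_two_step(1.0, 0.3)
```

Elements that end up with the same step can be advanced together, which is what makes per-timebin statistics such as those in `timebins.txt` meaningful.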
...
@@ -513,12 +521,13 @@ as the cpu time and fraction spent on each timebin. A typical bock looks like
    ------------------------
    Total active: 185 143

timings.txt
===========

The performance of the gravitational tree algorithm is reported in
`timings.txt` for each sync-point. The entry for a single sync-point
looks like the following

    Step(*): 372, t: 0.0455302, dt: 0.000215226, highest active timebin: 19 (lowest active: 19, highest occupied: 19)
...
@@ -529,3 +538,5 @@ the following:
    maximum number of nodes: 31091, filled: 0.677246
    avg times: all=0.519064 tree1=0.515797 tree2=0 commwait=1.00136e-05 sec
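The `avg times` line lends itself to simple scripted monitoring, e.g. to see which share of the tree time goes into communication waits. The sketch below assumes only the `name=value` layout of the line shown above; the function name is hypothetical and not part of Arepo.

```python
SAMPLE = "avg times: all=0.519064 tree1=0.515797 tree2=0 commwait=1.00136e-05 sec"


def parse_avg_times(line):
    """Return the name=value timers of an 'avg times' line, in seconds."""
    _, _, body = line.partition(":")
    timers = {}
    for token in body.split():
        if "=" in token:            # skips the trailing unit 'sec'
            name, value = token.split("=", 1)
            timers[name] = float(value)
    return timers


timers = parse_avg_times(SAMPLE)

# Fraction of the total average time spent waiting for communication.
commwait_fraction = timers["commwait"] / timers["all"]
```

For the sample line this fraction is of order 1e-5, i.e. communication waits are negligible in that step.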