README.md 7.57 KB
Newer Older
Martin Reinecke's avatar
Martin Reinecke committed
1
2
Distinctly Useful Code Collection (DUCC)
========================================
Martin Reinecke's avatar
Martin Reinecke committed
3

Martin Reinecke's avatar
Martin Reinecke committed
4
5
6
7
This is a collection of basic programming tools for numerical computation,
including Fast Fourier Transforms, Spherical Harmonic Transforms, non-equispaced
Fourier transforms, as well as some concrete applications like 4pi convolution
on the sphere and gridding/degridding of radio interferometry data.
Martin Reinecke's avatar
Martin Reinecke committed
8

Martin Reinecke's avatar
Martin Reinecke committed
9
10
The code is written in C++17, but provides a simple and comprehensive Python
interface.
Martin Reinecke's avatar
Martin Reinecke committed
11
12
13

### Requirements

Martin Reinecke's avatar
Martin Reinecke committed
14
- [Python >= 3.6](https://www.python.org/)
Martin Reinecke's avatar
Martin Reinecke committed
15
- [pybind11](https://github.com/pybind/pybind11)
Martin Reinecke's avatar
Martin Reinecke committed
16
  (only during compiling/installation)
Martin Reinecke's avatar
Martin Reinecke committed
17
18
19
20
21
22
- a C++17-capable compiler, e.g.
  - `g++` 7 or later
  - `clang++`
  - MSVC 2019 or later
  - Intel icpx (oneAPI compiler series)
  Note that older Intel compilers (icc/icpc) are not supported.
Martin Reinecke's avatar
Martin Reinecke committed
23
24
25

### Sources

Martin Reinecke's avatar
Martin Reinecke committed
26
The latest version of DUCC can be obtained by cloning the repository via
Martin Reinecke's avatar
Martin Reinecke committed
27
28
29
30
31

    git clone https://gitlab.mpcdf.mpg.de/mtr/ducc.git

### Installation

Martin Reinecke's avatar
Martin Reinecke committed
32
DUCC can be installed using a simple `pip` invocation:
Martin Reinecke's avatar
Martin Reinecke committed
33

Martin Reinecke's avatar
Martin Reinecke committed
34
    pip3 install --user ducc0
Martin Reinecke's avatar
Martin Reinecke committed
35

Martin Reinecke's avatar
Martin Reinecke committed
36
37
38
This will download the source package and compile it to a binary that is
optimized for the host CPU, but will probably not run on other computers
with slightly different CPUs.
Martin Reinecke's avatar
Martin Reinecke committed
39

Martin Reinecke's avatar
Martin Reinecke committed
40
41
42
43
44
45
46
47
At the cost of performance (typically factors of 1.5 to 2) it is possible to
generate a binary which will work on all CPUs of a given architecture (e.g.
x86_64). This can be done via

    DUCC0_OPTIMIZATION=portable pip3 install --user ducc0

NOTE: compilation of the code can take a significant amount of time
(several minutes).
Martin Reinecke's avatar
Martin Reinecke committed
48

Martin Reinecke's avatar
Martin Reinecke committed
49

Martin Reinecke's avatar
Martin Reinecke committed
50
Installing multiple versions simultaneously
Martin Reinecke's avatar
Martin Reinecke committed
51
-------------------------------------------
Martin Reinecke's avatar
Martin Reinecke committed
52
53
54
55
56
57

The interfaces of the DUCC components are expected to evolve over time; whenever
an interface changes in a manner that is not backwards compatible, the DUCC
version number will increase. As a consequence it might happen that one part of
a Python code may use an older version of DUCC while at the same time another
part requires a newer version. Since DUCC's version number is included in the
Martin Reinecke's avatar
Martin Reinecke committed
58
module name itself (the module is not called `ducc`, but rather `ducc<X>`),
Martin Reinecke's avatar
Martin Reinecke committed
59
60
61
62
this is not a problem, as multiple DUCC versions can be installed
simultaneously.
The latest patch levels of a given DUCC version will always be available at the
HEAD of the git branch with the respective name. In other words, if you need
Martin Reinecke's avatar
fixes    
Martin Reinecke committed
63
the latest incarnation of DUCC 0, this will be on branch "ducc0" of the
Martin Reinecke's avatar
Martin Reinecke committed
64
65
66
67
git repository, and it will be installed as the package "ducc0".
Later versions will be maintained on new branches and will be installed as
"ducc1" and "ducc2", so that there will be no conflict with potentially
installed older versions.
Martin Reinecke's avatar
Martin Reinecke committed
68
69


Martin Reinecke's avatar
Martin Reinecke committed
70
71
72
73
74
75
76
DUCC components
===============

ducc.fft
--------

This package provides Fast Fourier, trigonometric and Hartley transforms with a
Martin Reinecke's avatar
Martin Reinecke committed
77
simple Python interface. It is an evolution of `pocketfft` and `pypocketfft`
Martin Reinecke's avatar
Martin Reinecke committed
78
which are currently used by `numpy` and `scipy`.
Martin Reinecke's avatar
Martin Reinecke committed
79

Martin Reinecke's avatar
Martin Reinecke committed
80
81
The central algorithms are derived from Paul Swarztrauber's
[FFTPACK](http://www.netlib.org/fftpack) code.
Martin Reinecke's avatar
Martin Reinecke committed
82

Martin Reinecke's avatar
Martin Reinecke committed
83
### Features
Martin Reinecke's avatar
Martin Reinecke committed
84
85
86
87
88
89
90
91
92
93
- supports fully complex and half-complex (i.e. complex-to-real and
  real-to-complex) FFTs, discrete sine/cosine transforms and Hartley transforms
- achieves very high accuracy for all transforms
- supports multidimensional arrays and selection of the axes to be transformed
- supports single, double, and long double precision
- makes use of CPU vector instructions when performing 2D and higher-dimensional
  transforms
- supports prime-length transforms without degrading to O(N**2) performance
- has optional multi-threading support for multidimensional transforms

Martin Reinecke's avatar
Martin Reinecke committed
94
### Design decisions and performance characteristics
Martin Reinecke's avatar
Martin Reinecke committed
95
96
97
98
99
- there is no internal caching of plans and twiddle factors, making the
  interface as simple as possible
- 1D transforms are significantly slower than those provided by FFTW (if FFTW's
  plan generation overhead is ignored)
- multi-D transforms in double precision perform fairly similar to FFTW with
Martin Reinecke's avatar
Martin Reinecke committed
100
  FFTW_MEASURE; in single precision `ducc.fft` can be significantly faster.
Martin Reinecke's avatar
Martin Reinecke committed
101
102
103
104
105

ducc.sht
--------

This package provides efficient spherical harmonic trasforms (SHTs). Its code
Martin Reinecke's avatar
Martin Reinecke committed
106
107
108
109
110
111
112
is derived from [libsharp](https://arxiv.org/abs/1303.4945), but has been
significantly enhanced.

### Noteworthy features
- support for any grid based on iso-latitude rings with equidistant pixels in
  each of the rings
- support for accurate spherical harmonic analyis on certain sub-classes of
Martin Reinecke's avatar
tweaks    
Martin Reinecke committed
113
114
  grids (Clenshaw-Curtis, Fejer-1 and McEwen-Wiaux) at band limits beyond those
  for which quadrature weights exist. For details see
Martin Reinecke's avatar
Martin Reinecke committed
115
116
117
  [this note](https://wwwmpa.mpa-garching.mpg.de/~martin/shtnote.pdf).
- substantially improved transformation speed (up to a factor of 2) on the
  above mentioned grid geometries for high band limits
Martin Reinecke's avatar
tweaks    
Martin Reinecke committed
118
- accelerated recurrences as presented in
Martin Reinecke's avatar
tweaks    
Martin Reinecke committed
119
  [Ishioka (2018)](https://www.jstage.jst.go.jp/article/jmsj/96/2/96_2018-019/_pdf)
Martin Reinecke's avatar
Martin Reinecke committed
120
121
- vector instruction support
- multi-threading support
Martin Reinecke's avatar
Martin Reinecke committed
122

Martin Reinecke's avatar
Martin Reinecke committed
123
The code for rotating spherical harmonic coefficients was taken (with some
124
modifications) from Mikael Slevinsky's
Martin Reinecke's avatar
Martin Reinecke committed
125
126
[FastTransforms package](https://github.com/MikaelSlevinsky/FastTransforms).

Martin Reinecke's avatar
Martin Reinecke committed
127
128
129
130

ducc.healpix
------------

Martin Reinecke's avatar
Martin Reinecke committed
131
This library provides Python bindings for the most important functionality
Martin Reinecke's avatar
Martin Reinecke committed
132
133
related to the [HEALPix](https://arxiv.org/abs/astro-ph/0409513) tesselation,
except for spherical harmonic transforms, which are covered by `ducc.sht`.
Martin Reinecke's avatar
Martin Reinecke committed
134
135
136
137

The design goals are
- similarity to the interface of the HEALPix C++ library
  (while respecting some Python peculiarities)
Martin Reinecke's avatar
Martin Reinecke committed
138
139
140
141
142
143
144
145
146
147
- simplicity (no optional function parameters)
- low function calling overhead


ducc.totalconvolve
------------------

Library for high-accuracy 4pi convolution on the sphere, which generates a
total convolution data cube from a set of sky and beam `a_lm` and computes
interpolated values for a given list of detector pointings.
Martin Reinecke's avatar
Martin Reinecke committed
148
This code has evolved from the original
Martin Reinecke's avatar
updates    
Martin Reinecke committed
149
[totalconvolver](https://arxiv.org/abs/astro-ph/0008227) algorithm
Martin Reinecke's avatar
Martin Reinecke committed
150
via the [conviqt](https://arxiv.org/abs/1002.1050) code.
Martin Reinecke's avatar
Martin Reinecke committed
151

Martin Reinecke's avatar
Martin Reinecke committed
152

Martin Reinecke's avatar
Martin Reinecke committed
153
### Algorithmic details:
Martin Reinecke's avatar
Martin Reinecke committed
154
- the code uses `ducc.sht` SHTs and `ducc.fft` FFTs to compute the data cube
Martin Reinecke's avatar
Martin Reinecke committed
155
156
- shared-memory parallelization is provided via standard C++ threads.
- for interpolation, the algorithm and kernel described in
Martin Reinecke's avatar
Martin Reinecke committed
157
  <https://arxiv.org/abs/1808.06736> are used. This allows very efficient
Martin Reinecke's avatar
Martin Reinecke committed
158
159
160
161
162
163
  interpolation with user-adjustable accuracy.


ducc.wgridder
-------------

Martin Reinecke's avatar
Martin Reinecke committed
164
165
166
167
Library for high-accuracy gridding/degridding of radio interferometry datasets
(code paper available at <https://arxiv.org/abs/2010.10122>).
This code has also been integrated into
[wsclean](https://gitlab.com/aroffringa/wsclean)
Martin Reinecke's avatar
Martin Reinecke committed
168
(<https://arxiv.org/abs/1407.1943>)
Martin Reinecke's avatar
Martin Reinecke committed
169
as the `wgridder` component.
Martin Reinecke's avatar
Martin Reinecke committed
170

Martin Reinecke's avatar
Martin Reinecke committed
171
### Programming aspects
Martin Reinecke's avatar
Martin Reinecke committed
172
- shared-memory parallelization via standard C++ threads.
Martin Reinecke's avatar
Martin Reinecke committed
173
174
175
- kernel computation is performed on the fly, avoiding inaccuracies
  due to table lookup and reducing overall memory bandwidth

Martin Reinecke's avatar
Martin Reinecke committed
176
### Numerical aspects
Martin Reinecke's avatar
Martin Reinecke committed
177
- uses the analytical gridding kernel presented in
Martin Reinecke's avatar
Martin Reinecke committed
178
  <https://arxiv.org/abs/1808.06736>
Martin Reinecke's avatar
Martin Reinecke committed
179
- uses the "improved W-stacking method" described in
Martin Reinecke's avatar
Martin Reinecke committed
180
  <https://arxiv.org/abs/2101.11172>
Martin Reinecke's avatar
Martin Reinecke committed
181
182
183
- in combination these two aspects allow extremely accurate gridding/degridding
  operations (L2 error compared to explicit DFTs can go below 1e-12) with
  reasonable resource consumption
184
185
186
187
188
189
190
191


ducc.misc
---------

Various unsorted functionality which will hopefully be categorized in the
future.

Martin Reinecke's avatar
Martin Reinecke committed
192
193
194
195
196
This module contains an efficient algorithm for the computation of abscissas and
weights for Gauss-Legendre quadrature. For degrees up to 100, the solutions are
computed in the standard iterative fashion; for higher degrees Ignace Bogaert's
[FastGL algorithm](https://epubs.siam.org/doi/pdf/10.1137/140954969)
is used.