EDD backend Control with ansible
================================

[Ansible](https://www.ansible.com/) is an IT automation tool written in
Python that requires only an SSH login on the host to control its configuration.
We use it here to manage the EDD backend. The [Ansible
documentation](https://docs.ansible.com/ansible/latest/user_guide/intro_getting_started.html)
is a good entry point to get familiar with the tool.

## Ansible basics
Ansible terminology is derived from theater: roles are assigned and
a play is performed.

In Ansible terminology, a **host** assumes one or more **roles**. Setting up
all roles is the **play** applied to all resources in the **inventory**.
Setting up a **role** consists of one or more **tasks**. For the EDD, the
available services, e.g.

  - GatedSpectrometer
  - VDIF Conversion
  - CriticallySampledPFB
  - ...

are roles assigned to individual hosts.

### Inventories
All hosts and global variables for an individual site are collected in the
corresponding inventory directory, e.g. `effelsberg`. For example, you can ping
all hosts via the command:

`$ ansible -i effelsberg -m ping all`
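
As a rough illustration of what such an inventory might contain, here is a minimal sketch in Ansible's YAML inventory format. The file name, the host names, and the placement of `data_base_path` as a global variable are assumptions made for illustration; only the group name `gpu_server` and the path `/media/scratch` appear elsewhere in this README.

```yaml
# effelsberg/hosts.yml (hypothetical file name and hosts)
all:
  vars:
    # global variable for the site; /media/scratch is the Effelsberg
    # default mentioned in this README
    data_base_path: /media/scratch
  children:
    gpu_server:
      hosts:
        gpu-node-00:   # hypothetical host name
        gpu-node-01:   # hypothetical host name
```

With an inventory like this, a pattern such as `gpu_server[0]` in a play would select the first host of the `gpu_server` group.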


### Roles
Roles are defined in the directory `roles`, e.g. `roles/gated_spectrometer`.
The tasks performed by a role are listed in `tasks/main.yml`, which here contains
only one task: starting the corresponding docker container. Building and
starting EDD docker containers is abstracted into a common role, so that only
a few variables have to be defined. The Dockerfile used to build the
corresponding image is stored in the templates.
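
To make the idea concrete, a role's `tasks/main.yml` that delegates to a shared role could look roughly like the sketch below. This is not the actual EDD task file: the task file name `launch` and the variable `image_name` are hypothetical, and only `include_role` with its `name` and `tasks_from` parameters is standard Ansible.

```yaml
# roles/gated_spectrometer/tasks/main.yml (hypothetical sketch)
- name: Start the gated spectrometer container via the common role
  ansible.builtin.include_role:
    name: common
    tasks_from: launch              # hypothetical task file in roles/common/tasks
  vars:
    image_name: gated_spectrometer  # hypothetical variable read by the common role
```

Keeping the container logic in one shared role means a new pipeline role only has to provide its variables and its Dockerfile template.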

### Play
Every configuration is a play. The `example_run.yml` playbook assigns the role
`gated_spectrometer` to the first GPU node and executes test roles (a simple
ping) on the next. The play is executed by:

`$ ansible-playbook -i effelsberg example_run.yml`
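
A playbook of this kind might look roughly like the following sketch. The group name `gpu_server` is taken from the examples in the Features section of this README; using a plain `ping` task for the test step is an assumption, since the actual test role is not shown here.

```yaml
# Sketch of a playbook in the spirit of example_run.yml (not the actual file)
- hosts: gpu_server[0]          # first GPU node
  roles:
    - role: gated_spectrometer

- hosts: gpu_server[1]          # next GPU node
  tasks:
    - name: Simple connectivity test (stand-in for the test role)
      ansible.builtin.ping:
```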

## Ansible for EDD provisioning
### EDD Core
The core EDD consists of

  * a master controller,
  * a redis DB,
  * a docker registry,
  * a dhcp server,
  * an ansibleinterface container running on every node of the system, used to
    grant the master controller ansible access to the node.

The `basic_configuration.yml` playbook ensures that the core system is up and
running. It is imported in the site-specific configuration playbooks

  * effelsberg_config
  * ska_proto_config
  * tnrt_config

where site-specific settings are also applied.
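
For readers unfamiliar with playbook imports, the pattern can be sketched as follows. Only `import_playbook` and the file name `basic_configuration.yml` are grounded here; the second play and its task are placeholders invented for illustration.

```yaml
# Sketch of a site-specific playbook such as effelsberg_config.yml
# (not the actual file)
- import_playbook: basic_configuration.yml   # brings up the EDD core first

- hosts: all
  tasks:
    - name: Placeholder for a site-specific setting (hypothetical)
      ansible.builtin.debug:
        msg: "site-specific configuration goes here"
```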

The basic configuration also ensures certain configurations on the bare-metal
systems, e.g.

  * installing the correct certificates for the docker registry
  * installing the correct version of the nvidia driver

Use e.g.

`$ ANSIBLE_CACHE_PLUGIN=memory ansible-playbook -i effelsberg_devel effelsberg_config.yml`

to execute the playbook, or **also** use

`$ ANSIBLE_CACHE_PLUGIN=memory ansible-playbook -i effelsberg_devel effelsberg_config.yml --tags=build`

to build the containers. This is a forced build that always pulls the latest
changes from the repositories. The basic configuration playbook will *create*
the registry certificate for the docker registry and the ssh key pairs for the
ansible_interface. **Old keys will be overwritten, so access granted manually
to components outside of the ansible system, e.g. by copying the certificate,
will be revoked.**
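
Tags are how Ansible restricts a run to a subset of tasks. The sketch below shows one possible way a build task could be tagged so that it runs only when `--tags=build` is given. The task itself, the image name, and the use of the special `never` tag are assumptions and are not taken from the EDD roles.

```yaml
# Hypothetical build task, for illustration only
- name: Build the container image
  ansible.builtin.command: docker build -t example_image .
  tags:
    - build   # selected by --tags=build
    - never   # skipped in a normal run unless another of its tags is requested
```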

To also build the base container and the sidecars, use

`$ ANSIBLE_CACHE_PLUGIN=memory ansible-playbook -i effelsberg_devel basic_configuration.yml --tags=buildbase,build,buildsidecar`

### EDD provisioning
To provision EDD-based pipelines, a play needs to be loaded:

`$ ansible-playbook -i effelsberg provision_descriptions/example_playbook.yml`

The required containers may **also** need to be built before loading the play:

`$ ansible-playbook -i effelsberg provision_descriptions/example_playbook.yml --tags=build`

To stop the launched containers, use:

`$ ansible-playbook -i effelsberg example_run.yml --tags stop`

The master controller may also provision the EDD. To this end, the master
controller container pulls the `edd_ansible` repository and installs the site
config and roles.


## EDD Ansible structure
The repository is organized in the recommended playbook structure.

  - roles/ contains individual roles for EDD components, e.g.
    - roles/edd_master_controller
    - roles/gated_spectrometer
    - ...
  - roles/common contains common tasks to launch, stop, build the pipeline
    containers
  - roles/edd_base contains the tasks to build the eddbase container (useful,
    but not required) as base for pipeline containers. The role also launches
    the ansible interface on all nodes.


## Features
Some feature documentation while there is no real doc:

### Selecting numa nodes
The NUMA node for a role can be selected via the environment variable
`EDD_ALLOWED_NUMA_NODES` inside the container. The container itself sees all
devices, but the environment variable is used by mpikat numa to enforce some
restrictions. It is set in a play e.g. by

    - hosts: gpu_server[1]
      roles:
        - role: gated_full_stokes_spectrometer
          container_env: "EDD_ALLOWED_NUMA_NODES=0"


### Data directory
All pipelines have a directory from the host system mounted to `/mnt`. On the
host it is `{{ data_base_path }}/{{ container_name }}` by default, with
`/media/scratch` as the default `data_base_path` in effelsberg.
The `data_base_path` can be overridden in the provision description to select
e.g. a network file system, or the variable `{{ data_path }}` can be set to a
location, avoiding the `container_name` suffix.

    - hosts: gpu_server[0]
      roles:
        - role: pulsar_pipeline
          container_name: pulsar_pipeline1
          data_base_path: "/beegfsEDD"
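
The second option, setting `data_path` directly, could be sketched like this. The directory `/beegfsEDD/shared_output` is a made-up example path, and the sketch assumes `data_path` is passed as a role variable in the same way as `data_base_path`.

```yaml
# Sketch: mount a fixed host directory to /mnt, without the container_name suffix
- hosts: gpu_server[0]
  roles:
    - role: pulsar_pipeline
      data_path: "/beegfsEDD/shared_output"   # hypothetical location
```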


### Container names
The default container name can be changed with the `container_name` variable.
This is needed e.g. to distinguish multiple instances of the same pipeline.

    - hosts: gpu_server[0]
      roles:
        - role: pulsar_pipeline
          container_name: pulsar_pipeline1
          container_env: "EDD_ALLOWED_NUMA_NODES=0"
        - role: pulsar_pipeline
          container_name: pulsar_pipeline2
          container_env: "EDD_ALLOWED_NUMA_NODES=1"


### Skarabs
The `device` variable is used to associate a skarab controller with a skarab
board.

    - role: skarab_pfb_controller
      device: skarab_00





## Development hints
- Execute `quick_build_role.sh roles/myrole` to quickly build a single role
  without executing a playbook with `--tags=build`, which may build several
  roles.

## ToDo:
  - manage tags + launch different tags