Commit a64243e3 authored by David Carreto Fidalgo

improve TOC

parent 54afc748
1 merge request: !2 Feat/readme
......@@ -5,7 +5,7 @@
[[_TOC_]]
## Getting started
# Getting started
Let's go through a minimal example of how to run a Python script using a container on our [Raven system](link/to/official/docs).
......@@ -93,7 +93,7 @@ And submit it to the cluster with `sbatch example.slurm`
That's it, congratulations, you ran your first Python script inside a container using a GPU! 🎉
## Examples and blueprints
# Examples and blueprints
In this repository you will find the following examples or blueprints that you can adapt and use for your projects:
......@@ -103,7 +103,7 @@ In this repository you find following examples or blueprints that you can adapt
...
## Official base containers
# Official base containers
You probably want to build your own container with your own custom software stack and environment in it.
We recommend, however, that you always build on top of certain *base containers* depending on your application and the kind of GPUs you want to use.
......@@ -127,21 +127,21 @@ From: nvcr.io/nvidia/pytorch:25.04-py3
> Most of these base containers are quite large, and it will take some time to download and build them.
### Nvidia
## Nvidia
- [PyTorch](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
- [TensorFlow](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow)
- [JAX](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/jax)
### AMD
## AMD
- [PyTorch](https://hub.docker.com/r/rocm/pytorch/tags)
- [TensorFlow](https://hub.docker.com/r/rocm/tensorflow/tags)
- [JAX](https://hub.docker.com/r/rocm/jax/tags)
## Working with Apptainer
# Working with Apptainer
We use [Apptainer](https://apptainer.org/docs/user/main/index.html) to build/run containers on our HPC systems.
You will need a Linux system to run Apptainer natively on your machine, and it’s easiest to [install](https://apptainer.org/docs/user/main/quick_start.html) if you have root access.
......@@ -151,12 +151,12 @@ But it is also easy to use or convert [docker images with Apptainer](https://app
For a nice introduction to Apptainer on our HPC systems, have a look at the awesome [presentation by Michele](https://datashare.mpcdf.mpg.de/s/df4p3bMuWCF53Y3).
You can also browse [our documentation](https://docs.mpcdf.mpg.de/doc/computing/software/containers.html#apptainer).
### Building containers
## Building containers
Containers are built via a [definition file](https://apptainer.org/docs/user/latest/definition_files.html) and the `apptainer build` command.
In each folder of this repo you will find a definition `.def` file and a `README.md` that describes the exact build command.
### Pull from Docker Hub
## Pull from Docker Hub
You can easily [pull containers](https://apptainer.org/docs/user/latest/docker_and_oci.html#containers-from-docker-hub)
from the [Docker Hub](https://hub.docker.com/) or other OCI registries:
......@@ -165,7 +165,7 @@ from the [Docker Hub](https://hub.docker.com/) or other OCI registries:
$ apptainer pull my_apptainer.sif docker://sylabsio/lolcow:latest
```
### Convert from Docker Daemon or Docker Archive files
## Convert from Docker Daemon or Docker Archive files
You can also [convert images/containers](https://apptainer.org/docs/user/latest/docker_and_oci.html#containers-from-docker-hub) running in your Docker Daemon:
```shell
......@@ -182,13 +182,13 @@ $ docker save 5a15b484bc65 -o lolcow.tar
$ apptainer build my_apptainer.sif docker-archive:lolcow.tar
```
### Running containers
## Running containers
**TODO:**
- mention important flags, like `--nv` for example
- how to run the containers on our SLURM cluster
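
As a starting point for the first TODO item, here is a minimal sketch of running a command with GPU support. The `--nv` flag makes the host's NVIDIA driver libraries and tools available inside the container (`--rocm` is the counterpart for AMD GPUs). The image name `my_apptainer.sif` is a placeholder, and the script only prints the command if Apptainer or the image is not available on the current machine.

```shell
# Placeholder image name; replace with your own SIF file
SIF="my_apptainer.sif"
# --nv enables NVIDIA GPU support; use --rocm on AMD GPUs instead
GPU_FLAG="--nv"

if command -v apptainer >/dev/null 2>&1 && [ -f "$SIF" ]; then
    # nvidia-smi is made available inside the container by --nv
    apptainer exec "$GPU_FLAG" "$SIF" nvidia-smi
else
    echo "Skipping: would run 'apptainer exec $GPU_FLAG $SIF nvidia-smi'"
fi
```

The same flag works with `apptainer run` and `apptainer shell`.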
### Patching containers
## Patching containers
If you want to permanently modify files in your container, you can use persistent "overlays". These are writable file system images that sit on top of your immutable SIF container.
......@@ -210,23 +210,23 @@ Now you can modify files inside the container and the modifications will be stor
You can apply overlays with the `run`, `exec`, `shell` and `instance start` commands.
### Using containers with RVS
## Using containers with RVS
The [Remote Visualisation Service (RVS)](https://docs.mpcdf.mpg.de/doc/visualization/index.html) allows you to run Jupyter sessions on the HPC systems.
You can use your container as a kernel within such a session by providing a `kernel.json` spec file.
#### 1. Setting up the container
### 1. Setting up the container
Make sure you install ipython and ipykernel in your container:
```
pip install ipython ipykernel
```
#### 2. Setting up RVS
### 2. Setting up RVS
Load the `apptainer` module when initializing your RVS session.
#### 3. Creating the kernel
### 3. Creating the kernel
Create a kernel spec file:
```bash
......@@ -264,7 +264,7 @@ Keep in mind that you are inside the container.
If you want to access files outside your home directory, you have to bind them explicitly in the kernel spec file when calling the apptainer command.
For example, in the kernel spec file above we bind your `ptmp` folder.
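
To check a bind outside of Jupyter, you can try the same `--bind` option directly on the command line. This is only a sketch: the path `/ptmp/<user>` and the image name `my_apptainer.sif` are assumptions, so adjust them to your setup.

```shell
# Placeholder image name; replace with your own SIF file
SIF="my_apptainer.sif"
# Assumed location of your ptmp folder; adjust if it differs
BIND_SRC="/ptmp/${USER:-your_user}"
# Bind spec has the form <path on host>:<path in container>
BIND_SPEC="${BIND_SRC}:${BIND_SRC}"

if command -v apptainer >/dev/null 2>&1 && [ -f "$SIF" ] && [ -d "$BIND_SRC" ]; then
    # List the bound folder from inside the container
    apptainer exec --bind "$BIND_SPEC" "$SIF" ls "$BIND_SRC"
else
    echo "Skipping: would run 'apptainer exec --bind $BIND_SPEC $SIF ls $BIND_SRC'"
fi
```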
### Local-to-HPC Workflow
## Local-to-HPC Workflow
**TODO: The sandbox option does not work 100% correctly for VSCode or PyCharm, use docker images instead! Need to update this guide!**
......@@ -272,14 +272,14 @@ A nice workflow to develop a python library locally and deploy it on our HPO sys
We are still investigating if something similar is possible with `Docker` (please let us know if you find a way :) ).
#### 1. Create a definition file
### 1. Create a definition file
In the root directory of your library (repository) create a *definition* `*.def` file.
This definition file should describe the environment in which you want to develop and use your library.
You can leverage base environments, such as Docker images on Docker Hub, or existing Apptainer containers.
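
A minimal sketch of such a definition file, written from the shell, could look like this. It assumes the NVIDIA PyTorch image mentioned in the base-container section as the starting point. The packages installed in `%post` are placeholders for whatever your library actually needs.

```shell
# Write a small definition file next to your library's sources
cat > my_container.def <<'EOF'
Bootstrap: docker
From: nvcr.io/nvidia/pytorch:25.04-py3

%post
    # Placeholder dependencies; replace with the ones your library needs
    python -m pip install --no-cache-dir ipython ipykernel

%labels
    Description Sketch of a development container
EOF

echo "Wrote my_container.def"
```

The file name `my_container.def` matches the one used in the build step.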
#### 2. Build the sandbox
### 2. Build the sandbox
Build the sandbox (container in a directory) instead of the default SIF format:
......@@ -287,7 +287,7 @@ Build the sandbox (container in a directory) instead of the default SIF format:
apptainer build --fakeroot --sandbox my_container my_container.def
```
#### 3. Install your library in the sandbox
### 3. Install your library in the sandbox
Now we can add our library that we develop to the sandbox environment and install it in [`editable`](https://setuptools.pypa.io/en/latest/userguide/development_mode.html) mode:
......@@ -295,16 +295,16 @@ Now we can add our library that we develop to the sandbox environment and instal
apptainer exec --writable my_container python -m pip install -e .
```
#### 4. Point your IDE's interpreter to the sandbox
### 4. Point your IDE's interpreter to the sandbox
You should be able to point the interpreter of your IDE (VSC, PyCharm, etc.) to the python executable inside the sandbox folder.
#### 5. Add your developed library to the my_container.def file
### 5. Add your developed library to the my_container.def file
While in principle you could build a SIF container directly from your sandbox, it is better to modify your *definition* `*.def` file to include your library/package.
In this way, your container is fully reproducible using only the definition file.
#### 6. Build your *.sif apptainer, deploy on our HPC systems
### 6. Build your *.sif apptainer, deploy on our HPC systems
Once you have built the SIF container, you can copy it to our HPC systems and use it there.
......