From 1f7b8f6074d8216e847bf274dfc2c19806ff80fb Mon Sep 17 00:00:00 2001
From: Klaus Reuter <khr@mpcdf.mpg.de>
Date: Wed, 29 Nov 2023 17:30:34 +0100
Subject: [PATCH] polish README.md

---
 README.md | 127 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 69 insertions(+), 58 deletions(-)

diff --git a/README.md b/README.md
index 2d23d55..e954e10 100644
--- a/README.md
+++ b/README.md
@@ -1,36 +1,39 @@
-# Condainer - Conda environments for HPC systems
+# Condainer - Compressed Conda environments for HPC systems
 
 ![avatar](./fig/condainer_small.jpg)
 
 ## TL;DR - Quick start guide
 
-Condainer puts Conda environments into compressed squashfs images which makes
+Condainer puts Conda environments into compressed (squashfs) images which makes
 the use of such environments portable and more efficient, in particular on HPC
-systems.  These environments (respectively images) are standalone and completely
-avoid the integration of a specific `conda` executable into the user's `.bashrc`
-file which often causes issues, for example on HPC systems.
+systems. These Condainer environments are standalone and completely sidestep the
+typical integration of a specific `conda` executable into the user's `.bashrc`
+file, an integration that often causes issues, for example with the software
+environment on HPC systems.
 
 ### Build a compressed environment
 
-Starting in an empty directory, use the following commands once to build a compressed image of your Conda environment, defined by 'environment.yml':
+Starting in an empty directory, use the following commands once to build a
+compressed image of your Conda environment that is defined in 'environment.yml':
 
 ```bash
 cnd init
 ls
-# edit the provided example 'environment.yml' file, or copy your own file here, before running
+# edit the example 'environment.yml' file, or copy your own file here, before running
 cnd build
 ls
 ```
 
 ### Activate a compressed environment
 
-After building successfully, you can activate the environment for your current shell session, just like with plain conda:
+After building successfully, you can activate the environment for your current
+shell session, similar to plain Conda or to a Python virtual environment:
 
 ```bash
 source activate
 ```
 
-### Alternatively, run an executable from a compressed environment without activating it
+### Alternatively, run an executable from a compressed environment directly
 
 In case you do not want to activate the environment, you can run individual executables from the environment transparently, e.g.
 
@@ -38,13 +41,13 @@ In case you do not want to activate the environment, you can run individual exec
 cnd exec -- python3
 ```
 
-Please see the sections below for more detailed explanations and more options.
+See the sections below for more detailed explanations and more options.
 
 ## Background
 
-### Problem: Conda environments on HPC systems
+### Often a Problem: Conda environments on HPC file systems
 
-The Conda package manager and related workflows have become an
+The Conda package manager and the related workflows have become an
 adopted standard when it comes to distributing scientific software
 for easy installation by end users. It not only handles native
 Python packages but also manages dependencies in the form of
@@ -52,78 +55,82 @@ binary blobs, such as third-party libraries that are provided as
 shared objects. Using `conda`, complex software environments can
 be defined by means of simple descriptive `environment.yml` files.
 
-Large environments can easily amount to several 100k individual
-(small) files. On a local desktop file system, this is typically not
-an issue.  However, in particular on the large shared parallel file
-systems of HPC systems, the vast amount of small files can cause
-severe trouble as these filesystems are optimized for different IO
-patterns. Inode exhaustion, and heavy load due to (millions of) file
-opens, short reads, and closes during the startup phase of
-(parallel) Python jobs from numerous different users on the HPC
-cluster are only two examples.
+Once installed, large environments can easily amount to several 100k individual
+(small) files. On a local desktop file system, this is typically not an issue.
+However, in particular on the large shared parallel file systems of HPC systems,
+the vast number of small files can cause issues as these file systems are
+optimized for different scenarios. Inode exhaustion and heavy load due to
+(millions of) file opens, short reads, and closes happening during the startup
+phase of (parallel) Python jobs from the different users on the system are only
+two examples.
 
-### Solution: Put Conda environments into compressed image files
+### Solution: Move Conda environments into compressed image files
 
-Condainer solves these issues by putting conda environments into
+Condainer addresses these issues by moving Conda environments into
 compressed squashfs images, reducing the number of files
-stored directly on the host file system by orders of magnitude.
+stored on the host file system directly by orders of magnitude.
 Condainer images are standalone and portable, i.e., they can be
 copied between different systems, improving reproducibility
-and reusability of proven working software environments.
+and reusability of proven-to-work software environments.
 
 Technically, Condainer uses the Python basis from Miniforge
-(which is a free alternative similar to Miniconda) and installs the
-software stack defined by the user via an `environment.yml` into an environment.
+(which is a free alternative similar to Miniconda) and then installs the
+software stack defined by the user in the usual `environment.yml` file.
 Package resolution and installation are extremely fast thanks to the
 `mamba` package manager (an optimized replacement for `conda`).
 As a second step, Condainer creates a compressed squashfs image file
 from that installation, before it deletes the latter to save disk
 space. The compressed image is then mounted at the very same
-directory, providing the complete conda environment to
+directory, providing the complete Conda environment to
 the user who can `activate` or `deactivate` it, just as usual. Moreover,
 Condainer provides a wrapper to run executables from the
-conda environment directly and transparently, without the need to
+Conda environment directly and transparently, without the need to
 explicitly mount and unmount the image.
 
 Please note that the squashfs images used by Condainer are not "containers"
 in the strict terminology of Docker, Apptainer, etc. With Condainer,
 there is no encapsulation, isolation, or similar, rather Condainer
 is an easy-to-use wrapper around the building, compressing,
-mounting, and unmounting of conda environments on top of compressed
+mounting, and unmounting of Conda environments on top of compressed
 image files.
 
 ## Installation
 
-After cloning the repository, Condainer can be installed using pip, e.g. using
+After cloning the repository, Condainer can be installed via `pip`, e.g. using the command
 
 `pip install --user .`
 
-which would place the executable `cnd` into `~/.local/bin`.
+which would place the executable `cnd` into `~/.local/bin` in the user's home directory.
 
 ## Usage
 
 The Condainer executable is `cnd` and is controlled via subcommands and flags.
-See `cnd --help` for full details.
-The following subcommands are available for `cnd`:
+See `cnd --help` for full details. The following subcommands are available for `cnd`
+and are described briefly below.
 
 ### Initialize a project using `cnd init`
 
-Create an empty directory, enter it, and run `cnd init` to
-create a skeleton for a condainer project. You may edit
-`condainer.yml`, and, importantly, add your `environment.yml` file
-to the same directory.
+Create an empty directory, enter it, and run `cnd init` to create a skeleton for
+a Condainer project. Optionally, you may inspect and edit the config file
+`condainer.yml`. Importantly, add your `environment.yml` file to the same
+directory.
 
 ### Build and compress an environment using `cnd build`
 
-Build the conda environment specified in `environment.yml`.  In case
-a file `requirements.txt` is present, its contents will be installed
-in addition, using `pip`.  Finally, create a compressed
-squashfs image, and delete the files from the staging environment.
+Build the Conda environment specified in `environment.yml`.  In case a file
+`requirements.txt` is present, its contents will be installed in addition using
+`pip`.  Finally, create a compressed squashfs image, and delete the files from
+the staging process.
+
+To stage the files for the Conda environment, a uniquely named directory below
+the base directory (as specified in `condainer.yml`) is used.  By default, the base
+directory is `/tmp`.  The unique subdirectory name is of the form `condainer-UUID`
+where UUID is a version 4 UUID generated and saved during `cnd init`.
 
 ### Execute a command using `cnd exec`
 
 Using a command of the form `cnd exec -- python3 myscript.py`
-it is possible to run executables from the contained conda
+it is possible to run executables from the compressed Conda
 environment directly, in the present example the Python interpreter
 `python3`.  Mounting and unmounting of the squashfs image are
 handled automatically and invisibly to the user.  Note that the '--'
@@ -137,19 +144,23 @@ In the project directory, run `source activate` to activate the
 compressed environment for your current shell session.  Similarly,
 run `source deactivate` to deactivate it.
 Once activated, the compressed environment is available just like
-normal, however read-only.
+normal, but in read-only mode.
 
 ### Explicitly mount the squashfs image using `cnd mount`
 
-The command `cnd mount` mounts the squashfs image at the base
-location specified in `condainer.yml`. Mount points have the form of
-`cnd-UUID` where UUID is the type4 UUID generated and saved
-during `cnd init`. Hints on activating and deactivating the
-conda environment are printed.
+The command `cnd mount` mounts the squashfs image below the base directory that
+is specified in `condainer.yml`.  Hints on activating and deactivating the Conda
+environment are printed.
+
+Consistent with the `cnd build` step, the mount point is identical to the
+directory used during staging and building, such that the absolute paths to the
+files are unchanged between build and mount.
+
+### Explicitly unmount the squashfs image using `cnd umount`
 
-### Explicitly un-mount the squashfs image using `cnd umount`
+Unmount the image, if mounted.
 
-Unmount the image, if mounted. Make sure to run `conda deactivate`
+Make sure to run `conda deactivate`
 in all relevant shell sessions prior to unmounting.
 
 ### Print information using `cnd status`
@@ -158,21 +169,20 @@ Show some information and the mount status of the image.
 
 ### Check if the necessary tools are available using `cnd prereq`
 
-Check and show if the required software is locally available, also see
-below.
+Check and show if the required software is locally available (see below).
 
 ## System requirements
 
-Condainer should work on any recent Linux system and expects the following set
+Condainer works on any recent Linux system and expects the following set
 of tools available and enabled for non-privileged users:
 
 * fuse
 * squashfuse
 * squashfs tools
 
-On an Ubuntu (or similar) system, run (as root) the command
+On an Ubuntu (or similar) system, run the command
 
-`apt install squashfs-tools squashfuse`
+`sudo apt install squashfs-tools squashfuse`
 
 to install the necessary tools. In addition `curl` is required to download
 the Miniforge installer, in case it is not available locally.
@@ -185,8 +195,9 @@ No installer is downloaded in case that variable is defined.
 
 ## Features and Limitations
 
-* Any valid `environment.yml` should work, there is no lock-in effect when using Condainer, and you can use the same `environment.yml` with plain Conda elsewhere.
-* Condainer environments are read-only and immutable. In case you need to add packages, rebuild the image. (You can toggle between multiple existing squashfs images by editing the UUID string in `condainer.yml`.)
+* Any valid `environment.yml` will work with Condainer. There is no lock-in, as you can use the same `environment.yml` with plain Conda elsewhere.
+* Condainer environments are read-only and immutable. In case you need to add packages, rebuild the image.
+* Within the same project, when experimenting, you can toggle between multiple existing squashfs images by editing the UUID string in `condainer.yml`.
 
 ## Contact
 
-- 
GitLab