diff --git a/doc/source/topic_guides/kmos3_speed.rst b/doc/source/topic_guides/kmos3_speed.rst
index d394e25c51e7d68bb67f333086a5d3f3559e2aa8..4ade768d5327bb567ad662713b28ef17e776200a 100644
--- a/doc/source/topic_guides/kmos3_speed.rst
+++ b/doc/source/topic_guides/kmos3_speed.rst
@@ -1,136 +1,106 @@
 
 .. _o1-backend:
 
-How the kmos3 kMC algorithm works
-=================================
+The kmos3 Implementation
+========================
 
-kmos3 asks you to describe your model to the processor
-in seemingly arcane ways. It can save model descriptions
-in XML but they are basically unreadable and a pain to edit.
-The API has some glitches and is probably incomplete: so why learn it?
+How the kmos3 KMC Algorithm Works
+---------------------------------
+
+Kmos3 asks you to describe your model to the processor in seemingly arcane ways. It can save model
+descriptions in XML, but they are basically unreadable and a pain to edit. The API has some
+glitches and is probably incomplete, so why learn it?
 
 Because it is fast (in two ways).
 
-The code it produces is commonly faster than naive implementations
-of the kMC method. Most straightforwards implementations of kMC take a time
-proportional to 2*N  per kMC step,
-where N is the number of sites in the system.
-However the code that kmos3 produces is O(1) until the RAM
-of your system is exceeded. As benchmarks have shown this may happen when
-100,000 or more sites are required. However tests have also shown
-that kmos3 can be faster than O(N) implementations from around
-60-100 sites. If you have different experiences please let me know
-but I think this gives some rule of thumb.
-
-
-Why is it faster? Straightforward implementations of kMC scan the
-entire system twice per kMC step. First to determine the total
-rate, then to determine the next process to be executed. The
-present implementation does not. kmos3 keeps a database of available
-processes which allow to quickly pick the next process. It also
-updates the database of available processes which cost additional
-overhead. However this overhead is independent of the system's size
-and only scales with the degree of interaction between sites, which
-is seems hard to define in general terms.
-
-The second way reason why it is fast is because you can formulate
-processes in a intuitive fashion and let kmos3 figure how to
-make fast running code out of it. So we save in human time and
-CPU time, which is essentially human time as well. Yay!
-
-To illustrate just how fast the algorithm is the graph below shows
-the CPU time needed to simulate 1 million kMC steps on a simple
-cubic lattice in 2 dimension with two reacting species and
-without lateral interaction. As this shows the CPU time
-spent per kMC step as nearly constant for up nearly 10^5 sites.
+The code it produces is commonly faster than naive implementations of the KMC method. Most
+straightforward implementations of KMC take a time proportional to 2*N per KMC step, where N is
+the number of sites in the system. In contrast, the time per step of the code that kmos3 produces
+is O(1) until the RAM of your system is exceeded. As benchmarks have shown, this may happen when
+100,000 or more sites are required. Tests have also shown that kmos3 can be faster than O(N)
+implementations for systems with as few as roughly 60-100 sites. If your experience differs,
+please let us know, but we think this is a reasonable rule of thumb.
+
+Why is it faster? Straightforward implementations of KMC scan the entire system twice per KMC
+step: once to determine the total rate and once to determine the next process to be executed. The
+present implementation does not. Kmos3 keeps a database of available processes, which allows it to
+quickly pick the next process. Keeping this database up to date adds some overhead. However, this
+overhead is independent of the system's size and scales only with the degree of interaction
+between sites, which is hard to define in general terms.
+
+The second reason it is fast is that you can formulate processes in an intuitive fashion and let
+kmos3 figure out how to make fast-running code out of them. So we save human time and CPU time,
+which is essentially human time as well. Yay!
+
+To illustrate just how fast the algorithm is, the graph below shows the CPU time needed to
+simulate 1 million KMC steps on a simple cubic lattice in 2 dimensions with two reacting species
+and without lateral interactions. You can see that the CPU time spent per KMC step is nearly
+constant for up to about :math:`10^5` sites.
 
 .. figure:: ../img/benchmark.png
-   :width: 75%
-   :align: center
+  :width: 75%
+  :align: center
 
-   Benchmark for a simple surface reaction model. All simulations have been
-   performed on a single CPU of Intel I7-2600K with 3.40 GHz clock speed.
+  Benchmark for a simple surface reaction model. All simulations have been performed on a single
+  CPU of an Intel i7-2600K with 3.40 GHz clock speed.
 
-The kmos3 O(1) solver
+The kmos3 O(1) Solver
 ---------------------
 
 .. figure:: ../img/data_structures.png
-   :width: 75%
-   :align: center
-
-   The data model underlying the kmos3 solver. The central component
-   is the `avail_sites` array which stores for each elementary
-   step the sites for which it is executable. Secondly
-   it stores the location in memory, where the availability
-   of the site is stored for direct access. The array of
-   `rate constants` holds the numeric rate constant and only
-   changes, when a physical parameter is changed. The
-   `nr of sites` array holds the total number of sites for each
-   process and needs to be updated whenever
-   a process becomes available und unavailable. The `accum. rates`
-   has to be updated once per kMC step and holds the accumulated
-   rate constant for each processes. That is, the last field
-   of accum. rates holds :math:`k_{\mathrm{tot}}`,
-   the total rate of the system.
-
-
-So what makes the kMC solver so furiously fast? The underlying
-data structure is shown in the picture above. The most important
-part is that the solver never scans the entire system for
+  :width: 75%
+  :align: center
+
+  The data model underlying the kmos3 solver. The central component is the `avail_sites` array,
+  which stores for each elementary step the sites for which it is executable. It also stores, for
+  each site, the location in memory where the availability of that site is recorded, for direct
+  access. The array of `rate constants` holds the numeric rate constants and changes only when a
+  physical parameter is changed. The `nr of sites` array holds the number of available sites for
+  each process and needs to be updated whenever a process becomes available or unavailable. The
+  `accum. rates` array has to be updated once per KMC step and holds the accumulated rate constant
+  for each process. That is, the last field of `accum. rates` holds :math:`k_{\mathrm{tot}}`, the
+  total rate of the system.
+
+
+So what makes the KMC solver so furiously fast? The underlying data structure is shown in the
+picture above. The most important part is that the solver never scans the entire system for
 available processes except at program initialization.
 
-Please have a look at the sketch of data structures above. Given that
-all arrays are initialized and populated, in each kMC step the
-following things happen:
+Please have a look at the sketch of data structures above. Given that all arrays are initialized
+and populated, in each KMC step the following things happen:
 
-In the first step we need to identify the next process and site.
-To do so we draw a random number :math:`R_{1} \in [0, 1]`.
-This number has to be scaled to :math:`k_{\mathrm{tot}}`,
-so we multiply it with the last field in `accum. rates`.  Next
-we simply perform a
-`binary search <http://en.wikipedia.org/wiki/Binary_search_algorithm>`_
-for the right process on `accum. rates`. Having determined the
-process, we pick a site using a second random number :math:`R_{2}`,
-which is constant in time since `avail sites` is filled up with
-the available site for each process from the left.
+In the first step we need to identify the next process and site. To do so, we draw a random number
+:math:`R_{1} \in [0, 1]`. This number has to be scaled to :math:`k_{\mathrm{tot}}`, so we multiply
+it by the last field in `accum. rates`. Next, we simply perform a
+`binary search <https://en.wikipedia.org/wiki/Binary_search#Algorithm>`_
+for the corresponding process in `accum. rates`. Having determined the process, we pick a site
+using a second random number :math:`R_{2}`. This takes constant time, since `avail sites` is
+filled up with the available sites for each process from the left.
 
-Totally independent of this we calculate the duration of the
-current step with another random number :math:`R_3` using
+Independently of this, we calculate the duration of the current step with another random number
+:math:`R_3` using
 
 .. math::
+  \Delta t = \frac{-\log(R_{3})}{k_{\mathrm{tot}}}.
+
+So, while the determination of process and site is extremely straightforward, the CPU-intensive
+part starts only now. The `proclist` module is written in such a way that for each elementary step
+it updates the `avail sites` array only in the local neighborhood of the site where the process
+is executed. It is furthermore heuristically optimized in order to require only a minimal number
+of `if`-statements to figure out which database updates are necessary. This will be explained in
+more detail in the next subsection.
+
+For the current description it is sufficient to know that for all database updates by the
+`proclist` module
+
+  - the `nr of sites` array is updated as well and
+
+  - adding or deleting an available site takes only constant time, since the number of available
+    sites as well as the memory addresses are always kept up to date. New sites are simply appended
+    to the end of the list of available sites. When a site has to be deleted, the last site in the
+    array is moved into the memory slot that has just been freed.
+
+Thus, once all local updates are finished, the `accum. rates` array is simply updated once and we
+are ready for the next KMC step.
 
-  \Delta t = \frac{-\log(R_{3})}{k_{\mathrm{tot}}}
-
-So, while the determination of process and site is
-extremely straightforward, the CPU intensive part
-just starts now. The `proclist` module is written
-in such a way, for each elementary step it
-updates the `avail sites` array only in the
-local neighborhood of the site, where the process
-is executed. It is furthermore heuristically
-optimized in order to require only a minimal
-number of `if`-statement to figure out which
-database updates are necessary. This will be
-explained in greate detail in the next subsection.
-
-For the current description it is sufficient to
-know that for all database updates by the `proclist`
-module :
-
-  - the `nr of sites` array is updated as well.
-
-  - adding or deleting an available site only
-    takes constant time, since the number of
-    available sites as well as the memory addresses
-    is always updated. Thus new sites are simply
-    add at the end of the list of available sites.
-    When a site has to be deleted the last site
-    in the array is moved to the memory slot
-    available now.
-
-
-Thus once all local updates are finished the
-`accum. rates` array is simply updated once.
-And ready we are for the next kMC step.
-
-.. TODO:: describe translation algorithm
+.. TODO:: Describe translation algorithm.