ift / NIFTy · Merge request !511

WIP: Made _sumup of KL faster by summing up in parallel when possible

Closed Reimar H Leike requested to merge faster_MPIKL into NIFTy_6 May 27, 2020

Marked as work in progress because

  1. it has to be tested whether it is really faster;
  2. speed could in theory be increased even further by using the buffer-based (NumPy) versions of the mpi4py communication calls, i.e. Send and Recv, instead of the pickle-based lowercase variants.

The basic idea of this change is to sum up the KL in a tree-like structure. This works as follows. Suppose we have this list:

1  2  3  4  5  6

Then we combine the numbers pairwise:

1  2  3  4  5  6
| /   |  /  |  /
3     7     11

We then iterate this procedure:

3  7  11
| /   | 
10    11

Here the 11 was at the end of the list and thus had no partner, so it was not summed. Finally we get

10 11
|  /  
21
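The pairwise rounds above can be sketched in plain Python (a minimal illustration of the summation pattern, not the NIFTy code itself; the function name `tree_sum` is made up for this example):

```python
def tree_sum(lst):
    """Sum a list by repeatedly combining neighbouring pairs,
    as in the 1 2 3 4 5 6 -> 3 7 11 -> 10 11 -> 21 example."""
    while len(lst) > 1:
        nxt = []
        for i in range(0, len(lst) - 1, 2):
            nxt.append(lst[i] + lst[i + 1])   # combine a neighbouring pair
        if len(lst) % 2 == 1:
            nxt.append(lst[-1])               # odd element passes through unchanged
        lst = nxt
    return lst[0]

print(tree_sum([1, 2, 3, 4, 5, 6]))  # 21, after 3 pairwise rounds
```

The `while` loop runs once per round, so a list of length N is reduced in O(log N) rounds.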

This procedure terminates in O(log(N)) steps, where N is the length of the list. If the objects of the list are distributed over different MPI tasks, then every step requires at most N/2 communications. However, because the list is distributed over the tasks in consecutive blocks, every task needs to communicate at most twice per iteration, so most communications happen in parallel. The method is organized such that the result of each pairwise sum is always computed by the higher-ranked of the two tasks involved, i.e. if the list is distributed over 4 tasks as follows:

1  1  2  3  3  4

Then the ranks that hold the sums evolve as follows:

1  1  2  3  3  4
| /   |  /  |  /
1     3     4

So far, two communications were needed, with task 3 communicating twice (receiving from task 2 and sending to task 4). It continues as

1  3  4
| /   |  
3     4
|    /
|  /
 4

with two more communications needed. In the end, task 4 broadcasts the result to all tasks.
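The evolution of the owning ranks can be simulated without MPI. The sketch below (a hypothetical helper, not part of this merge request; `reduction_rounds` is a made-up name) tracks, per round, which rank holds each partial sum and which (sender, receiver) messages are needed, using the rule that the higher-ranked task of each pair computes the sum:

```python
def reduction_rounds(ranks):
    """Given the rank owning each list element, return a list of
    per-round (new_ranks, communications), where each communication
    is a (sender, receiver) pair of distinct ranks."""
    rounds = []
    while len(ranks) > 1:
        nxt, comms = [], []
        for i in range(0, len(ranks) - 1, 2):
            src, dst = sorted((ranks[i], ranks[i + 1]))
            nxt.append(dst)                  # higher rank holds the pairwise sum
            if src != dst:
                comms.append((src, dst))     # one message per cross-task pair
        if len(ranks) % 2 == 1:
            nxt.append(ranks[-1])            # unpaired element stays where it is
        rounds.append((nxt, comms))
        ranks = nxt
    return rounds

for new_ranks, comms in reduction_rounds([1, 1, 2, 3, 3, 4]):
    print(new_ranks, comms)
```

Running this on the distribution `1 1 2 3 3 4` reproduces the rounds above: first `[1, 3, 4]` with messages (2, 3) and (3, 4), then `[3, 4]` with message (1, 3), then `[4]` with message (3, 4).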

Edited May 27, 2020 by Reimar H Leike