pypocketfft issue #13, created Jul 31, 2020 by Martin Reinecke (@mtr, Owner)

Make thread pool size more flexible?

When running parallel FFTs with, say, four threads on a system with 12 CPUs and 24 virtual cores, htop does not show a set of 4 virtual cores at 100% (as would be the case with an OpenMP code), but rather a small, uniform load on all 24 virtual cores. If I'm interpreting this correctly, this happens because we allocate a thread pool with as many threads as there are virtual cores and assign tasks to them in a round-robin fashion, so the load jumps from core to core very quickly.

This doesn't seem optimal: it invalidates a lot of caches, and it probably confuses the thread scheduler.
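To make the effect easier to reproduce outside of pocketfft, here is a minimal sketch. It assumes a simplified pool in which all workers wait on one shared queue; `toy_pool` and `distinct_workers_used` are hypothetical names for this illustration and are not pocketfft's actual `thread_pool`. The helper submits a few short tasks per round and counts how many distinct worker threads ended up running them.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <vector>

// Hypothetical minimal pool (NOT pocketfft's thread_pool): all workers
// wait on one shared queue, so any idle worker may pick up the next task.
class toy_pool
  {
  std::mutex mut_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> work_;
  std::vector<std::thread> workers_;
  bool stop_ = false;

  void worker_main()
    {
    for (;;)
      {
      std::function<void()> task;
        {
        std::unique_lock<std::mutex> lock(mut_);
        cv_.wait(lock, [this]{ return stop_ || !work_.empty(); });
        if (work_.empty()) return;  // stop requested and queue drained
        task = std::move(work_.front());
        work_.pop();
        }
      task();  // run the task without holding the queue lock
      }
    }

  public:
    explicit toy_pool(size_t nthreads)
      {
      for (size_t i=0; i<nthreads; ++i)
        workers_.emplace_back([this]{ worker_main(); });
      }
    ~toy_pool()
      {
        {
        std::lock_guard<std::mutex> lock(mut_);
        stop_ = true;
        }
      cv_.notify_all();
      for (auto &t: workers_) t.join();
      }
    void submit(std::function<void()> task)
      {
        {
        std::lock_guard<std::mutex> lock(mut_);
        work_.push(std::move(task));
        }
      cv_.notify_one();
      }
  };

// Runs `rounds` batches of `ntasks` short tasks on a pool with `pool_size`
// workers and returns how many distinct workers executed at least one task.
inline size_t distinct_workers_used(size_t pool_size, size_t ntasks,
  size_t rounds)
  {
  assert(pool_size>0);
  std::mutex mut;
  std::condition_variable done_cv;
  std::set<std::thread::id> ids;
  size_t pending = 0;
  toy_pool pool(pool_size);  // declared last, so it is destroyed (joined) first
  for (size_t r=0; r<rounds; ++r)
    {
      {
      std::lock_guard<std::mutex> lock(mut);
      pending = ntasks;
      }
    for (size_t i=0; i<ntasks; ++i)
      pool.submit([&]{
        std::lock_guard<std::mutex> lock(mut);
        ids.insert(std::this_thread::get_id());
        if (--pending==0) done_cv.notify_one();
        });
    // wait until every task of this round has run
    std::unique_lock<std::mutex> lock(mut);
    done_cv.wait(lock, [&]{ return pending==0; });
    }
  return ids.size();
  }
```

Calling, for example, `distinct_workers_used(24, 4, 1000)` and comparing the result with `distinct_workers_used(4, 4, 1000)` gives a rough idea of how widely the work is spread. The exact number depends on the OS scheduler, so it can differ from run to run.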

Would it be possible to resize the pool on demand, roughly like this:

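// Returns the shared pool. If nthreads is nonzero and differs from the
// current pool size, the pool is replaced by one with nthreads threads.
inline thread_pool &get_pool2(size_t nthreads=0)
  {
  // start with a single-thread pool; it is replaced on demand below
  static std::unique_ptr<thread_pool> pool(std::make_unique<thread_pool>(1));
  if ((!pool) || ((nthreads!=0) && (nthreads!=pool->size()))) // resize
    {
    pool = std::make_unique<thread_pool>(nthreads);
    }
#if __has_include(<pthread.h>)
  // register the fork handlers exactly once
  static std::once_flag f;
  std::call_once(f,
    []{
    pthread_atfork(
      +[]{ get_pool2().shutdown(); },  // prepare
      +[]{ get_pool2().restart(); },   // parent
      +[]{ get_pool2().restart(); }    // child
      );
    });
#endif

  return *pool;
  }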
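One point that may deserve a closer look: `get_pool2()` returns a reference, so if another thread requests a different size at the same time, the old pool could be destroyed while it is still in use, and `pool` itself is modified without synchronization. A possible variant, sketched here under assumptions, guards the resize with a mutex and hands out shared ownership. `sized_pool` and `get_pool_shared` are hypothetical stand-ins for illustration, not existing pocketfft names, and the fork handling is left out for brevity.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>

// Stand-in for the real pool type (hypothetical): it only records its size.
// A real pool would start `nthreads` workers in the constructor and join
// them in the destructor.
class sized_pool
  {
  size_t nthreads_;
  public:
    explicit sized_pool(size_t nthreads) : nthreads_(nthreads) {}
    size_t size() const { return nthreads_; }
  };

// Variant of the resize-on-demand idea with shared ownership.
// nthreads==0 means "return whatever pool currently exists".
inline std::shared_ptr<sized_pool> get_pool_shared(size_t nthreads=0)
  {
  static std::mutex mut;
  static std::shared_ptr<sized_pool> pool = std::make_shared<sized_pool>(1);
  std::lock_guard<std::mutex> lock(mut);  // serialize lookup and resize
  if ((nthreads!=0) && (nthreads!=pool->size()))
    pool = std::make_shared<sized_pool>(nthreads);  // old pool lives on
                                                    // while callers hold it
  return pool;
  }
```

With this shape, a caller that obtained a pool before a resize keeps a valid object until it drops its `shared_ptr`. The price is a mutex acquisition on every call, and several pools may briefly exist side by side.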

@g-peterbell do you think this could work, or am I missing some subtle multithreading issue? First tests look OK, but with concurrency I'd like to hear a second opinion :)
