The code needs to be fast at FFTs, so this issue focuses on FFT performance.
Here are preliminary scaling results for the code itself, obtained for the 1536^3 test cases (attached as scaling.pdf). For this discussion, only the "ftest" line is relevant. My interpretation is that the direct FFTW approach scales quite reasonably.
Procedure for the plot:
- take a snapshot from a 1536^3 DNS, then run four different DNS for 64 time steps with this snapshot as the initial condition (the four runs are summarized in the sketch after this list):
- "ftest": only run Navier Stokes solver
- "ptest-1e5": same as "ftest", but add 10^5 particles, with sampling at every timestep.
- "ptest-2e7": same as "ftest", but add 2 x 10^7 particles, with sampling at every timestep.
- "ptest-2e7-lessiO": same as "ftest", but add 2 x 10^7 particles, with sampling at every 16 timesteps.
The jobs are run using 128, 192, 256, 384 and 512 MPI processes on draco, so whole nodes are always used at full capacity. In fact only "ftest" can run on 512 processes, since the particle code can't run if fewer than 4 z slices per slab are allocated to each MPI process (1536 slices over 512 processes gives only 3 per process).
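To make that constraint concrete, a quick check (assuming the 1536^3 grid is split evenly into z slabs):

```python
# Why 512 processes excludes the particle runs: with the 1536^3 grid
# split into z slabs, each process gets nz // nprocs slices, and the
# particle code needs at least 4 of them.
nz = 1536
for nprocs in [128, 192, 256, 384, 512]:
    slices = nz // nprocs
    status = 'ok' if slices >= 4 else 'too few for particles'
    print(f'{nprocs:4d} processes -> {slices:2d} z slices per slab ({status})')
```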
Afterwards, I read the overall execution time from the output file of each process, average over all processes for each run, and plot the result as a function of the number of processes.
To generate the plot, I am using the file https://gitlab.mpcdf.mpg.de/clalescu/bfps_addons/blob/develop/tests/timing_analyzer.py It's currently set up to work with my peculiar file structure, but I trust the "check_scaling" function is defined clearly enough.
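For reference, here is a minimal sketch of what a check_scaling-style analysis could look like. This is NOT the actual timing_analyzer.py linked above; the file naming and format (one plain-text file per run and process count, one wall time per MPI process) are assumptions for illustration.

```python
# Minimal sketch of a check_scaling-style analysis; NOT the actual
# timing_analyzer.py linked above. The per-run file layout
# ('<run>_<nprocs>.txt', one wall time per MPI process) is an assumption.
import numpy as np
import matplotlib.pyplot as plt

def check_scaling(run_name, proc_counts):
    """Return the mean per-process execution time for each process count."""
    mean_times = []
    for nprocs in proc_counts:
        times = np.loadtxt(f'{run_name}_{nprocs}.txt')  # hypothetical layout
        mean_times.append(times.mean())
    return np.array(mean_times)

base_counts = [128, 192, 256, 384]
for run in ['ftest', 'ptest-1e5', 'ptest-2e7', 'ptest-2e7-lessiO']:
    # only "ftest" has a 512-process data point (see above)
    counts = base_counts + [512] if run == 'ftest' else base_counts
    plt.loglog(counts, check_scaling(run, counts), 'o-', label=run)
plt.xlabel('number of MPI processes')
plt.ylabel('mean execution time [s]')
plt.legend()
plt.savefig('my_scaling_plot.pdf')  # hypothetical output name
```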