@jeff-cohere
Created October 13, 2020 22:58
Scaling study for Richards demo

Here are the top 10 hotspots for a series of runs of the Richards demo, using 1, 2, and 4 processes on grids from 20x20x10 to 160x160x10.

nprocs=1, 20x20x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    1.81532      1.99715e+08
          Main Stage               TDyTimeIntegratorRunToTime    1.80311      1.98985e+08
          Main Stage                                SNESSolve    1.80084      1.98985e+08
          Main Stage                         SNESJacobianEval    1.60136      2.52543e+07
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    1.60133      2.52543e+07
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    1.58773      2.51143e+07
       TDycore Setup                                  summary   0.574035          576000.
       TDycore Setup                   TDyDriverInitializeTDy   0.526048          576000.
       TDycore Setup                                 TDySetup   0.299569          576000.
       TDycore Setup                       TDyMPFAOInitialize   0.299541          576000.

nprocs=1, 40x40x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    5.04824      6.17778e+08
          Main Stage               TDyTimeIntegratorRunToTime    5.01368      6.14869e+08
          Main Stage                                SNESSolve    5.00478      6.14869e+08
          Main Stage                         SNESJacobianEval    4.43187      7.33948e+07
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    4.43183      7.33948e+07
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    4.38936      7.29948e+07
       TDycore Setup                                  summary    2.22651        2.304e+06
       TDycore Setup                   TDyDriverInitializeTDy    2.09313        2.304e+06
       TDycore Setup                                 TDySetup    1.19311        2.304e+06
       TDycore Setup                       TDyMPFAOInitialize    1.19309        2.304e+06

nprocs=1, 80x80x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    17.8945      2.06509e+09
          Main Stage               TDyTimeIntegratorRunToTime    17.6165      2.05347e+09
          Main Stage                                SNESSolve    17.5806      2.05347e+09
          Main Stage                         SNESJacobianEval    15.5841      2.60539e+08
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    15.5841      2.60539e+08
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    15.4252      2.59131e+08
       TDycore Setup                                  summary    8.63444        9.216e+06
       TDycore Setup                   TDyDriverInitializeTDy    8.22892        9.216e+06
       TDycore Setup                                 TDySetup    4.52274        9.216e+06
       TDycore Setup                       TDyMPFAOInitialize    4.52272        9.216e+06

nprocs=1, 160x160x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    59.4124      8.86982e+09
          Main Stage               TDyTimeIntegratorRunToTime    57.7581      8.82339e+09
          Main Stage                                SNESSolve    57.6145      8.82339e+09
          Main Stage                         SNESJacobianEval    49.4889      8.08691e+08
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    49.4888      8.08691e+08
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    48.9497      8.04339e+08
       TDycore Setup                                  summary      36.97       3.6864e+07
       TDycore Setup                   TDyDriverInitializeTDy    35.7561       3.6864e+07
       TDycore Setup                                 TDySetup    18.6543       3.6864e+07
       TDycore Setup                       TDyMPFAOInitialize    18.6542       3.6864e+07
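
One way to read the single-process listings above is to ask how much of the Main Stage is spent evaluating the Jacobian. The sketch below is only an illustration: the numbers are copied from the `summary` and `SNESJacobianEval` rows for nprocs=1, the times are assumed to be in seconds, and the names `nprocs1_times` and `jacobian_share` are made up for this example.

```python
# Main Stage times for nprocs=1, copied from the listings above.
# Each entry is (Main Stage summary, SNESJacobianEval); units assumed to be seconds.
nprocs1_times = {
    "20x20x10":   (1.81532, 1.60136),
    "40x40x10":   (5.04824, 4.43187),
    "80x80x10":   (17.8945, 15.5841),
    "160x160x10": (59.4124, 49.4889),
}

def jacobian_share(main_stage_time, jacobian_time):
    """Fraction of the Main Stage time spent in SNESJacobianEval."""
    return jacobian_time / main_stage_time

for grid, (main_stage_time, jacobian_time) in nprocs1_times.items():
    share = jacobian_share(main_stage_time, jacobian_time)
    print(f"{grid:>10}  Jacobian share of Main Stage: {share:.1%}")
```

On these numbers the Jacobian evaluation accounts for well over 80% of the Main Stage time at every grid size.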

nprocs=2, 20x20x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    1.02672      1.16233e+08
          Main Stage               TDyTimeIntegratorRunToTime   0.957031      1.15823e+08
          Main Stage                                SNESSolve   0.955431      1.15826e+08
          Main Stage                         SNESJacobianEval    0.84859      1.11841e+07
          Main Stage              TDyMPFAOSNESJacobian_3DMesh   0.848545      1.11841e+07
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh   0.841271      1.11221e+07
       TDycore Setup                                  summary   0.305952          323560.
       TDycore Setup                   TDyDriverInitializeTDy   0.284166          323560.
       TDycore Setup                                 TDySetup   0.154784          323560.
       TDycore Setup                       TDyMPFAOInitialize   0.154766          323560.

nprocs=2, 40x40x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary     3.2248      4.14123e+08
          Main Stage               TDyTimeIntegratorRunToTime    3.02377      4.12578e+08
          Main Stage                                SNESSolve    3.01927      4.12682e+08
          Main Stage                         SNESJacobianEval    2.68598      3.67022e+07
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    2.68595      3.67022e+07
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    2.66645      3.64926e+07
       TDycore Setup                                  summary    1.13737      1.22555e+06
       TDycore Setup                   TDyDriverInitializeTDy      1.075      1.22555e+06
       TDycore Setup                                 TDySetup    0.59671      1.22527e+06
       TDycore Setup                       TDyMPFAOInitialize   0.596693      1.22527e+06

nprocs=2, 80x80x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    12.6833       1.3097e+09
          Main Stage               TDyTimeIntegratorRunToTime    11.9438       1.3039e+09
          Main Stage                                SNESSolve     11.924       1.3039e+09
          Main Stage                         SNESJacobianEval    10.6063      1.36184e+08
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    10.6062      1.36184e+08
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    10.5244      1.35462e+08
       TDycore Setup                                  summary    4.74698      4.75794e+06
       TDycore Setup                   TDyDriverInitializeTDy    4.51119      4.75794e+06
       TDycore Setup                                 TDySetup    2.47716      4.75737e+06
       TDycore Setup                       TDyMPFAOInitialize    2.47715      4.75737e+06

nprocs=2, 160x160x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    49.3778      6.01266e+09
          Main Stage               TDyTimeIntegratorRunToTime     45.921       5.9902e+09
          Main Stage                                SNESSolve    45.8399      5.98908e+09
          Main Stage                         SNESJacobianEval    40.1297      4.28098e+08
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    40.1297      4.28098e+08
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh     39.859      4.25859e+08
       TDycore Setup                                  summary    20.1883       1.8784e+07
       TDycore Setup                   TDyDriverInitializeTDy    19.4305       1.8784e+07
       TDycore Setup                                 TDySetup    10.4836      1.87837e+07
       TDycore Setup                       TDyMPFAOInitialize    10.4836      1.87837e+07

nprocs=4, 20x20x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary     1.0204      7.08227e+07
          Main Stage               TDyTimeIntegratorRunToTime   0.949497      7.05969e+07
          Main Stage                                SNESSolve   0.948075      7.05969e+07
          Main Stage                         SNESJacobianEval   0.830733      7.26601e+06
          Main Stage              TDyMPFAOSNESJacobian_3DMesh   0.830689      7.26601e+06
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh   0.822609      6.93139e+06
       TDycore Setup                                  summary   0.206448          184175.
          Main Stage                         MatAssemblyBegin   0.190865               0.
          Main Stage                            BuildTwoSided   0.190405               0.
          Main Stage                           BuildTwoSidedF   0.190083               0.

nprocs=4, 40x40x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    2.19229      2.04364e+08
          Main Stage               TDyTimeIntegratorRunToTime    2.03096      2.08928e+08
          Main Stage                                SNESSolve    2.02758      2.08928e+08
          Main Stage                         SNESJacobianEval    1.76421      1.79599e+07
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    1.76417      1.79599e+07
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    1.74839      1.78619e+07
       TDycore Setup                                  summary   0.737202          664346.
       TDycore Setup                   TDyDriverInitializeTDy   0.700233          643193.
       TDycore Setup                                 TDySetup   0.389306          629536.
       TDycore Setup                       TDyMPFAOInitialize   0.389285          629536.

nprocs=4, 80x80x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    8.51721       6.6461e+08
          Main Stage               TDyTimeIntegratorRunToTime    7.85292      6.62084e+08
          Main Stage                                SNESSolve    7.84122      6.68737e+08
          Main Stage                         SNESJacobianEval    6.92043      6.86104e+07
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    6.92039      6.86104e+07
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    6.85059      6.82396e+07
       TDycore Setup                                  summary    2.82776      2.45151e+06
       TDycore Setup                   TDyDriverInitializeTDy    2.70493      2.45151e+06
       TDycore Setup                                 TDySetup    1.49653      2.45066e+06
       TDycore Setup                       TDyMPFAOInitialize     1.4965      2.45066e+06

nprocs=4, 160x160x10

tdyprof: showing top 10 hits (max times across ranks):
          Stage Name                               Event Name       Time             FLOP
          Main Stage                                  summary    32.2019      2.96246e+09
          Main Stage               TDyTimeIntegratorRunToTime    29.1048      2.94922e+09
          Main Stage                                SNESSolve    29.0537      2.95143e+09
          Main Stage                         SNESJacobianEval    24.8824      2.02449e+08
          Main Stage              TDyMPFAOSNESJacobian_3DMesh    24.8823      2.02449e+08
          Main Stage        TDyMPFAOIJacobian_Vertices_3DMesh    24.6805       2.0026e+08
       TDycore Setup                                  summary    12.4927      9.53986e+06
       TDycore Setup                   TDyDriverInitializeTDy    12.0201      9.53986e+06
       TDycore Setup                                 TDySetup     6.0831      9.49937e+06
       TDycore Setup                       TDyMPFAOInitialize    6.08306      9.49937e+06
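
To turn the listings above into strong-scaling numbers, the Main Stage `summary` times can be compared against the 1-process run for each grid. This is a minimal sketch, not part of tdyprof: the times are copied from the listings (assumed to be seconds, max across ranks), and `main_stage_time` and `strong_scaling` are hypothetical names used only here.

```python
# Main Stage "summary" times copied from the listings above.
# Keys are grid sizes; values map process count -> time (assumed seconds).
main_stage_time = {
    "20x20x10":   {1: 1.81532, 2: 1.02672, 4: 1.0204},
    "40x40x10":   {1: 5.04824, 2: 3.2248,  4: 2.19229},
    "80x80x10":   {1: 17.8945, 2: 12.6833, 4: 8.51721},
    "160x160x10": {1: 59.4124, 2: 49.3778, 4: 32.2019},
}

def strong_scaling(times):
    """Return {nprocs: (speedup, efficiency)} relative to the 1-process run.

    speedup    = T(1) / T(p)
    efficiency = speedup / p
    """
    t1 = times[1]
    return {p: (t1 / t, t1 / (t * p)) for p, t in sorted(times.items())}

for grid, times in main_stage_time.items():
    for p, (speedup, efficiency) in strong_scaling(times).items():
        print(f"{grid:>10}  nprocs={p}  "
              f"speedup={speedup:5.2f}  efficiency={efficiency:5.2f}")
```

With these inputs the 4-process speedup of the Main Stage falls between about 1.8 and 2.3 depending on the grid, so parallel efficiency is near or below 60% in every case. The same function can be applied to the TDycore Setup `summary` times.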
@bishtgautam

[Images: main_stage_summary, tdycore_setup_summary]
