GROMACS 2019 adh_cubic_vsites benchmark with OpenCL on a Haswell integrated GPU
:-) GROMACS - gmx mdrun, 2019 (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen
Par Bjelkmar Christian Blau Viacheslav Bolnykh Kevin Boyd
Aldert van Buuren Rudi van Drunen Anton Feenstra Alan Gray
Gerrit Groenhof Anca Hamuraru Vincent Hindriksen M. Eric Irrgang
Aleksei Iupinov Christoph Junghans Joe Jordan Dimitrios Karkoulis
Peter Kasson Jiri Kraus Carsten Kutzner Per Larsson
Justin A. Lemkul Viveca Lindahl Magnus Lundborg Erik Marklund
Pascal Merz Pieter Meulenhoff Teemu Murtola Szilard Pall
Sander Pronk Roland Schulz Michael Shirts Alexey Shvetsov
Alfons Sijbers Peter Tieleman Jon Vincent Teemu Virolainen
Christian Wennberg Maarten Wolf
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2018, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
GROMACS: gmx mdrun, version 2019
Executable: /home/eltonfc/.local//bin/gmx
Data prefix: /home/eltonfc/.local/
Working dir: /home/eltonfc/trab/software/gromacs/bench/adh_cubic_vsites
Process ID: 25079
Command line:
gmx mdrun -v -maxh .5 -notunepme
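A note on the flags: -v prints verbose step output, -maxh .5 makes mdrun stop cleanly just before 0.5 hours of wall time, and -notunepme disables automatic PME load balancing so timings stay comparable between runs. A minimal sketch of scripting the same invocation from Python (assuming gmx is on PATH, as in the header above):

    import subprocess
    # Run the benchmark exactly as in this log: verbose output, stop
    # before 0.5 h of wall time, keep PME tuning off for stable timings.
    subprocess.run(
        ["gmx", "mdrun", "-v", "-maxh", ".5", "-notunepme"],
        check=True,
    )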
GROMACS version: 2019
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: OpenCL
SIMD instructions: AVX2_256
FFT library: fftw-3.3.6-pl2-fma-sse2-avx-avx2-avx2_128
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: hwloc-1.11.2
Tracing support: disabled
C compiler: /usr/bin/cc GNU 7.3.0
C compiler flags: -mavx2 -mfma -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /usr/bin/c++ GNU 7.3.0
C++ compiler flags: -mavx2 -mfma -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
OpenCL include dir: /usr/include
OpenCL library: /usr/lib/libOpenCL.so
OpenCL version: 2.0
Running on 1 node with total 4 cores, 8 logical cores, 1 compatible GPU
Hardware detected:
CPU info:
Vendor: Intel
Brand: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Family: 6 Model: 60 Stepping: 3
Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Hardware topology: Full, with devices
Sockets, cores, and logical processors:
Socket 0: [ 0 4] [ 1 5] [ 2 6] [ 3 7]
Numa nodes:
Node 0 (16704245760 bytes mem): 0 1 2 3 4 5 6 7
Latency:
0
0 1.00
Caches:
L1: 32768 bytes, linesize 64 bytes, assoc. 8, shared 2 ways
L2: 262144 bytes, linesize 64 bytes, assoc. 8, shared 2 ways
L3: 8388608 bytes, linesize 64 bytes, assoc. 16, shared 8 ways
PCI devices:
0000:00:02.0 Id: 8086:0412 Class: 0x0300 Numa: 0
0000:00:19.0 Id: 8086:153a Class: 0x0200 Numa: 0
0000:00:1f.2 Id: 8086:8c02 Class: 0x0106 Numa: 0
GPU info:
Number of GPUs detected: 1
#0: name: Intel(R) HD Graphics Haswell GT2 Desktop, vendor: Intel, device version: OpenCL 1.2 beignet 1.3, stat: compatible
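The one compatible device is the integrated Haswell GT2 GPU, exposed through the beignet 1.3 OpenCL runtime. For a quick cross-check of what that OpenCL stack reports, a sketch using the pyopencl package (an assumption; it is not part of GROMACS):

    import pyopencl as cl
    # Enumerate OpenCL platforms and devices, mirroring GROMACS's detection.
    for platform in cl.get_platforms():
        for device in platform.get_devices():
            print(platform.name, device.name, device.version)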
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E.
Lindahl
GROMACS: High performance molecular simulations through multi-level
parallelism from laptops to supercomputers
SoftwareX 1 (2015) pp. 19-25
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Páll, M. J. Abraham, C. Kutzner, B. Hess, E. Lindahl
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with
GROMACS
In S. Markidis & E. Laure (Eds.), Solving Software Challenges for Exascale 8759 (2015) pp. 3-27
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R.
Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess, and E. Lindahl
GROMACS 4.5: a high-throughput and highly parallel open source molecular
simulation toolkit
Bioinformatics 29 (2013) pp. 845-54
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------
++++ PLEASE CITE THE DOI FOR THIS VERSION OF GROMACS ++++
https://doi.org/10.5281/zenodo.2424363
-------- -------- --- Thank You --- -------- --------
Input Parameters:
integrator = md
tinit = 0
dt = 0.005
nsteps = 10000
init-step = 0
simulation-part = 1
comm-mode = Linear
nstcomm = 100
bd-fric = 0
ld-seed = 232683026
emtol = 10
emstep = 0.01
niter = 20
fcstep = 0
nstcgsteep = 1000
nbfgscorr = 10
rtpi = 0.05
nstxout = 0
nstvout = 0
nstfout = 0
nstlog = 0
nstcalcenergy = 100
nstenergy = 500
nstxout-compressed = 0
compressed-x-precision = 1000
cutoff-scheme = Verlet
nstlist = 10
ns-type = Grid
pbc = xyz
periodic-molecules = false
verlet-buffer-tolerance = 0.005
rlist = 0.935
coulombtype = PME
coulomb-modifier = Potential-shift
rcoulomb-switch = 0
rcoulomb = 0.9
epsilon-r = 1
epsilon-rf = inf
vdw-type = Cut-off
vdw-modifier = Potential-shift
rvdw-switch = 0
rvdw = 0.9
DispCorr = No
table-extension = 1
fourierspacing = 0.1125
fourier-nx = 100
fourier-ny = 100
fourier-nz = 100
pme-order = 4
ewald-rtol = 1e-05
ewald-rtol-lj = 0.001
lj-pme-comb-rule = Geometric
ewald-geometry = 0
epsilon-surface = 0
tcoupl = V-rescale
nsttcouple = 10
nh-chain-length = 0
print-nose-hoover-chain-variables = false
pcoupl = No
pcoupltype = Isotropic
nstpcouple = -1
tau-p = 1
compressibility (3x3):
compressibility[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
compressibility[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
compressibility[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p (3x3):
ref-p[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
refcoord-scaling = No
posres-com (3):
posres-com[0]= 0.00000e+00
posres-com[1]= 0.00000e+00
posres-com[2]= 0.00000e+00
posres-comB (3):
posres-comB[0]= 0.00000e+00
posres-comB[1]= 0.00000e+00
posres-comB[2]= 0.00000e+00
QMMM = false
QMconstraints = 0
QMMMscheme = 0
MMChargeScaleFactor = 1
qm-opts:
ngQM = 0
constraint-algorithm = Lincs
continuation = false
Shake-SOR = false
shake-tol = 0.0001
lincs-order = 6
lincs-iter = 1
lincs-warnangle = 30
nwall = 0
wall-type = 9-3
wall-r-linpot = -1
wall-atomtype[0] = -1
wall-atomtype[1] = -1
wall-density[0] = 0
wall-density[1] = 0
wall-ewald-zfac = 3
pull = false
awh = false
rotation = false
interactiveMD = false
disre = No
disre-weighting = Conservative
disre-mixed = false
dr-fc = 1000
dr-tau = 0
nstdisreout = 100
orire-fc = 0
orire-tau = 0
nstorireout = 100
free-energy = no
cos-acceleration = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
simulated-tempering = false
swapcoords = no
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
applied-forces:
electric-field:
x:
E0 = 0
omega = 0
t0 = 0
sigma = 0
y:
E0 = 0
omega = 0
t0 = 0
sigma = 0
z:
E0 = 0
omega = 0
t0 = 0
sigma = 0
grpopts:
nrdf: 247713
ref-t: 300
tau-t: 0.1
annealing: No
annealing-npoints: 0
acc: 0 0 0
nfreeze: N N N
energygrp-flags[ 0]: 0
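Two parameters above set the scale of this benchmark: dt = 0.005 ps (a 5 fs step, made practical by the virtual-site hydrogen treatment this adh_cubic_vsites system uses; 11672 vsites are reported below) and nsteps = 10000. A quick check of the simulated span, which matches the "Step 10000 / Time 50.00000" lines further down:

    dt_ps = 0.005          # time step from the parameter list above
    nsteps = 10000
    print(nsteps * dt_ps)  # 50.0 ps of simulated time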
Changing rlist from 0.935 to 0.956 for non-bonded 4x2 atom kernels
Changing nstlist from 10 to 40, rlist from 0.956 to 1.094
Using 1 MPI thread
Using 8 OpenMP threads
1 GPU auto-selected for this run.
Mapping of GPU IDs to the 1 GPU task in the 1 rank on this node:
PP:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
Pinning threads with an auto-selected logical core stride of 1
System total charge: 0.000
Will do PME sum in reciprocal space for electrostatic interactions.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Using a Gaussian width (1/beta) of 0.288146 nm for Ewald
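The Gaussian width follows from rcoulomb and ewald-rtol: GROMACS picks the Ewald splitting parameter beta such that erfc(beta * rc) is approximately ewald-rtol. Checking the value reported above against the input parameters:

    import math
    beta = 1.0 / 0.288146         # nm^-1, from the Gaussian width above
    print(math.erfc(beta * 0.9))  # ~1e-5, i.e. the ewald-rtol of the run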
Potential shift: LJ r^-12: -3.541e+00 r^-6: -1.882e+00, Ewald -1.111e-05
Initialized non-bonded Ewald correction tables, spacing: 8.85e-04 size: 1018
Generated table with 1046 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1046 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1046 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Using GPU 4x4 nonbonded short-range kernels
Using a dual 4x2 pair-list setup updated with dynamic, rolling pruning:
outer list: updated every 40 steps, buffer 0.194 nm, rlist 1.094 nm
inner list: updated every 4 steps, buffer 0.011 nm, rlist 0.911 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
outer list: updated every 40 steps, buffer 0.304 nm, rlist 1.204 nm
inner list: updated every 4 steps, buffer 0.040 nm, rlist 0.940 nm
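The buffers in this dual-list setup are simply rlist minus the 0.9 nm interaction cut-off: the outer list must absorb 40 steps of particle drift, while the frequently pruned inner list only needs to cover 4. Verifying against the figures above:

    rcut = 0.9           # nm, max(rcoulomb, rvdw)
    print(1.094 - rcut)  # 0.194 nm outer buffer, as reported
    print(0.911 - rcut)  # 0.011 nm inner buffer, as reported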
Using Lorentz-Berthelot Lennard-Jones combination rule
Removing pbc first time
Initializing LINear Constraint Solver
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess
P-LINCS: A Parallel Linear Constraint Solver for molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 116-122
-------- -------- --- Thank You --- -------- --------
The number of constraints is 13140
3504 constraints are involved in constraint triangles,
will apply an additional matrix expansion of order 6 for couplings
between constraints inside triangles
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
G. Bussi, D. Donadio and M. Parrinello
Canonical sampling through velocity rescaling
J. Chem. Phys. 126 (2007) pp. 014101
-------- -------- --- Thank You --- -------- --------
There are: 124552 Atoms
There are: 11672 VSites
Constraining the starting coordinates (step 0)
Constraining the coordinates at t0-dt (step 0)
RMS relative constraint deviation after constraining: 3.07e-05
Initial temperature: 299.88 K
Started mdrun on rank 0 Wed Jan 23 22:17:07 2019
Step Time
0 0.00000
Energies (kJ/mol)
Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
2.52212e+04 5.17017e+04 2.06788e+03 2.32931e+04 2.87705e+05
LJ (SR) Coulomb (SR) Coul. recip. Potential Kinetic En.
2.00473e+05 -2.26452e+06 1.95926e+04 -1.65446e+06 3.22293e+05
Total Energy Conserved En. Temperature Pressure (bar) Constr. rmsd
-1.33217e+06 -1.33217e+06 3.12967e+02 4.47346e+02 6.07222e-05
Writing checkpoint, step 6000 at Wed Jan 23 22:32:12 2019
Step Time
10000 50.00000
Writing checkpoint, step 10000 at Wed Jan 23 22:42:16 2019
Energies (kJ/mol)
Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
2.39171e+04 5.14713e+04 1.90487e+03 2.33074e+04 2.88157e+05
LJ (SR) Coulomb (SR) Coul. recip. Potential Kinetic En.
1.97225e+05 -2.25992e+06 1.95731e+04 -1.65436e+06 3.08572e+05
Total Energy Conserved En. Temperature Pressure (bar) Constr. rmsd
-1.34579e+06 -1.33699e+06 2.99642e+02 4.23996e+02 5.20327e-05
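The Temperature column can be recovered from the kinetic energy and the 247713 degrees of freedom listed under grpopts, via Ekin = (Ndf/2) * kB * T. Checking the step-10000 frame:

    kB = 0.0083144626   # kJ/(mol K), Boltzmann constant in GROMACS units
    nrdf = 247713       # degrees of freedom (nrdf under grpopts)
    ekin = 3.08572e+05  # kJ/mol, Kinetic En. at step 10000
    print(2 * ekin / (nrdf * kB))  # ~299.64 K, matching 2.99642e+02 above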
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Statistics over 10001 steps using 101 frames
Energies (kJ/mol)
Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
2.40135e+04 5.16104e+04 1.97281e+03 2.32118e+04 2.88337e+05
LJ (SR) Coulomb (SR) Coul. recip. Potential Kinetic En.
1.97950e+05 -2.26159e+06 1.94700e+04 -1.65502e+06 3.09169e+05
Total Energy Conserved En. Temperature Pressure (bar) Constr. rmsd
-1.34585e+06 -1.33423e+06 3.00222e+02 3.95776e+02 0.00000e+00
Total Virial (kJ/mol)
8.72490e+04 2.18022e+02 -2.04698e+01
2.17895e+02 8.71912e+04 -3.73210e+01
-2.03466e+01 -3.83831e+01 8.75039e+04
Pressure (bar)
3.96783e+02 -5.85625e+00 8.10214e-01
-5.85304e+00 4.01535e+02 -6.28920e-01
8.07118e-01 -6.02217e-01 3.89011e+02
M E G A - F L O P S A C C O U N T I N G
NB=Group-cutoff nonbonded kernels NxN=N-by-N cluster Verlet kernels
RF=Reaction-Field VdW=Van der Waals QSTab=quadratic-spline table
W3=SPC/TIP3p W4=TIP4p (single or pairs)
V&F=Potential and force V=Potential only F=Force only
Computing: M-Number M-Flops % Flops
-----------------------------------------------------------------------------
Pair Search distance check 4723.039760 42507.358 0.1
NxN Ewald Elec. + LJ [F] 793216.982880 52352320.870 90.7
NxN Ewald Elec. + LJ [V&F] 8092.108144 865855.571 1.5
1,4 nonbonded interactions 562.216216 50599.459 0.1
Calc Weights 4087.128672 147136.632 0.3
Spread Q Bspline 87192.078336 174384.157 0.3
Gather F Bspline 87192.078336 523152.470 0.9
3D-FFT 398671.243138 3189369.945 5.5
Solve PME 100.010000 6400.640 0.0
Shift-X 34.192224 205.153 0.0
Angles 325.472544 54679.387 0.1
Propers 548.454840 125596.158 0.2
Impropers 40.564056 8437.324 0.0
Virial 13.763169 247.737 0.0
Stop-CM 13.894848 138.948 0.0
Calc-Ekin 272.720448 7363.452 0.0
Lincs 131.439420 7886.365 0.0
Lincs-Mat 3692.627456 14770.510 0.0
Constraint-V 1391.078160 11128.625 0.0
Constraint-Vir 12.719940 305.279 0.0
Settle 376.112800 121484.434 0.2
Virtual Site 3 21.012160 777.450 0.0
Virtual Site 3fd 19.880736 1888.670 0.0
Virtual Site 3fad 3.313456 583.168 0.0
Virtual Site 3out 57.217728 4977.942 0.0
Virtual Site 4fdn 16.486464 4187.562 0.0
-----------------------------------------------------------------------------
Total 57716385.269 100.0
-----------------------------------------------------------------------------
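The flop table shows where the arithmetic goes: the NxN Ewald Elec. + LJ [F] row alone accounts for nine tenths of all floating-point work, which is precisely the short-range part offloaded to the GPU in this run. Reproducing the percentage:

    # NxN force-kernel M-Flops over the table total
    print(100 * 52352320.870 / 57716385.269)  # ~90.7 %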
R E A L C Y C L E A N D T I M E A C C O U N T I N G
On 1 MPI rank, each using 8 OpenMP threads
Computing: Num Num Call Wall time Giga-Cycles
Ranks Threads Count (s) total sum %
-----------------------------------------------------------------------------
Vsite constr. 1 8 10001 1.000 28.733 0.1
Neighbor search 1 8 251 5.410 155.445 0.4
Launch GPU ops. 1 8 10001 1353.401 38888.866 89.6
Force 1 8 10001 6.941 199.443 0.5
PME mesh 1 8 10001 121.314 3485.853 8.0
Wait GPU NB local 1 8 10001 0.425 12.220 0.0
NB X/F buffer ops. 1 8 19751 6.577 188.988 0.4
Vsite spread 1 8 10102 1.572 45.169 0.1
Write traj. 1 8 2 0.505 14.506 0.0
Update 1 8 10001 5.537 159.111 0.4
Constraints 1 8 10003 6.719 193.054 0.4
Rest 0.691 19.863 0.0
-----------------------------------------------------------------------------
Total 1510.092 43391.251 100.0
-----------------------------------------------------------------------------
Breakdown of PME mesh computation
-----------------------------------------------------------------------------
PME spread 1 8 10001 40.884 1174.769 2.7
PME gather 1 8 10001 29.190 838.744 1.9
PME 3D-FFT 1 8 20002 48.085 1381.693 3.2
PME solve Elec 1 8 10001 3.060 87.920 0.2
-----------------------------------------------------------------------------
GPU timings
-----------------------------------------------------------------------------
Computing: Count Wall t (s) ms/step %
-----------------------------------------------------------------------------
Pair list H2D 251 0.377 1.500 0.0
X / q H2D 10001 18.341 1.834 0.0
Nonbonded F kernel 9900 55340232448.731 5589922469 50.0
Nonbonded F+ene k. 101 62.366 617.484 0.0
Pruning kernel 251 6.736 26.836 0.0
F D2H 10001 55340232519.377 5533469904 50.0
-----------------------------------------------------------------------------
Total 110680465055.927 1106693981 100.0
-----------------------------------------------------------------------------
*Dynamic pruning 4750 25.872 5.447 0.0
-----------------------------------------------------------------------------
Average per-step force GPU/CPU evaluation time ratio: 11066939811.612 ms/12.824 ms = 862973673.837
For optimal resource utilization this ratio should be close to 1
NOTE: The GPU has >25% more load than the CPU. This imbalance wastes
CPU resources.
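The per-kernel wall times above are physically impossible (the whole run lasted about 25 minutes), so the GPU/CPU ratio and the NOTE that follows it should be disregarded. The likely culprit, an assumption on my part, is broken OpenCL event timing in the beignet 1.3 runtime, which would also explain the 89.6 % of wall time charged to "Launch GPU ops." in the CPU table. The scale of the corruption:

    wall_s = 55340232448.731          # "Wall t (s)" reported for the F kernel
    print(wall_s / (365.25 * 86400))  # ~1754 years of alleged kernel time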
Core t (s) Wall t (s) (%)
Time: 12080.732 1510.092 800.0
(ns/day) (hour/ns)
Performance: 2.861 8.389
Finished mdrun on rank 0 Wed Jan 23 22:42:17 2019
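The headline performance figures follow directly from the simulated span and the total wall time in the cycle accounting; a quick check:

    sim_ns = 10000 * 0.005 / 1000   # 0.050 ns simulated
    wall_s = 1510.092               # total wall time from the accounting above
    ns_day = sim_ns * 86400 / wall_s
    print(ns_day)                   # ~2.861 ns/day
    print(24 / ns_day)              # ~8.389 hour/ns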