
4. Restarting an interrupted run

There are several situations that might require restarting the THERMO_PW code. We must distinguish two cases. In the first, THERMO_PW stopped while running QUANTUM ESPRESSO routines, either because the code reached the maximum CPU time or because some external event stopped the run. In the second, THERMO_PW stopped after doing some post-processing task. The second case also covers the normal termination of THERMO_PW followed by the need to change some details of a plot by rerunning the post-processing tools without redoing the QUANTUM ESPRESSO calculations.

Support for the first case is based on the recover features provided by the QUANTUM ESPRESSO routines and usually works when images are not used. This restart method needs the files in the outdir directory. In this case THERMO_PW behaves as QUANTUM ESPRESSO, except that max_seconds in the input of pw.x or of ph.x is not active. To run thermo_pw for a fixed number of seconds, max_seconds must be set in the THERMO_CONTROL namelist. If the code stopped inside pw.x, restart_mode must be set to 'restart' in the input of pw.x, while if the code stopped inside the ph.x routines, recover must be set to .TRUE. in the input of ph.x.
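
As an illustration, the settings described above could look as follows. This is only a sketch of the relevant lines of three separate input files, not a complete input: the namelist name &INPUT_THERMO for the thermo_control file is an assumption, the value of max_seconds is arbitrary, and all the other variables that each file needs (prefix, outdir, etc.) are omitted.

```fortran
! thermo_control file (namelist name assumed to be &INPUT_THERMO)
&INPUT_THERMO
  max_seconds=82800.0      ! arbitrary example value, about 23 hours
/

! pw.x input: needed only if the run stopped inside pw.x
&CONTROL
  restart_mode='restart'
/

! ph.x input: needed only if the run stopped inside the ph.x routines
&INPUTPH
  recover=.TRUE.
/
```

In both cases the outdir directory of the interrupted run must still be present.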

When running thermo_pw with several images and calculating a phonon dispersion or using the what='mur_lc_t' option, max_seconds is controlled by the image driver of thermo_pw. Presently, after max_seconds a signal is sent to the asynchronous driver and the master stops sending new tasks to the images. The code stops when all the images have terminated their current task. Recovering from this point is possible without losing any previous work by keeping the outdir directory and by setting recover=.TRUE. in the ph.x input. Note however that when thermo_pw is stopped by the operating system in an unclean way, this restart method might not work.

As a last resort you can remove the outdir directory, and THERMO_PW will not recalculate the quantities contained in files that are already in the working directory. Completed phonon calculations at a given geometry for which the dynamical matrices are available are not redone if you put the fildyn name in the thermo_control namelist. It is possible to stop thermo_pw after the calculation of the phonon dispersions for a fixed number of geometries by setting the input variable max_geometries in the THERMO_CONTROL namelist. Alternatively, you can specify exactly which geometries to do in a given run using the variables start_geometry and last_geometry, or start_geometry_qha and last_geometry_qha.
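
For example, a run limited to a few geometries might use a fragment like the one below. It is a sketch under stated assumptions: the namelist name &INPUT_THERMO is assumed, the value 2 is arbitrary, and 'matdyn' is only a placeholder for the name of your own dynamical matrices files.

```fortran
! thermo_control file (namelist name assumed to be &INPUT_THERMO)
&INPUT_THERMO
  what='mur_lc_t',
  max_geometries=2,        ! arbitrary example: stop after the dispersions of 2 geometries
  fildyn='matdyn'          ! placeholder name of the dynamical matrices files
/
```

Submitting the same input again should then continue with the following geometries, since the dynamical matrices already present in the working directory are not recomputed.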

In general the restart of thermo_pw from a post-processing task is much easier. Each routine checks if a file with the same name as the file that it would produce is already in the working directory and, if so, reads its content and returns. This feature cannot be disabled from input. In order to recalculate a given quantity, just remove the file that contains it from the working directory.

For instance, in an anharmonic calculation, if you already have the dynamical matrices for all the geometries and you no longer have the outdir directory, it is possible to skip entirely the phonon calculations and the reading of the files produced by pw.x by setting the variable after_disp=.TRUE. and giving the name of the dynamical matrices file using the variable fildyn in the THERMO_CONTROL namelist. In this case THERMO_PW can compute the anharmonic properties with a different set of temperatures, with a different sampling of the phonon frequencies, etc. You need to erase the output files that contain the phonon dos or the thermal properties from a previous calculation, keep the dynamical matrices files and the restart directory, and rerun THERMO_PW. Similarly, if the files containing the band energy eigenvalues are already in the working directory, it is possible to set the input variable only_bands_plot to change the bands plot without redoing the bands calculation. Note however that in this case it is not possible to change the Brillouin zone path.
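
A minimal sketch of the thermo_control settings for such a post-processing rerun is given below. The namelist name &INPUT_THERMO is an assumption and 'matdyn' is a placeholder for the actual name of the dynamical matrices files; the variables that set the temperatures or the phonon sampling are not shown.

```fortran
! thermo_control file (namelist name assumed to be &INPUT_THERMO)
&INPUT_THERMO
  what='mur_lc_t',
  after_disp=.TRUE.,       ! skip the phonon calculations and the reading of the pw.x files
  fildyn='matdyn'          ! placeholder name of the existing dynamical matrices files
/
```

Before rerunning, remove the old phonon dos and thermal properties files so that they are recalculated instead of being read back.
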
The following variables can be used to stop THERMO_PW before it concludes all the calculations:

max_seconds: the code stops after max_seconds have elapsed. 
              Note that the check is not done continuously so 
              the clean stop might occur a few minutes after 
              max_seconds.
              Default: real 10E8

max_geometries: the code stops after computing the dispersions 
              for max_geometries geometries.
              Default: integer 1000000

start_geometry: the code starts doing the phonons from the 
              geometry with index start_geometry.
              Default: integer 1

last_geometry: the code does only the phonons for geometries 
              with index lower than last_geometry.
              Default: integer total number of geometries.

Note that the first time that you use start_geometry and last_geometry the code makes a self-consistent pw.x run for all the geometries and saves the results in the outdir directory. If for any reason the outdir directory is removed after computing some geometries, just remove the restart directory as well: the code will recreate the information in outdir but will not recalculate the dynamical matrices that are already available.

Finally we consider some typical runs of the most time-consuming option, what='mur_lc_t', but what we say is also valid for the computation of the phonon dispersions at a single geometry. With a single processor or with a personal computer with a small number of processors you can run the code without interruption. In general in these cases it is not useful to use the image parallelization before exploiting all the parallelization levels of QUANTUM ESPRESSO, since images have an overhead due to the need to reinitialize the phonon calculation and recalculate the bands. If you cannot complete the calculation in a single run, you need to submit the THERMO_PW run several times. In this case you can set start_geometry=last_geometry and max_seconds in the thermo_pw input, and recover=.TRUE. in the ph.x input. You need to repeat this for all the geometries, and finally you can collect the results and compute the anharmonic properties. Note that when start_geometry or last_geometry are set by the user the anharmonic calculation is skipped.
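
The inputs for one such run, restricted to a single geometry, might look like this. Again this is only a sketch: the namelist name &INPUT_THERMO is assumed, the geometry index 3 and the value of max_seconds are arbitrary, and the remaining variables of both files are omitted.

```fortran
! thermo_control file (namelist name assumed to be &INPUT_THERMO)
&INPUT_THERMO
  what='mur_lc_t',
  start_geometry=3,        ! arbitrary example: work only on geometry 3
  last_geometry=3,
  max_seconds=82800.0      ! arbitrary example value, about 23 hours
/

! ph.x input
&INPUTPH
  recover=.TRUE.
/
```

If a run stops because of max_seconds, the same input can be resubmitted until that geometry is complete, and then the index is changed. Since the anharmonic calculation is skipped whenever start_geometry or last_geometry are set, the final run that collects the results presumably has to leave both variables unset.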


2024-09-24