cspy-moldis command
The cspy-moldis command is used to distort molecular conformers to yield a database of molecular conformations around local conformational minima.
The conformation adopted by a molecule in the solid state may not be the same as its conformational minimum in a vacuum. The conformations which are used should be considered within the context of their relative energy. The energy penalty associated with distortion from the vacuum minimum may be compensated for by gains in lattice energy, but the higher the molecular strain, the less likely the conformation is to be found in a stable crystal structure.
cspy-moldis [-h] [--scan_sobol SCAN_SOBOL] [--scan_dofs SCAN_DOFS [SCAN_DOFS ...]]
[--constraints CONSTRAINTS [CONSTRAINTS ...]] [--jointdb] [--charges CHARGES]
[--potential {F,W}] [--multiplicities MULTIPLICITIES] [--xtb]
[--foreshorten_hydrogens FORESHORTEN_HYDROGENS] [-fun FUNCTIONAL]
[-bas BASIS_SET] [-basfile BASIS_FILE] [-gmem GAUSSIAN_MEMORY]
[-gcpu GAUSSIAN_CPUS] [--chelpg] [--gaussian_cleanup] [-vol] [-pcm] [-epcm]
[-esp DIELECTRIC_CONSTANT] [-gopt OPT] [--iso ISO] [--freq FREQ]
[--gaussian_all_molecules] [--additional_args ADDITIONAL_ARGS]
[--set_molecular_states SET_MOLECULAR_STATES] [--force_rerun | --no-force_rerun]
[--log-level LOG_LEVEL] [--keep-files] [--skip-header]
[--status-file STATUS_FILE]
[filenames ...]
positional arguments
filenames - Names of the files containing molecular coordinates. These can be .xyz, .fchk, or a mixture of both file types. Alternatively, you can specify conformational databases along with the --jointdb flag to combine these into one (see the --jointdb flag).
options
--scan_sobol SCAN_SOBOL - Uses a Sobol vector for the DOF scan. Specify the number of scan points required (default: 0)
--scan_dofs SCAN_DOFS - A list containing the coordinates to scan. Please see documentation for information. Atoms should be indexed starting from 1. (default: )
--constraints CONSTRAINTS - A list of atom numbers you wish to constrain during a Gaussian optimisation. Atoms should be indexed starting from 1. e.g. --constraints '1 2 3' '5 6 7 8' '9' would lead to the fixing of the angle between atoms 1, 2, and 3, the dihedral angle between atoms 5, 6, 7 and 8, and the atomic position of atom 9. (default: [])
--jointdb - Join a list of databases into a single database
--charges CHARGES - Charge of each molecule given in args.filenames as a single Python string in the same order as the filenames (e.g. '0 -1 1'). Note: If all molecules are neutral, then there is no need to specify this flag. (default: 0)
--potential POTENTIAL - Intermolecular potential type (default: F)
--multiplicities MULTIPLICITIES - Multiplicity of each molecule given in args.filenames as a single Python string (e.g. '1 3 1'). Note: If all molecules have a multiplicity of 1, then there is no need to specify this flag. (default: 1)
--xtb - Do a scan with xtb
--foreshorten_hydrogens FORESHORTEN_HYDROGENS - Length to foreshorten hydrogens for dma files after an xtb scan. Usually None for FIT, 0.1 for W99
-fun FUNCTIONAL, --functional FUNCTIONAL - Electronic structure functional for Gaussian calculation (a string passed verbatim into Gaussian input file, e.g. B3LYP or PBEPBE) (default: B3LYP)
-bas BASIS_SET, --basis_set BASIS_SET - Basis set to use in each Gaussian run (default: 6-311G**)
-basfile BASIS_FILE, --basis_file BASIS_FILE - File containing basis set information that is not standard in Gaussian (default: )
-gmem GAUSSIAN_MEMORY, --gaussian_memory GAUSSIAN_MEMORY - Memory to use in each Gaussian run (default: 2GB)
-gcpu GAUSSIAN_CPUS, --gaussian-cpus GAUSSIAN_CPUS - Number of CPUs to use for each Gaussian calculation (default: 1)
--chelpg - Use CHelpG to calculate partial charges
--gaussian_cleanup - Clean up Gaussian output
-vol, --molecular_volume - Calculate molecular volume
-pcm, --polarizable_continuum - Specify the value of epsilon to use in the polarisable continuum model, e.g. '3' or '9.2'
-epcm, --external_iteration_pcm - Use a Polarizable Continuum Model
-esp DIELECTRIC_CONSTANT, --dielectric_constant DIELECTRIC_CONSTANT - Dielectric constant for Polarizable Continuum (default: 3.0)
-gopt OPT, --opt OPT - Options within Gaussian optimisation (default: ModRedundant,)
--iso ISO - ISO value (default: )
--freq FREQ - Options for frequency, use a blank to turn on default (default: False)
--gaussian_all_molecules - Run Gaussian on all molecules.
--additional_args ADDITIONAL_ARGS - Additional arguments for Gaussian (default: )
--set_molecular_states SET_MOLECULAR_STATES - Set the charge and spin multiplicity for each molecule in the crystal, e.g. '0,1 0,3 -1,1' would set the first molecule as a singlet, second as a triplet and third as a negatively charged singlet state. Enter as a string.
--force_rerun - Force re-running of Gaussian - NOTE: not widely used.
--log-level LOG_LEVEL - Log level (default: INFO)
--keep-files - Keep DMACRYS and NEIGHCRYS files which, for each structure, are stored in a new directory in the pwd.
--skip-header - Skip the mol-CSPy header at the start of the job.
--status-file STATUS_FILE - Specify output status file (default: status.txt)
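As a rough illustration of how the --constraints groups described in the option list map onto constraint types, the following Python sketch classifies each group by the number of atom indices it contains. The function name classify_constraint is hypothetical and is not part of mol-CSPy. Only the three cases given in the help text (one, three and four atoms) are handled, because the help text does not describe other group sizes.

```python
def classify_constraint(group: str) -> tuple[str, tuple[int, ...]]:
    """Classify one --constraints group, e.g. '1 2 3', by its atom count.

    Hypothetical helper for illustration only; not part of mol-CSPy.
    """
    atoms = tuple(int(tok) for tok in group.split())
    if any(a < 1 for a in atoms):
        raise ValueError("atoms are indexed starting from 1")
    # Mapping taken from the example in the --constraints help text.
    kinds = {1: "position", 3: "angle", 4: "dihedral"}
    if len(atoms) not in kinds:
        # Other group sizes are not described in the help text.
        raise ValueError(f"unsupported group size: {len(atoms)}")
    return kinds[len(atoms)], atoms


for group in ["1 2 3", "5 6 7 8", "9"]:
    print(classify_constraint(group))
```

Each quoted group on the command line corresponds to one call of this helper.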
Note
cspy-moldis requires either Psi4 or Gaussian to be installed.
Overview
cspy-moldis accepts molecular conformers in either .xyz or .fchk format, which are provided on the command line as the first arguments:
cspy-moldis conf0.xyz conf1.xyz ...
These conformers are each distorted by rotating the specified groups of atoms about the specified axis. A finite basis set DFT single-point calculation is then run to obtain the energy, point charges, and multipoles of each distorted conformation.
The output is one conformational database per input conformation, containing the geometry as well as the energy, point charges, multipoles, and molecular axes. This database can then be provided as input to cspy-flex.
Scanning DOFs
mol-CSPy does not make judgements on conformational flexibility by itself. It falls upon the user to decide which degrees of freedom should be sampled.
The greater the number of degrees of freedom, the more conformations, and the greater the cost of the CSP. The scanned DOFs should therefore be selected carefully.
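To give a feel for how quickly the cost grows, the sketch below counts grid points under the assumption that a scan over several DOFs generates every combination of steps. The text here does not state that explicitly, so treat the numbers as an upper-bound estimate. The function grid_size is a hypothetical helper, not part of mol-CSPy.

```python
import math


def grid_size(steps_per_dof: list[int]) -> int:
    """Number of distorted conformers per input conformer.

    Assumes every combination of DOF steps is generated (a full grid);
    this is an assumption, not documented mol-CSPy behaviour.
    """
    return math.prod(steps_per_dof)


# Seven steps per DOF, for one to four DOFs.
for n_dofs in (1, 2, 3, 4):
    print(n_dofs, grid_size([7] * n_dofs))
```

Under that assumption, each extra DOF with seven steps multiplies the number of single-point calculations by seven.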
For the purposes of defining DOFs, all atoms are indexed starting from 1. This provides better compatibility with the software that interfaces with mol-CSPy's flexible workflow.
Furthermore, when setting up DOFs it is recommended that the .xyz molecule is loaded in Mercury and that atoms are labelled according to file ordering. This allows the user to see the index of each atom in the molecule.
Defining a DOF
Each DOF must be defined as a torsion comprising four atoms.
We refer to the coordinates (c) of the torsion by the indices of each atom.
For a torsion between atoms 4, 3, 6, and 7, the DOF may be defined as:
"{c:'4_3_6_7'}"
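The coordinate string can also be assembled programmatically. The sketch below is a hypothetical helper (torsion_dof is not part of mol-CSPy) that joins four 1-based atom indices with underscores. The check for distinct atoms is an added assumption, included because a torsion with a repeated atom is not geometrically meaningful.

```python
def torsion_dof(atoms: tuple[int, int, int, int]) -> str:
    """Return a DOF string such as {c:'4_3_6_7'} for four 1-based atom indices.

    Hypothetical helper for illustration only; not part of mol-CSPy.
    """
    if len(atoms) != 4:
        raise ValueError("a torsion needs exactly four atoms")
    if len(set(atoms)) != 4:
        # Assumption: the four atoms of a torsion must all differ.
        raise ValueError("the four atoms must be distinct")
    if min(atoms) < 1:
        raise ValueError("atoms are indexed starting from 1")
    return "{c:'" + "_".join(str(a) for a in atoms) + "'}"


print(torsion_dof((4, 3, 6, 7)))  # {c:'4_3_6_7'}
```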
Sobol Scanning
The user can quasi-randomly scan distortions around that torsion by providing --scan_sobol with an integer that defines how many distortions to generate.
Grid Scanning
The recommended approach to scanning DOFs is a grid scan. No additional flag is required, but the parameters of the scan must be defined in the DOF. There are four terms that should be used to define a scan:
n (number of steps)
s (step size)
o (offset)
i (initial)
n is the number of distorting steps that will be applied to the conformation. This will directly determine the number of the distorted conformers.
s is the size of the step applied at each distortion. The sign determines the direction of the distortion. By default, radians are assumed, but the value may be provided in degrees if followed by .*D (for example, 15.*D).
o is the offset of the sampling. By default, distortions will be applied in a single direction away from the input conformation. An offset can be provided such that the range of distortions centers around the input conformation. Again, radians are assumed but degrees are accepted.
i is mutually exclusive with the offset and serves the same purpose, but explicitly defines the starting value of the parameter. This may not be appropriate if multiple conformations are provided.
The DOF below defines a torsion of atoms 4, 3, 6, and 7, and instructs moldis to create 7 distortions (n) with a step size of 15 degrees (s) and an offset of -45 degrees (o).
"{c:'4_3_6_7',n:7,s:15.*D,o:-45.*D}"
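To see which torsion changes this DOF would sample, the sketch below lists them under the assumption that the k-th distortion is offset + k * step for k = 0 to n - 1. The text does not state whether the first point sits at the offset itself or one step beyond it, so this is an illustration rather than a description of the moldis internals. The function grid_angles_deg is hypothetical.

```python
import math


def grid_angles_deg(n: int, step_deg: float, offset_deg: float = 0.0) -> list[float]:
    """Torsion changes in degrees relative to the input conformation.

    Assumes the k-th distortion is offset + k * step for k = 0..n-1;
    this indexing is an assumption, not documented mol-CSPy behaviour.
    """
    return [offset_deg + k * step_deg for k in range(n)]


angles = grid_angles_deg(n=7, step_deg=15.0, offset_deg=-45.0)
print(angles)  # [-45.0, -30.0, -15.0, 0.0, 15.0, 30.0, 45.0]

# The same values in radians, the default unit for s and o.
print([round(math.radians(a), 4) for a in angles])
```

With these parameters the sampled range is symmetric about the input conformation, which matches the stated purpose of the offset.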
Joining Databases
The output of cspy-moldis is one database per input conformation. If the user wishes to have a single database comprising all distorted conformations, they can run cspy-moldis again, replacing the input files with a list of databases and providing the --jointdb flag.
This will yield a single database that is suitable for cspy-flex.
Examples
The following bash commands run cspy-moldis for two conformations of a single molecule that has 3 degrees of freedom.
dof1="{c:'4_3_6_7',n:7,s:15.*D,o:-45.*D}"
dof2="{c:'3_6_7_9',n:7,s:15.*D,o:-45.*D}"
dof3="{c:'6_7_9_19',n:7,s:15.*D,o:-45.*D}"
mpirun -np 4 cspy-moldis conf0.xyz conf1.xyz --scan_dofs "$dof1" "$dof2" "$dof3"
The following bash command runs cspy-moldis to combine two conformer databases into one.
mpirun -np 4 cspy-moldis conf0.db conf1.db --jointdb