ForceConstants¶
The ForceConstants class creates objects that store the system information and load (or calculate) force constant matrices (IFCs) to be used by an instance of the Phonon class. This is normally the first thing you should initialize when working with kALDo and we’ll walk you through how to do that in the following sections on this page. Initializing a ForceConstants object will also initialize SecondOrder and ThirdOrder containers that you can access as attributes. Refer to the section on Loading Precalculated IFCs.
Calculating IFCs in kALDo¶
We use the finite difference approach, sometimes called the frozen-phonon method. This method neglects temperature effects on phonon frequencies and velocities, but it’s a good approximation if your system is far from melting (well below the Debye temperature).
The basic idea is that we explicitly calculate the change in forces on atoms when they are displaced from their equilibrium positions, and use this to approximate the true derivative of the potential energy at equilibrium. Aim to keep the displacements small, but not so small that the potential you’re using can’t resolve the change in forces. With empirical potentials like Tersoff you can try quite small displacements (1e-5 Angstroms), but with more complex potentials like machine learning potentials you may need larger displacements (1e-3 Angstroms).
For a system with N atoms, the number of second order IFCs is \((3N)^2\) but, because each frame returns \(3N\) forces, we only need to calculate \(\frac{(3N)^2}{3N} = 3N\) frames. However, the finite difference method uses 2 displacements (+/-) to approximate the derivative so total number is actually \(2\times 3N = 6N\).
Each term of the third order IFCs involves two derivatives, each approximated with 2 displacements, so the total number of frames is \(2 \times 2 \times 3N \times 3N = (6N)^2\).
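As a quick sanity check, the frame counts above can be computed directly (a minimal sketch; N is the number of atoms in the supercell):

```python
def second_order_frames(n_atoms):
    # central difference: 2 displacements (+/-) per Cartesian DOF -> 2 * 3N
    return 2 * 3 * n_atoms

def third_order_frames(n_atoms):
    # 2 displacements for each of two derivatives over (3N)^2 index pairs -> (6N)^2
    return (2 * 3 * n_atoms) ** 2

print(second_order_frames(64))  # 384
print(third_order_frames(64))   # 147456
```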
Should I use kALDo to calculate my IFCs?¶
It’s generally faster to use compiled software like LAMMPS or Quantum Espresso to generate the IFCs when possible.
However, if you want to calculate the IFCs directly from Python, kALDo has some advantages because both second-
and third-order finite-difference calculations can be parallelized with n_workers. This parallel workflow has
been used successfully on HPC compute clusters, including multi-node runs, with a near-linear speed up (for calculators
that do not use thread-parallelization).
The distance_threshold option can also greatly reduce the amount of work required for third order IFC generation,
by skipping interactions between atoms that are too far apart. This is especially useful for large systems
(like glasses, or clathrates).
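The idea behind the cutoff can be sketched in plain Python (the coordinates and cutoff here are invented for illustration; kALDo applies an equivalent test internally when distance_threshold is set):

```python
import itertools
import math

# toy coordinates in Angstrom: two nearby atoms and one distant atom
positions = [(0.0, 0.0, 0.0), (1.8, 0.0, 0.0), (9.0, 0.0, 0.0)]
distance_threshold = 4.0

# keep only atom pairs close enough to interact; distant pairs are skipped entirely
kept = [(i, j)
        for i, j in itertools.combinations(range(len(positions)), 2)
        if math.dist(positions[i], positions[j]) <= distance_threshold]
print(kept)  # [(0, 1)]
```

For a glass or clathrate supercell, most pairs fall outside a physically reasonable cutoff, which is why the savings can be large.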
Some general cases where calculating IFCs directly in kALDo is a good fit are:
You want to explore the effect of different potentials on harmonic phonon properties only (so no third order needed).
You have a custom, or Python-based, calculator.
The system has no symmetry (which many compiled software packages exploit to reduce the total number of calculations).
Larger third-order calculations where a physically reasonable distance_threshold can be used to skip distant interactions.
Serial Runtime Estimate¶
If you’re not sure, SecondOrder.calculate() prints to stdout the atom it is currently working on, so run the calculation with n_workers=1 and stop it after it prints a few atoms. Take the average time per atom and divide by 6 to get the time per frame \(t_{pf}\), including overhead from I/O operations and task setup. For serial runs, the total times follow directly from the frame counts above:
\(T_{2nd} \approx 6N \cdot t_{pf} \qquad T_{3rd} \approx (6N)^2 \cdot t_{pf}\)
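Putting the frame counts and the measured \(t_{pf}\) together gives a rough wall-time estimate (a sketch; the atom count and per-frame time below are made-up example numbers):

```python
def serial_hours(n_atoms, t_per_frame_s, order=2):
    # frame counts from the finite-difference section: 6N for second order, (6N)^2 for third
    frames = 6 * n_atoms if order == 2 else (6 * n_atoms) ** 2
    return frames * t_per_frame_s / 3600.0

# e.g. 216 atoms at 0.5 s per frame
print(round(serial_hours(216, 0.5, order=2), 2))  # 0.18
print(round(serial_hours(216, 0.5, order=3), 2))  # 233.28
```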
Parallel Runtime Estimate¶
For parallel runs, kALDo distributes displaced-atom tasks across workers, so a useful first estimate for wall time is the serial estimate divided by n_workers:
\(T_{parallel} \approx T_{serial} / n_{workers}\)
This is only an approximate guide because real runs also include process startup, I/O, scratch traffic, load imbalance,
and calculator-specific overhead. For third-order calculations, using a nonzero distance_threshold can reduce the
effective amount of work even further by skipping distant interactions, so the actual runtime may be substantially lower
than the no-cutoff estimate above.
For calculators that use shared-memory parallelization (e.g. Orb), it’s likely that the fastest route will be a blend
of thread and process parallelization. Try out different combinations of n_workers and OMP_NUM_THREADS for the
best results.
Note
ML potentials (Orb, MACE, MatterSim, CPUNEP, …) need a small extra step for parallel runs because their
PyTorch-backed instances cannot be pickled across processes. See Parallel runs with ML calculators for the
recommended pattern (a top-level factory function plus an if __name__ == '__main__': guard) and the
delta_shift guidance for float32 calculators.
Calculation Workflow¶
Hint
Be sure to minimize the potential energy of your atomic positions before calculating the IFCs. The steps here assume you have already done this.
Both SecondOrder.calculate() and ThirdOrder.calculate() can distribute the finite-difference work
across multiple worker processes with the n_workers argument. In both cases, kALDo parallelizes over displaced
unit-cell atoms: second order assigns one atom’s harmonic finite-difference block to each task, while third order
assigns one first-index atom of the anharmonic tensor to each task. This keeps the workflow simple and makes it easy to
resume interrupted runs.
For second- and third-order scratch runs, intermediate results can be written to a scratch_dir. Each completed atom
writes its own scratch artifact together with a .done sentinel file, so rerunning the calculation with the same
scratch directory skips finished atoms and only recomputes missing work.
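The resume logic can be illustrated with a toy scratch directory (the file names here are illustrative only, not kALDo's exact artifact names):

```python
import tempfile
from pathlib import Path

# build a fake scratch directory like {folder}/second_order
scratch = Path(tempfile.mkdtemp()) / 'second_order'
scratch.mkdir()

# pretend atoms 0 and 1 completed before an interruption
for atom in (0, 1):
    (scratch / f'atom_{atom}.npy').touch()   # the partial result
    (scratch / f'atom_{atom}.done').touch()  # the sentinel marking it finished

# on rerun, only atoms without a .done sentinel need recomputing
finished = {int(p.stem.split('_')[1]) for p in scratch.glob('*.done')}
todo = [atom for atom in range(4) if atom not in finished]
print(todo)  # [2, 3]
```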
Import the required packages: kALDo, the ASE calculator you want to use, and either a function to build atoms (like ase.build.bulk) or ASE’s read tool (ase.io.read):
from kaldo.forceconstants import ForceConstants
from ase.io import read
from ase.build import bulk
from ase.calculators.lammpslib import LAMMPSlib
Create your Atoms, Calculator, and ForceConstants objects. If you don’t set a “folder” argument for the ForceConstants object, the default save location is “displacements”. See the ASE docs for information on their calculators:
# Create the Atoms object
atoms = bulk('C', 'diamond', a=3.567)

# Create the ASE calculator object
calc = LAMMPSlib(lmpcmds=['pair_style tersoff', 'pair_coeff * * SiC.tersoff C'],
                 parameters={'control': 'energy', 'log': 'none', 'pair': 'tersoff',
                             'mass': '1 12.01', 'boundary': 'p p p'},
                 files=['SiC.tersoff'])

# Create the ForceConstants object
fc = ForceConstants(atoms, supercell=[3, 3, 3], folder='path_to_save_IFCS/')
Now use the “calculate” methods of the second and third order objects, passing in the calculator as an argument. The “delta_shift” argument sets how far the atoms are moved:
# Second Order (for harmonic properties - phonon frequencies, velocities, heat capacity, etc.)
fc.second.calculate(calc, delta_shift=1e-5, n_workers=2)

# Third Order (for anharmonic properties - phonon lifetimes, thermal conductivity, etc.)
fc.third.calculate(calc, delta_shift=1e-5, n_workers=2)
See the carbon diamond example for a full run where we use kALDo and ASE to control LAMMPS calculations.
Hint
Some libraries, like LAMMPS, can use either a Python wrapper (LAMMPSlib) or a direct call to the binary executable (LAMMPSrun). We don’t notice a significant performance increase with the direct-call method, because the bottleneck is the I/O of data back to Python. This is part of the reason our examples use the Python wrapper, which offers more flexibility without losing access to the full LAMMPS functionality.
Parallel runs with ML calculators¶
The serial idiom shown above (calc = LAMMPSlib(...); fc.second.calculate(calc, ...)) keeps working for parallel
runs as long as the calculator can be pickled and forked safely. That’s true for analytical and shared-library
calculators (EMT, Lennard-Jones, LAMMPS), but typically not for ML potentials (Orb, MACE, MatterSim, CPUNEP, …)
that hold PyTorch models, GPU contexts, or C handles. Two issues come up in that case:
The live calculator instance can’t cross a process boundary. Spawn-based multiprocessing pickles every argument; non-picklable instances raise TypeError (kALDo’s parallel validator catches this with a copy-pasteable fix message) or, worse, fork into a worker that then crashes with BrokenProcessPool or Cannot re-initialize CUDA in forked subprocess.
Spawn re-imports your script. On any host with CUDA available, kALDo’s executor selects the spawn start method to avoid fork-vs-CUDA crashes. Spawn re-executes the entry-point module in every worker, so any top-level fc.second.calculate(...) call would re-fire inside the worker.
The recommended pattern: define a no-argument factory function at module top level and pass the function (not its
return value) as the calculator argument. Each worker calls the function once to build its own isolated
calculator. Wrap your script’s executable code in the standard if __name__ == '__main__': guard so the worker’s
re-import of the script doesn’t re-fire your kaldo calls:
from ase.build import bulk
from orb_models.forcefield import pretrained
from orb_models.forcefield.calculator import ORBCalculator
from kaldo.forceconstants import ForceConstants
def make_calculator():
    """Top-level so spawn-imported workers can reach it by name."""
    orbff = pretrained.orb_v3_conservative_inf_omat(
        device='cpu', precision='float32-highest',
    )
    return ORBCalculator(orbff, device='cpu')

if __name__ == '__main__':
    atoms = bulk('Al', 'fcc', a=4.05, cubic=True)
    fc = ForceConstants(atoms=atoms, supercell=(3, 3, 3), folder='fc_al')
    fc.second.calculate(make_calculator, delta_shift=1e-2, n_workers=4)
    fc.third.calculate(make_calculator, delta_shift=1e-2, n_workers=4)
Notice that make_calculator is passed without parentheses. Calling it (make_calculator()) would build one
instance in the parent process and try to ship it to workers — exactly the un-picklable case we’re avoiding.
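You can see why the factory pattern works using the pickle module directly: a module-level function pickles by qualified name, while a callable that isn’t reachable by name does not (a minimal sketch; the lambda stands in for a live torch-backed calculator instance):

```python
import pickle

def make_calculator():
    # stand-in factory; only the name 'make_calculator' crosses the process boundary
    return object()

payload = pickle.dumps(make_calculator)  # fine: workers re-import it by name
print(isinstance(payload, bytes))        # True

try:
    pickle.dumps(lambda: None)           # not reachable by module-level name
    print('pickled')
except (pickle.PicklingError, AttributeError):
    print('cannot pickle')               # this branch runs
```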
Choosing delta_shift¶
Float32 ML calculators produce force noise on the order of 1e-7 eV/Å. Finite-difference second derivatives
divide this by delta_shift, so a too-small delta produces FC noise that overwhelms the physics. The kALDo default
delta_shift=1e-3 is tuned for analytical calculators (EMT, LAMMPS) where forces are accurate to machine
precision; for ML potentials, prefer delta_shift=1e-2 to 5e-2. SecondOrder.calculate and
ThirdOrder.calculate warn once when delta_shift < 1e-2 and the calculator looks ML-based.
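The noise argument is simple arithmetic: a central difference divides the force difference by 2·delta_shift, so constant force noise is amplified as the displacement shrinks (the 1e-7 eV/Å figure comes from the text above):

```python
force_noise = 1e-7  # eV/A, typical float32 ML force noise

for delta in (1e-4, 1e-3, 1e-2):
    # noise propagated into the IFC estimate by the central difference
    fc_noise = force_noise / (2 * delta)
    print(f'delta_shift={delta:g}  IFC noise ~ {fc_noise:g} eV/A^2')
```

At delta_shift=1e-4 the IFC noise is ~5e-4 eV/Å², the same order as the real anharmonic signal in many systems, while 1e-2 pushes it two orders of magnitude lower.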
Common pitfalls¶
Passing the constructed instance instead of the factory. calc = ORBCalculator(...) followed by fc.second.calculate(calc, n_workers=4) will fail validation on torch-based calculators. Pass the factory function (without parentheses) instead.
Forgetting the if __name__ == '__main__': guard. Without the guard, spawn re-imports your script and every worker tries to spawn its own pool, which Python rejects (RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase). kALDo detects this before workers spawn and raises with a guard-fix message.
Defining the factory inside another function or class. Closures don’t pickle and aren’t reachable by name from spawn-imported workers. Move the def to module top level.
Setting delta_shift=1e-4 (or smaller) for ML. This is the right delta for analytical potentials, but produces ~1e-3 FC noise on float32 ML potentials, swamping the real signal. Use 1e-2 instead.
Loading Precalculated IFCs¶
Construct your ForceConstants object by using the from_folder method. The first step of
the amorphous silicon example can help you get started.
If you’d like to load IFCs into already-created instances without from_folder(), generate the ForceConstants object and then use the load() method of the SecondOrder and ThirdOrder objects to pull in data as needed.
Hint
If you just want to check harmonic data first, use the “is_harmonic” argument when creating the ForceConstants object to only load the second order IFCs. This can save considerable amounts of time if, for instance, you just need to generate a phonon dispersion along a new path.
Hint
TDEP with kALDo will only work for bulk (3D) materials.
Input Files and Formats¶
| Format | Config filename | ASE format (for config file) | 2nd Order FCs | 3rd Order FCs |
|---|---|---|---|---|
| numpy | replicated_atoms.xyz | xyz | second.npy | third.npz |
| eskm | CONFIG | dlp4 | Dyn.form | THIRD |
| lammps | replicated_atoms.xyz | xyz | Dyn.form | THIRD |
| vasp-sheng | CONTROL / POSCAR [1] | N/A / vasp [2] | FORCE_CONSTANTS_2ND | FORCE_CONSTANTS_3RD |
| qe-sheng | CONTROL / POSCAR [1] | N/A / vasp [2] | espresso.ifc2 | FORCE_CONSTANTS_3RD |
| vasp-d3q | CONTROL / POSCAR [1] | N/A / vasp [2] | FORCE_CONSTANTS_2ND | FORCE_CONSTANTS_3RD_D3Q |
| qe-d3q | CONTROL / POSCAR [1] | N/A / vasp [2] | espresso.ifc2 | FORCE_CONSTANTS_3RD_D3Q |
| hiphive | atom_prim.xyz + replicated_atoms.xyz | xyz | model2.fcs | model3.fcs |
| tdep | POSCAR/UCPOSCAR/SSPOSCAR | xyz | second.npy | third.npz |
Notes on Formats
API Reference¶
-
class kaldo.forceconstants.ForceConstants(atoms, supercell: tuple[int, int, int] = (1, 1, 1), third_supercell: tuple[int, int, int] | None = None, folder: str = 'displacement', distance_threshold: float | None = None, second_order: SecondOrder | None = None, third_order: ThirdOrder | None = None)[source]¶
A ForceConstants class object is used to create or load the second or third order force constant matrices, as well as store information related to the geometry of the system.
- Parameters:¶
- atoms : Tabulated xyz files or ASE Atoms object¶
The atoms to work on.
- supercell : tuple[int, int, int], optional¶
Size of supercell given by the number of repetitions (l, m, n) of the small unit cell in each direction. Default: (1, 1, 1)
- third_supercell : tuple[int, int, int], optional¶
Same as supercell, but for the third order force constant matrix. If not provided, it’s copied from supercell. Default: self.supercell
- folder : str, optional¶
Name to be used for the displacement information folder. Default: 'displacement'
- distance_threshold : float, optional¶
If the distance between two atoms exceeds the threshold, the interatomic force is ignored. Default: None
- second_order : SecondOrder, optional¶
Preloaded second-order force constants attached to the instance. Default: None (lazy construction)
- third_order : ThirdOrder, optional¶
Preloaded third-order force constants attached to the instance. Default: None (lazy construction)
- n_modes¶
The number of possible vibrational modes in the system from a lattice dynamics perspective. Equivalent to 3*n_atoms where the factor of 3 comes from the 3 Cartesian directions.
- Type:¶
int
- n_replicas¶
The number of repeated unit cells represented in the system. Equivalent to np.prod(supercell).
- Type:¶
int
- n_replicated_atoms¶
The number of atoms represented in the system. Equivalent to n_atoms * np.prod(supercell).
- Type:¶
int
- cell_inv¶
A 3x3 matrix which satisfies AB=I where A is the matrix of cell vectors, I is the identity matrix, and B is the cell_inv matrix.
- Type:¶
np.array(3, 3)
-
classmethod from_folder(folder: str, supercell: tuple[int, int, int] = (1, 1, 1), format: str = 'numpy', third_energy_threshold: float = 0.0, third_supercell: tuple[int, int, int] | None = None, is_acoustic_sum: bool = False, only_second: bool = False, distance_threshold: float | None = None, chunk_size: int = 100000)[source]¶
Create a finite difference object from a folder.
The folder should contain a set of files whose names and contents depend on the “format” parameter. Below is the list required for each format (also found in the api_forceconstants documentation if you prefer to read it with nicer formatting and explanations).
numpy: replicated_atoms.xyz, second.npy, third.npz
eskm: CONFIG, replicated_atoms.xyz, Dyn.form, THIRD
lammps: replicated_atoms.xyz, Dyn.form, THIRD
vasp-sheng: CONTROL/POSCAR, FORCE_CONSTANTS_2ND/FORCE_CONSTANTS, FORCE_CONSTANTS_3RD
qe-sheng: CONTROL/POSCAR, espresso.ifc2, FORCE_CONSTANTS_3RD
vasp-d3q: CONTROL/POSCAR, FORCE_CONSTANTS_2ND/FORCE_CONSTANTS, FORCE_CONSTANTS_3RD_D3Q
qe-d3q: CONTROL/POSCAR, espresso.ifc2, FORCE_CONSTANTS_3RD_D3Q
hiphive: atom_prim.xyz, replicated_atoms.xyz, model2.fcs, model3.fcs
tdep: infile.ucposcar, infile.ssposcar, infile.forceconstant, infile.forceconstant_thirdorder
- Parameters:¶
- folder : str¶
Chosen folder to load in system information.
- supercell : (int, int, int), optional¶
Number of unit cells in each cartesian direction replicated to form the input structure. Default is (1, 1, 1)
- format : 'numpy', 'eskm', 'lammps', 'vasp-sheng', 'qe-sheng', 'vasp-d3q', 'qe-d3q', 'hiphive', 'tdep'¶
Format of the force constant information being loaded into the ForceConstants object. Default is 'numpy'
- third_energy_threshold : float, optional¶
When importing sparse third order force constant matrices, energies below the threshold value in magnitude are ignored. Units: eV/Angstrom^3. Default is 0.0
- distance_threshold : float, optional¶
When calculating force constants, contributions from atoms further than the distance threshold will be ignored.
- third_supercell : (int, int, int), optional¶
Takes in the unit cell for the third order force constant matrix. Default is self.supercell
- is_acoustic_sum : bool, optional¶
If True, the acoustic sum rule is applied to the dynamical matrix. Default is False
- chunk_size : int, optional¶
Number of entries to process per chunk when reading sparse third order files. Larger values use more memory but may be faster for very large files. Default: 100000
- Returns:¶
forceconstants – A new instance of the ForceConstants class
- Return type:¶
ForceConstants object
-
unfold_third_order(reduced_third=None, distance_threshold=None)[source]¶
This method extrapolates a third order force constant matrix from a unit cell into a matrix for a larger supercell.
- elastic_prop()[source]¶
Return the stiffness tensor (aka elastic modulus tensor) of the system in GPa.
This describes the stress-strain relationship of the material and can sometimes be used as a loose predictor for thermal conductivity. Requires the dynamical matrix to be loaded or calculated.
Notes
Notation follows: Theory of the elastic constants of graphite and graphene, DOI 10.1002/pssb.200879604
Companion Objects¶
Although ForceConstants is the main user-facing entry point, it creates
separate SecondOrder and ThirdOrder objects that many users interact
with through fc.second and fc.third. These classes are mostly internal
containers, but users with non-standard input files or custom finite-difference
workflows may still find their load(...) and calculate(...) methods
useful.
SecondOrder¶
The harmonic companion object is usually accessed as ForceConstants.second. Direct use
is most helpful when you want to load second-order data in a non-standard way (e.g. not using the
ForceConstants.from_folder method) or to run the harmonic finite-difference calculation.
-
classmethod SecondOrder.load(folder: str, supercell: tuple[int, int, int] = (1, 1, 1), format: str = 'numpy', is_acoustic_sum: bool = False)[source]¶
Load second order force constants from a folder in the given format; used internally by the library. To load force constants data, ForceConstants.from_folder is recommended.
- Parameters:¶
- folder : str¶
Specifies where to load the data files.
- supercell : tuple[int, int, int]¶
The supercell for the second order force constant matrix. Default: (1, 1, 1)
- format : str¶
Format of the second order force constant information being loaded into the SecondOrder object. Default: 'numpy'
- is_acoustic_sum : bool, optional¶
If true, the acoustic sum rule is applied to the dynamical matrix. Default: False
- Returns:¶
second_order – A new instance of the SecondOrder class
- Return type:¶
SecondOrder object
-
SecondOrder.calculate(calculator=None, delta_shift=0.001, is_storing=True, is_verbose=False, n_workers=1, scratch_dir=None, keep_scratch=False)[source]¶
Calculate second-order force constants with finite differences.
This is the method typically reached through fc.second.calculate(...). It can either load an existing second.npy from self.folder when is_storing is enabled, or compute the harmonic force constants from the current structure and calculator.
See the Parallel runs with ML calculators section of the ForceConstants documentation for the recommended pattern when running torch-based calculators (Orb, MACE, MatterSim, CPUNEP) in parallel: define a no-arg factory function at module top level and pass it (without parentheses) as calculator.
- Parameters:¶
- calculator : callable or ASE Calculator instance or None¶
For serial runs, pass an ASE Calculator instance (the existing kaldo idiom). For parallel runs, pass a callable that returns a fresh ASE Calculator: a class with a no-arg constructor, a top-level factory function, functools.partial, etc. Each worker invokes the callable once to build its own isolated calculator. If None, replicated_atoms must already have a calculator attached.
- delta_shift : float, optional¶
Finite-difference displacement in Angstrom. The default 1e-3 is tuned for analytical calculators (EMT, LAMMPS). ML potentials in float32 (Orb, MACE, MatterSim, …) need 1e-2 or larger because float32 force noise (~1e-7 eV/Å) divided by a tiny delta produces FC noise that swamps the physics. A warning fires when delta_shift < 1e-2 and the calculator looks ML-based. Default: 1e-3
- is_storing : bool, optional¶
If True, try to load an existing result from self.folder first and save newly computed data after the calculation. Default: True
- is_verbose : bool, optional¶
If True, log per-atom progress information. Default: False
- n_workers : int or None, optional¶
Number of worker processes used for the displaced-atom finite-difference tasks. 1 runs serially, None uses all available workers. Each worker is capped to one OpenMP / MKL / OpenBLAS thread so that calculators with internal multithreading (PyNEP, torch CPU, numpy+MKL) don’t oversubscribe. Override by setting OMP_NUM_THREADS / MKL_NUM_THREADS in the environment before invoking. Default: 1
- scratch_dir : str or None, optional¶
Optional scratch directory for atom-by-atom intermediate files used for recovery of interrupted calculations. Pass an explicit path to override. Pass an empty string '' to disable scratch files and fall back to in-memory accumulation. Default: {folder}/second_order when self.folder is set and n_workers > 1
- keep_scratch : bool, optional¶
If True, keep scratch files after successful assembly. Default: False
ThirdOrder¶
The anharmonic companion object is usually accessed as ForceConstants.third. Direct use
is most helpful when you need to load third-order data from a specific format or
when launching the finite-difference calculation.
-
classmethod ThirdOrder.load(folder: str, supercell: tuple[int, int, int] = (1, 1, 1), format: str = 'sparse', third_energy_threshold: float = 0.0, chunk_size: int = 100000)[source]¶
Load third order force constants from a folder in the given format; used internally by the library. To load force constants data, ForceConstants.from_folder is recommended.
- Parameters:¶
- folder : str¶
Specifies where to load the data files.
- supercell : tuple[int, int, int]¶
The supercell for the third order force constant matrix. Default: (1, 1, 1)
- format : str¶
Format of the third order force constant information being loaded into ForceConstant object. Default: ‘sparse’
- third_energy_threshold : float, optional¶
When importing sparse third order force constant matrices, energies below the threshold value in magnitude are ignored. Units: eV/Angstrom^3. Default: 0.0
- chunk_size : int, optional¶
Number of entries to process per chunk when reading sparse third order files. Larger values use more memory but may be faster for very large files. Default: 100000
- Returns:¶
third_order – A new instance of the ThirdOrder class
- Return type:¶
ThirdOrder object
-
ThirdOrder.calculate(calculator=None, delta_shift=0.0001, distance_threshold=None, is_storing=True, is_verbose=False, n_workers=1, scratch_dir=None, keep_scratch=False, jat_flush_every=50)[source]¶
Calculate the third order force constants.
This is the method typically reached through fc.third.calculate(...). It can load an existing stored result from self.folder when is_storing is enabled, or compute the anharmonic force constants directly from finite-difference force evaluations.
See the Parallel runs with ML calculators section of the ForceConstants documentation for the recommended pattern when running torch-based calculators (Orb, MACE, MatterSim, CPUNEP) in parallel: define a no-arg factory function at module top level and pass it (without parentheses) as calculator.
- Parameters:¶
- calculator : callable or ASE Calculator instance¶
For serial runs, pass an ASE Calculator instance (the existing kaldo idiom). For parallel runs, pass a callable that returns a fresh ASE Calculator: a class with a no-arg constructor, a top-level factory function, functools.partial, etc. Each worker invokes the callable once to build its own isolated calculator:
from ase.calculators.emt import EMT
calculator = EMT
If None, replicated_atoms must already have a calculator attached.
- delta_shift : float¶
Finite-difference displacement in Angstrom. The default 1e-4 is tuned for analytical calculators (EMT, LAMMPS). ML potentials in float32 (Orb, MACE, MatterSim, …) need 1e-2 or larger because float32 force noise (~1e-7 eV/Å) divided by a tiny delta produces FC noise that swamps the physics. A warning fires when delta_shift < 1e-2 and the calculator looks ML-based. Default: 1e-4
- n_workers : int or None¶
Number of parallel worker processes. 1 runs serially. None uses all available CPUs. Each worker is capped to one OpenMP / MKL / OpenBLAS thread so calculators with internal multithreading (PyNEP, torch CPU, numpy+MKL) don’t oversubscribe. Override by setting OMP_NUM_THREADS / MKL_NUM_THREADS in the environment before invoking. Default: 1 (serial)
- scratch_dir : str or None¶
Directory for scratch chunk files written during calculation to keep peak memory low. Pass an explicit path to override. Pass an empty string '' to disable scratch files and fall back to in-memory accumulation. Default: {folder}/third_order when self.folder is set
- keep_scratch : bool¶
If True, scratch files are kept after assembly. Default: False
- jat_flush_every : int¶
Number of jat iterations each worker buffers before flushing to disk. Smaller values use less memory at the cost of more I/O. Default: 50