Scipy Optimization for Chemistry


PSI-K Conference

Last month I participated in the PSI-K 2022 Conferece. It was great to travel outside of Japan after three years of COVID-19 restrictions. I got some pretty neat goodies while there:

jpg

in addition to enjoying the great Swiss summer weather. Some of the talks got me thinking about optimization, which I thought I'd explore in this notebook.

Optimization

In this post, I want to see how useful the built in scipy optimizers are for quantum chemistry calculations. We'll employ them as black boxes for computing various properties. We will use PySCF as our QM driver and Benzene as a reference molecule.

h_to_eV = 27.2114
!cat Benzene.xyz
12

C         -1.21310       -0.68840        0.00000
C         -1.20280        0.70640        0.00010
C         -0.01030       -1.39480        0.00000
C          0.01040        1.39480       -0.00010
C          1.20280       -0.70630        0.00000
C          1.21310        0.68840        0.00000
H         -2.15770       -1.22440        0.00000
H         -2.13930        1.25640        0.00010
H         -0.01840       -2.48090       -0.00010
H          0.01840        2.48080        0.00000
H          2.13940       -1.25630        0.00010
H          2.15770        1.22450        0.00000

I had some strange issues with PySCF's xyz reader (maybe a buggy version), so I read the molecule in manually.

def read_geom(iname):
    atom = []
    with open(iname) as ifile:
        next(ifile)
        next(ifile)
        for line in ifile:
            split = line.split()
            atom.append([split[0], [float(x) for x in split[1:]]])
    return atom

And build.

from pyscf import gto
mol = gto.Mole()
mol.basis = "6-31G*"
mol.atom = read_geom("Benzene.xyz")
mol.verbose = 0
_ = mol.build()

Example 1: Functional Optimization

As a first example, let's try optimizing the ratio of exact exchange in the PBE0 functional for a given molecule. The goal will be to match the homo-lumo gap of a system. First, we need a reference value, for which we will use $\Delta$SCF.

cat = mol.copy()
cat.charge = 1
cat.spin = 1
_ = cat.build()
from pyscf import dft
mf = dft.RKS(mol)
mf.xc = "PBE0"
ene = mf.kernel()

mf_cat = dft.RKS(cat)
mf_cat.xc = "PBE0"
ene_cat = mf_cat.kernel()

And so the target homo-lumo band gap is:

target_gap = ene_cat - ene
print(target_gap * h_to_eV, "eV")
9.143932978023763 eV

But simply using Koopman's theorem we would haven found a value of

hidx = int(mol.nelec[0])
print(h_to_eV * (mf.mo_energy[hidx] - mf.mo_energy[hidx-1]), "eV")
7.237101946475208 eV

We can write a function which returns the error of computing the gap with Koopman's theorem using given a fixed amount of exchange.

tested_x = []
tested_y = []
def f(mix):
    # Use the LIBXC Codes for PBE_X and PBE_C
    mf.xc = str(mix) + " * HF + " + str(1.0-mix) + " * 101 + 130"
    _ = mf.kernel()
    gap = mf.mo_energy[hidx] - mf.mo_energy[hidx-1]
    err = abs(gap - target_gap) * h_to_eV
    tested_x.append(mix)
    tested_y.append(err)
    return err
print(f(0.25))
1.9068310315485535

Scipy has an optimizer which seems specialized for a single variable, so let's give it a spin.

from scipy.optimize import minimize_scalar
opt = minimize_scalar(f, bounds=(0., 1.), tol=1e-1)
from matplotlib import pyplot as plt
fig, axs = plt.subplots(1, 1, figsize=(4,3))

axs.plot(tested_x, tested_y, 'kx')
axs.axvline(opt.x, color='r', linestyle='--')

axs.set_ylabel("Error (eV)", fontsize=12)
_ = axs.set_xlabel("Ratio", fontsize=12)

png

And it looks like the winner is the classic half and half mix. It is a shame that the bounds are violated so severely though, but for it worked for our purposes.

Example 2: Geometry Optimization

As a second example, we will try performing some geometry optimization. Note that PySCF has some extra features for geometry optimization, in particular the scanner mode. But we will do things manually for illustration purposes. We will also use a smaller basis set just to keep the notebook run-time manageable.

mol = gto.Mole()
mol.basis = "STO-3G"
mol.atom = read_geom("Benzene.xyz")
mol.verbose = 0
_ = mol.build()
class OptTarget:
    def __init__(self, mol):
        self.mol = mol.copy()
        self.energies = []  # Store computed energies

    def __call__(self, pos):
        nat = len(self.mol.atom)

        # Reshape the position variable to a per atom
        rpos = pos.reshape(nat, 3)

        # Update the positions
        for i in range(nat):
            self.mol.atom[i][1] = rpos[i]
        self.mol.build()

        # Compute the energy
        mf = dft.RKS(self.mol)
        mf.xc = "PBE0"
        e = mf.kernel()
        self.energies.append(e)

        return e
from scipy.optimize import minimize
t = OptTarget(mol)
opt = minimize(t, x0=mol.atom_coords(), method="CG", tol=1e-2)
fig, axs = plt.subplots(figsize=(4,3))
axs.plot(t.energies, 'kx--')
axs.set_ylabel("Energy (a.u.)", fontsize=12)
axs.set_xlabel("Iteration", fontsize=12)
_ = axs.set_ylim(-229.5, -226)

png

With this basic approach the progress is quite slow. This is because the black box minimizer is trying to numerically approximate the gradient. Since we can get the gradient from the pyscf calculation, let's integrate it into the process.

from pyscf import grad
class OptTarget:
    def __init__(self, mol):
        self.mol = mol.copy()
        self.energies = []  # Store the computed energies
        self.grad_norm = []  # Store the norm of the computed gradients

    def __call__(self, pos):
        from numpy.linalg import norm
        nat = len(self.mol.atom)

        # Reshape the position variable to a per atom
        rpos = pos.reshape(nat, 3)

        # Update the positions
        for i in range(nat):
            self.mol.atom[i][1] = rpos[i]
        self.mol.build()

        # Compute the energy
        mf = dft.RKS(self.mol)
        mf.xc = "PBE0"
        e = mf.kernel()
        self.energies.append(e)

        # Compute the gradient
        mf = grad.RKS(mf)
        g = mf.kernel()
        self.grad_norm.append(norm(g))
        return e, g.flatten()
from scipy.optimize import minimize
t = OptTarget(mol)
opt = minimize(t, x0=mol.atom_coords(), method="CG", jac=True, tol=1e-2)
fig, axs = plt.subplots(figsize=(4,3))
axs.plot(t.energies, 'kx--', label="Energy")
axs.set_ylabel("Energy (a.u.)", fontsize=12)
axs.set_xlabel("Iteration", fontsize=12)
axs.set_ylim(-229.5, -226)
axs.legend(loc="upper left")

axs2 = axs.twinx()
axs2.plot(t.grad_norm, 'r+--', label="Force")
axs2.set_ylabel("Gradient Norm (Å)")
axs2.set_yscale("log")
_ = axs2.legend(loc="upper right")

png

Using the gradient information, convergence is much quicker.

Example 3: Contraction Coefficient Optimization

Now let's try something really fancy: optimization of the contraction coefficients of our basis set.

class OptTarget:
    def __init__(self, mol):
        self.mol = mol.copy()
        self.energies = []  # Store the computed energies

    def __call__(self, coefficients):
        # Build the basis
        basis = {"H": gto.basis.load("STO-3G", "H"),
                 "C": gto.basis.load("STO-3G", "C")}

        # Update with the new coefficients
        i = 0
        for at in basis.values():
            for shell in at:
                for coeff in shell[1:]:
                    coeff[1] = coefficients[i]
                    i += 1
        self.mol.basis = basis
        self.mol.build()

        # Compute the energy
        mf = dft.RKS(self.mol)
        mf.xc = "PBE0"
        e = mf.kernel()
        self.energies.append(e)

        return e

I'll manually write the starting coefficients of the STO-3G basis set as our initial guess.

t = OptTarget(mol)
start_coeff = [0.1543289673E+00,  0.5353281423E+00, 0.4446345422E+00, #H 1S
               0.1543289673E+00,  0.5353281423E+00, 0.4446345422E+00, #C 1S
               -0.9996722919E-01, 0.3995128261E+00, 0.7001154689E+00, #C 2S
               0.1559162750E+00,  0.6076837186E+00, 0.3919573931E+00] #C 2p

Optimize and plot. This time the TNC method is used because we are constraining the contraction coefficients to be between -1 and 1. In practice we probably would want to impose more sophisticated constraints.

opt = minimize(t, x0=start_coeff, bounds=[(-1, 1) for _ in start_coeff],
               method="TNC", tol=1e-2)
fig, axs = plt.subplots(figsize=(4,3))
axs.plot(t.energies, 'kx--')
axs.set_ylabel("Energy (a.u.)", fontsize=12)
axs.set_xlabel("Iteration", fontsize=12)
_ = axs.set_ylim(-229.4, -229)

png

The end result is that we could bring the energy down. Optimizing a basis set is a dangerous game of course, you potentially lose out on a lot of error cancellation and other properties might be negatively affected. How about the band gap?

tn = OptTarget(mol)
tcat = OptTarget(cat)
print("Initial Basis:", h_to_eV*(tn(start_coeff) - tcat(start_coeff)), "eV")
Initial Basis: -8.095511667441343 eV
tn = OptTarget(mol)
tcat = OptTarget(cat)
print("Optimized Basis:", h_to_eV*(tn(opt.x) - tcat(opt.x)), "eV")
Optimized Basis: -8.859976551095974 eV

Not bad, in this case our new basis was able to improve things slightly. There is more to the subject of basis set optimization, such as trying to keep the overlap matrix well conditioned or generation of the gradient, but that will have to be a subject of a future blog post.