Polarizable Embedding “The Frame”

Sadly both Polarizable Embedding and Projective-Embedding are often abbreviated as PE, which can lead to confusion. Here, we will instead use Frame for Fragment-based Multiscale Embedding or written-out Polarizable Embedding, while Projective-Embedding remains PE.

The Frame acronym comes is also appropiate as you can think of the Polarizable Embedding as the outer layer - the Frame that holds the solvent/environment response, while Projective-Embedding focuses wavefunction accuracy inside your embedded Quantum Mechanical region.

In this example, we start from a PDB file of acrolein in water, then we demonstrate how to:

Generate a polarizable embedding potential (.pot) and DALTON molecule file (.mol) using PyFraME,
Convert the DALTON .mol to .xyz,
Finally run a Qrunch calculation that combines a polarizable “Frame” (the solvent environment) with our Projection-based Wavefunction-in-DFT embedding (PE) to solve a small active region.

This naturally yields a three-level hierarchy:

Frame: polarizable environment (classical, distributed multipoles + polarizabilities)
DFT on the full molecule: mean-field description supplying orbitals & density
Embedded WF solver: high-level treatment inside the active region

References and background

Files used in this tutorial

Input PDB: acrolein_water.pdb (from the Zenodo dataset above) Download acrolein_water.pdb
PyFraME script: run_pyframe.py Download run_pyframe.py
Converter: mol_to_xyz.py → produces acrolein_water.xyz Download mol_to_xyz.py

You may preview the PDB interactively with the Dash Molecule 3D Viewer:

Go to the Data tab.
Click the Drag and drop or click to upload file area.
Upload acrolein_water.pdb.

1) Generate the PE Frame potentials with PyFraME

Download the helper

run_pyframe.py

Given the PDB file acrolein_water.pdb in your working directory, run your helper script.

$ python run_pyframe.py

This will create a new folder acrolein_water/ with:

acrolein_water.pot — the polarizable embedding potential
acrolein_water.mol — a DALTON molecule input file

2) Convert DALTON `.mol` → `.xyz`

Qrunch expects an .xyz. So we use the converter:

Download the converter

mol_to_xyz.py

Run it

python mol_to_xyz.py acrolein_water/acrolein_water.mol acrolein_water.xyz

3) Run Qrunch with a PE Frame + Projective-Embedding

We now have the molecule input that define the quantum mechanical region .xyz and the potential .pot file that represents the polarizable environment. We can now run the qrunch script.

Full script

The script below shows the full script, which are then explained section by section.

"""Demo script for FAST-VQE with Polarizable Embedding and Projection based Wavefunction-in-DFT embedding."""

# pyright: reportUnknownVariableType=false, reportMissingImports=false
# ruff: noqa
from pathlib import Path

from matplotlib import pyplot as plt
import numpy as np

import qrunch as qc


def plot_energies(
    result: dict[str, list[float]],
    *,
    outdir: Path = Path("dist"),
    show: bool = False,
) -> None:
    """Plot the energies from FAST-VQE and CAS-CI."""
    energies = np.asarray(result["energies"])
    reference_energy = result["casci_energy"][0]

    iterations = np.asarray(list(range(len(energies))))
    # Create the plot
    fig1 = plt.figure()
    plt.plot(iterations, energies, color="darkgreen", marker="o", label="FAST-VQE")

    # Add a dashed horizontal line at the CAS-CI value.
    plt.axhline(y=reference_energy, color="black", linestyle="--", label="CAS-CI")

    plt.xlabel("Iteration number")
    plt.ylabel("Total energy [Hartree]")
    plt.title("FAST-VQE convergence")
    plt.legend()

    convergence_plot = outdir / "frame_vqe_convergence.png"
    fig1.savefig(convergence_plot, dpi=200, bbox_inches="tight")

    # -----------------------------
    # Figure 2: |Energy - Reference| (log scale)
    # -----------------------------
    # Use absolute error; add a tiny epsilon to avoid log(0) if we hit the reference exactly.
    eps = np.finfo(float).eps
    abs_error = np.abs(energies - reference_energy) + eps

    fig2 = plt.figure()
    plt.semilogy(iterations, abs_error, "-*", color="darkgreen", label="|E - E_ref|")
    plt.xlabel("Iteration number")
    plt.ylabel("Absolute error [Hartree]")
    plt.title("FAST-VQE error vs reference")
    plt.legend()
    plt.tight_layout()

    error_convergence_plot = outdir / "frame_vqe_error_semilogy.png"
    fig2.savefig(error_convergence_plot, dpi=200, bbox_inches="tight")

    if show:
        plt.show()
    else:
        # Close figures to free memory when running in batch contexts.
        plt.close(fig1)
        plt.close(fig2)




def main(
    *,
    calculate_casci: bool = False,
    max_iterations: int = 200,
    path_to_molecule_xyz_file: Path = Path.cwd() / "acrolein_water.xyz",
    path_to_molecule_pot_file: Path = Path.cwd() / "acrolein_water.pot",
    outdir: Path = Path("dist"),
) -> dict[str, list[float]]:
    """
    Run FAST-VQE on acrolein in water using embedding.

    First the Polarizable Embedding is used to model the solvent (water) environment,
    and then we apply the Projection based Wavefunction-in-DFT embedding scheme
    to treat a specific part of the acrolein molecule using our
    quantum computer algorithm.
    """
    solvent = qc.solvent_creator().frame(potential_file=Path(path_to_molecule_pot_file)).create()

    # Create a Molecule configuration (molecule + basis, spin, etc)
    # This originate from a large pdb file, so the outer region is
    # now defined by the Solvent potential_file, and the
    # embedded_atoms define the "active region"
    # So here we chose atom index 0 and 1: Oxygen and Carbon
    molecular_configuration = qc.build_molecular_configuration(
        molecule=Path(path_to_molecule_xyz_file),
        basis_set="sto3g",
        charge=0,
        spin_difference=0,
        units="angstrom",
        solvent=solvent,
        embedded_atoms=[0, 1],
    )

    pe_problem_builder = (
        qc.problem_builder_creator()
        .ground_state()
        .projective_embedding()
        .add_problem_modifier()
        # We further define a Complete Active Space (CAS) here we define CAS(10,10)
        .active_space(
            number_of_active_spatial_orbitals=10,
            number_of_active_alpha_electrons=5,
        )
        .choose_orbital_assigner()
        .total_weight(assignment_tolerance=0.5)
        .create()
    )

    # We apply the Projection based Wavefunction-in-DFT embedding scheme on the molecule.
    pe_problem = pe_problem_builder.build_restricted(molecular_configuration=molecular_configuration)

    # Create an adaptive VQE based ground state calculator.
    fast_vqe_calculator = (
        qc.calculator_creator()
        .vqe()
        .iterative()
        .standard()
        .with_options(
            options=qc.options.IterativeVqeOptions(
                max_iterations=max_iterations,
            )
        )
        .choose_stopping_criterion()
        .patience(patience=5, threshold=1e-4)
        .create()
    )

    # Calculate the ground state energy using the FAST-VQE calculator using our simulator
    energy_result = fast_vqe_calculator.calculate(pe_problem)

    # You could create a standard CAS-CI ground state calculator.
    if calculate_casci:
        casci_calculator = (
            qc.calculator_creator()
            .configuration_interaction()
            .standard()  # FCI (full space) or CASCI (if problem has an active space)
            .create()
        )
        casci_result = casci_calculator.calculate(pe_problem)
        casci_energy = casci_result.total_energy.value
    else:
        # Here we cheat and provide the energy directly
        casci_energy = -187.72474201929262  # CASCI-in-DFT value

    energies = energy_result.total_energy_per_macro_iteration_with_initial_energy_and_final_energy.values
    output: dict[str, list[float]] = {
        "energies": energies,
        "casci_energy": [casci_energy],
    }
    plot_energies(output, outdir=outdir)

    return output


if __name__ == "__main__":
    output_dict = main()

Explanation

Import the QDK library

# pyright: reportUnknownVariableType=false, reportMissingImports=false
# ruff: noqa
from pathlib import Path

from matplotlib import pyplot as plt
import numpy as np

import qrunch as qc

import qrunch as qc loads the public, stable API. For most users this is the only import needed, and the one guaranteed to remain stable across versions.

The plotting method

def plot_energies(
    result: dict[str, list[float]],
    *,
    outdir: Path = Path("dist"),
    show: bool = False,
) -> None:
    """Plot the energies from FAST-VQE and CAS-CI."""
    energies = np.asarray(result["energies"])
    reference_energy = result["casci_energy"][0]

    iterations = np.asarray(list(range(len(energies))))
    # Create the plot
    fig1 = plt.figure()
    plt.plot(iterations, energies, color="darkgreen", marker="o", label="FAST-VQE")

    # Add a dashed horizontal line at the CAS-CI value.
    plt.axhline(y=reference_energy, color="black", linestyle="--", label="CAS-CI")

    plt.xlabel("Iteration number")
    plt.ylabel("Total energy [Hartree]")
    plt.title("FAST-VQE convergence")
    plt.legend()

    convergence_plot = outdir / "frame_vqe_convergence.png"
    fig1.savefig(convergence_plot, dpi=200, bbox_inches="tight")

    # -----------------------------
    # Figure 2: |Energy - Reference| (log scale)
    # -----------------------------
    # Use absolute error; add a tiny epsilon to avoid log(0) if we hit the reference exactly.
    eps = np.finfo(float).eps
    abs_error = np.abs(energies - reference_energy) + eps

    fig2 = plt.figure()
    plt.semilogy(iterations, abs_error, "-*", color="darkgreen", label="|E - E_ref|")
    plt.xlabel("Iteration number")
    plt.ylabel("Absolute error [Hartree]")
    plt.title("FAST-VQE error vs reference")
    plt.legend()
    plt.tight_layout()

    error_convergence_plot = outdir / "frame_vqe_error_semilogy.png"
    fig2.savefig(error_convergence_plot, dpi=200, bbox_inches="tight")

    if show:
        plt.show()
    else:
        # Close figures to free memory when running in batch contexts.
        plt.close(fig1)
        plt.close(fig2)

This code plot the convergence of the VQE energy and the error relative to the reference. Complete Active Space Configuration Interaction (CASCI) energy.

The code generates a PNG file under dist/:

frame_vqe_convergence.png
frame_vqe_error_semilogy.png

Define the molecular configuration

    solvent = qc.solvent_creator().frame(potential_file=Path(path_to_molecule_pot_file)).create()

    # Create a Molecule configuration (molecule + basis, spin, etc)
    # This originate from a large pdb file, so the outer region is
    # now defined by the Solvent potential_file, and the
    # embedded_atoms define the "active region"
    # So here we chose atom index 0 and 1: Oxygen and Carbon
    molecular_configuration = qc.build_molecular_configuration(
        molecule=Path(path_to_molecule_xyz_file),
        basis_set="sto3g",
        charge=0,
        spin_difference=0,
        units="angstrom",
        solvent=solvent,
        embedded_atoms=[0, 1],
    )

We specify the acrolein + water system from the .xyz file, set the STO-3G basis, neutral charge, and zero spin difference. The key addition is the solvent (The SolventFrame) that include the polarizable embedding potential.

Furthermore, the embedded atoms are defined as the first two atoms (O and C of acrolein).

Build the ground-state problem

    pe_problem_builder = (
        qc.problem_builder_creator()
        .ground_state()
        .projective_embedding()
        .add_problem_modifier()
        # We further define a Complete Active Space (CAS) here we define CAS(10,10)
        .active_space(
            number_of_active_spatial_orbitals=10,
            number_of_active_alpha_electrons=5,
        )
        .choose_orbital_assigner()
        .total_weight(assignment_tolerance=0.5)
        .create()
    )

    # We apply the Projection based Wavefunction-in-DFT embedding scheme on the molecule.
    pe_problem = pe_problem_builder.build_restricted(molecular_configuration=molecular_configuration)

We create a standard ground-state problem using the problem builder.

Create and run the FAST-VQE calculator

    # Create an adaptive VQE based ground state calculator.
    fast_vqe_calculator = (
        qc.calculator_creator()
        .vqe()
        .iterative()
        .standard()
        .with_options(
            options=qc.options.IterativeVqeOptions(
                max_iterations=max_iterations,
            )
        )
        .choose_stopping_criterion()
        .patience(patience=5, threshold=1e-4)
        .create()
    )

    # Calculate the ground state energy using the FAST-VQE calculator using our simulator
    energy_result = fast_vqe_calculator.calculate(pe_problem)

We specify a maximum number of iterations, build and run the FAST-VQE calculator.

Compute a high-accuracy FCI reference

    # You could create a standard CAS-CI ground state calculator.
    if calculate_casci:
        casci_calculator = (
            qc.calculator_creator()
            .configuration_interaction()
            .standard()  # FCI (full space) or CASCI (if problem has an active space)
            .create()
        )
        casci_result = casci_calculator.calculate(pe_problem)
        casci_energy = casci_result.total_energy.value
    else:
        # Here we cheat and provide the energy directly
        casci_energy = -187.72474201929262  # CASCI-in-DFT value

We build a CASCI calculator and calculate the CASCI energy for the same problem.

Make the plots

    energies = energy_result.total_energy_per_macro_iteration_with_initial_energy_and_final_energy.values
    output: dict[str, list[float]] = {
        "energies": energies,
        "casci_energy": [casci_energy],
    }
    plot_energies(output, outdir=outdir)

We plot the VQE convergence and the error relative to the CASCI reference, by calling the plotting function defined above.

Running the Example

After saving the script as run_acrolein.py, you can run it directly from the command line:

$ python run_acrolein.py

You should see 2 plots:

Naturally you can increase the convergence criteria, and max iterations to get a more accurate result.