Choose a Data Persister Manager

Goal

Save and/or load intermediate results to resume calculations after crashes or to reuse expensive computations (e.g., mean-field, orbital localization) across runs.

Prerequisites

  • A builder instance that supports .choose_data_persister_manager() (e.g., ground state problem builders, reaction path builders, and VQE calculator creators)

Overview

The data persister manager controls where and how intermediates are stored.

import qrunch as qc

problem_builder = (
    qc.problem_builder_creator()
    .ground_state()
    .projective_embedding()
    .choose_data_persister_manager()
    # .<pick-a-persister>(...)
    .create()
)

Currently, a single option is available:

  • .file_persister(...): stores intermediates on disk (persistent across program runs)

The data persister manager can also be specified for reaction path builders and, importantly, for calculator creators. For instance:

import qrunch as qc

vqe_calculator = (
    qc.calculator_creator()
    .vqe()
    .iterative()
    .standard()
    .choose_data_persister_manager()
    # .<pick-a-persister>(...)
    .create()
)

Options

File Persister

Save intermediates to a directory on disk.

problem_builder = (
    qc.problem_builder_creator()
    .ground_state()
    .projective_embedding()
    .choose_data_persister_manager()
    .file_persister(
        directory="my_cache_directory",
        extension=".qdk",
        do_save=True,
        load_policy="raise_on_hash_collision",
        overwriting_policy="rename",
        padding_width=3
    )
    .create()
)

Parameters:

  • directory: directory where files will be stored

  • extension: optional file extension (default .qdk)

  • do_save: if False, disables saving (default True)

  • load_policy: controls what happens when data already exists:

    • "off": do not load cached data

    • "raise_on_hash_collision": raise an error if a file exists but its metadata does not match

    • "fallback": if files are missing or invalid, recompute the data from scratch

    • "expected": require the data to exist and raise an error if it is missing

The "expected" behaviour enables more permissive file validation. If an exact metadata match is not found, validation is retried using relaxed criteria. This allows files to be loaded when mismatches are caused by minor code changes.

Early versions of Qrunch produced files with non-optimal metadata, so many files written with those versions fail strict metadata checks.

With the "raise_on_hash_collision" behaviour, such files are rejected because no exact match exists. With the "expected" behaviour, the system attempts to load the file despite metadata mismatches, allowing files that would otherwise be discarded to be accepted.
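The four load policies described above can be modelled as a small decision function. The following is a toy sketch in plain Python, not Qrunch's implementation: CachedFile, load_or_compute, and the strict_match / relaxed_match flags are hypothetical names introduced for illustration. Two behaviours are assumptions that the text does not state: that "raise_on_hash_collision" computes from scratch when no file exists, and that "expected" raises an error when even the relaxed check fails.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CachedFile:
    """Toy stand-in for a file on disk (hypothetical, not a Qrunch type)."""

    data: str
    strict_match: bool   # metadata matches exactly
    relaxed_match: bool  # metadata matches under relaxed criteria


def load_or_compute(
    policy: str,
    cached: Optional[CachedFile],
    compute: Callable[[], str],
) -> str:
    """Return cached or freshly computed data according to the load policy."""
    if policy == "off":
        # Never load cached data.
        return compute()

    if policy == "raise_on_hash_collision":
        if cached is None:
            # Assumption: with no file present, compute from scratch.
            return compute()
        if not cached.strict_match:
            raise ValueError("file exists but its metadata does not match")
        return cached.data

    if policy == "fallback":
        # Missing or invalid files are silently recomputed.
        if cached is not None and cached.strict_match:
            return cached.data
        return compute()

    if policy == "expected":
        if cached is None:
            raise FileNotFoundError("cached data was expected but is missing")
        # Strict validation first, then a retry with relaxed criteria.
        if cached.strict_match or cached.relaxed_match:
            return cached.data
        # Assumption: a file that fails both checks is rejected.
        raise ValueError("file failed strict and relaxed validation")

    raise ValueError(f"unknown load_policy: {policy!r}")
```

In this model, a file written by an early version (strict_match=False, relaxed_match=True) is rejected under "raise_on_hash_collision", recomputed under "fallback", and loaded under "expected".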

  • overwriting_policy: controls what happens when you save data that already exists:

    • "overwrite": overwrites the existing files

    • "error": raises an error if a file exists

    • "rename": appends _001_, _002_, … to create a new filename

The default behaviour was previously "rename". Starting with version "qrunch_1.1.0", the default is "overwrite".

When using a file persister during a VQE calculation, the "rename" behaviour could generate hundreds of files, one per iteration, resulting in cluttered directories.

The default is therefore now "overwrite", which keeps directories clean by replacing existing files. If you need to retain all intermediate files, you can explicitly set the behaviour to "rename".

Note that this is not an issue for problem builders, where no intermediate files are generated, so the default has no effect one way or the other.

  • padding_width: controls the zero-padding of the counter added to filenames when renaming. The default is 3, which gives counters such as 001 and 002; it can be reduced or increased
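To make the interaction between overwriting_policy and padding_width concrete, here is a minimal sketch in plain Python. It is not Qrunch's implementation: save and next_free_name are hypothetical helpers, and the exact position of the counter in the filename (here, between the stem and the extension) is an assumption.

```python
from pathlib import Path

POLICIES = ("overwrite", "error", "rename")


def next_free_name(directory: Path, stem: str, extension: str, padding_width: int) -> Path:
    """Return the first unused path of the form <stem>_<zero-padded counter><extension>."""
    counter = 1
    while True:
        # Assumption: the counter goes between the stem and the extension.
        candidate = directory / f"{stem}_{counter:0{padding_width}d}{extension}"
        if not candidate.exists():
            return candidate
        counter += 1


def save(
    directory: Path,
    stem: str,
    payload: str,
    overwriting_policy: str = "overwrite",
    padding_width: int = 3,
    extension: str = ".qdk",
) -> Path:
    """Write payload to disk, resolving name clashes according to the policy."""
    if overwriting_policy not in POLICIES:
        raise ValueError(f"unknown overwriting_policy: {overwriting_policy!r}")

    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{stem}{extension}"

    if target.exists():
        if overwriting_policy == "error":
            raise FileExistsError(str(target))
        if overwriting_policy == "rename":
            # Keep the existing file and write under a new, numbered name.
            target = next_free_name(directory, stem, extension, padding_width)
        # "overwrite": fall through and replace the existing file.

    target.write_text(payload)
    return target
```

Under these assumptions, saving the same name repeatedly with "rename" leaves one file per call, which is how an iterative VQE run can fill a directory, while "overwrite" keeps a single file that always holds the latest payload.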

Usage Notes

  • Crash resilience: use .file_persister(...) to resume after interruption

  • Reuse across runs: file persistence allows reusing mean-field, localization, and orbital assignment results when changing active space or other downstream options.

See Also