qrunch.storage.data_persister_manager

Contains a manager that handles saving and loading using a serializer and a persister.

Functions

with_data_persister(key, metadata_builder)

Decorate the given method with a data persister for caching results.

Classes

DataPersisterManager

Manages a single DataPersister and its associated metadata.

DataPersisterManagerSubCreator

Sub-creator for data persisters.

class DataPersisterManager

Bases: object

Manages a single DataPersister and its associated metadata.

__init__(persister: PersisterProtocol, serializer: SerializerProtocol | None = None, default_load_policy: LoadPolicy = LoadPolicy.RAISE_ON_HASH_COLLISION) None

Initialize the DataPersisterManager.

A single DataPersister is used to save the data under a name that consists of a key and a unique identifier, so that this manager can keep track of many similar data items.

Parameters:
  • persister (PersisterProtocol) – An object that can handle reading and writing serialized byte data.

  • serializer (SerializerProtocol | None) – An object that can serialize the data to be saved. Defaults to DataClassJSONSerializer.

  • default_load_policy (LoadPolicy) – The default load policy for data checkpoints.

Return type:

None

do_save(key: str) bool

Return True if data associated with the external key should be saved.

Parameters:

key (str) – The external key for the data checkpoint.

Return type:

bool

get_all_metadata() dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]

Return all metadata from the underlying persister.

This is a thin pass-through that delegates to self._data_persister.get_all_metadata(). It is useful when a full inventory is needed regardless of registration state.

Return type:

dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]

get_all_unique_id_metadata_for_key(key: str) dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]

Retrieve metadata for all stored unique IDs under the given key.

This method retrieves the list of unique identifiers associated with the key using self._data_persister.list_data_for_key(key), then loads the metadata for each unique_id by calling self._data_persister.load_metadata(key, unique_id). The metadata entries are collected and returned as a mapping from each unique_id to its list of metadata entries.

Parameters:

key (str) – The external key for which metadata should be retrieved.

Return type:

dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]
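The collection step described for this method can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: InMemoryPersister and collect_metadata_for_key are hypothetical names, the stand-in only mimics the two calls named above (list_data_for_key and load_metadata), and the assumption that load_metadata returns a list of metadata entries is inferred from the return type.

```python
from collections.abc import Mapping


class InMemoryPersister:
    """Hypothetical stand-in exposing only the two calls named in the text."""

    def __init__(self, store: dict[tuple[str, str], list[Mapping[str, str]]]) -> None:
        self._store = store

    def list_data_for_key(self, key: str) -> list[str]:
        # Return every unique_id stored under the given key.
        return [uid for (stored_key, uid) in self._store if stored_key == key]

    def load_metadata(self, key: str, unique_id: str) -> list[Mapping[str, str]]:
        # Assumption: metadata comes back as a list of entries.
        return self._store[(key, unique_id)]


def collect_metadata_for_key(persister: InMemoryPersister, key: str) -> dict[str, list[Mapping[str, str]]]:
    """Map each unique_id stored under the key to its metadata entries."""
    return {
        unique_id: persister.load_metadata(key, unique_id)
        for unique_id in persister.list_data_for_key(key)
    }


persister = InMemoryPersister(
    {
        ("scf", "a1"): [{"basis": "sto-3g"}],
        ("scf", "b2"): [{"basis": "cc-pvdz"}],
        ("mp2", "c3"): [{"basis": "sto-3g"}],
    }
)
result = collect_metadata_for_key(persister, "scf")
```

Only the two entries stored under "scf" end up in the result; the entry under "mp2" is left out.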

get_all_unique_id_metadata_for_registered_keys() dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]

Collect metadata for all unique ids for every registered key.

The result is a mapping from key to the per-key mapping of unique_id to the list of metadata entries returned by get_all_unique_id_metadata_for_key.

Return type:

dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]

get_load_policy(key: str) LoadPolicy

Return the load policy for the key.

Parameters:

key (str) – The external key for the data checkpoint.

Return type:

LoadPolicy

has_data(key: str) bool

Check if data exists for the given external key.

Parameters:

key (str) – The external key for the data checkpoint.

Raises:

DataPersisterError – If the key is unregistered.

Return type:

bool

has_matching_data(key: str, unique_id: str) bool

Check if data exists for the given external key and provided unique_id.

Parameters:
  • key (str) – The external key for the data checkpoint.

  • unique_id (str) – The provided unique identifier.

Raises:

DataPersisterError – If the key is unregistered.

Return type:

bool

is_key_registered(key: str) bool

Return True if the external key is registered.

Parameters:

key (str) – The external key (unique identifier for the data checkpoint).

Return type:

bool

load(key: str, metadata: Mapping[str, str | Mapping[str, str | NestedStrDict]], cls: type[T]) T | None

Load an object and validate its generator metadata.

Parameters:
  • key (str) – The external key for the data checkpoint.

  • metadata (Mapping[str, str | Mapping[str, str | NestedStrDict]]) – The expected metadata.

  • cls (type[T]) – The expected type of the loaded object.

Raises:
  • CheckpointError – If data is not found and LoadPolicy is EXPECTED.

  • CheckpointError – If the key and unique_id match, but the metadata differs from the expected metadata, and LoadPolicy is RAISE_ON_HASH_COLLISION.

Return type:

T | None
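The way the load policy could drive the outcome of load() is sketched below. Only the two error cases come from the Raises section above. Everything else is an assumption made for illustration: the local LoadPolicy and CheckpointError stand-ins, the hypothetical resolve_load helper, the behaviour of OFF (skip loading), and returning None for any other mismatch or miss.

```python
from enum import Enum


class LoadPolicy(Enum):
    """Local stand-in mirroring the four policy names used in this module."""

    OFF = "off"
    RAISE_ON_HASH_COLLISION = "raise_on_hash_collision"
    FALLBACK = "fallback"
    EXPECTED = "expected"


class CheckpointError(Exception):
    """Local stand-in for the error named in the Raises section."""


def resolve_load(policy, stored, expected_metadata):
    """Return the stored object, return None, or raise, depending on the policy.

    `stored` is None when nothing was found, otherwise an (object, metadata) pair.
    """
    if policy is LoadPolicy.OFF:
        return None  # Assumption: OFF skips loading entirely.
    if stored is None:
        if policy is LoadPolicy.EXPECTED:
            raise CheckpointError("data was expected but not found")
        return None
    obj, stored_metadata = stored
    if stored_metadata != expected_metadata:
        if policy is LoadPolicy.RAISE_ON_HASH_COLLISION:
            raise CheckpointError("metadata differs from the expected metadata")
        return None  # Assumption: other policies let the caller recompute.
    return obj


expected = {"basis": "sto-3g"}
found = resolve_load(LoadPolicy.FALLBACK, ("result", {"basis": "sto-3g"}), expected)
missing = resolve_load(LoadPolicy.FALLBACK, None, expected)
```

A None return value leaves it to the caller to compute the result and save it afterwards.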

register(*, key: str, load_policy: LoadPolicy | None = None, do_save: bool = True) None

Register a persistence key with loading and saving rules.

When a key is registered, the internal mapping is updated by querying the underlying persister for all names associated with the key.

Parameters:
  • key (str) – The external key for the data checkpoint.

  • load_policy (LoadPolicy | None) – Whether data under this key should be loaded, and which policy should be applied.

  • do_save (bool) – Whether data under this key should be savable.

Return type:

None

register_checkpoints(persistence_checkpoints: list[str], *, load_policy: LoadPolicy, do_save: bool) None

Register persistence checkpoints in the DataPersisterManager for tracking intermediate computations.

This method ensures that specific computational steps are registered as checkpoints, allowing for data persistence and retrieval during the embedding process.

Parameters:
  • persistence_checkpoints (list[str]) – A list of keys to register.

  • do_save (bool) – Whether to enable saving for these checkpoints.

  • load_policy (LoadPolicy) – The shared load policy for these checkpoints.

Return type:

None
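How register() and register_checkpoints() relate can be sketched with a small stand-in registry. KeyRegistry is a hypothetical class written for this illustration; it keeps only the per-key rules and does not query a persister. Falling back to the default policy when load_policy is None is an assumption based on the default_load_policy parameter of the manager.

```python
from enum import Enum


class LoadPolicy(Enum):
    """Local stand-in mirroring the four policy names used in this module."""

    OFF = "off"
    RAISE_ON_HASH_COLLISION = "raise_on_hash_collision"
    FALLBACK = "fallback"
    EXPECTED = "expected"


class KeyRegistry:
    """Hypothetical registry that stores a load policy and a save flag per key."""

    def __init__(self, default_load_policy=LoadPolicy.RAISE_ON_HASH_COLLISION):
        self._default_load_policy = default_load_policy
        self._rules = {}

    def register(self, *, key, load_policy=None, do_save=True):
        # Assumption: a missing policy falls back to the default one.
        if load_policy is None:
            load_policy = self._default_load_policy
        self._rules[key] = (load_policy, do_save)

    def register_checkpoints(self, persistence_checkpoints, *, load_policy, do_save):
        # Every checkpoint shares the same load policy and save flag.
        for key in persistence_checkpoints:
            self.register(key=key, load_policy=load_policy, do_save=do_save)

    def registered_keys(self):
        return list(self._rules)

    def get_load_policy(self, key):
        return self._rules[key][0]

    def do_save(self, key):
        return self._rules[key][1]


registry = KeyRegistry()
registry.register(key="integrals")
registry.register_checkpoints(["scf", "mp2"], load_policy=LoadPolicy.FALLBACK, do_save=False)
```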

registered_keys() list[str]

Return registered keys.

This reflects the keys that have been explicitly registered via register() or register_checkpoints().

Return type:

list[str]

save(key: str, instance: Any, metadata: Mapping[str, str | Mapping[str, str | NestedStrDict]]) None

Save an object and its metadata.

If the underlying persister cannot write using the base internal name, generate a new one (thus allowing multiple internal names for the same key).

Parameters:
  • key (str) – The external key for the data checkpoint.

  • instance (Any) – The object to save.

  • metadata (Mapping[str, str | Mapping[str, str | NestedStrDict]]) – Additional metadata to be stored alongside the instance.

Return type:

None

class DataPersisterManagerSubCreator

Bases: Generic[T]

Sub-creator for data persisters.

__init__(parent: T, attr_name: str, checkpoints: list[str]) None

Initialize the data persister sub-creator.

Parameters:
  • parent (T) – The parent object to attach the data persister to.

  • attr_name (str) – The attribute name to use for the data persister.

  • checkpoints (list[str]) – A list of checkpoint names for the data persister. The checkpoints are for the class that the data persister should be used in.

Return type:

None

file_persister(directory: Path | str | None, extension: str = '.qdk', *, do_save: bool = True, load_policy: LoadPolicy | Literal['off', 'raise_on_hash_collision', 'fallback', 'expected'] = LoadPolicy.RAISE_ON_HASH_COLLISION, overwriting_policy: Literal['overwrite', 'error', 'rename', 'ignore'] = 'rename', padding_width: int = 3) T

Use a persister that saves to files on disk.

Parameters:
  • directory (Path | str | None) – The directory to save the files in. If None, saves to the current directory.

  • extension (str) – The file extension to use when saving files.

  • do_save (bool) – Whether to save data at the registered checkpoints.

  • load_policy (LoadPolicy | Literal['off', 'raise_on_hash_collision', 'fallback', 'expected']) – The load policy to use when loading data at the registered checkpoints.

  • overwriting_policy (Literal['overwrite', 'error', 'rename', 'ignore']) – The policy to use when a file already exists.

  • padding_width (int) – The width of the zero padding. A width of 3 corresponds to 001-style padding.

Return type:

T
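The interplay of the 'rename' overwriting policy and padding_width can be illustrated with a short sketch. The helper next_free_path is hypothetical, and the file name layout (an underscore separator and numbering that starts at 1) is an assumption; only the zero padding itself and the '.qdk' default extension come from the parameter descriptions above.

```python
import tempfile
from pathlib import Path


def next_free_path(directory: Path, stem: str, extension: str = ".qdk", padding_width: int = 3) -> Path:
    """Return the first zero-padded, numbered path that does not exist yet."""
    index = 1
    while True:
        # padding_width=3 turns 1 into "001", 2 into "002", and so on.
        candidate = directory / f"{stem}_{index:0{padding_width}d}{extension}"
        if not candidate.exists():
            return candidate
        index += 1


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    first = next_free_path(folder, "scf")
    first.touch()  # Simulate a file that was already saved.
    second = next_free_path(folder, "scf")
    first_name, second_name = first.name, second.name
```

Because the first name is taken once the file exists, the second call moves on to the next number instead of overwriting.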

with_data_persister(key: str, metadata_builder: Callable[[P], Mapping[str, str | Mapping[str, str | NestedStrDict]]]) Callable[[Callable[[P], T]], Callable[[P], T]]

Decorate the given method with a data persister for caching results.

This decorator wraps a given function with functionality to check if a result is already stored in the data persister manager. If stored, it loads the saved result to avoid recomputation. If not stored, it computes the result, saves it, and then returns the computed result.

Parameters:
  • key (str) – A unique string key representing the persistence checkpoint.

  • metadata_builder (Callable[[~P], Mapping[str, str | Mapping[str, str | NestedStrDict]]]) – The function to build the metadata describing the object. Must take the same input as the wrapped function.

Returns:

A decorator that can wrap a function with loading and saving logic added.

Return type:

Callable[[Callable[[~P], T]], Callable[[~P], T]]
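The load-or-compute-then-save pattern described for this decorator can be sketched with a plain dictionary in place of the data persister manager. The name with_cache, the explicit store argument, and the way entries are identified (key plus sorted metadata items) are assumptions for this illustration and are not the library's implementation.

```python
import functools


def with_cache(key, metadata_builder, store):
    """Hypothetical decorator that caches results in a plain dictionary."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The metadata builder takes the same input as the wrapped function.
            metadata = metadata_builder(*args, **kwargs)
            # Assumption: an entry is identified by the key and its metadata.
            cache_id = (key, tuple(sorted(metadata.items())))
            if cache_id in store:
                return store[cache_id]  # Stored: skip the recomputation.
            result = func(*args, **kwargs)
            store[cache_id] = result  # Not stored: compute, then save.
            return result

        return wrapper

    return decorator


store = {}
calls = []


@with_cache("square", lambda x: {"x": str(x)}, store)
def square(x):
    calls.append(x)  # Record every real computation.
    return x * x


first = square(4)
second = square(4)
third = square(5)
```

The second call with the same argument is served from the store, so the wrapped function runs only once per distinct input.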