qrunch.storage.data_persister_manager
Contains a manager that handles saving and loading using a serializer and a persister.
Functions
with_data_persister | Decorate the given method with a data persister for caching results.
Classes
DataPersisterManager | Manages a single DataPersister and its associated metadata.
DataPersisterManagerSubCreator | Sub-creator for data persisters.
- class DataPersisterManager
Bases:
object
Manages a single DataPersister and its associated metadata.
- __init__(persister: PersisterProtocol, serializer: SerializerProtocol | None = None, default_load_policy: LoadPolicy = LoadPolicy.RAISE_ON_HASH_COLLISION) None
Initialize the DataPersisterManager.
A single DataPersister is used to save the data, using a name that consists of a key and a unique identifier, so that this manager can keep track of many similar data objects.
- Parameters:
persister (PersisterProtocol) – An object that can handle reading and writing serialized byte data.
serializer (SerializerProtocol | None) – An object that can serialize. Defaults to DataClassJSONSerializer.
default_load_policy (LoadPolicy) – The default checkpoint persistence policy.
- Return type:
None
- do_save(key: str) bool
Return True if data associated with the external key should be saved.
- Parameters:
key (str) – The external key for the data checkpoint.
- Return type:
bool
- get_all_metadata() dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]
Return all metadata from the underlying persister.
This is a thin pass-through that delegates to self._data_persister.get_all_metadata(). It is useful when a full inventory is needed regardless of registration state.
- Return type:
dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]
- get_all_unique_id_metadata_for_key(key: str) dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]
Retrieve metadata for all stored unique IDs under the given key.
This method retrieves the list of unique identifiers associated with the key using self._data_persister.list_data_for_key(key), then loads the metadata for each unique_id by calling self._data_persister.load_metadata(key, unique_id). The metadata entries are collected and returned as a mapping from each unique_id to its list of metadata entries.
- Parameters:
key (str) – The external key for which metadata should be retrieved.
- Return type:
dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]
- get_all_unique_id_metadata_for_registered_keys() dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]
Collect metadata for all unique ids for every registered key.
The result is a mapping from key to the per-key mapping of unique_id to the list of metadata entries returned by get_all_unique_id_metadata_for_key.
- Return type:
dict[str, dict[str, list[Mapping[str, str | Mapping[str, str | NestedStrDict]]]]]
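The nesting of the two return types above can be illustrated with a small, self-contained sketch. It does not use the library: `store`, `metadata_for_key`, and `metadata_for_registered_keys` are hypothetical stand-ins, and the keys, unique ids, and metadata values are invented for illustration only.

```python
from collections.abc import Mapping

# Hypothetical stand-in for what the persister holds:
# store[key][unique_id] is the list of metadata entries saved under that pair.
store: dict[str, dict[str, list[Mapping[str, str]]]] = {
    "integrals": {
        "a1b2": [{"basis": "sto-3g"}],
        "c3d4": [{"basis": "cc-pvdz"}],
    },
    "orbitals": {"e5f6": [{"method": "hf"}]},
}


def metadata_for_key(key: str) -> dict[str, list[Mapping[str, str]]]:
    """Return the unique_id -> metadata-entries mapping for one key."""
    return dict(store.get(key, {}))


def metadata_for_registered_keys(
    registered: list[str],
) -> dict[str, dict[str, list[Mapping[str, str]]]]:
    """Return key -> (unique_id -> metadata entries) for the registered keys only."""
    return {key: metadata_for_key(key) for key in registered}


# Only "integrals" is treated as registered here, so "orbitals" is left out.
nested = metadata_for_registered_keys(["integrals"])
```

In this sketch, a full inventory in the style of get_all_metadata() would correspond to the whole `store`, regardless of which keys are registered.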
- get_load_policy(key: str) LoadPolicy
Return the load policy for the key.
- Parameters:
key (str) – The external key for the data checkpoint.
- Return type:
LoadPolicy
- has_data(key: str) bool
Check if data exists for the given external key.
- Parameters:
key (str) – The external key for the data checkpoint.
- Raises:
DataPersisterError – If the key is unregistered.
- Return type:
bool
- has_matching_data(key: str, unique_id: str) bool
Check if data exists for the given external key and provided unique_id.
- Parameters:
key (str) – The external key for the data checkpoint.
unique_id (str) – The provided unique identifier.
- Raises:
DataPersisterError – If the key is unregistered.
- Return type:
bool
- is_key_registered(key: str) bool
Return True if the external key is registered.
- Parameters:
key (str) – The external key (unique identifier for the data checkpoint).
- Return type:
bool
- load(key: str, metadata: Mapping[str, str | Mapping[str, str | NestedStrDict]], cls: type[T]) T | None
Load an object and validate its generator metadata.
- Parameters:
key (str) – The external key for the data checkpoint.
metadata (Mapping[str, str | Mapping[str, str | NestedStrDict]]) – The expected metadata.
cls (type[T]) – The expected type of the loaded object.
- Raises:
CheckpointError – If no data is found and the LoadPolicy is EXPECTED.
CheckpointError – If the key and unique_id match, but the metadata differs from the expected metadata, and the LoadPolicy is RAISE_ON_HASH_COLLISION.
- Return type:
T | None
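The two documented error conditions of load can be sketched as follows. This is a minimal illustration and not the library's implementation: `LoadPolicy` and `CheckpointError` are local stand-ins, `unique_id_for` assumes the unique identifier is a truncated hash of the metadata, and the behaviour for OFF and for the non-raising cases (returning None) is an assumption.

```python
import enum
import hashlib
import json
from typing import Any


class LoadPolicy(enum.Enum):
    """Local stand-in; member names follow the literals listed for file_persister."""

    OFF = "off"
    RAISE_ON_HASH_COLLISION = "raise_on_hash_collision"
    FALLBACK = "fallback"
    EXPECTED = "expected"


class CheckpointError(Exception):
    """Local stand-in for the library's CheckpointError."""


def unique_id_for(metadata: dict[str, str]) -> str:
    # Assumption: the unique id is derived from a hash of the metadata.
    payload = json.dumps(metadata, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:8]


def load(store: dict, key: str, metadata: dict[str, str], policy: LoadPolicy) -> Any:
    """Return the stored value, or None when nothing usable is found."""
    if policy is LoadPolicy.OFF:  # assumption: OFF skips loading entirely
        return None
    entry = store.get((key, unique_id_for(metadata)))
    if entry is None:
        if policy is LoadPolicy.EXPECTED:  # documented: missing data raises
            raise CheckpointError(f"no data stored for key {key!r}")
        return None
    stored_metadata, value = entry
    if stored_metadata != metadata:
        if policy is LoadPolicy.RAISE_ON_HASH_COLLISION:  # documented
            raise CheckpointError(f"metadata mismatch for key {key!r}")
        return None  # assumption: other policies treat a mismatch as a miss
    return value
```

Here `store` is a plain dictionary keyed by `(key, unique_id)`, which stands in for the persister and serializer together.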
- register(*, key: str, load_policy: LoadPolicy | None = None, do_save: bool = True) None
Register a persistence key with loading and saving rules.
When a key is registered, update the internal mapping by querying the underlying persister for all names associated with the key.
- Parameters:
key (str) – The external key for the data checkpoint.
load_policy (LoadPolicy | None) – Whether data under this key should be loaded, and which policy should be applied.
do_save (bool) – Whether data under this key should be savable.
- Return type:
None
- register_checkpoints(persistence_checkpoints: list[str], *, load_policy: LoadPolicy, do_save: bool) None
Register persistence checkpoints in the DataPersisterManager for tracking intermediate computations.
This method ensures that specific computational steps are registered as checkpoints, allowing for data persistence and retrieval during the embedding process.
- Parameters:
persistence_checkpoints (list[str]) – A list of keys to register.
do_save (bool) – Whether to enable saving for these checkpoints.
load_policy (LoadPolicy) – The shared load policy for these checkpoints.
- Return type:
None
- registered_keys() list[str]
Return registered keys.
This reflects the keys that have been explicitly registered via register() or register_checkpoints().
- Return type:
list[str]
- save(key: str, instance: Any, metadata: Mapping[str, str | Mapping[str, str | NestedStrDict]]) None
Save an object and its metadata.
If the underlying persister cannot write using the base internal name, generate a new one (thus allowing multiple internal names for the same key).
- Parameters:
key (str) – The external key for the data checkpoint.
instance (Any) – The object to save.
metadata (Mapping[str, str | Mapping[str, str | NestedStrDict]]) – Additional metadata to be stored alongside the instance.
- Return type:
None
- class DataPersisterManagerSubCreator
Bases:
Generic[T]
Sub-creator for data persisters.
- __init__(parent: T, attr_name: str, checkpoints: list[str]) None
Initialize the data persister sub-creator.
- Parameters:
parent (T) – The parent object to attach the data persister to.
attr_name (str) – The attribute name to use for the data persister.
checkpoints (list[str]) – A list of checkpoint names for the data persister. The checkpoints are for the class that the data persister should be used in.
- Return type:
None
- file_persister(directory: Path | str | None, extension: str = '.qdk', *, do_save: bool = True, load_policy: LoadPolicy | Literal['off', 'raise_on_hash_collision', 'fallback', 'expected'] = LoadPolicy.RAISE_ON_HASH_COLLISION, overwriting_policy: Literal['overwrite', 'error', 'rename', 'ignore'] = 'rename', padding_width: int = 3) T
Use a persister that saves to files on disk.
- Parameters:
directory (Path | str | None) – The directory to save the files in. If None, saves to the current directory.
extension (str) – The file extension to use when saving files.
do_save (bool) – Whether to save data at the registered checkpoints.
load_policy (LoadPolicy | Literal['off', 'raise_on_hash_collision', 'fallback', 'expected']) – The load policy to use when loading data at the registered checkpoints.
overwriting_policy (Literal['overwrite', 'error', 'rename', 'ignore']) – The policy to use when a file already exists.
padding_width (int) – The width of the zero padding. A value of 3 corresponds to 001-style padding.
- Return type:
T
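To make the interplay of the 'rename' overwriting policy and padding_width concrete, here is a rough sketch of how a free file name might be chosen. The helper `next_free_path` is hypothetical. The underscore separator, the counter starting at 1, and the exact file-name layout are assumptions; only the '.qdk' default extension and the 001-style padding come from the parameter descriptions above.

```python
import tempfile
from pathlib import Path


def next_free_path(
    directory: Path, stem: str, extension: str = ".qdk", padding_width: int = 3
) -> Path:
    """Return the first non-existing path, adding a zero-padded counter if needed."""
    candidate = directory / f"{stem}{extension}"
    counter = 1
    while candidate.exists():
        # With padding_width=3 the counter renders as 001, 002, ...
        candidate = directory / f"{stem}_{counter:0{padding_width}d}{extension}"
        counter += 1
    return candidate


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    first = next_free_path(folder, "checkpoint")
    first.touch()  # occupy the base name
    second = next_free_path(folder, "checkpoint")
    names = (first.name, second.name)
```

Under these assumptions, a second save to an already occupied name gets a padded suffix instead of overwriting the existing file.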
- with_data_persister(key: str, metadata_builder: Callable[[P], Mapping[str, str | Mapping[str, str | NestedStrDict]]]) Callable[[Callable[[P], T]], Callable[[P], T]]
Decorate the given method with a data persister for caching results.
This decorator wraps a given function with functionality to check if a result is already stored in the data persister manager. If stored, it loads the saved result to avoid recomputation. If not stored, it computes the result, saves it, and then returns the computed result.
- Parameters:
key (str) – A unique string key representing the persistence checkpoint.
metadata_builder (Callable[[~P], Mapping[str, str | Mapping[str, str | NestedStrDict]]]) – The function to build the metadata describing the object. Must take the same input as the wrapped function.
- Returns:
A decorator that can wrap a function with loading and saving logic added.
- Return type:
Callable[[Callable[[~P], T]], Callable[[~P], T]]