lazyscribe package

Subpackages

Submodules

Custom exceptions for lazyscribe.

exception lazyscribe.exception.ArtifactError[source]

Bases: LazyscribeError

Base exception for artifact errors.

exception lazyscribe.exception.ArtifactLoadError[source]

Bases: ArtifactError

Raised when an artifact cannot be loaded.

exception lazyscribe.exception.ArtifactLogError[source]

Bases: ArtifactError

Raised when an artifact cannot be logged.

exception lazyscribe.exception.InvalidVersionError[source]

Bases: LazyscribeError

Raised when an invalid version is provided.

exception lazyscribe.exception.LazyscribeError[source]

Bases: Exception

Base exception for lazyscribe errors.

exception lazyscribe.exception.ReadOnlyError[source]

Bases: LazyscribeError

Raised when a project or repository is opened in read-only mode and write operations are tried.

exception lazyscribe.exception.SaveError[source]

Bases: LazyscribeError

Raised when a project or repository is unable to save objects to the filesystem.

exception lazyscribe.exception.VersionNotFoundError[source]

Bases: LazyscribeError

Raised when the version cannot be found.
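The "Bases:" entries above form a small hierarchy, so a handler written for a base class also catches its subclasses. The sketch below uses stand-in classes that only mirror the documented inheritance; it does not import lazyscribe, and the describe helper is hypothetical.

```python
# Stand-in classes mirroring the documented "Bases:" relationships.
# They are illustrative only and are not imported from lazyscribe.
class LazyscribeError(Exception):
    """Base exception for lazyscribe errors."""


class ArtifactError(LazyscribeError):
    """Base exception for artifact errors."""


class ArtifactLoadError(ArtifactError):
    """Raised when an artifact cannot be loaded."""


class ReadOnlyError(LazyscribeError):
    """Raised on write attempts in read-only mode."""


def describe(exc: Exception) -> str:
    """Hypothetical helper: classify an error from most to least specific."""
    if isinstance(exc, ArtifactError):
        return "artifact problem"
    if isinstance(exc, LazyscribeError):
        return "other lazyscribe problem"
    return "unrelated error"
```

Checking ArtifactError before LazyscribeError matters here, because every artifact error is also a LazyscribeError.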

Experiment dataclass.

class lazyscribe.experiment.Experiment(name: str, project: Path, dir: Path = NOTHING, fs: AbstractFileSystem = NOTHING, author: str = NOTHING, last_updated_by: str = NOTHING, metrics: dict[str, float | int] = NOTHING, parameters: dict[str, Any] = NOTHING, created_at: datetime = NOTHING, last_updated: datetime = NOTHING, dependencies: dict[str, Experiment] = NOTHING, short_slug: str = NOTHING, slug: str = NOTHING, tests: list[Test] = NOTHING, artifacts: list[Artifact] = NOTHING, tags: list[str] = NOTHING, dirty: bool = NOTHING)[source]

Bases: object

Experiment data class.

This class is not meant to be initialized directly. It is meant to be used through the lazyscribe.project.Project class.

Parameters

namestr

The name of the experiment.

projectpathlib.Path

The path to the project JSON file associated with the experiment.

dirpathlib.Path, optional (default None)

Directory for the project and the experiment. If not supplied, the parent directory for the project file will be used.

authorstr, optional (default getpass.getuser())

The author of the experiment.

last_updated_bystr, optional (default None)

Last editor of the experiment. If not supplied, the author will be used.

metricsdict[str, float | int], optional (default {})

A dictionary of metric values. Each metric value can be an individual value or a list.

parametersdict[str, Any], optional (default {})

A dictionary of experiment parameters. The key must be a string but the value can be anything.

created_atdatetime.datetime, optional (default lazyscribe._utils.utcnow())

When the experiment was created (in UTC).

last_updateddatetime.datetime, optional (default lazyscribe._utils.utcnow())

When the experiment was last updated (in UTC).

short_slugstr, optional (default None)

Slugified name. Defaults to calling slugify.slugify() on the name attribute.

slugstr, optional (default None)

Unique identifier for the experiment. Defaults to the slugified name with the creation date appended in the format YYYYMMDDHHMMSS.

tagslist[str], optional (default [])

Tags for filtering and identifying experiments across a project.

dependenciesdict[str, lazyscribe.experiment.Experiment], optional (default {})

A dictionary of upstream project experiments. The key is the short slug for the upstream experiment and the value is an Experiment instance.

testslist[lazyscribe.test.Test], optional (default [])

List of lazyscribe.test.Test objects corresponding to sub-population/non-global metrics.

artifactslist[lazyscribe.artifacts.base.Artifact], optional (default [])

List of lazyscribe.artifacts.base.Artifact objects corresponding to experimental artifacts.

dirtybool, optional (default True)

Whether or not this experiment should be saved when lazyscribe.project.Project.save() is called. This decision is based on whether the experiment is new or has been updated.
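The slug default described above (slugified name plus a YYYYMMDDHHMMSS creation stamp) can be sketched as follows. This is a rough illustration, not the library's code: simple_slugify is a simplified stand-in for slugify.slugify(), make_slug is a hypothetical helper, and the hyphen between name and timestamp is an assumption.

```python
import re
from datetime import datetime


def simple_slugify(name: str) -> str:
    """Simplified stand-in for slugify.slugify(): lowercase, hyphen-separated."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def make_slug(name: str, created_at: datetime) -> str:
    """Hypothetical helper: short slug plus creation time as YYYYMMDDHHMMSS."""
    # Assumption: the two parts are joined with a hyphen.
    return f"{simple_slugify(name)}-{created_at.strftime('%Y%m%d%H%M%S')}"


example = make_slug("My Experiment", datetime(2025, 1, 25, 12, 36, 22))
print(example)  # my-experiment-20250125123622
```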

artifacts: list[Artifact]
author: str
created_at: datetime
dependencies: dict[str, Experiment]
dir: Path
dirty: bool
fs: AbstractFileSystem
last_updated: datetime
last_updated_by: str
load_artifact(name: str, validate: bool = True, **kwargs: Any) Any[source]

Load a single artifact.

Parameters

namestr

The name of the artifact to load.

validatebool, optional (default True)

Whether or not to validate the runtime environment against the artifact metadata.

**kwargs

Keyword arguments for the handler read function.

Returns

Any

The artifact object.

Raises

lazyscribe.exception.ArtifactLoadError

Raised if validate is True and the runtime environment does not match the artifact metadata, or if no artifact with the provided name is found.

log_artifact(name: str, value: Any, handler: str, fname: str | None = None, overwrite: bool = False, **kwargs: Any) None[source]

Log an artifact to the experiment.

This method associates an artifact with the experiment, but the artifact will not be written until lazyscribe.Project.save() is called.

Parameters

namestr

The name of the artifact.

valueAny

The object to persist to the filesystem.

handlerstr

The name of the handler to use for the object.

fnamestr, optional (default None)

The filename for the artifact. If not provided, it will be derived from the name of the artifact and the builtin suffix for each handler.

overwritebool, optional (default False)

Whether or not to overwrite an existing artifact with the same name. If set to True, the previous artifact will be removed and overwritten with the current artifact.

**kwargs

Keyword arguments for the write function of the handler.

Raises

lazyscribe.exception.ArtifactLogError

Raised if an artifact is supplied with the same name as an existing artifact and overwrite is set to False.

log_metric(name: str, value: float | int) None[source]

Log a metric to the experiment.

This method will overwrite existing keys.

Parameters

namestr

Name of the metric.

valueint | float

Value of the metric.

log_parameter(name: str, value: Any) None[source]

Log a parameter to the experiment.

This method will overwrite existing keys.

Parameters

namestr

The name of the parameter.

valueAny

The parameter itself.

log_test(name: str, description: str | None = None) Iterator[Test][source]

Add a test to the experiment using a context handler.

A test is a specific location for non-global metrics.

Parameters

namestr

Name of the test.

descriptionstr, optional (default None)

An optional description for the test.

Yields

lazyscribe.test.Test

The lazyscribe.test.Test dataclass.
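The context-handler pattern described for log_test can be sketched with contextlib. TestStub and ExperimentStub are hypothetical stand-ins, and attaching the test to the experiment only after the block exits without an exception is an assumption, not documented behavior.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class TestStub:
    """Stand-in for a test holding non-global metrics."""

    name: str
    description: Optional[str] = None
    metrics: dict = field(default_factory=dict)

    def log_metric(self, name: str, value: float) -> None:
        # Overwrites any existing key, as documented for log_metric.
        self.metrics[name] = value


@dataclass
class ExperimentStub:
    """Stand-in for an experiment that collects tests."""

    tests: list = field(default_factory=list)

    @contextmanager
    def log_test(self, name: str, description: Optional[str] = None) -> Iterator[TestStub]:
        test = TestStub(name=name, description=description)
        yield test
        # Assumption: reached only when the with-block did not raise.
        self.tests.append(test)


exp = ExperimentStub()
with exp.log_test("subgroup-a", description="Metrics for one subpopulation") as test:
    test.log_metric("auc", 0.81)
```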

metrics: dict[str, float | int]
name: str
parameters: dict[str, Any]
property path: Path

Path to an experiment folder.

This folder can be used to store any plots or artifacts that you want to associate with the experiment.

Returns

pathlib.Path

The path for the experiment.

project: Path
promote_artifact(repository: Repository, name: str) None[source]

Associate an artifact with a lazyscribe.repository.Repository.

The purpose of this method is to move an artifact from an ephemeral experiment to the versioned repository.

If the artifact does not exist on disk yet, this function is simply a passthrough to lazyscribe.repository.Repository.log_artifact(). If the artifact does exist on disk already, this function will copy the artifact from the experiment directory to the repository, increment the version, and call lazyscribe.repository.Repository.save().

Parameters

repositorylazyscribe.repository.Repository

The lazyscribe.repository.Repository to promote the artifact to.

namestr

The artifact to promote.

Raises

lazyscribe.exception.ArtifactLogError

Raised if the artifact to be promoted is not newer than the latest version available in the repository. Raised if

  • the artifact name exists on the filesystem, and

  • the filesystem protocol does not match between the repository and the experiment.

lazyscribe.exception.ArtifactLoadError

Raised if the experiment has no artifact with the provided name.

lazyscribe.exception.SaveError

Raised when writing to the filesystem fails.

short_slug: str
slug: str
tag(*args: str, overwrite: bool = False) None[source]

Add one or more tags to the experiment.

Important

If this function is called with no supplied values for *args and overwrite=True, the result will be that the experiment has no associated tags.

Parameters

*args

The tags.

overwritebool, optional (default False)

Whether to overwrite the existing tags with the supplied tags instead of adding to them.
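The interaction between *args and overwrite can be modelled with a small function. apply_tags is a hypothetical helper operating on a plain list; whether the real method de-duplicates tags is not stated above and is not modelled here.

```python
def apply_tags(existing: list, *args: str, overwrite: bool = False) -> list:
    """Hypothetical model of tag(): extend the tag list or replace it."""
    if overwrite:
        # With no positional tags this yields an empty list,
        # which is the case the "Important" note warns about.
        return list(args)
    return existing + list(args)


added = apply_tags(["baseline"], "prod")
replaced = apply_tags(["baseline"], "prod", overwrite=True)
cleared = apply_tags(["baseline"], overwrite=True)
```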

tags: list[str]
tests: list[Test]
to_dict() dict[str, Any][source]

Serialize the experiment to a dictionary.

Returns

dict[str, Any]

The experiment dictionary.

class lazyscribe.experiment.ReadOnlyExperiment(name: str, project: Path, dir: Path = NOTHING, fs: AbstractFileSystem = NOTHING, author: str = NOTHING, last_updated_by: str = NOTHING, metrics: dict[str, float | int] = NOTHING, parameters: dict[str, Any] = NOTHING, created_at: datetime = NOTHING, last_updated: datetime = NOTHING, dependencies: dict[str, Experiment] = NOTHING, short_slug: str = NOTHING, slug: str = NOTHING, tests: list[Test] = NOTHING, artifacts: list[Artifact] = NOTHING, tags: list[str] = NOTHING, dirty: bool = NOTHING)[source]

Bases: Experiment

Immutable version of an experiment.

Linked list utilities.

The code for this module is lifted from here.

class lazyscribe.linked.LinkedList(head: Node | None = None)[source]

Bases: object

The linked list.

Parameters

headNode, optional (default None)

The start of the list.

append(data: Any) None[source]

Append a new node to the end of the list.

Parameters

dataAny

The new data.

static from_list(data: list[Any]) LinkedList[source]

Convert a standard list to a linked list.

head: Node | None
class lazyscribe.linked.Node(data: Any = None, next: Node | None = None)[source]

Bases: object

Node for the linked list.

Parameters

dataAny, optional (default None)

The current data.

nextNode | None, optional (default None)

The next data in the chain.

data: Any
next: Node | None
to_list() list[Any][source]

Convert the nodes to a de-duped list.

Returns

list[Any]

A standard list of data.
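As a rough illustration of the node chain and the de-duplicating to_list conversion, the sketch below uses a stand-in NodeStub class and a hypothetical build_chain helper. Keeping the first occurrence of each value and preserving order are assumptions; this is not the module's actual code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NodeStub:
    """Stand-in node: a payload plus a link to the next node."""

    data: Any = None
    next: Optional["NodeStub"] = None

    def to_list(self) -> list:
        """Walk the chain from this node, skipping values already seen."""
        out: list = []
        node: Optional[NodeStub] = self
        while node is not None:
            if node.data not in out:
                out.append(node.data)
            node = node.next
        return out


def build_chain(data: list) -> Optional[NodeStub]:
    """Hypothetical helper: build a chain from a list, returning its head."""
    head: Optional[NodeStub] = None
    for item in reversed(data):
        head = NodeStub(data=item, next=head)
    return head


chain = build_chain([1, 2, 2, 3, 1])
print(chain.to_list())  # [1, 2, 3]
```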

lazyscribe.linked.merge(list1: Node, list2: Node) Node[source]

Merge two linked lists.

Parameters

list1Node

The head of the first list.

list2Node

The head of the second list.

Returns

Node

The head of the final, merged list.

Project storing and logging.

class lazyscribe.project.Project(fpath: str | Path = 'project.json', mode: Literal['r', 'a', 'w', 'w+'] = 'a', author: str | None = None, **storage_options: Any)[source]

Bases: object

Project class.

Parameters

fpathstr | pathlib.Path, optional (default “project.json”)

The location of the project file. If no project file exists, this will be the location of the output JSON file when save is called.

mode{“r”, “a”, “w”, “w+”}, optional (default “a”)

The mode for opening the project.

authorstr, optional (default None)

The project author. This author will be used for any new experiments or modifications to existing experiments. If not supplied, getpass.getuser() will be used.

**storage_options

Storage options to pass to the filesystem initialization. Will be passed to fsspec.filesystem().

Attributes

experimentslist[lazyscribe.experiment.Experiment]

The list of experiments in the project.

Raises

ValueError

Raised on invalid mode value.

append(other: Experiment) None[source]

Append an experiment to the project.

For details on the merging process, see here.

Parameters

otherlazyscribe.experiment.Experiment

The experiment to add.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to log a new experiment when the project is in read-only mode.

author: str
experiments: list[Experiment]
filter(func: Callable[[Experiment], bool]) Iterator[Experiment][source]

Filter the experiments in the project.

Parameters

funcCallable[[lazyscribe.experiment.Experiment], bool]

A callable that takes in a lazyscribe.experiment.Experiment object and returns a boolean indicating whether or not it passes the filter.

Yields

lazyscribe.experiment.Experiment

An experiment.
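A predicate-based filter of this shape can be sketched as a generator. ExperimentRecord and filter_experiments are hypothetical stand-ins that only mimic the documented signature (a callable taking an experiment and returning a boolean).

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator


@dataclass
class ExperimentRecord:
    """Stand-in for an experiment with a name, tags, and metrics."""

    name: str
    tags: list = field(default_factory=list)
    metrics: dict = field(default_factory=dict)


def filter_experiments(
    experiments: Iterable[ExperimentRecord],
    func: Callable[[ExperimentRecord], bool],
) -> Iterator[ExperimentRecord]:
    """Yield each experiment for which the predicate returns True."""
    for exp in experiments:
        if func(exp):
            yield exp


experiments = [
    ExperimentRecord("model-a", tags=["prod"], metrics={"auc": 0.75}),
    ExperimentRecord("model-b", tags=["prod"], metrics={"auc": 0.85}),
    ExperimentRecord("model-c", tags=["dev"], metrics={"auc": 0.90}),
]
selected = [
    exp.name
    for exp in filter_experiments(
        experiments, lambda e: "prod" in e.tags and e.metrics.get("auc", 0) > 0.8
    )
]
```

Because the result is an iterator, wrap it in list() when the matches are needed more than once.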

fpath: Path
load() None[source]

Load existing experiments.

If the project is in read-only or append mode, existing experiments will be loaded in read-only mode. If opened in editable mode, existing experiments will be loaded in editable mode.

log(name: str) Iterator[Experiment][source]

Log an experiment to the project.

Parameters

namestr

The name of the experiment.

Yields

lazyscribe.experiment.Experiment

A new lazyscribe.experiment.Experiment object.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to log a new experiment when the project is in read-only mode.

merge(other: Project) Project[source]

Merge two projects.

The new project will inherit the current project fpath, author, and mode.

For details on the merging process, see here.

Returns

lazyscribe.project.Project

A new project.

mode: Literal['r', 'a', 'w', 'w+']
save() None[source]

Save the project data.

This includes saving any artifact data.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to save when the project is in read-only mode.

lazyscribe.exception.SaveError

Raised when writing to the filesystem fails.

storage_options: dict[str, Any]

Repository releases.

class lazyscribe.release.Release(tag: str, artifacts: list[tuple[str, int]], created_at: datetime = NOTHING)[source]

Bases: object

Create a release associated with a Repository instance.

Parameters

tagstr

A string descriptor for the release. Commonly coincides with semantic or calendar versioning.

artifactslist[tuple[str, int]]

A list of the latest available artifacts and versions in the source repository.

created_atdatetime.datetime, optional (default lazyscribe._utils.utcnow())

The creation timestamp for the release (in UTC).

artifacts: list[tuple[str, int]]
created_at: datetime
classmethod from_dict(info: dict[str, Any]) Release[source]

Convert a serialized representation of the release back to a python object.

Parameters

infodict

The dictionary representation of the release.

Returns

lazyscribe.release.Release

The new release object.

tag: str
to_dict() dict[str, list[tuple[str, int]] | str][source]

Serialize the release to a dictionary.

Returns

dict

A dictionary with the release information.

lazyscribe.release.create_release(repository: Repository, tag: str) Release[source]

Create a release.

A release is a collection of specific artifact versions. It is generated by taking the latest available version of each artifact.

Parameters

repositorylazyscribe.repository.Repository

The source repository.

tagstr

A string descriptor for the release. Commonly coincides with semantic or calendar versioning.

Returns

lazyscribe.release.Release

The release object.

lazyscribe.release.dump(obj: list[Release], fp: IOBase, **kwargs: Any) None[source]

Write the releases data.

from lazyscribe import release as lzr

releases: list[lzr.Release]
with open("releases.json", "w") as outfile:
    lzr.dump(releases, outfile)

Parameters

objlist[lazyscribe.release.Release]

The list of release objects.

fpio.IOBase

A buffer we can write to.

**kwargs

Keyword arguments for json.dump.

lazyscribe.release.dumps(obj: list[Release], **kwargs: Any) str[source]

Convert a list of releases to a JSON-serialized string.

To prevent namespace confusion, we recommend importing this function through an alias:

from lazyscribe import release as lzr

releases: list[lzr.Release]
out = lzr.dumps(releases)

Parameters

objlist[lazyscribe.release.Release]

The list of release objects.

**kwargs

Keyword arguments for json.dumps.

Returns

str

The JSON-serialized string.

lazyscribe.release.find_release(releases: list[Release], version: str | datetime | None = None, match: Literal['asof', 'exact'] = 'exact') Release[source]

Find a release based on tag or timestamp.

Parameters

releaseslist[lazyscribe.release.Release]

The releases associated with the repository.

versionstr | datetime, optional (default None)

The version to find. If a string is provided, the function will assume the value corresponds to a tag. If a datetime is provided, the function will assume the value corresponds to a creation date. If None is provided, the latest release will be returned.

match{“asof”, “exact”}, optional (default “exact”)

Matching logic. Only relevant if a datetime is provided for version. exact will provide the release with the exact value matching version. asof will provide the most recent release as of the version datetime provided.

Returns

lazyscribe.release.Release

The release object.

Raises

lazyscribe.exception.VersionNotFoundError

Raised if a release cannot be found.

ValueError

Raised if the specified matching logic does not match the version type specified.
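The lookup rules above (tag for strings, creation time for datetimes, latest release for None) can be sketched with stand-in objects. ReleaseStub and find_release_sketch are hypothetical; LookupError stands in for VersionNotFoundError, and the documented ValueError case is not modelled.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Union


@dataclass
class ReleaseStub:
    """Stand-in release with only the fields needed for matching."""

    tag: str
    created_at: datetime


def find_release_sketch(
    releases: list,
    version: Optional[Union[str, datetime]] = None,
    match: str = "exact",
) -> ReleaseStub:
    ordered = sorted(releases, key=lambda r: r.created_at)
    if version is None:
        candidates = ordered  # latest release overall
    elif isinstance(version, str):
        candidates = [r for r in ordered if r.tag == version]
    elif match == "exact":
        candidates = [r for r in ordered if r.created_at == version]
    else:
        # "asof": everything created at or before the requested time.
        candidates = [r for r in ordered if r.created_at <= version]
    if not candidates:
        # Stand-in for lazyscribe.exception.VersionNotFoundError.
        raise LookupError(f"No release found for {version!r}")
    return candidates[-1]


releases = [
    ReleaseStub("v1", datetime(2025, 1, 1)),
    ReleaseStub("v2", datetime(2025, 2, 1)),
    ReleaseStub("v3", datetime(2025, 3, 1)),
]
asof_tag = find_release_sketch(releases, datetime(2025, 2, 15), match="asof").tag
```

In this sketch a datetime that falls between two releases resolves to the earlier one under asof and raises under exact.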

lazyscribe.release.load(fp: IOBase, **kwargs: Any) list[Release][source]

Generate a list of releases from a file buffer.

To prevent namespace confusion, we recommend importing this function through an alias:

from lazyscribe import release as lzr

with open("releases.json") as infile:
    releases = lzr.load(infile)

Parameters

fpfile-like object

A buffer that we can read using JSON.

**kwargs

Keyword arguments for json.load.

Returns

list[lazyscribe.release.Release]

A list of releases.

lazyscribe.release.loads(s: str, **kwargs: Any) list[Release][source]

Generate a list of releases from a string.

To prevent namespace confusion, we recommend importing this function through an alias:

from lazyscribe import release as lzr

mydata = '[{"tag": "v0.1.0", "artifacts": [], "created_at": "2025-01-01T00:00:00"}]'
releases = lzr.loads(mydata)

Parameters

sstr

The string representation of a JSON file.

**kwargs

Keyword arguments for json.loads.

Returns

list[lazyscribe.release.Release]

A list of releases.

lazyscribe.release.release_from_toml(cfg: str) None[source]

Generate a release for supplied repositories from a configuration.

This function will read in a TOML-compatible configuration file and look for the [tool.lazyscribe] table. This table must contain 1 field:

  • repositories (list): path to repository JSON files for which we want releases.

The configuration has optional fields, including

  • version: the current version of the overall project. If not supplied, this function will look for the version attribute of the [project] table.

  • format: format for the repository release versions. This string will be formatted with the version string, as well as the year, month, and day of the release. By default, this format is v{version}.

This function will read in each repository, create a new release, and write it to a releases.json file in the same directory as the source repository JSON file.

Suppose you have the following entry in pyproject.toml:

[project]
version = "1.0.0"

...

[tool.lazyscribe]
repositories = [
    "src/models/model-1/repository.json",
    "src/models/model-2/repository.json"
]

calling

import lazyscribe.release as lzr

with open("pyproject.toml") as infile:
    lzr.release_from_toml(infile.read())

will create two new files:

  • src/models/model-1/releases.json, and

  • src/models/model-2/releases.json.

Each of these files will contain a v1.0.0 release.

Parameters

cfgstr

String contents of the configuration file.
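The version fallback and format behavior described above can be sketched on an already-parsed configuration (a nested dict such as a TOML parser would return). release_tag is a hypothetical helper, and the placeholder names year, month, and day are assumptions; only {version} appears in the documented default.

```python
from datetime import date


def release_tag(config: dict, today: date) -> str:
    """Hypothetical helper: build a release tag from a parsed configuration."""
    tool = config["tool"]["lazyscribe"]
    # Fall back to the [project] table when [tool.lazyscribe] has no version.
    version = tool["version"] if "version" in tool else config["project"]["version"]
    fmt = tool.get("format", "v{version}")
    # Assumption: the date parts are exposed as {year}, {month}, and {day}.
    return fmt.format(version=version, year=today.year, month=today.month, day=today.day)


config = {
    "project": {"version": "1.0.0"},
    "tool": {"lazyscribe": {"repositories": ["src/models/model-1/repository.json"]}},
}
default_tag = release_tag(config, date(2025, 1, 25))

config["tool"]["lazyscribe"]["format"] = "{year}.{month}.{day}-{version}"
calendar_tag = release_tag(config, date(2025, 1, 25))
```

With the default format this reproduces the v1.0.0 tag from the example above.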

Repository storing and logging.

class lazyscribe.repository.Repository(fpath: str | Path = 'repository.json', mode: Literal['r', 'w', 'w+'] = 'w+', **storage_options: Any)[source]

Bases: object

Repository class for holding versioned artifacts.

Parameters

fpathstr | Path, optional (default “repository.json”)

The location of the repository file. If no repository file exists, this will be the location of the output JSON file when save is called.

mode{“r”, “w”, “w+”}, optional (default “w+”)

The mode for opening the repository.

  • r: All artifacts will be loaded. No new artifacts can be logged.

  • w: No existing artifacts will be loaded. Artifacts can be added.

  • w+: All artifacts will be loaded. New artifacts can be added.

**storage_options

Storage options to pass to the filesystem initialization. Will be passed to fsspec.filesystem().

Attributes

artifactslist[lazyscribe.artifacts.base.Artifact]

The list of artifacts in the repository.
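The three repository modes listed above reduce to two switches: whether existing artifacts are loaded and whether new ones can be logged. The lookup table below is a hypothetical summary; raising ValueError for an unknown mode is an assumption borrowed from the Project class and is not documented for Repository.

```python
from typing import NamedTuple


class ModeBehavior(NamedTuple):
    loads_existing: bool
    allows_logging: bool


# Summary of the documented mode descriptions.
REPOSITORY_MODES = {
    "r": ModeBehavior(loads_existing=True, allows_logging=False),
    "w": ModeBehavior(loads_existing=False, allows_logging=True),
    "w+": ModeBehavior(loads_existing=True, allows_logging=True),
}


def mode_behavior(mode: str) -> ModeBehavior:
    """Hypothetical helper: look up what a mode permits."""
    if mode not in REPOSITORY_MODES:
        # Assumption: invalid modes are rejected with ValueError.
        raise ValueError(f"Invalid mode: {mode!r}")
    return REPOSITORY_MODES[mode]
```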

artifacts: list[Artifact]
filter(version: datetime | str | list[tuple[str, datetime | str | int]]) Repository[source]

Filter a repository.

This method returns a new, read-only object with a subset of the input artifacts. Use this method to truncate a repository to a collection of artifacts relevant to a given use case.

Parameters

versiondatetime.datetime | str | list[tuple[str, datetime.datetime | str | int]]

The version corresponding to the output version of each artifact. If a datetime or string is provided, this method will do an asof search for each artifact.

If a list is provided, it will be treated as a list of exact versions to load.

Returns

lazyscribe.repository.Repository

A read-only copy of the existing repository with one version per artifact.

Raises

RuntimeError

Raised if the current repository object has artifacts that have not been saved to the filesystem.

fpath: Path
get_artifact_metadata(name: str, version: datetime | str | int | None = None, match: Literal['asof', 'exact'] = 'exact') dict[str, Any][source]

Retrieve the metadata for an artifact.

Parameters

namestr

The name of the artifact to load.

versiondatetime.datetime | str | int, optional (default None)

The version of the artifact to load. Can be provided as a datetime corresponding to the created_at field, a string corresponding to the created_at field in the format "%Y-%m-%dT%H:%M:%S" (e.g. "2025-01-25T12:36:22"), or an integer version. If set to None or not provided, defaults to the most recent version.

match{“asof”, “exact”}, optional (default “exact”)

Matching logic. Only relevant for str and datetime.datetime values for version. exact will provide an artifact with the exact created_at value provided. asof will provide the most recent version as of the version value.

Returns

dict[str, Any]

The artifact metadata.

Raises

ValueError

Raised on invalid match value. Raised if no valid artifact was found.
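The version argument accepts three shapes (integer, datetime, or timestamp string). One possible way to resolve them against an artifact's history is sketched below; resolve_version and the (version, created_at) tuples are hypothetical, and the version numbers are illustrative.

```python
from datetime import datetime


def resolve_version(history: list, version=None, match: str = "exact") -> int:
    """Hypothetical helper: pick a version number from (version, created_at) pairs.

    The history is assumed to be sorted from oldest to newest.
    """
    if match not in ("asof", "exact"):
        raise ValueError(f"Invalid match value: {match!r}")
    if version is None:
        return history[-1][0]  # most recent version
    if isinstance(version, int):
        hits = [v for v, _ in history if v == version]
    else:
        if isinstance(version, str):
            # Documented string format, e.g. "2025-01-25T12:36:22".
            version = datetime.strptime(version, "%Y-%m-%dT%H:%M:%S")
        if match == "exact":
            hits = [v for v, ts in history if ts == version]
        else:
            hits = [v for v, ts in history if ts <= version]
    if not hits:
        raise ValueError("No valid artifact was found")
    return hits[-1]


history = [
    (0, datetime(2025, 1, 1)),
    (1, datetime(2025, 1, 25, 12, 36, 22)),
    (2, datetime(2025, 3, 1)),
]
asof_version = resolve_version(history, "2025-02-10T00:00:00", match="asof")
```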

get_version_diff(name: str, version: datetime | str | int | tuple[datetime | str | int, datetime | str | int], match: Literal['asof', 'exact'] = 'exact') str[source]

Generate the unified diff between versions of the same artifact.

Parameters

namestr

The name of the artifact to compare.

versiondatetime | str | int | tuple[datetime | str | int, datetime | str | int]

The versions to compare. If a single version is provided, the artifact will be compared to the latest available artifact. A tuple specifies the two versions to compare.

match{“asof”, “exact”}, optional (default “exact”)

Matching logic. Only relevant for str and datetime.datetime values. exact will provide an artifact with the exact created_at value provided. asof will provide the most recent version as of the version value.

Raises

lazyscribe.exception.ArtifactLoadError

Raised if the artifact does not exist on the filesystem yet.

ValueError

Raised if the provided artifact(s) represent binary files.

Returns

str

Concatenated output from difflib.unified_diff().
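For reference, concatenating the output of difflib.unified_diff() for two text payloads looks roughly like this. version_diff and the labels "v0" and "v1" are hypothetical; how the library reads artifact contents and names the two sides is not shown here.

```python
import difflib


def version_diff(old: str, new: str, old_label: str = "v0", new_label: str = "v1") -> str:
    """Hypothetical helper: join unified-diff lines into a single string."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(lines)


diff = version_diff('{"alpha": 0.1}\n', '{"alpha": 0.2}\n')
print(diff)
```

An empty string means the two payloads are identical, since unified_diff yields no lines in that case.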

load() None[source]

Load existing artifacts.

load_artifact(name: str, validate: bool = True, version: datetime | str | int | None = None, match: Literal['asof', 'exact'] = 'exact', **kwargs: Any) Any[source]

Load a single artifact.

Parameters

namestr

The name of the artifact to load.

validatebool, optional (default True)

Whether or not to validate the runtime environment against the artifact metadata.

versiondatetime.datetime | str | int, optional (default None)

The version of the artifact to load. Can be provided as a datetime corresponding to the created_at field, a string corresponding to the created_at field in the format "%Y-%m-%dT%H:%M:%S" (e.g. "2025-01-25T12:36:22"), or an integer version. If set to None or not provided, defaults to the most recent version.

match{“asof”, “exact”}, optional (default “exact”)

Matching logic. Only relevant for str and datetime.datetime values for version. exact will provide an artifact with the exact created_at value provided. asof will provide the most recent version as of the version value.

**kwargs

Keyword arguments for the handler read function.

Returns

Any

The artifact object.

Raises

ValueError

Raised on invalid match value. Raised if no valid artifact was found.

lazyscribe.exception.ArtifactLoadError

Raised if validate is True and the runtime environment does not match the artifact metadata.

log_artifact(name: str, value: Any, handler: str, fname: str | None = None, **kwargs: Any) None[source]

Log an artifact to the repository.

This method associates an artifact with the repository, but the artifact will not be written until lazyscribe.repository.Repository.save() is called.

Parameters

namestr

The name of the artifact.

valueAny

The object to persist to the filesystem.

handlerstr

The name of the handler to use for the object.

fnamestr, optional (default None)

The filename for the artifact. If set to None or not provided, it will be derived from the name of the artifact and the builtin suffix for each handler.

**kwargs

Keyword arguments for the write function of the handler.

Raises

lazyscribe.exception.ReadOnlyError

Raised if the repository is in read-only mode.

mode: Literal['r', 'w', 'w+']
save() None[source]

Save the repository data.

This includes saving any artifact data.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to save when the repository is in read-only mode.

lazyscribe.exception.SaveError

Raised when writing to the filesystem fails.

storage_options: dict[str, Any]

Sub-population tests.

class lazyscribe.test.ReadOnlyTest(name: str, description: str | None = NOTHING, metrics: dict[str, float | int] = NOTHING, parameters: dict[str, Any] = NOTHING)[source]

Bases: Test

Immutable version of the test.

class lazyscribe.test.Test(name: str, description: str | None = NOTHING, metrics: dict[str, float | int] = NOTHING, parameters: dict[str, Any] = NOTHING)[source]

Bases: object

Sub-population tests.

These objects should only be instantiated within an experiment. A test is associated with some subset of the entire experiment. For example, a test could be used to evaluate the performance of a model against a specific subpopulation.

Parameters

namestr

The name of the test.

descriptionstr, optional (default None)

A description of the test.

metricsdict[str, float | int], optional (default {})

A dictionary of metric values. Each metric value can be an individual value or a list.

parametersdict[str, Any], optional (default {})

A dictionary of test parameters. The key must be a string but the value can be anything.

description: str | None
log_metric(name: str, value: float | int) None[source]

Log a metric to the test.

This method will overwrite existing keys.

Parameters

namestr

Name of the metric.

valueint | float

Value of the metric.

log_parameter(name: str, value: Any) None[source]

Log a parameter to the test.

This method will overwrite existing keys.

Parameters

namestr

The name of the parameter.

valueAny

The parameter itself.

metrics: dict[str, float | int]
name: str
parameters: dict[str, Any]
to_dict() dict[str, Any][source]

Serialize the test to a dictionary.

Returns

dict[str, Any]

The test dictionary.

Module contents

Main module.

class lazyscribe.Experiment(name: str, project: Path, dir: Path = NOTHING, fs: AbstractFileSystem = NOTHING, author: str = NOTHING, last_updated_by: str = NOTHING, metrics: dict[str, float | int] = NOTHING, parameters: dict[str, Any] = NOTHING, created_at: datetime = NOTHING, last_updated: datetime = NOTHING, dependencies: dict[str, Experiment] = NOTHING, short_slug: str = NOTHING, slug: str = NOTHING, tests: list[Test] = NOTHING, artifacts: list[Artifact] = NOTHING, tags: list[str] = NOTHING, dirty: bool = NOTHING)[source]

Bases: object

Experiment data class.

This class is not meant to be initialized directly. It is meant to be used through the lazyscribe.project.Project class.

Parameters

namestr

The name of the experiment.

projectpathlib.Path

The path to the project JSON file associated with the experiment.

dirpathlib.Path, optional (default None)

Directory for the project and the experiment. If not supplied, the parent directory for the project file will be used.

authorstr, optional (default getpass.getuser())

The author of the experiment.

last_updated_bystr, optional (default None)

Last editor of the experiment. If not supplied, the author will be used.

metricsdict[str, float | int], optional (default {})

A dictionary of metric values. Each metric value can be an individual value or a list.

parametersdict[str, Any], optional (default {})

A dictionary of experiment parameters. The key must be a string but the value can be anything.

created_atdatetime.datetime, optional (default lazyscribe._utils.utcnow())

When the experiment was created (in UTC).

last_updateddatetime.datetime, optional (default lazyscribe._utils.utcnow())

When the experiment was last updated (in UTC).

short_slugstr, optional (default None)

Slugified name. Defaults to calling slugify.slugify() on the name attribute.

slugstr, optional (default None)

Unique identifier for the experiment. Defaults to the slugified name with the creation date appended in the format YYYYMMDDHHMMSS.

tagslist[str], optional (default [])

Tags for filtering and identifying experiments across a project.

dependenciesdict[str, lazyscribe.experiment.Experiment], optional (default {})

A dictionary of upstream project experiments. The key is the short slug for the upstream experiment and the value is an Experiment instance.

testslist[lazyscribe.test.Test], optional (default [])

List of lazyscribe.test.Test objects corresponding to sub-population/non-global metrics.

artifactslist[lazyscribe.artifacts.base.Artifact], optional (default [])

List of lazyscribe.artifacts.base.Artifact objects corresponding to experimental artifacts.

dirtybool, optional (default True)

Whether or not this experiment should be saved when lazyscribe.project.Project.save() is called. This decision is based on whether the experiment is new or has been updated.

artifacts: list[Artifact]
author: str
created_at: datetime
dependencies: dict[str, Experiment]
dir: Path
dirty: bool
fs: AbstractFileSystem
last_updated: datetime
last_updated_by: str
load_artifact(name: str, validate: bool = True, **kwargs: Any) Any[source]

Load a single artifact.

Parameters

namestr

The name of the artifact to load.

validatebool, optional (default True)

Whether or not to validate the runtime environment against the artifact metadata.

**kwargs

Keyword arguments for the handler read function.

Returns

Any

The artifact object.

Raises

lazyscribe.exception.ArtifactLoadError

Raised if validate is True and the runtime environment does not match the artifact metadata, or if no artifact with the provided name is found.

log_artifact(name: str, value: Any, handler: str, fname: str | None = None, overwrite: bool = False, **kwargs: Any) None[source]

Log an artifact to the experiment.

This method associates an artifact with the experiment, but the artifact will not be written until lazyscribe.Project.save() is called.

Parameters

namestr

The name of the artifact.

valueAny

The object to persist to the filesystem.

handlerstr

The name of the handler to use for the object.

fnamestr, optional (default None)

The filename for the artifact. If not provided, it will be derived from the name of the artifact and the builtin suffix for each handler.

overwritebool, optional (default False)

Whether or not to overwrite an existing artifact with the same name. If set to True, the previous artifact will be removed and overwritten with the current artifact.

**kwargs

Keyword arguments for the write function of the handler.

Raises

lazyscribe.exception.ArtifactLogError

Raised if an artifact is supplied with the same name as an existing artifact and overwrite is set to False.

log_metric(name: str, value: float | int) None[source]

Log a metric to the experiment.

This method will overwrite existing keys.

Parameters

namestr

Name of the metric.

valueint | float

Value of the metric.

log_parameter(name: str, value: Any) None[source]

Log a parameter to the experiment.

This method will overwrite existing keys.

Parameters

namestr

The name of the parameter.

valueAny

The parameter itself.

log_test(name: str, description: str | None = None) Iterator[Test][source]

Add a test to the experiment using a context manager.

A test holds metrics that apply to a subset of the experiment rather than to the experiment as a whole.

Parameters

namestr

Name of the test.

descriptionstr, optional (default None)

An optional description for the test.

Yields

lazyscribe.test.Test

The lazyscribe.test.Test dataclass.
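To make the context-manager pattern concrete, here is a minimal plain-Python sketch. SketchTest and SketchExperiment are hypothetical stand-ins written for this example, not lazyscribe classes, and attaching the test only after the with block exits cleanly is an assumption rather than documented behavior. The subgroup name and metric values are made up.

```python
from __future__ import annotations

from collections.abc import Iterator
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class SketchTest:
    """Hypothetical stand-in for a test: a named bag of metrics."""

    name: str
    description: str | None = None
    metrics: dict[str, float | int] = field(default_factory=dict)

    def log_metric(self, name: str, value: float | int) -> None:
        # Logging the same key twice keeps only the latest value.
        self.metrics[name] = value


@dataclass
class SketchExperiment:
    """Hypothetical stand-in for an experiment that collects tests."""

    tests: list[SketchTest] = field(default_factory=list)

    @contextmanager
    def log_test(self, name: str, description: str | None = None) -> Iterator[SketchTest]:
        test = SketchTest(name=name, description=description)
        yield test
        # Assumption: reached only when the with block raised no exception.
        self.tests.append(test)


exp = SketchExperiment()
with exp.log_test("under-30", description="Customers younger than 30") as test:
    test.log_metric("auroc", 0.61)
    test.log_metric("auroc", 0.64)  # overwrites the previous value

print(exp.tests[0].name, exp.tests[0].metrics)
```

The object bound by `as test` is the one that ends up in the experiment's list of tests, so metrics logged inside the block stay scoped to that test.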

metrics: dict[str, float | int]
name: str
parameters: dict[str, Any]
property path: Path

Path to an experiment folder.

This folder can be used to store any plots or artifacts that you want to associate with the experiment.

Returns

pathlib.Path

The path for the experiment.

project: Path
promote_artifact(repository: Repository, name: str) None[source]

Associate an artifact with a lazyscribe.repository.Repository.

The purpose of this method is to move an artifact from an ephemeral experiment to the versioned repository.

If the artifact does not exist on disk yet, this function is simply a passthrough to lazyscribe.repository.Repository.log_artifact(). If the artifact does exist on disk already, this function will copy the artifact from the experiment directory to the repository, increment the version, and call lazyscribe.repository.Repository.save().

Parameters

repositorylazyscribe.repository.Repository

The lazyscribe.repository.Repository to promote the artifact to.

namestr

The artifact to promote.

Raises

lazyscribe.exception.ArtifactLogError

Raised if the artifact to be promoted is not newer than the latest version available in the repository. Also raised if both of the following are true:

  • the artifact name exists on the filesystem, and

  • the filesystem protocol does not match between the repository and the experiment.

lazyscribe.exception.ArtifactLoadError

Raised if no artifact with the given name exists in the experiment.

lazyscribe.exception.SaveError

Raised when writing to the filesystem fails.

short_slug: str
slug: str
tag(*args: str, overwrite: bool = False) None[source]

Add one or more tags to the experiment.

Important

If this function is called with no values for *args and with overwrite=True, the experiment will end up with no associated tags.

Parameters

*args

The tags.

overwritebool, optional (default False)

Whether to replace the existing tags with the supplied tags (True) or add the supplied tags to the existing ones (False).
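The interaction between *args and overwrite can be sketched in plain Python. The apply_tags helper below is hypothetical and is not part of lazyscribe. It assumes that overwrite=True replaces the current tags and that overwrite=False appends to them, which is consistent with the note above but is not a copy of the library's code.

```python
def apply_tags(existing: list[str], *args: str, overwrite: bool = False) -> list[str]:
    """Return the tag list that results from one call (assumed semantics)."""
    if overwrite:
        # Replace whatever was there; with no args this leaves an empty list.
        return list(args)
    # Otherwise keep the existing tags and add the new ones.
    return [*existing, *args]


tags = ["baseline"]

tags = apply_tags(tags, "production")  # add to the existing tags
added = list(tags)

tags = apply_tags(tags, "candidate", overwrite=True)  # replace them
replaced = list(tags)

tags = apply_tags(tags, overwrite=True)  # no args: every tag is removed

print(added, replaced, tags)
```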

tags: list[str]
tests: list[Test]
to_dict() dict[str, Any][source]

Serialize the experiment to a dictionary.

Returns

dict[str, Any]

The experiment dictionary.

class lazyscribe.Project(fpath: str | Path = 'project.json', mode: Literal['r', 'a', 'w', 'w+'] = 'a', author: str | None = None, **storage_options: Any)[source]

Bases: object

Project class.

Parameters

fpathstr | pathlib.Path, optional (default “project.json”)

The location of the project file. If no project file exists, this will be the location of the output JSON file when save is called.

mode{“r”, “a”, “w”, “w+”}, optional (default “a”)

The mode for opening the project.

authorstr, optional (default None)

The project author. This author will be used for any new experiments or modifications to existing experiments. If not supplied, getpass.getuser() will be used.

**storage_options

Storage options to pass to the filesystem initialization. Will be passed to fsspec.filesystem().

Attributes

experimentslist[lazyscribe.experiment.Experiment]

The list of experiments in the project.

Raises

ValueError

Raised on invalid mode value.

append(other: Experiment) None[source]

Append an experiment to the project.

For details on the merging process, see here.

Parameters

otherlazyscribe.experiment.Experiment

The experiment to add.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to log a new experiment when the project is in read-only mode.

author: str
experiments: list[Experiment]
filter(func: Callable[[Experiment], bool]) Iterator[Experiment][source]

Filter the experiments in the project.

Parameters

funcCallable[[lazyscribe.experiment.Experiment], bool]

A callable that takes in a lazyscribe.experiment.Experiment object and returns a boolean indicating whether or not it passes the filter.

Yields

lazyscribe.experiment.Experiment

An experiment.
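As an illustration of the kind of callable this method expects, the sketch below applies a predicate to a few stand-in objects. SketchExperiment and filter_experiments are hypothetical and exist only for this example; they mimic the documented name, metrics, and tags attributes. The tag names and the AUROC threshold are made up.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable, Iterator
from dataclasses import dataclass, field


@dataclass
class SketchExperiment:
    """Hypothetical stand-in exposing a few documented Experiment attributes."""

    name: str
    metrics: dict[str, float | int] = field(default_factory=dict)
    tags: list[str] = field(default_factory=list)


def filter_experiments(
    experiments: Iterable[SketchExperiment],
    func: Callable[[SketchExperiment], bool],
) -> Iterator[SketchExperiment]:
    """Lazily yield the experiments for which the predicate returns True."""
    for exp in experiments:
        if func(exp):
            yield exp


experiments = [
    SketchExperiment("logistic", {"auroc": 0.71}, ["baseline"]),
    SketchExperiment("boosting", {"auroc": 0.83}, ["candidate"]),
    SketchExperiment("untuned", {"auroc": 0.55}, ["candidate"]),
]


def strong_candidate(exp: SketchExperiment) -> bool:
    # Predicate: tagged as a candidate and above a made-up AUROC threshold.
    return "candidate" in exp.tags and exp.metrics.get("auroc", 0.0) > 0.8


selected = [exp.name for exp in filter_experiments(experiments, strong_candidate)]
print(selected)
```

Because the method is documented as yielding experiments, wrap the call in list(...) when the matches need to be iterated more than once.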

fpath: Path
load() None[source]

Load existing experiments.

If the project is in read-only or append mode, existing experiments will be loaded in read-only mode. If opened in editable mode, existing experiments will be loaded in editable mode.

log(name: str) Iterator[Experiment][source]

Log an experiment to the project.

Parameters

namestr

The name of the experiment.

Yields

lazyscribe.experiment.Experiment

A new lazyscribe.experiment.Experiment object.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to log a new experiment when the project is in read-only mode.

merge(other: Project) Project[source]

Merge two projects.

The new project will inherit the current project fpath, author, and mode.

For details on the merging process, see here.

Returns

lazyscribe.project.Project

A new project.

mode: Literal['r', 'a', 'w', 'w+']
save() None[source]

Save the project data.

This includes saving any artifact data.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to save when the project is in read-only mode.

lazyscribe.exception.SaveError

Raised when writing to the filesystem fails.

storage_options: dict[str, Any]
class lazyscribe.Repository(fpath: str | Path = 'repository.json', mode: Literal['r', 'w', 'w+'] = 'w+', **storage_options: Any)[source]

Bases: object

Repository class for holding versioned artifacts.

Parameters

fpathstr | Path, optional (default “repository.json”)

The location of the repository file. If no repository file exists, this will be the location of the output JSON file when save is called.

mode{“r”, “w”, “w+”}, optional (default “w+”)

The mode for opening the repository.

  • r: All artifacts will be loaded. No new artifacts can be logged.

  • w: No existing artifacts will be loaded. Artifacts can be added.

  • w+: All artifacts will be loaded. New artifacts can be added.

**storage_options

Storage options to pass to the filesystem initialization. Will be passed to fsspec.filesystem().

Attributes

artifactslist[lazyscribe.artifacts.base.Artifact]

The list of artifacts in the repository.

artifacts: list[Artifact]
filter(version: datetime | str | list[tuple[str, datetime | str | int]]) Repository[source]

Filter a repository.

This method returns a new, read-only object with a subset of the input artifacts. Use this method to truncate a repository to a collection of artifacts relevant to a given use case.

Parameters

versiondatetime.datetime | str | list[tuple[str, datetime.datetime | str | int]]

The version to select for each artifact in the output repository. If a datetime or string is provided, this method performs an asof search for each artifact.

If a list is provided, it will be treated as a list of exact versions to load.

Returns

lazyscribe.repository.Repository

A read-only copy of the existing repository with one version per artifact.

Raises

RuntimeError

Raised if the current repository object has artifacts that have not been saved to the filesystem.

fpath: Path
get_artifact_metadata(name: str, version: datetime | str | int | None = None, match: Literal['asof', 'exact'] = 'exact') dict[str, Any][source]

Retrieve the metadata for an artifact.

Parameters

namestr

The name of the artifact to load.

versiondatetime.datetime | str | int, optional (default None)

The version of the artifact to load. Can be provided as a datetime corresponding to the created_at field, a string corresponding to the created_at field in the format "%Y-%m-%dT%H:%M:%S" (e.g. "2025-01-25T12:36:22"), or an integer version. If set to None or not provided, defaults to the most recent version.

match{“asof”, “exact”}, optional (default “exact”)

Matching logic. Only relevant for str and datetime.datetime values for version. exact will provide an artifact with the exact created_at value provided. asof will provide the most recent version as of the version value.

Returns

dict[str, Any]

The artifact metadata.

Raises

ValueError

Raised on invalid match value. Raised if no valid artifact was found.
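The difference between the two match values can be illustrated with the standard library alone. The pick_version helper and the history list below are hypothetical. The sketch assumes that asof means "the most recent version created at or before the requested time", which is one reading of the description above and not a copy of lazyscribe's implementation.

```python
from __future__ import annotations

from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"  # string format quoted in the version description


def pick_version(created_at: list[datetime], version: datetime | str, match: str = "exact") -> int:
    """Return the index of the matching entry in a list sorted oldest first."""
    if match not in ("asof", "exact"):
        raise ValueError(f"Invalid match value: {match!r}")
    target = datetime.strptime(version, FMT) if isinstance(version, str) else version
    if match == "exact":
        candidates = [i for i, ts in enumerate(created_at) if ts == target]
    else:
        # Assumed asof rule: anything created at or before the target qualifies.
        candidates = [i for i, ts in enumerate(created_at) if ts <= target]
    if not candidates:
        raise ValueError("No valid artifact was found")
    return candidates[-1]  # the most recent qualifying entry


# Hypothetical creation times for three versions of one artifact.
history = [
    datetime(2025, 1, 20, 9, 0, 0),
    datetime(2025, 1, 25, 12, 36, 22),
    datetime(2025, 2, 1, 8, 15, 0),
]

exact_hit = pick_version(history, "2025-01-25T12:36:22")
asof_hit = pick_version(history, "2025-01-30T00:00:00", match="asof")
print(exact_hit, asof_hit)
```

In this sketch both calls select the second entry: the first because the timestamp matches exactly, the second because it is the latest entry created before 30 January. An exact lookup for "2025-01-30T00:00:00" would raise ValueError instead.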

get_version_diff(name: str, version: datetime | str | int | tuple[datetime | str | int, datetime | str | int], match: Literal['asof', 'exact'] = 'exact') str[source]

Generate the unified diff between versions of the same artifact.

Parameters

namestr

The name of the artifact to compare.

versiondatetime.datetime | str | int | tuple[datetime.datetime | str | int, datetime.datetime | str | int]

The version or versions to compare. If a single version is provided, it will be compared to the latest available version of the artifact. A tuple specifies the two versions to compare.

match{“asof”, “exact”}, optional (default “exact”)

Matching logic. Only relevant for str and datetime.datetime values. exact will provide an artifact with the exact created_at value provided. asof will provide the most recent version as of the version value.

Returns

str

Concatenated output from difflib.unified_diff().

Raises

lazyscribe.exception.ArtifactLoadError

Raised if the artifact does not exist on the filesystem yet.

ValueError

Raised if the provided artifact(s) represent binary files.
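The return value of get_version_diff is described as concatenated output from difflib.unified_diff(). As a rough illustration of what such a string looks like, the sketch below diffs two hypothetical JSON payloads using only the standard library. The payloads, the file labels, and the text_diff helper are assumptions made for this example and are not lazyscribe's implementation.

```python
import difflib
import json


def text_diff(old: str, new: str, old_label: str, new_label: str) -> str:
    """Join the lines produced by difflib.unified_diff into one string."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(lines)


# Two hypothetical versions of the same text-based artifact.
v0 = json.dumps({"features": ["a", "b"]}, indent=2) + "\n"
v1 = json.dumps({"features": ["a", "b", "c"]}, indent=2) + "\n"

diff = text_diff(v0, v1, "features-v0.json", "features-v1.json")
print(diff)
```

A line-based diff like this only makes sense for text content, which fits the documented ValueError for binary artifacts.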

load() None[source]

Load existing artifacts.

load_artifact(name: str, validate: bool = True, version: datetime | str | int | None = None, match: Literal['asof', 'exact'] = 'exact', **kwargs: Any) Any[source]

Load a single artifact.

Parameters

namestr

The name of the artifact to load.

validatebool, optional (default True)

Whether or not to validate the runtime environment against the artifact metadata.

versiondatetime.datetime | str | int, optional (default None)

The version of the artifact to load. Can be provided as a datetime corresponding to the created_at field, a string corresponding to the created_at field in the format "%Y-%m-%dT%H:%M:%S" (e.g. "2025-01-25T12:36:22"), or an integer version. If set to None or not provided, defaults to the most recent version.

match{“asof”, “exact”}, optional (default “exact”)

Matching logic. Only relevant for str and datetime.datetime values for version. exact will provide an artifact with the exact created_at value provided. asof will provide the most recent version as of the version value.

**kwargs

Keyword arguments for the handler read function.

Returns

Any

The artifact object.

Raises

ValueError

Raised on invalid match value. Raised if no valid artifact was found.

lazyscribe.exception.ArtifactLoadError

Raised if validate is True and the runtime environment does not match the artifact metadata.
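When passing version as a string, the value has to follow the "%Y-%m-%dT%H:%M:%S" format quoted above. The standard-library sketch below shows one way to build such a string from a datetime. The timestamp is hypothetical, and whether stored created_at values carry microseconds is not stated here, so treat this as a formatting aid only.

```python
from datetime import datetime

VERSION_FORMAT = "%Y-%m-%dT%H:%M:%S"  # format quoted in the parameter description

created_at = datetime(2025, 1, 25, 12, 36, 22, 481516)  # hypothetical timestamp

# strftime with the documented format drops the microseconds.
version = created_at.strftime(VERSION_FORMAT)

# isoformat() keeps them, so the result no longer fits the documented format.
iso = created_at.isoformat()

# Parsing the string gives back the timestamp truncated to whole seconds.
parsed = datetime.strptime(version, VERSION_FORMAT)

print(version)
print(iso)
print(parsed == created_at.replace(microsecond=0))
```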

log_artifact(name: str, value: Any, handler: str, fname: str | None = None, **kwargs: Any) None[source]

Log an artifact to the repository.

This method associates an artifact with the repository, but the artifact will not be written until lazyscribe.repository.Repository.save() is called.

Parameters

namestr

The name of the artifact.

valueAny

The object to persist to the filesystem.

handlerstr

The name of the handler to use for the object.

fnamestr, optional (default None)

The filename for the artifact. If set to None or not provided, it will be derived from the name of the artifact and the builtin suffix for each handler.

**kwargs

Keyword arguments for the write function of the handler.

Raises

lazyscribe.exception.ReadOnlyError

Raised if the repository is in read-only mode.

mode: Literal['r', 'w', 'w+']
save() None[source]

Save the repository data.

This includes saving any artifact data.

Raises

lazyscribe.exception.ReadOnlyError

Raised when trying to save when the repository is in read-only mode.

lazyscribe.exception.SaveError

Raised when writing to the filesystem fails.

storage_options: dict[str, Any]
class lazyscribe.Test(name: str, description: str | None = NOTHING, metrics: dict[str, float | int] = NOTHING, parameters: dict[str, Any] = NOTHING)[source]

Bases: object

Sub-population tests.

These objects should only be instantiated within an experiment. A test is associated with some subset of the entire experiment. For example, a test could be used to evaluate the performance of a model against a specific subpopulation.

Parameters

namestr

The name of the test.

descriptionstr, optional (default None)

A description of the test.

metricsdict[str, float | int], optional (default {})

A dictionary of metric values. Each metric value can be an individual value or a list.

parametersdict[str, Any], optional (default {})

A dictionary of test parameters. The key must be a string but the value can be anything.

description: str | None
log_metric(name: str, value: float | int) None[source]

Log a metric to the test.

This method will overwrite existing keys.

Parameters

namestr

Name of the metric.

valueint | float

Value of the metric.

log_parameter(name: str, value: Any) None[source]

Log a parameter to the test.

This method will overwrite existing keys.

Parameters

namestr

The name of the parameter.

valueAny

The parameter itself.

metrics: dict[str, float | int]
name: str
parameters: dict[str, Any]
to_dict() dict[str, Any][source]

Serialize the test to a dictionary.

Returns

dict[str, Any]

The test dictionary.