.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/tests.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials_tests.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_tests.py:

Logging non-global metrics and artifacts with tests
===================================================

In this tutorial, we will demonstrate how you can use
:py:class:`lazyscribe.test.Test` objects to log metrics, parameters, and
artifacts for specific sub-populations of your experiment data.

A common pattern in ML development is to evaluate a model on the overall
dataset *and* on specific data slices (e.g. by demographic group, data
source, or class). Attaching these per-slice results directly to the
experiment, rather than keeping them in separate files, makes it easier to
compare slices across experiments and to reproduce past evaluations.

.. GENERATED FROM PYTHON SOURCE LINES 17-27

.. code-block:: Python

    import json
    import tempfile
    from pathlib import Path

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    from lazyscribe import Project

.. GENERATED FROM PYTHON SOURCE LINES 28-29

First, create some toy data and split off a "subpopulation" (the last 200
samples).

.. GENERATED FROM PYTHON SOURCE LINES 29-33

.. code-block:: Python

    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_sub, y_sub = X[800:], y[800:]

.. GENERATED FROM PYTHON SOURCE LINES 34-40

Next, initialise the project and run the experiment. We use
:py:meth:`lazyscribe.experiment.Experiment.log_test` as a context manager to
log the sub-population evaluation.
Inside the context, we can call the same
:py:meth:`~lazyscribe.test.Test.log_metric` and
:py:meth:`~lazyscribe.test.Test.log_parameter` methods as on a regular
experiment, as well as the new
:py:meth:`~lazyscribe.test.Test.log_artifact` method.

.. GENERATED FROM PYTHON SOURCE LINES 40-59

.. code-block:: Python

    tmpdir = Path(tempfile.mkdtemp())
    project = Project(fpath=tmpdir / "project.json", mode="w")
    with project.log(name="base-performance") as exp:
        model = SVC(kernel="linear", random_state=0)
        model.fit(X, y)
        exp.log_metric("score", model.score(X, y))

        with exp.log_test(name="subpopulation-a") as test:
            sub_score = model.score(X_sub, y_sub)
            predictions = model.predict(X_sub).tolist()
            test.log_metric("score", sub_score)
            test.log_parameter("n_samples", len(y_sub))
            # Persist the predictions list as a JSON artifact.
            test.log_artifact(name="predictions", value=predictions, handler="json")

.. GENERATED FROM PYTHON SOURCE LINES 60-62

Artifacts are **not** written to disk at call time. Call
:py:meth:`lazyscribe.Project.save` to persist both the project JSON and any
pending artifact files.

.. GENERATED FROM PYTHON SOURCE LINES 62-65

.. code-block:: Python

    project.save()

.. GENERATED FROM PYTHON SOURCE LINES 66-67

Let's verify the test was captured by printing its data.

.. GENERATED FROM PYTHON SOURCE LINES 67-72

.. code-block:: Python

    exp_data = project["base-performance"]
    test_data = exp_data.tests[0]
    print(json.dumps(test_data.to_dict(), indent=4, default=str))

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    {
        "name": "subpopulation-a",
        "description": null,
        "metrics": {
            "score": 0.955
        },
        "parameters": {
            "n_samples": 200
        },
        "artifacts": [
            {
                "name": "predictions",
                "fname": "predictions-20260424140221.json",
                "created_at": "2026-04-24T14:02:21",
                "expiry": null,
                "version": 0,
                "handler": "json"
            }
        ]
    }

.. GENERATED FROM PYTHON SOURCE LINES 73-75

To reload the test artifact in a later session, open the project in read
mode and call :py:meth:`lazyscribe.test.Test.load_artifact` on the test.

.. GENERATED FROM PYTHON SOURCE LINES 75-82

.. code-block:: Python

    project_read = Project(fpath=tmpdir / "project.json", mode="r")
    exp_read = project_read["base-performance"]
    test_read = exp_read.tests[0]
    loaded_predictions = test_read.load_artifact("predictions")
    print(loaded_predictions)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1]

.. rst-class:: sphx-glr-timing

**Total running time of the script:** (0 minutes 0.019 seconds)

.. _sphx_glr_download_tutorials_tests.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: tests.ipynb <tests.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: tests.py <tests.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: tests.zip <tests.zip>`

.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
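As an aside, the dictionary printed by ``to_dict()`` in the tutorial is plain, JSON-serialisable data, so it can be post-processed with the standard library alone, without reloading lazyscribe or the model. A minimal sketch (the ``summarise_tests`` helper and the trimmed ``record`` literal are our own illustration, not part of the lazyscribe API):

```python
import json

# A test record in the shape printed by ``to_dict()`` above,
# trimmed to the fields used here.
record = {
    "name": "subpopulation-a",
    "metrics": {"score": 0.955},
    "parameters": {"n_samples": 200},
}


def summarise_tests(records):
    """Map each test name to its logged "score" metric."""
    return {r["name"]: r["metrics"]["score"] for r in records}


print(json.dumps(summarise_tests([record])))  # {"subpopulation-a": 0.955}
```

Collecting such records from several saved projects makes it straightforward to tabulate per-slice scores across experiments.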