Represent the project as a table
Important
The functionality described in this guide was removed in version 2.0. To obtain a tabular
representation of your project and/or repository, install lazyscribe-arrow>=0.3 and use
code similar to the following:
import pyarrow as pa
from lazyscribe import Project
from lazyscribe_arrow.interchange import to_table
project: Project = ...
table: pa.Table = to_table(project)
This code returns a pyarrow.Table object. Because Arrow's interchange format supports
“zero-copy” exchange with many popular data structures, you can use this object with Pandas, Polars, DuckDB, and similar tools.
import polars as pl
df = pl.DataFrame(table)
To aid in visualization and comparison, versions of lazyscribe before 2.0 provide a built-in method
lazyscribe.project.Project.to_tabular() for generating a pandas-ready format:
from lazyscribe import Project
project = Project(fpath=..., mode="r")
experiments, tests = project.to_tabular()
The experiments entry in the tuple is a list of dictionaries, each representing a
single experiment in the project. Each dictionary contains experiment metadata, every
metric value, and any parameters that aren’t dictionaries, tuples, or lists. The tests entry
covers non-global metrics. Similarly, it contains some experiment metadata
along with the test-level metrics from the experiment.
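To make the shape of these records concrete, here is a rough sketch of how one experiment row could be assembled. It is not lazyscribe's implementation: the flatten_experiment helper, the field names, and the (group, field) tuple keys are all assumptions chosen for illustration.

```python
# Hypothetical helper; not part of lazyscribe.
def flatten_experiment(name, author, parameters, metrics):
    """Build one row keyed by (group, field) tuples (an assumed key layout)."""
    row = {("name", ""): name, ("author", ""): author}
    for key, value in parameters.items():
        # Skip container-valued parameters, mirroring the rule described above.
        if isinstance(value, (dict, tuple, list)):
            continue
        row[("parameters", key)] = value
    for key, value in metrics.items():
        row[("metrics", key)] = value
    return row


row = flatten_experiment(
    name="baseline",
    author="alice",
    parameters={"alpha": 0.1, "features": ["a", "b"]},
    metrics={"auc": 0.85},
)
```

In this sketch the list-valued "features" parameter is dropped, while the scalar "alpha" parameter and the "auc" metric are kept.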
To use these lists, convert them to pandas.DataFrame objects with multi-index column names:
import pandas as pd
exp_df = pd.DataFrame(experiments)
exp_df.columns = pd.MultiIndex.from_tuples(exp_df.columns)
test_df = pd.DataFrame(tests)
test_df.columns = pd.MultiIndex.from_tuples(test_df.columns)
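Once the columns are a multi-index, related columns can be selected by their top-level label. The sketch below uses hand-written rows with hypothetical field names rather than real to_tabular() output, so the labels are assumptions.

```python
import pandas as pd

# Hypothetical rows keyed by (group, field) tuples; not real lazyscribe output.
experiments = [
    {("name", ""): "baseline", ("parameters", "alpha"): 0.1, ("metrics", "auc"): 0.85},
    {("name", ""): "tuned", ("parameters", "alpha"): 0.5, ("metrics", "auc"): 0.9},
]

exp_df = pd.DataFrame(experiments)
exp_df.columns = pd.MultiIndex.from_tuples(exp_df.columns)

# Select every column under the "metrics" group.
metrics_df = exp_df["metrics"]

# Select a single column by its full tuple label.
auc = exp_df[("metrics", "auc")]
```

Grouping by the first level keeps metrics and parameters apart even when they share a field name.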
You can also make a pandas.Series for an experiment or a test:
experiment = project.experiments[0]
exp_s = pd.Series(experiment.to_tabular())
test = experiment.tests[0]
test_s = pd.Series(test.to_tabular())