aymara_ai.lib.plot

Functions

`eval_pass_stats`(eval_runs)	Create a DataFrame of pass rates and pass totals from one or more eval runs.
`eval_pass_stats_by_category`(eval_run, prompts, responses)	Create a DataFrame of pass rates and pass totals per prompt category from one eval run.
`graph_eval_stats`(eval_runs[, title, ylim_min, ...])	Draw a bar graph of pass rates from one or more eval runs.
`graph_eval_by_category`(eval_run, prompts, responses[, ...])	Draw a bar graph of pass rates per prompt category from one eval run.

Module Contents

aymara_ai.lib.plot.eval_pass_stats(eval_runs)

Create a DataFrame of pass rates and pass totals from one or more eval runs.

Parameters:

eval_runs (Union[EvalRunResponse, List[EvalRunResponse]]) – A single eval run or a list of eval runs to summarize.

Returns:

DataFrame of pass rates per eval run.

Return type:

pd.DataFrame
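
As a rough illustration of what a per-run pass summary contains, the sketch below computes pass totals and pass rates in plain Python. It does not call `aymara_ai`: the `RunSummary` record, its field names, the row keys, and the sample data are all hypothetical, and the columns of the DataFrame actually returned by `eval_pass_stats` may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RunSummary:
    """Hypothetical stand-in for one eval run (not an aymara_ai type)."""
    run_id: str
    passed: List[bool]  # one pass/fail flag per scored response


def pass_stats(runs: List[RunSummary]) -> List[Dict[str, object]]:
    """Return one row per run with its pass total and pass rate."""
    rows = []
    for run in runs:
        total = len(run.passed)
        pass_total = sum(run.passed)  # True counts as 1
        # Guard against empty runs to avoid dividing by zero.
        pass_rate = pass_total / total if total else 0.0
        rows.append(
            {"run_id": run.run_id, "pass_total": pass_total, "pass_rate": pass_rate}
        )
    return rows


rows = pass_stats([
    RunSummary("run-a", [True, True, False, True]),
    RunSummary("run-b", [True, False]),
])
```

Each row pairs a count (`pass_total`) with a fraction (`pass_rate`), which is the same distinction `yaxis_is_percent` toggles in the graphing functions.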

aymara_ai.lib.plot.eval_pass_stats_by_category(eval_run, prompts, responses)

Create a DataFrame of pass rates and pass totals per prompt category from one eval run.

Parameters:

eval_run (EvalRunResult) – One eval run to graph.
prompts (List[EvalPrompt]) – List of evaluation prompts.
responses (List[ScoredResponse]) – List of scored responses.

Returns:

DataFrame of pass rates per evaluation prompt category.

Return type:

pd.DataFrame
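
The per-category grouping can be pictured with the following plain-Python sketch. It is only an approximation under stated assumptions: prompts and responses are modelled as simple tuples linked by a prompt ID, and the names `pass_stats_by_category`, `pass_total`, and `pass_rate` are hypothetical rather than taken from the library.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical sample data (not aymara_ai types):
# prompts are (prompt_id, category); responses are (prompt_id, is_passed).
prompts: List[Tuple[str, str]] = [
    ("p1", "safety"),
    ("p2", "safety"),
    ("p3", "accuracy"),
]
responses: List[Tuple[str, bool]] = [("p1", True), ("p2", False), ("p3", True)]


def pass_stats_by_category(
    prompts: List[Tuple[str, str]],
    responses: List[Tuple[str, bool]],
) -> Dict[str, Dict[str, float]]:
    """Group scored responses by their prompt's category and summarize them."""
    category_of = dict(prompts)  # prompt_id -> category
    counts = defaultdict(lambda: [0, 0])  # category -> [passed, total]
    for prompt_id, is_passed in responses:
        bucket = counts[category_of[prompt_id]]
        bucket[0] += int(is_passed)
        bucket[1] += 1
    return {
        category: {"pass_total": passed, "pass_rate": passed / total}
        for category, (passed, total) in counts.items()
    }


by_category = pass_stats_by_category(prompts, responses)
```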

aymara_ai.lib.plot.graph_eval_stats(eval_runs, title=None, ylim_min=None, ylim_max=None, yaxis_is_percent=True, ylabel='Responses Passed', xaxis_is_eval_run_uuids=False, xlabel=None, xtick_rot=30.0, xtick_labels_dict=None, **kwargs)

Draw a bar graph of pass rates from one or more eval runs.

Parameters:

eval_runs (Union[List[EvalRunResult], EvalRunResult]) – A single eval run or a list of eval runs to graph.
title (str, optional) – Graph title.
ylim_min (float, optional) – y-axis lower limit, defaults to rounding down to the nearest ten.
ylim_max (float, optional) – y-axis upper limit, defaults to matplotlib’s preference but is capped at 100.
yaxis_is_percent (bool, optional) – Whether to show the pass rate as a percent (instead of the total number of responses passed), defaults to True.
ylabel (str) – Label of the y-axis, defaults to ‘Responses Passed’.
xaxis_is_eval_run_uuids (Optional[bool]) – Whether the x-axis represents eval runs (True) or evals (False), defaults to False.
xlabel (str) – Label of the x-axis, defaults to ‘Eval Runs’ if xaxis_is_eval_run_uuids=True and ‘Evals’ otherwise.
xtick_rot (float) – Rotation of the x-axis tick labels in degrees, defaults to 30.
xtick_labels_dict (dict, optional) – Maps eval names (keys) to x-axis tick labels (values).
kwargs – Options to pass to matplotlib.pyplot.bar.

Return type:

None
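
The y-axis defaults described above (lower limit rounded down to the nearest ten, upper limit capped at 100) can be sketched as follows. This is an illustration of the arithmetic only, not the library's implementation: the helper names are hypothetical, and it assumes the rounding is applied to the smallest plotted pass rate, expressed as a percent.

```python
import math
from typing import List


def default_ylim_min(pass_rates_pct: List[float]) -> int:
    """Round the smallest pass rate (in percent) down to the nearest ten."""
    return math.floor(min(pass_rates_pct) / 10) * 10


def cap_ylim_max(auto_upper: float) -> float:
    """Cap an automatically chosen upper limit at 100 percent."""
    return min(auto_upper, 100.0)


lower = default_ylim_min([87.5, 92.0])
upper = cap_ylim_max(105.0)
```

Starting the axis at a rounded-down value rather than zero makes small differences between high pass rates easier to see, at the cost of exaggerating them visually; passing an explicit `ylim_min` overrides the default.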

aymara_ai.lib.plot.graph_eval_by_category(eval_run, prompts, responses, title=None, ylim_min=None, ylim_max=None, yaxis_is_percent=True, ylabel='Responses Passed', xlabel='Prompt Category', xtick_rot=30.0, xtick_labels_dict=None, **kwargs)

Draw a bar graph of pass rates per prompt category from one eval run.

Parameters:

eval_run (EvalRunResult) – The eval run to graph.
prompts (List[EvalPrompt]) – List of evaluation prompts.
responses (List[ScoredResponse]) – List of scored responses.
title (str, optional) – Graph title.
ylim_min (float, optional) – y-axis lower limit, defaults to rounding down to the nearest ten.
ylim_max (float, optional) – y-axis upper limit, defaults to matplotlib’s preference but is capped at 100.
yaxis_is_percent (bool, optional) – Whether to show the pass rate as a percent (instead of the total number of responses passed), defaults to True.
ylabel (str) – Label of the y-axis, defaults to ‘Responses Passed’.
xlabel (str) – Label of the x-axis, defaults to ‘Prompt Category’.
xtick_rot (float) – Rotation of the x-axis tick labels in degrees, defaults to 30.
xtick_labels_dict (dict, optional) – Maps test_names (keys) to x-axis tick labels (values).
kwargs – Options to pass to matplotlib.pyplot.bar.

Return type:

None
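
To make the `yaxis_is_percent` and `xtick_labels_dict` options more concrete, here is a small plain-Python sketch of how bar heights and tick labels might be derived before plotting. The helper names are hypothetical, and the fallback of keeping the original name when a key is missing from the mapping is an assumption, not documented library behavior.

```python
from typing import Dict, List, Optional


def bar_heights(
    pass_totals: List[int],
    totals: List[int],
    yaxis_is_percent: bool = True,
) -> List[float]:
    """Return percent pass rates, or raw pass counts if yaxis_is_percent is False."""
    if yaxis_is_percent:
        return [100 * passed / total for passed, total in zip(pass_totals, totals)]
    return [float(passed) for passed in pass_totals]


def relabel_ticks(
    names: List[str],
    xtick_labels_dict: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Swap names for display labels; unmapped names are kept as-is (assumption)."""
    mapping = xtick_labels_dict or {}
    return [mapping.get(name, name) for name in names]


heights = bar_heights([1, 3], [2, 4])
counts = bar_heights([1, 3], [2, 4], yaxis_is_percent=False)
labels = relabel_ticks(["safety", "accuracy"], {"safety": "Safety"})
```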