aymara_ai.lib.plot

Functions

eval_pass_stats(eval_runs)

Create a DataFrame of pass rates and pass totals from one or more eval runs.

eval_pass_stats_by_category(eval_run, prompts, responses)

Create a DataFrame of pass rates and pass totals per prompt category from one eval run.

graph_eval_stats(eval_runs[, title, ylim_min, ...])

Draw a bar graph of pass rates from one or more eval runs.

graph_eval_by_category(eval_run, prompts, responses[, ...])

Draw a bar graph of pass rates per prompt category from one eval run.

Module Contents

aymara_ai.lib.plot.eval_pass_stats(eval_runs)

Create a DataFrame of pass rates and pass totals from one or more eval runs.

Parameters:

eval_runs (Union[EvalRunResponse, List[EvalRunResponse]]) – One eval run or a list of eval runs to summarize.

Returns:

DataFrame of pass rates and pass totals per eval run.

Return type:

pd.DataFrame
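The idea of a pass rate and a pass total per run can be illustrated without the library. The following is a minimal standard-library sketch, not the implementation of eval_pass_stats: the input shape (a dict of run identifiers to lists of booleans), the helper name pass_stats, and the keys "run", "pass_rate", and "pass_total" are all assumptions made for illustration, and the result is a list of dicts rather than a DataFrame.

```python
def pass_stats(runs):
    """Return one row per run with its pass rate and pass total.

    `runs` maps a run identifier to a list of booleans, one per scored
    response (True means the response passed). This input shape is an
    assumption for the sketch, not the library's data model.
    """
    rows = []
    for run_id, results in runs.items():
        pass_total = sum(results)  # True counts as 1, False as 0
        # Guard against an empty run to avoid dividing by zero.
        pass_rate = pass_total / len(results) if results else 0.0
        rows.append(
            {"run": run_id, "pass_rate": pass_rate, "pass_total": pass_total}
        )
    return rows


# Hypothetical data: two runs with four scored responses each.
example_runs = {
    "run-a": [True, True, False, True],
    "run-b": [False, False, True, True],
}
stats = pass_stats(example_runs)
```

Each row in stats corresponds to one run, which mirrors the "one row per eval run" layout described for the returned DataFrame.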

aymara_ai.lib.plot.eval_pass_stats_by_category(eval_run, prompts, responses)

Create a DataFrame of pass rates and pass totals per prompt category from one eval run.

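Parameters:
  • eval_run – The eval run to summarize.

  • prompts – List of evaluation prompts.

  • responses – List of scored responses.

Returns: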

DataFrame of pass rates per evaluation prompt category.

Return type:

pd.DataFrame
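Grouping pass results by prompt category amounts to joining each response to its prompt and aggregating per category. The sketch below shows that idea with the standard library only. It is not the library's code: the helper name pass_stats_by_category and the dict keys "prompt_id", "category", and "passed" are hypothetical stand-ins for whatever fields the real prompt and response objects carry.

```python
from collections import defaultdict


def pass_stats_by_category(prompts, responses):
    """Aggregate pass rate and pass total per prompt category.

    `prompts` is a list of dicts with hypothetical keys "prompt_id" and
    "category"; `responses` is a list of dicts with hypothetical keys
    "prompt_id" and "passed".
    """
    # Look up each prompt's category by its identifier.
    category_of = {p["prompt_id"]: p["category"] for p in prompts}

    passed = defaultdict(int)
    total = defaultdict(int)
    for response in responses:
        category = category_of[response["prompt_id"]]
        total[category] += 1
        passed[category] += int(response["passed"])

    return {
        category: {
            "pass_rate": passed[category] / total[category],
            "pass_total": passed[category],
        }
        for category in total
    }


# Hypothetical data: two prompts in one category, one in another.
example_prompts = [
    {"prompt_id": "p1", "category": "safety"},
    {"prompt_id": "p2", "category": "safety"},
    {"prompt_id": "p3", "category": "privacy"},
]
example_responses = [
    {"prompt_id": "p1", "passed": True},
    {"prompt_id": "p2", "passed": False},
    {"prompt_id": "p3", "passed": True},
]
by_category = pass_stats_by_category(example_prompts, example_responses)
```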

aymara_ai.lib.plot.graph_eval_stats(eval_runs, title=None, ylim_min=None, ylim_max=None, yaxis_is_percent=True, ylabel='Responses Passed', xaxis_is_eval_run_uuids=False, xlabel=None, xtick_rot=30.0, xtick_labels_dict=None, **kwargs)

Draw a bar graph of pass rates from one or more eval runs.

Parameters:
  • eval_runs (Union[List[EvalRunResult], EvalRunResult]) – One eval run or a list of eval runs to graph.

  • title (str, optional) – Graph title.

  • ylim_min (float, optional) – y-axis lower limit, defaults to rounding down to the nearest ten.

  • ylim_max (float, optional) – y-axis upper limit, defaults to matplotlib’s preference but is capped at 100.

  • yaxis_is_percent (bool, optional) – Whether to show the pass rate as a percent (instead of the total number of responses passed), defaults to True.

  • ylabel (str) – Label of the y-axis, defaults to ‘Responses Passed’.

  • xaxis_is_eval_run_uuids (Optional[bool]) – Whether the x-axis represents eval runs (True) or evals (False), defaults to False.

  • xlabel (str) – Label of the x-axis, defaults to ‘Eval Runs’ if xaxis_is_eval_run_uuids=True and ‘Evals’ otherwise.

  • xtick_rot (float) – Rotation of the x-axis tick labels in degrees, defaults to 30.

  • xtick_labels_dict (dict, optional) – Maps eval names (keys) to x-axis tick labels (values).

  • kwargs – Options to pass to matplotlib.pyplot.bar.

Return type:

None
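The y-axis defaults described above (a lower limit rounded down to the nearest ten, an upper limit capped at 100) can be sketched as a small helper. This is an illustration under stated assumptions, not the library's code: the name resolve_ylim is hypothetical, auto_max stands in for whatever upper limit matplotlib would have chosen, and the sketch assumes the value rounded down is the lowest pass rate on a percent scale.

```python
import math


def resolve_ylim(pass_rates, auto_max, ylim_min=None, ylim_max=None):
    """Return (lower, upper) y-axis limits for pass rates in percent.

    `pass_rates` are percentages; `auto_max` is a stand-in for the upper
    limit matplotlib would pick on its own (an assumption of this sketch).
    """
    if ylim_min is None:
        # Assumed default: round the lowest pass rate down to a multiple of 10.
        ylim_min = math.floor(min(pass_rates) / 10) * 10
    if ylim_max is None:
        # Default: keep the automatic limit, but never exceed 100 percent.
        ylim_max = min(auto_max, 100)
    return ylim_min, ylim_max


# Hypothetical pass rates of 83% and 91.5%, with an automatic limit of 105.
default_limits = resolve_ylim([83.0, 91.5], auto_max=105.0)

# Explicit limits are passed through unchanged.
explicit_limits = resolve_ylim([83.0], auto_max=105.0, ylim_min=50, ylim_max=95)
```

Starting the axis at a multiple of ten near the lowest bar keeps small differences between high pass rates visible, which is a plausible motivation for the documented default.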

aymara_ai.lib.plot.graph_eval_by_category(eval_run, prompts, responses, title=None, ylim_min=None, ylim_max=None, yaxis_is_percent=True, ylabel='Responses Passed', xlabel='Prompt Category', xtick_rot=30.0, xtick_labels_dict=None, **kwargs)

Draw a bar graph of pass rates per prompt category from one eval run.

Parameters:
  • eval_run (EvalRunResult) – The eval run to graph.

  • prompts (List[EvalPrompt]) – List of evaluation prompts.

  • responses (List[ScoredResponse]) – List of scored responses.

  • title (str, optional) – Graph title.

  • ylim_min (float, optional) – y-axis lower limit, defaults to rounding down to the nearest ten.

  • ylim_max (float, optional) – y-axis upper limit, defaults to matplotlib’s preference but is capped at 100.

  • yaxis_is_percent (bool, optional) – Whether to show the pass rate as a percent (instead of the total number of responses passed), defaults to True.

  • ylabel (str) – Label of the y-axis, defaults to ‘Responses Passed’.

  • xlabel (str) – Label of the x-axis, defaults to ‘Prompt Category’.

  • xtick_rot (float) – Rotation of the x-axis tick labels in degrees, defaults to 30.

  • xtick_labels_dict (dict, optional) – Maps prompt category names (keys) to x-axis tick labels (values).

  • kwargs – Options to pass to matplotlib.pyplot.bar.

Return type:

None
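The xtick_labels_dict parameter maps names to the labels shown on the x-axis. A minimal sketch of that kind of relabeling follows. The helper name relabel_ticks is hypothetical, and falling back to the original name when a key is missing is an assumption of this sketch; the documentation above does not say how the library treats unmapped names.

```python
def relabel_ticks(names, xtick_labels_dict=None):
    """Return the tick label for each name, in the same order.

    Names missing from the mapping keep their original text; that
    fallback is an assumption of this sketch.
    """
    if xtick_labels_dict is None:
        return list(names)
    return [xtick_labels_dict.get(name, name) for name in names]


# Hypothetical category names and a partial mapping to shorter labels.
categories = ["personally_identifiable_information", "safety", "misc"]
labels = relabel_ticks(
    categories,
    {"personally_identifiable_information": "PII", "safety": "Safety"},
)
```

Shorter labels of this kind can reduce the need for rotated tick text, which is otherwise controlled by xtick_rot.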