aigct.plotter

Classes

VEAnalysisPlotter

Plot results of an analysis

Module Contents

class aigct.plotter.VEAnalysisPlotter(config: aigct.util.Config)[source]

Plot results of an analysis

_config[source]
_roc_pr_config[source]
_mwu_config[source]
_calibration_line_config[source]
_plot_roc_curves(aucs: pandas.DataFrame, user_vep_name: str, curve_coords: pandas.DataFrame, batch_no: int, num_batches: int, num_top_labelled_veps: int, ves_color_palette: dict, file_name: str = None)[source]
_plot_pr_curves(aucs: pandas.DataFrame, user_vep_name: str, curve_coords: pandas.DataFrame, batch_no: int, num_batches: int, num_top_labelled_veps: int, ves_color_palette: dict, file_name: str = None)[source]
_display_mwu_table(results: aigct.model.VEAnalysisResult, file_name: str = None)[source]
_display_pr_table(results: aigct.model.VEAnalysisResult, file_name: str = None)[source]
_display_roc_table(results: aigct.model.VEAnalysisResult, file_name: str = None)[source]
_plot_mwu_bar(mwus: pandas.DataFrame, batch_no: int, num_batches: int, file_name: str = None, palette=None)[source]
plot_pr_results(results: aigct.model.VEAnalysisResult, num_top_labelled_veps: int, ves_color_palette: dict, dir: str = None)[source]
plot_roc_results(results: aigct.model.VEAnalysisResult, num_top_labelled_veps: int, ves_color_palette: dict, dir: str = None)[source]
plot_mwu_results(results: aigct.model.VEAnalysisResult, ves_color_palette: dict, dir: str = None)[source]
plot_results(results: aigct.model.VEAnalysisResult, metrics: str | list[str] = ['roc', 'pr', 'mwu'], num_top_labelled_veps: int = None, num_top_genes: int = None, dir: str = None)[source]

Plot the results of an analysis either to the screen or to files.

Parameters

resultsVEAnalysisResult

Analysis result object

metricsstr or list[str]

Specifies which metrics to plot. Can be a string indicating a single metric or a list of strings for multiple metrics. The metrics are: roc, pr, mwu.

num_top_labelled_vepsint

If not None, only this many of the top performing veps will be labelled in the auc plot legends. This is useful when there are many veps and the legend becomes too cluttered.

num_top_genesint

If compute_gene_metrics was set to True in call to compute_metrics, then only include this many top genes in the plot. The top gene are the ones for which the most variants were observed.

dirstr, optional

Directory to place the plot files. The files will be placed in a subdirectory off of this directory whose name begins with ve_analysis_plots and suffixed by a unique timestamp. If not specified will plot to screen.

plot_gene_results(results: aigct.model.VEAnalysisResult, metrics: list[str], num_top_genes: int = None, dir: str = None)[source]

Plot gene-level results of an analysis.

Parameters

resultsVEAnalysisResult

Analysis result object containing gene-level metrics

metricslist[str]

List of metrics to plot (roc, pr, mwu)

num_top_genesint

Number of top genes to plot based on the number of variants in each gene included in the analysis.

dirstr, optional

Directory to place the plot files

plot_gene_level_results(gene_general_metrics: pandas.DataFrame, gene_metrics: pandas.DataFrame, metric_column: str, metric_display_name: str, title: str, gene_metric_sorter: aigct.report_util.GeneMetricSorter, ves_color_palette: dict, dir: str = None, figure_file_name: str = None, table_file_name: str = None)[source]
_display_gene_metric_table(metrics_df: pandas.DataFrame, metric_column: str, metric_display_name: str, title: str, file_name: str = None)[source]
_plot_score_vs_pathogenic_fraction(axes, results, x_lower_limit: float = None, x_upper_limit: float = None, annotate: bool = False, dir: str = None)[source]

Plot the fraction of pathogenic variants versus mean score as a line plot.

Parameters

axesmatplotlib.axes.Axes

Matplotlib axes object to plot on

resultsVEAnalysisCalibrationResult

Calibration result object containing score bins

annotatebool, optional

If True, annotate each point with the score range

dirstr, optional

Directory to save the plot file

_plot_precision_recall_vs_thresholds1(results: aigct.model.VEAnalysisCalibrationResult, dir: str = None)[source]

Obsolete to be removed. Plot precision, recall, and F1 score versus threshold values.

Parameters

resultsVEAnalysisCalibrationResult

Calibration result object containing precision-recall curve data

dirstr, optional

Directory to save the plot file

_plot_precision_recall_vs_threshold(results: aigct.model.VEAnalysisCalibrationResult, threshold_boundary: float = 0.5, precision_cutoff: float = 0.9, dir: str = None)[source]

Obsolete to be removed. Plot precision, recall, and F1 score versus threshold values.

Parameters

resultsVEAnalysisCalibrationResult

Calibration result object containing precision-recall curve data

dirstr, optional

Directory to save the plot file

_plot_score_vs_variant_counts(axes, results: aigct.model.VEAnalysisCalibrationResult, bins: int, x_lower_limit: float = None, x_upper_limit: float = None, dir: str = None)[source]

Plot histograms showing the distribution of RANK_SCORE values for positive (BINARY_LABEL=1) and negative (BINARY_LABEL=0) variants.

Parameters

resultsVEAnalysisCalibrationResult

Calibration result object containing variant scores and labels

axesmatplotlib.axes.Axes

Matplotlib axes object to plot on

binsint

Number of bins to use for the histogram

dirstr, optional

Directory to save the plot file

plot_calibration_curves(results: aigct.model.VEAnalysisCalibrationResult, target_precision: float = None, target_recall: float = None, target_f1: float = None, dir: str = None)[source]

Plot the results of calling VEAnalyzer.compute_calibration_metrics. Generates 3 plots: 1. Pathogenic fraction by score interval 2. Distribution of variant scores by pathogenicity 3. Precision, recall, and F1 score versus threshold values

The first 2 plots are vertically stacked in a single figure.

Parameters

resultsVEAnalysisCalibrationResult

Calibration result object returned by calling VEAnalyzer.compute_calibration_metrics.

target_precision: float, optional

If specified, will plot a vertical line at the threshold that achieves the target precision.

target_recall: float, optional

If specified, will plot a vertical line at the threshold that achieves the target recall.

target_f1: float, optional

If specified, will plot a vertical line at the threshold that achieves the target f1 score.

dirstr, optional

Directory to place the plot files. The files will be placed in a subdirectory off of this directory whose name begins with ve_calibration_plots and suffixed by a unique timestamp. If not specified will plot to screen.

_plot_binned_data(results: aigct.model.VEAnalysisCalibrationResult, dir: str = None)[source]

Plot the binned data as 2 subplots vertically stacked.

_plot_metrics_vs_threshold(results: aigct.model.VEAnalysisCalibrationResult, target_precision: float = None, target_recall: float = None, target_f1: float = None, dir: str = None)[source]

Plot precision, recall, F1 score, versus threshold values.