aigct.plotter

Classes

VEAnalysisPlotter: Plot results of an analysis.

Module Contents

class aigct.plotter.VEAnalysisPlotter(config: aigct.util.Config)

Plot results of an analysis

_config

_roc_pr_config

_mwu_config

_calibration_line_config

_plot_roc_curves(aucs: pandas.DataFrame, user_vep_name: str, curve_coords: pandas.DataFrame, batch_no: int, num_batches: int, num_top_labelled_veps: int, ves_color_palette: dict, file_name: str = None)

_plot_pr_curves(aucs: pandas.DataFrame, user_vep_name: str, curve_coords: pandas.DataFrame, batch_no: int, num_batches: int, num_top_labelled_veps: int, ves_color_palette: dict, file_name: str = None)

_display_mwu_table(results: aigct.model.VEAnalysisResult, file_name: str = None)

_display_pr_table(results: aigct.model.VEAnalysisResult, file_name: str = None)

_display_roc_table(results: aigct.model.VEAnalysisResult, file_name: str = None)

_plot_mwu_bar(mwus: pandas.DataFrame, batch_no: int, num_batches: int, file_name: str = None, palette=None)

plot_pr_results(results: aigct.model.VEAnalysisResult, num_top_labelled_veps: int, ves_color_palette: dict, dir: str = None)

plot_roc_results(results: aigct.model.VEAnalysisResult, num_top_labelled_veps: int, ves_color_palette: dict, dir: str = None)

plot_mwu_results(results: aigct.model.VEAnalysisResult, ves_color_palette: dict, dir: str = None)

plot_results(results: aigct.model.VEAnalysisResult, metrics: str | list[str] = ['roc', 'pr', 'mwu'], num_top_labelled_veps: int = None, num_top_genes: int = None, dir: str = None)

Plot the results of an analysis either to the screen or to files.

Parameters

results (VEAnalysisResult): Analysis result object.
metrics (str or list[str]): Specifies which metrics to plot. Can be a string naming a single metric or a list of strings for multiple metrics. The available metrics are: roc, pr, mwu.
num_top_labelled_veps (int): If not None, only this many of the top-performing VEPs are labelled in the AUC plot legends. This is useful when there are many VEPs and the legend becomes too cluttered.
num_top_genes (int): If compute_gene_metrics was set to True in the call to compute_metrics, include only this many top genes in the plot. The top genes are the ones for which the most variants were observed.
dir (str, optional): Directory to place the plot files. The files are placed in a subdirectory of this directory whose name begins with ve_analysis_plots and ends with a unique timestamp. If not specified, the plots are shown on screen.
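
The handling of the metrics and dir arguments described above can be sketched as follows. This is a minimal illustration, not the library's code: the helper names normalize_metrics and make_plot_dir are hypothetical, and the timestamp format is an assumption, since the documentation only states that the subdirectory name begins with ve_analysis_plots and ends with a unique timestamp.

```python
import os
import tempfile
from datetime import datetime

VALID_METRICS = ("roc", "pr", "mwu")  # metric names listed in the parameter docs


def normalize_metrics(metrics):
    """Accept a single metric name or a list of names; return a validated list."""
    if isinstance(metrics, str):
        metrics = [metrics]
    unknown = [m for m in metrics if m not in VALID_METRICS]
    if unknown:
        raise ValueError(f"Unknown metrics: {unknown}")
    return list(metrics)


def make_plot_dir(base_dir, prefix="ve_analysis_plots"):
    """Create a timestamped subdirectory under base_dir and return its path.

    The timestamp format below is an assumption made for this sketch.
    """
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    path = os.path.join(base_dir, f"{prefix}_{stamp}")
    os.makedirs(path)
    return path


if __name__ == "__main__":
    print(normalize_metrics("roc"))
    print(make_plot_dir(tempfile.mkdtemp()))
```

Returning a fresh list from normalize_metrics also avoids sharing a mutable default list between calls.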

plot_gene_results(results: aigct.model.VEAnalysisResult, metrics: list[str], num_top_genes: int = None, dir: str = None)

Plot gene-level results of an analysis.

Parameters

results (VEAnalysisResult): Analysis result object containing gene-level metrics.
metrics (list[str]): List of metrics to plot (roc, pr, mwu).
num_top_genes (int): Number of top genes to plot, ranked by the number of variants in each gene included in the analysis.
dir (str, optional): Directory to place the plot files.
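
The "top genes" ranking described for num_top_genes can be sketched as below. This is only an illustration of ranking by variant count: the function name and the flat list of gene symbols are hypothetical and do not reflect how VEAnalysisResult stores its data.

```python
from collections import Counter


def top_genes_by_variant_count(variant_genes, num_top_genes=None):
    """Return gene symbols ordered by descending variant count.

    variant_genes: iterable with one gene symbol per variant (hypothetical input).
    num_top_genes: keep only this many genes; None keeps all of them.
    """
    counts = Counter(variant_genes)
    # most_common(None) returns every gene, most frequent first.
    return [gene for gene, _ in counts.most_common(num_top_genes)]


genes = ["BRCA1", "TP53", "BRCA1", "PTEN", "TP53", "BRCA1"]
top_two = top_genes_by_variant_count(genes, num_top_genes=2)
print(top_two)
```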

plot_gene_level_results(gene_general_metrics: pandas.DataFrame, gene_metrics: pandas.DataFrame, metric_column: str, metric_display_name: str, title: str, gene_metric_sorter: aigct.report_util.GeneMetricSorter, ves_color_palette: dict, dir: str = None, figure_file_name: str = None, table_file_name: str = None)

_display_gene_metric_table(metrics_df: pandas.DataFrame, metric_column: str, metric_display_name: str, title: str, file_name: str = None)

_plot_score_vs_pathogenic_fraction(axes, results, x_lower_limit: float = None, x_upper_limit: float = None, annotate: bool = False, dir: str = None)

Plot the fraction of pathogenic variants versus mean score as a line plot.

Parameters

axes (matplotlib.axes.Axes): Matplotlib axes object to plot on.
results (VEAnalysisCalibrationResult): Calibration result object containing score bins.
annotate (bool, optional): If True, annotate each point with the score range.
dir (str, optional): Directory to save the plot file.
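
The quantity on each axis of this plot can be computed roughly as follows. This is a sketch under stated assumptions: equal-width bins and the function name are choices made here, and the way the library builds the score bins held in the calibration result is not documented in this section.

```python
def pathogenic_fraction_by_bin(scores, labels, num_bins=5):
    """Return (mean_score, pathogenic_fraction, count) for each non-empty bin.

    Bins are equal-width between min(scores) and max(scores) (an assumption).
    labels are 1 for pathogenic variants and 0 otherwise.
    """
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width when all scores match
    bins = [[] for _ in range(num_bins)]
    for score, label in zip(scores, labels):
        # Clamp so the maximum score falls into the last bin.
        idx = min(int((score - lo) / width), num_bins - 1)
        bins[idx].append((score, label))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_score = sum(s for s, _ in members) / len(members)
        fraction = sum(l for _, l in members) / len(members)
        rows.append((mean_score, fraction, len(members)))
    return rows


rows = pathogenic_fraction_by_bin(
    [0.0, 0.1, 0.4, 0.5, 0.9, 1.0], [0, 0, 0, 1, 1, 1], num_bins=2
)
```

Plotting the first element of each row against the second gives the line described above.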

_plot_precision_recall_vs_thresholds1(results: aigct.model.VEAnalysisCalibrationResult, dir: str = None)

Obsolete; to be removed. Plot precision, recall, and F1 score versus threshold values.

Parameters

results (VEAnalysisCalibrationResult): Calibration result object containing precision-recall curve data.
dir (str, optional): Directory to save the plot file.

_plot_precision_recall_vs_threshold(results: aigct.model.VEAnalysisCalibrationResult, threshold_boundary: float = 0.5, precision_cutoff: float = 0.9, dir: str = None)

Obsolete; to be removed. Plot precision, recall, and F1 score versus threshold values.

Parameters

results (VEAnalysisCalibrationResult): Calibration result object containing precision-recall curve data.
dir (str, optional): Directory to save the plot file.

_plot_score_vs_variant_counts(axes, results: aigct.model.VEAnalysisCalibrationResult, bins: int, x_lower_limit: float = None, x_upper_limit: float = None, dir: str = None)

Plot histograms showing the distribution of RANK_SCORE values for positive (BINARY_LABEL=1) and negative (BINARY_LABEL=0) variants.

Parameters

axes (matplotlib.axes.Axes): Matplotlib axes object to plot on.
results (VEAnalysisCalibrationResult): Calibration result object containing variant scores and labels.
bins (int): Number of bins to use for the histogram.
dir (str, optional): Directory to save the plot file.
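
The per-label histogram counts behind this plot can be sketched as below. The function name is hypothetical, and the default limits of 0.0 and 1.0 are an assumption; the actual range of RANK_SCORE values is not stated in this section.

```python
def label_histograms(scores, labels, bins, lower=0.0, upper=1.0):
    """Count scores per equal-width bin, separately for labels 1 and 0.

    Scores outside [lower, upper] are skipped; the defaults are assumptions.
    """
    width = (upper - lower) / bins
    positive = [0] * bins
    negative = [0] * bins
    for score, label in zip(scores, labels):
        if not lower <= score <= upper:
            continue
        # Clamp so a score equal to `upper` lands in the last bin.
        idx = min(int((score - lower) / width), bins - 1)
        if label == 1:
            positive[idx] += 1
        else:
            negative[idx] += 1
    return positive, negative


positive, negative = label_histograms(
    [0.05, 0.3, 0.55, 0.8, 0.95, 1.0], [0, 0, 1, 0, 1, 1], bins=4
)
```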

plot_calibration_curves(results: aigct.model.VEAnalysisCalibrationResult, target_precision: float = None, target_recall: float = None, target_f1: float = None, dir: str = None)

Plot the results of calling VEAnalyzer.compute_calibration_metrics. Generates 3 plots:

1. Pathogenic fraction by score interval
2. Distribution of variant scores by pathogenicity
3. Precision, recall, and F1 score versus threshold values

The first 2 plots are vertically stacked in a single figure.

Parameters

results (VEAnalysisCalibrationResult): Calibration result object returned by calling VEAnalyzer.compute_calibration_metrics.
target_precision (float, optional): If specified, plot a vertical line at the threshold that achieves the target precision.
target_recall (float, optional): If specified, plot a vertical line at the threshold that achieves the target recall.
target_f1 (float, optional): If specified, plot a vertical line at the threshold that achieves the target F1 score.
dir (str, optional): Directory to place the plot files. The files are placed in a subdirectory of this directory whose name begins with ve_calibration_plots and ends with a unique timestamp. If not specified, the plots are shown on screen.
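
One way to locate the threshold marked by the target_precision line is sketched below. It assumes that a variant is called pathogenic when its score is at or above the threshold and that the lowest qualifying threshold is wanted; neither assumption is confirmed by this section, and both function names are hypothetical.

```python
def metrics_at_threshold(scores, labels, threshold):
    """Precision, recall and F1 when scores >= threshold are called pathogenic."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, l in pairs if s >= threshold and l == 1)
    fp = sum(1 for s, l in pairs if s >= threshold and l == 0)
    fn = sum(1 for s, l in pairs if s < threshold and l == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def lowest_threshold_for_precision(scores, labels, target_precision):
    """Return the smallest observed score whose precision meets the target, or None."""
    for threshold in sorted(set(scores)):
        precision, _, _ = metrics_at_threshold(scores, labels, threshold)
        if precision >= target_precision:
            return threshold
    return None


scores = [0.1, 0.2, 0.6, 0.7, 0.9]
labels = [0, 1, 0, 1, 1]
threshold = lowest_threshold_for_precision(scores, labels, target_precision=0.9)
```

The same scan works for target_recall or target_f1 by comparing a different element of the tuple returned by metrics_at_threshold.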

_plot_binned_data(results: aigct.model.VEAnalysisCalibrationResult, dir: str = None): Plot the binned data as 2 vertically stacked subplots.

_plot_metrics_vs_threshold(results: aigct.model.VEAnalysisCalibrationResult, target_precision: float = None, target_recall: float = None, target_f1: float = None, dir: str = None): Plot precision, recall, and F1 score versus threshold values.