aigct.repository

Data access layer for the variant repository. The classes here encapsulate the repository's internal structure so that callers do not depend on it.

Attributes

MERGE_CHUNK_SIZE

TASK_SUBFOLDER

DATA_FOLDER

TASK_FOLDERS

VARIANT_PK_COLUMNS

VARIANT_NON_PK_COLUMNS

VARIANT_TABLE_DEF

VARIANT_LABEL_NON_PK_COLUMNS

VARIANT_EFFECT_LABEL_TABLE_DEF

VARIANT_EFFECT_SCORE_PK_COLUMNS

VARIANT_EFFECT_SCORE_NON_PK_COLUMNS

VARIANT_EFFECT_SCORE_TABLE_DEF

VARIANT_TASK_TABLE_DEF

VARIANT_EFFECT_SOURCE_TABLE_DEF

VARIANT_DATA_SOURCE_TABLE_DEF

VARIANT_FILTER_TABLE_DEF

VARIANT_FILTER_GENE_TABLE_DEF

VARIANT_FILTER_VARIANT_TABLE_DEF

TABLE_DEFS

Classes

TableDef

RepoSessionContext

VariantEffectLabelCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe.

DataCache

Caches a repository csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe.

TaskBasedDataCache

Caches a repository csv file in a dataframe. Maintains a separate cache for each task in a dict.

TaskDataCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe.

VariantEffectScoreCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe.

VariantCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe.

VariantTaskCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe.

VariantEffectSourceCache

Caches a repository csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe.

VariantFilterCache

Classes that wish to behave as thread-safe singletons can inherit from this class.

VariantEffectSourceRepository

VariantTaskRepository

VariantFilterRepository

VariantRepository

VariantEffectLabelRepository

VariantEffectScoreRepository

Functions

read_repo_csv(file) → pandas.DataFrame

query_by_filter(query_df, filter, filter_gene_df, filter_variant_df) → pandas.DataFrame

query_by_filters(query_df, filters) → pandas.DataFrame

Module Contents

aigct.repository.MERGE_CHUNK_SIZE = 500000
aigct.repository.TASK_SUBFOLDER
aigct.repository.DATA_FOLDER = 'data'
aigct.repository.TASK_FOLDERS
class aigct.repository.TableDef
folder: str
file_name: str
pk_columns: list[str]
non_pk_columns: list[str]
columns: list[str]
full_file_name: str
__post_init__()
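
The field list and the __post_init__ hook suggest that TableDef is a dataclass whose columns and full_file_name are derived from the other fields. The following is only a sketch under that assumption: the class name TableDefSketch, the derivation rules, and the file name are hypothetical and are not taken from the aigct source.

```python
import os
from dataclasses import dataclass, field


@dataclass
class TableDefSketch:
    """Hypothetical stand-in for TableDef; field names follow the listing above."""

    folder: str
    file_name: str
    pk_columns: list[str]
    non_pk_columns: list[str]
    # Derived fields: excluded from __init__ and filled in by __post_init__.
    columns: list[str] = field(init=False)
    full_file_name: str = field(init=False)

    def __post_init__(self):
        # Assumption: the full column list is the key columns followed by the rest.
        self.columns = self.pk_columns + self.non_pk_columns
        # Assumption: the file path is the folder joined with the file name.
        self.full_file_name = os.path.join(self.folder, self.file_name)


score_def = TableDefSketch(
    folder="data",
    file_name="variant_effect_score.csv",  # hypothetical file name
    pk_columns=["GENOME_ASSEMBLY", "CHROMOSOME", "POSITION"],
    non_pk_columns=["RAW_SCORE", "RANK_SCORE"],
)
print(score_def.columns)
print(score_def.full_file_name)
```

Deriving the two fields in __post_init__ keeps them consistent with the fields they depend on, since callers cannot pass them in directly.
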
aigct.repository.VARIANT_PK_COLUMNS = ['GENOME_ASSEMBLY', 'CHROMOSOME', 'POSITION', 'REFERENCE_NUCLEOTIDE', 'ALTERNATE_NUCLEOTIDE']
aigct.repository.VARIANT_NON_PK_COLUMNS = ['PRIOR_GENOME_ASSEMBLY', 'PRIOR_CHROMOSOME', 'PRIOR_POSITION', 'PRIOR_PRIOR_GENOME_ASSEMBLY',...
aigct.repository.VARIANT_TABLE_DEF
aigct.repository.VARIANT_LABEL_NON_PK_COLUMNS = ['LABEL_SOURCE', 'RAW_LABEL', 'BINARY_LABEL']
aigct.repository.VARIANT_EFFECT_LABEL_TABLE_DEF
aigct.repository.VARIANT_EFFECT_SCORE_PK_COLUMNS = ['GENOME_ASSEMBLY', 'CHROMOSOME', 'POSITION', 'REFERENCE_NUCLEOTIDE', 'ALTERNATE_NUCLEOTIDE',...
aigct.repository.VARIANT_EFFECT_SCORE_NON_PK_COLUMNS = ['RAW_SCORE', 'RANK_SCORE']
aigct.repository.VARIANT_EFFECT_SCORE_TABLE_DEF
aigct.repository.VARIANT_TASK_TABLE_DEF
aigct.repository.VARIANT_EFFECT_SOURCE_TABLE_DEF
aigct.repository.VARIANT_DATA_SOURCE_TABLE_DEF
aigct.repository.VARIANT_FILTER_TABLE_DEF
aigct.repository.VARIANT_FILTER_GENE_TABLE_DEF
aigct.repository.VARIANT_FILTER_VARIANT_TABLE_DEF
aigct.repository.TABLE_DEFS
aigct.repository.read_repo_csv(file: str) → pandas.DataFrame
class aigct.repository.RepoSessionContext(data_folder_root: str, table_defs: dict[str, TableDef])
_data_folder_root
_table_defs
property data_folder_root
table_def(table_name: str)
table_file(table_name: str, task: str = None)
class aigct.repository.VariantEffectLabelCache

Bases: aigct.util.ParameterizedSingleton

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str)
get_data_frame(task_code: str)
class aigct.repository.DataCache

Bases: aigct.util.ParameterizedSingleton

Caches a repository csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str, table_def: TableDef)
property data_frame
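
The singleton behaviour described above can be sketched as follows. In Python, __init__ runs again every time a class is called, even when __new__ hands back an existing object, so a separate one-time initializer is a natural fit for a singleton that takes parameters. This is a minimal sketch of the general pattern, not the aigct.util.ParameterizedSingleton implementation; the class names and the init_calls counter are hypothetical.

```python
import threading


class ParameterizedSingletonSketch:
    """Hypothetical base class: one shared instance per subclass."""

    _instances = {}
    # A re-entrant lock, so an _init_once that builds another singleton
    # on the same thread does not deadlock.
    _lock = threading.RLock()

    def __new__(cls, *args, **kwargs):
        with cls._lock:
            if cls not in cls._instances:
                instance = super().__new__(cls)
                # Run the one-time initializer with the caller's parameters.
                instance._init_once(*args, **kwargs)
                cls._instances[cls] = instance
        return cls._instances[cls]

    def _init_once(self, *args, **kwargs):
        raise NotImplementedError


class DataCacheSketch(ParameterizedSingletonSketch):
    init_calls = 0  # counts how often the initializer actually ran

    def _init_once(self, data_folder_root: str):
        DataCacheSketch.init_calls += 1
        self.data_folder_root = data_folder_root


first = DataCacheSketch("/repo")
second = DataCacheSketch("/somewhere/else")  # parameters are ignored this time
print(first is second, DataCacheSketch.init_calls, second.data_folder_root)
# → True 1 /repo
```

In this sketch the parameters of the first construction win. Later calls return the same object regardless of the arguments they pass.
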
class aigct.repository.TaskBasedDataCache

Bases: aigct.util.ParameterizedSingleton

Caches a repository csv file in a dataframe. Maintains a separate cache for each task in a dict. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str, table_def: TableDef, disable_cache: bool = False)
get_data_frame(task_code: str)
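
The per-task dict described above can be sketched as below. To keep the example self-contained, a plain callable stands in for reading a task's csv file, and the singleton machinery is left out. The class name, the loader argument, the task codes, and the reading of disable_cache as "reload on every call" are all assumptions rather than documented behaviour.

```python
from typing import Callable, Dict


class TaskBasedCacheSketch:
    """Hypothetical per-task cache; `loader` stands in for reading a task's csv file."""

    def __init__(self, loader: Callable[[str], list], disable_cache: bool = False):
        self._loader = loader
        self._disable_cache = disable_cache
        self._frames: Dict[str, list] = {}  # task code -> loaded data

    def get_data_frame(self, task_code: str):
        if self._disable_cache:
            # Assumption: with caching disabled, nothing is kept between calls.
            return self._loader(task_code)
        if task_code not in self._frames:
            # First request for this task: load once and remember the result.
            self._frames[task_code] = self._loader(task_code)
        return self._frames[task_code]


loads = []


def fake_loader(task_code: str) -> list:
    loads.append(task_code)  # record each real load
    return [f"{task_code}-row"]


cache = TaskBasedCacheSketch(fake_loader)
cache.get_data_frame("TASK_A")
cache.get_data_frame("TASK_A")  # served from the dict, no second load
cache.get_data_frame("TASK_B")
print(loads)  # → ['TASK_A', 'TASK_B']
```
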
class aigct.repository.TaskDataCache

Bases: DataCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str)
class aigct.repository.VariantEffectScoreCache

Bases: TaskBasedDataCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str, disable_cache: bool = False)
class aigct.repository.VariantCache

Bases: DataCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str)
class aigct.repository.VariantTaskCache

Bases: DataCache

Caches the variant csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str)
class aigct.repository.VariantEffectSourceCache

Bases: DataCache

Caches a repository csv file in a dataframe. Implements the singleton pattern to ensure there is only one instance of the cached dataframe. We use an _init_once method rather than the normal __init__ method as required by the ParameterizedSingleton class.

_init_once(data_folder_root: str)
class aigct.repository.VariantFilterCache

Bases: aigct.util.ParameterizedSingleton

Classes that wish to behave as thread-safe singletons can inherit from this class. It is intended only for classes whose initialization method takes parameters. The class must implement an _init_once method instead of the normal __init__ method; _init_once takes the same parameters that __init__ would. By inheriting from this class, all instantiations of the subclass return the same instance.

_init_once(data_folder_root: str)
get_data_frames(task_code: str) → dict
class aigct.repository.VariantEffectSourceRepository(session_context: RepoSessionContext, variant_effect_score_repo)
_cache
_variant_effect_score_repo
get_all() → pandas.DataFrame
get_by_task(task_code: str) → pandas.DataFrame
get_by_code(codes: list[str]) → pandas.DataFrame
class aigct.repository.VariantTaskRepository(session_context: RepoSessionContext)
_cache
get_all() → pandas.DataFrame
class aigct.repository.VariantFilterRepository(session_context: RepoSessionContext)
_cache
get_by_task(task_code: str) → dict[str, pandas.DataFrame]
get_by_task_filter_name(task_code: str, filter_name: str) → aigct.model.VariantFilter
get_by_task_filter_names(task_code: str, filter_names: list[str]) → list[aigct.model.VariantFilter]
aigct.repository.query_by_filter(query_df: pandas.DataFrame, filter: pandas.Series, filter_gene_df: pandas.DataFrame, filter_variant_df: pandas.DataFrame) → pandas.DataFrame
aigct.repository.query_by_filters(query_df: pandas.DataFrame, filters: list[aigct.model.VariantFilter]) → pandas.DataFrame
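
The listing does not document how the filter functions work internally. As an illustration only, one plausible way to restrict a query dataframe to the variants listed in a filter dataframe is an inner merge on the variant primary key columns, processed in chunks to bound memory use. The helper name keep_listed_variants, the chunking strategy, the sample data, and the idea that MERGE_CHUNK_SIZE is used this way are assumptions, not the library's confirmed behaviour.

```python
import pandas as pd

# Copied from the VARIANT_PK_COLUMNS value listed above.
VARIANT_PK_COLUMNS = ["GENOME_ASSEMBLY", "CHROMOSOME", "POSITION",
                      "REFERENCE_NUCLEOTIDE", "ALTERNATE_NUCLEOTIDE"]


def keep_listed_variants(query_df: pd.DataFrame,
                         filter_variant_df: pd.DataFrame,
                         chunk_size: int = 500000) -> pd.DataFrame:
    """Hypothetical helper: keep rows whose primary key appears in filter_variant_df."""
    # Deduplicate the keys so the merge cannot multiply rows.
    keys = filter_variant_df[VARIANT_PK_COLUMNS].drop_duplicates()
    parts = []
    for start in range(0, len(query_df), chunk_size):
        chunk = query_df.iloc[start:start + chunk_size]
        # An inner merge keeps only the chunk rows that match a listed key.
        parts.append(chunk.merge(keys, on=VARIANT_PK_COLUMNS, how="inner"))
    if not parts:
        return query_df.iloc[0:0]  # empty input: return an empty frame
    return pd.concat(parts, ignore_index=True)


# Made-up sample rows for illustration.
query_df = pd.DataFrame({
    "GENOME_ASSEMBLY": ["hg38", "hg38", "hg38"],
    "CHROMOSOME": ["1", "1", "2"],
    "POSITION": [100, 200, 300],
    "REFERENCE_NUCLEOTIDE": ["A", "C", "G"],
    "ALTERNATE_NUCLEOTIDE": ["T", "G", "A"],
    "RAW_SCORE": [0.1, 0.5, 0.9],
})
filter_variant_df = query_df.loc[[0, 2], VARIANT_PK_COLUMNS]

result = keep_listed_variants(query_df, filter_variant_df, chunk_size=2)
print(result["POSITION"].tolist())  # → [100, 300]
```
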
class aigct.repository.VariantRepository(session_context: RepoSessionContext)
_cache
get_all() → pandas.DataFrame
get(qry: aigct.model.VEQueryCriteria) → pandas.DataFrame

Fetches variants. The optional parameters are filter criteria used to limit the set of variants returned.

class aigct.repository.VariantEffectLabelRepository(session_context: RepoSessionContext, variant_task_repo: VariantTaskRepository, variant_repo: VariantRepository, filter_repo: VariantFilterRepository)
_cache
_task_repo
_filter_repo
_variant_repo
get_all_by_task(task_code: str) → pandas.DataFrame
get_all_for_all_tasks() → pandas.DataFrame
get(task_code: str, qry: aigct.model.VEQueryCriteria = None) → pandas.DataFrame

Fetches variant effect labels.

class aigct.repository.VariantEffectScoreRepository(session_context: RepoSessionContext, task_repo: VariantTaskRepository, variant_repo: VariantRepository, filter_repo: VariantFilterRepository)
_cache
_task_repo
_filter_repo
_variant_repo
get_all_by_task(task_code: str) → pandas.DataFrame
get_all_by_task_slim(task_code: str) → pandas.DataFrame
get(task_code: str, variant_effect_sources: list[str] | str = None, include_variant_effect_sources: bool = True, qry: aigct.model.VEQueryCriteria = None) → pandas.DataFrame
get_all_for_all_tasks() → pandas.DataFrame