aigct.query
Classes and methods to query the variant repository. This module is an abstraction layer on top of the data access layer in the repository module. It uses the repository module to access the raw data and provides methods that optionally transform the data to make it more meaningful or presentation-ready for the caller.
Classes
VEBenchmarkQueryMgr | Methods to query the variant repository
Functions
cleanup_variant_query_params(params)
Module Contents
- aigct.query.cleanup_variant_query_params(params: aigct.model.VEQueryCriteria)
- class aigct.query.VEBenchmarkQueryMgr(variant_effect_label_repo: aigct.repository.VariantEffectLabelRepository, variant_repo: aigct.repository.VariantRepository, variant_task_repo: aigct.repository.VariantTaskRepository, variant_effect_source_repo: aigct.repository.VariantEffectSourceRepository, variant_effect_score_repo: aigct.repository.VariantEffectScoreRepository, variant_filter_repo: aigct.repository.VariantFilterRepository)
Methods to query the variant repository
- get_variants(qry: aigct.model.VEQueryCriteria) -> pandas.DataFrame
Fetch variants based on query criteria.
Parameters
- qry : VEQueryCriteria
See the description of VEQueryCriteria in the model package. Specifies criteria that limit the set of variants retrieved. The filter_names attribute is ignored.
Returns
DataFrame
- get_variant_effect_source_stats(task_code: str, variant_effect_sources=None, include_variant_effect_sources: bool = True, qry: aigct.model.VEQueryCriteria = None) -> pandas.DataFrame
Get all variant effect sources for a task, along with the number of variants, positive labels, negative labels, and genes for each source.
Parameters
- task_code : str
- variant_effect_sources : list, optional
If specified, restricts the results based on the system-supplied VEPs in this list.
- include_variant_effect_sources : bool, optional
If variant_effect_sources is specified, indicates whether to limit the results to the sources in variant_effect_sources or to the sources not in it.
- qry : VEQueryCriteria, optional
See the description of VEQueryCriteria in the model package. Specifies criteria that limit the set of variants retrieved.
Returns
DataFrame
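The interaction between variant_effect_sources and include_variant_effect_sources can be pictured with a small stand-alone sketch. It does not call aigct. The helper name select_sources and the sample source names are hypothetical, and it assumes that True keeps the listed sources while False keeps all the others, which is what the parameter description suggests.

```python
def select_sources(all_sources, variant_effect_sources=None,
                   include_variant_effect_sources=True):
    """Hypothetical helper mirroring the documented include/exclude semantics."""
    # No list supplied: nothing to restrict, so every source is kept.
    if variant_effect_sources is None:
        return list(all_sources)
    wanted = set(variant_effect_sources)
    if include_variant_effect_sources:
        # Keep only the sources named in the list.
        return [s for s in all_sources if s in wanted]
    # Assumed meaning of False: keep only the sources NOT named in the list.
    return [s for s in all_sources if s not in wanted]


# Made-up source names, for illustration only.
sources = ["SRC_A", "SRC_B", "SRC_C"]
included = select_sources(sources, ["SRC_A", "SRC_C"])
excluded = select_sources(sources, ["SRC_A", "SRC_C"],
                          include_variant_effect_sources=False)
```

The same pair of parameters appears on get_variant_effect_scores, so the same reading would presumably apply there.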
- get_all_task_variant_effect_label_stats() -> pandas.DataFrame
Returns one row per task with the number of variants, positive labels, negative labels, and genes.
Returns
DataFrame
- get_variant_effect_scores(task_code: str, variant_effect_sources=None, include_variant_effect_sources: bool = True, qry: aigct.model.VEQueryCriteria = None) -> pandas.DataFrame
Fetches variant effect scores for variant effect sources.
Parameters
- task_code : str
Task code
- variant_effect_sources : list, optional
If specified, restricts the results based on the system-supplied VEPs in this list.
- include_variant_effect_sources : bool, optional
If variant_effect_sources is specified, indicates whether to limit the results to the sources in variant_effect_sources or to the sources not in it.
- qry : VEQueryCriteria, optional
See the description of VEQueryCriteria in the model package. Specifies criteria that limit the set of variants retrieved.
Returns
DataFrame
- get_variants_by_task(task_code: str, qry: aigct.model.VEQueryCriteria = None) -> pandas.DataFrame
Fetches variants by task. The optional parameters are filter criteria used to limit the set of variants returned.
Parameters
- task_code : str
- qry : VEQueryCriteria, optional
See the description of VEQueryCriteria in the model package. Specifies criteria that limit the set of variants retrieved.
Returns
DataFrame
- get_variant_distribution(task_code: str, by: str = 'gene', qry: aigct.model.VEQueryCriteria = None) -> pandas.DataFrame
Fetches the distribution of variants by gene or chromosome. For each gene or chromosome, lists the number of variants for which labels exist, along with the counts of positive and negative labels.
Parameters
- task_code : str
Task code
- by : str
Either 'gene' or 'chromosome'. Specifies the type of distribution to return.
- qry : VEQueryCriteria, optional
See the description of VEQueryCriteria in the model package. Specifies criteria that limit the set of variants retrieved.
Returns
DataFrame
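As a rough illustration of the kind of aggregation described above, the following stand-alone sketch counts labeled variants per gene or per chromosome. It is not the library's implementation: the function name variant_distribution, the record field names, and the use of 1/0 as positive/negative labels are all assumptions, and the real method returns a pandas DataFrame whose columns are not specified here.

```python
from collections import defaultdict


def variant_distribution(labeled_variants, by="gene"):
    """Hypothetical sketch: count variants and label polarity per group."""
    if by not in ("gene", "chromosome"):
        raise ValueError("by must be 'gene' or 'chromosome'")
    counts = defaultdict(lambda: {"variants": 0, "positive": 0, "negative": 0})
    for variant in labeled_variants:
        row = counts[variant[by]]
        row["variants"] += 1
        # Assumption: label 1 is positive, anything else is negative.
        if variant["label"] == 1:
            row["positive"] += 1
        else:
            row["negative"] += 1
    return dict(counts)


# Made-up records, for illustration only.
sample = [
    {"gene": "GENE1", "chromosome": "1", "label": 1},
    {"gene": "GENE1", "chromosome": "1", "label": 0},
    {"gene": "GENE2", "chromosome": "1", "label": 1},
]
by_gene = variant_distribution(sample, by="gene")
by_chrom = variant_distribution(sample, by="chromosome")
```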
- get_variant_filter(task_code: str, filter_name: str) -> aigct.model.VariantFilter
Return a variant filter for a task by name.
Returns
- VariantFilter
Object containing the list of genes and variant IDs included in the filter. See the description of VariantFilter in the model package.
- get_all_variant_filters(task_code: str) -> dict[str, pandas.DataFrame]
Return basic descriptive information about all variant filters for a task.
Returns
- dict[str, pd.DataFrame]
A dictionary of three data frames with the following keys:
- filter_df: data frame of filters containing CODE, NAME, DESCRIPTION, etc.
- filter_gene_df: data frame of genes associated with each filter
- filter_variant_df: data frame of variants associated with each filter
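To show how the three tables relate, here is a stand-alone sketch that uses lists of row dictionaries in place of data frames. Only the dictionary keys and the CODE, NAME, and DESCRIPTION columns come from the description above. The FILTER_CODE and GENE_SYMBOL columns, the helper genes_by_filter, and all sample values are hypothetical.

```python
# Stand-in for the returned dictionary; each "table" is a list of row dicts
# rather than a pandas DataFrame.
filters = {
    "filter_df": [
        {"CODE": "F1", "NAME": "Filter one", "DESCRIPTION": "Example filter"},
    ],
    "filter_gene_df": [
        # Hypothetical column names linking genes back to a filter.
        {"FILTER_CODE": "F1", "GENE_SYMBOL": "GENE1"},
        {"FILTER_CODE": "F1", "GENE_SYMBOL": "GENE2"},
    ],
    "filter_variant_df": [],
}


def genes_by_filter(filter_tables):
    """Group gene symbols under the code of the filter they belong to."""
    # Start with every known filter so filters without genes still appear.
    genes = {row["CODE"]: [] for row in filter_tables["filter_df"]}
    for row in filter_tables["filter_gene_df"]:
        genes.setdefault(row["FILTER_CODE"], []).append(row["GENE_SYMBOL"])
    return genes


result = genes_by_filter(filters)
```

With real data frames, the same grouping would more naturally be done with a pandas groupby on whatever column links genes to filters.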