dae.enrichment_tool package

Subpackages

Submodules

dae.enrichment_tool.base_enrichment_background module

class dae.enrichment_tool.base_enrichment_background.BaseEnrichmentBackground(resource: GenomicResource)[source]

Bases: ABC

Abstract base class for enrichment background models.

property background_id: str
property background_type: str
abstract calc_enrichment_test(events_counts: EventCountersResult, overlapped_counts: EventCountersResult, gene_set: Iterable[str], **kwargs: Any) EnrichmentResult[source]

Calculate the enrichment test.

abstract is_loaded() bool[source]

Check if the background data is loaded.

abstract load() None[source]

Load the background data.

abstract property name: str

Get the background name.

property resource_id: str
class dae.enrichment_tool.base_enrichment_background.BaseEnrichmentResourceBackground(resource: GenomicResource)[source]

Bases: BaseEnrichmentBackground, ResourceConfigValidationMixin

Base class for enrichment background models backed by a genomic resource.

property filename: str
static get_schema() dict[str, Any][source]

Return schema to be used for config validation.

property name: str

Get the background name.

dae.enrichment_tool.build_coding_length_enrichment_background module

dae.enrichment_tool.build_coding_length_enrichment_background.build_coding_length_background(gene_models: GeneModels) DataFrame[source]

Build coding length enrichment background data.

dae.enrichment_tool.build_coding_length_enrichment_background.cli(argv: list[str] | None = None) None[source]

Command line tool to create coding length enrichment background.

dae.enrichment_tool.build_ur_synonymous_enrichment_background module

dae.enrichment_tool.build_ur_synonymous_enrichment_background.cli(argv: list[str] | None = None, gpf_instance: GPFInstance | None = None) None[source]

Command line tool to create UR synonymous enrichment background.

dae.enrichment_tool.enrichment_builder module

class dae.enrichment_tool.enrichment_builder.BaseEnrichmentBuilder(enrichment_helper: EnrichmentHelper, dataset: GenotypeData)[source]

Bases: object

Base class for enrichment builders.

abstract build(gene_syms: Iterable[str], background_id: str | None, counting_id: str | None) list[dict[str, Any]][source]
class dae.enrichment_tool.enrichment_builder.EnrichmentBuilder(enrichment_helper: EnrichmentHelper, dataset: GenotypeData)[source]

Bases: BaseEnrichmentBuilder

Builds enrichment tool test results for a dataset.

build(gene_syms: Iterable[str], background_id: str | None, counting_id: str | None) list[dict[str, Any]][source]
build_results(gene_syms: Iterable[str], background_id: str | None, counting_id: str | None) list[dict[str, Any]][source]

Build and return a list of enrichment results.

Returns:

A list of dictionaries representing the enrichment results.

dae.enrichment_tool.enrichment_cache_builder module

dae.enrichment_tool.enrichment_cache_builder.cli(argv: list[str] | None = None, gpf_instance: GPFInstance | None = None) None[source]

Generate enrichment tool cache.

dae.enrichment_tool.enrichment_helper module

class dae.enrichment_tool.enrichment_helper.EnrichmentHelper(grr: GenomicResourceRepo)[source]

Bases: object

Helper class for creating the enrichment tool for a genotype data object.

build_enrichment_event_counts_cache(study: GenotypeData, psc_id: str) None[source]

Build the enrichment event counts cache for a genotype data object.

calc_enrichment_test(study: GenotypeData, psc_id: str, gene_syms: Iterable[str], effect_groups: Iterable[str] | Iterable[Iterable[str]], background_id: str | None = None, counter_id: str | None = None) dict[str, dict[str, EnrichmentResult]][source]

Perform the enrichment test for a genotype data object.

collect_genotype_data_backgrounds(genotype_data: GenotypeData) list[BaseEnrichmentBackground][source]

Collect the enrichment backgrounds configured for a genotype data object.

create_background(background_id: str) BaseEnrichmentBackground[source]

Construct and return an enrichment background.

create_counter(counter_id: str) CounterBase[source]

Create counter for a genotype data.

static get_default_background_model(genotype_data: GenotypeData) str[source]

Return default background model field from the enrichment config. If it is missing, default to the first selected background model.

static get_default_counting_model(genotype_data: GenotypeData) str[source]
static get_enrichment_config(genotype_data: GenotypeData) Box | None[source]
static get_selected_counting_models(genotype_data: GenotypeData) list[str][source]

Return selected counting models field from the enrichment config. If it is missing, default to the counting field.

static get_selected_person_set_collections(genotype_data: GenotypeData)[source]

Return selected person set collections field from the enrichment config. If it is missing, default to the first available person set collection in the provided study.

static has_enrichment_config(genotype_data: GenotypeData) bool[source]

dae.enrichment_tool.enrichment_serializer module

class dae.enrichment_tool.enrichment_serializer.EnrichmentSerializer(enrichment_config: dict[str, Any], results: list[dict[str, Any]])[source]

Bases: EffectTypesMixin

Serializer for enrichment tool results.

serialize() list[dict][source]
serialize_all(grouping_results: dict[str, Any], effect_type: str, result: EnrichmentSingleResult) dict[str, Any][source]
serialize_common_filter(grouping_results: dict[str, dict[str, EnrichmentSingleResult]], effect_type: str, _result: EnrichmentSingleResult, gender: list[str] | None = None) dict[str, Any][source]

Serialize common filter.

serialize_enrichment_result(result: EnrichmentSingleResult) dict[str, Any][source]

Serialize enrichment result.

serialize_female(grouping_results: dict[str, Any], effect_type: str, result: EnrichmentSingleResult) dict[str, Any][source]
serialize_male(grouping_results: dict[str, Any], effect_type: str, result: EnrichmentSingleResult) dict[str, Any][source]
serialize_overlap_filter(grouping_results: dict[str, Any], effect_type: str, result: EnrichmentSingleResult, *, gender: list[str] | None = None, overlapped_genes: bool = False) dict[str, Any][source]

Serialize overlapped events filter.

serialize_people_groups(grouping_results: dict[str, Any]) dict[str, Any][source]

Serialize people group results.

serialize_rec(grouping_results: dict[str, Any], effect_type: str, result: EnrichmentSingleResult) dict[str, Any][source]

Serialize recurrent events.

serialize_rec_filter(grouping_results: dict[str, Any], effect_type: str, result: EnrichmentSingleResult, gender: list[str] | None = None) dict[str, Any][source]

Serialize recurrent events filter.

dae.enrichment_tool.event_counters module

class dae.enrichment_tool.event_counters.CounterBase(counter_id: str)[source]

Bases: ABC

Base class for enrichment event counters.

event_counts(variant_events: list[VariantEvent], children_by_sex: ChildrenBySex, effect_types: Iterable[str]) EventCountersResult[source]

Calculate the event counts from the given variant events.

Args:

variant_events (list[VariantEvent]): A list of variant events.

children_by_sex (dict[str, set[tuple[str, str]]]): A dictionary mapping sex to a set of child IDs.

effect_types (Iterable[str]): An iterable of effect types.

Returns:

EventCountersResult: An object containing the event counters.

abstract events(variant_events: list[VariantEvent], children_by_sex: ChildrenBySex, effect_types: Iterable[str]) EventsResult[source]
select_events_in_person_set(variant_events: list[VariantEvent], persons: set[tuple[str, str]]) list[VariantEvent][source]

Select variant events that occur in the passed persons.
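The selection described above can be illustrated with a minimal sketch. The dataclasses below are stand-ins that mirror the documented fields of `VariantEvent`, `AlleleEvent` and `GeneEffect`; they are not the library's classes. The sketch assumes that each person is identified by a `(family_id, person_id)` tuple and that an event is kept when any of its allele events is carried by at least one of the passed persons. Both points are assumptions, not confirmed behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneEffect:  # stand-in for the documented GeneEffect
    gene: str
    effect: str


@dataclass
class AlleleEvent:  # stand-in for the documented AlleleEvent
    persons: set[tuple[str, str]]
    effect_genes: set[GeneEffect]


@dataclass
class VariantEvent:  # stand-in for the documented VariantEvent
    family_id: str
    fvuid: str
    allele_events: list[AlleleEvent]


def select_events_in_person_set(variant_events, persons):
    """Keep events with at least one allele event carried by `persons`."""
    return [
        event
        for event in variant_events
        if any(allele.persons & persons for allele in event.allele_events)
    ]


events = [
    VariantEvent(
        "f1", "f1.v1",
        [AlleleEvent({("f1", "p1")}, {GeneEffect("CHD8", "missense")})],
    ),
    VariantEvent(
        "f2", "f2.v1",
        [AlleleEvent({("f2", "p3")}, {GeneEffect("SCN2A", "nonsense")})],
    ),
]

# Only the event carried by person ("f1", "p1") is kept.
selected = select_events_in_person_set(events, {("f1", "p1")})
```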

split_events(variant_events: list[VariantEvent], children_by_sex: ChildrenBySex) VariantEventsResult[source]

Split the passed variant events based on the children’s sex.

Args:

variant_events (list[VariantEvent]): The list of variant events to be split.

children_by_sex (dict[str, set[tuple[str, str]]]): A dictionary containing children grouped by sex.

Returns:

VariantEventsResult: An object containing the split variant events.

class dae.enrichment_tool.event_counters.EnrichmentResult(all: EnrichmentSingleResult, rec: EnrichmentSingleResult, male: EnrichmentSingleResult, female: EnrichmentSingleResult, unspecified: EnrichmentSingleResult, rec_genes: set[str] | None = None)[source]

Bases: object

Represents the result of an enrichment test.

all: EnrichmentSingleResult
female: EnrichmentSingleResult
male: EnrichmentSingleResult
rec: EnrichmentSingleResult
rec_genes: set[str] | None = None
unspecified: EnrichmentSingleResult
class dae.enrichment_tool.event_counters.EnrichmentSingleResult(name: str, events: int, overlapped: int, expected: float, pvalue: float, overlapped_genes: set[str] | None = None)[source]

Bases: object

Represents the result of a single enrichment tool calculation.

Supported fields are:

name

events – number of events found

overlapped – number of overlapped events

expected – number of expected events

pvalue

overlapped_genes – set of overlapped gene symbols (optional)

class dae.enrichment_tool.event_counters.EventCountersResult(all: int, rec: int, male: int, female: int, unspecified: int, rec_genes: set[str] | None = None)[source]

Bases: object

Represents the result of event counting.

all: int
female: int
static from_events_result(events: EventsResult) EventCountersResult[source]
male: int
rec: int
rec_genes: set[str] | None = None
unspecified: int
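The relationship between `EventsResult` (lists of events) and `EventCountersResult` (integer counts) can be sketched as follows. The dataclasses are stand-ins that copy the documented fields, and the helper name `counts_from_events` is hypothetical. The sketch assumes that each counter is simply the length of the corresponding event list and leaves `rec_genes` unset; the actual `from_events_result` may do more.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EventsResult:  # stand-in mirroring the documented fields
    all: list[list[str]]
    rec: list[list[str]]
    male: list[list[str]]
    female: list[list[str]]
    unspecified: list[list[str]]


@dataclass
class EventCountersResult:  # stand-in mirroring the documented fields
    all: int
    rec: int
    male: int
    female: int
    unspecified: int
    rec_genes: set[str] | None = None


def counts_from_events(events: EventsResult) -> EventCountersResult:
    """Hypothetical helper: count the events in each bucket."""
    return EventCountersResult(
        all=len(events.all),
        rec=len(events.rec),
        male=len(events.male),
        female=len(events.female),
        unspecified=len(events.unspecified),
    )


events = EventsResult(
    all=[["CHD8"], ["SCN2A"], ["CHD8"]],
    rec=[["CHD8"]],
    male=[["CHD8"], ["SCN2A"]],
    female=[["CHD8"]],
    unspecified=[],
)
counts = counts_from_events(events)
```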
class dae.enrichment_tool.event_counters.EventsCounter[source]

Bases: CounterBase

Events counter class.

events(variant_events: list[VariantEvent], children_by_sex: ChildrenBySex, effect_types: Iterable[str]) EventsResult[source]
class dae.enrichment_tool.event_counters.EventsResult(all: 'list[list[str]]', rec: 'list[list[str]]', male: 'list[list[str]]', female: 'list[list[str]]', unspecified: 'list[list[str]]')[source]

Bases: object

all: list[list[str]]
female: list[list[str]]
male: list[list[str]]
rec: list[list[str]]
unspecified: list[list[str]]
class dae.enrichment_tool.event_counters.GeneEventsCounter[source]

Bases: CounterBase

Counts events in genes.

events(variant_events: list[VariantEvent], children_by_sex: ChildrenBySex, effect_types: Iterable[str]) EventsResult[source]

Count the events by sex and effect type.

class dae.enrichment_tool.event_counters.VariantEventsResult(all: 'list[VariantEvent]', rec: 'list[VariantEvent]', male: 'list[VariantEvent]', female: 'list[VariantEvent]', unspecified: 'list[VariantEvent]')[source]

Bases: object

all: list[VariantEvent]
female: list[VariantEvent]
male: list[VariantEvent]
rec: list[VariantEvent]
unspecified: list[VariantEvent]
dae.enrichment_tool.event_counters.filter_denovo_one_event_per_family(variant_events: list[VariantEvent], requested_effect_types: Iterable[str]) list[list[str]][source]

For each variant event, return the list of affected gene symbols.

This function receives the variant events and transforms each one into the list of gene symbols affected by it.

The result is represented as a list of lists.

dae.enrichment_tool.event_counters.filter_denovo_one_gene_per_events(variant_events: list[VariantEvent], requested_effect_types: Iterable[str]) list[list[str]][source]
dae.enrichment_tool.event_counters.filter_denovo_one_gene_per_recurrent_events(variant_events: list[VariantEvent], requsted_effect_types: Iterable[str]) list[list[str]][source]

Collect only events that occur in more than one family.
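A minimal sketch of the recurrence idea, assuming that "recurrent" means a gene is hit by an event of a requested effect type in more than one family. The dataclasses are stand-ins mirroring the documented `VariantEvent`, `AlleleEvent` and `GeneEffect` fields, and the function name `recurrent_gene_events` is hypothetical. The output shape (one single-gene list per recurrent gene, sorted by gene symbol) is also an assumption chosen to match the documented `list[list[str]]` return type.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneEffect:  # stand-in for the documented GeneEffect
    gene: str
    effect: str


@dataclass
class AlleleEvent:  # stand-in for the documented AlleleEvent
    persons: set[tuple[str, str]]
    effect_genes: set[GeneEffect]


@dataclass
class VariantEvent:  # stand-in for the documented VariantEvent
    family_id: str
    fvuid: str
    allele_events: list[AlleleEvent]


def recurrent_gene_events(variant_events, requested_effect_types):
    """Hypothetical helper: genes hit in more than one family."""
    requested = set(requested_effect_types)
    families_by_gene = defaultdict(set)
    for event in variant_events:
        for allele in event.allele_events:
            for gene_effect in allele.effect_genes:
                if gene_effect.effect in requested:
                    families_by_gene[gene_effect.gene].add(event.family_id)
    return [
        [gene]
        for gene, families in sorted(families_by_gene.items())
        if len(families) > 1
    ]


def make_event(family_id, gene, effect):
    """Build a one-allele event for the example below."""
    person = (family_id, "p1")
    allele = AlleleEvent({person}, {GeneEffect(gene, effect)})
    return VariantEvent(family_id, f"{family_id}.{gene}", [allele])


events = [
    make_event("f1", "CHD8", "nonsense"),
    make_event("f2", "CHD8", "nonsense"),
    make_event("f1", "SCN2A", "nonsense"),
    make_event("f3", "POGZ", "synonymous"),
]
recurrent = recurrent_gene_events(events, ["nonsense"])
```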

dae.enrichment_tool.event_counters.filter_overlapping_events(events: list[list[str]], gene_syms: list[str]) list[list[str]][source]
dae.enrichment_tool.event_counters.get_sym_2_fn(variant_events: list[VariantEvent], requested_effect_types: Iterable[str]) dict[str, int][source]

Count the events of the requested effect types in each gene.

dae.enrichment_tool.event_counters.overlap_enrichment_result_dict(events_counts: EventsResult, gene_syms: Iterable[str]) EventsResult[source]

Calculate the overlap between all events and requested gene syms.
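The overlap step can be sketched under the assumption that an event, represented as a list of gene symbols, overlaps the requested gene set when it shares at least one symbol with it. The function bodies below are illustrative stand-ins, not the library's implementation, and the plain dictionary of buckets is a hypothetical simplification of `EventsResult`.

```python
def filter_overlapping_events(events, gene_syms):
    """Keep events that touch at least one requested gene symbol."""
    genes = set(gene_syms)
    return [event for event in events if genes.intersection(event)]


def overlap_buckets(events_by_bucket, gene_syms):
    """Apply the overlap filter to every bucket (all, male, female, ...)."""
    return {
        bucket: filter_overlapping_events(events, gene_syms)
        for bucket, events in events_by_bucket.items()
    }


events_by_bucket = {
    "all": [["CHD8"], ["SCN2A", "TTN"], ["POGZ"]],
    "male": [["CHD8"]],
    "female": [["SCN2A", "TTN"], ["POGZ"]],
}
overlapped = overlap_buckets(events_by_bucket, ["CHD8", "TTN"])
```

Counting the lists in each overlapped bucket would then give the kind of integer summary that `overlap_event_counts` is documented to return.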

dae.enrichment_tool.event_counters.overlap_event_counts(events_counts: EventsResult, gene_syms: Iterable[str]) EventCountersResult[source]

dae.enrichment_tool.gene_weights_background module

class dae.enrichment_tool.gene_weights_background.GeneScoreEnrichmentBackground(resource: GenomicResource)[source]

Bases: BaseEnrichmentBackground

Provides class for gene score enrichment background model.

property background_id: str
property background_type: str
calc_enrichment_test(events_counts: EventCountersResult, overlapped_counts: EventCountersResult, gene_set: Iterable[str], **kwargs: Any) EnrichmentResult[source]

Calculate enrichment statistics.

static calc_expected_observed_pvalue(events_prob: float, events_count: int, observed: int) tuple[float, float][source]

Calculate expected event count and binomtest p-value.
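The calculation can be illustrated with a dependency-free sketch. It assumes that the expected count is `events_prob * events_count` and that the p-value is a two-sided exact binomial test in the style of `scipy.stats.binomtest`, where the probabilities of all outcomes no more likely than the observed one are summed. The function name `expected_and_pvalue` is hypothetical, and the sketch uses only the standard library rather than SciPy.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_and_pvalue(
    events_prob: float, events_count: int, observed: int,
) -> tuple[float, float]:
    """Hypothetical helper: expected count and two-sided binomial p-value."""
    expected = events_prob * events_count
    p_observed = binom_pmf(observed, events_count, events_prob)
    # Small relative tolerance so that ties are not lost to rounding.
    threshold = p_observed * (1 + 1e-7)
    pvalue = sum(
        pmf
        for k in range(events_count + 1)
        if (pmf := binom_pmf(k, events_count, events_prob)) <= threshold
    )
    return expected, min(pvalue, 1.0)


# 10 events, each falling in the gene set with probability 0.5,
# and all 10 observed inside the gene set.
expected, pvalue = expected_and_pvalue(0.5, 10, 10)
```

In this example the expected count is 5.0 and only the two extreme outcomes (0 and 10) are as unlikely as the observation, so the p-value is 2/1024.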

property filename: str
genes_prob(genes: Iterable[str]) float[source]
genes_weight(genes: Iterable[str]) float[source]
is_loaded() bool[source]

Check if the background data is loaded.

load() None[source]

Load enrichment background model.

property name: str

Get the background name.

property resource_id: str
property score_id: str
class dae.enrichment_tool.gene_weights_background.GeneWeightsEnrichmentBackground(resource: GenomicResource)[source]

Bases: BaseEnrichmentResourceBackground

Provides class for gene weights enrichment background model.

calc_enrichment_test(events_counts: EventCountersResult, overlapped_counts: EventCountersResult, gene_set: Iterable[str], **kwargs: Any) EnrichmentResult[source]

Calculate enrichment statistics.

static calc_expected_observed_pvalue(events_prob: float, events_count: int, observed: int) tuple[float, float][source]

Calculate expected event count and binomtest p-value.

genes_prob(genes: Iterable[str]) float[source]
genes_weight(genes: Iterable[str]) float[source]
is_loaded() bool[source]

Check if the background data is loaded.

load() None[source]

Load enrichment background model.

dae.enrichment_tool.genotype_helper module

class dae.enrichment_tool.genotype_helper.AlleleEvent(persons: set[tuple[str, str]], effect_genes: set[dae.enrichment_tool.genotype_helper.GeneEffect])[source]

Bases: object

effect_genes: set[GeneEffect]
persons: set[tuple[str, str]]
class dae.enrichment_tool.genotype_helper.GeneEffect(gene: str, effect: str)[source]

Bases: object

effect: str
gene: str
class dae.enrichment_tool.genotype_helper.GenotypeHelper(genotype_data: GenotypeData, person_set_collection: PersonSetCollection, effect_types: list[str] | None = None, genes: list[str] | None = None)[source]

Bases: object

Genotype helper for enrichment tools.

static collect_denovo_events(denovo_variants: Iterable[FamilyVariant]) list[VariantEvent][source]

Collect denovo events.

get_denovo_events() list[VariantEvent][source]
class dae.enrichment_tool.genotype_helper.VariantEvent(family_id: str, fvuid: str, allele_events: list[dae.enrichment_tool.genotype_helper.AlleleEvent])[source]

Bases: object

allele_events: list[AlleleEvent]
family_id: str
fvuid: str

dae.enrichment_tool.samocha_background module

class dae.enrichment_tool.samocha_background.SamochaEnrichmentBackground(resource: GenomicResource)[source]

Bases: BaseEnrichmentResourceBackground

Represents Samocha’s enrichment background model.

calc_enrichment_test(events_counts: EventCountersResult, overlapped_counts: EventCountersResult, gene_set: Iterable[str], **kwargs: Any) EnrichmentResult[source]

Calculate enrichment statistics.

static get_schema() dict[str, Any][source]

Return schema to be used for config validation.

is_loaded() bool[source]

Check if the background data is loaded.

load() None[source]

Load enrichment background model.

property name: str

Get the background name.

dae.enrichment_tool.samocha_background.poisson_test(observed: float, expected: float) float[source]

Perform Poisson test.

Bernard Rosner, Fundamentals of Biostatistics, 8th edition, pp 260-261
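A minimal sketch of a two-sided exact Poisson test in the form usually attributed to the cited textbook: when the observed count is below the expected value, the lower tail is doubled; otherwise the upper tail is doubled, and the result is capped at 1. Whether the library follows exactly this formulation is an assumption. The sketch takes an integer observed count and uses only the standard library.

```python
import math


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu); returns 0 for negative k."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i  # P(X = i) from P(X = i - 1)
        total += term
    return total


def poisson_test(observed: int, expected: float) -> float:
    """Two-sided exact Poisson p-value (illustrative sketch)."""
    if observed >= expected:
        # Upper tail: P(X >= observed), doubled.
        pvalue = 2 * (1 - poisson_cdf(observed - 1, expected))
    else:
        # Lower tail: P(X <= observed), doubled.
        pvalue = 2 * poisson_cdf(observed, expected)
    return min(max(pvalue, 0.0), 1.0)


p_low = poisson_test(0, 1.0)   # fewer events than expected
p_high = poisson_test(3, 1.0)  # more events than expected
```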

Module contents