gpf.gene_sets package

Submodules

gpf.gene_sets.denovo_gene_set_collection module

class gpf.gene_sets.denovo_gene_set_collection.DenovoGeneSetCollection(study_id: str, study_name: str, dgsc_config: DenovoGeneSetsConfig, pscs: dict[str, PersonSetCollection], *, cache: dict[str, Any] | None = None, gene_sets_types_legend: dict[str, Any] | None = None)[source]

Bases: object

Class representing a study’s denovo gene sets.

add_gene(gene_effects: list[tuple[str, str]], persons: list[Person]) None[source]

Add a gene to the cache.

static build_collection(genotype_data: GenotypeData) DenovoGeneSetCollection | None[source]

Generate a denovo gene set collection for a study.

static create_empty_collection(study: GenotypeData) DenovoGeneSetCollection | None[source]

Create an empty denovo gene set collection for the given genotype data.

classmethod get_all_gene_sets(denovo_gene_sets: list[DenovoGeneSetCollection], denovo_gene_set_spec: dict[str, dict[str, list[str]]]) list[dict[str, Any]][source]

Return all gene sets from provided denovo gene set collections.

get_gene_set(dgsc_query: str | DGSCQuery) GeneSet | None[source]

Return a gene set from the collection.

classmethod get_gene_set_from_collections(gene_set_id: str, denovo_gene_set_collections: list[DenovoGeneSetCollection], denovo_gene_set_spec: dict[str, dict[str, list[str]]]) dict[str, Any] | None[source]

Return a single set from provided denovo gene set collections.

get_gene_sets_types_legend() dict[str, Any][source]

Return a dict with legends for each collection.

get_person_set_collection_legend(psc_id: str) list[dict[str, Any]][source]

Return the domain (used as a legend) of a person set collection.

is_cached(cache_dir: str) bool[source]

Check if all the cache files exist.

load(cache_dir: str) None[source]

Load a cached denovo gene set collection from cache files.

save(cache_dir: str) None[source]

Save the denovo gene set collection to cache files.
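
The is_cached, load and save methods suggest a "load if cached, otherwise build and save" workflow. The sketch below illustrates that pattern with a hypothetical stand-in class. ToyCollection, load_or_build, the JSON format and the one-file-per-study layout are all assumptions made for illustration and are not taken from the gpf implementation.

```python
import json
import tempfile
from pathlib import Path


class ToyCollection:
    """Hypothetical stand-in, NOT the real DenovoGeneSetCollection."""

    def __init__(self, study_id: str) -> None:
        self.study_id = study_id
        self.cache: dict[str, list[str]] = {}

    def _cache_file(self, cache_dir: str) -> Path:
        # Assumption: one JSON file per study; the real layout may differ.
        return Path(cache_dir) / f"{self.study_id}.json"

    def is_cached(self, cache_dir: str) -> bool:
        return self._cache_file(cache_dir).exists()

    def save(self, cache_dir: str) -> None:
        Path(cache_dir).mkdir(parents=True, exist_ok=True)
        self._cache_file(cache_dir).write_text(json.dumps(self.cache))

    def load(self, cache_dir: str) -> None:
        self.cache = json.loads(self._cache_file(cache_dir).read_text())


def load_or_build(collection: ToyCollection, cache_dir: str) -> str:
    """Load from the cache when possible; otherwise build and save."""
    if collection.is_cached(cache_dir):
        collection.load(cache_dir)
        return "loaded"
    # Placeholder for the expensive build step (made-up data).
    collection.cache = {"LGDs": ["GENE1", "GENE2"]}
    collection.save(cache_dir)
    return "built"


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_build(ToyCollection("study_a"), tmp)
    second = load_or_build(ToyCollection("study_a"), tmp)
    print(first, second)
```

The first call finds no cache and builds the collection; the second call reuses the file written by the first.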

gpf.gene_sets.denovo_gene_set_helpers module

class gpf.gene_sets.denovo_gene_set_helpers.DenovoGeneSetHelpers[source]

Bases: object

Helper functions for creation of denovo gene sets.

classmethod build_collection(study: GenotypeData, *, force: bool = False) DenovoGeneSetCollection | None[source]

Build a denovo gene set collection for a study and save it.

classmethod load_collection(study: GenotypeData) DenovoGeneSetCollection | None[source]

Load a denovo gene set collection for a given study.

classmethod load_collection_from_dict(study: GenotypeData, cache: dict) DenovoGeneSetCollection | None[source]

Load a denovo gene set collection for a given study from a cache dictionary.

gpf.gene_sets.denovo_gene_sets_config module

class gpf.gene_sets.denovo_gene_sets_config.DGSCQuery(*, gene_set_id: str, psc_id: str, selected_person_sets: set[str], effects: list[EffectsCriteria], sex: list[SexesCriteria], recurrency: SingleCriteria | RecurrentCriteria | TripleCriteria | None)[source]

Bases: BaseModel

Query for de novo gene set collection.

effects: list[EffectsCriteria]
gene_set_id: str
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'effects': FieldInfo(annotation=list[EffectsCriteria], required=True), 'gene_set_id': FieldInfo(annotation=str, required=True), 'psc_id': FieldInfo(annotation=str, required=True), 'recurrency': FieldInfo(annotation=Union[SingleCriteria, RecurrentCriteria, TripleCriteria, NoneType], required=True), 'selected_person_sets': FieldInfo(annotation=set[str], required=True), 'sex': FieldInfo(annotation=list[SexesCriteria], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

psc_id: str
recurrency: RecurrencyCriteria | None
selected_person_sets: set[str]
sex: list[SexesCriteria]
class gpf.gene_sets.denovo_gene_sets_config.DGSSpec(*, gene_set_id: str, criterias: dict[str, EffectsCriteria | SexesCriteria | SingleCriteria | RecurrentCriteria | TripleCriteria])[source]

Bases: BaseModel

De novo gene set specification.

criterias: dict[str, EffectsCriteria | SexesCriteria | RecurrencyCriteria]
gene_set_id: str
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'criterias': FieldInfo(annotation=dict[str, Union[EffectsCriteria, SexesCriteria, SingleCriteria, RecurrentCriteria, TripleCriteria]], required=True), 'gene_set_id': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class gpf.gene_sets.denovo_gene_sets_config.DenovoGeneSetsConfig(*, enabled: bool, selected_person_set_collections: list[str], effect_types: dict[str, EffectsCriteria], sexes: dict[str, SexesCriteria], recurrency: dict[str, SingleCriteria | RecurrentCriteria | TripleCriteria], gene_sets_ids: list[str])[source]

Bases: BaseModel

Configuration for de novo gene sets.

effect_types: dict[str, EffectsCriteria]
enabled: bool
gene_sets_ids: list[str]
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'effect_types': FieldInfo(annotation=dict[str, EffectsCriteria], required=True), 'enabled': FieldInfo(annotation=bool, required=True), 'gene_sets_ids': FieldInfo(annotation=list[str], required=True), 'recurrency': FieldInfo(annotation=dict[str, Union[SingleCriteria, RecurrentCriteria, TripleCriteria]], required=True), 'selected_person_set_collections': FieldInfo(annotation=list[str], required=True), 'sexes': FieldInfo(annotation=dict[str, SexesCriteria], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

recurrency: dict[str, RecurrencyCriteria]
selected_person_set_collections: list[str]
sexes: dict[str, SexesCriteria]
class gpf.gene_sets.denovo_gene_sets_config.EffectsCriteria(*, name: str, effects: Annotated[list[str], AfterValidator(func=_validate_effect_types)])[source]

Bases: BaseModel

Criteria for filtering effect types.

effects: EffectTypes
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'effects': FieldInfo(annotation=list[str], required=True, metadata=[AfterValidator(func=<function _validate_effect_types>)]), 'name': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: str
class gpf.gene_sets.denovo_gene_sets_config.RecurrentCriteria(*, name: Literal['Recurrent'], start: Literal[2], end: Literal[-1])[source]

Bases: BaseModel

Recurrent recurrency criteria.

end: Literal[-1]
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'end': FieldInfo(annotation=Literal[-1], required=True), 'name': FieldInfo(annotation=Literal['Recurrent'], required=True), 'start': FieldInfo(annotation=Literal[2], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: Literal['Recurrent']
start: Literal[2]
class gpf.gene_sets.denovo_gene_sets_config.SexesCriteria(*, name: str, sexes: list[Sex])[source]

Bases: BaseModel

Criteria for filtering sexes.

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'name': FieldInfo(annotation=str, required=True), 'sexes': FieldInfo(annotation=list[Sex], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: str
sexes: list[Sex]
class gpf.gene_sets.denovo_gene_sets_config.SingleCriteria(*, name: Literal['Single'], start: Literal[1], end: Literal[2])[source]

Bases: BaseModel

Single recurrency criteria.

end: Literal[2]
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'end': FieldInfo(annotation=Literal[2], required=True), 'name': FieldInfo(annotation=Literal['Single'], required=True), 'start': FieldInfo(annotation=Literal[1], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: Literal['Single']
start: Literal[1]
class gpf.gene_sets.denovo_gene_sets_config.TripleCriteria(*, name: Literal['Triple'], start: Literal[3], end: Literal[-1])[source]

Bases: BaseModel

Triple recurrency criteria.

end: Literal[-1]
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'end': FieldInfo(annotation=Literal[-1], required=True), 'name': FieldInfo(annotation=Literal['Triple'], required=True), 'start': FieldInfo(annotation=Literal[3], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: Literal['Triple']
start: Literal[3]
gpf.gene_sets.denovo_gene_sets_config.create_denovo_gene_set_spec(gene_set_id: str, config: DenovoGeneSetsConfig) DGSSpec[source]

Create de novo gene set specification from name.

gpf.gene_sets.denovo_gene_sets_config.parse_denovo_gene_sets_config(config: dict[str, Any], *, study_person_set_collections: dict[str, Any] | None = None, has_denovo: bool = False) DenovoGeneSetsConfig | None[source]

Parse de novo gene sets configuration.

gpf.gene_sets.denovo_gene_sets_config.parse_denovo_gene_sets_study_config(study_config: dict[str, Any], *, has_denovo: bool = False) DenovoGeneSetsConfig | None[source]

Parse de novo gene sets study configuration.

gpf.gene_sets.denovo_gene_sets_config.parse_dgsc_query(gene_set_spec: str, dgsc_config: DenovoGeneSetsConfig) DGSCQuery[source]

Parse de novo gene set collection query.

gpf.gene_sets.denovo_gene_sets_config.parse_recurrency_criteria(name: str, recurrency_criteria: dict[str, Any]) SingleCriteria | RecurrentCriteria | TripleCriteria[source]

Parse recurrency criteria.
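
The three recurrency models in this module pin their bounds with Literal types: Single is (start=1, end=2), Recurrent is (start=2, end=-1) and Triple is (start=3, end=-1). The sketch below shows one plausible reading of those numbers. The half-open interpretation (start <= count < end) and the treatment of end == -1 as "no upper bound" are assumptions, and count_matches and matching_criteria are hypothetical helpers that are not part of gpf.

```python
# Bounds copied from the Literal annotations documented in this module.
RECURRENCY_BOUNDS = {
    "Single": (1, 2),
    "Recurrent": (2, -1),
    "Triple": (3, -1),
}


def count_matches(name: str, count: int) -> bool:
    """Assumed semantics: start <= count < end; end == -1 means unbounded."""
    start, end = RECURRENCY_BOUNDS[name]
    if count < start:
        return False
    return end == -1 or count < end


def matching_criteria(count: int) -> list[str]:
    """Return the names of all criteria that accept this event count."""
    return [name for name in RECURRENCY_BOUNDS if count_matches(name, count)]


for n in (0, 1, 2, 3):
    print(n, matching_criteria(n))
```

Under this reading a gene with exactly one event is only Single, while a gene with three or more events satisfies both Recurrent and Triple.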

gpf.gene_sets.denovo_gene_sets_db module

class gpf.gene_sets.denovo_gene_sets_db.DenovoGeneSetsDb(gpf_instance: Any)[source]

Bases: object

Class to manage available de novo gene sets.

build_cache(genotype_data_ids: list[str], *, force: bool = False) None[source]

Build the cache of de novo gene sets for the specified genotype data IDs.

property collections_descriptions: list[dict[str, Any]]

Return gene set descriptions.

property denovo_gene_sets_types: list[dict[str, Any]]

Return denovo gene sets types descriptions.

get_all_gene_sets(denovo_gene_set_spec: dict[str, dict[str, list[str]]], collection_id: str = 'denovo') list[dict[str, Any]][source]

Return all de novo gene sets matching the spec for permitted datasets.

get_collection_types_legend(gs_collection_id: str) dict[str, Any][source]
get_gene_set(gene_set_id: str, gene_set_spec: dict[str, dict[str, list[str]]], collection_id: str = 'denovo') dict[str, Any] | None[source]

Return the de novo gene set matching the spec for permitted datasets.

get_gene_set_ids(genotype_data_id: str) list[str][source]
get_genotype_data_ids() set[str][source]

Return the set of genotype data IDs with denovo gene sets.

has_gene_sets() bool[source]
reload() None[source]
update_cache(denovo_gene_sets: dict[str, Any]) None[source]

Load a dictionary of denovo gene sets.
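
get_all_gene_sets and get_gene_set both take a spec typed as dict[str, dict[str, list[str]]]. The sketch below shows one way such a nested mapping can be walked. Reading the levels as genotype data ID, then person set collection ID, then a list of person set IDs is an assumption based on the parameter names in this package; iter_spec and the example IDs are hypothetical.

```python
from collections.abc import Iterator

Spec = dict[str, dict[str, list[str]]]


def iter_spec(spec: Spec) -> Iterator[tuple[str, str, str]]:
    """Flatten a nested spec into (genotype_data_id, psc_id, person_set_id)."""
    for genotype_data_id, collections in spec.items():
        for psc_id, person_set_ids in collections.items():
            for person_set_id in person_set_ids:
                yield genotype_data_id, psc_id, person_set_id


# Made-up IDs, used only to show the shape of the mapping.
example_spec: Spec = {
    "study_a": {"phenotype": ["affected", "unaffected"]},
    "study_b": {"phenotype": ["affected"]},
}

triples = list(iter_spec(example_spec))
for triple in triples:
    print(triple)
```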

gpf.gene_sets.gene_sets_db module

Class for handling a database of gene set collections.

class gpf.gene_sets.gene_sets_db.GeneSetsDb(gene_set_collections: Sequence[BaseGeneSetCollection])[source]

Bases: object

Class that represents a dictionary of gene set collections.

property collections_descriptions: list[dict[str, Any]]

Collect gene set descriptions.

Iterates over the gene set collections and creates a list with a description for each one.

get_all_gene_sets(collection_id: str) list[GeneSet][source]

Return all the gene sets in the specified collection.

get_gene_set(collection_id: str, gene_set_id: str) GeneSet | None[source]

Find and return a gene set in a gene set collection.

get_gene_set_collection_ids() set[str][source]

Return all gene set collection ids.

This includes the ids of collections which have not been loaded.

get_gene_set_ids(collection_id: str) set[str][source]

Return the IDs of all the gene sets in specified collection.

has_gene_set_collection(gsc_id: str) bool[source]

Check whether the database contains the specified gene set collection.

Module contents