dae.genotype_storage package

Submodules

dae.genotype_storage.genotype_storage module

class dae.genotype_storage.genotype_storage.GenotypeStorage(storage_config: dict[str, Any])

Bases: ABC

Base class for genotype storages.

build_backend(study_config: dict, genome: ReferenceGenome, gene_models: GeneModels) -> None

Create and cache backend for study.

create_runner(study_id: str, kwargs: dict[str, Any]) -> QueryRunner | None

Create a query runner for a study with given query kwargs.

create_summary_runner(study_id: str, kwargs: dict[str, Any]) -> QueryRunner | None

Create a query runner for summary variants for a given study.

abstract classmethod get_storage_types() -> set[str]

Return the set of genotype storage types supported by this class.

is_read_only() -> bool
property loaded_variants: dict[str, QueryVariantsBase]
property read_only: bool
abstract shutdown() -> GenotypeStorage

Free all resources used by the genotype storage.

abstract start() -> GenotypeStorage

Allocate all resources needed for the genotype storage to work.

property study_configs: dict[str, dict[str, Any]]
classmethod validate_and_normalize_config(config: dict) -> dict

Normalize and validate the genotype storage configuration.

When validation passes, returns the normalized and validated genotype storage configuration dict.

When validation fails, raises ValueError.

All genotype storage configurations are required to have:

  • “storage_type” - which storage type this configuration is used for;

  • “id” - the ID of the genotype storage instance that will be created.
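The contract described above can be illustrated with a small stand-alone sketch. It does not call the library: check_required_keys is a hypothetical helper that mirrors only the two documented requirements (the "storage_type" and "id" keys) and the documented ValueError on failure. The storage type value used in the usage note is illustrative, and the real method may perform additional, storage-specific validation and normalization.

```python
from typing import Any

# The two keys that every genotype storage configuration must contain.
REQUIRED_KEYS = ("storage_type", "id")


def check_required_keys(config: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in: reject configs that miss a required key."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing required keys: {', '.join(missing)}")
    # Return a copy so that the caller's dict is left untouched.
    return dict(config)
```

Calling check_required_keys({"id": "s1", "storage_type": "inmemory"}) returns a copy of the dict, while omitting either key raises ValueError.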

dae.genotype_storage.genotype_storage_genomic_context_cli module

Command-line helpers for genotype storage registry configuration.

This module provides CLI-based integration of genotype storage registries into the genomic context system. The CLIGenotypeStorageContextProvider allows tools to accept a genotype storage configuration file via command-line arguments and exposes the resulting GenotypeStorageRegistry through the shared genomic context mechanism.

Key Constants

GC_GENOTYPE_STORAGES_KEY : str

Standard key for the genotype storage registry object in the context.

See Also

dae.genomic_resources.genomic_context

High-level orchestration and provider registration functions.

dae.genotype_storage.genotype_storage_registry.GenotypeStorageRegistry

The registry implementation for managing multiple genotype storages.

class dae.genotype_storage.genotype_storage_genomic_context_cli.CLIGenotypeStorageContextProvider

Bases: GenomicContextProvider

Expose genotype storage registry configuration via CLI.

This provider allows CLI tools to load a genotype storage registry from a YAML configuration file specified on the command line. When invoked without the --genotype-storage-config argument, the provider returns None to allow other providers to supply default storage registries.

The provider registers a single context key, GC_GENOTYPE_STORAGES_KEY, pointing to the instantiated GenotypeStorageRegistry.

Notes

The provider has a priority of 500, placing it below general resource providers (like CLI genomic context at 900) but above lower-priority specialized providers.

add_argparser_arguments(parser: ArgumentParser) -> None

Register CLI argument for the genotype storage configuration file.

Parameters

parser

The argument parser that should receive the provider-specific option.

Notes

The provider adds --genotype-storage-config (short form --gsf) pointing to a YAML file describing the genotype storage configurations.

init(**kwargs: Any) -> GenomicContext | None

Build a genomic context containing a genotype storage registry.

Parameters

**kwargs

Keyword arguments parsed from the command line. The provider inspects the genotype_storages key (set via --genotype-storage-config).

Returns

GenomicContext | None

A context exposing the genotype storage registry under GC_GENOTYPE_STORAGES_KEY, or None when the configuration file argument is absent.

Notes

The configuration file must be valid YAML defining one or more genotype storage entries. The provider loads the file, instantiates a GenotypeStorageRegistry, and registers the configured storages.
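The provider pattern described above (register an option, then return None when it was not supplied) can be sketched with the standard library alone. SketchStorageProvider is a hypothetical stand-in, not the dae class: it skips YAML loading and registry construction and returns a plain dict in place of a GenomicContext. The file name storages.yaml is illustrative.

```python
import argparse
from typing import Any, Optional


class SketchStorageProvider:
    """Hypothetical stand-in for a CLI context provider (not the dae class)."""

    def add_argparser_arguments(self, parser: argparse.ArgumentParser) -> None:
        # Register the documented option; the parsed value is stored under
        # the genotype_storages key that init() inspects.
        parser.add_argument(
            "--genotype-storage-config",
            dest="genotype_storages",
            default=None,
            help="path to a YAML file with genotype storage configurations",
        )

    def init(self, **kwargs: Any) -> Optional[dict[str, Any]]:
        config_path = kwargs.get("genotype_storages")
        if config_path is None:
            # No argument given: step aside so another provider can answer.
            return None
        # The real provider would load the YAML file and build a registry here.
        return {"genotype_storages_config": config_path}


parser = argparse.ArgumentParser()
provider = SketchStorageProvider()
provider.add_argparser_arguments(parser)

# Without the option the provider yields nothing.
absent = provider.init(**vars(parser.parse_args([])))
# With the option the provider yields a context-like mapping.
present = provider.init(
    **vars(parser.parse_args(["--genotype-storage-config", "storages.yaml"])))
```

Returning None rather than raising keeps the provider optional, so a tool can combine it with other providers that supply a default registry.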

dae.genotype_storage.genotype_storage_genomic_context_cli.get_context_genotype_storages(context: GenomicContext) -> GenotypeStorageRegistry | None

Extract a validated genotype storage registry from context.

Parameters

context

The genomic context from which to retrieve the registry object.

Returns

GenotypeStorageRegistry | None

The registry instance or None when the context does not expose a genotype storage registry.

Raises

TypeError

If the context entry is present but does not contain the expected GenotypeStorageRegistry type.

Notes

This helper is analogous to dae.annotation.annotation_genomic_context_cli.get_context_pipeline() and provides type-safe access to the genotype storage registry within the context system.
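The three documented outcomes (a registry, None, or TypeError) can be sketched as follows. This is a stand-alone illustration under stated assumptions: the context is modelled as a plain dict, SketchRegistry stands in for GenotypeStorageRegistry, and the value of SKETCH_KEY is invented for the example rather than taken from GC_GENOTYPE_STORAGES_KEY.

```python
from typing import Any, Optional


class SketchRegistry:
    """Hypothetical stand-in for GenotypeStorageRegistry."""


SKETCH_KEY = "genotype_storages"  # assumed key value, for illustration only


def get_registry(context: dict[str, Any]) -> Optional[SketchRegistry]:
    """Return the registry stored in the context, checking its type."""
    registry = context.get(SKETCH_KEY)
    if registry is None:
        # The context does not expose a registry at all.
        return None
    if not isinstance(registry, SketchRegistry):
        # Present but of the wrong type: fail loudly instead of returning it.
        raise TypeError(
            f"expected SketchRegistry, got {type(registry).__name__}")
    return registry
```

Checking the type at the access point means callers can rely on the returned object without repeating isinstance checks.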

dae.genotype_storage.genotype_storage_registry module

class dae.genotype_storage.genotype_storage_registry.GenotypeStorageRegistry

Bases: object

Registry for genotype storages.

This class can accept the genotype storages configuration from a GPF instance configuration and instantiate and register every genotype storage defined in it. To do this, use GenotypeStorageRegistry.register_storages_configs().

To create and register a single genotype storage from its configuration, use GenotypeStorageRegistry.register_storage_config().

When you have already created a genotype storage instance, use GenotypeStorageRegistry.register_genotype_storage() to register it.

find_storage(study_id: str) -> GenotypeStorage
get_all_genotype_storage_ids() -> list[str]

Return list of all registered genotype storage IDs.

get_all_genotype_storages() -> list[GenotypeStorage]

Return list of registered genotype storages.

get_default_genotype_storage() -> GenotypeStorage

Return the default genotype storage if one is defined.

Otherwise, return None.

get_genotype_storage(storage_id: str) -> GenotypeStorage

Return genotype storage with specified storage_id.

If the method cannot find a storage with the specified ID, it raises a ValueError.
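The register-then-look-up behaviour can be sketched with a dict keyed by storage ID. SketchStorage and SketchStorageRegistry are hypothetical stand-ins that borrow the documented method names for readability; they are not the library's implementation and ignore default storages, study lookup, and shutdown.

```python
class SketchStorage:
    """Hypothetical storage that exposes only an ID."""

    def __init__(self, storage_id: str) -> None:
        self.storage_id = storage_id


class SketchStorageRegistry:
    """Hypothetical registry keyed by storage ID (not the dae class)."""

    def __init__(self) -> None:
        self._storages: dict[str, SketchStorage] = {}

    def register_genotype_storage(self, storage: SketchStorage) -> SketchStorage:
        # Store the instance under its ID and hand it back to the caller.
        self._storages[storage.storage_id] = storage
        return storage

    def get_all_genotype_storage_ids(self) -> list[str]:
        return list(self._storages)

    def get_genotype_storage(self, storage_id: str) -> SketchStorage:
        if storage_id not in self._storages:
            # Mirrors the documented behaviour for unknown IDs.
            raise ValueError(f"unknown genotype storage: {storage_id}")
        return self._storages[storage_id]
```

Raising on an unknown ID, instead of returning None, surfaces configuration mistakes at the point of lookup.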

query_summary_variants(study_ids: list[str], kwargs: dict[str, Any], limit: int | None = None) -> Iterable[SummaryVariant]

Query summary variants for the given list of study IDs and kwargs.

query_variants(study_kwargs: list[tuple[str, dict[str, Any]]], limit: int | None = None) -> Iterable[FamilyVariant]

Query variants for the given list of (study ID, kwargs) pairs.

register_default_storage(genotype_storage: GenotypeStorage) -> None

Register a genotype storage and make it the default storage.

register_genotype_storage(storage: GenotypeStorage) -> GenotypeStorage

Register a genotype storage instance.

register_storage_config(storage_config: dict[str, Any]) -> GenotypeStorage

Create a genotype storage from the storage config and register it.

register_storages_configs(genotype_storages_config: dict[str, Any]) -> None

Create and register all genotype storages defined in config.

When defining a GPF instance, we specify a genotype_storage section in the configuration. If you pass this whole configuration section to this method, it will create and register all genotype storages defined in that configuration section.

shutdown() -> None
dae.genotype_storage.genotype_storage_registry.get_genotype_storage_factory(storage_type: str) -> Callable[[dict[str, Any]], GenotypeStorage]

Find and return a factory function for creation of a storage type.

If the specified storage type is not found, this function raises a ValueError.

Returns:

the genotype storage factory for the specified storage type.

Raises:

ValueError – when no genotype storage factory is found for the specified storage type.

dae.genotype_storage.genotype_storage_registry.get_genotype_storage_types() -> list[str]

Return the list of all registered genotype storage factory types.

dae.genotype_storage.genotype_storage_registry.register_genotype_storage_factory(storage_type: str, factory: Callable[[dict[str, Any]], GenotypeStorage]) -> None

Register an additional genotype storage factory.

By default, all genotype storage factories should be registered at the [dae.genotype_storage.factories] extension point. All registered factories are loaded automatically. Use this function if you want to bypass the extension point mechanism and register an additional genotype storage factory programmatically.
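The factory functions in this module follow a common pattern: a mapping from storage type to a callable that builds a storage from its configuration. The sketch below illustrates that pattern in isolation. All names (register_factory, get_factory, get_types, create_storage) are hypothetical, no extension points are loaded, and overwriting an existing entry on re-registration is an assumption of the sketch, not a documented behaviour of the library.

```python
from typing import Any, Callable

# A factory takes a storage configuration dict and returns a storage object.
StorageFactory = Callable[[dict[str, Any]], Any]

_FACTORIES: dict[str, StorageFactory] = {}


def register_factory(storage_type: str, factory: StorageFactory) -> None:
    """Register a factory programmatically (sketch: overwrites duplicates)."""
    _FACTORIES[storage_type] = factory


def get_factory(storage_type: str) -> StorageFactory:
    """Return the factory for a storage type or raise ValueError."""
    if storage_type not in _FACTORIES:
        raise ValueError(f"unsupported storage type: {storage_type}")
    return _FACTORIES[storage_type]


def get_types() -> list[str]:
    """Return all registered storage types in sorted order."""
    return sorted(_FACTORIES)


def create_storage(config: dict[str, Any]) -> Any:
    """Dispatch on the storage_type key and build the storage."""
    return get_factory(config["storage_type"])(config)
```

With a factory registered under a given type, create_storage picks it based on the configuration's "storage_type" key, which is why that key is required in every storage configuration.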

Module contents