dae.genotype_storage package
Submodules
dae.genotype_storage.genotype_storage module
- class dae.genotype_storage.genotype_storage.GenotypeStorage(storage_config: dict[str, Any])[source]
Bases: ABC
Base class for genotype storages.
- build_backend(study_config: dict, genome: ReferenceGenome, gene_models: GeneModels) None[source]
Create and cache backend for study.
- create_runner(study_id: str, kwargs: dict[str, Any]) QueryRunner | None[source]
Create a query runner for a study with given query kwargs.
- create_summary_runner(study_id: str, kwargs: dict[str, Any]) QueryRunner | None[source]
Create a query runner for summary variants for a given study.
- property loaded_variants: dict[str, QueryVariantsBase]
- property read_only: bool
- abstract shutdown() GenotypeStorage[source]
Free all resources used by the genotype storage.
- abstract start() GenotypeStorage[source]
Allocate all resources needed for the genotype storage to work.
- property study_configs: dict[str, dict[str, Any]]
- classmethod validate_and_normalize_config(config: dict) dict[source]
Normalize and validate the genotype storage configuration.
When validation passes, returns the normalized and validated genotype storage configuration dict.
When validation fails, raises ValueError.
All genotype storage configurations are required to have:
“storage_type” - which storage type this configuration is used for;
“id” - the ID of the genotype storage instance that will be created.
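As a rough illustration of the contract described above, the following self-contained sketch checks the two required keys and raises ValueError when either is missing. The function name validate_storage_config_sketch, the whitespace normalization, and the "example_type" value are hypothetical; the actual method may validate against a fuller, storage-specific schema.

```python
from typing import Any


def validate_storage_config_sketch(config: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for illustration; not the dae implementation."""
    normalized = dict(config)  # work on a copy, leave the input untouched
    for key in ("storage_type", "id"):
        value = normalized.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"genotype storage config requires '{key}'")
        normalized[key] = value.strip()  # assumed normalization step
    return normalized


config = validate_storage_config_sketch(
    {"storage_type": "example_type", "id": " my_storage "},
)
print(config["id"])
```

A concrete storage class would presumably extend such a check with its own required fields.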
dae.genotype_storage.genotype_storage_genomic_context_cli module
Command-line helpers for genotype storage registry configuration.
This module provides CLI-based integration of genotype storage registries
into the genomic context system. The
CLIGenotypeStorageContextProvider allows tools to accept a genotype
storage configuration file via command-line arguments and exposes the
resulting GenotypeStorageRegistry through the shared genomic
context mechanism.
Key Constants
- GC_GENOTYPE_STORAGES_KEY : str
Standard key for the genotype storage registry object in the context.
See Also
- dae.genomic_resources.genomic_context
High-level orchestration and provider registration functions.
- dae.genotype_storage.genotype_storage_registry.GenotypeStorageRegistry
The registry implementation for managing multiple genotype storages.
- class dae.genotype_storage.genotype_storage_genomic_context_cli.CLIGenotypeStorageContextProvider[source]
Bases: GenomicContextProvider
Expose genotype storage registry configuration via CLI.
This provider allows CLI tools to load a genotype storage registry from a YAML configuration file specified on the command line. When invoked without the --genotype-storage-config argument, the provider returns None to allow other providers to supply default storage registries.
The provider registers a single context key, GC_GENOTYPE_STORAGES_KEY, pointing to the instantiated GenotypeStorageRegistry.
Notes
The provider has a priority of 500, placing it below general resource providers (like CLI genomic context at 900) but above lower-priority specialized providers.
- add_argparser_arguments(parser: ArgumentParser) None[source]
Register CLI argument for the genotype storage configuration file.
Parameters
- parser
The argument parser that should receive the provider-specific option.
Notes
The provider adds --genotype-storage-config (short form --gsf) pointing to a YAML file describing the genotype storage configurations.
- init(**kwargs: Any) GenomicContext | None[source]
Build a genomic context containing a genotype storage registry.
Parameters
- **kwargs
Keyword arguments parsed from the command line. The provider inspects the genotype_storages key (set via --genotype-storage-config).
Returns
- GenomicContext | None
A context exposing the genotype storage registry under GC_GENOTYPE_STORAGES_KEY, or None when the configuration file argument is absent.
Notes
The configuration file must be valid YAML defining one or more genotype storage entries. The provider loads the file, instantiates a
GenotypeStorageRegistry, and registers the configured storages.
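The argument registration and the "return None when absent" behaviour described for this provider can be sketched with the standard library alone. The function names add_arguments_sketch and init_sketch are hypothetical, the dest value is assumed to match the genotype_storages key mentioned above, and a plain dict stands in for the GenomicContext that the real provider returns.

```python
from __future__ import annotations

import argparse
from typing import Any


def add_arguments_sketch(parser: argparse.ArgumentParser) -> None:
    # dest is assumed to match the `genotype_storages` key read by init().
    parser.add_argument(
        "--genotype-storage-config", "--gsf",
        dest="genotype_storages",
        default=None,
        help="YAML file describing the genotype storage configurations",
    )


def init_sketch(**kwargs: Any) -> dict[str, Any] | None:
    """Return None when no config file was given, so other providers can act."""
    config_path = kwargs.get("genotype_storages")
    if config_path is None:
        return None
    # The real provider would load the YAML file and build a registry here;
    # this stand-in only records the path.
    return {"config_path": config_path}


parser = argparse.ArgumentParser()
add_arguments_sketch(parser)
args = parser.parse_args(["--gsf", "storages.yaml"])
context = init_sketch(**vars(args))
```

Returning None instead of raising lets a lower-priority provider supply a default registry when the option is omitted.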
- dae.genotype_storage.genotype_storage_genomic_context_cli.get_context_genotype_storages(context: GenomicContext) GenotypeStorageRegistry | None[source]
Extract a validated genotype storage registry from context.
Parameters
- context
The genomic context from which to retrieve the registry object.
Returns
- GenotypeStorageRegistry | None
The registry instance, or None when the context does not expose a genotype storage registry.
Raises
- TypeError
If the context entry is present but does not contain the expected GenotypeStorageRegistry type.
Notes
This helper is analogous to dae.annotation.annotation_genomic_context_cli.get_context_pipeline() and provides type-safe access to the genotype storage registry within the context system.
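The three outcomes documented above (registry returned, None returned, TypeError raised) follow a common type-checked accessor pattern. The sketch below is a minimal stand-in: a plain dict plays the role of the genomic context, and RegistrySketch, GC_KEY_SKETCH, and get_registry_sketch are hypothetical names, not part of dae.

```python
from __future__ import annotations

from typing import Any

GC_KEY_SKETCH = "genotype_storages"  # hypothetical key value


class RegistrySketch:
    """Stand-in for GenotypeStorageRegistry; illustration only."""


def get_registry_sketch(context: dict[str, Any]) -> RegistrySketch | None:
    registry = context.get(GC_KEY_SKETCH)
    if registry is None:
        # The context does not expose a registry at all.
        return None
    if not isinstance(registry, RegistrySketch):
        # An entry exists under the key but has the wrong type.
        raise TypeError(
            f"expected RegistrySketch, got {type(registry).__name__}")
    return registry


registry = RegistrySketch()
found = get_registry_sketch({GC_KEY_SKETCH: registry})
missing = get_registry_sketch({})
```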
dae.genotype_storage.genotype_storage_registry module
- class dae.genotype_storage.genotype_storage_registry.GenotypeStorageRegistry[source]
Bases: object
Registry for genotype storages.
This class can accept a genotype storages config from a GPF instance configuration and instantiate and register all genotype storages defined in it. To do this, use GenotypeStorageRegistry.register_storages_configs().
To create and register a single genotype storage from its configuration, use GenotypeStorageRegistry.register_storage_config().
When you have already created a genotype storage instance, use GenotypeStorageRegistry.register_genotype_storage() to register it.
- find_storage(study_id: str) GenotypeStorage[source]
- get_all_genotype_storage_ids() list[str][source]
Return list of all registered genotype storage IDs.
- get_all_genotype_storages() list[GenotypeStorage][source]
Return list of registered genotype storages.
- get_default_genotype_storage() GenotypeStorage[source]
Return the default genotype storage if one is defined.
Otherwise, return None.
- get_genotype_storage(storage_id: str) GenotypeStorage[source]
Return genotype storage with specified storage_id.
If no storage with the specified ID is found, the method raises ValueError.
- query_summary_variants(study_ids: list[str], kwargs: dict[str, Any], limit: int | None = None) Iterable[SummaryVariant][source]
Query summary variants for the given study IDs and kwargs.
- query_variants(study_kwargs: list[tuple[str, dict[str, Any]]], limit: int | None = None) Iterable[FamilyVariant][source]
Query variants for the given study IDs and kwargs.
- register_default_storage(genotype_storage: GenotypeStorage) None[source]
Register a genotype storage and make it the default storage.
- register_genotype_storage(storage: GenotypeStorage) GenotypeStorage[source]
Register a genotype storage instance.
- register_storage_config(storage_config: dict[str, Any]) GenotypeStorage[source]
Create a genotype storage from a storage config and register it.
- register_storages_configs(genotype_storages_config: dict[str, Any]) None[source]
Create and register all genotype storages defined in config.
When defining a GPF instance, we specify a genotype_storage section in the configuration. If you pass this whole configuration section to this method, it will create and register all genotype storages defined in that configuration section.
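To make the registration and lookup semantics listed for this class concrete, here is a small stand-in registry that mirrors a few of the documented method names. StorageSketch and RegistrySketch are hypothetical classes written for illustration; they do not reproduce the dae implementation, and keying storages by a storage_id attribute is an assumption.

```python
from __future__ import annotations


class StorageSketch:
    """Stand-in for a GenotypeStorage; only carries an ID."""

    def __init__(self, storage_id: str) -> None:
        self.storage_id = storage_id


class RegistrySketch:
    """Stand-in registry mirroring some documented method names."""

    def __init__(self) -> None:
        self._storages: dict[str, StorageSketch] = {}
        self._default: StorageSketch | None = None

    def register_genotype_storage(self, storage: StorageSketch) -> StorageSketch:
        self._storages[storage.storage_id] = storage
        return storage

    def register_default_storage(self, storage: StorageSketch) -> None:
        # Registering as default also registers the storage itself.
        self.register_genotype_storage(storage)
        self._default = storage

    def get_genotype_storage(self, storage_id: str) -> StorageSketch:
        if storage_id not in self._storages:
            raise ValueError(f"unknown genotype storage: {storage_id}")
        return self._storages[storage_id]

    def get_default_genotype_storage(self) -> StorageSketch | None:
        return self._default

    def get_all_genotype_storage_ids(self) -> list[str]:
        return list(self._storages)


registry = RegistrySketch()
first = registry.register_genotype_storage(StorageSketch("storage_a"))
registry.register_default_storage(StorageSketch("storage_b"))
```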
- dae.genotype_storage.genotype_storage_registry.get_genotype_storage_factory(storage_type: str) Callable[[dict[str, Any]], GenotypeStorage][source]
Find and return a factory function for creation of a storage type.
If the specified storage type is not found, this function raises ValueError.
- Returns:
the genotype storage factory for the specified storage type.
- Raises:
ValueError – when no genotype storage factory is found for the specified storage type.
- dae.genotype_storage.genotype_storage_registry.get_genotype_storage_types() list[str][source]
Return the list of all registered genotype storage factory types.
- dae.genotype_storage.genotype_storage_registry.register_genotype_storage_factory(storage_type: str, factory: Callable[[dict[str, Any]], GenotypeStorage]) None[source]
Register an additional genotype storage factory.
By default, all genotype storage factories should be registered at the [dae.genotype_storage.factories] extension point. All registered factories are loaded automatically. Use this function if you want to bypass the extension point mechanism and register an additional genotype storage factory programmatically.
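The three module-level functions above describe a mapping from storage type to factory callable. A minimal sketch of that pattern, assuming a module-level dict as the backing store, is shown below. All names ending in _sketch and the "example_type" storage type are hypothetical, and the automatic loading from the extension point is not modelled.

```python
from __future__ import annotations

from typing import Any, Callable

FactorySketch = Callable[[dict[str, Any]], Any]

# Hypothetical backing store: storage type -> factory callable.
_FACTORIES_SKETCH: dict[str, FactorySketch] = {}


def register_factory_sketch(storage_type: str, factory: FactorySketch) -> None:
    _FACTORIES_SKETCH[storage_type] = factory


def get_factory_sketch(storage_type: str) -> FactorySketch:
    if storage_type not in _FACTORIES_SKETCH:
        raise ValueError(f"unsupported storage type: {storage_type}")
    return _FACTORIES_SKETCH[storage_type]


def get_types_sketch() -> list[str]:
    return list(_FACTORIES_SKETCH)


def example_factory(storage_config: dict[str, Any]) -> dict[str, Any]:
    # A real factory would return a GenotypeStorage; a dict stands in here.
    return {"id": storage_config["id"], "kind": "example"}


register_factory_sketch("example_type", example_factory)
storage = get_factory_sketch("example_type")({"id": "my_storage"})
```

Programmatic registration of this kind is mainly useful in tests or plugins that cannot declare an extension point entry.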