dae.parquet_storage package
Submodules
dae.parquet_storage.storage module
- class dae.parquet_storage.storage.ParquetGenotypeStorage(storage_config: dict[str, Any])[source]
Bases: GenotypeStorage
Genotype storage for raw parquet files.
- VALIDATION_SCHEMA: ClassVar[dict] = {'dir': {'check_with': <function validate_path>, 'type': 'string'}, 'id': {'type': 'string'}, 'storage_type': {'allowed': ['parquet'], 'type': 'string'}}
- build_backend(study_config: dict[str, Any], genome: ReferenceGenome, gene_models: GeneModels | None) ParquetLoaderVariants [source]
Construct a query backend for this genotype storage.
- import_dataset(study_id: str, layout: Schema2DatasetLayout) Schema2DatasetLayout [source]
Copy study parquet dataset into Schema2 genotype storage.
- shutdown() GenotypeStorage [source]
No resources to close.
- start() GenotypeStorage [source]
Allocate all resources needed for the genotype storage to work.
- classmethod validate_and_normalize_config(config: dict) dict [source]
Normalize and validate the genotype storage configuration.
When validation passes, returns the normalized and validated genotype storage configuration dict.
When validation fails, raises ValueError.
All genotype storage configurations are required to have:
“storage_type” - which storage type this configuration is used for;
“id” - the ID of the genotype storage instance that will be created.
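As an illustration of the configuration shape described above, the sketch below checks a dict against the keys listed in VALIDATION_SCHEMA using plain Python. It is a stand-in written for this example, not the library's validator: the function name `check_parquet_storage_config`, the example ID, and the example directory are all hypothetical, and the path check performed by `validate_path` is not reproduced because its behavior is not described here.

```python
from typing import Any

# Values taken from the 'allowed' entry of VALIDATION_SCHEMA.
ALLOWED_STORAGE_TYPES = ("parquet",)


def check_parquet_storage_config(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of config if it has the documented shape.

    Hypothetical helper; raises ValueError on failure, mirroring the
    documented behavior of validate_and_normalize_config.
    """
    # Every genotype storage configuration must carry these two keys.
    for key in ("storage_type", "id"):
        if key not in config:
            raise ValueError(f"missing required key: {key!r}")
    # All three schema entries are declared with type 'string'.
    for key in ("storage_type", "id", "dir"):
        if key in config and not isinstance(config[key], str):
            raise ValueError(f"{key!r} must be a string")
    if config["storage_type"] not in ALLOWED_STORAGE_TYPES:
        raise ValueError(
            f"unsupported storage_type: {config['storage_type']!r}"
        )
    return dict(config)


# Illustrative values only.
example_config = {
    "id": "my_parquet_storage",
    "storage_type": "parquet",
    "dir": "/tmp/parquet_storage",
}
checked = check_parquet_storage_config(example_config)
```

In real code the dict would be passed to ParquetGenotypeStorage.validate_and_normalize_config, which may normalize values in ways this sketch does not.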
- class dae.parquet_storage.storage.ParquetImportStorage[source]
Bases: Schema2ImportStorage
Import storage for Parquet files.
- generate_import_task_graph(project: ImportProject) TaskGraph [source]
Generate the task graph for importing the project into this storage.
- class dae.parquet_storage.storage.ParquetLoaderVariants(data_dir: str, reference_genome: ReferenceGenome | None = None, gene_models: GeneModels | None = None)[source]
Bases: object
Variants class that utilizes ParquetLoader to fetch variants.
- build_family_variants_query_runner(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, family_ids: list[str] | None = None, person_ids: list[str] | None = None, inheritance: list[str] | None = None, roles: str | None = None, sexes: str | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, **_kwargs: Any) RawVariantsQueryRunner [source]
Return a query runner for the family variants.
- build_summary_variants_query_runner(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, **kwargs: Any) RawVariantsQueryRunner [source]
Return a query runner for the summary variants.
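Both query-runner builders take keyword-only arguments, and `real_attr_filter` and `frequency_filter` share the type `list[tuple[str, tuple[float | None, float | None]]]`. The sketch below shows one way to assemble values of that shape. The `range_filter` helper is hypothetical, the gene, effect type, and attribute names are illustrative placeholders, and reading a `None` bound as "open on that side" is an assumption, since the signatures alone do not state it.

```python
from typing import Any, Optional

# One filter entry: (attribute name, (lower bound, upper bound)).
RangeFilter = tuple[str, tuple[Optional[float], Optional[float]]]


def range_filter(
    attr: str,
    low: Optional[float] = None,
    high: Optional[float] = None,
) -> RangeFilter:
    """Build one entry for real_attr_filter or frequency_filter.

    Hypothetical helper; rejects inverted ranges before they reach a query.
    """
    if low is not None and high is not None and low > high:
        raise ValueError(f"lower bound {low} exceeds upper bound {high}")
    return (attr, (low, high))


# Keyword arguments matching the documented parameter names of
# build_summary_variants_query_runner; all values are placeholders.
summary_query_kwargs: dict[str, Any] = {
    "genes": ["CHD8"],
    "effect_types": ["missense"],
    "frequency_filter": [range_filter("af_allele_freq", None, 1.0)],
    "return_reference": False,
    "return_unknown": False,
}
```

Given a ParquetLoaderVariants instance named `variants`, such a dict would be passed as `variants.build_summary_variants_query_runner(**summary_query_kwargs)`. The family-variants builder accepts the same keys plus family-level ones such as `family_ids` and `person_ids`.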
- property families: FamiliesData