dae.schema2_storage package

Submodules

dae.schema2_storage.schema2_import_storage module

class dae.schema2_storage.schema2_import_storage.Schema2ImportStorage

Bases: ImportStorage

Import logic for data in the Schema 2 format.

BATCH_SIZE = 1000
generate_import_task_graph(project: ImportProject) → TaskGraph

Generate the task graph for importing the project into this storage.

static generate_reannotate_task_graph(gpf_instance: GPFInstance, study_dir: str, region_size: int, allow_repeated_attributes: bool, work_dir: Path | None = None, *, full_reannotation: bool = False) → TaskGraph

Generate TaskGraph for reannotation of a given study.

static load_meta(project: ImportProject) → dict[str, str]

Load metadata from the parquet dataset.

dae.schema2_storage.schema2_import_storage.schema2_project_dataset_layout(project: ImportProject) → Schema2DatasetLayout

dae.schema2_storage.schema2_layout module

class dae.schema2_storage.schema2_layout.Schema2DatasetLayout(study: str, pedigree: str, summary: str | None, family: str | None, meta: str, base_dir: str | None = None)

Bases: object

Schema2 dataset layout data class.

base_dir: str | None = None
family: str | None
has_variants() → bool
meta: str
pedigree: str
study: str
summary: str | None
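
The fields listed above can be pictured with a small stand-in data class. This is only a sketch: `LayoutSketch` is a hypothetical name, not the library's class, and the body of `has_variants()` is an assumption (the reference gives only its signature). Here it is taken to mean that both optional tables are set.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class LayoutSketch:
    """Hypothetical stand-in mirroring the documented field list."""

    study: str
    pedigree: str
    summary: str | None
    family: str | None
    meta: str
    base_dir: str | None = None

    def has_variants(self) -> bool:
        # Assumption: a dataset "has variants" only when both the
        # summary and the family table locations are present.
        return self.summary is not None and self.family is not None
```

Under that assumption, a pedigree-only dataset would be described by passing `None` for both `summary` and `family`.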
dae.schema2_storage.schema2_layout.create_schema2_dataset_layout(study_dir: str) → Schema2DatasetLayout

Create dataset layout for a given directory.

Used for creating new datasets, where all tables should exist.

dae.schema2_storage.schema2_layout.load_schema2_dataset_layout(study_dir: str, *, has_variants: bool = True) → Schema2DatasetLayout

Create dataset layout for a given directory.

Assumes that the dataset already exists, so the summary and family tables are checked for existence.
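
The difference between creating and loading a layout can be sketched as follows. This is an illustration only, not the library's implementation: the function names, the returned plain dictionary, the subdirectory names (`pedigree`, `summary`, `family`, `meta`) and the exact effect of `has_variants` are all assumptions made for the example.

```python
from __future__ import annotations

import os


def create_layout_sketch(study_dir: str) -> dict[str, str | None]:
    """Build every path unconditionally, as for a brand-new dataset."""
    # Assumption: each table lives in a subdirectory named after it.
    return {
        "study": study_dir,
        "pedigree": os.path.join(study_dir, "pedigree"),
        "summary": os.path.join(study_dir, "summary"),
        "family": os.path.join(study_dir, "family"),
        "meta": os.path.join(study_dir, "meta"),
    }


def load_layout_sketch(
    study_dir: str, *, has_variants: bool = True,
) -> dict[str, str | None]:
    """Describe an existing dataset, keeping only tables found on disk."""
    layout = create_layout_sketch(study_dir)
    for table in ("summary", "family"):
        path = layout[table]
        # Assumption: with has_variants=False the variant tables are
        # dropped outright; otherwise each one is kept only if its
        # directory actually exists.
        if not has_variants or path is None or not os.path.isdir(path):
            layout[table] = None
    return layout
```

In this sketch, loading a directory that contains only a `summary` subdirectory yields a layout whose `family` entry is `None`, while the create variant always fills in both.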

Module contents