dae.schema2_storage package
Submodules
dae.schema2_storage.schema2_import_storage module
- class dae.schema2_storage.schema2_import_storage.Schema2ImportStorage
Bases: ImportStorage
Import logic for data in the Schema 2 format.
- BATCH_SIZE = 1000
- generate_import_task_graph(project: ImportProject) -> TaskGraph
Generate the task graph for importing the project into this storage.
- static generate_reannotate_task_graph(gpf_instance: GPFInstance, study_dir: str, region_size: int, allow_repeated_attributes: bool, work_dir: Path | None = None, *, full_reannotation: bool = False) -> TaskGraph
Generate TaskGraph for reannotation of a given study.
- static load_meta(project: ImportProject) -> dict[str, str]
Load metadata from the parquet dataset.
- dae.schema2_storage.schema2_import_storage.schema2_project_dataset_layout(project: ImportProject) -> Schema2DatasetLayout
dae.schema2_storage.schema2_layout module
- class dae.schema2_storage.schema2_layout.Schema2DatasetLayout(study: str, pedigree: str, summary: str | None, family: str | None, meta: str, base_dir: str | None = None)
Bases: object
Schema2 dataset layout data class.
- base_dir: str | None = None
- family: str | None
- meta: str
- pedigree: str
- study: str
- summary: str | None
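As a rough illustration of the fields listed above, the following stand-in mirrors the documented constructor signature. The name `LayoutSketch` and the example paths are hypothetical; the real class lives in `dae.schema2_storage.schema2_layout` and may carry behaviour not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class LayoutSketch:
    """Hypothetical stand-in with the documented Schema2DatasetLayout fields."""

    study: str
    pedigree: str
    summary: str | None
    family: str | None
    meta: str
    base_dir: str | None = None


# Example paths are invented for illustration only.
layout = LayoutSketch(
    study="/data/study_x",
    pedigree="/data/study_x/pedigree",
    summary=None,  # optional field: no summary table recorded
    family=None,   # optional field: no family table recorded
    meta="/data/study_x/meta",
)
```

Only `summary`, `family` and `base_dir` accept `None`; `study`, `pedigree` and `meta` are always required strings.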
- dae.schema2_storage.schema2_layout.create_schema2_dataset_layout(study_dir: str) -> Schema2DatasetLayout
Create dataset layout for a given directory.
Used for creating new datasets, where all tables should exist.
- dae.schema2_storage.schema2_layout.load_schema2_dataset_layout(study_dir: str, *, has_variants: bool = True) -> Schema2DatasetLayout
Create dataset layout for a given directory.
Assumes that the dataset already exists, so it should check whether the summary and family tables exist.
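The difference between the two functions can be sketched as follows. This is a minimal illustration, not the library's implementation: `LayoutSketch`, `create_layout_sketch`, `load_layout_sketch` and the subdirectory names (`pedigree`, `summary`, `family`, `meta`) are all hypothetical, and the handling of `has_variants` is an assumption based only on the descriptions above.

```python
from __future__ import annotations

import os
import tempfile
from dataclasses import dataclass


@dataclass
class LayoutSketch:
    """Hypothetical stand-in for Schema2DatasetLayout."""

    study: str
    pedigree: str
    summary: str | None
    family: str | None
    meta: str
    base_dir: str | None = None


def create_layout_sketch(study_dir: str) -> LayoutSketch:
    """New dataset: every table path is filled in, nothing is checked."""
    return LayoutSketch(
        study=study_dir,
        pedigree=os.path.join(study_dir, "pedigree"),  # assumed name
        summary=os.path.join(study_dir, "summary"),    # assumed name
        family=os.path.join(study_dir, "family"),      # assumed name
        meta=os.path.join(study_dir, "meta"),          # assumed name
    )


def load_layout_sketch(
    study_dir: str, *, has_variants: bool = True,
) -> LayoutSketch:
    """Existing dataset: keep summary/family only if requested and present."""
    layout = create_layout_sketch(study_dir)
    if not has_variants or not os.path.isdir(layout.summary):
        layout.summary = None
    if not has_variants or not os.path.isdir(layout.family):
        layout.family = None
    return layout


with tempfile.TemporaryDirectory() as study_dir:
    # Only the summary table exists in this toy dataset.
    os.makedirs(os.path.join(study_dir, "summary"))
    created = create_layout_sketch(study_dir)
    loaded = load_layout_sketch(study_dir)
    summary_found = loaded.summary is not None
    family_found = loaded.family is not None
    no_variants = load_layout_sketch(study_dir, has_variants=False)
```

In this sketch the "create" variant always returns paths for all tables, while the "load" variant sets `summary` and `family` to `None` when the corresponding directory is missing or when `has_variants` is false.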