dae.query_variants package
Subpackages
Submodules
dae.query_variants.attribute_queries module
Module for attribute queries
Attribute queries are human-readable strings used to filter variants based on enum attribute values.
They apply only to enum types. This module provides three utilities that transform an attribute query string into a callable function, an sqlglot expression, or an sqlglot expression tailored to legacy genotype storages.
Attribute queries have a defined Earley grammar, which Lark uses to build a parse tree.
The grammar divides the input string into expressions of the following types:
- literal:
The most basic type: a plain string that matches an enumeration value. For example, for roles, “mom” matches variants that are present in the mother.
- compound:
A special case of literal, used for zygosity. A compound consists of two literals connected by a tilde (“~”), the second being the complementary value. For example: prb~homozygous matches variants that are present and homozygous in the proband of the family.
- neg:
Match any value that does not match the underlying expression. Negation is written by placing “not” before the expression. Example: “not sib~heterozygous”
- and_:
Match when both expressions are true. Example: prb and dad
- or_:
Match when at least one of the expressions is true. Example: sib or prb
- grouping:
Use brackets to make an expression evaluate first. Example: (sib or prb) and dad
- any:
Syntactic sugar for an or_ across many values. Example: “any([dad, prb, sib])” is equivalent to “dad or prb or sib”
- all:
Syntactic sugar for an and_ across many values. Example: “all([dad, prb, sib])” is equivalent to “dad and prb and sib”
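The expression types above can be illustrated with a small evaluator. This is only a sketch of the semantics as described here, not the module's implementation: the tuple-based tree, the `evaluate` function and the role-to-zygosity dictionary are hypothetical stand-ins for the Lark parse tree and the enum values the real transformers work with.

```python
# Hypothetical tree: each node is a tuple whose first item names the
# expression type from the grammar description above.
def evaluate(node: tuple, family: dict[str, str]) -> bool:
    """Evaluate a query tree against a mapping of role -> zygosity.

    A role is a key of `family` only if the variant is present in
    that family member (an assumption made for this sketch).
    """
    kind = node[0]
    if kind == "literal":
        return node[1] in family
    if kind == "compound":
        # ("compound", "prb", "homozygous") stands for prb~homozygous
        return family.get(node[1]) == node[2]
    if kind == "neg":
        return not evaluate(node[1], family)
    if kind == "and_":
        return evaluate(node[1], family) and evaluate(node[2], family)
    if kind == "or_":
        return evaluate(node[1], family) or evaluate(node[2], family)
    if kind == "grouping":
        return evaluate(node[1], family)
    if kind == "any":
        return any(evaluate(child, family) for child in node[1])
    if kind == "all":
        return all(evaluate(child, family) for child in node[1])
    raise ValueError(f"unknown expression type: {kind}")


family = {"prb": "homozygous", "dad": "heterozygous"}

# (sib or prb) and dad
grouped = (
    "and_",
    ("grouping", ("or_", ("literal", "sib"), ("literal", "prb"))),
    ("literal", "dad"),
)
# not sib~heterozygous
negated = ("neg", ("compound", "sib", "heterozygous"))
# any([dad, prb, sib]) and all([dad, prb, sib])
members = [("literal", "dad"), ("literal", "prb"), ("literal", "sib")]
any_query = ("any", members)
all_query = ("all", members)
```

With this family, `grouped`, `negated` and `any_query` evaluate to true, while `all_query` evaluates to false because the variant is not present in a sibling.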
- class dae.query_variants.attribute_queries.AttributeQueryTransformer(enum_type: type[Enum], main_aliases: dict[str, str], complementary_type: type[Enum] | None = None)[source]
Bases:
Transformer
Base class for attribute query transformers.
Supports two enum types, the second being optional. The second enum type is intended for compounds.
- class dae.query_variants.attribute_queries.AttributeQueryTransformerFunction(enum_type: type[Enum], main_aliases: dict[str, str], complementary_type: type[Enum] | None = None)[source]
Bases:
AttributeQueryTransformer
Class for transforming attribute query Lark trees into callable functions.
- and_(values: list[Callable[[int, int | None], bool]]) Callable[[int, int | None], bool] [source]
Transform and_ into a function that combines two comparisons.
- compound(values: list[Token]) Callable[[int, int | None], bool] [source]
Transform compounds into a function that compares two values.
- grouping(values: list[Callable[[int, int | None], bool]]) Callable[[int, int | None], bool] [source]
Transform grouping into a function that evaluates the expression.
- literal(values: list[Token]) Callable[[int, int | None], bool] [source]
Transform literals into a direct comparison function.
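The methods above all return callables of the shape `Callable[[int, int | None], bool]`. The following sketch shows how such callables could be composed. It assumes that enum values are bit flags and that the first argument carries the main value and the second the optional complementary value; the `Role` and `Zygosity` enums and the helper functions are hypothetical and are not taken from the library.

```python
from enum import IntFlag
from typing import Callable, Optional

Matcher = Callable[[int, Optional[int]], bool]


class Role(IntFlag):  # hypothetical enum, for illustration only
    prb = 1
    sib = 2
    mom = 4
    dad = 8


class Zygosity(IntFlag):  # hypothetical complementary enum
    homozygous = 1
    heterozygous = 2


def literal(value: int) -> Matcher:
    # Direct comparison: is the flag set in the main value?
    return lambda main, comp: bool(main & value)


def compound(value: int, comp_value: int) -> Matcher:
    # Compare two values: the main flag and the complementary flag.
    return lambda main, comp: (
        bool(main & value) and comp is not None and bool(comp & comp_value)
    )


def neg(inner: Matcher) -> Matcher:
    return lambda main, comp: not inner(main, comp)


def and_(left: Matcher, right: Matcher) -> Matcher:
    return lambda main, comp: left(main, comp) and right(main, comp)


def or_(left: Matcher, right: Matcher) -> Matcher:
    return lambda main, comp: left(main, comp) or right(main, comp)


# (sib or prb) and dad
matcher = and_(or_(literal(Role.sib), literal(Role.prb)), literal(Role.dad))
```

Building closures once and calling them per variant keeps parsing out of the hot path, which is one plausible reason to transform a query into a function instead of re-evaluating the tree each time.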
- class dae.query_variants.attribute_queries.AttributeQueryTransformerSQL(main_column: Column, enum_type: type[Enum], main_aliases: dict[str, str], complementary_column: Column | None = None, complementary_type: type[Enum] | None = None)[source]
Bases:
AttributeQueryTransformer
Class for transforming attribute queries into an SQLglot expression.
- compound(values: list[Token]) Expression [source]
Convert compounds into a query statement for two enums.
- class dae.query_variants.attribute_queries.AttributeQueryTransformerSQLLegacy(main_column: Column, enum_type: type[Enum], main_aliases: dict[str, str], complementary_column: Column | None = None, complementary_type: type[Enum] | None = None)[source]
Bases:
AttributeQueryTransformerSQL
Class for transforming attribute queries into an SQLglot expression.
Intended for use with legacy Impala schema1 storage.
- class dae.query_variants.attribute_queries.CompoundStripTransformer(visit_tokens: bool = True)[source]
Bases:
Transformer
- class dae.query_variants.attribute_queries.QueryCompoundAdditionTransformer(compound_value: str)[source]
Bases:
Transformer
Transformer that adds a compound value to an attribute query’s literals.
Directly returns a new attribute query string to be used with other transformers.
- class dae.query_variants.attribute_queries.SyntaxSugarTransformer(visit_tokens: bool = True)[source]
Bases:
Transformer
Transformer for adapting syntax sugar to regular queries.
- dae.query_variants.attribute_queries.transform_attribute_query_to_function(enum_type: type[Enum], query: str, aliases: dict[str, str] | None = None, *, complementary_type: type[Enum] | None = None, strip_compounds: bool = False) Matcher [source]
Transform attribute query to a callable function.
Can evaluate a query for multiple enum types. Queries need to use proper enum names in order to be valid. A dictionary of aliases can be provided, where the keys are the original values.
- dae.query_variants.attribute_queries.transform_attribute_query_to_sql_expression(enum_type: type[Enum], query: str, column: Column, aliases: dict[str, str] | None = None, *, complementary_type: type[Enum] | None = None, complementary_column: Column | None = None, strip_compounds: bool = False) Expression [source]
Transform attribute query to an SQLglot expression.
Can evaluate a query for multiple enum types. Queries need to use proper enum names in order to be valid. A dictionary of aliases can be provided, where the keys are the original values.
- dae.query_variants.attribute_queries.transform_attribute_query_to_sql_expression_schema1(enum_type: type[Enum], query: str, column: Column, aliases: dict[str, str] | None = None, *, complementary_type: type[Enum] | None = None, complementary_column: Column | None = None, strip_compounds: bool = False) Expression [source]
Transform attribute query to an SQLglot expression for legacy schema1 storages.
Can evaluate a query for multiple enum types. Queries need to use proper enum names in order to be valid. A dictionary of aliases can be provided, where the keys are the original values.
- dae.query_variants.attribute_queries.update_attribute_query_with_compounds(query: str, compound_value: str) str [source]
Update an attribute query to match by a secondary compound value.
Used to add zygosity in an already existing attribute query, for example: “dad and mom” -> “dad~homozygous and mom~homozygous”
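The rewrite described for `update_attribute_query_with_compounds` can be approximated at the string level. The sketch below is a simplified regex stand-in, not the library's Lark-based transformer; the `add_compound` name, the keyword set, and the choice to leave literals that already carry a compound untouched are assumptions made for illustration.

```python
import re

# Words that belong to the query grammar and must not be rewritten.
KEYWORDS = {"and", "or", "not", "any", "all"}

# An identifier, optionally already followed by "~value".
TOKEN = re.compile(r"[A-Za-z_]\w*(?:~\w+)?")


def add_compound(query: str, compound_value: str) -> str:
    """Append ~compound_value to every plain literal in the query."""

    def rewrite(match: re.Match) -> str:
        token = match.group(0)
        if token in KEYWORDS or "~" in token:
            return token  # keyword or existing compound: keep as is
        return f"{token}~{compound_value}"

    return TOKEN.sub(rewrite, query)


print(add_compound("dad and mom", "homozygous"))
# dad~homozygous and mom~homozygous
```

A parser-based transformer, as used by the module, is more robust than this regex because it follows the grammar instead of guessing token boundaries.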
dae.query_variants.base_query_variants module
- class dae.query_variants.base_query_variants.QueryVariants(families: FamiliesData)[source]
Bases:
ABC
Abstract interface for querying variants.
- abstract build_family_variants_query_runner(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, family_ids: list[str] | None = None, person_ids: list[str] | None = None, inheritance: list[str] | None = None, roles_in_parent: str | None = None, roles_in_child: str | None = None, roles: str | None = None, sexes: str | None = None, affected_statuses: str | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, study_filters: list[str] | None = None, tags_query: TagsQuery | None = None, zygosity_query: ZygosityQuery | None = None, **kwargs: Any) QueryRunner | None [source]
Create a query runner for searching family variants.
- abstract build_summary_variants_query_runner(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, **kwargs: Any) QueryRunner | None [source]
Create a query runner for searching summary variants.
- property families: FamiliesData
Return families data.
- abstract has_affected_status_queries() bool [source]
Return True if the storage supports affected status queries.
- abstract query_summary_variants(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, **kwargs: Any) Generator[SummaryVariant, None, None] [source]
Execute the summary variants query and yield summary variants.
- abstract query_variants(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, family_ids: list[str] | None = None, person_ids: list[str] | None = None, inheritance: list[str] | None = None, roles_in_parent: str | None = None, roles_in_child: str | None = None, roles: str | None = None, sexes: str | None = None, affected_statuses: str | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, tags_query: TagsQuery | None = None, zygosity_query: ZygosityQuery | None = None, **kwargs: Any) Generator[FamilyVariant, None, None] [source]
Execute the family variants query and yield family variants.
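The `real_attr_filter` and `frequency_filter` parameters share the type `list[tuple[str, tuple[float | None, float | None]]]`, that is, a list of attribute names paired with a range. The sketch below shows one way such a value could be built and applied. The semantics are assumptions, since the signatures do not state them: `None` is treated as an open bound, bounds are inclusive, and a missing attribute fails the filter. The attribute names are hypothetical.

```python
from typing import Optional

Bounds = tuple[Optional[float], Optional[float]]
RangeFilter = list[tuple[str, Bounds]]


def in_range(value: Optional[float], bounds: Bounds) -> bool:
    """Check a value against (low, high); None means no bound (assumed)."""
    low, high = bounds
    if value is None:
        return False
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def passes(attributes: dict[str, Optional[float]], filters: RangeFilter) -> bool:
    """Return True only if every named attribute falls inside its range."""
    return all(
        in_range(attributes.get(name), bounds) for name, bounds in filters
    )


# Hypothetical attribute names, for illustration only.
frequency_filter: RangeFilter = [
    ("score_a", (None, 0.01)),  # at most 0.01, no lower bound
    ("score_b", (0.5, None)),   # at least 0.5, no upper bound
]
```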
- class dae.query_variants.base_query_variants.QueryVariantsBase(families: FamiliesData, *, summary_schema: dict[str, Any], variants_blob_serializer: str)[source]
Bases:
QueryVariants
Base class for the Schema2 variants query interface.
- RUNNER_CLASS: type[QueryRunner]
- deserialize_family_variant(sv_data: bytes, fv_data: bytes) FamilyVariant [source]
Deserialize a family variant from summary and family blobs.
- deserialize_summary_variant(sv_data: bytes) SummaryVariant [source]
Deserialize a summary variant from a summary blob.
dae.query_variants.query_runners module
- class dae.query_variants.query_runners.QueryResult(runners: list[QueryRunner], limit: int | None = -1)[source]
Bases:
object
Run a list of queries in the background.
The results of the queries are enqueued on result_queue.
- CHECK_VERBOSITY = 20
- class dae.query_variants.query_runners.QueryRunner(**kwargs: Any)[source]
Bases:
ABC
Run a query in the background using the provided executor.
- put_value_in_result_queue(val: Any) None [source]
Put a value in the result queue.
The result queue is blocking, so this call waits until there is space for the new value. This applies backpressure to the QueryRunners.
- property result_queue: Queue | None
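The backpressure behaviour described for the result queue can be reproduced with the standard library alone. This is a minimal sketch that assumes a bounded `queue.Queue` and a `None` sentinel to signal completion; the `produce` and `consume_all` helpers are hypothetical and do not reflect how `QueryRunner` or `QueryResult` are implemented.

```python
import queue
import threading
from typing import Any, Iterable


def produce(result_queue: queue.Queue, values: Iterable[Any]) -> None:
    for value in values:
        # put() blocks while the queue is full, so a slow consumer
        # slows the producer down instead of letting results pile up.
        result_queue.put(value)
    result_queue.put(None)  # sentinel: no more results (assumed convention)


def consume_all(values: Iterable[Any], maxsize: int = 2) -> list:
    result_queue: queue.Queue = queue.Queue(maxsize=maxsize)
    producer = threading.Thread(
        target=produce, args=(result_queue, values), daemon=True
    )
    producer.start()
    consumed = []
    while True:
        item = result_queue.get()
        if item is None:
            break
        consumed.append(item)
    producer.join()
    return consumed


# A full queue refuses further items until a consumer makes room.
full_queue: queue.Queue = queue.Queue(maxsize=1)
full_queue.put("first")
try:
    full_queue.put("second", timeout=0.05)
    was_blocked = False
except queue.Full:
    was_blocked = True
```

Here `consume_all(range(5))` returns the five values in order even though the queue never holds more than two at a time, and `was_blocked` ends up true because the second `put` times out on the full queue.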