dae.query_variants package

Subpackages

Submodules

dae.query_variants.attribute_queries module

Module for attribute queries

Attribute queries are human-readable strings used to filter variants by enum attribute values.

They apply only to enum types. This module provides three utilities that transform an attribute query string into a callable function, an sqlglot expression, or an sqlglot expression tailored to legacy genotype storages.

Attribute queries have a defined Earley grammar, which Lark uses to build a parse tree.

The grammar divides the input string into expressions, which can be of the following types:

  • literal:

    The most basic type: a plain string that matches an enumeration value. For example, for roles, “mom” matches variants that are present in the mother.

  • compound:

    A special case of literal, used for zygosity. A compound consists of two literals joined by a tilde (“~”), the second being the complementary value. For example, “prb~homozygous” matches variants that are present and homozygous in the proband of the family.

  • neg:

    Match any value that does not match the underlying expression. Negation is written by placing “not” before the expression. Example: “not sib~heterozygous”

  • and_:

    Match when both expressions are true. Example: prb and dad

  • or_:

    Match when at least one of the expressions is true. Example: sib or prb

  • grouping:

    Use parentheses to force an expression to be evaluated first. Example: (sib or prb) and dad

  • any:

    Syntactic sugar for an or_ across many values. Example: “any([dad, prb, sib])” is equivalent to “dad or prb or sib”

  • all:

    Syntactic sugar for an and_ across many values. Example: “all([dad, prb, sib])” is equivalent to “dad and prb and sib”

exception dae.query_variants.attribute_queries.AttributeQueryNotSupported[source]

Bases: Exception

class dae.query_variants.attribute_queries.AttributeQueryTransformer(enum_type: type[Enum], main_aliases: dict[str, str], complementary_type: type[Enum] | None = None)[source]

Bases: Transformer

Base class for attribute query transformers.

Supports two enum types, the second being optional. The second enum type is intended for compounds.

class dae.query_variants.attribute_queries.AttributeQueryTransformerFunction(enum_type: type[Enum], main_aliases: dict[str, str], complementary_type: type[Enum] | None = None)[source]

Bases: AttributeQueryTransformer

Class for transforming attribute Lark trees into function calls.

and_(values: list[Callable[[int, int | None], bool]]) Callable[[int, int | None], bool][source]

Transform and_ into a function that combines two comparisons.

compound(values: list[Token]) Callable[[int, int | None], bool][source]

Transform compounds into a function that compares two values.

grouping(values: list[Callable[[int, int | None], bool]]) Callable[[int, int | None], bool][source]

Transform grouping into a function that evaluates the expression.

literal(values: list[Token]) Callable[[int, int | None], bool][source]

Transform literals into a direct comparison function.

neg(values: list[Callable[[int, int | None], bool]]) Callable[[int, int | None], bool][source]

Transform neg into a function that negates the comparison.

or_(values: list[Callable[[int, int | None], bool]]) Callable[[int, int | None], bool][source]

Transform or_ into a function that combines two comparisons.
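The transformer methods above all return callables of the shape Callable[[int, int | None], bool]. A minimal sketch of how such matchers can be composed is given below. It is not the library's implementation: the Role and Zygosity enums, their bit values, and the assumption that the two integers are bitmasks for the main and complementary enum are all hypothetical and chosen only for illustration.

```python
from collections.abc import Callable
from enum import IntFlag
from typing import Optional


# Hypothetical enums; the real library defines its own types and values.
class Role(IntFlag):
    prb = 1
    sib = 2
    mom = 4
    dad = 8


class Zygosity(IntFlag):
    homozygous = 1
    heterozygous = 2


# Same shape as the callables returned by the transformer methods.
Matcher = Callable[[int, Optional[int]], bool]


def literal(name: str) -> Matcher:
    """Match when the named role bit is set in the main value."""
    mask = Role[name].value
    return lambda main, comp=None: bool(main & mask)


def compound(name: str, comp_name: str) -> Matcher:
    """Match when both the role bit and the complementary bit are set."""
    mask = Role[name].value
    comp_mask = Zygosity[comp_name].value

    def match(main: int, comp: Optional[int] = None) -> bool:
        # Assumption: a missing complementary value never matches.
        return bool(main & mask) and comp is not None and bool(comp & comp_mask)

    return match


def and_(left: Matcher, right: Matcher) -> Matcher:
    return lambda main, comp=None: left(main, comp) and right(main, comp)


def or_(left: Matcher, right: Matcher) -> Matcher:
    return lambda main, comp=None: left(main, comp) or right(main, comp)


def neg(inner: Matcher) -> Matcher:
    return lambda main, comp=None: not inner(main, comp)


# "prb and not dad", built by hand instead of parsed from a string.
prb_not_dad = and_(literal("prb"), neg(literal("dad")))
```

Because every node produces a callable with the same signature, nested expressions such as grouping need no special handling in this sketch; they simply return the inner matcher.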

class dae.query_variants.attribute_queries.AttributeQueryTransformerSQL(main_column: Column, enum_type: type[Enum], main_aliases: dict[str, str], complementary_column: Column | None = None, complementary_type: type[Enum] | None = None)[source]

Bases: AttributeQueryTransformer

Class for transforming attribute queries into an SQLglot expression.

and_(values: list[Expression]) Expression[source]
compound(values: list[Token]) Expression[source]

Convert compounds into a query statement for two enums.

grouping(values: list[Expression]) Expression[source]
literal(values: list[Token]) Expression[source]

Transform literals into a direct comparison expression.

neg(values: list[Expression]) Expression[source]
or_(values: list[Expression]) Expression[source]
class dae.query_variants.attribute_queries.AttributeQueryTransformerSQLLegacy(main_column: Column, enum_type: type[Enum], main_aliases: dict[str, str], complementary_column: Column | None = None, complementary_type: type[Enum] | None = None)[source]

Bases: AttributeQueryTransformerSQL

Class for transforming attribute queries into an SQLglot expression.

Intended for use with legacy Impala schema1 storage.

compound(values: list[Token]) Expression[source]

Convert compounds into a query statement for two enums.

literal(values: list[Token]) Expression[source]

Transform literals into a direct comparison expression.

class dae.query_variants.attribute_queries.CompoundStripTransformer(visit_tokens: bool = True)[source]

Bases: Transformer

compound(values: list[Token]) Tree[source]
class dae.query_variants.attribute_queries.Matcher(*args, **kwargs)[source]

Bases: Protocol

class dae.query_variants.attribute_queries.QueryCompoundAdditionTransformer(compound_value: str)[source]

Bases: Transformer

Transformer that adds a compound value to an attribute query’s literals.

Directly returns a new attribute query string to be used with other transformers.

all(values: list[Tree]) str[source]
and_(values: list[Token]) str[source]
any(values: list[Tree]) str[source]
compound() str[source]
grouping(values: list[Token]) str[source]
literal(values: list[Token]) str[source]
neg(values: list[Token]) str[source]
or_(values: list[Token]) str[source]
class dae.query_variants.attribute_queries.SyntaxSugarTransformer(visit_tokens: bool = True)[source]

Bases: Transformer

Transformer for adapting syntax sugar to regular queries.

all(values: list[Tree]) Tree | None[source]

Transform all into a sequence of and nodes.

any(values: list[Tree]) Tree | None[source]

Transform any into a sequence of or nodes.
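One way to picture the expansion of any and all into chains of binary nodes is a left fold over the list of operands. The sketch below uses plain tuples as a stand-in for Lark trees; the helper names, the left-nested shape, and returning None for an empty list are assumptions, not confirmed behaviour of SyntaxSugarTransformer.

```python
from functools import reduce
from typing import Optional

# Stand-in for a parse tree node: ("literal", name) or (op, left, right).
Node = tuple


def and_node(left: Node, right: Node) -> Node:
    return ("and_", left, right)


def or_node(left: Node, right: Node) -> Node:
    return ("or_", left, right)


def expand_all(values: list) -> Optional[Node]:
    """Fold operands into nested and_ nodes; None when there are no operands."""
    if not values:
        return None
    return reduce(and_node, values)


def expand_any(values: list) -> Optional[Node]:
    """Fold operands into nested or_ nodes; None when there are no operands."""
    if not values:
        return None
    return reduce(or_node, values)


operands = [("literal", "dad"), ("literal", "prb"), ("literal", "sib")]
tree = expand_any(operands)
```

With three operands the fold yields (dad or prb) or sib, which evaluates the same as "dad or prb or sib". A single operand is returned unchanged.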

dae.query_variants.attribute_queries.create_and_node(left: Tree, right: Tree) Tree[source]
dae.query_variants.attribute_queries.create_or_node(left: Tree, right: Tree) Tree[source]
dae.query_variants.attribute_queries.transform_attribute_query_to_function(enum_type: type[Enum], query: str, aliases: dict[str, str] | None = None, *, complementary_type: type[Enum] | None = None, strip_compounds: bool = False) Matcher[source]

Transform attribute query to a callable function.

Can evaluate a query for multiple enum types. Queries need to use proper enum names in order to be valid. A dictionary of aliases can be provided, where the keys are the original values.

dae.query_variants.attribute_queries.transform_attribute_query_to_sql_expression(enum_type: type[Enum], query: str, column: Column, aliases: dict[str, str] | None = None, *, complementary_type: type[Enum] | None = None, complementary_column: Column | None = None, strip_compounds: bool = False) Expression[source]

Transform attribute query to an SQLglot expression.

Can evaluate a query for multiple enum types. Queries need to use proper enum names in order to be valid. A dictionary of aliases can be provided, where the keys are the original values.

dae.query_variants.attribute_queries.transform_attribute_query_to_sql_expression_schema1(enum_type: type[Enum], query: str, column: Column, aliases: dict[str, str] | None = None, *, complementary_type: type[Enum] | None = None, complementary_column: Column | None = None, strip_compounds: bool = False) Expression[source]

Transform attribute query to an SQLglot expression.

Can evaluate a query for multiple enum types. Queries need to use proper enum names in order to be valid. A dictionary of aliases can be provided, where the keys are the original values.

dae.query_variants.attribute_queries.update_attribute_query_with_compounds(query: str, compound_value: str) str[source]

Update an attribute query to match by a secondary compound value.

Used to add zygosity in an already existing attribute query, for example: “dad and mom” -> “dad~homozygous and mom~homozygous”
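The rewrite in the example above can be approximated with a regular expression that appends the compound value to every bare literal. This is only an illustrative sketch: the library performs the rewrite with a Lark transformer, and the add_compound helper, its keyword list, and the choice to leave existing compounds untouched are assumptions.

```python
import re

# Words that belong to the query syntax and must not be treated as literals.
KEYWORDS = {"and", "or", "not", "any", "all"}

# An identifier, optionally already followed by "~complementary".
_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(?:~[A-Za-z_][A-Za-z0-9_]*)?")


def add_compound(query: str, compound_value: str) -> str:
    """Append "~compound_value" to every bare literal in the query."""

    def repl(match: re.Match) -> str:
        token = match.group(0)
        # Assumption: keywords and existing compounds are left as they are.
        if token in KEYWORDS or "~" in token:
            return token
        return f"{token}~{compound_value}"

    return _TOKEN.sub(repl, query)


updated = add_compound("dad and mom", "homozygous")
```

Parentheses, brackets, and commas are not matched by the pattern, so grouping and any/all calls pass through with only their literals rewritten.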

dae.query_variants.base_query_variants module

class dae.query_variants.base_query_variants.QueryVariants(families: FamiliesData)[source]

Bases: ABC

Abstract interface for querying variants.

abstract build_family_variants_query_runner(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, family_ids: list[str] | None = None, person_ids: list[str] | None = None, inheritance: list[str] | None = None, roles_in_parent: str | None = None, roles_in_child: str | None = None, roles: str | None = None, sexes: str | None = None, affected_statuses: str | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, study_filters: list[str] | None = None, tags_query: TagsQuery | None = None, zygosity_query: ZygosityQuery | None = None, **kwargs: Any) QueryRunner | None[source]

Create a query runner for searching family variants.

abstract build_summary_variants_query_runner(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, **kwargs: Any) QueryRunner | None[source]

Create query runner for searching summary variants.

property families: FamiliesData

Return families data.

abstract has_affected_status_queries() bool[source]

Return True if the storage supports affected status queries.

abstract query_summary_variants(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, **kwargs: Any) Generator[SummaryVariant, None, None][source]

Execute the summary variants query and yield summary variants.

abstract query_variants(*, regions: list[Region] | None = None, genes: list[str] | None = None, effect_types: list[str] | None = None, family_ids: list[str] | None = None, person_ids: list[str] | None = None, inheritance: list[str] | None = None, roles_in_parent: str | None = None, roles_in_child: str | None = None, roles: str | None = None, sexes: str | None = None, affected_statuses: str | None = None, variant_type: str | None = None, real_attr_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, ultra_rare: bool | None = None, frequency_filter: list[tuple[str, tuple[float | None, float | None]]] | None = None, return_reference: bool | None = None, return_unknown: bool | None = None, limit: int | None = None, tags_query: TagsQuery | None = None, zygosity_query: ZygosityQuery | None = None, **kwargs: Any) Generator[FamilyVariant, None, None][source]

Execute the family variants query and yield family variants.

tags_to_family_ids(tags_query: TagsQuery | None = None) set[str] | None[source]

Transform a query for tags into a set of family IDs.

static transform_roles_to_single_role_string(roles_in_parent: str | None, roles_in_child: str | None, roles: str | None = None) str | None[source]

Transform the roles arguments into a single roles argument.

Helper method for supporting legacy backends.

class dae.query_variants.base_query_variants.QueryVariantsBase(families: FamiliesData, *, summary_schema: dict[str, Any], variants_blob_serializer: str)[source]

Bases: QueryVariants

Base class for the Schema2 variants query interface.

RUNNER_CLASS: type[QueryRunner]
deserialize_family_variant(sv_data: bytes, fv_data: bytes) FamilyVariant[source]

Deserialize a family variant from summary and family blobs.

deserialize_summary_variant(sv_data: bytes) SummaryVariant[source]

Deserialize a summary variant from a summary blob.

has_affected_status_queries() bool[source]

Schema2 does support affected status queries.

dae.query_variants.query_runners module

class dae.query_variants.query_runners.QueryResult(runners: list[QueryRunner], limit: int | None = -1)[source]

Bases: object

Run a list of queries in the background.

The results of the queries are enqueued on result_queue.

CHECK_VERBOSITY = 20
close() None[source]

Gracefully close and dispose of resources.

get(timeout: float = 0.0) Any[source]

Pop the next entry from the queue.

Return None if the queue is still empty after timeout seconds.

is_done() bool[source]

Check if the query result is done.

start() None[source]
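The behaviour documented for get (return the next entry, or None if the queue is still empty after timeout seconds) can be sketched with the standard library queue module. The get_or_none helper below is hypothetical and only mirrors the documented contract; it is not the body of QueryResult.get.

```python
import queue
from typing import Any


def get_or_none(result_queue: queue.Queue, timeout: float = 0.0) -> Any:
    """Pop the next entry, or return None if nothing arrives within timeout."""
    try:
        if timeout <= 0:
            # Non-blocking read when no waiting time is requested.
            return result_queue.get_nowait()
        return result_queue.get(timeout=timeout)
    except queue.Empty:
        return None


results: queue.Queue = queue.Queue()
first = get_or_none(results, timeout=0.01)  # queue is empty at this point
results.put("variant-1")
second = get_or_none(results)
```

A None return in this sketch only says that nothing arrived in time, so a caller would presumably combine it with a check such as is_done() before concluding that the results are exhausted.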
class dae.query_variants.query_runners.QueryRunner(**kwargs: Any)[source]

Bases: ABC

Run a query in the background using the provided executor.

adapt(adapter_func: Callable[[Any], Any]) None[source]
close() None[source]

Close query runner.

is_closed() bool[source]
is_done() bool[source]
is_started() bool[source]
put_value_in_result_queue(val: Any) None[source]

Put a value in the result queue.

The result queue is blocking, so the call waits until there is space for the new value. This applies backpressure to the QueryRunners.

property result_queue: Queue | None
abstract run() None[source]
set_result_queue(result_queue: Queue) None[source]
set_study_id(study_id: str) None[source]
start(executor: Executor) None[source]
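The backpressure described for put_value_in_result_queue can be illustrated with a bounded queue.Queue: a producer that calls put blocks whenever the queue is full and resumes only as the consumer drains it. The sketch below uses hypothetical names (produce, run_demo) and a plain thread instead of the library's executor.

```python
import queue
import threading


def produce(result_queue: queue.Queue, values) -> None:
    """Push values one by one; put() blocks while the queue is full."""
    for value in values:
        result_queue.put(value)


def run_demo(n_items: int = 10, maxsize: int = 2):
    """Run one producer against a bounded queue and drain it in the caller."""
    result_queue: queue.Queue = queue.Queue(maxsize=maxsize)
    producer = threading.Thread(target=produce, args=(result_queue, range(n_items)))
    producer.start()

    received = []
    max_seen = 0
    for _ in range(n_items):
        # The queue never holds more than maxsize entries at once.
        max_seen = max(max_seen, result_queue.qsize())
        received.append(result_queue.get(timeout=5))

    producer.join()
    return received, max_seen


received, max_seen = run_demo()
```

The producer can never run more than maxsize items ahead of the consumer, which keeps memory bounded when results are produced faster than they are read.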

Module contents