dae.annotation package
Submodules
dae.annotation.annotatable module
- class dae.annotation.annotatable.Annotatable(chrom: str, pos: int, pos_end: int, annotatable_type: Type)[source]
Bases: object
Base class for annotatables used in annotation pipeline.
- class Type(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases: Enum
Defines annotatable types.
- COMPLEX = 5
- LARGE_DELETION = 7
- LARGE_DUPLICATION = 6
- POSITION = 0
- REGION = 1
- SMALL_DELETION = 4
- SMALL_INSERTION = 3
- SUBSTITUTION = 2
- property chrom: str
- property chromosome: str
- property end_position: int
- static from_string(value: str) Annotatable[source]
Deserialize an Annotatable instance from a string value.
- property pos: int
- property pos_end: int
- property position: int
- class dae.annotation.annotatable.CNVAllele(chrom: str, pos_begin: int, pos_end: int, cnv_type: Type)[source]
Bases: Annotatable
Defines copy number variants annotatable.
- class dae.annotation.annotatable.Position(chrom: str, pos: int)[source]
Bases: Annotatable
Annotatable class representing a single position in a chromosome.
- class dae.annotation.annotatable.Region(chrom: str, pos_begin: int, pos_end: int)[source]
Bases: Annotatable
Annotatable class representing a region in a chromosome.
- class dae.annotation.annotatable.VCFAllele(chrom: str, pos: int, ref: str, alt: str)[source]
Bases: Annotatable
Defines small variants annotatable.
- property alt: str
- property alternative: str
- static from_string(value: str) VCFAllele[source]
Deserialize an Annotatable instance from a string value.
- property ref: str
- property reference: str
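The Type values listed above can be illustrated with a small stand-in enum. The sketch below is not the library's code: the class name AnnotatableType and the function classify_small_variant are hypothetical, and the length-based rule for telling small variants apart is an assumption, since this reference does not state how VCFAllele derives its type.

```python
from enum import Enum


class AnnotatableType(Enum):
    """Hypothetical stand-in mirroring the values listed for Annotatable.Type."""

    POSITION = 0
    REGION = 1
    SUBSTITUTION = 2
    SMALL_INSERTION = 3
    SMALL_DELETION = 4
    COMPLEX = 5
    LARGE_DUPLICATION = 6
    LARGE_DELETION = 7


def classify_small_variant(ref: str, alt: str) -> AnnotatableType:
    """Classify a ref/alt pair by allele lengths (assumed rule, not the library's)."""
    if len(ref) == 1 and len(alt) == 1:
        return AnnotatableType.SUBSTITUTION
    # An insertion keeps the reference base as the first base of alt.
    if len(ref) == 1 and len(alt) > 1 and alt[0] == ref:
        return AnnotatableType.SMALL_INSERTION
    # A deletion keeps the alternative base as the first base of ref.
    if len(alt) == 1 and len(ref) > 1 and ref[0] == alt:
        return AnnotatableType.SMALL_DELETION
    return AnnotatableType.COMPLEX
```

Under these assumptions an A to ATG change is classified as a small insertion, while AT to GC falls through to COMPLEX.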
dae.annotation.annotate_columns module
dae.annotation.annotate_doc module
dae.annotation.annotate_utils module
- dae.annotation.annotate_utils.add_common_annotation_arguments(parser: ArgumentParser) None[source]
Add common arguments to an annotation command line parser.
- dae.annotation.annotate_utils.add_input_files_to_task_graph(args: dict, task_graph: TaskGraph) None[source]
- dae.annotation.annotate_utils.build_cli_genomic_context(cli_args: dict[str, Any]) GenomicContext[source]
Helper method to collect necessary objects from the genomic context.
- dae.annotation.annotate_utils.build_output_path(raw_input_path: str, output_path: str | None) str[source]
Build an output filepath for an annotation tool’s output.
- dae.annotation.annotate_utils.cache_pipeline_resources(grr: GenomicResourceRepo, pipeline: AnnotationPipeline) None[source]
Cache resources that the given pipeline will use.
- dae.annotation.annotate_utils.get_grr_from_context(context: GenomicContext) GenomicResourceRepo[source]
Get the genomic resource repository from the genomic context.
- dae.annotation.annotate_utils.get_pipeline_from_context(context: GenomicContext) AnnotationPipeline[source]
Get the annotation pipeline from the genomic context.
- dae.annotation.annotate_utils.handle_default_args(args: dict[str, Any]) dict[str, Any][source]
Handle default arguments for annotation command line tools.
- dae.annotation.annotate_utils.produce_partfile_paths(input_file_path: str, regions: list[Region], work_dir: str) list[str][source]
Produce a list of file paths for output region part files.
dae.annotation.annotate_vcf module
dae.annotation.annotation_config module
- class dae.annotation.annotation_config.AnnotationConfigParser[source]
Bases: object
Parser for annotation configuration.
- static has_wildcard(string: str) bool[source]
Ascertain whether a string contains a valid wildcard.
- static match_labels_query(query: dict[str, str], resource_labels: dict[str, str]) bool[source]
Check if the labels query for a wildcard matches.
- static parse_complete(raw: dict[str, Any], idx: int, grr: GenomicResourceRepo | None = None) list[AnnotatorInfo][source]
Parse a full-form annotation config.
- static parse_minimal(raw: str, idx: int) AnnotatorInfo[source]
Parse a minimal-form annotation config.
- static parse_raw(pipeline_raw_config: list[dict[str, Any]] | RawFullConfig | None, grr: GenomicResourceRepo | None = None) tuple[AnnotationPreamble | None, list[AnnotatorInfo]][source]
Parse raw dictionary annotation pipeline configuration.
- static parse_raw_attribute_config(raw_attribute_config: dict[str, Any]) AttributeInfo[source]
Parse annotation attribute raw configuration.
- static parse_raw_attributes(raw_attributes_config: Any) list[AttributeInfo][source]
Parse annotator pipeline attribute configuration.
- static parse_short(raw: dict[str, Any], idx: int, grr: GenomicResourceRepo | None = None) list[AnnotatorInfo][source]
Parse a short-form annotation config.
- static parse_str(content: str, source_file_name: str | None = None, grr: GenomicResourceRepo | None = None) tuple[AnnotationPreamble | None, list[AnnotatorInfo]][source]
Parse annotation pipeline configuration string.
- static query_resources(annotator_type: str, wildcard: str, grr: GenomicResourceRepo) list[str][source]
Collect resources matching a given query.
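The wildcard and label helpers of the parser can be pictured with a self-contained sketch. Everything here is an assumption made for illustration: the glob-style "*" syntax, the exact-equality label matching, and the plain dictionary used in place of a GenomicResourceRepo are not taken from the library.

```python
from fnmatch import fnmatchcase


def has_wildcard(value: str) -> bool:
    """Assumed rule: a wildcard is any string containing '*'."""
    return "*" in value


def match_labels(query: dict, resource_labels: dict) -> bool:
    """Every queried label must be present with exactly the queried value."""
    return all(resource_labels.get(key) == val for key, val in query.items())


def query_resources(wildcard: str, labels_query: dict, resources: dict) -> list:
    """Return sorted ids that match both the wildcard and the label query.

    `resources` maps a resource id to its labels; it is a hypothetical
    stand-in for a genomic resource repository.
    """
    return sorted(
        resource_id
        for resource_id, labels in resources.items()
        if fnmatchcase(resource_id, wildcard) and match_labels(labels_query, labels)
    )


resources = {
    "scores/hg38/phyloP": {"genome": "hg38"},
    "scores/hg19/phyloP": {"genome": "hg19"},
    "genes/hg38/refseq": {"genome": "hg38"},
}
matched = query_resources("scores/*", {"genome": "hg38"}, resources)
```

An empty label query matches every resource, so the wildcard alone decides the result in that case.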
- class dae.annotation.annotation_config.AnnotationPreamble(summary: 'str', description: 'str', input_reference_genome: 'str', input_reference_genome_res: 'GenomicResource | None', metadata: 'dict[str, Any]')[source]
Bases: object
- description: str
- input_reference_genome: str
- input_reference_genome_res: GenomicResource | None
- metadata: dict[str, Any]
- summary: str
- class dae.annotation.annotation_config.AnnotatorInfo(_type: str, attributes: list[AttributeInfo], parameters: ParamsUsageMonitor | dict[str, Any], documentation: str = '', resources: list[GenomicResource] | None = None, annotator_id: str = 'N/A')[source]
Bases: object
Defines annotator configuration.
- annotator_id: str
- attributes: list[AttributeInfo]
- documentation: str = ''
- parameters: ParamsUsageMonitor
- resources: list[GenomicResource]
- type: str
- class dae.annotation.annotation_config.AttributeInfo(name: str, source: str, *, internal: bool | None, parameters: ParamsUsageMonitor | dict[str, Any], _type: str = 'str', description: str = '', documentation: str | None = None)[source]
Bases: object
Defines annotation attribute configuration.
- static create(source: str, name: str | None = None, *, internal: bool = False) AttributeInfo[source]
Create an AttributeInfo instance.
- description: str = ''
- property documentation: str
- internal: bool | None
- name: str
- parameters: ParamsUsageMonitor
- source: str
- type: str = 'str'
- class dae.annotation.annotation_config.ParamsUsageMonitor(data: dict[str, Any])[source]
Bases: Mapping
Class to monitor usage of annotator parameters.
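One way to monitor parameter usage with a Mapping is to record every key that is read. The sketch below shows that idea only; the class name UsageTrackingParams and the method unused_keys are hypothetical and are not the interface of ParamsUsageMonitor.

```python
from collections.abc import Iterator, Mapping
from typing import Any


class UsageTrackingParams(Mapping):
    """Read-only mapping that remembers which keys were looked up."""

    def __init__(self, data: dict) -> None:
        self._data = dict(data)
        self._used = set()

    def __getitem__(self, key: str) -> Any:
        value = self._data[key]  # raises KeyError before anything is recorded
        self._used.add(key)
        return value

    def __iter__(self) -> Iterator:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)

    def unused_keys(self) -> set:
        """Keys supplied in the configuration but never read."""
        return set(self._data) - self._used


params = UsageTrackingParams({"resource_id": "some/resource", "typo_param": 1})
resource_id = params["resource_id"]
leftover = params.unused_keys()
```

Because Mapping implements get() and the in operator on top of __getitem__, those accesses also count as usage in this sketch. A check for unused parameters could then report whatever unused_keys() returns.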
- class dae.annotation.annotation_config.RawFullConfig[source]
Bases: TypedDict
- annotators: list[dict[str, Any]]
- preamble: RawPreamble
dae.annotation.annotation_factory module
Factory for creation of annotation pipeline.
- dae.annotation.annotation_factory.build_annotation_pipeline(config: list[dict[str, Any]] | RawFullConfig, grr: GenomicResourceRepo, *, allow_repeated_attributes: bool = False, work_dir: Path | None = None) AnnotationPipeline[source]
Build an annotation pipeline.
- dae.annotation.annotation_factory.check_for_repeated_attributes_in_annotator(annotator_config: AnnotatorInfo) None[source]
Check for repeated attributes in annotator configuration.
- dae.annotation.annotation_factory.check_for_repeated_attributes_in_pipeline(pipeline: AnnotationPipeline, *, allow_repeated_attributes: bool = False) None[source]
Check for repeated attributes in pipeline configuration.
- dae.annotation.annotation_factory.check_for_unused_parameters(info: AnnotatorInfo) None[source]
Check annotator configuration for unused parameters.
- dae.annotation.annotation_factory.get_annotator_factory(annotator_type: str) Callable[[AnnotationPipeline, AnnotatorInfo], Annotator][source]
Find and return a factory function for creation of an annotator type.
If the specified annotator type is not found, this function raises a ValueError exception.
- Returns:
the annotator factory for the specified annotator type.
- Raises:
ValueError – when no annotator factory can be found for the specified annotator type.
- dae.annotation.annotation_factory.get_available_annotator_types() list[str][source]
Return the list of all registered annotator factory types.
- dae.annotation.annotation_factory.load_pipeline_from_file(raw_path: str, grr: GenomicResourceRepo, *, allow_repeated_attributes: bool = False, work_dir: Path | None = None) AnnotationPipeline[source]
Load an annotation pipeline from a configuration file.
- dae.annotation.annotation_factory.load_pipeline_from_grr(grr: GenomicResourceRepo, resource: GenomicResource) AnnotationPipeline[source]
Load a pipeline from a grr and a resource.
- dae.annotation.annotation_factory.load_pipeline_from_yaml(raw: str, grr: GenomicResourceRepo, *, allow_repeated_attributes: bool = False, work_dir: Path | None = None) AnnotationPipeline[source]
Load an annotation pipeline from a YAML-formatted string.
- dae.annotation.annotation_factory.register_annotator_factory(annotator_type: str, factory: Callable[[AnnotationPipeline, AnnotatorInfo], Annotator]) None[source]
Register additional annotator factory.
By default all genotype storage factories should be registered at the [dae.genotype_storage.factories] extension point. All registered factories are loaded automatically. This function should be used if you want to bypass the extension point mechanism and register an additional genotype storage factory programmatically.
- dae.annotation.annotation_factory.resolve_repeated_attributes(pipeline: AnnotationPipeline, repeated_attributes: set[str]) None[source]
Resolve repeated attributes in pipeline configuration via renaming.
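The register, look-up and list functions in this module follow a registry pattern. A minimal stand-alone sketch of that pattern is given below; the names _FACTORIES, register_factory, get_factory and available_types are hypothetical, and the factories here take a plain dict rather than an AnnotationPipeline and AnnotatorInfo.

```python
from typing import Callable

# Hypothetical module-level registry mapping an annotator type to its factory.
_FACTORIES: dict = {}


def register_factory(annotator_type: str, factory: Callable) -> None:
    """Register (or replace) the factory for an annotator type."""
    _FACTORIES[annotator_type] = factory


def get_factory(annotator_type: str) -> Callable:
    """Return the registered factory, raising ValueError when it is unknown."""
    try:
        return _FACTORIES[annotator_type]
    except KeyError:
        raise ValueError(
            f"unsupported annotator type: {annotator_type}") from None


def available_types() -> list:
    """Return all registered annotator types in sorted order."""
    return sorted(_FACTORIES)


register_factory("hello_world", lambda config: {"type": "hello_world", **config})
annotator = get_factory("hello_world")({"greeting": "hi"})
```

Translating the KeyError into a ValueError keeps the registry's internal storage out of the caller's error handling.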
dae.annotation.annotation_genomic_context_cli module
Command line helpers for constructing annotation pipelines.
The utilities in this module complement the generic genomic context
providers by supplying annotation pipeline objects. They enable CLI tools to
load pipeline definitions from the file system or from genomic resource
repositories, and to make the resulting AnnotationPipeline
instances available through the shared genomic context mechanism.
- class dae.annotation.annotation_genomic_context_cli.CLIAnnotationContextProvider[source]
Bases: GenomicContextProvider
Expose annotation pipeline configuration through CLI options.
The provider allows users to point to an annotation pipeline definition (either as a file path or a genomic resource identifier) and optionally tweak pipeline behaviour via command-line flags. When invoked without a pipeline argument, the provider abstains from creating a context so that other providers can supply their default pipelines.
- add_argparser_arguments(parser: ArgumentParser) None[source]
Register arguments that describe the annotation pipeline source.
Parameters
- parser
The parser that should receive the provider specific CLI options.
- init(**kwargs: Any) GenomicContext | None[source]
Materialise a genomic context containing an annotation pipeline.
Parameters
- **kwargs
Keyword arguments parsed from the command line. The provider looks at pipeline, allow_repeated_attributes, and work_dir.
Returns
- GenomicContext | None
A context containing the annotation pipeline, or None when no pipeline could be created (for example when the pipeline argument is omitted).
- dae.annotation.annotation_genomic_context_cli.get_context_pipeline(context: GenomicContext) AnnotationPipeline | None[source]
Extract a validated AnnotationPipeline from context.
Parameters
- context
The genomic context from which to retrieve the pipeline object.
Returns
- AnnotationPipeline | None
The pipeline instance or None when the context does not expose a pipeline.
Raises
- TypeError
If the context entry is present but does not contain the expected AnnotationPipeline type.
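The contract described for get_context_pipeline (return the pipeline, return None when it is absent, raise TypeError on a wrong type) can be sketched without the library. In this illustration the context is a plain dict, the key name "annotation_pipeline" is an assumption, and Pipeline is a hypothetical placeholder class.

```python
from typing import Any, Optional


class Pipeline:
    """Hypothetical placeholder for an annotation pipeline object."""


def get_pipeline(context: dict) -> Optional[Pipeline]:
    """Return the pipeline stored in the context, validating its type."""
    entry: Any = context.get("annotation_pipeline")  # assumed key name
    if entry is None:
        return None
    if not isinstance(entry, Pipeline):
        raise TypeError(
            f"expected a Pipeline in the context, got {type(entry).__name__}")
    return entry


pipeline = Pipeline()
found = get_pipeline({"annotation_pipeline": pipeline})
missing = get_pipeline({})
```

Distinguishing "absent" from "present but wrong" lets callers fall back to another provider in the first case and fail fast in the second.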
dae.annotation.annotation_pipeline module
Provides annotation pipeline class.
- class dae.annotation.annotation_pipeline.AnnotationPipeline(repository: GenomicResourceRepo)[source]
Bases: object
Provides annotation pipeline abstraction.
- annotate(annotatable: Annotatable | None, context: dict | None = None) dict[source]
Apply all annotators to an annotatable.
- batch_annotate(annotatables: Sequence[Annotatable | None], contexts: list[dict] | None = None, batch_work_dir: str | None = None) list[dict][source]
Apply all annotators to a list of annotatables.
- get_annotator_by_attribute_info(attribute_info: AttributeInfo) Annotator | None[source]
- get_attribute_info(attribute_name: str) AttributeInfo | None[source]
- get_attributes() list[AttributeInfo][source]
- get_info() list[AnnotatorInfo][source]
- open() AnnotationPipeline[source]
Open all annotators in the pipeline and mark it as open.
- class dae.annotation.annotation_pipeline.Annotator(pipeline: AnnotationPipeline | None, info: AnnotatorInfo)[source]
Bases: ABC
Annotator provides a set of attributes for a given Annotatable.
- abstract annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- property attributes: list[AttributeInfo]
- batch_annotate(annotatables: Sequence[Annotatable | None], contexts: list[dict[str, Any]], batch_work_dir: str | None = None) Iterable[dict[str, Any]][source]
- get_info() AnnotatorInfo[source]
- property resource_ids: set[str]
- property resources: list[GenomicResource]
- property used_context_attributes: tuple[str, ...]
- class dae.annotation.annotation_pipeline.AnnotatorDecorator(child: Annotator)[source]
Bases: Annotator
Defines annotator decorator base class.
- class dae.annotation.annotation_pipeline.InputAnnotableAnnotatorDecorator(child: Annotator)[source]
Bases: AnnotatorDecorator
Defines annotator decorator to use input annotatable if defined.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- property used_context_attributes: tuple[str, ...]
- class dae.annotation.annotation_pipeline.ReannotationPipeline(pipeline_new: AnnotationPipeline, pipeline_previous: AnnotationPipeline, *, full_reannotation: bool = False)[source]
Bases: AnnotationPipeline
Provides functionality for reannotation.
- get_attributes() list[AttributeInfo][source]
- class dae.annotation.annotation_pipeline.ValueTransformAnnotatorDecorator(child: Annotator, value_transformers: dict[str, Callable[[Any], Any]])[source]
Bases: AnnotatorDecorator
Define value transformer annotator decorator.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
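The decorator classes above wrap a child annotator and adjust its behaviour. The sketch below illustrates the value-transform variant with stand-in classes; ConstantAnnotator and ValueTransformDecorator are hypothetical names, and the real decorator may treat missing or None values differently.

```python
from typing import Any, Callable, Optional


class ConstantAnnotator:
    """Hypothetical child annotator that returns fixed attributes."""

    def annotate(self, annotatable: Optional[Any], context: dict) -> dict:
        return {"gene_list": ["CHD8", "SHANK3"], "score": 0.1234}


class ValueTransformDecorator:
    """Apply a per-attribute transformer to the child's results."""

    def __init__(self, child: Any, value_transformers: dict) -> None:
        self.child = child
        self.value_transformers = value_transformers

    def annotate(self, annotatable: Optional[Any], context: dict) -> dict:
        result = self.child.annotate(annotatable, context)
        # Attributes without a registered transformer pass through unchanged.
        return {
            name: self.value_transformers[name](value)
            if name in self.value_transformers else value
            for name, value in result.items()
        }


transformers: dict = {
    "gene_list": ",".join,
    "score": lambda value: round(value, 2),
}
decorated = ValueTransformDecorator(ConstantAnnotator(), transformers)
attributes = decorated.annotate(None, {})
```

Since the decorator exposes the same annotate signature as its child, decorators of this kind can be stacked without the pipeline needing to know about them.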
dae.annotation.annotator_base module
Provides base class for annotators.
- class dae.annotation.annotator_base.AnnotatorBase(pipeline: AnnotationPipeline | None, info: AnnotatorInfo, attribute_descriptions: Mapping[str, AttributeDesc | tuple])[source]
Bases: Annotator
Base implementation of the Annotator class.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- batch_annotate(annotatables: Sequence[Annotatable | None], contexts: list[dict[str, Any]], batch_work_dir: str | None = None) list[dict[str, Any]][source]
- class dae.annotation.annotator_base.AttributeDesc(name: str, type: str, description: str, default: bool = True, internal: bool = False, params: dict[str, ~typing.Any] = <factory>)[source]
Bases: object
Holds default attribute configuration for annotators.
- default: bool = True
- description: str
- internal: bool = False
- name: str
- params: dict[str, Any]
- type: str
dae.annotation.chromosome_annotator module
- class dae.annotation.chromosome_annotator.ChromosomeAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: AnnotatorBase
Annotator for adjusting chromosome values.
- dae.annotation.chromosome_annotator.build_chromosome_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
dae.annotation.cnv_collection_annotator module
- class dae.annotation.cnv_collection_annotator.CnvCollectionAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: Annotator
Simple effect annotator class.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- dae.annotation.cnv_collection_annotator.build_cnv_collection_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
dae.annotation.debug_annotator module
- class dae.annotation.debug_annotator.HelloWorldAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: Annotator
Defines example annotator.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- dae.annotation.debug_annotator.build_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
Create an example hello world annotator.
dae.annotation.docker_annotator module
- class dae.annotation.docker_annotator.DockerAnnotator(pipeline: AnnotationPipeline | None, info: AnnotatorInfo)[source]
Bases: AnnotatorBase
Base class for annotators that use docker containers.
dae.annotation.effect_annotator module
- class dae.annotation.effect_annotator.EffectAnnotatorAdapter(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: AnnotatorBase
Adapts effect annotator to be used in annotation infrastructure.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- dae.annotation.effect_annotator.build_effect_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
dae.annotation.gene_score_annotator module
Module containing the gene score annotator.
- class dae.annotation.gene_score_annotator.GeneScoreAnnotator(pipeline: AnnotationPipeline | None, info: AnnotatorInfo, gene_score_resource: GenomicResource, input_gene_list: str)[source]
Bases: Annotator
Gene score annotator class.
- DEFAULT_AGGREGATOR_TYPE = 'dict'
- aggregate_gene_values(score_id: str, gene_symbols: list[str], aggregator_type: str) Any[source]
Aggregate gene score values.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- property used_context_attributes: tuple[str, ...]
- dae.annotation.gene_score_annotator.build_gene_score_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
Create a gene score annotator.
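The default aggregator type 'dict' suggests that gene score values are collected per gene symbol. The sketch below shows one plausible reading of that behaviour; the function gene_values, its plain-dictionary score table, and the extra "max" aggregator are assumptions for illustration, not the library's implementation.

```python
from typing import Any


def gene_values(gene_scores: dict, gene_symbols: list,
                aggregator_type: str = "dict") -> Any:
    """Aggregate the scores of the given genes.

    `gene_scores` maps a gene symbol to a numeric score and stands in for a
    gene score resource. Genes without a score are skipped.
    """
    values = {
        symbol: gene_scores[symbol]
        for symbol in gene_symbols if symbol in gene_scores
    }
    if aggregator_type == "dict":
        return values
    if aggregator_type == "max":  # illustrative extra aggregator
        return max(values.values()) if values else None
    raise ValueError(f"unsupported aggregator type: {aggregator_type}")


scores = {"CHD8": 0.9, "SHANK3": 0.4}
per_gene = gene_values(scores, ["CHD8", "UNKNOWN"])
```

Returning a dictionary keeps the link between each gene and its score, which a single aggregated number would lose.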
dae.annotation.gene_set_annotator module
- class dae.annotation.gene_set_annotator.GeneSetAnnotator(pipeline: AnnotationPipeline | None, info: AnnotatorInfo, gene_set_resource: GenomicResource, input_gene_list: str)[source]
Bases: AnnotatorBase
Gene set annotator class.
- property used_context_attributes: tuple[str, ...]
- dae.annotation.gene_set_annotator.build_gene_set_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
Create a gene set annotator.
dae.annotation.liftover_annotator module
Provides a lift over annotator and helpers.
- class dae.annotation.liftover_annotator.AbstractLiftoverAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo, chain: LiftoverChain, source_genome: ReferenceGenome, target_genome: ReferenceGenome)[source]
Bases: AnnotatorBase
Liftover annotator class.
- liftover_cnv(cnv_allele: Annotatable) Annotatable | None[source]
Liftover CNV allele annotatable.
- liftover_position(position: Annotatable) Annotatable | None[source]
Liftover position annotatable.
- liftover_region(region: Annotatable) Annotatable | None[source]
Liftover region annotatable.
- class dae.annotation.liftover_annotator.BasicLiftoverAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo, chain: LiftoverChain, source_genome: ReferenceGenome, target_genome: ReferenceGenome)[source]
Bases: AbstractLiftoverAnnotator
Basic liftover annotator class.
- class dae.annotation.liftover_annotator.BcfLiftoverAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo, chain: LiftoverChain, source_genome: ReferenceGenome, target_genome: ReferenceGenome)[source]
Bases: AbstractLiftoverAnnotator
BCF tools liftover re-implementation annotator class.
- class dae.annotation.liftover_annotator.LiftoverFunction(*args, **kwargs)[source]
Bases: Protocol
Protocol for liftover function.
- dae.annotation.liftover_annotator.basic_liftover_allele(chrom: str, pos: int, ref: str, alt: str, liftover_chain: LiftoverChain, *, source_genome: ReferenceGenome, target_genome: ReferenceGenome) tuple[str, int, str, str] | None[source]
Basic liftover of an allele.
- dae.annotation.liftover_annotator.basic_liftover_variant(chrom: str, pos: int, ref: str, alts: list[str], liftover_chain: LiftoverChain, *, source_genome: ReferenceGenome, target_genome: ReferenceGenome) tuple[str, int, str, list[str]] | None[source]
Basic liftover variant utility function.
- dae.annotation.liftover_annotator.bcf_liftover_allele(chrom: str, pos: int, ref: str, alt: str, liftover_chain: LiftoverChain, *, source_genome: ReferenceGenome, target_genome: ReferenceGenome) tuple[str, int, str, str] | None[source]
Liftover a variant.
- dae.annotation.liftover_annotator.bcf_liftover_variant(chrom: str, pos: int, ref: str, alts: list[str], liftover_chain: LiftoverChain, *, source_genome: ReferenceGenome, target_genome: ReferenceGenome) tuple[str, int, str, list[str]] | None[source]
BCF liftover variant utility function.
- dae.annotation.liftover_annotator.build_liftover_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
Create a liftover annotator.
dae.annotation.normalize_allele_annotator module
Provides normalize allele annotator and helpers.
- class dae.annotation.normalize_allele_annotator.NormalizeAlleleAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: AnnotatorBase
Annotator to normalize VCF alleles.
- dae.annotation.normalize_allele_annotator.build_normalize_allele_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
- dae.annotation.normalize_allele_annotator.normalize_allele(allele: VCFAllele, genome: ReferenceGenome) VCFAllele[source]
Normalize an allele.
Uses the algorithm described at https://genome.sph.umich.edu/wiki/Variant_Normalization
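The normalization procedure described on the referenced page (trim shared trailing bases, extend to the left when an allele becomes empty, then trim shared leading bases) can be sketched on plain strings. The function normalize below is a hypothetical illustration that takes the reference sequence as a Python string with 1-based positions; it is not the library's normalize_allele, which works with VCFAllele and ReferenceGenome objects.

```python
def normalize(pos: int, ref: str, alt: str, genome: str) -> tuple:
    """Left-align and trim an allele; `pos` is 1-based into `genome`."""
    if ref == alt:
        return pos, ref, alt  # nothing to normalize for a non-variant

    changed = True
    while changed:
        changed = False
        # Step 1: drop a shared trailing base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # Step 2: if an allele became empty, extend both one base to the left.
        if not ref or not alt:
            if pos <= 1:
                raise ValueError("cannot extend left of the sequence start")
            pos -= 1
            base = genome[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True

    # Step 3: drop shared leading bases while both alleles keep one base.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


GENOME = "GGGCACACACAGGG"
# A CA deletion written at position 8 left-aligns to position 3.
normalized = normalize(8, "CAC", "C", GENOME)  # (3, "GCA", "G")
```

The sketch raises an error when left extension would run past the start of the sequence; how the library handles that edge case is not stated in this reference.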
dae.annotation.processing_pipeline module
- class dae.annotation.processing_pipeline.Annotation(annotatable: ~dae.annotation.annotatable.Annotatable | None, context: dict[str, ~typing.Any] = <factory>)[source]
Bases: object
A pair of an annotatable and its relevant context.
The context can hold any key/value pair relevant to the annotatable and is typically used to store the results of annotators.
- annotatable: Annotatable | None
- context: dict[str, Any]
- class dae.annotation.processing_pipeline.AnnotationPipelineAnnotatablesBatchFilter(annotation_pipeline: AnnotationPipeline)[source]
Bases: AnnotationsWithSourceBatchFilter, AnnotationPipelineContextManager
Filter that annotates an AnnotationsWithSource batch using a pipeline.
- class dae.annotation.processing_pipeline.AnnotationPipelineAnnotatablesFilter(annotation_pipeline: AnnotationPipeline)[source]
Bases: AnnotationsWithSourceFilter, AnnotationPipelineContextManager
Filter that annotates an AnnotationsWithSource object using a pipeline.
- class dae.annotation.processing_pipeline.AnnotationPipelineContextManager(annotation_pipeline: AnnotationPipeline)[source]
Bases: AbstractContextManager
A context manager for annotation pipelines.
- class dae.annotation.processing_pipeline.AnnotationsWithSource(source: Any, annotations: list[Annotation])[source]
Bases: object
A pair of a list of Annotation instances and their source.
The source is typically a variant read from some format, with the ‘annotations’ attribute corresponding to its alleles.
- annotations: list[Annotation]
- source: Any
- class dae.annotation.processing_pipeline.AnnotationsWithSourceBatchFilter[source]
Bases: Filter
Base class for filters that work on AnnotationsWithSource batches.
- filter(data: Sequence[AnnotationsWithSource]) Sequence[AnnotationsWithSource][source]
Filter a batch of AnnotationsWithSource objects.
- class dae.annotation.processing_pipeline.AnnotationsWithSourceFilter[source]
Bases: Filter
Base class for filters that work on AnnotationsWithSource objects.
- filter(data: AnnotationsWithSource) AnnotationsWithSource[source]
Filter a single AnnotationsWithSource object.
- class dae.annotation.processing_pipeline.DeleteAttributesFromAWSBatchFilter(attributes_to_remove: Sequence[str])[source]
Bases: Filter
Filter to remove items from AnnotationsWithSource (AWS) batches. Works in-place.
- filter(data: Sequence[AnnotationsWithSource]) Sequence[AnnotationsWithSource][source]
- class dae.annotation.processing_pipeline.DeleteAttributesFromAWSFilter(attributes_to_remove: Sequence[str])[source]
Bases: Filter
Filter to remove items from AnnotationsWithSource (AWS) objects. Works in-place.
- filter(data: AnnotationsWithSource) AnnotationsWithSource[source]
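The data classes and the in-place delete filters of this module can be pictured with simplified stand-ins. In the sketch below the dataclass fields follow the signatures listed above, while DeleteAttributesFilter is a hypothetical, simplified counterpart of the delete filters and does not derive from the library's Filter base class.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Sequence


@dataclass
class Annotation:
    """An annotatable together with the context that collects its attributes."""

    annotatable: Optional[Any]
    context: dict = field(default_factory=dict)


@dataclass
class AnnotationsWithSource:
    """A source record (for example a variant) and one Annotation per allele."""

    source: Any
    annotations: list


class DeleteAttributesFilter:
    """Remove the named attributes from every annotation context, in place."""

    def __init__(self, attributes_to_remove: Sequence) -> None:
        self.attributes_to_remove = tuple(attributes_to_remove)

    def filter(self, data: AnnotationsWithSource) -> AnnotationsWithSource:
        for annotation in data.annotations:
            for name in self.attributes_to_remove:
                annotation.context.pop(name, None)  # ignore missing names
        return data


aws = AnnotationsWithSource(
    source="variant-1",
    annotations=[Annotation(None, {"score": 0.5, "tmp": 1}), Annotation(None)],
)
result = DeleteAttributesFilter(["tmp"]).filter(aws)
```

Because the filter mutates the contexts it receives, the returned object is the same instance that was passed in.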
dae.annotation.record_to_annotatable module
- class dae.annotation.record_to_annotatable.CSHLAlleleRecordToAnnotatable(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: RecordToAnnotable
Transform a CSHL variant record into a VCF allele annotatable.
- build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- class dae.annotation.record_to_annotatable.DaeAlleleRecordToAnnotatable(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: RecordToAnnotable
Transform a CSHL variant record into a VCF allele annotatable.
- build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- class dae.annotation.record_to_annotatable.RecordToAnnotable(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: ABC
Base class for record to annotatable transformation.
- abstract build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- class dae.annotation.record_to_annotatable.RecordToCNVAllele(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: RecordToAnnotable
Transform a columns record into a CNV allele annotatable.
- build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- class dae.annotation.record_to_annotatable.RecordToPosition(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: RecordToAnnotable
- build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- class dae.annotation.record_to_annotatable.RecordToRegion(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: RecordToAnnotable
- build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- class dae.annotation.record_to_annotatable.RecordToVcfAllele(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: RecordToAnnotable
- build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- class dae.annotation.record_to_annotatable.VcfLikeRecordToVcfAllele(columns: tuple, ref_genome: ReferenceGenome | None)[source]
Bases: RecordToAnnotable
Transform a columns record into a VCF allele annotatable.
- build(record: dict[str, str]) Annotatable[source]
Constructs an annotatable from a record.
- dae.annotation.record_to_annotatable.add_record_to_annotable_arguments(parser: ArgumentParser) None[source]
- dae.annotation.record_to_annotatable.build_record_to_annotatable(parameters: dict[str, str], available_columns: set[str], ref_genome: ReferenceGenome | None = None) RecordToAnnotable[source]
Transform a variant record into an annotatable.
dae.annotation.score_annotator module
This contains the implementation of the three score annotators.
The genomic score annotators defined are position_score, np_score, and allele_score.
- class dae.annotation.score_annotator.AlleleScoreAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: GenomicScoreAnnotatorBase
This class implements the allele_score annotator.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- build_score_aggregator_documentation(attr_info: AttributeInfo) list[str][source]
Collect score aggregator documentation.
- class dae.annotation.score_annotator.GenomicScoreAnnotatorBase(pipeline: AnnotationPipeline, info: AnnotatorInfo, score: GenomicScore)[source]
Bases: Annotator
Genomic score base annotator.
- add_score_aggregator_documentation(attribute_info: AttributeInfo, aggregator: str, attribute_conf_agg: str | None) None[source]
Collect score aggregator documentation.
- abstract build_score_aggregator_documentation(attr_info: AttributeInfo) list[str][source]
Construct score aggregator documentation.
- class dae.annotation.score_annotator.PositionScoreAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: GenomicScoreAnnotatorBase
This class implements the position_score annotator.
The position_score annotator requires the resource_id parameter, whose value must be the id of a genomic resource of type position_score.
The position_score resource provides a set of scores (see …) that the position_score annotator uses as attributes to assign to the annotatable.
The position_score annotator recognizes one attribute-level parameter called position_aggregator that controls how the position scores are aggregated for annotatables that refer to a region of the reference genome.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- build_score_aggregator_documentation(attr_info: AttributeInfo) list[str][source]
Collect score aggregator documentation.
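The role of position_aggregator, combining per-position scores when an annotatable spans a region, can be shown with a small sketch. The aggregator names "mean", "max" and "min", the function aggregate_region, and the dictionary of per-position scores are assumptions made for illustration; the aggregators actually supported by the library are not listed in this reference.

```python
from statistics import mean
from typing import Optional

# Hypothetical aggregator table; the real set of names may differ.
AGGREGATORS = {"mean": mean, "max": max, "min": min}


def aggregate_region(scores: dict, begin: int, end: int,
                     aggregator: str = "mean") -> Optional[float]:
    """Aggregate the scores of all scored positions in [begin, end].

    `scores` maps a 1-based position to its score; positions without a
    score are skipped, and None is returned when nothing is scored.
    """
    values = [scores[pos] for pos in range(begin, end + 1) if pos in scores]
    if not values:
        return None
    return AGGREGATORS[aggregator](values)


position_scores = {100: 1.0, 101: 3.0, 103: 5.0}
region_mean = aggregate_region(position_scores, 100, 103, "mean")
region_max = aggregate_region(position_scores, 100, 103, "max")
```

For a single-position annotatable the region collapses to one value, so the choice of aggregator only matters for regions and multi-base alleles.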
- dae.annotation.score_annotator.build_allele_score_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
- dae.annotation.score_annotator.build_np_score_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
- dae.annotation.score_annotator.build_position_score_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
- dae.annotation.score_annotator.get_genomic_resource(pipeline: AnnotationPipeline, info: AnnotatorInfo, resource_types: set[str]) GenomicResource[source]
Return genomic score resource used for given genomic score annotator.
dae.annotation.simple_effect_annotator module
- class dae.annotation.simple_effect_annotator.SimpleEffect(effect_type: str, transcript_id: str, gene: str)[source]
Bases: object
- effect_type: str
- gene: str
- transcript_id: str
- class dae.annotation.simple_effect_annotator.SimpleEffectAnnotator(pipeline: AnnotationPipeline, info: AnnotatorInfo)[source]
Bases: AnnotatorBase
Simple effect annotator class.
- annotate(annotatable: Annotatable | None, context: dict[str, Any]) dict[str, Any][source]
Produce annotation attributes for an annotatable.
- call_region(chrom: str, beg: int, end: int, tx: TranscriptModel, *, func_name: str, classification: str) SimpleEffect | None[source]
Call a region with a specific classification.
- cds_intron_regions(transcript: TranscriptModel) list[Region][source]
Return the regions of the transcript classified as CDS introns.
- cds_regions(transcript: TranscriptModel) Sequence[Region][source]
Return the regions of the transcript classified as coding.
- noncoding_regions(transcript: TranscriptModel) list[Region][source]
Return the regions of the transcript classified as noncoding.
- peripheral_regions(transcript: TranscriptModel) list[Region][source]
Return the regions of the transcript classified as peripheral.
- run_annotate(chrom: str, beg: int, end: int) dict[str, set[SimpleEffect]][source]
Return classification with a set of affected genes.
- dae.annotation.simple_effect_annotator.build_simple_effect_annotator(pipeline: AnnotationPipeline, info: AnnotatorInfo) Annotator[source]
dae.annotation.utils module
- dae.annotation.utils.find_annotator_gene_models(info: AnnotatorInfo, grr: GenomicResourceRepo) GeneModels[source]
Get gene models from the annotator info or genomic context.
- dae.annotation.utils.find_annotator_reference_genome(info: AnnotatorInfo, gene_models: GeneModels, pipeline: AnnotationPipeline, grr: GenomicResourceRepo) ReferenceGenome[source]
Get reference genome from the annotator info or genomic context.