Getting started on the web

The GAIn Web Annotation page (https://gain.iossifovlab.com/) provides a simple and interactive interface for annotating genetic variants, positions, or regions, collectively referred to as annotatables. The page is organized into two main areas. On the left, users can create, edit, and manage annotation pipelines, either by selecting a saved pipeline or by building a new one. This pipeline editor defines which resources and annotators are applied during annotation. On the right, users can enter an annotatable, submit it for processing, and view the resulting annotations. Together, these two panels allow users to configure an annotation workflow and immediately apply it to annotatables of interest.

_images/GAInWeb1b.png

Select annotation pipeline

To begin, open the Select pipeline menu, scroll through the available options, and choose pipeline/hg38_clinical_annotation. This pipeline is designed for annotatables provided in the hg38 assembly and annotates them with a broad set of clinically relevant attributes. In addition to effect prediction, it incorporates information from resources such as dbSNP, gnomAD, ClinVar, PhyloP7, AlphaMissense, MPC, and gene-level constraint scores. After selecting it, users can review the full annotation pipeline definition in the annotation pipeline editor on the left.

_images/GAInWeb2b.png

Single annotation

The GAIn web interface has two annotation modes: Single annotation, for interactively annotating one variant, position, or region at a time, and Annotation jobs, for annotating files containing multiple annotatables. This section demonstrates the use of Single annotation mode. Users can either enter their own annotatable in hg38 coordinates, or click the i button to the left of the entry field and choose an example annotatable. Example annotatables cover a wide range of syntax options for entering genomic positions, regions, or variants. This automatically enters the example annotatable, runs the selected pipeline on it, and displays the resulting annotations on the right side of the page. In this example, we choose a variant annotatable, which is defined by chromosome, position, reference allele, and alternative allele.

The annotation results on the right panel can be viewed in two formats. In the compact report view, when full report is turned off, only the annotation results are shown, making it easier to scan the output. This view is useful when a concise summary is needed without additional details.

_images/GAInWeb3b.png

When full report is turned on, the page shows a more detailed view of the results. In this view, annotations that include score distributions are accompanied by plots showing how the scores are distributed across the resource and where the queried annotatable falls within that distribution. In the example shown here, two such distributions are displayed, with a red line marking where the queried annotatable falls in relation to the overall distribution.

_images/GAInWeb4b.png

Annotation jobs

The second annotation mode is Annotation jobs, which allows multiple annotatables to be annotated at once. In this mode, users upload a file in tabular or VCF format containing annotatables (variants, positions, or regions) with the required coordinate information. Click Annotation jobs tab on the top.

_images/GAInWeb5.png

Download the example input CSV file (small_input.csv), whose content is shown below.

chrom

pos

ref

alt

chr14

21415880

G

A

chr17

7674904

TCT

T

chr7

117587806

G

A

The file contains three variant annotatables, each described by the columns chrom, pos, ref, and alt, which specify the chromosome, genomic position, reference allele, and alternate allele. Drag and drop the file into the upload area, or click Choose file and select small_input.csv for annotation.

When the file is uploaded, GAIn automatically recognizes the chrom, pos, ref, and alt columns because the file uses those exact column names. If the input file uses different column labels, users can manually specify which columns correspond to chrom, pos, ref, and alt before creating the annotation job.

_images/GAInWeb6.png

Once the annotation job has finished, GAIn shows its status as success. Users can then click the Download button next to the result to download a file (result.csv) containing the annotation attributes produced for all annotatables in the input file.

_images/GAInWeb7.png

The downloaded file, shown below, includes the original variant columns together with the annotations generated by the selected pipeline, such as effect predictions, population frequencies, clinical classifications, pathogenicity scores, and gene-level scores.

chrom

pos

ref

alt

worst_effect_MANE_1_5

effect_details_MANE_1_5

gene_effects_MANE_1_5

dbSNP_rs_number

gnomad_v4_exome_ALL_af

gnomad_v4_genome_ALL_af

clinical_significance

clinical_disease_name

phyloP7way

AlphaMissense_pathogenicity

AlphaMissense_class

MPC_score

worst_effect_GENCODE_49

effect_details_GENCODE_49

gene_effects_GENCODE_49

pLI_rank_all

pLI_rank_min

LOEUF_rank_all

LOEUF_rank_min

chr14

21415880

G

A

nonsense

ENST00000646647.2:CHD8:nonsense:582/2581(Arg->End)

CHD8:nonsense

863224857

Pathogenic/Likely_pathogenic

not_provided|Intellectual_developmental_disorder_with_autism_and_macrocephaly

0.917

nonsense

ENST00000645929.1:CHD8:nonsense:303/2302(Arg->End)|ENST00000646647.2:CHD8:nonsense:582/2581(Arg->End)|ENST00000864429.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000643469.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000934461.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000557364.6:CHD8:nonsense:582/2581(Arg->End)|ENST00000934460.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000430710.8:CHD8:nonsense:303/2302(Arg->End)

CHD8:nonsense

CHD8:45

45

CHD8:112.5

112.5

chr17

7674904

TCT

T

frame-shift

ENST00000269305.9:TP53:frame-shift:209/393

TP53:frame-shift

1057517840

6.84e-07

Pathogenic

Hereditary_cancer-predisposing_syndrome|TP53-related_disorder|not_provided|Li-Fraumeni_syndrome_1|Ovarian_neoplasm|Li-Fraumeni_syndrome

-0.12

frame-shift

ENST00000622645.4:TP53:frame-shift:170/302|ENST00000420246.6:TP53:frame-shift:209/341|ENST00000610538.4:TP53:frame-shift:170/307|ENST00000455263.6:TP53:frame-shift:209/346|ENST00000949117.1:TP53:frame-shift:209/394|ENST00000923567.1:TP53:frame-shift:209/393|ENST00000905353.1:TP53:frame-shift:209/393|ENST00000620739.4:TP53:frame-shift:170/354|ENST00000714357.1:TP53:frame-shift:209/393|ENST00000923566.1:TP53:frame-shift:209/393|ENST00000510385.5:TP53:frame-shift:77/209|ENST00000610623.4:TP53:frame-shift:50/187|ENST00000504290.5:TP53:frame-shift:77/214|ENST00000618944.4:TP53:frame-shift:50/182|ENST00000504937.5:TP53:frame-shift:77/261|ENST00000619186.4:TP53:frame-shift:50/234|ENST00000445888.6:TP53:frame-shift:209/393|ENST00000604348.6:TP53:frame-shift:202/386|ENST00000619485.4:TP53:frame-shift:170/354|ENST00000269305.9:TP53:frame-shift:209/393|ENST00000714408.1:TP53:frame-shift:209/411|ENST00000714409.1:TP53:frame-shift:209/367|ENST00000413465.6:TP53:frame-shift:209/285|ENST00000576024.2:TP53:frame-shift:209/344|ENST00000923568.1:TP53:frame-shift:209/393|ENST00000714359.1:TP53:frame-shift:209/393|ENST00000714356.1:TP53:frame-shift:170/347|ENST00000923569.1:TP53:frame-shift:209/393|ENST00000359597.8:TP53:frame-shift:209/343|ENST00000610292.4:TP53:frame-shift:170/354

TP53:frame-shift

TP53:3122

3122

TP53:4446.5

4446.5

chr7

117587806

G

A

missense

ENST00000003084.11:CFTR:missense:551/1480(Gly->Asp)

CFTR:missense

75527207

0.000404

0.000276

Pathogenic

CFTR-related_disorder|Cystic_fibrosis|Congenital_bilateral_aplasia_of_vas_deferens_from_CFTR_mutation|not_provided|Hereditary_pancreatitis|Bronchiectasis_with_or_without_elevated_sweat_chloride_1|ivacaftor_response_-_Efficacy

0.917

0.99

likely_pathogenic

0.015

missense

ENST00000889206.1:CFTR:missense:551/1451(Gly->Asp)|ENST00000950799.1:CFTR:intron:11/22[15019]|ENST00000003084.11:CFTR:missense:551/1480(Gly->Asp)|ENST00000889208.1:CFTR:missense:551/1437(Gly->Asp)|ENST00000699605.1:CFTR:missense:409/1338(Gly->Asp)|ENST00000889209.1:CFTR:missense:551/1450(Gly->Asp)|ENST00000649781.2:CFTR:missense:490/1419(Gly->Asp)|ENST00000649406.1:CFTR:missense:490/1187(Gly->Asp)|ENST00000648260.1:CFTR:intron:10/16[15019]|ENST00000699602.1:CFTR:missense:551/1478(Gly->Asp)|ENST00000889207.1:CFTR:missense:490/1354(Gly->Asp)|ENST00000889210.1:CFTR:missense:490/1376(Gly->Asp)

CFTR:missense|CFTR:intron

CFTR:18190

18190

CFTR:13993.5

13993.5

Custom annotation pipeline

The web interface can also be used to create custom annotation pipelines. Users can build pipelines by selecting resources and configuring how they should be used in annotation. This workflow is described in more detail in the GAIn web interface section. Here, we demonstrate a simple example in which a user creates a new pipeline by selecting two resources: the MANE 1.5 gene models and the phyloP7way conservation score.

To create a new pipeline, click the New pipeline button at the bottom of the annotation pipeline editor. This opens an empty pipeline editor. Next, click the New resource button to open the resource selection dialog, where resources can be added to the pipeline.

_images/GAInWeb8.png

When the resource selection dialog opens, it lists resources available to the web interface from the connected public genomic resource repositories. By default, this includes approximately 8000 resources from the main IossifovLab GRR and GRR-ENCODE.

_images/GAInWeb9.png

For this example, we first add a gene models resource. Gene models describe the locations and structures of genes and transcripts, allowing GAIn to determine which genes are affected by a variant and what type of effect the variant has. Use the search box to search for MANE, scroll through the matching results, find MANE 1.5, and select the checkbox next to it. This adds the MANE 1.5 gene models resource to the pipeline with its default annotation attributes, including the predicted worst effect, affected genes, and effect details.

_images/GAInWeb10.png

When the resource is selected, GAIn automatically adds the corresponding lines to the annotation pipeline editor. The editor now contains the pipeline configuration needed to use the MANE 1.5 gene models resource.

_images/GAInWeb11.png

After adding the gene models resource, add a position score resource to the same pipeline. Position score resources assign numeric values to genomic positions. In this example, we add phyloP7way, a conservation score that reports evolutionary conservation at each queried position.

Click New resource again and search for phyloP7way in the resource selection dialog. Select the checkbox next to the phyloP7way resource to add it to the pipeline.

_images/GAInWeb12.png

After the resource is selected, GAIn automatically adds the corresponding lines to the annotation pipeline editor. Because phyloP7way provides a single score, GAIn adds that score as the default annotation attribute for this resource. The custom pipeline now includes both the MANE 1.5 gene models resource and the phyloP7way position score resource.

_images/GAInWeb13.png

To test the custom pipeline, choose one of the example annotatables or enter an annotatable in hg38 coordinates. GAIn runs the custom pipeline and displays the results immediately in the right panel. The example below shows the compact report produced by the custom pipeline.

_images/GAInWeb14.png

This concludes the Getting Started on the Web section, which demonstrated how to use a saved annotation pipeline, run annotation jobs, and create a simple custom annotation pipeline in the web interface. The GAIn web interface section describes the web interface in more detail, including registration and additional features of the site.