Getting started on the web
The GAIn Web Annotation page (https://gain.iossifovlab.com/) provides a simple and interactive interface for annotating genetic variants, positions, or regions, collectively referred to as annotatables. The page is organized into two main areas. On the left, users can create, edit, and manage annotation pipelines, either by selecting a saved pipeline or by building a new one. This pipeline editor defines which resources and annotators are applied during annotation. On the right, users can enter an annotatable, submit it for processing, and view the resulting annotations. Together, these two panels allow users to configure an annotation workflow and immediately apply it to annotatables of interest.
Select annotation pipeline
To begin, open the Select pipeline menu, scroll through the available options, and choose pipeline/hg38_clinical_annotation. This pipeline is designed for annotatables provided in the hg38 assembly and annotates them with a broad set of clinically relevant attributes. In addition to effect prediction, it incorporates information from resources such as dbSNP, gnomAD, ClinVar, PhyloP7, AlphaMissense, MPC, and gene-level constraint scores. After selecting it, users can review the full annotation pipeline definition in the annotation pipeline editor on the left.
Single annotation
The GAIn web interface has two annotation modes: Single annotation, for interactively annotating one variant, position, or region at a time, and Annotation jobs, for annotating files containing multiple annotatables. This section demonstrates the use of Single annotation mode. Users can either enter their own annotatable in hg38 coordinates, or click the i button to the left of the entry field and choose an example annotatable. Example annotatables cover a wide range of syntax options for entering genomic positions, regions, or variants. This automatically enters the example annotatable, runs the selected pipeline on it, and displays the resulting annotations on the right side of the page. In this example, we choose a variant annotatable, which is defined by chromosome, position, reference allele, and alternative allele.
The annotation results on the right panel can be viewed in two formats. In the compact report view, when full report is turned off, only the annotation results are shown, making it easier to scan the output. This view is useful when a concise summary is needed without additional details.
When full report is turned on, the page shows a more detailed view of the results. In this view, annotations that include score distributions are accompanied by plots showing how the scores are distributed across the resource and where the queried annotatable falls within that distribution. In the example shown here, two such distributions are displayed, with a red line marking where the queried annotatable falls in relation to the overall distribution.
Annotation jobs
The second annotation mode is Annotation jobs, which allows multiple annotatables to be annotated at once. In this mode, users upload a file in tabular or VCF format containing annotatables (variants, positions, or regions) with the required coordinate information. Click Annotation jobs tab on the top.
Download the example input CSV file (small_input.csv), whose content is shown below.
chrom |
pos |
ref |
alt |
|---|---|---|---|
chr14 |
21415880 |
G |
A |
chr17 |
7674904 |
TCT |
T |
chr7 |
117587806 |
G |
A |
The file contains three variant annotatables, each described by the columns chrom, pos, ref, and alt, which specify the chromosome, genomic position, reference allele, and alternate allele. Drag and drop the file into the upload area, or click Choose file and select small_input.csv for annotation.
When the file is uploaded, GAIn automatically recognizes the chrom, pos, ref, and alt columns because the file uses those exact column names. If the input file uses different column labels, users can manually specify which columns correspond to chrom, pos, ref, and alt before creating the annotation job.
Once the annotation job has finished, GAIn shows its status as success.
Users can then click the Download button next to the result to download a file (result.csv) containing the annotation attributes produced for all annotatables in the input file.
The downloaded file, shown below, includes the original variant columns together with the annotations generated by the selected pipeline, such as effect predictions, population frequencies, clinical classifications, pathogenicity scores, and gene-level scores.
chrom |
pos |
ref |
alt |
worst_effect_MANE_1_5 |
effect_details_MANE_1_5 |
gene_effects_MANE_1_5 |
dbSNP_rs_number |
gnomad_v4_exome_ALL_af |
gnomad_v4_genome_ALL_af |
clinical_significance |
clinical_disease_name |
phyloP7way |
AlphaMissense_pathogenicity |
AlphaMissense_class |
MPC_score |
worst_effect_GENCODE_49 |
effect_details_GENCODE_49 |
gene_effects_GENCODE_49 |
pLI_rank_all |
pLI_rank_min |
LOEUF_rank_all |
LOEUF_rank_min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
chr14 |
21415880 |
G |
A |
nonsense |
ENST00000646647.2:CHD8:nonsense:582/2581(Arg->End) |
CHD8:nonsense |
863224857 |
Pathogenic/Likely_pathogenic |
not_provided|Intellectual_developmental_disorder_with_autism_and_macrocephaly |
0.917 |
nonsense |
ENST00000645929.1:CHD8:nonsense:303/2302(Arg->End)|ENST00000646647.2:CHD8:nonsense:582/2581(Arg->End)|ENST00000864429.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000643469.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000934461.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000557364.6:CHD8:nonsense:582/2581(Arg->End)|ENST00000934460.1:CHD8:nonsense:582/2581(Arg->End)|ENST00000430710.8:CHD8:nonsense:303/2302(Arg->End) |
CHD8:nonsense |
CHD8:45 |
45 |
CHD8:112.5 |
112.5 |
|||||
chr17 |
7674904 |
TCT |
T |
frame-shift |
ENST00000269305.9:TP53:frame-shift:209/393 |
TP53:frame-shift |
1057517840 |
6.84e-07 |
Pathogenic |
Hereditary_cancer-predisposing_syndrome|TP53-related_disorder|not_provided|Li-Fraumeni_syndrome_1|Ovarian_neoplasm|Li-Fraumeni_syndrome |
-0.12 |
frame-shift |
ENST00000622645.4:TP53:frame-shift:170/302|ENST00000420246.6:TP53:frame-shift:209/341|ENST00000610538.4:TP53:frame-shift:170/307|ENST00000455263.6:TP53:frame-shift:209/346|ENST00000949117.1:TP53:frame-shift:209/394|ENST00000923567.1:TP53:frame-shift:209/393|ENST00000905353.1:TP53:frame-shift:209/393|ENST00000620739.4:TP53:frame-shift:170/354|ENST00000714357.1:TP53:frame-shift:209/393|ENST00000923566.1:TP53:frame-shift:209/393|ENST00000510385.5:TP53:frame-shift:77/209|ENST00000610623.4:TP53:frame-shift:50/187|ENST00000504290.5:TP53:frame-shift:77/214|ENST00000618944.4:TP53:frame-shift:50/182|ENST00000504937.5:TP53:frame-shift:77/261|ENST00000619186.4:TP53:frame-shift:50/234|ENST00000445888.6:TP53:frame-shift:209/393|ENST00000604348.6:TP53:frame-shift:202/386|ENST00000619485.4:TP53:frame-shift:170/354|ENST00000269305.9:TP53:frame-shift:209/393|ENST00000714408.1:TP53:frame-shift:209/411|ENST00000714409.1:TP53:frame-shift:209/367|ENST00000413465.6:TP53:frame-shift:209/285|ENST00000576024.2:TP53:frame-shift:209/344|ENST00000923568.1:TP53:frame-shift:209/393|ENST00000714359.1:TP53:frame-shift:209/393|ENST00000714356.1:TP53:frame-shift:170/347|ENST00000923569.1:TP53:frame-shift:209/393|ENST00000359597.8:TP53:frame-shift:209/343|ENST00000610292.4:TP53:frame-shift:170/354 |
TP53:frame-shift |
TP53:3122 |
3122 |
TP53:4446.5 |
4446.5 |
||||
chr7 |
117587806 |
G |
A |
missense |
ENST00000003084.11:CFTR:missense:551/1480(Gly->Asp) |
CFTR:missense |
75527207 |
0.000404 |
0.000276 |
Pathogenic |
CFTR-related_disorder|Cystic_fibrosis|Congenital_bilateral_aplasia_of_vas_deferens_from_CFTR_mutation|not_provided|Hereditary_pancreatitis|Bronchiectasis_with_or_without_elevated_sweat_chloride_1|ivacaftor_response_-_Efficacy |
0.917 |
0.99 |
likely_pathogenic |
0.015 |
missense |
ENST00000889206.1:CFTR:missense:551/1451(Gly->Asp)|ENST00000950799.1:CFTR:intron:11/22[15019]|ENST00000003084.11:CFTR:missense:551/1480(Gly->Asp)|ENST00000889208.1:CFTR:missense:551/1437(Gly->Asp)|ENST00000699605.1:CFTR:missense:409/1338(Gly->Asp)|ENST00000889209.1:CFTR:missense:551/1450(Gly->Asp)|ENST00000649781.2:CFTR:missense:490/1419(Gly->Asp)|ENST00000649406.1:CFTR:missense:490/1187(Gly->Asp)|ENST00000648260.1:CFTR:intron:10/16[15019]|ENST00000699602.1:CFTR:missense:551/1478(Gly->Asp)|ENST00000889207.1:CFTR:missense:490/1354(Gly->Asp)|ENST00000889210.1:CFTR:missense:490/1376(Gly->Asp) |
CFTR:missense|CFTR:intron |
CFTR:18190 |
18190 |
CFTR:13993.5 |
13993.5 |
Custom annotation pipeline
The web interface can also be used to create custom annotation pipelines. Users can build pipelines by selecting resources and configuring how they should be used in annotation. This workflow is described in more detail in the GAIn web interface section. Here, we demonstrate a simple example in which a user creates a new pipeline by selecting two resources: the MANE 1.5 gene models and the phyloP7way conservation score.
To create a new pipeline, click the New pipeline button at the bottom of the annotation pipeline editor. This opens an empty pipeline editor. Next, click the New resource button to open the resource selection dialog, where resources can be added to the pipeline.
When the resource selection dialog opens, it lists resources available to the web interface from the connected public genomic resource repositories. By default, this includes approximately 8000 resources from the main IossifovLab GRR and GRR-ENCODE.
For this example, we first add a gene models resource. Gene models describe the locations and structures of genes and transcripts, allowing GAIn to determine which genes are affected by a variant and what type of effect the variant has. Use the search box to search for MANE, scroll through the matching results, find MANE 1.5, and select the checkbox next to it. This adds the MANE 1.5 gene models resource to the pipeline with its default annotation attributes, including the predicted worst effect, affected genes, and effect details.
When the resource is selected, GAIn automatically adds the corresponding lines to the annotation pipeline editor. The editor now contains the pipeline configuration needed to use the MANE 1.5 gene models resource.
After adding the gene models resource, add a position score resource to the same pipeline. Position score resources assign numeric values to genomic positions. In this example, we add phyloP7way, a conservation score that reports evolutionary conservation at each queried position.
Click New resource again and search for phyloP7way in the resource selection dialog. Select the checkbox next to the phyloP7way resource to add it to the pipeline.
After the resource is selected, GAIn automatically adds the corresponding lines to the annotation pipeline editor. Because phyloP7way provides a single score, GAIn adds that score as the default annotation attribute for this resource. The custom pipeline now includes both the MANE 1.5 gene models resource and the phyloP7way position score resource.
To test the custom pipeline, choose one of the example annotatables or enter an annotatable in hg38 coordinates. GAIn runs the custom pipeline and displays the results immediately in the right panel. The example below shows the compact report produced by the custom pipeline.
This concludes the Getting Started on the Web section, which demonstrated how to use a saved annotation pipeline, run annotation jobs, and create a simple custom annotation pipeline in the web interface. The GAIn web interface section describes the web interface in more detail, including registration and additional features of the site.