Getting Started with Federation

Federation lets you combine multiple GPF instances into a single system. You can then share data and resources across those instances, work with larger datasets, and collaborate with other researchers more effectively.

In this section, we will show you how to set up a federation between the SFARI GPF instance and your local GPF instance. Your local GPF instance will then be able to use the data and resources that are available to you on the SFARI GPF instance.

Configure federation on your local GPF instance

To use the GPF federation, you need to install the additional gpf_federation conda package in your local conda environment. You can do this by running the following command:

mamba install \
    -c conda-forge \
    -c bioconda \
    -c iossifovlab \
    gpf_federation

Once the package is installed, you need to configure the federation on your local GPF instance. You can do this by editing the minimal_instance/gpf_instance.yaml file:

instance_id: minimal_instance

reference_genome:
  resource_id: "hg38/genomes/GRCh38-hg38"

gene_models:
  resource_id: "hg38/gene_models/MANE/1.3"

annotation:
  config:
    - allele_score: hg38/variant_frequencies/gnomAD_4.1.0/genomes/ALL
    - allele_score: hg38/scores/ClinVar_20240730

gene_sets_db:
  gene_set_collections:
  - gene_properties/gene_sets/autism
  - gene_properties/gene_sets/relevant
  - gene_properties/gene_sets/GO_2024-06-17_release

gene_scores_db:
  gene_scores:
  - gene_properties/gene_scores/Satterstrom_Buxbaum_Cell_2020
  - gene_properties/gene_scores/Iossifov_Wigler_PNAS_2015
  - gene_properties/gene_scores/LGD
  - gene_properties/gene_scores/RVIS
  - gene_properties/gene_scores/LOEUF

gene_profiles_config:
  conf_file: gene_profiles.yaml

remotes:
  - id: "sfari"
    url: "https://gpf.sfari.org/hg38"

Note

This configuration will allow your local instance to access only the publicly available resources in the SFARI GPF instance.

If you have a user account on the SFARI GPF instance, you can create federation tokens and use them to access the remote instance, as described in the Federation tokens section.
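Before starting the instance, it can help to sanity-check the remotes section. The sketch below is not GPF's own validation. It applies a few checks that seem reasonable for the entries shown above: each remote has an id, the url is an http(s) URL, and client_id and client_secret appear together or not at all. The function name check_remotes and these rules are assumptions made for illustration.

```python
from urllib.parse import urlparse


def check_remotes(config: dict) -> list[str]:
    """Return human-readable problems found in the `remotes` section.

    These rules are illustrative assumptions, not GPF's own validation.
    """
    remotes = config.get("remotes", [])
    if not isinstance(remotes, list):
        return ["`remotes` must be a list of entries"]

    problems = []
    for index, remote in enumerate(remotes):
        label = f"remotes[{index}]"
        if not isinstance(remote, dict):
            problems.append(f"{label}: expected a mapping")
            continue
        if not remote.get("id"):
            problems.append(f"{label}: missing `id`")
        parsed = urlparse(str(remote.get("url", "")))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{label}: `url` must be an http(s) URL")
        # Credentials only make sense as a pair (assumed rule).
        if ("client_id" in remote) != ("client_secret" in remote):
            problems.append(
                f"{label}: set both `client_id` and `client_secret`, or neither"
            )
    return problems


# This dictionary mirrors the `remotes` section of the configuration above.
example = {"remotes": [{"id": "sfari", "url": "https://gpf.sfari.org/hg38"}]}
print(check_remotes(example))  # prints []
```

To run the check against the real file, the configuration could be loaded first, for example with yaml.safe_load from the PyYAML package, assuming it is installed in your environment.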

Once the configuration is ready, you can start the GPF instance using the wgpf tool:

wgpf run

On the home page of your local GPF instance, you should see the studies loaded from the SFARI remote instance:

administration/getting_started/getting_started_files/federation_home_page.png

Home page with studies from the SFARI GPF instance

Warning

The federation loads a lot of data from the remote instance. When you start the GPF instance, it may take some time to load all the needed information.
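Because startup can be slow, a script that depends on the local instance may want to wait until the server responds. The following is a minimal sketch using only the Python standard library. The helper names is_up and wait_until_up are hypothetical. A server that answers HTTP requests is only a rough proxy for "all remote data is loaded".

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if the server at `url` answers an HTTP request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


def wait_until_up(probe, attempts: int = 60, delay: float = 5.0,
                  sleep=time.sleep) -> bool:
    """Call `probe()` until it returns True or `attempts` run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause between probes, but not after the last one
    return False
```

For example, wait_until_up(lambda: is_up("http://localhost:8000")) would poll for up to about five minutes. The address http://localhost:8000 is an assumption; use whatever address wgpf run reports when it starts.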

Combine analysis using local and remote studies

With the federation configured, you can explore both local and remote studies. Moreover, you can combine local and remote studies in a single analysis using the available tools.

For example, open the ssc_denovo study and select the Enrichment Tool. From Gene Sets, choose Denovo:

administration/getting_started/getting_started_files/federation_enrichment_tool.png

Enrichment Tool for ssc_denovo study

Then, from the studies hierarchy, choose the (sfari) Sequencing de Novo / (sfari) SD Autism / (sfari) SD SPARK Autism / (sfari) SD iWES_v1_1_genotypes_DENOVO study and select the autism phenotype:

administration/getting_started/getting_started_files/federation_enrichment_tool_denovo_gene_set.png

Enrichment Tool for ssc_denovo study with selected remote study de Novo gene sets

Let us select the LGDs de Novo gene set and run the Enrichment Tool:

administration/getting_started/getting_started_files/federation_enrichment_tool_iwes_denovo_gene_sets.png

De Novo gene set from SD_iWES_v1_1_genotypes_DENOVO study

Federation tokens

Federation tokens are used to authenticate and authorize access to a remote GPF instance.

Let us create a federation token for the SFARI GPF instance. You need to log in to the SFARI GPF instance, go to User Profile, select Federation Tokens, and create a new federation token:

administration/getting_started/getting_started_files/federation_client_id_and_secret.png

Federation client ID and secret from the User Profile

Warning

The federation client ID and secret are shown only once. Make sure to copy them to a safe place. You will need them to configure the federation on your local GPF instance.

Once you have the federation client ID and secret, you can configure your local GPF instance to use them. Edit the minimal_instance/gpf_instance.yaml file and add the client_id and client_secret entries to the remotes section:

remotes:
  - id: "sfari"
    url: "https://gpf.sfari.org/hg38"
    client_id: "Tqtgr2e3YPiDQS6CHvMdH7rPgTnxmoA46OWSbagV"
    client_secret: "22xKTkewcxyTnKdHou21LRikUU2Hea2tLRBBOaPm2UCIUWEqZFogWk0nRysDrXepieOWYUkTZvG1xVULtwEspWG2YQ71lH7Vow7dNTMzG9ELdVQcOY8YQOD3y9XwRw8T"

This will give your local GPF instance access to the same resources in the SFARI GPF instance that your user account can access.

Warning

The federation client ID and secret in the example above are placeholders and should not be used. You need to replace them with your own federation client ID and secret.
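If you would rather not paste credentials by hand, one option is to generate the remotes entry from environment variables when you prepare the configuration. The sketch below only produces YAML text for you to place in gpf_instance.yaml. It does not imply that GPF itself reads environment variables. The variable names GPF_FEDERATION_CLIENT_ID and GPF_FEDERATION_CLIENT_SECRET and the helper render_remote are hypothetical.

```python
import json
import os


def render_remote(remote_id: str, url: str, env=os.environ) -> str:
    """Render a `remotes` section with one entry as YAML text.

    Credentials are added only when both (hypothetical) environment
    variables are set; otherwise the entry grants public access only.
    """
    # json.dumps yields double-quoted strings, which are also valid YAML.
    lines = [
        "remotes:",
        f"  - id: {json.dumps(remote_id)}",
        f"    url: {json.dumps(url)}",
    ]
    client_id = env.get("GPF_FEDERATION_CLIENT_ID")
    client_secret = env.get("GPF_FEDERATION_CLIENT_SECRET")
    if client_id and client_secret:
        lines.append(f"    client_id: {json.dumps(client_id)}")
        lines.append(f"    client_secret: {json.dumps(client_secret)}")
    return "\n".join(lines)


# With no credentials in the environment mapping, only id and url are emitted.
print(render_remote("sfari", "https://gpf.sfari.org/hg38", env={}))
```

Passing env explicitly, as in the last line, keeps the function easy to try out. Omitting it makes the function read from the real process environment.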