Getting Started with Federation
Federation is a mechanism that combines multiple GPF instances into a single system. It lets you share data and resources across instances, so you can work with larger datasets and collaborate with other researchers more effectively.
In this section, we will show you how to set up a federation between the SFARI GPF instance and your local GPF instance. Your local instance will then be able to use the data and resources that are available to you on the SFARI GPF instance.
Configure federation on your local GPF instance
To use the GPF federation, you need to install the additional
gpf_federation conda package in your local conda environment. You can do
this by running the following command:
mamba install \
    -c conda-forge \
    -c bioconda \
    -c iossifovlab \
    gpf_federation
Once the package is installed, you need to configure the federation on your
local GPF instance. You can do this by editing the
minimal_instance/gpf_instance.yaml file:
instance_id: minimal_instance

reference_genome:
  resource_id: "hg38/genomes/GRCh38-hg38"

gene_models:
  resource_id: "hg38/gene_models/MANE/1.3"

annotation:
  config:
    - allele_score: hg38/variant_frequencies/gnomAD_4.1.0/genomes/ALL
    - allele_score: hg38/scores/ClinVar_20240730

gene_sets_db:
  gene_set_collections:
    - gene_properties/gene_sets/autism
    - gene_properties/gene_sets/relevant
    - gene_properties/gene_sets/GO_2024-06-17_release

gene_scores_db:
  gene_scores:
    - gene_properties/gene_scores/Satterstrom_Buxbaum_Cell_2020
    - gene_properties/gene_scores/Iossifov_Wigler_PNAS_2015
    - gene_properties/gene_scores/LGD
    - gene_properties/gene_scores/RVIS
    - gene_properties/gene_scores/LOEUF

gene_profiles_config:
  conf_file: gene_profiles.yaml

remotes:
  - id: "sfari"
    url: "https://gpf.sfari.org/hg38"
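Before starting the instance, it can help to sanity-check the remotes section. The sketch below is not part of GPF: validate_remotes is a hypothetical helper, and its rules (unique id, absolute http(s) url, credentials given as a pair) are assumptions about what a sensible entry looks like. It works on an already-parsed configuration, so you would first load gpf_instance.yaml with a YAML parser of your choice.

```python
from urllib.parse import urlparse


def validate_remotes(config: dict) -> list[str]:
    """Return a list of human-readable problems found in config['remotes'].

    Hypothetical helper; the checks are assumptions, not GPF's own rules.
    """
    problems = []
    seen_ids = set()
    for index, remote in enumerate(config.get("remotes") or []):
        label = f"remotes[{index}]"

        # Each remote needs an id, and ids should not repeat.
        remote_id = remote.get("id")
        if not remote_id:
            problems.append(f"{label}: missing 'id'")
        elif remote_id in seen_ids:
            problems.append(f"{label}: duplicate id '{remote_id}'")
        else:
            seen_ids.add(remote_id)

        # The url should be absolute, with an http or https scheme and a host.
        parsed = urlparse(remote.get("url") or "")
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{label}: 'url' is not an absolute http(s) URL")

        # Assumption: client_id and client_secret only make sense together.
        if bool(remote.get("client_id")) != bool(remote.get("client_secret")):
            problems.append(
                f"{label}: set both 'client_id' and 'client_secret', or neither"
            )
    return problems


# The remotes section from the configuration above, as it looks once parsed.
config = {"remotes": [{"id": "sfari", "url": "https://gpf.sfari.org/hg38"}]}
print(validate_remotes(config))  # []
```

An empty list means none of the assumed checks failed; it does not guarantee that the remote instance is reachable.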
Note
This configuration will allow your local instance to access only the publicly available resources in the SFARI GPF instance.
If you have a user account on the SFARI GPF instance, you can create federation tokens and use them to access the remote instance, as described in the Federation tokens section.
When you are ready with the configuration, you can start the GPF instance using
the wgpf tool:
wgpf run
On the home page of your local GPF instance, you should see the studies loaded from the SFARI remote instance:
Home page with studies from the SFARI GPF instance
Warning
The federation loads a lot of data from the remote instance. When you start the GPF instance, it may take some time to load all the needed information.
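If you script the startup, a small polling loop can wait for the local instance instead of guessing how long loading takes. This is a minimal sketch using only the Python standard library; is_up and wait_until are hypothetical helper names, and the address in the example comment is an assumption that you should replace with wherever your instance listens. A responding server does not by itself prove that all remote studies have finished loading.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET to url gets a response with status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return err.code < 500
    except OSError:
        # Connection refused, timeout, DNS failure and similar.
        return False


def wait_until(probe, timeout: float = 600.0, interval: float = 5.0) -> bool:
    """Call probe() until it returns True or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example (assumes the local instance listens on http://localhost:8000):
#   wait_until(lambda: is_up("http://localhost:8000"))
```

Passing the probe as a callable keeps the waiting logic separate from the HTTP check, so the same loop can wrap a more specific check if you have one.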
Combine analysis using local and remote studies
With the federation configured, you can explore both local and remote studies. Moreover, you can combine local and remote studies using the available tools.
For example, let’s go to the ssc_denovo study and select the Enrichment Tool. From Gene Sets choose Denovo:
Enrichment Tool for ssc_denovo study
Then from the studies hierarchy choose (sfari) Sequencing de Novo / (sfari) SD Autism / (sfari) SD SPARK Autism / (sfari) SD iWES_v1_1_genotypes_DENOVO study and select the autism phenotype.
Enrichment Tool for ssc_denovo study with selected remote study de Novo gene sets
Let us select the LGDs de Novo gene set and run the Enrichment Tool:
De Novo gene set from SD_iWES_v1_1_genotypes_DENOVO study
Federation tokens
Federation tokens are used to authenticate and authorize access to the federated GPF instance.
Let us create a federation token for the SFARI GPF instance. You need to log in to the SFARI GPF instance, go to User Profile, select Federation Tokens, and create a new federation token:
Federation client ID and secret from the User Profile
Warning
The federation client ID and secret are shown only once. Make sure to copy them to a safe place. You will need them to configure the federation on your local GPF instance.
Once you have the federation client ID and secret, you can configure your local
GPF instance to use them. You need to edit the
minimal_instance/gpf_instance.yaml file and add the client_id and
client_secret lines to the remotes section:
remotes:
  - id: "sfari"
    url: "https://gpf.sfari.org/hg38"
    client_id: "Tqtgr2e3YPiDQS6CHvMdH7rPgTnxmoA46OWSbagV"
    client_secret: "22xKTkewcxyTnKdHou21LRikUU2Hea2tLRBBOaPm2UCIUWEqZFogWk0nRysDrXepieOWYUkTZvG1xVULtwEspWG2YQ71lH7Vow7dNTMzG9ELdVQcOY8YQOD3y9XwRw8T"
This gives your local GPF instance access to the resources in the SFARI GPF instance that your account is allowed to access.
Warning
The federation client ID and secret in the example above are placeholders and should not be used. You need to replace them with your own federation client ID and secret.
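Since the client secret is stored in plain text in gpf_instance.yaml, you may prefer not to type it into a file that is shared or kept under version control. One possible approach, sketched below, is to keep the credentials in environment variables and generate the remotes section from them. The function render_remote and the variable names GPF_SFARI_CLIENT_ID and GPF_SFARI_CLIENT_SECRET are hypothetical; GPF itself does not read them. The script only prints YAML text that you would then place in your configuration.

```python
import json
import os


def render_remote(remote_id: str, url: str, env=os.environ) -> str:
    """Build a YAML 'remotes' section, adding credentials only if both are set."""
    # Hypothetical naming scheme, e.g. GPF_SFARI_CLIENT_ID for remote "sfari".
    prefix = f"GPF_{remote_id.upper()}_"
    client_id = env.get(prefix + "CLIENT_ID")
    client_secret = env.get(prefix + "CLIENT_SECRET")

    # json.dumps produces double-quoted strings, which are also valid YAML scalars.
    lines = [
        "remotes:",
        f"  - id: {json.dumps(remote_id)}",
        f"    url: {json.dumps(url)}",
    ]
    if client_id and client_secret:
        lines.append(f"    client_id: {json.dumps(client_id)}")
        lines.append(f"    client_secret: {json.dumps(client_secret)}")
    return "\n".join(lines) + "\n"


print(render_remote("sfari", "https://gpf.sfari.org/hg38"), end="")
```

When the two variables are not set, the output is the public-access remotes section shown earlier; when both are set, the client_id and client_secret lines are added.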