Submitter Portal

Submitter Portal Index

Overview

Common Aspects

Identifiers

Tutorial Video

Analysis Submission

Guided Documentation

Webin Tool

Overview

The EGA Submitter Portal provides tools that facilitate the submission of metadata for human data to the European Genome-phenome Archive (EGA). This page provides a video tutorial on how to use the EGA Submitter Portal. The page is divided into ordered sections that follow the steps needed to complete a submission.

In this tutorial page we demonstrate how to use the Submitter Portal to register your metadata. While the video focuses on run-based submission (for raw files such as FASTQ, and aligned data such as BAM/CRAM), analysis-based submission (for your BAM/BAI pairs, variation files such as VCF, and phenotype files) is described below.

Before registering the metadata, it is very important that all submitters have encrypted their files and uploaded them to their EGA submission account (ega-box).

The EGA is a shared, public service with limited resources. In order to manage the available resources, EGA submission boxes should not exceed 8 TB in size, and cannot exceed 12 TB. Please do not exceed this limit. If you are approaching this limit, please contact the EGA Helpdesk so that we can advise on how to register the associated metadata and trigger the archiving of files, so that you can continue with your submission. If we note that your submission account consistently grows above 10 TB, your password will be changed until metadata is associated.
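The size thresholds above can be summarised as a small check. This is only a minimal sketch, not an EGA tool: the function name `classify_box_usage` and the status labels are hypothetical, and it assumes the quoted limits are in terabytes and that you already know the current size of your box.

```python
RECOMMENDED_LIMIT_TB = 8   # boxes "should not exceed" this size
LOCKOUT_RISK_TB = 10       # consistently above this, the password may be changed
HARD_LIMIT_TB = 12         # boxes "cannot exceed" this size


def classify_box_usage(size_tb: float) -> str:
    """Return a hypothetical status label for a submission box of the given size."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if size_tb > HARD_LIMIT_TB:
        return "over hard limit"
    if size_tb > LOCKOUT_RISK_TB:
        return "password change risk"
    if size_tb > RECOMMENDED_LIMIT_TB:
        return "over recommended size"
    return "ok"


if __name__ == "__main__":
    # Illustrative sizes only.
    for size in (5, 9, 11, 13):
        print(size, "TB ->", classify_box_usage(size))
```

Anything other than "ok" would be a good moment to contact the EGA Helpdesk, as described above.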

Please note that some metadata (run and analysis objects) cannot be registered until at least 24 hours after the files have been uploaded to your box.
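If you track your own upload times, the 24-hour rule can be checked with a few lines of Python. This is a hedged sketch: the function name `can_register_run_or_analysis` is hypothetical, and it assumes you record the time your upload finished as a timezone-aware timestamp. The portal itself remains the authority on whether files are ready.

```python
from datetime import datetime, timedelta, timezone

# Minimum wait between upload and run/analysis registration, as stated above.
MIN_WAIT = timedelta(hours=24)


def can_register_run_or_analysis(upload_finished, now=None):
    """Return True if at least 24 hours have passed since the upload finished.

    Both arguments are expected to be timezone-aware datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - upload_finished >= MIN_WAIT


if __name__ == "__main__":
    # Illustrative timestamp only.
    uploaded = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(can_register_run_or_analysis(uploaded, uploaded + timedelta(hours=30)))
```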

Common Aspects

The metadata objects required for read submissions are as follows:

If you are performing array-based submissions, the Submitter Portal should only be used to register the Study, Samples, Data Access Committee (DAC) and Policy metadata objects. We are currently working on features to support the creation of array metadata submissions through the portal.

Identifiers

EGA objects can be identified by their unique accession. These IDs are displayed everywhere, are shared among all EGA locations, and are specific to each data type (see the list below for more information).

| EGA Accession ID | EGA Object description |
| --- | --- |
| EGAS | EGA Study Accession ID |
| EGAC | EGA DAC Accession ID |
| EGAP | EGA Policy Accession ID |
| EGAN | EGA Sample Accession ID |
| EGAR | EGA Run Accession ID |
| EGAX | EGA Experiment ID |
| EGAZ | EGA Analysis Accession ID |
| EGAD | EGA Dataset Accession ID |
| EGAB | EGA Submission ID |
| EGAF | EGA File Unique Accession ID |
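The prefixes listed above make it easy to tell what kind of object an accession refers to. The following is a minimal sketch of such a lookup: the function name `describe_accession` is hypothetical, the digits in the example ID are a made-up placeholder, and only the four-letter prefix is checked (the rest of the accession is not validated).

```python
# Prefix-to-object mapping, copied from the list above.
ACCESSION_PREFIXES = {
    "EGAS": "Study",
    "EGAC": "DAC",
    "EGAP": "Policy",
    "EGAN": "Sample",
    "EGAR": "Run",
    "EGAX": "Experiment",
    "EGAZ": "Analysis",
    "EGAD": "Dataset",
    "EGAB": "Submission",
    "EGAF": "File",
}


def describe_accession(accession: str) -> str:
    """Return the object type for an EGA accession, based on its prefix only."""
    prefix = accession.strip().upper()[:4]
    if prefix not in ACCESSION_PREFIXES:
        raise ValueError(f"Unknown EGA accession prefix: {prefix!r}")
    return ACCESSION_PREFIXES[prefix]


if __name__ == "__main__":
    # Placeholder accession, not a real record.
    print(describe_accession("EGAD00000000000"))
```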

Tutorial Video

The 12 short videos below give a worked example, with detailed instructions on how to use the EGA Submitter Portal to perform metadata submissions to the EGA.

Introduction and explanation of the example (1 of 12)
What metadata needs to be submitted? (2 of 12)
Submitter Portal Common aspects (3 of 12)
Making a new submission (4 of 12)
Register study (5 of 12)
Register samples (6 of 12)
Register data access committee (7 of 12)
Register data access policy (8 of 12)
Register experiments (9 of 12)
Link files and samples (10 of 12)
Register dataset (11 of 12)
Submit objects to the server (12 of 12)

Points to Notice

There is a strong relationship among EGA metadata objects. Unless the primary objects (study, samples and DAC) are properly submitted, their linked secondary objects (experiments, runs, analyses or policies) will not validate. The tertiary metadata object (dataset) requires all the other objects to be submitted before it can be validated and submitted. Should you prefer to submit everything at once, please generate all the objects without validation, then go to the "Edit title and description" tab and click "I'm done". This will validate and submit all objects together.
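The primary, secondary and tertiary grouping above implies a safe order for submitting objects one by one. The sketch below illustrates that ordering only. The tier numbers, the lowercase object names and the function `submission_order` are illustrative assumptions and are not part of the Submitter Portal.

```python
# Tiers taken from the paragraph above: primary (1), secondary (2), tertiary (3).
SUBMISSION_TIER = {
    "study": 1,
    "sample": 1,
    "dac": 1,
    "experiment": 2,
    "run": 2,
    "analysis": 2,
    "policy": 2,
    "dataset": 3,
}


def submission_order(object_types):
    """Sort object types so that primary objects come before the objects that need them.

    sorted() is stable, so objects in the same tier keep their original order.
    """
    return sorted(object_types, key=lambda name: SUBMISSION_TIER[name])


if __name__ == "__main__":
    print(submission_order(["dataset", "run", "sample", "policy", "study"]))
```

Objects within one tier can still depend on each other (for example, a policy needs its DAC), so this ordering is a rough guide rather than a full dependency check.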

Analysis Submission

The EGA Submitter Portal video focuses on a single use case: the submission of runs.

Aligned BAM files are expected to be submitted as runs (1 to 1 cardinality with samples). Analyses should only be used to link BAM/BAI pairs, VCF files and phenotype files to samples. The analysis is an EGA-specific metadata object that links samples to files. This object also stores some metadata about your experiments, such as the experiment type, genome reference, or the platform used.

**If only BAM or CRAM alignment files are submitted but not the original unaligned FASTQ files, then please make sure that the BAM or CRAM files also contain the unaligned reads. This is critical to enable primary re-analysis and re-alignment of the dataset using new tools or future genome assemblies.**

Aligned/Mapped Sequence Reads

Prior to defining the Analysis

Before registering your analysis, you should first:

  1. Register your Study
  2. Register the DAC and Policy
  3. Register the Samples
  4. Encrypt and Upload the files

Please note that the EGA allows for the re-use of registered metadata. Therefore the previously registered Study, DAC, Policy or samples can be re-used for the analysis data submission.

Defining the Analysis

  1. In the Submitter Portal accordion, select the option "Link files and samples" and click "Analysis Data".
  2. Start by selecting the sample(s) to be linked to the file, and populate the required attribute fields. Please note the existence of mandatory fields. These must be populated.
  3. Finally, select the file and file type to be associated with the sample. If you wish to add additional files, click the button "Add additional files".
  4. Your analysis will be created in draft status. To learn more about validating, editing or deleting the analysis, see the Submitter Portal video section above.

Points to notice

When populating the chromosome field (mandatory), please press the ENTER key after selecting the chromosome(s) in order to save your selection.

Submitter Portal - Guided Documentation

Login

The EGA Submitter Portal credentials are provided by the Helpdesk team when a submission account is requested.

[Image: Login]

Main page

When you log in to the Submitter Portal, you will see the main page, which lists your submissions:

[Image: Main Page]

The main page shows the open submissions in your ega-box. A submission can have a different status depending on the objects in it.

Top Right Buttons

Submissions

1) Submissions: Click this button to see all submissions in your ega-box.

[Image: Submissions]

Using the option circled in the image, you can filter your submissions by status.

Submitted Objects

2) Submitted objects: You can also see your objects (studies, samples, files, experiments, analyses, DACs, policies and datasets):

[Image: Submitted Objects]

For example, you can filter your samples by status:

[Image: Filter Samples]

You can also filter your samples by the following fields: Status, EGA ID, Alias, Subject ID, Updated, Created.

[Image: Submitted objects]

New Submission

3) New submission: Click this button when you need to start a new submission.

[Image: New submission]

IMPORTANT:

A submission has several tabs (one for each object).

When registering an object, some fields are mandatory (marked with a *). If these mandatory fields are not populated, you will not be able to save the object:

[Image: New submission]

As you can see, the ‘Save study’ button is greyed out because a mandatory field (Study type) is still empty. Once this field is filled in, the object can be saved by clicking on ‘Save study’.

[Image: Save Study]
[Image: Created Object]

There are several actions available for a created object:

ACTIONS:

Each object is linked to another object in one direction only. The map below shows how the objects in a submission are linked:

[Image: EGA Metadata]

For example, a study is not directly linked to a dataset. A dataset is linked to runs (a run is the linkage between samples and files). Runs are linked to experiments, and experiments are the objects directly linked to a study.
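The one-way linkage described above can be pictured as a small directed graph. The sketch below encodes only the links named in the preceding paragraph and walks them to show how a dataset reaches a study indirectly. The dictionary `DIRECT_LINKS` and the function `link_path` are hypothetical names for illustration, not part of any EGA software.

```python
from collections import deque

# Direct, one-way links as described above (dataset -> run -> experiment -> study).
DIRECT_LINKS = {
    "dataset": ["run"],
    "run": ["sample", "file", "experiment"],
    "experiment": ["study"],
}


def link_path(start, target):
    """Return the chain of objects leading from start to target, or None if there is none."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == target:
            return path
        for linked in DIRECT_LINKS.get(current, []):
            if linked not in seen:
                seen.add(linked)
                queue.append(path + [linked])
    return None


if __name__ == "__main__":
    print(link_path("dataset", "study"))
    print(link_path("study", "dataset"))  # links are one-way, so no path exists
```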

Reusing Registered Objects

At the EGA we strongly encourage reusing registered objects where needed. In each tab you will find a checkbox that displays all the objects in your ega-box. For instance, if you want to reuse an old study but need to register a new experiment, you will find the following checkbox in the experiment tab:

[Image: Reusing Registered Objects]

After ticking the ‘Show all box’s studies’ checkbox:

[Image: Reuse Studies]

The same applies when reusing a sample in the linkage with the files:

[Image: Reuse Samples]

Or when reusing a DAC for a policy:

[Image: Reuse Policies]

Or when reusing a policy for a dataset:

[Image: Reuse Dataset]

The same applies to other combinations of objects.

This means that if you already have previously submitted objects (via Webin or the Submitter Portal), you can reuse them without having to register them all over again.

How to submit all objects

To submit all objects (the whole submission) at once, click the ‘I’m done. Please, process this submission’ button on the first tab of the submission tab list:

[Image: Submit All Objects]

Troubleshooting

The following example shows what happens when you try to validate a run whose samples are not yet registered (submitted):

Sample:

[Image: Sample]

Experiment:

[Image: Experiment]

Run:

[Image: Run]

Click on validate and a message box will appear saying that the submission request was sent:

[Image: Validate]

After a few minutes the following message will appear:

[Image: Error Message]

These error messages state that validation failed because the referenced alias could not be found. This happens because the sample does not yet exist in our database (where the call to validate or submit your object is sent). If you submit your sample first (by clicking on the blue arrow on the sample object):

[Image: Troubleshooting]

and then click to submit the run, it will work this time (as the sample is now registered and added to our database):

[Image: Troubleshooting]

The experiment is also submitted automatically when its linked objects (sample and runs) are validated and submitted:

[Image: Troubleshooting]

IMPORTANT: If there are several runs in different statuses, the experiment will appear duplicated, once for each status. This is expected. The experiment will be submitted automatically once the submission is completed and all samples and runs are submitted.

Finally, to view the error messages, please go to the ‘Submission errors console’ tab:

[Image: Submission Errors Console]

Webin Tool

EGA Webin is an online tool that can be used to submit metadata (affiliated to sequence files) to the EGA. It can also be used to register the Study (EGAS), Data Access Committee (DAC) and Policy (EGAP) objects for all array-based submissions.

The Webin platform is a historical tool that preceded the current Submitter Portal. It was developed and used to register metadata affiliated to sequence files. Detailed documentation about Webin

Click on the links below for guides on submitting specific metadata using Webin:

Read (unaligned/raw)

Analysis: Aligned (BAM)

Analysis: Variant (VCF)

Complete Genomics

Array Based

Phenotype

Webin will not be maintained further. However, it can be used as a backup tool if there is any issue with your submission via the Submitter Portal. Please contact the EGA Helpdesk with any related queries.