Submitting array-based metadata

The metadata required for an array-based submission must be provided using the EGA Submitter Portal and by completing the Array-based format (AF) spreadsheet. The guidelines for this workflow are described on this page.

**Metadata submitted as XML files or through the EGA Submitter Portal will be made publicly available to view on the EGA website and on other resource/partner websites.**

Please note that all files should be encrypted and uploaded prior to the processing of your EGA-Array-based-Format document.

Metadata Model

Array Based Metadata

Registering Metadata

Use the Submitter Portal to register your Study, Samples, Data Access Committee (DAC) and Policy. This online interface enables you to create new and edit existing submissions.

Go to the EGA Submitter Portal page and log in using your assigned ega-box and password.

Registering Study

To use the study accession number in a publication, we suggest the following format:

  "Sequence data has been deposited at the European Genome-phenome Archive (EGA), which is hosted by the EBI and the CRG, under accession number EGASXXXXXXXXXXX.
Further information about EGA can be found at https://ega-archive.org and in 'The European Genome-phenome Archive of human data consented for biomedical research' (http://www.nature.com/ng/journal/v47/n7/full/ng.3312.html)."

Registering Samples

Registering Data Access Committee

Further information on the role of your DAC.

Registering Policy

Your Data Access Policy provides the terms and conditions of data use. This is also referred to as the Data Access Agreement (DAA).

Completion of a DAA by the applicant/s should form part of the application process to the Data Access Committee (DAC).

Complete the Array-based format (AF) spreadsheet

Once you have completed the registration of your Study, DAC and Policy using the Submitter Portal, you must then complete and return the AF spreadsheet.

The AF spreadsheet consists of four components: Webin accessions, Samples & phenotypes, Datasets, and Data files.

Should further assistance be required after going through the guide below, please do not hesitate to contact the EGA helpdesk.

Once the AF spreadsheet is populated, please send it to our EGA helpdesk for further validation.

AF spreadsheet: Webin accessions

Should your submission require multiple DACs or policies, use ‘ ; ’ to separate the accession numbers.

Accessions

AF spreadsheet: Samples & phenotypes

Samples and phenotypes

AF spreadsheet: Datasets

We suggest that each dataset consists of a common set of data. The example below consists of two datasets, grouped according to shared data type, technology and by case/control.

We also capture the number of unique samples that make up each dataset, the Data Access Committee (DAC) responsible for the named dataset, and its policy (EGAP).

Datasets
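The "number of unique samples" for each dataset can be derived from your sample-to-dataset links before you fill in the Datasets tab. The following is a minimal sketch, not an EGA tool: the function name, the sample aliases and the dataset labels are hypothetical and only illustrate the counting.

```python
from collections import defaultdict


def unique_sample_counts(links):
    """Count distinct samples per dataset from (sample, dataset) pairs."""
    samples_by_dataset = defaultdict(set)
    for sample, dataset in links:
        # A set ignores repeated rows for the same sample.
        samples_by_dataset[dataset].add(sample)
    return {dataset: len(samples) for dataset, samples in samples_by_dataset.items()}


# Hypothetical links: SAMPLE_A appears twice in "cases" (e.g. two files).
links = [
    ("SAMPLE_A", "cases"),
    ("SAMPLE_B", "cases"),
    ("SAMPLE_A", "cases"),
    ("SAMPLE_C", "controls"),
]
counts = unique_sample_counts(links)
```

Here `counts` maps each dataset label to its number of distinct samples, so a sample listed on several rows is only counted once.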

AF spreadsheet: Data files

What follows is an example of how to map your samples to the array based files added to your upload account (4th tab).

Data Files

Below are some practical examples of how to register the links between samples and files.

Case 1) One sample or a list of samples in different datasets:

Data Files

If you have a list of samples that belong to different datasets, repeat the sample accession number/s in the first column and link the sample to the corresponding dataset on each row.

Each row represents one link between a sample, a file and a dataset.
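The "one row per sample-file-dataset link" rule can be sketched as a small expansion step. This is only an illustration under assumptions: the function name, the dictionary keys and all accessions and filenames below are hypothetical, and the real column layout should be taken from the AF spreadsheet itself.

```python
def linkage_rows(sample_datasets, sample_files):
    """Build one row per (sample, dataset) pair, repeating the sample each time."""
    rows = []
    for sample, datasets in sample_datasets.items():
        for dataset in datasets:
            rows.append(
                {"sample": sample, "file": sample_files[sample], "dataset": dataset}
            )
    return rows


# Hypothetical input: SAMPLE_A belongs to two datasets, SAMPLE_B to one.
sample_datasets = {
    "SAMPLE_A": ["dataset_1", "dataset_2"],
    "SAMPLE_B": ["dataset_1"],
}
sample_files = {"SAMPLE_A": "a.gpg", "SAMPLE_B": "b.gpg"}

rows = linkage_rows(sample_datasets, sample_files)
```

SAMPLE_A produces two rows, one per dataset, which mirrors repeating the sample accession in the first column.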

Case 2) One sample linked to several files:

Data Files

To add multiple files to one sample you MUST use “ ; ” between filenames. Example: file1.gpg;file2.gpg;file3.gpg

If you want to add an extra file to the sample (phenotype or .Rdata), please use the “Additional files” column.
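When a sample has many files, the cell value can be generated rather than typed by hand. The sketch below assumes the separator is a bare semicolon, as in the example file1.gpg;file2.gpg;file3.gpg; the helper name `files_cell` is hypothetical.

```python
def files_cell(filenames):
    """Join filenames into a single spreadsheet cell separated by ';'."""
    cleaned = [name.strip() for name in filenames]
    for name in cleaned:
        # An empty name or a name containing ';' would corrupt the list.
        if not name or ";" in name:
            raise ValueError(f"invalid filename: {name!r}")
    return ";".join(cleaned)


cell = files_cell(["file1.gpg", "file2.gpg", "file3.gpg"])
```

The same helper could be reused for the semicolon-separated accession lists, if your spreadsheet expects the same separator there.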

Important note: You MUST upload the encrypted and unencrypted md5sum values of all files uploaded to your submission account, using the filename nomenclature (file.gpg, file.md5, file.md5.gpg). Your submission will not be processed unless md5 values are supplied for all files in the CORRECT format.
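An md5 value can be computed with any standard tool. As a minimal sketch, the Python standard library can do it as shown below. The helper names `md5_hex` and `write_md5` are hypothetical, the demo file is a stand-in, and which checksum belongs in which named file should be checked against the nomenclature above rather than inferred from this example.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_hex(path, chunk_size=1 << 20):
    """Return the md5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_md5(source, target):
    """Write the md5 hex digest of `source` into the text file `target`."""
    target = Path(target)
    target.write_text(md5_hex(source) + "\n")
    return target


# Demo on a throwaway file; a real run would point at your data files.
with tempfile.TemporaryDirectory() as tmp:
    plain = Path(tmp) / "file"
    plain.write_bytes(b"hello")
    checksum = md5_hex(plain)
    md5_file = write_md5(plain, Path(tmp) / "file.md5")
    written = md5_file.read_text().strip()
```

Running the same function on the encrypted file gives the second checksum; reading in chunks keeps memory use flat for large array files.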

What happens after the submission of a dataset?

All datasets affiliated to unreleased studies are automatically placed on hold until the authorised submitter or DAC contact instructs our EGA helpdesk to release the study.

Datasets affiliated to released studies will automatically be released.

When your study is released, the named DAC contacts will be given access to the EGA DAC admin tools to create and manage EGA accounts with access permissions to the dataset/s affiliated to the study.

Further information regarding the role of the Data Access Committee.

Finally, your data is archived within our databases and prepared for encrypted distribution upon the request of permitted EGA account holders.

We strongly advise you NOT to delete your data until we confirm that your data has been successfully archived.