Submit Raw Reads Interactively
Read files in ENA are contained by run objects, which point to the location of the file in an FTP directory. A run is always linked with one experiment object, which describes the library preparation and sequencing protocol. Experiments are linked with one sample and one study, as shown in the metadata model diagram:
Before you register the run and experiment objects through the Webin Portal, you should have completed the following steps:
To start your read submission, log in to the Webin Portal. You will need to complete three steps, described below:
- Select and customise a read submission template spreadsheet
- Fill out the template spreadsheet
- Validate and submit the template spreadsheet
If you are unsure, you are encouraged to try out your submission in the test version of the Webin Portal. All submissions to the test version are overwritten within 24 hours, so nothing you submit there has lasting consequences. To confirm that you are in the test environment, always check that the URL contains ‘wwwdev’.
Step 1: Select A Read Spreadsheet
To begin, log in to the Webin Portal and select the ‘Submit Reads’ button.
- Expand the ‘Download spreadsheet template for Read submission’ section by clicking it
- Carefully review the list of submittable read file formats and choose the one which applies to your submission. Note that there are different options for single and paired FASTQ files
- Click the appropriate file format to advance and see its list of attributes
- Review the included attributes, their meanings, and requirements; use the ‘Show Description’ box to see more information about an attribute
- Expand the ‘Optional Fields’ section and check the boxes next to the non-default attributes to include them
- Click ‘Next’ when you are ready to continue; you can return to this interface later to review attribute meanings and requirements
- Click ‘Download TSV Template’ to acquire a copy of your customised template spreadsheet
Step 2: Complete The Template Spreadsheet
Once you have downloaded the template spreadsheet, you should open it in an appropriate spreadsheet editing program, such as Microsoft Excel or Google Sheets. Consider the following tips as you complete your spreadsheet:
- Each row of your spreadsheet should describe the files and metadata for exactly one experiment/run pair
- Submissions must always be de-multiplexed
- Return to the interface in the previous step to review the meanings of attributes and any requirements they have
- The study and sample fields can be filled out with either ENA or BioStudies/BioSamples accessions
- The file name fields must exactly match the name of a file in your account’s upload area
- If you created subdirectories for your files in the upload area, include the full path in the file name field
- The MD5 fields should be filled with file MD5 values, which are 32-digit hexadecimal numbers. The MD5 value is a fingerprint of the file which allows us to verify that it was uploaded successfully. On Linux, you can generate this value from the command line by running ‘md5sum’ on the file, and on macOS by running ‘md5’; Microsoft has a support article on performing this activity for Windows.
- Do not edit the existing column names
- Use only valid ASCII characters
- The file you submit must have one of the following extensions: .csv, .tsv, .tab, .txt
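If you would rather compute the MD5 values for the tips above in a script than on the command line, the following is a minimal sketch using only the Python standard library. The function name `md5_of_file` and the example file name are illustrative assumptions, not part of the Webin Portal.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 checksum of a file as a 32-character hex string."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read in chunks so that large read files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example call (hypothetical file name):
# print(md5_of_file("reads_1.fastq.gz"))
```

The returned string can be pasted directly into the matching MD5 field of the spreadsheet.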
Once you are satisfied that your spreadsheet content is complete, save the file and move on to the final step.
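Before uploading, some of the tips above can be checked locally. The sketch below is only an illustration and does not replace the portal's own validation. It rests on several assumptions: that the first row of the saved file holds the column names, that MD5 columns can be recognised by "md5" appearing in their name, and that extensions can be compared case-insensitively. Adjust it if your customised template is laid out differently. The function name `check_spreadsheet` is hypothetical.

```python
import csv
import io
import re

ALLOWED_EXTENSIONS = (".csv", ".tsv", ".tab", ".txt")
MD5_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def check_spreadsheet(file_name, text, delimiter="\t"):
    """Return a list of problems found in a filled template (empty if none).

    Assumes row 1 holds the column names and that MD5 columns contain
    "md5" in their name; both are assumptions, not portal rules.
    """
    problems = []
    if not file_name.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append(f"{file_name}: extension is not one of {ALLOWED_EXTENSIONS}")
    if not text.isascii():
        problems.append("spreadsheet contains non-ASCII characters")

    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        problems.append("spreadsheet is empty")
        return problems

    header = rows[0]
    for line_no, row in enumerate(rows[1:], start=2):
        for column, value in zip(header, row):
            # Only MD5-like columns are checked for the 32-hex-digit format.
            if "md5" in column.lower() and not MD5_PATTERN.fullmatch(value.strip()):
                problems.append(f"row {line_no}, column {column!r}: not a 32-digit hex MD5")
    return problems
```

A typical use would be to read the saved file as text, pass its name and contents to `check_spreadsheet`, and fix anything it reports before moving on.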
Step 3: Submit The Template Spreadsheet
Return to the ‘Submit Reads’ interface in Webin Portal. This time, expand the ‘Upload filled spreadsheet template for Read submission’ section.
Select the ‘Browse’ option or click-and-drag the file onto this section. Then, click the ‘Submit Completed Spreadsheet’ button to have your file validated and submitted.
Should metadata validation fail, you will receive a pop-up with an error message. If the content of the error message is unclear, please contact the helpdesk.
If metadata validation is successful, you will receive a pop-up confirming this and listing the assigned experiment and run accessions. Your submitted data files will then enter a processing pipeline which checks their validity before moving them to an archive. If there are file errors, these will be reported to the account holders at the registered email address(es). You can always check the processing status of your submissions via the run reports available in Webin Portal.
See Webin Portal Reports for advice on retrieving information about these submissions.