Accession Numbers

Submissions to ENA result in accession numbers. The rules that define the format of each accession type are described below, alongside examples. These accessions can be used to identify each unique part of your submission.

Please note, not all accessions become available in the browser and not all can be used in publications. For information on which accessions can be described in publications, see the guidelines at the bottom of this page.

Understanding these accessions can give you some information about what they refer to, even before you find them in our browser. For example, in the case of studies, samples, experiments, runs and analyses, you can identify which INSDC partner accepted the original submission by looking at the first letter: ‘E’ for ENA, ‘D’ for DDBJ, or ‘S’ for NCBI.
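As a minimal sketch of the first-letter rule above, the following Python snippet reports which INSDC partner accepted a study, sample, experiment, run or analysis. The function name `accepting_partner` and the combined pattern are illustrative assumptions, not part of any ENA tool; the pattern is assembled from the accession formats listed on this page.

```python
import re

# Combined pattern for study (RP), sample (RS), experiment (RX),
# run (RR) and analysis (RZ) accessions; an assumption built from
# the formats listed on this page.
_RAW_DATA_ACCESSION = re.compile(r"([EDS])R[PSXRZ][0-9]{6,}")

# First-letter mapping as described in the text.
_PARTNER_BY_LETTER = {"E": "ENA", "D": "DDBJ", "S": "NCBI"}


def accepting_partner(accession):
    """Return the INSDC partner that accepted the submission, or None
    if the accession is not one of the five types covered here."""
    match = _RAW_DATA_ACCESSION.fullmatch(accession)
    if match is None:
        return None
    return _PARTNER_BY_LETTER[match.group(1)]
```

Project and BioSamples accessions are deliberately not handled here: their formats carry the partner letter in a different position and use a different letter set (E, D or N).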

Accession Type                 Accession Format              Example
-----------------------------  ----------------------------  ---------------
Projects                       PRJ(E|D|N)[A-Z][0-9]+         PRJEB12345
Studies                        (E|D|S)RP[0-9]{6,}            ERP123456
BioSamples                     SAM(E|D|N)[A-Z]?[0-9]+        SAMEA123456
Samples                        (E|D|S)RS[0-9]{6,}            ERS123456
Experiments                    (E|D|S)RX[0-9]{6,}            ERX123456
Runs                           (E|D|S)RR[0-9]{6,}            ERR123456
Analyses                       (E|D|S)RZ[0-9]{6,}            ERZ123456
Assemblies                     GCA_[0-9]{9}\.[0-9]+          GCA_123456789.1

Assembled/Annotated Sequences  [A-Z]{1}[0-9]{5}\.[0-9]+      A12345.1
(including contig, scaffold    [A-Z]{2}[0-9]{6}\.[0-9]+      AB123456.1
and chromosome sequences       [A-Z]{2}[0-9]{8}              AB12345678
generated from an assembly     [A-Z]{4}[0-9]{2}S?[0-9]{6,8}  ABCD01123456
submission)                    [A-Z]{6}[0-9]{2}S?[0-9]{7,9}  ABCDEF011234567

Protein Coding Sequences       [A-Z]{3}[0-9]{5}\.[0-9]+      ABC12345.1
                               [A-Z]{3}[0-9]{7}\.[0-9]+      ABC1234567.1
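The formats above are regular expressions, so they can be used directly to work out what kind of accession a string is. The sketch below is one possible way to do that in Python; the names `ACCESSION_PATTERNS` and `classify_accession` are hypothetical. It assumes that the dot before a version number is meant literally (so it is escaped in every pattern) and that an accession must match a pattern in full.

```python
import re

# (type, pattern) pairs taken from the accession format list.
# Order matters: the prefixed formats (PRJ..., SAM..., GCA_...) are
# checked first because a few strings could also satisfy one of the
# more general sequence patterns further down.
ACCESSION_PATTERNS = [
    ("Project", r"PRJ(E|D|N)[A-Z][0-9]+"),
    ("Study", r"(E|D|S)RP[0-9]{6,}"),
    ("BioSample", r"SAM(E|D|N)[A-Z]?[0-9]+"),
    ("Sample", r"(E|D|S)RS[0-9]{6,}"),
    ("Experiment", r"(E|D|S)RX[0-9]{6,}"),
    ("Run", r"(E|D|S)RR[0-9]{6,}"),
    ("Analysis", r"(E|D|S)RZ[0-9]{6,}"),
    ("Assembly", r"GCA_[0-9]{9}\.[0-9]+"),
    ("Assembled/Annotated Sequence", r"[A-Z]{1}[0-9]{5}\.[0-9]+"),
    ("Assembled/Annotated Sequence", r"[A-Z]{2}[0-9]{6}\.[0-9]+"),
    ("Assembled/Annotated Sequence", r"[A-Z]{2}[0-9]{8}"),
    ("Assembled/Annotated Sequence", r"[A-Z]{4}[0-9]{2}S?[0-9]{6,8}"),
    ("Assembled/Annotated Sequence", r"[A-Z]{6}[0-9]{2}S?[0-9]{7,9}"),
    ("Protein Coding Sequence", r"[A-Z]{3}[0-9]{5}\.[0-9]+"),
    ("Protein Coding Sequence", r"[A-Z]{3}[0-9]{7}\.[0-9]+"),
]

_COMPILED = [(name, re.compile(pattern)) for name, pattern in ACCESSION_PATTERNS]


def classify_accession(accession):
    """Return the first accession type whose pattern matches the whole
    string, or None if no pattern matches."""
    for name, pattern in _COMPILED:
        if pattern.fullmatch(accession):
            return name
    return None
```

For example, `classify_accession("GCA_123456789.1")` returns `"Assembly"`, and a string that fits none of the formats returns `None`. A match only shows that the string is well formed; it does not show that the accession exists in ENA.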

For Project and Sample registration, you will receive two accessions for each submission: a Project accession and a Study accession for each project, and a Sample accession and a BioSamples accession for each sample.

Every Project in ENA has a secondary Study accession. Before ENA was combined into a single archive, it consisted of separate archives for raw data (ERA, which used the Study accession) and for constructed assemblies/sequences (EMBL-Bank, which used the Project accession). The Project and Study in ENA have since been merged, and you may find the terms used interchangeably for both types of accession number, as both can be used to access the same data in the browser. To remain compatible with the other INSDC partners' services, which remain separate archives for raw data/assemblies/sequences, we continue to provide both accessions on registration.

All Samples registered with ENA are mirrored in the EMBL-EBI BioSamples service. This is why every sample registered with ENA receives a BioSamples accession as well as a standard ENA Sample accession. All ENA-registered samples can be viewed in BioSamples using their BioSamples accession.

How to cite your ENA study

In all cases, cite the top-level Project accession together with a link to where the data can be found in the browser, for example:

“the data for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEBxxxx (https://www.ebi.ac.uk/ena/browser/view/PRJEBxxxx).”
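If you prepare data availability statements for several projects, the wording above can be filled in programmatically. The following Python sketch is only an illustration: the function name `data_availability_statement` is hypothetical, the browser URL follows the form shown in the example sentence, and the check uses the Project accession format given on this page.

```python
import re

# Project accession format as listed on this page.
_PROJECT_PATTERN = re.compile(r"PRJ(E|D|N)[A-Z][0-9]+")

# Browser link in the form used by the example sentence.
_BROWSER_URL = "https://www.ebi.ac.uk/ena/browser/view/{accession}"


def data_availability_statement(project_accession):
    """Build the suggested citation sentence for a Project accession.

    Raises ValueError if the string does not look like a Project
    accession (for example, if a Study accession is passed instead).
    """
    if not _PROJECT_PATTERN.fullmatch(project_accession):
        raise ValueError(f"not a Project accession: {project_accession!r}")
    url = _BROWSER_URL.format(accession=project_accession)
    return (
        "The data for this study have been deposited in the European "
        "Nucleotide Archive (ENA) at EMBL-EBI under accession number "
        f"{project_accession} ({url})."
    )
```

Calling `data_availability_statement("PRJEB12345")` produces the sentence with both the accession and its browser link filled in.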

If the top-level accession is not suitable, for example because you have multiple publications that each reference individual components within a single ENA project (so that the project accession alone would be ambiguous), then the following accession types may also be cited in publications:

  • Assemblies

  • BioSamples (in the context of associated data)

  • Assembled/Annotated Sequences

You can now claim your ENA studies to your ORCID ID. Please see here for more information.