Reporting Missing Values
The International Nucleotide Sequence Database Collaboration (INSDC) has a standardised missing/null value reporting language, to be used where a value of the expected format cannot be provided for sample metadata.
The controlled vocabulary takes different types of constraints into account. Submitters are strongly encouraged to always provide true values. However, if missing/null value reporting is required, submitters are asked to use the most granular term that fits their situation. See the table below for accepted missing value reporting terms.
Note
If your sample metadata does not provide enough context for your data to be easily interpreted, you may be requested to update your samples, so it is important to ensure you provide any metadata available and only use missing value terms when necessary.
INSDC Missing Value Reporting Terms
| INSDC term (top level) | INSDC term (lower level) | Definition | INSDC term (reporting level) | Definition |
|---|---|---|---|---|
| not applicable | | Information is inappropriate to report; can indicate that the standard itself fails to model or represent the information appropriately | control sample | Information is not applicable as the sample represents a negative control sample collected in a lab |
| | | | sample group | Information is not applicable as the sample represents a group of samples that do not have a single origin, e.g. for co-assembly or transcriptome assembly |
| missing | not collected | Information of an expected format was not given because it has not been collected | synthetic construct | Information does not exist as the sample represents an ab initio synthetic construct |
| | | | lab stock | Information was not collected as the sample represents a cultured cell line or model organism under long-term lab control |
| | | | third party data | Information does not exist as the metadata was not collected or reported in records predating the 2023 agreement. For use in Third Party data submissions |
| | not provided | Information of an expected format was not given; a value may be given at a later stage | data agreement established pre-2023 | Data agreements were established before the 2023 INSDC standard and metadata cannot be provided. A value may be given at a later stage |
| | restricted access | Information exists but cannot be released openly because of privacy concerns | endangered species | Information cannot be reported as the target organism is endangered, e.g. on the IUCN Red List |
| | | | human-identifiable | Information cannot be reported as the metadata would make the sample human-identifiable |
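The hierarchy in the table above can be captured as a small lookup structure, which makes it easier to see which broader terms each reporting-level term refines. The Python sketch below is illustrative only: `INSDC_MISSING_TERMS` and `parents_of` are hypothetical names, not part of any INSDC or ENA tooling, and the terms are copied from the table.

```python
# Hypothetical lookup mirroring the INSDC hierarchy in the table above.
# Keys are reporting-level terms; values are (top-level, lower-level) parents.
# The "not applicable" branch lists no lower-level term, so None is used.
INSDC_MISSING_TERMS = {
    "control sample": ("not applicable", None),
    "sample group": ("not applicable", None),
    "synthetic construct": ("missing", "not collected"),
    "lab stock": ("missing", "not collected"),
    "third party data": ("missing", "not collected"),
    "data agreement established pre-2023": ("missing", "not provided"),
    "endangered species": ("missing", "restricted access"),
    "human-identifiable": ("missing", "restricted access"),
}


def parents_of(reporting_term: str) -> tuple:
    """Return the (top-level, lower-level) terms for a reporting-level term."""
    try:
        return INSDC_MISSING_TERMS[reporting_term]
    except KeyError:
        raise ValueError(
            f"not an INSDC reporting-level term: {reporting_term!r}"
        ) from None


print(parents_of("lab stock"))  # ('missing', 'not collected')
```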
Usage of INSDC Missing Value Reporting Terms
Please use the above standardised missing value vocabulary only when a true value of the expected format is missing for a mandatory field. If a true value is missing for a recommended or optional field, leave that field out of the report entirely. When reporting a missing mandatory value, each of the eight granular ‘reporting level’ terms must be preceded by the prefix missing: to declare both the absence of a true value and the reason for it.
Example of usage:
geographic location (country and/or sea): missing: data agreement established pre-2023
collection date: missing: control sample
geographic location (country and/or sea): missing: human-identifiable
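The prefix rule described above can be checked mechanically before submission. The following is a minimal Python sketch, assuming each metadata value is a plain string; `REPORTING_TERMS`, `format_missing` and `is_valid_missing` are hypothetical names chosen for illustration and are not an official INSDC or ENA validator. It covers only the prefixed reporting-level form shown in the examples, not any other values a checklist may accept.

```python
# The eight reporting-level terms from the INSDC table.
REPORTING_TERMS = frozenset({
    "control sample",
    "sample group",
    "synthetic construct",
    "lab stock",
    "third party data",
    "data agreement established pre-2023",
    "endangered species",
    "human-identifiable",
})

# Prefix that must precede every reporting-level term (assumed to be
# written with a single space after the colon, as in the examples).
PREFIX = "missing: "


def format_missing(reporting_term: str) -> str:
    """Build the value for a mandatory field that has no true value."""
    if reporting_term not in REPORTING_TERMS:
        raise ValueError(f"unknown reporting-level term: {reporting_term!r}")
    return PREFIX + reporting_term


def is_valid_missing(value: str) -> bool:
    """Check that a value is the prefix followed by a reporting-level term."""
    return value.startswith(PREFIX) and value[len(PREFIX):] in REPORTING_TERMS


print(format_missing("control sample"))        # missing: control sample
print(is_valid_missing("missing: lab stock"))  # True
print(is_valid_missing("lab stock"))           # False (prefix is absent)
```

Keeping the terms in a single set means a misspelled term, such as one with a stray hyphen, is rejected rather than silently submitted.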