Module 4: Update a Study using REST API

Updating a study in the ENA through the REST API is almost identical to submitting a new one. The first step is to obtain the original study in XML format, which can be tricky if you did not submit the project through the REST API to begin with. Note that Webin already has good study editing functionality:

[Image: Webin project edit]

However, learning to use the REST API with a simple project object can pave the way for submitting and updating more complicated objects such as samples, experiments and runs. The REST API is also more practical than Webin for making edits in bulk (to many projects).

Step 1: Get hold of the study in XML format

If you used the REST API to submit the study in the first place, you can reuse the XML files from that submission.

If you don’t have an XML file containing the study, you can copy the public version by appending &display=xml to the address of the study page. For example, http://www.ebi.ac.uk/ena/data/view/PRJEB5932&display=xml. Note that the web version contains additional blocks that were not part of the original XML, as well as parts that were added automatically. These can be removed for the purpose of updating (they will be added again automatically). For example, the web version XML below can be cleaned up so that it looks like the submitted version that follows it.

Web Version

<?xml version="1.0" encoding="UTF-8"?>
<ROOT request="PRJEB14252&amp;display=xml">
<PROJECT alias="ena-STUDY-klanvin-03-06-2016-07:54:42:301-120" center_name="klanvin" accession="PRJEB14252" first_public="2016-08-02+01:00">
     <IDENTIFIERS>
          <PRIMARY_ID>PRJEB14252</PRIMARY_ID>
          <SECONDARY_ID>ERP015887</SECONDARY_ID>
          <SUBMITTER_ID namespace="klanvin">ena-STUDY-klanvin-03-06-2016-07:54:42:301-120</SUBMITTER_ID>
     </IDENTIFIERS>
     <NAME>Cheddar cheese</NAME>
     <TITLE>Characterization of Microbial Diversity and Chemical Properties of Cheddar Cheese Prepared from Heat-treated Milk</TITLE>
     <DESCRIPTION>This study aimed to characterize the interaction of microbial diversity and chemical properties of Cheddar cheese after three different heat treatments of milk; low temperature/long time (LTLT), thermization, and high temperature/short time (HTST). Cheese obtained from LTLT-treated milk (LC) and thermized milk (TC) .... </DESCRIPTION>
     <SUBMISSION_PROJECT>
          <SEQUENCING_PROJECT>
               <LOCUS_TAG_PREFIX>BN8055</LOCUS_TAG_PREFIX>
          </SEQUENCING_PROJECT>
     </SUBMISSION_PROJECT>
     <PROJECT_LINKS>
          <PROJECT_LINK>
               <XREF_LINK>
                    <DB>ENA-SUBMISSION</DB>
                    <ID>ERA645775</ID>
               </XREF_LINK>
          </PROJECT_LINK>
          <PROJECT_LINK>
               <XREF_LINK>
                    <DB>ENA-FASTQ-FILES</DB>
                    <ID><![CDATA[http://www.ebi.ac.uk/ena/data/warehouse/filereport?accession=PRJEB14252&result=read_run&fields=run_accession,fastq_ftp,fastq_md5,fastq_bytes]]></ID>
               </XREF_LINK>
          </PROJECT_LINK>
          <PROJECT_LINK>
               <XREF_LINK>
                    <DB>ENA-SUBMITTED-FILES</DB>
                    <ID><![CDATA[http://www.ebi.ac.uk/ena/data/warehouse/filereport?accession=PRJEB14252&result=read_run&fields=run_accession,submitted_ftp,submitted_md5,submitted_bytes,submitted_format]]></ID>
               </XREF_LINK>
          </PROJECT_LINK>
     </PROJECT_LINKS>
     <PROJECT_ATTRIBUTES>
          <PROJECT_ATTRIBUTE>
               <TAG>ENA-FIRST-PUBLIC</TAG>
               <VALUE>2016-08-02</VALUE>
          </PROJECT_ATTRIBUTE>
          <PROJECT_ATTRIBUTE>
               <TAG>ENA-LAST-UPDATE</TAG>
               <VALUE>2016-06-03</VALUE>
          </PROJECT_ATTRIBUTE>
     </PROJECT_ATTRIBUTES>
</PROJECT>
</ROOT>

Submitted version

<?xml version="1.0" encoding="US-ASCII"?>
<PROJECT_SET>
  <PROJECT center_name="klanvin" accession="PRJEB14252">
    <NAME>Cheddar cheese</NAME>
    <TITLE>Characterization of Microbial Diversity and Chemical Properties of Cheddar Cheese Prepared from Heat-treated Milk</TITLE>
    <DESCRIPTION>This study aimed to characterize the interaction of microbial diversity and chemical properties of Cheddar cheese after three different heat treatments of milk; low temperature/long time (LTLT), thermization, and high temperature/short time (HTST). Cheese obtained from LTLT-treated milk (LC) and thermized milk (TC) .... </DESCRIPTION>
    <SUBMISSION_PROJECT>
      <SEQUENCING_PROJECT>
        <LOCUS_TAG_PREFIX>BN8055</LOCUS_TAG_PREFIX>
      </SEQUENCING_PROJECT>
    </SUBMISSION_PROJECT>
  </PROJECT>
</PROJECT_SET>

The submitted version is much shorter. I even removed the unique alias: now that the object has an accession number, the server does not need both the alias and the accession number to identify the object that is being overwritten.
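
The clean-up described above can also be scripted, which helps when many projects need editing. The following is a minimal Python sketch, not an official ENA tool. It assumes that the only generated parts are the IDENTIFIERS, PROJECT_LINKS and PROJECT_ATTRIBUTES blocks and the alias and first_public attributes, as in the Cheddar cheese example. The function name clean_web_project is hypothetical.

```python
import xml.etree.ElementTree as ET

# Blocks present in the web version but absent from the submitted version
# in the example above (assumption: these are the only generated blocks).
GENERATED_ELEMENTS = ("IDENTIFIERS", "PROJECT_LINKS", "PROJECT_ATTRIBUTES")
# Attributes dropped in the submitted version of the example above.
GENERATED_ATTRIBUTES = ("alias", "first_public")


def clean_web_project(web_xml: str) -> str:
    """Convert the <ROOT>-wrapped web view of a project into a <PROJECT_SET>."""
    root = ET.fromstring(web_xml)
    project = root if root.tag == "PROJECT" else root.find(".//PROJECT")
    if project is None:
        raise ValueError("no <PROJECT> element found")

    # Remove the automatically generated child blocks.
    for tag in GENERATED_ELEMENTS:
        for child in project.findall(tag):
            project.remove(child)
    # Remove the attributes that are not needed once an accession is present.
    for name in GENERATED_ATTRIBUTES:
        project.attrib.pop(name, None)

    # Wrap the cleaned project in the container used for submission.
    project_set = ET.Element("PROJECT_SET")
    project_set.append(project)
    return ET.tostring(project_set, encoding="unicode")
```

Calling clean_web_project on the text of the web version should leave the center_name and accession attributes together with the NAME, TITLE, DESCRIPTION and SUBMISSION_PROJECT elements. Check the result by eye before submitting it.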

ERP version

If your study is not public yet and you do not have it in XML format, you can try the submit/drop-box/ REST endpoint. Log in with your Webin ID and password and click on ‘STUDY’. You will see a list of studies submitted from your account, and you can view the XML for each by selecting the study and then clicking ‘xml’.

[Image: rest endpoint]

[Image: xml study]

Studies obtained from this resource are actually different, as you may have noticed. Previously, a study in the read domain had an accession like ERP000001, whereas a project object (used for registering genome assemblies, among other things) had an accession like PRJEB0001. We no longer officially distinguish between the two objects: the PRJEB type is exposed more, while the ERP type is kept for legacy reasons. You can edit either the PRJEB type or the ERP type and most attributes will be carried over to the other one. Similarly, when you create a PRJEB type project, an ERP project is created automatically (and vice versa).

Step 2: Create a submission XML file

As with submitting a new study (see module 1), updating an existing study requires a submission object to accompany the study XML. You may have one from a previous submission or update, but it is also very quick to create.

<?xml version="1.0" encoding="UTF-8"?>
<SUBMISSION alias="cheese_update" center_name="">
   <ACTIONS>
      <ACTION>
         <MODIFY source="project.xml" schema="project"/>
      </ACTION>
   </ACTIONS>
</SUBMISSION>

Make sure that you give the submission object a unique alias (which can be any string) and fill in the center_name for your account (you can find this in the “my account details” drop-down menu inside Webin).

If you are updating the ‘ERP’ version of the project (see above) you also need to specify this in the submission XML by changing schema="project" to schema="study" because the ERP style objects use a different schema.

The important part of this submission object is the <MODIFY> tag. Contrast this with <ADD>, the tag used to submit an object for the first time (in module 1). The <MODIFY> tag tells the REST server that we are updating an existing object instead of adding a new one.
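
When updating many studies it can be convenient to generate the submission XML rather than write it by hand. The following Python sketch builds the same structure as the example above. The function name build_modify_submission and its parameters are hypothetical. The schema argument switches between "project" (PRJEB style) and "study" (ERP style) as described earlier.

```python
import xml.etree.ElementTree as ET


def build_modify_submission(alias: str, center_name: str,
                            source: str = "project.xml",
                            schema: str = "project") -> str:
    """Return a submission XML string containing a single MODIFY action."""
    # Only the two schemas discussed in this module are accepted here.
    if schema not in ("project", "study"):
        raise ValueError("schema must be 'project' or 'study'")

    submission = ET.Element("SUBMISSION", alias=alias, center_name=center_name)
    actions = ET.SubElement(submission, "ACTIONS")
    action = ET.SubElement(actions, "ACTION")
    # MODIFY (rather than ADD) marks this as an update of an existing object.
    ET.SubElement(action, "MODIFY", source=source, schema=schema)
    return ET.tostring(submission, encoding="unicode")


if __name__ == "__main__":
    # "klanvin" is the center name from the example study above.
    print(build_modify_submission("cheese_update", "klanvin"))
```

The alias still has to be unique for every submission, so in a bulk run you would typically vary it, for example by adding the accession being updated.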

Step 3: Make the edit and send to ENA

Now you can make changes to the study object contained in the XML file. For example, as a test you might try modifying the title or the description.

The final step is identical to submitting a study for the first time in module 1. Send the submission XML and the study XML to the ENA REST server using cURL or the webform, and you should receive a receipt in XML format. If the receipt contains success="true", your edit has been committed to the database. If not, check the error message(s), correct the XML and repeat.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="receipt.xsl"?>
<RECEIPT receiptDate="2017-07-17T13:22:11.020+01:00" submissionFile="sub.xml" success="true">
    <PROJECT accession="PRJEB14252" alias="ena-STUDY-klanvin-03-06-2016-07:54:42:301-120"
        status="PUBLIC"/>
    <SUBMISSION accession="" alias="cheese_update"/>
    <ACTIONS>MODIFY</ACTIONS>
</RECEIPT>
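
If you are sending many updates, checking each receipt by eye becomes tedious. The following Python sketch reads the success attribute of a receipt like the one above. It also collects the text of any ERROR elements, which is an assumption about how failed receipts are laid out, since this module only shows a successful one. The function name check_receipt is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_receipt(receipt_xml: str):
    """Return (success, errors) for a receipt returned by the REST server."""
    receipt = ET.fromstring(receipt_xml)
    # success="true" means the edit was committed to the database.
    success = receipt.get("success") == "true"
    # Assumption: error messages appear as <ERROR> elements in the receipt.
    errors = [(error.text or "").strip() for error in receipt.iter("ERROR")]
    return success, errors
```

For the receipt shown above, this returns True together with an empty list of errors.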