ena-utils



A simple CLI toolbox to submit sequences to the European Nucleotide Archive (ENA)

ena-utils [OPTIONS] COMMAND [ARGS]...

Options

-v, --verbose

Print various messages.

submit



Submit studies, experiments, runs and samples.

- A study can be associated with multiple experiments.
- An experiment can be associated with multiple runs.
- An experiment can be associated with only one sample.
- A run can be associated with only one experiment.
- A sample can be associated with multiple experiments and runs.
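The association rules above can be pictured as a small data model. The following is only an illustrative sketch: the class names are hypothetical and do not reflect how ena-utils represents these objects internally.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    alias: str


@dataclass
class Experiment:
    alias: str
    study: str      # one study accession (the experiment command takes a single --study)
    sample: Sample  # exactly one sample per experiment


@dataclass
class Run:
    alias: str
    experiment: Experiment  # exactly one experiment per run


# One sample shared by two experiments of the same study.
sample = Sample("sample_01")
exp_a = Experiment("my_experiment_01", "PRJEB12345", sample)
exp_b = Experiment("my_experiment_02", "PRJEB12345", sample)

# An experiment may own several runs; each run points back to one experiment.
runs = [Run("my_run_01", exp_a), Run("my_run_02", exp_a), Run("my_run_03", exp_b)]

runs_per_experiment = {}
for run in runs:
    runs_per_experiment.setdefault(run.experiment.alias, []).append(run.alias)

print(runs_per_experiment)
```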

ena-utils submit [OPTIONS] COMMAND [ARGS]...

Options

-u, --user <user>

Required Webin user ID (e.g. “Webin-12345”).

-p, --password <password>

Required Webin user password.

--type <type>

Type of submission.

Default

ADD

-h, --hold <hold>

A date in YYYY-MM-DD format until which the submission will be held confidential.
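A hold date can be checked before it is passed to --hold. This is a minimal sketch using only the Python standard library; the helper name parse_hold_date is hypothetical, and ena-utils may apply its own validation.

```python
from datetime import datetime


def parse_hold_date(text):
    """Return a date for a zero-padded YYYY-MM-DD string, else raise ValueError."""
    parsed = datetime.strptime(text, "%Y-%m-%d").date()
    # strptime tolerates "2030-1-31"; the round trip rejects such variants.
    if parsed.isoformat() != text:
        raise ValueError(f"not in YYYY-MM-DD format: {text!r}")
    return parsed


print(parse_hold_date("2030-01-31"))
```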

-s, --server_address <server_address>

Server address. Defaults to the development server; change “wwwdev” to “www” to submit to the production server.

Default

https://wwwdev.ebi.ac.uk/ena/submit/drop-box/submit/
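The “wwwdev” to “www” substitution described for --server_address can be written out explicitly. A small sketch; the function name is hypothetical, and the production address shown is simply the result of that substitution on the documented default.

```python
DEV_ADDRESS = "https://wwwdev.ebi.ac.uk/ena/submit/drop-box/submit/"


def to_production(address):
    """Swap the 'wwwdev' host prefix for 'www', leaving the rest untouched."""
    return address.replace("://wwwdev.", "://www.", 1)


production_address = to_production(DEV_ADDRESS)
print(production_address)
```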

--submission_xml <submission_xml>

Path to the XML submission file that will be created.

Default

submission.xml

experiment

Submit a single experiment.

ena-utils submit experiment [OPTIONS]

Options

--study <study>

Required Associated study accession number (e.g. “PRJEB12345”).

--sample <sample>

Required Associated sample name (e.g. “sample_01”).

-a, --alias <alias>

Required Experiment alias (e.g. “my_experiment_01”).

-c, --center <center>

Required Name of the sequencing center (can be different from the study host institution).

-t, --title <title>

Required Experiment title (e.g. “A great experiment”).

-d, --design <design>

Required Experiment design (e.g. “Targeted sequencing of gene X with primers A/B.”).

--lib_name <lib_name>

Library name (e.g. “LIB_01”).

--lib_strategy <lib_strategy>

Required Library strategy (e.g. “AMPLICON”).

--lib_source <lib_source>

Required Library source (e.g. “METAGENOMIC”).

--lib_selection <lib_selection>

Required Library selection (e.g. “PCR”).

--lib_length <lib_length>

Required Library nominal length (e.g. “311”).

--lib_protocol <lib_protocol>

Required Library construction protocol (e.g. “As described previously in XY et al.”).

--instrument <instrument>

Required Instrument model (e.g. “Illumina MiSeq”).

--experiment_xml <experiment_xml>

Path to the XML experiment file that will be created.

Default

experiment.xml

-v, --verbose

Print various messages.

experiment-set

Submit multiple experiments using a tab-delimited table. The elements of the table must have the same format as the parameters used for a single experiment submission (see the “experiment” command above).

ena-utils submit experiment-set [OPTIONS]

Options

--table <table>

Path to a tab-delimited text file containing a list of experiments to submit and their parameters. The file must contain one row per experiment and the first row must contain column headers for associated study accession number, associated sample name, alias, center, title, design, library name, library strategy, library source, library selection method, library nominal length, library protocol, and library sequencing instrument.
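Such a table can be produced with any tool that writes tab-delimited text. The sketch below uses Python's csv module and the default column names listed for this command; the row values reuse the examples from the “experiment” command, and the center name is a placeholder.

```python
import csv
import io

# Default column names of the experiment-set options.
COLUMNS = [
    "study", "sample", "alias", "center", "title", "design",
    "lib_name", "lib_strategy", "lib_source", "lib_selection",
    "lib_length", "lib_protocol", "instrument",
]

rows = [{
    "study": "PRJEB12345",
    "sample": "sample_01",
    "alias": "my_experiment_01",
    "center": "Example Sequencing Center",  # placeholder value
    "title": "A great experiment",
    "design": "Targeted sequencing of gene X with primers A/B.",
    "lib_name": "LIB_01",
    "lib_strategy": "AMPLICON",
    "lib_source": "METAGENOMIC",
    "lib_selection": "PCR",
    "lib_length": "311",
    "lib_protocol": "As described previously in XY et al.",
    "instrument": "Illumina MiSeq",
}]


def write_table(rows, columns, handle):
    """Write a header row followed by one tab-delimited row per record."""
    writer = csv.DictWriter(handle, fieldnames=columns, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()  # replace with open("experiment.txt", "w", newline="")
write_table(rows, COLUMNS, buffer)
table_text = buffer.getvalue()
print(table_text)
```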

--study <study>

Name of the table column containing associated study accession numbers.

Default

study

--sample <sample>

Name of the table column containing associated sample names.

Default

sample

-a, --alias <alias>

Name of the table column containing aliases.

Default

alias

-c, --center <center>

Name of the table column containing centers.

Default

center

-t, --title <title>

Name of the table column containing titles.

Default

title

-d, --design <design>

Name of the table column containing designs.

Default

design

--lib_name <lib_name>

Name of the table column containing library names.

Default

lib_name

--lib_strategy <lib_strategy>

Name of the table column containing library strategies.

Default

lib_strategy

--lib_source <lib_source>

Name of the table column containing library sources.

Default

lib_source

--lib_selection <lib_selection>

Name of the table column containing library selection methods.

Default

lib_selection

--lib_length <lib_length>

Name of the table column containing library nominal lengths.

Default

lib_length

--lib_protocol <lib_protocol>

Name of the table column containing library protocols.

Default

lib_protocol

--instrument <instrument>

Name of the table column containing library sequencing instrument names.

Default

instrument

--experiment_xml <experiment_xml>

Path to the XML experiment file that will be created.

Default

experiment.xml

-v, --verbose

Print various messages.

run

Submit a single run. An md5 checksum of the run file must be provided for the submission. If a valid path to the file is provided with the --filename option, ena-utils calculates the checksum automatically. However, if the path is not valid or the files are not accessible on your file system, the md5 checksum must be provided using the --checksum option.

ena-utils submit run [OPTIONS]
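When the checksum has to be supplied by hand, it can be computed on the machine that holds the file. A sketch using Python's hashlib; the function name is hypothetical, and this is not necessarily how ena-utils computes the value internally.

```python
import hashlib
import os
import tempfile


def md5_checksum(path, chunk_size=1024 * 1024):
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file standing in for a real sequence file.
with tempfile.NamedTemporaryFile(delete=False, suffix=".fastq") as tmp:
    tmp.write(b"@read1\nACGT\n+\nIIII\n")
demo_checksum = md5_checksum(tmp.name)
os.remove(tmp.name)
print(demo_checksum)
```

The resulting 32-character string is what the --checksum option expects for that file.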

Options

-e, --experiment <experiment>

Required Experiment reference (e.g. “my_experiment_01”).

-a, --alias <alias>

Required Run alias (e.g. “my_run_01”).

-c, --center <center>

Required Name of the sequencing center.

--filename <filename>

Required Comma-delimited paths to the sequence files, absolute or relative to your working directory (e.g. “data/exp01_01_R1.fastq.gz,data/exp01_01_R2.fastq.gz”). If the sequence files are not on your file system:

1) Checksums for these files must be provided with the --checksum option.
2) File names still need to be provided here.
3) If a path is provided, only the file name (basename) will be used.
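The comma-delimited values and the basename rule can be illustrated as follows. This is a sketch only: the helper name is hypothetical, and the assumption that --filename and --filetype entries are matched by position is ours, not a documented guarantee.

```python
import os


def split_option(value):
    """Split a comma-delimited option value into a list of trimmed entries."""
    return [item.strip() for item in value.split(",")]


filename_option = "data/exp01_01_R1.fastq.gz,data/exp01_01_R2.fastq.gz"
filetype_option = "fastq,fastq"

paths = split_option(filename_option)
types = split_option(filetype_option)

# Assumption: entries are paired by position, so both lists need equal length.
if len(paths) != len(types):
    raise ValueError("--filename and --filetype must list the same number of entries")

# Only the file name (basename) is kept when a path is given.
submitted = [(os.path.basename(path), kind) for path, kind in zip(paths, types)]
print(submitted)
```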

--filetype <filetype>

Required Comma-delimited type descriptions for the sequence files (e.g. “fastq,fastq”).

--checksum <checksum>

Comma-delimited md5 checksums for the sequence files (e.g. “51e00fbf45a12acd4cd4e65zgac54321,214f0fbf45a12acd4cd4e65zgac12345”).

--run_xml <run_xml>

Path to the XML run file that will be created.

Default

run.xml

-v, --verbose

Print various messages.

run-set

Submit multiple runs using a tab-delimited table. The elements of the table must have the same format as the parameters used for a single run submission (see the “run” command above).

ena-utils submit run-set [OPTIONS]

Options

--table <table>

Path to a tab-delimited text file containing a list of runs to submit and their parameters. The file must contain one row per run and the first row must contain column headers for alias, sequencing center, file names and file types. Optionally, an additional column for md5 checksums can be provided (useful when the files are not on your system and the checksum cannot be calculated by ena-utils). Using a parameter file makes it possible to submit multiple runs at once.
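To make the expected layout concrete, the sketch below reads a one-row run table with Python's csv module, using the default column names of this command. The table content and center name are illustrative, and treating an empty checksum cell as “no checksum supplied” is an assumption.

```python
import csv
import io

# A run table with the optional checksum column left empty.
table_text = (
    "experiment\talias\tcenter\tfilename\tfiletype\tchecksum\n"
    "my_experiment_01\tmy_run_01\tExample Center\t"
    "data/exp01_01_R1.fastq.gz,data/exp01_01_R2.fastq.gz\tfastq,fastq\t\n"
)

runs = []
for row in csv.DictReader(io.StringIO(table_text), delimiter="\t"):
    files = row["filename"].split(",")
    types = row["filetype"].split(",")
    # Assumption: an empty cell means no checksum was supplied for these files.
    checksums = row["checksum"].split(",") if row["checksum"] else [None] * len(files)
    runs.append({
        "experiment": row["experiment"],
        "alias": row["alias"],
        "files": list(zip(files, types, checksums)),
    })

print(runs)
```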

-e, --experiment <experiment>

Name of the table column containing references to associated experiments.

Default

experiment

-a, --alias <alias>

Name of the table column containing run aliases.

Default

alias

-c, --center <center>

Name of the table column containing the name of the sequencing center.

Default

center

--filename <filename>

Name of the table column containing comma-delimited paths to the associated sequence files.

Default

filename

--filetype <filetype>

Name of the table column containing comma-delimited type descriptions for the associated sequence files.

Default

filetype

--checksum <checksum>

Name of the table column containing comma-delimited md5 checksums for the sequence files.

--run_xml <run_xml>

Path to the XML run file that will be created.

Default

run.xml

-v, --verbose

Print various messages.

sample

Submit a single sample.

ena-utils submit sample [OPTIONS]

Options

-a, --alias <alias>

Required Sample alias (e.g. “sample_01”).

--title <title>

Required Sample title (e.g. “My great sample 01.”).

--taxon_id <taxon_id>

Required Sample taxon ID attribute (e.g. “10090”).

--scientific_name <scientific_name>

Required Sample scientific name attribute (e.g. “Mus musculus”).

--common_name <common_name>

Required Sample common name attribute (e.g. “house mouse”).

--attributes <attributes>

JSON-formatted string of sample attributes key:value pairs (e.g. {"age":"2 weeks","strain":"C57BL/6"}).
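Writing JSON by hand is error-prone because it requires straight double quotes. One way to build the --attributes value is to serialize a dictionary, as in this sketch (the variable names are illustrative):

```python
import json
import shlex

attributes = {"age": "2 weeks", "strain": "C57BL/6"}

# Compact separators give a single-line string without extra spaces.
attributes_arg = json.dumps(attributes, separators=(",", ":"))
print(attributes_arg)

# Quoted form that is safe to paste after --attributes in a POSIX shell.
print(shlex.quote(attributes_arg))
```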

--sample_xml <sample_xml>

Path to the XML sample file that will be created.

Default

sample.xml

-v, --verbose

Print various messages.

sample-set

Submit multiple samples using a tab-delimited table. The elements of the table must have the same format as the parameters used for a single sample submission (see the “sample” command above).

ena-utils submit sample-set [OPTIONS]

Options

--table <table>

Required Path to a tab-delimited text file containing a list of samples to submit and their parameters. The file must contain one row per sample and the first row must contain column headers for at least alias, title, taxon ID, scientific name and common name. Additional columns are submitted as additional sample attributes. Taxon ID, scientific name and common name must comply with the ENA standard organism taxonomic description (e.g. for Mus musculus, taxon ID is “10090”, scientific name is “Mus musculus” and common name is “house mouse”).
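The split between the five core columns and the additional attribute columns can be sketched as follows. The column names are the defaults of this command; the extra “age” and “strain” columns are illustrative, and the code does not claim to mirror the internals of ena-utils.

```python
import csv
import io

CORE_COLUMNS = ("alias", "title", "taxon_id", "scientific_name", "common_name")

table_text = (
    "alias\ttitle\ttaxon_id\tscientific_name\tcommon_name\tage\tstrain\n"
    "sample_01\tMy great sample 01.\t10090\tMus musculus\thouse mouse\t2 weeks\tC57BL/6\n"
)

samples = []
for row in csv.DictReader(io.StringIO(table_text), delimiter="\t"):
    core = {name: row[name] for name in CORE_COLUMNS}
    # Every column that is not a core column is treated as a sample attribute.
    attributes = {name: value for name, value in row.items()
                  if name not in CORE_COLUMNS}
    samples.append((core, attributes))

print(samples)
```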

-a, --alias <alias>

Name of the table column containing sample aliases.

Default

alias

--title <title>

Name of the table column containing sample titles.

Default

title

--taxon_id <taxon_id>

Name of the table column containing sample taxon ID attribute.

Default

taxon_id

--scientific_name <scientific_name>

Name of the table column containing sample scientific name attribute.

Default

scientific_name

--common_name <common_name>

Name of the table column containing sample common name attribute.

Default

common_name

--sample_xml <sample_xml>

Path to the XML sample file that will be created.

Default

sample.xml

-v, --verbose

Print various messages.

study

Submit a single study.

ena-utils submit study [OPTIONS]

Options

-a, --alias <alias>

Required Study alias (e.g. “my_new_great_study”).

-t, --title <title>

Required Study title (e.g. “A great study.”).

-d, --description <description>

Required Study description (e.g. “A longer study description.”).

--center <center>

Name of the main research institution hosting the study.

--study_xml <study_xml>

Path to the XML project file that will be created.

Default

project.xml

-v, --verbose

Print various messages.
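Following the usage lines above, the options of the “submit” group (such as -u and -p) are placed before the subcommand name, and the subcommand's own options after it. The sketch below assembles such a command line without running it; the helper name, the password and the argument values are placeholders.

```python
import shlex


def study_command(user, password, alias, title, description, hold=None):
    """Build the argument list for a single study submission."""
    # Group-level options go between "submit" and the subcommand name.
    argv = ["ena-utils", "submit", "-u", user, "-p", password]
    if hold is not None:
        argv += ["--hold", hold]
    # Subcommand-level options follow the subcommand name.
    argv += ["study", "-a", alias, "-t", title, "-d", description]
    return argv


argv = study_command(
    "Webin-12345", "my_password",  # placeholder credentials
    "my_new_great_study", "A great study.", "A longer study description.",
)
print(shlex.join(argv))
```

The list can be handed to subprocess.run(argv) once the values are real; with no --server_address given, the submission goes to the development server by default.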

study-set

Submit multiple studies using a tab-delimited table. The elements of the table must have the same format as the parameters used for a single study submission (see the “study” command above).

ena-utils submit study-set [OPTIONS]

Options

--table <table>

Path to a tab-delimited text file containing a list of studies to submit and their parameters. The file must contain one row per study and the first row must contain column headers for alias, title and description.

-a, --alias <alias>

Name of the table column containing study aliases.

Default

alias

-t, --title <title>

Name of the table column containing study titles.

Default

title

-d, --description <description>

Name of the table column containing study descriptions.

Default

description

--study_xml <study_xml>

Path to the XML project file that will be created.

Default

study.xml

-v, --verbose

Print various messages.

upload

Upload nucleotide sequence files.

ena-utils upload [OPTIONS]

Options

-u, --user <user>

Required Webin user ID (e.g. “Webin-12345”).

-p, --password <password>

Required Webin user password.

-f, --file_path <file_path>

Required Path to the sequence files; wildcards are supported (e.g. “data/exp01_*.fastq.gz”).
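To preview which files a wildcard pattern would select before uploading, the pattern can be expanded locally. This sketch uses Python's glob module on a temporary directory; whether ena-utils expands wildcards in exactly the same way is an assumption.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as data_dir:
    # Create empty stand-ins for sequence files.
    for name in ("exp01_01_R1.fastq.gz", "exp01_01_R2.fastq.gz",
                 "exp02_01_R1.fastq.gz"):
        open(os.path.join(data_dir, name), "wb").close()

    # Escape the directory part so only the file-name wildcard is interpreted.
    pattern = os.path.join(glob.escape(data_dir), "exp01_*.fastq.gz")
    matched = sorted(os.path.basename(path) for path in glob.glob(pattern))

print(matched)
```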

-h, --host_address <host_address>

FTP server address.

Default

webin2.ebi.ac.uk

-v, --verbose

Print various messages.

write-template



Write template files.

ena-utils write-template [OPTIONS] COMMAND [ARGS]...

experiment-table

Write a template table for experiment submission.

ena-utils write-template experiment-table [OPTIONS]

Options

-t, --table <table>

Path to the table file that will be created.

Default

experiment.txt

-v, --verbose

Print various messages.

run-table

Write a template table for run submission.

ena-utils write-template run-table [OPTIONS]

Options

-t, --table <table>

Path to the table file that will be created.

Default

run.txt

-v, --verbose

Print various messages.

sample-table

Write a template table for sample submission.

ena-utils write-template sample-table [OPTIONS]

Options

-t, --table <table>

Path to the table file that will be created.

Default

sample.txt

-v, --verbose

Print various messages.

study-table

Write a template table for study submission.

ena-utils write-template study-table [OPTIONS]

Options

-t, --table <table>

Path to the table file that will be created.

Default

project.txt

-v, --verbose

Print various messages.