-i, --input <path> Input file of AA sequences in FASTA format
(protein sequences or predicted CDS)
-N, --name <str> Dataset/job name
MetaCLADE OPTIONS:
(e.g., use --time-limit 2:30:00 for setting a limit of 2h and 30m)
```
### 1. MetaCLADE configuration
First of all, it is advised to add MetaCLADE's main directory (if it is not already present) to your `PATH` environment variable by appending the following line to your `~/.bashrc`:
```
export PATH=[MetaCLADE_DIR]:${PATH}
```
where `[MetaCLADE_DIR]` is MetaCLADE's installation directory.
Then, in order to create MetaCLADE jobs, you must first write a *Run configuration file* (see below) and run the following command:
```
metaclade --run-cfg [Run configuration file]
```
#### Input file preprocessing
Before running MetaCLADE on the input FASTA file you should build a BLAST database from it.
You can either set the `CREATE_BLASTDB` parameter to `True` in the Run configuration file (see below) or build the database manually.
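The exact command is not shown in this excerpt; as a sketch, assuming the NCBI BLAST+ suite is installed, a protein database is typically built with `makeblastdb` (the file names below are illustrative):

```shell
# Build a protein BLAST database from the input FASTA file.
# "sequences.fasta" and the output name are illustrative placeholders.
makeblastdb -in sequences.fasta -dbtype prot -out sequences
```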
A custom working directory (where jobs and results are saved) can be set with the `WORKING_DIR` parameter (the default is the directory from which the `metaclade` command has been called).
A custom temporary directory can be set with the `TMP_DIR` parameter (the default is a `temp` subdirectory of the working directory).
If you want to restrict MetaCLADE's annotation to a subset of domains, you can provide a file containing one domain identifier per line through the `DOMAINS_DIR` parameter.
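Putting the parameters above together, a Run configuration file might look like the following sketch (the actual layout of the file is not shown in this excerpt, so the section names and all values below are assumptions, not MetaCLADE's documented syntax):

```ini
; illustrative sketch only: section names and values are assumptions
[DATASET]
NAME=my_dataset
INPUT=/path/to/sequences.fasta

[OPTIONS]
CREATE_BLASTDB=True
WORKING_DIR=/path/to/workdir
TMP_DIR=/path/to/workdir/temp
NUMBER_OF_JOBS=8
```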
#### MetaCLADE configuration file example (optional)
Optionally, a MetaCLADE configuration file can be provided to `metaclade` with the `--metaclade-cfg` parameter.
This file can be used to set custom paths to the PSI-BLAST/HMMER/Python executables or to the MetaCLADE model library.
Lines starting with a semicolon are not taken into account.
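The example file itself is elided in this excerpt; as a loose sketch only (every section and key name below is an assumption, not MetaCLADE's documented syntax), such a file could look like:

```ini
; illustrative sketch only: section and key names are assumptions
[PATHS]
PSIBLAST=/usr/local/bin/psiblast
HMMSEARCH=/usr/local/bin/hmmsearch
PYTHON=/usr/bin/python3
MODEL_LIBRARY=/opt/metaclade/models
```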
```
### 2. MetaCLADE jobs
By default jobs are created in `[WORKING_DIR]/[DATASET_NAME]/jobs/`.
Each (numbered) folder in this directory represents a step of the pipeline and contains several `*.sh` files (their number depends on the value of the `NUMBER_OF_JOBS` parameter).
Each file in a given directory can be submitted independently to the HPC environment.
In the first three directories you can also find a `submit.sh` file that contains the `qsub` commands for submitting each job to the queue system of an SGE environment.
This file can be used as is (or adapted for other HPC environments) in order to submit all the jobs of a step.
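As a sketch, assuming the default layout described above and an SGE queue (the dataset name and the `WORKING_DIR` variable below are illustrative), all jobs of the first step could be submitted with:

```shell
# Submit every job script of pipeline step 1 to an SGE queue.
# $WORKING_DIR and "my_dataset" are illustrative placeholders.
for job in "$WORKING_DIR"/my_dataset/jobs/1/*.sh; do
    qsub "$job"
done
```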
### 3. MetaCLADE results
By default results are stored in the `[WORKING_DIR]/[DATASET_NAME]/results/` directory.
Each (numbered) folder in this directory contains the results after each step of the pipeline.
After running each step, the final annotation is saved in the file