Commit 1b6d600b by Riccardo Vicedomini


parent 41b9da71
......@@ -8,35 +8,35 @@ Biochemical and regulatory pathways have until recently been thought and modelle
We introduce MetaCLADE2, an improved profile-based domain annotation pipeline based on the multi-source domain annotation strategy. It annotates domains directly from reads and improves the identification of the catalog of functions in a microbiome. MetaCLADE2 can be applied to metagenomic or metatranscriptomic datasets as well as to proteomes.
System requirements
+ MetaCLADE has been developed under a Unix environment.
# System requirements
+ MetaCLADE2 has been developed under a Linux environment.
+ The bash environment should be installed.
+ Python 3 is required for this package.
Software requirements
# Software requirements
+ HMMer-3
+ GNU parallel (optional but recommended for running jobs on multiple threads)
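
The requirements above can be checked before running the pipeline. The following is a minimal sketch, not part of MetaCLADE2: the function names `have_cmd` and `check_requirements` are hypothetical, and the binary names (`python3`, `hmmsearch`, `parallel`) are assumptions about how each tool is installed on your system.

```shell
# Return success if the given command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Check the tools listed in the requirements sections.
# Assumption: HMMER 3 is detected through its hmmsearch binary.
check_requirements() {
    missing=0
    for tool in bash python3 hmmsearch; do
        if ! have_cmd "$tool"; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    # GNU parallel is optional, so its absence is only reported.
    if ! have_cmd parallel; then
        echo "note: GNU parallel not found (optional)" >&2
    fi
    return "$missing"
}
```

Calling `check_requirements` returns a non-zero status if a required tool is missing, so it can guard a launch script.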
# Installation
The latest development version of MetaCLADE2 can be obtained by running the following command:
git clone
Then, it is advised to include the MetaCLADE2 directory in your PATH environment variable by adding the following line to your `~/.bashrc` file:
export PATH="[MetaCLADE_DIR]:${PATH}"
export PATH="[MetaCLADE2_DIR]:${PATH}"
where `[MetaCLADE_DIR]` is MetaCLADE's installation directory.
where `[MetaCLADE2_DIR]` is the MetaCLADE2 installation directory.
MetaCLADE usage
# MetaCLADE2 usage
USAGE: metaclade2 -i <input_fasta> -N <name> [options]
......@@ -72,25 +72,21 @@ MetaCLADE usage
(e.g., use --time-limit 2:30:00 for setting a limit of 2h and 30m)
#### MetaCLADE configuration file example (optional)
Optionally, a MetaCLADE configuration file could be provided to metaclade with the parameter `--metaclade-cfg`.
This file could be used to set custom paths to PSI-BLAST/HMMER/Python executables or to the MetaCLADE model library.
Lines starting with a semicolon are not taken into account. Also, you should provide absolute paths.
#### Optional MetaCLADE2 configuration file (available soon)
MetaCLADE2 optionally accepts a configuration file that allows the user to set custom paths to the MetaCLADE model library.
Lines starting with a semicolon are treated as comments and ignored.
All paths **must** be absolute.
;PSIBLAST_DIR = /home/ncbi-blast-2.7.1+/bin/
;HMMER_DIR = /home/hmmer-3.2.1/bin/
;PYTHON_DIR = /home/python-2.7.15/bin
;PSSMS_DIR = /home/MetaCLADE/data/models/pssms
;HMMS_DIR = /home/MetaCLADE/data/models/hmms
;ccms_path = /absolute/path/to/data/models/CCMs
;hmms_path = /absolute/path/to/data/models/HMMs
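
Since the configuration file is described as not yet available, the following is only a sketch of how such a file could be sanity-checked. It assumes the `key = value` layout shown above; the helper names `cfg_get` and `is_absolute_path` are hypothetical and are not part of MetaCLADE2.

```shell
# Print the value of KEY ($2) from config file ($1).
# Lines starting with ';' are comments and are skipped.
cfg_get() {
    sed -n -e '/^;/d' \
           -e "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
        | sed 's/[[:space:]]*$//' \
        | head -n 1
}

# Return success only if the path starts with '/'.
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

For instance, `is_absolute_path "$(cfg_get my.cfg hmms_path)"` fails when the entry is missing, commented out, or relative (`my.cfg` is a placeholder file name).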
### MetaCLADE jobs
By default jobs are created in `[WORKING_DIR]/[DATASET_NAME]/jobs/`.
By default `[WORKING_DIR]` is the current directory where the `metaclade2` command has been run.
# MetaCLADE jobs
By default, jobs are created in `[WORKING_DIR]/[DATASET_NAME]/jobs/`, where `[WORKING_DIR]` defaults to the directory from which the `metaclade2` command was run.
Using the `--sge` parameter, the MetaCLADE2 pipeline can be run automatically on an SGE-based cluster (see the [MetaCLADE2 usage](#metaclade2-usage) section).
Each (numbered) folder in this directory represents a step of the pipeline and contains several `*.sh` files (depending on the value provided with the `-j [NUMBER_OF_JOBS]` parameter):
......@@ -108,7 +104,7 @@ Jobs **must** be run in the following order:
Each file in a given directory can be submitted independently to the HPC environment.
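
Without a cluster scheduler, the generated scripts can also be run locally, one step at a time. The sketch below is an illustration under assumptions, not the tool's own launcher: `run_steps` is a hypothetical helper, and it assumes that the numbered step folders sort into the required order under shell globbing and that each job script can be started from the current directory.

```shell
# Run every *.sh job of each step folder, one step after another.
# $1 is the jobs directory, e.g. [WORKING_DIR]/[DATASET_NAME]/jobs
run_steps() {
    jobs_dir=$1
    for step in "$jobs_dir"/*/; do
        [ -d "$step" ] || continue
        for job in "$step"*.sh; do
            [ -f "$job" ] || continue
            echo "running $job"
            # Stop at the first failure so later steps never see partial input.
            bash "$job" || return 1
        done
    done
}
```

Jobs within a step are run sequentially here for simplicity; because they are independent, they could instead be handed to GNU parallel or to a scheduler.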
### MetaCLADE2 results
# MetaCLADE2 results
By default results are stored in the `[WORKING_DIR]/[DATASET_NAME]/results/` directory.
Each (numbered) folder in this directory contains the results after each step of the pipeline.
After running each step, the final annotation is saved in the file
......@@ -131,7 +127,16 @@ Each annotation has the following fields:
* Accuracy value in the interval [0,1]
# Example
A test dataset is available in the `test` directory and can be run with the following command:
metaclade2 -i ./test/test.fa -N testDataSet -d PF00875,PF03441,PF03167,PF12546 -W ./ -j 2
This will create at most two scripts (jobs) in each directory of the pipeline.
metaclade2 -i ./test/test.fa -N pippo -d PF00875,PF03441,PF03167,PF12546 -W ./test/ --arch --sge --pe smp -j 2 -t 2
Alternatively, if you are running MetaCLADE2 on an SGE cluster, the following command will run at most 2 jobs, each one using 2 CPUs, for each step of the pipeline:
metaclade2 -i ./test/test.fa -N testDataSet -d PF00875,PF03441,PF03167,PF12546 -W ./ --sge --pe smp -j 2 -t 2