Update README.md

1b6d600b · Riccardo Vicedomini · 41b9da71 · 1b6d600b
Commit 1b6d600b authored Feb 16, 2020 by Riccardo Vicedomini

Showing with 35 additions and 30 deletions

README.md README.md +35 -30

--- a/README.md
+++ b/README.md
@@ -8,35 +8,35 @@ Biochemical and regulatory pathways have until recently been thought and modelle
 We introduce MetaCLADE2, an improved profile-based domain annotation pipeline based on the multi-source domain annotation strategy. It annotates domains directly from reads and improves the identification of the catalog of functions in a microbiome. MetaCLADE2 can be applied to metagenomic or metatranscriptomic datasets as well as to proteomes.


-System requirements
-------------------
-+ MetaCLADE has been developed under a Unix environment.
+# System requirements
+
++ MetaCLADE2 has been developed under a Linux environment.
 + The bash environment should be installed.
 + Python 3 is required for this package.


-Software requirements
---------------------
+# Software requirements
+
 + HMMer-3
 + DAMA
 + GNU parallel (optional but recommended for running jobs on multiple threads)


-Installation
------------
+# Installation
+
 The latest development version of MetaCLADE2 can be obtained by running the following command:
 ```
 git clone http://gitlab.lcqb.upmc.fr/vicedomini/metaclade2.git
 ```
 Then, it is advised to include the MetaCLADE2 directory in your PATH environment variable by adding the following line to your `~/.bashrc` file:
 ```
-export PATH=[MetaCLADE_DIR]:${PATH}"
+export PATH="[MetaCLADE2_DIR]:${PATH}"
 ```
-where `[MetaCLADE_DIR]` is MetaCLADE's installation directory.
+where `[MetaCLADE2_DIR]` is the MetaCLADE2 installation directory.


-MetaCLADE usage
---------------
+# MetaCLADE2 usage
+
 ```
  USAGE: metaclade2 -i <input_fasta> -N <name> [options]

@@ -72,25 +72,21 @@ MetaCLADE usage
                             (e.g., use --time-limit 2:30:00 for setting a limit of 2h and 30m)
 ```

-#### MetaCLADE configuration file example (optional)
-Optionally, a MetaCLADE configuration file could be provided to metaclade with the parameter `--metaclade-cfg`. 
-This file could be used to set custom paths to PSI-BLAST/HMMER/Python executables or to the MetaCLADE model library.
-Lines starting with a semicolon are not taken into account. Also, you should provide absolute paths.
+#### Optional MetaCLADE2 configuration file (available soon)
+MetaCLADE2 optionally accepts a configuration file that allows the user to set custom paths to the MetaCLADE model library.
+Lines starting with a semicolon are treated as comments and ignored.
+All paths **must** be absolute.
 ```
-[Programs]
-;PSIBLAST_DIR = /home/ncbi-blast-2.7.1+/bin/
-;HMMER_DIR = /home/hmmer-3.2.1/bin/
-;PYTHON_DIR = /home/python-2.7.15/bin
-
-[Models]
-;PSSMS_DIR = /home/MetaCLADE/data/models/pssms
-;HMMS_DIR = /home/MetaCLADE/data/models/hmms
+[metaclade]
+;ccms_path    = /absolute/path/to/data/models/CCMs
+;hmms_path    = /absolute/path/to/data/models/HMMs
 ```


-### MetaCLADE jobs
-By default jobs are created in `[WORKING_DIR]/[DATASET_NAME]/jobs/`.
-By default `[WORKING_DIR]` is the current directory where the `metaclade2` command has been run.
+# MetaCLADE2 jobs
+By default, jobs are created in `[WORKING_DIR]/[DATASET_NAME]/jobs/`, where `[WORKING_DIR]` defaults to the directory from which the `metaclade2` command was run.
+With the `--sge` parameter, the MetaCLADE2 pipeline can be handled automatically on an SGE-based cluster (see the [MetaCLADE2 usage](#metaclade2-usage) section).
+
 Each (numbered) folder in this directory represents a step of the pipeline and contains several `*.sh` files (depending on the value provided with the `-j [NUMBER_OF_JOBS]` parameter):
 ```
 [DATASET_NAME]_1.sh
@@ -108,7 +104,7 @@ Jobs **must** be run in the following order:
 Each file in a given directory can be submitted independently to the HPC environment.


-### MetaCLADE2 results
+# MetaCLADE2 results
 By default results are stored in the `[WORKING_DIR]/[DATASET_NAME]/results/` directory.
 Each (numbered) folder in this directory contains the results after each step of the pipeline. 
 After running each step, the final annotation is saved in the file
@@ -131,7 +127,16 @@ Each annotation has the following fields:
 * Accuracy value in the interval [0,1]


-Example
-------
+# Example
+A test dataset is available in the `test` directory and can be run with the following command:
+```
+cd [METACLADE2_DIR]
+metaclade2 -i ./test/test.fa -N testDataSet -d PF00875,PF03441,PF03167,PF12546 -W ./ -j 2
+```
+This will create at most two scripts (jobs) in each directory of the pipeline.

-metaclade2 -i ./test/test.fa -N pippo -d PF00875,PF03441,PF03167,PF12546 -W ./test/ --arch --sge --pe smp -j 2 -t 2
+Alternatively, if you are running MetaCLADE2 on an SGE cluster, the following command will run at most 2 jobs, each one using 2 CPUs, for each step of the pipeline:
+```
+cd [METACLADE2_DIR]
+metaclade2 -i ./test/test.fa -N testDataSet -d PF00875,PF03441,PF03167,PF12546 -W ./ --sge --pe smp -j 2 -t 2
+```
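
Note on running the generated jobs without `--sge`: the README says that jobs must be run step by step, while the scripts inside one step folder can be submitted independently. The sketch below shows one way this could be done on a single machine. `run_metaclade2_jobs` is a hypothetical helper, not part of MetaCLADE2, and it assumes that the numbered step folders sort alphabetically into pipeline order (the exact folder names are not shown in this diff).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with MetaCLADE2: run the generated job
# scripts locally, one pipeline step at a time.
# Assumption: the numbered step folders under jobs/ sort into pipeline order.

run_metaclade2_jobs() {
    local jobs_dir="$1"    # e.g. ./testDataSet/jobs
    local step script pid failed
    local pids

    for step in "$jobs_dir"/*/; do
        [ -d "$step" ] || continue
        echo "step: $(basename "$step")"

        # Scripts within one step are independent, so start them in parallel.
        pids=()
        for script in "$step"*.sh; do
            [ -e "$script" ] || continue
            bash "$script" &
            pids+=("$!")
        done

        # Wait for every script of this step before moving to the next one.
        failed=0
        for pid in ${pids[@]+"${pids[@]}"}; do
            wait "$pid" || failed=1
        done

        if [ "$failed" -ne 0 ]; then
            echo "a job failed in step $(basename "$step"), stopping" >&2
            return 1
        fi
    done
}
```

With the test dataset above (`-N testDataSet -W ./`), the call would be `run_metaclade2_jobs ./testDataSet/jobs`. Stopping at the first failed step is a design choice of this sketch, made because later steps depend on the results of earlier ones.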