Commit d68afcdb by Mustafa Tekpinar

Updated README.md file for prescott module.

parent 6f4cbe77
......@@ -9,33 +9,41 @@ It is made up of two main programs: escott and prescott.
ESCOTT can calculate effects of single point mutations and multiple point mutations. On the other hand, PRESCOTT incorporates
population frequencies into ESCOTT predictions. Therefore, you need to run ESCOTT first to have predictions of mutational effects.
We recommend using PRESCOTT via our web site or our docker image.
We recommend using the PRESCOTT package via our web site or our docker image due to its dependencies.
## Input Data Requirements
### Input Data Requirements for escott
ESCOTT requires two files:
escott requires two files:
* a multiple sequence alignment (MSA) file in fasta format (mandatory):
Your query protein must be the first sequence in the fasta file. In addition, the query sequence should not contain any gaps.
* a structure file in PDB format (optional but recommended)
* a structure file in PDB format (optional but highly recommended).
One of the fastest ways to obtain both an input MSA and a PDB file is to run colabfold:
https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb
Please note that the MSA file produced by colabfold (a3m file) can contain gaps in the query sequence. You have to remove them before using it in PRESCOTT. You can remove the gaps with programs that have a GUI, such as ugene (http://ugene.net/) or jalview (https://www.jalview.org/).
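
If you prefer a script to a GUI, the gap-removal step described above can be sketched in a few lines of Python. This is only an illustration, not part of the PRESCOTT package: the function names `read_fasta` and `drop_query_gap_columns` are hypothetical, and the sketch assumes an aligned FASTA in which all sequences have the same length (an a3m file may first need to be converted to such an alignment). It drops every alignment column in which the first (query) sequence has a gap.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_query_gap_columns(records, gap_chars="-."):
    """Remove every column in which the first (query) sequence has a gap."""
    query = records[0][1]
    # Assumption: a standard alignment where all rows have equal length.
    if any(len(seq) != len(query) for _, seq in records):
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i, char in enumerate(query) if char not in gap_chars]
    return [(header, "".join(seq[i] for i in keep)) for header, seq in records]


# Toy alignment: the query has a gap in its third column.
example = """>query
MK-LV
>homolog1
MKALV
>homolog2
M-A-V
"""

cleaned = drop_query_gap_columns(read_fasta(example))
for header, seq in cleaned:
    print(f">{header}\n{seq}")
```

Gaps in the other sequences are kept on purpose; only the columns that are gapped in the query are removed, so the query stays first and gap-free.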
For testing purpose, you can find example input files for BLAT protein in data/ folder of this repository.
For testing purposes, you can find some example input files for the BLAT protein in the data/ folder of this repository.
### Input Data Requirements for prescott
prescott requires three files:
* output file of escott (the file ending with ...normPredCombi.txt)
* a fasta file containing only your query sequence
* a gnomad csv file for your protein, to be downloaded from https://gnomad.broadinstitute.org/.
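
Before running prescott, it can help to confirm that the query fasta file really contains a single sequence. The following Python sketch shows one way to do that; `check_query_fasta` is a hypothetical helper written for this illustration and is not part of the package.

```python
def check_query_fasta(text):
    """Return the sequence if text holds exactly one FASTA record, else raise."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    n_records = sum(1 for line in lines if line.startswith(">"))
    if n_records != 1:
        raise ValueError(f"expected exactly 1 sequence, found {n_records}")
    # Join all non-header lines, since a sequence may be wrapped over several lines.
    return "".join(line for line in lines if not line.startswith(">"))


# Toy input; in practice you would read the text from your own fasta file.
single = ">query\nMKLV\nAAGT\n"
sequence = check_query_fasta(single)
print(len(sequence), sequence)
```

Passing a multiple sequence alignment to this helper raises a `ValueError`, which catches the common mistake of reusing the escott MSA as the prescott sequence file.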
## Usage
### Running the program
You can find example bash scripts for escott and prescott in the examples folder of this repository.
Below, you will find examples of the most basic usage. Consult the documentation for further details.
### Running the escott program
Let's assume that our input MSA is inputAli.fasta and input.pdb is our structure file in PDB format.
Run the program by issuing the following command in a bash terminal:
```bash
escott inputAli.fasta --pdbfile input.pdb
escott inputAli.fasta -f inputAli.fasta --pdbfile input.pdb
```
A quick help can be accessed by typing
......@@ -45,7 +53,7 @@ escott --help
By default, ESCOTT will predict the effect of all possible single mutations at all positions in the
query sequence. Alternatively, a set of single or multiple mutations can be given with the option -m.
Eachline of the file should contain a mutation (e.g. D136R) or combination of mutations separated
Each line of the file should contain a mutation (e.g. D136R) or combination of mutations separated
by commas (or colons) and ordered according to their positions in the sequence (e.g. D136R,V271A).
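
To make the mutation file format concrete, here is a minimal Python sketch that checks one line of such a file. It is not the parser used by escott: the function name `parse_mutation_line` is hypothetical, and the sketch assumes one-letter uppercase amino-acid codes in the form wild type, position, mutant (as in D136R).

```python
import re

# Assumed token shape: one uppercase letter, a position, one uppercase letter.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation_line(line):
    """Split a line into (wild_type, position, mutant) tuples and check ordering."""
    tokens = [tok.strip() for tok in re.split(r"[,:]", line.strip()) if tok.strip()]
    if not tokens:
        raise ValueError("empty mutation line")
    parsed = []
    for token in tokens:
        match = MUTATION.match(token)
        if match is None:
            raise ValueError(f"not a valid mutation: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    positions = [position for _, position, _ in parsed]
    # Combined mutations must be ordered by their position in the sequence.
    if positions != sorted(positions):
        raise ValueError("mutations must be ordered by sequence position")
    return parsed


single = parse_mutation_line("D136R")
double = parse_mutation_line("D136R,V271A")
print(single)
print(double)
```

A line such as `V271A,D136R` is rejected by this sketch because the mutations are not ordered by position.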
GEMME calls JET2 to compute evolutionary conservation levels. By default, JET2 will retrieve a set
......@@ -58,6 +66,21 @@ values obtained over the 10 iterations.
JET2 configuration file is: default.conf.
JET2 output file is: myProt_jet.res.
### Running the prescott program
A quick help can be accessed by typing
```bash
prescott --help
```
Run the program by issuing the following command in a bash terminal:
```bash
prescott -e ../data/MLH1_normPred_evolCombi.txt -g ../data/gnomAD_v2.1.1_MLH1_HUMAN_ENSG00000076242.csv -s ../data/MLH1.fasta
```
The most important output is the prescott-scores.txt file, which contains frequency-modified scores for the mutations.
Please note that the example input files are in the data directory of this repository.
## Installation
PRESCOTT is implemented in Python 3 and R. It has been tested only on Linux. Since PRESCOTT has many dependencies, we recommend using our web site or our docker image. If you are a determined user, you can find the steps required to install it from source at the following link (or in the docs folder of this repository):
......