# ESGEMME: Evolutionary and Structural Global Epistatic Model for Mutational Effects
## Installation
ESGEMME is implemented in Python 3 and R. It has been tested only on Linux. Since ESGEMME has many dependencies, we recommend using our website or our Docker image.
## Introduction
ESGEMME is a program that predicts the mutational effects of a protein from evolutionary and structural information.
It can calculate the effects of both single point mutations and multiple point mutations.
We recommend using ESGEMME via our website or our Docker image.
### Installation from the source:
#### Getting the source code and preparing the environment:
Download the ESGEMME source code from http://gitlab.lcqb.upmc.fr/tekpinar/ESGEMME.
Define and export the environment variable ESGEMME_PATH=/path-to-ESGEMME-directory/
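Before launching a run, it can help to confirm that the variable is visible to child processes. The following is a minimal sketch, not part of ESGEMME itself; the helper name `resolve_esgemme_path` is hypothetical, and the only thing taken from this README is the variable name `ESGEMME_PATH`.

```python
import os


def resolve_esgemme_path(env=os.environ):
    """Return the directory stored in ESGEMME_PATH, or raise a clear error.

    Hypothetical helper for illustration; ESGEMME does not ship this function.
    """
    path = env.get("ESGEMME_PATH")
    if not path:
        raise RuntimeError("ESGEMME_PATH is not set; export it before running ESGEMME")
    if not os.path.isdir(path):
        raise RuntimeError(f"ESGEMME_PATH does not point to a directory: {path}")
    return path
```

Calling `resolve_esgemme_path()` in a fresh shell is a quick way to check that the variable was exported rather than only defined locally.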
Please note that the MSA file produced by ColabFold (a3m file) can contain gaps in the query sequence. These gaps must be removed before the file is used with ESGEMME.
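One possible way to remove these gaps is sketched below. It is an illustration under stated assumptions, not the procedure used by the ESGEMME authors: the query is assumed to be the first record, lowercase letters are assumed to be a3m insertion states that can be dropped to obtain equal-length rows, and the function names `read_fasta_like` and `drop_query_gap_columns` are hypothetical.

```python
import string

# Translation table that deletes lowercase letters (assumed a3m insertion states).
_DROP_LOWERCASE = str.maketrans("", "", string.ascii_lowercase)


def read_fasta_like(text):
    """Parse FASTA/a3m text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_query_gap_columns(records):
    """Remove every alignment column in which the query (first record) has a gap."""
    aligned = [(h, s.translate(_DROP_LOWERCASE)) for h, s in records]
    query = aligned[0][1]
    if any(len(s) != len(query) for _, s in aligned):
        raise ValueError("sequences have different aligned lengths")
    keep = [i for i, c in enumerate(query) if c not in "-."]
    return [(h, "".join(s[i] for i in keep)) for h, s in aligned]
```

The cleaned records can then be written back in FASTA format, one header line and one sequence line per record. Whether dropping insertion states is appropriate for a given alignment should be checked against the file at hand.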
For testing purposes, example input files for the BLAT protein are available in the data/ folder of this repository.
## Usage
### Running the program
Let's assume that our input MSA is inputAli.fasta and input.pdb is our structure file in PDB format.
Run the program by issuing the following command in a bash terminal:
By default, ESGEMME will predict the effect of all possible single mutations at all positions in the query sequence. Alternatively, a set of single or multiple mutations can be given with the option -m.
...
...
JET is run in its iterative mode, iJET, 10 times and the final conservation levels [...] values obtained over the 10 iterations.
JET2 configuration file is: default.conf.
JET2 output file is: myProt_jet.res.
By default, ESGEMME will output mutational effect predictions obtained from the global epistatic model, the independent model, and a combination of those two using a reduced alphabet (alphabets/lw-i.11.txt).
### Analyzing the ESGEMME output
By default, ESGEMME will output the following files:
* myProt_pred_evolEpi.txt
* myProt_normPred_evolEpi.txt
* myProt_pred_evolInd.txt
* myProt_normPred_evolInd.txt
* myProt_normPred_evolCombi.txt
The most important output file is **myProt_normPred_evolCombi.txt**.
The values of interest are the normalized predictions (normPred). Each file contains a 20 x n matrix,
where n is the number of positions in the query sequence.
If the user provides their own list of mutations, then only the global epistatic model will be run
and the output file will contain 2 columns: the first with the mutations, the second with the
normalized predicted effects.
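The two-column output described above could be loaded along the following lines. This is a minimal sketch under assumptions that the README does not confirm: columns are taken to be separated by whitespace, mutation labels are taken to contain no whitespace, and an optional header row or non-numeric entries (such as NA) are skipped. The function name `parse_mutation_predictions` and the mutation labels in the usage note are hypothetical.

```python
def parse_mutation_predictions(text):
    """Parse two-column text into a {mutation: normalized_effect} dictionary.

    Assumes whitespace-separated columns; this is an assumption, not a
    documented property of the ESGEMME output format.
    """
    predictions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        mutation, value = (field.strip('"') for field in fields)
        try:
            predictions[mutation] = float(value)
        except ValueError:
            continue  # skip a header row or a non-numeric entry such as NA
    return predictions
```

For example, passing the contents of an output file as a string returns a dictionary that maps each mutation label, such as a made-up `A24G`, to its normalized predicted effect.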
The external tools that ESGEMME depends on must be installed before ESGEMME can be used.
# Cite
Laine E, Karami Y, Carbone A. GEMME: A Simple and Fast Global Epistatic Model Predicting Mutational Effects. Molecular Biology and Evolution, Volume 36, Issue 11, November 2019, Pages 2604–2619