Using ESGEMME via Docker
Requirements
You need to have Docker installed on your machine. For installation instructions, see https://docs.docker.com/get-docker/
I am assuming some basic familiarity with Linux/Unix/macOS terminal commands.
Let’s start our favorite terminal app.
Create a folder called docker-tutorial and go to that empty folder:
mkdir docker-tutorial
cd docker-tutorial
Getting the input data
Let’s download the sample data provided in the ESGEMME repository for this exercise. First, we will download the multiple sequence alignment file in FASTA format:
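wget http://gitlab.lcqb.upmc.fr/tekpinar/ESGEMME/raw/master/data/aliBLAT.fasta
If you don’t have wget, you can do the same with curl. The -O option saves the download under its remote file name instead of printing it to the terminal, and -L follows redirects:
curl -L -O http://gitlab.lcqb.upmc.fr/tekpinar/ESGEMME/raw/master/data/aliBLAT.fasta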
Please verify that the aliBLAT.fasta file is in the folder.
Now, we will download the PDB (Protein Data Bank) file for BLAT:
wget http://gitlab.lcqb.upmc.fr/tekpinar/ESGEMME/raw/master/data/blat-af2.pdb
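A download can succeed and still leave you with a web page instead of the data file, for example when the server returns the repository's file viewer. The sketch below is one way to check both files quickly. The helper names looks_like_fasta and looks_like_pdb are hypothetical and are not part of ESGEMME. The checks are deliberately crude: a FASTA file should start with a '>' header line, and a PDB structure file is assumed to contain at least one ATOM record.

```shell
# Hypothetical helpers for a quick sanity check of the downloaded inputs.

looks_like_fasta() {
    # Take the first non-empty line; fail if the file is missing or empty.
    first_line=$(grep -m 1 -v '^[[:space:]]*$' "$1" 2>/dev/null) || return 1
    case "$first_line" in
        '>'*) return 0 ;;   # FASTA header line
        *)    return 1 ;;   # anything else, e.g. '<!DOCTYPE html>'
    esac
}

looks_like_pdb() {
    # Assumption: a usable PDB file has at least one line starting with ATOM.
    grep -q '^ATOM' "$1" 2>/dev/null
}

if looks_like_fasta aliBLAT.fasta; then
    echo "aliBLAT.fasta: looks like FASTA"
else
    echo "aliBLAT.fasta: missing or not FASTA (perhaps an HTML page?)"
fi

if looks_like_pdb blat-af2.pdb; then
    echo "blat-af2.pdb: looks like PDB"
else
    echo "blat-af2.pdb: missing or not PDB (perhaps an HTML page?)"
fi
```

If either check fails, open the file in a text editor and look at its first few lines before going on.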
Running a calculation for a single sequence/protein
To make sure that Docker is installed, run:
docker -h
If it shows you a list of options, you are on the right track. Next, start the container. On macOS, you may not need sudo before the docker command at all.
sudo docker run -ti --rm --mount type=bind,source=$PWD,target=/home/tekpinar/research/myexample tekpinar/esgemme-docker:v1.3.0
You are in the container (your virtual operating system) now. You created a folder called myexample in your container with the previous command. Let’s change to that folder.
cd ../myexample/
When you list the contents of that folder with the ls command, you should see the aliBLAT.fasta and blat-af2.pdb files. Basically, the docker-tutorial folder on the host system and the myexample folder in the Docker container point to the same place.
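For reference, the docker run command used to start the container can be unpacked flag by flag. The sketch below only prints the command rather than running it, so it is safe to try on the host in a separate terminal. The function name docker_run_cmd is hypothetical; the image name and the container path are the ones used in this tutorial.

```shell
# Hypothetical helper: prints (does not run) the docker run command
# for a given host folder.
docker_run_cmd() {
    host_dir=$1
    container_dir=/home/tekpinar/research/myexample
    image=tekpinar/esgemme-docker:v1.3.0

    # -t    allocate a pseudo-terminal
    # -i    keep standard input open, so the shell is interactive
    # --rm  remove the container when you exit it
    # --mount type=bind,source=...,target=...
    #       make the host folder visible inside the container at the target path
    printf 'docker run -ti --rm --mount type=bind,source=%s,target=%s %s\n' \
        "$host_dir" "$container_dir" "$image"
}

# Print the command for the current folder.
docker_run_cmd "$PWD"
```

Because the mount is a bind mount, files written to the target path inside the container appear in the host folder as well.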
One last step and we are done:
python $ESGEMME_PATH/esgemme.py aliBLAT.fasta -r input -f aliBLAT.fasta --pdbfile blat-af2.pdb
After a few minutes of calculation, you should see two files named BLAT_normPred_evolCombi.txt and BLAT_normPred_evolCombi.png. These files contain the entire single point mutational landscape of the BLAT protein.
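As a quick check that the run finished, you can test for the two output files by name. This is a minimal sketch; the function name check_outputs is hypothetical, and it assumes only the file naming pattern mentioned above. Since the container folder is bind-mounted, the same check should also work from the docker-tutorial folder on the host.

```shell
# Hypothetical helper: report whether both expected output files
# exist and are non-empty.
check_outputs() {
    prefix=$1        # e.g. BLAT
    dir=${2:-.}      # folder to look in, current folder by default
    status=0
    for ext in txt png; do
        f="$dir/${prefix}_normPred_evolCombi.$ext"
        if [ -s "$f" ]; then
            echo "found: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return $status
}

check_outputs BLAT . || echo "Some expected outputs are not there yet."
```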