Default model change: from senseppi.ckpt to fly_worm_human_chicken.ckpt

parent 358ad745
...@@ -127,6 +127,8 @@ dmypy.json ...@@ -127,6 +127,8 @@ dmypy.json
/esm2_embs_3B /esm2_embs_3B
*.sh *.sh
draft.py draft.py
/data/string_species/mmseqs_dbs/ /data/string_species/mmseqs_dbs_orig/
/data/human_virus/all_test_viruses.csv /data/human_virus/all_test_viruses.csv
/esm2_backup /esm2_backup
/data/string_species/mmseqs_dbs/
/data/string_species/mmseqs_dbs_fwh/
...@@ -33,15 +33,17 @@ the interactions are taken from the STRING database (based on seed proteins). ...@@ -33,15 +33,17 @@ the interactions are taken from the STRING database (based on seed proteins).
Predictions are compared with the STRING database. Optionally, the graphs can be constructed. Predictions are compared with the STRING database. Optionally, the graphs can be constructed.
- `create_dataset`: creates a dataset from the STRING database based on the taxonomic ID of the organism. - `create_dataset`: creates a dataset from the STRING database based on the taxonomic ID of the organism.
The package already comes with one pretrained version of the model, `fly_worm_human_chicken.ckpt` (a checkpoint with weights), which is used by **default** if no model path is specified.
This model was trained on a dataset combining PPIs from D. melanogaster, C. elegans, H. sapiens and G. gallus, and it provides the best performance among the pretrained models.
The original SENSE-PPI repository contains two models (checkpoints with weights) pretrained on human PPIs: `senseppi.ckpt` and `dscript.ckpt` pretrained on SENSE-PPI and DSCRIPT human datasets respectively. The original SENSE-PPI repository also contains two human-based models pretrained on human PPIs: `senseppi.ckpt` and `dscript.ckpt` pretrained on SENSE-PPI and DSCRIPT human datasets respectively.
- `senseppi.ckpt`: Download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/senseppi.ckpt) - `senseppi.ckpt`: Download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/senseppi.ckpt)
- `dscript.ckpt` : Download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/dscript.ckpt) - `dscript.ckpt` : Download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/dscript.ckpt)
The package already comes with preinstalled model `senseppi.ckpt` that is used by default if model path is not specified. For information about the other models that can be found in the pretrained_models folder, please refer to the original article.
**N.B.**: Both pretrained models were made to work with proteins in range 50-800 amino acids. **N.B.**: All pretrained models were made to work with proteins in range 50-800 amino acids.
In order to cite the original SENSE-PPI paper, please use the following link: https://doi.org/10.1101/2023.09.19.558413 In order to cite the original SENSE-PPI paper, please use the following link: https://doi.org/10.1101/2023.09.19.558413
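
The length note above can be illustrated with a small pre-filtering step. This is only a sketch: it assumes the 50-800 range is inclusive on both ends, and `split_by_length` is a hypothetical helper, not a function from the SENSE-PPI package.

```python
# Range taken from the README note; treating both bounds as inclusive is an assumption.
MIN_LEN, MAX_LEN = 50, 800


def split_by_length(seqs, min_len=MIN_LEN, max_len=MAX_LEN):
    """Split a {protein_id: sequence} dict into (kept, skipped) by sequence length.

    Hypothetical helper for illustration; the package may filter differently.
    """
    kept, skipped = {}, {}
    for seq_id, seq in seqs.items():
        target = kept if min_len <= len(seq) <= max_len else skipped
        target[seq_id] = seq
    return kept, skipped


# Toy sequences chosen to sit on and just outside the boundaries.
seqs = {
    "short": "M" * 49,
    "ok_min": "M" * 50,
    "ok_max": "M" * 800,
    "long": "M" * 801,
}
kept, skipped = split_by_length(seqs)
print(sorted(kept), sorted(skipped))
```

Filtering before prediction keeps out-of-range proteins from silently reaching a model that was not trained on them.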
......
__version__ = "0.6.1" __version__ = "0.6.2"
__author__ = "Konstantin Volzhenin" __author__ = "Konstantin Volzhenin"
from . import model, commands, esm2_model, dataset, utils, network_utils from . import model, commands, esm2_model, dataset, utils, network_utils
......
...@@ -71,7 +71,7 @@ def add_args(parser): ...@@ -71,7 +71,7 @@ def add_args(parser):
) )
predict_args.add_argument("--model_path", type=str, default=None, predict_args.add_argument("--model_path", type=str, default=None,
help="A path to .ckpt file that contains weights to a pretrained model. If " help="A path to .ckpt file that contains weights to a pretrained model. If "
"None, the preinstalled senseppi.ckpt trained version is used. " "None, the preinstalled fly_worm_human_chicken.ckpt trained version is used. "
"(Trained on human PPIs)") "(Trained on human PPIs)")
predict_args.add_argument("--pairs_file", type=str, default=None, predict_args.add_argument("--pairs_file", type=str, default=None,
help="A path to a .tsv file with pairs of proteins to test (Optional). If not provided, " help="A path to a .tsv file with pairs of proteins to test (Optional). If not provided, "
......
...@@ -173,7 +173,7 @@ def add_args(parser): ...@@ -173,7 +173,7 @@ def add_args(parser):
"typed (separated by whitespaces).") "typed (separated by whitespaces).")
string_pred_args.add_argument("--model_path", type=str, default=None, string_pred_args.add_argument("--model_path", type=str, default=None,
help="A path to .ckpt file that contains weights to a pretrained model. If " help="A path to .ckpt file that contains weights to a pretrained model. If "
"None, the preinstalled senseppi.ckpt trained version is used. " "None, the preinstalled fly_worm_human_chicken.ckpt trained version is used. "
"(Trained on human PPIs)") "(Trained on human PPIs)")
string_pred_args.add_argument("-s", "--species", type=int, default=9606, string_pred_args.add_argument("-s", "--species", type=int, default=9606,
help="Species from STRING database. Default: H. Sapiens") help="Species from STRING database. Default: H. Sapiens")
......
...@@ -47,7 +47,7 @@ def add_args(parser): ...@@ -47,7 +47,7 @@ def add_args(parser):
) )
test_args.add_argument("--model_path", type=str, default=None, test_args.add_argument("--model_path", type=str, default=None,
help="A path to .ckpt file that contains weights to a pretrained model. If " help="A path to .ckpt file that contains weights to a pretrained model. If "
"None, the preinstalled senseppi.ckpt trained version is used. " "None, the preinstalled fly_worm_human_chicken.ckpt trained version is used. "
"(Trained on human PPIs)") "(Trained on human PPIs)")
test_args.add_argument("-o", "--output", type=str, default="test_metrics", test_args.add_argument("-o", "--output", type=str, default="test_metrics",
help="A path to a file where the test metrics will be saved. " help="A path to a file where the test metrics will be saved. "
......
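
The `--model_path` help strings in the hunks above describe a fallback to the bundled checkpoint when the argument is `None`. A minimal sketch of that pattern follows. The `DEFAULT_CKPT` location and the `resolve_model_path` helper are assumptions made for illustration; the diff does not show how the package actually resolves the default.

```python
import argparse
from pathlib import Path

# Hypothetical location of the bundled checkpoint; the real package layout may differ.
DEFAULT_CKPT = Path("pretrained_models") / "fly_worm_human_chicken.ckpt"


def resolve_model_path(model_path):
    """Return the user-supplied path, or the bundled default when None."""
    return Path(model_path) if model_path is not None else DEFAULT_CKPT


parser = argparse.ArgumentParser()
parser.add_argument("--model_path", type=str, default=None,
                    help="Path to a .ckpt file; the bundled default is used if omitted.")

# No flag given: falls back to the default checkpoint.
default_args = parser.parse_args([])
resolved_default = resolve_model_path(default_args.model_path)

# Explicit flag: the user-supplied checkpoint wins.
custom_args = parser.parse_args(["--model_path", "senseppi.ckpt"])
resolved_custom = resolve_model_path(custom_args.model_path)

print(resolved_default, resolved_custom)
```

Keeping `default=None` in the parser and resolving the fallback in one place means a future change of default checkpoint touches a single constant rather than every subcommand.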