0.5.9 minor changes

Commit 0999af3c authored Oct 05, 2023 by Konstantin Volzhenin

Showing with 12 additions and 4 deletions

README.md +4 -2
docs/source/usage.rst +7 -1
senseppi/__init__.py +1 -1

--- a/README.md
+++ b/README.md
@@ -43,4 +43,6 @@ The package already comes with preinstalled model `senseppi.ckpt` that is used b

 **N.B.**: Both pretrained models were made to work with proteins in range 50-800 amino acids.

-In order to cite the original SENSE-PPI paper, please use the following link: https://doi.org/10.1101/2023.09.19.558413  
\ No newline at end of file
+In order to cite the original SENSE-PPI paper, please use the following link: https://doi.org/10.1101/2023.09.19.558413  
+
+The documentation for the package can be found [here](https://sense-ppi.readthedocs.io/en/latest/).
\ No newline at end of file
--- a/docs/source/usage.rst
+++ b/docs/source/usage.rst
@@ -22,7 +22,7 @@ List of commands

 There are 5 commands available in the package:

-- `train`: trains SENSE-PPI on a given dataset
+- `train`: trains SENSE-PPI on a given dataset.
 - `test`: computes test metrics (AUROC, AUPRC, F1, MCC, Precision, Recall, Accuracy) on a given dataset
 - `predict`: predicts interactions for a given dataset
 - `predict_string`: predicts interactions for a given dataset using STRING database: the interactions are taken from the STRING database (based on seed proteins). Predictions are compared with the STRING database. Optionally, the graphs can be constructed.
@@ -127,6 +127,12 @@ Test
 Train
 ------------

+A dataset for training must be provided as two separate files:
+
+- **pairs_file**: a .tsv file with pairs of proteins and their labels (1 for interacting, 0 for non-interacting)
+- **fasta_file**: a FASTA file with protein sequences. The FASTA file is used to extract ESM2 embeddings for each protein. The embeddings are saved in a separate folder so they can be reused in multiple runs. In order to reuse the embeddings, make sure that `--output_dir_esm` is set to the correct folder.
+
+
 .. code-block:: bash

    usage: senseppi <command> [<args>] train [-h] [-v] [--min_len MIN_LEN] [--max_len MAX_LEN] [--device {cpu,gpu,mps,auto}] [--valid_size VALID_SIZE] [--seed SEED]

--- a/senseppi/__init__.py
+++ b/senseppi/__init__.py
-__version__ = "0.5.8"
+__version__ = "0.5.9"
 __author__ = "Konstantin Volzhenin"

 from . import model, commands, esm2_model, dataset, utils, network_utils
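
The `usage.rst` hunk in this commit describes the two training inputs: a `.tsv` pairs file with labels and a FASTA file with the sequences. The sketch below is one way to sanity-check such a pair of files before training. It is not part of the package: the helper names `read_fasta` and `check_pairs` are hypothetical, and the pairs layout (three tab-separated columns `id1`, `id2`, `label`, no header row) is an assumption, since the commit does not spell it out. The default 50-800 length bounds come from the README note about the pretrained models.

```python
import csv
import io


def read_fasta(handle):
    """Parse FASTA text into {id: sequence}; the id is the first token after '>'."""
    seqs = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            seqs[current] = ""
        elif current is not None:
            seqs[current] += line
    return seqs


def check_pairs(pairs_handle, seqs, min_len=50, max_len=800):
    """Return a list of problems found in the pairs file (empty if none).

    Assumed layout (not confirmed by the commit): id1 <TAB> id2 <TAB> label.
    """
    problems = []
    reader = csv.reader(pairs_handle, delimiter="\t")
    for row_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            problems.append(f"row {row_no}: expected 3 columns, got {len(row)}")
            continue
        id1, id2, label = row
        if label not in {"0", "1"}:
            problems.append(f"row {row_no}: label must be 0 or 1, got {label!r}")
        for pid in (id1, id2):
            if pid not in seqs:
                problems.append(f"row {row_no}: {pid} is missing from the FASTA file")
            elif not min_len <= len(seqs[pid]) <= max_len:
                problems.append(
                    f"row {row_no}: {pid} has length {len(seqs[pid])}, "
                    f"outside {min_len}-{max_len}"
                )
    return problems


# Toy in-memory inputs; real data would be opened from disk instead.
fasta_text = ">P1 toy protein\n" + "M" * 60 + "\n>P2\n" + "A" * 60 + "\n>P3\n" + "G" * 10 + "\n"
pairs_text = "P1\tP2\t1\nP1\tP3\t0\nP1\tP4\t1\n"

sequences = read_fasta(io.StringIO(fasta_text))
problems = check_pairs(io.StringIO(pairs_text), sequences)
for problem in problems:
    print(problem)
```

In the toy data, `P3` is too short and `P4` has no sequence, so both rows are reported. For real files, replace the `io.StringIO` objects with `open(path)` handles.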