0.4.0 alpha version of the package that can be pip installed

parent 80345594
SENSE-PPI
=======================================
SENSE-PPI is a Deep Learning model for predicting physical protein-protein interactions based on amino acid sequences.
It is based on embeddings generated by ESM2 and uses Siamese RNN architecture to perform a binary classification.
P.S.: Both pretrained models were made to work with protein in range 50-800 amino acids.
\ No newline at end of file
## Installation
SENSE-PPI requires Python 3.10 or higher. To install the package, run:
```bash
pip install senseppi
```
**N.B.**: if you intend to use the `create_dataset` command to generate new datasets from STRING,
do not forget to additionally install the MMseqs2 software (instructions can be found at: https://github.com/soedinglab/MMseqs2).
The `mmseqs` command should be available in your PATH.
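One way to confirm this requirement before running `create_dataset` is to look the executable up on the PATH. The snippet below is a minimal sketch using only the Python standard library; the helper name `is_on_path` is hypothetical and is not part of the `senseppi` package.

```python
import shutil


def is_on_path(executable: str) -> bool:
    """Return True if `executable` can be resolved through the PATH variable."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if is_on_path("mmseqs"):
        # shutil.which returns the full path of the resolved executable
        print("mmseqs found:", shutil.which("mmseqs"))
    else:
        print("mmseqs not found; install MMseqs2 before using create_dataset")
```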
## Usage
There are 5 commands available in the package:
- `train`: trains SENSE-PPI on a given dataset
- `test`: computes test metrics (AUROC, AUPRC, F1, MCC, Precision, Recall, Accuracy) on a given dataset
- `predict`: predicts interactions for a given dataset
- `predict_string`: predicts interactions for proteins retrieved from the STRING database:
the candidate interactions are taken from STRING based on a set of seed proteins,
and the predictions are compared with the STRING data. Optionally, graphs can be constructed.
- `create_dataset`: creates a dataset from the STRING database based on the taxonomic ID of the organism.
The original SENSE-PPI repository contains two pretrained models, `senseppi.ckpt` and `dscript.ckpt`, pretrained on the SENSE-PPI and DSCRIPT human datasets, respectively.
- `senseppi.ckpt` (preferred) : Download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/senseppi.ckpt)
- `dscript.ckpt` : Download from [here](http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/dscript.ckpt)
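The checkpoints can also be fetched from a script. The following is a minimal sketch built on the URLs listed above and the standard library only; the function names `checkpoint_url` and `download_checkpoint` are hypothetical helpers, not part of the `senseppi` package.

```python
import urllib.request
from pathlib import Path

# Base location of the pretrained models, taken from the links above
BASE_URL = "http://gitlab.lcqb.upmc.fr/Konstvv/SENSE-PPI/raw/master/pretrained_models/"


def checkpoint_url(name: str) -> str:
    """Build the download URL for a checkpoint file name."""
    return BASE_URL + name


def download_checkpoint(name: str = "senseppi.ckpt", target_dir: str = ".") -> Path:
    """Download a checkpoint into target_dir unless it is already present."""
    target = Path(target_dir) / name
    if not target.exists():
        urllib.request.urlretrieve(checkpoint_url(name), str(target))
    return target
```

Calling `download_checkpoint("senseppi.ckpt")` would place the preferred model in the current directory; the existence check only avoids downloading the same file twice and does not verify its integrity.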
**N.B.**: Both pretrained models were made to work with proteins in range 50-800 amino acids.
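To see which sequences fall outside that range before running a prediction, the input can be split by length. This is a small sketch under two assumptions: sequences are held in a plain dictionary mapping identifiers to amino acid strings, and both bounds are treated as inclusive (the text above does not say whether they are). The function name `split_by_length` is hypothetical.

```python
MIN_LEN = 50   # lower bound stated for the pretrained models
MAX_LEN = 800  # upper bound stated for the pretrained models


def split_by_length(sequences: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split sequences into (within range, outside range) by length.

    Bounds are assumed to be inclusive.
    """
    kept, skipped = {}, {}
    for name, seq in sequences.items():
        if MIN_LEN <= len(seq) <= MAX_LEN:
            kept[name] = seq
        else:
            skipped[name] = seq
    return kept, skipped
```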
\ No newline at end of file
__version__ = "0.3.2"
__version__ = "0.4.0"
__author__ = "Konstantin Volzhenin"
from . import model, commands, esm2_model, dataset, utils, network_utils
@@ -5,9 +5,9 @@ with open("README.md", "r") as fh:
long_description = fh.read()
setup(
name="dscript_data",
name="senseppi",
version=senseppi.__version__,
description="SENSE_PPI: Sequence-based EvolutIoNary ScalE Protein-Protein Interaction prediction",
description="SENSE_PPI: Sequence-based EvolutioNary ScalE Protein-Protein Interaction prediction",
author="Konstantin Volzhenin",
author_email="konstantin.volzhenin@sorbonne-universite.fr",
url="",
@@ -17,6 +17,17 @@ setup(
long_description=long_description,
long_description_content_type="text/markdown",
include_package_data=True,
classifiers=[
"Programming Language :: Python :: 3",
"Intended Audience :: Science/Research",
"Operating System :: OS Independent",
],
python_requires='>=3.9',
entry_points={
'console_scripts': [
'senseppi=senseppi.__main__:main',
],
},
install_requires=[
"numpy",
"pandas",
@@ -31,7 +42,6 @@ setup(
"pytorch-lightning==1.9.0",
"torchmetrics",
"biopython",
"fair-esm",
"mmseqs2"
"fair-esm"
],
)
\ No newline at end of file
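The `console_scripts` entry `senseppi=senseppi.__main__:main` makes the installer generate a `senseppi` executable that imports `senseppi.__main__` and calls its `main` function. The actual `__main__.py` is not part of this diff, so the following is only an illustrative sketch of how such a `main` could dispatch the five commands named in the README using `argparse`; the parser layout and the printed message are assumptions, and no real command options are shown.

```python
import argparse

# Command names as listed in the README; their options are not shown in this diff
COMMANDS = ["train", "test", "predict", "predict_string", "create_dataset"]


def build_parser() -> argparse.ArgumentParser:
    """Create a parser with one (option-less) subcommand per README command."""
    parser = argparse.ArgumentParser(prog="senseppi")
    subparsers = parser.add_subparsers(dest="cmd", required=True)
    for name in COMMANDS:
        subparsers.add_parser(name)
    return parser


def main(argv=None) -> None:
    """Hypothetical entry point: parse the subcommand and report it."""
    args = build_parser().parse_args(argv)
    print(f"would dispatch to: {args.cmd}")
```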