Mustafa Tekpinar / PRESCOTT
Commit d1e78426 authored Mar 05, 2024 by Mustafa Tekpinar
Added custom frequency file input option.
parent 05a0fb8c
Showing 3 changed files with 1175 additions and 135 deletions
data/custom-frequency-file.txt (+963 -0)
examples/example-prescott-script.sh (+3 -0)
prescott/prescott.py (+209 -135)
data/custom-frequency-file.txt
0 → 100644
S2A 2.5607342137e-06
S2L 6.1954797779e-06
S2W 6.840647125218045e-07
F3L 1.5903965494e-06
F3S 1.2003274493e-06
F3L 1.67277954004e-05
V4L 1.5903965494e-06
V4M 1.5903965494e-06
A5P 1.2003245677e-06
A5V 2.0521632536e-06
G6W 6.5666780489e-06
G6V 1.5903965494e-06
V7F 5.01813964553e-05
V7G 6.3620921103e-06
I8L 1.5903965494e-06
I8N 1.2003216862e-06
I8M 1.3680957338e-06
R9W 1.2003245677e-06
R9P 6.840516103258959e-07
R9Q 6.1954644244e-06
R10W 1.3680976055e-06
R10G 1.3680976055e-06
R10Q 2.0521520234e-06
R10L 1.3681013489e-06
L11V 1.5904066669e-06
D12N 2.560721099e-06
D12E 2.0521464083e-06
E13Q 2.0521492158e-06
E13K 3.42024869312e-05
E13D 1.5903914907e-06
T14R 1.3680938621e-06
T14I 3.4202346554e-06
V15L 2.0521436008e-06
V15M 2.0521436008e-06
V15A 3.044813566e-06
V16M 2.0521464083e-06
V16L 8.8926344361e-06
V16G 2.0521520234e-06
N17D 1.2003216862e-06
R18G 1.2390007706e-06
R18C 6.8769569899e-05
R18L 1.5904066669e-06
I19V 6.567281802e-06
A20V 1.590386432e-06
A21T 1.2003216862e-06
G22R 1.2003245677e-06
G22A 0.0003172014898557
E23K 2.4006433724e-06
E23G 7.9542598243e-06
I25V 1.2003216862e-06
I25T 2.7362101847e-06
I25N 1.3681050923e-06
R27W 6.4025077342e-06
R27Q 3.96506775309e-05
P28T 6.840553537592262e-07
P28S 6.840553537592262e-07
P28L 1.5904218434e-06
N30K 1.5904521973e-06
A31T 1.3681163227e-06
A31S 5.51349194844e-05
A31G 5.51361491178e-05
I32F 6.5693526559e-06
I32V 1.17713824063e-05
I32M 6.840731356270762e-07
K33T 2.5608916e-06
K33N 4.801298271e-06
E34D 1.5905483256e-06
E34D 6.5676268537e-06
M35I 6.841012141428348e-07
M35I 6.841012141428348e-07
I36V 1.5906900095e-06
I36T 3.7174536929e-06
I36M 2.0522137914e-06
E37K 1.5905280871e-06
E37D 6.840890465030052e-07
E37D 1.2391205218e-06
C39R 1.5906647069e-06
A42T 6.84875215735693e-07
A42G 6.5728933876e-06
A42V 6.2019119254e-06
K43Q 1.5905129086e-06
S44F 1.5905685646e-06
T45A 3.1810764126e-06
T45R 1.3687848612e-06
S46G 1.3686106412e-06
Q48K 1.5905635048e-06
Q48H 6.1963473771e-06
V49M 4.7715842614e-06
V49A 1.200552254e-06
I50T 1.5905078491e-06
I50M 8.0548523049e-06
V51I 1.5905230275e-06
K52N 1.5905078491e-06
E53K 5.5763671083e-06
E53D 1.5905027897e-06
G54R 6.5735847072e-06
G55D 1.3682379967e-06
K57Q 1.5905027897e-06
K57T 1.590517968e-06
Q60E 6.5725477824e-06
Q60H 1.2007021706e-06
Q62E 6.843296717133699e-07
N64S 2.6040601017e-05
T66I 1.5905736244e-06
G67R 2.4073414284e-06
G67E 2.4069184463e-06
I68V 2.4074863194e-06
I68M 2.7422804804e-06
R69K 1.3713809257e-06
R69T 1.3713809257e-06
E71K 7.016549233021003e-07
L73M 6.2587448573e-06
L73V 2.7816605122e-06
D74N 6.574535509e-06
D74V 5.1239220548e-06
I75V 2.45024772e-06
V76I 1.44986971884e-05
V76A 6.895144163674174e-07
C77G 2.4286876586e-06
C77S 6.5732390292e-06
E78Q 2.4188067058e-06
R79K 6.868745148948738e-07
R79S 7.5466313209e-06
F80V 1.2058391555e-06
T81A 1.5910139531e-06
T81I 4.60211434281e-05
T81S 2.0363115067e-06
T82A 2.7415662568e-06
T82I 1.5909785154e-06
K84E 2.4064029569e-06
L85V 3.181946906e-06
Q86E 1.2023710757e-06
S87A 1.2024520402e-06
S87F 6.847664124813743e-07
F88V 6.847645368663531e-07
F88L 5.12272775e-06
E89G 4.7934567945e-06
E89D 6.848470736484543e-07
D90G 1.5909987653e-06
A92T 1.3703416947e-06
A92V 6.5772165219e-06
S93G 6.45551344298e-05
I94V 6.5698705735e-06
S95A 1.30419084567e-05
T96A 5.4918425543e-06
R100Q 1.7549247576e-05
E102D 6.2919287138e-06
A103S 1.5904774931e-06
A105S 3.1809347494e-06
S106N 3.7177162812e-06
I107V 5.1209045565e-06
I107L 6.5684033525e-06
I107R 1.5904673747e-06
H109Y 1.2003850835e-06
H109P 1.3681556304e-06
H109R 6.840778152196368e-07
H109L 1.31281835845e-05
V110G 1.5904572564e-06
H112Y 6.840918543814715e-07
H112D 6.840918543814715e-07
H112R 3.1809043947e-06
I115V 6.1957254451e-06
I115T 4.7713414149e-06
T116A 4.1044725068e-06
T116K 1.36308493382e-05
T117M 8.2092482654e-06
T117R 6.841030861258421e-07
K118E 6.840843667567834e-07
K118Q 2.736337467e-06
A120G 3.8414456128e-06
D121H 1.2005147807e-06
D121G 6.4020486555e-06
D121V 1.5904420792e-06
G122R 1.5904471383e-06
K123Q 6.5694389699e-06
K123N 1.3684046317e-06
C124R 4.3375452963e-06
C124G 1.3684065043e-06
C124Y 1.98434420845e-05
A125T 1.2009646147e-06
Y126N 5.20548658285e-05
R127G 6.844636337626746e-07
R127K 1.369011601e-06
R127T 6.845058005021535e-07
A128T 1.5906292847e-06
A128G 3.1812383288e-06
S129G 6.5701295629e-06
S129N 7.9528681223e-06
Y130C 4.7910880287e-06
D132H 0.0001140305973839
D132G 4.8032469949e-06
K134E 3.1810662934e-06
L135R 6.841395918423195e-07
L135P 6.841395918423195e-07
K136R 1.5905129086e-06
A137T 2.06768651327e-05
P138L 3.1810460551e-06
P138R 3.1810460551e-06
P139S 6.841115101761587e-07
P139A 8.2093381221e-06
P141S 1.2005118982e-06
P141L 2.0523963095e-06
P141R 2.7365284126e-06
C142R 1.368243613e-06
C142S 3.1811472489e-06
C142W 1.5905382063e-06
A143V 6.841564419486418e-07
G144V 6.5734982843e-06
N145H 1.2006445059e-06
N145S 6.842023158879988e-07
N145I 6.842023158879988e-07
Q146P 1.2008204004e-06
G147R 1.2009848075e-06
T148N 6.84532040206674e-07
T148I 6.84532040206674e-07
Q149H 1.3691596782e-06
I150T 6.846763945488804e-07
T151M 8.0679394979e-06
V152M 1.09589491463e-05
E153K 1.2019461912e-06
L155I 6.842734794759012e-07
N158S 1.2004283128e-06
I159V 1.368161246e-06
I159K 1.2004023748e-06
A160V 4.21320293883e-05
T161A 1.2003706744e-06
T161R 6.840955982552826e-07
T161M 3.4204779912e-06
R163G 2.5608719256e-06
K164N 3.1808639226e-06
A165S 1.5904572564e-06
L166I 1.5904521973e-06
N168S 1.2003476206e-06
N168K 1.200344739e-06
P169L 5.1227539925e-06
S170G 1.200353384e-06
E171A 2.4007298218e-06
G174W 1.5904420792e-06
G174R 2.5610949192e-06
I176V 1.2005118982e-06
I176N 6.5749677826e-06
L177S 1.97083169097e-05
E178Q 6.843193691122873e-07
E178K 6.843193691122873e-07
V179F 1.5904521973e-06
V180G 5.08658357753e-05
G181S 1.31414679019e-05
Y183C 4.1069898228e-06
V185I 3.1810662934e-06
V185G 1.3687005556e-06
H186N 4.8051049434e-06
A188T 6.843053206107288e-07
A188V 1.02457179302e-05
I190V 1.8592521344e-06
I190S 3.6030928949e-06
V194I 2.053531458e-06
K196T 3.1810258172e-06
G198R 2.4044241404e-06
G198E 1.2020964562e-06
E199Q 4.02953338003e-05
E199K 6.84502989224554e-07
T200P 1.2014927345e-06
T200A 1.2014927345e-06
T200I 6.844608228314228e-07
V201L 1.17773510382e-05
V201A 1.3685357215e-06
V201E 6.842678607597016e-07
D203H 4.3375883009e-06
D203V 1.42516342906e-05
V204F 1.5904572564e-06
R205S 1.5904572564e-06
T206A 1.5904673747e-06
P208S 1.11531626651e-05
P208L 6.841133822155149e-07
P208H 1.3682267644e-06
N209S 4.0270418961e-05
A210P 4.1046353647e-06
A210D 1.2005436061e-06
T212A 4.059446535e-06
V213L 6.841536335399477e-07
V213M 0.0014672568740488
N215S 1.8471288776e-05
I216T 1.2006099098e-06
R217C 0.0001332262566024
R217S 6.84220105397265e-07
R217H 1.54926732049e-05
R217P 2.7368055184e-06
S218P 4.1054891758e-06
S218C 6.0050105808e-06
I219F 1.3691165638e-06
I219V 0.284034317711987
G221R 1.2025156627e-06
N222K 3.1809549863e-06
A223V 1.43966172062e-05
V224I 6.858832873562903e-07
R226Q 1.2145326114e-06
R226L 1.2145326114e-06
E227K 1.346072564e-06
L228M 5.42321756915e-05
I229T 1.411781315e-06
E230Q 1.268272638e-06
I231V 3.4820687388e-06
G232R 6.948845379851684e-07
G232R 6.948845379851684e-07
G232V 1.2550232304e-06
E234D 6.88721361243996e-07
D235G 6.874628770046417e-07
D235V 6.8457430058e-06
T237I 8.2391778399e-06
L238R 6.861929739329014e-07
K241E 1.2412152987e-06
K241Q 6.853624402192612e-07
N243I 6.851680306078263e-07
N243S 2.0555040918e-06
Y245C 1.5905078491e-06
I246L 6.850112684353657e-07
I246V 2.4813187713e-06
I246T 2.4067446612e-06
S247T 6.850572570855472e-07
S247P 6.850572570855472e-07
S247A 1.73709339606e-05
N248S 5.5849281405e-06
N250H 1.2045116187e-06
S252L 6.862767973932462e-07
V253L 2.4171637966e-06
V253G 6.866613290192691e-07
C256R 6.922590211734345e-07
C256Y 4.8486124656e-06
I257V 1.3894292224e-06
I257F 6.947146112377035e-07
I257S 1.244394005e-06
F258I 1.2311450141e-06
F258L 2.0740863649e-06
L259S 6.2593455506e-06
L259F 7.00942768022991e-07
L259F 7.00942768022991e-07
L260F 8.869291977e-06
L260V 1.4022157813e-06
L260H 2.80305982e-06
F261I 1.2515832528e-06
F261Y 6.5714248163e-06
I262F 1.2657234495e-06
I262T 1.4088654265e-06
I262M 1.9159021152e-06
N263S 2.1279100943e-06
N263K 1.2813448996e-06
H264N 1.4224791535e-06
H264D 4.2674192497e-06
H264Y 1.41196887507e-05
R265C 1.2009386536e-06
R265H 7.25099344807e-05
R265L 2.0525648165e-06
L266R 6.841161902937595e-07
L266P 1.239246439e-06
V267L 6.841021501330579e-07
V267I 3.0982159233e-06
E268G 0.000114003608462
S269T 6.4029996772e-06
T270A 4.37800817045e-05
T270N 1.2003994929e-06
L272W 1.44096025591e-05
K274R 6.4027372982e-06
A275P 1.5903965494e-06
A275V 2.4007182949e-06
I276V 3.4202440138e-06
I276L 6.840488027777854e-07
I276T 2.0521520234e-06
I276M 6.5701295629e-06
E277K 1.2003274493e-06
A281G 6.5734982843e-06
A282T 2.7362289019e-06
A282D 6.840665843050499e-07
A282G 4.9567891902e-06
Y283C 9.2930930015e-06
Y283F 6.840609689860438e-07
P285A 2.736322492e-06
P285S 1.368161246e-06
K286Q 1.2008204004e-06
K286R 4.1044331982e-06
N287K 1.64188334966e-05
T288I 1.858855837e-06
T288R 6.841339753246558e-07
P290S 1.5904117257e-06
P290Q 6.842744159375723e-07
P290L 6.842744159375723e-07
L292Q 1.2012964391e-06
L294F 2.0334419869e-06
L296S 2.5610424467e-06
I298V 1.2003274493e-06
I298M 6.5707339509e-06
S299R 1.2003303309e-06
S299N 1.2003332124e-06
P300T 1.5904117257e-06
P300H 3.1807829815e-06
P300L 1.5903914907e-06
N302D 1.2003332124e-06
D304H 1.5903813734e-06
D304N 1.5903813734e-06
D304G 1.2003332124e-06
N306D 6.5715111847e-06
N306I 6.3615457283e-06
N306S 1.590386432e-06
V307A 6.0016516545e-06
H308Y 3.1807526296e-06
P309S 1.43650052056e-05
T310A 1.59037631484e-05
T310R 1.3680919905e-06
K311E 1.5903813734e-06
K311N 1.2003303309e-06
H312R 1.31395684965e-05
H312P 8.8926101041e-06
H312Q 1.2004427232e-06
E313A 1.2003245677e-06
V314F 4.7711441203e-06
H315Y 1.16294501459e-05
H315Q 1.67275259493e-05
H318Q 2.4782041941e-06
E319K 6.44336284063e-05
E320K 1.2003216862e-06
E324G 1.2003332124e-06
R325W 4.7883743742e-06
R325Q 6.87659059877e-05
V326M 4.3368734611e-06
V326L 6.840525461803874e-07
V326A 0.0005786377356059
Q327E 6.8150193112e-06
Q327R 1.2003303309e-06
Q328P 1.3681032206e-06
Q328H 1.5904016082e-06
H329Y 1.5904117257e-06
H329L 1.2003216862e-06
I330V 3.1808032164e-06
I330M 1.368106964e-06
E331K 1.30098465953e-05
S332G 1.2003216862e-06
S332N 3.1808234515e-06
L334P 1.5903965494e-06
G336D 7.4344744012e-06
G336A 1.3680994772e-06
S337C 2.0521464083e-06
N338H 1.2003245677e-06
N338S 0.0001263785483565
S339F 1.2003245677e-06
S340F 1.31278388951e-05
R341G 6.5690937278e-06
R341K 1.2003216862e-06
R341S 1.5904370202e-06
M342V 1.2003216862e-06
M342T 1.2003274493e-06
T345A 1.2003216862e-06
T345I 3.0977897889e-06
T347S 3.5576806768e-06
L348W 7.91031600130362e-07
L348S 1.5820632002e-06
P350S 7.892385741931614e-07
P350R 1.5652538685e-06
P350Q 7.826355094252795e-07
G351E 7.605291457584529e-07
L352V 7.4626865671e-06
L352I 2.413663265e-06
L352F 8.045039348287453e-07
L352P 1.4650275937e-06
A353P 7.418650784819067e-07
A353V 7.506391692526186e-07
G354S 1.45454160782e-05
P355S 1.9670196374e-06
P355A 7.24366830953064e-07
P355L 4.3778565513e-06
S356P 6.958119081249957e-07
S356F 5.6638611674e-06
G357R 1.6038801067e-06
G357E 6.861440079043789e-07
M359T 1.6032732426e-06
V360I 1.6038595275e-06
V360A 1.2004600162e-06
K361E 3.1905050569e-06
S362Y 1.6071279338e-06
T363A 1.2428165205e-06
T364A 3.1032077237e-06
T364P 6.84178910048262e-07
S365G 1.5906394052e-06
S365R 1.5958635217e-06
L366V 1.2003591474e-06
L366P 1.5920676819e-06
T367A 1.5907557998e-06
T367N 1.3685862914e-06
T367I 4.9668277988e-06
S368T 1.5932954129e-06
S368L 7.4489745245e-06
T371A 1.864968799e-06
S372P 1.5906444655e-06
G373R 1.16302775805e-05
G373E 7.9535765643e-06
S374R 1.2003274493e-06
S374N 1.2003389757e-06
S374R 7.9526151379e-06
S375G 1.29977452332e-05
D376A 6.840656484121468e-07
D376V 6.840656484121468e-07
K377E 3.1809448678e-06
V378F 1.2003360941e-06
Y379H 2.5629131095e-06
Y379C 4.02935353051e-05
Y379F 6.1990054315e-06
Y379S 4.7883940273e-06
A380V 6.6019673862e-06
H381Y 3.8447046882e-06
Q382H 6.5998759223e-06
M383V 1.3680957338e-06
M383T 6.8169113937e-06
V384I 5.1260626968e-06
V384F 1.5903965494e-06
V384D 0.0010107169459616
R385C 4.83422332624e-05
R385H 3.28473169939e-05
R385L 1.3681013489e-06
T386I 6.5822384876e-06
D387N 1.590386432e-06
R389W 1.67314442086e-05
R389Q 9.10882254526e-05
Q391K 6.840459952527208e-07
Q391R 3.1807627469e-06
K392R 6.5805059092e-06
L393V 1.2003216862e-06
A395G 1.5903965494e-06
Q398E 6.8405161032e-06
P399A 1.2003216862e-06
S401N 3.0986268125e-06
K402I 1.5903914907e-06
P403T 1.3680994772e-06
P403A 6.840497386245948e-07
P403S 1.8590770054e-06
P403R 3.0985730451e-06
L404P 1.5903965494e-06
S406G 3.7178637155e-06
S406T 6.840459952527208e-07
S406N 0.0010354288943527
P408H 1.5903813734e-06
Q409E 9.5766439335e-06
Q409P 4.7711592962e-06
Q409R 6.581025587e-06
Q409H 2.78853947456e-05
A410V 1.5903813734e-06
I411F 1.3680919905e-06
I411V 3.7176241407e-06
T413I 5.1234626409e-06
D415N 2.04925906476e-05
D415E 2.0302837118e-06
D418N 4.7711441203e-06
I419V 2.29266273567e-05
I419T 9.9141066584e-06
S421G 4.7883285176e-06
S421N 1.5903813734e-06
G422S 2.5612589099e-06
R423G 1.3680901188e-06
R423K 2.41652456676e-05
R423T 1.11531903081e-05
A424T 5.51445530812e-05
A424G 1.5903813734e-06
R425G 1.3680901188e-06
R425T 1.5903763148e-06
Q426L 3.1807526296e-06
Q426H 6.5747948664e-06
D428N 2.5611933111e-06
D428E 2.0521351782e-06
E429K 1.5903914907e-06
E429A 3.6011379595e-06
M431I 3.1807425125e-06
L432H 1.5903763148e-06
E433K 6.840459952527208e-07
E433Q 2.4783823102e-06
P435T 1.5903712562e-06
A436S 1.5903763148e-06
A436V 1.2003216862e-06
P437S 4.3373034117e-06
P437A 1.36315250082e-05
V440M 2.560721099e-06
A441T 0.0003401503848841
A442T 4.7711441203e-06
A442G 1.3681032206e-06
A442D 3.4202580516e-06
K443Q 3.8416620566e-06
N444S 1.2003216862e-06
N444K 2.4006433724e-06
Q445E 1.5903813734e-06
S446R 6.5739304215e-06
E448V 1.2003245677e-06
E448D 1.92065346826e-05
D450Y 5.121415969e-06
D450N 1.590386432e-06
T452A 2.4782287603e-06
K453T 1.31430223694e-05
K453N 8.0543133326e-06
G454R 9.04525477261e-05
E457K 1.2003216862e-06
E457V 2.56082602e-06
M458V 3.1807829815e-06
M458T 1.5903914907e-06
E460K 1.5904016082e-06
E460A 0.0001493090275583
T465S 6.840544178970525e-07
T465A 6.840544178970525e-07
S466C 2.4006491355e-06
S467G 1.2003216862e-06
S467N 6.5722022135e-06
S467T 1.5904572564e-06
N468S 1.5904724339e-06
N468K 6.840665843050499e-07
P469L 1.5905280871e-06
R472K 3.7173799905e-06
H473D 6.840619048661428e-07
H473Y 1.3681238097e-06
H473R 1.5903965494e-06
R474G 4.02701195712e-05
R474W 9.9133695419e-06
R474Q 9.5767618505e-06
D476N 1.5903965494e-06
D476Y 6.5696979252e-06
D478Y 3.1807728641e-06
D478V 1.5903813734e-06
V479L 6.5696116045e-06
V479L 2.4006664249e-06
E480Q 1.2003389757e-06
M481V 1.8584919452e-06
M481T 6.5715975553e-06
M481I 1.5903965494e-06
V482M 1.5904016082e-06
V482L 3.1808032164e-06
E483K 6.5704749139e-06
D484Y 6.5695252861e-06
D484H 1.2003360941e-06
D484G 9.9126079699e-06
D485Y 1.2003389757e-06
D485G 1.2003418573e-06
S486F 5.1212979417e-06
R487L 6.840562896239606e-07
R487Q 2.23040312058e-05
E489K 1.2003389757e-06
M490V 2.5607866736e-06
M490T 1.3681107075e-06
M490K 6.840553537592262e-07
A492T 3.71729707894e-05
C494G 1.590386432e-06
C494Y 1.5903914907e-06
T495I 1.3683484581e-06
P496S 1.2004254307e-06
P496R 2.78812481877e-05
P496L 1.67287489126e-05
P496H 2.0522053682e-06
R497W 9.292931796e-06
R497G 4.1044051212e-06
R497Q 2.1684770352e-05
I500M 3.8415439933e-06
I501L 1.3680938621e-06
I501S 6.8404786693e-06
I501T 1.3680957338e-06
L503F 6.567281802e-06
T504A 1.5903965494e-06
S505N 1.73474723493e-05
S505I 6.5708203012e-06
V506I 1.5904016082e-06
V506A 6.5758325003e-06
L507M 1.5904016082e-06
L507F 9.5424703547e-06
S508G 1.2003332124e-06
S508R 3.1808032164e-06
L509F 6.840553537592262e-07
L509I 1.3681107075e-06
E511V 1.5904167846e-06
E512D 2.0521548309e-06
I513T 5.5767126394e-06
N514S 6.3616873739e-06
G517E 1.2003216862e-06
H518Y 1.5904572564e-06
H518R 6.5715975553e-06
E519K 1.5904420792e-06
V520A 1.5911861019e-06
L521V 2.4024601191e-06
L521H 5.0772452086e-06
R522W 1.92191804941e-05
R522Q 4.21565819443e-05
E523K 6.5695252861e-06
M524L 1.2006791041e-06
M524I 7.6852136489e-06
M524I 5.5077364484e-05
H526R 7.6849970925e-06
N527K 6.841573780899967e-07
H528Y 1.5908317186e-06
H528R 1.5908266571e-06
F530L 3.4206651962e-06
V531M 5.124814866e-06
G532S 1.3682342526e-06
G532D 3.4206605158e-06
Q537K 1.5907102521e-06
L540F 6.841189983950568e-07
A541T 1.5907861665e-06
A541S 1.5907811053e-06
A541P 1.5907811053e-06
A541V 4.7889312747e-06
Q542R 6.0016804705e-06
H543Y 6.5735847072e-06
H543R 5.63864816198e-05
Q544E 1.2003562657e-06
T545A 6.01066552319e-05
K546R 7.6845640162e-06
L547S 6.5724613867e-06
Y548H 4.9574772389e-06
L549F 3.1818456613e-06
L550V 6.843671383324984e-07
N551H 7.2107237884e-06
N551S 6.818770712e-06
T552I 3.1820177811e-06
T553I 6.84822685710216e-07
L555F 6.853972013861473e-07
S556T 3.1085827971e-06
E557K 6.254565833e-06
L559M 1.8709470609e-06
L559Q 1.8712761604e-06
F560L 6.879973360743147e-07
Y561C 3.1093560523e-06
Q562P 6.866179534115986e-07
Q562H 6.866141818786039e-07
I563V 3.0600878857e-06
L564F 4.8096214415e-06
I565F 6.1711972396e-06
Y566H 1.3710424858e-06
D567Y 6.873135661951696e-07
F568S 6.855381611672796e-07
A569S 2.0680185073e-06
A569V 6.859491547e-06
N570D 1.2054554089e-06
N570S 1.86281717567e-05
G572D 1.865878188e-06
V573I 2.4192397297e-06
L574V 6.882312456985547e-07
R575G 1.3777843299e-06
R575M 1.2153267284e-06
R575T 1.3147687979e-05
L576F 6.902950044731116e-07
S577L 1.93925259953e-05
E578K 5.1226490242e-06
E578G 5.45240729978e-05
P579R 4.8013213236e-06
A580T 1.3681575022e-06
P581A 3.1810764126e-06
P581L 6.31921711095e-05
R319H 2.54014985645e-05
R319P 6.840787511458319e-07
L582V 5.24935727381e-05
D584E 1.02609566495e-05
L585F 1.3681312968e-06
L585I 6.840656484121468e-07
L323S 1.5904673747e-06
A586V 1.2003274493e-06
P324L 2.4006433724e-06
M587L 3.1808942766e-06
M587T 6.5705612573e-06
M587I 6.840600331085056e-07
M587I 3.0978281746e-06
A589T 1.5904623155e-06
A589V 6.3618290258e-06
D591G 1.2003216862e-06
S592N 4.3369755667e-06
S592T 6.8152473191e-06
S595C 2.4782717524e-06
G596R 6.840778152196368e-07
G596S 1.3681556304e-06
T598A 0.0001608095887095
T598R 6.840628407488026e-07
T598I 6.840628407488026e-07
E599D 1.5904420792e-06
D601E 6.840553537592262e-07
G602S 1.5904319613e-06
G602V 1.2003389757e-06
P603S 2.0521913299e-06
P603R 0.0001053255053765
P603L 8.8927195988e-06
K604Q 6.840544178970525e-07
E605D 1.5904066669e-06
G606A 3.0977744349e-06
L607I 1.5904066669e-06
L607H 0.000190808719215
L607P 6.840525461803874e-07
A608P 1.5904016082e-06
A608T 6.5653846658e-06
A608V 6.1565318751e-06
E609K 1.2003360941e-06
I611V 1.5904269023e-06
I611L 1.5904269023e-06
I611T 2.7362326454e-06
V612F 2.4006548986e-06
V612I 1.31428496326e-05
E613Q 1.5904370202e-06
E613D 1.3681144509e-06
E613D 1.3681144509e-06
F614L 7.9521851014e-06
K616Q 1.2003360941e-06
K616N 1.5904471383e-06
K617Q 1.2003216862e-06
K617T 6.3617278452e-06
K618E 0.0047796290537015
K618R 3.22172533304e-05
K618M 1.43653393093e-05
K618T 0.0047746551121706
A619P 1.2003389757e-06
A619D 4.9566847709e-06
E620D 1.5904724339e-06
M621L 1.5904471383e-06
M621V 3.1808942766e-06
M621T 1.2003245677e-06
M621K 6.5716839282e-06
L622F 1.2003476206e-06
D624Y 6.840656484121468e-07
D624N 6.840656484121468e-07
Y625D 1.590492671e-06
Y625S 6.5779952901e-06
Y625C 2.05219975291e-05
F626L 4.89472014592e-05
F626L 4.7715690827e-06
E629D 6.840965342301383e-07
I630M 1.97350244056e-05
D631V 3.1812484491e-06
E633K 1.5906039841e-06
G634V 1.860061382e-06
N635H 2.4031587118e-06
N635S 2.0535680059e-06
I637T 1.2009069249e-06
G638R 1.3685207385e-06
G638V 1.5905635048e-06
L639F 1.2007223545e-06
L641P 1.2004830743e-06
I643T 2.5614622875e-06
D644H 1.3682230203e-06
N645S 1.5904977303e-06
N645K 3.1810156983e-06
Y646C 5.94878836813e-05
V647L 6.84094662282988e-07
V647M 5.01915955722e-05
P648S 2.4010353264e-06
P649S 6.8161763976e-06
P654T 6.842041884243599e-07
P654L 1.2006675711e-06
I655F 6.841901446514803e-07
I655V 0.0007269324442406
I655T 5.51584970115e-05
I657V 4.7717815923e-06
R659Q 2.79076690274e-05
R659P 6.5864871629e-06
R659L 6.846351442389322e-07
V664G 3.1812585695e-06
D667E 6.5759189846e-06
E668K 6.843015744410625e-07
E669Q 1.5905382063e-06
K670R 1.0262023671e-05
E671G 3.1810865319e-06
C672W 3.8416325401e-06
F673L 6.5703022339e-06
E674K 2.0522811789e-06
E674Q 6.840937263132548e-07
S675T 2.72614622057e-05
S675I 2.7363000298e-06
S677G 1.31387053119e-05
S677C 3.0977936274e-06
E679A 1.590492671e-06
C680G 3.4887586723e-05
C680Y 1.5905331467e-06
C680W 6.840834308152222e-07
A681T 2.0522137914e-06
M682T 3.1809752233e-06
Y684H 2.5609703004e-06
Y684S 3.7173707779e-06
Y684C 2.60215954459e-05
S685A 1.3681500149e-06
I686M 2.0522362534e-06
R687W 7.4352390677e-06
R687Q 8.6745208256e-06
Q689K 1.3681799649e-06
Q689R 0.0003060627688891
Q689P 6.840806230059049e-07
Y690N 6.840918543814715e-07
Y690D 4.1045511262e-06
I691V 1.92053685819e-05
I691K 4.7886560841e-06
I691T 6.5709930084e-06
S692P 9.5774432057e-06
E694A 6.81599900858e-05
S695W 7.4372021245e-06
S695L 1.3685844183e-06
T696I 1.5905938641e-06
L697I 6.844074195239536e-07
G699D 6.850347312608749e-07
Q700R 1.5906798883e-06
Q701K 9.50563255325e-05
S702T 6.5693526559e-06
E703K 7.4347599811e-06
V704L 6.840750074564176e-07
V704L 5.4726000596e-06
V704M 6.840750074564176e-07
V704E 1.5905432659e-06
V704G 3.1810865319e-06
P705A 3.1811067706e-06
L101P 1.5905432659e-06
A102V 2.4006548986e-06
S707C 1.590558445e-06
P103L 5.472644984e-06
Q105P 1.5905331467e-06
Q105R 6.5698705735e-06
N710H 1.2003274493e-06
N710K 1.5905382063e-06
T106I 6.5694389699e-06
S711P 1.5905280871e-06
S711C 1.5905280871e-06
W712C 6.5696979252e-06
K713T 3.4203797168e-06
W714R 1.2003274493e-06
W714C 6.840740715404664e-07
V716L 6.1566329511e-06
V716M 0.0016088584477148
V716A 6.5689211203e-06
E717D 6.5689211203e-06
H718Y 0.0040380586094655
I719V 3.4203656781e-06
I719T 4.7885119493e-06
Y721H 3.8412488668e-06
Y721C 1.11519326918e-05
Y721S 5.4725626232e-06
K722E 1.5905078491e-06
A723T 6.5690937278e-06
R725C 1.4250221498e-05
R725G 6.1957484773e-06
R725S 1.368178093e-06
R725H 9.47864949688e-05
S726L 1.5904977303e-06
H727Y 1.200353384e-06
H727L 9.542956026e-06
H727Q 3.4203843964e-06
P730R 6.840768792960028e-07
P730L 6.840768792960028e-07
H733R 2.73625885e-06
T735I 2.0522081759e-06
D737V 9.5426828298e-06
G738E 1.23901612207e-05
N739K 6.5694389699e-06
I740T 9.9126816653e-06
I740M 6.840675202005138e-07
L741M 1.02609566495e-05
L743F 2.0522474846e-06
L743R 1.31361164385e-05
A744V 1.5904774931e-06
L746P 1.5904825524e-06
P747S 6.3619504467e-06
P747R 6.5709066537e-06
D748H 1.05327724115e-05
Y750H 1.3681518867e-06
Y750C 3.6010687972e-06
K751Q 1.5905432659e-06
K751R 0.0001369301135962
V752L 2.5610949192e-06
V752A 1.200344739e-06
F753L 2.4010699167e-06
R755S 1.200457134e-06
C756G 1.2005320758e-06
C756Y 1.2006012611e-06
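Each line of the file above holds a mutation label (wild-type residue, position, mutant residue) followed by an allele frequency. The parser that this commit adds to prescott.py is not visible in this excerpt, so the following is only a minimal sketch of how such a file could be read and converted to the log10 scale that runPrescottModel compares against freqCutoff. The helper name parse_custom_frequencies and the rule for repeated mutants (keep the larger frequency; F3L and E34D, for example, appear twice in the file) are assumptions made for illustration.

```python
import math
import re

# Hypothetical helper, not part of prescott.py: it parses lines such as
# "S2A 2.5607342137e-06" into a {mutant: log10(frequency)} dictionary.
_MUTANT_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_custom_frequencies(lines):
    log10_freqs = {}
    for raw in lines:
        fields = raw.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        mutant, value = fields
        if not _MUTANT_RE.match(mutant):
            continue  # skip labels that are not of the form <AA><position><AA>
        freq = float(value)
        if freq <= 0.0:
            continue  # log10 is undefined for non-positive frequencies
        log_freq = math.log10(freq)
        # Assumption: if a mutant is listed more than once, keep the larger frequency.
        if mutant not in log10_freqs or log_freq > log10_freqs[mutant]:
            log10_freqs[mutant] = log_freq
    return log10_freqs


# A few lines copied from the file above.
sample = [
    "S2A 2.5607342137e-06",
    "F3L 1.5903965494e-06",
    "F3L 1.67277954004e-05",
    "I219V 0.284034317711987",
]
freqs = parse_custom_frequencies(sample)
```

With the default cutoff of -4.0, only I219V in this sample is frequent enough to have its score adjusted.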
examples/example-prescott-script.sh
@@ -5,3 +5,6 @@
#If you are using GnomAD v4.0.0 data.
prescott -e ../data/MLH1_normPred_evolCombi.txt -g ../data/gnomAD_v4.0.0_MLH1_HUMAN_ENSG00000076242.csv -s ../data/MLH1.fasta
#If you have a custom frequency file
#prescott -e ../data/MLH1_normPred_evolCombi.txt -g ../data/custom-frequency-file.txt -s ../data/MLH1.fasta
prescott/prescott.py
@@ -558,6 +558,134 @@ def rankSortData(dataArray):
    return (normalizedRankedDataArray)
def plotLabeledPositions(myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList, useFrequencies):
    clinvarLabeledDF = myBigMergedDF.loc[(myBigMergedDF['labels']==0) | (myBigMergedDF['labels']==1)]
    clinvarLabeledDF['labels'] = clinvarLabeledDF['labels'].astype('int64')
    if (len(clinvarLabeledDF)>0):
        print("\nMutations with ClinVar labels according to the gnomAD file:\n")
        print(clinvarLabeledDF)
        fprESCOTT, tprESCOTT, AUC_ESCOTT = plotROCandAUCV2(clinvarLabeledDF['labels'], \
                                                           clinvarLabeledDF['ESCOTT'])
        fprPRESCOTT, tprPRESCOTT, AUC_PRESCOTT = plotROCandAUCV2(clinvarLabeledDF['labels'], \
                                                                 clinvarLabeledDF['PRESCOTT'])
        fig = plt.figure(figsize=(12,6))
        # plt.rcParams.update({'font.size': 18})
        plt.grid(linestyle='--')
        # plt.title(protName + " - "+method+" AUC={:.2f}".format(AUC_ESCOTT))
        plt.title("AUC={:.2f} -> AUC={:.2f}".format(AUC_ESCOTT, AUC_PRESCOTT))
        plt.ylim([0.0, 1.0])
        #plt.xlim([1000, 1863])
        plt.scatter(myBigMergedDF.loc[myBigMergedDF['labels']==1, 'position'],
                    myBigMergedDF.loc[myBigMergedDF['labels']==1, 'ESCOTT'],
                    marker='o', color='red', label='pathogenic')
        plt.scatter(myBigMergedDF.loc[myBigMergedDF['labels']==0, 'position'],
                    myBigMergedDF.loc[myBigMergedDF['labels']==0, 'ESCOTT'],
                    marker='o', color='blue', label='benign')
        if (useFrequencies.lower()=='true'):
            #print(selectedPositionsList)
            #print(selectedValuesList)
            plt.scatter(selectedPositionsList, selectedValuesList, marker='o', color='olive', label='PRESCOTT')
            # Add vertical lines connecting old and new values
            for i in range(len(selectedPositionsList)):
                plt.annotate("", xy=(selectedPositionsList[i], selectedValuesList[i]), xycoords='data', \
                             xytext=(selectedPositionsList[i], myBigMergedDF.loc[myBigMergedDF['mutant']==selectedMutantsList[i], 'ESCOTT'].values[0]),
                             textcoords='data',
                             arrowprops=dict(arrowstyle="->", connectionstyle="arc3"))
        plt.xticks(rotation=90)
        plt.ylabel("PR/ESCOTT Score")
        plt.xlabel("Position")
        plt.legend(loc='upper right')
        plt.tight_layout()
        plt.savefig("clinvar-vs-position.png")
        plt.close()
        print("@> AUC= {:.3f} {:.3f}".format(AUC_ESCOTT, AUC_PRESCOTT))
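The plot title and the final print in plotLabeledPositions compare the AUC of the original ESCOTT scores with the AUC of the frequency-adjusted PRESCOTT scores on ClinVar-labeled variants (1 = pathogenic, 0 = benign). The body of plotROCandAUCV2 is not part of this diff, so the sketch below is only a stand-in that illustrates what an AUC measures: the probability that a randomly chosen pathogenic variant scores higher than a randomly chosen benign one. The function name roc_auc and the toy labels and scores are made up for illustration and are not taken from the repository.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (pathogenic, benign) pairs ordered correctly.

    Ties count as half a correctly ordered pair.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one pathogenic and one benign label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (illustrative only): two pathogenic and two benign variants.
labels = [1, 1, 0, 0]
escott_scores = [0.9, 0.4, 0.5, 0.1]
# Suppose the frequency correction lowers the two benign scores.
prescott_scores = [0.9, 0.4, 0.2, 0.0]

auc_before = roc_auc(labels, escott_scores)
auc_after = roc_auc(labels, prescott_scores)
```

In this toy case, lowering the scores of the benign variants is enough to move every pathogenic score above every benign score, which is the kind of change the "AUC -> AUC" title is meant to report.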
def runPrescottModel(myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList, \
                     version=2, scalingCoeff=1.0, freqCutoff=-4.0):
    # # print(myBigMergedDF)
    # scalingCoeff = args.coefficient
    # freqCutoff = args.frequencycutoff
    for index, row in myBigMergedDF.iterrows():
        if (row['log10frequency'] != 999.0):
            # print(row['log10frequency'])
            # freq = np.log10(row['log10frequency'])
            freq = row['log10frequency']
            # print(freq)
            temp1 = row['PRESCOTT']
            label = row['labels']
            if (version==1):
                if (freq>freqCutoff):
                    temp2 = temp1 - freq*scalingCoeff/freqCutoff
                    if (temp2<0.0):
                        myBigMergedDF.at[index, 'PRESCOTT'] = 0.0
                        selectedValuesList.append(0.0)
                    else:
                        myBigMergedDF.at[index, 'PRESCOTT'] = temp2
                        selectedValuesList.append(temp2)
                    selectedPositionsList.append(row['position'])
                    selectedMutantsList.append(row['mutant'])
            if (version==2):
                if (freq>freqCutoff):
                    temp2 = temp1 - scalingCoeff*(freqCutoff - freq)/freqCutoff
                    if (temp2<0.0):
                        temp2 = 0.0
                    myBigMergedDF.at[index, 'PRESCOTT'] = temp2
                    if (label==0 or label==1):
                        selectedValuesList.append(temp2)
                        selectedPositionsList.append(row['position'])
                        selectedMutantsList.append(row['mutant'])
            if (version==3):
                temp2 = temp1 - scalingCoeff*(freqCutoff - freq)/freqCutoff
                if (freq>freqCutoff):
                    if (temp2<0.0):
                        temp2 = 0.0
                    myBigMergedDF.at[index, 'PRESCOTT'] = temp2
                    if (label==0 or label==1):
                        selectedValuesList.append(temp2)
                        selectedPositionsList.append(row['position'])
                        selectedMutantsList.append(row['mutant'])
                else:
                    if (temp2>1.0):
                        temp2 = 1.0
                    myBigMergedDF.at[index, 'PRESCOTT'] = temp2
                    if (label==0 or label==1):
                        selectedValuesList.append(temp2)
                        selectedPositionsList.append(row['position'])
                        selectedMutantsList.append(row['mutant'])
            if (version==4):
                temp2 = temp1 - scalingCoeff*(freqCutoff - freq)/freqCutoff
                if (freq>freqCutoff):
                    if (temp2<0.0):
                        temp2 = 0.0
                    myBigMergedDF.at[index, 'PRESCOTT'] = temp2
                    if (label==0 or label==1):
                        selectedValuesList.append(temp2)
                        selectedPositionsList.append(row['position'])
                        selectedMutantsList.append(row['mutant'])
                else:
                    print(myBigMergedDF.loc[index, 'Selected Population'])
                    sys.exit(-1)
                    if (myBigMergedDF.iloc[index, 'Selected Population'].values == None):
                        if (temp2>1.0):
                            temp2 = 1.0
                        myBigMergedDF.at[index, 'PRESCOTT'] = temp2
                        if (label==0 or label==1):
                            selectedValuesList.append(temp2)
                            selectedPositionsList.append(row['position'])
                            selectedMutantsList.append(row['mutant'])
            if (version==5):
                if (freq>freqCutoff):
                    temp2 = temp1*0.0
                    myBigMergedDF.at[index, 'PRESCOTT'] = temp2
                    if (label==0 or label==1):
                        selectedValuesList.append(temp2)
                        selectedPositionsList.append(row['position'])
                        selectedMutantsList.append(row['mutant'])
    return myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList
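The default branch of runPrescottModel (version 2) lowers a score only when the variant's log10 frequency exceeds freqCutoff, subtracting scalingCoeff*(freqCutoff - freq)/freqCutoff and clamping the result at 0.0. As a worked illustration, the same arithmetic can be isolated in a small standalone function. The name adjust_score_v2 is hypothetical and does not exist in prescott.py, and the input values are chosen only to make the numbers easy to follow.

```python
def adjust_score_v2(score, log10_freq, scaling_coeff=1.0, freq_cutoff=-4.0):
    """Standalone restatement of the version==2 branch for a single variant."""
    if log10_freq <= freq_cutoff:
        # Rare variants (at or below the cutoff) keep their score unchanged.
        return score
    adjusted = score - scaling_coeff * (freq_cutoff - log10_freq) / freq_cutoff
    # Scores are clamped so that they never become negative.
    return max(adjusted, 0.0)


# A variant with frequency 1e-2 (log10 = -2) and a score of 0.8:
# (freq_cutoff - log10_freq) / freq_cutoff = (-4 - (-2)) / -4 = 0.5
common = adjust_score_v2(0.8, -2.0)

# A variant with frequency 1e-5 (log10 = -5) is below the cutoff.
rare = adjust_score_v2(0.8, -5.0)

# A variant with frequency 1.0 (log10 = 0) gets the full penalty of 1.0,
# which would give -0.2 and is therefore clamped to 0.0.
fixed = adjust_score_v2(0.8, 0.0)
```

With the defaults, the penalty grows linearly from 0 at a frequency of 1e-4 to 1 at a frequency of 1, so the more common a variant is in the population, the closer its score is pushed toward 0.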
def main():
    # Adding the main parser
    main_parser = argparse.ArgumentParser(description=\
@@ -634,13 +762,13 @@ def main():
    args = main_parser.parse_args()
    print("\n\n@> Running PRESCOTT with the following parameters:\n\n")
    print("@> ESCOTT file : {}".format(args.escottfile))
-    print("@> GNOMAD frequency file : {}".format(args.gnomadfile))
+    print("@> Frequency file : {}".format(args.gnomadfile))
    print("@> Use population max. freq : {}".format(str(args.usepopmax).lower()))
    print("@> Which equation to use (Default=2): {}".format(str(args.equation)))
    print("@> Scaling coefficient (Default=1.0): {}".format(args.coefficient))
    print("@> Frequency cutoff (Default=-4.0) : {}".format(args.frequencycutoff))
    print("@> Name of the output file : {}".format(args.outputfile))
    print("@> GnomAD data version (Default=4) : {}".format(str(args.gnomadversion)))
    # End of argument parsing!
    protein = os.path.splitext(os.path.basename(args.escottfile))[0]
@@ -649,6 +777,9 @@ def main():
     # Check if file exists
     usePopMaxOrNot = args.usepopmax.lower()
     version = args.equation
+    useFrequencies = args.usefrequencies
     if (os.path.exists(args.escottfile)):
         #Convert the matrix format to singleline format
         localResidueList = None
@@ -703,9 +834,23 @@ def main():
         print("ERROR: ESCOTT input file does not exist!")
         sys.exit(-1)
+    #Create a dataframe to merge ESCOTT data with frequency data
     myBigMergedDF = pd.DataFrame()
     myBigMergedDF = pd.concat([myBigMergedDF, dfESCOTT], ignore_index=True)
+    # Add frequency column and a dummy frequency to each row in myBigMergedDF
+    myBigMergedDF['log10frequency'] = 999.0
+    myBigMergedDF['labels'] = np.nan
+    myBigMergedDF['position'] = ""
+    # Assign ESCOTT scores to PRESCOTT scores.
+    # Then, we will modify them according to different conditions.
+    myBigMergedDF['PRESCOTT'] = myBigMergedDF['ESCOTT']
+    file_name, file_extension = os.path.splitext(args.gnomadfile)
+    if (file_extension == ".csv"):
+        print("@> Your frequency data is in gnomAD format!")
+        print("@> GnomAD data version (Default=4) : {}".format(str(args.gnomadversion)))
         if (args.gnomadversion==2 or args.gnomadversion==3):
             gnomadDF = getGnomADOverallFrequency(args.gnomadfile, usePopMax=usePopMaxOrNot)
         elif (args.gnomadversion==4):
@@ -729,19 +874,15 @@ def main():
         if (len(gnomadDF.loc[(gnomadDF['labels']==0) | (gnomadDF['labels']==1)]) > 0):
             print(gnomadDF.loc[(gnomadDF['labels']==0) | (gnomadDF['labels']==1)])
         # print(gnomadDF['ClinVar Clinical Significance'])
-    # Add frequency column and a dummy frequency to each row in myBigMergedDF
-    myBigMergedDF['frequency'] = 999.0
-    myBigMergedDF['labels'] = np.nan
-    myBigMergedDF['position'] = ""
-    useFrequencies = args.usefrequencies
         selectedPositionsList = []
         selectedValuesList = []
         selectedMutantsList = []
-    # Assign ESCOTT scores to PRESCOTT scores.
-    # Then, we will modify them according to different conditions.
-    myBigMergedDF['PRESCOTT'] = myBigMergedDF['ESCOTT']
+        # # Assign ESCOTT scores to PRESCOTT scores.
+        # # Then, we will modify them according to different conditions.
+        # myBigMergedDF['PRESCOTT'] = myBigMergedDF['ESCOTT']
         labelsList = []
         if (useFrequencies.lower()=='true'):
@@ -753,147 +894,81 @@ def main():
                 temp = (gnomadDF.loc[gnomadDF['mutant'] == row['mutant'], 'Allele Frequency Log'].values)
                 #print(temp)
                 if (len(temp)>0):
-                myBigMergedDF.at[index,'frequency'] = temp[0]
+                    myBigMergedDF.at[index,'log10frequency'] = temp[0]
                     myBigMergedDF.at[index,'labels'] = gnomadDF.loc[gnomadDF['mutant'] == row['mutant'], 'labels'].values[0]
             # print(myBigMergedDF)
-        scalingCoeff = args.coefficient
-        freqCutoff = args.frequencycutoff
-        for index, row in myBigMergedDF.iterrows():
-            if (row['frequency'] != 999.0):
-                # print(row['frequency'])
-                # freq = np.log10(row['frequency'])
-                freq = row['frequency']
-                # print(freq)
-                temp1 = row['PRESCOTT']
-                label = row['labels']
-                if (version==1):
-                    if (freq>freqCutoff):
-                        temp2 = temp1 - freq*scalingCoeff/freqCutoff
-                        if (temp2<0.0):
-                            myBigMergedDF.at[index,'PRESCOTT'] = 0.0
-                            selectedValuesList.append(0.0)
-                        else:
-                            myBigMergedDF.at[index,'PRESCOTT'] = temp2
-                            selectedValuesList.append(temp2)
-                        selectedPositionsList.append(row['position'])
-                        selectedMutantsList.append(row['mutant'])
-                if (version==2):
-                    if (freq>freqCutoff):
-                        temp2 = temp1 - scalingCoeff*(freqCutoff - freq)/freqCutoff
-                        if (temp2<0.0):
-                            temp2 = 0.0
-                        myBigMergedDF.at[index,'PRESCOTT'] = temp2
-                        if (label==0 or label==1):
-                            selectedValuesList.append(temp2)
-                            selectedPositionsList.append(row['position'])
-                            selectedMutantsList.append(row['mutant'])
-                if (version==3):
-                    temp2 = temp1 - scalingCoeff*(freqCutoff - freq)/freqCutoff
-                    if (freq>freqCutoff):
-                        if (temp2<0.0):
-                            temp2 = 0.0
-                        myBigMergedDF.at[index,'PRESCOTT'] = temp2
-                        if (label==0 or label==1):
-                            selectedValuesList.append(temp2)
-                            selectedPositionsList.append(row['position'])
-                            selectedMutantsList.append(row['mutant'])
-                    else:
-                        if (temp2>1.0):
-                            temp2 = 1.0
-                        myBigMergedDF.at[index,'PRESCOTT'] = temp2
-                        if (label==0 or label==1):
-                            selectedValuesList.append(temp2)
-                            selectedPositionsList.append(row['position'])
-                            selectedMutantsList.append(row['mutant'])
-                if (version==4):
-                    temp2 = temp1 - scalingCoeff*(freqCutoff - freq)/freqCutoff
-                    if (freq>freqCutoff):
-                        if (temp2<0.0):
-                            temp2 = 0.0
-                        myBigMergedDF.at[index,'PRESCOTT'] = temp2
-                        if (label==0 or label==1):
-                            selectedValuesList.append(temp2)
-                            selectedPositionsList.append(row['position'])
-                            selectedMutantsList.append(row['mutant'])
-                    else:
-                        print(myBigMergedDF.loc[index, 'Selected Population'])
-                        sys.exit(-1)
-                        if (myBigMergedDF.iloc[index, 'Selected Population'].values == None):
-                            if (temp2>1.0):
-                                temp2 = 1.0
-                            myBigMergedDF.at[index,'PRESCOTT'] = temp2
-                            if (label==0 or label==1):
-                                selectedValuesList.append(temp2)
-                                selectedPositionsList.append(row['position'])
-                                selectedMutantsList.append(row['mutant'])
-                if (version==5):
-                    if (freq>freqCutoff):
-                        temp2 = temp1*0.0
-                        myBigMergedDF.at[index,'PRESCOTT'] = temp2
-                        if (label==0 or label==1):
-                            selectedValuesList.append(temp2)
-                            selectedPositionsList.append(row['position'])
-                            selectedMutantsList.append(row['mutant'])
+            # scalingCoeff = args.coefficient
+            # freqCutoff = args.frequencycutoff
+            myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList = \
+                runPrescottModel(myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList, \
+                                 version=version, scalingCoeff=args.coefficient, freqCutoff=args.frequencycutoff)
         # myBigMergedDF.dropna(subset = ['labels'], inplace=True)
-    clinvarLabeledDF = myBigMergedDF.loc[(myBigMergedDF['labels']==0) | (myBigMergedDF['labels']==1)]
-    clinvarLabeledDF['labels'] = clinvarLabeledDF['labels'].astype('int64')
-    if (len(clinvarLabeledDF)>0):
-        print("\nMutations with ClinVar labels according to the gnomAD file:\n")
-        print(clinvarLabeledDF)
-    #print(myBigMergedDF.loc[(myBigMergedDF['labels']=='0') | (myBigMergedDF['labels']=='1'), 'labels'])
-    # print(clinvarLabeledDF['labels'].values)
-    # print(clinvarLabeledDF['ESCOTT'].values)
         numPathogenic = len(myBigMergedDF.loc[(myBigMergedDF['labels']==1)])
         numBenign = len(myBigMergedDF.loc[(myBigMergedDF['labels']==0)])
         if ((numPathogenic>=1) and (numBenign>=1)):
-        fprESCOTT, tprESCOTT, AUC_ESCOTT = plotROCandAUCV2(clinvarLabeledDF['labels'], \
-                                                           clinvarLabeledDF['ESCOTT'])
-        fprPRESCOTT, tprPRESCOTT, AUC_PRESCOTT = plotROCandAUCV2(clinvarLabeledDF['labels'], \
-                                                                 clinvarLabeledDF['PRESCOTT'])
-        # fprPRESCOTT, tprPRESCOTT, AUC_PRESCOTT = plotROCandAUCV2(myBigMergedDF.loc[(myBigMergedDF['labels']==0) | (myBigMergedDF['labels']==1), 'labels'], \
-        # myBigMergedDF.loc[(myBigMergedDF['labels']==0) | (myBigMergedDF['labels']==1), 'PRESCOTT'])
-        fig = plt.figure(figsize=(12,6))
-        # plt.rcParams.update({'font.size': 18})
-        plt.grid(linestyle='--')
-        # plt.title(protName + " - "+method+" AUC={:.2f}".format(AUC_ESCOTT))
-        plt.title("AUC={:.2f} -> AUC={:.2f}".format(AUC_ESCOTT, AUC_PRESCOTT))
-        plt.ylim([0.0, 1.0])
-        # plt.xlim([1000, 1863])
-        plt.scatter(myBigMergedDF.loc[myBigMergedDF['labels']==1, 'position'], myBigMergedDF.loc[myBigMergedDF['labels']==1, 'ESCOTT'], marker='o', color='red', label='pathogenic')
-        plt.scatter(myBigMergedDF.loc[myBigMergedDF['labels']==0, 'position'], myBigMergedDF.loc[myBigMergedDF['labels']==0, 'ESCOTT'], marker='o', color='blue', label='benign')
+            plotLabeledPositions(myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList, useFrequencies)
+    else:
+        print("@> You're using a custom frequency file!")
+        gnomadDF = pd.read_csv(args.gnomadfile, header=None, sep='\s+')
+        gnomadDF.columns = ['mutant', 'frequency']
+        print(gnomadDF)
+        # Assign labels to pathogenic/benign mutations for performance evaluation
+        # gnomadDF['labels'] = ""
+        # for index, row in gnomadDF.iterrows():
+        # if ((row['ClinVar Clinical Significance']=='Benign/Likely benign') or \
+        # (row['ClinVar Clinical Significance']=='Benign') or \
+        # (row['ClinVar Clinical Significance']=='Likely benign')):
+        # gnomadDF.at[index,'labels'] = 0
+        # if((row['ClinVar Clinical Significance']=='Pathogenic/Likely pathogenic') or \
+        # (row['ClinVar Clinical Significance']=='Pathogenic') or \
+        # (row['ClinVar Clinical Significance']=='Likely pathogenic')):
+        # gnomadDF.at[index,'labels'] = 1
+        # if (len(gnomadDF.loc[(gnomadDF['labels']==0) | (gnomadDF['labels']==1)]) > 0):
+        # print(gnomadDF.loc[(gnomadDF['labels']==0) | (gnomadDF['labels']==1)])
+        # # print(gnomadDF['ClinVar Clinical Significance'])
+        selectedPositionsList = []
+        selectedValuesList = []
+        selectedMutantsList = []
+        # # Assign ESCOTT scores to PRESCOTT scores.
+        # # Then, we will modify them according to different conditions.
+        # myBigMergedDF['PRESCOTT'] = myBigMergedDF['ESCOTT']
+        labelsList = []
         if (useFrequencies.lower()=='true'):
-            #print(selectedPositionsList)
-            #print(selectedValuesList)
-            plt.scatter(selectedPositionsList, selectedValuesList, marker='o', color='olive', label='PRESCOTT')
-            # Add vertical lines connecting old and new values
-            for i in range(len(selectedPositionsList)):
-                plt.annotate("", xy=(selectedPositionsList[i], selectedValuesList[i]), xycoords='data', \
-                             xytext=(selectedPositionsList[i], myBigMergedDF.loc[myBigMergedDF['mutant']==selectedMutantsList[i], 'ESCOTT'].values[0]), textcoords='data',
-                             arrowprops=dict(arrowstyle="->", connectionstyle="arc3"))
-        plt.xticks(rotation=90)
-        plt.ylabel("PR/ESCOTT Score")
-        plt.xlabel("Position")
-        plt.legend(loc='upper right')
-        plt.tight_layout()
-        plt.savefig("clinvar-vs-position.png")
-        plt.close()
-        print("@> AUC= {:.3f} {:.3f}".format(AUC_ESCOTT, AUC_PRESCOTT))
-    # Renaming the column just to make clear that the frequency column in the csv
-    # is actually log10 frequencies. Normally, one can deduce it from the values as well
-    # but it is always better to be clear.
-    myBigMergedDF = myBigMergedDF.rename(columns={'frequency':'log10frequency'})
+            # print(myBigMergedDF)
+            for index, row in myBigMergedDF.iterrows():
+                myBigMergedDF.at[index, 'position'] = row['mutant'][1:-1]
+                # print(row['mutant'], row['ESCOTT'])
+                # print(row['mutant'][1:-1])
+                temp = (gnomadDF.loc[gnomadDF['mutant'] == row['mutant'], 'frequency'].values)
+                #print(temp)
+                if (len(temp)>0):
+                    myBigMergedDF.at[index,'log10frequency'] = np.log10(temp[0])
+                    # myBigMergedDF.at[index,'labels'] = gnomadDF.loc[gnomadDF['mutant'] == row['mutant'], 'labels'].values[0]
+            # print(myBigMergedDF)
+            # scalingCoeff = args.coefficient
+            # freqCutoff = args.frequencycutoff
+            myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList = \
+                runPrescottModel(myBigMergedDF, selectedPositionsList, selectedValuesList, selectedMutantsList, \
+                                 version=version, scalingCoeff=args.coefficient, freqCutoff=args.frequencycutoff)
+        # myBigMergedDF.dropna(subset = ['labels'], inplace=True)
+        # sys.exit(-1)
+    #Write the results to csv files.
     myBigMergedDF['mutant'] = myBigMergedDF['mutant'].str.upper()
     # myBigMergedDF = myBigMergedDF['mutant'].apply(lambda x: x.upper())
     myBigMergedDF.to_csv(outfile+'-details.csv', index=None)
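The `else` branch added in this hunk reads a two-column, whitespace-separated file (`mutant frequency`, as in `data/custom-frequency-file.txt`) and stores `np.log10` of each frequency. A minimal, self-contained sketch of that step follows. The helper name `read_custom_frequencies` is hypothetical and not part of PRESCOTT. Two further assumptions: the separator is written as the raw string `r"\s+"` to avoid Python's invalid-escape warning, and a mutant listed more than once keeps its first value, mirroring the `temp[0]` lookup in the diff.

```python
import io

import numpy as np
import pandas as pd


def read_custom_frequencies(source):
    """Hypothetical helper: load 'mutant frequency' rows and add derived columns."""
    df = pd.read_csv(source, header=None, sep=r"\s+", names=["mutant", "frequency"])
    # The diff takes temp[0] for a mutant, so the first matching row wins.
    df = df.drop_duplicates(subset="mutant", keep="first").reset_index(drop=True)
    # The position is everything between the wild-type and mutant letters.
    df["position"] = df["mutant"].str[1:-1]
    # The merged table stores log10 of the raw frequency.
    df["log10frequency"] = np.log10(df["frequency"])
    return df


# Three rows in the same layout as the custom frequency file.
sample = io.StringIO(
    "S2A 2.5607342137e-06\n"
    "F3L 1.5903965494e-06\n"
    "F3L 1.67277954004e-05\n"
)
frequencies = read_custom_frequencies(sample)
print(frequencies)
```

A frequency of exactly 0 would yield `-inf` in this sketch; how PRESCOTT treats such rows is not shown in this diff.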
@@ -925,7 +1000,6 @@ def main():
             # print(myBigMergedDF.loc[myBigMergedDF['mutant']==variant, 'PRESCOTT'].values)
             my_file.write("{:.2f},".format(float(myBigMergedDF.loc[myBigMergedDF['mutant']==variant, 'PRESCOTT'].values[0])))
     if (os.path.exists(protein+'_singleline.txt')):
         os.remove(protein+'_singleline.txt')
     if (os.path.exists(protein+'_singleline_1-ranksort.txt')):
...
...
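For reference, the `version == 2` branch that this commit removes from `main()` (presumably moved into `runPrescottModel`) lowers a score only when the log10 frequency exceeds the cutoff, then clamps the result at zero. The sketch below restates that arithmetic as a standalone function. The name `adjust_score_v2` is hypothetical, and the defaults are the ones printed in the parameter banner (scaling coefficient 1.0, frequency cutoff -4.0).

```python
def adjust_score_v2(escott, log10_freq, scaling_coeff=1.0, freq_cutoff=-4.0):
    """Hypothetical restatement of the removed version == 2 branch."""
    # Variants at or below the cutoff keep their ESCOTT score unchanged.
    if log10_freq <= freq_cutoff:
        return escott
    # More frequent variants are pushed toward 0 (toward benign).
    adjusted = escott - scaling_coeff * (freq_cutoff - log10_freq) / freq_cutoff
    # Scores are not allowed to go below zero.
    return max(adjusted, 0.0)


rare = adjust_score_v2(0.8, -6.0)          # below the cutoff: unchanged
common = adjust_score_v2(0.8, -2.0)        # penalty of (-4 - -2) / -4 = 0.5
very_common = adjust_score_v2(0.2, -1.0)   # penalty 0.75 exceeds the score
print(rare, common, very_common)
```

With the default cutoff, a variant at log10 frequency -2 loses 0.5 from its score, and any result that would fall below zero is reported as 0.0.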