Protein

Protein accession
C1KFN1 [UniProt]
Representative
4Sv8d
Source
UniProt (cluster: phalp2_11026)
Protein name
Putative tail lysin
Lysin probability
100%
PhaLP type
VAL
Probability: 99% (predicted by ML model)
Protein sequence
MGKEDIMADQARFGRLRLPVITLYIHTEHSVFKVSNAGQNNATQANADQNAFNSDIISFKTTNNMSDDSATFSVVLPLRRNNGVRWDNVISENDIIVIRIDSNEDLLNKGVTATVNNNIMTGIVSEVAIDGEYSSNSEMVQVTGQSFSKVFSQFRIGMISEVEQQLSGMGWLWDSSISPDAYTGSGDSDDSGGSGASVGDFDAGSGSTSEQLKALCKQIGSKTGIKWEFILAQVGVEVGSLDGNSYAAKNDNNFSGIKYANQEGATSGSNATDGAGGAYAHFKSKAYWALAMSNTLAKDDKSTGNALSSAKTVYDFAKGLKAAHYFEADVGQYAAGLDTWYKKITGQSSSTSGSTSGSSSTSSGSSDGGDGTTTDAAIAAEKAASPNGGVAFYDNNVAGIESALIERFKPYIILNYDNNGYTIWSFLDYSNMTSWDTYEKLKDSSNFVNFSGTLYELMQAAQRQPFNEMFFDSTSDGISKLTVRRTPFNPEDWYNLQQVTIGNDAIITKQVSRSAREQYSVFVDNPASGLLSIGVDSLAFGSFPKTNLDLIKVYGYAKMEVSDLYVTGADDKDYSINGGDIKKAKSTNNEKGTMYDYEKVVEFLSTTTSKTNLTQKPLTYSQELANKSNNISMFQASRLVNAYISNAYNLTEVVYNDIMNTDQGGGQANTGTHKLSYTEVSKFIKSSSNLSDFLTKSKPYFKNVSDEELTAIYNASESGKIDKKAYDTAVKNYDKTDSGEKSTSSLLDTDFFQTTLYNWYANNVNFLSGTITILGDPDIRIGTILNDAYDHIRYYIESVSHTFSFTEGFQTEIGVTRGLTYQDGQYDPRFTATYMWGTGIDYQGGYMGEAPVNYLAIDGSGGDGNDSSDGSGAFSGDAGPATAVKAAKYGATFEKSKCKRSEWYVWGGGHGGGNILESSEDPIKLDCSGFISACFNHVGLNINGTTWTFNDSSLFTHVPIPSTSTDGMKIGDCVLLYGCNHIMFYVGGGKLMGWNGNPPTDTSGGCKIVTLSDMQGHHDGYVLRLKG
Physico-chemical properties
protein length: 1027 AA
molecular weight: 111067,9 Da
isoelectric point: 4,79
hydropathy: -0,41
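
The record does not say how the hydropathy value is derived. A minimal sketch, assuming it is the GRAVY score (mean Kyte-Doolittle hydropathy over all residues), which is a common convention but not confirmed by this page. The helper names `parse_decimal_comma` and `gravy` are illustrative only.

```python
# Kyte-Doolittle hydropathy index per residue (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def parse_decimal_comma(text: str) -> float:
    """Parse a number written with a decimal comma, e.g. '-0,41'."""
    return float(text.replace(",", "."))


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy of a protein sequence.

    Assumption: the sequence contains only the 20 standard residues.
    """
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence.upper()) / len(sequence)
```

Passing the protein sequence listed above to `gravy` and comparing the result with `parse_decimal_comma("-0,41")` would show whether the assumption holds for this record.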
Representative Protein Details
Accession
4Sv8d
Protein name
4Sv8d
Sequence length
1032 AA
Molecular weight
112820,81610 Da
Isoelectric point
5,91270
Sequence
MVKRAARITHPKISISIYSEHTAYHITNDSDPSVTNNKPATVDKSTLENSIVSCRTQNTLEDDTATFTVVLSGLIRWDNVINSNDIFIIRMNPNEDNAKTKVKNDNVMTGLVSDVSVIKDFGNDSIMYQITGQSMAKVFTQYKIGLPSQVESQLSDMGWLWDTNAELTEEVKEGSGGDSNLTLASGSNPKKVWVALRGAGYSEAAAAGAMGNIQVESGFVPNKWNGGHIGGSSKPQGSDGGYGLIQWTGPRETGVMNYLKKHDAVDKSSELKYELMYLLNVDPGKSVPSSYRKKTSVSDAARWWLLRVEGINDGSGPSRTSYATAFYHKFRGTKVTGSSLPSGSSSDDSDSSGNISGTVNSSQSAIDREKANSIGVAFFGNNVAQIQTNLINRFRPYIKYTYENGAKGIWDFIDVTNFHSWEDYEYLFDSSGFTNFNGSLYDLQQAALRAPFNEMFYESLPNGKSKLVVRRTPFNPDDWQKLDINTVDQTAVIEHEVSKNDLQEYSVFTVNPATPTMMGISDGVLLSAYPQTNRDLIDKYGYSKYEVEDLYLSGKGDNNEQKKAKGASSSKTAETSKDNSLGTEFKLADVNSFLGRINHTALRLEKDKYAKKLADASNNISATQAYTLVNDYIANAYKLTADDMDNDLDMDNGGGLPNTGTTPVSYKSLNSCLSKSNGDEATFLAEAKSTLKNVSDEFLRDVWQSYASGNNKLGKEEYKKLIKKDRNQGDSTGDATATDLKYFTKVLYNWYADNFNYYSGNVEVSGNPDIRLGNILDVIDGADLDANGYPGRRYYIESVVNTFTFTDGYITQVGVTRGMRRPVNGRADPRFHTLWGTSIDFLGGYMGEATIANLALAKKISSGGASSGDVLSGKKGNAVAVKAATIAYGFRESSYKEKREVYALGGHGERGSKNPLTHDINGGTIVLDCSSFVHWCFKMAGANIPSNTTGIANDTSQFRQVHISSNSTKGMRIGDVVEFYGQGHVMFYIGGGKFCGWNGGATNPSWDPSGGCQVRTLSEMGGSHDSIVLRYK
Other Proteins in cluster: phalp2_11026
Total (incl. this protein): 15 Avg length: 1021,6 Avg pI: 5,65

Protein ID Length (AA) pI
4Sv8d 1032 5,91270
1Z16z 1014 6,36997
1kWUU 975 4,87237
28x2C 1029 4,78422
3xPcG 1032 5,84750
5HFNe 1021 4,78393
5tXSj 987 4,47990
8KVXc 1027 4,78945
8Ls9H 1045 7,11297
8LscN 1011 6,59943
A0A4Y5FFS2 1045 7,11297
A0A4Y5FG95 1011 6,59943
A0A7G9V4Q9 1031 4,76916
A0AAE8ZGX7 1037 5,98807
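
The summary line can be cross-checked against the table. This sketch assumes the 14 rows are the other cluster members and that this protein's own values (1027 AA, pI 4,79, taken from the physico-chemical properties) complete the total of 15. Decimal commas are written as dots in the code.

```python
# (length in AA, pI) for the 14 listed cluster members, copied from the table.
OTHER_MEMBERS = [
    (1032, 5.91270), (1014, 6.36997), (975, 4.87237), (1029, 4.78422),
    (1032, 5.84750), (1021, 4.78393), (987, 4.47990), (1027, 4.78945),
    (1045, 7.11297), (1011, 6.59943), (1045, 7.11297), (1011, 6.59943),
    (1031, 4.76916), (1037, 5.98807),
]

# Assumption: this protein (C1KFN1) is the 15th member counted in the total.
THIS_PROTEIN = (1027, 4.79)


def cluster_averages(members):
    """Return (mean length, mean pI) over a list of (length, pI) pairs."""
    lengths = [length for length, _ in members]
    pis = [pi for _, pi in members]
    return sum(lengths) / len(lengths), sum(pis) / len(pis)


all_members = OTHER_MEMBERS + [THIS_PROTEIN]
avg_length, avg_pi = cluster_averages(all_members)
print(f"{len(all_members)} members, avg length {avg_length:.1f}, avg pI {avg_pi:.2f}")
# prints "15 members, avg length 1021.6, avg pI 5.65"
```

Under that assumption the recomputed values agree with the reported averages (1021,6 and 5,65).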
Similar Clusters (pHMM search)
# Cluster (representative) # Members Identity (%) Alignment Length E-value
1 phalp2_37189 (1QZtJ) 98 26,9% 1058 4.201E-133
2 phalp2_5245 (7YnjZ) 6 23,8% 984 6.169E-71
3 phalp2_28907 (5tXTi) 21 25,0% 716 6.408E-61
4 phalp2_25144 (1Emml) 5 18,3% 1177 2.601E-46

Domains

Domains [InterPro]

No domain annotations available.

Taxonomy

  Name Taxonomy ID Lineage
Phage Mooreparkvirus Lb3381 [NCBI] 632112 Herelleviridae > Mooreparkvirus >
Host Lactobacillus paracasei [NCBI] 1597 Firmicutes > Bacilli > Lactobacillales > Lactobacillaceae > Lactobacillus > Lactobacillus casei group

Coding sequence (CDS)

CDS Source ID
CDS Source
FJ822135 [NCBI]
CDS location
range 76501 -> 79584
strand +
CDS
ATGGGTAAGGAAGATATTATGGCAGATCAAGCAAGATTTGGTAGGCTAAGGCTACCAGTTATTACACTATACATACATACGGAACACTCTGTTTTTAAGGTGTCGAATGCTGGGCAGAATAACGCGACACAGGCTAATGCAGATCAAAACGCTTTTAACTCTGACATTATCTCTTTTAAAACAACAAATAACATGTCTGATGATTCTGCAACCTTTTCCGTTGTCCTTCCTTTAAGAAGAAACAACGGTGTTCGATGGGACAACGTTATTAGCGAGAATGACATTATTGTAATCAGGATTGATAGCAATGAAGATTTGTTAAATAAAGGTGTAACAGCTACTGTTAACAACAACATTATGACCGGAATTGTATCCGAAGTAGCTATTGATGGTGAATATTCTTCAAATTCAGAAATGGTGCAAGTAACCGGGCAAAGCTTTTCTAAGGTGTTTTCTCAGTTTCGTATTGGTATGATTTCAGAAGTTGAGCAACAATTATCCGGGATGGGATGGCTATGGGATAGTTCCATTTCTCCAGATGCATATACCGGTAGTGGCGACTCTGACGATTCTGGCGGTTCTGGAGCAAGTGTAGGGGACTTTGATGCAGGGTCTGGATCAACATCTGAACAATTAAAGGCATTGTGCAAGCAGATAGGTAGTAAGACTGGCATTAAATGGGAGTTTATTCTTGCTCAAGTTGGTGTTGAAGTAGGTAGTCTTGACGGGAACTCCTATGCTGCTAAAAATGATAATAACTTTTCTGGGATAAAGTATGCTAATCAAGAAGGGGCAACATCAGGGTCTAATGCAACTGATGGTGCAGGTGGTGCTTATGCACATTTTAAGAGTAAGGCATACTGGGCACTAGCTATGAGTAACACTTTAGCTAAGGATGACAAGAGTACAGGCAATGCTTTATCCAGTGCTAAAACTGTTTATGATTTTGCTAAAGGGCTTAAAGCAGCCCATTATTTTGAAGCAGATGTAGGCCAATATGCAGCTGGCCTTGACACTTGGTACAAGAAGATAACAGGGCAGAGTAGCTCTACCTCAGGGTCAACGTCTGGAAGTTCGAGTACATCATCTGGGTCAAGTGATGGTGGTGATGGCACTACAACAGATGCTGCTATTGCAGCTGAAAAAGCTGCTAGTCCAAATGGCGGCGTTGCTTTTTATGACAACAATGTGGCTGGTATTGAGAGTGCTCTTATTGAGCGATTTAAGCCATATATTATTTTAAACTACGATAATAATGGCTACACTATATGGAGTTTTCTTGACTATAGTAACATGACTTCTTGGGATACGTATGAGAAGCTAAAAGACAGTTCAAATTTCGTTAACTTTTCTGGTACTTTGTATGAGTTGATGCAAGCAGCGCAACGACAGCCATTTAACGAGATGTTCTTTGATTCAACCTCAGATGGCATTTCAAAGCTTACAGTAAGGCGCACACCATTTAACCCAGAGGATTGGTATAACTTACAGCAAGTAACAATAGGTAACGATGCTATTATTACTAAGCAAGTTAGTCGATCTGCTAGAGAGCAATATTCTGTTTTCGTTGATAATCCTGCTAGTGGGCTATTATCTATAGGTGTTGACTCATTAGCATTTGGTAGCTTTCCAAAGACAAACTTAGATCTAATAAAGGTTTATGGATATGCTAAAATGGAGGTTTCTGATTTATATGTAACAGGTGCTGATGATAAAGACTATAGTATTAATGGCGGTGATATTAAAAAAGCCAAAAGCACCAATAATGAAAAAGGAACTATGTACGACTATGAAAAGGTTGTTGAGTTTCTAAGCACAACGACAAGTAAGACTAATTTGACACAGAAACCCTTGACTTACTCACAGGAGCTTGCAAACAAGTCTAATAATATTTCAATGTTCCAAGCAAGCAGGTTAGTTAACGCGTATATTAGCAATGCTTACAATCTTACCGAAGTTGTTTATAATGATATTATGAATACGGACCAAGGCGGGGGTCA
GGCAAATACAGGTACGCATAAGCTTAGCTATACAGAGGTTTCAAAGTTTATTAAGAGTTCTAGTAATTTGTCTGATTTTCTTACTAAATCGAAGCCCTACTTTAAAAACGTTTCTGACGAAGAGCTAACTGCTATCTATAATGCATCAGAAAGTGGTAAAATAGATAAGAAGGCATATGATACAGCAGTTAAGAACTATGACAAGACAGATTCTGGTGAAAAGTCAACTAGCTCATTGCTAGACACAGACTTTTTTCAAACAACTTTGTATAACTGGTATGCCAACAACGTTAATTTTTTGTCAGGGACTATTACAATCCTTGGAGATCCTGATATAAGAATTGGTACAATATTAAATGATGCCTATGACCACATTAGATATTATATTGAGTCAGTATCGCACACTTTCTCCTTTACAGAAGGTTTTCAAACAGAGATTGGCGTTACTAGAGGACTTACATATCAAGACGGTCAATATGATCCGAGATTCACGGCTACGTATATGTGGGGTACTGGAATAGATTACCAAGGTGGGTATATGGGTGAAGCTCCTGTTAACTACCTTGCTATTGATGGATCTGGTGGAGATGGAAATGATAGCTCTGATGGATCCGGTGCATTTTCAGGTGATGCTGGCCCAGCAACGGCTGTAAAAGCCGCTAAGTATGGGGCAACTTTTGAAAAGTCTAAGTGCAAAAGATCCGAGTGGTATGTATGGGGTGGCGGCCATGGCGGTGGTAACATCTTGGAGTCAAGTGAAGATCCAATAAAGCTTGATTGTTCAGGATTTATATCTGCTTGTTTTAACCATGTTGGTCTTAATATTAATGGTACTACTTGGACATTTAATGATAGTTCATTGTTCACGCATGTTCCTATTCCGTCTACTAGCACAGATGGAATGAAAATTGGTGATTGTGTCCTGCTTTATGGTTGTAACCATATTATGTTTTACGTTGGGGGAGGAAAACTAATGGGTTGGAATGGCAATCCCCCAACAGATACAAGCGGTGGCTGTAAGATAGTCACCTTGTCAGATATGCAGGGACATCACGATGGATATGTTTTAAGATTGAAGGGATAA
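
The CDS coordinates can be checked against the protein length with simple arithmetic. This sketch assumes the range is 1-based and inclusive (the usual GenBank convention, not stated on this page) and that the final codon is a stop codon, which is consistent with the CDS ending in TAA.

```python
def cds_length_nt(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive, 1-based coordinate range."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def expected_protein_length(cds_nt: int, has_stop_codon: bool = True) -> int:
    """Number of amino acids encoded by a CDS of the given length."""
    if cds_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_nt // 3
    return codons - 1 if has_stop_codon else codons


nt = cds_length_nt(76501, 79584)
print(nt, expected_protein_length(nt))  # prints "3084 1027"
```

The result, 3084 nt or 1028 codons including the stop, matches the reported protein length of 1027 AA.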

Gene Ontology

GO ID Description Category Evidence (source)
GO:0001897 symbiont-mediated cytolysis of host cell biological process None (UniProt)
GO:0008234 cysteine-type peptidase activity molecular function None (UniProt)

Enzymatic activity

No enzymatic activity data available.

Tertiary structure

PDB ID
upi0001998ce9_model
Method AlphaFold3 (non-commercial)
Resolution -
Chain position -
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
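
The confidence bands above can be expressed as a small lookup. This is a sketch only: the legend does not say which band the exact boundary values (90, 70, 50) belong to, so assigning them to the lower band is an assumption, and `plddt_band` is a hypothetical helper name.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band named in the legend.

    Assumption: a score exactly on a boundary falls into the lower band.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"
```

For example, `plddt_band(95.0)` returns `"Very high"` and `plddt_band(42.0)` returns `"Very low"`.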

The structures below correspond to the cluster representative (4Sv8d) rather than this protein.
PDB ID
4Sv8d
Method AlphaFold2
Resolution 79.17
Chain position -