Protein
- Protein accession
- A0A0A0YS76 [UniProt]
- Representative
- 8rcoV
- Source
- UniProt (cluster: phalp2_19301)
- Protein name
- Protein RegB
- Lysin probability
- 100%
- PhaLP type
- endolysin (probability: 99%, predicted by ML model)
- Protein sequence
-
MAMSKLKVIGAIMAFTLATGVSASCDVAFTSSQLDVMTRAYTTGKKSDLGYTLAAISWRESKAGQDVVRMGKSVKWANLGAFQNQVKSTGDRAGCKTQSCYADVGYRLMTDQRYAANAALNEVNYWMDRHNSNLRKALASYNSGGNHNTASRRYAQDVTKKAKYLQKCVSFAGRPVVNKPDPSVLADNTRTLKRLKRIQ
- Physico-chemical properties
- protein length: 199 AA; molecular weight: 21880.8 Da; isoelectric point: 10.01; hydropathy: -0.45
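The hydropathy value above is presumably a grand average of hydropathy (GRAVY), but the page does not say which scale it uses. The Python sketch below assumes the Kyte-Doolittle scale and shows how such a value could be computed. The name `gravy` is illustrative and not part of PhaLP.

```python
# Kyte-Doolittle hydropathy value per residue.
# Assumption: the page does not name the scale behind its "hydropathy" field.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy over all residues."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    # A KeyError here signals a non-standard residue code.
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First ten residues of the protein sequence listed above.
print(round(gravy("MAMSKLKVIG"), 2))
```

Passing the full 199-residue sequence instead of the ten-residue prefix gives a value that can be compared with the one reported on the page.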
Representative Protein Details
- Accession
- 8rcoV
- Protein name
- 8rcoV
- Sequence length
- 226 AA
- Molecular weight
- 25702.99650 Da
- Isoelectric point
- 4.89488
- Sequence
-
MMVLSKEDMQLLIKCAMLVVAFSIPQIVINILNEREAQQIENELVTDVEDVDLLPVIDEINPPEIIDEGPIVTEECFIEKDQLDVVSFSYNYGSKYDLGYTLAAIALKESNGGRVNINISDPSGGYYHVTLDKVLRYYRWKNTPYNLNRAMQELVNKPNLAAELAVNELLSWRQRTSYNWMATWASYHAGSRGNTTTRGKDYAADIRQIIAKIKTCKWENSLIVSS
Other Proteins in cluster: phalp2_19301
Similar Clusters (pHMM search)
| # | Cluster | # Members | Identity (%) | Alignment Length | E-value |
|---|---|---|---|---|---|
| 1 | phalp2_30113 3cbsX | 116 | 41.4% | 152 | 7.677E-37 |
| 2 | phalp2_21864 4xA7u | 35 | 40.5% | 143 | 4.565E-30 |
| 3 | phalp2_17388 4Kl1I | 1 | 28.6% | 178 | 2.224E-20 |
| 4 | phalp2_38011 5E1kA | 59 | 32.1% | 143 | 3.038E-18 |
| 5 | phalp2_16738 Yafm | 2 | 29.2% | 140 | 1.998E-12 |
| 6 | phalp2_28672 A0A2I7R1F1 | 1 | 28.0% | 146 | 3.654E-12 |
Domains
Domains [InterPro]
[Domain diagram: axis from 1 to 226 AA (representative)]
Domain positions follow the representative sequence above; the member sequence bar is scaled to the same axis.
Legend: EAD, CBD, Linker, Disordered, Unannotated
Taxonomy
| | Name | Taxonomy ID | Lineage |
|---|---|---|---|
| Phage | Erwinia phage phiEa2809 [NCBI] | 1564096 | Ackermannviridae > Nezavisimistyvirus > Nezavisimistyvirus Ea2809 |
| Host | Erwinia amylovora [NCBI] | 552 | Proteobacteria > Gammaproteobacteria > Enterobacteriales > Enterobacteriaceae > Erwinia > |
Coding sequence (CDS)
CDS Source ID
CDS Source
KP037007 [NCBI]
CDS location
range 70693 -> 71292
strand -
CDS
ATGGCTATGTCAAAGTTGAAGGTCATTGGAGCAATCATGGCCTTTACCCTTGCCACAGGAGTAAGCGCATCATGTGATGTCGCTTTCACCAGTTCGCAACTAGACGTGATGACAAGGGCATACACGACTGGAAAGAAGTCAGACCTGGGCTACACCCTGGCAGCAATCAGCTGGCGGGAGAGCAAGGCCGGTCAGGACGTCGTGAGAATGGGGAAATCGGTCAAGTGGGCAAACCTGGGCGCATTCCAGAACCAGGTGAAGTCAACAGGGGATCGCGCCGGGTGCAAGACCCAAAGCTGTTACGCCGATGTGGGCTACAGATTGATGACTGACCAGCGGTATGCAGCCAATGCAGCTCTGAATGAGGTCAACTACTGGATGGACCGCCACAACTCAAACCTGCGCAAGGCATTGGCCTCCTATAACTCAGGCGGCAACCATAATACTGCATCGCGCCGGTACGCTCAGGACGTCACCAAGAAGGCTAAATATCTGCAAAAGTGTGTTTCATTTGCGGGACGGCCTGTTGTTAACAAACCTGACCCCAGCGTCCTTGCTGACAACACACGCACCCTGAAACGTCTCAAGAGGATACAATAA
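As a consistency check, the CDS coordinates can be related to the protein length, and the CDS can be translated back to the protein sequence. The Python sketch below assumes the standard genetic code and that the CDS is shown in coding orientation (it starts with ATG even though the strand is "-"). The names `translate` and `cds_length` are hypothetical helpers, not part of any database API.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
# Assumption: the page does not state which translation table applies.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding-orientation DNA string, stopping at the first stop codon."""
    cds = cds.strip().upper()
    protein = []
    # Ignore any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def cds_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive, 1-based coordinate range."""
    return end - start + 1


n_nt = cds_length(70693, 71292)
print(n_nt, n_nt // 3 - 1)                  # 600 199 (nucleotides, residues without the stop codon)
print(translate("ATGGCTATGTCAAAGTTGAAG"))   # MAMSKLK (first seven codons of the CDS)
print(translate("AAGAGGATACAATAA"))         # KRIQ (last five codons, ending in the TAA stop)
```

The 600-nucleotide range corresponds to 199 codons plus one stop codon, which agrees with the reported protein length of 199 AA.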
Gene Ontology
No Gene Ontology terms available.
Enzymatic activity
No enzymatic activity data available.
Tertiary structure
No tertiary structures available for this protein.
The structures below correspond to the cluster representative (8rcoV) rather than this protein.
Model Confidence
- Very high: pLDDT > 90
- High: 90 > pLDDT > 70
- Low: 70 > pLDDT > 50
- Very low: pLDDT < 50
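The confidence bands in the legend can be written as a small lookup. The Python sketch below assumes that a score exactly on a boundary (90, 70 or 50) falls into the higher band, which the legend leaves unspecified. `plddt_band` is an illustrative name.

```python
def plddt_band(plddt: float) -> str:
    """Map a per-residue pLDDT score (0 to 100) to the legend's confidence band."""
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must lie between 0 and 100")
    # Assumption: boundary values are assigned to the higher band.
    if plddt >= 90:
        return "Very high"
    if plddt >= 70:
        return "High"
    if plddt >= 50:
        return "Low"
    return "Very low"


for score in (95.2, 81.0, 63.5, 41.7):
    print(score, plddt_band(score))
```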