Protein

Protein accession
20MAi [EnVhog]
Representative
1p4eq
Source
EnVhog (cluster: phalp2_20032)
Protein name
20MAi
Lysin probability
99%
PhaLP type
endolysin
Probability: 99% (predicted by ML model)
Protein sequence
MRLWMPTAIKRPVSYAATLPARELYKAIEMHTNGSAGDPFNWFNSPANTQHLCSHFQVEADGRKYQYLPLNVQAFAEFKANLFAVSIETADGGHPERRWTVPQIASIVEIIEFVGAPGVLLADKPSNGIGWHSQYPDDNQNGHDCPGAVRVAQIHSTLIPQVRADAFNWRSTKRLLVADGVALQLPGHANKPPFGVDVTNPKKGAAFVALITQLRNI
Physico-chemical properties
protein length: 217 AA
molecular weight: 23872,9 Da
isoelectric point: 8,57
hydropathy: -0,23
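The hydropathy figure above reads like a mean per-residue score, but the page does not name the scale behind it. The following is a minimal sketch, assuming the Kyte-Doolittle scale (the usual basis of a GRAVY score); the function name `gravy` is hypothetical and the result is not guaranteed to reproduce the value listed here.

```python
# Kyte-Doolittle hydropathy index per residue.
# ASSUMPTION: the page does not state which scale it uses.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy over all residues."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    # A KeyError here signals a non-standard residue letter (e.g. X or B).
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Short demonstration on the first residues of the sequence shown above.
print(len("MRLWMPTAIK"), gravy("MRLWMPTAIK"))
```

Passing the full protein sequence to `gravy` gives a value that can be compared with the hydropathy entry; a mismatch would suggest the page uses a different scale or window.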
Representative Protein Details
Accession
1p4eq
Protein name
1p4eq
Sequence length
199 AA
Molecular weight
22189,86350 Da
Isoelectric point
10,11124
Sequence
VRSKHATWKPVPYTGLHKRSPRRAVILHTNGGGSGSLQGYFTGNARGLHGAENRHVGAQFQALRNGGAEQYVDTDLVIYHAYGASEWAVGIETEDDGDPSKPWTPKQVAAIVAICRELNVPGQLLKETPSDGIGWHEQYPSWNKTAHHCPGPVRERQIHDEILPALAALSPAQRRRMTIRLRHAVRRVHRLRHLLGRKP
Other Proteins in cluster: phalp2_20032
Total (incl. this protein): 2; Avg length: 208,0; Avg pI: 9,34

Protein ID Length (AA) pI
1p4eq 199 10,11124
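The cluster averages can be checked from the two member records on this page. The sketch below assumes the averages are plain arithmetic means and that decimals are written with a comma, as they appear here; the helper name `parse_decimal_comma` is hypothetical.

```python
def parse_decimal_comma(text: str) -> float:
    """Convert a decimal-comma string such as '10,11124' to a float."""
    return float(text.replace(",", "."))


# Length (AA) and isoelectric point of both cluster members, as listed on the page.
members = {
    "20MAi": {"length": 217, "pI": parse_decimal_comma("8,57")},
    "1p4eq": {"length": 199, "pI": parse_decimal_comma("10,11124")},
}

avg_length = sum(m["length"] for m in members.values()) / len(members)
avg_pi = sum(m["pI"] for m in members.values()) / len(members)

print(f"Avg length: {avg_length:.1f}")  # Avg length: 208.0
print(f"Avg pI: {avg_pi:.2f}")          # Avg pI: 9.34
```

Both results agree with the summary line once the decimal point is read as a comma.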
Similar Clusters (pHMM search)
#   Cluster                # Members  Identity (%)  Alignment Length  E-value
1   phalp2_22323 (f6kt)    1          37,1%         156               1.493E-34
2   phalp2_31597 (4osFp)   9          34,6%         173               1.644E-19
3   phalp2_8004 (5n9mx)    40         27,7%         216               5.675E-19
4   phalp2_29385 (19X45)   21         28,5%         175               4.373E-13
5   phalp2_10334 (1rJoR)   1          29,0%         179               1.095E-12
6   phalp2_14594 (5ioYm)   94         29,6%         162               5.042E-12
7   phalp2_25765 (4RsTe)   9          28,2%         177               2.315E-11
8   phalp2_30640 (6HXeu)   2          27,3%         183               5.767E-11
9   phalp2_26205 (hgNg)    25         24,0%         183               1.341E-08
10  phalp2_9240 (6UXx2)    2          28,0%         203               3.307E-08
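The similar-cluster hits can be loaded into typed records for filtering or sorting. This is a minimal sketch: the names `Hit` and `parse_row` are hypothetical, and the E-value cutoff of 1e-12 is an arbitrary choice for illustration, not a threshold used by the database.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    cluster: str
    members: int
    identity_pct: float
    aln_length: int
    evalue: float


def parse_row(cluster: str, members: str, identity: str,
              aln_length: str, evalue: str) -> Hit:
    """Convert the text fields of one table row into typed values."""
    return Hit(
        cluster=cluster,
        members=int(members),
        # Identity uses a decimal comma and a trailing percent sign.
        identity_pct=float(identity.rstrip("%").replace(",", ".")),
        aln_length=int(aln_length),
        evalue=float(evalue),
    )


# Rows copied from the table above (cluster, members, identity, length, E-value).
RAW_ROWS = [
    ("phalp2_22323", "1", "37,1%", "156", "1.493E-34"),
    ("phalp2_31597", "9", "34,6%", "173", "1.644E-19"),
    ("phalp2_8004", "40", "27,7%", "216", "5.675E-19"),
    ("phalp2_29385", "21", "28,5%", "175", "4.373E-13"),
    ("phalp2_10334", "1", "29,0%", "179", "1.095E-12"),
    ("phalp2_14594", "94", "29,6%", "162", "5.042E-12"),
    ("phalp2_25765", "9", "28,2%", "177", "2.315E-11"),
    ("phalp2_30640", "2", "27,3%", "183", "5.767E-11"),
    ("phalp2_26205", "25", "24,0%", "183", "1.341E-08"),
    ("phalp2_9240", "2", "28,0%", "203", "3.307E-08"),
]

hits = [parse_row(*row) for row in RAW_ROWS]

EVALUE_CUTOFF = 1e-12  # illustrative cutoff only
strong = [h for h in hits if h.evalue < EVALUE_CUTOFF]

for h in strong:
    print(h.cluster, h.identity_pct, h.evalue)
```

With this cutoff the first four clusters pass; the fifth (1.095E-12) falls just above it.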

Domains

Representative sequence (used for alignment): 1p4eq (199 AA)
Member sequence: 20MAi (217 AA)
Axis range: 1 to 199 AA (representative)
Domain positions follow the representative sequence above; the member sequence bar is scaled to the same axis.
Legend: EAD, CBD, Linker, Disordered, Unannotated
Pfam accessions: PF01510

Taxonomy

       Name                     Taxonomy ID     Lineage
Phage  Unknown from Metagenome  UNKNOWN_ENVHOG  No lineage information
Host   No host information

Coding sequence (CDS)

No CDS data available.

Gene Ontology

No Gene Ontology terms available.

Enzymatic activity

No enzymatic activity data available.

Tertiary structure

No tertiary structures available for this protein.

The structures below correspond to the cluster representative (1p4eq) rather than this protein.
PDB ID 1p4eq
Method AlphaFoldv2
Resolution 92.09
Chain position -
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
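The confidence bands above can be expressed as a small lookup. This is a sketch under two assumptions: the function name `plddt_band` is hypothetical, and a score exactly on a boundary (90, 70 or 50) is placed in the lower band, since the legend leaves those values unassigned. Treating the 92.09 listed under Resolution as a mean pLDDT is also an assumption; the page does not say what that number measures for an AlphaFold model.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT score (0 to 100) to the confidence band from the legend."""
    if not 0.0 <= plddt <= 100.0:
        raise ValueError(f"pLDDT must be between 0 and 100, got {plddt}")
    # ASSUMPTION: boundary values (90, 70, 50) fall into the lower band.
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "High"
    if plddt > 50:
        return "Low"
    return "Very low"


# ASSUMPTION: 92.09 is read here as a mean pLDDT, which the page does not confirm.
print(plddt_band(92.09))  # Very high
```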