Protein

Protein accession
2b3IK [EnVhog]
Representative
4gf7o
Source
EnVhog (cluster: phalp2_30252)
Protein name
2b3IK
Lysin probability
89%
PhaLP type
endolysin
Probability: 99% (predicted by ML model)
Protein sequence
MPQKEGPMQQYKDPQQWMDDALPVAIGIIFVIVFSIVCGHVWGAEIPKTRAVMAVIGEAEGETYIGKLAVACAIRERGTLRGVFGEHAPRVKKHLYSVKTFVAADRAWEESRDPGNCAMTDHADHWEGTAFPLPSWAKDMKQTAVIGNQRFFRAYTQDEEQEQIDRRHGL
Physico-chemical properties
protein length: 170 AA
molecular weight: 19154,6 Da
isoelectric point: 6,07
hydropathy: -0,38
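The length and hydropathy values can be rechecked from the protein sequence. The sketch below assumes that "hydropathy" is a GRAVY score, i.e. the mean Kyte-Doolittle hydropathy per residue; the record does not name its scale, and the function name `gravy` is illustrative only. Under that assumption the result should round to the listed -0,38.

```python
# Kyte-Doolittle hydropathy scale.
# Assumption: the record does not state which scale it uses.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Sequence of 2b3IK as printed in this record.
SEQUENCE = (
    "MPQKEGPMQQYKDPQQWMDDALPVAIGIIFVIVFSIVCGHVWGAEIPKTR"
    "AVMAVIGEAEGETYIGKLAVACAIRERGTLRGVFGEHAPRVKKHLYSVKT"
    "FVAADRAWEESRDPGNCAMTDHADHWEGTAFPLPSWAKDMKQTAVIGNQR"
    "FFRAYTQDEEQEQIDRRHGL"
)


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy per residue."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


print(f"length: {len(SEQUENCE)} AA")
print(f"GRAVY:  {gravy(SEQUENCE):.2f}")
```

A negative GRAVY score indicates a protein that is hydrophilic on average.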
Representative Protein Details
Accession
4gf7o
Protein name
4gf7o
Sequence length
168 AA
Molecular weight
18921,80280 Da
Isoelectric point
9,90326
Sequence
MSLNYKQSRLLARSWTESEKEESRRMPFIWAIMVGIAILLVMLIVDMASASEASTAKYSDNMAILAIIGEAESEPYAGMVAVGRTIIKRGSLKGVYGLTARRVVMRKYSSSTYKRARQALEEAKRTMHGWKAIGWGNESDLAIFNRSAWFTRCTIVAHIGNHYFYGVK
Other Proteins in cluster: phalp2_30252
Total (incl. this protein): 10; Avg length: 169,7 AA; Avg pI: 8,88

Protein ID Length (AA) pI
4gf7o 168 9,90326
1Z5Gx 185 10,13909
1kBRq 146 9,32324
1mFNG 129 7,81487
2Ak0h 176 9,16201
3npLe 174 9,09792
4Zeyj 193 8,21941
8aCXN 174 9,64159
Rkne 182 9,39712
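The table lists nine proteins; the total of 10 also counts this protein (2b3IK, 170 AA, pI 6,07). The sketch below recomputes the cluster summary, assuming the averages are plain arithmetic means and using the rounded pI printed for 2b3IK. The helper name `parse_decimal_comma` is illustrative.

```python
# (protein ID, length in AA, pI as printed, with a decimal comma)
listed = [
    ("4gf7o", 168, "9,90326"),
    ("1Z5Gx", 185, "10,13909"),
    ("1kBRq", 146, "9,32324"),
    ("1mFNG", 129, "7,81487"),
    ("2Ak0h", 176, "9,16201"),
    ("3npLe", 174, "9,09792"),
    ("4Zeyj", 193, "8,21941"),
    ("8aCXN", 174, "9,64159"),
    ("Rkne", 182, "9,39712"),
]
# This protein is counted in the total but not shown in the table.
this_protein = ("2b3IK", 170, "6,07")


def parse_decimal_comma(text: str) -> float:
    """Convert a number written with a decimal comma to a float."""
    return float(text.replace(",", "."))


members = listed + [this_protein]
avg_length = sum(length for _, length, _ in members) / len(members)
avg_pi = sum(parse_decimal_comma(pi) for _, _, pi in members) / len(members)

print(f"total: {len(members)}")
print(f"avg length: {avg_length:.1f}")
print(f"avg pI: {avg_pi:.2f}")
```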
Similar Clusters (pHMM search)
# Cluster # Members Identity (%) Alignment Length E-value
1 phalp2_3237 (2edZS) 52 39,0% 128 2.031E-43
2 phalp2_16587 (oHc) 1 34,1% 117 3.322E-26
3 phalp2_40530 (4HipJ) 3 36,9% 111 6.223E-26
4 phalp2_11072 (4Y5WB) 2 25,1% 151 1.595E-25
5 phalp2_37890 (7FH0Q) 12 33,3% 111 1.305E-15
6 phalp2_11411 (d7uy) 20 22,5% 133 1.208E-12
7 phalp2_13797 (6Xdga) 10 26,9% 130 2.660E-11
8 phalp2_26034 (6XPUy) 1 23,7% 118 1.693E-10
9 phalp2_26781 (3naKf) 1 21,9% 132 3.134E-10
10 phalp2_20064 (1DZhH) 3 18,4% 130 2.306E-08
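The pHMM hits can be screened by E-value and identity. The sketch below is one way to do that; the cutoffs (E-value at most 1e-20, identity at least 30%) are arbitrary illustrative choices, not thresholds used by the database, and `filter_hits` is a hypothetical helper.

```python
# (cluster, members, identity as printed, alignment length, E-value as printed)
hits = [
    ("phalp2_3237", 52, "39,0%", 128, "2.031E-43"),
    ("phalp2_16587", 1, "34,1%", 117, "3.322E-26"),
    ("phalp2_40530", 3, "36,9%", 111, "6.223E-26"),
    ("phalp2_11072", 2, "25,1%", 151, "1.595E-25"),
    ("phalp2_37890", 12, "33,3%", 111, "1.305E-15"),
    ("phalp2_11411", 20, "22,5%", 133, "1.208E-12"),
    ("phalp2_13797", 10, "26,9%", 130, "2.660E-11"),
    ("phalp2_26034", 1, "23,7%", 118, "1.693E-10"),
    ("phalp2_26781", 1, "21,9%", 132, "3.134E-10"),
    ("phalp2_20064", 3, "18,4%", 130, "2.306E-08"),
]


def identity_fraction(text: str) -> float:
    """Convert an identity such as '39,0%' to a fraction such as 0.39."""
    return float(text.rstrip("%").replace(",", ".")) / 100


def filter_hits(rows, max_evalue: float, min_identity: float):
    """Keep hits at or below the E-value cutoff and at or above the identity cutoff."""
    return [
        row for row in rows
        if float(row[4]) <= max_evalue and identity_fraction(row[2]) >= min_identity
    ]


# Illustrative cutoffs only.
strong = filter_hits(hits, max_evalue=1e-20, min_identity=0.30)
for cluster, members, identity, length, evalue in strong:
    print(cluster, members, identity, length, evalue)
```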

Domains

Domain annotation: Unannotated
Representative sequence (used for alignment): 4gf7o (168 AA)
Member sequence: 2b3IK (170 AA)
Axis: residues 1 to 168 (representative)
Domain positions follow the representative sequence above; the member sequence bar is scaled to the same axis.
Legend: EAD (enzymatically active domain), CBD (cell wall binding domain), Linker, Disordered, Unannotated

Taxonomy

Phage: Unknown from Metagenome; Taxonomy ID: UNKNOWN_ENVHOG; Lineage: No lineage information
Host: No host information

Coding sequence (CDS)

No CDS data available.

Gene Ontology

No Gene Ontology terms available.

Enzymatic activity

No enzymatic activity data available.

Tertiary structure

PDB ID
2b3IK
Method AlphaFoldv2
Resolution 90.45
Chain position -
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
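The confidence bands can be expressed as a small lookup. The sketch below assumes that the "Resolution" values reported for the AlphaFold models (90.45 here, 91.74 for the representative) are mean pLDDT scores, which the record does not state. The legend uses strict inequalities, so assigning the boundary values 90, 70 and 50 to the lower band is also an assumption. The function name `plddt_band` is illustrative.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT score (0 to 100) to the confidence band named in the legend."""
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must lie between 0 and 100")
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "High"  # assumption: exactly 90 falls in this band
    if plddt > 50:
        return "Low"  # assumption: exactly 70 falls in this band
    return "Very low"  # assumption: exactly 50 falls in this band


# Assumption: the "Resolution" fields hold mean pLDDT values.
for model, score in [("2b3IK", 90.45), ("4gf7o", 91.74)]:
    print(model, score, plddt_band(score))
```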

The structures below correspond to the cluster representative (4gf7o) rather than this protein.
PDB ID
4gf7o
Method AlphaFoldv2
Resolution 91.74
Chain position -