I-TASSER results

I-TASSER results for job id Rv0997

[Click on result.tar.bz2 to download the tarball file including all modelling results listed on this page]

Input Sequence in FASTA format

>protein (143 residues)
MAGIAGVDRDPPGWPQHSHLLAGDPERFRHQLQRAETTNSIECFVAEWHHAGVAADMTRP
WPTVVQGGAGQRRRRDVEPDRKTPVRWMSGQRLSEITWPTTDIEHSVGAAEVQRHRGAVP
LGSGGDAAGKVEGGRTPQPFVQP

Predicted Secondary Structure

Sequence	MAGIAGVDRDPPGWPQHSHLLAGDPERFRHQLQRAETTNSIECFVAEWHHAGVAADMTRPWPTVVQGGAGQRRRRDVEPDRKTPVRWMSGQRLSEITWPTTDIEHSVGAAEVQRHRGAVPLGSGGDAAGKVEGGRTPQPFVQP
Prediction	CCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCEEEEHHHHHHHHHCCCCCCCCCCEEHCCCCCEHCCCCCCCCCCCEEHCCCCCCEHCCCCCCCCHHHHCHHHHHHHCCCCCCCCCCCCCCEHCCCCCCCCCCCC
Conf.Score	98877878899998766554568989999999987645644555444445444444578887455478765333468877688644446765444557765544444478888765777778887755544578888888899
	H: Helix; E: Strand; C: Coil
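The per-residue prediction string can be condensed into contiguous segments, which is often easier to read than 143 single letters. The following is a minimal sketch, assuming the string uses one letter per residue (strand residues appear as E in the prediction above); the function names `ss_segments` and `ss_fractions` are hypothetical helpers, not part of I-TASSER.

```python
from itertools import groupby


def ss_segments(prediction):
    """Collapse a per-residue string into (state, start, end) runs.

    Residue numbers are 1-based and inclusive.
    """
    segments = []
    position = 1
    for state, run in groupby(prediction):
        length = len(list(run))
        segments.append((state, position, position + length - 1))
        position += length
    return segments


def ss_fractions(prediction):
    """Fraction of residues assigned to each state."""
    total = len(prediction)
    return {state: prediction.count(state) / total
            for state in sorted(set(prediction))}


if __name__ == "__main__":
    # Toy string for illustration; paste the Prediction row to analyse the target.
    toy = "CCCHHHHHCCEEEC"
    for state, start, end in ss_segments(toy):
        print(f"{state}\t{start}-{end}")
    print(ss_fractions(toy))
```

Short runs of one or two residues (such as an isolated H or E) tend to coincide with low confidence scores, so it may be worth reading the segments alongside the Conf.Score row.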

Predicted Solvent Accessibility

Sequence	MAGIAGVDRDPPGWPQHSHLLAGDPERFRHQLQRAETTNSIECFVAEWHHAGVAADMTRPWPTVVQGGAGQRRRRDVEPDRKTPVRWMSGQRLSEITWPTTDIEHSVGAAEVQRHRGAVPLGSGGDAAGKVEGGRTPQPFVQP
Prediction	74424424553541453431233437414441553643530401124133122324244423212533445445562547453334224454146141334415442334414544311322444533242654534443368
	Values range from 0 (buried residue) to 8 (highly exposed residue)

Predicted Normalized B-factor

(B-factor is a value indicating the extent of the inherent thermal mobility of residues/atoms in proteins. In I-TASSER, this value is deduced from threading template proteins from the PDB in combination with the sequence profiles derived from sequence databases. The reported B-factor profile in the figure below corresponds to the normalized B-factor of the target protein, defined by B = (B' - u)/s, where B' is the raw B-factor value, and u and s are respectively the mean and standard deviation of the raw B-factors along the sequence.)
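The normalization B = (B' - u)/s is an ordinary z-score over the residues of one sequence. A minimal sketch follows, assuming the population standard deviation is meant (the page does not say whether the population or sample form is used); `normalize_b_factors` is a hypothetical helper name and the input values are made up for illustration.

```python
from statistics import mean, pstdev


def normalize_b_factors(raw):
    """Return (B' - u) / s for each raw B-factor B'.

    u is the mean and s the standard deviation of the raw values.
    Assumption: population standard deviation (pstdev).
    """
    u = mean(raw)
    s = pstdev(raw)
    if s == 0:
        raise ValueError("all raw B-factors are identical; cannot normalize")
    return [(b - u) / s for b in raw]


if __name__ == "__main__":
    raw_values = [12.0, 18.5, 25.0, 40.0, 31.5]  # illustrative numbers only
    for b_raw, b_norm in zip(raw_values, normalize_b_factors(raw_values)):
        print(f"{b_raw:6.1f} -> {b_norm:+.2f}")
```

After normalization the profile has zero mean and unit standard deviation, so positive values mark residues predicted to be more mobile than average for this protein and negative values mark more rigid ones.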

Top 10 threading templates used by I-TASSER

Sec.Str	CCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCEEEEHHHHHHHHHCCCCCCCCCCEEHCCCCCEHCCCCCCCCCCCEEHCCCCCCEHCCCCCCCCHHHHCHHHHHHHCCCCCCCCCCCCCCEHCCCCCCCCCCCC
Seq	MAGIAGVDRDPPGWPQHSHLLAGDPERFRHQLQRAETTNSIECFVAEWHHAGVAADMTRPWPTVVQGGAGQRRRRDVEPDRKTPVRWMSGQRLSEITWPTTDIEHSVGAAEVQRHRGAVPLGSGGDAAGKVEGGRTPQPFVQP

Rank	PDB Hit	Iden1	Iden2	Cov	Norm. Zscore	Download Align.	Alignment
1	3ct8A	0.15	0.13	0.85	0.66	MUSTER	L-YFQGLH--------HVEINVDHLEESIAFWDWLLGELGYEDYQS--WSRGKSYKHGKTYLVFVQT---EDRFQTPTFHRKRT-------GLNHLAFHAASREKVDELTQKLKERGDPILYEDRHPFAGGPNHYAPN-IVAP
2	3k70D4	0.18	0.15	0.83	0.67	dPPAS	---------------EVNHAIEVDEALLAQTLDKLFPVSD----EINWQKVAAAVALTRRI-SVISGGPGIQMADGGDRDQLASVE--AGAVLGDICAYANAGFTAERARQLSRLTGTHVPAGTGTEAASLRDSQKSYRFG--
3	1zfjA2	0.30	0.18	0.60	0.52	wdPPAS	------VIIDP---PEHK-------------VSEAEE-------LQRYRISGV------P---IVETLANRKL-RDFISDYNAPIS-TS-EHL--VT--AVGTDLETAERILHEHREKLPL----DNSGRLSG------LI--
4	4btgA	0.20	0.19	0.93	0.73	wMUSTER	LKNSVGALQLPLQFTRFSASMTSDPVMYARLFFQYAQAGSVDELFTEYHQSTACNPEI--WRKLTAYITGSS-NRAIKADAKVPPTAI--EQLRTLAPSEHELFHHITTDFVCHV--LSPLGILPDAAYVYRVGRT-ATY--P
5	3tsbA2	0.31	0.20	0.64	0.51	wPPAS	V--I----SDP---PEHQ-------------VYDAEH---L---MGKYRISGV--------P-VVNNLDERKL-RDMIQDYSIKIS-MTKEQL--IT--PVGTTLSEAEKILQKYKEKLPL----DNNGVLQG--T---IEKP
6	3k70D4	0.21	0.16	0.76	0.57	dPPAS2	---------------EVNHAIEVDEALLAQTLDKLPVSDEIN-----WQKVAAAVALTRRI-SVISGGPGIQMADGGDRDQLASVE--AGAVLGDICANAGFT--AERARQLSRLTGTHVPAGTGTEAASLRDG---------
7	2ktlA	0.21	0.19	0.89	0.54	PPAS	KSSLSGFVNPQSGNPHAPQTNFANMPSARVTLPKSLVYDK---FSKVLWSAGLVASKSR---IINNNGAYVGSRPGVKKNEPGGG--MPDD-LTFTTWNASKTQEFIGDLLILK-LGKVS--LGLTAPGVGKGKEEPSP----
8	5ic8A3	0.21	0.16	0.75	0.65	Env-PPAS	---------NASLWGRRA-AQNGDMQRARAHFLRGCRFCT-RE-PTLWLEYAR---CEMDWLARM------EAKKDGLPDPD--AEGT--DGTKKAAKPVFD-AEQTKKLEQSALSGAIPI-AVFDVARK-------QPFW--
9	4hqbA	0.15	0.14	0.91	0.64	MUSTER	MLQIEFIT--DLG-ARVTVNVEHESRLLDVQRHYGRLSGGYQFPIENWSLIGARKWKSPEGEELVIHRGHAYRRRELEAVDSRKLKLPAAIKYSRGAKV-SDPQHVREKA--DGDIEYVSL-------AIFRGGKRQERYAVP
10	2p6aD3	0.13	0.11	0.87	0.58	dPPAS	------CNRICPEPASSEQYLCGSACHLRKATCLLGRSIGLAYEGKCIKAKSCEDIQCTGGKKCLWDFKVGRGRDELCPDSDEPVCASDNA--------TYASECAMKEAACSSGVLLEVKHSGSCNSISEDTEEEEE-----

(a)	Iden1 is the number of template residues identical to query divided by number of aligned residues.
(b)	Iden2 is the number of template residues identical to query divided by query sequence length.
(c)	Cov is equal to the number of aligned template residues divided by query sequence length.
(d)	Norm. Zscore is the normalized Z-score of the threading alignments. A Normalized Z-score >1 means a good alignment.
(e)	Download Align. lists the threading program used to identify the template and provides the 3D structure of the aligned regions of the threading templates.
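The three ratios defined in notes (a) to (c) can be recomputed from an alignment row. The sketch below assumes the template string has one character per query position, with `-` marking positions where no template residue is aligned; `threading_stats` is a hypothetical helper, and the toy sequences are for illustration only.

```python
def threading_stats(query, template_alignment, gap="-"):
    """Compute Iden1, Iden2 and Cov for one threading alignment.

    Iden1 = identical residues / aligned residues
    Iden2 = identical residues / query length
    Cov   = aligned residues   / query length
    """
    if len(query) != len(template_alignment):
        raise ValueError("alignment must have one character per query residue")
    aligned = sum(1 for t in template_alignment if t != gap)
    identical = sum(
        1 for q, t in zip(query, template_alignment) if t != gap and q == t
    )
    query_length = len(query)
    return {
        "Iden1": identical / aligned if aligned else 0.0,
        "Iden2": identical / query_length,
        "Cov": aligned / query_length,
    }


if __name__ == "__main__":
    # Toy example: 6-residue query, 4 aligned positions, 3 of them identical.
    stats = threading_stats("MAGIAG", "M-GLA-")
    print({name: round(value, 2) for name, value in stats.items()})
```

Because Iden2 and Cov share the same denominator, Iden2 = Iden1 × Cov, which explains why a template with high Iden1 but low coverage (for example Iden1 0.30 at Cov 0.60) ends up with an Iden2 similar to templates that align more of the query at lower identity.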

Top 5 final models predicted by I-TASSER

(For each target, I-TASSER simulations generate a large ensemble of structural conformations, called decoys. To select the final models, I-TASSER uses the SPICKER program to cluster all the decoys based on pair-wise structure similarity, and reports up to five models, which correspond to the five largest structure clusters. The confidence of each model is quantitatively measured by the C-score, which is calculated based on the significance of threading template alignments and the convergence parameters of the structure assembly simulations. C-score is typically in the range of [-5, 2], where a higher C-score signifies a model with higher confidence and vice versa. TM-score and RMSD are estimated based on C-score and protein length, following the correlation observed between these qualities. Since the top 5 models are ranked by cluster size, it is possible in rare cases that lower-rank models have a higher C-score. Although the first model has better quality in most cases, it is also possible that lower-rank models have better quality than higher-rank models, as seen in our benchmark tests. If the I-TASSER simulations converge, fewer than 5 clusters may be generated. This is usually an indication that the models have good quality because of the converged simulations.)


Local structure accuracy of first model

Model 1	C-score = -3.95	Estimated TM-score = 0.29±0.09	Estimated RMSD = 13.9±3.9 Å
Model 2	C-score = -4.74
Model 3	C-score = -4.41
Model 4	C-score = -5.00
Model 5	C-score = -5.00
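Because the models are ranked by cluster size rather than by C-score, the two orderings can disagree. The short sketch below uses the C-scores reported for this job to list such disagreements; `rank_inversions` is a hypothetical helper name.

```python
# C-scores reported for this job, keyed by model rank (cluster-size order).
C_SCORES = {1: -3.95, 2: -4.74, 3: -4.41, 4: -5.00, 5: -5.00}


def rank_inversions(c_scores):
    """Return (higher_rank, lower_rank) pairs where the lower-ranked
    model has a strictly higher C-score than the higher-ranked one."""
    models = sorted(c_scores)
    return [
        (a, b)
        for i, a in enumerate(models)
        for b in models[i + 1:]
        if c_scores[b] > c_scores[a]
    ]


if __name__ == "__main__":
    by_c_score = sorted(C_SCORES, key=lambda m: C_SCORES[m], reverse=True)
    print("Order by C-score:", by_c_score)
    print("Inversions:", rank_inversions(C_SCORES))
```

Here Model 3 has a higher C-score than Model 2 despite its lower rank, which is the case the note above describes. All five C-scores sit near the bottom of the typical [-5, 2] range, so every model should be treated as low confidence.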

Proteins structurally close to the target in PDB (as identified by TM-align)

(After the structure assembly simulation, I-TASSER uses the TM-align structural alignment program to match the first I-TASSER model to all structures in the PDB library. This section reports the top 10 proteins from the PDB that have the closest structural similarity, i.e. the highest TM-score, to the predicted I-TASSER model. Due to the structural similarity, these proteins often have similar function to the target. However, users are encouraged to use the data in the next section 'Predicted function using COACH' to infer the function of the target protein, since COACH has been extensively trained to derive biological functions from multiple sources of sequence and structure features, which on average gives higher accuracy than function annotations derived only from global structure comparison.)

Top 10 Identified structural analogs in PDB

Rank	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov
1	3k70D	0.659	3.43	0.146	0.860
2	1v8zB	0.485	4.98	0.106	0.846
3	3ahcA	0.479	5.05	0.053	0.825
4	5kinB	0.478	4.86	0.050	0.825
5	1k8zB	0.466	4.84	0.058	0.811
6	1x1qA	0.460	4.66	0.058	0.783
7	4c20A	0.459	4.36	0.069	0.748
8	1fuiA	0.458	4.31	0.053	0.741
9	3a9rA	0.455	4.43	0.047	0.755
10	3owaA	0.453	5.10	0.037	0.804

(a)	Query structure is shown in cartoon, while the structural analog is displayed using backbone trace.
(b)	Ranking of proteins is based on TM-score of the structural alignment between the query structure and known structures in the PDB library.
(c)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(d)	IDEN^a is the percentage sequence identity in the structurally aligned region.
(e)	Cov represents the coverage of the alignment by TM-align and is equal to the number of structurally aligned residues divided by length of the query protein.
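Since Cov is the number of structurally aligned residues divided by the query length (143 here), the aligned residue count can be recovered from the table. The sketch below does that and flags hits above a TM-score of 0.5; that cutoff is an assumption taken from the common rule of thumb that TM-score > 0.5 suggests the same fold, and is not stated on this page. The helper name `aligned_residues` is hypothetical.

```python
QUERY_LENGTH = 143
SAME_FOLD_TM = 0.5  # assumption: common rule-of-thumb cutoff, not from this page

# (PDB hit, TM-score, Cov) copied from the table of structural analogs.
ANALOGS = [
    ("3k70D", 0.659, 0.860), ("1v8zB", 0.485, 0.846), ("3ahcA", 0.479, 0.825),
    ("5kinB", 0.478, 0.825), ("1k8zB", 0.466, 0.811), ("1x1qA", 0.460, 0.783),
    ("4c20A", 0.459, 0.748), ("1fuiA", 0.458, 0.741), ("3a9rA", 0.455, 0.755),
    ("3owaA", 0.453, 0.804),
]


def aligned_residues(cov, query_length=QUERY_LENGTH):
    """Invert Cov = aligned / query_length, rounding to a whole residue."""
    return round(cov * query_length)


if __name__ == "__main__":
    for pdb_hit, tm_score, cov in ANALOGS:
        flag = "*" if tm_score > SAME_FOLD_TM else " "
        print(f"{flag} {pdb_hit}\tTM={tm_score:.3f}\taligned={aligned_residues(cov)}")
```

Only 3k70D clears the 0.5 cutoff. Note that the match is to the predicted model, whose own C-score is low, so this similarity should be read as tentative.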

Predicted function using COACH

(This section reports biological annotations of the target protein by COACH based on the I-TASSER structure prediction. COACH is a meta-server approach that combines multiple function annotation results from the COFACTOR, TM-SITE and S-SITE programs.)

Ligand binding sites

Rank	C-score	Cluster size	PDB Hit	Lig Name	Download Complex	Ligand Binding Site Residues
1	0.13	4	2b5zA	BGS	Rep, Mult	114,115
2	0.03	1	2xgfA	FE2	Rep, Mult	17,19
3	0.03	1	3r1bC	TB2	Rep, Mult	54,58
4	0.03	1	4nhpA	2T8	Rep, Mult	110,114
5	0.03	1	2d2nC	OXY	Rep, Mult	49,53
6	0.03	1	4qc1A	ZN	Rep, Mult	36,50
7	0.03	1	3ai7A	TPP	Rep, Mult	42,53,65,77
8	0.03	1	1ud9A	ZN	Rep, Mult	24,26
9	0.03	1	3g3nA	TC8	Rep, Mult	31,35,57
10	0.03	1	3psqB	ZN	Rep, Mult	26,30
11	0.03	1	2h1cA	MG	Rep, Mult	29,33,50
12	0.03	1	2i0oA	ZN	Rep, Mult	59,111

(a)	C-score is the confidence score of the prediction. C-score ranges [0-1], where a higher score indicates a more reliable prediction.
(b)	Cluster size is the total number of templates in a cluster.
(c)	Lig Name is name of possible binding ligand. Click the name to view its information in the BioLiP database.
(d)	Rep is a single complex structure with the most representative ligand in the cluster, i.e., the one listed in the Lig Name column. Mult is the complex structures with all potential binding ligands in the cluster.
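The binding site column lists residue numbers only. Assuming these are 1-based positions in the input sequence (the usual convention, though not stated explicitly here), they can be mapped back to amino acids as sketched below; `site_residues` is a hypothetical helper.

```python
# Query sequence from the FASTA block at the top of the page (143 residues).
SEQUENCE = (
    "MAGIAGVDRDPPGWPQHSHLLAGDPERFRHQLQRAETTNSIECFVAEWHHAGVAADMTRP"
    "WPTVVQGGAGQRRRRDVEPDRKTPVRWMSGQRLSEITWPTTDIEHSVGAAEVQRHRGAVP"
    "LGSGGDAAGKVEGGRTPQPFVQP"
)


def site_residues(sequence, positions):
    """Label 1-based residue positions with their one-letter amino acid."""
    labels = []
    for pos in positions:
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside 1..{len(sequence)}")
        labels.append(f"{sequence[pos - 1]}{pos}")
    return labels


if __name__ == "__main__":
    print("Rank 1 (BGS):", site_residues(SEQUENCE, [114, 115]))
    print("Rank 2 (FE2):", site_residues(SEQUENCE, [17, 19]))
```

Seeing the residue types next to the ligand name gives a quick plausibility check, although with C-scores of 0.13 and below none of these predictions is reliable on its own.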

Enzyme Commission (EC) numbers and active sites

Rank	Cscore^EC	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov	EC Number	Active Site Residues
1	0.060	1zklA	0.446	4.95	0.039	0.790	3.1.4.17	NA
2	0.060	1vgmA	0.440	4.99	0.023	0.790	4.1.3.7	105
3	0.060	3fg0C	0.438	4.87	0.056	0.762	1.2.1.8	95
4	0.060	2o2jB	0.444	4.58	0.067	0.720	4.2.1.20	50
5	0.060	1x1qA	0.460	4.66	0.058	0.783	4.2.1.20	NA
6	0.060	2j5nB	0.436	4.88	0.058	0.755	1.5.1.12	26
7	0.060	2d4eC	0.440	4.81	0.064	0.762	1.2.1.60	NA
8	0.060	1w36D	0.652	3.25	0.140	0.846	3.1.11.5	32
9	0.060	1wnbA	0.446	4.65	0.065	0.755	1.2.1.19,1.2.1.8	95
10	0.060	2tysB	0.489	4.78	0.092	0.839	4.2.1.20	NA
11	0.060	1x1qB	0.446	4.57	0.030	0.741	4.2.1.20	NA
12	0.060	2o1sB	0.383	5.24	0.068	0.699	2.2.1.7	36
13	0.060	2o2eB	0.442	4.72	0.074	0.755	4.2.1.20	NA
14	0.060	2opxA	0.439	4.76	0.058	0.748	1.2.1.21,1.2.1.22	134
15	0.060	2hawB	0.437	5.11	0.015	0.790	3.6.1.1	44
16	0.060	3bjcA	0.424	5.37	0.008	0.783	3.1.4.35	34
17	0.060	3hhsA	0.441	5.60	0.014	0.839	1.14.18.1	NA
18	0.060	2hg2A	0.437	4.89	0.072	0.762	1.2.1.21,1.2.1.22	95
19	0.060	2qykB	0.447	4.91	0.029	0.790	3.1.4.17	39

(a)	Cscore^EC is the confidence score for the EC number prediction. Cscore^EC values range over [0-1], where a higher score indicates a more reliable EC number prediction.
(b)	TM-score is a measure of global structural similarity between query and template protein.
(c)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(d)	IDEN^a is the percentage sequence identity in the structurally aligned region.
(e)	Cov represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.
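All 19 hits share the same Cscore^EC of 0.060, so the confidence score cannot separate them. One simple way to summarize the table is to count how often each EC number recurs; this tally is only an illustrative heuristic, not something COACH reports, and `tally_ec` is a hypothetical helper.

```python
from collections import Counter

# EC Number column copied from the table, in rank order.
EC_COLUMN = [
    "3.1.4.17", "4.1.3.7", "1.2.1.8", "4.2.1.20", "4.2.1.20",
    "1.5.1.12", "1.2.1.60", "3.1.11.5", "1.2.1.19,1.2.1.8", "4.2.1.20",
    "4.2.1.20", "2.2.1.7", "4.2.1.20", "1.2.1.21,1.2.1.22", "3.6.1.1",
    "3.1.4.35", "1.14.18.1", "1.2.1.21,1.2.1.22", "3.1.4.17",
]


def tally_ec(entries):
    """Count EC numbers, splitting cells that list several of them."""
    counts = Counter()
    for entry in entries:
        for ec_number in entry.split(","):
            counts[ec_number.strip()] += 1
    return counts


if __name__ == "__main__":
    for ec_number, count in tally_ec(EC_COLUMN).most_common(5):
        print(f"{ec_number}\t{count}")
```

EC 4.2.1.20 appears five times, more than any other entry, but recurrence among equally low-scoring hits is weak evidence at best.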

Gene Ontology (GO) terms

Homologous GO templates in PDB
Rank	Cscore^GO	TM-score	RMSD^a	IDEN^a	Cov	PDB Hit	Associated GO Terms
0	0.09	0.659	3.43	0.15	0.86	3k70D	GO:0000166 GO:0000724 GO:0003677 GO:0004003 GO:0004386 GO:0004518 GO:0004519 GO:0004527 GO:0005524 GO:0006281 GO:0006302 GO:0006310 GO:0006974 GO:0008854 GO:0009338 GO:0016787 GO:0032508 GO:0043142 GO:0090305
1	0.07	0.466	4.88	0.06	0.82	4y6gB	GO:0000162 GO:0004834 GO:0005737 GO:0006568 GO:0008652 GO:0009073 GO:0016829 GO:0030170
2	0.07	0.428	4.99	0.08	0.76	1x1qA	GO:0000162 GO:0004834 GO:0006568 GO:0008652 GO:0009073 GO:0016829
3	0.07	0.451	4.58	0.07	0.76	2dh6A	GO:0000162 GO:0004834 GO:0005737 GO:0005829 GO:0006568 GO:0008652 GO:0009073 GO:0016829 GO:0030170 GO:0042802
4	0.07	0.408	5.32	0.02	0.78	4qysB	GO:0000162 GO:0004834 GO:0005737 GO:0006568 GO:0008652 GO:0009073 GO:0016829 GO:0030170 GO:0052684
5	0.07	0.469	4.89	0.07	0.82	2rhgB	GO:0000162 GO:0004834 GO:0005737 GO:0006568 GO:0008652 GO:0009073 GO:0016829 GO:0030170
6	0.07	0.446	4.57	0.03	0.74	1x1qB	GO:0000162 GO:0004834 GO:0006568 GO:0008652 GO:0009073 GO:0016829
7	0.06	0.401	5.11	0.08	0.74	5b36B	GO:0004122 GO:0004124 GO:0006535 GO:0008652 GO:0016740 GO:0016829 GO:0019344 GO:0033847
8	0.06	0.373	5.15	0.03	0.66	1rqxC	GO:0008660 GO:0009310 GO:0016787 GO:0018871 GO:0030170
9	0.06	0.380	5.24	0.06	0.71	4d8uH	GO:0003824 GO:0005737 GO:0016829 GO:0019148 GO:0030170 GO:0046416
10	0.06	0.365	5.49	0.05	0.71	1j0aA	GO:0003824 GO:0008660 GO:0016787
11	0.06	0.355	5.30	0.04	0.66	2q3dA	GO:0004124 GO:0005576 GO:0005737 GO:0005829 GO:0006535 GO:0008652 GO:0016740 GO:0019344 GO:0030170 GO:0080146
12	0.06	0.367	5.37	0.03	0.70	1ofpB	GO:0003824 GO:0003849 GO:0008652 GO:0009058 GO:0009073 GO:0009423 GO:0016740
13	0.06	0.362	5.03	0.07	0.66	3bm5A	GO:0004124 GO:0006535
14	0.06	0.345	5.48	0.07	0.65	2d1fA	GO:0004795 GO:0005829 GO:0005886 GO:0006520 GO:0008652 GO:0009088 GO:0016829 GO:0030170 GO:0040007
15	0.06	0.347	5.33	0.05	0.65	3aexA	GO:0004795 GO:0006520 GO:0008652 GO:0009088 GO:0016829 GO:0030170
16	0.06	0.366	4.95	0.03	0.65	2zsjA	GO:0004795 GO:0005737 GO:0006520 GO:0008652 GO:0009088 GO:0016829 GO:0030170
17	0.06	0.391	5.06	0.04	0.68	4d9gB	GO:0005737 GO:0008838 GO:0009063 GO:0016829 GO:0030170
18	0.06	0.389	4.91	0.03	0.66	4d9kD	GO:0005737 GO:0008838 GO:0009063 GO:0016829 GO:0030170

Consensus prediction of GO terms

Molecular Function	GO:0016836	GO:0043168	GO:0048037
GO-Score	0.47	0.37	0.37
Biological Processes	GO:0006568	GO:0046219	GO:0009073	GO:1901607	GO:0006520	GO:0046394
GO-Score	0.47	0.47	0.47	0.47	0.47	0.47
Cellular Component	GO:0044424
GO-Score	0.37

(a)	Cscore^GO is a combined measure for evaluating global and local similarity between query and template protein. Its range is [0-1] and higher values indicate more confident predictions.
(b)	TM-score is a measure of global structural similarity between query and template protein.
(c)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(d)	IDEN^a is the percentage sequence identity in the structurally aligned region.
(e)	Cov represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.
(f)	The second table shows the consensus GO terms among the top-scoring templates. The GO-Score associated with each prediction is defined as the average weight of the GO term, where the weights are assigned based on the Cscore^GO of the template.


Please cite the following articles when you use the I-TASSER server:
1.	J Yang, R Yan, A Roy, D Xu, J Poisson, Y Zhang. The I-TASSER Suite: Protein structure and function prediction. Nature Methods, 12: 7-8, 2015.
2.	J Yang, Y Zhang. I-TASSER server: new development for protein structure and function predictions, Nucleic Acids Research, 43: W174-W181, 2015.
3.	A Roy, A Kucukural, Y Zhang. I-TASSER: a unified platform for automated protein structure and function prediction. Nature Protocols, 5: 725-738, 2010.
4.	Y Zhang. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics, 9: 40, 2008.