________________________________________________________________________________
Note that results from all three values of IBasis (1, 2 and 3) have 6 secondary structures, but the values from IBasis=2 differ from the other two sets because IBasis=2 uses different structure assignments. The results for IBasis=1 and 3 are similar; the two reference sets differ in wavelength range (178-260 nm and 185-240 nm) and in the number of proteins (29 and 37).
The header details the references and the options (PRINT and IBASIS) selected. It also indicates the selected reference set of proteins and the types of secondary structure fractions obtained.
IPrint IBasis
0 3
Reference PROTEIN Set Selected:
CDDATA.37; SSDATA.37
Structures: Helix1, Helix2, Strand1, Strand2, Turns and Unordered
The wavelength range: 260.0 178.0, is different from that of
the Database proteins: 240.0 185.0
The No. of CD points = 83
The wavelengths are modified to be within the valid range
Beginning wavelength: 240.00; Ending wavelength: 185.00
Total No of CD points 56
Beginning wavelength: 240.00; Ending wavelength: 185.00
Total No of CD points 56
Secondary structural elements= 6
Multipilcation FACTOR for CD spectrum= 1.000
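The header above reports that an input spectrum covering 260-178 nm (83 points) is cut down to the 240-185 nm range of the reference set (56 points). A minimal Python sketch of that trimming step follows. It assumes 1 nm spacing and data ordered from long to short wavelength; the function name trim_to_reference is hypothetical, and this is not the program's own code.

```python
def trim_to_reference(wavelengths, cd_values, ref_start=240.0, ref_end=185.0):
    """Keep only the points whose wavelength lies inside the reference range."""
    lo, hi = min(ref_start, ref_end), max(ref_start, ref_end)
    kept = [(w, v) for w, v in zip(wavelengths, cd_values) if lo <= w <= hi]
    return [w for w, _ in kept], [v for _, v in kept]


# Assumed input grid: 260, 259, ..., 178 nm with placeholder CD values.
wavelengths = [float(w) for w in range(260, 177, -1)]
cd_values = [0.0] * len(wavelengths)

trimmed_w, trimmed_cd = trim_to_reference(wavelengths, cd_values)
print(len(wavelengths), len(trimmed_w))
```

With a 1 nm grid the point counts agree with the 83 and 56 reported in the header.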
SAMPLE CD: FILE from TEST
-.54 -.67 -.82 -.99 -1.19 -1.41 -1.65 -1.92 -2.20 -2.49
-2.79 -3.11 -3.43 -3.71 -3.93 -4.10 -4.22 -4.29 -4.32 -4.31
-4.28 -4.22 -4.15 -4.06 -3.97 -3.87 -3.78 -3.71 -3.67 -3.64
-3.60 -3.53 -3.36 -3.05 -2.60 -2.04 -1.40 -.68 .22 1.38
2.68 4.00 5.32 6.63 7.74 8.46 8.91 9.20 9.27 9.02
8.45 7.60 6.63 5.69 4.80 3.95
Number of Proteins in the database= 37
The SAMPLE data from: TEST
Ordered DELTA(CD) values :
.0000 .7942 .8554 1.3308 1.5276 1.7137 1.8931 1.9142
1.9638 2.0399 2.1560 2.1766 2.1869 2.2606 2.3609 2.5331
2.5834 2.9485 3.1310 3.1741 3.2208 3.3008 3.3297 3.4069
3.4870 3.5328 3.6343 3.9341 3.9860 4.3677 4.8802 4.9138
5.0270 5.0321 5.0449 5.1182 5.3781
Ordered List of PROTEINS :
TEST THML TPI RHD SUBN ECOR PGK T4LS
LYSM CYTC SUBB GPD HMRT ADH FLVD GRS
AZU BLAC RNAS HBN GFP PRAL PAPN IFBP
ColA CONA PPSN SUDS TNF CHYT MGLB BNJN
CGA GCR ELAS CANH BTOX
IGUESS = 0; The structure of the Protein with closest CD spectrum: THML
Initial Guess: .282 .133 .070 .095 .215 .206
H(r) H(d) S(r) S(d) Trn Unrd SUM
.278 .146 .037 .029 .106 .165 .762
Helix From H & J Method: .425
ITER: 1; AVE OF 49 SOLN:
.263 .160 .070 .067 .195 .258
SUM of SECNDRY STRUCTURE: 1.013
RMSD with Previous Guess: .0288
ITER: 2; AVE OF 48 SOLN:
.260 .164 .070 .063 .192 .264
SUM of SECNDRY STRUCTURE: 1.014
RMSD with Previous Guess: .0038
SOLN. CONVERGED: 2 ITERATIONS
MinSol: 1 SUM < .050
Solution at the END of FIRST STAGE:
H(r) H(d) S(r) S(d) Trn Unrd SUM
.260 .164 .070 .063 .192 .264 1.014
First Part Completed. The results roughly correspond
to SELCON and SELCON1 with only 2 selection rules
I NSol Bas NS H(r) H(d) S(r) S(d) Trn Unrd SUM RRcn RExp
1 3 6 3 .219 .166 .086 .070 .203 .303 1.047 .247 .055
2 4 7 1 .218 .152 .069 .074 .214 .297 1.024 .349 .082
3 5 7 2 .225 .157 .072 .077 .217 .297 1.044 .282 .055
4 8 8 1 .239 .158 .061 .063 .184 .251 .957 .349 .063
5 9 8 2 .244 .161 .061 .063 .181 .244 .953 .282 .067
.
.
38 162 28 1 .276 .164 .087 .062 .184 .256 1.029 .349 .160
39 166 28 5 .253 .148 .077 .067 .174 .280 .998 .223 .616
40 170 29 1 .276 .164 .089 .063 .185 .258 1.035 .349 .161
41 178 30 1 .276 .164 .090 .064 .186 .261 1.041 .349 .178
TotSOL > 1
Limits: ABS(Sum-1.0) < .050; Each Fraction > -.030
RmsCD(Exp,Cal) < .250
SOLUTION AT SECOND STAGE (SELCON2):
Average Solution From 41 Solutions:
H(r) H(d) S(r) S(d) Trn Unrd SUM RRcn
.258 .164 .067 .063 .194 .267 1.013 .370
Second Part Completed. The results roughly
correspond SELCON2 with three selection rules
+--------------------------------------------------+
| Based on the SOLUTION the Number of SEGMENTS of |
| HELICES (Per 100 Residues) are: 4.112 |
| STRANDS (Per 100 Residues) are: 3.159 |
+--------------------------------------------------+
For YOUR PROTEIN multiply No. of SEGMENTS by
(Number of Residues) / 100
e.g., If Number of Residues = 153, Use a FACTOR of 1.53
Average Length of Segments Remains as Estimated
+--------------------------------------------------+
| Based on the SOLUTION Obtained |
| The AVERAGE LENGTH of HELICES : 10.265 |
| The AVERAGE LENGTH of STRANDS : 4.132 |
+--------------------------------------------------+
The solution at the end of the final valid stage (if there are no solutions at the end of the third stage, the solution at the end of the second stage is selected) is appended to the PROTSS.OUT file, which looks like the "SELCON3 IBasis=3 SStr:" line at the end of this listing.
Now the FOURTH SELECTION RULE is applied to the
Solutions That SATISFY the first THREE rules.
The HELIX FRACTION should be with in HLIMITS
Hmin .357 Hmax .443 HelHJ .425
HELIX: .434 HLIMITS: .404 ---> .464
1 5 8 2 .244 .161 .061 .063 .181 .244 .953 .067
2 6 9 1 .242 .167 .059 .064 .197 .255 .983 .066
3 7 9 2 .247 .165 .060 .062 .185 .243 .964 .146
4 8 10 1 .246 .173 .057 .061 .205 .268 1.010 .114
.
.
31 38 28 1 .276 .164 .087 .062 .184 .256 1.029 .160
32 40 29 1 .276 .164 .089 .063 .185 .258 1.035 .161
33 41 30 1 .276 .164 .090 .064 .186 .261 1.041 .178
Final Solution: Aver. of 33 Solns
PROT H(r) H(d) S(r) S(d) Trn Unrd SUM RRcn
TEST .264 .167 .066 .061 .193 .263 1.015 .359
+--------------------------------------------------+
| Based on the SOLUTION the Number of SEGMENTS of |
| HELICES (Per 100 Residues) are: 4.183 |
| STRANDS (Per 100 Residues) are: 3.067 |
+--------------------------------------------------+
For YOUR PROTEIN multiply No. of SEGMENTS by
(Number of Residues) / 100
e.g., If Number of Residues = 153, Use a FACTOR of 1.53
Average Length of Segments Remains as Estimated
+--------------------------------------------------+
| Based on the SOLUTION Obtained |
| The AVERAGE LENGTH of HELICES : 10.311 |
| The AVERAGE LENGTH of STRANDS : 4.146 |
+--------------------------------------------------+
SELCON3 IBasis=3 SStr: .264 .167 .066 .061 .193 .263
The calculated CD spectrum corresponding to the solution is also output in a text file (CALCCD.OUT) for importing into any graphics software.
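Each program in these listings ends with a one-line summary of the form "PROGRAM IBasis=n SStr:" followed by six fractions. The sketch below shows one way such a line could be read back in Python. The function name parse_protss_line is hypothetical, and the column names are an assumption taken from the H(r) H(d) S(r) S(d) Trn Unrd headings used in the listings.

```python
def parse_protss_line(line):
    """Split a 'PROGRAM IBasis=n SStr: f1 ... f6' line into its parts."""
    head, _, tail = line.partition("SStr:")
    program, ibasis_field = head.split()
    ibasis = int(ibasis_field.split("=")[1])
    # Assumed column order, following the headings in the listings.
    names = ["H(r)", "H(d)", "S(r)", "S(d)", "Trn", "Unrd"]
    fractions = dict(zip(names, (float(x) for x in tail.split())))
    return program, ibasis, fractions


program, ibasis, fractions = parse_protss_line(
    "SELCON3 IBasis=3 SStr: .264 .167 .066 .061 .193 .263"
)
helix = fractions["H(r)"] + fractions["H(d)"]    # total helix fraction
strand = fractions["S(r)"] + fractions["S(d)"]   # total strand fraction
print(program, ibasis, round(helix, 3), round(strand, 3))
```

Summing the two helix and the two strand columns gives total helix and strand contents that can be compared across SELCON3, CDSSTR and ContinLL.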
Program Modified by Sreerama. N.(1999),
For use with CDPro reference proteins
IPrint IBasis
0 3
Reference PROTEIN Set Selected:
CDDATA.37; SSDATA.37
Structures: Helix1, Helix2, Strand1, Strand2, Turns and Unordered
Number of Helices and Strands Calculated
The wavelength range: 260.0 178.0, is different from that of
the Database proteins: 240.0 185.0
The No. of CD points = 83
The wavelengths are modified to be within the valid range
Beginning wavelength: 240.00; Ending wavelength: 185.00
Total No of CD points 56
Secondary structural elements= 6
Multipilcation FACTOR for CD spectrum= 1.000
SAMPLE CD:
SAMPLE INPUT: Lactate Dehydrogenase CD DATA (178-260 nm)
-.54 -.67 -.82 -.99 -1.19 -1.41 -1.65 -1.92 -2.20 -2.49
-2.79 -3.11 -3.43 -3.71 -3.93 -4.10 -4.22 -4.29 -4.32 -4.31
-4.28 -4.22 -4.15 -4.06 -3.97 -3.87 -3.78 -3.71 -3.67 -3.64
-3.60 -3.53 -3.36 -3.05 -2.60 -2.04 -1.40 -.68 .22 1.38
2.68 4.00 5.32 6.63 7.74 8.46 8.91 9.20 9.27 9.02
8.45 7.60 6.63 5.69 4.80 3.95
Proteins: 36; CD data: 56; Sec Str: 6
NbasCD= 36 Nwave= 56 Npro= 1 ncomb=400 icombf=100000 Nsstr= 6
The solution is appended to the PROTSS.OUT file, which looks like the "CDSSTR IBasis=3 SStr:" line at the end of this listing.
Valid Solutions (combinations):
Number SComb H(r) H(d) S(r) S(d) Trn Unrd
Hmin: .44; Hmax: .50
1 3 .29 .17 .16 .10 .12 .18
2 4 .30 .15 .01 .07 .21 .27
.
.
91 394 .26 .19 .05 .07 .19 .27
92 396 .29 .16 -.03 .06 .23 .24
93 398 .30 .17 .11 .06 .12 .25
Predicted: H(r)= .29 H(d)= .18 S(r)= .06 S(d)= .06 Trn = .17 Unrd= .24
isstr=400 icntf= 93
+--------------------------------------------------+
| Based on the SOLUTION the Number of SEGMENTS of |
| HELICES (Per 100 Residues) are: 4.418 |
| STRANDS (Per 100 Residues) are: 3.235 |
+--------------------------------------------------+
For YOUR PROTEIN multiply No. of SEGMENTS by
(Number of Residues) / 100
e.g., If Number of Residues = 153, Use a FACTOR of 1.53
Average Length of Segments Remains as Estimated
+--------------------------------------------------+
| Based on the SOLUTION Obtained |
| The AVERAGE LENGTH of HELICES : 10.453 |
| The AVERAGE LENGTH of STRANDS : 3.990 |
+--------------------------------------------------+
CDSSTR IBasis=3 SStr: .285 .177 .064 .065 .173 .240
The reconstructed CD spectrum is output in a text file (RECONCD.OUT) for importing into any graphics software.
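The segment boxes report helix and strand segments per 100 residues and ask the user to multiply by (Number of Residues) / 100. As a small worked example in Python, using the CDSSTR values above and the 153-residue case quoted in the output (the function name segments_for_protein is hypothetical):

```python
def segments_for_protein(segments_per_100, n_residues):
    """Scale a per-100-residue segment count to a protein of n_residues."""
    return segments_per_100 * n_residues / 100.0


# Values from the CDSSTR boxes above; 153 residues gives the factor 1.53.
helices = segments_for_protein(4.418, 153)
strands = segments_for_protein(3.235, 153)
print(f"helical segments: {helices:.2f}, strand segments: {strands:.2f}")
```

As the output itself notes, the average segment lengths are not rescaled; only the segment counts depend on the chain length.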
The header details the references and the options (PRINT and IBASIS) selected. It also indicates the selected reference set of proteins and the types of secondary structure fractions obtained.
Program Modified by Sreerama. N.(1999),
For use with CDPro reference proteins
IPrint IBasis
0 3
Reference PROTEIN Set Selected:
CDDATA.37; SSDATA.37
Structures: Helix1, Helix2, Strand1, Strand2, Turns and Unordered
Number of Helices and Strands Calculated
The wavelength range: 260.0 178.0, is different from that of
the Database proteins: 240.0 185.0
The No. of CD points = 83
The wavelengths are modified to be within the valid range
Beginning wavelength: 240.00; Ending wavelength: 185.00
Total No of CD points 56
Secondary structural elements= 6
Multipilcation FACTOR for CD spectrum= 1.000
SAMPLE CD: FILE from Test
-.54 -.67 -.82 -.99 -1.19 -1.41 -1.65 -1.92 -2.20 -2.49
-2.79 -3.11 -3.43 -3.71 -3.93 -4.10 -4.22 -4.29 -4.32 -4.31
-4.28 -4.22 -4.15 -4.06 -3.97 -3.87 -3.78 -3.71 -3.67 -3.64
-3.60 -3.53 -3.36 -3.05 -2.60 -2.04 -1.40 -.68 .22 1.38
2.68 4.00 5.32 6.63 7.74 8.46 8.91 9.20 9.27 9.02
8.45 7.60 6.63 5.69 4.80 3.95
Beginning wavelength: 240.00; Ending
wavelength: 185.00
Total No of CD points 56
Secondary structural elements= 6
Multipilcation FACTOR for CD spectrum= 1.000
Ordered DELTA(CD) values :
.794 .855 1.331 1.528 1.714 1.893 1.914 1.964 2.040 2.156
2.177 2.187 2.261 2.361 2.533 2.583 2.949 3.131 3.174 3.221
3.301 3.330 3.407 3.487 3.533 3.634 3.934 3.986 4.368 4.880
4.914 5.027 5.032 5.045 5.118 5.378
Ordered List of PROTEINS :
7 5 36 15 10 9 4 6 8 12
13 3 33 11 35 20 18 16 2 31
23 14 30 29 24 17 27 26 19 1
25 32 22 21 34 28
Variable selection is performed by successively removing the protein least similar to the test protein; the solutions that satisfy the Helix_rule are listed and averaged to give the final solution.
CONTIN gives a list of solutions for varying values of the regularizer, and the solution with the least standard error is selected as the best solution. The CONTIN-chosen solution and the solution based on degrees of freedom are summarized in SUMMARY.PG; the complete CONTIN output can be generated using the IPRINT option. With IPRINT=0, only the final CONTIN output (with number of proteins = 6) is written to CONTIN.OUT.
VALID SOLUTIONS:
ISol N H(r) H(d) S(r) S(d) Turn Unrd
36 1 .258 .146 .073 .077 .178 .268
35 2 .257 .146 .076 .078 .178 .265
34 3 .256 .146 .076 .077 .179 .266
33 4 .258 .147 .077 .077 .175 .266
32 5 .253 .144 .077 .077 .184 .265
.
.
.
10 27 .252 .171 .083 .072 .178 .245
9 28 .242 .171 .076 .070 .186 .255
8 29 .249 .176 .074 .064 .179 .257
7 30 .248 .175 .075 .065 .178 .260
AVE(30): .256 .155 .064 .069 .184 .272
--------------------------------------------------------------------------------
+--------------------------------------------------+
| Based on the SOLUTION the Number of SEGMENTS of |
| HELICES (Per 100 Residues) are: 3.872 |
| STRANDS (Per 100 Residues) are: 3.432 |
+--------------------------------------------------+
For YOUR PROTEIN multiply No. of SEGMENTS by
(Number of Residues) / 100
e.g., If Number of Residues = 153, Use a FACTOR of 1.53
Average Length of Segments Remains as Estimated
+--------------------------------------------------+
| Based on the SOLUTION Obtained |
| The AVERAGE LENGTH of HELICES : 10.623 |
| The AVERAGE LENGTH of STRANDS : 3.868 |
+--------------------------------------------------+
ContinLL IBasis=3 SStr: .256 .155 .064 .069 .184 .272
The file SUMMARY.PG contains the three solutions from CONTIN (1. the CONTIN-chosen solution, 2. the solution based on minimum degrees of freedom, and 3. the solution with minimum error) for all reference sets generated by variable selection. It looks like:
--------------------------------------------------------------------------------
Number of Proteins: 36
[1] CONTIN-Chosen Solution; ISol= 12 .37 .22 .19 .10 .08 .05
[2] Ave of 5 Solns with Deg Freed: 7-16 .26 .14 .07 .08 .18 .27
[3] Solution with Min. Error; Isol= 9 .26 .15 .07 .08 .18 .27
--------------------------------------------------------------------------------
Number of Proteins: 35
[1] CONTIN-Chosen Solution; ISol= 12 .37 .22 .19 .10 .08 .05
[2] Ave of 5 Solns with Deg Freed: 7-16 .26 .14 .08 .08 .17 .26
[3] Solution with Min. Error; Isol= 9 .26 .15 .08 .08 .18 .26
.
.
.
.
.
Number of Proteins: 7
[1] CONTIN-Chosen Solution; ISol= 12 .24 .18 .08 .06 .17 .27
[2] Ave of 11 Solns with Deg Freed: 7-16 .21 .20 .08 .06 .15 .30
[3] Solution with Min. Error; Isol= 11 .25 .17 .08 .06 .18 .26
--------------------------------------------------------------------------------
Number of Proteins: 6
[1] CONTIN-Chosen Solution; ISol= 12 .21 .18 .09 .07 .16 .29
[2] Ave of 11 Solns with Deg Freed: 7-16 .19 .20 .09 .06 .15 .32
[3] Solution with Min. Error; Isol= 11 .21 .17 .09 .07 .18 .28
********************************************************************************
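The SUMMARY.PG entries shown above all end with six fractions, preceded by a bracketed solution type and a free-text label. A hedged Python sketch for pulling those fields out of one line follows. The function name parse_summary_line is hypothetical, and the layout is inferred only from the sample lines above, not from a format specification.

```python
def parse_summary_line(line):
    """Return (kind, label, fractions) for a '[n] ...' line, else None."""
    line = line.strip()
    if not line.startswith("["):
        return None  # separators and 'Number of Proteins' lines are skipped
    tokens = line.split()
    kind = int(tokens[0].strip("[]"))
    label = " ".join(tokens[1:-6])
    # Assumption: the last six tokens are the secondary structure fractions.
    fractions = [float(t) for t in tokens[-6:]]
    return kind, label, fractions


sample = "[3] Solution with Min. Error; Isol= 9 .26 .15 .07 .08 .18 .27"
print(parse_summary_line(sample))
print(parse_summary_line("Number of Proteins: 36"))
```

Taking the last six whitespace-separated tokens avoids having to interpret the label text, which differs between the three solution types.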
Header details the references and the reference sets of proteins.
Determination of TERTIARY CLASS using the method of
Venyaminov and Vassilenko, Anal. Biochem. 222, 176 (1994)
Program Modified by Sreerama. N.(2000) for use with CDPro
The TERTIARY CLASSES are All-Alpha, All-beta, Alpha-Beta, and Denatured
CD analysis could be performed for former three classes
________________________________________________________________________________
The wavelength range: 260.0 178.0, is different from that of
Anal.Biochem.Paper: 236.0 190.0
The No. of CD points = 83
The wavelengths are modified to be within the valid range
Beginning wavelength: 190.00; Ending wavelength: 236.00
Total No of CD points 26
8.450 9.270 8.910 7.740 5.320 2.680 .220 -1.400 -2.600 -3.360
-3.600 -3.670 -3.780 -3.970 -4.150 -4.280 -4.320 -4.220 -3.930 -3.430
-2.790 -2.200 -1.650 -1.190
1.000
Class Score
--------------------------
All Alpha 3
Alfa+Beta 2
Alfa/Beta 4
All Beta 1
Denatured 0
Maximum Score Obtained: 4
PREDICTED Tertiary Class: Alfa/Beta
Predicted class and creation of class-specific files.
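In the score table above, the predicted tertiary class is the one with the maximum score. A minimal Python illustration using the scores from the listing is given below; how the program itself breaks ties is not stated in the output, so this sketch simply returns the first maximum it finds.

```python
# Scores copied from the sample output above.
scores = {
    "All Alpha": 3,
    "Alfa+Beta": 2,
    "Alfa/Beta": 4,
    "All Beta": 1,
    "Denatured": 0,
}

# Pick the class with the highest score (first one wins on a tie).
predicted = max(scores, key=scores.get)
print("Maximum Score Obtained:", scores[predicted])
print("PREDICTED Tertiary Class:", predicted)
```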