Tackling Malaria
Cluster Analysis
As might be expected, the cluster analysis shows that many of the compounds are singletons with no similar analogues in the data set. The histogram below shows the distribution of cluster sizes using the MACCS fingerprints; around 8,500 compounds are singletons. The results using MCS and 3PP clustering show a similar profile, but of course the members of the clusters are very different.
Many compounds are singletons, but there are a number of fairly significant clusters.
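As a rough illustration of how a cluster-size distribution like this can be tallied, the sketch below counts how many clusters exist at each size and reads off the number of singletons. It assumes the clustering step has already produced a mapping from compound identifier to cluster number. The function name `cluster_size_distribution`, the compound identifiers, and the toy assignments are all hypothetical and are not taken from the actual data set.

```python
from collections import Counter


def cluster_size_distribution(assignments):
    """Return a mapping of cluster size -> number of clusters of that size.

    `assignments` maps a compound identifier to its cluster number
    (a hypothetical input format, assumed for this sketch).
    """
    # First pass: cluster number -> number of member compounds.
    members_per_cluster = Counter(assignments.values())
    # Second pass: member count -> number of clusters with that count.
    return Counter(members_per_cluster.values())


# Toy data for illustration only; not real cluster assignments.
assignments = {
    "cpd-001": 9407, "cpd-002": 9407, "cpd-003": 9407,
    "cpd-004": 1024, "cpd-005": 1024,
    "cpd-006": 17, "cpd-007": 18, "cpd-008": 19,
}

distribution = cluster_size_distribution(assignments)
singletons = distribution[1]  # clusters containing exactly one compound

for size in sorted(distribution):
    print(f"cluster size {size}: {distribution[size]} cluster(s)")
print(f"singletons: {singletons}")
```

The same tally can be run on the output of each clustering method (MACCS, MCS, 3PP) to compare their profiles, and the resulting counts are what a histogram of cluster sizes would plot.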
MACCS Clusters |
Cluster Number | Cluster Size |
9407 | 53 |
1024 | 51 |
4405 | 41 |
1561 | 39 |
60 | 34 |
1495 | 33 |
4107 | 25 |
8641 | 24 |
4083 | 23 |
2801 | 21 |
991 | 20 |
1127 | 20 |
4882 | 17 |
5681 | 17 |
9426 | 16 |
750 | 15 |
944 | 15 |
1123 | 15 |
1183 | 15 |
1395 | 15 |
2967 | 15 |
6660 | 14 |
1212 | 13 |
2065 | 13 |
5672 | 13 |
1131 | 12 |
2218 | 12 |
2390 | 12 |
5798 | 12 |
7257 | 12 |
7476 | 12 |
9448 | 12 |
8139 | 11 |
930 | 10 |
951 | 10 |
1093 | 10 |
1130 | 10 |
3190 | 10 |
4097 | 10 |
8112 | 10 |
8253 | 10 |
MCS Clusters |
Cluster Number | Cluster Size |
16 | 1,874 |
12 | 1,435 |
34 | 1,041 |
105 | 652 |
19 | 631 |
10 | 595 |
9 | 576 |
97 | 518 |
67 | 435 |
36 | 411 |
35 | 327 |
89 | 317 |
2 | 304 |
7 | 284 |
112 | 258 |
51 | 256 |
109 | 249 |
103 | 215 |
107 | 214 |
39 | 206 |
114 | 178 |
37 | 153 |
91 | 106 |
8 | 103 |
106 | 103 |
26 | 98 |
111 | 94 |
104 | 84 |
14 | 83 |
110 | 73 |
1 | 72 |
11 | 69 |
29 | 67 |
53 | 66 |
144 | 64 |
32 | 62 |
123 | 62 |
65 | 61 |
119 | 60 |
50 | 56 |
3PP Clusters |
Cluster Number | Cluster Size |
3019 | 64 |
1028 | 58 |
187 | 45 |
3600 | 44 |
4834 | 41 |
4194 | 40 |
433 | 34 |
2846 | 34 |
1277 | 33 |
1851 | 32 |
7370 | 32 |
4907 | 30 |
509 | 28 |
2586 | 28 |
6456 | 27 |
855 | 26 |
3021 | 26 |
6709 | 26 |
57 | 25 |
7583 | 25 |
4859 | 24 |
6532 | 24 |
1171 | 23 |
1565 | 23 |
1934 | 23 |
1937 | 23 |
6666 | 23 |
7398 | 23 |
949 | 22 |
6272 | 22 |
3539 | 21 |
3672 | 21 |
1015 | 20 |