Paul E. Blower, Jr.
Systematic analysis of large screening sets for drug discovery
Echeminfo

Paul Blower 1, Kevin Cross 1, Michael Fligner 2, Glenn Myatt 1, Joseph Verducci 2, and Chihae Yang 1

We have developed a novel, systematic process for analysing the structure-activity relationships (SAR) of large, heterogeneous data sets. It combines existing techniques as components of a common analysis workflow that can be adapted to individual cases. We first filter the initial screening set to remove compounds that would be unsuitable as lead compounds. We then group the active compounds into structurally meaningful classes and perform an outlier analysis of the classification results. For each active class, we identify key macrostructural features by reassembling structural building blocks common to the class. The algorithm can be parameterized to meet differing objectives, yielding (1) features that discriminate for biological activity, (2) scaffolds for R-group analysis, or (3) features that discriminate for membership in the class. The macrostructures that discriminate for activity are useful descriptors for building local prediction models; others provide the basis for R-group analysis to further refine the SAR within the class. This suite of tools can help pharmaceutical researchers make better use of the vast quantities of information already residing in pharmaceutical databases, select more and better-qualified lead series, and design follow-up experiments for optimization studies more effectively.
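
The workflow described above (filter, group actives, find discriminating features) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the compound records, the building-block labels, the molecular-weight cutoff, the unwanted-block list, and all function names are hypothetical, and a pair of co-occurring building blocks stands in, very roughly, for a "macrostructure". The outlier-analysis step is omitted.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; "blocks" are made-up building-block labels.
COMPOUNDS = [
    {"id": "c1", "mw": 310.0, "blocks": {"indole", "amide"}, "active": True},
    {"id": "c2", "mw": 325.0, "blocks": {"indole", "amide", "halide"}, "active": True},
    {"id": "c3", "mw": 298.0, "blocks": {"indole", "ester"}, "active": False},
    {"id": "c4", "mw": 720.0, "blocks": {"indole", "amide"}, "active": True},
    {"id": "c5", "mw": 280.0, "blocks": {"pyridine", "amide"}, "active": False},
    {"id": "c6", "mw": 305.0, "blocks": {"acyl_halide", "amide"}, "active": True},
]

MAX_MW = 500.0              # assumed lead-likeness cutoff, for illustration only
UNWANTED = {"acyl_halide"}  # assumed list of undesirable building blocks


def filter_leadlike(compounds):
    """Drop compounds that are too heavy or contain an unwanted block."""
    return [
        c for c in compounds
        if c["mw"] <= MAX_MW and not (c["blocks"] & UNWANTED)
    ]


def group_actives(compounds):
    """Assign each active compound to one class per building block it contains."""
    classes = defaultdict(list)
    for c in compounds:
        if c["active"]:
            for block in sorted(c["blocks"]):
                classes[block].append(c["id"])
    return dict(classes)


def discriminating_pairs(compounds, min_support=2, min_active_rate=0.8):
    """Return block pairs seen in at least min_support compounds whose
    fraction of active compounds is at least min_active_rate."""
    counts = defaultdict(lambda: [0, 0])  # pair -> [n_active, n_total]
    for c in compounds:
        for pair in combinations(sorted(c["blocks"]), 2):
            counts[pair][0] += int(c["active"])
            counts[pair][1] += 1
    return {
        pair: (n_active, n_total)
        for pair, (n_active, n_total) in counts.items()
        if n_total >= min_support and n_active / n_total >= min_active_rate
    }


kept = filter_leadlike(COMPOUNDS)
classes = group_actives(kept)
pairs = discriminating_pairs(kept)
```

Changing `min_support` and `min_active_rate` loosely mirrors the idea of parameterizing the search for different objectives; a real analysis would operate on chemical structures rather than on string labels.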

1 LeadScope, Inc., 2 The Ohio State University, Columbus, OH, USA