Abstract
The National Cancer Institute Division of Cancer Treatment has revised its drug-screening program. About 230,000 compounds in our repository are available for screening under the new protocol. This paper is the first report on an attempt to extract a representative sample of these compounds by clustering. It describes how the clustering method was established on an initial sample of 4980 compounds. The clustering algorithm is fairly simple. However, the molecular fragments used to match the compounds are somewhat complex, because they must distinguish among a large number of compounds.