Abstract
This paper compares 22 different similarity coefficients when they are used for searching databases of 2D fragment bit-strings. Experiments with the National Cancer Institute's AIDS and IDAlert databases show that the coefficients fall into several well-marked clusters, with the members of each cluster producing comparable rankings of a set of molecules. These clusters provide a basis for selecting combinations of coefficients for use in data fusion experiments. The results of these experiments suggest that data fusion offers a simple way of increasing the effectiveness of fragment-based similarity searching systems.
Keywords: Inter-Molecular Similarity, 2D Fragment Bit-Strings