Abstract
Similarity studies are important in chemistry, and their applications range from the periodic table to the screening of large databases in the search for new drugs. In the latter case, it is assumed that similarity in molecular structure is related to similarity in reactivity. However, we argue that structural formulas can be regarded as abstract representations emerging from the analysis of large amounts of data on chemical reactivity. Hence, chemical formulas such as organic functions are not direct pictures of the atomic constitution of matter, but signs used to represent similarity in the reactivity of a class of substances. Reactivity, rather than molecular structure, therefore becomes the fundamental feature of chemical substances. Because reactivity is fundamental, chemical identity is given by the relations substances establish with one another, which give rise to a network of chemical reactions. We explore similarity in this network rather than in molecular structure. By characterising each substance in terms of the substances related to it, we show how Category Theory helps in this description. We then study the similarity among substances using topological spaces, which leads to concepts such as closure and neighbourhood that formalise the intuition of things lying near one another. The second focus of the chapter is the potential of closure operators, and of topological closures in particular, as more general descriptors of chemical similarity. As we introduce the formalism, we develop a worked example: the analysis of similarity among chemical elements with regard to their ability to combine into binary compounds. The results show that several of the trends of chemical elements are recovered by this approach.
Keywords: Binary compounds, category theory, chemical classification, chemical networks, closure, closure operators, directed hypergraphs, formal concept analysis, graph theory, network theory, order theory, periodic table, reaction networks, similarity, topology.