Chemical space networks: a powerful new paradigm for the description of chemical space.

Significance Statement

The notion of chemical space plays a significant role in drug discovery: it provides a framework for organizing the vast array of compounds in the chemical universe and a systematic way of describing structural and functional relationships among them. Much like our own cosmos, chemical space is envisioned as numerous clusters of compounds scattered throughout the space, much as stars are scattered through galaxies. Unlike our cosmos, however, chemical space is generally of much higher dimension, and compounds located close to one another are considered to be structurally similar and to possess similar chemical and biological properties, features not necessarily shared by their cosmic analogues. Although this coordinate-based representation provides an intuitive picture of chemical space, it is not without issues. One issue is the continuous nature of the space, a characteristic that is somewhat incompatible with the inherently discrete nature of the sets of compounds that populate it. Another issue, alluded to earlier, is the high dimensionality of the space, which gives rise to the ‘curse of dimensionality’ and can cause serious problems with compound distributions that are not experienced in lower-dimensional spaces. These spaces are large and complex enough that typical dimensionality reduction techniques can also be problematic. Lastly, chemical space is not invariant to the choice of coordinate system. This is a significant issue, since relationships among compounds in one coordinate system may not be preserved in another. In the two papers showcased here, chemical space networks are proposed as an attractive alternative for representing chemical space, since they avoid some of the issues associated with coordinate-based representations.
Importantly, their structure is entirely compatible with the inherently discrete nature of chemical space, in contrast to coordinate-based representations. Moreover, since networks are widely used in many fields today, their application to chemical space navigation can benefit significantly from new developments in these fields. For example, identifying compounds similar to known actives is considerably easier in chemical space networks than in coordinate-based representations. Another example involves graph databases, an emerging technology with considerable potential for handling chemical space networks. Although chemical space networks are rather new to the chemical informatics field, they possess many desirable features, which suggests that they might have a significant impact on the way the chemical space paradigm is implemented and applied to enhance drug discovery.

Figure Legend: Shown is an exemplary chemical space network for a set of active compounds (nodes). Edges represent similarity relationships. Nodes are color-coded by compound potency using a continuous spectrum from green (low potency) to red (high).
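
The node-and-edge picture in the figure can be sketched in a few lines of code. The example below is a minimal illustration, not the authors' method: it assumes each compound is described by a set of structural feature IDs, that similarity is measured with the Tanimoto coefficient, and that an edge is drawn when similarity reaches a fixed threshold. The compound names, feature sets, threshold value, and function names are all hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_csn(fingerprints, threshold):
    """Build a network as an adjacency dict: compound -> set of neighbors.

    An edge joins two compounds when their similarity is >= threshold
    (the thresholding rule is an assumption made for this sketch).
    """
    network = {name: set() for name in fingerprints}
    for u, v in combinations(fingerprints, 2):
        if tanimoto(fingerprints[u], fingerprints[v]) >= threshold:
            network[u].add(v)
            network[v].add(u)
    return network


# Hypothetical toy data: feature IDs stand in for structural fingerprints.
fingerprints = {
    "cpd_A": {1, 2, 3, 4},
    "cpd_B": {1, 2, 3, 5},
    "cpd_C": {2, 3, 4, 6},
    "cpd_D": {7, 8, 9},
}

csn = build_csn(fingerprints, threshold=0.5)

# Compounds similar to a known active are simply its neighbors in the network.
print(sorted(csn["cpd_A"]))
```

In this representation, finding compounds similar to a known active reduces to reading off that node's neighbors, with no coordinates or distance calculations in a high-dimensional space needed at query time.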

Journal Reference

Maggiora GM (1,2), Bajorath J (3). J Comput Aided Mol Des. 2014;28(8):795-802.

Affiliations

1. University of Arizona BIO5 Institute, 1657 East Helen Street, Tucson, AZ, 85721, USA.
2. Translational Genomics Research Institute, 445 North Fifth Street, Phoenix, AZ, 85004, USA.
3. Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, 53113, Bonn, Germany.


The concept of chemical space is playing an increasingly important role in many areas of chemical research, especially medicinal chemistry and chemical biology. It is generally conceived as consisting of numerous compound clusters of varying sizes scattered throughout the space in much the same way as galaxies of stars inhabit our universe. A number of issues associated with this coordinate-based representation are discussed, not the least of which is the continuous nature of the space, a feature not entirely compatible with the inherently discrete nature of chemical space. Cell-based representations, which are derived from coordinate-based spaces, have also been developed to facilitate a number of chemical informatic activities (e.g., diverse subset selection, filling ‘diversity voids’, and comparing compound collections). These representations generally suffer from the ‘curse of dimensionality’. In this work, networks are proposed as an attractive paradigm for representing chemical space since they circumvent many of the issues associated with coordinate- and cell-based representations, including the curse of dimensionality. In addition, their relational structure is entirely compatible with the intrinsic nature of chemical space. A description of the features of these chemical space networks is presented that emphasizes their statistical characteristics and indicates how they are related to various types of network topologies that exhibit random, scale-free, and/or ‘small world’ properties.
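
To make "statistical characteristics" concrete, the sketch below computes three descriptors commonly used to tell network topologies apart: edge density, the degree distribution, and the mean local clustering coefficient. Choosing these particular statistics is an assumption made for illustration; the abstract does not say which measures the paper uses. The toy network and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def edge_density(network):
    """Fraction of all possible undirected edges that are present."""
    n = len(network)
    edges = sum(len(nbrs) for nbrs in network.values()) / 2
    possible = n * (n - 1) / 2
    return edges / possible if possible else 0.0


def degree_distribution(network):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in network.values())


def local_clustering(network, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = network[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in network[u])
    return 2 * links / (k * (k - 1))


def mean_clustering(network):
    """Average of the local clustering coefficients over all nodes."""
    return sum(local_clustering(network, n) for n in network) / len(network)


# Hypothetical toy network as an adjacency dict (undirected, symmetric).
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

print(edge_density(toy))
print(dict(degree_distribution(toy)))
print(mean_clustering(toy))
```

As a rough guide, a heavy-tailed degree distribution is the usual signature of a scale-free network, while high clustering combined with short path lengths is the usual signature of a 'small world' network. Path lengths are not computed in this sketch.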
