Design and characterization of chemical space networks for different compound data sets.

Zwierzyna M(1), Vogt M(1), Maggiora GM(2,3), Bajorath J(1).

J Comput Aided Mol Des. 2015 Feb;29(2):113-25.

1. B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Department of Life Science Informatics, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, 53113, Bonn, Germany.

2. BIO5 Institute, University of Arizona, 1657 East Helen Street, Tucson, AZ, 85721, USA.

3. Translational Genomics Research Institute, 445 North Fifth Street, Phoenix, AZ, 85004, USA.

 

Abstract 

Chemical Space Networks (CSNs) are generated for different compound data sets on the basis of pairwise similarity relationships. Such networks are thought to complement and further extend traditional coordinate-based views of chemical space. Our proof-of-concept study focuses on CSNs based upon fingerprint similarity relationships calculated using the conventional Tanimoto similarity metric. The resulting CSNs are characterized with statistical measures from network science and compared in different ways. We show that the homophily principle, which is widely considered in the context of social networks, is a major determinant of the topology of CSNs of bioactive compounds, designed as threshold networks, typically giving rise to community structures. Many properties of CSNs are influenced by numerical features of the conventional Tanimoto similarity metric and largely dominated by the edge density of the networks, which depends on chosen similarity threshold values. However, properties of different CSNs with constant edge density can be directly compared, revealing systematic differences between CSNs generated from randomly collected or bioactive compounds.
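
As an illustration of the threshold-network idea described in the abstract, the following minimal Python sketch builds a network from pairwise Tanimoto similarities and reports its edge density. It is not the authors' implementation. The fingerprints are hypothetical toy data represented as sets of on-bit indices, the function names are invented for this example, and connecting pairs whose similarity is greater than or equal to the threshold (rather than strictly greater) is an assumption.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def threshold_network(fingerprints, threshold):
    """Return edges (i, j) for compound pairs with similarity >= threshold.

    Using >= rather than > is an assumption made for this sketch.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(fingerprints), 2)
        if tanimoto(a, b) >= threshold
    ]


def edge_density(n_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    max_edges = n_nodes * (n_nodes - 1) / 2
    return len(edges) / max_edges if max_edges else 0.0


# Hypothetical toy fingerprints; real ones would come from a cheminformatics toolkit.
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({1, 2, 6, 7}),
    frozenset({8, 9, 10}),
]

for t in (0.5, 0.3):
    edges = threshold_network(fps, t)
    print(t, edges, edge_density(len(fps), edges))
```

Lowering the threshold from 0.5 to 0.3 raises the number of edges in this toy set from one to three, which mirrors the abstract's point that edge density depends on the chosen similarity threshold. Comparing networks at constant edge density would then mean choosing a separate threshold for each data set so that the resulting densities match.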