Background: The toxicity of potential drug candidates is one of the main focuses of the drug discovery process. However, molecular fingerprint-based in silico toxicity prediction methods usually suffer from data sparsity, and commonly used distance metrics, such as the Euclidean distance and the Manhattan distance, cannot accurately reflect the distance relationships between drug candidates. Applying appropriate pre-processing to this type of data before modelling is therefore critically important for successful toxicity prediction of drug candidates.
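
The sparsity issue can be illustrated with a small sketch. The fingerprints below are hypothetical toy data, not taken from the study, and the 1024-bit length is an assumed value chosen only for illustration. Under these assumptions, two molecules that share most of their features end up farther apart than two molecules that share none, simply because the second pair has fewer bits set.

```python
import math

def to_vector(on_bits, n_bits):
    """Expand a set of 'on' bit positions into a dense 0/1 list."""
    return [1 if i in on_bits else 0 for i in range(n_bits)]

def euclidean(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

N_BITS = 1024  # assumed fingerprint length (hypothetical)

# Hypothetical pair 1: six bits each, four of them shared.
a = to_vector({0, 1, 2, 3, 4, 5}, N_BITS)
b = to_vector({0, 1, 2, 3, 6, 7}, N_BITS)

# Hypothetical pair 2: one bit each, nothing shared.
c = to_vector({10}, N_BITS)
d = to_vector({11}, N_BITS)

print("pair 1 (4 shared bits):", manhattan(a, b), euclidean(a, b))
print("pair 2 (0 shared bits):", manhattan(c, d), euclidean(c, d))
```

In this toy setting both metrics rank the pair with no common features as the closer one, which is one way sparse binary fingerprints can distort distance relationships.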