In chemistry and related fields, researchers often need to look up a compound's structure from its CAS Registry Number. This paper integrates the data currently held by the Shandong Provincial Bioinformatics Engineering Technology Research Center into a database that maps each compound's CAS Registry Number to its structure and vice versa, for use in research. Records for 575,468 compounds were first exported from seven databases, including CMC, MDDR, ACD, CNPD, and NCI; after processing, the data were imported into the ChemFinder chemical database system. Deduplication left 404,269 compounds with unique CAS Registry Numbers. Each entry contains the compound's structure, CAS Registry Number, source database and source identifier, molecular formula, molecular weight, and lipid-water distribution coefficient. The database is also kept in two file formats, sdf and mol2, to support follow-up work such as virtual screening.
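The deduplication step can be illustrated with a minimal Python sketch. This is not the authors' procedure, which was carried out in ChemFinder; it only shows one plausible way to collapse records that share a CAS Registry Number while keeping every source reference. The field names (`cas`, `formula`, `source_db`, `source_id`), the function `merge_by_cas`, and the source identifiers in the sample data are all hypothetical.

```python
def merge_by_cas(records):
    """Collapse records that share a CAS Registry Number.

    The first record seen for a CAS number supplies the formula;
    every record contributes its (source database, source ID) pair.
    """
    merged = {}
    for rec in records:
        cas = rec["cas"].strip()  # tolerate stray whitespace from exports
        entry = merged.setdefault(
            cas, {"cas": cas, "formula": rec["formula"], "sources": []}
        )
        entry["sources"].append((rec["source_db"], rec["source_id"]))
    return list(merged.values())


# Hypothetical sample data: the source IDs are made up for illustration.
records = [
    {"cas": "50-78-2", "formula": "C9H8O4", "source_db": "CMC", "source_id": "A1"},
    {"cas": "50-78-2 ", "formula": "C9H8O4", "source_db": "MDDR", "source_id": "B7"},
    {"cas": "58-08-2", "formula": "C8H10N4O2", "source_db": "ACD", "source_id": "C3"},
]

unique = merge_by_cas(records)
for entry in unique:
    print(entry["cas"], entry["formula"], entry["sources"])
```

Keeping a list of sources per compound, rather than discarding the duplicates outright, mirrors the abstract's statement that each entry records its source database and identifier.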