A survey of the literature on SARS-CoV reveals shortcomings in existing gene prediction and functional studies. To support the development of effective drugs and vaccines, gene prediction and functional inference were carried out anew for SARS-CoV (BJ01). Twelve gene prediction methods were compared on how well they recover known genes in the genus Coronavirus, and the four better performers, Heuristic models, GeneIdentification, ZCURVECoV and ORFFINDER, were selected to predict genes. ATGpr was then used to assess the likelihood of the first start codon and whether it conforms to the Kozak rule, and transcription regulatory sequences were searched for, in order to improve the accuracy of gene prediction. In total, 34 ORFs were predicted. After excluding 13 that are identical to, or differ only slightly from, those reported in NCBI and the related literature, 21 possible new genes longer than 50 amino acids remained. For the predicted proteins, ProtParam was used to analyze physicochemical properties, SignalP to check for signal peptides, BLAST and FASTA to search for similar sequences, and TMPred, TMHMM, PFAM and HMMTOP to analyze domains or motifs, in order to make the functional inferences more reliable. Based on the results of the four gene prediction methods, the match score and expectation value against known genes of other coronaviruses, and the length difference between known and predicted genes, the 21 possible new genes were divided into four classes according to their likelihood of occurrence. The results are also discussed.
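The ORF screening idea in the abstract (a start codon, its Kozak context, and a length cutoff of more than 50 amino acids) can be illustrated with a minimal sketch. This is not the method of any of the tools named above. It assumes a plain forward-strand scan over three reading frames, and a simplified Kozak check (purine at position -3 and G at position +4, counting the A of ATG as +1). The names `find_orfs`, `has_kozak_context` and `Orf` are hypothetical and introduced here only for illustration.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


class Orf(NamedTuple):
    start: int       # 0-based index of the A in the ATG start codon
    end: int         # 0-based index one past the stop codon
    frame: int       # reading frame (0, 1 or 2) on the forward strand
    aa_length: int   # residues including the initial Met, excluding the stop
    kozak: bool      # True if the simplified Kozak check passes


def has_kozak_context(seq: str, start: int) -> bool:
    """Simplified check (assumption): purine at -3 and G at +4."""
    if start < 3 or start + 3 >= len(seq):
        return False
    return seq[start - 3] in "AG" and seq[start + 3] == "G"


def find_orfs(seq: str, min_aa: int = 50) -> List[Orf]:
    """Return forward-strand ORFs longer than min_aa amino acids.

    Each ORF runs from the first ATG after the previous stop codon
    to the next in-frame stop codon.
    """
    seq = seq.upper().replace("U", "T")
    orfs: List[Orf] = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                aa_len = (i - start) // 3
                if aa_len > min_aa:
                    orfs.append(
                        Orf(start, i + 3, frame, aa_len,
                            has_kozak_context(seq, start))
                    )
                start = None
    return orfs
```

A call such as `find_orfs(genome_sequence)` returns candidate ORFs that could then be compared against annotated genes. Real predictors additionally use codon usage statistics, transcription regulatory sequences and homology, which this sketch leaves out.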