The occurrence and development of cancer are closely related to the human living environment, individual genetic factors, and other influences. Given the threat that cancer poses to human survival and health, many leading research institutions around the world are actively working to uncover the pathogenic mechanisms behind its onset and progression as early as possible. The advent of next-generation sequencing technology has greatly accelerated cancer research and laid the foundation for discovering important cancer-related information. This paper studies heterogeneous, multi-level data from the perspective of data integration, introduces lncRNA (long non-coding RNA) omics data, and constructs a biological network model. Oncogenic gene modules are then mined with a clustering method. We summarize a systematic method for mining the key gene modules that lead to cancer, and identify a key gene module containing 15 genes. Functional and pathway analysis of these genes shows that they are closely related to cancer, and survival analysis shows that they separate high-risk and low-risk groups well. Together, these results indicate that integrating multi-omics biological data can reveal key gene modules and their aberrantly regulated gene sets, which contributes to cancer research.