Understanding traditional Chinese medicine via statistical learning of expert-specific Electronic M

Source: Quantitative Biology (English edition)
Background: Traditional Chinese medicine (TCM) has recently attracted considerable attention from various disciplines. However, TCM remains mysterious because of its unique philosophy and theoretical thinking, and the lack of high-quality data makes it difficult to understand TCM thoroughly. In this study, we introduce the Zhou Archive, a large-scale database of expert-specific Electronic Medical Records (EMRs) recording more than 73,000 visits to a single TCM doctor over more than 35 years. Covering the full spectrum of the diagnosis-treatment model behind TCM practice, the archive provides an opportunity to understand TCM from a data-driven perspective.

Methods: Through a series of data processing steps applied to the text in the archive, we transformed the semi-structured EMRs into a well-structured feature table. Based on this feature table, we carried out a series of statistical analyses to learn principles of TCM clinical practice from the archive, including correlation analysis, enrichment analysis, embedding analysis and association pattern discovery.

Results: The proposed data processing procedure produces a structured feature table of more than 14,000 features, with a feature codebook, a term dictionary and a term-feature map as byproducts. Statistical analysis of the feature table reveals underlying principles of the diagnosis-treatment model of TCM, helping us better understand TCM practice from a data-driven perspective.

Conclusion: Expert-specific EMRs provide opportunities to understand TCM from a data-driven perspective. Taking advantage of recent progress in NLP for Chinese, we can process a large number of TCM EMRs efficiently and gain insights via statistical analysis.
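To make the enrichment-analysis step named in the Methods more concrete, the sketch below tests whether two binary features co-occur across visits more often than chance would predict, using a one-sided hypergeometric test. It is only an illustration under stated assumptions: the abstract does not specify which test the authors used, and the feature names `symptom_x` and `herb_y` and the toy table are hypothetical, not taken from the Zhou Archive.

```python
from math import comb


def hypergeom_tail(k, n_total, n_a, n_b):
    """P(X >= k) for X ~ Hypergeometric: n_b draws without replacement
    from n_total items, of which n_a are 'successes'."""
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    numer = sum(
        comb(n_a, i) * comb(n_total - n_a, n_b - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


def enrichment(table, feat_a, feat_b):
    """Return (co-occurrence count, one-sided p-value) for two binary
    features in a list of per-visit dicts."""
    n_total = len(table)
    n_a = sum(1 for row in table if row[feat_a])
    n_b = sum(1 for row in table if row[feat_b])
    k = sum(1 for row in table if row[feat_a] and row[feat_b])
    return k, hypergeom_tail(k, n_total, n_a, n_b)


# Hypothetical toy feature table: one dict per visit, 1 = feature present.
toy_table = [
    {"symptom_x": 1, "herb_y": 1},
    {"symptom_x": 1, "herb_y": 1},
    {"symptom_x": 1, "herb_y": 1},
    {"symptom_x": 0, "herb_y": 0},
    {"symptom_x": 0, "herb_y": 0},
    {"symptom_x": 0, "herb_y": 0},
]

count, p_value = enrichment(toy_table, "symptom_x", "herb_y")
print(count, p_value)
```

A small p-value suggests that the herb is prescribed together with the symptom more often than expected under independence. On a table with thousands of features, such p-values would need a multiple-testing correction before being interpreted.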