Inference of Global HIV-1 Sequence Patterns and Preliminary Feature Analysis

Source: Virologica Sinica
The epidemiology of HIV-1 varies across different areas of the world, and this complexity may leave unique footprints in the viral genome. We therefore searched for significant patterns in global HIV-1 genome sequences. By applying the rule inference algorithm RIPPER (Repeated Incremental Pruning to Produce Error Reduction) to multiple sequence alignments of Env sequences from four classes of compiled datasets, we generated four sets of signature patterns. These patterns distinguished southeastern Asian from non-southeastern Asian sequences with 97.5% accuracy, Chinese from non-Chinese sequences with 98.3% accuracy, African from non-African sequences with 88.4% accuracy, and southern African from non-southern African sequences with 91.2% accuracy. The patterns showed different associations with subtypes and with amino acid positions. In addition, some signature patterns were characteristic of the geographic area from which the sample was taken. Amino acid features corresponding to the phylogenetic clustering of HIV-1 sequences were consistent with some of the deduced patterns. Using a combination of patterns inferred from subtype B, subtype C, and all subtypes chimeric with CRF01_AE worldwide, we found that signature patterns of subtype C were extremely common in some sampled countries (for example, Zambia in southern Africa), which may hint at the origin of this HIV-1 subtype and at the need to pay special attention to this area of Africa. Signature patterns of subtype B sequences were associated with different countries. Moreover, distinct patterns occurred at the single position 21, where glycine, leucine, and isoleucine corresponded to subtype C, subtype B, and all possible recombinant forms chimeric with CRF01_AE, respectively, which also indicates distinct geographic features. Our method widens the scope of signature inference from geographic, genetic, and genomic viewpoints.
These findings may provide a valuable reference for epidemiological research or vaccine design.
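
To make the idea of a signature pattern concrete, the following is a minimal sketch of how an ordered rule list of the kind a rule learner such as RIPPER outputs could be applied to aligned sequences and scored for accuracy. It does not implement RIPPER itself, and it is not the authors' code. The names classify, accuracy, toy_seqs, toy_labels, and toy_rules are hypothetical, and the four-residue sequences are invented toy data, not HIV-1 Env sequences. The toy rules test column 3 only for brevity; they loosely echo the single-position patterns described above.

```python
from typing import List, Tuple

# A rule is a conjunction of (1-based alignment column, residue) tests plus a label.
Rule = Tuple[List[Tuple[int, str]], str]


def classify(seq: str, rules: List[Rule], default: str) -> str:
    """Return the label of the first rule whose tests all hold, else the default."""
    for conditions, label in rules:
        if all(seq[pos - 1] == residue for pos, residue in conditions):
            return label
    return default


def accuracy(seqs: List[str], labels: List[str],
             rules: List[Rule], default: str) -> float:
    """Fraction of sequences whose predicted label matches the given label."""
    correct = sum(classify(s, rules, default) == y for s, y in zip(seqs, labels))
    return correct / len(seqs)


# Hypothetical toy alignment (NOT real data); all rows have equal length.
toy_seqs = ["MKGT", "MRGT", "MKLT", "MKIT"]
toy_labels = ["C", "C", "B", "other"]

# Hypothetical single-position rules on column 3 (assumption for illustration).
toy_rules: List[Rule] = [([(3, "G")], "C"), ([(3, "L")], "B")]

if __name__ == "__main__":
    print(accuracy(toy_seqs, toy_labels, toy_rules, default="other"))
```

Rules are tried in order and the first match wins, with a default class for sequences that match no rule; this ordered-list form is an assumption about how the inferred patterns would be used for classification.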