[Background] The exposome is a promising framework for improving our understanding of how environmental exposures affect health, because it explicitly accounts for multiple testing and avoids selective reporting. However, exposome studies are challenged by the need to consider many correlated exposures simultaneously. [Objective] To compare the performance of linear regression-based statistical methods in analyzing exposome-health associations. [Methods] In a simulation study, 237 exposure covariates were generated with a realistic correlation structure, together with a health outcome linearly related to 0 to 25 of these covariates. The statistical methods were compared mainly in terms of false discovery proportion (FDP) and sensitivity. [Results] Across all simulation settings, the elastic net and sparse partial least squares regression had a sensitivity of 76% and an FDP of 44%; Graphical Unit Evolutionary Stochastic Search (GUESS) and the deletion/substitution/addition (DSA) algorithm had a sensitivity of 81% and an FDP of 34%. The environment-wide association study (EWAS) had a higher sensitivity but performed worse than these methods, with an average FDP of 86%. Performance dropped markedly when the exposome exposure matrix was assumed to have high correlation between covariates. [Conclusions] Correlation between exposures is a challenge for exposome research. In a realistic exposome setting, the statistical methods examined in this study had limited ability to distinguish true predictors from correlated covariates. Although GUESS and DSA offered a slightly better balance between sensitivity and FDP, they did not outperform the other multivariate methods across all scenarios and properties examined, and computational complexity and flexibility should also be considered when choosing among these methods.
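The two criteria named in the abstract, sensitivity and FDP, can be illustrated with a minimal sketch. The function name `sensitivity_and_fdp` and the covariate indices are hypothetical and do not come from the study. The handling of the edge cases (FDP set to 0 when nothing is selected, sensitivity left undefined when there are no true predictors) is an assumed convention, not necessarily the one the authors used.

```python
import math


def sensitivity_and_fdp(selected, true_predictors):
    """Return (sensitivity, FDP) for one simulated data set.

    selected:        indices of covariates a method declared associated
    true_predictors: indices of covariates truly related to the outcome
    """
    selected = set(selected)
    true_predictors = set(true_predictors)

    true_positives = len(selected & true_predictors)
    false_positives = len(selected - true_predictors)

    # Sensitivity: share of true predictors that were selected.
    # Assumption: undefined (NaN) when the scenario has no true predictor.
    if true_predictors:
        sensitivity = true_positives / len(true_predictors)
    else:
        sensitivity = math.nan

    # FDP: share of selected covariates that are not true predictors.
    # Assumption: 0 when the method selects nothing.
    fdp = false_positives / len(selected) if selected else 0.0

    return sensitivity, fdp


# Illustrative indices only: 4 true predictors among 237 covariates,
# and a method that selects 5 covariates, 3 of them correct.
sens, fdp = sensitivity_and_fdp({3, 17, 42, 58, 199}, {3, 17, 42, 100})
print(f"sensitivity = {sens:.2f}, FDP = {fdp:.2f}")
```

In a simulation of the kind described, these two values would be computed per simulated data set and then averaged per method and scenario. A method can reach high sensitivity simply by selecting many covariates, which is why the abstract reports FDP alongside it.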