A special hierarchical fuzzy neural-networks based reinforcement learning for multi-variables system

Source: Journal of Harbin Institute of Technology (English Edition)
This paper proposes a reinforcement learning scheme based on a special Hierarchical Fuzzy Neural Network (HFNN) for solving complicated learning tasks in a continuous multi-variable environment. The output of the previous layer in the HFNN is no longer used in the if-part of the next layer, but only in its then-part. The network can therefore cope with cases where the output of the previous layer is meaningless or its meaning is uncertain. The proposed HFNN has a minimal number of fuzzy rules, which avoids the combinatorial explosion of rules and reduces both the amount of computation and the memory requirement. In the learning process, two HFNNs with the same structure perform fuzzy action composition and evaluation function approximation simultaneously, and the parameters of the neural networks are tuned and updated online using a gradient descent algorithm. Simulation of a double inverted pendulum system shows that the reinforcement learning method is correct and feasible.
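The structural idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives neither the layer layout nor the rule form, so the code assumes Gaussian membership functions, product inference, one group of inputs per layer, and a consequent that is linear in the previous layer's output. All names (`gaussian`, `layer_output`, `hfnn_forward`, `rule_counts`) and parameters are hypothetical. The sketch shows the previous output entering only the then-part, and compares the rule count of a flat rule base with that of such a hierarchy.

```python
import math
from itertools import product


def gaussian(x, center, width):
    """Gaussian membership degree of x (assumed membership shape)."""
    return math.exp(-((x - center) / width) ** 2)


def layer_output(inputs, centers, width, a, b, y_prev):
    """One layer of the sketch.

    The if-part (firing strength) depends only on this layer's own inputs.
    The previous layer's output y_prev appears only in the then-part,
    here through the assumed consequent a[i] + b[i] * y_prev.
    """
    num = 0.0
    den = 0.0
    # One rule per combination of membership functions over the layer inputs.
    combos = product(range(len(centers)), repeat=len(inputs))
    for rule_idx, combo in enumerate(combos):
        strength = 1.0
        for x, k in zip(inputs, combo):
            strength *= gaussian(x, centers[k], width)
        num += strength * (a[rule_idx] + b[rule_idx] * y_prev)
        den += strength
    return num / den  # weighted-average defuzzification


def hfnn_forward(input_groups, centers, width, params):
    """Chain the layers; params holds one (a, b) pair of lists per layer."""
    y = 0.0  # no previous output for the first layer (assumption)
    for inputs, (a, b) in zip(input_groups, params):
        y = layer_output(inputs, centers, width, a, b, y)
    return y


def rule_counts(n_inputs, n_mf, group_size):
    """Rules in a flat rule base vs. the hierarchy assumed above."""
    flat = n_mf ** n_inputs
    hierarchical = (n_inputs // group_size) * n_mf ** group_size
    return flat, hierarchical


if __name__ == "__main__":
    # Six state variables (as for a cart-mounted double inverted pendulum),
    # three membership functions each, grouped in pairs: illustrative values.
    print(rule_counts(6, 3, 2))
    centers = [-1.0, 0.0, 1.0]
    groups = [(0.1, -0.2), (0.3, 0.0), (-0.5, 0.4)]
    params = [([0.5] * 9, [1.0] * 9) for _ in groups]
    print(hfnn_forward(groups, centers, 1.0, params))
```

Under these assumptions the if-part of every layer has the same small size regardless of depth, so the rule count grows linearly with the number of input groups rather than exponentially with the number of inputs. In the scheme described above, the consequent parameters (here `a` and `b`) would be the quantities tuned online by gradient descent; that update step is not sketched here.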