To extract rich 3D structural features from monocular images and use them to infer scene depth, a structured deep learning model is proposed for monocular image depth estimation. The model unifies a new multi-scale convolutional neural network (CNN) and a continuous conditional random field (CRF) within a single deep learning framework. The CNN learns feature representations from the image, while the continuous CRF refines the CNN output according to the position and color of the image pixels. Learning the parameters of both components through joint optimization improves the generalization performance of the model. Experiments on the NYU Depth dataset verify the effectiveness and superiority of the model: its predictions have a mean relative error of 0.187, a root mean square error of 0.074, and a mean log-space error of 0.671.
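The three error measures reported above can be illustrated with a minimal sketch. The abstract does not give their formulas, so the definitions below are an assumption: they are the ones commonly used in depth estimation work (mean of |d - d*| / d*, root of the mean squared difference, and mean of |log10 d - log10 d*|, where d is the predicted depth and d* the ground truth). The function name `depth_errors` and the sample depths are hypothetical and are not taken from the paper.

```python
import math


def depth_errors(pred, gt):
    """Return (mean relative error, RMSE, mean log10 error) for paired depths.

    Assumed definitions (not confirmed by the abstract):
      rel   = mean(|p - g| / g)
      rmse  = sqrt(mean((p - g)^2))
      log10 = mean(|log10(p) - log10(g)|)
    All depths must be strictly positive.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    if any(p <= 0 or g <= 0 for p, g in zip(pred, gt)):
        raise ValueError("depths must be strictly positive")

    n = len(pred)
    rel = sum(abs(p - g) / g for p, g in zip(pred, gt)) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / n)
    log10 = sum(abs(math.log10(p) - math.log10(g)) for p, g in zip(pred, gt)) / n
    return rel, rmse, log10


# Hypothetical per-pixel depths in metres, for illustration only.
predicted = [1.0, 2.0, 4.0]
ground_truth = [1.0, 2.5, 4.0]

rel, rmse, log10 = depth_errors(predicted, ground_truth)
print(f"rel={rel:.4f}  rmse={rmse:.4f}  log10={log10:.4f}")
```

In practice the lists would hold the flattened predicted and ground-truth depth maps over all valid pixels of the test set. Lower values are better for all three measures.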