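[Paopao One Minute] Probabilistic TSDF Fusion in a Bayesian Deep Learning Framework for Dense 3D Reconstruction with a Single RGB Camera

One minute a day to read through papers from the top robotics conferences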

Title: Probabilistic TSDF Fusion Using Bayesian Deep Learning for Dense 3D Reconstruction with a Single RGB Camera

Authors: Hanjun Kim and Beomhee Lee

Source: 2020 IEEE International Conference on Robotics and Automation (ICRA)

Compiled by: 靳小鑫

Reviewed by: 王靖淇, 柴毅

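This is the 799th article published by Paopao One Minute. Individuals are welcome to share it on WeChat Moments; other organizations or media accounts that wish to repost it should leave a message in the account backend to request authorization.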

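Summary

In this paper, we address 3D reconstruction using depth predicted from a single RGB image. With recent advances in deep learning, depth prediction performs well. However, because the training and test environments differ, 3D reconstruction is vulnerable to uncertainty in the predicted depth. To account for this uncertainty and make 3D reconstruction robust, we adopt a Bayesian deep learning framework. Conventional Bayesian deep learning needs a large amount of time and GPU memory to perform Monte Carlo sampling. To address this, we propose a lightweight Bayesian neural network that runs in real time and consists of a U-net structure with summation-based skip connections. The estimated uncertainty is used in probabilistic TSDF (truncated signed distance function) fusion for dense 3D reconstruction, which maximizes the posterior probability of the TSDF value at each voxel. This yields a global TSDF that is robust to erroneous depth values, from which a more accurate dense 3D reconstruction can be extracted. We evaluate depth prediction and 3D reconstruction on two public datasets and show that the proposed method outperforms other conventional methods.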
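
The fusion step can be illustrated with a minimal sketch. It assumes that each TSDF observation of a voxel is an independent Gaussian measurement whose variance comes from the predicted depth uncertainty. Under that assumption, and with a flat prior, the value that maximizes the posterior is the inverse-variance weighted mean, which can be updated one observation at a time. The names `Voxel`, `fuse`, and `fuse_all` and all numbers are illustrative and not taken from the paper, whose exact weighting may differ.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    """Running fusion state for one voxel (hypothetical helper)."""
    tsdf: float = 0.0    # current fused TSDF estimate
    weight: float = 0.0  # accumulated precision (sum of 1 / variance)


def fuse(voxel: Voxel, tsdf_obs: float, variance: float) -> Voxel:
    """Fuse one TSDF observation using inverse-variance weighting.

    Assumption: observations are independent Gaussians, so the MAP
    estimate under a flat prior is the precision-weighted mean.
    """
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    w_obs = 1.0 / variance
    new_weight = voxel.weight + w_obs
    new_tsdf = (voxel.weight * voxel.tsdf + w_obs * tsdf_obs) / new_weight
    return Voxel(tsdf=new_tsdf, weight=new_weight)


def fuse_all(observations):
    """Fuse a sequence of (tsdf_obs, variance) pairs into one voxel."""
    voxel = Voxel()
    for tsdf_obs, variance in observations:
        voxel = fuse(voxel, tsdf_obs, variance)
    return voxel


# Two confident observations near 0.10 and one uncertain outlier at 0.90.
observations = [(0.10, 0.01), (0.12, 0.01), (0.90, 1.0)]
fused = fuse_all(observations)
plain_mean = sum(t for t, _ in observations) / len(observations)
print(f"uncertainty-weighted TSDF: {fused.tsdf:.3f}")
print(f"constant-weight TSDF:      {plain_mean:.3f}")
```

In this toy case the uncertain outlier barely moves the weighted estimate, while a constant-weight average is pulled strongly toward it, which is the robustness to erroneous depth values that the text describes.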

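Fig. 1. Pipeline comparison between typical 3D reconstruction and the proposed method. (a) Typical 3D reconstruction pipeline using a depth camera. (b) 3D reconstruction pipeline using probabilistic TSDF fusion with depth and uncertainty prediction.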

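Fig. 2. Structure of the depth and uncertainty prediction network. A simple U-net-based Bayesian neural network predicts depth and uncertainty from a single RGB image.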
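
To make the summation-based skip connection concrete, the sketch below contrasts it with the channel concatenation used in the original U-Net. Feature maps are plain nested lists of shape [channels][height][width]; the helper names and sizes are illustrative assumptions, not the paper's architecture. Summation keeps the channel count unchanged, so the convolution that follows needs half as many weights as it would after concatenation.

```python
def sum_skip(encoder, decoder):
    """Element-wise sum of two [C][H][W] feature maps of equal shape."""
    if len(encoder) != len(decoder):
        raise ValueError("summation needs matching channel counts")
    return [
        [[e + d for e, d in zip(e_row, d_row)]
         for e_row, d_row in zip(e_ch, d_ch)]
        for e_ch, d_ch in zip(encoder, decoder)
    ]


def concat_skip(encoder, decoder):
    """Channel-wise concatenation, as in the original U-Net."""
    return encoder + decoder


def conv_weight_count(in_channels, out_channels, kernel=3):
    """Number of weights in a kernel x kernel convolution (bias ignored)."""
    return in_channels * out_channels * kernel * kernel


def constant_map(channels, height, width, value):
    """Build a [C][H][W] feature map filled with one value."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


encoder = constant_map(channels=4, height=2, width=2, value=1.0)
decoder = constant_map(channels=4, height=2, width=2, value=0.5)

summed = sum_skip(encoder, decoder)
concatenated = concat_skip(encoder, decoder)

print("channels after sum:   ", len(summed))
print("channels after concat:", len(concatenated))
print("next-conv weights (sum):   ", conv_weight_count(len(summed), 4))
print("next-conv weights (concat):", conv_weight_count(len(concatenated), 4))
```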

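Fig. 3. Prediction results on NYU-Depth-v2. From left to right: input RGB image, ground-truth depth, predicted depth, aleatoric uncertainty, epistemic uncertainty.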
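
The two kinds of uncertainty are commonly separated with Monte Carlo sampling, the step the abstract describes as costly. The sketch below shows the usual decomposition for a single pixel: the network is run several times with dropout active, each pass returns a depth mean and a predicted variance, the average of the predicted variances is taken as the aleatoric part, and the variance of the means as the epistemic part. `stochastic_forward` is a hypothetical stand-in for a real network, and all numbers are made up for illustration; the paper's lightweight network may compute these quantities differently.

```python
import random
import statistics


def stochastic_forward(rng):
    """Hypothetical stand-in for one dropout-enabled forward pass.

    Returns (depth_mean, depth_variance) for a single pixel.
    """
    depth_mean = 2.0 + rng.gauss(0.0, 0.05)        # varies between passes
    depth_variance = 0.04 + rng.uniform(0.0, 0.01)  # predicted data noise
    return depth_mean, depth_variance


def mc_uncertainty(num_samples, seed=0):
    """Decompose predictive uncertainty from Monte Carlo samples."""
    rng = random.Random(seed)
    samples = [stochastic_forward(rng) for _ in range(num_samples)]
    means = [m for m, _ in samples]
    variances = [v for _, v in samples]

    depth = statistics.fmean(means)          # final depth prediction
    aleatoric = statistics.fmean(variances)  # mean of predicted variances
    epistemic = statistics.pvariance(means)  # variance of predicted means
    return depth, aleatoric, epistemic


depth, aleatoric, epistemic = mc_uncertainty(num_samples=50)
total_variance = aleatoric + epistemic
print(f"depth={depth:.3f} aleatoric={aleatoric:.4f} "
      f"epistemic={epistemic:.4f} total={total_variance:.4f}")
```

Every sample costs one full forward pass, so time grows linearly with the number of samples, which is why a lightweight network matters for real-time use.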

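Table 1. Comparison of depth prediction results on NYU Depth V2

Fig. 4. 3D meshes extracted from the global TSDF under the following weight options: (a) ground truth (b) constant (c) linear (d) exponential (e) based on minimum depth (f) min-max depth (g) truncated uncertainty (h) normalized uncertainty

Fig. 5. Comparison of 3D reconstruction on the ICL-NUIM dataset (kt0). The reconstruction results are zoomed in for visual clarity. (a) ground truth (b) based on minimum depth (c) normalized uncertainty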

Abstract

In this paper, we address a 3D reconstruction problem using depth prediction from a single RGB image. With the recent advances in deep learning, depth prediction shows high performance. However, due to the discrepancy between training environment and test environment, 3D reconstruction can be vulnerable to the uncertainty of depth prediction. To consider the uncertainty of depth prediction for robust 3D reconstruction, we adopt Bayesian deep learning framework. Conventional Bayesian deep learning requires a large amount of time and GPU memory to perform Monte Carlo sampling. To address this problem, we propose a lightweight Bayesian neural network consisting of U-net structure and summation-based skip connections, which is performed in real-time. Estimated uncertainty is utilized in probabilistic TSDF fusion for dense 3D reconstruction by maximizing the posterior of TSDF value per voxel. As a result, global TSDF robust to erroneous depth values can be obtained and then dense 3D reconstruction from the global TSDF is achievable more accurately. To evaluate the performance of depth prediction and 3D reconstruction using our method, we utilized two official datasets and demonstrated the outperformance of the proposed method over other conventional methods.

