[1]. Cadena C, Carlone L, Carrillo H, et al. Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age[J]. IEEE Transactions on Robotics (S1552-3098), 2016, 32(6): 1309-1332.
[2]. Fuentes-Pacheco J, Ruiz-Ascencio J, Rendón-Mancha J M, et al. Visual simultaneous localization and mapping: A survey[J]. Artificial Intelligence Review (S0269-2821), 2015, 43(1): 55-81.
[3]. 刘浩敏, 章国峰, 鲍虎军. 基于单目视觉的同时定位与地图构建方法综述[J]. 计算机辅助设计与图形学学报, 2016, 28(6): 855-868.
Liu Haomin, Zhang Guofeng, Bao Hujun. A survey of monocular simultaneous localization and mapping[J]. Journal of Computer-Aided Design & Computer Graphics, 2016, 28(6): 855-868.
[4]. Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C]//IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, USA: IEEE, 2012: 3354-3361.
[5]. Kümmerle R, Grisetti G, Strasdat H, et al. g2o: A general framework for graph optimization[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA: IEEE, 2011: 3607-3613.
[6]. Belter D, Skrzypczyński P. Precise self-localization of a walking robot on rough terrain using PTAM[M]. Baltimore, USA: Adaptive Mobile Robotics, 2012: 89-96.
[7]. Mur-Artal R, Tardós J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE Transactions on Robotics (S1552-3098), 2017, 33(5): 1255-1262.
[8]. Engel J, Koltun V, Cremers D. Direct sparse odometry[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence (S0162-8828), 2018, 40(3): 611-625.
[9]. He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[10]. Ren S, He K, Girshick R B, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence (S0162-8828), 2017, 39(6): 1137-1149.
[11]. Donahue J, Anne Hendricks L, Guadarrama S, et al. Long-term recurrent convolutional networks for visual recognition and description[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 2625-2634.
[12]. Sünderhauf N, Pham T T, Latif Y, et al. Meaningful maps with object-oriented semantic mapping[C]//2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Vancouver, Canada: IEEE, 2017: 5079-5085.
[13]. Zhou Y, Li H, Kneip L. Canny-VO: Visual odometry with RGB-D cameras based on geometric 3-D–2-D edge alignment[J]. IEEE Transactions on Robotics (S1552-3098), 2018, 35(1): 184-199.
[14]. Costante G, Mancini M, Valigi P, et al. Exploring representation learning with CNNs for frame-to-frame ego-motion estimation[J]. IEEE Robotics and Automation Letters (S2377-3766), 2016, 1(1): 18-25.
[15]. Shahid M, Naseer T, Burgard W. DTLC: Deeply trained loop closure detections for lifelong visual SLAM[C]//Proceedings, Workshop on Visual Place Recognition, Conference on Robotics: Science and Systems (RSS). Ann Arbor, USA: RSS, 2016: 1-8.
[16]. Hou Y, Zhang H, Zhou S L. Convolutional neural network-based image representation for visual loop closure detection[C]//IEEE International Conference on Information and Automation. Piscataway, USA: IEEE, 2015: 2238-2245.
[17]. DeTone D, Malisiewicz T, Rabinovich A. Toward geometric deep SLAM[EB/OL]. (2017-07-24)[2019-08-20]. https://arxiv.org/pdf/1707.07410.pdf.
[18]. Sharif Razavian A, Azizpour H, Sullivan J, et al. CNN features off-the-shelf: an astounding baseline for recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition workshops. Columbus, Ohio: IEEE, 2014: 806-813.
[19]. Wang S, Clark R, Wen H, et al. DeepVO: Towards end-to-end visual odometry with deep recurrent convolutional neural networks[C]//2017 IEEE International Conference on Robotics and Automation (ICRA). Singapore: IEEE, 2017: 2043-2050.
[20]. Donahue J, Anne Hendricks L, Guadarrama S, et al. Long-term recurrent convolutional networks for visual recognition and description[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 2625-2634.
[21]. Elman J L. Finding structure in time[J]. Cognitive science (S0364-0213), 1990, 14(2): 179-211.
[22]. Graves A. Supervised Sequence Labeling with Recurrent Neural Networks[M]. Heidelberg: Springer, 2012: 5-13.
[23]. Chen Z, Jacobson A, Sünderhauf N, et al. Deep learning features at scale for visual place recognition[C]//2017 IEEE International Conference on Robotics and Automation (ICRA). Singapore: IEEE, 2017: 3223-3230.
[24]. Sünderhauf N, Dayoub F, Shirazi S, et al. On the Performance of ConvNet Features for Place Recognition[C]//International Conference on Intelligent Robots and Systems (IROS). Hamburg: IEEE, 2015: 4297-4304.
[25]. Hou Y, Zhang H, Zhou S. BoCNF: efficient image matching with Bag of ConvNet features for scalable and robust visual place recognition[J]. Autonomous Robots (S0929-5593), 2017, 42(9): 1-17.
[26]. Lin K, Yang H F, Hsiao J H, et al. Deep learning of binary hash codes for fast image retrieval[C]//Proceedings of the IEEE conference on computer vision and pattern recognition workshops. Boston, USA: IEEE, 2015: 27-35.
[27]. Sünderhauf N, Shirazi S, Jacobson A, et al. Place recognition with convnet landmarks: Viewpoint-robust, condition-robust, training-free[C]//Proceedings of Robotics: Science and Systems XI. Michigan, USA: RSS, 2015: 1-10.
[28]. Parisotto E, Singh Chaplot D, Zhang J, et al. Global pose estimation with an attention-based recurrent network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Salt Lake City: IEEE, 2018: 237-246.
[29]. Hwang J, Park S, Kwak N. Athlete pose estimation by a global-local network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu, Hawaii: IEEE, 2017: 58-65.
[30]. Southall C, Stables R, Hockman J. Automatic Drum Transcription for Polyphonic Recordings Using Soft Attention Mechanisms and Convolutional Neural Networks[C]//The 18th International Society for Music Information Retrieval Conference. Suzhou: ISMIR, 2017: 606-612.
[31]. Sünderhauf N, Pham T T, Latif Y, et al. Meaningful Maps with Object-Oriented Semantic Mapping[C]//2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). New York: IEEE, 2017: 5079-5085.
[32]. Ng P C, Henikoff S. SIFT: Predicting amino acid changes that affect protein function[J]. Nucleic Acids Research (S0305-1048), 2003, 31(13): 3812-3814.
[33]. Lei H, Akhtar N, Mian A. Octree guided CNN with Spherical Kernels for 3D Point Clouds[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, CA: IEEE, 2019: 9631-9640.
[34]. Mani I, Zhang I. KNN approach to unbalanced data distributions: a case study involving information extraction[C]//Proceedings of workshop on learning from imbalanced datasets. Washington: ICML, 2003: 126.
[35]. Radwan N, Valada A, Burgard W. VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry[J]. IEEE Robotics and Automation Letters (S2377-3766), 2018, 3(4): 4407-4414.
[36]. Girisha S, Manohara P, Ujjwal V, et al. Semantic Segmentation of UAV Aerial Videos using Convolutional Neural Networks[C]//2019 IEEE Knowledge Engineering (AIKE). Sardinia, Italy: IEEE, 2019: 21-27.
[37]. Han Y, Ye J C. Framing U-Net via deep convolutional framelets: Application to sparse-view CT[J]. IEEE Transactions on Medical Imaging (S0278-0062), 2018, 37(6): 1418-1429.
[38]. Bowman S L, Atanasov N, Daniilidis K, et al. Probabilistic data association for semantic slam[C]//2017 IEEE International Conference on Robotics and Automation (ICRA). Singapore: IEEE, 2017: 1722-1729.
[39]. Jordan M I, Jacobs R A. Hierarchical Mixtures of Experts and the EM Algorithm[J]. Neural Computation (S0899-7667), 1994, 6(2): 181-214.
[40]. Engel J, Koltun V, Cremers D. Direct sparse odometry[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence (S0162-8828), 2017, 40(3): 611-625.
[41]. Geiger A, Ziegler J, Stiller C. Stereoscan: Dense 3d reconstruction in real-time[C]//2011 IEEE Intelligent Vehicles Symposium (IV). Baden-Baden, Germany: IEEE, 2011: 963-968.
[42]. Loo S Y, Amiri A J, Mashohor S, et al. CNN-SVO: Improving the mapping in semi-direct visual odometry using single-image depth prediction[C]//2019 International Conference on Robotics and Automation (ICRA). Montreal, Canada: IEEE, 2019: 5218-5223.
[43]. Zhan H, Weerasekera C S, Bian J, et al. Visual Odometry Revisited: What Should Be Learnt?[EB/OL]. (2019-09-21)[2019-10-05]. https://arxiv.org/abs/1909.09803.pdf.
[44]. Zhou T, Brown M, Snavely N, et al. Unsupervised learning of depth and ego-motion from video[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, Hawaii: IEEE, 2017: 1851-1858.
[45]. Cieslewski T, Choudhary S, Scaramuzza D. Data-efficient decentralized visual SLAM[C]//2018 IEEE International Conference on Robotics and Automation (ICRA). Prague, Czech Republic: IEEE, 2018: 2466-2473.
[46]. Sun L, Yan Z, Zaganidis A, et al. Recurrent-OctoMap: Learning State-based Map Refinement for Long-Term Semantic Mapping with 3D-Lidar Data[J]. IEEE Robotics and Automation Letters (S2377-3766), 2018, 3(4): 3749-3756.
[47]. Hornung A, Wurm K M, Bennewitz M, et al. OctoMap: An efficient probabilistic 3D mapping framework based on octrees[J]. Autonomous Robots (S0929-5593), 2013, 34(3): 189-206.
[48]. Zhang J, Singh S. Laser–visual–inertial odometry and mapping with high robustness and low drift[J]. Journal of Field Robotics (S1556-4967), 2018, 35(8): 1242-1264.
[49]. Garcia-Fidalgo E, Ortiz A. Vision-based topological mapping and localization methods: a survey[J]. Robotics and Autonomous Systems (S0921-8890), 2015, 64: 1-20.
[50]. Engel J, Schöps T, Cremers D. LSD-SLAM: Large-Scale Direct Monocular SLAM[M]. Munich: Computer Vision – ECCV 2014. 2014: 834-849.
[51]. Scherer S A, Zell A. Efficient onboard RGBD-SLAM for autonomous MAVs[C]//2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan: IEEE, 2013: 1062-1068.
[52]. Vijayanarasimhan S, Ricco S, Schmid C, et al. SfM-Net: Learning of structure and motion from video[EB/OL]. (2017-04-25)[2019-08-25]. https://arxiv.org/abs/1704.07804.pdf.
[53]. 张峻宁, 苏群星, 刘鹏远, 等. 一种自适应特征地图匹配的改进VSLAM算法[J]. 自动化学报, 2019, 45(3): 553-565.
Zhang Junning, Su Qunxing, Liu Pengyuan, et al. An Improved VSLAM Algorithm Based on Adaptive Feature Map[J]. Acta Automatica Sinica, 2019, 45(3): 553-565.
[54]. Grisetti G, Kümmerle R, Strasdat H, et al. g2o: A general framework for (hyper) graph optimization[C]//2011 IEEE International Conference on Robotics and Automation (ICRA). Shanghai, China: IEEE, 2011: 3607-3613.