cs.LG: 79 papers today
Graph-related (graph learning | graph neural networks | graph optimization, etc.) (5 papers)
【1】 AutoGCL: Automated Graph Contrastive Learning via Learnable View Generators
Link: https://arxiv.org/abs/2109.10259
Authors: Yihang Yin, Qingzhong Wang, Siyu Huang, Haoyi Xiong, Xiang Zhang
Affiliations: Nanyang Technological University; Baidu Research; The Pennsylvania State University
Abstract: Contrastive learning has been widely applied to graph representation
learning, where the view generators play a vital role in generating effective
contrastive samples. Most of the existing contrastive learning methods employ
pre-defined view generation methods, e.g., node drop or edge perturbation,
which usually cannot adapt to input data or preserve the original semantic
structures well. To address this issue, we propose a novel framework named
Automated Graph Contrastive Learning (AutoGCL) in this paper. Specifically,
AutoGCL employs a set of learnable graph view generators orchestrated by an
auto augmentation strategy, where every graph view generator learns a
probability distribution of graphs conditioned by the input. While the graph
view generators in AutoGCL preserve the most representative structures of the
original graph in generation of every contrastive sample, the auto augmentation
learns policies to introduce adequate augmentation variances in the whole
contrastive learning procedure. Furthermore, AutoGCL adopts a joint training
strategy to train the learnable view generators, the graph encoder, and the
classifier in an end-to-end manner, resulting in topological heterogeneity yet
semantic similarity in the generation of contrastive samples. Extensive
experiments on semi-supervised learning, unsupervised learning, and transfer
learning demonstrate the superiority of our AutoGCL framework over the
state of the art in graph contrastive learning. In addition, the visualization
results further confirm that the learnable view generators can deliver more
compact and semantically meaningful contrastive samples compared against the
existing view generation methods.
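
To make the mechanism concrete, below is a minimal PyTorch sketch of a learnable view generator that samples a per-node keep/drop/mask choice with the Gumbel-Softmax trick so that gradients flow back into the generator. The three-action space follows the abstract's description; the architecture, the mean-feature masking, and all names here are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableViewGenerator(nn.Module):
    """Samples a differentiable per-node augmentation: keep / drop / mask."""
    def __init__(self, in_dim, hidden=64, tau=1.0):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))  # 3 actions per node
        self.tau = tau

    def forward(self, x):
        logits = self.mlp(x)                                  # (N, 3)
        choice = F.gumbel_softmax(logits, tau=self.tau, hard=True)
        keep, drop, mask = choice[:, :1], choice[:, 1:2], choice[:, 2:]
        # drop zeroes the node features; mask replaces them with the mean
        # feature (a stand-in for a learned mask embedding).
        return x * keep + torch.zeros_like(x) * drop + x.mean(0, keepdim=True) * mask

x = torch.randn(10, 16)                  # toy features for 10 nodes
gen = LearnableViewGenerator(16)
view1, view2 = gen(x), gen(x)            # two stochastic views for a contrastive loss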
【2】 mGNN: Generalizing the Graph Neural Networks to the Multilayer Case
Link: https://arxiv.org/abs/2109.10119
Authors: Marco Grassia, Manlio De Domenico, Giuseppe Mangioni
Affiliations: Department of Electric, Electronic and Information Engineering, University of Catania
Note: Submitted to the IEEE Computer Society
Abstract: Networks are a powerful tool to model complex systems, and the definition of
many Graph Neural Networks (GNN), Deep Learning algorithms that can handle
networks, has opened a new way to approach many real-world problems that would
otherwise be hard or even intractable. In this paper, we propose mGNN, a framework
meant to generalize GNNs to the case of multi-layer networks, i.e., networks
that can model multiple kinds of interactions and relations between nodes. Our
approach is general (i.e., not task specific) and has the advantage of
extending any type of GNN without any computational overhead. We test the
framework on three different tasks (node and network classification, link
prediction) to validate it.
【3】 Transferability of Graph Neural Networks: an Extended Graphon Approach
Link: https://arxiv.org/abs/2109.10096
Authors: Sohir Maskey, Ron Levie, Gitta Kutyniok
Affiliations: Department of Physics and Technology, University of Tromsø
Abstract: We study spectral graph convolutional neural networks (GCNNs), where filters
are defined as continuous functions of the graph shift operator (GSO) through
functional calculus. A spectral GCNN is not tailored to one specific graph and
can be transferred between different graphs. It is hence important to study the
GCNN transferability: the capacity of the network to have approximately the
same repercussion on different graphs that represent the same phenomenon.
Transferability ensures that GCNNs trained on certain graphs generalize if the
graphs in the test set represent the same phenomena as the graphs in the
training set. In this paper, we consider a model of transferability based on
graphon analysis. Graphons are limit objects of graphs, and, in the graph
paradigm, two graphs represent the same phenomenon if both approximate the same
graphon. Our main contributions can be summarized as follows: 1) we prove that
any fixed GCNN with continuous filters is transferable under graphs that
approximate the same graphon, 2) we prove transferability for graphs that
approximate unbounded graphon shift operators, which are defined in this paper,
and, 3) we obtain non-asymptotic approximation results, proving linear
stability of GCNNs. This extends current state-of-the-art results which show
asymptotic transferability for polynomial filters under graphs that approximate
bounded graphons.
【4】 Graph Neural Networks for Graph Drawing
Link: https://arxiv.org/abs/2109.10061
Authors: Matteo Tiezzi, Gabriele Ciravegna, Marco Gori
Affiliations: Department of Information Engineering and Mathematics, University of Siena; Université Côte d'Azur
Note: Preprint, under review
Abstract: Graph Drawing techniques have been developed in the last few years with the
purpose of producing aesthetically pleasing node-link layouts. Recently, the
employment of differentiable loss functions has paved the road to the massive
usage of Gradient Descent and related optimization algorithms. In this paper,
we propose a novel framework for the development of Graph Neural Drawers (GND),
machines that rely on neural computation for constructing efficient and complex
maps. GND are Graph Neural Networks (GNNs) whose learning process can be driven
by any provided loss function, such as the ones commonly employed in Graph
Drawing. Moreover, we prove that this mechanism can be guided by loss functions
computed by means of Feedforward Neural Networks, on the basis of supervision
hints that express beauty properties, like the minimization of crossing edges.
In this context, we show that GNNs can nicely be enriched by positional
features to also deal with unlabelled vertices. We provide a proof-of-concept
by constructing a loss function for the edge-crossing and provide quantitative
and qualitative comparisons among different GNN models working under the
proposed framework.
【5】 Search For Deep Graph Neural Networks
Link: https://arxiv.org/abs/2109.10047
Authors: Guosheng Feng, Chunnan Wang, Hongzhi Wang
Affiliations: Harbin Institute of Technology; Peng Cheng Laboratory
Abstract: Current GNN-oriented NAS methods focus on the search for different layer
aggregate components with shallow and simple architectures, which are limited
by the over-smoothing problem. To further explore the benefits from structural
diversity and depth of GNN architectures, we propose a GNN generation pipeline
with a novel two-stage search space, which aims at automatically generating
high-performance yet transferable deep GNN models in a block-wise manner.
Meanwhile, to alleviate the over-smoothing problem, we incorporate multiple
flexible residual connections in our search space and apply identity mapping in
the basic GNN layers. For the search algorithm, we use deep Q-learning with an
epsilon-greedy exploration strategy and reward reshaping. Extensive experiments
on real-world datasets show that our generated GNN models outperform existing
manually designed and NAS-based ones.
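
As a reference point for the search procedure, here is a toy Python sketch of epsilon-greedy action selection with a reshaped reward, the two ingredients named above. The state/action encoding, reward shape, and update rule of the actual pipeline are not specified in the abstract, so everything below is an illustrative assumption.

import random
import numpy as np

def epsilon_greedy(q_values, epsilon=0.3):
    """Explore a random architecture choice w.p. epsilon, else exploit."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return int(np.argmax(q_values))

def reshaped_reward(val_acc, moving_baseline, scale=10.0):
    """Reward the improvement over a moving baseline rather than raw accuracy."""
    return scale * (val_acc - moving_baseline)

q = np.zeros(5)                            # Q-values for 5 candidate blocks
action = epsilon_greedy(q)
reward = reshaped_reward(val_acc=0.82, moving_baseline=0.80)
q[action] += 0.1 * (reward - q[action])    # simple Q-learning-style update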
Transformer (2 papers)
【1】 Audiomer: A Convolutional Transformer for Keyword Spotting
Link: https://arxiv.org/abs/2109.10252
Authors: Surya Kant Sahu, Sai Mitheran, Juhi Kamdar, Meet Gandhi
Affiliations: The Learning Machines; National Institute of Technology, Tiruchirappalli; George Mason University
Note: Submitted to NeurIPS 2021 ENLSP Workshop
Abstract: Transformers have seen an unprecedented rise in Natural Language Processing
and Computer Vision tasks. However, in audio tasks, they are either infeasible
to train due to extremely large sequence length of audio waveforms or reach
competitive performance after feature extraction through Fourier-based methods,
incurring a loss-floor. In this work, we introduce an architecture, Audiomer,
where we combine 1D Residual Networks with Performer Attention to achieve
state-of-the-art performance in Keyword Spotting with raw audio waveforms,
outperforming all previous methods while also being computationally cheaper and
much more parameter- and data-efficient. Audiomer allows for deployment in
compute-constrained devices and training on smaller datasets.
【2】 LOTR: Face Landmark Localization Using Localization Transformer
Link: https://arxiv.org/abs/2109.10057
Authors: Ukrit Watchareeruetai, Benjaphan Sommanna, Sanjana Jain, Pavit Noinongyao, Ankush Ganguly, Aubin Samacoits, Samuel W. F. Earp, Nakarin Sritrakool
Affiliations: Sertis Vision Lab, Sukhumvit Road, Watthana, Bangkok, Thailand; Chulalongkorn University, Phayathai Road, Pathum Wan, Bangkok, Thailand
Abstract: This paper presents a novel Transformer-based facial landmark localization
network named Localization Transformer (LOTR). The proposed framework is a
direct coordinate regression approach leveraging a Transformer network to
better utilize the spatial information in the feature map. An LOTR model
consists of three main modules: 1) a visual backbone that converts an input
image into a feature map, 2) a Transformer module that improves the feature
representation from the visual backbone, and 3) a landmark prediction head that
directly predicts the landmark coordinates from the Transformer's
representation. Given cropped-and-aligned face images, the proposed LOTR can be
trained end-to-end without requiring any post-processing steps. This paper also
introduces the smooth-Wing loss function, which addresses the gradient
discontinuity of the Wing loss, leading to better convergence than standard
loss functions such as L1, L2, and Wing loss. Experimental results on the JD
landmark dataset provided by the First Grand Challenge of 106-Point Facial
Landmark Localization indicate the superiority of LOTR over the existing
methods on the leaderboard and two recent heatmap-based approaches.
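
For context, the standard Wing loss (Feng et al., 2018) that smooth-Wing refines can be written in a few lines of PyTorch; note the gradient jump at |x| = w, which the paper's smooth-Wing variant is designed to remove (the smooth variant itself is the paper's contribution and is not reproduced here).

import math
import torch

def wing_loss(pred, target, w=10.0, eps=2.0):
    """Wing loss: logarithmic near zero, linear for large errors."""
    x = (pred - target).abs()
    c = w - w * math.log(1.0 + w / eps)            # makes the two pieces meet at |x| = w
    loss = torch.where(x < w, w * torch.log(1.0 + x / eps), x - c)
    return loss.mean()

pred = torch.randn(4, 106, 2)      # batch of 106 predicted landmark coordinates
target = torch.randn(4, 106, 2)
print(wing_loss(pred, target))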
GAN | adversarial | attacks | generation (3 papers)
【1】 Shape Inference and Grammar Induction for Example-based Procedural Generation
Link: https://arxiv.org/abs/2109.10217
Authors: Gillis Hermans, Thomas Winters, Luc De Raedt
Affiliations: Computer Science Department, KU Leuven, Belgium
Abstract: Designers increasingly rely on procedural generation for automatic generation
of content in various industries. These techniques require extensive knowledge
of the desired content and of how to actually implement such procedural
methods. Algorithms for learning interpretable generative models from example
content could alleviate both difficulties. We propose SIGI, a novel method for
inferring shapes and inducing a shape grammar from grid-based 3D building
examples. This interpretable grammar is well-suited for co-creative design.
Applied to Minecraft buildings, we show how the shape grammar can be used to
automatically generate new buildings in a similar style.
【2】 Scenario generation for market risk models using generative neural networks
Link: https://arxiv.org/abs/2109.10072
Authors: Solveig Flaig, Gero Junike
Abstract: In this research, we show how to expand existing approaches of generative
adversarial networks (GANs) being used as economic scenario generators (ESG) to
a whole internal model - with enough risk factors to model the full bandwidth
of investments for an insurance company and for a one year horizon as required
in Solvency 2. For validation of this approach as well as for optimisation of
the GAN architecture, we develop new performance measures and provide a
consistent, data-driven framework. Finally, we demonstrate that the results of
a GAN-based ESG are similar to regulatory approved internal models in Europe.
Therefore, GAN-based models can be seen as an assumption-free data-driven
alternative way of market risk modelling.
【3】 Modelling Adversarial Noise for Adversarial Defense
Link: https://arxiv.org/abs/2109.09901
Authors: Dawei Zhou, Nannan Wang, Tongliang Liu, Bo Han
Affiliations: Xidian University; The University of Sydney; Hong Kong Baptist University
Abstract: Deep neural networks have been demonstrated to be vulnerable to adversarial
noise, promoting the development of defenses against adversarial attacks.
Traditionally, adversarial defenses typically focus on directly exploiting
adversarial examples to remove adversarial noise or train an adversarially
robust target model. Motivated by the fact that the relationship between
adversarial data and natural data can help infer clean data from adversarial
data and thus obtain the final correct prediction, in this paper we study
modeling adversarial noise to learn the transition relationship in the label
space, so that adversarial labels can be used to improve adversarial accuracy.
Specifically, we introduce
a transition matrix to relate adversarial labels and true labels. By exploiting
the transition matrix, we can directly infer clean labels from adversarial
labels. Then, we propose to employ a deep neural network (i.e., transition
network) to model the instance-dependent transition matrix from adversarial
noise. In addition, we conduct joint adversarial training on the target model
and the transition network to achieve optimal performance. Empirical
evaluations on benchmark datasets demonstrate that our method could
significantly improve adversarial accuracy in comparison to state-of-the-art
methods.
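
The label-transition idea can be illustrated with a fixed matrix and Bayes' rule; the paper instead learns an instance-dependent transition matrix with a network, so the numbers and the uniform prior below are purely illustrative.

import numpy as np

# Hypothetical 3-class transition matrix: T[i, j] = P(adversarial label j | true label i).
T = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]])

def infer_clean(adv_probs, T, prior=None):
    """P(true = i | adv = j) is proportional to T[i, j] * prior[i]."""
    k = T.shape[0]
    prior = np.full(k, 1.0 / k) if prior is None else prior
    posterior = T * prior[:, None]                  # joint P(true = i, adv = j)
    posterior /= posterior.sum(axis=0, keepdims=True)
    return posterior @ adv_probs                    # expected clean-label distribution

adv_probs = np.array([0.1, 0.8, 0.1])               # model output on an adversarial input
print(infer_clean(adv_probs, T))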
Semi-/weakly-/un-/fully-supervised | uncertainty | active learning (6 papers)
【1】 Uncertainty Toolbox: an Open-Source Library for Assessing, Visualizing, and Improving Uncertainty Quantification
Link: https://arxiv.org/abs/2109.10254
Authors: Youngseog Chung, Ian Char, Han Guo, Jeff Schneider, Willie Neiswanger
Affiliations: Robotics Institute and Language Technologies Institute, Carnegie Mellon University; Stanford University
Abstract: With increasing deployment of machine learning systems in various real-world
tasks, there is a greater need for accurate quantification of predictive
uncertainty. While the common goal in uncertainty quantification (UQ) in
machine learning is to approximate the true distribution of the target data,
many works in UQ tend to be disjoint in the evaluation metrics utilized, and
disparate implementations for each metric lead to numerical results that are
not directly comparable across different works. To address this, we introduce
Uncertainty Toolbox, an open-source python library that helps to assess,
visualize, and improve UQ. Uncertainty Toolbox additionally provides
pedagogical resources, such as a glossary of key terms and an organized
collection of key paper references. We hope that this toolbox is useful for
accelerating and uniting research efforts in uncertainty in machine learning.
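
As a flavour of the kind of metric such a library evaluates, the snippet below computes a simple average calibration error for Gaussian predictive distributions. It is an independent illustration in plain NumPy/SciPy, not the toolbox's own API.

import numpy as np
from scipy import stats

def avg_calibration_error(y, mu, sigma, n_levels=19):
    """Mean |expected - observed| coverage over central prediction intervals."""
    levels = np.linspace(0.05, 0.95, n_levels)
    gaps = []
    for p in levels:
        lo, hi = stats.norm.interval(p, loc=mu, scale=sigma)
        gaps.append(abs(np.mean((y >= lo) & (y <= hi)) - p))
    return float(np.mean(gaps))

rng = np.random.default_rng(0)
mu = rng.normal(size=500)
y = mu + rng.normal(scale=1.0, size=500)            # perfectly calibrated case
print(avg_calibration_error(y, mu, np.ones(500)))   # close to 0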
【2】 Self-supervised Representation Learning for Reliable Robotic Monitoring of Fruit Anomalies
Link: https://arxiv.org/abs/2109.10135
Authors: Taeyeong Choi, Owen Would, Adrian Salazar-Gomez, Grzegorz Cielniak
Note: Code and data are available online
Abstract: Data augmentation can be a simple yet powerful tool for autonomous robots to
fully utilise available data for self-supervised identification of atypical
scenes or objects. State-of-the-art augmentation methods arbitrarily embed
structural peculiarity in focal objects on typical images so that classifying
these artefacts can provide guidance for learning representations for the
detection of anomalous visual inputs. In this paper, however, we argue that
learning such structure-sensitive representations can be a suboptimal approach
to some classes of anomaly (e.g., unhealthy fruits) which are better recognised
by a different type of visual element such as "colour". We thus propose Channel
Randomisation as a novel data augmentation method for restricting neural
network models to learn encoding of "colour irregularity" whilst predicting
channel-randomised images to ultimately build reliable fruit-monitoring robots
identifying atypical fruit qualities. Our experiments show that (1) the
colour-based alternative can better learn representations for consistently
accurate identification of fruit anomalies in various fruit species, and (2)
validation accuracy can be monitored for early stopping of training due to
positive correlation between the colour-learning task and fruit anomaly
detection. Moreover, the proposed approach is evaluated on a new anomaly
dataset, Riseholme-2021, consisting of 3.5K strawberry images collected from a
mobile robot, which we share with the community to encourage active
agri-robotics research.
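
One plausible reading of the augmentation is a random permutation of the colour channels, with the network trained to undo or classify the permutation so that it must encode colour; the exact scheme in the paper may differ, so treat this NumPy sketch as an assumption.

import numpy as np

def channel_randomise(image, rng):
    """Return the image with its channels shuffled, plus the permutation used."""
    perm = rng.permutation(image.shape[-1])
    return image[..., perm], perm

rng = np.random.default_rng(42)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
aug, perm = channel_randomise(img, rng)   # e.g. perm = [2, 0, 1]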
【3】 Self-Supervised Action-Space Prediction for Automated Driving
Link: https://arxiv.org/abs/2109.10024
Authors: Faris Janjoš, Maxim Dolgov, J. Marius Zöllner
Abstract: Making informed driving decisions requires reliable prediction of other
vehicles' trajectories. In this paper, we present a novel learned multi-modal
trajectory prediction architecture for automated driving. It achieves
kinematically feasible predictions by casting the learning problem into the
space of accelerations and steering angles -- by performing action-space
prediction, we can leverage valuable model knowledge. Additionally, the
dimensionality of the action manifold is lower than that of the state manifold,
whose intrinsically correlated states are more difficult to capture in a
learned manner. For the purpose of action-space prediction, we present the
simple Feed-Forward Action-Space Prediction (FFW-ASP) architecture. Then, we
build on this notion and introduce the novel Self-Supervised Action-Space
Prediction (SSP-ASP) architecture that outputs future environment context
features in addition to trajectories. A key element in the self-supervised
architecture is that, based on an observed action history and past context
features, future context features are predicted prior to future trajectories.
The proposed methods are evaluated on real-world datasets containing urban
intersections and roundabouts, and show accurate predictions, outperforming
state-of-the-art for kinematically feasible predictions in several prediction
metrics.
【4】 Unsupervised Abstract Reasoning for Raven's Problem Matrices
Link: https://arxiv.org/abs/2109.10011
Authors: Tao Zhuo, Qiang Huang, Mohan Kankanhalli
Affiliations: Shandong Artificial Intelligence Institute; School of Computing, National University of Singapore
Note: Accepted by TIP
Abstract: Raven's Progressive Matrices (RPM) is highly correlated with human
intelligence, and it has been widely used to measure the abstract reasoning
ability of humans. In this paper, to study the abstract reasoning capability of
deep neural networks, we propose the first unsupervised learning method for
solving RPM problems. Since the ground truth labels are not allowed, we design
a pseudo target based on the prior constraints of the RPM formulation to
approximate the ground truth label, which effectively converts the unsupervised
learning strategy into a supervised one. However, the correct answer is wrongly
labelled by the pseudo target, and thus the noisy contrast will lead to
inaccurate model training. To alleviate this issue, we propose to improve the
model performance with negative answers. Moreover, we develop a
decentralization method to adapt the feature representation to different RPM
problems. Extensive experiments on three datasets demonstrate that our method
even outperforms some of the supervised approaches. Our code is available at
https://github.com/visiontao/ncd.
【5】 Discovery of temporal structure intricacy in arterial blood pressure waveforms representing acuity of liver transplant and forecasting short term surgical outcome via unsupervised manifold learning
Link: https://arxiv.org/abs/2109.10258
Authors: Shen-Chih Wang, Chien-Kun Ting, Cheng-Yen Chen, Chin-Su Liu, Niang-Cheng Lin, Che-Chuan Loon, Hau-Tieng Wu, Yu-Ting Lin
Affiliations: Department of Anesthesiology, Taipei Veterans General Hospital, Taipei, Taiwan; School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan; Division of Transplantation Surgery, Taipei Veterans General Hospital, Taipei
Note: 5 figures and 1 table
Abstract: Background: Arterial blood pressure (ABP) waveform evolves across each
consecutive pulse during the liver transplant surgery. We hypothesized that the
quantification of the waveform evolution reflects 1) the acuity of the
recipient undergoing liver transplant and 2) the intraoperative dynamics that
forecasts short-term surgical outcomes. Methods: In this prospective
observational single cohort study on living donor liver transplant surgery, we
extracted the waveform morphological evolution from the ABP data with the
unsupervised manifold learning waveform analysis. Two quantitative indices,
trend movement and fluctuation movement, were developed to represent the
slow-varying and fast-varying dynamics respectively. We investigated the
associations with the liver disease acuity represented with the Model for
End-Stage Liver Disease (MELD) score and the primary outcomes, the early
allograft failure (EAF), as well as the recently developed EAF scores,
including the Liver Graft Assessment Following Transplantation (L-GrAFT) score,
the Early Allograft Failure Simplified Estimation (EASE) score, and the Model
for Early Allograft Function (MEAF) score. Results: Sixty recipients were
enrolled. The presurgical trend movement was correlated with the MELD scores.
It decreased in the anhepatic phase. The neohepatic trend movement correlated
with the L-GrAFT scores, the EASE score, and the MEAF score. Regarding the
constituent of the EAF scores, the trend movement most correlated with the
postoperative day 7 bilirubin. Conclusions: The ABP waveform evolution
intricacy in the presurgical phase reflects the recipient's acuity condition,
while that in the neohepatic phase reveals the short-term surgical outcome
calculated from laboratory data in postoperative days 7-10. The waveform evolution reflects
the intraoperative contribution to the early outcome.
【6】 Unsupervised Domain Adaptation with Semantic Consistency across Heterogeneous Modalities for MRI Prostate Lesion Segmentation
Link: https://arxiv.org/abs/2109.09736
Authors: Eleni Chiou, Francesco Giganti, Shonit Punwani, Iasonas Kokkinos, Eleftheria Panagiotaki
Affiliations: Centre for Medical Image Computing, UCL, London, UK; Department of Computer Science, UCL, London, UK; Department of Radiology, UCLH NHS Foundation Trust, London, UK; Division of Surgery & Interventional Science, UCL, London, UK
Note: Accepted at MICCAI 2021 Workshop on Domain Adaptation and Representation Transfer (DART). arXiv admin note: text overlap with arXiv:2010.07411
Abstract: Any novel medical imaging modality that differs from previous protocols e.g.
in the number of imaging channels, introduces a new domain that is
heterogeneous from previous ones. This common medical imaging scenario is
rarely considered in the domain adaptation literature, which handles shifts
across domains of the same dimensionality. In our work we rely on stochastic
generative modeling to translate across two heterogeneous domains at pixel
space and introduce two new loss functions that promote semantic consistency.
Firstly, we introduce a semantic cycle-consistency loss in the source domain to
ensure that the translation preserves the semantics. Secondly, we introduce a
pseudo-labelling loss, where we translate target data to source, label them by
a source-domain network, and use the generated pseudo-labels to supervise the
target-domain network. Our results show that this allows us to extract
systematically better representations for the target domain. In particular, we
address the challenge of enhancing performance on VERDICT-MRI, an advanced
diffusion-weighted imaging technique, by exploiting labeled mp-MRI data. When
compared to several unsupervised domain adaptation approaches, our approach
yields substantial improvements, that consistently carry over to the
semi-supervised and supervised learning settings.
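
The second loss lends itself to a compact sketch: translate target images into the source domain, label them with the frozen source-domain network, and supervise the target-domain network with those pseudo-labels. The toy 1x1-conv stand-ins below are assumptions that merely make the sketch runnable.

import torch
import torch.nn.functional as F

def pseudo_label_loss(target_images, t2s_translator, source_net, target_net):
    """Cross-entropy of the target net against source-net pseudo-labels."""
    with torch.no_grad():
        source_like = t2s_translator(target_images)       # target -> source domain
        pseudo = source_net(source_like).argmax(dim=1)    # per-pixel hard labels
    return F.cross_entropy(target_net(target_images), pseudo)

t2s = torch.nn.Conv2d(3, 3, 1)       # stand-in translator
src = torch.nn.Conv2d(3, 4, 1)       # stand-in source segmenter, 4 classes
tgt = torch.nn.Conv2d(3, 4, 1)       # target segmenter being trained
loss = pseudo_label_loss(torch.randn(2, 3, 32, 32), t2s, src, tgt)
loss.backward()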
Transfer | zero/few/one-shot | adaptation (5 papers)
【1】 Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents
Link: https://arxiv.org/abs/2109.10341
Authors: Biao Zhang, Ankur Bapna, Melvin Johnson, Ali Dabirmoghaddam, Naveen Arivazhagan, Orhan Firat
Affiliations: School of Informatics, University of Edinburgh; Google Research
Abstract: Document-level neural machine translation (DocNMT) delivers coherent
translations by incorporating cross-sentence context. However, for most
language pairs there's a shortage of parallel documents, although parallel
sentences are readily available. In this paper, we study whether and how
contextual modeling in DocNMT is transferable from sentences to documents in a
zero-shot fashion (i.e. no parallel documents for student languages) through
multilingual modeling. Using simple concatenation-based DocNMT, we explore the
effect of 3 factors on multilingual transfer: the number of document-supervised
teacher languages, the data schedule for parallel documents at training, and
the data condition of parallel documents (genuine vs. backtranslated). Our
experiments on Europarl-7 and IWSLT-10 datasets show the feasibility of
multilingual transfer for DocNMT, particularly on document-specific metrics. We
observe that more teacher languages and adequate data schedule both contribute
to better transfer quality. Surprisingly, the transfer is less sensitive to the
data condition and multilingual DocNMT achieves comparable performance with
both back-translated and genuine document pairs.
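
Concatenation-based DocNMT reduces document context to data preparation: consecutive sentence pairs are joined into one longer training example. A minimal sketch, with the separator token and window size as assumptions:

def make_doc_examples(src_sents, tgt_sents, window=3, sep=" </s> "):
    """Join up to `window` consecutive sentence pairs into one example."""
    examples = []
    for i in range(0, len(src_sents), window):
        examples.append((sep.join(src_sents[i:i + window]),
                         sep.join(tgt_sents[i:i + window])))
    return examples

src = ["Er kam spät an.", "Der Zug hatte Verspätung.", "Niemand wartete."]
tgt = ["He arrived late.", "The train was delayed.", "No one was waiting."]
print(make_doc_examples(src, tgt))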
【2】 Adaptive Reliability Analysis for Multi-fidelity Models using a Collective Learning Strategy
Link: https://arxiv.org/abs/2109.10219
Authors: Chi Zhang, Chaolin Song, Abdollah Shafieezadeh
Affiliations: Risk Assessment and Management of Structural and Infrastructure Systems (RAMSIS) Lab, Department of Civil, Environmental, and Geodetic Engineering, The Ohio State University, Columbus, OH, United States
Abstract: In many fields of science and engineering, models with different fidelities
are available. Physical experiments or detailed simulations that accurately
capture the behavior of the system are regarded as high-fidelity models with
low model uncertainty; however, they are expensive to run. On the other hand,
simplified physical experiments or numerical models are seen as low-fidelity
models that are cheaper to evaluate. Although low-fidelity models are often not
suitable for direct use in reliability analysis due to their low accuracy, they
can offer information about the trend of the high-fidelity model thus providing
the opportunity to explore the design space at a low cost. This study presents
a new approach called adaptive multi-fidelity Gaussian process for reliability
analysis (AMGPRA). Contrary to selecting training points and information
sources in two separate stages as done in state-of-the-art mfEGRA method, the
proposed approach finds the optimal training point and information source
simultaneously using the novel collective learning function (CLF). CLF is able
to assess the global impact of a candidate training point from an information
source and it accommodates any learning function that satisfies a certain
profile. In this context, CLF provides a new direction for quantifying the
impact of new training points and can be easily extended with new learning
functions to adapt to different reliability problems. The performance of the
proposed method is demonstrated by three mathematical examples and one
engineering problem concerning the wind reliability of transmission towers. It
is shown that the proposed method achieves similar or higher accuracy with
reduced computational costs compared to state-of-the-art single and
multi-fidelity methods. A key application of AMGPRA is high-fidelity fragility
modeling using complex and costly physics-based computational models.
【3】 Learning Adaptive Control for SE(3) Hamiltonian Dynamics
Link: https://arxiv.org/abs/2109.09974
Authors: Thai Duong, Nikolay Atanasov
Affiliations: Department of Electrical and Computer Engineering, University of California San Diego
Note: Project website: this https URL
Abstract: Fast adaptive control is a critical component for reliable robot autonomy in
rapidly changing operational conditions. While a robot dynamics model may be
obtained from first principles or learned from data, updating its parameters is
often too slow for online adaptation to environment changes. This motivates the
use of machine learning techniques to learn disturbance descriptors from
trajectory data offline as well as the design of adaptive control to estimate
and compensate the disturbances online. This paper develops adaptive geometric
control for rigid-body systems, such as ground, aerial, and underwater
vehicles, that satisfy Hamilton's equations of motion over the SE(3) manifold.
Our design consists of an offline system identification stage, followed by an
online adaptive control stage. In the first stage, we learn a Hamiltonian model
of the system dynamics using a neural ordinary differential equation (ODE)
network trained from state-control trajectory data with different disturbance
realizations. The disturbances are modeled as a linear combination of nonlinear
descriptors. In the second stage, we design a trajectory tracking controller
with disturbance compensation from an energy-based perspective. An adaptive
control law is employed to adjust the disturbance model online proportional to
the geometric tracking errors on the SE(3) manifold. We verify our adaptive
geometric controller for trajectory tracking on a fully-actuated pendulum and
an under-actuated quadrotor.
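
The adaptation step admits a one-line schematic: nudge the linear disturbance coefficients in proportion to the tracking error through the descriptor features. The Euclidean form below is an illustrative simplification; the paper's law is defined with geometric errors on SE(3), and all names here are placeholders.

import numpy as np

def adapt_disturbance(theta, descriptors, tracking_error, gain=0.1):
    """One adaptation step for disturbance = descriptors @ theta."""
    return theta + gain * descriptors.T @ tracking_error

theta = np.zeros(4)                       # coefficients of 4 nonlinear descriptors
phi = np.random.randn(6, 4)               # descriptor values at the current state
err = np.random.randn(6)                  # placeholder for the geometric tracking error
theta = adapt_disturbance(theta, phi, err)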
【4】 Improving Span Representation for Domain-adapted Coreference Resolution
Link: https://arxiv.org/abs/2109.09811
Authors: Nupoor Gandhi, Anjalie Field, Yulia Tsvetkov
Affiliations: Carnegie Mellon University; University of Washington
Abstract: Recent work has shown that fine-tuning neural coreference models can produce
strong performance when adapting to different domains. However, at the same
time, this can require a large amount of annotated target examples. In this
work, we focus on supervised domain adaptation for clinical notes, proposing
the use of concept knowledge to more efficiently adapt coreference models to a
new domain. We develop methods to improve the span representations via (1) a
retrofitting loss to incentivize span representations to satisfy a
knowledge-based distance function and (2) a scaffolding loss to guide the
recovery of knowledge from the span representation. By integrating these
losses, our model is able to improve the baseline precision and F1 score. In
particular, we show that incorporating knowledge with end-to-end coreference
models results in better performance on the most challenging, domain-specific
spans.
【5】 MetaMedSeg: Volumetric Meta-learning for Few-Shot Organ Segmentation
Link: https://arxiv.org/abs/2109.09734
Authors: Anastasia Makarevich, Azade Farshad, Vasileios Belagiannis, Nassir Navab
Affiliations: Technical University of Munich, Munich, Germany; Ulm University, Ulm, Germany
Abstract: The lack of sufficient annotated image data is a common issue in medical
image segmentation. For some organs and densities, the annotation may be
scarce, leading to poor model training convergence, while other organs have
plenty of annotated data. In this work, we present MetaMedSeg, a gradient-based
meta-learning algorithm that redefines the meta-learning task for the
volumetric medical data with the goal of capturing the variety between the
slices. We also explore different weighting schemes for gradient aggregation,
arguing that different tasks might have different complexity, and hence,
contribute differently to the initialization. We propose an importance-aware
weighting scheme to train our model. In the experiments, we present an
evaluation of the Medical Decathlon dataset by extracting 2D slices from CT and
MRI volumes of different organs and performing semantic segmentation. The
results show that our proposed volumetric task definition leads to up to 30%
improvement in terms of IoU compared to related baselines. The proposed update
rule is also shown to improve the performance for complex scenarios where the
data distribution of the target organ is very different from the source organs.
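
One plausible instantiation of an importance-aware scheme is to weight per-task meta-gradients by a softmax over task losses, so that harder tasks pull the shared initialization more. The paper's exact weights may differ, so the sketch below is an assumption.

import numpy as np

def aggregate_meta_gradients(task_grads, task_losses, temperature=1.0):
    """Softmax-over-loss weighting of per-task gradients."""
    w = np.exp(np.asarray(task_losses) / temperature)
    w /= w.sum()
    return sum(wi * g for wi, g in zip(w, task_grads))

grads = [np.random.randn(4) for _ in range(3)]       # toy per-task gradients
meta_grad = aggregate_meta_gradients(grads, task_losses=[0.9, 0.4, 1.3])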
Reinforcement learning (4 papers)
【1】 Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks
Link: https://arxiv.org/abs/2109.10312
Authors: Bohan Wu, Suraj Nair, Li Fei-Fei, Chelsea Finn
Affiliations: Stanford University, Stanford, CA
Note: Equal advising and contribution for last two authors
Abstract: In this paper, we study the problem of learning a repertoire of low-level
skills from raw images that can be sequenced to complete long-horizon
visuomotor tasks. Reinforcement learning (RL) is a promising approach for
acquiring short-horizon skills autonomously. However, the focus of RL
algorithms has largely been on the success of those individual skills, more so
than learning and grounding a large repertoire of skills that can be sequenced
to complete extended multi-stage tasks. The latter demands robustness and
persistence, as errors in skills can compound over time, and may require the
robot to have a number of primitive skills in its repertoire, rather than just
one. To this end, we introduce EMBR, a model-based RL method for learning
primitive skills that are suitable for completing long-horizon visuomotor
tasks. EMBR learns and plans using a learned model, critic, and success
classifier, where the success classifier serves both as a reward function for
RL and as a grounding mechanism to continuously detect if the robot should
retry a skill when unsuccessful or under perturbations. Further, the learned
model is task-agnostic and trained using data from all skills, enabling the
robot to efficiently learn a number of distinct primitives. These visuomotor
primitive skills and their associated pre- and post-conditions can then be
directly combined with off-the-shelf symbolic planners to complete long-horizon
tasks. On a Franka Emika robot arm, we find that EMBR enables the robot to
complete three long-horizon visuomotor tasks at 85% success rate, such as
organizing an office desk, a file cabinet, and drawers, which require
sequencing up to 12 skills, involve 14 unique learned primitives, and demand
generalization to novel objects.
【2】 Learning offline: memory replay in biological and artificial reinforcement learning
Link: https://arxiv.org/abs/2109.10034
Authors: Emma L. Roscow, Raymond Chua, Rui Ponte Costa, Matt W. Jones, Nathan Lepora
Affiliations: Centre de Recerca Matemàtica, Bellaterra, Spain; McGill University and Mila, Montréal, Canada; Bristol Computational Neuroscience Unit, Intelligent Systems Lab
Note: In press at Trends in Neurosciences
Abstract: Learning to act in an environment to maximise rewards is among the brain's
key functions. This process has often been conceptualised within the framework
of reinforcement learning, which has also gained prominence in machine learning
and artificial intelligence (AI) as a way to optimise decision-making. A common
aspect of both biological and machine reinforcement learning is the
reactivation of previously experienced episodes, referred to as replay. Replay
is important for memory consolidation in biological neural networks, and is key
to stabilising learning in deep neural networks. Here, we review recent
developments concerning the functional roles of replay in the fields of
neuroscience and AI. Complementary progress suggests how replay might support
learning processes, including generalisation and continual learning, affording
opportunities to transfer knowledge across the two fields to advance the
understanding of biological and artificial learning and memory.
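
For readers coming from the biology side, the machine-learning notion of replay is usually just a buffer of past transitions sampled uniformly (or by priority) during training; a minimal uniform version:

import random
from collections import deque

class ReplayBuffer:
    """Store transitions and replay random minibatches of past experience."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return list(zip(*random.sample(self.buffer, batch_size)))

buf = ReplayBuffer()
for t in range(100):
    buf.push(t, 0, 1.0, t + 1, False)
states, actions, rewards, next_states, dones = buf.sample(8)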
【3】 A Simple Unified Framework for Anomaly Detection in Deep Reinforcement Learning
Link: https://arxiv.org/abs/2109.09889
Authors: Hongming Zhang, Ke Sun, Bo Xu, Linglong Kong, Martin Müller
Affiliations: Department of Computing Science, University of Alberta; Department of Mathematical and Statistical Sciences, University of Alberta; Institute of Automation, Chinese Academy of Sciences
Note: 15 pages, 18 figures
Abstract: Abnormal states in deep reinforcement learning (RL) are states that are
beyond the scope of an RL policy. Such states may make the RL system unsafe and
impede its deployment in real scenarios. In this paper, we propose a simple yet
effective anomaly detection framework for deep RL algorithms that
simultaneously considers random, adversarial, and out-of-distribution (OOD)
state outliers. In particular, we attain the class-conditional distributions
for each action class under the Gaussian assumption, and rely on these
distributions to discriminate between inliers and outliers based on Mahalanobis
Distance (MD) and Robust Mahalanobis Distance. We conduct extensive experiments
on Atari games that verify the effectiveness of our detection strategies. To
the best of our knowledge, we present the first in-detail study of statistical
and adversarial anomaly detection in deep RL algorithms. This simple unified
anomaly detection paves the way towards deploying safe RL systems in real-world
applications.
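
The detection rule reduces to a few lines once features are extracted: score a state by its minimum Mahalanobis distance to the class-conditional Gaussians, one per action class. Feature extraction and the covariance estimate are model-specific, so the toy inputs below are assumptions.

import numpy as np

def mahalanobis_scores(features, class_means, shared_cov):
    """Min Mahalanobis distance to any action class; large => likely outlier."""
    prec = np.linalg.inv(shared_cov)
    dists = [np.einsum("ij,jk,ik->i", features - mu, prec, features - mu)
             for mu in class_means]
    return np.min(np.stack(dists, axis=1), axis=1)

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))                     # 5 states, 8-dim features
means = [rng.normal(size=8) for _ in range(4)]      # one mean per action class
print(mahalanobis_scores(feats, means, np.eye(8)))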
【4】 Reinforcement Learning for Finite-Horizon Restless Multi-Armed Multi-Action Bandits
Link: https://arxiv.org/abs/2109.09855
Authors: Guojun Xiong, Jian Li, Rahul Singh
Affiliations: SUNY-Binghamton University, Binghamton, NY; Indian Institute of Science, Bengaluru, Karnataka, India
Abstract: We study a finite-horizon restless multi-armed bandit problem with multiple
actions, dubbed R(MA)^2B. The state of each arm evolves according to a
controlled Markov decision process (MDP), and the reward of pulling an arm
depends on both the current state of the corresponding MDP and the action
taken. The goal is to sequentially choose actions for arms so as to maximize
the expected value of the cumulative rewards collected. Since finding the
optimal policy is typically intractable, we propose a computationally appealing
index policy which we call Occupancy-Measured-Reward Index Policy. Our policy
is well-defined even if the underlying MDPs are not indexable. We prove that it
is asymptotically optimal when the activation budget and number of arms are
scaled up, while keeping their ratio as a constant. For the case when the
system parameters are unknown, we develop a learning algorithm. Our learning
algorithm uses the principle of optimism in the face of uncertainty and further
uses a generative model in order to fully exploit the structure of
Occupancy-Measured-Reward Index Policy. We call it the R(MA)^2B-UCB algorithm.
As compared with the existing algorithms, R(MA)^2B-UCB performs close to an
offline optimum policy, and also achieves a sub-linear regret with a low
computational complexity. Experimental results show that R(MA)^2B-UCB
outperforms the existing algorithms in both regret and run time.
Meta-learning (1 paper)
【1】 Meta-Model Structure Selection: Building Polynomial NARX Model for Regression and Classification
Link: https://arxiv.org/abs/2109.09917
Authors: W. R. Lacerda Junior, S. A. M. Martins, E. G. Nepomuceno
Abstract: This work presents a new meta-heuristic approach to select the structure of
polynomial NARX models for regression and classification problems. The method
takes into account the complexity of the model and the contribution of each
term to build parsimonious models by proposing a new cost function formulation.
The robustness of the new algorithm is tested on several simulated and
experimental systems with different nonlinear characteristics. The obtained
results show that the proposed algorithm is capable of identifying the correct
model for cases where the proper model structure is known, and of determining
parsimonious models for experimental data even for those systems for which
traditional and contemporary methods habitually fail. The new algorithm is
validated against classical methods such as FROLS and recent randomized
approaches.
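
To ground the terminology: a polynomial NARX model is a linear-in-parameters combination of lagged input/output monomials, and structure selection picks a parsimonious subset of candidate terms like y[k-1]*u[k-2]. A small NumPy sketch of the candidate-term construction (lag orders and degree are illustrative):

import numpy as np
from itertools import combinations_with_replacement

def narx_candidate_terms(y, u, ny=2, nu=2, degree=2):
    """Build the candidate regressor matrix of polynomial NARX terms."""
    n0 = max(ny, nu)
    lags = [y[n0 - i:len(y) - i] for i in range(1, ny + 1)] \
         + [u[n0 - j:len(u) - j] for j in range(1, nu + 1)]
    names = [f"y[k-{i}]" for i in range(1, ny + 1)] \
          + [f"u[k-{j}]" for j in range(1, nu + 1)]
    cols, terms = [np.ones(len(y) - n0)], ["1"]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(lags)), d):
            cols.append(np.prod([lags[i] for i in combo], axis=0))
            terms.append("*".join(names[i] for i in combo))
    return np.column_stack(cols), terms

rng = np.random.default_rng(0)
u = rng.normal(size=50)
y = np.zeros(50)
for k in range(2, 50):                       # toy nonlinear system
    y[k] = 0.5 * y[k - 1] + 0.8 * u[k - 1] - 0.2 * y[k - 1] * u[k - 2]
X, terms = narx_candidate_terms(y, u)        # selection keeps a sparse subset of X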
Medicine-related (4 papers)
【1】 Comparison of single and multitask learning for predicting cognitive decline based on MRI data
Link: https://arxiv.org/abs/2109.10266
Authors: Vandad Imani, Mithilesh Prakash, Marzieh Zare, Jussi Tohka
Affiliations: Alzheimer's Disease Neuroimaging Initiative
Abstract: The Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog) is a
neuropsychological tool that has been designed to assess the severity of
cognitive symptoms of dementia. Personalized prediction of the changes in
ADAS-Cog scores could help in timing therapeutic interventions in dementia and
at-risk populations. In the present work, we compared single and multitask
learning approaches to predict the changes in ADAS-Cog scores based on
T1-weighted anatomical magnetic resonance imaging (MRI). In contrast to most
machine learning-based methods for predicting ADAS-Cog changes, we stratified the
subjects based on their baseline diagnoses and evaluated the prediction
performances in each group. Our experiments indicated a positive relationship
between the predicted and observed ADAS-Cog score changes in each diagnostic
group, suggesting that T1-weighted MRI has a predictive value for evaluating
cognitive decline in the entire AD continuum. We further studied whether
correction of the differences in the magnetic field strength of MRI would
improve the ADAS-Cog score prediction. The partial least square-based domain
adaptation slightly improved the prediction performance, but the improvement
was marginal. In summary, this study demonstrated that ADAS-Cog change could
be, to some extent, predicted based on anatomical MRI. Based on this study, the
recommended method for learning the predictive models is a single-task
regularized linear regression due to its simplicity and good performance. It
appears important to combine the training data across all subject groups for
the most effective predictive models.
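
The recommended recipe - one regularized linear model trained on pooled data, evaluated within each baseline-diagnosis stratum - is a few lines of scikit-learn; the data below are random stand-ins, not ADNI.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 10))               # stand-in MRI-derived features
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=120)  # ADAS-Cog change
group = rng.choice(["CN", "MCI", "AD"], size=120)              # baseline diagnosis

model = Ridge(alpha=1.0).fit(X, y)           # single task, pooled training data
for g in ["CN", "MCI", "AD"]:
    m = group == g
    print(g, round(r2_score(y[m], model.predict(X[m])), 3))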
【2】 NADE: A Benchmark for Robust Adverse Drug Events Extraction in Face of Negations
Link: https://arxiv.org/abs/2109.10080
Authors: Simone Scaboro, Beatrice Portelli, Emmanuele Chersoni, Enrico Santus, Giuseppe Serra
Affiliations: University of Udine, Italy; The Hong Kong Polytechnic University, Hong Kong; DSIG - Bayer Pharmaceuticals, New Jersey, USA
Note: W-NUT Workshop, EMNLP 2021
Abstract: Adverse Drug Event (ADE) extraction models can rapidly examine large
collections of social media texts, detecting mentions of drug-related adverse
reactions and triggering medical investigations. However, despite the recent
advances in NLP, it is currently unknown if such models are robust in the face
of negation, which is pervasive across language varieties. In this paper we
evaluate three state-of-the-art systems, showing their fragility against
negation, and then we introduce two possible strategies to increase the
robustness of these models: a pipeline approach, relying on a specific
component for negation detection, and an augmentation of an ADE extraction
dataset to artificially create negated samples and further train the models. We
show that both strategies bring significant increases in performance, lowering
the number of spurious entities predicted by the models. Our dataset and code
will be publicly released to encourage research on the topic.
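
The augmentation strategy can be pictured as template-based negation of existing ADE mentions; the templates here are illustrative stand-ins, since the paper derives its negated samples from the actual corpus.

NEGATION_TEMPLATES = [
    "I did not experience {ade} after taking {drug}.",
    "No {ade} so far while on {drug}.",
]

def make_negated_samples(ade, drug):
    """Negated mentions labelled with no ADE span, used for further training."""
    return [t.format(ade=ade, drug=drug) for t in NEGATION_TEMPLATES]

print(make_negated_samples("headache", "ibuprofen"))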
【3】 Clinical Validation of Single-Chamber Model-Based Algorithms Used to Estimate Respiratory Compliance
Link: https://arxiv.org/abs/2109.10224
Authors: Gregory Rehm, Jimmy Nguyen, Chelsea Gilbeau, Marc T Bomactao, Chen-Nee Chuah, Jason Adams
Affiliations: Department of Respiratory Care, University of California Davis Health System; Department of Computer and Electrical Engineering, University of California Davis
Abstract: Non-invasive estimation of respiratory physiology using computational
algorithms promises to be a valuable technique for future clinicians to detect
detrimental changes in patient pathophysiology. However, few clinical
algorithms used to non-invasively analyze lung physiology have undergone
rigorous validation in a clinical setting, and are often validated either using
mechanical devices, or with small clinical validation datasets using 2-8
patients. This work aims to improve this situation by first, establishing an
open, and clinically validated dataset comprising data from both mechanical
lungs and nearly 40,000 breaths from 18 intubated patients. Next, we use this
data to evaluate 15 different algorithms that use the "single chamber" model of
estimating respiratory compliance. We evaluate these algorithms under varying
clinical scenarios that patients typically experience during hospitalization. In
particular, we explore algorithm performance under four different types of
patient ventilator asynchrony. We also analyze algorithms under varying
ventilation modes to benchmark algorithm performance and to determine if
ventilation mode has any impact on the algorithm. Our approach yields several
advances by 1) showing which specific algorithms work best clinically under
varying mode and asynchrony scenarios, 2) developing a simple mathematical
method to reduce variance in algorithmic results, and 3) presenting additional
insights about single-chamber model algorithms. We hope that our paper,
approach, dataset, and software framework can thus be used by future
researchers to improve their work and allow future integration of "single
chamber" algorithms into clinical practice.
【4】 An Optimal Control Framework for Joint-channel Parallel MRI Reconstruction without Coil Sensitivities
Link: https://arxiv.org/abs/2109.09738
Authors: Wanyu Bian, Yunmei Chen, Xiaojing Ye
Affiliations: Department of Mathematics, University of Florida, Gainesville, FL, USA; Department of Mathematics and Statistics, Georgia State University, Atlanta, GA, USA
Note: 13 pages
Abstract: Goal: This work aims at developing a novel calibration-free fast parallel MRI
(pMRI) reconstruction method incorporate with discrete-time optimal control
framework. The reconstruction model is designed to learn a regularization that
combines channels and extracts features by leveraging the information sharing
among channels of multi-coil images. We propose to recover both magnitude and
phase information by taking advantage of structured multilayer convolutional
networks in image and Fourier spaces. Methods: We develop a novel variational
model with a learnable objective function that integrates an adaptive
multi-coil image combination operator and effective image regularization in the
image and Fourier spaces. We cast the reconstruction network as a structured
discrete-time optimal control system, resulting in an optimal control
formulation of parameter training where the parameters of the objective
function play the role of control variables. We demonstrate that the Lagrangian
method for solving the control problem is equivalent to back-propagation,
ensuring the local convergence of the training algorithm. Results: We conduct a
large number of numerical experiments of the proposed method with comparisons
to several state-of-the-art pMRI reconstruction networks on real pMRI datasets.
The numerical results clearly demonstrate the promising performance of the
proposed method. Conclusion: The proposed method provides a general deep
network design and training framework for efficient joint-channel pMRI
reconstruction. Significance: By learning multi-coil image combination operator
and performing regularizations in both image domain and k-space domain, the
proposed method achieves a highly efficient image reconstruction network for
pMRI.
聚类(1篇)
【1】 Consistency of spectral clustering for directed network community detection
标题:有向网络社区检测的谱聚类一致性
链接:https://arxiv.org/abs/2109.10319
作者:Huan Qing,Jingli Wang
机构:School of Mathematics, China University of Mining and Technology; School of Statistics and Data Science, Nankai University
摘要:定向网络出现在许多领域,如生物学、社会学、生理学和计算机科学。然而,目前大多数网络分析忽略了方向。在本文中,我们构造了一种基于邻接矩阵奇异值分解的谱聚类方法来检测有向随机块模型(DiSBM)中的群体。通过考虑稀疏性参数,在一些温和的条件下,我们证明了所提出的方法能够在不同程度的缩放下一致地恢复隐藏的行和列社区。通过考虑行和列节点的程度异质性,我们进一步建立了有向度修正随机块模型(DiDCSBM)的理论框架。结果表明,谱聚类方法在一定程度的异质性约束下,能够稳定地对行簇和列簇进行一致的群落检测。我们在DiSBM和DiDCSBM下的理论结果为一些特殊的有向网络提供了一些创新,如具有平衡簇的有向网络、节点度数相似的有向网络和有向Erdős-Rényi图。此外,当DiDCSBM退化为DiSBM时,我们在DiDCSBM下的理论结果与在DiSBM下的理论结果是一致的。
摘要:Directed networks appear in various areas, such as biology, sociology,
physiology and computer science. However, at present, most network analysis
ignores the direction. In this paper, we construct a spectral clustering method
based on the singular decomposition of the adjacency matrix to detect community
in directed stochastic block model (DiSBM). By considering a sparsity
parameter, under some mild conditions, we show the proposed approach can
consistently recover hidden row and column communities for different scaling of
degrees.
By considering the degree heterogeneity of both row and column nodes, we
further establish a theoretical framework for directed degree corrected
stochastic block model (DiDCSBM). We show that the spectral clustering method
stably yields consistent community detection for row clusters and column
clusters under mild constraints on the degree heterogeneity. Our theoretical
results under DiSBM and DiDCSBM provide some innovations on some special
directed networks, such as directed network with balanced clusters, directed
network with nodes enjoying similar degrees, and the directed Erd\"os-R\'enyi
graph. Furthermore, our theoretical results under DiDCSBM are consistent with
those under DiSBM when DiDCSBM degenerates to DiSBM.
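As an illustration of the described procedure, here is a minimal sketch, with the clustering step (k-means) and the toy DiSBM below as our assumptions:

```python
# Embed row/column nodes with the leading singular vectors of the adjacency
# matrix, then cluster each embedding separately.
import numpy as np
from sklearn.cluster import KMeans

def directed_spectral_clustering(A, k):
    """Row communities from left singular vectors, column ones from right."""
    U, _, Vt = np.linalg.svd(A)
    row_labels = KMeans(n_clusters=k, n_init=10).fit_predict(U[:, :k])
    col_labels = KMeans(n_clusters=k, n_init=10).fit_predict(Vt[:k, :].T)
    return row_labels, col_labels

# Toy directed SBM: two blocks with an asymmetric connectivity matrix.
rng = np.random.default_rng(0)
B = np.array([[0.30, 0.05], [0.02, 0.25]])
z = np.repeat([0, 1], 50)                        # ground-truth memberships
A = rng.binomial(1, B[z][:, z])                  # 100 x 100 directed adjacency
rows, cols = directed_spectral_clustering(A, k=2)
```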
自动驾驶|车辆|车道检测等(3篇)
【1】 Short-term traffic prediction using physics-aware neural networks
标题:基于物理感知神经网络的短期交通量预测
链接:https://arxiv.org/abs/2109.10253
作者:Mike Pereira,Annika Lang,Balázs Kulcsár
机构:Chalmers University of Technology
备注:17 pages, 11 figures, 2 tables
摘要:在这项工作中,我们提出了一种算法,使用过去的流量测量值对一段道路上的车辆流量进行短期预测。该算法基于物理感知的递归神经网络。宏观交通流模型的离散化(使用所谓的交通反应模型)嵌入到网络结构中,并根据估计和预测的时空相关交通参数生成流量预测。这些参数本身是使用一系列LSTM和简单递归神经网络获得的。此外,在预测的基础上,该算法对其输入进行平滑处理,这也受到宏观交通流模型的物理约束。该算法在从环形探测器获得的原始通量测量上进行了测试。
摘要:In this work, we propose an algorithm performing short-term predictions of
the flux of vehicles on a stretch of road, using past measurements of the flux.
This algorithm is based on a physics-aware recurrent neural network. A
discretization of a macroscopic traffic flow model (using the so-called Traffic
Reaction Model) is embedded in the architecture of the network and yields flux
predictions based on estimated and predicted space-time dependent traffic
parameters. These parameters are themselves obtained using a succession of LSTM
and simple recurrent neural networks. Moreover, on top of the predictions, the
algorithm yields a smoothing of its inputs which is also physically-constrained
by the macroscopic traffic flow model. The algorithm is tested on raw flux
measurements obtained from loop detectors.
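For intuition on the kind of physics layer the paper embeds, below is a generic, minimal finite-volume step for a macroscopic (LWR-type) traffic model; the paper uses the Traffic Reaction Model instead, and the Greenshields flux here is our illustrative assumption:

```python
# One conservative finite-volume step with demand/supply interface fluxes.
import numpy as np

def greenshields_flux(rho, v_max=1.0, rho_max=1.0):
    return v_max * rho * (1.0 - rho / rho_max)

def flow_step(rho, dt, dx, rho_crit=0.5):
    demand = greenshields_flux(np.minimum(rho, rho_crit))   # upstream can send
    supply = greenshields_flux(np.maximum(rho, rho_crit))   # downstream can take
    flux = np.minimum(demand[:-1], supply[1:])              # interface fluxes
    new = rho.copy()
    new[1:-1] -= dt / dx * (flux[1:] - flux[:-1])           # conservation update
    return new

rho = np.full(100, 0.2); rho[40:60] = 0.8        # a platoon of dense traffic
for _ in range(50):                              # CFL: v_max * dt / dx <= 1
    rho = flow_step(rho, dt=0.4, dx=1.0)
```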
【2】 Fast nonlinear risk assessment for autonomous vehicles using learned conditional probabilistic models of agent futures
标题:基于智能体期货学习条件概率模型的自动驾驶车辆快速非线性风险评估
链接:https://arxiv.org/abs/2109.09975
作者:Ashkan Jasour,Xin Huang,Allen Wang,Brian C. Williams
机构:Massachusetts Institute of Technology (MIT)
备注:Accepted at Autonomous Robots. Author version with 11 pages, 5 figures, 2 tables. arXiv admin note: substantial text overlap with arXiv:2005.13458
摘要:本文提出了一种快速的基于非抽样的方法来评估由深度神经网络(DNN)生成的其他代理的未来概率预测时自动车辆轨迹的风险。所提出的方法解决了不确定预测的广泛表示,包括高斯和非高斯混合模型,以预测代理位置和控制场景上下文条件下的输入。我们表明,当学习代理位置的高斯混合模型(GMM)时,风险评估问题可以用现有的数值方法快速求解到任意精度水平。为了解决agent位置的非高斯混合模型的风险评估问题,我们提出使用非线性Chebyshev不等式和平方和(SOS)规划寻找风险上界;它们都很有趣,因为前者的速度要快得多,而后者可以任意拉紧。这些方法只需要代理位置的高阶统计矩来确定风险上界。为了在学习agent控制输入(相对于位置)的模型时执行风险评估,我们通过非线性运动动力学传播不确定控制输入的力矩,以获得规划范围内不确定位置的精确力矩。为此,我们构造了确定性线性动力系统,在存在不确定控制输入的情况下,控制不确定位置矩的精确时间演化。所提出的方法在Argoverse和CARLA数据集上训练的DNN的实际预测上得到了验证,并被证明是快速评估低概率事件概率的有效方法。
摘要:This paper presents fast non-sampling based methods to assess the risk for
trajectories of autonomous vehicles when probabilistic predictions of other
agents' futures are generated by deep neural networks (DNNs). The presented
methods address a wide range of representations for uncertain predictions
including both Gaussian and non-Gaussian mixture models to predict both agent
positions and control inputs conditioned on the scene contexts. We show that
the problem of risk assessment when Gaussian mixture models (GMMs) of agent
positions are learned can be solved rapidly to arbitrary levels of accuracy
with existing numerical methods. To address the problem of risk assessment for
non-Gaussian mixture models of agent position, we propose finding upper bounds
on risk using nonlinear Chebyshev's Inequality and sums-of-squares (SOS)
programming; they are both of interest as the former is much faster while the
latter can be arbitrarily tight. These approaches only require higher order
statistical moments of agent positions to determine upper bounds on risk. To
perform risk assessment when models are learned for agent control inputs as
opposed to positions, we propagate the moments of uncertain control inputs
through the nonlinear motion dynamics to obtain the exact moments of uncertain
position over the planning horizon. To this end, we construct deterministic
linear dynamical systems that govern the exact time evolution of the moments of
uncertain position in the presence of uncertain control inputs. The presented
methods are demonstrated on realistic predictions from DNNs trained on the
Argoverse and CARLA datasets and are shown to be effective for rapidly
assessing the probability of low probability events.
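As a simplified illustration of the moment-based bounds, the sketch below applies Cantelli's one-sided Chebyshev inequality to the squared obstacle distance; the paper additionally uses SOS programming and exact moment propagation, which are not reproduced here:

```python
# Bound P(D^2 <= r^2) using only the first two moments of D^2.
import numpy as np

def collision_risk_upper_bound(mean_d2, var_d2, r2):
    gap = mean_d2 - r2
    if gap <= 0:
        return 1.0                               # bound is vacuous here
    return var_d2 / (var_d2 + gap ** 2)          # Cantelli: P(X <= mu - a)

# Monte Carlo sanity check with a Gaussian predicted position:
rng = np.random.default_rng(1)
pos = rng.normal([2.0, 0.0], 0.5, size=(100_000, 2))
d2 = (pos ** 2).sum(axis=1)                      # squared distance to origin
bound = collision_risk_upper_bound(d2.mean(), d2.var(), r2=1.0)
print(f"bound = {bound:.4f}, empirical = {(d2 <= 1.0).mean():.4f}")
```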
【3】 SFFDD: Deep Neural Network with Enriched Features for Failure Prediction with Its Application to Computer Disk Driver
标题:SFFDD:特征丰富的故障预测深度神经网络及其在计算机磁盘驱动器中的应用
链接:https://arxiv.org/abs/2109.09856
作者:Lanfa Frank Wang,Danjue Li
备注:11 pages, 20 figures
摘要:提出了一种结合新特征推导方法的分类技术,用于预测具有多变量时间序列传感器数据的系统或设备的故障。我们将多变量时间序列传感器数据作为图像进行可视化和计算。失败遵循与根本原因密切相关的各种模式。对原始传感器数据应用不同的预定义转换,以更好地描述故障模式。除特征提取外,采用集成方法进一步提高了性能。此外,本文还提出了一种通用的深度神经网络算法结构,用较少的人工特征工程处理多种类型的数据。为了提高存储系统的可用性和避免数据丢失,我们将所提出的方法应用于计算机磁盘驱动器的早期故障预测。通过丰富的特征(称为智能特征),分类精度大大提高。
摘要:A classification technique incorporating a novel feature derivation method is
proposed for predicting failure of a system or device with multivariate time
series sensor data. We treat the multivariate time series sensor data as images
for both visualization and computation. Failure follows various patterns which
are closely related to the root causes. Different predefined transformations
are applied to the original sensor data to better characterize the failure
patterns. In addition to feature derivation, an ensemble method is used to
further improve the performance. Furthermore, a general deep neural network
architecture is proposed to handle multiple types of data with less manual
feature engineering. We apply the proposed method to early failure prediction
of computer disk drives in order to improve storage system availability and to
avoid data loss. The classification accuracy is largely improved with the
enriched features, named smart features.
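A minimal sketch of the time-series-as-image idea as we read it; the specific transforms below are our assumptions:

```python
# Stack a multivariate sensor window with simple predefined transforms into a
# multi-channel "image" that a CNN can consume.
import numpy as np

def to_image(window):
    """window: (sensors, time) -> (channels, sensors, time) feature image."""
    diff = np.diff(window, axis=1, prepend=window[:, :1])   # first difference
    csum = np.cumsum(window, axis=1)                        # cumulative trend
    return np.stack([window, diff, csum])                   # CNN input channels

image = to_image(np.random.default_rng(0).normal(size=(8, 128)))
print(image.shape)                                          # (3, 8, 128)
```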
点云|SLAM|雷达|激光|深度RGBD相关(1篇)
【1】 Survey on Semantic Stereo Matching / Semantic Depth Estimation
标题:语义立体匹配/语义深度估计研究综述
链接:https://arxiv.org/abs/2109.10123
作者:Viny Saajan Victor,Peter Neigel
摘要:立体匹配由于其鲁棒性和快速性,是从立体图像中推断深度的广泛应用的技术之一。由于其在自动驾驶、机器人导航、三维重建等领域的应用,它已成为研究的主要课题之一。在非纹理、遮挡和反射区域中寻找像素对应是立体匹配的主要挑战。最近的研究表明,来自图像分割的语义线索可以用来改善立体匹配的结果。为了利用立体匹配中语义分割的优势,人们提出了许多深层神经网络结构。本文旨在对在实时应用中具有较高重要性的最先进的网络在精度和速度方面进行比较。
摘要:Stereo matching is one of the widely used techniques for inferring depth from
stereo images owing to its robustness and speed. It has become one of the major
topics of research since it finds its applications in autonomous driving,
robotic navigation, 3D reconstruction, and many other fields. Finding pixel
correspondences in non-textured, occluded and reflective areas is the major
challenge in stereo matching. Recent developments have shown that semantic cues
from image segmentation can be used to improve the results of stereo matching.
Many deep neural network architectures have been proposed to leverage the
advantages of semantic segmentation in stereo matching. This paper aims to
compare the state-of-the-art networks in terms of both accuracy and speed,
which are of high importance in real-time applications.
推理|分析|理解|解释(2篇)
【1】 FakeWake: Understanding and Mitigating Fake Wake-up Words of Voice Assistants
标题:FAKEWAKE:理解和减少语音助手的假唤醒词
链接:https://arxiv.org/abs/2109.09958
作者:Yanjiao Chen,Yijie Bai,Richard Mitev,Kaibo Wang,Ahmad-Reza Sadeghi,Wenyuan Xu
机构:Zhejiang University, Technical University of Darmstadt
摘要:在物联网(IoT)领域,语音助理已成为操作智能扬声器、智能手机甚至汽车的重要接口。为了节省电力和保护用户隐私,语音助理只有在检测到一小部分预先注册的唤醒词时才会向云发送命令。然而,语音助手很容易受到虚假唤醒现象的影响,因为这些现象是由听起来天真的模糊词语无意中触发的。本文从三个方面对伪尾流现象进行了系统的研究。首先,我们设计了第一个模糊词生成器来自动高效地生成模糊词,而不是通过大量音频材料进行搜索。我们成功地生成了965个模糊单词,涵盖了8个最流行的英语和汉语智能扬声器。为了解释伪唤醒现象背后的原因,我们构建了一个基于可解释树的决策模型,该模型揭示了导致唤醒词检测器错误接受模糊词的语音特征。最后,我们提出了一些补救措施,以减轻假冒行为的影响。结果表明,增强后的模型不仅对模糊词具有较好的恢复能力,而且在原始训练数据集上具有较好的整体性能。
摘要:In the area of the Internet of Things (IoT), voice assistants have become an
important interface to operate smart speakers, smartphones, and even
automobiles. To save power and protect user privacy, voice assistants send
commands to the cloud only if a small set of pre-registered wake-up words are
detected. However, voice assistants are shown to be vulnerable to the FakeWake
phenomena, whereby they are inadvertently triggered by innocent-sounding fuzzy
words. In this paper, we present a systematic investigation of the FakeWake
phenomena from three aspects. To start with, we design the first fuzzy word
generator to automatically and efficiently produce fuzzy words instead of
searching through a swarm of audio materials. We manage to generate 965 fuzzy
words covering the 8 most popular English and Chinese smart speakers. To explain
the causes underlying the FakeWake phenomena, we construct an interpretable
tree-based decision model, which reveals phonetic features that contribute to
false acceptance of fuzzy words by wake-up word detectors. Finally, we propose
remedies to mitigate the effect of FakeWake. The results show that the
strengthened models are not only resilient to fuzzy words but also achieve
better overall performance on original training datasets.
【2】 Robustness Analysis of Deep Learning Frameworks on Mobile Platforms
标题:移动平台上深度学习框架的健壮性分析
链接:https://arxiv.org/abs/2109.09869
作者:Amin Eslami Abyane,Hadi Hemmati
机构:Department of Electrical and Software Engineering, University of Calgary, Canada
摘要:随着现代移动设备计算能力的提高,基于机器学习的繁重任务(如人脸检测和语音识别)现在已成为此类设备不可或缺的组成部分。这需要框架在移动设备上执行机器学习模型(例如,深度神经网络)。虽然已经有关于这些框架的准确性和性能的研究,但是关于设备上深度学习框架的质量,就其健壮性而言,还没有被系统地研究过。在本文中,我们对两个设备上的深度学习框架和三种不同模型架构上的三种对抗性攻击进行了实证比较。我们还为每个架构使用量化和非量化变体。结果表明,总体而言,两种深度学习框架在鲁棒性方面都不优于另一种,并且PC和移动框架之间也没有显著差异。然而,在像边界攻击这样的情况下,移动版本比PC更健壮。此外,量化在从PC移动到移动设备的所有情况下都提高了健壮性。
摘要:With the recent increase in the computational power of modern mobile devices,
machine learning-based heavy tasks such as face detection and speech
recognition are now integral parts of such devices. This requires frameworks to
execute machine learning models (e.g., Deep Neural Networks) on mobile devices.
Although there exist studies on the accuracy and performance of these
frameworks, the quality of on-device deep learning frameworks, in terms of
their robustness, has not been systematically studied yet. In this paper, we
empirically compare two on-device deep learning frameworks with three
adversarial attacks on three different model architectures. We also use both
the quantized and unquantized variants for each architecture. The results show
that, in general, neither of the deep learning frameworks is better than the
other in terms of robustness, and there is not a significant difference between
the PC and mobile frameworks either. However, in cases like the Boundary
attack, the mobile version is more robust than the PC one. In addition,
quantization improves robustness in all cases when moving from PC to mobile.
检测相关(1篇)
【1】 DeepTimeAnomalyViz: A Tool for Visualizing and Post-processing Deep Learning Anomaly Detection Results for Industrial Time-Series
标题:DeepTimeAnomalyViz:工业时间序列深度学习异常检测结果可视化及后处理工具
链接:https://arxiv.org/abs/2109.10082
作者:Błażej Leporowski,Casper Hansen,Alexandros Iosifidis
机构:Technicon ApS, Hobro, Denmark, Dept. of Electrical and Computer Engineering, Aarhus University, Aarhus, Denmark
摘要:工业过程由大量产生时间序列数据的各种传感器监控。深度学习提供了创建异常检测方法的可能性,可以帮助预防故障和提高效率。但创建这样一个解决方案可能是一项复杂的任务,推理速度、可用数据量、传感器数量等因素会影响这种实现的可行性。我们介绍了DeTAVIZ界面,这是一个基于web浏览器的可视化工具,用于快速探索和评估给定问题中基于DL的异常检测的可行性。DeTAVIZ提供了大量预训练模型和模拟结果,使用户可以轻松快速地迭代多个后处理选项,比较不同的模型,并允许针对所选指标进行手动优化。
摘要:Industrial processes are monitored by a large number of various sensors that
produce time-series data. Deep Learning offers a possibility to create anomaly
detection methods that can aid in preventing malfunctions and increasing
efficiency. But creating such a solution can be a complicated task, with
factors such as inference speed, amount of available data, number of sensors,
and many more, influencing the feasibility of such implementation. We introduce
the DeTAVIZ interface, which is a web browser based visualization tool for
quick exploration and assessment of feasibility of DL based anomaly detection
in a given problem. Provided with a pool of pretrained models and simulation
results, DeTAVIZ allows the user to easily and quickly iterate through multiple
post processing options and compare different models, and allows for manual
optimisation towards a chosen metric.
分类|识别(3篇)
【1】 CondNet: Conditional Classifier for Scene Segmentation
标题:CondNet:用于场景分割的条件分类器
链接:https://arxiv.org/abs/2109.10322
作者:Changqian Yu,Yuanjie Shao,Changxin Gao,Nong Sang
机构:Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology
备注:Accepted to IEEE SPL. 4 pages, 3 figures, 4 tables
摘要:完全卷积网络(FCN)在密集的视觉识别任务中取得了巨大的成功,如场景分割。FCN的最后一层通常是一个全局分类器(1x1卷积),用于将每个像素识别为语义标签。我们的经验表明,这种全局分类器,忽略类内区分,可能会导致次优结果。在这项工作中,我们提出了一种条件分类器来取代传统的全局分类器,其中分类器的核是根据输入动态生成的。新分类器的主要优点包括:(1)注意类内区分,具有更强的稠密识别能力;(ii)条件分类器简单灵活,可以集成到几乎任意的FCN结构中,以提高预测能力。大量实验表明,该分类器在FCN结构上优于传统分类器。配备条件分类器(称为CondNet)的框架在两个数据集上实现了最新的性能。有关代码和模型,请访问https://git.io/CondNet.
摘要:The fully convolutional network (FCN) has achieved tremendous success in
dense visual recognition tasks, such as scene segmentation. The last layer of
FCN is typically a global classifier (1x1 convolution) to recognize each pixel
to a semantic label. We empirically show that this global classifier, ignoring
the intra-class distinction, may lead to sub-optimal results.
In this work, we present a conditional classifier to replace the traditional
global classifier, where the kernels of the classifier are generated
dynamically conditioned on the input. The main advantages of the new classifier
consist of: (i) it attends on the intra-class distinction, leading to stronger
dense recognition capability; (ii) the conditional classifier is simple and
flexible to be integrated into almost arbitrary FCN architectures to improve
the prediction. Extensive experiments demonstrate that the proposed classifier
performs favourably against the traditional classifier on the FCN architecture.
The framework equipped with the conditional classifier (called CondNet)
achieves new state-of-the-art performances on two datasets. The code and models
are available at https://git.io/CondNet.
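A minimal PyTorch sketch of the conditional-classifier idea; the pooling and kernel-generator layout are our assumptions, not the released CondNet code:

```python
# The 1x1 classification kernels are generated per sample from the input
# features instead of being fixed global weights.
import torch
import torch.nn as nn

class ConditionalClassifier(nn.Module):
    def __init__(self, in_ch, num_classes):
        super().__init__()
        self.num_classes = num_classes
        self.gen = nn.Linear(in_ch, num_classes * in_ch)   # kernel generator

    def forward(self, feat):                               # feat: (B, C, H, W)
        B, C, _, _ = feat.shape
        ctx = feat.mean(dim=(2, 3))                        # global average pool
        kernels = self.gen(ctx).view(B, self.num_classes, C)
        # Per-sample 1x1 convolution, written as an einsum over channels:
        return torch.einsum('bkc,bchw->bkhw', kernels, feat)

print(ConditionalClassifier(64, 19)(torch.randn(2, 64, 32, 32)).shape)
```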
【2】 Assured Neural Network Architectures for Control and Identification of Nonlinear Systems
标题:非线性系统控制与辨识的确定性神经网络结构
链接:https://arxiv.org/abs/2109.10298
作者:James Ferlez,Yasser Shoukry
机构:Department of Electrical Engineering and Computer Science
摘要:在本文中,我们考虑自动设计整流线性单元(ReLU)神经网络(NN)架构(层数和每层神经元数量)的问题,并确保其被充分参数化以控制非线性系统,即控制系统满足给定的形式规范。这与当前的技术不同,当前的技术无法保证最终的体系结构。此外,我们的方法只需要对底层非线性系统和规范的有限知识。我们只假设该规范可由Lipschitz常数具有已知上界的Lipschitz连续控制器满足;不需要知道具体的控制器。从这个假设出发,我们给出了构造一个连续分段仿射(CPWA)函数所需仿射函数数量的上界,该函数可以逼近满足规范的任何Lipschitz连续控制器。然后,我们使用作者最近关于两层晶格(TLL)NN体系结构的研究结果,将此CPWA连接到NN体系结构;TLL体系结构通过其实现的CPWA函数中存在的仿射函数的数量进行参数化。
摘要:In this paper, we consider the problem of automatically designing a Rectified
Linear Unit (ReLU) Neural Network (NN) architecture (number of layers and
number of neurons per layer) with the assurance that it is sufficiently
parametrized to control a nonlinear system; i.e. control the system to satisfy
a given formal specification. This is unlike current techniques, which provide
no assurances on the resultant architecture. Moreover, our approach requires
only limited knowledge of the underlying nonlinear system and specification. We
assume only that the specification can be satisfied by a Lipschitz-continuous
controller with a known bound on its Lipschitz constant; the specific
controller need not be known. From this assumption, we bound the number of
affine functions needed to construct a Continuous Piecewise Affine (CPWA)
function that can approximate any Lipschitz-continuous controller that
satisfies the specification. Then we connect this CPWA to a NN architecture
using the authors' recent results on the Two-Level Lattice (TLL) NN
architecture; the TLL architecture was shown to be parameterized by the number
of affine functions present in the CPWA function it realizes.
【3】 Signal Classification using Smooth Coefficients of Multiple wavelets
标题:基于多小波平滑系数的信号分类
链接:https://arxiv.org/abs/2109.09988
作者:Paul Grant,Md Zahidul Islam
机构:School of Computing and Mathematics, Charles Sturt University, Bathurst, NSW , Australia.
备注:13 pages, 3 figures
摘要:时间序列信号的分类已成为一种重要的构造方法,并有许多实际应用。使用现有的分类器,我们可能能够准确地对信号进行分类,但是,如果使用的属性数量减少,准确度可能会下降。转换数据然后进行降维可以提高数据分析的质量,减少分类所需的时间并简化模型。我们提出了一种方法,即选择合适的小波变换数据,然后结合这些变换的输出来构造数据集,然后应用集成分类器对其进行分类。我们在不同的数据集、不同的分类器上演示了这一点,并使用了不同的评估方法。与使用原始信号数据或单个小波变换的方法相比,我们的实验结果证明了该方法的有效性。
摘要:Classification of time series signals has become an important construct and
has many practical applications. With existing classifiers we may be able to
accurately classify signals, however that accuracy may decline if using a
reduced number of attributes. Transforming the data then undertaking reduction
in dimensionality may improve the quality of the data analysis, decrease time
required for classification, and simplify models. We propose an approach that
chooses suitable wavelets to transform the data, then combines the outputs of
these transforms into a dataset to which ensemble classifiers are applied. We
demonstrate this on different data sets, across different classifiers, and
with differing evaluation methods. Our experimental results demonstrate the
effectiveness of the proposed technique, compared to the approaches that use
either raw signal data or a single wavelet transform.
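A minimal sketch of the proposed pipeline; the wavelet families, decomposition level, and ensemble classifier below are our assumptions:

```python
# Keep only the smooth (approximation) coefficients from several wavelet
# transforms, concatenate them, and fit an ensemble classifier.
import numpy as np
import pywt
from sklearn.ensemble import RandomForestClassifier

def smooth_features(signals, wavelets=('db4', 'sym5', 'coif3'), level=3):
    feats = []
    for x in signals:
        approx = [pywt.wavedec(x, w, level=level)[0] for w in wavelets]
        feats.append(np.concatenate(approx))     # smooth coefficients only
    return np.array(feats)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))                  # toy signals
y = rng.integers(0, 2, size=200)                 # toy labels
clf = RandomForestClassifier(n_estimators=100).fit(smooth_features(X), y)
```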
表征(2篇)
【1】 Chemical-Reaction-Aware Molecule Representation Learning
标题:化学反应感知的分子表征学习
链接:https://arxiv.org/abs/2109.09888
作者:Hongwei Wang,Weijiang Li,Xiaomeng Jin,Kyunghyun Cho,Heng Ji,Jiawei Han,Martin D. Burke
机构:University of Illinois Urbana-Champaign, Urbana, IL , USA, New York University, New York, NY
摘要:分子表征学习(MRL)方法旨在将分子嵌入到实向量空间中。然而,现有的基于SMILES(简化分子输入线输入系统)或基于GNN(图形神经网络)的MRL方法要么以SMILES字符串作为输入,难以编码分子结构信息,要么过分强调GNN结构的重要性,而忽视了其泛化能力。在这里,我们建议使用化学反应来帮助学习分子表征。我们的方法的关键思想是保持分子在嵌入空间中的化学反应的等效性,即强制每个化学方程式的反应物嵌入和产物嵌入的总和相等。该约束被证明是有效的:1)保持嵌入空间的有序性;2)提高分子嵌入的泛化能力。此外,我们的模型可以使用任何GNN作为分子编码器,因此对GNN架构是不可知的。实验结果表明,我们的方法在各种下游任务中达到了最先进的性能,例如,相比最佳基线方法,化学反应预测的绝对Hit@1增益为17.4%,分子性质预测的绝对AUC增益为2.3%,图编辑距离预测的相对RMSE增益为18.5%。代码可在https://github.com/hwwang55/MolR获取。
摘要:Molecule representation learning (MRL) methods aim to embed molecules into a
real vector space. However, existing SMILES-based (Simplified Molecular-Input
Line-Entry System) or GNN-based (Graph Neural Networks) MRL methods either take
SMILES strings as input that have difficulty in encoding molecule structure
information, or over-emphasize the importance of GNN architectures but neglect
their generalization ability. Here we propose using chemical reactions to
assist learning molecule representation. The key idea of our approach is to
preserve the equivalence of molecules with respect to chemical reactions in the
embedding space, i.e., forcing the sum of reactant embeddings and the sum of
product embeddings to be equal for each chemical equation. This constraint is
proven effective to 1) keep the embedding space well-organized and 2) improve
the generalization ability of molecule embeddings. Moreover, our model can use
any GNN as the molecule encoder and is thus agnostic to GNN architectures.
Experimental results demonstrate that our method achieves state-of-the-art
performance in a variety of downstream tasks, e.g., 17.4% absolute Hit@1 gain
in chemical reaction prediction, 2.3% absolute AUC gain in molecule property
prediction, and 18.5% relative RMSE gain in graph-edit-distance prediction,
respectively, over the best baseline method. The code is available at
https://github.com/hwwang55/MolR.
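A minimal PyTorch sketch of the reaction-equivalence constraint; the exact loss shape is our assumption, and any GNN can supply the molecule embeddings:

```python
# Force the summed reactant embeddings and summed product embeddings of each
# chemical equation to coincide in the embedding space.
import torch

def reaction_loss(reactant_embs, product_embs):
    """Both arguments: lists of (n_i, d) tensors, one entry per reaction."""
    terms = [(r.sum(dim=0) - p.sum(dim=0)).pow(2).sum()
             for r, p in zip(reactant_embs, product_embs)]
    return torch.stack(terms).mean()

# Two toy reactions with 8-dimensional molecule embeddings:
loss = reaction_loss(
    [torch.randn(2, 8), torch.randn(3, 8)],      # reactants per reaction
    [torch.randn(1, 8), torch.randn(2, 8)],      # products per reaction
)
print(loss)
```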
【2】 Context-Specific Representation Abstraction for Deep Option Learning
标题:面向深度选项学习的上下文特定表示抽象
链接:https://arxiv.org/abs/2109.09876
作者:Marwa Abdulhai,Dong-Ki Kim,Matthew Riemer,Miao Liu,Gerald Tesauro,Jonathan P. How
机构: MIT LIDS , MIT-IBM Watson AI Lab , IBM Research
摘要:分层强化学习侧重于发现临时扩展的动作,如选项,这些动作可以在需要广泛探索的问题中提供益处。一种有希望的端到端学习这些选项的方法是选项评论家(OC)框架。我们在本文中研究并表明,OC并没有将问题分解为更简单的子问题,而是增加了策略空间上的搜索规模,每个选项在学习过程中都考虑了整个状态空间。这个问题可能导致这种方法的实际局限性,包括样本学习效率低下。为了解决这个问题,我们引入了面向深度选项学习的上下文特定表示抽象(CRADOL),这是一个新的框架,同时考虑了时间抽象和上下文特定表示抽象,以有效减少策略空间上的搜索规模。具体地说,我们的方法学习一个分解的信念状态表示,使每个选项只在状态空间的一个子部分上学习策略。我们针对分层、非分层和模块化递归神经网络基线对我们的方法进行了测试,证明在具有挑战性的部分可观测环境中,样本效率显著提高。
摘要:Hierarchical reinforcement learning has focused on discovering temporally
extended actions, such as options, that can provide benefits in problems
requiring extensive exploration. One promising approach that learns these
options end-to-end is the option-critic (OC) framework. We examine and show in
this paper that OC does not decompose a problem into simpler sub-problems, but
instead increases the size of the search over policy space with each option
considering the entire state space during learning. This issue can result in
practical limitations of this method, including sample inefficient learning. To
address this problem, we introduce Context-Specific Representation Abstraction
for Deep Option Learning (CRADOL), a new framework that considers both temporal
abstraction and context-specific representation abstraction to effectively
reduce the size of the search over policy space. Specifically, our method
learns a factored belief state representation that enables each option to learn
a policy over only a subsection of the state space. We test our method against
hierarchical, non-hierarchical, and modular recurrent neural network baselines,
demonstrating significant sample efficiency improvements in challenging
partially observable environments.
优化|敛散性(3篇)
【1】 SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization
标题:SMAC3:一个用于超参数优化的通用贝叶斯优化软件包
链接:https://arxiv.org/abs/2109.09831
作者:Marius Lindauer,Katharina Eggensperger,Matthias Feurer,André Biedenkapp,Difan Deng,Carolin Benjamins,René Sass,Frank Hutter
机构:André Biedenkapp, René Sass, Leibniz University Hannover, University of Freiburg, Bosch Center for Artificial Intelligence
摘要:算法参数,特别是机器学习算法的超参数,可以极大地影响其性能。为了支持用户为其现有算法、数据集和应用程序确定性能良好的超参数配置,SMAC3提供了一个稳健灵活的贝叶斯优化框架,可在几次评估中提高性能。它为典型用例提供了一些外观和预设,例如优化超参数、解决低维连续(人工)全局优化问题以及配置算法以在多个问题实例中表现良好。SMAC3软件包在许可的BSD许可证下提供,网址为https://github.com/automl/SMAC3.
摘要:Algorithm parameters, in particular hyperparameters of machine learning
algorithms, can substantially impact their performance. To support users in
determining well-performing hyperparameter configurations for their algorithms,
datasets and applications at hand, SMAC3 offers a robust and flexible framework
for Bayesian Optimization, which can improve performance within a few
evaluations. It offers several facades and pre-sets for typical use cases, such
as optimizing hyperparameters, solving low dimensional continuous (artificial)
global optimization problems and configuring algorithms to perform well across
multiple problem instances. The SMAC3 package is available under a permissive
BSD-license at https://github.com/automl/SMAC3.
【2】 Generalized Optimization: A First Step Towards Category Theoretic Learning Theory
标题:广义优化:走向范畴论学习理论的第一步
链接:https://arxiv.org/abs/2109.10262
作者:Dan Shiebler
机构:University of Oxford
摘要:笛卡尔逆导数是逆模式自动微分的范畴化推广。我们使用这个算子来推广几种优化算法,包括梯度下降的直接推广和牛顿方法的新推广。然后,我们探索这些算法的哪些属性在这个广义设置中被保留。首先,我们证明了这些算法的变换不变性是保持不变的:虽然广义牛顿法对所有可逆线性变换是不变的,但广义梯度下降法仅对正交线性变换是不变的。接下来,我们证明了我们可以用类似内积的表达式来表示广义梯度下降损失的变化,从而推广了梯度下降优化流的非递增性和收敛性。最后,我们通过几个数值实验来说明本文的思想,并演示如何使用它们来优化有序环上的多项式函数。
摘要:The Cartesian reverse derivative is a categorical generalization of
reverse-mode automatic differentiation. We use this operator to generalize
several optimization algorithms, including a straightforward generalization of
gradient descent and a novel generalization of Newton's method. We then explore
which properties of these algorithms are preserved in this generalized setting.
First, we show that the transformation invariances of these algorithms are
preserved: while generalized Newton's method is invariant to all invertible
linear transformations, generalized gradient descent is invariant only to
orthogonal linear transformations. Next, we show that we can express the change
in loss of generalized gradient descent with an inner product-like expression,
thereby generalizing the non-increasing and convergence properties of the
gradient descent optimization flow. Finally, we include several numerical
experiments to illustrate the ideas in the paper and demonstrate how we can use
them to optimize polynomial functions over an ordered ring.
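A small numerical illustration (ours) of the invariance claims on a quadratic objective:

```python
# f(x) = 0.5 x^T Q x: Newton steps commute with any invertible linear
# reparametrization x = A y, gradient steps only with orthogonal ones.
import numpy as np

Q = np.array([[3.0, 1.0], [1.0, 2.0]])
A = np.array([[2.0, 1.0], [0.0, 1.0]])           # invertible, not orthogonal
x0 = np.array([1.0, -1.0]); y0 = np.linalg.solve(A, x0)

grad_x = Q @ x0                                  # gradient of f at x0
grad_y = A.T @ Q @ (A @ y0)                      # gradient of f(Ay) at y0
hess_y = A.T @ Q @ A                             # Hessian of f(Ay)

newton_x = x0 - np.linalg.solve(Q, grad_x)
newton_y = y0 - np.linalg.solve(hess_y, grad_y)
print(np.allclose(newton_x, A @ newton_y))       # True: Newton is invariant

lr = 0.1
print(np.allclose(x0 - lr * grad_x, A @ (y0 - lr * grad_y)))   # False for GD
```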
【3】 Vaccine allocation policy optimization and budget sharing mechanism using Thompson sampling
标题:基于汤普森抽样的疫苗分配策略优化与预算分担机制
链接:https://arxiv.org/abs/2109.10004
作者:David Rey,Ahmed W Hammad,Meead Saberi
机构:SKEMA Business School, Université Côte d'Azur, Sophia Antipolis, France; School of Built Environment, UNSW Sydney, Sydney, NSW, Australia; School of Civil and Environmental Engineering, UNSW Sydney, Sydney, NSW, Australia
摘要:随着时间的推移,疫苗在人群中的最佳分配是一个具有挑战性的卫生保健管理问题。在大流行的情况下,多个机构采取的疫苗接种政策与合作(或缺乏合作)之间的相互作用创造了一个影响疾病全球传播动态的复杂环境。在这项研究中,我们从决策主体的角度出发,目的是最小化其易感人群的规模,并且必须在有限的供应条件下分配疫苗。我们假设疫苗的有效率对代理是未知的,我们提出了一种基于汤普森抽样的优化策略来学习随时间变化的平均疫苗有效率。此外,我们还开发了预算平衡的资源共享机制,以促进代理之间的合作。我们将建议的框架应用于新冠病毒-19大流行。我们使用一个光栅模型,其中代理代表世界主要国家,并在全球移动网络中进行交互,以生成多个问题实例。我们的数值结果表明,与基于人口的政策相比,拟议的疫苗分配政策在全球范围内实现了更大的易感个体、感染和死亡人数减少。此外,我们还表明,在固定的全球疫苗分配预算下,大多数国家可以通过与流动性相对较高的国家分享预算来减少本国的感染和死亡人数。拟议的框架可用于改善国家和全球卫生当局在卫生保健管理方面的决策。
摘要:The optimal allocation of vaccines to population subgroups over time is a
challenging health care management problem. In the context of a pandemic, the
interaction between vaccination policies adopted by multiple agents and the
cooperation (or lack thereof) creates a complex environment that affects the
global transmission dynamics of the disease. In this study, we take the
perspective of decision-making agents that aim to minimize the size of their
susceptible populations and must allocate vaccine under limited supply. We
assume that vaccine efficiency rates are unknown to agents and we propose an
optimization policy based on Thompson sampling to learn mean vaccine efficiency
rates over time. Furthermore, we develop a budget-balanced resource sharing
mechanism to promote cooperation among agents. We apply the proposed framework
to the COVID-19 pandemic. We use a raster model of the world where agents
represent the main countries worldwide and interact in a global mobility
network to generate multiple problem instances. Our numerical results show that
the proposed vaccine allocation policy achieves a larger reduction in the
number of susceptible individuals, infections and deaths globally compared to a
population-based policy. In addition, we show that, under a fixed global
vaccine allocation budget, most countries can reduce their national number of
infections and deaths by sharing their budget with countries with which they
have a relatively high mobility exchange. The proposed framework can be used to
improve policy-making in health care management by national and global health
authorities.
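A minimal sketch of the learning component; the Beta-Bernoulli arm model below is our assumption for illustrating how Thompson sampling learns mean vaccine efficiency rates:

```python
# Thompson sampling allocates a limited vaccine budget among vaccine types
# whose mean efficiency rates are initially unknown.
import numpy as np

rng = np.random.default_rng(0)
true_eff = np.array([0.55, 0.70, 0.90])          # unknown to the agent
alpha, beta = np.ones(3), np.ones(3)             # Beta(1, 1) priors

for _ in range(1000):                            # one dose per round
    theta = rng.beta(alpha, beta)                # posterior samples
    k = int(np.argmax(theta))                    # allocate to best sample
    success = rng.random() < true_eff[k]         # observed protection outcome
    alpha[k] += success
    beta[k] += 1 - success

print("posterior means:", (alpha / (alpha + beta)).round(2))
```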
预测|估计(7篇)
【1】 KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation
标题:KDFNet:用于6D目标位姿估计的学习关键点距离场
链接:https://arxiv.org/abs/2109.10127
作者:Xingyu Liu,Shun Iwase,Kris M. Kitani
机构:The Robotics Institute, Carnegie Mellon University
备注:IROS 2021
摘要:我们提出了一种基于RGB图像的6D物体姿态估计的新方法KDFNet。为了处理遮挡问题,最近的许多工作都提出通过像素投票来定位二维关键点,并解决了用于姿势估计的透视n点(Perspective-n-Point,PnP)问题,取得了领先的性能。然而,这种投票过程是基于方向的,无法处理无法可靠找到方向交点的细长对象。为了解决这个问题,我们提出了一种新的连续表示法,称为关键点距离场(KDF),用于投影二维关键点位置。作为2D数组,KDF的每个元素存储相应图像像素和指定投影2D关键点之间的2D欧氏距离。我们使用完全卷积神经网络回归每个关键点的KDF。利用投影对象关键点位置的KDF编码,我们建议使用基于距离的投票方案,通过RANSAC方式计算圆交点来定位关键点。我们通过大量烧蚀实验验证了我们框架的设计选择。我们提出的方法在Occlusion-LINEMOD数据集上实现了最先进的性能,平均ADD(-S)准确率为50.3%,在TOD数据集mug子集上实现了平均ADD准确率为75.72%。大量的实验和可视化结果表明,该方法能够在包括遮挡在内的挑战场景中稳健地估计6D姿态。
摘要:We present KDFNet, a novel method for 6D object pose estimation from RGB
images. To handle occlusion, many recent works have proposed to localize 2D
keypoints through pixel-wise voting and solve a Perspective-n-Point (PnP)
problem for pose estimation, which achieves leading performance. However, such
voting process is direction-based and cannot handle long and thin objects where
the direction intersections cannot be robustly found. To address this problem,
we propose a novel continuous representation called Keypoint Distance Field
(KDF) for projected 2D keypoint locations. Formulated as a 2D array, each
element of the KDF stores the 2D Euclidean distance between the corresponding
image pixel and a specified projected 2D keypoint. We use a fully convolutional
neural network to regress the KDF for each keypoint. Using this KDF encoding of
projected object keypoint locations, we propose to use a distance-based voting
scheme to localize the keypoints by calculating circle intersections in a
RANSAC fashion. We validate the design choices of our framework by extensive
ablation experiments. Our proposed method achieves state-of-the-art performance
on Occlusion LINEMOD dataset with an average ADD(-S) accuracy of 50.3% and TOD
dataset mug subset with an average ADD accuracy of 75.72%. Extensive
experiments and visualizations demonstrate that the proposed method is able to
robustly estimate the 6D pose in challenging scenarios including occlusion.
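A minimal sketch of constructing the KDF target directly from its definition; the image size and keypoint below are arbitrary assumptions:

```python
# Each entry is the 2D Euclidean distance from that pixel to one projected
# keypoint.
import numpy as np

def keypoint_distance_field(h, w, keypoint):
    ys, xs = np.mgrid[0:h, 0:w]
    return np.hypot(xs - keypoint[0], ys - keypoint[1])

kdf = keypoint_distance_field(480, 640, keypoint=(100.5, 200.0))
# A network regresses such fields; keypoints are then localized by
# intersecting circles of radius kdf[p] around sampled pixels p (RANSAC).
```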
【2】 StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation
标题:StereOBJ-1M:用于6D目标位姿估计的大规模立体图像数据集
链接:https://arxiv.org/abs/2109.10115
作者:Xingyu Liu,Shun Iwase,Kris M. Kitani
机构:Carnegie Mellon University
备注:ICCV 2021
摘要:我们提出了一个名为$\textbf{StereOBJ-1M}$数据集的大规模立体RGB图像对象姿势估计数据集。该数据集旨在解决具有挑战性的情况,例如对象透明度、半透明性和镜面反射,以及遮挡、对称性和照明和环境变化等常见问题。为了为现代深度学习模型收集足够规模的数据,我们提出了一种以多视图方式高效注释姿势数据的新方法,该方法允许在复杂和灵活的环境中捕获数据。我们的数据集包含超过396K帧和超过150万个18个对象的注释,这些对象记录在11个不同环境中构建的183个场景中。这18个对象包括8个对称对象、7个透明对象和8个反射对象。我们在StereOBJ-1M上对两个最先进的姿势估计框架进行基准测试,作为未来工作的基线。我们还提出了一种新的对象级姿态优化方法,用于从多幅图像的关键点预测计算6D姿态。
摘要:We present a large-scale stereo RGB image object pose estimation dataset
named the $\textbf{StereOBJ-1M}$ dataset. The dataset is designed to address
challenging cases such as object transparency, translucency, and specular
reflection, in addition to the common challenges of occlusion, symmetry, and
variations in illumination and environments. In order to collect data of
sufficient scale for modern deep learning models, we propose a novel method for
efficiently annotating pose data in a multi-view fashion that allows data
capturing in complex and flexible environments. Fully annotated with 6D object
poses, our dataset contains over 396K frames and over 1.5M annotations of 18
objects recorded in 183 scenes constructed in 11 different environments. The 18
objects include 8 symmetric objects, 7 transparent objects, and 8 reflective
objects. We benchmark two state-of-the-art pose estimation frameworks on
StereOBJ-1M as baselines for future work. We also propose a novel object-level
pose optimization method for computing 6D pose from keypoint predictions in
multiple images.
【3】 Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks
标题:支付网络中基于多模态学习的在线多层次交易指标估计
链接:https://arxiv.org/abs/2109.10020
作者:Chin-Chia Michael Yeh,Zhongfang Zhuang,Junpeng Wang,Yan Zheng,Javid Ebrahimi,Ryan Mercer,Liang Wang,Wei Zhang
机构:Visa Research, University of California, Riverside
备注:10 pages
摘要:预测与支付处理网络中实体的跨国行为相关的指标对于系统监控至关重要。从过去的交易历史中聚合的多变量时间序列可以为此类预测提供有价值的见解。一般的多变量时间序列预测问题已经在多个领域得到了很好的研究和应用,包括制造业、医学和昆虫学。然而,除了大规模处理支付交易数据的实时性要求外,还出现了与数据相关的新领域挑战,如概念漂移和多模态。在这项工作中,我们研究了用于估计与支付交易数据库中的实体相关联的交易度量的多变量时间序列预测问题。我们提出了一个包含五个独特组件的模型,用于从多模态数据估计事务度量。其中四个组件捕获交互、时间、规模和形状透视图,第五个组件将这些透视图融合在一起。我们还提出了一种离线/在线混合训练方案,以解决数据中的概念漂移问题,并满足实时性要求。将估算模型与图形用户界面相结合,原型交易度量估算系统已证明其作为提高支付处理公司系统监控能力的工具的潜在优势。
摘要:Predicting metrics associated with entities' transnational behavior within
payment processing networks is essential for system monitoring. Multivariate
time series, aggregated from the past transaction history, can provide valuable
insights for such prediction. The general multivariate time series prediction
problem has been well studied and applied across several domains, including
manufacturing, medical, and entomology. However, new domain-related challenges
associated with the data such as concept drift and multi-modality have surfaced
in addition to the real-time requirements of handling the payment transaction
data at scale. In this work, we study the problem of multivariate time series
prediction for estimating transaction metrics associated with entities in the
payment transaction database. We propose a model with five unique components to
estimate the transaction metrics from multi-modality data. Four of these
components capture interaction, temporal, scale, and shape perspectives, and
the fifth component fuses these perspectives together. We also propose a hybrid
offline/online training scheme to address concept drift in the data and fulfill
the real-time requirements. Combining the estimation model with a graphical
user interface, the prototype transaction metric estimation system has
demonstrated its potential benefit as a tool for improving a payment processing
company's system monitoring capability.
【4】 ApproxIFER: A Model-Agnostic Approach to Resilient and Robust Prediction Serving Systems
标题:ApproxIFER:一种模型不可知的弹性鲁棒预测服务系统方法
链接:https://arxiv.org/abs/2109.09868
作者:Mahdi Soleymani,Ramy E. Ali,Hessam Mahdavifar,A. Salman Avestimehr
机构:EECS Department, University of Michigan-Ann Arbor, ECE Department, University of Southern California (USC)
摘要:由于云辅助人工智能服务的激增,设计能够有效应对掉队/故障并将响应延迟降至最低的弹性预测服务系统的问题引起了广泛关注。解决此问题的常用方法是复制,它将相同的预测任务分配给多个工作者。然而,这种方法效率非常低,并且会产生大量的资源开销。因此,最近提出了一种称为平价模型(ParM)的基于学习的方法,该方法学习能够为一组预测生成平价的模型,以便重建缓慢/失败工人的预测。虽然这种基于学习的方法比复制更具资源效率,但它是针对云托管的特定模型定制的,特别适合于少量查询(通常少于四个)和很少(主要是一个)掉队者。此外,ParM不处理拜占庭敌对工人。我们提出了一种不同的方法,称为近似编码推理(ApproxIFER),它不需要训练任何奇偶校验模型,因此它与云托管的模型无关,并且可以很容易地应用于不同的数据域和模型架构。与以前的工作相比,ApproxIFER可以处理一般数量的散乱者,并且可以更好地扩展查询数量。此外,ApproxIFER对拜占庭工人的抵抗力很强。我们在大量数据集和模型体系结构上进行的大量实验也表明,与奇偶模型方法相比,精度提高了58%。
摘要:Due to the surge of cloud-assisted AI services, the problem of designing
resilient prediction serving systems that can effectively cope with
stragglers/failures and minimize response delays has attracted much interest.
The common approach for tackling this problem is replication which assigns the
same prediction task to multiple workers. This approach, however, is very
inefficient and incurs significant resource overheads. Hence, a learning-based
approach known as parity model (ParM) has been recently proposed which learns
models that can generate parities for a group of predictions in order to
reconstruct the predictions of the slow/failed workers. While this
learning-based approach is more resource-efficient than replication, it is
tailored to the specific model hosted by the cloud and is particularly suitable
for a small number of queries (typically less than four) and tolerating very
few (mostly one) number of stragglers. Moreover, ParM does not handle Byzantine
adversarial workers. We propose a different approach, named Approximate Coded
Inference (ApproxIFER), that does not require training of any parity models,
hence it is agnostic to the model hosted by the cloud and can be readily
applied to different data domains and model architectures. Compared with
earlier works, ApproxIFER can handle a general number of stragglers and scales
significantly better with the number of queries. Furthermore, ApproxIFER is
robust against Byzantine workers. Our extensive experiments on a large number
of datasets and model architectures also show significant accuracy improvement
by up to 58% over the parity model approaches.
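A minimal sketch (ours) of the coded-inference idea in the special case of a linear model, where polynomial coding is exact; ApproxIFER's approximate extension to nonlinear models is not reproduced here:

```python
# K queries become N coded queries via Lagrange interpolation, and
# predictions from ANY K workers reconstruct all answers.
import numpy as np

def lagrange_matrix(z_from, z_to):
    """Row i maps values given at points z_from to the value at z_to[i]."""
    M = np.empty((len(z_to), len(z_from)))
    for i, zt in enumerate(z_to):
        for k, zk in enumerate(z_from):
            others = [z for z in z_from if z != zk]
            M[i, k] = np.prod([(zt - z) / (zk - z) for z in others])
    return M

K, N, d = 3, 5, 4
rng = np.random.default_rng(0)
queries = rng.normal(size=(K, d))
w = rng.normal(size=d)                           # the "model": f(x) = w @ x

z_enc, z_workers = np.arange(K, dtype=float), np.linspace(-1, 1, N)
coded = lagrange_matrix(z_enc, z_workers) @ queries    # N coded queries
results = coded @ w                                    # each worker runs f
alive = [0, 2, 4]                                      # two stragglers dropped
decoded = lagrange_matrix(z_workers[alive], z_enc) @ results[alive]
print(np.allclose(decoded, queries @ w))               # True
```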
【5】 Well Googled is Half Done: Multimodal Forecasting of New Fashion Product Sales with Image-based Google Trends
标题:谷歌搜索做得好是成功的一半:基于图像的谷歌趋势对新时尚产品销售的多模式预测
链接:https://arxiv.org/abs/2109.09824
作者:Geri Skenderi,Christian Joppi,Matteo Denitto,Marco Cristani
机构:Department of Computer Science, University of Verona, Strada le Grazie, Verona, Italy, Humatics Srl, University of Verona, Strada le Grazie, Verona, Italy
摘要:本文研究了在没有以往销售数据,但只有图像和少量元数据可用的情况下,系统地探索谷歌趋势的有效性,将视觉方面的文本翻译作为外生知识来预测全新时尚产品的销售。特别是,我们提出了GTM Transformer,代表Google Trends Multimodal Transformer,其编码器处理外生时间序列的表示,而解码器使用Google Trends编码以及可用的视觉和元数据信息预测销售额。我们的模型以非自回归方式工作,避免了第一步误差的复合效应。作为第二个贡献,我们展示了VISUELLE数据集,这是第一个用于新时尚产品销售预测任务的公开数据集,包含2016-2019年期间销售的5577种新产品的销售,这些数据来自意大利快速时尚公司Nunalie的真实历史数据。我们的数据集包含产品图像、元数据、相关销售和相关的谷歌趋势。我们使用VISUELLE将我们的方法与最先进的备选方案和众多基线进行比较,表明GTM Transformer在百分比和绝对误差方面都是最准确的。值得注意的是,外生知识的添加将WAPE预测精度提高了1.5%,这表明了利用Google趋势的重要性。代码和数据集都可以在https://github.com/HumaticsLAB/GTM-Transformer.
摘要:This paper investigates the effectiveness of systematically probing Google
Trends against textual translations of visual aspects as exogenous knowledge to
predict the sales of brand-new fashion items, where past sales data is not
available, but only an image and few metadata are available. In particular, we
propose GTM-Transformer, standing for Google Trends Multimodal Transformer,
whose encoder works on the representation of the exogenous time series, while
the decoder forecasts the sales using the Google Trends encoding, and the
available visual and metadata information. Our model works in a
non-autoregressive manner, avoiding the compounding effect of the first-step
errors. As a second contribution, we present the VISUELLE dataset, which is the
first publicly available dataset for the task of new fashion product sales
forecasting, containing the sales of 5577 new products sold between 2016-2019,
derived from genuine historical data of Nunalie, an Italian fast-fashion
company. Our dataset is equipped with images of products, metadata, related
sales, and associated Google Trends. We use VISUELLE to compare our approach
against state-of-the-art alternatives and numerous baselines, showing that
GTM-Transformer is the most accurate in terms of both percentage and absolute
error. It is worth noting that the addition of exogenous knowledge boosts the
forecasting accuracy by 1.5% WAPE wise, showing the importance of exploiting
Google Trends. The code and dataset are both available at
https://github.com/HumaticsLAB/GTM-Transformer.
【6】 Prediction of severe thunderstorm events with ensemble deep learning and radar data
标题:用集合深度学习和雷达资料预报强雷暴事件
链接:https://arxiv.org/abs/2109.09791
作者:Sabrina Guastavino,Michele Piana,Marco Tizzi,Federico Cassola,Antonio Iengo,Davide Sacchetti,Enrico Solazzo,Federico Benvenuto
机构:The MIDA group, Dipartimento di Matematica, Università di Genova, Genova, Italy; CNR - SPIN Genova, Genova, Italy; ARPAL, Genova, Italy
摘要:临近预报极端天气事件的问题可以通过应用数值方法求解动态模型方程或数据驱动的人工智能算法来解决。在后一个框架内,本文说明了如何利用雷达反射率帧视频作为输入的深度学习方法,实现一个能够及时发出可能的严重雷暴事件警报的警报机。从技术角度来看,该方法的计算核心是使用价值加权技能得分,将深度神经网络的概率结果转换为二元分类,并评估预测性能。该预警机已根据意大利利古里亚地区记录的天气雷达数据进行了验证。
摘要:The problem of nowcasting extreme weather events can be addressed by applying
either numerical methods for the solution of dynamic model equations or
data-driven artificial intelligence algorithms. Within this latter framework,
the present paper illustrates how a deep learning method, exploiting videos of
radar reflectivity frames as input, can be used to realize a warning machine
able to sound timely alarms of possible severe thunderstorm events. From a
technical viewpoint, the computational core of this approach is the use of a
value-weighted skill score for both transforming the probabilistic outcomes of
the deep neural network into binary classification and assessing the
forecasting performances. The warning machine has been validated against
weather radar data recorded in the Liguria region, in Italy.
【7】 Non-parametric Kernel-Based Estimation of Probability Distributions for Precipitation Modeling
标题:基于非参数核的降水建模概率分布估计
链接:https://arxiv.org/abs/2109.09961
作者:Andrew Pavlides,Vasiliki Agou,Dionissios T. Hristopulos
机构:School of Mineral Resources Engineering, Technical University of Crete, Chania, Crete, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Crete , Greece
备注:49 pages, 21 figures
摘要:降水量的概率分布在很大程度上取决于所考虑的地理、气候带和时间尺度。封闭形式的参数概率分布不够灵活,无法为不同时间尺度的降水量提供准确和通用的模型。本文推导了湿时段降水量累积分布函数(CDF)的非参数估计。CDF估计是通过积分核密度估计得到的,从而得到不同核函数的半显式CDF表达式。我们使用合成数据集和来自克里特岛(希腊)的再分析降水数据,研究基于核的自适应插件带宽(KCDE)CDF估计。我们表明,与使用正常参考带宽的标准经验(阶梯)估计和基于核的估计相比,KCDE提供了更好的概率分布估计。我们还证明了KCDE能够通过逆变换采样方法模拟非参数降水量分布。
摘要:The probability distribution of precipitation amount strongly depends on
geography, climate zone, and time scale considered. Closed-form parametric
probability distributions are not sufficiently flexible to provide accurate and
universal models for precipitation amount over different time scales. In this
paper we derive non-parametric estimates of the cumulative distribution
function (CDF) of precipitation amount for wet time intervals. The CDF
estimates are obtained by integrating the kernel density estimator leading to
semi-explicit CDF expressions for different kernel functions. We investigate
kernel-based CDF estimation with an adaptive plug-in bandwidth (KCDE), using
both synthetic data sets and reanalysis precipitation data from the island of
Crete (Greece). We show that KCDE provides better estimates of the probability
distribution than the standard empirical (staircase) estimate and kernel-based
estimates that use the normal reference bandwidth. We also demonstrate that
KCDE enables the simulation of non-parametric precipitation amount
distributions by means of the inverse transform sampling method.
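A minimal sketch of the kernel CDF estimator and inverse-transform sampling; a fixed Gaussian-kernel bandwidth is assumed here, whereas the paper uses an adaptive plug-in bandwidth:

```python
# KCDE: average of kernel CDFs; sampling by solving F(x) = u for uniform u.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def kcde(x, data, h):
    """Kernel CDF estimate: average of Phi((x - x_i) / h) over the sample."""
    return norm.cdf((np.asarray(x) - data[:, None]) / h).mean(axis=0)

def sample_kcde(n, data, h, rng):
    lo, hi = data.min() - 8 * h, data.max() + 8 * h
    return np.array([brentq(lambda x: kcde([x], data, h)[0] - u, lo, hi)
                     for u in rng.uniform(1e-6, 1 - 1e-6, size=n)])

rng = np.random.default_rng(0)
wet = rng.gamma(shape=0.8, scale=10.0, size=500)   # toy precipitation amounts
draws = sample_kcde(1000, wet, h=1.0, rng=rng)
```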
其他神经网络|深度学习|模型|建模(12篇)
【1】 Homography augumented momentum constrastive learning for SAR image retrieval
标题:单应性增强动量对比学习在SAR图像检索中的应用
链接:https://arxiv.org/abs/2109.10329
作者:Seonho Park,Maciej Rysz,Kathleen M. Dipple,Panos M. Pardalos
机构:Department of Information Systems & Analytics, Miami University
摘要:基于深度学习的图像检索一直是计算机视觉领域的研究热点。深度神经网络(DNNs)提取的表示嵌入不仅旨在包含图像的语义信息,而且可以管理大规模的图像检索任务。在这项工作中,我们提出了一种基于深度学习的图像检索方法,使用单应变换增强对比学习来执行大规模合成孔径雷达(SAR)图像搜索任务。此外,我们提出了一种不需要任何标记程序的对比学习诱导的DNN训练方法。这可以相对轻松地实现大规模数据集的可处理性。最后,通过对极化SAR图像数据集的实验,验证了该方法的性能。
摘要:Deep learning-based image retrieval has been emphasized in computer vision.
Representation embedding extracted by deep neural networks (DNNs) not only aims
at containing semantic information of the image, but also can manage
large-scale image retrieval tasks. In this work, we propose a deep
learning-based image retrieval approach using homography transformation
augmented contrastive learning to perform large-scale synthetic aperture radar
(SAR) image search tasks. Moreover, we propose a training method for the DNNs
induced by contrastive learning that does not require any labeling procedure.
This may enable tractability of large-scale datasets with relative ease.
Finally, we verify the performance of the proposed method by conducting
experiments on the polarimetric SAR image datasets.
【2】 Introduction to Neural Network Verification
标题:神经网络验证导论
链接:https://arxiv.org/abs/2109.10317
作者:Aws Albarghouthi
机构:University of Wisconsin–Madison
摘要:深度学习改变了我们对软件的看法和它的功能。但是深层神经网络是脆弱的,它们的行为常常令人惊讶。在许多情况下,我们需要对神经网络的安全性、安全性、正确性或鲁棒性提供正式的保证。这本书涵盖了从形式验证到神经网络推理和深度学习的基本思想。
摘要:Deep learning has transformed the way we think of software and what it can
do. But deep neural networks are fragile and their behaviors are often
surprising. In many settings, we need to provide formal guarantees on the
safety, security, correctness, or robustness of neural networks. This book
covers foundational ideas from formal verification and their adaptation to
reasoning about neural networks and deep learning.
【3】 Learning PAC-Bayes Priors for Probabilistic Neural Networks
标题:概率神经网络的PAC-Bayes先验学习
链接:https://arxiv.org/abs/2109.10304
作者:Maria Perez-Ortiz,Omar Rivasplata,Benjamin Guedj,Matthew Gleeson,Jingyu Zhang,John Shawe-Taylor,Miroslaw Bober,Josef Kittler
机构:∗Centre for AI and Dept. of Computer Science, University College London (UK), †Inria, Lille Nord-Europe research centre and Inria London Programme (France), §Center for Vision, Speech and Signal Processing (CVSSP), University of Surrey (UK)
摘要:最近的工作研究了通过优化PAC贝叶斯边界训练的深度学习模型,其先验知识是在数据子集上学习的。这种组合已被证明不仅能产生准确的分类器,而且能产生非常严格的风险证书,有望实现自我认证学习(即使用所有数据学习预测值并证明其质量)。在这项工作中,我们实证研究了先验知识的作用。我们在6个具有不同策略和数据量的数据集上进行实验,学习数据相关的PAC Bayes先验,并比较它们对学习的预测因子的测试性能的影响及其风险证书的严密性。我们询问构建先验知识应该分配的最佳数据量是多少,并表明最佳数据量可能依赖于数据集。我们证明,使用一小部分先前的建筑数据来验证先前的结果是有希望的。我们包括对低参数化和超参数化模型的比较,以及对不同训练目标和正规化策略的实证研究,以了解先验分布。
摘要:Recent works have investigated deep learning models trained by optimising
PAC-Bayes bounds, with priors that are learnt on subsets of the data. This
combination has been shown to lead not only to accurate classifiers, but also
to remarkably tight risk certificates, bearing promise towards self-certified
learning (i.e. use all the data to learn a predictor and certify its quality).
In this work, we empirically investigate the role of the prior. We experiment
on 6 datasets with different strategies and amounts of data to learn
data-dependent PAC-Bayes priors, and we compare them in terms of their effect
on test performance of the learnt predictors and tightness of their risk
certificate. We ask what is the optimal amount of data which should be
allocated for building the prior and show that the optimum may be dataset
dependent. We demonstrate that using a small percentage of the prior-building
data for validation of the prior leads to promising results. We include a
comparison of underparameterised and overparameterised models, along with an
empirical study of different training objectives and regularisation strategies
to learn the prior distribution.
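A minimal sketch (ours) of the certification arithmetic behind this line of work, with illustrative numbers only; the PAC-Bayes-kl bound is evaluated on the split not used to build the prior:

```python
# Invert the binary-KL PAC-Bayes bound by bisection.
import numpy as np

def kl_bernoulli(q, p, eps=1e-12):
    q, p = np.clip(q, eps, 1 - eps), np.clip(p, eps, 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def kl_inverse(emp_risk, bound, tol=1e-9):
    """Largest p with kl(emp_risk || p) <= bound, found by bisection."""
    lo, hi = emp_risk, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if kl_bernoulli(emp_risk, mid) <= bound else (lo, mid)
    return lo

def pac_bayes_kl_bound(emp_risk, kl_post_prior, n_cert, delta=0.05):
    rhs = (kl_post_prior + np.log(2 * np.sqrt(n_cert) / delta)) / n_cert
    return kl_inverse(emp_risk, rhs)

print(pac_bayes_kl_bound(emp_risk=0.02, kl_post_prior=500.0, n_cert=40_000))
```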
【4】 Multiblock-Networks: A Neural Network Analog to Component Based Methods for Multi-Source Data
标题:多块网络:一种基于神经网络模拟分量的多源数据处理方法
链接:https://arxiv.org/abs/2109.10279
作者:Anna Jenul,Stefan Schrunner,Runar Helin,Kristian Hovde Liland,Cecilia Marie Futsæther,Oliver Tomic
机构:Norwegian University of Life Sciences
摘要:在应用机器学习中,对来自多个来源的数据集的预测模型进行训练是一种常见但具有挑战性的设置。尽管近年来模型解释越来越受到关注,但许多建模方法仍然主要关注性能。为了进一步提高机器学习模型的可解释性,我们建议采用基于组件的多块分析(也称为化学计量学)的成熟框架中的概念和工具。然而,人工神经网络在模型结构上提供了更大的灵活性,因此,通常提供了更好的预测性能。在本研究中,我们提出了一种将基于组件的统计模型(包括主成分回归和偏最小二乘回归的多块变量)的概念转移到神经网络结构的方法。因此,我们将神经网络的灵活性与多块方法中解释块相关性的概念结合起来。在两个用例中,我们演示了如何在实践中实现该概念,并将其与常见的无块前馈神经网络以及基于统计分量的多块方法进行了比较。我们的结果强调,多块网络允许基本模型解释,同时匹配普通前馈神经网络的性能。
摘要:Training predictive models on datasets from multiple sources is a common, yet
challenging setup in applied machine learning. Even though model interpretation
has attracted more attention in recent years, many modeling approaches still
focus mainly on performance. To further improve the interpretability of machine
learning models, we suggest the adoption of concepts and tools from the
well-established framework of component based multiblock analysis, also known
as chemometrics. Nevertheless, artificial neural networks provide greater
flexibility in model architecture and thus, often deliver superior predictive
performance. In this study, we propose a setup to transfer the concepts of
component based statistical models, including multiblock variants of principal
component regression and partial least squares regression, to neural network
architectures. Thereby, we combine the flexibility of neural networks with the
concepts for interpreting block relevance in multiblock methods. In two use
cases we demonstrate how the concept can be implemented in practice, and
compare it to both common feed-forward neural networks without blocks, as well
as statistical component based multiblock methods. Our results underline that
multiblock networks allow for basic model interpretation while matching the
performance of ordinary feed-forward neural networks.
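A minimal PyTorch sketch (ours) of the multiblock idea: one encoder per data block, with block contributions kept separable until fusion, which is what supports block-relevance interpretation:

```python
import torch
import torch.nn as nn

class MultiblockNet(nn.Module):
    def __init__(self, block_dims, hidden=16, out_dim=1):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in block_dims])
        self.head = nn.Linear(hidden * len(block_dims), out_dim)

    def forward(self, blocks):                   # list of (B, d_i) tensors
        encoded = [enc(b) for enc, b in zip(self.encoders, blocks)]
        return self.head(torch.cat(encoded, dim=1))

net = MultiblockNet([10, 5])                     # e.g. two sensor sources
print(net([torch.randn(4, 10), torch.randn(4, 5)]).shape)   # (4, 1)
```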
【5】 Learning low-degree functions from a logarithmic number of random queries
标题:从对数个随机查询中学习低次函数
链接:https://arxiv.org/abs/2109.10162
作者:Alexandros Eskenazis,Paata Ivanisvili
摘要:我们证明了对于任何整数 $n\in\mathbb{N}$、$d\in\{1,\ldots,n\}$ 和任何 $\varepsilon,\delta\in(0,1)$,次数至多为 $d$ 的有界函数 $f:\{-1,1\}^n\to[-1,1]$ 可以使用 $\log(\tfrac{n}{\delta})\,\varepsilon^{-d-1}C^{d^{3/2}\sqrt{\log d}}$ 次随机查询,以至少 $1-\delta$ 的概率学习到 $L_2$ 误差 $\varepsilon$,其中 $C>1$ 是一个通用有限常数。
摘要:We prove that for any integer $n\in\mathbb{N}$, $d\in\{1,\ldots,n\}$ and any
$\varepsilon,\delta\in(0,1)$, a bounded function $f:\{-1,1\}^n\to[-1,1]$ of
degree at most $d$ can be learned with probability at least $1-\delta$ and
$L_2$-error $\varepsilon$ using $\log(\tfrac{n}{\delta})\,\varepsilon^{-d-1}
C^{d^{3/2}\sqrt{\log d}}$ random queries for a universal finite constant $C>1$.
【6】 A Novel Structured Natural Gradient Descent for Deep Learning
标题:一种新的用于深度学习的结构化自然梯度下降算法
链接:https://arxiv.org/abs/2109.10100
作者:Weihua Liu,Xiabi Liu
机构: Beijing Lab of Intelligent Information Technology, School of Computer, Beijing Institute of Technology, Beijing, China
摘要:自然梯度下降(NGD)为深层神经网络提供了深刻的见解和强大的工具。然而,随着网络结构的大型化和复杂化,Fisher信息矩阵的计算变得越来越困难。本文提出了一种新的优化方法,其主要思想是通过重构网络精确地代替自然梯度优化。更具体地说,我们重建了深层神经网络的结构,并使用传统的梯度下降法(GD)对新网络进行了优化。重构后的网络通过自然梯度下降达到了优化方法的效果。实验结果表明,我们的优化方法可以加快深度网络模型的收敛速度,在保持计算简单性的同时获得比GD更好的性能。
摘要:Natural gradient descent (NGD) provided deep insights and powerful tools to
deep neural networks. However the computation of Fisher information matrix
becomes more and more difficult as the network structure turns large and
complex. This paper proposes a new optimization method whose main idea is to
accurately replace the natural gradient optimization by reconstructing the
network. More specifically, we reconstruct the structure of the deep neural
network, and optimize the new network using traditional gradient descent (GD).
The reconstructed network achieves the effect of the optimization way with
natural gradient descent. Experimental results show that our optimization
method can accelerate the convergence of deep network models and achieve better
performance than GD while sharing its computational simplicity.
【7】 Learning Interpretable Concept Groups in CNNs
标题:在CNN中学习可解释概念组
链接:https://arxiv.org/abs/2109.10078
作者:Saurabh Varshneya,Antoine Ledent,Robert A. Vandermeulen,Yunwen Lei,Matthias Enders,Damian Borth,Marius Kloft
机构:Technical University of Kaiserslautern, Germany, Technical University of Berlin, Germany, University of Birmingham, United Kingdom, NPZ Innovation GmbH, Germany, University of St.Gallen, Switzerland
摘要:我们提出了一种新的训练方法——概念组学习(CGL)——通过将每个层中的过滤器划分为概念组,鼓励训练可解释的CNN过滤器,每个概念组都经过训练以学习单个视觉概念。我们通过一种新的正则化策略来实现这一点,该策略强制同一组中的过滤器在给定层的类似图像区域中处于活动状态。此外,我们还使用正则化器来鼓励在每个层中对概念组进行稀疏加权,以便少数概念组比其他概念组具有更大的重要性。我们使用标准的解释性评估技术对CGL模型的解释性进行了定量评估,发现我们的方法在大多数情况下提高了解释性得分。定性地,我们比较了使用CGL学习的过滤器和不使用CGL学习的过滤器下最活跃的图像区域,发现CGL激活区域更集中于语义相关的特征。
摘要:We propose a novel training methodology -- Concept Group Learning (CGL) --
that encourages training of interpretable CNN filters by partitioning filters
in each layer into concept groups, each of which is trained to learn a single
visual concept. We achieve this through a novel regularization strategy that
forces filters in the same group to be active in similar image regions for a
given layer. We additionally use a regularizer to encourage a sparse weighting
of the concept groups in each layer so that a few concept groups can have
greater importance than others. We quantitatively evaluate CGL's model
interpretability using standard interpretability evaluation techniques and find
that our method increases interpretability scores in most cases. Qualitatively
we compare the image regions that are most active under filters learned using
CGL versus filters learned without CGL and find that CGL activation regions
more strongly concentrate around semantically relevant features.
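A minimal sketch of the sparse group-weighting ingredient; the group-lasso form is our assumption, and the spatial-coherence regularizer is not shown:

```python
# Filters of a conv layer are partitioned into equal concept groups; the sum
# of per-group L2 norms encourages a few groups to dominate.
import torch

def group_sparsity(conv_weight, num_groups):
    groups = conv_weight.reshape(num_groups, -1)   # one row per concept group
    return groups.norm(dim=1).sum()                # L2 within, L1 across groups

w = torch.randn(64, 32, 3, 3, requires_grad=True)  # 64 filters -> 8 groups of 8
penalty = group_sparsity(w, num_groups=8)          # add to the training loss
```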
【8】 Stabilizing Elastic Weight Consolidation method in practical ML tasks and using weight importances for neural network pruning
标题:实际ML任务中稳定弹性权值合并方法及权值重要性在神经网络修剪中的应用
链接:https://arxiv.org/abs/2109.10021
作者:Alexey Kutalev,Alisa Lapina
机构:SberDevices, PJSC Sberbank, Moscow, Russia, NeuroLab, PJSC Sberbank, Moscow, Russia
备注:16 pages, 7 figures
摘要:本文论述了弹性重量固结法在实际应用中的特点。在这里,我们将更严格地比较用于计算权重重要性的已知方法,这些方法适用于具有完全连接层和卷积层的网络。我们还将指出在具有卷积层和自关注层的多层神经网络中应用弹性权重固结方法时出现的问题,并提出克服这些问题的方法。此外,我们将注意到一个有趣的事实,即在神经网络修剪任务中使用各种类型的权重重要性。
摘要:This paper is devoted to the features of the practical application of Elastic
Weight Consolidation method. Here we will more rigorously compare the known
methodologies for calculating the importance of weights when applied to
networks with fully connected and convolutional layers. We will also point out
the problems that arise when applying the Elastic Weight Consolidation method
in multilayer neural networks with convolutional layers and self-attention
layers, and propose method to overcome these problems. In addition, we will
notice an interesting fact about the use of various types of weight importance
in the neural network pruning task.
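A minimal sketch of the EWC machinery discussed; the diagonal squared-gradient Fisher estimate is the usual choice, and batching details are our assumptions:

```python
# Weights important for an old task are anchored in proportion to their
# estimated importance.
import torch

def diagonal_fisher(model, loss_fn, data_loader):
    """Per-weight importance as the mean squared gradient over a dataset."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}

def ewc_penalty(model, fisher, old_params, lam):
    """Add 0.5 * lam * sum_i F_i (theta_i - theta*_i)^2 to the new-task loss."""
    total = sum((fisher[n] * (p - old_params[n]).pow(2)).sum()
                for n, p in model.named_parameters())
    return 0.5 * lam * total
```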
【9】 Neural networks with trainable matrix activation functions
标题:具有可训练矩阵激活函数的神经网络
链接:https://arxiv.org/abs/2109.09948
作者:Zhengqi Liu,Yuwen Li,Ludmil Zikatanov
摘要:神经网络的训练过程通常优化线性变换的权值和偏差参数,而非线性激活函数是预先指定和固定的。这项工作发展了一种系统的方法来构造矩阵激活函数,其条目是从ReLU中推广出来的。激活基于仅使用标量乘法和比较的矩阵向量乘法。所提出的激活函数依赖于与权重和偏差向量一起训练的参数。基于该方法的神经网络简单有效,在数值实验中表现出良好的鲁棒性。
摘要:The training process of neural networks usually optimize weights and bias
parameters of linear transformations, while nonlinear activation functions are
pre-specified and fixed. This work develops a systematic approach to
constructing matrix activation functions whose entries are generalized from
ReLU. The activation is based on matrix-vector multiplications using only
scalar multiplications and comparisons. The proposed activation functions
depend on parameters that are trained along with the weights and bias vectors.
Neural networks based on this approach are simple and efficient and are shown
to be robust in numerical experiments.
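The paper's construction is matrix-valued; the sketch below shows only a diagonal special case, with slopes selected by comparison against zero as the abstract describes (class and parameter names are ours):

```python
import torch
import torch.nn as nn

class DiagonalMatrixActivation(nn.Module):
    # Diagonal special case: sigma(x) = D(x) x, where each diagonal entry is a
    # trainable slope chosen by comparing x with zero (slopes (1, 0) recover ReLU).
    def __init__(self, dim):
        super().__init__()
        self.pos = nn.Parameter(torch.ones(dim))   # slope where x >= 0
        self.neg = nn.Parameter(torch.zeros(dim))  # slope where x < 0

    def forward(self, x):
        slope = torch.where(x >= 0, self.pos, self.neg)
        return slope * x                           # matrix-vector product with D(x)

act = DiagonalMatrixActivation(4)
print(act(torch.tensor([[-1.0, 0.5, 2.0, -0.1]])))  # ReLU-like at initialization
```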
【10】 IgNet. A Super-precise Convolutional Neural Network
Link: https://arxiv.org/abs/2109.09939
Authors: Igor Mackarov
Comments: 16 pages, 8 figures
Abstract: Convolutional neural networks (CNN) are known to be an effective means to
detect and analyze images. Their power is essentially based on the ability to
extract images' common features. There exist, however, images involving
unique, irregular features or details. Such is a collection of unusual
children's drawings reflecting the kids' imagination and individuality. These
drawings were analyzed by means of a CNN constructed with Keras-TensorFlow. The
same problem - on a significantly higher level - was solved with the newly
developed family of networks called IgNet that is described in this paper. It
proved able to learn all the categorical characteristics of the drawings with
100% accuracy. In the case of a regression task (learning the young artists'
ages), IgNet performed with an error of no more than 0.4%. The principles of
IgNet design that made it possible to reach such substantial results with a
rather simple network topology are discussed.
【11】 iRNN: Integer-only Recurrent Neural Network
Link: https://arxiv.org/abs/2109.09828
Authors: Eyyüb Sari, Vanessa Courville, Vahid Partovi Nia
Affiliations: Huawei Noah's Ark Lab
Abstract: Recurrent neural networks (RNN) are used in many real-world text and speech
applications. They include complex modules such as recurrence,
exponential-based activation, gate interaction, unfoldable normalization,
bi-directional dependence, and attention. The interaction between these
elements prevents running them on integer-only operations without a significant
performance drop. Deploying RNNs that include layer normalization and attention
on integer-only arithmetic is still an open problem. We present a
quantization-aware training method for obtaining a highly accurate integer-only
recurrent neural network (iRNN). Our approach supports layer normalization,
attention, and an adaptive piecewise linear approximation of activations, to
serve a wide range of RNNs on various applications. The proposed method is
proven to work on RNN-based language models and automatic speech recognition.
Our iRNN maintains performance similar to its full-precision counterpart; its
deployment on smartphones improves the runtime performance by $2\times$ and
reduces the model size by $4\times$.
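Quantization-aware training of this kind simulates integer arithmetic during training; a generic fake-quantization helper with a straight-through estimator (a standard building block, not the authors' code) looks like:

```python
import torch

def fake_quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    # Forward pass snaps x to an int8 grid; the straight-through estimator
    # lets gradients flow as if quantization were the identity.
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    x_q = (q - zero_point) * scale
    return x + (x_q - x).detach()

x = torch.randn(5, requires_grad=True)
y = fake_quantize(x, scale=0.05)
y.sum().backward()
print(x.grad)  # all ones: gradients pass straight through the rounding
```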
【12】 Molecular Energy Learning Using Alternative Blackbox Matrix-Matrix Multiplication Algorithm for Exact Gaussian Process
Link: https://arxiv.org/abs/2109.09817
Authors: Jiace Sun, Lixue Cheng, Thomas F. Miller III
Affiliations: Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA
Comments: Preprint, under review
Abstract: We present an application of the blackbox matrix-matrix multiplication (BBMM)
algorithm to scale up the Gaussian Process (GP) training of molecular energies
in the molecular-orbital based machine learning (MOB-ML) framework. An
alternative implementation of BBMM (AltBBMM) is also proposed to train more
efficiently (over four-fold speedup) with the same accuracy and transferability
as the original BBMM implementation. The training of MOB-ML was previously
limited to 220 molecules; BBMM and AltBBMM scale it up by over 30 times, to
6500 molecules (more than a million pair energies). The accuracy and
transferability of both algorithms are examined on the benchmark datasets of
organic molecules with 7 and 13 heavy atoms. These lower-scaling
implementations of the GP preserve the state-of-the-art learning efficiency in
the low-data regime while extending it to the large-data regime with better
accuracy than other available machine learning works on molecular energies.
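The essence of BBMM-style GP training is replacing Cholesky factorization with iterative solves that touch the kernel only through matrix-vector products. A toy sketch with SciPy's conjugate-gradient solver follows; the RBF kernel and random data are placeholders, and a real BBMM implementation would evaluate the product in batches without ever storing the full kernel:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def kernel_matvec(X, v, lengthscale=1.0, noise=1e-2):
    # Computes (K + noise * I) v for an RBF kernel. Forming K densely here is
    # for illustration only; BBMM avoids materializing it.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * sq / lengthscale**2)
    return K @ v + noise * v

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 5)), rng.normal(size=200)
A = LinearOperator((200, 200), matvec=lambda v: kernel_matvec(X, v))
alpha, info = cg(A, y)   # solve (K + sigma^2 I) alpha = y via matvecs only
print(info, alpha[:3])   # info == 0 on convergence
```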
Others (14 papers)
【1】 Relation-Guided Pre-Training for Open-Domain Question Answering
Link: https://arxiv.org/abs/2109.10346
Authors: Ziniu Hu, Yizhou Sun, Kai-Wei Chang
Affiliations: University of California, Los Angeles
Abstract: Answering complex open-domain questions requires understanding the latent
relations between the entities involved. However, we found that the existing QA
datasets are extremely imbalanced in some types of relations, which hurts the
generalization performance over questions with long-tail relations. To remedy
this problem, in this paper, we propose a Relation-Guided Pre-Training
(RGPT-QA) framework. We first generate a relational QA dataset covering a wide
range of relations from both the Wikidata triplets and Wikipedia hyperlinks. We
then pre-train a QA model to infer the latent relations from the question, and
then conduct extractive QA to get the target answer entity. We demonstrate that
by pretraining with the proposed RGPT-QA technique, the popular open-domain QA
model, Dense Passage Retriever (DPR), achieves 2.2%, 2.4%, and 6.3% absolute
improvements in Exact Match accuracy on Natural Questions, TriviaQA, and
WebQuestions, respectively. In particular, we show that RGPT-QA improves
significantly on questions with long-tail relations.
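As a toy illustration of turning knowledge triplets into relational QA pairs, consider the template-based sketch below; the actual RGPT-QA generation pipeline is more involved, and the templates here are invented:

```python
# Hypothetical templates mapping a relation to a question form.
TEMPLATES = {
    "place_of_birth": "Where was {s} born?",
    "author": "Who wrote {s}?",
}

def triplet_to_qa(subj, relation, obj):
    # Convert a (subject, relation, object) triplet into a QA training example.
    template = TEMPLATES.get(relation)
    if template is None:
        return None
    return {"question": template.format(s=subj), "answer": obj, "relation": relation}

print(triplet_to_qa("Ada Lovelace", "place_of_birth", "London"))
```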
【2】 Long-Term Exploration in Persistent MDPs
Link: https://arxiv.org/abs/2109.10173
Authors: Leonid Ugadiarov, Alexey Skrynnik, Aleksandr I. Panov
Affiliations: Moscow Institute of Physics and Technology, Moscow, Russia; Artificial Intelligence Research Institute FRC CSC RAS, Moscow, Russia
Comments: This is a preprint of the paper accepted to MICAI 2021. It contains 13 pages and 6 figures
Abstract: Exploration is an essential part of reinforcement learning, which restricts
the quality of the learned policy. Hard-exploration environments are defined by
huge state space and sparse rewards. In such conditions, an exhaustive
exploration of the environment is often impossible, and the successful training
of an agent requires a lot of interaction steps. In this paper, we propose an
exploration method called Rollback-Explore (RbExplore), which utilizes the
concept of the persistent Markov decision process, in which agents during
training can roll back to visited states. We test our algorithm in the
hard-exploration Prince of Persia game, without rewards and domain knowledge.
At all used levels of the game, our agent outperforms or shows comparable
results with state-of-the-art curiosity methods with knowledge-based intrinsic
motivation: ICM and RND. An implementation of RbExplore can be found at
https://github.com/cds-mipt/RbExplore.
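A minimal sketch of the rollback idea, assuming a persistent environment that exposes hypothetical snapshot/restore/sample_action/is_novel methods (none of these names come from the paper's code):

```python
import random

def rollback_explore(env, steps=1000):
    # env is assumed persistent: snapshot() / restore() save and load the full
    # simulator state; is_novel() stands in for a state-similarity check.
    archive = [env.snapshot()]
    for _ in range(steps):
        env.restore(random.choice(archive))   # roll back to a visited state
        env.step(env.sample_action())
        if env.is_novel():                    # keep states that look new
            archive.append(env.snapshot())
    return archive
```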
【3】 Towards a Fairness-Aware Scoring System for Algorithmic Decision-Making
Link: https://arxiv.org/abs/2109.10053
Authors: Yi Yang, Ying Wu, Xiangyu Chang, Mei Li
Affiliations: School of Management, Xi'an Jiaotong University; Department of Information Management and E-Business, Xi'an Jiaotong University; Center for Intelligent Decision-Making and Machine Learning, Xi'an Jiaotong University
Abstract: Scoring systems, as simple classification models, have significant advantages
in interpretability and transparency when making predictions. They facilitate
human decision-making by allowing a quick prediction to be made by hand, by
adding and subtracting a few point scores, and have thus been widely used in
various fields such as medical diagnosis in Intensive Care Units. However,
the (un)fairness issues in these models have long been criticized, and the use
of biased data in the construction of scoring systems heightens this concern.
In this paper, we propose a general framework to create data-driven
fairness-aware scoring systems. Our approach is first to develop a social
welfare function that incorporates both efficiency and equity. Then, we
translate the social welfare maximization problem in economics into the
empirical risk minimization task in the machine learning community to derive a
fairness-aware scoring system with the help of mixed integer programming. We
show that the proposed framework provides practitioners or policymakers great
flexibility to select their desired fairness requirements and also allows them
to customize their own requirements by imposing various operational
constraints. Experimental evidence on several real data sets verifies that the
proposed scoring system can achieve the optimal welfare of stakeholders and
balance the interpretability, fairness, and efficiency issues.
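To see why scoring systems are easy to apply by hand, here is a toy system with invented point values and threshold (not one derived by the paper's mixed-integer-programming formulation):

```python
# Toy scoring system: a prediction is just adding and subtracting point scores.
POINTS = {"age_over_60": 2, "abnormal_blood_pressure": 3, "prior_admission": 1}
THRESHOLD = 4

def predict(patient):
    # Sum the points of the features the patient exhibits, then threshold.
    score = sum(pts for feat, pts in POINTS.items() if patient.get(feat))
    return int(score >= THRESHOLD)

print(predict({"age_over_60": True, "abnormal_blood_pressure": True}))  # 1
```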
【4】 Identifying biases in legal data: An algorithmic fairness perspective
Link: https://arxiv.org/abs/2109.09946
Authors: Jackson Sargent, Melanie Weber
Affiliations: University of Michigan; Princeton University & Claudius Legal Intelligence
Comments: EAAMO 2021
Abstract: The need to address representation biases and sentencing disparities in legal
case data has long been recognized. Here, we study the problem of identifying
and measuring biases in large-scale legal case data from an algorithmic
fairness perspective. Our approach utilizes two regression models: A baseline
that represents the decisions of a "typical" judge as given by the data and a
"fair" judge that applies one of three fairness concepts. Comparing the
decisions of the "typical" judge and the "fair" judge allows for quantifying
biases across demographic groups, as we demonstrate in four case studies on
criminal data from Cook County (Illinois).
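A self-contained sketch of the two-model comparison on synthetic data, using "drop the protected attribute" as a stand-in for one of the paper's three fairness concepts:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))           # case features
group = rng.integers(0, 2, 500)         # protected attribute
y = (X[:, 0] + 0.8 * group + rng.normal(scale=0.5, size=500) > 0).astype(int)

typical = LogisticRegression().fit(np.c_[X, group], y)  # mirrors the data's judge
fair = LogisticRegression().fit(X, y)   # ignores the protected attribute

for g in (0, 1):  # the decision-rate gap between the two judges quantifies bias
    m = group == g
    gap = typical.predict(np.c_[X, group])[m].mean() - fair.predict(X)[m].mean()
    print(f"group {g}: decision-rate gap {gap:+.3f}")
```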
【5】 Demonstration-Efficient Guided Policy Search via Imitation of Robust Tube MPC
Link: https://arxiv.org/abs/2109.09910
Authors: Andrea Tagliabue, Dong-Ki Kim, Michael Everett, Jonathan P. How
Affiliations: MIT Department of Aeronautics and Astronautics
Comments: Submitted to the 2022 IEEE International Conference on Robotics and Automation (ICRA)
Abstract: We propose a demonstration-efficient strategy to compress a computationally
expensive Model Predictive Controller (MPC) into a more computationally
efficient representation based on a deep neural network and Imitation Learning
(IL). By generating a Robust Tube variant (RTMPC) of the MPC and leveraging
properties from the tube, we introduce a data augmentation method that enables
high demonstration-efficiency and is capable of compensating for the
distribution shifts typically encountered in IL. Our approach opens the
possibility of
zero-shot transfer from a single demonstration collected in a nominal domain,
such as a simulation or a robot in a lab/controlled environment, to a domain
with bounded model errors/perturbations. Numerical and experimental evaluations
performed on a trajectory tracking MPC for a quadrotor show that our method
outperforms strategies commonly employed in IL, such as DAgger and Domain
Randomization, in terms of demonstration-efficiency and robustness to
perturbations unseen during training.
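A sketch of the tube-guided augmentation described above: states sampled inside a tube around the demonstrated trajectory are labeled with the ancillary feedback law u = u_nom + K (x - x_nom). The uniform tube and sampling scheme here are simplifications, and K and the tube radius are assumed given:

```python
import numpy as np

def augment_demo(xs_nom, us_nom, K, tube_radius, n_samples=10, seed=0):
    # xs_nom: (T, n) nominal states; us_nom: (T, m) nominal inputs; K: (m, n).
    rng = np.random.default_rng(seed)
    data = []
    for x_nom, u_nom in zip(xs_nom, us_nom):
        for _ in range(n_samples):
            dx = rng.uniform(-tube_radius, tube_radius, size=x_nom.shape)
            data.append((x_nom + dx, u_nom + K @ dx))  # ancillary controller label
    return data
```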
【6】 Fast TreeSHAP: Accelerating SHAP Value Computation for Trees
Link: https://arxiv.org/abs/2109.09847
Authors: Jilei Yang
Affiliations: LinkedIn Corporation
Comments: 21 pages (including 9-page appendix), 1 figure
Abstract: SHAP (SHapley Additive exPlanation) values are one of the leading tools for
interpreting machine learning models, with strong theoretical guarantees
(consistency, local accuracy) and a wide availability of implementations and
use cases. Even though computing SHAP values takes exponential time in general,
TreeSHAP takes polynomial time on tree-based models. While the speedup is
significant, TreeSHAP can still dominate the computation time of industry-level
machine learning solutions on datasets with millions or more entries, causing
delays in post-hoc model diagnosis and interpretation services. In this paper,
we present two new algorithms, Fast TreeSHAP v1 and v2, designed to improve the
computational efficiency of TreeSHAP for large datasets. We empirically find
that Fast TreeSHAP v1 is 1.5x faster than TreeSHAP while keeping the memory
cost unchanged. Similarly, Fast TreeSHAP v2 is 2.5x faster than TreeSHAP, at
the cost of a slightly higher memory usage, thanks to the pre-computation of
expensive TreeSHAP steps. We also show that Fast TreeSHAP v2 is well-suited for
multi-time model interpretations, resulting in as high as 3x faster explanation
of newly incoming samples.
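For context, computing baseline TreeSHAP values with the shap package looks like the following; Fast TreeSHAP v1/v2 accelerate this computation, and the paper's own implementation is not assumed to be publicly packaged:

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=1000)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)   # polynomial-time TreeSHAP on trees
shap_values = explainer.shap_values(X[:100])
print(shap_values.shape)                # (100, 10): one attribution per feature
```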
【7】 Revisiting the Characteristics of Stochastic Gradient Noise and Dynamics
Link: https://arxiv.org/abs/2109.09833
Authors: Yixin Wu, Rui Luo, Chen Zhang, Jun Wang, Yaodong Yang
Affiliations: University College London; King's College London
Comments: 18 pages
Abstract: In this paper, we characterize the noise of stochastic gradients and analyze
the noise-induced dynamics during the training of deep neural networks by
gradient-based optimizers. Specifically, we first show that the stochastic
gradient noise possesses finite variance, and therefore the classical Central
Limit Theorem (CLT) applies; this indicates that the gradient noise is
asymptotically Gaussian. Such an asymptotic result validates the widely
accepted assumption of Gaussian noise. We clarify that the recently observed
phenomenon of heavy tails within gradient noise may not be an intrinsic
property but rather the consequence of an insufficient mini-batch size; the
gradient noise, being a sum of a limited number of i.i.d. random variables, has
not reached the asymptotic regime of the CLT and thus deviates from Gaussian.
We quantitatively measure the goodness of
Gaussian approximation of the noise, which supports our conclusion. Secondly,
we analyze the noise-induced dynamics of stochastic gradient descent using the
Langevin equation, granting the momentum hyperparameter in the optimizer a
physical interpretation. We then proceed to demonstrate the existence of the
steady-state distribution of stochastic gradient descent and approximate the
distribution at a small learning rate.
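The finite-variance/CLT argument can be probed empirically. The toy experiment below collects mini-batch gradients of a linear-regression loss and applies a normality test at two batch sizes, a sanity check in the spirit of the paper rather than its actual experiment:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X, w = rng.normal(size=(50_000, 10)), rng.normal(size=10)
y = X @ w + rng.normal(scale=0.1, size=50_000)

def minibatch_grad(theta, batch):
    xb, yb = X[batch], y[batch]
    return 2 * xb.T @ (xb @ theta - yb) / len(batch)

theta = np.zeros(10)
for bs in (8, 512):  # larger batches push the noise closer to Gaussian
    grads = [minibatch_grad(theta, rng.choice(50_000, bs))[0] for _ in range(2000)]
    print(f"batch {bs}: normality p-value {stats.normaltest(grads).pvalue:.3f}")
```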
【8】 Weak Signals in the Mobility Landscape: Car Sharing in Ten European Cities
Link: https://arxiv.org/abs/2109.09832
Authors: Chiara Boldrini, Raffaele Bruno, Haitam Laarabi
Affiliations: Full list of author information is available at the end of the article
Abstract: Car sharing is one of the pillars of a smart transportation infrastructure, as
it is expected to reduce traffic congestion, parking demands and pollution in
our cities. From the point of view of demand modelling, car sharing is a weak
signal in the city landscape: only a small percentage of the population uses
it, and thus it is difficult to study reliably with traditional techniques such
as household travel diaries. In this work, we depart from these traditional
approaches and we leverage web-based, digital records about vehicle
availability in 10 European cities for one of the major active car sharing
operators. We discuss which sociodemographic and urban activity indicators are
associated with variations in car sharing demand, which forecasting approach
(among the most popular in the related literature) is better suited to predict
pickup and drop-off events, and how the spatio-temporal information about
vehicle availability can be used to infer how different zones in a city are
used by customers. We conclude the paper by presenting a direct application of
the analysis of the dataset, aimed at identifying where to locate maintenance
facilities within the car sharing operation area.
【9】 Towards Energy-Efficient and Secure Edge AI: A Cross-Layer Framework
Link: https://arxiv.org/abs/2109.09829
Authors: Muhammad Shafique, Alberto Marchisio, Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif
Affiliations: New York University Abu Dhabi (NYUAD), Abu Dhabi, United Arab Emirates; Technische Universität Wien (TU Wien), Vienna, Austria
Comments: To appear at the 40th IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November 2021, Virtual Event
Abstract: The security and privacy concerns, along with the amount of data that needs
to be processed on a regular basis, have pushed processing to the edge of
computing systems. Deploying advanced Neural Networks (NN), such as deep
neural networks (DNNs) and spiking neural networks (SNNs), that offer
state-of-the-art results on resource-constrained edge devices is challenging
due to the stringent memory and power/energy constraints. Moreover, these
systems are required to maintain correct functionality under diverse security
and reliability threats. This paper first discusses existing approaches to
address energy efficiency, reliability, and security issues at different system
layers, i.e., hardware (HW) and software (SW). Afterward, we discuss how to
further improve the performance (latency) and the energy efficiency of Edge AI
systems through HW/SW-level optimizations, such as pruning, quantization, and
approximation. To address reliability threats (like permanent and transient
faults), we highlight cost-effective mitigation techniques, like fault-aware
training and mapping. Moreover, we briefly discuss effective detection and
protection techniques to address security threats (like model and data
corruption). Towards the end, we discuss how these techniques can be combined
in an integrated cross-layer framework for realizing robust and
energy-efficient Edge AI systems.
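Of the HW/SW-level optimizations listed, magnitude pruning is the simplest to sketch. The in-place PyTorch helper below zeroes the smallest-magnitude fraction of each weight tensor; it illustrates the general technique, not a method from this survey:

```python
import torch

def magnitude_prune_(model, sparsity=0.5):
    # Zero out the smallest |w| entries of every weight matrix, in place.
    for p in model.parameters():
        if p.dim() < 2:          # skip biases and norm parameters
            continue
        k = int(p.numel() * sparsity)
        if k == 0:
            continue
        threshold = p.abs().flatten().kthvalue(k).values
        p.data.mul_((p.abs() > threshold).float())

model = torch.nn.Sequential(torch.nn.Linear(10, 10), torch.nn.ReLU(),
                            torch.nn.Linear(10, 2))
magnitude_prune_(model, sparsity=0.5)
print(sum((p == 0).sum().item() for p in model.parameters()))  # pruned weights
```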
【10】 Data Augmentation Methods for Anaphoric Zero Pronouns
Link: https://arxiv.org/abs/2109.09825
Authors: Abdulrahman Aloraini, Massimo Poesio
Affiliations: Queen Mary University of London, United Kingdom; Qassim University, Saudi Arabia
Comments: CRAC 2021 @ EMNLP 2021
Abstract: In pro-drop languages like Arabic, Chinese, Italian, Japanese, Spanish, and
many others, unrealized (null) arguments in certain syntactic positions can
refer to a previously introduced entity, and are thus called anaphoric zero
pronouns. The existing resources for studying anaphoric zero pronoun
interpretation are, however, still limited. In this paper, we use five data
augmentation methods to generate and detect anaphoric zero pronouns
automatically. We use the augmented data as additional training materials for
two anaphoric zero pronoun systems for Arabic. Our experimental results show
that data augmentation improves the performance of the two systems, surpassing
the state-of-the-art results.
【11】 Metamorphic Relation Prioritization for Effective Regression Testing
Link: https://arxiv.org/abs/2109.09798
Authors: Madhusudan Srinivasan, Upulee Kanewala
Abstract: Metamorphic testing (MT) is widely used for testing programs that face the
oracle problem. It uses a set of metamorphic relations (MRs), which are
relations among multiple inputs and their corresponding outputs to determine
whether the program under test is faulty. Typically, MRs vary in their ability
to detect faults in the program under test, and some MRs tend to detect the
same set of faults. In this paper, we propose approaches to prioritize MRs to
improve the efficiency and effectiveness of MT for regression testing. We
present two MR prioritization approaches: (1) fault-based and (2)
coverage-based. To evaluate these MR prioritization approaches, we conduct
experiments on three complex open-source software systems. Our results show
that the MR prioritization approaches developed by us significantly outperform
the current practice of executing the source and follow-up test cases of the
MRs in an ad-hoc manner in terms of fault detection effectiveness. Further,
fault-based MR prioritization reduces the number of source and follow-up test
cases that need to be executed, as well as the average time taken to detect a
fault, which results in saving time and cost during the testing process.
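The fault-based approach can be illustrated with a greedy additional-fault-coverage ordering; this is our simplification, and the paper's precise prioritization criteria may differ:

```python
def prioritize(mr_faults):
    # mr_faults: {mr_name: set of fault ids it detected historically}.
    # Greedily pick the MR that covers the most not-yet-covered faults.
    covered, order = set(), []
    remaining = dict(mr_faults)
    while remaining:
        best = max(remaining, key=lambda mr: len(remaining[mr] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order

print(prioritize({"MR1": {1, 2}, "MR2": {2, 3, 4}, "MR3": {4}}))
# ['MR2', 'MR1', 'MR3']
```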
【12】 Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment
Link: https://arxiv.org/abs/2109.09774
Authors: Corinna Cortes, Neil D. Lawrence
Affiliations: Google Research, New York; Computer Lab, University of Cambridge
Comments: Source code available at this https URL
Abstract: In this paper, we revisit the 2014 NeurIPS experiment that examined
inconsistency in conference peer review. We determine that 50% of the
variation in reviewer quality scores was subjective in origin. Further, with
seven years having passed since the experiment, we find that for accepted
papers there is no correlation between quality scores and the impact of the
paper as measured by citation count. We trace the fate of rejected
papers, recovering where these papers were eventually published. For these
papers we find a correlation between quality scores and impact. We conclude
that the reviewing process for the 2014 conference was good for identifying
poor papers, but poor for identifying good papers. We give some suggestions for
improving the reviewing process but also warn against removing the subjective
element. Finally, we suggest that the real conclusion of the experiment is that
the community should place less onus on the notion of 'top-tier conference
publications' when assessing the quality of individual researchers. For NeurIPS
2021, the PCs are repeating the experiment, as well as conducting new ones.
【13】 Multifield Cosmology with Artificial Intelligence
Link: https://arxiv.org/abs/2109.09747
Authors: Francisco Villaescusa-Navarro, Daniel Anglés-Alcázar, Shy Genel, David N. Spergel, Yin Li, Benjamin Wandelt, Andrina Nicola, Leander Thiele, Sultan Hassan, Jose Manuel Zorrilla Matilla, Desika Narayanan, Romeel Dave, Mark Vogelsberger
Affiliations: Department of Astrophysical Sciences, Princeton University, Peyton Hall, Princeton, NJ, USA; Center for Computational Astrophysics, Flatiron Institute
Comments: 11 pages, 7 figures. First paper of a series of four. All 2D maps, codes, and network weights publicly available at this https URL
Abstract: Astrophysical processes such as feedback from supernovae and active galactic
nuclei modify the properties and spatial distribution of dark matter, gas, and
galaxies in a poorly understood way. This uncertainty is one of the main
theoretical obstacles to extract information from cosmological surveys. We use
2,000 state-of-the-art hydrodynamic simulations from the CAMELS project
spanning a wide variety of cosmological and astrophysical models and generate
hundreds of thousands of 2-dimensional maps for 13 different fields: from dark
matter to gas and stellar properties. We use these maps to train convolutional
neural networks to extract the maximum amount of cosmological information while
marginalizing over astrophysical effects at the field level. Although our maps
only cover a small area of $(25~h^{-1}{\rm Mpc})^2$, and the different fields
are contaminated by astrophysical effects in very different ways, our networks
can infer the values of $\Omega_{\rm m}$ and $\sigma_8$ with a few percent
level precision for most of the fields. We find that the marginalization
performed by the network retains a wealth of cosmological information compared
to a model trained on maps from gravity-only N-body simulations that are not
contaminated by astrophysical effects. Finally, we train our networks on
multifields -- 2D maps that contain several fields as different colors or
channels -- and find that not only can they infer the values of all parameters
with higher accuracy than networks trained on individual fields, but they can
also constrain the value of $\Omega_{\rm m}$ with higher accuracy than with the
maps from the N-body simulations.
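A minimal PyTorch sketch of the multifield setup: a small CNN mapping a stack of 13 field channels to the two parameters. The architecture below is invented for illustration; the CAMELS networks and training objective are more elaborate:

```python
import torch
import torch.nn as nn

class MapRegressor(nn.Module):
    # Regress (Omega_m, sigma_8) from a stack of 2D field maps.
    def __init__(self, n_fields=13, n_params=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_fields, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n_params),
        )

    def forward(self, maps):  # maps: (batch, n_fields, H, W) multifield input
        return self.net(maps)

model = MapRegressor()
print(model(torch.randn(4, 13, 64, 64)).shape)  # torch.Size([4, 2])
```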
【14】 Neural Distance Embeddings for Biological Sequences
Link: https://arxiv.org/abs/2109.09740
Authors: Gabriele Corso, Rex Ying, Michal Pándy, Petar Veličković, Jure Leskovec, Pietro Liò
Affiliations: MIT; Stanford University; University of Cambridge; DeepMind
Abstract: The development of data-dependent heuristics and representations for
biological sequences that reflect their evolutionary distance is critical for
large-scale biological research. However, popular machine learning approaches,
based on continuous Euclidean spaces, have struggled with the discrete
combinatorial formulation of the edit distance that models evolution and the
hierarchical relationship that characterises real-world datasets. We present
Neural Distance Embeddings (NeuroSEED), a general framework to embed sequences
in geometric vector spaces, and illustrate the effectiveness of the hyperbolic
space that captures the hierarchical structure and provides an average 22%
reduction in embedding RMSE against the best competing geometry. The capacity
of the framework and the significance of these improvements are then
demonstrated by devising supervised and unsupervised NeuroSEED approaches to
multiple core tasks in bioinformatics. Benchmarked against common baselines, the
proposed approaches display significant accuracy and/or runtime improvements on
real-world datasets. As an example for hierarchical clustering, the proposed
pretrained and from-scratch methods match the quality of competing baselines
with 30x and 15x runtime reduction, respectively.
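The hyperbolic geometry at the core of NeuroSEED can be made concrete with the Poincaré-ball distance; a NumPy sketch of just the metric (the embedding model and training loss are omitted):

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    # d(u, v) = arccosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2) (1 - ||v||^2))),
    # defined for points strictly inside the unit ball.
    sq = np.sum((u - v) ** 2)
    den = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq / (den + eps))

u, v = np.array([0.1, 0.2]), np.array([0.4, -0.3])
print(poincare_distance(u, v))
# An encoder would be trained so this distance approximates the edit distance
# between the two embedded sequences.
```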