cs.LG: 161 papers today
Graph-related (graph learning | graph neural networks | graph optimization, etc.) (12 papers)
【1】 Graph-based Retrieval for Claim Verification over Cross-Document Evidence
Link: https://arxiv.org/abs/2109.06022
Authors: Misael Mongiovì, Aldo Gangemi
Affiliations: ISTC-CNR, Catania and Rome, Italy
Abstract: Verifying the veracity of claims requires reasoning over a large knowledge
base, often in the form of corpora of trustworthy sources. A common approach
consists in retrieving short portions of relevant text from the reference
documents and giving them as input to a natural language inference module that
determines whether the claim can be inferred or contradicted from them. This
approach, however, struggles when multiple pieces of evidence need to be
collected and combined from different documents, since the single documents are
often barely related to the target claim and hence they are left out by the
retrieval module. We conjecture that a graph-based approach can be beneficial
to identify fragmented evidence. We tested this hypothesis by building, over
the whole corpus, a large graph that interconnects text portions by means of
mentioned entities and exploiting such a graph for identifying candidate sets
of evidence from multiple sources. Our experiments show that leveraging on a
graph structure is beneficial in identifying a reasonably small portion of
passages related to a claim.
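To make the idea concrete, here is a minimal sketch (not the authors' code) of an entity-based passage graph: passages are linked through the entities they mention, and candidate evidence sets are passage combinations that jointly cover a claim's entities, even when each passage alone is only weakly related to the claim. The toy corpus and the availability of entity annotations are assumptions.

```python
from collections import defaultdict
from itertools import combinations

# Toy corpus: passage id -> set of entities it mentions (assumed given,
# e.g. from an entity linker; the paper builds such a graph over a whole corpus).
passages = {
    "doc1.p3": {"Marie Curie", "Nobel Prize"},
    "doc2.p1": {"Nobel Prize", "Physics"},
    "doc3.p7": {"Marie Curie", "Physics"},
    "doc4.p2": {"Paris"},
}

# Bipartite index: entity -> passages that mention it.
entity_index = defaultdict(set)
for pid, ents in passages.items():
    for e in ents:
        entity_index[e].add(pid)

def candidate_evidence_sets(claim_entities, max_size=2):
    """Return sets of passages, possibly from different documents, that
    jointly cover the claim's entities."""
    pool = set().union(*(entity_index[e] for e in claim_entities if e in entity_index))
    candidates = []
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(pool), size):
            covered = set().union(*(passages[p] for p in combo))
            if claim_entities <= covered:
                candidates.append(combo)
    return candidates

# A claim whose supporting evidence is fragmented across documents.
print(candidate_evidence_sets({"Marie Curie", "Physics"}))
```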
【2】 A deep learning guided memetic framework for graph coloring problems
Link: https://arxiv.org/abs/2109.05948
Authors: Olivier Goudet, Cyril Grelier, Jin-Kao Hao
Affiliations: Department of Computer Science, Université d'Angers (J.-K. Hao is the corresponding author)
Abstract: Given an undirected graph $G=(V,E)$ with a set of vertices $V$ and a set of
edges $E$, a graph coloring problem involves finding a partition of the
vertices into different independent sets. In this paper we present a new
framework which combines a deep neural network with the best tools of
"classical" metaheuristics for graph coloring. The proposed algorithm is
evaluated on the weighted graph coloring problem and computational results show
that the proposed approach allows to obtain new upper bounds for medium and
large graphs. A study of the contribution of deep learning in the algorithm
highlights that it is possible to learn relevant patterns useful to obtain
better solutions to this problem.
【3】 r-GAT: Relational Graph Attention Network for Multi-Relational Graphs
Link: https://arxiv.org/abs/2109.05922
Authors: Meiqi Chen, Yuan Zhang, Xiaoyu Kou, Yuntao Li, Yan Zhang
Affiliations: Key Laboratory of Machine Perception (MOE), Department of Machine Intelligence, Peking University, Beijing, China
Abstract: Graph Attention Network (GAT) focuses on modelling simple undirected and
single relational graph data only. This limits its ability to deal with more
general and complex multi-relational graphs that contain entities with directed
links of different labels (e.g., knowledge graphs). Therefore, directly
applying GAT on multi-relational graphs leads to sub-optimal solutions. To
tackle this issue, we propose r-GAT, a relational graph attention network to
learn multi-channel entity representations. Specifically, each channel
corresponds to a latent semantic aspect of an entity. This enables us to
aggregate neighborhood information for the current aspect using relation
features. We further propose a query-aware attention mechanism for subsequent
tasks to select useful aspects. Extensive experiments on link prediction and
entity classification tasks show that our r-GAT can model multi-relational
graphs effectively. Also, we show the interpretability of our approach by case
study.
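A rough PyTorch sketch of what a multi-channel, relation-aware attention layer could look like; the dimensions, scoring function, and aggregation rule are illustrative choices, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiChannelRelationalAttention(nn.Module):
    """Toy layer in the spirit of r-GAT: each channel learns one latent
    aspect of an entity and aggregates neighbors with relation-aware
    attention scores."""

    def __init__(self, dim, n_rel, n_channels=4):
        super().__init__()
        self.channels = n_channels
        self.proj = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_channels))
        self.rel_emb = nn.Embedding(n_rel, dim)
        self.att = nn.ModuleList(nn.Linear(3 * dim, 1) for _ in range(n_channels))

    def forward(self, h, edges, rels):
        # h: (N, dim) entity features; edges: (E, 2) [src, dst]; rels: (E,)
        src, dst = edges[:, 0], edges[:, 1]
        r = self.rel_emb(rels)
        outs = []
        for c in range(self.channels):
            hc = self.proj[c](h)
            score = self.att[c](torch.cat([hc[src], r, hc[dst]], dim=-1)).squeeze(-1)
            # softmax over each destination node's incoming edges
            alpha = torch.zeros_like(score)
            for node in dst.unique():
                m = dst == node
                alpha[m] = F.softmax(score[m], dim=0)
            agg = torch.zeros_like(hc).index_add_(0, dst, alpha.unsqueeze(-1) * (hc[src] + r))
            outs.append(agg)
        return torch.cat(outs, dim=-1)  # (N, n_channels * dim)

layer = MultiChannelRelationalAttention(dim=8, n_rel=3)
h = torch.randn(5, 8)
edges = torch.tensor([[0, 1], [2, 1], [3, 4]])
rels = torch.tensor([0, 2, 1])
print(layer(h, edges, rels).shape)  # torch.Size([5, 32])
```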
【4】 Process Discovery Using Graph Neural Networks
Link: https://arxiv.org/abs/2109.05835
Authors: Dominique Sommers, Vlado Menkovski, Dirk Fahland
Affiliations: Eindhoven University of Technology, Mathematics and Computer Science, Eindhoven, the Netherlands
Note: accepted at IEEE International Conference on Process Mining (ICPM) 2021, submitted version
Abstract: Automatically discovering a process model from an event log is the prime
problem in process mining. This task is so far approached as an unsupervised
learning problem through graph synthesis algorithms. Algorithmic design
decisions and heuristics allow for efficiently finding models in a reduced
search space. However, design decisions and heuristics are derived from
assumptions about how a given behavioral description - an event log -
translates into a process model and were not learned from actual models which
introduce biases in the solutions. In this paper, we explore the problem of
supervised learning of a process discovery technique D. We introduce a
technique for training an ML-based model D using graph convolutional neural
networks; D translates a given input event log into a sound Petri net. We show
that training D on synthetically generated pairs of input logs and output
models allows D to translate previously unseen synthetic and several real-life
event logs into sound, arbitrarily structured models of comparable accuracy and
simplicity as existing state of the art techniques for discovering imperative
process models. We analyze the limitations of the proposed technique and
outline avenues for future work.
【5】 Shape-Biased Domain Generalization via Shock Graph Embeddings
Link: https://arxiv.org/abs/2109.05671
Authors: Maruthi Narayanan, Vickram Rajendran, Benjamin Kimia
Affiliations: Johns Hopkins University Applied Physics Laboratory, Laurel, MD; Brown University, School of Engineering
Note: Accepted to ICCV 2021
Abstract: There is an emerging sense that the vulnerability of Image Convolutional
Neural Networks (CNN), i.e., sensitivity to image corruptions, perturbations,
and adversarial attacks, is connected with Texture Bias. This relative lack of
Shape Bias is also responsible for poor performance in Domain Generalization
(DG). The inclusion of a role of shape alleviates these vulnerabilities and
some approaches have achieved this by training on negative images, images
endowed with edge maps, or images with conflicting shape and texture
information. This paper advocates an explicit and complete representation of
shape using a classical computer vision approach, namely, representing the
shape content of an image with the shock graph of its contour map. The
resulting graph and its descriptor is a complete representation of contour
content and is classified using recent Graph Neural Network (GNN) methods. The
experimental results on three domain shift datasets, Colored MNIST, PACS, and
VLCS demonstrate that even without using appearance the shape-based approach
exceeds classical Image CNN based methods in domain generalization.
【6】 Is Heterophily A Real Nightmare For Graph Neural Networks To Do Node Classification?
Link: https://arxiv.org/abs/2109.05641
Authors: Sitao Luan, Chenqing Hua, Qincheng Lu, Jiaqi Zhu, Mingde Zhao, Shuyuan Zhang, Xiao-Wen Chang, Doina Precup
Affiliations: McGill University; Mila; DeepMind
Abstract: Graph Neural Networks (GNNs) extend basic Neural Networks (NNs) by using the
graph structures based on the relational inductive bias (homophily assumption).
Though GNNs are believed to outperform NNs in real-world tasks, performance
advantages of GNNs over graph-agnostic NNs seem not generally satisfactory.
Heterophily has been considered as a main cause and numerous works have been
put forward to address it. In this paper, we first show that not all cases of
heterophily are harmful for GNNs with aggregation operation. Then, we propose
new metrics based on a similarity matrix which considers the influence of both
graph structure and input features on GNNs. The metrics demonstrate advantages
over the commonly used homophily metrics by tests on synthetic graphs. From the
metrics and the observations, we find some cases of harmful heterophily can be
addressed by diversification operation. With this fact and knowledge of
filterbanks, we propose the Adaptive Channel Mixing (ACM) framework to
adaptively exploit aggregation, diversification and identity channels in each
GNN layer to address harmful heterophily. We validate the ACM-augmented
baselines with 10 real-world node classification tasks. They consistently
achieve significant performance gain and exceed the state-of-the-art GNNs on
most of the tasks without incurring significant computational burden.
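A minimal PyTorch sketch of the three-channel idea, assuming a dense adjacency matrix: a low-pass aggregation channel, a high-pass diversification channel, and an identity channel, mixed per node by learned softmax weights. The exact filters and mixing in the paper may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ACMLayer(nn.Module):
    """Illustrative take on adaptive channel mixing for heterophilous graphs."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w_low = nn.Linear(in_dim, out_dim)
        self.w_high = nn.Linear(in_dim, out_dim)
        self.w_id = nn.Linear(in_dim, out_dim)
        self.mix = nn.Linear(3 * out_dim, 3)  # per-node channel weights

    def forward(self, x, adj):
        # adj: dense (N, N) adjacency; add self-loops and normalize symmetrically
        a = adj + torch.eye(adj.size(0))
        d = a.sum(1)
        a_hat = a / d.sqrt().outer(d.sqrt())         # D^{-1/2} A D^{-1/2}
        low = F.relu(self.w_low(a_hat @ x))          # aggregation channel
        high = F.relu(self.w_high(x - a_hat @ x))    # diversification channel
        iden = F.relu(self.w_id(x))                  # identity channel
        alpha = F.softmax(self.mix(torch.cat([low, high, iden], -1)), dim=-1)
        return alpha[:, :1] * low + alpha[:, 1:2] * high + alpha[:, 2:3] * iden

x, adj = torch.randn(6, 16), (torch.rand(6, 6) > 0.5).float()
adj = ((adj + adj.T) > 0).float()
print(ACMLayer(16, 8)(x, adj).shape)  # torch.Size([6, 8])
```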
【7】 CoG: a Two-View Co-training Framework for Defending Adversarial Attacks on Graph
Link: https://arxiv.org/abs/2109.05558
Authors: Xugang Wu, Huijun Wu, Xu Zhou, Kai Lu
Affiliations: National University of Defense Technology
Abstract: Graph neural networks exhibit remarkable performance in graph data analysis.
However, the robustness of GNN models remains a challenge. As a result, they
are not reliable enough to be deployed in critical applications. Recent studies
demonstrate that GNNs could be easily fooled with adversarial perturbations,
especially structural perturbations. Such vulnerability is attributed to the
excessive dependence on the structure information to make predictions. To
achieve better robustness, it is desirable to build the prediction of GNNs with
more comprehensive features. Graph data, in most cases, has two views of
information, namely structure information and feature information. In this
paper, we propose CoG, a simple yet effective co-training framework to combine
these two views for the purpose of robustness. CoG trains sub-models from the
feature view and the structure view independently and allows them to distill
knowledge from each other by adding their most confident unlabeled data into
the training set. The orthogonality of these two views diversifies the
sub-models, thus enhancing the robustness of their ensemble. We evaluate our
framework on three popular datasets, and results show that CoG significantly
improves the robustness of graph models against adversarial attacks without
sacrificing their performance on clean data. We also show that CoG still
achieves good robustness when both node features and graph structures are
perturbed.
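A generic two-view co-training loop in the spirit of CoG (a sketch, not the released implementation), assuming precomputed feature-view and structure-view node representations and scikit-learn classifiers as stand-ins for the sub-models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(x_feat, x_struct, y, labeled, rounds=5, k=10):
    """Each round, one model per view is fit on the labeled pool, and each
    model's most confident predictions on unlabeled nodes are added back as
    pseudo-labels, so the two views teach each other."""
    labeled = set(labeled)
    for _ in range(rounds):
        idx = sorted(labeled)
        m_feat = LogisticRegression(max_iter=1000).fit(x_feat[idx], y[idx])
        m_struct = LogisticRegression(max_iter=1000).fit(x_struct[idx], y[idx])
        unlabeled = [i for i in range(len(y)) if i not in labeled]
        if not unlabeled:
            break
        for model, x in ((m_feat, x_feat), (m_struct, x_struct)):
            proba = model.predict_proba(x[unlabeled])
            conf = proba.max(1)
            for j in np.argsort(-conf)[:k]:           # most confident first
                node = unlabeled[j]
                y[node] = proba[j].argmax()           # pseudo-label
                labeled.add(node)
    return m_feat, m_struct

rng = np.random.default_rng(0)
x_feat, x_struct = rng.normal(size=(100, 8)), rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)
co_train(x_feat, x_struct, y.copy(), labeled=range(20))
```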
【8】 DynSTGAT: Dynamic Spatial-Temporal Graph Attention Network for Traffic Signal Control
Link: https://arxiv.org/abs/2109.05491
Authors: Libing Wu, Min Wang, Dan Wu, Jia Wu
Affiliations: School of Computer Science, Wuhan University, Wuhan, China; State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, China; School of Computer Science, University of Windsor, Windsor, Canada
Note: In Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM'21)
Abstract: Adaptive traffic signal control plays a significant role in the construction
of smart cities. This task is challenging because of many essential factors,
such as cooperation among neighboring intersections and dynamic traffic
scenarios. First, to facilitate cooperation of traffic signals, existing work
adopts graph neural networks to incorporate the temporal and spatial influences
of the surrounding intersections into the target intersection, where
spatial-temporal information is used separately. However, one drawback of these
methods is that the spatial-temporal correlations are not adequately exploited
to obtain a better control scheme. Second, in a dynamic traffic environment,
the historical state of the intersection is also critical for predicting future
signal switching. Previous work mainly solves this problem using the current
intersection's state, neglecting the fact that traffic flow is continuously
changing both spatially and temporally and does not handle the historical
state.
In this paper, we propose a novel neural network framework named DynSTGAT,
which integrates dynamic historical state into a new spatial-temporal graph
attention network to address the above two problems. More specifically, our
DynSTGAT model employs a novel multi-head graph attention mechanism, which aims
to adequately exploit the joint relations of spatial-temporal information.
Then, to efficiently utilize the historical state information of the
intersection, we design a sequence model with the temporal convolutional
network (TCN) to capture the historical information and further merge it with
the spatial information to improve its performance. Extensive experiments
conducted in the multi-intersection scenario on synthetic data and real-world
data confirm that our method can achieve superior performance in travel time
and throughput against the state-of-the-art methods.
【9】 On the Fundamental Limits of Matrix Completion: Leveraging Hierarchical Similarity Graphs
Link: https://arxiv.org/abs/2109.05408
Authors: Junhyung Ahn, Adel Elmahdy, Soheil Mohajer, Changho Suh
Affiliations: Adel Elmahdy and Soheil Mohajer are with the Department of Electrical and Computer Engineering, University of Minnesota
Note: The first two authors contributed equally to this work. A preliminary version of this work was presented at the 2020 Advances in Neural Information Processing Systems Conference (NeurIPS 2020)
Abstract: We study the matrix completion problem that leverages hierarchical similarity
graphs as side information in the context of recommender systems. Under a
hierarchical stochastic block model that well respects practically-relevant
social graphs and a low-rank rating matrix model, we characterize the exact
information-theoretic limit on the number of observed matrix entries (i.e.,
optimal sample complexity) by proving sharp upper and lower bounds on the
sample complexity. In the achievability proof, we demonstrate that probability
of error of the maximum likelihood estimator vanishes for sufficiently large
number of users and items, if all sufficient conditions are satisfied. On the
other hand, the converse (impossibility) proof is based on the genie-aided
maximum likelihood estimator. Under each necessary condition, we present
examples of a genie-aided estimator to prove that the probability of error does
not vanish for sufficiently large number of users and items. One important
consequence of this result is that exploiting the hierarchical structure of
social graphs yields a substantial gain in sample complexity relative to the
one that simply identifies different groups without resorting to the relational
structure across them. More specifically, we analyze the optimal sample
complexity and identify different regimes whose characteristics rely on quality
metrics of side information of the hierarchical similarity graph. Finally, we
present simulation results to corroborate our theoretical findings and show
that the characterized information-theoretic limit can be asymptotically
achieved.
【10】 A Joint Graph and Image Convolution Network for Automatic Brain Tumor Segmentation
Link: https://arxiv.org/abs/2109.05580
Authors: Camillo Saueressig, Adam Berkley, Reshma Munbodh, Ritambhara Singh
Affiliations: Department of Computer Science, Brown University; Center for Computational Molecular Biology, Brown University; Department of Radiation Oncology, Brown Alpert Medical School
Note: 9 pages, 3 figures, submitted to BrainLes Workshop (MICCAI 2021) as part of BraTS2021 challenge
Abstract: We present a joint graph convolution-image convolution neural network as our
submission to the Brain Tumor Segmentation (BraTS) 2021 challenge. We model
each brain as a graph composed of distinct image regions, which is initially
segmented by a graph neural network (GNN). Subsequently, the tumorous volume
identified by the GNN is further refined by a simple (voxel) convolutional
neural network (CNN), which produces the final segmentation. This approach
captures both global brain feature interactions via the graphical
representation and local image details through the use of convolutional
filters. We find that the GNN component by itself can effectively identify and
segment the brain tumors. The addition of the CNN further improves the median
performance of the model by 2 percent across all metrics evaluated. On the
validation set, our joint GNN-CNN model achieves mean Dice scores of 0.89,
0.81, 0.73 and mean Hausdorff distances (95th percentile) of 6.8, 12.6, 28.2mm
on the whole tumor, core tumor, and enhancing tumor, respectively.
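For reference, the Dice coefficient behind the reported 0.89/0.81/0.73 scores can be computed as follows (a standard definition, independent of the paper's code).

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) over binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy 3D volumes standing in for whole-tumor masks.
pred = np.zeros((4, 4, 4)); pred[:2] = 1
target = np.zeros((4, 4, 4)); target[:3] = 1
print(dice_score(pred, target))  # 0.8
```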
【11】 Link Scheduling using Graph Neural Networks
Link: https://arxiv.org/abs/2109.05536
Authors: Zhongyuan Zhao, Gunjan Verma, Chirag Rao, Ananthram Swami, Santiago Segarra
Affiliations: Department of Electrical and Computer Engineering, Rice University
Note: 13 pages, 15 figures, submitted to IEEE Journal of Selected Topics in Signal Processing. arXiv admin note: text overlap with arXiv:2011.09430
Abstract: Efficient scheduling of transmissions is a key problem in wireless networks.
The main challenge stems from the fact that optimal link scheduling involves
solving a maximum weighted independent set (MWIS) problem, which is known to be
NP-hard. For practical link scheduling schemes, centralized and distributed
greedy heuristics are commonly used to approximate the solution to the MWIS
problem. However, these greedy schemes mostly ignore important topological
information of the wireless network. To overcome this limitation, we propose
fast heuristics based on graph convolutional networks (GCNs) that can be
implemented in centralized and distributed manners. Our centralized MWIS solver
is based on tree search guided by a trainable GCN module and 1-step rollout. In
our distributed MWIS solver, a trainable GCN module learns topology-aware node
embeddings that are combined with the network weights before calling a
distributed greedy solver. Test results on medium-sized wireless networks show
that a GCN-based centralized MWIS solver can reach a near-optimal solution
quickly. Moreover, we demonstrate that a shallow GCN-based distributed MWIS
scheduler can reduce by nearly half the suboptimality gap of the distributed
greedy solver with minimal increase in complexity. The proposed scheduling
solutions also exhibit good generalizability across graph and weight
distributions.
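A sketch of the kind of local greedy MWIS heuristic the paper's GCN module augments: a node enters the independent set when its weight beats all undecided neighbors'. In the paper, the weights would be the original link weights combined with topology-aware per-node scores from a trainable GCN; plain numbers are used here as a simplifying assumption.

```python
import numpy as np

def distributed_greedy_mwis(adj, weights):
    """Local greedy maximum-weighted-independent-set heuristic."""
    undecided = set(range(len(weights)))
    in_set = set()
    while undecided:
        chosen = {u for u in undecided
                  if all(weights[u] > weights[v] for v in adj[u] if v in undecided)}
        if not chosen:  # break weight ties deterministically
            chosen = {max(undecided, key=lambda u: (weights[u], -u))}
        in_set |= chosen
        undecided -= chosen | {v for u in chosen for v in adj[u]}
    return in_set

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
weights = np.array([4.0, 5.0, 3.0, 6.0, 1.0])
print(distributed_greedy_mwis(adj, weights))  # {1, 3, 4}
```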
【12】 Graph Attention Network Based Single-Pixel Compressive Direction of Arrival Estimation
Link: https://arxiv.org/abs/2109.05466
Authors: Kürşat Tekbıyık, Okan Yurduseven, Güneş Karabulut Kurt
Affiliations: Güneş Karabulut Kurt, Senior Member, IEEE
Note: 5 pages, 4 figures
Abstract: In this paper, we present a single-pixel compressive direction of arrival
(DoA) estimation technique leveraging a graph attention network (GAT) based
deep-learning framework. The physical layer compression is achieved using a
coded-aperture technique, probing the spectrum of far-field sources incident on
the aperture using a set of spatio-temporally incoherent modes. This
information is then encoded and compressed into the channel of the
coded-aperture. The coded-aperture based receiver exhibits a single-channel,
replacing the conventional multichannel raster scan based solutions for DoA
estimation. The GAT network enables the compressive DoA estimation framework to
learn the DoA information directly from the measurements acquired using the
coded-aperture. This step eliminates the need for an additional reconstruction
step and significantly simplifies the processing layer to obtain the DoA
estimate. We show that the presented GAT integrated single-pixel radar
framework can retrieve high fidelity DoA information even under relatively low
signal-to-noise ratio (SNR) levels.
Transformer (5 papers)
【1】 CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation
Link: https://arxiv.org/abs/2109.06165
Authors: Tongkun Xu, Weihua Chen, Pichao Wang, Fan Wang, Hao Li, Rong Jin
Affiliations: Alibaba Group; Shandong University
Abstract: Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from
a labeled source domain to a different unlabeled target domain. Most existing
UDA methods focus on learning domain-invariant feature representation, either
from the domain level or category level, using convolution neural networks
(CNNs)-based frameworks. One fundamental problem for the category level based
UDA is the production of pseudo labels for samples in target domain, which are
usually too noisy for accurate domain alignment, inevitably compromising the
UDA performance. With the success of Transformer in various tasks, we find that
the cross-attention in Transformer is robust to the noisy input pairs for
better feature alignment, thus in this paper Transformer is adopted for the
challenging UDA task. Specifically, to generate accurate input pairs, we design
a two-way center-aware labeling algorithm to produce pseudo labels for target
samples. Along with the pseudo labels, a weight-sharing triple-branch
transformer framework is proposed to apply self-attention and cross-attention
for source/target feature learning and source-target domain alignment,
respectively. Such design explicitly enforces the framework to learn
discriminative domain-specific and domain-invariant representations
simultaneously. The proposed method is dubbed CDTrans (cross-domain
transformer), and it provides one of the first attempts to solve UDA tasks with
a pure transformer solution. Extensive experiments show that our proposed
method achieves the best performance on Office-Home, VisDA-2017, and DomainNet
datasets.
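A simplified numpy sketch of center-aware pseudo-labeling built around the stated two-way consistency idea; the real CDTrans labeling algorithm operates on transformer features and differs in its details.

```python
import numpy as np

def center_aware_pseudo_labels(src_feat, src_y, tgt_feat, n_classes):
    """Class centers are computed from labeled source features, each target
    sample takes the label of its nearest center, and only samples that stay
    nearest to the same center recomputed from the target side are kept.
    This is an approximation of the paper's two-way scheme."""
    norm = lambda x: x / np.linalg.norm(x, axis=1, keepdims=True)
    src_feat, tgt_feat = norm(src_feat), norm(tgt_feat)

    src_centers = np.stack([src_feat[src_y == c].mean(0) for c in range(n_classes)])
    pseudo = (tgt_feat @ norm(src_centers).T).argmax(1)           # way 1

    tgt_centers = np.stack([tgt_feat[pseudo == c].mean(0) if (pseudo == c).any()
                            else src_centers[c] for c in range(n_classes)])
    pseudo2 = (tgt_feat @ norm(tgt_centers).T).argmax(1)          # way 2

    keep = pseudo == pseudo2                                      # consistent both ways
    return pseudo, keep

rng = np.random.default_rng(1)
src = rng.normal(size=(60, 16)); src_y = rng.integers(0, 3, 60)
tgt = rng.normal(size=(40, 16))
labels, keep = center_aware_pseudo_labels(src, src_y, tgt, 3)
print(labels[keep].shape)
```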
【2】 GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks
Link: https://arxiv.org/abs/2109.05748
Authors: Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi
Affiliations: Department of Computer Science, Dartmouth College; Department of Computer Science, Zhejiang University City College; Department of Computer Science and Technology, Tsinghua University
Note: In EMNLP 2021
Abstract: A key problem in multi-task learning (MTL) research is how to select
high-quality auxiliary tasks automatically. This paper presents GradTS, an
automatic auxiliary task selection method based on gradient calculation in
Transformer-based models. Compared to AUTOSEM, a strong baseline method, GradTS
improves the performance of MT-DNN with a bert-base-cased backend model, from
0.33% to 17.93% on 8 natural language understanding (NLU) tasks in the GLUE
benchmarks. GradTS is also time-saving since (1) its gradient calculations are
based on single-task experiments and (2) the gradients are re-used without
additional experiments when the candidate task set changes. On the 8 GLUE
classification tasks, for example, GradTS costs on average 21.32% less time
than AUTOSEM with comparable GPU consumption. Further, we show the robustness
of GradTS across various task settings and model selections, e.g. mixed
objectives among candidate tasks. The efficiency and efficacy of GradTS in
these case studies illustrate its general applicability in MTL research without
requiring manual task filtering or costly parameter tuning.
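A toy illustration of gradient-based task affinity, assuming per-task attention-head gradient tensors have already been collected from single-task runs; the aggregation and correlation choices below are illustrative, not the paper's exact recipe.

```python
import numpy as np

def head_importance(grads):
    """Aggregate absolute gradients per attention head into an
    importance vector (one entry per head)."""
    return np.abs(grads).mean(axis=tuple(range(1, grads.ndim)))

def rank_auxiliary_tasks(main_grads, candidate_grads):
    """Rank candidate tasks by correlation of their head-importance
    pattern with the main task's."""
    main_imp = head_importance(main_grads)
    scores = {}
    for name, g in candidate_grads.items():
        scores[name] = np.corrcoef(main_imp, head_importance(g))[0, 1]
    return sorted(scores.items(), key=lambda kv: -kv[1])

rng = np.random.default_rng(0)
main = rng.normal(size=(144, 64, 64))   # e.g. 12 layers x 12 heads, illustrative
cands = {"task_a": main + 0.1 * rng.normal(size=main.shape),  # similar task
         "task_b": rng.normal(size=main.shape)}               # unrelated task
print(rank_auxiliary_tasks(main, cands))  # task_a should rank first
```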
【3】 TEASEL: A Transformer-Based Speech-Prefixed Language Model
Link: https://arxiv.org/abs/2109.05522
Authors: Mehdi Arjmand, Mohammad Javad Dousti, Hadi Moradi
Affiliations: University of Tehran
Abstract: Multimodal language analysis is a burgeoning field of NLP that aims to
simultaneously model a speaker's words, acoustical annotations, and facial
expressions. In this area, lexicon features usually outperform other modalities
because they are pre-trained on large corpora via Transformer-based models.
Despite their strong performance, training a new self-supervised learning (SSL)
Transformer on any modality is not usually attainable due to insufficient data,
which is the case in multimodal language learning. This work proposes a
Transformer-Based Speech-Prefixed Language Model called TEASEL to approach the
mentioned constraints without training a complete Transformer model. TEASEL
model includes speech modality as a dynamic prefix besides the textual modality
compared to a conventional language model. This method exploits a conventional
pre-trained language model as a cross-modal Transformer model. We evaluated
TEASEL for the multimodal sentiment analysis task defined by CMU-MOSI dataset.
Extensive experiments show that our model outperforms unimodal baseline
language models by 4% and outperforms the current multimodal state-of-the-art
(SoTA) model by 1% in F1-score. Additionally, our proposed method is 72%
smaller than the SoTA model.
【4】 Single-Read Reconstruction for DNA Data Storage Using Transformers
Link: https://arxiv.org/abs/2109.05478
Authors: Yotam Nahum, Eyar Ben-Tolila, Leon Anavy
Affiliations: Data Science Institute, Reichman University (IDC Herzliya); School of Electrical and Computer Engineering, Ben-Gurion University of the Negev; Efi Arazi School of Computer Science, Reichman University (IDC Herzliya)
Note: 9 pages, 6 figures
Abstract: As the global need for large-scale data storage is rising exponentially,
existing storage technologies are approaching their theoretical and functional
limits in terms of density and energy consumption, making DNA based storage a
potential solution for the future of data storage. Several studies introduced
DNA based storage systems with high information density (petabytes/gram).
However, DNA synthesis and sequencing technologies yield erroneous outputs.
Algorithmic approaches for correcting these errors depend on reading multiple
copies of each sequence and result in excessive reading costs. The
unprecedented success of Transformers as a deep learning architecture for
language modeling has led to its repurposing for solving a variety of tasks
across various domains. In this work, we propose a novel approach for
single-read reconstruction using an encoder-decoder Transformer architecture
for DNA based data storage. We address the error correction process as a
self-supervised sequence-to-sequence task and use synthetic noise injection to
train the model using only the decoded reads. Our approach exploits the
inherent redundancy of each decoded file to learn its underlying structure. To
demonstrate our proposed approach, we encode text, image and code-script files
to DNA, produce errors with high-fidelity error simulator, and reconstruct the
original files from the noisy reads. Our model achieves lower error rates when
reconstructing the original data from a single read of each DNA strand compared
to state-of-the-art algorithms using 2-3 copies. This is the first
demonstration of using deep learning models for single-read reconstruction in
DNA based storage which allows for the reduction of the overall cost of the
process. We show that this approach is applicable for various domains and can
be generalized to new domains as well.
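A minimal sketch of the synthetic noise-injection step used to create (noisy read, clean sequence) training pairs; the error rates and the uniform error model are placeholders for the paper's high-fidelity simulator.

```python
import random

def inject_noise(seq, sub=0.01, ins=0.005, dele=0.005, rng=random.Random(0)):
    """Simulate the substitution/insertion/deletion errors of DNA synthesis
    and sequencing on a string over the alphabet ACGT."""
    bases = "ACGT"
    out = []
    for b in seq:
        r = rng.random()
        if r < dele:
            continue                              # deletion
        if r < dele + ins:
            out.append(rng.choice(bases))         # insertion before base
        if rng.random() < sub:
            b = rng.choice(bases.replace(b, ""))  # substitution
        out.append(b)
    return "".join(out)

clean = "ACGTACGTAGCTTAGCCGAT" * 5
noisy = inject_noise(clean, sub=0.05, ins=0.02, dele=0.02)
print(len(clean), len(noisy))
print(noisy[:40])
```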
【5】 FBERT: A Neural Transformer for Identifying Offensive Content
Link: https://arxiv.org/abs/2109.05074
Authors: Diptanu Sarkar, Marcos Zampieri, Tharindu Ranasinghe, Alexander Ororbia
Note: Accepted to EMNLP Findings
Abstract: Transformer-based models such as BERT, XLNET, and XLM-R have achieved
state-of-the-art performance across various NLP tasks including the
identification of offensive language and hate speech, an important problem in
social media. In this paper, we present fBERT, a BERT model retrained on SOLID,
the largest English offensive language identification corpus available with
over $1.4$ million offensive instances. We evaluate fBERT's performance on
identifying offensive content on multiple English datasets and we test several
thresholds for selecting instances from SOLID. The fBERT model will be made
freely available to the community.
GAN | adversarial | attacks | generation (16 papers)
【1】 The mathematics of adversarial attacks in AI -- Why deep learning is unstable despite the existence of stable neural networks
Link: https://arxiv.org/abs/2109.06098
Authors: Alexander Bastounis, Anders C Hansen, Verner Vlačić
Note: 29 pages, 1 figure
Abstract: The unprecedented success of deep learning (DL) makes it unchallenged when it
comes to classification problems. However, it is well established that the
current DL methodology produces universally unstable neural networks (NNs). The
instability problem has caused an enormous research effort -- with a vast
literature on so-called adversarial attacks -- yet there has been no solution
to the problem. Our paper addresses why there has been no solution to the
problem, as we prove the following mathematical paradox: any training procedure
based on training neural networks for classification problems with a fixed
architecture will yield neural networks that are either inaccurate or unstable
(if accurate) -- despite the provable existence of both accurate and stable
neural networks for the same classification problems. The key is that the
stable and accurate neural networks must have variable dimensions depending on
the input, in particular, variable dimensions is a necessary condition for
stability.
Our result points towards the paradox that accurate and stable neural
networks exist, however, modern algorithms do not compute them. This yields the
question: if the existence of neural networks with desirable properties can be
proven, can one also find algorithms that compute them? There are cases in
mathematics where provable existence implies computability, but will this be
the case for neural networks? The contrary is true, as we demonstrate how
neural networks can provably exist as approximate minimisers to standard
optimisation problems with standard cost functions, however, no randomised
algorithm can compute them with probability better than 1/2.
【2】 Adversarial Bone Length Attack on Action Recognition
Link: https://arxiv.org/abs/2109.05830
Authors: Nariki Tanaka, Hiroshi Kera, Kazuhiko Kawamoto
Affiliations: Chiba University
Note: 11 pages, 8 figures
Abstract: Skeleton-based action recognition models have recently been shown to be
vulnerable to adversarial attacks. Compared to adversarial attacks on images,
perturbations to skeletons are typically bounded to a lower dimension of
approximately 100 per frame. This lower-dimensional setting makes it more
difficult to generate imperceptible perturbations. Existing attacks resolve
this by exploiting the temporal structure of the skeleton motion so that the
perturbation dimension increases to thousands. In this paper, we show that
adversarial attacks can be performed on skeleton-based action recognition
models, even in a significantly low-dimensional setting without any temporal
manipulation. Specifically, we restrict the perturbations to the lengths of the
skeleton's bones, which allows an adversary to manipulate only approximately 30
effective dimensions. We conducted experiments on the NTU RGB+D and HDM05
datasets and demonstrate that the proposed attack successfully deceived models
with sometimes greater than 90\% success rate by small perturbations.
Furthermore, we discovered an interesting phenomenon: in our low-dimensional
setting, the adversarial training with the bone length attack shares a similar
property with data augmentation, and it not only improves the adversarial
robustness but also improves the classification accuracy on the original
data. This is an interesting counterexample of the trade-off between
adversarial robustness and clean accuracy, which has been widely observed in
studies on adversarial training in the high-dimensional regime.
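To see how small the attack surface is, here is a sketch of a bone-length perturbation on a toy kinematic tree; the adversarial search over eps (e.g., gradient-based maximization of the classifier loss) is omitted, and the skeleton is illustrative.

```python
import numpy as np

# A toy kinematic tree: parents[j] is the parent joint of j (-1 = root).
parents = np.array([-1, 0, 1, 2, 0, 4])

def apply_bone_length_perturbation(joints, eps):
    """Rescale each bone vector by (1 + eps_bone) and rebuild joint positions
    by walking the tree from the root. `eps` has one entry per non-root
    joint, the low-dimensional (~30 in the paper) search space of the attack;
    bone directions are preserved, only lengths change."""
    new = joints.copy()
    for j in range(len(parents)):
        p = parents[j]
        if p < 0:
            continue
        bone = joints[j] - joints[p]
        new[j] = new[p] + (1.0 + eps[j - 1]) * bone
    return new

joints = np.array([[0, 0, 0], [0, 1, 0], [0, 2, 0],
                   [0, 3, 0], [1, 0, 0], [2, 0, 0]], dtype=float)
eps = np.full(len(parents) - 1, 0.03)   # 3% length change per bone
print(apply_bone_length_perturbation(joints, eps))
```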
【3】 Improving Robustness of Adversarial Attacks Using an Affine-Invariant Gradient Estimator
Link: https://arxiv.org/abs/2109.05820
Authors: Wenzhao Xiang, Hang Su, Chang Liu, Yandong Guo, Shibao Zheng
Affiliations: Dept. of E.E., Institute of Image Communication and Networks Engineering, Shanghai Jiao Tong University, Shanghai, China; Dept. of Comp. Sci. and Tech., BNRist Center, Institute for AI, THBI Lab, Tsinghua University, Beijing, China
Abstract: Adversarial examples can deceive a deep neural network (DNN) by significantly
altering its response with imperceptible perturbations, which poses new
potential vulnerabilities as the growing ubiquity of DNNs. However, most of the
existing adversarial examples cannot maintain the malicious functionality if we
apply an affine transformation on the resultant examples, which is an important
measurement to the robustness of adversarial attacks for the practical risks.
To address this issue, we propose an affine-invariant adversarial attack which
can consistently construct adversarial examples robust over a distribution of
affine transformation. To further improve the efficiency, we propose to
disentangle the affine transformation into rotations, translations,
magnifications, and reformulate the transformation in polar space. Afterwards,
we construct an affine-invariant gradient estimator by convolving the gradient
at the original image with derived kernels, which can be integrated with any
gradient-based attack methods. Extensive experiments on the ImageNet
demonstrate that our method can consistently produce more robust adversarial
examples under significant affine transformations, and as a byproduct, improve
the transferability of adversarial examples compared with the alternative
state-of-the-art methods.
【4】 Randomized Substitution and Vote for Textual Adversarial Example Detection
Link: https://arxiv.org/abs/2109.05698
Authors: Xiaosen Wang, Yifeng Xiong, Kun He
Affiliations: School of Computer Science and Technology, Huazhong University of Science and Technology
Note: 8 pages
Abstract: A line of work has shown that natural text processing models are vulnerable
to adversarial examples. Correspondingly, various defense methods are proposed
to mitigate the threat of textual adversarial examples, e.g. adversarial
training, certified defense, input pre-processing, detection, etc. In this
work, we treat the optimization process for synonym substitution based textual
adversarial attacks as a specific sequence of word replacement, in which each
word mutually influences other words. We identify that we could destroy such
mutual interaction and eliminate the adversarial perturbation by randomly
substituting a word with its synonyms. Based on this observation, we propose a
novel textual adversarial example detection method, termed Randomized
Substitution and Vote (RS&V), which votes the prediction label by accumulating
the logits of k samples generated by randomly substituting the words in the
input text with synonyms. The proposed RS&V is generally applicable to any
existing neural networks without modification on the architecture or extra
training, and it is orthogonal to prior work on making the classification
network itself more robust. Empirical evaluations on three benchmark datasets
demonstrate that RS&V could detect the textual adversarial examples more
successfully than the existing detection methods while maintaining the high
classification accuracy on benign samples.
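A compact sketch of the RS&V voting step; the classifier and synonym table below are toy stand-ins for a real model and thesaurus.

```python
import random

def rs_and_v(classify, tokens, synonyms, k=20, rng=random.Random(0)):
    """Randomized Substitution and Vote: build k randomized copies of the
    input by swapping words for synonyms, accumulate the classifier's
    logits over the copies, and predict the argmax of the sum."""
    total = None
    for _ in range(k):
        sample = [rng.choice(synonyms.get(w, [w])) for w in tokens]
        logits = classify(sample)
        total = logits if total is None else [a + b for a, b in zip(total, logits)]
    return max(range(len(total)), key=lambda i: total[i])

# Toy sentiment "model": counts negative/positive cue words as logits.
POS, NEG = {"good", "great", "fine"}, {"bad", "awful", "poor"}
classify = lambda ws: [sum(w in NEG for w in ws), sum(w in POS for w in ws)]
synonyms = {"good": ["good", "great", "fine"], "movie": ["movie", "film"]}
print(rs_and_v(classify, ["a", "good", "movie"], synonyms))  # 1 (positive)
```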
【5】 Source Inference Attacks in Federated Learning
Link: https://arxiv.org/abs/2109.05659
Authors: Hongsheng Hu, Zoran Salcic, Lichao Sun, Gillian Dobbie, Xuyun Zhang
Affiliations: University of Auckland, New Zealand; Lehigh University, USA; Macquarie University, Australia
Note: This paper has been accepted by ICDM 2021
Abstract: Federated learning (FL) has emerged as a promising privacy-aware paradigm
that allows multiple clients to jointly train a model without sharing their
private data. Recently, many studies have shown that FL is vulnerable to
membership inference attacks (MIAs) that can distinguish the training members
of the given model from the non-members. However, existing MIAs ignore the
source of a training member, i.e., the information of which client owns the
training member, while it is essential to explore source privacy in FL beyond
membership privacy of examples from all clients. The leakage of source
information can lead to severe privacy issues. For example, identification of
the hospital contributing to the training of an FL model for COVID-19 pandemic
can render the owner of a data record from this hospital more prone to
discrimination if the hospital is in a high risk region. In this paper, we
propose a new inference attack called source inference attack (SIA), which can
derive an optimal estimation of the source of a training member. Specifically,
we innovatively adopt the Bayesian perspective to demonstrate that an
honest-but-curious server can launch an SIA to steal non-trivial source
information of the training members without violating the FL protocol. The
server leverages the prediction loss of local models on the training members to
achieve the attack effectively and non-intrusively. We conduct extensive
experiments on one synthetic and five real datasets to evaluate the key factors
in an SIA, and the results show the efficacy of the proposed source inference
attack.
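The loss-based decision rule can be sketched in a few lines; the Bayesian analysis in the paper is what justifies taking the lowest-loss client as the source estimate.

```python
import numpy as np

def source_inference(local_losses):
    """For each training record, the honest-but-curious server compares every
    client's local-model loss on that record and names the lowest-loss client
    as the likely source (clients tend to fit their own data best).
    Rows are records, columns are clients."""
    return np.argmin(local_losses, axis=1)

# Toy losses: record i was contributed by client i % 3, whose model fits it best.
rng = np.random.default_rng(0)
losses = rng.uniform(1.0, 2.0, size=(6, 3))
for i in range(6):
    losses[i, i % 3] = 0.2          # owner's model has low loss
print(source_inference(losses))     # [0 1 2 0 1 2]
```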
【6】 Generating Datasets of 3D Garments with Sewing Patterns
Link: https://arxiv.org/abs/2109.05633
Authors: Maria Korosteleva, Sung-Hee Lee
Affiliations: Graduate School of Culture Technology, KAIST
Note: To appear in NeurIPS 2021 Datasets and Benchmarks Track
Abstract: Garments are ubiquitous in both real and many of the virtual worlds. They are
highly deformable objects, exhibit an immense variety of designs and shapes,
and yet, most garments are created from a set of regularly shaped flat pieces.
Exploration of garment structure presents a peculiar case for an object
structure estimation task and might prove useful for downstream tasks of neural
3D garment modeling and reconstruction by providing strong prior on garment
shapes. To facilitate research in these directions, we propose a method for
generating large synthetic datasets of 3D garment designs and their sewing
patterns. Our method consists of a flexible description structure for
specifying parametric sewing pattern templates and the automatic generation
pipeline to produce garment 3D models with little-to-none manual intervention.
To add realism, the pipeline additionally creates corrupted versions of the
final meshes that imitate artifacts of 3D scanning.
With this pipeline, we created the first large-scale synthetic dataset of 3D
garment models with their sewing patterns. The dataset contains more than 20000
garment design variations produced from 19 different base types. Seven of these
garment types are specifically designed to target evaluation of the
generalization across garment sewing pattern topologies.
【7】 Adversarial Representation Learning With Closed-Form Solvers
Link: https://arxiv.org/abs/2109.05535
Authors: Bashir Sadeghi, Lan Wang, Vishnu Naresh Boddeti
Affiliations: Michigan State University, East Lansing, MI, USA
Abstract: Adversarial representation learning aims to learn data representations for a
target task while removing unwanted sensitive information at the same time.
Existing methods learn model parameters iteratively through stochastic gradient
descent-ascent, which is often unstable and unreliable in practice. To overcome
this challenge, we adopt closed-form solvers for the adversary and target task.
We model them as kernel ridge regressors and analytically determine an
upper-bound on the optimal dimensionality of representation. Our solution,
dubbed OptNet-ARL, reduces to a stable one-shot optimization problem that
can be solved reliably and efficiently. OptNet-ARL can be easily generalized to
the case of multiple target tasks and sensitive attributes. Numerical
experiments, on both small and large scale datasets, show that, from an
optimization perspective, OptNet-ARL is stable and exhibits three to five times
faster convergence. Performance wise, when the target and sensitive attributes
are dependent, OptNet-ARL learns representations that offer a better trade-off
front between (a) utility and bias for fair classification and (b) utility and
privacy by mitigating leakage of private information than existing solutions.
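For the closed-form component, a kernel ridge regressor can be solved in one shot instead of by gradient descent-ascent; a generic sketch follows (the kernel and hyperparameters are arbitrary, not tied to the paper's setup).

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def kernel_ridge_fit_predict(x_train, y_train, x_test, lam=1e-2):
    """Closed-form kernel ridge regression: alpha = (K + lam*I)^{-1} y,
    prediction = K(test, train) @ alpha."""
    k = rbf_kernel(x_train, x_train)
    alpha = np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)
    return rbf_kernel(x_test, x_train) @ alpha

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=50)
print(kernel_ridge_fit_predict(x, y, x[:5]))
```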
【8】 Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain
Link: https://arxiv.org/abs/2109.05507
Authors: Hasan Abed Al Kader Hammoud, Bernard Ghanem
Affiliations: King Abdullah University of Science and Technology (KAUST)
Abstract: Deep Neural Networks (DNNs) have been utilized in various applications
ranging from image classification and facial recognition to medical imagery
analysis and real-time object detection. As our models become more
sophisticated and complex, the computational cost of training such models
becomes a burden for small companies and individuals; for this reason,
outsourcing the training process has been the go-to option for such users.
Unfortunately, outsourcing the training process comes at the cost of
vulnerability to backdoor attacks. These attacks aim at establishing hidden
backdoors in the DNN such that the model performs well on benign samples but
outputs a particular target label when a trigger is applied to the input.
Current backdoor attacks rely on generating triggers in the image/pixel domain;
however, as we show in this paper, it is not the only domain to exploit and one
should always "check the other doors". In this work, we propose a complete
pipeline for generating a dynamic, efficient, and invisible backdoor attack in
the frequency domain. We show the advantages of utilizing the frequency domain
for establishing undetectable and powerful backdoor attacks through extensive
experiments on various datasets and network architectures. The backdoored
models are shown to break various state-of-the-art defences. We also show two
possible defences that succeed against frequency-based backdoor attacks and
possible ways for the attacker to bypass them. We conclude the work with some
remarks regarding a network's learning capacity and the capability of embedding
a backdoor attack in the model.
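A minimal sketch of placing a trigger in the frequency domain rather than in pixels; the chosen coefficient and amplitude are illustrative, and the paper's trigger generation is considerably more sophisticated.

```python
import numpy as np

def add_frequency_trigger(img, freq=(12, 12), amp=8.0):
    """Embed a trigger by shifting one FFT coefficient (and its conjugate
    mirror, so the result stays real) and inverting the transform; the
    per-pixel change is spread out and hard to see."""
    f = np.fft.fft2(img)
    u, v = freq
    f[u, v] += amp * img.size / 64.0
    f[-u, -v] += amp * img.size / 64.0      # conjugate-symmetric partner
    out = np.real(np.fft.ifft2(f))
    return np.clip(out, 0, 255)

img = np.random.uniform(0, 255, size=(32, 32))
poisoned = add_frequency_trigger(img)
print(np.abs(poisoned - img).max())   # small per-pixel change
```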
【9】 COSMic: A Coherence-Aware Generation Metric for Image Descriptions
Link: https://arxiv.org/abs/2109.05281
Authors: Mert İnan, Piyush Sharma, Baber Khalid, Radu Soricut, Matthew Stone, Malihe Alikhani
Affiliations: Google Research; Rutgers University; University of Pittsburgh
Note: 12 pages, 4 figures, Findings of the Association for Computational Linguistics: EMNLP 2021
Abstract: Developers of text generation models rely on automated evaluation metrics as
a stand-in for slow and expensive manual evaluations. However, image captioning
metrics have struggled to give accurate learned estimates of the semantic and
pragmatic success of output text. We address this weakness by introducing the
first discourse-aware learned generation metric for evaluating image
descriptions. Our approach is inspired by computational theories of discourse
for capturing information goals using coherence. We present a dataset of
image-description pairs annotated with coherence relations. We
then train a coherence-aware metric on a subset of the Conceptual Captions
dataset and measure its effectiveness (its ability to predict
human ratings of output captions) on a test set composed of
out-of-domain images. We demonstrate a higher Kendall Correlation Coefficient
for our proposed metric with the human judgments for the results of a number of
state-of-the-art coherence-aware caption generation models when compared to
several other metrics including recently proposed learned metrics such as
BLEURT and BERTScore.
【10】 2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency
Link: https://arxiv.org/abs/2109.05223
Authors: Yonggan Fu, Yang Zhao, Qixuan Yu, Chaojian Li, Yingyan Lin
Affiliations: Rice University
Note: Accepted at MICRO 2021
Abstract: The recent breakthroughs of deep neural networks (DNNs) and the advent of
billions of Internet of Things (IoT) devices have excited an explosive demand
for intelligent IoT devices equipped with domain-specific DNN accelerators.
However, the deployment of DNN accelerator enabled intelligent functionality
into real-world IoT devices still remains particularly challenging. First,
powerful DNNs often come at prohibitive complexities, whereas IoT devices often
suffer from stringent resource constraints. Second, while DNNs are vulnerable
to adversarial attacks especially on IoT devices exposed to complex real-world
environments, many IoT applications require strict security. Existing DNN
accelerators mostly tackle only one of the two aforementioned challenges (i.e.,
efficiency or adversarial robustness) while neglecting or even sacrificing the
other. To this end, we propose a 2-in-1 Accelerator, an integrated
algorithm-accelerator co-design framework aiming at winning both the
adversarial robustness and efficiency of DNN accelerators. Specifically, we
first propose a Random Precision Switch (RPS) algorithm that can effectively
defend DNNs against adversarial attacks by enabling random DNN quantization as
an in-situ model switch. Furthermore, we propose a new precision-scalable
accelerator featuring (1) a new precision-scalable MAC unit architecture which
spatially tiles the temporal MAC units to boost both the achievable efficiency
and flexibility and (2) a systematically optimized dataflow that is searched by
our generic accelerator optimizer. Extensive experiments and ablation studies
validate that our 2-in-1 Accelerator can not only aggressively boost both the
adversarial robustness and efficiency of DNN accelerators under various
attacks, but also naturally support instantaneous robustness-efficiency
trade-offs adapting to varied resources without the necessity of DNN
retraining.
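A toy sketch of the Random Precision Switch idea on a plain numpy network: each forward pass draws a random bit-width per layer, so the quantized model an attacker probes keeps changing. The quantizer and network are stand-ins; the precision-scalable MAC hardware that makes this cheap is not modeled.

```python
import random
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization of a weight matrix to `bits` bits."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def rps_forward(x, weights, bit_choices=(4, 6, 8), rng=random.Random(0)):
    """Forward pass of a toy multi-layer ReLU network with an in-situ
    model switch: a fresh random precision per layer per call."""
    h = x
    for w in weights:
        bits = rng.choice(bit_choices)
        h = np.maximum(h @ quantize(w, bits), 0.0)
    return h

rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 16)) for _ in range(3)]
x = rng.normal(size=(1, 16))
print(rps_forward(x, weights)[0, :4])
```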
【11】 Conditional Generation of Synthetic Geospatial Images from Pixel-level and Feature-level Inputs
Link: https://arxiv.org/abs/2109.05201
Authors: Xuerong Xiao, Swetava Ganguli, Vipul Pandey
Affiliations: Stanford University; Apple
Note: Extended abstract accepted for presentation at BayLearn 2021. 3 pages, 2 figures
Abstract: Training robust supervised deep learning models for many geospatial
applications of computer vision is difficult due to dearth of class-balanced
and diverse training data. Conversely, obtaining enough training data for many
applications is financially prohibitive or may be infeasible, especially when
the application involves modeling rare or extreme events. Synthetically
generating data (and labels) using a generative model that can sample from a
target distribution and exploit the multi-scale nature of images can be an
inexpensive solution to address scarcity of labeled data. Towards this goal, we
present a deep conditional generative model, called VAE-Info-cGAN, that
combines a Variational Autoencoder (VAE) with a conditional Information
Maximizing Generative Adversarial Network (InfoGAN), for synthesizing
semantically rich images simultaneously conditioned on a pixel-level condition
(PLC) and a macroscopic feature-level condition (FLC). Dimensionally, the PLC
can only vary in the channel dimension from the synthesized image and is meant
to be a task-specific input. The FLC is modeled as an attribute vector in the
latent space of the generated image which controls the contributions of various
characteristic attributes germane to the target distribution. Experiments on a
GPS trajectories dataset show that the proposed model can accurately generate
various forms of spatiotemporal aggregates across different geographic
locations while conditioned only on a raster representation of the road
network. The primary intended application of the VAE-Info-cGAN is synthetic
data (and label) generation for targeted data augmentation for computer
vision-based modeling of problems relevant to geospatial analysis and remote
sensing.
【12】 HypoGen: Hyperbole Generation with Commonsense and Counterfactual Knowledge
标题:HypoGen:基于常识与反事实知识的夸张生成
链接:https://arxiv.org/abs/2109.05097
作者:Yufei Tian,Arvind krishna Sridhar,Nanyun Peng
机构:Computer Science Department, University of California, Los Angeles
备注:Accepted at Findings of EMNLP21
摘要:夸张是一种有意的、创造性的夸大,不能按字面理解。尽管它在日常生活中无处不在,但对夸张的计算研究却很少。在本文中,我们研究了这一探索不足且具有挑战性的任务:句子级夸张生成。我们从一个表示强化(intensification)的代表性句法模式开始,系统地研究这类夸张句中各成分之间的语义(常识和反事实)关系。接下来,我们利用COMeT和reverse COMeT模型进行常识和反事实推理。然后,我们根据从该模式中得到的发现生成多个夸张候选句,并训练神经分类器对高质量的夸张句进行排序和选择。自动和人工评估表明,我们的生成方法能够创造性地生成夸张句,并具有较高的成功率和强度得分。
摘要:A hyperbole is an intentional and creative exaggeration not to be taken
literally. Despite its ubiquity in daily life, the computational explorations
of hyperboles are scarce. In this paper, we tackle the under-explored and
challenging task: sentence-level hyperbole generation. We start with a
representative syntactic pattern for intensification and systematically study
the semantic (commonsense and counterfactual) relationships between each
component in such hyperboles. Next, we leverage the COMeT and reverse COMeT
models to do commonsense and counterfactual inference. We then generate
multiple hyperbole candidates based on our findings from the pattern, and train
neural classifiers to rank and select high-quality hyperboles. Automatic and
human evaluations show that our generation method is able to generate
hyperboles creatively with high success rate and intensity scores.
【13】 Stochastic Adversarial Koopman Model for Dynamical Systems
标题:动力系统的随机对抗库普曼模型
链接:https://arxiv.org/abs/2109.05095
作者:Kaushik Balakrishnan,Devesh Upadhyay
机构:Ford Greenfield Labs, Palo Alto, CA, Ford Research, Dearborn, MI
备注:20 pages, 10 figures
摘要:动力系统无处不在,通常使用非线性控制方程组来建模。许多动力系统的数值求解过程已经存在了几十年,但由于动力系统的状态空间维数很高,求解可能很慢。因此,基于深度学习的降阶模型(ROM)备受关注,其中一类算法基于Koopman理论。本文将最近提出的对抗性Koopman模型(Balakrishnan \& Upadhyay, arXiv:2006.05547)扩展到随机空间,其中Koopman算子作用于编码器潜编码的概率分布。具体而言,系统的潜编码被建模为高斯分布,并通过一个输出两个Koopman矩阵$K_{\mu}$和$K_{\sigma}$的辅助神经网络随时间向前推进。模型使用了对抗损失和梯度损失,并发现这可以降低预测误差。我们还采用了一种简化的Koopman公式,假设Koopman矩阵具有三对角结构,其预测与使用完整Koopman矩阵的基线模型相当。随机Koopman模型的有效性在混沌、流体动力学、燃烧和反应扩散模型等不同测试问题上得到了验证。该模型还被应用于Koopman矩阵以其他输入参数为条件以实现泛化的场景,并用于模拟锂离子电池状态随时间的演化。本研究中讨论的Koopman模型对于所考虑的广泛问题非常有前景。
摘要:Dynamical systems are ubiquitous and are often modeled using a non-linear
system of governing equations. Numerical solution procedures for many dynamical
systems have existed for several decades, but can be slow due to
high-dimensional state space of the dynamical system. Thus, deep learning-based
reduced order models (ROMs) are of interest and one such family of algorithms
along these lines are based on the Koopman theory. This paper extends a
recently developed adversarial Koopman model (Balakrishnan \& Upadhyay,
arXiv:2006.05547) to stochastic space, where the Koopman operator applies on
the probability distribution of the latent encoding of an encoder.
Specifically, the latent encoding of the system is modeled as a Gaussian, and
is advanced in time by using an auxiliary neural network that outputs two
Koopman matrices $K_{\mu}$ and $K_{\sigma}$. Adversarial and gradient losses
are used and this is found to lower the prediction errors. A reduced Koopman
formulation is also undertaken where the Koopman matrices are assumed to have a
tridiagonal structure, and this yields predictions comparable to the baseline
model with full Koopman matrices. The efficacy of the stochastic Koopman model
is demonstrated on different test problems in chaos, fluid dynamics,
combustion, and reaction-diffusion models. The proposed model is also applied
in a setting where the Koopman matrices are conditioned on other input
parameters for generalization and this is applied to simulate the state of a
Lithium-ion battery in time. The Koopman models discussed in this study are
very promising for the wide range of problems considered.
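下面用一个示意性的Python草图说明摘要中“高斯潜编码的均值与方差分别由$K_{\mu}$、$K_{\sigma}$随时间推进”这一机制(论文中两矩阵由辅助神经网络输出,这里用固定矩阵代替;所有名称均为示例):

import numpy as np

def koopman_step(mu, sigma, K_mu, K_sigma):
    # 在潜空间中线性推进高斯分布一步
    return K_mu @ mu, K_sigma @ sigma

def rollout(mu0, sigma0, K_mu, K_sigma, steps, rng):
    # 多步推进,并按 z ~ N(mu, sigma^2) 采样潜编码轨迹
    mu, sigma, traj = mu0, sigma0, []
    for _ in range(steps):
        mu, sigma = koopman_step(mu, sigma, K_mu, K_sigma)
        traj.append(mu + np.abs(sigma) * rng.normal(size=mu.shape))
    return np.stack(traj)

d = 8
K_mu, K_sigma = 0.95 * np.eye(d), 0.9 * np.eye(d)   # 假设的稳定线性算子
z = rollout(np.ones(d), 0.1 * np.ones(d), K_mu, K_sigma, steps=5,
            rng=np.random.default_rng(0))
print(z.shape)   # (5, 8)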
【14】 Data Generation Method for Learning a Low-dimensional Safe Region in Safe Reinforcement Learning
标题:安全强化学习中低维安全区域学习的数据生成方法
链接:https://arxiv.org/abs/2109.05077
作者:Zhehua Zhou,Ozgur S. Oguz,Yi Ren,Marion Leibold,Martin Buss
摘要:安全强化学习旨在学习控制策略,同时确保在学习过程中不会损坏系统或环境。为了在高度非线性和高维动态系统上实现安全强化学习,一种可能的方法是通过数据驱动的特征提取方法找到低维安全区域,为学习算法提供安全性估计。由于学习到的安全评估的可靠性依赖于数据,我们在这项工作中研究了不同的训练数据将如何影响安全强化学习方法。通过平衡学习性能和不安全风险,提出了一种结合两种抽样方法的数据生成方法来生成具有代表性的训练数据。以三连杆倒立摆为例,验证了该方法的性能。
摘要:Safe reinforcement learning aims to learn a control policy while ensuring
that neither the system nor the environment gets damaged during the learning
process. For implementing safe reinforcement learning on highly nonlinear and
high-dimensional dynamical systems, one possible approach is to find a
low-dimensional safe region via data-driven feature extraction methods, which
provides safety estimates to the learning algorithm. As the reliability of the
learned safety estimates is data-dependent, we investigate in this work how
different training data will affect the safe reinforcement learning approach.
By balancing between the learning performance and the risk of being unsafe, a
data generation method that combines two sampling methods is proposed to
generate representative training data. The performance of the method is
demonstrated with a three-link inverted pendulum example.
【15】 Instance-Conditioned GAN
标题:实例条件GAN
链接:https://arxiv.org/abs/2109.05070
作者:Arantxa Casanova,Marlène Careil,Jakob Verbeek,Michal Drozdzal,Adriana Romero-Soriano
机构:Facebook AI Research, École Polytechnique de Montréal, Mila, Quebec AI Institute, Télécom Paris, Michał Dro˙zd˙zal∗, McGill University
摘要:生成对抗网络(GAN)可以在人脸等狭窄领域生成接近照片级真实感的图像。然而,在无条件设置下,对ImageNet和COCO-Stuff等数据集的复杂分布进行建模仍然具有挑战性。在本文中,我们从核密度估计技术中获得启发,引入了一种非参数方法来建模复杂数据集的分布。我们将数据流形划分为由数据点及其最近邻描述的相互重叠的邻域的混合,并引入称为实例条件GAN(IC-GAN)的模型,该模型学习每个数据点周围的分布。在ImageNet和COCO-Stuff上的实验结果表明,IC-GAN显著优于无条件模型和无监督数据划分基线。此外,我们还表明,只需改变条件实例,IC-GAN就可以轻松迁移到训练中未见过的数据集,并且仍然可以生成逼真的图像。最后,我们将IC-GAN扩展到类条件情形,在ImageNet上展示了语义可控的生成和有竞争力的定量结果,同时在ImageNet-LT上优于BigGAN。我们将开源代码和训练好的模型,以复现报告的结果。
摘要:Generative Adversarial Networks (GANs) can generate near photo realistic
images in narrow domains such as human faces. Yet, modeling complex
distributions of datasets such as ImageNet and COCO-Stuff remains challenging
in unconditional settings. In this paper, we take inspiration from kernel
density estimation techniques and introduce a non-parametric approach to
modeling distributions of complex datasets. We partition the data manifold into
a mixture of overlapping neighborhoods described by a datapoint and its nearest
neighbors, and introduce a model, called instance-conditioned GAN (IC-GAN),
which learns the distribution around each datapoint. Experimental results on
ImageNet and COCO-Stuff show that IC-GAN significantly improves over
unconditional models and unsupervised data partitioning baselines. Moreover, we
show that IC-GAN can effortlessly transfer to datasets not seen during training
by simply changing the conditioning instances, and still generate realistic
images. Finally, we extend IC-GAN to the class-conditional case and show
semantically controllable generation and competitive quantitative results on
ImageNet; while improving over BigGAN on ImageNet-LT. We will opensource our
code and trained models to reproduce the reported results.
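IC-GAN“以实例特征为条件、建模其邻域分布”的采样方式可以用如下示意性Python草图理解(generator为占位函数,build_neighborhoods、ic_gan_sample等名称均为假设,真实系统需要训练好的条件生成器):

import numpy as np

def build_neighborhoods(features, k=5):
    # 用余弦相似度为每个数据点找出k近邻,形成相互重叠的邻域划分
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    return np.argsort(-(f @ f.T), axis=1)[:, 1:k + 1]

def ic_gan_sample(generator, features, rng, z_dim=32):
    # 随机选一个实例特征 h_i 作为条件:G(z, h_i) 建模 h_i 邻域附近的分布
    i = int(rng.integers(len(features)))
    return generator(rng.normal(size=z_dim), features[i]), i

generator = lambda z, h: np.tanh(z + h)   # 占位生成器,仅为演示接口
feats = np.random.default_rng(0).normal(size=(100, 32))
neighborhoods = build_neighborhoods(feats)   # 训练时正样本取自所选实例的邻域
x_fake, idx = ic_gan_sample(generator, feats, np.random.default_rng(1))
print(x_fake.shape, neighborhoods[idx][:3])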
【16】 Inferential Wasserstein Generative Adversarial Networks
标题:推理式Wasserstein生成对抗网络
链接:https://arxiv.org/abs/2109.05652
作者:Yao Chen,Qingyi Gao,Xiao Wang
机构:Department of Statistics, Purdue University
摘要:生成对抗网络(GAN)在许多问题和应用中都很有影响力,但训练不稳定。Wasserstein GAN(WGAN)利用Wasserstein距离来避免GAN的minmax双人博弈训练中的弊端,但也存在其他缺陷,如模式崩溃和缺乏检测收敛的度量。我们介绍了一种新的推理式Wasserstein GAN(iWGAN)模型,这是一个融合自编码器和WGAN的原则性框架。iWGAN模型受迭代原始-对偶优化过程的启发,联合学习编码器网络和生成器网络。编码器网络将观测样本映射到潜空间,生成器网络将样本从潜空间映射到数据空间。我们建立了iWGAN的泛化误差界,从理论上证明了其性能。我们进一步在极大似然估计的框架下对我们的模型给出了严格的概率解释。iWGAN具有明确的停止准则,与其他自编码器GAN相比具有许多优势。实验表明,iWGAN极大地缓解了模式崩溃的症状,加快了收敛速度,并且能够为每个样本提供质量检查的度量。我们通过在基准数据集上获得有竞争力且稳定的性能来说明iWGAN的能力。
摘要:Generative Adversarial Networks (GANs) have been impactful on many problems
and applications but suffer from unstable training. The Wasserstein GAN (WGAN)
leverages the Wasserstein distance to avoid the caveats in the minmax
two-player training of GANs but has other defects such as mode collapse and
lack of metric to detect the convergence. We introduce a novel inferential
Wasserstein GAN (iWGAN) model, which is a principled framework to fuse
auto-encoders and WGANs. The iWGAN model jointly learns an encoder network and
a generator network motivated by the iterative primal dual optimization
process. The encoder network maps the observed samples to the latent space and
the generator network maps the samples from the latent space to the data space.
We establish the generalization error bound of the iWGAN to theoretically
justify its performance. We further provide a rigorous probabilistic
interpretation of our model under the framework of maximum likelihood
estimation. The iWGAN, with a clear stopping criteria, has many advantages over
other autoencoder GANs. The empirical experiments show that the iWGAN greatly
mitigates the symptom of mode collapse, speeds up the convergence, and is able
to provide a measurement of quality check for each individual sample. We
illustrate the ability of the iWGAN by obtaining competitive and stable
performances for benchmark datasets.
半/弱/无/有监督|不确定性|主动学习(8篇)
【1】 Towards Stochastic Fault-tolerant Control using Precision Learning and Active Inference
标题:基于精确学习和主动推理的随机容错控制
链接:https://arxiv.org/abs/2109.05870
作者:Mohamed Baioumy,Corrado Pezzato,Carlos Hernandez Corbato,Nick Hawes,Riccardo Ferrari
机构: Oxford Robotics Institute, University of Oxford, Cognitive Robotics, Delft University of Technology, Delft Center for Systems and Control, Delft University of Technology
备注:Presented at the International Workshop on Active Inference (IWAI) 2021; 11 pages, 3 figures
摘要:本文提出了一种基于主动推理的、针对机器人机械臂感知故障的容错控制方案。在大多数现有方案中,传感器健康(正常工作)还是故障的二元判定是基于测量数据做出的。判定边界称为阈值,通常是确定性的。在判定传感器故障后,通过排除故障传感器来实现故障恢复。我们提出了一种基于主动推理和精度学习的随机容错方案,该方案不需要先验的阈值定义来触发故障恢复。相反,代表传感器健康状态的传感器精度以无模型的方式在线学习,允许系统逐渐而非突然地排除故障单元。在机械臂上的实验显示了有希望的结果,并讨论了未来工作的方向。
摘要:This work presents a fault-tolerant control scheme for sensory faults in
robotic manipulators based on active inference. In the majority of existing
schemes, a binary decision of whether a sensor is healthy (functional) or
faulty is made based on measured data. The decision boundary is called a
threshold and it is usually deterministic. Following a faulty decision, fault
recovery is obtained by excluding the malfunctioning sensor. We propose a
stochastic fault-tolerant scheme based on active inference and precision
learning which does not require a priori threshold definitions to trigger fault
recovery. Instead, the sensor precision, which represents its health status, is
learned online in a model-free way allowing the system to gradually, and not
abruptly exclude a failing unit. Experiments on a robotic manipulator show
promising results and directions for future work are discussed.
【2】 Adversarially Trained Object Detector for Unsupervised Domain Adaptation
标题:用于无监督领域自适应的对抗性训练目标检测器
链接:https://arxiv.org/abs/2109.05751
作者:Kazuma Fujii,Hiroshi Kera,Kazuhiko Kawamoto
机构: Graduate School of Science and Engineering, Chiba University, Graduate School of Engineering, Chiba University
备注:9 pages, 4 figures
摘要:无监督域自适应涉及将知识从标签丰富的源域迁移到未标记的目标域,可用于大幅降低目标检测领域中的标注成本。在这项研究中,我们证明了源域中的对抗训练可以作为一种新的无监督域自适应方法。具体而言,我们确认,经过对抗训练的检测器在显著偏离源域的目标域中实现了更好的检测性能。这一现象归因于这样一个事实:经过对抗训练的检测器可以提取与人类感知一致且值得跨域迁移的鲁棒特征,同时丢弃特定于域的非鲁棒特征。此外,我们提出了一种结合对抗训练和特征对齐的方法,以确保鲁棒特征与目标域更好地对齐。我们在四个基准数据集上进行了实验,并证实了我们提出的方法在从真实图像到艺术图像的大域偏移上的有效性。与基线模型相比,经过对抗训练的检测器将平均精度均值(mAP)提高了最多7.7%,在加入特征对齐后进一步提高了最多11.8%。
摘要:Unsupervised domain adaptation, which involves transferring knowledge from a
label-rich source domain to an unlabeled target domain, can be used to
substantially reduce annotation costs in the field of object detection. In this
study, we demonstrate that adversarial training in the source domain can be
employed as a new approach for unsupervised domain adaptation. Specifically, we
establish that adversarially trained detectors achieve improved detection
performance in target domains that are significantly shifted from source
domains. This phenomenon is attributed to the fact that adversarially trained
detectors can be used to extract robust features that are in alignment with
human perception and worth transferring across domains while discarding
domain-specific non-robust features. In addition, we propose a method that
combines adversarial training and feature alignment to ensure the improved
alignment of robust features with the target domain. We conduct experiments on
four benchmark datasets and confirm the effectiveness of our proposed approach
on large domain shifts from real to artistic images. Compared to the baseline
models, the adversarially trained detectors improve the mean average precision
by up to 7.7\%, and further by up to 11.8\% when feature alignments are
incorporated.
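论文的出发点是源域上的标准对抗训练;下面给出基于PGD的对抗训练单步的示意性PyTorch草图(超参数为常见默认值,函数名为本文虚构,并非论文检测器的实现):

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # PGD攻击:在L∞球内迭代寻找使损失最大的扰动
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach()

def adversarial_training_step(model, optimizer, x, y):
    # 先在干净样本上生成对抗样本,再在对抗样本上最小化损失
    model.eval()
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()

# 用法示意:
# model = torchvision.models.resnet18(num_classes=10)
# adversarial_training_step(model, torch.optim.SGD(model.parameters(), lr=0.1), x, y)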
【3】 Online Unsupervised Learning of Visual Representations and Categories
标题:视觉表征和类别的在线无监督学习
链接:https://arxiv.org/abs/2109.05675
作者:Mengye Ren,Tyler R. Scott,Michael L. Iuzzolino,Michael C. Mozer,Richard Zemel
机构:Google Research; University of Colorado, Boulder, University of Toronto; Vector Institute; CIFAR
备注:29 pages
摘要:现实世界的学习场景涉及类别的非平稳分布,且样本之间具有顺序依赖关系,这与从固定的(通常是均匀的)分布中独立采样的标准机器学习设定形成对比。此外,现实世界的交互要求在只有很少甚至没有类标签的情况下即时学习。在这项工作中,我们提出了一个无监督模型,该模型在不依赖任何类别标签的情况下,同时执行在线视觉表征学习和新类别的小样本(few-shot)学习。我们的模型是一个基于原型的记忆网络,带有一个决定何时形成新类原型的控制组件。我们将其表述为一个在线高斯混合模型,其中组件仅凭单个新样本即可在线创建,且分配不必均衡,这使得模型能够近似来自未经整理的原始数据的自然不平衡分布。学习过程包括一种对比损失,它鼓励将同一图像的不同视图分配给同一原型。其结果是一种在非平稳环境中形成对象类别表征的机制。实验表明,我们的方法可以从在线视觉输入数据流中学习,并且在类别识别方面显著优于最先进的自监督学习方法。
摘要:Real world learning scenarios involve a nonstationary distribution of classes
with sequential dependencies among the samples, in contrast to the standard
machine learning formulation of drawing samples independently from a fixed,
typically uniform distribution. Furthermore, real world interactions demand
learning on-the-fly from few or no class labels. In this work, we propose an
unsupervised model that simultaneously performs online visual representation
learning and few-shot learning of new categories without relying on any class
labels. Our model is a prototype-based memory network with a control component
that determines when to form a new class prototype. We formulate it as an
online Gaussian mixture model, where components are created online with only a
single new example, and assignments do not have to be balanced, which permits
an approximation to natural imbalanced distributions from uncurated raw data.
Learning includes a contrastive loss that encourages different views of the
same image to be assigned to the same prototype. The result is a mechanism that
forms categorical representations of objects in nonstationary environments.
Experiments show that our method can learn from an online stream of visual
input data and is significantly better at category recognition compared to
state-of-the-art self-supervised learning methods.
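摘要中“距离超过阈值则创建新类原型、否则在线更新原型均值”的控制机制,可用如下示意性Python草图说明(PrototypeMemory为虚构名称,省略了对比损失与混合模型的方差建模):

import numpy as np

class PrototypeMemory:
    def __init__(self, threshold=8.0):
        self.threshold = threshold
        self.prototypes = []   # 每项为 (均值向量, 样本计数)

    def observe(self, x):
        # 返回样本被分配到的原型编号;必要时新建原型
        if self.prototypes:
            dists = [np.linalg.norm(x - m) for m, _ in self.prototypes]
            j = int(np.argmin(dists))
            if dists[j] < self.threshold:
                m, n = self.prototypes[j]
                self.prototypes[j] = ((m * n + x) / (n + 1), n + 1)  # 在线均值更新
                return j
        self.prototypes.append((x.copy(), 1))   # 距离过远:形成新类原型
        return len(self.prototypes) - 1

mem = PrototypeMemory()
rng = np.random.default_rng(0)
print([mem.observe(rng.normal(loc=c, size=8)) for c in (0, 0, 5, 5)])  # 期望输出 [0, 0, 1, 1]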
【4】 FedTriNet: A Pseudo Labeling Method with Three Players for Federated Semi-supervised Learning
标题:FedTriNet:一种用于联合半监督学习的三人伪标记方法
链接:https://arxiv.org/abs/2109.05612
作者:Liwei Che,Zewei Long,Jiaqi Wang,Yaqing Wang,Houping Xiao,Fenglong Ma
摘要:联邦学习在分布式数据利用和隐私保护方面显示出巨大的潜力。大多数现有的联邦学习方法关注于监督设置,即假定每个客户端中存储的所有数据都有标签。然而,在现实应用中,客户端数据不可能被完全标注。因此,如何利用未标记数据成为联邦学习的一个新挑战。尽管有一些研究试图克服这一挑战,但它们可能存在信息泄露或误导性信息使用的问题。为了解决这些问题,本文提出了一种新的联邦半监督学习方法FedTriNet,它由两个学习阶段组成。在第一阶段,我们使用带标签的数据和FedAvg对FedTriNet进行预训练。在第二阶段,我们的目标是充分利用未标记数据来帮助模型学习。特别地,我们提出使用三个网络和一个动态质量控制机制,为未标记数据生成高质量的伪标签并加入训练集。最后,FedTriNet使用新的训练集对模型进行再训练。在三个公开数据集上的实验结果表明,所提出的FedTriNet在IID和非IID设置下都优于最先进的基线。
摘要:Federated Learning has shown great potentials for the distributed data
utilization and privacy protection. Most existing federated learning approaches
focus on the supervised setting, which means all the data stored in each client
has labels. However, in real-world applications, the client data are impossible
to be fully labeled. Thus, how to exploit the unlabeled data should be a new
challenge for federated learning. Although a few studies are attempting to
overcome this challenge, they may suffer from information leakage or misleading
information usage problems. To tackle these issues, in this paper, we propose a
novel federated semi-supervised learning method named FedTriNet, which consists
of two learning phases. In the first phase, we pre-train FedTriNet using
labeled data with FedAvg. In the second phase, we aim to make most of the
unlabeled data to help model learning. In particular, we propose to use three
networks and a dynamic quality control mechanism to generate high-quality
pseudo labels for unlabeled data, which are added to the training set. Finally,
FedTriNet uses the new training set to retrain the model. Experimental results
on three publicly available datasets show that the proposed FedTriNet
outperforms state-of-the-art baselines under both IID and Non-IID settings.
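第二阶段伪标签生成的思路可以用下面的示意性Python草图说明(“三网络一致投票+随轮数放宽的置信度阈值”只是对论文动态质量控制的一种假设性简化,函数与参数均为虚构):

import numpy as np

def pseudo_label(probs_list, base_tau=0.9, round_idx=0, decay=0.02):
    # probs_list:三个网络对同一未标记样本输出的softmax概率
    tau = max(0.5, base_tau - decay * round_idx)        # 动态质量控制阈值
    preds = [int(np.argmax(p)) for p in probs_list]
    conf = float(np.mean([np.max(p) for p in probs_list]))
    if len(set(preds)) == 1 and conf >= tau:
        return preds[0]    # 三网一致且足够自信:接受伪标签,加入训练集
    return None            # 否则拒绝该样本

p1 = np.array([0.05, 0.92, 0.03])
p2 = np.array([0.08, 0.88, 0.04])
p3 = np.array([0.10, 0.85, 0.05])
print(pseudo_label([p1, p2, p3], round_idx=3))   # -> 1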
【5】 An Unsupervised Deep-Learning Method for Fingerprint Classification: the CCAE Network and the Hybrid Clustering Strategy
标题:一种用于指纹分类的无监督深度学习方法:CCAE网络和混合聚类策略
链接:https://arxiv.org/abs/2109.05526
作者:Yue-Jie Hou,Zai-Xin Xie,Jian-Hu,Yao-Shen,Chi-Chun Zhou
摘要:在指纹匹配过程中,指纹分类是加快匹配过程、提高匹配精度的一种重要而有效的方法。传统的监督方法需要大量预先标注的数据,消耗大量人力资源。在本文中,我们提出了一种新的、高效的无监督深度学习方法,可以自动提取指纹特征并对指纹模式进行分类。该方法采用一种名为约束卷积自编码器(CCAE)的新模型提取指纹特征,并采用混合聚类策略得到最终聚类结果。在NIST-DB4数据集上的一组实验表明,所提出的无监督方法在指纹分类上表现出良好的性能。例如,仅用NIST-DB4中1000枚未标注指纹,CCAE的准确率就达到97.3%。
摘要:The fingerprint classification is an important and effective method to
quicken the process and improve the accuracy in the fingerprint matching
process. Conventional supervised methods need a large amount of pre-labeled
data and thus consume immense human resources. In this paper, we propose a new
and efficient unsupervised deep learning method that can extract fingerprint
features and classify fingerprint patterns automatically. In this approach, a
new model named constraint convolutional auto-encoder (CCAE) is used to extract
fingerprint features and a hybrid clustering strategy is applied to obtain the
final clusters. A set of experiments in the NIST-DB4 dataset shows that the
proposed unsupervised method exhibits the efficient performance on fingerprint
classification. For example, the CCAE achieves an accuracy of 97.3% on only
1000 unlabeled fingerprints in the NIST-DB4.
【6】 Pairwise Supervised Contrastive Learning of Sentence Representations
标题:句子表征的成对监督对比学习
链接:https://arxiv.org/abs/2109.05424
作者:Dejiao Zhang,Shang-Wen Li,Wei Xiao,Henghui Zhu,Ramesh Nallapati,Andrew O. Arnold,Bing Xiang
机构:AWS AI
备注:9 pages, EMNLP 2021
摘要:最近在句子表征学习方面的许多成功,都是通过简单地在自然语言推理(NLI)数据集上用三元组损失(triplet loss)或孪生损失(siamese loss)进行微调实现的。然而,它们有一个共同的弱点:矛盾句对中的句子不一定来自不同的语义类别。因此,仅优化语义蕴含和矛盾推理目标不足以捕获高层语义结构。普通的孪生或三元组损失只能从单个句子对或三元组中学习,往往陷入较差的局部最优,这使得上述缺点更加严重。在本文中,我们提出了PairSupCon,一种基于实例判别的方法,旨在将语义蕴含和矛盾理解与高层类别概念编码联系起来。我们在涉及不同粒度句子语义理解的多种下游任务上评估PairSupCon。相比先前最先进的方法,我们在八项聚类任务上平均提升$10\%$--$13\%$,在七项语义文本相似度(STS)任务上平均提升$5\%$--$6\%$。
摘要:Many recent successes in sentence representation learning have been achieved
by simply fine-tuning on the Natural Language Inference (NLI) datasets with
triplet loss or siamese loss. Nevertheless, they share a common weakness:
sentences in a contradiction pair are not necessarily from different semantic
categories. Therefore, optimizing the semantic entailment and contradiction
reasoning objective alone is inadequate to capture the high-level semantic
structure. The drawback is compounded by the fact that the vanilla siamese or
triplet losses only learn from individual sentence pairs or triplets, which
often suffer from bad local optima. In this paper, we propose PairSupCon, an
instance discrimination based approach aiming to bridge semantic entailment and
contradiction understanding with high-level categorical concept encoding. We
evaluate PairSupCon on various downstream tasks that involve understanding
sentence semantics at different granularities. We outperform the previous
state-of-the-art method with $10\%$--$13\%$ averaged improvement on eight
clustering tasks, and $5\%$--$6\%$ averaged improvement on seven semantic
textual similarity (STS) tasks.
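作为参考,下面给出一个有监督对比(实例判别式)损失的示意性PyTorch草图,体现“同类句子为正例、批内其余为负例”的思想(并非PairSupCon的官方实现,温度等超参数为假设):

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z, labels, temperature=0.05):
    # z: (batch, dim) 句子表示;labels: (batch,) 类别标签
    z = F.normalize(z, dim=1)
    sim = z @ z.T / temperature
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    pos.fill_diagonal_(0)                            # 自身不作为正例
    logits = sim - 1e9 * torch.eye(len(z))           # 自身不参与softmax
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1)).mean()

z = torch.randn(8, 128)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
print(supervised_contrastive_loss(z, labels))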
【7】 Self supervised learning improves dMMR/MSI detection from histology slides across multiple cancers
标题:自我监督学习改进了跨多个癌症的组织切片中的dMMR/MSI检测
链接:https://arxiv.org/abs/2109.05819
作者:Charlie Saillard,Olivier Dehaene,Tanguy Marchand,Olivier Moindrot,Aurélie Kamoun,Benoit Schmauch,Simon Jegou
机构:Owkin, Inc., Aur´elie Kamoun, Editor:
备注:Accepted for poster and oral presentation at the MICCAI 2021 COMPAY Workshop (submitted the 19th of July 2021)
摘要:微卫星不稳定性(MSI)是一种肿瘤表型,其诊断在很大程度上影响结直肠癌(CRC)患者的治疗,并与所有实体瘤的免疫治疗反应相关。直接从H&E染色切片检测MSI肿瘤的深度学习模型在改善MSI患者的诊断方面显示出了希望。先前用于MSI检测的深度学习模型依赖于在不包含任何医学图像的ImageNet数据集上预训练的神经网络。在本研究中,我们利用自监督学习的最新进展,通过使用MoCo V2对TCGA数据集的组织学图像训练神经网络。我们发现,这些网络始终优于使用ImageNet预训练的网络,并获得MSI检测的最新结果,CRC和胃肿瘤的AUC分别为0.92和0.83。这些模型在外部CRC队列(PAIP上的AUC为0.97)上具有良好的通用性,并改善了从一个器官到另一个器官的转移。最后,我们表明,预测图像区域显示出有意义的组织学模式,并且根据专家病理学家的说法,MoCo特征的使用突出了更相关的模式。
摘要:Microsatellite instability (MSI) is a tumor phenotype whose diagnosis largely
impacts patient care in colorectal cancers (CRC), and is associated with
response to immunotherapy in all solid tumors. Deep learning models detecting
MSI tumors directly from H&E stained slides have shown promise in improving
diagnosis of MSI patients. Prior deep learning models for MSI detection have
relied on neural networks pretrained on ImageNet dataset, which does not
contain any medical image. In this study, we leverage recent advances in
self-supervised learning by training neural networks on histology images from
the TCGA dataset using MoCo V2. We show that these networks consistently
outperform their counterparts pretrained using ImageNet and obtain
state-of-the-art results for MSI detection with AUCs of 0.92 and 0.83 for CRC
and gastric tumors, respectively. These models generalize well on an external
CRC cohort (0.97 AUC on PAIP) and improve transfer from one organ to another.
Finally we show that predictive image regions exhibit meaningful histological
patterns, and that the use of MoCo features highlighted more relevant patterns
according to an expert pathologist.
【8】 AstronomicAL: An interactive dashboard for visualisation, integration and classification of data using Active Learning
标题:AstronomicAL:使用主动学习实现数据可视化、集成和分类的交互式仪表板
链接:https://arxiv.org/abs/2109.05207
作者:Grant Stevens,Sotiria Fotopoulou,Malcolm N. Bremer,Oliver Ray
机构:Ray, Department of Computer Science, Merchant Venturers Building, University of Bristol, Woodland, Road, Bristol, BS,UB , School of Physics, HH Wills Physics Laboratory, University of Bristol, Tyndall Avenue, Bristol, BS,TL, DOI: ,.,joss., Software, • Review
备注:None
摘要:AstronomicAL是一个人在回路(human-in-the-loop)的交互式标注与训练仪表板,允许用户使用主动学习创建可靠的数据集和鲁棒的分类器。该技术优先处理能提供高信息增益的数据,从而用明显更少的数据获得更好的性能。该系统允许用户可视化和整合来自不同来源的数据,并处理不正确或缺失的标签以及类别规模不平衡的问题。AstronomicAL使专家能够可视化特定领域的图表和关键信息,这些信息既涉及更广泛的背景,也涉及从各种数据源提取的兴趣点的细节,从而确保标签可靠。此外,AstronomicAL还提供了探索训练过程各个方面的功能,包括自定义模型和查询策略。这使得该软件既可用于试验特定领域的分类,也可用于更通用的机器学习策略。鉴于天文领域的迫切需要,我们用一个天文数据集演示了该系统的使用;不过,AstronomicAL是为任何学科的数据集而设计的。最后,通过导出一个简单的配置文件,整个布局、模型和已分配的标签都可以与社区共享。这保证了完全的透明性,并确保复现结果的过程毫不费力。
摘要:AstronomicAL is a human-in-the-loop interactive labelling and training
dashboard that allows users to create reliable datasets and robust classifiers
using active learning. This technique prioritises data that offer high
information gain, leading to improved performance using substantially less
data. The system allows users to visualise and integrate data from different
sources and deal with incorrect or missing labels and imbalanced class sizes.
AstronomicAL enables experts to visualise domain-specific plots and key
information relating both to broader context and details of a point of interest
drawn from a variety of data sources, ensuring reliable labels. In addition,
AstronomicAL provides functionality to explore all aspects of the training
process, including custom models and query strategies. This makes the software
a tool for experimenting with both domain-specific classifications and more
general-purpose machine learning strategies. We illustrate using the system
with an astronomical dataset due to the field's immediate need; however,
AstronomicAL has been designed for datasets from any discipline. Finally, by
exporting a simple configuration file, entire layouts, models, and assigned
labels can be shared with the community. This allows for complete transparency
and ensures that the process of reproducing results is effortless.
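AstronomicAL所依赖的主动学习循环(优先把高信息增益样本交给专家标注)可以用如下示意性Python草图说明(以scikit-learn逻辑回归和预测熵作为不确定性代理,均为示例性选择,与软件实际实现无关):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def active_learning_loop(X, y_oracle, n_init=10, n_rounds=5, batch=5, seed=0):
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X), n_init, replace=False))
    for _ in range(n_rounds):
        clf = LogisticRegression(max_iter=1000).fit(X[labeled], y_oracle[labeled])
        pool = np.setdiff1d(np.arange(len(X)), labeled)
        proba = clf.predict_proba(X[pool])
        entropy = -(proba * np.log(proba + 1e-12)).sum(1)    # 不确定性度量
        labeled += list(pool[np.argsort(-entropy)[:batch]])  # “专家”即y_oracle
    return clf, labeled

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
clf, labeled = active_learning_loop(X, y)
print(len(labeled))   # 10 + 5*5 = 35 个已标注样本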
迁移|Zero/Few/One-Shot|自适应(6篇)
【1】 Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training
标题:基于情感预训练的Few-Shot跨语言立场检测
链接:https://arxiv.org/abs/2109.06050
作者:Momchil Hardalov,Arnav Arora,Preslav Nakov,Isabelle Augenstein
机构: Checkstep Research, Sofia University “St. Kliment Ohridski”, Bulgaria, University of Copenhagen, Denmark, Qatar Computing Research Institute, HBKU, Doha, Qatar
摘要:立场检测的目标是确定文本中针对某一目标所表达的观点。根据用户和平台的不同,这些观点或其上下文通常以多种不同的语言表达,平台可以是本地新闻媒体、社交媒体、新闻论坛等。然而,大多数立场检测研究仅限于单一语言和少数有限目标,跨语言立场检测的工作很少。此外,非英语的标注数据来源往往稀缺,并带来额外的挑战。最近,大型多语言语言模型极大地提高了许多非英语任务的性能,尤其是在样本数量有限的情况下。这凸显了模型预训练及其从少量样本中学习的能力的重要性。在本文中,我们介绍了迄今为止最全面的跨语言立场检测研究:我们在来自6个语系的12种语言的15个不同数据集上进行实验,每个数据集有6种低资源评估设置。在实验中,我们以模式利用训练(pattern-exploiting training)为基础,提出加入一个新的标签编码器来简化言语化(verbalisation)过程。我们进一步提出基于情感生成立场数据用于预训练,与若干强基线相比,在少样本(low-shot)设置下F1绝对值提升超过6%。
摘要:The goal of stance detection is to determine the viewpoint expressed in a
piece of text towards a target. These viewpoints or contexts are often
expressed in many different languages depending on the user and the platform,
which can be a local news outlet, a social media platform, a news forum, etc.
Most research in stance detection, however, has been limited to working with a
single language and on a few limited targets, with little work on cross-lingual
stance detection. Moreover, non-English sources of labelled data are often
scarce and present additional challenges. Recently, large multilingual language
models have substantially improved the performance on many non-English tasks,
especially such with limited numbers of examples. This highlights the
importance of model pre-training and its ability to learn from few examples. In
this paper, we present the most comprehensive study of cross-lingual stance
detection to date: we experiment with 15 diverse datasets in 12 languages from
6 language families, and with 6 low-resource evaluation settings each. For our
experiments, we build on pattern-exploiting training, proposing the addition of
a novel label encoder to simplify the verbalisation procedure. We further
propose sentiment-based generation of stance data for pre-training, which shows
sizeable improvement of more than 6% F1 absolute in low-shot settings compared
to several strong baselines.
【2】 An Adaptive Boosting Technique to Mitigate Popularity Bias in Recommender System
标题:一种降低推荐系统热度偏差的自适应Boosting技术
链接:https://arxiv.org/abs/2109.05677
作者:Ajay Gangwar,Shweta Jain
机构:Indian Institute of Technology, Ropar
备注:7 pages, 8 figures, Accepted in FAccTRec
摘要:在大多数推荐系统中观察到的评分都受到流行度偏差的影响,因而并非随机缺失。因此,只有少数流行项目被推荐,而大量非流行项目几乎不被推荐。不推荐非流行项目会导致主导市场的产品减少,从而减少创造和创新的机会。文献中已经提出了若干公平算法,主要侧重于提高推荐系统的准确性。然而,典型的准确性度量偏向流行项目,即相比非流行项目,它更偏向提升流行项目的准确性。本文考虑了一个度量流行度偏差的指标,即流行项目和非流行项目上误差的差异。受分类任务中公平提升(boosting)算法的启发,我们提出了一种算法,可以在将准确性保持在可接受范围内的同时,减少数据中存在的流行度偏差。我们算法的主要思想是提升非流行项目的权重,这些项目在数据中通常代表性不足。通过在真实数据集上的综合实验,我们证明了所提出的算法在所提出的流行度偏差度量上优于现有算法。
摘要:The observed ratings in most recommender systems are subjected to popularity
bias and are thus not randomly missing. Due to this, only a few popular items
are recommended, and a vast number of non-popular items are hardly recommended.
Not suggesting the non-popular items lead to fewer products dominating the
market and thus offering fewer opportunities for creativity and innovation. In
the literature, several fair algorithms have been proposed which mainly focused
on improving the accuracy of the recommendation system. However, a typical
accuracy measure is biased towards popular items, i.e., it promotes better
accuracy for popular items compared to non-popular items. This paper considers
a metric that measures the popularity bias as the difference in error on
popular items and non-popular items. Motivated by the fair boosting algorithm
on classification, we propose an algorithm that reduces the popularity bias
present in the data while maintaining accuracy within acceptable limits. The
main idea of our algorithm is that it lifts the weights of the non-popular
items, which are generally underrepresented in the data. With the help of
comprehensive experiments on real-world datasets, we show that our proposed
algorithm outperforms the existing algorithms on the proposed popularity bias
metric.
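摘要中的偏差度量(流行/非流行项目误差之差)与“提升非流行项目权重”的boosting式思想,可用如下示意性Python草图说明(指数重加权形式为假设,并非论文算法本身):

import numpy as np

def reweight(errors, is_popular, weights, lr=0.5):
    # 流行度偏差度量:非流行项目与流行项目的平均误差之差
    bias = errors[~is_popular].mean() - errors[is_popular].mean()
    # 偏差越大,越是提升非流行项目的样本权重,随后归一化
    weights = weights * np.where(~is_popular, np.exp(lr * bias), 1.0)
    return weights / weights.sum(), bias

errors = np.array([0.20, 0.25, 0.60, 0.70])          # 前两项为流行项目的误差
is_popular = np.array([True, True, False, False])
w, bias = reweight(errors, is_popular, np.full(4, 0.25))
print(bias, w)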
【3】 Facial Anatomical Landmark Detection using Regularized Transfer Learning with Application to Fetal Alcohol Syndrome Recognition
标题:基于正则化迁移学习的面部解剖标志点检测及其在胎儿酒精综合征识别中的应用
链接:https://arxiv.org/abs/2109.05485
作者:Zeyu Fu,Jianbo Jiao,Michael Suttie,J. Alison Noble
机构: Noble are with the Department of EngineeringScience, University of Oxford
备注:To appear in IEEE journal of Biomedical and Health Informatics 2021
摘要:产前酒精暴露引起的胎儿酒精综合征(FAS)可导致一系列颅面部异常以及行为和神经认知问题。目前对FAS的诊断通常是通过识别一组面部特征来完成的,这些特征通常通过人工检查获得。解剖标志点检测提供了丰富的几何信息,对于检测FAS相关面部异常的存在非常重要。这种成像应用的特点是数据外观变化大,且标注数据的可用性有限。当前为自然图像中面部标志点检测设计的基于深度学习的热图回归方法假设有大规模数据集可用,因此不适合此应用。为了解决这一限制,我们开发了一种新的正则化迁移学习方法,该方法利用在大型人脸识别数据集上学习到的网络知识。与侧重于调整预训练权重的标准迁移学习不同,所提出的学习方法对模型行为进行正则化。它显式地在目标任务数据上复用领域相近的源模型的丰富视觉语义,作为规范标志点检测优化的额外监督信号。具体来说,我们为所提出的迁移学习开发了四个正则化约束,包括约束分类层和中间层的特征输出,以及在空间和通道两个层级上匹配激活注意力图。在收集的临床影像数据集上的实验评估表明,所提出的方法可以在有限训练样本下有效提高模型泛化能力,并优于文献中的其他方法。
摘要:Fetal alcohol syndrome (FAS) caused by prenatal alcohol exposure can result
in a series of cranio-facial anomalies, and behavioral and neurocognitive
problems. Current diagnosis of FAS is typically done by identifying a set of
facial characteristics, which are often obtained by manual examination.
Anatomical landmark detection, which provides rich geometric information, is
important to detect the presence of FAS associated facial anomalies. This
imaging application is characterized by large variations in data appearance and
limited availability of labeled data. Current deep learning-based heatmap
regression methods designed for facial landmark detection in natural images
assume availability of large datasets and are therefore not wellsuited for this
application. To address this restriction, we develop a new regularized transfer
learning approach that exploits the knowledge of a network learned on large
facial recognition datasets. In contrast to standard transfer learning which
focuses on adjusting the pre-trained weights, the proposed learning approach
regularizes the model behavior. It explicitly reuses the rich visual semantics
of a domain-similar source model on the target task data as an additional
supervisory signal for regularizing landmark detection optimization.
Specifically, we develop four regularization constraints for the proposed
transfer learning, including constraining the feature outputs from
classification and intermediate layers, as well as matching activation
attention maps in both spatial and channel levels. Experimental evaluation on a
collected clinical imaging dataset demonstrate that the proposed approach can
effectively improve model generalizability under limited training samples, and
is advantageous to other approaches in the literature.
【4】 Adaptive network reliability analysis: Methodology and applications to power grid
标题:自适应网络可靠性分析方法及其在电网中的应用
链接:https://arxiv.org/abs/2109.05360
作者:Nariman L. Dehghani,Soroush Zamanian,Abdollah Shafieezadeh
机构: Risk Assessment and Management of Structural and Infrastructure Systems (RAMSIS) Lab, Department, of Civil, Environmental, and Geodetic Engineering, The Ohio State University, Columbus, OH, United States
备注:None
摘要:流量网络模型可以捕获许多网络系统的基本物理和操作约束,包括电网、交通和水网络。然而,使用计算昂贵的基于流的模型分析系统的可靠性面临着巨大的挑战,特别是对于罕见事件。现有的主动训练元模型为可靠性分析提供了一个新的方向,但由于这些方法无法处理高维问题以及离散或混合变量输入,因此不适用于网络。本研究首次提出了基于贝叶斯加性回归树的自适应代理网络可靠性分析(ANR-BART)。该方法通过一种主动学习方法将BART和蒙特卡罗模拟(MCS)相结合,该方法基于BART在预测变量空间上得出的可信区间以及点与估计极限状态的接近程度来识别最有价值的训练样本。基准电网包括IEEE 30、57、118和300母线系统及其用于级联故障分析的潮流模型,用于研究ANR-BART、MCS、子集模拟以及被动训练的最优深层神经网络和BART。结果表明,ANR-BART具有鲁棒性,能够准确估计网络故障概率,同时显著降低可靠性分析的计算成本。
摘要:Flow network models can capture the underlying physics and operational
constraints of many networked systems including the power grid and
transportation and water networks. However, analyzing reliability of systems
using computationally expensive flow-based models faces substantial challenges,
especially for rare events. Existing actively trained meta-models, which
present a new promising direction in reliability analysis, are not applicable
to networks due to the inability of these methods to handle high-dimensional
problems as well as discrete or mixed variable inputs. This study presents the
first adaptive surrogate-based Network Reliability Analysis using Bayesian
Additive Regression Trees (ANR-BART). This approach integrates BART and Monte
Carlo simulation (MCS) via an active learning method that identifies the most
valuable training samples based on the credible intervals derived by BART over
the space of predictor variables as well as the proximity of the points to the
estimated limit state. Benchmark power grids including IEEE 30, 57, 118, and
300-bus systems and their power flow models for cascading failure analysis are
considered to investigate ANR-BART, MCS, subset simulation, and
passively-trained optimal deep neural networks and BART. Results indicate that
ANR-BART is robust and yields accurate estimates of network failure
probability, while significantly reducing the computational cost of reliability
analysis.
【5】 Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information
标题:基于二阶信息的机器学习双自适应缩放算法
链接:https://arxiv.org/abs/2109.05198
作者:Majid Jahani,Sergey Rusakov,Zheng Shi,Peter Richtárik,Michael W. Mahoney,Martin Takáč
机构:Lehigh University, USA, KAUST, Saudi Arabia, University of California, Berkeley, USA, Martin Takáˇc, MBZUAI, United Arab Emirates
摘要:针对大规模机器学习问题,提出了一种新的自适应优化算法。该方法通过对局部曲率和Lipschitz平滑度的低成本估计,动态调整搜索方向和步长。搜索方向包含梯度信息,该梯度信息由捕获局部曲率信息的缩放良好的对角预处理矩阵预处理。我们的方法不需要学习率调整的繁琐任务,因为学习率会自动更新,而无需添加额外的超参数。我们提供了一系列优化问题的收敛性保证,包括确定性和随机性条件下的凸、强凸和非凸问题。我们还对标准的机器学习问题进行了广泛的实证评估,证明了我们的算法的通用性,并证明了与其他最新的一阶和二阶方法相比,它具有强大的性能。
摘要:We present a novel adaptive optimization algorithm for large-scale machine
learning problems. Equipped with a low-cost estimate of local curvature and
Lipschitz smoothness, our method dynamically adapts the search direction and
step-size. The search direction contains gradient information preconditioned by
a well-scaled diagonal preconditioning matrix that captures the local curvature
information. Our methodology does not require the tedious task of learning rate
tuning, as the learning rate is updated automatically without adding an extra
hyperparameter. We provide convergence guarantees on a comprehensive collection
of optimization problems, including convex, strongly convex, and nonconvex
problems, in both deterministic and stochastic regimes. We also conduct an
extensive empirical evaluation on standard machine learning problems,
justifying our algorithm's versatility and demonstrating its strong performance
compared to other state-of-the-art first-order and second-order methods.
【6】 Toward Communication Efficient Adaptive Gradient Method
标题:走向通信高效的自适应梯度法
链接:https://arxiv.org/abs/2109.05109
作者:Xiangyi Chen,Xiaoyun Li,Ping Li
机构:Cognitive Computing Lab, Baidu Research, NE ,th St. Bellevue, WA , USA
摘要:近年来,分布式优化被证明是加速大规模机器学习模型(如深度神经网络)训练的有效方法。随着gpu计算能力的提高,分布式训练中训练速度的瓶颈逐渐从计算转向通信。与此同时,为了在移动设备上训练机器学习模型,一种称为“联合学习”的新型分布式训练范式已经流行起来。由于移动设备的低带宽,联邦学习中的通信时间尤其重要。虽然已经提出了各种提高联邦学习通信效率的方法,但大多数方法都是以SGD作为原型训练算法设计的。虽然自适应梯度方法已被证明是训练神经网络的有效方法,但在联合学习中对自适应梯度方法的研究却很少。在本文中,我们提出了一种自适应梯度方法,可以保证联邦学习的收敛性和通信效率。
摘要:In recent years, distributed optimization is proven to be an effective
approach to accelerate training of large scale machine learning models such as
deep neural networks. With the increasing computation power of GPUs, the
bottleneck of training speed in distributed training is gradually shifting from
computation to communication. Meanwhile, in the hope of training machine
learning models on mobile devices, a new distributed training paradigm called
``federated learning'' has become popular. The communication time in federated
learning is especially important due to the low bandwidth of mobile devices.
While various approaches to improve the communication efficiency have been
proposed for federated learning, most of them are designed with SGD as the
prototype training algorithm. While adaptive gradient methods have been proven
effective for training neural nets, the study of adaptive gradient methods in
federated learning is scarce. In this paper, we propose an adaptive gradient
method that can guarantee both the convergence and the communication efficiency
for federated learning.
强化学习(8篇)
【1】 RADARS: Memory Efficient Reinforcement Learning Aided Differentiable Neural Architecture Search
标题:RADARS:内存高效的强化学习辅助可微神经结构搜索
链接:https://arxiv.org/abs/2109.05691
作者:Zheyu Yan,Weiwen Jiang,Xiaobo Sharon Hu,Yiyu Shi
机构:University of Notre Dame †George Mason University
摘要:可微神经结构搜索(DNAS)以其自动生成高性能神经网络的能力而闻名。然而,当搜索空间扩大时,基于DNAS的方法会受到内存使用量爆炸的影响,这可能使它们即使在先进的GPU平台上也无法成功运行。另一方面,基于强化学习(RL)的方法虽然内存高效,但非常耗时。结合这两类方法的优点,本文提出了RADARS,一个可扩展的RL辅助DNAS框架,能够快速且内存高效地探索大型搜索空间。RADARS迭代地应用RL剪除不想要的候选体系结构,并确定一个有希望的子空间来执行DNAS。使用具有12 GB GPU内存的工作站进行的实验表明,在CIFAR-10和ImageNet数据集上,与最先进的基于RL的方法相比,RADARS可以实现最多3.41%的精度提升,同时搜索时间减少2.5倍,而两个DNAS基线则由于内存使用或搜索时间过多而无法完成。据作者所知,这是第一个能够在内存受限条件下处理大型搜索空间的DNAS框架。
摘要:Differentiable neural architecture search (DNAS) is known for its capacity in
the automatic generation of superior neural networks. However, DNAS based
methods suffer from memory usage explosion when the search space expands, which
may prevent them from running successfully on even advanced GPU platforms. On
the other hand, reinforcement learning (RL) based methods, while being memory
efficient, are extremely time-consuming. Combining the advantages of both types
of methods, this paper presents RADARS, a scalable RL-aided DNAS framework that
can explore large search spaces in a fast and memory-efficient manner. RADARS
iteratively applies RL to prune undesired architecture candidates and
identifies a promising subspace to carry out DNAS. Experiments using a
workstation with 12 GB GPU memory show that on CIFAR-10 and ImageNet datasets,
RADARS can achieve up to 3.41% higher accuracy with 2.5X search time reduction
compared with a state-of-the-art RL-based method, while the two DNAS baselines
cannot complete due to excessive memory usage or search time. To the best of
the authors' knowledge, this is the first DNAS framework that can handle large
search spaces with bounded memory usage.
【2】 Reinforcement Learning for Load-balanced Parallel Particle Tracing
标题:强化学习在负载平衡并行粒子跟踪中的应用
链接:https://arxiv.org/abs/2109.05679
作者:Jiayi Xu,Hanqi Guo,Han-Wei Shen,Mukund Raj,Skylar Wolfgang Wurster,Tom Peterka
机构: The Ohio State University
备注:Under Review at IEEE Transactions on Visualization and Computer Graphics
摘要:我们探索了一种在线学习强化学习(RL)范式,用于优化分布式存储系统中的并行粒子跟踪性能。我们的方法结合了三个新的组成部分:(1)工作负载捐赠模型,(2)高阶工作负载估计模型,(3)通信开销模型,动态优化数据并行粒子跟踪的性能。首先,我们设计了一个基于RL的工作负载捐赠模型。我们的工作负载捐赠模型监控流程的工作负载,并创建RL代理,将高工作负载流程中的粒子和数据块捐赠给低工作负载流程,以最小化执行时间。代理人根据报酬和成本函数动态学习捐赠策略。奖励和成本函数被设计为考虑过程的工作量变化和数据转移成本的每一个捐赠行动。其次,我们提出了一个在线工作负载估计模型,以帮助我们的RL模型估计未来计算中进程的工作负载分布。第三,我们设计了同时考虑块和粒子数据交换代价的通信代价模型,帮助代理以最小的通信代价做出有效的决策。我们证明了我们的算法适用于大规模流体动力学、海洋和天气模拟数据中的不同流动行为。我们的算法在并行效率、负载平衡、I/O和通信成本等方面提高了并行粒子跟踪性能,最多可用于16384个处理器。
摘要:We explore an online learning reinforcement learning (RL) paradigm for
optimizing parallel particle tracing performance in distributed-memory systems.
Our method combines three novel components: (1) a workload donation model, (2)
a high-order workload estimation model, and (3) a communication cost model, to
optimize the performance of data-parallel particle tracing dynamically. First,
we design an RL-based workload donation model. Our workload donation model
monitors the workload of processes and creates RL agents to donate particles
and data blocks from high-workload processes to low-workload processes to
minimize the execution time. The agents learn the donation strategy on-the-fly
based on reward and cost functions. The reward and cost functions are designed
to consider the processes' workload change and the data transfer cost for every
donation action. Second, we propose an online workload estimation model, in
order to help our RL model estimate the workload distribution of processes in
future computations. Third, we design the communication cost model that
considers both block and particle data exchange costs, helping the agents make
effective decisions with minimized communication cost. We demonstrate that our
algorithm adapts to different flow behaviors in large-scale fluid dynamics,
ocean, and weather simulation data. Our algorithm improves parallel particle
tracing performance in terms of parallel efficiency, load balance, and costs of
I/O and communication for evaluations up to 16,384 processors.
【3】 Direct Random Search for Fine Tuning of Deep Reinforcement Learning Policies
标题:深度强化学习策略精调的直接随机搜索
链接:https://arxiv.org/abs/2109.05604
作者:Sean Gillen,Asutay Ozmen,Katie Byl
机构: University of California
备注:Under Review
摘要:研究人员已经证明,深度强化学习(DRL)是一种寻找在复杂机器人系统上表现良好的策略的强大工具。然而,这些策略通常不可预测,在仅以略有不同的初始条件进行评估时,可能会导致高度可变的行为。训练方面的考虑限制了DRL算法的设计,因为大多数算法在训练期间必须使用随机策略。然而,部署时使用的最终策略可以是,而且经常是在每一步采用最大似然动作(MLA)的确定性策略。在这项工作中,我们展示了直接随机搜索通过确定性rollout直接优化DRL策略,在微调DRL策略方面非常有效。我们在大量强化学习环境中,使用从不同算法获得的各种策略对此进行了演示。结果表明,在我们测试的环境中,这种方法产生了更一致且性能更高的智能体。此外,我们还展示了如何使用这种方法来扩展我们之前关于收缩在深度神经网络(DNN)策略下运行的闭环系统可达状态空间维数的工作。
摘要:Researchers have demonstrated that Deep Reinforcement Learning (DRL) is a
powerful tool for finding policies that perform well on complex robotic
systems. However, these policies are often unpredictable and can induce highly
variable behavior when evaluated with only slightly different initial
conditions. Training considerations constrain DRL algorithm designs in that
most algorithms must use stochastic policies during training. The resulting
policy used during deployment, however, can and frequently is a deterministic
one that uses the Maximum Likelihood Action (MLA) at each step. In this work,
we show that a direct random search is very effective at fine-tuning DRL
policies by directly optimizing them using deterministic rollouts. We
illustrate this across a large collection of reinforcement learning
environments, using a wide variety of policies obtained from different
algorithms. Our results show that this method yields more consistent and higher
performing agents on the environments we tested. Furthermore, we demonstrate
how this method can be used to extend our previous work on shrinking the
dimensionality of the reachable state space of closed-loop systems run under
Deep Neural Network (DNN) policies.
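直接随机搜索微调可以用如下示意性Python草图表达(env.reset/env.step为假设的环境接口;这里用简单的爬山式接受准则,仅保留使确定性回报提高的参数扰动):

import numpy as np

def deterministic_return(env, theta, policy, horizon=200):
    # 固定初始条件下的确定性rollout回报(每一步取确定性动作)
    obs, total = env.reset(), 0.0
    for _ in range(horizon):
        obs, r, done = env.step(policy(theta, obs))
        total += r
        if done:
            break
    return total

def random_search_finetune(env, theta, policy, iters=50, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    best = deterministic_return(env, theta, policy)
    for _ in range(iters):
        cand = theta + sigma * rng.normal(size=theta.shape)
        ret = deterministic_return(env, cand, policy)
        if ret > best:                       # 只接受改进的扰动
            theta, best = cand, ret
    return theta, best

class ToyEnv:
    # 玩具环境,仅演示接口:奖励鼓励动作抵消状态
    def reset(self):
        self.s = 1.0
        return self.s
    def step(self, a):
        r = -(self.s + a) ** 2
        self.s *= 0.9
        return self.s, r, False

policy = lambda theta, obs: float(theta[0]) * obs     # 确定性线性策略
theta, best = random_search_finetune(ToyEnv(), np.array([0.5]), policy)
print(theta, best)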
【4】 Federated Ensemble Model-based Reinforcement Learning
标题:基于联邦集成模型的强化学习
链接:https://arxiv.org/abs/2109.05549
作者:Jin Wang,Jia Hu,Jed Mills,Geyong Min
机构: University ofExeter
摘要:联邦学习(FL)是一种保护隐私的机器学习范式,它支持地理上分布的异构用户之间的协作训练,而无需收集他们的数据。联邦强化学习(RL)将FL扩展到传统监督学习范式之外,用于处理自动驾驶等各种隐私敏感应用中的顺序决策问题。然而,现有的联邦RL算法直接将无模型RL与FL相结合,因此通常样本复杂度较高,且缺乏理论保证。为了应对上述挑战,我们提出了一种新的联邦RL算法,将基于模型的RL和集成知识蒸馏融入FL。具体而言,我们利用FL和知识蒸馏从客户端构建动力学模型的集成,然后仅使用该集成模型(而不与真实环境交互)来训练策略。此外,我们从理论上证明了所提算法的单调改进是有保证的。大量实验结果表明,在具有挑战性的连续控制基准环境中,与联邦无模型RL算法相比,我们的算法获得了显著更高的样本效率。结果还显示了非IID客户端数据和本地更新步数对联邦RL性能的影响,验证了我们从理论分析中获得的见解。
摘要:Federated learning (FL) is a privacy-preserving machine learning paradigm
that enables collaborative training among geographically distributed and
heterogeneous users without gathering their data. Extending FL beyond the
conventional supervised learning paradigm, federated Reinforcement Learning
(RL) was proposed to handle sequential decision-making problems for various
privacy-sensitive applications such as autonomous driving. However, the
existing federated RL algorithms directly combine model-free RL with FL, and
thus generally have high sample complexity and lack theoretical guarantees. To
address the above challenges, we propose a new federated RL algorithm that
incorporates model-based RL and ensemble knowledge distillation into FL.
Specifically, we utilise FL and knowledge distillation to create an ensemble of
dynamics models from clients, and then train the policy by solely using the
ensemble model without interacting with the real environment. Furthermore, we
theoretically prove that the monotonic improvement of the proposed algorithm is
guaranteed. Extensive experimental results demonstrate that our algorithm
obtains significantly higher sample efficiency compared to federated model-free
RL algorithms in the challenging continuous control benchmark environments. The
results also show the impact of non-IID client data and local update steps on
the performance of federated RL, validating the insights obtained from our
theoretical analysis.
【5】 HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation
标题:HyAR:基于混合动作表示的离散-连续动作强化学习
链接:https://arxiv.org/abs/2109.05490
作者:Boyan Li,Hongyao Tang,Yan Zheng,Jianye Hao,Pengyi Li,Zhen Wang,Zhaopeng Meng,Li Wang
机构:College of Intelligence and Computing, Tianjin University, Northwestern Polytechnical University
备注:15 pages, preprint
摘要:离散-连续混合动作空间是许多实际问题(例如机器人控制和游戏AI)的自然设定。然而,以往的强化学习(RL)工作大多只展示了在离散或连续动作空间中进行控制的成功,很少考虑混合动作空间。解决混合动作RL的一种朴素方法是通过离散化或连续化将混合动作空间转换为统一的同质动作空间,从而可以应用常规RL算法。然而,这忽略了混合动作空间的底层结构,还会引入可扩展性问题和额外的逼近困难,从而导致退化的结果。在本文中,我们提出了混合动作表示(HyAR),为原始混合动作空间学习一个紧凑且可解码的潜在表示空间。HyAR通过嵌入表和条件变分自编码器(VAE)构造潜空间,并嵌入离散动作和连续参数之间的依赖关系。为了进一步提高有效性,动作表示通过无监督的环境动力学预测被训练得语义平滑。最后,智能体使用传统DRL算法在学习到的表示空间中学习策略,并通过将混合动作嵌入解码回原始动作空间来与环境交互。我们在具有离散-连续动作空间的各种环境中评估HyAR。结果表明,与以前的基线相比,HyAR具有优越性,尤其是在高维动作空间中。
摘要:Discrete-continuous hybrid action space is a natural setting in many
practical problems, such as robot control and game AI. However, most previous
Reinforcement Learning (RL) works only demonstrate the success in controlling
with either discrete or continuous action space, while seldom take into account
the hybrid action space. One naive way to address hybrid action RL is to
convert the hybrid action space into a unified homogeneous action space by
discretization or continualization, so that conventional RL algorithms can be
applied. However, this ignores the underlying structure of hybrid action space
and also induces the scalability issue and additional approximation
difficulties, thus leading to degenerated results. In this paper, we propose
Hybrid Action Representation (HyAR) to learn a compact and decodable latent
representation space for the original hybrid action space. HyAR constructs the
latent space and embeds the dependence between discrete action and continuous
parameter via an embedding table and conditional Variational Auto-Encoder
(VAE). To further improve the effectiveness, the action representation is
trained to be semantically smooth through unsupervised environmental dynamics
prediction. Finally, the agent then learns its policy with conventional DRL
algorithms in the learned representation space and interacts with the
environment by decoding the hybrid action embeddings to the original action
space. We evaluate HyAR in a variety of environments with discrete-continuous
action space. The results demonstrate the superiority of HyAR when compared
with previous baselines, especially for high-dimensional action spaces.
【6】 Concave Utility Reinforcement Learning with Zero-Constraint Violations
标题:具有零约束违反的凹效用强化学习
链接:https://arxiv.org/abs/2109.05439
作者:Mridul Agarwal,Qinbo Bai,Vaneet Aggarwal
机构: The authors are with Purdue University
摘要:我们考虑凸约束下的表格型(tabular)无限时域凹效用强化学习(CURL)问题。各种带约束的学习应用(如机器人)不允许违反约束的策略。为此,我们提出了一种实现零约束违反的基于模型的学习算法。为了得到这个结果,我们假设凹目标和凸约束在可行占用测度集的内部存在解。然后,我们求解一个更紧的优化问题,以确保即使模型知识不精确且模型存在随机性,约束也永远不会被违反。我们还为表格型无限时域设定提出了一种新的基于Bellman误差的分析方法,使得随机策略的分析成为可能。结合基于Bellman误差的分析和更紧的优化方程,对于与环境的$T$次交互,我们得到了目标函数的遗憾保证,其增长量级为$\Tilde{O}(1/\sqrt{T})$(不含其他因子)。
摘要:We consider the problem of tabular infinite horizon concave utility
reinforcement learning (CURL) with convex constraints. Various learning
applications with constraints, such as robotics, do not allow for policies that
can violate constraints. To this end, we propose a model-based learning
algorithm that achieves zero constraint violations. To obtain this result, we
assume that the concave objective and the convex constraints have a solution
interior to the set of feasible occupation measures. We then solve a tighter
optimization problem to ensure that the constraints are never violated despite
the imprecise model knowledge and model stochasticity. We also propose a novel
Bellman error based analysis for tabular infinite-horizon setups which allows
to analyse stochastic policies. Combining the Bellman error based analysis and
tighter optimization equation, for $T$ interactions with the environment, we
obtain a regret guarantee for objective which grows as $\Tilde{O}(1/\sqrt{T})$,
excluding other factors.
【7】 EMVLight: A Decentralized Reinforcement Learning Framework for EfficientPassage of Emergency Vehicles
标题:EMVLight:一种用于应急车辆高效通行的分散式强化学习框架
链接:https://arxiv.org/abs/2109.05429
作者:Haoran Su,Yaofeng Desmond Zhong,Biswadip Dey,Amit Chakraborty
机构:Siemens Corporation, Technology
摘要:应急车辆(EMV)在响应城市地区的医疗急救和火灾等时间紧迫事件方面发挥着关键作用。EMV在交通中花费的行驶时间越短,就越有可能拯救生命并减少财产损失。为了减少EMV的行驶时间,先前的工作采用了基于历史交通流数据的路线优化和基于最优路线的交通信号抢占。然而,交通信号抢占会动态地改变交通流,进而改变EMV的最优路线。此外,交通信号抢占通常会导致交通流受到严重干扰,从而增加非应急车辆的行驶时间。在本文中,我们提出了EMVLight,一个用于同时进行动态路由和交通信号控制的分散式强化学习(RL)框架。EMVLight扩展了Dijkstra算法,在EMV穿行交通网络时实时高效地更新其最优路线。分散的RL智能体学习网络级的协同交通信号相位策略,不仅减少了EMV的行驶时间,还减少了网络中非应急车辆的平均行驶时间。这一优势已通过在合成地图和真实地图上的综合实验得到证明。这些实验表明,EMVLight优于基准交通工程技术和现有的基于RL的信号控制方法。
摘要:Emergency vehicles (EMVs) play a crucial role in responding to time-critical
events such as medical emergencies and fire outbreaks in an urban area. The
less time EMVs spend traveling through the traffic, the more likely it would
help save people's lives and reduce property loss. To reduce the travel time of
EMVs, prior work has used route optimization based on historical traffic-flow
data and traffic signal pre-emption based on the optimal route. However,
traffic signal pre-emption dynamically changes the traffic flow which, in turn,
modifies the optimal route of an EMV. In addition, traffic signal pre-emption
practices usually lead to significant disturbances in traffic flow and
subsequently increase the travel time for non-EMVs. In this paper, we propose
EMVLight, a decentralized reinforcement learning (RL) framework for
simultaneous dynamic routing and traffic signal control. EMVLight extends
Dijkstra's algorithm to efficiently update the optimal route for the EMVs in
real time as it travels through the traffic network. The decentralized RL
agents learn network-level cooperative traffic signal phase strategies that not
only reduce EMV travel time but also reduce the average travel time of non-EMVs
in the network. This benefit has been demonstrated through comprehensive
experiments with synthetic and real-world maps. These experiments show that
EMVLight outperforms benchmark transportation engineering techniques and
existing RL-based signal control methods.
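EMVLight的路由部分建立在Dijkstra算法之上:信号控制改变交通流后,用更新的边权(旅行时间)重算最短路即可得到EMV的实时最优路线。下面是标准Dijkstra的Python草图(论文对算法的具体扩展细节此处从略):

import heapq

def dijkstra(graph, src, dst):
    # graph[u] = [(v, 旅行时间), ...];返回最短路径及其总旅行时间
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue            # 跳过过期的队列项
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1], dist[dst]

g = {"A": [("B", 2), ("C", 5)], "B": [("C", 1), ("D", 4)], "C": [("D", 1)]}
print(dijkstra(g, "A", "D"))   # (['A', 'B', 'C', 'D'], 4.0)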
【8】 Optimizing a domestic battery and solar photovoltaic system with deep reinforcement learning
标题:基于深度强化学习的家用电池和太阳能光伏系统优化
链接:https://arxiv.org/abs/2109.05024
作者:Alexander J. M. Kell,A. Stephen McGough,Matthew Forshaw
机构:Sustainable Gas Institute, Imperial College London, London, UK, School of Computing, Newcastle University, Newcastle-upon-Tyne, UK
备注:arXiv admin note: text overlap with arXiv:2011.04079
摘要:电池和太阳能光伏系统成本的降低使得太阳能电池家用系统得到大量采用。在这项工作中,我们使用深度确定性策略梯度(DDPG)算法来优化此类系统中电池的充放电行为。我们的方法在给电池充放电时输出连续动作空间,并且能在随机环境中良好工作。在一年中选定的几周内,对于大容量电池,该算法可将单个家庭的电费支出降低到约1澳元,显示了其良好性能。
摘要:A lowering in the cost of batteries and solar PV systems has led to a high
uptake of solar battery home systems. In this work, we use the deep
deterministic policy gradient algorithm to optimise the charging and
discharging behaviour of a battery within such a system. Our approach outputs a
continuous action space when it charges and discharges the battery, and can
function well in a stochastic environment. We show good performance of this
algorithm by lowering the expenditure of a single household on electricity to
almost \$1AUD for large batteries across selected weeks within a year.
符号|符号学习(1篇)
【1】 Neuro-Symbolic AI: An Emerging Class of AI Workloads and their Characterization
标题:神经符号型人工智能:一类新兴的人工智能工作负荷及其表征
链接:https://arxiv.org/abs/2109.06133
作者:Zachary Susskind,Bryce Arden,Lizy K. John,Patrick Stockton,Eugene B. John
机构:Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, The University of Texas at San Antonio, San Antonio, Texas
备注:11 pages, 7 figures
摘要:神经符号人工智能是人工智能研究的一个新领域,旨在将传统的基于规则的人工智能方法与现代深度学习技术相结合。神经符号模型已经证明在图像和视频推理等领域优于最先进的深度学习模型。与传统模型相比,它们还可以在显著减少训练数据的情况下获得高精度。由于该领域的出现较晚,且发表的结果相对稀少,这些模型的性能特征尚未得到很好的理解。在本文中,我们描述和分析了三种最新的神经符号模型的性能特征。我们发现,由于复杂的控制流和低运算强度的操作(如标量乘法和张量加法),符号模型比传统的神经模型具有更少的潜在并行性。然而,在它们明显可分离的情况下,计算的神经方面支配着符号部分。我们还发现,数据移动构成了一个潜在的瓶颈,就像在许多ML工作负载中一样。
摘要:Neuro-symbolic artificial intelligence is a novel area of AI research which
seeks to combine traditional rules-based AI approaches with modern deep
learning techniques. Neuro-symbolic models have already demonstrated the
capability to outperform state-of-the-art deep learning models in domains such
as image and video reasoning. They have also been shown to obtain high accuracy
with significantly less training data than traditional models. Due to the
recency of the field's emergence and relative sparsity of published results,
the performance characteristics of these models are not well understood. In
this paper, we describe and analyze the performance characteristics of three
recent neuro-symbolic models. We find that symbolic models have less potential
parallelism than traditional neural models due to complex control flow and
low-operational-intensity operations, such as scalar multiplication and tensor
addition. However, the neural aspect of computation dominates the symbolic part
in cases where they are clearly separable. We also find that data movement
poses a potential bottleneck, as it does in many ML workloads.
分层学习(1篇)
【1】 Zeroth-order non-convex learning via hierarchical dual averaging
标题:基于分层对偶平均的零阶非凸学习
链接:https://arxiv.org/abs/2109.05829
作者:Amélie Héliou,Matthieu Martin,Panayotis Mertikopoulos,Thibaud Rahier
备注:40 pages, 14 figures
摘要:我们提出了一种用于零阶在线非凸优化的分层对偶平均方法,即这样一类学习过程:在每个阶段,优化器面对一个未知的非凸损失函数,并且只接收所产生的损失作为反馈。所提出的这类策略依赖于构建一个在损失信息到达时对其进行聚合的在线模型,它由两个主要组成部分构成:(a)一个适配Fisher信息度量(而非环境空间的度量范数)的正则化子;以及(b)基于自适应分层调度、对问题状态空间的原则性探索。这种构造能够更精细地控制模型的偏差和方差,并使我们能够为学习者的静态和动态遗憾(即在整个博弈时域内,相对于事后最优动态策略所产生的遗憾)推导出紧的界。
摘要:We propose a hierarchical version of dual averaging for zeroth-order online
non-convex optimization - i.e., learning processes where, at each stage, the
optimizer is facing an unknown non-convex loss function and only receives the
incurred loss as feedback. The proposed class of policies relies on the
construction of an online model that aggregates loss information as it arrives,
and it consists of two principal components: (a) a regularizer adapted to the
Fisher information metric (as opposed to the metric norm of the ambient space);
and (b) a principled exploration of the problem's state space based on an
adapted hierarchical schedule. This construction enables sharper control of the
model's bias and variance, and allows us to derive tight bounds for both the
learner's static and dynamic regret - i.e., the regret incurred against the
best dynamic policy in hindsight over the horizon of play.
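For intuition, a minimal zeroth-order dual-averaging loop with a one-point gradient estimate; this is a simplified stand-in (Euclidean regularizer on a box, no Fisher-metric geometry or hierarchical schedule) rather than the paper's actual policy:

```python
import numpy as np

# Zeroth-order dual averaging with a one-point gradient estimate, heavily
# simplified: Euclidean regularizer over a box (not the paper's
# Fisher-metric one) and no hierarchical exploration schedule.
rng = np.random.default_rng(0)
d, T, delta, eta = 5, 20000, 0.1, 0.01
f = lambda x: np.sin(x).sum() + 0.5 * (x ** 2).sum()   # unknown to the learner

g_sum = np.zeros(d)                       # aggregated gradient estimates
x = np.zeros(d)
for t in range(1, T + 1):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    loss = f(x + delta * u)               # only the incurred loss is observed
    g_sum += (d / delta) * loss * u       # high-variance one-point estimate
    # dual-averaging step; for a box domain the argmin is an elementwise clip
    x = np.clip(-eta / np.sqrt(t) * g_sum, -2.0, 2.0)
print("final point:", x.round(2), " final loss:", round(float(f(x)), 3))
```

The one-point estimator is unbiased only for a smoothed surrogate of f and has high variance, which is why the step size must decay and why the paper's bias/variance control machinery matters.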
医学相关(4篇)
【1】 CoviHawkes: Temporal Point Process and Deep Learning based Covid-19 forecasting for India
标题:CoviHawkes:基于时间点过程和深度学习的印度新冠肺炎预测
链接:https://arxiv.org/abs/2109.06056
作者:Ambedkar Dukkipati,Tony Gracious,Shubham Gupta
机构:Department of Computer Science and Automation, Indian Institute of Science, Bangalore, INDIA.
摘要:封锁是遏制大流行蔓延的最有效措施之一。不幸的是,它们给民众带来了沉重的经济和情感损失,而这往往比封锁本身还要持久。这篇文章支持"局部"封锁,即针对当前爆发疫情地区的封锁。我们提出了一种基于时间点过程的机器学习工具CoviHawkes,用于预测印度国家、州和地区层面的每日新冠肺炎病例数。我们的短期预测($<30$天)可能有助于决策者确定必须主动实施局部封锁以阻止病毒传播的地区。我们的长期预测(最多几个月)模拟了在各种封锁条件下大流行的进展,从而为印度潜在的第三波病例提供了一个有噪声的指标。大量的实验结果验证了我们的工具在各个级别的性能。
摘要:Lockdowns are one of the most effective measures for containing the spread of
a pandemic. Unfortunately, they involve a heavy financial and emotional toll on
the population that often outlasts the lockdown itself. This article argues in
favor of ``local'' lockdowns, which are lockdowns focused on regions currently
experiencing an outbreak. We propose a machine learning tool called CoviHawkes
based on temporal point processes, called CoviHawkes that predicts the daily
case counts for Covid-19 in India at the national, state, and district levels.
Our short-term predictions ($<30$ days) may be helpful for policymakers in
identifying regions where a local lockdown must be proactively imposed to
arrest the spread of the virus. Our long-term predictions (up to a few months)
simulate the progression of the pandemic under various lockdown conditions,
thereby providing a noisy indicator for a potential third wave of cases in
India. Extensive experimental results validate the performance of our tool at
all levels.
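For illustration, a minimal discrete-time, exponentially-decaying Hawkes intensity rolled forward for a 7-day forecast; the parameters and the deterministic rollout are invented for the sketch, not CoviHawkes itself:

```python
import numpy as np

# Discrete-time, exponentially-decaying Hawkes intensity: past case counts
# self-excite future ones on top of a base rate mu. Parameters and the
# deterministic rollout are invented for the sketch.
def hawkes_intensity(counts, mu=10.0, alpha=0.8, beta=0.3):
    t = len(counts)
    lags = t - np.arange(t)                       # days since each past count
    return mu + alpha * np.sum(counts * beta * np.exp(-beta * lags))

history = np.array([120, 150, 170, 160, 200, 240, 260], dtype=float)
counts, forecast = history.copy(), []
for _ in range(7):                                # 7-day-ahead rollout
    lam = hawkes_intensity(counts)
    forecast.append(lam)                          # expected count = intensity
    counts = np.append(counts, lam)
print(np.round(forecast, 1))
```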
【2】 DeepPyram: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos
标题:DeepPyram:在白内障手术视频中启用金字塔视图和可变形金字塔接收进行语义分割
链接:https://arxiv.org/abs/2109.05352
作者:Negin Ghamsarian,Mario Taschwer,Klaus Schoeffmann
机构: Klagenfurt University
备注:12 pages, 10 figures
摘要:语义分割在白内障手术中有着广泛的应用,有助于提高手术效果和降低临床风险。然而,在划分不同的相关实例时存在不同的问题,这使得设计一个单一通用网络非常具有挑战性。本文提出了一种称为DeepPyram的语义分割网络,该网络在分割白内障手术视频中存在各种问题的相关对象时可以获得优异的性能。这种优势主要来源于三个模块:(i)金字塔视图融合,它提供以输入卷积特征图中每个像素位置为中心的周围区域的可变角度全局视图;(ii)可变形金字塔接收,其能够实现可适应感兴趣对象几何变换的宽可变形感受野;(iii)金字塔损失,自适应地监督多尺度语义特征映射。这些模块可以有效地提高语义分割性能,特别是在对象具有透明性、可变形性、可伸缩性和钝边的情况下。该方法使用四个白内障手术数据集对具有不同上下文特征的对象进行评估,并与十三个最先进的分割网络进行比较。实验结果证实,DeepPyram在不增加额外可训练参数的情况下优于其他方法。我们的综合消融研究进一步证明了所提出模块的有效性。
摘要:Semantic segmentation in cataract surgery has a wide range of applications
contributing to surgical outcome enhancement and clinical risk reduction.
However, the varying issues in segmenting the different relevant instances make
the designation of a unique network quite challenging. This paper proposes a
semantic segmentation network termed as DeepPyram that can achieve superior
performance in segmenting relevant objects in cataract surgery videos with
varying issues. This superiority mainly originates from three modules: (i)
Pyramid View Fusion, which provides a varying-angle global view of the
surrounding region centering at each pixel position in the input convolutional
feature map; (ii) Deformable Pyramid Reception, which enables a wide deformable
receptive field that can adapt to geometric transformations in the object of
interest; and (iii) Pyramid Loss that adaptively supervises multi-scale
semantic feature maps. These modules can effectively boost semantic
segmentation performance, especially in the case of transparency,
deformability, scalability, and blunt edges in objects. The proposed approach
is evaluated using four datasets of cataract surgery for objects with different
contextual features and compared with thirteen state-of-the-art segmentation
networks. The experimental results confirm that DeepPyram outperforms the rival
approaches without imposing additional trainable parameters. Our comprehensive
ablation study further proves the effectiveness of the proposed modules.
【3】 Global and Local Interpretation of black-box Machine Learning models to determine prognostic factors from early COVID-19 data
标题:从早期冠状病毒数据确定预后因子的黑盒机器学习模型的全局和局部解释
链接:https://arxiv.org/abs/2109.05087
作者:Ananya Jana,Carlos D. Minacapelli,Vinod Rustgi,Dimitris Metaxas
机构:Dept. of Computer Science, Rutgers University, New Jersey, USA, Dept. of Medicine, Division of Gastroenterology and Hepatology, Rutgers Robert Wood, Johnson Medical School, New Jersey, USA
备注:accepted by SIPAIM 2021, code repository: this https URL
摘要:截至2021年7月24日,新冠病毒已夺走410万人的生命。各种机器学习模型已被应用于相关数据,以预测疾病的严重程度、感染率等重要因素,并发现重要的预后因素。由于缺乏方法的可解释性,使用这些技术所得结果的有用性通常会降低。在机器学习模型的可解释性方面取得的一些最新进展有可能在使用传统机器学习模型时揭示更多的见解。在这项工作中,我们使用一些流行的机器学习模型来分析新冠病毒-19血样数据;然后,我们采用最先进的事后局部可解释性技术(如SHAP、LIME)和全局可解释性技术(如符号元建模)对经过训练的黑盒模型进行分析,得出可解释的结论。在机器学习算法的范围内,回归仍然是最简单和最可解释的模型之一,具有明确的数学公式。我们探索了一种称为符号元建模的最新技术,以找到新冠病毒19的机器学习模型的数学表达式。我们确定急性肾损伤(AKI)、初始白蛋白水平(ALBI)、天冬氨酸转氨酶(ASTI)、总胆红素初始值(TBILI)和D-二聚体初始值(二聚体)是疾病严重程度的主要预后因素。我们的贡献是:(i)揭示新冠病毒-19严重性预测任务黑盒模型的基本数学表达式(ii)我们是第一个将符号元建模应用于该任务的人,以及(iii)发现重要特征和特征交互作用。
摘要:The COVID-19 corona virus has claimed 4.1 million lives, as of July 24, 2021.
A variety of machine learning models have been applied to related data to
predict important factors such as the severity of the disease, infection rate
and discover important prognostic factors. Often the usefulness of the findings
from the use of these techniques is reduced due to lack of method
interpretability. Some recent progress made on the interpretability of machine
learning models has the potential to unravel more insights while using
conventional machine learning models. In this work, we analyze COVID-19 blood
work data with some of the popular machine learning models; then we employ
state-of-the-art post-hoc local interpretability techniques (e.g., SHAP, LIME),
and global interpretability techniques (e.g., symbolic metamodeling) to the
trained black-box models to draw interpretable conclusions. In the gamut of
machine learning algorithms, regressions remain one of the simplest and most
explainable models with clear mathematical formulation. We explore one of the
most recent techniques called symbolic metamodeling to find the mathematical
expression of the machine learning models for COVID-19. We identify Acute
Kidney Injury (AKI), initial Albumin level (ALBI), Aspartate aminotransferase
(ASTI), Total Bilirubin initial (TBILI), and D-Dimer initial (DIMER) as major
prognostic factors of the disease severity. Our contributions are: (i) we
uncover the underlying mathematical expression for the black-box models on the
COVID-19 severity prediction task, (ii) we are the first to apply symbolic
metamodeling to this task, and (iii) we discover important features and feature
interactions.
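A hedged sketch of the post-hoc pipeline described above: train a black-box model, then extract local attributions with SHAP (LIME or symbolic metamodeling would slot into the same place). Synthetic features stand in for the blood-work variables, and the shap package is assumed to be installed:

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Train a black-box model on synthetic stand-ins for the blood-work
# features, then compute local SHAP attributions and a global ranking.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                 # e.g. AKI, ALBI, ASTI, TBILI, DIMER
y = (X[:, 0] + 0.5 * X[:, 1] * X[:, 2] > 0).astype(int)   # toy severity label

model = GradientBoostingClassifier().fit(X, y)
explainer = shap.TreeExplainer(model)
local_attr = explainer.shap_values(X[:10])    # per-patient, per-feature effects
print("global importance:", np.abs(local_attr).mean(axis=0).round(3))
```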
【4】 Co-Correcting: Noise-tolerant Medical Image Classification via mutual Label Correction
标题:协同校正:基于互标签校正的抗噪声医学图像分类
链接:https://arxiv.org/abs/2109.05159
作者:Jiarun Liu,Ruirui Li,Chuan Sun
机构:Beijing University of Chemical Technology; Department of Ophthalmology
备注:IEEE Transactions on Medical Imaging 2021
摘要:随着深度学习的发展,医学图像分类得到了显著的提高。然而,深度学习需要有标签的海量数据。虽然人类专家对样本进行标记既昂贵又耗时,但从众包中收集标签会受到噪声的影响,这可能会降低分类器的准确性。因此,人们迫切需要能够有效处理标签噪声的方法。不幸的是,在深度学习中处理标签噪声的最新进展在很大程度上没有被医学图像所注意到。为了填补这一空白,本文提出了一种抗噪医学图像分类框架Co-correction,该框架通过双网络互学习、标签概率估计和课程标签校正,显著提高了分类精度,获得了更准确的标签。在两个具有代表性的医学图像数据集和MNIST数据集上,我们测试了六种使用噪声标签方法的最新学习,并进行了比较研究。实验表明,在不同的噪声比下,协同校正在不同的任务中都能达到最佳的精度和泛化能力。我们的项目可在以下网址找到:https://github.com/JiarunLiu/Co-Correcting.
摘要:With the development of deep learning, medical image classification has been
significantly improved. However, deep learning requires massive data with
labels. While labeling the samples by human experts is expensive and
time-consuming, collecting labels from crowd-sourcing suffers from the noises
which may degenerate the accuracy of classifiers. Therefore, approaches that
can effectively handle label noises are highly desired. Unfortunately, recent
progress on handling label noise in deep learning has gone largely unnoticed by
the medical image. To fill the gap, this paper proposes a noise-tolerant
medical image classification framework named Co-Correcting, which significantly
improves classification accuracy and obtains more accurate labels through
dual-network mutual learning, label probability estimation, and curriculum
label correcting. On two representative medical image datasets and the MNIST
dataset, we test six latest Learning-with-Noisy-Labels methods and conduct
comparative studies. The experiments show that Co-Correcting achieves the best
accuracy and generalization under different noise ratios in various tasks. Our
project can be found at: https://github.com/JiarunLiu/Co-Correcting.
蒸馏|知识提取(1篇)
【1】 On the Efficiency of Subclass Knowledge Distillation in Classification Tasks
标题:分类任务中的子类知识提炼效率研究
链接:https://arxiv.org/abs/2109.05587
作者:Ahmad Sajedi,Konstantinos N. Plataniotis
机构:The Edward S. Rogers Sr. Department of Electrical & Computer Engineering, University of Toronto
摘要:这项工作为分类任务引入了一个新的知识提取框架,其中现有子类的信息是可用的并被考虑在内。在具有少量类或二进制检测(两类)的分类任务中,从教师到学生网络传输的信息量受到限制,从而限制了知识提取的效用。通过利用有关分类任务中可用类中可能的子类的信息,可以提高性能。为此,我们提出了所谓的子类知识提取(SKD)框架,即将子类的预测知识从大型教师模型转移到小型学生模型的过程。通过SKD,教师课堂日志中没有但存在于子类中的其他有意义信息(例如,课堂内的相似性)将传达给学生,并提高其表现。在数学上,我们测量教师通过SKD框架可以为学生提供多少额外信息位。该框架在临床应用中进行了评估,即大肠息肉二元分类。在此应用程序中,临床医生提供的注释用于根据注释标签在课程学习风格中的可变性定义子类。使用该框架训练的轻量级、低复杂性学生的F1成绩为85.05%,分别比不使用和使用常规知识提取训练的学生提高2.14%和1.49%。这些结果表明,额外的子类知识(即,在我们的实验中,每个训练样本0.4656个标签位)可以提供更多关于教师泛化的信息,因此SKD可以从使用更多信息来提高学生成绩中受益。
摘要:This work introduces a novel knowledge distillation framework for
classification tasks where information on existing subclasses is available and
taken into consideration. In classification tasks with a small number of
classes or binary detection (two classes) the amount of information transferred
from the teacher to the student network is restricted, thus limiting the
utility of knowledge distillation. Performance can be improved by leveraging
information about possible subclasses within the available classes in the
classification task. To that end, we propose the so-called Subclass Knowledge
Distillation (SKD) framework, which is the process of transferring the
subclasses' prediction knowledge from a large teacher model into a smaller
student one. Through SKD, additional meaningful information which is not in the
teacher's class logits but exists in subclasses (e.g., similarities inside
classes) will be conveyed to the student and boost its performance.
Mathematically, we measure how many extra information bits the teacher can
provide for the student via SKD framework. The framework developed is evaluated
in clinical application, namely colorectal polyp binary classification. In this
application, clinician-provided annotations are used to define subclasses based
on the annotation label's variability in a curriculum style of learning. A
lightweight, low complexity student trained with the proposed framework
achieves an F1-score of 85.05%, an improvement of 2.14% and 1.49% gain over the
student that trains without and with conventional knowledge distillation,
respectively. These results show that the extra subclasses' knowledge (i.e.,
0.4656 label bits per training sample in our experiment) can provide more
information about the teacher generalization, and therefore SKD can benefit
from using more information to increase the student performance.
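One plausible reading of the SKD objective, sketched in PyTorch: the student distills the teacher's subclass distribution while the hard loss stays on the binary superclass label. The loss weighting and the logsumexp collapse from subclass to class logits are illustrative assumptions, not the authors' exact formulation:

```python
import torch
import torch.nn.functional as F

# Sketch of a subclass-KD objective: soft KL matching on subclass logits
# plus a hard cross-entropy on the (super)class label obtained by a
# logsumexp collapse. Weighting and collapse rule are assumptions, not the
# authors' exact loss.
def skd_loss(student_sub, teacher_sub, labels, sub_to_class, T=4.0, lam=0.5):
    kd = F.kl_div(F.log_softmax(student_sub / T, dim=1),
                  F.softmax(teacher_sub / T, dim=1),
                  reduction="batchmean") * T * T
    n_classes = int(sub_to_class.max()) + 1
    class_logits = torch.stack([student_sub[:, sub_to_class == c].logsumexp(dim=1)
                                for c in range(n_classes)], dim=1)
    return lam * F.cross_entropy(class_logits, labels) + (1 - lam) * kd

sub_to_class = torch.tensor([0, 0, 1, 1])     # 4 subclasses -> 2 classes
s, t = torch.randn(8, 4), torch.randn(8, 4)
labels = torch.randint(0, 2, (8,))
print(skd_loss(s, t, labels, sub_to_class))
```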
推荐(4篇)
【1】 Application of Machine Learning in Early Recommendation of Cardiac Resynchronization Therapy
标题:机器学习在心脏再同步治疗早期推荐中的应用
链接:https://arxiv.org/abs/2109.06139
作者:Brendan E. Odigwe,Francis G. Spinale,Homayoun Valafar
机构:Computer Science & Engineering, U. of South Carolina, Columbia, SC, USA, School of Medicine
备注:10 Pages, 8 Figures, 4 Tables. The 7th International Conference on Health Informatics & Medical Systems
摘要:心力衰竭(HF)是发病率、死亡率和医疗费用的主要原因。心衰时可发生心肌传导延长,设备驱动的方法,称为心脏再同步治疗(CRT),可改善左心室(LV)心肌传导模式。虽然CRT的功能益处已得到证实,但接受CRT的大部分心衰患者(30-50%)并未表现出足够的改善。此外,前瞻性地确定能从CRT中获益的HF患者仍然是一个临床挑战。因此,有效预测从CRT中获得功能性益处的心衰患者的策略具有重大的医学和社会经济意义。因此,我们使用分类HF患者的机器学习方法,即聚类分析、决策树和人工神经网络,来开发CRT后个体结果的预测模型。收集心衰患者CRT前后的临床、功能和生物标志物数据。LV容量减少的预期6个月终点被定义为CRT反应。使用这种方法(418名应答者,412名无应答者),每个人有56个参数,我们可以根据他们对CRT的反应对HF患者进行分类,成功率超过95%。我们已经证明,使用机器学习方法可以识别CRT阳性反应概率高(95%准确率)的HF患者,同样重要的是,识别那些不会从CRT中获得功能益处的HF患者。将这种方法发展成为一种临床算法,以协助HF患者使用CRT的临床决策,将有可能改善结果并降低医疗成本。
摘要:Heart failure (HF) is a leading cause of morbidity, mortality, and health
care costs. Prolonged conduction through the myocardium can occur with HF, and
a device-driven approach, termed cardiac resynchronization therapy (CRT), can
improve left ventricular (LV) myocardial conduction patterns. While a
functional benefit of CRT has been demonstrated, a large proportion of HF
patients (30-50%) receiving CRT do not show sufficient improvement. Moreover,
identifying HF patients that would benefit from CRT prospectively remains a
clinical challenge. Accordingly, strategies to effectively predict those HF
patients that would derive a functional benefit from CRT holds great medical
and socio-economic importance. Thus, we used machine learning methods of
classifying HF patients, namely Cluster Analysis, Decision Trees, and
Artificial neural networks, to develop predictive models of individual outcomes
following CRT. Clinical, functional, and biomarker data were collected in HF
patients before and following CRT. A prospective 6-month endpoint of a
reduction in LV volume was defined as a CRT response. Using this approach (418
responders, 412 non-responders), each with 56 parameters, we could classify HF
patients based on their response to CRT with more than 95% success. We have
demonstrated that using machine learning approaches can identify HF patients
with a high probability of a positive CRT response (95% accuracy), and of equal
importance, identify those HF patients that would not derive a functional
benefit from CRT. Developing this approach into a clinical algorithm to assist
in clinical decision-making regarding the use of CRT in HF patients would
potentially improve outcomes and reduce health care costs.
【2】 Correcting the User Feedback-Loop Bias for Recommendation Systems
标题:推荐系统中用户反馈循环偏差的校正
链接:https://arxiv.org/abs/2109.06037
作者:Weishen Pan,Sen Cui,Hongyi Wen,Kun Chen,Changshui Zhang,Fei Wang
机构:Institute for Artificial Intelligence, Tsinghua University (THUAI), State Key Lab of Intelligent Technologies and Systems, Beijing National Research Center for Information Science and Technology (BNRist), Department of Automation
摘要:在训练和评估具有明确反馈的推荐系统的数据中,选择偏差非常普遍。例如,用户倾向于对他们喜欢的项目进行评分。然而,当对与特定用户有关的项目进行评分时,大多数推荐算法往往过于依赖他/她的评分(反馈)历史记录。这在推荐系统中引入了隐式偏差,本文称之为用户反馈回路偏差。我们提出了一种系统和动态的方法来纠正这种偏见,并通过利用时态评级信息来获得更加多样化和客观的建议。具体地说,我们的方法包括一个深度学习组件,用于学习每个用户的动态评分历史嵌入,以估计用户顺序评分的项目的概率分布。然后将这些估计的动态暴露概率用作倾向评分,以训练反向倾向评分(IPS)评分预测因子。我们通过实证验证了现实世界推荐系统中用户反馈回路偏差的存在,并将我们的方法与基线模型的性能进行了比较,基线模型没有消除偏差,或者与其他方法估计的倾向分数进行了比较。结果表明了该方法的优越性。
摘要:Selection bias is prevalent in the data for training and evaluating
recommendation systems with explicit feedback. For example, users tend to rate
items they like. However, when rating an item concerning a specific user, most
of the recommendation algorithms tend to rely too much on his/her rating
(feedback) history. This introduces implicit bias on the recommendation system,
which is referred to as user feedback-loop bias in this paper. We propose a
systematic and dynamic way to correct such bias and to obtain more diverse and
objective recommendations by utilizing temporal rating information.
Specifically, our method includes a deep-learning component to learn each
user's dynamic rating history embedding for the estimation of the probability
distribution of the items that the user rates sequentially. These estimated
dynamic exposure probabilities are then used as propensity scores to train an
inverse-propensity-scoring (IPS) rating predictor. We empirically validated the
existence of such user feedback-loop bias in real world recommendation systems
and compared the performance of our method with the baseline models that are
either without de-biasing or with propensity scores estimated by other methods.
The results show the superiority of our approach.
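The IPS training step reduces to a weighted regression; a minimal matrix-factorization sketch where each observed rating is reweighted by its inverse estimated exposure propensity (propensities are made up here; in the paper they come from the learned dynamic rating-history model):

```python
import numpy as np

# Minimal IPS-weighted matrix-factorization sketch: each observed rating is
# reweighted by the inverse of its estimated exposure propensity, so items a
# user was unlikely to be exposed to count more.
rng = np.random.default_rng(0)
n_users, n_items, k, lr = 50, 40, 8, 0.02
U = 0.1 * rng.normal(size=(n_users, k))
V = 0.1 * rng.normal(size=(n_items, k))

obs = [(rng.integers(n_users), rng.integers(n_items),
        rng.uniform(1, 5), rng.uniform(0.1, 1.0))   # (user, item, rating, propensity)
       for _ in range(2000)]

for epoch in range(10):
    for u, i, r, p in obs:
        err = U[u] @ V[i] - r
        w = 1.0 / p                        # inverse-propensity weight
        gu, gv = w * err * V[i], w * err * U[u]
        U[u] -= lr * gu
        V[i] -= lr * gv

mse = np.mean([(1 / p) * (U[u] @ V[i] - r) ** 2 for u, i, r, p in obs])
print("IPS-weighted MSE:", round(float(mse), 3))
```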
【3】 FaiREO: User Group Fairness for Equality of Opportunity in Course Recommendation
标题:FaiREO:课程推荐中机会均等的用户组公平
链接:https://arxiv.org/abs/2109.05931
作者:Agoritsa Polyzou,Maria Kalantzi,George Karypis
机构:Department of Computer Science and Engineering, University of, Minnesota, Minneapolis, MN, USA., School of Computing and Information Sciences, Florida, International University, Miami, FL, USA.
备注:30 pages
摘要:对高等教育机构的学生来说,选课是一项挑战。现有的课程推荐系统向学生提供相关建议,帮助他们探索现有课程。推荐的课程会影响学生对学位课程的选择、未来的就业甚至他们的社会经济地位。本文的重点是识别和缓解课程推荐系统中可能存在的偏差。我们努力向所有学生群体提供平衡的机会。同时,我们需要向所有受保护群体提出高质量的建议。我们将我们的方法表述为一个多目标优化问题,并研究机会均等和质量之间的权衡。我们使用真实世界和合成数据集评估我们的方法。结果表明,我们可以显著提高关于机会平等的公平性,但会引入一些质量损失。在我们测试的四种方法中,GHC Inc和GHC Tabu是性能最好的方法,具有不同的优势特征。
摘要:Course selection is challenging for students in higher educational
institutions. Existing course recommendation systems make relevant suggestions
to the students and help them in exploring the available courses. The
recommended courses can influence students' choice of degree program, future
employment, and even their socioeconomic status. This paper focuses on
identifying and alleviating biases that might be present in a course
recommender system. We strive to promote balanced opportunities with our
suggestions to all groups of students. At the same time, we need to make
recommendations of good quality to all protected groups. We formulate our
approach as a multi-objective optimization problem and study the trade-offs
between equal opportunity and quality. We evaluate our methods using both
real-world and synthetic datasets. The results indicate that we can
considerably improve fairness regarding equality of opportunity, but we will
introduce some quality loss. Out of the four methods we tested, GHC-Inc and
GHC-Tabu are the best performing ones with different advantageous
characteristics.
【4】 Existence conditions for hidden feedback loops in online recommender systems
标题:在线推荐系统中隐藏反馈环存在的条件
链接:https://arxiv.org/abs/2109.05278
作者:Anton S. Khritankov,Anton A. Pilkevich
机构:Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russian Federation
备注:6 pages, 3 figures
摘要:我们探讨了在线推荐系统中隐藏的反馈回路效应。反馈回路导致在线多臂bandit(MAB)推荐退化为一小部分项目,并失去覆盖范围和新颖性。我们研究用户兴趣中的不确定性和噪声如何影响反馈回路的存在。首先,我们证明了用户兴趣中的无偏加性随机噪声不会阻止反馈回路。其次,我们证明了重置用户兴趣的非零概率足以限制反馈回路,并估计了该效应的大小。我们的实验在模拟环境中针对四种bandit算法证实了上述理论发现。
摘要:We explore a hidden feedback loops effect in online recommender systems.
Feedback loops result in degradation of online multi-armed bandit (MAB)
recommendations to a small subset and loss of coverage and novelty. We study
how uncertainty and noise in user interests influence the existence of feedback
loops. First, we show that an unbiased additive random noise in user interests
does not prevent a feedback loop. Second, we demonstrate that a non-zero
probability of resetting user interests is sufficient to limit the feedback
loop and estimate the size of the effect. Our experiments confirm the
theoretical findings in a simulated environment for four bandit algorithms.
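A toy simulation in the spirit of this setup: an epsilon-greedy recommender, user interests that inflate for shown items (the feedback loop), and a non-zero reset probability gamma; all dynamics and constants are illustrative:

```python
import numpy as np

# Toy version of the paper's setting: an epsilon-greedy bandit recommender,
# user interest that inflates for items it shows (the feedback loop), and a
# non-zero probability gamma of the interests being reset.
rng = np.random.default_rng(1)
n_items, T, eps, gamma = 20, 20000, 0.05, 0.01
interest = rng.uniform(size=n_items)       # ground-truth click probabilities
q = np.zeros(n_items)                      # bandit's value estimates
n = np.zeros(n_items)
shown = set()
for t in range(T):
    a = rng.integers(n_items) if rng.random() < eps else int(np.argmax(q))
    r = float(rng.random() < interest[a])
    n[a] += 1
    q[a] += (r - q[a]) / n[a]
    interest[a] = min(1.0, interest[a] + 0.001 * r)   # loop: shown items inflate
    if rng.random() < gamma:                          # interest reset
        interest = rng.uniform(size=n_items)
    shown.add(a)
print("item coverage:", len(shown) / n_items)
```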
自动驾驶|车辆|车道检测等(2篇)
【1】 Neural Network Guided Evolutionary Fuzzing for Finding Traffic Violations of Autonomous Vehicles
标题:用于发现自动驾驶车辆交通违规的神经网络引导进化模糊测试
链接:https://arxiv.org/abs/2109.06126
作者:Ziyuan Zhong,Gail Kaiser,Baishakhi Ray
机构:Columbia University, New York, United States
摘要:自动驾驶汽车和卡车,即自动驾驶车辆(AVs),在监管机构和公众对其安全性和可靠性有更高的信心之前,不应该被接受——这可以通过测试最实际、最令人信服地实现。但是,现有的测试方法不足以检查AV控制器的端到端行为,以及复杂的真实角落案例,这些案例涉及多个独立代理(如行人和人驾驶车辆)的交互。尽管在街道和高速公路上试驾AV无法捕捉到许多罕见事件,但现有的基于模拟的测试方法主要关注简单场景,对于需要对周围环境进行复杂感知的复杂驾驶情况,无法很好地进行扩展。为了解决这些限制,我们提出了一种新的模糊测试技术,称为AutoFuzz,它可以利用广泛使用的AV模拟器的API语法,生成语义和时间上有效的复杂驾驶场景(场景序列)。AutoFuzz由API语法上的约束神经网络(NN)进化搜索引导,以生成寻找唯一交通违规的场景。在一个最先进的基于学习的控制器和两个基于规则的控制器上对我们的原型进行的评估表明,AutoFuzz有效地发现了数百个类似于真实世界碰撞的真实交通违规。此外,使用AutoFuzz发现的交通违规对基于学习的控制器进行微调,成功地减少了新版本AV控制器软件中发现的交通违规。
摘要:Self-driving cars and trucks, autonomous vehicles (AVs), should not be
accepted by regulatory bodies and the public until they have much higher
confidence in their safety and reliability -- which can most practically and
convincingly be achieved by testing. But existing testing methods are
inadequate for checking the end-to-end behaviors of AV controllers against
complex, real-world corner cases involving interactions with multiple
independent agents such as pedestrians and human-driven vehicles. While
test-driving AVs on streets and highways fails to capture many rare events,
existing simulation-based testing methods mainly focus on simple scenarios and
do not scale well for complex driving situations that require sophisticated
awareness of the surroundings. To address these limitations, we propose a new
fuzz testing technique, called AutoFuzz, which can leverage widely-used AV
simulators' API grammars to generate semantically and temporally valid complex
driving scenarios (sequences of scenes). AutoFuzz is guided by a constrained
Neural Network (NN) evolutionary search over the API grammar to generate
scenarios seeking to find unique traffic violations. Evaluation of our
prototype on one state-of-the-art learning-based controller and two rule-based
controllers shows that AutoFuzz efficiently finds hundreds of realistic traffic
violations resembling real-world crashes. Further, fine-tuning the
learning-based controller with the traffic violations found by AutoFuzz
successfully reduced the traffic violations found in the new version of the AV
controller software.
【2】 Space Meets Time: Local Spacetime Neural Network For Traffic Flow Forecasting
标题:空间遇见时间:交通流量预测的局部时空神经网络
链接:https://arxiv.org/abs/2109.05225
作者:Song Yang,Jiamou Liu,Kaiqi Zhao
机构:School of Computer Science, The University of Auckland, Auckland, New Zealand
备注:Accepted by ICDM 2021
摘要:交通流预测是城市计算中的一项重要任务。由于交通流往往表现出内在和潜在的时空相关性,无法通过单独提取交通数据的空间和时间模式来识别这些相关性,因此出现了挑战。我们认为,这种相关性是普遍存在的,在交通流中起着关键作用。我们提出时空间隔学习作为一种范式,通过对时空特征的统一分析来明确地捕捉这些相关性。与仅限于特定道路网络的最新方法不同,我们对从城市到城市的通用时空相关性进行建模。为此,我们提出了一个新的时空间隔学习框架,该框架构建了一个交通传感器的本地时空上下文,该上下文包含来自邻近时间点的数据。基于这一思想,我们引入了时空神经网络(STNN),它采用新颖的时空卷积和注意机制来学习普遍的时空相关性。所提出的STNN捕获不依赖于特定网络结构的本地流量模式。因此,经过训练的STNN模型可以应用于任何看不见的交通网络。我们在两个公共真实交通数据集和一个动态网络模拟数据集上对所提出的STNN进行了评估。实验结果表明,与现有的预测方法相比,STNN不仅能将预测精度提高15%,而且能有效地处理交通网络发生动态变化的情况,并具有良好的泛化能力。
摘要:Traffic flow forecasting is a crucial task in urban computing. The challenge
arises as traffic flows often exhibit intrinsic and latent spatio-temporal
correlations that cannot be identified by extracting the spatial and temporal
patterns of traffic data separately. We argue that such correlations are
universal and play a pivotal role in traffic flow. We put forward spacetime
interval learning as a paradigm to explicitly capture these correlations
through a unified analysis of both spatial and temporal features. Unlike the
state-of-the-art methods, which are restricted to a particular road network, we
model the universal spatio-temporal correlations that are transferable from
cities to cities. To this end, we propose a new spacetime interval learning
framework that constructs a local-spacetime context of a traffic sensor
comprising the data from its neighbors within close time points. Based on this
idea, we introduce spacetime neural network (STNN), which employs novel
spacetime convolution and attention mechanism to learn the universal
spatio-temporal correlations. The proposed STNN captures local traffic
patterns, which does not depend on a specific network structure. As a result, a
trained STNN model can be applied on any unseen traffic networks. We evaluate
the proposed STNN on two public real-world traffic datasets and a simulated
dataset on dynamic networks. The experiment results show that STNN not only
improves prediction accuracy by 15% over state-of-the-art methods, but is also
effective in handling the case when the traffic network undergoes dynamic
changes as well as the superior generalization capability.
点云|SLAM|雷达|激光|深度RGBD相关(1篇)
【1】 RVMDE: Radar Validated Monocular Depth Estimation for Robotics
标题:RVMDE:雷达验证的机器人单目深度估计
链接:https://arxiv.org/abs/2109.05265
作者:Muhammad Ishfaq Hussain,Muhammad Aasim Rafique,Moongu Jeon
机构:School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology
摘要:立体显示是对场景中距离的自然感知,其在三维世界理解中的表现是一种直观现象。然而,双目视觉传感器固有的刚性校准对于精确的深度估计至关重要。或者,单目摄像机以牺牲深度估计的准确性为代价来减轻限制,并且在恶劣的环境条件下,这一挑战会加剧。此外,在恶劣环境中,光学传感器通常无法获取重要信号,而使用雷达,可以提供粗略但更精确的信号。这项工作探讨了在恶劣环境条件下,雷达粗信号与单目相机的细粒度数据融合用于深度估计的效用。特征金字塔网络(FPN)的一种变体以较少的参数在多个尺度上广泛地处理细粒度图像特征。FPN特征图与卷积神经网络提取的稀疏雷达特征融合。利用串联的层次特征对深度进行有序回归预测。我们在nuScenes数据集上进行了实验,提出的体系结构在定量评估中保持领先地位,参数减少,推理速度加快。深度估计结果表明,在机器人和自动驾驶汽车的关键应用中,所提出的技术可以作为立体深度估计的替代方法。源代码将在以下位置提供:\url{https://github.com/MI-Hussain/RVMDE}.
摘要:Stereoscopy exposits a natural perception of distance in a scene, and its
manifestation in 3D world understanding is an intuitive phenomenon. However, an
innate rigid calibration of binocular vision sensors is crucial for accurate
depth estimation. Alternatively, a monocular camera alleviates the limitation
at the expense of accuracy in estimating depth, and the challenge exacerbates
in harsh environmental conditions. Moreover, an optical sensor often fails to
acquire vital signals in harsh environments, and radar is used instead, which
gives coarse but more accurate signals. This work explores the utility of
coarse signals from radar when fused with fine-grained data from a monocular
camera for depth estimation in harsh environmental conditions. A variant of
feature pyramid network (FPN) extensively operates on fine-grained image
features at multiple scales with a fewer number of parameters. FPN feature maps
are fused with sparse radar features extracted with a Convolutional neural
network. The concatenated hierarchical features are used to predict the depth
with ordinal regression. We performed experiments on the nuScenes dataset, and
the proposed architecture stays on top in quantitative evaluations with reduced
parameters and faster inference. The depth estimation results suggest that the
proposed techniques can be used as an alternative to stereo depth estimation in
critical applications in robotics and self-driving cars. The source code will
be available in the following: \url{https://github.com/MI-Hussain/RVMDE}.
联邦学习|隐私保护|加密(7篇)
【1】 SignGuard: Byzantine-robust Federated Learning through Collaborative Malicious Gradient Filtering
标题:SignGuard:基于协作恶意梯度过滤的拜占庭鲁棒联合学习
链接:https://arxiv.org/abs/2109.05872
作者:Jian Xu,Shao-Lun Huang,Linqi Song,Tian Lan
机构:Tsinghua University,City University of Hong Kong,George Washington University
摘要:众所周知,联邦学习中基于梯度的训练容易受到错误/恶意工作节点的攻击,这些节点通常被建模为拜占庭客户端。以前的工作要么利用parameter server上的辅助数据来验证接收到的梯度,要么利用基于统计的方法来识别和删除拜占庭客户端中的恶意梯度。在本文中,我们承认辅助数据在实践中可能并不总是可用的,并将重点放在基于统计的方法上。然而,最近关于模型中毒攻击的研究表明,精心设计的攻击可以绕过大多数现有的基于中值和距离的统计防御方法,使得恶意梯度与诚实梯度难以区分。为了应对这一挑战,我们证明了梯度向量的元素符号可以为检测模型中毒攻击提供有价值的见解。基于我们对最新攻击的理论分析,我们提出了一种新的方法,通过协同恶意梯度过滤实现拜占庭式鲁棒联邦学习。更准确地说,首先对接收到的梯度进行处理,以生成相关的幅度、符号和相似性统计信息,然后由多个并行过滤器协同使用,以在最终聚合之前消除恶意梯度。我们进一步提供了SignGuard的理论分析,通过在非IID训练数据下选择适当的学习率来量化其收敛性。最后,对图像和文本分类任务(包括MNIST、Fashion MNIST、CIFAR-10和AG News)以及最近提出的攻击和防御策略进行了广泛的实验。数值结果表明了该方法的有效性和优越性。
摘要:Gradient-based training in federated learning is known to be vulnerable to
faulty/malicious worker nodes, which are often modeled as Byzantine clients.
Previous work either makes use of auxiliary data at parameter server to verify
the received gradients or leverages statistic-based methods to identify and
remove malicious gradients from Byzantine clients. In this paper, we
acknowledge that auxiliary data may not always be available in practice and
focus on the statistic-based approach. However, recent work on model poisoning
attacks have shown that well-crafted attacks can circumvent most of existing
median- and distance-based statistical defense methods, making malicious
gradients indistinguishable from honest ones. To tackle this challenge, we show
that the element-wise sign of gradient vector can provide valuable insight in
detecting model poisoning attacks. Based on our theoretical analysis of
state-of-the-art attack, we propose a novel approach, SignGuard, to
enable Byzantine-robust federated learning through collaborative malicious
gradient filtering. More precisely, the received gradients are first processed
to generate relevant magnitude, sign, and similarity statistics, which are then
collaboratively utilized by multiple, parallel filters to eliminate malicious
gradients before final aggregation. We further provide theoretical analysis of
SignGuard by quantifying its convergence with appropriate choice of learning
rate and under non-IID training data. Finally, extensive experiments of image
and text classification tasks - including MNIST, Fashion-MNIST, CIFAR-10, and
AG-News - are conducted together with recently proposed attacks and defense
strategies. The numerical results demonstrate the effectiveness and superiority
of our proposed approach.
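A minimal sketch of the filtering idea: derive per-client norm and sign statistics and keep only clients close to the majority before averaging. The thresholds are arbitrary placeholders rather than SignGuard's calibrated filters:

```python
import numpy as np

# Majority-based filtering of client gradients using norm and sign
# statistics, in the spirit of SignGuard; thresholds are placeholders.
def filter_and_aggregate(grads, norm_band=(0.3, 3.0), sign_tol=0.2):
    grads = np.asarray(grads)
    norms = np.linalg.norm(grads, axis=1)
    keep = (norms > norm_band[0] * np.median(norms)) & \
           (norms < norm_band[1] * np.median(norms))
    pos_frac = (grads > 0).mean(axis=1)           # sign statistic per client
    keep &= np.abs(pos_frac - np.median(pos_frac)) < sign_tol
    return grads[keep].mean(axis=0), keep

rng = np.random.default_rng(0)
honest = [rng.normal(0.1, 1.0, 1000) for _ in range(18)]
malicious = [-20 * g for g in honest[:2]]         # scaled sign-flip attackers
agg, kept = filter_and_aggregate(honest + malicious)
print("clients kept:", kept.astype(int))
```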
【2】 AMI-FML: A Privacy-Preserving Federated Machine Learning Framework for AMI
标题:AMI-FML:一种隐私保护的AMI联合机器学习框架
链接:https://arxiv.org/abs/2109.05666
作者:Milan Biswal,Abu Saleh Md Tayeen,Satyajayant Misra
机构:Computer Science Department, New Mexico State University, USA
备注:7 pages
摘要:基于机器学习(ML)的智能电表数据分析对于高级计量基础设施(AMI)中的能源管理和需求响应应用非常有前景。为AMI开发分布式ML应用程序的一个关键挑战是在允许最终用户积极参与的同时保护用户隐私。本文解决了这一挑战,并为AMI中的ML应用程序提出了一个保护隐私的联合学习框架。我们考虑每个智能仪表作为一个联合边缘设备托管一个ML应用程序,定期与中央集合体或数据集中器交换信息。ML模型权重不是传输智能电表感应到的原始数据,而是传输到聚合器以保护隐私。聚合器处理这些参数,以设计一个健壮的ML模型,该模型可在每个边缘设备上替换。我们还讨论了在共享ML模型参数的同时增强隐私和提高通信效率的策略,适用于AMI中相对较慢的网络连接。我们在一个用例联邦ML(FML)应用程序上演示了所提出的框架,该应用程序改进了短期负荷预测(STLF)。我们使用长短时记忆(LSTM)递归神经网络(RNN)模型进行STLF。在我们的架构中,我们假设有一个聚合器连接到一组智能电表。聚合器使用从联邦智能电表接收到的学习模型梯度生成聚合、鲁棒RNN模型,从而提高单个和聚合STLF的预测精度。我们的结果表明,使用FML,预测精度得到了提高,同时保护了最终用户的数据隐私。
摘要:Machine learning (ML) based smart meter data analytics is very promising for
energy management and demand-response applications in the advanced metering
infrastructure(AMI). A key challenge in developing distributed ML applications
for AMI is to preserve user privacy while allowing active end-users
participation. This paper addresses this challenge and proposes a
privacy-preserving federated learning framework for ML applications in the AMI.
We consider each smart meter as a federated edge device hosting an ML
application that exchanges information with a central aggregator or a data
concentrator, periodically. Instead of transferring the raw data sensed by the
smart meters, the ML model weights are transferred to the aggregator to
preserve privacy. The aggregator processes these parameters to devise a robust
ML model that can be substituted at each edge device. We also discuss
strategies to enhance privacy and improve communication efficiency while
sharing the ML model parameters, suited for relatively slow network connections
in the AMI. We demonstrate the proposed framework on a use case federated ML
(FML) application that improves short-term load forecasting (STLF). We use a
long short-term memory(LSTM) recurrent neural network (RNN) model for STLF. In
our architecture, we assume that there is an aggregator connected to a group of
smart meters. The aggregator uses the learned model gradients received from the
federated smart meters to generate an aggregate, robust RNN model which
improves the forecasting accuracy for individual and aggregated STLF. Our
results indicate that with FML, forecasting accuracy is increased while
preserving the data privacy of the end-users.
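The aggregation step itself, sketched as plain FedAvg over LSTM state dicts (the paper's privacy and communication-efficiency enhancements are omitted):

```python
import torch

# Plain FedAvg over the state dicts of identically-shaped LSTM forecasters;
# a sketch of the aggregation step only.
def federated_average(state_dicts, weights=None):
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    return {key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
            for key in state_dicts[0]}

model = torch.nn.LSTM(input_size=1, hidden_size=16, batch_first=True)
# stand-ins for three smart meters' locally trained weights
local_sds = [{k: v + 0.01 * torch.randn_like(v)
              for k, v in model.state_dict().items()} for _ in range(3)]
model.load_state_dict(federated_average(local_sds))
```

In practice the weights would be proportional to each meter's local sample count rather than uniform.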
【3】 FedFair: Training Fair Models In Cross-Silo Federated Learning
标题:FedFair:在跨孤岛联邦学习中训练公平模型
链接:https://arxiv.org/abs/2109.05662
作者:Lingyang Chu,Lanjun Wang,Yanjie Dong,Jian Pei,Zirui Zhou,Yong Zhang
机构:McMaster University, Hamilton, Canada; Huawei Technologies Co., Ltd., Beijing, China; Huawei Technologies Canada Co., Ltd., Burnaby, Canada; Simon Fraser University, Burnaby, Canada.
摘要:建立公平的机器学习模型变得越来越重要。由于许多强大的模型都是由多方协作建立的,各方都持有一些敏感数据,因此自然要探索在跨联盟学习中训练公平模型的可行性,以便同时充分尊重公平、隐私和协作。然而,这是一项非常具有挑战性的任务,因为在不知道参与方的私有数据的情况下准确估计模型的公平性绝非易事。在本文中,我们首先提出了一种联邦估计方法来准确估计模型的公平性,而不侵犯任何一方的数据隐私。然后,我们利用公平性估计提出了一个新的跨联盟学习中公平模型的训练问题。我们开发了FedFair,这是一个设计良好的联邦学习框架,它可以成功地训练出一个公平的模型,具有高性能,而不会侵犯数据隐私。我们在三个真实数据集上的大量实验表明,我们的方法具有良好的公平模型训练性能。
摘要:Building fair machine learning models becomes more and more important. As
many powerful models are built by collaboration among multiple parties, each
holding some sensitive data, it is natural to explore the feasibility of
training fair models in cross-silo federated learning so that fairness, privacy
and collaboration can be fully respected simultaneously. However, it is a very
challenging task, since it is far from trivial to accurately estimate the
fairness of a model without knowing the private data of the participating
parties. In this paper, we first propose a federated estimation method to
accurately estimate the fairness of a model without infringing the data privacy
of any party. Then, we use the fairness estimation to formulate a novel problem
of training fair models in cross-silo federated learning. We develop FedFair, a
well-designed federated learning framework, which can successfully train a fair
model with high performance without any data privacy infringement. Our
extensive experiments on three real-world data sets demonstrate the excellent
fair model training performance of our method.
【4】 Critical Learning Periods in Federated Learning
标题:联合学习中的关键学习阶段
链接:https://arxiv.org/abs/2109.05613
作者:Gang Yan,Hao Wang,Jian Li
机构:SUNY-Binghamton University, Binghamton, NY , Louisiana State University, Baton Rouge, LA
摘要:联邦学习(FL)是一种使用分散数据训练机器学习(ML)模型的流行技术。大量工作研究了全局模型的性能;然而,目前尚不清楚训练过程如何影响最终测试的准确性。使这一问题更加严重的是,FL的执行与传统ML有着显著的不同:它具有跨客户端的异构数据特征,并涉及更多的超参数。在这项工作中,我们表明,FL的最终测试精度受到训练过程早期阶段的显著影响,即FL表现出关键学习期,在此期间,小的梯度误差可能对最终测试精度产生不可恢复的影响。为了进一步解释这一现象,我们将Fisher信息矩阵(FIM)的迹推广到FL,并定义了一个称为FedFIM的新概念,该量反映了每个客户端从FL训练开始时的局部曲率。我们的发现表明,初始学习阶段在理解FL性能方面起着至关重要的作用。这与许多现有工作形成对比,这些工作通常不会将FL的最终精度与早期训练联系起来。最后,把握FL中的关键学习期本身就具有独立的意义,并有助于解决其他问题,如超参数的选择(每轮选择的客户端数量、批量大小等),从而提高FL训练和测试的性能。
摘要:Federated learning (FL) is a popular technique to train machine learning (ML)
models with decentralized data. Extensive works have studied the performance of
the global model; however, it is still unclear how the training process affects
the final test accuracy. Exacerbating this problem is the fact that FL
executions differ significantly from traditional ML with heterogeneous data
characteristics across clients, involving more hyperparameters. In this work,
we show that the final test accuracy of FL is dramatically affected by the
early phase of the training process, i.e., FL exhibits critical learning
periods, in which small gradient errors can have irrecoverable impact on the
final test accuracy. To further explain this phenomenon, we generalize the
trace of the Fisher Information Matrix (FIM) to FL and define a new notion
called FedFIM, a quantity reflecting the local curvature of each clients from
the beginning of the training in FL. Our findings suggest that the initial
learning phase plays a critical role in understanding the FL performance. This
is in contrast to many existing works which generally do not connect the final
accuracy of FL to the early phase training. Finally, seizing critical learning
periods in FL is of independent interest and could be useful for other problems
such as the choices of hyperparameters such as the number of client selected
per round, batch size, and more, so as to improve the performance of FL
training and testing.
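A crude sketch of a FedFIM-flavoured quantity: approximating the trace of the empirical Fisher Information Matrix on one client's batch by the squared norm of the loss gradient. The paper's exact estimator and per-round aggregation are not reproduced here:

```python
import torch

# Crude FedFIM-style diagnostic. (The empirical Fisher proper sums
# per-sample squared gradients; this single-batch version only sketches
# the idea of measuring local curvature early in training.)
def fim_trace_estimate(model, loss_fn, x, y):
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return sum((p.grad ** 2).sum().item()
               for p in model.parameters() if p.grad is not None)

model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 2))
x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
print("client curvature proxy:",
      fim_trace_estimate(model, torch.nn.CrossEntropyLoss(), x, y))
```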
【5】 Cost-Effective Federated Learning in Mobile Edge Networks
标题:移动边缘网络中的高性价比联合学习
链接:https://arxiv.org/abs/2109.05411
作者:Bing Luo,Xiang Li,Shiqiang Wang,Jianwei Huang,Leandros Tassiulas
备注:Accepted in IEEE JSAC Special Issue on Distributed Learning over Wireless Edge Networks. arXiv admin note: substantial text overlap with arXiv:2012.08336
摘要:联邦学习(FL)是一种分布式学习范式,它使大量移动设备能够在中央服务器的协调下协作学习模型,而无需共享其原始数据。尽管具有实际的效率和有效性,但设备上的迭代学习过程(例如,本地计算和与服务器的全局通信)在学习时间和能耗方面会产生相当大的成本,这在很大程度上取决于所选客户机的数量和每轮训练中的本地迭代次数。在本文中,我们分析了如何在移动边缘网络中设计自适应FL,从而在保证收敛性的同时,优化选择这些基本控制变量以最小化总成本。建立了总成本与控制变量之间的解析关系,并给出了收敛上界。为了有效地解决成本最小化问题,我们开发了一种基于低成本采样的算法来学习与收敛相关的未知参数。我们导出了重要的解决方案属性,这些属性有效地确定了不同优化指标的设计原则。实际上,我们在模拟环境和硬件原型上评估了我们的理论结果。实验证据验证了我们导出的属性,并证明我们提出的解决方案在不同数据集、异构系统和统计设置的不同优化指标下实现了接近最优的性能。
摘要:Federated learning (FL) is a distributed learning paradigm that enables a
large number of mobile devices to collaboratively learn a model under the
coordination of a central server without sharing their raw data. Despite its
practical efficiency and effectiveness, the iterative on-device learning
process (e.g., local computations and global communications with the server)
incurs a considerable cost in terms of learning time and energy consumption,
which depends crucially on the number of selected clients and the number of
local iterations in each training round. In this paper, we analyze how to
design adaptive FL in mobile edge networks that optimally chooses these
essential control variables to minimize the total cost while ensuring
convergence. We establish the analytical relationship between the total cost
and the control variables with the convergence upper bound. To efficiently
solve the cost minimization problem, we develop a low-cost sampling-based
algorithm to learn the convergence related unknown parameters. We derive
important solution properties that effectively identify the design principles
for different optimization metrics. Practically, we evaluate our theoretical
results both in a simulated environment and on a hardware prototype.
Experimental evidence verifies our derived properties and demonstrates that our
proposed solution achieves near-optimal performance for different optimization
metrics for various datasets and heterogeneous system and statistical settings.
【6】 On the Initial Behavior Monitoring Issues in Federated Learning
标题:论联合学习中的初始行为监控问题
链接:https://arxiv.org/abs/2109.05385
作者:Ranwa Al Mallah,Godwin Badu-Marfo,Bilal Farooq
机构:Laboratory of Innovations in Transportation (LiTrans), Ryerson University, Canada
摘要:在联邦学习(FL)中,一组工作人员在一个节点(主管)的协调下参与构建全局模型。关于FL的网络安全,一些攻击旨在将伪造的本地模型更新注入系统。一些防御措施基于恶意工作者检测和行为模式分析。在这种情况下,如果没有及时和动态的监测方法,主管就无法检测到恶意或不可靠的工作人员并将其从系统中清除。我们的工作强调了为监控和最终的行为模式分析准备联合学习过程的紧迫性。我们在训练的早期阶段研究学习过程中的信息,提出监控过程,并评估所需的监控时间。目的是分析何时启动检测算法,以便从系统中移除恶意或不可靠的工作人员,并优化防御机制部署。我们在行为模式分析防御上测试了我们的策略,该防御应用于文本和图像分类的不同基准系统的FL过程。我们的结果表明,监控过程降低了误报和漏报,从而提高了系统效率,使分布式学习系统能够在训练的早期阶段获得更好的性能。
摘要:In Federated Learning (FL), a group of workers participate to build a global
model under the coordination of one node, the chief. Regarding the
cybersecurity of FL, some attacks aim at injecting the fabricated local model
updates into the system. Some defenses are based on malicious worker detection
and behavioral pattern analysis. In this context, without timely and dynamic
monitoring methods, the chief cannot detect and remove the malicious or
unreliable workers from the system. Our work emphasize the urgency to prepare
the federated learning process for monitoring and eventually behavioral pattern
analysis. We study the information inside the learning process in the early
stages of training, propose a monitoring process and evaluate the monitoring
period required. The aim is to analyse at what time is it appropriate to start
the detection algorithm in order to remove the malicious or unreliable workers
from the system and optimise the defense mechanism deployment. We tested our
strategy on a behavioral pattern analysis defense applied to the FL process of
different benchmark systems for text and image classification. Our results show
that the monitoring process lowers false positives and false negatives and
consequently increases system efficiency by enabling the distributed learning
system to achieve better performance in the early stage of training.
【7】 Utility Fairness for the Differentially Private Federated Learning
标题:差分私有联合学习的效用公平性
链接:https://arxiv.org/abs/2109.05267
作者:Sheeraz A. Alvi,Yi Hong,Salman Durrani
摘要:联邦学习(FL)允许在无线物联网(IoT)网络中对感测数据进行预测模型训练,从而避免在能量、时间和隐私方面的数据收集成本。在本文中,对于FL设置,我们将物联网设备获得的学习收益与其参与成本作为效用进行建模。由于设备的异质性可能是时变的,因此不同设备的局部模型质量和相关成本不同。我们发现这会导致效用不公平,因为设备之间共享相同的全局模型。在vanilla FL设置中,主机不知道设备的本地模型计算和传输成本,因此无法解决效用不公平问题。此外,设备可能会利用主设备的这种知识不足来有意减少其开支,从而提高其效用。我们建议根据设备的贡献和支出,在每一轮中控制与设备共享的全球模型的质量。这是通过利用差异隐私来减少基于学习贡献的全局模型泄露来实现的。此外,我们还为每个设备设计了自适应计算和传输策略,以控制其支出,从而缓解效用不公平。我们的结果表明,与基准方案相比,所提出的方案将设备能量成本的标准偏差降低了99%,而设备训练损耗的标准偏差在0.103左右。
摘要:Federated learning (FL) allows predictive model training on the sensed data
in a wireless Internet of things (IoT) network evading data collection cost in
terms of energy, time, and privacy. In this paper, for a FL setting, we model
the learning gain achieved by an IoT device against its participation cost as
its utility. The local model quality and the associated cost differs from
device to device due to the device-heterogeneity which could be time-varying.
We identify that this results in utility unfairness because the same global
model is shared among the devices. In the vanilla FL setting, the master is
unaware of devices' local model computation and transmission costs, thus it is
unable to address the utility unfairness problem. In addition, a device may
exploit this lack of knowledge at the master to intentionally reduce its
expenditure and thereby boost its utility. We propose to control the quality of
the global model shared with the devices, in each round, based on their
contribution and expenditure. This is achieved by employing differential
privacy to curtail global model divulgence based on the learning contribution.
Furthermore, we devise adaptive computation and transmission policies for each
device to control its expenditure in order to mitigate utility unfairness. Our
results show that the proposed scheme reduces the standard deviation of the
energy cost of devices by 99% in comparison to the benchmark scheme, while the
standard deviation of the training loss of devices varies around 0.103.
推理|分析|理解|解释(9篇)
【1】 Augmenting Decision Making via Interactive What-If Analysis
标题:通过交互式假设分析增强决策能力
链接:https://arxiv.org/abs/2109.06160
作者:Sneha Gathani,Madelon Hulsebos,James Gale,Peter J. Haas,Çağatay Demiralp
机构:Sigma Computing, San Francisco, USA, University of Massachusetts, Amherst, USA
备注:Submitted to CIDR'22
摘要:业务数据分析的基本目标是使用数据改进业务决策。业务用户(如销售、营销、产品或运营经理)通常会做出决策,以实现关键绩效指标(KPI)目标,如增加客户保留率、降低成本和增加销售额。为了发现假设为驱动因素的数据属性和与感兴趣的KPI相对应的数据属性之间的关系,业务用户当前需要执行冗长的探索性分析,考虑多种组合和场景,相应地对数据进行切片、切割和转换。例如,分析一年中各个季度的客户保持率,或建议跨客户阶层的最佳媒体渠道。然而,数据集的复杂性不断增加,再加上人类的认知局限性,使得即使对于简单的数据集,也很难进行多个假设。因此,在精神上进行这样的分析是很困难的。现有的商业工具要么提供部分解决方案,其有效性尚不清楚,要么无法满足业务用户的需求。在这里,我们讨论了四种功能,我们认为这是必要的,以使业务用户能够交互地学习和推理数据属性集之间的关系(功能),从而促进数据驱动的决策。我们在SystemD中实现了这些功能,SystemD是一个交互式可视化分析系统,允许业务用户通过询问假设问题来试验数据。我们通过三个业务用例对系统进行评估:营销组合建模分析、客户保留分析和交易结束分析,并报告来自多个业务用户的反馈。总的来说,业务用户发现SystemD直观且有用,可用于围绕感兴趣的KPI快速测试和验证他们的假设,以及做出有效且快速的数据驱动决策。
摘要:The fundamental goal of business data analysis is to improve business
decisions using data. Business users such as sales, marketing, product, or
operations managers often make decisions to achieve key performance indicator
(KPI) goals such as increasing customer retention, decreasing cost, and
increasing sales. To discover the relationship between data attributes
hypothesized to be drivers and those corresponding to KPIs of interest,
business users currently need to perform lengthy exploratory analyses,
considering multitudes of combinations and scenarios, slicing, dicing, and
transforming the data accordingly. For example, analyzing customer retention
across quarters of the year or suggesting optimal media channels across strata
of customers. However, the increasing complexity of datasets combined with the
cognitive limitations of humans makes it challenging to carry over multiple
hypotheses, even for simple datasets. Therefore mentally performing such
analyses is hard. Existing commercial tools either provide partial solutions
whose effectiveness remains unclear or fail to cater to business users.
Here we argue for four functionalities that we believe are necessary to
enable business users to interactively learn and reason about the relationships
(functions) between sets of data attributes, facilitating data-driven decision
making. We implement these functionalities in SystemD, an interactive visual
analysis system enabling business users to experiment with the data by asking
what-if questions. We evaluate the system through three business use cases:
marketing mix modeling analysis, customer retention analysis, and deal closing
analysis, and report on feedback from multiple business users. Overall,
business users find SystemD intuitive and useful for quick testing and
validation of their hypotheses around interested KPI as well as in making
effective and fast data-driven decisions.
【2】 Formalizing and Estimating Distribution Inference Risks
标题:分布推理风险的形式化与估计
链接:https://arxiv.org/abs/2109.06024
作者:Anshuman Suri,David Evans
机构:Department of Computer Science, University of Virginia
备注:arXiv admin note: text overlap with arXiv:2106.03699
摘要:属性推理攻击揭示了训练集的统计属性,但很难与统计机器学习的内在目的区分开来,即生成捕获分布统计属性的模型。基于Yeom等人的成员推理框架,我们提出了属性推理攻击的形式化和一般性定义。所提出的概念描述了可以区分可能的训练分布的攻击,扩展了先前的属性推断攻击,该属性推断攻击推断训练数据集中特定类型数据的比率,例如女性的比例。我们展示了我们的定义如何捕获以前的属性推理攻击,以及一种可以揭示训练图的平均节点度或聚类系数的新攻击。我们的定义还支持一个定理,该定理将区分分布的推理攻击的最大可能精度与模型泄漏的数据集的有效大小联系起来。为了量化和理解属性推断风险,我们使用黑盒和白盒攻击在一系列不同的分布上进行了一系列实验。我们的结果表明,便宜的攻击通常与昂贵的元分类器攻击一样有效,并且攻击的有效性存在令人惊讶的不对称性。我们还将最先进的属性推断攻击扩展到卷积神经网络,并提出了一些技术来帮助识别泄漏最多信息的模型中的参数,从而显著降低元分类器攻击的资源需求。
摘要:Property inference attacks reveal statistical properties about a training set
but are difficult to distinguish from the intrinsic purpose of statistical
machine learning, namely to produce models that capture statistical properties
about a distribution. Motivated by Yeom et al.'s membership inference
framework, we propose a formal and general definition of property inference
attacks. The proposed notion describes attacks that can distinguish between
possible training distributions, extending beyond previous property inference
attacks that infer the ratio of a particular type of data in the training data
set such as the proportion of females. We show how our definition captures
previous property inference attacks as well as a new attack that can reveal the
average node degree or clustering coefficient of a training graph. Our
definition also enables a theorem that connects the maximum possible accuracy
of inference attacks distinguishing between distributions to the effective size
of dataset leaked by the model. To quantify and understand property inference
risks, we conduct a series of experiments across a range of different
distributions using both black-box and white-box attacks. Our results show that
inexpensive attacks are often as effective as expensive meta-classifier
attacks, and that there are surprising asymmetries in the effectiveness of
attacks. We also extend the state-of-the-art property inference attack to work
on convolutional neural networks, and propose techniques to help identify
parameters in a model that leak the most information, thus significantly
lowering resource requirements for meta-classifier attacks.
【3】 On Solving a Stochastic Shortest-Path Markov Decision Process as Probabilistic Inference
标题:随机最短路马尔可夫决策过程的概率推理求解
链接:https://arxiv.org/abs/2109.05866
作者:Mohamed Baioumy,Bruno Lacerda,Paul Duckworth,Nick Hawes
机构:Oxford Robotics Institute, University of Oxford
备注:Presented at the second International Workshop on Active Inference (IWAI 2021); 11 pages, 2 figures
摘要:以前关于计划作为主动推理的工作解决了在线计划有效的有限时间问题和解决方案。我们提出将一般随机最短路径马尔可夫决策过程(SSP MDP)作为概率推理来求解。此外,我们还讨论了在线和离线的不确定性规划方法。在SSP MDP中,地平线是不确定的,并且是先验未知的。SSP MDP是有限和无限视界MDP的推广,广泛应用于人工智能领域。此外,我们还强调了使用人工智能社区中广泛使用的动态规划方法和主动推理社区中使用的方法解决MDP之间的一些差异。
摘要:Previous work on planning as active inference addresses finite horizon
problems and solutions valid for online planning. We propose solving the
general Stochastic Shortest-Path Markov Decision Process (SSP MDP) as
probabilistic inference. Furthermore, we discuss online and offline methods for
planning under uncertainty. In an SSP MDP, the horizon is indefinite and
unknown a priori. SSP MDPs generalize finite and infinite horizon MDPs and are
widely used in the artificial intelligence community. Additionally, we
highlight some of the differences between solving an MDP using dynamic
programming approaches widely used in the artificial intelligence community and
approaches used in the active inference community.
【4】 DisCERN:Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods
标题:识别:利用邻域关联特征发现反事实解释
链接:https://arxiv.org/abs/2109.05800
作者:Nirmalie Wiratunga,Anjana Wijekoon,Ikechukwu Nkisi-Orji,Kyle Martin,Chamath Palihawadana,David Corsar
机构:School of Computing, Robert Gordon University, Aberdeen, Scotland
备注:Pre-print of the paper accepted at the 33rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI)
摘要:反事实解释侧重于“可操作的知识”,以帮助最终用户理解如何将机器学习结果更改为更理想的结果。为此,反事实解释者需要发现与结果变化相关的输入依赖性。对于反事实解释者来说,确定在决策中对输出变化采取行动所需的特征变化的最小子集是一个有趣的挑战。本文介绍的识别算法是一种基于实例的反事实解释算法。在这里,通过替换最近邻(NUN)的特征值形成反事实,直到观察到可操作的变化。我们展示了广泛采用的基于特征相关性的解释者(即LIME、SHAP)是如何告知DISCER识别“可操作特征”的最小子集的。在与广泛使用的基于优化的反事实方法DiCE的对比研究中,我们在五个数据集上展示了我们的识别算法。我们的研究结果表明,辨别是一种有效的策略,可以最大限度地减少创造良好的反事实解释所需的可操作变化。
摘要:Counterfactual explanations focus on "actionable knowledge" to help end-users
understand how a machine learning outcome could be changed to a more desirable
outcome. For this purpose a counterfactual explainer needs to discover input
dependencies that relate to outcome changes. Identifying the minimum subset of
feature changes needed to action an output change in the decision is an
interesting challenge for counterfactual explainers. The DisCERN algorithm
introduced in this paper is a case-based counter-factual explainer. Here
counterfactuals are formed by replacing feature values from a nearest unlike
neighbour (NUN) until an actionable change is observed. We show how widely
adopted feature relevance-based explainers (i.e. LIME, SHAP), can inform
DisCERN to identify the minimum subset of "actionable features". We demonstrate
our DisCERN algorithm on five datasets in a comparative study with the widely
used optimisation-based counterfactual approach DiCE. Our results demonstrate
that DisCERN is an effective strategy to minimise actionable changes necessary
to create good counterfactual explanations.
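A minimal sketch of the NUN substitution loop: copy feature values from the nearest unlike neighbour, in decreasing relevance order, until the prediction flips. Global feature_importances_ stand in for the per-instance LIME/SHAP weights the paper uses:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# NUN substitution loop: copy feature values from the nearest unlike
# neighbour, most relevant feature first, until the prediction flips.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def counterfactual(x, X, y, clf):
    target = 1 - clf.predict([x])[0]                  # desired outcome
    unlike = X[y == target]
    nun = unlike[np.argmin(np.linalg.norm(unlike - x, axis=1))]
    cf = x.copy()
    for f in np.argsort(-clf.feature_importances_):   # relevance order
        cf[f] = nun[f]
        if clf.predict([cf])[0] == target:
            return cf
    return cf

print("counterfactual:", counterfactual(X[0], X, y, clf).round(2))
```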
【5】 AdViCE: Aggregated Visual Counterfactual Explanations for Machine Learning Model Validation
标题:建议:机器学习模型验证的聚合视觉反事实解释
链接:https://arxiv.org/abs/2109.05629
作者:Oscar Gomez,Steffen Holter,Jun Yuan,Enrico Bertini
备注:4 pages, 2 figures, IEEE VIS 2021. Keywords: machine learning, interpretability, explainability, counterfactual explanations, data visualization
摘要:机器学习模型性能的快速改进将其推到了数据驱动决策的前沿。同时,这些模型越来越多地集成到各种应用领域,这进一步突出了对更高解释性和透明度的需求。为了识别偏差、过度拟合和不正确的相关性等问题,数据科学家需要工具来解释这些模型决策的机制。在本文中,我们将介绍AdViCE,这是一种可视化分析工具,旨在指导用户进行黑盒模型调试和验证。该解决方案依赖于两个主要的可视化用户界面创新:(1)交互式可视化设计,能够比较用户定义的数据子集上的决策;(2) 一种算法和可视化设计,用于计算和可视化反事实解释——当数据特征偏离其原始值时,描述模型结果的解释。我们通过一个用例演示了该工具,展示了所提议方法的功能和潜在限制。
摘要:Rapid improvements in the performance of machine learning models have pushed
them to the forefront of data-driven decision-making. Meanwhile, the increased
integration of these models into various application domains has further
highlighted the need for greater interpretability and transparency. To identify
problems such as bias, overfitting, and incorrect correlations, data scientists
require tools that explain the mechanisms with which these model decisions are
made. In this paper we introduce AdViCE, a visual analytics tool that aims to
guide users in black-box model debugging and validation. The solution rests on
two main visual user interface innovations: (1) an interactive visualization
design that enables the comparison of decisions on user-defined data subsets;
(2) an algorithm and visual design to compute and visualize counterfactual
explanations - explanations that depict model outcomes when data features are
perturbed from their original values. We provide a demonstration of the tool
through a use case that showcases the capabilities and potential limitations of
the proposed approach.
【6】 Compute and Energy Consumption Trends in Deep Learning Inference
标题:深度学习推理中的计算和能耗趋势
链接:https://arxiv.org/abs/2109.05472
作者:Radosvet Desislavov,Fernando Martínez-Plumed,José Hernández-Orallo
机构:VRAIN, Universitat Politecnica de Valencia, Spain; European Commission, Joint Research Centre
备注:Preprint. Under review
摘要:一些人工智能范例(如深度学习)的进展据说与参数数量的指数增长有关。有许多研究证实了这些趋势,但这是否转化为能源消耗的指数增长?为了回答这个问题,我们将重点放在推理成本上,而不是训练成本上,因为前者占计算工作量的大部分,这完全是因为乘法因素。此外,除了算法创新外,我们还考虑了更具体、更强大的硬件(导致更高的触发器),通常伴随着重要的能效优化。我们还将重点从突破性论文的首次实施转移到一两年后技术的整合版本。在这一独特而全面的视角下,我们研究了计算机视觉和自然语言处理领域的相关模型:对于性能的持续增长,我们看到能源消耗的增长比之前预期的要软得多。唯一需要注意的是,随着未来人工智能的普及和普及,乘性因素再次出现。
摘要:The progress of some AI paradigms such as deep learning is said to be linked
to an exponential growth in the number of parameters. There are many studies
corroborating these trends, but does this translate into an exponential
increase in energy consumption? In order to answer this question we focus on
inference costs rather than training costs, as the former account for most of
the computing effort, solely because of the multiplicative factors. Also, apart
from algorithmic innovations, we account for more specific and powerful
hardware (leading to higher FLOPS) that is usually accompanied with important
energy efficiency optimisations. We also move the focus from the first
implementation of a breakthrough paper towards the consolidated version of the
techniques one or two year later. Under this distinctive and comprehensive
perspective, we study relevant models in the areas of computer vision and
natural language processing: for a sustained increase in performance we see a
much softer growth in energy consumption than previously anticipated. The only
caveat is, yet again, the multiplicative factor, as future AI increases
penetration and becomes more pervasive.
【7】 Making Table Understanding Work in Practice
标题:在实践中做好表格理解工作
链接:https://arxiv.org/abs/2109.05173
作者:Madelon Hulsebos,Sneha Gathani,James Gale,Isil Dillig,Paul Groth,Çağatay Demiralp
机构:Sigma Computing, San Francisco, USA, University of Texas, Austin, USA, University of Amsterdam, Amsterdam, Netherlands
备注:Submitted to CIDR'22
摘要:理解大规模表的语义对于数据集成、准备和搜索等任务至关重要。表理解方法旨在检测表的主题、语义列类型、列关系或实体。随着深度学习的兴起,为这些任务开发了功能强大的模型,并在基准上具有极高的准确性。然而,我们观察到,这些模型在这些基准上的性能与其在实践中的适用性之间存在差距。在本文中,我们要解决的问题是:这些模型在实践中需要什么?我们讨论了部署表理解模型的三个挑战,并提出了解决这些挑战的框架。这些挑战包括1)难以为特定领域定制模型,2)缺乏企业中常见的典型数据库表的训练数据,以及3)对模型所做的推断缺乏信心。我们提出了SigmaTyper,它实现了语义列类型检测任务的这个框架。SigmaTyper封装了一个在GitTables上训练的混合模型,并集成了一种轻量级的人在回路方法来定制模型。最后,我们强调了未来研究的途径,以进一步缩小在实践中使表格理解有效的差距。
摘要:Understanding the semantics of tables at scale is crucial for tasks like data
integration, preparation, and search. Table understanding methods aim at
detecting a table's topic, semantic column types, column relations, or
entities. With the rise of deep learning, powerful models have been developed
for these tasks with excellent accuracy on benchmarks. However, we observe that
there exists a gap between the performance of these models on these benchmarks
and their applicability in practice. In this paper, we address the question:
what do we need for these models to work in practice?
We discuss three challenges of deploying table understanding models and
propose a framework to address them. These challenges include 1) difficulty in
customizing models to specific domains, 2) lack of training data for typical
database tables often found in enterprises, and 3) lack of confidence in the
inferences made by models. We present SigmaTyper which implements this
framework for the semantic column type detection task. SigmaTyper encapsulates
a hybrid model trained on GitTables and integrates a lightweight
human-in-the-loop approach to customize the model. Lastly, we highlight avenues
for future research that further close the gap towards making table
understanding effective in practice.
【8】 Prediction of gene expression time series and structural analysis of gene regulatory networks using recurrent neural networks
标题:基于循环神经网络的基因表达时间序列预测及基因调控网络结构分析
链接:https://arxiv.org/abs/2109.05849
作者:Michele Monti,Jonathan Fiorentino,Edoardo Milanetti,Giorgio Gosti,Gian Gaetano Tartaglia
机构:Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader, Barcelona, RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Via Morego, Genoa, Italy.
备注:17 pages, 6 figures
摘要:迄今为止,从基因表达数据进行时间序列预测和基因调控网络(GRN)分类的方法已被单独处理。最近出现的基于注意的递归神经网络(RNN)模型提高了RNN参数的可解释性,使其有助于理解基因相互作用。在这项工作中,我们从一系列原型GRN生成合成的时间序列基因表达数据,并依靠双注意RNN预测基因的时间动态。我们表明,对于具有不同体系结构的GRN,预测是非常准确的。接下来,我们将重点放在RNN的注意机制上,并使用图论中的工具,我们发现它的图属性允许从层次上区分GRN的不同架构。我们发现,在RNN的预测中,GRN对添加噪声的反应不同,并且我们将噪声反应与注意机制的分析联系起来。综上所述,本研究为理解和开发RNN的注意机制提供了一条途径,并为基于RNN的时间序列预测方法和从基因表达数据推断GRN铺平了道路。
摘要:Methods for time series prediction and classification of gene regulatory
networks (GRNs) from gene expression data have been treated separately so far.
The recent emergence of attention-based recurrent neural networks (RNN) models
boosted the interpretability of RNN parameters, making them appealing for the
understanding of gene interactions. In this work, we generated synthetic time
series gene expression data from a range of archetypal GRNs and we relied on a
dual attention RNN to predict the gene temporal dynamics. We show that the
prediction is extremely accurate for GRNs with different architectures. Next,
we focused on the attention mechanism of the RNN and, using tools from graph
theory, we found that its graph properties allow one to hierarchically distinguish
different architectures of the GRN. We show that the GRNs respond differently
to the addition of noise in the prediction by the RNN and we relate the noise
response to the analysis of the attention mechanism. In conclusion, this work
provides a way to understand and exploit the attention mechanism of RNN and
it paves the way to RNN-based methods for time series prediction and inference
of GRNs from gene expression data.
【9】 Bayesian Topic Regression for Causal Inference
标题:贝叶斯主题回归在因果推理中的应用
链接:https://arxiv.org/abs/2109.05317
作者:Maximilian Ahrens,Julian Ashwin,Jan-Peter Calliess,Vu Nguyen
机构:University of Oxford,Amazon
备注:accepted as a conference paper at EMNLP 2021
摘要:使用观察文本数据进行因果推理在许多研究领域变得越来越流行。本文提出了贝叶斯主题回归(BTR)模型,该模型使用文本和数字信息对结果变量进行建模。它允许评估离散和连续治疗效果。此外,它允许在文本数据旁边包含额外的数值混杂因素。为此,我们将有监督的贝叶斯主题模型与贝叶斯回归框架相结合,并根据Frisch-Waugh-Lovell定理,对文本特征进行有监督的表示学习和回归参数训练。我们的论文有两个主要贡献。首先,我们提供了一个回归框架,当文本和数字混杂因素都相关时,允许在设置中进行因果推断。我们通过合成和半合成数据集表明,当文本和数字特征相关时,我们的联合方法比任何基准模型都能以更低的偏差恢复地面真实值。其次,在两个真实数据集上的实验表明,与分别估计文本和非文本特征回归权重的策略相比,联合监督学习策略也能产生更好的预测结果,甚至与更复杂的深层神经网络具有竞争力。
摘要:Causal inference using observational text data is becoming increasingly
popular in many research areas. This paper presents the Bayesian Topic
Regression (BTR) model that uses both text and numerical information to model
an outcome variable. It allows estimation of both discrete and continuous
treatment effects. Furthermore, it allows for the inclusion of additional
numerical confounding factors next to text data. To this end, we combine a
supervised Bayesian topic model with a Bayesian regression framework and
perform supervised representation learning for the text features jointly with
the regression parameter training, respecting the Frisch-Waugh-Lovell theorem.
Our paper makes two main contributions. First, we provide a regression
framework that allows causal inference in settings when both text and numerical
confounders are of relevance. We show with synthetic and semi-synthetic
datasets that our joint approach recovers ground truth with lower bias than any
benchmark model, when text and numerical features are correlated. Second,
experiments on two real-world datasets demonstrate that a joint and supervised
learning strategy also yields superior prediction results compared to
strategies that estimate regression weights for text and non-text features
separately, being even competitive with more complex deep neural networks.
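The Frisch-Waugh-Lovell property that BTR's joint training respects can be illustrated with plain least squares. The sketch below is not the BTR model itself (which is Bayesian and learns topics jointly with the regression); the data-generating process and all variable names are invented for illustration:

import numpy as np

rng = np.random.default_rng(0)
n = 1000
c = rng.normal(size=(n, 3))                                # numerical confounders
z = rng.normal(size=(n, 5)) + c @ rng.normal(size=(3, 5))  # "text" features, correlated with c
beta_z = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = z @ beta_z + c @ np.array([0.7, -1.2, 2.0]) + rng.normal(scale=0.1, size=n)

def residualize(a, b):
    # Project out the columns of b from a (ordinary least squares residuals).
    coef, *_ = np.linalg.lstsq(b, a, rcond=None)
    return a - b @ coef

# FWL: regressing residualized y on residualized z recovers the same
# coefficients on z as the full joint regression on [z, c].
beta_fwl, *_ = np.linalg.lstsq(residualize(z, c), residualize(y, c), rcond=None)
beta_joint, *_ = np.linalg.lstsq(np.hstack([z, c]), y, rcond=None)
print(np.allclose(beta_fwl, beta_joint[:5]))   # True

The point of respecting this property in BTR is that text representations are learned on outcomes already purged of the numerical confounders, so the treatment coefficients are not contaminated by them.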
检测相关(9篇)
【1】 DAFNe: A One-Stage Anchor-Free Deep Model for Oriented Object Detection
标题:DAFNe:一种用于定向目标检测的单阶段无锚深度模型
链接:https://arxiv.org/abs/2109.06148
作者:Steven Lang,Fabrizio Ventola,Kristian Kersting
机构: TU Darmstadt, Darmstadt, Germany, Hessian Center for AI and Centre for Cognitive Science, Darmstadt, Germany
备注:Main paper: 7 pages, References: 2 pages, Appendix: 5 pages; Main paper: 5 figures, Appendix: 5 figures
摘要:目标检测是计算机视觉中的一项基本任务。虽然轴对齐边界框检测方法近年来取得了实质性进展,但它们在定向对象上的性能较差,而定向对象在鸟瞰图像和安全摄像头画面等真实场景中很常见。在这些情况下,预测边界框的很大一部分会覆盖与对象无关的区域。因此,定向目标检测应运而生,其目的是将目标检测推广到任意方向。这样可以更紧密地拟合定向对象,从而更好地分离边界框,尤其是在对象密集分布的情况下。该领域的绝大多数工作都集中在复杂的两阶段基于锚框的方法上。锚框作为边界框形状的先验,需要针对每个数据集仔细微调超参数,会增大模型规模,并带来计算开销。在这项工作中,我们提出了DAFNe:一种用于定向目标检测的密集单阶段无锚深度网络。作为单阶段模型,DAFNe在输入图像的密集网格上执行预测,在架构上比两阶段模型更简单、更快,并且更易于优化。此外,作为无锚模型,DAFNe通过避免使用边界框锚来降低预测复杂性。此外,我们还引入了面向任意方向边界框的方向感知center-ness函数的推广,以降低低质量预测的权重,并引入了一种改进目标定位性能的中心到角点边界框预测策略。DAFNe将DOTA 1.0上先前最佳单阶段无锚模型的预测精度提高了4.65% mAP,以76.95% mAP创造了新的最先进结果。
摘要:Object detection is a fundamental task in computer vision. While approaches
for axis-aligned bounding box detection have made substantial progress in
recent years, they perform poorly on oriented objects which are common in
several real-world scenarios such as aerial view imagery and security camera
footage. In these cases, a large part of a predicted bounding box will,
undesirably, cover non-object related areas. Therefore, oriented object
detection has emerged with the aim of generalizing object detection to
arbitrary orientations. This enables a tighter fit to oriented objects, leading
to a better separation of bounding boxes especially in case of dense object
distributions. The vast majority of the work in this area has focused on
complex two-stage anchor-based approaches. Anchors act as priors on the
bounding box shape and require attentive hyper-parameter fine-tuning on a
per-dataset basis, increased model size, and come with computational overhead.
In this work, we present DAFNe: A Dense one-stage Anchor-Free deep Network for
oriented object detection. As a one-stage model, DAFNe performs predictions on
a dense grid over the input image, being architecturally simpler and faster, as
well as easier to optimize than its two-stage counterparts. Furthermore, as an
anchor-free model, DAFNe reduces the prediction complexity by refraining from
employing bounding box anchors. Moreover, we introduce an orientation-aware
generalization of the center-ness function for arbitrarily oriented bounding
boxes to down-weight low-quality predictions and a center-to-corner bounding
box prediction strategy that improves object localization performance. DAFNe
improves the prediction accuracy over the previous best one-stage anchor-free
model results on DOTA 1.0 by 4.65% mAP, setting the new state-of-the-art
results by achieving 76.95% mAP.
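For intuition, the sketch below shows a standard FCOS-style center-ness together with one plausible orientation-aware generalization: rotate the location offset into the box frame and reuse the axis-aligned definition. This is an illustrative assumption, not necessarily DAFNe's exact formula:

import numpy as np

def centerness_axis_aligned(l, t, r, b):
    # FCOS-style center-ness for a location inside an axis-aligned box,
    # given its distances to the left/top/right/bottom edges.
    return np.sqrt((np.minimum(l, r) / np.maximum(l, r)) *
                   (np.minimum(t, b) / np.maximum(t, b)))

def centerness_oriented(point, box_center, box_wh, theta):
    # Hypothetical orientation-aware variant: rotate the offset into the
    # box frame, then reuse the axis-aligned definition.
    c, s = np.cos(-theta), np.sin(-theta)
    dx, dy = point - box_center
    u, v = c * dx - s * dy, s * dx + c * dy   # offset in box coordinates
    w, h = box_wh
    return centerness_axis_aligned(w / 2 + u, h / 2 + v, w / 2 - u, h / 2 - v)

print(centerness_oriented(np.array([0.5, 0.0]), np.array([0.0, 0.0]),
                          (4.0, 2.0), np.pi / 6))

Down-weighting predictions with low center-ness suppresses boxes regressed from locations near object boundaries, which tend to be inaccurate.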
【2】 Concept Drift Detection in Federated Networked Systems
标题:联邦网络系统中的概念漂移检测
链接:https://arxiv.org/abs/2109.06088
作者:Dimitrios Michael Manias,Ibrahim Shaer,Li Yang,Abdallah Shami
机构:ECE Department, Western University, London ON, Canada
备注:Accepted in IEEE GLOBECOM 2021
摘要:随着下一代网络的实现,需要提高智能水平。联邦学习已被确定为智能和分布式网络的关键使能技术;然而,与任何机器学习应用程序一样,它容易出现概念漂移。考虑到现代网络提供的关键和应急服务,概念漂移直接影响模型的性能,并可能导致严重后果。为了减轻漂移的不利影响,本文提出了一个概念漂移检测系统,利用联邦训练过程中每次迭代提供的联邦学习更新。利用降维和聚类技术,以智能交通系统为例,通过实验提出了一个分离系统漂移节点的框架。本文的工作表明,该框架能够在不同的漂移阶段和不同的系统暴露水平下检测各种非iid场景中的漂移节点。
摘要:As next-generation networks materialize, increasing levels of intelligence
are required. Federated Learning has been identified as a key enabling
technology of intelligent and distributed networks; however, it is prone to
concept drift as with any machine learning application. Concept drift directly
affects the model's performance and can result in severe consequences
considering the critical and emergency services provided by modern networks. To
mitigate the adverse effects of drift, this paper proposes a concept drift
detection system leveraging the federated learning updates provided at each
iteration of the federated training process. Using dimensionality reduction and
clustering techniques, a framework that isolates the system's drifted nodes is
presented through experiments using an Intelligent Transportation System as a
use case. The presented work demonstrates that the proposed framework is able
to detect drifted nodes in a variety of non-iid scenarios at different stages
of drift and different levels of system exposure.
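A minimal sketch of the detection idea (dimensionality reduction plus clustering over per-round client updates); the synthetic updates and the two-cluster assumption below are illustrative, not the paper's setup:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Toy stand-in for per-client model updates in one federated round:
# 20 healthy clients share one update distribution, 4 drifted clients another.
updates = np.vstack([rng.normal(0.0, 0.1, size=(20, 512)),
                     rng.normal(0.8, 0.1, size=(4, 512))])

low_dim = PCA(n_components=2).fit_transform(updates)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(low_dim)

# Flag the minority cluster as the drifted set of nodes.
drifted_cluster = np.argmin(np.bincount(labels))
print("drifted nodes:", np.where(labels == drifted_cluster)[0])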
【3】 Applications of Recurrent Neural Network for Biometric Authentication & Anomaly Detection
标题:递归神经网络在生物特征认证和异常检测中的应用
链接:https://arxiv.org/abs/2109.05701
作者:Joseph M. Ackerson,Dave Rushit,Seliya Jim
机构:Department of Computer Science, University of Wisconsin Eau-Claire, Eau Claire, WI , USA;
摘要:递归神经网络是一种强大的机器学习框架,它允许以时间序列保存和引用数据。这为手写分析和语音识别等领域带来了许多新的可能性。本文旨在探讨目前在生物特征认证、表情识别、异常检测和飞机应用四个非常重要的领域对RNN进行的研究。本文回顾了以下每种方法的方法、目的、结果和优缺点。这些不同的方法都关注于如何利用不同的RNN体系结构,如流行的长短时记忆(LSTM)RNN或深度残差RNN。本文还研究了在某些情况下哪些框架工作得最好,以及每种提出的模型的优缺点。
摘要:Recurrent Neural Networks are powerful machine learning frameworks that allow
for data to be saved and referenced in a temporal sequence. This opens many new
possibilities in fields such as handwriting analysis and speech recognition.
This paper seeks to explore current research being conducted on RNNs in four
very important areas, being biometric authentication, expression recognition,
anomaly detection, and applications to aircraft. This paper reviews the
methodologies, purpose, results, and the benefits and drawbacks of each
proposed method below. These various methodologies all focus on how they can
leverage distinct RNN architectures such as the popular Long Short-Term Memory
(LSTM) RNN or a Deep-Residual RNN. This paper also examines which frameworks
work best in certain situations, and the advantages and disadvantages of each
proposed model.
【4】 FaceGuard: Proactive Deepfake Detection
标题:FaceGuard:主动式深度伪造检测
链接:https://arxiv.org/abs/2109.05673
作者:Yuankun Yang,Chenyue Liang,Hongyu He,Xiaoyu Cao,Neil Zhenqiang Gong
机构:Fudan University; Duke University
摘要:现有的deepfake检测方法侧重于被动检测,即通过利用deepfake操作过程中产生的伪影来检测假人脸图像。被动检测的一个关键限制是它不能检测由新的伪人脸生成方法生成的伪人脸。在这项工作中,我们提出了FaceGuard,一个主动的deepfake检测框架。FaceGuard在真实人脸图像发布到社交媒体之前将水印嵌入其中。给定声称是某个人(如Nicolas Cage)的人脸图像,FaceGuard从中提取水印,并在提取的水印与该个人的真实水印不匹配时,预测该人脸图像为假。FaceGuard的一个关键组成部分是一种新的基于深度学习的水印方法,该方法1)对正常图像后处理(如JPEG压缩、高斯模糊、裁剪和大小调整)具有鲁棒性,但2)对deepfake操作非常脆弱。我们对多个数据集的评估表明,FaceGuard能够准确地检测到deepfake,并且优于现有的方法。
摘要:Existing deepfake-detection methods focus on passive detection, i.e., they
detect fake face images via exploiting the artifacts produced during deepfake
manipulation. A key limitation of passive detection is that it cannot detect
fake faces that are generated by new deepfake generation methods. In this work,
we propose FaceGuard, a proactive deepfake-detection framework. FaceGuard
embeds a watermark into a real face image before it is published on social
media. Given a face image that claims to be an individual (e.g., Nicolas Cage),
FaceGuard extracts a watermark from it and predicts the face image to be fake
if the extracted watermark does not match well with the individual's ground
truth one. A key component of FaceGuard is a new deep-learning-based
watermarking method, which is 1) robust to normal image post-processing such as
JPEG compression, Gaussian blurring, cropping, and resizing, but 2) fragile to
deepfake manipulation. Our evaluation on multiple datasets shows that FaceGuard
can detect deepfakes accurately and outperforms existing methods.
【5】 On the Impact of Spurious Correlation for Out-of-distribution Detection
标题:论伪相关对非分布检测的影响
链接:https://arxiv.org/abs/2109.05642
作者:Yifei Ming,Hang Yin,Yixuan Li
机构:Department of Computer Sciences, University of Wisconsin-Madison
摘要:现代神经网络可以将高置信度分配给从训练分布外部提取的输入,从而对实际部署中的模型构成威胁。虽然许多研究都集中在设计新的分布外(out-of-distribution,OOD)检测方法上,但OOD的精确定义往往模糊不清,并且在现实中没有达到理想的OOD概念。在本文中,我们提出了一种新的形式化方法,并通过同时考虑不变和环境(虚假)特征对数据转移进行建模。在这种形式化下,我们系统地研究了训练集中的虚假相关性如何影响OOD检测。我们的结果表明,当训练集中虚假特征和标签之间的相关性增加时,检测性能会严重恶化。我们进一步展示了在减少虚假相关性影响方面更有效的检测方法的见解,并对依赖环境特征导致高OOD检测错误的原因进行了理论分析。我们的工作旨在促进更好地理解OOD样本及其形式化,以及探索增强OOD检测的方法。
摘要:Modern neural networks can assign high confidence to inputs drawn from
outside the training distribution, posing threats to models in real-world
deployments. While much research attention has been placed on designing new
out-of-distribution (OOD) detection methods, the precise definition of OOD is
often left in vagueness and falls short of the desired notion of OOD in
reality. In this paper, we present a new formalization and model the data
shifts by taking into account both the invariant and environmental (spurious)
features. Under such formalization, we systematically investigate how spurious
correlation in the training set impacts OOD detection. Our results suggest that
the detection performance is severely worsened when the correlation between
spurious features and labels is increased in the training set. We further show
insights on detection methods that are more effective in reducing the impact of
spurious correlation and provide theoretical analysis on why reliance on
environmental features leads to high OOD detection error. Our work aims to
facilitate a better understanding of OOD samples and their formalization, as
well as the exploration of methods that enhance OOD detection.
【6】 Detecting Handwritten Mathematical Terms with Sensor Based Data
标题:利用基于传感器的数据检测手写数学术语
链接:https://arxiv.org/abs/2109.05594
作者:Lukas Wegmeth,Alexander Hoelzemann,Kristof Van Laerhoven
摘要:在这项工作中,我们针对Stabilo提出的UbiComp 2021挑战提出了一个解决方案,其中手写数学术语应根据在DigiPen上捕获的时间序列传感器数据自动分类。输入数据集包含不同书写者的数据,标签字符串由总共15个不同的可能字符构成。标签应首先拆分为单独的字符,以便逐个进行分类。该问题通过对标记数据应用依赖于数据且基于规则的信息提取算法来解决。利用得到的数据,构造了两个分类器。第一个是二元分类器,它能够预测未知数据的样本是否属于书写活动,由深度神经网络特征提取器与随机森林级联组成,随机森林经过训练,以F1分数>90%对提取的特征进行分类。第二个分类器是一个深度神经网络,它将卷积层与递归层结合起来,在15个可能的类别中,以F1分数>60%预测具有单个标签的窗口。挑战评估程序的模拟报告Levenshtein距离为8,表明所选方法在总体准确性和实时适用性方面仍有不足。
摘要:In this work we propose a solution to the UbiComp 2021 Challenge by Stabilo
in which handwritten mathematical terms are supposed to be automatically
classified based on time series sensor data captured on the DigiPen. The input
data set contains data of different writers, with label strings constructed
from a total of 15 different possible characters. The label should first be
split into separate characters to classify them one by one. This issue is
solved by applying a data-dependant and rule-based information extraction
algorithm to the labeled data. Using the resulting data, two classifiers are
constructed. The first is a binary classifier that is able to predict, for
unknown data, if a sample is part of a writing activity, and consists of a Deep
Neural Network feature extractor in concatenation with a Random Forest that is
trained to classify the extracted features at an F1 score of >90%. The second
classifier is a Deep Neural Network that combines convolution layers with
recurrent layers to predict windows with a single label, out of the 15 possible
classes, at an F1 score of >60%. A simulation of the challenge evaluation
procedure reports a Levenshtein distance of 8 and shows that the chosen
approach still lacks in overall accuracy and real-time applicability.
【7】 No True State-of-the-Art? OOD Detection Methods are Inconsistent across Datasets
标题:没有真正最先进的吗?数据集之间的OOD检测方法不一致
链接:https://arxiv.org/abs/2109.05554
作者:Fahim Tajwar,Ananya Kumar,Sang Michael Xie,Percy Liang
机构:Department of Computer Science, Stanford University
备注:ICML Workshop on Uncertainty & Robustness in Deep Learning, 2021
摘要:分布外(OOD)检测是可靠ML系统的重要组成部分。先前的文献提出了各种方法(如MSP(Hendrycks & Gimpel, 2017)、ODIN(Liang et al., 2018)、Mahalanobis(Lee et al., 2018)),并通过在一组选定的分布内(ID)和分布外(OOD)数据集上优于先前方法来宣称自己是最先进的。在这项工作中,我们表明,在一组标准化的16个(ID, OOD)对上,这些方法中没有任何一种在OOD检测上天然优于其他方法。我们用简单的玩具数据集给出了这些不一致性的可能解释:一种方法是否优于另一种方法,取决于所讨论的ID和OOD数据集的结构。最后,我们证明,一种方法即使在某个(ID, OOD)对上优于另一种方法,在低数据情形下也可能并非如此。在低数据情形下,我们提出了一种基于距离的方法,即成对OOD检测(POD),该方法基于孪生(Siamese)网络,通过避开昂贵的协方差估计步骤改进了Mahalanobis方法。我们的结果表明,OOD检测问题可能过于宽泛,我们应该考虑利用更具体的结构。
摘要:Out-of-distribution detection is an important component of reliable ML
systems. Prior literature has proposed various methods (e.g., MSP (Hendrycks &
Gimpel, 2017), ODIN (Liang et al., 2018), Mahalanobis (Lee et al., 2018)),
claiming they are state-of-the-art by showing they outperform previous methods
on a selected set of in-distribution (ID) and out-of-distribution (OOD)
datasets. In this work, we show that none of these methods are inherently
better at OOD detection than others on a standardized set of 16 (ID, OOD)
pairs. We give possible explanations for these inconsistencies with simple toy
datasets where whether one method outperforms another depends on the structure
of the ID and OOD datasets in question. Finally, we show that a method
outperforming another on a certain (ID, OOD) pair may not do so in a low-data
regime. In the low-data regime, we propose a distance-based method, Pairwise
OOD detection (POD), which is based on Siamese networks and improves over
Mahalanobis by sidestepping the expensive covariance estimation step. Our
results suggest that the OOD detection problem may be too broad, and we should
consider more specific structures for leverage.
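For reference, a bare-bones Mahalanobis OOD score (Lee et al., 2018) looks as follows; POD's improvement is to replace the shared covariance estimate with pairwise distances in a learned Siamese embedding, which this sketch does not implement. The toy features and shapes are illustrative:

import numpy as np

def fit_mahalanobis(feats, labels):
    # Class means and a single shared covariance; assumes labels are 0..C-1.
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(0) for c in classes])
    centered = feats - means[labels]
    cov = centered.T @ centered / len(feats)
    return means, np.linalg.inv(cov + 1e-6 * np.eye(feats.shape[1]))

def mahalanobis_score(x, means, prec):
    # Negative distance to the closest class mean; higher = more in-distribution.
    d = x[None] - means
    return -np.min(np.einsum('cd,de,ce->c', d, prec, d))

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 8))
labels = rng.integers(0, 4, size=200)
means, prec = fit_mahalanobis(feats, labels)
print(mahalanobis_score(rng.normal(size=8), means, prec))

The covariance inversion is the expensive and sample-hungry step in low-data regimes, which is why sidestepping it can help.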
【8】 LEA-Net: Layer-wise External Attention Network for Efficient Color Anomaly Detection
标题:LEA-Net:一种高效颜色异常检测的分层外部注意力网络
链接:https://arxiv.org/abs/2109.05493
作者:Ryoya Katafuchi,Terumasa Tokunaga
机构: Kyushu Institute of Technology
摘要:利用异常的先验知识是异常检测的一个基本问题。近年来,视觉注意机制已成为提高CNN在某些计算机视觉任务中性能的一种很有前途的方法。在本文中,我们提出了一种新的模型,称为分层外部注意网络(LEA-Net),用于有效的图像异常检测。其核心思想是通过视觉注意机制整合无监督和有监督的异常检测器。我们的策略如下:(i)将关于异常的先验知识表示为通过正常实例的无监督学习生成的异常图,(ii)通过外部网络将异常图转换为注意图,(iii)然后将注意图合并到异常检测网络的中间层。值得注意的是,这种分层的外部注意可以以端到端的训练方式应用于任何CNN模型。作为一项初步研究,我们验证了LEA网络在颜色异常检测任务中的有效性。通过对PlantVillage、MVTec AD和Cloud数据集的大量实验,我们证明了所提出的分层视觉注意机制能够持续提高现有CNN模型的异常检测性能,即使在不平衡的数据集上也是如此。此外,我们还表明,我们的注意机制成功地提高了几个CNN模型的性能。
摘要:The utilization of prior knowledge about anomalies is an essential issue for
anomaly detections. Recently, the visual attention mechanism has become a
promising way to improve the performance of CNNs for some computer vision
tasks. In this paper, we propose a novel model called Layer-wise External
Attention Network (LEA-Net) for efficient image anomaly detection. The core
idea relies on the integration of unsupervised and supervised anomaly detectors
via the visual attention mechanism. Our strategy is as follows: (i) Prior
knowledge about anomalies is represented as the anomaly map generated by
unsupervised learning of normal instances, (ii) The anomaly map is translated
to an attention map by the external network, (iii) The attention map is then
incorporated into intermediate layers of the anomaly detection network.
Notably, this layer-wise external attention can be applied to any CNN model in
an end-to-end training manner. For a pilot study, we validate LEA-Net on color
anomaly detection tasks. Through extensive experiments on PlantVillage, MVTec
AD, and Cloud datasets, we demonstrate that the proposed layer-wise visual
attention mechanism consistently boosts anomaly detection performances of an
existing CNN model, even on imbalanced datasets. Moreover, we show that our
attention mechanism successfully boosts the performance of several CNN models.
【9】 Towards a Rigorous Evaluation of Time-series Anomaly Detection
标题:走向严格的时间序列异常检测评估
链接:https://arxiv.org/abs/2109.05257
作者:Siwon Kim,Kukjin Choi,Hyun-Soo Choi,Byunghan Lee,Sungroh Yoon
机构: Data Science and AI Laboratory, Seoul National University, Korea, DIT Center, Samsung Electronics, Korea, Department of Computer Science and Engineering, Kangwon National University, Korea
备注:9 pages, 6 figures
摘要:近年来,关于时间序列异常检测(TAD)的拟议研究报告了基准TAD数据集的高F1分数,给人以明显改善的印象。然而,大多数研究在评分前采用一种特殊的评估方案,称为点调整(PA)。在本文中,我们从理论和实验上揭示了PA协议高估检测性能的可能性很大;也就是说,即使是随机异常评分也可以很容易地转化为最先进的TAD方法。因此,在PA协议之后,将TAD方法与F1分数进行比较可能会导致错误的排名。此外,我们质疑现有TAD方法的潜力,通过显示未经训练的模型即使没有PA也能获得与现有方法相当的检测性能。基于我们的发现,我们提出了一个新的基线和评估方案。我们希望我们的研究将有助于对TAD进行严格评估,并在未来的研究中进一步改进。
摘要:In recent years, proposed studies on time-series anomaly detection (TAD)
report high F1 scores on benchmark TAD datasets, giving the impression of clear
improvements. However, most studies apply a peculiar evaluation protocol called
point adjustment (PA) before scoring. In this paper, we theoretically and
experimentally reveal that the PA protocol has a great possibility of
overestimating the detection performance; that is, even a random anomaly score
can easily turn into a state-of-the-art TAD method. Therefore, the comparison
of TAD methods with F1 scores after the PA protocol can lead to misguided
rankings. Furthermore, we question the potential of existing TAD methods by
showing that an untrained model obtains comparable detection performance to the
existing methods even without PA. Based on our findings, we propose a new
baseline and an evaluation protocol. We expect that our study will help a
rigorous evaluation of TAD and lead to further improvement in future
research.
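The inflation effect is easy to reproduce. In the sketch below, a purely random anomaly score obtains a near-zero raw F1 but a far higher one once point adjustment marks a whole ground-truth segment as detected whenever a single point inside it is flagged (segment lengths and rates are arbitrary choices):

import numpy as np

def point_adjust(score, label, threshold):
    # PA protocol: if any point inside a true anomaly segment is flagged,
    # every point of that segment counts as detected.
    adjusted = score > threshold
    i = 0
    while i < len(label):
        if label[i] == 1:
            j = i
            while j < len(label) and label[j] == 1:
                j += 1
            if adjusted[i:j].any():
                adjusted[i:j] = True
            i = j
        else:
            i += 1
    return adjusted

def f1(pred, label):
    tp = np.sum(pred & (label == 1)); fp = np.sum(pred & (label == 0))
    fn = np.sum(~pred & (label == 1))
    return 2 * tp / (2 * tp + fp + fn)

rng = np.random.default_rng(0)
label = np.zeros(10000, dtype=int)
for start in rng.integers(0, 9900, size=20):   # 20 anomaly segments of length 50
    label[start:start + 50] = 1
score = rng.uniform(size=10000)                # a *random* anomaly score
thr = np.quantile(score, 0.99)
print("raw F1:", round(f1(score > thr, label), 3),
      "point-adjusted F1:", round(f1(point_adjust(score, label, thr), label), 3))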
分类|识别(6篇)
【1】 Learning-Based UE Classification in Millimeter-Wave Cellular Systems With Mobility
标题:毫米波移动蜂窝系统中基于学习的UE分类
链接:https://arxiv.org/abs/2109.05893
作者:Dino Pjanić,Alexandros Sopasakis,Harsh Tataria,Fredrik Tufvesson,Andres Reial
机构:⋆Ericsson AB, Lund, Sweden, †Department of Electrical and Information Technology, Lund University, Lund, Sweden, ‡Department of Mathematics, Lund University, Lund, Sweden
备注:Accepted for Publication in 2021 IEEE International Workshop on Machine Learning for Signal Processing, 6 Pages, 7 Figures, 1 Table
摘要:毫米波蜂窝通信需要波束形成过程,该过程能够在用户设备(UE)移动时对齐发射机和接收机波束。对于有效的波束跟踪,根据用户的流量和移动模式对用户进行分类是有利的。迄今为止的研究已经证明了基于机器学习的UE分类的有效方法。尽管不同的机器学习方法已经取得了成功,但大多数方法都是基于接收信号的物理层属性。然而,这增加了额外的复杂性,并且需要访问那些较低层的信号。在本文中,我们证明了传统的有监督甚至无监督机器学习方法可以成功地应用于更高层的信道测量报告,以执行UE分类,从而降低分类过程的复杂性。
摘要:Millimeter-wave cellular communication requires beamforming procedures that
enable alignment of the transmitter and receiver beams as the user equipment
(UE) moves. For efficient beam tracking it is advantageous to classify users
according to their traffic and mobility patterns. Research to date has
demonstrated efficient ways of machine learning based UE classification.
Although different machine learning approaches have shown success, most of them
are based on physical layer attributes of the received signal. This, however,
imposes additional complexity and requires access to those lower layer signals.
In this paper, we show that traditional supervised and even unsupervised
machine learning methods can successfully be applied on higher layer channel
measurement reports in order to perform UE classification, thereby reducing the
complexity of the classification process.
【2】 Low-Shot Validation: Active Importance Sampling for Estimating Classifier Performance on Rare Categories
标题:低样本验证:用于估计稀有类别分类器性能的主动重要性采样
链接:https://arxiv.org/abs/2109.05720
作者:Fait Poms,Vishnu Sarukkai,Ravi Teja Mullapudi,Nimit S. Sohoni,William R. Mark,Deva Ramanan,Kayvon Fatahalian
备注:Accepted to ICCV 2021; 12 pages, 12 figures
摘要:对于使用有限的标记训练数据训练的机器学习模型,验证将成为降低总体注释成本的主要瓶颈。我们提出了一种统计验证算法,可以准确估计稀有类别的二元分类器的F分数,在这种情况下,找到相关的示例进行评估尤其具有挑战性。我们的主要见解是,即使在低样本区(<300个样本),同时校准和重要性抽样也能实现准确的估计。关键的是,我们还推导出了我们方法方差的准确单次试验估计量,并证明该估计量在低样本数下具有经验准确性,使从业者能够了解他们对给定低样本估计的信任程度。在ImageNet和iNaturalist2017上验证最先进的半监督模型时,我们的方法实现了相同的模型性能估计,标签数量比竞争方法少10倍。特别是,我们可以使用100个标签估算方差为0.005的F1模型分数。
摘要:For machine learning models trained with limited labeled training data,
validation stands to become the main bottleneck to reducing overall annotation
costs. We propose a statistical validation algorithm that accurately estimates
the F-score of binary classifiers for rare categories, where finding relevant
examples to evaluate on is particularly challenging. Our key insight is that
simultaneous calibration and importance sampling enables accurate estimates
even in the low-sample regime (< 300 samples). Critically, we also derive an
accurate single-trial estimator of the variance of our method and demonstrate
that this estimator is empirically accurate at low sample counts, enabling a
practitioner to know how well they can trust a given low-sample estimate. When
validating state-of-the-art semi-supervised models on ImageNet and
iNaturalist2017, our method achieves the same estimates of model performance
with up to 10x fewer labels than competing approaches. In particular, we can
estimate model F1 scores with a variance of 0.005 using as few as 100 labels.
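A stripped-down version of the idea, using plain Horvitz-Thompson weighting without the paper's calibration step; the confidence model, threshold, and sampling mixture below are invented for the sketch:

import numpy as np

rng = np.random.default_rng(0)
n = 100000
conf = rng.beta(0.5, 10, size=n)       # classifier confidence for a rare class
truth = rng.uniform(size=n) < conf     # hidden labels, revealed only when sampled
pred = conf > 0.2

# Importance-sample 300 items to label, biased toward confident predictions
# so that rare positives actually show up in the labeled set.
p = 0.5 * conf / conf.sum() + 0.5 / n
idx = rng.choice(n, size=300, replace=True, p=p)
w = 1.0 / (300 * p[idx])               # Horvitz-Thompson weights

tp = np.sum(w * ( pred[idx] &  truth[idx]))
fp = np.sum(w * ( pred[idx] & ~truth[idx]))
fn = np.sum(w * (~pred[idx] &  truth[idx]))
print("estimated F1:", 2 * tp / (2 * tp + fp + fn))
print("true F1     :", 2 * np.sum(pred & truth) /
      (2 * np.sum(pred & truth) + np.sum(pred & ~truth) + np.sum(~pred & truth)))

The plug-in F1 ratio is slightly biased at small sample sizes, which is exactly why the paper pairs the sampler with calibration and a single-trial variance estimate.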
【3】 Robust Federated Best-Arm Identification in Multi-Armed Bandits
标题:多臂老虎机中的鲁棒联邦最佳臂识别
链接:https://arxiv.org/abs/2109.05700
作者:Aritra Mitra,Hamed Hassani,George Pappas
机构:The authors are with the Department of Electrical and Systems Engineering, University of Pennsylvania
摘要:我们研究了随机多臂老虎机中最佳臂识别问题的一个联邦变体:一组客户端(每个客户端只能对臂的一个子集进行采样)通过服务器协作,以指定的置信度识别最佳臂(即平均奖励最高的臂)。针对这个问题,我们提出了Fed-SEL,这是一种基于逐次消除技术、涉及客户端局部采样步骤的简单高效通信算法。为了研究Fed-SEL的性能,我们引入了臂异质性的概念,用以刻画不同客户端对应的臂分布之间的差异程度。有趣的是,我们的分析揭示了臂异质性在降低Fed-SEL的样本复杂度和通信复杂度方面的好处。作为分析的一个特例,我们表明,对于某些异质问题实例,Fed-SEL仅在一轮通信后即可输出最佳臂。我们的发现具有以下关键含义:与联邦监督学习(最近的工作表明统计异质性会导致较差的性能)不同,在联邦最佳臂识别中,可以证明能够同时获得局部计算和异质性的好处。作为最后的贡献,我们开发了适用于联邦和点对点设置的Fed-SEL变体,它们对拜占庭客户端的存在具有鲁棒性,因此适合部署在恶劣的对抗环境中。
摘要:We study a federated variant of the best-arm identification problem in
stochastic multi-armed bandits: a set of clients, each of whom can sample only
a subset of the arms, collaborate via a server to identify the best arm (i.e.,
the arm with the highest mean reward) with prescribed confidence. For this
problem, we propose Fed-SEL, a simple communication-efficient algorithm that
builds on successive elimination techniques and involves local sampling steps
at the clients. To study the performance of Fed-SEL, we introduce a notion of
arm-heterogeneity that captures the level of dissimilarity between
distributions of arms corresponding to different clients. Interestingly, our
analysis reveals the benefits of arm-heterogeneity in reducing both the sample-
and communication-complexity of Fed-SEL. As a special case of our analysis, we
show that for certain heterogeneous problem instances, Fed-SEL outputs the
best-arm after just one round of communication. Our findings have the following
key implication: unlike federated supervised learning where recent work has
shown that statistical heterogeneity can lead to poor performance, one can
provably reap the benefits of both local computation and heterogeneity for
federated best-arm identification. As our final contribution, we develop
variants of Fed-SEL, both for federated and peer-to-peer settings, that are
robust to the presence of Byzantine clients, and hence suitable for deployment
in harsh, adversarial environments.
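The successive-elimination primitive that Fed-SEL builds on can be sketched for a single client as follows (the confidence radius and Gaussian reward model are textbook choices, not the paper's exact constants):

import numpy as np

def successive_elimination(means, delta=0.05, rng=np.random.default_rng(0)):
    # Vanilla successive elimination; Fed-SEL composes this primitive with
    # local sampling at clients and rounds of server communication.
    k = len(means)
    active = list(range(k))
    est, t = np.zeros(k), 0
    while len(active) > 1:
        t += 1
        for a in active:                      # pull every surviving arm once
            est[a] += (rng.normal(means[a], 1.0) - est[a]) / t
        rad = np.sqrt(2 * np.log(4 * k * t**2 / delta) / t)
        best = max(active, key=lambda a: est[a])
        active = [a for a in active if est[best] - est[a] < 2 * rad]
    return active[0]

print(successive_elimination(np.array([0.1, 0.3, 0.9, 0.5])))   # arm 2

With heterogeneous clients, arms that are clearly suboptimal on some client can be eliminated locally, which is the source of the communication savings.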
【4】 SphereFace Revived: Unifying Hyperspherical Face Recognition
标题:SphereFace复活:统一超球面人脸识别
链接:https://arxiv.org/abs/2109.05565
作者:Weiyang Liu,Yandong Wen,Bhiksha Raj,Rita Singh,Adrian Weller
备注:Technical Report (15 pages)
摘要:本文讨论了开放集协议下的深度人脸识别问题,在适当选择的度量空间下,理想人脸特征的最大类内距离小于最小类间距离。为此,超球面人脸识别作为一个很有前途的研究方向,受到了越来越多的关注,并逐渐成为人脸识别研究的一个主要热点。SphereFace作为超球面人脸识别领域最早的工作之一,明确提出了学习具有较大类间角裕度的人脸嵌入。然而,SphereFace仍然存在严重的训练不稳定性,这限制了它在实践中的应用。为了解决这个问题,我们引入了一个统一的框架来理解超球面人脸识别中的大角度边缘。在此框架下,我们扩展了SphereFace的研究,并提出了一种具有更好训练稳定性的改进变体——SphereFace-R。具体而言,我们提出了两种实现乘法裕度的新方法,并在三种不同的特征规范化方案下研究了SphereFace-R(无特征规范化、硬特征规范化和软特征规范化)。我们还提出了一种实现策略——“特征梯度分离”——以稳定训练。对SphereFace-R的大量实验表明,它始终优于或与最先进的方法相竞争。
摘要:This paper addresses the deep face recognition problem under an open-set
protocol, where ideal face features are expected to have smaller maximal
intra-class distance than minimal inter-class distance under a suitably chosen
metric space. To this end, hyperspherical face recognition, as a promising line
of research, has attracted increasing attention and gradually become a major
focus in face recognition research. As one of the earliest works in
hyperspherical face recognition, SphereFace explicitly proposed to learn face
embeddings with large inter-class angular margin. However, SphereFace still
suffers from severe training instability which limits its application in
practice. In order to address this problem, we introduce a unified framework to
understand large angular margin in hyperspherical face recognition. Under this
framework, we extend the study of SphereFace and propose an improved variant
with substantially better training stability -- SphereFace-R. Specifically, we
propose two novel ways to implement the multiplicative margin, and study
SphereFace-R under three different feature normalization schemes (no feature
normalization, hard feature normalization and soft feature normalization). We
also propose an implementation strategy -- "characteristic gradient detachment"
-- to stabilize training. Extensive experiments on SphereFace-R show that it is
consistently better than or competitive with state-of-the-art methods.
【5】 Structure-preserving Sparse Identification of Nonlinear Dynamics for Data-driven Modeling
标题:数据驱动建模中非线性动力学的保结构稀疏辨识
链接:https://arxiv.org/abs/2109.05364
作者:Kookjin Lee,Nathaniel Trask,Panos Stinis
机构: School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ , Center for Computing Research, Sandia National Laboratories, Albuquerque, NM , Pacific Northwest National Laboratory, Richland, WA
摘要:从数据中发现动力系统构成了数据驱动建模的基础;最近,保结构的几何视角已被证明能够提供更好的预测、稳定性和物理可实现性保证。我们在此提出将非线性动力学稀疏辨识(SINDy)形式体系与神经常微分方程统一起来的框架。所得框架既能学习"黑箱"动力学,也能学习可逆与不可逆动力学的保结构括号形式。我们给出了一套基准测试(包括混沌系统),证明了该方法的有效性和结构保持性。
摘要:Discovery of dynamical systems from data forms the foundation for data-driven
modeling and recently, structure-preserving geometric perspectives have been
shown to provide improved forecasting, stability, and physical realizability
guarantees. We present here a unification of the Sparse Identification of
Nonlinear Dynamics (SINDy) formalism with neural ordinary differential
equations. The resulting framework allows learning of both "black-box" dynamics
and learning of structure preserving bracket formalisms for both reversible and
irreversible dynamics. We present a suite of benchmarks demonstrating
effectiveness and structure preservation, including for chaotic systems.
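As background, plain SINDy amounts to sequentially thresholded least squares over a library of candidate terms; the sketch below recovers a damped oscillator and leaves out the paper's actual contribution, the neural-ODE and bracket-structure extensions:

import numpy as np

# Toy system: damped harmonic oscillator x' = y, y' = -x - 0.1*y.
dt, T = 0.01, 2000
X = np.zeros((T, 2)); X[0] = [2.0, 0.0]
for t in range(T - 1):
    x, y = X[t]
    X[t + 1] = X[t] + dt * np.array([y, -x - 0.1 * y])
dX = np.gradient(X, dt, axis=0)

# Candidate library: [1, x, y, x^2, xy, y^2].
x, y = X[:, 0], X[:, 1]
Theta = np.column_stack([np.ones(T), x, y, x * x, x * y, y * y])

# Sequentially thresholded least squares (the core of SINDy).
Xi, thresh = np.linalg.lstsq(Theta, dX, rcond=None)[0], 0.05
for _ in range(10):
    Xi[np.abs(Xi) < thresh] = 0.0
    for j in range(dX.shape[1]):
        big = np.abs(Xi[:, j]) >= thresh
        if big.any():
            Xi[big, j] = np.linalg.lstsq(Theta[:, big], dX[:, j], rcond=None)[0]
print(np.round(Xi, 3))   # should recover x' ~ y, y' ~ -x - 0.1 y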
【6】 Real-Time EMG Signal Classification via Recurrent Neural Networks
标题:基于递归神经网络的肌电信号实时分类
链接:https://arxiv.org/abs/2109.05674
作者:Reza Bagherian Azhiri,Mohammad Esmaeili,Mehrdad Nourani
机构:Predictive Analytics and Technologies Lab, ME Dept. and ECE Dept., The University of Texas at Dallas, Richardson, TX, USA
摘要:肌电图信号的实时分类是控制假手最具挑战性的部分。在短延迟时间内实现肌电信号的高分类精度仍然是一个挑战。递归神经网络(RNN)是一种适用于连续数据(如肌电图)的人工神经网络结构。在本文中,在从混合时频域(离散小波变换)中提取特征后,我们利用一组基于递归神经网络的结构来提高分类精度并减少预测延迟时间。对这些结构的性能进行了比较,通过在600毫秒内实现96%的分类准确率,总体上优于其他最先进的方法。
摘要:Real-time classification of Electromyography signals is the most challenging
part of controlling a prosthetic hand. Achieving a high classification accuracy
of EMG signals in a short delay time is still challenging. Recurrent neural
networks (RNNs) are artificial neural network architectures that are
appropriate for sequential data such as EMG. In this paper, after extracting
features from a hybrid time-frequency domain (discrete Wavelet transform), we
utilize a set of recurrent neural network-based architectures to increase the
classification accuracy and reduce the prediction delay time. The performances
of these architectures are compared and in general outperform other
state-of-the-art methods by achieving 96% classification accuracy in 600 msec.
表征(2篇)
【1】 Cross Domain Robot Imitation with Invariant Representation
标题:具有不变表示的跨域机器人仿真
链接:https://arxiv.org/abs/2109.05940
作者:Zhao-Heng Yin,Lingfeng Sun,Hengbo Ma,Masayoshi Tomizuka,Wu-Jun Li
机构:Department of Computer Science and Technology, Nanjing University
摘要:尽管在生物力学上存在差异,动物仍能够模仿彼此的行为。相比之下,在机器人学中,模仿其他相似的机器人是一项更具挑战性的任务,这个问题称为跨域模仿学习(CDIL)。在本文中,我们研究一类相似机器人上的CDIL。我们通过引入一种基于不变表示的模仿学习算法来解决这个问题。我们提出学习不变的状态和动作表示,以对齐多个机器人的行为,从而使CDIL成为可能。与以往用于类似目的的不变表示学习方法相比,我们的方法不需要人工标注的成对数据进行训练。相反,我们使用循环一致性和域混淆来对齐表示并增强其鲁棒性。我们在模拟器中的多个机器人上测试了该算法,结果表明,未见过的新机器人实例可以利用已有的专家演示成功训练。定性结果还表明,所提出的方法能够为行为相似的不同机器人学习到相似的表示,这对于成功的CDIL至关重要。
摘要:Animals are able to imitate each other's behavior, despite their differences
in biomechanics. In contrast, imitating other similar robots is a much more
challenging task in robotics. This problem is called cross-domain imitation
learning (CDIL). In this paper, we consider CDIL on a class of similar robots.
We tackle this problem by introducing an imitation learning algorithm based on
invariant representation. We propose to learn invariant state and action
representations, which aligns the behavior of multiple robots so that CDIL
becomes possible. Compared with previous invariant representation learning
methods for similar purpose, our method does not require human-labeled pairwise
data for training. Instead, we use cycle-consistency and domain confusion to
align the representation and increase its robustness. We test the algorithm on
multiple robots in simulator and show that unseen new robot instances can be
trained with existing expert demonstrations successfully. Qualitative results
also demonstrate that the proposed method is able to learn similar
representations for different robots with similar behaviors, which is essential
for successful CDIL.
【2】 Explaining Deep Learning Representations by Tracing the Training Process
标题:追踪训练过程解释深度学习表征
链接:https://arxiv.org/abs/2109.05880
作者:Lukas Pfahler,Katharina Morik
机构:TU Dortmund University, Dortmund, Germany
摘要:我们提出了一种新的解释方法,通过研究在训练过程中如何细化深层神经网络各层的中间表示来解释深层神经网络的决策。通过这种方式,我们可以a)在训练过程中找到最具影响力的训练示例,b)分析哪些课程最适合最终表现。我们的方法是通用的:它可以包裹在任何迭代优化过程中,并涵盖各种神经网络结构,包括前馈网络和卷积神经网络。我们首先提出了一种单训练实例的随机训练方法,但也继续推导出一种常见小批量训练的变体。在实验评估中,我们表明,我们的方法确定了具有高度代表性的训练实例,可以作为一种解释。此外,我们还提出了一种可视化方法,以聚合统计的形式对整个训练过程进行解释。
摘要:We propose a novel explanation method that explains the decisions of a deep
neural network by investigating how the intermediate representations at each
layer of the deep network were refined during the training process. This way we
can a) find the most influential training examples during training and b)
analyze which classes attributed most to the final representation. Our method
is general: it can be wrapped around any iterative optimization procedure and
covers a variety of neural network architectures, including feed-forward
networks and convolutional neural networks. We first propose a method for
stochastic training with single training instances, but continue to also derive
a variant for the common mini-batch training. In experimental evaluations, we
show that our method identifies highly representative training instances that
can be used as an explanation. Additionally, we propose a visualization that
provides explanations in the form of aggregated statistics over the whole
training process.
编码器(1篇)
【1】 Fast Variational AutoEncoder with Inverted Multi-Index for Collaborative Filtering
标题:用于协同过滤的多索引倒排快速变分自动编码器
链接:https://arxiv.org/abs/2109.05773
作者:Jin Chen,Binbin Jin,Xu Huang,Defu Lian,Kai Zheng,Enhong Chen
机构:University of Electronic Science and Technology of China, University of Science and Technology of China
摘要:变分自动编码器(VAE)已被扩展为一种代表性的非线性协同过滤方法。然而,VAE的瓶颈在于对所有物品的softmax计算,计算损失和梯度的成本与物品数量呈线性关系。由于现实场景中物品数以百万计,这阻碍了其实际使用。重要性采样是一种有效的近似方法,采样softmax正是在其基础上推导出来的。然而,现有方法通常使用均匀分布或流行度分布作为提议分布,导致梯度估计存在较大偏差。为此,我们提出基于倒排多索引来分解基于内积的softmax概率,从而实现次线性时间且高度准确的采样。基于所提出的提议分布,我们开发了一种用于协同过滤的快速变分自动编码器(FastVAE)。在三个真实数据集上的实验表明,FastVAE在采样质量和效率方面都优于最先进的基线。
摘要:Variational AutoEncoder (VAE) has been extended as a representative nonlinear
method for collaborative filtering. However, the bottleneck of VAE lies in the
softmax computation over all items, such that it takes linear costs in the
number of items to compute the loss and gradient for optimization. This hinders
the practical use due to millions of items in real-world scenarios. Importance
sampling is an effective approximation method, based on which the sampled
softmax has been derived. However, existing methods usually exploit the uniform
or popularity sampler as proposal distributions, leading to a large bias of
gradient estimation. To this end, we propose to decompose the
inner-product-based softmax probability based on the inverted multi-index,
leading to sublinear-time and highly accurate sampling. Based on the proposed
proposals, we develop a fast Variational AutoEncoder (FastVAE) for
collaborative filtering. FastVAE can outperform the state-of-the-art baselines
in terms of both sampling quality and efficiency according to the experiments
on three real-world datasets.
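The correction that makes sampled softmax (approximately) consistent is to subtract the log proposal probability from each sampled logit. The sketch below uses a popularity proposal as a stand-in for the paper's inverted multi-index sampler; all sizes and names are arbitrary:

import numpy as np

def sampled_softmax_logits(user_vec, item_emb, pos_id, neg_ids, proposal_p):
    # Subtract log q(i) so training on sampled negatives stays consistent
    # with the full softmax objective.
    ids = np.concatenate([[pos_id], neg_ids])
    return item_emb[ids] @ user_vec - np.log(proposal_p[ids])  # slot 0 = positive

rng = np.random.default_rng(0)
n_items, d = 10000, 32
item_emb = rng.normal(size=(n_items, d))
user_vec = rng.normal(size=d)

# Popularity proposal: biased and high-variance; the paper instead samples
# (in sublinear time) from an inverted multi-index that tracks inner products.
pop = rng.pareto(1.0, size=n_items) + 1e-6
proposal_p = pop / pop.sum()
neg_ids = rng.choice(n_items, size=50, p=proposal_p)

logits = sampled_softmax_logits(user_vec, item_emb, 0, neg_ids, proposal_p)
loss = -logits[0] + np.log(np.exp(logits - logits.max()).sum()) + logits.max()
print(loss)

The closer the proposal tracks the true softmax distribution over items, the lower the bias and variance of this estimate, which is what motivates the inner-product-aware sampler.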
优化|敛散性(6篇)
【1】 DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture
标题:DHA:数据增强策略、超参数和体系结构的端到端联合优化
链接:https://arxiv.org/abs/2109.05765
作者:Kaichen Zhou,Lanqing Hong,Shoukang Hu,Fengwei Zhou,Binxin Ru,Jiashi Feng,Zhenguo Li
机构: University of Oxford, Huawei Noah’s Ark Lab, The Chinese University of Hong Kong, National University of Singapore
摘要:自动机器学习(AutoML)通常涉及几个关键组件,如数据扩充(DA)策略、超参数优化(HPO)和神经结构搜索(NAS)。尽管已经提出了许多策略来分别自动化这些组件,但由于搜索维度的大幅增加和每个组件输入类型的差异,这些组件的联合优化仍然具有挑战性。同时,按顺序执行这些组件通常需要人类专家的仔细协调,并可能导致次优结果。与此并行,NAS中通常的做法是先搜索最佳体系结构,再在部署前对其重新训练,这往往导致搜索和重新训练阶段之间的性能相关性较低。因此需要一种端到端解决方案,它集成各AutoML组件,并在搜索结束时返回一个随时可用的模型。鉴于此,我们提出了DHA,它实现了数据扩充策略、超参数和体系结构的联合优化。具体而言,端到端NAS通过优化压缩的低维特征空间以可微分的方式实现,而DA策略和HPO同时动态更新。实验表明,DHA在各种数据集上获得了最先进的(SOTA)结果,尤其是在基于单元的搜索空间的ImageNet上,准确率达到77.4%,比当前SOTA高0.5%。据我们所知,这是第一个以端到端方式高效地联合优化DA策略、NAS和HPO而无需再训练的工作。
摘要:Automated machine learning (AutoML) usually involves several crucial
components, such as Data Augmentation (DA) policy, Hyper-Parameter Optimization
(HPO), and Neural Architecture Search (NAS). Although many strategies have been
developed for automating these components in separation, joint optimization of
these components remains challenging due to the largely increased search
dimension and the variant input types of each component. Meanwhile, conducting
these components in a sequence often requires careful coordination by human
experts and may lead to sub-optimal results. In parallel to this, the common
practice of searching for the optimal architecture first and then retraining it
before deployment in NAS often suffers from low performance correlation between
the search and retraining stages. An end-to-end solution that integrates the
AutoML components and returns a ready-to-use model at the end of the search is
desirable. In view of these, we propose DHA, which achieves joint optimization
of Data augmentation policy, Hyper-parameter and Architecture. Specifically,
end-to-end NAS is achieved in a differentiable manner by optimizing a
compressed lower-dimensional feature space, while DA policy and HPO are updated
dynamically at the same time. Experiments show that DHA achieves
state-of-the-art (SOTA) results on various datasets, especially 77.4% accuracy
on ImageNet with cell-based search space, which is higher than current SOTA by
0.5%. To the best of our knowledge, we are the first to efficiently and
jointly optimize DA policy, NAS, and HPO in an end-to-end manner without
retraining.
【2】 HyP-ABC: A Novel Automated Hyper-Parameter Tuning Algorithm Using Evolutionary Optimization
标题:HyP-ABC:一种基于进化优化的超参数自动整定新算法
链接:https://arxiv.org/abs/2109.05319
作者:Leila Zahedi,Farid Ghareh Mohammadi,M. Hadi Amini
机构:Knight Foundation School of, Computing and Information Sciences, Florida International University, Miami, Florida , Department of Computer Science, University of Georgia, Athens, Georgia
备注:6 figures, 2 tables
摘要:机器学习技术在广泛的应用中成为有前途的决策和分析工具。不同的ML算法具有不同的超参数。为了针对特定的应用定制ML模型,需要调整大量的超参数。调整超参数直接影响性能(准确性和运行时)。然而,对于大规模搜索空间,有效地探索足够数量的超参数组合在计算上具有挑战性。现有的自动超参数调整技术存在时间复杂度高的问题。在本文中,我们提出了HyP-ABC,一种使用改进人工蜂群方法的自动创新混合超参数优化算法,用于测量三种ML算法的分类精度,即随机森林、极端梯度增强和支持向量机。与最先进的技术相比,HyP-ABC效率更高,需要调整的参数数量有限,因此值得用于实际的超参数优化问题。我们进一步将我们提出的HyP-ABC算法与最先进的技术进行比较。为了保证该方法的鲁棒性,该算法采用了大量可行的超参数值,并使用真实的教育数据集进行了测试。
摘要:Machine learning techniques lend themselves as promising decision-making and
analytic tools in a wide range of applications. Different ML algorithms have
various hyper-parameters. In order to tailor an ML model towards a specific
application, a large number of hyper-parameters should be tuned. Tuning the
hyper-parameters directly affects the performance (accuracy and run-time).
However, for large-scale search spaces, efficiently exploring the ample number
of combinations of hyper-parameters is computationally challenging. Existing
automated hyper-parameter tuning techniques suffer from high time complexity.
In this paper, we propose HyP-ABC, an automatic innovative hybrid
hyper-parameter optimization algorithm using the modified artificial bee colony
approach, to measure the classification accuracy of three ML algorithms, namely
random forest, extreme gradient boosting, and support vector machine. Compared
to the state-of-the-art techniques, HyP-ABC is more efficient and has a limited
number of parameters to be tuned, making it worthwhile for real-world
hyper-parameter optimization problems. We further compare our proposed HyP-ABC
algorithm with state-of-the-art techniques. In order to ensure the robustness
of the proposed method, the algorithm takes a wide range of feasible
hyper-parameter values, and is tested using a real-world educational dataset.
【3】 Fundamental limits of over-the-air optimization: Are analog schemes optimal?
标题:空中优化的基本限制:模拟方案是最优的吗?
链接:https://arxiv.org/abs/2109.05222
作者:Shubham K Jha,Prathamesh Mayekar,Himanshu Tyagi
备注:An abridged version of this paper will appear in the proceedings of IEEE Global Communications Conference (GLOBECOM), 2021
摘要:我们考虑d维空间上的空中(over-the-air)凸优化,其中编码后的梯度通过方差为\sigma^2的加性高斯噪声信道发送。码字满足平均功率约束P,对应的信噪比(SNR)为P/\sigma^2。我们推导了空中优化收敛速度的界。我们的第一个结果是收敛速度的下界,表明任何编码都必须使收敛速度减慢大约\sqrt{d/\log(1+SNR)}倍。接下来,我们考虑一类流行的称为模拟编码的方案,其中发送的是梯度的线性函数。我们证明,一种简单的缩放传输模拟编码方案会使收敛速度减慢\sqrt{d(1+1/SNR)}倍。在低信噪比下,这与前述下界在常数因子内相匹配,使缩放传输方案在低信噪比下达到最优。然而,我们证明了这种减速对于任何模拟编码方案都是必要的:特别地,即使在SNR趋于无穷大时,模拟编码的收敛速度仍会减慢\sqrt{d}倍。值得注意的是,我们提出了一种简单的量化调制方案,该方案使用幅移键控(ASK),几乎在所有信噪比下都能达到最佳收敛速度。
摘要:We consider over-the-air convex optimization on a d dimensional space where
coded gradients are sent over an additive Gaussian noise channel with variance
\sigma^2. The codewords satisfy an average power constraint P, resulting in the
signal-to-noise ratio (SNR) of P/\sigma^2. We derive bounds for the convergence
rates for over-the-air optimization. Our first result is a lower bound for the
convergence rate, showing that any code must slow down the convergence rate by a
factor of roughly \sqrt{d/log(1 + SNR)}. Next, we consider a popular class of
schemes called analog coding, where a linear function of the gradient is sent.
We show that a simple scaled transmission analog coding scheme results in a
slowdown in convergence rate by a factor of \sqrt{d(1 + 1/SNR)}. This matches
the previous lower bound up to constant factors for low SNR, making the scaled
transmission scheme optimal at low SNR. However, we show that this slowdown is
necessary for any analog coding scheme. In particular, a slowdown in
convergence by a factor of \sqrt{d} for analog coding remains even when SNR
tends to infinity. Remarkably, we present a simple quantize-and-modulate scheme
that uses Amplitude Shift Keying and almost attains the optimal convergence
rate at all SNRs.
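The scaled-transmission scheme analyzed here is short to simulate: normalize the gradient to meet the power budget, add channel noise, unscale at the receiver. Note how the receiver's unscaling amplifies the noise when the gradient is large, which is the source of the \sqrt{d(1+1/SNR)} slowdown (the quadratic objective below is a toy choice):

import numpy as np

def airsgd_step(w, grad_fn, rng, P=1.0, sigma=0.5, lr=0.1):
    g = grad_fn(w)
    scale = np.sqrt(P) / np.linalg.norm(g)                       # meet power constraint P
    received = scale * g + rng.normal(0.0, sigma, size=g.shape)  # AWGN channel
    return w - lr * received / scale                             # unscale, then step

rng = np.random.default_rng(0)
w = np.ones(64)
for _ in range(500):
    w = airsgd_step(w, lambda v: v, rng)   # gradient of ||w||^2 / 2 is w
print(np.linalg.norm(w))                   # converges to a noise floor, not to 0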
【4】 Nonlinear matrix recovery using optimization on the Grassmann manifold
标题:基于优化的Grassmann流形上的非线性矩阵恢复
链接:https://arxiv.org/abs/2109.06095
作者:Florentin Goyens,Coralia Cartis,Armin Eftekhari
摘要:我们研究部分观测高秩矩阵的恢复问题,该矩阵的列服从非线性结构,如子空间的并、代数簇,或按簇分组。恢复问题被表述为对原始矩阵施加非线性特征映射后的秩极小化,并进一步用一个涉及Grassmann流形的约束非凸优化问题来逼近。我们提出了两组算法,一组源自黎曼优化,另一组是交替极小化方案,两组都包含一阶和二阶变体。这两组算法都有理论保证。特别地,对于交替极小化,我们建立了全局收敛性和最坏情况复杂度界。此外,利用Kurdyka-Lojasiewicz性质,我们证明了交替极小化收敛到唯一的极限点。我们提供了大量数值结果,用于在逐元素采样和稠密高斯采样下恢复子空间的并与聚类结构。我们的方法与现有方法相比具有竞争力,特别是使用黎曼二阶方法进行恢复时可以达到很高的精度。
摘要:We investigate the problem of recovering a partially observed high-rank
matrix whose columns obey a nonlinear structure such as a union of subspaces,
an algebraic variety or grouped in clusters. The recovery problem is formulated
as the rank minimization of a nonlinear feature map applied to the original
matrix, which is then further approximated by a constrained non-convex
optimization problem involving the Grassmann manifold. We propose two sets of
algorithms, one arising from Riemannian optimization and the other as an
alternating minimization scheme, both of which include first- and second-order
variants.
Both sets of algorithms have theoretical guarantees. In particular, for the
alternating minimization, we establish global convergence and worst-case
complexity bounds. Additionally, using the Kurdyka-Lojasiewicz property, we
show that the alternating minimization converges to a unique limit point.
We provide extensive numerical results for the recovery of union of subspaces
and clustering under entry sampling and dense Gaussian sampling. Our methods
are competitive with existing approaches and, in particular, high accuracy is
achieved in the recovery using Riemannian second-order methods.
【5】 Online Optimization of Stimulation Speed in an Auditory Brain-Computer Interface under Time Constraints
标题:时间约束下听觉脑机接口刺激速度的在线优化
链接:https://arxiv.org/abs/2109.06011
作者:Jan Sosulski,David Hübner,Aaron Klein,Michael Tangermann
机构:University of Freiburg, Germany, Radboud University, The Netherlands
摘要:使用机器学习对通过(例如)脑电图记录的大脑信号进行解码是脑-机接口(BCI)的关键。BCI协议的刺激参数或其他实验设置通常根据文献选择。解码性能直接取决于参数的选择,因为它们影响诱发的大脑信号,最佳参数取决于受试者。因此,快速、自动地选择实验参数可以极大地提高BCIs的可用性。我们评估了闭环听觉事件相关电位协议中的独立随机搜索和贝叶斯优化与随机搜索的组合。我们的目标是找到个体最佳刺激速度——也称为刺激开始异步(SOA)——最大化正则化线性判别分析的分类性能。为了使贝叶斯优化在噪声和在线BCI实验造成的时间压力下可行,我们首先使用离线模拟来初始化和约束内部优化模型。然后,我们对13名健康受试者在线评估了我们的方法。我们可以证明,对于13个受试者中的8个,所提出的使用贝叶斯优化的方法成功地从多个评估的SOA值中选择了各自最优的SOA。然而,我们的数据表明,受试者在很大程度上受到SOA参数的影响。这使得自动参数选择对于受影响有限的对象不可行。我们的工作提出了一种利用个体化实验方案优势的方法,并在听觉脑机接口中对其进行了评估。当应用于其他实验参数时,我们的方法可以提高BCI在不同目标群体中的可用性——特别是在个别疾病进展可能阻止使用标准参数的情况下。
摘要:The decoding of brain signals recorded via, e.g., an electroencephalogram,
using machine learning is key to brain-computer interfaces (BCIs). Stimulation
parameters or other experimental settings of the BCI protocol typically are
chosen according to the literature. The decoding performance directly depends
on the choice of parameters, as they influence the elicited brain signals and
optimal parameters are subject-dependent. Thus a fast and automated selection
procedure for experimental parameters could greatly improve the usability of
BCIs.
We evaluate a standalone random search and a combined Bayesian optimization
with random search in a closed-loop auditory event-related potential protocol.
We aimed at finding the individually best stimulation speed -- also known as
stimulus onset asynchrony (SOA) -- that maximizes the classification
performance of a regularized linear discriminant analysis. To make the Bayesian
optimization feasible under noise and the time pressure posed by an online BCI
experiment, we first used offline simulations to initialize and constrain the
internal optimization model. Then we evaluated our approach online with 13
healthy subjects.
We could show that for 8 out of 13 subjects, the proposed approach using
Bayesian optimization succeeded to select the individually optimal SOA out of
multiple evaluated SOA values. Our data suggests, however, that subjects were
influenced to very different degrees by the SOA parameter. This makes the
automatic parameter selection infeasible for subjects where the influence is
limited.
Our work proposes an approach to exploit the benefits of individualized
experimental protocols and evaluated it in an auditory BCI. When applied to
other experimental parameters our approach could enhance the usability of BCI
for different target groups -- specifically if an individual disease progress
may prevent the use of standard parameters.
【6】 Near Instance Optimal Model Selection for Pure Exploration Linear Bandits
标题:纯探索线性老虎机的近实例最优模型选择
链接:https://arxiv.org/abs/2109.05131
作者:Yinglun Zhu,Julian Katz-Samuels,Robert Nowak
机构:University of Wisconsin-Madison
摘要:本文引入了纯探索线性老虎机(bandit)设置下的模型选择问题,并在固定置信度和固定预算两种设置下加以研究。模型选择问题考虑一个复杂度递增的嵌套假设类序列。我们的目标是自动适应包含真实模型的最小假设类的实例相关复杂度度量,而不是承受与最大假设类相关的复杂度度量。我们提供的证据表明,标准的维度加倍技巧无法实现最优的实例相关样本复杂度。我们的算法基于实验设计定义了一个新的优化问题,利用动作集的几何结构来高效地识别近似最优的假设类。我们的固定预算算法对老虎机中的选择-验证技巧进行了一种新的应用。这为线性老虎机中研究不足的固定预算设置提供了一种新方法(即使没有模型选择的额外挑战)。我们进一步将模型选择问题推广到模型误设(misspecified)情形,并在固定置信度和固定预算设置下调整了我们的算法。
摘要:The model selection problem in the pure exploration linear bandit setting is
introduced and studied in both the fixed confidence and fixed budget settings.
The model selection problem considers a nested sequence of hypothesis classes
of increasing complexities. Our goal is to automatically adapt to the
instance-dependent complexity measure of the smallest hypothesis class
containing the true model, rather than suffering from the complexity measure
related to the largest hypothesis class. We provide evidence showing that a
standard doubling trick over dimension fails to achieve the optimal
instance-dependent sample complexity. Our algorithms define a new optimization
problem based on experimental design that leverages the geometry of the action
set to efficiently identify a near-optimal hypothesis class. Our fixed budget
algorithm uses a novel application of a selection-validation trick in bandits.
This provides a new method for the understudied fixed budget setting in linear
bandits (even without the added challenge of model selection). We further
generalize the model selection problem to the misspecified regime, adapting our
algorithms in both fixed confidence and fixed budget settings.
预测|估计(6篇)
【1】 Joint prediction of truecasing and punctuation for conversational speech in low-resource scenarios
标题:低资源场景下会话语音大小写与标点符号的联合预测
链接:https://arxiv.org/abs/2109.06103
作者:Raghavendra Pappagari,Piotr Żelasko,Agnieszka Mikołajczyk,Piotr Pęzik,Najim Dehak
机构:Center for Language and Speech Processing, Johns Hopkins University, Baltimore, USA, VoiceLab, Poland
备注:Accepted for ASRU 2021
摘要:大小写和标点符号是理解书面文本和会话记录的重要线索。然而,许多ASR系统并不输出带标点和大小写格式的语音转录。我们建议使用一个多任务系统,利用大小写和标点之间的关系来提高二者的预测性能。虽然用于预测标点符号和真实大小写的文本数据似乎非常丰富,但我们认为书面文本资源不足以作为会话模型的训练数据。我们通过比较标点符号和单词大小写的联合分布,并通过跨域测试我们的模型,量化了书面文本域和会话文本域之间的不匹配。此外,我们还表明,先在书面文本领域训练模型,再迁移学习到对话领域,可以用较少的数据实现合理的性能。
摘要:Capitalization and punctuation are important cues for comprehending written
texts and conversational transcripts. Yet, many ASR systems do not produce
punctuated and case-formatted speech transcripts. We propose to use a
multi-task system that can exploit the relations between casing and punctuation
to improve their prediction performance. Whereas text data for predicting
punctuation and truecasing is seemingly abundant, we argue that written text
resources are inadequate as training data for conversational models. We
quantify the mismatch between written and conversational text domains by
comparing the joint distributions of punctuation and word cases, and by testing
our model cross-domain. Further, we show that by training the model in the
written text domain and then transfer learning to conversations, we can achieve
reasonable performance with less data.
【2】 Direct Advantage Estimation
标题:直接优势估算
链接:https://arxiv.org/abs/2109.06093
作者:Hsiao-Ru Pan,Nico Gürtler,Alexander Neitz,Bernhard Schölkopf
机构:Max Planck Institute for Intelligent Systems, T¨ubingen, Germany
摘要:信用分配(credit assignment)是强化学习的核心问题之一。主流方法是根据期望回报来分配信用。然而,我们表明,期望回报可能以一种不理想的方式依赖于策略,从而减慢学习速度。与之不同,我们借用因果文献中的观点,证明优势函数可以解释为因果效应,它与因果表征具有相似的性质。基于这一见解,我们提出了直接优势估计(DAE),这是一种可以对优势函数建模并直接从数据中估计它的新方法,无需(动作)值函数。如有需要,值函数也可以无缝集成到DAE中,并以与时序差分(TD)学习类似的方式更新。所提出的方法易于实现,并可被现代actor-critic方法直接采用。我们在Atari域上对DAE进行了实证测试,结果表明它与最先进的优势估计方法相比具有竞争力。
摘要:Credit assignment is one of the central problems in reinforcement learning.
The predominant approach is to assign credit based on the expected return.
However, we show that the expected return may depend on the policy in an
undesirable way which could slow down learning. Instead, we borrow ideas from
the causality literature and show that the advantage function can be
interpreted as causal effects, which share similar properties with causal
representations. Based on this insight, we propose the Direct Advantage
Estimation (DAE), a novel method that can model the advantage function and
estimate it directly from data without requiring the (action-)value function.
If desired, value functions can also be seamlessly integrated into DAE and be
updated in a similar way to Temporal Difference Learning. The proposed method
is easy to implement and can be readily adopted by modern actor-critic methods.
We test DAE empirically on the Atari domain and show that it can achieve
competitive results with the state-of-the-art method for advantage estimation.
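The centering property at the heart of the method is simple to state: the advantage is the Q-value centered under the policy, so it sums to zero under pi in every state. DAE models A directly from data under this constraint; the sketch below only derives A from toy Q estimates to exhibit the constraint (all numbers are random):

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
pi = rng.dirichlet(np.ones(n_actions), size=n_states)   # policy probabilities
Q = rng.normal(size=(n_states, n_actions))              # toy Q estimates

# Advantage = Q centered under the policy: sum_a pi(a|s) A(s, a) = 0.
V = np.sum(pi * Q, axis=1, keepdims=True)
A = Q - V
print(np.allclose(np.sum(pi * A, axis=1), 0.0))   # True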
【3】 Accurate Prediction Using Triangular Type-2 Fuzzy Linear Regression
标题:三角二型模糊线性回归的精确预测
链接:https://arxiv.org/abs/2109.05461
作者:Assef Zare,Afshin Shoeibi,Narges Shafaei,Parisa Moridian,Roohallah Alizadehsani,Majid Halaji,Abbas Khosravi
机构:Gonabad Branch, Islamic Azad University, Gonabad, Iran;, Clinical Studies Lab (CSL), K. N. Toosi University of Technology, Science and Research Branch, Islamic Azad University, Tehran, Iran.
摘要:许多工作已经完成,以处理数据中的不确定性使用1型模糊回归。很少有第2类模糊回归工作使用区间第2类进行使用第1类模糊隶属度的不确定建模。目前的调查提出了一个三角形2型模糊回归(TT2FR)模型,通过处理数据中的不确定性来提高模型的效率。采用三角二次隶属函数代替广泛使用的区间型模型。在该模型中,一级模糊集和二级模糊集的模糊性被最小化,并且在预测值的相同{\alpha}平面中包含观测值的指定x平面。通过将三维2型模糊集(3DT2FS)简化为二维区间2型模糊(2DIT2F)模型,简化了2型模糊(T2F)模型的复杂计算。当前调查通过考虑T2F隶属函数的更一般形式,提出了一种新的T2F回归模型,从而避免了高复杂性。利用TAIEX和COVID-19预测数据集对所开发模型的性能进行评估。与其他最先进的技术相比,我们开发的模型达到了最高的性能。我们开发的方法可以用更多的不确定数据进行测试,并有可能用于天气和股票预测。
摘要:Many works have been done to handle the uncertainties in the data using type
1 fuzzy regression. Few type 2 fuzzy regression works used interval type 2 for
indeterminate modeling using type 1 fuzzy membership. The current survey
proposes a triangular type-2 fuzzy regression (TT2FR) model to ameliorate the
efficiency of the model by handling the uncertainty in the data. The triangular
secondary membership function is used instead of widely used interval type
models. In the proposed model, vagueness in primary and secondary fuzzy sets is
minimized and also, a specified x-plane of observed value is included in the
same {\alpha}- plane of the predicted value. Complex calculations of the type-2
fuzzy (T2F) model are simplified by reducing three dimensional type-2 fuzzy set
(3DT2FS) into two dimensional interval type-2 fuzzy (2DIT2F) models. The
current survey presents a new regression model of T2F by considering the more
general form of T2F membership functions and thus avoids high complexity. The
performance of the developed model is evaluated using the TAIEX and COVID-19
forecasting datasets. Our developed model reached the highest performance as
compared to the other state-of-art techniques. Our developed method is ready to
be tested with more uncertain data and has the potential to use to predict the
weather and stock prediction.
【4】 Remaining Useful Life Estimation of Hard Disk Drives using Bidirectional LSTM Networks
标题:基于双向LSTM网络的硬盘剩余使用寿命估算
链接:https://arxiv.org/abs/2109.05351
作者:Austin Coursey,Gopal Nath,Srikanth Prabhu,Saptarshi Sengupta
机构:∗Department of Computer Science and Information Systems, Murray State University, Murray, KY, USA, †Department of Mathematics and Statistics, Murray State University, Murray, KY, USA
备注:10 pages, 11 figures, 3 tables
摘要:运行可靠的大容量存储系统可以很好地为物理和云存储服务提供服务。最近的观察表明,硬盘可靠性是包含大量存储设备(如HDD)的数据中心最紧迫的可靠性问题之一。在这方面,在磁盘级别早期检测即将发生的故障有助于减少系统停机时间并减少操作损失,从而使主动健康监测成为此类环境中AIOP的优先事项。在这项工作中,我们介绍了提取与操作故障相关的有意义属性的方法,以及使用数据驱动方法对高度不平衡的健康统计数据进行预处理,以用于后续的预测任务。我们使用一个具有多日回顾期的双向LSTM来了解健康指标的时间进程,并将其与普通LSTM和随机森林模型进行比较,得出几个关键指标,这些指标在一些严格定义的操作约束下确定了我们模型的有用性和优越性。例如,使用15天的回顾期,考虑到故障前60天的测试数据,我们的方法可以预测磁盘故障的发生,准确率为96.4%。这有助于提前向运行维护部门发出有关潜在缓解需求的警报。此外,我们的模型报告的平均绝对误差为0.12,用于提前60天预测故障,使其在最近的文献中处于最先进水平。
摘要:Physical and cloud storage services are well-served by functioning and
reliable high-volume storage systems. Recent observations point to hard disk
reliability as one of the most pressing reliability issues in data centers
containing massive volumes of storage devices such as HDDs. In this regard,
early detection of impending failure at the disk level aids in reducing system
downtime and reduces operational loss making proactive health monitoring a
priority for AIOps in such settings. In this work, we introduce methods of
extracting meaningful attributes associated with operational failure and of
pre-processing the highly imbalanced health statistics data for subsequent
prediction tasks using data-driven approaches. We use a Bidirectional LSTM with
a multi-day look back period to learn the temporal progression of health
indicators and baseline them against vanilla LSTM and Random Forest models to
come up with several key metrics that establish the usefulness of and
superiority of our model under some tightly defined operational constraints.
For example, using a 15 day look back period, our approach can predict the
occurrence of disk failure with an accuracy of 96.4% considering test data 60
days before failure. This helps to alert operations maintenance well in-advance
about potential mitigation needs. In addition, our model reports a mean
absolute error of 0.12 for predicting failure up to 60 days in advance, placing
it among the state-of-the-art in recent literature.
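A minimal sketch of the look-back windowing plus the Random Forest baseline the paper compares against; the synthetic SMART-style attribute series is invented, and the actual model is a Bidirectional LSTM rather than the forest used here:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def make_windows(series, lookback=15):
    # Stack `lookback` consecutive days of health indicators into one sample
    # per disk-day, mirroring the 15-day look back period.
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    return X.reshape(len(X), -1)

rng = np.random.default_rng(0)
healthy = rng.normal(0, 1, size=(200, 4))                          # stable disk
failing = rng.normal(0, 1, size=(200, 4)) + np.linspace(0, 3, 200)[:, None]  # drifting disk
Xh, Xf = make_windows(healthy), make_windows(failing)
X = np.vstack([Xh, Xf])
y = np.r_[np.zeros(len(Xh)), np.ones(len(Xf))]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("train accuracy:", clf.score(X, y))

A recurrent model replaces the flattened window with the sequence itself, letting the temporal progression of the indicators be learned rather than hand-encoded.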
【5】 An Empirical Comparison of Off-policy Prediction Learning Algorithms in the Four Rooms Environment
标题:四室环境下非策略预测学习算法的实证比较
链接:https://arxiv.org/abs/2109.05110
作者:Sina Ghiassian,Richard S. Sutton
机构: University of Alberta and DeepMind
备注:13 pages
摘要:在过去的十年中,已经提出了许多非策略预测学习算法,但仍不清楚哪些算法学习速度比其他算法快。我们在两个小任务(房间任务和高方差房间任务)上比较了11种使用线性函数近似的非策略预测学习算法。任务的设计使得在其中快速学习具有挑战性。在房间任务中,重要性采样比率的乘积可以大到$2^{14}$,有时可以是2。为了控制重要性采样比率乘积引起的高方差,步长应设置得较小,这反过来会减慢学习速度。高方差房间任务更为极端,因为比率的乘积可能高达$2^{14}\times 25$。本文基于Ghiassian和Sutton(2021)对非策略预测学习算法的实证研究。我们考虑与他们相同的算法集,并采用相同的实验方法。所考虑的算法有:非策略TD($\lambda$)、五种Gradient-TD算法、两种Emphatic-TD算法、Tree Backup($\lambda$)、Vtrace($\lambda$)和ABTD($\zeta$)。我们发现,算法的性能受到重要性采样比率引起的方差的高度影响。数据显示,Tree Backup($\lambda$)、Vtrace($\lambda$)和ABTD($\zeta$)不像其他算法那样受高方差的影响,但它们对有效自举参数的限制,对于不存在高方差的任务来说过于严格。我们观察到,与其他算法相比,Emphatic TD($\lambda$)的渐近误差往往较小,但在某些情况下学习速度可能较慢。我们根据实践者感兴趣的问题为他们推荐算法,并提出可应用于特定算法、有望带来显著改进的方法。
摘要:Many off-policy prediction learning algorithms have been proposed in the past
decade, but it remains unclear which algorithms learn faster than others. We
empirically compare 11 off-policy prediction learning algorithms with linear
function approximation on two small tasks: the Rooms task, and the High
Variance Rooms task. The tasks are designed such that learning fast in them is
challenging. In the Rooms task, the product of importance sampling ratios can
be as large as $2^{14}$ and can sometimes be two. To control the high variance
caused by the product of the importance sampling ratios, step size should be
set small, which in turn slows down learning. The High Variance Rooms task is
more extreme in that the product of the ratios can become as large as
$2^{14}\times 25$. This paper builds upon the empirical study of off-policy
prediction learning algorithms by Ghiassian and Sutton (2021). We consider the
same set of algorithms as theirs and employ the same experimental methodology.
The algorithms considered are: Off-policy TD($\lambda$), five Gradient-TD
algorithms, two Emphatic-TD algorithms, Tree Backup($\lambda$),
Vtrace($\lambda$), and ABTD($\zeta$). We found that the algorithms' performance
is highly affected by the variance induced by the importance sampling ratios.
The data shows that Tree Backup($\lambda$), Vtrace($\lambda$), and
ABTD($\zeta$) are not affected by the high variance as much as other algorithms
but they restrict the effective bootstrapping parameter in a way that is too
limiting for tasks where high variance is not present. We observed that
Emphatic TD($\lambda$) tends to have lower asymptotic error than other
algorithms, but might learn more slowly in some cases. We suggest algorithms
for practitioners based on their problem of interest, and suggest approaches
that can be applied to specific algorithms that might result in substantially
improved algorithms.
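For readers unfamiliar with the algorithms compared, the sketch below shows linear off-policy TD($\lambda$) with per-decision importance sampling, the first algorithm in the list above. The ratio $\rho_t$ multiplies into the eligibility trace, which is where the ratio products (and hence the variance the abstract discusses) accumulate. The toy transition data is hypothetical.
```python
import numpy as np

def off_policy_td_lambda(transitions, n_features, alpha=0.01, gamma=0.99, lam=0.9):
    """Linear off-policy TD(lambda). Each transition is (x, r, x_next, rho),
    with rho = pi(a|s) / b(a|s) the per-step importance sampling ratio.
    The ratios multiply into the eligibility trace z, so long trajectories
    accumulate products of ratios -- the variance source discussed above."""
    w = np.zeros(n_features)
    z = np.zeros(n_features)
    for x, r, x_next, rho in transitions:
        delta = r + gamma * w @ x_next - w @ x   # TD error
        z = rho * (gamma * lam * z + x)          # IS-weighted accumulating trace
        w = w + alpha * delta * z
    return w

rng = np.random.default_rng(0)                   # toy, randomly generated transitions
demo = [(rng.random(4), rng.random(), rng.random(4), rng.uniform(0.5, 2.0))
        for _ in range(100)]
print(off_policy_td_lambda(demo, n_features=4))
```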
【6】 Estimation of Local Average Treatment Effect by Data Combination
标题:用数据组合法估计局部平均治疗效果
链接:https://arxiv.org/abs/2109.05175
作者:Kazuhiko Shinoda,Takahiro Hoshino
机构:Graduate School of Economics, Keio University, RIKEN AIP
摘要:当治疗分配的依从性不完全时,估计局部平均治疗效果(LATE)很重要。先前提出的LATE估计方法要求在单个数据集中共同观察所有相关变量;然而,由于技术或隐私原因,在许多现实问题中收集此类数据有时很困难,甚至不可能。我们考虑一个新的问题设置,其中作为协变量函数的LATE是从单独观测的数据集的组合中非参数地识别出来的。对于估计,我们表明,直接最小二乘法(最初用于估计完全依从性下的平均治疗效果)适用于我们的设置。然而,直接最小二乘估计的模型选择和超参数调整在实践中可能是不稳定的,因为它被定义为极大极小问题的解。然后,我们提出了一种加权最小二乘估计器,通过避免极小极大目标公式来简化模型选择。与逆概率加权(IPW)估计器不同,该估计器直接使用预先估计的权重而不进行求逆,避免了IPW方法带来的问题。我们通过使用合成数据集和真实数据集的实验证明了我们方法的有效性。
摘要:It is important to estimate the local average treatment effect (LATE) when
compliance with a treatment assignment is incomplete. The previously proposed
methods for LATE estimation required all relevant variables to be jointly
observed in a single dataset; however, it is sometimes difficult or even
impossible to collect such data in many real-world problems for technical or
privacy reasons. We consider a novel problem setting in which LATE, as a
function of covariates, is nonparametrically identified from the combination of
separately observed datasets. For estimation, we show that the direct least
squares method, which was originally developed for estimating the average
treatment effect under complete compliance, is applicable to our setting.
However, model selection and hyperparameter tuning for the direct least squares
estimator can be unstable in practice since it is defined as a solution to the
minimax problem. We then propose a weighted least squares estimator that
enables simpler model selection by avoiding the minimax objective formulation.
Unlike the inverse probability weighted (IPW) estimator, the proposed estimator
directly uses the pre-estimated weight without inversion, avoiding the problems
caused by the IPW methods. We demonstrate the effectiveness of our method
through experiments using synthetic and real-world datasets.
其他神经网络|深度学习|模型|建模(21篇)
【1】 On Tilted Losses in Machine Learning: Theory and Applications
标题:机器学习中的倾斜损失:理论与应用
链接:https://arxiv.org/abs/2109.06141
作者:Tian Li,Ahmad Beirami,Maziar Sanjabi,Virginia Smith
机构:Computer Science Department, Carnegie Mellon University, Pittsburgh, PA , USA, Facebook AI, Menlo Park, CA , USA, Machine Learning Department
备注:arXiv admin note: substantial text overlap with arXiv:2007.01162
摘要:指数倾斜是一种常用于统计学、概率论、信息论和最优化等领域的技术,用于产生参数化的分布偏移。尽管倾斜在相关领域很流行,但它在机器学习中并没有得到广泛的应用。在这项工作中,我们的目标是通过探索在风险最小化中使用倾斜来弥合这一差距。我们研究了ERM的一个简单扩展,即倾斜经验风险最小化(TERM),它使用指数倾斜来灵活调整个体损失的影响。由此产生的框架有几个有用的特性:我们表明,TERM可以分别增加或减少异常值的影响,以实现公平性或鲁棒性;具有有利于泛化的方差减少特性;并且可以看作是超分位数方法的光滑近似。我们的工作在TERM和相关目标(如风险价值、条件风险价值和分布鲁棒优化(DRO))之间建立了严格的联系。我们开发了求解TERM的批处理和随机一阶优化方法,为求解器提供了收敛性保证,并证明了相对于常见替代方案,该框架可以被高效求解。最后,我们证明了TERM可用于机器学习中的许多应用,例如在子组之间强制执行公平性、减轻异常值的影响以及处理类不平衡。尽管TERM对传统ERM目标的修改很简单,我们发现该框架可以始终优于ERM,并提供与最先进的、针对具体问题的方法相竞争的性能。
摘要:Exponential tilting is a technique commonly used in fields such as
statistics, probability, information theory, and optimization to create
parametric distribution shifts. Despite its prevalence in related fields,
tilting has not seen widespread use in machine learning. In this work, we aim
to bridge this gap by exploring the use of tilting in risk minimization. We
study a simple extension to ERM -- tilted empirical risk minimization (TERM) --
which uses exponential tilting to flexibly tune the impact of individual
losses. The resulting framework has several useful properties: We show that
TERM can increase or decrease the influence of outliers, respectively, to
enable fairness or robustness; has variance-reduction properties that can
benefit generalization; and can be viewed as a smooth approximation to a
superquantile method. Our work makes rigorous connections between TERM and
related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and
distributionally robust optimization (DRO). We develop batch and stochastic
first-order optimization methods for solving TERM, provide convergence
guarantees for the solvers, and show that the framework can be efficiently
solved relative to common alternatives. Finally, we demonstrate that TERM can
be used for a multitude of applications in machine learning, such as enforcing
fairness between subgroups, mitigating the effect of outliers, and handling
class imbalance. Despite the straightforward modification TERM makes to
traditional ERM objectives, we find that the framework can consistently
outperform ERM and deliver competitive performance with state-of-the-art,
problem-specific approaches.
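The TERM objective tilts the usual average loss through a log-sum-exp: $\tilde{R}(t;\theta)=\frac{1}{t}\log\big(\frac{1}{N}\sum_{i}e^{t\,\ell_i(\theta)}\big)$. A minimal PyTorch sketch of this objective, written from the formula above rather than from the authors' code:
```python
import math
import torch
import torch.nn.functional as F

def tilted_loss(per_sample_losses: torch.Tensor, t: float) -> torch.Tensor:
    """Tilted empirical risk (1/t) * log(mean(exp(t * l_i))) for t != 0.
    t > 0 magnifies high-loss (outlier) samples; t < 0 suppresses them;
    t -> 0 recovers plain ERM. logsumexp keeps it numerically stable."""
    n = per_sample_losses.numel()
    return (torch.logsumexp(t * per_sample_losses, dim=0) - math.log(n)) / t

logits = torch.randn(32, 10, requires_grad=True)
targets = torch.randint(0, 10, (32,))
losses = F.cross_entropy(logits, targets, reduction="none")  # per-sample, unreduced
tilted_loss(losses, t=1.0).backward()
```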
【2】 Uniform Generalization Bounds for Overparameterized Neural Networks
标题:过参数神经网络的一致泛化界
链接:https://arxiv.org/abs/2109.06099
作者:Sattar Vakili,Michael Bromberg,Da-shan Shiu,Alberto Bernacchia
机构:MediaTek Research
摘要:人工神经网络中一个有趣的观察结果是,尽管通常被极度过参数化,但它们的泛化误差是有利的。众所周知,在过参数化神经网络的情况下,经典的统计学习方法往往会导致空洞的泛化误差界。采用最近发展起来的神经正切(NT)核理论,当真实数据生成模型属于与NT核对应的再生核希尔伯特空间(RKHS)时,我们证明了核区域中过参数化神经网络的一致泛化界。重要的是,我们的界捕获了取决于激活函数可微性的精确误差率。为了建立这些界,我们提出了NT核的信息增益作为学习问题复杂性的度量。我们的分析使用了NT核在球谐函数基下的Mercer分解以及相应特征值的衰减率。作为我们结果的副产品,我们展示了对应于NT核的RKHS与对应于Matérn核族的RKHS之间的等价性,后者诱导出一类非常一般的模型。我们进一步讨论了我们的分析对使用过参数化神经网络的强化学习算法遗憾界的一些最新结果的影响。
摘要:An interesting observation in artificial neural networks is their favorable
generalization error despite typically being extremely overparameterized. It is
well known that classical statistical learning methods often result in vacuous
generalization errors in the case of overparameterized neural networks.
Adopting the recently developed Neural Tangent (NT) kernel theory, we prove
uniform generalization bounds for overparameterized neural networks in kernel
regimes, when the true data generating model belongs to the reproducing kernel
Hilbert space (RKHS) corresponding to the NT kernel. Importantly, our bounds
capture the exact error rates depending on the differentiability of the
activation functions. In order to establish these bounds, we propose the
information gain of the NT kernel as a measure of complexity of the learning
problem. Our analysis uses a Mercer decomposition of the NT kernel in the basis
of spherical harmonics and the decay rate of the corresponding eigenvalues. As
a byproduct of our results, we show the equivalence between the RKHS
corresponding to the NT kernel and its counterpart corresponding to the
Mat\'ern family of kernels, that induces a very general class of models. We
further discuss the implications of our analysis for some recent results on the
regret bounds for reinforcement learning algorithms, which use
overparameterized neural networks.
【3】 The Grammar-Learning Trajectories of Neural Language Models
标题:神经语言模型的语法学习轨迹
链接:https://arxiv.org/abs/2109.06096
作者:Leshem Choshen,Guy Hacohen,Daphna Weinshall,Omri Abend
机构:Departments of Computer † and Brain Sciences‡, Hebrew University of Jerusalem
摘要:语言现象的学习轨迹提供了对语言表征本质的洞察,而不仅仅是通过观察成年说话者的行为所能获得的。为了应用相似的方法来分析神经语言模型(NLM),首先需要确定不同的模型在它们所做的概括方面足够相似。在本文中,我们展示了具有不同初始化、体系结构和训练数据的NLM以相似的顺序获得语言现象,尽管它们对数据具有不同的最终性能。利用这些发现,我们将不同学习阶段不同现象的相对表现与更简单的参考模型进行比较。结果表明,NLMs表现出一致的“发展”阶段。对这些阶段的初步分析显示了现象簇(尤其是形态簇),它们的表现是一致的,暗示了它们获得的表征之间的潜在联系。
摘要:The learning trajectories of linguistic phenomena provide insight into the
nature of linguistic representation, beyond what can be gleaned from inspecting
the behavior of an adult speaker. To apply a similar approach to analyze neural
language models (NLM), it is first necessary to establish that different models
are similar enough in the generalizations they make. In this paper, we show
that NLMs with different initialization, architecture, and training data
acquire linguistic phenomena in a similar order, despite having different end
performances over the data. Leveraging these findings, we compare the relative
performance on different phenomena at varying learning stages with simpler
reference models. Results suggest that NLMs exhibit consistent
``developmental'' stages. Initial analysis of these stages presents phenomena
clusters (notably morphological ones), whose performance progresses in unison,
suggesting potential links between their acquired representations.
【4】 Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning
标题:通过新颖的数据增强和课程学习实现高效的对比学习
链接:https://arxiv.org/abs/2109.05941
作者:Seonghyeon Ye,Jiseon Kim,Alice Oh
机构:School of Computing, KAIST
备注:EMNLP 2021
摘要:我们介绍了EfficientCL,一种内存高效的持续预训练方法,它将对比学习与新的数据增强和课程学习相结合。对于数据增强,我们按顺序堆叠两种类型的操作:切断(cutoff)和PCA抖动。在进行预训练步骤的同时,我们通过在每个难度步骤增加增强程度来应用课程学习。数据增强完成后,对比学习应用于原始和增强实例的投影嵌入。当在GLUE基准上进行微调时,我们的模型优于基线模型,尤其是在句子级任务中。此外,与基线模型相比,这种改进只需要70%的计算内存。
摘要:We introduce EfficientCL, a memory-efficient continual pretraining method
that applies contrastive learning with novel data augmentation and curriculum
learning. For data augmentation, we stack two types of operation sequentially:
cutoff and PCA jittering. While pretraining steps proceed, we apply curriculum
learning by incrementing the augmentation degree for each difficulty step.
After data augmentation is finished, contrastive learning is applied on
projected embeddings of original and augmented examples. When finetuned on GLUE
benchmark, our model outperforms baseline models, especially for sentence-level
tasks. Additionally, this improvement is achieved with only 70% of the
computational memory of the baseline model.
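Of the two augmentations named in the abstract, PCA jittering is the less standard one. A rough numpy sketch, assuming the classic "fancy PCA" recipe (noise added along the principal components of a batch of embeddings, scaled by the eigenvalues); the paper's exact formulation and its cutoff augmentation are not reproduced here:
```python
import numpy as np

def pca_jitter(embeddings: np.ndarray, strength: float) -> np.ndarray:
    """Perturb each embedding along the batch's principal components,
    with noise scaled by the eigenvalues. `strength` plays the role of
    the curriculum knob that is incremented at each difficulty step."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(embeddings) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    noise = np.random.normal(0.0, strength, size=(len(embeddings), len(eigvals)))
    return embeddings + (noise * eigvals) @ eigvecs.T

batch = np.random.randn(16, 32)                    # toy batch of 32-d embeddings
for strength in (0.1, 0.2, 0.3):                   # toy curriculum schedule
    augmented = pca_jitter(batch, strength)
```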
【5】 Modeling Systems with Machine Learning based Differential Equations
标题:基于机器学习的微分方程系统建模
链接:https://arxiv.org/abs/2109.05935
作者:Pedro Garcia
机构: Universidad Central de Venezuela
摘要:在动态系统中,行为的预测经常取决于模型的设计。当可以获得从观测系统得到的时间序列时,可以不加额外假设、直接根据这些观测值设计模型来完成任务,或者借助于有关系统的额外信息,在模型中假设预先设想的结构。在第二种情况下,这是一个将理论与观测充分结合并随后优化这种混合的问题。在这项工作中,我们提出使用机器学习技术,从非均匀采样或含噪观测中,设计动态系统的时间连续模型作为微分方程的解。通过多个模拟数据集以及野兔-猞猁种群和2019冠状病毒疫情的实验数据,展示了该策略的性能。我们的结果表明,这种系统建模方法在合成或实验数据的情况下可以是一种有用的技术。
摘要:The prediction of behavior in dynamical systems is frequently subject to
the design of models. When a time series obtained from observing the system is
available, the task can be performed by designing the model from these
observations without additional assumptions or by assuming a preconceived
structure in the model, with the help of additional information about the
system. In the second case, it is a question of adequately combining theory
with observations and subsequently optimizing the mixture. In this work, we
propose the design of time-continuous models of dynamical systems as solutions
of differential equations, from non-uniformly sampled or noisy observations,
using machine learning techniques. The performance of the strategy is shown
with several simulated data sets and with experimental data from the hare-lynx
population and the Coronavirus 2019 outbreak. Our results suggest that this
approach to modeling systems can be a useful technique in the case of
synthetic or experimental data.
【6】 Online Learning of Optimally Diverse Rankings
标题:最优多样化排名的在线学习
链接:https://arxiv.org/abs/2109.05899
作者:Stefan Magureanu,Alexandre Proutiere,Marcus Isaksson,Boxun Zhang
机构:KTH Royal Institute of Technology
备注:None
摘要:搜索引擎通过列出相关项目(如文档、歌曲、产品、网页等)回答用户的查询。这些引擎依赖于学习对项目进行排序的算法,以便呈现一个有序列表,最大限度地提高其包含相关项目的概率。学习排序算法设计中的主要挑战来自这样一个事实:对于不同的用户,查询通常具有不同的含义。在没有任何关于查询的上下文信息的情况下,通常必须遵循多样性({\it diversity})原则,即返回一个涵盖查询的各种可能主题或含义的列表。为了将这种学习排序问题形式化,我们提出了一个自然模型,其中(i)项目被归类为主题,(ii)用户仅在项目匹配其查询主题时才认为其相关,(iii)引擎不知道到达查询的主题,也不知道与各种主题相关的查询到达的频率,也不知道依赖于主题的项目点击率。针对这个问题,我们设计了LDR(Learning Diverse Rankings)算法,它仅根据用户反馈高效地学习最优列表。我们表明,在$T$次查询之后,LDR的遗憾按$O((N-L)\log(T))$增长,其中$N$是所有项目的数量。我们进一步确定,这种增长速度无法改进,即LDR是阶最优的。最后,通过对人工和真实数据的数值实验,我们说明了LDR与现有的排序学习算法相比的优越性。
摘要:Search engines answer users' queries by listing relevant items (e.g.
documents, songs, products, web pages, ...). These engines rely on algorithms
that learn to rank items so as to present an ordered list maximizing the
probability that it contains relevant item. The main challenge in the design of
learning-to-rank algorithms stems from the fact that queries often have
different meanings for different users. In absence of any contextual
information about the query, one often has to adhere to the {\it diversity}
principle, i.e., to return a list covering the various possible topics or
meanings of the query. To formalize this learning-to-rank problem, we propose a
natural model where (i) items are categorized into topics, (ii) users find
items relevant only if they match the topic of their query, and (iii) the
engine is not aware of the topic of an arriving query, nor of the frequency at
which queries related to various topics arrive, nor of the topic-dependent
click-through-rates of the items. For this problem, we devise LDR (Learning
Diverse Rankings), an algorithm that efficiently learns the optimal list based
on users' feedback only. We show that after $T$ queries, the regret of LDR
scales as $O((N-L)\log(T))$ where $N$ is the number of all items. We further
establish that this scaling cannot be improved, i.e., LDR is order optimal.
Finally, using numerical experiments on both artificial and real-world data, we
illustrate the superiority of LDR compared to existing learning-to-rank
algorithms.
【7】 Question Answering over Electronic Devices: A New Benchmark Dataset and a Multi-Task Learning based QA Framework
标题:电子设备上的问答:一种新的基准数据集和基于多任务学习的问答框架
链接:https://arxiv.org/abs/2109.05897
作者:Abhilash Nandy,Soumya Sharma,Shubham Maddhashiya,Kapil Sachdeva,Pawan Goyal,Niloy Ganguly
机构:♠Indian Institute of Technology, Kharagpur, ♣Samsung Research Institute, Delhi, ♦ L,S Research Center, Leibniz Universität Hannover
备注:EMNLP 2021, Long
摘要:从教学语料库(如电子手册、食谱书等)中回答问题的研究远远少于基于开放域事实性上下文的问答。这主要归因于缺乏标准基准数据集。在本文中,我们精心创建了大量与电子手册相关的数据,并开发了合适的算法来利用这些数据。我们收集了E-Manual语料库,一个包含307,957本电子手册的庞大语料库,并在这个大型语料库上预训练RoBERTa。我们创建了各种基准QA数据集,其中包括由专家根据两份电子手册策划的问答对、来自社区问答论坛的与电子手册相关的真实用户问题等。我们介绍了回答电子设备相关问题的EMQAP(E-Manual Question Answering Pipeline,电子手册问答管道)。基于预训练的RoBERTa,它拥有一个有监督的多任务学习框架,可以高效地执行双重任务,即确定电子手册中可以找到答案的部分以及该部分中的确切答案范围。对于电子手册标注的问答对,我们显示,与最具竞争力的基线相比,ROUGE-L F1分数提高了约40%。我们进行了详细的消融研究,并确立了EMQAP在不同情况下的通用性。代码和数据集在以下位置共享:https://github.com/abhi1nandy2/EMNLP-2021-Findings,相应的项目网站为https://sites.google.com/view/emanualqa/home。
摘要:Answering questions asked from instructional corpora such as E-manuals,
recipe books, etc., has been far less studied than open-domain factoid
context-based question answering. This can be primarily attributed to the
absence of standard benchmark datasets. In this paper we meticulously create a
large amount of data connected with E-manuals and develop suitable algorithm to
exploit it. We collect E-Manual Corpus, a huge corpus of 307,957 E-manuals and
pretrain RoBERTa on this large corpus. We create various benchmark QA datasets
which include question answer pairs curated by experts based upon two
E-manuals, real user questions from Community Question Answering Forum
pertaining to E-manuals etc. We introduce EMQAP (E-Manual Question Answering
Pipeline) that answers questions pertaining to electronics devices. Built upon
the pretrained RoBERTa, it harbors a supervised multi-task learning framework
which efficiently performs the dual tasks of identifying the section in the
E-manual where the answer can be found and the exact answer span within that
section. For E-Manual annotated question-answer pairs, we show an improvement
of about 40% in ROUGE-L F1 scores over the most competitive baseline. We
perform a detailed ablation study and establish the versatility of EMQAP across
different circumstances. The code and datasets are shared at
https://github.com/abhi1nandy2/EMNLP-2021-Findings, and the corresponding
project website is https://sites.google.com/view/emanualqa/home.
【8】 Construction of Grid Operators for Multilevel Solvers: a Neural Network Approach
标题:构造多级解算器的网格算子:一种神经网络方法
链接:https://arxiv.org/abs/2109.05873
作者:Claudio Tomasi,Rolf Krause
机构:Università della Svizzera Italiana
备注:To appear in Springer Journal: "The 26th International Domain Decomposition Conference (DD26)"
摘要:本文从椭圆偏微分方程的有限元离散出发,研究多重网格方法与神经网络的结合。多重网格方法使用插值算子在不同的近似级别之间传递信息。这些算子对于多重网格的快速收敛至关重要,但它们通常是未知的。我们提出了用于学习插值算子的深度神经网络模型,并基于网络的输出建立了多级层次结构。我们研究了神经网络预测的插值算子的精度,并用不同的网络结构对其进行了测试。这种用于构造网格算子的神经网络方法可以扩展为自动定义多级解算器,从而在科学计算中提供可移植的解决方案。
摘要:In this paper, we investigate the combination of multigrid methods and neural
networks, starting from a Finite Element discretization of an elliptic PDE.
Multigrid methods use interpolation operators to transfer information between
different levels of approximation. These operators are crucial for fast
convergence of multigrid, but they are generally unknown. We propose Deep
Neural Network models for learning interpolation operators and we build a
multilevel hierarchy based on the output of the network. We investigate the
accuracy of the interpolation operator predicted by the Neural Network, testing
it with different network architectures. This Neural Network approach for the
construction of grid operators can then be extended for an automatic definition
of multilevel solvers, allowing a portable solution in scientific computing.
【9】 Robust Stability of Neural-Network Controlled Nonlinear Systems with Parametric Variability
标题:具有参数可变性的神经网络控制非线性系统的鲁棒稳定性
链接:https://arxiv.org/abs/2109.05710
作者:Soumyabrata Talukder,Ratnesh Kumar
备注:15 pages, 7 figures
摘要:稳定性认证和系统可稳定运行区域的识别是确保系统运行安全性和鲁棒性的两个重要问题。随着机器学习工具的出现,这些问题对于反馈回路中具有机器学习组件的系统来说尤为重要。在这里,我们发展了一类神经网络控制的非线性系统的稳定性和可镇定性理论,其中当参数发生变化时,平衡点可能会漂移。提出了一种基于李雅普诺夫的凸稳定性证明,并进一步用于设计神经网络(NN)控制器的局部Lipschitz上界和状态空间上相应的操作域的估计,其中包含一个初始化集,闭环(CL)从中在同一控制器下,保证了该类系统的局部渐近稳定性,同时系统轨迹保持在工作域内。为了计算这种鲁棒镇定神经网络控制器,还提出了一种稳定性保证训练(SGT)算法。通过实例说明了该框架的有效性。
摘要:Stability certification and identification of the stabilizable operating
region of a system are two important concerns to ensure its operational
safety/security and robustness. With the advent of machine-learning tools,
these issues are specially important for systems with machine-learned
components in the feedback loop. Here we develop a theory for stability and
stabilizability of a class of neural-network controlled nonlinear systems,
where the equilibria can drift when parametric changes occur. A Lyapunov based
convex stability certificate is developed and is further used to devise an
estimate for a local Lipschitz upper bound for a neural-network (NN) controller
and a corresponding operating domain on the state space, containing an
initialization set from where the closed-loop (CL) local asymptotic stability
of each system in the class is guaranteed under the same controller, while the
system trajectories remain confined to the operating domain. For computing such
a robust stabilizing NN controller, a stability guaranteed training (SGT)
algorithm is also proposed. The effectiveness of the proposed framework is
demonstrated using illustrative examples.
【10】 SCORE-IT: A Machine Learning-based Tool for Automatic Standardization of EEG Reports
标题:SCORE-IT:一种基于机器学习的脑电报告自动标准化工具
链接:https://arxiv.org/abs/2109.05694
作者:Samarth Rawal,Yogatheesan Varatharajah
机构:. Carle Illinois College of Medicine, University of Illinois at Urbana Champaign., . Department of Bioengineering, University of Illinois at Urbana Champaign., . Department of Neurology
摘要:基于机器学习(ML)的脑电图(EEG)分析在促进神经系统护理方面发挥着重要作用。然而,从临床记录中自动提取有用元数据的困难阻碍了基于EEG的大规模ML模型的发展。EEG报告是EEG研究元数据的主要来源,但缺乏标准化。在这里,我们提出了一个基于机器学习的系统,该系统可以从非结构化的自然语言EEG报告中自动提取SCORE规范所定义的要素。具体而言,我们的系统识别(1)根据医生印象,记录中观察到的发作类型;(2)根据医生印象,会话记录是否正常或异常;(3)患者是否被诊断为癫痫。我们使用公开的TUH EEG语料库对我们的系统进行了评估,各项任务的F1分数分别为0.92、0.82和0.97。
摘要:Machine learning (ML)-based analysis of electroencephalograms (EEGs) is
playing an important role in advancing neurological care. However, the
difficulties in automatically extracting useful metadata from clinical records
hinder the development of large-scale EEG-based ML models. EEG reports, which
are the primary sources of metadata for EEG studies, suffer from lack of
standardization. Here we propose a machine learning-based system that
automatically extracts components from the SCORE specification from
unstructured, natural-language EEG reports. Specifically, our system identifies
(1) the type of seizure that was observed in the recording, per physician
impression; (2) whether the session recording was normal or abnormal according
to physician impression; (3) whether the patient was diagnosed with epilepsy or
not. We performed an evaluation of our system using the publicly available TUH
EEG corpus and report F1 scores of 0.92, 0.82, and 0.97 for the respective
tasks.
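The paper's system is built on clinical-domain models; as a hedged stand-in, a trivially small sklearn baseline for one of the three tasks (normal vs. abnormal impression) shows the shape of the problem. The report snippets are invented examples, not TUH data:
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented report snippets with a normal/abnormal impression label.
reports = ["Impression: normal awake and drowsy EEG.",
           "Impression: abnormal EEG due to generalized spike-and-wave discharges.",
           "Impression: normal study, no epileptiform activity seen.",
           "Impression: abnormal record with focal slowing over the left temporal region."]
labels = ["normal", "abnormal", "normal", "abnormal"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reports, labels)
print(clf.predict(["Impression: abnormal EEG with frequent sharp waves."]))
```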
【11】 Machine Learning for Naval Architecture, Ocean and Marine Engineering
标题:机器学习在船舶设计、海洋与海事工程中的应用
链接:https://arxiv.org/abs/2109.05574
作者:J P Panda
机构:Department of Mechanical Engineering, DIT University
摘要:基于机器学习(ML)的算法在工程和科学的许多领域都产生了重大影响,这些领域的数据集可以从实验和高保真数值模拟中获得。这些数据集通常用于机器学习模型,以提取有关基础物理的信息,并导出将输入变量映射到目标感兴趣量的函数关系。科学机器学习(SciML)中常用的机器学习算法包括神经网络、回归树、随机森林、支持向量机等。本文的重点是回顾机器学习在船舶设计、海洋与海事工程问题中的应用,并确定优先研究方向。我们讨论了机器学习算法在不同问题上的应用,如波高预测、船舶风荷载计算、海上平台损伤检测、船舶附加阻力计算,以及在沿海和海洋环境中的各种其他应用。文中包括数据集的详细信息,包括ML模型开发中所用数据集的来源。详细介绍了作为ML模型输入的特征,最后讨论了优化ML模型所采用的方法。在此综合分析的基础上,我们指出了ML在海洋与海事工程问题中应用可能富有成效的未来研究方向。
摘要:Machine Learning (ML) based algorithms have found significant impact in many
fields of engineering and sciences, where datasets are available from
experiments and high fidelity numerical simulations. Those datasets are
generally utilized in a machine learning model to extract information about the
underlying physics and derive functional relationships mapping input variables
to target quantities of interest. Commonplace machine learning algorithms
utilized in Scientific Machine Learning (SciML) include neural networks,
regression trees, random forests, support vector machines, etc. The focus of
this article is to review the applications of ML in naval architecture, ocean,
and marine engineering problems; and identify priority directions of research.
We discuss the applications of machine learning algorithms for different
problems such as wave height prediction, calculation of wind loads on ships,
damage detection of offshore platforms, calculation of ship added resistance,
and various other applications in coastal and marine environments. The details
of the data sets including the source of data-sets utilized in the ML model
development are included. The features used as the inputs to the ML models are
presented in detail and finally, the methods employed in optimization of the ML
models were also discussed. Based on this comprehensive analysis we point out
future directions of research that may be fruitful for the application of ML to
the ocean and marine engineering problems.
【12】 BioLCNet: Reward-modulated Locally Connected Spiking Neural Networks
标题:BioLCNet:报酬调制的局部连接尖峰神经网络
链接:https://arxiv.org/abs/2109.05539
作者:Hafez Ghaemi,Erfan Mirzaei,Mahbod Nouri,Saeed Reza Kheradpisheh
机构:Department of Control and Computer Engineering, Polytechnic University of Turin, Italy, School of Electrical and Computer Engineering, University of Tehran, Iran, School of Informatics, University of Edinburgh, United Kingdom
备注:8 pages, 5 figures
摘要:最近的研究表明,卷积神经网络(CNN)并不是唯一可行的图像分类方法。此外,CNN中使用的权重共享和反向传播并不符合灵长类视觉系统中存在的机制。为了提出一个生物学上更合理的解决方案,我们设计了一个局部连接的尖峰神经网络(SNN),该网络使用尖峰时间依赖性可塑性(STDP)及其报酬调制变量(R-STDP)学习规则进行训练。通过使用尖峰神经元和局部连接以及强化学习(RL),我们为我们提出的架构命名了BioLCNet。我们的网络由速率编码输入层、本地连接的隐藏层和解码输出层组成。输出层采用基于尖峰总体的投票方案进行解码。我们使用MNIST数据集获得图像分类精度,并评估奖励系统对不同目标响应的鲁棒性。
摘要:Recent studies have shown that convolutional neural networks (CNNs) are not
the only feasible solution for image classification. Furthermore, weight
sharing and backpropagation used in CNNs do not correspond to the mechanisms
present in the primate visual system. To propose a more biologically plausible
solution, we designed a locally connected spiking neural network (SNN) trained
using spike-timing-dependent plasticity (STDP) and its reward-modulated variant
(R-STDP) learning rules. The use of spiking neurons and local connections along
with reinforcement learning (RL) led us to the nomenclature BioLCNet for our
proposed architecture. Our network consists of a rate-coded input layer
followed by a locally connected hidden layer and a decoding output layer. A
spike population-based voting scheme is adopted for decoding in the output
layer. We used the MNIST dataset to obtain image classification accuracy and to
assess the robustness of our rewarding system to varying target responses.
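A minimal numpy sketch of a reward-modulated STDP update of the general kind the abstract names (not the BioLCNet code): exponentially decaying pre- and post-synaptic traces form an eligibility term whose sign follows spike timing, and a scalar reward gates the weight change. Time constants, learning rate, and shapes are illustrative assumptions:
```python
import numpy as np

def r_stdp_step(w, pre_trace, post_trace, pre_spikes, post_spikes,
                reward, lr=1e-3, tau=20.0, dt=1.0):
    """One reward-modulated STDP update. Traces decay exponentially;
    post-after-pre coincidences potentiate, pre-after-post depress,
    and the whole update is gated by the scalar reward signal."""
    pre_trace = pre_trace * np.exp(-dt / tau) + pre_spikes
    post_trace = post_trace * np.exp(-dt / tau) + post_spikes
    eligibility = np.outer(post_spikes, pre_trace) - np.outer(post_trace, pre_spikes)
    return w + lr * reward * eligibility, pre_trace, post_trace

rng = np.random.default_rng(1)                      # 4 pre-, 3 post-synaptic neurons
w = rng.normal(0.0, 0.1, size=(3, 4))
pre_t, post_t = np.zeros(4), np.zeros(3)
w, pre_t, post_t = r_stdp_step(w, pre_t, post_t, rng.integers(0, 2, 4),
                               rng.integers(0, 2, 3), reward=1.0)
```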
【13】 DRo: A data-scarce mechanism to revolutionize the performance of Deep Learning based Security Systems
标题:DRO:一种改进基于深度学习的安全系统性能的数据稀缺机制
链接:https://arxiv.org/abs/2109.05470
作者:Mohit Sewak,Sanjay K. Sahay,Hemant Rathore
机构:Microsoft R&D, India, BITS Pilani, Goa, India
备注:None
摘要:有监督的深度学习需要大量的标记数据才能收敛,从而在任务特定的学习中表现最佳。因此,我们为安全等数据稀缺领域提出了一种名为DRo(Deep Routing,深度路由)的新机制。DRo方法建立在深度聚类的一些最新进展之上。特别是,它利用了使用合成生成的局部扰动的自增强训练机制。DRo不仅缓解了稀疏标记数据带来的挑战,还提供了许多独特的优势。我们还开发了一个名为DRoID的系统,该系统使用DRo机制来增强现有恶意软件检测系统的性能,后者仅使用(诸如Android隐式Intent这样的低信息)特征。我们使用流行且标准化的Android恶意软件数据集对DRoID进行了实验,发现DRo机制可以成功地将下游分类器产生的误报减少67.9%,同时将其准确率提高11.3%。这一点非常重要,不仅因为所获得的增益是空前的,还因为所使用的特征从未被认为丰富到足以单独训练分类器;因此,迄今为止没有任何恶意软件分类系统能够仅凭这些特征报告像样的性能。鉴于所取得的结果,DRo机制在所有旨在利用稀疏标记数据提升深度学习模型分类性能的已知系统中占据领先地位。
摘要:Supervised Deep Learning requires plenty of labeled data to converge, and
hence perform optimally for task-specific learning. Therefore, we propose a
novel mechanism named DRo (for Deep Routing) for data-scarce domains like
security. The DRo approach builds upon some of the recent developments in
Deep-Clustering. In particular, it exploits the self-augmented training
mechanism using synthetically generated local perturbations. DRo not only
allays the challenges with sparse-labeled data but also offers many unique
advantages. We also developed a system named DRoID that uses the DRo mechanism
for enhancing the performance of an existing Malware Detection System that uses
(low information features like the) Android implicit Intent(s) as the only
features. We conduct experiments on DRoID using a popular and standardized
Android malware dataset and found that the DRo mechanism could successfully
reduce the false-alarms generated by the downstream classifier by 67.9%, and
also simultaneously boosts its accuracy by 11.3%. This is significant not only
because the gains achieved are unparalleled but also because the features used
were never considered rich enough to train a classifier on; and hence no decent
performance could ever be reported by any malware classification system
to date using these features in isolation. Owing to the results achieved, the
DRo mechanism claims a dominant position amongst all known systems that aim to
enhance the classification performance of deep learning models with
sparse-labeled data.
【14】 Learning To Describe Player Form in The MLB
标题:学习描述美国职棒大联盟的球员状态
链接:https://arxiv.org/abs/2109.05280
作者:Connor Heaton,Prasenjit Mitra
机构:The Pennsylvania State University, State College, PA , USA
摘要:美国职业棒球大联盟(MLB)有着利用统计数据更好地理解和讨论棒球比赛的悠久历史,并形成了一门专门的统计学科,称为赛伯计量学(sabermetrics)。就其核心而言,所有赛伯计量指标都试图量化比赛的某个方面,通常是球员技能的某个特定方面,例如击球手的打点能力(RBI)或投手阻止击球手上垒的能力(WHIP)。虽然这些统计数据很有用,但从根本上说,它们是根据场上发生了什么而不是如何发生的记录得出的,这一事实限制了它们的作用。作为缓解这一缺陷的第一步,我们提出了一个新的、基于对比学习的框架来描述MLB中的球员状态(form)。我们用状态来指球员在最近的出场中影响比赛进程的方式。具体地说,球员的状态由一个72维向量描述。通过比较由我们的状态表示和传统赛伯计量指标分别得到的球员聚类,我们证明了我们的状态表示包含传统公开统计数据中所没有的、关于球员如何影响比赛进程的信息。我们相信,这些嵌入可以用来预测比赛内和比赛级别的事件,例如一次打席的结果或一场比赛的胜者。
摘要:Major League Baseball (MLB) has a storied history of using statistics to
better understand and discuss the game of baseball, with an entire discipline
of statistics dedicated to the craft, known as sabermetrics. At their core, all
sabermetrics seek to quantify some aspect of the game, often a specific aspect
of a player's skill set - such as a batter's ability to drive in runs (RBI) or
a pitcher's ability to keep batters from reaching base (WHIP). While useful,
such statistics are fundamentally limited by the fact that they are derived
from an account of what happened on the field, not how it happened. As a first
step towards alleviating this shortcoming, we present a novel, contrastive
learning-based framework for describing player form in the MLB. We use form to
refer to the way in which a player has impacted the course of play in their
recent appearances. Concretely, a player's form is described by a
72-dimensional vector. By comparing clusters of players resulting from our form
representations and those resulting from traditional sabermetrics, we
demonstrate that our form representations contain information about how players
impact the course of play, not present in traditional, publicly available
statistics. We believe these embeddings could be utilized to predict both
in-game and game-level events, such as the result of an at-bat or the winner of
a game.
【15】 Benchmarking Processor Performance by Multi-Threaded Machine Learning Algorithms
标题:基于多线程机器学习算法的处理器性能基准测试
链接:https://arxiv.org/abs/2109.05276
作者:Muhammad Fahad Saleem
机构:Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad, Pakistan
备注:The research paper consists of seven pages and contains twenty nine figures
摘要:机器学习算法使计算机能够通过从以前的数据中学习来预测事物。数据存储和处理能力正在快速增长,从而推动了机器学习和人工智能应用的增加。大部分工作都是为了提高过去建立的模型的准确性,而很少有研究确定机器学习的计算成本。在本文中,我将继续这方面的后续研究工作,对多线程机器学习聚类算法进行性能比较。我将研究线性回归、随机森林和K近邻,以确定算法的性能特征以及获得结果的计算成本。我将通过运行这些多线程算法,在数据集上训练和测试模型,对系统硬件性能进行基准测试,并观察各算法性能指标的差异。最后,我将根据这些算法在我的系统上的性能效率,指出表现最好的算法。
摘要:Machine learning algorithms have enabled computers to predict things by
learning from previous data. The data storage and processing power are
increasing rapidly, thus increasing machine learning and Artificial
intelligence applications. Much of the work is done to improve the accuracy of
the models built in the past, with little research done to determine the
computational costs of machine learning acquisitions. In this paper, I will
proceed with this later research work and will make a performance comparison of
multi-threaded machine learning clustering algorithms. I will be working on
Linear Regression, Random Forest, and K-Nearest Neighbors to determine the
performance characteristics of the algorithms as well as the computation costs
to the obtained results. I will be benchmarking system hardware performance by
running these multi-threaded algorithms to train and test the models on a
dataset to note the differences in the performance metrics of the algorithms.
In the end, I will identify the best-performing algorithms in terms of their
performance efficiency on my system.
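A sketch of this style of benchmark using sklearn, timing two of the mentioned models at several thread counts (sklearn's plain LinearRegression has no comparable multi-threaded training knob, so it is omitted here); dataset size and thread counts are arbitrary:
```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=20000, n_features=40, random_state=0)
for n_jobs in (1, 2, 4, 8):
    for name, model in (("RandomForest", RandomForestClassifier(n_jobs=n_jobs, random_state=0)),
                        ("KNN", KNeighborsClassifier(n_jobs=n_jobs))):
        start = time.perf_counter()
        model.fit(X, y).predict(X)                  # time training + inference together
        print(f"{name:12s} n_jobs={n_jobs}: {time.perf_counter() - start:.2f}s")
```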
【16】 Physics-based Deep Learning
标题:基于物理的深度学习
链接:https://arxiv.org/abs/2109.05237
作者:Nils Thuerey,Philipp Holl,Maximilian Mueller,Patrick Schnell,Felix Trost,Kiwon Um
备注:Online version at: this https URL
摘要:这本数字书包含了一个实际的和全面的介绍,所有与物理模拟背景下的深度学习相关的内容。尽可能多地,所有主题都以Jupyter笔记本的形式提供了实践代码示例,以便快速入门。除了标准的数据监督学习外,我们还将研究物理损失约束、更紧密耦合的可微模拟学习算法,以及强化学习和不确定性建模。我们生活在一个激动人心的时代:这些方法有巨大的潜力从根本上改变计算机模拟所能实现的。
摘要:This digital book contains a practical and comprehensive introduction of
everything related to deep learning in the context of physical simulations. As
much as possible, all topics come with hands-on code examples in the form of
Jupyter notebooks to quickly get started. Beyond standard supervised learning
from data, we'll look at physical loss constraints, more tightly coupled
learning algorithms with differentiable simulations, as well as reinforcement
learning and uncertainty modeling. We live in exciting times: these methods
have a huge potential to fundamentally change what computer simulations can
achieve.
【17】 Machine learning reveals how personalized climate communication can both succeed and backfire
标题:机器学习揭示了个性化气候交流是如何既成功又适得其反的
链接:https://arxiv.org/abs/2109.05104
作者:Totte Harinen,Alexandre Filipowicz,Shabnam Hakimi,Rumen Iliev,Matthew Klenk,Emily Sumner
机构:Toyota Research Institute
摘要:不同的广告信息适用于不同的人。机器学习是个性化气候通信的有效方法。在本文中,我们使用机器学习重新分析了最近一项研究的结果,结果表明,在线广告增加了一些人对气候变化的信念,而导致对其他人的信念下降。特别是,我们发现广告的效果可能会随着人们的年龄和种族而变化。
摘要:Different advertising messages work for different people. Machine learning
can be an effective way to personalise climate communications. In this paper we
use machine learning to reanalyse findings from a recent study, showing that
online advertisements increased some people's belief in climate change while
resulting in decreased belief in others. In particular, we show that the effect
of the advertisements could change depending on people's age and ethnicity.
【18】 On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning
标题:基于$\ell_0$范数正则化和权重剪枝的神经网络压缩
链接:https://arxiv.org/abs/2109.05075
作者:Felipe Dennis de Resende Oliveira,Eduardo Luiz Ortiz Batista,Rui Seara
备注:7 pages, 6 figures, 2 tables
摘要:尽管高容量计算平台的可用性越来越高,但实现复杂性仍然是神经网络实际部署中的一个重大问题。这种担忧不仅是由于最先进的网络架构的巨大成本,还由于最近对边缘智能的推动以及神经网络在嵌入式应用中的使用。在这种情况下,网络压缩技术因其能够在降低部署成本的同时将推理精度保持在令人满意的水平而受到关注。本文致力于开发一种新的神经网络压缩方案。为此,首先提出了一种新的基于$\ell_0$范数的正则化方法,该方法能够在训练过程中诱导网络的强稀疏性。然后,利用剪枝技术,以训练后网络中较小的权值为目标,可以得到更小但高效的网络。所提出的压缩方案还包括使用$\ell_2$范数正则化以避免过拟合,以及通过微调来提高剪枝后网络的性能。实验结果展示了该方案的有效性,并与其他方法进行了比较。
摘要:Despite the growing availability of high-capacity computational platforms,
implementation complexity still has been a great concern for the real-world
deployment of neural networks. This concern is not exclusively due to the huge
costs of state-of-the-art network architectures, but also due to the recent
push towards edge intelligence and the use of neural networks in embedded
applications. In this context, network compression techniques have been gaining
interest due to their ability for reducing deployment costs while keeping
inference accuracy at satisfactory levels. The present paper is dedicated to
the development of a novel compression scheme for neural networks. To this end,
a new $\ell_0$-norm-based regularization approach is firstly developed, which
is capable of inducing strong sparseness in the network during training. Then,
targeting the smaller weights of the trained network with pruning techniques,
smaller yet highly effective networks can be obtained. The proposed compression
scheme also involves the use of $\ell_2$-norm regularization to avoid
overfitting as well as fine tuning to improve the performance of the pruned
network. Experimental results are presented aiming to show the effectiveness of
the proposed scheme as well as to make comparisons with competing approaches.
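The paper's specific $\ell_0$-based regularizer is not spelled out in the abstract, so the sketch below only illustrates the surrounding recipe: a smooth stand-in sparsity penalty during training, magnitude pruning of small weights, $\ell_2$ weight decay against overfitting, and a fine-tuning pass. The penalty form, thresholds, and coefficients are assumptions:
```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(net.parameters(), lr=1e-2, weight_decay=1e-4)  # l2 regularization

def sparsity_penalty(model, beta=10.0):
    # Smooth stand-in for an l0-style penalty: tanh(beta*|w|) goes to 1
    # for any clearly nonzero weight, so the sum approximates the count
    # of nonzero weights. The paper develops its own l0-based regularizer.
    return sum(torch.tanh(beta * p.abs()).sum() for p in model.parameters())

x, y = torch.randn(128, 20), torch.randint(0, 2, (128,))
for _ in range(100):                                 # 1) sparsity-inducing training
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(x), y) + 1e-4 * sparsity_penalty(net)
    loss.backward()
    opt.step()

with torch.no_grad():                                # 2) magnitude pruning
    for p in net.parameters():
        p[p.abs() < 1e-2] = 0.0
# 3) fine-tune the pruned network (same loop again, ideally masking pruned weights).
```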
【19】 Physics-based machine learning for modeling stochastic IP3-dependent calcium dynamics
标题:基于物理的机器学习在随机IP3依赖性钙动力学建模中的应用
链接:https://arxiv.org/abs/2109.05053
作者:Oliver K. Ernst,Tom Bartol,Terrence Sejnowski,Eric Mjolsness
机构:Department of Physics, University of California at San Diego, La Jolla, California, Salk Institute for Biological Studies, La Jolla, California, Division of Biological Sciences, University of California at San Diego, La Jolla, California
备注:26 pages
摘要:我们提出了一种通过候选函数结合特定领域物理的模型简化机器学习方法。我们的方法通过对反应网络的随机模拟来估计一个有效的概率分布和微分方程模型。简化描述和精细描述之间的紧密联系允许将从主方程导出的近似值引入学习问题。该表示法可改善非兴奋性细胞中三磷酸肌醇(IP3)依赖性钙振荡的经典模型的泛化,并允许大幅减小网络大小。
摘要:We present a machine learning method for model reduction which incorporates
domain-specific physics through candidate functions. Our method estimates an
effective probability distribution and differential equation model from
stochastic simulations of a reaction network. The close connection between
reduced and fine scale descriptions allows approximations derived from the
master equation to be introduced into the learning problem. This representation
is shown to improve generalization and allows a large reduction in network size
for a classic model of inositol trisphosphate (IP3) dependent calcium
oscillations in non-excitable cells.
【20】 Neural network based order parameter for phase transitions and its applications in high-entropy alloys
标题:基于神经网络的相变序参量及其在高熵合金中的应用
链接:https://arxiv.org/abs/2109.05598
作者:Junqi Yin,Zongrui Pei,Michael Gao
机构:Oak Ridge National Laboratory, Oak Ridge, TN, USA, National Energy Technology Laboratory, Albany, OR, USA
摘要:相变是自然界最重要的现象之一,在材料设计中起着核心作用。所有相变都具有合适的有序参数,包括有序-无序相变。然而,对于复杂系统来说,找到一个具有代表性的序参量是非常重要的,例如对于高熵合金。鉴于变分自动编码器(VAE)将高维数据简化为几个主分量的能力,这里我们提出了“VAE序参数”的新概念。我们提出VAE潜空间中的曼哈顿距离可以作为有序-无序相变的一般有序参数。用多种难熔高熵合金对有序参数的物理性质进行了定量解释和论证。在此基础上,通过模拟元素的自然混合,提出了一种普遍适用的合金设计概念。物理解释的“VAE序参量”为理解和通过化学有序化进行合金设计奠定了基础。
摘要:Phase transition is one of the most important phenomena in nature and plays a
central role in materials design. All phase transitions are characterized by
suitable order parameters, including the order-disorder phase transition.
However, finding a representative order parameter for complex systems is
nontrivial, such as for high-entropy alloys. Given variational autoencoder's
(VAE) strength of reducing high dimensional data into few principal components,
here we coin a new concept of "VAE order parameter". We propose that the
Manhattan distance in the VAE latent space can serve as a generic order
parameter for order-disorder phase transitions. The physical properties of the
order parameter are quantitatively interpreted and demonstrated by multiple
refractory high-entropy alloys. Assisted by it, a generally applicable alloy
design concept is proposed by mimicking the natural mixing of elements. Our
physically interpretable "VAE order parameter" lays the foundation for the
understanding of and alloy design by chemical ordering.
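The proposed order parameter is easy to state in code: the Manhattan (L1) distance between latent encodings. A sketch assuming a trained VAE encoder is available; here random vectors stand in for its outputs, and the choice of the fully ordered structure as reference is an assumption:
```python
import numpy as np

def vae_order_parameter(z: np.ndarray, z_ref: np.ndarray) -> float:
    """Manhattan (L1) distance in the VAE latent space between a
    configuration's encoding and a reference encoding."""
    return float(np.abs(z - z_ref).sum())

z_ordered = np.zeros(8)                        # stand-in for the ordered reference code
for label, z in (("low T", np.random.normal(0, 0.1, 8)),
                 ("high T", np.random.normal(0, 1.0, 8))):
    print(label, vae_order_parameter(z, z_ordered))   # larger distance = more disorder
```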
【21】 MLReal: Bridging the gap between training on synthetic data and real data applications in machine learning
标题:MLReal:在机器学习中弥合合成数据训练和真实数据应用之间的差距
链接:https://arxiv.org/abs/2109.05294
作者:Tariq Alkhalifah,Hanchen Wang,Oleg Ovcharenko
机构:Physical Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal ,-, Saudi Arabia
备注:27 pages, 15 figures
摘要:在利用基于波形数据(即地震、电磁或超声波)训练的神经网络方面,我们面临的最大挑战之一是它在实际数据中的应用。对精确标签的要求迫使我们使用合成数据开发解决方案,而这些数据中的标签是现成的。然而,合成数据往往无法捕捉现场/真实实验的真实情况,最终导致训练神经网络(NN)在推理阶段的性能较差。我们描述了一种新的方法来增强对具有真实数据特征的合成数据的监督训练(域自适应)。具体而言,对于输入数据纵轴(时间或深度)绝对值不重要的任务,如分类,或可在之后进行校正的任务,如使用测井建立速度模型,我们建议对输入进行一系列线性操作,以便训练和应用数据具有类似的分布。这是通过对NN模型的输入数据应用两种操作来实现的:1)输入数据(即放炮采集、地震图像等)与来自同一数据集的固定参考道的互相关。2) 结果数据与来自另一个域的自相关数据的平均值(或随机样本)的卷积。在训练阶段,输入数据来自合成域,自相关数据来自真实域,并在每个训练时段从真实数据中抽取随机样本。在推理/应用阶段,输入数据来自真实子集域,自相关部分的平均值来自合成数据子集域。将被动地震数据用于微震事件源位置确定和主动地震数据用于低频预测的示例应用程序用于证明该方法在提高训练模型对真实数据的适用性方面的能力。
摘要:Among the biggest challenges we face in utilizing neural networks trained on
waveform data (i.e., seismic, electromagnetic, or ultrasound) is its
application to real data. The requirement for accurate labels forces us to
develop solutions using synthetic data, where labels are readily available.
However, synthetic data often do not capture the reality of the field/real
experiment, and we end up with poor performance of the trained neural network
(NN) at the inference stage. We describe a novel approach to enhance supervised
training on synthetic data with real data features (domain adaptation).
Specifically, for tasks in which the absolute values of the vertical axis (time
or depth) of the input data are not crucial, like classification, or can be
corrected afterward, like velocity model building using a well-log, we suggest
a series of linear operations on the input so the training and application data
have similar distributions. This is accomplished by applying two operations on
the input data to the NN model: 1) The crosscorrelation of the input data
(i.e., shot gather, seismic image, etc.) with a fixed reference trace from the
same dataset. 2) The convolution of the resulting data with the mean (or a
random sample) of the autocorrelated data from another domain. In the training
stage, the input data are from the synthetic domain and the auto-correlated
data are from the real domain, and random samples from real data are drawn at
every training epoch. In the inference/application stage, the input data are
from the real subset domain and the mean of the autocorrelated sections are
from the synthetic data subset domain. Example applications on passive seismic
data for microseismic event source location determination and active seismic
data for predicting low frequencies are used to demonstrate the power of this
approach in improving the applicability of trained models to real data.
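The two linear operations are concrete enough to sketch directly with numpy/scipy for a single 1-D trace; trace lengths, the choice of reference trace, and the random stand-in data are all illustrative:
```python
import numpy as np
from scipy.signal import correlate, fftconvolve

def mlreal_transform(trace, reference_trace, other_domain_autocorr):
    """The two linear operations from the abstract on one 1-D trace:
    (1) crosscorrelate the input with a fixed reference trace from its
    own dataset, then (2) convolve with the mean autocorrelation taken
    from the other domain."""
    xcorr = correlate(trace, reference_trace, mode="same")
    return fftconvolve(xcorr, other_domain_autocorr, mode="same")

rng = np.random.default_rng(0)
synthetic = rng.standard_normal(256)                     # training-side input trace
reference = synthetic.copy()                             # fixed reference trace
real = rng.standard_normal((10, 256))                    # traces from the real domain
real_autocorr = np.mean([correlate(t, t, mode="same") for t in real], axis=0)
training_input = mlreal_transform(synthetic, reference, real_autocorr)
```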
其他(25篇)
【1】 Relaxed Marginal Consistency for Differentially Private Query Answering
标题:差分私有查询应答的松弛边缘一致性
链接:https://arxiv.org/abs/2109.06153
作者:Ryan McKenna,Siddhant Pradhan,Daniel Sheldon,Gerome Miklau
机构:College of Information and Computer Sciences, University of Massachusetts, Amherst, MA
摘要:许多用于回答数据库查询的差异私有算法涉及一个步骤,该步骤从噪声测量中重建离散数据分布。这提供了一致的查询答案并减少了错误,但通常需要随着维度呈指数增长的空间。Private PGM是一种最新的方法,它使用图形模型来表示数据分布,其复杂性与图形模型中精确边缘推理的复杂性成正比,图形模型的结构由噪声测量中变量的共存决定。私有PGM对于稀疏测量具有高度可扩展性,但可能无法在高维密集测量中运行。我们通过一种原则性的方法克服了私有PGM的主要可扩展性限制,该方法放松了估计目标中的一致性约束。我们的新方法可以与许多现有的私有查询应答算法协同工作,并在没有隐私成本的情况下提高了可伸缩性或准确性。
摘要:Many differentially private algorithms for answering database queries involve
a step that reconstructs a discrete data distribution from noisy measurements.
This provides consistent query answers and reduces error, but often requires
space that grows exponentially with dimension. Private-PGM is a recent approach
that uses graphical models to represent the data distribution, with complexity
proportional to that of exact marginal inference in a graphical model with
structure determined by the co-occurrence of variables in the noisy
measurements. Private-PGM is highly scalable for sparse measurements, but may
fail to run in high dimensions with dense measurements. We overcome the main
scalability limitation of Private-PGM through a principled approach that
relaxes consistency constraints in the estimation objective. Our new approach
works with many existing private query answering algorithms and improves
scalability or accuracy with no privacy cost.
【2】 Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data
标题:具有可学习结构的多源遥感数据单流CNN
链接:https://arxiv.org/abs/2109.06094
作者:Yi Yang,Daoye Zhu,Tengteng Qu,Qiangyu Wang,Fuhu Ren,Chengqi Cheng
摘要:本文提出了一种基于深度卷积神经网络(CNN)的多源遥感数据联合分类框架。虽然目前的方法大多基于多流结构,但我们使用组卷积在单流网络中高效地构造等效网络结构。我们进一步采用并改进了动态分组卷积(DGConv),使分组卷积的超参数乃至整个网络结构在训练过程中可学习。因此,所提出的方法在理论上可以将任何现代CNN模型调整到任何多源遥感数据集,并且可以潜在地避免由人工确定的结构参数引起的次优解。在实验中,将该方法应用于ResNet和UNet,并在三个非常不同的基准数据集(即Houston 2018数据、Berlin数据和MUUFL数据)上验证了调整后的网络。实验结果证明了所提出的单流CNN的有效性,特别是ResNet18-DGConv将HS-SAR柏林数据集的最新分类总体精度(OA)从$62.23\%$提高到$68.21\%$。在实验中,我们有两个有趣的发现。首先,使用DGConv通常会减少测试OA方差。第二,如果将多流应用于前几层,则多流对模型性能有害,但如果应用于更深的层,则多流将变得有益。总之,研究结果表明,在多源遥感数据深度学习模型中,多流体系结构并不是一个严格必要的组成部分,而是起着模型正则化器的作用。我们的代码公开在https://github.com/yyyyangyi/Multi-source-RS-DGConv。我们希望我们的工作能对未来的新研究有所启发。
摘要:In this paper, we propose an efficient and generalizable framework based on
deep convolutional neural network (CNN) for multi-source remote sensing data
joint classification. While recent methods are mostly based on multi-stream
architectures, we use group convolution to construct equivalent network
architectures efficiently within a single-stream network. We further adopt and
improve dynamic grouping convolution (DGConv) to make group convolution
hyperparameters, and thus the overall network architecture, learnable during
network training. The proposed method therefore can theoretically adjust any
modern CNN models to any multi-source remote sensing data set, and can
potentially avoid sub-optimal solutions caused by manually decided architecture
hyperparameters. In the experiments, the proposed method is applied to ResNet
and UNet, and the adjusted networks are verified on three very diverse
benchmark data sets (i.e., Houston2018 data, Berlin data, and MUUFL data).
Experimental results demonstrate the effectiveness of the proposed
single-stream CNNs, and in particular ResNet18-DGConv improves the
state-of-the-art classification overall accuracy (OA) on HS-SAR Berlin data set
from $62.23\%$ to $68.21\%$. In the experiments we have two interesting
findings. First, using DGConv generally reduces test OA variance. Second,
multi-stream is harmful to model performance if imposed to the first few
layers, but becomes beneficial if applied to deeper layers. Altogether, the
findings imply that multi-stream architecture, instead of being a strictly
necessary component in deep learning models for multi-source remote sensing
data, essentially plays the role of model regularizer. Our code is publicly
available at https://github.com/yyyyangyi/Multi-source-RS-DGConv. We hope our
work can inspire novel research in the future.
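The core trick, collapsing a multi-stream network into a single stream with group convolution, can be shown in a few lines of PyTorch. With groups=2, output channels of the fused layer depend only on the channels of their own source, matching two parallel branches; channel counts are illustrative, and the DGConv mechanism that makes the grouping learnable is not reproduced:
```python
import torch
import torch.nn as nn

# Two-stream design: one branch per source, fused later (4 channels each here).
hsi_branch = nn.Conv2d(4, 16, kernel_size=3, padding=1)
sar_branch = nn.Conv2d(4, 16, kernel_size=3, padding=1)

# Single-stream equivalent: stack the sources channel-wise and use groups=2,
# so output channels 0-15 see only source 1 and channels 16-31 only source 2.
single_stream = nn.Conv2d(8, 32, kernel_size=3, padding=1, groups=2)

x_hsi, x_sar = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
fused = single_stream(torch.cat([x_hsi, x_sar], dim=1))   # shape: (1, 32, 64, 64)
```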
【3】 Online Influence Maximization with Node-level Feedback Using Standard Offline Oracles
标题:使用标准离线Oracle实现节点级反馈的在线影响力最大化
链接:https://arxiv.org/abs/2109.06077
作者:Zhijie Zhang,Wei Chen,Xiaoming Sun,Jialin Zhang
机构: Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, University of Chinese Academy of Sciences, Beijing, China, Microsoft Research Asia, Beijing, China
备注:Abstract shortened at arXiv's request. Readers are welcome to read the abstract in the paper directly
摘要:我们研究了社交网络中的在线影响最大化(OIM)问题,学习者在多轮中反复选择种子节点生成级联,观察级联反馈,并逐渐学习生成最大级联的最佳种子。在本文中,我们关注两大挑战。首先,我们使用节点级反馈,而不是边缘级反馈。边级反馈显示级联中通过信息的所有边,其中节点级反馈仅显示带有时间戳的激活节点。节点级反馈可以说更现实,因为在实践中,观察谁受到影响相对容易,但很难观察影响来自哪个关系(边缘)。其次,我们使用标准的离线oracle,而不是离线oracle。为了为下一轮计算一个好的种子集,离线对预言机在置信域内同时找到最佳种子集和最佳参数,由于OIM问题的组合核心,这种预言机很难计算。因此,我们关注如何使用标准的离线影响最大化oracle,该oracle在给定边参数作为输入的情况下找到最佳种子集。在本文中,我们解决了两个最流行的扩散模型,独立级联(IC)和线性阈值(LT)模型的这些挑战。对于IC模型,过去的研究只实现了边缘级反馈,而我们提出了第一个节点级反馈的$\widetilde{O}(\sqrt{T})$-后悔算法。此外,该算法只调用标准的脱机oracle。对于LT模型,最近的一项研究仅提供了一个OIM解决方案,该解决方案满足了第一个挑战,但仍然需要一对oracle。在本文中,我们应用与IC模型中类似的技术,用标准oracle替换成对oracle,同时保持$\widetilde{O}(\sqrt{T})$-遗憾。
摘要:We study the online influence maximization (OIM) problem in social networks,
where in multiple rounds the learner repeatedly chooses seed nodes to generate
cascades, observes the cascade feedback, and gradually learns the best seeds
that generate the largest cascade. We focus on two major challenges in this
paper. First, we work with node-level feedback instead of edge-level feedback.
The edge-level feedback reveals all edges that pass through information in a
cascade, where the node-level feedback only reveals the activated nodes with
timestamps. The node-level feedback is arguably more realistic since in
practice it is relatively easy to observe who is influenced but very difficult
to observe from which relationship (edge) the influence comes from. Second, we
use standard offline oracle instead of offline pair-oracle. To compute a good
seed set for the next round, an offline pair-oracle finds the best seed set and
the best parameters within the confidence region simultaneously, and such an
oracle is difficult to compute due to the combinatorial core of OIM problem. So
we focus on how to use the standard offline influence maximization oracle which
finds the best seed set given the edge parameters as input. In this paper, we
resolve these challenges for the two most popular diffusion models, the
independent cascade (IC) and the linear threshold (LT) model. For the IC model,
the past research only achieves edge-level feedback, while we present the first
$\widetilde{O}(\sqrt{T})$-regret algorithm for the node-level feedback.
Besides, the algorithm only invokes standard offline oracles. For the LT model,
a recent study only provides an OIM solution that meets the first challenge but
still requires a pair-oracle. In this paper, we apply a similar technique as in
the IC model to replace the pair-oracle with a standard oracle while
maintaining $\widetilde{O}(\sqrt{T})$-regret.
【4】 An End-to-end Point of Interest (POI) Conflation Framework
标题:端到端兴趣点(POI)合并框架
链接:https://arxiv.org/abs/2109.06073
作者:Raymond Low,Zeynep D. Tekler,Lynette Cheah
机构:Engineering Systems and Design Pillar, Singapore University of Technology and Design, Somapah Rd, Singapore , Engineering Product Development Pillar
备注:20 pages, 6 figures, 5 tables
摘要:Point of interest (POI) data serves as a valuable source of semantic
information for places of interest and has many geospatial applications in real
estate, transportation, and urban planning. With the availability of different
data sources, POI conflation serves as a valuable technique for enriching data
quality and coverage by merging the POI data from multiple sources. This study
proposes a novel end-to-end POI conflation framework consisting of six steps,
starting with data procurement, schema standardisation, taxonomy mapping, POI
matching, POI unification, and data verification. The feasibility of the
proposed framework was demonstrated in a case study conducted in the eastern
region of Singapore, where the POI data from five data sources was conflated to
form a unified POI dataset. Based on the evaluation conducted, the resulting
unified dataset was found to be more comprehensive and complete than any of the
five POI data sources alone. Furthermore, the proposed approach for identifying
POI matches between different data sources outperformed all baseline approaches
with a matching accuracy of 97.6% and an average run time below 3 minutes when
matching over 12,000 POIs to result in 8,699 unique POIs, thereby demonstrating
the framework's scalability for large scale implementation in dense urban
contexts.
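The POI-matching step lends itself to a toy sketch: two records are merged when they are spatially close and their names are similar. This simple rule is only a placeholder for the paper's matching approach; the thresholds and example records are invented:
```python
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def is_match(a, b, max_dist_m=100.0, min_name_sim=0.8):
    """Two POI records match if they are close and similarly named."""
    near = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_dist_m
    sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return near and sim >= min_name_sim

a = {"name": "Jewel Changi Airport", "lat": 1.3602, "lon": 103.9897}
b = {"name": "Jewel Changi Airport Mall", "lat": 1.3603, "lon": 103.9896}
print(is_match(a, b))                                     # True
```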
【5】 Region Invariant Normalizing Flows for Mobility Transfer
标题:用于移动性转移的区域不变归一化流
链接:https://arxiv.org/abs/2109.05738
作者:Vinayak Gupta,Srikanta Bedathur
机构:IIT Delhi
备注:CIKM 2021
摘要:There exists a high variability in mobility data volumes across different
regions, which deteriorates the performance of spatial recommender systems that
rely on region-specific data. In this paper, we propose a novel transfer
learning framework called REFORMD, for continuous-time location prediction for
regions with sparse checkin data. Specifically, we model user-specific
checkin-sequences in a region using a marked temporal point process (MTPP) with
normalizing flows to learn the inter-checkin time and geo-distributions. Later,
we transfer the model parameters of spatial and temporal flows trained on a
data-rich origin region for the next check-in and time prediction in a target
region with scarce checkin data. We capture the evolving region-specific
checkin dynamics for MTPP and spatial-temporal flows by maximizing the joint
likelihood of next checkin with three channels (1) checkin-category prediction,
(2) checkin-time prediction, and (3) travel distance prediction. Extensive
experiments on different user mobility datasets across the U.S. and Japan show
that our model significantly outperforms state-of-the-art methods for modeling
continuous-time sequences. Moreover, we also show that REFORMD can be easily
adapted for product recommendations i.e., sequences without any spatial
component.
【6】 On the Choice of Fairness: Finding Representative Fairness Metrics for a Given Context
标题:论公平的选择:寻找特定情境下的代表性公平度量
链接:https://arxiv.org/abs/2109.05697
作者:Hadis Anahideh,Nazanin Nezami,Abolfazl Asudeh
机构: University of Illinois at Chicago
摘要:It is of critical importance to be aware of the historical discrimination
embedded in the data and to consider a fairness measure to reduce bias
throughout the predictive modeling pipeline. Various notions of fairness have
been defined, though choosing an appropriate metric is cumbersome. Trade-offs
and impossibility theorems make such selection even more complicated and
controversial. In practice, users (perhaps regular data scientists) should
understand each of the measures and (if possible) manually explore the
combinatorial space of different measures before they can decide which
combination is preferred based on the context, the use case, and regulations.
To alleviate the burden of selecting fairness notions for consideration, we
propose a framework that automatically discovers the correlations and
trade-offs between different pairs of measures for a given context. Our
framework dramatically reduces the exploration space by finding a small subset
of measures that represent others and highlighting the trade-offs between them.
This allows users to view unfairness from various perspectives that might
otherwise be ignored due to the sheer size of the exploration space. We
showcase the validity of the proposal using comprehensive experiments on
real-world benchmark data sets.
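A toy numpy sketch of the core idea, finding a small representative subset of correlated fairness metrics: correlations are computed across many candidate models, and a metric is kept only if no already-kept metric is strongly correlated with it. The data, threshold, and greedy rule are illustrative, not the paper's procedure:
```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random((50, 4))                  # 50 candidate models x 4 fairness metrics
scores[:, 1] = scores[:, 0] + 0.05 * rng.standard_normal(50)   # metric 1 tracks metric 0

corr = np.abs(np.corrcoef(scores, rowvar=False))   # pairwise |correlation| of metrics
representatives = []
for m in range(corr.shape[1]):                     # keep a metric only if no kept metric
    if all(corr[m, r] <= 0.9 for r in representatives):   # already represents it
        representatives.append(m)
print("representative metrics:", representatives)  # metric 1 is dropped as redundant
```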
【7】 Mixing between the Cross Entropy and the Expectation Loss Terms
标题:交叉熵与期望损失项的混合
链接:https://arxiv.org/abs/2109.05635
作者:Barak Battash,Lior Wolf,Tamir Hazan
机构: Tel-Aviv University, Technion
备注:8 pages, 3 figures
摘要:The cross entropy loss is widely used due to its effectiveness and solid
theoretical grounding. However, as training progresses, the loss tends to focus
on hard to classify samples, which may prevent the network from obtaining gains
in performance. While most work in the field suggest ways to classify hard
negatives, we suggest to strategically leave hard negatives behind, in order to
focus on misclassified samples with higher probabilities. We show that adding
to the optimization goal the expectation loss, which is a better approximation
of the zero-one loss, helps the network to achieve better accuracy. We,
therefore, propose to shift between the two losses during training, focusing
more on the expectation loss gradually during the later stages of training. Our
experiments show that the new training protocol improves performance across a
diverse set of classification domains, including computer vision, natural
language processing, tabular data, and sequences. Our code and scripts are
available at supplementary.
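A minimal PyTorch sketch of the proposed shifting between losses, taking the expectation loss to be $1 - p_{\text{correct}}$ (the expected zero-one loss under the softmax; the paper's exact definition may differ) and ramping its weight during training:
```python
import torch
import torch.nn.functional as F

def mixed_loss(logits, targets, alpha):
    """(1 - alpha) * cross entropy + alpha * expectation loss, with the
    expectation loss taken here as 1 - p_correct. alpha is scheduled
    toward 1 so later training stages weight the expectation term more."""
    ce = F.cross_entropy(logits, targets)
    p_correct = F.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    return (1.0 - alpha) * ce + alpha * (1.0 - p_correct).mean()

logits = torch.randn(16, 5, requires_grad=True)
targets = torch.randint(0, 5, (16,))
for epoch in range(10):
    alpha = epoch / 9.0          # toy schedule from pure CE toward the expectation term
    mixed_loss(logits, targets, alpha).backward()
```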
【8】 Data Analytics for Smart cities: Challenges and Promises
标题:智能城市数据分析:挑战与承诺
链接:https://arxiv.org/abs/2109.05581
作者:Farid Ghareh Mohammadi,Farzan Shenavarmasouleh,M. Hadi Amini,Hamid R. Arabnia
机构:Department of Computer Science, Franklin College of Arts and Sciences, University of Georgia, Athens, Georgia; School of Computing and Information Sciences, College of Engineering and Computing, Florida International University, Miami, FL
备注:12 pages, 2 figures
摘要:The explosion of advancements in artificial intelligence, sensor
technologies, and wireless communication activates ubiquitous sensing through
distributed sensors. These sensors are various domains of networks that lead us
to smart systems in healthcare, transportation, environment, and other relevant
branches/networks. Having collaborative interaction among the smart systems
connects end-user devices to each other which enables achieving a new
integrated entity called Smart Cities. The goal of this study is to provide a
comprehensive survey of data analytics in smart cities. In this paper, we aim
to focus on one of the smart cities important branches, namely Smart Mobility,
and its positive ample impact on the smart cities decision-making process.
Intelligent decision-making systems in smart mobility offer many advantages
such as saving energy, relaying city traffic, and more importantly, reducing
air pollution by offering real-time useful information and imperative
knowledge. Making a decision in smart cities in time is challenging due to
various and high dimensional factors and parameters, which are not frequently
collected. In this paper, we first address current challenges in smart cities
and provide an overview of potential solutions to these challenges. Then, we
offer a framework of these solutions, called universal smart cities decision
making, with three main sections of data capturing, data analysis, and decision
making to optimize the smart mobility within smart cities. With this framework,
we elaborate on fundamental concepts of big data, machine learning, and deep
learning algorithms that have been applied to smart cities, and discuss the role
of these algorithms in decision making for smart mobility in smart cities.
【9】 Improved Algorithms for Misspecified Linear Markov Decision Processes
Link: https://arxiv.org/abs/2109.05546
Authors: Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant
Affiliations: University of Illinois at Urbana-Champaign, University of Texas at Austin
Abstract: For the misspecified linear Markov decision process (MLMDP) model of Jin et
al. [2020], we propose an algorithm with three desirable properties. (P1) Its
regret after $K$ episodes scales as $K \max \{ \varepsilon_{\text{mis}},
\varepsilon_{\text{tol}} \}$, where $\varepsilon_{\text{mis}}$ is the degree of
misspecification and $\varepsilon_{\text{tol}}$ is a user-specified error
tolerance. (P2) Its space and per-episode time complexities remain bounded as
$K \rightarrow \infty$. (P3) It does not require $\varepsilon_{\text{mis}}$ as
input. To our knowledge, this is the first algorithm satisfying all three
properties. For concrete choices of $\varepsilon_{\text{tol}}$, we also improve
existing regret bounds (up to log factors) while achieving either (P2) or (P3)
(existing algorithms satisfy neither). At a high level, our algorithm
generalizes (to MLMDPs) and refines the SupLinUCB algorithm, which Takemura
et al. [2021] recently showed satisfies (P3) in the contextual bandit setting.
【10】 Feature Importance in Gradient Boosting Trees with Cross-Validation Feature Selection
Link: https://arxiv.org/abs/2109.05468
Authors: Afek Ilay Adler, Amichai Painsky
Affiliations: The Industrial Engineering Department, Tel Aviv University, Israel
Abstract: Gradient Boosting Machines (GBM) are among the go-to algorithms on tabular
data, which produce state-of-the-art results in many prediction tasks. Despite
its popularity, the GBM framework suffers from a fundamental flaw in its base
learners. Specifically, most implementations utilize decision trees that are
typically biased towards categorical variables with large cardinalities. The
effect of this bias was extensively studied over the years, mostly in terms of
predictive performance. In this work, we extend the scope and study the effect
of biased base learners on GBM feature importance (FI) measures. We show that
although these implementations demonstrate highly competitive predictive
performance, they still, surprisingly, suffer from bias in FI. By utilizing
cross-validated (CV) unbiased base learners, we fix this flaw at a relatively
low computational cost. We demonstrate the suggested framework in a variety of
synthetic and real-world setups, showing a significant improvement in all GBM
FI measures while maintaining essentially the same level of predictive accuracy.
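The cardinality bias itself is easy to reproduce with off-the-shelf tools; a
small sketch (illustrating the problem, not the paper's CV-based fix) in which
a pure-noise, high-cardinality feature tends to draw inflated impurity-based
importance, while permutation importance stays closer to the truth:

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.inspection import permutation_importance

    rng = np.random.default_rng(0)
    n = 2000
    signal = rng.integers(0, 2, n)       # binary feature that truly drives y
    noise_id = rng.integers(0, 500, n)   # high-cardinality feature, pure noise
    X = np.column_stack([signal, noise_id])
    y = (signal ^ (rng.random(n) < 0.1)).astype(int)  # y = signal + label noise

    gbm = GradientBoostingClassifier(random_state=0).fit(X, y)
    print("impurity-based FI:", gbm.feature_importances_)  # noise often inflated
    perm = permutation_importance(gbm, X, y, random_state=0)
    print("permutation FI:   ", perm.importances_mean)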
【11】 The Logic Traps in Evaluating Post-hoc Interpretations
Link: https://arxiv.org/abs/2109.05463
Authors: Yiming Ju, Yuanzhe Zhang, Zhao Yang, Zhongtao Jiang, Kang Liu, Jun Zhao
Affiliations: National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Abstract: Post-hoc interpretation aims to explain a trained model and reveal how the
model arrives at a decision. Though research on post-hoc interpretations has
developed rapidly, one growing pain in this field is the difficulty in
evaluating interpretations. There are some crucial logic traps behind existing
evaluation methods, which are ignored by most works. In this opinion piece, we
summarize four kinds of evaluation methods and point out the corresponding logic
traps behind them. We argue that we should be clear about these traps rather
than ignoring them and drawing conclusions assertively.
【12】 Omnipredictors
Link: https://arxiv.org/abs/2109.05389
Authors: Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan, Udi Wieder
Affiliations: VMware Research, Microsoft Research, Stanford University, USC
Comments: 35 pages, 1 figure
Abstract: Loss minimization is a dominant paradigm in machine learning, where a
predictor is trained to minimize some loss function that depends on an
uncertain event (e.g., "will it rain tomorrow?"). Different loss functions
imply different learning algorithms and, at times, very different predictors.
While widespread and appealing, a clear drawback of this approach is that the
loss function may not be known at the time of learning, requiring the algorithm
to use a best-guess loss function. We suggest a rigorous new paradigm for loss
minimization in machine learning where the loss function can be ignored at the
time of learning and only be taken into account when deciding an action.
We introduce the notion of an $(\mathcal{L},\mathcal{C})$-omnipredictor,
which could be used to optimize any loss in a family $\mathcal{L}$. Once the
loss function is set, the outputs of the predictor can be post-processed (a
simple univariate data-independent transformation of individual predictions) to
do well compared with any hypothesis from the class $\mathcal{C}$. The post
processing is essentially what one would perform if the outputs of the
predictor were true probabilities of the uncertain events. In a sense,
omnipredictors extract all the predictive power from the class $\mathcal{C}$,
irrespective of the loss function in $\mathcal{L}$.
We show that such "loss-oblivious" learning is feasible through a connection
to multicalibration, a notion introduced in the context of algorithmic
fairness. In addition, we show how multicalibration can be viewed as a solution
concept for agnostic boosting, shedding new light on past results. Finally, we
transfer our insights back to the context of algorithmic fairness by providing
omnipredictors for multi-group loss minimization.
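A small sketch of that post-processing step for a binary outcome: treating the
prediction p as a true probability and choosing, per example, the action that
minimizes expected loss once the loss is revealed (the squared/asymmetric
losses below are illustrative, not from the paper):

    def post_process(p, loss, actions):
        # p: predicted probability that y = 1, treated as if it were the truth.
        # loss(action, y): only revealed at decision time.
        return min(actions, key=lambda a: p * loss(a, 1) + (1 - p) * loss(a, 0))

    # The same predictor serves two different losses.
    squared = lambda a, y: (a - y) ** 2   # expected loss minimized at a = p
    asymmetric = lambda a, y: 3 if (y == 1 and a == 0) else (1 if (y == 0 and a == 1) else 0)
    grid = [i / 100 for i in range(101)]
    print(post_process(0.3, squared, grid))       # ~0.3
    print(post_process(0.3, asymmetric, [0, 1]))  # 1: false negatives cost 3x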
【13】 A Novel Intrinsic Measure of Data Separability
Link: https://arxiv.org/abs/2109.05180
Authors: Shuyue Guan, Murray Loew
Comments: 16 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:2005.13120
Abstract: In machine learning, the performance of a classifier depends on both the
classifier model and the separability/complexity of datasets. To quantitatively
measure the separability of datasets, we create an intrinsic measure -- the
Distance-based Separability Index (DSI), which is independent of the classifier
model. We consider the situation in which different classes of data are mixed
in the same distribution to be the most difficult for classifiers to separate.
We then formally show that the DSI can indicate whether the distributions of
datasets are identical for any dimensionality, and we verify that the DSI is an
effective separability measure by comparing it to several state-of-the-art
separability/complexity measures on synthetic and real datasets. Having
demonstrated the DSI's ability to compare distributions of samples, we also
discuss some of its other promising applications, such as measuring the
performance of generative adversarial networks (GANs) and evaluating the
results of clustering methods.
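A two-class sketch of the idea, assuming the index compares the distributions
of intra-class and between-class pairwise distances via the Kolmogorov-Smirnov
statistic (the paper's exact multi-class definition may differ):

    import numpy as np
    from scipy.spatial.distance import pdist, cdist
    from scipy.stats import ks_2samp

    def dsi_two_class(Xa, Xb):
        # Distances within each class vs distances across the two classes.
        intra = np.concatenate([pdist(Xa), pdist(Xb)])
        between = cdist(Xa, Xb).ravel()
        # When the classes share one distribution, the two sets look alike
        # and the KS statistic (hence the index) approaches 0.
        return ks_2samp(intra, between).statistic

    rng = np.random.default_rng(0)
    print(dsi_two_class(rng.normal(0, 1, (200, 5)),
                        rng.normal(0, 1, (200, 5))))  # mixed classes: ~0
    print(dsi_two_class(rng.normal(0, 1, (200, 5)),
                        rng.normal(4, 1, (200, 5))))  # separated classes: ~1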
【14】 MURAL: Multimodal, Multitask Retrieval Across Languages
Link: https://arxiv.org/abs/2109.05125
Authors: Aashi Jain, Mandy Guo, Krishna Srinivasan, Ting Chen, Sneha Kudugunta, Chao Jia, Yinfei Yang, Jason Baldridge
Affiliations: Google Research
Abstract: Both image-caption pairs and translation pairs provide the means to learn
deep representations of and connections between languages. We use both types of
pairs in MURAL (MUltimodal, MUltitask Representations Across Languages), a dual
encoder that solves two tasks: 1) image-text matching and 2) translation pair
matching. By incorporating billions of translation pairs, MURAL extends ALIGN
(Jia et al. PMLR'21)--a state-of-the-art dual encoder learned from 1.8 billion
noisy image-text pairs. When using the same encoders, MURAL's performance
matches or exceeds ALIGN's cross-modal retrieval performance on well-resourced
languages across several datasets. More importantly, it considerably improves
performance on under-resourced languages, showing that text-text learning can
overcome a paucity of image-caption examples for these languages. On the
Wikipedia Image-Text dataset, for example, MURAL-base improves zero-shot mean
recall by 8.1% on average for eight under-resourced languages and by 6.8% on
average when fine-tuning. We additionally show that MURAL's text
representations cluster not only with respect to genealogical connections but
also based on areal linguistics, such as the Balkan Sprachbund.
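A schematic sketch of a dual-encoder multitask objective of this kind, assuming
in-batch softmax contrastive losses for both tasks; the embeddings below are
random stand-ins, and the real ALIGN/MURAL training setup differs in many
details:

    import torch
    import torch.nn.functional as F

    def contrastive_loss(a, b, temperature=0.05):
        # In-batch softmax contrastive loss: matched pairs are the diagonal.
        logits = F.normalize(a, dim=-1) @ F.normalize(b, dim=-1).T / temperature
        labels = torch.arange(a.size(0))
        return (F.cross_entropy(logits, labels)
                + F.cross_entropy(logits.T, labels)) / 2

    # One step mixes the two tasks; in MURAL the text encoder is shared.
    img, cap = torch.randn(32, 512), torch.randn(32, 512)  # image/caption pairs
    src, tgt = torch.randn(32, 512), torch.randn(32, 512)  # translation pairs
    loss = contrastive_loss(img, cap) + contrastive_loss(src, tgt)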
【15】 No Size Fits All: Automated Radio Configuration for LPWANs
Link: https://arxiv.org/abs/2109.05103
Authors: Zerina Kapetanovic, Deepak Vasisht, Tusher Chakraborty, Joshua R. Smith, Ranveer Chandra
Affiliations: University of Washington, Seattle, WA; University of Illinois, Urbana, IL; Microsoft, Redmond, WA
Abstract: Low power long-range networks like LoRa have become increasingly mainstream
for Internet of Things deployments. Given the versatility of applications that
these protocols enable, they support many data rates and bandwidths. Yet, for a
given network that supports hundreds of devices over multiple miles, the
network operator typically needs to specify a single configuration, or a small
subset of configurations, for all the client devices to communicate with
the gateway. This one-size-fits-all approach is highly inefficient in large
networks. We propose an alternative approach -- we allow network devices to
transmit at any data rate they choose. The gateway uses the first few symbols
in the preamble to classify the correct data rate, switches its configuration,
and then decodes the data. Our design leverages the inherent asymmetry in
outdoor IoT deployments where the clients are power-starved and
resource-constrained, but the gateway is not. Our gateway design, Proteus, runs
a neural network architecture and is backward compatible with existing LoRa
protocols. Our experiments reveal that Proteus can identify the correct
configuration with over 97% accuracy in both indoor and outdoor deployments.
Our network architecture leads to a 3.8 to 11 times increase in throughput for
our LoRa testbed.
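As a toy illustration only (Proteus's real architecture and input features are
not specified here), a gateway-side classifier mapping raw I/Q samples from the
first preamble symbols to one of the supported rate configurations might look
like:

    import torch
    import torch.nn as nn

    NUM_CONFIGS = 8   # e.g., supported (spreading factor, bandwidth) pairs
    SAMPLES = 1024    # I/Q samples covering the first few preamble symbols

    classifier = nn.Sequential(
        nn.Conv1d(2, 32, kernel_size=9, stride=2), nn.ReLU(),
        nn.Conv1d(32, 64, kernel_size=9, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        nn.Linear(64, NUM_CONFIGS),
    )

    iq = torch.randn(1, 2, SAMPLES)         # one received preamble snippet
    config = classifier(iq).argmax(dim=-1)  # gateway switches to this rate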
【16】 Entity-Based Knowledge Conflicts in Question Answering
Link: https://arxiv.org/abs/2109.05052
Authors: Shayne Longpre, Kartik Perisetla, Anthony Chen, Nikhil Ramesh, Chris DuBois, Sameer Singh
Affiliations: Apple; University of California, Irvine
Comments: Accepted to Empirical Methods in Natural Language Processing (EMNLP) 2021
Abstract: Knowledge-dependent tasks typically use two sources of knowledge: parametric,
learned at training time, and contextual, given as a passage at inference time.
To understand how models use these sources together, we formalize the problem
of knowledge conflicts, where the contextual information contradicts the
learned information. Analyzing the behaviour of popular models, we measure
their over-reliance on memorized information (the cause of hallucinations), and
uncover important factors that exacerbate this behaviour. Lastly, we propose a
simple method to mitigate over-reliance on parametric knowledge, which
minimizes hallucination, and improves out-of-distribution generalization by
4%-7%. Our findings demonstrate the importance of practitioners evaluating a
model's tendency to hallucinate rather than read, and show that our mitigation
strategy encourages generalization to evolving information (i.e.,
time-dependent queries). To encourage these practices, we have released our
framework for generating knowledge conflicts.
【17】 Potential-based Reward Shaping in Sokoban
Link: https://arxiv.org/abs/2109.05022
Authors: Zhao Yang, Mike Preuss, Aske Plaat
Affiliations: LIACS, Leiden University, the Netherlands
Abstract: Learning to solve sparse-reward reinforcement learning problems is difficult,
due to the lack of guidance towards the goal. But in some problems, prior
knowledge can be used to augment the learning process. Reward shaping is a way
to incorporate prior knowledge into the original reward function in order to
speed up the learning. While previous work has investigated the use of expert
knowledge to generate potential functions, in this work we study whether we
can use a search algorithm (A*) to automatically generate a potential function
for reward shaping in Sokoban, a well-known planning task. The results showed
that learning with a shaped reward function is faster than learning from
scratch. Our results indicate that distance functions can be suitable potential
functions for Sokoban. This work demonstrates the possibility of solving
multiple instances with the help of reward shaping. The result can be
compressed into a single policy, which can be seen as the first phase towards
training a general policy that is able to solve unseen instances.
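Potential-based shaping has a standard form; a sketch assuming the potential
comes from a hypothetical astar_distance(state) helper giving the A* computed
distance-to-goal, as the abstract describes:

    GAMMA = 0.99

    def potential(state, astar_distance):
        # Negative distance-to-goal: states closer to the goal rank higher.
        return -astar_distance(state)

    def shaped_reward(reward, s, s_next, astar_distance):
        # F(s, s') = gamma * phi(s') - phi(s) preserves the optimal policy
        # (Ng et al., 1999) while densifying Sokoban's sparse reward.
        phi = potential(s, astar_distance)
        phi_next = potential(s_next, astar_distance)
        return reward + GAMMA * phi_next - phi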
【18】 Barzilai and Borwein conjugate gradient method equipped with a non-monotone line search technique and its application on non-negative matrix factorization
Link: https://arxiv.org/abs/2109.05685
Authors: Sajad Fathi Hafshejani, Daya Gaur, Shahadat Hossain, Robert Benkoczi
Affiliations: Department of Math and Computer Science, University of Lethbridge, Lethbridge, AB, Canada
Abstract: In this paper, we propose a new non-monotone conjugate gradient method for
solving unconstrained nonlinear optimization problems. We first modify the
non-monotone line search method by introducing a new trigonometric function to
calculate the non-monotone parameter, which plays an essential role in the
algorithm's efficiency. Then, we apply a convex combination of the
Barzilai-Borwein step sizes to calculate the step size in each iteration.
Under suitable assumptions, we prove that the new algorithm has the global
convergence property. The efficiency and effectiveness of the proposed method
are demonstrated in practice by applying the algorithm to standard test
problems and non-negative matrix factorization problems.
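A sketch of the Barzilai-Borwein ingredient: the two classical BB step sizes
and their convex combination, driving a plain gradient iteration on a toy
quadratic (the paper's trigonometric non-monotone line search is omitted here):

    import numpy as np

    def bb_step(s, y, theta=0.5):
        # s = x_k - x_{k-1}, y = grad_k - grad_{k-1}
        bb1 = s @ s / (s @ y)   # long BB step
        bb2 = s @ y / (y @ y)   # short BB step
        return theta * bb1 + (1 - theta) * bb2  # convex combination, theta in [0, 1]

    # Gradient iteration on f(x) = 0.5 x^T A x with BB step sizes.
    A = np.diag([1.0, 10.0, 100.0])
    x = np.ones(3); g = A @ x; alpha = 1e-3
    for _ in range(50):
        x_new = x - alpha * g
        g_new = A @ x_new
        alpha = bb_step(x_new - x, g_new - g)
        x, g = x_new, g_new
    print(np.linalg.norm(x))  # near 0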
【19】 Automatic Componentwise Boosting: An Interpretable AutoML System
Link: https://arxiv.org/abs/2109.05583
Authors: Stefan Coors, Daniel Schalk, Bernd Bischl, David Rügamer
Affiliations: Department of Statistics, LMU Munich
Comments: 6 pages, 4 figures, ECML-PKDD Workshop on Automating Data Science 2021
Abstract: In practice, machine learning (ML) workflows require various different steps,
from data preprocessing, missing value imputation, model selection, to model
tuning as well as model evaluation. Many of these steps rely on human ML
experts. AutoML - the field of automating these ML pipelines - tries to help
practitioners to apply ML off-the-shelf without any expert knowledge. Most
modern AutoML systems like auto-sklearn, H2O AutoML, or TPOT aim for high
predictive performance, thereby generating ensembles that consist almost
exclusively of black-box models. This, in turn, makes the interpretation for
the layperson more intricate and adds another layer of opacity for users. We
propose an AutoML system that constructs an interpretable additive model that
can be fitted using a highly scalable componentwise boosting algorithm. Our
system provides tools for easy model interpretation such as visualizing partial
effects and pairwise interactions, allows for a straightforward calculation of
feature importance, and gives insights into the required model complexity to
fit the given task. We introduce the general framework and outline its
implementation, autocompboost. To demonstrate the framework's efficacy, we
compare autocompboost to other existing systems on the OpenML
AutoML-Benchmark. Despite its restriction to an interpretable model space, our
system is competitive in terms of predictive performance on most data sets
while being more user-friendly and transparent.
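A bare-bones sketch of componentwise boosting with univariate linear base
learners, which keeps the fitted model additive and per-feature interpretable
(autocompboost's actual base learners, e.g. splines, are richer):

    import numpy as np

    def componentwise_boost(X, y, lr=0.1, iters=200):
        n, p = X.shape
        coef, f = np.zeros(p), np.zeros(n)
        for _ in range(iters):
            r = y - f                             # residuals (squared-error loss)
            # Fit one univariate least-squares learner per feature...
            betas = [X[:, j] @ r / (X[:, j] @ X[:, j]) for j in range(p)]
            sses = [((r - b * X[:, j]) ** 2).sum() for j, b in enumerate(betas)]
            j = int(np.argmin(sses))              # ...keep only the best one,
            coef[j] += lr * betas[j]              # so the model stays additive
            f += lr * betas[j] * X[:, j]          # and per-feature interpretable.
        return coef

    X = np.random.default_rng(0).normal(size=(500, 10))
    y = 2 * X[:, 0] - 1 * X[:, 3]
    print(np.round(componentwise_boost(X, y), 2))  # mass on features 0 and 3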
【20】 Kernel PCA with the Nyström method
Link: https://arxiv.org/abs/2109.05578
Authors: Fredrik Hallgren
Affiliations: Department of Statistical Science, University College London
Comments: 43 pages, 6 figures
Abstract: Kernel methods are powerful but computationally demanding techniques for
non-linear learning. A popular remedy, the Nyström method, has been shown to
scale up kernel methods to very large datasets with little loss in
accuracy. However, kernel PCA with the Nyström method has not been widely
studied. In this paper we derive kernel PCA with the Nyström method and study
its accuracy, providing a finite-sample confidence bound on the difference
between the Nyström and standard empirical reconstruction errors. The
behaviours of the method and bound are illustrated through extensive computer
experiments on real-world data. As an application of the method we present
kernel principal component regression with the Nyström method.
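A sketch of the construction under standard assumptions: build the Nyström
feature map from m landmark points, then run PCA (uncentred, for brevity) in
that approximate feature space; the paper's derivation handles the details this
sketch glosses over:

    import numpy as np
    from scipy.linalg import eigh, sqrtm

    def rbf(A, B, gamma=0.5):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def nystrom_kpca(X, m=50, k=2, seed=0):
        rng = np.random.default_rng(seed)
        Z = X[rng.choice(len(X), m, replace=False)]  # landmark points
        K_mm, K_nm = rbf(Z, Z), rbf(X, Z)
        # Approximate feature map: Phi = K_nm K_mm^{-1/2}, so Phi Phi^T ~ K.
        Phi = K_nm @ np.linalg.pinv(np.real(sqrtm(K_mm)))
        # PCA on the m-dimensional approximate features.
        vals, vecs = eigh(Phi.T @ Phi)
        order = np.argsort(vals)[::-1][:k]
        return Phi @ vecs[:, order]                  # top-k KPCA scores

    X = np.random.default_rng(1).normal(size=(300, 5))
    print(nystrom_kpca(X).shape)  # (300, 2)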
【21】 Towards a variational Jordan-Lee-Preskill quantum algorithm
Link: https://arxiv.org/abs/2109.05547
Authors: Junyu Liu, Jinzhao Sun, Xiao Yuan
Affiliations: Walter Burke Institute for Theoretical Physics, California Institute of Technology, Pasadena; Institute for Quantum Information and Matter, California Institute of Technology, Pasadena
Comments: 48 pages, many figures
Abstract: Rapid developments of quantum information technology show promising
opportunities for simulating quantum field theory in near-term quantum devices.
In this work, we formulate the theory of (time-dependent) variational quantum
simulation, explicitly designed for quantum simulation of quantum field theory.
We develop hybrid quantum-classical algorithms for crucial ingredients in
particle scattering experiments, including encoding, state preparation, and
time evolution, with several numerical simulations to demonstrate our
algorithms in the 1+1 dimensional $\lambda \phi^4$ quantum field theory. These
algorithms could be understood as near-term analogs of the Jordan-Lee-Preskill
algorithm, the basic algorithm for simulating quantum field theory using
universal quantum devices. Our contribution also includes a bosonic version of
the Unitary Coupled Cluster ansatz with physical interpretation in quantum
field theory, a discussion about the subspace fidelity, a comparison among
different bases in the 1+1 dimensional $\lambda \phi^4$ theory, and the
"spectral crowding" in the quantum field theory simulation.
【22】 Team NeuroPoly: Description of the Pipelines for the MICCAI 2021 MS New Lesions Segmentation Challenge
Link: https://arxiv.org/abs/2109.05409
Authors: Uzay Macar, Enamundram Naga Karthik, Charley Gros, Andréanne Lemay, Julien Cohen-Adad
Affiliations: NeuroPoly Lab, Institute of Biomedical Engineering, Polytechnique Montréal, Montréal, QC, Canada; MILA - Québec AI Institute, Montréal, QC, Canada; Functional Neuroimaging Unit, CRIUGM, Université de Montréal, Montréal, QC, Canada
Comments: To be presented at the 2021 MICCAI Challenge on Multiple Sclerosis Lesion Segmentation (MSSEG-2); 8 pages in total
Abstract: This paper gives a detailed description of the pipelines used for the 2nd
edition of the MICCAI 2021 Challenge on Multiple Sclerosis Lesion Segmentation.
An overview of the data preprocessing steps applied is provided along with a
brief description of the pipelines used, in terms of the architecture and the
hyperparameters. Our code for this work can be found at:
https://github.com/ivadomed/ms-challenge-2021.
【23】 Differentially Private Variable Selection via the Knockoff Filter
Link: https://arxiv.org/abs/2109.05402
Authors: Mehrdad Pournaderi, Yu Xiang
Affiliations: University of Utah, Salt Lake City, UT, USA
Comments: Accepted to the 2021 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
Abstract: The knockoff filter, recently developed by Barber and Candès, is an effective
procedure to perform variable selection with a controlled false discovery rate
(FDR). We propose a private version of the knockoff filter by incorporating
Gaussian and Laplace mechanisms, and show that variable selection with
controlled FDR can be achieved. Simulations demonstrate that our procedure
retains reasonable statistical power.
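A sketch of the privatization ingredient: Laplace noise added to the knockoff
W-statistics before the usual knockoff+ threshold. The sensitivity is taken as
given here, whereas the paper calibrates the mechanisms carefully:

    import numpy as np

    def laplace_mechanism(W, sensitivity, epsilon, seed=0):
        # Laplace(sensitivity / epsilon) noise on each knockoff statistic
        # makes the released statistics epsilon-differentially private.
        rng = np.random.default_rng(seed)
        return W + rng.laplace(scale=sensitivity / epsilon, size=W.shape)

    def knockoff_select(W, q=0.1):
        # Standard knockoff+ threshold applied to (noisy) W-statistics.
        for t in np.sort(np.abs(W[W != 0])):
            fdp = (1 + (W <= -t).sum()) / max(1, (W >= t).sum())
            if fdp <= q:
                return np.where(W >= t)[0]
        return np.array([], dtype=int)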
【24】 Gradients and Subgradients of Buffered Failure Probability
Link: https://arxiv.org/abs/2109.05391
Authors: Johannes O. Royset, Ji-Eun Byun
Affiliations: Operations Research Department, Naval Postgraduate School; Engineering Risk Analysis Group, Technical University of Munich
Abstract: Gradients and subgradients are central to optimization and sensitivity
analysis of buffered failure probabilities. We furnish a characterization of
subgradients based on subdifferential calculus in the case of finite
probability distributions and, under additional assumptions, also a gradient
expression for general distributions. Several examples illustrate the
application of the results, especially in the context of optimality conditions.
【25】 Real-time multimodal image registration with partial intraoperative point-set data
Link: https://arxiv.org/abs/2109.05023
Authors: Zachary M C Baum, Yipeng Hu, Dean C Barratt
Affiliations: Centre for Medical Image Computing, University College London, London, UK; Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, UK
Comments: Accepted manuscript in Medical Image Analysis
Abstract: We present Free Point Transformer (FPT) - a deep neural network architecture
for non-rigid point-set registration. Consisting of two modules, a global
feature extraction module and a point transformation module, FPT does not
assume explicit constraints based on point vicinity, thereby overcoming a
common requirement of previous learning-based point-set registration methods.
FPT is designed to accept unordered and unstructured point-sets with a variable
number of points and uses a "model-free" approach without heuristic
constraints. Training FPT is flexible and involves minimizing an intuitive
unsupervised loss function, but supervised, semi-supervised, and partially- or
weakly-supervised training are also supported. This flexibility makes FPT
amenable to multimodal image registration problems where the ground-truth
deformations are difficult or impossible to measure. In this paper, we
demonstrate the application of FPT to non-rigid registration of prostate
magnetic resonance (MR) imaging and sparsely-sampled transrectal ultrasound
(TRUS) images. The registration errors were 4.71 mm and 4.81 mm for complete
TRUS imaging and sparsely-sampled TRUS imaging, respectively. The results
indicate superior accuracy to the alternative rigid and non-rigid registration
algorithms tested and substantially lower computation time. The rapid inference
possible with FPT makes it particularly suitable for applications where
real-time registration is beneficial.