
Machine Learning arXiv Digest [8.4]


Click "Read the original" to visit arxivdaily.com, which covers CS, Physics, Math, Economics, Statistics, Finance, Biology, and Electrical Engineering, with search, bookmarking, and more!


cs.LG: 85 papers today


Graph-related (graph learning | graph neural networks | graph optimization, etc.) (9 papers)

【1】 MTGFlow: Unsupervised Multivariate Time Series Anomaly Detection via Dynamic Graph and Entity-aware Normalizing Flow
Link: https://arxiv.org/abs/2208.02108

Authors: Qihang Zhou, Jiming Chen, Haoyu Liu, Shibo He, Wenchao Meng
Affiliation: Zhejiang University
Abstract: Multivariate time series anomaly detection has been extensively studied under the semi-supervised setting, where a training dataset with all normal instances is required. However, preparing such a dataset is very laborious since each single data instance should be fully guaranteed to be normal. It is, therefore, desirable to explore multivariate time series anomaly detection methods that require no label knowledge about the dataset. In this paper, we propose MTGFlow, an unsupervised anomaly detection approach for multivariate time series via dynamic graph and entity-aware normalizing flow, relying only on the widely accepted hypothesis that abnormal instances exhibit sparser densities than normal ones. However, the complex interdependencies among entities and the diverse inherent characteristics of each entity pose significant challenges for density estimation, let alone for detecting anomalies based on the estimated probability distribution. To tackle these problems, we propose to learn the mutual and dynamic relations among entities via a graph structure learning model, which helps to model the distribution of multivariate time series accurately. Moreover, taking into account the distinct characteristics of the individual entities, an entity-aware normalizing flow is developed to describe each entity with a parameterized normal distribution, thereby producing fine-grained density estimation. Incorporating these two strategies, MTGFlow achieves superior anomaly detection performance. Experiments on real-world datasets demonstrate that MTGFlow outperforms the state of the art (SOTA) by 5.0% and 1.6% AUROC on the SWaT and WADI datasets respectively. Also, through the anomaly scores contributed by individual entities, MTGFlow can provide explanatory information for the detection results.
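The hypothesis driving MTGFlow reduces to a simple recipe: fit a density model on unlabeled windows and flag low-density ones as anomalies. The sketch below illustrates only that scoring step on synthetic data, with a per-entity Gaussian standing in for the learned entity-aware flow (the dynamic graph learning and flow conditioning are not reproduced); all names and constants are illustrative.

import numpy as np

# Density-based anomaly scoring: instances in low-density regions are anomalies.
# A per-entity Gaussian is a stand-in for the learned entity-aware normalizing flow.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(1000, 3))      # unlabeled windows, 3 entities
test = np.vstack([rng.normal(0.0, 1.0, (5, 3)),   # normal windows
                  rng.normal(6.0, 1.0, (2, 3))])  # anomalous windows

mu, sigma = train.mean(axis=0), train.std(axis=0)

def log_density(x):
    # Sum of per-entity Gaussian log-densities (fine-grained estimation).
    z = (x - mu) / sigma
    return (-0.5 * z**2 - np.log(sigma * np.sqrt(2 * np.pi))).sum(axis=1)

scores = -log_density(test)                        # higher score = more anomalous
threshold = np.quantile(-log_density(train), 0.99)
print(scores, scores > threshold)                  # anomalous windows score far above threshold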


【2】 Graph Regularized Nonnegative Latent Factor Analysis Model for Temporal Link Prediction in Cryptocurrency Transaction Networks
Link: https://arxiv.org/abs/2208.01923

Authors: Zhou Yue, Liu ZhiGang, Yuan Ye
Affiliation: Y. Zhou and Z.G. Liu are with the School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing
Abstract: With the development of blockchain technology, cryptocurrencies based on blockchain are becoming more and more popular. This has given birth to huge cryptocurrency transaction networks, which have received widespread attention. Link prediction, which learns the structure of a network, is helpful for understanding the mechanism of the network, so it has also been widely studied in cryptocurrency networks. However, the dynamics of cryptocurrency transaction networks have been neglected in past research. We use a graph-regularized method to link past transaction records with future transactions. Based on this, we propose a single latent factor-dependent, non-negative, multiplicative and graph regularized-incorporated update (SLF-NMGRU) algorithm and further propose the graph regularized nonnegative latent factor analysis (GrNLFA) model. Finally, experiments on a real cryptocurrency transaction network show that the proposed method improves both the accuracy and the computational efficiency.


【3】 Robust Graph Neural Networks using Weighted Graph Laplacian
Link: https://arxiv.org/abs/2208.01853

Authors: Bharat Runwal, Vivek, Sandeep Kumar
Affiliation: Indian Institute of Technology, Delhi; Samsung Research Bangalore; Department of Electrical Engineering, IIT Delhi, India
Note: Accepted at IEEE International Conference on Signal Processing and Communications (SPCOM), 2022
Abstract: Graph neural networks (GNNs) are achieving remarkable performance in a variety of application domains. However, GNNs are vulnerable to noise and adversarial attacks in the input data. Making GNNs robust against noise and adversarial attacks is an important problem. The existing defense methods for GNNs are computationally demanding and are not scalable. In this paper, we propose a generic framework for robustifying GNNs known as Weighted Laplacian GNN (RWL-GNN). The method combines weighted graph Laplacian learning with the GNN implementation. The proposed method benefits from the positive semi-definiteness property of the Laplacian matrix, feature smoothness, and latent features via formulating a unified optimization framework, which ensures that adversarial/noisy edges are discarded and connections in the graph are appropriately weighted. For demonstration, the experiments are conducted with the graph convolutional neural network (GCNN) architecture; however, the proposed framework is easily amenable to any existing GNN architecture. The simulation results with benchmark datasets establish the efficacy of the proposed method, both in accuracy and computational efficiency. Code can be accessed at https://github.com/Bharat-Runwal/RWL-GNN.
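To make the Laplacian-based objective concrete, here is a minimal sketch of the two quantities the framework builds on: the weighted graph Laplacian L = D - W and the feature-smoothness term tr(X^T L X), which is small when connected nodes have similar features, so down-weighting an edge whose endpoints disagree (a likely adversarial or noisy edge) reduces it. The graph, weights, and features below are toy values, not the paper's setup.

import numpy as np

W = np.array([[0.0, 1.0, 0.2],
              [1.0, 0.0, 0.0],
              [0.2, 0.0, 0.0]])       # symmetric edge weights
L = np.diag(W.sum(axis=1)) - W        # weighted Laplacian, PSD by construction

X = np.array([[1.0, 0.0],
              [1.1, 0.1],
              [-3.0, 2.0]])           # node features; node 2 disagrees with node 0

# tr(X^T L X) equals (1/2) * sum_ij W_ij * ||x_i - x_j||^2
smoothness = np.trace(X.T @ L @ X)
print(smoothness, np.all(np.linalg.eigvalsh(L) >= -1e-9))  # smoothness value, PSD check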


【4】 Link Prediction on Heterophilic Graphs via Disentangled Representation Learning
Link: https://arxiv.org/abs/2208.01820

Authors: Shijie Zhou, Zhimeng Guo, Charu Aggarwal, Xiang Zhang, Suhang Wang
Affiliation: College of Information Sciences and Technology, The Pennsylvania State University, USA; IBM T. J. Watson Research Center, USA
Abstract: Link prediction is an important task that has wide applications in various domains. However, the majority of existing link prediction approaches assume the given graph follows the homophily assumption, and design similarity-based heuristics or representation learning approaches to predict links. However, many real-world graphs are heterophilic graphs, where the homophily assumption does not hold, which challenges existing link prediction methods. Generally, in heterophilic graphs, there are many latent factors causing the link formation, and two linked nodes tend to be similar in one or two factors but might be dissimilar in other factors, leading to low overall similarity. Thus, one way is to learn a disentangled representation for each node, with each vector capturing the latent representation of the node on one factor, which paves a way to model the link formation in heterophilic graphs, resulting in better node representation learning and link prediction performance. However, the work on this is rather limited. Therefore, in this paper, we study a novel problem of exploring disentangled representation learning for link prediction on heterophilic graphs. We propose a novel framework, DisenLink, which can learn disentangled representations by modeling the link formation and perform factor-aware message-passing to facilitate link prediction. Extensive experiments on 13 real-world datasets demonstrate the effectiveness of DisenLink for link prediction on both heterophilic and homophilic graphs. Our code is available at https://github.com/sjz5202/DisenLink


【5】 Adversarial Camouflage for Node Injection Attack on Graphs
Link: https://arxiv.org/abs/2208.01819

Authors: Shuchang Tao, Qi Cao, Huawei Shen, Yunfan Wu, Liang Hou, Xueqi Cheng
Affiliation: Data Intelligence System Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; CAS Key Laboratory of Network Data Science and Technology, University of Chinese Academy of Sciences, Beijing, China
Abstract: Node injection attacks against Graph Neural Networks (GNNs) have received emerging attention as a practical attack scenario, where the attacker injects malicious nodes instead of modifying node features or edges to degrade the performance of GNNs. Despite the initial success of node injection attacks, we find that the nodes injected by existing methods are easily distinguished from the original normal nodes by defense methods, limiting their attack performance in practice. To solve the above issues, we devote ourselves to the camouflaged node injection attack, i.e., camouflaging injected malicious nodes (structure/attributes) as normal ones that appear legitimate/imperceptible to defense methods. The non-Euclidean nature of graph data and the lack of human priors bring great challenges to the formalization, implementation, and evaluation of camouflage on graphs. In this paper, we first propose and formulate the camouflage of injected nodes from both the fidelity and diversity of the ego networks centered around injected nodes. Then, we design an adversarial CAmouflage framework for Node injection Attack, namely CANA, to improve the camouflage while ensuring the attack performance. Several novel indicators for graph camouflage are further designed for a comprehensive evaluation. Experimental results demonstrate that when equipping existing node injection attack methods with our proposed CANA framework, the attack performance against defense methods as well as node camouflage is significantly improved.


【6】 Analysis of the Spatio-temporal Dynamics of COVID-19 in Massachusetts via Spectral Graph Wavelet Theory
Link: https://arxiv.org/abs/2208.01749

Authors: Ru Geng, Yixian Gao, Hongkun Zhang, Jian Zu
Affiliation: School of Mathematics and Statistics, Northeast Normal University
Note: Accepted by IEEE Transactions on Signal and Information Processing over Networks
Abstract: The rapid spread of COVID-19 disease has had a significant impact on the world. In this paper, we study COVID-19 data interpretation and visualization using open-data sources for 351 cities and towns in Massachusetts from December 6, 2020 to September 25, 2021. Because cities are embedded in rather complex transportation networks, we construct a spatio-temporal dynamic graph model, in which a graph attention neural network is utilized as a deep learning method to learn the pandemic transition probability among major cities in Massachusetts. Using the spectral graph wavelet transform (SGWT), we process the COVID-19 data on the dynamic graph, which enables us to design effective tools to analyze and detect spatio-temporal patterns in the pandemic spreading. We design a new node classification method, which effectively identifies the anomaly cities based on spectral graph wavelet coefficients. It can assist administrations or public health organizations in monitoring the spread of the pandemic and developing preventive measures. Unlike most work focusing on the evolution of confirmed cases over time, we focus on the spatio-temporal patterns of pandemic evolution among cities. Through the data analysis and visualization, a better understanding of the epidemiological development at the city level is obtained and can be helpful with city-specific surveillance.
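For readers unfamiliar with the SGWT, the sketch below computes wavelet coefficients of a graph signal by filtering its Laplacian spectrum with a band-pass kernel g(s*lambda). The particular kernel, the toy graph, and the signal are illustrative stand-ins, not the paper's dynamic COVID-19 graph or exact kernel.

import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
lam, U = np.linalg.eigh(L)            # Laplacian eigenvalues / eigenvectors

f = np.array([1.0, 0.9, 0.8, -2.0])   # graph signal (e.g., case counts per town)

def sgwt(f, scale):
    # Wavelet coefficients: filter the spectrum with g(s*lambda), here a
    # heat-like band-pass kernel chosen only for illustration.
    g = scale * lam * np.exp(-scale * lam)
    return U @ (g * (U.T @ f))

for s in (0.5, 1.0, 2.0):
    print(s, np.round(sgwt(f, s), 3))  # the outlier node stands out across scales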


【7】 V-Coder: Adaptive AutoEncoder for Semantic Disclosure in Knowledge Graphs
Link: https://arxiv.org/abs/2208.01735

Authors: Christian M. M. Frey, Matthias Schubert
Affiliation: Ludwig-Maximilians-Universität, Institute for Informatics, Munich, Germany
Abstract: The Semantic Web and Knowledge Graphs (KGs) have emerged as one of the most important information sources for intelligent systems requiring access to structured knowledge. One of the major challenges is the extraction and processing of unambiguous information from textual data. Following human perception, overlapping semantic linkages between two named entities become clear due to our common sense about the context a relationship lives in, which is not the case when we look at it from the automatically driven process of a machine. In this work, we are interested in the problem of relational resolution within the scope of KGs, i.e., we investigate the inherent semantics of relationships between entities within a network. We propose a new adaptive AutoEncoder, called V-Coder, to identify relations that inherently connect entities from different domains. Those relations can be considered ambiguous and are candidates for disentanglement. Similar to Adaptive Resonance Theory (ART), our model learns new patterns from the KG by increasing units in a competitive layer, without discarding previously observed patterns, while learning the quality of each relation separately. The evaluation on the real-world datasets Freebase, Yago and NELL shows that V-Coder is not only able to recover links from corrupted input data, but also that the semantic disclosure of relations in a KG shows a tendency to improve link prediction. A semantic evaluation wraps the evaluation up.


【8】 Curvature-informed multi-task learning for graph networks
Link: https://arxiv.org/abs/2208.01684

Authors: Alexander New, Michael J. Pekala, Nam Q. Le, Janna Domenico, Christine D. Piatko, Christopher D. Stiles
Affiliation: Johns Hopkins University Applied Physics Laboratory
Note: Published at the ICML 2022 AI for Science workshop: this https URL
Abstract: Properties of interest for crystals and molecules, such as band gap, elasticity, and solubility, are generally related to each other: they are governed by the same underlying laws of physics. However, when state-of-the-art graph neural networks attempt to predict multiple properties simultaneously (the multi-task learning (MTL) setting), they frequently underperform a suite of single-property predictors. This suggests graph networks may not be fully leveraging these underlying similarities. Here we investigate a potential explanation for this phenomenon: the curvature of each property's loss surface varies significantly, leading to inefficient learning. This difference in curvature can be assessed by looking at spectral properties of the Hessians of each property's loss function, which is done in a matrix-free manner via randomized numerical linear algebra. We evaluate our hypothesis on two benchmark datasets (Materials Project (MP) and QM8) and consider how these findings can inform the training of novel multi-task learning models.


【9】 Maximal Independent Vertex Set applied to Graph Pooling
Link: https://arxiv.org/abs/2208.01648

Authors: Stevan Stanovic, Benoit Gaüzère, Luc Brun
Affiliation: Normandie Univ, ENSICAEN, CNRS, UNICAEN, GREYC UMR, Caen, France; Normandie Univ, INSA de Rouen, Univ. Rouen, Univ. Le Havre, LITIS EA, Saint-Étienne-du-Rouvray, France
Abstract: Convolutional neural networks (CNNs) have enabled major advances in image classification through convolution and pooling. In particular, image pooling transforms a connected discrete grid into a reduced grid with the same connectivity and allows reduction functions to take into account all the pixels of an image. However, a pooling satisfying such properties does not exist for graphs. Indeed, some methods are based on a vertex selection step which induces an important loss of information. Other methods learn a fuzzy clustering of vertex sets which induces almost complete reduced graphs. We propose to overcome both problems using a new pooling method, named MIVSPool. This method is based on a selection of vertices, called surviving vertices, using a Maximal Independent Vertex Set (MIVS), and an assignment of the remaining vertices to the survivors. Consequently, our method does not discard any vertex information nor artificially increase the density of the graph. Experimental results show an increase in accuracy for graph classification on various standard datasets.
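The selection step is easy to illustrate: a maximal independent vertex set gives mutually non-adjacent survivors, and maximality guarantees every dropped vertex has a surviving neighbour to be assigned to, so no vertex information is discarded. A greedy sketch follows; MIVSPool itself uses a Luby-style parallel selection, and the graph here is a toy.

def maximal_independent_set(adj):
    # Greedy MIVS: pick a vertex, block its neighbours, repeat.
    survivors, blocked = set(), set()
    for v in sorted(adj):              # fixed order keeps the sketch deterministic
        if v not in blocked:
            survivors.add(v)
            blocked.update(adj[v])     # neighbours can no longer be selected
    return survivors

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
mivs = maximal_independent_set(adj)
# Every non-survivor is assigned to some surviving neighbour.
assignment = {v: next(iter(adj[v] & mivs)) for v in adj if v not in mivs}
print(mivs, assignment)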


Transformer (2 papers)

【1】 Two-Stream Transformer Architecture for Long Video Understanding
Link: https://arxiv.org/abs/2208.01753

Authors: Edward Fish, Jon Weinbren, Andrew Gilbert
Affiliation: The University of Surrey
Abstract: Pure vision transformer architectures are highly effective for short video classification and action recognition tasks. However, due to the quadratic complexity of self-attention and the lack of inductive bias, transformers are resource intensive and suffer from data inefficiencies. Long-form video understanding tasks amplify data and memory efficiency problems in transformers, making current approaches unfeasible to implement on data- or memory-restricted domains. This paper introduces an efficient Spatio-Temporal Attention Network (STAN), which uses a two-stream transformer architecture to model dependencies between static image features and temporal contextual features. Our proposed approach can classify videos up to two minutes in length on a single GPU, is data efficient, and achieves SOTA performance on several long video understanding tasks.


【2】 Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 Diagnosis
Link: https://arxiv.org/abs/2208.01843

Authors: Xiao Qi, David J. Foran, John L. Nosher, Ilker Hacihaliloglu
Affiliation: Department of Electrical and Computer Engineering, Rutgers University, NJ, USA; Department of Radiology, The University of British Columbia, BC, Canada; Department of Medicine, The University of British Columbia, BC, Canada
Note: Accepted to the 2022 MICCAI Workshop on Medical Image Learning with Limited and Noisy Data
Abstract: The role of chest X-ray (CXR) imaging, due to being more cost-effective, widely available, and having a faster acquisition time compared to CT, has evolved during the COVID-19 pandemic. To improve the diagnostic performance of CXR imaging, a growing number of studies have investigated whether supervised deep learning methods can provide additional support. However, supervised methods rely on a large number of labeled radiology images, which is a time-consuming and complex procedure requiring expert clinician input. Due to the relative scarcity of COVID-19 patient data and the costly labeling process, self-supervised learning methods have gained momentum and have been proposed to achieve results comparable to fully supervised learning approaches. In this work, we study the effectiveness of self-supervised learning in the context of diagnosing COVID-19 disease from CXR images. We propose a multi-feature Vision Transformer (ViT) guided architecture where we deploy a cross-attention mechanism to learn information from both original CXR images and corresponding enhanced local-phase CXR images. We demonstrate that the performance of the baseline self-supervised learning models can be further improved by leveraging the local phase-based enhanced CXR images. By using 10% labeled CXR scans, the proposed model achieves 91.10% and 96.21% overall accuracy tested on a total of 35,483 CXR images of healthy (8,851), regular pneumonia (6,045), and COVID-19 (18,159) scans, and shows significant improvement over state-of-the-art techniques. Code is available at https://github.com/endiqq/Multi-Feature-ViT


GAN | adversarial | attacks | generation (5 papers)

【1】 Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Link: https://arxiv.org/abs/2208.02204

Authors: Fivos Kalogiannis, Ioannis Anagnostides, Ioannis Panageas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Vaggos Chatziafratis, Stelios Stavroulakis
Affiliation: University of California, Irvine; Carnegie Mellon University; Columbia University; University of California, Santa Cruz
Abstract: Computing Nash equilibrium policies is a central problem in multi-agent reinforcement learning that has received extensive attention both in theory and in practice. However, provable guarantees have been thus far either limited to fully competitive or cooperative scenarios or impose strong assumptions that are difficult to meet in most practical applications. In this work, we depart from those prior results by investigating infinite-horizon adversarial team Markov games, a natural and well-motivated class of games in which a team of identically-interested players, in the absence of any explicit coordination or communication, is competing against an adversarial player. This setting allows for a unifying treatment of zero-sum Markov games and Markov potential games, and serves as a step to model more realistic strategic interactions that feature both competing and cooperative interests. Our main contribution is the first algorithm for computing stationary $\epsilon$-approximate Nash equilibria in adversarial team Markov games with computational complexity that is polynomial in all the natural parameters of the game, as well as $1/\epsilon$. The proposed algorithm is particularly natural and practical, and it is based on performing independent policy gradient steps for each player in the team, in tandem with best responses from the side of the adversary; in turn, the policy for the adversary is then obtained by solving a carefully constructed linear program. Our analysis leverages non-standard techniques to establish the KKT optimality conditions for a nonlinear program with nonconvex constraints, thereby leading to a natural interpretation of the induced Lagrange multipliers. Along the way, we significantly extend an important characterization of optimal policies in adversarial (normal-form) team games due to Von Stengel and Koller (GEB '97).


【2】 Character Generation through Self-Supervised Vectorization
Link: https://arxiv.org/abs/2208.02012

Authors: Gokcen Gokceoglu, Emre Akbas
Affiliation: Department of Computer Engineering, Middle East Technical University
Abstract: The prevalent approach in self-supervised image generation is to operate on pixel-level representations. While this approach can produce high-quality images, it cannot benefit from the simplicity and innate quality of vectorization. Here we present a drawing agent that operates on stroke-level representations of images. At each time step, the agent first assesses the current canvas and decides whether to stop or keep drawing. When a 'draw' decision is made, the agent outputs a program indicating the stroke to be drawn. As a result, it produces a final raster image by drawing the strokes on a canvas, using a minimal number of strokes and dynamically deciding when to stop. We train our agent through reinforcement learning on the MNIST and Omniglot datasets for unconditional generation and parsing (reconstruction) tasks. We utilize our parsing agent for exemplar generation and type-conditioned concept generation in the Omniglot challenge without any further training. We present successful results on all three generation tasks and the parsing task. Crucially, we do not need any stroke-level or vector supervision; we only use raster images for training.


【3】 Zero-Shot Style Transfer for Gesture Animation driven by Text and Speech using Adversarial Disentanglement of Multimodal Style Encoding
Link: https://arxiv.org/abs/2208.01917

Authors: Mireille Fares, Michele Grimaldi, Catherine Pelachaud, Nicolas Obin
Affiliation: Sorbonne University
Abstract: Modeling virtual agents with behavior style is one factor for personalizing human-agent interaction. We propose an efficient yet effective machine learning approach to synthesize gestures driven by prosodic features and text in the style of different speakers, including those unseen during training. Our model performs zero-shot multimodal style transfer driven by multimodal data from the PATS database containing videos of various speakers. We view style as pervasive while speaking: it colors the expressivity of communicative behaviors, while speech content is carried by multimodal signals and text. This disentanglement scheme of content and style allows us to directly infer the style embedding even of speakers whose data are not part of the training phase, without requiring any further training or fine-tuning. The first goal of our model is to generate the gestures of a source speaker based on the content of two modalities, audio and text. The second goal is to condition the source speaker's predicted gestures on the multimodal behavior style embedding of a target speaker. The third goal is to allow zero-shot style transfer of speakers unseen during training without retraining the model. Our system consists of: (1) a speaker style encoder network that learns to generate a fixed-dimensional speaker embedding style from a target speaker's multimodal data, and (2) a sequence-to-sequence synthesis network that synthesizes gestures based on the content of the input modalities of a source speaker and conditioned on the speaker style embedding. We show that our model can synthesize gestures of a source speaker and transfer the knowledge of target speaker style variability to the gesture generation task in a zero-shot setup. We convert the 2D gestures to 3D poses and produce 3D animations. We conduct objective and subjective evaluations to validate our approach and compare it with a baseline.


【4】 Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis
Link: https://arxiv.org/abs/2208.01899

Authors: Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Affiliation: National Key Laboratory for Novel Software Technology, Nanjing University; The Chinese University of Hong Kong, Shenzhen; Shenzhen Research Institute of Big Data
Abstract: Imitation learning learns a policy from expert trajectories. While expert data is believed to be crucial for imitation quality, it was found that a kind of imitation learning approach, adversarial imitation learning (AIL), can have exceptional performance. With as little as only one expert trajectory, AIL can match the expert performance even over a long horizon, on tasks such as locomotion control. There are two mysterious points in this phenomenon. First, why can AIL perform well with only a few expert trajectories? Second, why does AIL maintain good performance despite the length of the planning horizon? In this paper, we theoretically explore these two questions. For a total-variation-distance-based AIL (called TV-AIL), our analysis shows a horizon-free imitation gap $\mathcal{O}(\min\{1, \sqrt{|\mathcal{S}|/N}\})$ on a class of instances abstracted from locomotion control tasks. Here $|\mathcal{S}|$ is the state space size of a tabular Markov decision process, and $N$ is the number of expert trajectories. We emphasize two important features of our bound. First, this bound is meaningful in both small and large sample regimes. Second, this bound suggests that the imitation gap of TV-AIL is at most 1 regardless of the planning horizon. Therefore, this bound can explain the empirical observation. Technically, we leverage the structure of multi-stage policy optimization in TV-AIL and present a new stage-coupled analysis via dynamic programming.


【5】 Subject-Specific Lesion Generation and Pseudo-Healthy Synthesis for Multiple Sclerosis Brain Images
Link: https://arxiv.org/abs/2208.02135

Authors: Berke Doga Basaran, Mengyun Qiao, Paul M. Matthews, Wenjia Bai
Affiliation: Department of Computing, Imperial College London, London, UK; Data Science Institute, Imperial College London, London, UK; Department of Brain Sciences, Imperial College London, London, UK; UK Dementia Research Institute, Imperial College London, London, UK
Note: 13 pages, 6 figures, 2022 MICCAI SASHIMI (Simulation and Synthesis in Medical Imaging) Workshop paper
Abstract: Understanding the intensity characteristics of brain lesions is key for defining image-based biomarkers in neurological studies and for predicting disease burden and outcome. In this work, we present a novel foreground-based generative method for modelling the local lesion characteristics, which can both generate synthetic lesions on healthy images and synthesize subject-specific pseudo-healthy images from pathological images. Furthermore, the proposed method can be used as a data augmentation module to generate synthetic images for training brain image segmentation networks. Experiments on multiple sclerosis (MS) brain images acquired with magnetic resonance imaging (MRI) demonstrate that the proposed method can generate highly realistic pseudo-healthy and pseudo-pathological brain images. Data augmentation using the synthetic images improves brain image segmentation performance compared to traditional data augmentation methods as well as a recent lesion-aware data augmentation technique, CarveMix. The code will be released at https://github.com/dogabasaran/lesion-synthesis.


Semi-/weakly-/un-/fully-supervised | uncertainty | active learning (4 papers)

【1】 Edge-Based Self-Supervision for Semi-Supervised Few-Shot Microscopy Image Cell Segmentation
Link: https://arxiv.org/abs/2208.02105

Authors: Youssef Dawoud, Katharina Ernst, Gustavo Carneiro, Vasileios Belagiannis
Affiliation: Universität Ulm, Ulm, Germany; Ulm University Medical Center, Ulm, Germany; The University of Adelaide, Adelaide, Australia; Otto von Guericke University, Magdeburg, Germany
Note: Accepted by MOVI 2022
Abstract: Deep neural networks currently deliver promising results for microscopy image cell segmentation, but they require large-scale labelled databases, which is a costly and time-consuming process. In this work, we relax the labelling requirement by combining self-supervised with semi-supervised learning. We propose the prediction of edge-based maps for self-supervising the training on the unlabelled images, which is combined with the supervised training of a small number of labelled images for learning the segmentation task. In our experiments, we evaluate on a few-shot microscopy image cell segmentation benchmark and show that only a small number of annotated images, e.g. 10% of the original training set, is enough for our approach to reach similar performance as with the fully annotated databases on 1- to 10-shots. Our code and trained models are made publicly available.
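As a rough illustration of the self-supervision signal, the sketch below derives an edge-map pseudo-label from an image itself with a Sobel filter, so no annotation is required; a network can be trained to predict this map on unlabelled images while a small labelled set supervises the segmentation head. The paper's exact edge-map construction and network are not reproduced, and the image here is synthetic.

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
image = rng.random((64, 64)).astype(np.float32)   # stand-in microscopy image

gx = ndimage.sobel(image, axis=0)                 # horizontal gradient
gy = ndimage.sobel(image, axis=1)                 # vertical gradient
edge_map = np.hypot(gx, gy)
edge_map /= edge_map.max()                        # normalized pseudo-label in [0, 1]

print(edge_map.shape, float(edge_map.min()), float(edge_map.max()))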


【2】 Unsupervised Discovery of Semantic Concepts in Satellite Imagery with Style-based Wavelet-driven Generative Models
Link: https://arxiv.org/abs/2208.02089

Authors: Nikos Kostagiolas, Mihalis A. Nicolaou, Yannis Panagakis
Affiliation: The Cyprus Institute; National and Kapodistrian University of Athens
Note: 11 pages, 5 figures, accepted at SETN 2022
Abstract: In recent years, considerable advancements have been made in the area of Generative Adversarial Networks (GANs), particularly with the advent of style-based architectures that address many key shortcomings, both in terms of modeling capabilities and network interpretability. Despite these improvements, the adoption of such approaches in the domain of satellite imagery is not straightforward. Typical vision datasets used in generative tasks are well-aligned and annotated, and exhibit limited variability. In contrast, satellite imagery exhibits great spatial and spectral variability and a wide presence of fine, high-frequency details, while the tedious nature of annotating satellite imagery leads to annotation scarcity, further motivating developments in unsupervised learning. In this light, we present the first pre-trained style- and wavelet-based GAN model that can readily synthesize a wide gamut of realistic satellite images in a variety of settings and conditions, while also preserving high-frequency information. Furthermore, we show that by analyzing the intermediate activations of our network, one can discover a multitude of interpretable semantic directions that facilitate the guided synthesis of satellite images in terms of high-level concepts (e.g., urbanization) without using any form of supervision. Via a set of qualitative and quantitative experiments, we demonstrate the efficacy of our framework, in terms of suitability for downstream tasks (e.g., data augmentation), quality of synthetic imagery, as well as generalization capabilities to unseen datasets.


【3】 Exploration with Model Uncertainty at Extreme Scale in Real-Time Bidding
Link: https://arxiv.org/abs/2208.01951

Authors: Jan Hartman, Davorin Kopič
Affiliation: Zemanta, an Outbrain company, Ljubljana, Slovenia
Abstract: In this work, we present a scalable and efficient system for exploring the supply landscape in real-time bidding. The system directs exploration based on the predictive uncertainty of models used for click-through rate prediction and works in a high-throughput, low-latency environment. Through online A/B testing, we demonstrate that exploration with model uncertainty has a positive impact on model performance and business KPIs.
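A minimal sketch of uncertainty-directed exploration: score each supply source by the disagreement of an ensemble of CTR predictors and spend the exploration budget on the most uncertain ones. The ensemble, sources, and budget below are synthetic assumptions; the production system derives uncertainty from its own CTR models.

import numpy as np

rng = np.random.default_rng(1)
n_models, n_sources = 5, 8
ctr_preds = rng.uniform(0.001, 0.05, size=(n_models, n_sources))  # ensemble CTR predictions

uncertainty = ctr_preds.std(axis=0)          # model disagreement per supply source
explore_budget = 3
explore = np.argsort(-uncertainty)[:explore_budget]
print("explore sources:", explore)           # highest-uncertainty sources first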


【4】 Success of Uncertainty-Aware Deep Models Depends on Data Manifold Geometry
Link: https://arxiv.org/abs/2208.01705

Authors: Mark Penrod, Harrison Termotto, Varshini Reddy, Jiayu Yao, Finale Doshi-Velez, Weiwei Pan
Affiliation: Harvard University
Abstract: For responsible decision making in safety-critical settings, machine learning models must effectively detect and process edge-case data. Although existing works show that predictive uncertainty is useful for these tasks, it is not evident from the literature which uncertainty-aware models are best suited for a given dataset. Thus, we compare six uncertainty-aware deep learning models on a set of edge-case tasks: robustness to adversarial attacks as well as out-of-distribution and adversarial detection. We find that the geometry of the data sub-manifold is an important factor in determining the success of various models. Our finding suggests an interesting direction in the study of uncertainty-aware deep learning models.


Transfer | zero/few/one-shot | adaptation (3 papers)

【1】 AdaCat: Adaptive Categorical Discretization for Autoregressive Models
Link: https://arxiv.org/abs/2208.02246

Authors: Qiyang Li, Ajay Jain, Pieter Abbeel
Affiliation: University of California, Berkeley, CA, USA
Note: Uncertainty in Artificial Intelligence (UAI) 2022. 13 pages, 4 figures
Abstract: Autoregressive generative models can estimate complex continuous data distributions, like trajectory rollouts in an RL environment, image intensities, and audio. Most state-of-the-art models discretize continuous data into several bins and use categorical distributions over the bins to approximate the continuous data distribution. The advantage is that categorical distributions can easily express multiple modes and are straightforward to optimize. However, such an approximation cannot express sharp changes in density without using significantly more bins, making it parameter inefficient. We propose an efficient, expressive, multimodal parameterization called Adaptive Categorical Discretization (AdaCat). AdaCat discretizes each dimension of an autoregressive model adaptively, which allows the model to allocate density to fine intervals of interest, improving parameter efficiency. AdaCat generalizes both categoricals and quantile-based regression. AdaCat is a simple add-on to any discretization-based distribution estimator. In experiments, AdaCat improves density estimation for real-world tabular data, images, audio, and trajectories, and improves planning in model-based offline RL.
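The core object is easy to write down: a categorical over bins with learned, unequal widths, where mass p_k spread over width w_k gives density p_k / w_k, so narrow bins express sharp peaks that a uniform discretization would need many more bins to capture. The sketch below evaluates such a density on [0, 1] with hand-picked, not learned, parameters.

import numpy as np

w = np.array([0.45, 0.05, 0.05, 0.45])      # bin widths, sum to 1
p = np.array([0.10, 0.40, 0.40, 0.10])      # bin probabilities, sum to 1
edges = np.concatenate([[0.0], np.cumsum(w)])

def log_density(x):
    # Find each point's bin, then density = p_k / w_k within that bin.
    k = np.searchsorted(edges, x, side="right") - 1
    k = np.clip(k, 0, len(w) - 1)
    return np.log(p[k]) - np.log(w[k])

xs = np.array([0.2, 0.47, 0.5, 0.9])
print(log_density(xs))                      # sharp density peak over the narrow bins
# Sanity check: the density integrates to one, since sum_k (p_k / w_k) * w_k = 1.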


【2】 Interpretable bilinear attention network with domain adaptation improves drug-target prediction
Link: https://arxiv.org/abs/2208.02194

Authors: Peizhen Bai, Filip Miljković, Bino John, Haiping Lu
Affiliation: Department of Computer Science, University of Sheffield, Sheffield, United Kingdom; Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg, Sweden
Note: 16 pages, 6 figures
Abstract: Predicting drug-target interaction is key for drug discovery. Recent deep learning-based methods show promising performance, but two challenges remain: (i) how to explicitly model and learn local interactions between drugs and targets for better prediction and interpretation; (ii) how to generalize prediction performance on novel drug-target pairs from different distributions. In this work, we propose DrugBAN, a deep bilinear attention network (BAN) framework with domain adaptation to explicitly learn pair-wise local interactions between drugs and targets, and adapt on out-of-distribution data. DrugBAN works on drug molecular graphs and target protein sequences to perform prediction, with conditional domain adversarial learning to align learned interaction representations across different distributions for better generalization on novel drug-target pairs. Experiments on three benchmark datasets under both in-domain and cross-domain settings show that DrugBAN achieves the best overall performance against five state-of-the-art baselines. Moreover, visualizing the learned bilinear attention map provides interpretable insights from prediction results.
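To show the shape of the interpretable object, here is a minimal single-head sketch of a pairwise bilinear attention map between drug atom features and protein residue features, one score per (atom, residue) pair. Dimensions, the projections U and V, and the pooling step are simplifications for illustration, not DrugBAN's exact BAN module.

import numpy as np

rng = np.random.default_rng(0)
n_atoms, n_res, d, k = 4, 6, 8, 5
H_d = rng.normal(size=(n_atoms, d))          # drug graph node features
H_t = rng.normal(size=(n_res, d))            # protein sequence features
U, V = rng.normal(size=(d, k)), rng.normal(size=(d, k))

logits = (H_d @ U) @ (H_t @ V).T             # (n_atoms, n_res) pair scores
A = np.exp(logits - logits.max())
A /= A.sum()                                 # softmax over all (atom, residue) pairs

joint = (H_d @ U).T @ A @ (H_t @ V)          # attention-pooled joint representation
print(A.shape, joint.shape)                  # (4, 6) interpretable map, (k, k) feature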


【3】 Adaptive Domain Generalization via Online Disagreement Minimization
Link: https://arxiv.org/abs/2208.01996

Authors: Xin Zhang, Ying-Cong Chen
Affiliation: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
Note: 11 pages, 4 figures
Abstract: Deep neural networks suffer from significant performance deterioration when there exists a distribution shift between deployment and training. Domain Generalization (DG) aims to safely transfer a model to unseen target domains by relying only on a set of source domains. Although various DG approaches have been proposed, a recent study, DomainBed, reveals that most of them do not beat the simple Empirical Risk Minimization (ERM). To this end, we propose a general framework that is orthogonal to existing DG algorithms and could improve their performance consistently. Unlike previous DG works that rely on a static source model hoped to be a universal one, our proposed AdaODM adaptively modifies the source model at test time for different target domains. Specifically, we create multiple domain-specific classifiers upon a shared domain-generic feature extractor. The feature extractor and classifiers are trained in an adversarial way, where the feature extractor embeds the input samples into a domain-invariant space, and the multiple classifiers capture the distinct decision boundaries that each of them relates to a specific source domain. During testing, distribution differences between target and source domains can be effectively measured by leveraging prediction disagreement among the source classifiers. By fine-tuning the source model to minimize the disagreement at test time, target-domain features are well aligned to the invariant feature space. We verify AdaODM on two popular DG methods, namely ERM and CORAL, and four DG benchmarks, namely VLCS, PACS, OfficeHome, and TerraIncognita. The results show AdaODM stably improves the generalization capacity on unseen domains and achieves state-of-the-art performance.
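The test-time step can be sketched in a few lines: freeze several classifier heads, measure how much their predictions disagree on an unlabeled target batch, and fine-tune only the shared extractor to shrink that disagreement. Everything below (the tiny architecture, the KL-to-mean disagreement measure, the random data) is an illustrative stand-in for AdaODM's setup, not its implementation.

import torch
import torch.nn as nn

torch.manual_seed(0)
extractor = nn.Sequential(nn.Linear(16, 32), nn.ReLU())          # shared, adapted
heads = nn.ModuleList([nn.Linear(32, 3) for _ in range(2)])      # per-source, frozen
opt = torch.optim.SGD(extractor.parameters(), lr=1e-2)           # only extractor updates

x_target = torch.randn(64, 16)                # unlabeled target-domain batch
for _ in range(10):
    feats = extractor(x_target)
    probs = [head(feats).softmax(dim=-1) for head in heads]
    mean_p = torch.stack(probs).mean(dim=0)
    # Disagreement: average KL divergence of each head's prediction from the mean.
    loss = sum(((p * (p.log() - mean_p.log())).sum(dim=-1)).mean() for p in probs)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final disagreement: {loss.item():.4f}")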


Reinforcement learning (3 papers)

【1】 A Lightweight Transmission Parameter Selection Scheme Using Reinforcement Learning for LoRaWAN
Link: https://arxiv.org/abs/2208.01824

Authors: Aohan Li, Ikumi Urabe, Minoru Fujisawa, So Hasegawa, Hiroyuki Yasuda, Song-Ju Kim, Mikio Hasegawa
Affiliation: National Institute of Information and Communications Technology (NICT), and the Department of Electrical Engineering, Tokyo University of Science
Note: 14 pages, 12 figures, 8 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Abstract: The number of IoT devices is predicted to reach 125 billion by 2023. The growth of IoT devices will intensify the collisions between devices, degrading communication performance. Selecting appropriate transmission parameters, such as channel and spreading factor (SF), can effectively reduce the collisions between long-range (LoRa) devices. However, most of the schemes proposed in the current literature are not easy to implement on an IoT device with limited computational complexity and memory. To solve this issue, we propose a lightweight transmission-parameter selection scheme, i.e., a joint channel and SF selection scheme using reinforcement learning for low-power wide-area networking (LoRaWAN). In the proposed scheme, appropriate transmission parameters can be selected by simple four-arithmetic operations using only Acknowledge (ACK) information. Additionally, we theoretically analyze the computational complexity and memory requirement of our proposed scheme, verifying that it can select transmission parameters with extremely low computational complexity and memory requirements. Moreover, a large number of experiments were conducted on LoRa devices in the real world to evaluate the effectiveness of our proposed scheme. The experimental results demonstrate the following main phenomena. (1) Compared to other lightweight transmission-parameter selection schemes, collisions between LoRa devices can be efficiently avoided by our proposed scheme in LoRaWAN, irrespective of changes in the available channels. (2) The frame success rate (FSR) can be improved by selecting access channels and using SFs, as opposed to only selecting access channels. (3) Since interference exists between adjacent channels, FSR and fairness can be improved by increasing the interval between adjacent available channels.
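In the spirit of the proposed scheme, the sketch below keeps one preference value per (channel, SF) pair and updates it from ACK feedback using only basic arithmetic, which is what makes such a rule viable on a constrained IoT device. The update constants, exploration rate, and the toy collision model are assumptions for illustration, not the paper's algorithm.

import random

random.seed(0)
params = [(ch, sf) for ch in range(3) for sf in (7, 8, 9)]
pref = {p: 0.0 for p in params}              # one preference per (channel, SF)

def ack_received(p):
    # Toy environment: channel 1 with SF 8 collides least.
    return random.random() < (0.9 if p == (1, 8) else 0.3)

for _ in range(500):
    choice = max(pref, key=pref.get)         # greedy on preference
    if random.random() < 0.1:                # occasional exploration
        choice = random.choice(params)
    ack = ack_received(choice)
    pref[choice] += 1.0 if ack else -1.0     # ACK-driven arithmetic update
    pref[choice] *= 0.99                     # slow forgetting

print(max(pref, key=pref.get))               # best (channel, SF) after learning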


【2】 Digital Twin-Assisted Efficient Reinforcement Learning for Edge Task Scheduling
Link: https://arxiv.org/abs/2208.01781

Authors: Xiucheng Wang, Longfei Ma, Haocheng Li, Zhisheng Yin, Tom Luan, Nan Cheng
Affiliation: School of Telecommunications Engineering, Xidian University, Xi'an, China; School of Cyber Engineering, Xidian University, Xi'an, China
Abstract: Task scheduling is a critical problem when one user offloads multiple different tasks to an edge server. When a user has multiple tasks to offload and only one task can be transmitted to the server at a time, and the server processes tasks according to the transmission order, the problem is NP-hard. However, it is difficult for traditional optimization methods to quickly obtain the optimal solution, while approaches based on reinforcement learning face the challenge of an excessively large action space and slow convergence. In this paper, we propose a Digital Twin (DT)-assisted RL-based task scheduling method in order to improve the performance and convergence of the RL. We use DT to simulate the results of different decisions made by the agent, so that one agent can try multiple actions at a time, or, similarly, multiple agents can interact with the environment in parallel in DT. In this way, the exploration efficiency of RL can be significantly improved via DT, and thus RL converges faster and local optimality is less likely to happen. Particularly, two algorithms are designed to make task scheduling decisions, i.e., DT-assisted asynchronous Q-learning (DTAQL) and DT-assisted exploring Q-learning (DTEQL). Simulation results show that both algorithms significantly improve the convergence speed of Q-learning by increasing the exploration efficiency.
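The DT idea in miniature: because a digital twin can simulate the outcome of a decision, the agent can evaluate several candidate actions per step and update all of their values at once, instead of paying for a single real interaction. The toy three-task scheduling environment and tabular update below are illustrative assumptions, not DTAQL or DTEQL themselves.

import random

random.seed(0)
tasks = [2, 5, 1]                           # processing times of 3 tasks

def twin_reward(order):
    # Twin simulation: negative total completion time of a transmission order.
    t, total = 0, 0
    for i in order:
        t += tasks[i]
        total += t
    return -total

orders = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
Q = {o: 0.0 for o in orders}
alpha = 0.5

for _ in range(50):
    tried = random.sample(orders, k=3)      # twin evaluates 3 actions per step
    for o in tried:
        Q[o] += alpha * (twin_reward(o) - Q[o])

print(max(Q, key=Q.get))                    # shortest-processing-time order (2, 0, 1) wins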


【3】 Deep Reinforcement Learning for Multi-Agent Interaction
Link: https://arxiv.org/abs/2208.01769

Authors: Ibrahim H. Ahmed, Cillian Brewitt, Ignacio Carlucho, Filippos Christianos, Mhairi Dunion, Elliot Fosong, Samuel Garcin, Shangmin Guo, Balint Gyevnar, Trevor McInroe, Georgios Papoudakis, Arrasy Rahman, Lukas Schäfer, Massimiliano Tamborski, Giuseppe Vecchio, Cheng Wang, Stefano V. Albrecht
Affiliation: Autonomous Agents Research Group, School of Informatics, University of Edinburgh, United Kingdom
Note: Published in AI Communications Special Issue on Multi-Agent Systems Research in the UK
Abstract: The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning. Research problems include scalable learning of coordinated agent policies and inter-agent communication; reasoning about the behaviours, goals, and composition of other agents from limited observations; and sample-efficient learning based on intrinsic motivation, curriculum learning, causal inference, and representation learning. This article provides a broad overview of the ongoing research portfolio of the group and discusses open problems for future directions.


Medical (4 papers)

【1】 Empirical Study of Overfitting in Deep FNN Prediction Models for Breast Cancer Metastasis
Link: https://arxiv.org/abs/2208.02150

Authors: Chuhan Xu, Pablo Coen-Pirani, Xia Jiang
Abstract: Overfitting is defined as the fact that the current model fits a specific dataset perfectly, resulting in weakened generalization, which may ultimately affect the accuracy in predicting future data. In this research we used an EHR dataset concerning breast cancer metastasis to study overfitting of deep feedforward neural network (FNN) prediction models. We included 11 hyperparameters of the deep FNN models and took an empirical approach to study how each of these hyperparameters affects both prediction performance and overfitting when given a large range of values. We also studied how some of the interesting pairs of hyperparameters interact to influence model performance and overfitting. The 11 hyperparameters we studied are: activation function, weight initializer, number of hidden layers, learning rate, momentum, decay, dropout rate, batch size, epochs, L1, and L2. Our results show that most of the single hyperparameters are either negatively or positively correlated with model prediction performance and overfitting. In particular, we found that overfitting overall tends to correlate negatively with learning rate, decay, batch size, and L2, but tends to correlate positively with momentum, epochs, and L1. According to our results, learning rate, decay, and batch size may have a more significant impact on both overfitting and prediction performance than most of the other hyperparameters, including L1, L2, and dropout rate, which were designed for minimizing overfitting. We also found some interesting interacting pairs of hyperparameters, such as learning rate and momentum, learning rate and decay, and batch size and epochs. Keywords: deep learning, overfitting, prediction, grid search, feedforward neural networks, breast cancer metastasis.
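A minimal sketch of the paper's empirical procedure: sweep hyperparameter values of a feed-forward network and measure overfitting as the train/test accuracy gap. Synthetic data and a two-parameter grid stand in for the EHR dataset and the 11 hyperparameters studied.

from itertools import product

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for lr, l2 in product([1e-3, 1e-2], [1e-4, 1e-1]):
    clf = MLPClassifier(hidden_layer_sizes=(64, 64), learning_rate_init=lr,
                        alpha=l2, max_iter=300, random_state=0)
    clf.fit(X_tr, y_tr)
    gap = clf.score(X_tr, y_tr) - clf.score(X_te, y_te)   # overfitting proxy
    print(f"lr={lr:<6} L2={l2:<6} train/test gap={gap:.3f}")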


【2】 Cross-lingual Approaches for the Detection of Adverse Drug Reactions in  German from a Patient's Perspective
标题: 从患者角度看德语中药物不良反应的跨语言检测方法
链接:https://arxiv.org/abs/2208.02031

作者:Lisa Raithel,Philippe Thomas,Roland Roller,Oliver Sapina,Sebastian Möller,Pierre Zweigenbaum
机构:Deutsches Forschungszentrum f¨ur K¨unstliche Intelligenz (DFKI) Berlin, Berlin, Germany, Technische Universit¨at Berlin, Berlin, Germany, Universit´e Paris-Saclay, CNRS, Laboratoire interdisciplinaire des sciences du num´erique (LISN),  Orsay, France
备注:Accepted at LREC 2022
摘要:在这项工作中,我们提出了首个用于德语药物不良反应(ADR)检测的患者生成内容语料库。该数据由来自一个德国患者论坛的4,169个带二元标注的文档组成,用户在该论坛上谈论健康问题并从医生那里获得建议。正如该领域社交媒体数据中常见的那样,语料库的类别标签非常不平衡。这一点加上高度的主题不平衡使其成为一个非常有挑战性的数据集,因为通常相同的症状可能有多种原因,而且不一定都与服药有关。我们希望鼓励在ADR检测领域开展更多的多语言研究,并基于多语言模型,使用不同的零样本和小样本学习方法进行了初步的二元分类实验。先在英文患者论坛数据上对XLM-RoBERTa进行微调、再在新的德语数据上继续微调后,我们在阳性类别上获得了37.52的F1得分。我们将数据集和模型公开给社区。
摘要:In this work, we present the first corpus for German Adverse Drug Reaction (ADR) detection in patient-generated content. The data consists of 4,169 binary annotated documents from a German patient forum, where users talk about health issues and get advice from medical doctors. As is common in social media data in this domain, the class labels of the corpus are very imbalanced. This and a high topic imbalance make it a very challenging dataset, since often, the same symptom can have several causes and is not always related to a medication intake. We aim to encourage further multi-lingual efforts in the domain of ADR detection and provide preliminary experiments for binary classification using different methods of zero- and few-shot learning based on a multi-lingual model. When fine-tuning XLM-RoBERTa first on English patient forum data and then on the new German data, we achieve an F1-score of 37.52 for the positive class. We make the dataset and models publicly available for the community.
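
A hedged sketch of the cross-lingual recipe reported above: fine-tune multilingual XLM-RoBERTa on English patient-forum data first, then continue on the German corpus. The two dataset arguments are assumed to be tokenized HuggingFace datasets with binary labels; the hyperparameters are placeholders, not the paper's settings.

```python
# Two-stage cross-lingual fine-tuning sketch (English -> German ADR data).
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

def two_stage_finetune(english_ds, german_ds):
    model = AutoModelForSequenceClassification.from_pretrained(
        "xlm-roberta-base", num_labels=2)      # ADR vs. no-ADR
    for stage, ds in [("english", english_ds), ("german", german_ds)]:
        args = TrainingArguments(output_dir=f"adr-{stage}",
                                 num_train_epochs=3,
                                 per_device_train_batch_size=16)
        Trainer(model=model, args=args, train_dataset=ds).train()
    return model
```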


【3】 Reconstructing Sparse Illicit Supply Networks: A Case Study of Multiplex  Drug Trafficking Networks
标题: 重建稀疏的非法供应网络:多重贩毒网络的案例研究
链接:https://arxiv.org/abs/2208.01739

作者:Jin-Zhu Yu,Mincheng Wu,Gisela Bichler,Felipe Aros-Vera,Jianxi Gao
机构:Department of Computer Science, Rensselaer Polytechnic Institute (RPI), Troy, NY, USA, Network Science and Technology Center, Rensselaer Polytechnic Institute (RPI), Troy, NY, USA
摘要:网络结构为执法机构制定有效策略以阻断非法供应网络提供了关键信息。然而,隐蔽网络的完整结构往往不可得,因此开发推断隐蔽网络更完整结构的方法至关重要。在本文中,我们研究从一份调查报告中提取的真实世界多重贩毒网络。在给定不同比例的观测节点和链路的情况下,我们应用基于EM算法的统计方法(DegEM)以及其他基于结构相似性的方法来重构多重贩毒网络。结果发现,DegEM方法在多个精度指标上取得了最佳的预测性能;而由于网络中节点之间的链路非常稀疏,基于结构相似性的方法在重构贩毒网络时效果不佳。推断出的多重网络可用于:(i)为隐蔽网络监控的决策提供信息,并为收集额外信息以提高重构精度而分配有限的资源;(ii)制定更有效的阻断策略。
摘要:The network structure provides critical information for law enforcement agencies to develop effective strategies to interdict illicit supply networks. However, the complete structure of covert networks is often unavailable, thus it is crucially important to develop approaches to infer a more complete structure of covert networks. In this paper, we work on real-world multiplex drug trafficking networks extracted from an investigation report. A statistical approach built on the EM algorithm (DegEM) as well as other methods based on structural similarity are applied to reconstruct the multiplex drug trafficking network given different fractions of observed nodes and links. It is found that DegEM approach achieves the best predictive performance in terms of several accuracy metrics. Meanwhile, structural similarity-based methods perform poorly in reconstructing the drug trafficking networks due to the sparsity of links between nodes in the network. The inferred multiplex networks can be leveraged to (i) inform the decision-making on monitoring covert networks as well as allocating limited resources for collecting additional information to improve the reconstruction accuracy and (ii) develop more effective interdiction strategies.
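
One of the structural-similarity baselines the paper evaluates can be sketched in a few lines with networkx: score unobserved node pairs by the Jaccard similarity of their neighbourhoods and predict the top-scoring pairs as missing links. The toy graph stands in for one observed layer of the multiplex network.

```python
# Structural-similarity link prediction baseline (Jaccard coefficient).
import networkx as nx

G = nx.karate_club_graph()                 # placeholder "observed" network
candidates = list(nx.non_edges(G))         # unobserved node pairs
scores = sorted(nx.jaccard_coefficient(G, candidates),
                key=lambda t: t[2], reverse=True)
for u, v, s in scores[:5]:                 # top-5 predicted missing links
    print(f"predict link ({u},{v}) score={s:.3f}")
```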


【4】 Internet of Things (IoT) based ECG System for Rural Health Care
标题: 基于物联网的农村医疗心电系统
链接:https://arxiv.org/abs/2208.02226

作者:Md. Obaidur Rahman,Mohammod Abul Kashem,Al-Akhir Nayan,Most. Fahmida Akter,Fazly Rabbi,Marzia Ahmed,Mohammad Asaduzzaman
机构:Department of CSE, DUET, EUB, Gabtoli, Dhaka, Bangladesh, Gazipur, Dhaka, Bangladesh, Department of CSE, EUB, Department of CSE, BUP, Mirpur, Dhaka, Bangladesh, Department of Statistics, JNU, Savar, Dhaka, Bangladesh, Department of Software Engineering
备注:None
摘要:孟加拉国农村地区近30%的人口生活在贫困线以下。此外,由于缺乏现代化的医疗保健相关技术,农村人口的护理和诊断设施有限,因而无法获得适当的医疗保健。从这个角度来看,现代技术可以帮助减轻他们的健康问题。心电图传感工具与人的胸部连接,并通过物联网设备采集所需的心血管数据,这些数据存储在结合了MQTT和HTTP服务器的云端。本研究提出了一种创新的基于物联网的心血管或心脏病患者心电监护方法:采集心电信号参数P、Q、R、S、T并进行预处理和预测,以监测心血管状况,用于进一步的健康管理。机器学习算法用于确定心电信号参数的重要性和错误率,其中逻辑回归模型在训练和测试数据之间拟合出了较好的一致性。通过预测确定PQRST质量的变化及其在心电监护系统中的适用性。综合考虑质量参数的取值,获得了令人满意的结果。所提出的基于物联网的心电图系统将在未来降低心血管疾病的医疗成本和复杂性。
摘要:Nearly 30% of the people in the rural areas of Bangladesh are below the poverty level. Moreover, due to the unavailability of modernized healthcare-related technology, nursing and diagnosis facilities are limited for rural people. Therefore, rural people are deprived of proper healthcare. In this perspective, modern technology can be facilitated to mitigate their health problems. ECG sensing tools are interfaced with the human chest, and requisite cardiovascular data is collected through an IoT device. These data are stored in the cloud incorporates with the MQTT and HTTP servers. An innovative IoT-based method for ECG monitoring systems on cardiovascular or heart patients has been suggested in this study. The ECG signal parameters P, Q, R, S, T are collected, pre-processed, and predicted to monitor the cardiovascular conditions for further health management. The machine learning algorithm is used to determine the significance of ECG signal parameters and error rate. The logistic regression model fitted the better agreements between the train and test data. The prediction has been performed to determine the variation of PQRST quality and its suitability in the ECG Monitoring System. Considering the values of quality parameters, satisfactory results are obtained. The proposed IoT-based ECG system reduces the health care cost and complexity of cardiovascular diseases in the future.
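
A minimal sketch of the prediction step described above, assuming synthetic PQRST-style features in place of the real IoT ECG pipeline: fit a logistic regression and compare train/test agreement.

```python
# Logistic regression on PQRST-derived features (synthetic stand-in data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))              # columns: P, Q, R, S, T amplitudes
y = (X[:, 2] + 0.5 * X[:, 4] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print("train/test agreement:", clf.score(X_tr, y_tr), clf.score(X_te, y_te))
```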


蒸馏|知识提取(1篇)

【1】 KPI-BERT: A Joint Named Entity Recognition and Relation Extraction Model  for Financial Reports
标题: KPI-BERT:一种面向财务报表的联合命名实体识别与关系抽取模型
链接:https://arxiv.org/abs/2208.02140

作者:Lars Hillebrand,Tobias Deußer,Tim Dilmaghani,Bernd Kliem,Rüdiger Loitz,Christian Bauckhage,Rafet Sifa
机构:†Fraunhofer IAIS, Bonn, Germany, ‡University of Bonn, Bonn, Germany, §PricewaterhouseCoopers GmbH, D¨usseldorf, Germany
备注:Accepted at ICPR 2022, 8 pages, 1 figure, 6 tables
摘要:我们提出了KPI-BERT,该系统采用新的命名实体识别(NER)与关系抽取(RE)方法,从真实的德语财务文件中抽取并关联公司的关键绩效指标(KPI),例如"收入"或"利息支出"。具体而言,我们引入了一种基于Transformer双向编码器表示(BERT)的端到端可训练架构,它将递归神经网络(RNN)与条件标签掩码相结合,先顺序标注实体,再对实体间的关系进行分类。该模型还引入了一种可学习的基于RNN的池化机制,并通过显式过滤不可能的关系来融合领域专家知识。我们在一个新的德国财务报告实际数据集上取得了显著更高的预测性能,优于多个强基线,包括一种有竞争力的最先进的基于跨度的实体标注方法。
摘要:We present KPI-BERT, a system which employs novel methods of named entity recognition (NER) and relation extraction (RE) to extract and link key performance indicators (KPIs), e.g. "revenue" or "interest expenses", of companies from real-world German financial documents. Specifically, we introduce an end-to-end trainable architecture that is based on Bidirectional Encoder Representations from Transformers (BERT) combining a recurrent neural network (RNN) with conditional label masking to sequentially tag entities before it classifies their relations. Our model also introduces a learnable RNN-based pooling mechanism and incorporates domain expert knowledge by explicitly filtering impossible relations. We achieve a substantially higher prediction performance on a new practical dataset of German financial reports, outperforming several strong baselines including a competing state-of-the-art span-based entity tagging approach.


推荐(2篇)

【1】 Adapting Triplet Importance of Implicit Feedback for Personalized  Recommendation
标题: 自适应隐式反馈的三重重要度个性化推荐
链接:https://arxiv.org/abs/2208.01709

作者:Haolun Wu,Chen Ma,Yingxue Zhang,Xue Liu,Ruiming Tang,Mark Coates
机构:McGill University, Montréal, Canada, City University of Hong Kong, Hong Kong SAR, Huawei Noah’s Ark Lab, Shenzhen, China
备注:11 pages, 7 figures
摘要:隐式反馈信息在现实系统中普遍存在且易于获取,因此常被用于开发个性化推荐服务。为了有效利用这类信息,大多数研究在构造的训练三元组(用户、正项目、负项目)上采用成对排序方法,旨在为每个用户区分正项目和负项目。然而,这些方法大多对所有训练三元组同等对待,忽略了不同正负项目之间的细微差别。另一方面,尽管有些方法利用用户行为的辅助信息(如停留时间)来刻画这种细微差别,但这类辅助信息很难获得。为了解决上述问题,我们提出了一种名为三元组重要性学习(TIL)的新训练框架,它能自适应地学习训练三元组的重要性分数。我们设计了两种重要性分数生成策略,并将整个过程表述为一个双层优化问题,不需要任何基于规则的设计。我们将所提出的训练过程与几种基于矩阵分解(MF)和图神经网络(GNN)的推荐模型相结合,证明了该框架的兼容性。通过在三个真实数据集上与多种现有方法进行比较,我们证明了所提方法在top-k推荐的Recall@k指标上比现有最佳模型高出3%-21%。
摘要:Implicit feedback is frequently used for developing personalized recommendation services due to its ubiquity and accessibility in real-world systems. In order to effectively utilize such information, most research adopts the pairwise ranking method on constructed training triplets (user, positive item, negative item) and aims to distinguish between positive items and negative items for each user. However, most of these methods treat all the training triplets equally, which ignores the subtle difference between different positive or negative items. On the other hand, even though some other works make use of the auxiliary information (e.g., dwell time) of user behaviors to capture this subtle difference, such auxiliary information is hard to obtain. To mitigate the aforementioned problems, we propose a novel training framework named Triplet Importance Learning (TIL), which adaptively learns the importance score of training triplets. We devise two strategies for the importance score generation and formulate the whole procedure as a bilevel optimization, which does not require any rule-based design. We integrate the proposed training procedure with several Matrix Factorization (MF)- and Graph Neural Network (GNN)-based recommendation models, demonstrating the compatibility of our framework. Via a comparison using three real-world datasets with many state-of-the-art methods, we show that our proposed method outperforms the best existing models by 3-21\% in terms of Recall@k for the top-k recommendation.
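
The core idea can be sketched as an importance-weighted pairwise ranking loss. The paper's bilevel optimization is simplified here to a jointly trained importance scorer; the embedding sizes and scorer architecture are illustrative assumptions, not the paper's exact design.

```python
# Importance-weighted BPR loss over (user, positive, negative) triplets.
import torch
import torch.nn.functional as F

n_users, n_items, d = 100, 200, 32
user_e = torch.nn.Embedding(n_users, d)
item_e = torch.nn.Embedding(n_items, d)
importance = torch.nn.Linear(3 * d, 1)     # triplet -> importance score

def til_loss(u, i_pos, i_neg):
    eu, ep, en = user_e(u), item_e(i_pos), item_e(i_neg)
    bpr = -F.logsigmoid((eu * ep).sum(-1) - (eu * en).sum(-1))  # ranking loss
    w = torch.softmax(importance(torch.cat([eu, ep, en], -1)).squeeze(-1), 0)
    return (w * bpr).sum()                 # weight triplets unequally

u = torch.randint(0, n_users, (64,))
loss = til_loss(u, torch.randint(0, n_items, (64,)),
                torch.randint(0, n_items, (64,)))
loss.backward()
```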


【2】 DeepProphet2 -- A Deep Learning Gene Recommendation Engine
标题: DeepProphet2-—一个深度学习基因推荐引擎
链接:https://arxiv.org/abs/2208.01918

作者:Daniele Brambilla,Davide Maria Giacomini,Luca Muscarnera,Andrea Mazzoleni
机构:TheProphetAI
摘要:机器学习的最新进展为解决生命科学问题创造了新的强大工具。本文旨在讨论由人工智能(AI)执行基因推荐的潜在优势。事实上,基因推荐引擎试图解决这样的问题:如果用户对一组基因感兴趣,那么还有哪些基因可能与起始基因组相关并值得研究?这一任务通过定制的深度学习推荐引擎DeepProphet2(DP2)来解决,全球研究人员可通过www.generecommender.com免费使用该引擎。下文将阐述算法背后的思想及其实际应用。基因推荐问题可以通过将基因映射到度量空间来解决,在该空间中定义一个距离来表示基因之间真实的语义距离。为实现这一目标,我们在精心整理的、可免费获取的论文语料库PubMed上训练了一个基于Transformer的模型。本文描述了为获得最佳偏差-方差折衷而采用的多个优化过程,重点关注嵌入维度和网络深度。在此背景下,通过交叉验证评估了该模型发现与疾病和通路相关基因组的能力。一个简单的假设指导了该过程:网络没有关于通路和疾病的直接知识,但学习了基因的相似性及它们之间的相互作用。此外,为了进一步研究神经网络表示基因的空间,我们降低了嵌入的维数,并将结果投影到人类可理解的空间。最后,一组用例展示了该算法在真实世界场景中的潜在应用。
摘要 :New powerful tools for tackling life science problems have been created by recent advances in machine learning. The purpose of the paper is to discuss the potential advantages of gene recommendation performed by artificial intelligence (AI). Indeed, gene recommendation engines try to solve this problem: if the user is interested in a set of genes, which other genes are likely to be related to the starting set and should be investigated? This task was solved with a custom deep learning recommendation engine, DeepProphet2 (DP2), which is freely available to researchers worldwide via www.generecommender.com. Hereafter, insights behind the algorithm and its practical applications are illustrated.  The gene recommendation problem can be addressed by mapping the genes to a metric space where a distance can be defined to represent the real semantic distance between them. To achieve this objective a transformer-based model has been trained on a well-curated freely available paper corpus, PubMed. The paper describes multiple optimization procedures that were employed to obtain the best bias-variance trade-off, focusing on embedding size and network depth. In this context, the model's ability to discover sets of genes implicated in diseases and pathways was assessed through cross-validation. A simple assumption guided the procedure: the network had no direct knowledge of pathways and diseases but learned genes' similarities and the interactions among them. Moreover, to further investigate the space where the neural network represents genes, the dimensionality of the embedding was reduced, and the results were projected onto a human-comprehensible space. In conclusion, a set of use cases illustrates the algorithm's potential applications in a real word setting.


聚类(3篇)

【1】 A Tighter Analysis of Spectral Clustering, and Beyond
标题: 谱聚类的更严密分析及其他
链接:https://arxiv.org/abs/2208.01724

作者:Peter Macgregor,He Sun
机构:University of Edinburgh
备注:A preliminary version of this work appeared at ICML 2022
摘要:本文研究经典的谱聚类算法:利用图$G=(V_G,E_G)$的某个矩阵的$k$个特征向量将顶点嵌入$\mathbb{R}^k$,再应用$k$-均值将$V_G$划分为$k$个聚类。我们的第一个结果是对谱聚类性能的更严格分析,并解释了为什么谱聚类在比文献中研究的条件弱得多的条件下仍然有效。对于第二个结果,我们表明,在许多实际情形中,用少于$k$个特征向量来构造嵌入,谱聚类能够产生更好的输出;这是谱聚类领域同类结果中的第一个。除了概念和理论上的重要性外,在合成数据集和真实数据集上的实证分析也表明了本文成果的实际影响:谱聚类使用少于$k$个特征向量即可得到相当或更好的结果。
摘要:This work studies the classical spectral clustering algorithm which embeds the vertices of some graph $G=(V_G, E_G)$ into $\mathbb{R}^k$ using $k$ eigenvectors of some matrix of $G$, and applies $k$-means to partition $V_G$ into $k$ clusters. Our first result is a tighter analysis on the performance of spectral clustering, and explains why it works under some much weaker condition than the ones studied in the literature. For the second result, we show that, by applying fewer than $k$ eigenvectors to construct the embedding, spectral clustering is able to produce better output for many practical instances; this result is the first of its kind in spectral clustering. Besides its conceptual and theoretical significance, the practical impact of our work is demonstrated by the empirical analysis on both synthetic and real-world datasets, in which spectral clustering produces comparable or better results with fewer than $k$ eigenvectors.
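
The experiment described above is easy to reproduce in miniature: embed vertices with the bottom $p \le k$ eigenvectors of the normalized Laplacian and run k-means, here on a toy planted-partition graph with $k=4$ ground-truth clusters.

```python
# Spectral clustering with fewer than k eigenvectors.
import networkx as nx
import numpy as np
from sklearn.cluster import KMeans

G = nx.planted_partition_graph(4, 50, 0.3, 0.01, seed=0)   # k = 4 clusters
L = nx.normalized_laplacian_matrix(G).toarray()
vals, vecs = np.linalg.eigh(L)                             # ascending order

for p in (2, 3, 4):                   # p < k eigenvectors can already work
    emb = vecs[:, :p]                 # bottom-p eigenvectors as embedding
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(emb)
    print(p, "eigenvectors -> cluster sizes:", np.bincount(labels))
```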


【2】 No Pattern, No Recognition: a Survey about Reproducibility and  Distortion Issues of Text Clustering and Topic Modeling
标题: 无模式,无识别:文本聚类和主题建模的可再现性和失真问题综述
链接:https://arxiv.org/abs/2208.01712

作者:Marília Costa Rosendo Silva,Felipe Alves Siqueira,João Pedro Mantovani Tarrega,João Vitor Pataca Beinotti,Augusto Sousa Nunes,Miguel de Mattos Gardini,Vinícius Adolfo Pereira da Silva,Nádia Félix Felipe da Silva,André Carlos Ponce de Leon Ferreira de Carvalho
机构:Vinicius Adolfo Pereira da Silva, Institute of Mathematics and Computer Science and São Carlos School of Engineering, Institute of Mathematics and Computer Science  Institute of Informatics, University of São Paulo  Federal University of Goiás
摘要:使用机器学习算法从未标注文本中提取知识可能很复杂。文档分类和信息检索是两个可能受益于无监督学习(如文本聚类和主题建模)的应用,其中也包括探索性数据分析。然而,无监督学习范式会带来可复现性问题:取决于所用的机器学习算法,初始化可能导致结果的可变性。此外,涉及聚类几何时,失真可能产生误导,而离群值和异常的存在可能是其中的决定性因素。尽管初始化和离群值问题对文本聚类和主题建模非常重要,作者并未发现对它们的深入分析。本综述对这些子领域提供了系统的文献回顾(2011-2022),并提出了一套通用术语,因为相似的过程常有不同的叫法。作者描述了研究机会、趋势和开放问题。附录总结了与所综述工作直接或间接相关的文本向量化、分解和聚类算法的理论背景。
摘要:Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. The initialization can lead to variability depending on the machine learning algorithm. Furthermore, the distortions can be misleading when regarding cluster geometry. Amongst the causes, the presence of outliers and anomalies can be a determining factor. Despite the relevance of initialization and outlier issues for text clustering and topic modeling, the authors did not find an in-depth analysis of them. This survey provides a systematic literature review (2011-2022) of these subareas and proposes a common terminology since similar procedures have different terms. The authors describe research opportunities, trends, and open issues. The appendices summarize the theoretical background of the text vectorization, the factorization, and the clustering algorithms that are directly or indirectly related to the reviewed works.


【3】 Differentially Private Vertical Federated Clustering
标题: 差分私有垂直联邦聚类
链接:https://arxiv.org/abs/2208.01700

作者:Zitao Li,Tianhao Wang,Ninghui Li
机构:Purdue University, University of Virginia
摘要:在许多应用中,多方拥有关于同一组用户但属性互不相交的私有数据,而服务器希望利用这些数据来训练模型。为了在保护数据主体隐私的同时实现模型学习,我们需要垂直联邦学习(VFL)技术,即各数据方只共享用于训练模型的信息,而不共享私有数据。然而,在学习精确模型的同时保证共享信息的隐私性是一个挑战。据我们所知,本文提出的算法是第一个实用的差分隐私垂直联邦k-means聚类解决方案,服务器可以获得一组具有可证明差分隐私保证的全局中心。我们的算法假设一个不可信的中心服务器,它聚合来自本地数据方的差分隐私本地中心和成员编码,并基于收到的信息构建一个加权网格作为全局数据集的概要,最终中心通过在加权网格上运行任意k-means算法生成。我们的网格权重估计方法使用了一种新颖、轻量的基于Flajolet-Martin草图的差分隐私集合交集基数估计算法。为了提高多于两个数据方场景下的估计精度,我们进一步提出了权重估计算法的改进版本和参数调优策略,使最终k-means的效用接近中心化隐私设置下的水平。我们对算法计算出的聚类中心进行了理论效用分析和实验评估,结果表明本文算法在理论和实验上均优于基于现有技术的两种基线算法。
摘要:In many applications, multiple parties have private data regarding the same set of users but on disjoint sets of attributes, and a server wants to leverage the data to train a model. To enable model learning while protecting the privacy of the data subjects, we need vertical federated learning (VFL) techniques, where the data parties share only information for training the model, instead of the private data. However, it is challenging to ensure that the shared information maintains privacy while learning accurate models. To the best of our knowledge, the algorithm proposed in this paper is the first practical solution for differentially private vertical federated k-means clustering, where the server can obtain a set of global centers with a provable differential privacy guarantee. Our algorithm assumes an untrusted central server that aggregates differentially private local centers and membership encodings from local data parties. It builds a weighted grid as the synopsis of the global dataset based on the received information. Final centers are generated by running any k-means algorithm on the weighted grid. Our approach for grid weight estimation uses a novel, light-weight, and differentially private set intersection cardinality estimation algorithm based on the Flajolet-Martin sketch. To improve the estimation accuracy in the setting with more than two data parties, we further propose a refined version of the weights estimation algorithm and a parameter tuning strategy to reduce the final k-means utility to be close to that in the central private setting. We provide theoretical utility analysis and experimental evaluation results for the cluster centers computed by our algorithm and show that our approach performs better both theoretically and empirically than the two baselines based on existing techniques.
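
The building block named above, a Flajolet-Martin distinct-count sketch, is illustrated below; the differential-privacy noise and the intersection-cardinality step from the paper are omitted, so this is only the non-private core idea.

```python
# Illustrative Flajolet-Martin cardinality estimate (non-private core).
import hashlib

def fm_estimate(items, num_hashes=64):
    max_rho = [0] * num_hashes
    for x in items:
        for i in range(num_hashes):
            h = int(hashlib.sha256(f"{i}:{x}".encode()).hexdigest(), 16)
            rho = (h & -h).bit_length()      # position of lowest set bit
            max_rho[i] = max(max_rho[i], rho)
    R = sum(max_rho) / num_hashes            # average over hash functions
    return 2 ** R / 0.77351                  # 0.77351 is the FM correction

print(fm_estimate(range(1000)))              # should be on the order of 1000
```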


超分辨率|去噪|去模糊|去雾(1篇)

【1】 Pyramidal Denoising Diffusion Probabilistic Models
标题: 金字塔去噪扩散概率模型
链接:https://arxiv.org/abs/2208.01864

作者:Dohoon Ryu,Jong Chul Ye
机构: Dept. of Bio and Brain Engineering, Kim Jaechul Graduate School of AI,  Dept. of Mathematical Sciences, Korea Advanced Institute of Science and Technology (KAIST)
摘要:扩散模型已展现出令人印象深刻的图像生成性能,并被用于各种计算机视觉任务。然而,使用扩散模型生成图像非常耗时,因为它需要数千个采样步骤。为了解决这个问题,我们提出了一种新的金字塔扩散模型,利用带位置嵌入训练的单一得分函数,从分辨率低得多的图像开始生成高分辨率图像。这使得图像生成的采样具有时间效率,并解决了在有限资源下训练时的小批量问题。此外,我们还表明,所提出的方法可以使用单一得分函数有效地用于多尺度超分辨率问题。
摘要:Diffusion models have demonstrated impressive image generation performance, and have been used in various computer vision tasks. Unfortunately, image generation using diffusion models is very time-consuming since it requires thousands of sampling steps. To address this problem, here we present a novel pyramidal diffusion model to generate high resolution images starting from much coarser resolution images using a single score function trained with a positional embedding. This enables a time-efficient sampling for image generation, and also solves the low batch size problem when training with limited resources. Furthermore, we show that the proposed approach can be efficiently used for multi-scale super-resolution problem using a single score function.


联邦学习|隐私保护|加密(2篇)

【1】 Asynchronous Federated Learning for Edge-assisted Vehicular Networks
标题: 边缘辅助车辆网络的异步联邦学习
链接:https://arxiv.org/abs/2208.01901

作者:Siyuan Wang,Qiong Wu,Qiang Fan,Cui Zhang,Zhengquan Li
机构:Engineering, Jiangnan University, Wuxi , China) (,. State Key Laboratory of Integrated Services Network(Xidian University), Xi’an
备注:This paper has been submitted to WCSP
摘要:车载网络使车辆能够通过训练数据支持实时车载应用。由于计算能力有限,车辆通常将数据传输到网络边缘的路边单元(RSU)进行处理。然而,出于隐私考虑,车辆之间通常不愿意共享数据。在传统的联邦学习(FL)中,车辆在本地训练数据得到本地模型,然后将本地模型上传到RSU以更新全局模型,这样就可以通过共享模型参数而不是数据来保护数据隐私。传统FL同步更新全局模型,即RSU需要等待所有车辆上传模型后才能更新全局模型。但车辆往往在通过训练得到本地模型之前就驶出了RSU的覆盖范围,从而降低了全局模型的精度。有必要提出一种异步联邦学习(AFL)来解决这一问题:RSU一旦接收到某车辆的本地模型就立即更新全局模型。然而,数据量、计算能力和车辆移动性都会影响全局模型的精度。本文综合考虑数据量、计算能力和车辆移动性,设计了一种AFL方案以提高全局模型的精度。大量仿真实验表明,该方案优于FL方案。
摘要:Vehicular networks enable vehicles support real-time vehicular applications through training data. Due to the limited computing capability, vehicles usually transmit data to a road side unit (RSU) at the network edge to process data. However, vehicles are usually reluctant to share data with each other due to the privacy issue. For the traditional federated learning (FL), vehicles train the data locally to obtain a local model and then upload the local model to the RSU to update the global model, thus the data privacy can be protected through sharing model parameters instead of data. The traditional FL updates the global model synchronously, i.e., the RSU needs to wait for all vehicles to upload their models for the global model updating. However, vehicles may usually drive out of the coverage of the RSU before they obtain their local models through training, which reduces the accuracy of the global model. It is necessary to propose an asynchronous federated learning (AFL) to solve this problem, where the RSU updates the global model once it receives a local model from a vehicle. However, the amount of data, computing capability and vehicle mobility may affect the accuracy of the global model. In this paper, we jointly consider the amount of data, computing capability and vehicle mobility to design an AFL scheme to improve the accuracy of the global model. Extensive simulation experiments have demonstrated that our scheme outperforms the FL scheme
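
A minimal sketch of an asynchronous update rule in the spirit of the scheme above: the RSU blends in each arriving local model immediately, with a weight that decays with staleness and grows with the vehicle's data size. The exact weighting used in the paper may differ; this form is an assumption for illustration.

```python
# Asynchronous FL update at the RSU (toy weighting).
import numpy as np

global_model = np.zeros(10)

def afl_update(global_model, local_model, staleness, n_samples, base_lr=0.5):
    # discount stale or data-poor contributions
    alpha = base_lr * (staleness + 1) ** -0.5 * min(1.0, n_samples / 1000)
    return (1 - alpha) * global_model + alpha * local_model

global_model = afl_update(global_model, np.ones(10), staleness=3, n_samples=800)
print(global_model[:3])
```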


【2】 A New Implementation of Federated Learning for Privacy and Security  Enhancement
标题: 一种新的联邦学习隐私和安全增强实现
链接:https://arxiv.org/abs/2208.01826

作者:Xiang Ma,Haijian Sun,Rose Qingyang Hu,Yi Qian
机构:∗Department of Electrical and Computer Engineering, Utah State University, Logan, UT, †Department of Computer Science, University of Wisconsin-Whitewater, Whitewater, WI
摘要:出于对个人数据隐私日益增长的关注以及本地客户端数据量的快速增长,联邦学习(FL)已成为一种新的机器学习设置。FL系统由一个中央参数服务器和多个本地客户端组成:数据保存在本地客户端,通过共享本地学习的模型参数来学习中心模型。无需共享本地数据,隐私可以得到很好的保护。然而,由于共享的是模型而不是原始数据,系统可能受到恶意客户端发起的投毒模型攻击;而且由于服务器上没有本地客户端数据,识别恶意客户端也很困难。此外,仍然可以利用上传的模型估计客户端的本地数据来实施成员推断攻击,导致隐私泄露。本文首先提出了一种基于模型更新的联邦平均算法,以抵御加性噪声攻击和符号翻转攻击等拜占庭攻击;并提出个体客户端模型初始化方法,通过隐藏个体本地机器学习模型来进一步防御成员推断攻击。两种方案结合使用时,隐私和安全都能得到有效增强。实验证明,在非独立同分布(non-IID)数据且无攻击的情况下,所提方案能够收敛;在拜占庭攻击下,所提方案的性能明显优于经典的基于模型的FedAvg算法。
摘要:Motivated by the ever-increasing concerns on personal data privacy and the rapidly growing data volume at local clients, federated learning (FL) has emerged as a new machine learning setting. An FL system is comprised of a central parameter server and multiple local clients. It keeps data at local clients and learns a centralized model by sharing the model parameters learned locally. No local data needs to be shared, and privacy can be well protected. Nevertheless, since it is the model instead of the raw data that is shared, the system can be exposed to the poisoning model attacks launched by malicious clients. Furthermore, it is challenging to identify malicious clients since no local client data is available on the server. Besides, membership inference attacks can still be performed by using the uploaded model to estimate the client's local data, leading to privacy disclosure. In this work, we first propose a model update based federated averaging algorithm to defend against Byzantine attacks such as additive noise attacks and sign-flipping attacks. The individual client model initialization method is presented to provide further privacy protections from the membership inference attacks by hiding the individual local machine learning model. When combining these two schemes, privacy and security can be both effectively enhanced. The proposed schemes are proved to converge experimentally under non-IID data distribution when there are no attacks. Under Byzantine attacks, the proposed schemes perform much better than the classical model based FedAvg algorithm.


推理|分析|理解|解释(2篇)

【1】 A cloud platform for automating and sharing analysis of raw simulation  data from high throughput polymer molecular dynamics simulations
标题: 用于自动化和共享分析高通量聚合物分子动力学模拟的原始模拟数据的云平台
链接:https://arxiv.org/abs/2208.01692

作者:Tian Xie,Ha-Kyung Kwon,Daniel Schweigert,Sheng Gong,Arthur France-Lanord,Arash Khajeh,Emily Crabb,Michael Puzon,Chris Fajardo,Will Powelson,Yang Shao-Horn,Jeffrey C. Grossman
机构:Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts , USA, Toyota Research Institute, El Camino Real, Los Altos, CA , Sorbonne Universit´e, Institut des Sciences du Calcul
备注:21 pages, 7 figures
摘要:存储数十万种材料结构及其相应性质的开放材料数据库已成为现代计算材料科学的基石。然而,模拟的原始输出,如分子动力学模拟的轨迹和密度泛函理论计算的电荷密度,由于其庞大的规模通常不被共享。在这项工作中,我们描述了一个基于云的平台,以促进原始数据的共享,并支持在云端快速后处理以提取用户定义的新性质。作为初始演示,我们的数据库目前包括6286条非晶聚合物电解质的分子动力学轨迹,共5.7太字节数据。我们在https://github.com/TRI-AMDD/htp_md创建了一个公共分析库,使用专家设计的函数和机器学习模型从原始数据中提取多种性质。分析在云端自动运行,结果随后汇入一个可公开访问的数据库。我们的平台鼓励用户通过公共接口贡献新的轨迹数据和分析函数,新分析得到的性质将被并入数据库。最后,我们在https://www.htpmd.matr.io创建了一个前端用户界面,用于浏览和可视化我们的数据。我们期望该平台成为计算材料科学界共享原始数据和新见解的一种新方式。
摘要 :Open material databases storing hundreds of thousands of material structures and their corresponding properties have become the cornerstone of modern computational materials science. Yet, the raw outputs of the simulations, such as the trajectories from molecular dynamics simulations and charge densities from density functional theory calculations, are generally not shared due to their huge size. In this work, we describe a cloud-based platform to facilitate the sharing of raw data and enable the fast post-processing in the cloud to extract new properties defined by the user. As an initial demonstration, our database currently includes 6286 molecular dynamics trajectories for amorphous polymer electrolytes and 5.7 terabytes of data. We create a public analysis library at https://github.com/TRI-AMDD/htp_md to extract multiple properties from the raw data, using both expert designed functions and machine learning models. The analysis is run automatically with computation in the cloud, and results then populate a database that can be accessed publicly. Our platform encourages users to contribute both new trajectory data and analysis functions via public interfaces. Newly analyzed properties will be incorporated into the database. Finally, we create a front-end user interface at https://www.htpmd.matr.io for browsing and visualization of our data. We envision the platform to be a new way of sharing raw data and new insights for the computational materials science community.


【2】 Diagnosis of Paratuberculosis in Histopathological Images Based on  Explainable Artificial Intelligence and Deep Learning
标题: 基于可解释人工智能和深度学习的副结核病病理组织图像诊断
链接:https://arxiv.org/abs/2208.01674

作者:Tuncay Yiğit,Nilgün Şengöz,Özlem Özmen,Jude Hemanth,Ali Hakan Işık
机构: Dept. of Computer Engineering, Süleyman Demirel University, Isparta , Turkey,  Dept. of Pathology, Burdur Mehmet Akif Ersoy University, Burdur , Turkey,  Karunya Institute of Technology & Sciences, Coimbatore , India
备注:None
摘要:人工智能在医学影像,尤其是组织病理学成像领域有着广阔的应用前景。然而,人工智能算法无法完全解释决策过程中的思维过程。这使得人工智能应用的可解释性问题,即黑箱问题,被提上议事日程:算法只是针对给定图像给出响应,而不说明理由。为了克服这一问题并提高可解释性,可解释人工智能(XAI)应运而生,并引起了许多研究者的兴趣。在此背景下,本研究使用深度学习算法考察了一个新的原始数据集,并用XAI应用之一的梯度加权类激活映射(Grad-CAM)将输出可视化。随后,就这些图像对病理医师进行了详细的问卷调查,对决策过程和解释都进行了验证,并测试了输出的准确性。研究结果对病理医师诊断副结核病有很大帮助。
摘要:Artificial intelligence holds great promise in medical imaging, especially histopathological imaging. However, artificial intelligence algorithms cannot fully explain the thought processes during decision-making. This situation has brought the problem of explainability, i.e., the black box problem, of artificial intelligence applications to the agenda: an algorithm simply responds without stating the reasons for the given images. To overcome the problem and improve the explainability, explainable artificial intelligence (XAI) has come to the fore, and piqued the interest of many researchers. Against this backdrop, this study examines a new and original dataset using the deep learning algorithm, and visualizes the output with gradient-weighted class activation mapping (Grad-CAM), one of the XAI applications. Afterwards, a detailed questionnaire survey was conducted with the pathologists on these images. Both the decision-making processes and the explanations were verified, and the accuracy of the output was tested. The research results greatly help pathologists in the diagnosis of paratuberculosis.
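
Grad-CAM itself, the XAI visualization used in the study, can be sketched in a few lines of PyTorch; a torchvision ResNet and a random input stand in for the histopathology model and images used in the paper.

```python
# Minimal Grad-CAM: weight the last conv features by pooled gradients.
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
feats, grads = {}, {}
layer = model.layer4
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224, requires_grad=True)
score = model(x)[0].max()            # top class score
score.backward()

w = grads["a"].mean(dim=(2, 3), keepdim=True)   # channel importance weights
cam = torch.relu((w * feats["a"]).sum(dim=1))   # coarse class activation map
print(cam.shape)                     # upsample this heatmap onto the input
```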


检测相关(6篇)

【1】 Blockchain associated machine learning and IoT based hypoglycemia  detection system with auto-injection feature
标题: 具有自动注射功能的基于区块链的机器学习和物联网的低血糖检测系统
链接:https://arxiv.org/abs/2208.02222

作者:Rahnuma Mahzabin,Fahim Hossain Sifat,Sadia Anjum,Al-Akhir Nayan,Muhammad Golam Kibria
机构:Department of Computer Science and Engineering, School of Science and Engineering, University of Liberal Arts Bangladesh (ULAB), Dhaka, Bangladesh, Department of Computer Engineering, Chulalongkorn University, Bangkok, Thailand, Article Info
备注:None
摘要:低血糖症是由低血糖引起的一种令人不适的现象,这种疾病可能导致死亡或严重的身体损伤。为了避免严重伤害,患者需要糖分。本研究旨在实现一个自动检测低血糖并执行自动糖注射以挽救生命的系统。借助物联网(IoT)的优势,传感器数据使用超文本传输协议(HTTP)进行传输;为确保健康相关数据的安全,利用了区块链技术。葡萄糖传感器和智能手表的数据经雾计算处理后发送到云端。本文提出并使用随机森林算法来判定低血糖事件;当检测到低血糖事件时,系统会向移动应用程序和自动注射设备发送通知,将浓缩糖推注到患者体内。为比较所提模型的性能,还实现了XGBoost、k近邻(KNN)、支持向量机(SVM)和决策树。随机森林的测试准确率为0.942,在检测低血糖事件方面优于其他模型。系统性能在多种条件下进行了测试,并取得了满意的结果。该系统可以帮助低血糖患者在这种疾病中存活下来。
摘要:Hypoglycemia is an unpleasant phenomenon caused by low blood glucose. The disease can lead a person to death or a high level of body damage. To avoid significant damage, patients need sugar. The research aims at implementing an automatic system to detect hypoglycemia and perform automatic sugar injections to save a life. Receiving the benefits of the internet of things (IoT), the sensor data was transferred using the hypertext transfer protocol (HTTP) protocol. To ensure the safety of health-related data, blockchain technology was utilized. The glucose sensor and smartwatch data were processed via Fog and sent to the cloud. A Random Forest algorithm was proposed and utilized to decide hypoglycemic events. When the hypoglycemic event was detected, the system sent a notification to the mobile application and auto-injection device to push the condensed sugar into the victims body. XGBoost, k-nearest neighbors (KNN), support vector machine (SVM), and decision tree were implemented to compare the proposed models performance. The random forest performed 0.942 testing accuracy, better than other models in detecting hypoglycemic events. The systems performance was measured in several conditions, and satisfactory results were achieved. The system can benefit hypoglycemia patients to survive this disease.


【2】 A Novel Approach To Network Intrusion Detection System Using Deep  Learning For Sdn: Futuristic Approach
标题: 一种基于深度学习的网络入侵检测系统未来主义方法
链接:https://arxiv.org/abs/2208.02094

作者:Mhmood Radhi Hadi,Adnan Saher Mohammed
机构:Department of Computer Engineering, Karabük University, Karabük, Turkey
摘要:软件定义网络(SDN)是改变传统网络体系结构的下一代网络,是改变互联网网络体系结构的有前途的解决方案之一。由于SDN体系结构的集中化特性,攻击变得愈发常见,为SDN提供安全保障至关重要。在本研究中,我们在SDN背景下提出了一种网络入侵检测系统-深度学习模块(NIDS-DL)方法。我们的方法将网络入侵检测系统(NIDS)与多种深度学习算法相结合,使用特征选择方法从NSL-KDD数据集的41个特征中提取12个特征,并采用了CNN、DNN、RNN、LSTM和GRU五种分类器。比较分类器得分时,我们的技术分别取得了98.63%、98.53%、98.13%、98.04%和97.78%的准确率。我们新方法(NIDS-DL)的新颖之处在于使用了5个深度学习分类器并对数据集进行预处理以获得最佳结果。我们的方法在二元分类和攻击检测方面取得了成功,这意味着该方法(NIDS-DL)未来可能会被高效地使用。
摘要:Software-Defined Networking (SDN) is the next generation to change the architecture of traditional networks. SDN is one of the promising solutions to change the architecture of internet networks. Attacks become more common due to the centralized nature of SDN architecture. It is vital to provide security for the SDN. In this study, we propose a Network Intrusion Detection System-Deep Learning module (NIDS-DL) approach in the context of SDN. Our suggested method combines Network Intrusion Detection Systems (NIDS) with many types of deep learning algorithms. Our approach employs 12 features extracted from 41 features in the NSL-KDD dataset using a feature selection method. We employed classifiers (CNN, DNN, RNN, LSTM, and GRU). When we compare classifier scores, our technique produced accuracy results of (98.63%, 98.53%, 98.13%, 98.04%, and 97.78%) respectively. The novelty of our new approach (NIDS-DL) uses 5 deep learning classifiers and made pre-processing dataset to harvests the best results. Our proposed approach was successful in binary classification and detecting attacks, implying that our approach (NIDS-DL) might be used with great efficiency in the future.
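
The preprocessing idea, selecting 12 of the 41 NSL-KDD features and training on the reduced set, can be sketched as below; synthetic data stands in for NSL-KDD, and a small MLP replaces the paper's deep classifiers.

```python
# Feature selection (41 -> 12) followed by a classifier, as in NIDS-DL.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=41, n_informative=12,
                           random_state=0)
X_sel = SelectKBest(mutual_info_classif, k=12).fit_transform(X, y)
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
print("CV accuracy:", cross_val_score(clf, X_sel, y, cv=3).mean())
```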


【3】 Localization and Classification of Parasitic Eggs in Microscopic Images  Using an EfficientDet Detector
标题: 基于高效Det检测器的显微图像中寄生虫卵的定位与分类
链接:https://arxiv.org/abs/2208.01963

作者:Nouar AlDahoul,Hezerul Abdul Karim,Shaira Limson Kee,Myles Joshua Toledo Tan
机构:Multimedia University, Cyberjaya, Malaysia, Department of Natural Sciences, University of St. La Salle, Bacolod City, Philippines, Department of Chemical Engineering, University of St. La Salle, Bacolod City, Philippines
备注:6 pages, 7 figures, to be published in IEEE International Conference on Image Processing 2022
摘要:由原生动物和蠕虫寄生虫引起的肠道寄生虫感染(IPI)是中低收入国家(LMIC)人群中最常见的感染之一。它们被视为严重的公共卫生问题,因为会导致各种潜在的有害健康状况。研究人员一直在开发用于自动识别显微图像中寄生虫卵的模式识别技术。现有的解决方案仍需改进,以减少诊断错误并快速生成高效、准确的结果。针对这一问题,本文提出了一种多模态学习检测器来定位寄生虫卵并将其分为11类。实验在新的Chula-ParasiteEgg-11数据集上进行,该数据集包含来自11个类别的11,000张显微训练图像,用于训练以EfficientNet-v2为骨干的EfficientDet模型以及EfficientNet-B7+SVM。结果表明检测器性能稳健,准确率达92%,F1得分达93%;IOU分布也表明了该检测器的高定位能力。
摘要:IPIs caused by protozoan and helminth parasites are among the most common infections in humans in LMICs. They are regarded as a severe public health concern, as they cause a wide array of potentially detrimental health conditions. Researchers have been developing pattern recognition techniques for the automatic identification of parasite eggs in microscopic images. Existing solutions still need improvements to reduce diagnostic errors and generate fast, efficient, and accurate results. Our paper addresses this and proposes a multi-modal learning detector to localize parasitic eggs and categorize them into 11 categories. The experiments were conducted on the novel Chula-ParasiteEgg-11 dataset that was used to train both EfficientDet model with EfficientNet-v2 backbone and EfficientNet-B7+SVM. The dataset has 11,000 microscopic training images from 11 categories. Our results show robust performance with an accuracy of 92%, and an F1 score of 93%. Additionally, the IOU distribution illustrates the high localization capability of the detector.


【4】 Leveraging Smartphone Sensors for Detecting Abnormal Gait for Smart  Wearable Mobile Technologies
标题: 利用智能手机传感器检测智能可穿戴移动技术的异常步态
链接:https://arxiv.org/abs/2208.01876

作者:Md Shahriar Tasjid,Ahmed Al Marouf
机构:Daffodil International University, Dhaka, Bangladesh, University of Calgary, Alberta, Canada
备注:None
摘要:行走是人类最常见的地面运动模式之一,是人类进行大多数日常活动所必需的。当一个人行走时,其中存在一种模式,即所谓的步态。步态分析被应用于体育和医疗保健领域。我们可以通过不同方式分析步态,例如在实验室环境中使用监控摄像机或深度图像相机捕获的视频,也可以通过可穿戴传感器识别,例如加速度计、力传感器、陀螺仪、柔性角度计、磁阻传感器、电磁跟踪系统和肌电图(EMG)。通过这些传感器进行的分析需要实验室条件,或者用户必须佩戴这些传感器;而为了检测人体步态的异常,我们还需要单独集成这些传感器。检测出异常步态后,我们可以据此了解一个人的健康状况。借助智能穿戴技术理解正常步态与异常步态的差异,可以洞察受试者的健康状况。因此,本文提出了一种通过智能手机传感器分析人体异常步态的方法。如今大多数人都在使用智能手机和智能手表等智能设备,因此我们可以利用这些智能穿戴设备的传感器来追踪他们的步态。
摘要:Walking is one of the most common modes of terrestrial locomotion for humans. Walking is essential for humans to perform most kinds of daily activities. When a person walks, there is a pattern in it, and it is known as gait. Gait analysis is used in sports and healthcare. We can analyze this gait in different ways, like using video captured by the surveillance cameras or depth image cameras in the lab environment. It also can be recognized by wearable sensors. e.g., accelerometer, force sensors, gyroscope, flexible goniometer, magneto resistive sensors, electromagnetic tracking system, force sensors, and electromyography (EMG). Analysis through these sensors required a lab condition, or users must wear these sensors. For detecting abnormality in gait action of a human, we need to incorporate the sensors separately. We can know about one's health condition by abnormal human gait after detecting it. Understanding a regular gait vs. abnormal gait may give insights to the health condition of the subject using the smart wearable technologies. Therefore, in this paper, we proposed a way to analyze abnormal human gait through smartphone sensors. Though smart devices like smartphones and smartwatches are used by most of the person nowadays. So, we can track down their gait using sensors of these intelligent wearable devices.


【5】 Robust Learning of Deep Time Series Anomaly Detection Models with  Contaminated Training Data
标题: 训练数据被污染的深度时间序列异常检测模型的鲁棒学习
链接:https://arxiv.org/abs/2208.01841

作者:Wenkai Li,Cheng Feng,Ting Chen,Jun Zhu
机构:Tsinghua University, Beijing, China, Siemens AG
摘要:时间序列异常检测(TSAD)是物联网时代一项重要的数据挖掘任务,具有众多应用。近年来,大量基于深度神经网络的方法被提出,在解决多个领域具有挑战性的TSAD问题方面表现出明显优于传统方法的性能。然而,这些深度TSAD方法通常依赖未被异常污染的干净训练数据集来学习底层动态的"正常轮廓"。由于实践中几乎无法提供干净的数据集,这一要求并非轻而易举;而且,在不了解其鲁棒性的情况下,盲目地将深度TSAD方法应用于可能被污染的训练数据,可能导致检测阶段的显著性能退化。针对这一重要挑战,本文首先研究了常用深度TSAD方法在训练数据被污染情况下的鲁棒性,为在训练数据无法保证无异常时应用这些方法提供了指导;进而提出了一种模型无关的方法,能够有效提高主流深度TSAD模型在潜在污染数据上的学习鲁棒性。实验结果表明,该方法能够在广泛使用的基准数据集上持续地防止或减轻主流深度TSAD模型的性能退化。
摘要:Time series anomaly detection (TSAD) is an important data mining task with numerous applications in the IoT era. In recent years, a large number of deep neural network-based methods have been proposed, demonstrating significantly better performance than conventional methods on addressing challenging TSAD problems in a variety of areas. Nevertheless, these deep TSAD methods typically rely on a clean training dataset that is not polluted by anomalies to learn the "normal profile" of the underlying dynamics. This requirement is nontrivial since a clean dataset can hardly be provided in practice. Moreover, without the awareness of their robustness, blindly applying deep TSAD methods with potentially contaminated training data can possibly incur significant performance degradation in the detection phase. In this work, to tackle this important challenge, we firstly investigate the robustness of commonly used deep TSAD methods with contaminated training data which provides a guideline for applying these methods when the provided training data are not guaranteed to be anomaly-free. Furthermore, we propose a model-agnostic method which can effectively improve the robustness of learning mainstream deep TSAD models with potentially contaminated data. Experiment results show that our method can consistently prevent or mitigate performance degradation of mainstream deep TSAD models on widely used benchmark datasets.


【6】 Robust PCA for Anomaly Detection and Data Imputation in Seasonal Time  Series
标题: 基于鲁棒PCA的季节性时间序列异常检测与数据插补
链接:https://arxiv.org/abs/2208.01998

作者:Hong-Lan Botterman,Julien Roussel,Thomas Morzadec,Ali Jabbari,Nicolas Brunel
机构:Quantmetry, rue d’Anjou, Paris, France, LaMME, ENSIIE, Universit´e Paris Saclay, square de la R´esistance, Evry Cedex
摘要:本文提出了一种鲁棒主成分分析(RPCA)框架,用于从时序观测中恢复低秩矩阵和稀疏矩阵,并开发了批处理时序算法的在线版本,以处理更大的数据集或流数据。实验将所提方法与不同的RPCA框架进行了比较,并证明了它们在实际场景中的有效性。
摘要:We propose a robust principal component analysis (RPCA) framework to recover low-rank and sparse matrices from temporal observations. We develop an online version of the batch temporal algorithm in order to process larger datasets or streaming data. We empirically compare the proposed approaches with different RPCA frameworks and show their effectiveness in practical situations.
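
A compact, illustrative RPCA-style decomposition: alternate a rank-r fit for the low-rank seasonal part with entrywise soft-thresholding for the sparse anomalies. This is a simplified batch heuristic, not the paper's exact temporal or online algorithm; the rank and threshold are assumptions.

```python
# Toy RPCA: low-rank (L) + sparse (S) split of a seasonal series matrix.
import numpy as np

def rpca_alternating(M, rank=1, thresh=1.0, n_iter=20):
    S = np.zeros_like(M)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]            # low-rank part
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - thresh, 0)  # sparse anomalies
    return L, S

t = np.arange(200)
M = np.outer(np.sin(2 * np.pi * t / 24), np.ones(5))  # seasonal low-rank signal
M[50, 2] += 8.0                                       # injected anomaly
L, S = rpca_alternating(M)
print("anomaly at:", np.unravel_index(np.abs(S).argmax(), S.shape))  # (50, 2)
```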


分类|识别(3篇)

【1】 One Node at a Time: Node-Level Network Classification
标题: 一次一个节点:节点级网络分类
链接:https://arxiv.org/abs/2208.02162

作者:Saray Shai,Isaac Jacobs,Peter J. Mucha
机构:Department of Mathematics, and Computer Science, Wesleyan University, Middletown, CT USA, Dartmouth College, Hanover, NH USA
备注:None
摘要:网络分类旨在根据结构将网络(或图)划分为不同的类别。我们研究了网络分类与其组成节点分类之间的联系,以及来自不同网络类别的节点能否根据中心度和聚类系数等结构化节点特征加以区分。我们使用多种网络数据集和随机网络模型证明,可以训练一个分类器来准确预测给定节点所属的网络类别(无需查看整个网络),这意味着复杂网络即使在节点层面也表现出独特的结构模式。最后,我们讨论了节点级网络分类的两个应用:(i)基于少量节点样本的全网络分类,以及(ii)网络自举(bootstrapping)。
摘要 :Network classification aims to group networks (or graphs) into distinct categories based on their structure. We study the connection between classification of a network and of its constituent nodes, and whether nodes from networks in different groups are distinguishable based on structural node characteristics such as centrality and clustering coefficient. We demonstrate, using various network datasets and random network models, that a classifier can be trained to accurately predict the network category of a given node (without seeing the whole network), implying that complex networks display distinct structural patterns even at the node level. Finally, we discuss two applications of node-level network classification: (i) whole-network classification from small samples of nodes, and (ii) network bootstrapping.
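
The node-level experiment is straightforward to emulate: compute structural features per node, label each node with its network's class, and train a classifier. Two random-graph families below stand in for the paper's network categories.

```python
# Predict a node's network category from structural node features alone.
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def node_features(G):
    deg = dict(G.degree()); clu = nx.clustering(G); cen = nx.degree_centrality(G)
    return np.array([[deg[v], clu[v], cen[v]] for v in G])

X, y = [], []
makers = [lambda: nx.erdos_renyi_graph(100, 0.05),   # category 0
          lambda: nx.barabasi_albert_graph(100, 3)]  # category 1
for label, maker in enumerate(makers):
    for _ in range(10):
        F = node_features(maker())
        X.append(F); y.append(np.full(len(F), label))
X, y = np.vstack(X), np.concatenate(y)
print("CV accuracy:", cross_val_score(RandomForestClassifier(), X, y, cv=3).mean())
```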


【2】 Binary Classification with Positive Labeling Sources
标题: 基于正标记源的二元分类
链接:https://arxiv.org/abs/2208.01704

作者:Jieyu Zhang,Yujing Wang,Yaming Yang,Yang Luo,Alexander Ratner
机构:Microsoft Research Asia, The Paul G. Allen School of Computer Science & Engineering, University of Washington, Snorkel AI, Inc.
备注:CIKM 2022 (short)
摘要:为了高效地为机器学习模型创建大量训练标签,研究者们转向了弱监督(WS),即使用程序化的标注源而不是人工标注。现有的二元分类WS工作通常假设存在能够以大致平衡的比例为数据赋予正标签和负标签的标注源。然而,对于许多存在少数正类的任务,负例可能过于多样,使开发者难以构造有指示性的负标注源。因此,本文研究了WS在仅有正标注源的二元分类任务上的应用。我们提出了WEAPO,一种简单而有竞争力的、无需负标注源即可产生训练标签的WS方法。在10个基准数据集上,我们证明了WEAPO在合成标签的质量和由这些标签监督的最终分类器的性能两方面都取得了最高的平均表现。我们已将WEAPO的实现并入现有的基准测试平台WRENCH。
摘要:To create a large amount of training labels for machine learning models effectively and efficiently, researchers have turned to Weak Supervision (WS), which uses programmatic labeling sources rather than manual annotation. Existing works of WS for binary classification typically assume the presence of labeling sources that are able to assign both positive and negative labels to data in roughly balanced proportions. However, for many tasks of interest where there is a minority positive class, negative examples could be too diverse for developers to generate indicative labeling sources. Thus, in this work, we study the application of WS on binary classification tasks with positive labeling sources only. We propose WEAPO, a simple yet competitive WS method for producing training labels without negative labeling sources. On 10 benchmark datasets, we show WEAPO achieves the highest averaged performance in terms of both the quality of synthesized labels and the performance of the final classifier supervised with these labels. We incorporated the implementation of WEAPO into WRENCH, an existing benchmarking platform.


【3】 AI-driven Hypernetwork of Organic Chemistry: Network Statistics and  Applications in Reaction Classification
标题: 人工智能驱动的有机化学超网络:网络统计及其在反应分类中的应用
链接:https://arxiv.org/abs/2208.01647

作者:Vipul Mann,Venkat Venkatasubramanian
机构: 1Department of Chemical Engineering,  Columbia University
摘要:近年来,高通量筛选的进步、对复杂得多的化学设计空间的可及性以及精确分子建模框架的发展,推动了新反应和新分子的快速发现。因此,需要对不断增长的化学文献进行整体研究,重点在于理解近期趋势并将其外推到未来可能的轨迹。为此,已有若干基于网络理论的研究采用化学反应的有向图表示。在这里,我们将化学反应表示为超图进行研究:超边表示化学反应,节点表示参与反应的分子。我们使用一个标准反应数据集构建超网络,并报告其统计量,如度分布、平均路径长度、同配性(或度相关性)、PageRank中心性以及基于图的聚类(或社区)。我们还为反应的等价有向图表示计算了每个统计量,以对照两者的相似之处并突出差异。为了展示超图反应表示在AI上的适用性,我们生成了稠密的超图嵌入并将其用于反应分类问题。我们的结论是:超网络表示灵活、保留反应上下文,并能揭示传统化学反应有向图表示中不明显的隐藏见解。
摘要:Rapid discovery of new reactions and molecules in recent years has been facilitated by the advancements in high throughput screening, accessibility to a much more complex chemical design space, and the development of accurate molecular modeling frameworks. A holistic study of the growing chemistry literature is, therefore, required that focuses on understanding the recent trends and extrapolating them into possible future trajectories. To this end, several network theory-based studies have been reported that use a directed graph representation of chemical reactions. Here, we perform a study based on representing chemical reactions as hypergraphs where the hyperedges represent chemical reactions and nodes represent the participating molecules. We use a standard reactions dataset to construct a hypernetwork and report its statistics such as degree distributions, average path length, assortativity or degree correlations, PageRank centrality, and graph-based clusters (or communities). We also compute each statistic for an equivalent directed graph representation of reactions to draw parallels and highlight differences between the two. To demonstrate the AI applicability of hypergraph reaction representation, we generate dense hypergraph embeddings and use them in the reaction classification problem. We conclude that the hypernetwork representation is flexible, preserves reaction context, and uncovers hidden insights that are otherwise not apparent in a traditional directed graph representation of chemical reactions.
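
The incidence structure of such a hypernetwork can be sketched as a bipartite graph in networkx, with reactions as hyperedge nodes and molecules as ordinary nodes; the reactions below are toy placeholders, not entries from the dataset used in the paper.

```python
# Hypergraph of reactions encoded as a bipartite incidence graph.
import networkx as nx

reactions = {"r1": ["A", "B", "C"],   # hyperedge = participating molecules
             "r2": ["C", "D"],
             "r3": ["A", "D", "E"]}

B = nx.Graph()
for rxn, mols in reactions.items():
    B.add_node(rxn, kind="reaction")
    B.add_edges_from((rxn, m) for m in mols)

mols = [n for n, d in B.nodes(data=True) if d.get("kind") != "reaction"]
print({m: B.degree(m) for m in mols})  # molecule degrees in the hypernetwork
```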


表征(3篇)

【1】 Multimodal sensor fusion in the latent representation space
标题: 潜在表示空间中的多模态传感器融合
链接:https://arxiv.org/abs/2208.02183

作者:Robert J. Piechocki,Xiaoyang Wang,Mohammud J. Bocus
机构:School of Computer Science, Electrical and Electronic Engineering, and Engineering Maths, University of Bristol, Bristol, BS,UB, UK.
备注:Under review for Nature Scientific Reports
摘要:提出了一种新的多模态传感器融合方法。该方法分为两个阶段:第一阶段,由未标注的训练数据构建多模态生成模型;第二阶段,该生成模型充当传感器融合任务的重建先验和搜索流形。该方法还可以处理仅通过二次采样(即压缩感知)获取观测值的情况。我们在一系列多模态融合实验(如多传感器分类、去噪和从子采样观测中恢复)上证明了该方法的有效性和优异性能。
摘要:A new method for multimodal sensor fusion is introduced. The technique relies on a two-stage process. In the first stage, a multimodal generative model is constructed from unlabelled training data. In the second stage, the generative model serves as a reconstruction prior and the search manifold for the sensor fusion tasks. The method also handles cases where observations are accessed only via subsampling i.e. compressed sensing. We demonstrate the effectiveness and excellent performance on a range of multimodal fusion experiments such as multisensory classification, denoising, and recovery from subsampled observations.


【2】 Masked Vision and Language Modeling for Multi-modal Representation  Learning
标题: 多模态表征学习的掩蔽视觉与语言建模
链接:https://arxiv.org/abs/2208.02131

作者:Gukyeong Kwon,Zhaowei Cai,Avinash Ravichandran,Erhan Bas,Rahul Bhotika,Stefano Soatto
机构:AWS AI Labs
摘要:本文研究如何在视觉和语言(V+L)表征学习中使用掩码信号建模。我们不再独立地发展掩码语言建模(MLM)和掩码图像建模(MIM),而是提出建立联合的掩码视觉与语言建模,即一种模态的被掩码信号借助另一种模态来重建。其动机在于图文配对数据的本质:图像和文本以不同形式传达几乎相同的信息。以另一模态为条件重建某一模态的掩码信号,还可以隐式地学习语言词元和图像块之间的跨模态对齐。我们在多种V+L任务上的实验表明,该方法不仅在使用大量数据时达到了最先进的性能,而且在训练数据有限的情况下也以显著优势胜过其他竞争方法。
摘要:In this paper, we study how to use masked signal modeling in vision and language (V+L) representation learning. Instead of developing masked language modeling (MLM) and masked image modeling (MIM) independently, we propose to build joint masked vision and language modeling, where the masked signal of one modality is reconstructed with the help from another modality. This is motivated by the nature of image-text paired data that both of the image and the text convey almost the same information but in different formats. The masked signal reconstruction of one modality conditioned on another modality can also implicitly learn cross-modal alignment between language tokens and image patches. Our experiments on various V+L tasks show that the proposed method not only achieves state-of-the-art performances by using a large amount of data, but also outperforms the other competitors by a significant margin in the regimes of limited training data.


【3】 HybridGNN: Learning Hybrid Representation in Multiplex Heterogeneous  Networks
标题: HybridGNN:在多异构网络中学习混合表示
链接:https://arxiv.org/abs/2208.02068

作者:Tiankai Gu,Chaokun Wang,Cheng Wu,Jingcao Xu,Yunkai Lou,Changping Wang,Kai Xu,Can Ye,Yang Song
机构:†School of Software, Tsinghua University, Beijing , China, ‡Kuaishou Inc., Beijing , China
备注:ICDE 2022
摘要:近年来,图神经网络在基于异构网络的推荐系统中展现了对复杂拓扑结构建模的优越性。由于节点之间的交互多样,且不同类型的节点和边涌现出丰富的语义,在多重异构网络中学习有表达力的节点表示成为一个热门研究课题。推荐系统中最重要的任务之一是预测特定边类型(即关系)下两个节点之间的潜在连接。尽管现有研究利用显式元路径来聚合邻居,但实际上它们只考虑关系内的元路径,因而无法利用关系间信息带来的潜在提升。此外,在多样的关系下,尤其是随着节点和边类型数量的增加,全面利用关系间元路径并不总是直截了当的;两个节点之间不同关系的贡献也难以度量。为了应对这些挑战,我们提出了HybridGNN,一种具有混合聚合流和层次注意力的端到端GNN模型,以充分利用多重场景中的异构性。具体而言,HybridGNN应用随机化的关系间探索模块来挖掘不同关系之间的多重性,然后利用关系内元路径下的混合聚合流和随机化探索来学习丰富的语义。为了探究不同聚合流的重要性并利用多重性属性,我们提出了一个同时利用元路径级注意力和关系级注意力的新颖层次注意力模块。大量实验结果表明,与多个最先进的基线相比,HybridGNN取得了最佳性能。
摘要:Recently, graph neural networks have shown the superiority of modeling the complex topological structures in heterogeneous network-based recommender systems. Due to the diverse interactions among nodes and abundant semantics emerging from diverse types of nodes and edges, there is a bursting research interest in learning expressive node representations in multiplex heterogeneous networks. One of the most important tasks in recommender systems is to predict the potential connection between two nodes under a specific edge type (i.e., relationship). Although existing studies utilize explicit metapaths to aggregate neighbors, practically they only consider intra-relationship metapaths and thus fail to leverage the potential uplift by inter-relationship information. Moreover, it is not always straightforward to exploit inter-relationship metapaths comprehensively under diverse relationships, especially with the increasing number of node and edge types. In addition, contributions of different relationships between two nodes are difficult to measure. To address the challenges, we propose HybridGNN, an end-to-end GNN model with hybrid aggregation flows and hierarchical attentions to fully utilize the heterogeneity in the multiplex scenarios. Specifically, HybridGNN applies a randomized inter-relationship exploration module to exploit the multiplexity property among different relationships. Then, our model leverages hybrid aggregation flows under intra-relationship metapaths and randomized exploration to learn the rich semantics. To explore the importance of different aggregation flow and take advantage of the multiplexity property, we bring forward a novel hierarchical attention module which leverages both metapath-level attention and relationship-level attention. Extensive experimental results suggest that HybridGNN achieves the best performance compared to several state-of-the-art baselines.


3D|3D重建等相关(1篇)

【1】 PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object  Tracking?
标题: PolarMOT:几何关系在3D多目标跟踪中能带我们走多远?
链接:https://arxiv.org/abs/2208.01957

作者:Aleksandr Kim,Guillem Brasó,Aljoša Ošep,Laura Leal-Taixé
机构:Technical University of Munich, Germany
备注:ECCV 2022, 17 pages, 5 pages of supplementary, 3 figures
摘要:大多数(3D)多目标跟踪方法依赖基于外观的线索进行数据关联。相比之下,我们研究仅将3D空间中目标之间的几何关系编码为数据驱动的数据关联线索所能达到的程度。我们将3D检测编码为图中的节点,目标之间的空间和时间成对关系经由图边上的局部化极坐标来编码。这种表示使我们的几何关系对全局变换和平滑的轨迹变化保持不变,尤其是在非完整运动下。这使得我们的图神经网络能够学习有效地编码时间和空间交互,并充分利用上下文和运动线索,通过将数据关联表述为边分类来获得最终的场景解释。我们在nuScenes数据集上创造了新的最先进水平,更重要的是,证明了我们的方法PolarMOT在不同地点(波士顿、新加坡、卡尔斯鲁厄)和数据集(nuScenes和KITTI)上具有非常好的泛化能力。
摘要:Most (3D) multi-object tracking methods rely on appearance-based cues for data association. By contrast, we investigate how far we can get by only encoding geometric relationships between objects in 3D space as cues for data-driven data association. We encode 3D detections as nodes in a graph, where spatial and temporal pairwise relations among objects are encoded via localized polar coordinates on graph edges. This representation makes our geometric relations invariant to global transformations and smooth trajectory changes, especially under non-holonomic motion. This allows our graph neural network to learn to effectively encode temporal and spatial interactions and fully leverage contextual and motion cues to obtain final scene interpretation by posing data association as edge classification. We establish a new state-of-the-art on nuScenes dataset and, more importantly, show that our method, PolarMOT, generalizes remarkably well across different locations (Boston, Singapore, Karlsruhe) and datasets (nuScenes and KITTI).
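
The edge encoding idea can be sketched directly: express the displacement between two detections in polar coordinates localized to the source object's heading, which is invariant to global translations and rotations. The exact edge features used in the paper may differ from this toy version.

```python
# Localized polar coordinates for a graph edge between two detections.
import numpy as np

def polar_edge_feature(p_src, yaw_src, p_dst):
    d = np.asarray(p_dst[:2]) - np.asarray(p_src[:2])
    rng = np.hypot(*d)                          # distance between objects
    bearing = np.arctan2(d[1], d[0]) - yaw_src  # angle relative to heading
    return rng, np.arctan2(np.sin(bearing), np.cos(bearing))  # wrap to [-pi, pi]

print(polar_edge_feature((0.0, 0.0), np.pi / 2, (3.0, 4.0)))
```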


优化|敛散性(3篇)

【1】 A Screening Strategy for Structured Optimization Involving Nonconvex  $\ell_{q,p}$ Regularization
标题: 非凸$\ell_{q,p}$正则化结构优化的筛选策略
链接:https://arxiv.org/abs/2208.02161

作者:Tiange Li,Xiangyu Yang,Hao Wang
机构:School of Information Science and Technology, ShanghaiTech University, Shanghai, China
摘要:本文提出了一种简单而有效的筛选规则策略,以提高求解含非凸$\ell_{q,p}$正则化的结构优化问题的计算效率。基于迭代重加权$\ell_1$(IRL1)框架,所提出的筛选规则像预处理模块一样工作,在启动子问题求解器之前潜在地移除不活跃的组,从而减少总计算时间。这主要是通过在每次迭代中启发式地利用对偶子问题的信息来实现的。此外,我们证明了我们的筛选规则可以在IRL1方法的有限次迭代内移除所有不活跃变量。数值实验表明,与几种最先进的算法相比,我们的筛选规则策略是高效的。
摘要:In this paper, we develop a simple yet effective screening rule strategy to improve the computational efficiency in solving structured optimization involving nonconvex $\ell_{q,p}$ regularization. Based on an iteratively reweighted $\ell_1$ (IRL1) framework, the proposed screening rule works like a preprocessing module that potentially removes the inactive groups before starting the subproblem solver, thereby reducing the computational time in total. This is mainly achieved by heuristically exploiting the dual subproblem information during each iteration.Moreover, we prove that our screening rule can remove all inactive variables in a finite number of iterations of the IRL1 method. Numerical experiments illustrate the efficiency of our screening rule strategy compared with several state-of-the-art algorithms.


【2】 Machine learning optimization of Majorana hybrid nanowires
标题: Majorana杂化纳米线的机器学习优化
链接:https://arxiv.org/abs/2208.02182

作者:Matthias Thamm,Bernd Rosenow
机构:Institut f¨ur Theoretische Physik, Universit¨at Leipzig, Br¨uderstrasse , Leipzig, Germany, )
备注:12 pages, 13 figures
摘要:随着量子比特阵列等量子系统复杂性的增加,将昂贵的调优工作自动化日益值得。我们以强无序的Majorana线为案例,研究了使用CMA-ES算法的基于机器学习的门阵列调优。我们发现,该算法能够有效地改善拓扑特征、学习本征无序分布并完全消除无序效应。例如,仅用20个门,通过优化门电压,就有可能完全恢复被无序破坏的Majorana零模。
摘要:As the complexity of quantum systems such as quantum bit arrays increases, efforts to automate expensive tuning are increasingly worthwhile. We investigate machine learning based tuning of gate arrays using the CMA-ES algorithm for the case study of Majorana wires with strong disorder. We find that the algorithm is able to efficiently improve the topological signatures, learn intrinsic disorder profiles, and completely eliminate disorder effects. For example, with only 20 gates, it is possible to fully recover Majorana zero modes destroyed by disorder by optimizing gate voltages.
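
A hedged sketch of CMA-ES gate tuning with the `cma` package: optimize a vector of 20 gate voltages against a scalar cost. The toy quadratic cost below stands in for the topological-signature score the paper computes from the simulated wire.

```python
# CMA-ES over gate voltages (toy cost; requires `pip install cma`).
import cma
import numpy as np

target = np.random.default_rng(0).normal(size=20)   # stand-in disorder profile

def cost(gate_voltages):
    # toy objective: voltages should cancel the disorder profile
    return float(np.sum((np.asarray(gate_voltages) + target) ** 2))

best, es = cma.fmin2(cost, np.zeros(20), sigma0=0.5,
                     options={"maxfevals": 4000, "verbose": -9})
print("residual disorder:", cost(best))
```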


【3】 Optimal Rates for Regularized Conditional Mean Embedding Learning
标题: 正则化条件均值嵌入学习的最优速率
链接:https://arxiv.org/abs/2208.01711

作者:Zhu Li,Dimitri Meunier,Mattes Mollenhauer,Arthur Gretton
机构:Gatsby Computational Neuroscience Unit, University College London, Department of Mathematics and Computer Science, Freie Universität Berlin
摘要:我们研究条件均值嵌入(CME)的核岭回归估计量的一致性。CME是将给定$X$时$Y$的条件分布嵌入到目标再生核希尔伯特空间$\mathcal{H}_Y$中。CME使我们能够对目标RKHS函数取条件期望,并已被用于非参数因果推断和贝叶斯推断。我们研究模型设定错误的情形,即目标CME位于从$\mathcal{H}_X$与$L_2$之间的输入插值空间映射到$\mathcal{H}_Y$的希尔伯特-施密特算子空间中。该算子空间被证明与一个新定义的向量值插值空间同构。利用该同构,我们在错误设定情形下导出了经验CME估计量的一个新的自适应统计学习率。我们的分析表明,在不假设$\mathcal{H}_Y$为有限维的情况下,我们的学习率与最优的$O(\log n/n)$速率相匹配。我们进一步建立了学习率的下界,表明所得到的上界是最优的。
摘要:We address the consistency of a kernel ridge regression estimate of the conditional mean embedding (CME), which is an embedding of the conditional distribution of $Y$ given $X$ into a target reproducing kernel Hilbert space $\mathcal{H}_Y$. The CME allows us to take conditional expectations of target RKHS functions, and has been employed in nonparametric causal and Bayesian inference. We address the misspecified setting, where the target CME is in the space of Hilbert-Schmidt operators acting from an input interpolation space between $\mathcal{H}_X$ and $L_2$, to $\mathcal{H}_Y$. This space of operators is shown to be isomorphic to a newly defined vector-valued interpolation space. Using this isomorphism, we derive a novel and adaptive statistical learning rate for the empirical CME estimator under the misspecified setting. Our analysis reveals that our rates match the optimal $O(\log n / n)$ rates without assuming $\mathcal{H}_Y$ to be finite dimensional. We further establish a lower bound on the learning rate, which shows that the obtained upper bound is optimal.
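
The empirical CME estimator analysed here has a short numerical form: with kernel matrix $K_X$ and regularization $\lambda$, conditional expectations of a function become weighted sums with weights $(K_X + n\lambda I)^{-1} k_X(x)$. A toy check, assuming a Gaussian kernel and the conditional mean as the target function:

```python
# Empirical conditional mean embedding via kernel ridge weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 200)
Y = np.sin(3 * X) + 0.1 * rng.normal(size=200)

def k(a, b, s=0.3):                       # Gaussian kernel
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * s ** 2))

lam = 1e-3
W = np.linalg.solve(k(X, X) + 200 * lam * np.eye(200), k(X, np.array([0.5])))
f = lambda y: y                           # conditional mean as a special case
print("E[Y | X=0.5] ~", float(W[:, 0] @ f(Y)), "true:", np.sin(1.5))
```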


预测|估计(1篇)

【1】 EgPDE-Net: Building Continuous Neural Networks for Time Series  Prediction with Exogenous Variables
标题: EgPDE-Net:构建连续神经网络用于外生变量时间序列预测
链接:https://arxiv.org/abs/2208.01913

作者:Penglei Gao,Xi Yang,Kaizhu Huang,Rui Zhang,Ping Guo,John Y. Goulermas
机构: Kaizhu Huang is with the Institute of AppliedPhysical Sciences and Engineering,  Duke Kunshan University
摘要:尽管外生变量对时间序列分析的性能提升有重大影响,但现有的连续方法很少考虑它们之间的序列间相关性和时间依赖性。多变量时间序列的动力系统可以用复杂的未知偏微分方程(PDE)来建模,而PDE在许多科学和工程学科中都起着重要作用。本文提出了一种用于任意步预测的连续时间模型,以学习多变量时间序列中由自注意力和门控递归神经网络参数化的未知PDE系统。所提出的模型,即外生引导偏微分方程网络(EgPDE-Net),考虑了外生变量之间的关系及其对目标序列的影响。重要的是,该模型可以借助特别设计的正则化引导化简为一个带正则化的常微分方程(ODE)问题,这使得PDE问题易于求得数值解,并能够预测目标序列在任意时间点的多个未来值。大量实验表明,我们提出的模型相比强基线取得了有竞争力的精度:平均而言,在任意步预测上,它比最佳基线的RMSE降低了9.85%,MAE降低了13.98%。
摘要:While exogenous variables have a major impact on performance improvement in time series analysis, inter-series correlation and time dependence among them are rarely considered in the present continuous methods. The dynamical systems of multivariate time series could be modelled with complex unknown partial differential equations (PDEs) which play a prominent role in many disciplines of science and engineering. In this paper, we propose a continuous-time model for arbitrary-step prediction to learn an unknown PDE system in multivariate time series whose governing equations are parameterised by self-attention and gated recurrent neural networks. The proposed model, \underline{E}xogenous-\underline{g}uided \underline{P}artial \underline{D}ifferential \underline{E}quation Network (EgPDE-Net), takes account of the relationships among the exogenous variables and their effects on the target series. Importantly, the model can be reduced into a regularised ordinary differential equation (ODE) problem with special designed regularisation guidance, which makes the PDE problem tractable to obtain numerical solutions and feasible to predict multiple future values of the target series at arbitrary time points. Extensive experiments demonstrate that our proposed model could achieve competitive accuracy over strong baselines: on average, it outperforms the best baseline by reducing $9.85\%$ on RMSE and $13.98\%$ on MAE for arbitrary-step prediction.


Other Neural Networks | Deep Learning | Models | Modeling (13 papers)

【1】 Quantum-Inspired Tensor Neural Networks for Partial Differential  Equations
链接:https://arxiv.org/abs/2208.02235

作者:Raj Patel,Chia-Wei Hsing,Serkan Sahin,Saeed S. Jahromi,Samuel Palmer,Shivam Sharma,Christophe Michel,Vincent Porte,Mustafa Abid,Stephane Aubert,Pierre Castellani,Chi-Guhn Lee,Samuel Mugel,Roman Orus
机构:Multiverse Computing, Centre for Social Innovation, Spadina Ave, Toronto, Canada; Multiverse Computing, Paseo de Miramón, San Sebastián; Crédit Agricole, Place des Etats-Unis, Montrouge Cedex, France
备注:20 pages, 11 figures
摘要:Partial Differential Equations (PDEs) are used to model a variety of dynamical systems in science and engineering. Recent advances in deep learning have enabled us to solve them in a higher dimension by addressing the curse of dimensionality in new ways. However, deep learning methods are constrained by training time and memory. To tackle these shortcomings, we implement Tensor Neural Networks (TNN), a quantum-inspired neural network architecture that leverages Tensor Network ideas to improve upon deep learning approaches. We demonstrate that TNN provide significant parameter savings while attaining the same accuracy as compared to the classical Dense Neural Network (DNN). In addition, we also show how TNN can be trained faster than DNN for the same accuracy. We benchmark TNN by applying them to solve parabolic PDEs, specifically the Black-Scholes-Barenblatt equation, widely used in financial pricing theory, empirically showing the advantages of TNN over DNN. Further examples, such as the Hamilton-Jacobi-Bellman equation, are also discussed.
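
As a rough illustration of where the parameter savings come from, the sketch below replaces a dense weight matrix with a low-rank factorisation, the simplest relative of the tensor-network layers in the paper (which use tensor-train-style decompositions rather than a single matrix factorisation).

```python
import numpy as np

d_in, d_out, rank = 1024, 1024, 32
dense_params = d_in * d_out              # full weight matrix
factored_params = rank * (d_in + d_out)  # W ~ U @ V with U: d_out x r, V: r x d_in

rng = np.random.default_rng(0)
U = rng.standard_normal((d_out, rank)) / np.sqrt(rank)
V = rng.standard_normal((rank, d_in)) / np.sqrt(d_in)
x = rng.standard_normal(d_in)

y = U @ (V @ x)                          # forward pass of the factorised layer
print(f"dense: {dense_params:,} params, factorised: {factored_params:,} "
      f"({dense_params / factored_params:.0f}x fewer)")
```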


【2】 Sequence Model Imitation Learning with Unobserved Contexts
链接:https://arxiv.org/abs/2208.02225

作者:Gokul Swamy,Sanjiban Choudhury,J. Andrew Bagnell,Zhiwei Steven Wu
机构:Cornell University, Aurora Innovation and Carnegie Mellon University
摘要:We consider imitation learning problems where the expert has access to a per-episode context that is hidden from the learner, both in the demonstrations and at test-time. While the learner might not be able to accurately reproduce expert behavior early on in an episode, by considering the entire history of states and actions, they might be able to eventually identify the context and act as the expert would. We prove that on-policy imitation learning algorithms (with or without access to a queryable expert) are better equipped to handle these sorts of asymptotically realizable problems than off-policy methods and are able to avoid the latching behavior (naive repetition of past actions) that plagues the latter. We conduct experiments in a toy bandit domain that show that there exist sharp phase transitions of whether off-policy approaches are able to match expert performance asymptotically, in contrast to the uniformly good performance of on-policy approaches. We demonstrate that on several continuous control tasks, on-policy approaches are able to use history to identify the context while off-policy approaches actually perform worse when given access to history.


【3】 SpanDrop: Simple and Effective Counterfactual Learning for Long  Sequences
链接:https://arxiv.org/abs/2208.02169

作者:Peng Qi,Guangtao Wang,Jing Huang
机构: AWS AI, Seattle, WA,  TikTok, Mountain View, CA,  Alexa AI, Sunnyvale, CA
备注:Peng Qi and Guangtao Wang contributed equally
摘要:Distilling supervision signal from a long sequence to make predictions is a challenging task in machine learning, especially when not all elements in the input sequence contribute equally to the desired output. In this paper, we propose SpanDrop, a simple and effective data augmentation technique that helps models identify the true supervision signal in a long sequence with very few examples. By directly manipulating the input sequence, SpanDrop randomly ablates parts of the sequence at a time and asks the model to perform the same task to emulate counterfactual learning and achieve input attribution. Based on theoretical analysis of its properties, we also propose a variant of SpanDrop based on the beta-Bernoulli distribution, which yields diverse augmented sequences while providing a learning objective that is more consistent with the original dataset. We demonstrate the effectiveness of SpanDrop on a set of carefully designed toy tasks, as well as various natural language processing tasks that require reasoning over long sequences to arrive at the correct answer, and show that it helps models improve performance both when data is scarce and abundant.
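
The augmentation itself is easy to reproduce. The following is an unofficial minimal sketch of span ablation with both a fixed drop rate and the beta-Bernoulli variant; the span length and hyperparameters are illustrative guesses, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def span_drop(tokens, span_len=5, p_drop=0.15, beta=None):
    """Randomly ablate contiguous spans; if beta=(a, b) is given, draw the
    drop probability from a Beta prior (the beta-Bernoulli variant)."""
    if beta is not None:
        p_drop = rng.beta(*beta)
    spans = [tokens[i:i + span_len] for i in range(0, len(tokens), span_len)]
    kept = [s for s in spans if rng.random() >= p_drop]
    if not kept:                              # never drop the entire sequence
        kept = [spans[rng.integers(len(spans))]]
    return [t for s in kept for t in s]

sentence = list(range(40))                    # stand-in token ids
print(span_drop(sentence))                    # fixed drop rate
print(span_drop(sentence, beta=(2.0, 10.0)))  # beta-Bernoulli drop rate
```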


【4】 Noise tolerance of learning to rank under class-conditional label noise
链接:https://arxiv.org/abs/2208.02126

作者:Dany Haddad
摘要:Often, the data used to train ranking models is subject to label noise. For example, in web-search, labels created from clickstream data are noisy due to issues such as insufficient information in item descriptions on the SERP, query reformulation by the user, and erratic or unexpected user behavior. In practice, it is difficult to handle label noise without making strong assumptions about the label generation process. As a result, practitioners typically train their learning-to-rank (LtR) models directly on this noisy data without additional consideration of the label noise. Surprisingly, we often see strong performance from LtR models trained in this way. In this work, we describe a class of noise-tolerant LtR losses for which empirical risk minimization is a consistent procedure, even in the context of class-conditional label noise. We also develop noise-tolerant analogs of commonly used loss functions. The practical implications of our theoretical findings are further supported by experimental results.


【5】 Gradient descent provably escapes saddle points in the training of  shallow ReLU networks
链接:https://arxiv.org/abs/2208.02083

作者:Patrick Cheridito,Arnulf Jentzen,Florian Rossmannek
机构:Department of Mathematics, ETH Zurich; School of Data Science and Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong; Institute for Analysis and Numerics, University of Münster
摘要:Dynamical systems theory has recently been applied in optimization to prove that gradient descent algorithms avoid so-called strict saddle points of the loss function. However, in many modern machine learning applications, the required regularity conditions are not satisfied. In particular, this is the case for rectified linear unit (ReLU) networks. In this paper, we prove a variant of the relevant dynamical systems result, a center-stable manifold theorem, in which we relax some of the regularity requirements. Then, we verify that shallow ReLU networks fit into the new framework. Building on a classification of critical points of the square integral loss of shallow ReLU networks measured against an affine target function, we deduce that gradient descent avoids most saddle points. We proceed to prove convergence to global minima if the initialization is sufficiently good, which is expressed by an explicit threshold on the limiting loss.


【6】 Efficient Fine-Tuning of Compressed Language Models with Learners
链接:https://arxiv.org/abs/2208.02070

作者:Danilo Vucetic,Mohammadreza Tayaranian,Maryam Ziaeefard,James J. Clark,Brett H. Meyer,Warren J. Gross
机构:Department of Electrical and Computer Engineering, McGill University
备注:8 pages, 9 figures, 2 tables, presented at ICML 2022 workshop on Hardware-Aware Efficient Training (HAET 2022)
摘要:Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many prior works aim to improve inference efficiency via compression techniques, e.g., pruning, these works do not explicitly address the computational challenges of training to downstream tasks. We introduce Learner modules and priming, novel methods for fine-tuning that exploit the overparameterization of pre-trained language models to gain benefits in convergence speed and resource utilization. Learner modules navigate the double bind of 1) training efficiently by fine-tuning a subset of parameters, and 2) training effectively by ensuring quick convergence and high metric scores. Our results on DistilBERT demonstrate that learners perform on par with or surpass the baselines. Learners train 7x fewer parameters than state-of-the-art methods on GLUE. On CoLA, learners fine-tune 20% faster, and have significantly lower resource utilization.
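
The general recipe of fine-tuning only a small subset of parameters can be sketched as follows; the frozen random encoder and the small trainable head are stand-ins and do not reproduce the paper's Learner modules or its priming schedule.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained encoder (e.g., DistilBERT); real weights would be
# loaded from a checkpoint rather than randomly initialised.
encoder = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 768))
for p in encoder.parameters():
    p.requires_grad = False                   # freeze the pretrained body

learner_head = nn.Sequential(                 # small trainable module
    nn.Linear(768, 64), nn.ReLU(), nn.Linear(64, 2)
)

opt = torch.optim.AdamW(learner_head.parameters(), lr=3e-4)
x, y = torch.randn(32, 768), torch.randint(0, 2, (32,))

for step in range(100):
    with torch.no_grad():
        h = encoder(x)                        # frozen features
    loss = nn.functional.cross_entropy(learner_head(h), y)
    opt.zero_grad(); loss.backward(); opt.step()

trainable = sum(p.numel() for p in learner_head.parameters())
total = trainable + sum(p.numel() for p in encoder.parameters())
print(f"training {trainable:,} of {total:,} parameters")
```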


【7】 Centroids Matching: an efficient Continual Learning approach operating  in the embedding space
链接:https://arxiv.org/abs/2208.02048

作者:Jary Pomponi,Simone Scardapane,Aurelio Uncini
机构:Department of Information Engineering, Sapienza University of Rome, Italy
摘要:Catastrophic forgetting (CF) occurs when a neural network loses the information previously learned while training on a set of samples from a different distribution, i.e., a new task. Existing approaches have achieved remarkable results in mitigating CF, especially in a scenario called task incremental learning. However, this scenario is not realistic, and limited work has been done to achieve good results on more realistic scenarios. In this paper, we propose a novel regularization method called Centroids Matching, that, inspired by meta-learning approaches, fights CF by operating in the feature space produced by the neural network, achieving good results while requiring a small memory footprint. Specifically, the approach classifies the samples directly using the feature vectors produced by the neural network, by matching those vectors with the centroids representing the classes from the current task, or all the tasks up to that point. Centroids Matching is faster than competing baselines, and it can be exploited to efficiently mitigate CF, by preserving the distances between the embedding space produced by the model when past tasks were over, and the one currently produced, leading to a method that achieves high accuracy on all the tasks, without using an external memory when operating on easy scenarios, or using a small one for more realistic ones. Extensive experiments demonstrate that Centroids Matching achieves accuracy gains on multiple datasets and scenarios.
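
The classification rule is simple to sketch: embeddings are matched to class centroids, and the negative distances act as logits. The toy setup below is illustrative, not the authors' implementation; in particular the continual-learning bookkeeping of per-task centroids is omitted.

```python
import torch
import torch.nn.functional as F

def centroid_logits(embeddings, centroids, temperature=1.0):
    # Negative squared distances to the class centroids act as logits.
    d2 = torch.cdist(embeddings, centroids) ** 2
    return -d2 / temperature

torch.manual_seed(0)
net = torch.nn.Linear(8, 16)                  # toy encoder into a 16-d embedding space
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x = torch.randn(64, 8)
y = torch.randint(0, 2, (64,))

for step in range(200):
    z = net(x)
    # class centroids recomputed from the current batch embeddings; detached so
    # gradients flow only through the sample embeddings
    centroids = torch.stack([z[y == c].mean(0) for c in (0, 1)])
    loss = F.cross_entropy(centroid_logits(z, centroids.detach()), y)
    opt.zero_grad(); loss.backward(); opt.step()

z = net(x)
centroids = torch.stack([z[y == c].mean(0) for c in (0, 1)])
acc = (centroid_logits(z, centroids).argmax(1) == y).float().mean()
print("train accuracy:", acc.item())
```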


【8】 BPMN4sML: A BPMN Extension for Serverless Machine Learning. Technology  Independent and Interoperable Modeling of Machine Learning Workflows and  their Serverless Deployment Orchestration
链接:https://arxiv.org/abs/2208.02030

作者:Laurens Martin Tetzlaff
机构:Jheronimus Academy of Data Science
备注:105 pages 3 tables 33 figures
摘要:Machine learning (ML) continues to permeate all layers of academia, industry and society. Despite its successes, mental frameworks to capture and represent machine learning workflows in a consistent and coherent manner are lacking. For instance, the de facto process modeling standard, Business Process Model and Notation (BPMN), managed by the Object Management Group, is widely accepted and applied. However, it is short of specific support to represent machine learning workflows. Further, the number of heterogeneous tools for deployment of machine learning solutions can easily overwhelm practitioners. Research is needed to align the process from modeling to deploying ML workflows.  We analyze requirements for standard based conceptual modeling for machine learning workflows and their serverless deployment. Confronting the shortcomings with respect to consistent and coherent modeling of ML workflows in a technology independent and interoperable manner, we extend BPMN's Meta-Object Facility (MOF) metamodel and the corresponding notation and introduce BPMN4sML (BPMN for serverless machine learning). Our extension BPMN4sML follows the same outline referenced by the Object Management Group (OMG) for BPMN. We further address the heterogeneity in deployment by proposing a conceptual mapping to convert BPMN4sML models to corresponding deployment models using TOSCA.  BPMN4sML allows technology-independent and interoperable modeling of machine learning workflows of various granularity and complexity across the entire machine learning lifecycle. It aids in arriving at a shared and standardized language to communicate ML solutions. Moreover, it takes the first steps toward enabling conversion of ML workflow model diagrams to corresponding deployment models for serverless deployment via TOSCA.


【9】 Learning Object Manipulation Skills from Video via Approximate  Differentiable Physics
链接:https://arxiv.org/abs/2208.01960

作者:Vladimir Petrik,Mohammad Nomaan Qureshi,Josef Sivic,Makarand Tapaswi
机构:Czech Technical University in Prague
备注:Accepted for IROS2022, code at this https URL, project page at this https URL
摘要:We aim to teach robots to perform simple object manipulation tasks by watching a single video demonstration. Towards this goal, we propose an optimization approach that outputs a coarse and temporally evolving 3D scene to mimic the action demonstrated in the input video. Similar to previous work, a differentiable renderer ensures perceptual fidelity between the 3D scene and the 2D video. Our key novelty lies in the inclusion of a differentiable approach to solve a set of Ordinary Differential Equations (ODEs) that allows us to approximately model laws of physics such as gravity, friction, and hand-object or object-object interactions. This not only enables us to dramatically improve the quality of estimated hand and object states, but also produces physically admissible trajectories that can be directly translated to a robot without the need for costly reinforcement learning. We evaluate our approach on a 3D reconstruction task that consists of 54 video demonstrations sourced from 9 actions such as pull something from right to left or put something in front of something. Our approach improves over previous state-of-the-art by almost 30%, demonstrating superior quality on especially challenging actions involving physical interactions of two objects such as put something onto something. Finally, we showcase the learned skills on a Franka Emika Panda robot.
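
A toy example conveys the core idea of differentiating through an approximate physics model: below, a friction coefficient is recovered by gradient descent through an Euler rollout with gravity and friction. This is a deliberately tiny stand-in for the paper's ODE-based hand-object and object-object interaction model.

```python
import torch

def simulate(mu, v0, steps=50, dt=0.05, g=9.81):
    """Differentiable Euler rollout of a block sliding with kinetic friction."""
    x, v = torch.zeros(()), v0
    xs = []
    for _ in range(steps):
        a = -mu * g * torch.sign(v)           # friction deceleration
        v = torch.clamp(v + a * dt, min=0.0)  # the block does not reverse direction
        x = x + v * dt
        xs.append(x)
    return torch.stack(xs)

target = simulate(torch.tensor(0.30), torch.tensor(2.0))  # "observed" trajectory
mu = torch.tensor(0.10, requires_grad=True)               # unknown friction coefficient
opt = torch.optim.Adam([mu], lr=0.02)

for step in range(300):
    loss = ((simulate(mu, torch.tensor(2.0)) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print("recovered mu:", mu.item())                         # close to 0.30
```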


【10】 A Deep Learning Approach to Detect Lean Blowout in Combustion Systems
链接:https://arxiv.org/abs/2208.01871

作者:Tryambak Gangopadhyay,Somnath De,Qisai Liu,Achintya Mukhopadhyay,Swarnendu Sen,Soumik Sarkar
机构:Amazon Web Services AI, Amazon, Santa Clara, CA, USA., Department of Mechanical Engineering, Iowa State University, Ames, IA, USA, Department of Mechanical Engineering, Jadavpur University, Kolkata, India
摘要 :Lean combustion is environment friendly with low NOx emissions and also provides better fuel efficiency in a combustion system. However, approaching towards lean combustion can make engines more susceptible to lean blowout. Lean blowout (LBO) is an undesirable phenomenon that can cause sudden flame extinction leading to sudden loss of power. During the design stage, it is quite challenging for the scientists to accurately determine the optimal operating limits to avoid sudden LBO occurrence. Therefore, it is crucial to develop accurate and computationally tractable frameworks for online LBO detection in low NOx emission engines. To the best of our knowledge, for the first time, we propose a deep learning approach to detect lean blowout in combustion systems. In this work, we utilize a laboratory-scale combustor to collect data for different protocols. We start far from LBO for each protocol and gradually move towards the LBO regime, capturing a quasi-static time series dataset at each condition. Using one of the protocols in our dataset as the reference protocol and with conditions annotated by domain experts, we find a transition state metric for our trained deep learning model to detect LBO in the other test protocols. We find that our proposed approach is more accurate and computationally faster than other baseline models to detect the transitions to LBO. Therefore, we recommend this method for real-time performance monitoring in lean combustion engines.


【11】 Cross-Modal Alignment Learning of Vision-Language Conceptual Systems
链接:https://arxiv.org/abs/2208.01744

作者:Taehyeong Kim,Hyeonseop Song,Byoung-Tak Zhang
机构:LG Electronics,Seoul National University
备注:19 pages, 4 figures
摘要:Human infants learn the names of objects and develop their own conceptual systems without explicit supervision. In this study, we propose methods for learning aligned vision-language conceptual systems inspired by infants' word learning mechanisms. The proposed model learns the associations of visual objects and words online and gradually constructs cross-modal relational graph networks. Additionally, we also propose an aligned cross-modal representation learning method that learns semantic representations of visual objects and words in a self-supervised manner based on the cross-modal relational graph networks. It allows entities of different modalities with conceptually the same meaning to have similar semantic representation vectors. We quantitatively and qualitatively evaluate our method, including object-to-word mapping and zero-shot learning tasks, showing that the proposed model significantly outperforms the baselines and that each conceptual system is topologically aligned.


【12】 A Study of Modeling Rising Intonation in Cantonese Neural Speech  Synthesis
链接:https://arxiv.org/abs/2208.02189

作者:Qibing Bai,Tom Ko,Yu Zhang
机构:Department of Computer Science and Engineering, Southern University of Science and Technology, ByteDance AI Lab
备注:Accepted by INTERSPEECH 2022
摘要:In human speech, the attitude of a speaker cannot be fully expressed only by the textual content. It has to come along with the intonation. Declarative questions are commonly used in daily Cantonese conversations, and they are usually uttered with rising intonation. Vanilla neural text-to-speech (TTS) systems are not capable of synthesizing rising intonation for these sentences due to the loss of semantic information. Though it has become more common to complement the systems with extra language models, their performance in modeling rising intonation is not well studied. In this paper, we propose to complement the Cantonese TTS model with a BERT-based statement/question classifier. We design different training strategies and compare their performance. We conduct our experiments on a Cantonese corpus named CanTTS. Empirical results show that the separate training approach obtains the best generalization performance and feasibility.


【13】 Conv-NILM-Net, a causal and multi-appliance model for energy source  separation
链接:https://arxiv.org/abs/2208.02173

作者:Mohamed Alami C.,Jérémie Decock,Rim Kaddah,Jesse Read
机构: Ecole Polytechnique, Palaiseau, France,  Accenta, Paris, France,  IRT SystemX, Palaiseau, France
备注:Published in ECMLPKDD 2022, MLBEM workshop
摘要:Non-Intrusive Load Monitoring (NILM) seeks to save energy by estimating individual appliance power usage from a single aggregate measurement. Deep neural networks have become increasingly popular in attempting to solve NILM problems. However, most existing models are used for Load Identification rather than online Source Separation. Among source separation models, most use a single-task learning approach in which a neural network is trained exclusively for each appliance. This strategy is computationally expensive and ignores the fact that multiple appliances can be active simultaneously and dependencies between them. The rest of the models are not causal, which is important for real-time application. Inspired by Conv-TasNet, a model for speech separation, we propose Conv-NILM-net, a fully convolutional framework for end-to-end NILM. Conv-NILM-net is a causal model for multi-appliance source separation. Our model is tested on two real datasets REDD and UK-DALE and clearly outperforms the state of the art while keeping a significantly smaller size than the competing models.
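
Causality is enforced in such models by left-padded (causal) convolutions, which can be sketched as follows; the three-layer toy network and its channel sizes are illustrative, not the Conv-NILM-net architecture.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1-D convolution that only looks at past samples (left padding)."""
    def __init__(self, c_in, c_out, kernel_size, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(c_in, c_out, kernel_size, dilation=dilation)

    def forward(self, x):                        # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))  # pad on the left only
        return self.conv(x)

# toy separator: aggregate mains signal in, per-appliance signals out
net = nn.Sequential(
    CausalConv1d(1, 16, kernel_size=5), nn.ReLU(),
    CausalConv1d(16, 16, kernel_size=5, dilation=2), nn.ReLU(),
    CausalConv1d(16, 3, kernel_size=1),          # 3 appliances
)
aggregate = torch.randn(8, 1, 256)               # (batch, 1, time) mains reading
print(net(aggregate).shape)                      # -> torch.Size([8, 3, 256])
```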


Other (14 papers)

【1】 SGEM: stochastic gradient with energy and momentum
链接:https://arxiv.org/abs/2208.02208

作者:Hailiang Liu,Xuping Tian
备注:24 pages, 4 figures
摘要:In this paper, we propose SGEM, Stochastic Gradient with Energy and Momentum, to solve a large class of general non-convex stochastic optimization problems, based on the AEGD method that originated in the work [AEGD: Adaptive Gradient Descent with Energy. arXiv: 2010.05109]. SGEM incorporates both energy and momentum at the same time so as to inherit their dual advantages. We show that SGEM features an unconditional energy stability property, and derive energy-dependent convergence rates in the general nonconvex stochastic setting, as well as a regret bound in the online convex setting. A lower threshold for the energy variable is also provided. Our experimental results show that SGEM converges faster than AEGD and generalizes better or at least as well as SGDM in training some deep neural networks.
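
A schematic of how an energy variable and momentum can be combined in an SGD-style update is sketched below on a toy nonconvex loss. This only follows the general shape of AEGD-style updates under simplifying assumptions; the exact SGEM update rules should be taken from the paper.

```python
import numpy as np

def f(x):  return 0.5 * np.sum(x ** 2) + 0.1 * np.sum(np.cos(5 * x))  # toy nonconvex loss
def df(x): return x - 0.5 * np.sin(5 * x)                             # its gradient

eta, beta, c = 0.1, 0.9, 1.0
x = np.array([2.0, -1.5])
r = np.full_like(x, np.sqrt(f(x) + c))      # energy variable, one entry per coordinate
m = np.zeros_like(x)                        # momentum buffer

for k in range(500):
    gv = df(x) / (2.0 * np.sqrt(f(x) + c))  # gradient of v(x) = sqrt(f(x) + c)
    m = beta * m + (1.0 - beta) * gv        # momentum on the transformed gradient
    r = r / (1.0 + 2.0 * eta * m * gv)      # schematic energy update (AEGD uses gv**2 here)
    x = x - 2.0 * eta * r * m               # energy-scaled descent step

print("x:", x, "f(x):", f(x))
```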


【2】 Robots with Different Embodiments Can Express and Influence Carefulness  in Object Manipulation
链接:https://arxiv.org/abs/2208.02058

作者:Linda Lastrico,Luca Garello,Francesco Rea,Nicoletta Noceti,Fulvio Mastrogiovanni,Alessandra Sciutti,Alessandro Carfì
机构:†Department of Informatics, Bioengineering, Robotics, and Systems Engineering (DIBRIS), University of Genoa, Italy, §Robotics, Brain and Cognitive Science Department (RBCS), Italian Institute of Technology, Genoa, Italy
备注:Accepted for publication in the Proceedings of the IEEE International Conference on Development and Learning (ICDL) 2022 - 12th ICDL
摘要:Humans have an extraordinary ability to communicate and read the properties of objects by simply watching them being carried by someone else. This level of communicative skills and interpretation, available to humans, is essential for collaborative robots if they are to interact naturally and effectively. For example, suppose a robot is handing over a fragile object. In that case, the human who receives it should be informed of its fragility in advance, through an immediate and implicit message, i.e., by the direct modulation of the robot's action. This work investigates the perception of object manipulations performed with a communicative intent by two robots with different embodiments (an iCub humanoid robot and a Baxter robot). We designed the robots' movements to communicate carefulness or not during the transportation of objects. We found that not only this feature is correctly perceived by human observers, but it can elicit as well a form of motor adaptation in subsequent human object manipulations. In addition, we get an insight into which motion features may induce to manipulate an object more or less carefully.


【3】 OLLIE: Derivation-based Tensor Program Optimizer
链接:https://arxiv.org/abs/2208.02025

作者:Liyan Zheng,Haojie Wang,Jidong Zhai,Muyan Hu,Zixuan Ma,Tuowei Wang,Shizhi Tang,Lei Xie,Kezhao Huang,Zhihao Jia
机构:†Tsinghua University, ‡Carnegie Mellon University
摘要:Boosting the runtime performance of deep neural networks (DNNs) is critical due to their wide adoption in real-world tasks. Existing approaches to optimizing the tensor algebra expression of a DNN only consider expressions representable by a fixed set of predefined operators, missing possible optimization opportunities between general expressions. We propose OLLIE, the first derivation-based tensor program optimizer. OLLIE optimizes tensor programs by leveraging transformations between general tensor algebra expressions, enabling a significantly larger expression search space that includes those supported by prior work as special cases. OLLIE uses a hybrid derivation-based optimizer that effectively combines explorative and guided derivations to quickly discover highly optimized expressions. Evaluation on seven DNNs shows that OLLIE can outperform existing optimizers by up to 2.73$\times$ (1.46$\times$ on average) on an A100 GPU and up to 2.68$\times$ (1.51$\times$) on a V100 GPU, respectively.


【4】 Neural Nets with a Newton Conjugate Gradient Method on Multiple GPUs
链接:https://arxiv.org/abs/2208.02017

作者:Severin Reiz,Tobias Neckel,Hans-Joachim Bungartz
机构:Technical University of Munich (TUM), School of Computation, Information and, Technology
摘要:Training deep neural networks consumes increasing computational resource shares in many compute centers. Often, a brute force approach to obtain hyperparameter values is employed. Our goal is (1) to enhance this by enabling second-order optimization methods with fewer hyperparameters for large-scale neural networks and (2) to perform a survey of the performance optimizers for specific tasks to suggest to users the best one for their problem. We introduce a novel second-order optimization method that requires the effect of the Hessian on a vector only and avoids the huge cost of explicitly setting up the Hessian for large-scale networks.  We compare the proposed second-order method with two state-of-the-art optimizers on five representative neural network problems, including regression and very deep networks from computer vision or variational autoencoders. For the largest setup, we efficiently parallelized the optimizers with Horovod and applied them to an 8-GPU NVIDIA P100 (DGX-1) machine.
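
The key ingredient, a Hessian-vector product obtained from automatic differentiation without ever forming the Hessian (Pearlmutter's trick), is standard and easy to demonstrate. The sketch below runs one Newton-CG step on a toy quadratic; it has no damping or line search and is not the authors' implementation.

```python
import torch

torch.manual_seed(0)
w = torch.randn(20, requires_grad=True)
A = torch.randn(20, 20)
A = A @ A.T / 20 + torch.eye(20)           # SPD quadratic for the demo
b = torch.randn(20)

def loss_fn(w):
    return 0.5 * w @ A @ w - b @ w

def hvp(v):
    # Hessian-vector product via double backward; the Hessian is never built.
    g = torch.autograd.grad(loss_fn(w), w, create_graph=True)[0]
    return torch.autograd.grad(g @ v, w)[0]

def conjugate_gradient(mv, rhs, iters=50, tol=1e-8):
    x = torch.zeros_like(rhs); r = rhs.clone(); p = r.clone()
    rs = r @ r
    for _ in range(iters):
        Ap = mv(p); alpha = rs / (p @ Ap)
        x = x + alpha * p; r = r - alpha * Ap
        rs_new = r @ r
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs) * p; rs = rs_new
    return x

grad = torch.autograd.grad(loss_fn(w), w)[0]
step = conjugate_gradient(hvp, grad)       # solve H step = grad with CG
with torch.no_grad():
    w -= step                              # one (full) Newton step
print("grad norm after step:", torch.autograd.grad(loss_fn(w), w)[0].norm().item())
```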


【5】 Equivariant Disentangled Transformation for Domain Generalization under  Combination Shift
链接:https://arxiv.org/abs/2208.02011

作者:Yivan Zhang,Jindong Wang,Xing Xie,Masashi Sugiyama
机构:The University of Tokyo, RIKEN AIP, Microsoft Research Asia
摘要:Machine learning systems may encounter unexpected problems when the data distribution changes in the deployment environment. A major reason is that certain combinations of domains and labels are not observed during training but appear in the test environment. Although various invariance-based algorithms can be applied, we find that the performance gain is often marginal. To formally analyze this issue, we provide a unique algebraic formulation of the combination shift problem based on the concepts of homomorphism, equivariance, and a refined definition of disentanglement. The algebraic requirements naturally derive a simple yet effective method, referred to as equivariant disentangled transformation (EDT), which augments the data based on the algebraic structures of labels and makes the transformation satisfy the equivariance and disentanglement requirements. Experimental results demonstrate that invariance may be insufficient, and it is important to exploit the equivariance structure in the combination shift problem.


【6】 Vision-Based Safety System for Barrierless Human-Robot Collaboration
链接:https://arxiv.org/abs/2208.02010

作者:Lina María Amaya-Mejía,Nicolás Duque-Suárez,Daniel Jaramillo-Ramírez,Carol Martinez
机构:Pontificia Universidad Javeriana; University of Luxembourg
备注:Accepted for publication at the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
摘要:Human safety has always been the main priority when working near an industrial robot. With the rise of Human-Robot Collaborative environments, physical barriers to avoiding collisions have been disappearing, increasing the risk of accidents and the need for solutions that ensure a safe Human-Robot Collaboration. This paper proposes a safety system that implements Speed and Separation Monitoring (SSM) type of operation. For this, safety zones are defined in the robot's workspace following current standards for industrial collaborative robots. A deep learning-based computer vision system detects, tracks, and estimates the 3D position of operators close to the robot. The robot control system receives the operator's 3D position and generates 3D representations of them in a simulation environment. Depending on the zone where the closest operator was detected, the robot stops or changes its operating speed. Three different operation modes in which the human and robot interact are presented. Results show that the vision-based system can correctly detect and classify in which safety zone an operator is located and that the different proposed operation modes ensure that the robot's reaction and stop time are within the required time limits to guarantee safety.
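
The speed-and-separation logic reduces to mapping the closest operator's distance to an operating mode. A toy version follows; the thresholds and speed scales are invented for illustration and are not the values used in the paper or mandated by safety standards.

```python
from dataclasses import dataclass

@dataclass
class Zone:
    name: str
    min_distance_m: float  # zone applies when the operator is at least this far away
    speed_scale: float     # fraction of nominal robot speed

ZONES = [
    Zone("free",          2.0, 1.00),
    Zone("collaborative", 1.0, 0.30),
    Zone("stop",          0.0, 0.00),
]

def operating_mode(operator_positions, robot_position):
    """Pick the zone for the closest detected operator (3D positions in metres)."""
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, robot_position)) ** 0.5
    closest = min((dist(p) for p in operator_positions), default=float("inf"))
    for zone in ZONES:
        if closest >= zone.min_distance_m:
            return zone, closest
    return ZONES[-1], closest

zone, d = operating_mode([(1.2, 0.4, 0.0), (3.0, 2.0, 0.0)], (0.0, 0.0, 0.0))
print(zone.name, f"{d:.2f} m -> speed x{zone.speed_scale}")
```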


【7】 Maintaining Performance with Less Data
链接:https://arxiv.org/abs/2208.02007

作者:Dominic Sanderson,Tatiana Kalgonova
机构:Department of Electronic and Computer Engineering, Brunel University London, London, United Kingdom
备注:12 pages, 8 figures, 11 tables
摘要:We propose a novel method for training a neural network for image classification to reduce input data dynamically, in order to reduce the costs of training a neural network model. As Deep Learning tasks become more popular, their computational complexity increases, leading to more intricate algorithms and models which have longer runtimes and require more input data. The result is a greater cost on time, hardware, and environmental resources. By using data reduction techniques, we reduce the amount of work performed, and therefore the environmental impact of AI techniques, and with dynamic data reduction we show that accuracy may be maintained while reducing runtime by up to 50%, and reducing carbon emission proportionally.


【8】 Flow Annealed Importance Sampling Bootstrap
链接:https://arxiv.org/abs/2208.01893

作者:Laurence Illing Midgley,Vincent Stimper,Gregor N. C. Simm,Bernhard Schölkopf,José Miguel Hernández-Lobato
机构:Max Planck Institute for Intelligent Systems; University of Cambridge
摘要:Normalizing flows are tractable density models that can approximate complicated target distributions, e.g. Boltzmann distributions of physical systems. However, current methods for training flows either suffer from mode-seeking behavior, use samples from the target generated beforehand by expensive MCMC simulations, or use stochastic losses that have very high variance. To avoid these problems, we augment flows with annealed importance sampling (AIS) and minimize the mass covering $\alpha$-divergence with $\alpha=2$, which minimizes importance weight variance. Our method, Flow AIS Bootstrap (FAB), uses AIS to generate samples in regions where the flow is a poor approximation of the target, facilitating the discovery of new modes. We target with AIS the minimum variance distribution for the estimation of the $\alpha$-divergence via importance sampling. We also use a prioritized buffer to store and reuse AIS samples. These two features significantly improve FAB's performance. We apply FAB to complex multimodal targets and show that we can approximate them very accurately where previous methods fail. To the best of our knowledge, we are the first to learn the Boltzmann distribution of the alanine dipeptide molecule using only the unnormalized target density and without access to samples generated via Molecular Dynamics (MD) simulations: FAB produces better results than training via maximum likelihood on MD samples while using 100 times fewer target evaluations. After reweighting samples with importance weights, we obtain unbiased histograms of dihedral angles that are almost identical to the ground truth ones.


【9】 Exploring Generative Neural Temporal Point Process
链接:https://arxiv.org/abs/2208.01874

作者:Haitao Lin,Lirong Wu,Guojiang Zhao,Pai Liu,Stan Z. Li
机构:CAIRI, Westlake University, Zhejiang University, Carnegie Mellon University, School of Engineering, Westlake University
摘要:时间点过程(TPP)通常用于描述具有发生时间戳的异步事件序列,并通过基于历史影响的概率模型来揭示。   虽然许多先前的工作集中在TPP模型的'拟合优度'上,通过最大化似然,但它们的预测性能并不令人满意,这意味着模型产生的时间戳与真实观测相差很远。   近年来,深度生成模型如去噪扩散模型和得分匹配模型在图像生成方面取得了很大的进展,证明了它们能够生成高质量的样本。   然而,目前还没有一个完整的、统一的工作来探索和研究生成模型在事件发生建模中的潜力。   在本文中,我们试图通过设计一个统一的生成框架来弥补这一空白,以探索其可行性和有效性,并进一步提高模型的预测性能。   此外,在历史影响的度量方面,考虑到事件的类型关系和时间间隔,我们使用一个自适应的重新加权项来修正概括历史事件影响的注意力模型。   大量的实验表明,{GNTPP}的预测能力得到了提高,并使用了一系列生成概率解码器,以及改进的注意力带来的性能增益。   据我们所知,这是第一部在一个完整的统一框架内调整生成模型并研究其在TPP背景下有效性的作品。   我们的代码库(包括第5.1.1节中给出的所有方法)在\url{https://github.com/BIRD-TAO/GNTPP},希望该代码框架能为神经TPPs的进一步研究提供帮助。
摘要 :Temporal point process (TPP) is commonly used to model the asynchronous event sequence featuring occurrence timestamps and revealed by probabilistic models conditioned on historical impacts.  While lots of previous works have focused on `goodness-of-fit' of TPP models by maximizing the likelihood, their predictive performance is unsatisfactory, which means the timestamps generated by models are far apart from true observations.  Recently, deep generative models such as denoising diffusion and score matching models have achieved great progress in image generating tasks by demonstrating their capability of generating samples of high quality.  However, there are no complete and unified works exploring and studying the potential of generative models in the context of event occurence modeling for TPP.  In this work, we try to fill the gap by designing a unified \textbf{g}enerative framework for \textbf{n}eural \textbf{t}emporal \textbf{p}oint \textbf{p}rocess (\textsc{GNTPP}) model to explore their feasibility and effectiveness, and further improve models' predictive performance.  Besides, in terms of measuring the historical impacts, we revise the attentive models which summarize influence from historical events with an adaptive reweighting term considering events' type relation and time intervals.  Extensive experiments have been conducted to illustrate the improved predictive capability of \textsc{GNTPP} with a line of generative probabilistic decoders, and performance gain from the revised attention.  To the best of our knowledge, this is the first work that adapts generative models in a complete unified framework and studies their effectiveness in the context of TPP.  Our codebase including all the methods given in Section.5.1.1 is open in \url{https://github.com/BIRD-TAO/GNTPP}. We hope the code framework can facilitate future research in Neural TPPs.


【10】 The Power and Limitation of Pretraining-Finetuning for Linear Regression  under Covariate Shift
链接:https://arxiv.org/abs/2208.01857

作者:Jingfeng Wu,Difan Zou,Vladimir Braverman,Quanquan Gu,Sham M. Kakade
机构:edu§Department of Computer Science,  Johns Hopkins University
备注:32 pages, 1 figure, 1 table
摘要:We study linear regression under covariate shift, where the marginal distribution over the input covariates differs in the source and the target domains, while the conditional distribution of the output given the input covariates is similar across the two domains. We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data (both conducted by online SGD) for this problem. We establish sharp instance-dependent excess risk upper and lower bounds for this approach. Our bounds suggest that for a large class of linear regression instances, transfer learning with $O(N^2)$ source data (and scarce or no target data) is as effective as supervised learning with $N$ target data. In addition, we show that finetuning, even with only a small amount of target data, could drastically reduce the amount of source data required by pretraining. Our theory sheds light on the effectiveness and limitation of pretraining as well as the benefits of finetuning for tackling covariate shift problems.
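
The setting is easy to simulate. The sketch below pretrains a linear model by online SGD on source covariates and then finetunes on scarce target data; the particular covariate distributions, sample sizes, and step sizes are arbitrary choices, not those analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20
w_star = rng.standard_normal(d)

def sgd(w, cov_scale, n, lr):
    for _ in range(n):
        x = cov_scale * rng.standard_normal(d)         # covariate distribution
        y = x @ w_star + 0.1 * rng.standard_normal()
        w = w - lr * (x @ w - y) * x                   # online least-squares step
    return w

def target_risk(w, trials=2000):
    X = 0.3 * rng.standard_normal((trials, d))         # target covariates
    return np.mean((X @ (w - w_star)) ** 2)

w0 = np.zeros(d)
pretrained = sgd(w0, cov_scale=1.0, n=4000, lr=0.01)   # plentiful source data
finetuned = sgd(pretrained, cov_scale=0.3, n=200, lr=0.05)  # scarce target data

print("risk: scratch", target_risk(sgd(w0, 0.3, 200, 0.05)),
      "| pretrained", target_risk(pretrained),
      "| pretrained+finetuned", target_risk(finetuned))
```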


【11】 Post-hoc Interpretability based Parameter Selection for Data Oriented  Nuclear Reactor Accident Diagnosis System
链接:https://arxiv.org/abs/2208.01805

作者:Chengyuan Li,Meifu Li,Zhifang Qiu
机构:Science and Technology on Reactor System Design Technology Laboratory, Nuclear Power Institute of China, Chengdu, China
备注:ICONE 29
摘要:During applying data-oriented diagnosis systems to distinguishing the type of and evaluating the severity of nuclear power plant initial events, it is of vital importance to decide which parameters to be used as the system input. However, although several diagnosis systems have already achieved acceptable performance in diagnosis precision and speed, hardly have the researchers discussed the method of monitoring point choosing and its layout. For this reason, redundant measuring data are used to train the diagnostic model, leading to high uncertainty of the classification, extra training time consumption, and higher probability of overfitting while training. In this study, a method of choosing thermal hydraulics parameters of a nuclear power plant is proposed, using the theory of post-hoc interpretability theory in deep learning. At the start, a novel Time-sequential Residual Convolutional Neural Network (TRES-CNN) diagnosis model is introduced to identify the position and hydrodynamic diameter of breaks in LOCA, using 38 parameters manually chosen on HPR1000 empirically. Afterwards, post-hoc interpretability methods are applied to evaluate the attributions of diagnosis model's outputs, deciding which 15 parameters to be more decisive in diagnosing LOCA details. The results show that the TRES-CNN based diagnostic model successfully predicts the position and size of breaks in LOCA via selected 15 parameters of HPR1000, with 25% of time consumption while training the model compared the process using total 38 parameters. In addition, the relative diagnostic accuracy error is within 1.5 percent compared with the model using parameters chosen empirically, which can be regarded as the same amount of diagnostic reliability.


【12】 Neural Basis Functions for Accelerating Solutions to High Mach Euler  Equations
链接:https://arxiv.org/abs/2208.01687

作者:David Witman,Alexander New,Hicham Alkendry,Honest Mrema
备注:Published at ICML 2022 AI for Science workshop: this https URL
摘要:We propose an approach to solving partial differential equations (PDEs) using a set of neural networks which we call Neural Basis Functions (NBF). This NBF framework is a novel variation of the POD DeepONet operator learning approach where we regress a set of neural networks onto a reduced order Proper Orthogonal Decomposition (POD) basis. These networks are then used in combination with a branch network that ingests the parameters of the prescribed PDE to compute a reduced order approximation to the PDE. This approach is applied to the steady state Euler equations for high speed flow conditions (mach 10-30) where we consider the 2D flow around a cylinder which develops a shock condition. We then use the NBF predictions as initial conditions to a high fidelity Computational Fluid Dynamics (CFD) solver (CFD++) to show faster convergence. Lessons learned for training and implementing this algorithm will be presented as well.
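
The reduced-order skeleton behind this approach, a POD basis built from solution snapshots plus a learned map from PDE parameters to basis coefficients, can be sketched in a few lines. Here a polynomial least-squares regressor stands in for the paper's neural networks, and the "PDE solutions" are a toy analytic family.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
params = rng.uniform(1.0, 3.0, size=50)                  # toy PDE parameter
snapshots = np.stack([np.sin(np.pi * p * x) for p in params])  # toy "solutions"

# POD basis: leading right-singular vectors of the snapshot matrix
U, S, Vt = np.linalg.svd(snapshots, full_matrices=False)
r = 8
basis = Vt[:r]                                           # (r, n_grid)

coeffs = snapshots @ basis.T                             # project snapshots onto the basis
features = np.stack([params ** k for k in range(6)], axis=1)   # polynomial features
W, *_ = np.linalg.lstsq(features, coeffs, rcond=None)    # parameter -> coefficients map

p_new = 2.2
c_pred = np.array([p_new ** k for k in range(6)]) @ W
u_pred = c_pred @ basis                                  # reduced-order prediction
print("max reconstruction error:", np.abs(u_pred - np.sin(np.pi * p_new * x)).max())
```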


【13】 A Roadmap for Greater Public Use of Privacy-Sensitive Government Data:  Workshop Report
链接:https://arxiv.org/abs/2208.01636

作者:Chris Clifton,Bradley Malin,Anna Oganian,Ramesh Raskar,Vivek Sharma
机构:Additional authors: Weiyi Xia (Vanderbilt University Medical Center), Jeremy Seeman (Penn State University), Zhiyu Wan (Vanderbilt University Medical Center), Abhishek Singh (MIT Media Lab)
备注:23 pages
摘要:政府机构收集和管理范围广泛的不断增长的数据集。虽然此类数据有可能支持研究和循证决策,但人们担心此类数据的传播可能侵犯个人隐私(或组织)的数据。为了评估数据共享的现状,以及了解以更快的速度刺激这种分享的机会,一个虚拟研讨会于2021年5月21日和26日举行,由国家科学基金会和国家标准与技术研究所赞助,在该讲习班上,一批多国研究人员和从业人员聚集在一起,讨论他们的经验,并了解最近开发的在共享数据的同时管理隐私的技术。讲习班特别侧重于各级政府数据共享方面的挑战和成功。第一天的重点是应用于共享公共数据的新技术的成功案例,包括正式的隐私技术、合成数据和加密方法。第二天的重点是就一些挑战和解决这些挑战的方向举行头脑风暴会议。
摘要 :Government agencies collect and manage a wide range of ever-growing datasets. While such data has the potential to support research and evidence-based policy making, there are concerns that the dissemination of such data could infringe upon the privacy of the individuals (or organizations) from whom such data was collected. To appraise the current state of data sharing, as well as learn about opportunities for stimulating such sharing at a faster pace, a virtual workshop was held on May 21st and 26th, 2021, sponsored by the National Science Foundation and National Institute of Standards and Technologies, where a multinational collection of researchers and practitioners were brought together to discuss their experiences and learn about recently developed technologies for managing privacy while sharing data. The workshop specifically focused on challenges and successes in government data sharing at various levels. The first day focused on successful examples of new technology applied to sharing of public data, including formal privacy techniques, synthetic data, and cryptographic approaches. Day two emphasized brainstorming sessions on some of the challenges and directions to address them.


【14】 A Convolutional Persistence Transform
链接:https://arxiv.org/abs/2208.02107

作者:Elchanan Solomon,Paul Bendich
机构:Department of Mathematics, Duke University, Durham, USA, Geometric Data Analytics
摘要:We consider a new topological featurization of $d$-dimensional images, obtained by convolving images with various filters before computing persistence. Viewing a convolution filter as a motif within an image, the persistence diagram of the resulting convolution describes the way the motif is distributed throughout that image. This pipeline, which we call convolutional persistence, extends the capacity of topology to observe patterns in image data. Indeed, we prove that (generically speaking) for any two images one can find some filter for which they produce different persistence diagrams, so that the collection of all possible convolutional persistence diagrams for a given image is an injective invariant. This is proven by showing convolutional persistence to be a special case of another topological invariant, the Persistent Homology Transform. Other advantages of convolutional persistence are improved stability and robustness to noise, greater flexibility for data-dependent vectorizations, and reduced computational complexity for convolutions with large stride vectors. Additionally, we have a suite of experiments showing that convolutions greatly improve the predictive power of persistence on a host of classification tasks, even if one uses random filters and vectorizes the resulting diagrams by recording only their total persistences.
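
For a 1-D signal the whole pipeline fits in a short script: convolve with (here random) filters, compute the 0-dimensional sublevel-set persistence of the result with a small union-find, and summarise each diagram by its total persistence. This is an illustrative reduction of the paper's $d$-dimensional image setting to one dimension.

```python
import numpy as np

def persistence0(f):
    """0-dimensional sublevel-set persistence of a 1-D array via union-find."""
    order = np.argsort(f)
    parent, birth, diagram = {}, {}, []
    def find(i):
        while parent[i] != i: parent[i] = parent[parent[i]]; i = parent[i]
        return i
    for i in order:                      # sweep values from low to high
        parent[i], birth[i] = i, f[i]
        roots = {find(j) for j in (i - 1, i + 1) if j in parent}
        if roots:
            survivor = min(roots, key=lambda r: birth[r])  # oldest component lives on
            for r in roots - {survivor}:
                diagram.append((birth[r], f[i]))           # younger component dies here
                parent[r] = survivor
            parent[i] = survivor
    diagram.append((f[order[0]], f[order[-1]]))            # essential class: min to max
    return diagram

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 6 * np.pi, 300)) + 0.2 * rng.standard_normal(300)
for _ in range(3):
    filt = rng.standard_normal(7)                          # random convolution filter
    conv = np.convolve(signal, filt, mode="valid")
    total = sum(d - b for b, d in persistence0(conv))
    print("total persistence under this filter:", round(float(total), 2))
```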

