
机器学习学术速递[8.25]

arXiv每日学术速递



cs.LG 方向,今日共计70篇


Graph相关(图学习|图神经网络|图优化等)(4篇)

【1】 A Graph Convolution for Signed Directed Graphs
标题:带符号有向图的一种图卷积
链接:https://arxiv.org/abs/2208.11511

作者:Taewook Ko
备注:Preprint version
摘要:根据数据的性质,图可以分为若干类型:有向图的链接带有方向,带符号图的链接带有正负等类型;带符号有向图二者兼具,是最复杂、信息量最大的一类图。针对带符号有向图的图卷积至今鲜有研究:虽然已有大量图卷积工作,但大多是为无向图或无符号图设计的。本文研究带符号有向图的谱图卷积网络,提出了一种新的复Hermitian邻接矩阵,通过复数编码图信息:复数的相位和幅度表示链接的方向、符号与连通性。随后,我们利用该Hermitian矩阵定义了磁拉普拉斯算子,并证明了其半正定性。最后,我们提出了带符号有向图卷积网络(SD-GCN)。据我们所知,这是第一个面向带符号图的谱卷积。此外,与现有针对特定图类型设计的卷积不同,所提模型具有通用性,可应用于任何图,包括无向图、有向图和带符号图。我们在四个真实世界的图上评估了模型性能,其在链接符号预测任务中优于所有其他最先进的图卷积。
摘要:There are several types of graphs according to the nature of the data. Directed graphs have directions of links, and signed graphs have link types such as positive and negative. Signed directed graphs are the most complex and informative that have both. Graph convolutions for signed directed graphs have not been delivered much yet. Though many graph convolution studies have been provided, most are designed for undirected or unsigned. In this paper, we investigate a spectral graph convolution network for signed directed graphs. We propose a novel complex Hermitian adjacency matrix that encodes graph information via complex numbers. The complex numbers represent link direction, sign, and connectivity via the phases and magnitudes. Then, we define a magnetic Laplacian with the Hermitian matrix and prove its positive semidefinite property. Finally, we introduce Signed Directed Graph Convolution Network(SD-GCN). To the best of our knowledge, it is the first spectral convolution for graphs with signs. Moreover, unlike the existing convolutions designed for a specific graph type, the proposed model has generality that can be applied to any graphs, including undirected, directed, or signed. The performance of the proposed model was evaluated with four real-world graphs. It outperforms all the other state-of-the-art graph convolutions in the task of link sign prediction.
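为便于理解摘要中"复Hermitian邻接矩阵+磁拉普拉斯"的构造,下面给出已有文献中(无符号情形)磁拉普拉斯算子的标准定义作为参考;论文针对符号信息的具体相位/幅度编码方式以原文为准:

```latex
% 对有向图邻接矩阵 A,先对称化,再用相位编码方向信息(q 为相位参数):
A_s = \tfrac{1}{2}\,(A + A^{\top}), \qquad
\Theta^{(q)}_{uv} = 2\pi q \,\bigl(A_{uv} - A_{vu}\bigr), \qquad
H^{(q)} = A_s \odot \exp\!\bigl(i\,\Theta^{(q)}\bigr), \qquad
L^{(q)} = D_s - H^{(q)}
```

其中 D_s 是 A_s 的度矩阵,⊙ 为逐元素乘积。由于 H^{(q)} 是Hermitian矩阵,L^{(q)} 的特征值为非负实数,这正是谱图卷积得以定义的前提。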


【2】 Tracking by weakly-supervised learning and graph optimization for  whole-embryo C. elegans lineages
标题:基于弱监督学习与图优化的线虫全胚胎谱系跟踪
链接:https://arxiv.org/abs/2208.11467

作者:Peter Hirsch,Caroline Malin-Mayor,Anthony Santella,Stephan Preibisch,Dagmar Kainmueller,Jan Funke
机构:Max-Delbrueck-Center for Molecular Medicine in the Helmholtz Association, DE; Humboldt-Universität zu Berlin, DE; Sloan Kettering Cancer Center, Molecular Cytology Core, Developmental Biology, USA
备注:Accepted at MICCAI 2022, Code: this https URL
摘要:在嘈杂且密集的荧光显微镜数据中跟踪胚胎的所有细胞核是一项具有挑战性的任务。我们在最近的一种细胞核跟踪方法的基础上开展工作,该方法将基于少量细胞核中心点标注的弱监督学习与用于最优细胞谱系提取的整数线性规划(ILP)相结合。我们的工作特别针对线虫(C. elegans)胚胎记录的如下难点:(1)与其他生物的基准记录相比细胞分裂次数更多;(2)存在容易被误认为细胞核的极体。为应对(1),我们设计并引入了一个学习得到的细胞分裂检测器;为应对(2),我们采用了一个学习得到的极体检测器。我们进一步提出通过结构化SVM自动调整ILP权重,免去了繁琐的网格搜索手工设置。我们的方法在Cell Tracking Challenge的Fluo-N3DH-CE胚胎数据集上超越了此前的领先方法。我们还在另外两个线虫数据集上进行了广泛的定量评估,并将公开这些数据集,作为未来方法开发的扩展基准。结果表明,我们的方法带来了可观的改进,特别是在分裂事件检测的正确性以及完全正确的轨迹片段的数量和长度方面。代码:https://github.com/funkelab/linajea
摘要:Tracking all nuclei of an embryo in noisy and dense fluorescence microscopy data is a challenging task. We build upon a recent method for nuclei tracking that combines weakly-supervised learning from a small set of nuclei center point annotations with an integer linear program (ILP) for optimal cell lineage extraction. Our work specifically addresses the following challenging properties of C. elegans embryo recordings: (1) Many cell divisions as compared to benchmark recordings of other organisms, and (2) the presence of polar bodies that are easily mistaken as cell nuclei. To cope with (1), we devise and incorporate a learnt cell division detector. To cope with (2), we employ a learnt polar body detector. We further propose automated ILP weights tuning via a structured SVM, alleviating the need for tedious manual set-up of a respective grid search. Our method outperforms the previous leader of the cell tracking challenge on the Fluo-N3DH-CE embryo dataset. We report a further extensive quantitative evaluation on two more C. elegans datasets. We will make these datasets public to serve as an extended benchmark for future method development. Our results suggest considerable improvements yielded by our method, especially in terms of the correctness of division event detection and the number and length of fully correct track segments. Code: https://github.com/funkelab/linajea
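摘要中"用 ILP 做最优谱系提取"的核心,是在相邻帧候选连接边上做带约束的 0/1 选择。下面是一个极简示意(玩具数据,得分为假设值,省略了分裂检测、极体检测等论文组件,并非 linajea 的真实实现;需 pip install pulp):

```python
import pulp

# 候选边:(上一帧细胞核, 当前帧细胞核) -> 匹配得分(假设值,越大越好)
candidates = {("a1", "b1"): 0.9, ("a1", "b2"): 0.4,
              ("a2", "b2"): 0.8, ("a2", "b3"): 0.7}

prob = pulp.LpProblem("lineage_linking", pulp.LpMaximize)
x = {e: pulp.LpVariable(f"x_{e[0]}_{e[1]}", cat="Binary") for e in candidates}

# 目标:被选中边的总得分最大
prob += pulp.lpSum(s * x[e] for e, s in candidates.items())
# 每个当前帧细胞核至多一个父节点
for child in {e[1] for e in candidates}:
    prob += pulp.lpSum(x[e] for e in candidates if e[1] == child) <= 1
# 每个上一帧细胞核至多两个子节点(允许一次分裂)
for parent in {e[0] for e in candidates}:
    prob += pulp.lpSum(x[e] for e in candidates if e[0] == parent) <= 2

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([e for e in candidates if x[e].value() == 1])
```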


【3】 Large-scale Entity Alignment via Knowledge Graph Merging, Partitioning  and Embedding
标题:基于知识图合并、划分和嵌入的大规模实体对齐
链接:https://arxiv.org/abs/2208.11125

作者:Kexuan Xin,Zequn Sun,Wen Hua,Wei Hu,Jianfeng Qu,Xiaofang Zhou
机构:The University of Queensland, Australia, Nanjing University, China, Soochow University, China, Hong Kong University of Science and, Technology, HKSAR
备注:Accepted by CIKM 2022
摘要:实体对齐是知识图谱融合中的一项关键任务。然而,大多数实体对齐方法都存在可扩展性问题。最近的方法通过将大型知识图谱划分成小块、在每块内进行嵌入与对齐学习来解决这一问题,但这样的划分和学习过程会造成结构和对齐信息的过度损失。因此,本文提出了一种可扩展的基于GNN的实体对齐方法,从三个方面减少结构和对齐损失。首先,提出了一种基于中心性的子图生成算法,以召回充当不同子图之间桥梁的地标实体。其次,引入自监督实体重构,从不完整的邻域子图中恢复实体表示,并设计跨子图负采样,将其他子图中的实体纳入对齐学习。第三,在推理过程中,我们合并各子图的嵌入,形成单一空间进行对齐搜索。在OpenEA基准数据集和本文提出的大规模DBpedia1M数据集上的实验结果验证了该方法的有效性。
摘要:Entity alignment is a crucial task in knowledge graph fusion. However, most entity alignment approaches have the scalability problem. Recent methods address this issue by dividing large KGs into small blocks for embedding and alignment learning in each. However, such a partitioning and learning process results in an excessive loss of structure and alignment. Therefore, in this work, we propose a scalable GNN-based entity alignment approach to reduce the structure and alignment loss from three perspectives. First, we propose a centrality-based subgraph generation algorithm to recall some landmark entities serving as the bridges between different subgraphs. Second, we introduce self-supervised entity reconstruction to recover entity representations from incomplete neighborhood subgraphs, and design cross-subgraph negative sampling to incorporate entities from other subgraphs in alignment learning. Third, during the inference process, we merge the embeddings of subgraphs to make a single space for alignment search. Experimental results on the benchmark OpenEA dataset and the proposed large DBpedia1M dataset verify the effectiveness of our approach.
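"基于中心性的子图生成"可以用很短的代码示意:先按中心性选出地标实体,再把它们并入每个子图作为桥梁(简化假设:以度中心性代替论文的具体中心性度量,用内置小图代替大规模知识图谱;需 pip install networkx):

```python
import networkx as nx

G = nx.karate_club_graph()                       # 用内置小图代替大规模 KG
centrality = nx.degree_centrality(G)
landmarks = sorted(centrality, key=centrality.get, reverse=True)[:5]
print("landmark entities:", landmarks)

# 简单二分划分后,将地标实体复制进每个子图,以保留跨子图的结构
parts = [set(range(0, 17)), set(range(17, 34))]
subgraphs = [G.subgraph(p | set(landmarks)).copy() for p in parts]
print([sg.number_of_nodes() for sg in subgraphs])
```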


【4】 EpiGNN: Exploring Spatial Transmission with Graph Neural Network for  Regional Epidemic Forecasting
标题:EpiGNN:利用图神经网络探索空间传播以进行区域疫情预测
链接:https://arxiv.org/abs/2208.11517

作者:Feng Xie,Zhong Zhang,Liang Li,Bin Zhou,Yusong Tan
机构:College of Computer, National University of Defense Technology
备注:16 pages, 6 figures, ECML-PKDD2022
摘要:疫情预测是有效控制疫情传播的关键,有助于全球缓解威胁公共卫生的危机。为了更好地理解流行病的传播和演化过程,我们提出了基于图神经网络的流行病预测模型EpiGNN。具体而言,我们设计了传播风险编码模块来刻画流行病过程中各区域的局部和全局空间效应,并将其纳入模型。同时,我们开发了一个区域感知图学习器(RAGL),它综合考虑传播风险、地理依赖和时间信息,以更好地挖掘时空依赖性,并使各区域能够感知相关区域的疫情状况。RAGL还可以与人口流动等外部资源相结合,进一步提高预测性能。在五个真实世界流行病相关数据集(包括流感和COVID-19)上的综合实验验证了所提方法的有效性:EpiGNN在RMSE指标上比最先进的基线好9.48%。
摘要:Epidemic forecasting is the key to effective control of epidemic transmission and helps the world mitigate the crisis that threatens public health. To better understand the transmission and evolution of epidemics, we propose EpiGNN, a graph neural network-based model for epidemic forecasting. Specifically, we design a transmission risk encoding module to characterize local and global spatial effects of regions in epidemic processes and incorporate them into the model. Meanwhile, we develop a Region-Aware Graph Learner (RAGL) that takes transmission risk, geographical dependencies, and temporal information into account to better explore spatial-temporal dependencies and makes regions aware of related regions' epidemic situations. The RAGL can also combine with external resources, such as human mobility, to further improve prediction performance. Comprehensive experiments on five real-world epidemic-related datasets (including influenza and COVID-19) demonstrate the effectiveness of our proposed method and show that EpiGNN outperforms state-of-the-art baselines by 9.48% in RMSE.


Transformer(5篇)

【1】 A model-based approach to meta-Reinforcement Learning: Transformers and  tree search
标题:一种基于模型的元强化学习方法:Transformer与树搜索
链接:https://arxiv.org/abs/2208.11535

作者:Brieuc Pinon,Jean-Charles Delvenne,Raphaël Jungers
机构:ICTEAM/INMA, UCLouvain, Belgium
摘要:元学习是一个研究方向,旨在培养利用过去经验高效解决新学习问题的能力。元强化学习(meta-RL)方法已在若干元强化学习问题中展示了学习高效获取并利用信息的行为的能力。   在此背景下,Wang等人[2021]提出了Alchemy基准。Alchemy具有丰富的结构化潜在空间,对最先进的无模型RL方法构成挑战:这些方法无法学会先正确探索、再加以利用。   我们开发了一种基于模型的算法:训练一个以Transformer编码器为主要模块的模型来拟合符号化Alchemy环境的动态,然后基于学习到的模型,用树搜索方法定义一个在线规划器。该算法在符号化Alchemy问题上显著优于此前应用的无模型RL方法。   我们的结果揭示了"基于模型的方法+在线规划"对于在元RL中成功完成探索与利用的重要性。此外,我们还展示了Transformer架构在学习元RL问题中源自潜在空间的复杂动态方面的效率。
摘要:Meta-learning is a line of research that develops the ability to leverage past experiences to efficiently solve new learning problems. Meta-Reinforcement Learning (meta-RL) methods demonstrate a capability to learn behaviors that efficiently acquire and exploit information in several meta-RL problems.  In this context, the Alchemy benchmark has been proposed by Wang et al. [2021]. Alchemy features a rich structured latent space that is challenging for state-of-the-art model-free RL methods. These methods fail to learn to properly explore then exploit.  We develop a model-based algorithm. We train a model whose principal block is a Transformer Encoder to fit the symbolic Alchemy environment dynamics. Then we define an online planner with the learned model using a tree search method. This algorithm significantly outperforms previously applied model-free RL methods on the symbolic Alchemy problem.  Our results reveal the relevance of model-based approaches with online planning to perform exploration and exploitation successfully in meta-RL. Moreover, we show the efficiency of the Transformer architecture to learn complex dynamics that arise from latent spaces present in meta-RL problems.


【2】 An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition  using a Novel Transformers-based Model and an Innovative 270 Million-Words  Multi-Font Corpus of Classical Arabic with Diacritics
标题:一种用于稳健阿拉伯语手写识别的端到端OCR框架:采用新型基于Transformer的模型与创新的2.7亿词多字体带变音符号古典阿拉伯语语料库
链接:https://arxiv.org/abs/2208.11484

作者:Aly Mostafa,Omar Mohamed,Ali Ashraf,Ahmed Elbehery,Salma Jamal,Anas Salah,Amr S. Ghoneim
机构:Departement of Computer Science, Helwan, University, Helwan, Egypt
摘要:本研究是一系列工作的第二阶段,该系列工作旨在开发阿拉伯历史文献的光学字符识别(OCR),并考察不同建模方式与该问题的相互作用。第一阶段研究了Transformer在我们自建阿拉伯语数据集上的效果,其不足之一是训练数据的规模:由于资源所限,3000万张图像中仅使用了15000张。此外,我们增加了图像增强层、时间与空间优化以及后校正层,以帮助模型在正确的上下文中预测正确的单词。值得注意的是,我们提出了一种端到端文本识别方法,以Vision Transformer(即BEIT)作为编码器、原生(vanilla)Transformer作为解码器,免去了用CNN进行特征提取,并降低了模型复杂度。实验表明,我们的端到端模型优于卷积骨干网络,取得了4.46%的CER。
摘要:This research is the second phase in a series of investigations on developing an Optical Character Recognition (OCR) of Arabic historical documents and examining how different modeling procedures interact with the problem. The first research studied the effect of Transformers on our custom-built Arabic dataset. One of the downsides of the first research was the size of the training data, a mere 15000 images from our 30 million images, due to lack of resources. Also, we add an image enhancement layer, time and space optimization, and Post-Correction layer to aid the model in predicting the correct word for the correct context. Notably, we propose an end-to-end text recognition approach using Vision Transformers as an encoder, namely BEIT, and vanilla Transformer as a decoder, eliminating CNNs for feature extraction and reducing the model's complexity. The experiments show that our end-to-end model outperforms Convolutions Backbones. The model attained a CER of 4.46%.


【3】 Improved Zero-Shot Audio Tagging & Classification with Patchout  Spectrogram Transformers
标题:基于Patchout频谱图Transformer的改进零样本音频标注与分类
链接:https://arxiv.org/abs/2208.11402

作者:Paul Primus,Gerhard Widmer
机构:Institute of Computational Perception (CP-JKU), LIT Artificial Intelligence Lab, Johannes Kepler University, Austria
备注:published in EUSIPCO 2022
摘要:用于标注和分类声音信号的标准机器学习模型无法处理训练期间未见过的类别。零样本(Zero-Shot, ZS)学习通过基于可适配的类别描述进行预测来克服这一限制。本研究旨在探讨基于自注意力的音频嵌入架构对ZS学习的有效性。为此,我们将最新的patchout频谱图Transformer与两种经典的卷积架构进行了比较,并在三个任务、三个不同的基准数据集上评估了这三种架构:AudioSet上的通用标注、ESC-50上的环境声音分类以及OpenMIC上的乐器标注。结果表明,在所有这些设置中,基于自注意力的嵌入方法都优于两种参与比较的卷积架构。通过相应地设计训练和测试数据,我们观察到当训练类与新测试类之间的"语义距离"很大时,预测性能会显著受损,这一效应值得更细致的研究。
摘要:Standard machine learning models for tagging and classifying acoustic signals cannot handle classes that were not seen during training. Zero-Shot (ZS) learning overcomes this restriction by predicting classes based on adaptable class descriptions. This study sets out to investigate the effectiveness of self-attention-based audio embedding architectures for ZS learning. To this end, we compare the very recent patchout spectrogram transformer with two classic convolutional architectures. We evaluate these three architectures on three tasks and on three different benchmark datasets: general-purpose tagging on AudioSet, environmental sound classification on ESC-50, and instrument tagging on OpenMIC. Our results show that the self-attention-based embedding methods outperform both compared convolutional architectures in all of these settings. By designing training and test data accordingly, we observe that prediction performance suffers significantly when the `semantic distance' between training and new test classes is large, an effect that will deserve more detailed investigations.


【4】 Transformer-Boosted Anomaly Detection with Fuzzy Hashes
标题:基于模糊散列的Transformer增强异常检测
链接:https://arxiv.org/abs/2208.11367

作者:Frieder Uhlig,Lukas Struppek,Dominik Hintersdorf,Kristian Kersting
机构:Department of Computer Science, Technical University of Darmstadt, Germany, Centre for Cognitive Science, Technical University of Darmstadt, Hessian Center for AI (hessian.AI)
备注:9 pages, 4 figures, 2 tables
摘要:模糊散列是数字取证中的一种重要工具,用于近似匹配以确定数字制品之间的相似性。它们将文件的字节码转换为可计算的字符串,这使其特别适合机器智能处理。在这项工作中,我们提出了深度学习近似匹配(DLAM),其在检测模糊散列中的异常方面取得了远高于传统方法的准确性。除了众所周知的恶意软件聚类应用之外,我们还表明,模糊散列与深度学习确实很适合根据特定内容(例如恶意软件)的存在与否对文件进行分类。DLAM依赖于自然语言处理领域的基于Transformer的模型,并且优于现有方法。传统的模糊散列(如TLSH和ssdeep)长度有限,当异常内容相对于整体文件较小时便无法检测出文件异常。然而,即使异常占比小于15%,DLAM也能在TLSH和ssdeep计算出的模糊散列中检测到这类文件相关性。它取得了与最先进的模糊散列算法相当的结果,同时依赖更高效的散列计算,因此可以在更大的规模上使用。
摘要:Fuzzy hashes are an important tool in digital forensics and are used in approximate matching to determine the similarity between digital artifacts. They translate the byte code of files into computable strings, which makes them particularly interesting for intelligent machine processing. In this work, we propose deep learning approximate matching (DLAM), which achieves much higher accuracy in detecting anomalies in fuzzy hashes than conventional approaches. In addition to the well-known application for clustering malware, we show that fuzzy hashes and deep learning are indeed well-suited to classify files according to the presence of certain content, e.g., malware. DLAM relies on transformer-based models from the field of natural language processing and outperforms existing methods. Traditional fuzzy hashes like TLSH and ssdeep have a limited size and fail to detect file anomalies if they are relatively small compared to the overall file size. DLAM, however, enables the detection of such file correlations in the computed fuzzy hashes of TLSH and ssdeep, even for anomaly sizes of less than 15%. It achieves comparable results to state-of-the-art fuzzy hashing algorithms while relying on more efficient hash computations and can, therefore, be used at a much larger scale.
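作为背景,摘要中提到的 TLSH 和 ssdeep 两种模糊散列可以这样计算和比较(示意代码,需 pip install ssdeep py-tlsh;TLSH 对输入长度和熵有最低要求,故此处用随机字节占位"文件"内容):

```python
import os
import ssdeep
import tlsh

data_a = os.urandom(4096)                              # 占位的原始文件内容
data_b = data_a[:2048] + b"<payload>" + data_a[2048:]  # 注入少量异常内容

h1, h2 = ssdeep.hash(data_a), ssdeep.hash(data_b)
print("ssdeep similarity (0-100):", ssdeep.compare(h1, h2))

t1, t2 = tlsh.hash(data_a), tlsh.hash(data_b)
print("tlsh distance (越小越相似):", tlsh.diff(t1, t2))
```

DLAM 的输入正是此类散列字符串:把散列当作"文本"交给Transformer判断其中是否隐含异常内容。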


【5】 Towards Efficient Use of Multi-Scale Features in Transformer-Based  Object Detectors
标题:基于Transformer的目标检测器中多尺度特征的高效利用
链接:https://arxiv.org/abs/2208.11356

作者:Gongjie Zhang,Zhipeng Luo,Yingchen Yu,Zichen Tian,Jingyi Zhang,Shijian Lu
机构:Nanyang Technological University, Singapore
备注:Project page: this https URL
摘要:多尺度特征已被证明对目标检测非常有效,大多数基于ConvNet的目标检测器都采用特征金字塔网络(FPN)作为利用多尺度特征的基础组件。然而,对于最近提出的基于Transformer的目标检测器,由于注意力机制处理高分辨率特征的复杂度很高,直接引入多尺度特征会带来难以承受的计算开销。本文提出了迭代多尺度特征聚合(Iterative Multi-scale Feature Aggregation, IMFA),一种使基于Transformer的目标检测器能够高效利用多尺度特征的通用范式。其核心思想是仅从少数关键位置提取稀疏的多尺度特征,并通过两项新颖设计实现。首先,IMFA重新编排了Transformer的编码器-解码器流水线,使编码特征可以基于检测预测被迭代更新。其次,IMFA在先验检测预测的指导下,仅从少数关键点位置稀疏采样尺度自适应特征用于精细化检测。因此,采样得到的多尺度特征虽然稀疏,却对目标检测非常有益。大量实验表明,IMFA在仅带来轻微计算开销的情况下,显著提升了多个基于Transformer的目标检测器的性能。项目页面:https://github.com/ZhangGongjie/IMFA。
摘要:Multi-scale features have been proven highly effective for object detection, and most ConvNet-based object detectors adopt Feature Pyramid Network (FPN) as a basic component for exploiting multi-scale features. However, for the recently proposed Transformer-based object detectors, directly incorporating multi-scale features leads to prohibitive computational overhead due to the high complexity of the attention mechanism for processing high-resolution features. This paper presents Iterative Multi-scale Feature Aggregation (IMFA) -- a generic paradigm that enables the efficient use of multi-scale features in Transformer-based object detectors. The core idea is to exploit sparse multi-scale features from just a few crucial locations, and it is achieved with two novel designs. First, IMFA rearranges the Transformer encoder-decoder pipeline so that the encoded features can be iteratively updated based on the detection predictions. Second, IMFA sparsely samples scale-adaptive features for refined detection from just a few keypoint locations under the guidance of prior detection predictions. As a result, the sampled multi-scale features are sparse yet still highly beneficial for object detection. Extensive experiments show that the proposed IMFA boosts the performance of multiple Transformer-based object detectors significantly yet with slight computational overhead. Project page: https://github.com/ZhangGongjie/IMFA.


GAN|对抗|攻击|生成相关(2篇)

【1】 Towards an Awareness of Time Series Anomaly Detection Models'  Adversarial Vulnerability
标题:认识时间序列异常检测模型的对抗脆弱性
链接:https://arxiv.org/abs/2208.11264

作者:Shahroz Tariq,Binh M. Le,Simon S. Woo
机构: Sungkyunkwan University,  Department of Artificial Intelligence
备注:Part of Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM '22)
摘要:时间序列异常检测在统计学、经济学和计算机科学中得到了广泛研究。多年来,已经提出了许多基于深度学习的时间序列异常检测方法。其中许多方法在基准数据集上展示了最先进的性能,给人一种错误的印象,即这些系统是鲁棒的,可以部署在许多实际的工业场景中。在本文中,我们证明仅向传感器数据添加微小的对抗扰动,就会使最先进的异常检测方法的性能大幅下降。我们使用预测误差、异常得分和分类得分等多种评价指标,在覆盖航空航天应用、服务器机器以及发电厂网络物理系统的多个公共和私有数据集上进行评估。在快速梯度符号法(FGSM)和投影梯度下降法(PGD)等著名对抗攻击下,我们证明那些声称对异常具有鲁棒性、并可能已被集成到现实系统中的深度神经网络(DNN)和图神经网络(GNN)方法,其性能可低至0%。据我们所知,这是首次展示异常检测系统在对抗攻击面前的脆弱性。本研究的首要目标是提高人们对时间序列异常检测器对抗脆弱性的认识。
摘要:Time series anomaly detection is extensively studied in statistics, economics, and computer science. Over the years, numerous methods have been proposed for time series anomaly detection using deep learning-based methods. Many of these methods demonstrate state-of-the-art performance on benchmark datasets, giving the false impression that these systems are robust and deployable in many practical and industrial real-world scenarios. In this paper, we demonstrate that the performance of state-of-the-art anomaly detection methods is degraded substantially by adding only small adversarial perturbations to the sensor data. We use different scoring metrics such as prediction errors, anomaly, and classification scores over several public and private datasets ranging from aerospace applications, server machines, to cyber-physical systems in power plants. Under well-known adversarial attacks from Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) methods, we demonstrate that state-of-the-art deep neural networks (DNNs) and graph neural networks (GNNs) methods, which claim to be robust against anomalies and have been possibly integrated in real-life systems, have their performance drop to as low as 0%. To the best of our understanding, we demonstrate, for the first time, the vulnerabilities of anomaly detection systems against adversarial attacks. The overarching goal of this research is to raise awareness towards the adversarial vulnerabilities of time series anomaly detectors.
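摘要中的 FGSM 是最简单的对抗扰动:沿损失对输入的梯度符号方向走一步。下面是一个极简示意(模型与数据均为假设的占位,以重构误差充当异常分数;需 pip install torch):

```python
import torch

def fgsm_perturb(model, x, y, loss_fn, epsilon=0.05):
    # 经典公式 x' = x + epsilon * sign(grad_x L)
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    with torch.no_grad():
        return x_adv + epsilon * x_adv.grad.sign()

model = torch.nn.Sequential(torch.nn.Linear(64, 64))  # 占位的检测模型
x = torch.randn(8, 64)                                # 一批传感器窗口
x_adv = fgsm_perturb(model, x, x.clone(), torch.nn.MSELoss())
print((x_adv - x).abs().max())                        # 扰动幅度受 epsilon 限制
```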


【2】 Retrieval-based Controllable Molecule Generation
标题:基于检索的可控分子生成
链接:https://arxiv.org/abs/2208.11126

作者:Zichao Wang,Weili Nie,Zhuoran Qiao,Chaowei Xiao,Richard Baraniuk,Anima Anandkumar
机构:Rice University, NVIDIA, ASU, Caltech
备注:32 pages
摘要:通过生成模型产生具有特定化学和生物学性质的新分子,已成为药物发现的一个有前景的方向。然而,现有方法需要在大数据集上进行大量训练/微调,而这在现实世界的生成任务中往往无法满足。在本工作中,我们提出了一个新的基于检索的可控分子生成框架:使用一小组示例分子(即那些部分满足设计标准的分子),引导预训练的生成模型合成满足给定设计标准的分子。我们设计了一种检索机制,用于检索示例分子并将其与输入分子融合,该机制通过一个预测输入分子最近邻的新自监督目标进行训练。我们还提出了一个迭代细化过程,动态更新生成的分子和检索数据库,以获得更好的泛化能力。我们的方法与生成模型的具体选择无关,且不需要任务特定的微调。在从简单设计标准到"设计与SARS-CoV-2主蛋白酶结合的先导化合物"这一具有挑战性的现实场景等各类任务上,我们证明了该方法的外推能力远超检索数据库的范围,并且比以往方法取得了更好的性能和更广的适用性。
摘要 :Generating new molecules with specified chemical and biological properties via generative models has emerged as a promising direction for drug discovery. However, existing methods require extensive training/fine-tuning with a large dataset, often unavailable in real-world generation tasks. In this work, we propose a new retrieval-based framework for controllable molecule generation. We use a small set of exemplar molecules, i.e., those that (partially) satisfy the design criteria, to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria. We design a retrieval mechanism that retrieves and fuses the exemplar molecules with the input molecule, which is trained by a new self-supervised objective that predicts the nearest neighbor of the input molecule. We also propose an iterative refinement process to dynamically update the generated molecules and retrieval database for better generalization. Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning. On various tasks ranging from simple design criteria to a challenging real-world scenario for designing lead compounds that bind to the SARS-CoV-2 main protease, we demonstrate our approach extrapolates well beyond the retrieval database, and achieves better performance and wider applicability than previous methods.


半/弱/无/有监督|不确定性|主动学习(9篇)

【1】 ImitAL: Learned Active Learning Strategy on Synthetic Data
标题:ImitAL:在合成数据上学得的主动学习策略
链接:https://arxiv.org/abs/2208.11636

作者:Julius Gonsior,Maik Thiele,Wolfgang Lehner
机构:Technische Universität Dresden, Dresden, Germany, Hochschule für Technik und Wirtschaft Dresden
备注:arXiv admin note: text overlap with arXiv:2108.07670
摘要:主动学习(AL)是一种众所周知的标准方法,它基于查询策略优先标注信息量最大的样本,从而高效地获得标注数据。过去已经提出了大量此类查询策略,每一代新策略都增加了运行时间和复杂度。然而,据我们所知,这些策略中没有任何一个能在来自不同应用领域的大量数据集上始终表现出色。本质上,大多数现有的AL策略都是信息量(informativeness)和代表性(representativeness)两种简单启发式的组合,它们之间的主要区别在于如何组合这两种往往相互冲突的启发式。在本文中,我们提出了ImitAL,一种领域无关的新型查询策略:它将AL编码为学习排序(learning-to-rank)问题,并学习两种启发式之间的最优组合。我们在纯合成数据集上的大规模模拟AL运行中训练ImitAL。为了证明ImitAL训练成功,我们在来自广泛领域的13个不同数据集上进行了大量评估,将我们的策略与其他7种查询策略进行了比较。
摘要:Active Learning (AL) is a well-known standard method for efficiently obtaining annotated data by first labeling the samples that contain the most information based on a query strategy. In the past, a large variety of such query strategies has been proposed, with each generation of new strategies increasing the runtime and adding more complexity. However, to the best of our our knowledge, none of these strategies excels consistently over a large number of datasets from different application domains. Basically, most of the the existing AL strategies are a combination of the two simple heuristics informativeness and representativeness, and the big differences lie in the combination of the often conflicting heuristics. Within this paper, we propose ImitAL, a domain-independent novel query strategy, which encodes AL as a learning-to-rank problem and learns an optimal combination between both heuristics. We train ImitAL on large-scale simulated AL runs on purely synthetic datasets. To show that ImitAL was successfully trained, we perform an extensive evaluation comparing our strategy on 13 different datasets, from a wide range of domains, with 7 other query strategies.
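"信息量 + 代表性"的线性组合可以用几行代码示意(ImitAL 学习的正是这类组合的最优形式;此处权重 alpha 为手工设定的假设值,且未对两项做归一化,仅作说明;需 pip install scikit-learn):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def query_scores(proba, X_pool, alpha=0.5):
    # 信息量:预测熵越大,模型越不确定
    informativeness = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    # 代表性:与未标注池中其他样本的平均余弦相似度
    representativeness = cosine_similarity(X_pool).mean(axis=1)
    return alpha * informativeness + (1 - alpha) * representativeness

proba = np.array([[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]])  # 占位的分类概率
X_pool = np.random.rand(3, 16)                          # 占位的未标注样本特征
print(np.argsort(query_scores(proba, X_pool))[::-1])    # 得分最高者优先标注
```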


【2】 Weakly Supervised Airway Orifice Segmentation in Video Bronchoscopy
标题:视频支气管镜检查中的弱监督气道口分割
链接:https://arxiv.org/abs/2208.11468

作者:Ron Keuth,Mattias Heinrich,Martin Eichenlaub,Marian Himstedt
机构:Institute of Medical Informatics, University of Lübeck, Germany; University Heart Center Freiburg-Bad Krozingen, Germany
备注:5 Pages, 2 figures, only supplemental file, submitted to SPIE MI
摘要:视频支气管镜检查常规用于疑似癌症的肺组织活检、COPD患者的监测以及重症监护病房急性呼吸问题的澄清。在复杂支气管树内的导航尤其具有挑战性且对体力要求很高,需要医生的长期经验。本文研究支气管镜视频中支气管开口的自动分割。由于缺乏现成的真值分割数据,基于深度学习的方法目前在该任务上受到阻碍。因此,我们提出了一条数据驱动的流水线:先用k均值聚类,再用紧凑的基于标记的分水岭算法,从给定的深度图像生成气道实例分割图。这样,这些传统算法就可以充当弱监督,仅基于体模(phantom)数据集直接在RGB图像上训练一个浅层CNN。我们在两个体内数据集(涵盖21次不同支气管镜检查的250帧)上评估了该模型的泛化能力,并证明其性能与直接在体内数据上训练的模型相当:在128x128的图像分辨率下,检测到的气道分割中心的平均误差为11像素(对比5像素)。我们的定量和定性结果表明,在视频支气管镜检查场景中,体模数据加上基于非学习方法的弱监督,足以获得对气道结构的语义理解。
摘要:Video bronchoscopy is routinely conducted for biopsies of lung tissue suspected for cancer, monitoring of COPD patients and clarification of acute respiratory problems at intensive care units. The navigation within complex bronchial trees is particularly challenging and physically demanding, requiring long-term experiences of physicians. This paper addresses the automatic segmentation of bronchial orifices in bronchoscopy videos. Deep learning-based approaches to this task are currently hampered due to the lack of readily-available ground truth segmentation data. Thus, we present a data-driven pipeline consisting of a k-means followed by a compact marker-based watershed algorithm which enables to generate airway instance segmentation maps from given depth images. In this way, these traditional algorithms serve as weak supervision for training a shallow CNN directly on RGB images solely based on a phantom dataset. We evaluate generalization capabilities of this model on two in-vivo datasets covering 250 frames on 21 different bronchoscopies. We demonstrate that its performance is comparable to those models being directly trained on in-vivo data, reaching an average error of 11 vs 5 pixels for the detected centers of the airway segmentation by an image resolution of 128x128. Our quantitative and qualitative results indicate that in the context of video bronchoscopy, phantom data and weak supervision using non-learning-based approaches enable to gain a semantic understanding of airway structures.
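摘要中"k均值 + 基于标记的分水岭"生成弱监督标签的流程可以这样示意(简化版,簇数与阈值为假设值,深度图用随机数占位;需 pip install scikit-learn scikit-image scipy):

```python
import numpy as np
from scipy import ndimage as ndi
from sklearn.cluster import KMeans
from skimage.segmentation import watershed

depth = np.random.rand(128, 128).astype(np.float32)   # 占位的深度图

# 1) k 均值按深度把像素分为"深(管腔)/浅(管壁)"两簇
labels = KMeans(n_clusters=2, n_init=10).fit_predict(depth.reshape(-1, 1))
labels = labels.reshape(depth.shape)
lumen = labels == labels[depth > np.percentile(depth, 90)][0]  # 取较深的一簇

# 2) 距离变换的局部高值作为标记,分水岭切分出各气道口实例
distance = ndi.distance_transform_edt(lumen)
markers, _ = ndi.label(distance > 0.5 * distance.max())
instance_map = watershed(-distance, markers, mask=lumen)
print(instance_map.max(), "instances")
```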


【3】 Scenario-Adaptive and Self-Supervised Model for Multi-Scenario  Personalized Recommendation
标题:多场景个性化推荐的场景自适应自监督模型
链接:https://arxiv.org/abs/2208.11457

作者:Yuanliang Zhang,Xiaofeng Wang,Jinxin Hu,Ke Gao,Chenyi Lei,Fei Fang
机构:Alibaba Group, Hangzhou, China
备注:Accepted by CIKM 2022
摘要:多场景推荐是一种在多个场景中为用户检索相关条目的方法,在行业推荐系统中普遍存在。这些场景在用户和项目上有部分重叠,而不同场景的分布是不同的。多场景建模的关键是有效地最大化使用整个场景的信息,并在多个场景中为用户和项目生成自适应的表示。我们总结了三个对于多场景建模还没有很好解决的实际挑战:(1)缺乏多场景间细粒度、解耦的信息传递控制。(2)对整个空间样本的利用不足。(3)项目的多场景表示解纠缠问题。本文提出了一种场景自适应自监督(SASS)模型来解决上述三个问题。具体而言,我们设计了一个多层情景自适应传输(ML-SAT)模块,通过情景自适应门单元,以细粒度和解耦的方式选择和融合从整个情景到单个情景的有效传输信息。为了充分利用整个空间样本的能量,引入了预训练和微调两阶段的训练过程。预训练阶段是基于场景监督的对比学习任务,训练样本取自有标记和无标记的数据空间。该模型在用户端和项目端对称创建,从而可以得到不同场景下项目的区别表示。在公共数据集和工业数据集上的大量实验结果证明了SASS模型相对于现有方法的优越性。在A/B测试中,该模型还实现了每个用户平均观看时间8.0%以上的改进。
摘要:Multi-scenario recommendation is dedicated to retrieve relevant items for users in multiple scenarios, which is ubiquitous in industrial recommendation systems. These scenarios enjoy portions of overlaps in users and items, while the distribution of different scenarios is different. The key point of multi-scenario modeling is to efficiently maximize the use of whole-scenario information and granularly generate adaptive representations both for users and items among multiple scenarios. we summarize three practical challenges which are not well solved for multi-scenario modeling: (1) Lacking of fine-grained and decoupled information transfer controls among multiple scenarios. (2) Insufficient exploitation of entire space samples. (3) Item's multi-scenario representation disentanglement problem. In this paper, we propose a Scenario-Adaptive and Self-Supervised (SASS) model to solve the three challenges mentioned above. Specifically, we design a Multi-Layer Scenario Adaptive Transfer (ML-SAT) module with scenario-adaptive gate units to select and fuse effective transfer information from whole scenario to individual scenario in a quite fine-grained and decoupled way. To sufficiently exploit the power of entire space samples, a two-stage training process including pre-training and fine-tune is introduced. The pre-training stage is based on a scenario-supervised contrastive learning task with the training samples drawn from labeled and unlabeled data spaces. The model is created symmetrically both in user side and item side, so that we can get distinguishing representations of items in different scenarios. Extensive experimental results on public and industrial datasets demonstrate the superiority of the SASS model over state-of-the-art methods. This model also achieves more than 8.0% improvement on Average Watching Time Per User in online A/B tests.


【4】 Self-Supervised Exploration via Temporal Inconsistency in Reinforcement  Learning
标题:强化学习中基于时间不一致性的自监督探索
链接:https://arxiv.org/abs/2208.11361

作者 :Zijian Gao,Kele Xu,HengXing Cai,Yuanzhao Zhai,Dawei Feng,Bo Ding,XinJun Mao,Huaimin Wang
机构: National University of Defense Technology, Changsha, China, Paradigm, Beijing, China
摘要:在现实世界场景中,尽管该领域的研究兴趣高涨,稀疏奖励设置下的强化学习仍然具有挑战性。先前的尝试表明,内在奖励可以缓解稀疏性导致的问题。在本文中,我们提出了一种受人类学习启发的新颖内在奖励:人类通过将当前观察与历史知识进行比较来评估好奇心。具体而言,我们训练一个自监督预测模型,并保存一组模型参数的快照,而不产生额外的训练成本;然后利用核范数评估不同快照的预测之间的时间不一致性,并将其作为内在奖励。此外,我们提出了一种变分加权机制,以自适应的方式为不同快照分配权重。我们在多种基准环境中验证了所提方法的有效性。结果表明,与其他基于内在奖励的方法相比,该方法在不增加额外训练成本的情况下可提供压倒性的最先进性能,并保持更高的噪声容忍度。我们的代码将公开发布,以增强可复现性。
摘要:In real-world scenarios, reinforcement learning under sparse-reward synergistic settings has remained challenging, despite surging interests in this field. Previous attempts suggest that intrinsic reward can alleviate the issue caused by sparsity. In this paper, we present a novel intrinsic reward that is inspired by human learning, as humans evaluate curiosity by comparing current observations with historical knowledge. Specifically, we train a self-supervised prediction model and save a set of snapshots of the model parameters, without incurring addition training cost. Then we employ nuclear norm to evaluate the temporal inconsistency between the predictions of different snapshots, which can be further deployed as the intrinsic reward. Moreover, a variational weighting mechanism is proposed to assign weight to different snapshots in an adaptive manner. We demonstrate the efficacy of the proposed method in various benchmark environments. The results suggest that our method can provide overwhelming state-of-the-art performance compared with other intrinsic reward-based methods, without incurring additional training costs and maintaining higher noise tolerance. Our code will be released publicly to enhance reproducibility.
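核范数(奇异值之和)度量"快照间预测不一致性"的做法可以几行代码示意(快照预测以随机向量占位;预测越分散、矩阵越接近满秩,核范数越大,奖励越高;归一化等细节从略):

```python
import numpy as np

def intrinsic_reward(snapshot_predictions):
    # 形状 (n_snapshots, d):各参数快照对同一观测的预测
    M = np.asarray(snapshot_predictions)
    return np.linalg.norm(M, ord="nuc")   # 核范数 = 奇异值之和

preds = [np.random.rand(32) for _ in range(5)]   # 5 个快照的占位预测
print("intrinsic reward:", intrinsic_reward(preds))
```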


【5】 Time-to-Green predictions for fully-actuated signal control systems with  supervised learning
标题:基于监督学习的全感应式信号控制系统绿灯时间预测
链接:https://arxiv.org/abs/2208.11344

作者:Alexander Genser,Michail A. Makridis,Kaidi Yang,Lukas Ambühl,Monica Menendez,Anastasios Kouvelas
摘要:近来,人们一直在努力将信号相位与配时(SPaT)消息标准化。这些消息包含信号交叉口所有进口道的信号相位配时,因而可用于高效的运动规划,从而带来更均匀的交通流和更平稳的速度曲线。尽管已有工作为半感应式信号控制系统提供了鲁棒的预测,但预测全感应式控制的信号相位配时仍然具有挑战性。本文提出了一个基于聚合的交通信号与环形线圈检测器数据的时间序列预测框架,利用最先进的机器学习模型预测未来信号相位的持续时间。我们以朴素基线模型为参照,评估了线性回归(LR)、随机森林(RF)和长短期记忆(LSTM)神经网络的性能。基于瑞士苏黎世一个全感应式信号控制系统经验数据集的结果表明,机器学习模型优于传统预测方法;其中基于树的决策模型(如RF)表现最佳,其精度满足实际应用的要求。
摘要:Recently, efforts have been made to standardize signal phase and timing (SPaT) messages. These messages contain signal phase timings of all signalized intersection approaches. This information can thus be used for efficient motion planning, resulting in more homogeneous traffic flows and uniform speed profiles. Despite efforts to provide robust predictions for semi-actuated signal control systems, predicting signal phase timings for fully-actuated controls remains challenging. This paper proposes a time series prediction framework using aggregated traffic signal and loop detector data. We utilize state-of-the-art machine learning models to predict future signal phases' duration. The performance of a Linear Regression (LR), a Random Forest (RF), and a Long-Short-Term-Memory (LSTM) neural network are assessed against a naive baseline model. Results based on an empirical data set from a fully-actuated signal control system in Zurich, Switzerland, show that machine learning models outperform conventional prediction methods. Furthermore, tree-based decision models such as the RF perform best with an accuracy that meets requirements for practical applications.
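这类"由检测器特征回归绿灯时长"的监督学习基线可以这样搭(特征与数据均为假设的占位,并非论文的真实特征工程;需 pip install scikit-learn):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((1000, 4))    # 例:相位已持续时间、各进口道占有率、排队估计、时段
y = 5 + 30 * X[:, 0] + rng.normal(0, 2, 1000)   # 占位目标:剩余绿灯秒数

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out data:", model.score(X_te, y_te))
```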


【6】 Semi-Supervised and Unsupervised Deep Visual Learning: A Survey
标题:半监督和无监督深度视觉学习研究综述
链接:https://arxiv.org/abs/2208.11296

作者:Yanbei Chen,Massimiliano Mancini,Xiatian Zhu,Zeynep Akata
机构:University of Surrey (X. Zhu)
备注:IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
摘要:最先进的深度学习模型通常使用大量昂贵的标记训练数据进行训练。然而,要求详尽的人工注释可能会降低模型在有限标签制度下的推广性。半监督学习和无监督学习提供了从大量未标记视觉数据中学习的有前途的范例。这些范例的最新进展表明,利用未标记数据来提高模型泛化能力和提供更好的模型初始化具有很大的好处。本文从统一的角度综述了近年来视觉识别领域半监督学习(SSL)和无监督学习(UL)的高级深度学习算法。为了全面了解这些领域的最新发展水平,我们提出了一个统一的分类法。我们对现有的具有代表性的SSL和UL进行了全面而深入的分析,以突出它们在不同学习场景和不同计算机视觉任务应用中的设计原理。最后,我们讨论了SSL和UL的新兴趋势和开放挑战,以阐明未来的关键研究方向。
摘要:State-of-the-art deep learning models are often trained with a large amount of costly labeled training data. However, requiring exhaustive manual annotations may degrade the model's generalizability in the limited-label regime. Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data. Recent progress in these paradigms has indicated the strong benefits of leveraging unlabeled data to improve model generalization and provide better model initialization. In this survey, we review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective. To offer a holistic understanding of the state-of-the-art in these areas, we propose a unified taxonomy. We categorize existing representative SSL and UL with comprehensive and insightful analysis to highlight their design rationales in different learning scenarios and applications in different computer vision tasks. Lastly, we discuss the emerging trends and open challenges in SSL and UL to shed light on future critical research directions.


【7】 Federated Self-Supervised Contrastive Learning and Masked Autoencoder  for Dermatological Disease Diagnosis
标题:用于皮肤病诊断的联邦自监督对比学习与掩码自编码器
链接:https://arxiv.org/abs/2208.11278

作者:Yawen Wu,Dewen Zeng,Zhepeng Wang,Yi Sheng,Lei Yang,Alaina J. James,Yiyu Shi,Jingtong Hu
机构:Department of Electrical and Computer Engineering, University of Pittsburgh, USA, Department of Computer Science and Engineering, University of Notre Dame, USA, Department of Electrical and Computer Engineering, George Mason University, USA
备注:arXiv admin note: substantial text overlap with arXiv:2202.07470
摘要:在皮肤病诊断中,移动皮肤助手收集的隐私数据存在于患者的分布式移动设备上。联邦学习(FL)可以使用分散的数据来训练模型,同时保持数据本地化。现有的FL方法假设所有数据都具有标签。然而,由于高标签成本,医疗数据通常没有完整的标签。自监督学习(SSL)方法、对比学习(CL)和掩蔽自动编码器(MAE)可以利用未标记的数据来预训练模型,然后用有限的标记来微调。然而,结合SSL和FL具有独特的挑战。例如,CL需要不同的数据,但每个设备只有有限的数据。对于MAE,虽然基于Vision Transformer(ViT)的MAE在集中式学习中比CNN具有更高的准确性,但是尚未研究MAE在FL中使用未标记数据的性能。此外,服务器端和客户端之间的ViT同步也不同于传统的CNNs。因此,需要设计特殊的同步方法。在这项工作中,我们提出了两个联邦自监督学习框架的皮肤病诊断与有限的标签。第一种方法的特点是计算成本较低,适用于移动设备。第二种具有高精度,适合高性能服务器。在此基础上,提出了基于特征共享的联邦对比学习算法(FedCLF)。特征被共享用于不同的对比信息,而不共享用于隐私的原始数据。在MAE的基础上,提出了FedMAE。知识分割将从每个客户端学习到的全局知识和局部知识分开。仅聚合全局知识以获得更高的泛化性能。在皮肤病数据集上的实验表明,所提出的框架的准确性优于现有技术。
摘要 :In dermatological disease diagnosis, the private data collected by mobile dermatology assistants exist on distributed mobile devices of patients. Federated learning (FL) can use decentralized data to train models while keeping data local. Existing FL methods assume all the data have labels. However, medical data often comes without full labels due to high labeling costs. Self-supervised learning (SSL) methods, contrastive learning (CL) and masked autoencoders (MAE), can leverage the unlabeled data to pre-train models, followed by fine-tuning with limited labels. However, combining SSL and FL has unique challenges. For example, CL requires diverse data but each device only has limited data. For MAE, while Vision Transformer (ViT) based MAE has higher accuracy over CNNs in centralized learning, MAE's performance in FL with unlabeled data has not been investigated. Besides, the ViT synchronization between the server and clients is different from traditional CNNs. Therefore, special synchronization methods need to be designed. In this work, we propose two federated self-supervised learning frameworks for dermatological disease diagnosis with limited labels. The first one features lower computation costs, suitable for mobile devices. The second one features high accuracy and fits high-performance servers. Based on CL, we proposed federated contrastive learning with feature sharing (FedCLF). Features are shared for diverse contrastive information without sharing raw data for privacy. Based on MAE, we proposed FedMAE. Knowledge split separates the global and local knowledge learned from each client. Only global knowledge is aggregated for higher generalization performance. Experiments on dermatological disease datasets show superior accuracy of the proposed frameworks over state-of-the-arts.


【8】 SCALE: Online Self-Supervised Lifelong Learning without Prior Knowledge
标题:SCALE:无先验知识的在线自监督终身学习
链接:https://arxiv.org/abs/2208.11266

作者:Xiaofan Yu,Yunhui Guo,Sicun Gao,Tajana Rosing
机构: University of California San Diego,  University of Texas at Dallas
备注:Submitted for review
摘要:无监督终身学习是指在没有监督的情况下随时间学习、同时记住先前模式的能力。以往的工作假定对输入数据有很强的先验知识(例如已知类别边界),而这在复杂且不可预测的环境中可能无法获得。本文从现实场景出发,形式化地定义了基于类增量流数据的在线无监督终身学习问题,其数据是非独立同分布(non-iid)且单遍(single-pass)的。由于缺少标签和先验知识,该问题比现有的终身学习问题更具挑战性。为了解决这一问题,我们提出了自监督对比终身学习(SCALE),它能够即时提取并记忆知识。SCALE围绕三个主要组件设计:伪监督对比损失、自监督遗忘损失,以及用于均匀子集选择的在线记忆更新,三者协同工作以最大化学习性能。我们的损失函数利用成对相似性,从而消除了对监督或先验知识的依赖。我们在iid和四种非iid数据流下对SCALE进行了全面实验:SCALE在所有设置上均优于最先进的算法,在CIFAR-10、CIFAR-100和SubImageNet数据集上的kNN准确率分别最多提高6.43%、5.23%和5.86%。
摘要:Unsupervised lifelong learning refers to the ability to learn over time while memorizing previous patterns without supervision. Previous works assumed strong prior knowledge about the incoming data (e.g., knowing the class boundaries) which can be impossible to obtain in complex and unpredictable environments. In this paper, motivated by real-world scenarios, we formally define the online unsupervised lifelong learning problem with class-incremental streaming data, which is non-iid and single-pass. The problem is more challenging than existing lifelong learning problems due to the absence of labels and prior knowledge. To address the issue, we propose Self-Supervised ContrAstive Lifelong LEarning (SCALE) which extracts and memorizes knowledge on-the-fly. SCALE is designed around three major components: a pseudo-supervised contrastive loss, a self-supervised forgetting loss, and an online memory update for uniform subset selection. All three components are designed to work collaboratively to maximize learning performance. Our loss functions leverage pairwise similarity thus remove the dependency on supervision or prior knowledge. We perform comprehensive experiments of SCALE under iid and four non-iid data streams. SCALE outperforms the best state-of-the-art algorithm on all settings with improvements of up to 6.43%, 5.23% and 5.86% kNN accuracy on CIFAR-10, CIFAR-100 and SubImageNet datasets.
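SCALE 的"伪监督对比损失"建立在标准对比损失之上。下面给出 SimCLR 式 NT-Xent 损失的极简实现作参考(SCALE 在此基础上用成对相似性构造伪正样本对,细节见原文;需 pip install torch):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    # z1, z2:同一批样本两个增广视图的嵌入,形状 (N, d)
    z = F.normalize(torch.cat([z1, z2]), dim=1)      # 共 2N 个嵌入
    sim = z @ z.t() / tau                            # 余弦相似度 / 温度
    sim.fill_diagonal_(float("-inf"))                # 排除与自身的相似度
    n = z1.size(0)                                   # 第 i 个的正样本是第 i+N 个
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent(z1, z2))
```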


【9】 Calibrated and Enhanced NRLMSIS 2.0 Model with Uncertainty  Quantification
标题:带不确定性量化的NRLMSIS 2.0模型校准与增强
链接:https://arxiv.org/abs/2208.11619

作者:Richard J. Licata,Piyush M. Mehta,Daniel R. Weimer,W. Kent Tobiska,Jean Yoshii
机构:Dept. of Mechanical and Aerospace Engineering, West Virginia University, Morgantown, WV , Center for Space Science and Eng. Research, Virginia Tech, Blacksburg, VA, Space Environments Technologies, Pacific Palisades, CA
摘要:质谱仪与非相干散射雷达(MSIS)模型族自20世纪70年代早期以来持续发展和改进,其最新版本是美国海军研究实验室(NRL)的MSIS 2.0经验大气模型。NRLMSIS 2.0提供作为位置和空间天气条件函数的组分密度、质量密度和温度估计。MSIS模型长期以来都是研究界和业务界常用的大气模型,但与许多模型一样,它不提供不确定性估计。在这项工作中,我们开发了一个基于机器学习(ML)的外逸层温度模型,可与NRLMSIS 2.0配合使用,使其相对于高保真卫星密度估计得到校准。我们的模型(称为MSIS-UQ)输出的不是点估计而是一个分布,并用一种称为校准误差得分的指标进行评估。我们表明,MSIS-UQ消除了NRLMSIS 2.0的偏差,使模型与卫星密度之间的差异减少25%,并且比Space Force的高精度卫星阻力模型更接近卫星密度11%。我们还通过生成组分密度、质量密度和温度的高度剖面,展示了模型的不确定性估计能力,清晰地说明了外逸层温度的概率分布如何影响NRLMSIS 2.0中的密度和温度剖面。另一项研究表明,相对于单独的NRLMSIS 2.0,该模型改进了对磁暴后过冷现象的刻画,扩展了其能够捕捉的物理现象。
摘要:The Mass Spectrometer and Incoherent Scatter radar (MSIS) model family has been developed and improved since the early 1970's. The most recent version of MSIS is the Naval Research Laboratory (NRL) MSIS 2.0 empirical atmospheric model. NRLMSIS 2.0 provides species density, mass density, and temperature estimates as function of location and space weather conditions. MSIS models have long been a popular choice of atmosphere model in the research and operations community alike, but - like many models - does not provide uncertainty estimates. In this work, we develop an exospheric temperature model based in machine learning (ML) that can be used with NRLMSIS 2.0 to calibrate it relative to high-fidelity satellite density estimates. Instead of providing point estimates, our model (called MSIS-UQ) outputs a distribution which is assessed using a metric called the calibration error score. We show that MSIS-UQ debiases NRLMSIS 2.0 resulting in reduced differences between model and satellite density of 25% and is 11% closer to satellite density than the Space Force's High Accuracy Satellite Drag Model. We also show the model's uncertainty estimation capabilities by generating altitude profiles for species density, mass density, and temperature. This explicitly demonstrates how exospheric temperature probabilities affect density and temperature profiles within NRLMSIS 2.0. Another study displays improved post-storm overcooling capabilities relative to NRLMSIS 2.0 alone, enhancing the phenomena that it can capture.


迁移|Zero/Few/One-Shot|自适应(2篇)

【1】 Improving Natural-Language-based Audio Retrieval with Transfer Learning  and Audio & Text Augmentations
标题:利用迁移学习与音频及文本增强改进基于自然语言的音频检索
链接:https://arxiv.org/abs/2208.11460

作者:Paul Primus,Gerhard Widmer
机构:Institute of Computational Perception (CP-JKU), LIT Artificial Intelligence Lab, Johannes Kepler University, Austria
备注:submitted to DCASE Workshop 2022
摘要:在深度学习的许多应用领域中,缺乏大的标记数据集仍然是一个重大挑战。研究人员和实践者通常求助于迁移学习和数据扩充来缓解这个问题。我们在使用自然语言查询进行音频检索的背景下研究这些策略(DCASE 2022挑战的任务6b)。我们提出的系统使用预先训练的嵌入模型来将记录和文本描述投影到共享的音频字幕空间中,在该空间中来自不同模态的相关示例是接近的。我们在音频和文本输入上采用各种数据扩增技术,并通过基于模型的顺序优化系统地调整其相应的超参数。实验结果表明,所采用的增强策略减少了过拟合现象,提高了检索性能。我们进一步表明,在AudioCaps数据集上对系统进行预训练会带来额外的改进。
摘要:The absence of large labeled datasets remains a significant challenge in many application areas of deep learning. Researchers and practitioners typically resort to transfer learning and data augmentation to alleviate this issue. We study these strategies in the context of audio retrieval with natural language queries (Task 6b of the DCASE 2022 Challenge). Our proposed system uses pre-trained embedding models to project recordings and textual descriptions into a shared audio-caption space in which related examples from different modalities are close. We employ various data augmentation techniques on audio and text inputs and systematically tune their corresponding hyperparameters with sequential model-based optimization. Our results show that the used augmentations strategies reduce overfitting and improve retrieval performance. We further show that pre-training the system on the AudioCaps dataset leads to additional improvements.


【2】 Transfer Learning-based State of Health Estimation for Lithium-ion  Battery with Cycle Synchronization
标题:基于迁移学习与循环同步的锂离子电池健康状态估计
链接:https://arxiv.org/abs/2208.11204

作者:Kate Qi Zhou,Yan Qin,Chau Yuen
机构: The Singapore University of Technology and Design
摘要:准确估计电池的健康状态(SOH)有助于防止电池供电的应用意外失效。迁移学习(TL)作为一种机器学习方法,能够将从拥有大量数据的源电池中学到的知识加以应用,其优势在于减少新电池模型训练的数据需求。然而,源电池模型是否合理、哪部分信息可以迁移用于SOH估计,这些对TL成功至关重要的问题却很少被讨论。针对这些挑战,本文提出了一种可解释的、利用时间动态辅助迁移的基于TL的SOH估计方法,由三部分组成。首先,借助动态时间规整(DTW),对放电时间序列数据进行同步,得到刻画容量随循环退化的循环同步时间序列的规整路径。其次,从循环同步时间序列的空间路径中提取典型变量,用于源电池与目标电池之间的分布相似性分析。第三,当分布相似性在预定义阈值之内时,通过从源SOH估计模型迁移共同的时间动态、并用目标电池的残差模型补偿误差,构建综合的目标SOH估计模型。在一个广泛使用的开源基准数据集上,所提方法以均方根误差衡量的估计误差低至0.0034,与现有方法相比准确率提高了77%。
摘要 :Accurately estimating a battery's state of health (SOH) helps prevent battery-powered applications from failing unexpectedly. With the superiority of reducing the data requirement of model training for new batteries, transfer learning (TL) emerges as a promising machine learning approach that applies knowledge learned from a source battery, which has a large amount of data. However, the determination of whether the source battery model is reasonable and which part of information can be transferred for SOH estimation are rarely discussed, despite these being critical components of a successful TL. To address these challenges, this paper proposes an interpretable TL-based SOH estimation method by exploiting the temporal dynamic to assist transfer learning, which consists of three parts. First, with the help of dynamic time warping, the temporal data from the discharge time series are synchronized, yielding the warping path of the cycle-synchronized time series responsible for capacity degradation over cycles. Second, the canonical variates retrieved from the spatial path of the cycle-synchronized time series are used for distribution similarity analysis between the source and target batteries. Third, when the distribution similarity is within the predefined threshold, a comprehensive target SOH estimation model is constructed by transferring the common temporal dynamics from the source SOH estimation model and compensating the errors with a residual model from the target battery. Through a widely-used open-source benchmark dataset, the estimation error of the proposed method evaluated by the root mean squared error is as low as 0.0034 resulting in a 77% accuracy improvement compared with existing methods.
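摘要中的动态时间规整(DTW)是经典算法,可用几行代码实现(教学级 O(nm) 版本,返回对齐代价与规整路径;论文用类似的循环同步来对齐不同循环的放电序列):

```python
import numpy as np

def dtw(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # 回溯最优规整路径
    path, (i, j) = [], (n, m)
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))
        i, j = min([(i - 1, j), (i, j - 1), (i - 1, j - 1)],
                   key=lambda t: D[t])
    return D[n, m], path[::-1]

cost, path = dtw(np.sin(np.linspace(0, 6, 50)), np.sin(np.linspace(0, 6, 60)))
print("alignment cost:", cost, "path length:", len(path))
```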


强化学习(2篇)

【1】 Oracle-free Reinforcement Learning in Mean-Field Games along a Single  Sample Path
标题:沿单一样本路径的平均场博弈中的无Oracle强化学习
链接:https://arxiv.org/abs/2208.11639

作者:Muhammad Aneeq uz Zaman,Alec Koppel,Sujay Bhatt,Tamer Başar
机构:Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL; J.P. Morgan AI Research
摘要:我们研究平均场博弈中的在线强化学习。与已有工作不同,我们开发了一种仅利用通用智能体的单条样本路径来估计平均场和最优策略的算法,从而免除了对平均场预言机(oracle)的依赖。我们称之为沙盒学习(Sandbox Learning),因为它可以作为在多智能体非合作环境中运行的任何智能体的热启动。我们采用双时间尺度方法:平均场的在线不动点递推在较慢的时间尺度上进行,而通用智能体的控制策略更新在较快的时间尺度上同步进行。在充分探索的条件下,我们给出了平均场与控制策略收敛到平均场均衡的有限样本收敛保证。沙盒学习算法的样本复杂度为$\mathcal{O}(\epsilon^{-4})$。最后,我们通过拥塞博弈实验验证了沙盒学习算法的有效性。
摘要:We consider online reinforcement learning in Mean-Field Games. In contrast to the existing works, we alleviate the need for a mean-field oracle by developing an algorithm that estimates the mean-field and the optimal policy using a single sample path of the generic agent. We call this Sandbox Learning, as it can be used as a warm-start for any agent operating in a multi-agent non-cooperative setting. We adopt a two timescale approach in which an online fixed-point recursion for the mean-field operates on a slower timescale and in tandem with a control policy update on a faster timescale for the generic agent. Under a sufficient exploration condition, we provide finite sample convergence guarantees in terms of convergence of the mean-field and control policy to the mean-field equilibrium. The sample complexity of the Sandbox learning algorithm is $\mathcal{O}(\epsilon^{-4})$. Finally, we empirically demonstrate effectiveness of the sandbox learning algorithm in a congestion game.


【2】 Quantum Multi-Agent Meta Reinforcement Learning
标题:量子多智能体元强化学习
链接:https://arxiv.org/abs/2208.11510

作者:Won Joon Yun,Jihong Park,Joongheon Kim
机构: School of Electrical Engineering, Korea University, Seoul, Republic of Korea,  School of Information Technology, Deakin University, Geelong, VIC, Australia
摘要:虽然量子霸权尚未到来,但在即将到来的实用量子计算时代,人们对挖掘量子机器学习(QML)潜力的兴趣与日俱增。受此启发,本文基于量子神经网络(QNN)的独特性质重新设计了多智能体强化学习(MARL):QNN拥有两个相互独立的可训练参数维度,即影响输出量子比特状态的角度参数,以及与输出测量基相关联的极点参数。利用这种二元可训练性作为元学习能力,我们提出了量子元MARL(QM2ARL):先用角度训练进行元QNN学习,再用极点训练进行少样本(few-shot)或本地QNN训练。为避免过拟合,我们开发了一种角度-极点正则化技术,在角度训练期间向极点域注入噪声。此外,通过将极点用作每个已训练QNN的存储地址,我们引入了极点记忆的概念,仅用两个极点参数值即可保存和加载已训练的QNN。我们从理论上证明了角度-极点正则化下角度训练的收敛性,并通过仿真验证了QM2ARL在获得高回报和快速收敛方面的有效性,以及极点记忆在快速适应时变环境方面的有效性。
摘要:Although quantum supremacy is yet to come, there has recently been an increasing interest in identifying the potential of quantum machine learning (QML) in the looming era of practical quantum computing. Motivated by this, in this article we re-design multi-agent reinforcement learning (MARL) based on the unique characteristics of quantum neural networks (QNNs) having two separate dimensions of trainable parameters: angle parameters affecting the output qubit states, and pole parameters associated with the output measurement basis. Exploiting this dyadic trainability as meta-learning capability, we propose quantum meta MARL (QM2ARL) that first applies angle training for meta-QNN learning, followed by pole training for few-shot or local-QNN training. To avoid overfitting, we develop an angle-to-pole regularization technique injecting noise into the pole domain during angle training. Furthermore, by exploiting the pole as the memory address of each trained QNN, we introduce the concept of pole memory allowing one to save and load trained QNNs using only two-parameter pole values. We theoretically prove the convergence of angle training under the angle-to-pole regularization, and by simulation corroborate the effectiveness of QM2ARL in achieving high reward and fast convergence, as well as of the pole memory in fast adaptation to a time-varying environment.


符号|符号学习(1篇)

【1】 Deep Symbolic Learning: Discovering Symbols and Rules from Perceptions
标题:深度符号学习:从感知中发现符号和规则
链接:https://arxiv.org/abs/2208.11561

作者:Alessandro Daniele,Tommaso Campari,Sagar Malhotra,Luciano Serafini
机构:Fondazione Bruno Kessler, Trento, Italy, Universita degli Studi di Padova, Italy, Universita degli Studi di Trento, Italy
摘要:神经-符号(NeSy)集成将符号推理与神经网络(NN)相结合,用于需要感知和推理的任务。大多数NeSy系统依赖逻辑知识的连续松弛,在模型流水线内不做任何离散决策;此外,这些方法假定符号规则是预先给定的。本文提出深度符号学习(DSL),一种学习NeSy函数的NeSy系统:NeSy函数是将连续数据映射为离散符号的(一组)感知函数与定义在符号集合上的符号函数的复合。DSL同时学习感知函数和符号函数,而训练时只监督它们的复合(即NeSy函数)。DSL的关键创新在于,它能在可微的NN学习流水线中创建内部(可解释的)符号表示,并将其映射到感知输入。创建的符号会被自动选择,以生成最能解释数据的符号函数。我们提供了实验分析,证实了DSL在同时学习感知和符号函数方面的有效性。
摘要:Neuro-Symbolic (NeSy) integration combines symbolic reasoning with Neural Networks (NNs) for tasks requiring perception and reasoning. Most NeSy systems rely on continuous relaxation of logical knowledge and no discrete decisions are made within the model pipeline. Furthermore, these methods assume that the symbolic rules are given. In this paper, we propose Deep Symbolic Learning (DSL), a NeSy system that learns NeSy-functions, i.e., the composition of a (set of) perception functions which map continuous data to discrete symbols, and a symbolic function over the set of symbols. DSL learns simultaneously the perception and symbolic functions, while being trained only on their composition (NeSy-function). The key novelty of DSL is that it can create internal (interpretable) symbolic representations and map them to perception inputs within a differentiable NN learning pipeline. The created symbols are automatically selected to generate symbolic functions that best explain the data. We provide experimental analysis to substantiate the efficacy of DSL in simultaneously learning perception and symbolic functions.


医学相关(2篇)

【1】 Adverse Childhood Experiences Identification from Clinical Notes with  Ontologies and NLP
标题:基于本体与自然语言处理从临床笔记中识别童年不良经历
链接:https://arxiv.org/abs/2208.11466

作者:Jinge Wu,Rowena Smith,Honghan Wu
机构:Institute of Health Informatics, University College London, London, UK, Usher Institute, University of Edinburgh, Edinburgh, UK
摘要:童年不良经历(ACE)是指在整个童年和/或青少年时期发生的一系列高度紧张且具有潜在创伤性的事件或处境。研究表明,它们与晚年罹患精神疾病或出现其他异常行为的风险增加有关。然而,利用自然语言处理(NLP)从自由文本电子健康记录(EHR)中识别ACE具有挑战性,因为(a)尚不存在可直接用于NLP的ACE本体;(b)可用于机器学习的病例有限,需要临床专家进行数据标注。我们目前正在开发一个工具,利用NLP技术帮助我们从临床笔记中发现ACE。这将使我们能够在大规模、纵向的自由文本EHR中,进一步研究ACE与随后精神疾病(如成瘾)发展之间关系的证据,而这在以前是不可能的。
摘要:Adverse Childhood Experiences (ACEs) are defined as a collection of highly stressful, and potentially traumatic, events or circumstances that occur throughout childhood and/or adolescence. They have been shown to be associated with increased risks of mental health diseases or other abnormal behaviours in later lives. However, the identification of ACEs from free-text Electronic Health Records (EHRs) with Natural Language Processing (NLP) is challenging because (a) there is no NLP ready ACE ontologies; (b) there are limited cases available for machine learning, necessitating the data annotation from clinical experts. We are currently developing a tool that would use NLP techniques to assist us in surfacing ACEs from clinical notes. This will enable us further research in identifying evidence of the relationship between ACEs and the subsequent developments of mental illness (e.g., addictions) in large-scale and longitudinal free-text EHRs, which has previously not been possible.


【2】 Molecular Substructure-Aware Network for Drug-Drug Interaction  Prediction
标题:用于药物相互作用预测的分子子结构感知网络
链接:https://arxiv.org/abs/2208.11267

作者:Xinyu Zhu,Yongliang Shen,Weiming Lu
机构:Zhejiang University, Hangzhou, China
备注:Accepted to CIKM 2022 (Short), camera ready version
摘要:合并用药可能导致药物相互作用(DDI)。一些药物组合是有益的,但另一些可能造成此前未被记录的负面影响。以往的DDI预测工作通常依赖人工构建的领域知识,获取起来费时费力。本文提出了一种新模型:分子子结构感知网络(MSAN),可有效地从药物对的分子结构中预测潜在DDI。我们采用类似Transformer的子结构提取模块,获取与药物分子各种子结构模式相关联的固定数量的代表性向量;然后由基于相似性的交互模块捕获两种药物子结构之间的相互作用强度。我们还在图编码之前执行子结构丢弃(substructure dropping)增强,以缓解过拟合。在真实数据集上的实验结果表明,所提模型达到了最先进的性能。我们还通过案例研究表明,模型的预测结果具有很高的可解释性。
摘要:Concomitant administration of drugs can cause drug-drug interactions (DDIs). Some drug combinations are beneficial, but other ones may cause negative effects which are previously unrecorded. Previous works on DDI prediction usually rely on hand-engineered domain knowledge, which is laborious to obtain. In this work, we propose a novel model, Molecular Substructure-Aware Network (MSAN), to effectively predict potential DDIs from molecular structures of drug pairs. We adopt a Transformer-like substructure extraction module to acquire a fixed number of representative vectors that are associated with various substructure patterns of the drug molecule. Then, interaction strength between the two drugs' substructures will be captured by a similarity-based interaction module. We also perform a substructure dropping augmentation before graph encoding to alleviate overfitting. Experimental results from a real-world dataset reveal that our proposed model achieves the state-of-the-art performance. We also show that the predictions of our model are highly interpretable through a case study.


蒸馏|知识提取(2篇)

【1】 Debias the Black-box: A Fair Ranking Framework via Knowledge  Distillation
标题:为黑盒去偏:基于知识蒸馏的公平排序框架
链接:https://arxiv.org/abs/2208.11628

作者:Zhitao Zhu,Shijing Si,Jianzong Wang,Yaodong Yang,Jing Xiao
机构:Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, China; IAT, University of Science and Technology of China, Hefei, China; School of Economics and Finance, Shanghai International Studies University, Shanghai, China
备注:This paper has been accepted by the 23rd International Conference on Web Information Systems Engineering (WISE 2022)
摘要:深度神经网络凭借大量复杂的非线性单元,能够捕获查询与文档之间错综复杂的交互历史信息,从而给出正确的搜索推荐。然而,服务提供商在现实环境中经常面临更复杂的障碍,例如部署成本约束和公平性要求。知识蒸馏将训练良好的复杂模型(教师)的知识迁移到简单模型(学生)中,已被提出用于缓解前一问题,但当前最好的蒸馏方法只关注如何让学生模型模仿教师模型的预测。为了更好地促进深度模型的应用,我们提出了一种基于知识蒸馏的公平信息检索框架。该框架可以在显著减小模型规模的同时,提升模型基于曝光的公平性。在三个大规模数据集上的大量实验表明,所提框架可以在保持模型黑盒状态的前提下,将模型大小最少压缩至原来的1%;同时在保持较高推荐有效性的情况下,将公平性表现提升15%~46%。
摘要:Deep neural networks can capture the intricate interaction history information between queries and documents, because of their many complicated nonlinear units, allowing them to provide correct search recommendations. However, service providers frequently face more complex obstacles in real-world circumstances, such as deployment cost constraints and fairness requirements. Knowledge distillation, which transfers the knowledge of a well-trained complex model (teacher) to a simple model (student), has been proposed to alleviate the former concern, but the best current distillation methods focus only on how to make the student model imitate the predictions of the teacher model. To better facilitate the application of deep models, we propose a fair information retrieval framework based on knowledge distillation. This framework can improve the exposure-based fairness of models while considerably decreasing model size. Our extensive experiments on three huge datasets show that our proposed framework can reduce the model size to a minimum of 1% of its original size while maintaining its black-box state. It also improves fairness performance by 15%~46% while keeping a high level of recommendation effectiveness.
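摘要所依赖的知识蒸馏,其经典形式是"温度软化的 KL 散度 + 任务损失"(Hinton 式)。下面是一个极简实现作参考(论文在此之上叠加了基于曝光的公平性设计,细节见原文;需 pip install torch):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # 软目标:温度 T 软化后的教师分布与学生分布的 KL(乘 T^2 保持梯度量级)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)   # 硬目标:常规任务损失
    return alpha * soft + (1 - alpha) * hard

s, t = torch.randn(16, 10), torch.randn(16, 10)      # 占位的学生/教师 logits
print(distillation_loss(s, t, torch.randint(0, 10, (16,))))
```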


【2】 Federated Learning via Decentralized Dataset Distillation in  Resource-Constrained Edge Environments
标题:资源受限边缘环境下基于去中心化数据集蒸馏的联邦学习
链接:https://arxiv.org/abs/2208.11311

作者:Rui Song,Dai Liu,Dave Zhenyu Chen,Andreas Festag,Carsten Trinitis,Martin Schulz,Alois Knoll
机构:Fraunhofer IVI, Technical University of Munich, Technische Hochschule Ingolstadt
摘要:提出了一种新的联邦学习框架FedD3,该框架减少了总体通信量,使联邦学习的概念可以应用于更多网络受限环境中的场景。它通过利用本地数据集蒸馏而非传统的学习方法来实现这一点:(i)显著减少通信量;(ii)将传输限制为一次性通信,而不是迭代的多轮通信。与其他联邦学习方法共享模型更新不同,FedD3允许相连的客户端独立地蒸馏本地数据集,然后仅通过网络一次性聚合这些去中心化的蒸馏数据集(通常是几张不可识别的图像,一般比模型更小),以形成最终模型。实验结果表明,FedD3在所需通信量方面明显优于其他联邦学习框架,同时它还提供了额外的优势,能够根据使用场景或目标数据集在准确率和通信成本之间进行权衡。例如,对于在具有10个客户端的Non-IID CIFAR-10数据集上训练AlexNet模型,与其他一次性联邦学习方法相比,FedD3可以在相近通信量的情况下将准确率提高超过71%,或者在达到相同准确率的情况下节省98%的通信量。
摘要:We introduce a novel federated learning framework, FedD3, which reduces the overall communication volume and with that opens up the concept of federated learning to more application scenarios in network-constrained environments. It achieves this by leveraging local dataset distillation instead of traditional learning approaches (i) to significantly reduce communication volumes and (ii) to limit transfers to one-shot communication, rather than iterative multiway communication. Instead of sharing model updates, as in other federated learning approaches, FedD3 allows the connected clients to distill the local datasets independently, and then aggregates those decentralized distilled datasets (typically in the form a few unrecognizable images, which are normally smaller than a model) across the network only once to form the final model. Our experimental results show that FedD3 significantly outperforms other federated learning frameworks in terms of needed communication volumes, while it provides the additional benefit to be able to balance the trade-off between accuracy and communication cost, depending on usage scenario or target dataset. For instance, for training an AlexNet model on a Non-IID CIFAR-10 dataset with 10 clients, FedD3 can either increase the accuracy by over 71% with a similar communication volume, or save 98% of communication volume, while reaching the same accuracy, comparing to other one-shot federated learning approaches.
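
按照摘要的思路,FedD3的一次性流程可概括为“各客户端本地蒸馏数据集,服务器汇总后集中训练一次”。下面是一个高度简化的示意,其中 distill_local_dataset 仅为占位实现(真实系统中应运行数据集蒸馏优化),张量形状等均为假设:

    import torch

    def distill_local_dataset(local_data, n_images=10):
        # 占位:真实实现应通过数据集蒸馏得到少量合成样本
        xs = torch.randn(n_images, 3, 32, 32)
        ys = torch.randint(0, 10, (n_images,))
        return xs, ys

    # 各客户端独立蒸馏,只一次性上传少量合成样本(而非模型更新)
    distilled = [distill_local_dataset(d) for d in [None] * 10]

    # 服务器端:拼接所有去中心化蒸馏数据集,集中训练一次得到最终模型
    X = torch.cat([xs for xs, _ in distilled])
    Y = torch.cat([ys for _, ys in distilled])
    # model = train_centrally(X, Y)  # 占位:普通集中式训练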


联邦学习|隐私保护|加密(3篇)

【1】 PromptFL: Let Federated Participants Cooperatively Learn Prompts Instead  of Models -- Federated Learning in Age of Foundation Model
标题:PromptFL:让联合参与者协作学习提示而不是模型--基础模型时代的联合学习
链接:https://arxiv.org/abs/2208.11625

作者:Tao Guo,Song Guo,Junxiao Wang,Wenchao Xu
机构:Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
摘要:快速全局聚合有效的分布式参数是联邦学习(FL)的关键,这需要足够的带宽进行参数通信,以及足够的用户数据进行本地训练;否则,FL可能耗费过多的训练时间才能收敛,并产生不准确的模型。本文提出了一个全新的FL框架PromptFL,它用联邦提示训练代替联邦模型训练,即让联邦参与者训练提示而不是共享模型,以分布式方式利用基础模型(FM)的能力,同时在数据不足的情况下实现高效的全局聚合和本地训练。PromptFL将一个现成的FM(即CLIP)下发给分布式客户端,这些客户端基于非常少的本地数据协作地训练共享的软提示。由于PromptFL只需要更新提示而不是整个模型,本地训练和全局聚合都可以显著加速。而在大规模数据上训练出的FM,借助训练好的软提示,能够对分布式用户任务提供很强的适应能力。我们通过大量实验对PromptFL进行了实证分析,证明了其在系统可行性、用户隐私和性能方面的优越性。
摘要:Quick global aggregation of effective distributed parameters is crucial to federated learning (FL), which requires adequate bandwidth for parameters communication and sufficient user data for local training. Otherwise, FL may cost excessive training time for convergence and produce inaccurate models. In this paper, we propose a brand-new FL framework, PromptFL, that replaces the federated model training with the federated prompt training, i.e., let federated participants train prompts instead of a shared model, to simultaneously achieve the efficient global aggregation and local training on insufficient data by exploiting the power of foundation models (FM) in a distributed way. PromptFL ships an off-the-shelf FM, i.e., CLIP, to distributed clients who would cooperatively train shared soft prompts based on very few local data. Since PromptFL only needs to update the prompts instead of the whole model, both the local training and the global aggregation can be significantly accelerated. And FM trained over large scale data can provide strong adaptation capability to distributed users tasks with the trained soft prompts. We empirically analyze the PromptFL via extensive experiments, and show its superiority in terms of system feasibility, user privacy, and performance.
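
下面用几行代码勾勒“只训练并聚合软提示”的核心思想(极简示意:冻结的CLIP前向以占位损失代替,提示长度、维度与轮数均为假设,非论文官方实现):

    import torch

    def local_prompt_update(global_prompt, steps=10, lr=1e-2):
        # 客户端:冻结FM主干,仅优化软提示
        prompt = global_prompt.clone().requires_grad_(True)
        opt = torch.optim.SGD([prompt], lr=lr)
        for _ in range(steps):
            loss = (prompt ** 2).mean()  # 占位:真实损失来自CLIP的提示分类
            opt.zero_grad(); loss.backward(); opt.step()
        return prompt.detach()

    global_prompt = torch.randn(16, 512)     # 假设:提示长度16、嵌入维度512
    for rnd in range(3):                     # 联邦轮次
        updates = [local_prompt_update(global_prompt) for _ in range(5)]
        global_prompt = torch.stack(updates).mean(0)  # 仅对提示做FedAvg式聚合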


【2】 Exact Penalty Method for Federated Learning
标题:联合学习的精确罚函数法
链接:https://arxiv.org/abs/2208.11231

作者:Shenglong Zhou,and Geoffrey Ye Li
机构:Department of Electrical and Electronic Engineering
摘要:近年来,联邦学习在机器学习领域蓬勃发展,引发了各种研究课题。流行的优化算法基于(随机)梯度下降法或乘子交替方向法的框架。本文采用精确罚函数方法来处理联邦学习,提出了算法FedEPM,它能够解决联邦学习中的四个关键问题:通信效率、计算复杂度、掉队者效应和数据保密性。此外,我们证明了该算法的收敛性,并通过实验验证了其较高的数值性能。
摘要:Federated learning has burgeoned recently in machine learning, giving rise to a variety of research topics. Popular optimization algorithms are based on the frameworks of the (stochastic) gradient descent methods or the alternating direction method of multipliers. In this paper, we deploy an exact penalty method to deal with federated learning and propose an algorithm, FedEPM, that enables to tackle four critical issues in federated learning: communication efficiency, computational complexity, stragglers' effect, and data privacy. Moreover, it is proven to be convergent and testified to have high numerical performance.
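
摘要未给出具体公式;作为背景,可以把联邦学习写成带一致性约束的优化问题,并用罚函数把约束并入目标,如下示意(记号为假设,非论文原文):

    \min_{x_1,\dots,x_m,\,z}\ \sum_{i=1}^{m} f_i(x_i)\quad \text{s.t.}\quad x_i=z,\ i=1,\dots,m
    \quad\Longrightarrow\quad
    \min_{x_1,\dots,x_m,\,z}\ \sum_{i=1}^{m} f_i(x_i)+\rho\sum_{i=1}^{m}\lVert x_i-z\rVert

其中 f_i 为客户端 i 的本地经验风险。精确罚函数法的要点在于:采用非光滑罚项时,存在有限的罚参数 ρ,使罚问题与原约束问题具有相同的解。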


【3】 FedOS: using open-set learning to stabilize training in federated  learning
标题:FedOS:使用开放集合学习来稳定联合学习中的训练
链接:https://arxiv.org/abs/2208.11512

作者:Mohamad Mohamad,Julian Neubert,Juan Segundo Ayardo
备注:Project report for the course of Advance Machine Learning. year 2021-22, Polytechnic of Turin
摘要:联邦学习是一种在分布式数据集上训练统计模型而不违反隐私约束的新方法。通过在客户端和服务器之间共享模型而不是数据,可以保留数据局部性原则。这带来了许多好处,但也提出了新的挑战。在本报告中,我们将探索这一新的研究领域,并进行几个实验,以加深我们对这些挑战的理解,以及不同的问题设置如何影响最终模型的性能。最后,我们提出了一种新的方法来解决这些挑战之一,并将其与文献中的其他方法进行了比较。
摘要:Federated Learning is a recent approach to train statistical models on distributed datasets without violating privacy constraints. The data locality principle is preserved by sharing the model instead of the data between clients and the server. This brings many advantages but also poses new challenges. In this report, we explore this new research area and perform several experiments to deepen our understanding of what these challenges are and how different problem settings affect the performance of the final model. Finally, we present a novel approach to one of these challenges and compare it to other methods found in literature.


推理|分析|理解|解释(3篇)

【1】 Explainable AI for tailored electricity consumption feedback -- an  experimental evaluation of visualizations
标题:用于定制用电反馈的可解释人工智能--可视化的实验评估
链接:https://arxiv.org/abs/2208.11408

作者:Jacqueline Wastensteiner,Tobias M. Weiss,Felix Haag,Konstantin Hopf
机构:University of Bamberg
摘要:机器学习(ML)方法可以有效地分析数据,识别其中的模式,并做出高质量的预测。好的预测通常伴随着“黑盒”模型,这些模型无法以人类可读的方式呈现检测到的模式。最近的技术发展带来了可解释人工智能(XAI)技术,其目的是打开这样的黑盒,使人类能够从检测到的模式中获得新的见解。我们研究了XAI在电力使用这一领域的应用;在该领域,具体的洞察可以对消费者行为产生重大影响。基于“对个人用电量的具体反馈会触发资源节约”这一事实,我们在考虑现有领域特定设计知识的前提下,使用ML和XAI方法从用电量时间序列创建了五种可视化,以实现高度个性化的反馈。我们对152名参与者进行的实验评估表明,人类可以吸收XAI可视化所展示的模式,但这类可视化应遵循已知的可视化模式,才能被用户很好地理解。
摘要:Machine learning (ML) methods can effectively analyse data, recognize patterns in them, and make high-quality predictions. Good predictions usually come along with "black-box" models that are unable to present the detected patterns in a human-readable way. Technical developments recently led to eXplainable Artificial Intelligence (XAI) techniques that aim to open such black-boxes and enable humans to gain new insights from detected patterns. We investigated the application of XAI in an area where specific insights can have a significant effect on consumer behaviour, namely electricity use. Knowing that specific feedback on individuals' electricity consumption triggers resource conservation, we created five visualizations with ML and XAI methods from electricity consumption time series for highly personalized feedback, considering existing domain-specific design knowledge. Our experimental evaluation with 152 participants showed that humans can assimilate the pattern displayed by XAI visualizations, but such visualizations should follow known visualization patterns to be well-understood by users.


【2】 Augmented cross-selling through explainable AI -- a case from energy  retailing
标题:通过可解释的人工智能加强交叉销售--以能源零售业为例
链接:https://arxiv.org/abs/2208.11404

作者:Felix Haag,Konstantin Hopf,Pedro Menelau Vasconcelos,Thorsten Staake
机构: University of Bamberg
摘要:机器学习(ML)技术的进步使人们对利用这项技术支持决策产生了浓厚的兴趣。虽然复杂的ML模型提供的预测通常比传统工具的预测更准确,但此类模型通常向其用户隐藏预测背后的推理,这可能导致较低的采用率和缺乏洞察力。在这种紧张关系的推动下,研究人员提出了可解释人工智能(XAI)技术,用以揭示ML发现的模式。尽管ML和XAI都被寄予厚望,但几乎没有经验证据表明其对传统业务的好处。为此,我们分析了一家能源零售商的220,185个客户的数据,预测交叉购买的正确率(AUC)高达86%,并表明XAI方法SHAP提供了适用于实际购买者的解释。我们进一步概述了对信息系统、XAI和关系营销研究的影响。
摘要 :The advance of Machine Learning (ML) has led to a strong interest in this technology to support decision making. While complex ML models provide predictions that are often more accurate than those of traditional tools, such models often hide the reasoning behind the prediction from their users, which can lead to lower adoption and lack of insight. Motivated by this tension, research has put forth Explainable Artificial Intelligence (XAI) techniques that uncover patterns discovered by ML. Despite the high hopes in both ML and XAI, there is little empirical evidence of the benefits to traditional businesses. To this end, we analyze data on 220,185 customers of an energy retailer, predict cross-purchases with up to 86% correctness (AUC), and show that the XAI method SHAP provides explanations that hold for actual buyers. We further outline implications for research in information systems, XAI, and relationship marketing.
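
摘要提到用SHAP解释交叉购买预测;对于树模型,获取SHAP解释的典型用法如下(示意:数据与模型均为随机占位,特征含义为假设):

    import numpy as np
    import shap
    import xgboost as xgb

    X = np.random.rand(500, 8)                    # 占位:客户特征表
    y = (X[:, 0] + X[:, 3] > 1.0).astype(int)     # 占位:是否发生交叉购买
    model = xgb.XGBClassifier(n_estimators=50).fit(X, y)

    explainer = shap.TreeExplainer(model)         # 针对树模型的高效解释器
    shap_values = explainer.shap_values(X)        # 每个特征对每条预测的贡献
    shap.summary_plot(shap_values, X)             # 全局特征重要性与作用方向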


【3】 A novel approach for Fair Principal Component Analysis based on  eigendecomposition
标题:一种基于特征分解的公平主成分分析新方法
链接:https://arxiv.org/abs/2208.11362

作者:Guilherme Dean Pelegrina,Leonardo Tomazeli Duarte
机构:School of Applied Sciences (FCA), University of Campinas (UNICAMP)
摘要:主成分分析(PCA)是信号处理中一种普遍存在的降维技术,它通过寻找一个投影矩阵来使降维后的数据集与原始数据集之间的均方误差最小。由于经典PCA不适合于解决与公平性相关的问题,因此其在实际问题中的应用可能导致不同组的重构误差的不一致(例如,男人和女人、白人和黑人等),有可能产生有害的后果,例如对敏感群体产生偏见。虽然最近已经提出了几种PCA的公平版本,但在寻找足够简单以部署在实际系统中的算法方面仍然存在根本差距。为了解决这个问题,我们提出了一种新的PCA算法,它通过一个简单的策略来处理公平性问题,该策略包括利用PCA的封闭形式解的一维搜索。数值实验结果表明,该算法在不需要复杂的优化方案的情况下,能够以很小的重构误差损失显著提高公平性。此外,我们的发现在几个真实情况下以及在不平衡和平衡数据集的场景中是一致的。
摘要:Principal component analysis (PCA), a ubiquitous dimensionality reduction technique in signal processing, searches for a projection matrix that minimizes the mean squared error between the reduced dataset and the original one. Since classical PCA is not tailored to address concerns related to fairness, its application to actual problems may lead to disparity in the reconstruction errors of different groups (e.g., men and women, whites and blacks, etc.), with potentially harmful consequences such as the introduction of bias towards sensitive groups. Although several fair versions of PCA have been proposed recently, there still remains a fundamental gap in the search for algorithms that are simple enough to be deployed in real systems. To address this, we propose a novel PCA algorithm which tackles fairness issues by means of a simple strategy comprising a one-dimensional search which exploits the closed-form solution of PCA. As attested by numerical experiments, the proposal can significantly improve fairness with a very small loss in the overall reconstruction error and without resorting to complex optimization schemes. Moreover, our findings are consistent in several real situations as well as in scenarios with both unbalanced and balanced datasets.
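
按摘要所述,该方法的核心是“利用PCA闭式解的一维搜索”。下面给出一种与此精神一致的示意实现:在两组协方差矩阵的凸组合上做特征分解,并在一维权重上搜索使两组重构误差差距最小的投影(凸组合这一具体形式是这里的假设,未必与论文完全相同):

    import numpy as np

    def recon_error(X, W):
        Z = X @ W @ W.T                      # 投影到子空间后再重构
        return np.mean(np.sum((X - Z) ** 2, axis=1))

    def fair_pca(X_a, X_b, k, grid=np.linspace(0.0, 1.0, 101)):
        C_a, C_b = np.cov(X_a.T), np.cov(X_b.T)
        best = None
        for a in grid:                       # 一维搜索
            vals, vecs = np.linalg.eigh(a * C_a + (1 - a) * C_b)  # PCA闭式解
            W = vecs[:, np.argsort(vals)[::-1][:k]]
            gap = abs(recon_error(X_a, W) - recon_error(X_b, W))
            if best is None or gap < best[0]:
                best = (gap, W)
        return best[1]                       # 两组重构误差最均衡的投影矩阵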


检测相关(3篇)

【1】 Discovering Transferable Forensic Features for CNN-generated Images  Detection
标题:用于CNN生成图像检测的可迁移取证特征发现
链接:https://arxiv.org/abs/2208.11342

作者:Keshigeyan Chandrasegaran,Ngoc-Trung Tran,Alexander Binder,Ngai-Man Cheung
机构: Singapore University of Technology and Design (SUTD),  Singapore Institute of Technology (SIT),  University of Oslo (UIO)
备注:ECCV 2022 Oral; 35 pages
摘要:随着神经图像合成方法的快速发展,视觉伪造日益在主流媒体中造成生存性难题。尽管检测此类伪造图像是图像取证领域的一个棘手问题,但最近的一类取证检测器(即通用检测器)能够令人惊讶地识别伪造图像,而不受生成器架构、损失函数、训练数据集和分辨率的影响。这一有趣的性质表明,通用检测器中可能存在可迁移的取证特征(T-FF)。在这项工作中,我们进行了首个分析性研究,以发现并理解通用检测器中的T-FF。我们的贡献有两方面:1)我们提出了一种新的取证特征相关性统计量(FF-RS),用于量化和发现通用检测器中的T-FF;2)我们的定性和定量研究揭示了一个意想不到的发现:颜色是通用检测器中的关键T-FF。代码和模型请见 https://keshik6.github.io/transferable-forensic-features/
摘要:Visual counterfeits are increasingly causing an existential conundrum in mainstream media with rapid evolution in neural image synthesis methods. Though detection of such counterfeits has been a taxing problem in the image forensics community, a recent class of forensic detectors -- universal detectors -- are able to surprisingly spot counterfeit images regardless of generator architectures, loss functions, training datasets, and resolutions. This intriguing property suggests the possible existence of transferable forensic features (T-FF) in universal detectors. In this work, we conduct the first analytical study to discover and understand T-FF in universal detectors. Our contributions are 2-fold: 1) We propose a novel forensic feature relevance statistic (FF-RS) to quantify and discover T-FF in universal detectors and, 2) Our qualitative and quantitative investigations uncover an unexpected finding: color is a critical T-FF in universal detectors. Code and models are available at https://keshik6.github.io/transferable-forensic-features/


【2】 Comparison of Object Detection Algorithms for Street-level Objects
标题:街道级目标检测算法的比较
链接:https://arxiv.org/abs/2208.11315

作者:Martinus Grady Naftali,Jason Sebastian Sulistyawan,Kelvin Julian
机构:School of Computer Science, Bina Nusantara University, Jakarta, Indonesia
备注:11 pages, 9 figures, 5 tables
摘要:针对街道级对象的目标检测可应用于各种用例,从汽车和交通检测到自动驾驶汽车系统。因此,寻找最佳的目标检测算法是有效应用目标检测的关键。许多目标检测算法已经发布,也有许多工作对目标检测算法进行了比较,但很少有人比较YOLOv5等最新算法,并主要关注街道级物体。本文比较了多种单阶段检测器算法:SSD MobileNetv2 FPN-lite 320x320、YOLOv3、YOLOv4、YOLOv5l和YOLOv5s,用于实时图像中的街道级目标检测。实验使用了包含3,169张图像的修改版Udacity自动驾驶汽车数据集。数据集被划分为训练集、验证集和测试集,然后使用重缩放、色调偏移和噪声对其进行预处理和增强。随后对每个算法进行训练和评估。实验结果表明,这些算法在推理时间以及查准率、查全率、F1-Score和平均精度均值(mAP)等指标上都取得了不错的效果。结果还表明,YOLOv5l在准确率方面优于其他算法,mAP@.5达到0.593;MobileNetv2 FPN-lite的推理时间最短,仅为3.20ms。还发现YOLOv5s是最高效的,它兼具YOLOv5l的准确率和几乎与MobileNetv2 FPN-lite一样快的速度。这表明多种算法都适用于街道级目标检测,并且有足够的可行性用于自动驾驶汽车。
摘要:Object detection for street-level objects can be applied to various use cases, from car and traffic detection to the self-driving car system. Therefore, finding the best object detection algorithm is essential to apply it effectively. Many object detection algorithms have been released, and many have compared object detection algorithms, but few have compared the latest algorithms, such as YOLOv5, primarily which focus on street-level objects. This paper compares various one-stage detector algorithms; SSD MobileNetv2 FPN-lite 320x320, YOLOv3, YOLOv4, YOLOv5l, and YOLOv5s for street-level object detection within real-time images. The experiment utilizes a modified Udacity Self Driving Car Dataset with 3,169 images. Dataset is split into train, validation, and test; Then, it is preprocessed and augmented using rescaling, hue shifting, and noise. Each algorithm is then trained and evaluated. Based on the experiments, the algorithms have produced decent results according to the inference time and the values of their precision, recall, F1-Score, and Mean Average Precision (mAP). The results also shows that YOLOv5l outperforms the other algorithms in terms of accuracy with a mAP@.5 of 0.593, MobileNetv2 FPN-lite has the fastest inference time among the others with only 3.20ms inference time. It is also found that YOLOv5s is the most efficient, with it having a YOLOv5l accuracy and a speed almost as quick as the MobileNetv2 FPN-lite. This shows that various algorithm are suitable for street-level object detection and viable enough to be used in self-driving car.


【3】 ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels
标题:ADMoE:基于混合专家的噪声标签异常检测
链接:https://arxiv.org/abs/2208.11290

作者:Yue Zhao,Guoqing Zheng,Subhabrata Mukherjee,Robert McCann,Ahmed Awadallah
机构:Carnegie Mellon University, Microsoft Research, Microsoft
摘要:现有的异常检测(AD)工作依赖于来自人类标注者的干净标签,而这些标签在实践中获取成本高昂。在这项工作中,我们提出了一种利用获取成本更低的弱/噪声标签(例如,由用于检测恶意软件的机器规则生成的风险分数)进行异常检测的方法。具体地说,我们提出了ADMoE,这是第一个让异常检测算法从噪声标签中学习的框架。简而言之,ADMoE利用专家混合(MoE)架构来鼓励从多个噪声源进行专门化且可扩展的学习。它通过共享大多数模型参数来捕获噪声标签之间的相似性,同时通过构建“专家”子网络来鼓励专门化。为了进一步榨取噪声标签中的信号,ADMoE将它们用作输入特征,以便于专家学习。在8个数据集(包括一个专有的企业安全数据集)上的大量结果证明了ADMoE的有效性:与不使用它相比,它可以带来高达34%的性能提升,并且在相同的网络参数和FLOPS下优于13个领先的基线。值得注意的是,ADMoE与模型无关,使任何基于神经网络的检测方法都能够处理噪声标签;我们展示了它在多层感知器(MLP)和领先的AD方法DeepSAD上的结果。
摘要:Existing works on anomaly detection (AD) rely on clean labels from human annotators that are expensive to acquire in practice. In this work, we propose a method to leverage weak/noisy labels (e.g., risk scores generated by machine rules for detecting malware) that are cheaper to obtain for anomaly detection. Specifically, we propose ADMoE, the first framework for anomaly detection algorithms to learn from noisy labels. In a nutshell, ADMoE leverages mixture-of-experts (MoE) architecture to encourage specialized and scalable learning from multiple noisy sources. It captures the similarities among noisy labels by sharing most model parameters, while encouraging specialization by building "expert" sub-networks. To further juice out the signals from noisy labels, ADMoE uses them as input features to facilitate expert learning. Extensive results on eight datasets (including a proprietary enterprise security dataset) demonstrate the effectiveness of ADMoE, where it brings up to 34% performance improvement over not using it. Also, it outperforms a total of 13 leading baselines with equivalent network parameters and FLOPS. Notably, ADMoE is model-agnostic to enable any neural network-based detection methods to handle noisy labels, where we showcase its results on both multiple-layer perceptron (MLP) and the leading AD method DeepSAD.
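
摘要中“共享主干 + 专家子网络 + 噪声标签作输入特征”的结构可以用如下极简PyTorch骨架示意(层数、维度等均为假设,非论文官方实现):

    import torch
    import torch.nn as nn

    class ADMoESketch(nn.Module):
        def __init__(self, d_in, n_sources, d_hid=64):
            super().__init__()
            # 共享主干:共享大部分参数以捕获各噪声源之间的相似性
            self.backbone = nn.Sequential(
                nn.Linear(d_in + n_sources, d_hid), nn.ReLU())
            # 每个噪声源一个"专家"头,鼓励按源专门化
            self.experts = nn.ModuleList(
                [nn.Linear(d_hid, 1) for _ in range(n_sources)])

        def forward(self, x, noisy_labels):
            # 噪声标签直接拼接进输入特征,帮助专家学习
            h = self.backbone(torch.cat([x, noisy_labels], dim=-1))
            return [torch.sigmoid(e(h)) for e in self.experts]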


分类|识别(1篇)

【1】 DCSF: Deep Convolutional Set Functions for Classification of  Asynchronous Time Series
标题:DCSF:用于异步时间序列分类的深度卷积集合函数
链接:https://arxiv.org/abs/2208.11374

作者:Vijaya Krishna Yalavarthi,Johannes Burchert,Lars Schmidt-Thieme
机构:University of Hildesheim, Germany
摘要:异步时间序列是一种多变量时间序列,其中所有通道都是异步且相互独立地被观测的,这使得时间序列在对齐后非常稀疏。我们经常在具有复杂观测过程的应用中观察到这种效应,例如医疗保健、气候科学和天文学等。由于异步特性,它们对深度学习架构提出了重大挑战,因为深度学习架构假定呈现给它们的时间序列是规则采样、完全观测且在时间上对齐的。本文提出了一种新的异步时间序列分类框架:深度卷积集合函数(Deep Convolutional Set Functions,DCSF),该框架具有很高的可扩展性和存储效率。借助深度集合学习架构的最新进展,我们引入了一种对时间序列通道呈现顺序保持不变的模型。为了对集合元素进行编码,我们探索了卷积神经网络;在与之密切相关的问题,即规则采样且完全观测的时间序列分类上,卷积神经网络已经得到了充分研究。我们评估了DCSF在AsTS分类和在线(每个时间点)AsTS分类上的表现。在多个真实数据集和合成数据集上的大量实验验证了该模型在准确率和运行时间方面明显优于一系列最先进的模型。
摘要:Asynchronous Time Series is a multivariate time series where all the channels are observed asynchronously-independently, making the time series extremely sparse when aligning them. We often observe this effect in applications with complex observation processes, such as health care, climate science, and astronomy, to name a few. Because of the asynchronous nature, they pose a significant challenge to deep learning architectures, which presume that the time series presented to them are regularly sampled, fully observed, and aligned with respect to time. This paper proposes a novel framework, that we call Deep Convolutional Set Functions (DCSF), which is highly scalable and memory efficient, for the asynchronous time series classification task. With the recent advancements in deep set learning architectures, we introduce a model that is invariant to the order in which time series' channels are presented to it. We explore convolutional neural networks, which are well researched for the closely related problem-classification of regularly sampled and fully observed time series, for encoding the set elements. We evaluate DCSF for AsTS classification, and online (per time point) AsTS classification. Our extensive experiments on multiple real-world and synthetic datasets verify that the suggested model performs substantially better than a range of state-of-the-art models in terms of accuracy and run time.
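
集合函数式编码的要点是:把每次观测表示为(时间, 通道, 取值)三元组,用共享编码器逐元素编码后做对称池化,从而对通道呈现顺序不变。下面是一个极简示意(维度与层数为假设,非论文官方实现):

    import torch
    import torch.nn as nn

    class DCSFSketch(nn.Module):
        def __init__(self, n_channels, d_hid=64, n_classes=2):
            super().__init__()
            d_in = 1 + n_channels + 1        # 时间 + 通道one-hot + 观测值
            self.encoder = nn.Sequential(    # kernel_size=1:逐观测共享编码
                nn.Conv1d(d_in, d_hid, kernel_size=1), nn.ReLU(),
                nn.Conv1d(d_hid, d_hid, kernel_size=1), nn.ReLU())
            self.head = nn.Linear(d_hid, n_classes)

        def forward(self, obs):              # obs: (batch, d_in, 观测数)
            h = self.encoder(obs)
            return self.head(h.max(dim=-1).values)  # max池化:顺序不变聚合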


优化|敛散性(4篇)

【1】 A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate  Compression for Split DNN Computing
标题:一种用于分裂DNN计算的低复杂度率失真优化变码率压缩方法
链接:https://arxiv.org/abs/2208.11596

作者:Parual Datta,Nilesh Ahuja,V. Srinivasa Somayazulu,Omesh Tickoo
机构:Intel Labs, Bangalore, India, Santa Clara, USA, Hillsboro, USA
备注:ICPR 2022
摘要:拆分计算已成为实现基于DNN的AI工作负载的新范式:DNN模型被拆分为两部分,一部分在移动/客户端设备上执行,另一部分在边缘服务器(或云)上执行。对需要传输的DNN中间张量进行数据压缩,以应对优化“速率-精度-复杂度”权衡的挑战。现有的拆分计算方法采用基于ML的数据压缩,但需要针对不同的压缩级别重新训练整个DNN模型或其相当一部分参数。这会带来很高的计算和存储负担:从头训练完整的DNN模型计算开销大,维护DNN参数的多个副本增加了存储需求,并且在推理期间切换整套权重会增加内存带宽压力。在本文中,我们提出了一种解决所有这些挑战的方法:系统地设计和训练可插入分割点的瓶颈单元(简单、低成本的神经网络)。该方法在训练和推理阶段都非常轻量且高效,与现有方法相比,仅用很小一部分计算和存储开销就获得了优异的率失真性能。
摘要:Split computing has emerged as a recent paradigm for implementation of DNN-based AI workloads, wherein a DNN model is split into two parts, one of which is executed on a mobile/client device and the other on an edge-server (or cloud). Data compression is applied to the intermediate tensor from the DNN that needs to be transmitted, addressing the challenge of optimizing the rate-accuracy-complexity trade-off. Existing split-computing approaches adopt ML-based data compression, but require that the parameters of either the entire DNN model, or a significant portion of it, be retrained for different compression levels. This incurs a high computational and storage burden: training a full DNN model from scratch is computationally demanding, maintaining multiple copies of the DNN parameters increases storage requirements, and switching the full set of weights during inference increases memory bandwidth. In this paper, we present an approach that addresses all these challenges. It involves the systematic design and training of bottleneck units - simple, low-cost neural networks - that can be inserted at the point of split. Our approach is remarkably lightweight, both during training and inference, highly effective and achieves excellent rate-distortion performance at a small fraction of the compute and storage overhead compared to existing methods.
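
“在分割点插入瓶颈单元”的做法可以用几行代码示意:编码端在设备侧压缩中间张量,解码端在服务器侧还原(通道数、压缩比均为假设):

    import torch.nn as nn

    class Bottleneck(nn.Module):
        def __init__(self, c_in, c_code):
            super().__init__()
            self.encode = nn.Conv2d(c_in, c_code, kernel_size=1)  # 设备端:降维
            self.decode = nn.Conv2d(c_code, c_in, kernel_size=1)  # 服务器端:还原

    # 用法示意:head 在移动设备上,tail 在边缘服务器上
    # z = bottleneck.encode(head(x))   # 仅传输压缩后的中间张量 z
    # y = tail(bottleneck.decode(z))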


【2】 Optimal Brain Compression: A Framework for Accurate Post-Training  Quantization and Pruning
标题:最佳脑压缩:一种精确的训练后量化和剪枝框架
链接:https://arxiv.org/abs/2208.11580

作者:Elias Frantar,Dan Alistarh
机构:IST Austria & Neural Magic
摘要:研究了深度神经网络(DNN)在训练后设置下具有挑战性的模型压缩问题:给定一个精确的已训练模型,必须仅基于少量校准输入数据、在不进行任何重训练的情况下对其进行压缩。鉴于新兴的软硬件支持以加速方式执行经剪枝和/或量化压缩的模型,该问题已变得流行,并且针对这两种压缩方法已分别提出了性能良好的解决方案。本文提出了一种新的压缩框架,在统一的设置下同时涵盖权重剪枝和量化,在时间和空间上都很高效,并大幅改善了现有训练后方法的实际性能。在技术层面上,我们的方法基于对[LeCun, Denker, and Solla, 1990]的经典最优脑外科医生(OBS)框架在现代DNN规模上的首次精确且高效的实现,并进一步扩展以涵盖权重量化。这得益于一系列可能具有独立价值的算法进展。从实践角度看,实验结果表明,该方法能够显著改善现有训练后方法的压缩-精度权衡,甚至能够在训练后场景中实现剪枝和量化的精确联合应用。
摘要:We consider the problem of model compression for deep neural networks (DNNs) in the challenging post-training setting, in which we are given an accurate trained model, and must compress it without any retraining, based only on a small amount of calibration input data. This problem has become popular in view of the emerging software and hardware support for executing models compressed via pruning and/or quantization with speedup, and well-performing solutions have been proposed independently for both compression approaches. In this paper, we introduce a new compression framework which covers both weight pruning and quantization in a unified setting, is time- and space-efficient, and considerably improves upon the practical performance of existing post-training methods. At the technical level, our approach is based on the first exact and efficient realization of the classical Optimal Brain Surgeon (OBS) framework of [LeCun, Denker, and Solla, 1990] at the scale of modern DNNs, which we further extend to cover weight quantization. This is enabled by a series of algorithmic developments which may be of independent interest. From the practical perspective, our experimental results show that it can improve significantly upon the compression-accuracy trade-offs of existing post-training methods, and that it can even enable the accurate joint application of both pruning and quantization in a post-training setting.
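
作为背景,经典OBS框架中剪除权重 w_q 的显著性(损失增量)及对其余权重的补偿更新为如下标准公式(H 为损失关于权重的Hessian,e_q 为第 q 个单位向量;这是1990年的经典结果,非本文的新推导):

    L_q=\frac{w_q^{2}}{2\,[\mathbf{H}^{-1}]_{qq}},\qquad
    \delta\mathbf{w}=-\,\frac{w_q}{[\mathbf{H}^{-1}]_{qq}}\,\mathbf{H}^{-1}\mathbf{e}_q

本文的贡献正是在现代DNN的规模上精确而高效地实现这一框架,并将其推广到权重量化。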


【3】 Multi-objective optimization of actuation waveform for high-precision  drop-on-demand inkjet printing
标题:高精度按需喷墨打印驱动波形的多目标优化
链接:https://arxiv.org/abs/2208.11301

作者:Hanzhi Wang,Yosuke Hasegawa
机构:Department of Mechanical Engineering, The University of Tokyo, Komaba, Meguro-ku, Tokyo, Japan; Institute of Industrial Science, The University of Tokyo, Komaba, Meguro-ku
备注:The following article has been submitted to Physics of Fluids
摘要:按需喷墨(DOD)技术被认为是制备先进功能材料的一种有前途的技术。对于DOD打印机,用于实现无卫星的较小液滴的高精度分配技术长期以来一直是图案化薄膜结构所需要的。本研究将位于分配喷嘴上游的液体腔室的入口速度作为控制变量,并旨在使用样本有效的贝叶斯优化算法来优化其波形。首先,利用开源的OpenFOAM求解器interFoam对液滴分配动力学进行数值模拟,并将结果传递给基于pyFoam的另一个代码。然后,通过贝叶斯优化(BO)算法来确定表征驱动DOD打印机的致动波形的参数,以最大化表示为两个因素之和的规定的多目标函数,即,主要液滴的尺寸和卫星液滴的存在。实验结果表明,该算法在150次模拟内就能成功地找到高精度的点胶波形。具体地,通过应用最佳波形,可以有效地消除卫星液滴,并且液滴直径可以显著地减小到喷嘴直径的24.9%。
摘要:Drop-on-demand (DOD) inkjet printing has been considered as one of promising technologies for the fabrication of advanced functional materials. For a DOD printer, high-precision dispensing techniques for achieving satellite-free smaller droplets, have long been desired for patterning thin-film structures. The present study considers the inlet velocity of a liquid chamber located upstream of a dispensing nozzle as a control variable and aims to optimize its waveform using a sample-efficient Bayesian optimization algorithm. Firstly, the droplet dispensing dynamics are numerically reproduced by using an open-source OpenFOAM solver, interFoam, and the results are passed on to another code based on pyFoam. Then, the parameters characterizing the actuation waveform driving a DOD printer are determined by the Bayesian optimization (BO) algorithm so as to maximize a prescribed multi-objective function expressed as the sum of two factors, i.e., the size of a primary droplet and the presence of satellite droplets. The results show that the present BO algorithm can successfully find high-precision dispensing waveforms within 150 simulations. Specifically, satellite droplets can be effectively eliminated and the droplet diameter can be significantly reduced to 24.9% of the nozzle diameter by applying the optimal waveform.
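
文中的优化流程可以用一个通用的贝叶斯优化循环来示意(以scikit-optimize为例;dispense_quality 为占位目标函数,真实实现应调用OpenFOAM/interFoam仿真并返回待最小化的代价,参数个数与范围均为假设):

    from skopt import gp_minimize

    def dispense_quality(waveform_params):
        # 占位:应运行CFD仿真,返回"主液滴尺寸 + 卫星液滴惩罚"的代价
        return sum(p ** 2 for p in waveform_params)

    result = gp_minimize(dispense_quality,
                         dimensions=[(-1.0, 1.0)] * 4,  # 假设4个波形参数
                         n_calls=150)                   # 与文中150次仿真对应
    print(result.x, result.fun)                         # 最优波形参数与代价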


【4】 Sparse Polynomial Optimization: Theory and Practice
标题:稀疏多项式优化:理论与实践
链接:https://arxiv.org/abs/2208.11158

作者:Victor Magron,Jie Wang
备注:220 pages, to appear in Series on Optimization and Its Applications, World Scientific Press
摘要:在一组多项式不等式上最小化一个多项式是一个NP难的非凸问题。得益于实代数几何的强大结果,人们可以将这个问题转化为一列嵌套的有限维凸问题。在相应层次的每一步,需要求解一个固定规模的半定规划,而后者可以用高效的数值工具求解。然而在实践中“没有免费的午餐”,此类优化方法通常面临严重的可扩展性问题。幸运的是,对于许多应用,我们可以“直面问题”,利用描述问题的成本和约束中固有的数据结构,例如稀疏性或对称性。本书介绍了应对这一具有重要计算意义的科学挑战的若干研究工作,并发展了至少在某些已识别的问题类别上、在计算复杂度方面具有良好扩展性的替代优化方案。本书提出的算法框架主要利用输入数据的稀疏结构来求解大规模多项式优化问题。我们针对无约束和有约束问题给出了利用稀疏性的层次化松弛。与稠密层次相比,它们在实践中能更快地逼近解,同时具有相同的理论收敛保证。我们的框架并不局限于“静态”多项式优化:我们还展示了动力系统分析中所关注的量的近似层次,并给出了涉及非交换变量(例如任意大小的矩阵或量子物理算符)问题的各种扩展。
摘要:The problem of minimizing a polynomial over a set of polynomial inequalities is an NP-hard non-convex problem. Thanks to powerful results from real algebraic geometry, one can convert this problem into a nested sequence of finite-dimensional convex problems. At each step of the associated hierarchy, one needs to solve a fixed size semidefinite program, which can be in turn solved with efficient numerical tools. On the practical side however, there is "no free lunch" and such optimization methods usually encompass severe scalability issues. Fortunately, for many applications, we can "look at the problem in the eyes" and exploit the inherent data structure arising from the cost and constraints describing the problem, for instance sparsity or symmetries.  This book presents several research efforts to tackle this scientific challenge with important computational implications, and provides the development of alternative optimization schemes that scale well in terms of computational complexity, at least in some identified class of problems. The presented algorithmic framework in this book mainly exploits the sparsity structure of the input data to solve large-scale polynomial optimization problems. We present sparsity-exploiting hierarchies of relaxations, for either unconstrained or constrained problems. By contrast with the dense hierarchies, they provide faster approximation of the solution in practice but also come with the same theoretical convergence guarantees. Our framework is not restricted to "static" polynomial optimization, and we expose hierarchies of approximations for values of interest arising from the analysis of dynamical systems. We also present various extensions to problems involving noncommuting variables, e.g., matrices of arbitrary size or quantum physic operators.


预测|估计(4篇)

【1】 Collaborative Algorithms for Online Personalized Mean Estimation
标题:在线个性化均值估计的协作算法
链接:https://arxiv.org/abs/2208.11530

作者:Mahsa Asadi,Aurélien Bellet,Odalric-Ambrym Maillard,Marc Tommasi
机构:Univ. Lille, Inria, CNRS, Centrale Lille, UMR CRIStAL, Lille, France
摘要:我们考虑一个涉及一组代理的在线估计问题。每个代理都可以访问一个(个人)过程,该过程从某个实值分布中生成样本,而代理试图估计该分布的均值。我们研究其中一些分布具有相同均值、且允许代理主动向其他代理查询信息的情形。目标是设计一种算法,使每个代理能够通过与其他代理通信来改进自己的均值估计。各均值以及具有相同均值的分布数目都是未知的,这使得任务并不平凡。本文提出了一种新的协作策略来解决这一在线个性化均值估计问题。我们分析了它的时间复杂度,并引入了在数值实验中表现良好的变体。我们还将方法扩展到如下设置:具有相近均值的代理簇试图估计其所在簇的均值。
摘要:We consider an online estimation problem involving a set of agents. Each agent has access to a (personal) process that generates samples from a real-valued distribution and seeks to estimate its mean. We study the case where some of the distributions have the same mean, and the agents are allowed to actively query information from other agents. The goal is to design an algorithm that enables each agent to improve its mean estimate thanks to communication with other agents. The means as well as the number of distributions with same mean are unknown, which makes the task nontrivial. We introduce a novel collaborative strategy to solve this online personalized mean estimation problem. We analyze its time complexity and introduce variants that enjoy good performance in numerical experiments. We also extend our approach to the setting where clusters of agents with similar means seek to estimate the mean of their cluster.


【2】 Inter- and Intra-Series Embeddings Fusion Network for Epidemiological  Forecasting
标题:用于流行病预测的序列间和序列内嵌入融合网络
链接:https://arxiv.org/abs/2208.11515

作者:Feng Xie,Zhong Zhang,Xuechen Zhao,Bin Zhou,Yusong Tan
机构:National University of Defense Technology, Changsha, China
备注:6 pages, 5 figures, SEKE2022
摘要:传染病疫情的准确预测是有效控制一个地区疫情的关键。大多数现有方法忽略了区域之间潜在的动态依赖性或区域之间的时间依赖性和相互依赖性对于预测的重要性。本文提出了一种序列间和序列内嵌入融合网络(SEFNet)来提高疫情预测性能。SEFNet由两个并行模块组成,分别称为序列间嵌入模块和序列内嵌入模块。在序列间嵌入模块中,提出了一个多尺度统一卷积组件——区域感知卷积,该组件与自注意协同捕获多区域时间序列之间的动态依赖关系。序列内嵌入模块使用长短时记忆来捕获每个时间序列内的时间关系。然后,利用参数矩阵融合的方法,学习两个嵌入的影响程度,并进行融合。为了进一步提高鲁棒性,SEFNet还将传统的自回归组件与非线性神经网络并行集成。在4个真实世界流行病数据集上的实验表明,SEFNet是有效的,并且优于现有的基线。
摘要 :The accurate forecasting of infectious epidemic diseases is the key to effective control of the epidemic situation in a region. Most existing methods ignore potential dynamic dependencies between regions or the importance of temporal dependencies and inter-dependencies between regions for prediction. In this paper, we propose an Inter- and Intra-Series Embeddings Fusion Network (SEFNet) to improve epidemic prediction performance. SEFNet consists of two parallel modules, named Inter-Series Embedding Module and Intra-Series Embedding Module. In Inter-Series Embedding Module, a multi-scale unified convolution component called Region-Aware Convolution is proposed, which cooperates with self-attention to capture dynamic dependencies between time series obtained from multiple regions. The Intra-Series Embedding Module uses Long Short-Term Memory to capture temporal relationships within each time series. Subsequently, we learn the influence degree of two embeddings and fuse them with the parametric-matrix fusion method. To further improve the robustness, SEFNet also integrates a traditional autoregressive component in parallel with nonlinear neural networks. Experiments on four real-world epidemic-related datasets show SEFNet is effective and outperforms state-of-the-art baselines.


【3】 Robot Motion Planning as Video Prediction: A Spatio-Temporal Neural  Network-based Motion Planner
标题:作为视频预测的机器人运动规划:一种基于时空神经网络的运动规划器
链接:https://arxiv.org/abs/2208.11287

作者:Xiao Zang,Miao Yin,Lingyi Huang,Jingjin Yu,Saman Zonouz,Bo Yuan
机构:Department of Electrical and Computer Engineering, Department of Computer Science, Rutgers University
备注:Accepted in IROS 2022
摘要:基于神经网络的机器人运动规划方法凭借神经网络模型强大的学习能力和固有的高度并行性,成为一种有吸引力的方法。尽管目前在这一方向上已有进展,但以直接且同时的方式高效捕获和处理重要的序列信息与空间信息,仍然相对缺乏探索。为了克服这一挑战并充分发挥神经网络在运动规划任务中的潜力,本文提出了一种端到端的学习框架STP-Net,它能够充分提取和利用重要的时空信息,构成一个高效的神经运动规划器。通过将机器人的运动解释为视频剪辑,机器人运动规划被转化为一个视频预测任务,而该任务可以由STP-Net以空间和时间上都高效的方式完成。在不同的已见和未见环境中的实证评估表明,在准确率(即成功率)接近100%的情况下,STP-Net在规划速度和路径成本方面都表现出非常有前景的性能。与现有的基于神经网络的运动规划器相比,STP-Net在2D随机森林、2D迷宫和3D随机森林环境中分别实现了至少5倍、2.6倍和1.8倍的提速,并具有更低的路径成本。此外,在多机器人运动规划任务中,STP-Net可以快速地同时计算多条近似最优路径。
摘要:Neural network (NN)-based methods have emerged as an attractive approach for robot motion planning due to strong learning capabilities of NN models and their inherently high parallelism. Despite the current development in this direction, the efficient capture and processing of important sequential and spatial information, in a direct and simultaneous way, is still relatively under-explored. To overcome the challenge and unlock the potentials of neural networks for motion planning tasks, in this paper, we propose STP-Net, an end-to-end learning framework that can fully extract and leverage important spatio-temporal information to form an efficient neural motion planner. By interpreting the movement of the robot as a video clip, robot motion planning is transformed to a video prediction task that can be performed by STP-Net in both spatially and temporally efficient ways. Empirical evaluations across different seen and unseen environments show that, with nearly 100% accuracy (aka, success rate), STP-Net demonstrates very promising performance with respect to both planning speed and path cost. Compared with existing NN-based motion planners, STP-Net achieves at least 5x, 2.6x and 1.8x faster speed with lower path cost on 2D Random Forest, 2D Maze and 3D Random Forest environments, respectively. Furthermore, STP-Net can quickly and simultaneously compute multiple near-optimal paths in multi-robot motion planning tasks


【4】 Secondary Protein Structure Prediction Using Neural Networks
标题:基于神经网络的蛋白质二级结构预测
链接:https://arxiv.org/abs/2208.11248

作者:Sidharth Malhotra,Robin Walters
机构:Department of Computer Science, Northeastern University, Huntington Ave., Boston, MA; Department of Mathematics
摘要:在本文中,我们尝试使用神经网络结构,仅从蛋白质的一级结构(氨基酸序列)预测蛋白质的二级结构(α螺旋位置)。我们实现了一个全连接神经网络(FCNN),并使用该FCNN进行了三个实验。首先,我们对在小鼠和人类数据集上训练和测试的模型进行了跨物种比较。其次,我们测试了改变输入模型的蛋白质序列长度的影响。第三,我们比较了设计为聚焦于输入窗口中心的自定义误差函数。在论文最后,我们提出了一个可应用于该问题的替代方案:递归神经网络模型。
摘要:In this paper we experiment with using neural network structures to predict a protein's secondary structure (α helix positions) from only its primary structure (amino acid sequence). We implement a fully connected neural network (FCNN) and perform three experiments using that FCNN. Firstly, we do a cross-species comparison of models trained and tested on mouse and human datasets. Secondly, we test the impact of varying the length of protein sequence we input into the model. Thirdly, we compare custom error functions designed to focus on the center of the input window. At the end of the paper we propose an alternative, recurrent neural network model which can be applied to the problem.
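
从氨基酸序列预测中心残基是否处于α螺旋,常见做法是对序列取滑动窗口、做one-hot编码后送入全连接网络。下面是一个示意(窗口大小、网络结构均为假设,非论文原始配置):

    import torch
    import torch.nn as nn

    AA = "ACDEFGHIKLMNPQRSTVWY"              # 20种标准氨基酸

    def one_hot_window(seq, center, w=7):
        # 以center为中心取长度2w+1的窗口并one-hot编码(越界补零)
        x = torch.zeros(2 * w + 1, len(AA))
        for i, pos in enumerate(range(center - w, center + w + 1)):
            if 0 <= pos < len(seq) and seq[pos] in AA:
                x[i, AA.index(seq[pos])] = 1.0
        return x.flatten()

    # FCNN:窗口one-hot -> 中心残基是否为α螺旋(二分类)
    fcnn = nn.Sequential(nn.Linear(15 * 20, 128), nn.ReLU(), nn.Linear(128, 2))
    logits = fcnn(one_hot_window("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", center=10))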


其他神经网络|深度学习|模型|建模(13篇)

【1】 Towards Sparsified Federated Neuroimaging Models via Weight Pruning
标题:基于权值剪枝的稀疏联合神经成像模型
链接:https://arxiv.org/abs/2208.11669

作者:Dimitris Stripelis,Umang Gupta,Nikhil Dhinagar,Greg Ver Steeg,Paul Thompson,José Luis Ambite
机构:Information Sciences Institute, University of Southern California, CA, USA; Imaging Genetics Center, Stevens Neuroimaging and Informatics Institute
备注:Accepted to 3rd MICCAI Workshop on Distributed, Collaborative and Federated Learning (DeCaF, 2022)
摘要:大型深度神经网络的联邦训练常常受到限制,因为随着模型规模增大,传递更新的通信成本也随之增加。为减少推理时间,人们在集中式设置中设计了各种模型剪枝技术。将集中式剪枝技术与联邦训练相结合(在通信步骤之前剪枝模型参数)对于降低通信成本是很直观的做法;此外,这种训练期间的渐进式模型剪枝还可以减少训练时间/成本。为此,我们提出了FedSparsify,它在联邦训练期间执行模型剪枝。在集中式和联邦式设置下的大脑年龄预测任务(根据大脑MRI估计一个人的年龄)实验中,我们证明了即使在数据分布高度异构、极具挑战性的联邦学习环境中,模型也可以被剪枝到95%的稀疏度而不影响性能。模型剪枝的一个令人惊讶的好处是改善了模型隐私:我们证明了高稀疏度的模型更不易受到成员推理攻击(一种隐私攻击)。
摘要:Federated training of large deep neural networks can often be restrictive due to the increasing costs of communicating the updates with increasing model sizes. Various model pruning techniques have been designed in centralized settings to reduce inference times. Combining centralized pruning techniques with federated training seems intuitive for reducing communication costs -- by pruning the model parameters right before the communication step. Moreover, such a progressive model pruning approach during training can also reduce training times/costs. To this end, we propose FedSparsify, which performs model pruning during federated training. In our experiments in centralized and federated settings on the brain age prediction task (estimating a person's age from their brain MRI), we demonstrate that models can be pruned up to 95% sparsity without affecting performance even in challenging federated learning environments with highly heterogeneous data distributions. One surprising benefit of model pruning is improved model privacy. We demonstrate that models with high sparsity are less susceptible to membership inference attacks, a type of privacy attack.
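
“在通信步骤之前按幅值剪枝”可示意如下(全局幅值阈值这一具体做法与渐进式稀疏度调度均为假设,非论文官方实现):

    import torch

    @torch.no_grad()
    def prune_by_magnitude(model, sparsity=0.95):
        # 将绝对值最小的sparsity比例的权重置零,再进行通信
        weights = torch.cat([p.abs().flatten() for p in model.parameters()])
        k = int(sparsity * weights.numel())
        if k < 1:
            return
        threshold = weights.kthvalue(k).values
        for p in model.parameters():
            p.mul_((p.abs() > threshold).float())  # 低于阈值的参数置零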


【2】 Constraint-driven multi-task learning
标题:约束驱动的多任务学习
链接:https://arxiv.org/abs/2208.11656

作者:Bogdan Cretu,Andrew Cropper
机构:University of Oxford
备注:4th year undergraduate project at the University of Oxford
摘要:归纳逻辑编程是一种基于数学逻辑的机器学习形式,它从给定的例子和背景知识生成逻辑程序。   在这个项目中,我们扩展了Popper ILP系统,以利用多任务学习。我们实现了最先进的方法和几个新的策略来提高搜索性能。此外,我们还引入了约束保持,这是一种提高所有方法整体性能的技术。   约束保留允许系统在背景知识集的更新之间传递知识。因此,我们减少了系统执行的重复工作量。此外,约束保留允许我们从当前最先进的迭代深化搜索方法过渡到更有效的广度优先搜索方法。   最后,我们对课程学习技术进行了实验,并展示了它们对该领域的潜在好处。
摘要 :Inductive logic programming is a form of machine learning based on mathematical logic that generates logic programs from given examples and background knowledge.  In this project, we extend the Popper ILP system to make use of multi-task learning. We implement the state-of-the-art approach and several new strategies to improve search performance. Furthermore, we introduce constraint preservation, a technique that improves overall performance for all approaches.  Constraint preservation allows the system to transfer knowledge between updates on the background knowledge set. Consequently, we reduce the amount of repeated work performed by the system. Additionally, constraint preservation allows us to transition from the current state-of-the-art iterative deepening search approach to a more efficient breadth first search approach.  Finally, we experiment with curriculum learning techniques and show their potential benefit to the field.


【3】 On a Built-in Conflict between Deep Learning and Systematic  Generalization
标题:论深度学习与系统泛化的内在冲突
链接:https://arxiv.org/abs/2208.11633

作者:Yuanpeng Li
摘要:本文假设,内部功能共享是削弱分类任务深度学习中分布外(o.o.d.)泛化或系统泛化的原因之一。在等价预测下,模型将输入空间划分为由边界分隔的多个部分。功能共享倾向于重用边界,导致对应新输出的部分更少,这与系统泛化相冲突。我们在标准的深度学习模型中展示了这些现象,例如全连接网络、卷积网络、残差网络、LSTM和(Vision)Transformer。我们希望本研究能为系统泛化提供新的见解,并为新的研究方向奠定基础。
摘要:In this paper, we hypothesize that internal function sharing is one of the reasons to weaken o.o.d. or systematic generalization in deep learning for classification tasks. Under equivalent prediction, a model partitions an input space into multiple parts separated by boundaries. The function sharing prefers to reuse boundaries, leading to fewer parts for new outputs, which conflicts with systematic generalization. We show such phenomena in standard deep learning models, such as fully connected, convolutional, residual networks, LSTMs, and (Vision) Transformers. We hope this study provides novel insights into systematic generalization and forms a basis for new research directions.


【4】 A methodology for identifying resiliency in renewable electrical  distribution system using complex network
标题:基于复杂网络的可再生配电系统弹性识别方法
链接:https://arxiv.org/abs/2208.11543

作者:Divyanshi Dwivedi,Pradeep Kumar Yemula,Mayukha Pal
机构:ABB Ability Innovation Center, Asea Brown Boveri Company, Hyderabad, India; Department of Electrical Engineering, Indian Institute of Technology Hyderabad, Kandi, Sangareddy, Telangana, India
摘要:近来,为满足能量需求,配电系统中大量接入了分布式能源(DER),并且人们普遍认为这增强了系统弹性。然而,由于间歇性可用、天气条件多变、引入非线性、复杂性增加等各种因素,它也可能不利于电网运行。这就需要对系统弹性进行详细的理解,而这正是我们的方法所要提供的。本文介绍了一种利用复杂网络理论的方法,用于确定配电系统在各种非理想配置下接入太阳能光伏发电时的弹性。我们得到了不同条件下的复杂相关网络,并计算各种网络参数来识别这些网络的弹性。所提出的方法在不同的不利条件下保持弹性的同时,识别出系统对太阳能电池板的承载能力,从而有助于获得系统中太阳能电池板的最优布置拓扑。该方法还能识别出对变化高度敏感、可能使系统进入非弹性状态的关键节点。该框架在IEEE-123测试馈线系统上进行了演示,使用GridLAB-D生成时间序列数据,并使用复杂网络和机器学习模型进行了多种分析。
摘要:Recently, Electrical Distribution Systems are extensively penetrated with the Distributed Energy Resources (DERs) to cater the energy demands with general perception that it enhances the system resiliency. However, it may be adverse for the grid operation due to various factors like its intermittent availability, dynamics in weather condition, introduction of nonlinearity, complexity etc. This needs a detailed understanding of system resiliency that our method proposes here. We introduce a methodology using complex network theory to identify the resiliency of distribution system when incorporated with Solar PV generation under various undesirable configurations. Complex correlated networks for different conditions were obtained and various network parameters were computed for identifying the resiliency of those networks. The proposed methodology identifies the hosting capacity of solar panels in the system while maintaining the resiliency under different unwanted conditions hence helps to obtain an optimal allocation topology for solar panels in the system. The proposed method also identifies the critical nodes that are highly sensitive to the changes and could drive the system into non-resiliency. This framework was demonstrated on IEEE-123 Test Feeder system with time-series data generated using GridLAB-D and variety of analysis were performed using complex network and machine learning models.
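
从节点时间序列构建相关网络并计算网络参数的流程,可用networkx示意(数据、相关阈值与所选指标均为占位/假设):

    import numpy as np
    import networkx as nx

    series = np.random.rand(123, 1000)       # 占位:各节点的量测时间序列
    corr = np.corrcoef(series)               # 节点间相关矩阵

    G = nx.Graph()
    G.add_nodes_from(range(corr.shape[0]))
    for i in range(corr.shape[0]):
        for j in range(i + 1, corr.shape[0]):
            if abs(corr[i, j]) > 0.6:        # 阈值为假设
                G.add_edge(i, j, weight=corr[i, j])

    # 用若干网络参数刻画弹性,并找出介数最高的关键节点
    print(nx.density(G), nx.average_clustering(G))
    bc = nx.betweenness_centrality(G)
    print(sorted(bc, key=bc.get, reverse=True)[:5])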


【5】 UniCon: Unidirectional Split Learning with Contrastive Loss for Visual  Question Answering
标题:UNICON:视觉问答的单向分裂学习和对比损失
链接:https://arxiv.org/abs/2208.11435

作者:Yuwei Sun,Hideya Ochiai
机构:University of Tokyo, RIKEN AIP
摘要:利用多模态数据的视觉问答(VQA)在家庭机器人和临床诊断等实际应用中引起了广泛的兴趣。然而,挑战之一是为不同的客户端任务设计稳健的学习。该工作旨在解决大规模训练数据的前提条件和客户端数据共享的保密性限制之间的差距。针对分布式数据孤岛上的VQA任务训练问题,提出了对比损失单向分割学习算法(UniCon).特别地,UniCon在不同客户端的整个数据分布上训练全局模型,经由对比学习来学习细化的跨模态表示。全局模型的学习表示聚合来自不同局部任务的知识。此外,我们还设计了一个单向的分裂学习框架,以实现更有效的知识共享。在VQA-v2数据集上使用五个最先进的VQA模型进行的综合实验证明了UniCon的有效性,在VQA-v2的验证集中实现了49.89%的准确率。本文首次在数据机密性约束下,利用自监督分裂学习方法对VQA进行了研究。
摘要:Visual question answering (VQA) that leverages multi-modality data has attracted intensive interest in real-life applications, such as home robots and clinic diagnoses. Nevertheless, one of the challenges is to design robust learning for different client tasks. This work aims to bridge the gap between the prerequisite of large-scale training data and the constraint of client data sharing mainly due to confidentiality. We propose the Unidirectional Split Learning with Contrastive Loss (UniCon) to tackle VQA tasks training on distributed data silos. In particular, UniCon trains a global model over the entire data distribution of different clients learning refined cross-modal representations via contrastive learning. The learned representations of the global model aggregate knowledge from different local tasks. Moreover, we devise a unidirectional split learning framework to enable more efficient knowledge sharing. The comprehensive experiments with five state-of-the-art VQA models on the VQA-v2 dataset demonstrated the efficacy of UniCon, achieving an accuracy of 49.89% in the validation set of VQA-v2. This work is the first study of VQA under the constraint of data confidentiality using self-supervised Split Learning.
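
跨模态对比学习部分可用标准的InfoNCE式损失示意:批内配对的图文互为正样本,其余为负样本(温度等超参为假设,非论文官方实现):

    import torch
    import torch.nn.functional as F

    def info_nce(img_emb, txt_emb, tau=0.07):
        img = F.normalize(img_emb, dim=-1)
        txt = F.normalize(txt_emb, dim=-1)
        logits = img @ txt.t() / tau          # (B, B) 相似度矩阵
        labels = torch.arange(img.size(0))    # 对角线为正确配对
        return (F.cross_entropy(logits, labels) +
                F.cross_entropy(logits.t(), labels)) / 2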


【6】 TESTSGD: Interpretable Testing of Neural Networks Against Subtle Group  Discrimination
标题:TESTSGD:针对细微群体歧视的神经网络可解释测试
链接:https://arxiv.org/abs/2208.11321

作者:Mengdi Zhang,Jun Sun,Jingyi Wang,Bing Sun
机构:Singapore Management University, Zhejiang University, China
摘要:歧视现象已经在许多机器学习应用中显现,这就要求在将机器学习应用于人脸识别、医疗诊断和刑事判决等伦理相关领域之前进行充分的公平性测试。现有的公平性测试方法大多是为识别个体歧视(即针对个人的歧视)而设计的。然而,作为另一种广受关注且大多隐蔽的歧视类型,针对群体歧视的测试研究得很少。为了填补这一空白,本文提出了TESTSGD,这是一种可解释的测试方法,能够系统地识别和度量神经网络中隐藏的(我们称之为“微妙的”)群体歧视,这种歧视由敏感特征组合上的条件所刻画。具体地说,给定一个神经网络,TESTSGD首先自动生成一个可解释的规则集,该规则集将输入空间划分为两组,从而暴露模型的群体歧视。同时,TESTSGD还通过对输入空间采样来度量所识别的微妙群体歧视的程度,给出估计的群体公平性得分,并保证该得分在一定误差范围内是准确的。我们在基于流行数据集(包括结构化数据和文本数据)训练的多个神经网络模型上评估了TESTSGD。实验结果表明,TESTSGD能够高效且有效地识别和度量此类此前从未被揭示的微妙群体歧视。此外,我们还表明,TESTSGD的测试结果可以指导生成新样本,通过重新训练来减轻这种歧视,而准确率下降可以忽略不计。
摘要:Discrimination has been shown in many machine learning applications, which calls for sufficient fairness testing before their deployment in ethic-relevant domains such as face recognition, medical diagnosis and criminal sentence. Existing fairness testing approaches are mostly designed for identifying individual discrimination, i.e., discrimination against individuals. Yet, as another widely concerning type of discrimination, testing against group discrimination, mostly hidden, is much less studied. To address the gap, in this work, we propose TESTSGD, an interpretable testing approach which systematically identifies and measures hidden (which we call 'subtle') group discrimination of a neural network characterized by conditions over combinations of the sensitive features. Specifically, given a neural network, TESTSGD first automatically generates an interpretable rule set which categorizes the input space into two groups exposing the model's group discrimination. Alongside, TESTSGD also provides an estimated group fairness score based on sampling the input space to measure the degree of the identified subtle group discrimination, which is guaranteed to be accurate up to an error bound. We evaluate TESTSGD on multiple neural network models trained on popular datasets including both structured data and text data. The experiment results show that TESTSGD is effective and efficient in identifying and measuring such subtle group discrimination that has never been revealed before. Furthermore, we show that the testing results of TESTSGD can guide generation of new samples to mitigate such discrimination through retraining with negligible accuracy drop.


【7】 Psychophysical Machine Learning
标题:心理物理机器学习
链接:https://arxiv.org/abs/2208.11236

作者:B. N. Kausik
摘要:心理物理学的韦伯-费希纳定律(Weber Fechner Law)指出,人类的感知与刺激强度呈对数关系。我们提出了一种将韦伯-费希纳定律融入机器学习损失函数的算法,并利用该算法提升深度学习网络的性能。
摘要:The Weber Fechner Law of psychophysics observes that human perception is logarithmic in the stimulus. We present an algorithm for incorporating the Weber Fechner law into loss functions for machine learning, and use the algorithm to enhance the performance of deep learning networks.
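
“感知与刺激呈对数关系”融入损失函数的一种可能形式,是在对数刺激空间中度量误差,如下示意(具体融入方式为假设,非论文原式):

    import torch

    def weber_fechner_mse(pred, target, eps=1e-6):
        # 在log空间计算MSE:对大刺激的绝对误差更宽容,符合对数感知
        return torch.mean((torch.log(pred.clamp_min(eps)) -
                           torch.log(target.clamp_min(eps))) ** 2)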


【8】 Preprocessing Source Code Comments for Linguistic Models
标题:对语言模型的源代码注释进行预处理
链接:https://arxiv.org/abs/2208.11235

作者:Sergey Matskevich,Colin Gordon
机构:Drexel University, USA
摘要:注释是源代码的重要组成部分,也是文档的主要来源。这就激发了人们对使用大量注释来训练或评估使用或生成注释的工具的兴趣——比如从注释中生成预言甚至代码,或者自动生成代码摘要。大多数这方面的工作对评论的结构和质量做出了强烈的假设,比如假设它们主要由正确的英语句子组成。但是,我们对这些用例的现有注释的实际质量知之甚少。注释通常包含在其他类型的文本中看不到的独特结构和元素,因此从中过滤或提取信息需要格外小心。本文研究了GitHub中840个最流行的开源项目和SriLab数据集中的8422个项目中Python注释的内容和质量,以及朴素过滤和深度过滤对使用现有注释来训练和评估生成注释的系统的影响。
摘要:Comments are an important part of the source code and are a primary source of documentation. This has driven interest in using large bodies of comments to train or evaluate tools that consume or produce them -- such as generating oracles or even code from comments, or automatically generating code summaries. Most of this work makes strong assumptions about the structure and quality of comments, such as assuming they consist mostly of proper English sentences. However, we know little about the actual quality of existing comments for these use cases. Comments often contain unique structures and elements that are not seen in other types of text, and filtering or extracting information from them requires some extra care. This paper explores the contents and quality of Python comments drawn from 840 most popular open source projects from GitHub and 8422 projects from SriLab dataset, and the impact that naïve vs. in-depth filtering can have on the use of existing comments for training and evaluation of systems that generate comments.
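
从Python源码中提取注释本身可以用标准库 tokenize 完成;摘要所说的“朴素过滤”大致相当于在此之上做简单清洗(示意;深入过滤需按文中方式处理注释中的特殊结构):

    import io
    import tokenize

    def extract_comments(source: str):
        out = []
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == tokenize.COMMENT:
                out.append(tok.string.lstrip("# ").rstrip())  # 朴素清洗
        return out

    print(extract_comments("x = 1  # set x\n# TODO: refactor\n"))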


【9】 Why Deep Learning's Performance Data Are Misleading
标题:为什么深度学习的性能数据具有误导性
链接:https://arxiv.org/abs/2208.11228

作者:Juyang Weng
机构:Brain-Mind Institute; GENISAMA LLC, Okemos, MI, USA
备注:6 pages, 2 figures
摘要:这是一篇理论论文,作为同一次会议上主题演讲的配套论文。   与有意识学习相比,人工智能领域的许多项目都采用了深度学习,其中许多项目似乎都能提供令人印象深刻的性能数据。   本文解释了由于两种可能的不当行为,此类性能数据可能被误导性地夸大:数据删除和训练集测试。本文阐明了什么是深度学习中的数据删除,什么是深度学习中的训练集测试,以及它们为什么是不当行为。定义了一种简单的分类方法,称为带阈值的最近邻(NNWT)。本文建立了一个定理,即只要测试集在作者手中,并且存储空间和训练时间都是有限的,但与许多深度学习方法一样是无界的,则NNWT方法在任何验证集和使用Post-Selections的任何测试集上都达到零误差。然而,像许多深度学习方法一样,NNWT方法的泛化能力很小。在许多深度学习项目中确实发生了不当行为的证据超出了本文的范围。如果没有一个透明的账户来说明选择后的自由度,深度学习数据就会产生误导。
摘要:This is a theoretical paper, as a companion paper of the keynote talk at the same conference.  In contrast to conscious learning, many projects in AI have employed deep learning many of which seem to give impressive performance data.  This paper explains that such performance data are probably misleadingly inflated due to two possible misconducts: data deletion and test on training set. This paper clarifies what is data deletion in deep learning and what is test on training set in deep learning and why they are misconducts. A simple classification method is defined, called nearest neighbor with threshold (NNWT). A theorem is established that the NNWT method reaches a zero error on any validation set and any test set using Post-Selections, as long as the test set is in the possession of the author and both the amount of storage space and the time of training are finite but unbounded like with many deep learning methods. However, like many deep learning methods, the NNWT method has little generalization power. The evidence that misconducts actually took place in many deep learning projects is beyond the scope of this paper. Without a transparent account about freedom from Post-Selections, deep learning data are misleading.
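
文中定义的“带阈值的最近邻(NNWT)”可按其字面含义示意如下(距离度量与拒识行为为合理假设):

    import numpy as np

    class NNWT:
        def __init__(self, threshold):
            self.t = threshold

        def fit(self, X, y):
            self.X, self.y = np.asarray(X), np.asarray(y)
            return self

        def predict_one(self, x):
            d = np.linalg.norm(self.X - np.asarray(x), axis=1)
            i = int(d.argmin())
            # 距离在阈值内则返回最近邻标签,否则拒识(返回None)
            return self.y[i] if d[i] <= self.t else None

定理所批评的正是:只要测试集在手、存储与训练时间无界,这类方法配合事后选择(Post-Selections)就能在验证集与测试集上“做出”零误差,却几乎没有泛化能力。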


【10】 Auditing Membership Leakages of Multi-Exit Networks
标题:审计多出口网络的成员泄漏
链接:https://arxiv.org/abs/2208.11180

作者:Zheng Li,Yiyong Liu,Xinlei He,Ning Yu,Michael Backes,Yang Zhang
机构:CISPA Helmholtz Center for Information Security, Salesforce Research
备注:Accepted by CCS 2022
摘要:由于并非所有输入都需要相同的计算量才能产生可靠的预测,多出口网络作为一种突破高效部署极限的重要方法正受到关注。多出口网络为主干模型配备了早期出口,允许在模型的中间层获得预测,从而节省计算时间和/或能耗。然而,现有的各种多出口网络设计仅考虑在资源利用效率和预测精度之间取得最佳平衡,由此产生的隐私风险从未被研究过。这就需要对多出口网络中的隐私风险进行全面的调查。   本文首次从成员泄漏的角度对多出口网络进行了隐私分析。特别地,我们首先利用现有的攻击方法来量化多出口网络对成员泄漏的脆弱性。实验结果表明,多出口网络不易受到成员泄漏的影响,并且附加在主干模型上的出口(数量和深度)与攻击性能高度相关。此外,我们提出了一种利用出口信息的混合攻击,以提高现有攻击的性能。我们在三种不同的对抗设置下评估了混合攻击造成的成员泄漏威胁,最终得到一个无模型且无数据的对手。这些结果清楚地表明,我们的混合攻击适用范围非常广,因此相应的风险比现有成员推理攻击所显示的要严重得多。最后,我们提出了一种专门针对多出口网络的防御机制TimeGuard,并证明TimeGuard能够完美地缓解新提出的攻击。
摘要:Relying on the fact that not all inputs require the same amount of computation to yield a confident prediction, multi-exit networks are gaining attention as a prominent approach for pushing the limits of efficient deployment. Multi-exit networks endow a backbone model with early exits, allowing to obtain predictions at intermediate layers of the model and thus save computation time and/or energy. However, current various designs of multi-exit networks are only considered to achieve the best trade-off between resource usage efficiency and prediction accuracy, the privacy risks stemming from them have never been explored. This prompts the need for a comprehensive investigation of privacy risks in multi-exit networks.  In this paper, we perform the first privacy analysis of multi-exit networks through the lens of membership leakages. In particular, we first leverage the existing attack methodologies to quantify the multi-exit networks' vulnerability to membership leakages. Our experimental results show that multi-exit networks are less vulnerable to membership leakages and the exit (number and depth) attached to the backbone model is highly correlated with the attack performance. Furthermore, we propose a hybrid attack that exploits the exit information to improve the performance of existing attacks. We evaluate membership leakage threat caused by our hybrid attack under three different adversarial setups, ultimately arriving at a model-free and data-free adversary. These results clearly demonstrate that our hybrid attacks are very broadly applicable, thereby the corresponding risks are much more severe than shown by existing membership inference attacks. We further present a defense mechanism called TimeGuard specifically for multi-exit networks and show that TimeGuard mitigates the newly proposed attacks perfectly.


【11】 Using Conservation Laws to Infer Deep Learning Model Accuracy of  Richtmyer-meshkov Instabilities
标题:利用守恒定律推断Richtmyer-Meshkov不稳定性的深度学习模型精度
链接:https://arxiv.org/abs/2208.11477

作者:Charles F. Jekel,Dane M. Sterbentz,Sylvie Aubry,Youngsoo Choi,Daniel A. White,Jonathan L. Belof
机构:Lawrence Livermore National Laboratory, Livermore, CA, USA. Keywords: Deep learning, Full-field regression, Richtmyer-Meshkov instability, Inference error
备注:Presented at ECCOMAS 2022
摘要:Richtmyer-Meshkov不稳定性(RMI)是激波通过扰动界面时发生的一种复杂现象。为了研究参数化高速撞击下RMI的形成,进行了上千次的流体动力学模拟。深度学习用于学习初始几何扰动到密度和速度的全场流体动力学解的时间映射。连续性方程被用于将物理信息包括到损失函数中,然而,以额外的训练复杂性为代价,仅导致非常小的改进。深度学习模型的预测似乎可以准确捕捉域内各种几何条件下的时间RMI形成。研究了第一原理物理定律,以推断模型预测能力的准确性。虽然连续性方程似乎与模型的准确度无关,但质量守恒和动量守恒与准确度的相关性较弱。由于守恒定律可以从深度学习模型中快速计算出来,因此它们在需要相对准确性度量的应用中可能很有用。
摘要:Richtmyer-Meshkov Instability (RMI) is a complicated phenomenon that occurs when a shockwave passes through a perturbed interface. Over a thousand hydrodynamic simulations were performed to study the formation of RMI for a parameterized high velocity impact. Deep learning was used to learn the temporal mapping of initial geometric perturbations to the full-field hydrodynamic solutions of density and velocity. The continuity equation was used to include physical information into the loss function, however only resulted in very minor improvements at the cost of additional training complexity. Predictions from the deep learning model appear to accurately capture temporal RMI formations for a variety of geometric conditions within the domain. First principle physical laws were investigated to infer the accuracy of the model's predictive capability. While the continuity equation appeared to show no correlation with the accuracy of the model, conservation of mass and momentum were weakly correlated with accuracy. Since conservation laws can be quickly calculated from the deep learning model, they may be useful in applications where a relative accuracy measure is needed.
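
“从深度学习模型快速计算守恒量”可示意如下:对预测的密度场序列检查总质量随时间的相对漂移(网格体积等均为假设):

    import numpy as np

    def mass_conservation_error(rho_seq, cell_vol=1.0):
        # rho_seq: 各时间步预测的密度场数组;总质量应近似守恒
        mass = np.array([rho.sum() * cell_vol for rho in rho_seq])
        return float(np.abs(mass - mass[0]).max() / mass[0])

    # 相对漂移可作为预测精度的相对度量(漂移越大,预测越可疑)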


【12】 Automatic music mixing with deep learning and out-of-domain data
标题:具有深度学习和域外数据的自动音乐混合
链接:https://arxiv.org/abs/2208.11428

作者:Marco A. Martínez-Ramírez,Wei-Hsiang Liao,Giorgio Fabbro,Stefan Uhlich,Chihiro Nagashima,Yuki Mitsufuji
机构:Sony Group Corporation, Tokyo, Japan, Sony Europe B.V., Stuttgart, Germany
备注:23rd International Society for Music Information Retrieval Conference (ISMIR), December, 2022. Source code, demo and audio examples: this https URL
摘要:音乐混音传统上是将乐器录制成干净、独立的音轨,再利用音频效果和专业知识(例如混音工程师)将它们混合成最终的混音。近年来,音乐制作任务的自动化已经成为一个新兴的领域,其中基于规则的方法和机器学习方法都得到了探索。尽管如此,干声或干净乐器录音的缺乏限制了这类模型的性能,与专业的人工混音仍有很大差距。我们探索是否可以使用域外数据(如湿声或经过处理的多轨音乐录音),并将其重新用于训练有监督的深度学习模型,以弥补当前自动混音质量方面的差距。为了实现这一点,我们提出了一种新颖的数据预处理方法,允许模型执行自动音乐混音。我们还重新设计了一种用于评估音乐混音系统的听力测试方法。我们以经验丰富的混音工程师作为参与者,通过这种主观测试验证了我们的结果。
摘要:Music mixing traditionally involves recording instruments in the form of clean, individual tracks and blending them into a final mixture using audio effects and expert knowledge (e.g., a mixing engineer). The automation of music production tasks has become an emerging field in recent years, where rule-based methods and machine learning approaches have been explored. Nevertheless, the lack of dry or clean instrument recordings limits the performance of such models, which is still far from professional human-made mixes. We explore whether we can use out-of-domain data such as wet or processed multitrack music recordings and repurpose it to train supervised deep learning models that can bridge the current gap in automatic mixing quality. To achieve this we propose a novel data preprocessing method that allows the models to perform automatic music mixing. We also redesigned a listening test method for evaluating music mixing systems. We validate our results through such subjective tests using highly experienced mixing engineers as participants.


【13】 The premise of approximate MCMC in Bayesian deep learning
标题:贝叶斯深度学习中近似MCMC的前提
链接:https://arxiv.org/abs/2208.11389

作者:Theodore Papamarkou
机构:Department of Mathematics, The University of Manchester
摘要:本文指出了贝叶斯深度学习中近似MCMC的几个特征,并提出了一种针对神经网络的近似采样算法。类比于从大数据集中采样数据批,提出从高维神经网络参数空间中采样参数子组。虽然文献中已经讨论了minibatch MCMC的优点,但块Gibbs采样在贝叶斯深度学习中受到的研究关注较少。
摘要:This paper identifies several characteristics of approximate MCMC in Bayesian deep learning. It proposes an approximate sampling algorithm for neural networks. By analogy to sampling data batches from big datasets, it is proposed to sample parameter subgroups from neural network parameter spaces of high dimensions. While the advantages of minibatch MCMC have been discussed in the literature, blocked Gibbs sampling has received less research attention in Bayesian deep learning.
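作为示意,下面用numpy给出"按参数子组采样"思想的一个最小草图:每一步随机选取参数向量的一个块,只对该块做随机游走Metropolis更新(即Metropolis-within-Gibbs),类比于从大数据集中抽取mini-batch。这里用一个各向异性高斯代替真实的神经网络后验,块大小与步长均为假设。

```python
import numpy as np

rng = np.random.default_rng(0)

# 玩具后验:非归一化对数后验(各向异性高斯,代替神经网络的真实后验)
scales = np.linspace(0.5, 3.0, 20)
def log_post(theta):
    return -0.5 * np.sum((theta / scales) ** 2)

def blocked_mh(theta, n_steps=5000, block_size=5, step=0.3):
    """块内Metropolis-within-Gibbs:每步只提议更新一个随机参数子组。"""
    samples = []
    lp = log_post(theta)
    for _ in range(n_steps):
        idx = rng.choice(theta.size, size=block_size, replace=False)
        prop = theta.copy()
        prop[idx] += step * rng.standard_normal(block_size)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # 接受/拒绝
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    return np.array(samples)

chain = blocked_mh(np.zeros(20))
print(chain.std(axis=0)[:5])  # 应大致接近 scales 的前几个分量
```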


其他(10篇)

【1】 Bugs in the Data: How ImageNet Misrepresents Biodiversity
标题:数据中的错误:ImageNet如何错误地描述生物多样性
链接:https://arxiv.org/abs/2208.11695

作者:Alexandra Sasha Luccioni,David Rolnick
机构:Hugging Face, McGill University, Mila
摘要:ImageNet-1k是一个常用于机器学习(ML)模型基准测试,以及评估图像识别和目标检测等任务的数据集。野生动物占ImageNet-1k的27%,但与代表人和物体的类别不同,这些数据没有经过仔细的审查。在本文中,我们在专家生态学家的参与下,分析了ImageNet-1k验证集中代表野生动物的269个类别的13,450张图像。我们发现许多类别定义不清或相互重叠,12%的图像标注错误,一些类别中超过90%的图像不正确。我们还发现,ImageNet-1k中包含的野生动物相关标签和图像都存在显著的地理和文化偏见,以及诸如人造动物、同一图像中出现多个物种或有人出现等歧义。我们的研究结果凸显了以下方面的严重问题:大量使用该数据集评估ML系统、在野生动物相关任务中使用此类算法,以及更广泛的ML数据集的创建和管理方式。
摘要:ImageNet-1k is a dataset often used for benchmarking machine learning (ML) models and evaluating tasks such as image recognition and object detection. Wild animals make up 27% of ImageNet-1k but, unlike classes representing people and objects, these data have not been closely scrutinized. In the current paper, we analyze the 13,450 images from 269 classes that represent wild animals in the ImageNet-1k validation set, with the participation of expert ecologists. We find that many of the classes are ill-defined or overlapping, and that 12% of the images are incorrectly labeled, with some classes having >90% of images incorrect. We also find that both the wildlife-related labels and images included in ImageNet-1k present significant geographical and cultural biases, as well as ambiguities such as artificial animals, multiple species in the same image, or the presence of humans. Our findings highlight serious issues with the extensive use of this dataset for evaluating ML systems, the use of such algorithms in wildlife-related tasks, and more broadly the ways in which ML datasets are commonly created and curated.


【2】 Efficient Heterogeneous Video Segmentation at the Edge
标题:边缘端的高效异构视频分割
链接:https://arxiv.org/abs/2208.11666

作者:Jamie Menjay Lin,Siargey Pisarchyk,Juhyun Lee,David Tian,Tingbo Hou,Karthik Raveendran,Raman Sarokin,George Sung,Trent Tolley,Matthias Grundmann
机构:Google, Mountain View, CA, USA
备注:Published as a workshop paper at CVPRW CV4ARVR 2022
摘要:我们介绍了一种面向资源受限边缘设备、利用异构计算的高效视频分割系统。具体而言,我们在已经很轻量的骨干网络之上,针对市面上可用的边缘推理引擎,在神经架构和算子的多个规格维度上进行搜索,以此设计网络模型。我们进一步分析并优化了系统中CPU、GPU和NPU之间的异构数据流。我们的方法在实践中很好地融入了我们的实时AR系统:在边缘平台上实现了显著更高的准确度和四倍的有效分辨率,同时端到端延迟更短、帧率更高、功耗更低。
摘要:We introduce an efficient video segmentation system for resource-limited edge devices leveraging heterogeneous compute. Specifically, we design network models by searching across multiple dimensions of specifications for the neural architectures and operations on top of already light-weight backbones, targeting commercially available edge inference engines. We further analyze and optimize the heterogeneous data flows in our systems across the CPU, the GPU and the NPU. Our approach has empirically factored well into our real-time AR system, enabling remarkably higher accuracy with quadrupled effective resolutions, yet at much shorter end-to-end latency, much higher frame rate, and even lower power consumption on edge platforms.


【3】 Metric Effects based on Fluctuations in values of k in Nearest Neighbor  Regressor
标题:基于最近邻回归中k值波动的度量效应
链接:https://arxiv.org/abs/2208.11540

作者:Abhishek Gupta,Raunak Joshi,Nandan Kanvinde,Pinky Gerela,Ronald Melwin Laban
机构:Department of EXTC, University of Mumbai, Mumbai, India, Department of IT, Department of MCA, TIMSCDR, Assistant Professor - Dept. of MCA, Assistant Professor - Dept. of EXTC, St. John College of Engineering and Management, Palghar, India
备注:7 pages, 6 figures. To appear in proceedings of 3rd International Conference on Data Intelligence and Cognitive Informatics (ICDICI 2022)
摘要:机器学习的回归分支专注于连续值的预测。监督学习分支中有许多基于回归的方法,包括参数与非参数学习模型。本文针对基于距离的回归模型中一个非常微妙的问题展开研究。所使用的基于距离的模型是K最近邻回归器,它是一种有监督的非参数方法。我们想要说明的是模型的k参数及其波动对度量指标的影响。我们使用的度量是均方根误差和R平方拟合优度,并给出了它们随k值变化的可视化表示。
摘要:Regression branch of Machine Learning purely focuses on prediction of continuous values. The supervised learning branch has many regression based methods with parametric and non-parametric learning models. In this paper we aim to target a very subtle point related to distance based regression model. The distance based model used is K-Nearest Neighbors Regressor which is a supervised non-parametric method. The point that we want to prove is the effect of k parameter of the model and its fluctuations affecting the metrics. The metrics that we use are Root Mean Squared Error and R-Squared Goodness of Fit with their visual representation of values with respect to k values.
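摘要描述的实验用scikit-learn几行即可复现:在一个回归数据集上扫描k值,记录均方根误差与R平方。数据集与k的取值范围为示意性假设。

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, r2_score

# 合成回归数据集(示意用途)
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# 扫描k,观察RMSE与R平方随k的波动
for k in range(1, 21):
    model = KNeighborsRegressor(n_neighbors=k).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    print(f"k={k:2d}  RMSE={rmse:7.2f}  R2={r2_score(y_te, pred):.3f}")
```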


【4】 A Bayesian Variational principle for dynamic Self Organizing Maps
标题:动态自组织映射的贝叶斯变分原理
链接:https://arxiv.org/abs/2208.11337

作者:Anthony Fillion,Thibaut Kulak,François Blayo
机构:NeoInstinct - NeoLab, rue Traversiere, Lausanne, Switzerland
摘要:我们提出了若干组织条件,由此得到一种在变分贝叶斯框架下训练具有自适应邻域半径的SOM的方法。该方法在非平稳场景下得到验证,并在高维场景下与另一种自适应方法进行了比较。
摘要:We propose organisation conditions that yield a method for training SOM with adaptive neighborhood radius in a variational Bayesian framework. This method is validated on a non-stationary setting and compared in a high-dimensional setting with another adaptive method.
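摘要未给出"组织条件"的具体形式;下面仅是一个常规SOM训练循环的最小草图,其中邻域半径随迭代手工衰减,用来说明"自适应半径"在算法中所处的位置。论文的方法是用变分贝叶斯准则取代这种手工调度,此处的衰减方式纯属假设。

```python
import numpy as np

rng = np.random.default_rng(1)

def train_som(data, grid=(8, 8), n_iter=2000, lr0=0.5, sigma0=3.0):
    """最小SOM:sigma(t)随迭代指数衰减(假设的调度,非论文准则)。"""
    h, w = grid
    weights = rng.standard_normal((h, w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * np.exp(-3.0 * frac)
        sigma = sigma0 * np.exp(-3.0 * frac)      # 手工设定的半径衰减
        x = data[rng.integers(len(data))]
        # 最佳匹配单元(BMU)
        d2 = np.sum((weights - x) ** 2, axis=-1)
        bmu = np.unravel_index(np.argmin(d2), d2.shape)
        # 以BMU为中心的高斯邻域更新
        g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1) / (2 * sigma ** 2))
        weights += lr * g[..., None] * (x - weights)
    return weights

data = rng.standard_normal((1000, 3))
som = train_som(data)
print(som.shape)  # (8, 8, 3)
```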


【5】 FashionVQA: A Domain-Specific Visual Question Answering System
标题:FashionVQA:一个特定领域的视觉问答系统
链接:https://arxiv.org/abs/2208.11253

作者:Min Wang,Ata Mahjoubfar,Anupama Joshi
机构:Target Corporation,  S. Mathilda Place Suite , Sunnyvale, CA
摘要:人类通过多种感官模态认识世界,而语言是他们最主要的交流渠道。机器学习系统需要利用同样丰富的多模态信息,才能用自然语言与人类进行有根据的对话;这对于专门处理视觉密集信息的系统(例如服装领域的对话、推荐和搜索引擎)尤其如此。为此,我们训练了一个视觉问答(VQA)系统,用于回答关于时装摄影图像中服装的复杂自然语言问题。成功训练我们的VQA模型的关键,在于使用多样的模板,从20.7万张图像的商品属性中自动创建了一个包含1.68亿个样本的视觉问答数据集。样本生成采用了一种考虑问答对难度的策略,以强调具有挑战性的概念。与近来使用多个数据集预训练视觉问答模型的趋势相反,我们专注于保持数据集固定,并从头训练各种模型,以便把模型架构变化带来的改进隔离出来。我们发现,像语言模型那样使用同一个Transformer来编码问题并解码答案,可以取得最高的准确率,这表明视觉语言模型(VLM)是我们数据集上最好的视觉问答系统。最佳模型的准确率超过了人类专家水平,即便回答的是不限于模板格式的人工生成问题。我们生成大规模多模态领域特定数据集的方法,为训练能够用自然语言交流的专门模型提供了一条途径。这类领域专家模型(例如我们的时尚VLM模型)的训练,不能仅仅依赖从网络上收集的大规模通用数据集。
摘要:Humans apprehend the world through various sensory modalities, yet language is their predominant communication channel. Machine learning systems need to draw on the same multimodal richness to have informed discourses with humans in natural language; this is particularly true for systems specialized in visually-dense information, such as dialogue, recommendation, and search engines for clothing. To this end, we train a visual question answering (VQA) system to answer complex natural language questions about apparel in fashion photoshoot images. The key to the successful training of our VQA model is the automatic creation of a visual question-answering dataset with 168 million samples from item attributes of 207 thousand images using diverse templates. The sample generation employs a strategy that considers the difficulty of the question-answer pairs to emphasize challenging concepts. Contrary to the recent trends in using several datasets for pretraining the visual question answering models, we focused on keeping the dataset fixed while training various models from scratch to isolate the improvements from model architecture changes. We see that using the same transformer for encoding the question and decoding the answer, as in language models, achieves maximum accuracy, showing that visual language models (VLMs) make the best visual question answering systems for our dataset. The accuracy of the best model surpasses the human expert level, even when answering human-generated questions that are not confined to the template formats. Our approach for generating a large-scale multimodal domain-specific dataset provides a path for training specialized models capable of communicating in natural language. The training of such domain-expert models, e.g., our fashion VLM model, cannot rely solely on the large-scale general-purpose datasets collected from the web.
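下面是一个示意性草图,说明摘要所述"用模板从商品属性自动生成问答对"的基本流程;模板内容与属性字段均为假设,且未实现按问答对难度加权的采样策略。

```python
import random

# (问题模板, 答案对应的属性字段);None 表示"是/否"型模板
TEMPLATES = [
    ("What color is the {category}?", "color"),
    ("What material is the {category} made of?", "material"),
    ("Is there a {color} {category} in the image?", None),
]

def generate_qa(item: dict, n: int = 3):
    """从单个商品的属性字典生成若干问答对。"""
    pairs = []
    for _ in range(n):
        template, answer_key = random.choice(TEMPLATES)
        question = template.format(**item)
        answer = item[answer_key] if answer_key else "yes"
        pairs.append((item["image_id"], question, answer))
    return pairs

# 假设的商品属性记录
item = {"image_id": "img_00001", "category": "dress", "color": "red", "material": "cotton"}
for image_id, q, a in generate_qa(item):
    print(image_id, "|", q, "|", a)
```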


【6】 Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix  Completion
标题:针对高度病态超大规模在线矩阵补全的加速SGD
链接:https://arxiv.org/abs/2208.11246

作者:Gavin Zhang,Hong-Ming Chiu,Richard Y. Zhang
机构:Department of Electrical and Computer Engineering, University of Illinois-Urbana Champaign
摘要:矩阵补全问题旨在从对矩阵单个元素的观测中恢复秩$r\ll d$的$d\times d$真实(ground truth)矩阵。现实世界中的矩阵补全往往是一个超大规模的优化问题,$d$大到即使是时间复杂度为$O(d)$的最简单全维向量运算也变得非常昂贵。随机梯度下降(SGD)是少数能够在超大规模下求解矩阵补全的算法之一,并且还能自然地处理真实矩阵不断演变时的流数据。不幸的是,当底层真实矩阵病态时,SGD会显著变慢:对于条件数为$\kappa$的真实矩阵,它至少需要$O(\kappa\log(1/\epsilon))$次迭代才能达到$\epsilon$-接近。本文提出了一种预条件化的SGD,它保留了SGD在超大规模在线优化中的全部优良实用性质,同时使其对$\kappa$不敏感。对于对称的真实矩阵和均方根误差(RMSE)损失,我们证明了预条件SGD能在$O(\log(1/\epsilon))$次迭代内收敛到$\epsilon$-精度,其快速线性收敛速率就如同真实矩阵具有$\kappa=1$的完美条件数一样。在数值实验中,我们观察到在1比特交叉熵损失以及贝叶斯个性化排序(BPR)损失等成对损失下,病态矩阵补全也有类似的加速。
摘要:The matrix completion problem seeks to recover a $d\times d$ ground truth matrix of low rank $r\ll d$ from observations of its individual elements. Real-world matrix completion is often a huge-scale optimization problem, with $d$ so large that even the simplest full-dimension vector operations with $O(d)$ time complexity become prohibitively expensive. Stochastic gradient descent (SGD) is one of the few algorithms capable of solving matrix completion on a huge scale, and can also naturally handle streaming data over an evolving ground truth. Unfortunately, SGD experiences a dramatic slow-down when the underlying ground truth is ill-conditioned; it requires at least $O(\kappa\log(1/\epsilon))$ iterations to get $\epsilon$-close to ground truth matrix with condition number $\kappa$. In this paper, we propose a preconditioned version of SGD that preserves all the favorable practical qualities of SGD for huge-scale online optimization while also making it agnostic to $\kappa$. For a symmetric ground truth and the Root Mean Square Error (RMSE) loss, we prove that the preconditioned SGD converges to $\epsilon$-accuracy in $O(\log(1/\epsilon))$ iterations, with a rapid linear convergence rate as if the ground truth were perfectly conditioned with $\kappa=1$. In our numerical experiments, we observe a similar acceleration for ill-conditioned matrix completion under the 1-bit cross-entropy loss, as well as pairwise losses such as the Bayesian Personalized Ranking (BPR) loss.
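作为示意,下面用numpy演示对称低秩分解上带右预条件子的梯度更新。预条件子取$(U^\top U+\lambda I)^{-1}$这一常见形式,这是本文的假设,未必与论文算法完全一致;为使示例简短,这里用观测集上的全批梯度代替逐元素SGD。

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, kappa = 100, 3, 50.0

# 构造病态的对称低秩真实矩阵 M = U* U*^T,其奇异值从1跨到kappa
U_star = np.linalg.qr(rng.standard_normal((d, r)))[0] * np.sqrt(np.linspace(1.0, kappa, r))
M = U_star @ U_star.T

mask = rng.random((d, d)) < 0.3          # 随机观测30%的元素
obs = mask * M

U = rng.standard_normal((d, r)) * 0.1    # 随机初始化的低秩因子
eta, lam = 0.4, 1e-6
for it in range(601):
    R = mask * (U @ U.T - obs)           # 仅在观测元素上计算残差
    G = 2.0 * R @ U                      # 对U的梯度
    P = np.linalg.inv(U.T @ U + lam * np.eye(r))   # 右预条件子(假设的形式)
    U = U - eta * G @ P
    if it % 100 == 0:
        rmse = np.sqrt(np.mean((U @ U.T - M) ** 2))
        print(f"iter {it:3d}  RMSE={rmse:.4f}")    # RMSE应随迭代下降
```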


【7】 DeepPicarMicro: Applying TinyML to Autonomous Cyber Physical Systems
标题:DeepPicarMicro:将TinyML应用于自主网络物理系统
链接:https://arxiv.org/abs/2208.11212

作者:Michael Bechtel,QiTao Weng,Heechul Yun
机构:University of Kansas, USA.
备注:RTCSA 2022
摘要:在微型微控制器单元(MCU)上运行深度神经网络(DNN)具有挑战性,因为MCU在计算、内存和存储容量方面存在限制。幸运的是,MCU硬件和机器学习软件框架的最新进展使得在现代MCU上运行相当复杂的神经网络成为可能,由此形成了一个被广泛称为TinyML的新研究领域。然而,展示TinyML在网络物理系统(CPS)中应用潜力的研究还很少。本文介绍了一个小型自动驾驶遥控汽车测试平台DeepPicarMicro,它在Raspberry Pi Pico MCU上运行卷积神经网络(CNN)。我们应用最先进的DNN优化技术,成功地将著名的PilotNet CNN架构(曾用于驱动NVIDIA的真实自动驾驶汽车)装入该MCU。我们还应用最先进的网络架构搜索(NAS)方法,寻找经进一步优化、能够以端到端方式实时有效控制汽车的网络。通过广泛而系统的实验评估,我们观察到系统的准确度、延迟和控制性能之间存在有趣的关系。在此基础上,我们提出了一种在面向AI的CPS的网络架构搜索过程中同时考虑模型准确度和延迟的联合优化策略。
摘要:Running deep neural networks (DNNs) on tiny Micro-controller Units (MCUs) is challenging due to their limitations in computing, memory, and storage capacity. Fortunately, recent advances in both MCU hardware and machine learning software frameworks make it possible to run fairly complex neural networks on modern MCUs, resulting in a new field of study widely known as TinyML. However, there have been few studies to show the potential for TinyML applications in cyber physical systems (CPS). In this paper, we present DeepPicarMicro, a small self-driving RC car testbed, which runs a convolutional neural network (CNN) on a Raspberry Pi Pico MCU. We apply a state-of-the-art DNN optimization to successfully fit the well-known PilotNet CNN architecture, which was used to drive NVIDIA's real self-driving car, on the MCU. We apply a state-of-art network architecture search (NAS) approach to find further optimized networks that can effectively control the car in real-time in an end-to-end manner. From an extensive systematic experimental evaluation study, we observe an interesting relationship between the accuracy, latency, and control performance of a system. From this, we propose a joint optimization strategy that takes both accuracy and latency of a model in the network architecture search process for AI enabled CPS.
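摘要提出在NAS中同时考虑准确度与延迟;下面给出一个联合打分函数的假设性草图(延迟预算与惩罚系数均为虚构),仅用于说明"联合优化"在搜索中如何用作候选网络的排序依据。

```python
def joint_score(accuracy: float, latency_ms: float,
                latency_budget_ms: float = 50.0, alpha: float = 0.5) -> float:
    """联合优化目标:准确度减去超出延迟预算部分的惩罚(形式为假设)。"""
    over = max(0.0, latency_ms - latency_budget_ms) / latency_budget_ms
    return accuracy - alpha * over

# 虚构的候选网络及其测得指标
candidates = [
    {"name": "net_a", "accuracy": 0.91, "latency_ms": 80.0},
    {"name": "net_b", "accuracy": 0.88, "latency_ms": 45.0},
    {"name": "net_c", "accuracy": 0.90, "latency_ms": 55.0},
]
best = max(candidates, key=lambda c: joint_score(c["accuracy"], c["latency_ms"]))
print(best["name"])  # 在该假设打分下为 net_b
```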


【8】 Robustness to Unbounded Smoothness of Generalized SignSGD
标题:广义SignSGD对无界光滑性的稳健性
链接:https://arxiv.org/abs/2208.11195

作者:Michael Crawshaw,Mingrui Liu,Francesco Orabona,Wei Zhang,Zhenxun Zhuang
机构:George Mason University, Boston University, IBM T. J. Watson Research Center
摘要:非凸优化中的传统分析通常依赖光滑性假设,即要求梯度满足Lipschitz条件。然而,最近的证据表明,这种光滑性条件并不能刻画某些深度学习目标函数的性质,包括涉及循环神经网络和LSTM的目标函数。相反,它们满足一个宽松得多的条件,光滑性可能无界。在这一宽松假设下,理论和经验都表明,梯度裁剪SGD优于普通SGD。在本文中,我们证明在处理此类场景时,裁剪对于Adam型算法并非不可或缺:我们从理论上证明,一种广义SignSGD算法可以获得与带裁剪的SGD相近的收敛速率,而完全不需要显式裁剪。这一算法家族的一端恢复了SignSGD,另一端则与流行的Adam算法非常相似。我们的分析强调了动量在分析SignSGD型和Adam型算法中的关键作用:它不仅降低了噪声的影响,从而消除了以往SignSGD型算法分析中对大mini-batch的需求,还大幅减弱了无界光滑性和梯度范数的影响。我们还在一组深度学习任务上将这些算法与流行的优化器进行了比较,观察到它们能够在超越其他优化器的同时达到与Adam相当的性能。
摘要:Traditional analyses in non-convex optimization typically rely on the smoothness assumption, namely requiring the gradients to be Lipschitz. However, recent evidence shows that this smoothness condition does not capture the properties of some deep learning objective functions, including the ones involving Recurrent Neural Networks and LSTMs. Instead, they satisfy a much more relaxed condition, with potentially unbounded smoothness. Under this relaxed assumption, it has been theoretically and empirically shown that the gradient-clipped SGD has an advantage over the vanilla one. In this paper, we show that clipping is not indispensable for Adam-type algorithms in tackling such scenarios: we theoretically prove that a generalized SignSGD algorithm can obtain similar convergence rates as SGD with clipping but does not need explicit clipping at all. This family of algorithms on one end recovers SignSGD and on the other end closely resembles the popular Adam algorithm. Our analysis underlines the critical role that momentum plays in analyzing SignSGD-type and Adam-type algorithms: it not only reduces the effects of noise, thus removing the need for large mini-batch in previous analyses of SignSGD-type algorithms, but it also substantially reduces the effects of unbounded smoothness and gradient norms. We also compare these algorithms with popular optimizers on a set of deep learning tasks, observing that we can match the performance of Adam while beating the others.
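论文的广义SignSGD家族一端恢复SignSGD,另一端接近Adam;下面给出该家族中最简单的成员,即带动量的SignSGD(Signum)的最小numpy草图,并非论文算法的完整形式。演示目标取$f(x)=x^4$,其梯度不是Lipschitz的,而符号更新保证了每步位移有界。

```python
import numpy as np

def signum_step(x, m, grad, lr=0.05, beta=0.9):
    """带动量的SignSGD:先更新动量,再沿其符号走固定步长。"""
    m = beta * m + (1.0 - beta) * grad
    return x - lr * np.sign(m), m

x, m = 5.0, 0.0
for _ in range(300):
    g = 4.0 * x ** 3          # f(x)=x^4 的梯度,随|x|增大而无界增长
    x, m = signum_step(x, m, g)
print(round(x, 2))            # 应停留在0附近的小振荡区间内
```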


【9】 The Alberta Plan for AI Research
标题:人工智能研究的艾伯塔计划
链接:https://arxiv.org/abs/2208.11173

作者:Richard S. Sutton,Michael H. Bowling,Patrick M. Pilarski
机构:University of Alberta and DeepMind Alberta, Edmonton, Alberta, Canada
摘要:在这里,我们描述了我们的人工智能研究方法,并称之为艾伯塔计划。艾伯塔计划由我们在艾伯塔省的研究小组以及世界各地志同道合的研究者共同推进。我们欢迎所有愿意与我们一道从事这项工作的人。
摘要:Herein we describe our approach to artificial intelligence research, which we call the Alberta Plan. The Alberta Plan is pursued within our research groups in Alberta and by others who are like minded throughout the world. We welcome all who would join us in this pursuit.


【10】 Fast emulation of density functional theory simulations using  approximate Gaussian processes
标题:基于近似高斯过程的密度泛函理论模拟快速仿真
链接:https://arxiv.org/abs/2208.11302

作者:Steven Stetzler,Michael Grosskopf,Earl Lawrence
机构:University of Washington Department of Astronomy, th Ave NE, Seattle, WA, USA; Los Alamos National Laboratory, PO Box , Los Alamos, NM, USA
备注:20 pages, 8 figures, to appear in Conference Proceedings for SPIE Applications of Machine Learning 2022
摘要:使用马尔可夫链蒙特卡罗以贝叶斯方式将理论模型拟合到实验数据,通常需要对模型进行数千次(甚至数百万次)求值。当模型是计算缓慢的物理模拟时,贝叶斯模型拟合就变得不可行。为了解决这一问题,可以在模型拟合期间使用一个预测模拟输出的第二统计模型("仿真器")来代替完整模拟。常用的仿真器是高斯过程(GP),这是一种灵活的非线性模型,在每个输入点同时给出预测均值和方差。高斯过程回归在训练数据较少($n<10^3$)时表现良好,但当数据集变大时,训练和预测都会变慢。在大中型数据集($n>10^5$)的情形下,可以使用多种方法加速高斯过程,以牺牲部分预测精度换取运行时间的大幅缩短。本文考察了几种近似高斯过程模型(稀疏变分GP、随机变分GP和深度核学习GP)在仿真密度泛函理论(DFT)模型预测时的精度与运行时间之间的权衡。此外,我们使用这些仿真器,以贝叶斯方式利用观测数据校准DFT模型参数,化解了数据集规模带来的计算障碍,并将校准结果与先前工作进行比较。这些经过校准的DFT模型的用途,是基于观测数据对实验上尚未观测到的感兴趣核素(例如超重核)的性质作出预测。
摘要:Fitting a theoretical model to experimental data in a Bayesian manner using Markov chain Monte Carlo typically requires one to evaluate the model thousands (or millions) of times. When the model is a slow-to-compute physics simulation, Bayesian model fitting becomes infeasible. To remedy this, a second statistical model that predicts the simulation output -- an "emulator" -- can be used in lieu of the full simulation during model fitting. A typical emulator of choice is the Gaussian process (GP), a flexible, non-linear model that provides both a predictive mean and variance at each input point. Gaussian process regression works well for small amounts of training data ($n < 10^3$), but becomes slow to train and use for prediction when the data set size becomes large. Various methods can be used to speed up the Gaussian process in the medium-to-large data set regime ($n > 10^5$), trading away predictive accuracy for drastically reduced runtime. This work examines the accuracy-runtime trade-off of several approximate Gaussian process models -- the sparse variational GP, stochastic variational GP, and deep kernel learned GP -- when emulating the predictions of density functional theory (DFT) models. Additionally, we use the emulators to calibrate, in a Bayesian manner, the DFT model parameters using observed data, resolving the computational barrier imposed by the data set size, and compare calibration results to previous work. The utility of these calibrated DFT models is to make predictions, based on observed data, about the properties of experimentally unobserved nuclides of interest e.g. super-heavy nuclei.
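作为示意,下面用numpy实现稀疏GP中最简单的近似之一:基于m个诱导点的Subset-of-Regressors(SoR)预测均值,用来说明"以精度换运行时间"的基本思路。这只是一个自包含的示意草图,并非论文所用的变分近似实现;核超参数与诱导点选取方式均为假设。

```python
import numpy as np

def rbf(A, B, ls=0.5, var=1.0):
    """RBF核矩阵。"""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return var * np.exp(-0.5 * d2 / ls**2)

def sor_gp_fit_predict(X, y, X_test, m=30, noise=0.1):
    """Subset-of-Regressors稀疏GP:训练复杂度由O(n^3)降为O(n m^2)。"""
    rng = np.random.default_rng(0)
    Z = X[rng.choice(len(X), size=m, replace=False)]   # 诱导点(此处简单随机选取)
    Kmm = rbf(Z, Z) + 1e-6 * np.eye(m)
    Knm = rbf(X, Z)
    Ksm = rbf(X_test, Z)
    A = Knm.T @ Knm + noise**2 * Kmm                   # 只需解 m x m 线性系统
    return Ksm @ np.linalg.solve(A, Knm.T @ y)

# 玩具一维回归演示:预测均值应近似 sin(2x)
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(500)
X_test = np.linspace(-3, 3, 5)[:, None]
print(np.round(sor_gp_fit_predict(X, y, X_test), 2))
```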


机器翻译由腾讯交互翻译提供,仅供参考

点击“阅读原文”获取带摘要的学术速递

Python社区是高质量的Python/Django开发社区
本文地址:http://www.python88.com/topic/146501
 