
CV & AIGC Top Conference Roundup [2024-12-10]

Source: 晓飞的算法工程笔记 (Xiaofei's Algorithm Engineering Notes)

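34 papers updated today:

  • Computer vision conferences: 21 papers
  • Natural language processing conferences: 13 papers
Note that papers on large models are mostly published at natural language processing conferences. Because multimodal research is developing rapidly, some computer vision papers also appear at top natural language processing conferences.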

Computer vision conferences: 21 papers


[0] TagFog: Textual Anchor Guidance and Fake Outlier Generation for Visual Out-of-Distribution Detection[cs.CV]
Authors: Jiankang Chen, Tong Zhang, Wei-Shi Zheng, Ruixuan Wang
Link: http://arxiv.org/abs/2412.05292
Code: https://github.com/Cverchen/TagFog
Journal reference: Proceedings of the AAAI Conference on Artificial Intelligence, 2024
Comments: 10 pages, 4 figures

[1]-NeRF: Leveraging Attenuation Priors in Neural Radiance Field for 3D Computed Tomography Reconstruction[cs.CV]
Authors: Li Zhou, Changsheng Fang, Bahareh Morovati, Yongtong Liu, Shuo Han, Yongshun Xu, Hengyong Yu
Link: http://arxiv.org/abs/2412.05322
Comments: Submitted to CVPR 2025

[2] Generative Model-Based Fusion for Improved Few-Shot Semantic Segmentation of Infrared Images[cs.CV]
Authors: Junno Yun, Mehmet Akçakaya
Link: http://arxiv.org/abs/2412.05341
Comments: Winter Conference on Applications of Computer Vision (WACV), 2025

[3] Swap Path Network for Robust Person Search Pre-training[cs.CV]
Authors: Lucas Jaffe, Avideh Zakhor
Link: http://arxiv.org/abs/2412.05433
Code: https://github.com/LLNL/spnet
Comments: WACV 2025

[4] CigTime: Corrective Instruction Generation Through Inverse Motion Editing[cs.CV]
Authors: Qihang Fang, Chengcheng Tang, Bugra Tekin, Yanchao Yang
Link: http://arxiv.org/abs/2412.05460
Comments: 20 pages, 8 figures, NeurIPS 2024

[5] Video2Reward: Generating Reward Function from Videos for Legged Robot Behavior Learning[cs.CV]
Authors: Runhao Zeng, Dingjie Zhou, Qiwei Liang, Junlin Liu, Hui Li, Changxin Huang, Jianqiang Li, Xiping Hu, Fuchun Sun
Link: http://arxiv.org/abs/2412.05515
Journal reference: Proceedings of the 27th European Conference on Artificial Intelligence (ECAI 2024), Santiago de Compostela, Spain, October 19-24, 2024. Frontiers in Artificial Intelligence and Applications, vol. 392, IOS Press, pp. 4369-4376
Comments: 8 pages, 6 figures, ECAI 2024

[6] Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis[cs.CV]
Authors: Diwen Wan, Yuxiang Wang, Ruijie Lu, Gang Zeng
Link: http://arxiv.org/abs/2412.05570
Comments: Accepted by NeurIPS 2024

[7] TB-HSU: Hierarchical 3D Scene Understanding with Contextual Affordances[cs.CV]
Authors: Wenting Xu, Viorela Ila, Luping Zhou, Craig T. Jin
Link: http://arxiv.org/abs/2412.05596
Comments: Submitted to AAAI 2025

[8] Self-Supervised Learning with Probabilistic Density Labeling for Rainfall Probability Estimation[cs.CV]
Authors: Junha Lee, Sojung An, Sujeong You, Namik Cho
Link: http://arxiv.org/abs/2412.05825
Code: https://github.com/joonha425/SSLPDL
Comments: Accepted by WACV 2025

[9] LVP-CLIP: Revisiting CLIP for Continual Learning with Label Vector Pool[cs.CV]
Authors: Yue Ma, Huantao Ren, Boyu Wang, Jingang Jin, Senem Velipasalar, Qinru Qiu
Link: http://arxiv.org/abs/2412.05840
Comments: Submitted to CVPR 2025

[10] BiDM: Pushing the Limit of Quantization for Diffusion Models[cs.CV]
Authors: Xingyu Zheng, Xianglong Liu, Yichen Bian, Xudong Ma, Yulun Zhang, Jiakai Wang, Jinyang Guo, Haotong Qin
Link: http://arxiv.org/abs/2412.05926
Code: https://github.com/Xingyu-Zheng/BiDM
Comments: NeurIPS 2024

[11] HSDA: High-frequency Shuffle Data Augmentation for Bird's-Eye-View Map Segmentation[cs.CV]
Authors: Calvin Glisson, Qiuxiao Chen
Link: http://arxiv.org/abs/2412.06127
Code: https://github.com/Zarhult/HSDA
Comments: Accepted for publication at the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). 8 pages excluding references, 5 figures

[12] Data Free Backdoor Attacks[cs.CV]
Authors: Bochuan Cao, Jinyuan Jia, Chuxuan Hu, Wenbo Guo, Zhen Xiang, Jinghui Chen, Bo Li, Dawn Song
Link: http://arxiv.org/abs/2412.06219
Comments: 24 pages, 8 figures, accepted by NeurIPS 2024

[13] No Annotations for Object Detection in Art through Stable Diffusion[cs.CV]
Authors: Patrick Ramos, Nicolas Gonthier, Selina Khan, Yuta Nakashima, Noa Garcia
Link: http://arxiv.org/abs/2412.06286
Code: https://github.com/patrick-john-ramos/nada
Comments: 8 pages, 6 figures, to be published in WACV 2025

[14] LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations[cs.CV]
Authors: Mingjie Xu, Mengyang Wu, Yuzhi Zhao, Jason Chun Lok Li, Weifeng Ou
Link: http://arxiv.org/abs/2412.06322
Code: https://github.com/Endlinc/LLaVA-SpaceSGG
Comments: Accepted by WACV 2025, including supplementary material

[15] Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation[cs.CV]
Authors: Xuesong Zhang, Yunbo Xu, Jia Li, Zhenzhen Hu, Richnag Hong
Link: http://arxiv.org/abs/2412.06465
Comments: Under review at CVPR 2025

[16] Active Learning with Context Sampling and One-vs-Rest Entropy for Semantic Segmentation[cs.CV]
Authors: Fei Wu, Pablo Marquez-Neila, Hedyeh Rafi-Tarii, Raphael Sznitman
Link: http://arxiv.org/abs/2412.06470
Comments: WACV 2025, 8 pages

[17] BATseg: Boundary-aware Multiclass Spinal Cord Tumor Segmentation on 3D MRI Scans[cs.CV]
Authors: Hongkang Song, Zihui Zhang, Yanpeng Zhou, Jie Hu, Zishuo Wang, Hou Him Chan, Chon Lok Lei, Chen Xu, Yu Xin, Bo Yang
Link: http://arxiv.org/abs/2412.06507
Code: https://github.com/vLAR-group/BATseg
Comments: ECCV 2024 Workshop on BioImage Computing

[18] Bridging the Divide: Reconsidering Softmax and Linear Attention[cs.CV]
Authors: Dongchen Han, Yifan Pu, Zhuofan Xia, Yizeng Han, Xuran Pan, Xiu Li, Jiwen Lu, Shiji Song, Gao Huang
Link: http://arxiv.org/abs/2412.06590
Code: https://github.com/LeapLabTHU/InLine
Comments: NeurIPS 2024

[19] Class Balance Matters to Active Class-Incremental Learning[cs.CV]
Authors: Zitong Huang, Ze Chen, Yuanze Li, Bowen Dong, Erjin Zhou, Yong Liu, Rick Siow Mong Goh, Chun-Mei Feng, Wangmeng Zuo
Link: http://arxiv.org/abs/2412.06642
Code: https://github.com/1170300714/CBS
Comments: ACM MM 2024

[20] Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation[cs.CV]
Authors: Ruihan Gao, Kangle Deng, Gengshan Yang, Wenzhen Yuan, Jun-Yan Zhu
Link: http://arxiv.org/abs/2412.06785
Project page: https://ruihangao.github.io/TactileDreamFusion/
Comments: Accepted to NeurIPS 2024

Natural language processing conferences: 13 papers


[0] CALICO: Conversational Agent Localization via Synthetic Data Generation[cs.CL]
Authors: Andy Rosenbaum, Pegah Kharazmi, Ershad Banijamali, Lu Zeng, Christopher DiPersio, Pan Wei, Gokmen Oz, Clement Chung, Karolina Owczarzak, Fabian Triefenbach, Wael Hamza
Link: http://arxiv.org/abs/2412.05388
Comments: Accepted to the 37th International Conference on Neural Information Processing Systems (NeurIPS 2023), December 10-16, 2023, SyntheticData4ML Workshop, New Orleans, United States

[1] A polar coordinate system represents syntax in large language models[cs.CL]
Authors: Pablo Diego-Simón, Stéphane D'Ascoli, Emmanuel Chemla, Yair Lakretz, Jean-Rémi King
Link: http://arxiv.org/abs/2412.05571
Journal reference: NeurIPS 2024

[2] On the effective transfer of knowledge from English to Hindi Wikipedia[cs.CL]
Authors: Paramita Das, Amartya Roy, Ritabrata Chakraborty, Animesh Mukherjee
Link: http://arxiv.org/abs/2412.05708
Comments: Accepted at COLING Industry Track 2025

[3] Uncovering Uncertainty in Transformer Inference[cs.CL]
Authors: Greyson Brothers, Willa Mannering, Amber Tien, John Winder
Link: http://arxiv.org/abs/2412.05768
Comments: Accepted poster at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Workshop on Foundation Model Interventions

[4] Speech Is Not Enough: Interpreting Nonverbal Indicators of Common Knowledge and Engagement[cs.CL]
Authors: Derek Palmer, Yifan Zhu, Kenneth Lai, Hannah VanderHoeven, Mariah Bradford, Ibrahim Khebour, Carlos Mabrey, Jack Fitzgerald, Nikhil Krishnaswamy, Martha Palmer, James Pustejovsky
Link: http://arxiv.org/abs/2412.05797
Comments: 3 pages, 2 figures, appearing at AAAI 2025 Demos Track

[5] 1-800-SHARED-TASKS at RegNLP: Lexical Reranking of Semantic Retrieval (LeSeR) for Regulatory Question Answering[cs.CL]
Authors: Jebish Purbey, Drishti Sharma, Siddhant Gupta, Khawaja Murad, Siddartha Pullakhandam, Ram Mohan Rao Kadiyala
Link: http://arxiv.org/abs/2412.06009
Comments: 5 pages, accepted to RegNLP @ COLING 2025

[6] Steering Large Language Models to Evaluate and Amplify Creativity[cs.CL]
Authors: Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Shao-yen Tseng, Vasudev Lal
Link: http://arxiv.org/abs/2412.06060
Comments: (Spotlight) NeurIPS 2024 Workshop on Creativity & Generative AI. Authors 1 and 2 contributed equally

[7] Annotations for Exploring Food Tweets From Multiple Aspects[cs.CL]
Authors: Matīss Rikters, Edison Marrese-Taylor, Rinalds Vīksna
Link: http://arxiv.org/abs/2412.06179
Journal reference: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

[8] SafeWorld: Geo-Diverse Safety Alignment[cs.CL]
Authors: Da Yin, Haoyi Qiu, Kung-Hsiang Huang, Kai-Wei Chang, Nanyun Peng
Link: http://arxiv.org/abs/2412.06483
Code: https://github.com/PlusLabNLP/SafeWorld
Comments: Accepted by NeurIPS 2024

[9] Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy[cs.CL]
Authors: Min Zeng, Caiquan Liu, Shiqi Zhang, Li Xie, Chen Sang, Xiaoxin Chen, Xiaoxin Chen
Link: http://arxiv.org/abs/2412.06575
Comments: Accepted by COLING 2025 (main, long paper)

[10] GEAR: A Simple GENERATE, EMBED, AVERAGE AND RANK Approach for Unsupervised Reverse Dictionary[cs.CL]
Authors: Fatemah Almeman, Luis Espinosa-Anke
Link: http://arxiv.org/abs/2412.06654
Comments: 9 pages, accepted at COLING 2025

[11] I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token[cs.CL]
Authors: Roi Cohen, Konstantin Dobler, Eden Biran, Gerard de Melo
Link: http://arxiv.org/abs/2412.06676
Comments: Published at NeurIPS 2024

[12] JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM[cs.CL]
Authors: Takuro Fujii, Satoru Katsumata
Link: http://arxiv.org/abs/2412.06738
Comments: Accepted by PACLIC38 (2024)

Thanks to arxiv.org

