
CV & AIGC Top-Conference Roundup [2024-11-19]

晓飞的算法工程笔记 (Xiaofei's Algorithm Engineering Notes)

27 papers added today:

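  • Computer vision conferences: 20 papers
  • Natural language processing conferences: 7 papers
Note that papers on large models are mostly published at natural language processing conferences. Because multimodal research is advancing quickly, some computer-vision papers also appear at top NLP venues.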

Computer vision conferences: 20 papers


[0] Hateful Meme Detection through Context-Sensitive Prompting and Fine-Grained Labeling[cs.CV]
Authors: Rongxin Ouyang, Kokil Jaidka, Subhayan Mukerjee, Guangyu Cui
Link: http://arxiv.org/abs/2411.10480
Comments: AAAI-25 Student Abstract, Oral Presentation

[1] A minimalistic representation model for head direction system[cs.CV]
Authors: Minglu Zhao, Dehong Xu, Deqian Kong, Wen-Hao Zhang, Ying Nian Wu
Link: http://arxiv.org/abs/2411.10596
Comments: Workshop on Symmetry and Geometry in Neural Representations (NeurReps) at NeurIPS 2024, Extended Abstract Track

[2] Voxel-Aggregated Feature Synthesis: Efficient Dense Mapping for Simulated 3D Reasoning[cs.CV]
Authors: Owen Burns, Rizwan Qureshi
Link: http://arxiv.org/abs/2411.10616
Comments: 6 pages, 2 figures, CVPR 2025

[3] Diffusion-based Layer-wise Semantic Reconstruction for Unsupervised Out-of-Distribution Detection[cs.CV]
Authors: Ying Yang, De Cheng, Chaowei Fang, Yubiao Wang, Changzhe Jiao, Lechao Cheng, Nannan Wang
Link: http://arxiv.org/abs/2411.10701
Code: https://github.com/xbyym/DLSR
Journal reference: Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)
Comments: 26 pages, 23 figures, published at NeurIPS 2024

[4] It Takes Two: Accurate Gait Recognition in the Wild via Cross-granularity Alignment[cs.CV]
Authors: Jinkai Zheng, Xinchen Liu, Boyue Zhang, Chenggang Yan, Jiyong Zhang, Wu Liu, Yongdong Zhang
Link: http://arxiv.org/abs/2411.10742
Comments: 12 pages, 9 figures; accepted by ACM MM 2024

[5] MRI Parameter Mapping via Gaussian Mixture VAE: Breaking the Assumption of Independent Pixels[cs.CV]
Authors: Moucheng Xu, Yukun Zhou, Tobias Goodwin-Allcock, Kimia Firoozabadi, Joseph Jacob, Daniel C. Alexander, Paddy J. Slator
Link: http://arxiv.org/abs/2411.10772
Comments: NeurIPS 2024 Workshop on Machine Learning and the Physical Sciences

[6] An End-to-End Real-World Camera Imaging Pipeline[cs.CV]
Authors: Kepeng Xu, Zijia Ma, Li Xu, Gang He, Yunsong Li, Wenxin Yu, Taichu Han, Cheng Yang
Link: http://arxiv.org/abs/2411.10773
Comments: Accepted by ACM MM 2024

[7] Generating Compositional Scenes via Text-to-image RGBA Instance Generation[cs.CV]
Authors: Alessandro Fontanella, Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang, Sarah Parisot
Link: http://arxiv.org/abs/2411.10913
Comments: NeurIPS 2024

[8] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection[cs.CV]
Authors: Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong
Link: http://arxiv.org/abs/2411.10922
Code: https://github.com/Cogito2012/OpenMixer
Comments: Accepted to WACV 2025

[9] Constrained Diffusion with Trust Sampling[cs.CV]
Authors: William Huang, Yifeng Jiang, Tom Van Wouwe, C. Karen Liu
Link: http://arxiv.org/abs/2411.10932
Comments: 18 pages, 6 figures, NeurIPS

[10] Anomaly Detection for People with Visual Impairments Using an Egocentric 360-Degree Camera[cs.CV]
Authors: Inpyo Song, Sanghyeon Lee, Minjun Joo, Jangwon Lee
Link: http://arxiv.org/abs/2411.10945
Comments: WACV 2025

[11] Framework for developing and evaluating ethical collaboration between expert and machine[cs.CV]
Authors: Ayan Banerjee, Payal Kamboj, Sandeep Gupta
Link: http://arxiv.org/abs/2411.10983
Comments: Accepted at ECAI Workshop AIEB

[12] Unveiling the Hidden: Online Vectorized HD Map Construction with Clip-Level Token Interaction and Propagation[cs.CV]
Authors: Nayeon Kim, Hongje Seong, Daehyun Ji, Sujin Jang
Link: http://arxiv.org/abs/2411.11002
Comments: 18 pages, 9 figures, NeurIPS 2024

[13] Time Step Generating: A Universal Synthesized Deepfake Image Detector[cs.CV]
Authors: Ziyue Zeng, Haoyuan Liu, Dingjie Peng, Luoxu Jing, Hiroshi Watanabe
Link: http://arxiv.org/abs/2411.11016
Comments: Submitted to CVPR 2025, 9 pages, 7 figures

[14] Color-Oriented Redundancy Reduction in Dataset Distillation[cs.CV]
Authors: Bowen Yuan, Zijian Wang, Mahsa Baktashmotlagh, Yadan Luo, Zi Huang
Link: http://arxiv.org/abs/2411.11329
Code: https://github.com/KeViNYuAn0314/AutoPalette
Comments: 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

[15] Superpixel-informed Implicit Neural Representation for Multi-Dimensional Data[cs.CV]
Authors: Jiayi Li, Xile Zhao, Jianli Wang, Chao Wang, Min Wang
Link: http://arxiv.org/abs/2411.11356
Comments: Accepted at ECCV 2024, 18 pages, 7 figures

[16] GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views[cs.CV]
Authors: Boyao Zhou, Shunyuan Zheng, Hanzhang Tu, Ruizhi Shao, Boning Liu, Shengping Zhang, Liqiang Nie, Yebin Liu
Link: http://arxiv.org/abs/2411.11363
Comments: Journal extension of CVPR 2024. Project page: this https URL

[17] IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos[cs.CV]
Authors: Yunong Liu, Cristobal Eyzaguirre, Manling Li, Shubh Khanna, Juan Carlos Niebles, Vineeth Ravi, Saumitra Mishra, Weiyu Liu, Jiajun Wu
Link: http://arxiv.org/abs/2411.11409
Comments: NeurIPS 2024 Datasets and Benchmarks Track

[18] Generalizable Person Re-identification via Balancing Alignment and Uniformity[cs.CV]
Authors: Yoonki Cho, Jaeyoon Kim, Woo Jae Kim, Junsik Jung, Sung-eui Yoon
Link: http://arxiv.org/abs/2411.11471
Code: https://github.com/yoonkicho/BAU
Comments: NeurIPS 2024

[19] Equivariant spatio-hemispherical networks for diffusion MRI deconvolution[cs.CV]
Authors: Axel Elaldi, Guido Gerig, Neel Dey
Link: http://arxiv.org/abs/2411.11819
Code: https://github.com/AxelElaldi/fast-equivariant-deconv
Comments: Accepted to NeurIPS 2024. 24 pages with 13 figures.

Natural language processing conferences: 7 papers


[0] Hateful Meme Detection through Context-Sensitive Prompting and Fine-Grained Labeling[cs.CV]
Authors: Rongxin Ouyang, Kokil Jaidka, Subhayan Mukerjee, Guangyu Cui
Link: http://arxiv.org/abs/2411.10480
Comments: AAAI-25 Student Abstract, Oral Presentation

[1] Does Prompt Formatting Have Any Impact on LLM Performance?[cs.CL]
Authors: Jia He, Mukund Rungta, David Koleczek, Arshdeep Sekhon, Franklin X Wang, Sadid Hasan
Link: http://arxiv.org/abs/2411.10541
Comments: Submitted to NAACL 2025

[2] Hysteresis Activation Function for Efficient Inference[cs.CL]
Authors: Moshe Kimhi, Idan Kashani, Avi Mendelson, Chaim Baskin
Link: http://arxiv.org/abs/2411.10573
Comments: Accepted to the 4th NeurIPS Efficient Natural Language and Speech Processing Workshop (ENLSP-IV 2024)

[3] IntentGPT: Few-shot Intent Discovery with Large Language Models[cs.CL]
Authors: Juan A. Rodriguez, Nicholas Botzer, David Vazquez, Christopher Pal, Marco Pedersoli, Issam Laradji
Link: http://arxiv.org/abs/2411.10670
Comments: ICLR 2024 Workshop on LLM Agents

[4] Can Generic LLMs Help Analyze Child-adult Interactions Involving Children with Autism in Clinical Observation?[cs.CL]
Authors: Tiantian Feng, Anfeng Xu, Rimita Lahiri, Helen Tager-Flusberg, So Hyun Kim, Somer Bishop, Catherine Lord, Shrikanth Narayanan
Link: http://arxiv.org/abs/2411.10761
Comments: GenAI for Health Workshop, NeurIPS 2024

[5] Inter-linguistic Phonetic Composition (IPC): A Theoretical and Computational Approach to Enhance Second Language Pronunciation[cs.CL]
Authors: Jisang Park, Minu Kim, DaYoung Hong, Jongha Lee
Link: http://arxiv.org/abs/2411.10927
Comments: 10 pages, 6 figures, submitted to ACL ARR October 2024 for NAACL 2025

[6] FastDraft: How to Train Your Draft[cs.CL]
Authors: Ofir Zafrir, Igor Margulis, Dorin Shteyman, Guy Boudoukh
Link: http://arxiv.org/abs/2411.11055
Comments: ENLSP NeurIPS Workshop 2024

Thanks to arxiv.org

