title,authors,type,arxiv_id,arxiv,github,hf_paper,hf_space,hf_model,hf_dataset,n_authors,n_linked_authors Simoun: Synergizing Interactive Motion-appearance Understanding for Vision-based Reinforcement Learning,"Huang, Yangru; Peng, Peixi*; Zhao, Yifan; Zhai, Yunpeng; Xu, Haoran; Tian, Yonghong",poster,,,,,,,,, Among Us: Adversarially Robust Collaborative Perception by Consensus,"Li, Yiming; Fang, Qi; Bai, Jiamu; Chen, Siheng; Juefei-Xu, Felix; Feng, Chen*",poster,2303.09495,https://arxiv.org/abs/2303.09495,,https://huggingface.co/papers/2303.09495,,,,6,2 Walking Your LiDOG: A Journey Through Multiple Domains for LiDAR Semantic Segmentation,"Saltori, Cristiano*; Osep, Aljosa; Ricci, Elisa; Leal-Taixé, Laura",poster,2304.11705,https://arxiv.org/abs/2304.11705,,https://huggingface.co/papers/2304.11705,,,,4,1 Stabilizing Visual Reinforcement Learning via Asymmetric Interactive Cooperation,"Zhai, Yunpeng*; Peng, Peixi; Zhao, Yifan; Huang, Yangru; Tian, Yonghong",poster,,,,,,,,, MAAL: Multimodality-Aware Autoencoder-based Affordance Learning for 3D Articulated Objects,"liang, yuanzhi*; Wang, Xiaohan; Zhu, Linchao; Yang, Yi",poster,,,,,,,,, Rethinking Range View Representation for LiDAR Segmentation,"Kong, Lingdong*; Liu, Youquan; Chen, Runnan; Ma, Yuexin; Zhu, Xinge; HOU, Yuenan; Li, Yikang; Qiao, Yu; Liu, Ziwei",poster,2303.05367,https://arxiv.org/abs/2303.05367,,https://huggingface.co/papers/2303.05367,,,,9,1 PourIt!: Weakly-supervised Liquid Perception from a Single Image for Visual Closed-Loop Robotic Pouring,"Lin, Haitao*; Fu, Yanwei; Xue, Xiangyang",poster,2307.11299,https://arxiv.org/abs/2307.11299,,https://huggingface.co/papers/2307.11299,,,,3,0 CROSSFIRE: Camera Relocalization On Self-Supervised Features from an Implicit Representation,"Moreau, Arthur*; Piasco, Nathan; Bennehar, Moussab; Tsishkou, Dzmitry; Stanciulescu, Bogdan; de La Fortelle, Arnaud",poster,2303.04869,https://arxiv.org/abs/2303.04869,,https://huggingface.co/papers/2303.04869,,,,6,0 Environment 
Agnostic Representation for Visual Reinforcement learning,"Choi, Hyesong*; Lee, Hunsang; Jeong, Seongwon; Min, Dongbo",poster,,,,,,,,, Test-time Personalizable Forecasting of 3D Human Poses,"Cui, Qiongjie*; Sun, Huaijiang; Lu, Jianfeng; Li, Weiqing; Li, Bin; Wang, Haofan; Yi, Hongwei",poster,,,,,,,,, HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative perception with vision transformer,"Xiang, Hao; Xu, Runsheng; Ma, Jiaqi*",poster,,,,,,,,, Efficient neural supersampling on a novel gaming dataset,"Mercier, Antoine*; Erasmus, Ruan S; Savani, Yashesh ; Dhingra, Manik; Porikli, Fatih; Berger, Guillaume J. F.",poster,2308.01483,https://arxiv.org/abs/2308.01483,,https://huggingface.co/papers/2308.01483,,,,6,0 Locally Stylized Neural Radiance Fields,"Pang, Hong Wing*; Hua, Binh-Son; Yeung, Sai-Kit",poster,,,,,,,,, NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects,"Wang, Dongqing *; Zhang, Tong ; Süsstrunk, Sabine",poster,2303.11963,https://arxiv.org/abs/2303.11963,,https://huggingface.co/papers/2303.11963,,,,3,1 DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders,"Kang, Xiaoyang*; Yang, Tao; Ouyang, Wenqi; REN, PEIRAN; Li, Lingzhi; Xie, Xuansong",poster,,,,,,,,, IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis,"Ye, Weicai*; CHEN, SHUO; Bao, Chong; Bao, Hujun; Pollefeys, Marc; Cui, Zhaopeng; Zhang, Guofeng",poster,2210.00647,https://arxiv.org/abs/2210.00647,,https://huggingface.co/papers/2210.00647,,,,7,0 PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects,"Liu, Jiayi*; Mahdavi-Amiri, Ali; Savva, Manolis",poster,2308.07391,https://arxiv.org/abs/2308.07391,,https://huggingface.co/papers/2308.07391,,,,3,0 ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model,"Zhang, Mingyuan*; Guo, Xinying; Pan, Liang; Cai, Zhongang; Hong, Fangzhou; Li, Huirong; Yang, Lei; Liu, 
Ziwei",poster,2304.01116,https://arxiv.org/abs/2304.01116,,https://huggingface.co/papers/2304.01116,,,,8,0 DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion,"Tanveer, Maham*; Wang, Yizhi; Mahdavi-Amiri, Ali; Zhang, Hao",poster,,,,,,,,, Dynamic Mesh-Aware Radiance Fields,"Qiao, Yi-Ling*; Gao, Alexander; Xu, Yiran; Feng, Yue; Huang, Jia-Bin; Lin, Ming C",poster,,,,,,,,, Neural Reconstruction of Relightable Human Model from Monocular Video,"Wenzhang, Sun*; che, yunlong; Huang, Han; Guo, Yandong",poster,,,,,,,,, Neural Microfacet Fields for Inverse Rendering,"Mai, Alexander*; Verbin, Dor; Kuester, Falko; Fridovich-Keil, Sara",poster,2303.17806,https://arxiv.org/abs/2303.17806,,https://huggingface.co/papers/2303.17806,,,,4,0 A Theory of Topological Derivatives for Inverse Rendering of Geometry,"Mehta, Ishit*; Chandraker, Manmohan; Ramamoorthi, Ravi",poster,2308.09865,https://arxiv.org/abs/2308.09865,,https://huggingface.co/papers/2308.09865,,,,3,0 Vox-E: Text-guided Voxel Editing of 3D Objects,"Sella, Etai*; Fiebelman, Gal; Hedman, Peter; Averbuch-Elor, Hadar",poster,,,,,,,,, StegaNeRF: Embedding Invisible Information within Neural Radiance Fields,"Li, Chenxin*; Feng, Brandon Yushan; Fan, Zhiwen; Pan, Panwang; Wang, Zhangyang",poster,2212.01602,https://arxiv.org/abs/2212.01602,,https://huggingface.co/papers/2212.01602,,,,5,0 GlobalMapper: Arbitrary-Shaped Urban Layout Generation,"He, Liu*; Aliaga, Daniel",poster,2307.09693,https://arxiv.org/abs/2307.09693,,https://huggingface.co/papers/2307.09693,,,,2,0 Urban Radiance Field Representation with Deformable Neural Mesh Primitives,"Lu, Fan; Xu, Yan*; Chen, Guang; Li, Hongsheng; Lin, Kwan-Yee; Jiang, Changjun",poster,2307.10776,https://arxiv.org/abs/2307.10776,,https://huggingface.co/papers/2307.10776,,,,6,0 End2End Multi-View Feature Matching with Differentiable Pose Optimization,"Roessle, Barbara J*; Niessner, Matthias",poster,,,,,,,,, Tree-Structured Shading Decomposition,"Geng, Chen*; Yu, Hong-Xing; 
Zhang, Sharon; Agrawala, Maneesh; Wu, Jiajun",poster,,,,,,,,, Lens Parameter Estimation for Realistic Depth of Field Synthesis,"Piche-Meunier, Dominique; Hold-Geoffroy, Yannick; Zhang, Jianming; Lalonde, Jean-Francois*",poster,,,,,,,,, AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism,"Zhong, Chongyang*; Zhang, Zihao; Hu, Lei; Xia, Shihong",poster,,,,,,,,, Cross-modal Latent Space Alignment for Image to Avatar Translation,"Ladron de Guevara, Manuel*; Echevarria, Jose; Li, Yijun; Hold-Geoffroy, Yannick; Smith, Cameron Y; Ito, Daichi",poster,,,,,,,,, Computationally Efficient Neural Image Compression with Shallow Decoders,"Yang, Yibo*; Mandt, Stephan",poster,,,,,,,,, Enhancing Spatial and Semantic Supervision for Hybrid-Based 3D Instance Segmentation,"Al Khatib, Salwa K.; Boudjoghra, Mohamed El Amine; Lahoud, Jean*; Shahbaz Khan, Fahad",poster,,,,,,,,, Learning Neural Eigenfunctions for Unsupervised Semantic Segmentation,"Deng, Zhijie*; Luo, Yucen",poster,2304.02841,https://arxiv.org/abs/2304.02841,,https://huggingface.co/papers/2304.02841,,,,2,0 Divide and Conquer: 3D Point Cloud Instance Segmentation With Point-Wise Binarization,"Zhao, Weiguang; Yan, Yuyao; Yang, Chaolong; Ye, Jianan; Yang, Xi ; Huang, Kaizhu*",poster,2207.11209,https://arxiv.org/abs/2207.11209,,https://huggingface.co/papers/2207.11209,,,,6,0 Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport,"Li, Wentong ; Yuan, Yuqian; Wang, Song; Zhu, Jianke*; Li, Jianshu; Liu, Jian; Zhang, Lei",poster,2308.01779,https://arxiv.org/abs/2308.01779,https://github.com/LiWentomng/Point2Mask,https://huggingface.co/papers/2308.01779,,,,7,0 Handwritten and Printed Text Segmentation: A Signature Case Study,"Gholamian, Sina*; Vahdat, Ali",poster,2307.07887,https://arxiv.org/abs/2307.07887,,https://huggingface.co/papers/2307.07887,,,,2,0 Semantic-Aware Template Learning via Part Deformation Consistency,"Kim, Sihyeon*; Joo, Minseok; Lee, Jaewon; Ko, Juyeon; Cha, 
Juhan; Kim, Hyunwoo J",poster,,,,,,,,, LeaF: Learning Frames for 4D Point Cloud Sequence Understanding,"Liu, Yunze*; Chen, Junyu; Zhang, Zekai; Huang, Jingwei; Yi, Li",poster,,,,,,,,, MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation,"Jo, Sanghyun; YU, IN JAE; Kim, Kyungsu*",poster,2304.09913,https://arxiv.org/abs/2304.09913,,https://huggingface.co/papers/2304.09913,,,,3,0 USAGE: A Unified Seed Area Generation Paradigm for Weakly Supervised Semantic Segmentation,"peng, zelin*; Wang, Guanchun; Xie, Lingxi; Jiang, Dongsheng; Shen, Wei; Tian, Qi",poster,2303.07806,https://arxiv.org/abs/2303.07806,,https://huggingface.co/papers/2303.07806,,,,6,0 Production-level Video Segmentation From Few Annotated Frames,"Bermudez , Ariana M; Li, Hao; Bekuzarov, Maksym*; Lee, Joon-Young",poster,,,,,,,,, SIGMA: Scale-Invariant Global Sparse Shape Matching,"Gao, Maolin*; Roetzer, Paul; Eisenberger, Marvin; Laehner, Zorah; Moeller, Michael; Cremers, Daniel; Bernard, Florian",poster,2308.08393,https://arxiv.org/abs/2308.08393,,https://huggingface.co/papers/2308.08393,,,,7,0 Self-Calibrated Cross Attention Network for Few-Shot Segmentation,"Xu, Qianxiong; Zhao, Wenting; Lin, Guosheng; Long, Cheng*",poster,2308.09294,https://arxiv.org/abs/2308.09294,https://github.com/Sam1224/SCCAN,https://huggingface.co/papers/2308.09294,,,,4,0 Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation,"Li, Kehan*; Zhao, Yian; Wang, Zhennan; Cheng, Zesen; Jin, Peng; Ji, Xiangyang; Yuan, Li; Liu, Chang; Chen, Jie",poster,2303.13399,https://arxiv.org/abs/2303.13399,,https://huggingface.co/papers/2303.13399,,,,9,0 Texture Learning Domain Randomization for Domain Generalized Segmentation,"Kim, Sunghwan*; Kim, Dae-hwan; Kim, Hoseong",poster,2303.11546,https://arxiv.org/abs/2303.11546,https://github.com/ssssshwan/TLDR,https://huggingface.co/papers/2303.11546,,,,3,0 Unsupervised Video Object Segmentation with Online 
Adversarial Self-Tuning,"Su, Tiankang*; Song, Huihui; Liu, Dong; Liu, Bo; Liu, Qingshan",poster,,,,,,,,, Exploring Open-Vocabulary Semantic Segmentation without Human Labels,"Chen, Jun*; Zhu, Deyao; Qian, Guocheng; Ghanem, Bernard; Yan, Zhicheng; Zhu, Chenchen; Xiao, Fanyi; Elhoseiny, Mohamed; Culatana, Sean",poster,2306.00450,https://arxiv.org/abs/2306.00450,,https://huggingface.co/papers/2306.00450,,,,9,0 RbA: Segmenting Unknown Regions Rejected by All,"Nayal, Nazir*; YAVUZ, MISRA; Henriques, Joao F; Guney, Fatma",poster,2211.14293,https://arxiv.org/abs/2211.14293,,https://huggingface.co/papers/2211.14293,,,,4,1 SEMPART: Self-supervised Multi-resolution Partitioning of Image Semantics,"Ravindran, Sriram; Basu, Debraj D*",poster,,,,,,,,, Multi-Object Discovery by Low-Dimensional Object Motion,"Safadoust, Sadra*; Guney, Fatma",poster,2307.08027,https://arxiv.org/abs/2307.08027,,https://huggingface.co/papers/2307.08027,,,,2,0 MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory,"Li, Enxu*; Casas, Sergio; Urtasun, Raquel",poster,,,,,,,,, Treating Pseudo-labels Generation as Image Matting for Weakly Supervised Semantic Segmentation,"Wang, Changwei; Xu, Rongtao; Xu, Shibiao*; Meng, Weiliang; Zhang, Xiaopeng",poster,,,,,,,,, BoxSnake: Polygonal Instance Segmentation with Box Supervision,"Yang, Rui*; Song, Lin; Ge, Yixiao; Li, Xiu",poster,2303.11630,https://arxiv.org/abs/2303.11630,,https://huggingface.co/papers/2303.11630,,,,4,1 Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation,"Tang, Quan*; Zhang, Bowen bz; Liu, Jiajun; Liu, Fagui; Liu, Yifan",poster,2308.01045,https://arxiv.org/abs/2308.01045,,https://huggingface.co/papers/2308.01045,,,,5,0 Instance Neural Radiance Field,"Liu, Yichen*; Tai, Yu-Wing; Tang, Chi-Keung; Hu, Benran; Huang, Junkai",poster,2304.04395,https://arxiv.org/abs/2304.04395,https://github.com/lyclyc52/Instance_NeRF,https://huggingface.co/papers/2304.04395,,,,5,0 Global Knowledge Calibration for Fast 
Open-Vocabulary Segmentation,"Han, Kunyang*; Liu, Yong; Liew, Jun Hao; Ding, Henghui; Wei, Yunchao; Liu, Jiajun; Wang, Yitong; Tang, Yansong; Yang, Yujiu; Feng, Jiashi; Zhao, Yao",poster,2303.09181,https://arxiv.org/abs/2303.09181,,https://huggingface.co/papers/2303.09181,,,,11,0 Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation,"Duo, Peng; Hu, Ping; Ke, Qiuhong; Liu, Jun*",poster,2308.12350,https://arxiv.org/abs/2308.12350,,https://huggingface.co/papers/2308.12350,,,,4,0 Boosting Semantic Segmentation from an Explicit Class Embedding’s Perspective,"Liu, Yuhe*; Liu, Chuanjian; Han, Kai; Tang, Quan; Qin, Zengchang",poster,,,,,,,,, The Making and Breaking of Camouflage,"Lamdouar, Hala*; Xie, Weidi; Zisserman, Andrew",poster,,,,,,,,, CoinSeg: Contrast Inter- and Intra- Class Representations for Incremental Segmentation,"Zhang, Zekang; Gao, Guangyu Ryan*; Jiao, Jianbo; Wei, Yunchao; Liu, Chi Harold",poster,,,,,,,,, Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation,"Liu, Xueyi*; Wang, Bin; Wang, He; Yi, Li",poster,2308.10898,https://arxiv.org/abs/2308.10898,,https://huggingface.co/papers/2308.10898,,,,4,0 HAL3D: Hierarchical Active Learning for Fine-Grained 3D Part Labeling,"YU, FENGGEN; Qian, Yiming*; Gil-Ureta, Francisca T; Jackson, Brian P; Bennett, Eric P; Zhang, Hao",poster,2301.10460,https://arxiv.org/abs/2301.10460,,https://huggingface.co/papers/2301.10460,,,,6,0 FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation,"Shi, Tianyi*; Ding, Xiaohuan; zhang, liang; Yang, Xin",poster,2307.07245,https://arxiv.org/abs/2307.07245,https://github.com/TY-Shi/FreeCOS,https://huggingface.co/papers/2307.07245,,,,4,0 MasQCLIP for Open-Vocabulary Universal Image Segmentation,"Xu, Xin; Xiong, Tianyi; Ding, Zheng*; Tu, Zhuowen",poster,,,,,,,,, CTVIS: Consistent Training for Online Video Instance Segmentation,"Ying, Kaining; Zhong, Qing; Mao, Weian; Wang, 
Zhenhua*; Chen, Hao; Wu, Lin Yuanbo; Liu, Yifan; Fan, Chengxiang; Zhuge, Yunzhi; Shen, Chunhua",poster,2307.12616,https://arxiv.org/abs/2307.12616,,https://huggingface.co/papers/2307.12616,,,,10,1 A Simple Framework for Panoptic Segmentation,"Chen, Ting*; Li, Lala; Saxena, Saurabh; Hinton, Geoffrey; Fleet, David J",poster,,,,,,,,, Spectrum-guided Multi-granularity Referring Video Object Segmentation,"Miao, Bo*; Bennamoun, Mohammed; Gao, Yongsheng; Mian, Ajmal",poster,2307.13537,https://arxiv.org/abs/2307.13537,https://github.com/bo-miao/SgMg,https://huggingface.co/papers/2307.13537,,,,4,0 Space Engage: Collaborative Space Supervision for Contrastive-based Semi-Supervised Semantic Segmentation,"Wang, Changqi*; Xie, Haoyu; Yuan, Yuhui; Fu, Chong; Yue, Xiangyu",poster,2307.09755,https://arxiv.org/abs/2307.09755,,https://huggingface.co/papers/2307.09755,,,,5,1 Adaptive Superpixel for Active Learning in Semantic Segmentation,"Kim, Hoyoung*; Oh, Minhyeon; Hwang, Sehyun; Kwak, Suha; Ok, Jungseul",poster,2303.16817,https://arxiv.org/abs/2303.16817,,https://huggingface.co/papers/2303.16817,,,,5,0 Multimodal Variational Auto-encoder based Audio-Visual Segmentation,"Mao, Yuxin; Zhang, Jing; Xiang, Mochu; Zhong, Yiran; Dai, Yuchao*",poster,,,,,,,,, Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation,"Yuan, Yichen*; Wang, Yifan; Wang, Lijun; Zhao, Xiaoqi; Lu, Huchuan; Wang, Yu; su, weibo; Zhang, Lei",poster,2308.06693,https://arxiv.org/abs/2308.06693,https://github.com/DLUT-yyc/Isomer,https://huggingface.co/papers/2308.06693,,,,8,0 2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision,"Yang, Cheng-Kun*; Chen, Min-Hung; Chuang, Yung-Yu; Lin, Yen-Yu",poster,,,,,,,,, Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models,"Dombrowski, Mischa; Reynaud, Hadrien; Baugh, Matthew M G; Kainz, Bernhard*",poster,,,,,,,,, SegPrompt: Boosting Open-World Segmentation via Category-level Prompt 
Learning,"Zhu, Muzhi*; Li, Hengtao; Chen, Hao; Fan, Chengxiang; Mao, Weian; Jing, Chenchen; Liu, Yifan; Shen, Chunhua",poster,2308.06531,https://arxiv.org/abs/2308.06531,,https://huggingface.co/papers/2308.06531,,,,8,0 Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection,"Li, Boyang*; Wang, Yingqian; Wang, Longguang; Zhang, Fei; Liu, Ting; Lin, Zaiping; An, Wei; Guo, Yulan",poster,2304.04442,https://arxiv.org/abs/2304.04442,https://github.com/YeRen123455/SIRST-Single-Point-Supervision,https://huggingface.co/papers/2304.04442,,,,8,0 A Simple Framework for Open-Vocabulary Segmentation and Detection,"Zhang, Hao; Li, Feng*; Zou, Xueyan; Liu, Shilong; Li, Chunyuan; Yang, Jianwei; Zhang, Lei",poster,2303.08131,https://arxiv.org/abs/2303.08131,,https://huggingface.co/papers/2303.08131,,,,8,0 Source-free Depth for Object Pop-out,"WU, Zongwei; Paudel, Danda Pani; Fan, Deng-Ping*; Wang, Jingjing; Wang, Shuo; Demonceaux, Cedric; Timofte, Radu; Van Gool, Luc",poster,2212.05370,https://arxiv.org/abs/2212.05370,,https://huggingface.co/papers/2212.05370,,,,8,0 DynaMITe: Dynamic Query Bootstrapping for Multi-object Interactive Segmentation Transformer,"Rana, Amit Kumar; Mahadevan, Sabarinath*; Hermans, Alexander; Leibe, Bastian",poster,2304.06668,https://arxiv.org/abs/2304.06668,,https://huggingface.co/papers/2304.06668,,,,4,0 Atmospheric Transmission and Thermal Inertia Induced Blind Road Segmentation with a Large-Scale Dataset TBRSD,"Chen, Junzhang*; Bai, Xiangzhi",poster,,,,,,,,, Informative Data Mining for One-shot Cross-Domain Semantic Segmentation,"Wang, yuxi*; Liang, Jian; mei, shuqi; yang, yuran; Xiao, Jun; Zhang, Zhaoxiang",poster,,,,,,,,, Homography Guided Temporal Fusion for Road Line and Marking Segmentation,"Wang, Shan*; Nguyen, Chuong; Liu, Jiawei; Zhang, Kaihao; Luo, Wenhan; Zhang, Yanhao; Muthu, Sundaram; Afzal Maken, Fahira; LI, HONGDONG",poster,,,,,,,,, Zero-Shot Semantic Segmentation with Decoupled One-Shot 
Network,"Han, Cong*; Zhong, Yujie; Han, Kai; Dengjie, Li; Ma, Lin",poster,,,,,,,,, TCOVIS: Temporally consistent online video instance segmentation,"Li, Junlong; Yu, Bingyao; Rao, Yongming; Zhou, Jie; Lu, Jiwen*",poster,,,,,,,,, FPR: False Positive Rectification for Weakly Supervised Semantic Segmentation,"Chen, Liyi*; Lei, Chenyang; Li, Ruihuang; LI, Shuai; Zhang, Zhaoxiang; Zhang, Lei",poster,,,,,,,,, Stochastic Segmentation with Conditional Categorical Diffusion Models,"Zbinden, Lukas*; Doorenbos, Lars; Pissas, Theodoros; Huber, Adrian Thomas; Sznitman, Raphael; Márquez Neila, Pablo",poster,2303.08888,https://arxiv.org/abs/2303.08888,,https://huggingface.co/papers/2303.08888,,,,6,1 Segmenting Everything In Context,"Wang, Xinlong*; Zhang, Xiaosong; Cao, Yue; Wang, Wen; Shen, Chunhua; Huang, Tiejun",poster,,,,,,,,, Open-vocabulary Panoptic Segmentation with Embedding Modulation,"CHEN, Xi*; Li, Shuang; Lim, Ser-Nam; Torralba, Antonio; Zhao, Hengshuang",poster,2303.11324,https://arxiv.org/abs/2303.11324,,https://huggingface.co/papers/2303.11324,,,,5,0 Residual Pattern Learning for Pixel-wise Out-of-Distribution Detection in Semantic Segmentation,"Liu, Yuyuan*; Ding, Choubo; Tian, Yu; Pang, Guansong; Belagiannis, Vasileios; Reid, Ian; Carneiro, Gustavo",poster,2211.14512,https://arxiv.org/abs/2211.14512,https://github.com/yyliu01/RPL,https://huggingface.co/papers/2211.14512,,,,7,0 Zero-guidance Segmentation Using Zero Segment Labels,"Rewatbowornwong, Pitchaporn*; Chatthee, Nattanat; Chuangsuwanich, Ekapol; Suwajanakorn, Supasorn",poster,2303.13396,https://arxiv.org/abs/2303.13396,,https://huggingface.co/papers/2303.13396,,,,4,0 Model Calibration in Dense Classification with Adaptive Label Perturbation,"Liu, Jiawei*; Ye, Changkun; Wang, Shan; Cui, Ruikai; Zhang, Jing; Zhang, Kaihao; Barnes, Nick",poster,2307.13539,https://arxiv.org/abs/2307.13539,https://github.com/Carlisle-Liu/ASLP,https://huggingface.co/papers/2307.13539,,,,7,0 Enhanced Soft Label for 
Semi-Supervised Semantic Segmentation,"Ma, Jie; Wang, Chuan; Liu, Yang; Lin, Liang; Li, Guanbin*",poster,,,,,,,,, MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation,"Cai, Kaixin*; Ren, Pengzhen; Zhu, Yi; Xu, Hang; Liu, Jianzhuang; Li, Changlin; Wang, Guangrun; Liang, Xiaodan",poster,2308.04829,https://arxiv.org/abs/2308.04829,,https://huggingface.co/papers/2308.04829,,,,8,0 DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models,"Wu, Weijia*; Zhao, Yuzhong; Shou, Mike Zheng; ZHOU, HONG; Shen, Chunhua",poster,2303.11681,https://arxiv.org/abs/2303.11681,,https://huggingface.co/papers/2303.11681,,,,5,0 Alignment Before Aggregation: Trajectory Memory Retrieval Network for Video Object Segmentation,"Sun, Rui*; Wang, Yuan; Mai, Huayu; Zhang, Tianzhu; Wu, Feng",poster,,,,,,,,, Semi-Supervised Semantic Segmentation under Label Noise via Diverse Learning Groups,"Li, Peixia*; Purkait, Pulak; Ajanthan, Thalaiyasingam; Abdolshah, Majid; Garg, Ravi; Husain, Hisham; Xu, Chenchen; Gould, Stephen; Ouyang, Wanli; van den Hengel, Anton",poster,,,,,,,,, SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets,"Simons, Cody M*; Raychaudhuri, Dripta S.; AHMED, SK MIRAJ; You, Suya; Karydis, Konstantinos; Roy-Chowdhury, Amit K. 
",poster,2308.11880,https://arxiv.org/abs/2308.11880,https://github.com/csimo005/SUMMIT,https://huggingface.co/papers/2308.11880,,,,6,0 Class-incremental Continual Learning for Instance Segmentation with Image-level Weak Supervision,"Hsieh, Yu-Hsing*; Chen, Guan-Sheng; Cai, Shun-Xian; Wei, Ting-Yun; Yang, Huei-Fang; Chen, Chu-Song",poster,,,,,,,,, Coarse-to-Fine Amodal Segmentation with Shape Prior,"Gao, Jianxiong; Qian, Xuelin*; Fu, Yanwei; Wang, Yikai; Xiao, Tianjun; Zhang, Zheng; He, Tong",poster,2308.16825,https://arxiv.org/abs/2308.16825,,https://huggingface.co/papers/2308.16825,,,,7,1 Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation,"Fan, Ke; Lei, Jingshi; Qian, Xuelin*; Yu, Miaopeng; Zhang, Zheng; He, Tong; Xiao, Tianjun; Fu, Yanwei",poster,,,,,,,,, DVIS: Decoupled Video Instance Segmentation Framework,"Zhang, Tao*; tian, xingye; Wu, Yu; Ji, Shunping; Wang, Xuebo; Zhang, Yuan; Wan, Pengfei ",poster,2306.03413,https://arxiv.org/abs/2306.03413,,https://huggingface.co/papers/2306.03413,,,,7,0 3D Segmentation of Humans in Point Clouds with Synthetic Data,"Takmaz, Ayca*; Schult, Jonas; Kaftan, Irem; Akcay, Cafer Mertcan; Leibe, Bastian; Sumner, Robert W; Engelmann, Francis; Tang, Siyu",poster,2212.00786,https://arxiv.org/abs/2212.00786,,https://huggingface.co/papers/2212.00786,,,,8,1 WaterMask: Instance Segmentation for Underwater Imagery,"Lian, Shijie*; Li, Hua; Cong, Runmin; Li, Suqi; Zhang, Wei; Kwong, Sam",poster,,,,,,,,, Decoupled or End-to-End Trained Video Segmentation if Target Data is Scarce?,"Cheng, Ho Kei*; Oh, Seoung Wug; Price, Brian; Schwing, Alexander; Lee, Joon-Young",poster,,,,,,,,, Cross Contrasting Feature Perturbation for Domain Generalization,"Li, Chenming*; Zhang, Daoan; Huang, Wenjian; Zhang, Jianguo",poster,2307.12502,https://arxiv.org/abs/2307.12502,,https://huggingface.co/papers/2307.12502,,,,4,0 Flexible Visual Recognition by Evidential Modeling of Confusion and Ignorance,"Fan, 
Lei*; Liu, Bo; Li, Haoxiang; Wu, Ying; Hua, Gang",poster,,,,,,,,, CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification,"Abdelfattah, Rabab*; Guo, Qing; Li, Xiaoguang; Wang, XIAOFENG; Wang, Song",poster,2307.16634,https://arxiv.org/abs/2307.16634,,https://huggingface.co/papers/2307.16634,,,,5,0 RankMixup: Ranking-based Mixup Training for Network Calibration,"Noh, Jongyoun; Park, Hyekang; Lee, Junghyup; Ham, Bumsub*",poster,2308.11990,https://arxiv.org/abs/2308.11990,,https://huggingface.co/papers/2308.11990,,,,4,0 Label-Noise Learning with Intrinsically Long-Tailed Data,"Lu, Yang*; Zhang, Yiliang; Han, Bo; CHEUNG, Yiu-ming; Wang, Hanzi",poster,2208.09833,https://arxiv.org/abs/2208.09833,https://github.com/Wakings/TABASCO,https://huggingface.co/papers/2208.09833,,,,5,0 Parallel Attention Interaction Network for Few-Shot Skeleton-based Action Recognition,"Liu, Xingyu; Zhou, Sanping*; Wang, Le; Hua, Gang",poster,,,,,,,,, Rethinking Mobile Block for Efficient Attention-based Models,"Zhang, Jiangning*; Li, Xiangtai; Li, Jian; Liu, Liang; Zhang, Boshen; Jiang, ZhengKai; Huang, Tianxin; Xue, Zhucun; Wang, Yabiao; Wang, Chengjie",poster,2301.01146,https://arxiv.org/abs/2301.01146,,https://huggingface.co/papers/2301.01146,,,,10,0 Read-only Prompt Optimization for Vision-Language Few-shot Learning,"Lee, DongJun*; Song, Seokwon; Suh, Jihee; Choi, Joonmyung; Lee, Sanghyeok; Kim, Hyunwoo J",poster,2308.14960,https://arxiv.org/abs/2308.14960,https://github.com/mlvlab/RPO,https://huggingface.co/papers/2308.14960,,,,6,0 Understanding Self-attention Mechanism via Dynamical System Perspective,"Huang, Zhongzhan; Liang, Mingfu; Qin, Jinghui; Zhong, Shanshan; Lin, Liang*",poster,2308.09939,https://arxiv.org/abs/2308.09939,,https://huggingface.co/papers/2308.09939,,,,5,0 Learning in Imperfect Environment: Multi-Label Classification with Long-Tailed Distribution and Partial Labels,"Zhang, Wenqiao*; LIU, CHANGSHUO; Ooi, Beng Chin; Tang, Siliang; Zhuang, 
Yueting",poster,2304.10539,https://arxiv.org/abs/2304.10539,,https://huggingface.co/papers/2304.10539,,,,6,0 What do neural networks learn in image classification? A frequency shortcut perspective,"Wang, Shunxin*; Veldhuis, Raymond; Brune, Christoph; Strisciuglio, Nicola",poster,2307.09829,https://arxiv.org/abs/2307.09829,,https://huggingface.co/papers/2307.09829,,,,4,0 Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity,"Liang, Tong*; Davis, Jim",poster,2303.05689,https://arxiv.org/abs/2303.05689,https://github.com/ltong1130ztr/HAFrame,https://huggingface.co/papers/2303.05689,,,,2,0 Unified Out-Of-Distribution Detection: A Model-Specific Perspective,"Averly, Muhammad Reza; Chao, Wei-Lun*",poster,2304.06813,https://arxiv.org/abs/2304.06813,,https://huggingface.co/papers/2304.06813,,,,2,0 A Unified Framework for Robustness on Diverse Sampling Errors,"Jeon, Myeongho; Kang, Myungjoo; Lee, Joonseok*",poster,,,,,,,,, Scene-Aware Label Graph Learning for Multi-Label Image Classification,"Zhu, Xuelin*; Cao, Jiuxin; Liu, Weijia; Ge, Jiawei; Liu, Jian; Liu, Bo",poster,,,,,,,,, Holistic Label Correction for Noisy Multi-Label Classification,"Xia, Xiaobo*; Deng, Jiankang; Bao, Wei; Du, Yuxuan; Han, Bo; Shan, Shiguang; Liu, Tongliang",poster,,,,,,,,, Strip-MLP: Efficient Token Interaction for Vision MLP,"Cao, Guiping*; Luo, Shengda; Huang, Wenjian; Lan, Xiangyuan; Jiang, Dongmei; Wang, Yaowei; Zhang, Jianguo",poster,,,,,,,,, EQ-Net: Elastic Quantization Neural Networks,"Xu, Ke; han, lei; Tian, Ye; Yang, Shangshang*; Zhang, Xingyi",poster,,,,,,,,, Data-free Knowledge Distillation for Fine-grained Vision Categorization,"shao, renrong; Zhang, Wei*; Yin, Jianhua; Wang, Jun",poster,,,,,,,,, Shift from texture-bias to shape-bias: edge deformation-based augmentation for robust object recognition,"He, Xilin; Lin, Qinliang; Luo, Cheng; Xie, Weicheng*; Song, Siyang; Liu, Feng; Shen, Linlin",poster,,,,,,,,, "Latent-OFER: Detect, Mask, and Reconstruct 
with Latent Vectors for Occluded Facial Expression Recognition","lee, isack; Lee, Eungi; Yoo, Seok Bong*",poster,,,,,,,,, DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration,"Zhou, Nan*; Chen, Jiaxin; Huang, Di",poster,,,,,,,,, Understanding the Feature Norm for Out-of-Distribution Detection,"Park, Jaewoo*; Chai , Jacky Chen Long; Yoon, Jaeho; Teoh, Andrew Beng Jin",poster,,,,,,,,, Multi-view Active Fine-grained Visual Recognition,"Du, Ruoyi; Yu, Wenqing; Wang, Heqing; Lin, Ting-En; Chang, Dongliang*; Ma, Zhanyu",poster,,,,,,,,, DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models,"Gao, Ruiyuan*; ZHAO, Chenchen; Hong, Lanqing; Xu, Qiang",poster,2308.07687,https://arxiv.org/abs/2308.07687,,https://huggingface.co/papers/2308.07687,,,,4,0 Task-aware Adaptive Learning for Cross-domain Few-shot Learning,"Guo, Yurong; Du, Ruoyi; Dong, Yuan; Hospedales, Timothy; Song, Yi-Zhe; Ma, Zhanyu*",poster,,,,,,,,, Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting,"Huang, Qidong*; Dong, Xiaoyi; Chen, Dongdong; Chen, Yinpeng; Yuan, Lu; Hua, Gang; Zhang, Weiming; Yu, Nenghai",poster,2308.10315,https://arxiv.org/abs/2308.10315,https://github.com/shikiw/RobustMAE,https://huggingface.co/papers/2308.10315,,,,8,0 Saliency Regularization for Self-Training with Partial Annotations,"Wang, Shouwen; Wan, Qian; Xiang, Xiang*; Zeng, Zhigang",poster,,,,,,,,, Learning Gabor Texture Features for Fine-Grained Recognition,"Zhu, Lanyun*; Chen, Tianrun; Yin, Jianxiong (Terry); See, Simon; Liu, Jun",poster,2308.05396,https://arxiv.org/abs/2308.05396,,https://huggingface.co/papers/2308.05396,,,,5,0 UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding,"Li, Kunchang*; Wang, Yali; He, Yinan; Li, Yizhuo; Wang, Yi; Wang, Limin; Qiao, Yu",poster,,,,,,,,, RankMatch: Fostering Confidence and Consistency in Learning with Noisy 
Labels,"Zhang, Ziyi; Chen, Weikai; Fang, Chaowei; Li, Zhen; Cheng, Lechao; Lin, Liang; Li, Guanbin*",poster,,,,,,,,, MetaGCD: Learning to Continually Learn in Generalized Category Discovery,"Wu, Yanan; Chi, Zhixiang; Wang, Yang; Feng, Songhe*",poster,2308.11063,https://arxiv.org/abs/2308.11063,,https://huggingface.co/papers/2308.11063,,,,4,0 FerKD: Surgical Label Adaptation for Efficient Distillation,"Shen, Zhiqiang*",poster,,,,,,,,, "Point-Query Quadtree for Crowd Counting, Localization, and More","Liu, Chengxin; Lu, Hao; Cao, Zhiguo*; Liu, Tongliang",poster,2308.13814,https://arxiv.org/abs/2308.13814,https://github.com/cxliu0/PET,https://huggingface.co/papers/2308.13814,,,,4,0 Nearest Neighbor Guidance for Out-of-Distribution Detection,"Park, Jaewoo*; Teoh, Andrew Beng Jin; Jung, Yoon Gyo",poster,,,,,,,,, Bayesian Optimization Meets Self-Distillation,"Lee, HyunJae*; Song, Heon; Lee, Hyeonsoo; Lee, Gi-hyeon; Park, Suyeong; Yoo, Donggeun",poster,2304.12666,https://arxiv.org/abs/2304.12666,,https://huggingface.co/papers/2304.12666,,,,6,0 When Prompt-based Incremental Learning Does Not Meet Strong Pretraining,"Tang, Yu-Ming; Peng, Yi-Xing; ZHENG, WEI-SHI*",poster,2308.10445,https://arxiv.org/abs/2308.10445,https://github.com/TOM-tym/APG,https://huggingface.co/papers/2308.10445,,,,3,0 When to Learn What: Model-Adaptive Data Augmentation Curriculum,"Hou, Chengkai*; Zhang, Jieyu; Zhou, Tianyi",poster,,,,,,,,, Parametric Information Maximization for Generalized Category Discovery,"Chiaroni, Florent*; Dolz, Jose; Ziko, Imtiaz Masud; Mitiche, Amar; Ben Ayed, Ismail",poster,2212.00334,https://arxiv.org/abs/2212.00334,,https://huggingface.co/papers/2212.00334,,,,5,0 Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching,"Xing, Jiazheng*; Wang, Mengmeng; Ruan, Yudi; Chen, Bofan; Guo, Yaowei; Mu, Boyu; Dai, Guang; Wang, Jingdong; Liu, 
Yong",poster,2308.09346,https://arxiv.org/abs/2308.09346,https://github.com/jiazheng-xing/GgHM,https://huggingface.co/papers/2308.09346,,,,9,0 Domain Generalization via Rationale Invariance,"Chen, Liang*; Zhang, Yong; Song, Yibing; van den Hengel, Anton; Liu, Lingqiao",poster,2308.11158,https://arxiv.org/abs/2308.11158,https://github.com/liangchen527/RIDG,https://huggingface.co/papers/2308.11158,,,,5,0 Masked Spiking Transformer,"Wang, Ziqing*; Fang, Yuetong; Cao, Jiahang; Zhang, Qiang; Wang, Zhongrui; Xu, Renjing",poster,2210.01208,https://arxiv.org/abs/2210.01208,,https://huggingface.co/papers/2210.01208,,,,6,0 Prototype Reminiscence and Augmented Asymmetric Knowledge Aggregation for Non-Exemplar Class-Incremental Learning,"Shi, Wuxuan; Ye, Mang*",poster,,,,,,,,, Distilled Reverse Attention Network for Open-world Compositional Zero-Shot Learning,"Li, Yun*; Liu, Zhe; Jha, Saurav; Yao, Lina",poster,2303.00404,https://arxiv.org/abs/2303.00404,,https://huggingface.co/papers/2303.00404,,,,5,0 Candidate-aware Selective Disambiguation Based On Normalized Entropy for Instance-dependent Partial-label Learning,"He, Shuo; Feng, Lei; Yang, Guowu*",poster,,,,,,,,, CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No,"Wang, Hualiang*; Li, Yi; Yao, Huifeng; Li, Xiaomeng",poster,2308.12213,https://arxiv.org/abs/2308.12213,https://github.com/xmed-lab/CLIPN,https://huggingface.co/papers/2308.12213,,,,4,0 Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search,"Wang, Benzhi*; Yang, Yang; wu, jinlin; Qi, Guo-Jun; Lei, Zhen",poster,2302.12986,https://arxiv.org/abs/2302.12986,,https://huggingface.co/papers/2302.12986,,,,5,0 Sample-wise Label Confidence Incorporation for Learning with Noisy Labels,"Ahn, Chanho*; Kim, Kikyung; Baek, Ji-won; Lim, Jongin; Han, Seungju",poster,,,,,,,,, Combating Noisy Labels with Sample Selection by Mining High-Discrepancy Examples,"Xia, Xiaobo*; Han, Bo; Zhan, Yibing; Yu, Jun; Gong, Mingming; Gong, Chen; Liu, 
Tongliang",poster,,,,,,,,, Spatial-Aware Token for Weakly Supervised Object Localization,"pingyu, wu; Zhai, Wei; Cao, Yang*; Luo, Jiebo; Zha, Zheng-Jun",poster,2303.10438,https://arxiv.org/abs/2303.10438,https://github.com/wpy1999/SAT,https://huggingface.co/papers/2303.10438,,,,5,0 Towards Improved Input Masking for Convolutional Neural Networks,"Balasubramanian, Sriram*; Feizi, Soheil",poster,2211.14646,https://arxiv.org/abs/2211.14646,https://github.com/SriramB-98/layer_masking,https://huggingface.co/papers/2211.14646,,,,2,1 PDiscoNet: Semantically consistent part discovery for fine-grained recognition,"van der Klis, Robert D; Alaniz, Stephan; Mancini, Massimiliano; Dantas, Cassio F.; Ienco, Dino; Akata, Zeynep; Marcos, Diego*",poster,,,,,,,,, Corrupting Neuron Explanations of Deep Visual Features,"Srivastava, Divyansh*; Oikarinen, Tuomas; Weng, Lily",poster,,,,,,,,, ICICLE: Interpretable Class Incremental Continual Learning,"Rymarczyk, Dawid Damian*; van de Weijer, Joost; Zieliński, Bartosz; Twardowski, Bartlomiej",poster,2303.07811,https://arxiv.org/abs/2303.07811,,https://huggingface.co/papers/2303.07811,,,,4,1 ProbVLM: Probabilistic Adapter for Frozen Vison-Language Models,"Upadhyay, Uddeshya*; Karthik, Shyamgopal; Mancini, Massimiliano; Akata, Zeynep",poster,2307.00398,https://arxiv.org/abs/2307.00398,,https://huggingface.co/papers/2307.00398,,,,4,0 Out-of-Distribution Detection for Monocular Depth Estimation,"Hornauer, Julia*; Holzbock, Adrian; Belagiannis, Vasileios",poster,2308.06072,https://arxiv.org/abs/2308.06072,,https://huggingface.co/papers/2308.06072,,,,3,0 Using Explanations to Guide Models,"Rao, Sukrut*; Böhle, Moritz; Parchami-Araghi, Amin; Schiele, Bernt",poster,2303.11932,https://arxiv.org/abs/2303.11932,,https://huggingface.co/papers/2303.11932,,,,4,1 Rosetta Neurons: Mining the Common Units in a Model Zoo,"Dravid, Amil; Gandelsman, Yossi*; Efros, Alexei A; Shocher, 
Assaf",poster,2306.09346,https://arxiv.org/abs/2306.09346,,https://huggingface.co/papers/2306.09346,,,,4,0 Protoype-based Dataset Comparison,"Noord, Nanne van*",poster,,,,,,,,, Learning to Identify Critical States for Reinforcement Learning from Videos,"Liu, Haozhe; Zhuge, Mingchen; Li, Bing*; Wang, Yuhui; Faccio, Francesco; Ghanem, Bernard; Schmidhuber, Jürgen ",poster,2308.07795,https://arxiv.org/abs/2308.07795,https://github.com/AI-Initiative-KAUST/VideoRLCS,https://huggingface.co/papers/2308.07795,,,,7,1 Leaping Into Memories: Space-Time Deep Feature Synthesis,"Stergiou, Alexandros*; Deligiannis, Nikos",poster,2303.09941,https://arxiv.org/abs/2303.09941,,https://huggingface.co/papers/2303.09941,,,,2,0 MAGI: Multi-Annotated Explanation-Guided Learning,"Zhang, Yifei; Gu, Siyi; Gao, Yuyang; Pan, Bo; Yang, Xiaofeng; Zhao, Liang*",poster,,,,,,,,, SAFARI: Versatile and Efficient Evaluations for Robustness of Interpretability,"Huang, Wei*; Zhao, Xingyu; Jin, Gaojie; Huang, Xiaowei",poster,2208.09418,https://arxiv.org/abs/2208.09418,,https://huggingface.co/papers/2208.09418,,,,4,0 Do BLIP and Stable Diffusion Understand Each Other?,"Li, Hang*; Gu, Jindong; Koner, Rajat; Sharifzadeh, Sahand; Tresp, Volker",poster,,,,,,,,, Evaluation and Improvement of Interpretability for Self-Explainable Part-Prototype Networks,"Huang, Qihan*; Xue, Mengqi; Huang, Wenqi; Zhang, Haofei; Song, Jie; Jing, Yongcheng; Song, Mingli",poster,2212.05946,https://arxiv.org/abs/2212.05946,https://github.com/hqhQAQ/EvalProtoPNet,https://huggingface.co/papers/2212.05946,,,,7,0 MoreauGrad: Sparse and Robust Interpretation of Neural Networks via Moreau Envelope,"Zhang, Jingwei; Farnia, Farzan*",poster,2302.05294,https://arxiv.org/abs/2302.05294,,https://huggingface.co/papers/2302.05294,,,,2,0 Towards Understanding the Generalization of Deepfake Detectors from a Game-Theoretical View,"yao, kelu; Wang, Jin; diao, boyu; li, chao*",poster,,,,,,,,, Counterfactual-based Saliency Map:Towards Visual 
Contrastive Explanations for Neural Networks,"Wang, Xue*; Wang, Zhibo; Weng, Haiqin; Zhang, Zhifei; Guo, Hengchang; jin, lu; Wei, Tao; Ren, Kui",poster,,,,,,,,, Beyond Single Path Integrated Gradients for Reliable Input Attribution via Randomized Path Sampling,"Jeon, Giyoung; Jeong, Haedong; Choi, Jaesik*",poster,,,,,,,,, Learning Support and Trivial Prototypes for Interpretable Image Classification,"Wang, Chong*; liu, yuyuan; Chen, Yuanhong; Liu, Fengbei; Tian, Yu; McCarthy, Davis J; Frazer, Helen; Carneiro, Gustavo",poster,2301.04011,https://arxiv.org/abs/2301.04011,,https://huggingface.co/papers/2301.04011,,,,8,0 Visual Explanations via Iterated Integrated Gradients,"Barkan, Oren*; Elisha, Yoni; Asher, Yuval; Eshel, Amit ; Koenigstein, Noam",poster,,,,,,,,, Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models,"Liu, Nan; Du, Yilun*; Li, Shuang; Tenenbaum, Joshua; Torralba, Antonio",poster,2306.05357,https://arxiv.org/abs/2306.05357,,https://huggingface.co/papers/2306.05357,,,,5,3 Better Aligning Text-to-Image Models with Human Preference,"Wu, Xiaoshi*; Sun, Keqiang; Zhu, Feng; Zhao, Rui; Li, Hongsheng",poster,,,,,,,,, DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer,"Levi, Elad*; Brosh, Eli; Mykhailych, Mykola; Perez, Meir M",poster,2303.03755,https://arxiv.org/abs/2303.03755,,https://huggingface.co/papers/2303.03755,,,,4,0 Anti-DreamBooth: Protecting users from personalized text-to-image synthesis,"Tran, Anh T*; Le, Thanh Van; Dao, Quan; Phung, Hao Tien; Nguyen, Thuan Hoang ; Tran, Ngoc N",poster,,,,,,,,, GECCO: Geometrically-Conditioned Point Diffusion Models,"Tyszkiewicz, Michał 
J*; Fua, Pascal; Trulls, Eduard",poster,2303.05916,https://arxiv.org/abs/2303.05916,,https://huggingface.co/papers/2303.05916,,,,3,0 DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models,"Cai, Shengqu*; Chan, Eric; Peng, Songyou; Shahbazi, Mohamad; Obukhov, Anton; Van Gool, Luc; Wetzstein, Gordon",poster,2211.12131,https://arxiv.org/abs/2211.12131,,https://huggingface.co/papers/2211.12131,,,,7,0 Controllable Human Motion Synthesis via Guided Diffusion Models,"Karunratanakul, Korrawe*; Preechakul, Konpat; Suwajanakorn, Supasorn; Tang, Siyu",poster,,,,,,,,, COOP: Decoupling and Coupling of Whole-Body Grasping Pose Generation,"Zheng, Yanzhao*; Shi, Yunzhou; Cui, Yuhao; Zhao, Zhongzhou; Luo, Zhiling; Zhou, Wei",poster,,,,,,,,, Zero-shot spatial layout conditioning for text-to-image diffusion models,"Couairon, Guillaume; Careil, Marlène; Cord, Matthieu; Lathuilière, Stéphane; Verbeek, Jakob*",poster,2306.13754,https://arxiv.org/abs/2306.13754,,https://huggingface.co/papers/2306.13754,,,,5,1 StyleDomain: Efficient and Lightweight Parameterizations of StyleGAN for One-shot and Few-shot Domain Adaptation,"Alanov, Aibek*; Titov, Vadim; Nakhodnov, Maksim; Vetrov, Dmitry P",poster,2212.10229,https://arxiv.org/abs/2212.10229,,https://huggingface.co/papers/2212.10229,,,,4,0 GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds,"Xiang, Jianfeng*; Yang, Jiaolong; Deng, Yu; Tong, Xin",poster,,,,,,,,, Your Diffusion Model is Secretly a Zero-Shot Classifier,"Li, Alexander C*; Prabhudesai, Mihir; Duggal, Shivam; Brown, Ellis L; Pathak, Deepak",poster,2303.16203,https://arxiv.org/abs/2303.16203,,https://huggingface.co/papers/2303.16203,,,,5,1 Learning Hierarchical Features with Joint Latent Space Energy-Based Prior,"Cui, Jiali*; Wu, Ying Nian; Han, Tian",poster,,,,,,,,, ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation,"Xu, Liang*; Song, 
Ziyang; Wang, Dongliang; Su, Jing; Fang, Zhicheng; Ding, Chenjing; Gan, Weihao; Yan, Yichao; Jin, Xin; Yang, Xiaokang; Zeng, Wenjun ; Wu, Wei",poster,2203.07706,https://arxiv.org/abs/2203.07706,,https://huggingface.co/papers/2203.07706,,,,12,0 Landscape Learning for Neural Network Inversion,"Liu, Ruoshi*; Mao, Chengzhi; Tendulkar, Purva; Wang, Hao; Vondrick, Carl",poster,2206.09027,https://arxiv.org/abs/2206.09027,,https://huggingface.co/papers/2206.09027,,,,5,0 Diffusion in Style,"Everaert, Martin Nicolas*; Bocchio, Marco; Süsstrunk, Sabine; Arpa, Sami; Achanta, Radhakrishna",poster,,,,,,,,, Diffusion-SDF: Conditional Generative Modeling of Signed Distance Functions ,"Chou, Gene*; Bahat, Yuval; Heide, Felix",poster,,,,,,,,, GETAvatar: Generative Textured Meshes for Animatable Human Avatars,"Zhang, Xuanmeng*; Zhang, Jianfeng; Chacko, Rohan; Xu, Hongyi ; Song, Guoxian; Yang, Yi; Feng, Jiashi",poster,,,,,,,,, A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis,"Agarwal, Aishwarya*; Karanam, Srikrishna ; K J, Joseph; Saxena, Apoorv U; Goswami, Koustava; Srinivasan, Balaji Vasan",poster,,,,,,,,, TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition,"Lu, Shilin*; Liu, Yanzhu; Kong, Wai-Kin Adams",poster,,,,,,,,, Breaking The Limits of Text-conditioned 3D Motion Synthesis with Elaborative Descriptions,"Qian, Yijun*; Urbanek, Jack; Hauptmann, Alexander ; Won, Jungdam",poster,,,,,,,,, BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction,"Barquero, German*; Escalera, Sergio; Palmero, Cristina",poster,2211.14304,https://arxiv.org/abs/2211.14304,,https://huggingface.co/papers/2211.14304,,,,3,1 Delta Denoising Score,"Hertz, Amir*; Cohen-Or, Danny; Aberman, Kfir",poster,2304.07090,https://arxiv.org/abs/2304.07090,,https://huggingface.co/papers/2304.07090,,,,3,0 Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation,"Chen, Xingyu*; Deng, Yu; Wang, 
Baoyuan",poster,2303.09036,https://arxiv.org/abs/2303.09036,,https://huggingface.co/papers/2303.09036,,,,3,0 DreamBooth3D: Subject-Driven Text-to-3D Generation,"Raj, Amit; Kaza, Srinivas; Poole, Ben; Niemeyer, Michael; Ruiz, Nataniel; Mildenhall, Ben; Zada, Shiran; Aberman, Kfir; Rubinstein, Michael; Barron, Jonathan T; Li, Yuanzhen; Jampani, Varun*",poster,2303.13508,https://arxiv.org/abs/2303.13508,,https://huggingface.co/papers/2303.13508,,,,12,0 Feature Proliferation — the “Cancer” in StyleGAN and its Treatments,"Song, Shuang; Liang, Yuanbang; Wu, Jing; Lai, Yu-Kun; Qin, Yipeng*",poster,,,,,,,,, Unsupervised Facial Performance Editing via Vector-Quantized StyleGAN Representations,"Kicanaoglu, Berkay*; Garrido, Pablo; Bharaj, Gaurav",poster,,,,,,,,, 3D-aware Image Generation using 2D Diffusion Models,"Xiang, Jianfeng; Yang, Jiaolong*; Huang, Binbin; Tong, Xin",poster,,,,,,,,, Neural Collage Transfer: Artistic Reconstruction via Material Manipulation,"Lee, Ganghun; Kim, Minji; Lee, Yunsu; Lee, Minsu; Zhang, Byoung-Tak*",poster,,,,,,,,, Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption,"Hu, Teng; Zhang, Jiangning; Liu, Liang; Yi, Ran*; Kou, Siqi; Zhu, Haokun; Chen, Xu; Wang, Yabiao; Wang, Chengjie; Ma, Lizhuang",poster,,,,,,,,, Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction,"Chen, Hansheng*; Gu, Jiatao; Chen, Anpei; Tian, Wei; Tu, Zhuowen; Liu, Lingjie; Su, Hao",poster,2304.06714,https://arxiv.org/abs/2304.06714,,https://huggingface.co/papers/2304.06714,,,,7,1 Erasing Concepts from Diffusion Models,"Gandikota, Rohit*; Materzynska, Joanna; Fiotto-Kaufman, Jaden F; Bau, David",poster,2303.07345,https://arxiv.org/abs/2303.07345,,https://huggingface.co/papers/2303.07345,,,,4,0 Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding,"Yuan, Ziyang*; Zhu, Yiming M; Li, Yu; Liu, Hongyu; Yuan, 
Chun",poster,2303.12326,https://arxiv.org/abs/2303.12326,,https://huggingface.co/papers/2303.12326,,,,5,0 HairNeRF: Geometry-Aware Hair Swapped Image Synthesis,"Chang, Seunggyu*; Kim, GiHoon; Kim, Ha Yeon",poster,,,,,,,,, SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training,"Lin, Yuanze; Wei, Chen; Wang, Huiyu; Yuille, Alan; Xie, Cihang*",poster,2211.11446,https://arxiv.org/abs/2211.11446,,https://huggingface.co/papers/2211.11446,,,,5,0 DiffusionRet: Generative Text-Video Retrieval with Diffusion Model,"Jin, Peng*; Li, Hao; Cheng, Zesen; Li, Kehan; Ji, Xiangyang; Liu, Chang; Yuan, Li; Chen, Jie",poster,2303.09867,https://arxiv.org/abs/2303.09867,https://github.com/jpthu17/DiffusionRet,https://huggingface.co/papers/2303.09867,,,,8,0 Explore and Tell: Embodied Visual Captioning in 3D Environments,"Hu, Anwen*; Chen, Shizhe; Zhang, Liang; Jin, Qin",poster,2308.10447,https://arxiv.org/abs/2308.10447,,https://huggingface.co/papers/2308.10447,,,,4,1 Distilling Large Vision-Language Model with Out-of-Distribution Generalizability,"Li, Xuanlin*; Fang, Yunhao; Liu, Minghua; Ling, Zhan; Tu, Zhuowen; Su, Hao",poster,2307.03135,https://arxiv.org/abs/2307.03135,https://github.com/xuanlinli17/large_vlm_distillation_ood,https://huggingface.co/papers/2307.03135,,,,6,1 Learning Trajectory-Word Alignments for Video-Language Tasks,"YANG, XU; Li, Zhangzikang*; Xu, Haiyang; Zhang, Hanwang; Ye, Qinghao; Li, Chenliang; Yan, Ming; Zhang, Yu; Huang, Fei; Huang, Songfang",poster,2301.01953,https://arxiv.org/abs/2301.01953,,https://huggingface.co/papers/2301.01953,,,,10,0 Variational Causal Inference Network for Explanatory Visual Question Answering,"Xue, Dizhan*; Qian, Shengsheng; Xu, Changsheng",poster,,,,,,,,, TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation,"Ye-Bin, Moon*; Kim, Jisoo; Kim, Hongyeob; son, kilho; Oh, Tae-Hyun",poster,2307.14611,https://arxiv.org/abs/2307.14611,,https://huggingface.co/papers/2307.14611,,,,5,0 UniRef: A Unified 
Model for Reference-based Object Segmentation Tasks,"Wu, Jiannan*; Jiang, Yi; Yan, Bin; Lu, Huchuan; Yuan, Zehuan; Luo, Ping",poster,,,,,,,,, Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models,"Li, Juncheng*; Gao, Minghe; Wei, Longhui; Tang, Siliang; Zhang, Wenqiao; Li, Mengze; Ji, Wei; Tian, Qi; Chua, Tat-Seng; Zhuang, Yueting",poster,2303.06571,https://arxiv.org/abs/2303.06571,,https://huggingface.co/papers/2303.06571,,,,10,0 "Misalign, Contrast then Distill: Rethinking Misalignments in Language-Image Pre-training","Kim, Bumsoo*; Jo, Yeonsik; Kim, Jinhyung; Kim, Seung Hwan",poster,,,,,,,,, Toward Multi-Granularity Decision-Making: Explicit Visual Reasoning with Hierarchical Knowledge,"Zhang, Yifeng; Chen, Shi; Zhao, Qi*",poster,,,,,,,,, VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching,"Bi, Junyu*; Cheng, Daixuan; Yao, Ping; Pang, Bochen; Zhan, Yuefeng; Yang, Chuanguang; Wang, Yujing; Sun, Hao; Deng, Weiwei; Zhang, Qi",poster,,,,,,,,, Moment Detection in Long Tutorial Videos,"Croitoru, Ioana*; Bogolin, Simion-Vlad; Albanie, Samuel; Liu, Yang; Wang, Zhaowen; Yoon, Seunghyun; Dernoncourt, Franck; Jin, Hailin; Bui, Trung",poster,,,,,,,,, Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement,"Zhu, Xiangyang; Zhang, Renrui*; He, Bowei; Zhou, Aojun; Wang, Dong; Zhao, Bin; Gao, Peng",poster,2304.01195,https://arxiv.org/abs/2304.01195,,https://huggingface.co/papers/2304.01195,,,,7,0 Breaking Common Sense: WHOOPS! 
A Vision-and-Language Benchmark of Synthetic and Compositional Images,"Guetta, Nitzan; Bitton, Yonatan*; Hessel, Jack; Schmidt, Ludwig; Elovici, Yuval ; Stanovsky, Gabriel; Schwartz, Roy",poster,2303.07274,https://arxiv.org/abs/2303.07274,,https://huggingface.co/papers/2303.07274,,,,7,1 Advancing Referring Expression Segmentation Beyond Single Image,"Wu, Yixuan; Zhang, Zhao*; Xie, Chi; Zhu, Feng; Zhao, Rui",poster,2305.12452,https://arxiv.org/abs/2305.12452,https://github.com/yixuan730/group-res,https://huggingface.co/papers/2305.12452,,,,5,0 CLIPoint: Adapting CLIP for Powerful 3D Open-world Learning,"Zhu, Xiangyang; Zhang, Renrui*; He, Bowei; Qin, Zipeng; Zeng, Ziyao; Guo, Ziyu; Zhang, Shanghang; Gao, Peng",poster,,,,,,,,, Unsupervised Prompt Tuning for Text-Driven Object Detection,"He, Weizhen; Chen, Weijie*; Chen, Binbin; Yang, Shicai; Xie, Di; Lin, Luojun; Qi, Donglian; Zhuang, Yueting",poster,,,,,,,,, Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding,"wang, zehan*; Huang, Haifeng; Zhao, Yang; Li, Linjun; Cheng, Xize; Zhu, Yichen; Yin, Aoxiong; Zhao, Zhou",poster,2307.09267,https://arxiv.org/abs/2307.09267,,https://huggingface.co/papers/2307.09267,,,,8,0 I can't believe there's no images! 
Learning Visual Tasks Using only Language Data,"Gu, Sophia; Clark, Christopher A*; Kembhavi, Aniruddha",poster,,,,,,,,, Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples,"Li, Guanghui*; Gao, Mingqi; Liu, Heng; Zhen, Xiantong; Zheng, Feng",poster,,,,,,,,, MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions,"Ding, Henghui*; Liu, Chang; He, Shuting; Jiang, Xudong; Loy, Chen Change",poster,2308.08544,https://arxiv.org/abs/2308.08544,,https://huggingface.co/papers/2308.08544,,,,5,0 Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning,"Feng, Chun-Mei*; Yu, Kai; Liu, Yong; Khan, Salman; Zuo, Wangmeng",poster,2308.06038,https://arxiv.org/abs/2308.06038,,https://huggingface.co/papers/2308.06038,,,,5,0 ShapeScaffolder: Structure-Aware 3D Shape Generation from Text,"Tian, Xi*; Yang, Yongliang; Wu, Qi",poster,,,,,,,,, SuS-X: Training-Free Name-Only Transfer of Vision-Language Models,"Udandarao, Vishaal*; Gupta, Ankush; Albanie, Samuel",poster,,,,,,,,, BEVBert: Multimodal Map Pre-training for Language-guided Navigation,"An, Dong*; Qi, Yuankai; Li, Yangguang; Huang, Yan; Wang, Liang; Tan, Tieniu; Shao, Jing",poster,2212.04385,https://arxiv.org/abs/2212.04385,,https://huggingface.co/papers/2212.04385,,,,7,0 X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance,"Ma, Yiwei*; Zhang, Xiaoqing; Sun, Xiaoshuai; Ji, Jiayi; Wang, Haowei; Jiang, Guannan; Zhuang, Weilin; Ji, Rongrong",poster,,,,,,,,, OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation,"Wu, Dongming; Wang, Tiancai; Zhang, Yuang; Zhang, Xiangyu; Shen, Jianbing*",poster,2307.09356,https://arxiv.org/abs/2307.09356,,https://huggingface.co/papers/2307.09356,,,,5,0 Attentive Mask CLIP,"Yang, Yifan*; Huang, Weiquan; Wei, Yixuan; Peng, Houwen; Jiang, Xinyang; Jiang, Huiqiang; Wei, Fangyun; Wang, Yin; Hu, Han; Qiu, Lili; Yang, 
Yuqing",poster,2212.08653,https://arxiv.org/abs/2212.08653,,https://huggingface.co/papers/2212.08653,,,,11,0 Knowledge Proxy Intervention for Deconfounded Video Question Answering,"Li, Jiangtong*; Niu, Li; Zhang, Liqing",poster,,,,,,,,, UniVTG: Towards Unified Video-Language Temporal Grounding,"Lin, Qinghong*; Zhang, Pengchuan; Chen, Joya; Pramanick, Shraman; Gao, Difei; Wang, Jinpeng; Yan, Rui; Shou, Mike Zheng",poster,2307.16715,https://arxiv.org/abs/2307.16715,https://github.com/showlab/UniVTG,https://huggingface.co/papers/2307.16715,,,,8,5 Self-Supervised Cross-View Representation Reconstruction for Change Captioning,"Tu, Yunbin*; Li, Liang; Su, Li; Zha, Zheng-Jun; Yan, Chenggang; Huang, Qingming",poster,,,,,,,,, Unified Coarse-to-Fine Alignment for Text-to-Video Retrieval,"Wang, Ziyang*; Sung, Yi-Lin; Cheng, Feng; Bertasius, Gedas; Bansal, Mohit",poster,,,,,,,,, Confidence-aware Pseudo-label Learning for Weakly Supervised Visual Grounding,"Jiahua, Zhang; Chen, Qingchao; Peng, Yuxin; Liu, Yang*",poster,,,,,,,,, TextPSG: Panoptic Scene Graph Generation from Textual Descriptions,"Zhao, Chengyang; Shen, Yikang; Chen, Zhenfang*; Ding, Mingyu; Gan, Chuang",poster,,,,,,,,, "MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge","Lin, Wei*; Karlinsky, Leonid; Shvetsova, Nina; Possegger, Horst; Kozinski, Mateusz; Panda, Rameswar; Feris, Rogerio; Kuehne, Hilde; Bischof, Horst",poster,2303.08914,https://arxiv.org/abs/2303.08914,https://github.com/wlin-at/MAXI,https://huggingface.co/papers/2303.08914,,,,9,1 "Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation","Li, Yaowei; Yang, Bang; Cheng, Xuxin; Zhu, Zhihong; Li, Hongxiang; Zou, Yuexian*",poster,2303.15932,https://arxiv.org/abs/2303.15932,,https://huggingface.co/papers/2303.15932,,,,6,0 Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation,"Gupta, Devaansh*; Kharbanda, Siddhant; Zhou, Jiawei; Li, 
Wanhua; Pfister, Hanspeter; Wei, Donglai",poster,,,,,,,,, Learning Human-Human Interactions in Images from Weak Textual Supervision,"Alper, Morris*; Averbuch-Elor, Hadar",poster,2304.14104,https://arxiv.org/abs/2304.14104,,https://huggingface.co/papers/2304.14104,,,,2,0 BUS:Efficient and Effective Vision-language Pretraining with Bottom-Up Patch Summarization.,"Jiang, Chaoya*; Xu, Haiyang; Ye, Wei; Ye, Qinghao; Li, Chenliang; Yan, Ming; Bi, Bin; Zhang, Shikun; Huang, Fei; Huang, Songfang",poster,,,,,,,,, 3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment,"zhu, ziyu*; Ma, Xiaojian; Chen, Yixin; Deng, Zhidong; Huang, Siyuan; Li, Qing",poster,,,,,,,,, ALIP: Adaptive Language-Image Pre-training with Synthetic Caption,"Yang, Kaicheng*; Deng, Jiankang; An, Xiang; li, jiawei; Feng, Ziyong; Guo, Jia; Yang, Jing; Liu, Tongliang",poster,2308.08428,https://arxiv.org/abs/2308.08428,https://github.com/deepglint/ALIP,https://huggingface.co/papers/2308.08428,,,,8,0 LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models,"Shi, Cheng; Yang, Sibei*",poster,,,,,,,,, Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning,"Kang, Woo Young*; Roh, Byungseok; lee, sungjun; Mun, Jonghwan",poster,2212.13563,https://arxiv.org/abs/2212.13563,https://github.com/kakaobrain/noc,https://huggingface.co/papers/2212.13563,,,,4,0 Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering,"Qian, Zi*; Wang, Xin; Duan, Xuguang; Qin, Pengda; Li, Yuhong; Zhu, Wenwu",poster,,,,,,,,, Prompt-Guided Image Captioning for VQA with GPT-3,"Hu, Yushi*; Hua, Hang; Yang, Zhengyuan; Shi, Weijia; Smith, Noah A; Luo, Jiebo",poster,,,,,,,,, Grounded Image Text Matching with Mismatched Relation Reasoning,"Wu, Yu*; Wei, Yana; Wang, Haozhe; Liu, Yongfei; Yang, Sibei; He, Xuming",poster,2308.01236,https://arxiv.org/abs/2308.01236,,https://huggingface.co/papers/2308.01236,,,,6,0 GePSAn: Generative Procedure Step 
Anticipation in Cooking Videos,"Abdelsalam, Mohamed A*; Rangrej, Samrudhdhi B.; Hadji, Isma; DVORNIK, NIKITA; Derpanis, Konstantinos G; Fazly, Afsaneh",poster,,,,,,,,, LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models,"Song, Chan Hee*; Wu, Jiaman; Washington, Clayton B; Sadler, Brian M; Chao, Wei-Lun; Su, Yu",poster,,,,,,,,, VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control,"Hu, Zi-Yuan*; Li, Yanyang; Lyu, Michael R; Wang, Liwei",poster,2308.09804,https://arxiv.org/abs/2308.09804,https://github.com/HenryHZY/VL-PET,https://huggingface.co/papers/2308.09804,,,,4,1 With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning,"Barraco, Manuele; Sarto, Sara; Cornia, Marcella*; Baraldi, Lorenzo; Cucchiara, Rita",poster,2308.12383,https://arxiv.org/abs/2308.12383,https://github.com/aimagelab/PMA-Net,https://huggingface.co/papers/2308.12383,,,,5,0 Improving Zero-Shot Generalization for CLIP with Synthesized Prompts,"Wang, Zhengbo*; Liang, Jian; He, Ran; Xu, Nan; Wang, Zilei; Tan, Tieniu",poster,2307.07397,https://arxiv.org/abs/2307.07397,https://github.com/mrflogs/SHIP,https://huggingface.co/papers/2307.07397,,,,6,0 DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models,"Cho, Jaemin*; Zala, Abhaysinh S; Bansal, Mohit",poster,,,,,,,,, Learning Navigational Visual Representations with Semantic Map Supervision,"Hong, Yicong*; Zhou, Yang; Zhang, Ruiyi; Dernoncourt, Franck; Bui, Trung; Gould, Stephen; Tan, Hao",poster,2307.12335,https://arxiv.org/abs/2307.12335,,https://huggingface.co/papers/2307.12335,,,,7,0 CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection,"Tang, Jiajin; Zheng, Ge; Yu, Jingyi; Yang, Sibei*",poster,,,,,,,,, Open Set Video HOI detection from Action-centric Chain-of-Look Prompting,"Xi, Nan*; Meng, Jingjing; Yuan, Junsong",poster,,,,,,,,, Learning Concise and Descriptive Attributes for Visual Recognition,"Yan, 
An*; Wang, Yu; Zhong, Yiwu; Dong, Chengyu; He, Zexue; Lu, Yujie; Wang, William Yang; Shang, Jingbo; McAuley, Julian",poster,2308.03685,https://arxiv.org/abs/2308.03685,,https://huggingface.co/papers/2308.03685,,,,9,1 Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models,"Ko, Dohwan*; Lee, Ji Soo; Choi, Miso; Chu, Jaewon; Park, Jihwan; Kim, Hyunwoo J",poster,2308.09363,https://arxiv.org/abs/2308.09363,https://github.com/mlvlab/OVQA,https://huggingface.co/papers/2308.09363,,,,6,0 Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories,"Mensink, Thomas; Uijlings, Jasper*; Castrejon, Lluis; Goel, Arushi; Chamone, Felipe C; Zhou, Howard; Sha, Fei; Araujo, Andre; Ferrari, Vittorio",poster,2306.09224,https://arxiv.org/abs/2306.09224,https://github.com/google-research/google-research/tree/master/encyclopedic_vqa,https://huggingface.co/papers/2306.09224,,,,9,2 Story Visualization by Online Text Augmentation with Context Memory,"Ahn, Daechul; Kim, Daneul; Song, Gwangmo; Kim, Seung Hwan; Lee, Honglak; Kang, Dongyeop; Choi, Jonghyun*",poster,2308.07575,https://arxiv.org/abs/2308.07575,,https://huggingface.co/papers/2308.07575,,,,7,1 Transferable Decoding with Visual Entities for Zero-Shot Image Captioning,"Fei, Junjie*; Wang, Teng; Zhang, Jinrui; He, Zhenyu; Wang, Chengjie; Zheng, Feng",poster,2307.16525,https://arxiv.org/abs/2307.16525,https://github.com/FeiElysia/ViECap,https://huggingface.co/papers/2307.16525,,,,6,0 Too Large; Data Reduction for Vision-Language Pre-Training,"Wang, Jinpeng*; Lin, Qinghong; Zhang, David Junhao; Lei, Stan Weixian; Shou, Mike Zheng",poster,2305.20087,https://arxiv.org/abs/2305.20087,https://github.com/showlab/datacentric.vlp,https://huggingface.co/papers/2305.20087,,,,5,0 ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation,"Wang, Weihan*; Yang, Zhen; Xu, Bin; Li, Juanzi; Sun, 
Yankui",poster,2308.16689,https://arxiv.org/abs/2308.16689,,https://huggingface.co/papers/2308.16689,,,,5,0 Learning a More Continuous Zero Level Set in Unsigned Distance Fields through Level Set Projection,"Zhou, Junsheng; Ma, Baorui; Li, Shujuan; Liu, Yu-Shen*; Han, Zhizhong",poster,2308.11441,https://arxiv.org/abs/2308.11441,https://github.com/junshengzhou/LevelSetUDF,https://huggingface.co/papers/2308.11441,,,,5,0 GNT-MOVE: Generalizable NeRF Transformer with Mixture-of-View-Experts,"Cong, Wenyan*; liang, hanxue; Wang, Peihao; Fan, Zhiwen; Chen, Tianlong; Varma, Mukund T; Wang, Yi; Wang, Zhangyang",poster,,,,,,,,, MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond,"Li, Yixuan; Jiang, Lihan; Xu, Linning; Xiangli, Yuanbo; Wang, Zhenzhi; Lin, Dahua; Dai, Bo*",poster,,,,,,,,, R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras,"Schmied, Aron; Fischer, Tobias*; Danelljan, Martin; Pollefeys, Marc; Yu, Fisher",poster,2308.14713,https://arxiv.org/abs/2308.14713,,https://huggingface.co/papers/2308.14713,,,,5,0 ClimateNeRF: Extreme Weather Synthesis in Neural Radiance Field,"Li, Yuan; Lin, Zhi-Hao*; Forsyth, David; Huang, Jia-Bin; Wang, Shenlong",poster,2211.13226,https://arxiv.org/abs/2211.13226,,https://huggingface.co/papers/2211.13226,,,,5,0 Rendering Humans from Object-Occluded Monocular Videos,"Xiang, Tiange*; Sun, Adam; Wu, Jiajun; Adeli, Ehsan; Fei-Fei, Li",poster,2308.04622,https://arxiv.org/abs/2308.04622,,https://huggingface.co/papers/2308.04622,,,,5,0 AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation,"Xiangli, Yuanbo*; Xu, Linning; Pan, Xingang; Zhao, Nanxuan; Dai, Bo; Lin, Dahua",poster,2303.13953,https://arxiv.org/abs/2303.13953,,https://huggingface.co/papers/2303.13953,,,,6,2 PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images,"Liu, Yingfei*; Yan, Junjie; Jia, Fan; Li, Shuailin; Gao, Aqi; Wang, Tiancai; Zhang, 
Xiangyu",poster,2206.01256,https://arxiv.org/abs/2206.01256,https://github.com/megvii-research/PETR,https://huggingface.co/papers/2206.01256,,,,8,0 MIMO-NeRF: Fast Neural Rendering with Multi-input Multi-output Neural Radiance Fields,"Kaneko, Takuhiro*",poster,,,,,,,,, Adaptive Positional Encoding for Bundle-Adjusting Neural Radiance Fields,"Gao, Zelin; Dai, Weichen; Zhang, Yu*",poster,,,,,,,,, NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction,"Wang, Yiming*; Han, Qin; Habermann, Marc; Daniilidis, Kostas; Theobalt, Christian; Liu, Lingjie",poster,2212.05231,https://arxiv.org/abs/2212.05231,,https://huggingface.co/papers/2212.05231,,,,6,0 Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition,"Wang, Qitong*; Zhao, Long; Yuan, Liangzhe; Liu, Ting; Peng, Xi",poster,2308.11489,https://arxiv.org/abs/2308.11489,https://github.com/wqtwjt1996/SUM-L,https://huggingface.co/papers/2308.11489,,,,5,0 Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching,"Jing, Junpeng*; Li, Jiankun; Xiong, Pengfei; Liu, Jiangyu; Liu, Shuaicheng; Guo, Yichen; Deng, Xin; Xu, Mai; Jiang, Lai; Sigal, Leonid",poster,2307.14071,https://arxiv.org/abs/2307.14071,,https://huggingface.co/papers/2307.14071,,,,10,0 Compatibility of Fundamental Matrices for Complete Viewing Graphs,"Bråtelund, Martin*; Rydell, Felix",poster,2303.10658,https://arxiv.org/abs/2303.10658,,https://huggingface.co/papers/2303.10658,,,,2,0 ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation,"Tang, Pin; Xu, Haiming; Ma, Chao*",poster,,,,,,,,, SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection,"Zhang, Jinqing*; Zhang, Yanan; Liu, Qingjie; Wang, Yunhong",poster,,,,,,,,, GraphAlign: Enhancing Accurate Feature Alignment by Graph matching for Multi-Modal 3D Object Detection,"Song, Ziying; Wei, Haiyue; Bai, Lin; Yang, Lei; Jia, Caiyan*",poster,,,,,,,,, Tangent Sampson Error: Fast 
Approximate Two-view Reprojection Error for Central Camera Models,"Terekhov, Mikhail A.*; Larsson, Viktor",poster,,,,,,,,, Using a Waffle Iron for Automotive Point Cloud Semantic Segmentation,"Puy, Gilles*; Boulch, Alexandre; Marlet, Renaud",poster,2301.10100,https://arxiv.org/abs/2301.10100,,https://huggingface.co/papers/2301.10100,,,,3,0 Fast Globally Optimal Surface Normal Estimation from an Affine Correspondence,"Hajder, Levente*; Barath, Daniel; Lóczi, Lajos",poster,,,,,,,,, HeadsUp: A Data-Driven Volumetric Prior for Few-shot Synthesis of Ultra High-resolution Human Heads,"Bühler, Marcel C.; Sarkar, Kripasindhu; Shah, Tanmay; Li, Gengyan; Wang, Daoye; Helminger, Leonhard; Orts-Escolano, Sergio; Lagun, Dmitry; Hilliges, Otmar; Beeler, Thabo; Meka, Abhimitra*",poster,,,,,,,,, TILTED: Robust Neural Fields via Latent Registration,"Yi, Brent H*; Zeng, Weijia ; Buchanan, Sam; Ma, Yi",poster,,,,,,,,, Center-Based Decoupled Point-cloud Registration for 6D Object Pose Estimation,"Jiang, Haobo*; Salzmann, Mathieu; Dang, Zheng; Gu, Shuo; Xie, Jin; Yang, Jian",poster,,,,,,,,, Deep geometry-aware camera self-calibration from video,"Hagemann, Annika*; Knorr, Moritz M; Stiller, Christoph",poster,,,,,,,,, V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints,"Burgdorfer, Nathaniel J*; Mordohai, Philippos",poster,,,,,,,,, Consistent Depth Prediction for Transparent Object Reconstruction from RGB-D Camera,"Cai, Yuxiang; Zhu, Yifan; Zhang, Haiwei; Ren, Bo*",poster,,,,,,,,, FaceCLIPNeRF: Text-driven 3D Face Manipulation using Deformable Neural Radiance Fields,"Hwang, Sungwon*; Hyung, Junha; Kim, Daejin; Kim, Min-Jung; Choo, Jaegul",poster,2307.11418,https://arxiv.org/abs/2307.11418,,https://huggingface.co/papers/2307.11418,,,,5,3 HollowNeRF: Pruning Hashgrid-Based NeRFs with Trainable Collision Mitigation,"Xie, Xiufeng*; Gherardi, Riccardo; Pan, Zhihong; Huang, Stephen",poster,2308.10122,https://arxiv.org/abs/2308.10122,,https://huggingface.co/papers/2308.10122,,,,4,1 
ICE-NeRF: Interactive Color Editing of NeRFs via Decomposition-Aware Weight Optimization,"Lee, Jae-Hyeok*; Kim, Daeshik",poster,,,,,,,,, FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration ,"Huang, Zhijian*; Lin, Sihao; liu, guiyu; LUO, Mukun; Ye, Chaoqiang; Xu, Hang; Chang, Xiaojun; Liang, Xiaodan",poster,,,,,,,,, Neural Fields for Structured Lighting,"Shandilya, Aarrushi*; Attal, Benjamin; Richardt, Christian; Tompkin, James; O'Toole, Matthew",poster,,,,,,,,, CO-Net: Learning Multiple Point Cloud Tasks at Once with A Cohesive Network,"Xie, Tao*; Wang, Ke; Lu, Siyi; zhang, yukun; dai, kun; Li, Xiaoyu; Xu, Jie; Wang, Li; Zhao, Lijun; Zhang, Xinyu; Li, Ruifeng",poster,,,,,,,,, Pose-Free Neural Radiance Fields via Implicit Pose Regularization,"Zhang, Jiahui; Zhan, Fangneng; Yu, Yingchen; Liu, Kunhao; WU, Rongliang; Zhang, Xiaoqin; Shao, Ling; Lu, Shijian*",poster,2308.15049,https://arxiv.org/abs/2308.15049,,https://huggingface.co/papers/2308.15049,,,,8,0 TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering,"Pan, Xiao*; Yang, Zongxin; Ma, Jianxin; Zhou, Chang; Yang, Yi",poster,2307.12291,https://arxiv.org/abs/2307.12291,,https://huggingface.co/papers/2307.12291,,,,5,0 S-VolSDF: Sparse Multi-View Stereo Regularization of Neural Implicit Surfaces,"Wu, Haoyu*; Graikos, Alexandros; Samaras, Dimitris ",poster,,,,,,,,, DPS-Net: Deep Polarimetric Stereo Depth Estimation,"Tian, Chaoran; Pan, Weihong; Wang, Zimo; Mao, Mao; Zhang, Guofeng; Bao, Hujun; Tan, Ping; Cui, Zhaopeng*",poster,,,,,,,,, 3DPPE: 3D Point Positional Encoding for Transformer-based Multi-Camera 3D Object Detection,"Shu, Changyong; Deng, Jiajun; Yu, Fisher; Liu, Yifan*",poster,,,,,,,,, Deformable Neural Radiance Fields using RGB and Event Cameras,"Ma, Qi*; Paudel, Danda Pani; Chhatkuli , Ajad; Van Gool, Luc",poster,,,,,,,,, Inter-Reflectable Light Fields for Geometry and Material Estimation,"Zhang, Jingyang; Yao, Yao*; Li, 
Shiwei; Liu, Jingbo; Fang, Tian; McKinnon, David N; Tsin, Yanghai; Quan, Long",poster,,,,,,,,, Hierarchical Prior Mining for Non-local Multi-View Stereo,"Ren, Chunlin*; Xu, Qingshan; Zhang, Shikun; Yang, Jiaqi",poster,2303.09758,https://arxiv.org/abs/2303.09758,,https://huggingface.co/papers/2303.09758,,,,4,0 Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection,"Wang, Shihao; Liu, Yingfei*; Wang, Tiancai; Li, Ying; Zhang, Xiangyu",poster,2303.11926,https://arxiv.org/abs/2303.11926,https://github.com/exiawsh/StreamPETR.git,https://huggingface.co/papers/2303.11926,,,,5,0 Re-ReND: Real-time Rendering of NeRFs across Devices,"Rojas Martinez, Sara *; Zarzar, Jesus; Perez, Juan C; Thabet, Ali K; Sanakoyeu, Artsiom; Pumarola, Albert; Ghanem, Bernard",poster,,,,,,,,, Learning Shape Primitives via Implicit Convexity Regularization,"Huang, Xiaoyang*; Zhang, Yi; Ni, Bingbing; Chen, Kai; Li, Teng; Zhang, Wenjun",poster,,,,,,,,, Geometry-guided Feature Learning and Fusion for Indoor Scene Reconstruction,"Yin, Ruihong*; Karaoglu, Sezer; Gevers, Theo",poster,,,,,,,,, LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment,"Zhang, Zhiwei*; Zhang, Zhizhong; Yu, Qian; Yi, Ran; Xie, Yuan; Ma, Lizhuang",poster,2308.01686,https://arxiv.org/abs/2308.01686,https://github.com/zhangzw12319/lcps.git,https://huggingface.co/papers/2308.01686,,,,6,0 PivotNet: End-to-end Learning for Vectorized HD Map Construction,"Ding, Wenjie*; Qiao, Limeng; Qiu, Xi; Zhang, Chi",poster,,,,,,,,, Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs,"Qian, Ming; Xiong, Jincheng; Xia, Gui-Song; Xue, Nan*",poster,2303.14672,https://arxiv.org/abs/2303.14672,,https://huggingface.co/papers/2303.14672,,,,4,2 Mask-Attention-Free Transformer for 3D Instance Segmentation,"Lai, Xin*; Yuan, Yuhui; Chu, Ruihang; Chen, Yukang; Hu, Han; Jia, Jiaya",poster,,,,,,,,, Scene-Aware Feature Matching,"Lu, Xiaoyong; Yan, Yaping; Wei, Tong; Du, 
Songlin*",poster,2308.09949,https://arxiv.org/abs/2308.09949,,https://huggingface.co/papers/2308.09949,,,,4,0 "Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling","Chen, Zhuoxiao*; Luo, Yadan; Wang, Zheng; Baktashmotlagh, Mahsa; Huang, Zi Helen",poster,2307.07944,https://arxiv.org/abs/2307.07944,https://github.com/zhuoxiao-chen/ReDB-DA-3Ddet,https://huggingface.co/papers/2307.07944,,,,5,0 GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction,"Zhang, Youmin*; Tosi, Fabio; Mattoccia, Stefano; Poggi, Matteo",poster,,,,,,,,, BANSAC: A dynamic BAyesian Network for SAmple Consensus,"Piedade, Valter André; Miraldo, Pedro*",poster,,,,,,,,, Theoretical and Numerical Analysis of 3D Reconstruction Using Point and Line Incidences,"Torres, Angelica*; Rydell, Felix; Shehu, Elima",poster,2303.13593,https://arxiv.org/abs/2303.13593,,https://huggingface.co/papers/2303.13593,,,,3,0 RealGraph: A Multiview Dataset for 4D Real-world Context Graph Generation,"Lin, Haozhe; Chen, Zequn; Zhang, Jinzhi; Bai, Bing; Wang, Yu; Huang, Ruqi; Fang, Lu*",poster,,,,,,,,, CL-MVSNet: Unsupervised Multi-view Stereo with Dual-level Contrastive Learning,"Xiong, Kaiqiang*; Peng, Rui; Zhang, Zhe; Feng, Tianxing; Jiao, Jianbo; Gao, Feng; Wang, Ronggang",poster,,,,,,,,, Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction,"Zong, Zhuofan*; Jiang, Dongzhi; Song, Guanglu; Xue, Zeyue; Su, Jingyong; Li, Hongsheng; Liu, Yu",poster,2304.00967,https://arxiv.org/abs/2304.00967,https://github.com/Sense-X/HoP,https://huggingface.co/papers/2304.00967,,,,7,0 Object as Query: Lifting any 2D Object Detector to 3D Detection,"Wang, Zitian*; Huang, Zehao; Fu, Jiahui; Wang, Naiyan; Liu, Si",poster,2301.02364,https://arxiv.org/abs/2301.02364,,https://huggingface.co/papers/2301.02364,,,,5,0 PARTNER: Level up the Polar Representation for LiDAR 3D Object Detection,"Nie, Ming; XUE, Yujing; Wang, Chunwei; Ye, Chaoqiang; Xu, 
Hang; Bi Mi, Michael; Wang, Xinchao; Zhang, Li*; Zhu, Xinge; Huang, Qingqiu",poster,2308.03982,https://arxiv.org/abs/2308.03982,,https://huggingface.co/papers/2308.03982,,,,10,0 Not Every Side Is Equal: Localization Uncertainty Estimation for Semi-Supervised 3D Object Detection,"Wang, Chuxin*; Yang, Wenfei; Zhang, Tianzhu",poster,,,,,,,,, Multi-Modal Neural Radiance Field for Monocular Dense SLAM with a Light-Weight ToF Sensor,"Liu, Xinyang; Zhang, Yinda; Li, Yijin; Teng, Yanbin; Bao, Hujun; Zhang, Guofeng; Cui, Zhaopeng*",oral,2308.14383,https://arxiv.org/abs/2308.14383,,https://huggingface.co/papers/2308.14383,,,,7,0 ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes,"Yeshwanth, Chandan*; Liu, Yueh-Cheng; Niessner, Matthias; Dai, Angela",oral,2308.11417,https://arxiv.org/abs/2308.11417,,https://huggingface.co/papers/2308.11417,,,,4,0 Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach,"Lu, Jiachen; Peng, Renyuan; Cai, Xinyue; Xu, Hang; Li, Hongyang; Wen, Feng; Zhang, Wei; Zhang, Li*",oral,,,,,,,,, Doppelgangers: Learning to Disambiguate Images of Similar Structures,"Cai, Ruojin*; Tung, Joseph; Wang, Qianqian; Averbuch-Elor, Hadar; Hariharan, Bharath; Snavely, Noah",oral,,,,,,,,, EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries,"Mai, Jinjie*; Hamdi, Abdullah J; Giancola, Silvio; Zhao, Chen; Ghanem, Bernard",oral,2212.06969,https://arxiv.org/abs/2212.06969,https://github.com/Wayne-Mai/EgoLoc,https://huggingface.co/papers/2212.06969,,,,5,1 ClothPose: A Real-world Benchmark for Visual Analysis of Garment Pose via An Indirect Recording Solution,"Xu, Wenqiang*; Du, Wenxin; Xue, Han; Li, Yutong; Ye, Ruolin; Wang, Yan-Feng; Lu, Cewu",oral,,,,,,,,, EMR-MSF: Self-Supervised Recurrent Monocular Scene Flow Exploiting Ego-Motion Rigidity,"Jiang, Zijie*; Okutomi, Masatoshi",oral,,,,,,,,, ENVIDR: Implicit Differentiable Renderer with Neural Environment Lighting,"Liang, Ruofan*; Chen, Huiting; Li, 
Chunlin; Chen, Fan; Panneer, Selvakumar; Vijaykumar, Nandita",oral,2303.13022,https://arxiv.org/abs/2303.13022,,https://huggingface.co/papers/2303.13022,,,,6,0 Robust Mixture-of-Expert Training for Convolutional Neural Networks,"Zhang, Yihua*; Cai, Ruisi; Chen, Tianlong; Zhang, Guanhua; Zhang, Huan; Chen, Pin-Yu; Chang, Shiyu; Wang, Zhangyang; Liu, Sijia",oral,2308.10110,https://arxiv.org/abs/2308.10110,https://github.com/OPTML-Group/Robust-MoE-CNN,https://huggingface.co/papers/2308.10110,,,,9,0 Set-Level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models,"Lu, Dong*; Wang, Zhiqiang; Wang, Teng; GUAN, WEILI; Gao, Hongchang; Zheng, Feng",oral,2307.14061,https://arxiv.org/abs/2307.14061,,https://huggingface.co/papers/2307.14061,,,,6,0 CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning,"Bansal, Hritik*; Singhi, Nishad; Yang, Yu; Yin, Fan; Grover, Aditya; Chang, Kai-Wei",oral,2303.03323,https://arxiv.org/abs/2303.03323,https://github.com/nishadsinghi/CleanCLIP,https://huggingface.co/papers/2303.03323,,,,6,0 CGBA: Curvature-aware Geometric Black-box Attack,"Reza, Md Farhamdur*; Rahmati, Ali; Wu, Tianfu; Dai, Huaiyu",oral,2308.03163,https://arxiv.org/abs/2308.03163,https://github.com/Farhamdur/CGBA,https://huggingface.co/papers/2308.03163,,,,4,0 Robust Evaluation of Diffusion-Based Adversarial Purification,"Lee, Minjong*; Kim, Dongwoo",oral,2303.09051,https://arxiv.org/abs/2303.09051,,https://huggingface.co/papers/2303.09051,,,,2,0 Advancing Example Exploitation Can Alleviate Critical Challenges in Adversarial Training,"Ge, Yao*; Li, Yun; Han, Keji; Zhu, Junyi; Long, Xianzhong",oral,,,,,,,,, The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data,"Zhu, Zixuan*; Wang, Rui; Zou, Cong; Jing, Lihua",oral,,,,,,,,, TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models,"Sur, Indranil*; Sikka, Karan; Walmer, Matthew; Koneripalli, 
Kaushik; Roy, Anirban; Lin, Xiao; Divakaran, Ajay; Jha, Susmit",oral,2308.03906,https://arxiv.org/abs/2308.03906,https://github.com/SRI-CSL/TIJO,https://huggingface.co/papers/2308.03906,,,,8,1 SAGA: Spectral Adversarial Geometric Attack on 3D Meshes,"Stolik, Tomer*; Lang, Itai; Avidan, Shai",poster,2211.13775,https://arxiv.org/abs/2211.13775,https://github.com/StolikTomer/SAGA,https://huggingface.co/papers/2211.13775,,,,3,1 Benchmarking and Analyzing Robust Point Cloud Recognition: Bag of Tricks for Defending Adversarial Examples,"Ji, Qiufan*; Wang, Lin ; Hu, Shengshan; Sun, Lichao; Shi, Cong; Chen, Yingying",poster,2307.16361,https://arxiv.org/abs/2307.16361,https://github.com/qiufan319/benchmark_pc_attack.git,https://huggingface.co/papers/2307.16361,,,,6,0 ACTIVE: Towards Highly Transferable 3D Physical Camouflage for Universal and Robust Vehicle Evasion,"Suryanto, Naufal*; Kim, Yongsu; Larasati, Harashta Tatimma; Kang, Hyoeun; Le, Thi-Thu-Huong; Hong, Yoonyoung; Yang, Hunmin; Oh, Se-Yoon; Kim, Howon",poster,2308.07009,https://arxiv.org/abs/2308.07009,,https://huggingface.co/papers/2308.07009,,,,9,1 Frequency-aware GAN for Adversarial Manipulation Generation,"Zhu, Peifei*; Osada, Genki; Kataoka, Hirokatsu; Takahashi, Tsubasa",poster,,,,,,,,, Breaking Temporal Consistency: Generating Video Universal Adversarial Perturbations Using Image Models,"Kim, Hee-Seon; Son, Minji; Kim, Minbeom; Kwon, Myung-Joon; Kim, Changick*",poster,,,,,,,,, Tracing the Origin of Adversarial Attack for Forensic Investigation and Deterrence,"Fang, Han*; Zhang, Jiyi; Qiu, Yupeng; Liu, Jiayang; Xu, Ke; Fang, Chengfang; Chang, Ee-Chien",poster,2301.01218,https://arxiv.org/abs/2301.01218,,https://huggingface.co/papers/2301.01218,,,,6,0 Downstream-agnostic Adversarial Examples,"Zhou, Ziqi; Hu, Shengshan*; Zhao, Ruizhi; Wang, Qian; ZHANG, LEO YU; Hou, Junhui; Jin, Hai",poster,2307.12280,https://arxiv.org/abs/2307.12280,,https://huggingface.co/papers/2307.12280,,,,7,0 Hiding Visual Information 
via Obfuscating Adversarial Perturbations,"Su, Zhigang; Zhou, Dawei; Liu, Decheng; Wang, Nannan*; Wang, Zhen; Gao, Xinbo",poster,2209.15304,https://arxiv.org/abs/2209.15304,,https://huggingface.co/papers/2209.15304,,,,6,0 An Embarrassingly Simple Self-supervised Trojan Attack,"Li, Changjiang *; Ren, Pang; Xi, Zhaohan; Du, Tianyu; Ji, Shouling; Wang, Ting; Yao, Yuan",poster,,,,,,,,, Efficient Decision-based Black-box Patch Attacks on Video Recognition ,"Jiang, Kaixun*; Chen, Zhaoyu; Huang, Hao; Wang, Jiafeng; Yang, Dingkang; Li, Bo; Wang, Yan; Zhang, Wenqiang",poster,,,,,,,,, Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff,"Suzuki, Satoshi*; Yamaguchi, Shin'ya; Takeda, Shoichiro; Kanai, Sekitoshi; makishima, naoki; Ando, Atsushi; Masumura, Ryo",poster,2308.16454,https://arxiv.org/abs/2308.16454,,https://huggingface.co/papers/2308.16454,,,,7,0 Towards Building More Robust Models with Frequency Bias,"Bu, Qingwen*; HUANG, Dong; Cui, Heming ",poster,2307.09763,https://arxiv.org/abs/2307.09763,,https://huggingface.co/papers/2307.09763,,,,3,1 System-Driven Adversarial Object Evasion Attack in Autonomous Driving,"Wang, Ningfei*; Luo, Yunpeng; SATO, TAKAMI; Xu, Kaidi; Chen, Alfred",poster,,,,,,,,, Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning,"Zhu, Kaijie*; Hu, Xixu; Wang, Jindong; Xie, Xing; Yang, Ge",poster,2308.02533,https://arxiv.org/abs/2308.02533,https://github.com/microsoft/robustlearn,https://huggingface.co/papers/2308.02533,,,,5,0 Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation,"Liu, Xuannan; Zhong, Yaoyao; Zhang, Yuhang; Qin, lixiong; Deng, Weihong*",poster,2308.06015,https://arxiv.org/abs/2308.06015,https://github.com/liuxuannan/Stochastic-Gradient-Aggregation,https://huggingface.co/papers/2308.06015,,,,5,0 Unified Adversarial Patch for Cross-modal Attacks in the Physical World,"Wei, Xingxing; Huang, Yao*; Sun, Yitong; Yu, 
Jie",poster,2307.07859,https://arxiv.org/abs/2307.07859,,https://huggingface.co/papers/2307.07859,,,,4,0 RFLA: A Stealthy Reflected Light Adversarial Attack in the Physical World,"Wang, Donghua*; Yao, Wen; Jiang, Tingsong; Li, Chao; Chen, Xiaoqian",poster,2307.07653,https://arxiv.org/abs/2307.07653,,https://huggingface.co/papers/2307.07653,,,,5,0 Enhancing Fine-Tuning based Backdoor Defense with Sharpness-Aware Minimization,"Zhu, Mingli*; Wei, Shaokui; Shen, Li; Fan, Yanbo; Wu, Baoyuan",poster,2304.11823,https://arxiv.org/abs/2304.11823,,https://huggingface.co/papers/2304.11823,,,,5,0 Conditional 360-degree Image Synthesis for Immersive Indoor Scene Decoration,"Shum, Ka-Chun*; Pang, Hong Wing; Hua, Binh-Son; Nguyen, Thanh; Yeung, Sai-Kit",poster,2307.09621,https://arxiv.org/abs/2307.09621,,https://huggingface.co/papers/2307.09621,,,,5,0 An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial Transferability,"Chen, Bin; Yin, Jia-Li*; Chen, Shu-Kai; Chen, Bo-Hao; Liu, Ximeng",poster,2308.02897,https://arxiv.org/abs/2308.02897,,https://huggingface.co/papers/2308.02897,,,,5,0 Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning,"Lee, Byung-Kwan*; Kim, Junho; Ro, Yong Man",poster,2307.07250,https://arxiv.org/abs/2307.07250,,https://huggingface.co/papers/2307.07250,,,,3,0 LEA2: A Lightweight Ensemble Adversarial Attack via Non-overlapping Vulnerable Frequency Regions,"QIAN, Yaguan*; He, Shuke; Zhao, Chenyu; Sha, Jia Qiang; Wang, Wei; WANG , Bin",poster,,,,,,,,, Explaining Adversarial Robustness of Neural Networks from Clustering Effect Perspective,"Jin, Yulin*; Zhang, Xiaoyu; Lou, Jian; Ma, Xu; Chen, Xiaofeng; Wang, Zilong",poster,,,,,,,,, VertexSerum: Poisoning Graph Neural Networks for Link Inference,"Ding, Ruyi*; Duan, Shijin; Xu, Xiaolin; Fei, Yunsi",poster,2308.01469,https://arxiv.org/abs/2308.01469,,https://huggingface.co/papers/2308.01469,,,,4,0 How to choose your best allies for a 
transferable attack?,"Maho, Thibault*; Moosavi-Dezfooli, Seyed-Mohsen; Furon, Teddy",poster,2304.02312,https://arxiv.org/abs/2304.02312,https://github.com/t-maho/transferability_measure_fit,https://huggingface.co/papers/2304.02312,,,,3,0 Enhancing Adversarial Robustness in Semi-Supervised Learning via Adaptively Weighted Regularization and Knowledge Distillation,"Yang, Dongyoon; Kong, Insung; Kim, Yongdai*",poster,,,,,,,,, AdvDiffuser: Natural Adversarial Example Synthesis with Diffusion Models,"Chen, Xinquan; Gao, Xitong*; zhao, juanjuan; Ye, Kejiang; Xu, Cheng-Zhong",poster,,,,,,,,, FnF Attack Adversarial Attack against Multiple Object Trackers by Inducing False Negatives and False Positives,"Zhou, Tao*; Luo, Wenhan; Ye, Qi; Zhang, Kaihao; Shi, Zhiguo; Chen, Jiming",poster,,,,,,,,, Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis,"Struppek, Lukas*; Hintersdorf, Dominik; Kersting, Kristian",poster,2211.02408,https://arxiv.org/abs/2211.02408,,https://huggingface.co/papers/2211.02408,,,,3,1 Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient,"Lu, Zhengzhi; Wang, He; Chang, Ziyi; Yang, Guoan; Shum, Hubert P. 
H.*",poster,2308.05681,https://arxiv.org/abs/2308.05681,,https://huggingface.co/papers/2308.05681,,,,5,0 Structure Invariant Transformation for better Adversarial Transferability,"Wang, Xiaosen*; Zhang, Zeliang; Zhang, Jianping",poster,,,,,,,,, Beating Backdoor Attack at Its Own Game,"Liu, Min*; Sangiovanni-Vincentelli, Alberto L; Yue, Xiangyu",poster,2307.15539,https://arxiv.org/abs/2307.15539,https://github.com/damianliumin/non-adversarial_backdoor,https://huggingface.co/papers/2307.15539,,,,3,0 Transferable Adversarial Attack for Both Vision Transformers and Convolutional Networks via Momentum Integrated Gradients,"Ma, Wenshuo*; Li, Yidong; Xiaofeng, Jia; Xu, Wei",poster,,,,,,,,, REAP: A Large-Scale Realistic Adversarial Patch Benchmark,"Hingun, Nabeel; Sitawarin, Chawin*; Li, Jerry; Wagner, David",poster,2212.05680,https://arxiv.org/abs/2212.05680,https://github.com/wagner-group/reap-benchmark,https://huggingface.co/papers/2212.05680,,,,4,1 Multi-metrics adaptively identifies backdoors in Federated learning,"Huang, Siquan*; Li, Yijiang; Chen, Chong; Shi, Leyu; Gao, Ying",poster,2303.06601,https://arxiv.org/abs/2303.06601,,https://huggingface.co/papers/2303.06601,,,,5,0 Backpropagation Path Search On Adversarial Transferability,"Xu, Zhuoer*; Gu, Zhangxuan; Zhang, Jianping; Cui, Shiwen; Meng, Changhua; Wang, Weiqiang",poster,2308.07625,https://arxiv.org/abs/2308.07625,,https://huggingface.co/papers/2308.07625,,,,6,0 Fast Adaptation of Neural Networks using Test-Time Feedback,"Yeo, Teresa*; Kar, Oğuzhan Fatih; Sodagar, Zahra; Zamir, Amir",poster,,,,,,,,, One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training,"Dong, Jianshuo; Qiu, Han; Li, Yiming*; Zhang, Tianwei; Li, Yuanjie; Lai, Zeqi; Zhang, Chao; Xia, Shu-Tao",poster,2308.07934,https://arxiv.org/abs/2308.07934,https://github.com/jianshuod/TBA,https://huggingface.co/papers/2308.07934,,,,8,0 PolicyCleanse: Backdoor Detection and Mitigation for Competitive Reinforcement Learning,"guo, junfeng*; 
Li, Ang; Wang, Lixu; Liu, Cong",poster,,,,,,,,, Towards Viewpoint-Invariant Visual Recognition via Adversarial Training,"ruan, shouwei*; Dong, Yinpeng; Su, Hang; Jianteng, Peng; Chen, Ning; Wei, Xingxing",poster,2307.10235,https://arxiv.org/abs/2307.10235,,https://huggingface.co/papers/2307.10235,,,,6,0 Fast Adversarial Training with Smooth Convergence,"Zhao, MN*; Zhang, Lihe; Kong, Yuqiu; Yin, Baocai ",poster,2308.12857,https://arxiv.org/abs/2308.12857,https://github.com/FAT-CS/ConvergeSmooth,https://huggingface.co/papers/2308.12857,,,,4,0 The Perils of Learning From Unlabeled Data: Backdoor Attacks on Semi-supervised Learning,"Shejwalkar, Virat*; Lyu, Lingjuan; Houmansadr, Amir",poster,2211.00453,https://arxiv.org/abs/2211.00453,,https://huggingface.co/papers/2211.00453,,,,3,0 Boosting Adversarial Transferability via Gradient Relevance Attack,"Zhu, Hegui; Ren, Yuchen*; Sui, Xiaoyan; Yang, Lianping; Jiang, Wuming",poster,,,,,,,,, Towards Robust Model Watermark via Reducing Parametric Vulnerability,"Gan, Guanhao*; Li, Yiming; Wu, Dongxian; Xia, Shu-Tao",poster,,,,,,,,, TRM-UAP: Enhancing the Transferability of Data-Free Universal Adversarial Perturbation via Truncated Ratio Maximization,"Liu, Yiran; Feng, Xin; Wang, Yunlong; Yang, Wu; Ming, Di*",poster,,,,,,,,, Enhancing Privacy Preservation in Federated Learning via Learning Rate Perturbation,"Wan, Guangnian*; haitao, du; Yuan, Xuejing; Xu, Jie; Jun, Yang; chen, meiling",poster,,,,,,,,, TARGET: Federated Class-Continual Learning via Exemplar-Free Distillation,"Zhang, Jie*; Chen, Chen; Zhuang, Weiming; Lyu, Lingjuan",poster,2303.06937,https://arxiv.org/abs/2303.06937,,https://huggingface.co/papers/2303.06937,,,,4,0 FACTS: First Amplify Correlations and Then Slice to Discover Bias,"Yenamandra, Sriram*; Ramesh, Pratik; Prabhu, Viraj; Hoffman, Judy",poster,,,,,,,,, Computation and Data Efficient Backdoor Attacks,"WU, YUTONG*; HAN, XINGSHUO; Qiu, Han; Zhang, Tianwei",poster,,,,,,,,, Global Balanced Experts for 
Federated Long-tailed Learning,"Zeng, Yaopei; Liu, Lei; Liu, Li; Shen, Li; Liu, Shaoguo ; Wu, Baoyuan*",poster,,,,,,,,, Source-free Domain Adaptive Human Pose Estimation,"Peng, Qucheng*; Zheng, Ce; Chen, Chen",poster,2308.03202,https://arxiv.org/abs/2308.03202,https://github.com/davidpengucf/SFDAHPE,https://huggingface.co/papers/2308.03202,,,,3,0 Gender Artifacts in Visual Datasets,"Meister, Nicole*; Zhao, Dorothy; Wang, Angelina; Ramaswamy, Vikram V.; Russakovsky, Olga; Fong, Ruth C",poster,2206.09191,https://arxiv.org/abs/2206.09191,,https://huggingface.co/papers/2206.09191,,,,6,0 FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation ,"Chen, Haokun*; Frikha, Ahmed; Krompass, Denis; Gu, Jindong; Tresp, Volker",poster,,,,,,,,, zPROBE: Zero Peek Robustness Checks for Federated Learning,"Ghodsi, Zahra; Javaheripi, Mojan; Sheybani, Nojan*; Zhang, Xinqiao; Huang, Ke; Koushanfar, Farinaz",poster,2206.12100,https://arxiv.org/abs/2206.12100,,https://huggingface.co/papers/2206.12100,,,,6,0 Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study,"Ko, Myeongseob*; Jin, Ming; Wang, Chenguang; Jia, Ruoxi",poster,,,,,,,,, FedPD: Federated Open Set Recognition with Parameter Disentanglement,"YANG, Chen*; Zhu, Meilu; Liu, Yifan; Yuan, Yixuan",poster,,,,,,,,, MUter: Machine Unlearning for Adversarial Training Models,"Liu, Junxu; Xue, Mingsheng; Lou, Jian*; Zhang, Xiaoyu; Xiong, Li; Qin, Zhan",poster,,,,,,,,, Beyond Skin Tone: A Multidimensional Measure of Apparent Skin Color,"Thong, William*; Joniak, Przemyslaw; Xiang, Alice",poster,,,,,,,,, A Multidimensional Analysis of Social Biases in Vision Transformers,"Brinkmann, Jannik*; Swoboda, Paul; Bartelt, Christian",poster,2308.01948,https://arxiv.org/abs/2308.01948,https://github.com/jannik-brinkmann/social-biases-in-vision-transformers,https://huggingface.co/papers/2308.01948,,,,3,0 Partition-And-Debias: Agnostic Biases Mitigation via A Mixture of 
Biases-Specific Experts,"Li, Jiaxuan*; Vo, Duc Minh; Nakayama, Hideki",poster,,,,,,,,, Rethinking Data Distillation: Do Not Overlook Calibration,"Zhu, Dongyao; Lei, Bowen; Zhang, Jie; Fang, Yanbo; Xie, Yiqun; Zhang, Ruqi; Xu, Dongkuan*",poster,2307.12463,https://arxiv.org/abs/2307.12463,,https://huggingface.co/papers/2307.12463,,,,7,0 Mining bias-target Alignment from Voronoi Cells,"Nahon, Remi*; Nguyen, Van-Tam; Tartaglione, Enzo",poster,2305.03691,https://arxiv.org/abs/2305.03691,,https://huggingface.co/papers/2305.03691,,,,3,0 Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image Classification,"Chiu, Ming-Chang*; Chen, Pin-Yu; Ma, Xuezhe",poster,,,,,,,,, GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization,"Fang, Hao*; Chen, Bin; Wang, Xuan; Wang, Zhi; Xia, Shu-Tao",poster,2308.04699,https://arxiv.org/abs/2308.04699,,https://huggingface.co/papers/2308.04699,,,,5,0 Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation,"Liang, Hao*; Perona, Pietro; Balakrishnan, Guha",poster,2308.05441,https://arxiv.org/abs/2308.05441,,https://huggingface.co/papers/2308.05441,,,,3,0 FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning,"Sun, Guangyu*; Mendieta, Matias; Luo, Jun; Wu, Shandong; Chen, Chen",poster,2308.09160,https://arxiv.org/abs/2308.09160,,https://huggingface.co/papers/2308.09160,,,,5,0 Towards Attack-tolerant Federated Learning via Critical Parameter Analysis,"Han, Sungwon*; Park, Sungwon; Wu, Fangzhao; Kim, Sundong; Zhu, Bin; Xie, Xing; Cha, Meeyoung",poster,2308.09318,https://arxiv.org/abs/2308.09318,,https://huggingface.co/papers/2308.09318,,,,7,0 What can Discriminator do? 
Towards Box-free Ownership Verification of Generative Adversarial Networks,"Huang, Ziheng; Li, Boheng; Cai, Yan; Wang, Run*; Guo, Shangwei ; Fang, Liming; Chen, Jing; Wang, Lina",poster,,,,,,,,, Robust Heterogeneous Federated Learning under Data Corruption,"Fang, Xiuwen; Ye, Mang*; Yang, Xiyuan",poster,,,,,,,,, Communication-efficient Federated Learning with Single-Step Synthetic Features Compressor for Faster Convergence,"Zhou, Yuhao*; Shi, Mingjia; Li, Yuanxi; Sun, Yanan; Ye, Qing; Lv, Jiancheng",poster,2302.13562,https://arxiv.org/abs/2302.13562,,https://huggingface.co/papers/2302.13562,,,,6,0 GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning,"Zhang, Jianqing; Hua, Yang; Wang, Hao; Song, Tao; XUE, Zhengui; Ma, Ruhui*; Cao, Jian; Guan, Haibing",poster,2308.10279,https://arxiv.org/abs/2308.10279,,https://huggingface.co/papers/2308.10279,,,,8,0 MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention,"Zeng, Wenxuan; Li, Meng*; Xiong, Wenjie; Tong, Tong; Lu, Wen-jie; Tan, Jin; Wang, Runsheng; Huang, Ru",poster,2211.13955,https://arxiv.org/abs/2211.13955,https://github.com/PKU-SEC-Lab/mpcvit,https://huggingface.co/papers/2211.13955,,,,8,0 Identification of Systematic Errors of Image Classifiers on Rare Subgroups,"Metzen, Jan Hendrik*; Hutmacher, Robin; Hua, N. 
Grace; Boreiko, Valentyn; Zhang, Dan",poster,2303.05072,https://arxiv.org/abs/2303.05072,,https://huggingface.co/papers/2303.05072,,,,5,1 Adaptive Image Anonymization in the Context of Image Classification with Neural Networks,"Shvai, Nadiya*; Llanza, Arcadi; nakib, amir",poster,,,,,,,,, When Do Curricula Work in Federated Learning?,"Vahidian, Saeed; Kadaveru, Sreevatsank; Baek, Woonjoon*; Wang, Weijia; Kungurtsev, Vyacheslav; Chen, Chen; Shah, Mubarak; Lin, Bill",poster,2212.12712,https://arxiv.org/abs/2212.12712,,https://huggingface.co/papers/2212.12712,,,,8,0 Domain Specified Optimization for Deployment Authorization,"Wang, Haotian*; Chi, Haoang; Yang, Wenjing; Lin, Zhipeng; Geng, Mingyang; lan, long; Zhang, Jing; Tao, Dacheng",poster,,,,,,,,, STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition,"Li, Ming*; Xu, Xiangyu; Fan, Hehe; Zhou, Pan; Liu, Jun; Liu, Jia-Wei; Li, Jiahe; Keppo, Jussi; Shou, Mike Zheng; Yan, Shuicheng",poster,2301.03046,https://arxiv.org/abs/2301.03046,,https://huggingface.co/papers/2301.03046,,,,10,0 SAL-ViT: Towards Latency Efficient Private Inference on ViT using Selective Attention Search with a Learnable Softmax Approximation,"Zhang, Yuke*; Chen, Dake; Kundu, Souvik; Li, Chenghao; A. 
Beerel, Peter",poster,,,,,,,,, Generative Gradient Inversion Without Prior,"Zhang, Chi*; Xiaoman, Zhang; Sotthiwat, Ekanut; Xu, Yanyu; Liu, Ping; Zhen, Liangli; Liu, Yong",poster,,,,,,,,, Inspecting the Geographical Representativeness of Images from Text-to-Image Models,"Basu, Abhipsa*; RADHAKRISHNAN, Venkatesh Babu; Pruthi, Danish",poster,2305.11080,https://arxiv.org/abs/2305.11080,,https://huggingface.co/papers/2305.11080,,,,3,0 Divide and Conquer: a Two-Step Method for High Quality Face De-identification with Model Explainability,"Wen, Yunqian*; Liu, Bo; Cao, Jingyi; Xie, Rong; Song, Li",poster,,,,,,,,, Exploring the Benefits of Visual Prompting in Differential Privacy,"Li, Yizhe; Tsai, Yu-Lin; Yu, Chia-Mu*; Chen, Pin-Yu; Ren, Xuebin",poster,2303.12247,https://arxiv.org/abs/2303.12247,https://github.com/EzzzLi/Prompt-PATE,https://huggingface.co/papers/2303.12247,,,,5,0 Towards Fairness-aware Adversarial Network Pruning,"Wang, Zhibo*; Zhang, Lei; Dong, Xiaowei; Feng, Yunhe; Pang, Xiaoyi; Zhang, Zhifei; Ren, Kui",poster,,,,,,,,, AutoReP: Automatic ReLU Replacement for Fast Private Network Inference,"Peng, Hongwu*; Huang, Shaoyi; Zhou, Tong; Luo, Yukui; Wang, Chenghong; Wang, Zigeng; Zhao, Jiahui; Xie, Xi; Li, Ang; Geng, Tony; Mahmood, Kaleel; Wen, Wujie; Xu, Xiaolin; Ding, Caiwen",poster,2308.10134,https://arxiv.org/abs/2308.10134,,https://huggingface.co/papers/2308.10134,,,,14,0 Flatness-Aware Minimization for Domain Generalization,"Zhang, Xingxuan*; Xu, Renzhe; Yu, Han; Dong, Yancheng; Tian, Pengfei; Cui, Peng",poster,2307.11108,https://arxiv.org/abs/2307.11108,,https://huggingface.co/papers/2307.11108,,,,6,1 Communication-Efficient Vertical Federated Learning with Limited Overlapping Samples,"Sun, Jingwei*; Xu, Ziyue; Yang, Dong; Nath, Vishwesh; Li, Wenqi; Zhao, Can; Xu, Daguang; Chen, Yiran; Roth, Holger R",poster,2303.16270,https://arxiv.org/abs/2303.16270,,https://huggingface.co/papers/2303.16270,,,,9,0 Multimodal Distillation for Egocentric Action 
Recognition,"Radevski, Gorjan*; Grujicic, Dusan; Blaschko, Matthew B.; Moens, Sien; Tuytelaars, Tinne",poster,2307.07483,https://arxiv.org/abs/2307.07483,https://github.com/gorjanradevski/multimodal-distillation,https://huggingface.co/papers/2307.07483,,,,5,0 Self-Supervised Object Detection from Egocentric Videos,"Akiva, Peri*; Huang, Jing ; Liang, Kevin J; Chen, Xingyu; Kovvuri, Rama; Feiszli, Matt; Dana, Kristin; Hassner, Tal",poster,,,,,,,,, Multi-label affordance mapping from egocentric vision,"Mur-Labadia, Lorenzo*; Guerrero, Josechu; Martinez-Cantin, Ruben",poster,,,,,,,,, Ego-Only: Egocentric Action Detection without Exocentric Transferring,"Wang, Huiyu*; Singh, Mitesh Kumar; Torresani, Lorenzo",poster,,,,,,,,, COPILOT: Human Collision Prediction and Localization from Egocentric Videos,"Pan, Boxiao*; Shen, Bokui; Rempe, Davis; Paschalidou, Despoina; Mo, Kaichun; Yang, Yanchao; Guibas, Leonidas",poster,,,,,,,,, A New Framework for Egocentric Hand-Object Interaction Understanding,"Xu, Yue; Li, Yong-Lu*; Huang, Zhemin; LIU, Michael Xu; Lu, Cewu; Tai, Yu-Wing; Tang, Chi-Keung",poster,,,,,,,,, EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone,"Pramanick, Shraman*; Song, Yale; Nag, Sayan; Lin, Qinghong; Shah, Hardik; Shou, Mike Zheng; Chellappa, Rama; Zhang, Pengchuan",poster,2307.05463,https://arxiv.org/abs/2307.05463,,https://huggingface.co/papers/2307.05463,,,,8,2 WDiscOOD: Out-of-Distribution Detection via Whitened Linear Discriminative Analysis,"Chen, Yiye*; Lin, Yunzhi; Xu, Ruinian; Vela, Patricio A",poster,,,,,,,,, Pairwise Similarity Learning is SimPLE,"Wen, Yandong; Liu, Weiyang*; Feng, Yao; Raj, Bhiksha; Singh, Rita; Weller, Adrian; Black, Michael J.; Schölkopf, Bernhard",poster,,,,,,,,, No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier,"Li, Zexi*; Shang, Xinyi; He, Rui; Lin, Tao; Wu, 
Chao",poster,2303.10058,https://arxiv.org/abs/2303.10058,,https://huggingface.co/papers/2303.10058,,,,5,0 Generalizable Neural Fields as Partially Observed Neural Processes,"Gu, Jeffrey*; Wang, Kuan-Chieh; Yeung, Serena",poster,,,,,,,,, M2T: Masking Transformers Twice for Faster Decoding,"Mentzer, Fabian*; Agustsson, Eirikur; Tschannen, Michael",poster,2304.07313,https://arxiv.org/abs/2304.07313,,https://huggingface.co/papers/2304.07313,,,,3,0 Keep it SimPool: Who said supervised transformers suffer from attention deficit?,"Psomas, Bill*; Kakogeorgiou, Ioannis; Karantzalos, Konstantinos; Avrithis, Yannis",poster,,,,,,,,, Improving Pixel-based MIM by Reducing Wasted Modeling Capability,"Liu, Yuan*; Zhang, Songyang; Chen, Jiacheng; Zhaohui, Yu; Chen, Kai; Lin, Dahua",poster,2308.00261,https://arxiv.org/abs/2308.00261,https://github.com/open-mmlab/mmpretrain,https://huggingface.co/papers/2308.00261,,,,6,0 Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration,"Liu, Kechun*; Jiang, Yitong ; Choi, Inchang; Gu, Jinwei",poster,2306.06513,https://arxiv.org/abs/2306.06513,,https://huggingface.co/papers/2306.06513,,,,4,0 Quality Diversity for Visual Pre-Training,"Chavhan, Ruchika*; Gouk, Henry; Li, Da; Hospedales, Timothy",poster,,,,,,,,, Subclass-balancing Contrastive Learning for Long-tailed Recognition,"Hou, Chengkai*; Zhang, Jieyu; Wang, Haonan; Zhou, Tianyi",poster,2306.15925,https://arxiv.org/abs/2306.15925,,https://huggingface.co/papers/2306.15925,,,,4,0 Mastering Spatial Graph Prediction of Road Networks,"Anagnostidis, Sotirios-Konstantinos*; Lucchi, Aurelien; Hofmann, Thomas",poster,2210.00828,https://arxiv.org/abs/2210.00828,,https://huggingface.co/papers/2210.00828,,,,3,0 Poincaré ResNet,"van Spengler, Max WF*; Berkhout, Erwin; Mettes, Pascal",poster,2303.14027,https://arxiv.org/abs/2303.14027,https://github.com/maxvanspengler/poincare-resnet,,,,,, Exploring Model Transferability through the Lens of Potential Energy,"Li, Xiaotong*; Hu, Zixuan; Ge, 
Yixiao; Shan, Ying; Duan, Lingyu",poster,2308.15074,https://arxiv.org/abs/2308.15074,https://github.com/lixiaotong97/PED,https://huggingface.co/papers/2308.15074,,,,5,0 Improving CLIP Fine-tuning Performance,"Wei, Yixuan; Hu, Han*; Xie, Zhenda; Liu, Ze; Zhang, Zheng; Cao, Yue; Bao, Jianmin; Chen, Dong; Guo, Baining",poster,,,,,,,,, Unsupervised Manifold Linearizing and Clustering,"Ding, Tianjiao*; Tong, Peter; Chan, Kwan Ho Ryan; Dai, Xili; Ma, Yi; Haeffele, Benjamin D",poster,2301.01805,https://arxiv.org/abs/2301.01805,,https://huggingface.co/papers/2301.01805,,,,6,0 Generalized Sum Pooling for Metric Learning,"Gurbuz, Yeti Z.*; Sener, Ozan; Alatan, Aydin",poster,2308.09228,https://arxiv.org/abs/2308.09228,,https://huggingface.co/papers/2308.09228,,,,3,0 Partition Speeds Up Learning Implicit Neural Representations Based on Exponential-Increase Hypothesis,"Liu, Ke*; Liu, Feng; Wang, Haishuai; Ma, Ning; Bu, Jiajun; Han, Bo",poster,,,,,,,,, The effectiveness of MAE pre-pretraining for billion-scale pretraining,"Singh, Mannat*; Duval, Quentin; Alwala, Kalyan Vasudev; Fan, Haoqi; Aggarwal, Vaibhav; Adcock, Aaron; Joulin, Armand; Dollar, Piotr; Feichtenhofer, Christoph; Girshick, Ross; Girdhar, Rohit; Misra, Ishan",poster,2303.13496,https://arxiv.org/abs/2303.13496,,https://huggingface.co/papers/2303.13496,,,,12,0 Token-Label Alignment for Vision Transformers,"Xiao, Han; Zheng, Wenzhao; Zhu, Zheng; Zhou, Jie; Lu, Jiwen*",poster,2210.06455,https://arxiv.org/abs/2210.06455,https://github.com/Euphoria16/TL-Align,https://huggingface.co/papers/2210.06455,,,,5,0 Efficiently Robustify Pre-Trained Models,"Jain, Nishant*; Behl, Harkirat Singh; Rawat, Yogesh; Vineet, Vibhav",poster,,,,,,,,, OFVL-MS: Once for Visual Localization across Multiple Indoor Scenes,"Xie, Tao*; dai, kun; Lu, Siyi; Wang, Ke; jiang, zhiqiang; Gao, Jinghan; Liu, Dedong; Xu, Jie; Zhao, Lijun; Li, Ruifeng",poster,,,,,,,,, Feature Prediction Diffusion Model for Video Anomaly Detection,"Yan, Cheng*; Shiyu, 
Zhang; Liu, Yang; Pang, Guansong; Wang, Wenjun",poster,,,,,,,,, Joint Implicit Neural Representation for High-fidelity and Compact Vector Fonts,"Chen, Chia-Hao*; Liu, Ying-Tian; Zhang, Zhifei; Guo, Yuan-Chen; Zhang, Song-Hai ",poster,,,,,,,,, How Far Pre-trained Models Are from Neural Collapse on the Target Dataset Informs their Transferability,"Wang, Zijian*; Luo, Yadan; Zheng, Liang; Huang, Zi Helen; Baktashmotlagh, Mahsa",poster,,,,,,,,, OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions,"Wang, Chengkun; Zheng, Wenzhao; Zhu, Zheng; Zhou, Jie; Lu, Jiwen*",poster,2210.05557,https://arxiv.org/abs/2210.05557,https://github.com/wangck20/OPERA,https://huggingface.co/papers/2210.05557,,,,5,0 Perceptual Grouping in Contrastive Vision-Language Models,"Ranasinghe, Kanchana N*; McKinzie, Brandon S; Ravi, Sachin; Yang, Yinfei; Toshev, Alexander; Shlens, Jonathon",poster,2210.09996,https://arxiv.org/abs/2210.09996,,https://huggingface.co/papers/2210.09996,,,,6,1 Fully Attentional Networks with Self-emerging Token Labeling,"Zhao, Bingyin*; Yu, Zhiding; Lan, Shiyi; Cheng, Yutao; Anandkumar, Animashree; Lao, Yingjie; Alvarez, Jose M",poster,,,,,,,,, Instance and Category Supervision are Alternate Learners for Continual Learning,"Tian, Xudong; Zhang, Zhizhong; Tan, Xin; Liu, Jun; Wang, Chengjie; Qu, Yanyun; Jiang, Guannan; Xie, Yuan*",poster,,,,,,,,, SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training,"Yan, Hong; Liu, Yang*; Wei, Yushen; Li, Zhen; Li, Guanbin; Lin, Liang",poster,2307.08476,https://arxiv.org/abs/2307.08476,https://github.com/HongYan1123/SkeletonMAE,https://huggingface.co/papers/2307.08476,,,,6,0 Motion-Guided Masking for Spatiotemporal Representation Learning,"Fan, David*; Wang, Jue; Liao, Shuai; Zhu, Yi; Bhat, Vimal; Santos-Villalobos, Hector J; MV, Rohith; Li, Xinyu",poster,2308.12962,https://arxiv.org/abs/2308.12962,,https://huggingface.co/papers/2308.12962,,,,8,0 Data Augmented Flatness-aware Gradient 
Projection for Continual Learning,"Yang, Enneng*; Shen, Li; Wang, Zhenyi; Liu, Shiwei; Guo, Guibing; Wang, Xingwei",poster,,,,,,,,, Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models,"Wang, Ziyi; Yu, Xumin; Rao, Yongming; Zhou, Jie; Lu, Jiwen*",poster,,,,,,,,, BiViT: Extremely Compressed Binary Vision Transformers,"He, Yefei*; Zhenyu, Lou; Zhang, Luoming; Liu, Jing; Wu, Weijia; Zhuang, Bohan; ZHOU, HONG",poster,,,,,,,,, Spatio-Temporal Crop Aggregation for Video Representation Learning,"Sameni, Sepehr*; Jenni, Simon; Favaro, Paolo",poster,2211.17042,https://arxiv.org/abs/2211.17042,,https://huggingface.co/papers/2211.17042,,,,3,0 "Contextual, Discriminative, and Unbiased Compositional Zero-Shot Learning","Kim, Hanjae*; Lee, Jiyoung; Park, Seongheon; Sohn , Kwanghoon",poster,,,,,,,,, Semantic Information in Contrastive Learning,"Quan, Shengjiang*; Hirano, Masahiro; Yamakawa, Yuji",poster,,,,,,,,, Cross-Domain Product Representation Learning for Rich-Content E-Commerce,"bai, xuehan; Li, Yan; Cheng, Yanhua; Yang, Wenjie; Chen, Quan*; Li, Han",poster,,,,,,,,, Contrastive Continuity on Augmentation Stability Rehearsal for Continual Self-Supervised Learning,"Cheng, Haoyang*; Wen, Haitao; Zhang, Xiaoliang; Qiu, Heqian; Wang, Lanxiao; Li, Hongliang",poster,,,,,,,,, HybridAugment++: Unified Frequency Spectra Perturbations for Model Robustness,"Yücel, Mehmet Kerim*; Cinbis, Ramazan Gokberk; Duygulu, Pinar",poster,2307.11823,https://arxiv.org/abs/2307.11823,,https://huggingface.co/papers/2307.11823,,,,3,1 Unleashing Text-to-Image Diffusion Models for Visual Perception,"Zhao, Wenliang; Rao, Yongming; Liu, Zuyan; Liu, Benlin; Zhou, Jie; Lu, Jiwen*",poster,2303.02153,https://arxiv.org/abs/2303.02153,https://github.com/wl-zhao/VPD,https://huggingface.co/papers/2303.02153,,,,6,0 Efficient Controllable Multi-Task Architectures,"Aich, Abhishek*; Schulter, Samuel; Roy-Chowdhury, Amit K. 
; Chandraker, Manmohan; Suh, Yumin",poster,2308.11744,https://arxiv.org/abs/2308.11744,,https://huggingface.co/papers/2308.11744,,,,5,0 ParCNetV2: Oversized Kernel with Enhanced Attention,"Xu, Ruihan; Zhang, Haokui; Hu, Wenze; Zhang, Shiliang*; Wang, Xiaoyu",poster,2211.07157,https://arxiv.org/abs/2211.07157,https://github.com/XuRuihan/ParCNetV2,https://huggingface.co/papers/2211.07157,,,,5,0 Unleashing the Power of Gradient Signal-to-Noise Ratio for Zero-Shot NAS,"Sun, Zihao*; Sun, Yu; Yang, Longxing; Lu, Shun; Mei, Jilin; Zhao, Wenxiao; Hu, Yu",poster,,,,,,,,, MMST-ViT: Climate Change-aware Crop Yield Prediction via Multi-Modal Spatial-Temporal Vision Transformer,"Lin, Fudong*; Crawford, Summer D; Guillot, Kaleb J; Zhang, Yihe; Chen, Yan; Yuan, Xu; Chen, Li; Williams, Shelby A; Minvielle, Robert; Xiao, Xiangming; M Gholson, Drew M; QUINTANA ASHWELL, NICOLAS E; Setiyono, Tri D; Tubana, Brenda; Peng, Lu; Bayoumi, Magdy; Tzeng, Nian-Feng",poster,,,,,,,,, FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization,"Anasosalu Vasu, Pavan Kumar*; Gabriel, James; Zhu, Jeff X; Tuzel, Oncel; Ranjan, Anurag",poster,2303.14189,https://arxiv.org/abs/2303.14189,https://github.com/apple/ml-fastvit,https://huggingface.co/papers/2303.14189,,,,5,0 IIEU: Rethinking Neural Feature Activation from Decision-Making,"Cai, Sudong*",poster,,,,,,,,, Scratching Visual Transformer's Back with Uniform Attention,"Hyeon-Woo, Nam*; Yu-Ji, Kim; Heo, Byeongho; Han, Dongyoon; Oh, Seong Joon; Oh, Tae-Hyun",poster,2210.08457,https://arxiv.org/abs/2210.08457,,https://huggingface.co/papers/2210.08457,,,,6,0 SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference,"Wang, Xudong; Zhang, Li Lyna*; Xu, Jiahang; Zhang, Quanlu; Wang, Yujing; Yang, Yuqing; Zheng, Ningxin; Cao, Ting; Yang, Mao",poster,2303.08308,https://arxiv.org/abs/2303.08308,,https://huggingface.co/papers/2303.08308,,,,9,1 ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision 
Transformer on Diverse Mobile Devices,"Tang, Chen; Zhang, Li Lyna*; Jiang, Huiqiang; Xu, Jiahang; Cao, Ting; Zhang, Quanlu; Yang, Yuqing; Wang, Zhi; Yang, Mao",poster,2303.09730,https://arxiv.org/abs/2303.09730,,https://huggingface.co/papers/2303.09730,,,,9,1 Gramian Attention Heads are Strong yet Efficient Vision Learners,"Ryu, Jongbin*; Han, Dongyoon; Lim, Jongwoo",poster,,,,,,,,, EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones,"Wang, Yulin; Yue, Yang; Lu, Rui; Liu, Tianjiao; Zhong, Zhao; Song, Shiji; Huang, Gao*",poster,2211.09703,https://arxiv.org/abs/2211.09703,https://github.com/LeapLabTHU/EfficientTrain,https://huggingface.co/papers/2211.09703,,,,7,0 Ord2Seq: Regard Ordinal Regression as Label Sequence Prediction,"Wang, Jinhong*; Cheng, Yi; Chen, Jintai; Chen, Tingting; Chen, Danny Z; Wu, Jian",poster,,,,,,,,, Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning,"Bai, Shipeng*; Chen, Jun; Shen, Xintian; qian, yixuan; Liu, Yong",poster,2308.07209,https://arxiv.org/abs/2308.07209,,https://huggingface.co/papers/2308.07209,,,,5,0 LaPE: Layer-adaptive Position Embedding for Vision Transformers with Independent Layer Normalization,"Yu, Runyi*; Wang, Zhennan; Wang, Yinhuai; Li, Kehan; Liu, Chang; Duan, Haoyi; Ji, Xiangyang; Chen, Jie",poster,,,,,,,,, Exemplar-Free Continual Transformer with Convolutions,"Roy, Anurag*; Voonna, Sravan; Verma, Vinay K; Ghosh, Kripabandhu; Ghosh, Saptarshi; Das, Abir",poster,2308.11357,https://arxiv.org/abs/2308.11357,,https://huggingface.co/papers/2308.11357,,,,6,0 Building Vision Transformers with Hierarchy Aware Feature Aggregation,"chen, yongjie; Liu, Hongmin; Yin, Haoran; Fan, Bin*",poster,,,,,,,,, ShiftNAS: Improving One-shot NAS via Probability Shift,"Zhang, Mingyang*; Yu, Xinyi; Zhao, Haodong; Ou, Linlin",poster,2307.08300,https://arxiv.org/abs/2307.08300,https://github.com/bestfleer/ShiftNAS,https://huggingface.co/papers/2307.08300,,,,4,0 DarSwin: Distortion Aware 
Radial Swin Transformer,"Athwale, Akshaya; Afrasiyabi, Arman; Lagüe, Justin; Shili, Ichrak; Ahmad, Ola; Lalonde, Jean-Francois*",poster,2304.09691,https://arxiv.org/abs/2304.09691,,https://huggingface.co/papers/2304.09691,,,,6,0 ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation,"Wang, Xiaoxing*; Chu, Xiangxiang; Fan, Yuda; Zhang, Zhexi; Zhang, Bo; Yang, Xiaokang; Yan, Junchi",poster,2011.11233,https://arxiv.org/abs/2011.11233,,https://huggingface.co/papers/2011.11233,,,,7,0 FDViT: Improve the Hierarchical Architecture of Vision Transformer,"Xu, Yixing*; Li, Chao; Li, Dong; Sheng, Xiao; Jiang, Fan; Tian, Lu; Sirasao, Ashish",poster,,,,,,,,, FLatten Transformer: Vision Transformer using Focused Linear Attention,"Han, Dongchen; Pan, Xuran; Han, Yizeng; Song, Shiji; Huang, Gao*",poster,2308.00442,https://arxiv.org/abs/2308.00442,https://github.com/LeapLabTHU/FLatten-Transformer,https://huggingface.co/papers/2308.00442,,,,5,0 MixPath: A Unified Approach for One-shot Neural Architecture Search,"Chu, Xiangxiang; Lu, Shun; Li, Xudong; Zhang, Bo*",poster,2001.05887,https://arxiv.org/abs/2001.05887,,https://huggingface.co/papers/2001.05887,,,,4,0 SSF: Accelerating Training of Spiking Neural Networks with Stabilized Spiking Flow,"jingtao, wang*; Song, Zengjie; Wang, yuxi; yang, yuran; mei, shuqi; Xiao, Jun; Zhang, Zhaoxiang",poster,,,,,,,,, Dynamic Perceiver for Efficient Visual Recognition,"Han, Yizeng; Han, Dongchen; Liu, Zeyu; Wang, Yulin; Pan, Xuran; Pu, Yifan; Deng, Chao; Feng, Junlan; Song, Shiji; Huang, Gao*",poster,2306.11248,https://arxiv.org/abs/2306.11248,,https://huggingface.co/papers/2306.11248,,,,10,0 SG-Former: Self-guided Transformer with Evolving Token Reallocation,"Ren, Sucheng*; Yang, Xingyi; Liu, Songhua; Wang, Xinchao",poster,,,,,,,,, Scale-Aware Modulation Meet Transformer,"Lin, Weifeng; Wu, Ziheng; Chen, Jiayu; Huang, Jun; Jin, Lianwen 
*",poster,2307.08579,https://arxiv.org/abs/2307.08579,,https://huggingface.co/papers/2307.08579,,,,5,1 Learning to Upsample by Learning to Sample,"Liu, Wenze; Lu, Hao*; Fu, Hongtao; Cao, Zhiguo",poster,2308.15085,https://arxiv.org/abs/2308.15085,https://github.com/tiny-smart/dysample,https://huggingface.co/papers/2308.15085,,,,4,0 GET: Group Event Transformer for Event-Based Vision,"Peng, Yansong; Zhang, Yueyi*; Xiong, Zhiwei; Sun, Xiaoyan; Wu, Feng",poster,,,,,,,,, Adaptive Frequency Filters As Efficient Global Token Mixers,"Huang, Zhipeng; Zhang, Zhizheng*; Lan, Cuiling; Zha, Zheng-Jun; Lu, Yan; Guo, Baining",poster,2307.14008,https://arxiv.org/abs/2307.14008,,https://huggingface.co/papers/2307.14008,,,,6,1 Fcaformer: Forward Cross Attention in Hybrid Vision Transformer,"Zhang, Haokui*; Hu, Wenze; Wang, Xiaoyu",poster,2211.07198,https://arxiv.org/abs/2211.07198,,https://huggingface.co/papers/2211.07198,,,,3,0 Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation,"Qi, Yaolei*; He, Yuting; Qi, Xiaoming; zhang, yuan; Yang, Guanyu",poster,2307.08388,https://arxiv.org/abs/2307.08388,,https://huggingface.co/papers/2307.08388,,,,5,0 Sentence Attention Blocks for Answer Grounding,"Khoshsirat, Seyedalireza*; Kambhamettu, Chandra",poster,,,,,,,,, MST-compression: Compressing and Accelerating Binary Neural Networks with Minimum Spanning Tree,"Vo, Quang Hieu*; Tran, Linh-Tam; Bae, Sung-Ho; Kim, Lokwon; Hong, Choong Seon",poster,,,,,,,,, EGformer: Equirectangular Geometry-biased Transformer for 360 Depth Estimation,"Yun, Ilwi*; Shin, Chanyong; Lee, Hyunku; Lee, Hyuk-Jae; Rhee, Chae Eun",poster,2304.07803,https://arxiv.org/abs/2304.07803,,https://huggingface.co/papers/2304.07803,,,,5,0 SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation,"Yun, Guhnoo; Yoo, Juhan; Kim, Kijung; Lee, Jeongho; Kim, Dong 
Hwan*",poster,2308.11568,https://arxiv.org/abs/2308.11568,,https://huggingface.co/papers/2308.11568,,,,5,0 ModelGiF: Gradient Fields for Model Functional Distance,"Song, Jie; Xu, Zhengqi; Wu, Sai; Chen, Gang; Song, Mingli*",poster,,,,,,,,, ClusT3: Information Invariant Test-Time Training,"Vargas Hakim, Gustavo A*; OSOWIECHI, David; Noori, Mehrdad; Cheraghalikhani, Milad; Bahri, Ali; Ben Ayed, Ismail; Desrosiers, Christian",poster,,,,,,,,, Cumulative Spatial Knowledge Distillation for Vision Transformers,"Zhao, Borui*; Song, Renjie; Liang, Jiajun",poster,2307.08500,https://arxiv.org/abs/2307.08500,,https://huggingface.co/papers/2307.08500,,,,3,1 Luminance-aware Color Transform for Multiple Exposure Correction,"Baek, Jong Hyeon*; Kim, DaeHyun; Choi, Su-Min; Lee, Hyo-Jun; Kim, Hanul; Koh, Yeong Jun",poster,,,,,,,,, Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks,"Meng, Qingyan*; Xiao, Mingqing; Yan, Shen; Wang, Yisen; Lin, Zhouchen; Luo, Zhiquan",poster,,,,,,,,, Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters,"Michalkiewicz, Mateusz*; Faraki, Masoud; Yu, Xiang; Chandraker, Manmohan; Baktashmotlagh, Mahsa",poster,,,,,,,,, DOT: A Distillation-Oriented Trainer,"Zhao, Borui*; Cui, Quan; Song, Renjie; Liang, Jiajun",poster,2307.08436,https://arxiv.org/abs/2307.08436,,https://huggingface.co/papers/2307.08436,,,,4,1 Extensible and Efficient Proxy for NAS,"Li, Yuhong*; Li, Jiajie; Hao, Callie; Li, Pan; Xiong, Jinjun; Chen, Deming",poster,,,,,,,,, Learning to Transform for Generalizable Instance-wise Invariance,"Singhal, Utkarsh*; Esteves, Carlos; Makadia, Ameesh; Yu, Stella X",poster,,,,,,,,, Convolutional Networks with Oriented 1D Kernels ,"Kirchmeyer, Alexandre*; Deng, Jia",poster,,,,,,,,, Random Boxes Are Open-world Object Detectors,"Wang, Yanghao*; Yue, Zhongqi; Hua, Xian-Sheng; Zhang, 
Hanwang",poster,2307.08249,https://arxiv.org/abs/2307.08249,https://github.com/scuwyh2000/RandBox,https://huggingface.co/papers/2307.08249,,,,4,1 Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection,"Fang, Yuxin*; Yang, Shusheng; Wang, Shijie; Ge, Yixiao; Shan, Ying; Wang, Xinggang",poster,2204.02964,https://arxiv.org/abs/2204.02964,https://github.com/hustvl/MIMDet,https://huggingface.co/papers/2204.02964,,,,6,0 CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations,"Xia, Qiming*; Deng, Jinhao; Wen, Chenglu; Wu, Hai; Shi, Shaoshuai; Li, Xin; Wang, Cheng",poster,,,,,,,,, A Dynamic Dual-Processing Object Detection Framework Inspired by the Brain's Recognition Mechanism,"Zhang, Minying*; Bu, Tianpeng; hu, lulu",poster,,,,,,,,, Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection,"Lv, Yilong; Li, Min*; He, Yujie; Li, Shaopeng; He, Zhuzhen; Yang, Aitao",poster,,,,,,,,, Inter-Realization Channels: Unsupervised Anomaly Detection Beyond One-Class Classification,"McIntosh, Declan GD*; Branzan Albu, Alexandra",poster,,,,,,,,, DEQDet: Object Detection with Deep Equilibrium Decoders,"wang, shuai; Teng, Yao; Wang, Limin*",poster,,,,,,,,, RecursiveDet: End-to-End Region-based Recursive Object Detection,"Zhao, Jing; Sun, Li*; Li, Qingli",poster,2307.13619,https://arxiv.org/abs/2307.13619,https://github.com/bravezzzzzz/RecursiveDet,https://huggingface.co/papers/2307.13619,,,,3,0 Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning,"Yuan, Xiang; Cheng, Gong*; ?, ??; Zeng, Qinghua; Han, Junwei",poster,2308.09534,https://arxiv.org/abs/2308.09534,,https://huggingface.co/papers/2308.09534,,,,5,0 ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation,"Fu, Shenghao; Yan, Junkai; Gao, Yipeng; Xie, Xiaohua; ZHENG, 
WEI-SHI*",poster,2308.09242,https://arxiv.org/abs/2308.09242,https://github.com/iSEE-Laboratory/ASAG,https://huggingface.co/papers/2308.09242,,,,5,0 COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts,"Mao, Xiaofeng*; Chen, Yuefeng; Zhu, Yao; Chen, Da; Su, Hang; zhang, rong; xue, hui",poster,,,,,,,,, Generative Prompt Model for Weakly Supervised Object Localization,"Zhao, Yuzhong; Ye, Qixiang; Wu, Weijia; Shen, Chunhua; Wan, Fang*",poster,2307.09756,https://arxiv.org/abs/2307.09756,https://github.com/callsys/GenPromp,https://huggingface.co/papers/2307.09756,,,,5,0 UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors,"Lao, Shanshan*; Song, Guanglu; Liu, Boxiao; Liu, Yu; Yang, Yujiu",poster,,,,,,,,, PNI : Industrial Anomaly Detection using Position and Neighborhood Information,"Bae, Jaehyeok*; Lee, Jae-Han; Kim, Seyun",poster,2211.12634,https://arxiv.org/abs/2211.12634,,https://huggingface.co/papers/2211.12634,,,,3,0 Masked Autoencoders Are Stronger Knowledge Distillers for Object Detectors,"Lao, Shanshan*; Song, Guanglu; Liu, Boxiao; Liu, Yu; Yang, Yujiu",poster,,,,,,,,, GPA-3D: Geometry-aware Prototype Alignment for Unsupervised Domain Adaptive 3D Object Detection from Point Clouds,"Li, Ziyu*; Guo, Jingming; Cao, Tongtong; Bingbing, Liu; Yang, Wankou",poster,,,,,,,,, ADNet: Lane Shape Prediction via Anchor Decomposition,"Xiao, Lingyu*; Li, Xiang; Yang, Sen; Yang, Wankou",poster,2308.10481,https://arxiv.org/abs/2308.10481,https://github.com/Sephirex-X/ADNet,https://huggingface.co/papers/2308.10481,,,,4,0 Periodically Exchange Teacher-Student for Source-Free Object Detection,"Liu, Qipeng*; Shen, Zhifeng; Yang, Zhifeng; Lin, Luojun",poster,,,,,,,,, Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection,"Ma, Xinzhu*; Wang, Yongtao; Zhang, Yinmin; Xia, Zhiyi; Meng, Yuan; wang, zhihui; Li, Haojie; Ouyang, Wanli",poster,,,,,,,,, Monocular 3D Object Detection with Bounding Box 
Denoising in 3D by Perceiver,"Liu, Xianpeng*; Zheng, Ce; Cheng, Kelvin B; Xue, Nan; Qi, Guo-Jun; Wu, Tianfu",poster,2304.01289,https://arxiv.org/abs/2304.01289,,https://huggingface.co/papers/2304.01289,,,,6,0 Template-guided Hierarchical Feature Restoration for Anomaly Detection,"Guo, Hewei; ren, liping; Fu, Jingjing*; Wang, Yuwang; Zhang, Zhizheng; Lan, Cuiling; Wang, Haoqian; Hou, Xinwen",poster,,,,,,,,, ALWOD: Active Learning for Weakly-Supervised Object Detection,"Wang, Yuting*; Ilic, Velibor; Li, Jiatong; Kisacanin, Branislav; Pavlovic, Vladimir",poster,,,,,,,,, ProtoFL: Unsupervised Federated Learning via Prototypical Distillation,"Kim, Hansol; Kwak, Youngjun*; Jung, Minyoung; Shin, Jinho; Kim, Youngsung; Kim, Changick",poster,2307.12450,https://arxiv.org/abs/2307.12450,,https://huggingface.co/papers/2307.12450,,,,6,0 Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory,"Lei, Ting; Caba, Fabian; Chen, Qingchao; Jin, Hailin; Peng, Yuxin; Liu, Yang*",poster,,,,,,,,, Detection Transformer with Stable Matching,"Liu, Shilong*; Ren, Tianhe; Chen, Jiayu; Zeng, Zhaoyang; Li, Hongyang; Zhang, Hao; Li, Feng; Huang, Jun; Su, Hang; Zhu, Jun; Zhang, Lei",poster,2304.04742,https://arxiv.org/abs/2304.04742,https://github.com/IDEA-Research/Stable-DINO,https://huggingface.co/papers/2304.04742,,,,11,0 Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection,"Li, Liangqi*; Miao, Jiaxu; Shi, Dahu; Tan, Wenming; Ren, Ye; Yang, Yi; Pu, Shiliang",poster,,,,,,,,, Anomaly Detection under Distribution Shift,"Cao, Tri*; ZHU, JIAWEN; Pang, Guansong",poster,2303.13845,https://arxiv.org/abs/2303.13845,,https://huggingface.co/papers/2303.13845,,,,3,0 Detecting Objects with Context-Likelihood Graphs and Graph Refinement,"Bhowmik, Aritra*; Wang, Yu; Baka, Nora; Oswald, Martin R.; Snoek, Cees",poster,,,,,,,,, Unsupervised Object Localization with Representer Point Selection,"Song, Yeonghwan; Jang, Seokwoo; Katabi, Dina; Son, 
Jeany*",poster,,,,,,,,, Improved Plain DETR,"Lin, Yutong; Yuan, Yuhui; Zhang, Zheng; Li, Chen; Zheng, Nanning; Hu, Han*",poster,,,,,,,,, Deep Directly-Trained Spiking Neural Networks for Object Detection,"Qiaoyi, Su*; Li, Guoqi; Chou, Yuhong; Hu, Yifan; Li, Jianing; Mei, Shijie; Zhang, Ziyang",poster,2307.11411,https://arxiv.org/abs/2307.11411,,https://huggingface.co/papers/2307.11411,,,,7,0 GACE: Geometry Aware Confidence Enhancement for Black-box 3D Object Detectors on LiDAR-Data,"Schinagl, David*; Krispel, Georg; Fruhwirth-Reisinger, Christian; Possegger, Horst; Bischof, Horst",poster,,,,,,,,, StageInteractor: Query-based Object Detector with Cross-stage Interaction,"Teng, Yao; Liu, Haisong; Guo, Sheng; Wang, Limin*",poster,2304.04978,https://arxiv.org/abs/2304.04978,,https://huggingface.co/papers/2304.04978,,,,4,0 Adaptive Rotated Convolution for Rotated Object Detection,"Pu, Yifan; Wang, Yiru; Xia, Zhuofan; Han, Yizeng; Wang, Yulin; Gan, Weihao; Wang, ZiDong; Song, Shiji; Huang, Gao*",poster,2303.07820,https://arxiv.org/abs/2303.07820,,https://huggingface.co/papers/2303.07820,,,,9,0 Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection,"Zhang, Manyuan*; Song, Guanglu; Liu, Yu; Li, Hongsheng",poster,,,,,,,,, Exploring Transformers for Open-world Instance Segmentation,"Wu, Jiannan*; Jiang, Yi; Yan, Bin; Lu, Huchuan; Yuan, Zehuan; Luo, Ping",poster,2308.04206,https://arxiv.org/abs/2308.04206,,https://huggingface.co/papers/2308.04206,,,,6,0 DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization,"Tang, Xiaojun*; Fan, Junsong; Luo, Chuanchen; Zhang, Zhaoxiang; Zhang, Man; Yang, Zongyuan",poster,,,,,,,,, Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment,"Chen, Qiang*; Chen, Xiaokang; Wang, Jian; Zhang, Shan; Yao, Kun; Feng, Haocheng; Han, Junyu; Ding, Errui; Zeng, Gang; Wang, 
Jingdong",poster,2207.13085,https://arxiv.org/abs/2207.13085,https://github.com/Atten4Vis/GroupDETR,https://huggingface.co/papers/2207.13085,,,,10,0 Category-aware Allocation Transformer for Weakly Supervised Object Localization,"Chen, Zhiwei*; Ding, Jinren; Cao, Liujuan; Shen, Yunhang; Zhang, ShengChuan; Jiang, Guannan; Ji, Rongrong",poster,,,,,,,,, The Devil is in the Crack Orientation: A New Perspective for Crack Detection,"Chen, Zhuangzhuang*; Zhang, Jin; Lai, Zhuonan; Zhu, Guanming; Liu, Zun; Chen, Jie; Li, Jianqiang",poster,,,,,,,,, Clusterformer: Cluster-based Transformer for 3D Object Detection in Point Clouds,"Pei, Yu*; Zhao, Xian; Li, Hao; Ma, Jingyuan; Zhang, Jingwei; Pu, Shiliang",poster,,,,,,,,, Less is More: Focus Attention for Efficient DETR,"Zheng, Dehua*; Dong, Wenhui; Hu, Hailin; Chen, Xinghao; Wang, Yunhe",poster,2307.12612,https://arxiv.org/abs/2307.12612,https://github.com/huawei-noah/noah-research/tree/master/Focus-DETR,https://huggingface.co/papers/2307.12612,,,,5,0 DFL3D: 3D Deformable Attention-based Feature Lifting for Multi-Camera 3D Object Detection,"Li, Hongyang; Zhang, Hao; Zeng, Zhaoyang; Liu, Shilong; Li, Feng; Ren, Tianhe; Zhang, Lei*",poster,,,,,,,,, Multi-Label Self-Supervised Learning with Scene Images,"Zhu, Ke; Fu, Minghao; Wu, Jianxin*",poster,2308.03286,https://arxiv.org/abs/2308.03286,,https://huggingface.co/papers/2308.03286,,,,3,0 Cascade-DETR: Delving into High-Quality Universal Object Detection,"Ye, Mingqiao; Ke, Lei*; Li, Siyuan; Tai, Yu-Wing; Tang, Chi-Keung; Danelljan, Martin; Yu, Fisher",poster,,,,,,,,, Representation Disparity-aware Distillation for 3D Object Detection,"Li, Yanjing*; Xu, Sheng; Lin, Mingbao; Yin, Jihao; Zhang, Baochang; Cao, Xianbin",poster,2308.10308,https://arxiv.org/abs/2308.10308,,https://huggingface.co/papers/2308.10308,,,,6,0 FeatEnHancer: Enhancing Hierarchical Features for Object Detection and Beyond Under Low-Light Vision,"Hashmi, Khurram Azeem*; Kallempudi, Goutham; Stricker, Didier; 
Afzal, Muhammad Zeshan",poster,2308.03594,https://arxiv.org/abs/2308.03594,,https://huggingface.co/papers/2308.03594,,,,4,0 DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds,"Ma, Tao*; Yang, Xuemeng; Zhou, Hongbin; Li, Xin; Shi, Botian; Liu, Junjie; Yang, Yuchen; Liu, Zhizheng; He, Liang; Li, Hongsheng; Li, Yikang; Qiao, Yu",poster,2306.06023,https://arxiv.org/abs/2306.06023,,https://huggingface.co/papers/2306.06023,,,,12,0 DETRs with Collaborative Hybrid Assignments Training,"Zong, Zhuofan*; Song, Guanglu; Liu, Yu",poster,2211.12860,https://arxiv.org/abs/2211.12860,https://github.com/Sense-X/Co-DETR,https://huggingface.co/papers/2211.12860,,,,3,0 Open Vocabulary Object Detection With an Open Corpus,"Wang, Jiong*; Zhang, Huiming; Hong, Haiwen; Jin, Xuan; He, Yuan; Xue, Hui; Zhao, Zhou",poster,,,,,,,,, SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining,"Suri, Saksham*; Rambhatla, Sai Saketh; Chellappa, Rama; Shrivastava, Abhinav",poster,2201.04620,https://arxiv.org/abs/2201.04620,,https://huggingface.co/papers/2201.04620,,,,4,2 Unsupervised Anomaly Detection with Diffusion Probabilistic Model,"Zhang, Xinyi*; Li, Naiqi; Li, Jiawei; Dai, Tao; Jiang, Yong; Xia, Shu-Tao",poster,,,,,,,,, UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation,"Wang, Haiyang*; Tang, Hao; Shi, Shaoshuai; Li, Aoxue; Li, Zhenguo; Schiele, Bernt; Wang, Liwei",poster,,,,,,,,, Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection,"Yao, Xincheng*; Li, Ruoqi; Qian, Zefeng; Luo, Yan; Zhang, Chongyang",poster,,,,,,,,, MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection,"Xu, Junkai*; Peng, Liang; Cheng, Haoran; Li, Hao; Qian, Wei; Li, Ke; Wang, Wenxiao; Cai, Deng",poster,2308.09421,https://arxiv.org/abs/2308.09421,https://github.com/cskkxjk/MonoNeRD,https://huggingface.co/papers/2308.09421,,,,8,0 Integrally Migrating Pre-trained Transformer 
Encoder-decoders for Visual Object Detection,"Liu, Feng; Zhang, Xiaosong; Peng, Zhiliang; Guo, Zonghao; Wan, Fang*; Ji, Xiangyang; Ye, Qixiang",poster,2205.09613,https://arxiv.org/abs/2205.09613,https://github.com/LiewFeng/imTED,https://huggingface.co/papers/2205.09613,,,,7,0 Generating Dynamic Kernels via Transformers for Lane Detection,"Chen, Ziye; Liu, Yu; Gong, Mingming; Du, Bo; Qian, Guoqi; Smith-Miles, Kate*",poster,,,,,,,,, Meta-ZSDETR: Zero-shot DETR with Meta-learning,"Zhang, Lu*; Zhang, Chenbo; jia jia, zhao; Guan, Jihong; Zhou, Shuigeng",poster,,,,,,,,, Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes,"Wu, Di; Chen, Pengfei; Yu, Xuehui; Li, Guorong; Han, Zhenjun*; Jiao, Jianbin",poster,2307.12101,https://arxiv.org/abs/2307.12101,https://github.com/ucas-vg/PointTinyBenchmark,https://huggingface.co/papers/2307.12101,,,,6,0 AlignDet: Aligning Pre-training and Fine-tuning in Object Detection,"Li, Ming*; Wu, Jie; Wang, Xionghui; Chen, Chen; Qin, Jie; Xiao, Xuefeng; Wang, Rui; Zheng, Min; Pan, Xin",poster,2307.11077,https://arxiv.org/abs/2307.11077,,https://huggingface.co/papers/2307.11077,,,,9,1 MULLER: Multilayer Laplacian Resizer for Vision,"Tu, Zhengzhong*; Milanfar, Peyman; Talebi, Hossein",poster,2304.02859,https://arxiv.org/abs/2304.02859,,https://huggingface.co/papers/2304.02859,,,,3,0 Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation for Anomaly Detection,"Wang, Guodong*; Wang, Yunhong; Qin, Jie; Zhang, Dongming; Bao, Xiuguo; Huang, Di",poster,2308.10155,https://arxiv.org/abs/2308.10155,,https://huggingface.co/papers/2308.10155,,,,6,0 DETRDistill: A Universal Knowledge Distillation Framework for DETR-families,"Chang, Jiahao; Wang, Shuo; Xu, Haiming; Chen, Zehui; Yang, Chenhongyi; Zhao, Feng*",poster,2211.10156,https://arxiv.org/abs/2211.10156,,https://huggingface.co/papers/2211.10156,,,,6,0 Delving into Motion-Aware Matching for Monocular 3D Object Tracking,"Huang, Kuan-Chih*; Yang, Ming-Hsuan; 
Tsai, Yi-Hsuan",poster,2308.11607,https://arxiv.org/abs/2308.11607,https://github.com/kuanchihhuang/MoMA-M3T,https://huggingface.co/papers/2308.11607,,,,3,0 FB-BEV: BEV Representation from Forward-Backward View Transformations,"Li, Zhiqi*; Yu, Zhiding; Wang, Wenhai; Anandkumar, Animashree; Lu, Tong; Alvarez, Jose M",poster,,,,,,,,, Learning from Noisy Data for Semi-Supervised 3D Object Detection,"Chen, Zehui; Li, Zhenyu; Wang, Shuo; Fu, Dengpan; Zhao, Feng*",poster,,,,,,,,, Boosting Long-tailed Object Detection via Step-wise Learning on Smooth-tail Data,"Dong, Na*; Zhang, Yongqiang; Ding, Mingli; Lee, Gim Hee",poster,2305.12833,https://arxiv.org/abs/2305.12833,,https://huggingface.co/papers/2305.12833,,,,4,0 Objects do not disappear: Video object detection by single-frame object location anticipation,"Liu, Xin*; Karimi Nejadasl, Fatemeh; van Gemert, Jan C; Booij, Olaf; Pintea, Silvia L",poster,2308.04770,https://arxiv.org/abs/2308.04770,https://github.com/L-KID/Videoobject-detection-by-location-anticipation,https://huggingface.co/papers/2308.04770,,,,5,0 Unified Visual Relationship Detection with Vision and Language Models,"Zhao, Long*; Yuan, Liangzhe; Gong, Boqing; Cui, Yin; Schroff, Florian; Yang, Ming-Hsuan; Adam, Hartwig; Liu, Ting",poster,2303.08998,https://arxiv.org/abs/2303.08998,,https://huggingface.co/papers/2303.08998,,,,8,0 Universal Domain Adaptation via Compressive Attention Matching,"zhu, didi; Li, Yinchuan; Yuan, Junkun; Li, Zexi; Kuang, Kun; Wu, Chao*",poster,2304.11862,https://arxiv.org/abs/2304.11862,,https://huggingface.co/papers/2304.11862,,,,6,0 Unsupervised Domain Adaptive Detection with Network Stability Analysis,"Zhou, Wenzhang; Fan, Heng; Luo, Tiejian; Zhang, Libo*",poster,2308.08182,https://arxiv.org/abs/2308.08182,https://github.com/tiankongzhang/NSA,https://huggingface.co/papers/2308.08182,,,,4,0 ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection,"Tu, Tao*; Chuang, Shun-Po; Liu, Yu-Lun; Sun, 
Cheng; Zhang, Ke; Roy, Donna; Kuo, Cheng-Hao; Sun, Min",poster,2308.09098,https://arxiv.org/abs/2308.09098,,https://huggingface.co/papers/2308.09098,,,,8,0 Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection,"Yin, Yufei*; Deng, Jiajun; Zhou, Wengang ; Li, Li; Li, Houqiang",poster,2308.05991,https://arxiv.org/abs/2308.05991,https://github.com/Yinyf0804/WSOD-CBL,https://huggingface.co/papers/2308.05991,,,,5,0 Text-Driven Generative Domain Adaptation with Spectral Consistency Regularization,"Liu, Zhenhuan; Li, Liang*; Xiao, Jiayu; Zha, Zheng-Jun; Huang, Qingming",poster,,,,,,,,, MosaiQ: Enabling High-Quality Image Generation on Quantum Computers,"Silver, Daniel*; Patel, Tirthak; Cutler, William R; Ranjan, Aditya; Gandhi, Harshitta; Tiwari, Devesh",poster,,,,,,,,, Controllable Visual-Tactile Synthesis,"Gao, Ruihan*; Yuan, Wenzhen; Zhu, Jun-Yan",poster,2305.03051,https://arxiv.org/abs/2305.03051,,https://huggingface.co/papers/2305.03051,,,,3,0 Editing Implicit Assumptions in Text-to-Image Diffusion Models,"Orgad, Hadas; Kawar, Bahjat*; Belinkov, Yonatan",poster,2303.08084,https://arxiv.org/abs/2303.08084,,https://huggingface.co/papers/2303.08084,,,,3,0 DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars,"Svitov, David*; Gudkov, Dmitrii; Bashirov, Renat; Lempitsky, Victor",poster,2303.09375,https://arxiv.org/abs/2303.09375,,https://huggingface.co/papers/2303.09375,,,,4,1 Smoothness Similarity Regularization for Few-Shot GAN Adaptation,"Sushko, Vadim*; Wang, Ruyu; Gall, Jürgen",poster,2308.09717,https://arxiv.org/abs/2308.09717,,https://huggingface.co/papers/2308.09717,,,,3,0 HSR-Diff: Hyperspectral Image Super-Resolution via Conditional Diffusion Models,"Wu, Chanyue*; Bai, Yunpeng; Wang, Dong; Mao, Hanyu; Li, Ying; Shen, Qiang",poster,,,,,,,,, Long-Term Photometric Consistent Novel View Synthesis with Diffusion Models,"Yu, Jason J*; Forghani, Fereshteh; Brubaker, Marcus A; Derpanis, Konstantinos 
G",poster,2304.10700,https://arxiv.org/abs/2304.10700,,https://huggingface.co/papers/2304.10700,,,,4,0 AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration,"Li, Lijiang; Li, Huixia; Zheng, Xiawu; Wu, Jie; Xiao, Xuefeng; Wang, Rui; Zheng, Min ; Pan, Xin; Chao, Fei*; Ji, Rongrong",poster,,,,,,,,, GaFET: Learning Geometry-aware Facial Expression Translation from In-The-Wild Images,"Ma, Tianxiang*; Li, Bingchuan; He, Qian; Dong, Jing; Tan, Tieniu",poster,2308.03413,https://arxiv.org/abs/2308.03413,,https://huggingface.co/papers/2308.03413,,,,5,0 Collecting The Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures,"Li, Nannan*; Shih, Kevin; Plummer, Bryan",poster,2210.01887,https://arxiv.org/abs/2210.01887,https://github.com/NannanLi999/pt_square,https://huggingface.co/papers/2210.01887,,,,3,0 Multi-Directional Subspace Editing in Style-Space,"Naveh, Chen*",poster,2211.11825,https://arxiv.org/abs/2211.11825,,https://huggingface.co/papers/2211.11825,,,,2,0 HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces,"Bounareli, Stella*; Tzelepis, Christos; Argyriou, Vasileios; Patras, Ioannis; Tzimiropoulos, Georgios",poster,2307.10797,https://arxiv.org/abs/2307.10797,https://github.com/StelaBou/HyperReenact,https://huggingface.co/papers/2307.10797,,,,5,1 Generating Realistic Images from In-the-wild Sounds,"Lee, TaeGyeong; Kang, Jeonghun; Kim, Hyeonyu; Kim, Taehwan*",poster,,,,,,,,, CC3D: Layout-Conditioned Generation of Compositional 3D Scenes,"Bahmani, Sherwin*; Park, Jeong Joon; Paschalidou, Despoina; Yan, Xingguang; Wetzstein, Gordon; Guibas, Leonidas; Tagliasacchi, Andrea",poster,2303.12074,https://arxiv.org/abs/2303.12074,,https://huggingface.co/papers/2303.12074,,,,7,0 UMFuse: Unified Multi View Fusion for Human Editing applications,"Jain, Rishabh*; Hemani, Mayur; Ceylan, Duygu; Singh, Krishna Kumar; Lu, Jingwan; Krishnamurthy, Balaji; Sarkar, 
Mausoom",poster,2211.10157,https://arxiv.org/abs/2211.10157,,https://huggingface.co/papers/2211.10157,,,,7,1 Evaluating Data Attribution for Text-to-Image Models,"Wang, Sheng-Yu*; Efros, Alexei A; Zhu, Jun-Yan; Zhang, Richard ",poster,2306.09345,https://arxiv.org/abs/2306.09345,,https://huggingface.co/papers/2306.09345,,,,4,0 Neural Characteristic Function Learning for Conditional Image Generation,"Li, Shengxi; Zhang, Jialu; Li, Yifei; Xu, Mai*; Deng, Xin; Li, Li",poster,,,,,,,,, WaveIPT: Joint Attention and Flow Alignment in the Wavelet domain for Pose Transfer,"Ma, Liyuan*; Gao, Tingwei; Jiang, Haitian; Shen, Haibin; Huang, Kejie",poster,,,,,,,,, LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models,"Zhang, Junyi*; Guo, Jiaqi; Sun, Shizhao; Lou, Jian-Guang; Zhang, Dongmei",poster,2303.11589,https://arxiv.org/abs/2303.11589,,https://huggingface.co/papers/2303.11589,,,,5,1 Human-inspired Facial Sketch Synthesis with Dynamic Adaptation,"Gao, Fei*; Zhu, Yifan; Jiang, Chang; Wang, Nannan",poster,,,,,,,,, Conceptual and Hierarchical Latent Space Decomposition for Face Editing,"Ozkan, Savas*; Ozay, Mete; Robinson, Thomas W",poster,,,,,,,,, Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations,"Jeon, Seogkyu*; Liu, Bei; Lee, Pilhyeon; Hong , Kibeom; Fu, Jianlong; Byun, Hyeran",poster,2308.10554,https://arxiv.org/abs/2308.10554,,https://huggingface.co/papers/2308.10554,,,,6,0 BallGAN: 3D-aware Image Synthesis with a Spherical Background,"shin, minjung*; seo, yunji; Bae, Jeongmin; Choi, Young Sun; Kim, Hyunsu; Byun, Hyeran; Uh, Youngjung",poster,2301.09091,https://arxiv.org/abs/2301.09091,,https://huggingface.co/papers/2301.09091,,,,7,0 End-to-End Diffusion Latent Optimization Improves Classifier Guidance,"Wallace, Bram*; Gokul, Akash; Ermon, Stefano ; Naik, Nikhil",poster,2303.13703,https://arxiv.org/abs/2303.13703,https://github.com/salesforce/DOODL,https://huggingface.co/papers/2303.13703,,,,4,2 Deep 
Geometrized Cartoon Line Inbetweening,"Siyao, Li*; Gu, Tianpei; Xiao, Weiye; Ding, Henghui; Liu, Ziwei; Loy, Chen Change",poster,,,,,,,,, UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation,"Fu, Jianglin; Li, Shikai; Jiang, Yuming; Lin, Kwan-Yee; Wu, Wayne*; Liu, Ziwei",poster,,,,,,,,, Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond ,"Zhao, Yang*; Hou, Tingbo; Su, Yu-Chuan; Jia, Xuhui; Li, Yandong; Grundmann, Matthias",poster,,,,,,,,, SVDiff: Compact Parameter Space for Diffusion Fine-Tuning,"Han, Ligong*; Li, Yinxiao; Zhang, Han; Milanfar, Peyman; Metaxas, Dimitris N.; Yang, Feng",poster,2303.11305,https://arxiv.org/abs/2303.11305,,https://huggingface.co/papers/2303.11305,,,,6,1 MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices,"Sargsyan, Andranik*; Navasardyan, Shant; Xu, Xingqian; Shi, Humphrey",poster,,,,,,,,, Structure and Content-Guided Video Synthesis with Diffusion Models,"Esser, Patrick*; Chiu, Johnathan; Atighehchian, Parmida PA; Granskog, Jonathan; Germanidis, Anastasis",poster,2302.03011,https://arxiv.org/abs/2302.03011,,https://huggingface.co/papers/2302.03011,,,,5,0 Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation,"Jiang, Yuxin; Jiang, Liming*; Yang, Shuai; Loy, Chen Change",poster,2308.12968,https://arxiv.org/abs/2308.12968,,https://huggingface.co/papers/2308.12968,,,,4,1 Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers,"Cao, Shiyue*; Yin, Yueqin; Huang, Lianghua; Liu, Yu; Zhao, Xin; Zhao, Deli; HUANG, KAIQI",poster,,,,,,,,, A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance,"Wu, Chen Henry*; de la Torre, Fernando",poster,,,,,,,,, Generative Multiplane Neural Radiance for 3D-Aware Image Generation,"Kumar, Amandeep*; Bhunia, Ankan Kumar ; Narayan, Sanath; Cholakkal, Hisham; Anwer , Rao Muhammad; Khan, Salman; Yang, Ming-Hsuan; Shahbaz Khan, 
Fahad",poster,2304.01172,https://arxiv.org/abs/2304.01172,https://github.com/VIROBO-15/GMNR,https://huggingface.co/papers/2304.01172,,,,8,0 Parallax-Tolerant Unsupervised Deep Image Stitching,"Nie, Lang; Lin, Chunyu*; Liao, Kang; Liu, Shuaicheng; Zhao, Yao",poster,2302.08207,https://arxiv.org/abs/2302.08207,https://github.com/nie-lang/UDIS2,https://huggingface.co/papers/2302.08207,,,,5,0 GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning,"Xie, Desai*; Hu, Ping; Sun, Xin; Pirk, Soeren; Zhang, Jianming; Mech, Radomir; Kaufman, Arie",poster,,,,,,,,, EverLight: Indoor-Outdoor Editable HDR Lighting Estimation,"Karimi Dastjerdi, Mohammad Reza; Eisenmann, Jonathan; Hold-Geoffroy, Yannick; Lalonde, Jean-Francois*",poster,2304.13207,https://arxiv.org/abs/2304.13207,,https://huggingface.co/papers/2304.13207,,,,4,0 Prompt Tuning Inversion for Text-driven Image Editing Using Diffusion Models,"Dong, Wenkai*; Duan, Xiaoyue; Xue, Song; Han, Shumin",poster,2305.04441,https://arxiv.org/abs/2305.04441,,https://huggingface.co/papers/2305.04441,,,,4,0 Efficient Diffusion Training via Min-SNR Weighting Strategy,"Hang, Tiankai; Gu, Shuyang*; Li, Chen; Bao, Jianmin; Chen, Dong; Hu, Han; Geng, Xin; Guo, Baining",poster,2303.09556,https://arxiv.org/abs/2303.09556,https://github.com/TiankaiHang/Min-SNR-Diffusion-Training,https://huggingface.co/papers/2303.09556,,,,8,0 BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion,"Xie, Jinheng; Li, Yuexiang; Huang, Yawen; Liu, Haozhe; Zhang, Wentian; Zheng, Yefeng; Shou, Mike Zheng*",poster,2307.10816,https://arxiv.org/abs/2307.10816,https://github.com/showlab/BoxDiff,https://huggingface.co/papers/2307.10816,,,,7,0 Improving Sample Quality of Diffusion Models Using Self-Attention Guidance,"Hong, Susung*; Lee, Gyuseong; Jang, Wooseok; Kim, Seungryong",poster,2210.00939,https://arxiv.org/abs/2210.00939,,https://huggingface.co/papers/2210.00939,,,,4,0 Not All Steps are Created Equal: Selective Diffusion 
Distillation for Image Manipulation,"WANG, Luozhou*; Yang, Shuai; Liu, Shu; Chen, Yingcong",poster,2307.08448,https://arxiv.org/abs/2307.08448,https://github.com/AndysonYs/Selective-Diffusion-Distillation,https://huggingface.co/papers/2307.08448,,,,4,0 Deep Image Harmonization with Learnable Augmentation,"Niu, Li*; Cao, Junyan; Cong, Wenyan; Zhang, Liqing",poster,2308.00376,https://arxiv.org/abs/2308.00376,https://github.com/bcmi/SycoNet-Adaptive-Image-Harmonization,https://huggingface.co/papers/2308.00376,,,,4,0 Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation,"YANG, Xin*; XU, Xiaogang; Chen, Yingcong",poster,2212.09262,https://arxiv.org/abs/2212.09262,,https://huggingface.co/papers/2212.09262,,,,3,0 Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer,"Yu, Wing Yin*; Po, Lai-Man; Cheung, Ray; Zhao, Yuzhi; XUE, Yu; Li, Kun",poster,2307.07754,https://arxiv.org/abs/2307.07754,,https://huggingface.co/papers/2307.07754,,,,6,0 Size Does Matter: Size-aware Virtual Try-on via Clothing-oriented Transformation Try-on Network,"Chen, Chieh-Yun*; Chen, Yi-Chung; Shuai, Hong-Han; Cheng, Wen-Huang",poster,,,,,,,,, VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs,"Haji Ali, Moayed; Bond, Andrew; Birdal, Tolga*; Karacan, Levent; Ceylan, Duygu; Erdem, Erkut; Erdem, Aykut",poster,2304.06020,https://arxiv.org/abs/2304.06020,,https://huggingface.co/papers/2304.06020,,,,7,0 Learning Global-aware Kernel for Image Harmonization,"Shen, Xintian*; Zhang, Jiangning; Chen, Jun; Bai, Shipeng; Han, Yue; Wang, Yabiao; Wang, Chengjie; Liu, Yong",poster,2305.11676,https://arxiv.org/abs/2305.11676,https://github.com/XintianShen/GKNet,https://huggingface.co/papers/2305.11676,,,,8,0 Expressive Text-to-Image Generation with Rich-Text,"Ge, Songwei*; Park, Taesung; Zhu, Jun-Yan; Huang, Jia-Bin",poster,,,,,,,,, A Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View Synthesis and Implicit 
Scene Reconstruction,"Lu, Chongshan; Yin, Fukun; Chen, Xin; Liu, Wen; Chen, Tao*; Yu, Gang; Fan, Jiayuan",poster,2301.06782,https://arxiv.org/abs/2301.06782,,https://huggingface.co/papers/2301.06782,,,,6,0 Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis,"Li, Jiahe*; Zhang, Jiawei; Bai, Xiao; Zhou, Jun; Gu, Lin",poster,2307.09323,https://arxiv.org/abs/2307.09323,,https://huggingface.co/papers/2307.09323,,,,5,0 An Empirical Study of Perceptual Artifacts Localization on Image Synthesis,"Zhang, Lingzhi*; Shi, Jianbo; Xu, Zhengjie; Zhou, Yuqian; Lin, Zhe; Shechtman, Eli; Zhang, He; Liu, Qing; Barnes, Connelly; Amirghodsi, Sohrab",poster,,,,,,,,, Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis,"Park, Minho; Yun, JooYeol*; Choi, Seunghwan; Choo, Jaegul",poster,2308.08157,https://arxiv.org/abs/2308.08157,,https://huggingface.co/papers/2308.08157,,,,4,2 StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model,"Xu, Zipeng*; Sangineto, Enver; Sebe, Niculae",poster,2303.09268,https://arxiv.org/abs/2303.09268,https://github.com/zipengxuc/StylerDALLE,https://huggingface.co/papers/2303.09268,,,,3,0 Shortcut-V2V: Compression Framework for Video-to-Video Translation based on Temporal Redundancy Reduction,"Chung, Chaeyeon*; Park, Yeojeong; Choi, Seunghwan; Ganbat, Munkhsoyol; Choo, Jaegul",poster,,,,,,,,, Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation,"Wu, Jay Zhangjie*; Ge, Yixiao; Wang, Xintao; Lei, Stan Weixian; Gu, Yuchao; Shi, Yufei; Hsu, Wynne; Shan, Ying; Qie, Xiaohu; Shou, Mike Zheng",poster,,,,,,,,, BlendFace: Re-designing Identity Encoders for Face-Swapping,"Shiohara, Kaede*; Yang, Xingchao; Taketomi, Takafumi",poster,2307.10854,https://arxiv.org/abs/2307.10854,,https://huggingface.co/papers/2307.10854,,,,3,0 Talking Head Generation with Probabilistic Audio-to-Visual Diffusion 
Priors,"Yu, Zhentao*; Yin, Zixin; Zhou, Deyu; Wang, Duomin; Wong, Finn; Wang, Baoyuan",poster,2212.04248,https://arxiv.org/abs/2212.04248,,https://huggingface.co/papers/2212.04248,,,,6,0 LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis,"Zhu, Jiapeng*; Yang, Ceyuan; Shen, Yujun; SHI, Zifan; Dai, Bo; Zhao, Deli; Chen, Qifeng",poster,2301.04604,https://arxiv.org/abs/2301.04604,,https://huggingface.co/papers/2301.04604,,,,6,0 Guiding Text-to-Image Diffusion Model Towards Grounded Generation,"Li, Ziyi; Zhou, Qinye; Zhang, Xiaoyun; Zhang, Ya; Wang, Yan-Feng; Xie, Weidi*",poster,,,,,,,,, StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models,"Wang, Zhizhong*; Zhao, Lei; Xing, Wei",poster,2308.07863,https://arxiv.org/abs/2308.07863,,https://huggingface.co/papers/2308.07863,,,,3,0 ToonTalker: Cross-Domain Face Reenactment,"Gong, Yuan*; Zhang, Yong; Cun, Xiaodong; Yin, Fei; Fan, Yanbo; Wang, Xuan; Wu, Baoyuan; Yang, Yujiu",poster,2308.12866,https://arxiv.org/abs/2308.12866,,https://huggingface.co/papers/2308.12866,,,,8,0 Dense Text-to-Image Generation with Attention Modulation,"Kim, Yunji*; Lee, Jiyoung; Kim, Jin-Hwa; Ha, Jung-Woo; Zhu, Jun-Yan",poster,2308.12964,https://arxiv.org/abs/2308.12964,,https://huggingface.co/papers/2308.12964,,,,5,1 Householder Projector for Unsupervised Latent Semantics Discovery,"Song, Yue*; Zhang, Jichao; Sebe, Niculae; Wang, Wei",poster,2307.08012,https://arxiv.org/abs/2307.08012,,https://huggingface.co/papers/2307.08012,,,,4,0 Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation,"Niu, Li*; Tan, Linfeng; Tao, Xinhao; Cao, Junyan; Guo, Fengjun; Long, Teng; Zhang, Liqing",poster,2308.00356,https://arxiv.org/abs/2308.00356,https://github.com/bcmi/Image-Harmonization-Dataset-ccHarmony,https://huggingface.co/papers/2308.00356,,,,7,0 One-Shot Generative Domain Adaptation,"Yang, Ceyuan*; Shen, Yujun; Zhang, Zhiyi; Xu, Yinghao; Zhu, Jiapeng; Wu, Zhirong; Zhou, 
Bolei",poster,2111.09876,https://arxiv.org/abs/2111.09876,,https://huggingface.co/papers/2111.09876,,,,7,0 Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time,"Chan, Cheng-Hung; Yuan, Cheng-Yang; Sun, Cheng; Chen, Hwann-Tzong*",poster,,,,,,,,, "Versatile Diffusion: Text, Images and Variations All in One Diffusion Model","Xu, Xingqian*; Wang, Zhangyang; Zhang, Gong; Wang, Kai; Shi, Humphrey",poster,2211.08332,https://arxiv.org/abs/2211.08332,https://github.com/SHI-Labs/Versatile-Diffusion,https://huggingface.co/papers/2211.08332,,,,5,0 Sound Source Localization is All about Cross-Modal Alignment,"Senocak, Arda*; Ryu, Hyeonggon; Kim, Junsik; Oh, Tae-Hyun; Pfister, Hanspeter; Chung, Joon Son",poster,,,,,,,,, Class-Incremental Grouping Network for Continual Audio-Visual Learning,"Mo, Shentong; Pian, Weiguo; Tian, Yapeng*",poster,,,,,,,,, Audio-Visual Class-Incremental Learning,"Pian, Weiguo*; Mo, Shentong; Guo, Yunhui; Tian, Yapeng",poster,2308.11073,https://arxiv.org/abs/2308.11073,https://github.com/weiguoPian/AV-CIL_ICCV2023,https://huggingface.co/papers/2308.11073,,,,4,0 DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding,"Choi, Jeongsoo; Hong, Joanna*; Ro, Yong Man",poster,2308.07787,https://arxiv.org/abs/2308.07787,,https://huggingface.co/papers/2308.07787,,,,3,0 The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion,"Jeong, Yujin; Ryoo, Won Jeong; Lee, Seung Hyun; Seo, Da Bin; Byeon, Wonmin; Kim, Sangpil; Kim, Jinkyu*",poster,,,,,,,,, SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning,"Muaz, Urwa*; Jang, Won-Dong; Tripathi, Rohun; Mani, Santhosh; Ouyang, Wenbin; Gadde, Ravi Teja; Gecer, Baris; Elizondo, Sergio; Madad, Reza; Nair, Naveen",poster,,,,,,,,, On the Audio-visual Synchronization for Lip-to-Speech Synthesis,"NIU, Zhe*; Mak, Brian",poster,2303.00502,https://arxiv.org/abs/2303.00502,,https://huggingface.co/papers/2303.00502,,,,2,0 Be 
Everywhere - Hear Everything (BEE): Audio Scene Reconstruction by Sparse Audio-Visual Samples ,"Chen, Mingfei*; Su, Kun; Shlizerman, Eli",poster,,,,,,,,, Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation,"Yun, Heeseung*; Na, Joonil; Kim, Gunhee",poster,,,,,,,,, Hyperbolic Audio-visual Zero-shot Learning,"Hong, Jie*; Hayder, Zeeshan; Han, Junlin; Fang, Pengfei; Harandi, Mehrtash; Petersson, Lars",poster,2308.12558,https://arxiv.org/abs/2308.12558,,https://huggingface.co/papers/2308.12558,,,,6,2 AdVerb: Visually Guided Audio Dereverberation,"Chowdhury, Sanjoy*; Ghosh, Sreyan; Dasgupta, Subhrajyoti; Ratnarajah, Anton J; Tyagi, Utkarsh; Manocha, Dinesh",poster,2308.12370,https://arxiv.org/abs/2308.12370,,https://huggingface.co/papers/2308.12370,,,,6,0 Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation,"Chen, Ziyang*; Qian, Shengyi; Owens, Andrew",poster,2303.11329,https://arxiv.org/abs/2303.11329,,https://huggingface.co/papers/2303.11329,,,,3,1 Learning Conditional Control for Pretrained Text-to-Image Diffusion Models,"Zhang, Lvmin*; Rao, Anyi; Agrawala, Maneesh",oral,,,,,,,,, Factorized Inverse Path Tracing for Efficient and Accurate Material-Lighting Estimation,"Zhu, Rui*; Wu, Liwen; Ramamoorthi, Ravi; Zhu, Yinhao; Cai, Hong; Matai, Janarbek; Li, Tzu-Mao; Yaldiz, Mustafa B; Porikli, Fatih; Chandraker, Manmohan",oral,2304.05669,https://arxiv.org/abs/2304.05669,https://github.com/lwwu2/fipt,https://huggingface.co/papers/2304.05669,,,,10,0 Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations,"Wang, Jianren; Dasari, Sudeep*; Srirama, Mohan Kumar; Tulsiani, Shubham; Gupta, Abhinav",oral,2303.08135,https://arxiv.org/abs/2303.08135,,https://huggingface.co/papers/2303.08135,,,,5,0 3D Implicit Transporter for Temporally Consistent Keypoint Discovery,"Zhong, Chengliang*; Zheng, Yuhang; Zheng, Yupeng; Zhao, Hao; Wang, Ling; Mu, Xiaodong; Yi, Li; Zhao, Jian; zhang, liang xin; Li, 
Pengfei; Zhou, Guyue; Yang, Chao",oral,,,,,,,,, Chordal Averaging on Flag Manifolds and Its Applications,"Mankovich, Nathan; Birdal, Tolga*",oral,2303.13501,https://arxiv.org/abs/2303.13501,https://github.com/nmank/FlagAveraging,https://huggingface.co/papers/2303.13501,,,,2,0 UniDexGrasp++: Improving Universal Dexterous Grasping via Geometry-aware Curriculum Learning and Iterative Generalist-Specialist Learning,"Wan, Weikang; Geng, Haoran; Liu, Yun; Shan, Zikang; Yang, Yaodong; Yi, Li; Wang, He*",oral,,,,,,,,, GameFormer: Game-theoretic Modeling and Learning of Transformer-based Interactive Prediction and Planning for Autonomous Driving,"Huang, Zhiyu*; Liu, Haochen; Lv, Chen",oral,2303.05760,https://arxiv.org/abs/2303.05760,,https://huggingface.co/papers/2303.05760,,,,3,0 PPR: Physically Plausible Reconstruction from Monocular Videos,"Yang, Gengshan*; Yang, Shuo; Zhang, John; Manchester, Zachary; Ramanan, Deva",oral,,,,,,,,, Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction,"Wang, Wenjia*; Ge, Yongtao; Mei, Haiyi; Cai, Zhongang; Sun, Qingping; Wang, Yanjun; Shen, Chunhua; Yang, Lei; Komura, Taku",oral,2303.13796,https://arxiv.org/abs/2303.13796,,https://huggingface.co/papers/2303.13796,,,,9,0 ACLS: Adaptive and Conditional Label Smoothing for Network Calibration,"Park, Hyekang; Noh, Jongyoun; Oh, Youngmin; Baek, Donghyeon; Ham, Bumsub*",oral,2308.11911,https://arxiv.org/abs/2308.11911,,https://huggingface.co/papers/2308.11911,,,,5,0 PGFed: Personalize Each Client's Global Objective for Federated Learning,"Luo, Jun*; Mendieta, Matias; Chen, Chen; Wu, Shandong",oral,2212.01448,https://arxiv.org/abs/2212.01448,,https://huggingface.co/papers/2212.01448,,,,4,0 Overcoming Bias in Pretrained Models by Manipulating the Finetuning Dataset,"Wang, Angelina*; Russakovsky, Olga",oral,,,,,,,,, ITI-GEN: Inclusive Text-to-Image Generation,"Zhang, Cheng*; Chen, Xuanbai; Chai, Siqi; Wu, Chen Henry; Lagun, Dmitry; Beeler, Thabo; de la Torre, 
Fernando",oral,,,,,,,,, FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods,"Hesse, Robin*; Schaub-Meyer, Simone; Roth, Stefan",oral,2308.06248,https://arxiv.org/abs/2308.06248,,https://huggingface.co/papers/2308.06248,,,,3,0 X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events,"Dai, Bo*; Wang, Linge; Jia, Baoxiong; Zhang, Zeyu; Zhang, Chi ; Zhu, Yixin; Zhu, Song-Chun",oral,,,,,,,,, Adaptive Testing of Computer Vision Models,"Gao, Irena*; Ilharco, Gabriel; Lundberg, Scott; Ribeiro, Marco Tulio",oral,2212.02774,https://arxiv.org/abs/2212.02774,,https://huggingface.co/papers/2212.02774,,,,4,0 Segment Anything,"Kirillov, Alexander*; Mintun, Eric; Ravi, Nikhila; Mao, Hanzi; Rolland, Chloe; Gustafson, Laura ; Xiao, Tete; Whitehead, Spencer; Berg, Alexander C; Lo, Wan-Yen; Dollar, Piotr; Girshick, Ross",oral,2304.02643,https://arxiv.org/abs/2304.02643,https://github.com/facebookresearch/segment-anything,https://huggingface.co/papers/2304.02643,,https://huggingface.co/facebook/sam-vit-huge,,12,0 Shape Analysis of Euclidean Curves under Frenet-Serret Framework,"Chassat, Perrine; Park, Juhyun; Brunel, Nicolas*",oral,,,,,,,,, Unmasking Anomalies in Road-Scene Segmentation,"Rai, Shyam Nandan*; Cermelli, Fabio; Fontanel, Dario; Masone, Carlo; Caputo, Barbara",oral,2307.13316,https://arxiv.org/abs/2307.13316,https://github.com/shyam671/Mask2Anomaly-Unmasking-Anomalies-in-Road-Scene-Segmentation,https://huggingface.co/papers/2307.13316,,,,5,0 High-Quality Entity Segmentation,"Qi, Lu*; Kuen, Jason; Guo, Weidong; Shen, Tiancheng; Gu, Jiuxiang; Li, Wenbo; Jia, Jiaya; Lin, Zhe; Yang, Ming-Hsuan",oral,2211.05776,https://arxiv.org/abs/2211.05776,,https://huggingface.co/papers/2211.05776,,,,8,0 Towards Open-Vocabulary Video Instance Segmentation,"wang, haochen*; yan, cilin; Wang, Shuai; Jiang, Xiaolong; Tang, Xu; Hu, Yao; Xie, Weidi; Gavves, 
Efstratios",oral,2304.01715,https://arxiv.org/abs/2304.01715,https://github.com/haochenheheda/LVVIS,https://huggingface.co/papers/2304.01715,,,,8,0 Beyond One-to-One: Rethinking the Referring Image Segmentation,"Hu, Yutao*; Wang, Qixiong; Shao, Wenqi; Xie, Enze; Li, Zhenguo; Han, Jungong; Luo, Ping",oral,,,,,,,,, Multiple Instance Learning Framework with Masked Hard Instance Mining for Whole Slide Image Classification,"Tang, Wenhao; Huang, Sheng*; Zhang, Xiaoxian; Zhou, Fengtao; Zhang, Yi; Liu, Bo",oral,2307.15254,https://arxiv.org/abs/2307.15254,https://github.com/DearCaat/MHIM-MIL,https://huggingface.co/papers/2307.15254,,,,6,0 Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning,"Reed, Colorado J; Gupta, Ritwik*; Li, Shufan; Brockman, Sarah; Funk, Christopher; Clipp, Brian S; Keutzer, Kurt; Candido, Salvatore; Uyttendaele, Matt; Darrell, Trevor",oral,,,,,,,,, Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval,"Li, Pandeng*; Xie, Chen-Wei; Zhao, Liming; Xie, Hongtao; Ge, Jiannan; Zheng, Yun; Zhao, Deli; Zhang, Yongdong",oral,,,,,,,,, Towards Deeply Unified Depth-aware Panoptic Segmentation with Bi-directional Guidance Learning,"He, Junwen*; Wang, Yifan; Wang, Lijun; Lu, Huchuan; Luo, Bin; He, Jun-Yan; Lan, Jin-Peng; Geng, Yifeng; Xie, Xuansong",oral,2307.14786,https://arxiv.org/abs/2307.14786,,https://huggingface.co/papers/2307.14786,,,,9,1 LogicSeg: Parsing Visual Semantics with Neural Logic Learning and Reasoning,"Li, Liulei; Wang, Wenguan*; Yang, Yi",oral,,,,,,,,, ASIC: Aligning Sparse in-the-wild Image Collections,"Gupta, Kamal*; Jampani, Varun; Shrivastava, Abhinav; Makadia, Ameesh; Snavely, Noah; Esteves, Carlos; Kar, Abhishek",oral,2303.16201,https://arxiv.org/abs/2303.16201,,https://huggingface.co/papers/2303.16201,,,,7,1 CLIPascene: Scene Sketching with Different Types and Levels of Abstraction,"Vinker, Yael*; Alaluf, Yuval; Cohen-Or, Danny; Shamir, 
Ariel",oral,2211.17256,https://arxiv.org/abs/2211.17256,,https://huggingface.co/papers/2211.17256,,,,4,0 LD-ZNet: A Latent Diffusion Approach for Text-Based Image Segmentation,"PNVR, Koutilya*; Singh, Bharat; Ghosh, Pallabi; Jacobs, David; Siddiquie, Behjat",oral,,,,,,,,, TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models,"Cao, Tianshi*; Kreis, Karsten; Fidler, Sanja; Sharp, Nicholas; Yin, Kangxue",oral,,,,,,,,, NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions,"Chen, Zhang*; Li, Zhong; Song, Liangchen; Chen, Lele; Yu, Jingyi; Yuan, Junsong; Xu, Yi",oral,,,,,,,,, Scalable Diffusion Models with Transformers,"Peebles, William*; Xie, Saining",oral,2212.09748,https://arxiv.org/abs/2212.09748,,https://huggingface.co/papers/2212.09748,,,,2,0 Taming Texture Generation on 3D Meshes with Point-UV Diffusion,"Yu, Xin*; Dai, Peng; Li, Wenbo; Ma, Lan; Liu, Zhengzhe; Qi, Xiaojuan",oral,,,,,,,,, Generative Novel View Synthesis with 3D-Aware Diffusion Models,"Chan, Eric*; Nagano, Koki; Park, Jeong Joon; Chan, Matthew; Bergman, Alexander W; Levy, Axel; Aittala, Miika; De Mello, Shalini; Karras, Tero; Wetzstein, Gordon",oral,2304.02602,https://arxiv.org/abs/2304.02602,,https://huggingface.co/papers/2304.02602,,,,10,0 DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-efficient Fine-Tuning,"Xie, Enze*; Li, Zhenguo; Zhou, Daquan; LIU, Zhili; Shi, Han; Li, Jiawei; Yao, Lewei; Liu, Zhaoqiang",oral,2304.06648,https://arxiv.org/abs/2304.06648,,https://huggingface.co/papers/2304.06648,,,,8,0 VQ3D: Learning a 3D-Aware Generative Model on ImageNet,"Sargent, Kyle*; Koh, Jing Yu; Zhang, Han; Chang, Huiwen; Herrmann, Charles; Srinivasan, Pratul; Wu, Jiajun; Sun, Deqing",oral,2302.06833,https://arxiv.org/abs/2302.06833,,https://huggingface.co/papers/2302.06833,,,,8,0 Ref-NeuS: Ambiguity-Reduced Neural Implicit Surface Learning for Multi-View Reconstruction with Reflection,"Ge, Wenhang; Hu, Tao; Zhao, Haoyu; Liu, 
Shu; Chen, Yingcong*",oral,,,,,,,,, Generative Diffusions in Augmented Spaces: A Complete Recipe,"Pandey, Kushagra*; Mandt, Stephan",oral,2303.01748,https://arxiv.org/abs/2303.01748,https://github.com/mandt-lab/PSLD,https://huggingface.co/papers/2303.01748,,,,2,0 MMVP: Motion-Matrix-based Video Prediction,"Zhong, Yiqi*; Liang, Luming; Zharkov, Ilya; Neumann, Ulrich",oral,2308.16154,https://arxiv.org/abs/2308.16154,,https://huggingface.co/papers/2308.16154,,,,4,0 Robust Monocular Depth Estimation under Challenging Conditions,"Gasperini, Stefano*; Morbitzer, Nils; Jung, HyunJun; Navab, Nassir; Tombari, Federico",poster,2308.09711,https://arxiv.org/abs/2308.09711,,https://huggingface.co/papers/2308.09711,,,,5,1 UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework,"Wang, Tianhang; Chen, Guang*; Chen, Kai; Liu, Zhengfa; bo, zhang; Knoll, Alois C.; Jiang, Changjun",poster,2303.12400,https://arxiv.org/abs/2303.12400,,https://huggingface.co/papers/2303.12400,,,,7,0 View Consistent Purification for Accurate Cross-View Localization,"Wang, Shan*; Zhang, Yanhao; Vora, Ankit; Perincherry, Akhil; LI, HONGDONG",poster,2308.08110,https://arxiv.org/abs/2308.08110,,https://huggingface.co/papers/2308.08110,,,,5,0 Semi-supervised Semantics-guided Adversarial Training for Robust Trajectory Prediction,"Jiao, Ruochen*; Liu, Xiangguo; SATO, TAKAMI; Chen, Alfred; Qi, Zhu",poster,,,,,,,,, NeRF-LOAM: Neural Implicit Representation for Large-Scale Incremental LiDAR Odometry and Mapping,"DENG, Junyuan; Wu, Qi; Chen, Xieyuanli*; Xia, Songpengcheng; Sun, Zhen; Liu, Guoqing; Yu, Wenxian; Pei, Ling",poster,,,,,,,,, MapPrior: A Generative Approach for Bird’s-Eye View Perception,"Zhu, Xiyue*; Zyrianov, Vlas; Liu, Zhijian; Wang, Shenlong",poster,,,,,,,,, Hidden Biases of End-to-End Driving Models,"Jaeger, Bernhard*; Chitta, Kashyap; Geiger, Andreas",poster,2306.07957,https://arxiv.org/abs/2306.07957,,https://huggingface.co/papers/2306.07957,,,,3,2 Search 
for or Navigate to? Dual Adaptive Thinking for Object Navigation,"Dang, Ronghao*; Wang, Liuyi; He, Zongtao; Su, Shuai; Tang, Jiagui; Liu, Chengju; Chen, Qijun",poster,2208.00553,https://arxiv.org/abs/2208.00553,,https://huggingface.co/papers/2208.00553,,,,6,0 Segmenting Known Objects and Unseen Unknowns without Prior Knowledge,"Gasperini, Stefano*; Marcos-Ramiro, Alvaro; Schmidt, Michael; Navab, Nassir; Busam, Benjamin ; Tombari, Federico",poster,2209.05407,https://arxiv.org/abs/2209.05407,,https://huggingface.co/papers/2209.05407,,,,6,1 BiFF: Bi-level Future Fusion with Polyline-based Coordinate for Interactive Trajectory Prediction,"ZHU, Yiyao*; LUAN, Di; Shen, Shaojie",poster,2306.14161,https://arxiv.org/abs/2306.14161,,https://huggingface.co/papers/2306.14161,,,,3,0 Towards Zero Domain Gap: A Comprehensive Study of Realistic LiDAR Simulation for Autonomy Testing,"Manivasagam, Sivabalan*; Bârsan, Ioan Andrei; Wang, Jingkang; Yang, Ze; Urtasun, Raquel",poster,,,,,,,,, Looking Beyond Single Scenes: Clustering based Point Cloud Segmentation Learning,"Feng, Tuo*; Wang, Wenguan; Wang, Xiaohan; Yang, Yi; Zheng, Qinghua",poster,,,,,,,,, ADAPT: Efficient Multi-Agent Trajectory Prediction with Adaptation,"Aydemir, Görkay*; Akan, Kaan Adil; Guney, Fatma",poster,2307.14187,https://arxiv.org/abs/2307.14187,,https://huggingface.co/papers/2307.14187,,,,3,0 MV-DeepSDF: Implicit Modeling with Multi-Sweep Point Clouds for 3D Vehicle Reconstruction in Autonomous Driving,"LIU, YIBO*; Zhu, Kelly; Wu, Guile; Ren, Yuan; Bingbing, Liu; Liu, Yang; SHAN, JINJUN",poster,,,,,,,,, Learning Vision-and-Language Navigation from YouTube Videos,"Lin, Kunyang*; Chen, Peihao; Huang, Diwei; Li, Thomas H.; Tan, Mingkui; Gan, Chuang",poster,2307.11984,https://arxiv.org/abs/2307.11984,https://github.com/JeremyLinky/YouTube-VLN,https://huggingface.co/papers/2307.11984,,,,6,0 TrajPAC: Towards Robustness Verification of Pedestrian Trajectory Prediction Models,"Zhang, Liang*; Xu, Nathaniel; Yang, 
Pengfei; Jin, Gaojie; Huang, Cheng-Chao; Zhang, Lijun",poster,2308.05985,https://arxiv.org/abs/2308.05985,,https://huggingface.co/papers/2308.05985,,,,6,0 VAD: Vectorized Scene Representation for Efficient Autonomous Driving,"Jiang, Bo; Chen, Shaoyu; xu, qing; Liao, Bencheng; Chen, Jiajie; Zhou, Helong; Zhang, Qian; Liu, Wenyu; Huang, Chang; Wang, Xinggang*",poster,2303.12077,https://arxiv.org/abs/2303.12077,https://github.com/hustvl/VAD,https://huggingface.co/papers/2303.12077,,,,10,0 Traj-MAE: Masked Autoencoders for Trajectory Prediction,"Chen, Hao; Wang, Jiaze*; Shao, Kun; Liu, Furui; Hao, Jianye; GUAN, Chenyong; Chen, Guangyong; Heng, Pheng-Ann",poster,,,,,,,,, Sparse Point Guided 3D Lane Detection,"Yao, Chengtang*; Yu, Lidong; Jia, Yunde; WU, Yuwei",poster,,,,,,,,, A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection,"Zhang, Dingyuan*; Liang, Dingkang; Zou, Zhikang; Li, Jingyu; Ye, Xiaoqing; Tan, Xiao; Liu, Zhe; Bai, Xiang",poster,,,,,,,,, Learn TAROT with MENTOR: A Meta-Learned Self-supervised Approach for Trajectory Prediction,"Pourkeshavarz, Mozhgan MP*; Chen, Changhe; Rasouli, Amir",poster,,,,,,,,, FocalFormer3D : Focusing on Hard Instance for 3D Object Detection,"Chen, Yilun*; Yu, Zhiding; Chen, Yukang; Lan, Shiyi; Anandkumar, Animashree; Jia, Jiaya; Alvarez, Jose M",poster,2308.04556,https://arxiv.org/abs/2308.04556,https://github.com/NVlabs/FocalFormer3D,https://huggingface.co/papers/2308.04556,,,,7,1 Scene as Occupancy,"Tong, Wenwen; Sima, Chonghao*; Wang, Tai; Chen, Li; wu, silei; Deng, Hanming; Gu, Yi; Lu, Lewei; Luo, Ping; Lin, Dahua; Li, Hongyang",poster,2306.02851,https://arxiv.org/abs/2306.02851,,https://huggingface.co/papers/2306.02851,,,,11,0 Neural Scene Rasterization for Large Scene Rendering in Real-time,"Liu, Jeffrey Yunfan*; Chen, Yun; Yang, Ze; Wang, Jingkang; Manivasagam, Sivabalan; Urtasun, Raquel",poster,,,,,,,,, A Game of Bundle Adjustment - Learning Efficient Convergence,"Belder, Amir*; VIVANTI, REFAEL; 
Tal, Ayellet",poster,,,,,,,,, Efficient Transformer-based 3D Object Detection with Dynamic Token Halting,"Ye, Mao*; Meyer, Gregory P; Chai, Yuning; Liu, Qiang",poster,2303.05078,https://arxiv.org/abs/2303.05078,,https://huggingface.co/papers/2303.05078,,,,4,0 RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration,"Liu, Jiuming; Wang, Guangming; Liu, Zhe; Jiang, Chaokang; Pollefeys, Marc; Wang, Hesheng*",poster,2303.12384,https://arxiv.org/abs/2303.12384,,https://huggingface.co/papers/2303.12384,,,,6,0 CASSPR: Cross Attention Single Scan Place Recognition,"Xia, Yan*; Gladkova, Mariia; Wang, Rui; Li, Qianyun; Stilla, Uwe M; Henriques, Joao F; Cremers, Daniel",poster,2211.12542,https://arxiv.org/abs/2211.12542,,https://huggingface.co/papers/2211.12542,,,,7,0 Recursive Video Lane Detection,"Jin, Dongkwon; Kim, Dahyun; Kim, Chang-Su*",poster,2308.11106,https://arxiv.org/abs/2308.11106,https://github.com/dongkwonjin/RVLD,https://huggingface.co/papers/2308.11106,,,,3,0 Parametric Depth Based Feature Representation Learning for Object Detection and Segmentation in Bird’s-Eye View,"Yang, Jiayu*; Xie, Enze; Liu, Miaomiao; Alvarez, Jose M",poster,,,,,,,,, SHIFT3D: Synthesizing Hard Inputs For Tricking 3D Detectors,"Chen, Hongge *; Shrivastava, Ashish; Chai, Yuning; Chen, Zhao; Meyer, Gregory P; Park, Dennis; Vondrick, Carl",poster,,,,,,,,, Bootstrap Motion Forecasting With Self-Consistent Constraints,"Ye, Maosheng*; Xu, Jiamiao; Xu, Xunnong; Wang, Tengfei; Cao, Tongyi; Chen, Qifeng",poster,2204.05859,https://arxiv.org/abs/2204.05859,,https://huggingface.co/papers/2204.05859,,,,6,0 Towards Viewpoint Robustness in Bird’s Eye View Segmentation,"Klinghoffer, Tzofi M*; Philion, Jonah; Chen, Wenzheng; Litany, Or; Gojcic, Zan; Joo, Jungseock; Raskar, Ramesh; Fidler, Sanja; Alvarez, Jose M",poster,,,,,,,,, R-Pred: Two-Stage Motion Prediction Via Tube-Query Attention-Based Trajectory Refinement,"Choi, Sehwan*; Choi, Jun Won; Kim, Jungho; 
Yun, Junyong",poster,,,,,,,,, INT2: Interactive Trajectory Prediction at Intersections,"yan, zhijie z-j; Li, Pengfei; Fu, Zheng; Xu, Shaocong; shi, yongliang; Chen, Xiaoxue; Zheng, Yuhang; Li, Yang; Liu, Tianyu; Li, Chuxuan; Luo, Nairui; Gao, Xu; Chen, Yilun; Wang, Zuoxu; Shi, Yifeng; HUANG, Pengfei; Han, Zhengxiao; Yuan, Jirui; Gong, Jiangtao; Zhou, Guyue; Zhao, Hang*; Zhao, Hao",poster,,,,,,,,, MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception,"Zhou, Hongyu*; Ge, Zheng; Li, Zeming; Zhang, Xiangyu",poster,2211.10593,https://arxiv.org/abs/2211.10593,,https://huggingface.co/papers/2211.10593,,,,4,0 Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding,"Zhu, Pengfei; Qi, Mengshi*; Li, Xia; Li, Weijian; Ma, Huadong",poster,2303.09706,https://arxiv.org/abs/2303.09706,,https://huggingface.co/papers/2303.09706,,,,5,0 SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation,"Chen, Xuechao; Xu, Shuangjie*; Zou, Xiaoyi; Cao, Tongyi; Yeung, Dit-Yan; Fang, Lu",poster,2308.13323,https://arxiv.org/abs/2308.13323,,https://huggingface.co/papers/2308.13323,,,,6,0 MotionLM: Multi-Agent Motion Forecasting as Language Modeling,"Seff, Ari*; Cera, Brian; Chen, Dian; Ng, Mason; Zhou, Aurick; Nayakanti, Nigamaa; Refaat, Khaled S; Al-Rfou, Rami; Sapp, Benjamin",poster,,,,,,,,, Improving Online Lane Graph Extraction by Object-Lane Clustering,"Can, Yigit Baran*; Liniger, Alexander; Paudel, Danda Pani; Van Gool, Luc",poster,2307.10947,https://arxiv.org/abs/2307.10947,,https://huggingface.co/papers/2307.10947,,,,4,0 Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving,"Najibi, Mahyar*; Ji, Jingwei; Zhou, Yin; Qi, Charles R.; Yan, Xinchen; Ettinger, Scott; Anguelov, Dragomir",poster,,,,,,,,, Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network,"Han, Wencheng; Yin, Junbo; Shen, 
Jianbing*",poster,2308.05605,https://arxiv.org/abs/2308.05605,,https://huggingface.co/papers/2308.05605,,,,3,0 Ordered Atomic Activity for Fine-grained Interactive Traffic Scenario Understanding,"Agarwal, Nakul*; Chen, Yi-Ting",poster,,,,,,,,, Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation,"Wang, Zeyu; Li, Dingwen*; Luo, Chenxu; Xie, Cihang; Yang, Xiaodong",poster,,,,,,,,, Video Task Decathlon: Unification of Image and Video Tasks For Autonomous Driving,"Huang, Thomas E*; Liu, Yifan; Van Gool, Luc; Yu, Fisher",poster,,,,,,,,, MV-Map: Offboard HD-Map Generation with Multi-view Consistency,"Xie, ZiYang*; Pang, Ziqi; Wang, Yu-Xiong",poster,,,,,,,,, Towards Universal LiDAR-Based 3D Object Detection by Multi-Domain Knowledge Transfer,"Wu, Guile*; Cao, Tongtong; Bingbing, Liu; Chen, Xingxin; Ren, Yuan",poster,,,,,,,,, Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders,"CHENG, Jie*; MEI, Xiaodong; Liu, Ming",poster,,,,,,,,, UniFusion: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird’s-Eye-View,"Qin, Zequn*; Chen, Jingyu; Chen, Chao ; Chen, Xiaozhi; Li, Xi",poster,,,,,,,,, BEVPlace: Learning LiDAR-based Place Recognition using Bird's Eye View Images,"Luo, Lun; Zheng, Shuhang; Li, Yixuan; Fan, Yongzhi; Yu, Beinan; Cao, Si-Yuan; li, junwei; Shen, Hui-Liang*",poster,2302.14325,https://arxiv.org/abs/2302.14325,https://github.com/zjuluolun/BEVPlace,https://huggingface.co/papers/2302.14325,,,,7,0 CORE: Cooperative Reconstruction for Multi-Agent Perception,"Wang, Binglu; Zhang, Lei; Wang, Zhaozhong; Zhao, Yongqiang; Zhou, Tianfei*",poster,2307.11514,https://arxiv.org/abs/2307.11514,,https://huggingface.co/papers/2307.11514,,,,5,0 MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation,"GE, Chongjian*; Xie, Enze; Chen, Junsong; Hong, Lanqing; Wang, Zhongdao; Li, Zhenguo; Lu, Huchuan; Luo, 
Ping",poster,2304.09801,https://arxiv.org/abs/2304.09801,,https://huggingface.co/papers/2304.09801,,,,8,0 Aggregating Feature Point Cloud for Depth Completion,"Yu, Zhu; Sheng, Zehua; Zhou, Zili; Luo, Lun; Cao, Si-Yuan; Gu, Hong; Zhang, Huaqi; Shen, Hui-Liang*",poster,,,,,,,,, Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos,"Li, Haoyuan; Dong, Haoye; Jia, Hanchao; Huang, Dong; Kampffmeyer, Michael C.; Lin, Liang; Liang, Xiaodan*",poster,2308.10334,https://arxiv.org/abs/2308.10334,https://github.com/Li-Hao-yuan/CoordFormer,https://huggingface.co/papers/2308.10334,,,,7,0 MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation,"Yasarla, Rajeev*; Cai, Hong; Jeong, Jisoo; Shi, Yunxiao; Garrepalli, Risheek; Porikli, Fatih",poster,2307.14336,https://arxiv.org/abs/2307.14336,,https://huggingface.co/papers/2307.14336,,,,6,0 SlaBins: Fisheye Depth Estimation using Slanted Bins on Road Environments,"Lee, Jongsung*; Cho, Gyeongsu; Park, Jeongin; Lee, Seongoh; Kim, Kyongjun; Kim, Jung Hee; Jeong, Seong-Gyun; Joo, Kyungdon",poster,,,,,,,,, Creative Birds: Self-Supervised Single-View 3D Style Transfer,"Wang, Renke; Que, Guimin; Chen, Shuo; Li, Xiang; Li, Jun*; Yang, Jian",poster,2307.14127,https://arxiv.org/abs/2307.14127,https://github.com/wrk226/creative_birds,https://huggingface.co/papers/2307.14127,,,,6,0 Dynamic PlenOctree for Adaptive Sampling Refinement in Explicit NeRF,"Bai, Haotian; Lin, Yiqi; Chen, Yize; Wang, Lin *",poster,2307.15333,https://arxiv.org/abs/2307.15333,,https://huggingface.co/papers/2307.15333,,,,4,0 CORE: Co-planarity Regularized Monocular Geometry Estimation with Weak Supervision,"Li, Yuguang *; Wang, Kai; Li, Hui; Rhee, Seon-Min; Han, Seungju; Kim, Jihye; Yang, Min; Yang, Ran; Zhu, Feng",poster,,,,,,,,, Relightify: Relightable 3D faces from a single image via diffusion models,"Paraperas Papantoniou, Foivos*; Lattas, Alexandros; Moschoglou, Stylianos; Zafeiriou, 
Stefanos",poster,2305.06077,https://arxiv.org/abs/2305.06077,,https://huggingface.co/papers/2305.06077,,,,4,2 GLA-GCN: Global-local Adaptive Graph Convolutional Network for 3D Human Pose Estimation from Monocular Video,"YU, Bruce X.B.*; ZHANG, Zhi; Liu, Yongxu; Zhong, Sheng-hua; Liu, Yan; Chen, Chang Wen",poster,,,,,,,,, Calibrating Panoramic Depth Estimation for Practical Localization and Mapping,"Kim, Junho*; Lee, Eun Sun; Kim, Young Min",poster,2308.14005,https://arxiv.org/abs/2308.14005,,https://huggingface.co/papers/2308.14005,,,,3,0 SymNP: Learning Symmetry Priors between Neural Points,"Wewer, Christopher Johannes; Ilg, Eddy; Schiele, Bernt; Lenssen, Jan E.*",poster,,,,,,,,, AGG-Net: Attention Guided Gated-convolutional Network for Depth Image Completion,"Chen, Dongyue; Huang, Tingxuan*; song, zhimin; Deng, Shizhuo; Jia, Tong",poster,,,,,,,,, Viewset Diffusion for Probabilistic Single Image 3D Reconstruction,"Szymanowicz, Stanislaw K*; Rupprecht, Christian; Vedaldi, Andrea",poster,,,,,,,,, CVSformer: Cross-View Synthesis Transformer for Semantic Scene Completion,"Dong, Haotian; ma, enhui; Wang, Lubo; Wang, Miaohui; Xie, Wuyuan; Guo, Qing; Li, Ping; Liang, Lingyu; yang, kairui; Lin, Di*",poster,2307.07938,https://arxiv.org/abs/2307.07938,,https://huggingface.co/papers/2307.07938,,,,10,0 U-RED: Unsupervised 3D Shape Retrieval and Deformation for Partial Point Clouds,"Di, Yan*; Zhang, Chenyangguang; Zhang, Ruida; Manhardt, Fabian; Su, Yongzhi; Rambach, Jason; Stricker, Didier; Ji, Xiangyang; Tombari, Federico",poster,,,,,,,,, Single Depth-image 3D Reflection Symmetry And Shape Prediction,"Zhang, Zhaoxuan; Dong, Bo; Li, Tong; Heide, Felix; Peers, Pieter; Yin, Baocai ; Yang, Xin*",poster,,,,,,,,, Self-supervised Monocular Depth Estimation: Let's Talk About The Weather,"Saunders, Kieran R*; Vogiatzis, George; Manso, Luis J.",poster,2307.08357,https://arxiv.org/abs/2307.08357,,https://huggingface.co/papers/2307.08357,,,,3,1 Mesh2Tex: Generating Mesh Textures from 
Image Queries,"Bokhovkin, Aleksei*; Tulsiani, Shubham; Dai, Angela",poster,2304.05868,https://arxiv.org/abs/2304.05868,,https://huggingface.co/papers/2304.05868,,,,3,1 Sketch and Text Guided Diffusion Model for Colored Point Cloud Generation,"Wu, Zijie*; Wang, Yaonan; Feng, Mingtao; Xie, He; Mian, Ajmal",poster,2308.02874,https://arxiv.org/abs/2308.02874,,https://huggingface.co/papers/2308.02874,,,,5,0 Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation,"Lyu, Xiaoyang*; Dai, Peng; Li, Zizhang; Yan, Dongyu; Lin, Yi; PENG, YIFAN; Qi, Xiaojuan",poster,2303.09152,https://arxiv.org/abs/2303.09152,,https://huggingface.co/papers/2303.09152,,,,7,1 Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering,"Zhang, Chi*; Yin, Wei; Yu, Gang; wang, zhibin; Chen, Tao; Fu, Bin; Zhou, Joey Tianyi; Shen, Chunhua",poster,,,,,,,,, FeatureNeRF: Learning Generalizable NeRFs by Distilling Pre-trained Vision Foundation Models,"Ye, Jianglong*; Wang, Naiyan; Wang, Xiaolong",poster,,,,,,,,, One-shot Implicit Animatable Avatars with Model-based Priors,"Huang, Yangyi*; Yi, Hongwei; Liu, Weiyang; Wang, Haofan; Wu, Boxi; Wang, Wenxiao; Lin, Binbin; Zhang, Debing; Cai, Deng",poster,2212.02469,https://arxiv.org/abs/2212.02469,,https://huggingface.co/papers/2212.02469,,,,9,0 VeRi3D: Generative Vertex-based Radiance Fields for 3D Controllable Human Image Synthesis,"Chen, Xinya*; Huang, Jiaxin; Yanrui, Bin; Yu, Lu; Liao, Yiyi",poster,,,,,,,,, Diffuse3D: Wide-Angle 3D Photography via Bilateral Diffusion,"Jiang, Yutao; Zhou, Yang; liang, yuan; Liu, Wenxi; Jiao, Jianbo; Quan, Yuhui; He, Shengfeng*",poster,,,,,,,,, AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration,"Dang, Zheng*; Salzmann, Mathieu",poster,,,,,,,,, Body Knowledge and Uncertainty Modeling for Monocular 3D Human Body Reconstruction,"Zhang, Yufei*; Wang, Hanjing; Kephart, Jeffrey; Ji, 
Qiang",poster,2308.00799,https://arxiv.org/abs/2308.00799,,https://huggingface.co/papers/2308.00799,,,,4,0 Accurate 3D Face Reconstruction with Facial Component Tokens,"Zhang, Tianke*; Chu, Xuangeng; Liu, Yunfei; Lin, Lijian; Yang, Zhendong; Cao, Chengkun; Xu, Zhengzhuo; YU, Fei; Zhou, Changyin; Yuan, Chun; Li, Yu",poster,,,,,,,,, Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image,"Yin, Wei*; Zhang, Chi; Chen, Hao; Cai, Zhipeng; Chen, Xiaozhi; Wang, Kaixuan; Yu, Gang; Shen, Chunhua",poster,2307.10984,https://arxiv.org/abs/2307.10984,https://github.com/YvanYin/Metric3D,https://huggingface.co/papers/2307.10984,,,,8,0 Reconstructing Interacting Hands with Interaction Prior from Monocular Images,"Zuo, Binghui*; Zhao, Zimeng; Sun, Wenqian; Xie, Wei; Xue, Zhou; Wang, Yangang",poster,2308.14082,https://arxiv.org/abs/2308.14082,https://github.com/binghui-z/InterPrior_pytorch,https://huggingface.co/papers/2308.14082,,,,6,0 SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis,"Wang, Guangcong*; Chen, Zhaoxi; Loy, Chen Change; Liu, Ziwei",poster,2303.16196,https://arxiv.org/abs/2303.16196,,https://huggingface.co/papers/2303.16196,,,,4,0 Beyond the limitation of monocular 3D detector via knowledge distillation,"Yang, Yiran*; Yin, Dongshuo; Rong, Xuee; Sun, Xian; Diao, Wenhui; Li, Xinming",poster,,,,,,,,, HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details,"Chai, Zenghao; Zhang, Tianke; He, Tianyu; Tan, Xu*; Baltrusaitis, Tadas; Wu, HsiangTao; Li, Runnan; Zhao, Sheng; Yuan, Chun; Bian, Jiang",poster,2303.11225,https://arxiv.org/abs/2303.11225,,https://huggingface.co/papers/2303.11225,,,,10,2 Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape,"Xu, Jiacong*; Zhang, Yi; Peng, Jiawei; Ma, Wufei; Jesslen, Artur; Ji, Pengliang; Hu, Qixin; Zhang, Jiehua; Liu, Qihao; Wang, Jiahao; Ji, Wei; Wang, Chen; Yuan, Xiaoding; Kaushik, Prakhar; Zhang, Guofeng; liu, jie; Xie, Yushan; Cui, Yawen; Yuille, Alan; 
Kortylewski, Adam",poster,2308.11737,https://arxiv.org/abs/2308.11737,,https://huggingface.co/papers/2308.11737,,,,20,0 JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery,"Li, Jiahao*; Yang, Zongxin; Wang, Xiaohan; Ma, Jianxin; Zhou, Chang; Yang, Yi",poster,2307.16377,https://arxiv.org/abs/2307.16377,,https://huggingface.co/papers/2307.16377,,,,6,0 Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction,"Pautrat, Rémi*; Liu, Shaohui; Hruby, Petr; Pollefeys, Marc; Barath, Daniel",poster,2308.10694,https://arxiv.org/abs/2308.10694,https://github.com/cvg/VP-Estimation-with-Prior-Gravity,https://huggingface.co/papers/2308.10694,,,,5,0 Detailed Clothed Avatar Reconstruction from Implicit Distribution Fields,"Yang, Xueting; Luo, Yihao; Xiu, Yuliang; Wei, Wang; Xu, Hao; Fan, Zhaoxin*",poster,,,,,,,,, 3D Distillation: Improving Self-Supervised Monocular Depth Estimation on Reflective Surfaces,"Shi, Xuepeng*; Dikov, Georgi; Reitmayr, Gerhard; Kim, Tae-Kyun (T-K); Ghafoorian, Mohsen",poster,,,,,,,,, DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields,"Zhang, Junzhe*; Lan, Yushi; Yang, Shuai; Hong, Fangzhou; Wang, Quan; Yeo, Chai Kiat; Liu, Ziwei; Loy, Chen Change",poster,,,,,,,,, MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection,"Zhang, Renrui*; Qiu, Han; Wang, Tai; Guo, Ziyu; Cui, Ziteng; Gao, Peng; Qiao, Yu; Li, Hongsheng",poster,2203.13310,https://arxiv.org/abs/2203.13310,https://github.com/ZrrSkywalker/MonoDETR,https://huggingface.co/papers/2203.13310,,,,9,0 ReLeaPS : Reinforcement Learning-based Illumination Planning for Generalized Photometric Stereo,"Chan, Jun Hoong*; Yu, Bohan; Guo, Heng; Ren, Jieji; Lu, Zongqing; Shi, Boxin",poster,,,,,,,,, Convex Decomposition of Indoor Scenes,"Vavilala, Vaibhav S*; Forsyth, David",poster,2307.04246,https://arxiv.org/abs/2307.04246,,https://huggingface.co/papers/2307.04246,,,,2,0 NeO 360: Neural Fields for Sparse View Synthesis of 
Outdoor Scenes,"Irshad, Muhammad Zubair*; Zakharov, Sergey; Liu, Katherine; Guizilini, Vitor; Kollar, Thomas; Gaidon, Adrien; Ambruș, Rareș A; Kira, Zsolt",poster,2308.12967,https://arxiv.org/abs/2308.12967,https://github.com/zubair-irshad/NeO-360,https://huggingface.co/papers/2308.12967,,,8,8,1 UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields,"Yang, Yuanbo*; Yang, Yifei; Guo, Hanlei; Xiong, Rong; Wang, Yue; Liao, Yiyi",poster,2303.14167,https://arxiv.org/abs/2303.14167,,https://huggingface.co/papers/2303.14167,,,,6,0 Efficient Converted Spiking Neural Network for 3D and 2D classification,"Lan, Yuxiang; Zhang, Yachao; Ma, Xu; Qu, Yanyun*; FU, YUN",poster,,,,,,,,, Distribution-Aligned Diffusion for Human Mesh Recovery,"Foo, Lin Geng*; Gong, Jia; Rahmani, Hossein; Liu, Jun",poster,2308.13369,https://arxiv.org/abs/2308.13369,,https://huggingface.co/papers/2308.13369,,,,4,0 Towards Zero-Shot Scale-Aware Monocular Depth Estimation,"Guizilini, Vitor*; Vasiljevic, Igor; Chen, Dian; Ambruș, Rareș 
A; Gaidon, Adrien",poster,2306.17253,https://arxiv.org/abs/2306.17253,,https://huggingface.co/papers/2306.17253,,,,5,0 Learning Depth Estimation for Transparent and Mirror Surfaces,"Costanzino, Alex*; Zama Ramirez, Pierluigi; Poggi, Matteo; Tosi, Fabio; Mattoccia, Stefano; Di Stefano, Luigi",poster,2307.15052,https://arxiv.org/abs/2307.15052,,https://huggingface.co/papers/2307.15052,,,,6,0 Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction,"Zhang, Xiang*; Chen, Zeyuan; Wei, Fangyin; Tu, Zhuowen",poster,,,,,,,,, 3D VR Sketch Guided 3D Shape Prototyping and Exploration,"Luo, Ling*; Chowdhury, Pinaki Nath; Xiang, Tao; Song, Yi-Zhe; Gryaditskaya, Yulia",poster,2306.10830,https://arxiv.org/abs/2306.10830,https://github.com/Rowl1ng/3Dsketch2shape,https://huggingface.co/papers/2306.10830,,,,5,0 Transparent Shape from a Single View Polarization Image,"Mingqi, Shao*; Xia, Chongkun; Yang, Zhendong; Huang, Junnan; Wang, Xueqian",poster,2204.06331,https://arxiv.org/abs/2204.06331,,https://huggingface.co/papers/2204.06331,,,,5,0 Get3DHuman: Lifting StyleGAN-Human into a 3D Generative Model using Pixel-aligned Reconstruction Priors,"XIONG, Zhangyang*; Kang, Di; Jin, Derong; Chen, Weikai; Bao, Linchao; Cui, Shuguang; Han, Xiaoguang",poster,2302.01162,https://arxiv.org/abs/2302.01162,,https://huggingface.co/papers/2302.01162,,,,7,0 Turn-the-Camera: Towards Zero-Shot Novel View Synthesis and 3D Reconstruction,"Liu, Ruoshi*; Wu, Rundi; Van Hoorick, Basile; Tokmakov, Pavel; Zakharov, Sergey; Vondrick, Carl",poster,,,,,,,,, Pose-free 3D Scene Reconstruction with Frozen Depth Models,"Xu, Guangkai; Yin, Wei; Chen, Hao; Shen, Chunhua; Cheng, Kai; Zhao, Feng*",poster,,,,,,,,, LIST: Learning Implicitly from Spatial Transformers for Single-View 3D Reconstruction,"Arshad, Mohammad Samiul; Beksi, William*",poster,2307.12194,https://arxiv.org/abs/2307.12194,,https://huggingface.co/papers/2307.12194,,,,2,0 3DMiner: Discovering Shapes from Large-Scale Unannotated Image 
Datasets,"Cheng, Ta-Ying*; Gadelha, Matheus A; Pirk, Soeren; GROUEIX, Thibault; Mech, Radomir; Markham, Andrew; Trigoni, Niki",poster,,,,,,,,, Nonrigid Object Contact Estimation With Regional Unwrapping Transformer,"Xie, Wei*; Zhao, Zimeng; Li, Shiying; Zuo, Binghui; Wang, Yangang",poster,2308.14074,https://arxiv.org/abs/2308.14074,,https://huggingface.co/papers/2308.14074,,,,5,0 SHERF: Generalizable Human NeRF from a Single Image,"Hu, Shoukang*; Hong, Fangzhou; Pan, Liang; Mei, Haiyi; Yang, Lei; Liu, Ziwei",poster,2303.12791,https://arxiv.org/abs/2303.12791,,https://huggingface.co/papers/2303.12791,,,,6,0 Full-Body Articulated Human-Object Interaction,"Jiang, Nan; Liu, Tengyu; Cao, Zhexuan; Cui, Jieming; Zhang, Zhiyuan; Chen, Yixin; Wang, He; Zhu, Yixin; Huang, Siyuan*",poster,2212.10621,https://arxiv.org/abs/2212.10621,,https://huggingface.co/papers/2212.10621,,,,9,0 PlaneRecTR: Unified Query learning for 3D Plane Recovery from a Single View,"Shi, Jingjia*; Zhi, Shuaifeng; Xu, Kai",poster,2307.13756,https://arxiv.org/abs/2307.13756,https://github.com/SJingjia/PlaneRecTR,https://huggingface.co/papers/2307.13756,,,,3,0 SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields,"CAO, Anh-Quan*; de Charette, Raoul",poster,2212.02501,https://arxiv.org/abs/2212.02501,,https://huggingface.co/papers/2212.02501,,,,2,0 3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation,"Zhang, Yi*; Ji, Pengliang; Wang, Angtian; Mei, Jieru; Kortylewski, Adam; Yuille, Alan",poster,2308.10123,https://arxiv.org/abs/2308.10123,https://github.com/edz-o/3DNBF,https://huggingface.co/papers/2308.10123,,,,6,0 Two-in-One Depth: Bridging the Gap Between Monocular and Binocular Self-supervised Depth Estimation,"Zhou, Zhengming; Dong, Qiulei*",poster,,,,,,,,, LRRU: Long-short Range Recurrent Updating Networks for Depth Completion,"Wang, yufei; Dai, Yuchao*; Zhang, Ge; Li, Bo; Liu, Qi; Gao, Tao",poster,,,,,,,,, OccFormer: Dual-path Transformer for Vision-based 
3D Semantic Occupancy Prediction,"Zhang, Yunpeng*; Zhu, Zheng; Du, Dalong",poster,2304.05316,https://arxiv.org/abs/2304.05316,https://github.com/zhangyp15/OccFormer,https://huggingface.co/papers/2304.05316,,,,3,0 CHORD: Category-level in-Hand Object Reconstruction via Shape Deformation,"Li, Kailin*; Yang, Lixin; Zhen, Haoyu; Lin, Zenan; Zhan, Xinyu; Zhong, Licheng; Xu, Jian; Wu, Kejian; Lu, Cewu",poster,,,,,,,,, NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space,"Yao, Jiawei*; Li, Chuming; Sun, Keqiang; Cai, Yingjie; Li, Hao; Ouyang, Wanli; Li, Hongsheng",poster,,,,,,,,, Neural Video Depth Stabilizer,"Wang, Yiran; Shi, Min; li, jiaqi; Huang, Zihao; Cao, Zhiguo; Zhang, Jianming; Xian, Ke*; Lin, Guosheng",poster,2307.08695,https://arxiv.org/abs/2307.08695,,https://huggingface.co/papers/2307.08695,,,,8,0 DiLiGenT-Pi: Photometric Stereo for Planar Surfaces with Rich Details – Benchmark Dataset and Beyond,"Wang, Feishi; Ren, Jieji; Guo, Heng; Ren, Mingjun; Shi, Boxin*",poster,,,,,,,,, TMR: Text-to-Motion Retrieval using Contrastive 3D Human Motion Synthesis,"Petrovich, Mathis*; Black, Michael J.; Varol, Gul",poster,2305.00976,https://arxiv.org/abs/2305.00976,,https://huggingface.co/papers/2305.00976,,,,3,0 Sequential Texts Driven Cohesive Motions Synthesis with Natural Transitions,"Li, Shuai*; Zhuang, Sisi; Song, Wenfeng; Zhang, Xinyu; Chen, Hejia; Hao, Aimin",poster,,,,,,,,, Auxiliary Tasks Benefits 3D Skeleton-based Human Motion Prediction,"Xu, Chenxin; Tan, Robby T.; Yuhong, Tan; Chen, Siheng*; Wang, Xinchao; Wang, Yan-Feng",poster,,,,,,,,, Explicit Motion Disentangling for Efficient Optical Flow Estimation,"Deng, Changxing; Luo, Ao; Huang, Haibin; Ma, Shaodan; Liu, Jiangyu; Liu, Shuaicheng*",poster,,,,,,,,, TrackFlow: Multi-Object tracking with Normalizing Flows,"Mancusi, Gianluca*; Panariello, Aniello; Porrello, Angelo; Fabbri, Matteo; CALDERARA, SIMONE; Cucchiara, 
Rita",poster,2308.11513,https://arxiv.org/abs/2308.11513,,https://huggingface.co/papers/2308.11513,,,,6,0 HumanMAC: Masked Motion Completion for Human Motion Prediction,"Chen, Ling-Hao*; Zhang, JiaWei; LI, YEWEN; Pang, Yiren; Xia, Xiaobo; Liu, Tongliang",poster,2302.03665,https://arxiv.org/abs/2302.03665,,https://huggingface.co/papers/2302.03665,,,,6,0 Geometrized Transformer for Self-Supervised Homography Estimation,"Liu, Jiazhen; Li, Xirong*",poster,,,,,,,,, SemARFlow: Injecting Semantics into Unsupervised Optical Flow Estimation for Autonomous Driving,"Yuan, Shuai *; Yu, Shuzhi; Kim, Hannah H; Tomasi, Carlo",poster,2303.06209,https://arxiv.org/abs/2303.06209,https://github.com/duke-vision/semantic-unsup-flow-release,https://huggingface.co/papers/2303.06209,,,,4,0 Shi-NeSS: Detecting Good and Stable Keypoints with a Neural Stability Score,"Pakulev, Konstantin*; Ferrer, Gonzalo; Vakhitov, Alexander",poster,,,,,,,,, Robust Object Modeling for Visual Tracking,"Cai, Yidong; Liu, Jie*; Tang, Jie; Wu, Gangshan",poster,2308.05140,https://arxiv.org/abs/2308.05140,,https://huggingface.co/papers/2308.05140,,,,4,0 Social Diffusion: Long-term Multiple Human Motion Anticipation,"Tanke, Julian*; Zhang, Linguang; Zhao, Amy; Tang, Chengcheng; Cai, Yujun; Wang, Lezi; WU, PO-CHEN; Gall, Jürgen; Keskin, Cem",poster,,,,,,,,, Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking,"Kang, Ben*; Chen, Xin; Wang, Dong; Peng, Houwen; Lu, Huchuan",poster,2308.06904,https://arxiv.org/abs/2308.06904,,https://huggingface.co/papers/2308.06904,,,,5,0 HMD-NeMo: Online 3D Avatar Motion Generation From Sparse Observations,"Aliakbarian, Sadegh*; Saleh, Fatemeh; Collier, David; Cameron, Pashmina; Cosker, Darren P",poster,,,,,,,,, Learning Fine-Grained Features for Pixel-wise Video Correspondences,"Li, Rui; Zhou, Shenglong; Liu, Dong*",poster,,,,,,,,, GAFlow: Incorporating Gaussian Attention into Optical Flow,"Luo, Ao; Yang, Fan; Li, Xin; Nie, Lang; Lin, Chunyu; Fan, 
Haoqiang; Liu, Shuaicheng*",poster,,,,,,,,, Occ$^2$Net: Robust Image Matching Based on 3D Occupancy Estimation for Occluded Regions,"Fan, Miao*; Chen, Mingrui; Hu, Chen; Zhou, Shuchang",poster,2308.16160,https://arxiv.org/abs/2308.16160,,https://huggingface.co/papers/2308.16160,,,,4,0 Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments,"Lee, Jiye*; Joo, Hanbyul",poster,,,,,,,,, Trajectory Unified Transformer for Pedestrian Trajectory Prediction,"Shi, Liushuai; Wang, Le*; Zhou, Sanping; Hua, Gang",poster,,,,,,,,, TMA: Temporal Motion Aggregation for Event-based Optical Flow,"Liu, Haotian; Chen, Guang; Qu, Sanqing; Zhang, Yanping; Li, Zhijun*; Knoll, Alois C.; Jiang, Changjun",poster,2303.11629,https://arxiv.org/abs/2303.11629,https://github.com/ispc-lab/TMA,https://huggingface.co/papers/2303.11629,,,,7,0 "Taming Contrast Maximization for Learning Sequential, Low-latency, Event-based Optical Flow","Paredes-Valles, Federico*; Scheper, Kirk YW; De Wagter, Christophe; de Croon, Guido",poster,2303.05214,https://arxiv.org/abs/2303.05214,,https://huggingface.co/papers/2303.05214,,,,4,0 GlueStick: Robust Image Matching by Sticking Points and Lines Together,"Pautrat, Rémi*; Suárez, Iago; Yu, Yifan; Pollefeys, Marc; Larsson, Viktor",poster,2304.02008,https://arxiv.org/abs/2304.02008,https://github.com/cvg/GlueStick,https://huggingface.co/papers/2304.02008,,,,5,0 DARTH: Holistic Test-time Adaptation for Multiple Object Tracking,"Segù, Mattia*; Schiele, Bernt; Yu, Fisher",poster,,,,,,,,, S-TREK: Sequential Translation and Rotation Equivariant Keypoints for local features extraction,"Santellani, Emanuele*; Sormann, Christian; Rossi, Mattia; Kuhn, Andreas; Fraundorfer, Friedrich",poster,,,,,,,,, Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation,"Xu, Yuanyou*; Yang, Zongxin; Yang, 
Yi",poster,2308.13266,https://arxiv.org/abs/2308.13266,https://github.com/yoxu515/MITS,https://huggingface.co/papers/2308.13266,,,,3,0 Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes,"Delattre, Fabien*; Dirnfeld, David; Nguyen, Phat T; Scarano, Stephen K; Miraldo, Pedro; Jones, Michael J; Learned-Miller, Erik",poster,,,,,,,,, Sparse Instance Conditioned Multimodal Trajectory Prediction,"Dong, Yonghao; Wang, Le*; Zhou, Sanping; Hua, Gang",poster,,,,,,,,, PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment,"Wang, Jianyuan; Rupprecht, Christian; Novotny, David*",poster,2306.15667,https://arxiv.org/abs/2306.15667,,https://huggingface.co/papers/2306.15667,,,,3,1 3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking,"Ding, Shuxiao*; Rehder, Eike; Schneider, Lukas; Cordts, Marius; Gall, Jürgen",poster,2308.06635,https://arxiv.org/abs/2308.06635,https://github.com/dsx0511/3DMOTFormer,https://huggingface.co/papers/2308.06635,,,,5,0 Fast Inference and Update of Probabilistic Density Estimation on Trajectory Prediction,"Maeda, Takahiro*; Ukita, Norimichi",poster,2308.08824,https://arxiv.org/abs/2308.08824,https://github.com/meaten/FlowChain-ICCV2023,https://huggingface.co/papers/2308.08824,,,,2,1 Supervised Homography Learning with Realistic Dataset Generation,"Jiang, Hai; Li, Haipeng; Han, Songchen; Fan, Haoqiang; Zeng, Bing; Liu, Shuaicheng*",poster,2307.15353,https://arxiv.org/abs/2307.15353,https://github.com/JianghaiSCU/RealSH,https://huggingface.co/papers/2307.15353,,,,6,0 Joint-Relation Transformer for Multi-person Motion Prediction,"Xu, Qingyao*; Mao, Weibo; GONG, JINGZE; Xu, Chenxin; Chen, Siheng; Xie, Weidi; Zhang, Ya; Wang, Yan-Feng",poster,2308.04808,https://arxiv.org/abs/2308.04808,,https://huggingface.co/papers/2308.04808,,,,8,0 Achieving Event-based Temporally Dense Optical Flow Estimation with Sequential Learning,"Ponghiran, Wachirawit*; Liyanagedera , Chamika M; Roy, Kaushik",poster,,,,,,,,, 
Visualizing Subtle Motions from Time-Varying Radiance Fields,"Feng, Brandon Yushan*; Alzayer, Hadi; Rubinstein, Michael; Freeman, William T.; Huang, Jia-Bin",poster,,,,,,,,, Learning Optical Flow from Event Camera with Rendered Dataset,"luo, xinglong; Luo, Kunming; Luo, Ao; Wang, Zhengning; Tan, Ping; Liu, Shuaicheng*",poster,2303.11011,https://arxiv.org/abs/2303.11011,,https://huggingface.co/papers/2303.11011,,,,6,0 Persistent-Transient Duality: A Multi-mechanism Approach for Modeling Human-Object Interaction,"Tran, Hung*; Le, Vuong; Venkatesh, Svetha; Tran, Truyen",poster,2307.12729,https://arxiv.org/abs/2307.12729,,https://huggingface.co/papers/2307.12729,,,,4,0 Deep Homography Mixture for Single Image Rolling Shutter Correction,"Yan, Weilong; Tan, Robby T.; Zeng, Bing; Liu, Shuaicheng*",poster,,,,,,,,, Fast Neural Scene Flow,"Li, Xueqian*; Zheng, Jianqiao; Ferroni, Francesco; Kaesemodel Pontes, Jhony; Lucey, Simon",poster,2304.09121,https://arxiv.org/abs/2304.09121,,https://huggingface.co/papers/2304.09121,,,,5,0 RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End Robust Estimation,"Nie, Chang; Wang, Guangming; Liu, Zhe; Cavalli, Luca; Pollefeys, Marc; Wang, Hesheng*",poster,2308.05318,https://arxiv.org/abs/2308.05318,https://github.com/IRMVLab/RLSAC,https://huggingface.co/papers/2308.05318,,,,6,0 MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking,"Gao, Ruopeng*; Wang, Limin",poster,2307.15700,https://arxiv.org/abs/2307.15700,https://github.com/MCG-NJU/MeMOTR,https://huggingface.co/papers/2307.15700,,,,2,0 MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors,"Xu, Tianxing*; Guo, Yuan-Chen; Lai, Yu-Kun; Zhang, Song-Hai ",poster,2303.05071,https://arxiv.org/abs/2303.05071,,https://huggingface.co/papers/2303.05071,,,,4,0 SportsMOT: A Large Multi-Object Tracking Dataset in Diverse Sports Scenes,"Cui, Yutao; Zeng, Chenkai; Zhao, Xiaoyu; Yang, YiChun; Wu, Gangshan; Wang, Limin*",poster,,,,,,,,, 
Heterogeneous Diversity Driven Active Learning for Multi-Object Tracking,"Li, Rui; Zhang, Baopeng; Liu, Jun; Liu, Wei; Zhao, Jian; Teng, Zhu *",poster,,,,,,,,, TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration,"Gong, Kehong*; Lian, Dongze; Chang, Heng; Guo, Chuan; Jiang, Zi-Hang; Zuo, Xinxin; Bi Mi, Michael; Wang, Xinchao",poster,2304.02419,https://arxiv.org/abs/2304.02419,,https://huggingface.co/papers/2304.02419,,,,7,0 Synchronize Feature Extracting and Matching: A Single Branch Framework for 3D Object Tracking,"Ma, Teli*; Wang, Mengmeng; Xiao, Jimin; Wu, Huifeng; Liu, Yong",poster,2308.12549,https://arxiv.org/abs/2308.12549,,https://huggingface.co/papers/2308.12549,,,,5,0 Collaborative Tracking Learning for Frame-Rate-Insensitive Multi-Object Tracking,"Liu, Yiheng*; Wu, Junta; Fu, Yi",poster,2308.05911,https://arxiv.org/abs/2308.05911,https://github.com/yolomax/ColTrack,https://huggingface.co/papers/2308.05911,,,,3,1 CiteTracker: Correlating Image and Text for Visual Tracking,"Li, Xin*; Huang, Yuqing; He, Zhenyu; Wang, Yaowei; Lu, Huchuan; Yang, Ming-Hsuan",poster,2308.11322,https://arxiv.org/abs/2308.11322,,https://huggingface.co/papers/2308.11322,,,,6,0 SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation,"Athanasiou, Nikos*; Petrovich, Mathis; Black, Michael J.; Varol, Gul",poster,2304.10417,https://arxiv.org/abs/2304.10417,,https://huggingface.co/papers/2304.10417,,,,4,0 Uncertainty-aware Unsupervised Multi-Object Tracking,"Liu, Kai*; Jin, Sheng; Fu, Zhihang; Chen, Ze; Jiang, Rongxin; Ye, Jieping",poster,2307.15409,https://arxiv.org/abs/2307.15409,,https://huggingface.co/papers/2307.15409,,,,6,0 PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework,"Li, Bowen*; Huang, Ziyuan; Ye, Junjie; Li, Yiming; Scherer, Sebastian; Zhao, Hang; Fu, Changhong",poster,2211.11629,https://arxiv.org/abs/2211.11629,,https://huggingface.co/papers/2211.11629,,,,7,0 EigenTrajectory: Low-Rank Descriptors for Multi-Modal 
Trajectory Forecasting,"Bae, Inhwan*; Oh, Jean; Jeon, Hae-Gon",poster,2307.09306,https://arxiv.org/abs/2307.09306,https://github.com/inhwanbae/EigenTrajectory,https://huggingface.co/papers/2307.09306,,,,3,1 RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation,"Wan, Zhexiong; Mao, Yuxin; Zhang, Jing; Dai, Yuchao*",poster,,,,,,,,, Multi-Scale Bidirectional Recurrent Network with Hybrid Correlation for Point Cloud Based Scene Flow Estimation,"CHENG, WENCAN; Ko, Jong Hwan*",poster,,,,,,,,, ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking,"Cheng, Cheng-Che*; Qiu, Min-Xuan; Chiang, Chen-Kuo; Lai, Shang-Hong",poster,2308.13229,https://arxiv.org/abs/2308.13229,,https://huggingface.co/papers/2308.13229,,,,4,1 "TAPIR: Tracking Any Point, Initialized per-frame, Refined temporally","Doersch, Carl*; Yang, Yi; Vecerik, Mel; Gokay, Dilara; Gupta, Ankush; Aytar, Yusuf; Carreira, Joao; Zisserman, Andrew",poster,,,,,,,,, IHNet: Iterative Hierarchical Network Guided by High-Resolution Estimated Information for Scene Flow,"Wang, Yun*; Chi, Cheng; Lin, Min; Yang, Xin",poster,,,,,,,,, Can Language Models Transfer to Social Gesture Motion Generation?,"Ng, Evonne*; Subramanian, Sanjay; Klein, Dan; Kanazawa, Angjoo; Darrell, Trevor; Ginosar, Shiry",poster,,,,,,,,, XVO: Generalized Visual Odometry via Cross-Modal Self-Training,"Lai, Lei*; Shangguan, Zhongkai; Zhang, Jimuyang; Ohn-Bar, Eshed",poster,,,,,,,,, Distracting Downpour: Adversarial Weather Attacks for Motion Estimation,"Schmalfuss, Jenny*; Mehl, Lukas; Bruhn, Andrés",poster,2305.06716,https://arxiv.org/abs/2305.06716,https://github.com/cv-stuttgart/DistractingDownpour,https://huggingface.co/papers/2305.06716,,,,3,0 Foreground-Background Distribution Modeling Transformer for Visual Object Tracking,"Yang, Dawei*; He, Jianfeng; Ma, Yinchao; Yu, Qianjin; Zhang, Tianzhu",poster,,,,,,,,, Weakly-Supervised Action Segmentation and Unseen Error 
Detection in Anomalous Instructional Videos,"Ghoddoosian, Reza*; Dwivedi, Isht; Agarwal, Nakul; Dariush, Behzad",poster,,,,,,,,, Diffusion Action Segmentation,"Liu, Daochang; Li, Qiyue; Dinh, AnhDung; Jiang, Tingting; Shah, Mubarak; Xu, Chang*",poster,2303.17959,https://arxiv.org/abs/2303.17959,,https://huggingface.co/papers/2303.17959,,,,6,0 Audio-Visual Glance Network for Efficient Video Recognition,"Nugroho, Muhammad Adi*; Woo, Sangmin; Lee, Sumin; Kim, Changick",poster,2308.09322,https://arxiv.org/abs/2308.09322,,https://huggingface.co/papers/2308.09322,,,,4,0 Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Localization,"Xia, Kun; Wang, Le*; Zhou, Sanping; Hua, Gang; Tang, Wei",poster,,,,,,,,, Video Action Recognition with Attentive Semantic Units,"Chen, Yifei*; CHEN, Dapeng; Liu, Ruijin; li, hao; Peng, Wei",poster,2303.09756,https://arxiv.org/abs/2303.09756,,https://huggingface.co/papers/2303.09756,,,,5,0 Masked Motion Predictors are Strong 3D Action Representation Learners,"Mao, Yunyao*; Deng, Jiajun; Zhou, Wengang ; Fang, Yao; Ouyang, Wanli; Li, Houqiang",poster,2308.07092,https://arxiv.org/abs/2308.07092,https://github.com/maoyunyao/MAMP,https://huggingface.co/papers/2308.07092,,,,6,0 Boosting Positive Segments for Weakly-Supervised Audio-Visual Video Parsing,"Rachavarapu, Kranthi Kumar*; Ambasamudram, Rajagopalan N",poster,,,,,,,,, Weakly-Supervised Action Localization by Hierarchically-structured Latent Attention Modeling,"Wang, Guiqin; Zhao, Peng; Zhao, Cong; Yang, Shusen; Cheng, Jie; Leng, Luziwei; Liao, Jianxing; Guo, Qinghai*",poster,2308.09946,https://arxiv.org/abs/2308.09946,,https://huggingface.co/papers/2308.09946,,,,8,1 Few-Shot Common Action Localization via Cross-Attentional Fusion of Context and Temporal Dynamics,"Lee, Juntae*; Jain, Mihir; Yun, Sungrack",poster,,,,,,,,, Interaction-aware Joint Attention Estimation Using People Attributes,"Nakatani, Chihiro*; Kawashima, Hiroaki; Ukita, 
Norimichi",poster,2308.05382,https://arxiv.org/abs/2308.05382,,https://huggingface.co/papers/2308.05382,,,,3,0 FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation,"Li, Ronghui*; Zhao, JunFan; Zhang, Yachao; Su, Mingyang; Ren, Zeping; Zhang, Han; Tang, Yansong; Li, Xiu",poster,2212.03741,https://arxiv.org/abs/2212.03741,,https://huggingface.co/papers/2212.03741,,,,8,0 SOAR: Scene-debiasing Open-set Action Recognition,"Zhai, Yuanhao*; Liu, Ziyi; Wu, Zhenyu; Wu, Yi; ZHOU, CHUNLUAN; Doermann, David; Yuan, Junsong; Hua, Gang",poster,,,,,,,,, Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition,"Lee, Jungho*; Lee, Minhyeok; Cho, Suhwan; Woo, Sungmin; Jang, Sungjun; Lee, Sangyoun",poster,2212.04761,https://arxiv.org/abs/2212.04761,,https://huggingface.co/papers/2212.04761,,,,6,0 Cross-Modal Learning with 3D Deformable Attention for Action Recognition,"Kim, Sangwon*; Ahn, Dasom; Ko, Byoung Chul",poster,2212.05638,https://arxiv.org/abs/2212.05638,,https://huggingface.co/papers/2212.05638,,,,3,0 Generative Action Description Prompts for Skeleton-based Action Recognition,"Xiang, Wangmeng*; Li, Chao; Zhou, Yuxuan; wang, biao; Zhang, Lei",poster,,,,,,,,, Self-Feedback DETR for Temporal Action Detection,"Kim, Jihwan*; Lee, Miso; Heo, Jae-Pil",poster,2308.10570,https://arxiv.org/abs/2308.10570,,https://huggingface.co/papers/2308.10570,,,,3,0 Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning,"Li, Zhiheng*; Geng, Wenjia; Li, Muheng; Chen, Lei; Tang, Yansong; Lu, Jiwen; Zhou, Jie",poster,,,,,,,,, The Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation,"Zara, Giacomo*; Conti, Alessandro; Roy, Subhankar; Lathuilière, Stéphane; Rota, Paolo; Ricci, Elisa",poster,2308.09139,https://arxiv.org/abs/2308.09139,,https://huggingface.co/papers/2308.09139,,,,6,1 Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video Anomaly 
Detection,"Flaborea, Alessandro*; Collorone, Luca; D'Amely di Melendugno, Guido Maria; D'Arrigo, Stefano; Prenkaj, Bardh; Galasso, Fabio",poster,2307.07205,https://arxiv.org/abs/2307.07205,,https://huggingface.co/papers/2307.07205,,,,6,1 Video Anomaly Detection via Sequentially Learning Multiple Pretext Tasks,"Shi, Chenrui*; Sun, Che; Jia, Yunde; WU, Yuwei",poster,,,,,,,,, MiniROAD: Minimal RNN Framework for Online Action Detection,"An, Joungbin; Kang, Hyolim; Han, Su Ho; Yang, Ming-Hsuan; Kim, Seon Joo*",poster,,,,,,,,, How Much Temporal Long-Term Context is Needed for Action Segmentation?,"Bahrami, Emad*; Francesca, Gianpiero; Gall, Jürgen",poster,2308.11358,https://arxiv.org/abs/2308.11358,,https://huggingface.co/papers/2308.11358,,,,3,1 DiffTAD: Temporal Action Detection with Conditioned Location Diffusion,"Nag, Sauradip*; Zhu, Xiatian; Deng, Jiankang; Song, Yi-Zhe; Xiang, Tao",poster,,,,,,,,, STEPs: Self-Supervised Key Step Extraction from Unlabeled Procedural Videos,"Shah, Anshul*; Lundell, Benjamin; Sawhney, Harpreet; Chellappa, Rama",poster,2301.00794,https://arxiv.org/abs/2301.00794,,https://huggingface.co/papers/2301.00794,,,,4,0 Efficient Video Action Detection with Token Dropout and Context Refinement,"Chen, Lei*; Tong, Zhan; Song, Yibing; Wu, Gangshan; Wang, Limin",poster,2304.08451,https://arxiv.org/abs/2304.08451,https://github.com/MCG-NJU/EVAD,https://huggingface.co/papers/2304.08451,,,,5,0 FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation,"Guo, Jingwen*; Liu, Hong; Sun, Shitong; Guo, Tianyu; Zhang, Min; Si, Chenyang",poster,2306.11046,https://arxiv.org/abs/2306.11046,,https://huggingface.co/papers/2306.11046,,,,6,0 Exploring Visual Context in Two-Stage Detection of Human–Object Interactions,"Zhang, Frederic Z*; Yuan, Yuhui; Campbell, Dylan; Zhong, Zhuoyao; Gould, Stephen",poster,,,,,,,,, E2E-LOAD: End-to-End Long-form Online Action Detection,"Cao, Shuqiang*; Luo, Weixin; Wang, Bairui; 
Zhang, Wei; Ma, Lin",poster,,,,,,,,, Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach,"Liu, Qinying; Wang, Zilei*; Rong, Shenghai; li, junjie; Zhang, Yixin",poster,,,,,,,,, Hierarchically Decomposed Graph Convolutional Networks for Skeleton-Based Action Recognition,"Lee, Jungho*; Lee, Minhyeok; Lee, Dogyoon; Lee, Sangyoun",poster,2208.10741,https://arxiv.org/abs/2208.10741,,https://huggingface.co/papers/2208.10741,,,,4,0 Tiled Multiplane Images for Practical 3D Photography,"Khan, Numair*; Lanman, Douglas; Xiao, Lei",poster,,,,,,,,, Eulerian Single-Photon Vision,"Gupta, Shantanu*; Gupta, Mohit",poster,,,,,,,,, ProPainter: Improving Video Inpainting with Enhanced Propagation and Efficient Transformer,"Zhou, Shangchen*; Li, Chongyi; Chan, Kelvin C.K.; Loy, Chen Change",poster,,,,,,,,, Autoregressive for Neural Processes,"Tai, Jinyang*",poster,,,,,,,,, DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction,"Liu, Jiaming*; Anirudh, Rushil; J. 
Thiagarajan, Jayaraman; He, Stewart; Mohan, Kadri Aditya; Kamilov, Ulugbek S.; Kim, Hyojin",poster,2211.12340,https://arxiv.org/abs/2211.12340,,https://huggingface.co/papers/2211.12340,,,,7,0 GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild,"Wang, Chao*; Serrano, Ana; Pan, Xingang; Chen, Bin; Myszkowski, Karol ; Seidel, Hans-Peter; Theobalt, Christian; Leimkuehler, Thomas",poster,2211.12352,https://arxiv.org/abs/2211.12352,,https://huggingface.co/papers/2211.12352,,,,8,0 Score-Based Diffusion Models as Principled Priors for Inverse Imaging,"Feng, Berthy T*; Smith, Jamie; Rubinstein, Michael; Chang, Huiwen; Bouman, Katherine; Freeman, William T.",poster,2304.11751,https://arxiv.org/abs/2304.11751,,https://huggingface.co/papers/2304.11751,,,,6,0 NLOS-NeuS: Non-line-of-sight Neural Implicit Surface,"Fujimura, Yuki*; Kushida, Takahiro; Funatomi, Takuya; Mukaigawa, Yasuhiro",poster,,,,,,,,, MEFLUT: Unsupervised 1D Lookup Tables for Multi-exposure Image Fusion,"Jiang, Ting; Wang, Chuan; Li, Xinpeng; Li, Ru; Fan, Haoqiang; Liu, Shuaicheng*",poster,,,,,,,,, Temporal-Coded Spiking Neural Networks with Dynamic Firing Threshold: Learning with Event-Driven Backpropagation,"Wei, Wenjie; Zhang, Malu; Qu, Hong; Belatreche, Ammar; Zhang, Jian; Chen, Hong*",poster,,,,,,,,, Enhancing Non-line-of-sight Imaging via Learnable Inverse Kernel and Attention Mechanisms,"Yu, Yanhua*; Shen, Siyuan; Wang, Zi; Huang, Binbin; Wang, Yuehan; Peng, Xingyue; Xia, Su‘an; Liu, Ping; Li, Ruiqian; Li, Shiying ",poster,,,,,,,,, Aperture Diffraction for Compact Snapshot Spectral Imaging,"Lv, Tao*; Ye, Hao; yuan, quan; shi, zhan; Wang, Yibo; Wang, Shuming; Cao, Xun",poster,,,,,,,,, Content-Aware Local GAN for Photo-Realistic Super-Resolution,"Park, JoonKyu; Son, Sanghyun; Lee, Kyoung Mu*",poster,,,,,,,,, RED-PSM: Regularization by Denoising of Partially Separable Models for Dynamic Imaging,"Iskender, Berk*; Bresler, Yoram; Klasky, Marc L",poster,,,,,,,,, Self-Supervised Burst 
Super-Resolution,"Bhat, Goutam*; Gharbi, Michaël; Chen, Jiawen; Van Gool, Luc; Xia, Zhihao",poster,,,,,,,,, Coherent Event Guided Low-light Video Enhancement,"Liang, Jinxiu S*; Yang, Yixin; Li, Boyu; Duan, Peiqi; Xu, Yong; Shi, Boxin",poster,,,,,,,,, Panoramas from Photons,"Jungerman, Sacha*; Ingle, Atul N; Gupta, Mohit",poster,,,,,,,,, Designing Phase Masks for Under-Display Cameras,"Yang, Anqi*; Kang, Eunhee; Lee, Hyong-Euk; Sankaranarayanan, Aswin",poster,,,,,,,,, Deep Optics for Video Snapshot Compressive Imaging,"Wang, Ping*; Wang, Lishun; Yuan, Xin",poster,,,,,,,,, TiDy-PSFs: Computational Imaging with Time-Averaged Dynamic Point-Spread-Functions,"Shah, Sachin*; Kulshrestha, Sakshum; metzler, christopher",poster,,,,,,,,, Generalized Lightness Adaptation with Channel Selective Normalization,"Yao, Mingde; Huang, Jie; Jin, Xin; Xu, Ruikang; Zhou, Shenglong; zhou, man; Xiong, Zhiwei*",poster,2308.13783,https://arxiv.org/abs/2308.13783,https://github.com/mdyao/CSNorm,https://huggingface.co/papers/2308.13783,,,,7,0 Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction,"Qu, Delin*; Lao, Yizhen; Wang, Zhigang; Wang, Dong; Zhao, Bin; Li, Xuelong",poster,2303.18125,https://arxiv.org/abs/2303.18125,https://github.com/DelinQu/qrsc,https://huggingface.co/papers/2303.18125,,,,6,0 FCCNs: Fully Complex-valued Convolutional Networks using Complex-valued Color Model and Loss Function,"Yadav, Saurabh*; Jerripothula, Koteswar Rao",poster,,,,,,,,, Event Camera Data Pre-training,"Yang, Yan*; Pan, Liyuan; liu, Liu",poster,2301.01928,https://arxiv.org/abs/2301.01928,,https://huggingface.co/papers/2301.01928,,,,3,0 Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models,"Lee, Suhyeon*; Chung, Hyungjin; Park, Min Young; Park, Jonghyeok; Ryu, Wi-Sun; Ye, Jong Chul",poster,2303.08440,https://arxiv.org/abs/2303.08440,,https://huggingface.co/papers/2303.08440,,,,6,1 Multiscale Structure Guided Diffusion for Image Deblurring,"Ren, Mengwei*; 
Delbracio, Mauricio; Talebi, Hossein ; Gerig, Guido; Milanfar, Peyman",poster,2212.01789,https://arxiv.org/abs/2212.01789,,https://huggingface.co/papers/2212.01789,,,,5,0 Generalizing Event-Based Motion Deblurring in Real-World Scenarios,"Zhang, Xiang; Yu, Lei*; Yang, Wen; Liu, Jianzhuang; Xia, Gui-Song",poster,2308.05932,https://arxiv.org/abs/2308.05932,,https://huggingface.co/papers/2308.05932,,,,5,0 On the Robustness of Normalizing Flows for Inverse Problems in Imaging,"Hong, Seongmin; PARK, INBUM; Chun, Se Young*",poster,2212.04319,https://arxiv.org/abs/2212.04319,,https://huggingface.co/papers/2212.04319,,,,3,0 Learned Compressive Representations for Single-Photon 3D Imaging,"Gutierrez-Barragan, Felipe*; Mu, Fangzhou; Ardelean, Andrei; Ingle, Atul N; Bruschini, Claudio; Charbon, Edoardo; Li, Yin; Gupta, Mohit; Velten, Andreas",poster,,,,,,,,, Recovering a Molecule's 3D Dynamics from Liquid-phase Electron Microscopy Movies,"Ye, Enze; Wang, Yuhang; Zhang, Hong; Gao, Yi Qin; Wang, Huan; Sun, He*",poster,2308.11927,https://arxiv.org/abs/2308.11927,,https://huggingface.co/papers/2308.11927,,,,6,0 NIR-assisted Video Enhancement via Unpaired 24-hour Data,"Niu, Muyao; Zhong, Zhihang; Zheng, Yinqiang*",poster,,,,,,,,, SpinCam: High-Speed Imaging via a Rotating Point-Spread Function,"Chan, Dorian Y*; Sheinin, Mark; O'Toole, Matthew",poster,,,,,,,,, RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning,"Liao, Kang; Nie, Lang; Lin, Chunyu*; Zheng, Zishuo; Zhao, Yao",poster,2301.01661,https://arxiv.org/abs/2301.01661,,https://huggingface.co/papers/2301.01661,,,,5,0 Affective Image Filter: Reflecting Emotions from Text to Images,"Weng, Shuchen*; Zhang, Peixuan; Chang, Zheng; Wang, Xinlong; Li, Si; Shi, Boxin",poster,,,,,,,,, Towards General Low-Light Raw Noise Synthesis and Modeling,"Zhang, Feng; Xu, Bin; Li, Zhiqiang; Liu, Xinran; Lu, Qingbo; Gao, Changxin; Sang, 
Nong*",poster,2307.16508,https://arxiv.org/abs/2307.16508,,https://huggingface.co/papers/2307.16508,,,,7,0 Unsupervised Video Deraining with An Event Camera,"Wang, Jin*; Weng, Wenming; Zhang, Yueyi; Xiong, Zhiwei",poster,,,,,,,,, LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference,"Wang, Cong*; Wang, Yu-Ping; Manocha, Dinesh",poster,2307.12217,https://arxiv.org/abs/2307.12217,,https://huggingface.co/papers/2307.12217,,,,3,0 Skill Transformer: A Monolithic Policy for Mobile Manipulation,"Huang, Xiaoyu*; Batra, Dhruv; Rai, Akshara; Szot, Andrew",poster,2308.09873,https://arxiv.org/abs/2308.09873,,https://huggingface.co/papers/2308.09873,,,,4,0 ENTL: Embodied Navigation Trajectory Learner,"Kotar, Klemen; Walsman, Aaron T; Mottaghi, Roozbeh*",poster,2304.02639,https://arxiv.org/abs/2304.02639,,https://huggingface.co/papers/2304.02639,,,,3,0 DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation,"Wang, Hanqing*; Wang, Wenguan; Liang, Wei; Van Gool, Luc",poster,2308.07498,https://arxiv.org/abs/2308.07498,,https://huggingface.co/papers/2308.07498,,,,4,0 Scene Graph Contrastive Learning for Embodied Navigation ,"Singh, Kunal Pratap*; Salvador, Jordi; Weihs, Luca; Kembhavi, Aniruddha",poster,,,,,,,,, Perpetual Humanoid Control for Real-time Simulated Avatars,"Luo, Zhengyi*; Cao, Jinkun; Winkler, Alexander W; Kitani, Kris; Xu, Weipeng",poster,2305.06456,https://arxiv.org/abs/2305.06456,,https://huggingface.co/papers/2305.06456,,,,5,1 Grounding 3D Object Affordance from 2D Interactions in Images,"Yang, Yuhang; Zhai, Wei; Luo, Hongchen; Cao, Yang*; Luo, Jiebo; Zha, Zheng-Jun",poster,2303.10437,https://arxiv.org/abs/2303.10437,https://github.com/yyvhang/IAGNet,https://huggingface.co/papers/2303.10437,,,,6,0 Navigating to Objects Specified by Images,"Krantz, Jacob*; Gervet, Theophile; Yadav, Karmesh; Wang, Austin S; Paxton, Chris; Mottaghi, Roozbeh; Batra, Dhruv; Malik, Jitendra; Lee, Stefan; Chaplot, Devendra 
Singh",poster,2304.01192,https://arxiv.org/abs/2304.01192,,https://huggingface.co/papers/2304.01192,,,,10,1 PEANUT: Predicting and Navigating to Unseen Targets,"Zhai, Albert J*; Wang, Shenlong",poster,2212.02497,https://arxiv.org/abs/2212.02497,,https://huggingface.co/papers/2212.02497,,,,2,0 Context-Aware Planning and Environment-Aware Memory for Instruction Following Embodied Agents,"Kim, Byeonghwi; kim, jinyeon; Kim, yuyeong; Min, Cheolhong; Choi, Jonghyun*",poster,2308.07241,https://arxiv.org/abs/2308.07241,,https://huggingface.co/papers/2308.07241,,,,5,1 Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation,"Wu, Ruihai; Ning, Chuanruo; Dong, Hao*",poster,2303.11057,https://arxiv.org/abs/2303.11057,,https://huggingface.co/papers/2303.11057,,,,3,0 Exploiting Proximity-Aware Tasks for Embodied Social Navigation,"Cancelli, Enrico; Campari, Tommaso; Serafini, Luciano; Chang, Angel X; Ballan, Lamberto*",poster,2212.00767,https://arxiv.org/abs/2212.00767,,https://huggingface.co/papers/2212.00767,,,,5,0 Object-Aware Cognitive Bird’s-Eye-View Grids for Vision-Language Navigation,"Liu, Rui; Wang, Xiaohan; Wang, Wenguan; Yang, Yi*",poster,,,,,,,,, Active Neural Mapping,"Yan, Zike*; Yang, Haoxiang; Zha, Hongbin",poster,2308.16246,https://arxiv.org/abs/2308.16246,,https://huggingface.co/papers/2308.16246,,,,3,0 Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation,"Chen , Jinyu*; Wang, Wenguan; Liu, Si; Li, Hongsheng; Yang, Yi",poster,2308.10306,https://arxiv.org/abs/2308.10306,,https://huggingface.co/papers/2308.10306,,,,5,0 Multi-Object Navigation with dynamically learned neural implicit representations,"Marza, Pierre*; Matignon, Laetitia; Simonin, Olivier; Wolf, Christian",poster,2210.05129,https://arxiv.org/abs/2210.05129,,https://huggingface.co/papers/2210.05129,,,,4,1 Unsupervised Feature Representation Learning for Domain-generalized Cross-domain Image Retrieval,"Hu, Conghui*; Zhang, Can; Lee, Gim 
Hee",poster,,,,,,,,, DeDrift: Robust Similarity Search under Content Drift,"Baranchuk, Dmitry A*; Douze, Matthijs; Upadhyay, Yash; Yalniz, I. Zeki",poster,2308.02752,https://arxiv.org/abs/2308.02752,,https://huggingface.co/papers/2308.02752,,,,4,0 Global Features are All You Need for Image Retrieval and Reranking,"Shao, Shihao*; Chen, Kaifeng; Karpur, Arjun M; Cui, Qinghua; Araujo, Andre; Cao, Bingyi",poster,2308.06954,https://arxiv.org/abs/2308.06954,https://github.com/ShihaoShao-GH/SuperGlobal,https://huggingface.co/papers/2308.06954,,,,6,0 HSE: Hybrid Species Embedding for Deep Metric Learning,"Yang, Bailin*; sun, haoqiang; Li, Frederick W. B.; chen, zheng; cai, jianlu; Song, Chao",poster,,,,,,,,, Discrepant and Multi-instance Proxies for Unsupervised Person Re-identification,"Zou, Chang; Chen, Zeqi; Cui, Zhichao; Liu, Yuehu*; Zhang, Chi",poster,,,,,,,,, Towards Grand Unified Representation Learning for Unsupervised Visible-Infrared Person Re-Identification,"Yang, Bin*; Chen, Jun; Ye, Mang",poster,,,,,,,,, EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition,"Berton, Gabriele*; Trivigno, Gabriele; Masone, Carlo; Caputo, Barbara",poster,2308.10832,https://arxiv.org/abs/2308.10832,https://github.com/ShihaoShao-GH/SuperGlobal,https://huggingface.co/papers/2308.10832,,,,4,0 Simple Baselines for Interactive Video Retrieval with Questions and Answers,"Liang, Kaiqu*; Albanie, Samuel",poster,2308.10402,https://arxiv.org/abs/2308.10402,,https://huggingface.co/papers/2308.10402,,,,2,0 Fan-Beam Binarization Difference Projection (FB-BDP): A Novel Local Object Descriptor for Fine-Grained Leaf Image Retrieval,"CHEN, XIN; Wang, Bin*; Gao, Yongsheng",poster,,,,,,,,, Conditional Cross Attention Network for Multi-Space Embedding without Entanglement in Only a SINGLE Network,"Song, Chull Hwan*; Hwang, Taebaek; Yoon, Jooyoung; Choi, Shunghyun; Gu, Yeonghyeon",poster,2307.13254,https://arxiv.org/abs/2307.13254,,https://huggingface.co/papers/2307.13254,,,,5,0 
Learning Concordant Attention via Target-aware Alignment for Visible-Infrared Person Re-identification,"Wu, Jianbing*; Liu, Hong; Su, Yuxin; Shi, Wei; Tang, Hao",poster,,,,,,,,, Person Re-Identification without Identification via Event anonymization,"Ahmad, Shafiq*; Morerio, Pietro; Del Bue, Alessio",poster,2308.04402,https://arxiv.org/abs/2308.04402,,https://huggingface.co/papers/2308.04402,,,,3,0 Divide&Classify: Fine-Grained Classification for City-Wide Visual Geo-localization,"Trivigno, Gabriele*; Berton, Gabriele; Masone, Carlo; Aragon, Juan M; Caputo, Barbara",poster,,,,,,,,, Dark Side Augmentation: Generating Diverse Night Examples for Metric Learning,"Mohwald, Albert; Jenicek, Tomas*; Chum, Ondrej",poster,,,,,,,,, PIDRo: Parallel Isomeric Attention with Dynamic Routing for Text-Video Retrieval,"Guan, Peiyan*; Pei, Renjing; Shao, Bin; Liu, Jianzhuang; Li, Weimian; Gu, Jiaxi; Xu, Hang; Xu, Songcen; Yan, Youliang; Lam, Edmund",poster,,,,,,,,, Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification,"Shao, Zhiyin; Zhang, Xinyu; Ding, Changxing*; Wang, Jian; Wang, Jingdong",poster,,,,,,,,, Modality Unifying Network for Visible Infrared Person Re-Identification,"Yu, Hao; Cheng, Xu*; Peng, Wei; Liu, Weihao; Zhao, Guoying",poster,,,,,,,,, DeepChange: A Long-Term Person Re-Identification Benchmark with Clothes Change,"Xu, Peng*; Zhu, Xiatian",poster,,,,,,,,, LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval,"Luo, Ziyang*; Zhao, Pu; Xu, Can; Geng, Xiubo; Shen, Tao; Tao, Chongyang; Ma, Jing; Lin, Qingwei; Jiang, Daxin",poster,,,,,,,,, Dual Pseudo-Labels Interactive Self-Training for Semi-Supervised Visible-Infrared Person Re-Identification,"Shi, Jiangming; Zhang, Yachao; Yin, Xiangbo; Xie, Yuan; Zhang, Zhizhong; Fan, Jianping; shi, zhongchao; Qu, Yanyun*",poster,,,,,,,,, $BT^2$: Backward-compatible Training with Basis Transformation,"Zhou, Yifei*; Li, Zilu; Shrivastava, Abhinav; Zhao, 
Hengshuang; Torralba, Antonio; Tian, Tai-Peng; Lim, Ser-Nam",poster,2211.03989,https://arxiv.org/abs/2211.03989,,https://huggingface.co/papers/2211.03989,,,,7,1 Prototypical Mixup and Retrieval-based Refinement for Label Noise-resistant Image Retrieval,"Yang, Xinlong*; Wang, Haixin; Sun, Jinan; Zhang, Shikun; Chen, Chong; Hua, Xian-Sheng; Luo, Xiao",poster,,,,,,,,, Learning Spatial-context-aware Global Visual Feature Representation for Instance Image Retrieval,"Zhang, Zhongyan*; Wang, Lei; Zhou, Luping; Koniusz, Piotr",poster,,,,,,,,, Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval,"zhu, yunquan*; Gao, Xinkai; Ke, Bo; Qiao, Ruizhi; Sun, Xing",poster,,,,,,,,, Visible-Infrared Person Re-Identification via Semantic Alignment and Affinity Inference,"Fang, Xingye; Yang, Yang; Fu, Ying*",poster,,,,,,,,, Part-Aware Transformer for Generalizable Person Re-identification,"Ni, Hao; Li, Yuke; Gao, Lianli; Shen, Heng Tao; Song, Jingkuan*",poster,2308.03322,https://arxiv.org/abs/2308.03322,https://github.com/liyuke65535/Part-Aware-Transformer,https://huggingface.co/papers/2308.03322,,,,4,0 Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge for Generic Image Representations,"Ypsilantis, Nikolaos-Antonios*; Chen, Kaifeng; Cao, Bingyi; Lipovský, Mário; Dogan-Schonberger, Pelin; Makosa, Grzegorz; Bluntschli, Boris; Seyedhosseini, Mojtaba; Araujo, Andre; Chum, Ondrej",poster,,,,,,,,, Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval,"Dong, Jianfeng*; Zhang, Minsong; zhang, zheng; Chen, Xianke; Liu, Daizong; Qu, Xiaoye; Liu, Baolong; Wang, Xun",poster,,,,,,,,, Fine-grained Unsupervised Domain Adaptation for Gait Recognition,"Ma, Kang; Fu, Ying*; Zheng, Dezhi; Peng, Yunjie; Cao, Chunshui; Huang, Yongzhen",poster,,,,,,,,, FashionNTM: Multi-turn Fashion Image Retrieval via Cascaded Memory,"Pal, Anwesan*; Wadhwa, Sahil; Jaiswal, Ayush; Zhang, Xu; Wu, Yue; Chada, Rakesh; Natarajan, 
Pradeep; Christensen, Henrik I",poster,2308.10170,https://arxiv.org/abs/2308.10170,,https://huggingface.co/papers/2308.10170,,,,8,1 CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition ,"Guan, Tianrui*; Muthuselvam, Aswath; Hoover, Montana; Wang, Xijun; Liang, Jing; Sathyamoorthy, Adarsh Jagan ; Conover, Damon; Manocha, Dinesh",poster,2303.17778,https://arxiv.org/abs/2303.17778,,https://huggingface.co/papers/2303.17778,,,,8,1 ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition,"Zhou, Yixuan; Qu, Yi; Xu, Xing*; Shen, Heng Tao",poster,2308.07815,https://arxiv.org/abs/2308.07815,https://github.com/cool-xuan/Imbalanced_SAM,https://huggingface.co/papers/2308.07815,,,,4,0 LFS-GAN: Lifelong Few-Shot Image Generation,"Seo, Juwon*; Kang, Jisu; Park, Gyeong-Moon",poster,,,,,,,,, Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection,"Liu, Yuyang*; Cong, Yang; Goswami, Dipam; Liu, Xialei; van de Weijer, Joost",poster,2307.12427,https://arxiv.org/abs/2307.12427,,https://huggingface.co/papers/2307.12427,,,,5,0 Contrastive Model Adaptation for Cross-Condition Robustness in Semantic Segmentation,"Brüggemann, David*; Sakaridis, Christos; Broedermann, Tim; Van Gool, Luc",poster,2303.05194,https://arxiv.org/abs/2303.05194,https://github.com/brdav/cma,https://huggingface.co/papers/2303.05194,,,,4,0 Towards Effective Instance Discrimination Contrastive Loss for Unsupervised Domain Adaptation,"Zhang, Yixin; Wang, Zilei*; li, junjie; Zhuang, Jiafan; Lin, Zihan",poster,,,,,,,,, Adversarial Bayesian Augmentation for Single-Source Domain Generalization,"Cheng, Sheng*; Gokhale, Tejas; Yang, Yezhou",poster,2307.09520,https://arxiv.org/abs/2307.09520,,https://huggingface.co/papers/2307.09520,,,,3,1 Measuring Asymmetric Gradient Discrepancy in Parallel Continual Learning,"Lyu, Fan; Sun, Qing; Shang, Fanhua; Wan, Liang; Feng, Wei*",poster,,,,,,,,, CSDA: Learning Category-Scale Joint Feature for Domain Adaptive Object 
Detection,"Gao, Changlong; Liu, Chengxu*; Dun, Yujie; Qian, Xueming",poster,,,,,,,,, Distilling from Similar Tasks for Transfer Learning on a Budget,"Borup, Kenneth*; Phoo, Cheng Perng; Hariharan, Bharath",poster,2304.12314,https://arxiv.org/abs/2304.12314,,https://huggingface.co/papers/2304.12314,,,,3,1 Complementary Domain Adaptation and Generalization for Unsupervised Continual Domain Shift Learning,"Cho, Wonguk*; Park, Jinha; Kim, Taesup",poster,2303.15833,https://arxiv.org/abs/2303.15833,,https://huggingface.co/papers/2303.15833,,,,3,0 Camera-driven Representation Learning for Unsupervised Domain Adaptive Person Re-identification,"Lee, Geon; Lee, Sanghoon; KIM, DOHYUNG; Shin, Younghoon; Yoon, Yongsang; Ham, Bumsub*",poster,2308.11901,https://arxiv.org/abs/2308.11901,,https://huggingface.co/papers/2308.11901,,,,6,0 Introducing Language Guidance in Prompt-based Continual Learning,"Khan, Muhammad Gul Zain Ali*; Naeem, Muhammad Ferjad; Van Gool, Luc; Stricker, Didier; Tombari, Federico; Afzal, Muhammad Zeshan",poster,2308.15827,https://arxiv.org/abs/2308.15827,,https://huggingface.co/papers/2308.15827,,,,6,0 Fast and Accurate Transferability Measurement by Evaluating Intra-class Feature Variance,"Xu, Huiwen; Kang, U*",poster,2308.05986,https://arxiv.org/abs/2308.05986,,https://huggingface.co/papers/2308.05986,,,,2,0 A Unified Continual Learning Framework with General Parameter-Efficient Tuning,"Gao, Qiankun; Zhao, Chen; Sun, Yifan; Xi, Teng; zhang, gang; Ghanem, Bernard; Zhang, Jian*",poster,2303.10070,https://arxiv.org/abs/2303.10070,https://github.com/gqk/LAE,https://huggingface.co/papers/2303.10070,,,,7,0 SFHarmony: Source Free Domain Adaptation for Distributed Neuroimaging Analysis,"Dinsdale, Nicola K*; Jenkinson, Mark ; Namburete, Ana Ineyda L",poster,2303.15965,https://arxiv.org/abs/2303.15965,https://github.com/nkdinsdale/SFHarmony,https://huggingface.co/papers/2303.15965,,,,3,0 Towards Realistic Evaluation of Industrial Continual Learning Scenarios with an 
emphasis on Energy Consumption and Computational Footprint,"Chavan, Vivek*; Koch, Paul; Schlüter, Marian; Briese, Clemens",poster,,,,,,,,, Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation,"Wang, Kaihong; Kim, Donghyun*; Feris, Rogerio; Betke, Margrit",poster,2211.14703,https://arxiv.org/abs/2211.14703,,https://huggingface.co/papers/2211.14703,,,,5,0 PC-Adapter: Topology-Aware Adapter for Efficient Domain Adaption on Point Clouds with Rectified pseudo-label,"Park, Joonhyung*; Seo, Hyunjin; Yang, Eunho",poster,,,,,,,,, DETA: Denoised Task Adaptation for Few-Shot Learning,"Zhang, Ji*; Gao, Lianli; Luo, Xu; Shen, Heng Tao; Song, Jingkuan",poster,2303.06315,https://arxiv.org/abs/2303.06315,https://github.com/nobody-1617/DETA,https://huggingface.co/papers/2303.06315,,,,5,0 Activate and Reject: Towards Safe Domain Generalization under Label Shift,"Chen, Chaoqi*; Tang, Luyao; Tao, Leitian; Zhou, Hong-Yu; Huang, Yue; Han, Xiaoguang; Yu, Yizhou",poster,,,,,,,,, Generalizable Decision Boundaries: Dualistic Meta-Learning for Open Set Domain Generalization,"Wang, Xiran; Zhang, Jian; Qi, Lei; Shi, Yinghuan*",poster,2308.09391,https://arxiv.org/abs/2308.09391,https://github.com/zzwdx/MEDIC,https://huggingface.co/papers/2308.09391,,,,4,0 Continual Zero-Shot Learning through Semantically Guided Generative Random Walks,"Zhang, Wenxuan*; Janson, Paul; Yi, Kai; Skorokhodov, Ivan; Elhoseiny, Mohamed",poster,2308.12366,https://arxiv.org/abs/2308.12366,https://github.com/wx-zhang/IGCZSL,https://huggingface.co/papers/2308.12366,,,,5,0 Zero-Shot Point Cloud Segmentation by Semantic-Visual Aware Synthesis,"Yang, Yuwei; Hayat, Munawar; Jin, Zhao; Zhu, Hongyuan; Lei, Yinjie *",poster,,,,,,,,, MDCS: More Diverse Experts with Consistency Self-distillation for Long-tailed Recognition,"Zhao, Qihao; Jiang, Chen; Hu, Wei; Zhang, Fan*; Liu, 
Jun",poster,2308.09922,https://arxiv.org/abs/2308.09922,https://github.com/fistyee/MDCS,https://huggingface.co/papers/2308.09922,,,,5,0 Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach,"K B, Vimal*; Bachu, Saketh; Garg, Tanmay; N Balasubramanian, Vineeth; Lakshmi Narasimhan, Niveditha; Konuru, Raghavan",poster,,,,,,,,, Confidence-based Visual Dispersal for Few-shot Unsupervised Domain Adaptation,"Xiong, Yizhe*; Chen, Hui; Lin, Zijia; Zhao, Sicheng; ding, guiguang",poster,,,,,,,,, BEV-DG: Cross-Modal Learning under Bird's-Eye View for Domain Generalization of 3D Semantic Segmentation,"Li, Miaoyu*; Zhang, Yachao; Ma, Xu; Qu, Yanyun; FU, YUN",poster,,,,,,,,, CDFSL-V: Cross-Domain Few-Shot Learning for Videos,"Samarasinghe, Sarinda D*; Rizve, Mamshad Nayeem; Kardan, Navid; Shah, Mubarak",poster,,,,,,,,, Energy-based Self-Training and Normalization for Unsupervised Domain Adaptation,"Herath, Samitha*; Fernando, Basura; Abbasnejad, Ehsan M; Hayat, Munawar; Khadivi, Shahram; Harandi, Mehrtash; Rezatofighi, Hamid; Haffari, Reza",poster,,,,,,,,, Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models,"Zheng, Kecheng*; Wu, Wei; Feng, Ruili; Zhu, Kai; Liu, Jiawei; Zhao, Deli; Zha, Zheng-Jun; Chen, Wei; Shen, Yujun",poster,2307.15049,https://arxiv.org/abs/2307.15049,,https://huggingface.co/papers/2307.15049,,,,9,0 NAPA-VQ: Neighborhood-Aware Prototype Augmentation with Vector Quantization for Continual Learning,"Malepathirana, Tamasha A*; Senanayake, Damith A; Halgamuge, Saman",poster,,,,,,,,, A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance,"Huang, Zeyi*; Zhou, Andy; Ling, Zijian; Cai, Mu; Wang, Haohan; Lee, Yong Jae",poster,,,,,,,,, ViM: Vision Middleware for Unified Downstream Transferring,"Feng, Yutong*; Gong, Biao; Jiang, Jianwen; Lv, Yiliang; Shen, Yujun; Zhao, Deli; Zhou, 
Jingren",poster,2303.06911,https://arxiv.org/abs/2303.06911,,https://huggingface.co/papers/2303.06911,,,,7,0 Learning to Learn: How to Continuously Teach Humans and Machines,"Singh, Parantak*; Li, You; Sikarwar, Ankur; Lei, Stan Weixian; Gao, Difei; Talbot , Morgan B; Sun, Ying; Shou, Mike Zheng; Kreiman, Gabriel; Zhang, Mengmi",poster,2211.15470,https://arxiv.org/abs/2211.15470,,https://huggingface.co/papers/2211.15470,,,,10,0 A Good Student is Cooperative and Reliable: CNN-Transformer Collaborative Learning for Semantic Segmentation,"ZHU, Jinjing*; Luo, Yunhao; Zheng, Xu; Wang, Hao; Wang, Lin ",poster,2307.12574,https://arxiv.org/abs/2307.12574,,https://huggingface.co/papers/2307.12574,,,,5,1 Online Class Incremental Learning on Stochastic Blurry Task Boundary via Mask and Visual Prompt Tuning,"Moon, Jun Yeong*; Park, KeonHee; Kim, Jung Uk; Park, Gyeong-Moon",poster,2308.09303,https://arxiv.org/abs/2308.09303,,https://huggingface.co/papers/2308.09303,,,,4,0 Heterogeneous Forgetting Compensation for Class-Incremental Learning,"Dong, Jiahua*; Cong, Yang; liang, wenqi; Sun, Gan",poster,2308.03374,https://arxiv.org/abs/2308.03374,https://github.com/JiahuaDong/HFC,https://huggingface.co/papers/2308.03374,,,,4,0 Disposable Transfer Learning for Selective Source Task Unlearning,"Koh, Seunghee*; Shon, Hyounguk; Lee, Janghyeon; Hong, Hyeong Gwon; Kim, Junmo",poster,2308.09971,https://arxiv.org/abs/2308.09971,,https://huggingface.co/papers/2308.09971,,,,5,0 Online Continual Learning on Hierarchical Label Expansion,"Lee, Byung Hyun; Jung, Okchul; Choi, Jonghyun; Chun, Se Young*",poster,2308.14374,https://arxiv.org/abs/2308.14374,,https://huggingface.co/papers/2308.14374,,,,4,0 Black-box Unsupervised Domain Adaptation with Bi-directional Atkinson-Shiffrin Memory,"Zhang, Jingyi; Huang, Jiaxing; Jiang, Xueying; Lu, Shijian*",poster,2308.13236,https://arxiv.org/abs/2308.13236,,https://huggingface.co/papers/2308.13236,,,,4,0 Local and Global Logit Adjustments for Long-Tailed 
Learning,"Tao, Yingfan*; sun, jingna; Yang, Hao; Chen, Li; Wang, Xu; Yang, Wenming; Du, Daniel Kang; Zheng, Min ",poster,,,,,,,,, FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training,"Bulat, Adrian*; Guerrero , Ricardo; Martinez, Brais; Tzimiropoulos, Georgios",poster,,,,,,,,, Tuning Pre-trained Model via Moment Probing,"Gao, Mingze*; Wang, Qilong; Lin, Zhenyi; Zhu, Pengfei; Hu, Qinghua; zhou, jingbo",poster,2307.11342,https://arxiv.org/abs/2307.11342,,https://huggingface.co/papers/2307.11342,,,,6,0 Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models,"Höllein, Lukas*; Cao, Ang ; Owens, Andrew; Johnson , Justin; Niessner, Matthias",oral,2303.11989,https://arxiv.org/abs/2303.11989,,https://huggingface.co/papers/2303.11989,,,,5,0 LivePose: Online 3D Reconstruction from Monocular Video with Dynamic Camera Poses,"Stier, Noah; Angles, Baptiste; Yang, Liang*; yan, yajie; Colburn, Alex; Chuang, Ming",oral,2304.00054,https://arxiv.org/abs/2304.00054,,https://huggingface.co/papers/2304.00054,,,,6,0 NDDepth: Normal-Distance Assisted Monocular Depth Estimation,"Shao, Shuwei*; pei, zhongcai; Chen, Weihai; Wu, Xingming; Li, Zhengguo",oral,,,,,,,,, LATR: 3D Lane Detection from Monocular Images with Transformer,"Luo, Yueru; Zheng, Chaoda; Yan, Xu; Tang, Kun; zheng, chao; Cui, Shuguang; Li, Zhen*",oral,2308.04583,https://arxiv.org/abs/2308.04583,https://github.com/JMoonr/LATR,https://huggingface.co/papers/2308.04583,,,,7,0 DriveAdapter: Breaking the Coupling Barrier of Perception and Planning in End-to-End Autonomous Driving,"Jia, Xiaosong*; Gao, Yulu; Chen, Li; Yan, Junchi; Liu, Langechuan; Li, Hongyang",oral,2308.00398,https://arxiv.org/abs/2308.00398,,https://huggingface.co/papers/2308.00398,,,,6,0 Dynamic Point Fields,"Prokudin, Sergey*; Ma, Qianli; Raafat, Maxime; Valentin, Julien; Tang, Siyu",oral,2304.02626,https://arxiv.org/abs/2304.02626,,https://huggingface.co/papers/2304.02626,,,,5,0 Generalizing Neural Human Fitting to Unseen 
Pose With Articulated E(3) Equivariance,"Feng, Haiwen*; Kulits, Peter; Liu, Shichen; Black, Michael J.; Fernandez Abrevaya, Victoria",oral,,,,,,,,, Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views,"Zhang, Siwei*; Ma, Qianli; Zhang, Yan; Aliakbarian, Sadegh; Cosker, Darren P; Tang, Siyu",oral,2304.06024,https://arxiv.org/abs/2304.06024,,https://huggingface.co/papers/2304.06024,,,,6,0 DECO: Dense Estimation of 3D Human-Scene Contact In The Wild ,"Tripathi, Shashank*; Chatterjee, Agniv; Passy, Jean-Claude; Yi, Hongwei; Tzionas, Dimitrios; Black, Michael J.",oral,,,,,,,,, Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image,"Ren, Pengfei*; Wen, Chao; Zheng, Xiaozheng; Xue, Zhou; Sun, Haifeng; Qi, Qi; Wang, Jingyu; Liao, Jianxin",oral,2302.02410,https://arxiv.org/abs/2302.02410,,https://huggingface.co/papers/2302.02410,,,,8,0 Chasing clouds: Differentiable volumetric rendering of point clouds as a highly efficient and accurate loss for large-scale deformable 3D registration,"Heinrich, Mattias Paul*; Bigalke, Alexander; Großbröhmer, Christoph; Hansen, Lasse",oral,,,,,,,,, Rehearsal-Free Domain Continual Face Anti-Spoofing: Generalize More and Forget Less,"Cai, Rizhao*; Cui, Yawen; Li, Zhi; Yu, Zitong; Li, Haoliang; Hu, Yongjian; Kot, Alex",oral,,,,,,,,, A 5-Point Minimal Solver for Event Camera Relative Motion Estimation,"Gao, Ling*; Su, Hang; Gehrig, Daniel; Cannici, Marco; Scaramuzza, Davide; Kneip, Laurent",oral,,,,,,,,, General Planar Motion from a 3D point pair,"Dibene Simental, Juan Carlos*; Min, Zhixiang; Dunn, Enrique",oral,,,,,,,,, Beyond the Pixel: a Photometrically Calibrated HDR Dataset for Luminance and Color Temperature Prediction,"Bolduc, Christophe; Giroux, Justine; Marc, Hébert; Demers, Claude MH; Lalonde, Jean-Francois*",oral,,,,,,,,, DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion,"Zhao, Zixiang*; Bai, Haowen; Zhu, Yuanzhi; Zhang, Jiangshe; Xu, Shuang; Zhang, Yulun; 
Zhang, Kai; Meng, Deyu; Timofte, Radu; Van Gool, Luc",oral,2303.06840,https://arxiv.org/abs/2303.06840,https://github.com/Zhaozixiang1228/MMIF-DDFM,https://huggingface.co/papers/2303.06840,,,,10,0 Iterative Prompt Learning for Unsupervised Backlit Image Enhancement,"Liang, Zhexin*; Li, Chongyi; Zhou, Shangchen; Feng, Ruicheng; Loy, Chen Change",oral,2303.17569,https://arxiv.org/abs/2303.17569,,https://huggingface.co/papers/2303.17569,,,,5,0 Similarity Min-Max: Zero-Shot Day-Night Domain Adaptation,"Luo, Rundong*; Wang, Wenjing; Yang, Wenhan; Liu, Jiaying",oral,,,,,,,,, Multi-interactive Feature Learning and a Full-time Multi-modality Benchmark for Image Fusion and Segmentation,"Liu, Jinyuan; Liu, Zhu; Wu, Guanyao; Ma, Long; Liu, Risheng; Zhong, Wei; Luo, Zhongxuan; Fan, Xin*",oral,2308.02097,https://arxiv.org/abs/2308.02097,https://github.com/JinyuanLiu-CV/SegMiF,https://huggingface.co/papers/2308.02097,,,,8,0 Computational 3D Imaging with Position Sensors,"Klotz, Jeremy*; Gupta, Mohit; Sankaranarayanan, Aswin",oral,,,,,,,,, Passive Ultra-Wideband Single-Photon Imaging,"Wei, Mian*; Nousias, Sotiris; Gulve, Rahul; Lindell, David B; Kutulakos, Kiriakos N",oral,,,,,,,,, Viewing Graph Solvability in Practice,"Arrigoni, Federica*; Pajdla, Tomas; Fusiello, Andrea",oral,,,,,,,,, Minimal Solutions to Generalized Three-View Relative Pose Problem,"Ding, Yaqing; Chien, Chiang-Heng*; Larsson, Viktor; Åström, Karl; Kimia, Benjamin",oral,,,,,,,,, SoDaCam: Software-defined Cameras via Single-Photon Imaging,"Sundar, Varun*; Ardelean, Andrei; Swedish, Tristan; Bruschini, Claudio; Charbon, Edoardo; Gupta, Mohit",oral,,,,,,,,, Hierarchical Contrastive Learning for Pattern-Generalizable Image Corruption Detection,"Feng, Xin; Xu, Yifeng; Lu, Guangming; Pei, Wenjie*",poster,2308.14061,https://arxiv.org/abs/2308.14061,https://github.com/xyfJASON/HCL,https://huggingface.co/papers/2308.14061,,,,4,0 DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image 
Restoration,"Yuchun, Miao*; Zhang, Lefei; Zhang, Liangpei; Tao, Dacheng",poster,2303.06682,https://arxiv.org/abs/2303.06682,,https://huggingface.co/papers/2303.06682,,,,4,0 From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal,"Guo, Yun; Xiao, Xueyao; Chang, Yi*; Deng, ShuMin; Yan, Luxin",poster,2308.03867,https://arxiv.org/abs/2308.03867,,https://huggingface.co/papers/2308.03867,,,,5,0 VAPCNet: Viewpoint-Aware 3D Point Cloud Completion,"Fu, Zhiheng*; Wang, Longguang; Xu, Lian; Guo, Yulan; Laga, Hamid ; Wang, Zhiyong; Boussaid, Farid; Bennamoun, Mohammed",poster,,,,,,,,, AccFlow: Backward Accumulation for Long-Range Optical Flow,"Wu, Guangyang*; Wang, Wenyi; Luo, Kunming; Liu, Xi; Zheng, Qingqing; Liu, Shuaicheng; Jiang, Xinyang; Zhai, Guangtao; Liu, Xiaohong",poster,2308.13133,https://arxiv.org/abs/2308.13133,https://github.com/mulns/AccFlow,https://huggingface.co/papers/2308.13133,,,,9,0 Improving Transformer-based Image Matching by Cascaded Capturing Spatially Informative Keypoints,"cao, chenjie; Fu, Yanwei*",poster,2303.02885,https://arxiv.org/abs/2303.02885,,https://huggingface.co/papers/2303.02885,,,,2,0 Low-Light Image Enhancement with Multi-stage Residue Quantization and Brightness-aware Attention,"Liu, Yunlong; Huang, Tao; Dong, Weisheng*; Wu, Fangfang; Li, Xin; Shi, Guangming",poster,,,,,,,,, Random Sub-Samples Generation for Self-Supervised Real Image Denoising,"Pan, Yizhong; Liu, Xiao; Liao, Xiangyu; Cao, Yuanzhouhan; Ren, Chao*",poster,2307.16825,https://arxiv.org/abs/2307.16825,https://github.com/p1y2z3/SDAP,https://huggingface.co/papers/2307.16825,,,,5,0 RSFNet: A White-Box Image Retouching Approach using Region-Specific Color Filters,"Ouyang, Wenqi*; Dong, Yi; REN, PEIRAN; Kang, Xiaoyang; xu, xin; Xie, Xuansong",poster,2303.08682,https://arxiv.org/abs/2303.08682,,https://huggingface.co/papers/2303.08682,,,,6,0 Physics-Driven Turbulence Image Restoration with Stochastic Refinement,"Jaiswal, Ajay*; Zhang, 
Xingguang; Chan, Stanley; Wang, Zhangyang",poster,2307.10603,https://arxiv.org/abs/2307.10603,https://github.com/VITA-Group/PiRN,https://huggingface.co/papers/2307.10603,,,,4,0 SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device,"yi, ziyao*; Gou, Weiran; Xiang, yan; Li, Shaoqing; Liu, Zibin; Dehui, Kong; Ke, Xu",poster,2308.08137,https://arxiv.org/abs/2308.08137,,https://huggingface.co/papers/2308.08137,,,,7,0 Self-supervised Image Denoising with Downsampled Invariance Loss and Conditional Blind-Spot Network,"Jang, Yeong Il*; Lee, Keuntek; PARK, GU YONG; Kim, Seyun; Cho, Nam Ik",poster,2304.09507,https://arxiv.org/abs/2304.09507,,https://huggingface.co/papers/2304.09507,,,,5,0 Variational Degeneration to Structural Refinement: A Unified Framework for Superimposed Image Decomposition,"Li, Wenyu; Xu, Yan; Yang, Yang*; Ji, Haoran; Lang, Yue",poster,,,,,,,,, Reconstructed Convolution Module Based Look-Up Tables for Efficient Image Super-Resolution,"Liu, Guandu; Ding, Yukang; Li, Mading; Sun, Ming; Wen, Xing; Wang, Bin*",poster,2307.08544,https://arxiv.org/abs/2307.08544,https://github.com/liuguandu/RC-LUT,https://huggingface.co/papers/2307.08544,,,,6,0 Self-supervised Pre-training for Mirror Detection,"Lin, Jiaying*; Lau, Rynson W.H.",poster,,,,,,,,, Downscaled Representation Matters: Improving Image Rescaling with Collaborative Downscaled Images,"Xu, Bingna*; Guo, Yong; Jiang, Luoqian; Yu, MianJie; Chen, Jian",poster,2211.10643,https://arxiv.org/abs/2211.10643,,https://huggingface.co/papers/2211.10643,,,,5,0 "Self-supervised Monocular Underwater Depth Recovery, Image Restoration, and a Real-sea Video Dataset","Varghese, Nisha*; Kumar, Ashish; Ambasamudram, Rajagopalan N",poster,,,,,,,,, Rethinking Video Frame Interpolation from Shutter Mode Induced Degradation,"Ji, Xiang; Wang, Zhixiang; Zhong, Zhihang; Zheng, Yinqiang*",poster,,,,,,,,, Single Image Deblurring with Row-dependent Blur Magnitude,"Ji, 
Xiang; Wang, Zhixiang; Satoh, Shin'ichi; Zheng, Yinqiang*",poster,,,,,,,,, Multi-view Self-supervised Disentanglement for General Image Denoising,"Chen, Hao*; Qu, Chenyuan; Chen, Chen; Zhang, Yu; Jiao, Jianbo",poster,,,,,,,,, Joint Demosaicing and Deghosting of Time-Varying Exposures for Snapshot HDR Imaging,"Kim, Jungwoo; Kim, Min H.*",poster,,,,,,,,, Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model,"Yi, Xunpeng; Xu, Han; Zhang, Hao; Tang, Linfeng; Ma, Jiayi*",poster,,,,,,,,, Dual Aggregation Transformer for Image Super-Resolution,"Chen, Zheng; Zhang, Yulun*; Gu, Jinjin; Kong, Linghe; Yang, Xiaokang; Yu, Fisher",poster,2308.03364,https://arxiv.org/abs/2308.03364,https://github.com/zhengchen1999/DAT,https://huggingface.co/papers/2308.03364,,,,6,0 Video Object Segmentation-aware Video Frame Interpolation,"Yoo, Junsang; Lee, Hongjae; Jung, Seung-Won*",poster,,,,,,,,, RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image,"Zou, Yunhao; Yan, Chenggang; Fu, Ying*",poster,,,,,,,,, Multi-scale Residual Low-Pass Filter Network for Image Deblurring,"Dong, Jiangxin*; Pan, Jinshan; Yang, Zhongbao; Tang, Jinhui",poster,,,,,,,,, Indoor Depth Recovery Based on Deep Unfolding with Non-Local Prior,"Dai, Yuhui*; Fang, Faming; zhang, junkang; Zhang, Guixu",poster,,,,,,,,, Learning Correction Filter via Degradation-Adaptive Regression for Blind Single Image Super-Resolution,"Zhou, Hongyang*; Zhu, Xiaobin; Zhu, Jianqing; Han, Zheng; Zhang, Shi-Xue; Qin, Jingyan; Yin, Xu-Cheng",poster,,,,,,,,, Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution,"Liang, Zhengyu*; Wang, Yingqian; Wang, Longguang; Yang, Jungang; Zhou, Shilin; Guo, Yulan",poster,2302.08058,https://arxiv.org/abs/2302.08058,https://github.com/ZhengyuLiang24/EPIT,https://huggingface.co/papers/2302.08058,,,,6,0 Both Diverse and Realism Matter: Physical Attribute and Style Alignment for Rainy Image Generation,"Yu, Changfeng; Chen, Shiming; 
Chang, Yi*; Song, Yibing; Yan, Luxin",poster,,,,,,,,, Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-spectral Image Fusion,"zhou, man*; Huang, Jie; Zheng, Naishan; Li, Chongyi",poster,2308.16083,https://arxiv.org/abs/2308.16083,,https://huggingface.co/papers/2308.16083,,,,4,0 The Devil is in the Upsampling: Architecture Decisions Made Simpler for Denoising with Deep Image Prior,"Liu, Yilin; Li, Jiang; Pang, Yunkui; Nie, Dong; Yap, Pew-Thian*",poster,,,,,,,,, SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning,"Feng, Hao; Wang, Wendi; Deng, Jiajun; Zhou, Wengang ; Li, Li*; Li, Houqiang",poster,2308.09040,https://arxiv.org/abs/2308.09040,,https://huggingface.co/papers/2308.09040,,,,6,0 Exploring Temporal Frequency Spectrum in Deep Video Deblurring,"Zhu, Qi; zhou, man; Zheng, Naishan; Li, Chongyi; Huang, Jie; Zhao, Feng*",poster,,,,,,,,, ExposureDiffusion: Learning to Expose for Low-light Image Enhancement,"Wang, Yufei*; Yu, Yi; Yang, Wenhan; Guo, Lanqing; Chau, Lap-Pui; Kot, Alex; Wen, Bihan",poster,2307.07710,https://arxiv.org/abs/2307.07710,,https://huggingface.co/papers/2307.07710,,,,7,0 High-resolution Document Shadow Removal via A Large-scale Real-world Dataset and A Frequency-aware Shadow Erasing Net,"Chen, Xuhang*; Cun, Xiaodong; Li, Zinuo; Pun, Chi-Man",poster,2308.14221,https://arxiv.org/abs/2308.14221,https://github.com/CXH-Research/DocShadow-SD7K,https://huggingface.co/papers/2308.14221,,,,4,0 Towards Saner Deep Image Registration,"Duan, Bin*; Zhong, Ming; Yan, Yan",poster,2307.09696,https://arxiv.org/abs/2307.09696,https://github.com/tuffr5/Saner-deep-registration,https://huggingface.co/papers/2307.09696,,,,3,0 VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation,"Shi, Xiaoyu*; Huang, Zhaoyang; BIAN, Weikang; Li, dasong; Zhang, Manyuan; Cheung, Ka Chun; See, Simon; Qin, Hongwei; Dai, Jifeng; Li, 
Hongsheng",poster,2303.08340,https://arxiv.org/abs/2303.08340,https://github.com/XiaoyuShi97/VideoFlow,https://huggingface.co/papers/2303.08340,,,,10,0 Scene Matters: Model-based Deep Video Compression,"Tang, Lv*; zhang, xinfeng; Zhang, Gai; Ma, xiaoqi",poster,2303.04557,https://arxiv.org/abs/2303.04557,,https://huggingface.co/papers/2303.04557,,,,4,0 Non-Coaxial Event-guided Motion Deblurring with Spatial Alignment,"Cho, Hoonhee*; Jeong, Yuhwan; Kim, Taewoo; Yoon, Kuk-Jin",poster,,,,,,,,, Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement,"Cai, Yuanhao*; Bian, Hao; Lin, Jing; Wang, Haoqian; Timofte, Radu; Zhang, Yulun",poster,2303.06705,https://arxiv.org/abs/2303.06705,https://github.com/caiyuanhao1998/Retinexformer,https://huggingface.co/papers/2303.06705,,,,6,0 Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution,"Li, Ao; Zhang, Le*; Liu, Yun; Zhu, Ce",poster,2308.05022,https://arxiv.org/abs/2308.05022,https://github.com/AVC2-UESTC/CRAFT-SR.git,https://huggingface.co/papers/2308.05022,,,,4,0 MVPSNet: Fast Generalizable Multi-view Photometric Stereo,"Zhao, Dongxu*; Lichy, Daniel J; Perrin, Pierre-Nicolas A; Frahm, Jan-Michael; Sengupta, Soumyadip",poster,2305.11167,https://arxiv.org/abs/2305.11167,,https://huggingface.co/papers/2305.11167,,,,5,0 FSI: Frequency and Spatial Interactive Learning for Image Restoration in Under-Display Cameras,"Liu, Chengxu*; Wang, Xuan; li, shuai; Wang, Yuzhi; Qian, Xueming",poster,,,,,,,,, Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution,"Zhao, Zixiang*; Zhang, Jiangshe; Gu, Xiang; Tan, Chengli; Xu, Shuang; Zhang, Yulun; Timofte, Radu; Van Gool, Luc",poster,2303.08942,https://arxiv.org/abs/2303.08942,https://github.com/Zhaozixiang1228/GDSR-SSDNet,https://huggingface.co/papers/2303.08942,,,,8,0 Empowering Low-Light Image Enhancer through Customized Learnable Priors,"Zheng, Naishan; Dong, Yanmeng; Rui, 
Xiangyu; Huang, Jie; Li, Chongyi; zhou, man; Zhao, Feng*",poster,,,,,,,,, Learning Image Harmonization in the Linear Color Space,"Xu, Ke*; Hancke, Gerhard P.; Lau, Rynson W.H.",poster,,,,,,,,, Under-Display Camera Image Restoration with Scattering Effect,"SONG, Binbin; Chen, Xiangyu; Xu, Shuning; Zhou, Jiantao*",poster,2308.04163,https://arxiv.org/abs/2308.04163,https://github.com/NamecantbeNULL/SRUDC,https://huggingface.co/papers/2308.04163,,,,4,0 Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution,"wang, jiamian*; Wang, Huan; Zhang, Yulun; FU, YUN; Tao, Zhiqiang",poster,2303.09650,https://arxiv.org/abs/2303.09650,https://github.com/Jiamian-Wang/Iterative-Soft-Shrinkage-SR,https://huggingface.co/papers/2303.09650,,,,5,0 Single Image Defocus Deblurring via Inverse Kernel Modeling and Prediction,"Quan, Yuhui*; Yao, Xin; Ji, Hui",poster,,,,,,,,, Degradation-Resistant Unfolding Network for Heterogeneous Image Fusion,"He, Chunming*; Li, Kai; Xu, Guoxia; Zhang, Yulun; Hu, Runze; Guo, Zhenhua; Li, Xiu",poster,,,,,,,,, Graphics2RAW: Mapping Computer Graphics Images to Sensor RAW Images,"Seo, Donghwan Ian*; Punnappurath, Abhijith; Zhao, Luxi; Abdelhamed, Abdelrahman; Tedla, SaiKiran K; Park, Sang Uk; Choe, Jihwan; Brown, Michael S",poster,,,,,,,,, Lighting up NeRF via Unsupervised Decomposition and Enhancement,"Wang, Haoyuan*; XU, Xiaogang; Xu, Ke; Lau, Rynson W.H.",poster,2307.10664,https://arxiv.org/abs/2307.10664,,https://huggingface.co/papers/2307.10664,,,,4,0 Unsupervised Image Denoising in Real-World Scenarios via Self-Collaboration Parallel Generative Adversarial Branches,"Lin, Xin; Ren, Chao*; Liu, Xiao; Huang, Jie; Lei, Yinjie ",poster,2308.06776,https://arxiv.org/abs/2308.06776,,https://huggingface.co/papers/2308.06776,,,,5,0 AWRCP: Reinventing Adverse Weather Removal with Codebook Priors,"Ye, Tian*; Bai, Jinbin; Liu, Yun; Chen, Erkang; Chen, Sixiang; Junjie, Yin; Jun, Shi; Jiang, JingXia; Xue, Chenghao",poster,,,,,,,,, MSRA-SR: Image 
Super-resolution Transformer with Multi-scale Shared Representation Acquisition,"Zhou, Xiaoqiang*; Huang, Huaibo; Wang, Zilei; Hu, Jie; He, Ran; Tan, Tieniu",poster,,,,,,,,, Deep Video Demoiréing via Compact Invertible Dyadic Decomposition,"Quan, Yuhui; Haoran, Huang; He, Shengfeng; Xu, Ruotao*",poster,,,,,,,,, SILT: Shadow-aware Iterative Label Tuning for Learning to Detect Shadows from Noisy Labels,"Yang, Han; Wang, Tianyu; Hu, Xiaowei*; Fu, Chi-Wing",poster,2308.12064,https://arxiv.org/abs/2308.12064,,https://huggingface.co/papers/2308.12064,,,,4,0 Innovating Real Fisheye Image Correction with Dual Diffusion Architecture,"Yang, Shangrong*; Lin, Chunyu; Liao, Kang; Zhao, Yao",poster,,,,,,,,, Adaptive Illumination Mapping for Shadow Detection in Raw Images,"Sun, Jiayu*; Xu, Ke; Pang, Youwei; Zhang, Lihe; Lu, Huchuan; Hancke, Gerhard P.; Lau, Rynson W.H.",poster,,,,,,,,, GEDepth: Ground Embedding for Monocular Depth Estimation,"Yang, Xiaodong*; Ma, Zhuang; Ji, Zhiyu; Ren, Zhe",poster,,,,,,,,, Efficient Image Super-Resolution with Superpixel Token Interaction,"Zhang, Aiping*; Ren, Wenqi; Liu, Yi; Cao, Xiaochun",poster,,,,,,,,, Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging,"zheng, siming*; Yuan, Xin",poster,2306.11316,https://arxiv.org/abs/2306.11316,,https://huggingface.co/papers/2306.11316,,,,2,0 Efficient Unified Demosaicing for Bayer and Non-Bayer Patterned Image Sensors,"Lee, Haechang; Jeong, Wongi; Park, Dongwon; Kim, Kijeong; Je, Hyunwoo; Ryu, Dongil; Chun, Se Young*",poster,2307.10667,https://arxiv.org/abs/2307.10667,,https://huggingface.co/papers/2307.10667,,,,7,0 LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction,"Chung, Haesoo*; Cho, Nam Ik",poster,,,,,,,,, Fine-grained Visible Watermark Removal,"Niu, Li*; Zhao, Xing; Zhang, Bo; Zhang, Liqing",poster,,,,,,,,, SRFormer: Permuted Self-Attention for Single Image Super-Resolution,"Zhou, 
Yupeng; Li, Zhen; Guo, Chun-Le; Bai, Song; Cheng, Ming-Ming; Hou, Qibin*",poster,2303.09735,https://arxiv.org/abs/2303.09735,,https://huggingface.co/papers/2303.09735,,,,6,0 DLGSANet: Lightweight Dynamic Local and Global Self-Attention Networks for Image Super-Resolution,"LI, Xiang; Pan, Jinshan*; Dong, Jiangxin; Tang, Jinhui",poster,2301.02031,https://arxiv.org/abs/2301.02031,,https://huggingface.co/papers/2301.02031,,,,4,0 MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing,"yuwei, qiu*; Zhang, Kaihao; wang, chenxi; Luo, Wenhan; LI, HONGDONG; Jin, Zhi",poster,,,,,,,,, Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution,"Li, Fei*; Zhang, Linfeng; Liu, Zikun; Lei, Juan; Li, Zhenbo",poster,,,,,,,,, COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability,"Park, Jongmin*; Lee, Jooyoung; Kim, Munchurl",poster,,,,,,,,, Alignment-free HDR Deghosting with Semantics Consistent Transformer,"Tel, Steven*; WU, Zongwei; Zhang, Yulun; Heyrman, Barth; Demonceaux, Cedric; Timofte, Radu; Ginhac, Dominique",poster,2305.18135,https://arxiv.org/abs/2305.18135,,https://huggingface.co/papers/2305.18135,,,,7,0 From Chaos Comes Order: Ordering Event Representations for Object Detection,"Zubic, Nikola*; Gehrig, Daniel; Gehrig, Mathias; Scaramuzza, Davide",poster,,,,,,,,, Towards High-quality Specular Highlight Removal by Leveraging Large-scale Synthetic Data,"Fu, Gang*; Zhang, Qing; Zhu, Lei; Xiao, Chunxia; Li, Ping",poster,,,,,,,,, DynamicISP: Dynamically Controlled Image Signal Processor for Image Recognition,"Yoshimura, Masakazu*; Otsuka, Junji; Irie, Atsushi; Ohashi, Takeshi",poster,2211.01146,https://arxiv.org/abs/2211.01146,,https://huggingface.co/papers/2211.01146,,,,4,0 Dancing in the Dark: A Benchmark towards General Low-light Video Enhancement,"Fu, Huiyuan*; Zheng, Wenkai; Wang, Xicong; Wang, Jiaxuan; Zhang, Heng; Ma, Huadong",poster,,,,,,,,, Dec-Adapter: 
Exploring Efficient Decoder-side Adapter for Cross Domain Image Compression,"Shen, Sheng*; Yue, Huanjing; Yang, Jingyu",poster,,,,,,,,, OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution,"Cao, Zidong*; Ai, Hao; Cao, Yan-Pei; Shan, Ying; Qie, Xiaohu; Wang, Lin ",poster,2308.08114,https://arxiv.org/abs/2308.08114,,https://huggingface.co/papers/2308.08114,,,,6,0 Pyramid Dual Domain Injection Network for Pan-sharpening,"He, Xuanhua*; Yan, Keyu; Li, Rui; Xie, Chengjun; zhang, jie; zhou, man",poster,,,,,,,,, Implicit Neural Representation for Cooperative Low-light Image Enhancement,"Yang, Shuzhou; Ding, Moxuan; Wu, Yanmin; Li, Zihan; Zhang, Jian*",poster,2303.11722,https://arxiv.org/abs/2303.11722,https://github.com/Ysz2022/NeRCo,https://huggingface.co/papers/2303.11722,,,,5,0 Physically-plausible illumination distribution estimation,"Ershov, Egor; Tesalin, Vasily; Ermakov, Ivan A*; Brown, Michael S",poster,,,,,,,,, Score Priors Guided Deep Variational Inference for Unsupervised Real-World Single Image Denoising,"Cheng, Jun; Liu, Tao; Tan, Shan*",poster,2308.04682,https://arxiv.org/abs/2308.04682,,https://huggingface.co/papers/2308.04682,,,,3,0 Semantic-Aware Dynamic Parameter for Video Inpainting Transformer,"Lee, Eunhye; Yoo, Jinsu; Yang, Yunjeong; Baik, Sungyong; Kim, Tae Hyun*",poster,,,,,,,,, Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction,"Li, Miaoyu; Fu, Ying*; Liu, Ji; Zhang, Yulun",poster,2308.10820,https://arxiv.org/abs/2308.10820,https://github.com/MyuLi/PADUT,https://huggingface.co/papers/2308.10820,,,,4,0 Improving Lens Flare Removal with General-Purpose Pipeline and Multiple Light,"Yuyan, Zhou; Liang, Dong*; Chen, Songcan; Huang, Sheng-Jun; Yang, Shuo; Li, Chongyi",poster,,,,,,,,, RFD-ECNet: Extreme Underwater Image Compression with Reference to Feature Dictionary,"Li, Mengyao; Shen, Liquan*; Ye, Peng; Feng, Guorui; Wang, Zheyin",poster,,,,,,,,, Learning Continuous Exposure Value Representations for 
Single-Image HDR Reconstruction,"Chen, Sykai*; Yen, Hung-Lin; Liu, Yu-Lun; Chen, Min-Hung; Hu, Hou-Ning; Peng, Wen-Hsiao; Lin, Yen-Yu",poster,,,,,,,,, Focal Network for Image Restoration,"Cui, Yuning*; Ren, Wenqi; Cao, Xiaochun; Knoll, Alois C.",poster,,,,,,,,, CIRI: Curricular Inactivation for Residue-aware One-shot Video Inpainting,"Zheng, Weiying; Xu, Cheng; Xu, Xuemiao; Liu, Wenxi; He, Shengfeng*",poster,,,,,,,,, Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image Composition,"Liu, Xiaoyu; Liu, Ming*; Li, Junyi; Liu, Shuai; Xiaotao, Wang; LEI, LEI; Zuo, Wangmeng",poster,,,,,,,,, MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces,"Yin, Zhicun; Liu, Ming*; Li, Xiaoming; Yang, Hui; Xiao, Longan; Zuo, Wangmeng",poster,,,,,,,,, Boundary-Aware Divide and Conquer: A Diffusion-based Solution for Unsupervised Shadow Removal,"Guo, Lanqing*; Wang, Chong; Yang, Wenhan; Wang, Yufei; Wen, Bihan",poster,,,,,,,,, Leveraging Inpainting for Single-Image Shadow Removal,"Li, Xiaoguang*; Guo, Qing; Abdelfattah, Rabab; Lin, Di; Feng, Wei; Tsang, Ivor; Wang, Song",poster,2302.05361,https://arxiv.org/abs/2302.05361,,https://huggingface.co/papers/2302.05361,,,,7,0 Hybrid Spectral Denoising Transformer with Guided Attention,"Lai, Zeqiang; Yan, Chenggang; Fu, Ying*",poster,2303.09040,https://arxiv.org/abs/2303.09040,,https://huggingface.co/papers/2303.09040,,,,3,0 Examining Autoexposure for Challenging Scenes,"Tedla, SaiKiran K*; Yang, Beixuan; Brown, Michael S",poster,,,,,,,,, Self-supervised Learning to Bring Dual Reversed Rolling Shutter Images Alive,"Shang, Wei; Ren, Dongwei*; feng, chaoyu; Xiaotao, Wang; LEI, LEI; Zuo, Wangmeng",poster,2305.19862,https://arxiv.org/abs/2305.19862,https://github.com/shangwei5/SelfDRSC,https://huggingface.co/papers/2305.19862,,,,6,0 DiffIR: Efficient Diffusion Model for Image Restoration,"xia, bin; Zhang, Yulun; Wang, Shiyin; Wang, Yitong; Xinglong, Wu; Tian, Yapeng; Yang, Wenming*; Van Gool, 
Luc",poster,2303.09472,https://arxiv.org/abs/2303.09472,https://github.com/Zj-BinXia/DiffIR,https://huggingface.co/papers/2303.09472,,,,8,0 Sparse Sampling Transformer with Uncertainty-Driven Ranking for Unified Removal of Raindrops and Rain Streaks,"Chen, Sixiang*; Ye, Tian; Bai, Jinbin; Chen, Erkang; Jun, Shi; Zhu, Lei",poster,2308.14153,https://arxiv.org/abs/2308.14153,,https://huggingface.co/papers/2308.14153,,,,6,0 LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution,"Zhang, Lin; Li, Xin; He, Dongliang; Li, Fu; Ding, Errui; Zhang, Zhaoxiang*",poster,2303.04970,https://arxiv.org/abs/2303.04970,,https://huggingface.co/papers/2303.04970,,,,5,0 Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network,"Wang, Yinglong*; Liu, Zhen; Liu, Jianzhuang; Xu, Songcen; Liu, Shuaicheng",poster,2308.08220,https://arxiv.org/abs/2308.08220,,https://huggingface.co/papers/2308.08220,,,,5,0 Single Image Reflection Separation via Component Synergy,"Hu, Qiming; Guo, Xiaojie*",poster,2308.10027,https://arxiv.org/abs/2308.10027,https://github.com/mingcv/DSRNet,https://huggingface.co/papers/2308.10027,,,,2,0 Learning Rain Location Prior for Nighttime Deraining,"Zhang, Fan; Li, Yu; You, Shaodi; Fu, Ying*",poster,,,,,,,,, Exploring Positional Characteristics of Dual-Pixel Data for Camera Autofocus,"Choi, Myungsub; Lee, Hana; Lee, Hyong-Euk*",poster,,,,,,,,, Continuously Masked Transformer for Image Inpainting,"Ko, Keunsoo*; Kim, Chang-Su",poster,,,,,,,,, Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution,"Tuo, Zixi; Yang, Huan*; Fu, Jianlong; Dun, Yujie; Qian, Xueming",poster,2303.09826,https://arxiv.org/abs/2303.09826,,https://huggingface.co/papers/2303.09826,,,,5,1 Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution,"Sun, Long*; Dong, Jiangxin; Tang, Jinhui; Pan, 
Jinshan",poster,2302.13800,https://arxiv.org/abs/2302.13800,https://github.com/sunny2109/SAFMN,https://huggingface.co/papers/2302.13800,,,,4,0 Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation,"Yang, Yijun*; Aviles-Rivero, Angelica I; Liu, Ye; Fu, Huazhu; Wang, Weiming; Zhu, Lei",poster,,,,,,,,, Snow Removal in Video: A New Dataset and A Novel Method,"Chen, Haoyu*; Ren, Jingjing; Gu, Jinjin; Wu, Hongtao; Lu, Xuequan; CAI, Haoming; Zhu, Lei",poster,,,,,,,,, Boosting Single Image Super-Resolution via Partial Channel Shifting,"Zhang, XiaoMing*; Li, Tianrui; Zhao, Xiaole",poster,,,,,,,,, Towards Real-World Burst Image Super-Resolution: Benchmark and Method,"Wei, Pengxu*; Sun, Yujing; Guo, Xingbei; Liu, Chang; Li, Guanbin; Chen, Jie; Ji, Xiangyang; Lin, Liang",poster,,,,,,,,, On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement,"Luo, Xin*; Zhu, Yunan; Xu, Shunxin; Liu, Dong",poster,2307.12027,https://arxiv.org/abs/2307.12027,,https://huggingface.co/papers/2307.12027,,,,4,0 ENeRF: Event-enhanced Neural Radiance Fields from Blurry Images,"Qi, Yunshan*; Zhu, Lin; Zhang, Yu; Li, Jia",poster,,,,,,,,, Iterative Denoiser and Noise Estimator for Self-supervised Image Denoising,"Zou, Yunhao; Yan, Chenggang; Fu, Ying*",poster,,,,,,,,, Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising,"Jin, Xin; Xiao, Jia-wen Schuyler; Han, Ling-Hao; Guo, Chun-Le*; Zhang, Ruixun; Liu, Xialei; Li, Chongyi",poster,2308.03448,https://arxiv.org/abs/2308.03448,https://github.com/Srameo/LED,https://huggingface.co/papers/2308.03448,,,,7,0 Fingerprinting Deep Image Restoration Models,"Quan, Yuhui; Teng, Huan; Xu, Ruotao*; Huang, Jun; Ji, Hui",poster,,,,,,,,, Environment-Invariant Curriculum Relation Learning for Fine-Grained Scene Graph Generation,"Min, Yukuan; Wu, Aming; Deng, Cheng*",poster,2308.03282,https://arxiv.org/abs/2308.03282,,https://huggingface.co/papers/2308.03282,,,,3,1 
DCPB: Deformable Convolution based on the Poincare Ball for Top-view Fisheye Cameras,"Wei, Xuan; Ran, Zhidan; Lu, Xiaobo*",poster,,,,,,,,, FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs,"Tu, Peng*; Xie, Xu; AI, GUO; Li, Yuexiang; Huang, Yawen; Zheng, Yefeng",poster,2301.06719,https://arxiv.org/abs/2301.06719,,https://huggingface.co/papers/2301.06719,,,,6,0 Curvature-Aware Training for Coordinate Networks,"Saratchandran, Hemanth*; Chng, Shin-Fang; Ramasinghe, Sameera; MacDonald, Lachlan; Lucey, Simon",poster,2305.08552,https://arxiv.org/abs/2305.08552,,https://huggingface.co/papers/2305.08552,,,,5,0 "Yes, we CANN: Constrained Approximate Nearest Neighbors for local feature-based visual localization","Aiger, Dror*; Araujo, Andre; Lynen, Simon",poster,2306.09012,https://arxiv.org/abs/2306.09012,,https://huggingface.co/papers/2306.09012,,,,3,0 Unleashing the Potential of Spiking Neural Networks by Dynamic Confidence,"Li, Chen*; Jones, Edward G; Furber, Steve",poster,2303.10276,https://arxiv.org/abs/2303.10276,,https://huggingface.co/papers/2303.10276,,,,3,0 Minimal Solutions to Uncalibrated Two-view Geometry with Known Epipoles,"Nakano, Gaku*",poster,,,,,,,,, FBLNet: FeedBack Loop Network for Driver Attention Prediction,"Nan, Zhixiong*; chen, yilong; Xiang, Tao",poster,2212.02096,https://arxiv.org/abs/2212.02096,,https://huggingface.co/papers/2212.02096,,,,3,0 Deep Feature Deblurring Diffusion for Detecting Out-of-Distribution Objects,"Wu, Aming*; Chen, Da; Deng, Cheng",poster,,,,,,,,, Long-range Multimodal Pretraining for Movie Understanding,"Argaw, Dawit Mureja*; Caba, Fabian; Lee, Joon-Young; Woodson, Markus; Kweon, In So",poster,2308.09775,https://arxiv.org/abs/2308.09775,,https://huggingface.co/papers/2308.09775,,,,5,0 Cross-view Semantic Alignment for Livestreaming Product Recognition,"Yang, Wenjie; Chen, Yiyi; Li, Yan; Cheng, Yanhua; Liu, Xudong; Chen, Quan*; Li, 
Han",poster,2308.04912,https://arxiv.org/abs/2308.04912,https://github.com/adxcreative/RICE,https://huggingface.co/papers/2308.04912,,,,7,0 HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation,"Han, Mingfei; Wang, Yali; Li, Zhihui; Yao, Lina; Chang, Xiaojun*; Qiao, Yu",poster,,,,,,,,, DyGait: Exploiting Dynamic Representations for High-performance Gait Recognition,"Wang, Ming; Guo, Xianda; Lin, Beibei; YANG, TIAN; Zhu, Zheng; Li, Lincheng; Zhang, Shunli*; Yu, Xin",poster,2303.14953,https://arxiv.org/abs/2303.14953,,https://huggingface.co/papers/2303.14953,,,,8,0 Identity-Consistent Aggregation for Video Object Detection,"Deng, Chaorui*; Chen, Da; Wu, Qi",poster,2308.07737,https://arxiv.org/abs/2308.07737,,https://huggingface.co/papers/2308.07737,,,,3,0 Augmenting and Aligning Snippets for Few-Shot Video Domain Adaptation,"Xu, Yuecong*; Yang, Jianfei; Zhou, Yunjiao; Wu, Min; Li, Xiaoli; Chen, Zhenghua",poster,2303.10451,https://arxiv.org/abs/2303.10451,,https://huggingface.co/papers/2303.10451,,,,6,0 Action Sensitivity Learning for Temporal Action Localization,"Shao, Jiayi*; Wang, Xiaohan; Quan, Ruijie; Zheng, Junjun; Yang, Jiang; Yang, Yi",poster,2305.15701,https://arxiv.org/abs/2305.15701,,https://huggingface.co/papers/2305.15701,,,,6,0 SwinLSTM: Improving Spatiotemporal Prediction Accuracy using Swin Transformer and LSTM,"TANG, RONGNIAN; Tang, Song*; Zhang, Pu; Li, Chuang",poster,,,,,,,,, LVOS: A Benchmark for Long-term Video Object Segmentation,"Hong, Lingyi*; chen, wenchao; Liu, Zhongying; Zhang, Wei; Guo, Pinxue; Chen, Zhaoyu; Zhang, Wenqiang",poster,2211.10181,https://arxiv.org/abs/2211.10181,,https://huggingface.co/papers/2211.10181,,,,7,0 MGMAE: Motion Guided Masking for Video Masked Autoencoding,"Huang, Bingkun; Zhao, Zhiyu; Zhang, Guozhen; Qiao, Yu; Wang, Limin*",poster,2308.10794,https://arxiv.org/abs/2308.10794,,https://huggingface.co/papers/2308.10794,,,,5,0 Markov Game Video Augmentation for Action 
Segmentation,"Aziere, Nicolas*; Todorovic, Sinisa",poster,,,,,,,,, COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec,"Ladune, Théo*; Philippe, Pierrick; Henry, Felix E; clare, gordon; Leguay, Thomas",poster,,,,,,,,, ReGen: A good Generative zero-shot video classifier should be Rewarded,"Bulat, Adrian*; Sanchez, Enrique; Martinez, Brais; Tzimiropoulos, Georgios",poster,,,,,,,,, Task Agnostic Restoration of Natural Video Dynamics,"Ali, Muhammad Kashif; Kim, Dongjin; Kim, Tae Hyun*",poster,2206.03753,https://arxiv.org/abs/2206.03753,https://github.com/MKashifAli/TARONVD,https://huggingface.co/papers/2206.03753,,,,3,0 Normalizing Flows for Human Pose Anomaly Detection,"Hirschorn, Or*; Avidan, Shai",poster,2211.10946,https://arxiv.org/abs/2211.10946,,https://huggingface.co/papers/2211.10946,,,,2,1 Movement Enhancement toward Multi-Scale Video Feature Representation for Temporal Action Detection,"Zhao, Zixuan; Wang, Dongqi; Zhao, Xu*",poster,,,,,,,,, Event-Guided Procedure Planning from Instructional Videos with Text Supervision,"Wang, An-Lan; Lin, Kun-Yu; Du, Jia-Run; Meng, Jingke; ZHENG, WEI-SHI*",poster,2308.08885,https://arxiv.org/abs/2308.08885,,https://huggingface.co/papers/2308.08885,,,,5,0 SCANet: Scene Complexity Aware Network for Weakly-Supervised Video Moment Retrieval,"Yoon, Sunjae*; Koo, GwanHyeong; Kim, DaHyun; Yoo, Chang D.",poster,,,,,,,,, Spatio-temporal Prompting Network for Robust Video Feature Extraction,"Sun, Guanxiong; Wang, Chi; Zhang, Zhaoyu; Deng, Jiankang; Zafeiriou, Stefanos; Hua, Yang*",poster,,,,,,,,, TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection,"Fioresi, Joseph G*; Dave, Ishan Rajendrakumar; Shah, Mubarak",poster,,,,,,,,, Non-Semantics Suppressed Mask Learning for Unsupervised Video Semantic Compression,"Tian, Yuan*; Lu, Guo; Zhai, Guangtao; Gao, Zhiyong",poster,,,,,,,,, UnLoc: A Unified Framework for Video Localization Tasks,"Xiong, Xuehan; Yan, Shen*; Nagrani, 
Arsha; Arnab, Anurag; Wang, Zhonghao; Ge, Weina; Ross, David A; Schmid, Cordelia",poster,2308.11062,https://arxiv.org/abs/2308.11062,https://github.com/google-research/scenic,https://huggingface.co/papers/2308.11062,,,,8,0 SkeleTR: Towards Skeleton-based Action Recognition in the Wild ,"Duan, Haodong; Xu, Mingze; Shuai, Bing; Modolo, Davide; Tu, Zhuowen; Tighe, Joseph; Bergamo, Alessandro*",poster,,,,,,,,, "AutoAD II: The Sequel – Who, When, and What in Movie Audio Description","Han, Tengda*; Bain, Max; Nagrani, Arsha; Varol, Gul; Xie, Weidi; Zisserman, Andrew",poster,,,,,,,,, What can a cook in Italy teach a mechanic in India? Action Recognition Generalisation Over Scenarios and Locations,"Plizzari, Chiara*; Perrett, Toby; Caputo, Barbara; Damen, Dima",poster,2306.08713,https://arxiv.org/abs/2306.08713,,https://huggingface.co/papers/2306.08713,,,,4,0 Localizing Moments in Long Video Via Multimodal Guidance,"Barrios, Wayner J*; Soldan, Mattia; Caba, Fabian; Ceballos-Arroyo, Alberto Mario; Ghanem, Bernard",poster,2302.13372,https://arxiv.org/abs/2302.13372,,https://huggingface.co/papers/2302.13372,,,,5,0 LAC - Latent Action Composition for Skeleton-based Action Segmentation,"Yang, Di*; Wang, Yaohui; Dantcheva, Antitza ; Kong, Quan; Garattoni, Lorenzo; Francesca, Gianpiero; Bremond, Francois",poster,,,,,,,,, RIGID: Recurrent GAN Inversion and Editing of Real Face Videos,"Xu, Yangyang*; He, Shengfeng; Wong, Kwan-Yee K.; Luo, Ping",poster,2308.06097,https://arxiv.org/abs/2308.06097,,https://huggingface.co/papers/2308.06097,,,,4,0 Uncertainty-aware State Space Transformer for Egocentric 3D Trajectory Forecasting,"Bao, Wentao*; Chen, Lele; Zeng, Libing; Li, Zhong; Xu, Yi; Yuan, Junsong; Kong, Yu",poster,,,,,,,,, What Can Simple Arithmetic Operations Do for Temporal Modeling?,"Wu, Wenhao*; Song, Yuxin; Sun, Zhun; Wang, Jingdong; Xu, Chang; Ouyang, 
Wanli",poster,2307.08908,https://arxiv.org/abs/2307.08908,https://github.com/whwu95/ATM,https://huggingface.co/papers/2307.08908,,,,6,0 UATVR: Uncertainty-Adaptive Text-Video Retrieval,"Fang, Bo*; Wu, Wenhao; Liu, Chang; Zhou, Yu; Song, Yuxin; Wang, Weiping; Shu, Xiangbo; Ji, Xiangyang; Wang, Jingdong",poster,2301.06309,https://arxiv.org/abs/2301.06309,https://github.com/bofang98/UATVR,https://huggingface.co/papers/2301.06309,,,,9,0 D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation,"Li, Hanjun*; shu, Xiujun; He, Sunan; Qiao, Ruizhi; Wen, Wei; Guo, Taian; Gan, Bei; Sun, Xing",poster,,,,,,,,, Unsupervised Open-Vocabulary Object Localization in Videos,"Bai, Zechen; Fan, Ke; He, Tong*; Xiao, Tianjun; Shou, Mike Zheng; Fu, Yanwei; Zietlow, Dominik; Schiele, Bernt; Locatello, Francesco; Horn, Max; Zhao, Zixu; Zhang, Zheng; Simon-Gabriel, Carl-Johann; Brox, Thomas",poster,,,,,,,,, HiVLP: Hierarchical Interactive Video-Language Pre-Training,"Shao, Bin*; Liu, Jianzhuang; Pei, Renjing; Li, Weimian; Xu, Songcen; Dai, Peng; Lu, Juwei; Yan, Youliang",poster,,,,,,,,, Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos,"Pan, Yulin*; He, Xiangteng; Gong, Biao; Lv, Yiliang; Shen, Yujun; Peng, Yuxin; Zhao, Deli",poster,2303.08345,https://arxiv.org/abs/2303.08345,https://github.com/afcedf/SOONet.git,https://huggingface.co/papers/2303.08345,,,,7,0 Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition,"Wasim, Syed Talal*; Khattak, Muhammad Uzair; Naseer, Muzammal; Khan, Salman; Shah, Mubarak; Shahbaz Khan, Fahad",poster,,,,,,,,, Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representations Mapping,"DAHOU DJILALI, Yasser Abdelaziz*; Narayan, Sanath; Boussaid, Haithem; Almazrouei, Ebtesam; DEBBAH, Merouane A",poster,,,,,,,,, Video OWL-ViT: Temporally-consistent open-world localization in video,"Heigold, Georg*; Minderer, Matthias; Gritsenko, 
Alexey; Bewley, Alex; Keysers, Daniel; Lucic, Mario; Yu, Fisher; Kipf, Thomas",poster,,,,,,,,, Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization,"Thoker, Fida Mohammad*; Doughty, Hazel; Snoek, Cees",poster,2303.11003,https://arxiv.org/abs/2303.11003,,https://huggingface.co/papers/2303.11003,,,,3,0 Memory-and-Anticipation Transformer for Online Action Understanding,"Wang, Jiahao*; Chen, Guo; Huang, Yifei; Wang, Limin; Lu, Tong",poster,2308.07893,https://arxiv.org/abs/2308.07893,https://github.com/Echo0125/Memory-and-Anticipation-Transformer,https://huggingface.co/papers/2308.07893,,,,5,0 Video Action Segmentation via Contextually Refined Temporal Keypoints,"Jiang, Borui*; Jin, Yang; Zhentao, Tan; Mu, Yadong",poster,,,,,,,,, Knowing Where to Focus: Event-aware Transformer for Video Grounding,"Jang, Jinhyun*; Park, JungIn; Kim, Jin; Kwon, Hyeongjun; Sohn , Kwanghoon",poster,2308.06947,https://arxiv.org/abs/2308.06947,,https://huggingface.co/papers/2308.06947,,,,5,0 MPI-Flow: Learning Realistic Optical Flow with Multiplane Images,"Liang, Yingping; Fu, Ying*; Liu, Jiaming; Zhang, Debing",poster,,,,,,,,, Discovering Spatio-Temporal Rationales for Video Question Answering,"Li, Yicong*; Xiao, Junbin; Feng, Chun; Wang, Xiang; Chua, Tat-Seng",poster,2307.12058,https://arxiv.org/abs/2307.12058,https://github.com/yl3800/TranSTR,https://huggingface.co/papers/2307.12058,,,,5,0 Scalable Video Object Segmentation with Simplified Framework,"Wu, Qiangqiang*; Yang, Tianyu; WU, Wei; Chan, Antoni",poster,2308.09903,https://arxiv.org/abs/2308.09903,,https://huggingface.co/papers/2308.09903,,,,4,0 Root Pose Decomposition Towards Generic Non-rigid Reconstruction with Monocular Videos,"Wang, Yikai*; Dong, Yinpeng; Sun, Fuchun; Yang, Xiao",poster,,,,,,,,, Helping Hands: An Object-Aware Ego-Centric Video Recognition Model,"Zhang, Chuhan*; Gupta, Ankush; Zisserman, Andrew",poster,2308.07918,https://arxiv.org/abs/2308.07918,,https://huggingface.co/papers/2308.07918,,,,3,0 
Modeling the Relative Visual Tempo for Self-supervised Skeleton-based Action Recognition,"Zhu, Yisheng*; Han, Hu; Yu, Zhengtao; Liu, Guangcan",poster,,,,,,,,, Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation,"Li, Xiangtai*; Yuan, Haobo; Zhang, Wenwei; Cheng, Guangliang; Pang, Jiangmiao; Loy, Chen Change",poster,,,,,,,,, Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning,"Qing, Zhiwu; Zhang, Shiwei; Huang, Ziyuan; Zhang, Yingya; Gao, Changxin; Zhao, Deli; Sang, Nong*",poster,,,,,,,,, Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer,"Chen, Guangyi*; Liu, Xiao; Wang, Guangrun; Zhang, Kun; Torr, Philip; Zhang, Xiao-Ping; Tang, Yansong",poster,,,,,,,,, MixCycle: SOTMixup Semi-Supervised 3D Single Object Tracking with Cycle Consistency,"Wu, Qiao*; Yang, Jiaqi; Sun, Kun; Zhang, Chu'ai; Zhang, Yanning ; Salzmann, Mathieu",poster,,,,,,,,, Deep Fusion Transformer Network with Weighted Vector-Wise Keypoints Voting for Robust 6D Object Pose Estimation,"Zhou, Jun*; Chen, Kai; Xu, Linlin; QI, DOU; Qin, Jing",poster,2308.05438,https://arxiv.org/abs/2308.05438,,https://huggingface.co/papers/2308.05438,,,,5,0 Prior-free Category-level Pose Estimation with Implicit Space Transformation,"Liu, Jianhui*; Chen, Yukang; Ye, Xiaoqing; Qi, Xiaojuan",poster,,,,,,,,, Adaptive and Background-Aware Vision Transformer for Real-Time UAV Tracking,"Li, Shuiwang*; yang, xiangyang; Zeng, Dan; Wang, Xucheng",poster,,,,,,,,, VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations,"Lin, Jiehong*; Wei, Zewei; Zhang, Yabin; Jia, Kui",poster,,,,,,,,, Tracking by Natural Language Specification with Long Short-term Context Decoupling,"Ma, Ding*; WU, XIANGQIAN",poster,,,,,,,,, CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network,"Lian, Ruyi*; Ling, 
Haibin",poster,2303.16874,https://arxiv.org/abs/2303.16874,https://github.com/RuyiLian/CheckerPose,https://huggingface.co/papers/2303.16874,,,,2,0 Deep Active Contour for Real-time 6-DoF Object Tracking,"Wang, Long; Yan, Shen; Zhen, Jianan; Liu, Yu; Zhang, Maojun; Zhang, Guofeng; Zhou, Xiaowei*",poster,,,,,,,,, Learning Symmetry-Aware Geometry Correspondences for 6D Object Pose Estimation,"Zhao, Heng*; Wei, Shenxing; Shi, Dahu; Tan, Wenming; Li, Zheyang; Ren, Ye; Wei, Xing; Yang, Yi; Pu, Shiliang",poster,,,,,,,,, QueryPose: Learning Sparse Queries as Implicit Shape Prior for Category-Level 6D Object Pose and Size Estimation,"Wang, Ruiqi*; Wang, Xinggang; Li, Te; Yang, Rong; Wan, Minhong; Liu, Wenyu",poster,,,,,,,,, SOCS: Semantically-aware Object Coordinate Space for Category-Level 6D Object Pose Estimation under Large Shape Variations,"Wan, Boyan; Shi, Yifei; Xu, Kai*",poster,2303.10346,https://arxiv.org/abs/2303.10346,,https://huggingface.co/papers/2303.10346,,,,3,0 Pseudo Flow Consistency for Self-Supervised 6D Object Pose Estimation,"Hai, Yang*; Rui, Song; Li, Jiaojiao ; Ferstl, David; Hu, Yinlin",poster,2308.10016,https://arxiv.org/abs/2308.10016,,https://huggingface.co/papers/2308.10016,,,,5,0 Tracking by 3D Model Estimation of Unknown Objects in Videos,"Rozumnyi, Denys*; Matas, Jiri; Pollefeys, Marc; Ferrari, Vittorio; Oswald, Martin R.",poster,2304.06419,https://arxiv.org/abs/2304.06419,,https://huggingface.co/papers/2304.06419,,,,5,0 Algebraically rigorous quaternion framework for the neural network pose estimation problem,"Lin, Chen; Hanson, Andrew J; Hanson, Sonya M*",poster,,,,,,,,, Linear-Covariance Loss for End-to-End Learning of 6D Pose Estimation,"Liu, Fulin*; Hu, Yinlin; Salzmann, Mathieu",poster,2303.11516,https://arxiv.org/abs/2303.11516,,https://huggingface.co/papers/2303.11516,,,,3,0 2D3D-MATR: 2D-3D Matching Transformer for Detection-free Registration between Images and Point Clouds,"Li, Minhao; Qin, Zheng; Gao, Zhirui; Yi, Renjiao; Zhu, 
Chenyang; Guo, Yulan; Xu, Kai*",poster,,,,,,,,, Learning Versatile 3D Shape Generation with Improved AR Models,"Luo, Simian; Qian, Xuelin*; Fu, Yanwei; Zhang, Yinda; Tai, Ying; Zhang, Zhenyu; Wang, Chengjie; Xue, Xiangyang",poster,2303.14700,https://arxiv.org/abs/2303.14700,,https://huggingface.co/papers/2303.14700,,,,8,0 CaPhy: Capturing Physical Properties for Animatable Human Avatars,"Su, Zhaoqi; Hu, Liangxiao; Lin, Siyou; Zhang, Hongwen; Zhang, Shengping; Thies, Justus; Liu, Yebin*",poster,2308.05925,https://arxiv.org/abs/2308.05925,,https://huggingface.co/papers/2308.05925,,,,7,0 Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models,"Zha, Yaohua*; Wang, Jinpeng; Dai, Tao; Chen, Bin; Wang, Zhi; Xia, Shu-Tao",poster,2304.07221,https://arxiv.org/abs/2304.07221,https://github.com/zyh16143998882/ICCV23-IDPT,https://huggingface.co/papers/2304.07221,,,,6,0 Structure-Aware Surface Reconstruction via Primitive Assembling,"Jiang, Jingen; Zhao, Mingyang*; Xin, Shiqing; Yang, Yanchao; Wang, Hanxiao; Jia, Xiaohong; Yan, Dong-Ming",poster,,,,,,,,, BaRe-ESA: A Riemannian Framework for Unregistered Human Body Shapes,"Hartman, Emmanuel L; Pierson, Emery*; Bauer, Martin; Charon, Nicolas; Daoudi, Mohamed",poster,,,,,,,,, Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation,"He, Shan*; He, Haonan; Yang, Shuo; xiaoyan, wu; Xia, Pengcheng; Yin, Bing; Liu, Cong; Dai, Lirong; Xu, Chang",poster,,,,,,,,, Learning Point Cloud Completion without Complete Point Clouds: A Pose-aware Approach,"Kim, Jihun*; Kweon, Hyeokjun; yang, yunseo; Yoon, Kuk-Jin",poster,,,,,,,,, GeoUDF: Surface Reconstruction from 3D Point Clouds via Geometry-guided Distance Representation,"Ren, Siyu; Hou, Junhui*; Chen, Xiaodong; He, Ying; Wang, Wenping",poster,2211.16762,https://arxiv.org/abs/2211.16762,https://github.com/rsy6318/GeoUDF,https://huggingface.co/papers/2211.16762,,,,5,0 SurfsUP: Learning Fluid Simulation for Novel Surfaces,"Mani, Arjun*; 
Chandratreya, Ishaan P; Creager, Elliot; Vondrick, Carl; Zemel, Richard",poster,2304.06197,https://arxiv.org/abs/2304.06197,,https://huggingface.co/papers/2304.06197,,,,5,0 DeFormer: Integrating Transformers with Deformable Models for 3D Shape Abstraction from a Single Image,"Liu, Di*; Yu, Xiang; Ye, Meng; Zhangli, Qilong; Li, Zhuowei; Zhang, Zhixing; Metaxas, Dimitris N.",poster,,,,,,,,, Neural Deformable Models for 3D Bi-Ventricular Heart Shape Reconstruction and Modeling from 2D Sparse Cardiac Magnetic Resonance Imaging,"Ye, Meng*; Yang, Dong; Kanski, Mikael; Axel, Leon; Metaxas, Dimitris N.",poster,2307.07693,https://arxiv.org/abs/2307.07693,,https://huggingface.co/papers/2307.07693,,,,5,0 DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion,"Nakayama, Kiyohiro*; Uy, Mikaela Angelina; Huang, Jiahui; Hu, Shi-Min; Li, Ke; Guibas, Leonidas",poster,2305.01921,https://arxiv.org/abs/2305.01921,,https://huggingface.co/papers/2305.01921,,,,6,0 Self-supervised Learning of Implicit Shape Representation with Dense Correspondence for Deformable Objects,"Zhang, Baowen*; Li, Jiahe; Deng, Xiaoming; Zhang, Yinda; Ma, Cuixia; Wang, Hongan",poster,2308.12590,https://arxiv.org/abs/2308.12590,,https://huggingface.co/papers/2308.12590,,,,6,0 Neural Implicit Surface Evolution,"Novello, Tiago*; da Silva, Vinícius; Schardong, Guilherme G; Schirmer, Luiz; Lopes, Hélio; Velho, Luiz",poster,2201.09636,https://arxiv.org/abs/2201.09636,,https://huggingface.co/papers/2201.09636,,,,6,2 Unsupervised Semantic Segmentation of 3D Point Clouds via Cross-modal Distillation and Super-Voxel Clustering,"Chen, Zisheng; Xu, Hongbin*; Chen, WeiTao; Zhou, Zhipeng; Sun, Baigui; Xiao, Haihong; Kang, Wenxiong",poster,2304.08965,https://arxiv.org/abs/2304.08965,,https://huggingface.co/papers/2304.08965,,,,2,0 HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion,"Erkoç, Ziya*; Ma, Fangchang; Shan, Qi; Niessner, Matthias; Dai, 
Angela",poster,2303.17015,https://arxiv.org/abs/2303.17015,,https://huggingface.co/papers/2303.17015,,,,5,1 Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly,"Wu, Ruihai; Tie, Chenrui; Du, Yushi; Zhao, Yan; Dong, Hao*",poster,,,,,,,,, DPF-Net: Combining Explicit Shape Prior in Deformable Primitive Field for Unsupervised Structural Reconstruction of 3D Objects,"Shuai, Qingyao; Zhang, Chi; Yang, Kaizhi; Chen, Xuejin*",poster,,,,,,,,, Sample-adaptive Augmentation for Point Cloud Recognition Against Real-world Corruptions,"Wang, Jie*; Ding, lihe; Xu, Tingfa; Dong, Shaocong; Xu, xinli; Bai, Long; Li, Jianan",poster,,,,,,,,, 3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack,"Tao, Yunbo*; Liu, Daizong; Zhou, Pan; Xie, Yulai; Du, Wei; Hu, Wei",poster,2308.07546,https://arxiv.org/abs/2308.07546,,https://huggingface.co/papers/2308.07546,,,,6,0 P2C: Self-Supervised Point Cloud Completion from Single Partial Clouds,"Cui, Ruikai; Qiu, Shi; Anwar, Saeed; Liu, Jiawei; Xing, Chaoyue; Zhang, Jing; Barnes, Nick*",poster,2307.14726,https://arxiv.org/abs/2307.14726,https://github.com/CuiRuikai/Partial2Complete,https://huggingface.co/papers/2307.14726,,,,7,0 Towards Multi-Layered 3D Garments Animation,"Shao, Yidi*; Loy, Chen Change; Dai, Bo",poster,2305.10418,https://arxiv.org/abs/2305.10418,,https://huggingface.co/papers/2305.10418,,,,3,1 AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control,"JIANG, Ruixiang; Wang, Can; ZHANG, Jingbo; Chai, Menglei; He, Mingming; Chen, Dongdong; Liao, Jing*",poster,2303.17606,https://arxiv.org/abs/2303.17606,,https://huggingface.co/papers/2303.17606,,,,7,0 Text-Driven Localized Object Manipulation Using Blending NeRF,"Song, Hyeonseop; Choi, Seokhun; Do, Hoseok; Lee, Chul; Kim, Taehyeong*",poster,,,,,,,,, SIRA-PCR: Sim-to-Real Adaptation for 3D Point Cloud Registration,"Chen, Suyi; Xu, Hao; Li, Ru; Liu, Guanghui; Fu, Chi-Wing; Liu, 
Shuaicheng*",poster,,,,,,,,, 3D Semantic Subspace Traverser: Empowering 3D Generative Model with Shape Editing Capability,"Wang, Ruowei; Liu, Yu; su, pei; Zhang, Jianwei; Zhao, Qijun*",poster,2307.14051,https://arxiv.org/abs/2307.14051,https://github.com/TrepangCat/3D_Semantic_Subspace_Traverser,https://huggingface.co/papers/2307.14051,,,,5,0 DMNet: Delaunay Meshing Network for 3D Shape Representation,"Zhang, Chen; Yuan, Ganzhangqin; Tao, Wenbing*",poster,,,,,,,,, Attention Discriminant Sampling for Point Clouds,"Hong, Cheng-Yao*; Chou, Yu-Ying; Liu, Tyng-Luh",poster,,,,,,,,, SALAD: Part-Level Latent Diffusion for 3D Shape Generation and Manipulation,"Koo, Juil*; Yoo, Seungwoo; Nguyen, Hieu Minh; Sung, Minhyuk",poster,2303.12236,https://arxiv.org/abs/2303.12236,,https://huggingface.co/papers/2303.12236,,,,4,0 MAPConNet: Self-supervised 3D Pose Transfer with Mesh and Point Contrastive Learning,"Sun, Jiaze*; Chen, Zhixiang; Kim, Tae-Kyun (T-K)",poster,2304.13819,https://arxiv.org/abs/2304.13819,,https://huggingface.co/papers/2304.13819,,,,3,0 Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud Recognition,"Yi, Xuanyu*; Deng, Jiajun; Sun, Qianru; Hua, Xian-Sheng; Lim, Joo-Hwee; Zhang, Hanwang",poster,2308.09694,https://arxiv.org/abs/2308.09694,,https://huggingface.co/papers/2308.09694,,,,6,0 EPiC: Ensemble of Partial Point Clouds for Robust Classification,"Levi, Meir Yossef*; Gilboa, Guy",poster,2303.11419,https://arxiv.org/abs/2303.11419,https://github.com/yossilevii100/EPiC,https://huggingface.co/papers/2303.11419,,,,2,1 Leveraging Intrinsic Properties for Non-Rigid Garment Alignment,"Lin, Siyou; ZHOU, Boyao; Zheng, Zerong; Zhang, Hongwen; Liu, Yebin*",poster,2308.09519,https://arxiv.org/abs/2308.09519,,https://huggingface.co/papers/2308.09519,,,,5,0 Spatially and Spectrally Consistent Deep Functional Maps,"Sun, Mingze; Mao, Shiwei; Jiang, Puhua; Ovsjanikov, Maks; Huang, 
Ruqi*",poster,2308.08871,https://arxiv.org/abs/2308.08871,https://github.com/rqhuang88/Spatiallyand-Spectrally-Consistent-Deep-Functional-Maps,https://huggingface.co/papers/2308.08871,,,,5,0 SVDFormer: Complementing Point Cloud via Self-view Augmentation and Self-structure Dual-generator,"Zhu, Zhe*; Chen, Honghua; He, Xing; Wang, Weiming; Qin, Jing; Wei, Mingqiang",poster,2307.08492,https://arxiv.org/abs/2307.08492,https://github.com/czvvd/SVDFormer,https://huggingface.co/papers/2307.08492,,,,6,0 Batch-based Model Registration for Fast 3D Sherd Reconstruction,"Wang, Jiepeng; Zhang, Congyi; Wang, Peng; Li, Xin; Cobb, Peter; Theobalt, Christian; Wang, Wenping*",poster,,,,,,,,, Implicit Autoencoder for Point-Cloud Self-Supervised Representation Learning,"Yan, Siming*; Yang, Zhenpei; Li, Haoxiang; Song, Chen; Guan, Li; Kang, Hao; Hua, Gang; Huang, Qixing",poster,2201.00785,https://arxiv.org/abs/2201.00785,,https://huggingface.co/papers/2201.00785,,,,8,0 E3Sym: Leveraging E(3) Invariance for Unsupervised 3D Planar Reflective Symmetry Detection,"Li, Ren-Wu; Zhang, Ling-Xiao; Li, Chunpeng*; Lai, Yu-Kun; Gao, Lin",poster,,,,,,,,, Semantify: Simplifying the Control of 3D Morphable Models using CLIP,"Gralnik, Omer*; Gafni, Guy; Shamir, Ariel",poster,2308.07415,https://arxiv.org/abs/2308.07415,,https://huggingface.co/papers/2308.07415,,,,3,0 VoroMesh: Learning Watertight Surface Meshes with Voronoi Diagrams,"Maruani, Nissim*; Klokov, Roman; Ovsjanikov, Maks; Alliez, Pierre; Desbrun, Mathieu",poster,2308.14616,https://arxiv.org/abs/2308.14616,,https://huggingface.co/papers/2308.14616,,,,5,0 DG3D: Generating High Quality 3D Textured Shapes by Learning to Discriminate Multi-Modal Diffusion-Renderings,"Zuo, Qi*; Song, Yafei; JIANFANG, LI; Liu, Lin; Liefeng, Bo",poster,,,,,,,,, Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers,"Corona-Figueroa, Abril*; Bond-Taylor, Sam; Bhowmik, Neelanjan; A. 
Gaus, Yona Falinie; Breckon, Toby P; Shum, Hubert P. H.; Willcocks, Chris G.",poster,2308.14152,https://arxiv.org/abs/2308.14152,,https://huggingface.co/papers/2308.14152,,,,7,0 Hyperbolic Chamfer Distance for Point Cloud Completion,"LIN, FANGZHOU; Yue, Yun; Hou, Songlin; Yu, Xuechu; Xu, Yajun ; Yamada, Kazunori D; Zhang, Ziming*",poster,,,,,,,,, SKED: Sketch-guided Text-based 3D Editing,"Mikaeili, Aryan*; Perel, Or; Safaee, Mehdi; Cohen-Or, Danny; Mahdavi-Amiri, Ali",poster,2303.10735,https://arxiv.org/abs/2303.10735,,https://huggingface.co/papers/2303.10735,,,,5,0 Adaptive Spiral Layers for Efficient 3D Representation Learning on Meshes,"Babiloni, Francesca*; Maggioni, Matteo; Tanay, Thomas; Deng, Jiankang; Leonardis, Ales; Zafeiriou, Stefanos",poster,,,,,,,,, EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild,"Kaufmann, Manuel*; Song, Jie; Guo, Chen; Shen, Kaiyue; Jiang, Tianjian; Tang, Chengcheng; Zarate, Juan J; Hilliges, Otmar",poster,2308.16894,https://arxiv.org/abs/2308.16894,,https://huggingface.co/papers/2308.16894,,,,8,0 ReFit: Recurrent Fitting Network for 3D Human Recovery,"Wang, Yufu*; Daniilidis, Kostas",poster,2308.11184,https://arxiv.org/abs/2308.11184,,https://huggingface.co/papers/2308.11184,,,,2,0 Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation,"Chai, Wenhao; Jiang, Zhongyu; Hwang, Jenq-Neng; Wang, Gaoang*",poster,2303.16456,https://arxiv.org/abs/2303.16456,,https://huggingface.co/papers/2303.16456,,,,4,1 Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images,"Tse, Tze Ho Elden*; Mueller, Franziska; Dou, Mingsong; Doosti, Bardia; Tang, Danhang; Zhang, Yinda; Beeler, Thabo; Chang, Hyung Jin; Petrovic, Sasa; Shen, Zhengyang; Taylor, Jonathan",poster,,,,,,,,, Realistic Full-Body Tracking from Sparse Observations via Joint-Level Modeling,"Zheng, Xiaozheng*; Su, Zhuo; Wen, Chao; Xue, Zhou; Jin, 
Xiaojie",poster,2308.08855,https://arxiv.org/abs/2308.08855,,https://huggingface.co/papers/2308.08855,,,,5,0 Rethinking pose estimation in crowds: overcoming the detection information bottleneck and ambiguity,"Zhou, Mu; Stoffl, Lucas; Mathis, Alexander*; Mathis, Mackenzie",poster,,,,,,,,, HDG-ODE: A Hierarchical Continuous-Time Model for Human Pose Forecasting,"Xing, Yucheng*; Wang, Xin",poster,,,,,,,,, AffordPose: A Large-scale Dataset of Hand-Object Interactions with Affordance-driven Hand Pose,"Jian, Juntao; Liu, Xiuping; Li, Manyi; Hu, Ruizhen; Liu, Jian*",poster,,,,,,,,, Robust 3D Pose Estimation via Phase-conditioned Human Motion Prior,"Shi, Mingyi*; Starke, Sebastian; Ye, Yuting; Komura, Taku; Won, Jungdam",poster,,,,,,,,, Inhabiting the Virtual: Synthesizing Diverse Human Motions in 3D Indoor Scenes,"Zhao, Kaifeng*; Zhang, Yan; wang, Shaofei; Beeler, Thabo; Tang, Siyu",poster,,,,,,,,, "TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting","Choudhury, Rohan*; Kitani, Kris; Jeni, Laszlo A",poster,,,,,,,,, Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation,"Shan, Wenkang*; Liu, Zhenhua; zhang, xinfeng; Wang, Zhao; Han, Kai; Wang, Shanshe; Ma, Siwei; Gao, Wen",poster,2303.11579,https://arxiv.org/abs/2303.11579,https://github.com/paTRICK-swk/D3DP,https://huggingface.co/papers/2303.11579,,,,8,1 Towards Robust and Smooth 3D Multi-Person Pose Estimation from Monocular Videos in the Wild,"Park, Sungchan; Lyou, Eunyi; Lee, Inhoe; Lee, Joonseok*",poster,,,,,,,,, Humans in 4D: Reconstructing and Tracking Humans with Transformers,"Goel, Shubham*; Pavlakos, Georgios; Rajasegaran, Jathushan; Kanazawa, Angjoo; Malik, Jitendra",poster,2305.20091,https://arxiv.org/abs/2305.20091,,https://huggingface.co/papers/2305.20091,,,,5,4 NPC: Neural Point Characters from Video,"Su, Shih-Yang*; Bagautdinov, Timur; Rhodin, Helge",poster,2304.02013,https://arxiv.org/abs/2304.02013,,https://huggingface.co/papers/2304.02013,,,,3,0 Priority-Centric 
Human Motion Generation in Discrete Latent Space,"Kong, Hanyang*; Gong, Kehong; Lian, Dongze; Bi Mi, Michael; Wang, Xinchao",poster,2308.14480,https://arxiv.org/abs/2308.14480,,https://huggingface.co/papers/2308.14480,,,,5,0 Unsupervised Learning for Neural 3D Composition of Humans and Objects,"Kim, Taeksoo*; Saito, Shunsuke; Joo, Hanbyul",poster,,,,,,,,, Cyclic Test-Time Adaptation on Monocular Video for 3D Human Mesh Reconstruction,"Nam, Hyeongjin; Jung, Daniel Sungho; Oh, Yeonguk; Lee, Kyoung Mu*",poster,2308.06554,https://arxiv.org/abs/2308.06554,https://github.com/hygenie1228/CycleAdapt_RELEASE,https://huggingface.co/papers/2308.06554,,,,4,0 Multiple Hypotheses Meet Entropy for Pose and Shape Recovery,"Chen, Rongyu*; Yang, Linlin; Yao, Angela",poster,,,,,,,,, Probabilistic Triangulation for uncalibrated multi-view 3D human pose estimation,"Jiang, Boyuan; Hu, Lei; Xia, Shihong*",poster,,,,,,,,, DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation,"Feng, Runyang*; Gao, Yixing; Tse, Tze Ho Elden; Ma, Xueqing; Chang, Hyung Jin",poster,2307.16687,https://arxiv.org/abs/2307.16687,,https://huggingface.co/papers/2307.16687,,,,5,0 Reconstructing Groups of People with Hypergraph Relational Reasoning,"Huang, Buzhen*; Ju, Jingyi; Li, Zhihao; Wang, Yangang",poster,2308.15844,https://arxiv.org/abs/2308.15844,https://github.com/boycehbz/GroupRec,https://huggingface.co/papers/2308.15844,,,,4,0 MixSynthFormer: A Transformer Encoder-like Structure with Mixed Synthetic Self-attention for Efficient Human Pose Estimation,"Sun, Yuran*; Dougherty, Alan W; ZHANG, Zhuoying; Choi, Yi King; Wu, Chuan",poster,,,,,,,,, Dynamic Hyperbolic Attention Network for Fine Hand-object Reconstruction,"Leng, Zhiying*; Wu, shuncheng; Saleh, Mahdi; Montanaro, Antonio; Yu, Hao; Wang, Yin; Navab, Nassir; Liang, Xiaohui; Tombari, Federico",poster,,,,,,,,, Human from Blur: Human Pose Tracking from Blurry Images,"Zhao, Yiming*; Rozumnyi, Denys; Song, Jie; Hilliges, Otmar; 
Pollefeys, Marc; Oswald, Martin R.",poster,2303.17209,https://arxiv.org/abs/2303.17209,,https://huggingface.co/papers/2303.17209,,,,6,0 AG3D: Learning to Generate 3D Avatars from 2D Image Collections,"Dong, Zijian*; Chen, Xu; Yang, Jinlong; Black, Michael J.; Hilliges, Otmar; Geiger, Andreas",poster,2305.02312,https://arxiv.org/abs/2305.02312,,https://huggingface.co/papers/2305.02312,,,,6,0 InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion,"Xu, Sirui*; Li, Zhengyuan; Wang, Yu-Xiong; Gui, Liangyan",poster,2308.16905,https://arxiv.org/abs/2308.16905,https://github.com/Sirui-Xu/InterDiff,https://huggingface.co/papers/2308.16905,,,,4,1 SEFD: Learning to Distill Complex Pose and Occlusion,"Yang, ChangHee*; Kong, Kyeongbo; Min, Sung-Jun; Wee, Dongyoon; Jang, Ho-Deok; Cha, Geonho; Kang, Suk-Ju",poster,,,,,,,,, 3D Human Mesh Recovery with Sequentially Global Rotation Estimation,"Wang, Dongkai; Zhang, Shiliang*",poster,,,,,,,,, Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video,"You, Yingxuan*; Liu, Hong; Wang, Ti; Li, Wenhao; Ding, Runwei; Li, Xia",poster,2308.10305,https://arxiv.org/abs/2308.10305,https://github.com/kasvii/PMCE,https://huggingface.co/papers/2308.10305,,,,6,0 PHRIT: Parametric Hand Representation with Implicit Template,"Huang, Zhisheng; Chen, Yujin; Kang, Di; Zhang, Jinlu; Tu, Zhigang*",poster,,,,,,,,, HopFIR: Hop-wise GraphFormer with Intragroup Joint Refinement for 3D Human Pose Estimation,"Zhai, Kai*; Nie, Qiang; Ouyang, Bo; Li, Xiang; Yang, Shanlin",poster,2302.14581,https://arxiv.org/abs/2302.14581,,https://huggingface.co/papers/2302.14581,,,,5,0 Prior-guided Source-free Domain Adaptation for Human Pose Estimation,"Raychaudhuri, Dripta S.*; Ta, Calvin-Khang T; Dutta, Arindam; Lal, Rohit; Roy-Chowdhury, Amit K. 
",poster,2308.13954,https://arxiv.org/abs/2308.13954,,https://huggingface.co/papers/2308.13954,,,,5,0 Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing,"Dai, Lu*; Ma, Liqian; Qian, Shenhan; Liu, Hao; Xiong, Hui; Liu, Ziwei",poster,,,,,,,,, PoseFix: Correcting 3D Human Poses with Natural Language,"Delmas, Ginger*; Weinzaepfel, Philippe; Moreno, Francesc; Rogez, Gregory",poster,,,,,,,,, Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation,"Liu, Huan*; Chen, Qiang; Tan, Zichang; Liu, Jiang-Jiang; Wang, Jian; Su, Xiangbo; Li, Xiaolong; Yao, Kun; Han, Junyu; Ding, Errui; Zhao, Yao; Wang, Jingdong",poster,2308.07313,https://arxiv.org/abs/2308.07313,,https://huggingface.co/papers/2308.07313,,,,12,0 Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation,"Azadi, Samaneh*; Shah, Mian Akbar; Hayes, Thomas F; Parikh, Devi; Gupta, Sonal",poster,,,,,,,,, NSF: Neural Surface Fields for Human Modeling from Monocular Depth,"Xue, Yuxuan*; Bhatnagar, Bharat Lal; Marin, Riccardo; Sarafianos, Nikolaos; Xu, Yuanlu; Pons-Moll, Gerard; Tung, Tony",poster,2308.14847,https://arxiv.org/abs/2308.14847,,https://huggingface.co/papers/2308.14847,,,,7,0 Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models,"Pi, Huaijin*; Peng, Sida; Yang, Minghui; Zhou, Xiaowei; Bao, Hujun",poster,,,,,,,,, Dynamic Mesh Recovery from Partial Point Cloud Sequence,"Jang, Hojun*; Kim, Minkwan; Bae, Jinseok; Kim, Young Min",poster,,,,,,,,, MotionBERT: A Unified Perspective on Learning Human Motion Representations,"Zhu, Wentao*; Ma, Xiaoxuan; Liu, Zhaoyang; Liu, Libin; Wu, Wayne; Wang, Yizhou",poster,2210.06551,https://arxiv.org/abs/2210.06551,,https://huggingface.co/papers/2210.06551,,,,6,0 Novel-view Synthesis and Pose Estimation for Hand-Object Interaction from Sparse Views,"Qu, Wentian*; Cui, Zhaopeng; Meng, Chenyu; Deng, Xiaoming; Zhang, Yinda; Ma, Cuixia; Wang, 
Hongan",poster,2308.11198,https://arxiv.org/abs/2308.11198,,https://huggingface.co/papers/2308.11198,,,,7,0 OCHID-Fi: Occlusion-Robust Hand Pose Estimation in 3D via RF-Vision,"Zhang, Shujie*; Zheng, Tianyue; Chen, Zhe; Hu, Jingzhi; Khamis, Abdelwahed; Liu, Jiajun; Luo, Jun",poster,,,,,,,,, Neural Interactive Keypoint Detection,"Yang, Jie; Zeng, Ailing*; Li, Feng; Liu, Shilong; Zhang, Ruimao; Zhang, Lei",poster,2308.10174,https://arxiv.org/abs/2308.10174,https://github.com/IDEA-Research/Click-Pose,https://huggingface.co/papers/2308.10174,,,,6,0 Plausible Uncertainties for Human Pose Regression,"Bramlage, Lennart*; Karg, Michelle ; Curio, Cristobal",poster,,,,,,,,, TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer,"Dou, Zhiyang*; Wu, Qingxuan; Lin, Cheng; Cao, Zeyu; Wu, Qiangqiang; Wan, Weilin; Komura, Taku; Wang, Wenping",poster,2211.10705,https://arxiv.org/abs/2211.10705,,https://huggingface.co/papers/2211.10705,,,,8,1 Weakly-supervised 3D Pose Transfer with Keypoints,"Chen, Jinnan*; Li , Chen; Lee, Gim Hee",poster,2307.13459,https://arxiv.org/abs/2307.13459,,https://huggingface.co/papers/2307.13459,,,,3,0 SATR: Zero-Shot Semantic Segmentation of 3D Shapes,"Abdelreheem, Ahmed*; Skorokhodov, Ivan; Ovsjanikov, Maks; Wonka, Peter",poster,2304.04909,https://arxiv.org/abs/2304.04909,,https://huggingface.co/papers/2304.04909,,,,4,0 CiT: Curation in Training for Effective Vision-Language Data,"Xu, Hu*; Xie, Saining; Huang, Po-Yao; Yu, Licheng; Howes, Russell; Ghosh, Gargi; Zettlemoyer, Luke; Feichtenhofer, Christoph",poster,2301.02241,https://arxiv.org/abs/2301.02241,,https://huggingface.co/papers/2301.02241,,,,8,0 Learning Self-regulating Prompts for Vision-Language Models,"Khattak, Muhammad Uzair *; Wasim, Syed Talal; Naseer, Muzammal; Khan, Salman; Yang, Ming-Hsuan; Shahbaz Khan, Fahad",poster,,,,,,,,, Learning To Ground Instructional Articles In Videos Through Narrations,"Mavroudi, Effrosyni*; Afouras, Triantafyllos; Torresani, 
Lorenzo",poster,2306.03802,https://arxiv.org/abs/2306.03802,,https://huggingface.co/papers/2306.03802,,,,3,0 Ref-Egocentric: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D,"Kurita, Shuhei*; Katsura, Naoki; Onami, Eri",poster,,,,,,,,, Multi3DRefer: Grounding Text Description to Multiple 3D Objects,"Zhang, Yiming*; Gong, ZeMing; Chang, Angel X",poster,,,,,,,,, Bayesian Prompt Learning for Image-Language Model Generalization,"Derakhshani, Mohammad Mahdi*; Sanchez, Enrique; Bulat, Adrian; Turrisi da Costa, Victor G.; Snoek, Cees; Tzimiropoulos, Georgios; Martinez, Brais",poster,2210.02390,https://arxiv.org/abs/2210.02390,https://github.com/saic-fi/Bayesian-Prompt-Learning,https://huggingface.co/papers/2210.02390,,,,7,1 Who are you referring to? Coreference resolution in image narrations,"Goel, Arushi*; Fernando, Basura; Keller, Frank; Bilen, Hakan",poster,2211.14563,https://arxiv.org/abs/2211.14563,,https://huggingface.co/papers/2211.14563,,,,4,0 Guiding image captioning models toward more specific captions,"Kornblith, Simon*; Li, Lala; Wang, Zirui; Nguyen, Thao T",poster,2307.16686,https://arxiv.org/abs/2307.16686,,https://huggingface.co/papers/2307.16686,,,,4,2 PreSTU: Pre-Training for Scene-Text Understanding,"Kil, Jihyung*; Changpinyo, Soravit; Chen, Xi; Hu, Hexiang; Goodman, Sebastian; Chao, Wei-Lun; Soricut, Radu",poster,2209.05534,https://arxiv.org/abs/2209.05534,,https://huggingface.co/papers/2209.05534,,,,7,0 Exploring Group Video Captioning with Efficient Relational Approximation,"lin, wang*; Li, Linjun; Jin, Tao; Wang, Ye; Cheng, Xize; Pan, Wenwen; Zhao, Zhou",poster,,,,,,,,, VLSlice: Interactive Vision-and-Language Slice Discovery,"Slyman, Eric*; Kahng, Minsuk; Lee, Stefan",poster,,,,,,,,, Pretrained Language Models as Visual Planners for Human Assistance,"Patel, Dhruvesh; Eghbalzadeh, Hamid; Kamra, Nitin; Iuzzolino, Michael; Jain, Unnat; Desai, Ruta 
P*",poster,2304.09179,https://arxiv.org/abs/2304.09179,,https://huggingface.co/papers/2304.09179,,,,6,0 VQA Therapy: Exploring Answer Differences by Visually Grounding Answers,"Chen, Chongyan*; Anjum, Samreen; Gurari, Danna",poster,2308.11662,https://arxiv.org/abs/2308.11662,,https://huggingface.co/papers/2308.11662,,,,3,0 Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images,"Yu, Cuican*; Lu, Guansong; Zeng, Yihan; Sun, Jian; Liang, Xiaodan; Li, Huibin; Xu, Zongben; Xu, Songcen; Zhang, Wei; Xu, Hang",poster,2308.16758,https://arxiv.org/abs/2308.16758,,https://huggingface.co/papers/2308.16758,,,,10,0 Zero-Shot Composed Image Retrieval with Textual Inversion,"Baldrati, Alberto*; Agnolucci, Lorenzo; Bertini, Marco; Del Bimbo, Alberto",poster,2303.15247,https://arxiv.org/abs/2303.15247,https://github.com/miccunifi/SEARLE,https://huggingface.co/papers/2303.15247,,,,4,2 PatchCT: Aligning Patch Set and Label Set with Conditional Transport for Multi-Label Image Classification,"Wang, Dongsheng*; Li, Miaoge; Liu, Xinyang; Zeng, Zequn; Lu, Ruiying; Chen, Bo; Zhou, Mingyuan",poster,2307.09066,https://arxiv.org/abs/2307.09066,,https://huggingface.co/papers/2307.09066,,,,7,0 Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge,"Kim, Minsu*; Yeo, Jeong Hun; Choi, Jeongsoo; Ro, Yong Man",poster,2308.09311,https://arxiv.org/abs/2308.09311,,https://huggingface.co/papers/2308.09311,,,,4,0 ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding,"Guo, Ziyu*; Tang, Yiwen; Zhang, Renrui; Wang, Dong; Wang, Zhigang; Zhao, Bin; Li, Xuelong",poster,,,,,,,,, AerialVLN: Vision-and-Language Navigation for UAVs,"Liu, Shubo*; Zhang, Hongsheng; Qi, Yuankai; Wang, Peng; Zhang, Yanning ; Wu, Qi",poster,2308.06735,https://arxiv.org/abs/2308.06735,https://github.com/AirVLN/AirVLN,https://huggingface.co/papers/2308.06735,,,,6,0 Linear Spaces of Meanings: Compositional Structures in 
Vision-Language Models,"Trager, Matthew*; Perera, Pramuditha; Zancato, Luca; Achille, Alessandro; Bhatia, Parminder; Soatto, Stefano",poster,2302.14383,https://arxiv.org/abs/2302.14383,,https://huggingface.co/papers/2302.14383,,,,6,0 HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training,"Ye, Qinghao*; Xu, Guohai; Yan, Ming; Xu, Haiyang; Qian, Qi; Zhang, Ji; Huang, Fei",poster,2212.14546,https://arxiv.org/abs/2212.14546,,https://huggingface.co/papers/2212.14546,,,,7,0 EgoTV: Egocentric Task Verification from Natural Language Task Descriptions,"Hazra, Rishi; Chen, Brian; Rai, Akshara; Kamra, Nitin; Desai, Ruta P*",poster,2303.16975,https://arxiv.org/abs/2303.16975,,https://huggingface.co/papers/2303.16975,,,,5,1 SINC: Self-Supervised In-Context Learning for Vision-Language Tasks,"Chen, Yi-Syuan*; Song, Yun-Zhu; Yeo, Cheng Yu; Liu, Bei; Fu, Jianlong; Shuai, Hong-Han",poster,2307.07742,https://arxiv.org/abs/2307.07742,,https://huggingface.co/papers/2307.07742,,,,6,0 VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation,"Qiao, Yanyuan*; Yu, Zheng; Wu, Qi",poster,,,,,,,,, Open-Vocabulary Object Detection and Part Segmentation,"Sun, Peize*; Chen, Shoufa; Zhu, Chenchen; Xiao, Fanyi; Luo, Ping; Xie, Saining; Yan, Zhicheng",poster,,,,,,,,, Temporal Collection and Distribution for Referring Video Object Segmentation,"Tang, Jiajin; Zheng, Ge; Yang, Sibei*",poster,,,,,,,,, Inverse Compositional Learning for Weakly-supervised Relation Grounding,"Li, Huan; Wei, Ping*; Ma, Zeyu; Zheng, Nanning",poster,,,,,,,,, Why Is Prompt Tuning for Vision-Language Models Robust to Noisy Labels?,"Wu, Cheng-En*; Tian, Yu; Yu, Haichao; Wang, Heng; Morgado, Pedro; Hu, Yu Hen; Yang, Linjie",poster,2307.11978,https://arxiv.org/abs/2307.11978,https://github.com/CEWu/PTNL,https://huggingface.co/papers/2307.11978,,,,7,0 CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos,"Han, Seungju; Hessel, Jack; Dziri, Nouha; Choi, Yejin; Yu, 
Youngjae*",poster,2303.09713,https://arxiv.org/abs/2303.09713,,https://huggingface.co/papers/2303.09713,,,,5,0 RCA-NOC: Relative Contrastive Alignment for Novel Object Captioning,"Fan, JiaShuo*; Liang, Yaoyuan; Liu, Leyao; Huang, Shao-Lun; Zhang, Lei",poster,,,,,,,,, DIME-FM : DIstilling Multimodal and Efficient Foundation Models,"Sun, Ximeng*; Zhang, Pengchuan; Zhang, Peizhao; Shah, Hardik; Saenko, Kate; Xia, Xide",poster,,,,,,,,, Black Box Few-Shot Adaptation for Vision-Language models,"Ouali, Yassine*; Bulat, Adrian; Martinez, Brais; Tzimiropoulos, Georgios",poster,2304.01752,https://arxiv.org/abs/2304.01752,,https://huggingface.co/papers/2304.01752,,,,4,0 Shatter and Gather: Learning Referring Image Segmentation with Text Supervision,"Kim, Dongwon*; Kim, Namyup; Lan, Cuiling; Kwak, Suha",poster,2308.15512,https://arxiv.org/abs/2308.15512,,https://huggingface.co/papers/2308.15512,,,,4,0 Accurate and Fast Compressed Video Captioning,"Shen, Yaojie; Gu, Xin; Xu, Kai; Fan, Heng; Wen, Longyin; Zhang, Libo*",poster,,,,,,,,, Exploring Temporal Concurrency for Video-Language Representation Learning,"Zhang, Heng; Liu, Daqing; Lv, Zezhong; Su, Bing*; Tao, Dacheng",poster,,,,,,,,, Verbs in Action: Improving verb understanding in video-language models,"Momeni, Liliane*; Caron, Mathilde; Nagrani, Arsha; Zisserman, Andrew; Schmid, Cordelia",poster,2304.06708,https://arxiv.org/abs/2304.06708,,https://huggingface.co/papers/2304.06708,,,,5,0 Sign Language Translation with Iterative Prototype,"Yao, Huijie*; Zhou, Wengang ; Feng, Hao; Hu, Hezhen; Zhou, Hao; Li, Houqiang",poster,2308.12191,https://arxiv.org/abs/2308.12191,,https://huggingface.co/papers/2308.12191,,,,6,0 Contrastive Feature Masking Vision Transformer for Open-vocabulary Detection,"Kim, Dahun*; Angelova, Anelia; Kuo, Weicheng",poster,,,,,,,,, Toward Unsupervised Realistic Visual Question Answering,"Zhang, Yuwei; Ho, Chih-Hui*; Vasconcelos, 
Nuno",poster,2303.05068,https://arxiv.org/abs/2303.05068,,https://huggingface.co/papers/2303.05068,,,,3,0 GridMM: Grid Memory Map for Vision-and-Language Navigation,"Wang, Zihan*; Li, Xiangyang; Yang, Jiahao; Liu, Yeqi; Jiang, Shuqiang",poster,2307.12907,https://arxiv.org/abs/2307.12907,,https://huggingface.co/papers/2307.12907,,,,5,0 "Video Background Music Generation: Dataset, Method and Evaluation","Zhuo, Le*; Wang, Zhaokai; Wang, Baisen; Liao, Yue; Han, Songhao; Bao, Chenxi; Peng, Stanley; Zhang, Aixi; Fang, Fei; Liu, Si",poster,2211.11248,https://arxiv.org/abs/2211.11248,https://github.com/zhuole1025/SymMV,https://huggingface.co/papers/2211.11248,,,,10,0 Pivot Cube: Semantically-Enhanced CLIP Adaptation for Text-Video Retrieval,"Deng, Chaorui*; Chen, Qi; Qin, Pengda; Chen, Da; Wu, Qi",poster,,,,,,,,, Prompt-aligned Gradient for Prompt Tuning,"Zhu, Beier*; Niu, Yulei; HAN, YUCHENG; Wu, Yue; Zhang, Hanwang",poster,2205.14865,https://arxiv.org/abs/2205.14865,https://github.com/BeierZhu/Prompt-align,https://huggingface.co/papers/2205.14865,,,,5,0 Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models,"Kan, Baoshuo*; Wang, Teng; Lu, Wenpeng; Zhen, Xiantong; GUAN, WEILI; Zheng, Feng",poster,2308.11186,https://arxiv.org/abs/2308.11186,,https://huggingface.co/papers/2308.11186,,,,6,0 Order-Prompted Tag Sequence Generation for Video Tagging,"Ma, Zongyang*; Zhang, Ziqi; Chen, Yuxin; Qi, Zhongang; Luo, Yingmin; Li, Zekun; Yuan, Chunfeng; Li, Bing; Qie, Xiaohu; Shan, Ying; Hu, Weiming",poster,,,,,,,,, What does a platypus look like? 
Generating customized prompts for zero-shot image classification ,"Pratt, Sarah*; Covert, Ian; Liu, Rosanne; Farhadi, Ali",poster,,,,,,,,, PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization,"Cho, Junhyeong*; Nam, Gilhyun; Kim, Sungyeon; Yang, Hunmin; Kwak, Suha",poster,2307.15199,https://arxiv.org/abs/2307.15199,,https://huggingface.co/papers/2307.15199,,,,5,3 DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability,"Huang, Runhui*; Han, Jianhua; Lu, Guansong; Liang, Xiaodan; Zeng, Yihan; Zhang, Wei; Xu, Hang",poster,2308.09306,https://arxiv.org/abs/2308.09306,,https://huggingface.co/papers/2308.09306,,,,7,0 EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment,"Shi, Cheng; Yang, Sibei*",poster,,,,,,,,, MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition,"Cheng, Xize*; Li, Linjun; Jin, Tao; Huang, Rongjie; lin, wang; wang, zehan; Liu, Huadai; Wang, Ye; Yin, Aoxiong; Zhao, Zhou",poster,2303.05309,https://arxiv.org/abs/2303.05309,,https://huggingface.co/papers/2303.05309,,,,10,0 Waffling around for Performance: Visual Classification with Random Words and Broad Concepts,"Roth, Karsten*; Kim, Jae Myung; Koepke, A. 
Sophia; Vinyals, Oriol; Schmid, Cordelia; Akata, Zeynep",poster,2306.07282,https://arxiv.org/abs/2306.07282,https://github.com/ExplainableML/WaffleCLIP,https://huggingface.co/papers/2306.07282,,,,6,0 March in Chat: Interactive Prompting for Remote Embodied Referring Expression,"Qiao, Yanyuan*; Qi, Yuankai; Yu, Zheng; Liu, Jing; Wu, Qi",poster,2308.10141,https://arxiv.org/abs/2308.10141,,https://huggingface.co/papers/2308.10141,,,,5,0 Frequency Guidance Matters in Few-Shot Learning,"Cheng, Hao*; YANG, SIYUAN; Zhou, Joey Tianyi; Guo, Lanqing; Wen, Bihan",oral,,,,,,,,, Sensitivity-Aware Visual Parameter-Efficient Tuning,"He, Haoyu*; Cai, Jianfei; Zhang, Jing; Tao, Dacheng; Zhuang, Bohan",oral,,,,,,,,, On the Robustness of Open-World Test-Time Training: Self-Training with Dynamic Prototype Expansion,"Li, Yushu; Xu, Xun*; Su, Yongyi; Jia, Kui",oral,2308.09942,https://arxiv.org/abs/2308.09942,https://github.com/Yushu-Li/OWTTT,https://huggingface.co/papers/2308.09942,,,,4,0 Generating Instance-level Prompts for Rehearsal-free Continual Learning,"Jung, Dahuin*; Han, Dongyoon; Bang, Jihwan; Song, Hwanjun",oral,,,,,,,,, Boosting Novel Category Discovery Over Domains with Soft Contrastive Learning and All in One Classifier,"Zang, Zelin*; Shang, Lei; Yang, Senqiao; Wang, Fei; Sun, Baigui; Xie, Xuansong; Li, Stan Z.",oral,,,,,,,,, A soft nearest-neighbor framework for continual semi-supervised learning,"Kang, Zhiqi; Fini, Enrico; Nabi, Moin; Ricci, Elisa; Alahari, Karteek*",oral,2212.05102,https://arxiv.org/abs/2212.05102,https://github.com/kangzhiq/NNCSL,https://huggingface.co/papers/2212.05102,,,,5,0 GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation,"Yang, Jiewen; Ding, Xinpeng; Ziyang, Zheng; Xu, Xiaowei; Li, Xiaomeng*",oral,,,,,,,,, VIPER: Visual Inference via Program Execution for Reasoning,"Surís, Dídac*; Menon, Sachit; Vondrick, Carl",oral,,,,,,,,, Improved Visual Fine-tuning with Natural Language Supervision,"Wang, Junyang; Xu, 
Yuanhong; Hu, Juhua; Yan, Ming; Sang, Jitao; Qian, Qi*",oral,2304.01489,https://arxiv.org/abs/2304.01489,https://github.com/idstcv/TeS,https://huggingface.co/papers/2304.01489,,,,6,0 Preparing the Future for Continual Semantic Segmentation,"Lin, Zihan; Wang, Zilei*; Zhang, Yixin",oral,,,,,,,,, MAP: Towards Balanced Generalization of IID and OOD through Model-Agnostic Adapters,"Zhang, Min*; Yuan, Junkun; He, Yue; Li, Wenbin; Chen, Zhengyu; Kuang, Kun",oral,,,,,,,,, Space-time Prompting for Video Class-incremental Learning,"Pei, Yixuan*; Qing, Zhiwu; Zhang, Shiwei; Wang, Xiang; Zhang, Yingya; Zhao, Deli; Qian, Xueming",oral,,,,,,,,, Chinese Text Recognition with A Pre-Trained CLIP-like Model Through Image-IDS Aligning,"Yu, Haiyang; Wang, Xiaocong; Li, Bin*; Xue, Xiangyang",oral,,,,,,,,, OmniLabel: A Challenging Benchmark for Language-Based Object Detection,"Schulter, Samuel*; Kumar B G, Vijay; Suh, Yumin; Dafnis, Konstantinos M. Rafail; Zhang, Zhixing; Zhao, Shiyu; Metaxas, Dimitris N.",oral,2304.11463,https://arxiv.org/abs/2304.11463,,https://huggingface.co/papers/2304.11463,,,,7,0 IntentQA: Context-aware Video Intent Reasoning,"Li, Jiapeng; Wei, Ping; Han, Wenjuan; Fan, Lifeng*",oral,,,,,,,,, Sigmoid loss for Language Image Pre-training,"Zhai, Xiaohua*; Mustafa, Basil; Kolesnikov, Alexander; Beyer, Lucas",oral,2303.15343,https://arxiv.org/abs/2303.15343,,https://huggingface.co/papers/2303.15343,,,,4,0 What does CLIP know about a red circle? 
Visual prompt engineering for VLMs,"Shtedritski, Aleksandar*; Rupprecht, Christian; Vedaldi, Andrea",oral,2304.06712,https://arxiv.org/abs/2304.06712,,https://huggingface.co/papers/2304.06712,,,,3,0 Equivariance Similarity of Vision-Language Foundation Models,"Wang, Tan*; Lin, Kevin; Li, Linjie; Lin, Chung-Ching; Yang, Zhengyuan; Zhang, Hanwang; Liu, Zicheng; Wang, Lijuan",oral,,,,,,,,, Scaling Data Generation in Vision-and-Language Navigation,"Wang, Zun*; Li, Jialu; Hong, Yicong; Wang, Yi; Wu, Qi; Bansal, Mohit; Gould, Stephen; Tan, Hao; Qiao, Yu",oral,2307.15644,https://arxiv.org/abs/2307.15644,,https://huggingface.co/papers/2307.15644,,,,9,0 Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer,"Su, Shenghan; Gu, Lin*; Yang, Yue; Zhang, Zenghui; Harada, Tatsuya",oral,2212.03434,https://arxiv.org/abs/2212.03434,https://github.com/ryeocthiv/CQFormer,https://huggingface.co/papers/2212.03434,,,,5,0 Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory,"Li, Hongxiang*; Cao, Meng; Cheng, Xuxin; Li, Yaowei; Zhu, Zhihong; Zou, Yuexian",oral,,,,,,,,, Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation,"Cui, Yibo*; Xie, Liang; Zhang, Yakun; Zhang, Meishan; Yan, Ye; Yin, Erwei",oral,2308.12587,https://arxiv.org/abs/2308.12587,,https://huggingface.co/papers/2308.12587,,,,6,0 Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment,"Ibrahimi, Sarah*; Sun, Xiaohang; Wang, Pichao; Garg, Amanmeet; Sanan, Ashutosh; Omar, Mohamed",oral,2307.12964,https://arxiv.org/abs/2307.12964,,https://huggingface.co/papers/2307.12964,,,,6,0 Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities,"Hu, Hexiang*; Luan, Yi; Chen, Yang; Khandelwal, Urvashi; Joshi, Mandar; Lee, Kenton; Chang, Mingwei; Toutanova, Kristina N",oral,2302.11154,https://arxiv.org/abs/2302.11154,,https://huggingface.co/papers/2302.11154,,,,8,1 Noise2Info: Noisy 
Image to Information of Noise for Self-Supervised Image Denoising,"Wang, Jiachuan; Di, Shimin*; Chen, Lei ; Ng, Charles Wang Wai",poster,,,,,,,,, Box-based Refinement for Weakly Supervised and Unsupervised Localization Tasks,"Gomel, Eyal; Shaharbany, Tal*; Wolf, Lior",poster,,,,,,,,, Diverse Cotraining Makes Strong Semi-Supervised Segmentor,"Li, Yijiang*; Wang, Xinjiang; Yang, Lihe; Feng, Litong; Zhang, Wayne; Gao, Ying",poster,2308.09281,https://arxiv.org/abs/2308.09281,,https://huggingface.co/papers/2308.09281,,,,6,0 SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning,"Fan, Yue*; Kukleva, Anna; Dai, Dengxin ; Schiele, Bernt",poster,,,,,,,,, Late Stopping: Avoiding Confidently Learning from Mislabeled Examples,"Yuan, Suqin; Feng, Lei; Liu, Tongliang*",poster,2308.13862,https://arxiv.org/abs/2308.13862,,https://huggingface.co/papers/2308.13862,,,,3,0 Ponder: Point Cloud Pre-training via Neural Rendering,"Huang, Di; Peng, Sida; He, Tong*; Yang, Honghui; Zhou, Xiaowei; Ouyang, Wanli",poster,2301.00157,https://arxiv.org/abs/2301.00157,,https://huggingface.co/papers/2301.00157,,,,5,0 Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning,"Song, Kaiyou*; Zhang, Shan; Luo, Zimeng; Wang, Tong; Xie, Jin",poster,2212.06486,https://arxiv.org/abs/2212.06486,,https://huggingface.co/papers/2212.06486,,,,6,0 Stable and Causal Inference for Discriminative Self-supervised Deep Visual Representations,"Yang, Yuewei*; Li, Hai; Chen, Yiran",poster,2308.08321,https://arxiv.org/abs/2308.08321,,https://huggingface.co/papers/2308.08321,,,,3,0 Class Transition Tracking Based Pseudo-Rectifying Guidance for Semi-supervised Learning with Non-random Missing Labels,"Duan, Yue*; Zhao, Zhen; Qi, Lei; Zhou, Luping; Wang, Lei; Shi, Yinghuan",poster,,,,,,,,, Hallucination Improves the Performance of Unsupervised Visual Representation Learning,"Wu, Jing*; Hovakimyan, Naira; Hobbs, 
Jennifer",poster,2307.12168,https://arxiv.org/abs/2307.12168,,https://huggingface.co/papers/2307.12168,,,,3,0 Audiovisual Masked Autoencoders,"Georgescu, Mariana-Iuliana; Fonseca, Eduardo; Ionescu, Radu Tudor; Lucic, Mario; Schmid, Cordelia; Arnab, Anurag*",poster,2212.05922,https://arxiv.org/abs/2212.05922,,https://huggingface.co/papers/2212.05922,,,,6,0 PADCLIP: Pseudo-labeling with Adaptive Debiasing in CLIP for Unsupervised Domain Adaptation,"Lai, Zhengfeng; Vesdapunt, Noranart*; Zhou, Ning; Wu, Jun; Huynh, Cong Phuoc; Li, Xuelu; Fu, Kah Kuen; Chuah, Chen-Nee ",poster,,,,,,,,, Removing Anomalies as Noises for Industrial Defect Localization,"Lu, Fanbin*; Yao, Xufeng; Fu, Chi-Wing; Jia, Jiaya",poster,,,,,,,,, SparseMAE: Sparse Training Meets Masked Autoencoders,"Zhou, Aojun; Li, Yang; Liu, Jianbo; Pan, Junting; Zhang, Renrui; Gao, Peng; Zhao, Rui; Li, Hongsheng*; Qin, Zipeng",poster,,,,,,,,, Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning,"Yang, Lihe*; Zhao, Zhen; Qi, Lei; Qiao, Yu; Shi, Yinghuan; Zhao, Hengshuang",poster,2308.06777,https://arxiv.org/abs/2308.06777,https://github.com/LiheYoung/ShrinkMatch,https://huggingface.co/papers/2308.06777,,,,6,0 RE2: Logic-induced Diagnostic Reasoning for Semi-supervised Semantic Segmentation,"Liang, Chen*; Wang, Wenguan; Miao, Jiaxu; Yang, Yi",poster,,,,,,,,, GasMono: Geometry-Aided Self-Supervised Monocular Depth Estimation for Indoor Scenes,"Zhao, Chaoqiang*; Poggi, Matteo; Tosi, Fabio; Zhou, lei; Sun, Qiyu; Tang, Yang; Mattoccia, Stefano",poster,,,,,,,,, Is Imitation All You Need? 
Generalized Decision-Making with Dual-Phase Training,"Wei, Yao; Sun, Yanchao; Zheng, Ruijie; Vemprala, Sai H; Bonatti, Rogerio; Chen, Shuhang; Madaan, Ratnesh; Ba, Zhongjie; Kapoor, Ashish; Ma, Shuang*",poster,2307.07909,https://arxiv.org/abs/2307.07909,,https://huggingface.co/papers/2307.07909,,,,10,0 Benchmarking Low-Shot Robustness to Natural Distribution Shifts,"Singh, Aaditya; Sarangmath, Kartik; Chattopadhyay, Prithvijit*; Hoffman, Judy",poster,2304.11263,https://arxiv.org/abs/2304.11263,,https://huggingface.co/papers/2304.11263,,,,4,1 All4One: Symbiotic Neighbour Contrastive Learning via Self-Attention and Redundancy Reduction,"Estepa, Imanol G*; Nagarajan, Bhalaji; Radeva, Petia; Sarasua, Ignacio",poster,2303.09417,https://arxiv.org/abs/2303.09417,,https://huggingface.co/papers/2303.09417,,,,4,0 Weakly Supervised Learning of Semantic Correspondence through Cascaded Online Correspondence Refinement,"Huang, Yiwen; Sun, Yixuan; Lai, Chenghang; Xu, Qing; Wang, Xiaomei; Shen, Xuli; Ge, Weifeng*",poster,,,,,,,,, Tracking without Label: Unsupervised Multiple Object Tracking via Contrastive Similarity Learning,"Meng, Sha; SHAO, Dian; Guo, Jiacheng; gao, shan*",poster,,,,,,,,, Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need,"Cabannes, Vivien A*; Bottou, Leon; LeCun, Yann; Balestriero, Randall",poster,2303.15256,https://arxiv.org/abs/2303.15256,,https://huggingface.co/papers/2303.15256,,,,4,0 Diffusion Models as Masked Autoencoders,"Wei, Chen*; Mangalam, Karttikeya; Huang, Po-Yao; Li, Yanghao; Fan, Haoqi; Xu, Hu; Wang, Huiyu; Xie, Cihang; Yuille, Alan; Feichtenhofer, Christoph",poster,2304.03283,https://arxiv.org/abs/2304.03283,,https://huggingface.co/papers/2304.03283,,,,10,0 Enhanced Meta Label Correction for Coping with Label Corruption,"Keren Taraday, Mitchell; Baskin, Chaim*",poster,2305.12961,https://arxiv.org/abs/2305.12961,,https://huggingface.co/papers/2305.12961,,,,2,0 Randomized Quantization for Data Agnostic Representation 
Learning,"Wu, Huimin*; Lei, Chenyang; Sun, Xiao; Wang, Peng-Shuai; Chen, Qifeng; Cheng, Kwang-Ting; Lin, Stephen; Wu, Zhirong",poster,,,,,,,,, Prototypes-oriented Transductive Few-shot Learning with Conditional Transport,"Tian, Long; Feng, Jingyi; Chai, Xiaoqiang; Chen, Wenchao*; Wang, Liming; LIU, Xiyang; Chen, Bo",poster,2308.03047,https://arxiv.org/abs/2308.03047,,https://huggingface.co/papers/2308.03047,,,,7,0 Contrastive Learning Relies More on Spatial Inductive Bias Than Supervised Learning: An Empirical Study,"Chen, Jun-Kun*; Tang, Haoran; Zhong, Yuanyi; Wang, Yu-Xiong",poster,,,,,,,,, Pseudo-label Alignment for Semi-supervised Instance Segmentation,"Hu, Jie*; chen, chen; Cao, Liujuan; Zhang, ShengChuan; Shu, Annan; Jiang, Guannan; Ji, Rongrong",poster,2308.05359,https://arxiv.org/abs/2308.05359,https://github.com/hujiecpp/PAIS,https://huggingface.co/papers/2308.05359,,,,7,0 CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision,"Li, Shuo*; he, yue; weiming, zhang; Zhang, Wei; Tan, Xiao; Han, Junyu; Ding, Errui; Wang, Jingdong",poster,,,,,,,,, Pixel-Wise Contrastive Distillation,"Huang, Junqiang*; Guo, Zichao",poster,2211.00218,https://arxiv.org/abs/2211.00218,,https://huggingface.co/papers/2211.00218,,,,2,0 Rethinking Safe SSL: Transferring the Open-set Problem to A Close-set One,"ma, qiankun*; Gao, Jiyao; Zhan, Bo; Guo, Yunpeng; Zhou, Jiliu; Wang, Yan",poster,,,,,,,,, Towards Open-Set Test-Time Adaptation Utilizing the Wisdom of Crowds in Entropy Minimization,"Lee, Jungsoo*; Das, Debasmit; Choo, Jaegul; Choi, Sungha",poster,2308.06879,https://arxiv.org/abs/2308.06879,,https://huggingface.co/papers/2308.06879,,,,4,0 Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection,"Li, Jiaming; Lin, Xiangru; Zhang, Wei; Tan, Xiao; Li, Yingying; Han, Junyu; Ding, Errui; Wang, Jingdong; Li, Guanbin*",poster,,,,,,,,, Remembering Normality: Memory-guided Knowledge Distillation for Unsupervised Anomaly 
Detection,"Gu, Zhihao*; Liu, Liang; Chen, Xu; Yi, Ran; Zhang, Jiangning; Wang, Yabiao; Wang, Chengjie; Shu, Annan; Jiang, Guannan; Ma, Lizhuang",poster,,,,,,,,, Semi-Supervised Learning via Weight-aware Distillation under Class Distribution Mismatch,"Du, Pan; Zhao, Suyun*; Zisen, Sheng; Li, Cuiping; Chen, Hong",poster,2308.11874,https://arxiv.org/abs/2308.11874,https://github.com/RUC-DWBI-ML/research/tree/main/WAD-master,https://huggingface.co/papers/2308.11874,,,,5,0 Label Shift Adapter for Test-Time Adaptation under Covariate and Label Shifts,"Park, Sunghyun *; Yang, Seunghan; Choo, Jaegul; Yun, Sungrack",poster,2308.08810,https://arxiv.org/abs/2308.08810,,https://huggingface.co/papers/2308.08810,,,,4,0 GraphMatch: Semi-Supervised Learning with Graph Consistency,"Zheng, Mingkai*; You, Shan; Huang, Lang; luo, chen; Wang, Fei; Qian, Chen; Xu, Chang",poster,,,,,,,,, Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples,"Lee, JoonHo*; Woo, Jae Oh; Moon, Hankyu; Lee, Kwonho",poster,2307.10062,https://arxiv.org/abs/2307.10062,,https://huggingface.co/papers/2307.10062,,,,4,0 Learning by Sorting: Self-supervised Learning with Group Ordering Constraints,"Shvetsova, Nina*; Petersen, Felix; Kukleva, Anna; Schiele, Bernt; Kuehne, Hilde",poster,2301.02009,https://arxiv.org/abs/2301.02009,,https://huggingface.co/papers/2301.02009,,,,5,0 L-DAWA: Layer-wise Divergence Aware Weight Aggregation in Federated Self-Supervised Visual Representation Learning,"Rehman, Yasar; Gao, Yan*; Porto Buarque de Gusmao, Pedro; Alibeigi, Mina; Shen, Jiajun; Lane, Nicholas",poster,,,,,,,,, Class-relation Knowledge Distillation for Novel Class Discovery,"Zhang, Chuyu*; Gu, Peiyan; Xu, Ruijie; He, Xuming",poster,2307.09158,https://arxiv.org/abs/2307.09158,https://github.com/kleinzcy/Cr-KD-NCD,https://huggingface.co/papers/2307.09158,,,,4,0 Representation Uncertainty in Self-Supervised Learning as Variational Inference,"Nakamura, 
Hiroki*; Okada, Masashi; Taniguchi, Tadahiro",poster,2203.11437,https://arxiv.org/abs/2203.11437,,https://huggingface.co/papers/2203.11437,,,,3,0 Point-TTA: Test-Time Adaptation for Point Cloud Registration Using Multitask Meta-Auxiliary Learning,"Hatem, Ahmed*; Qian, Yiming; Wang, Yang",poster,,,,,,,,, Adaptive Similarity Bootstrapping for Self-Distillation based Representation Learning,"Lebailly, Tim*; Stegmüller, Thomas; Bozorgtabar, Behzad; Thiran, Jean-Philippe; Tuytelaars, Tinne",poster,,,,,,,,, Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos,"Sheng, Xiaoxiao*; Shen, Zhiqiang; Xiao, Gang; Wang, Longguang; Guo, Yulan; Fan, Hehe",poster,2308.09247,https://arxiv.org/abs/2308.09247,,https://huggingface.co/papers/2308.09247,,,,6,0 MHCN: A Hyperbolic Neural Network Model for Multi-view Hierarchical Clustering,"Lin, Fangfei*; Bai, Bing; Guo, Yiwen; Chen, Hao; Ren, Yazhou; Xu, Zenglin",poster,,,,,,,,, TimeTuning: Unsupervised Dense Representation Learning from Videos,"Salehi, Mohammadreza*; Gavves, Efstratios; Snoek, Cees; Asano, Yuki M",poster,,,,,,,,, To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation,"Botet Colomer, Marc; Dovesi, Pier Luigi; Panagiotakopoulos, Theodoros; Carvalho, J. 
Frederico; Härenstam-Nielsen, Linus; Azizpour, Hossein; Kjellström, Hedvig; Cremers, Daniel; Poggi, Matteo*",poster,2307.15063,https://arxiv.org/abs/2307.15063,,https://huggingface.co/papers/2307.15063,,,,9,5 Simple and Effective Out-of-Distribution Detection via Cosine-based Softmax Loss,"Noh, SoonCheol*; Jeong, DongEon; Lee, Jee-Hyong",poster,,,,,,,,, MixBag: Bag-Level Data Augmentation for Learning from Label Proportions,"Asanomi, Takanori*; Matsuo, Shinnosuke; Suehiro, Daiki; Bise, Ryoma ",poster,2308.08822,https://arxiv.org/abs/2308.08822,,https://huggingface.co/papers/2308.08822,,,,4,0 Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos,"Shen, Zhiqiang*; Sheng, Xiaoxiao; Fan, Hehe; Wang, Longguang; Guo, Yulan; Liu, Qiong; Wen, Hao; Zhou, Xi",poster,2308.09245,https://arxiv.org/abs/2308.09245,,https://huggingface.co/papers/2308.09245,,,,8,0 Parametric Classification for Generalized Category Discovery: A Baseline Study,"Wen, Xin*; Zhao, Bingchen; Qi, Xiaojuan",poster,2211.11727,https://arxiv.org/abs/2211.11727,https://github.com/CVMI-Lab/SimGCD,https://huggingface.co/papers/2211.11727,,,,3,1 Object-centric Multiple Object Tracking,"Zhao, Zixu*; Wang, Jiaze; Horn, Max; Ding, Yizhuo; He, Tong; Bai, Zechen; Shuai, Bing; Zietlow, Dominik; Simon-Gabriel, Carl-Johann; Tu, Zhuowen; Brox, Thomas; Schiele, Bernt; Fu, Yanwei; Xiao, Tianjun; Locatello, Francesco; Zhang, Zheng",poster,,,,,,,,, Locating Noise is Halfway Denoising for Semi-Supervised Segmentation,"Fang, Yan*; Zhu, Feng; Cheng, Bowen; Liu, Luoqi; Zhao, Yao; Wei, Yunchao",poster,,,,,,,,, Learning Semi-supervised Gaussian Mixture Models for Generalized Category Discovery,"Zhao, Bingchen*; Wen, Xin; Han, Kai",poster,2305.06144,https://arxiv.org/abs/2305.06144,,https://huggingface.co/papers/2305.06144,,,,3,1 Learning Multiscale 3D-consistent Features from Posed Images,"Kloepfer, Dominik A*; Campbell, Dylan; Henriques, Joao F",poster,,,,,,,,, Stable Cluster Discrimination 
for Deep Clustering,"Qian, Qi*",poster,,,,,,,,, Cross-modal Scalable Hierarchical Clustering in Hyperbolic space,"Long, Teng*; Noord, Nanne van",poster,,,,,,,,, Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision,"Dong, Shichao*; Li, Ruibo; Wei, Jiacheng; Liu, Fayao; Lin, Guosheng",poster,2208.05110,https://arxiv.org/abs/2208.05110,,https://huggingface.co/papers/2208.05110,,,,5,1 Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos,"Qian, Rui*; Ding, Shuangrui; Liu, Xian; Lin, Dahua",poster,2308.09951,https://arxiv.org/abs/2308.09951,https://github.com/shvdiwnkozbw/SMTC,https://huggingface.co/papers/2308.09951,,,,4,0 Proxy Anchor-based Unsupervised Learning for Continuous Generalized Category Discovery,"Kim, Hyungmin*; Suh, Sungho; Kim, Daehwan; Jeong, Daun; Cho, Hansang; Kim, Junmo",poster,2307.10943,https://arxiv.org/abs/2307.10943,,https://huggingface.co/papers/2307.10943,,,,6,0 DreamTeacher: Pretraining Image Backbones with Deep Generative Models,"Li, Daiqing*; Ling, Huan; Kar, Amlan; Acuna, David ; Kim, Seung Wook; Kreis, Karsten; Torralba, Antonio; Fidler, Sanja",poster,2307.07487,https://arxiv.org/abs/2307.07487,,https://huggingface.co/papers/2307.07487,,,,8,4 MATE: Masked Autoencoders are Online 3D Test-Time Learners,"Mirza, Muhammad Jehanzeb*; Shin, Inkyu; Lin, Wei; Schriebl, Andreas; Sun, Kunyang; Choe, Jaesung; Kozinski, Mateusz; Possegger, Horst; Kweon, In So; Yoon, Kuk-Jin; Bischof, Horst",poster,2211.11432,https://arxiv.org/abs/2211.11432,,https://huggingface.co/papers/2211.11432,,,,11,1 PADDLES: Phase-Amplitude Spectrum Disentangled Early Stopping for Learning with Noisy Labels,"Huang, Huaxi *; Kang, Hui; Liu, Sheng; Salvado, Olivier; rakotoarivelo, thierry; Wang, Dadong; Liu, Tongliang",poster,2212.03462,https://arxiv.org/abs/2212.03462,,https://huggingface.co/papers/2212.03462,,,,7,0 Calibrating Uncertainty for Semi-Supervised Crowd Counting,"LI, 
CHEN*; Hu, Xiaoling; Abousamra, Shahira; Chen, Chao",poster,2308.09887,https://arxiv.org/abs/2308.09887,,https://huggingface.co/papers/2308.09887,,,,4,0 Test Time Adaptation for Blind Image Quality Assessment,"Roy, Subhadeep*; Mitra, Shankhanil; Biswas, Soma ; Soundararajan, Rajiv",poster,2307.14735,https://arxiv.org/abs/2307.14735,,https://huggingface.co/papers/2307.14735,,,,4,0 Deep Multiview Clustering by Contrasting Cluster Assignments,"Chen, Jie; Mao, Hua; Woo, Wai Lok; Peng, Xi*",poster,2304.10769,https://arxiv.org/abs/2304.10769,,https://huggingface.co/papers/2304.10769,,,,4,0 Re:PolyWorld - A Graph Neural Network for Polygonal Scene Parsing,"Zorzi, Stefano*; Fraundorfer, Friedrich",poster,,,,,,,,, Satlas: A Large-Scale Dataset for Remote Sensing Image Understanding,"Bastani, Favyen*; Wolters, Piper S; Gupta, Ritwik; Ferdinando, Joseph G; Kembhavi, Aniruddha",poster,,,,,,,,, Large-Scale Land Cover Mapping with Fine-Grained Classes via Class-Aware Semi-Supervised Semantic Segmentation,"Dong, Runmin*; Mou, Lichao; Chen, Mengxuan; Li, Weijia; Tong, Xin-Yi; Yuan, Shuai; Zhang, Lixian; Zheng, Juepeng; Zhu, Xiaoxiang; Fu, Haohuan",poster,,,,,,,,, Large Selective Kernel Network for Remote Sensing Object Detection,"Li, Yuxuan*; Hou, Qibin; Zheng, Zhaohui; Cheng, Ming-Ming; Yang, Jian; Li, Xiang",poster,2303.09030,https://arxiv.org/abs/2303.09030,https://github.com/zcablii/Large-Selective-Kernel-Network,https://huggingface.co/papers/2303.09030,,,,6,0 GFM: Building Geospatial Foundation Models via Continual Pretraining,"Mendieta, Matias*; Han, Boran; Shi, Xingjian; Zhu, Yi; Chen, Chen",poster,2302.04476,https://arxiv.org/abs/2302.04476,,https://huggingface.co/papers/2302.04476,,,,5,0 Regularized Primitive Graph Learning for Unified Vector Mapping,"Wang, Lei*; Dai, Min; He, Jianan; Huang, Jingwei",poster,,,,,,,,, Class Prior-Free Positive-Unlabeled Learning with Taylor Variational Loss for Hyperspectral Remote Sensing Imagery,"Zhao, Hengwei*; Wang, Xinyu; Li, Jingtao; 
Zhong, Yanfei",poster,2308.15081,https://arxiv.org/abs/2308.15081,https://github.com/Hengwei-Zhao96/T-HOneCls,https://huggingface.co/papers/2308.15081,,,,4,0 MapFormer: Boosting Change Detection by Using Pre-change Information,"Bernhard, Maximilian*; Strauß, Niklas A; Schubert, Matthias",poster,2303.17859,https://arxiv.org/abs/2303.17859,https://github.com/mxbh/mapformer,https://huggingface.co/papers/2303.17859,,,,3,0 Sample4Geo: Hard Negative Sampling For Cross-View Geo-Localisation,"Deuser, Fabian*; Habel, Konrad; Oswald, Norbert",poster,2303.11851,https://arxiv.org/abs/2303.11851,,https://huggingface.co/papers/2303.11851,,,,3,2 PanFlowNet: A Flow-Based Deep Network for Pan-sharpening,"Yang, Gang*; Cao, Xiangyong; xiao, wenzhe; zhou, man; Liu, Aiping; Chen, Xun; Meng, Deyu",poster,2305.07774,https://arxiv.org/abs/2305.07774,,https://huggingface.co/papers/2305.07774,,,,7,0 Seeing Beyond the Patch: Scale-Adaptive Semantic Segmentation of High-resolution Remote Sensing Imagery based on Reinforcement Learning,"Liu, Yinhe*; Shi, Sunan; Wang, Junjue; Zhong, Yanfei",poster,,,,,,,,, AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing,"Tao, Lvfang; Gao, Wei*; Li, Ge; Zhang, Chenhao",poster,,,,,,,,, Rethinking Vision Transformers for MobileNet Size and Speed,"Li, Yanyu; Hu, Ju; Wen, Yang; Evangelidis, Georgios; Salahi, Kamyar; Wang, Yanzhi; Tulyakov, Sergey; Ren, Jian*",poster,2212.08059,https://arxiv.org/abs/2212.08059,,https://huggingface.co/papers/2212.08059,,,,8,0 DELFlow: Dense Efficient Learning of Scene Flow for Large-Scale Point Clouds,"Peng, Chensheng; Wang, Guangming; Lo, Xian Wan ; Wu, Xinrui; Xu, Chenfeng; TOMIZUKA, Masayoshi; Zhan, Wei; Wang, Hesheng*",poster,2308.04383,https://arxiv.org/abs/2308.04383,,https://huggingface.co/papers/2308.04383,,,,8,0 Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers,"Dutson, Matthew*; Li, Yin; Gupta, 
Mohit",poster,2308.13494,https://arxiv.org/abs/2308.13494,,https://huggingface.co/papers/2308.13494,,,,3,1 Inherent Redundancy in Spiking Neural Networks,"Yao, Man*; Hu, Jiakui; Zhao, Guangshe; Wang, Yaoyuan; Zhang, Ziyang ; Xu, Bo; Li, Guoqi",poster,2308.08227,https://arxiv.org/abs/2308.08227,https://github.com/BICLab/ASA-SNN,https://huggingface.co/papers/2308.08227,,,,7,0 Achievement-based Training Progress Balancing for Multi-Task Learning,"YUN, hayoung; Cho, Hanjoo*",poster,,,,,,,,, Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation,"Ding, Shuangrui*; Zhao, Peisen; zhang, xiaopeng; Qian, Rui; Xiong, Hongkai; Tian, Qi",poster,2308.04549,https://arxiv.org/abs/2308.04549,https://github.com/Mark12Ding/STA,https://huggingface.co/papers/2308.04549,,,,6,0 Differentiable Transportation Pruning,"Li, Yunqiang*; van Gemert, Jan C; Hoefler, Torsten; Moons, Bert; Eleftheriou, Evangelos; Verhoef, Bram-Ernst",poster,2307.08483,https://arxiv.org/abs/2307.08483,,https://huggingface.co/papers/2307.08483,,,,6,0 XiNet: Efficient Neural Networks for tinyML,"Ancilotto, Alberto*; Paissan, Francesco; Farella, Elisabetta",poster,,,,,,,,, Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers,"Frumkin, Natalia*; Gope, Dibakar; Marculescu, Diana",poster,2308.10814,https://arxiv.org/abs/2308.10814,https://github.com/enyac-group/evol-q,https://huggingface.co/papers/2308.10814,,,,3,1 A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance,"Colbert, Ian*; Pappalardo, Alessandro; Petri-Koenig, Jakoba",poster,2308.13504,https://arxiv.org/abs/2308.13504,,https://huggingface.co/papers/2308.13504,,,,3,0 Workie-Talkie: Accelerating Federated Learning by Overlapping Computing and Communications via Contrastive Regularization,"Chen, Rui*; Wan, Qiyu ; Prakash, Pavana; Zhang, Lan; Yuan, Xu; Gong, Yanmin; Fu, Xin; Pan, Miao",poster,,,,,,,,, DenseShift: Towards Accurate and Transferable Low-Bit Shift Network,"Li, Xinlin*; Liu, Bang; Yang, 
Rui Heng; Courville, Vanessa; Xing, Chao; Partovi Nia, Vahid",poster,2208.09708,https://arxiv.org/abs/2208.09708,,https://huggingface.co/papers/2208.09708,,,,6,0 PRANC: Pseudo RAndom Networks for Compacting deep models,"Nooralinejad, Parsa*; Abbasi, Ali; Abbasi Koohpayegani, Soroush; Khan, Rana Muhammad Shahroz; Pourahmadi Meibodi, Kossar; Kolouri, Soheil; Pirsiavash, Hamed",poster,2206.08464,https://arxiv.org/abs/2206.08464,https://github.com/UCDvision/PRANC,https://huggingface.co/papers/2206.08464,,,,7,0 "Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement","Faghri, Fartash*; Pouransari, Hadi; Mehta, Sachin; Farajtabar, Mehrdad; Farhadi, Ali; Rastegari, Mohammad; Tuzel, Oncel",poster,2303.08983,https://arxiv.org/abs/2303.08983,https://github.com/apple/ml-dr,https://huggingface.co/papers/2303.08983,,,,7,0 A Fast Unified System for 3D Object Detection and Tracking,"Heitzinger, Thomas TH*; Kampel, Martin",poster,,,,,,,,, Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training,"Wu, Xiao-Ming; Zheng, Dian; Liu, Zuhao; ZHENG, WEI-SHI*",poster,2308.06689,https://arxiv.org/abs/2308.06689,,https://huggingface.co/papers/2308.06689,,,,4,0 I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference,"Li, Zhikai*; Gu, Qingyi",poster,,,,,,,,, EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization,"Dong, Peijie; Li, Lujun; Wei, Zimian; Niu, Xin*; TIAN, ZHILIANG; Pan, Hengyue ",poster,2307.10554,https://arxiv.org/abs/2307.10554,,https://huggingface.co/papers/2307.10554,,,,6,0 Local or Global: Selective Knowledge Assimilation for Federated Learning with Limited Labels,"Cho, Yae Jee*; Joshi, Gauri; Dimitriadis, Dimitrios",poster,2307.08809,https://arxiv.org/abs/2307.08809,,https://huggingface.co/papers/2307.08809,,,,3,0 An Efficient Dataset Distillation with Attention Matching,"Sajedi, Ahmad*; Khaki, Samir; Amjadian, Ehsan; Liu, Lucy; 
Lawryshyn, Yuri; Plataniotis, Konstantinos N",poster,,,,,,,,, SAFE: Machine Unlearning With Shard Graphs,"Dukler, Yonatan*; Bowman, Benjamin; Achille, Alessandro; Golatkar, Aditya Sharad; Swaminathan, Ashwin; Soatto, Stefano",poster,2304.13169,https://arxiv.org/abs/2304.13169,,https://huggingface.co/papers/2304.13169,,,,6,0 ResQ: Residual Quantization for Video Perception,"Abati, Davide*; Ben Yahia, Haitam; Nagel, Markus; Habibian, Amirhossein",poster,2308.09511,https://arxiv.org/abs/2308.09511,,https://huggingface.co/papers/2308.09511,,,,4,0 Efficient Computation Sharing for Multi-Task Visual Scene Understanding,"Shoouri, Sara*; Yang, Mingyu; Fan, Zichen; Kim, Hun Seok",poster,2303.09663,https://arxiv.org/abs/2303.09663,,https://huggingface.co/papers/2303.09663,,,,4,0 Essential Matrix Estimation using Convex Relaxations in Orthogonal Space,"Karimian, Arman*; Tron, Roberto",poster,,,,,,,,, TripLe: Revisiting Pretrained Model Reuse and Progressive Learning for Efficient Vision Transformer Scaling and Searching,"Fu, Cheng*; Huang, Hanxian; Jiang, Zixuan; Ni, Yun; Nai, Lifeng; Wu, Gang; Cheng, Liqun; Zhou, Yanqi; Li, Sheng; Li, Andrew; Zhao, Jishen",poster,,,,,,,,, DiffRate : Differentiable Compression Rate for Efficient Vision Transformers,"Chen, Mengzhao*; Shao, Wenqi; Xu, Peng; Lin, Mingbao; Zhang, Kaipeng; Chao, Fei; Ji, Rongrong; Qiao, Yu; Luo, Ping",poster,2305.17997,https://arxiv.org/abs/2305.17997,https://github.com/OpenGVLab/DiffRate,https://huggingface.co/papers/2305.17997,,,,9,0 Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection,"Yang, Longrong; Zhou, Xianpan; Li, Xuewei; Qiao, Liang; Li, Zheyang; Yang, Ziwei; Wang, Gaoang; Li, Xi*",poster,2308.14286,https://arxiv.org/abs/2308.14286,https://github.com/TinyTigerPan/BCKD,https://huggingface.co/papers/2308.14286,,,,8,0 From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels,"Yang, Zhendong*; Zeng, Ailing; 
Li, Zhe; Zhang, Tianke; Yuan, Chun; Li, Yu",poster,2303.13005,https://arxiv.org/abs/2303.13005,https://github.com/yzd-v/cls_KD,https://huggingface.co/papers/2303.13005,,,,6,1 Efficient 3D Semantic Segmentation with Superpoint Transformer,"ROBERT, Damien*; Raguet, Hugo; Landrieu, Loic",poster,2306.08045,https://arxiv.org/abs/2306.08045,,https://huggingface.co/papers/2306.08045,,,,3,2 Dataset Quantization,"Zhou, Daquan; Wang, Kai*; Gu, Jianyang; Peng, Xiangyu; Lian, Dongze; Zhang, Yifan; You, Yang; Feng, Jiashi",poster,2308.10524,https://arxiv.org/abs/2308.10524,,https://huggingface.co/papers/2308.10524,,,,8,0 Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy,"Jie, Shibo*; Wang, Haoqing; Deng, Zhi-Hong",poster,2307.16867,https://arxiv.org/abs/2307.16867,https://github.com/JieShibo/PETL-ViT,https://huggingface.co/papers/2307.16867,,,,3,0 RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers,"Li, Zhikai*; Xiao, Junrui; Yang, Lianwei; Gu, Qingyi",poster,,,,,,,,, Semantically Structured Image Compression via Irregular Group-Based Decoupling,"Feng, Ruoyu*; Gao, Yixin; Jin, Xin; Feng, Runsen; Chen, Zhibo",poster,2305.02586,https://arxiv.org/abs/2305.02586,,https://huggingface.co/papers/2305.02586,,,,5,0 SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage,"Park, Song; Chun, Sanghyuk*; Heo, Byeongho; Kim, Wonjae; Yun, Sangdoo",poster,2303.11114,https://arxiv.org/abs/2303.11114,https://github.com/naver-ai/seit,https://huggingface.co/papers/2303.11114,,,,5,1 SMMix: Self-Motivated Image Mixing for Vision Transformers,"Chen, Mengzhao*; Lin, Mingbao; Lin, Zhihang; Zhang, Yuxin; Chao, Fei; Ji, Rongrong",poster,2212.12977,https://arxiv.org/abs/2212.12977,https://github.com/ChenMnZ/SMMix,https://huggingface.co/papers/2212.12977,,,,6,0 Multi-Label Knowledge Distillation,"Yang, Penghui*; Xie, Ming-Kun; Zong, Chen-Chen; Feng, Lei; Niu, Gang; Sugiyama, Masashi; Huang, 
Sheng-Jun",poster,2308.06453,https://arxiv.org/abs/2308.06453,https://github.com/penghui-yang/L2D,https://huggingface.co/papers/2308.06453,,,,7,1 UGC: Unified GAN Compression for Efficient Image-to-Image Translation ,"Ren, Yuxi*; Wu, Jie; Zhang, Peng; Zhang, Manlin; Xiao, Xuefeng; He, Qian; Wang, Rui; Zheng, Min ; Pan, Xin",poster,,,,,,,,, MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos,"Parger, Mathias*; Tang, Chengcheng; Neff, Thomas; Twigg, Christopher D; Keskin, Cem; Wang, Robert; Steinberger, Markus",poster,2210.09887,https://arxiv.org/abs/2210.09887,,https://huggingface.co/papers/2210.09887,,,,7,0 Lightweight Multi-Scale Attention for On-Device Semantic Segmentation,"Cai, Han*; Li, Junyan; Hu, Muyan; Gan, Chuang; Han, Song",poster,,,,,,,,, DREAM: Efficient Dataset Distillation by Representative Matching,"Liu, Yanqing*; Gu, Jianyang; Wang, Kai; Zhu, Zheng; Jiang, Wei; You, Yang",poster,2302.14416,https://arxiv.org/abs/2302.14416,,https://huggingface.co/papers/2302.14416,,,,6,0 INSTA-BNN: Binary Neural Network with INSTAnce-aware Threshold,"Lee, Changhun; Kim, Hyungjun; Park, Eunhyeok; Kim, Jae-Joon*",poster,,,,,,,,, Deep Incubation: Training Large Models by Divide-and-Conquering,"Ni, Zanlin*; Wang, Yulin; Yu, Jiangwei; Jiang, Haojun; Cao, Yue; Huang, Gao",poster,2212.04129,https://arxiv.org/abs/2212.04129,https://github.com/LeapLabTHU/Deep-Incubation,https://huggingface.co/papers/2212.04129,,,,6,0 AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts,"Chen, Tianlong*; Chen, Xuxi; Du, Xianzhi; Rashwan, Abdullah; Yang, Fan; Chen, Huizhong; Wang, Zhangyang; Li, Yeqing",poster,,,,,,,,, Overcoming Forgetting Catastrophe in Quantization-Aware Training,"Chen, Ting-An*; Yang, De-Nian; Chen, Ming-Syan",poster,,,,,,,,, Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep Ensembles are More Efficient than Single Models,"Xia, Guoxuan*; Bouganis, 
Christos-Savvas",poster,2303.08010,https://arxiv.org/abs/2303.08010,https://github.com/Guoxoug/window-early-exit,https://huggingface.co/papers/2303.08010,,,,2,0 ORC: Network Group-based Knowledge Distillation using Online Role Change,"Choi, Junyong; Cho, Hyeon; Cheung, Seokhwa; Hwang, Wonjun*",poster,2206.01186,https://arxiv.org/abs/2206.01186,,https://huggingface.co/papers/2206.01186,,,,4,0 RMP-Loss: Regularizing Membrane Potential Distribution for Spiking Neural Networks,"Guo, Yufei*; Zhang, Liwen; Chen, Yuanpei; Liu, Xiaode; peng, weihang; Zhang, Yuhan; Huang, Xuhui; Ma, Zhe",poster,,,,,,,,, Structural Alignment for Network Pruning through Partial Regularization,"Gao, Shangqian*; Zhang, Zeyu; Zhang, Yanfu; Huang, Feihu; Huang, Heng",poster,,,,,,,,, Automated Knowledge Distillation via Monte Carlo Tree Search,"Li, Lujun*; Dong, Peijie; Wei, Zimian; Ya, Yang",poster,,,,,,,,, SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications,"Shaker, Abdelrahman M*; Maaz, Muhammad; Rasheed, Hanoona Abdul; Khan, Salman; Yang, Ming-Hsuan; Shahbaz Khan, Fahad",poster,2303.15446,https://arxiv.org/abs/2303.15446,https://github.com/Amshaker/SwiftFormer,https://huggingface.co/papers/2303.15446,,,,6,0 Causal-DFQ: Causality Guided Data-free Network Quantization,"Shang, Yuzhang*; Xu, Bingxin; Liu, Gaowen; Kompella, Ramana; Yan, Yan",poster,,,,,,,,, Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks,"Xu, Kaixin*; Wang, Zhe; Geng, Xue; Wu, Min; Li, Xiaoli; Lin, Weisi",poster,2308.10438,https://arxiv.org/abs/2308.10438,https://github.com/Akimoto-Cris/RD_VIT_PRUNE,https://huggingface.co/papers/2308.10438,,,,7,0 Automatic Network Pruning via Hilbert-Schmidt Independence Criterion Lasso under Information Bottleneck Principle,"guo, song*; Zhang, Lei; Zheng, Xiawu; Wang, Yan; Li, Yuchao; Chao, Fei; Zhang, ShengChuan; Wu, Chenglin; Ji, Rongrong",poster,,,,,,,,, Distribution Shift Matters for Knowledge Distillation 
with Webly Collected Images,"Tang, Jialiang; Chen, Shuo; Niu, Gang; Sugiyama, Masashi; Gong, Chen*",poster,2307.11469,https://arxiv.org/abs/2307.11469,,https://huggingface.co/papers/2307.11469,,,,5,0 FastRecon: Few-shot Industrial Anomaly Detection via Fast Feature Reconstruction,"Zheng, Fang; Wang, Xiaoyang; HaoCheng, Li; Liu, Jiejie; Hu, Qiugui; Xiao, Jimin*",poster,,,,,,,,, E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning,"Han, Cheng*; Wang, Qifan; Cui, Yiming; Cao, Zhiwen; Wang, Wenguan; Qi, Siyuan; Liu, Dongfang",poster,2307.13770,https://arxiv.org/abs/2307.13770,https://github.com/ChengHan111/E2VPT,https://huggingface.co/papers/2307.13770,,,,7,0 Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation,"Xu, Zunnan; Chen, Zhihong; Zhang, Yong; Song, Yibing; Wan, Xiang; Li, Guanbin*",poster,2307.11545,https://arxiv.org/abs/2307.11545,https://github.com/kkakkkka/ETRIS,https://huggingface.co/papers/2307.11545,,,,6,0 SHACIRA - Scalable HAsh-grid Compression for Implicit Neural Representations,"Girish, Sharath*; Gupta, Kamal; Shrivastava, Abhinav",poster,,,,,,,,, Efficient Deep Space Filling Curve,"Chen, Wanli *; Yao, Xufeng; Zhang, Xinyun; Yu, Bei",poster,,,,,,,,, Q-Diffusion: Quantizing Diffusion Models,"Li, Xiuyu*; Liu, Yijiang; Lian, Long; Yang, Huanrui; Dong, Zhen; Kang, Daniel; Zhang, Shanghang; Keutzer, Kurt",poster,,,,,,,,, Lossy and Lossless (L$^2$) Post-training Model Size Compression,"Shi, Yumeng*; bai, shihao; Wei, Xiuying; Gong, Ruihao; Yang, Jianlei",poster,2308.04269,https://arxiv.org/abs/2308.04269,https://github.com/ModelTC/L2_Compression,https://huggingface.co/papers/2308.04269,,,,5,0 Robustifying Token Attention for Vision Transformers,"Guo, Yong*; Stutz, David; Schiele, Bernt",poster,2303.11126,https://arxiv.org/abs/2303.11126,,https://huggingface.co/papers/2303.11126,,,,3,0 Strivec: Sparse Tri-Vector Radiance Fields,"Xu, Qiangeng; Gao, Quankai*; Su, Hao; Neumann, Ulrich; Xu, 
Zexiang",poster,2307.13226,https://arxiv.org/abs/2307.13226,,https://huggingface.co/papers/2307.13226,,,,5,2 Image Features with Formal Privacy Guarantees,"Pittaluga, Francesco*; Zhuang, Bingbing",poster,,,,,,,,, SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection,"Xie, Yichen*; Xu, Chenfeng; Rakotosaona, Marie-Julie; Rim, Patrick; Tombari, Federico; Keutzer, Kurt; TOMIZUKA, Masayoshi; Zhan, Wei",poster,2304.14340,https://arxiv.org/abs/2304.14340,https://github.com/yichen928/SparseFusion,https://huggingface.co/papers/2304.14340,,,,8,0 Strata-NeRF : Neural Radiance fields for Stratified Scenes,"Dhiman, Ankit*; R, Srinath; Rangwani, Harsh; Parihar, Rishubh; Boregowda, Lokesh; Sridhar, Srinath; RADHAKRISHNAN, Venkatesh Babu",poster,2308.10337,https://arxiv.org/abs/2308.10337,,https://huggingface.co/papers/2308.10337,,,,7,0 "CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception","Kim, Youngseok; Shin, Juyeb; Kim, Sanmin; Lee, In-Jae; Choi, Jun Won; Kum, Dongsuk*",poster,2304.00670,https://arxiv.org/abs/2304.00670,,https://huggingface.co/papers/2304.00670,,,,5,0 LightGlue: Local Feature Matching at Light Speed,"Lindenberger, Philipp*; Sarlin, Paul-Edouard; Pollefeys, Marc",poster,2306.13643,https://arxiv.org/abs/2306.13643,https://github.com/cvg/LightGlue,https://huggingface.co/papers/2306.13643,,,,3,0 ExBluRF: Efficient Radiance Fields for Extreme Motion Blurred Images,"Lee, Dongwoo; Oh, Jeongtaek; Rim, Jaesung; Cho, Sunghyun; Lee, Kyoung Mu*",poster,,,,,,,,, Generalized Differentiable RANSAC,"Wei, Tong*; Patel, Yash; Shekhovtsov, Alexander; Matas, Jiri; Barath, Daniel",poster,2212.13185,https://arxiv.org/abs/2212.13185,https://github.com/weitong8591/differentiable_ransac,https://huggingface.co/papers/2212.13185,,,,5,0 Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells,"Ye, Xinyi; Zhao, Weiyue; Liu, Tianqi; Huang, Zihao; Cao, Zhiguo*; Li, 
Xin",poster,2307.09160,https://arxiv.org/abs/2307.09160,,https://huggingface.co/papers/2307.09160,,,,6,0 Total-Recon: Deformable Scene Reconstruction for Motion-based View Synthesis,"Song, Chonghyuk*; Yang, Gengshan; Deng, Kangle; Zhu, Jun-Yan; Ramanan, Deva",poster,,,,,,,,, Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields,"Wang, Xiangyu; Zhu, Jingsen; Ran, Yunlong; Zhong, Zhihua; Huo, Yuchi; Chen, Jiming; Ye, Qi*",poster,,,,,,,,, PointMBF: A Multi-scale Bidirectional Fusion Network for Unsupervised RGB-D Point Cloud Registration,"Yuan, Mingzhi*; Fu, Kexue; Li, Zhihao; Meng, Yucong; Wang, Manning",poster,,,,,,,,, PARF: Primitive-Aware Radiance Fusion for Indoor Scene Novel View Synthesis,"Ying, Haiyang; Jiang, Baowei; Zhang, Jinzhi; Xu, Di; Yu, Tao*; Dai, Qionghai; Fang, Lu",poster,,,,,,,,, Rethinking Point Cloud Registration as Masking and Reconstruction,"Chen, Guangyan; Wang, Meiling; Yuan, Li; Yang, Yi; Yue, Yufeng *",poster,,,,,,,,, Ada3D : Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection,"Zhao, Tianchen*; Ning, Xuefei; Hong, Ke; Qiu, Zhongyuan; Pu, Lu; Zhang, Linfeng; Zhao, Yali; Zhou, Lipu; Dai, Guohao; Yang, Huazhong; Wang, Yu",poster,2307.08209,https://arxiv.org/abs/2307.08209,,https://huggingface.co/papers/2307.08209,,,,11,0 Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement,"Tang, Jiaxiang*; Zhou, Hang; Chen, Xiaokang; Hu, Tianshu; Ding, Errui; Wang, Jingdong; Zeng, Gang",poster,2303.02091,https://arxiv.org/abs/2303.02091,,https://huggingface.co/papers/2303.02091,,,,7,0 CVRecon: Rethinking 3D Geometric Feature Learning For Neural Reconstruction,"Feng, Ziyue*; Yang, Liang; Guo, Pengsheng; Li, Bing",poster,2304.14633,https://arxiv.org/abs/2304.14633,,https://huggingface.co/papers/2304.14633,,,,4,0 RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction,"Li, Zizhang*; Lyu, Xiaoyang; Ding, Yuanyuan; Wang, Mengmeng; Liao, Yiyi; Liu, 
Yong",poster,2303.08605,https://arxiv.org/abs/2303.08605,,https://huggingface.co/papers/2303.08605,,,,6,1 Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering,"Hu, Dongting; Zhang, Zhenkai; Hou, Tingbo; Liu, Tongliang; Fu, Huan; Gong, Mingming*",poster,2304.10075,https://arxiv.org/abs/2304.10075,,https://huggingface.co/papers/2304.10075,,,,6,0 ELFNet: Evidential Local-global Fusion for Stereo Matching,"Lou, Jieming*; Liu, Weide; Chen, Zhuo; Liu, Fayao; Cheng, Jun",poster,2308.00728,https://arxiv.org/abs/2308.00728,https://github.com/jimmy19991222/ELFNet,https://huggingface.co/papers/2308.00728,,,,5,0 GaPro: Box-Supervised 3D Point Cloud Instance Segmentation Using Gaussian Processes as Pseudo Labelers,"Ngo, Tuan Duc*; Hua, Binh-Son; Nguyen, Khoi",poster,2307.13251,https://arxiv.org/abs/2307.13251,https://github.com/VinAIResearch/GaPro,https://huggingface.co/papers/2307.13251,,,,3,0 Multi-body Depth and Camera Pose Estimation from Multiple Views,"Porfiri Dal Cin, Andrea*; Boracchi, Giacomo; Magri, Luca",poster,,,,,,,,, Reference-guided Controllable Inpainting of Neural Radiance Fields ,"Mirzaei, Ashkan*; Aumentado-Armstrong, Tristan T; Brubaker, Marcus A; Kelly, Jonathan; Levinshtein, Alex; Derpanis, Konstantinos G; Gilitschenski, Igor",poster,,,,,,,,, Retro-FPN: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation,"Xiang, Peng; Wen, Xin; Liu, Yu-Shen*; Zhang, Hui; Fang, Yi; Han, Zhizhong",poster,,,,,,,,, Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding,"Liu, Jihao*; Wang, Tai; Liu, Boxiao; Zhang, Qihang; Liu, Yu; Li, Hongsheng",poster,,,,,,,,, OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception,"Wang, Xiaofeng*; Zhu, Zheng; Xu, Wenbo; Zhang, Yunpeng; Wei, Yi; Chi, Xu; Ye, Yun; Du, Dalong; Lu, Jiwen; Wang, Xingang",poster,2303.03991,https://arxiv.org/abs/2303.03991,,https://huggingface.co/papers/2303.03991,,,,10,0 Surface Normal Clustering for 
Implicit Representation of Manhattan Scenes,"Popović, Nikola*; Paudel, Danda Pani; Van Gool, Luc",poster,2212.01331,https://arxiv.org/abs/2212.01331,https://github.com/nikola3794/normal-clustering-nerf,https://huggingface.co/papers/2212.01331,,,,3,0 Spacetime Regularization for Neural Scene Reconstruction,"Choe, Jaesung*; Choy, Christopher; Park, Jaesik; Kweon, In So; Anandkumar, Animashree",poster,,,,,,,,, LDL: Line Distance Functions for Panoramic Localization,"Kim, Junho*; Choi, Changwoon; Jang, Hojun; Kim, Young Min",poster,2308.13989,https://arxiv.org/abs/2308.13989,https://github.com/82magnolia/panoramic-localization,https://huggingface.co/papers/2308.13989,,,,4,0 Learning Neural Implicit Surfaces with Object-Aware Radiance Fields,"Zhang, Yiheng; Qiu, Zhaofan; Pan, Yingwei*; Yao, Ting; Mei, Tao",poster,,,,,,,,, MonoNeRF: Learning a Generalizable Dynamic Radiance Field from Monocular Videos,"Tian, Fengrui*; Du, Shaoyi; Duan, Yueqi",poster,2212.13056,https://arxiv.org/abs/2212.13056,https://github.com/tianfr/MonoNeRF,https://huggingface.co/papers/2212.13056,,,,3,0 Neural Radiance Field with LiDAR maps,"Chang, MingFang*; Sharma, Akash; Kaess, Michael; Lucey, Simon",poster,,,,,,,,, Deformable Model Driven Neural Rendering for High-fidelity 3D Reconstruction of Human Heads Under Low-View Settings,"Xu, Baixin*; Zhang, Jiarui; Lin, Kwan-Yee; Qian, Chen; He, Ying",poster,,,,,,,,, "DeLiRa: Self-Supervised Depth, Light, and Radiance Fields","Guizilini, Vitor*; Vasiljevic, Igor; Fang, Jiading; Ambruș, Rareș 
A; Zakharov, Sergey; Sitzmann, Vincent; Gaidon, Adrien",poster,2304.02797,https://arxiv.org/abs/2304.02797,,https://huggingface.co/papers/2304.02797,,,,7,0 ATT3D: Amortized Text-to-3D Object Synthesis,"Lorraine, Jonathan P*; Xie, Kevin; Zeng, Xiaohui; Lin, Chen-Hsuan; Takikawa, Towaki; Sharp, Nicholas; Lin, Tsung-Yi; Liu, Ming-Yu; Fidler, Sanja; Lucas, James R",poster,2306.07349,https://arxiv.org/abs/2306.07349,,https://huggingface.co/papers/2306.07349,,,,10,4 ScatterNeRF: Seeing Through Fog with Physically-Based Inverse Neural Rendering,"Bijelic, Mario*; Walz, Stefanie; Ramazzina, Andrea; Sanvito, Alessandro; Scheuble, Dominik; Heide, Felix",poster,2305.02103,https://arxiv.org/abs/2305.02103,,https://huggingface.co/papers/2305.02103,,,,6,0 Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow,"Weinzaepfel, Philippe*; LUCAS, Thomas; Leroy, Vincent; Cabon, Yohann; Arora, Vaibhav; Brégier, Romain; Csurka, Gabriela; Antsfeld, Leonid; Chidlovskii, Boris; Revaud, Jerome",poster,,,,,,,,, Guiding Local Feature Matching with Surface Curvature,"Wang, Shuzhe*; Kannala, Juho; Pollefeys, Marc; Barath, Daniel",poster,,,,,,,,, NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation,"XIE, BAAO*; Li, Bohan; Zhang, Zequn; Dong, Junting; Jin, Xin; Yang, Jingyu; Zeng, Wenjun ",poster,2304.11342,https://arxiv.org/abs/2304.11342,,https://huggingface.co/papers/2304.11342,,,,7,0 Efficient LiDAR Point Cloud Oversegmentation Network,"Hui, Le*; Tang, Linghua; Xie, Jin; Yang, Jian; Dai, Yuchao",poster,,,,,,,,, Iterative Superquadric Recomposition of 3D Objects from Multiple Views,"Alaniz, Stephan*; Mancini, Massimiliano; Akata, Zeynep",poster,,,,,,,,, S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields,"Xie, Zeke*; Yang, Xindi; Yang, Yujie; Sun, Qi; Jiang, Yixiang; Wang, Haoran; Cai, Yunfeng; Sun, 
Mingming",poster,2308.07032,https://arxiv.org/abs/2308.07032,,https://huggingface.co/papers/2308.07032,,,,8,0 LiveHand: Real-time and Photorealistic Neural Hand Rendering,"Mundra, Akshay*; B R , Mallikarjun ; Wang, Jiayi; Habermann, Marc; Theobalt, Christian; Elgharib, Mohamed",poster,2302.07672,https://arxiv.org/abs/2302.07672,,https://huggingface.co/papers/2302.07672,,,,6,0 "Neural-PBIR Reconstruction of Shape, Material, and Illumination","Sun, Cheng; Cai, Guangyan; Li, Zhengqin; Yan, Kai; Zhang, Cheng; Marshall, Carl S; Huang, Jia-Bin; Zhao, Shuang; Dong, Zhao*",poster,2304.13445,https://arxiv.org/abs/2304.13445,,https://huggingface.co/papers/2304.13445,,,,9,0 Predict to Detect: Prediction-guided 3D Object Detection using Sequential Images,"Kim, Sanmin; Kim, Youngseok; Lee, In-Jae; Kum, Dongsuk*",poster,2306.08528,https://arxiv.org/abs/2306.08528,,https://huggingface.co/papers/2306.08528,,,,4,0 ObjectFusion: Multi-modal 3D Object Detection with Object-Centric Fusion,"Cai, Qi; Pan, Yingwei*; Yao, Ting; Ngo, Chong-Wah; Mei, Tao",poster,,,,,,,,, Domain generalization of 3D semantic segmentation in autonomous driving,"Sanchez, Jules*; Deschaud, Jean-Emmanuel; GOULETTE, François",poster,2212.04245,https://arxiv.org/abs/2212.04245,https://github.com/JulesSanchez/3DLabelProp,https://huggingface.co/papers/2212.04245,,,,3,0 When Epipolar Constraint Meets Non-local Operators in Multi-View Stereo,"Liu, Tianqi; Ye, Xinyi; Zhao, Weiyue; Pan, Zhiyu; Shi, Min*; Cao, Zhiguo",poster,,,,,,,,, Hierarchical Point-based Active Learning for Semi-supervised Point Cloud Semantic Segmentation,"xu, zongyi*; Yuan, Bo; Zhao, Shanshan; Zhang, Qianni; Gao, Xinbo",poster,2308.11166,https://arxiv.org/abs/2308.11166,https://github.com/SmiletoE/HPAL,https://huggingface.co/papers/2308.11166,,,,5,1 UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding,"Chen, Zhenyu*; Hu, Ronghang; Chen, Xinlei; Niessner, Matthias; Chang, Angel 
X",poster,2212.00836,https://arxiv.org/abs/2212.00836,,https://huggingface.co/papers/2212.00836,,,,5,0 CleaNeRF: Erasing Artifacts from Casually Captured NeRFs,"Warburg, Frederik R*; Weber, Ethan; Tancik, Matthew; Holynski, Aleksander; Kanazawa, Angjoo",poster,,,,,,,,, QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection,"zhang, yifan; Dong, Zhen*; Yang, Huanrui; Lu, Ming; Tseng, Cheng-Ching; Du, Yuan; DU, LI; Keutzer, Kurt; Zhang, Shanghang",poster,2308.10515,https://arxiv.org/abs/2308.10515,,https://huggingface.co/papers/2308.10515,,,,9,1 Clutter Detection and Removal in 3D Scenes with View-Consistent Inpainting,"Wei, Fangyin*; Funkhouser, Thomas; Rusinkiewicz, Szymon",poster,2304.03763,https://arxiv.org/abs/2304.03763,,https://huggingface.co/papers/2304.03763,,,,3,0 PG-RCNN: Semantic Surface Point Generation for 3D Object Detection,"Koo, Inyong; Lee, Inyoung; Kim, Se-Ho; Kim, Hee-Seon; Jeon, Woo-jin; Kim, Changick*",poster,,,,,,,,, Distributed bundle adjustment with block-based sparse matrix compression for super large scale datasets,"Zheng, Maoteng*; Chen, Nengcheng; Zhu, Junfeng; Zeng, Xiaoru; Qiu, Huanbin; Jiang, Yuyao; Lu, Xingyue; Qu, Hao",poster,2307.08383,https://arxiv.org/abs/2307.08383,,https://huggingface.co/papers/2307.08383,,,,8,0 Adaptive Reordering Sampler with Neurally Guided MAGSAC,"Wei, Tong*; Matas, Jiri; Barath, Daniel",poster,,,,,,,,, Privacy Preserving Localization via Coordinate Permutations,"Pan, Linfei*; Schönberger, Johannes L; Larsson, Viktor; Pollefeys, Marc",poster,,,,,,,,, DG-Recon: Depth-Guided Neural 3D Scene Reconstruction,"Ju, Jihong*; Tseng, Ching Wei; Bailo, Oleksandr; Dikov, Georgi; Ghafoorian, Mohsen",poster,,,,,,,,, WaveNeRF: Wavelet-based Generalizable Neural Radiance Fields,"Xu, Muyu; Zhan, Fangneng; Zhang, Jiahui; Yu, Yingchen; Zhang, Xiaoqin; Theobalt, Christian; Shao, Ling; Lu, Shijian*",poster,2308.04826,https://arxiv.org/abs/2308.04826,,https://huggingface.co/papers/2308.04826,,,,8,0 
TransIFF: An Instance-Level Feature Fusion Framework for Vehicle-Infrastructure Cooperative 3D Detection with Transformers,"Chen, Ziming*; Shi, Yifeng; Jia, Jinrang; Gao, Chen; Li, Bo; Liu, Si",poster,,,,,,,,, Density-invariant Features for Distant Point Cloud Registration,"Liu, Quan*; Zhu, Hongzi; Zhou, Yunsong; Li, Hongyang; Chang, Shan; Guo, Minyi",poster,2307.09788,https://arxiv.org/abs/2307.09788,https://github.com/liuQuan98/GCL,https://huggingface.co/papers/2307.09788,,,,6,0 UMIFormer: Mining the Correlations between Similar Tokens for Multi-View 3D Reconstruction,"Zhu, Zhenwei; Yang, Liying; li, ning; Jiang, Chaohao; Liang, Yanyan*",poster,2302.13987,https://arxiv.org/abs/2302.13987,https://github.com/GaryZhu1996/UMIFormer,https://huggingface.co/papers/2302.13987,,,,5,0 Neural LiDAR Fields for Novel View Synthesis,"Huang, Shengyu*; Gojcic, Zan; Wang, Zian; Williams, Francis; Kasten, Yoni; Fidler, Sanja; Schindler, Konrad; Litany, Or",poster,2305.01643,https://arxiv.org/abs/2305.01643,,https://huggingface.co/papers/2305.01643,,,,8,0 Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis,"WANG, Yuxin*; Wu, Wayne; Xu, Dan",poster,2308.02840,https://arxiv.org/abs/2308.02840,,https://huggingface.co/papers/2308.02840,,,,3,0 Long-Range Grouping Transformer for Multi-View 3D Reconstruction,"Yang, Liying; Zhu, Zhenwei; Lin, Xuxin; Nong, Jian; Liang, Yanyan*",poster,2308.08724,https://arxiv.org/abs/2308.08724,https://github.com/LiyingCV/Long-Range-Grouping-Transformer,https://huggingface.co/papers/2308.08724,,,,5,0 Cross Modal Transformer: Towards Fast and Robust 3D Object Detection,"Yan, Junjie; Liu, Yingfei; Sun, Jianjian; Jia, Fan; Li, Shuailin; Wang, Tiancai; Zhang, Xiangyu*",poster,2301.01283,https://arxiv.org/abs/2301.01283,https://github.com/junjie18/CMT,https://huggingface.co/papers/2301.01283,,,,7,0 KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection,"Luo, Yadan*; Chen, Zhuoxiao; Fang, Zhen; Zhang, Zheng; 
Huang, Zi Helen; Baktashmotlagh, Mahsa",poster,2307.07942,https://arxiv.org/abs/2307.07942,,https://huggingface.co/papers/2307.07942,,,,6,0 C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction,"Xu, Luoyuan; Guan, Tao; Wang, Yuesong*; Liu, Wenkai; Zeng, Zhaojie; Wang, Junle; Yang, Wei",poster,2306.10003,https://arxiv.org/abs/2306.10003,,https://huggingface.co/papers/2306.10003,,,,7,0 End-to-end 3D Tracking with Decoupled Queries,"Li, Yanwei*; Yu, Zhiding; Philion, Jonah; Anandkumar, Animashree; Fidler, Sanja; Jia, Jiaya; Alvarez, Jose M",poster,,,,,,,,, LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs,"Cheng, Zezhou*; Esteves, Carlos; Jampani, Varun; Kar, Abhishek; Maji, Subhransu; Makadia, Ameesh",poster,,,,,,,,, GridPull: Towards Scalability in Learning Implicit Representations from 3D Point Clouds,"Chen, Chao; Liu, Yu-Shen*; Han, Zhizhong",poster,2308.13175,https://arxiv.org/abs/2308.13175,https://github.com/chenchao15/GridPull,https://huggingface.co/papers/2308.13175,,,,3,0 Robust e-NeRF: NeRF from Sparse & Noisy Events under Non-Uniform Motion,"Low, Weng Fei*; Lee, Gim Hee",poster,2309.08596,https://arxiv.org/abs/2309.08596,https://github.com/wengflow/robust-e-nerf,https://huggingface.co/papers/2309.08596,,,https://huggingface.co/datasets/wengflow/robust-e-nerf,2,1 Parameterized Cost Volume for Stereo Matching,"Zeng, Jiaxi; Yao, Chengtang*; Yu, Lidong; Jia, Yunde; WU, Yuwei",poster,,,,,,,,, Coordinate Quantized Neural Implicit Representations for Multi-view Reconstruction,"Jiang, Sijia; Hua, Jing; Han, Zhizhong*",poster,2308.11025,https://arxiv.org/abs/2308.11025,https://github.com/MachinePerceptionLab/CQ-NIR,https://huggingface.co/papers/2308.11025,,,,3,0 Pixel-Aligned Recurrent Queries for Multi-View 3D Object Detection,"Xie, Yiming; Straub, Julian; Jiang, Huaizu; Gkioxari, Georgia*",poster,,,,,,,,, Optimizing the Placement of Roadside LiDARs for Autonomous Driving,"Jiang, Wentao*; 
Xiang, Hao; Cai, Xinyu; Xu, Runsheng; Ma, Jiaqi; Li, Yikang; Lee, Gim Hee; Liu, Si",poster,,,,,,,,, ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs,"Mu, Jiteng*; Sang, Shen; Vasconcelos, Nuno; Wang, Xiaolong",poster,2304.14401,https://arxiv.org/abs/2304.14401,,https://huggingface.co/papers/2304.14401,,,,4,0 NeRFrac: Neural Radiance Fields through Refractive Surface,"Zhan, Yifan; Nobuhara, Shohei; Nishino, Ko; Zheng, Yinqiang*",poster,,,,,,,,, CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation,"Liu, Lizhao; Zhuang, Zhuangwei; Huang, Shangxin; Xiao, Xunlong; Xiang, Tianhang; Chen, Cen; Wang, Jingdong; Tan, Mingkui*",poster,2307.10316,https://arxiv.org/abs/2307.10316,,https://huggingface.co/papers/2307.10316,,,,8,0 FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction,"Stier, Noah*; Ranjan, Anurag; Colburn, Alex; yan, yajie; Yang, Liang; Ma, Fangchang; Angles, Baptiste",poster,2304.01480,https://arxiv.org/abs/2304.01480,,https://huggingface.co/papers/2304.01480,,,,7,0 Point-SLAM: Dense Neural Point Cloud-based SLAM,"Sandström, Erik; Li, Yue; Van Gool, Luc; Oswald, Martin R.*",poster,2304.04278,https://arxiv.org/abs/2304.04278,https://github.com/tfy14esa/Point-SLAM,https://huggingface.co/papers/2304.04278,,,,4,0 You Never Get a Second Chance To Make a Good First Impression: Seeding Active Learning for 3D Semantic Segmentation,"Samet, Nermin*; Siméoni, Oriane; Puy, Gilles; Ponimatkin, Georgy; Marlet, Renaud; Lepetit, Vincent",poster,2304.11762,https://arxiv.org/abs/2304.11762,https://github.com/nerminsamet/seedal,https://huggingface.co/papers/2304.11762,,,,6,0 Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra,"Kulhanek, Jonas*; Sattler, Torsten",poster,,,,,,,,, Active Stereo Without Pattern Projector,"Bartolomei, Luca*; Poggi, Matteo; Tosi, Fabio; Conti, Andrea; Mattoccia, Stefano",poster,,,,,,,,, HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a 
Single Video,"Liu, Jia-Wei*; Cao, Yan-Pei; Yang, Tianyuan; Xu, Zhongcong; Keppo, Jussi; Shan, Ying; Qie, Xiaohu; Shou, Mike Zheng",poster,2304.12281,https://arxiv.org/abs/2304.12281,,https://huggingface.co/papers/2304.12281,,,,8,0 PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs,"Hu, Wentao; Zheng, Jia*; Zhang, Zixin; Yuan, Xiaojun; Yin, Jian; Zhou, Zihan",poster,2308.05744,https://arxiv.org/abs/2308.05744,,https://huggingface.co/papers/2308.05744,,,,6,1 Efficient View Synthesis with Neural Radiance Distribution Field,"Wu, Yushuang*; Li, Xiao; Wang, Jinglu; Han, Xiaoguang; Cui, Shuguang; Lu, Yan",poster,2308.11130,https://arxiv.org/abs/2308.11130,,https://huggingface.co/papers/2308.11130,,,,6,0 Query Refinement Transformer for 3D Instance Segmentation,"lu, jiahao*; Deng, Jiacheng; Wang, Chuxin; He, Jianfeng; Zhang, Tianzhu",poster,,,,,,,,, TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses,"Chen, Xuesong*; Shi, Shaoshuai; Zhang, Chao; Zhu, Benjin; Wang, Qiang; Cheung, Ka Chun; See, Simon; Li, Hongsheng",poster,2306.05888,https://arxiv.org/abs/2306.05888,https://github.com/poodarchu/EFG,https://huggingface.co/papers/2306.05888,,,,8,0 NerfAcc: Efficient Sampling Accelerates NeRFs,"Li, Ruilong*; Gao, Hang; Tancik, Matthew; Kanazawa, Angjoo",poster,2305.04966,https://arxiv.org/abs/2305.04966,,https://huggingface.co/papers/2305.04966,,,,4,2 NeTO:Neural Reconstruction of Transparent Objects with Self-Occlusion Aware Refraction-Tracing,"Li, Zongcheng; Long, Xiaoxiao; Wang, Yusen; Cao, Tuo; Wang, Wenping; Luo, Fei; Xiao, Chunxia*",poster,2303.11219,https://arxiv.org/abs/2303.11219,,https://huggingface.co/papers/2303.11219,,,,7,0 Text2Tex: Text-driven Texture Synthesis via Diffusion Models,"Chen, Zhenyu*; Siddiqui, Yawar; Lee, Hsin-Ying; Tulyakov, Sergey; Niessner, Matthias",poster,2303.11396,https://arxiv.org/abs/2303.11396,,https://huggingface.co/papers/2303.11396,,,,5,0 Learning 
Long-range Information with Dual-Scale Transformers for Indoor Scene Completion,"Wang, Ziqi*; Luo, Fei; Long, Xiaoxiao; Zhang, Wenxiao; Xiao, Chunxia",poster,,,,,,,,, SparseBEV: Sparse 3D Object Detection from Multi-Camera Videos,"Liu, Haisong*; Teng, Yao; Lu, Tao; Wang, Haiguang; Wang, Limin",poster,,,,,,,,, NeRF-MS: Neural Radiance Fields with Multi-Sequence,"Li, Peihao*; Wang, Shaohui; Yang, Chen; Bingbing, Liu; Qiu, Weichao; Wang, Haoqian",poster,,,,,,,,, Label-Guided Knowledge Distillation for Continual Semantic Segmentation on 2D Images and 3D Point Clouds,"Yang, Ze; Li, Ruibo; Ling, Evan; Zhang, Chi; Wang, Yiming; HUANG, dezhao; Ma, Keng Teck; Hur, Minhoe; Lin, Guosheng*",poster,,,,,,,,, ETran: Energy-Based Transferability Estimation,"Gholami, Mohsen*; Akbari, Mohammad; Wang, Xinglu; kamranian, behnam; Zhang, Yong",poster,2308.02027,https://arxiv.org/abs/2308.02027,,https://huggingface.co/papers/2308.02027,,,,5,0 PODA: Prompt-driven Zero-shot Domain Adaptation,"Fahes, Mohammad*; VU, Tuan-Hung; Bursuc, Andrei; Pérez, Patrick; de Charette, Raoul",poster,,,,,,,,, Local Context-Aware Active Domain Adaptation,"Sun, Tao*; Lu, Cheng; Ling, Haibin",poster,2208.12856,https://arxiv.org/abs/2208.12856,https://github.com/tsun/LADA,https://huggingface.co/papers/2208.12856,,,,3,0 MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition,"Zheng, Tianlun; Chen, Zhineng*; Huang, BingChen; Zhang, Wei; Jiang, Yu-Gang",poster,2305.14758,https://arxiv.org/abs/2305.14758,https://github.com/simplify23/MRN,https://huggingface.co/papers/2305.14758,,,,5,1 Few-Shot Dataset Distillation,"Liu, Songhua*; Wang, Xinchao",poster,,,,,,,,, Wasserstein Expansible Variational Autoencoder for Discriminative and Generative Continual Learning,"Ye, Fei*; Bors, Adrian",poster,,,,,,,,, Tangent Model Composition for Ensembling and Continual Fine-tuning,"Liu, Tian Yu*; Soatto, 
Stefano",poster,2307.08114,https://arxiv.org/abs/2307.08114,,https://huggingface.co/papers/2307.08114,,,,2,0 Look at the Neighbor: Distortion-aware Unsupervised Domain Adaptation for Panoramic Semantic Segmentation,"Zheng, Xu*; Pan, Tianbo; Luo, Yunhao; Wang, Lin ",poster,2308.05493,https://arxiv.org/abs/2308.05493,,https://huggingface.co/papers/2308.05493,,,,4,0 Homeomorphism Alignment for Unsupervised Domain Adaptation,"Zhou, Lihua; Ye, Mao*; Zhu, Xiatian; Xiao, Siying; Fan, Xu-Qian; Neri, Ferrante",poster,,,,,,,,, Knowledge Restore and Transfer for Multi-label Class-Incremental Learning,"Dong, Songlin*; Luo, Haoyu; He, Yuhang; Wei, Xing; Cheng, Jie; Gong, Yihong",poster,2302.13334,https://arxiv.org/abs/2302.13334,,https://huggingface.co/papers/2302.13334,,,,5,0 Unsupervised Domain Adaptation for Training Event-Based Networks Using Contrastive Learning and Uncorrelated Conditioning,"Jian, Dayuan*; Rostami, Mohammad",poster,2303.12424,https://arxiv.org/abs/2303.12424,,https://huggingface.co/papers/2303.12424,,,,2,0 A Simple Recipe to Meta-Learn Forward and Backward Transfer,"Cetin, Edoardo*; Carta, Antonio; Celiktutan, Oya",poster,,,,,,,,, Dynamic Residual Classifier for Class Incremental Learning,"chen, xiuwei; Chang, Xiaobin*",poster,2308.13305,https://arxiv.org/abs/2308.13305,,https://huggingface.co/papers/2308.13305,,,,2,0 Concept-wise Fine-tuning Matters in Preventing Negative Transfer,"Yang, Yunqiao; Huang, Long-Kai; WEI, Ying*",poster,,,,,,,,, Online Prototype Learning for Online Continual Learning,"Wei, Yujie*; Ye, JiaXin; Huang, Zhizhong; Zhang, Junping; Shan, Hongming",poster,2308.00301,https://arxiv.org/abs/2308.00301,https://github.com/weilllllls/OnPro,https://huggingface.co/papers/2308.00301,,,,5,0 Bidirectional Alignment for Domain Adaptive Detection with Transformers,"He, Liqiang*; WANG, WEI; Chen, Albert Y; Sun, Min; Kuo, Cheng-Hao; Todorovic, Sinisa",poster,,,,,,,,, Borrowing Knowledge From Pre-trained Language Model: A New Data-efficient Visual 
Learning Paradigm,"Ma, Wenxuan; Li, Shuang*; Zhang, JinMing; Liu, Chi Harold; Kang, Jingxuan; Wang, Yulin; Huang, Gao",poster,,,,,,,,, CLR: Channel-wise Lightweight Reprogramming for Continual Learning,"Ge, Yunhao*; Li, Yuecheng; Ni, Shuo; Zhao, Jiaping; Yang, Ming-Hsuan; Itti, Laurent",poster,2307.11386,https://arxiv.org/abs/2307.11386,https://github.com/gyhandy/Channel-wise-Lightweight-Reprogramming,https://huggingface.co/papers/2307.11386,,,,6,0 Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation,"Cao, Haozhi*; Xu, Yuecong; Yang, Jianfei; Yin, Pengyu; Yuan, Shenghai; Xie, Lihua",poster,2303.10457,https://arxiv.org/abs/2303.10457,,https://huggingface.co/papers/2303.10457,,,,6,0 First Session Adaptation: A Strong Replay-Free Baseline for Class-Incremental Learning,"Panos, Aristeidis*; Kobe, Yuriko; Olmeda Reino, Daniel; Aljundi, Rahaf; Turner, Richard E.",poster,2303.13199,https://arxiv.org/abs/2303.13199,,https://huggingface.co/papers/2303.13199,,,,5,0 Domain Adaptive Few-Shot Open-Set Learning,"Pal, Debabrata*; More, Deeptej S; Rongali, Sai Bhargav; Tamboli, Dipesh; Aggarwal, Vaneet; Banerjee, Biplab",poster,,,,,,,,, Rethinking the Role of Pre-Trained Networks in Source-Free Domain Adaptation,"Zhang, Wenyu*; Shen, Li; Foo, Chuan Sheng",poster,2212.07585,https://arxiv.org/abs/2212.07585,,https://huggingface.co/papers/2212.07585,,,,3,0 Towards a True Evaluation of Rapid Adaptation in Online Continual Learning,"Hammoud, Hasan Abed Al Kader*; Prabhu, Ameya Pandurang; Lim, Ser-Nam; Torr, Philip; Bibi, Adel; Ghanem, Bernard",poster,,,,,,,,, Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation,"Liu, Nian*; Nan, Kepan; Zhao, Wangbo; Liu, Yuanwei; Yao, Xiwen; Khan, Salman; Cholakkal, Hisham; Anwer, Rao Muhammad; Han, Junwei; Shahbaz Khan, Fahad",poster,,,,,,,,, A Low-Shot Object Counting Network With Iterative Prototype Adaptation,"Đukić, Nikola; Lukezic, Alan; Zavrtanik, Vitjan*; Kristan, 
Matej",poster,2211.08217,https://arxiv.org/abs/2211.08217,,https://huggingface.co/papers/2211.08217,,,,4,0 Towards Better Robustness against Common Corruptions for Unsupervised Domain Adaptation,"Gao, Zhiqiang*; Huang, Kaizhu; Zhang, Rui; Liu, Dawei; Ma, Jieming",poster,,,,,,,,, Alleviating Catastrophic Forgetting of Incremental Object Detection via Within-Class and Between-Class Knowledge Distillation,"Kang, Mengxue*; Zhang, Jinpeng; Zhang, Jinming; Wang, Xiashuang; chen, yang; Ma, Zhe; Huang, Xuhui",poster,,,,,,,,, Class-Aware Patch Embedding Adaptation for Few-Shot Image Classification,"Hao, Fusheng; He, Fengxiang; Liu, Liu; Wu, Fuxiang; Tao, Dacheng; Cheng, Jun*",poster,,,,,,,,, Order-preserving Consistency Regularization for Domain Adaptation and Generalization,"Jing, Mengmeng; Zhen, Xiantong; Li, Jingjing; Snoek, Cees*",poster,,,,,,,,, Domain-Specificity Inducing Transformers for Source-Free Domain Adaptation,"Sanyal, Sunandini*; Asokan, Ashish R; Bhambri, Suvaansh; Kulkarni, Akshay R; Kundu, Jogendra Nath; RADHAKRISHNAN, Venkatesh Babu",poster,2308.14023,https://arxiv.org/abs/2308.14023,,https://huggingface.co/papers/2308.14023,,,,6,0 Diffusion Model as Representation Learner,"Yang, Xingyi*; Wang, Xinchao",poster,2308.10916,https://arxiv.org/abs/2308.10916,https://github.com/Adamdad/Repfusion,https://huggingface.co/papers/2308.10916,,,,2,0 σ-Adaptive Decoupled Prototype for Few-Shot Object Detection,"Zhang, Shan; Du, Jinhao; Chen, Qiang; Le, Haifeng; Sun, Yanpeng; Ni, Yao; Wang, Jian; He, Bin; Wang, Jingdong*",poster,,,,,,,,, Growing a Brain with Sparsity-Inducing Generation for Continual Learning,"Jin, Hyundong; Kim, Gyeong-hyeon; Ahn, Chanho; Kim, Eunwoo*",poster,,,,,,,,, DomainAdaptor: A Novel Approach to Test-time Adaptation,"Zhang, Jian*; Qi, Lei; Shi, Yinghuan; Gao, Yang",poster,2308.10297,https://arxiv.org/abs/2308.10297,https://github.com/koncle/DomainAdaptor,https://huggingface.co/papers/2308.10297,,,,4,0 Reconciling Object-Level and Global-Level 
Objectives for Long-Tail Detection,"Zhang, Shaoyu*; Chen, Chen; Peng, Silong",poster,,,,,,,,, Domain Generalization via Balancing Training Difficulty and Model Capability,"Jiang, Xueying; Huang, Jiaxing; Jin, Sheng; Lu, Shijian*",poster,,,,,,,,, Understanding Hessian Alignment for Domain Generalization,"Hemati, Sobhan*; Zhang, Guojun; Estiri, Amir H; Chen, Xi",poster,2308.11778,https://arxiv.org/abs/2308.11778,https://github.com/huawei-noah/Federated-Learning/tree/main/HessianAlignment,https://huggingface.co/papers/2308.11778,,,,4,0 Vision Transformer Adapters for Generalizable Multitask Learning,"Bhattacharjee, Deblina*; Süsstrunk, Sabine; Salzmann, Mathieu",poster,2308.12372,https://arxiv.org/abs/2308.12372,,https://huggingface.co/papers/2308.12372,,,,3,0 Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation,"Huo, Xinyue*; Xie, Lingxi; Zhou, Wengang; Li, Houqiang; Tian, Qi",poster,2303.09083,https://arxiv.org/abs/2303.09083,,https://huggingface.co/papers/2303.09083,,,,5,0 Masked Retraining Teacher-Student Framework for Domain Adaptive Object Detection,"Zhao, Zijing; Wei, Sitong; Chen, Qingchao; Li, Dehui; Yang, YiFan; Peng, Yuxin; Liu, Yang*",poster,,,,,,,,, DandelionNet: Domain Composition with Instance Adaptive Classification for Domain Generalization,"Hu, Lanqing*; Kan, Meina; Shan, Shiguang; Chen, Xilin",poster,,,,,,,,, CAFA: Class-aware Feature Alignment for Test-time Adaptation,"Jung, Sanghun*; Lee, Jungsoo; KIM, NANHEE; Shaban, Amirreza; Boots, Byron; Choo, Jaegul",poster,,,,,,,,, Image-free Classifier Injection for Zero-Shot Classification,"Christensen, Anders*; Mancini, Massimiliano; Koepke, A. 
Sophia; Winther, Ole; Akata, Zeynep",poster,2308.10599,https://arxiv.org/abs/2308.10599,https://github.com/ExplainableML/ImageFreeZSL,https://huggingface.co/papers/2308.10599,,,,5,0 CBA: Improving Online Continual Learning via Continual Bias Adaptor,"Wang, Quanziang*; Wang, Renzhen; Wu, Yichen; Jia, Xixi; Meng, Deyu",poster,2308.06925,https://arxiv.org/abs/2308.06925,,https://huggingface.co/papers/2308.06925,,,,5,0 AdaptGuard: Defending Against Universal Attacks for Model Adaptation,"Sheng, Lijun*; Liang, Jian; He, Ran; Wang, Zilei; Tan, Tieniu",poster,2303.10594,https://arxiv.org/abs/2303.10594,,https://huggingface.co/papers/2303.10594,,,,5,0 Masked Autoencoders are Efficient Class Incremental Learners,"Zhai, Jiang-Tian; Liu, Xialei*; Bagdanov, Andy; Li, Ke; Cheng, Ming-Ming",poster,2308.12510,https://arxiv.org/abs/2308.12510,https://github.com/scok30/MAE-CIL,https://huggingface.co/papers/2308.12510,,,,5,0 DomainDrop: Suppressing Domain-Sensitive Channels for Domain Generalization,"Guo, Jintao*; Qi, Lei; Shi, Yinghuan",poster,2308.10285,https://arxiv.org/abs/2308.10285,https://github.com/lingeringlight/DomainDrop,https://huggingface.co/papers/2308.10285,,,,3,0 Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models,"Zheng, Zangwei*; MA, Mingyuan; Wang, Kai; Qin, Ziheng; Yue, Xiangyu; You, Yang",poster,2303.06628,https://arxiv.org/abs/2303.06628,https://github.com/Thunderbeee/ZSCL,https://huggingface.co/papers/2303.06628,,,,6,0 Incremental Generalized Category Discovery,"Zhao, Bingchen*; Mac Aodha, Oisin",poster,2304.14310,https://arxiv.org/abs/2304.14310,,https://huggingface.co/papers/2304.14310,,,,2,0 SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model,"Zhang, Gengwei*; Wang, Liyuan; Kang, Guoliang; Chen, Ling; Wei, Yunchao",poster,2303.05118,https://arxiv.org/abs/2303.05118,https://github.com/GengDavid/SLCA,https://huggingface.co/papers/2303.05118,,,,5,0 Efficient Model Personalization in 
Federated Learning via Client-Specific Prompt Generation,"Yang, Fu-En*; Wang, Chien-Yi; Wang, Yu-Chiang Frank",poster,2308.15367,https://arxiv.org/abs/2308.15367,,https://huggingface.co/papers/2308.15367,,,,3,0 iDAG: Invariant DAG Searching for Domain Generalization,"Huang, Zenan*; Wang, Haobo; Zhao, Junbo; Zheng, Nenggan",poster,,,,,,,,, SSDA: Secure Source-Free Domain Adaptation,"Ahmed, Sabbir*; Arafat, Abdullah Al; Rizve, Mamshad Nayeem; Hossain, Rahim; Guo, Zhishan; Rakin, Adnan Siraj",poster,,,,,,,,, Learning Pseudo-Relations for Cross-domain Semantic Segmentation,"zhao, dong; Wang, Shuang*; zang, qi; Quan, Dou; Ye, Xiutiao; Yang, Rui; Jiao, Licheng ",poster,,,,,,,,, Self-Organizing Pathway Expansion for Non-Exemplar Class-Incremental Learning,"Zhu, Kai*; Zheng, Kecheng; Feng, Ruili; Zhao, Deli; Cao, Yang; Zha, Zheng-Jun",poster,,,,,,,,, Improved Knowledge Transfer for Semi-supervised Domain Adaptation via Trico Training Strategy,"Ngo, Ba Hung*; Chae, Yeon Jeong; Kwon, Jung Eun; Park, Jae Hyeon; Cho, Sung In",poster,,,,,,,,, Few-shot Continual Infomax Learning,"Gu, Ziqi*; Xu, Chunyan; Cui, Zhen; Yang, Jian",poster,,,,,,,,, EDAPS: Enhanced Domain-Adaptive Panoptic Segmentation,"Saha, Suman*; Hoyer, Lukas; Obukhov, Anton; Dai, Dengxin ; Van Gool, Luc",poster,2304.14291,https://arxiv.org/abs/2304.14291,https://github.com/susaha/edaps,https://huggingface.co/papers/2304.14291,,,,5,0 Label-Efficient Online Continual Object Detection in Streaming Video,"Wu, Jay Zhangjie*; Zhang, David Junhao; Hsu, Wynne; Zhang, Mengmi; Shou, Mike Zheng",poster,2206.00309,https://arxiv.org/abs/2206.00309,https://github.com/showlab/Efficient-CLS,https://huggingface.co/papers/2206.00309,,,,5,0 Prototypical Kernel Learning and Open-set Foreground Perception for Generalized Few-shot Semantic Segmentation,"Huang, Kai*; wang, feigege; Xi, Ye; gao, yutao",poster,2308.04952,https://arxiv.org/abs/2308.04952,,https://huggingface.co/papers/2308.04952,,,,4,0 MSI: Maximize Support-Set Information 
for Few-Shot Segmentation,"Moon, Seonghyeon*; Sohn, Samuel S; Zhou, Honglu; Yoon, Sejong; Pavlovic, Vladimir; Khan, Muhammad Haris; Kapadia, Mubbasir",poster,2212.04673,https://arxiv.org/abs/2212.04673,,https://huggingface.co/papers/2212.04673,,,,7,0 AREA: Adaptive Reweighting via Effective Area for Long-Tailed Classification,"Chen, Xiaohua*; Zhou, Yucan; Wu, Dayan; Yang, Chule; Li, Bo; Hu, Qinghua; Wang, Weiping",poster,,,,,,,,, PASTA: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization,"Chattopadhyay, Prithvijit*; Sarangmath, Kartik; Vijaykumar, Vivek; Hoffman, Judy",poster,2212.00979,https://arxiv.org/abs/2212.00979,,https://huggingface.co/papers/2212.00979,,,,4,1 Personalized Semantics Excitation for Federated Image Classification,"Xia, Haifeng*; Li, Kai; Ding, Zhengming",poster,,,,,,,,, Few-Shot Video Classification via Representation Fusion and Promotion Learning,"Xia, Haifeng*; Li, Kai; Min, Martin Renqiang; Ding, Zhengming",poster,,,,,,,,, Adaptive Calibrators Ensemble for Model Calibration under Distribution Shift,"Zou, Yuli*; Deng, Weijian; Zheng, Liang",poster,,,,,,,,, Anchor Structure Regularization Induced Multi-view Subspace Clustering via Enhanced Tensor Rank Minimization,"Ji, Jintian; Feng, Songhe*",poster,,,,,,,,, Meta OOD Learning For Continuously Adaptive OOD Detection ,"wu, xinheng*; Lu, Jie; Fang, Zhen; Zhang, Guangquan",poster,,,,,,,,, Learning with Diversity: Self-Expanded Equalization for Better Generalized Deep Metric Learning,"Yan, Jiexi; Yin, Zhihui; Yang, Erkun; Yang, Yanhua*; Huang, Heng",poster,,,,,,,,, Bold but Cautious: Unlocking the Potential of Personalized Federated Learning through Cautiously Aggressive Collaboration,"Wu, Xinghao; Liu, Xuefeng; Niu, jianwei*; Zhu, Guogang; Tang, Shaojie",poster,,,,,,,,, Federated Learning Over Images: Vertical Decompositions and Pre-Trained Backbones Are Difficult to Beat,"Hu, Erdong*; Tang, Yuxin; Kyrillidis, Anastasios; Jermaine, Chris",poster,,,,,,,,, 
Towards Inadequately Pre-trained Models in Transfer Learning,"Deng, Andong*; Li, Xingjian; Hu, Di; Wang, Tianyang; Xiong, Haoyi; Xu, Cheng-Zhong",poster,2203.04668,https://arxiv.org/abs/2203.04668,,https://huggingface.co/papers/2203.04668,,,,6,0 Reducing Training Time in Cross-Silo Federated Learning using Multigraph Topology,"Do, Tuong Khanh Long; Nguyen, Binh Xuan; Pham, Vuong; Tran, Toan M; Tjiputra, Erman; Tran, Quang Duy; Nguyen, Anh*",poster,2207.09657,https://arxiv.org/abs/2207.09657,https://github.com/aioz-ai/MultigraphFL,https://huggingface.co/papers/2207.09657,,,,7,0 Membrane Potential Batch Normalization for Spiking Neural Networks,"Guo, Yufei*; Zhang, Yuhan; Chen, Yuanpei; peng, weihang; Liu, Xiaode; Zhang, Liwen; Huang, Xuhui; Ma, Zhe",poster,2308.08359,https://arxiv.org/abs/2308.08359,https://github.com/yfguo91/MPBN,https://huggingface.co/papers/2308.08359,,,,8,0 Revisit PCA-based technique for Out-of-Distribution Detection,"Guan, Xiaoyuan; Liu, Zhouwu; Zhou, Yuren; ZHENG, WEI-SHI; Wang, Ruixuan*",poster,,,,,,,,, Cross-view Topology Based Consistent and Complementary Information for Deep Multi-view Clustering,"Dong, Zhibin; Jin, Jiaqi; Wang, Siwei; Liu, Xinwang*; Zhu, En",poster,,,,,,,,, A Benchmark for Chinese-English Scene Text Image Super-resolution,"Ma, Jianqi*; Liang, Zhetong; Xiang, Wangmeng; Yang, Xi; Zhang, Lei",poster,2308.03262,https://arxiv.org/abs/2308.03262,https://github.com/mjq11302010044/Real-CE,https://huggingface.co/papers/2308.03262,,,,5,0 Vision Grid Transformer for Document Layout Analysis,"Da, Cheng*; Luo, Chuwei; Zheng, Qi; Yao, Cong",poster,2308.14978,https://arxiv.org/abs/2308.14978,https://github.com/AlibabaResearch/AdvancedLiterateMachinery,https://huggingface.co/papers/2308.14978,,,,4,0 Self-supervised Character-to-Character Distillation for Text Recognition,"Guan, Tongkun*; Shen, Wei; Yang, Xue; Feng, Qi; Jiang, Zekun; Yang, 
Xiaokang",poster,2211.00288,https://arxiv.org/abs/2211.00288,https://github.com/TongkunGuan/CCD,https://huggingface.co/papers/2211.00288,,,,6,0 ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction,"He, Jiabang; Wang, Lei; Hu, Yi; Liu, Ning; LIU, HUI; Xu, Xing*; Shen, Heng Tao",poster,,,,,,,,, ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer,"Huang, Mingxin; Zhang, Jiaxin; Peng, Dezhi; Lu, Hao; Huang, Can; Liu, Yuliang; Bai, Xiang; Jin, Lianwen*",poster,2308.10147,https://arxiv.org/abs/2308.10147,https://github.com/mxin262/ESTextSpotter,https://huggingface.co/papers/2308.10147,,,,8,0 Few shot font generation via transferring similarity guided global style and quantization local style,"Pan, Wei; Zhu, Anna*; Zhou, Xinyu; Iwana, Brian K; Li, Shilin",poster,,,,,,,,, Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration,"Cao, Haoyu*; Bao, Changcun; Liu, Chaohu; Chen, Huang; Yin, Kun; Liu, Hao; Liu, Yinsong; Jiang, Deqiang; Sun, Xing",poster,,,,,,,,, Document Understanding Dataset and Evaluation (DUDE),"Van Landeghem, Jordy*; Tito, Rubén; Borchmann, Łukasz; Pietruszka, Michał; Joziak, Pawel; Powalski, Rafal; Jurkiewicz, Dawid; Coustaty, Mickael; Anckaert, Bertrand; Valveny, Ernest; Blaschko, Matthew B.; Moens, Sien; Stanislawek, Tomasz",poster,2305.08455,https://arxiv.org/abs/2305.08455,,https://huggingface.co/papers/2305.08455,,,,13,1 LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition,"Cheng, Changxu*; Wang, Peng; Da, Cheng; Zheng, Qi; Yao, Cong",poster,2308.12774,https://arxiv.org/abs/2308.12774,,https://huggingface.co/papers/2308.12774,,,,5,0 MolGrapher: Graph-based Visual Recognition of Chemical Structures,"Morin, Lucas*; Danelljan, Martin; Agea, M. 
Isabel; Nassar, Ahmed S; weber, valery; Meijer, Gerhard Ingmar; Staar, Peter W J; Yu, Fisher",poster,2308.12234,https://arxiv.org/abs/2308.12234,,https://huggingface.co/papers/2308.12234,,,,8,0 SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap,"Kim, Daehee; Kim, Yoonsik*; Kim, DongHyun; Lim, Yumin; Kim, Geewook; Kil, Taeho",poster,,,,,,,,, Foreground and Text-lines Aware Document Image Rectification,"Li, Heng; Wu, Xiangping; Chen, Qingcai*; Xiang, Qianjin",poster,,,,,,,,, DocTr: Document Transformer for Structured Information Extraction in Documents,"Liao, Haofu*; RoyChowdhury, Aruni; Li, Weijian; Bansal, Ankan; Zhang, Yuting; Tu, Zhuowen; Satzoda, Ravi Kumar; Manmatha, R.; Mahadevan, Vijay",poster,2307.07929,https://arxiv.org/abs/2307.07929,,https://huggingface.co/papers/2307.07929,,,,9,0 GPGait: Generalized Pose-based Gait Recognition,"Fu, Yang*; Meng, Shibei; Hou, Saihui; Hu, Xuecai ; Huang, Yongzhen",poster,2303.05234,https://arxiv.org/abs/2303.05234,https://github.com/BNU-IVC/FastPoseGait,https://huggingface.co/papers/2303.05234,,,,5,0 RPG-Palm: Realistic Pseudo-data Generation for Palmprint Recognition,"Shen, Lei*; Jin, Jianlong; Zhang, Ruixin; Li, Huaen; ZHAO, KAI; Zhang, Yingyi; Zhang, Jingyun; Ding, Shouhong; Zhao, Yang; Jia, Wei",poster,,,,,,,,, Learning Clothing and Pose Invariant 3D Shape Representation for Long-Term Person Re-Identification,"Liu, Feng*; Kim, Minchul; Gu, ZiAng; Jain, Anil; Liu, Xiaoming",poster,2308.10658,https://arxiv.org/abs/2308.10658,,https://huggingface.co/papers/2308.10658,,,,5,0 Physics-Augmented Autoencoder for 3D Skeleton-Based Gait Recognition,"Guo, Hongji*; Ji, Qiang",poster,,,,,,,,, Hierarchical Spatio-Temporal Representation Learning for Gait Recognition,"Wang, Lei; Liu, Bo*; Liang, Fangfang; Wang, Bincheng",poster,2307.09856,https://arxiv.org/abs/2307.09856,,https://huggingface.co/papers/2307.09856,,,,4,0 IDiff-Face: Synthetic-based 
Face Recognition through Fizzy Identity-Conditioned Diffusion Model,"Boutros, Fadi*; Grebe, Jonas Henry; Kuijper, Arjan; Damer, Naser",poster,,,,,,,,, Template Inversion Attack against Face Recognition systems using 3D Face Reconstruction,"Otroshi Shahreza, Hatef*; Marcel, Sebastien",poster,,,,,,,,, Privacy-Preserving Face Recognition Using Random Frequency Components,"Mi, Yuxi*; Huang, Yuge; Ji, Jiazhen; Zhao, Minyi; Wu, Jiaxiang; Xu, Xingkun; Ding, Shouhong; Zhou, Shuigeng",poster,2308.10461,https://arxiv.org/abs/2308.10461,https://github.com/Tencent/TFace,https://huggingface.co/papers/2308.10461,,,,8,0 FLIP: Cross-domain Face Anti-spoofing with Language Guidance,"Srivatsan, Koushik*; Naseer, Muzammal; Nandakumar, Karthik",poster,,,,,,,,, Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV,"Spencer, Jaime*; Russell, Chris; Hadfield, Simon; Bowden, Richard",oral,2307.10713,https://arxiv.org/abs/2307.10713,https://github.com/jspenmar/slowtv_monodepth,https://huggingface.co/papers/2307.10713,,,,4,0 Novel Scenes & Classes: Towards Adaptive Open-set Object Detection,"Li, Wuyang*; Guo, Xiaoqing; Yuan, Yixuan",oral,,,,,,,,, Improving Unsupervised Visual Program Inference with Code Rewriting Families,"Ganeshan, Aditya*; Jones, R. 
Kenny; Ritchie, Daniel",oral,,,,,,,,, Denoising Diffusion Autoencoders are Unified Self-supervised Learners,"Xiang, Weilai; Yang, Hongyu*; Huang, Di; Wang, Yunhong",oral,2303.09769,https://arxiv.org/abs/2303.09769,,https://huggingface.co/papers/2303.09769,,,,4,0 Self-Ordering Point Clouds,"Yang, Pengwan*; Snoek, Cees; Asano, Yuki M",oral,2304.00961,https://arxiv.org/abs/2304.00961,,https://huggingface.co/papers/2304.00961,,,,3,0 MOST: Multiple Object localization with Self-supervised Transformers for object discovery,"Rambhatla, Sai Saketh *; Misra, Ishan; Chellappa, Rama; Shrivastava, Abhinav",oral,2304.05387,https://arxiv.org/abs/2304.05387,,https://huggingface.co/papers/2304.05387,,,,4,2 Self-supervised Learning for 3D Human-Object Spatial Relations from Unbounded Synthesized Images,"Han, Sookwan*; Joo, Hanbyul",oral,,,,,,,,, Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification,"Dou, Zhaopeng*; Wang, Zhongdao; Li, Ya-Li; Wang, Shengjin",oral,2308.08887,https://arxiv.org/abs/2308.08887,https://github.com/dcp15/ISR_ICCV2023_Oral,https://huggingface.co/papers/2308.08887,,,,4,0 Anatomical Invariance Modeling and Semantic Alignment for Self-supervised Learning in 3D Medical Image Analysis,"Jiang, Yankai*; Sun, Mingze; Guo, Heng; Bai, Xiaoyu; Yan, Ke; Lu, Le; Xu, Minfeng",oral,2302.05615,https://arxiv.org/abs/2302.05615,https://github.com/alibaba-damo-academy/alice,https://huggingface.co/papers/2302.05615,,,,7,0 IOMatch: Simplifying Open-Set Semi-Supervised Learning with Joint Inliers and Outliers Utilization,"Li, Zekun*; Qi, Lei; Shi, Yinghuan; Gao, Yang",oral,2308.13168,https://arxiv.org/abs/2308.13168,https://github.com/nukezil/IOMatch,https://huggingface.co/papers/2308.13168,,,,4,0 Enhancing Sample Utilization through Sample Adaptive Augmentation in Semi-Supervised Learning,"Gui, Guan*; Zhao, Zhen; Qi, Lei; Zhou, Luping; Wang, Lei; Shi, Yinghuan",oral,,,,,,,,, When Noisy Labels Meet Long Tail Dilemmas: A 
Representation Calibration Method,"Zhang, Manyi*; Zhao, Xuyang; Yao, Jun; Yuan, Chun; Huang, Weiran",oral,2211.10955,https://arxiv.org/abs/2211.10955,,https://huggingface.co/papers/2211.10955,,,,5,0 Cross-Ray Neural Radiance Fields for Novel-view Synthesis from Unconstrained Image Collections,"Yang, Yifan; Zhang, Shuhai; Huang, Zixiong; Zhang, Yubing; Tan, Mingkui*",oral,2307.08093,https://arxiv.org/abs/2307.08093,,https://huggingface.co/papers/2307.08093,,,,5,0 Effective Real Image Editing with Accelerated Iterative Diffusion Inversion,"Pan, Zhihong*; Gherardi, Riccardo; Xie, Xiufeng; Huang, Stephen",oral,,,,,,,,, Simulating Fluids in Real-World Still Images,"Fan, Siming; Piao, Jingtan; Qian, Chen; Lin, Kwan-Yee*; Li, Hongsheng",oral,2204.11335,https://arxiv.org/abs/2204.11335,,https://huggingface.co/papers/2204.11335,,,,5,1 FateZero: Fusing Attentions for Zero-shot Text-based Video Editing,"QI, Chenyang; Cun, Xiaodong; Zhang, Yong; Lei, Chenyang; Wang, Xintao; Shan, Ying; Chen, Qifeng*",oral,2303.09535,https://arxiv.org/abs/2303.09535,https://github.com/ChenyangQiQi/FateZero,https://huggingface.co/papers/2303.09535,https://huggingface.co/spaces/chenyangqi/FateZero,,,7,1 ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation,"Wei, Yuxiang; Yabo, Zhang; ji, zhilong; Bai, Jinfeng; Zhang, Lei; Zuo, Wangmeng*",oral,2302.13848,https://arxiv.org/abs/2302.13848,https://github.com/csyxwei/ELITE,https://huggingface.co/papers/2302.13848,https://huggingface.co/spaces/ELITE-library/ELITE,https://huggingface.co/ELITE-library/ELITE,,6,0 Get-a-Video-for-Free: Text-to-Image Diffusion Models are Zero-Shot Video Generators,"Khachatryan, Levon; Movsisyan, Andranik; Tadevosyan, Vahram; Henschel, Roberto*; Wang, Zhangyang; Navasardyan, Shant; Shi, Humphrey",oral,,,,,,,,, Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models,"Kim, Byungjun*; Kwon, Patrick; Lee, Kwangho; Lee, Myunggi; Han, Sookwan; 
Kim, Daesik; Joo, Hanbyul",oral,2305.11870,https://arxiv.org/abs/2305.11870,,https://huggingface.co/papers/2305.11870,,,,7,2 DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion Models,"Holmquist, Karl*; Wandt, Bastian",oral,2211.16487,https://arxiv.org/abs/2211.16487,,https://huggingface.co/papers/2211.16487,,,,2,0 HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation,"Ju, Xuan*; Zeng, Ailing; ZHAO, Chenchen; Wang, Jianan; Zhang, Lei; Xu, Qiang",oral,2304.04269,https://arxiv.org/abs/2304.04269,,https://huggingface.co/papers/2304.04269,,,,6,0 Role-aware Interaction Generation from Textual Description,"Tanaka, Mikihiro*; Fujiwara, Kent",oral,,,,,,,,, PhysDiff: Physics-Guided Human Motion Diffusion Model ,"Yuan, Ye*; Song, Jiaming; Iqbal, Umar; Vahdat, Arash; Kautz, Jan",oral,,,,,,,,, Forward Flow for Novel View Synthesis of Dynamic Scenes,"Guo, Xiang; Sun, Jiadai; Dai, Yuchao*; CHEN, Guanying; Ye, Xiaoqing; Tan, Xiao; Ding, Errui; Zhang, Yumeng; Wang, Jingdong",oral,,,,,,,,, A step towards understanding why classification helps regression,"Pintea, Silvia L*; Lin, Yancong; Dijkstra, Jouke; van Gemert, Jan C",poster,2308.10603,https://arxiv.org/abs/2308.10603,,https://huggingface.co/papers/2308.10603,,,,4,0 DNA-Rendering : A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering,"CHENG, WEI*; Chen, Ruixiang; Chen, Keyu; Cai, Zhongang; Dai, Bo; Fan, Siming; Gao, Yang; Lin, Zhengyu; Lin, Dahua; Liu, Ziwei; Lin, Kwan-Yee; Loy, Chen Change; Qian, Chen; Ren, Daxuan; Wu, Wayne; Wang, Jingbo; Yu, Zhengming; Yin, Wanqi; Yang, Lei",poster,,,,,,,,, Robo3D: Towards Robust and Reliable 3D Perception against Corruptions,"Kong, Lingdong*; Liu, Youquan; Li, Xin; Chen, Runnan; Zhang, Wenwei; Ren, Jiawei; Pan, Liang; Chen, Kai; Liu, Ziwei",poster,2303.17597,https://arxiv.org/abs/2303.17597,,https://huggingface.co/papers/2303.17597,,,,9,1 Efficient Discovery and Effective Evaluation of Visual Similarities: A Benchmark and 
Beyond,"Barkan, Oren*; Reiss, Tal; Weill, Jonathan; Kats, Ori; Hirsch, Roy; Malkiel, Itzik; Koenigstein, Noam ",poster,,,,,,,,, DetermiNet: A Large-Scale Diagnostic Dataset for Complex Visually-Grounded Referencing using Determiners,"Lee, Clarence*; Kumar, M Ganesh; Tan, Cheston",poster,,,,,,,,, Beyond Object Recognition: A New Benchmark towards Object Concept Learning,"Li, Yong-Lu*; Xu, Yue; Xu, Xinyu; Mao, Xiaohan; Yao, Yuan; Liu, Siqi; Lu, Cewu",poster,2212.02710,https://arxiv.org/abs/2212.02710,,https://huggingface.co/papers/2212.02710,,,,7,0 "HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models","abdelrahman, eslam mohamed*; Sun, Pengzhan; shen, xiaoqian; Khan, Faizan Farooq; Li, Li Erran; Elhoseiny, Mohamed",poster,,,,,,,,, SegRCDB: Semantic Segmentation via Formula-Driven Supervised Learning,"Shinoda, Risa*; Hayamizu, Ryo; Nakashima, Kodai; Inoue, Nakamasa; Yokota, Rio; Kataoka, Hirokatsu",poster,,,,,,,,, LoTE-Animal: A Long Time-span Dataset for Endangered Animal Behavior Understanding,"Liu, Dan*; Hou, Jin; Huang, Shaoli; Liu, Jing; He, Yuxin; zheng, bochuan; Ning, Jifeng; Zhang, Jingdong",poster,,,,,,,,, Building3D: A Urban-Scale Dataset and Benchmarks for Learning Roof Structures from Point Clouds,"Wang, Ruisheng*; Huang, Shangfeng; Yang, Hongxin",poster,,,,,,,,, Lecture Presentations Multimodal Dataset: Towards Understanding Multimodality in Educational Videos,"Lee, Dong Won*; Ahuja, Chaitanya; Liang, Paul Pu; Morency, Louis-Philippe; Natu, Sanika",poster,,,,,,,,, Probabilistic Precision and Recall Towards Reliable Evaluation of Generative Models,"Park, Dogyun; Kim, Suhyun*",poster,,,,,,,,, EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding,"Zhu, Chenchen; Xiao, Fanyi; Alvarado, Jose A; babaei, yasmine; Hu, Jiabo; El-Mohri, Hichem; Culatana, Sean; Sumbaly, Roshan; Yan, Zhicheng*",poster,,,,,,,,, Contrastive Automatic Model Evaluation,"Peng, Ru; Duan, Qiuyang; Wang, Haobo; Ma, Jiachen; Jiang, Yanbo; 
Tu, Yongjun; Jiang, Xiu; Zhao, Junbo*",poster,,,,,,,,, Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception,"Pan, Xiaqing*; Charron, Nicholas; Yang, Yongqian; Peters, Scott C; Whelan, Thomas; Kong, Chen; Parkhi, Omkar M; Newcombe, Richard; Ren, Yuheng",poster,2306.06362,https://arxiv.org/abs/2306.06362,,https://huggingface.co/papers/2306.06362,,,,9,0 Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives,"Wu, Haoning*; Zhang, Erli; Liao, Liang; Chen, Chaofeng; Hou, Jingwen; Wang, Annan; Sun, Wenxiu; Yan, Qiong; Lin, Weisi",poster,2211.04894,https://arxiv.org/abs/2211.04894,https://github.com/VQAssessment/DOVER,https://huggingface.co/papers/2211.04894,,,,9,0 Going Beyond Nouns With Vision & Language Models Using Synthetic Data,"Cascante-Bonilla, Paola*; Shehada, Khaled; Smith, James S; Doveh, Sivan; Kim, Donghyun; Panda, Rameswar; Varol, Gul; Oliva, Aude; Ordonez, Vicente; Feris, Rogerio; Karlinsky, Leonid",poster,2303.17590,https://arxiv.org/abs/2303.17590,,https://huggingface.co/papers/2303.17590,,,,11,0 H3WB: Human3.6M 3D WholeBody Dataset and Benchmark,"Zhu, Yue*; Samet, Nermin; Picard, David",poster,2211.15692,https://arxiv.org/abs/2211.15692,https://github.com/wholebody3d/wholebody3d,https://huggingface.co/papers/2211.15692,,,,3,1 ZOD: A large-scale and diverse multimodal dataset for autonomous driving,"Alibeigi, Mina*; Ljungbergh, William; Tonderski, Adam; Hess, Georg; Lilja, Adam; Lindström, Carl; Motorniuk, Daria; Fu, Junsheng; Widahl, Jenny; Petersson, Christoffer",poster,,,,,,,,, CAD-Estate: Large-scale CAD Model Annotation in RGB Videos,"Maninis, Kevis-Kokitsi*; Popov, Stefan; Niessner, Matthias; Ferrari, Vittorio",poster,,,,,,,,, Neglected Free Lunch - Learning Image Classifiers Using Annotation Byproducts,"Han, Dongyoon; Choe, Junsuk; Chun, Seonghyeok; Chung, John JY; Chang, Minsuk; Yun, Sangdoo; Song, Jean Y; Oh, Seong Joon*",poster,,,,,,,,, Chaotic World: A Large and 
Challenging Benchmark for Human Behavior Understanding in Chaotic Events,"Ong, Kian Eng*; Ng, Xun Long; Ai, Wenjie; Li, Yanchao; Zhao, Kuangyi; Yeo, Si Yong; Liu, Jun",poster,,,,,,,,, MOSE: A New Dataset for Video Object Segmentation in Complex Scenes,"Ding, Henghui*; Liu, Chang; He, Shuting; Jiang, Xudong; Torr, Philip; Bai, Song",poster,2302.01872,https://arxiv.org/abs/2302.01872,,https://huggingface.co/papers/2302.01872,,,,6,0 Spurious Features Everywhere - Large-Scale Detection of Harmful Spurious Features in ImageNet,"Neuhaus, Yannic*; Augustin, Maximilian; Boreiko, Valentyn; Hein, Matthias",poster,,,,,,,,, Chop & Learn: Recognizing and Generating Object-State Compositions,"Saini, Nirat*; Wang, Hanyu; Gupta, Kamal; Swaminathan, Archana; Jayasundara, Vinoj; He, Bo; Shrivastava, Abhinav",poster,,,,,,,,, Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild,"Shao, Huiyang*; Xu, Qianqian; Wen, Peisong; Peifeng, Gao; Yang, Zhiyong; Huang, Qingming",poster,,,,,,,,, HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World,"Wang, Xin*; Kwon, Taein ; Rad, Mahdi; Pan, Bowen; Chakraborty, Ishani ; Andrist, Sean; Bohus, Dan; Feniello, Ashley N; Tekin, Bugra; Vieira Frujeri, Felipe; Joshi, Neel; Pollefeys, Marc",poster,,,,,,,,, SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling,"Yang, Zhitao*; Cai, Zhongang; Mei, Haiyi; Liu, Shuai; Chen, Zhaoxi; Xiao, Weiye; Wei, Yukun; Qing, Zhongfei; WEI, CHEN; Dai, Bo; Wu, Wayne; Qian, Chen; Lin, Dahua; Liu, Ziwei; Yang, Lei",poster,2303.17368,https://arxiv.org/abs/2303.17368,,https://huggingface.co/papers/2303.17368,,,,15,0 Humourous Image Captions (HIC): A Humour-oriented Image-text Dataset,"Li, Runjia; Sun, Shuyang*; Elhoseiny, Mohamed; Torr, Philip",poster,,,,,,,,, LaRS: A Diverse Panoptic Maritime Obstacle Detection Dataset and Benchmark,"Žust, Lojze*; Perš, Janez; Kristan, 
Matej",poster,2308.09618,https://arxiv.org/abs/2308.09618,,https://huggingface.co/papers/2308.09618,,,,3,0 Joint Metrics Matter: A Better Standard for Trajectory Forecasting,"Weng, Erica*; Hoshino, Hana; Ramanan, Deva; Kitani, Kris",poster,2305.06292,https://arxiv.org/abs/2305.06292,,https://huggingface.co/papers/2305.06292,,,,4,0 LPFF: A Portrait Dataset for Face Generators Across Large Poses,"Wu, Yiqian; Zhang, Jing; Fu, Hongbo ; Jin, Xiaogang*",poster,2303.14407,https://arxiv.org/abs/2303.14407,,https://huggingface.co/papers/2303.14407,,,,4,0 Replay: Multi-modal Multi-view Acted Videos for Casual Holography,"Shapovalov, Roman*; Kleiman, Yanir; Rocco, Ignacio; Novotny, David; Vedaldi, Andrea; Graham, Ben; Kokkinos, Filippos; Chen, Changan; Neverova, Natalia",poster,2307.12067,https://arxiv.org/abs/2307.12067,,https://huggingface.co/papers/2307.12067,,,,9,0 Human-centric Scene Understanding in 3D Large-scale Scenarios,"Xu, Yiteng; Cong, Peishan; Yao, Yichen; Chen, Runnan; HOU, Yuenan; Zhu, Xinge; He, Xuming; Yu, Jingyi; Ma, Yuexin*",poster,,,,,,,,, Pre-training Vision Transformers with Very Limited Synthesized Images,"Nakamura, Ryo*; Kataoka, Hirokatsu; Takashima, Sora; MARTINEZ-NORIEGA, Edgar Josafat; Yokota, Rio; Inoue, Nakamasa",poster,2307.14710,https://arxiv.org/abs/2307.14710,,https://huggingface.co/papers/2307.14710,,,,6,0 FACET: Fairness in Computer Vision Evaluation Benchmark,"Gustafson, Laura *; Rolland, Chloe; Ravi, Nikhila; Duval, Quentin; Adcock, Aaron; Fu, Cheng-Yang; Hall, Melissa; Ross, Candace",poster,,,,,,,,, EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes,"Yang, Jingyuan; Huang, Qirui; Ding, Tingting; Lischinski, Dani; Cohen-Or, Danny; Huang, Hui*",poster,2307.07961,https://arxiv.org/abs/2307.07961,,https://huggingface.co/papers/2307.07961,,,,6,0 RenderIH: A large-scale synthetic dataset for 3D interacting hand pose estimation,"Li, Lijun*; Tian, Linrui; Zhang, Xindi; Wang, Qi; Zhang, Bang; Liefeng, Bo; Liu, Mengyuan; Chen, 
Chen",poster,,,,,,,,, TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering,"Hu, Yushi*; Liu, Benlin; Kasai, Jungo; Wang, Yizhong; Ostendorf, Mari; Krishna, Ranjay; Smith, Noah A",poster,2303.11897,https://arxiv.org/abs/2303.11897,,https://huggingface.co/papers/2303.11897,,,,7,0 Exploring the Sim2Real Gap using Digital Twins,"Sudhakar, Sruthi*; Hanzelka, Jon; Bobillot, Josh D; Randhavane, Tanmay; Joshi, Neel; Vineet, Vibhav",poster,,,,,,,,, ClothesNet: An Information-Rich 3D Garment Model Repository with Simulated Clothes Environment,"Zhou, Bingyang*; Zhou, Haoyu; Liang, Tianhai; Yu, Qiaojun; Zhao, Siheng; Zeng, Yuwei; Lv, Jun; Luo, Siyuan; Wang, Qiancai; Yu, Xinyuan; Chen, Haonan; Lu, Cewu; Shao, Lin",poster,2308.09987,https://arxiv.org/abs/2308.09987,,https://huggingface.co/papers/2308.09987,,,,13,0 Video State-Changing Object Segmentation,"Yu, Jiangwei; Li, Xiang*; Zhao, Xinran; ZHANG, Hongming; Wang, Yu-Xiong",poster,,,,,,,,, PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking,"Liu, Xinran; liu, xiaoqiong; Yi, Ziruo; zhou, xin; Le, Thanh; Zhang, Libo; Huang, Yan; Yang, Qing; Fan, Heng*",poster,2303.07625,https://arxiv.org/abs/2303.07625,,https://huggingface.co/papers/2303.07625,,,,9,0 "AIDE: A Vision-Driven Multi-View, Multi-Modal, Multi-Tasking Dataset for Assistive Driving Perception","Yang, Dingkang*; huang, shuai; Xu, Zhi; Li, Zhenpeng; Wang, Shunli; Li, Mingcheng; Wang, Yuzheng; Liu, Yang; yang, kun; Chen, Zhaoyu; Wang, Yan; Liu, Jing; Zhang, Peixuan; Zhai, Peng; Zhang, Lihua",poster,2307.13933,https://arxiv.org/abs/2307.13933,https://github.com/ydk122024/AIDE,https://huggingface.co/papers/2307.13933,,,,15,0 Generalization-Reinforced Semi-Supervised Learning for Glaucoma Assessment and a Multimodal and Multitask Dataset,"Luo, Yan*; Shi, Min; Tian, Yu; Elze, Tobias; Wang, Mengyu",poster,,,,,,,,, ARNOLD: A Benchmark for Language-Grounded Task Learning with Continuous States in Realistic 3D 
Scenes,"Gong, Ran; Huang, Jiangyong; Zhao, Yizhou; Geng, Haoran; Gao, Xiaofeng; Wu, Qingyang; Ai, Wensi; Ziheng, Zhou; Terzopoulos, Demetri; Zhu, Song-Chun; Jia, Baoxiong; Huang, Siyuan*",poster,2304.04321,https://arxiv.org/abs/2304.04321,,https://huggingface.co/papers/2304.04321,,,,12,0 "FishNet: A Large-scale Dataset and Benchmark for Fish Recognition, Detection, and Functional Traits Prediction","Khan, Faizan Farooq*; Li, Xiang; Temple, Andrew J; Elhoseiny, Mohamed",poster,,,,,,,,, Towards Content-based Pixel Retrieval in Revisited Oxford and Paris,"An, Guoyuan*; Kim, Woo Jae; Yang, Saelyne; Li, Rong; Huo, Yuchi; Yoon, Sungeui",poster,,,,,,,,, BEAR: A BEnchmark on video Action Recognition,"Deng, Andong*; Yang, Taojiannan; Chen, Chen",poster,,,,,,,,, SQAD: Automatic Smartphone Camera Quality Assessment and Benchmarking,"Fang, Zilin; Ignatov, Andrey; Zamfir, Eduard; Timofte, Radu*",poster,,,,,,,,, Revisiting Scene Text Recognition: A Data Perspective,"Jiang, Qing*; Wang, Jiapeng; Peng, Dezhi; Liu, Chongyu; Jin, Lianwen ",poster,2307.08723,https://arxiv.org/abs/2307.08723,,https://huggingface.co/papers/2307.08723,,,,5,0 Will Large-scale Generative Models Corrupt Future Datasets?,"Hataya, Ryuichiro*; Bao, Han; Arai, Hiromi",poster,2211.08095,https://arxiv.org/abs/2211.08095,https://github.com/moskomule/dataset-contamination,https://huggingface.co/papers/2211.08095,,,,3,0 360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking,"Huang, Huajian*; Xu, Yinzhe; Chen, Yingshu; Yeung, Sai-Kit",poster,2307.14630,https://arxiv.org/abs/2307.14630,,https://huggingface.co/papers/2307.14630,,,,4,0 DeePoint: Pointing Recognition and Direction Estimation From A Fixed View,"Nakamura, Shu; Kawanishi, Yasutomo; Nobuhara, Shohei*; Nishino, Ko",poster,2304.06977,https://arxiv.org/abs/2304.06977,,https://huggingface.co/papers/2304.06977,,,,4,0 Contactless Pulse Estimation Leveraging Pseudo Labels and Self-Supervision,"Li, Zhihua*; Yin, Lijun",poster,,,,,,,,, Most 
Important Person-guided Dual-branch Cross-Patch Attention for Group Affect Recognition,"Xie, Hongxia*; Lee, Ming-Xian; Chen, Tzu Jui; Chen, Hung-Jen; Liu, Hou-I; Shuai, Hong-Han; Cheng, Wen-Huang",poster,,,,,,,,, Object-centric Contact Field for Grasp Generation,"Liu, Shaowei*; Zhou, Yang; Yang, Jimei; Gupta, Saurabh; Wang, Shenlong",poster,,,,,,,,, Imitator: Personalized Speech-driven 3D Facial Animation,"Thambiraja, Balamurugan*; Habibie, Ikhsanul; Aliakbarian, Sadegh; Cosker, Darren P; Theobalt, Christian; Thies, Justus",poster,2301.00023,https://arxiv.org/abs/2301.00023,,https://huggingface.co/papers/2301.00023,,,,6,0 DVGaze: Dual-view Gaze Estimation,"Cheng, Yihua; Lu, Feng*",poster,2308.10310,https://arxiv.org/abs/2308.10310,https://github.com/yihuacheng/DVGaze,https://huggingface.co/papers/2308.10310,,,,2,0 TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective,"Dan, Jun*; Liu, Yang; Xie, Haoyu; Deng, Jiankang; xie, haoran; Xie, Xuansong; Sun, Baigui",poster,2308.10133,https://arxiv.org/abs/2308.10133,https://github.com/DanJun6737/TransFace,https://huggingface.co/papers/2308.10133,,,,7,0 Towards Unsupervised Domain Generalization for Face Anti-Spoofing,"Liu, Yuchen*; Chen, Yabo; Gou, Mengran; Huang, Chun-Ting; Wang, Yaoming; Dai, Wenrui; Xiong, Hongkai",poster,,,,,,,,, Reinforced Disentanglement for Face Swapping without Skip Connection,"ren, xiaohang*; Chen, Xingyu; Yao, Pengfei; Shum, Heung-Yeung; Wang, Baoyuan",poster,2307.07928,https://arxiv.org/abs/2307.07928,,https://huggingface.co/papers/2307.07928,,,,5,0 CoSign: Exploring Co-occurrence Signals in Skeleton-based Continuous Sign Language Recognition,"Jiao, Peiqi; Min, Yuecong; Li, Yanan; Xiaotao, Wang; LEI, LEI; Chen, Xilin*",poster,,,,,,,,, EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation,"Peng, Ziqiao; Wu, Haoyu ; Song, Zhenbo; Xu, Hao; Zhu, Xiangyu; He, Jun; Liu, Hongyan; Fan, 
Zhaoxin*",poster,2303.11089,https://arxiv.org/abs/2303.11089,,https://huggingface.co/papers/2303.11089,,,,8,0 LA-Net: Landmark-Aware Learning for Reliable Facial Expression Recognition under Label Noise,"Wu, Zhiyu*; Cui, Jinshi",poster,,,,,,,,, ASM: Adaptive Skinning Model for High-Quality 3D Face Modeling,"Yang, Kai; Shang, Hong*; Shi, Tianyang; Chen, Xinghan; Zhou, Jingkai; Sun, Zhongqian; Yang, Wei",poster,2304.09423,https://arxiv.org/abs/2304.09423,,https://huggingface.co/papers/2304.09423,,,,7,0 Troubleshooting Ethnic Quality Bias with Curriculum Domain Adaptation for Face Image Quality Assessment,"Ou, Fu-Zhao*; Chen, Baoliang; Li, Chongyi; Wang, Shiqi ; Kwong, Sam",poster,,,,,,,,, UniFace: Unified Cross-Entropy Loss for Deep Face Recognition,"Zhou, Jiancan; Jia, Xi; Li, Qiufu; Shen, Linlin*; Duan, Jinming",poster,,,,,,,,, Human Part-wise 3D Motion Context Learning for Sign Language Recognition,"Lee, Taeryung; Oh, Yeonguk; Lee, Kyoung Mu*",poster,2308.09305,https://arxiv.org/abs/2308.09305,,https://huggingface.co/papers/2308.09305,,,,3,0 Weakly-Supervised Text-driven Contrastive Learning for Facial Behavior Understanding,"Zhang, Xiang*; Wang, Taoyue; Li, Xiaotian; Yang, Huiyuan; Yin, Lijun",poster,2304.00058,https://arxiv.org/abs/2304.00058,,https://huggingface.co/papers/2304.00058,,,,5,0 HaMuCo: Hand Pose Estimation via Multiview Collaborative Self-Supervised Learning,"Zheng, Xiaozheng*; Wen, Chao; Xue, Zhou; Ren, Pengfei; Wang, Jingyu",poster,2302.00988,https://arxiv.org/abs/2302.00988,,https://huggingface.co/papers/2302.00988,,,,5,0 ReactioNet: Learning High-order Facial Behavior from Universal Stimulus-Reaction by Dyadic Relation Reasoning,"Li, Xiaotian*; Wang, Taoyue; Zhao, Geran; Zhang, Xiang; Kang, Xi; Yin, Lijun",poster,,,,,,,,, CLIP-Cluster: CLIP-Guided Attribute Hallucination for Face Clustering,"Shen, Shuai; Li, Wanhua; Wang, Xiaobing; zhang, dafeng; Jin, Zhezhu; Zhou, Jie; Lu, Jiwen*",poster,,,,,,,,, Learning Human Dynamics in Autonomous Driving 
Scenarios,"Wang, Jingbo*; Yuan, Ye; Luo, Zhengyi; Xie, Kevin; Lin, Dahua; Iqbal, Umar; Fidler, Sanja; Khamis, Sameh",poster,,,,,,,,, LivelySpeaker: Towards Semantic-aware Co-Speech Gesture Generation,"Zhi, Yihao*; Cun, Xiaodong; Chen, Xuelin; Shen, Xi; GUO, Wen; Huang, Shaoli; Gao, Shenghua",poster,,,,,,,,, Controllable Guide-Space for Generalizable Face Forgery Detection,"Guo, Ying*; Zhen, Cheng; Yan, Pengfei",poster,2307.14039,https://arxiv.org/abs/2307.14039,,https://huggingface.co/papers/2307.14039,,,,3,0 Unpaired Multi-domain Attribute Translation of 3D Facial Shapes with a Square and Symmetric Geometric Map,"Fan, Zhenfeng*; zhang, zhiheng; Yang, Shuang; Zhong, Chongyang; min, cao; Xia, Shihong",poster,2308.13245,https://arxiv.org/abs/2308.13245,https://github.com/NaughtyZZ/3D_facial_shape_attribute_translation_ssgmap,https://huggingface.co/papers/2308.13245,,,,6,0 Emotional Listener Portrait: Neural Listener Head Generation with Emotion,"Song, Luchuan*; Yin, Guojun; Jin, Zhenchao; Dong, Xiaoyi; Xu, Chenliang",poster,,,,,,,,, Steered Diffusion: Diffusion Models Can Perform Zero-Shot Conditional Generation,"Gopalakrishnan Nair, Nithin*; Cherian, Anoop; Lohit, Suhas; Wang, Ye; Koike-Akino, Toshiaki; Patel, Vishal; Marks, Tim K",poster,,,,,,,,, Invariant Feature Regularization for Fair Face Recognition,"Ma, Jiali*; Yue, Zhongqi; Kagaya, Tomoyuki; SUZUKI, TOMOKI; Jayashree, Karlekar; Pranata, Sugiri; Zhang, Hanwang",poster,,,,,,,,, Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining,"Zhou, Benjia; Chen, Zhigang; Clapés, Albert; Wan, Jun*; Liang, Yanyan; Escalera, Sergio; Lei, Zhen; Zhang, Du",poster,2307.14768,https://arxiv.org/abs/2307.14768,https://github.com/zhoubenjia/GFSLT-VLP,https://huggingface.co/papers/2307.14768,,,,8,0 Contrastive Pseudo Learning for Open-world Deepfake Attribution,"Sun, Zhimin*; Chen, Shen; Yao, Taiping; YIN, BANGJIE; Yi, Ran; Ding, Shouhong; Ma, Lizhuang",poster,,,,,,,,, Continual Learning for Personalized 
Co-speech Gesture Generation,"Ahuja, Chaitanya*; Joshi, Pratik; Ishii, Ryo; Morency, Louis-Philippe",poster,,,,,,,,, HandR2N2: Iterative 3D Hand Pose Estimation Using a Residual Recurrent Neural Network,"CHENG, WENCAN; Ko, Jong Hwan*",poster,,,,,,,,, SPACE: Speech-driven Portrait Animation with Controllable Expression,"Gururani, Siddharth*; Mallya, Arun; Wang, Ting-Chun; Valle, Rafael; Liu, Ming-Yu",poster,2211.09809,https://arxiv.org/abs/2211.09809,,https://huggingface.co/papers/2211.09809,,,,5,0 How to Boost Face Recognition with StyleGAN?,"Sevastopolskiy, Artem*; Malkov, Yury A.; Durasov, Nikita; Verdoliva, Luisa; Niessner, Matthias",poster,2210.10090,https://arxiv.org/abs/2210.10090,https://github.com/seva100/stylegan-for-facerec,https://huggingface.co/papers/2210.10090,,,,5,0 ChildPlay: A New Benchmark for Understanding Children’s Gaze Behaviour,"Tafasca, Samy; Gupta, Anshul*; ODOBEZ, Jean-Marc",poster,,,,,,,,, Robust One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2,"Oorloff, Trevine S J*; Yacoob, Yaser",poster,,,,,,,,, Data-Free Class-Incremental Hand Gesture Recognition,"Aich, Shubhra*; Ruiz-Santaquiteria, Jesus; Garg, Prachi; Lu, Zhenyu; K J, Joseph; Fernandez Garcia, Alvaro; Kin, Kenrick; Wan, Chengde; N Balasubramanian, Vineeth; Camgoz, Necati Cihan; Ma, Shugao; de la Torre, Fernando",poster,,,,,,,,, Learning Robust Representations with Information Bottleneck and Memory Network for RGB-D-based Gesture Recognition,"Li, Yunan*; Chen, Huizhou; Feng, Guanwen; Miao, Qiguang",poster,,,,,,,,, Knowledge-Spreader: Learning Facial Action Dynamics from Single Label Clips via Progressive Knowledge Distillation,"Li, Xiaotian*; Zhang, Xiang; Wang, Taoyue; Yin, Lijun",poster,,,,,,,,, Face Clustering via Graph Convolutional Networks with Confidence Edges,"Wu, Yang; Ge, Zhiwei; Luo, Yuhao*; Liu, Lin; Xu, Sulong",poster,,,,,,,,, StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces ,"Yang, Shuai*; Jiang, Liming; Liu, Ziwei; Loy, Chen 
Change",poster,,,,,,,,, SeeABLE: Soft Discrepancies and Bounded Contrastive Learning for Exposing Deepfakes,"Larue, nicolas*; Vu, Ngoc-Son; Struc, Vitomir; Peer, Peter; Christophides, Vassilis",poster,2211.11296,https://arxiv.org/abs/2211.11296,,https://huggingface.co/papers/2211.11296,,,,5,0 Adaptive Nonlinear Latent Transformation for Conditional Face Editing,"Huang, Zhizhong*; Ma, Siteng; Zhang, Junping; Shan, Hongming",poster,2307.07790,https://arxiv.org/abs/2307.07790,https://github.com/Hzzone/AdaTrans,https://huggingface.co/papers/2307.07790,,,,4,0 Semi-supervised Speech-driven 3D Facial Animation via Cross-modal Encoding,"Yang, Peiji; Wei, Huawei*; Zhong, Yicheng; Wang, Zhisheng",poster,,,,,,,,, ICD-Face: Intra-class Compactness Distillation for Face Recognition,"yu, zhipeng; Liu, Jiaheng*; Qin, Haoyu; Wu, Yichao; Hu, Kun; Tian, Jiayi; Liang, Ding",poster,,,,,,,,, C$^2$ST: Cross-modal Contextualized Sequence Transduction for Continuous Sign Language Recognition,"Zhang, Huaiwen*; guo, zihang; Yang, Yang; Liu, Xin; Hu, De",poster,,,,,,,,, CO-PILOT: Dynamic Top-Down Point Cloud with Conditional Neighborhood Aggregation for Multi-Gigapixel Histopathology Image Representation,"Ebrahim Nakhli, Ramin*; Zhang, Allen W; Khajegili Mirabadi, Ali; Rich, Katherine P; Asadi, Maryam; Gilks, C Blake; Farahani, Hossein; Bashashati, Ali",poster,,,,,,,,, SKiT: a faSt Key information video Transformer for online surgical phase recognition,"Liu, Yang*; Huo, Jiayu; Peng, Jingjing; Sparks, Rachel; Dasgupta, Prokar; Granados, Alejandro; Ourselin, Sebastien",poster,,,,,,,,, XNet: Wavelet-Based Low and High Frequency Fusion Networks for Fully- and Semi-Supervised Semantic Segmentation of Biomedical Images,"Zhou, Yanfeng; jiaxing, Huang; Wang, Chenlong; Song, Le; Yang, Ge*",poster,,,,,,,,, Probabilistic Modeling of Inter- and Intra-observer Variability in Medical Image Segmentation,"Schmidt, Arne*; Morales-Alvarez, Pablo; Molina, Rafael",poster,,,,,,,,, Learning Cross-Representation 
Affinity Consistency for Sparsely Supervised Biomedical Instance Segmentation,"Liu, Xiaoyu; Huang, Wei; Xiong, Zhiwei*; Zhou, Shenglong; Zhang, Yueyi; Chen, Xuejin; Zha, Zheng-Jun; Wu, Feng",poster,,,,,,,,, Dual Meta-Learning with Longitudinally Consistent Regularization for One-Shot Brain Tissue Segmentation Across the Human Lifespan,"Sun, Yongheng; Wang, Fan; Shu, Jun; Wang, Haifeng; Wang, Li; Meng, Deyu; Lian, Chunfeng*",poster,,,,,,,,, BlindHarmony: “Blind” Harmonization for MR Images via Flow model,"Jeong, Hwihun*; Byun , Heejoon; Kang, Dong Un; Lee, Jongho",poster,,,,,,,,, "Continual Segment: Towards a Single, Unified and Non-forgetting Continual Segmentation Model of 143 Whole-body Organs in CT Scans","Ji, Zhanghexuan; Guo, Dazhou*; Wang, Puyang; Yan, Ke; Lu, Le; Xu, Minfeng; Wang, Qifeng; Ge, Jia; Gao, Mingchen; Ye, Xianghua; Jin, Dakai",poster,,,,,,,,, CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection,"liu, jie; Zhang, Yixiao; Chen, Jieneng; Xiao, Junfei; Lu, Yongyi; Landman, Bennett A; Yuan, Yixuan; Yuille, Alan; Tang, Yucheng; Zhou, Zongwei*",poster,2301.00785,https://arxiv.org/abs/2301.00785,,https://huggingface.co/papers/2301.00785,,,,10,0 LIMITR: Leveraging Local Information for Medical Image-Text Representation,"Dawidowicz, Gefen*; Hirsch, Elad; Tal, Ayellet",poster,2303.11755,https://arxiv.org/abs/2303.11755,,https://huggingface.co/papers/2303.11755,,,,3,0 Taxonomy Adaptive Cross-Domain Adaptation in Medical Imaging via Optimization Trajectory Distillation,"Fan, Jianan*; Liu, Dongnan; Chang, Hang; Huang, Heng; Chen, Mei; Cai, Weidong",poster,2307.14709,https://arxiv.org/abs/2307.14709,,https://huggingface.co/papers/2307.14709,,,,6,0 CuNeRF: Cube-Based Neural Radiance Field for Zero-Shot Medical Image Arbitrary-Scale Super Resolution,"Chen, Zixuan; Yang, Lingxiao; Lai, Jian-Huang; Xie, Xiaohua*",poster,2303.16242,https://arxiv.org/abs/2303.16242,,https://huggingface.co/papers/2303.16242,,,,4,0 Learning to Distill Global 
Representation for Sparse-View CT,"Li, ZiLong; Ma, Chenglong; Chen, Jie; Zhang, Junping; Shan, Hongming*",poster,2308.08463,https://arxiv.org/abs/2308.08463,https://github.com/longzilicart/GloReDi,https://huggingface.co/papers/2308.08463,,,,5,0 Preserving Tumor Volumes for Unsupervised Medical Image Registration,"Dong, Qihua*; Du, Hao; Song, Ying; Xu, Yan; Liao, Jing",poster,,,,,,,,, uSplit: image decomposition for fluorescence microscopy,"Ashesh, Ashesh*; Krull, Alexander; di sante, moises; Pasqualini, Francesco; Jug, Florian",poster,,,,,,,,, Rethinking Multi-Contrast MRI Super-Resolution: Rectangle-Window Cross-Attention Transformer and Arbitrary-Scale Upsampling,"Li, Guangyuan*; Zhao, Lei; Sun, Jiakai; Lan, Zehua; Zhang, Zhanjie; Chen, Jiafu; Lin, Zhijie; Lin, Huaizhong; Xing, Wei",poster,,,,,,,,, Multimodal Optimal Transport-based Co-Attention Transformer with Global Structure Consistency for Survival Prediction,"XU, Yingxue*; Chen, Hao",poster,2306.08330,https://arxiv.org/abs/2306.08330,,https://huggingface.co/papers/2306.08330,,,,2,0 4D Myocardium Reconstruction with Decoupled Motion and Shape Model,"Yuan, Xiaohan; Liu, Cong; Wang, Yangang*",poster,2308.14083,https://arxiv.org/abs/2308.14083,,https://huggingface.co/papers/2308.14083,,,,3,0 Unsupervised Learning of Object-Centric Embeddings for Cell Instance Segmentation in Microscopy Images,"Wolf, Steffen; Lalit, Manan; McDole, Katie; Funke, Jan*",poster,,,,,,,,, LightDepth: Single-View Depth Self-Supervision from Illumination Decline,"Rodriguez-Puigvert, Javier*; Batlle, Víctor M.; Montiel, J. M. 
M.; Martinez-Cantin, Ruben; Fua, Pascal; Tardós, Juan D.; Civera, Javier",poster,2308.10525,https://arxiv.org/abs/2308.10525,,https://huggingface.co/papers/2308.10525,,,,7,0 BoMD: Bag of Multi-label Local Descriptors for Noisy Chest X-ray Classification,"Chen, Yuanhong*; Liu, Fengbei; Wang, Hu; Wang, Chong; Tian, Yu; liu, yuyuan; Carneiro, Gustavo",poster,,,,,,,,, Decomposition-Based Variational Network for Multi-Contrast MRI Super-Resolution and Reconstruction,"Lei, Pengcheng*; Fang, Faming; Zhang, Guixu; Zeng, Tieyong",poster,,,,,,,,, TopoSeg: Topology-Aware Nuclear Instance Segmentation,"He, Hongliang*; Wang, Jun; Wei, Pengxu; Xu, Fan; Ji, Xiangyang; Liu, Chang; Chen, Jie",poster,,,,,,,,, Scratch Each Other's Back: Incomplete Multi-modal Brain Tumor Segmentation Via Category Aware Group Self-Support Learning,"Qiu, YanSheng; Chen, Delin; Yao, Hongdou; Xu, Yongchao; Wang, Zheng*",poster,,,,,,,,, "Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans","Chen, Jieneng*; Xia, Yingda; Yao, Jiawen ; Yan, Ke; Zhang, Jianpeng; Lu, Le; Wang, Fakai; Zhou, Bo; Qiu, Mingyan; Yu, Qihang; Yuan, Mingze; Fang, Wei; Tang, Yuxing; Xu, Minfeng; Zhou, Jian; Zhao, Yuqian; Wang, Qifeng; Ye, Xianghua; Yin, Xiaoli; Shi, Yu; Chen, Xin; Yuille, Alan; Liu, Zaiyi; Zhang, Ling",poster,2301.12291,https://arxiv.org/abs/2301.12291,,https://huggingface.co/papers/2301.12291,,,,25,1 Gram-based Attentive Neural Ordinary Differential Equations Network for Video Nystagmography Classification,"Qiu, Xihe*; Shi, Shaojie; Tan, Xiaoyu; Qu, Chao; Fang, Zhijun; Wang, Hailing; Gao, Yongbin; Wu, Peixia; Li, Huawei",poster,,,,,,,,, ConSlide: Asynchronous Hierarchical Interaction Transformer with Breakup-Reorganize Rehearsal for Continual Whole Slide Image Analysis,"Huang, Yanyan*; Zhao, Weiqin; Wang, Shujun; Fu, Yu; Jiang, Yuming; Yu, 
Lequan",poster,2308.13324,https://arxiv.org/abs/2308.13324,,https://huggingface.co/papers/2308.13324,,,,6,0 PRIOR: Prototype Representation Joint Learning from Medical Images and Reports,"Cheng, Pujin; Lin, Li; Lyu, Junyan; Huang, Yijin; Luo, Wenhan; Tang, Xiaoying*",poster,2307.12577,https://arxiv.org/abs/2307.12577,https://github.com/QtacierP/PRIOR,https://huggingface.co/papers/2307.12577,,,,6,0 MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for X-ray Diagnosis,"Wu, Chaoyi*; Zhang, Xiaoman; Zhang, Ya; Wang, Yan-Feng; Xie, Weidi",poster,,,,,,,,, Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection,"Huang, Junjia; Li, Haofeng; Wan, Xiang; Li, Guanbin*",poster,,,,,,,,, A differentiable skeletonization algorithm,"Menten, Martin J*; Paetzold, Johannes C.; Zimmer, Veronika A; Shit, Suprosanna; Ezhov, Ivan; Holland, Robbie M; Probst, Monika; Schnabel, Julia A; Rueckert, Daniel",poster,,,,,,,,, Improving Representation Learning for Histopathologic Images with Cluster Constraints,"Wu, Weiyi*; Gao, Chongyang; DiPalma, Joseph; Vosoughi, Soroush; Hassanpour, Saeed",poster,,,,,,,,, Enhancing Modality-Agnostic Representations via Meta-Learning for Brain Tumor Segmentation,"Konwer, Aishik*; Hu, Xiaoling; Bae, Joseph; Xu, Xuan; Chen, Chao; Prasanna, Prateek",poster,2302.04308,https://arxiv.org/abs/2302.04308,,https://huggingface.co/papers/2302.04308,,,,6,0 CauSSL: Causality-inspired Semi-supervised Learning for Medical Image Segmentation,"Miao, Juzheng; Liu, Furui*; Chen, Cheng; Wei, Hao; Heng, Pheng-Ann",poster,,,,,,,,, UniverSeg: Universal Medical Image Segmentation,"Butoi, Victor I*; Gonzalez Ortiz, Jose Javier; Ma, Tianyu ; Sabuncu, Mert; Guttag, John; Dalca, Adrian V",poster,2304.06131,https://arxiv.org/abs/2304.06131,,https://huggingface.co/papers/2304.06131,,,,6,3 MRM: Masked Relation Modeling for Medical Image Pre-Training with Genetics,"YANG, QIUSHI*; Li, Wuyang; Li, Baopu; Yuan, Yixuan",poster,,,,,,,,, "Boosting Whole Slide Image 
Classification from the Perspectives of Distribution, Correlation and Magnification","Qu, Linhao*; Yang, Zhiwei; Duan, Minghong; Ma, Yingfan; Wang, Shuo; Wang, Manning; Song, Zhijian",poster,,,,,,,,, Adaptive Template Transformer for Mitochondria Segmentation in Electron Microscopy Images,"Pan, Yuwen*; Luo, Naisong; Sun, Rui; Meng, Meng; Zhang, Tianzhu; Xiong, Zhiwei; Zhang, Yongdong",poster,,,,,,,,, Cross-Modal Translation and Alignment for Survival Analysis,"Zhou, Fengtao*; Chen, Hao",poster,,,,,,,,, LNPL-MIL: Learning from Noisy Pseudo Labels for Promoting Multiple Instance Learning in Whole Slide Image,"Shao, Zhuchen*; Wang, Yifeng; Chen, Yang; Bian, Hao; Liu, Shaohui; Wang, Haoqian; Zhang, Yongbing",poster,,,,,,,,, Generalized Few-Shot Point Cloud Segmentation Via Geometric Words,"Xu, Yating*; Hu, Conghui; Lee, Gim Hee; Zhao, Na",poster,,,,,,,,, Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer,"Shi, Yujiao*; wu, fei; Vora, Ankit; Perincherry, Akhil; LI, HONGDONG",poster,,,,,,,,, EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization,"Kim, Minjung*; Koo, Junseo; Kim, Gunhee",poster,,,,,,,,, Multi-task View Synthesis with Neural Radiance Fields,"Zheng, Shuhong; Bao, Zhipeng*; Hebert, Martial; Wang, Yu-Xiong",poster,,,,,,,,, Multi-Task Learning with Knowledge Distillation for Dense Prediction,"Xu, Yangyang; Yang, Yibo; Zhang, Lefei*",poster,,,,,,,,, Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World,"Yu, Qifan*; Li, Juncheng; Wu, Yu; Tang, Siliang; Ji, Wei; Zhuang, Yueting",poster,2303.13233,https://arxiv.org/abs/2303.13233,,https://huggingface.co/papers/2303.13233,,,,6,0 CMDA: Cross-Modality Domain Adaptation for Nighttime Semantic Segmentation,"Xia, Ruihao; Zhao, Chaoqiang; Zheng, Meng; Wu, Ziyan; Sun, Qiyu; Tang, 
Yang*",poster,2307.15942,https://arxiv.org/abs/2307.15942,https://github.com/XiaRho/CMDA,https://huggingface.co/papers/2307.15942,,,,6,0 VQA-GNN: Fusing Multimodal Knowledge via Graph Neural Networks for Visual Question Answering,"Wang, Yanan*; Yasunaga, Michihiro; Ren, Hongyu; Wada, Shinya; Leskovec, Jure",poster,,,,,,,,, Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement,"Wei, Zhixiang*; Chen, Lin; Tu, Tao; Ling, Pengyang; Chen, Huaian; Jin, Yi",poster,,,,,,,,, Visual Traffic Knowledge Graph Generation from Scene Images,"Guo, Yunfei*; yin, Fei; Li, Xiao-Hui; YAN, XUDONG; XUE, TAO; mei, shuqi; Liu, Cheng-Lin",poster,,,,,,,,, Agglomerative Transformer for Human-Object Interaction Detection,"Tu, Danyang*; Sun, Wei; Zhai, Guangtao; Shen, Wei",poster,2308.08370,https://arxiv.org/abs/2308.08370,,https://huggingface.co/papers/2308.08370,,,,4,0 3D Neural Embedding Likelihood for Robust Probabilistic Inverse Graphics,"Zhou, Guangyao*; Gothoskar, Nishad; Wang, Lirui; Tenenbaum, Joshua; Gutfreund, Dan; Lázaro-Gredilla, Miguel; George, Dileep; Mansinghka, Vikash",poster,2302.03744,https://arxiv.org/abs/2302.03744,,https://huggingface.co/papers/2302.03744,,,,8,0 HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation,"Zhou, Zijian*; Shi, Miaojing; Caesar, Holger",poster,2303.15994,https://arxiv.org/abs/2303.15994,https://github.com/franciszzj/HiLo,https://huggingface.co/papers/2303.15994,,,,3,1 SRLIP: Fast Scaling of Relational Language-Image Pre-training,"Yuan, Hangjie*; Zhang, Shiwei; Wang, Xiang; Albanie, Samuel; Pan, Yining; Feng, Tao; Jiang, Jianwen; Ni, Dong; Zhang, Yingya; Zhao, Deli",poster,,,,,,,,, UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase,"Liu, Youquan*; Chen, Runnan; Li, Xin; Kong, Lingdong; Yang, Yuchen; Xia, Zhaoyang; Bai, Yeqi; Zhu, Xinge; Ma, Yuexin; Li, Yikang; HOU, Yuenan; Qiao, Yu",poster,,,,,,,,, See More and Know More: Zero-shot Point Cloud 
Segmentation via Multi-modal Visual Data,"Lu, Yuhang*; Jiang, Qi; Chen, Runnan; HOU, Yuenan; Zhu, Xinge; Ma, Yuexin",poster,2307.10782,https://arxiv.org/abs/2307.10782,,https://huggingface.co/papers/2307.10782,,,,6,0 Compositional Feature Augmentation for Unbiased Scene Graph Generation,"Li, Lin; Chen, Guikun; Xiao, Jun; Yang, Yi; Wang, Chunping; Chen, Long*",poster,2308.06712,https://arxiv.org/abs/2308.06712,,https://huggingface.co/papers/2308.06712,,,,6,0 Multi-weather Image Restoration via Domain Translation,"Patil, Prashant W*; Gupta, Sunil; Rana, Santu; Venkatesh, Svetha; Murala, Subrahmanyam",poster,,,,,,,,, CLIPTER: Looking at the Bigger Picture in Scene Text Recognition,"Aberdam, Aviad*; Bensaid, David H; Ganz, Roy; Nuriel, Oren; Golts, Alona; Mazor, Shai; Tichauer, Royee; Litman, Ron",poster,2301.07464,https://arxiv.org/abs/2301.07464,,https://huggingface.co/papers/2301.07464,,,,8,0 Towards Models that Can See and Read,"Ganz, Roy*; Nuriel, Oren; Aberdam, Aviad; Kittenplon, Yair; Litman, Ron; Mazor, Shai",poster,2301.07389,https://arxiv.org/abs/2301.07389,,https://huggingface.co/papers/2301.07389,,,,6,0 SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving,"Wei, Yi*; Zhao, Linqing; Zheng, Wenzhao; Zhu, Zheng; Zhou, Jie; Lu, Jiwen",poster,2303.09551,https://arxiv.org/abs/2303.09551,https://github.com/weiyithu/SurroundOcc,https://huggingface.co/papers/2303.09551,,,,6,0 DDP: Diffusion Model for Dense Visual Prediction,"Ji, Yuanfeng; Chen, Zhe; Xie, Enze*; Hong, Lanqing; Liu, Xihui; Liu, Zhaoqiang; Lu, Tong; Li, Zhenguo; Luo, Ping",poster,2303.17559,https://arxiv.org/abs/2303.17559,,https://huggingface.co/papers/2303.17559,,,,9,0 Understanding 3D Object Interaction from a Single Image,"Qian, Shengyi*; Fouhey, David",poster,2305.09664,https://arxiv.org/abs/2305.09664,,https://huggingface.co/papers/2305.09664,,,,2,1 ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces,"Wu, Qianyi*; Wang, Kaisiyuan; Li, Kejie; Zheng, Jianmin; Cai, 
Jianfei",poster,2308.07868,https://arxiv.org/abs/2308.07868,,https://huggingface.co/papers/2308.07868,,,,5,1 Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors,"Zhong, Yuanyi*; Bhattad, Anand; Wang, Yu-Xiong; Forsyth, David",poster,,,,,,,,, CrossMatch: Source-Free Domain Adaptive Semantic Segmentation via Cross-Modal Consistency Training,"Yin, Yifang*; Hu, Wenmiao; Liu, Zhenguang; Wang, Guanfeng; Xiang, Shili; Zimmermann, Roger",poster,,,,,,,,, Semantic Attention Flow Fields for Monocular Dynamic Scene Decomposition,"Liang, Yiqing; Laidlaw, Eliot; Meyerowitz, Alexander; Sridhar, Srinath; Tompkin, James*",poster,,,,,,,,, Holistic Geometric Feature Learning for Structured Reconstruction,"Lu, Ziqiong; Huan, Linxi; Ma, Qiyuan; Zheng, Xianwei*",poster,,,,,,,,, Scalable Multi-Temporal Remote Sensing Change Data Generation via Simulating Stochastic Change Process,"Zheng, Zhuo*; Tian, Shiqi; Ma, Ailong; Zhang, Liangpei; Zhong, Yanfei",poster,,,,,,,,, TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts,"Ye, Hanrong*; Xu, Dan",poster,2307.15324,https://arxiv.org/abs/2307.15324,https://github.com/prismformore/Multi-Task-Transformer,https://huggingface.co/papers/2307.15324,,,,2,0 Delegate Transformer for Image Color Aesthetics Assessment,"He, Shuai; Ming, Anlong*; Li, Yaqi; Sun, Jinyuan; Zheng, ShunTian; Ma, Huadong",poster,,,,,,,,, STEERER: Resolving Scale Variations via Selective Inheritance Learning,"Han, Tao; Bai, Lei*; Liu, Lingbo; Ouyang, Wanli",poster,,,,,,,,, Object-aware Gaze Target Detection,"Tonini, Francesco*; Dall'Asen, Nicola; Beyan, Cigdem; Ricci, Elisa",poster,2307.09662,https://arxiv.org/abs/2307.09662,https://github.com/francescotonini/object-aware-gaze-target-detection,https://huggingface.co/papers/2307.09662,,,,4,0 Weakly Supervised Referring Image Segmentation with Intra-Chunk and Inter-Chunk Consistency,"Lee, Jungbeom*; LEE, SUNGJIN; Nam, Jinseok; Yu, Seunghak; Do, Jaeyoung; 
Taghavi, Tara",poster,,,,,,,,, Vision Relation Transformer for Unbiased Scene Graph Generation,"Sudhakaran, Gopika*; Dhami, Devendra S; Kersting, Kristian; Roth, Stefan",poster,2308.09472,https://arxiv.org/abs/2308.09472,,https://huggingface.co/papers/2308.09472,,,,4,0 DDIT: Semantic Scene Completion via Deformable Deep Implicit Templates,"Li, Haoang*; Dong, Jinhu; Wen, Binghui; Gao, Ming; Huang, Tianyu; Liu, Yunhui; Cremers, Daniel",poster,,,,,,,,, DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection,"Gao, Huan-ang; Tian, Beiwen; Li, Pengfei; Zhao, Hao*; Zhou, Guyue",poster,2304.13031,https://arxiv.org/abs/2304.13031,,https://huggingface.co/papers/2304.13031,,,,5,0 Shape Anchor Guided Holistic Indoor Scene Understanding,"Dong, Mingyue; Huan, Linxi; Xiong, Hanjiang; Shen, Shuhan; Zheng, Xianwei*",poster,,,,,,,,, SGAligner: 3D Scene Alignment with Scene Graphs,"Deb Sarkar, Sayan*; Miksik, Ondrej; Pollefeys, Marc; Barath, Daniel; Armeni, Iro",poster,,,,,,,,, Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation,"Wu, Jianzong; Li, Xiangtai*; Ding, Henghui; Li, Xia; Cheng, Guangliang; Tong, Yunhai; Loy, Chen Change",poster,2301.00805,https://arxiv.org/abs/2301.00805,,https://huggingface.co/papers/2301.00805,,,,7,0 SLAN: Self-Locator Aided Network for Vision-language Understanding,"Zhai, Jiang-Tian*; Zhang, Qi; Wu, Tong; Chen, Xingyu; Liu, Jiang-Jiang; Cheng, Ming-Ming",poster,,,,,,,,, Task-Oriented Multi-Modal Mutual Leaning for Vision-Language Models,"Long, Sifan*; Zhao, Zhen; Yuan, Junkun; Tan, Zichang; Liu, Jiang-Jiang; Zhou, Luping; Wang, Shengsheng; Wang, Jingdong",poster,2303.17169,https://arxiv.org/abs/2303.17169,,https://huggingface.co/papers/2303.17169,,,,8,1 TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance,"Wu, Kan*; Peng, Houwen; Zhou, Zhenghong; Xiao, Bin; Liu, Mengchen; Yuan, Lu; Xuan, Hong; Valenzuela, Michael L; Chen, Xi; Chao, Hongyang; Wang, Xinggang; Hu, 
Han",poster,,,,,,,,, In-Style: Unsupervised Text-Video Retrieval with Style Preservation,"Shvetsova, Nina*; Kukleva, Anna; Schiele, Bernt; Kuehne, Hilde",poster,,,,,,,,, Preserving Modality Structure Improves Multi-Modal Learning ,"Swetha, Sirnam*; Rizve, Mamshad Nayeem; Shvetsova, Nina; Kuehne, Hilde; Shah, Mubarak",poster,,,,,,,,, Distribution-Aware Prompt Tuning for Vision-Language Models,"Cho, Eulrang*; Kim, Jooyeon; Kim, Hyunwoo J",poster,,,,,,,,, SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection,"qin, yiran; Wang, Chaoqun; Kang, Zijian; MA, Ningning; Li, Zhen; Zhang, Ruimao*",poster,,,,,,,,, Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning,"Wang, Yuanzhi*; Cui, Zhen; Li, Yong",poster,,,,,,,,, Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model,"Wang, Yin*; Leng, Zhiying; Li, Frederick W. B.; Wu, shuncheng; Liang, Xiaohui",poster,,,,,,,,, Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers,"Zhu, Zhiyu; Hou, Junhui*; Wu, Dapeng",poster,2307.04129,https://arxiv.org/abs/2307.04129,,https://huggingface.co/papers/2307.04129,,,,3,0 eP-ALM: Efficient Perceptual Augmentation of Language Models,"Shukor, Mustafa*; Dancette, Corentin; Cord, Matthieu",poster,,,,,,,,, Generating Visual Scenes from Touch,"Yang, Fengyu*; Zhang, Jiacheng; Owens, Andrew",poster,,,,,,,,, Multimodal High-order Relation Transformer for Scene Boundary Detection,"Wei, Xi*; Shi, Zhangxiang; Zhang, Tianzhu; Yu, Xiaoyuan; Xiao, Lei",poster,,,,,,,,, Muscles in Action,"Chiquier, Mia*; Vondrick, Carl",poster,2212.02978,https://arxiv.org/abs/2212.02978,,https://huggingface.co/papers/2212.02978,,,,2,0 Self-Evolved Dynamic Expansion Model for Task-Free Continual Learning,"Ye, Fei*; Bors, Adrian",poster,,,,,,,,, Multi-event Video-Text Retrieval,"Zhang, Gengyuan*; Ren, Jisen; Gu, Jindong; Tresp, 
Volker",poster,2308.11551,https://arxiv.org/abs/2308.11551,https://github.com/gengyuanmax/MeVTR,https://huggingface.co/papers/2308.11551,,,,4,0 Referring Image Segmentation Using Text Supervision,"Liu, Fang*; Liu, Yuhao; Kong, Yuqiu; Xu, Ke; Zhang, Lihe; Yin, Baocai ; Hancke, Gerhard P.; Lau, Rynson W.H.",poster,2308.14575,https://arxiv.org/abs/2308.14575,https://github.com/fawnliu/TRIS,https://huggingface.co/papers/2308.14575,,,,8,0 Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning,"Guo, Xiaobao*; Muthuchamy Selvaraj, Nithish; Yu, Zitong; Kong, Wai-Kin Adams; Shen, Bingquan; Kot, Alex",poster,2303.12745,https://arxiv.org/abs/2303.12745,https://github.com/NMS05/Audio-Visual-Deception-Detection-DOLOS-Dataset-and-Parameter-Efficient-Crossmodal-Learning,https://huggingface.co/papers/2303.12745,,,,6,0 EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation,"Tan, Shuai; Ji, Bin; pan, ye*",poster,,,,,,,,, CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training,"Huang, Tianyu; Dong, Bowen; Yang, Yunhan; Huang, Xiaoshui; Lau, Rynson W.H.; Ouyang, Wanli; Zuo, Wangmeng*",poster,2210.01055,https://arxiv.org/abs/2210.01055,,https://huggingface.co/papers/2210.01055,,,,7,0 Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video,"Wu, Xiuzhe; Hu, Pengfei; Wu, Yang*; Lyu, Xiaoyang; Cao, Yan-Pei; Shan, Ying; Yang, Wenming; Sun, Zhongqian; Qi, Xiaojuan",poster,,,,,,,,, GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training,"Deng, Xinchi*; Shi, Han; Huang, Runhui; Li, Changlin; Xu, Hang; Han, Jianhua; Kwok, James; Zhao, Shen; Zhang, Wei; Liang, Xiaodan",poster,2308.11331,https://arxiv.org/abs/2308.11331,,https://huggingface.co/papers/2308.11331,,,,10,0 A Retrospect to Multi-prompt Learning across Vision and Language,"Chen, Ziliang; Huang, Xin; Guan, Quanlong*; Lin, Liang; Luo, Weiqi",poster,,,,,,,,, 
ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules,"Cheng, Zhi-Qi; Dai, Qi*; Hauptmann, Alexander",poster,2304.02173,https://arxiv.org/abs/2304.02173,https://github.com/zhiqic/ChartReader,https://huggingface.co/papers/2304.02173,,,,6,0 Boosting Multi-modal Model Performance with Adaptive Gradient Modulation,"Li, Hong*; Li, Xingyu; Hu, Pengbo; Lei, Yinuo; Li, Chunxiao; Zhou, Yi",poster,,,,,,,,, ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data,"Varma, Maya*; Delbrouck, Jean-Benoit; Hooper, Sarah; Chaudhari, Akshay S; Langlotz, Curtis",poster,2308.11194,https://arxiv.org/abs/2308.11194,,https://huggingface.co/papers/2308.11194,,,,5,0 Robust Referring Video Object Segmentation with Cyclic Structural Consensus,"Li, Xiang*; Wang, Jinglu; Xu, Xiaohao; Li, Xiao; Raj, Bhiksha; Lu, Yan",poster,,,,,,,,, Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation,"chen, rui; Chen, Yongwei; Jiao, Ningxin; Jia, Kui*",poster,2303.13873,https://arxiv.org/abs/2303.13873,,https://huggingface.co/papers/2303.13873,,,,4,0 CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation,"Zhu, hongguang*; Wei, Yunchao; Liang, Xiaodan; Zhang, Chunjie; Zhao, Yao",poster,,,,,,,,, Teaching CLIP to Count to Ten,"Paiss, Roni*; Ephrat, Ariel; Tov, Omer; Zada, Shiran; Mosseri, Inbar; Irani, Michal; Dekel, Tali",poster,2302.12066,https://arxiv.org/abs/2302.12066,,https://huggingface.co/papers/2302.12066,,,,7,0 Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning,"Xuan, Haibiao; Li, Xiongzheng; Zhang, Jinsong; Zhang, Hongwen; Liu, Yebin; Li, Kun*",poster,2303.09410,https://arxiv.org/abs/2303.09410,,https://huggingface.co/papers/2303.09410,,,,6,0 Knowledge-Aware Federated Active Learning with Non-IID Data,"Cao, Yu-Tong*; Shi, Ye; Yu, Baosheng; Wang, Jingya; Tao, 
Dacheng",poster,2211.13579,https://arxiv.org/abs/2211.13579,,https://huggingface.co/papers/2211.13579,,,,5,0 SimpleClick: Interactive Image Segmentation with Simple Vision Transformers,"Liu, Qin*; Xu, Zhenlin; Bertasius, Gedas; Niethammer, Marc",poster,2210.11006,https://arxiv.org/abs/2210.11006,,https://huggingface.co/papers/2210.11006,,,,4,0 InterFormer: Real-time Interactive Image Segmentation,"Huang, You*; Yang, Hao; Sun, Ke; Zhang, ShengChuan; Cao, Liujuan; Jiang, Guannan; Ji, Rongrong",poster,2304.02942,https://arxiv.org/abs/2304.02942,https://github.com/YouHuang67/InterFormer,https://huggingface.co/papers/2304.02942,,,,7,0 Interactive Class-Agnostic Object Counting,"Huang, Yifeng; Ranjan, Viresh; Hoai, Minh*",poster,,,,,,,,, Agile Modeling: From Concept to Classifier in Minutes,"Stretcu, Otilia*; Vendrow, Edward; Hata, Kenji; Viswanathan, Krishnamurthy; Ferrari, Vittorio; Tavakkol, Sasan; Zhou, Wenlei; Avinash, Aditya; Luo, Emming; Alldrin, Neil; Bateni, MohammadHossein; Berger, Gabriel; Bunner, Andrew; Lu, Chun-Ta; Rey, Javier A; DeSalvo, Giulia; Krishna, Ranjay; Fuxman, Ariel",poster,2302.12948,https://arxiv.org/abs/2302.12948,,https://huggingface.co/papers/2302.12948,,,,18,0 TiDAL: Learning Training Dynamics for Active Learning,"Kye, Seong Min; Choi, Kwanghee; Byun, Hyeongmin; Chang, Buru*",poster,2210.06788,https://arxiv.org/abs/2210.06788,,https://huggingface.co/papers/2210.06788,,,,3,0 Pre-training-free Image Manipulation Localization through Non-Mutually Exclusive Contrastive Learning,"Zhou, Jizhe*; Ma, Xiaochen; Du, Xia; Alhammadi, Ahmed Y; Feng, Wentao",poster,,,,,,,,, VADER: Video Alignment Differencing and Retrieval,"Black, Alexander*; Jenni, Simon; Bui, Tu; Tanjim, Md. 
Mehrab; Petrangeli, Stefano; Sinha, Ritwik; Swaminathan, Viswanathan (Vishy); Collomosse, John",poster,2303.13193,https://arxiv.org/abs/2303.13193,,https://huggingface.co/papers/2303.13193,,,,8,0 PIRNet: Privacy-Preserving Image Restoration Network via Wavelet Lifting,"Deng, Xin; Gao, Chao*; Xu, Mai",poster,,,,,,,,, Quality-Agnostic Deepfake Detection with Intra-model Collaborative Learning,"Le, Binh M.*; Woo, Simon S",poster,,,,,,,,, Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning,"Zhai, Yuanhao*; Luan, Tianyu; Doermann, David; Yuan, Junsong",poster,,,,,,,,, CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields,"Luo, Ziyuan; Guo, Qing; Cheung, Ka Chun; See, Simon; Wan, Renjie*",poster,2307.11526,https://arxiv.org/abs/2307.11526,,https://huggingface.co/papers/2307.11526,,,,5,2 UCF: Uncovering Common Features for Generalizable Deepfake Detection,"Yan, Zhiyuan; Zhang, Yong; Fan, Yanbo; Wu, Baoyuan*",poster,2304.13949,https://arxiv.org/abs/2304.13949,,https://huggingface.co/papers/2304.13949,,,,4,0 SAFL-Net: Semantic-Agnostic Feature Learning Network with Auxiliary Plugins for Image Manipulation Detection,"Sun, Zhihao*; Jiang, Haoran; Wang, Danding; Li, Xirong; Cao, Juan",poster,,,,,,,,, DRAW: Defending Camera-shooted RAW against Image Manipulation,"Hu, Xiaoxiao; Ying, Qichao ; Qian, Zhenxing*; Li, Sheng; Zhang, Xinpeng",poster,2307.16418,https://arxiv.org/abs/2307.16418,,https://huggingface.co/papers/2307.16418,,,,5,0 DIRE for Diffusion-Generated Image Detection,"Wang, Zhendong*; Bao, Jianmin; Zhou, Wengang ; Wang, Weilun; Hu, Hezhen; Chen, Hong; Li, Houqiang",poster,2303.09295,https://arxiv.org/abs/2303.09295,https://github.com/ZhendongWang6/DIRE,https://huggingface.co/papers/2303.09295,,,,7,0 Uncertainty-guided Learning for Improving Image Manipulation Detection,"Ji, Kaixiang*; Chen, Feng; Guo, Xin; Xu, Yadong; Wang, Jian; Chen, Jingdong",poster,,,,,,,,, The Stable Signature: Rooting Watermarks in Latent 
Diffusion Models,"Fernandez, Pierre*; Couairon, Guillaume; Jégou, Hervé; Douze, Matthijs; Furon, Teddy",poster,2303.15435,https://arxiv.org/abs/2303.15435,,https://huggingface.co/papers/2303.15435,,,,5,1 Get the Best of Both Worlds: Discriminative and Transferable Features by Grassmannian Class Representation,"Li, Zhizhong; Wang, Haoqi*; Zhang, Wayne",poster,,,,,,,,, 4D Panoptic Segmentation as Invariant and Equivariant Field Prediction,"Zhu, Minghan*; Han, Shizhong; Cai, Hong; Borse, Shubhankar; Porikli, Fatih; Ghaffari Jadidi, Maani",poster,2303.15651,https://arxiv.org/abs/2303.15651,,https://huggingface.co/papers/2303.15651,,,,6,0 SiLK: Simple Learned Keypoints,"Gleize, Pierre*; Wang, Weiyao; Feiszli, Matt",poster,,,,,,,,, "SC3K: Self-supervised and Coherent 3D Keypoints Estimation from Rotated, Noisy, and Decimated Point Cloud Data","Zohaib, Mohammad*; Del Bue, Alessio",poster,2308.05410,https://arxiv.org/abs/2308.05410,https://github.com/IITPAVIS/SC3K,https://huggingface.co/papers/2308.05410,,,,2,0 Geometric Viewpoint Learning with Hyper-Rays and Harmonics Encoding,"Min, Zhixiang*; Dibene Simental, Juan Carlos; Dunn, Enrique",poster,,,,,,,,, Surface Extraction from Neural Unsigned Distance Fields,"Zhang, Congyi*; Lin, Guying; Yang, Lei; Li, Xin; Komura, Taku; SCHAEFER, Scott; Keyser, John; Wang, Wenping",poster,,,,,,,,, Learning Adaptive Neighborhoods for Graph Neural Networks,"Saha, Avishkar*; Mendez, Oscar; Russell, Chris; Bowden, Richard",poster,2307.09065,https://arxiv.org/abs/2307.09065,,https://huggingface.co/papers/2307.09065,,,,4,0 Why do networks have inhibitory/negative connections?,"Wang, Qingyang*; Powell, Mike; Geisa, Ali ; Bridgeford, Eric W; Priebe, Carey E; Vogelstein, Joshua",poster,2208.03211,https://arxiv.org/abs/2208.03211,,https://huggingface.co/papers/2208.03211,,,,6,1 MasaCtrl: Tuning-free Mutual Self-Attention Control for Consistent Image Synthesis and Editing,"Cao, Mingdeng; Wang, Xintao*; Qi, Zhongang; Shan, Ying; Qie, Xiaohu; Zheng, 
Yinqiang",poster,2304.08465,https://arxiv.org/abs/2304.08465,,https://huggingface.co/papers/2304.08465,,,,6,0 Personalized Image Generation for Color Vision Deficiency Population,"Jiang, Shuyi; Liu, Daochang; Li, Dingquan; Xu, Chang*",poster,,,,,,,,, ReNeRF: Relightable Neural Radiance Fields with Nearfield Lighting,"Xu, Yingyan*; Zoss, Gaspard; Chandran, Prashanth; Gross, Markus; Bradley, Derek; Gotardo, Paulo",poster,,,,,,,,, MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models,"Zhao, Jing; Zheng, Heliang; Wang, Chaoyue; lan, long; Yang, Wenjing*",poster,2303.13126,https://arxiv.org/abs/2303.13126,,https://huggingface.co/papers/2303.13126,,,,5,0 PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion,"Kim, Gwanghyun; jang, jiha; Chun, Se Young*",poster,,,,,,,,, Pluralistic Aging Diffusion Autoencoder,"Li, Peipei*; Wang, Rui; Huang, Huaibo; He, Ran; He, Zhaofeng",poster,2303.11086,https://arxiv.org/abs/2303.11086,,https://huggingface.co/papers/2303.11086,,,,5,0 DPM-OT: A New Diffusion Probabilistic Model Based on Optimal Transport,"Li, Zezeng; Li, Shenghao; Wang, Zhanpeng; Lei, Na*; Luo, Zhongxuan; GU, Xianfeng",poster,,,,,,,,, Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation,"Gan, Yuan; Yang, Zongxin; Yue, Xihang; Sun, Lingyun; Yang, Yi*",poster,2309.04946,https://arxiv.org/abs/2309.04946,https://github.com/yuangan/EAT_code,https://huggingface.co/papers/2309.04946,,,,5,1 Diffusion Face Relighting,"Ponglertnapakorn, Puntawat -*; Tritrong, Nontawat; Suwajanakorn, Supasorn",poster,,,,,,,,, TALL: Thumbnail Layout for Deepfake Video Detection,"Xu, Yuting*; Liang, Jian; Jia, Gengyun; Yang, Ziming; Zhang, Yanhao; He, Ran",poster,2307.07494,https://arxiv.org/abs/2307.07494,https://github.com/rainy-xu/TALL4Deepfake,https://huggingface.co/papers/2307.07494,,,,6,0 LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts,"Yang, Binbin*; Luo, Yi; 
Chen, Ziliang; Wang, Guangrun; Liang, Xiaodan; Lin, Liang",poster,,,,,,,,, DreamPose: Fashion Video Synthesis with Stable Diffusion,"Karras, Johanna S*; Holynski, Aleksander; Wang, Ting-Chun; Kemelmacher-Shlizerman, Ira",poster,,,,,,,,, Ablating Concepts in Text-to-Image Diffusion Models,"Kumari, Nupur*; Zhang, Bingliang; Wang, Sheng-Yu; Shechtman, Eli; Zhang, Richard; Zhu, Jun-Yan",poster,2303.13516,https://arxiv.org/abs/2303.13516,,https://huggingface.co/papers/2303.13516,,,,6,0 DReg-NeRF: Deep Registration for Neural Radiance Fields,"Chen, Yu*; Lee, Gim Hee",poster,,,,,,,,, The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation,"Li, Lingxiao*; Zhang, Yi; Wang, Shuhui",poster,2211.12347,https://arxiv.org/abs/2211.12347,,https://huggingface.co/papers/2211.12347,,,,3,0 Discriminative Class Tokens for Text-to-Image Diffusion Models,"Schwartz, Idan*; Snæbjarnarson, Vésteinn; Benaim, Sagie; Chefer, Hila; Wolf, Lior; Belongie, Serge",poster,2303.17155,https://arxiv.org/abs/2303.17155,https://github.com/idansc/discriminative_class_tokens,https://huggingface.co/papers/2303.17155,,,,7,1 General Image-to-Image Translation with One-Shot Image Guidance,"Bin, Cheng*; Liu, Zuhao; Peng, Yunbo; Lin, Yue",poster,2307.14352,https://arxiv.org/abs/2307.14352,https://github.com/CrystalNeuro/visual-concept-translator,https://huggingface.co/papers/2307.14352,,,,4,0 Text2Performer: Text-Driven Human Video Generation,"Jiang, Yuming*; Yang, Shuai; Koh, Tong Liang; Wu, Wayne; Loy, Chen Change; Liu, Ziwei",poster,2304.08483,https://arxiv.org/abs/2304.08483,,https://huggingface.co/papers/2304.08483,,,,6,0 AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks,"Hong, Kibeom*; Jeon, Seogkyu; Lee, Junsoo; Ahn, Namhyuk; Kim, Kunhee; Lee, Pilhyeon; Kim, Daesik; Uh, Youngjung; Byun, Hyeran",poster,,,,,,,,, Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion,"Han, Xiao*; Zhu, Xiatian; Deng, Jiankang; Song, Yi-Zhe; Xiang, 
Tao",poster,,,,,,,,, PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting,"Motamed, Saman*; Xu, Jianjin; Wu, Chen Henry; Haene, Christian; Bazin, Jean-Charles; de la Torre, Fernando",poster,2304.06107,https://arxiv.org/abs/2304.06107,,https://huggingface.co/papers/2304.06107,,,,4,1 Virtual Try-On with Pose-Garment Keypoints Guided Inpainting,"Li, Zhi*; Wei, Pengfei; Yin, Xiang; Ma, Zejun; Kot, Alex",poster,,,,,,,,, Online Clustered Codebook,"Zheng, Chuanxia*; Vedaldi, Andrea",poster,2307.15139,https://arxiv.org/abs/2307.15139,,https://huggingface.co/papers/2307.15139,,,,2,0 InfiniCity: Infinite-Scale City Synthesis,"Lin, Chieh Hubert*; Lee, Hsin-Ying; Menapace, Willi; Chai, Menglei; Siarohin, Aliaksandr; Yang, Ming-Hsuan; Tulyakov, Sergey",poster,2301.09637,https://arxiv.org/abs/2301.09637,,https://huggingface.co/papers/2301.09637,,,,7,0 Make-It-3D: High-fidelity 3D Creation from A Single Image with Diffusion Prior,"Tang, Junshu *; Wang, Tengfei; Zhang, Bo; Zhang, Ting; Yi, Ran; Chen, Dong; Ma, Lizhuang",poster,,,,,,,,, SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image,"Zhou, Xiaoyu*; Lin, ZhiWei; Shan, Xiaojun; Wang, Yongtao; Sun, Deqing; Yang, Ming-Hsuan",poster,,,,,,,,, StyleLipSync: Style-based Personalized Lip-sync Video Generation,"Ki, Taekyung*; Min, Dongchan",poster,2305.00521,https://arxiv.org/abs/2305.00521,,https://huggingface.co/papers/2305.00521,,,,2,0 StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation,"Wang, Yuhan*; Jiang, Liming; Loy, Chen Change",poster,2308.16909,https://arxiv.org/abs/2308.16909,,https://huggingface.co/papers/2308.16909,,,,3,0 3D-Aware Generative Model for Improved Side-View Image Synthesis,"Jo, Kyungmin; Jin, Wonjoon*; Choo, Jaegul; Lee, Hyunjoon; Cho, Sunghyun",poster,,,,,,,,, Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer,"Yang, Serin*; HWANG, HYUNMIN; Ye, Jong 
Chul",poster,2303.08622,https://arxiv.org/abs/2303.08622,,https://huggingface.co/papers/2303.08622,,,,3,0 FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis,"Seo, Seunghyeon; Chang, Yeonjin; Kwak, Nojun*",poster,2306.17723,https://arxiv.org/abs/2306.17723,,https://huggingface.co/papers/2306.17723,,,,3,0 Inverse problem regularization with hierarchical variational autoencoders,"Prost, Jean*; Houdard, Antoine; Almansa, Andres; Papadakis, Nicolas",poster,2303.11217,https://arxiv.org/abs/2303.11217,,https://huggingface.co/papers/2303.11217,,,,4,0 3D-aware Blending with Generative NeRFs,"Kim, Hyunsu*; Lee, Gayoung; Choi, Yunjey; Kim, Jin-Hwa; Zhu, Jun-Yan",poster,2302.06608,https://arxiv.org/abs/2302.06608,,https://huggingface.co/papers/2302.06608,,,,5,0 NeMF: Inverse Volume Rendering with Neural Microflake Field,"Zhang, Youjia; Xu, Teng; Yu, Junqing; Ye, YuTeng; Wang, Junle; Jing , Yanqing; Yu, Jingyi; Yang, Wei*",poster,2304.00782,https://arxiv.org/abs/2304.00782,,https://huggingface.co/papers/2304.00782,,,,8,0 eDiff-I Video: Text-to-Video via Finetuning Text-to-Image Diffusion Models with a Video Noise Prior,"Ge, Songwei*; Balaji, Yogesh; Nah, Seungjun; Liu, Guilin; Poon, Tyler; Tao, Andrew; Catanzaro, Bryan; Jacobs, David; Huang, Jia-Bin; Liu, Ming-Yu",poster,,,,,,,,, Learning Human View Synthesis from Internet Videos,"Dong, Junting*; Fang, Qi; Yang, Tianshuo; Peng, Sida; Shuai, Qing; Qiao, Chengyu",poster,,,,,,,,, ECG: Image Classification and Generation via a Single Energy-Based Model,"Guo, Qiushan*; Ma, Chuofan; Jiang, Yi; Yuan, Zehuan; Yu, Yizhou; Luo, Ping",poster,,,,,,,,, Automatic Animation of Hair Blowing in Still Portrait Photos,"Xiao, Wenpeng ; Liu, Wentao; Wang, Yitong; Ghanem, Bernard; Li, Bing*",poster,,,,,,,,, HoloFusion: Towards Photo-realistic 3D Generative Modeling,"Karnewar, Animesh*; Vedaldi, Andrea; mitra, niloy; Novotny, David",poster,2308.14244,https://arxiv.org/abs/2308.14244,,https://huggingface.co/papers/2308.14244,,,,4,0 
Foreground Object Search by Distilling Composite Image Feature,"Zhang, Bo*; Sui, Jiacheng; Niu, Li",poster,2308.04990,https://arxiv.org/abs/2308.04990,https://github.com/bcmi/Foreground-Object-Search-Dataset-FOSD,https://huggingface.co/papers/2308.04990,,,,3,0 OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs,"He, Honglin; Yang, Zhuoqian; Li, Shikai; Dai, Bo; Wu, Wayne*",poster,,,,,,,,, 3DHumanGAN: 3D-Aware Human Image Generation with Photorealism,"Yang, Zhuoqian; Li, Shikai; Wu, Wayne*; Dai, Bo",poster,,,,,,,,, MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions,"Liu, Yunfei*; Lin, Lijian; Zhou, Changyin; YU, Fei; Li, Yu",poster,2307.10008,https://arxiv.org/abs/2307.10008,,https://huggingface.co/papers/2307.10008,,,,5,0 Minimum Latency Deep Online Video Stabilization,"Zhang, Zhuofan; Liu, Zhen; Tan, Ping; Zeng, Bing; Liu, Shuaicheng*",poster,2212.02073,https://arxiv.org/abs/2212.02073,https://github.com/liuzhen03/NNDVS,https://huggingface.co/papers/2212.02073,,,,5,0 StableVideo: Text-driven Consistency-aware Diffusion Video Editing,"Chai, Wenhao; Guo, Xun*; Wang, Gaoang; Lu, Yan",poster,2308.09592,https://arxiv.org/abs/2308.09592,https://github.com/rese1f/StableVideo,https://huggingface.co/papers/2308.09592,,,,4,1 Localizing Object-level Shape Variations with Text-to-Image Diffusion Models,"Patashnik, Or*; Garibi, Daniel; Azuri, Idan; Averbuch-Elor, Hadar; Cohen-Or, Danny",poster,2303.11306,https://arxiv.org/abs/2303.11306,,https://huggingface.co/papers/2303.11306,,,,5,1 Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation,"Hong, Fa-Ting*; Xu, Dan",poster,2307.09906,https://arxiv.org/abs/2307.09906,https://github.com/harlanhong/ICCV2023-MCNET,https://huggingface.co/papers/2307.09906,,,,2,0 ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution,"Zhang, Mingjin; Zhang, Chi; Zhang, Qiming; Guo, Jie; Gao, Xinbo*; Zhang, 
Jing",poster,2307.14010,https://arxiv.org/abs/2307.14010,,https://huggingface.co/papers/2307.14010,,,,6,0 GlueNet: Plug and Play Multi-modal Encoders for X-to-image Generation,"Qin, Can*; Yu, Ning; Xing, Chen; Zhang, Shu; Chen, Zeyuan; Ermon, Stefano ; FU, YUN; Xiong, Caiming; Xu, Ran",poster,,,,,,,,, UHDNeRF: Ultra-High-Definition Neural Radiance Fields,"Li, Quewei*; Li, Feichao; Guo, Jie; Guo, Yanwen",poster,,,,,,,,, All-to-key Attention for Arbitrary Style Transfer,"Zhu, Mingrui; He, Xiao; Wang, Nannan*; Wang, Xiaoyu; Gao, Xinbo",poster,2212.04105,https://arxiv.org/abs/2212.04105,,https://huggingface.co/papers/2212.04105,,,,5,0 Diverse Inpainting and Editing with GAN Inversion,"Yildirim, Ahmet Burak; Pehlivan, Hamza; Bilecen, Bahri Batuhan; Dundar, Aysegul*",poster,2307.15033,https://arxiv.org/abs/2307.15033,,https://huggingface.co/papers/2307.15033,,,,4,0 Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis,"Wu, Qiucheng*; Liu, Yujian; Zhao, Handong; Bui, Trung; Lin, Zhe; Zhang, Yang; Chang, Shiyu",poster,2304.03869,https://arxiv.org/abs/2304.03869,https://github.com/UCSB-NLP-Chang/Diffusion-SpaceTime-Attn,https://huggingface.co/papers/2304.03869,,,,7,0 MoTIF: Learning Motion Trajectories with Local Implicit Neural Functions for Continuous Space-Time Video Super-Resolution,"Chen, Yi-Hsin; Chen, Si-Cun; Chen, Yi-Hsin; Lin, Yen-Yu; Peng, Wen-Hsiao*",poster,2307.07988,https://arxiv.org/abs/2307.07988,https://github.com/sichun233746/MoTIF,https://huggingface.co/papers/2307.07988,,,,5,0 RANA: Relightable and Articulated Neural Avatars,"Iqbal, Umar*; Caliskan, Akin; Nagano, Koki; Molchanov, Pavlo; Khamis, Sameh; Kautz, Jan",poster,,,,,,,,, DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment,"Zhang, Xujie*; Yang, Binbin; Kampffmeyer, Michael C.; Zhang, Wenqing; Zhang, shiyue; Lu, Guansong; Lin, Liang; Xu, Hang; Liang, 
Xiaodan",poster,2308.11206,https://arxiv.org/abs/2308.11206,,https://huggingface.co/papers/2308.11206,,,,9,0 Masked Diffusion Transformer is a Strong Image Synthesizer,"Gao, Shanghua*; Zhou, Pan; Cheng, Ming-Ming; Yan, Shuicheng",poster,2303.14389,https://arxiv.org/abs/2303.14389,https://github.com/sail-sg/MDT,https://huggingface.co/papers/2303.14389,https://huggingface.co/spaces/shgao/MDT,https://huggingface.co/shgao/MDT-XL2,,4,0 FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model,"Yu, Jiwen*; Wang, Yinhuai; Zhao, Chen; Ghanem, Bernard; Zhang, Jian",poster,2303.09833,https://arxiv.org/abs/2303.09833,,https://huggingface.co/papers/2303.09833,,,,5,0 CLNeRF: Continual Learning Meets NeRF,"Cai, Zhipeng*; Müller, Matthias",poster,2308.14816,https://arxiv.org/abs/2308.14816,https://github.com/IntelLabs/CLNeRF,https://huggingface.co/papers/2308.14816,,,,2,0 Rethinking Fast Fourier Convolution in Image Inpainting,"Chu, Tianyi*; Chen, Jiafu; Sun, Jiakai; Lian, Shuobin; Wang, Zhizhong; Zuo, Zhiwen; Zhao, Lei; Xing, Wei; Lu, Dongming",poster,,,,,,,,, Pix2Video: Video Editing using Image Diffusion Models,"Ceylan, Duygu*; Huang, Chun-Hao; mitra, niloy",poster,,,,,,,,, Multi-view Spectral Polarization Propagation for Video Glass Segmentation,"Qiao, Yu*; Dong, Bo; jin, ao; Fu, Yu; Baek, Seung-Hwan; Heide, Felix; Peers, Pieter; Wei, Xiaopeng; Yang, Xin",poster,,,,,,,,, WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction,"Le Moing, Guillaume*; Ponce, Jean; Schmid, Cordelia",poster,2211.14308,https://arxiv.org/abs/2211.14308,,https://huggingface.co/papers/2211.14308,,,,3,1 Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation,"Chen, Eric M*; Holalkere, Sidhanth; Yan, Ruyu; Zhang, Kai; Davis, Abe",poster,2304.13681,https://arxiv.org/abs/2304.13681,,https://huggingface.co/papers/2304.13681,,,,5,0 Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative 
Models,"Lee, Jaewoong*; Jang, Sangwon; Jo, Jaehyeong; Yoon, Jaehong; Kim, Yunji; Kim, Jin-Hwa; Ha, Jung-Woo; Hwang, Sung Ju",poster,2304.01515,https://arxiv.org/abs/2304.01515,,https://huggingface.co/papers/2304.01515,,,,8,1 Efficient Video Prediction via Sparsely Conditioned Flow Matching,"Davtyan, Aram*; Sameni, Sepehr; Favaro, Paolo",poster,2211.14575,https://arxiv.org/abs/2211.14575,,https://huggingface.co/papers/2211.14575,,,,3,0 Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting.,"Chowdhury, Pinaki Nath*; Bhunia , Ayan Kumar; Sain, Aneeshan; Koley, Subhadeep; Xiang, Tao; Song, Yi-Zhe",poster,,,,,,,,, Towards Instance-adaptive Inference for Federated Learning,"Feng, Chun-Mei*; Yu, Kai; Liu, Nian; Xu, Xinxing; Khan, Salman; Zuo, Wangmeng",poster,2308.06051,https://arxiv.org/abs/2308.06051,,https://huggingface.co/papers/2308.06051,,,,6,0 TransTIC: Transferring Transformer-based Image Compression from Human Visualization to Machine Perception,"Chen, Yi-Hsin; Weng, Ying-Chieh; Kao, Chia Hao; CHIEN, CHENG; Chiu, Wei-Chen; Peng, Wen-Hsiao*",poster,,,,,,,,, Counting Crowds in Bad Weather,"Huang, Zhi-Kai; Chen, Wei-Ting; Chiang, Yuan-Chun; Kuo, Sy-Yen; Yang, Ming-Hsuan*",poster,2306.01209,https://arxiv.org/abs/2306.01209,,https://huggingface.co/papers/2306.01209,,,,5,0 NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View Indoor 3D Object Detection,"Xu, Chenfeng*; Wu, Bichen; Hou, Ji; Tsai, Sam; Li, Ruilong; Wang, Jialiang; Zhan, Wei; He, Zijian; Vajda, Peter; Keutzer, Kurt; TOMIZUKA, Masayoshi",poster,,,,,,,,, MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation,"Sadoughi, Najmeh*; Li, Xinyu; Vajpayee, Avijit; Fan, David; Shuai, Bing; Santos-Villalobos, Hector J; Bhat, Vimal; MV, Rohith",poster,2308.11185,https://arxiv.org/abs/2308.11185,,https://huggingface.co/papers/2308.11185,,,,8,0 Bring Clipart to Life,"Zhao, Nanxuan*; Dang, Shengqi; Lin, Hexun; Shi, Yang; Cao, Nan",poster,,,,,,,,, UpCycling: 
Semi-supervised 3D Object Detection without Sharing Raw-level Unlabeled Scenes,"Hwang, Sunwook*; Kim, Youngseok; Kim, Seongwon; Bahk, Saewoong; Kim, Hyung-Sin",poster,2211.11950,https://arxiv.org/abs/2211.11950,,https://huggingface.co/papers/2211.11950,,,,5,0 Graph Matching with Bi-level Noisy Correspondence,"Lin, Yijie; Yang, Mouxing; Yu, Jun; Hu, Peng; Zhang, Changqing; Peng, Xi*",poster,2212.04085,https://arxiv.org/abs/2212.04085,https://github.com/XLearning-SCU/2023-ICCV-COMMON,https://huggingface.co/papers/2212.04085,,,,6,0 Anomaly Detection using Score-based Perturbation Resilience,"Shin, Woosang*; Lee, Jong-Hyeon; Lee, Taehan; Lee, Sangmoon; Yun, Jong Pil",poster,,,,,,,,, Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception,"yang, kun*; Yang, Dingkang; Zhang, Jingyu; Li, Mingcheng; Liu, Yang; Liu, Jing; Wang, Hanqi; Sun, Peng; Song, Liang",poster,2307.13929,https://arxiv.org/abs/2307.13929,,https://huggingface.co/papers/2307.13929,,,,9,0 Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing,"Baldrati, Alberto; Morelli, Davide*; Cartella, Giuseppe; Cornia, Marcella; Bertini, Marco; Cucchiara, Rita",poster,2304.02051,https://arxiv.org/abs/2304.02051,https://github.com/aimagelab/multimodal-garment-designer,https://huggingface.co/papers/2304.02051,,,,6,1 Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts,"Chen, Zhihong; Diao, Shizhe; wang, benyou; Li, Guanbin*; Wan, Xiang",poster,2302.08958,https://arxiv.org/abs/2302.08958,,https://huggingface.co/papers/2302.08958,,,,5,0 MAS: Towards Resource-Efficient Federated Multiple-Task Learning,"Zhuang, Weiming*; Wen, Yonggang; Lyu, Lingjuan; zhang, shuai",poster,2307.11285,https://arxiv.org/abs/2307.11285,,https://huggingface.co/papers/2307.11285,,,,4,0 Hierarchical Visual Categories Modeling in A Probabilistic Perspective for Out-of-Distribution Detection,"Li, Jinglun; Zhou, Xinyu; Guo, Pinxue; Sun, Yixuan; Huang, Yiwen; Ge, Weifeng*; 
Zhang, Wenqiang",poster,,,,,,,,, Improving Generalization in Visual Reinforcement Learning via Conflict-aware Gradient Agreement Augmentation,"Liu, Siao*; Chen, Zhaoyu; Liu, Yang; Wang, Yuzheng; Yang, Dingkang; zhao, zhile; Zhou, Ziqing; Xie, Yi; Li, Wei; Zhang, Wenqiang; Gan, Zhongxue",poster,2308.01194,https://arxiv.org/abs/2308.01194,,https://huggingface.co/papers/2308.01194,,,,11,0 Tiny Updater: Towards Efficient Neural Network-Driven Software Updating,"Zhang, Linfeng*; Ma, Kaisheng",poster,,,,,,,,, Multiple Planar Object Tracking,"Zhang, Zhicheng; Liu, Shengzhe; Yang, Jufeng*",poster,,,,,,,,, Robust Omnimatte with 3D Background Modeling,"Lin, Geng*; Gao, Chen; Huang, Jia-Bin; Kim, Changil; Wang, Yipeng; Zwicker, Matthias; Saraf, Ayush",poster,,,,,,,,, Ordinal Label Distribution Learning,"Wen, Changsong; Zhang, Xin; Yao, Xingxu; Yang, Jufeng*",poster,,,,,,,,, "Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection","Cao, Yichao*; Su, Xiu; tang, qingfei; Yang, Feng; You, Shan; Lu, Xiaobo; Xu, Chang",poster,2307.13529,https://arxiv.org/abs/2307.13529,,https://huggingface.co/papers/2307.13529,,,,7,1 MUVA: A New Large-Scale Benchmark for Multi-view Amodal Instance Segmentation in the Shopping Scenario,"Li, Zhixuan; Ye, Weining; Terven, Juan R; Bennett, Zachary R; Zheng, Ying; Jiang, Tingting*; Huang, Tiejun",poster,,,,,,,,, Editable Image Geometric Abstraction via Neural Primitive Assembly,"Chen, Ye*; Chen, Xuanhong; Hu, Zhangli; Ni, Bingbing",poster,,,,,,,,, One-shot recognition of any material anywhere using contrastive learning with physics-based rendering,"Drehwald, Manuel S.*; eppel, sagi; Hao, Han; Aspuru-Guzik, Alan",poster,2212.00648,https://arxiv.org/abs/2212.00648,,https://huggingface.co/papers/2212.00648,,,,5,1 Fast Full-frame Video Stabilization with Iterative Optimization,"Zhao, Weiyue; Li, Xin; Peng, Zhan; Luo, Xianrui; Ye, Xinyi; Lu, Hao; Cao, 
Zhiguo*",poster,2307.12774,https://arxiv.org/abs/2307.12774,,https://huggingface.co/papers/2307.12774,,,,7,0 "Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers","gu, bohai; Fan, Heng; Zhang, Libo*",poster,2304.11335,https://arxiv.org/abs/2304.11335,,https://huggingface.co/papers/2304.11335,,,,3,0 Multi-modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion,"Sun, Yiming*; Cao, Bing; Zhu, Pengfei; Hu, Qinghua",poster,2302.01392,https://arxiv.org/abs/2302.01392,https://github.com/SunYM2020/MoE-Fusion,https://huggingface.co/papers/2302.01392,,,,4,0 SAFE: Sensitivity-Aware Features for Out-of-Distribution Object Detection,"Wilson, Samuel James*; Fischer, Tobias; Dayoub, Feras; Miller, Dimity; Suenderhauf, Niko",poster,2208.13930,https://arxiv.org/abs/2208.13930,,https://huggingface.co/papers/2208.13930,,,,5,0 GeT: Generative Target Structure Debiasing for Domain Adaptation,"Zhang, Can*; Lee, Gim Hee",poster,2308.10205,https://arxiv.org/abs/2308.10205,,https://huggingface.co/papers/2308.10205,,,,2,0 HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending,"Wei, Tianyi*; Chen, Dongdong; Zhou, Wenbo; Liao, Jing; Zhang, Weiming; Hua, Gang; Yu, Nenghai",poster,,,,,,,,, Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation,"Fu, Qichen*; Liu, Xingyu; Xu, Ran; Niebles, Juan Carlos; Kitani, Kris",poster,2303.04991,https://arxiv.org/abs/2303.04991,,https://huggingface.co/papers/2303.04991,,,,5,1 Improving Continuous Sign Language Recognition with Cross-Lingual Signs,"Chen, Yutong; Wei, Fangyun*",poster,2308.10809,https://arxiv.org/abs/2308.10809,,https://huggingface.co/papers/2308.10809,,,,2,0 A Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions,"Lin, Jiawei*; Guo, Jiaqi; Sun, Shizhao; Xu, Weijiang; Liu, Ting; Lou, Jian-Guang; Zhang, Dongmei",poster,2308.12700,https://arxiv.org/abs/2308.12700,,https://huggingface.co/papers/2308.12700,,,,7,1 DISeR: Designing Imaging 
Systems with Reinforcement Learning,"Klinghoffer, Tzofi M*; Tiwary, Kushagra; Behari, Nikhil; Agrawalla, Bhavya K; Raskar, Ramesh",poster,,,,,,,,, Segmentation of Tubular Structures Using Iterative Training With Tailored Samples,"Liao, Wei*",poster,,,,,,,,, Time-to-Contact Map by Joint Estimation of Up-to-Scale Inverse Depth and Global Motion using a Single Event Camera,"Nunes, Urbano Miguel G.*; Perrinet, Laurent U; Ieng, Sio-Hoi",poster,,,,,,,,, Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields,"Barron, Jonathan T*; Mildenhall, Ben; Verbin, Dor; Srinivasan, Pratul; Hedman, Peter",oral,,,,,,,,, Mixed Neural Voxels for Fast Multi-view Video Synthesis,"Wang, Feng*; Tan, Sinan; Li, Xinghang; Tian, Zeyue; Song, Yafei; Liu, Huaping",oral,2212.00190,https://arxiv.org/abs/2212.00190,https://github.com/fengres/mixvoxels,https://huggingface.co/papers/2212.00190,,,,5,0 Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips,"Ye, Yufei*; Hebbar, Poorvi; Gupta, Abhinav; Tulsiani, Shubham",oral,,,,,,,,, LERF: Language Embedded Radiance Fields,"Kerr, Justin; Kim, Chung Min*; Goldberg, Ken; Kanazawa, Angjoo; Tancik, Matthew",oral,2303.09553,https://arxiv.org/abs/2303.09553,,https://huggingface.co/papers/2303.09553,,,,5,0 Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions,"Haque, Ayaan*; Tancik, Matthew; Efros, Alexei A; Holynski, Aleksander; Kanazawa, Angjoo",oral,2303.12789,https://arxiv.org/abs/2303.12789,,https://huggingface.co/papers/2303.12789,,,,5,0 P1AC: Revisiting Absolute Pose From a Single Affine Correspondence,"Ventura, Jonathan*; Kukelova, Zuzana; Sattler, Torsten; Barath, Daniel",oral,2011.08790,https://arxiv.org/abs/2011.08790,https://github.com/jonathanventura/P1AC,https://huggingface.co/papers/2011.08790,,,,4,0 Prior-Guided Strand-Based Hair Reconstruction,"Skliarova, Vanessa Valerievna; Chelishev, Jenya; Dogaru, Andreea; Medvedev, Igor; Lempitsky, Victor; Zakharov, Egor*",oral,,,,,,,,, Tri-MipRF: Tri-Mip Representation for Efficient 
Anti-Aliasing Neural Radiance Fields,"Hu, Wenbo*; Wang, Yuling; Ma, Lin; Yang, Bangbang; Gao, Lin; Liu, Xiao; Ma, Yuewen",oral,,,,,,,,, LiDAR-UDA: Self-ensembling Through Time for Unsupervised LiDAR Domain Adaptation,"Shaban, Amirreza*; Lee, JoonHo; Jung, Sanghun; Meng, Xiangyun; Boots, Byron",oral,,,,,,,,, Tracking Everything Everywhere All At Once,"Wang, Qianqian*; Chang, Yen-Yu; Cai, Ruojin; Li, Zhengqi; Hariharan, Bharath; Holynski, Aleksander; Snavely, Noah",oral,2306.05422,https://arxiv.org/abs/2306.05422,,https://huggingface.co/papers/2306.05422,,,,7,3 Ego-Humans: An Ego-Centric 3D Multi-Human Benchmark,"Khirodkar, Rawal*; Vo, Minh P; Bansal, Aayush; Ma, Lingni; Newcombe, Richard; Kitani, Kris",oral,,,,,,,,, "Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection","Fan, Lue*; Yang, Yuxue; Mao, Yiming; Wang, Feng; Chen, Yuntao; Wang, Naiyan; Zhang, Zhaoxiang",oral,2304.12315,https://arxiv.org/abs/2304.12315,https://github.com/tusen-ai/SST,https://huggingface.co/papers/2304.12315,,,,7,0 DiffusionDet: Diffusion Model for Object Detection,"Chen, Shoufa*; Sun, Peize; Song, Yibing; Luo, Ping",oral,2211.09788,https://arxiv.org/abs/2211.09788,https://github.com/ShoufaChen/DiffusionDet,https://huggingface.co/papers/2211.09788,,,,4,0 V3Det: Vast Vocabulary Visual Detection Dataset,"Wang, Jiaqi*; Zhang, Pan; Chu, Tao; CAO, Yuhang; Zhou, Yujie; Wu, Tong; Wang, Bin; He, Conghui; Lin, Dahua",oral,2304.03752,https://arxiv.org/abs/2304.03752,,https://huggingface.co/papers/2304.03752,,,,9,0 PixelOdyssey: A Large-Scale Synthetic Dataset for Long-Term Pixel Tracking,"Zheng, Yang*; Harley, Adam; Shen, Bokui; Wetzstein, Gordon; Guibas, Leonidas",oral,,,,,,,,, Label-Free Event-based Object Recognition via Joint Learning with Image Reconstruction from Events,"Cho, Hoonhee*; Kim, Hyeonseong; Chae, Yujeong; Yoon, 
Kuk-Jin",oral,2308.09383,https://arxiv.org/abs/2308.09383,https://github.com/Chohoonhee/Ev-LaFOR,https://huggingface.co/papers/2308.09383,,,,4,0 Vision HGNN: An Image is More than a Graph of Nodes,"Han, Yan*; Wang, Peihao; Kundu, Souvik; Ding, Ying; Wang, Zhangyang",oral,,,,,,,,, Revisiting Vision Transformer from the View of Path Ensemble,"Chang, Shuning*; Wang, Pichao; Luo, Hao; Wang, Fan; Shou, Mike Zheng",oral,2308.06548,https://arxiv.org/abs/2308.06548,,https://huggingface.co/papers/2308.06548,,,,5,0 All in Tokens: Unifying Output Space of Visual Tasks via Soft Token,"Ning, Jia*; Li, Chen; Zhang, Zheng; Wang, Chunyu; Geng, Zigang; Dai, Qi; He, Kun; Hu, Han",oral,2301.02229,https://arxiv.org/abs/2301.02229,https://github.com/SwinTransformer/AiT,https://huggingface.co/papers/2301.02229,,,,7,0 Mitigating and Evaluating Static Bias of Action Representations in the Background and the Foreground,"Li, Haoxin; Liu, Yuan; Zhang, Hanwang; Li, Boyang*",oral,2211.12883,https://arxiv.org/abs/2211.12883,,https://huggingface.co/papers/2211.12883,,,,4,0 Deep Multitask Learning with Progressive Parameter Sharing,"Shi, Haosen; Ren, Shen; Zhang, Tianwei; Pan, Sinno Jialin*",oral,,,,,,,,, Implicit Temporal Modeling with Learnable Alignment for Video Recognition,"Tu, Shuyuan; Dai, Qi*; Wu, Zuxuan; Cheng, Zhi-Qi; Hu, Han; Jiang, Yu-Gang",oral,2304.10465,https://arxiv.org/abs/2304.10465,https://github.com/Francis-Rings/ILA,https://huggingface.co/papers/2304.10465,,,,6,0 Unmasked Teacher: Towards Training-Efficient Video Foundation Models,"Li, Kunchang*; Wang, Yali; Li, Yizhuo; Wang, Yi; He, Yinan; Wang, Limin; Qiao, Yu",oral,2303.16058,https://arxiv.org/abs/2303.16058,https://github.com/OpenGVLab/unmasked_teacher,https://huggingface.co/papers/2303.16058,,,,7,0 Large-Scale Person Detection and Localization using Overhead Fisheye Cameras,"Yang, Lu; Li, Liulei; Xin, Xueshi; Sun, Yifan; Song, Qing; Wang, 
Wenguan*",oral,2307.08252,https://arxiv.org/abs/2307.08252,,https://huggingface.co/papers/2307.08252,,,,6,0