Xu Luo
I am currently pursuing a Ph.D. at UESTC, supervised by Prof. Jingkuan Song. My primary research interest lies in advancing the state of the art in generative AI models, including diffusion models and (multimodal) large language models (LLMs). Beyond this, I am committed to realizing the potential of these models by shaping them into autonomous agents capable of performing complex tasks in real-world scenarios.
Email  / 
CV  / 
Google Scholar  / 
GitHub
News
[2024/09] Two papers accepted to NeurIPS 2024.
[2024/05] Thrilled to introduce our latest work, Lumina-T2X, a text-to-any-modality generation framework! Our models can generate high-resolution images & 720p videos of any aspect ratio, 3D multiview images, and audio conditioned on text.
[2023/10] I regret to say that FSL is no longer a valuable research topic. My research focus has shifted from Few-Shot Learning (FSL) to the burgeoning field of generative AI. I am currently an intern at Shanghai AI Lab.
[2023/04] Our systematic study on few-shot learning has been accepted at ICML 2023! This project demanded a significant investment of time and revealed numerous unexpected findings. Check it out now!
[2022/05] Our paper, which uncovers fundamental problems inherent in few-shot learning, was accepted at ICML 2022.
[2021/09] My first top-conference paper, at NeurIPS 2021! The paper is not that good, but it was a good experience (:
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao*,
Le Zhuo*,
Dongyang Liu*,
Ruoyi Du*,
Xu Luo*,
Longtian Qiu*,
Yuhang Zhang,
Chen Lin,
Rongjie Huang,
Shijie Geng,
Renrui Zhang,
Junlin Xi,
Wenqi Shao,
Zhengkai Jiang,
Tianshuo Yang,
Weicai Ye,
Tong He,
Jingwen He,
Yu Qiao,
Hongsheng Li
arXiv, 2024
[PDF]
[Code]
Text-to-any-modality models that generate images, videos, audio, and 3D multiview images conditioned on text within a flow-based diffusion framework, built on novel Flag-DiT architectures with up to 5B parameters and 128K-token context windows.
Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks
Xu Luo,
Difan Zou,
Lianli Gao,
Zenglin Xu,
Jingkuan Song
arXiv, 2023
[PDF]
Uncovering and analyzing the extreme feature redundancy exhibited by pretrained vision models when transferred to few-shot tasks.
DETA: Denoised Task Adaptation for Few-shot Learning
Ji Zhang,
Lianli Gao,
Xu Luo,
Heng Tao Shen,
Jingkuan Song
ICCV, 2023
[PDF]
[Code]
Proposing DETA, a framework that addresses potential data and label noise in downstream few-shot transfer tasks.
A Closer Look at Few-shot Classification Again
Xu Luo*,
Hao Wu*,
Ji Zhang,
Lianli Gao,
Jing Xu,
Jingkuan Song
ICML, 2023
[PDF]
[Code]
Empirically demonstrating that training and adaptation algorithms in few-shot classification can be disentangled, and analyzing each phase separately, leading to several important observations.
Alleviating the Sample Selection Bias in Few-shot Learning by Removing Projection to the Centroid
Jing Xu,
Xu Luo,
Xinglin Pan,
Yanan Li,
Wenjie Pei,
Zenglin Xu
NeurIPS, 2022   (Spotlight)
[PDF]
[Code]
Revealing a strong bias caused by the centroid of features in each few-shot learning task. A simple method is designed to rectify this bias by removing the dimension along the direction of the task centroid from the feature space.
Channel Importance Matters in Few-Shot Image Classification
Xu Luo,
Jing Xu,
Zenglin Xu
ICML, 2022
[PDF]
[Code]
Revealing and analyzing the channel bias problem, which we find critical in few-shot learning, through a simple channel-wise feature transformation applied only at test time.
Rectifying the Shortcut Learning of Background for Few-Shot Learning
Xu Luo,
Longhui Wei,
Liangjian Wen,
Jinrong Yang,
Lingxi Xie,
Zenglin Xu,
Qi Tian
NeurIPS, 2021
[PDF]
[Code]
Identifying image background as shortcut knowledge that does not generalize beyond training categories in few-shot learning. A novel framework, COSOC, is designed to tackle this problem.
Boosting Few-Shot Classification with View-Learnable Contrastive Learning
Xu Luo,
Yuxuan Chen,
Liangjian Wen,
Lili Pan,
Zenglin Xu
ICME, 2021
[PDF]
[Code]
Applying contrastive learning to few-shot learning, with views generated in a learning-to-learn fashion.
Academic service
Conference reviewer
NeurIPS 2023, 2024
ICML 2022, 2024
ICLR 2024, 2025
CVPR 2023, 2024, 2025
ICCV 2023
ECCV 2022, 2024
CoLLAs 2023, 2024
AISTATS 2025
Journal reviewer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
IEEE Transactions on Image Processing (TIP)
This well-designed template is borrowed from this guy