Xu Luo
I am currently pursuing a Ph.D. at UESTC, supervised by Prof. Jingkuan Song. My primary research interest lies in advancing the state of the art in generative AI models, including diffusion models and (multimodal) large language models (LLMs). Beyond this, I am committed to realizing the potential of these models by shaping them into autonomous agents capable of performing complex tasks in real-world scenarios.
Email  / 
CV  / 
Google Scholar  / 
GitHub
News
[2024/09] Two papers accepted to NeurIPS 2024.
[2024/05] Thrilled to introduce our latest work, Lumina-T2X, a text-to-any-modality generation framework! Our models can generate high-resolution images & 720p videos of any aspect ratio, 3D multiview images, and audio conditioned on text.
[2023/10] I regret to say that FSL is no longer a valuable research topic. My research focus has shifted from Few-Shot Learning (FSL) to the burgeoning field of generative AI. I am currently an intern at Shanghai AI Lab.
[2023/04] Our systematic study on few-shot learning has been accepted at ICML 2023! This project demanded a significant investment of time and revealed numerous unexpected findings. Check it out now!
[2022/05] Our paper, which uncovers fundamental problems inherent in few-shot learning, was accepted at ICML 2022.
[2021/09] My first top-conference paper, at NeurIPS 2021! The paper is not that good, but it was a good experience (:
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao*,
Le Zhuo*,
Dongyang Liu*,
Ruoyi Du*,
Xu Luo*,
Longtian Qiu*,
Yuhang Zhang,
Chen Lin,
Rongjie Huang,
Shijie Geng,
Renrui Zhang,
Junlin Xi,
Wenqi Shao,
Zhengkai Jiang,
Tianshuo Yang,
Weicai Ye,
Tong He,
Jingwen He,
Yu Qiao,
Hongsheng Li
arXiv, 2024
[PDF]
[Code]
Text-to-any-modality models that generate images, videos, audio, and 3D multiview images conditioned on text within a flow-based diffusion framework, built on novel Flag-DiT architectures with up to 5B parameters and 128K-token context windows.
Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks
Xu Luo,
Difan Zou,
Lianli Gao,
Zenglin Xu,
Jingkuan Song
arXiv, 2023
[PDF]
Uncovering and analyzing the extreme feature redundancy exhibited by pretrained vision models when transferred to few-shot tasks.
DETA: Denoised Task Adaptation for Few-shot Learning
Ji Zhang,
Lianli Gao,
Xu Luo,
Heng Tao Shen,
Jingkuan Song
ICCV, 2023
[PDF]
[Code]
Proposing DETA, a framework that addresses potential data and label noise in downstream few-shot transfer tasks.
A Closer Look at Few-shot Classification Again
Xu Luo*,
Hao Wu*,
Ji Zhang,
Lianli Gao,
Jing Xu,
Jingkuan Song
ICML, 2023
[PDF]
[Code]
Empirically demonstrating that training and adaptation algorithms in few-shot classification can be disentangled, and analyzing each phase separately, leading to several important observations.
Alleviating the Sample Selection Bias in Few-shot Learning by Removing Projection to the Centroid
Jing Xu,
Xu Luo,
Xinglin Pan,
Yanan Li,
Wenjie Pei,
Zenglin Xu
NeurIPS, 2022   (Spotlight)
[PDF]
[Code]
Revealing a strong bias caused by the centroid of features in each few-shot learning task. A simple method is designed to rectify this bias by removing the dimension along the direction of the task centroid from the feature space.
Channel Importance Matters in Few-Shot Image Classification
Xu Luo,
Jing Xu,
Zenglin Xu
ICML, 2022
[PDF]
[Code]
Revealing and analyzing the channel bias problem, which we find critical in few-shot learning, through a simple channel-wise feature transformation applied only at test time.
Rectifying the Shortcut Learning of Background for Few-Shot Learning
Xu Luo,
Longhui Wei,
Liangjian Wen,
Jinrong Yang,
Lingxi Xie,
Zenglin Xu,
Qi Tian
NeurIPS, 2021
[PDF]
[Code]
Identifying image background as shortcut knowledge that does not generalize beyond training categories in few-shot learning. A novel framework, COSOC, is designed to tackle this problem.
Boosting Few-Shot Classification with View-Learnable Contrastive Learning
Xu Luo,
Yuxuan Chen,
Liangjian Wen,
Lili Pan,
Zenglin Xu
ICME, 2021
[PDF]
[Code]
Applying contrastive learning to few-shot learning, with views generated in a learning-to-learn fashion.
Academic service
Conference reviewer
NeurIPS 2023, 2024
ICML 2022, 2024
ICLR 2024, 2025
CVPR 2023, 2024, 2025
ICCV 2023
ECCV 2022, 2024
CoLLAs 2023, 2024
AISTATS 2025
Journal reviewer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
IEEE Transactions on Image Processing (TIP)
This well-designed template is borrowed from this guy