Xuyang Liu (刘旭洋)

🌈 I am a second-year Master’s student at Sichuan University, supervised by Prof. Honggang Chen. I am currently a research intern at Taobao & Tmall Group, focusing on efficient multi-modal large language models (MLLMs). Previously, I had the honor of visiting the VIP Lab at SUSTech, supervised by Prof. Feng Zheng, and I gained valuable experience at the MiLAB at Westlake University, supervised by Prof. Donglin Wang. I am also very glad to be advised by and to collaborate with Dr. Siteng Huang from DAMO Academy and Asst. Prof. Linfeng Zhang from SJTU.

My research interests span Efficient Multi-modal Large Language Models.

📢 Recently, I have been focusing on the acceleration of diffusion models. Feel free to reach out to me by email if you are interested in collaborating.

📰 News

  • [Jul 22, 2024] I began my research internship at Taobao & Tmall Group, focusing on multi-modal large language models (MLLM).
  • [Jul 2, 2024] A first-author paper on memory-efficient fine-tuning for referring expression comprehension has been released!
  • [May 24, 2024] A co-first-author paper on efficient fine-tuning and inference for Vision Transformers has been released!
  • [May 16, 2024] A paper on reduced-reference super-resolution image quality assessment has been released!
  • [Mar 13, 2024] A co-first-author paper on parameter-efficient tuning for visual grounding was accepted to ICME 2024 and selected for oral presentation!
  • [Dec 13, 2023] A first-author paper on diffusion-based zero-shot visual grounding was accepted to ICASSP 2024!

📃 Publications

Please find my full list of publications on my Google Scholar profile.

Conference Papers

Xuyang Liu*, Siteng Huang*, Yachen Kang, Honggang Chen, Donglin Wang, "VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders". In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 [paper] [code] [poster]

Ting Liu*, Xuyang Liu*, Siteng Huang, Honggang Chen, Quanjun Yin, Long Qin, Donglin Wang, Yue Hu, "DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding". In IEEE International Conference on Multimedia & Expo (ICME), 2024 (Oral) [paper] [code] [poster]

Journal Papers

Xuyang Liu, "GLMLP-TRANS: A transportation mode detection model using lightweight sensors integrated in smartphones". Computer Communications, 2022 (SCI Q1, IF: 6.0) [paper] [code]

Preprints & Under Submission

Xuyang Liu*, Ting Liu*, Siteng Huang, Yue Hu, Quanjun Yin, Donglin Wang, Honggang Chen, "M2IST: Multi-Modal Interactive Side-Tuning for Memory-efficient Referring Expression Comprehension". arXiv preprint arXiv:2407.01131. [paper]

Ting Liu*, Xuyang Liu*, Siteng Huang, Liangtao Shi, Zunnan Xu, Yi Xin, Quanjun Yin, Xiaohong Liu, "Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference". arXiv preprint arXiv:2405.14700. [paper] [code] [Chinese intro] [Zhihu]

Xinying Lin, Xuyang Liu, Hong Yang, Xiaohai He, Honggang Chen, "Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment". arXiv preprint arXiv:2405.09472. [paper]

💼 Experience

  • Research Intern - Taobao & Tmall Group, Alibaba Group, Beijing
    • Time: July 2024 - Present.
    • Topic: Multi-modal Large Language Models (MLLM).
  • Research Intern - Machine Intelligence Laboratory, Westlake University, Hangzhou

📠 Services

Conference Reviewer

  • ACM International Conference on Multimedia (MM)
  • ACM International Conference on Multimedia Retrieval (ICMR)