2026

[LLM/VLM] GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows
Jize Wang, Xuanxuan Liu, Yining Li, Songyang Zhang, Yijun Wang, and 5 more authors
arXiv, 2026

[Agent] RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents
Jize Wang, Han Wu, Zhiyuan You, Yiming Song, Yijun Wang, and 7 more authors
ACL, 2026

[Agent] TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
Zerun Ma, Guoqiang Wang, Xinchen Xie, Yicheng Chen, He Du, and 5 more authors
arXiv, 2026

[LLM/VLM] Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization
He Du, Qiming Ge, Jiakai Hu, Aijun Yang, Zheng Cai, and 16 more authors
arXiv, 2026

[LLM/VLM] DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning
Yicheng Chen, Zerun Ma, Xinchen Xie, Yining Li, and Kai Chen
arXiv, 2026

2025

[LLM/VLM] MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
Yicheng Chen, Yining Li, Kai Hu, Zerun Ma, Haochen Ye, and 1 more author
In Findings of ACL, 2025

[Vision & Multimodality] Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Yicheng Chen, Xiangtai Li, Yining Li, Yanhong Zeng, Jianzong Wu, and 2 more authors
In CVPR, 2025

[Vision & Multimodality] MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning
Xiangyu Zhao, Xiangtai Li, Haodong Duan, Haian Huang, Yining Li, and 2 more authors
IEEE TCSVT, 2025

2024

[LLM/VLM] InternLM-Law: An Open Source Chinese Legal Large Language Model
Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, and 7 more authors
arXiv, 2024

[Vision & Multimodality] MotionBooth: Motion-Aware Customized Text-to-Video Generation
Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, and 3 more authors
In NeurIPS Spotlight, 2024

[LLM/VLM] Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization
Kai Hu, Weichen Yu, Tianjun Yao, Xiang Li, Wenhe Liu, and 5 more authors
In NeurIPS, 2024

[LLM/VLM] InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yan Cao, Boxiao Wang, and 4 more authors
In NeurIPS, 2024

[LLM/VLM] InternLM2 Technical Report
Zhaowei Cai, Ming Cao, Hao Chen, Kai Chen, Kaibo Chen, and 4 more authors
arXiv, 2024

[LLM/VLM] InternLM-XComposer2: Mastering Free-Form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yan Cao, Boxiao Wang, and 4 more authors
arXiv, 2024

[Agent] GTA: A Benchmark for General Tool Agents
Jize Wang, Zerun Ma, Yining Li, Songyang Zhang, Cailian Chen, and 2 more authors
In NeurIPS Datasets and Benchmarks Track, 2024

[Vision & Multimodality] RTMW: Real-Time Multi-Person 2D and 3D Whole-body Pose Estimation
Tao Jiang, Xinchen Xie, and Yining Li
In CVPR, 2024

[Vision & Multimodality] Open-vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan, Xiangtai Li, Chong Zhou, Yining Li, Kai Chen, and 1 more author
In ECCV, 2024

[Vision & Multimodality] RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation
Peng Lu, Tao Jiang, Yining Li, Xiangtai Li, Kai Chen, and 1 more author
In CVPR, 2024

[Vision & Multimodality] OMG-Seg: Is One Model Good Enough for All Segmentation?
Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, and 4 more authors
In CVPR, 2024

[Vision & Multimodality] Towards Language-Driven Video Inpainting via Multimodal Large Language Models
Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, and 6 more authors
In CVPR, 2024

[LLM/VLM] InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Pan Zhang, Xiaoyi Dong, Yuhang Zang, Yan Cao, Rui Qian, and 4 more authors
arXiv, 2024

[LLM/VLM] MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding
Xinyu Fang, Kangrui Mao, Haodong Duan, Xiangyu Zhao, Yining Li, and 2 more authors
In NeurIPS Datasets and Benchmarks Track, 2024

[Vision & Multimodality] An Open and Comprehensive Pipeline for Unified Object Grounding and Detection
Xiangyu Zhao, Yicheng Chen, Shilin Xu, Xiangtai Li, Xinjiang Wang, and 2 more authors
arXiv, 2024

[Vision & Multimodality] RAP-SAM: Towards Real-Time All-Purpose Segment Anything
Shilin Xu, Haobo Yuan, Qingyu Shi, Lu Qi, Jingbo Wang, and 7 more authors
arXiv, 2024

2023

[Vision & Multimodality] RTMPose: Real-Time Multi-Person Pose Estimation Based on MMPose
Tao Jiang, Peng Lu, Li Zhang, Ningsheng Ma, Rui Han, and 3 more authors
arXiv, 2023

[Vision & Multimodality] DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection
Shilin Xu, Xiangtai Li, Size Wu, Wenwei Zhang, Yining Li, and 4 more authors
arXiv, 2023

[Agent] AgentLego: Open-Source Tool API Library to Extend and Enhance LLM Agents
AL Contributors
2023

2020

[Vision & Multimodality] OpenMMLab Pose Estimation Toolbox and Benchmark
MMP Contributors
GitHub, 2020

2019

[Vision & Multimodality] Deep Imbalanced Learning for Face Recognition and Attribute Prediction
Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang
IEEE TPAMI, 2019

[Vision & Multimodality] Dense Intrinsic Appearance Flow for Human Pose Transfer
Yining Li, Chen Huang, and Chen Change Loy
In CVPR, 2019

2017

[Vision & Multimodality] Learning to Disambiguate by Asking Discriminative Questions
Yining Li, Chen Huang, Xiaoou Tang, and Chen Change Loy
In ICCV, 2017

2016

[Vision & Multimodality] Learning Deep Representation for Imbalanced Classification
Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang
In CVPR Spotlight, 2016

[Vision & Multimodality] Human Attribute Recognition by Deep Hierarchical Contexts
Yining Li, Chen Huang, Chen Change Loy, and Xiaoou Tang
In ECCV, 2016