Chao Yang

杨超, Research Scientist, Shanghai AI Lab.


I am a Research Scientist at Shanghai AI Lab (上海人工智能实验室), where I lead a research group on Fundamental Large Model Safety & Decision Intelligence. I recently completed my postdoctoral fellowship under the guidance of Professor Yu Qiao, with research focused on the safety and security of large-scale models. My work examined the vulnerabilities of AI systems and the defense mechanisms that protect them, particularly in the context of large language models and their applications.

Previously, I received my Ph.D. from the Department of Computer Science and Technology at Tsinghua University in 2022, advised by Prof. Fuchun Sun and Prof. Huaping Liu.

My research interests include Large Language Model Safety, Multimodal Large Models, and Robotic Embodied Intelligence for Trustworthy AGI. Some of my current research keywords are listed below:

  • Large Language Model: LLM Post-training and Safety Alignment, LLM Attack and Defense.
  • Multimodal LLM: Modality Fusion, Multimodal Alignment, VQA.
  • Embodied Robotics: Robotic Manipulation, Reinforcement Learning, Imitation Learning.

For academic cooperation, please feel free to email me at yangchao [at] pjlab [dot] org [dot] cn. For other matters, please contact me at yangchao9264 [at] 126 [dot] com or yangchaoemigmo [at] gmail [dot] com.

news

Dec 10, 2024 TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning is accepted by AAAI 2025. :sparkles::sparkles:
Sep 26, 2024 Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models is accepted by NeurIPS 2024. :sparkles::sparkles:
Sep 23, 2024 Inference-Time Language Model Alignment via Integrated Value Guidance (IVG) is accepted by EMNLP 2024. :sparkles::sparkles:
Jul 04, 2024 MM-SafetyBench (A Benchmark for Safety Evaluation of Multimodal Large Language Models) is accepted by ECCV 2024. :sparkles::sparkles:
May 16, 2024 Three papers (Emulated Disalignment, Structured Reasoning, Multi-Objective DPO) are accepted by ACL 2024. More details are coming. :sparkles::sparkles:
May 02, 2024 RoboCodeX is accepted by ICML 2024. :sparkles::sparkles:
Apr 20, 2024 One paper is accepted by the IJCAI 2024 Survey Track. :sparkles::sparkles:
Mar 13, 2024 One LLM safety survey paper is accepted by NAACL 2024. :sparkles: :smile:
Feb 27, 2024 Two papers (LLaMA-Excitor, VideoDistill) are accepted by CVPR 2024. :sparkles::sparkles:

selected publications

  1. CVPR2024
    VideoDistill: Language-aware Vision Distillation for Video Question Answering
    Bo Zou*, Chao Yang*, Yu Qiao, and 2 more authors
    arXiv preprint arXiv:2404.00973, 2024
  2. CVPR2024
    LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction
    Bo Zou*, Chao Yang*, Yu Qiao, and 2 more authors
    2024
  3. ACL2024 Oral
    Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
    Zhanhui Zhou, Jie Liu, Zhichen Dong, and 4 more authors
    arXiv preprint arXiv:2402.12343, 2024
  4. NAACL2024
    Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
    Zhichen Dong, Zhanhui Zhou, Chao Yang+, and 2 more authors
    arXiv preprint arXiv:2402.09283, 2024
  5. AAAI2024
    Critic-Guided Decision Transformer for Offline Reinforcement Learning
    Yuanfu Wang, Chao Yang, Ying Wen, and 2 more authors
    arXiv preprint arXiv:2312.13716, 2023
  6. ECCV2024
    Safety of Multimodal Large Language Models on Images and Text
    Xin Liu, Yichen Zhu, Yunshi Lan, and 2 more authors
    arXiv preprint arXiv:2402.00357, 2024