| Nov 20, 2025 | [AAAI2026] SHADOW: Dynamic-Aware Credit Assignment for Efficient Long-Horizon Agent Training is accepted by AAAI2026. |
| Jul 15, 2025 | 🎉 Big Project Release! We introduce SafeWork-R1, a cutting-edge multimodal reasoning model that demonstrates the coevolution of capabilities and safety. SafeWork-R1 |
| Jun 04, 2025 | [NeurIPS2025] We find that VLMs can piece together scattered patches of harmful content seen during training, which lets that content bypass data moderation and leads the model to generate dangerous responses when it encounters the full image or related text. VLMs Can Aggregate Scattered Training Patches is accepted by NeurIPS2025. |
| May 16, 2025 | [ACL2025] Our paper Adversarial Preference Learning for Robust LLM Alignment is accepted by ACL2025. Arxiv Link |
| May 02, 2025 | [ICML2025] Emergent Response Planning in LLM and C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Generation are accepted by ICML2025. |
| Dec 08, 2024 | We propose a new law, the AI 45°-Law, toward trustworthy AGI! Arxiv Link |
| Sep 26, 2024 | [NeurIPS2024] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models is accepted by NeurIPS 2024. NeurIPS Link  |
| Sep 23, 2024 | [EMNLP2024] Inference-Time Language Model Alignment via Integrated Value Guidance is accepted by EMNLP 2024. Arxiv Link  |
| Jul 04, 2024 | [ECCV2024] MM-SafetyBench (A Benchmark for Safety Evaluation of Multimodal Large Language Models) is accepted by ECCV 2024.  |
| May 16, 2024 | [ACL2024] Three Papers (Emulated Disalignment, SEER: Structured Reasoning, Multi-Objective DPO) are accepted by ACL 2024. |
| May 02, 2024 | [ICML2024] RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis is accepted by ICML 2024.  |
| Apr 20, 2024 | [IJCAI2024] One paper, Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey, is accepted by the IJCAI 2024 Survey Track.  |
| Mar 13, 2024 | [NAACL2024] One LLM safety survey paper, Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey, is accepted by NAACL 2024. |
| Feb 27, 2024 | [CVPR2024] Two Papers (LLaMA-Excitor, VideoDistill) are accepted by CVPR 2024.  |
| Dec 09, 2023 | [AAAI2024] One offline RL paper, Critic-Guided Decision Transformer for Offline Reinforcement Learning, is accepted by AAAI 2024.  |