About Me

Wenjie Wang is a tenure-track Assistant Professor at the School of Information Science and Technology at ShanghaiTech University. Dr. Wang completed her Ph.D. in 2023 at Emory University (Atlanta, US) under the supervision of Dr. Li Xiong. Before that, she received her bachelor's degree from Huazhong University of Science and Technology (Wuhan, China) in 2017. As first or corresponding author, Dr. Wang has contributed to more than 20 papers in top-tier journals and conferences. She has also served as an Area Chair for ACL and EMNLP.

Her major research interests are:

Safety Alignment of LLM: Ensuring model outputs consistently adhere to ethical guidelines, avoid harmful content, and respect human intent, even in adversarial scenarios.
Personalized Alignment of LLM: Adapting LLMs to align with individual users’ values, preferences, and communication styles.
AI Agent Safety: Studying the robustness of agent safety across perception, reasoning, planning, action, and multi-agent collaboration.
Privacy-Preserving LLM: Differential privacy, machine unlearning, and federated learning.

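We are recruiting master's and Ph.D. students for Fall 2027 admission. Students interested in the research directions above are welcome to contact me, with a CV attached, at wangwj1@shanghaitech.edu.cn.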

About ASPIRE Lab

The AI Security, Privacy, and Robustness Lab (ASPIRE Lab), led by Dr. Wenjie Wang, is a research group dedicated to enhancing the responsibility, reliability, privacy, and trustworthiness of cutting-edge AI technologies, particularly Large Language Models (LLMs), multimodal models, and AI agents. The ASPIRE Lab’s mission is to develop novel techniques and methodologies that address the unique challenges posed by the increasing complexity of modern AI systems.

News

[20260510] One paper is accepted to TPAMI🎉

[20260126] Three papers are accepted to ICLR2026🎉

Tianyu Chen, Jian Lou, Wenjie Wang. Safeguarding Multimodal Knowledge Copyright in the RAG-as-a-Service Environment.

Yu Pan, Jiahao Chen, Lin Wang, Bingrong Dai, Wenjie Wang. STEDiff: Revealing the Spatial and Temporal Redundancy of Backdoor Attacks in Text-to-Image Diffusion Models.

Jiajin Tang, Gaoyang, Wenjie Wang, Sibei Yang, Xing Chen. Chart Deep Research in LVLMs via Parallel Relative Policy Optimization. International Conference on Learning Representations.

[20250919] Fairmaker is accepted to NeurIPS2025🎉

Yue Xu, Chengyan Fu, Li Xiong, Sibei Yang, Wenjie Wang. Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models

[20250822] ADPO is accepted to EMNLP2025🎉

Fenghua Weng, Jian Lou, Jun Feng, Minlie Huang, Wenjie Wang. Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training

[20250516] Two papers are accepted to ACL2025🎉

Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang. DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing.

Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin, Yibei Yang, Wenjie Wang. Don’t Say No: Jailbreaking LLM by Suppressing Refusal.

[20241224] MMJ-Bench is accepted to AAAI2025🎉

Fenghua Weng, Yue Xu, Chengyan Fu, Wenjie Wang. MMJ-Bench: A Comprehensive Study on Jailbreak Attacks and Defenses for Multimodal Large Language Models.

[20240922] CIDER is accepted to EMNLP2024🎉

Yue Xu, Xiuyuan Qi, Zhan Qin, Wenjie Wang. Cross-modality Information Check for Detecting Jailbreaking in Multimodal Large Language Models.

[20240314] LinkPrompt is accepted to NAACL2024🎉

Yue Xu and Wenjie Wang. 2024. LinkPrompt: Natural and Universal Adversarial Attacks on Prompt-based Language Models.

[20231209] IGAMT is accepted to AAAI-24🎉

Wang, W., Tang, P., Lou, J., Shao, Y., Waller, L., Ko, Y.-an, & Xiong, L. (2024). IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity.

[20231209] Demo: Certified Toolformer is accepted to CCS2023🎉

Yue Xu and Wenjie Wang. 2023. Demo: Certified Robustness on Toolformer.

[20230214] Started my career at ShanghaiTech University on Valentine’s Day🎉🌹