About Me

Wenjie Wang is a tenure-track Assistant Professor in the School of Information Science and Technology at ShanghaiTech University. Dr. Wang completed her Ph.D. in 2023 at Emory University (Atlanta, US) under the supervision of Dr. Li Xiong. Before that, she received her Bachelor's degree from Huazhong University of Science and Technology (Wuhan, China) in 2017. As first or corresponding author, Dr. Wang has contributed to more than 20 papers in top-tier journals and conferences. She has also served as an Area Chair for ACL and EMNLP.

Her major research interests are:

  • Safety Alignment of LLM: Ensuring model outputs consistently adhere to ethical guidelines, avoid harmful content, and respect human intent, even in adversarial scenarios.
  • Personalized Alignment of LLM: Adapting LLMs to align with individual users’ values, preferences, and communication styles.
  • AI Agent Safety: Studying the robustness of agent safety from the perspectives of perception, reasoning, planning, action, and multi-agent collaboration.
  • Privacy-Preserving LLM: Differential privacy, machine unlearning, and federated learning.

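We are recruiting master's and Ph.D. students for Fall 2026 admission. Students interested in the research directions above are welcome to contact me, with a CV attached, at wangwj1@shanghaitech.edu.cn.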

About ASPIRE Lab

The AI Security, Privacy, and Robustness Lab (ASPIRE Lab), led by Dr. Wenjie Wang, is a research group dedicated to enhancing the responsibility, reliability, privacy, and trustworthiness of cutting-edge AI technologies, particularly Large Language Models (LLMs), multimodal models, and AI agents. The ASPIRE Lab’s mission is to develop novel techniques and methodologies that address the unique challenges posed by the increasing complexity of modern AI systems.

News

[20250516] Two papers accepted to ACL 2025🎉

Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang. DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing.

Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin, Yibei Yang, Wenjie Wang. Don’t Say No: Jailbreaking LLM by Suppressing Refusal.

[20241224] MMJ-Bench is accepted to AAAI 2025🎉

Fenghua Weng, Yue Xu, Chengyan Fu, Wenjie Wang. MMJ-Bench: A Comprehensive Study on Jailbreak Attacks and Defenses for Multimodal Large Language Models.

[20240922] CIDER is accepted to EMNLP 2024🎉

Yue Xu, Xiuyuan Qi, Zhan Qin, Wenjie Wang. Cross-modality Information Check for Detecting Jailbreaking in Multimodal Large Language Models.

[20240314] LinkPrompt is accepted to NAACL 2024🎉

Yue Xu and Wenjie Wang. 2024. LinkPrompt: Natural and Universal Adversarial Attacks on Prompt-based Language Models.

[20231209] IGAMT is accepted to AAAI-24🎉

Wang, W., Tang, P., Lou, J., Shao, Y., Waller, L., Ko, Y.-an, & Xiong, L. (2024). IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity.

[20231209] Demo: Certified Toolformer is accepted to CCS 2023🎉

Yue Xu and Wenjie Wang. 2023. Demo: Certified Robustness on Toolformer.

[20230214] Started my career at ShanghaiTech University on Valentine’s Day🎉🌹