Large Model Safety Workshop 2025

Speaker Details

Yang Zhang

CISPA Helmholtz Center for Information Security

Yang Zhang (https://yangzhangalmo.github.io/) is a tenured faculty member at CISPA Helmholtz Center for Information Security, Germany. His research concentrates on trustworthy machine learning, including privacy, security, and, more recently, LLM safety. He also works on measuring and understanding misinformation and unsafe content, such as hateful memes, on the Internet. His research has been featured in major media outlets including the Washington Post and New Scientist. He has received the NDSS 2019 distinguished paper award and the CCS 2022 best paper award runner-up.

Talk

Title: Safety Assessment of Large Generative Models

Abstract: During the past two years, large generative models like Stable Diffusion and ChatGPT have made tremendous progress. While these models are reshaping our daily lives, recent research shows that they have severe security and safety issues. In this talk, I will cover some of our recent work in this field. First, I will talk about safety and security attacks against text-to-image generative models, such as prompt stealing and unsafe content generation. Second, I will focus on large language models and discuss jailbreak attacks and machine-generated text detection and attribution.