Context-Aware Membership Inference Attacks against Pre-trained Large Language Models

Hongyan Chang (Mohamed bin Zayed University of Artificial Intelligence), Ali Shahin Shamsabadi (Brave Software), Kleomenis Katevas (Brave Software), Hamed Haddadi (Brave Software, Imperial College London), and Reza Shokri (National University of Singapore) | Privacy, LLM

Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs) aim to determine whether a given data point was part of the model's training set. Prior MIAs designed for classification models fail on LLMs because they ignore the generative nature of LLMs across token sequences. In this paper, we present a novel attack on pre-trained LLMs that adapts MIA statistical tests to the perplexity dynamics of subsequences within a data point. Our method significantly outperforms prior approaches, revealing context-dependent memorization patterns in pre-trained LLMs.
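To make the idea concrete, here is a minimal sketch of a subsequence-perplexity membership score: compute per-token log-likelihoods under the target model, turn them into perplexities over sliding windows of tokens, and aggregate those into a single score that is thresholded to decide membership. The model name, window size, aggregation statistic, and threshold below are illustrative assumptions, not the paper's exact statistical test.

```python
# Illustrative sketch only: a simplified membership-inference score based on
# subsequence perplexities. Model name, window size, aggregation, and threshold
# are assumptions for illustration, not the paper's exact method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "EleutherAI/pythia-160m"  # hypothetical target model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def token_log_likelihoods(text: str) -> torch.Tensor:
    """Per-token log-likelihoods of `text` under the target model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Shift so the prediction at position t is scored against token t+1.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    return log_probs.gather(-1, ids[:, 1:, None]).squeeze(-1).squeeze(0)

def subsequence_perplexities(log_liks: torch.Tensor, window: int = 8) -> torch.Tensor:
    """Perplexity of each sliding window of consecutive tokens."""
    window = min(window, log_liks.numel())  # handle short inputs
    windows = log_liks.unfold(0, window, 1)  # (num_windows, window)
    return torch.exp(-windows.mean(dim=-1))

def membership_score(text: str, window: int = 8) -> float:
    """Higher score = more member-like. Here: negative mean subsequence
    perplexity, so uniformly low perplexity across subsequences suggests
    the text was memorized during training."""
    ppl = subsequence_perplexities(token_log_likelihoods(text), window)
    return -ppl.mean().item()

# Usage: flag `candidate` as a training member if its score exceeds a
# threshold calibrated on known non-members (the value here is arbitrary).
candidate = "The quick brown fox jumps over the lazy dog."
is_member = membership_score(candidate) > -20.0
```

In this sketch the per-window perplexities capture how memorization varies across the contexts within a single data point, rather than relying on one sequence-level loss as classification-style MIAs do.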

View paper
