Boosting Large Language Models for Mental Manipulation Detection via Data Augmentation and Distillation
Abstract
MentalMAD framework enhances large language models for detecting covert mental manipulation in social media through annotation-free data augmentation, complementary-task supervision, and knowledge distillation, achieving significant performance improvements on a new real-world dataset.
Mental manipulation on social media poses a covert yet serious threat to individuals' psychological well-being and the integrity of online interactions. Detecting such behavior is challenging due to the difficult-to-annotate training data, its highly covert and multi-turn nature, and the lack of real-world datasets. To address these challenges, we propose MentalMAD, a framework that enhances large language models for mental manipulation detection. Our approach consists of three key components: EvoSA, an annotation-free data augmentation method that combines evolutionary operations with speech-act-aware prompting; teacher-model-generated complementary-task supervision; and Complementary-Convergent Distillation, a phase-wise strategy for transferring manipulation-specific knowledge to student models. We then constructed the ReaMent dataset, comprising 5,000 real-world-sourced dialogues. Extensive experiments show that MentalMAD improves accuracy by 14.0%, macro-F1 by 27.3%, and weighted F1 by 15.1% over the strongest baseline. The code and the dataset are publicly available at https://github.com/Yuansheng-Gao/MentalMAD.
Models citing this paper 0
No model linking this paper
Datasets citing this paper 1
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper