
Bot or Not: Synthetic Text Generation in Social Media Contexts Using Large Language Models

Abstract: The rise of bot accounts on social media platforms makes it difficult to distinguish synthetic from authentic user accounts. This research develops methodologies for generating bot accounts and synthetic text posts for an adversarial competition that evaluates the efficacy of bot detection systems relying primarily on the text of a user’s posts, without the additional metadata typically associated with user accounts, such as number of likes or number of followers. Using Markov chain models, fine-tuned large language models (LLMs), and probabilistic text augmentation, the study explores strategies for making bot-generated content appear more authentic so that it more reliably deceives bot detection methods. Key techniques include integrating human-like typographical errors, statistical patterns, and probabilistic time distributions into synthetic posts to increase the difficulty of detection. Results indicate that incorporating advanced text augmentation and LLM-based strategies can reduce detection rates. However, continued development is necessary to mimic nuanced human behaviors effectively. Future work can refine these methodologies, explore group bot dynamics, and analyze sentiment-based text patterns to further challenge bot detection systems.
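To make two of the techniques named in the abstract concrete, here is a minimal Python sketch of a word-level Markov chain generator followed by probabilistic typo injection. It is an illustration under assumptions, not the study's implementation: the function names (`build_chain`, `generate_post`, `add_typos`), the first-order chain, the adjacent-letter swap as the typo model, and the toy corpus are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def build_chain(posts):
    """Map each word to the list of words observed to follow it (first-order chain)."""
    chain = defaultdict(list)
    for post in posts:
        words = post.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate_post(chain, start, max_words, rng):
    """Walk the chain from `start`, sampling a follower at each step."""
    words = [start]
    while len(words) < max_words:
        followers = chain.get(words[-1])
        if not followers:
            break  # dead end: the last word was never followed by anything
        words.append(rng.choice(followers))
    return " ".join(words)


def add_typos(text, rate, rng):
    """Swap adjacent letters with probability `rate` (assumed typo model)."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair so it is not swapped back
        else:
            i += 1
    return "".join(chars)


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    corpus = [
        "the game last night was great",
        "the weather was great last week",
        "last night the weather was awful",
    ]
    rng = random.Random(0)
    chain = build_chain(corpus)
    post = generate_post(chain, "the", 8, rng)
    print(add_typos(post, 0.05, rng))
```

Keeping the typo step separate from generation means the same augmentation could be applied to text from any generator, including an LLM. The swap rate here is an arbitrary placeholder rather than a value taken from the study.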