Abstract
This chapter surveys how large language models (LLMs) are being used in political science research and how they are reshaping established methods. It is organized around six applications: annotation and measurement, experimental treatment generation, silicon sampling, generative agent-based modeling, tool-augmented data collection, and the study of LLMs as objects of inquiry. The chapter traces the methodological lineages from which these approaches emerge and examines their implications for validity, inference, bias, and reproducibility. A central argument is that LLMs require researchers to rethink methodological standards. As these models match or exceed human performance on some tasks, traditional human-coded gold standards become less secure, increasing the importance of construct validity and robustness. At the same time, because LLM-based workflows are sensitive to prompts, model versions, and parameters, exact reproducibility is often unattainable. The chapter therefore discusses some of the new, and often still open, questions relating to replication and transparency with LLMs.
