Chapter 6 Scaling Methods

Dictionary analysis helps identify themes or sentiments, but it treats its categories as distinct labels with no inherent order. If we want to compare texts by placing them along a continuum (e.g., a left-right political spectrum, a scale of formality, or a sentiment dimension), we need scaling methods that assign each document a position on one or more underlying dimensions.

In this chapter, we will look at three prominent scaling methods: Wordscores (Laver et al., 2003), Wordfish (Slapin & Proksch, 2008), and Correspondence Analysis (specifically Multiple Correspondence Analysis for categorical text data). Wordscores and Wordfish were initially part of the main quanteda package but have since moved to the quanteda.textmodels package. Correspondence Analysis, meanwhile, is a dimensionality reduction technique that positions documents and features in a shared low-dimensional space and reveals relationships between them. For this, we will mainly use functions from the FactoMineR and factoextra packages, but we will also look at the textmodel_ca() function from quanteda.textmodels.

References

Laver, M., Benoit, K., & Garry, J. (2003). Extracting policy positions from political texts using words as data. American Political Science Review, 97(2), 311–331. https://doi.org/10.1017/S0003055403000698

Slapin, J. B., & Proksch, S.-O. (2008). A scaling model for estimating time-series party positions from texts. American Journal of Political Science, 52(3), 705–722. https://doi.org/10.1111/j.1540-5907.2008.00338.x