5.3 Exercises
Using a sentiment analysis dictionary such as LSD2015, analyse the sentiment expressed in the Wikipedia article on the Cold War from the previous chapter. Are there differences in sentiment between the paragraphs, and if so, what might the reasons for these differences be? Consider plotting the sentiment score against the document index.
How did the Cold War influence culture, as reflected in textual data? Create your own dictionary of words or phrases representing cultural aspects influenced by the Cold War (e.g. terms related to fear, propaganda and specific cultural phenomena). Apply this dictionary to a relevant corpus (e.g. articles about the Cold War or cultural texts from the period) and analyse the frequency of these terms. Discuss your findings based on the dictionary counts.
Explore the different sentiment scores provided by the vader package (pos, neg, neu and compound). Plot the distribution of the ‘pos’, ‘neg’ and ‘neu’ scores. How do these relate to the ‘compound’ score and the human-assigned labels?
Research other sentiment dictionaries available in R, for example in packages such as syuzhet or tidytext, that can work with quanteda outputs. Apply one of these dictionaries to the film review or tweet data, then compare the results with those obtained using LSD2015 or VADER. What similarities and differences are there in the sentiment scores?
Consider the limitations of dictionary analysis, especially with regard to context and negation. Find examples in the tweet data where a simple dictionary lookup might misclassify the sentiment due to negation (‘not happy’) or sarcasm. How might these cases be handled in a more advanced approach to sentiment analysis?