6.2 Keywords in Context
One simple but effective way to learn more about our texts is to look at keywords in context (KWIC). Here, we look at which other words appear around a given word in our texts. This is also known as building a concordance. Doing so is easy with our tokens data frame. Let’s take all words that start with ‘secur’ and look at the three words that occur before and after each of them. We run a keyword-in-context search on the tokens and store the result in kwic_output.
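A minimal sketch of such a search, assuming the quanteda package and its kwic() function. The toy documents and the object name data_tokens are stand-ins invented for illustration, not the texts or names used elsewhere in this chapter; only kwic_output matches the name used in the code that follows.

```r
library(quanteda)

# Toy documents standing in for the chapter's texts (hypothetical)
toy_texts <- c(
  text1 = "The allies desired a security system built on democratic governments.",
  text2 = "Churchill mainly centered on securing control of the Mediterranean."
)

# Tokenise and drop punctuation; data_tokens is an assumed name
data_tokens <- tokens(toy_texts, remove_punct = TRUE)

# The glob pattern "secur*" matches any token that starts with "secur";
# window = 3 keeps three tokens on either side of each match
kwic_output <- kwic(data_tokens, pattern = "secur*",
                    valuetype = "glob", window = 3)

head(kwic_output)
```

Changing window widens or narrows the context, and valuetype = "regex" could be used instead if a glob pattern is not expressive enough.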
In the output object, we find a column labelled pre and another labelled post. These hold the words that came before and after each match for ‘secur*’, while the keyword column holds the match itself. We can easily extract these and combine them:
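# Context before the match, the matched keyword, and context after it
text_pre <- kwic_output$pre
text_post <- kwic_output$post
text_word <- kwic_output$keyword
# Paste the three pieces back together into one string per match
text <- as.data.frame(paste(text_pre, text_word, text_post))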
We then combine this with the name of the document each match came from, so that we know which text it belongs to:
extracted <- cbind(kwic_output$docname, text)
names(extracted) <- c("docname", "text")
head(extracted)
## docname text
## 1 text10 making allowances peace security ushering period détente
## 2 text27 establishment maintenance post-war security scholars contend western
## 3 text27 western allies desired security system democratic governments
## 4 text27 churchill's mainly centered securing control mediterranean ensuring
## 5 text34 peace enforcement capacity security council effectively paralyzed
## 6 text44 leaders establishing secret security force prevent subversion
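The extraction and combination steps above can also be written as a single data.frame() call, which names the columns as they are created. The sketch below uses a small made-up data frame, kwic_demo, with the same column names as the KWIC output so that it runs on its own; the rows are invented for illustration.

```r
# Stand-in for kwic_output with only the columns used here (made-up rows)
kwic_demo <- data.frame(
  docname = c("text1", "text2"),
  pre     = c("allies desired a", "mainly centered on"),
  keyword = c("security", "securing"),
  post    = c("system built on", "control of the"),
  stringsAsFactors = FALSE
)

# Build the two-column table in one step: document name plus full context
extracted_demo <- data.frame(
  docname = kwic_demo$docname,
  text    = paste(kwic_demo$pre, kwic_demo$keyword, kwic_demo$post),
  stringsAsFactors = FALSE
)

head(extracted_demo)
```

This avoids the intermediate objects and the separate names() call, at the cost of a slightly longer single expression.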