3.4 Import .csv
Sometimes, text data comes pre-processed as a document-term matrix (DTM) or term-frequency matrix stored in a CSV file. A DTM typically has documents as rows, terms (or words) as columns, and cell values representing the word counts. There are two main ways to import CSV files: R’s built-in read.csv() function or the read_csv() function from the readr package:
data_dtm <- read.csv("your_dtm_file.csv", header = FALSE) # In case the first row is NOT the column names
data_dtm <- read.csv("your_dtm_file.csv", header = TRUE) # Default: the first row holds the column names
data_dtm <- readr::read_csv("your_dtm_file.csv", col_names = FALSE) # In case the first row is NOT the column names
data_dtm <- readr::read_csv("your_dtm_file.csv")
Remember that importing a pre-computed matrix means you inherit the pre-processing choices made when it was created. Also keep in mind that some CSV files are delimited not by a comma but by a semicolon (;) or a tab. In that case, we have to import the file as a delimited object:
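As a minimal, self-contained sketch of such an import in base R: the toy counts, the temporary files, and the names semi_file, tab_file, dtm_semi and dtm_tab are made up for illustration and are not part of any real data set. It assumes the first row holds the column names and uses read.delim(), whose sep argument sets the field delimiter.

```r
# Toy DTM: one row per document, one column per term (made-up counts)
lines_semi <- c("doc_id;apple;banana",
                "doc1;2;0",
                "doc2;1;3")
# Same table, but with tabs instead of semicolons
lines_tab <- gsub(";", "\t", lines_semi, fixed = TRUE)

# Write both versions to temporary files (hypothetical stand-ins for a real file)
semi_file <- tempfile(fileext = ".csv")
tab_file  <- tempfile(fileext = ".csv")
writeLines(lines_semi, semi_file)
writeLines(lines_tab, tab_file)

# sep sets the field delimiter; the first row is used as column names
dtm_semi <- read.delim(semi_file, sep = ";")
dtm_tab  <- read.delim(tab_file, sep = "\t")

# Both imports should give the same data frame: 2 documents, 3 columns
identical(dtm_semi, dtm_tab)
dim(dtm_semi)
```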