The corpus contains approximately 50 million words of text. The table below summarizes the texts by decade and genre, and you can also download a full listing of all 1135 texts.

GENRE              1920s-1940s 1950s-1970s       1980s       1990s       2000s       2010s       TOTAL
SPOKEN      texts            -           -           -           3         167           6         176
            words            -           -           -      94,527   5,592,381     187,689   5,874,597
FICTION     texts           23           4          19          36          19           4         105
            words    1,739,983     329,263     506,611     860,022     452,736      12,766   3,901,381
MAGAZINES   texts            -           -          46          88         172           2         308
            words            -           -   1,388,416   2,185,009   6,359,030      55,358   9,987,814
NEWSPAPERS  texts            -           -          11          37         121           4         173
            words            -           -     835,569   1,805,388   9,948,930     510,807  13,100,695
NONFICTION  texts           13           3          10          10       2,801           -       2,837
            words      761,739     172,617     735,567     822,731       2,728           -   2,495,382
ACADEMIC    texts            6           5          44          78         155           7         295
            words      125,134     193,961   1,996,289   2,523,454   9,575,865     230,650  14,645,353
MISC        texts            -           -          31           -          10           -          41
            words            -           -      49,437           -      26,620           -      76,057
TOTAL       texts           42          12         161         252       3,445          23       3,935
            words    2,626,857     695,841   5,511,889   8,291,131  31,958,290     997,271  50,081,279

Note: The Excel spreadsheet shown above contains complete information on the texts used in the corpus. The [textID] column corresponds to the [t] value in the URL found in the [title] column of the Keyword in Context display. For example, if the URL is http://corpus2.byu.edu/can/x4.asp?t=631&ID=30637060, then the [textID] is [631], which corresponds to [Maclean's. February 24, 1986] in the spreadsheet.
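
To pull the [textID] out of many such URLs at once, the lookup described in the note can be sketched in Python. This is only an illustration: the function name text_id_from_url is hypothetical and not part of the corpus site, and the sketch assumes the [t] value is always a plain integer, as in the example URL.

```python
from urllib.parse import urlparse, parse_qs


def text_id_from_url(url: str) -> int:
    """Return the value of the `t` query parameter as an integer.

    Hypothetical helper: assumes `t` is present and numeric,
    as in the example URL from the note.
    """
    # parse_qs maps each query parameter name to a list of its values.
    query = parse_qs(urlparse(url).query)
    values = query.get("t")
    if not values:
        raise ValueError(f"no 't' parameter in URL: {url}")
    return int(values[0])


if __name__ == "__main__":
    example = "http://corpus2.byu.edu/can/x4.asp?t=631&ID=30637060"
    print(text_id_from_url(example))  # 631
```

The resulting number can then be matched against the [textID] column of the spreadsheet to find the source text.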