| English | Size | Language / time | Content |
| Corpus of Contemporary American English (COCA) | 400 million | American English, 1990-present | 20 million words each year, 1990-present. Equally divided into spoken, fiction, popular magazine, newspaper, and academic. Will be continually updated. |
| Corpus of Historical American English (COHA) | 300 million | American English, c1810 - present | (Available August 2010) 13-18 million words each decade, divided among fiction, popular magazines, newspaper, and other non-fiction. |
| BYU-BNC: The British National Corpus | 100 million | British English, ~1980s-1993 | 90 million words written (fiction, newspaper, academic, etc); 10 million spoken. [Website for the original BNC] |
| TIME Magazine | 100 million | American English, 1923-present | More than 275,000 articles from TIME Magazine. Wide range of topics: news, sports, business, culture, health, entertainment, etc. |
| Other languages | | | |
| Corpus del Español | 100 million | Spanish, 1200s-1900s | 20 million words 1900s, 20m 1800s, 40m 1500s-1700s, 20m 1200s-1400s |
| Corpus do Português | 45 million | Portuguese, 1300s-1900s | 20 million words 1900s, including spoken, fiction, newspaper, and academic. Equally divided Brazil/Portugal. 10m 1800s, 15m 1300s-1700s |
|