|
Name |
Availability |
Number of words |
Dialect / time period |
Content |
Searches / architecture /
interface |
|
English |
|
BYU Corpus of American English |
Public, as of
Feb 2008 |
360 million |
American
1990-present |
20 million words each
year, 1990-present. Equally divided into spoken,
fiction, popular magazine, newspaper, and academic. Will
be updated at least two times a year. |
Search by
word, phrase, substring, part of speech, collocates, etc. Limit and
compare by frequency in different genres and years
(1990-present) |
|
British National
Corpus (BNC)
|
Public |
100 million |
British
Mainly
1980s-1993 |
90 million words written
(fiction, newspaper, academic, etc.); 10 million spoken.
[Website for the
original BNC] |
Re-engineered, relational
database version of the original. Allows many types of
searches not found in any other interface of the BNC.
Note: formerly hosted at
view.byu.edu |
|
TIME Magazine
|
Public |
100 million |
American
1923-present |
More than 275,000 articles
from TIME
Magazine. Wide range of topics: news, sports,
business, culture, health, entertainment, etc. |
Search by
word, phrase, substring, part of speech, collocates, etc. Limit and
compare by frequency in different years and decades. |
|
Other languages |
|
Corpus del
Español |
Public |
100 million |
Spanish
1200s-1900s |
20 million words 1900s,
20m 1800s, 40m 1500s-1700s, 20m 1200s-1400s |
Search for words, phrases,
substrings, part of speech, lemma. Limit and sort by
frequency in different centuries and registers. |
|
Corpus del Español: Registers |
Public |
20 million |
Spanish
1900s |
Enhanced version of the
1900s component of the Corpus del Español. Equally
divided among spoken, fiction, and non-fiction |
Compare frequency of 110+
grammatical constructions in twenty different registers.
Re-tagged version of the texts from the Corpus del
Español. |
|
Corpus do
Português |
Public |
45 million |
Portuguese
1300s-1900s |
20 million words 1900s,
including spoken, fiction, newspaper, and academic.
Equally divided Brazil/Portugal. 10m 1800s, 15m
1300s-1700s |
Compare words, phrases, collocates, etc. in
different historical periods, across genres/registers,
and in different dialects. |
|
BYU-only (limited to on-campus
use by BYU students and faculty) |
|
Oxford English Dictionary (OED)
|
BYU
only |
37
million |
Old
English - 1900s |
2.2
million quotations in the Oxford English Dictionary. |
Find the frequency of
words, phrases, substrings, and constructions in each
century since Old English. Hits can be limited by
frequency in any century. |
|
EEBO / LION |
BYU
only |
700
million |
1500s-1900s |
Early English Books Online (1500s-1600s; 350m words) and
Literature Online (mainly 1700s-1800s; 350m words) |
Basic interface to these corpora. Find the frequency by
decade and century for words, phrases, and substrings. |
|
LDS General Conferences |
|
23
million |
1851-present |
Every General Conference talk from 1851 to the
present |
Basic interface to this corpus. Find the frequency by
decade for words, phrases, and substrings. |