History

The following is a history of the different corpora, as well as a list of some upcoming changes.

2009. Fall ?

Improvements to the corpus architecture and interface, including:
1. Improved search syntax, with optional and variable-length elements in the search string
2. Chart displays with relative frequencies of competing forms (who / whom, help V / help to V, etc.)
3. Improved KWIC (context) displays, including full sources for all entries
4. "Random word" function, with useful features for language learners (done)

2009. Aug

Added about 15 million words to the Corpus of Contemporary American English (COCA), for Oct 2008 - Jun 2009.

2009. May

1. Redesigned the main http://corpus.byu.edu/ portal, and added new tools for collaboration.
2. Added new features to the corpus interface: a history of searches, and the ability to annotate queries (notes) and share them with others.

2009. Mar

Awarded a grant from the National Endowment for the Humanities to create the Corpus of Historical American English (COHA), a balanced, 300-million-word corpus of American English from the early 1800s to the present. A beta version of this corpus will be available in August 2010.

2008. Oct

Added about 15 million words to the Corpus of Contemporary American English (COCA), for Jan-Sep 2008.

2008. Jun

Applied the new architecture to the Corpus do Português

2008. Apr

Applied the new architecture to the British National Corpus and the TIME Corpus

2008. Mar

Released the Corpus of Contemporary American English

2007. Oct

Finished new (current) corpus architecture; applied it to the Corpus del Español. Major updates in this corpus as well, including much-improved tagging and lemmatization for Modern Spanish.

2007. May

Released the TIME Corpus of American English

2006. Aug

Released the Corpus do Português

2005. Apr

Interface for Register Variation in Spanish

2004. Apr

Released VIEW, our first version of the British National Corpus

2002. Sep

Released the first version of the Corpus del Español

Misc

There are several other corpora with an older, non-standard architecture and interface: Polyglot Bible, Polyglot Book of Mormon, LDS General Conferences, Medieval Spanish bibles, and Latin/OSp/ModSp bibles.