The following is a history of the different
corpora, as well as changes and improvements to the corpus architecture
and interface.
|
2012. Apr |
Released
www.academicwords.info; free downloadable lists for academic
English: word families, core academic, and genre-specific
technical words |
|
2012. Apr |
Released www.wordandphrase.info/academic:
same interface as the two resources below, but just for the 120
million words of COCA Academic
English |
|
2012. Feb |
Modified
www.wordandphrase.info: ability to enter entire texts and
then see detailed information about words and phrases |
|
2012. Jan |
Released
www.wordandphrase.info: integrated frequency and genre data,
definitions, collocates, concordances, synonyms, and WordNet |
|
2011. Dec |
Released free n-gram lists
for COCA and COHA; millions of rows of data for 2-grams
(two-word sequences), 3-grams, 4-grams, and 5-grams. |
|
2011. May |
Released beta version of the
Google Books
(American English) Corpus (155 billion words,
1810-2009) |
|
2011. Apr |
Added about 15 million words to the
Corpus of Contemporary
American English (COCA), for July 2010 - Mar 2011. |
|
2011. Feb |
Added concordance view |
|
2010. Oct |
Improved functionality for
interaction with other users (see queries, researchers,
publications) and ability to save and manipulate Keyword in
Context entries. |
|
2010. Sep |
Released beta version of the
Corpus of Historical
American English (COHA) |
|
2010. Aug |
Added about 20 million words to the
Corpus of Contemporary
American English (COCA), for July 2009 - June 2010. |
|
2010. Feb |
Released the
frequency
lists and dictionary that are based on the Corpus of
Contemporary American English. |
|
2009. Aug |
Added about 15 million words to the
Corpus of Contemporary
American English (COCA), for October 2008 - June 2009. |
|
2009. May |
Added new tools for collaboration:
links to previous queries (including annotations/notes) and
ability to share them
with others |
|
2008. Oct |
Added about 15 million words to the
Corpus of Contemporary
American English (COCA), for Jan-Sep 2008. |
|
2008. Jun |
Applied the new architecture to the
Corpus do Português |
|
2008. Apr |
Applied the new architecture to the
British National Corpus
and the TIME Corpus |
|
2008. Mar |
Released the
Corpus of Contemporary
American English |
|
2007. Oct |
Finished new (current) corpus architecture;
applied it to the
Corpus del Español. Major updates in this corpus as well,
including much-improved tagging and lemmatization for Modern
Spanish. |
|
2007. May |
Released the
TIME Corpus of American
English |
|
2006. Aug |
Released the
Corpus do Português |
|
2005. Apr |
Released interface for
Register
Variation in Spanish |
|
2004. Apr |
Released VIEW, our first version of the
British National Corpus |
|
2002. Sep |
Released the first version of the
Corpus del Español |
| Misc |
There are several other corpora with older,
non-standard architecture and interface:
Polyglot
Bible,
Polyglot Book of Mormon,
Medieval Spanish bibles, and
Latin/OSp/ModSp bibles |