corpus.byu.edu

corpora, size, queries = better resources, more insight



Created by Mark Davies, BYU.   Overview, search types, researchers, publications, corpus-based resources.
 

English                                         # words       language/dialect  time period  compare

Global Web-Based English (GloWbE)               1.9 billion   20 countries      2012-13
Corpus of Contemporary American English (COCA)  450 million   American          1990-2012    * * * * *
Corpus of Historical American English (COHA)    400 million   American          1810-2009    * *
TIME Magazine Corpus                            100 million   American          1923-2006
Corpus of American Soap Operas                  100 million   American          2001-2012    *
British National Corpus (BYU-BNC)*              100 million   British           1980s-1993   * *
Strathy Corpus (Canada)                         50 million    Canadian          1970s-2000s

Other languages

Corpus del Español                              100 million   Spanish           1200s-1900s  *
Corpus do Português                             45 million    Portuguese        1300s-1900s

N-grams

Google Books: American English                  155 billion   American          1500s-2000s  *
Google Books: British English                   34 billion    British           1500s-2000s
Google Books: One Million Books                 89 billion    Am/Br             1500s-2000s
Google Books: Spanish                           45 billion    Spanish           1500s-2000s

* BYU-BNC is our architecture and interface to the British National Corpus (BNC), which is distributed by IT Services (formerly OUCS) at Oxford University on behalf of the BNC Consortium.