corpora, size, queries = better resources, more insight


The corpora at this site were created by Mark Davies, Professor of Linguistics at Brigham Young University. They have many different uses, including: finding out how native speakers actually speak and write; looking at language variation and change; finding the frequency of words, phrases, and collocates; and designing authentic language teaching materials and resources.

The corpora are used by more than 170,000 people each month (more than 330,000 visits). COCA alone, for example, is used by 65,000 distinct people each month. This makes them perhaps the most widely used corpora currently available.

In addition to the nine corpora (and the Google Books (Advanced) interface), there are also many new COCA-based resources. One site allows you to enter and analyze entire texts, and to see extremely detailed corpus-based entries from a frequency listing of the top 60,000 words in English. Other related sites allow you to download large amounts of corpus data for offline use. Note especially the new integrated 100,000-word list from COCA, COHA, BNC, and SOAP -- the largest corrected frequency list of English.