corpus.byu.edu

corpora, size, queries = better resources, more insight



QUERIES

The nine main corpora have exactly the same architecture and interface, which allows users to carry out the following types of searches. One of the important advantages of our corpus architecture is that with one simple query and one click, users can analyze variation by comparing different sections of a corpus: for example, genres in COCA or BYU-BNC, dialects in GloWbE (or between corpora from different countries), or time periods (COHA, TIME, recent changes in COCA, and Google Books (Advanced)).

Visualization. You can see (examples with end up V-ing):

1) a chart with the overall frequency of all matching strings
2) the individual strings (overall - all sections)
3) individual strings (in each section of the corpus: genre, dialect, or time period)

Limiting and comparing sections

1. You can also limit the search to just particular sections of the corpus (e.g. hard [nn*] in Fiction)
2. More importantly, you can compare between two sections of the corpus (e.g. hard [nn*] in FIC vs ACAD) -- either by genre, dialect, or time period.
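
To make the idea of comparing two sections concrete, here is a minimal sketch of one common approach: normalize the raw count in each section to a per-million-words rate, then take the ratio. This page does not say how the corpus interface itself computes its comparisons, so the function names, the counts, and the section sizes below are all hypothetical and for illustration only.

```python
def per_million(count: int, section_size: int) -> float:
    """Normalize a raw count to occurrences per million words."""
    if section_size <= 0:
        raise ValueError("section_size must be positive")
    return count * 1_000_000 / section_size


def compare_sections(count_a: int, size_a: int, count_b: int, size_b: int):
    """Return (per-million rate in A, per-million rate in B, ratio A/B).

    The ratio is float('inf') when the string never occurs in section B.
    """
    pm_a = per_million(count_a, size_a)
    pm_b = per_million(count_b, size_b)
    ratio = pm_a / pm_b if pm_b > 0 else float("inf")
    return pm_a, pm_b, ratio


# Hypothetical counts and section sizes (not real corpus data).
pm_fic, pm_acad, ratio = compare_sections(450, 90_000_000, 150, 100_000_000)
print(f"FIC {pm_fic:.2f}/M, ACAD {pm_acad:.2f}/M, ratio {ratio:.2f}")
```

Normalizing first matters because sections differ in size; comparing raw counts would favor whichever section has more text.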

Note: click on any link on this page to see the corpus data, and then click on "RETURN" in the upper right-hand corner of the corpus to come back to this page.

Type of search | COCA-General | COCA-Genres | GloWbE-Dialects | COHA-Historical
Specific word or phrase | I guess | validity | lah! | of no little
Substring | *al.[j*] | *al.[j*] (MAG/ACAD) | *ism (core/SAsia) | *ism (earlier/later)
Lemma (forms of a word) | [c*] [p*] [be] like , ( and she was like , ) | [j*] [chain] (FIC/ACAD) | [be] different to | [have] quite V-ed
Part of speech | [j*] eyes | [j*] body (MAG/ACAD) | went [j*] | a most [j*] [nn*]
Synonyms | [=strong] | [=strong] (FIC/ACAD) | [=beautiful] [woman] | [=beautiful] [=girl]
User-defined lists | [..:colors] [..:clothes] | felt [..:emotions] (FIC/ACAD) | [..:colors] [..:clothes] | felt [..:emotions]
Sortable concordance lines | fathom | argue (ACAD) | diametrically | swell (1930s)
Collocates (nearby words) | [break].[v*] | chair (FIC/ACAD) | scheme (US/GB) | gay (earlier/later)
   -- Use Mutual Information score | [break].[v*]
   -- Compare two words | utter / sheer
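
The example queries above mix several notations: plain words, * wildcards, [word] for a lemma, [j*]-style part-of-speech tags, [=word] for synonyms, and [..:name] for user-defined lists. The sketch below labels each token of a query by the notation it appears to use. It is an illustration inferred only from the examples on this page, not the corpus's own parser; the function names are hypothetical, and the rule that a bracketed token containing * is a part-of-speech tag is an assumption.

```python
def classify_token(token: str) -> str:
    """Label one whitespace-separated query token by the notation it uses."""
    # Forms such as "[break].[v*]" or "*al.[j*]" pair a word part with a tag.
    head, sep, tag = token.partition(".[")
    if sep:
        return f"{classify_token(head)} restricted to part of speech [{tag}"
    if token.startswith("[=") and token.endswith("]"):
        return "synonyms"
    if token.startswith("[..:") and token.endswith("]"):
        return "user-defined list"
    if token.startswith("[") and token.endswith("]"):
        # Assumption: bracketed tokens containing "*" are part-of-speech tags;
        # any other bracketed token is treated as a lemma.
        return "part of speech" if "*" in token else "lemma"
    if "*" in token:
        return "substring"
    return "exact word"


def classify_query(query: str) -> list[tuple[str, str]]:
    """Split a query on whitespace and classify every token."""
    return [(tok, classify_token(tok)) for tok in query.split()]


for q in ["[=beautiful] [woman]", "a most [j*] [nn*]", "[break].[v*]", "*ism"]:
    print(q, "->", classify_query(q))
```

For instance, `[=beautiful] [woman]` comes out as a synonym token followed by a lemma token, which matches how that row of the table is labeled.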