|
The following links provide a good overview of the features of the corpus. Each link enters values into the search interface and runs the query against the 100-million-word corpus (i.e. these are not "canned" results). You might want to note which options have been selected in the form, and then modify the values to create your own queries. As with other interfaces to the BNC, you can search by words (mysterious), phrases (fairly certain or white + noun), lemmas (all forms of a word, like sing or tall), wildcards (un*ly or r?n*), and more complex searches such as un-X-ed adjectives or verb + any word + a form of ground. Notice that from the "frequency results" window you can click on the word or phrase to see it in context in this lower window. As the preceding searches indicate, the first option in the search form allows you to see either a list of all matching strings or a chart display that shows the frequency in the six "macro" registers (including spoken, fiction, academic, and miscellaneous). Look for the frequency of whom, cloudy + noun, or forms of need + to + VERB. Via the chart display, you can also see the frequency of the word or phrase in all 70 registers, and this "register" view can be re-sorted by different criteria. You can also search for collocates (words near a given word), which often provide insight into the meaning of that word. For example, you can search for the most common nouns near thick, adjectives near smile (sorted by relevance), nouns after look into, or words starting with clos* near eyes. You can also include register information directly as part of the query. This allows you to see how words and phrases vary across speech and many different types of written texts. You can easily find which words and phrases occur much more frequently in one register than another, such as -ness words in poetry, adjectives in tabloid newspapers, nouns in advertisements, good + [noun] in fiction, or verbs in the slot [we * that] in academic writing.
You can also apply this to collocates, such as nouns with the verb break in NEWS, adjectives with woman in FICTION, nouns with passionate (FICTION vs NEWSPAPER, or ACADEMIC vs FICTION), or nouns near chair in ACADEMIC vs FICTION (or FICTION vs ACADEMIC). Beyond this, you can easily carry out semantically-oriented searches. For example, you can compare the most frequent nouns that appear with small and little, the most frequent adjectives with men and women, or the most frequent nouns with utter and sheer. You can also find the frequency and distribution of synonyms of a given word, such as beautiful or the verb clean, see which synonyms are more frequent in competing registers (such as synonyms of strong in ACADEMIC and NEWS), and use synonyms as part of a more complex query (such as synonyms of clean with nouns). Finally, you can create "customized lists" for any category that interests you, and then re-use these in subsequent queries (such as colors + clothes, or words related to beautiful + forms of woman). Hopefully this short five-minute overview of the corpus has been helpful. Now feel free to look at more examples of the types of possible searches, including word/phrase, collocates (surrounding words), synonyms, word contrasts, and customized/user-defined lists. You can also find more information on how to include detailed information on historical periods or modern registers in the queries, and how to refine your searches with certain search options. |
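The wildcard searches mentioned above (un*ly, r?n*) can be illustrated with a small sketch. This is not the corpus interface's own code; it assumes that * stands for any number of characters (including none) and ? for exactly one character, and the helper names wildcard_to_regex and match_words are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression.

    Assumption (not confirmed by the corpus documentation quoted here):
    '*' matches zero or more word characters, '?' matches exactly one.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            # Escape everything else so it is matched literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def match_words(pattern, words):
    """Return the words that match the whole wildcard pattern."""
    rx = wildcard_to_regex(pattern)
    return [w for w in words if rx.fullmatch(w)]


# A tiny made-up word list, standing in for the corpus vocabulary.
words = ["unlikely", "unhappily", "ugly", "run", "ran", "running", "rain", "ring"]

print(match_words("un*ly", words))  # ['unlikely', 'unhappily']
print(match_words("r?n*", words))   # ['run', 'ran', 'running', 'ring']
```

Note that "rain" does not match r?n* under this reading, because the single-character ? leaves "i" where the pattern requires "n".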
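The idea behind a collocate search (words near a given word) can be sketched as a simple window count. This is only an illustration on a made-up sentence: the function name collocates and the window size are assumptions, and the sketch uses raw counts rather than the "relevance" sorting that the interface offers.

```python
from collections import Counter


def collocates(tokens, node, window=4):
    """Count the words found within `window` tokens of each occurrence of `node`.

    Hypothetical helper: the window size and the use of raw counts are
    assumptions made for illustration.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts


# A made-up sentence standing in for corpus text.
tokens = "she closed her eyes and then opened her eyes again".split()
near_eyes = collocates(tokens, "eyes", window=2)

# Words seen most often within two tokens of "eyes".
print(near_eyes.most_common(3))
```

In a real corpus the same counting would typically be restricted by part of speech (for example, only nouns near thick) and compared against each word's overall frequency, which is what makes a relevance-based ranking possible.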