Why the name iWeb? Web of course refers to the fact that the corpus is
based on about 14 billion words in 22 million web pages from about
95,000 websites.
The i stands for several things:
- Immense: 14 billion words is huge. iWeb is one of only three
  structured corpora of English that are larger than about 10 billion
  words.
- Nearly instantaneous: iWeb is much faster than these other large
  corpora. Even searches like BUY * ADJ NOUN, "gorgeous" NOUN,
  VERB + reflexive, or NOUN NOUN take just 2-3 seconds, and searches
  for topics, collocates, clusters, websites, and concordance lines
  (all of these for the sample word bread) take one second or less.
- Insightful: other large corpora from the web are just a huge "blob
  of data" -- whatever has been scooped up from blind web scraping.
  iWeb was designed from the ground up to allow you to target specific
  topics and websites, such as websites dealing with Buddhism,
  chocolate, basketball, or nuclear energy.
- Informative: iWeb allows you to browse through the 60,000 words
  (lemmas) in the corpus, and to see a wealth of information for each
  word.
- Integrated: with just one click you can move from word to word, and
  between collocates, related topics, clusters, websites, and
  concordance lines. In addition, the "word" pages are integrated with
  other resources for images, videos, pronunciations, and translations.
No other corpus offers the size, speed, and range of queries that
you will find in iWeb.