Language Resources

Corpus of Contemporary American English (COCA).
440 million words of downloadable text (190,000 separate texts). Balanced for genre — about 88 million words each of spoken, fiction, magazine, newspaper, and academic. With the included [sources] table, you can also search by sub-genre, e.g. News-Financial or Academic-Medicine.
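As a rough illustration of the sub-genre search mentioned above, here is a minimal Python sketch. The column layout (text ID, genre, sub-genre, word count) and the sample rows are assumptions made up for this example, not the actual schema or contents of the [sources] table.

```python
from collections import defaultdict

# Hypothetical rows standing in for the [sources] table.
# Assumed layout: (text_id, genre, sub_genre, word_count)
SOURCES = [
    (1001, "NEWS", "News-Financial", 1200),
    (1002, "NEWS", "News-Sports", 800),
    (2001, "ACAD", "Academic-Medicine", 3500),
    (2002, "ACAD", "Academic-Medicine", 2900),
]


def text_ids_for_subgenre(sources, sub_genre):
    """Return the IDs of all texts whose sub-genre matches exactly."""
    return [text_id for text_id, _genre, sub, _words in sources if sub == sub_genre]


def words_by_subgenre(sources):
    """Sum the word counts of the texts in each sub-genre."""
    totals = defaultdict(int)
    for _text_id, _genre, sub, words in sources:
        totals[sub] += words
    return dict(totals)


if __name__ == "__main__":
    print(text_ids_for_subgenre(SOURCES, "Academic-Medicine"))
    print(words_by_subgenre(SOURCES))
```

The list of text IDs returned for a sub-genre could then be used to pick out the matching texts from the downloaded corpus files.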

Corpus of Global Web-Based English (GloWbE).
1.8 billion words of downloadable text (1,800,000 separate texts). Divided into groups from twenty different English-speaking countries (US, UK, Canada, Australia, India, etc.). About 60% of the text is from blogs, which gives a large sample of very informal language.