I’ve been putting together a few wordlists and am making them available here for anyone who’s interested.

Each wordlist has been processed to contain only unique values, and each archive contains a few variants of each wordlist (no spaces, no punctuation, etc.) so that you can pick the right one for your requirements. There is also a combined version, which contains the base data and all the variants in one (usually very big) file.
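
As a rough illustration only (this is not the script used to build the archives), the processing described above could be sketched in Python like this. The function names, the variant labels and the use of ASCII-only `string.punctuation` are all assumptions made for the example; the real lists may have been built with different rules.

```python
import string

# Assumption: "punctuation" means ASCII punctuation only.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def unique_words(lines):
    """Strip whitespace, drop blanks, keep the first occurrence of each value."""
    seen = set()
    result = []
    for line in lines:
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            result.append(word)
    return result


def no_spaces(word):
    """Remove all whitespace from a word or phrase."""
    return "".join(word.split())


def no_punctuation(word):
    """Remove ASCII punctuation characters from a word or phrase."""
    return word.translate(_PUNCT_TABLE)


def build_variants(base):
    """Return the base list, two hypothetical variants, and a combined list."""
    variants = {
        "base": base,
        "no_spaces": unique_words(no_spaces(w) for w in base),
        "no_punctuation": unique_words(no_punctuation(w) for w in base),
    }
    # The combined list holds the base data plus every variant, de-duplicated.
    variants["combined"] = unique_words(
        w
        for name in ("base", "no_spaces", "no_punctuation")
        for w in variants[name]
    )
    return variants


if __name__ == "__main__":
    raw = ["New York\n", "New York\n", "St. John's\n", "\n"]
    for name, words in build_variants(unique_words(raw)).items():
        print(name, words)
```

For lists of this size you would want to stream from disk instead of holding everything in memory, but the idea is the same: de-duplicate the base list, derive each variant from it, then de-duplicate the union to get the combined file.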

The word counts listed below are for the base wordlist only and do not include the variants.

| Set | Title | No. of Base Words | Date Created |
| --- | --- | --- | --- |
| GeoNames (727MB) | Countries/Regions | 614,860 | June 2014 |
| | Streams/Lakes | 2,190,169 | |
| | Parks/Areas | 439,047 | |
| | Cities/Villages | 4,903,912 | |
| | Roads/Railroads | 39,952 | |
| | Spots/Buildings/Farms | 2,129,613 | |
| | Mountains/Hills/Rocks | 1,726,181 | |
| | Undersea | 17,879 | |
| | Forests/Heaths | 55,794 | |
| | Combined | 11,411,626 | |
| OpenLibrary (1.6GB) | Authors | 6,403,934 | June 2014 |
| | Books | 13,953,610 | |
| | Subjects | 710,274 | |
| Books (1.2MB) | Authors | 11,988 | June 2014 |
| | Books | 64,119 | |
| IMDb (183.8MB) | Actors Characters | 2,417,740 | June 2014 |
| | Actors | 1,668,430 | |
| | Actress Characters | 1,203,647 | |
| | Actresses | 977,390 | |
| | Movies | 936,217 | |
| MusicBrainz (299.9MB) | Artists | 918,778 | June 2014 |
| | Instruments | 648 | |
| | Music Labels | 88,253 | |
| | Places (Studios) | 5,898 | |
| | Works | 7,555,440 | |
| Proverbs (0.04MB) | Proverbs | 2,048 | June 2014 |
| Wikipedia (744.6MB) | Article Titles | 23,039,493 | June 2014 |
| Wikibooks (1.7MB) | Article Titles | 109,658 | June 2014 |
| Wiktionary (68.2MB) | Article Titles | 3,959,195 | June 2014 |