The Wordiest Word

With Markov-chain-based random word generation, I essentially have tables of probabilities for letter sequences. With those tables, I’ve always wanted to know what the most English word is: the word with the highest probability of each letter following its predecessors.
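To make the idea concrete, here is a minimal Python sketch of one way to read "highest probability of each letter following its predecessors": build the table, then greedily follow the most frequent next letter until the end-of-word marker comes up. This is an assumption about the approach, not the actual generator code. The names `build_table`, `wordiest_word`, the `^`/`$` markers, and the `max_len` cutoff are all made up for illustration.

```python
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical start/end-of-word markers


def build_table(words, depth):
    """Count which symbol follows each context of up to `depth` symbols."""
    table = defaultdict(Counter)
    for word in words:
        symbols = [START] + list(word) + [END]
        for i in range(1, len(symbols)):
            context = "".join(symbols[max(0, i - depth):i])
            table[context][symbols[i]] += 1
    return table


def wordiest_word(table, depth, max_len=20):
    """Greedily follow the most frequent next symbol from the start marker.

    Assumption: "wordiest" means picking the single most likely next
    letter at every step. A greedy walk can loop forever, so the result
    is cut off (and marked with an ellipsis) after `max_len` letters.
    """
    symbols = [START]
    while len(symbols) <= max_len:
        context = "".join(symbols[-depth:])
        next_symbol = table[context].most_common(1)[0][0]
        if next_symbol == END:
            return "".join(symbols[1:])
        symbols.append(next_symbol)
    return "".join(symbols[1:]) + "…"


# Tiny made-up corpus, just to show the calls.
corpus = ["the", "then", "there", "tan"]
print(wordiest_word(build_table(corpus, depth=1), depth=1))
```

Here depth is the number of preceding symbols used as context, so a deeper table sees more of the word so far before choosing the next letter. A true "maximum probability over the whole word" search would need something other than a greedy walk, which this sketch does not attempt.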

I finally bit the bullet and produced it. Well, them, because the result varies depending on the corpus and depth used. All in all it’s not that impressive, just kind of cool to know. I don’t know what I was expecting; maybe some amazing word that would rock my socks off.

Without further ado, here they are:

Corpus                          Depth  Wordiest Word
basic_english_words             1      st
basic_english_words             2      st
basic_english_words             3      struction
basic_english_words             4      statement
basic_english_words             5      store
unabridged_english_dictionary   1      prerererererererere…
unabridged_english_dictionary   2      press
unabridged_english_dictionary   3      press
unabridged_english_dictionary   4      preconcer
unabridged_english_dictionary   5      preconcertification