ngrams and the history of islam

18 December 2010

Google’s new Ngram Viewer lets you compare the frequencies of words or phrases in printed books (only in certain languages, for now) from essentially the beginning of the printed-book era to the present.

You can’t yet search in Arabic (do they have an Arabic corpus?). But you can, for example, learn that before about 1840, there were few “Muslims” and many “Mohammedans”. You can then correlate this with the dramatic increase in references to “Allah” starting around 1840 to learn that the Mohammedan religion must have given way to an Allah-centered Islam right about that time.

(For whatever reason, references to “Mohammedans” rise slightly from zero starting in about 2005.)

  1. There is an Arabic corpus, but it is limited by the current state of Arabic OCR. Some of the manuscript collections (like the one at UofM) have been incorporated into it. I hope that eventually we will be able to use this for dots-and-squiggles languages.

