Ngrams and interpreting data

John Hunt, Edmund Hillary, and Tenzing Norgay. Members of a British expeditionary party who were the first to summit Mt. Everest in 1953

Google’s Ngram Viewer is a powerful tool for historical analysis. Because it searches books published from 1800 to roughly 2000, we can get a good idea of how prominent a person, place, thing, or idea was in print over time. This is especially useful for periods before modern technology like radio and television, when books were the main way information was shared.

The graph above shows three members of the 1953 British Mount Everest expedition: John Hunt, Edmund Hillary, and Tenzing Norgay. It charts how often each name appears in print from 1800 to 2000, and it is informative in a number of ways. First, we can see that the western explorers, Hunt and Hillary, are much more prevalent than Norgay, a Nepalese Sherpa. From that data we might infer that race was still a factor in the popularity of these men, or that the expedition declined to credit Norgay and tried to reduce his fame to boost their own. We can also see that while Hillary and Norgay became popular around the time of their ascent, Hunt seems to have been famous all along. This is likely because “John Hunt” is a common name in western culture, and Ngrams can’t account for the context in which a name is mentioned. Or perhaps he’s an immortal adventurer and he is doing a very bad job of covering it up. It’s probably the former though. It is important to remember that tools like these can present false or misattributed data, because their search method is complex and their results come without context.
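To make the “John Hunt” problem concrete, here is a minimal Python sketch of the kind of two-word counting an n-gram search relies on. This is only an illustration, not how Google’s tool is actually built: the two sentences are made up, and the function name bigram_counts is a hypothetical helper.

```python
from collections import Counter

def bigram_counts(text):
    """Count every pair of adjacent words, ignoring case and basic punctuation."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    return Counter(zip(words, words[1:]))

# Two invented sentences about two different people who share a name.
corpus = [
    "John Hunt led the 1953 Everest expedition.",
    "The farmer John Hunt sold his land in 1820.",
]

total = Counter()
for sentence in corpus:
    total += bigram_counts(sentence)

# Both mentions land in the same bucket, because the count only sees the words.
john_hunt_total = total[("john", "hunt")]
print(john_hunt_total)
```

The counter has no way of knowing that one sentence is about the mountaineer and the other is about someone else entirely, which is the same reason a common name can look famous long before 1953.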

These problems are not limited to Ngrams, though. Incomplete data sets plague almost any single research method, which is why it’s important to confirm your findings with multiple sources.
