Daily Archives: December 22, 2010

Google Books Ngram Viewer Fun

Google has a new… toy? erm… research tool up. I saw it mentioned on I, Cringely yesterday and played with it a bit last night.

It is the Google Labs – Books Ngram Viewer.

I’ll give you a moment to look up ngram.

Or I can try to explain it in a half-assed fashion.

Essentially, Google has scanned in a large collection of books (something that has earned Google Books a good deal of grief) and this tool allows you to enter a word or phrase and see how often it comes up in the corpus they have scanned.
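For the nerds who want to see the idea in code, here is a minimal sketch of what "how often does this phrase come up" might look like. To be clear, this is my own toy illustration and not how Google actually does it: the function names, the made-up two-year corpus, and the choice to lowercase everything are all my assumptions.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, as strings."""
    words = text.lower().split()  # lowercasing is a simplification of mine
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to the share of that year's n-grams that match phrase."""
    n = len(phrase.split())
    target = phrase.lower()
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        # Relative frequency, so years with more text do not win by default.
        result[year] = counts[target] / total if total else 0.0
    return result


# Hypothetical corpus: year -> list of "book" texts, invented for this example.
toy_corpus = {
    1977: ["star wars opened and star trek fans shrugged"],
    1980: ["star wars again and star wars forever"],
}

freq = yearly_frequency(toy_corpus, "star wars")
for year, share in freq.items():
    print(year, round(share, 3))
```

The important bit is dividing by the total number of n-grams for the year. Chart those shares year by year and you have, roughly, the kind of graph the tool draws.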

So you can chart the frequency of mention of, say, Britney Spears and Christina Aguilera.


Britney vs. Christina

Fortunately, Britney is trending down of late. Not that I care. Neither gets mentioned much in the books I read.

But you can compare things and graph them, and that has super nerd appeal.

You do have to take care with the words, names, and phrases you use. Madonna, for example, dwarfs both Britney and Christina, but the word “madonna” isn’t exclusive to the artist, so it is hard to tell how much of that is pop culture and how much is religion.

And I had to come up with some topics closer to my own interests to graph.


Star Trek vs. Star Wars

Well, I guess that finishes off the Star Trek vs. Star Wars fight, though Star Wars seems to have peaked a while back. (About the time of Empire, if you want my opinion, but that is another topic.)

And then, a little closer to home.


Media for the New Century

Podcast, blog, website, and newsletter. Blog has definitely been taking off. Podcast gets less mention than I thought, but it does tend to be a transitory medium. Who goes back and listens to old podcasts? Besides me? Do you hear that, Van Hemlock? Write more.

Of course, we can look at things that really matter.


Critical Technology Comparisons

Yes, online games, virtual worlds, Netscape, Compuserve, carbon paper, and mad cow disease.

You can see how those all fit together, right?

Virtual worlds is the clear winner, though it has dropped a bit of late.

Online games are on the rise.

Netscape and Compuserve are in decline for mentions in literature, as well they might be.

Carbon paper remains quite stable despite not being that widely used since the mass availability of the copy machine some 30 years back.

And mad cow disease is still more likely to get mentioned than any of these other than virtual worlds, though I am going to guess that there is some cross-over there between that and Second Life.

Then, of course, we can go to items of critical national importance.


Critical Societal Measures

Yes, the age old conflict between vampires, unicorns, werewolves and zombies.

It was a horn-to-fang race through most of the 20th century, with neither gaining dominance. Then around 1980, vampires take off and never look back. I’m going to credit/blame Anne Rice here. Peter S. Beagle never had a chance. Stephenie Meyer should send Anne Rice a Christmas card (or a Winter Solstice card maybe) every year thanking her for laying the groundwork for her success.

Meanwhile, zombies, which had no standing for most of the last century, have really come into their own since 2000, passing unicorns, who have remained flat. And even werewolves, sort of the odd man out of the monster classes (look, are you human or wolf? Pick one already), threaten to surpass unicorns.

Of course, I am just searching through the full English corpus. Switching to just American English looks about the same, with vampires spiking even more drastically in the last 20 years. But look at the British English corpus and the results just get odd.


British Vampire Invasion

Vampires still rule, but they had a good run there in the 1970s as well. What was going on in the UK then? It can’t just be Margaret Thatcher. And what was going on around 1930 with unicorns?

I think there might be a sample size issue.

If I switch to the Spanish corpus, well, zombies rule.


What is Spanish for "werewolf?"

But I didn’t bother to translate my search terms into Spanish, so who knows what that really means.

Anyway, that is Google’s new toy… erm… tool.

What other vital comparisons should we be doing?