Linguistic convergence?

February 1st, 2003

I’ve noticed an odd phenomenon. When searching Google, I consistently see results from projects I’m involved in, people I know, and, of course, myself.

A certain percentage of this can be written off to specialized interests. The overwhelming number of search results I get that point to IMC archives is understandable, for example.

What I don’t understand is why a relatively generic query like “tar over ssh” would return a message from the LUG at my alma mater (which didn’t exist when I went there), in a thread between 2 people I know well. That’s just odd. And this happens a lot; I’m going to start keeping track. (Note: I’ve probably destroyed Google’s usefulness for searching for “tar over ssh” now by mentioning it.)

I don’t know that many people, a very small number in fact, and even if they all produce content hyperactively, shouldn’t they be drowned out in a sea of content? What’s going on?

My thought is that perhaps we’re seeing the effect of Google having a language-based interface. I search in English, and therefore I’m much more likely to get English results back. Most of the people I know speak English. On the Net, however, this doesn’t narrow the field much. I think it needs to be broken down beyond that: I don’t just speak English, I speak a vernacular informed by age, class, education, social environment, etc. My word choices are a product of culture. For example, Mako and Josiah from the above thread have both had significant impacts on the Linux culture I was raised in. Could even my 3 word query display a language bias? If I were a product of a different linguistic micro-culture, would I have said “pipe” instead of “over”, asked for “remote” instead of “ssh”, or re-ordered the terms?
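To get a rough feel for how much room for variation there is in even a 3 word query, here is a minimal Python sketch. The word lists hold only the alternatives mentioned above, and `query_variants` is a made-up helper for illustration; it says nothing about how Google actually matches or ranks queries.

```python
from itertools import permutations, product

# Hypothetical alternatives for each slot in the query: just the
# word choices named in the post, not a real synonym list.
ALTERNATIVES = [
    ["tar"],
    ["over", "pipe"],
    ["ssh", "remote"],
]


def query_variants(alternatives):
    """Return every combination of word choices in every word order."""
    variants = set()
    for words in product(*alternatives):
        for ordering in permutations(words):
            variants.add(" ".join(ordering))
    return sorted(variants)


variants = query_variants(ALTERNATIVES)
print(len(variants), "possible phrasings, for example:")
for phrasing in variants[:5]:
    print("  " + phrasing)
```

Even with only two alternatives for two of the three words, the one phrasing I happened to type is a small slice of what someone else might have asked.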

And if Google were instead a taxonomy engine, building a search tree of structured data, and my queries were made in a precise, perhaps numerical, language, would this convergence disappear? Would it work nearly as well then? A response from my culture, after all, brings a number of advantages: no one suggested using a tape drive instead, or buying F-Secure.

Some Other Possibilities.

  • Aidan is quick to point out that humans are expert pattern makers, inclined to see patterns where none (of significance) exist. Perhaps I only notice the occurrence when something unusual happens, and this convergence is a false pattern?
  • Perhaps, for all the millions of internet users, content is created by a mind-blowingly small percentage of them, so that a given individual really can know a statistically significant percentage of the population.
