Doug's musings

Tuesday, 13 May 2003

SpamBayes and resonances ::

SpamBayes: Bayesian anti-spam classifier written in Python. Apple’s Mail is doing a pretty good job of detecting my spam, but a lot of offensive stuff is still getting through. So I’m giving SpamBayes a whirl. It sits as a proxy between your mail client and server, and just adds notes in the headers of incoming mail. Let’s let it train for a while...
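To make the idea concrete, here is a rough sketch of what "classify, then add a note in the headers" could look like. It is plain naive Bayes over word tokens, not SpamBayes's actual algorithm or API; the class name TinyBayes, the header name X-Sketch-Classification, and the 0.9 / 0.2 cutoffs are all made up for illustration, and it assumes a simple single-part text message.

```python
import math
import re
from collections import Counter
from email import message_from_string


class TinyBayes:
    """Toy two-class naive Bayes over word tokens (illustration only)."""

    def __init__(self):
        # Per label: how many training messages contained each token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Each distinct lowercase word counts once per message.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def train(self, text, label):
        self.counts[label].update(self.tokens(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        # Sum log-likelihood ratios (spam vs. ham) with add-one smoothing.
        # Simplification: only tokens present in the message are considered.
        total = sum(self.messages.values())
        toks = self.tokens(text)
        log_odds = 0.0
        for label, sign in (("spam", 1), ("ham", -1)):
            n = self.messages[label]
            log_odds += sign * math.log((n + 1) / (total + 2))
            for tok in toks:
                log_odds += sign * math.log((self.counts[label][tok] + 1) / (n + 2))
        # Clamp so math.exp cannot overflow on long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))


def annotate(raw_message, classifier):
    """Parse a message, score it, and record the verdict in a header."""
    msg = message_from_string(raw_message)
    body = msg.get_payload()  # a str for simple, non-multipart messages
    p = classifier.spam_probability(f"{msg.get('Subject', '')} {body}")
    # Hypothetical cutoffs; anything in between is left as "unsure".
    verdict = "spam" if p > 0.9 else "ham" if p < 0.2 else "unsure"
    msg["X-Sketch-Classification"] = f"{verdict}; {p:.2f}"
    return msg
```

The mail client never has to know about the classifier: it can simply filter on the added header, which is the appeal of doing this in a proxy.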

Jon Udell takes the idea of Bayesian filtering further, wondering:

For example: family/not-family, projectX/not-projectX. I actually go to the trouble of creating filters for some of these kinds of things, but it’s arguably more trouble than it’s worth. A multidimensional classifier that could notice these patterns emerging, offer to set up the foldering and filtering for me, and then reinforce the classification by observing my behavior over time—wow, isn’t that what computers were supposed to be for?

But having just read this [0xDECAFBAD]:

More and more, I’m running into myself on Google. I’ll be looking for expert information on something I’m trying to tinker with, and discover that one or more of the top search results are me writing about looking for expert information on the thing I’m trying to tinker with. Just occasionally do I find myself having actually provided the information that I’m currently seeking.

(I noticed that too: my Sprint/PCS Vision page suggests doing a Google search to help you find the cable for your phone... and now I’ve seen Google directing people right back to my page when they do such a search!)

I wonder, if we have these little automated filters helping us sort through the constant torrent of information, won’t they gradually become more and more resonant, until we discover that they’re just repeating to us what we already know? Just send me back the memes I’ve already absorbed and propagated, even in my own words? Hahaha! Everywhere I go, there I am... (A while after I began reading weblogs, it struck me that sometimes there was more circular cross-referencing than original content.)

Taking conscious steps to get rid of the obvious noise, like the constant barrage of penis-enlargement, Viagra, and mortgage spam, is fine (though I’m seeing signs that the more innovative spammers may be using tools like SpamBayes to help them construct messages that will get through spam filters!).

When it comes to finding what I want to read, I’d like to think there’s a little more serendipity involved. Ed Cone: “Try weird stuff, see what works.” I’ve seen the resonance effect here too; take Root Blog: when I first visited, it seemed like a really nice way to scan through brief excerpts from many weblogs and see what caught my eye. Now it seems to be dominated by aggregators, content selected by other people’s filters... it’s no longer so random, a source of potential serendipity.

It’s time to go to work. I’ll rub elbows with some fellow carbon-based lifeforms, and perhaps it will seem more serendipitous and less absurdly circular than sitting in front of my computer and rambling on about these things :-)

Update (14 May): So of course I spent the entire workday investigating one bug. Very circular, and at the end of the day... well, at least I had evidence to support the hypothesis that I had come up with last week, 5 seconds after being shown the bug.

Tue, 13 May 2003, 08:46 PDT