Etc.


— 10:59 PM on April 3, 2003

I've been using POPFile for a few weeks now to sort my mail, and for the most part it's been very effective at catching spam: about 94% in my use, which is good enough to make me grateful. The theory behind POPFile seems sound, and the installer for the program is slick as can be. However, I am a little peeved at how much work it has turned out to be to keep "training" the program in ongoing use.

I'm well past the "training period" where one teaches the program how to identify spam and the like. I get thousands of e-mail messages per week, including lots and lots of processed pink meat.

But as I understand it, if I don't constantly reclassify messages the program has classified incorrectly, the quality of my word databases degrades over time as mis-classified messages pollute the lists. The only way to prevent that is to look through all my messages, pick out the mis-classified ones, and reclassify them. That's exactly the kind of tedium I'm trying to avoid by running anti-spam programs, and at the end of the day, it seems like the Bayesian filtering approach (at least in this incarnation) may be no more effective than DNS blacklist tools like MailWasher.

Perhaps POPFile could still be saved if the program avoided collecting words from messages that haven't been manually reclassified? Initial training might take a little longer, but the thing wouldn't require ongoing training just to keep mis-classified messages from polluting the word lists.
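
To make the suggestion concrete, here is a minimal Python sketch of what "only learn from messages a person has classified" could look like. This is not POPFile's code, and I'm not claiming it matches how POPFile works internally: the class `WordBuckets`, its methods, and the `confirmed_bucket` parameter are all hypothetical names made up for illustration, and the scoring is a plain naive Bayes word count with add-one smoothing.

```python
import math
from collections import Counter


class WordBuckets:
    """Toy naive Bayes word-count classifier (hypothetical, not POPFile's code)."""

    def __init__(self, buckets):
        self.words = {b: Counter() for b in buckets}  # per-bucket word counts
        self.messages = {b: 0 for b in buckets}       # per-bucket message counts

    def train(self, bucket, text):
        """Add a message's words to a bucket's word list."""
        self.words[bucket].update(text.lower().split())
        self.messages[bucket] += 1

    def classify(self, text):
        """Return the bucket with the highest smoothed log-probability."""
        vocab = set().union(*self.words.values())
        total_msgs = sum(self.messages.values())
        best, best_score = None, -math.inf
        for bucket, counts in self.words.items():
            # Smoothed prior, so a bucket with no training is still possible.
            score = math.log((self.messages[bucket] + 1) / (total_msgs + len(self.words)))
            denom = sum(counts.values()) + len(vocab) + 1
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best, best_score = bucket, score
        return best

    def handle(self, text, confirmed_bucket=None):
        """Classify a message; learn from it only if a person supplied the bucket."""
        guess = self.classify(text)
        if confirmed_bucket is not None:
            # Assumption: this is the only path that changes the word lists.
            self.train(confirmed_bucket, text)
        return guess
```

With this arrangement, a message that arrives and is never looked at gets sorted but leaves the word counts untouched, so a wrong guess can't feed back into the lists. The trade-off is the one mentioned above: the filter learns more slowly, because it only improves when someone bothers to confirm or correct a message.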

