Classifying Spam as a Biological Entity
While wandering the web last week, I happened upon an intriguing article entitled Application of Biological Metaphors for Identifying and Killing Spam. This proof of concept (code available) posits that if spam were treated as a sort of viral, mutating biological entity, spam filters could be improved immensely. The analogy seems apt, as spammers’ ability to adapt and find new ways of defeating mail filters never ceases to amaze.
Currently, the favorite methods of spam detection are combinations of blacklists and Bayesian filtering. These methods, however, are not foolproof and can be improved by adding other filtering technologies. The article proposes treating various characteristics of an email as “genetic markers,” such as whether the email is plain text, RTF or HTML, and passing the whole set of markers through an artificial neural network so that spam can be more easily separated from non-spam. The Bayesian probability that a message is unsolicited is just one of several genetic markers passed to the neural network for analysis. The final analysis of a message’s genetic makeup results in its classification as either spam or non-spam.
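To make the idea concrete, here is a minimal Python sketch of the pipeline as I understand it from the description above. It is not the article's code (that is in C#), and everything in it is an assumption of mine: the marker names, the `extract_markers` and `classify` helpers, the hand-picked weights, and the use of a single logistic neuron as a stand-in for a trained neural network.

```python
import math
from dataclasses import dataclass


@dataclass
class Markers:
    """Hypothetical "genetic markers" for one message."""
    is_html: float
    is_rtf: float
    is_plain: float
    bayes_spam_prob: float  # output of a separate Bayesian filter, 0.0 to 1.0


def extract_markers(content_type: str, bayes_spam_prob: float) -> Markers:
    """Turn a message's content type and Bayesian score into numeric markers."""
    ct = content_type.lower()
    return Markers(
        is_html=1.0 if ct == "text/html" else 0.0,
        is_rtf=1.0 if ct in ("text/rtf", "application/rtf") else 0.0,
        is_plain=1.0 if ct == "text/plain" else 0.0,
        bayes_spam_prob=bayes_spam_prob,
    )


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Made-up weights for illustration only; a real network would learn
# these from a corpus of labeled spam and non-spam messages.
WEIGHTS = [1.0, 0.5, -0.5, 6.0]
BIAS = -3.5


def spam_score(m: Markers) -> float:
    """Single logistic neuron: weighted sum of the markers, squashed to 0..1."""
    inputs = [m.is_html, m.is_rtf, m.is_plain, m.bayes_spam_prob]
    z = sum(w * x for w, x in zip(WEIGHTS, inputs)) + BIAS
    return sigmoid(z)


def classify(m: Markers, threshold: float = 0.5) -> str:
    """Final classification as either spam or non-spam."""
    return "spam" if spam_score(m) >= threshold else "non-spam"


if __name__ == "__main__":
    print(classify(extract_markers("text/html", 0.95)))
    print(classify(extract_markers("text/plain", 0.10)))
```

The point of the sketch is only that the Bayesian probability is one input among several. Identifying a new marker means adding one more input and retraining, which is roughly the avenue the author suggests for improving accuracy.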
The effectiveness currently reported for this new method of dealing with spam is 1 false negative in 1,000, but the author goes on to say that further training and the identification of more genetic markers could theoretically lower that rate to 1 in 5,000. There is some C# code available for download for savvy users. I would be curious to know how easy it is to implement. Hopefully, Apple will consider this approach to defeating spam, as I do not get anywhere near these levels of accuracy at present.
In conclusion, this biological metaphor appears to work well and has enormous potential for accuracy, since artificial neural networks can track the mutating characteristics of spam. Although spam is here to stay, it may not have to be dealt with by real people for very much longer.
This blurb about spam has been brought to you by the Kyle Rove Spam Stats in the right-hand column. Soon available for download.
About this entry
You’re currently reading “Classifying Spam as a Biological Entity,” an entry on sensory output
- Published:
- 4 years, 1 month ago
- Category:
- Sensory Output
