Sunday, December 10, 2006

I'm getting too much spam!


I'm starting to get way, way too much spam, and probably like the rest of you (at least the ones reading this) I've spent some time thinking about the problem.
Brian's first solution was to install Spam Assassin, or something similar, on our mail server, which I did, and for a few days I was actually a tad impressed.
However, I got more spam in the days that followed, and I realized that we can't stop the spam.

The reason is that spammers have increasingly started to attach images to their messages. This used to be a very expensive operation when sending to millions in one go, mainly because of bandwidth costs, I suspect. Apparently that is no longer an issue, because I'm getting a ton of these.

Why do these spam mails get through the spam filters? Well, because they are clever. The body of the mail contains a short random message, typically in the form of a poem. And then there is the attachment, which is randomly named.

Some filters have begun examining the images, but that won't work in the long run either, because spammers will distort their images the same way the rest of the world distorts text in word verification when submitting posts. The images are already beginning to be distorted, as illustrated. There's no way a computer will be able to identify this as spam.
The spam filters can't flag this type of email as spam because the message is random gibberish with one image attached, which the spam filter also can't recognize. In other words, we're doomed to get more spam.

Microsoft has proposed two interesting ways to prevent spam.
  1. Sender Id. The first idea was Sender Id, which in fact was not Microsoft's in the first place. The concept involves knowing who the sender is, with certain headers providing some sort of legitimacy. As a BBC reporter analyzes, I suspect this could be spoofed.
  2. Email taxation. The second solution is much more interesting, because it involves an entirely different mechanism: taxing the sender of every email. Read more here.
    The idea is that the sending computer must spend a little time computing some sort of sequence of digits, which normal users would never notice when sending mail. Spammers, however, would need a substantial number of computers to send as many emails as they do today.
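To make the "compute a sequence of digits" idea a bit more concrete, here is a minimal sketch in Python. It is not Microsoft's actual scheme (I don't know its details); it is a hashcash-style stamp swapped in as an illustration. The names `mint_stamp` and `check_stamp`, the stamp layout and the 12-bit difficulty are all my own assumptions.

```python
import hashlib
from itertools import count


def check_stamp(stamp: str, recipient: str, bits: int = 12) -> bool:
    """Cheap check for the receiver: one hash, no searching."""
    # The stamp must be bound to this recipient so it cannot be reused elsewhere.
    if not stamp.startswith(recipient + ":"):
        return False
    digest = hashlib.sha256(stamp.encode("utf-8")).digest()
    # Valid only if the first `bits` bits of the 256-bit digest are all zero.
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def mint_stamp(recipient: str, bits: int = 12) -> str:
    """Costly search for the sender: try counters until the hash qualifies.

    Hypothetical stamp layout: "<recipient>:<counter>".
    On average this needs about 2**bits hash computations.
    """
    for counter in count():
        stamp = f"{recipient}:{counter}"
        if check_stamp(stamp, recipient, bits):
            return stamp


stamp = mint_stamp("alice@example.org")
print(stamp, check_stamp(stamp, "alice@example.org"))
```

The point of the design is the asymmetry: the sender pays thousands of hashes per recipient, the receiver pays one. A single mail is unnoticeable, but millions of them add up to a lot of hardware.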
My opinion about the solutions:
  1. Sender Id: The algorithm that produces the sender id could be hacked, making it possible to impersonate other domains. Besides, if ALL sending computers were required to have a secure id of some sort (similar to an SSL certificate), the cost of setting up an SMTP server would be too great; only the rich would be able to send emails.
  2. Email taxation: This is not a bad solution, really. But it would still prevent legitimate companies from sending newsletters to all their subscribers without it either taking too long or costing too much.
How about open source organizations or NGOs? They don't have money, and to some extent not even their own hardware. How could they possibly send emails to all of their potential users?

I've come up with another solution. It actually originated from something my former partner and colleague Jørgen Juul mentioned when he ranted about the amount of spam he was getting. He said that an email from me had been flagged as spam, which I found very strange, but the reason was that his home-made spam filter (they also use Spam Assassin) would reject ALL emails unless certain rules passed. How about that? Normal spam filters assume that everything is accepted unless certain rules apply. His solution was definitely a (sad) twist. He was simply getting so much spam that he was forced to reverse the criteria.

White list SMTP servers
Instead of having black lists, let's have ONLY white lists. This way, your mail server would only accept emails from servers that are registered and on the list. In other words, it would assume that all email is spam, except mail that comes from registered servers.

If your server were not registered, the email would bounce with a reply saying that the sending SMTP server is unregistered and that the sender needs to either register the server OR submit a request for direct send to a user without being registered. I’ll get back to the latter part later.

So, which body should handle the registration?
  • The system would never work unless it was totally open and the people authorizing servers were volunteers.
  • The authorizing body would consist of the sys-admins of large corporate firms and open source organizations. The same people who, today, spend way too much of their time trying to prevent spam.
  • The same people would most likely also be able to provide authorizing servers free of charge (another reason for it to be open and free, because otherwise setting up authorizing servers would be too costly).
What is a registration of a “server”?
  • A registration of a server consists of an IP-address and a domain name.
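The accept-or-bounce decision on the receiving mail server can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Registration` type, the `check_sender` function, the example address and the wording of the bounce are invented here. Only the 250 and 550 reply codes are standard SMTP.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Registration:
    """A registered server: an IP address plus a domain name."""
    ip_address: str
    domain: str


# Hypothetical white list; in the real system this would come from the
# volunteer-run authorizing servers.
WHITELIST: Set[Registration] = {
    Registration("192.0.2.10", "example.org"),
}

# Invented bounce wording; 550 is the standard SMTP "rejected" reply code.
BOUNCE_TEXT = (
    "550 Sending SMTP server is not registered. "
    "Register the server or submit a direct-send request to the user."
)


def check_sender(ip_address: str, domain: str,
                 whitelist: Set[Registration] = WHITELIST) -> Tuple[bool, str]:
    """Default-deny: accept only if the exact (IP, domain) pair is registered."""
    if Registration(ip_address, domain.lower()) in whitelist:
        return True, "250 OK"
    return False, BOUNCE_TEXT


print(check_sender("192.0.2.10", "example.org"))
print(check_sender("198.51.100.7", "example.org"))
```

Note that both the IP address and the domain have to match, so registering a domain does not automatically approve every machine that claims to send for it.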

So, how does an SMTP server get approved?
  1. Pay one dollar with a credit card.
  2. If you know some of the people who are either already approved or on the approving committee, the registration would go through more quickly (the more you know, the quicker, and if you know more than, say, 5, you get auto-approved).
  3. If you don’t know anyone on the committee OR anyone who can vouch for you (very unlikely after a while), then you have to wait for someone to call you by phone to verify your identity.
The plan here, with the whole know-somebody approach, is to establish a network of trustworthy people, much like LinkedIn.

Anyway, the reason it would cost one dollar is simply to have relatively sound proof of identity.
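The approval steps could be expressed roughly like this. It is a sketch, not a finished policy: the `Decision` names and the `review` function are made up for illustration, the threshold of 5 is the "say 5" from the list above, and rejecting outright when the dollar has not been paid is my assumption.

```python
from enum import Enum
from typing import Set


class Decision(Enum):
    REJECTED = "rejected: fee not paid"          # assumption, not stated above
    AUTO_APPROVED = "auto-approved"
    FAST_TRACK = "quicker manual approval"
    PHONE_CHECK = "wait for a verification phone call"


AUTO_APPROVE_ABOVE = 5  # "more than, say, 5" known people


def review(fee_paid: bool, vouchers: Set[str], trusted: Set[str]) -> Decision:
    """Decide how a registration request is handled.

    `vouchers` are the people the applicant claims to know; only those who
    are already approved or on the committee (`trusted`) count.
    """
    if not fee_paid:
        return Decision.REJECTED
    known = vouchers & trusted
    if len(known) > AUTO_APPROVE_ABOVE:
        return Decision.AUTO_APPROVED
    if known:
        return Decision.FAST_TRACK
    return Decision.PHONE_CHECK


trusted = {"ann", "bob", "cat", "dan", "eve", "fay", "gus"}
print(review(True, {"ann", "bob"}, trusted).value)
print(review(True, set(), trusted).value)
```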

Obviously, the admins approving SMTP servers would not have time to approve at a fast enough pace (at least in the beginning), so there would be an obvious little money machine for a company with staff that approves servers more rapidly.

The results of this solution would hopefully be:
  • You would never receive unsolicited email anymore.
  • You would know that the email you do receive comes from someone you can hold accountable. You would literally be able to trace it back to the originator, just as with the Sender Id scheme.
  • Small companies, open source organisations, or NGOs could still send mass emails without buying a ton of hardware.
  • The above-mentioned organizations wishing to send emails would only have to pay 1 dollar, not the price of a Sender Id certificate, SSL certificate or similar.
If someone violated the “spamming rules”, the same rules would apply that get companies on black lists today. They would have to prove that they were sending to consenting users. If they could not, they would be removed from the white list of SMTP servers. This alone should ensure that people take spamming seriously, because all their other corporate email would bounce from then on.
If someone started spamming, they would risk not only their own credentials but also those of the entire network of co-authorizing buddies.
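In code, removing a spammer and putting the people who vouched for them at risk might look something like the sketch below. What exactly should happen to the vouchers is not worked out here, so the sketch only flags them for review; that choice, the `revoke` function and the server names are assumptions of mine.

```python
from typing import Dict, Set


def revoke(server: str, whitelist: Set[str],
           vouched_by: Dict[str, Set[str]]) -> Set[str]:
    """Remove `server` from the white list and return the vouchers to review.

    `vouched_by` maps a server to the servers that vouched for it.
    Assumption: vouchers are only flagged for review, not removed
    automatically, and only those still on the white list are returned.
    """
    whitelist.discard(server)
    return vouched_by.get(server, set()) & whitelist


# Hypothetical data for illustration.
whitelist = {"mail.spammy.example", "mail.a.example", "mail.b.example"}
vouched_by = {"mail.spammy.example": {"mail.a.example"}}

print(revoke("mail.spammy.example", whitelist, vouched_by))
print(sorted(whitelist))
```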

Now, back to the part about sending emails from a server that is not on the white list. This would be done via a submission form on the solution's website, with word verification and what have you, that would effectively prevent a computer from doing it but allow normal users to send emails, with a few more hoops to jump through, but what the hell, it’ll be worth it.

This idea is just a sketch, but the rough guidelines are hopefully outlined.