My approach to blocking spam: whenever I receive a spam message, I block the range of IP addresses it came from so they can't connect to my server. This assumes most spam comes from compromised end-user machines on ISP networks. (Ironically, my own server is on a static IP from a DSL provider, so if someone in my netblock were infected and sent me spam, I might block myself. Or someone else following this approach might block me.)

Another problem is that the allocation of these blocks may change. I'm sure there are other problems with this approach!

But assuming I don't block legit email, I can relax with my pipe and brandy snifter and wonder if I'm actually making a difference. Sure, I can see that some email gets blocked that should get blocked, but is it a losing battle? How many blocks do I need to add before I start to see a difference?

What I'm talking about is a logistic growth function measuring the number of infected blocks. How long before I have added enough blocks that I see a slowdown in the arriving spam?
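As a rough illustration of the kind of curve I mean, here is a minimal Python sketch of a logistic function. The function name, the parameter names, and the sample numbers are all made up for illustration; nothing here is fitted to real spam data.

```python
import math

def logistic(t, capacity, rate, midpoint):
    """Value of a logistic curve at time t.

    capacity: the ceiling the curve levels off at (hypothetical total
              number of infected blocks)
    rate:     how steeply the curve climbs
    midpoint: the time at which the curve reaches half of capacity
    """
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))

# Illustrative numbers only: 1000 blocks in total, midpoint at week 26.
for week in range(0, 53, 13):
    print(week, round(logistic(week, 1000, 0.2, 26)))
```

The curve climbs slowly at first, fastest around the midpoint, and then flattens out as it nears the ceiling. The slowdown I'm hoping for would be that flattening.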

I could script adding a single address, which would make that fairly painless.

But for now I'm trying to find a whole block. Two of my assumptions are that the infected PC is on a dynamic address and will get its next address from the same block, and that normal users in a dynamic address range aren't sending email directly but are going through their ISP's outgoing mail server. That's probably not always right, and it probably blocks people using dynamic DNS, but that's my problem.

The advantage is that for the same couple of minutes, I can block 256, or even 65,536 or more addresses at once.
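The arithmetic behind those numbers can be sketched with Python's standard ipaddress module. The helper names and the sample addresses are arbitrary, chosen only for illustration.

```python
import ipaddress

def block_size(cidr):
    """Number of addresses covered by a CIDR block such as '203.0.113.0/24'."""
    return ipaddress.ip_network(cidr, strict=False).num_addresses

def enclosing_block(address, prefix_len):
    """The block of the given prefix length that contains the address."""
    # strict=False lets us pass a host address and have the host bits masked off.
    return ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)

print(block_size("203.0.113.0/24"))         # a /24 covers 256 addresses
print(block_size("198.51.0.0/16"))          # a /16 covers 65,536 addresses
print(enclosing_block("203.0.113.77", 24))  # the /24 around one sender
```

Each bit dropped from the prefix doubles the block, so going from a /24 to a /16 multiplies the coverage by 256.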

It's probably simpler to try to apply the logistic model to a one-at-a-time approach.

There is a finite population of IP addresses, some proportion of infected ones, and some cured ones (i.e., blocked as far as I'm concerned), such that the more I cure, the fewer there are remaining to be infected. The thing is, the curing doesn't happen in parallel. That's why I'm banning blocks at a time instead of individual addresses.

Note that many addresses that send me spam are in big netblocks, so a single spam may well get a whole class B-sized network (a /16 in CIDR notation) blocked. What that means to the layperson is that any address whose first two numbers match will be blocked. There are 65,536 such addresses.
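To make the "first two numbers match" idea concrete, here is a small Python sketch using the standard ipaddress module. The blocked network and the test addresses are arbitrary examples, and is_blocked is a hypothetical helper written for illustration, not the mechanism my server actually uses.

```python
import ipaddress

# One hypothetical blocked /16: every address starting with 198.51.
blocked_networks = [ipaddress.ip_network("198.51.0.0/16")]

def is_blocked(address, networks):
    """True if the address falls inside any of the given networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)

for candidate in ("198.51.100.7", "198.51.3.200", "198.52.0.1"):
    print(candidate, is_blocked(candidate, blocked_networks))
```

The first two candidates share the leading 198.51 and fall inside the block; the third differs in the second number and does not.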

I'll get back to this...

changed November 16, 2007