Spamfo

Sep/04

19

A visual history of spam and virus emails

Raymond Chen, a Microsoft employee, has kept every single piece of spam he has received since mid-1997. The results have been plotted on a chart to give a visual representation of the spam and viruses received over the last 6 years.

The chart only shows the mail that made it past the corporate spam filter. A fair amount of spam still gets through, because the filters have to make sure no genuine business mail is classed as spam.

http://weblogs.asp.net/oldnewthing/archive/2004/09/16/230388.aspx


Some interesting stats quoted straight from the blog:


  • Largest message: 1,406,967 bytes, received January 8, 2004. HTML mail with a lot of text including 41 large images. A slightly smaller version was received the previous day. (I guess they figured that their first version wasn’t big enough, so they sent out an updated version the next day.)
  • Single worst spam day by volume: January 8, 2004. That one monster message sealed the deal.
  • Single worst spam day by number of messages: August 22, 2002. 67 pieces of spam. The vertical blue line.
  • Single worst virus day: August 24, 2003. This is the winner both by volume (1.7MB) and by number (49). The red splotch.
  • Totals: 227.6MB of spam in roughly 19,000 messages. 61.8MB of viruses in roughly 3500 messages.

    Things you can see on the chart:



    • Spam went ballistic starting in 2002. You could see it growing in 2001, but 2002 was when it really took off.
    • Vertical blue lines are “bad spam days”. Vertical red lines are “bad virus days”.
    • Horizontal red lines let you watch the lifetime of a particular email virus. (This works only for viruses with a fixed-size payload. Viruses with variable-size payload are smeared vertically.)
    • The big red splotch in August 2003 around the 100K mark is the Sobig virus.
    • The horizontal line in 2004 that wanders around the 2K mark is the Netsky virus.
    • For most of this time, the company policy on spam filtering was not to filter it out at all, because all the filters they tried had too high a false-positive rate. (I.e., they were rejecting too many valid messages as spam.) You can see that in late 2003, the blue dot density diminished considerably. That’s when mail administrators found a filter whose false-positive rate was low enough to be acceptable.

    There are some interesting trends, and various explanations have been posted for the gaps in spam. Some mention CAN-SPAM, but realistically, has it made much of a difference so far?

    Good to see MS staff putting their spare time to good use. Also, judging by these totals:

    Totals: 227.6MB of spam in roughly 19,000 messages. 61.8MB of viruses in roughly 3500 messages.

    MS staff don't get enough spam :) This averages out to around 260 spam emails per month, although these are only the mails that have already bypassed the filter.
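For anyone who wants to check the monthly average, here is a rough sketch of the arithmetic in Python. The totals are the ones quoted from the blog. The 72-month period (taking "the last 6 years" literally) and the 1 MB = 1024 KB conversion are my own assumptions, and the function names are only for illustration.

```python
# Totals quoted from the blog post (both counts are described as "roughly").
SPAM_MESSAGES = 19_000
SPAM_MEGABYTES = 227.6
VIRUS_MESSAGES = 3_500
VIRUS_MEGABYTES = 61.8

# Assumption: "the last 6 years" is treated as exactly 72 months.
MONTHS = 6 * 12


def per_month(total_messages, months=MONTHS):
    """Average number of messages per month over the period."""
    return total_messages / months


def average_size_kb(megabytes, messages):
    """Average message size in KB, assuming 1 MB = 1024 KB."""
    return megabytes * 1024 / messages


spam_per_month = per_month(SPAM_MESSAGES)
virus_per_month = per_month(VIRUS_MESSAGES)
spam_avg_kb = average_size_kb(SPAM_MEGABYTES, SPAM_MESSAGES)
virus_avg_kb = average_size_kb(VIRUS_MEGABYTES, VIRUS_MESSAGES)

print(f"Spam per month:  {spam_per_month:.0f}")
print(f"Virus per month: {virus_per_month:.0f}")
print(f"Average spam size:  {spam_avg_kb:.1f} KB")
print(f"Average virus size: {virus_avg_kb:.1f} KB")
```

With these assumptions the spam figure comes out at roughly 264 a month, in line with the "around 260" above. Setting MONTHS to cover the whole period since mid-1997 would give a somewhat lower number.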


     




     
