holiman/spam.md

## spam.md

      
Stopping spam

This blog post by Jonathan Brown suggested replacing SMTP with the Ethereum blockchain, specifically by using the log channel to monitor events.
With this approach, emails wouldn't actually be stored within the EVM (Ethereum Virtual Machine) storage, but every email would still be present in the blockchain blocks. The EVM log mechanism would make it simple for a full node to monitor and be alerted whenever an email was submitted.
I don't believe that this would be feasible in the real world, for several reasons:

A lot of email is being sent, and some of it is quite large.
Most people wouldn't want their emails on the blockchain forever. The future resiliency of GPG is unknown, and GPG encryption does not provide perfect forward secrecy: once a key is compromised, all is revealed.
GPG encryption (PGP/MIME, like S/MIME) covers the body, not the email headers. Though I might have misunderstood how the message was transferred; perhaps the entire email was encrypted.
SMTP isn't going anywhere for a while yet. There's quite a lot of protocol inertia on the internet (ask IPv6 and DNSSEC).

However, there's one aspect of this that's really interesting: the ability to associate emails with a cost, and thereby eliminate spam.
The new old thing

The 'Penny per Mail' idea is old. Paul Graham references it in this essay from 2003. Wikipedia has an entry about it here; it says the following:

As a refinement to stamp systems is the method of requiring that a micropayment only be made (or some other form of penalty imposed) if the recipient considers the email to be abusive. This addresses the principal objection to stamp systems: popular free legitimate mailing list hosts would be unable to continue to provide their services if they had to pay postage for every message they sent.
Bill Gates announced that Microsoft is working on a solution requiring so-called "unknown senders", i.e. senders not on the Accepted List of the recipient to post "the electronic equivalent of a" stamp whose value would be lost to the sender only if the recipient disapproves of the email. Gates said that Microsoft favors other solutions in the short-term, but would rely on the contingent payment solution to solve the spam problem over the longer run. Microsoft, AOL as well as Yahoo! have recently introduced systems that allow commercial senders to avoid filters if they obtain a paid or pre-paid certificate or certification, which is lost to the sender if recipients complain.

I don't know how old this information is; chances are it's pretty (read, extremely) old.
Back in 2003, there was no way to charge a micro payment for delivering an email. If you're old enough to remember, there were many initiatives that tried to tackle this, with various schemes. None of them were successful. And spam is still a problem.
But now it's 2015; blockchains have been around for a while and have started to become a household name. Blockchains are super useful for micro payments. We now have a tool at our disposal that just didn't exist back then. Since I'm most familiar with Ethereum, that's the one I'm going to focus on.
The refined idea

So, let's revisit the idea from the blog post, but not aim for replacing SMTP; just cherry-pick the anti-spam (micro-payment) part.
Let's say I start sending email with an additional header:
X-blockchain: ethereum/sha256('foobar.com')/sha256('johndoe')/23423409812340912348
Message-ID: 23423409812340912348

In parallel, my email client sends a message via Ethereum, but not the entire email message; only the identifier.
Whenever an email reaches me, if I see the header above, I can match it against an identifier received over Ethereum. If it matches, I know that the sender has paid at least the gas cost of the message.
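
As a rough illustration of the two steps above (stamping an outgoing email, then checking an incoming one), here is a minimal Python sketch. The function names, the use of a random UUID as the identifier, and the `id@domain` form of the Message-ID are my own assumptions for illustration; how the identifier actually travels over Ethereum is left out entirely and represented by a plain set of identifiers.

```python
import hashlib
import uuid


def topic(value: str) -> str:
    """Hex SHA-256 digest of a domain or username, as used in the header."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_headers(domain: str, user: str, sender_domain: str) -> dict:
    """Return the two headers for an outgoing 'pennymail' (hypothetical helper)."""
    message_id = uuid.uuid4().hex  # random identifier, also sent over Ethereum
    return {
        "X-blockchain": f"ethereum/{topic(domain)}/{topic(user)}/{message_id}",
        "Message-ID": f"{message_id}@{sender_domain}",
    }


def is_paid(headers: dict, ids_seen_on_chain: set) -> bool:
    """True if the identifier in X-blockchain was also received over Ethereum."""
    parts = headers.get("X-blockchain", "").split("/")
    return (
        len(parts) == 4
        and parts[0] == "ethereum"
        and parts[3] in ids_seen_on_chain
    )
```

The receiving client would fill `ids_seen_on_chain` from whatever listens to the Ethereum logs; an email whose identifier is missing from that set simply gets no credit.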
What are the consequences?

This would be used as an addition to existing spam-filters. A dual-channel cost-receipt will lower the spam-rating of an email.
If two users already have a dialogue, they can skip the micro-payment (supposing that client-side spam filters would whitelist outgoing email recipients).
Only 1st-contact emails need to use micro-payment.
Message content would still be delivered as before, over the plain old insecure SMTP that we love so much.
It would be possible to require a higher cost if the minimum threshold turns out to be too low.
SMTP servers would not need any change. This could be implemented, today, within email clients.
A new way to model services around email *
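
The whitelisting and scoring points in the list above could look roughly like this on the client side. This is only a sketch: the function names, the numeric score scale and the `receipt_bonus` default are invented for illustration, and a real spam filter would fold the receipt in as one signal among many.

```python
def needs_payment(recipient: str, already_contacted: set) -> bool:
    """Sender side: only first-contact emails need a micro-payment."""
    return recipient.lower() not in already_contacted


def adjust_spam_score(base_score: float, has_receipt: bool,
                      receipt_bonus: float = 2.0) -> float:
    """Receiver side: a matching on-chain receipt lowers the spam score.

    The score never drops below zero; without a receipt it is left unchanged.
    """
    if has_receipt:
        return max(0.0, base_score - receipt_bonus)
    return base_score
```

Raising `receipt_bonus` only for receipts above a certain paid amount would be one way to express the "require a higher cost" idea.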

Security

One of the problems that Paul Graham mentions above is:

Charging per email wouldn't stop the worst spammers. They'd just break into companies' computers and send mail at their expense. And the possibility of a spammer breaking into one's system and racking up big email bills would not make the average sysadmin eager to become an early adopter, to say the least.
For this kind of approach to work, we'd first have to solve the problem of making the average small and medium-sized company's network secure. So we'd just be exchanging a hard problem for a harder one.

The way I read his comment, he assumes that such a solution would endow the MTA with a VISA card, or have it run up a monthly bill with some creditor. Ergo, an attacker loose in the MTA would go on a spending spree and cost a lot.
However, that's not the case with an Ethereum account (or other types of blockchain currencies). The corporate account could simply watch the email server account(s) and, whenever the funds are running low, deposit some more ether to spend on gas. The actual email server wouldn't have to hold more than a couple of days' (or hours') worth of funds. Naturally, the corporate account would not need any connectivity to the actual email server at all, as long as it's connected to the blockchain.
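
The refill rule can be sketched as a small pure function. The names and the low-water/target scheme are assumptions of mine, not part of any existing tool; reading the balance and sending the deposit would be done by whatever Ethereum client the corporate account uses.

```python
def top_up_amount(balance_wei: int, low_water_wei: int, target_wei: int) -> int:
    """How much the corporate account should deposit to the mail server account.

    Nothing is sent while the balance is at or above the low-water mark;
    otherwise the account is refilled up to the target. An attacker who takes
    over the mail server can therefore spend at most `target_wei` per refill.
    """
    if target_wei < low_water_wei:
        raise ValueError("target must not be below the low-water mark")
    if balance_wei >= low_water_wei:
        return 0
    return target_wei - balance_wei
```

Rate-limiting the refills (say, at most one per day) would additionally bound the loss over time, at the price of outgoing pennymails stalling once the funds are drained.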
Implementation thoughts / QnA

I don't have a full RFC for this thing; it's mainly a thought experiment at this point. I don't have a full understanding of the Ethereum log mechanism either, but here are some implementation ideas and general thoughts:


What to send over Ethereum logs


For server side processing, a domain channel to listen on would be needed.

The first log topic could be hash(domain_name)


For client side processing, the user address would be needed.

The second log topic could be hash(username)
There's a problem with this, since it would make it easy to monitor the fact that a specific user got an email.


The log message should be the `Message-ID`, corresponding to the identifier sent within the email.
Thus, we'd arrive at
log(sha256('foobar.com'), <optional> sha256('johndoe'), '23423409812340912348'). The corresponding email header additions would be:


X-blockchain: ethereum/sha256('foobar.com')/<empty> or sha256('johndoe')/<ID from Message-ID>


In practice:
X-blockchain: ethereum/fcc63e859c5d2631585c5f9ebf819aa7d471dac6317d56635deb5931a0f5957a//26f38bf5ba014f349921f5687ff4b489
[...]
Message-ID: 26f38bf5ba014f349921f5687ff4b489@senderdomain.com
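
To make the header layout concrete, here is a hedged Python sketch that parses the `X-blockchain` value, checks it against the `Message-ID`, and looks for a matching log entry. The `PennyRef` type, the function names, and the representation of a log entry as a `(topics, data)` pair of hex strings are assumptions for illustration; a real Ethereum client would deliver logs in its own format.

```python
from typing import NamedTuple, Optional


class PennyRef(NamedTuple):
    chain: str
    domain_hash: str
    user_hash: str  # empty string when the optional second topic is omitted
    message_id: str


def parse_x_blockchain(value: str) -> Optional[PennyRef]:
    """Split 'chain/domain-hash/user-hash-or-empty/id'; None if malformed."""
    parts = value.strip().split("/")
    if len(parts) != 4 or not parts[0] or not parts[1] or not parts[3]:
        return None
    return PennyRef(*parts)


def matches_message_id(ref: PennyRef, message_id_header: str) -> bool:
    """Compare the identifier with the part of Message-ID before the '@'."""
    local = message_id_header.strip().strip("<>").split("@", 1)[0]
    return local == ref.message_id


def has_matching_log(ref: PennyRef, logs) -> bool:
    """logs: iterable of (topics, data) pairs, all values as hex strings."""
    for topics, data in logs:
        if data != ref.message_id or not topics or topics[0] != ref.domain_hash:
            continue
        if ref.user_hash and (len(topics) < 2 or topics[1] != ref.user_hash):
            continue
        return True
    return False
```

With the example above, the parsed reference has an empty `user_hash`, so only the domain topic and the identifier are compared.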


How to notify pennymail capabilities

The pennymail-via-ethereum capability for an MTA could be specified via DNS records.
I don't know how a single user without MTA or DNS access could advertise this. Ideas?


Does this require SMTP servers to be reconfigured?

No, it does not. This mechanism can be layered on top of / parallel to SMTP, on client systems only. However, future modifications to SMTP could use this mechanism to implement server-side spam filtering.


Does this require all mail to be recorded to the blockchain?

No, but every 'pennymail' will record an event on the blockchain. The event should contain some data which the receiver can use to get notified via a channel. An attacker monitoring the same events would know whenever a pennymail was sent to the recipient domain, but would not learn the sender's identity (only the sender's Ethereum account), and could find out the individual recipient only via hash lookup.


Where does the money go?

In the case of Ethereum, the canonical cost would go to the miners. Additional costs added on top could go to arbitrary Ethereum accounts, such as the email operator, the recipient's account or a charity account.
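
For a feeling of the size of the canonical cost, here is a rough lower-bound estimate in Python. The gas constants are the fee-schedule values from the Ethereum Yellow Paper as I understand them (base transaction cost, and the per-log, per-topic and per-byte costs of a log operation). The estimate deliberately ignores transaction data and any contract code executed to emit the log, and the gas price is left as a parameter, so treat the result as a floor rather than a prediction.

```python
G_TRANSACTION = 21_000  # base cost of any transaction
G_LOG = 375             # base cost of a log operation
G_LOG_TOPIC = 375       # additional cost per log topic
G_LOG_DATA = 8          # additional cost per byte of log data


def min_gas_for_log(num_topics: int, data_bytes: int) -> int:
    """Lower bound on the gas for one transaction emitting one log entry."""
    if not 0 <= num_topics <= 4:
        raise ValueError("the EVM supports 0 to 4 log topics")
    if data_bytes < 0:
        raise ValueError("data_bytes must be non-negative")
    return (G_TRANSACTION + G_LOG
            + G_LOG_TOPIC * num_topics + G_LOG_DATA * data_bytes)


def penny_cost_wei(num_topics: int, data_bytes: int,
                   gas_price_wei: int, extra_wei: int = 0) -> dict:
    """Split the cost into the miners' share and a voluntary extra amount."""
    miner = min_gas_for_log(num_topics, data_bytes) * gas_price_wei
    return {"miner": miner, "extra": extra_wei, "total": miner + extra_wei}
```

Here `extra_wei` stands for the additional amount that could be sent to an email operator, recipient or charity account.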


What if pennies are not submitted?

That depends on the recipient. The likelihood of being classified as 'spam' would increase. Otherwise, everything would continue as usual.


Would I have to 'penny' every email I send to my friends?

Nope, dialogues should whitelist automatically by whatever filter is in place.


What metadata is leaked

The sender's Ethereum account. If the same sender account is used for all outgoing emails, recipients will know which account belongs to the sender, and can use that information to see when it is being, or has been, used to send pennymails.
When emails are sent to the receiver domain.
If the second log topic is used, when emails are sent to the recipient address.


Will a full-blown ethereum client need to be installed on the mail server?

Not with client side processing.
With server side recipient processing, a light client which only receives blocks and handles log messages would suffice; no account database, no complete blockchain. The client could run as a nobody user which makes the log messages available to the MTA processing queues.


[*] Take Gmail for example. You don't pay for Gmail, so obviously you are not the customer here; you're the product. What if an email service provided the same service as Gmail, but also implemented the penny-mail spam protection? It would only accept incoming email with an associated penny, unless, of course, the email comes from within the @ethmail domain or from a sender whitelisted by you.
If this was successful, they could start charging additionally, making senders of incoming email pay for the benefit of sending you mail. So instead of having advertisers pay for showing you ads, email senders would pay for communicating with you. While this does sound a bit far-fetched today, the integration of micro payments via a second channel opens up different business models around email services.**
[**] For example, if you're Bill Gates or Justin Bieber, you could promise to read and respond to all email with an associated $100 bill, which is donated to charity.