defining computers: Things to Fix in E-mail, Newsgroups, and Mailing Lists

I've posted a bit of soapbox-ing on the Debian User list over the last few weeks. One of my rants was about e-mail and mailing lists and the fact that they must change in both form and function. I was asked off-list about what I had in mind, and I guess it's just as well that I should post here, so that the answer is public, but not cluttering up the list further.

(More of the thoughts that have led me to my current opinions here.)

First off, current e-mail is actually pretty good for people who are willing to understand the interface and take responsibility for their own use. Both IMAP and POP provide methods for checking message headers and deleting messages without actually downloading the messages. At the GUI level, Sylpheed, for instance, has the

message -> receive -> remote mailbox

menu item, which allows you to sort by subject, sender, and date, in addition to deleting or downloading individual messages.
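For the curious, here is roughly what a headers-only check looks like underneath the menu item. This is a minimal sketch using Python's standard imaplib, not a description of what Sylpheed actually does. The function names are my own, and `conn` is assumed to be an already logged-in `imaplib.IMAP4_SSL` connection.

```python
import imaplib
from email.parser import BytesHeaderParser


def fetch_header_summaries(conn, mailbox="INBOX"):
    """Return (number, sender, subject, date) for every message in the
    mailbox, asking the server for three header fields only."""
    conn.select(mailbox, readonly=True)  # read-only: nothing gets marked as seen
    _status, data = conn.search(None, "ALL")
    summaries = []
    for num in data[0].split():
        # BODY.PEEK with HEADER.FIELDS transfers the named headers, not the body.
        _status, parts = conn.fetch(
            num, "(BODY.PEEK[HEADER.FIELDS (FROM SUBJECT DATE)])"
        )
        headers = BytesHeaderParser().parsebytes(parts[0][1])
        summaries.append(
            (num.decode(), headers["From"], headers["Subject"], headers["Date"])
        )
    return summaries


def delete_messages(conn, numbers, mailbox="INBOX"):
    """Flag the given message numbers as deleted and expunge them,
    without ever fetching their contents."""
    conn.select(mailbox)
    for num in numbers:
        conn.store(num, "+FLAGS", "\\Deleted")
    conn.expunge()
```

You would call `fetch_header_summaries(conn)`, look over the list, and pass the numbers you don't want to `delete_messages`. Message numbers can shift after an expunge, so a more careful version would use UIDs instead.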

Since spam tends to clump together under a sort, sorting helps greatly at handling spam without actually setting up complex filtering systems. I can clear about a thousand spam messages in about fifteen minutes to a half-hour and not worry about false positives, etc. (This won't work for everyone -- It took me several years to tune my mental filters and visual scanning techniques.)
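The clumping is easy to see in code, too. Here is a small sketch, assuming you already have a list of (number, sender, subject) tuples from a headers-only listing. The function name and the tuple layout are made up for illustration.

```python
from email.utils import parseaddr
from itertools import groupby


def clump_by_sender(summaries):
    """Group (number, sender, subject) tuples by lower-cased sender
    address and return the clumps, largest first."""
    def sender_key(item):
        # parseaddr splits "Name <addr>" into (name, addr); keep the address.
        return parseaddr(item[1])[1].lower()

    ordered = sorted(summaries, key=sender_key)
    clumps = {
        addr: [item[0] for item in group]
        for addr, group in groupby(ordered, key=sender_key)
    }
    return sorted(clumps.items(), key=lambda pair: len(pair[1]), reverse=True)
```

The big clumps from addresses you have never heard of float to the top, which is more or less what I do by eye. It does nothing about spam that uses a fresh address every time; for that, sorting by subject is the next thing to try.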

But, speaking of spam, there you have it. Senders of unsolicited commercial messages are on questionable moral and ethical ground, and the unsolicited pseudo-commercial messages we call spam, mostly fraud or worse, are definitely cases of irresponsible use. The tendency to respond to messages that shouldn't be responded to is also a case of not-really-responsible use, whether answering a message about some unknown wanting to give you money or joining in a mail-list flame war.

If current e-mail protocols and usage were sufficient for mail lists, I think we would have no need for either Twitter or Facebook. The biggest problem with e-mail and mailing lists is that there are always going to be irresponsible users. Even in the early days of the internet, when the users were all military researchers and academics, you'd have (for example) the occasional professor deciding he needed to get the broadest possible audience for something he was doing and mass-mailing every address in his address files.

It's not so much the volume as the lack of human judgement about which addresses to use.

With physical mail, the volume issue is significantly offset by the cost of sending physical junk mail. That's the biggest reason it usually takes more than a week of failing to empty your mailbox to cause it to explode. But if we think of a way to make mass e-mail cost, e-mail is suddenly less valuable because we then start worrying about the cost of our daily conversations.

And then there's the problem of knowing who it really is you are talking to.

We have these things called certificates, that are supposed to provide us assurance of the identity of the other guy. But they don't really work because of companies that would rather make money than provide a service, and we can't use them to tell for sure whether the message we just got really is from the person it says it is from. Without such methods, all we have to work from is the contents of the "From" header and the contents of other headers that ostensibly describe the path that the message took on its way to our in-box. And all those headers are easily forged.
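To show how little stands behind those headers, here is a short sketch using Python's standard email package. It only builds and parses text; it sends nothing. All of the addresses and host names are placeholders in the reserved example domains, and the function names are mine.

```python
from email import message_from_string
from email.message import EmailMessage


def build_message(claimed_sender, recipient, body):
    """Build a message whose headers say whatever the author wants.
    Nothing here checks that claimed_sender belongs to the author."""
    msg = EmailMessage()
    msg["From"] = claimed_sender
    msg["To"] = recipient
    msg["Subject"] = "Hello"
    # A path header is just another line of text the author can type in.
    msg["Received"] = (
        "from mail.example.org by mx.example.net; "
        "Thu, 1 May 2014 10:00:00 +0900"
    )
    msg.set_content(body)
    return msg.as_string()


def claimed_sender_of(raw_text):
    """Return the From header exactly as the message claims it."""
    return message_from_string(raw_text)["From"]
```

Whatever string goes into `build_message` as the sender is what `claimed_sender_of` reports back on the receiving end. The headers are plain text, and the message format itself has no way to tell a true one from a false one.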

With a physical envelope, we really have no way of knowing that the return address on the envelope is for real, unless the letter is sent as registered mail. (And even then we aren't quite sure.) But the content of the letter has out-of-band clues, like the hand-writing, that can help us be sure.

In current e-mail, we have neither registered mail nor handwriting. The closest thing we have to registered mail is the logs on the servers that the message has passed through. If you don't trust any of the servers on the path, you can't trust the path itself.

Other out-of-band clues, like pictures, require HTML-format messages, which make a variety of forgery techniques dead easy. The problems are inherent in the methods we use to encode the data in the messages. ASCII and its descendants, including Unicode, don't really provide good, standardizable methods for burying identifying information in the messages. There are cases where we don't want identifying information in our messages, but there are cases when we very much want to know who we are talking with and want them to know who they are talking with.

Back to the size issues, individual messages are generally not all that large, but when you have a lot of messages from a mail list or newsgroup, the size adds up quickly. Non-requested advertisements add up even more quickly.

Mailing list and newsgroup browsers (or the mailing list mode of your MUA) should not download the thread headers unless you request a thread listing. (And they should respect the thread-related headers, to avoid breaking threads.) And they should not download a message unless you actually request the message.
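Respecting the thread-related headers mostly means following Message-ID, In-Reply-To, and References. Here is a simplified sketch of building a thread listing from headers alone. It is not the full threading algorithm that real mail and news clients use, and the function names and the dict layout are assumptions of mine.

```python
from collections import defaultdict


def find_parent(header, known_ids):
    """Pick the nearest ancestor we actually have: In-Reply-To first,
    then the References list read from its last entry backwards."""
    candidates = []
    if header.get("In-Reply-To"):
        candidates.append(header["In-Reply-To"].strip())
    candidates.extend(reversed(header.get("References", "").split()))
    for message_id in candidates:
        if message_id in known_ids and message_id != header["Message-ID"]:
            return message_id
    return None


def build_threads(headers):
    """headers: list of dicts with 'Message-ID' and, optionally,
    'In-Reply-To' and 'References'. Returns (roots, children), where
    children maps a Message-ID to the IDs of its direct replies."""
    known_ids = {h["Message-ID"] for h in headers}
    roots = []
    children = defaultdict(list)
    for h in headers:
        parent = find_parent(h, known_ids)
        if parent is None:
            roots.append(h["Message-ID"])
        else:
            children[parent].append(h["Message-ID"])
    return roots, dict(children)
```

When a message in the middle of a thread is missing, falling back through References keeps the later replies attached to the nearest ancestor that is present, instead of starting a new thread.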

But it's often hard to tell whether you want to download a message until you've read it and decided you know who sent it, or decide you are interested in it. Since you can't read it without downloading, you're often stuck with downloading anyway. (I'm only successful in my methods of checking the headers from years of practice. If I try that with a new newsgroup or mail list, however, it's going to take a little while to learn that group/list's patterns.)

If we can put reasonably useful identifying headers in a message, and if our MUA can read those headers, we can at least make meaningful judgments about who wrote the message. And that can help us decide whether to download a message, and help us reduce our bandwidth use. (And save us time.)

The more I've used e-mail, the more I find myself storing it the same way I store newsgroup and mailing list messages -- by thread. (And that is one of the reasons I can generally identify spam just by the headers fairly quickly.)

These identifying headers require the cooperation of the mail servers and internet service providers. But the providers and servers are not interested in their users' efficiency. That does nothing to help their bottom line, and in many cases (think wireless) what is inefficient for users makes providers money. Counter-motivation here.

Until we start serving our own mail, and managing our own connections to the internet more directly, e-mail, newsgroups, and mailing lists will remain as they are, rivers where users are dragged along in the flow, instead of tools for the benefit of users. But the technology to allow ordinary users to do so is still not there.

(I think this post is getting a little closer to what I've been trying to say about the internet, and computers, for a long time, but I'm still not quite there.)

Thursday, May 1, 2014
