Gateway is malfunctioning

I'd like to see the whole raw message, headers included (my email is
valid, please attach the whole message, (g)zipped, or it will be
bounced by my _personal_ filters).

Ok, Robert sent me what seemed to be the full message, and it is in
fact declared with a MIME content type of multipart/alternative but
contains only one part (text/plain). This is quite absurd, and I'd
suspect that something, somewhere, decided to strip away the HTML part
but leave the MIME declaration intact.

Forgive my weak knowledge of email types here, but if someone sent a
short message to the list with an attached Ruby script, would it be a
multipart message? If so, that would be rejected by the NNTP host?

Yes, but it won't be multipart/alternative (which means different
representations of the same data, nearly universally HTML + plain text);
it is more likely to be multipart/mixed (simply a set of different parts).

For instance, with a quick search, I found this one :

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/227642

Which has been gated correctly, as it has :

Content-Type: multipart/mixed; boundary="sm4nu43k4a2Rpi4c"

(But it lacks the MIME-Version header, which is, AFAICR, mandatory, if
quite useless...)

James

Fred
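
Editorial sketch: the checks Fred describes above (the declared type, the number of parts actually present, and whether MIME-Version is there) can be reproduced on any raw message. This is a minimal illustration using Python's standard `email` package, not the gateway's own code, which is not shown in this thread; the function names `describe` and `is_degenerate_alternative` are made up for the example.

```python
from email import message_from_string


def describe(raw):
    """Summarise the MIME structure of a raw message (headers plus body)."""
    msg = message_from_string(raw)
    # For a multipart message the payload is a list of sub-messages.
    parts = msg.get_payload() if msg.is_multipart() else []
    return {
        "content_type": msg.get_content_type(),
        "part_types": [part.get_content_type() for part in parts],
        "has_mime_version": msg["MIME-Version"] is not None,
    }


def is_degenerate_alternative(raw):
    """True for a multipart/alternative message with fewer than two parts."""
    info = describe(raw)
    return (
        info["content_type"] == "multipart/alternative"
        and len(info["part_types"]) < 2
    )
```

A message like the one Robert forwarded (multipart/alternative wrapping a single text/plain part) would be flagged by `is_degenerate_alternative`, while an ordinary multipart/mixed post with an attachment would not.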

···

On 2 December 2006 at 17:10, James Edward Gray II wrote:

On Dec 2, 2006, at 5:50 AM, F. Senault wrote:

On 2 December 2006 at 11:16, F. Senault wrote:

--
Do you know how far this has gone ? Just how damaged have I become ?
When I think I can overcome It runs even deeper
Everything that matters is gone All the hands of hope have withdrawn
Could you try to help me hang on ? (Nine Inch Nails, Even Deeper)

Forgive my weak knowledge of email types here, but if someone sent a
short message to the list with an attached Ruby script, would it be a
multipart message? If so, that would be rejected by the NNTP host?

I have a feeling this is what is, or at least was, happening - I wasn't
going to say anything (since it was prior to December first) but I
noticed that posts with attachments never made it across.

That's not systematic.

ruby-talk 226884, for example, doesn't appear to have made it to the
newsgroup.

If you have an e-mail version of the message, could you send it to me
with all headers as a (g)zip file? I'll run it manually through the
filters and look at what could cause the rejection. (Maybe the base64
encoding in this one makes it pass for a binary post.)

Fred

···

On 2 December 2006 at 17:37, Ross Bamford wrote:

On Sun, 2006-12-03 at 01:10 +0900, James Edward Gray II wrote:

--
It's over now, I'm cold, alone
I'm just a person on my own
Nothing means a thing to me
(Nothing means a thing to me) (K's Choice, Not an Addict)

Am I understanding right Fred that your filters do allow multipart/mixed?

I ask because I cannot find that the following message was moved to Usenet:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/226884

But here is the Content-Type header from that message:

Content-Type: multipart/mixed; boundary="=-nN7cXnGaqHsTLxonszcP"

James Edward Gray II

···

On Dec 2, 2006, at 10:55 AM, F. Senault wrote:

On 2 December 2006 at 17:10, James Edward Gray II wrote:

On Dec 2, 2006, at 5:50 AM, F. Senault wrote:

On 2 December 2006 at 11:16, F. Senault wrote:

I'd like to see the whole raw message, headers included (my email is
valid, please attach the whole message, (g)zipped, or it will be
bounced by my _personal_ filters).

Ok, Robert sent me what seemed to be the full message, and it is in
fact declared with a MIME content type of multipart/alternative but
contains only one part (text/plain). This is quite absurd, and I'd
suspect that something, somewhere, decided to strip away the HTML part
but leave the MIME declaration intact.

Forgive my weak knowledge of email types here, but if someone sent a
short message to the list with an attached Ruby script, would it be a
multipart message? If so, that would be rejected by the NNTP host?

Yes, but it won't be multipart/alternative (which means different
representations of the same data, nearly universally HTML + plain text);
it is more likely to be multipart/mixed (simply a set of different parts).

For instance, with a quick search, I found this one :

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/227642

Which has been gated correctly, as it has :

Content-Type: multipart/mixed; boundary="sm4nu43k4a2Rpi4c"

Am I understanding right Fred that your filters do allow multipart/
mixed?

They should.

I ask because I cannot find that the following message was moved to
Usenet:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/226884

But here is the Content-Type header from that message:

Content-Type: multipart/mixed; boundary="=-nN7cXnGaqHsTLxonszcP"

If you have the complete message, please forward it to me. I'll be able
to give more precise answers if I can run it manually through the
filters.

Fred

···

On 2 December 2006 at 18:05, James Edward Gray II wrote:

On Dec 2, 2006, at 10:55 AM, F. Senault wrote:

--
Young at heart an' it gets so hard to wait
When no one I know can seem to help me now
Old at heart but I mustn't hesitate
If I'm to find my way out (Guns n' Roses, Estranged)

Okay, it was interpreted as binary (probably because of the base64 part).
I've lifted that condition (I moved clr, i.e. comp.lang.ruby, to the groups
where binaries are allowed), but messages like this will probably suffer
from poor distribution, which may raise problems that are even harder to
troubleshoot.

Fred
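
Editorial sketch: to see which parts of a message might trip a "looks binary" filter, one can list every leaf part whose Content-Transfer-Encoding is base64. The actual filter rules on the NNTP host are not given in this thread, so this only shows how to find the base64 parts, using Python's standard `email` package; `base64_parts` is a hypothetical helper name.

```python
from email import message_from_string


def base64_parts(raw):
    """Return (content type, charset) for each base64-encoded leaf part."""
    msg = message_from_string(raw)
    hits = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        # A missing header means the default encoding, 7bit.
        encoding = part.get("Content-Transfer-Encoding", "7bit")
        if encoding.strip().lower() == "base64":
            hits.append((part.get_content_type(), part.get_content_charset()))
    return hits
```

A text part that was base64-encoded only to carry UTF-8 shows up here with a `text/...` type and a charset, which distinguishes it from a genuinely binary attachment.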

···

On 2 December 2006 at 18:05, James Edward Gray II wrote:

Am I understanding right Fred that your filters do allow multipart/
mixed?

I ask because I cannot find that the following message was moved to
Usenet:

--
Run desire run Sexual being
Run him like a blade To and through the heart
No conscience One motive
Cater to the hollow (A Perfect Circle, Hollow)

Assuming that the base64 encoding was done to cope with UTF8 in the source, is there no way we can improve the distribution situation? The vagaries of NNTP are not my strong point (by any means) but it strikes me that UTF8 will be fairly common in messages and attachments on ruby-talk. Did the old gateway address this somehow?

Btw. just wanted to thank both you and James for the time and resources spent on the gateway - it's a valuable service, and we're all lucky to have you working on it.

Cheers,

···

On Sat, 02 Dec 2006 19:02:34 -0000, F. Senault <fred@lacave.net> wrote:

On 2 December 2006 at 18:05, James Edward Gray II wrote:

Am I understanding right Fred that your filters do allow multipart/
mixed?

I ask because I cannot find that the following message was moved to
Usenet:

Okay, it was interpreted as binary (probably because of the base64 part).
I've lifted that condition (I moved clr to the groups where binaries are
allowed), but messages like this will probably suffer from poor
distribution, which may raise problems that are even harder to
troubleshoot.

--
Ross Bamford - rosco@roscopeco.remove.co.uk

We appreciate all your efforts. Thank you Fred!

James Edward Gray II

···

On Dec 2, 2006, at 1:05 PM, F. Senault wrote:

On 2 December 2006 at 18:05, James Edward Gray II wrote:

Am I understanding right Fred that your filters do allow multipart/
mixed?

I ask because I cannot find that the following message was moved to
Usenet:

Okay, it was interpreted as binary (probably because of the base64 part).
I've lifted that condition (I moved clr to the groups where binaries are
allowed), but messages like this will probably suffer from poor
distribution, which may raise problems that are even harder to
troubleshoot.

No. The new code does everything the old one did and more. The main difference is that I can understand the code now. :wink:

Before I took over the Gateway though, it did operate through a different NNTP host. I don't know what that host allowed, but the evidence suggests it may have been pretty accepting.

James Edward Gray II

···

On Dec 2, 2006, at 1:50 PM, Ross Bamford wrote:

Did the old gateway address this somehow?

Here's what I said to Fred on the subject:

"I'm guessing we could get radical and read incoming emails with an email library, then use that to compose a sensical Usenet post. That way we could pull the text section of a multipart/alternative message. Attachments are trickier and I guess we would have to inline them. That would work for simple Ruby scripts, but it gets more complicated when someone posts something like a zipped archive file.

There would still be situations it couldn't handle, but maybe we could reduce them. This sure sounds like work though. :wink:

I'm considering making the Gateway code public now that I have rewritten it. Maybe this will encourage enterprising souls to hack on it a bit for features like this."

We just need to remember that we are joining two worlds with different rules here.

James Edward Gray II
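
Editorial sketch: the idea James outlines above (read the mail with an email library, keep the plain-text section, inline textual attachments) could look roughly like this. It is an illustration in Python with the standard `email` package, not the gateway's implementation; `flatten_for_usenet`, `_decoded_text` and the attachment banner format are assumptions made for the example. Non-text parts such as zipped archives are simply skipped, which is exactly the case James says gets more complicated.

```python
from email import message_from_string


def _decoded_text(part):
    """Undo the transfer encoding (e.g. base64) and decode to str."""
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "us-ascii"
    try:
        text = payload.decode(charset, errors="replace")
    except LookupError:  # unknown charset label: fall back to Latin-1
        text = payload.decode("latin-1")
    return text.rstrip("\n")


def flatten_for_usenet(raw):
    """Build a single plain-text body from a possibly multipart message."""
    msg = message_from_string(raw)
    chunks = []
    for part in msg.walk():
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue  # skip containers and non-text attachments
        if part.get_content_subtype() == "html":
            continue  # keep only the plain-text alternative
        text = _decoded_text(part)
        name = part.get_filename()
        if name:
            # Hypothetical banner marking an inlined attachment.
            chunks.append(f"--- attachment: {name} ---\n{text}")
        else:
            chunks.append(text)
    return "\n\n".join(chunks)
```

Note that this rewrites the message body, so it trades fidelity to the original message for acceptance by stricter NNTP hosts.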

···

On Dec 2, 2006, at 1:50 PM, Ross Bamford wrote:

Assuming that the base64 encoding was done to cope with UTF8 in the
source, is there no way we can improve the distribution situation?

Here's what I said to Fred on the subject:

"I'm guessing we could get radical and read incoming emails with an
email library, then use that to compose a sensical Usenet post. That
way we could pull the text section of a multipart/alternative
message. Attachments are trickier and I guess we would have to
inline them. That would work for simple Ruby scripts, but it gets
more complicated when someone posts something like a zipped archive
file.

This path leads to the Dark Side (tm): mangling the content of a message
(even if it's only transcoding). It's usually considered very bad mojo to
touch anything other than the headers...

(At least, I'll have someone to blame if I typo something in a post...
:P)

There would still be situations it couldn't handle, but maybe we
could reduce them. This sure sounds like work though. :wink:

Yep. I'm not sure there are any good, fast libraries out there for
handling MIME and transfer encodings.

I'm considering making the Gateway code public now that I have
rewritten it. Maybe this will encourage enterprising souls to hack
on it a bit for features like this."

That's definitely a good idea.

We just need to remember that we are joining two worlds with
different rules here.

What he said.

James Edward Gray II

Fred

···

On 2 December 2006 at 22:11, James Edward Gray II wrote:

On Dec 2, 2006, at 1:50 PM, Ross Bamford wrote:

--
So sorry your world is tumbling down
I will watch you through these nights
Rest your head and go to sleep
Because my child, this is not our farewell (Within Temptation, Our Farewell)

I've published half of the code now:

James Edward Gray II

···

On Dec 5, 2006, at 4:20 AM, F. Senault wrote:

On 2 December 2006 at 22:11, James Edward Gray II wrote:

I'm considering making the Gateway code public now that I have
rewritten it. Maybe this will encourage enterprising souls to hack
on it a bit for features like this."

That's definitely a good idea.