[ANN] msgthr - container-agnostic, non-recursive message threading

Pure Ruby message threading based on the algorithm described by
JWZ in <https://www.jwz.org/doc/threading.html> and used in
countless mail and news readers; but with some features removed
and improved flexibility:

* Message format-agnostic, not limited to email or NNTP;
  but even appropriate for forums, blog comments, IM, etc.
  For mail and NNTP messages, that means you must extract
  the Message-ID and parse the References and In-Reply-To
  headers into an array yourself.

* No recursion; all algorithms are non-recursive to avoid
  SystemStackError exceptions in deep message threads.
  Memory usage is bound by the number of lightweight containers
  needed to represent a thread.

* No pruning of non-leaf ghosts. Readers are not shielded
  from deleted or lost messages. Equivalent to setting
  "hide_missing" to "no" in mutt.

* No grouping by subject[1]. This makes things ideal for short
  messages such as chat and status updates. For email, this
  encourages the use of proper client which set In-Reply-To
  or References headers. Equivalent to setting "strict_threads"
  to "yes" in mutt.

  [1] - Instead, messages with the same Subject can be grouped
        by a search engine. For email, this would be done using
        a mail search engine like notmuch or mairix.

Source code

···

-----------

Our Ruby 1.9.3+ compatible code is available via git,
no C compiler or other dependencies are required:

  git clone git://80x24.org/msgthr
  git clone https://80x24.org/msgthr.git

Releases may be downloaded via RubyGems:

  gem fetch [-v VERSION] msgthr

The code used in this Ruby library was originally derived from
the Mail::Thread module for Perl 5.x:

  https://metacpan.org/release/Mail-Thread

Contact
-------

Feedback is welcome via plain-text mail (no HTML, images, etc).
Please send comments, user/dev discussion, patches, bug reports,
and pull requests to the open-to-all mailing list at:

  msgthr-public@80x24.org

No subscription will ever be required to post. Please Cc: all
recipients as subscription will never be necessary.

Mailing list archives are available via HTTPS and NNTP:

  https://80x24.org/msgthr-public/
  nntp://news.public-inbox.org/inbox.comp.lang.ruby.msgthr

You may optionally subscribe, but will never be required to:

  msgthr-public+subscribe@80x24.org

This README is our homepage, we would rather be working on cool
stuff day than worrying about the next browser vulnerability
because CSS/JS/images are all too complicated for us.

RDoc API documentation in simple HTML is available at:

  https://80x24.org/msgthr/rdoc/

Copyright
---------

Copyright 2016-2017, all contributors (see git repo).
License: GPL-2.0+ <https://www.gnu.org/licenses/gpl-2.0.txt>

msgthr is copyrighted Free Software by all contributors, see logs in
revision control for names and email addresses of all of them.

msgthr is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.

msgthr is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public
License along with this program; if not, see
https://www.gnu.org/licenses/gpl-2.0.txt

--
https://80x24.org/msgthr/README

Quite interesting. Once I get around to continue my work on
Chessboard[1], I think I have a usecase for this (it's a
mailinglist-webforum gateway and thus needs to display email message
threads).

Greetings
Marvin

[1]: GitHub - Quintus/chessboard: Small Bulletin Board forum in Ruby

···

On Fri, Jun 09, 2017 at 12:04:35AM +0000, Eric Wong wrote:

Pure Ruby message threading based on the algorithm described by
JWZ in <https://www.jwz.org/doc/threading.html&gt; and used in
countless mail and news readers

--
Blog: https://www.guelkerdev.de
PGP/GPG ID: F1D8799FBCC8BC4F

Cool. I'll be interested to know how it goes! The algorithmic
improvements in msgthr were originally developed for
public-inbox(*) in Perl5 before porting to Ruby.

With the mirror of the git mailing list at
git@vger.kernel.org mailing list mirror (one of many) , I was hitting stack warnings in
Perl due to recursion on deep message threads, so changes were
necessary.

For chessboard: Feel free to steal from public-inbox, it's also
AGPL-3+. public-inbox stole it's Xapian-based search/thread
indexing logic from notmuch (GPL-3+), even. I suggest using
something along those lines to get the dataset which you'll
feed into msgthr.

Feel free to poke me about public-inbox implementation
details on meta@public-inbox.org, too :slight_smile:

(*) git clone https://public-inbox.org/ public-inbox

    lib/PublicInbox/SearchIdx.pm contains the Xapian indexing logic;
    Search.pm contains the retrieval (read-only) bits;
    SearchThread.pm is kinda like msgthr.rb :wink:
    and SearchMsg.pm is the lightish-weight message container

···

Marvin Gülker <m-guelker@phoenixmail.de> wrote:

On Fri, Jun 09, 2017 at 12:04:35AM +0000, Eric Wong wrote:
> Pure Ruby message threading based on the algorithm described by
> JWZ in <https://www.jwz.org/doc/threading.html&gt; and used in
> countless mail and news readers

Quite interesting. Once I get around to continue my work on
Chessboard[1], I think I have a usecase for this (it's a
mailinglist-webforum gateway and thus needs to display email message
threads).

Greetings
Marvin

[1]: GitHub - Quintus/chessboard: Small Bulletin Board forum in Ruby