Or get the source:
  svn checkout svn://viewvc.rubyforge.mmmultiworks.com/var/svn/clothred
== New in this release:
* Support for some HTML entities
* Support for tables
== Features
This is alpha software, and only a few Textile rules have been
implemented so far:
* Font markup and weight (<b>, <strong>, ...)
* Text formatting (<sub>, <sup>, <ins>, <del>)
* Support for headings
* Support for paragraphs and <blockquote>
* Support for Textile entities
== Usage
  require 'clothred'

  text = ClothRed.new("<b>Bold</b> <em>HTML</em>!")
  puts text.to_textile  # prints the Textile version of the HTML
== Get Help
Feel free to contact me, or peruse the homepage.
but that's pretty easy to work around with the simple patch attached.
I just changed the substitution for the HTML closing tags </h1>, </h2>, ...
from "" to "\n\n", which produces the paragraph breaks Textile needs.
"test/test_headings.rb" had to be fixed, and I also wrote
"test/test_misc.rb" as a test script for HTML with more than one tag
in it.
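
To make the change concrete, here is a minimal sketch of a substitution-based heading conversion. It is not ClothRed's actual source: the method name `headings_to_textile` and the regular expressions are assumptions made up for illustration.

```ruby
# Hypothetical sketch, not ClothRed's actual code.
def headings_to_textile(html)
  text = html.dup
  (1..6).each do |level|
    # Opening tag becomes the Textile heading prefix, e.g. "h1. "
    text.gsub!(/<h#{level}>/i, "h#{level}. ")
    # Closing tag becomes a blank line (instead of "") so the
    # following block starts a new Textile paragraph
    text.gsub!(/<\/h#{level}>/i, "\n\n")
  end
  text
end

puts headings_to_textile("<h1>Title</h1><h2>Subtitle</h2>")
```

With an empty string as the replacement for the closing tags, the two headings would run together on a single line, which is the problem the patch addresses.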
The substitution approach will not work correctly for HTML with
missing closing tags, because the algorithm has no way of telling where
an unclosed element ends. So for now it is limited to XHTML, which
requires closing tags.
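
A small sketch of that limitation, assuming a purely regex-based paragraph conversion (the method name `paragraphs_to_textile` is hypothetical and not taken from ClothRed):

```ruby
# Hypothetical sketch of a substitution-only paragraph conversion.
def paragraphs_to_textile(html)
  # Drop opening tags, turn closing tags into blank lines
  html.gsub(/<p>/i, "").gsub(/<\/p>/i, "\n\n")
end

# Well-formed input: each closing tag yields a paragraph break
p paragraphs_to_textile("<p>one</p><p>two</p>")

# Tag soup without closing tags: nothing marks the paragraph boundary,
# so the two paragraphs are merged
p paragraphs_to_textile("<p>one<p>two")
```

The second call has no closing tag to substitute, so the boundary between the two paragraphs is lost.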
I think the suggestion of using an HTML parser (like Hpricot) for
this conversion will become unavoidable pretty soon.
>but that's pretty easy to work around with the simple patch attached.
>I just changed the substitution for the HTML closing tags </h1>, </h2>, ...
>from "" to "\n\n", which produces the paragraph breaks Textile needs.
>"test/test_headings.rb" had to be fixed, and I also wrote
>"test/test_misc.rb" as a test script for HTML with more than one tag
>in it.
Thanks. I'll add your patch ASAP, and make a new release.
>The substitution approach will not work correctly for HTML with
>missing closing tags, because the algorithm has no way of telling where
>an unclosed element ends. So for now it is limited to XHTML, which
>requires closing tags.
And it's buggy, I've noticed, as it ignores self-closing tags like <br />. I'll fix that together with your patch.
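
For illustration only, one way such tags could be handled with a single substitution. This is a sketch under the assumption that a Textile line break is just a newline inside a paragraph; the method name `line_breaks_to_textile` is hypothetical and this is not the fix that went into ClothRed.

```ruby
# Hypothetical sketch, not ClothRed's actual fix.
def line_breaks_to_textile(html)
  # Matches <br>, <br/> and <br />, case-insensitively
  html.gsub(/<br\s*\/?>/i, "\n")
end

puts line_breaks_to_textile("first line<br />second line<br>third line")
```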
>I think the suggestion of using an HTML parser (like Hpricot) for
>this conversion will become unavoidable pretty soon.
Probably, but I'll have to play with Hpricot first to see if it can do what is needed (after a short skim, it can convert HTML 4 into XHTML 1.0, which will make things easier). Trouble is, I want ClothRed to be as dependency-free as I can make it. So maybe I'll redistribute Hpricot in non-gem distributions (if that's possible; I haven't checked Hpricot's license yet).
Thanks for the inspiring work.
No problem. I'll need that library for my own ideas.
I think what you're going to find is, parsing tag-soup HTML is harder than
you think - especially if your goal with ClothRed is to parse *arbitrary*
tag-soup HTML from arbitrary sources.
On Sat, Apr 14, 2007 at 01:24:02AM +0900, Phillip Gawlowski wrote:
>>I think the suggestion of using an HTML parser (like Hpricot) for
>>this conversion will become unavoidable pretty soon.
>Probably, but I'll have to play with Hpricot first to see if it can do
>what is needed (after a short skim, it can convert HTML 4 into
>XHTML 1.0, which will make things easier). Trouble is, I want ClothRed
>to be as dependency-free as I can make it.