In case anyone needs it,
str = 'This is a test of the emergency broadcasting services'
str.scan(/(.{1,30})(?:\s+|$)/)
=> [["This is a test of the"], ["emergency broadcasting"], ["services"]]
Dang that's cool!
I'm still puzzling out how that works...
James Edward Gray II
On Sep 14, 2005, at 9:26 AM, Erik Terpstra wrote:
In case anyone needs it,
str = 'This is a test of the emergency broadcasting services'
str.scan(/(.{1,30})(?:\s+|$)/)
=> [["This is a test of the"], ["emergency broadcasting"], ["services"]]
Nice. Not to golf, but how about simply:
str = 'This is a test of the emergency broadcasting services'
p str.scan(/.{1,30}\b/)
#=> ["This is a test of the ", "emergency broadcasting ", "services"]
(No flattening required.)
--
(-, /\ \/ / /\/
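For anyone comparing the two patterns, here is a small sketch that runs both over a sentence with punctuation and over an over-long token. The sample strings are chosen only for illustration; the two regexps are the ones posted above.

```ruby
text = 'This is a test of the emergency broadcasting system. This is only a test.'

# Break on whitespace (capture group, so flatten the nested arrays).
by_space = text.scan(/(.{1,30})(?:\s+|$)/).flatten
# Break on any word boundary.
by_boundary = text.scan(/.{1,30}\b/)

p by_space
#=> ["This is a test of the", "emergency broadcasting system.", "This is only a test."]
p by_boundary
#=> ["This is a test of the ", "emergency broadcasting system", ". This is only a test"]

# A 31-character token: neither pattern can match starting at its first
# character, so scan moves on and the leading "1" is dropped.
long = '1234567890' * 3 + '1 end'
long_by_space = long.scan(/(.{1,30})(?:\s+|$)/).flatten
long_by_boundary = long.scan(/.{1,30}\b/)

p long_by_space     #=> ["234567890123456789012345678901", "end"]
p long_by_boundary  #=> ["234567890123456789012345678901", " end"]
```

Note that the \b version detaches the period from "system" and loses the final period entirely, because there is no word boundary after a trailing "." at the end of the string.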
Oops, because mine will split punctuation from the word it follows. However, both of ours will silently drop characters from any run of 31 or more non-whitespace characters (\S{31,}).
So:
str = '123456789012345678901234567890This is a test of the emergency broadcasting system. This is only a test.'
class String
  def wrap_to( col_width )
    str = self.gsub( /(\S{#{col_width}})(\S)/, '\1 \2' )
    str.scan(/(.{1,#{col_width}})(?:\s+|$)/).flatten.join( "\n" )
  end
end
puts str.wrap_to( 30 )
123456789012345678901234567890
This is a test of the
emergency broadcasting system.
This is only a test.
puts str.wrap_to( 29 )
12345678901234567890123456789
0This is a test of the
emergency broadcasting
system. This is only a test.
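One case that wrap_to as posted does not seem to cover is a non-whitespace run longer than twice the column width: the gsub consumes the character after each chunk, so the next chunk can end up one character too long. A possible variant, sketched below, uses a look-ahead so that character is not consumed. The method name wrap_long_runs_to is made up for this sketch, and wrap_to is repeated only so the example runs on its own.

```ruby
class String
  # wrap_to as posted above, repeated so this sketch is self-contained.
  def wrap_to(col_width)
    str = gsub(/(\S{#{col_width}})(\S)/, '\1 \2')
    str.scan(/(.{1,#{col_width}})(?:\s+|$)/).flatten.join("\n")
  end

  # Hypothetical variant: insert a space after every col_width
  # non-whitespace characters that are followed by more non-whitespace,
  # using (?=\S) so the following character is checked but not consumed.
  def wrap_long_runs_to(col_width)
    str = gsub(/(\S{#{col_width}})(?=\S)/, '\1 ')
    str.scan(/(.{1,#{col_width}})(?:\s+|$)/).flatten.join("\n")
  end
end

p 'abcdefgh'.wrap_to(3)            #=> "abc\nefg\nh"   (the "d" is lost)
p 'abcdefgh'.wrap_long_runs_to(3)  #=> "abc\ndef\ngh"
```

On the example string from the post, both methods should give the same result for widths 30 and 29, since that string contains only one over-long run.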
On Sep 14, 2005, at 8:39 AM, Gavin Kistner wrote:
Nice. Not to golf, but how about simply:
str = 'This is a test of the emergency broadcasting services'
p str.scan(/.{1,30}\b/)
#=> ["This is a test of the ", "emergency broadcasting ", "services"]
Up to 30 characters, but there has to be whitespace after it (to keep it from splitting in the middle of the word) or be the very end of the string. The greedy regexp will grab all 30 if it can find them with whitespace after, otherwise it will backtrack until it finds the right spot.
Very nice, Erik. I like how it also strips the whitespace that will be wrapped.
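To see that explanation in action, one option is to record, for each match, both the captured line and the full text the pattern consumed. This is only an illustrative sketch; the variable names are made up here.

```ruby
str = 'This is a test of the emergency broadcasting services'

# For every match, keep the captured line (group 1) and the whole match,
# which also includes any whitespace swallowed by (?:\s+|$).
steps = []
str.scan(/(.{1,30})(?:\s+|$)/) do
  m = Regexp.last_match
  steps << [m[1], m[0]]
end

steps.each do |line, consumed|
  puts "#{line.inspect} (#{line.length} chars), consumed #{consumed.inspect}"
end
# "This is a test of the" (21 chars), consumed "This is a test of the "
# "emergency broadcasting" (22 chars), consumed "emergency broadcasting "
# "services" (8 chars), consumed "services"
```

The first 30 characters would end in the middle of "emergency", so the engine backs off to 21 characters, the longest prefix that is followed by whitespace.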
I do understand the Regexp, but isn't that a look-ahead assertion at the end? That's not supposed to consume characters, right? So why doesn't the very next match start with the leading whitespace that ended the last match?
I know I just haven't got me head all the way around it yet. I'm working on it...
James Edward Gray II
Answering my own dumb question, "No James, that's simple clustering not a look-ahead. Get your Regexp symbology right man!" Clustering does consume characters of course, so it now all makes sense to me.
I guess it was just too early in the morning for me...
James Edward Gray II
James Edward Gray II wrote:
I do understand the Regexp, but isn't that a look-ahead assertion at
the end? That's not supposed to consume characters, right? So why
doesn't the very next match start with the leading whitespace that
ended the last match?
I know I just haven't got me head all the way around it yet. I'm
working on it...
James Edward Gray II
No, the part at the end is not a look-ahead assertion.
(?: ... ) still consumes characters; it just groups without capturing.
And when the pattern given to String#scan contains a capture group, the
result holds only what the groups captured; anything matched outside a
group is left out. For example:
'abcdef'.scan(/(.)./) # ==> [['a'], ['c'], ['e']]
So in str.scan(/(.{1,30})(?:\s+|$)/), the part (?:\s+|$) consumes the
whitespace characters, but they are not part of the scan result.
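A short sketch contrasting the three constructs may help here. It shows what the same scans return with a plain consumed character, a non-capturing group, and a real look-ahead (?=...); the variable names are only for illustration.

```ruby
# Capture group followed by a plain consumed character.
plain = 'abcdef'.scan(/(.)./)
p plain        #=> [["a"], ["c"], ["e"]]

# Same thing with an explicit non-capturing group: it still consumes.
clustered = 'abcdef'.scan(/(.)(?:.)/)
p clustered    #=> [["a"], ["c"], ["e"]]

# A look-ahead only checks the next character without consuming it,
# so the following match starts right there.
lookahead = 'abcdef'.scan(/(.)(?=.)/)
p lookahead    #=> [["a"], ["b"], ["c"], ["d"], ["e"]]

# With a look-ahead in the wrapping pattern, the whitespace is not consumed
# and shows up at the start of the next line.
str = 'This is a test of the emergency broadcasting services'
wrapped = str.scan(/(.{1,30})(?=\s+|$)/)
p wrapped
#=> [["This is a test of the"], [" emergency broadcasting"], [" services"]]
```

The last result is the behavior the question expected from a look-ahead: each following match begins with the whitespace that ended the previous one.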
I have been using something extremely similar to add some tags to a text dump of the classified ads for my newspaper. Like this:
line = "<begad:11560454>Clinician PT front office, 50-60 hrs/mo. Send resumes to: www.omacime.com<endad>"
line.gsub!(/(<begad:[^>]+>)(.{1,50}.*?\b)/, "\\1<ftditm>\\2<\/ftditm>")
#=> "<begad:11560454><ftditm>Clinician PT front office, 50-60 hrs/mo. Send resumes</ftditm> to: www.omacime.com<endad>"
This takes a line and wraps the <ftditm></ftditm> tags around the first 50 chars after the <begad:...> tag, plus whatever is needed to reach the next word boundary (\b), which is usually the end of the current word.
Cheers-
-Ezra Zygmuntowicz
Yakima Herald-Republic
WebMaster
509-577-7732
ezra@yakima-herald.com
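As a rough sketch of how that substitution behaves at different widths, the pattern can be wrapped in a small helper. The name tag_lead and the width parameter are hypothetical, and it uses the non-destructive sub rather than gsub! so the input line is left untouched.

```ruby
# Hypothetical helper, not from the original post: the same substitution
# with the width pulled out as a parameter.
def tag_lead(line, width = 50)
  # Group 1: the opening <begad:...> tag.
  # Group 2: up to `width` chars, then lazily extended to the next \b.
  line.sub(/(<begad:[^>]+>)(.{1,#{width}}.*?\b)/, '\1<ftditm>\2</ftditm>')
end

line = "<begad:11560454>Clinician PT front office, 50-60 hrs/mo. Send resumes to: www.omacime.com<endad>"

puts tag_lead(line)
# <begad:11560454><ftditm>Clinician PT front office, 50-60 hrs/mo. Send resumes</ftditm> to: www.omacime.com<endad>

puts tag_lead(line, 35)
# <begad:11560454><ftditm>Clinician PT front office, 50-60 hrs</ftditm>/mo. Send resumes to: www.omacime.com<endad>
```

The second call shows that \b stops at any word boundary, here between "hrs" and "/", not only before whitespace.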