Cannot remove multiple spaces

Tom_Cloyd2 · 7 February 2009 13:31

I'm baffled by this strange outcome - I cannot collapse runs of multiple spaces in a text file. This isn't just a regex problem, somehow. I'm failing to grasp something essential, but don't know what it is. All help appreciated, as usual!

Here is a demo of my problem, in which I try two different ways, and both fail:

=== code ===
# h2t.rb

def main
  # conversion table spec
  conv = [
  [ '<h1>', 'h1. ' ], [ '<h2>', 'h2. ' ], [ '<h3>', 'h3. ' ],
  [ '<h4>', 'h4. ' ], [ '<h5>', 'h5. ' ], [ '<h6>', 'h6. ' ], [ /<\/h\d>/, '' ],
  [ " +", ' ' ]] # <= this last array element should do the trick, but doesn't

  data = open( 'h2t-in2.txt', 'r' ) { |f| ( f.readlines( data )).to_s }
  conv.each do |i|
    data.gsub!( i[0], i[1] )
  end
  data.squeeze(' ') # <= putting this here was sheer desperation, but even THIS fails

  open( "h2t-out.txt", "w" ) { |f| f.write( data ) }
end

%w(rubygems ruby-debug readline strscan logger fileutils).each{ |lib| require lib }

main

=== input file ===

<h1>Library catalog listing </h1>x

<h3>Library catalog listing </h3>x

<h2>Library catalog listing </h2>x

p(subtitle). A complete listing of all material in the Library

=== output file ===

h1. Library catalog listing x

h3. Library catalog listing x

h2. Library catalog listing x

p(subtitle). A complete listing of all material in the Library

==============

The "x"s in the input file are to show that while the end tags are being removed the space before them is NOT.

t.

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< tc@tomcloyd.com >> (email)
<< TomCloyd.com >> (website) << sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tom_Cloyd2 · 7 February 2009 13:42

Tom Cloyd wrote:

I'm baffled by this strange outcome - I cannot reduce multiple spaces from a text file. This isn't just a regex problem, somehow. I'm failing to grasp something essential, but don't know what it is. All help appreciated, as usual!

Here is a demo of my problem, in which I try two different ways, and both fail:

=== code ===
# h2t.rb

def main
# conversion table spec
conv = [
[ '<h1>', 'h1. ' ], [ '<h2>', 'h2. ' ], [ '<h3>', 'h3. ' ],
[ '<h4>', 'h4. ' ], [ '<h5>', 'h5. ' ], [ '<h6>', 'h6. ' ], [ /<\/h\d>/, '' ],
[ " +", ' ' ]] # <= this last array element should do the trick, but doesn't

Ouch. THIS - [ / +/, ' ' ], substituted for [ " +", ' ' ] above fixes it. I'm going blind, obviously.

t.


Craig_Demyanovich · 7 February 2009 14:21

See comments below.

=== code ===
# h2t.rb

def main
# conversion table spec
conv = [
[ '<h1>', 'h1. ' ], [ '<h2>', 'h2. ' ], [ '<h3>', 'h3. ' ],
[ '<h4>', 'h4. ' ], [ '<h5>', 'h5. ' ], [ '<h6>', 'h6. ' ], [ /<\/h\d>/, '' ],
[ " +", ' ' ]] # <= this last array element should do the trick, but doesn't

The last element means: replace each occurrence of a literal space followed
by a plus sign with a single space, because a String pattern is matched
literally. I assume that you were trying to write a regular expression,
which would make your last array [/ +/, ' '].
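The difference is easy to see on a small string. This is only a sketch; the sample text and the variable names are made up for illustration:

```ruby
# Sample line containing runs of spaces and one literal " +" sequence.
line = "Library  catalog   listing +plus"

# String pattern: matched literally, so " +" means "a space, then a plus sign".
literal = line.gsub(" +", " ")

# Regexp pattern: "+" is a quantifier, so / +/ means "one or more spaces".
collapsed = line.gsub(/ +/, " ")

puts literal    # the runs of spaces are still there, only " +" was replaced
puts collapsed  # every run of spaces is now a single space
```

With the String pattern only the one literal " +" is touched, which is why the original table entry appeared to do nothing on input that contains no plus signs.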

data = open( 'h2t-in2.txt', 'r' ) { |f| ( f.readlines( data )).to_s }

conv.each do |i|
data.gsub!( i[0], i[1] )
end
data.squeeze(' ') # <= putting this here was sheer desperation, but even THIS fails

This does nothing because String#squeeze returns a new string that you don't
capture. Instead of the last array element above, you could do

data = data.squeeze(' ')
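A short sketch of the two ways to keep the result, using made-up sample strings. String#squeeze returns a new string, while String#squeeze! changes the receiver in place (and returns nil when there was nothing to squeeze, so its return value should not be chained):

```ruby
data = "Library catalog listing  x"

data.squeeze(" ")          # returns a new string; the result is thrown away
unchanged = data.dup       # data still contains the double space

data = data.squeeze(" ")   # option 1: capture the returned string
reassigned = data.dup

other = "a   b"
other.squeeze!(" ")        # option 2: modify the string in place

puts unchanged
puts reassigned
puts other
```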


Hope that helps.

Regards,
Craig


W_James · 7 February 2009 21:05


puts IO.readlines("data2").map{|line|
line.sub( /<(h\d)>/, '\1. ' ).sub( /<\/h\d>/, "").
squeeze " " }

--- output ---

h1. Library catalog listing x

h3. Library catalog listing x

h2. Library catalog listing x

p(subtitle). A complete listing of all material in the Library

David_A_Black1 · 7 February 2009 13:49

Hi --


Just for fun, here's another way to write the method:

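def main
  data = File.read("tom.txt")
  # <h1>..<h6> become "h1. ".."h6. "; \\1 inserts the captured tag name
  data.gsub!(/<(h[1-6])>/, "\\1. ")
  # drop the closing tags
  data.gsub!(/<\/h\d>/, "")
  # collapse runs of spaces in place
  data.squeeze!(' ')

  open("tom.out", "w") {|f| f.write(data) }
end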

I think that does the same thing. Tweak to taste.
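To check a rewrite like this without touching any files, the same three steps can be run on an in-memory string. A minimal sketch: the `convert` helper and the sample text are hypothetical, and the doubled space in the sample is an assumption about what the real input contains.

```ruby
# Hypothetical helper: the same three steps, but non-destructive and file-free.
def convert(text)
  text
    .gsub(/<(h[1-6])>/, '\1. ')  # opening heading tag -> "hN. "
    .gsub(/<\/h\d>/, "")         # closing heading tag -> removed
    .squeeze(" ")                # runs of spaces -> one space
end

# Assumed sample input, modelled on the input file in the first post.
sample = "<h1>Library catalog listing </h1>x\n\np(subtitle). A complete  listing\n"
result = convert(sample)
puts result
```

Because `squeeze(" ")` is given an argument, only spaces are collapsed; the blank line between paragraphs survives.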

David


--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Coming in 2009: The Well-Grounded Rubyist (http://manning.com/black2)

http://www.wishsight.com => Independent, social wishlist management!

W_James · 7 February 2009 21:09

William James wrote:


puts IO.readlines("data2").map{|line|
line.sub( /<(h\d)>/, '\1. ' ).sub( /<\/h\d>/, "").
squeeze " " }

--- output ---

h1. Library catalog listing x

h3. Library catalog listing x

h2. Library catalog listing x

p(subtitle). A complete listing of all material in the Library

puts IO.read("data2").gsub( /<(h\d)>/, '\1. ' ).gsub( /<\/h\d>/, "").
squeeze " "

Tom_Cloyd2 · 7 February 2009 20:50


That's beautifully economical, and reveals a far better grasp of regex than I was able to attain last night. However, I'm having trouble with this line:

data.gsub!(/<(h[1-6])>/, "\\1. ")

It certainly works, but I don't grasp the "\\1. " part. I haven't yet found anything that sheds light on this magic. How does it retain the 'h' and whatever digit follows it? It looks somehow like "\\" == retain matched alpha, and the "1" does the same for matched digits, but I really haven't a clue. Can you elucidate just a bit?

Thanks!

Tom


David_A_Black1 · 7 February 2009 20:53

Hi --


The \\1, \\2, etc. in the replacement string are pegged to the
parenthetical captures. "\\1. " means: the first capture (which is h
plus a digit), a period, and a space.

They work in single-quoted strings too, but there they're just \1, \2,
etc. There's some explanation in the ri docs for String#gsub.
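A small sketch of the quoting rules, on a made-up sample line. In a double-quoted string the backslash has to be doubled; a bare "\1" in double quotes is an octal escape for the character with code 1, so gsub never sees a backslash at all:

```ruby
line = "<h2>Library catalog listing </h2>x"

double = line.gsub(/<(h[1-6])>/, "\\1. ")  # "\\1" is the two characters \ and 1
single = line.gsub(/<(h[1-6])>/, '\1. ')   # '\1' is the same two characters
octal  = line.gsub(/<(h[1-6])>/, "\1. ")   # "\1" is the single character with code 1

puts double
puts single
p octal   # inspect form, so the control character is visible
```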

David


Jan-Erik_R · 7 February 2009 20:54


Ah... regex! It's easy once you know it =D
The (...) in the regex defines a group.
Here the group contains the 'h' followed by one of the digits 1, 2, 3, 4, 5 or 6.
In the second parameter, \1 (written with a double backslash because of double-quote escaping) refers to whatever the group /h[1-6]/ matched.
That's it, nothing magic anymore.


Robert_K1 · 7 February 2009 20:59

The keyword is "capturing groups". Parentheses in the regexp denote groups of characters which can be referenced later via their numeric index, as you have seen. You can even use them inside the pattern itself to match repetitions:

/(fo+)\1/ =~ s # will match "fofo", "foofoo", "fooofooo" etc.
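A runnable sketch of that backreference, with a few test strings chosen for illustration. Note that the pattern is unanchored, so it also matches inside longer strings; the anchored variant `exact` is an addition for comparison, not part of the example above:

```ruby
repeated = /(fo+)\1/        # group 1, immediately followed by the same text again
exact    = /\A(fo+)\1\z/    # same idea, but the whole string must be the repetition

%w[fofo foofoo fooofooo fofoo foo].each do |s|
  puts "#{s}: repeated=#{!!(s =~ repeated)} exact=#{!!(s =~ exact)}"
end
```

"fofoo" matches the unanchored pattern because its first four characters are "fofo", while "foo" matches neither since there is nothing left to repeat.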

Cheers

robert


Tom_Cloyd2 · 7 February 2009 21:48


David, badboy, Robert - thanks to you all for the very clear explanations. I really couldn't find info about this (yet). It IS clear, once the explanation's in hand. I have to say that regex's becoming rather fun, now that I'm getting a little control of it.

I continue to be amazed at the generosity of this list in helping the real amateurs here move things along. We get that AND we get to listen in on all sorts of amazing and mysterious discussions of higher order magic. Pretty cool.

t.


Tom_Cloyd2 · 7 February 2009 21:53


I should have added this, as it was puzzling me and I just now "got it" (and no one mentioned it), for those who might be following along or will come after: the "\1" in the replacement string isn't part of the regex itself. That's why I could find nothing about it in my regex sources! It's a String#gsub() convention, and is documented there.

OK...all darkness is vanquished. For now. (!) ~t.


MOUBAMBA_Laury_karin · 8 February 2009 09:20

How do I unsubscribe from this email list?


Robert_K1 · 8 February 2009 12:44

If you'd like to dig deeper into the matter, I can recommend Mastering Regular Expressions, 3rd Edition: http://oreilly.com/catalog/9780596528126/index.html

It covers functionality, different regexp dialects and performance considerations thoroughly. It might be a bit difficult to read if you do not have a full CS background, but IMHO Jeff Friedl manages to keep language theory to a minimum without sacrificing precision.

And one more Ruby-specific hint: the method String#[] accepts regular expression arguments, so you can do

# fetch the whole match
ip = input[/\d{1,3}(\.\d{1,3}){3}/]

# fetch group 1
name = input[/name=(\S+)/, 1]
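For anyone who wants to try this out, here is a self-contained sketch; the contents of `input` are invented for illustration:

```ruby
# Made-up input line containing an IP address and a name=value pair.
input = "host=10.0.0.42 name=alice role=admin"

ip   = input[/\d{1,3}(\.\d{1,3}){3}/]  # whole match
name = input[/name=(\S+)/, 1]          # text captured by group 1
none = input[/missing=(\S+)/, 1]       # no match, so the result is nil

p ip
p name
p none
```

When the pattern does not match, String#[] returns nil rather than raising, so the result can be tested directly in a condition.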

Cheers

robert


Jesus_Gabriel_y_Gala · 8 February 2009 18:05

If you want an introductory tutorial about regular expressions, you
can check here:
http://www.regular-expressions.info

Jesus.


Tom_Cloyd2 · 9 February 2009 06:32

Jesús Gabriel y Galán wrote:


If you want an introductory tutorial about regular expressions, you
can check here:
http://www.regular-expressions.info

Jesus.

Robert, Jesus,
Thanks to you both. Great stuff. I'll be digging into it tonight.

t.

