Remove commas from string

I have the following string:

s = "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056
turbofans, 56,000 lbs."

I want to remove the comma only from the numbers (8,357 miles and 56,000
lbs) separating the thousands. I want the string to read as follows:

"B747-400, 8357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans,
56000 lbs."

I thought the following regex would do the trick because it
successfully isolated the right commas on rubular.com:

s.gsub(/\d+(,)\d+/, "")

It turns out that my regex removes the entire number, not just the
comma.

Am I wrong in saying that my regex searches for one or more digits
surrounding a comma and replaces just the comma with ""?

Thank you.

--
Posted via http://www.ruby-forum.com/.

Here is a simple one (note that it removes every comma in the string, not only the thousands separators):
new_string = s.gsub(",", "")
p new_string

Hi Jason

There is an example of how to do this in the gsub documentation:

irb(main):007:0> s = "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs"
=> "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs"
irb(main):008:0> s.gsub(/(\d+),(\d+)/,'\1\2')
=> "B747-400, 8357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56000 lbs"

The regexp matches a group of digits, a comma, and another group of digits.
The parentheses indicate that each group should be captured (remembered).
In the replacement, \1 substitutes the first captured group and \2 the second. Note that you need single quotes around the replacement string; inside double quotes \1 is interpreted as octal 001 and \2 as octal 002, so there you would have to write "\\1\\2" instead.
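
A small sketch of the quoting point, using the string from the original post. The variable names (single, escaped, broken) are only for illustration:

```ruby
s = "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs."

# Single quotes: \1 and \2 reach gsub untouched and act as backreferences.
single = s.gsub(/(\d+),(\d+)/, '\1\2')

# Double quotes also work, as long as each backslash is escaped.
escaped = s.gsub(/(\d+),(\d+)/, "\\1\\2")

# Unescaped double quotes: "\1\2" is a two-character string of control
# characters (codes 1 and 2), so the digits are lost.
broken = s.gsub(/(\d+),(\d+)/, "\1\2")

puts single
puts single == escaped
p broken
```

The last line uses p rather than puts so that the control characters show up as escapes instead of being printed invisibly.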

Hope this helps

Steve

Jason Lillywhite wrote:

Am I wrong in saying that my regex searches for 1 or more numbers
surrounding a comma and replaces just the comma with ""?

Yes, you are wrong:

s = "|yes|"
puts s.gsub(/y(e)s/, "")

--output:--
||

gsub() replaces the whole match with the specified replacement; it does
not pick out a parenthesized group in the regex and replace only that
group. However, there is a block form of gsub:

s = "yes, 1,234, yes, 4,567"

result = s.gsub(/(\d),(\d)/) do |match|
  "#{$1}#{$2}"
end

puts result

--output:--
yes, 1234, yes, 4567

Inside the block, $1, $2, $3, etc. refer to the matches for each
parenthesized group in the regex. The return value of the block is used
as the replacement.
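
To make the difference between the block argument and the group variables visible, here is a minimal sketch built on the same example. The seen array is only an illustrative helper for recording what each call receives:

```ruby
s = "yes, 1,234, yes, 4,567"

seen = []
result = s.gsub(/(\d),(\d)/) do |match|
  # match is the whole matched text; $1 and $2 are the captured digits.
  seen << [match, $1, $2]
  # The block's value replaces the whole match, comma included.
  "#{$1}#{$2}"
end

p seen       # [["1,2", "1", "2"], ["4,5", "4", "5"]]
puts result  # yes, 1234, yes, 4567
```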

Ruby 1.9 supports look-behind in regular expressions (Ruby 1.8
only supports look-ahead):

$ irb1.9

irb(main):001:0> s = "B747-400, 8,357 miles, 561 mph, 56,000 lbs."
=> "B747-400, 8,357 miles, 561 mph, 56,000 lbs."

irb(main):002:0> s.gsub(/(?<=\d),(?=\d)/, '')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."
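
A short sketch of why the empty replacement is enough here: look-behind and look-ahead only assert what surrounds the comma and are not part of the match, so the match itself is just the comma. The names commas, numbers and cleaned are illustrative:

```ruby
s = "B747-400, 8,357 miles, 561 mph, 56,000 lbs."

# With look-arounds, each match is the comma alone.
commas = s.scan(/(?<=\d),(?=\d)/)
p commas    # [",", ","]

# Without them, the digits are part of the match and would be replaced too.
numbers = s.scan(/\d+,\d+/)
p numbers   # ["8,357", "56,000"]

cleaned = s.gsub(/(?<=\d),(?=\d)/, '')
puts cleaned
```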

--
Lars Haugseth

7stud -- wrote:

Inside the block, $1, $2, $3, etc. refer to the matches for each
parenthesized group in the regex. The return value of the block is used
as the replacement.

But note that once again, the entire match is replaced by the return
value of the block.

I'd rather do this to be a bit more robust:

irb(main):003:0> s.gsub(/(?<=\d),(?=\d{3})/, '')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."
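
One case where the two patterns differ, as a sketch. The input string is a made-up example (not from the original post) containing a comma-separated list of single digits next to a real thousands separator:

```ruby
# Hypothetical input: "1,2,3" is a list, "8,357" is a number.
s = "rows 1,2,3 and 8,357 miles"

# Any comma between two digits is removed, so the list is mangled.
loose = s.gsub(/(?<=\d),(?=\d)/, '')
puts loose    # rows 123 and 8357 miles

# Requiring three digits after the comma leaves the list alone.
strict = s.gsub(/(?<=\d),(?=\d{3})/, '')
puts strict   # rows 1,2,3 and 8357 miles
```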

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Why so complex? Perhaps:

>> s.gsub(/\b(\d+),(\d+)\b/, '\1\2')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."

Is there a corner case I'm missing here?

···

On Aug 5, 2009, at 3:55 AM, Robert Klemme wrote:

2009/8/5 Lars Haugseth <njus@larshaugseth.com>:

* Jason Lillywhite <jason.lillywhite@gmail.com> wrote:

Ruby has the neat ability to pass a block to gsub. This can be a more
versatile solution than using backreferences. It also allows Jason to
use his original, straightforward regex. The matched string is passed
to the block, and whatever the block evaluates to is used as the
replacement string. Check it out:

irb(main):012:0> s
=> "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs."
irb(main):013:0> s.gsub(/\d+,\d+/) { |subs| subs.gsub(',', '') }
=> "B747-400, 8357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56000 lbs."

s.ross wrote:

Why so complex? Perhaps:

>> s.gsub(/\b(\d+),(\d+)\b/, '\1\2')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."

Is there a corner case I'm missing here?

This would turn 2,344,667 into 2344,667 (it leaves the second comma in place). (Why the "\b"s and not just /(\d),(\d{3})/ ?)

To clean up sums like those handed out as gifts to banks these days (like 1,000,000,000,000) in 1.8 you need something like

  s.gsub(/\d(,\d{3})+/) { |mo| mo.gsub(',', '') }
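
A sketch comparing the two approaches on numbers with more than one separator. The test string is made up for illustration, and pairwise and whole are illustrative names:

```ruby
s = "2,344,667 and 1,000,000,000,000"

# Pairwise pattern: each match consumes the digits on both sides of one
# comma, so the comma right after a match is never seen as following a digit.
pairwise = s.gsub(/\b(\d+),(\d+)\b/, '\1\2')
puts pairwise

# One match covers the whole run of ",ddd" groups; the block strips its commas.
whole = s.gsub(/\d(,\d{3})+/) { |mo| mo.gsub(',', '') }
puts whole    # 2344667 and 1000000000000
```

The first result still contains commas (2,344,667 comes out as 2344,667), which is the corner case described above.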

R.

Steve Ross wrote:

Why so complex? Perhaps:

>> s.gsub(/\b(\d+),(\d+)\b/, '\1\2')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."

Is there a corner case I'm missing here?

So I think I would like to try a non-block option but this code above
misses the second comma if I have say 12,134,650 lbs instead of 56,000
lbs in my string.

It looks like I need Ruby 1.9 to do this: s.gsub(/(?<=\d),(?=\d{3})/,
'')

If I have not yet installed Ruby 1.9, what would be a good non-block
regex that is more robust?

Ruby has the neat ability to pass a block to gsub. This can be a more
versatile solution than using backreferences.

It isn't needed though in this case. Please also note that the block
form is usually slower.

The block form is most appropriate if you need to calculate each
replacement individually.
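
For anyone who wants to check the speed claim on their own setup, here is a rough sketch using the standard Benchmark module. The input size and repetition count are arbitrary assumptions, and timings will vary between Ruby versions and machines:

```ruby
require 'benchmark'

# Arbitrary test input: the short sample string repeated many times.
s = "B747-400, 8,357 miles, 561 mph, 56,000 lbs. " * 1_000
n = 50

from_string = s.gsub(/(\d+),(\d+)/, '\1\2')
from_block  = s.gsub(/\d+,\d+/) { |m| m.gsub(',', '') }
same_result = (from_string == from_block)

with_string = Benchmark.realtime do
  n.times { s.gsub(/(\d+),(\d+)/, '\1\2') }
end

with_block = Benchmark.realtime do
  n.times { s.gsub(/\d+,\d+/) { |m| m.gsub(',', '') } }
end

puts "same result:        #{same_result}"
puts format("replacement string: %.4fs", with_string)
puts format("block:              %.4fs", with_block)
```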

Frankly, I'd rather use any of the other non-block solutions than the
one with a block. My 0.02EUR.

Kind regards

robert

···

2009/8/6 Nick Brown <nick@nick-brown.com>:

s.gsub(/(\d),(?=\d{3})/,'\\1')
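
A quick sketch of how this behaves on the 12,134,650 case from the question. The surrounding string is made up for illustration. Only one digit and the comma are consumed per match, and the look-ahead (which Ruby 1.8 supports) leaves the following digits available for the next match:

```ruby
s = "B747-400, 12,134,650 lbs, 8,357 miles"

# Capture the digit before the comma, put it back, and drop the comma.
result = s.gsub(/(\d),(?=\d{3})/, '\1')
puts result   # B747-400, 12134650 lbs, 8357 miles

# On Ruby 1.9+ the look-behind version gives the same string.
puts result == s.gsub(/(?<=\d),(?=\d{3})/, '')
```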

Cheers

robert

···

2009/8/6 Jason Lillywhite <jason.lillywhite@gmail.com>:

Robert Klemme wrote:

s.gsub(/(\d),(?=\d{3})/,'\\1')

Thank you very much Robert.

And thanks to everyone else. This has been a good learning experience
for me.
