Gsub and gsub! are inconsistent

aurelianito · 8 November 2005 19:52

Hi all!

I've been trying to optimize the code

  a_string.gsub(/pattern_1/, "REPLACE_1").gsub(/pattern_2/,
  "REPLACE_2").gsub(/pattern_3/, "REPLACE_3")

by replacing it with

  a_string.gsub(/pattern_1/, "REPLACE_1").gsub!(/pattern_2/,
  "REPLACE_2").gsub!(/pattern_3/, "REPLACE_3")

(note the '!' on the second and third gsub).

IMHO, these two blocks of pseudocode should behave the same way, but
if the second pattern doesn't match, gsub! returns nil in the second
version, and the next call in the chain raises an exception.
Why does the destructive gsub behave differently?
Do you think that the current behaviour is the right behaviour? Why?
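
A minimal sketch of the behaviour described above. The input string and patterns are placeholders chosen for illustration, since the post only gives pseudocode:

```ruby
# Placeholder input; the original post does not show real data.
text = "alpha beta gamma"

# Non-destructive chain: every gsub returns a String, even when nothing matches.
safe = text.gsub(/alpha/, "A").gsub(/no_match/, "B").gsub(/gamma/, "G")
puts safe

# Mixed chain: gsub! returns nil when it performs no substitution,
# so the following gsub! is sent to nil instead of a String.
chain_error = nil
begin
  text.gsub(/alpha/, "A").gsub!(/no_match/, "B").gsub!(/gamma/, "G")
rescue NoMethodError => e
  chain_error = e
  puts "chain broke: #{e.class}"
end
```

The first chain completes because the non-matching call simply hands back an unchanged copy; the second one fails only when some pattern in the middle of the chain does not match.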

Please comment,
Aureliano.

Anonymous_Coward · 8 November 2005 20:09

aurelianito wrote:

> Hi all!
>
> I've been trying to optimize the code
> a_string.gsub(/pattern_1/, "REPLACE_1").gsub(/pattern_2/,
> "REPLACE_2").gsub(/pattern_3/, "REPLACE_3")
>
> with a_string.gsub(/pattern_1/, "REPLACE_1").gsub!(/pattern_2/,
> "REPLACE_2").gsub!(/pattern_3/, "REPLACE_3") (note the '!' on the second
> and third gsub).
>
> IMHO, these two blocks of pseudocode should behave the same way, but
> if the second pattern doesn't match, gsub! returns nil in the second
> version, and the next call in the chain raises an exception.
> Why does the destructive gsub behave differently?
> Do you think that the current behaviour is the right behaviour? Why?

There have been many discussions about method chaining. Some think it
would be good for bang-methods to always return self, some think that
nil should silently consume those method calls and the rest think that
you should not be chaining methods in the first place because of all
the problems it would mask and think of the children!

Yeah, it would be sensible for it to return self.

> Please comment,
> Aureliano.

E

Eric_Hodel1 · 8 November 2005 20:59

On Nov 8, 2005, at 11:52 AM, aurelianito wrote:

> Hi all!
>
> I've been trying to optimize the code
> a_string.gsub(/pattern_1/, "REPLACE_1").gsub(/pattern_2/,
> "REPLACE_2").gsub(/pattern_3/, "REPLACE_3")

So you've profiled your code and determined that gsub is your slow point?

If so, you've also checked that regex matching is not slowing you down, but the creation of a new string and the garbage collection of the old string is?

> IMHO, these two blocks of pseudocode should behave the same way, but
> if the second pattern doesn't match, gsub! returns nil in the second
> version, and the next call in the chain raises an exception.
> Why does the destructive gsub behave differently?

$ ri String#gsub!
----------------------------------------------------------- String#gsub!
     str.gsub!(pattern, replacement)        => str or nil
     str.gsub!(pattern) {|match| block }    => str or nil
------------------------------------------------------------------------
     Performs the substitutions of +String#gsub+ in place, returning
     _str_, or +nil+ if no substitutions were performed.

> Do you think that the current behaviour is the right behaviour? Why?

Yes.

Bang methods can change things in places you don't expect.

Bang methods should give an indication that something was changed if something was changed.
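
One way to picture "places you don't expect" is aliasing: a second variable that refers to the same String sees the in-place change. This is only an illustrative sketch; the variable names and sample text are made up:

```ruby
name  = "foo bar".dup   # dup gives a mutable String
other = name            # second reference to the same object

# gsub builds a new String and leaves the receiver alone.
copy = name.gsub(/o/, "0")
before = other.dup      # snapshot: still "foo bar" at this point

# gsub! rewrites the receiver, so every reference observes the change.
result = name.gsub!(/o/, "0")

puts copy     # the modified copy
puts other    # the shared object, now modified too
```

Because the receiver itself is returned on success, `result`, `name` and `other` all refer to one object after the call.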

--
Eric Hodel - drbrain@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04

Robert · 8 November 2005 22:42

> Hi all!
>
> I've been trying to optimize the code
> a_string.gsub(/pattern_1/, "REPLACE_1").gsub(/pattern_2/,
> "REPLACE_2").gsub(/pattern_3/, "REPLACE_3")
>
> with a_string.gsub(/pattern_1/, "REPLACE_1").gsub!(/pattern_2/,
> "REPLACE_2").gsub!(/pattern_3/, "REPLACE_3") (note the '!' on the
> second and third gsub).
>
> IMHO, these two blocks of pseudocode should behave the same way, but
> if the second pattern doesn't match, gsub! returns nil in the second
> version, and the next call in the chain raises an exception.
> Why does the destructive gsub behave differently?

To be able to determine whether something was changed:

if s.gsub!(...)
  puts "oops, changed!"
end
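
A small runnable variant of the same idea, assuming a list of lines and a tab-expanding pattern that are purely illustrative:

```ruby
# Hypothetical sample data; each element is a mutable String.
lines = ["a\tb".dup, "plain".dup, "\tindented".dup]

# gsub! returns the String when it substituted something and nil otherwise,
# so it can act directly as the predicate for select.
changed = lines.select { |line| line.gsub!(/\t/, "  ") }

puts "#{changed.size} of #{lines.size} lines were modified"
```

The nil return is what makes this one-pass "modify and report" style possible without comparing each string to a saved copy.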

> Do you think that the current behaviour is the right behaviour? Why?

If for no other reason, then at least because of existing code. But there are other reasons (see above).

Btw, did you consider changing your code altogether? Depending on your patterns and replacements, there are other options possible:

s.gsub!(/pat1|pat2/) {|m| replacements[m]}

s.gsub!(/(pat1)|(pat2)/) do
  # The block argument is the matched String, so test the capture groups via $1 and $2.
  case
    when $1; "re1"
    when $2; "re2"
    else raise "Unexpected"
  end
end
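
Both options can be sketched in runnable form. The replacement table, patterns and sample strings below are assumptions for illustration, not taken from the original code:

```ruby
# Hypothetical replacement table.
replacements = { "red" => "blue", "cat" => "dog" }
pattern = Regexp.union(replacements.keys)   # matches "red" or "cat"

# Option 1: one pass, looking up each matched text in the table.
s = "red cat, red hat".dup
s.gsub!(pattern) { |m| replacements[m] }

# Option 2: one pass, deciding by which capture group matched.
t = "red cat".dup
t.gsub!(/(red)|(cat)/) do
  if $1 then "blue"
  elsif $2 then "dog"
  else raise "Unexpected"
  end
end

# Ruby 1.9 and later also accept the Hash directly as the second argument.
u = "red cat".gsub(pattern, replacements)

puts s, t, u
```

Note that a single pass is not strictly equivalent to a chain of gsub calls: in a chain, a later pattern can match text produced by an earlier replacement, while here each part of the input is examined once.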

Kind regards

robert

aurelianito <aurelianocalvo@yahoo.com.ar> wrote:

aurelianito · 9 November 2005 02:27

Hi!

Thanks, all, for your responses.
What I can see is that there is no consensus on how the bang methods
should behave. Maybe bang methods that always return self could be
made available by an external library (there is a trivial
implementation for a bang method that always returns self :D). Is it
already in the facets library (right there with the "message eating"
nil)?
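
A rough sketch of what such a trivial implementation might look like without redefining gsub! itself. The module name `ChainableBang` and the method `gsub_self!` are hypothetical and are not claimed to exist in Ruby or in Facets:

```ruby
# Hypothetical opt-in module: adds a separate method instead of changing gsub!.
module ChainableBang
  def gsub_self!(*args, &block)
    gsub!(*args, &block)  # perform the in-place substitution, ignore nil
    self                  # always hand back the receiver for chaining
  end
end

s = "alpha beta gamma".dup.extend(ChainableBang)
s.gsub_self!(/alpha/, "A").gsub_self!(/no_match/, "B").gsub_self!(/gamma/, "G")
puts s

# Object#tap gives a similar effect with no new methods at all.
t = "alpha beta gamma".dup
t.tap { |x| x.gsub!(/alpha/, "A") }
 .tap { |x| x.gsub!(/no_match/, "B") }
 .tap { |x| x.gsub!(/gamma/, "G") }
puts t
```

Using a new method name, or tap, keeps the documented return value of gsub! intact for any other code that depends on it.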

Thank you all for your feedback,
Aureliano.

Eric_Hodel1 · 9 November 2005 02:31

You realize you may break the Ruby standard library by doing that?

On Nov 8, 2005, at 6:27 PM, aurelianito wrote:

> What I can see is that there is no consensus on how the bang methods
> should behave. Maybe bang methods that always return self could be
> made available by an external library (there is a trivial
> implementation for a bang method that always returns self :D). Is it
> already in the facets library (right there with the "message eating"
> nil)?
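
The risk can be illustrated without touching String itself. In this sketch, `SelfReturningString` and `strip_tabs!` are hypothetical: the subclass stands in for a patched core class, and the method stands in for library code that relies on the documented nil return:

```ruby
# Hypothetical stand-in for a String whose gsub! always returns self.
class SelfReturningString < String
  def gsub!(*args, &block)
    super
    self
  end
end

# Hypothetical library-style code that trusts gsub! to return nil on no change.
def strip_tabs!(line)
  line.gsub!(/\t/, " ") ? :changed : :unchanged
end

plain   = strip_tabs!(String.new("no tabs here"))
patched = strip_tabs!(SelfReturningString.new("no tabs here"))

puts plain     # correct answer for an untouched line
puts patched   # the caller is now told the line changed
```

With a global patch, every such caller in the standard library or in gems would silently take the "changed" branch.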

--
Eric Hodel - drbrain@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04

aurelianito · 9 November 2005 12:36

>> What I can see is that there is no consensus on how the bang methods
>> should behave. Maybe bang methods that always return self could be
>> made available by an external library (there is a trivial
>> implementation for a bang method that always returns self :D). Is it
>> already in the facets library (right there with the "message eating"
>> nil)?
>
> You realize you may break the Ruby standard library by doing that?

Oops!
You are right.

I see three choices:
1 - Use the caller method to change the bang methods to behave
differently depending on the caller.
2 - Use the rubycon tests to check how the standard library works with
the modified methods.
3 - Leave it as is, and complain that I don't like it.

Option number one is really nasty.
Option number two is too much work.
Option number three seems to be the right choice.

Thanks,
Aureliano.

David_A_Black3 · 9 November 2005 13:27

Hi --

>>> What I can see is that there is no consensus on how the bang methods
>>> should behave. Maybe bang methods that always return self could be
>>> made available by an external library (there is a trivial
>>> implementation for a bang method that always returns self :D). Is it
>>> already in the facets library (right there with the "message eating"
>>> nil)?
>>
>> You realize you may break the Ruby standard library by doing that?
>
> Oops!
> You are right.
>
> I see three choices:
> 1 - Use the caller method to change the bang methods to behave
> differently depending on the caller.
> 2 - Use the rubycon tests to check how the standard library works with
> the modified methods.
> 3 - Leave it as is, and complain that I don't like it.
>
> Option number one is really nasty.
> Option number two is too much work.

And also *incredibly* fragile. It's really not an option.

> Option number three seems to be the right choice.

4. Investigate the various libraries on RAA that let you make
temporary changes to core behavior.
5. Wait until Ruby 2.0 and the possibility of selector namespaces.

(You can of course combine either of these with complaining that you
don't like it.)
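
For readers on later Rubies: Ruby 2.0 introduced refinements (experimental in 2.0, regular from 2.1), which scope a change to core behavior to the file that asks for it, in the spirit of option 5. The following is only a sketch; the module name `ChainableGsub` is made up, and a block passed through this wrapper may not see `$1` or `$~` the way it would with the built-in method:

```ruby
# Hypothetical refinement: inside files that activate it, gsub! returns self.
module ChainableGsub
  refine String do
    def gsub!(*args, &block)
      super   # run the built-in in-place substitution
      self    # return the receiver even when nothing matched
    end
  end
end

# Active only from here to the end of this file; other files are unaffected.
using ChainableGsub

s = "alpha beta gamma".dup
s.gsub!(/alpha/, "A").gsub!(/no_match/, "B").gsub!(/gamma/, "G")
puts s

no_match = "x".dup.gsub!(/y/, "z")
puts no_match.inspect
```

Code elsewhere, including the standard library, keeps seeing the documented "str or nil" return value, which addresses the breakage concern raised earlier in the thread.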

David

On Wed, 9 Nov 2005, aurelianito wrote:

--
David A. Black
dblack@wobblini.net

Robert · 9 November 2005 13:42

David A. Black wrote:

> Hi --
>
>>>> What I can see is that there is no consensus on how the bang
>>>> methods should behave. Maybe bang methods that always return self
>>>> could be made available by an external library (there is a
>>>> trivial implementation for a bang method that always returns self
>>>> :D). Is it already in the facets library (right there with the
>>>> "message eating" nil)?
>>>
>>> You realize you may break the Ruby standard library by doing that?
>>
>> Oops!
>> You are right.
>>
>> I see three choices:
>> 1 - Use the caller method to change the bang methods to behave
>> differently depending on the caller.
>> 2 - Use the rubycon tests to check how the standard library works
>> with the modified methods.
>> 3 - Leave it as is, and complain that I don't like it.
>>
>> Option number one is really nasty.
>> Option number two is too much work.
>
> And also *incredibly* fragile. It's really not an option.
>
>> Option number three seems to be the right choice.
>
> 4. Investigate the various libraries on RAA that let you make
> temporary changes to core behavior.
> 5. Wait until Ruby 2.0 and the possibility of selector namespaces.
>
> (You can of course combine either of these with complaining that you
> don't like it.)

I'd like to add another option to the mix:

6. Don't complain and accept it the way it is.

This form of serenity can help a great deal in modern life.

Kind regards

robert
