Should String#strip take a parameter?

All,

Several times I have run across the need to strip characters other than
whitespaces from the beginning and/or end of a string. I have always
accomplished this using String#sub (or String#gsub) and an appropriate
pattern. For example:

price = field.sub(/^0+/, '')
descr = field.gsub(/(^_+)|(_+$)/, '')

It occurs to me that this would be a natural extension to the
String#strip methods (1.8 now includes String#lstrip and String#rstrip as
well as the “!” variations) by adding an optional String parameter to them.
The default value of the parameter would be nil to preserve backwards
compatibility, but if the parameter was not nil, the functions would strip
that string (or the string returned by that object’s to_str method).

Example behavior:

class String

  def new_lstrip(str = nil)
    return self.lstrip if str.nil?
    self.sub(/^#{str}+/, '')
  end

  def new_rstrip(str = nil)
    return self.rstrip if str.nil?
    self.sub(/#{str}+$/, '')
  end

  def new_strip(str = nil)
    return self.strip if str.nil?   # no argument: plain whitespace strip
    self.gsub(/(^#{str}+)|(#{str}+$)/, '')
  end

end

If there is an interest in this, I can provide patches for 1.6.8 and/or
the latest 1.8 snapshot.

- Warren Brown

Hi,

In message “Should String#strip take a parameter?” on 03/07/22, “Warren Brown” wkb@airmail.net writes:

If there is an interest in this, I can provide patches for 1.6.8 and/or
the latest 1.8 snapshot.

We need to define the behavior first. Python 2.3’s strip takes a
parameter, but it strips off all characters in the string, i.e.

a = "<><<>>>>><><>"
a.strip("<>") # => "abc"
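To make the difference concrete, here is a minimal Ruby sketch of the Python-style semantics, where the argument is treated as a set of characters rather than as a literal substring. `strip_chars` is a hypothetical helper name, not an existing String method, and the sample string is made up for illustration.

```ruby
# Hypothetical helper (not part of core Ruby): strip every leading and
# trailing character that appears anywhere in `chars`.
def strip_chars(str, chars)
  set = Regexp.union(chars.split(""))      # e.g. "<>" becomes /<|>/
  str.sub(/\A(?:#{set})+/, "").sub(/(?:#{set})+\z/, "")
end

puts strip_chars("<><<abc>><>", "<>")      # prints abc
```

Under this reading the order of the characters in the argument does not matter, which is the point of contention in the replies that follow.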

						matz.

“Warren Brown” wkb@airmail.net wrote in message
news:000201c34fd3$22b24cb0$6449c4d1@warrenpc…

All,

Several times I have run across the need to strip characters other than
whitespaces from the beginning and/or end of a string. I have always
accomplished this using String#sub (or String#gsub) and an appropriate
pattern. For example:

price = field.sub(/^0+/, '')
descr = field.gsub(/(^_+)|(_+$)/, '')

... which is not too bad, is it?

It occurs to me that this would be a natural extension to the
String#strip methods (1.8 now includes String#lstrip and String#rstrip as
well as the “!” variations) by adding an optional String parameter to them.
The default value of the parameter would be nil to preserve backwards
compatibility, but if the parameter was not nil, the functions would strip
that string (or the string returned by that object’s to_str method).

Example behavior:

class String

  def new_lstrip(str = nil)
    return self.lstrip if str.nil?
    self.sub(/^#{str}+/, '')
  end

If you want to exactly match str you should use Regexp.quote for the
argument. Otherwise I’d use a regexp directly and match it at the beginning
and at the end. Maybe like this:

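class String
  SPACES = /\s+/

  # Remove every match of rx that touches the start or the end of the string.
  def new_strip!(rx = SPACES)
    gsub!(rx) { |m| $` == "" || $' == "" ? nil : m }
  end

  def new_strip(rx = SPACES)
    s = dup; s.new_strip! rx; s
  end

  # Remove a match of rx only when nothing precedes it.
  def new_lstrip!(rx = SPACES)
    gsub!(rx) { |m| $` == "" ? nil : m }
  end

  def new_lstrip(rx = SPACES)
    s = dup; s.new_lstrip! rx; s
  end

  # Remove a match of rx only when nothing follows it.
  def new_rstrip!(rx = SPACES)
    gsub!(rx) { |m| $' == "" ? nil : m }
  end

  def new_rstrip(rx = SPACES)
    s = dup; s.new_rstrip! rx; s
  end
end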
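
The point about Regexp.quote can be illustrated with a small sketch. `lstrip_literal` is a hypothetical helper name used only for this example; it quotes the argument and groups it so that the repetition applies to the whole string, not just its last character.

```ruby
# Hypothetical helper: strip repeated copies of a literal string from the left.
def lstrip_literal(str, junk)
  str.sub(/\A(?:#{Regexp.quote(junk)})+/, "")
end

dot = "."
# Unquoted, "." is a wildcard, so /\A.+/ swallows the whole string.
p "..abc".sub(/\A#{dot}+/, "")        # prints ""
p lstrip_literal("..abc", dot)        # prints "abc"

# Ungrouped, /^ab+/ means "a" followed by one or more "b", not repeated "ab".
p "ababcd".sub(/^ab+/, "")            # prints "abcd"
p lstrip_literal("ababcd", "ab")      # prints "cd"
```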

Regards

Robert

That’s unintuitive, IMO. I think if strip takes a parameter, it should
be a regex (a string can be autoconverted to its Regexp.quote'd form), and
implicitly wrap the regex in a (...)*, e.g.

s.strip(/[<>]/) →
  s.gsub!(/^([<>])*/,'')
  s.gsub!(/([<>])*$/,'')

whereas

s.strip('<>') →
  s.gsub!(/^(<>)*/,'')
  s.gsub!(/(<>)*$/,'')

Treating a string as a character class is not convenient enough to
justify its limitingness.

(Do ruby regexes support a noncapturing, nonbacktracking form of ()*?)
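
For what it is worth, Ruby's regexp syntax has `(?:...)` for a non-capturing group and `(?>...)` for an atomic (non-backtracking) group. A minimal sketch of the proposal above, assuming a hypothetical helper called `strip_pattern` and a made-up sample string, might look like this:

```ruby
# Hypothetical helper: a Regexp is used as given, a String matches itself
# literally (via Regexp.quote). The pattern is wrapped in a non-capturing
# (?:...)* at each end of the string.
def strip_pattern(str, pattern)
  rx = pattern.is_a?(Regexp) ? pattern : Regexp.new(Regexp.quote(pattern.to_str))
  str.sub(/\A(?:#{rx})*/, "").sub(/(?:#{rx})*\z/, "")
end

p strip_pattern("<><<abc>><>", /[<>]/)   # prints "abc"
p strip_pattern("<><<abc>><>", "<>")     # prints "<<abc>>"

# The same leading strip written with an atomic group:
p "<><>abc".sub(/\A(?>(?:<>)*)/, "")     # prints "abc"
```

With a String argument only whole copies of "<>" are removed, which is the "string matches itself" behaviour argued for here.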

martin

Hi,

I have a question regarding strip. I wrote a DLL in other language, and
used the following to call the dll function in ruby:

arg = " " * 100
myfunc.Call(arg)

after that arg may contain a string such as:

"return value\000"

I used strip, but the strip does not strip away terminating null
character…

Am I doing something wrong? If I suggest that strip should also strip
the \000, is it a good idea or not?

Shannon

Xiangrong Fang xrfang@hotmail.com

All,

We need to define the behavior first. Python 2.3’s
strip takes a parameter, but it strips off all
characters in the string, i.e.

a = "<><<>>>>><><>"
a.strip("<>") # => "abc"

  					matz.

OK, how about the methods taking a Regular Expression as a parameter
(converting strings as necessary), with nil defaulting to /[\s\t\r\n\f]/?
That would still allow "xxx".strip("_"), but could also allow
"<><<>>>>><><>".strip(/[<>]/).

- Warren Brown

Martin DeMello martindemello@yahoo.com writes:

Treating a string as a character class is not convenient enough to
justify its limitingness.

Couldn’t you just support both? That is, the argument could be either a
string or a Regexp.

Hi,

Xiangrong Fang xrfang@hotmail.com writes:

I have a question regarding strip. I wrote a DLL in other language, and
used the following to call the dll function in ruby:

arg = " " * 100
myfunc.Call(arg)

after that arg may contain a string such as:

“return value\000”

I used strip, but the strip does not strip away terminating null
character…

"return value\000".unpack("A*")[0]
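
A short sketch of why this works: the "A" directive of String#unpack drops trailing spaces and NULs, so a space-filled buffer with a NUL terminator comes back clean. The related "Z" directive stops at the first NUL instead, which also handles garbage after the terminator. The buffer contents below are made up for illustration.

```ruby
# A space-padded buffer as a DLL call might leave it (illustrative data).
buf = "return value\000" + " " * 20

p buf.unpack("A*")[0]                          # prints "return value"

# "A*" only trims the tail; text after an embedded NUL is kept.
p "return value\000 garbage".unpack("A*")[0]   # prints "return value\u0000 garbage"

# "Z*" reads up to the first NUL, like a C string.
p "return value\000 garbage".unpack("Z*")[0]   # prints "return value"
```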


eban

Hi,

In message “Re: Should String#strip take a parameter?” on 03/07/22, Xiangrong Fang xrfang@hotmail.com writes:

Am I doing something wrong? If I suggest that strip should also strip
the \000, is it a good idea or not?

Maybe good. I’d like to hear from others.

						matz.

The only reason for thinking #strip might be relevant
in this situation is that you initialised your buffer
with spaces.

class String
  def sz_cut! # cut at string-zero terminator
    replace(self[/\A[^\0]*/])
  end
end

p "return value\000 garbage".sz_cut!

#-> "return value"

… works more generally.

But I can’t see a problem with treating
“\0” as whitespace in #rstrip until the
day someone wants to pass it on as a
C-string, gets an error, then remembers
to add it back ;-)

string << 0

daz

My argument was that if the parameter is regexp oriented, it makes more
sense for a string to match itself than “any of its characters”.

martin

Björn Lindström bkhl@privat.utfors.se wrote:

Martin DeMello martindemello@yahoo.com writes:

Treating a string as a character class is not convenient enough to
justify its limitingness.

Couldn’t you just support both? That is, the argument could be either a
string or a Regexp.

This would be no better than "return value\000 ".strip[0...-2] :-)

Shannon

On Tue, 22 Jul 2003 18:14:32 +0900 WATANABE Hirofumi eban@os.rim.or.jp wrote:

"return value\000".unpack("A*")[0]

Hello,

In message “Re: Should String#strip take a parameter?”
on Jul.23,2003 11:57:42, matz@ruby-lang.org wrote:

Am I doing something wrong? If I suggest that strip should also strip
the \000, is it a good idea or not?

Maybe good. I’d like to hear from others.

I think it’s good.

Regards,

U.Nakamura usa@osb.att.ne.jp

“daz” dooby@d10.karoo.co.uk wrote in message
news:vhs8osad38h515@corp.supernews.com

Hi,

I have a question regarding strip. I wrote a DLL in other language, and
used the following to call the dll function in ruby:

arg = " " * 100
myfunc.Call(arg)

after that arg may contain a string such as:

"return value\000"

I used strip, but the strip does not strip away terminating null
character…

Am I doing something wrong? If I suggest that strip should also strip
the \000, is it a good idea or not?

Shannon

The only reason for thinking #strip might be relevant
in this situation is that you initialised your buffer
with spaces.

class String
  def sz_cut! # cut at string-zero terminator
    replace(self[/\A[^\0]*/])
  end
end

p "return value\000 garbage".sz_cut!

There is a more efficient implementation of sz_cut!:

class String
  def sz_cut! # cut at string-zero terminator
    gsub!( /\0.*$/, '' )
  end
end
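
One behavioural difference between the two versions is worth keeping in mind. In the gsub! pattern, `.` does not match a newline and `$` matches at the end of a line, so text after the first line break survives; gsub! also returns nil when nothing was replaced. The sketch below uses non-mutating, hypothetically named helpers and made-up data to show this.

```ruby
# Non-mutating variants of the two approaches, for comparison only.
def cut_by_slice(str)
  str[/\A[^\0]*/]             # everything before the first NUL
end

def cut_by_gsub(str)
  str.gsub(/\0.*$/, "")       # NUL up to the end of that line
end

multi = "value\0junk\nmore junk"
p cut_by_slice(multi)                 # prints "value"
p cut_by_gsub(multi)                  # prints "value\nmore junk"
p multi.gsub(/\0.*/m, "")             # prints "value" (with /m, "." matches newlines)

# The bang form returns nil when the string contains no NUL.
p "no terminator".dup.gsub!(/\0.*$/, "")   # prints nil
```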

Cheers

robert


                      user     system      total        real
test1 change      5.235000   0.031000   5.266000 (  5.281000)
test2 change      5.828000   0.032000   5.860000 (  5.875000)
test1 no change   1.891000   0.000000   1.891000 (  1.922000)
test2 no change   7.078000   0.000000   7.078000 (  7.125000)

require 'benchmark'

REP = 500000

STR1 = "foo\0garbage"
STR1.freeze

STR2 = "foo.garbage"
STR2.freeze

# test1: gsub! from the first NUL to the end of the line
def test1(s)
  s.gsub!( /\0.*$/, '' )
end

# test2: replace the string with everything before the first NUL
def test2(s)
  s.replace(s[/\A[^\0]*/])
end

Benchmark.bm(10) do |x|
  x.report "test1 change" do
    REP.times { test1 STR1.dup }
  end

  x.report "test2 change" do
    REP.times { test2 STR1.dup }
  end

  x.report "test1 no change" do
    REP.times { test1 STR2.dup }
  end

  x.report "test2 no change" do
    REP.times { test2 STR2.dup }
  end
end

“Xiangrong Fang” xrfang@hotmail.com wrote:

Am I doing something wrong? If I suggest that strip should also strip
the \000, is it a good idea or not?

Maybe good. I’d like to hear from others.

matz.

       #####  Relates to CVS only  #####

Wed Jul 23 15:49:01 2003 Yukihiro Matsumoto matz@ruby-lang.org

  • string.c (rb_str_lstrip_bang): strip NUL along with white
    spaces. [ruby-talk:76659]

  • string.c (rb_str_rstrip_bang): ditto.


rstrip - great.

But IMvHO, changing lstrip as well is an entirely different
thing, altogether.

[CHORUS]: " """"Changing lstrip as well is an entirely different
thing."""" "

######### [OT] [OT] [OT] [OT] [OT] [OT] [OT] [OT] [OT] [OT] [OT]

(find on page: altogether) to view the transcripted line from
the “Airplane!” film.
#########

I can see the desire to be consistent, but …

strip calls strip_bang, which calls lstrip_bang & rstrip_bang
in that order. I can’t test it but would …

 p "\0abc\0def\0   ".strip

be handled usefully? The initial “\0” (C-string end) would
also be removed by the new strip (calling lstrip).
In that example, the C-string value was “” (empty).
After strip, “abc\0def” remains with the C-string
value of “abc”.

I am prepared to be flicked across the room by a coiled finger,
of course, but I think the choices now might be:

  1. rstrip removes “\0”, lstrip doesn’t. [disharmonic]
  2. feature is added elsewhere.
  3. change is backed-out and forgotten.

daz
1)3)2)

Hi,

In message “Re: Should String#strip take a parameter?” on 03/07/26, “daz” dooby@d10.karoo.co.uk writes:

I am prepared to be flicked across the room by a coiled finger,
of course, but I think the choices now might be:

  1. rstrip removes “\0”, lstrip doesn’t. [disharmonic]
  2. feature is added elsewhere.
  3. change is backed-out and forgotten.

or
4) rstrip removes sequence of “\0” at the end of string. lstrip
doesn’t.
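
A minimal sketch of how option 4 would treat the example string from the earlier message, assuming that trailing NULs may be mixed with trailing whitespace. The `proposed_*` functions are hypothetical stand-ins written with explicit patterns; they are not the string.c implementation.

```ruby
# Hypothetical stand-ins for option 4 (illustration only).
def proposed_rstrip(str)
  str.sub(/[\s\0]+\z/, "")    # trailing whitespace and NULs
end

def proposed_lstrip(str)
  str.sub(/\A\s+/, "")        # leading whitespace only, NULs are kept
end

def proposed_strip(str)
  proposed_rstrip(proposed_lstrip(str))
end

p proposed_strip("\0abc\0def\0   ")       # prints "\u0000abc\u0000def"
p proposed_strip("  return value\0\0  ")  # prints "return value"
```

The leading "\0" survives, so the C-string value of the first example stays empty.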

						matz.

Sorry, my English and my typing went out-of-sync :<}}
I did look at the change on CVSweb first.


  4) rstrip removes “\0” or sequence of “\0” from the end of string,
     lstrip doesn’t remove any “\0” or sequence of “\0” from the start
     of string (only ISSPACE() or sequence of).
  2) feature is added elsewhere.
  3) change is backed-out and forgotten.

There was never a 1), it was just a bad dream.
Now, our list is messy but you’ve coerced my votes into a sequence.

Thanks,

daz
4)3)2)

matz,

  1. rstrip removes “\0”, lstrip doesn’t. [disharmonic]
  2. feature is added elsewhere.
  3. change is backed-out and forgotten.

or
4) rstrip removes sequence of “\0” at the end of string. lstrip
doesn’t.

Since String#strip was originally intended to remove leading and
trailing whitespaces, it seems strange for it to also strip "\0"s (leading
or trailing). I have never seen a language where whitespace included “\0”,
and Ruby does not include “\0” in any other definition of whitespace.

However, if we allowed the String#strip methods to accept a regular
expression as a parameter (as has been suggested) then this becomes a
non-issue. The old (1.6) functionality can stay the same, and people that
want to strip trailing "\0"s can do this via a simple regular expression.
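
As a rough sketch of that idea, assuming a hypothetical helper named `rstrip_matching` whose default pattern is plain whitespace, the old behaviour stays the default and NUL stripping becomes opt-in:

```ruby
# Hypothetical helper: rstrip with an optional pattern. The default leaves
# NULs alone; callers who want them gone pass a pattern that includes \0.
def rstrip_matching(str, pattern = /\s/)
  str.sub(/(?:#{pattern})+\z/, "")
end

p rstrip_matching("abc  \n")                        # prints "abc"
p rstrip_matching("abc\0")                          # prints "abc\u0000"
p rstrip_matching("return value\0\0", /\0/)         # prints "return value"
p rstrip_matching("return value\0 \0 ", /[\s\0]/)   # prints "return value"
```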

This is a fairly easy change to the String#strip methods, and could be
included in the 1.8 preview6 if there are problems with preview5. I would
be happy to provide the patches.

- Warren Brown