Several times I have run across the need to strip characters other than
whitespace from the beginning and/or end of a string. I have always
accomplished this using String#sub (or String#gsub) and an appropriate
pattern. For example:
It occurs to me that this would be a natural extension to the
String#strip methods (1.8 now includes String#lstrip and String#rstrip as
well as the “!” variations) by adding an optional String parameter to them.
The default value of the parameter would be nil to preserve backwards
compatibility, but if the parameter was not nil, the functions would strip
that string (or the string returned by that object’s to_str method).
Example behavior:
class String
  # str is interpolated into the pattern as-is (it is not quoted).
  def new_lstrip(str = nil)
    return self.lstrip if str.nil?
    self.sub(/^#{str}+/, '')
  end
  def new_rstrip(str = nil)
    return self.rstrip if str.nil?
    self.sub(/#{str}+$/, '')
  end
  def new_strip(str = nil)
    return self.strip if str.nil?
    self.gsub(/(^#{str}+)|(#{str}+$)/, '')
  end
end
If there is an interest in this, I can provide patches for 1.6.8 and/or
"Warren Brown" wkb@airmail.net wrote in message
news:000201c34fd3$22b24cb0$6449c4d1@warrenpc…
If you want to exactly match str you should use Regexp.quote for the
argument. Otherwise I’d use a regexp directly and match it at the beginning
and at the end. Maybe like this:
class String
  SPACES = /\s+/
  # A match is dropped only when nothing precedes it ($` is empty)
  # or nothing follows it ($' is empty); inner matches are kept.
  def new_strip!(rx=SPACES)
    gsub!(rx) {|m| $`=="" || $'=="" ? nil : m}
  end
  def new_strip(rx=SPACES)
    s=dup; s.new_strip! rx; s
  end
  def new_lstrip!(rx=SPACES)
    gsub!(rx) {|m| $`=="" ? nil : m}
  end
  def new_lstrip(rx=SPACES)
    s=dup; s.new_lstrip! rx; s
  end
  def new_rstrip!(rx=SPACES)
    gsub!(rx) {|m| $'=="" ? nil : m}
  end
  def new_rstrip(rx=SPACES)
    s=dup; s.new_rstrip! rx; s
  end
end
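For the first option (matching str exactly), a minimal sketch might look like the following. The method name strip_literal is made up for this illustration and is not part of any Ruby release. It quotes the argument with Regexp.quote and anchors with \A and \z, so only the very start and end of the string are touched, not each line.

```ruby
class String
  # Hypothetical method, for illustration only.
  # Strips repeated literal copies of +str+ from both ends of the string.
  def strip_literal(str = nil)
    return strip if str.nil?
    unit = Regexp.quote(str.to_str)   # escape regexp metacharacters such as "."
    sub(/\A(?:#{unit})+/, "").sub(/(?:#{unit})+\z/, "")
  end
end

p "..abc..".strip_literal(".")     # "." is treated literally, not as "any character"
p "ababxyzab".strip_literal("ab")  # whole copies of "ab" are removed from each end
```

Without the quoting, "." would act as a wildcard and the first call would strip the entire string.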
That’s unintuitive, IMO. I think if strip takes a parameter, it should
be a regex (a string can be autoconverted to its Regexp.quote form), and
implicitly wrap the regex in a (…)*. e.g.
We need to define the behavior first. Python 2.3’s
strip takes a parameter, but it strips off all
characters in the string, i.e.
a = "<><<>>abc>>><><>"
a.strip("<>") # => "abc"
matz.
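To make the Python-style semantics concrete, here is a rough Ruby sketch in which the argument is treated as a set of characters rather than as one literal string. The name strip_chars is hypothetical, and this is only one possible reading of the behavior described above.

```ruby
class String
  # Hypothetical method, for illustration only.
  # Removes any leading or trailing characters that occur in +chars+.
  def strip_chars(chars)
    return dup if chars.empty?              # an empty set strips nothing
    set = "[#{Regexp.quote(chars)}]"        # build a character class from the set
    sub(/\A#{set}+/, "").sub(/#{set}+\z/, "")
  end
end

p "<<>abc><>".strip_chars("<>")   # order and repetition of "<" and ">" do not matter
p "a-b-".strip_chars("-")         # "-" is escaped, so it is not read as a range
```

Under this reading, "abab".strip_chars("ab") would leave an empty string, whereas a literal-string strip would do the same only because the text happens to be whole copies of "ab".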
OK, how about the methods taking a Regular Expression as a parameter
(converting strings as necessary), with nil defaulting to /[\s\t\r\n\f]/?
That would still allow "xxx".strip("_"), but could also allow
"<><<>>abc>>><><>".strip(/[<>]/).
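A sketch of that proposal, under a few assumptions of mine: the method is called strip_with (a made-up name), strings are converted with Regexp.quote, nil falls back to /\s/, and the pattern is implicitly repeated at each end.

```ruby
class String
  # Hypothetical method, for illustration only.
  # Accepts nil (whitespace), a Regexp, or anything responding to to_str.
  def strip_with(pattern = nil)
    rx = case pattern
         when nil    then /\s/
         when Regexp then pattern
         else Regexp.new(Regexp.quote(pattern.to_str))
         end
    # Wrap the pattern in a repeated group and anchor it to each end.
    sub(/\A(?:#{rx})+/, "").sub(/(?:#{rx})+\z/, "")
  end
end

p "__xxx__".strip_with("_")
p "<><abc><>".strip_with(/[<>]/)
p "  hi \n".strip_with
```

Passing a character class such as /[<>]/ gives the Python-like behavior, while passing a plain string keeps the literal-match behavior.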
The only reason for thinking #strip might be relevant
in this situation is that you initialised your buffer
with spaces.
class String
  def sz_cut! # cut at string-zero terminator
    replace(self[/\A[^\0]*/])
  end
end
p "return value\000 garbage".sz_cut!
#-> "return value"
… works more generally.
But I can’t see a problem with treating
“\0” as whitespace in #rstrip until the
day someone wants to pass it on as a
C-string, gets an error, then remembers
to add it back
(find on page: altogether) to view the transcripted line from
the “Airplane!” film.
#########
I can see the desire to be consistent, but …
strip calls strip_bang, which calls lstrip_bang & rstrip_bang
in that order. I can’t test it but would …
p "\0abc\0def\0 ".strip
be handled usefully? The initial “\0” (C-string end) would
also be removed by the new strip (calling lstrip).
In that example, the C-string value was “” (empty).
After strip, “abc\0def” remains with the C-string
value of “abc”.
I am prepared to be flicked across the room by a coiled finger,
of course, but I think the choices now might be:
or
4) rstrip removes sequence of “\0” at the end of string. lstrip
doesn’t.
matz.
rstrip removes “\0” or sequence of “\0” from the end of string,
lstrip doesn’t remove any “\0” or sequence of “\0” from the start
of string (only ISSPACE() or sequence of).
feature is added elsewhere.
change is backed-out and forgotten.
There was never a 1), it was just a bad dream.
Now, our list is messy but you’ve coerced my votes into a sequence.
or
4) rstrip removes sequence of “\0” at the end of string. lstrip
doesn’t.
Since String#strip was originally intended to remove leading and
trailing whitespace, it seems strange for it to also strip "\0"s (leading
or trailing). I have never seen a language where whitespace included “\0”,
and Ruby does not include “\0” in any other definition of whitespace.
However, if we allowed the String#strip methods to accept a regular
expression as a parameter (as has been suggested) then this becomes a
non-issue. The old (1.6) functionality can stay the same, and people who
want to strip trailing "\0"s can do this via a simple regular expression.
This is a fairly easy change to the String#strip methods, and could be
included in the 1.8 preview6 if there are problems with preview5. I would
be happy to provide the patches.
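As an illustration of the "simple regular expression" route, trailing NULs can already be removed today without touching String#strip. The helper name rstrip_nul is made up for this sketch.

```ruby
# Hypothetical helper, for illustration only.
# Removes a run of NUL bytes from the end of the string and nothing else.
def rstrip_nul(str)
  str.sub(/\x00+\z/, "")
end

p rstrip_nul("return value\0\0\0")   # trailing NULs are dropped
p rstrip_nul("\0abc\0")              # the leading NUL is left alone
p rstrip_nul("abc \0")               # whitespace before the NULs is kept
```

Whitespace handling stays with the existing strip methods, so the two concerns can be combined or kept apart as the caller sees fit.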