In Message-Id: <4152820A.9080408@gmx.net>
Henrik Horneber <ryco@gmx.net> writes:
What's the best way to test if a string only consists of whitespaces
and newlines?
What about this?:
string !~ /\S/
where "\S" is the complement of "\s". If your notion of whitespace
differs from "\s", you can use an appropriate character class instead,
say "[^ \n]", which matches any character other than a space or a line feed.
···
--
kjana@dm4lab.to September 23, 2004
Slow and steady wins the race.
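The check suggested above can be sketched as two small methods. The method names are hypothetical and chosen only for illustration; the two patterns are the ones from the post. Note that the empty string passes both checks, since it contains no non-whitespace character.

```ruby
# Hypothetical helper names; only the patterns come from the post above.

# True when the string contains no non-whitespace character.
def blank_by_regex?(string)
  string !~ /\S/
end

# Narrower variant: only spaces and line feeds count as whitespace.
def space_or_lf_only?(string)
  string !~ /[^ \n]/
end

p blank_by_regex?(" \t\n")    # => true  (tab is part of \s)
p blank_by_regex?(" x \n")    # => false
p space_or_lf_only?(" \n ")   # => true
p space_or_lf_only?(" \t ")   # => false (tab is outside the custom class)
```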
--
EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
PHONE :: 303.497.6469
A flower falls, even though we love it;
and a weed grows, even though we do not love it. --Dogen
I highly doubt a regex is faster than each_byte. each_byte has very
little code and is very fast (looping over the array in C and casting
the chars to fixnums), whereas the regex version has to pass through the
regex parser, get pulled back out as an object, and be handed to split,
which in turn returns a potentially huge array that you then walk over
with each. Then you've done another comparison with a regex within the
block, which I guarantee is much slower than comparing 2 Fixnums.
My initial version didn't handle \n, only spaces, so here's my updated
version that even handles tabs.
class String
  def only_ws?
    # 9 = tab, 10 = line feed, 32 = space
    each_byte { |b| return false unless [9,10,32].include?(b) }
    true
  end
end
Evan Webb // evan@fallingsnow.net
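One way to check the speed claim rather than argue it is to time both approaches on the same input. The sketch below uses the standard `benchmark` library; the helper names, the sample string, and the iteration count are assumptions chosen for illustration, and the numbers will vary with Ruby version and machine, so no result is claimed here.

```ruby
require 'benchmark'

# Hypothetical helpers, written as plain methods instead of reopening String.

# Byte loop: tab (9), line feed (10), and space (32) count as whitespace.
def only_ws_bytes?(str)
  str.each_byte { |b| return false unless b == 9 || b == 10 || b == 32 }
  true
end

# Single regexp with the same three characters, no split involved.
def only_ws_regex?(str)
  str !~ /[^\t\n ]/
end

# Whitespace-only input, so both versions have to inspect every byte.
sample = " \t\n" * 10_000
n = 100

t_bytes = Benchmark.realtime { n.times { only_ws_bytes?(sample) } }
t_regex = Benchmark.realtime { n.times { only_ws_regex?(sample) } }

puts "each_byte: #{t_bytes}"
puts "regex:     #{t_regex}"
```

Both helpers give the same answers, so whichever turns out faster on a given interpreter can be swapped in without changing behavior.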
···
On Thu, 2004-09-23 at 01:11, MiG wrote:
I think a regexp should be faster than each_byte. What about this?
class String
  def whitespace_only? str
    str.split(/\n/).each { |x|
      return false unless x =~ /^\s*$/
    }
    true
  end
end
"Mikael Brockman" <mikael@phubuh.org> wrote in message
news:87isa5i8rw.fsf@igloo.phubuh.org...
ts <decoux@moulon.inra.fr> writes:
>
> > self !~ /[\s\n]/m
>
> 1) \n is in \s with a character class, /m is useless
> 2) you are testing that it don't exist a whitespace character in the
"ts" <decoux@moulon.inra.fr> wrote in message
news:200409231451.i8NEphE08333@moulon.inra.fr...
> if s.strip.empty?
> # the string is whitespace only
svg% ruby -e 'a = " \000\000"; p "OK" if a.strip.empty?'
"OK"
svg%
svg% ruby -e 'a = " \000\000 "; p "OK" if a.strip.empty?'
svg%
Also I'd say the disadvantage of "a.strip.empty?" is that it creates a
copy of the string (=> a new instance) which is generally slower than a
simple regexp check.
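For comparison, here is a sketch of a check that does not go through String#strip at all and so does not depend on how strip treats NUL. It relies only on the fact that "\s" does not match NUL in Ruby regexps; the variable name follows the transcript above.

```ruby
a = " \000\000"

# What strip does with NUL is left to the interpreter in use,
# so no result is claimed for this line.
p a.strip.empty?

# \S matches NUL, so the regexp check reports "not whitespace only".
p a !~ /\S/          # => false
p " \t\n" !~ /\S/    # => true
```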
ahh! that's terrible - i didn't know String#strip did that! the docs say
"Returns a copy of str with leading and trailing whitespace removed."
since when is NUL whitespace!? definitely against POLS.
thanks for the pointer.
-a
i assumed you were correct - but this is surprising:
harp:~ > ruby b.rb
-
small string strip-empty:
elapsed : 0.0081329345703125
-
small string re:
elapsed : 0.005950927734375
-
small string re-precompiled:
elapsed : 0.00719404220581055
-
big string strip-empty:
elapsed : 0.263929843902588
-
big string re:
elapsed : 5.26733493804932
-
big string re-precompiled:
elapsed : 5.51002883911133
# Run the block in a forked child with GC disabled and print the elapsed wall time.
def time label
  fork do
    GC.disable
    puts "-\n#{ label }:\n"
    a = Time::now.to_f
    yield
    b = Time::now.to_f
    puts " elapsed : #{ b - a }"
  end
  Process::wait
end

s = "42"           # small string (not blank)
bs = s * 8192      # big string (not blank)
rep = %r/^\s*$/o   # regexp built once, outside the loops

time('small string strip-empty') do
  8192.times{ s.strip.empty? }
end
time('small string re') do
  8192.times{ s =~ %r/^\s*$/ }
end
time('small string re-precompiled') do
  8192.times{ s =~ rep }
end
time('big string strip-empty') do
  8192.times{ bs.strip.empty? }
end
time('big string re') do
  8192.times{ bs =~ %r/^\s*$/ }
end
time('big string re-precompiled') do
  8192.times{ bs =~ rep }
end
at least it surprised me!
regards.
-a
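One caveat when reading these numbers: in Ruby, "^" and "$" anchor to lines rather than to the whole string, so the pattern used in the benchmark matches as soon as any single line is blank. That is a different question from the one strip.empty? answers. The sketch below shows the difference using "\A" and "\z", which anchor to the whole string; the sample input is made up for illustration, and nothing is claimed about timings.

```ruby
sample = "a\n\nb"   # hypothetical input with a blank line in the middle

line_anchored   = !(sample =~ /^\s*$/).nil?    # matches the empty middle line
string_anchored = !(sample =~ /\A\s*\z/).nil?  # must cover the entire string

p line_anchored          # => true
p string_anchored        # => false
p sample.strip.empty?    # => false
```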
Since when _isn't_ NUL whitespace? Despite the fact that it is
sometimes used as a delimiter (which is true for all the other
whitespace characters as well), it has no meaning, no glyph, does not
show up when printed--it doesn't even move the cursor/printhead. How
much more "whitespace" can you get?
-- MarkusQ
···
On Thu, 2004-09-23 at 08:34, Ara.T.Howard@noaa.gov wrote:
since when is NUL whitespace!? definitely against POLS.