Return number of spaces at the beginning of a line

Jesse_B · 30 March 2010 02:41

How would I find the number of spaces at the beginning of a line before
the occurrence of the first non-space character?

Would the best method be to use a regular expression that covers all
non-space characters and get the index of the first occurrence of that?

--
Posted via http://www.ruby-forum.com/.

Jason_W1 · 30 March 2010 03:06

You could use a regex. You could also, if the data is a string or an array,
split it into a char array,

e.g.:
s = "    hello"
s = s.chars.to_a
=> [" ", " ", " ", " ", "h", "e", "l", "l", "o"]

i = 0
s.each do |c|
  if c == " "
    i = i + 1
    puts i
  end
end

and that'll count the number of spaces.
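
Note that the loop above counts every space in the string, not only the leading ones. A minimal sketch of a variant that stops at the first non-space character, using `take_while`; the method name `count_leading_spaces` is made up for illustration and is not from the thread:

```ruby
# Hypothetical helper (not from the thread): counts only the spaces
# that come before the first non-space character.
def count_leading_spaces(line)
  line.each_char.take_while { |c| c == " " }.size
end

puts count_leading_spaces("    hello")   # 4 leading spaces
puts count_leading_spaces("hi there")    # 0, the inner space is not counted
```

Because `take_while` stops at the first character that is not a space, spaces inside or at the end of the string never reach the count.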


--
jbw

Harry3 · 30 March 2010 03:18

If I understand your question correctly,

str = " hello"
p str =~ /\S/

Harry


Gennady_Bystritsky1 · 30 March 2010 04:34

s = "   abc"
s.index(%r{\S}) # => 3

s.scan(%r{^\s*}).first.size # => 3

Gennady.


Steve_Howell · 30 March 2010 04:40

I think the following expresses what you are trying to do, but I do
not know if this is the most efficient or idiomatic way to do it:

    def num_leading_spaces(s)
      prefix = (s =~ /(\s*)/)
      $1.length
    end

    puts num_leading_spaces('   hello')


Robert_K1 · 30 March 2010 07:12

> How would I find the number of spaces at the beginning of a line before
> the occurrence of the first non-space character?

spaces = str[/\A\s*/].length

> Would the best method be to use a regular expression that covers all
> non-space characters and get the index of the first occurrence of that?

I'd rather use the approach above because it always works (i.e. even
with empty strings). Your approach could be done like this:

spaces = /\S/ =~ s

as has been shown already. But this fails for empty strings: it returns nil.
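
To make the difference concrete, here is a small sketch that puts the two approaches side by side. The method names are hypothetical and the sample strings are illustrative only. The index-based version returns nil not just for the empty string but for any string with no non-whitespace character at all:

```ruby
# Slice approach: \s* can match nothing, so the result is always an Integer.
def spaces_by_slice(s)
  s[/\A\s*/].length
end

# Index approach: nil when the string has no non-whitespace character.
def spaces_by_index(s)
  s =~ /\S/
end

["", "   ", "  abc"].each do |s|
  puts "#{s.inspect}: slice=#{spaces_by_slice(s).inspect} index=#{spaces_by_index(s).inspect}"
end
```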

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Josh_Cheek · 30 March 2010 07:33

Hi, a quick appraisal of the issues with previous solutions.

This solution also counts non-leading spaces:
i = 0
" hello ".chars.to_a.each do |c|
  i += 1 if c == " "
end

This one returns nil if there is no text after the leading whitespace:
str =~ /\S/

Each of these counts non-space whitespace (i.e. " \t hello" would be 3 instead
of 1):
s.index(%r{\S})

str[/\A\s*/].length

s =~ /(\s*)/
$1.length

So here is a quick suite you can use to test the solutions, along with a
solution which passes. If this suite doesn't accurately reflect what you
were trying to do, modify it and re-post. You'll be a lot more likely to
get a solution which does explicitly what you are looking for, and maybe
find some areas that are ambiguous in your question, such as multi-line
strings and strings with mixtures of whitespace types.

require 'test/unit'

def leading_spaces( str )
  # fill this out however you like, my solution is:
  str =~ /[^ ]/ || str.length
end

class TestLeadingSpaces < Test::Unit::TestCase

  def test_one_space
    assert_equal 1 , leading_spaces(' ')
  end

  def test_empty_string
    assert_equal 0 , leading_spaces('')
  end

  def test_two_leading_spaces
    assert_equal 2 , leading_spaces('  hello')
  end

  def test_no_spaces
    assert_equal 0 , leading_spaces('hello')
  end

  def test_spaces_inside_but_not_leading
    assert_equal 0 , leading_spaces('hello there')
  end

  def test_spaces_inside_and_leading
    assert_equal 2 , leading_spaces('  hello there')
  end

  def test_trailing_spaces_not_leading
    assert_equal 0 , leading_spaces('hello ')
  end

  def test_trailing_spaces_and_inside
    assert_equal 0 , leading_spaces('hello there ')
  end

  def test_spaces_everywhere
    assert_equal 1 , leading_spaces(' hello there ')
  end

  def test_mixture_of_spaces_and_tabs
    assert_equal 1 , leading_spaces(" \t hello")
  end

  # the following are not really defined in the question, this is what I
  # think the OP is asking for
  # might also be asking for an array listing the indentations for each line?
  def multi_line_spaces
    assert_equal 6 , leading_spaces(<<-MULTI_LINE_STRING)
      this is six spaces
        this is eight
    MULTI_LINE_STRING
  end
end
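
If the multi-line reading mentioned in the comment above is what was wanted (one indentation count per line), a rough sketch could look like the following. The helper name `leading_spaces_per_line` is an assumption for illustration, and it counts spaces only, not tabs:

```ruby
# Hypothetical helper: one count per line, spaces only (tabs are not counted).
def leading_spaces_per_line(text)
  text.each_line.map { |line| line[/\A */].length }
end

text = "      this is six spaces\n        this is eight\n"
p leading_spaces_per_line(text)   # => [6, 8]
```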


Steve_Howell · 30 March 2010 04:40

Oops, still not sure whether this is ideal, but this is better than my
other version:

    def num_leading_spaces(s)
      s =~ /(\s*)/
      $1.length
    end


Josh_Cheek · 30 March 2010 07:57

Two things:
1) The method multi_line_spaces won't actually run as a test, because
Test::Unit only considers a method to be a test if its name begins with 'test',
so you should change its name to test_multi_line_spaces.

2) I thought of another test you should add: throwing wholly bad data at it.
In this case, nil:
  def test_nil
    assert_raises NoMethodError do
      leading_spaces nil
    end
  end

'The Pragmatic Programmer' says that it is better to crash early, so I
tested that it should throw an error.

You might also consider what you would like it to do in the event that you
pass it something not a string. Perhaps invoke the to_s method and then get
the length of the leading whitespace. Perhaps raise a different error.
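
The two options just mentioned could be sketched roughly as follows. Both method names are hypothetical, and the choice of ArgumentError is only one possibility for "a different error":

```ruby
# Hypothetical strict variant: reject anything that is not a String up front.
def leading_spaces_strict(str)
  raise ArgumentError, "expected a String, got #{str.class}" unless str.is_a?(String)
  str =~ /[^ ]/ || str.length
end

# Hypothetical lenient variant: coerce with to_s first, so nil counts as "".
def leading_spaces_lenient(obj)
  s = obj.to_s
  s =~ /[^ ]/ || s.length
end
```

The strict version still crashes early on bad input but with a message that names the problem; the lenient version never raises for nil and returns 0 instead.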

Robert_K1 · 30 March 2010 09:55

Oh, that's easily fixed: just replace \s with ' ', e.g. s[/\A */].length
(still my favorite). I thought general whitespace was sought after.
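
A short sketch of what the change does on the mixed space/tab example from earlier in the thread; the two method names are made up for illustration:

```ruby
# Any leading whitespace (spaces, tabs, newlines, ...).
def leading_whitespace(s)
  s[/\A\s*/].length
end

# Leading space characters only.
def leading_spaces_only(s)
  s[/\A */].length
end

s = " \t hello"
p leading_whitespace(s)    # => 3, the space, the tab and the second space
p leading_spaces_only(s)   # => 1, stops at the tab
```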

Kind regards

robert


Jesse_B · 30 March 2010 10:41

This second post with the "spaces only" fix seems to meet all the needs
of what I was looking for.

I love that this got 9 replies in the middle of the night.
Thanks everyone for your help.


Robert_K1 · 30 March 2010 13:18

What do you mean, middle of the night? It's quite sunny here and the
sun isn't even going to settle soon.

Cheers

robert


Harry3 · 30 March 2010 13:49

> > I love that this got 9 replies in the middle of the night.
> > thanks everyone for your help.
>
> What do you mean, middle of the night? It's quite sunny here and the
> sun isn't even going to settle soon.

One Bright Day In The Middle Of The Night.

Harry

Colin_Bartlett1 · 30 March 2010 15:27

Well, it's also nowhere near the middle of the night here
(n*100 kilometres west of Robert?, where 0 < n < 13?)
but it's definitely not "quite sunny"! In fact, it's raining!
(To quote from "A Song of the Weather"
  by Flanders and Swann - 1950s/1960s sort of English Tom Lehrer's:
  April brings the sweet spring showers.
  On and on for hours and hours.)

How about this? I think it's different from the other solutions(?),
it seems to work, and it doesn't create a string object(?).
(str =~ /[^ ]/) || str.length

def q( str )
  ns = str =~ /[^ ]/ || str.length # actual code
  spaces = str[/\A */].length # Robert's 2nd post
  str.inspect.ljust(10) + " #=> " + ns.inspect + " =?= " + spaces.inspect
end

""         #=> 0 =?= 0
" "        #=> 1 =?= 1
"  "       #=> 2 =?= 2
"c"        #=> 0 =?= 0
" c"       #=> 1 =?= 1
"  c"      #=> 2 =?= 2
" \t cc"   #=> 1 =?= 1


Robert_K1 · 30 March 2010 15:35

> > > This second post with the "spaces only" fix seems to meet all the needs
> > > of what I was looking for.
> > >
> > > I love that this got 9 replies in the middle of the night.
> > > thanks everyone for your help.
> >
> > What do you mean, middle of the night? It's quite sunny here and the
> > sun isn't even going to settle soon.
>
> Well, it's also nowhere near the middle of the night here
> (n*100 kilometres west of Robert?, where 0 < n < 13?)

Yeah, that sounds about right.

> but it's definitely not "quite sunny"! In fact, it's raining!
> (To quote from "A Song of the Weather"
> by Flanders and Swann - 1950s/1960s sort of English Tom Lehrer's:
> April brings the sweet spring showers.
> On and on for hours and hours.)

I think I have to hurry: rain radar indicates showers coming my way.

> How about this? I think it's different from the other solutions(?),
> it seems to work, and it doesn't create a string object(?).
> (str =~ /[^ ]/) || str.length

I'm afraid that is not correct; the String is created as a side
effect and stored in a global variable:

irb(main):001:0> str = "   foo"
=> "   foo"
irb(main):002:0> str =~ /[^ ]/
=> 3
irb(main):003:0> $&
=> "f"

And it has the disadvantage compared with str[/\A */].length that it is
likely slower because of the alternative.
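
For comparison, a small sketch of the side effect. It assumes Ruby 2.4 or later, where String#match? exists and is documented not to update the match globals. Note that match? only returns true or false, so it is not a drop-in replacement for the index-returning =~ used above. Both method names are hypothetical:

```ruby
# Returns what =~ left behind in $&.
def last_match_after_index_op(str)
  str =~ /[^ ]/
  $&   # the matched text, set as a side effect of =~
end

# Assumes Ruby 2.4+: String#match? does not update $~ or $&.
def last_match_after_match_p(str)
  str.match?(/[^ ]/)
  $~   # still nil in this method, since nothing here set it
end

p last_match_after_index_op("   foo")   # => "f"
p last_match_after_match_p("   foo")    # => nil
```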

Kind regards

robert


Colin_Bartlett1 · 30 March 2010 16:22

> I think I have to hurry: rain radar indicates showers coming my way.

From the west?

> > How about this? I think it's different from the other solutions(?),
> > it seems to work, and it doesn't create a string object(?).
> > (str =~ /[^ ]/) || str.length
>
> I'm afraid that is not correct; the String is created as a side
> effect and stored in a global variable:
>
> irb(main):001:0> str = "   foo" #=> "   foo"
> irb(main):002:0> str =~ /[^ ]/ #=> 3
> irb(main):003:0> $& #=> "f"
>
> And it has the disadvantage compared with str[/\A */].length that it is
> likely slower because of the alternative.

I'm glad that my (mathematically trained!) caution decided to add that "(?)".
I nearly didn't put it in, but had second thoughts!
I'd forgotten about the global variables.
I see that str.index( regexp ) also sets $&, which is nice to be reminded of,
first because it might come in handy, and second because, as you point out,
side effects might make what appears to be an "elegant" solution not elegant.
Well, one learns from one's mistakes!


Josh_Cheek · 31 March 2010 17:16

There was a blog post by Yehuda Katz that implied to me that these globals
weren't set until you asked for them, but upon re-reading, it isn't really
clear to me what actually happens.


Robert_K1 · 30 March 2010 17:00

> > I think I have to hurry: rain radar indicates showers coming my way.
>
> From the west?

From south west.

> Well, one learns from one's mistakes!

Absolutely! And there is still so much around to learn... I am still waiting for a bit of spare time to invest in having a closer look at PostgreSQL...

Kind regards

robert


Aldric_Giacomoni2 · 31 March 2010 18:11

Josh Cheek wrote:

> There was a blog by Yehuda Katz that implied to me that these globals
> weren't set until you asked for them,

No, they are created as soon as the regular expression is parsed. I
believe Wycats explained that somewhere else, but I can't find the link
at the moment.

SwarmShepherd · 30 March 2010 23:00

I'm impressed.

Comment: It seems amazing that even as languages get more and more
powerful, we seem to spend more time pushing strings through hoops.

I've been using '?' and '!' on the end of some methods - NOW I'm
appending things to the method names to break ties. I am converting
from sym to s, remove the /?!/ and so on and I'm rather on strange
ground exactly where my app is the most prone to serious performance
degradation. So this has been on my mind some.....

Then, following this thought I just read up on "Fancy" (shhh!) (while
reading up on new Wiki features) and was wondering whether there might be
a way to use some of that for really "core" stuff.

I'm really happy with Ruby performance so far, but it could be a
factor coming up, and it seems to be an OK idea to keep an eye out for
ways to tweak performance?

Any comments on converting Symbols to Strings, doing things on them,
then converting back - is there some better way?
