Weird problem reading fields in a tab-delimited row

David-

Thanks very much for your comments. -1 does indeed make a more promising result. I’m 1/2 way home on this problem…

A follow-up question if I may: How can I modify the if statement so that the output returns “no_string” if the string is blank? Test somehow for length = 0? Can you suggest the syntax?

Many thanks!

-Kurt


Subject: Re: Weird problem reading fields in a tab-delimited row…
From: dblack@superlink.net
Date: Sun, 24 Aug 2003 14:37:21 +0900
In-reply-to: 80053
Hi –

On Sun, 24 Aug 2003, Kurt Euler wrote:

Everyone–

A weird problem occurs in Ruby 1.6.7, 1.6.8 and 1.8 (and probably
other versions?): If I step through rows in a tab-delimited text file
using code like this:

IO.foreach("wierd.txt") { |x|
  field = x.chop.split("\t")
  if field[4] != ""
    puts "some_string"
    puts field[4]
  else
    puts "no_string"
    puts field[4]
  end
}

to read each line of a text file (wierd.txt) as in this example:

string1string2string3string4
string4string5string6string8
string9string10string11
string12string13string14string15

I get this output:

===================
no_string

no_string

some_string
nil
no_string

The question is, why does field[4] only get assigned nil in the 3rd
line but some phantom non-“nil” value in the 1st, 2nd, and 4th
lines? The general case I’ve found is that a nil value is read from
a ‘column’ if there are no strings in any ‘columns’ to the right.

Technically, I think, it’s that split stops adding values to the
array when it stops encountering anything other than delimiters on
the right:

irb(main):002:0> "aaaba".split(/a/)
=> ["", "", "", "b"]

That last “a” in the string does not result in a final “” in the return
array. And if the string has no non-delimiters:

irb(main):003:0> "aaa".split(/a/)
=> []

Unless you give a negative second argument to split, which indicates
you want the empty strings returned:

irb(main):001:0> "aaaba".split(/a/,-1)
=> ["", "", "", "b", ""]
irb(main):002:0> "aaa".split(/a/,-1)
=> ["", "", "", ""]

So, in your third line, which ends with a bunch of tabs, you don’t get
the empty strings in the return array.
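The same behaviour can be seen on a tab-delimited row. This is only a minimal sketch: the row below is made up for illustration (it is not taken from wierd.txt), and it uses chomp instead of chop to strip the newline.

```ruby
# Hypothetical row: three values followed by two trailing tabs (five columns).
row = "string9\tstring10\tstring11\t\t\n"

# Default split drops trailing empty fields, so index 4 is out of range.
short = row.chomp.split("\t")
p short.length   # 3
p short[4]       # nil

# A negative limit keeps the trailing empty fields.
full = row.chomp.split("\t", -1)
p full.length    # 5
p full[4]        # ""
```

With the negative limit every row yields the same number of fields, so a comparison against "" behaves the same way for every line.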

(I think your test output might be clouding the results a little,
since “” gets called “no_string” while nil gets called “some_string”
:-) But anyway – it’s split’s handling of delimiters on the right of
the string that’s behind what’s happening.)

David


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Hi –

David-

Thanks very much for your comments. -1 does indeed make a more
promising result. I’m 1/2 way home on this problem…

A follow-up question if I may: How can I modify the if statement so
that the output returns “no_string” if the string is blank? Test
somehow for length = 0? Can you suggest the syntax?

There are a few ways – the first is probably the nicest:

str.empty?
str == ""
str.size.zero?

and others. But I think you’re already doing what you’re asking how
to do:

  if field[4] != ""
    puts "some_string"
    puts field[4]
  else
    puts "no_string"
    puts field[4]
  end

If field[4] is empty (“”), the ‘else’ is executed and “no_string” is
printed. As I mentioned before, this might be a bit misleading, since
an empty string is different from no string (such as nil).
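One possible way to keep the two cases apart is sketched below. It assumes the row is split with a limit of -1; the method name describe_field and the "missing" label are hypothetical, chosen only for this example.

```ruby
# Hypothetical helper: report whether a column is missing, empty, or filled.
def describe_field(fields, index)
  value = fields[index]
  if value.nil?
    "missing"      # the row has no such column
  elsif value.empty?
    "no_string"    # the column exists but holds ""
  else
    "some_string"  # the column holds at least one character
  end
end

fields = "a\tb\tc\t\t".split("\t", -1)
puts describe_field(fields, 4)   # no_string
puts describe_field(fields, 1)   # some_string
puts describe_field(fields, 7)   # missing
```

Checking nil? first matters, because calling empty? on nil raises NoMethodError.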

David


On Mon, 25 Aug 2003, Kurt Euler wrote:

