Splitting help needed

Phoenix · 30 May 2008 23:15

I have a program that someone on this forum helped me fix before that
took a list of cities formatted like:

New York | Chicago | Boston |

and formatted them like this, along with a phrase added after each one:

New York
Chicago
Boston
etc.

The code looks like this:

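main = 0

full= File.open("state.txt")
phrase=[", New Jersey"]
count=0
inside = []
full.each do |line|
  first = []
  first=line.split(/\|/)
  first.each do |single|
    sub=single.strip!
    main = (sub).to_s + (phrase).to_s

    inside << main

    newfile=File.new("state2.txt", "w")
    newfile.puts inside
    newfile.close

    count+=1
  end
end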

I tried to modify it so that it would separate not on the '|' character,
but on a single space (such as in a list of cities like "New York
Chicago Boston", etc. without the '|' above) and then enter the phrase
after it.

I can't seem to get it to put each city in the file on a new line and
then add the phrase that I want after it like it did before. The only
real difference in what I want now and what I had before was the '|'
character. Can someone help me fix this?

···

--
Posted via http://www.ruby-forum.com/.

Siep_Korteling · 30 May 2008 23:41

Zoe Phoenix wrote:

I have a program that someone on this forum helped me fix before that
took a list of cities formatted like:

New York | Chicago | Boston |

and formatted them like this, along with a phrase added after each one:

New York
Chicago
Boston
etc.

The code looks like this:

main = 0

full= File.open("state.txt")
phrase=[", New Jersey"]
count=0
inside = []
full.each do |line|
  first = []
  first=line.split(/\|/)
  first.each do |single|
    sub=single.strip!
    main = (sub).to_s + (phrase).to_s

    inside << main

    newfile=File.new("state2.txt", "w")
    newfile.puts inside
    newfile.close

    count+=1
  end
end

The method split splits up a string and it will put the parts in an
array.
If you don't specify what to split on, it will split on newlines. Not
what you want. How to make clear that you want to split on " "?
Just say split(" ") .
(split will also work with a regular expression, like in your code. It's
faster and far more powerful, but completely unreadable if you are not
familiar with it. In your code split("|") works.)

I have not tried, but it looks as if your code writes a new file for
each line it reads. Each time the same file. First time 1 line, second
time 2 lines, etc. You could consider taking this bit:

     newfile=File.new("state2.txt", "w")
     newfile.puts inside
     newfile.close

out of the loop.
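
A minimal sketch of that suggestion, for anyone following along. It assumes the input is an in-memory string rather than state.txt, writes to a temporary directory, and uses a hypothetical helper name (`format_cities`) that is not from the original program. The point is only that the file is written once, after all the lines have been collected:

```ruby
require "tmpdir"

# Hypothetical helper (not from the thread): split a "|"-separated line,
# trim each piece, drop blanks, and append the phrase to each city.
def format_cities(line, phrase)
  line.split("|").map { |city| city.strip }.reject { |city| city.empty? }.map { |city| city + phrase }
end

lines = format_cities("New York | Chicago | Boston |", ", New Jersey")

Dir.mktmpdir do |dir|
  path = File.join(dir, "state2.txt")
  # Open the output once, after the loop, and write everything in one go.
  File.open(path, "w") { |f| f.puts lines }
  puts File.read(path)
end
```

Using the block form of File.open also closes the file automatically, so no explicit close is needed.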

hth,

Siep


Siep_Korteling · 30 May 2008 23:56

Zoe Phoenix wrote:

I tried to modify it so that it would separate not on the '|' character,
but on a single space (such as in a list of cities like "New York
Chicago Boston", etc. without the '|' above) and then enter the phrase
after it.

Oh. Well, you will end up with the cities New, York, Chicago, Boston.

Sorry about that.

Siep


Phoenix · 31 May 2008 00:01

Siep Korteling wrote:

Zoe Phoenix wrote:

I have a program that someone on this forum helped me fix before that
took a list of cities formatted like:

New York | Chicago | Boston |

and formatted them like this, along with a phrase added after each one:

New York
Chicago
Boston
etc.

The code looks like this:

main = 0

full= File.open("state.txt")
phrase=[", New Jersey"]
count=0
inside = []
full.each do |line|
  first = []
  first=line.split(/\|/)
  first.each do |single|
    sub=single.strip!
    main = (sub).to_s + (phrase).to_s

    inside << main

    newfile=File.new("state2.txt", "w")
    newfile.puts inside
    newfile.close

    count+=1
  end
end

The method split splits up a string and it will put the parts in an
array.
If you don't specify what to split on, it will split on newlines. Not
what you want. How to make clear that you want to split on " "?
Just say split(" ") .
(split will also work with a regular expression, like in your code. It's
faster and far more powerful, but completely unreadable if you are not
familiar with it. In your code split("|") works.)

I have not tried, but it looks as if your code writes a new file for
each line it reads. Each time the same file. First time 1 line, second
time 2 lines, etc. You could consider taking this bit:

     newfile=File.new("state2.txt", "w")
     newfile.puts inside
     newfile.close

out of the loop.

hth,

Siep

Well, it isn't writing a new file for each line... but, it's not listing
the cities like I want, all it's doing is putting the phrase on a new
line for the same number of cities there are. So, I get, say ",
Alabama" a bunch of times instead of "Montgomery, Alabama", "Birmingham,
Alabama", etc.

I want it to take this:

Alabaster Albertville Alexander City Andalusia Anniston Arab Ardmore
Athens Atmore Attalla Auburn

And turn it into this:

Alabaster, Alabama
Albertville, Alabama
Alexander City, Alabama
Andalusia, Alabama
etc.

What I'm getting when I run the program is,

, Alabama
, Alabama
, Alabama
etc.

I know I'll run into a problem with some of the cities having two words
in them, like Alexander City, but fixing those manually isn't a problem.


7stud · 31 May 2008 01:56

Siep Korteling wrote:

The method split splits up a string and it will put the parts in an
array.
If you don't specify what to split on, it will split on newlines. Not
what you want. How to make clear that you want to split on " "?
Just say split(" ") .

str = "hello world goodbye"
arr = str.split()
p arr

--output:--
["hello", "world", "goodbye"]

(split will also work with a regular expression, like in your code. It's
faster

Wrong.

and far more powerful, but completely unreadable if you are not
familiar with it. In your code split("|") works.)

Ahh, but the governing principle in the Ruby community is to make a Ruby
script look as much like a Perl script as possible--efficiency be
damned. So who in their right mind would pass up a chance to use a
hieroglyphic regex like: /\|/ in their code. That's art.


David_A_Black1 · 31 May 2008 10:08

Hi --

···

On Sat, 31 May 2008, Siep Korteling wrote:

The method split splits up a string and it will put the parts in an
array.
If you don't specify what to split on, it will split on newlines. Not
what you want. How to make clear that you want to split on " "?
Just say split(" ") .

Actually without an argument it will split on any amount of
whitespace. I think you're thinking of how strings enumerate, which is
(by default) as lines:

"abc\ndef".to_a # ["abc\n", "def"]
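
To make the distinction concrete, here is a small sketch with a made-up sample string. It uses each_line instead of String#to_a, on the assumption that a newer Ruby is in use (String#to_a was removed in Ruby 1.9):

```ruby
text = "New York\nChicago  Boston\n"

# No argument: split on runs of any whitespace (spaces, tabs, newlines).
words = text.split
p words   # => ["New", "York", "Chicago", "Boston"]

# split(" ") is the same whitespace-splitting special case.
p text.split(" ") == words   # => true

# Enumerating the string, by contrast, goes line by line and keeps the newline.
lines = text.each_line.to_a
p lines   # => ["New York\n", "Chicago  Boston\n"]
```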

David

--
Rails training from David A. Black and Ruby Power and Light:
INTRO TO RAILS June 9-12 Berlin
ADVANCING WITH RAILS June 16-19 Berlin
See http://www.rubypal.com for details and updates!

David_A_Black1 · 31 May 2008 01:05

Hi --

Siep Korteling wrote:

Zoe Phoenix wrote:

I have a program that someone on this forum helped me fix before that
took a list of cities formatted like:

New York | Chicago | Boston |

and formatted them like this, along with a phrase added after each one:

New York
Chicago
Boston
etc.

The code looks like this:

main = 0

full= File.open("state.txt")
phrase=[", New Jersey"]
count=0
inside = []
full.each do |line|
  first = []
  first=line.split(/\|/)
  first.each do |single|
    sub=single.strip!
    main = (sub).to_s + (phrase).to_s

    inside << main

    newfile=File.new("state2.txt", "w")
    newfile.puts inside
    newfile.close

    count+=1
  end
end

The method split splits up a string and it will put the parts in an
array.
If you don't specify what to split on, it will split on newlines. Not
what you want. How to make clear that you want to split on " "?
Just say split(" ") .
(split will also work with a regular expression, like in your code. It's
faster and far more powerful, but completely unreadable if you are not
familiar with it. In your code split("|") works.)

I have not tried, but it looks as if your code writes a new file for
each line it reads. Each time the same file. First time 1 line, second
time 2 lines, etc. You could consider taking this bit:

     newfile=File.new("state2.txt", "w")
     newfile.puts inside
     newfile.close

out of the loop.

hth,

Siep

Well, it isn't writing a new file for each line...

It's writing a new file for each element in the input. You're writing
the same file over and over again, a little bigger each time, instead
of gathering all the input and writing it all at once (or writing it
incrementally to a file that you keep open).
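
A rough sketch of the second option, writing incrementally to a file that stays open. The sample input and the use of a Tempfile in place of state2.txt are assumptions made so the snippet runs on its own:

```ruby
require "tempfile"

phrase = ", Alabama"
# Assumed sample input standing in for the contents of state.txt.
input = "Alabaster Albertville Andalusia\nAnniston Arab Ardmore\n"

out = Tempfile.new("state2")       # opened once, before the loop
input.each_line do |line|
  line.split.each do |city|
    out.puts city + phrase         # one write per city on the same open handle
  end
end
out.close                          # closed once, after the loop

written = File.read(out.path)
puts written
out.unlink                         # remove the temporary file
```

Either approach avoids reopening and rewriting the output for every city.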

but, it's not listing
the cities like I want, all it's doing is putting the phrase on a new
line for the same number of cities there are. So, I get, say ",
Alabama" a bunch of times instead of "Montgomery, Alabama", "Birmingham,
Alabama", etc.

I want it to take this:

Alabaster Albertville Alexander City Andalusia Anniston Arab Ardmore
Athens Atmore Attalla Auburn

And turn it into this:

Alabaster, Alabama
Albertville, Alabama
Alexander City, Alabama
Andalusia, Alabama
etc.

What I'm getting when I run the program is,

, Alabama
etc.

I know I'll run into a problem with some of the cities having two words
in them, like Alexander City, but fixing those manually isn't a problem.

Try this:

phrase = ", Alabama"
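
One likely reason for the missing city names, offered as an assumption because the thread does not spell it out: strip! returns nil when there is nothing to strip. After split(" ") the pieces have no surrounding whitespace, so sub ends up nil and nil.to_s is an empty string. A minimal sketch of the difference:

```ruby
phrase = ", Alabama"
city   = "Auburn"              # no surrounding whitespace, as after split(" ")

bang  = city.dup.strip!        # => nil, because nothing was removed
plain = city.strip             # => "Auburn", strip always returns a string

broken = bang.to_s + phrase    # nil.to_s is "", so only the phrase survives
fixed  = plain + phrase

puts broken   # prints ", Alabama"
puts fixed    # prints "Auburn, Alabama"
```

With the old "|" input every piece had spaces around it, so strip! always changed something and returned the string, which would explain why the same code worked before.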

David

···

On Sat, 31 May 2008, Zoe Phoenix wrote:


David_A_Black1 · 31 May 2008 10:08

Hi --

···

On Sat, 31 May 2008, 7stud -- wrote:

Siep Korteling wrote:

The method split splits up a string and it will put the parts in an
array.
If you don't specify what to split on, it will split on newlines. Not
what you want. How to make clear that you want to split on " "?
Just say split(" ") .

str = "hello world goodbye"
arr = str.split()
p arr

--output:--
["hello", "world", "goodbye"]

(split will also work with a regular expression, like in your code. It's
faster

Wrong.

My benchmarks suggest that it's a little faster (about 10%).

David


Dave_Bass · 31 May 2008 11:33

7stud -- wrote:

Ahh, but the governing principle in the Ruby community is to make a Ruby
script look as much like a Perl script as possible--efficiency be
damned. So who in their right mind would pass up a chance to use a
hieroglyphic regex like: /\|/ in their code. That's art.

Larry Wall (Perl supremo) has called this sort of thing LTS: leaning
toothpick syndrome. But Perl's syntax allows you to write a regexp as
m(\|), which is a bit clearer. Or you can use the quotemeta() function
to add a backslash for you.

But this is Ruby, not Perl!

Coming from 10 years of Perl coding, I wish Ruby were *less* Perl-like,
as it can get confusing, especially when you're working in both
languages at the same time.


Phoenix · 31 May 2008 03:16

Ohhh, I see... thank you so much!


David_A_Black1 · 31 May 2008 12:11

Hi --

7stud -- wrote:

Ahh, but the governing principle in the Ruby community is to make a Ruby
script look as much like a Perl script as possible--efficiency be
damned. So who in their right mind would pass up a chance to use a
hieroglyphic regex like: /\|/ in their code. That's art.

Larry Wall (Perl supremo) has called this sort of thing LTS: leaning
toothpick syndrome. But Perl's syntax allows you to write a regexp as
m(\|), which is a bit clearer. Or you can use the quotemeta() function
to add a backslash for you.

I always figured it's easiest just to learn the regex stuff and get it
over with. As a result, I can read regexes fluently as long as they
don't use /x or %r{}

But this is Ruby, not Perl!

Coming from 10 years of Perl coding, I wish Ruby were *less* Perl-like,
as it can get confusing, especially when you're working in both
languages at the same time.

I remember someone (I'm too lazy to look it up) saying long ago that
while Ruby often strikes one as Perl-like initially, it actually is
much less so than it appears at first. I think that's true. Perl also
has more of a tradition of deliberate code obfuscation, though of
course it's generally done in a playful way. Obfuscated Ruby code
always looks kind of ridiculous to me, as Ruby really militates for
a certain clarity, and there's such a tradition of love of clean code
in the community.

For the first RubyConf, we were going to have a "Code De-Obfuscation"
contest, since the idea of an obfuscation contest in Ruby seemed so
against the grain of what people loved about the language. We got as
far as getting some obfuscated contributions, ripe for de-obfuscation
(including one from Dave Thomas), but unfortunately the timing of that
conference -- October 2001 -- sapped some of our time and energy and
that contest was one of the things that fell by the wayside.

David

···

On Sat, 31 May 2008, Dave Bass wrote:


7stud · 31 May 2008 21:10

David A. Black wrote:

I always figured it's easiest just to learn the regex stuff and get it
over with.

I did--8 years ago.

My benchmarks suggest that it's a little faster (about 10%).

Ok, I see where you're coming from. The following test shows that
split() operates 45% faster without a regex:

require 'benchmark'
include Benchmark

L_COUNT = 1_000_000

bm(25) do |test|
  test.report("split:") do
    L_COUNT.times do |i|
      str = "hello world goodbye"
      arr = str.split()
    end
  end

  test.report("regex:") do
    L_COUNT.times do |i|
      str = "hello world goodbye"
      arr = str.split(/\s+/)
    end
  end
end

                               user     system      total        real
split:                     2.470000   0.010000   2.480000 (  2.494421)
regex:                     4.550000   0.030000   4.580000 (  4.576609)

And this test shows that split() is 13% faster with a regex:

require 'benchmark'
include Benchmark

L_COUNT = 1_000_000

That indicates to me that unless you use the default behavior of
split(), which splits on spaces, then split() has to spend time
converting its argument to a regex.
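
For anyone who wants to re-run the comparison, here is a compact sketch of the same kind of measurement. The smaller iteration count is an assumption to keep the run short, and the timings will differ between Ruby versions and machines, so no result is claimed here:

```ruby
require "benchmark"

n   = 100_000                  # assumption: fewer iterations than the 1_000_000 used above
str = "hello|world|goodbye"

Benchmark.bm(8) do |bm|
  bm.report("string:") { n.times { str.split("|") } }    # string separator
  bm.report("regex:")  { n.times { str.split(/\|/) } }   # regex separator
end

# Both forms produce the same array; only the time taken may differ.
same = str.split("|") == str.split(/\|/)
puts same
```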


Tristin_Davis · 31 May 2008 03:56

No need to waste space being facetious.

···

On Fri, May 30, 2008 at 10:16 PM, Zoe Phoenix <dark_sgtphoenix@hotmail.com> wrote:

Ohhh, I see... thank you so much!

Rick_DeNatale1 · 31 May 2008 14:45

Hi --

But this is Ruby, not Perl!

Coming from 10 years of Perl coding, I wish Ruby were *less* Perl-like,
as it can get confusing, especially when you're working in both
languages at the same time.

I remember someone (I'm too lazy to look it up) saying long ago that
while Ruby often strikes one as Perl-like initially, it actually is
much less so than it appears at first. I think that's true. Perl also
has more of a tradition of deliberate code obfuscation, though of
course it's generally done in a playful way. Obfuscated Ruby code
always looks kind of ridiculous to me, as Ruby really militates for
a certain clarity, and there's such a tradition of love of clean code
in the community.

Some of the features Ruby 'stole' from Perl are things I use fairly
regularly, for example the if/unless/while statement modifiers.
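
As a quick illustration of those modifiers, with made-up data:

```ruby
cities   = ["Alabaster", "", "Auburn"]
labelled = []

cities.each do |city|
  next if city.empty?                                  # "if" modifier: skip blank entries
  labelled << city + ", Alabama" unless city =~ /,/    # "unless" modifier: skip names that already have a comma
end

remaining = 3
remaining -= 1 while remaining > 0                     # "while" modifier: repeat until the test fails

p labelled    # => ["Alabaster, Alabama", "Auburn, Alabama"]
p remaining   # => 0
```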

Others such as all the special and sometimes magical global variables
and pseudo-variables I don't find useful at all, and I don't see them
discussed much here. They're probably quite helpful if you are using
Ruby as the same kind of Swiss Army knife adjunct to shell commands
that Perl often is, wrapping a 'one-liner' in one of the
various loops implied by command-line options like -i, -n,
and -p, but I never use that.

Regular expressions are really a sub-language of their own. Ruby has
some extensions like %r{} which I think make them a little clearer.
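
A small sketch of what that looks like in practice, with a made-up path. It also shows Regexp.escape, which plays roughly the role of the quotemeta() function mentioned earlier in the thread:

```ruby
path = "/usr/local/bin/ruby"

slashed = /\/usr\/local\/bin/     # "/" delimiters force a backslash before every slash
braced  = %r{/usr/local/bin}      # %r{} delimiters: the slashes need no escaping

p slashed =~ path   # => 0
p braced  =~ path   # => 0

# "|" is a regex metacharacter either way; Regexp.escape adds the backslash for you.
pipe  = Regexp.new(Regexp.escape("|"))
parts = "New York | Chicago | Boston".split(pipe).map { |s| s.strip }
p parts   # => ["New York", "Chicago", "Boston"]
```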

For the first RubyConf, we were going to have a "Code De-Obfuscation"
contest, since the idea of an obfuscation contest in Ruby seemed so
against the grain of what people loved about the language. We got as
far as getting some obfuscated contributions, ripe for de-obfuscation
(including one from Dave Thomas), but unfortunately the timing of that
conference -- October 2001 -- sapped some of our time and energy and
that contest was one of the things that fell by the wayside.

That sounds like a great idea. I wish that there were some kind of
anti-code-golf contest, where the objective was maximum
clarity/expressiveness rather than compactness. The problem is that
it's impossible to measure the former objectively, compared to 32 vs.
33 character comparisons.

···

On Sat, May 31, 2008 at 8:11 AM, David A. Black <dblack@rubypal.com> wrote:

On Sat, 31 May 2008, Dave Bass wrote:

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

7stud · 31 May 2008 21:11

7stud -- wrote:

And this test shows that split() is 13% faster with a regex:

require 'benchmark'
include Benchmark

L_COUNT = 1_000_000

bm(25) do |test|
  test.report("split:") do
    L_COUNT.times do |i|
      str = "hello|world|goodbye"
      arr = str.split("|")
    end
  end

  test.report("regex:") do
    L_COUNT.times do |i|
      str = "hello|world|goodbye"
      arr = str.split(/\|/)
    end
  end
end

Whoops. Here are the results:

                               user     system      total        real
split:                     4.620000   0.030000   4.650000 (  4.661699)
regex:                     4.000000   0.030000   4.030000 (  4.056688)


Robert_K1 · 31 May 2008 17:19

That's true. I usually try to compact code in order to increase readability and often also efficiency because I do not find shortness a value in itself. As always it's a question of balance.

Kind regards

robert

···

On 31.05.2008 16:45, Rick DeNatale wrote:

On Sat, May 31, 2008 at 8:11 AM, David A. Black <dblack@rubypal.com> wrote:

For the first RubyConf, we were going to have a "Code De-Obfuscation"
contest, since the idea of an obfuscation contest in Ruby seemed so
against the grain of what people loved about the language. We got as
far as getting some obfuscated contributions, ripe for de-obfuscation
(including one from Dave Thomas), but unfortunately the timing of that
conference -- October 2001 -- sapped some of our time and energy and
that contest was one of the things that fell by the wayside.

That sounds like a great idea. I wish that there were some kind of
anti-code-golf contest, where the objective was maximum
clarity/expressiveness rather than compactness. The problem is that
it's impossible to measure the former objectively, compared to 32 vs.
33 character comparisons.

David_A_Black1 · 31 May 2008 19:42

Hi --

Hi --

But this is Ruby, not Perl!

Coming from 10 years of Perl coding, I wish Ruby were *less* Perl-like,
as it can get confusing, especially when you're working in both
languages at the same time.

I remember someone (I'm too lazy to look it up) saying long ago that
while Ruby often strikes one as Perl-like initially, it actually is
much less so than it appears at first. I think that's true. Perl also
has more of a tradition of deliberate code obfuscation, though of
course it's generally done in a playful way. Obfuscated Ruby code
always looks kind of ridiculous to me, as Ruby really militates for
a certain clarity, and there's such a tradition of love of clean code
in the community.

Some of the features Ruby 'stole' from Perl are things I use fairly
regularly, for example the if/unless/while statement modifiers.

Others such as all the special and sometimes magical global variables
and pseudo-variables I don't find useful at all, and I don't see them
discussed much here. They're probably quite helpful if you are using
Ruby as the same kind of Swiss Army knife adjunct to shell commands
that Perl often is, wrapping a 'one-liner' in one of the
various loops implied by command-line options like -i, -n,
and -p, but I never use that.

Regular expressions are really a sub-language of their own. Ruby has
some extensions like %r{} which I think make them a little clearer.

For the first RubyConf, we were going to have a "Code De-Obfuscation"
contest, since the idea of an obfuscation contest in Ruby seemed so
against the grain of what people loved about the language. We got as
far as getting some obfuscated contributions, ripe for de-obfuscation
(including one from Dave Thomas), but unfortunately the timing of that
conference -- October 2001 -- sapped some of our time and energy and
that contest was one of the things that fell by the wayside.

That sounds like a great idea. I wish that there were some kind of
anti-code-golf contest, where the objective was maximum
clarity/expressiveness rather than compactness. The problem is that
it's impossible to measure the former objectively, compared to 32 vs.
33 character comparisons.

I like code golf, as long as it's clear that it's code golf -- that
is, a brain-teaser/exercise with the goal of coming up with the
minimum number of "strokes". I've learned an awful lot about both Ruby
and Perl by doing that. I don't consider it any more closely related
to real code production than, say, abdominal crunches are to baseball.
It's just a mind-stretcher. I suspend aesthetic judgment on code in
code golf contests because I assume it isn't really being held up as
anything other than what it is (maximally compressed).

Then again, there are certainly cases of quasi-golf-like code that
people just write, and that can be a problem....

The de-obfuscation contest idea was kind of a superset of what you're
describing: looking to transform code that was opaque and badly
written in various ways into something more clear and expressive. The
de-obfuscations were going to be judged by a panel, as I recall, since
as you say there's no automatic way to judge them.

David

···

On Sat, 31 May 2008, Rick DeNatale wrote:

On Sat, May 31, 2008 at 8:11 AM, David A. Black <dblack@rubypal.com> wrote:

On Sat, 31 May 2008, Dave Bass wrote:


David_A_Black1 · 31 May 2008 19:44

Hi --

···

On Sun, 1 Jun 2008, Robert Klemme wrote:

On 31.05.2008 16:45, Rick DeNatale wrote:

On Sat, May 31, 2008 at 8:11 AM, David A. Black <dblack@rubypal.com> wrote:

For the first RubyConf, we were going to have a "Code De-Obfuscation"
contest, since the idea of an obfuscation contest in Ruby seemed so
against the grain of what people loved about the language. We got as
far as getting some obfuscated contributions, ripe for de-obfuscation
(including one from Dave Thomas), but unfortunately the timing of that
conference -- October 2001 -- sapped some of our time and energy and
that contest was one of the things that fell by the wayside.

That sounds like a great idea. I wish that there were some kind of
anti-code-golf contest, where the objective was maximum
clarity/expressiveness rather than compactness. The problem is that
it's impossible to measure the former objectively, compared to 32 vs.
33 character comparisons.

That's true. I usually try to compact code in order to increase readability and often also efficiency because I do not find shortness a value in itself. As always it's a question of balance.

I do think that one of the great things about Ruby is that, to a
remarkable degree, code becomes *clearer* as it becomes smaller. Not,
of course, when it gets into real golf territory (see my last post) --
but over a pretty wide range.

David


Robert_K1 · 1 June 2008 09:39

Absolutely agree.

robert

···

On 31.05.2008 21:44, David A. Black wrote:

Hi --

On Sun, 1 Jun 2008, Robert Klemme wrote:

On 31.05.2008 16:45, Rick DeNatale wrote:

On Sat, May 31, 2008 at 8:11 AM, David A. Black <dblack@rubypal.com> wrote:

For the first RubyConf, we were going to have a "Code De-Obfuscation"
contest, since the idea of an obfuscation contest in Ruby seemed so
against the grain of what people loved about the language. We got as
far as getting some obfuscated contributions, ripe for de-obfuscation
(including one from Dave Thomas), but unfortunately the timing of that
conference -- October 2001 -- sapped some of our time and energy and
that contest was one of the things that fell by the wayside.

That sounds like a great idea. I wish that there were some kind of
anti-code-golf contest, where the objective was maximum
clarity/expressiveness rather than compactness. The problem is that
it's impossible to measure the former objectively, compared to 32 vs.
33 character comparisons.

That's true. I usually try to compact code in order to increase readability and often also efficiency because I do not find shortness a value in itself. As always it's a question of balance.

I do think that one of the great things about Ruby is that, to a
remarkable degree, code becomes *clearer* as it becomes smaller. Not,
of course, when it gets into real golf territory (see my last post) --
but over a pretty wide range.
