Problem removing new line characters on Mac OS X

Hi, I'm pretty new to Ruby. I've got a text file where I need to
remove some new line characters. I've tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")
line.gsub!("/n","")
line=line.chomp

I can't seem to get the new line character to be recognised and dealt
with. Any advice appreciated.

Thanks

Hi, I'm pretty new to Ruby. I've got a text file where I need to
remove some new line characters. I've tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")

                 ~~ \r

line.gsub!("/n","")

                ~~\n

line=line.chomp

I can't seem to get the new line character to be recognised and dealt
with. Any advice appreciated.

Thanks

try it

···

On 5/17/07, Singeo <singeo.sg@gmail.com> wrote:

--

line.chomp! doesn't work?
Would you show some code?

Harry

--

A Look into Japanese Ruby List in English

Singeo wrote:

Hi, I'm pretty new to Ruby. I've got a text file where I need to
remove some new line characters. I've tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")
line.gsub!("/n","")
line=line.chomp

I can't seem to get the new line character to be recognised and dealt
with. Any advice appreciated.

Thanks

It looks like you should be using backslashes. If you want to match both newlines and carriage returns, you can use:

line.gsub!(/[\n\r]/, "")

-Dan
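To illustrate the point about slashes, here is a minimal sketch (the sample string is made up for the example). In a double-quoted Ruby string, "/r" is a slash followed by the letter r, while "\r" is a single carriage-return character.

```ruby
# Sample line with a Windows-style ending (illustrative data only).
line = "PSI Level 45\r\n"

# "/r" and "/n" are each two ordinary characters, so nothing in the line matches.
forward = line.gsub("/r", "").gsub("/n", "")

# "\r" and "\n" are escapes for CR and LF; the character class removes both.
backslash = line.gsub(/[\n\r]/, "")

p forward    # unchanged, still ends in CR LF
p backslash  # line endings removed
```
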

Singeo wrote:

Hi, I'm pretty new to Ruby. I've got a text file where I need to
remove some new line characters. I've tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")
line.gsub!("/n","")
line=line.chomp

In case your problem is just your Ruby syntax:

1. Replace the forward slashes (as in "/r") with
backslashes ("\r") in the solution you
mentioned above.

2. You can also make the first parameter to the gsub! method
a Regexp instead of a String. The API docs say:
"... if it is a String then no regular expression
metacharacters will be interpreted ...".

With a String argument, "/r" can never match a
carriage return. A double-quoted "\r" does match,
because Ruby expands the escape before gsub! sees it.

If chomp does not work, you may be using a Mac
file under Linux or Windows. In that case you may
want to try something like

line.gsub!(/\015/, '')

Hermann
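A quick way to check the patterns discussed above, as a sketch with a made-up sample line: \015 is the octal escape for a carriage return, so it is the same character as \r. In current Ruby a double-quoted String pattern matches it as well as a Regexp does; only a single-quoted '\r' (a literal backslash plus r) does not.

```ruby
cr_line = "North: 45\r"   # classic Mac OS style line ending (illustrative data)

octal  = cr_line.gsub(/\015/, "")  # octal escape for CR inside a Regexp
regexp = cr_line.gsub(/\r/, "")    # the usual escape for CR
string = cr_line.gsub("\r", "")    # double-quoted String: "\r" is a real CR
single = cr_line.gsub('\r', "")    # single-quoted String: backslash + "r", no match

p "\015" == "\r"
p [octal, regexp, string, single]
```
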

Apologies for the misunderstanding, I have been using backslashes.
Here's my code. As you'll see from the resulting file, there are a
bunch of newline characters in the file that I'd like to get rid of.
Thanks for the help so far.

require("rubygems")
require("scrubyt")
require ("open-uri")
require 'time'
require 'date'

psi = Scrubyt::Extractor.define do
  fetch("http://app.nea.gov.sg/psi/")

  record("/html/body/div/table/tr/td/table/tbody/tr/td/div", { :generalize => true }) do
    title("/strong[1]/font[1]")
    item("/table/tbody/tr/td/table/tbody/tr", { :generalize => true }) do
      region("/td[1]")
      psi("/td[7]")
      aqd("/td[8]")
    end
  end
end

f = open("psiregions.xml", File::CREAT|File::TRUNC|File::RDWR) {|f|
  psi.to_xml.write(f, 1)
}

# Create the RSS file.
rssfile = File.new("sgpsi.xml", "w")
rssfile.puts('<?xml version="1.0" encoding="UTF-8"?>')
rssfile.puts('<rss version="2.0">')
rssfile.puts(' <channel>')
rssfile.puts(' <link>http://app.nea.gov.sg/psi/</link>')
rssfile.puts(' <description>Singapore PSI Readings</description>')
#rssfile.puts(' <title>Singapore PSI Readings' + Time.now.rfc2822 + '</title>')
rssfile.puts(' <lastBuildDate>' + Time.now.rfc2822 + '</lastBuildDate>')

rssfile.puts(' <webMaster>singeo@singeo.com.sg</webMaster>')

File.open('psiregions.xml', 'r') do |f1|
    while line = f1.gets
    line=line.strip
    line=line.chomp
    line.gsub!(/[\n]/, "")
    line.gsub!(/<root>/, "")
    line.gsub!(/<\/root>/, "")
    line.gsub!(/<record>/, "")
    line.gsub!(/<\/record>/, "")
    line.gsub!("24-hr", "Singapore 24-hr")
    line.gsub!("<region>Region</region>", "")
    line.gsub!("<region>Sulphur Dioxide</region>", "")
    line.gsub!(/<region>/, "<title>")
    line.gsub!(/<\/region>/,":")
    line.gsub!(/<psi>/, " PSI Level ")
    line.gsub!(/<\/psi>/, "")
    line.gsub!(/<aqd>/, " - ")
    line.gsub!(/<\/aqd>/, "</title>")
    line.gsub!(/<item>/, "<item><pubDate>" + Time.now.rfc2822 + "</pubDate>")

    rssfile.puts line
    end
end

rssfile.puts('</channel>')
rssfile.puts('</rss>')
rssfile.close

Hi Hermann, just tried your suggestion of:

line.gsub!(/\015/, '')

still no success. I'm creating and running the file on a Mac.

<snip>

Singeo wrote:

Here's my code, as you'll see from the resulting file there are a
bunch of new line characters in the file I'd like to get rid of.
[...]
    line=line.chomp
[...]
    rssfile.puts line

puts adds a newline to the end of the string it writes. If you don't want that
behaviour (which you obviously don't), use print instead.
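The difference can be seen in a small sketch. StringIO stands in for the output file here, and the sample line is made up: puts appends a newline when the string does not already end with one, while print writes the string exactly as given.

```ruby
require "stringio"

# The trailing newline is removed here, as in the script above.
line = "<title>North: PSI Level 45</title>\n".chomp

with_puts  = StringIO.new
with_print = StringIO.new

with_puts.puts(line)    # appends "\n" because line no longer ends with one
with_print.print(line)  # writes the string unchanged

p with_puts.string
p with_print.string
```
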

···

--
It is so because it is so
It stays so because it was so

Singeo wrote:

Hi Hermann, just tried your suggestion of:

line.gsub!(/\015/, '')

still no success. I'm creating and running the file on a Mac.

Are you sure that it is not successful?

It would be good to know how you read the lines,
how you (not) remove the carriage returns,
and how you perhaps put the lines together
(adding again \r characters by mistake?).

Are you removing the carriage returns line
by line (in which case the chomp should be perfect)
or are you trying it as a whole, i.e. do you have
not only one line but a whole file in 'line'?

Rather than an answer to these questions I would
prefer to see some more code of the whole part
from opening the file to writing back or putting
out the strings.

Hermann

Hermann, I followed Sebastian's advice to use print instead of puts and
that solved my problem. But I would still like to understand how to
remove the newline characters. Here's my code as it currently stands
(with "rssfile.print line" in place of "rssfile.puts line"), hopefully
it will help you see how I was trying to tackle the problem.

require("rubygems")
require("scrubyt")
require ("open-uri")
require 'time'
require 'date'

psi = Scrubyt::Extractor.define do
  fetch("http://app.nea.gov.sg/psi/")

  record("/html/body/div/table/tr/td/table/tbody/tr/td/div", { :generalize => true }) do
    title("/strong[1]/font[1]")
    item("/table/tbody/tr/td/table/tbody/tr", { :generalize => true }) do
      region("/td[1]")
      psi("/td[7]")
      aqd("/td[8]")
    end
  end
end

f = open("psiregions.xml", File::CREAT|File::TRUNC|File::RDWR) {|f|
  psi.to_xml.write(f, 1)
}

# Create the RSS file.
rssfile = File.new("sgpsi.xml", "w")
rssfile.puts('<?xml version="1.0" encoding="UTF-8"?>')
rssfile.puts('<rss version="2.0">')
rssfile.puts(' <channel>')
rssfile.puts(' <link>http://app.nea.gov.sg/psi/</link>')
rssfile.puts(' <description>Singapore PSI Readings</description>')
#rssfile.puts(' <title>Singapore PSI Readings' + Time.now.rfc2822 + '</title>')
rssfile.puts(' <lastBuildDate>' + Time.now.rfc2822 + '</lastBuildDate>')

rssfile.puts(' <webMaster>singeo@singeo.com.sg</webMaster>')

File.open('psiregions.xml', 'r') do |f1|
    while line = f1.gets
    line=line.strip
    line.gsub!(/<root>/, "")
    line.gsub!(/<\/root>/, "")
    line.gsub!(/<record>/, "")
    line.gsub!(/<\/record>/, "")
    line.gsub!("24-hr", "Singapore 24-hr")
    line.gsub!("<region>Region</region>", "")
    line.gsub!("<region>Sulphur Dioxide</region>", "")
    line.gsub!(/<region>/, "<title>")
    line.gsub!(/<\/region>/,":")
    line.gsub!(/<psi>/, " PSI Level ")
    line.gsub!(/<\/psi>/, "")
    line.gsub!(/<aqd>/, " - ")
    line.gsub!(/<\/aqd>/, "</title>")
    line.gsub!(/<item>/, "<item><pubDate>" + Time.now.rfc2822 + "</pubDate>")

    line.gsub!(/<\/item>/, "</item>\n")
    rssfile.print line
    end
end

rssfile.puts('')
rssfile.puts('</channel>')
rssfile.puts('</rss>')

rssfile.close

It's good to know that nobody reads your posts ;), seriously.

Is all you can see from my post

<snip>?

Strange

Probably I did something stupid, but that is not important as somebody
else came up with it too :)

Robert

···

On 5/17/07, Singeo <singeo.sg@gmail.com> wrote:

Hermann, I followed Sebatian's advice

Hi Singeo (btw: is that your first name?),

Singeo wrote:

... I followed Sebatian's advice to use print instead of puts and
that solved my problem.

Running out of time now, but I see that Sebastian
has given the correct answer already. I was on the
same track, which is why I was asking for the code
to see how you read and write the lines.

> But I would still like to understand how to
> remove the newline characters. Here's my code as it currently stands
> (with "rssfile.print line" in place of "rssfile.puts line"), hopefully
> it will help you see how I was trying to tackle the problem.

You may have done well removing the newlines (to be
more precise: the carriage returns or "\r" characters),
but then when writing the lines you add it back again:

Hermann

Maybe I'm being thick here but I don't understand you: you have a working solution and an explanation why the NL weren't "removed" (actually they were removed and then reinserted by using "puts"). Now what is it that you want to understand about this?

Another remark: since you seem to be dealing with XML files why don't you use an XML tool, such as REXML? That's certainly less error prone when manipulating XML files.

Regards

  robert
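As a rough sketch of the REXML suggestion: the XML below is only an assumed shape for psiregions.xml, guessed from the tag names the script substitutes, and the values are invented. The idea is to read the elements directly rather than rewriting tags line by line.

```ruby
require "rexml/document"

# Assumed structure and sample values; the real psiregions.xml may differ.
xml = <<~XML
  <root>
    <record>
      <item><region>North</region><psi>45</psi><aqd>Good</aqd></item>
      <item><region>South</region><psi>52</psi><aqd>Moderate</aqd></item>
    </record>
  </root>
XML

doc = REXML::Document.new(xml)

# Build one title string per <item>, mirroring the format the script aims for.
titles = []
doc.elements.each("root/record/item") do |item|
  region = item.elements["region"].text
  psi    = item.elements["psi"].text
  aqd    = item.elements["aqd"].text
  titles << "#{region}: PSI Level #{psi} - #{aqd}"
end

puts titles
```

Because the text is taken from parsed elements, whitespace and newlines between tags in the source file do not end up in the titles.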

···

On 17.05.2007 12:33, Singeo wrote:

Hermann, I followed Sebatian's advice to use print instead of puts and
that solved my problem. But I would still like to understand how to
remove the newline characters. Here's my code as it currently stands
(with "rssfile.print line" in place of "rssfile.puts line"), hopefully
it will help you see how I was trying to tackle the problem.

Robert, thanks for trying, but all I could see was <snip>..... as you
say, strange.....

Robert, I'm pretty new to Ruby so excuse my ignorance.... I have a
solution to my problem but wanted to understand more about why my
previous approach failed (for future reference). In particular, how do
I do a simple string substitution of a newline character on Mac OS X?
Nothing I've tried or that has been suggested has worked.

I'll certainly look at REXML, thanks for the help.
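For reference, a minimal sketch of removing newlines from a string, using made-up sample text. Mac OS X uses "\n" as its line ending, so nothing platform-specific is needed. chomp removes only a trailing line ending, while gsub and delete remove every occurrence.

```ruby
text = "first line\nsecond line\r\nthird line\n"

trailing_only = text.chomp              # drops only the final line ending
all_gsub      = text.gsub(/\r?\n/, "")  # drops every LF or CR LF pair
all_delete    = text.delete("\r\n")     # drops every CR and LF character

p trailing_only
p all_gsub
p all_delete
```

Writing any of these results with puts puts a newline back at the end of the output.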

<snip>

Thx for confirming which means I just did something stupid as this
post got through.
R.

···

On 5/17/07, Singeo <singeo.sg@gmail.com> wrote:

Robert, thanks for trying, but all I could see was <snip>..... as you
say, strange.....

Singeo wrote:

I have a
solution to my problem but wanted to understand more about why my
previous approach failed (for future reference)

Your previous approach didn't fail as such. You did remove the newline as you
wanted to. You just added it again afterwards by using puts instead of print.
puts adds a newline, that's just what it does (and what it says it does in the
docs). That's why it didn't work. There's nothing more to it.

···

--
It is so because it is so
It stays so because it was so