Trying to strip characters from a line

7stud2 · 20 December 2012 20:36

I'm reading a table from a MySQL database and then processing it row by
row, stripping each line of certain characters ([, ], " and comma)
before writing it to a file. The code is running without errors, but
it's not stripping any of the characters as I would expect.

Here's the code:

#!/usr/bin/ruby

require 'mysql'

@bad_chars = '[],"'

begin
con = Mysql.new 'localhost', 'root', 'menagerie', 'haiku_archive'

rs = con.query("SELECT * FROM archive_2012")
n_rows = rs.num_rows

n_rows.times do
 begin
 file = File.open("archive.html", "a")
 line = rs.fetch_row.to_s
 line.gsub(/\[\]\,\"/,'')
 file.write(line)
 file.puts " "
 end
 end
end

Executing the code results in an "archive.html" file with all of the
"stripped" characters still intact. Am I invoking the gsub method
incorrectly? Thanks in advance for any help.

···

--
Posted via http://www.ruby-forum.com/.

Henry_Maddocks1 · 20 December 2012 21:23

Executing the code results in an "archive.html" file with all of the
"stripped" characters still intact. Am I invoking the gsub method
incorrectly? Thanks in advance for any help.

There are two forms of gsub…

line = "this is a line with ***** in it"
=> "this is a line with ***** in it"

line.gsub '*', ''
=> "this is a line with  in it"

line
=> "this is a line with ***** in it"

line.gsub! '*', ''
=> "this is a line with  in it"

line
=> "this is a line with  in it"

The first form returns a new string, the second modifies the receiver.
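
A small sketch of the same difference, using a made-up sample string. One detail worth knowing: gsub! returns nil when it makes no substitution, so its return value should not be chained blindly.

```ruby
# Sample row text (made up for illustration).
line = '["old sea port", "Sep-15-2012"]'

# gsub returns a new string; the receiver is left untouched.
cleaned = line.gsub('"', '')
puts cleaned   # the copy without double quotes
puts line      # still the original text

# To keep the result when using gsub, assign it back.
line = line.gsub('"', '')

# gsub! changes the receiver in place, but returns nil when nothing matched.
result = line.gsub!('"', '')   # no quotes left, so nothing to replace
puts result.inspect            # nil
```

In the loop from the first post, either `line = line.gsub(...)` or `line.gsub!(...)` would keep the change; calling `line.gsub(...)` on its own discards it.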

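Also, the regexp in your gsub probably doesn't do what you think it does: it matches only the four characters [ ] , " in exactly that order, not any one of them on its own. You might want a character class instead, something like…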

line.gsub /[\[\],"]/, "**"

Henry

7stud2 · 20 December 2012 21:55

Hi,

simply use String#delete or String#delete!
http://www.ruby-doc.org/core-1.9.3/String.html#method-i-delete

No need to fumble with regexes.
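
A minimal sketch of that suggestion, with a made-up sample string standing in for a row. Note that the argument to delete is a set of characters, not a pattern; within it, a hyphen between two characters denotes a range and a leading ^ negates the set, which does not matter for the four characters here.

```ruby
# Sample row text (made up for illustration).
row_text = '["dockside pub", "Sep-15-2012"]'

# delete removes every character listed in its argument and returns a new string.
stripped = row_text.delete('[]",')
puts stripped

# delete! modifies the receiver instead (and returns nil if nothing was removed).
copy = row_text.dup
copy.delete!('[]",')
puts copy
```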

7stud2 · 20 December 2012 22:42

Thanks for the responses. Unfortunately neither is having the desired
result. Here is a "tail" of the outputted archive.html file:

pablo@cochituate=> tail archive.html
["only Wednesday -- I stop fast-forwarding between
commercials ", "Sep-12-2012"] 
["last days of summer -- one more ghost story around the
fire ", "Sep-13-2012"] 
["maple-colored moon -- remembering Mom’s pancakes ",
"Sep-14-2012"] 
["Harvard Square station -- a Mozart sonata between buses ",
"Sep-14-2012"] 
["between buses a Mozart sonata ", "Sep-14-2012"] 
["stiff sea breeze -- the drawbridge operator’s bushy
beard ", "Sep-15-2012"] 
["old sea port -- the tugboat’s tattered flag ",
"Sep-15-2012"] 
["dockside pub -- wondering where the cormorant went ",
"Sep-15-2012"] 
["dockside pub -- the bride-to-be’s bright pink tiara ",
"Sep-15-2012"] 
["after the break somewhat less jumping to the jump blues
band ", "Sep-15-2012"]

7stud2 · 21 December 2012 13:17

The code is the very first post in this thread, but I'll repeat some of
it here:

n_rows.times do
 begin
 file = File.open("archive.html", "a")
 line = rs.fetch_row.to_s
 line.gsub /\[\]\"\,/, ''
 file.write(line)
 file.puts " "
 end
 end

The code is processing rows from a MySQL table as lines of text, one at
a time. The desired output would look something like this:

blah blah blah meh foo bar Feb-12-2012

Instead it looks like this:

["blah blah blah meh foo bar ", "Feb-12-2012 "]

7stud2 · 21 December 2012 14:01

I'm talking about your *new* code where you used the above suggestions.
You said that neither of them worked, so I'm asking you for the exact
code.

Henry told you that you need to use "gsub!" if you want the method to
actually change the string (instead of returning a new string).

I suggested using "delete!" as an alternative.

So choose one of those two options, rewrite your code and try again.

7stud2 · 21 December 2012 14:15

This finally worked, although it's certainly not elegant:

n_rows.times do
 begin
 file = File.open("archive.html", "a")
 line = rs.fetch_row.to_s
 line.gsub! '[', ''
 line.gsub! '"', ''
 line.gsub! ']', ''
 line.gsub! ',', ''
 file.write(line)
 file.puts " "
 end
 end

Thanks once again to everyone for their suggestions.

7stud2 · 21 December 2012 15:20

I did read and attempt to implement every suggestion made, but
admittedly I'm new to Ruby and will make newbie mistakes (like mixing up
gsub and gsub!). I started out as a FORTRAN developer back in the early
80s but have been a Sys Admin since the early 90s, and am still trying
to wrap my mind around object-oriented programming.

7stud2 · 23 December 2012 16:05

Alex,

I'm appending to the "archive.html" line by line in a loop - that's why
I opened it with the "a" mode.

Cheers,

Paul

7stud2 · 23 December 2012 16:52

Instead of all those individual gsubs, why not this:

irb(main):001:0> 'A["],a'.gsub(/\["\],/,'')
=> "Aa"

7stud2 · 23 December 2012 20:01

Strangely, this didn't work for me:

irb(main):001:0> '["the frustrated musician’s mountain of
unsold CDs ", "Feb-12-2012"] '.gsub(/\["\],/,'')

=> "[\"the frustrated musician’s mountain of unsold
CDs \", \"Feb-12-2012\"] "

7stud2 · 24 December 2012 01:05

Yes, my mistake, I was looking for that exact string instead of a
character class. It should be this: /[\["\],]/

That is:
/   Start regex
[   Start character class
\[  Escaped open square bracket
"   Double quote
\]  Escaped close square bracket
,   Comma
]   End character class
/   End regex

7stud2 · 24 December 2012 13:10

Joel / Calvin,

That worked brilliantly. Thank you so much for the help!

Paul

7stud2 · 21 December 2012 01:13

Paul Mena wrote in post #1089769:

Thanks for the responses. Unfortunately neither is having the desired
result.

Here is how computer programming forums work.

1) You post 15 lines or less of code that demonstrates your problem.
2) You show the actual output.
3) You state your desired/expected output.

7stud2 · 21 December 2012 11:10

Paul Mena wrote in post #1089769:

Thanks for the responses. Unfortunately neither is having the desired
result. Here is a "tail" of the outputted archive.html file:

The "tail" doesn't tell us anything, we need your *code*.

I'm pretty sure you've again confused "gsub" and "gsub!" (or "delete"
and "delete!") like in your first post.

7stud2 · 21 December 2012 14:34

Paul Mena wrote in post #1089856:

Thanks once again to everyone for their suggestions.

It would be even better if you'd actually read them. :-/

What's the purpose of the "begin-end", by the way? This is not Pascal. A
"begin-end" block only makes sense in combination with "rescue" or
"ensure". Re-opening the file for every single row also doesn't really
make sense. Either open the file *before* the loop or collect the row
strings and then write them all at once.

File.open("archive.html", "a") do |file|
  # I'm sure there's a better method for this, something like "each_row"
  n_rows.times do
    file << rs.fetch_row.to_s.delete('[]",')
    file.puts ' '
  end
end
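
One point the thread does not spell out: the brackets, quotes and commas come from calling to_s on the array that fetch_row returns, which in Ruby 1.9 and later produces the same text as inspect. Below is a rough sketch of avoiding them in the first place by joining the fields. The `rows` data and the output path are hypothetical stand-ins so the example runs without a database.

```ruby
require 'tmpdir'

# Stand-ins for the arrays that fetch_row returns (hypothetical data).
rows = [
  ['old sea port -- the tugboat flag', 'Sep-15-2012'],
  ['between buses a Mozart sonata', 'Sep-14-2012']
]

# Hypothetical output path; the thread writes to "archive.html" instead.
path = File.join(Dir.tmpdir, 'archive_sketch.html')

# Open the file once; the block form closes it automatically.
File.open(path, 'w') do |file|
  rows.each do |row|
    # join builds the line from the fields directly, so there is nothing to strip.
    file.puts row.join(' ')
  end
end

puts File.read(path)
```

The sketch opens the file with "w" so that repeated runs give the same result; "a" would append as in the original code. Unlike deleting characters afterwards, join leaves any commas or quotes inside the text itself untouched.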

Alexander_McMillan · 23 December 2012 11:38

If you are trying to strip characters from a line in an .html file, why, may I ask, are you using the "a" file-opening mode? The "a" mode appends to the end of a file.

Calvin_Bornhofen · 23 December 2012 23:19

Hello,

It did not work because his regular expression describes a different pattern from the one you want. \["\], looks for a [ (which must be escaped in a regular expression because it is a character with special meaning), followed by a ", followed by a ] (escaped for the same reason), followed by a ,. His regular expression worked in his case because his string contained ["], as one sequence. As far as I can tell, you want to remove all occurrences of those characters.
The regular expression you want is similar: /[\[\]",]/
If you don't know what this does, here's an explanation: the [ denotes the beginning of a set, and the ] ends it. The set matches any one of the characters inside it, but only one character at a time (because I did not write a quantifier like * or + after it).

So, this works:
"A[Aaaaa, \"aa".gsub( /[\[\]",]/, '' ) # => "AAaaaa aa"

Alternatively, you could describe what you want with the regular expression /\[|\]|"|,/ since the pipe in regular expressions can be read as "or".

Regards

(Tested in Ruby 1.9.3p286; I don't know off the top of my head whether the behavior would be any different in 1.8.)
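
The two patterns described above (the character class and the alternation) can be compared side by side with the sequence pattern that failed. This is only a sketch with a made-up sample string.

```ruby
# Sample row text (made up for illustration).
sample = '["maple-colored moon", "Sep-14-2012"]'

# Character class: matches any single one of the four characters.
with_class = sample.gsub(/[\[\]",]/, '')

# Alternation: the same four characters joined by "or".
with_alternation = sample.gsub(/\[|\]|"|,/, '')

# Sequence: matches only the four characters in a row, which this sample lacks.
with_sequence = sample.gsub(/\["\],/, '')

puts with_class
puts with_class == with_alternation   # true
puts with_sequence == sample          # true, nothing was replaced
```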
