Special characters in csv header using fastercsv

Hello all,
I'm kind of new to fastercsv and only have 2 months with Ruby on Rails.
So this my sound a little newbie.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.

Second, while reading or parsing the file, the rows are inputted into a
database.

My problem is my database columns can not have special characters, but
the csv headers have the special characters.

Is there a way to remove the special characters before inserting in
database?

Here is a view of my code.

def import_irb_file
     # set file name
      file = params[:irb][:file]
      rowcount = 0

      Irb.transaction do
        FasterCSV.parse(file,
                        :headers => true,
                        :header_converters => :symbol,
                        :converters => :all,
                        :encoding => 'u' ) do |row|
                          Irb.create!(row.to_hash)
                          rowcount += 1
                        end
      end
      # if successful then display, then redirect to index page
      flash[:notice] = "Successfully added #{rowcount} project(s)."
      redirect_to :action => :index

    rescue => exception
      file_name = params[:irb]['file'].original_filename
      file_parts = params[:irb]['file'].original_filename.split('.')
      ext = file_parts[1]

      if ext != 'csv'
        error = "CSV file is required"
      else
        error = ERB::Util.h(exception.to_s) # get the error and HTML
escape it
      end
      # If an exception in thrown, the transaction rolls back and we end
up in this
      # rescue block

      flash[:error] = "Error adding projects to IRB table. (#{error}).
Please try again. "

      redirect_to :controller => 'irbs', :action => 'new'

  end

Thank you for any advice.

JohnM

···

--
Posted via http://www.ruby-forum.com/.

John Mcleod wrote:

Hello all,
I'm kind of new to fastercsv and only have 2 months with Ruby on Rails.
So this my sound a little newbie.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.

Second, while reading or parsing the file, the rows are inputted into a
database.

My problem is my database columns can not have special characters,

[...]

Why can't they? And are you creating a DB column for each column in the
CSV file?

Best,

···

--
Marnen Laibow-Koser
http://www.marnen.org
marnen@marnen.org
--
Posted via http://www.ruby-forum.com/\.

Hello all,

Hello.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.

Second, while reading or parsing the file, the rows are inputted into a
database.

My problem is my database columns can not have special characters, but
the csv headers have the special characters.

Is there a way to remove the special characters before inserting in
database?

Sure. FasterCSV support header_converters that can transform your headers in any way that you need. Can you show a sample header and what you would like it to become?

James Edward Gray II

···

On Nov 17, 2009, at 10:24 AM, John Mcleod wrote:

Hello Marnen,
Thank you for replying.

I would consider the non-use of special characters in a database column
title, a matter of good form.
The only special character I use in database column titles is an
underscore and sometimes not that.

John

Marnen Laibow-Koser wrote:

···

John Mcleod wrote:

Hello all,
I'm kind of new to fastercsv and only have 2 months with Ruby on Rails.
So this my sound a little newbie.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.

Second, while reading or parsing the file, the rows are inputted into a
database.

My problem is my database columns can not have special characters,

[...]

Why can't they? And are you creating a DB column for each column in the
CSV file?

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen@marnen.org

--
Posted via http://www.ruby-forum.com/\.

James,
Thank you for replying.

sample headers:
Q.20C - Population May Include (Target)
IRB#

Desired headers:
Q20C_Population_May_Include_Target
IRB_id

John

James Edward Gray II wrote:

···

On Nov 17, 2009, at 10:24 AM, John Mcleod wrote:

Hello all,

Hello.

database?

Sure. FasterCSV support header_converters that can transform your
headers in any way that you need. Can you show a sample header and what
you would like it to become?

James Edward Gray II

--
Posted via http://www.ruby-forum.com/\.

Try adding this argument where faster CSV opens the file:

  :header_converters => lambda { |h| h.tr(" ", "_").delete("^a-zA-Z0-9_") }

Hope that helps.

James Edward Gray II

···

On Nov 17, 2009, at 10:38 AM, John Mcleod wrote:

James,
Thank you for replying.

sample headers:
Q.20C - Population May Include (Target)
IRB#

Desired headers:
Q20C_Population_May_Include_Target
IRB_id

[Please do not top-post.]

John Mcleod wrote:

Hello Marnen,
Thank you for replying.

I would consider the non-use of special characters in a database column
title, a matter of good form.
The only special character I use in database column titles is an
underscore and sometimes not that.

I agree with you when I'm creating the column names manually. But if
you're loading them dynamically from a CSV file, then it probably makes
sense to transform the names as little as possible

John

Best,

···

--
Marnen Laibow-Koser
http://www.marnen.org
marnen@marnen.org
--
Posted via http://www.ruby-forum.com/\.

I'm not sure if the placement is correct but I'm still getting "Error
adding projects to IRB table. (unknown attribute:
Q16_Research_Category_International). Please try again." errors.

Here's my updated code.

def import_irb_file
     # set file name
      file = params[:irb][:file]
      rowcount = 0

      Irb.transaction do
        FasterCSV.parse(file,
                        :headers => true,
                        :header_converters => lambda { |h| h.tr(" ",
"_").delete("^a-zA-Z0-9_")},
                        :converters => :all ) do |row|
                          Irb.create!(row.to_hash)
                          rowcount += 1
                        end
      end
      # if successful then display, then redirect to index page
      flash[:notice] = "Successfully added #{rowcount} project(s)."
      redirect_to :action => :index

    rescue => exception
      file_name = params[:irb]['file'].original_filename
      file_parts = params[:irb]['file'].original_filename.split('.')
      ext = file_parts[1]

      if ext != 'csv'
        error = "CSV file is required"
      else
        error = ERB::Util.h(exception.to_s) # get the error and HTML
escape it
      end
      # If an exception in thrown, the transaction rolls back and we end
up in this
      # rescue block
      flash[:error] = "Error adding projects to IRB table. (#{error}).
Please try again. "
      redirect_to :controller => 'irbs', :action => 'new'
  end

John

···

--
Posted via http://www.ruby-forum.com/.

I'm not sure if the placement is correct

You're placement looks fine to me.

but I'm still getting "Error
adding projects to IRB table. (unknown attribute:
Q16_Research_Category_International). Please try again." errors.

The name Q16_Research_Category_International doesn't have any characters in it. It looks like what you asked me for, but the table doesn't seem to have that column. So, you tell me what we did wrong. :slight_smile:

James Edward Gray II

···

On Nov 17, 2009, at 12:26 PM, John Mcleod wrote:

You are correct.
I need to check all my DB column titles. I updated the column title and
the next error was another DB column.

In your code...
:header_converters => lambda { |h| h.tr(" ",
"_").delete("^a-zA-Z0-9_")},

what is happening here?
h.tr(" ", "_").delete("^a-zA-Z0-9_#")

it looks like you're checking a table row for an instance of " " and
replacing with "_"
but I'm unsure of the ".delete("^a-zA-Z0-9_")"

John

···

--
Posted via http://www.ruby-forum.com/.

John Mcleod wrote:

what is happening here?
h.tr(" ", "_").delete("^a-zA-Z0-9_#")

it looks like you're checking a table row for an instance of " " and
replacing with "_"
but I'm unsure of the ".delete("^a-zA-Z0-9_")"

John

http://rubular.com/ is your best friend.

···

--
Posted via http://www.ruby-forum.com/\.

In your code...
:header_converters => lambda { |h| h.tr(" ",
"_").delete("^a-zA-Z0-9_")},

what is happening here?
h.tr(" ", "_").delete("^a-zA-Z0-9_#")

it looks like you're checking a table row for an instance of " " and
replacing with "_"

tr(), for transliterate, is used to replace characters. You are right that I'm using it to switch all spaces to underscores.

but I'm unsure of the ".delete("^a-zA-Z0-9_")"

delete() allows me to list characters I want to remove. It understands simple ranges, like a-z and 0-9. Also, if the first character is a ^, the entire character set is negated. Thus my call means delete all non letters, numbers, and underscore characters.

James Edward Gray II

···

On Nov 17, 2009, at 12:58 PM, John Mcleod wrote:

I didn't use any regular expressions. :wink:

James Edward Gray II

···

On Nov 17, 2009, at 1:13 PM, Aldric Giacomoni wrote:

John Mcleod wrote:

what is happening here?
h.tr(" ", "_").delete("^a-zA-Z0-9_#")

it looks like you're checking a table row for an instance of " " and
replacing with "_"
but I'm unsure of the ".delete("^a-zA-Z0-9_")"

John

http://rubular.com/ is your best friend.

So to delete or remove say additional special characters, you would add
additional characters to the range?

h.tr(" ", "_").delete("^a-zA-Z0-9_#$-")

···

--
Posted via http://www.ruby-forum.com/.

James Edward Gray II wrote:

···

On Nov 17, 2009, at 1:13 PM, Aldric Giacomoni wrote:

http://rubular.com/ is your best friend.

I didn't use any regular expressions. :wink:

James Edward Gray II

I'll be quiet now :slight_smile: My brain parsed it correctly and categorized it
incorrectly.
--
Posted via http://www.ruby-forum.com/\.

As I said in my last message, the leading ^ means NOT. Thus, I deleted all characters that are NOT letters, numbers, or underscores. That includes characters like #, $, and -. Your change added those characters to the list NOT to delete, so they would now be skipped.

James Edward Gray II

···

On Nov 17, 2009, at 1:49 PM, John Mcleod wrote:

So to delete or remove say additional special characters, you would add
additional characters to the range?

h.tr(" ", "_").delete("^a-zA-Z0-9_#$-")

Thanks for all the help.
After reading your posts and reading "FasterCSV" docs, I'm starting to
understand better.
This issue is solved.
Now on to another, thus a different post on a different FasterCSV
problem.

Thank you James.

···

--
Posted via http://www.ruby-forum.com/.