Can someone point me in the right direction? Pipe Delimited Files & Hashes

Hello all,

I have a file that is not a normal csv or tab delimited file. It is delimited with the pipe | character.

I Googled and found someone had posted how to parse it... That's all fine and well, but now I need to figure out how I'm going to either (a) import it into a MySQL database or (b) put it in some sort of container that will let me access each field (like an array or a hash).

If I go through the hash, I'll probably need to assign each field in the file a key, and then populate it. I'm having a hard time figuring this out.

If I go the route of MySQL I know that I'll probably end up using ActiveRecord.

This is what I have so far. I got the concept from someone's blog and then modified it a bit to see what it was doing and what it would do in relation to my file. What this does, obviously, is put out the info, with a new line between each field and an extra linefeed between each row (each group - boy, am I articulate today or WHAT?!).

File.open("i put the filename here").each do |record|
  record.split("|").each do |field|
    field.chomp!
    puts field
  end
end

So, what I want it to do, is say I have the following fields in the | delimited file:

category | subcategory | description

How do I make that stuff into a hash? I should probably start out small by putting it into a hash first, and then figure out how to deal with it in MySQL.

If someone could point me in the right direction, of possible libraries that would help or the such, I'd love to go read there and study on it and try to figure it out. Not asking for answers, just asking for resources. :slight_smile:

Thanks,
Samantha

http://www.babygeek.org/

"Beware when the great God lets loose a thinker on this planet. Then all things are at risk."
  --Ralph Waldo Emerson

You pretty much got it already. Just add this:

FIELDS = [:category, :subcategory, :description]
db = []

File.foreach("in") do |line|
   line.chomp!
   rec = {}
   FIELDS.zip(line.split("|")) do |name, val|
     rec[name]=val if val
   end
   db << rec
end

# work with db

(untested)
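
For anyone who wants to try the idea without a file on disk, here is a minimal sketch of the same zip approach. The method name `parse_pipe_records` and the sample data are made up for illustration. Unlike the snippet above, it keeps missing fields as nil instead of skipping them, and it strips whitespace around each value.

```ruby
require "stringio"

FIELDS = [:category, :subcategory, :description].freeze

# Turn each pipe-delimited line into a hash keyed by FIELDS.
# `io` is anything that responds to each_line (a File, a StringIO, ...).
# parse_pipe_records is a hypothetical helper name, not a library call.
def parse_pipe_records(io)
  io.each_line.map do |line|
    # The -1 limit keeps trailing empty fields instead of dropping them.
    values = line.chomp.split("|", -1).map(&:strip)
    # Pair each field name with its value; missing values become nil.
    FIELDS.zip(values).to_h
  end
end

sample = StringIO.new("Books|Programming|A fun book about Ruby\nBooks|Cooking|\n")
db = parse_pipe_records(sample)
db.each { |rec| puts rec.inspect }
```

With a real file, `File.open("in") { |f| parse_pipe_records(f) }` should behave the same way, where "in" is the placeholder filename used above.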

Kind regards

  robert

···

On 07.03.2007 16:33, Samantha wrote:

This is what I have so far that I found from someone's blog (I got the concept from someone's blog...

File.open("i put the filename here").each do |record|
record.split("|").each do |field|
   field.chomp!
   puts field
end
end

So, what I want it to do, is say I have the following fields in the | delimited file:

category | subcategory | description

How do I make that stuff into a hash? I should probably start out small by putting it into a hash first, and then figure out how to deal with it in MySQL.

Let me try giving you this little hint and see if it's enough:

>> cat, sub, des = "Books|Programming|A fun book about Ruby".split("|")
=> ["Books", "Programming", "A fun book about Ruby"]
>> cat
=> "Books"
>> sub
=> "Programming"
>> des
=> "A fun book about Ruby"

Don't be shy if you need more help!
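
One possible next step from that hint, sketched under the assumption that every line has exactly the three fields in this order. The method name `record_from` is invented for this example.

```ruby
# Split one line with parallel assignment, then build a hash by hand.
# record_from is a hypothetical helper name.
def record_from(line)
  cat, sub, des = line.chomp.split("|")
  { category: cat, subcategory: sub, description: des }
end

rec = record_from("Books|Programming|A fun book about Ruby\n")
puts rec[:subcategory]  # prints Programming
```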

If someone could point me in the right direction, of possible libraries that would help or the such, I'd love to go read there and study on it and try to figure it out. Not asking for answers, just asking for resources. :slight_smile:

The standard CSV library will do the parsing for you, but this case looks very simple so you probably don't need it. However, if the first row of the file has the field names, it might be worth looking at FasterCSV which will build the Hashes for you:

http://rubyforge.org/projects/fastercsv/
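
A note for later readers: since Ruby 1.9, FasterCSV has been the standard `csv` library, so on a current Ruby the same thing can probably be done with no extra gem. A minimal sketch, assuming the file has no header row and the three field names from the question. `parse_with_csv` and `HEADERS` are names made up for this example.

```ruby
require "csv"

HEADERS = %w[category subcategory description].freeze

# Parse pipe-delimited text into an array of hashes.
# Passing an array as :headers supplies the names, so the first
# line of the text is treated as data rather than as a header row.
def parse_with_csv(text)
  CSV.parse(text, col_sep: "|", headers: HEADERS).map(&:to_h)
end

rows = parse_with_csv("Book|Programming|Good Stuff.\nBook|Home Improvement|Sounds like work.\n")
rows.each { |row| puts row["subcategory"] }
```

The keys here are strings because the header names are strings. Fields that contain double quotes may need extra CSV options, which this sketch does not cover.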

Hope that helps.

James Edward Gray II

···

On Mar 7, 2007, at 9:33 AM, Samantha wrote:

Do the hash first, since that will give you a good feel for how things
work, but if you put this into actual production, I'd consider using
the ArrayFields library:

http://www.codeforpeople.com/lib/ruby/arrayfields/arrayfields-3.6.0/README

It's a useful tool to have in your kit
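
This is not the ArrayFields API, but for a rough feel of what "an array with named fields" looks like, Ruby's built-in Struct gives something similar in spirit. A sketch under that assumption; `Record` and `record_for` are names invented here.

```ruby
# A Struct gives each record both positional and named access.
Record = Struct.new(:category, :subcategory, :description)

# record_for is a hypothetical helper name.
def record_for(line)
  # Limit the split to 3 pieces so a stray "|" inside the
  # description does not produce more values than the Struct has members.
  Record.new(*line.chomp.split("|", 3))
end

rec = record_for("Books|Programming|A fun book about Ruby\n")
puts rec.category  # prints Books
puts rec[1]        # prints Programming
```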

martin

···

On 3/7/07, Samantha <rubygeekgirl@gmail.com> wrote:

If I go through the hash, I'll probably need to assign each field in the
file a key, and then populate it. I'm having a hard time figuring this out.

James Edward Gray II wrote:

Let me try giving you this little hint and see if it's enough:

>> cat, sub, des = "Books|Programming|A fun book about Ruby".split("|")
=> ["Books", "Programming", "A fun book about Ruby"]
>> cat
=> "Books"
>> sub
=> "Programming"
>> des
=> "A fun book about Ruby"

Don't be shy if you need more help!

Thank you! I really appreciate it. :slight_smile: Again, I'll have to dig through and figure things out. Apparently I need another cup of coffee this morning for my synapses to fire properly.

If someone could point me in the right direction, of possible libraries that would help or the such, I'd love to go read there and study on it and try to figure it out. Not asking for answers, just asking for resources. :slight_smile:

The standard CSV library will do the parsing for you, but this case looks very simple so you probably don't need it. However, if the first row of the file has the field names, it might be worth looking at FasterCSV which will build the Hashes for you:

Thanks! The first row does not have field names, however, I'm sure I could add them.

Thanks again, James - this community is always so helpful!

···

http://rubyforge.org/projects/fastercsv/

Hope that helps.

James Edward Gray II

--
Samantha

http://www.babygeek.org/

"Beware when the great God lets loose a thinker on this planet. Then all things are at risk."
  --Ralph Waldo Emerson

Martin DeMello wrote:

···

On 3/7/07, Samantha <rubygeekgirl@gmail.com> wrote:

If I go through the hash, I'll probably need to assign each field in the
file a key, and then populate it. I'm having a hard time figuring this out.

Do the hash first, since that will give you a good feel for how things
work, but if you put this into actual production, I'd consider using
the ArrayFields library:

http://www.codeforpeople.com/lib/ruby/arrayfields/arrayfields-3.6.0/README

It's a useful tool to have in your kit

martin

Thanks, Martin! I will check that out.

--
Samantha

http://www.babygeek.org/

"Beware when the great God lets loose a thinker on this planet. Then all
things are at risk."
   --Ralph Waldo Emerson

Robert Klemme wrote:

You pretty much got it already. Just add this:

FIELDS = [:category, :subcategory, :description]
db = []

File.foreach("in") do |line|
  line.chomp!
  rec = {}
  FIELDS.zip(line.split("|")) do |name, val|
    rec[name]=val if val
  end
  db << rec
end

# work with db

(untested)

Kind regards

    robert

Thank you, Robert! I'm going to need to figure out what each thing does. :slight_smile:

I really want to make sure I grok everything. Again, thanks!

--
Samantha

http://www.babygeek.org/

"Beware when the great God lets loose a thinker on this planet. Then all things are at risk."
  --Ralph Waldo Emerson

James Edward Gray II wrote:

If someone could point me in the right direction, of possible libraries that would help or the such, I'd love to go read there and study on it and try to figure it out. Not asking for answers, just asking for resources. :slight_smile:

The standard CSV library will do the parsing for you, but this case looks very simple so you probably don't need it. However, if the first row of the file has the field names, it might be worth looking at FasterCSV which will build the Hashes for you:

Thanks! The first row does not have field names, however, I'm sure I could add them.

Actually, FasterCSV plans for that too. I should have said that. Here's a taste:

>> require "rubygems"
=> false
>> require "faster_csv"
=> true
>> csv = <<END_CSV
Book|Programming|Good Stuff.
Book|Home Improvement|Sounds like work.
END_CSV
=> "Book|Programming|Good Stuff.\nBook|Home Improvement|Sounds like work.\n"
>> FCSV.parse(csv, :col_sep => "|", :headers => [:cat, :sub, :des]) do |row|
?> p row.to_hash
>> end
{:sub=>"Programming", :cat=>"Book", :des=>"Good Stuff."}
{:sub=>"Home Improvement", :cat=>"Book", :des=>"Sounds like work."}
=> nil

Hope that helps.

James Edward Gray II

···

On Mar 7, 2007, at 10:09 AM, Samantha wrote: