Changing the format of a text file

Hello everyone,

I am new to Ruby and I'm having some problems trying to reformat a text
file.

Basically, I have a large log file, around 200 MB, in the
following format:

----------------------------------------------------------
1000000 name
Status :A
Basetype :2
Version :1.0
   >
   >
(more
  fields)
   >
Name :/file/name/etc
1000001 name
Status :B
Basetype :2
Version :a20
   >
   >
Name :/file/name/etc
1000002 name
Status :C
   >

... and so on

So for each 200 MB file there are a lot of entries.

What I want to do is open the file, read the data into an array,
reformat the text, and save it into another file with the following
output:

id, Status, Basetype, .... , Name
1000000, A, 2, ..... , /file/name/etc
1000001, B, 2, ..... , /file/name/etc

I tried to write a script in Ruby to do that task, but I don't get any
output so far.

def getfile(file_name)
entry = []
IO.foreach(file_name) do |fl|
  if fl.include? 'name'
   entry.push fl.scan(/\d+/)[0]
  elsif fl.strip =~ /\A\d/
  end
end
entry
end

def writefile(file, *linedata)
linedata.each do |line|
  file << line.join(", ") +\n"
  end
end

def readfile(file, outputfile)
out = File.new(outputfile, "w+")
info = []

wline = ['id', 'Status', 'Basetype', .... 'Name']

IO.foreach(file) { |line|

if line =~ //
  wline[0]= line.scan(/\d+/)
elsif line =~ /Status/
  wline[1]= line.split(":")[1].scan(/[a-zA-Z]+/).join("")
elsif line =~ /Basetype/
  wline[2]= line.split(":")[1].scan(/\d+/).join("")
    >
    >
    >
wline all fields
    >
writefile(out, wline)
end
out.close
end

readfile('filename', 'outputfile')

This is what I've done so far. Can someone tell me what's wrong? I don't
get any output at all.

Thanks in advance
--
Posted via http://www.ruby-forum.com/.

I would strongly recommend looking at Treetop (http://treetop.rubyforge.org/).
It's a parser generator that produces tree structures from text files using
a grammar that you specify. If you know regular expressions, it shouldn't be
too big a leap to use Treetop's grammar language.

For this particular task it may be overkill, but certainly worth looking at.

2009/2/25 Bary Buz <sxetikos@hotmail.co.uk>

Hello everyone,

I am new to Ruby and I'm having some problems trying to reformat a text
file.

Basically, I have a large log file, around 200 MB, in the
following format:
----------------------------------------------------------
1000000 name
Status :A
Basetype :2
Version :1.0
  >
  >
(more
fields)
  >
Name :/file/name/etc
1000001 name
Status :B
Basetype :2
Version :a20
  >
  >
Name :/file/name/etc
1000002 name
Status :C
  >

... and so on

So for each 200 MB file there are a lot of entries.

What I want to do is open the file, read the data into an array,
reformat the text, and save it into another file with the following
output:

id, Status, Basetype, .... , Name
1000000, A, 2, ..... , /file/name/etc
1000001, B, 2, ..... , /file/name/etc

Hello everyone,

Hello and welcome.

I am new to Ruby and I'm having some problems trying to reformat a text
file.

Basically, I have a large log file, around 200 MB, in the
following format:
----------------------------------------------------------
1000000 name
Status :A
Basetype :2
Version :1.0

id, Status, Basetype, .... , Name
1000000, A, 2, ..... , /file/name/etc
1000001, B, 2, ..... , /file/name/etc

So do you just read through the log file, updating variables that hold Status, Basetype, Version, and Name, and then spit out a new entry each time you run across a number?

I tried to write a script in Ruby to do that task, but I don't get any
output so far.

I'll try to give some feedback…

def getfile(file_name)
entry = []
IO.foreach(file_name) do |fl|
if fl.include? 'name'
  entry.push fl.scan(/\d+/)[0]
elsif fl.strip =~ /\A\d/
end
end
entry
end

I don't see this method used anywhere in the code.

def writefile(file, *linedata)
linedata.each do |line|
file << line.join(", ") +\n"

You are missing a quote there. It should be:

   … + "\n"

end
end

def readfile(file, outputfile)
out = File.new(outputfile, "w+")
info = []

wline = ['id', 'Status', 'Basetype', .... 'Name']

IO.foreach(file) { |line|

if line =~ //

Don't do that. It doesn't do what you think it does. :)

What are you looking for here? A line that starts with a digit? If so, use this:

   if line =~ /\A\s*(\d+)/
     # the digit is in the $1 variable here...

wline[0]= line.scan(/\d+/)
elsif line =~ /Status/
wline[1]= line.split(":")[1].scan(/[a-zA-Z]+/).join("")

The above two lines can be simplified to:

   elsif line =~ /\A\s*Status\s*:\s*([a-zA-Z]+)/
     wline[1] = $1

The other assignments could be handled in a similar way.
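
As a rough sketch of "a similar way": one pattern can capture both the field name and its value, so each field does not need its own branch. The helper name parse_field and the sample lines below are made up for illustration, and the pattern assumes every field line looks like "Key :value".

```ruby
# Assumed line shape: optional spaces, a name made of letters, a colon, a value.
FIELD = /\A\s*([a-zA-Z]+)\s*:\s*(\S+)/

# Hypothetical helper: returns [key, value] for a field line, or nil otherwise.
def parse_field(line)
  match = FIELD.match(line)
  match && [match[1], match[2]]
end

wline = {}
sample = ["Status :A\n", "Basetype :2\n", "Name :/file/name/etc\n", "1000000 name\n"]
sample.each do |line|
  key, value = parse_field(line)
  wline[key] = value if key   # the id line does not match and is skipped here
end

p wline
```

Storing the values in a Hash keyed by field name also avoids having to remember which array index belongs to which field.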

elsif line =~ /Basetype/
wline[2]= line.split(":")[1].scan(/\d+/).join("")
   >
wline all fields
   >
writefile(out, wline)
end
out.close
end

readfile('filename', 'outputfile')

This is what I've done so far. Can someone tell me what's wrong? I don't
get any output at all.

It's not real easy for me to tell why you don't see output. It looks like output might only happen in that last elsif. If that's the case, you won't see output unless the code makes it there. I'm guessing it's not. Maybe because of the line =~ // condition, which is problematic.
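
To illustrate that guess: in Ruby an empty pattern matches any string at index 0, and 0 is truthy, so the first branch of an if/elsif chain like this one would win for every line and the later branches would never run. A small check, where branch_for is a hypothetical helper that only mirrors the shape of the original chain and the sample lines are made up:

```ruby
# Hypothetical helper mirroring the shape of the original if/elsif chain.
def branch_for(line)
  if line =~ //              # empty pattern: matches any string at index 0
    :id
  elsif line =~ /Status/
    :status
  else
    :other
  end
end

lines = ["1000000 name\n", "Status :A\n", "Basetype :2\n"]
lines.each { |line| puts "#{line.chomp.inspect} -> #{branch_for(line)}" }
```

Every line lands in the first branch, which would be consistent with the writefile call never being reached.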

I believe the code below does something like what you want. I hope it can be adapted to your needs.

James Edward Gray II

#!/usr/bin/env ruby -wKU

fields = ["id"]
fields_written = false
entry = { }

DATA.each do |line|
   case line
   when /\A\s*(\d+)/
     unless entry.empty?
       unless fields_written
         puts fields.join(", ")
         fields_written = true
       end
       puts fields.map { |f| entry[f] }.join(", ")
       entry.clear
     end
     entry["id"] = $1
   when /\A\s*([a-zA-Z]+)\s*:\s*(\S+)/
     fields << $1 unless fields.include? $1
     entry[$1] = $2
   end
end

__END__
1000000 name
Status :A
Basetype :2
Version :1.0
Name :/file/name/etc
1000001 name
Status :B
Basetype :2
Version :a20
Name :/file/name/etc
1000002 name
Status :C
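
One thing to watch when adapting the script above: as far as I can tell it only prints an entry when the next id line shows up, so the final entry (1000002 in the sample data) would never be printed. Below is a minimal sketch of the same approach that reads from a file, writes to a file, and flushes once more after the loop. The names convert, input_path and output_path are hypothetical, and it keeps the same assumption that the first entry contains every field that should appear in the header.

```ruby
# Sketch only: convert, input_path and output_path are made-up names.
def convert(input_path, output_path)
  fields = ["id"]
  header_written = false
  entry = {}

  File.open(output_path, "w") do |out|
    # Writes the buffered entry, preceded by the header the first time.
    flush = lambda do
      next if entry.empty?
      unless header_written
        out.puts fields.join(", ")
        header_written = true
      end
      out.puts fields.map { |f| entry[f] }.join(", ")
      entry.clear
    end

    # File.foreach reads one line at a time, so a 200 MB file is not
    # loaded into memory all at once.
    File.foreach(input_path) do |line|
      case line
      when /\A\s*(\d+)/
        id = $1
        flush.call              # a new id line ends the previous entry
        entry["id"] = id
      when /\A\s*([a-zA-Z]+)\s*:\s*(\S+)/
        fields << $1 unless fields.include?($1)
        entry[$1] = $2
      end
    end

    flush.call                  # the last entry has no following id line
  end
end
```

It would be called along the lines of convert('filename', 'outputfile'), matching the readfile call in the original post.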

On Feb 25, 2009, at 5:30 AM, Bary Buz wrote: