Hi,
I have drafted this tiny script, which works almost(!!) fine except that
every item shows 5x more than it should.
Hi. I'm not really sure from looking at it. If you can give me an example of
the data you are reading so I could follow its execution, that would make it
a lot easier.
if File.exists?("report.output")
puts "We can work..."
else
puts "Copy the 'pascpgen.output'to the same folder, please"
exit
end
Here, I would say "exit 1", when you just call exit, it exits with a status
code of 0, which is considered to mean that the application executed
correctly. However, we are exiting because there is an error, so you
shouldn't have an exit status of 0.
# define Class
class Results <
Struct.new(:date, :custo, :invoice, :descr, :type, :money, :ref)
def print_csv_record
date.length==0 ? printf(",") : printf("%s ", date)
custo.length==0 ? printf(",") : printf("%s ", custo)
invoice.length==0 ? printf(",") : printf("%s ", invoice)
descr.length==0 ? printf(",") : printf("%s ", descr)
type.length==0 ? printf(",") : printf("%s ", type)
money.length==0 ? printf(",") : printf("%s ", money)
ref.length==0 ? printf(",") : printf("%s ", ref)
printf("\n")
end
end
Four things on this one.
1) I don't think the object should be responsible for printing itself. I
think it would be better that it returns the csv_record as a String, and
lets whoever is using it decide how they want to deal with it (whatever code
uses it decides to print)
2) Structs have #each defined on them, which just yields all the members.
This means that rather than manually checking each one for its length, you
could just use the Enumerable methods (in this case, probably map). So I
would probably rewrite it
def csv_record
map { |value| if value.empty? then ',' else "#{value} " end }.join
end
(note that if you prefer to use C style string patterns, there is also a
sprintf function)
And I might change the name of the method to be "to_csv"
3) The stdlib has a CSV library in it, you might check that out rather than
building your own. (http://ruby-doc.org/stdlib/\)
4) I would name the class singular, "Result" rather than "Results" because
each line in your file contains a Result instance, and the collection of
lines I look at as being the results. That is undoubtedly the influence of
Rails, but a few simple conventions like this can make it much easier to
follow and communicate about the code.
# create array
arr = Array.new
arr2 = Array.new
Here, I would name them based on what they contain, rather than what type
they are. So I would probably rename them to "results" and "queries", which
most Rubyists would immediately identify as a collection of Result and Query
instances. Then, at the bottom, when iterating over them, instead of naming
the elements p and q, I would name them "result" and "query" to indicate
that this is the current instance of the Result or Query class.
f = File.open("report.output", 'r').each {|line|
if line =~ /MST Revenue/
# read every line into string
str = line.chomp!
str.length
date = str[20, 5]
custo = str[28,6]
invoice = str[48,8]
descr = str[59,28]
type = str[112,4]
money = str[123,11]
ref = str [36,7]
r = Results.new
r.date = date
r.custo = custo
r.invoice = invoice
r.descr = descr
r.type = type
r.money = money
r.ref = ref
arr.push(r)
end
}
3 things on this one
1) You use File.open, then iterate on the result there. Unfortunately, that
file object you are iterating over, nothing ever happens with it, so your
file doesn't get closed. Rather than calling #each on the returned object I
would receive the object in a block, then call each on it in there.
Tricky to find this one in the docs, its actually under IO rather than File
class IO - RDoc Documentation "If the optional code block
is given, it will be passed *io* as an argument, and the
IO<http://ruby-doc.org/core/classes/IO.html>object will automatically
be closed when the block terminates." So you can
let File close the file for you (this is considered a best practice, since
it does certain things to ensure this will happen, even if an error is
raised)
File.open 'report.output' do |file|
file.each do |line|
...
end
end
2) I'm sure plenty of people will disagree with me here, but rather than
putting the entire contents nested under an if statement, I strongly prefer
to have the if statement just jump to the next iteration. Then everything is
less indented, which for me, makes it much easier to read and understand.
So rather than
if line =~ /MST Revenue/
...
end
I would do
next unless line =~ /MST Revenue/
3) The constructor for a struct takes the members you created it with. So
you don't have to individually assign each member.
r = Results.new
r.date = date
r.custo = custo
r.invoice = invoice
r.descr = descr
r.type = type
r.money = money
r.ref = ref
could be rewritten as
r = Results.new date , custo , invoice , descr , type , money , ref
arr.each { |p| p[:custo]
arr2.each { |q| q[:custo]
if p[:custo] == q[:custo]
q.print_csv_record2
p.print_csv_record
end
}
}
Generally, people use do ... end for multiline blocks, and reserver { ... }
for single-line blocks. Not a hard rule, but is easier for most people
Here, I realize that if you are coming from a C background, where there is
only the top level namespace, then you might be naming these methods
differently to avoid a method name conflict. In Ruby, you can namespace
methods by whatever contains them. In this case, the Result and Query
classes. That means that you can define "print_csv_record" in both classes,
and when you call that method on the given object, it will know about its
method and not the other's method, so it will be correct.
This script finds the matching fields but for some reason it writes 5x
everything. I believe the "arr.each ..." part is wrong. Any idea? Thanks
a lot! Szabolcs
I don't see anything conspicuous. Are you sure it isn't in your data? (ie 5
months of reports)
···
2010/11/8 Szabolcs Tóth <tsz@purzelbaum.hu>