Comparing/search struct array

Szabolcs_Toth · 8 November 2010 12:36

Hi,
I have drafted this tiny script, which works almost(!!) fine except that
every item shows 5x more than it should.

I have one file with the report, where there is no customer name but
customer number is there. I have another file with both names and
numbers. So I put both file into array and try to find the relevant
fields.

if File.exists?("report.output")
  puts "We can work..."
  else
    puts "Copy the 'pascpgen.output'to the same folder, please"
    exit
  end

# define Class
class Results <
  Struct.new(:date, :custo, :invoice, :descr, :type, :money, :ref)
    def print_csv_record
      date.length==0 ? printf(",") : printf("%s ", date)
      custo.length==0 ? printf(",") : printf("%s ", custo)
      invoice.length==0 ? printf(",") : printf("%s ", invoice)
      descr.length==0 ? printf(",") : printf("%s ", descr)
      type.length==0 ? printf(",") : printf("%s ", type)
      money.length==0 ? printf(",") : printf("%s ", money)
      ref.length==0 ? printf(",") : printf("%s ", ref)
      printf("\n")
    end
end

class Query <
  Struct.new(:cuname, :custo)
    def print_csv_record2
      cuname.length==0 ? printf(",") : printf("%s", cuname)
    # custo.length==0 ? printf(",") : printf("\"%s\",", custo)
    # printf("\n")
    end
end

# create array
arr = Array.new
arr2 = Array.new

f = File.open("report.output", 'r').each {|line|
if line =~ /MST Revenue/

       # read every line into string
       str = line.chomp!
       str.length

       date = str[20, 5]
       custo = str[28,6]
       invoice = str[48,8]
       descr = str[59,28]
       type = str[112,4]
       money = str[123,11]
       ref = str [36,7]

r = Results.new

     r.date = date
     r.custo = custo
     r.invoice = invoice
     r.descr = descr
     r.type = type
     r.money = money
     r.ref = ref
     arr.push(r)
     end
    }

f2 = File.open("customers.txt", 'r').each {|line|
# read every line into string

    str = line.chomp!
    if str != nil
      str.length

cuname = str[3, 21]
custo = str[27,6]

q = Query.new

     q.cuname = cuname
     q.custo = custo
     arr2.push(q)
end
    }

  arr.each { |p| p[:custo]
    arr2.each { |q| q[:custo]
    if p[:custo] == q[:custo]

q.print_csv_record2
p.print_csv_record

  end
  }
  }

This script finds the matching fields but for some reason it writes 5x
everything. I believe the "arr.each ..." part is wrong. Any idea? Thanks
a lot! Szabolcs

···

--
Posted via http://www.ruby-forum.com/.

Josh_Cheek · 8 November 2010 17:14

Hi,
I have drafted this tiny script, which works almost(!!) fine except that
every item shows 5x more than it should.

Hi. I'm not really sure from looking at it. If you can give me an example of
the data you are reading so I could follow its execution, that would make it
a lot easier.

if File.exists?("report.output")
puts "We can work..."
else
puts "Copy the 'pascpgen.output'to the same folder, please"
exit
end

Here, I would say "exit 1", when you just call exit, it exits with a status
code of 0, which is considered to mean that the application executed
correctly. However, we are exiting because there is an error, so you
shouldn't have an exit status of 0.

# define Class
class Results <
Struct.new(:date, :custo, :invoice, :descr, :type, :money, :ref)
   def print_csv_record
     date.length==0 ? printf(",") : printf("%s ", date)
     custo.length==0 ? printf(",") : printf("%s ", custo)
     invoice.length==0 ? printf(",") : printf("%s ", invoice)
     descr.length==0 ? printf(",") : printf("%s ", descr)
     type.length==0 ? printf(",") : printf("%s ", type)
     money.length==0 ? printf(",") : printf("%s ", money)
     ref.length==0 ? printf(",") : printf("%s ", ref)
     printf("\n")
   end
end

Four things on this one.
1) I don't think the object should be responsible for printing itself. I
think it would be better that it returns the csv_record as a String, and
lets whoever is using it decide how they want to deal with it (whatever code
uses it decides to print)
2) Structs have #each defined on them, which just yields all the members.
This means that rather than manually checking each one for its length, you
could just use the Enumerable methods (in this case, probably map). So I
would probably rewrite it

def csv_record
map { |value| if value.empty? then ',' else "#{value} " end }.join
end

(note that if you prefer to use C style string patterns, there is also a
sprintf function)

And I might change the name of the method to be "to_csv"

3) The stdlib has a CSV library in it, you might check that out rather than
building your own. (http://ruby-doc.org/stdlib/\)

4) I would name the class singular, "Result" rather than "Results" because
each line in your file contains a Result instance, and the collection of
lines I look at as being the results. That is undoubtedly the influence of
Rails, but a few simple conventions like this can make it much easier to
follow and communicate about the code.

# create array

arr = Array.new
arr2 = Array.new

Here, I would name them based on what they contain, rather than what type
they are. So I would probably rename them to "results" and "queries", which
most Rubyists would immediately identify as a collection of Result and Query
instances. Then, at the bottom, when iterating over them, instead of naming
the elements p and q, I would name them "result" and "query" to indicate
that this is the current instance of the Result or Query class.

f = File.open("report.output", 'r').each {|line|

if line =~ /MST Revenue/

      # read every line into string
      str = line.chomp!
      str.length

      date = str[20, 5]
      custo = str[28,6]
      invoice = str[48,8]
      descr = str[59,28]
      type = str[112,4]
      money = str[123,11]
      ref = str [36,7]

    r = Results.new

    r.date = date
    r.custo = custo
    r.invoice = invoice
    r.descr = descr
    r.type = type
    r.money = money
    r.ref = ref
    arr.push(r)
    end
   }

3 things on this one

1) You use File.open, then iterate on the result there. Unfortunately, that
file object you are iterating over, nothing ever happens with it, so your
file doesn't get closed. Rather than calling #each on the returned object I
would receive the object in a block, then call each on it in there.

Tricky to find this one in the docs, its actually under IO rather than File
class IO - RDoc Documentation "If the optional code block
is given, it will be passed *io* as an argument, and the
IO<http://ruby-doc.org/core/classes/IO.html>object will automatically
be closed when the block terminates." So you can
let File close the file for you (this is considered a best practice, since
it does certain things to ensure this will happen, even if an error is
raised)

File.open 'report.output' do |file|
  file.each do |line|
    ...
  end
end

2) I'm sure plenty of people will disagree with me here, but rather than
putting the entire contents nested under an if statement, I strongly prefer
to have the if statement just jump to the next iteration. Then everything is
less indented, which for me, makes it much easier to read and understand.

So rather than
if line =~ /MST Revenue/
...
end

I would do
next unless line =~ /MST Revenue/

3) The constructor for a struct takes the members you created it with. So
you don't have to individually assign each member.

r = Results.new
r.date = date
r.custo = custo
r.invoice = invoice
r.descr = descr
r.type = type
r.money = money
r.ref = ref

could be rewritten as

r = Results.new date , custo , invoice , descr , type , money , ref

arr.each { |p| p[:custo]
   arr2.each { |q| q[:custo]
   if p[:custo] == q[:custo]

     q.print_csv_record2
     p.print_csv_record

end
}
}

Generally, people use do ... end for multiline blocks, and reserver { ... }
for single-line blocks. Not a hard rule, but is easier for most people

Here, I realize that if you are coming from a C background, where there is
only the top level namespace, then you might be naming these methods
differently to avoid a method name conflict. In Ruby, you can namespace
methods by whatever contains them. In this case, the Result and Query
classes. That means that you can define "print_csv_record" in both classes,
and when you call that method on the given object, it will know about its
method and not the other's method, so it will be correct.

This script finds the matching fields but for some reason it writes 5x
everything. I believe the "arr.each ..." part is wrong. Any idea? Thanks
a lot! Szabolcs

I don't see anything conspicuous. Are you sure it isn't in your data? (ie 5
months of reports)

···

2010/11/8 Szabolcs TÃ³th <tsz@purzelbaum.hu>

Szabolcs_Toth · 9 November 2010 07:08

Thanks for the comments. I am working on your suggestions.

I figured out that there were repeating lines in the customers.txt that
was the reason why some of the lines were duplicated in the result as
well. So, just modified the customers.txt file read like this:

lastline =""
f2 = File.open("customers.txt", 'r').each {|line|
# read every line into string
  str = line.chomp!
    if str != nil
      str.length

      cuname = str[3, 21]
      custo = str[27,6]
        if lastline != cuname
            q = Query.new

            q.cuname = cuname
            q.custo = custo
            arr2.push(q)
        end
          lastline =cuname
      end
    }

Now, it is fine. But, more and more I am working on this code, my
feeling is that I started in a very wrong direction and made my life and
the code too complicated.

Thank you again!

···

--
Posted via http://www.ruby-forum.com/.

Jesus_Gabriel_y_Gala · 8 November 2010 17:23

If you are not doing anything else with the file apart from reading
the lines you could also do:

File.foreach("report.output") do |line|
...
end

Jesus.

···

On Mon, Nov 8, 2010 at 6:14 PM, Josh Cheek <josh.cheek@gmail.com> wrote:

2010/11/8 Szabolcs TÃ³th <tsz@purzelbaum.hu>
f = File.open("report.output", 'r').each {|line|

if line =~ /MST Revenue/

Tricky to find this one in the docs, its actually under IO rather than File
class IO - RDoc Documentation "If the optional code block
is given, it will be passed *io* as an argument, and the
IO<http://ruby-doc.org/core/classes/IO.html>object will automatically
be closed when the block terminates." So you can
let File close the file for you (this is considered a best practice, since
it does certain things to ensure this will happen, even if an error is
raised)

File.open 'report.output' do |file|
file.each do |line|
...
end
end

Josh_Cheek · 8 November 2010 17:36

Thanks, I was looking for that, but couldn't remember what it was called I
found each_line in the docs, and thought that's what I had remembered.

···

2010/11/8 Jesús Gabriel y Galán <jgabrielygalan@gmail.com>

On Mon, Nov 8, 2010 at 6:14 PM, Josh Cheek <josh.cheek@gmail.com> wrote:
> 2010/11/8 Szabolcs TÃ³th <tsz@purzelbaum.hu>
> f = File.open("report.output", 'r').each {|line|
>> if line =~ /MST Revenue/
>>
>>
> Tricky to find this one in the docs, its actually under IO rather than
File
> class IO - RDoc Documentation "If the optional code
block
> is given, it will be passed *io* as an argument, and the
> IO<http://ruby-doc.org/core/classes/IO.html>object will automatically
> be closed when the block terminates." So you can
> let File close the file for you (this is considered a best practice,
since
> it does certain things to ensure this will happen, even if an error is
> raised)
>
> File.open 'report.output' do |file|
> file.each do |line|
> ...
> end
> end

If you are not doing anything else with the file apart from reading
the lines you could also do:

File.foreach("report.output") do |line|
...
end

Jesus.

Topic		Replies	Views
Is there a combination of a struct and an array? I wanna iterate over all created objects from a certain struct-class (I guess) ruby-talk	7	69	29 September 2007
Obtaining specific .csv data (one data at a time) ruby-talk	6	126	2 February 2011
Advice requested on array diff ruby-talk	1	168	19 December 2004
Compare 2 "arrays" ruby-talk	2	83	5 November 2007
Comparing two arrays, too slow ruby-talk	5	129	9 August 2010

Comparing/search struct array

Related topics