Optimizing iterating over an array

Hi!

I’ve got a method which checks an array of arrays containing CSV data coming from CSV::read and returns the (corrected) array:

···

~~~
def check_data(csv_data)
  ...
  csv_data.each { |row| LOGGER.info("Record not processed due to empty field(s): #{row}") if row[:field1].nil? }
  csv_data.delete_if { |row| row[:field1].nil? }
end
~~~

It works but I don’t like the last two statements iterating two times over the whole array for logging and deleting rows where the first field contains nil values. This should be easily done with one iteration and that’s what I’ve come up with:

~~~
  csv_data.each_with_index do |row, index|
    if row[:field1].nil?
      LOGGER.info("Record nbr. #{index} not processed due to empty field(s): #{row}")
      csv_data.delete(index)
    end
  end
~~~

It does what it’s supposed to be doing but it’s 6 lines of code instead of two and it doesn’t look very »Ruby-ish« to me. Is there any better / more elegant way to log and delete entries in one iteration from an array (of arrays) with CSV data?

Any hints more than welcome!

Cheers,
Michael

Off the top of my head (and thus untried or tested): from what I can see in
the documentation
<Class: Logger (Ruby 2.4.0);
Logger#info
always returns true, so this should work:

···

On 10 January 2017 at 15:59, Michael Schwarze <michael@schwarze-web.de> wrote:

Hi!

I’ve got a method which checks an array of arrays containing CSV data
coming from CSV::read and returns the (corrected) array:

~~~
​​
def check_data(csv_data)
  ...
  csv_data.each { |row| LOGGER.info("Record not processed due to empty
field(s): #{row}") if row[:field1].nil? }
  csv_data.delete_if { |row| row[:field1].nil? }
end
~~~

It works but I don’t like the last two statements iterating two times over
the whole array for logging and deleting rows where the first field
contains nil values. This should be easily done with one iteration and
that’s what I’ve come up with:

​​
~~~
  csv_data.each_with_index do |row, index|
    if row[:field1].nil?
      LOGGER.info("Record nbr. #{index} not processed due to empty
field(s): #{row}")
      csv_data.delete(index)
    end
  end
~~~

It does what it’s supposed to be doing but it’s 6 lines of code instead of
two and it doesn’t look very »Ruby-ish« to me. Is there any better / more
elegant way to log and delete entries in one iteration from an array (of
arrays) with CSV data?

Any hints more than welcome!

Cheers,
Michael

~~~
​def check_data(csv_data)
  ...
  csv_data.delete_if do |row|
    row[:field1].nil? && LOGGER.info("Record not processed due to empty
field(s): #{row}")
  end
end
~~~

Although that's not very pretty, and depends on some not-particularly-clear
behaviour. There are other ways to combine the statements, too. You could
even rewrite your six-liner to use delete_if:

​~~~
  csv_data.delete_if do |row|
    if row[:field1].nil?
      LOGGER.info("Record nbr. #{index} not processed due to empty
field(s): #{row}")
      true # unnecessary, but clearly signals the intention
    end
  end
~~~

Cheers
--
  Matthew Kerwin
  http://matthew.kerwin.net.au/

I would go with that solution because it does not only clearly signal
the intention - it's also independent from changes in logger method's
return value and thus more robust.

Here's a bit odd other method:

csv_data.delete_if do |row|
  row[:field1].nil?.tap |x|
    LOGGER.info("Record nbr. #{index} not processed due to empty
field(s): #{row.inspect}" if x
  end
end

Which makes me wonder: is there a Smalltalk like method that invokes
the block only if the value is trueish, e.g.

value.if_true do
  puts "only if not false and not nil"
end

That would come handy here if it also returned the value, e.g.

class Object
  def if_true
    yield if self
    self
  end
end

Kind regards

robert

···

On Tue, Jan 10, 2017 at 7:41 AM, Matthew Kerwin <matthew@kerwin.net.au> wrote:

~~~
  csv_data.delete_if do |row|
    if row[:field1].nil?
      LOGGER.info("Record nbr. #{index} not processed due to empty field(s):
#{row}")
      true # unnecessary, but clearly signals the intention
    end
  end
~~~

--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/

Hi,

Many thanks, Matthew & Robert!

~~~
csv_data.delete_if do |row|
   if row[:field1].nil?
     LOGGER.info("Record nbr. #{index} not processed due to empty field(s): #{row}")
     true # unnecessary, but clearly signals the intention
   end
end
~~~

I would go with that solution because it does not only clearly signal the intention - it’s also independent from changes in logger method’s return value and thus more robust.

I like that one, too; definitely more concise than my first approach.

Here's a bit odd other method:

csv_data.delete_if do |row|
row[:field1].nil?.tap |x|
   LOGGER.info("Record nbr. #{index} not processed due to empty field(s): #{row.inspect}" if x
end
end

@Robert: #tap is an interesting direction here and I haven’t used it before. The documentation [1] states: »The primary purpose of this method is to “tap into” a method chain, in order to perform operations on intermediate results within the chain.« So it’s not that odd at all and does fit my purpose, too. And thanks for your hint to #inspect: my log has definitely improved by this!

Cheers,
Michael

[1] Class: Object (Ruby 2.4.0)

Mit freundlichem Gruß
Michael Schwarze

···

Am 10.01.2017 um 09:46 schrieb Robert Klemme <shortcutter@googlemail.com>:
On Tue, Jan 10, 2017 at 7:41 AM, Matthew Kerwin <matthew@kerwin.net.au> wrote:

@Robert: #tap is an interesting direction here and I haven’t used it before. The documentation [1] states: »The primary purpose of this method is to “tap into” a method chain, in order to perform operations on intermediate results within the chain.« So it’s not that odd at all and does fit my purpose, too.

Yes, that is true. I just found the necessary "if" a bit ugly. But
then, it clearly states the intent.

And thanks for your hint to #inspect: my log has definitely improved by this!

You are welcome!

Kind regards

robert

···

On Tue, Jan 10, 2017 at 9:30 PM, Michael Schwarze <michael@schwarze-web.de> wrote:

--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/