Select "columns" from multidimensional array?

Hah, I wrote that head exploding comment first and then managed to work
out what it did afterwards. Still took a few minutes of smashing my head
into the desk to make room for the new thought though ;¬)

LOL

Using a Hash sounds like a good idea. I already tried rewriting the
selector into something a bit more excel-like (although I won't bore you
with all the little changes):

Good!

  def ( addr )
    col, row = addr.upcase.scan( /([A-Z]+)(\d+)/ ).flatten
    data[ index( row, col ) ]
  end

  def =( addr, val )
    col, row = addr.upcase.scan( /([A-Z]+)(\d+)/ ).flatten
    data[ index( row, col ) ] = val
  end

m = Matrix.new(%w{A B C}, %w{1 2 3 4})

I would probably not initialize the matrix then. Excel is dynamic as
well. So I'd start with a blank slate and only remember a max value
for row and column.

m["A1"] = 123
m["B4"] = 123

I haven't gotten around to changing all the "row, col" to "col, row"
references, so it looks a bit weird, but I'm just experimenting with
options at the moment. I'll have a go at Hashing it up as well.

Good. With Ruby, these types of experiments are so much fun. :slight_smile:

Naturally I have many questions floating around in my head, but I'll try
to work them out through the scientific method of repeated failed
attempts :slight_smile:

Actually that's probably the best method to learn: we learn much more
through our failures than from our successes.

Kind regards

robert

···

On Fri, Feb 15, 2013 at 3:55 PM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

I've attached my attempt at converting your code to suit mine (hope you
don't mind the plagarism :stuck_out_tongue: )

No, it's not a doctoral thesis. :slight_smile: (allusion to German politics)

I have a list of some of my plans to add functionality at the top, and
I've rewritten your test at the bottom to suit the new options.

I'd be interested to know whether there are any things I'm doing
drastically wrong... I think the rows? and columns? might be able to be
done more succinctly, for example.

First of all their names seem weird. Methods with a question mark at
the end are usually intended to be used in a boolean context. But
rows? and columns? return the row and column counts. Plus, they do
not gracefully deal with an empty Matrix. I guess for everyday use it
will be more efficient to store max values in two members because you
will need them often for iteration.

Method #array could use Columns's #map method. If you rewrite
iteration methods like this

  def each_row
    return to_enum(:each_row) unless block_given?

    (1..rows?).each do |idx|
      yield row( idx )
    end
    self
  end

Then you can even do

  def array
    each_row.map do |row|
      row.to_a
    end
  end

And I would consider changing = to actually remove the entry if the
value is nil. (Excel does it differently, I believe. It will not
reduce the area it considers "used" when emptying the last cell in the
last column or row.)

Ah, and I would add a validity check for addresses - otherwise one can
get any String into the Hash as key - even things which are not valid
addresses.

But generally I think you get the hang of it.

Kind regards

robert

···

On Sun, Feb 17, 2013 at 9:14 PM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Nice tips! Thanks for the help again.

You're welcome.

I had no idea how to use to_enum, I'll have to read up on that. I've
done all the Ruby courses I could find at Codecademy which filled in a
few gaps I had in my knowledge. I'm still reading the Book of Ruby as
well.

Basically that method stores self and method name along with arguments
given in a new object which implements #each as delegation to the
method stored and also includes Enumerable. You can cook your own for
learning purposes

E = Struct.new :obj, :method, :arguments do
  include Enumerable

  def each(&b)
    obj.send(method, *arguments, &b)
    self
  end
end

irb(main):017:0> s = "foo bar"
=> "foo bar"
irb(main):018:0> e = E.new s, :each_char
=> #<struct E obj="foo bar", method=:each_char, arguments=nil>
irb(main):019:0> e.map {|x| "<#{x.inspect}>"}
=> ["<\"f\">", "<\"o\">", "<\"o\">", "<\" \">", "<\"b\">", "<\"a\">", "<\"r\">"]

Note that #map is defined in Enumerable

I've decided to leave the "Matrix" class name alone in case I need to
use it within the same scope later. I've renamed this "RubyExcel" for
want of a better term.

+1

I've added the ability to upload a multidimensional array into the data.
It carries the option to overwrite or append as a switch.

Two remarks:

- don't catch the exception inside the method
- better create two methods - one for overwrite and one for append
(even if they internally delegate common work to another method)

This will give you a clearer API. See here for more reasoning:

I've added "find" to return a "cell address" when given a value

You may also want to add #select etc. Or you include Enumerable and
implement #each as delegation to the Hash's #each and get #select and
all others for free.

I still have a long list of things I want to add, and I'm sure I'll
think of more. I'm surprised I haven't found anything equivalent out
there, to be honest. Maybe all the real pros are using databases to
parse their output :stuck_out_tongue:

:slight_smile:

Kind regards

robert

···

On Mon, Feb 18, 2013 at 1:05 PM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

I think there are better solutions - in order of increasing quality:

1. Rename the parameter from "switch" to "negate".
2. Pass the operator symbol directly.
3. Add a method Regexp#negate which will return an instance which has
matching logic reversed and remove the parameter.
4. yield ar[idx] (i.e. delegate the decision to the block passed in)

I am still not fond of the #skip_headers approach. As far as I can
see you want to copy the sheet into a new one and include only cells
whose content matches a regular expression (or maybe other filter
criteria). I don't see the rest of your current version of the
implementation but the line with self[0] somehow looks strange.

Kind regards

robert

···

On Mon, Feb 18, 2013 at 2:43 PM, Joel Pearson <lists@ruby-forum.com> wrote:

I do use switches occasionally, here's one example where I think it's
justified (from my older Excel_Sheet<Array class):

def filter( header, regex, switch=true )
  fail ArgumentError, "#{regex} is not valid Regexp" unless regex.class
== Regexp
  idx = self[0].index header
  fail ArgumentError, "#{header} is not a valid header" if idx.nil?
  operator = ( switch ? :=~ : :!~ )
  Excel_Sheet.new skip_headers { |xl| xl.select { |ar| ar[idx].send(
operator, regex ) } }
end

Mostly I just did that because I was learning how to use symbols, but it
makes the Regex more flexible with the minimum amount of repetition or
long-winded "if" statements.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

I went with "filter" with an optional true/false regex switch because it
seemed like the simplest way to use it, and closest to my own experience
in using Excel's filters.
Passing the symbol feels less intuitive, and yielding to a block means
writing more code, particularly when I'm writing a quick method chain.
The notation I set up feels natural to me when chaining criteria. For
example I can just do this:
data.filter( 'Account', /^P/ ).filter( 'Type', /^Large/, false )

What do you mean by "writing more code"? It is as short as

data.filter( 'Account', /^P/ ).filter( 'Type' ) {|x| /^Large/ =~ x}

You could even implement Regexp#to_proc like this

class Regexp
  def to_proc
    lambda {|s| self =~ s}
  end
end

and then do

data.filter( 'Account', /^P/ ).filter( 'Type', &/^Large/)

Regarding the usage of skip_headers
Say I have this data:

Type Flag Unique_ID
Type1 1 A001
Type2 0 A002
Type1 0 A003
Type3 1 A004
Type1 1 A005

If I only want to keep Parts of "Type1" and "Type3" then I could use
"select" and some Regex, but I might pick up the Header as well if I'm
not careful.
Using a method like "skip_headers" allows me to select or reject
elements of the data without losing the identifiers in the first row,
which I'm almost always going to need at the end when I output the data
into human-readable format.

But wouldn't you want to make the decision what is a header and what
not more flexible? Possible criteria that come to mind are
- first n lines / columns
- first lines / columns where all values match a particular regexp
- any line or column where all values match a particular regexp

I'm also dealing with entire rows rather than individual cells, and
since the source data can change its content and order, using the
headers to identify the data source for a given operation is essential.
Using skip_headers both allows me to preserve them while sorting through
data, and also puts them back on again for the next time I need to
reference them.

So basically you want a view on the data which omits a few rows and
columns. Given that there are so many potential criteria I'd probably
pass in one object implementing === as a row header detector and one
as a column header detector. Since Proc implements === as call you
can also easily provide a lambda there. The argument to === would be
the Column respective Row instance so the position as well as the cell
contents can be evaluated to decide whether something constitutes a
header row / column. Once you have that in place you could create
convenience methods using one of the criteria mentioned above. Just a
few thoughts.

Kind regards

robert

···

On Mon, Feb 18, 2013 at 4:50 PM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

That Regexp to proc idea looks good. I could use proc form for a
positive match and a normal block for the negative. I'll see if I can
get something like this working when I write filter method for
RubyExcel.

Using the new class I can implement something like skip_headers by
passing a starting value to "rows" or "columns". This makes it more
flexible as well. I've rewritten those iterators using optional start
and end points:

Actually the even more general concept is filter anything: one might
not only want to skip headers but general rows or columns.

def rows( start_row = 1, end_row = maxrow )
  fail TypeError, 'Data is empty' if maxrow == 0
  fail ArgumentError, 'The starting row must be less than the maximum
row' if maxrow < start_row
  return to_enum(:rows) unless block_given?

You need to pass arguments start_row and end_row here as well!

return to_enum(:rows, start_row, end_row) unless block_given?

  ( start_row..end_row ).each do |idx|
    yield row( idx )
  end
  self
end

Now I can use rows(2) to skip the headers if necessary. It might be a
bit confusing when rows(1) actually returns from 1 to the end, but I've
already got row(1) for that purpose and it makes it shorter to iterate
through all of them. Plus it means I can do "rows.count", which is the
same as VBA syntax.

I vaguely understand the idea of passing something in to compare to a
header type. I'm not sure how I'd implement it though, since the only
headers I ever deal with are row 1, and they tend to look pretty similar
to the data itself.

Well, then we should probably add a method #index to Row and Column
which returns the numeric index. That makes checking whether it's the
n'th row / column easy. Basically for one of the two the method is
just an alias (not if there is a base 0 or based 1 difference though).

Kind regards

robert

···

On Mon, Feb 18, 2013 at 6:23 PM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

After more face-to-keyboard action, I came up with a working filter
system. it modifies self at the moment rather than returning a copy,
which is something I'll have to look intosince I'm not sure I want that
to be the default behaviour.

You cannot change self. Or do you mean you change matrix contents?
IMHO that's a bad idea since you basically only want a filtered /view/
on the data.

I've added a index option for row and column, and also added row and
column methods to String. Since those methods didn't exist before, and
you gave me the idea of modifying an existing class (Regexp), I thought
this would be quite a useful way to get the index values straight from
the hash keys.

IMHO that is not a good idea. The method I suggested for Regex is of
general use because matching is what Regex instances are for. But the
extraction of a row or column identifier is closely bound to your
class's logic. So if at all I'd put it as a class method in class
RubyExcel.

...
I'm too tired for rational thought now so I'd better call it a day
before I find myself thinking that adding ASCII art comments in the
shape of ponies and rainbows would improve the code...

:slight_smile:

Frankly, I don't have the time right now to take a closer look. I
hope I can followup at a later point in time.

Kind regards

robert

···

On Tue, Feb 19, 2013 at 12:47 AM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Using the filter method on my previous attempt would return a copy of
the object containing only the required data. I was hoping for something
similar with this so I can split into multiple sections using filter
logic but still keep the original if required, but I'm not sure how to
do this. This is the kind of thing I'd probably want to use:

m = #Main matrix
n = m.filter args #Creates Subset without altering "m"
m.filter! other_args #Alters its own data

That's another reasonable way to do it. Similar to how String#gsub
and String#gsub! go about this. It entirely depends on your use case.
If you would want to modify the original data through a view then of
course you need a similar approach as with Row and Column.

A class method sounds like a good idea. I'll have to play with that and
see what I can come up with...

Cheers

robert

···

On Tue, Feb 19, 2013 at 10:51 AM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

hi Joel
i am timo i follow this thread you are doing great
starting to look like it will be a worthwhile effort
lately i have hoped for a few pointers in this same topic
may i ask have you compared any current ruby forge programs?
a few mainly older Ruby programs seem to exist for MS Excel in Ruby
http://rubyforge.org/projects/spreadsheet (2011) beta
and on ruby-lang
http://raa.ruby-lang.org/project/hyogen/ ( 2007)
http://raa.ruby-lang.org/project/ruby-statsample/ (2010)
http://raa.ruby-lang.org/project/parseexcel/ (2006)

···

On Wed, Feb 20, 2013 at 10:14 AM, Joel Pearson <lists@ruby-forum.com> wrote:

I've now had time to do a bit more testing and debugging, and I've
managed to come up with a (relatively) stable and usable framework. Even
more surprising, I now have a working version of pretty much everything
on my wish list. I suppose I can just call .dup or .clone for when I
want to take different sections of the data down different paths.

I've tried to refactor some of the more esoteric code, and improve the
"encapsulation" by hiding the underlying hash wherever I spotted it
poking out.

I've actually managed to use recursion (in the "unique" method) without
permanently looping! I'll just gloss over the amount of permanent loops
I went through to get that right...

I've had to make some hefty use of reverse_each to sort out deleting
entire sections and compacting the rest afterwards.

As usual, I've probably made many silly mistakes and done things the
hard way. Any advice would be appreciated (although not demanded :wink: ).

I think the next thing to do is put this class through some real-world
scenarios, so I'll be using it as a replacement for ye olde Excel_Sheet
class.

Attachments:
http://www.ruby-forum.com/attachment/8156/RubyExcel.rb

--
Posted via http://www.ruby-forum.com/\.

hey thanks for some concise notes on this important topic
yes at length you guys built a nifty way to filter data in the memory
even present live updates of cell vectors in the sheets
your main data is strings
my excel woes lately is the veteran car club parts catalogs
i scan pages to tiff then image OCR to Word correct it and paste into Excel
with aims like matching compatible pistons to various makes and models

···

On Wed, Feb 20, 2013 at 11:00 PM, Joel Pearson <lists@ruby-forum.com> wrote:

From what I've seen of statsample so far it looks more numerically
oriented, whereas I tend to be dealing with lots of strings, some
numbers, and a few dates.

Hyogen also looks more interested in the read-write than the analysis in
the interim. My input tends to come from Nokogiri and I can already
output to excel with some advanced formatting options using my own code.

I have attached one of the functions I wrote to output the data to Excel
(part of a larger class).
As you can see, it's pretty simple to use; and I always have the option
of just handing over the win3ole object if I need more advanced options
(which happens occasionally). It could use some refactoring, but it does
the job.

As I already have this kind of functionality available, it's the ability
to sort through a table of data efficiently in the memory which
interests me. I know a database would be ideal for that kind of
operation, but I have portability in mind as a given report program may
be on a variety of machines, and the only thing I know they have for
sure is Office 2007 or higher.

Attachments:
http://www.ruby-forum.com/attachment/8161/Demo.rb

--
Posted via http://www.ruby-forum.com/\.

#clone and #dup are _always_ shallow copies. If you want to change
that you need to override them, e.g.

class Excel
  def dup
    super.tap do |copy|
      %w{@table foo}.each {|iv| copy.instance_variable_set(iv,
instance_variable_get(iv).dup)}
    end
  end
end

Kind regards

robert

···

On Thu, Feb 21, 2013 at 1:40 PM, Joel Pearson <lists@ruby-forum.com> wrote:

I thought I could avoid the issue of splitting the data into multiple
outputs by creating copies of the class, but apparently I was wrong:

irb(main):002:0> m = RubyExcel.new.load [['a','b']]
=> columns: 2, rows: 1, values: 2

irb(main):003:0> a = m.clone

irb(main):004:0> a['A1'] = nil

irb(main):005:0> puts a
        b

irb(main):006:0> puts m
        b

What am I doing wrong? Is @data inside m not being duplicated along with
the class instance?

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

thanks yes you're right the task is tedious and messy
first i place the old book in my scanner and convert image to tiff
tiff is the format accepted in ms-office document imaging OCR tool
any text found in the tiff image is output to Word all in a single column
i clear all text formatting and correct many typos by examining it visually
i check the OCR-output text in word against the input tiff side by side
then copy and paste text ie part numbers and names into excel columns
i put each catalog section items into an xls sheet to make the data
accessible
i hope these could be used to create an index of a database of tech drawings

···

On Thu, Feb 21, 2013 at 12:34 AM, Joel Pearson <lists@ruby-forum.com> wrote:

Sounds messy. Is the data on the page graphical rather than text based?
Why do you need Word?

--
Posted via http://www.ruby-forum.com/\.

Robert Klemme wrote in post #1094583:

write your own - it is the most
reasonable thing to have a specific class for handling this instead of
writing functions which work with a nested Array structure.

It's only just dawned on me just how useful this kind of thing could be.
I could have a class tailor-made to my expected outputs, yet versatile
enough to adapt to new challenges... If I learn how to use blocks
effectively as well, then it could do pretty much anything I need.
All hail Ruby!

···

--
Posted via http://www.ruby-forum.com/\.

class Excel_Sheet<Array

I wouldn't do that. With the basic types it is usually much better to
use delegation (i.e. have a member of that type) than exposing the
full API via inheritance. The whole point of OO is to control
internal state which is usually quite difficult when exposing a
complete API of Array because anybody can insert and remove elements.

Btw. the "self." in your code are superfluous.

Kind regards

robert

Ok, I wasn't sure how to create an object which behaved like an array.
I've tried this instead:

class Excel_Sheet

  def initialize( val= )
    val = %w(A1 A2 A3 B1 B2 B3 C1 C2 C3).each_slice(3).to_a if val ==
'test'
    fail ArgumentError, 'Must be multidimensional array' unless
val[0].class == Array || val.empty?
    replace val
  end

end

It seems to work ok.

I didn't know how to use self at first, I think I understand it a bit
better now.

Thanks for the advice :slight_smile:

···

--
Posted via http://www.ruby-forum.com/\.

Robert Klemme wrote in post #1096452:

anybody can override header values or complete
rows / columns violating your class's idea of internal state.

This class only gets added into scripts, and I'm the only one who knows
Ruby where I work, so I don't really see this being a problem yet.

  def skip_headers
    block_given? ? ( [ self[0] ] + yield( self[1..-1] ) ) : (
self[1..-1] )
  end

What is this supposed to do? Ah, I think I see. I'd probably name it
differently, i.e. each_data_cell or something.

This basically allows me to make summaries or edits within the data
itself without having to worry about the headers, then it tacks the
headers back on again once I've made the changes. The switch is just so
I can rip the headers off data and also read my code back later and see
what it's doing.

  def filter( header, regex )
    idx = self[0].index header
    skip_headers { |xl| xl.select { |ar| ar[idx] =~ regex } }
  end

That combines too much logic in one method IMHO. I'd rather select a
row based on header and then I would use #select on that.

In this case, I'm filtering the data like Excel does. This means I'm
keeping all of the columns, but only specific rows based on the values
in a single column.

So in short, my question is how can I return my class type after using
Array's methods on my child-class?

Do you mean as return value from #map and the like? Well, you can't
without overriding all methods with this approach, I'm afraid. That's
one of the reasons why this approach does not work well. :slight_smile:

You're probably thinking in terms of classes which are constantly active
as objects accessible to multiple users, whereas I'm just using this
class to make scripts easier to write.

My usage in this case is:

1) Run a report which returns a table of data
2) Filter the data on a set of criteria
3) Output the results to Excel
4) Exit

Using the class I've built I can easily pare down the data to the
important sections, I've added other handy methods suitable to a
multidimensional array such as "find" which returns the header and row,
"to_s" which converts to TSV format, and several more. It saves me
rewriting the same stuff over and over.

It's great when I want to build excel-style tabs out of the data, like
this little snippet:

···

______________
@data[:EXT] = @WIP_data.filter 'Job Status', /External Repair/
@data[:Awaiting_Parts] = @WIP_data.filter 'Job Status', /Awaiting Parts/

@data.each { |name, data| @data[:Summary].push [ name.to_str, (
data.length -1 ) ] }

reporter.output_to_excel_tabs @data, filename
______________

Ok, in that case I'm happy with an array return, but there are plenty of
sequential events where I'd like to have my own class type returned. Is
passing it through the initializer again really the best way to do that?

--
Posted via http://www.ruby-forum.com/\.

Maybe one explanatory sentence: I chose to use two values for
addressing cells in the Matrix class I put on github. That is a
design decision. You may want to choose something else (e.g. "R3")
but the main point of the example was this: it does make sense to turn
things you talk and reason about into classes (i.e. Matrix, Column,
Row). One can do that and still maintain connection, i.e. if you
write through a Row or Column instance the Matrix gets updated. That
way you can easily maintain consistency in the Matrix and yet present
the user abstractions which are more appropriate for his particular
use case (e.g. if you want to sum all values in a column you obtain
the column and then iterate through all values and sum them up).

Kind regards

robert

···

On Tue, Feb 12, 2013 at 10:49 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

On Tue, Feb 12, 2013 at 9:26 PM, Joel Pearson <lists@ruby-forum.com> wrote:

Thanks for the advice and examples, I'll see whether I can understand
how the classes and methods work with each other there and set about
experimenting with them.

I didn't put commenting in the gist. If there's anything unclear feel
free to ask.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Robert Klemme wrote in post #1097127:

Below is a valuable statement. :slight_smile:

···

On Fri, Feb 15, 2013 at 3:55 PM, Joel Pearson <lists@ruby-forum.com> > wrote:

Actually that's probably the best method to learn: we learn much more
through our failures than from our successes.

Kind regards

robert

--
Posted via http://www.ruby-forum.com/\.

Robert Klemme wrote in post #1094583:

write your own - it is the most
reasonable thing to have a specific class for handling this instead of
writing functions which work with a nested Array structure.

It's only just dawned on me just how useful this kind of thing could be.
I could have a class tailor-made to my expected outputs, yet versatile
enough to adapt to new challenges...

That's the whole point of OO - or rather software engineering in
general: create proper abstractions. Advantage of OO is that one can
first reason about the interface and hide all the details behind that
- including the state necessary to make the interface of the class
work as needed.

If I learn how to use blocks
effectively as well, then it could do pretty much anything I need.
All hail Ruby!

:slight_smile:

Cheers

robert

···

On Sat, Feb 2, 2013 at 10:04 PM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

As Robert says I'm not sure there is any (clean) way to do what you want, which is why delegation as opposed to inheritance is often used in this scenario. I'm sure this isn't a complete solution, but does this rewritten snippet (untested!) help at all?

class ExcelSheet

   def initialize( val= )
     val = %w(A1 B1 C1 A2 B2 C2 A3 B3 C3).each_slice(3).to_a if val == 'test'
     @arr = val
   end

   def skip_headers
     block_given? ? ( [ @arr[0] ] + yield( @arr[1..-1] ) ) : ( @arr[1..-1] )
   end

   def filter( header, regex )
     idx = @arr[0].index header
     @arr = skip_headers { |xl| xl.select { |ar| ar[idx] =~ regex } }
   end

   def method_missing(meth,*args,&block)
     ret = @arr.send meth, *args, &block
     ret.instance_of?(Array) ? ExcelSheet.new(@ret) : ret
   end

end

···

On 12.02.2013 13:06, Joel Pearson wrote:

Robert Klemme wrote in post #1096452:

So in short, my question is how can I return my class type after using
Array's methods on my child-class?

Do you mean as return value from #map and the like? Well, you can't
without overriding all methods with this approach, I'm afraid. That's
one of the reasons why this approach does not work well. :slight_smile:

You're probably thinking in terms of classes which are constantly active
as objects accessible to multiple users, whereas I'm just using this
class to make scripts easier to write.

My usage in this case is:

1) Run a report which returns a table of data
2) Filter the data on a set of criteria
3) Output the results to Excel
4) Exit

Using the class I've built I can easily pare down the data to the
important sections, I've added other handy methods suitable to a
multidimensional array such as "find" which returns the header and row,
"to_s" which converts to TSV format, and several more. It saves me
rewriting the same stuff over and over.

It's great when I want to build excel-style tabs out of the data, like
this little snippet:
______________
@data[:EXT] = @WIP_data.filter 'Job Status', /External Repair/
@data[:Awaiting_Parts] = @WIP_data.filter 'Job Status', /Awaiting Parts/

@data.each { |name, data| @data[:Summary].push [ name.to_str, (
data.length -1 ) ] }

reporter.output_to_excel_tabs @data, filename
______________

Ok, in that case I'm happy with an array return, but there are plenty of
sequential events where I'd like to have my own class type returned. Is
passing it through the initializer again really the best way to do that?

--
Alex Gutteridge

Robert Klemme wrote in post #1096452:

anybody can override header values or complete
rows / columns violating your class's idea of internal state.

This class only gets added into scripts, and I'm the only one who knows
Ruby where I work, so I don't really see this being a problem yet.

I think you misunderstand OO. The point of OO is to build
abstractions of real world phenomena which behave in certain ways.
That does not have to do with how many people use the code; proper OO
abstractions help you even if you are the only one working with the
code because they encapsulate specific behavior and you as a user of
classes do not need to worry any longer about internals. By thinking
in proper abstractions you actually make your life easier since the
system is easier to understand.

  def skip_headers
    block_given? ? ( [ self[0] ] + yield( self[1..-1] ) ) : (
self[1..-1] )
  end

What is this supposed to do? Ah, I think I see. I'd probably name it
differently, i.e. each_data_cell or something.

This basically allows me to make summaries or edits within the data
itself without having to worry about the headers, then it tacks the
headers back on again once I've made the changes. The switch is just so
I can rip the headers off data and also read my code back later and see
what it's doing.

I would actually rather have Row and Colum as specific items which can
be asked for their header and iterate through all their values.
Example: Sample matrix with abstraction of rows and columns. This can certainly be improved, it's just to convey an idea. · GitHub

  def filter( header, regex )
    idx = self[0].index header
    skip_headers { |xl| xl.select { |ar| ar[idx] =~ regex } }
  end

That combines too much logic in one method IMHO. I'd rather select a
row based on header and then I would use #select on that.

In this case, I'm filtering the data like Excel does. This means I'm
keeping all of the columns, but only specific rows based on the values
in a single column.

Right, and as I said I'd rather make that two separate steps. That is
much more modular and hence reusable.

So in short, my question is how can I return my class type after using
Array's methods on my child-class?

Do you mean as return value from #map and the like? Well, you can't
without overriding all methods with this approach, I'm afraid. That's
one of the reasons why this approach does not work well. :slight_smile:

You're probably thinking in terms of classes which are constantly active
as objects accessible to multiple users, whereas I'm just using this
class to make scripts easier to write.

See my initial statement: I think you are vastly underestimating the
value of OOP.

Cheers

robert

···

On Tue, Feb 12, 2013 at 2:06 PM, Joel Pearson <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/