Documenting the end result.
I added a sort to keep the input column order, and reordered the inputs
to make the array itself optional for ease-of-use within the parent
class.
Looks like good advice, both. My concern with matrices is being able to
modify elements in the same way as I could in a multidimensional
array, but I assume that's the reason for creating a child class.
There's a whole thread full of people waxing philosophical about the
subject!
I've never written a class based on someone else's before, sounds like
fun. I'll see what happens when I play with it a bit.
I decided to try and build on the Array class as I don't really
understand Matrices yet. I've added a few handy methods. The hidden Bang
stuff is justified, I think, as this class is intended to mimic Excel's
layout.
I'll add more useful bits as I come up with them, this is just an
experiment at the moment.
class Excel_Sheet < Array
  def initialize( val=[] )
    fail ArgumentError, 'Must be multidimensional array' unless val[0].is_a?( Array ) || val.empty?
    super( val )
  end
  # Number of columns, after padding every row to the same width
  def columns
    ensure_shape
    empty? ? 0 : self[0].length
  end
  def rows
    length
  end
  # Pad shorter rows with nil so every row matches the longest one
  def ensure_shape
    return self if empty?
    max_size = max_by(&:length).length
    map! { |ar| ar.length == max_size ? ar : ar + Array.new( max_size - ar.length, nil ) }
  end
end
I've decided to inherit from array after all, since all I want to do
with this is extend support for multidimensional arrays, but without
overwriting any of Array's methods.
Anyway, the obstacle I've hit is one I can avoid, but I was wondering
whether I'm doing something wrong, or whether there's a nice Rubyish way
around this. Here's a simplified version to demonstrate the issue:
___________________
class Excel_Sheet < Array
  def initialize( val=[] )
    val = %w(A1 B1 C1 A2 B2 C2 A3 B3 C3).each_slice(3).to_a if val == 'test'
    super( val )
  end
end
When I do this sort of thing:
result = object.filter('Header', /value1|value2/)
I get the return as an Array, so I can't use my extra methods on it
anymore.
Here's my current workaround. It's the only way I could think of doing
this but it doesn't look right.
___________________
def filter( header, regex )
  idx = self[0].index header
  Excel_Sheet.new skip_headers { |xl| xl.select { |ar| ar[idx] =~ regex } }
end
___________________
So in short, my question is how can I return my class type after using
Array's methods on my child-class?
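One possible answer, as a minimal sketch: build the result with Array's methods as usual, then wrap it with self.class.new before returning it. SheetSketch and its sample data are hypothetical stand-ins, not the real Excel_Sheet, and no Array method other than filter is overridden here.

```ruby
# Hypothetical stand-in for Excel_Sheet; names and data are illustrative only.
class SheetSketch < Array
  def filter(header, regex)
    idx = first.index(header)
    raise ArgumentError, "#{header} is not a valid header" if idx.nil?
    kept = drop(1).select { |row| row[idx] =~ regex }
    # Array#select hands back a plain Array, so wrap the result again.
    self.class.new([first] + kept)
  end
end

sheet  = SheetSketch.new([%w[Header Other], %w[value1 a], %w[value3 b], %w[value2 c]])
result = sheet.filter('Header', /value1|value2/)
```

Using self.class.new rather than naming the class means a further subclass would get its own type back as well.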
Thanks for the advice and examples, I'll see whether I can understand
how the classes and methods work with each other there and set about
experimenting with them.
One thing which put me off generating a custom class "from scratch" is
that Array appears to be equal to its content (I assume this is a
language shortcut), but it seems "custom" objects' values have to be
accessed via their accessors.
I was hoping for some more succinct syntax than this sort of thing:
puts [] #Array is so easy to create
puts CustomObject.new([]).value #This looks clunky next to that
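For comparison, a custom class can be made to feel closer to Array by forwarding calls to the data it holds. This is only a sketch under assumptions: CustomObject here is a hypothetical wrapper, and the list of forwarded methods is an arbitrary selection. It uses the standard library's Forwardable module and the core Enumerable mixin.

```ruby
require 'forwardable'

# Hypothetical wrapper: holds an Array internally and forwards common calls to it.
class CustomObject
  extend Forwardable
  include Enumerable

  # Forward a handful of Array methods to the wrapped data.
  def_delegators :@value, :[], :[]=, :<<, :length, :empty?, :to_s, :inspect

  def initialize(value = [])
    @value = value
  end

  # Enumerable only needs #each to supply map, select, include? and friends.
  def each(&block)
    return to_enum(:each) unless block
    @value.each(&block)
    self
  end
end

obj = CustomObject.new([1, 2])
obj << 3
doubled = obj.map { |x| x * 2 }
```

With to_s forwarded, the object can be handed to string interpolation directly instead of going through .value.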
I'd love to get accustomed to proper OO thinking, but I'll inevitably make
all the rookie mistakes in the process. It's a lot to get used to all at
once given that I've been using Ruby for less than a year, and I have no
training other than helpful hints and googling. Thanks again for your
patience.
I haven't had a chance to look into your example yet; I've been reading
up on OOP.
I intend to take the ideas I've been coming up with for ease-of-use
within the Array class and use those, your Matrix example, and whatever
else occurs to me to form a new set of classes which can handle my data
and the operations I regularly need to perform. Then it's time to play
with scenarios and see what happens.
Interesting Matrix build. It's giving me a bit of a headache just trying
to figure out the links involved.
So MatrixPart defines the methods and the "parent" matrix (held as an
instance variable); and row and column both use these methods and both
access the variable which points to the matrix they're part of.
The rows and columns can be selected based on given headers, and each
will reference the other... and this is where my head explodes:
def index( row, col )
  @row_headers.index( row ) * @col_headers.size + @col_headers.index( col )
end
It takes a bit of getting used to, but thanks to Ruby's flexible array
class adding nil values automatically when you specify an index higher
than the upper boundary, that works.
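To make the arithmetic concrete, here is a small standalone sketch of the same row-major calculation, using local variables and a lambda in place of the Matrix instance variables (those stand-ins are assumptions for illustration). It also shows the nil padding that happens when assigning past the end of an Array.

```ruby
# Headers as in the Matrix example; the flat data array starts empty.
col_headers = %w[A B C]
row_headers = %w[1 2 3 4]
data = []

# Row-major position: skip whole rows first, then step along the columns.
index = lambda do |row, col|
  row_headers.index(row) * col_headers.size + col_headers.index(col)
end

data[index.call('1', 'A')] = 123   # 0 * 3 + 0 = position 0
data[index.call('4', 'B')] = 456   # 3 * 3 + 1 = position 10

# Positions 1 to 9 were never assigned, so Ruby filled them with nil.
gap = data[1..9]
```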
I guess with a bit more poking and prodding I could figure out how to
append, insert, and delete rows and columns. After all, it's only a math
problem in the end. All the interconnected references (especially the
layered yields) still make my head spin though.
Hah, I wrote that head exploding comment first and then managed to work
out what it did afterwards. Still took a few minutes of smashing my head
into the desk to make room for the new thought though ;¬)
Using a Hash sounds like a good idea. I already tried rewriting the
selector into something a bit more excel-like (although I won't bore you
with all the little changes):
def []( addr )
  col, row = addr.upcase.scan( /([A-Z]+)(\d+)/ ).flatten
  data[ index( row, col ) ]
end
def []=( addr, val )
  col, row = addr.upcase.scan( /([A-Z]+)(\d+)/ ).flatten
  data[ index( row, col ) ] = val
end
m = Matrix.new(%w{A B C}, %w{1 2 3 4})
m["A1"] = 123
m["B4"] = 123
I haven't gotten around to changing all the "row, col" to "col, row"
references, so it looks a bit weird, but I'm just experimenting with
options at the moment. I'll have a go at Hashing it up as well.
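As a rough idea of what the Hash version could look like, here is a minimal sketch that stores each cell under its address string, so no index arithmetic is needed. HashSheet, ADDRESS and normalise are hypothetical names made up for this example, not part of the Matrix code.

```ruby
# Hypothetical Hash-backed sheet: keys are normalised addresses like "A1".
class HashSheet
  ADDRESS = /\A([A-Z]+)(\d+)\z/

  def initialize
    @data = {}
  end

  def [](addr)
    @data[normalise(addr)]
  end

  def []=(addr, val)
    @data[normalise(addr)] = val
  end

  private

  # Upcase and validate the address so "a1" and "A1" share one key.
  def normalise(addr)
    key = addr.to_s.upcase
    raise ArgumentError, "#{addr} is not a valid address" unless key.match?(ADDRESS)
    key
  end
end

m = HashSheet.new
m['a1'] = 123
m['B4'] = 456
```

Cells that were never set simply come back as nil, since the Hash has no entry for them.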
Naturally I have many questions floating around in my head, but I'll try
to work them out through the scientific method of repeated failed
attempts.
I've attached my attempt at converting your code to suit mine (hope you
don't mind the plagiarism).
I have a list of some of my plans to add functionality at the top, and
I've rewritten your test at the bottom to suit the new options.
I'd be interested to know whether there are any things I'm doing
drastically wrong... I think the rows? and columns? might be able to be
done more succinctly, for example.
I had no idea how to use to_enum, I'll have to read up on that. I've
done all the Ruby courses I could find at Codecademy which filled in a
few gaps I had in my knowledge. I'm still reading the Book of Ruby as
well.
Hopefully this one is more stable:
I've decided to leave the "Matrix" class name alone in case I need to
use it within the same scope later. I've renamed this "RubyExcel" for
want of a better term.
I fixed all the things you mentioned (I think).
I've added the ability to upload a multidimensional array into the data.
It carries the option to overwrite or append as a switch.
I set the reference list of column references to a Constant.
I've removed "array" and added "to_a" and "to_s".
I've added "find" to return a "cell address" when given a value
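A value-to-address lookup like "find" can be very short if the cells live in a Hash keyed by address. That storage layout is an assumption here, as are the sample cells; Hash#key is the core method doing the work.

```ruby
# Assumption: cells are stored in a Hash keyed by address.
cells = { 'A1' => 'Type', 'B1' => 'Flag', 'A2' => 'Type1', 'B2' => '1' }

# Hash#key returns the first key whose value matches, or nil if none does.
find = ->(value) { cells.key(value) }

address = find.call('Flag')
```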
I still have a long list of things I want to add, and I'm sure I'll
think of more. I'm surprised I haven't found anything equivalent out
there, to be honest. Maybe all the real pros are using databases to
parse their output.
Nice link. I agree with the sentiment there, and I'll think more
carefully about using boolean switches in future.
I've split that method into "load" and "append", each passing arguments
to private "import_data".
I added the rescue when I realised the method was returning the number
of rows and I wanted it to return success or failure as a boolean; I
forgot it was catching my exceptions as well. Now it returns true or
raises an exception.
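Roughly, the split looks like the sketch below. SheetLoader, its rows reader and the replace: keyword are hypothetical names for illustration; only "load", "append" and the private "import_data" come from the description above.

```ruby
# Hypothetical sketch of the load/append split; storage is a plain Array of rows.
class SheetLoader
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Replace whatever is stored with the new data.
  def load(multi_array)
    import_data(multi_array, replace: true)
  end

  # Add the new rows after the existing ones.
  def append(multi_array)
    import_data(multi_array, replace: false)
  end

  private

  # Shared worker: raises on bad input, returns true on success.
  def import_data(multi_array, replace:)
    unless multi_array.is_a?(Array) && multi_array.all? { |row| row.is_a?(Array) }
      raise ArgumentError, 'Must be multidimensional array'
    end
    @rows = [] if replace
    @rows.concat(multi_array.map(&:dup))
    true
  end
end

s = SheetLoader.new
s.load([[1, 2]])
s.append([[3, 4]])
```

The switch still exists, but it is hidden inside the private method, so callers only ever see two clearly named public methods.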
I do use switches occasionally, here's one example where I think it's
justified (from my older Excel_Sheet<Array class):
def filter( header, regex, switch=true )
  fail ArgumentError, "#{regex} is not valid Regexp" unless regex.class == Regexp
  idx = self[0].index header
  fail ArgumentError, "#{header} is not a valid header" if idx.nil?
  operator = ( switch ? :=~ : :!~ )
  Excel_Sheet.new skip_headers { |xl| xl.select { |ar| ar[idx].send( operator, regex ) } }
end
Mostly I just did that because I was learning how to use symbols, but it
makes the Regex more flexible with the minimum amount of repetition or
long-winded "if" statements.
I went with "filter" with an optional true/false regex switch because it
seemed like the simplest way to use it, and closest to my own experience
in using Excel's filters.
Passing the symbol feels less intuitive, and yielding to a block means
writing more code, particularly when I'm writing a quick method chain.
The notation I set up feels natural to me when chaining criteria. For
example I can just do this:
data.filter( 'Account', /^P/ ).filter( 'Type', /^Large/, false )
Regarding the usage of skip_headers
Say I have this data:
Type   Flag  Unique_ID
Type1  1     A001
Type2  0     A002
Type1  0     A003
Type3  1     A004
Type1  1     A005
If I only want to keep Parts of "Type1" and "Type3" then I could use
"select" and some Regex, but I might pick up the Header as well if I'm
not careful.
Using a method like "skip_headers" allows me to select or reject
elements of the data without losing the identifiers in the first row,
which I'm almost always going to need at the end when I output the data
into human-readable format.
I'm also dealing with entire rows rather than individual cells, and
since the source data can change its content and order, using the
headers to identify the data source for a given operation is essential.
Using skip_headers both allows me to preserve them while sorting through
data, and also puts them back on again for the next time I need to
reference them.
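Since skip_headers itself isn't shown in this thread, here is a hypothetical reconstruction of the idea: set the first row aside, yield the rest to the block, then put the header back on whatever the block returns. HeaderSheet is a made-up name and the real method may well differ.

```ruby
# Hypothetical reconstruction; the real skip_headers is not shown in the thread.
class HeaderSheet < Array
  # Yield the rows below the header, then put the header back on the result.
  def skip_headers
    header = first
    body = yield(drop(1))
    [header] + body
  end
end

sheet = HeaderSheet.new([
  %w[Type Flag Unique_ID],
  %w[Type1 1 A001],
  %w[Type2 0 A002],
  %w[Type1 0 A003],
  %w[Type3 1 A004],
  %w[Type1 1 A005]
])

kept = sheet.skip_headers { |rows| rows.select { |r| r[0] =~ /Type1|Type3/ } }
```

The header row never passes through the select, so a pattern that happened to match "Type" could not pick it up or drop it by accident.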
That Regexp to proc idea looks good. I could use proc form for a
positive match and a normal block for the negative. I'll see if I can
get something like this working when I write the filter method for
RubyExcel.
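For reference, a minimal sketch of the Regexp-to-proc idea. It assumes Regexp has no to_proc of its own (core Ruby does not define one) and reopens the core class, so treat it as an experiment rather than a recommendation. The account values are made up.

```ruby
# Assumption: Regexp has no to_proc of its own, so this adds one.
class Regexp
  # Turn the pattern into a one-argument predicate for use with &.
  def to_proc
    ->(str) { match?(str.to_s) }
  end
end

accounts = %w[P100 Q200 P300]
starts_with_p = /^P/

matching     = accounts.select(&starts_with_p)  # positive match
not_matching = accounts.reject(&starts_with_p)  # negative match, same proc
```

Pairing the same proc with reject covers the negative case without needing a second block.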
Using the new class I can implement something like skip_headers by
passing a starting value to "rows" or "columns". This makes it more
flexible as well. I've rewritten those iterators using optional start
and end points:
def rows( start_row = 1, end_row = maxrow )
  fail TypeError, 'Data is empty' if maxrow == 0
  fail ArgumentError, 'The starting row must not exceed the maximum row' if maxrow < start_row
  # Pass the arguments on so the enumerator covers the same range
  return to_enum( :rows, start_row, end_row ) unless block_given?
  ( start_row..end_row ).each do |idx|
    yield row( idx )
  end
  self
end
Now I can use rows(2) to skip the headers if necessary. It might be a
bit confusing when rows(1) actually returns from 1 to the end, but I've
already got row(1) for that purpose and it makes it shorter to iterate
through all of them. Plus it means I can do "rows.count", which is the
same as VBA syntax.
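One detail worth illustrating is how to_enum behaves with the optional start and end points: the arguments need to be handed to to_enum as well, otherwise a blockless rows(2) would enumerate from row 1. The sketch below is a cut-down, hypothetical miniature (RowSketch, backed by a plain Array) rather than the RubyExcel code.

```ruby
# Hypothetical miniature of the rows iterator, backed by a plain Array of rows.
class RowSketch
  def initialize(data)
    @data = data
  end

  def maxrow
    @data.length
  end

  # 1-based like Excel: rows(2) starts below the header row.
  def rows(start_row = 1, end_row = maxrow)
    # Forward the range so the enumerator iterates the same rows.
    return to_enum(:rows, start_row, end_row) unless block_given?
    (start_row..end_row).each { |idx| yield @data[idx - 1] }
    self
  end
end

sheet = RowSketch.new([%w[Type Flag], %w[Type1 1], %w[Type2 0]])
total     = sheet.rows.count
data_only = sheet.rows(2).to_a
```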
I vaguely understand the idea of passing something in to compare to a
header type. I'm not sure how I'd implement it though, since the only
headers I ever deal with are row 1, and they tend to look pretty similar
to the data itself.
Nice catch on the arguments, I completely missed that.
After more face-to-keyboard action, I came up with a working filter
system. It modifies self at the moment rather than returning a copy,
which is something I'll have to look into, since I'm not sure I want that
to be the default behaviour.
I've added an index option for row and column, and also added row and
column methods to String. Since those methods didn't exist before, and
you gave me the idea of modifying an existing class (Regexp), I thought
this would be quite a useful way to get the index values straight from
the hash keys.
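The String helpers might look something like the sketch below. What they return is an assumption made for this example: column gives the letters of an address and row gives the number as an Integer. Reopening String is the approach described above; the method bodies are hypothetical.

```ruby
# Hypothetical helpers; the return values are assumptions for illustration.
class String
  # Column letters of an address such as "B4", upcased; nil if there are none.
  def column
    self[/\A[A-Za-z]+/]&.upcase
  end

  # Row number of an address such as "B4" as an Integer; nil if there is none.
  def row
    digits = self[/\d+\z/]
    digits && digits.to_i
  end
end

col = 'B4'.column
num = 'B4'.row
```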
In order to get the filter working properly I've created some compact
methods which will reconfigure the hash keys and values. That could
probably be refactored but it took me so long to get it working properly
I dare not touch it again yet!
I added empty? to the columns and rows as a helper for the compact
method.
I didn't like the inspect output so I tidied it up a bit as well, and
redefined "to_s" for each type.
I've added each_with_address as an option for the columns and rows since
they don't access the data hash directly. There might be a neater way to
implement this, but I couldn't figure it out.
I'm too tired for rational thought now so I'd better call it a day
before I find myself thinking that adding ASCII art comments in the
shape of ponies and rainbows would improve the code...