On Tuesday 21 February 2006 09:58, Tony Mobily wrote:
>>> Yes. This class could also store the data directory path and manage
>>> the
>>> lifecycle of Subscriber objects, reducing those to only data
>>> retrieval /
>>> caching.
>>
>> You mean with something like:
>> sub_container=SubscribersContainer.new()
>
> A particularly sick thing to do would be making SubscribersContainer a
> singleton and then delegate calls to the class object to the single
> instance.
> Not really useful though.
I don't have enough experience in Ruby to even understand why you'd
do that, or when it would make sense to create singleton classes.
In fact: when does it make sense to mix a class with Singleton?
In this case? I'd do that to make the code horribly confusing while still
being able to drop pattern names as an excuse. Pure malice.
It would make sense to make singletons when you need to make sure there's one
and only one object of a given class. Basically, they're the same as globals,
except you can use encapsulation with them. In Java, they also save a lot
of typing "static" over and over in the long run, too (major feature *grin*).
They also probably provide you with a little more flexibility in one case or
another, can't imagine an example though.
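
For reference, a minimal sketch of what mixing in Singleton looks like, including the "delegate class-level calls to the single instance" trick mentioned above. Only the Singleton and Forwardable modules come from the standard library; the `add` and `count` methods and the class body are made up for illustration.

```ruby
require 'singleton'
require 'forwardable'

# Hypothetical container; the methods are invented for this example.
class SubscribersContainer
  include Singleton   # makes .new private and provides .instance

  def initialize
    @emails = []
  end

  def add(email)
    @emails << email
    self
  end

  def count
    @emails.size
  end

  # Forward selected class-level calls to the single instance.
  class << self
    extend Forwardable
    def_delegators :instance, :add, :count
  end
end

SubscribersContainer.add("merc2@mobily.com")
puts SubscribersContainer.count   # prints 1
```

Calling `SubscribersContainer.new` raises a NoMethodError here, which is the whole point: the only way to get at the object is `SubscribersContainer.instance` (or the delegated class methods).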
>> sub_container['merc'].name="tony"
>>
>> ...?
>> When the method (email) is called, the object SubscribersContainer
>> would need to create a new object (if necessary), or return the one
>> already created at some point in the past. Is that right?
>
> I'd still keep the creation and loading as separate operations to
> avoid
> clobbering some data by mistake when thinking you're making a new
> record.
Wooops... Maybe we had a misunderstanding here? I meant that
sub_container['merc'] would need to create a new *object* (of type
Subscriber) and link it to the file system. I wasn't thinking about
creating a new record on disk.
So, my question was: if each one of these calls:
p sub_container['merc'].name
p sub_container['dave'].name
p sub_container['bridget'].name
p sub_container['anna'].name
Allocates a Subscriber object in the collection, after 14000 I'd have
allocated 14000 Subscriber objects.
Or maybe my understanding of containers is still far too poor to
follow you properly.
> There are other ways of preventing that though, like providing a
> method to
> check whether a given record exists.
OK.
Right now, I must admit I have no idea where I'd start creating a
container.
The Subscriber objects have to be allocated somewhere anyway, the difference
is where you'd put what. 14000 elements isn't a particularly large data set
anyway - with 150 bytes of data per record on average, this is about 2 MB in
total.
SubscriberContainer wouldn't necessarily be a real collection. But I'll let
the code do the talking:
class SubscriberContainer
  def self.[](email)
    sub = Subscriber.new
    sub.link_to(email)
    return sub
  end
end
Of course, you could use some weak array / hash or something like that to
cache these objects. I haven't really worked with that though. In this case,
it would even be recommendable, because two objects representing the same
record would very probably lead to bugs because of the caching.
For example if there's a record for "fred@flintstone.com" with an attribute
"name" with the value "Fred Flintstone", you'd get:
ff1 = Subscriber.new
ff1.link_to "fred@flintstone.com"
ff2 = Subscriber.new
ff2.link_to "fred@flintstone.com"
ff1.name  # The value "Fred Flintstone" gets cached.
ff2.name  # Same as above.
ff1.name = "Barney Rubble"  # Value gets changed on disk and inside the ff1 object.
puts ff2.name  # The ff2 object doesn't notice the change, and doesn't reread its cached value.
Accessing the records only through objects managed by the container would prevent
this, because at most one Subscriber object would exist per record.
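
A rough sketch of that idea, using a plain Hash as the cache (a weak-reference variant would additionally let unused objects be garbage collected). The Subscriber here is a stand-in with only an `email` attribute, not the real class, and the `size` method is added only to make the effect visible.

```ruby
# Stand-in for the real Subscriber; assumed for illustration only.
class Subscriber
  attr_reader :email

  def initialize(email)
    @email = email
  end
end

class SubscriberContainer
  def initialize
    @cache = {}   # email => Subscriber
  end

  # Return the one Subscriber for this e-mail, creating it on first access.
  def [](email)
    @cache[email] ||= Subscriber.new(email)
  end

  # Number of Subscriber objects allocated so far.
  def size
    @cache.size
  end
end

subs = SubscriberContainer.new
ff1 = subs["fred@flintstone.com"]
ff2 = subs["fred@flintstone.com"]
puts ff1.equal?(ff2)   # prints "true"
```

Since both variables point at the same object, a value cached through one is the value seen through the other, and only records that are actually accessed get an object allocated.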
At this point, I'd probably think even more of using an ACID compliant database
backend and a persistence layer to handle the nitty gritty details for you.
>> I am not sure why you say that this class would need the data
>> directory...!
>
> Well, I thought of this class providing the created Subscriber with
> the full
> record path and the e-mail address when creating the object. That
> way, you'd
> keep the what's stored where bit out of the Subscriber class.
>
> You could also determine / change the config file directory at
> runtime from a
> parameter to the application more intuitively
> (SubscriberContainer.new("/path/to/record/directory/") instead of
> hardcoding
> it or manipulating class variables), or even have several
> containers - even
> if this would probably be rarely useful.
Very true.
> Last, but not least, at least to me it makes more sense for the
> container to
> have the information where its contents are stored.
You're completely right.
Now that I have a better understanding of scoping, this makes sense.
>> Also, I wonder if accessing too many objects that way wouldn't
>> clutter the collection too much (the "real" number of subscribers we
>> have is about 14000. I KNOW we need a DB. We didn't expect quite so
>> many. I am planning to switch to DB)
>
> You wouldn't have to actually store the retrieved object in the
> container
> after creation / loading, just have the container do this work and
> dumb down
> the Subscriber storage to data transfer and mutation, those being
> ignorant to
> as much context as possible.
But this would surely mean that "Subscriber" is not really usable
without the container... wouldn't it?
(maybe that's not a problem?)
No, it's not a problem. The functionality provided would stay the same, just
the responsibility for providing it would be split across the two classes.
The fact that a container class exists could be concealed from a lot of the code using
the subscriber objects.
The point is keeping sets of related bits of code separated from each other as
much as possible - we need only very little information from a Subscriber
object to store a new value of a field of the object - only the id of the
record (the e-mail address), the field name, and the new value. Therefore
it's more concise to have a separate component with access to the minimum
amount of information necessary to implement this operation.
>>> I'd say read through the Gang of Four and Refactoring,
>>
>> Woops... I've lost you here. Are you talking about one specific book?
>> Or two books?
>
> Well, THE Refactoring book *cackle*. But Dave Cantrell already
> answered this
> perfectly. Gang of Four is a nickname of the four authors of the
> book. Not
> quite up-to-date as far as the patterns mentioned are concerned -
> there are
> already droves more that have been invented since. But I like the
> case study
> bit as an explanation that shows an example of quite a few of those
> applied
> in a single program.
OK.
I find that a lot of these books apply to Java or C++. Is there a
Ruby Patterns book out there?
If not... well, it *should* exist!
I think someone made "translations" of the source code in these books to Ruby
and announced that to the ML. Try searching the archives for it? The basic
concepts are pretty much the same between the languages, except for a few
differences in what "special" language features can be used to implement what
patterns more efficiently than the "standard" ones.
>> @country=nil
>> @creation_date=nil
>> @name=nil
>> @password=nil
>> @postcode=nil
>> @premium_expiry_date=nil
>> @questionnaire_res=nil
>> @subscriber_code=nil
>> @subscriber_comments=nil
>
> This shouldn't be necessary. Reading uninitialized instance
> variables results
> in a nil by default.
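
A small illustration of that behaviour, with a cut-down Subscriber that has just two of the accessors from the list above:

```ruby
class Subscriber
  attr_accessor :name, :country
end

sub = Subscriber.new
p sub.name                                 # prints nil
p sub.instance_variable_defined?(:@name)   # prints false
sub.name = "Tony"
p sub.instance_variables                   # prints [:@name]
```

The variable only comes into existence on first assignment; reading it through the accessor before that simply yields nil.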
This is me being me. It's nice to know WHAT instance variables are
there. I'm an obsessive compulsive, you see.
OK, comments exist for that reason...
Hehe, I know that. I can't learn to omit the return keyword, even if it's
actually slower, and on the other hand, can't make myself write parentheses
when declaring a method without arguments...
Oh, and I also do the assignment of nil thing, just not for variables I have
accessors for.
>> # Set the value to nil. This is to reflect the
>> # "real" state of the variable (the file has just
>> been
>> # cleared up by the previous call)
>> #
>> begin
>> ios.print(value)
>> rescue SystemCallError
>> ios.close
>> return nil
>> end
>
> I'd merge this code block with the previous one, and possibly
> handle the
> exception somewhere else, reporting it to the user as a severe
> failure, and
> logging it. You could also use the block form of File::open here.
OK.
I didn't have logging abilities in the object right now. I have no
idea where to start, with that.
Print to STDERR? You might check how your webserver works and if you can
integrate into that.
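
One possible starting point, assuming the standard library Logger is acceptable and that whatever runs the script captures STDERR (that part depends on the webserver). The `write_value` helper and the `LOG` constant are hypothetical names for this sketch.

```ruby
require 'logger'

# Log to standard error; the level filters out anything below warnings.
LOG = Logger.new($stderr)
LOG.level = Logger::WARN

# Hypothetical helper: write a value to a file, logging any system-level failure.
def write_value(path, value)
  File.open(path, "w") { |ios| ios.print(value) }
rescue SystemCallError => e
  LOG.error("could not write #{path}: #{e.message}")
  raise   # re-raise so the caller still sees the failure
end
```

The error is logged once, close to where it happened, and still travels up the stack for the application to report to the user.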
I also divided the block in two, because in the second half of the
code I close the file if there was a problem.
I can see that with the "block" in File::open I can do everything at
once...!
So, the function has become:
def set_field(field,value)
  return nil if ! @email
  # Open the file
  #
  begin
    File::open(self.full_path+field.to_s,"w") do |ios|
      ios.print(value)
    end
  rescue SystemCallError
    raise
    return nil
  end
  value
end
I think you can actually omit the begin / rescue / end here. The "return nil"
is never reached. You can't both raise an exception and return from a
function normally.
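
A tiny demonstration of why the line after the bare `raise` is dead code. The method name `risky` and the error message are made up for the example.

```ruby
def risky
  begin
    raise IOError, "disk full"
  rescue IOError
    raise        # re-raises the exception currently being handled ...
    return nil   # ... so this line is never executed
  end
end

result = begin
  risky
rescue IOError => e
  "caught: #{e.message}"
end
puts result   # prints "caught: disk full"
```

A rescue clause that does nothing but re-raise behaves the same as having no rescue clause at all, which is why the begin / rescue / end wrapper can go.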
> You'd end up cluttering the code with checks for nil all the time,
> which is
> only proper if you expect the issue to appear during more-or-less
> regular
> operation; e.g. for calls where a failure isn't abnormal.
OK.
> Same here as in the previous method, just let the SystemCallError
> pass up on
> the stack -it's probably not possible to gracefully recover from it.
OK.
I've just put "raise" there:
def set_flag(flag,value)
  return nil if ! @email
  if(value)
    begin
      File.open(self.full_path+flag.to_s,"w")
    rescue SystemCallError
      raise
    end
    return true
  else
    begin
      File.delete(self.full_path+flag.to_s)
    rescue SystemCallError
      raise
    end
    return false
  end
end
Same as above, you can as well let the call to File.open raise the exception
for you, it's not necessary to do it explicitly.
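
A sketch of how set_flag might look without the explicit rescue blocks. To keep it self-contained, the directory is handed to a hypothetical `FlagStore` class instead of being computed from @email, and the open call uses an empty block so the file handle is closed again right away (the version without a block leaves it open until garbage collection).

```ruby
require 'tmpdir'

# Simplified stand-in: full_path is passed in rather than derived from an e-mail.
class FlagStore
  def initialize(full_path)
    @full_path = full_path
  end

  def set_flag(flag, value)
    path = File.join(@full_path, flag.to_s)
    if value
      File.open(path, "w") { }   # create (or truncate) the flag file, then close it
      true
    else
      File.delete(path)          # raises Errno::ENOENT if the flag was not set
      false
    end
  end
end

Dir.mktmpdir do |dir|
  store = FlagStore.new(dir)
  store.set_flag(:premium, true)
  puts File.exist?(File.join(dir, "premium"))   # prints "true"
  store.set_flag(:premium, false)
  puts File.exist?(File.join(dir, "premium"))   # prints "false"
end
```

Any SystemCallError from File.open or File.delete simply propagates to the caller.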
Well, the only obscure bit now is how to make this "containable".
It's probably not necessary, because the class works well "as it is".
However...
Very true. The code as it is is likely to work well enough until you get a
large enough codebase to warrant separating data storage and data access.
OK, I am assuming that to make the container, basically I would have:
* A class called "SubscribersContainer". This class would have the
methods "country=", country(), and so on; those methods would all use
the methods set_field and get_field, NOT implemented in the container
* A class called SubscriberFS (which I have), which would ONLY
implement get_field, set_field, get_flag, set_flag. These methods
will be used by the container to do the "real" work
* I could also have a class called SubscribersDB, which would do the
same things but connecting to a database
Actually, I meant the naming the other way around. Subscriber would access the
data, and the container would represent the backends - the roles of the
respective objects stay the same. You'd have a single Subscriber class, and
then a separate FSContainer and a DBContainer, that would implement the
specifics of writing the data into the backends.
Something like (excerpts):
class Subscriber
  attr_accessor :email

  def initialize(container)
    @container = container
  end

  def get_field(field)
    @container.get_field(@email, field)
  end
  # Etc. for #set_field, #get_flag, #set_flag
end

class FSContainer
  def get_field(email, field)
    # Find respective file, read it, return what's inside.
  end

  # Create a new Subscriber stored in this container.
  # (An instance method, so that the Subscriber gets a container
  # object that actually responds to #get_field.)
  def [](email)
    sub = Subscriber.new(self)
    sub.email = email
    return sub
  end
end

class DBContainer
  def get_field(email, field)
    # Connect to the DB and get the needed data from it.
  end
end
My assumption is that most of the time, you need to manipulate the data in the
Subscriber records without caring how or where they are stored. For creation
of new records, you could set a "default" container to use for that in
initialization.
Or possibly make a "container of containers" - when looking for an existing
record, this one would search the two "real" ones, and when creating a new
record, use the default one. This way you'd completely contain the way the
records are stored in the backends from the creation / loading of records -
the operations you commonly need would be the same code no matter what
backend is used.
In the latter case, better names for classes would be Subscriber,
SubscriberFactory, FSStorage, DBStorage. The SubscriberFactory would be the
mentioned "container of containers", a class responsible for the creation and
finding of Subscribers.
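
A rough skeleton of that naming, under some loud assumptions: `MemoryStorage` is an in-memory stand-in used in place of both FSStorage and DBStorage so the example runs anywhere, the `include?` method name is a guess at what the storages would offer, and only the `name` field is wired up.

```ruby
# In-memory stand-in for FSStorage / DBStorage; assumed for illustration.
class MemoryStorage
  def initialize
    @records = Hash.new { |hash, email| hash[email] = {} }
  end

  def include?(email)
    @records.key?(email)
  end

  def get_field(email, field)
    @records.key?(email) ? @records[email][field] : nil
  end

  def set_field(email, field, value)
    @records[email][field] = value
  end
end

class Subscriber
  attr_reader :email, :storage

  def initialize(email, storage)
    @email = email
    @storage = storage
  end

  def name
    @storage.get_field(@email, :name)
  end

  def name=(value)
    @storage.set_field(@email, :name, value)
  end
end

class SubscriberFactory
  def initialize(default_storage, *other_storages)
    @default = default_storage
    @storages = [default_storage, *other_storages]
  end

  # Use the storage that already holds this record, else the default one.
  def [](email)
    storage = @storages.find { |s| s.include?(email) } || @default
    Subscriber.new(email, storage)
  end
end

fs = MemoryStorage.new
db = MemoryStorage.new
fs.set_field("fred@flintstone.com", :name, "Fred Flintstone")

subscribers = SubscriberFactory.new(db, fs)
puts subscribers["fred@flintstone.com"].name    # prints "Fred Flintstone"
subscribers["merc2@mobily.com"].name = "Tony"
puts db.get_field("merc2@mobily.com", :name)    # prints "Tony"
```

The code using `subscribers[...]` never needs to know which storage a record lives in; swapping MemoryStorage for real file or database classes would only require them to offer the same three methods.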
However, I have so many questions in this case... For example,
SubscribersDB would need far more information than just a path (like
SubscribersFS). Where would this information be stored? What if it
changed?
Given my proposed design, of course, the DBStorage would need information how
to connect to a database, and about its layout. However, the subscribers
would still remain uniquely identified by their e-mail addresses, and
DBStorage would implement the same operations the Subscriber class needs from
its storage object as FSStorage, just using DB access instead of file access.
Mind you, I'd only use this specific approach if, and only if you really need
to support both the backends at once, and only for few classes. Otherwise,
you'd need to eventually connect each data class with each backend using a
separate backend, which would be really messy, or have to implement a generic
storage adapter for any type of record, which would be complicated, and
probably has already been done for SQL DBs. Of course, code based on very
similar concepts could be handy when migrating the data between backends.
And what would actually *happen* when I did
Subscriber['merc2@mobily.com'].name="Tony", data-wise?
Well, supposing the record doesn't exist already:
- The call to Subscriber::[] (SubscriberFactory::[] in the naming I proposed
earlier) would use FSStorage or DBStorage to find a record for
"merc2@mobily.com". (Using a method named like BlahStorage#include? or
similar.)
- This would fail (the record doesn't exist), so the SubscriberFactory would
create a new Subscriber object for this e-mail, using the default storage to
handle it (let's say it's DBStorage)
- SubscriberFactory::[] returns this new Subscriber object - I'll name it
"subscriber" below
- The call to subscriber.name = "Tony" delegates to
subscriber.set_field(:name, "Tony")
- subscriber.set_field(:name, "Tony") calls
@storage.set_field("merc2@mobily.com", :name, "Tony")
(The last two steps could be merged into one, but the call to set_field would
have to be changed in all the setters. Using this middle-man is the lazy way
out.)
The last method is probably implemented as something like:
class DBStorage
  def set_field(email, field, value)
    # db is the database connection
    db.execute(
      "UPDATE subscribers
       SET subscribers.#{field.to_s} = ?
       WHERE subscribers.email = ?",
      value, email)
  end
end
I feel I am out of my league here.
Patience, young grasshopper. (Even if it's more likely you're older than
me :P)
If you have time, David (or anybody else), I would love it if you
could write a basic skeleton for the two classes - something that
would make me understand what goes where.
However, I feel I am abusing your time. So... let's say that I'm
not expecting it!
Maybe, if I remember this during the day, to get my mind off Java at work for a
little while; I already have enough Ruby coding in my free time on a side job...
David Vallner