Hard drive backup program in 100% pure ruby

Just figured I'd mention that I'm working on this:

http://www.subspacefield.org/security/hdb/

I think that I'll run into needing some features from the File or
FileUtils classes (such as mapping from filename to inode numbers) that
likely won't be present, or won't be implemented on certain file
systems, OSes, and so on, so I'll end up being a contributor.

This is my first ruby project, however, and I'm still learning.

By the way, if you know anyone (e.g. a student) who could benefit from
a case study in applying OOP/OOD to a real project, I'm writing it up
here:

http://www.subspacefield.org/~travis/hdb_history.html

My background is python, and here were my reactions to ruby:

1) Assignment is weird. Both names point to the same object,
   unless it's a "small" type (fits in a register).

   a = 5
   b = a
   b = 3
   puts a # prints 5

   a = "hello"
   b = a
   b['hell'] = 't'
   puts a # prints 'to'

   That really confused me for a bit. Fix is easy - always use dup in
   your initializers. It reminds me of the pass-by-value
   vs. pass-by-reference semantics.

2) I originally wrote some fancy set operations (like Set class but
   not using a hash) using array operators like -=. Problem is that
   these operators remove _the same object_ from an array, but not an
   equivalent (eql?) object. So when I compute metadata entries for
   a file based on the file, and then based on saved metadata, the
   same file with the same metadata is not the same object, so the -=
   operator won't actually remove it from the array.

   I'm considering the work it would require to implement -= using eql?
   but it seems complicated, so I'm thinking of switching to using Set.
   Downside of this is memory overhead for hashing and the random access
   I won't use much, also the fact that I build up these sets of metadata
   incrementally and so I'd be rehashing all the time. Building an array
   and then freezing it into a set seems possible, but cumbersome.

3) What's the syntax for class method constructors? I.e. I want four different
   ways to create metadata entries, some of which have overlapping signatures,
   so I can't use a single initializer method. (Yes, I know initialize is not
   a constructor).

   Currently I do this by having an empty initializer and doing something like:
   a = Whatever.new().create_from_string(string_args)

In general I like ruby - it's a very terse language but not overly cryptic.
The documentation could be improved, but that's another list.

Also, whatever MLM you're using (FML?) doesn't recognize valid
RFC [2]822 addresses.

travis+ml-ruby-talk@subspacefield.org IS a valid address, contrary to
what it believes. This seems to be the only MLM I've encountered that
doesn't do correct parsing/identification of email addresses (web apps
are a different story... I wish people would RTFM...)


--
A Weapon of Mass Construction
My emails do not have attachments; it's a digital signature that your mail
program doesn't understand. | http://www.subspacefield.org/~travis/
If you are a spammer, please email john@subspacefield.org to get blacklisted.

Just figured I'd mention that I'm working on this:

Hard Drive Backups With HDB

Looks interesting...

I think that I'll run into needing some features from the File or
FileUtils classes (such as mapping from filename to inode numbers)

Really? Check out File#stat.
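
To make that concrete, here is a minimal sketch of getting an inode number
from a filename. The helper name inode_of is made up for illustration (it is
not part of HDB), and I'm assuming a POSIX-like filesystem; on other
platforms the number may not be meaningful.

```ruby
require "tempfile"

# Hypothetical helper: map a filename to its inode number.
# File.stat follows symlinks; use File.lstat to inspect the link itself.
def inode_of(path)
  File.stat(path).ino
end

# Try it on a throwaway file so the example does not depend on any real path.
Tempfile.create("hdb-demo") do |f|
  puts inode_of(f.path)          # inode number via the filename
  puts File.lstat(f.path).ino    # same number here, since this is not a symlink
end
```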

My background is python, and here were my reactions to ruby:

1) Assignment is weird. Both names point to the same object,
   unless it's a "small" type (fits in a register).

I don't think so. For example:

irb(main):001:0> 5.instance_variable_set :@foo, 'five'
=> "five"
irb(main):002:0> 3.instance_variable_get :@foo
=> nil
irb(main):003:0> 5.instance_variable_get :@foo
=> "five"
irb(main):004:0>

In other words, the "small" types point to the same object. As an
implementation detail, Ruby likely does store them as plain old numbers in a
register, but you don't have to care about that.
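
Another way to see it, as a rough sketch: assignment only ever rebinds a
name, for every type. What differs between your two examples is that the
string one mutates the shared object, while the integer one rebinds b. (I
believe newer Ruby versions freeze integers, so the instance_variable_set
trick above may raise there; the version below avoids it.)

```ruby
a = 5
b = a
puts a.equal?(b)   # true: both names refer to the very same object
b = 3              # rebinds b to a different object; a is untouched
puts a             # 5

s = "hello".dup    # dup so the example also works with frozen string literals
t = s
t["hell"] = "t"    # mutates the one string that both names refer to
puts s             # to

t = "other"        # rebinding t, by contrast, leaves s alone
puts s             # to
```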

   That really confused me for a bit. Fix is easy - always use dup in
   your initializers.

What? Why?

I mean, I guess -- there's certainly precedent for it. But most of the time,
when I'm passing something to an initializer, I'm not keeping a reference
around -- or if I am, it's because I actually want the object to be shared.
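
For comparison, a small sketch of the two choices. The class names Sharing
and Copying are made up for illustration; which behaviour you want depends on
whether callers are expected to keep mutating what they passed in.

```ruby
# Keeps a reference to whatever it is given.
class Sharing
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Takes a private (shallow) copy of its argument.
class Copying
  attr_reader :name

  def initialize(name)
    @name = name.dup
  end
end

original = "hello".dup
shared = Sharing.new(original)
copied = Copying.new(original)

original << " world"   # the caller mutates its string afterwards

puts shared.name       # hello world
puts copied.name       # hello
```

Note that dup is shallow, so an array of strings would still share the
strings themselves.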

2) I originally wrote some fancy set operations (like Set class but
   not using a hash) using array operators like -=.

Careful. Remember:

a -= b

always expands to:

a = a - b

In other words, you're duping your entire array every single time you do that.
Is that what you want? You could always do array.delete_if (or array.reject,
if you want the duping behavior) -- those take blocks which let you set your
own comparison.
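
A rough sketch of what I mean. Entry and same_metadata? are made-up names
standing in for your metadata class and whatever comparison you need:

```ruby
# Hypothetical metadata record with no custom equality defined.
class Entry
  attr_reader :path, :size

  def initialize(path, size)
    @path = path
    @size = size
  end
end

# Hypothetical comparison: two entries match if their fields match.
def same_metadata?(x, y)
  x.path == y.path && x.size == y.size
end

current = [Entry.new("a.txt", 10), Entry.new("b.txt", 20)]
saved   = [Entry.new("a.txt", 10)]

# reject leaves `current` alone and returns a new array.
unsaved = current.reject { |e| saved.any? { |s| same_metadata?(e, s) } }
puts unsaved.map(&:path).inspect   # ["b.txt"]

# delete_if edits `current` in place, with no copy of the array.
current.delete_if { |e| saved.any? { |s| same_metadata?(e, s) } }
puts current.length                # 1
```

This scans `saved` once per element, so it is quadratic; fine for small
lists, less so for a whole disk's worth of files.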

I don't know offhand how any of the builtin array operations compare objects,
or what you need to override. There are at least four builtin equivalence
operators, which all do slightly different things:

a == b
a === b
a.eql?(b)
a.equal?(b)

And I don't know offhand what each of those does, but you should look them up
before overriding.
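
For what it's worth, here is a sketch of how they differ, as far as I
understand it: equal? is object identity, eql? together with hash is what
Hash, Set and Array#- rely on, == is general value equality, and === is
mostly for case/when. MetaEntry is a made-up stand-in for your class; a class
that defines none of these falls back to identity, which would explain what
you saw with -=.

```ruby
require "set"

# Hypothetical metadata record that defines value-based equality.
class MetaEntry
  attr_reader :path, :size

  def initialize(path, size)
    @path = path
    @size = size
  end

  def ==(other)
    other.is_a?(MetaEntry) && path == other.path && size == other.size
  end
  alias_method :eql?, :==

  # Objects that are eql? must also return the same hash value.
  def hash
    [path, size].hash
  end
end

x = MetaEntry.new("a.txt", 10)
y = MetaEntry.new("a.txt", 10)

puts x == y            # true:  same values
puts x.eql?(y)         # true:  aliased to == above
puts x.equal?(y)       # false: two distinct objects
puts MetaEntry === x   # true:  Class#=== tests membership, as in case/when

difference = [x] - [y]
puts difference.empty? # true:  Array#- now treats x and y as equivalent
puts Set[x, y].size    # 1

# The built-in numbers show that == and eql? need not agree:
puts 1 == 1.0          # true
puts 1.eql?(1.0)       # false
```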

3) What's the syntax for class method constructors?

...what?

I.e. I want four
different ways to create metadata entries, some of which have overlapping
signatures, so I can't use a single initializer method.

Are you sure there's no way your initializer can tell what it's dealing with?
For example, say you ultimately expect a string:

def initialize anything
  @foo = anything.to_s
end

And while I don't ordinarily encourage type checking, I think it's OK if
you're dealing with option arguments:

def initialize options
  if options.kind_of? Hash
    @a = options[:a]
    @b = options[:b]
  else
    @a = options.to_s
  end
end

   Currently I do this by having an empty initializer and doing something
like: a = Whatever.new().create_from_string(string_args)

By the way: The parens are optional in Ruby. Even in the DataMapper code,
where they're encouraged, I don't think I ever see the empty parens like that.

That said, there's nothing stopping you from doing this:

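class Whatever
  class << self
    # Factory: build an empty object, then let it fill itself in.
    def create_from_string(string_args)
      new.tap { |obj| obj.create_from_string(string_args) }
    end
  end

  # Instance-level counterpart; the name has to match the call above.
  def create_from_string(string_args)
    # ...
  end
end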

That, or just do your initialization inside the body of 'create_from_string'
-- or accept a single standardized format (like an options hash) in the
initialize method, and just convert the specialized argument (like the
argument to create_from_string) into something more generic.

Remember, there's no special "new" syntax, so there's absolutely nothing
stopping you from making as many factory methods as you like. (Perl is the
same way, and I've always wondered why more languages aren't.)
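
As a sketch of the "one generic initialize plus several factories" idea:
everything below is invented for illustration, including the class name
MetadataEntry, its two fields, and the "path:size" string format, since I
don't know what your entries actually hold.

```ruby
# Hypothetical metadata entry with one generic initializer.
class MetadataEntry
  attr_reader :path, :size

  # The only real initializer takes an options hash.
  def initialize(options)
    @path = options.fetch(:path)
    @size = options.fetch(:size)
  end

  # Factory: parse a made-up "path:size" line from saved metadata.
  def self.from_string(line)
    path, size = line.split(":", 2)
    new(:path => path, :size => Integer(size))
  end

  # Factory: read the metadata from the file itself.
  def self.from_file(path)
    new(:path => path, :size => File.size(path))
  end
end

entry = MetadataEntry.from_string("notes.txt:42")
puts entry.path   # notes.txt
puts entry.size   # 42
```

Each factory can have whatever signature it likes, so overlapping argument
types stop being a problem.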

There's a limit to how much advice I can give you, though, without a little
more detail about what you're trying to do. For example, what is it that
you're trying to put into set-like behavior, and under what circumstances?
What kind of object are you trying to create_from_string, and in what ways
do you want to be able to initialize it?

On Wednesday, July 28, 2010 01:06:56 pm travis.ml-ruby-talk@subspacefield.org wrote: