Ben Nagy wrote:
> Have you thought about not using String as the base class? For instance,
> OpenStruct would be almost OK for my purposes, if it sustained ordered
> output. If I had to hack things up without guidance I would probably start
> with a Hash and have :fieldname -> pos, val, type internally. You wouldn't
> be able to treat the whole object like a string directly, but overloading
> to_s shouldn't be too ugly syntactically? The type definition would still be
> used to meta-create a class 'parse' method that does the parsing, to convert
> from a raw string (or I guess you could just use o=Class.new(String)).
This is a good point (about String as the base class), and it brings up
the threshold at which bit-struct loses its usefulness. If you're doing
a lot of complex accessor operations (esp. the var-length fields), then
operating on a string just gets hopelessly mucky. It's better to use
some structured data type, and follow the parse->operate->unparse cycle.
BitStruct has been useful in cases where I only need to touch a field or
two and then just pass the string on somewhere else (a socket, a file, a
database, etc.). In these cases, parsing all the fields is a waste of time.
So, what kind of data structure to use...
A hash of fieldname => [pos, val, type] has the disadvantage that each
field must know its position. If you increase the length of one field,
you have to search for all other fields with higher pos, and increase
their pos.
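A rough sketch of that bookkeeping, assuming a hypothetical three-field layout where every entry stores its own byte offset (the field names and the resize_field helper are invented for the example):

```ruby
# Hypothetical layout: each field records its own byte offset.
fields = {
  name:    { pos: 0, val: "abc", type: :string },
  comment: { pos: 3, val: "hi",  type: :string },
  tail:    { pos: 5, val: "xyz", type: :string },
}

# Resizing one field means shifting every field that sits after it.
def resize_field(fields, key, new_val)
  field = fields.fetch(key)
  delta = new_val.bytesize - field[:val].bytesize
  fields.each_value do |f|
    f[:pos] += delta if f[:pos] > field[:pos]
  end
  field[:val] = new_val
end

resize_field(fields, :name, "abcdef")   # comment and tail both move by 3
```

Every write that changes a length costs a pass over the whole table.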
An array of values, with defined accessors plus #parse and #to_s
methods, is probably better. I think Ara Howard's arrayfields lib might
be a place to start, and then you can implement #parse and #to_s using
#unpack and #pack. You don't need to keep track of pos and update it
each time a field changes size, as long as each field knows its
(current) length. Don't worry about actual offsets except in #to_s. Be lazy.
With this approach, the accessors will be much more efficient than
BitStruct's, but #parse and #to_s will be less efficient.
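Something along these lines, as a sketch only. It uses a plain Array inside a class rather than arrayfields (whose API is not shown here), and the Record class, its two fields, and the wire format (16-bit id, one length byte, variable-length payload) are all assumptions made up for the example.

```ruby
# Hypothetical record: 16-bit id, 8-bit payload length, variable-length payload.
class Record
  FIELDS = [:id, :payload].freeze

  # Define a reader and a writer per field, each backed by a slot in @values.
  FIELDS.each_with_index do |name, i|
    define_method(name) { @values[i] }
    define_method("#{name}=") { |v| @values[i] = v }
  end

  def initialize(*values)
    @values = values
  end

  # Decode every field once, up front.
  def self.parse(str)
    id, len = str.unpack("nC")
    new(id, str[3, len])
  end

  # Offsets and the length byte are worked out only here.
  def to_s
    [id, payload.bytesize, payload].pack("nCa*")
  end
end

rec = Record.parse([42, 3, "abc"].pack("nCa*"))
rec.payload = "abcdef"   # no position bookkeeping on write
wire = rec.to_s
```

The accessors are plain array lookups; all the packing cost is paid in parse and to_s.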
> The trouble is that I'm still getting to grips with the nontrivial parts of
> Ruby metaprogramming, so there are a few fiddly details that I'm mentally
> glossing over. I _think_, for example, that it would be cool to be able to
> define the class fields as using any calculated value at the time of
> instantiation. Take UDP for example, where the checksum is performed over a
> pseudoheader + payload. That rapidly starts to twist my brain though, since
> the UDP object would need to know if it is the payload of an IP object
> before being able to calculate the checksum. Gah. Maybe some Proc that is
> called when you call o.field.refresh (which gets called the first time
> during instantiation)... but then the checksum depends on other calculated
> fields like length so it needs to be done last... ok my brain just exploded.
It's probably better to compute the checksum in terms of the string
representation, rather than try to perform the calculation in terms of
individual field values (which may be in the wrong byte order, may have
too much precision, may need to be shifted into position in a bit field,
...).
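For instance, a sketch of the UDP case done entirely on packed bytes. The internet_checksum helper name, the addresses, and the ports are made up for illustration; the algorithm is the usual one's-complement sum of 16-bit words (RFC 1071), and the pseudo-header follows the IPv4 layout from RFC 768.

```ruby
# One's-complement sum of 16-bit big-endian words, then complemented.
def internet_checksum(bytes)
  bytes += "\x00" if bytes.bytesize.odd?   # pad to a whole number of words
  sum = bytes.unpack("n*").sum
  sum = (sum & 0xFFFF) + (sum >> 16) while sum > 0xFFFF   # fold carries back in
  ~sum & 0xFFFF
end

# Example values only: addresses, ports, and payload are assumptions.
src     = [192, 168, 0, 1].pack("C4")
dst     = [192, 168, 0, 2].pack("C4")
payload = "hi"
udp_len = 8 + payload.bytesize

header = [1234, 53, udp_len, 0].pack("n4")          # checksum field zeroed
pseudo = src + dst + [0, 17, udp_len].pack("CCn")   # 17 = UDP protocol number

checksum = internet_checksum(pseudo + header + payload)
datagram = [1234, 53, udp_len, checksum].pack("n4") + payload
```

Byte order, field widths, and bit positions are all settled by pack before the checksum ever sees the data, so it can be computed last without knowing anything about the individual fields.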
I hope you find it worthwhile to work on a library like this.
--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407