break;
case 'T':
  /* read until end of string or until a null character occurs */
  {
    char *start = s;
    while (s < send) { /* don't read more than the whole string */
      if (*s == '\000') break;
      s++;
    }
    rb_ary_push(ary, infected_str_new(start, s-start, str));
    if (s < send && *s == '\000') s++; /* skip null character */
  }
How can I unpack two or more consecutive C-strings with the
String#unpack method? Like this:
"abc\000def\000".unpack("??") # => ["abc", "def"]
Currently, this seems not to be possible. Any chance to get the
following patch applied, which implements exactly this?
[snip] diff -r1.69 pack.c
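For the simple case where the whole string is nothing but null-terminated fields, a workaround that needs no patch is to split on the null byte. This is only a sketch and was not proposed in the thread; it cannot be mixed with other unpack directives, and String#split drops trailing empty fields.

```ruby
# Workaround sketch: split a buffer of consecutive C-strings on the NUL byte.
data = "abc\000def\000"
fields = data.split("\0")
p fields
```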
At the risk of being told to clear off and write my own spec,
I think that an ambiguity has intruded into the designer’s mind.
The A and Z string field formats should IMO be recovered from
left to right. Doesn’t the term “string” relate here to a
string element within a packed field? The packed field just
happens to be a Ruby String.
If this is going to break code, I wish that it could happen
from 1.9.
As it is now, A and Z are behaving the way I would expect
A* and Z* to (i.e. * uses all remaining elements).
There’s String#rstrip for removing spaces and nulls from the
end of a String.
Unpack is very useful for decoding structures but with the
current behaviour if a structure were to contain a null-
terminated string element it would break the flow …
… as Michael has highlighted.
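To illustrate the point about structures, here is a record with a null-terminated string in the middle, decoded by hand. The record layout (16-bit big-endian id, null-terminated name, 16-bit big-endian value) is hypothetical and chosen only for this sketch; the offsets are computed manually instead of relying on a single unpack template.

```ruby
# Hypothetical record: 16-bit id, NUL-terminated name, 16-bit value.
record = [7].pack("n") + "abc\0" + [42].pack("n")

id = record.unpack("n").first

# Find the terminator of the name field, which starts at byte offset 2.
name_end = record.index("\0", 2)
name = record.byteslice(2, name_end - 2)

# The next field starts right after the terminator.
value = record.byteslice(name_end + 1, 2).unpack("n").first

p [id, name, value]
```

If a Z directive stopped at the terminator and left the read position just past it, the same record could be described by one template instead of manual offset arithmetic.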
At Sun, 25 Apr 2004 14:34:03 +0900,
daz wrote in [ruby-talk:98298]:
The A and Z string field formats should IMO be recovered from
left to right. Doesn’t the term “string” relate here to a
string element within a packed field? The packed field just
happens to be a Ruby String.
rb_ary_push(ary, infected_str_new(s, t - s, str));
if (t < send) t++;
s = t;
}
else {
Combining that with recognition of the length specifier:
===============================
case 'Z':
  {
    char *t = s;
    if (len > send-s) len = send-s;
    while (t < s+len && *t) t++;
    rb_ary_push(ary, infected_str_new(s, t-s, str));
    if (t < send) t++;
    s = star ? t : s+len;
  }
  break;
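The pointer arithmetic in the proposed 'Z' case can be modelled in plain Ruby to see where the read position ends up. This is a sketch only: unpack_z is a hypothetical helper, not part of Ruby, and it assumes that a `*` count makes len cover the rest of the input, as the surrounding pack.c code is understood to do.

```ruby
# Pure-Ruby model of the proposed 'Z' case (hypothetical helper).
# pos plays the role of `s`, data.bytesize the role of `send - start`.
def unpack_z(data, pos, len, star)
  remaining = data.bytesize - pos
  # Assumption: '*' means "use everything that is left".
  len = remaining if star || len > remaining

  t = pos
  t += 1 while t < pos + len && data.getbyte(t) != 0   # scan to NUL or field end
  field = data.byteslice(pos, t - pos)

  t += 1 if t < data.bytesize                          # step past the byte at t
  next_pos = star ? t : pos + len
  [field, next_pos]
end

data = "abc\0def\0".b
first, pos = unpack_z(data, 0, 0, true)
second, pos = unpack_z(data, pos, 0, true)
p [first, second, pos]
```

With a star count the position advances to just past each terminator, so two consecutive reads recover both fields.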
At Mon, 26 Apr 2004 16:19:04 +0900,
daz wrote in [ruby-talk:98364]:
Combining that with recognition of the length specifier:
===============================
case 'Z':
  {
    char *t = s;
    if (len > send-s) len = send-s;
    while (t < s+len && *t) t++;
    rb_ary_push(ary, infected_str_new(s, t-s, str));
    if (t < send) t++;
    s = star ? t : s+len;
  }
  break;
===============================
I had also considered that, but
s = "abc\0def\0\0jkl\0"
s.unpack('Z6ZZ') #-> ["abc", "f", ""]
It can’t round-trip with Array#pack, so I discarded this plan.
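The round-trip concern can be checked by packing the quoted result back with the same template. In this sketch the array is taken from the post as given, not produced by calling unpack, so it does not depend on which Ruby version is in use.

```ruby
s = "abc\0def\0\0jkl\0"
unpacked = ["abc", "f", ""]        # result quoted in the post for s.unpack('Z6ZZ')
repacked = unpacked.pack('Z6ZZ')

# Z6 emits 6 bytes and each bare Z emits 1 byte, so repacked has 8 bytes,
# while the original string has 13.
puts repacked.bytesize
puts s.bytesize
puts repacked == s
```

The "de" consumed by Z6 and everything after the third field are lost, which is why the packed form cannot match the original.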
As the change only affects ‘Z’-types in Strings with embedded null(s),
the impact should be extremely low.
I’m trying to think of any kind of string which might contain
significant nulls but also has a null as terminator.
I’ve seen some where null delimits fields and double-null terminates
but that rare case might be the only one to break iff a
programmer had decided that the best method to use on that type of string
was unpack('Z*').
Embedded nulls are common when reading from binary files
(e.g. encoded characters) but I feel that it would never be a good idea
to strip trailing nulls in that context.
Voting +1 for inclusion in 1.8, also. Much more usable