Calling a function that takes a struct, not a pointer to a struct?

Related to the recent thread about nested structs
and Ruby/DL, here is the answer from Ruby/DL’s
author:

Robert Feldt said:

module T
  extend DL::Importable

  S2 = struct [
    "int a",
    "int b",
    "int c",
  ]
end

Is this guaranteed to work for all C compilers?

I’m not sure. However, I’ve written similar code.
Would you ask ruby-talk members about portability?

So that’s what I ask you:

Is inlining of nested structs portable between C compilers?

Regards,

···


Robert Feldt

I’m a little confused by the question… are you asking if:

  1. The act of using a struct in the declaration of another
    struct is portable?

  2. If a struct nested inside another struct can be accessed
    in a portable manner?

  3. If a nested struct’s members can portably be accessed by
    simply pretending they exist directly in the containing
    struct?

  4. Something else altogether?

Just wanted to clarify before I try to answer,

Nathaniel

<:((><

···

Robert Feldt [mailto:feldt@ce.chalmers.se] wrote:

Is inlining of nested structs portable between C compilers?

Nathaniel Talbott nathaniel@NOSPAMtalbott.ws wrote on Wed, 10 Sep 2003 22:42:24 +0900:

  3. If a nested struct’s members can portably be accessed by
    simply pretending they exist directly in the containing
    struct?

This is what I’m asking.

Sorry for confusion.

If 3 is the case we can use this technique for the situation
you mentioned.

/Robert

Hmmm… I may end up doing that for now, but it’s just so ugly. It sure
would be nice to have direct support for this in the DL library like so:

module M
extend DL::Importable

S1 = struct [
  'int a',
  'int b'
]

typealias('S1', S1)

S2 = struct [
  'S1 c',
  'S1 d'
]

end

s = M::S2::malloc
s.c.a = 42

Or something like that. I’ve looked at the internals of the DL library some,
and it seems pretty clean, so I may see if I can create a patch to do the
above.

Nathaniel

<:((><

···

Robert Feldt [mailto:feldt@ce.chalmers.se] wrote:

  3. If a nested struct’s members can portably be accessed by simply
    pretending they exist directly in the containing struct?

This is what I’m asking.

Sorry for confusion.

If 3 is the case we can use this technique for the situation
you mentioned.

Nathaniel Talbott nathaniel@NOSPAMtalbott.ws wrote on Thu, 11 Sep 2003 00:06:18 +0900:

Hmmm… I may end up doing that for now, but it’s just so ugly. It sure
would be nice to have direct support for this in the DL library like so:

module M
extend DL::Importable

S1 = struct [
  'int a',
  'int b'
]

typealias('S1', S1)

S2 = struct [
  'S1 c',
  'S1 d'
]

end

s = M::S2::malloc
s.c.a = 42

Or something like that. I’ve looked at the internals of the DL library some,
and it seems pretty clean, so I may see if I can create a patch to do the
above.

I agree it seems pretty straightforward and useful. It still depends
on the answer to question 3 above. But I guess it’s unlikely a
C compiler would do it in any other way.

This “cheat” at least works on gcc 3.2 20020908 on cygwin and
3.2.3 20030422 on Gentoo linux.

Regards,

Robert

Well, I’m very much a C novice, but from my limited understanding of how
structs work, it seems that in a case like:

typedef struct {int a, b;} inner;
typedef struct {int c, d; inner e;} outer;

That outer must be stored like:

int
int
int
int

I think nesting a struct simply tells C to allocate an additional
sizeof(inner) in the outer struct, and the compiler then translates an
outer.inner.whatever reference to access the correct memory in outer. But
that’s mostly a guess.

Nathaniel

<:((><

···

Robert Feldt [mailto:feldt@ce.chalmers.se] wrote:

I agree it seems pretty straightforward and useful. It still
depends on the answer to question 3 above. But I guess
it’s unlikely a C compiler would do it in any other way.

This “cheat” at least works on gcc 3.2 20020908 on cygwin and
3.2.3 20030422 on Gentoo linux.

Nathaniel,

Well, I’m very much a C novice, but from
my limited understanding of how structs
work, it seems that in a case like:

typedef struct {int a, b;} inner;
typedef struct {int c, d; inner e;} outer;

That outer must be stored like:

int
int
int
int

I think nesting a struct simply tells C
to allocate an additional sizeof(inner)
in the outer struct, and the compiler
then translates an outer.inner.whatever
reference to access the correct memory
in outer. But that’s mostly a guess.

This is generally correct. IIRC, in C structure members are guaranteed
to be stored in the same order they were declared. The only thing that
might end up being a problem is alignment issues. A C compiler is allowed
to insert additional filler in a structure to guarantee alignment. I could
imagine a C compiler inserting filler to word align a nested structure, even
if the data members of that structure did not require it. For example:

typedef struct { char b; } inner;
typedef struct { char a; inner c; char d; } outer;

...could be stored as "afbfd" ('f'=filler) instead of "abd". My guess
would be that this occurs rarely if ever in practice, but I believe that it
is possible.
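Whether filler appears can be checked on a given compiler rather than guessed. The following is a minimal sketch, assuming a hosted C implementation that provides <stddef.h>; the types mirror the example above, and `filler_bytes` and `is_packed` are hypothetical helper names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

typedef struct { char b; } inner;
typedef struct { char a; inner c; char d; } outer;

/* Hypothetical helper: total filler in `outer` (including any filler
   inside `inner`), i.e. its size minus the three char members. */
static size_t filler_bytes(void)
{
    assert(sizeof(outer) >= 3 * sizeof(char)); /* members cannot overlap */
    return sizeof(outer) - 3 * sizeof(char);
}

/* Hypothetical helper: 1 if the layout is "abd" with no filler, else 0. */
static int is_packed(void)
{
    return offsetof(outer, c) == 1
        && offsetof(outer, d) == 2
        && sizeof(outer) == 3;
}
```

The result only describes the compiler and options used to build it; another compiler is free to answer differently.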

I hope this helps.

- Warren Brown

Would that this were true. However, the ANSI C standard doesn’t
require it. Consider, for example,

struct INNER {
    int a;
    int b;
};
struct OUTER1 {
    int c;
    int d;
    int e;
    struct INNER inner;
};

struct OUTER2 {
    int c;
    int d;
    int e;
    int f;
    int g;
};

You can’t assume that “f” in OUTER2 is the same as “inner.a” in
OUTER1. The compiler is free to align structure members, including
“inner”, any way it wants to, and the alignment requirements for “f”
are certainly not the same as for “inner”. If the compiler chooses,
for example, to always align structures at dword boundaries, then
there will be a word-sized gap between ‘e’ and ‘inner.a’ in the OUTER1
struct (assuming 32-bit machines) that may not appear between “e” and
“f” in the OUTER2 struct.
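
A minimal sketch of how the two layouts could be compared on a particular compiler, assuming <stddef.h> is available. The struct definitions repeat the ones above; `offset_of_inner_a` and `layouts_match` are hypothetical helper names. A result of 1 says only that this compiler happened to lay them out identically, not that the standard requires it.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

struct INNER  { int a; int b; };
struct OUTER1 { int c; int d; int e; struct INNER inner; };
struct OUTER2 { int c; int d; int e; int f; int g; };

/* Offset of inner.a within OUTER1: where the nested struct starts plus
   where `a` sits inside it (0, since `a` is the first member). */
static size_t offset_of_inner_a(void)
{
    assert(offsetof(struct INNER, a) == 0);
    return offsetof(struct OUTER1, inner) + offsetof(struct INNER, a);
}

/* 1 if the flattened OUTER2 has the same member offsets and size as
   OUTER1 on this compiler, 0 otherwise. */
static int layouts_match(void)
{
    size_t inner_b = offsetof(struct OUTER1, inner) + offsetof(struct INNER, b);

    return offset_of_inner_a() == offsetof(struct OUTER2, f)
        && inner_b == offsetof(struct OUTER2, g)
        && sizeof(struct OUTER1) == sizeof(struct OUTER2);
}
```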

HTH.

···

On Thu, 11 Sep 2003 01:05:07 +0900, “Nathaniel Talbott” nathaniel@NOSPAMtalbott.ws wrote:

Robert Feldt [mailto:feldt@ce.chalmers.se] wrote:

I agree it seems pretty straightforward and useful. It still
depends on the answer to question 3 above. But I guess
it’s unlikely a C compiler would do it in any other way.

This “cheat” at least works on gcc 3.2 20020908 on cygwin and
3.2.3 20030422 on Gentoo linux.

Well, I’m very much a C novice, but from my limited understanding of how
structs work, it seems that in a case like:

typedef struct {int a, b;} inner;
typedef struct {int c, d; inner e;} outer;

That outer must be stored like:

int
int
int
int

I think nesting a struct simply tells C to allocate an additional
sizeof(inner) in the outer struct, and the compiler then translates an
outer.inner.whatever reference to access the correct memory in outer. But
that’s mostly a guess.

Nathaniel

<:((><

...could be stored as "afbfd" ('f'=filler) instead of "abd". My guess
would be that this occurs rarely if ever in practice, but I believe that it
is possible.

I suspect it is the norm, not the exception. It happens with GCC 2 and
greater under Alpha, ix86 and sparc.

Ari

Thanks for the info!

I could definitely see struct alignment as being a potential problem, but it
would seem that DL must handle it somehow already, since it could also be a
problem for plain old (no nesting) structs, right?

Nathaniel

<:((><

···

Warren Brown [mailto:wkb@airmail.net] wrote:

I think nesting a struct simply tells C
to allocate an additional sizeof(inner)
in the outer struct, and the compiler
then translates an outer.inner.whatever
reference to access the correct memory
in outer. But that’s mostly a guess.

This is generally correct. IIRC, in C structure members
are guaranteed to be stored in the same order they were
declared. The only thing that might end up being a problem
is alignment issues. A C compiler is allowed to insert
additional filler in a structure to guarantee alignment. I
could imagine a C compiler inserting filler to word align a
nested structure, even if the data members of that structure
did not require it. For example:

typedef struct { char b; } inner;
typedef struct { char a; inner c; char d; } outer;

...could be stored as "afbfd" ('f'=filler) instead of
"abd". My guess would be that this occurs rarely if ever in
practice, but I believe that it is possible.

Hmmm… so the question is, is there any way to determine how the compiler
aligned things?

Nathaniel

<:((><

···

Tim Hunter [mailto:Tim.Hunter@sas.com] wrote:

I think nesting a struct simply tells C to allocate an additional
sizeof(inner) in the outer struct, and the compiler then translates an
outer.inner.whatever reference to access the correct memory in outer.
But that’s mostly a guess.

Would that this were true. However, the ANSI C standard
doesn’t require it. Consider, for example,

struct INNER {
    int a;
    int b;
};
struct OUTER1 {
    int c;
    int d;
    int e;
    struct INNER inner;
};

struct OUTER2 {
    int c;
    int d;
    int e;
    int f;
    int g;
};

You can’t assume that “f” in OUTER2 is the same as “inner.a”
in OUTER1. The compiler is free to align structure members,
including “inner”, any way it wants to, and the alignment
requirements for “f” are certainly not the same as for
“inner”. If the compiler chooses, for example, to always
align structures at dword boundaries, then there will be a
word-sized gap between ‘e’ and ‘inner.a’ in the OUTER1 struct
(assuming 32-bit machines) that may not appear between “e”
and “f” in the OUTER2 struct.

After doing some more googling, and looking more closely at the DL source,
it appears that DL uses a variant of the trick described at
http://www.monkeyspeak.com/alignment/ to make sure it aligns things as the
platform normally does. Specifically, there’s this code in dl.h:

typedef struct { char c; void *x; } s_voidp;
typedef struct { char c; short x; } s_short;
typedef struct { char c; int x; } s_int;
typedef struct { char c; long x; } s_long;
typedef struct { char c; float x; } s_float;
typedef struct { char c; double x; } s_double;

#define ALIGN_VOIDP (sizeof(s_voidp) - sizeof(void *))
#define ALIGN_SHORT (sizeof(s_short) - sizeof(short))
#define ALIGN_INT (sizeof(s_int) - sizeof(int))
#define ALIGN_LONG (sizeof(s_long) - sizeof(long))
#define ALIGN_FLOAT (sizeof(s_float) - sizeof(float))
#define ALIGN_DOUBLE (sizeof(s_double) - sizeof(double))
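
For comparison, here is a minimal sketch of the same probe written two ways: the size-difference form used in the dl.h snippet above, and a form that asks for the member's offset directly. The `*_BY_SIZE` and `*_BY_OFFSET` macro names and the `align_up` helper are hypothetical, introduced only for illustration; they are not taken from DL.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* Probe structs: a char followed by the type under test. */
typedef struct { char c; int x; } s_int;
typedef struct { char c; double x; } s_double;

/* dl.h-style value: size of the probe minus size of the member. */
#define ALIGN_INT_BY_SIZE      (sizeof(s_int) - sizeof(int))
#define ALIGN_DOUBLE_BY_SIZE   (sizeof(s_double) - sizeof(double))

/* Hypothetical alternative: where the compiler actually put the member. */
#define ALIGN_INT_BY_OFFSET    offsetof(s_int, x)
#define ALIGN_DOUBLE_BY_OFFSET offsetof(s_double, x)

/* Hypothetical helper: round `offset` up to the next multiple of `align`,
   as code computing member offsets by hand might do. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align > 0);
    return (offset + align - 1) / align * align;
}
```

The two forms usually agree, but the size-difference form also counts any padding after the member, so it can never be smaller than the offset form.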

I haven’t yet found anything indicating that a struct may be aligned on any
criteria other than its contents, but that doesn’t mean a whole lot. But
even if it does do that, couldn’t you just add a check as above to see?

typedef struct { char b; } struct_type; /* placeholder member: ISO C does not allow an empty struct */
typedef struct { char c; struct_type x; } s_struct;

#define ALIGN_STRUCT (sizeof(s_struct) - sizeof(struct_type))

Anyhow, I’d love to see something in writing indicating that a struct may be
aligned differently than its contents demand. As far as I can tell,
alignment is done for the benefit of the underlying machine’s access to the
data, and since the underlying machine doesn’t access a struct directly, it
doesn’t really care how it’s aligned, other than to demand that it can
access its members correctly. Thus it seems that:

typedef struct { int i; } inner;
typedef struct { char c; inner in; } outer;

Will always be the same as:

typedef struct { char c; int i; } s;

This is definitely expanding my knowledge of C fundamentals… as well as
making me appreciate that Ruby manages to compile in so many different
places.

Nathaniel

<:((><

···

Tim Hunter [mailto:Tim.Hunter@sas.com] wrote:

You can’t assume that “f” in OUTER2 is the same as “inner.a”
in OUTER1. The compiler is free to align structure members,
including “inner”, any way it wants to, and the alignment
requirements for “f” are certainly not the same as for
“inner”. If the compiler chooses, for example, to always
align structures at dword boundaries, then there will be a
word-sized gap between ‘e’ and ‘inner.a’ in the OUTER1 struct
(assuming 32-bit machines) that may not appear between “e”
and “f” in the OUTER2 struct.

(…)

You can’t assume that “f” in OUTER2 is the same as “inner.a” in OUTER1.
The compiler is free to align structure members, including “inner”, any
way it wants to, and the alignment requirements for “f” are certainly
not the same as for “inner”. If the compiler chooses, for example, to
always align structures at dword boundaries, then there will be a
word-sized gap between ‘e’ and ‘inner.a’ in the OUTER1 struct (assuming
32-bit machines) that may not appear between “e” and “f” in the OUTER2
struct.

Hmmm… so the question is, is there any way to determine how the compiler
aligned things?

Nathaniel

<:((><

Well, a starting point would be offsetof().

···

On Thu, 11 Sep 2003 04:24:05 +0900, Nathaniel Talbott wrote:

Tim Hunter [mailto:Tim.Hunter@sas.com] wrote:

A compiler is free to align a structure on a boundary that is stricter
than that required by its members. For example, a structure with an int as
its first member may be aligned on a dword boundary, if that makes it
easier to generate code for accessing the structure elements. (Consider
the needs of arrays of structures allocated from the heap.)
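
The array case can be made concrete with a minimal sketch, assuming <stddef.h>; `struct rec`, `tail_padding`, and `stride` are hypothetical names chosen for illustration. Consecutive array elements are exactly sizeof(struct rec) apart, so any padding needed to keep the next element aligned has to live inside the struct itself.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

struct rec { int n; char tag; };

/* Bytes after the last member, present so that the next array element
   starts at a suitably aligned address. May be 0 on some compilers. */
static size_t tail_padding(void)
{
    return sizeof(struct rec) - (offsetof(struct rec, tag) + sizeof(char));
}

/* Distance in bytes between two consecutive array elements. */
static size_t stride(void)
{
    struct rec a[2];

    assert(sizeof a == 2 * sizeof(struct rec)); /* arrays have no gaps */
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```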

Also a compiler is free to add padding between structure elements as it
sees fit.

For “something in writing” check the ANSI standard.

The reason Ruby compiles on so many different machines is (in part)
because the Ruby authors didn’t depend on non-standard behavior from the
compiler.

···

On Thu, 11 Sep 2003 08:55:14 +0900, Nathaniel Talbott wrote:

Tim Hunter [mailto:Tim.Hunter@sas.com] wrote:

You can’t assume that “f” in OUTER2 is the same as “inner.a” in OUTER1.
The compiler is free to align structure members, including “inner”, any
way it wants to, and the alignment requirements for “f” are certainly
not the same as for “inner”. If the compiler chooses, for example, to
always align structures at dword boundaries, then there will be a
word-sized gap between ‘e’ and ‘inner.a’ in the OUTER1 struct (assuming
32-bit machines) that may not appear between “e” and “f” in the OUTER2
struct.

After doing some more googling, and looking more closely at the DL
source, it appears that DL uses a variant of the trick described at
http://www.monkeyspeak.com/alignment/ to make sure it aligns things as
the platform normally does. Specifically, there’s this code in dl.h:

typedef struct { char c; void *x; } s_voidp;
typedef struct { char c; short x; } s_short;
typedef struct { char c; int x; } s_int;
typedef struct { char c; long x; } s_long;
typedef struct { char c; float x; } s_float;
typedef struct { char c; double x; } s_double;

#define ALIGN_VOIDP (sizeof(s_voidp) - sizeof(void *))
#define ALIGN_SHORT (sizeof(s_short) - sizeof(short))
#define ALIGN_INT (sizeof(s_int) - sizeof(int))
#define ALIGN_LONG (sizeof(s_long) - sizeof(long))
#define ALIGN_FLOAT (sizeof(s_float) - sizeof(float))
#define ALIGN_DOUBLE (sizeof(s_double) - sizeof(double))

I haven’t yet found anything indicating that a struct may be aligned on
any criteria other than its contents, but that doesn’t mean a whole lot.
But even if it does do that, couldn’t you just add a check as above to
see?

typedef struct { char b; } struct_type; /* placeholder member: ISO C does not allow an empty struct */
typedef struct { char c; struct_type x; } s_struct;

#define ALIGN_STRUCT (sizeof(s_struct) - sizeof(struct_type))

Anyhow, I’d love to see something in writing indicating that a struct
may be aligned differently than its contents demand. As far as I can
tell, alignment is done for the benefit of the underlying machine’s
access to the data, and since the underlying machine doesn’t access a
struct directly, it doesn’t really care how it’s aligned, other than to
demand that it can access its members correctly. Thus it seems that:

typedef struct { int i; } inner;
typedef struct { char c; inner in; } outer;

Will always be the same as:

typedef struct { char c; int i; } s;

This is definitely expanding my knowledge of C fundamentals… as well
as making me appreciate that Ruby manages to compile in so many
different places.

Nathaniel

<:((><

Hmmm… but is offsetof() portable? :-/

Nathaniel

<:((><

···

Tim Hunter [mailto:cyclists@nc.rr.com] wrote:

Hmmm… so the question is, is there any way to determine how the
compiler aligned things?

Nathaniel

<:((><

Well, a starting point would be offsetof().

Tim Hunter wrote:

A compiler is free to align a structure on a boundary that is stricter
than that required by its members. For example, a structure with an int as
its first member may be aligned on a dword boundary, if that makes it
easier to generate code for accessing the structure elements. (Consider
the needs of arrays of structures allocated from the heap.)

Correct. ISO C further guarantees that a pointer to a structure may be
coerced to a pointer to its first element.

Also a compiler is free to add padding between structure elements as it
sees fit.

Correct. By the above guarantee, no padding can occur before the first
element.
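
A minimal sketch of what that guarantee permits, assuming <stddef.h>; `struct point` and `first_member` are hypothetical names used only for illustration. Only the first member is covered: the offsets of later members still depend on the compiler.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */

struct point { int x; int y; };

/* Reads the first member through a converted struct pointer
   instead of writing p->x. */
static int first_member(struct point *p)
{
    int *first = (int *)p;  /* points at p->x: no padding precedes it */

    assert(offsetof(struct point, x) == 0);
    return *first;
}
```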

For “something in writing” check the ANSI standard.

For the pedantically-inclined, ISO/IEC 9899. But K&R 2nd edition deals
with this adequately in Appendix A.

Steve