Nested struct alignment summarized

Hi,

So, trying to distill the recent thread about C compilers'
alignment/layout of nested structs, it appears that:

  • Ruby/DL cannot as of now handle structs within structs.

  • A trick to get around that is to inline the inner struct
    into the outer struct.

  • If we could assume this trick works in general, we could
    enhance Ruby/DL to handle nested structs.

  • The trick relies on how different C compilers lay out
    a struct in memory. The ANSI C standard does not specify
    how that should be done; it's implementation-defined, i.e.
    each compiler must document how it does it, but they may
    not all do it in the same way.

  • This means problems for us when trying to use Ruby/DL
    for libs that have nested struct constructs.
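
To make the inlining trick and its fragility concrete, here is a toy model in Ruby. It is not Ruby/DL code: the `layout` helper, the `SIZES` table and the "natural alignment" rule (a scalar is aligned to its own size, a struct to its strictest member, and a struct's size is padded to a multiple of its alignment) are assumptions chosen for illustration, and a real compiler may follow different rules.

```ruby
# Assumed sizes; under the "natural alignment" model a scalar's
# alignment equals its size. Real compilers may differ.
SIZES = { "char" => 1, "short" => 2, "int" => 4, "double" => 8 }

def align_up(offset, alignment)
  (offset + alignment - 1) / alignment * alignment
end

# A struct is an array of [name, type] pairs; a type is either a key of
# SIZES or another such array (a nested struct).
# Returns [offsets, size, alignment], where offsets maps dotted member
# paths to byte offsets from the start of the struct.
def layout(fields)
  offsets, offset, max_align = {}, 0, 1
  fields.each do |name, type|
    if type.is_a?(Array)
      inner, size, alignment = layout(type)
    else
      inner, size, alignment = nil, SIZES.fetch(type), SIZES.fetch(type)
    end
    offset = align_up(offset, alignment)
    if inner
      inner.each { |path, off| offsets["#{name}.#{path}"] = offset + off }
    else
      offsets[name] = offset
    end
    offset += size
    max_align = [max_align, alignment].max
  end
  [offsets, align_up(offset, max_align), max_align]
end

inner  = [["i1", "double"], ["i2", "char"]]
nested = [["o1", "char"], ["i", inner], ["o3", "char"]]
flat   = [["o1", "char"], ["i1", "double"], ["i2", "char"], ["o3", "char"]]

nested_offsets, nested_size, _ = layout(nested)
flat_offsets, flat_size, _ = layout(flat)

puts "nested: #{nested_offsets.inspect} size=#{nested_size}"
puts "flat:   #{flat_offsets.inspect} size=#{flat_size}"
```

Under this model the member after the inner struct lands at offset 24 in the nested layout but at 17 in the flattened one, because the inner struct carries its own tail padding. So even with a single, fixed set of rules, plain inlining is only safe for some member combinations.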

So the situation looks grim; even if we find a portable
way to check how a certain compiler lays things out,
it may not lay things out in the same way for different
structs. There could be special situations where it changes the layout.

So is there anything we can do?

If not, this sounds like a serious problem for using
Ruby/DL on larger/more complex lib “wraps”.

Regards,

Robert Feldt

Expand the current method (list of structure types in dl.h (ALIGN_*)),
so that the alignments for a growing number of structs are determined
at compile time?

Of course, that’d require some code to check for structural equivalence
of, emm, structures.
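
One possible reading of "structural equivalence", sketched in Ruby: two structs are treated as equivalent when they have the same member types in the same order, recursively. The `signature` function and the `AlignmentTable` class are hypothetical names, not part of Ruby/DL or dl.h, and the probe passed to the table stands in for whatever compile-time measurement would really be used.

```ruby
# A struct type is an array of member types; a member type is a String
# naming a scalar, or another array (a nested struct). Member names are
# left out because they do not affect layout.
def signature(type)
  if type.is_a?(Array)
    "{" + type.map { |member| signature(member) }.join(",") + "}"
  else
    type
  end
end

# Caches one measured alignment per structural signature, so that
# equivalent structs are only probed once.
class AlignmentTable
  def initialize(&measure)
    @measure = measure   # hypothetical probe supplied by the caller
    @known = {}
  end

  def alignment_of(type)
    @known[signature(type)] ||= @measure.call(type)
  end
end

calls = 0
table = AlignmentTable.new { |type| calls += 1; 4 }  # stand-in probe

a = ["int", ["char", "double"]]
b = ["int", ["char", "double"]]   # same structure as a
c = ["int", ["double", "char"]]   # different member order

[a, b, c].each { |t| table.alignment_of(t) }
puts "probe ran #{calls} times for 3 structs"
```

Whether member order and nesting are the only things that matter is exactly the open question in this thread, so the signature would need extra fields if packing rules or other hints turn out to affect the layout.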


On Thu, Sep 11, 2003 at 04:07:14PM +0900, Robert Feldt wrote:

So the situation looks grim; even if we find a portable
way to check how a certain compiler lays things out,
it may not lay things out in the same way for different
structs. There could be special situations where it changes the layout.

So is there anything we can do?


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Yes I have a Machintosh, please don’t scream at me.
– Larry Blumette on linux-kernel

Mauricio Fernández batsman.geo@yahoo.com wrote on Thu, 11 Sep 2003 17:35:15 +0900:

So the situation looks grim; even if we find a portable
way to check how a certain compiler lays things out,
it may not lay things out in the same way for different
structs. There could be special situations where it changes the layout.

So is there anything we can do?

Expand the current method (list of structure types in dl.h (ALIGN_*)),
so that the alignments for a growing number of structs are determined
at compile time?

Of course, that’d require some code to check for structural equivalence
of, emm, structures.

Below is code to assist analysis of this. I’d be interested in the
output from the script on other machines and with other compilers.
I’ve run it on a PC/WinXP/Cygwin machine

Ruby version: 1.8.0 (2003-09-08) [i386-cygwin]
Compiler: gcc
Version: gcc (GCC) 3.2 20020927 (prerelease)

and a Gentoo Linux machine

Ruby version: 1.8.0 (2003-08-04) [i686-linux-gnu]
Compiler: gcc
Version: gcc (GCC) 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r1, propolice)

and the results differ in how they align doubles. They both inline
structs directly into the outer struct though.

Regards,

Robert

require 'dl/import'
require 'dl/struct'

your_name, your_mail = ARGV[0], ARGV[1]
compiler    = ARGV[2] || "gcc"
compile_cmd = ARGV[3] || "-shared -o DLLNAME"
version_cmd = ARGV[4] || "--version"

# Generates small C files from a template, compiles each one to a shared
# library and reads back the address differences between struct members.
class CFile
  def initialize(compiler, compileCommand, versionCommand)
    @co, @cc, @vc = compiler, compileCommand, versionCommand
    @count = 0
  end

  def compile_to_dll(dllname = "test.so", code = "", cfilename = "test.c")
    File.open(cfilename, "w") {|fh| fh.write code}
    cmd = "#{@co} #{@cc} #{cfilename}".gsub("DLLNAME", dllname)
    system cmd
    File.delete cfilename
  end

  def compiler_version
    `#{@co} #{@vc}`
  end

  # it1, it2, ot1, ot2 and ot3 are placeholders for the member types.
  Template = <<EOT
typedef struct I {
  it1 i1;
  it2 i2;
} I;

typedef struct O {
  ot1 o1;
  ot2 o2;
  I i;
  ot3 o3;
} O;

extern int o2o1() {O o; return ((int)&(o.o2) - (int)&o.o1);}
extern int i1o2() {O o; return ((int)&(o.i.i1) - (int)&o.o2);}
extern int i2i1() {O o; return ((int)&(o.i.i2) - (int)&o.i.i1);}
extern int o3i2() {O o; return ((int)&(o.o3) - (int)&o.i.i2);}
EOT

  MethodNames = ["o2o1", "i1o2", "i2i1", "o3i2"]
  TypeNames = ["it1", "it2", "ot1", "ot2", "ot3"]
  DefaultTypes = Hash.new
  TypeNames.each {|n| DefaultTypes[n] = "int"}

  def address_diffs(typeAssignment = {})
    dllname = new_dllname
    compile_to_dll(dllname, create_cfile(typeAssignment))
    load_and_get_adress_diffs(dllname)
  end

  def new_dllname
    dllname = "test#{@count += 1}.so"
  end

  def create_cfile(typeAssignment)
    ta = DefaultTypes.clone.update typeAssignment
    cfile = Template.clone
    ta.each {|n, t| cfile.gsub!(n, t)}
    cfile
  end

  def load_and_get_adress_diffs(dllname)
    dll = DL.dlopen("./" + dllname)
    MethodNames.map {|mn| dll.sym(mn, 'I').call().first}
  end
end

cf = CFile.new compiler, compile_cmd, version_cmd

types = ["char", "int", "short", "long", "void*", "float", "double"]

# All ordered pairs of the given types, including pairs of equal types.
def all_pairs(ary)
  a, na = ary.uniq, []
  a.each {|t1| a.each {|t2| na << [t1, t2]}}
  na
end

report = <<EOR
Reporter: #{your_name}
Email: #{your_mail}
Ruby version: #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]
Compiler: #{compiler}
Version: #{cf.compiler_version}
Build command: #{compiler} #{compile_cmd}
EOR

puts report

results, counts = [], Hash.new(0)
all_pairs(types).each do |tfrst, tsnd|
  CFile::MethodNames.each do |methodname|
    # "o2o1" names the members o2 and o1; inserting a "t" gives the
    # type placeholders ot2 and ot1.
    t1, t2 = methodname[0,2], methodname[2,2]
    t1[1,0] = t2[1,0] = "t"
    th = {t1 => tfrst, t2 => tsnd}
    adiffs = cf.address_diffs(th)
    results << ["#{t1} #{tfrst}, #{t2} #{tsnd}:", adiffs]
    counts[adiffs] += 1
    STDERR.print "."; STDERR.flush
  end
end

most_common = counts.to_a.sort {|a,b| b.last <=> a.last}.first.first

puts "The address diffs are #{most_common.inspect} for all combinations but:"

# Print the uncommon ones
results.each {|r| puts "#{r[0]} #{r[1].inspect}" unless r[1] == most_common}


I solved the same problem for my C++ DDT (Dynamic Data Typing)
library. The difficulty is that the same compiler may have many
different sets of structure alignment rules. For example, with
the Microsoft compilers, you can use “#pragma pack(N)” to tell
the compiler to apply a different packing alignment to any struct.
The Windows APIs actually use this pragma, so any code to mimic
the memory layout must have not only a list of the field types,
but also of the rules being used. A nested structure might even
have different alignment rules than the structure inside which
it’s nested, and all this is platform dependent. Unless you’re
prepared to accept limiting yourself to a single chosen set of
alignment rules, or to provide possibly many platform-specific
hints in your metadata (I did both, chosen default with optional
annotations), you’re SOL.
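
As a rough illustration of why the packing rule has to be part of the metadata, the following Ruby sketch models `#pragma pack(N)` as capping each member's alignment at N bytes. The `packed_layout` function and the `NATURAL` size table are hypothetical names made up for this example, and the model covers only a flat struct of scalars, not everything a real compiler does.

```ruby
# Assumed natural sizes (and therefore natural alignments) of the scalars.
NATURAL = { "char" => 1, "short" => 2, "int" => 4, "double" => 8 }

# Offsets and total size of a flat struct when every member's alignment
# is capped at `pack` bytes, which is how "#pragma pack(pack)" is
# modelled here.
def packed_layout(types, pack)
  offset, struct_align = 0, 1
  offsets = types.map do |type|
    member_size = NATURAL.fetch(type)
    alignment = [member_size, pack].min
    offset = (offset + alignment - 1) / alignment * alignment
    struct_align = [struct_align, alignment].max
    start = offset
    offset += member_size
    start
  end
  total = (offset + struct_align - 1) / struct_align * struct_align
  [offsets, total]
end

fields = ["char", "double", "short"]
[1, 2, 4, 8].each do |pack|
  offsets, total = packed_layout(fields, pack)
  puts "pack(#{pack}): offsets=#{offsets.inspect} size=#{total}"
end
```

The same three member types come out with four different layouts, so a list of field types alone does not determine where the fields live.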

Clifford.

Here are my results, as well as an updated version of the script that will
work with MSVC.

Nathaniel

<:((><

Reporter: Nathaniel Talbott
Email: nathaniel@NOSPAMtalbott.ws
Ruby version: 1.8.0 (2003-08-04) [i386-mswin32]
Compiler: cl
Version: Microsoft Visual C++
Build command: cl -LD -o DLLNAME INPUTNAME -link -dll -export:o2o1 -export:i1o2 -export:i2i1 -export:o3i2
The address diffs are [4, 4, 4, 4] for all combinations but:
ot2 char, ot1 char: [1, 3, 4, 4]
it2 char, it1 char: [4, 4, 1, 3]
ot2 char, ot1 short: [2, 2, 4, 4]
it2 char, it1 short: [4, 4, 2, 2]
ot2 char, ot1 double: [8, 4, 4, 4]
it1 char, ot2 double: [8, 8, 4, 4]
it2 char, it1 double: [4, 4, 8, 8]
ot3 char, it2 double: [4, 4, 8, 8]
ot2 int, ot1 double: [8, 4, 4, 4]
it1 int, ot2 double: [8, 8, 4, 4]
it2 int, it1 double: [4, 4, 8, 8]
ot3 int, it2 double: [4, 4, 8, 8]
ot2 short, ot1 char: [2, 2, 4, 4]
it2 short, it1 char: [4, 4, 2, 2]
ot2 short, ot1 short: [2, 2, 4, 4]
it2 short, it1 short: [4, 4, 2, 2]
ot2 short, ot1 double: [8, 4, 4, 4]
it1 short, ot2 double: [8, 8, 4, 4]
it2 short, it1 double: [4, 4, 8, 8]
ot3 short, it2 double: [4, 4, 8, 8]
ot2 long, ot1 double: [8, 4, 4, 4]
it1 long, ot2 double: [8, 8, 4, 4]
it2 long, it1 double: [4, 4, 8, 8]
ot3 long, it2 double: [4, 4, 8, 8]
ot2 void*, ot1 double: [8, 4, 4, 4]
it1 void*, ot2 double: [8, 8, 4, 4]
it2 void*, it1 double: [4, 4, 8, 8]
ot3 void*, it2 double: [4, 4, 8, 8]
ot2 float, ot1 double: [8, 4, 4, 4]
it1 float, ot2 double: [8, 8, 4, 4]
it2 float, it1 double: [4, 4, 8, 8]
ot3 float, it2 double: [4, 4, 8, 8]
ot2 double, ot1 char: [8, 8, 4, 4]
it1 double, ot2 char: [4, 4, 8, 8]
it2 double, it1 char: [4, 4, 8, 8]
ot2 double, ot1 int: [8, 8, 4, 4]
it1 double, ot2 int: [4, 4, 8, 8]
it2 double, it1 int: [4, 4, 8, 8]
ot2 double, ot1 short: [8, 8, 4, 4]
it1 double, ot2 short: [4, 4, 8, 8]
it2 double, it1 short: [4, 4, 8, 8]
ot2 double, ot1 long: [8, 8, 4, 4]
it1 double, ot2 long: [4, 4, 8, 8]
it2 double, it1 long: [4, 4, 8, 8]
ot2 double, ot1 void*: [8, 8, 4, 4]
it1 double, ot2 void*: [4, 4, 8, 8]
it2 double, it1 void*: [4, 4, 8, 8]
ot2 double, ot1 float: [8, 8, 4, 4]
it1 double, ot2 float: [4, 4, 8, 8]
it2 double, it1 float: [4, 4, 8, 8]
ot2 double, ot1 double: [8, 8, 4, 4]
it1 double, ot2 double: [8, 8, 8, 8]
it2 double, it1 double: [4, 4, 8, 8]
ot3 double, it2 double: [4, 4, 8, 8]
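
The rows above look consistent with a simple model in which every scalar is aligned to its own size and the inner struct keeps its own alignment and tail padding. The Ruby sketch below computes the same four differences without a compiler. It is a model only: `address_diffs_model` and its `double_align` parameter are made up for illustration, and the 4-byte sizes for `long` and `void*` assume a 32-bit target.

```ruby
# Models the O/I structs from the script's C template and returns
# [o2-o1, i1-o2, i2-i1, o3-i2]. `types` overrides the default "int"
# for any of it1, it2, ot1, ot2, ot3.
def address_diffs_model(types, double_align)
  size  = { "char" => 1, "short" => 2, "int" => 4, "long" => 4,
            "void*" => 4, "float" => 4, "double" => 8 }
  align = size.merge("double" => double_align)
  up = lambda { |off, a| (off + a - 1) / a * a }

  defaults = { "it1" => "int", "it2" => "int",
               "ot1" => "int", "ot2" => "int", "ot3" => "int" }
  t = defaults.merge(types)

  # Inner struct I: aligned to its strictest member, size padded to that.
  i_align = [align[t["it1"]], align[t["it2"]]].max
  i2_in_i = up.call(size[t["it1"]], align[t["it2"]])
  i_size  = up.call(i2_in_i + size[t["it2"]], i_align)

  # Outer struct O: o1, o2, I i, o3.
  o1 = 0
  o2 = up.call(o1 + size[t["ot1"]], align[t["ot2"]])
  i1 = up.call(o2 + size[t["ot2"]], i_align)
  i2 = i1 + i2_in_i
  o3 = up.call(i1 + i_size, align[t["ot3"]])
  [o2 - o1, i1 - o2, i2 - i1, o3 - i2]
end

assignment = { "ot2" => "double", "ot1" => "int" }
puts address_diffs_model(assignment, 8).inspect
puts address_diffs_model(assignment, 4).inspect
```

With `double_align` set to 8 the model reproduces rows such as `ot2 double, ot1 int: [8, 8, 4, 4]` and `it2 char, it1 double: [4, 4, 8, 8]`. With 4, the first of these becomes `[4, 8, 4, 4]`, which is the kind of difference in double alignment reported earlier for the two gcc machines; the thread does not say which machine produced which.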

alignment.rb:

require 'dl/import'
require 'dl/struct'

your_name, your_mail = ARGV[0], ARGV[1]
compiler = ARGV[2] || "gcc"
MSVC = (compiler == 'cl')
msvc_cmd = "-nologo -LD -o DLLNAME INPUTNAME -link -dll " +
           "-export:o2o1 -export:i1o2 -export:i2i1 -export:o3i2"
gcc_cmd = "-shared -o DLLNAME INPUTNAME"
compile_cmd = ARGV[3] || (MSVC ? msvc_cmd : gcc_cmd)
version_cmd = ARGV[4] || "--version"

# Generates small C files from a template, compiles each one to a shared
# library and reads back the address differences between struct members.
class CFile
  def initialize(compiler, compileCommand, versionCommand)
    @co, @cc, @vc = compiler, compileCommand, versionCommand
    @count = 0
  end

  def compile_to_dll(dllname = "test.so", code = "", cfilename = "test.c")
    File.open(cfilename, "w") {|fh| fh.write code}
    cmd = "#{@co} #{@cc}".gsub("DLLNAME", dllname).gsub("INPUTNAME", cfilename)
    system cmd
    File.delete cfilename
  end

  def compiler_version
    MSVC ? "Microsoft Visual C++" : `#{@co} #{@vc}`
  end

  # it1, it2, ot1, ot2 and ot3 are placeholders for the member types.
  Template = <<EOT
typedef struct I {
  it1 i1;
  it2 i2;
} I;

typedef struct O {
  ot1 o1;
  ot2 o2;
  I i;
  ot3 o3;
} O;

extern int o2o1() {O o; return ((int)&(o.o2) - (int)&o.o1);}
extern int i1o2() {O o; return ((int)&(o.i.i1) - (int)&o.o2);}
extern int i2i1() {O o; return ((int)&(o.i.i2) - (int)&o.i.i1);}
extern int o3i2() {O o; return ((int)&(o.o3) - (int)&o.i.i2);}
EOT

  MethodNames = ["o2o1", "i1o2", "i2i1", "o3i2"]
  TypeNames = ["it1", "it2", "ot1", "ot2", "ot3"]
  DefaultTypes = Hash.new
  TypeNames.each {|n| DefaultTypes[n] = "int"}

  def address_diffs(typeAssignment = {})
    dllname = new_dllname
    compile_to_dll(dllname, create_cfile(typeAssignment))
    load_and_get_adress_diffs(dllname)
  end

  def new_dllname
    dllname = "test#{@count += 1}.dll"
  end

  def create_cfile(typeAssignment)
    ta = DefaultTypes.clone.update typeAssignment
    cfile = Template.clone
    ta.each {|n, t| cfile.gsub!(n, t)}
    cfile
  end

  def load_and_get_adress_diffs(dllname)
    dll = DL.dlopen("./" + dllname)
    MethodNames.map {|mn| dll.sym(mn, 'I').call().first}
  end
end

cf = CFile.new compiler, compile_cmd, version_cmd

types = ["char", "int", "short", "long", "void*", "float", "double"]

# All ordered pairs of the given types, including pairs of equal types.
def all_pairs(ary)
  a, na = ary.uniq, []
  a.each {|t1| a.each {|t2| na << [t1, t2]}}
  na
end

report = <<EOR
Reporter: #{your_name}
Email: #{your_mail}
Ruby version: #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]
Compiler: #{compiler}
Version: #{cf.compiler_version}
Build command: #{compiler} #{compile_cmd}
EOR

results, counts = [], Hash.new(0)
all_pairs(types).each do |tfrst, tsnd|
  CFile::MethodNames.each do |methodname|
    # "o2o1" names the members o2 and o1; inserting a "t" gives the
    # type placeholders ot2 and ot1.
    t1, t2 = methodname[0,2], methodname[2,2]
    t1[1,0] = t2[1,0] = "t"
    th = {t1 => tfrst, t2 => tsnd}
    adiffs = cf.address_diffs(th)
    results << ["#{t1} #{tfrst}, #{t2} #{tsnd}:", adiffs]
    counts[adiffs] += 1
    STDERR.print "."; STDERR.flush
  end
end

most_common = counts.to_a.sort {|a,b| b.last <=> a.last}.first.first

puts
puts
puts report
puts "The address diffs are #{most_common.inspect} for all combinations but:"

# Print the uncommon ones
results.each {|r| puts "#{r[0]} #{r[1].inspect}" unless r[1] == most_common}


Robert Feldt [mailto:feldt@ce.chalmers.se] wrote:

Below is code to assist analysis of this. I’d be interested
in the output from the script on other machines and with
other compilers. I’ve run it on a PC/WinXP/Cygwin