A question about Charsets

Hello Rubyists,

in a new application I have to read dBase III files which were generated
in a DOS environment. How can I convert the data into Windows codepages
from Ruby?

Thanks for any hints.

Eric.

Hi!

  • Eric-Roger Bruecklmeier; 2003-11-20, 20:30 UTC:

in a new application I have to read dBase III files which were generated
in a DOS environment. How can I convert the data into Windows codepages
from Ruby?

Map each Byte to the corresponding one using a hash. You need the
codepages.

DOS codepages are listed here:

http://dwd.da.ru/charsets/index.html#dos-specific

Windows codepages are listed here:

http://dwd.da.ru/charsets/index.html#windows-specific

The mapping is troublesome for two reasons: First, not all DOS
characters have counterparts in the Windows standard codepage (Greek
letters, for example), and second, codes 0…31 can be either control
chars or pictograms.

So the best you can do is use the above tables and create Arrays or
hashes that do the mapping.
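
A minimal sketch of such a table in Ruby, assuming the source data is cp850 and the target is Windows-1252. The constant and method names are made up for illustration, and only a few German letters are filled in; a real table would need an entry for every byte from 0x80 upwards.

```ruby
# Hypothetical partial table: cp850 byte => Windows-1252 byte.
CP850_TO_CP1252 = {
  0x84 => 0xE4, # a umlaut
  0x94 => 0xF6, # o umlaut
  0x81 => 0xFC, # u umlaut
  0x8E => 0xC4, # A umlaut
  0x99 => 0xD6, # O umlaut
  0x9A => 0xDC, # U umlaut
  0xE1 => 0xDF  # sharp s
}.freeze

# Expand the hash into a 256-entry Array so each lookup is a plain index.
# Bytes missing from the hash are passed through unchanged (an assumption
# that only holds for ASCII; extend the hash for real data).
BYTE_TABLE = Array.new(256) { |b| CP850_TO_CP1252.fetch(b, b) }.freeze

# Returns a new binary string with every byte mapped through BYTE_TABLE.
def dos_to_windows(data)
  data.each_byte.map { |b| BYTE_TABLE[b] }.pack("C*")
end
```

Calling dos_to_windows on a field read from the dBase file gives back a string of the same length with the listed bytes replaced.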

For cp850 and cp866 you can use iconv, otherwise you can use recode.
This can be done from Ruby but it requires the appropriate software
being in place. Bad if you want software to be portable.
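
For readers on Ruby 1.9 or later (released after this thread), String#encode is built in and knows both cp850 and Windows-1252, so no external tool is needed there. A small sketch under that assumption; the method name is hypothetical, and replacing unmappable characters with "?" is just one possible policy.

```ruby
# Assumes Ruby 1.9 or later, where String#encode is part of the core.
def cp850_to_cp1252(data)
  # Interpret the bytes as CP850 and transcode them. Characters that
  # Windows-1252 lacks (box-drawing characters, for example) become "?".
  data.encode("Windows-1252", "CP850", undef: :replace, replace: "?")
end
```

Without the undef: option, encode raises Encoding::UndefinedConversionError on the first character that has no counterpart.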

Josef ‘Jupp’ Schugt


.-------.
message > 100 kB? / | |
sender = spammer? / | R.I.P.|
text = spam? / | |

Josef ‘Jupp’ SCHUGT wrote:

in a new application I have to read dBase III files which were generated
in a DOS environment. How can I convert the data into Windows codepages
from Ruby?

Map each Byte to the corresponding one using a hash. You need the
codepages.

That’s the way I do it now, but it’s slow :-(

For cp850 and cp866 you can use iconv, otherwise you can use recode.
This can be done from Ruby but it requires the appropriate software
being in place. Bad if you want software to be portable.

That’s exactly the problem: the software has to be portable :-(

Thanks anyhow!

C YA

Eric.

Hi!

  • Eric-Roger Bruecklmeier; 2003-11-21, 13:01 UTC:

Josef ‘Jupp’ SCHUGT wrote:

in a new application I have to read dBase III files which were
generated in a DOS environment. How can I convert the data into
Windows codepages from Ruby?

Map each Byte to the corresponding one using a hash. You need the
codepages.

That’s the way I do it now, but it’s slow :-(

When I find my code in tons of trouble, friends and colleagues come to
me, speaking words of wisdom: write in C. (Sung to ‘Let It Be’ by
the Beatles.)

Speedup calls for a C extension. I’ll skip the ‘intro to C
extensions’ stuff (Thomas and Hunt have that) and directly go to the
implementation of the mapping algorithm.

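Suppose s points to the array of unsigned char to be converted and len
holds its length in bytes. Suppose you simply need to map code 0 to 1
and vice versa. In that case use this:

    unsigned char *p;

    /* Loop by length, not until a 0 byte: 0 is one of the codes we map. */
    for (p = s; p < s + len; p++) {
        switch (*p) {
        case 0: *p = 1; break;
        case 1: *p = 0; break;
        }
    }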

You don’t need to map any char in the ASCII printable range which
saves a lot of coding. The resulting code is extremely fast.
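
For comparison, the same byte swap can be sketched in plain Ruby with String#tr, which runs its loop inside the interpreter's C code and so avoids a per-byte Ruby block. This is only an illustration: FROM, TO and swap_bytes are made-up names, and the two-byte tables stand in for real codepage data. Note that tr treats `-`, `^` and `\` specially, so those would need escaping if they ever appeared in a table.

```ruby
# Hypothetical tables: the byte at position i in FROM becomes the byte
# at position i in TO. Here bytes 0 and 1 are swapped, as in the C loop.
FROM = "\x00\x01".b.freeze
TO   = "\x01\x00".b.freeze

def swap_bytes(data)
  # .b returns a binary copy, so tr works byte by byte.
  data.b.tr(FROM, TO)
end
```

For a full codepage conversion, FROM would list the DOS bytes to change and TO the Windows bytes in the same order.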

That’s exactly the problem: the software has to be portable :-(

The above code is extremely portable. An additional advantage: You
can give the codes in decimal or hexadecimal values.

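For 16 Bit codes things are more complicated. You then need

    /* s and p are unsigned char *, len is the length of s in bytes */
    for (p = s; p + 1 < s + len; p += 2) {
        /* high byte first; or the other way round, depends on the data */
        switch ((p[0] << 8) | p[1]) {
        case 0: p[1] = 1; break;
        case 1: p[1] = 0; break;
        }
    }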

and lots of additional cases.
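
The 16 Bit case can also be sketched in Ruby, assuming the data is a sequence of big-endian 16 Bit units. WIDE_TABLE and map_wide are hypothetical names, and the two entries only mirror the 0/1 swap from the C example.

```ruby
# Hypothetical table of 16-bit code => 16-bit code.
WIDE_TABLE = { 0x0000 => 0x0001, 0x0001 => 0x0000 }.freeze

def map_wide(data)
  # "n*" reads and writes big-endian 16-bit units; use "v*" instead
  # if the data is little-endian. Codes not in the table pass through.
  data.unpack("n*").map { |code| WIDE_TABLE.fetch(code, code) }.pack("n*")
end
```

A trailing odd byte, if any, is dropped by unpack, so the input length should be even.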

Good luck,

Josef ‘Jupp’ Schugt
