Question about some octal formatted output?

eacute = ""
eacute << 0xC3 << 0xA9 #eacute<< 195 << 169 ; or é

p eacute

--output:---
"\303\251"

That output is in octal--although there is no leading 0.

1) Where does that format come from, i.e. no leading 0?
2) Why is the output in octal and not hex?

I looked up String#<< and it says it converts any Fixnum between 0-255
to a character.

3) Using what character set?

Thanks.

--
Posted via http://www.ruby-forum.com/.

eacute = ""
eacute << 0xC3 << 0xA9 #eacute<< 195 << 169 ; or é

p eacute

--output:---
"\303\251"

That output is in octal--although there is no leading 0.

1) Where does that format come from, i.e. no leading 0?
2) Why is the output in octal and not hex?

It's at least as old as C. You'll probably have to ask some real old-timers for the answer.

$ cat octal.c
#include <stdio.h>

void main() { printf("\303\251\n"); }
$ gcc octal.c
octal.c: In function 'main':
octal.c:3: warning: return type of 'main' is not 'int'
$ ./a.out
é

I looked up String#<< and it says it converts any Fixnum between 0-255
to a character.

3) Using what character set?

ASCII. It's your terminal that controls how it gets displayed. My terminal is set to UTF-8.

On Oct 14, 2007, at 11:17 , 7stud -- wrote:

--
Poor workers blame their tools. Good workers build better tools. The
best workers get their tools to do the work for them. -- Syndicate Wars

7stud -- wrote:

eacute = ""
eacute << 0xC3 << 0xA9 #eacute<< 195 << 169 ; or é

p eacute

--output:---
"\303\251"

That output is in octal--although there is no leading 0.

1) Where does that format come from, i.e. no leading 0?
2) Why is the output in octal and not hex?

I looked up String#<< and it says it converts any Fixnum between 0-255
to a character.

3) Using what character set?

Actually, what's your problem with all that?

Your ints specified in hex are actually converted to bytes in the
string. Those bytes, interpreted as UTF-8, may mean an é.

The conventional syntax for specifying bytes by their integer value in
string literals, used in C, shells and a number of other environments
(including Ruby) is a backslash followed by octal digits. (The leading 0
is used for specifying *integer* literals in octal.)
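
To make the distinction above concrete, here is a small sketch (assuming Ruby 2.0 or later, where String#bytes returns an Array). It only shows that the octal escapes in a string literal and the leading-0 integer literals name the same two byte values as the hex values in the original post; the variable names are made up for illustration.

```ruby
# Octal escapes inside a string literal each name a single byte.
s = "\303\251"
octal_bytes = s.bytes             # the byte values stored in the string

# The same two values written as hex and as leading-0 octal integer literals.
hex_values   = [0xC3, 0xA9]
octal_values = [0303, 0251]

puts octal_bytes == hex_values    # true
puts octal_bytes == octal_values  # true
```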

String#inspect (which I guess p is using) adopts this syntax for
displaying non-ASCII and/or non-printing bytes in the string.
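
As a rough illustration of that escaping, the sketch below rebuilds the octal form by hand. octal_inspect is a hypothetical helper written for this example, not part of Ruby, and it ignores details such as escaping embedded quotes or backslashes. Note also that current Ruby versions print binary strings with hex escapes instead, so the octal output in this thread reflects the Ruby of the time.

```ruby
# Hypothetical helper: show printable ASCII as-is and every other byte
# as a backslash followed by three octal digits.
def octal_inspect(str)
  body = str.bytes.map do |b|
    (0x20..0x7E).cover?(b) ? b.chr : format("\\%03o", b)
  end.join
  "\"#{body}\""
end

raw = [0xC3, 0xA9].pack("C*")   # two raw bytes in a binary (ASCII-8BIT) string

puts octal_inspect(raw)         # the octal form seen in the original post
puts raw.inspect                # what a current Ruby prints for the same bytes
```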

I really don't get your third question. There's no character set
involved here, beyond how you intended your two bytes to be interpreted.
Those two bytes remain the same, regardless of how they are displayed. They
may mean two characters in plain old 8-bit charsets, they may mean e.g.
one é in UTF-8, or they may mean what p displays for them.
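
A minimal sketch of that point, assuming a Ruby with encoding-aware strings (1.9 or later, which postdates this thread): the same two bytes are tagged first as UTF-8 and then as ISO-8859-1, picked here as one example of an 8-bit charset. Only the interpretation changes; the bytes do not.

```ruby
raw = [0xC3, 0xA9].pack("C*")   # binary string holding the two bytes

# force_encoding relabels the bytes without changing them.
as_utf8   = raw.dup.force_encoding("UTF-8")
as_latin1 = raw.dup.force_encoding("ISO-8859-1")

puts as_utf8.length                    # 1 character
puts as_latin1.length                  # 2 characters
puts as_utf8.bytes == as_latin1.bytes  # true: identical bytes underneath
puts as_latin1.encode("UTF-8")         # the two Latin-1 characters, transcoded for display
```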

mortee