You are not allowed to set the source encoding to a non-ASCII-compatible
encoding, if memory serves.
Where is it documented please?
I'm not sure it's officially documented yet.
Ruby does throw an error in this scenario though:
    $ ruby_dev
    # encoding: UTF-16BE
    ruby_dev: UTF-16BE is not ASCII compatible (ArgumentError)

and:

    $ ruby_dev -e 'puts "\uFEFF# encoding: UTF-16BE".encode("UTF-16BE")' | ruby_dev
    -:1: invalid multibyte char (UTF-8)
I believe this is the relevant code from Ruby's parser:
    static void
    parser_set_encode(struct parser_params *parser, const char *name)
    {
        int idx = rb_enc_find_index(name);
        rb_encoding *enc;

        if (idx < 0) {
            rb_raise(rb_eArgError, "unknown encoding name: %s", name);
        }
        enc = rb_enc_from_index(idx);
        if (!rb_enc_asciicompat(enc)) {
            rb_raise(rb_eArgError, "%s is not ASCII compatible", rb_enc_name(enc));
        }
        parser->enc = enc;
    }
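
For what it's worth, roughly the same check can be written in Ruby itself. This is only a sketch: check_source_encoding is a name I made up for illustration, not something Ruby provides, though Encoding.find and Encoding#ascii_compatible? are real methods.

```ruby
# Hypothetical helper that mirrors the parser check above.
# It is NOT part of Ruby; it only illustrates the two failure cases.
def check_source_encoding(name)
  # Encoding.find raises ArgumentError for an unknown encoding name.
  enc = Encoding.find(name)

  # Reject encodings in which ASCII text is not byte-for-byte ASCII.
  unless enc.ascii_compatible?
    raise ArgumentError, "#{enc.name} is not ASCII compatible"
  end

  enc
end
```

Calling check_source_encoding("UTF-8") returns the encoding object, while check_source_encoding("UTF-16BE") raises.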
That eliminates any issues
with encodings like UTF-16. This makes perfect sense as there's no
way to reliably support the magic encoding comment unless we can count
on being able to read at least that far.
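
To make that concrete, here is a small sketch of why the comment can't be found in UTF-16 data. The regular expression is my own simplified stand-in for whatever the parser really uses, and magic_encoding_name is a hypothetical helper.

```ruby
# Simplified, assumed pattern for a magic comment; Ruby's real parser
# is more lenient about what it accepts.
MAGIC_COMMENT = /coding[:=]\s*([\w.-]+)/

# Hypothetical helper: look for a magic comment in raw bytes, treating
# them as if they were ASCII. Returns the encoding name or nil.
def magic_encoding_name(bytes)
  bytes.b[MAGIC_COMMENT, 1]
end

ascii_form = "# encoding: UTF-8\n"
utf16_form = "# encoding: UTF-16BE\n".encode("UTF-16BE")

magic_encoding_name(ascii_form)  # finds "UTF-8"
magic_encoding_name(utf16_form)  # finds nothing (nil)

# In UTF-16BE each ASCII character is preceded by a zero byte, so the
# byte sequence "coding" never appears contiguously.
utf16_form.bytes.first(4)        # [0, 35, 0, 32]
```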
Needed to say that XML parsers can handle such cases, i.e. when xml
header is in different encoding than the rest of document.
I doubt we can say that universally. 
Also, what you said isn't very accurate. For example, "in different encoding than the rest of document" is not a possible occurrence according to the XML 1.1 specification (http://www.w3.org/TR/2006/REC-xml11-20060816/), which states:
"It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."
All XML parsers are required to assume UTF-8 unless told otherwise and to be able to recognize UTF-16 by a required BOM. Beyond that, they are not required to recognize any other encodings, though they may of course. Their encoding declaration can be expressed in ASCII and, since they assume UTF-8 by default, this is similar to what Ruby does. It allows a switch to an ASCII-compatible encoding.
XML processors may do more. For example, they can accept a different encoding from an external source to support things like HTTP headers and MIME types. Ruby doesn't really have access to such sources at execution time, so that option doesn't apply to the case we are discussing. However, XML processors may also recognize other BOMs, and Ruby could do the same.
A BOM could be handled similarly to what I showed. You need to open
the file in ASCII-8BIT and check the beginning bytes, then you could
switch to US-ASCII and finish reading the first line (or to the second
if a shebang line is included), then switch encodings again if needed
and finish processing.
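
Here is a rough sketch of those steps, under several assumptions of mine: the BOMS table lists only three common BOMs, CODING_RE is a simplified magic comment pattern, the UTF-8 fallback is arbitrary, and decode_source and read_source are hypothetical names. Instead of literally switching to US-ASCII, it matches against the binary string, which amounts to the same thing for ASCII content. Ruby itself does none of this for UTF-16.

```ruby
# Assumed, incomplete BOM table (UTF-32 is left out on purpose, since
# its little-endian BOM starts with the same bytes as UTF-16LE's).
BOMS = {
  "\xEF\xBB\xBF".b => Encoding::UTF_8,
  "\xFE\xFF".b     => Encoding::UTF_16BE,
  "\xFF\xFE".b     => Encoding::UTF_16LE,
}.freeze

# Simplified stand-in for the magic comment pattern.
CODING_RE = /coding[:=]\s*([\w.-]+)/

# Hypothetical helper: tag raw ASCII-8BIT bytes with a source encoding.
def decode_source(raw)
  # Step 1: check the beginning bytes for a known BOM and strip it.
  BOMS.each do |bom, enc|
    next unless raw.start_with?(bom)
    return raw.byteslice(bom.bytesize, raw.bytesize).force_encoding(enc)
  end

  # Step 2: no BOM, so read the first line (or the second, if the
  # first is a shebang) as plain bytes and look for a magic comment.
  first, second = raw.each_line.first(2)
  candidate = first.to_s.start_with?("#!") ? second : first
  name = candidate.to_s[CODING_RE, 1]

  # Step 3: switch encodings if a comment was found; the UTF-8
  # fallback is an assumption of this sketch.
  enc = name ? Encoding.find(name) : Encoding::UTF_8
  raw.dup.force_encoding(enc)
end

# Hypothetical wrapper: open the file in binary mode and decode it.
def read_source(path)
  decode_source(File.binread(path))
end
```

Note that only the BOM branch can ever yield a non-ASCII-compatible encoding here, which is the point: without the BOM there is nothing readable to go on.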
Maybe this technique could be used for reading UTF-16 encoded files, if
needed?
Yes, Ruby could recognize BOMs for non-ASCII-compatible encodings in order to support them. A BOM would be required in this case though, just as it is in an XML processor that doesn't have external information.
Ruby doesn't currently do this, as near as I can tell.
Note that this would not give what you proposed in your initial message: multiple encodings in the same file. Ruby doesn't support that and isn't ever likely to. An XML processor that supports such things is in violation of its specification, as I understand it.
Besides, not many text editors that I'm aware of make it super easy to edit in multiple encodings. 
James Edward Gray II
On Aug 7, 2009, at 10:41 AM, Vít Ondruch wrote: