[OT] A question for people with English OS

Can you view Japanese documents on the internet with an English OS
without special settings or is it garbled text?
This may seem like a silly question but I have always used a Japanese
OS so I do not know.

Is it just about the browser? Or is this a thing of the past?

Harry

···

--
http://www.kakueki.com/ruby/list.html
A Look into Japanese Ruby List in English

Generally, these days the answer is yes. All widely used modern OSes use Unicode natively. Windows XP and Vista, OS X, and Ubuntu all ship with support for multiple languages. They're all basically language agnostic, meaning you can switch system languages as well as browser encodings. Switching the system language may require logging out and back in or rebooting, depending on the OS. Japanese is included, fonts and all, in the standard installations.
The main problem is that browsers do not always detect the encoding. UTF-8 should be used on all web sites from now on, but many Japanese sites still encode their content as Shift-JIS or EUC-JP. So the fault usually lies with the content authors. In theory a Shift-JIS page should show up fine even when the browser's default encoding is UTF-8, provided the page declares its encoding, but in practice it is often necessary to try different encodings manually.
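To make the mismatch concrete, here is a minimal sketch in Python 3 (not something from the thread) of what happens when Shift-JIS bytes are read as UTF-8. The sample string and variable names are made up for illustration.

```python
# Hypothetical sample text; any Japanese string would do.
text = "日本語のページ"

# The same characters become different byte sequences in each encoding.
utf8_bytes = text.encode("utf-8")
sjis_bytes = text.encode("shift_jis")
eucjp_bytes = text.encode("euc_jp")

# Decoding with the encoding the author actually used recovers the text.
restored = sjis_bytes.decode("shift_jis")

# Decoding with the wrong one yields mojibake. errors="replace" substitutes
# U+FFFD for byte sequences that are invalid in the assumed encoding.
garbled = sjis_bytes.decode("utf-8", errors="replace")

print(restored)
print(garbled)
```

The bytes are never "wrong" in themselves; the garbling comes entirely from the reader assuming a different encoding than the writer used.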
One problem is that some fonts do not implement all of the characters in the Unicode ranges used for Japanese. Also, browser plugins such as Flash generally don't play well with Unicode.
Some sites are also built with older, more obscure, or platform-specific encodings, and 'mojibake' is often all you can get.
Some sites built by individuals using WYSIWYG applications may even end up with pages containing multiple, conflicting encodings.
It's getting there, but the word on Unicode isn't completely out there, and not only in Japan. Many application developers worldwide still do not make the effort, or do not even realize they can.
One more point of contention is that different mobile phones in Japan still often use different encodings, which perpetuates some of the trouble. E-mail client apps are also often troublesome with badly formed XML/XHTML or non-Unicode encodings.
Supporting broad amounts of Unicode does have a little more overhead than the old encodings, but not much.
If you go with UTF-8 for web sites, regardless of the language, you should be visible to most modern viewers.
See http://www.unicode.org or the W3C's site for more on it.

···

On May 6, 2007, at 12:16 AM, Harry Kakueki wrote:

Can you view Japanese documents on the internet with an English OS
without special settings or is it garbled text?
This may seem like a silly question but I have always used a Japanese
OS so I do not know.

Is it just about the browser? Or is this a thing of the past?

Harry

-- http://www.kakueki.com/ruby/list.html
A Look into Japanese Ruby List in English

It is not a thing of the past. You need Japanese fonts. Most OSes or
distributions install some, but some do not. Still, this part works in
most cases, and users of obscure distributions are responsible for
their choice, I'd guess :wink:

On the other hand, many web page authors fail to specify the encoding
properly. This doesn't matter for English and a few Western languages.
For most languages that use Latin characters the problem is not
critical, the page is still readable. And many browsers would
autodetect the character set given a hint what language you expect.
But this really hurts for Japanese and other non-Latin scripts. Of
course, I do not set up my browser to try and guess what Japanese
encoding would fit the gibberish I received. There are about five
encodings to try, and only one of them shows some readable characters.
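
The trial-and-error described above can be sketched in Python 3 as follows. The candidate list and the function name `decodable_as` are illustrative assumptions, not an exhaustive or authoritative set.

```python
# Candidate encodings commonly seen on Japanese pages (an illustrative list).
CANDIDATES = ["utf-8", "shift_jis", "euc_jp", "iso2022_jp"]

def decodable_as(raw):
    """Return {encoding: decoded text} for every candidate that decodes cleanly."""
    results = {}
    for name in CANDIDATES:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass  # these bytes are not valid in this encoding
    return results

raw = "日本語".encode("euc_jp")  # pretend the page gave no hint
results = decodable_as(raw)
for name, decoded in results.items():
    print(name, decoded)
```

A clean decode does not prove the guess is right: bytes can be valid in more than one encoding and still come out as nonsense, which is why a human usually has to look at the result.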

So I would guess that about half of the problem is poorly designed web pages.

Of course, when you install a Japanese font and view a page that
specifies the encoding properly the page is *displayed*. You asked
about the ability to *read* the page which requires special skills of
the reader. So in most cases the proper setup does not help much
anyway :wink:

Thanks

Michal

···

Harry Kakueki wrote:

> Can you view Japanese documents on the internet with an English OS
> without special settings or is it garbled text?
> This may seem like a silly question but I have always used a Japanese
> OS so I do not know.
>
> Is it just about the browser? Or is this a thing of the past?
>
> Harry

The reader's machine needs Chinese/Japanese fonts installed to display
them; otherwise, the characters would all be question marks. I am currently
working on a machine without those fonts installed, so I cannot
read any Chinese etc... :frowning:

···

--
Posted via http://www.ruby-forum.com/.

Thanks for the information and the link.
I'll be using that. I need to study this topic.

I have no immediate plans to use Japanese text at my web site.
But I have some links to Japanese pages (and plan to add more) and
wanted to know if the visitor could see the Japanese text.
I guess that is out of my control but I wanted to know.

Thank you.

Harry

···

On 5/6/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:

If you go with UTF-8 for web sites, regardless of the language, you
should be visible to most modern viewers.
see
http://www.unicode.org
for more on it.
or the w3c's site.

--

A Look into Japanese Ruby List in English

Thanks for the input.

Would you look at this without changing any settings and tell me if
you see Japanese or gibberish?

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43471

Thank you.

Harry

A Look into Japanese Ruby List in English

···

On 5/7/07, Michal Suchanek <hramrach@centrum.cz> wrote:

On 05/05/07, Harry Kakueki <list.push@gmail.com> wrote:
> Can you view Japanese documents on the internet with an English OS
> without special settings or is it garbled text?
> This may seem like a silly question but I have always used a Japanese
> OS so I do not know.
>
> Is it just about the browser? Or is this a thing of the past?
>
> And many browsers would
> autodetect the character set given a hint what language you expect.
> But this really hurts for Japanese and other non-Latin scripts. Of
> course, I do not set up my browser to try and guess what Japanese
> encoding would fit the gibberish I received. There are about five
> encodings to try, and only one of them shows some readable characters.
>
> So I would guess that about half of the problem are poorly designed web pages.
>
> Of course, when you install a Japanese font and view a page that
> specifies the encoding properly the page is *displayed*. You asked
> about the ability to *read* the page which requires special skills of
> the reader. So in most cases the proper setup does not help much
> anyway :wink:
>
> Thanks
>
> Michal

Now I am confused.
Some people can see Japanese and some cannot.

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43471

Does this link show you question marks?
How about this one?

http://d.hatena.ne.jp/nappa_zzz/20070429

Thank you.

Harry

···

On 5/7/07, Roseanne Zhang <roseanne@javaranch.com> wrote:

Harry Kakueki wrote:
> Can you view Japanese documents on the internet with an English OS
> without special settings or is it garbled text?
> This may seem like a silly question but I have always used a Japanese
> OS so I do not know.
>
> Is it just about the browser? Or is this a thing of the past?
>
> Harry

The user/reader machine needs to install Chinese/Japanese fonts to see
them, otherwise, they would be all question marks. I am currently
working on a machine without those font installed, therefore, I cannot
read any Chinese etc... :frowning:

--
Posted via http://www.ruby-forum.com/.

--

A Look into Japanese Ruby List in English

OK. I guess I get it.
It's about the fonts. That was pointed out earlier but I missed it.

Thank you.

Harry

···


Sure Harry, no prob.
One thing you can do is view the source of the sites you link to. That will tell you what encoding is being used, if the page declares one. The best thing you can do is e-mail the webmaster of the site to encourage UTF-8.
Like Ruby, it's one of those technologies that spreads slowly at times.
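
Checking the source by hand can also be automated. Below is a rough sketch in Python 3 that looks for a charset named in a `<meta>` tag. The function name `declared_charset` and the sample markup are assumptions for illustration, and a regular expression is only an approximation of how a real HTML parser would find the declaration.

```python
import re

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
META_CHARSET = re.compile(rb"<meta[^>]*charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)

def declared_charset(html):
    """Return the charset named in a <meta> tag, or None if none is declared."""
    match = META_CHARSET.search(html[:4096])  # declarations sit near the top
    return match.group(1).decode("ascii").lower() if match else None

sample = (b'<HTML><HEAD><META HTTP-EQUIV="Content-Type" '
          b'CONTENT="text/html; charset=EUC-JP"></HEAD></HTML>')
print(declared_charset(sample))
print(declared_charset(b"<HTML><BODY>no declaration</BODY></HTML>"))
```

A result of `None` only means the page itself is silent; the server may still announce the encoding in its HTTP headers.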

···

On May 6, 2007, at 10:01 AM, Harry Kakueki wrote:

On 5/6/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:

If you go with UTF-8 for web sites, regardless of the language, you
should be visible to most modern viewers.
see
http://www.unicode.org
for more on it.
or the w3c's site.

Thanks for the information and the link.
I'll be using that. I need to study this topic.

I have no immediate plans to use Japanese text at my web site.
But I have some links to Japanese pages (and plan to add more) and
wanted to know if the visitor could see the Japanese text.
I guess that is out of my control but I wanted to know.

Thank you.

Harry

Japanese. But the page itself is created with bad old HTML with capital letters in the elements.
It contains no DOCTYPE declaration.
Also the page contains no character set encoding declaration.
It's basically up to the user-agent (browser application) in this case to parse it and guess correctly.
User-agents (mostly browsers) often have a 'quirks mode' in which they're remarkably good at rendering a badly formed page.
When the doctype and encoding are not specified, your results will be something readable, a complete mess, or something in between.
Fortunately the page itself is simple enough that its problems as an HTML document do not prevent viewing.

Whoever hosts that site (I've visited it before) should really spend a few minutes to update the thing.
If you are having trouble viewing it, or others are, the best bet is to try a different browser.
It works fine in Safari, which is based on WebKit (derived from KHTML).
It works fine in Firefox (Gecko) as well.
It even works in Opera.

If it works in those, you can't ask for much more.
Older browsers may have more trouble.
But good modern browsers are free so there is no reason to support ancient (by computer standards) technology.

···

On May 7, 2007, at 9:08 PM, Harry Kakueki wrote:

Thanks for the input.

Would you look at this without changing any settings and tell me if
you see Japanese or gibberish?

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43471

Thank you.

Harry

http://www.kakueki.com/ruby/list.html
A Look into Japanese Ruby List in English

This is also true. If you don't have any fonts for a particular language, you won't be able to view it. Generally speaking, those who need them have them or can get them. They're included on the install disk with Windows XP and Vista, and OS X installs them by default. Both of these OSes take internationalization very seriously. Linux, BSD, and other Unixes are more of a mixed bag, but support is there.

@Michal
Like it or not, XHTML is here to stay. It is actually very easy because you don't have so many attributes crowding your elements. There is lots of software to validate it. It's intended to be a form of XML, and it leaves presentation to CSS style sheets.

XHTML and CSS are really really easy to learn.

···

On May 8, 2007, at 12:21 AM, Harry Kakueki wrote:

On 5/7/07, Roseanne Zhang <roseanne@javaranch.com> wrote:

Harry Kakueki wrote:
> Can you view Japanese documents on the internet with an English OS
> without special settings or is it garbled text?
> This may seem like a silly question but I have always used a Japanese
> OS so I do not know.
>
> Is it just about the browser? Or is this a thing of the past?
>
> Harry

The user/reader machine needs to install Chinese/Japanese fonts to see
them, otherwise, they would be all question marks. I am currently
working on a machine without those font installed, therefore, I cannot
read any Chinese etc... :frowning:

--
Posted via http://www.ruby-forum.com/.

OK. I guess I get it.
It's about the fonts. That was pointed out earlier but I missed it.

Thank you.

Harry

--
http://www.kakueki.com/ruby/list.html
A Look into Japanese Ruby List in English

>>
>
> Thanks for the input.
>
> Would you look at this without changing any settings and tell me if
> you see Japanese or gibberish?
>
> http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43471
>
> Thank you.
>
> Harry
>
> http://www.kakueki.com/ruby/list.html
> A Look into Japanese Ruby List in English
>
> Japanese. But the page itself is created with bad old HTML with
> capital letters in the elements.
> It contains no DOCTYPE declaration.

Actually I hate that they specified lowercase letters for elements, and I do not
care about DOCTYPE declarations. XHTML Strict is so limited I never
figured out how it could possibly work. It is probably good enough for
such simple pages, but that's not the point I guess.

However, the page probably does also have incorrect colors. It looks
like the background of some parts is specified, but not the body
background nor the text and link color. It renders gray on gray for
me.

> Also the page contains no character set encoding declaration.

It may be sent in the server headers. It probably is, because the
encoding is EUC-JP and Firefox would not figure that out without a header
somewhere.
It is questionable if server headers or in-page headers are better.
Both have their strengths and limitations.
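
For the server-header case, a minimal Python 3 sketch of pulling the charset out of a `Content-Type` value is shown below. It leans on the standard library's `email.message.Message` for parameter parsing; the helper name `charset_from_header` and the sample header values are assumptions for illustration.

```python
from email.message import Message

def charset_from_header(content_type):
    """Extract the charset parameter from a Content-Type header value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased name, or None when absent

print(charset_from_header("text/html; charset=EUC-JP"))
print(charset_from_header("text/html"))
```

With a header like the first one, the browser has nothing to guess even if the page itself carries no declaration.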

I have also seen links to that page already. I had no problems except
the colors and the fact that it is in a language I do not understand :wink:

Thanks

Michal

···

On 07/05/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:

On May 7, 2007, at 9:08 PM, Harry Kakueki wrote:

Thanks, everybody.
I appreciate it.

Harry

···

On 5/8/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:

On May 8, 2007, at 12:21 AM, Harry Kakueki wrote:


> This is also true. If you don't have any fonts for a particular
> language, you won't be able to view it. Generally speaking, those
> that need them do have them, or can get them. They're included on the
> install disk with Windows XP and Vista and OS X installs them by
> default. Both of these OS's take internationalization very seriously.
> Linux/BSD/other unixes are more of a mixed bag but support is there.

I am not sure about the permissions on the Windows fonts folder.
Explorer offers to install fonts if it finds a page that cannot be
displayed but you may need special privileges for that.
On modern unix-like (OS X, most *BSD, Linux) systems you can put fonts
in your home folder. Firefox uses fontconfig on systems that use X11
so it can find your fonts both on OS X and most unixes.

> @Michal
> Like it or not, xhtml is here to stay. It is actually very easy
> because you don't have so many attributes crowding your elements.
> Lots of software to validate it. It's intended to be a form of XML so
> it uses CSS style sheets.
>
> XHTML and CSS are really really easy to learn.

I do not say that they are hard to learn or that XHTML is harder than
HTML. I am not against moving functionality from HTML to CSS either. I
liked element names in uppercase because it made them stand out. And I
do not like removing functionality. Frames in all forms are
unsupported or deprecated in XHTML as far as I know.

Thanks

Michal

···

On 07/05/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:

XHTML is not one thing; there are several varieties at this time. XHTML 1.0 Strict and all future versions have no frames. That kind of functionality is kind of broken (the reason I don't like RDoc is the frames). One of the worst problems is bookmarking a page that has frames. Search engines have a tough time indexing such things as well. With CSS you can create the same thing with more control, and it degrades much more gracefully.

I understand your thinking about elements being easier to separate from content visually with upper-case. But this is what a good text editor with colors is for.

I use OS X with all the fonts installed. The biggest problem I have is
that many sites do not set the encoding for the pages. I guess someone
using a Japanese computer to visit a Japanese web site would not find
this a problem as the default will be to assume that the site is
Japanese but for me it can be pretty much impossible to view some sites
unless I am prepared to try every possible encoding (and then find out
that they are really written in Chinese or Korean).

I understand that IE can make a pretty good guess at the encoding if it
does not look like Latin but Safari does not seem to hack it.

Fonts and encoding and you are made.

Peter Hickman wrote:

> I use OS X with all the fonts installed. The biggest problem I have is
> that many sites do not set the encoding for the pages. I guess someone
> using a Japanese computer to visit a Japanese web site would not find
> this a problem as the default will be to assume that the site is
> Japanese but for me it can be pretty much impossible to view some sites
> unless I am prepared to try every possible encoding (and then find out
> that they are really written in Chinese or Korean).
>
> I understand that IE can make a pretty good guess at the encoding if it
> does not look like Latin but Safari does not seem to hack it.

Browsers aren't supposed to guess. That IE guesses simply means that IE has yet another bug, born out of Microsoft's typical arrogant refusal to follow standards.

Servers that do not identify the correct encoding are bugged, too. Bitch to the webmasters until they fix it. Their sites are broken, and should be fixed. Period.
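
For what a server that does identify its encoding looks like, here is a minimal sketch as a Python 3 WSGI application. The function name `app` and the page body are made up for illustration; the point is only the `charset` parameter on the `Content-Type` header.

```python
def app(environ, start_response):
    """Minimal WSGI application that states its encoding in the response header."""
    body = "<html><body><p>日本語</p></body></html>".encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),  # tells the browser how to decode
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, one option is the standard library's `wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever()`.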

···

--
John W. Kennedy
"Give up vows and dogmas, and fixed things, and you may grow like That. ...you may come to think a blow bad, because it hurts, and not because it humiliates. You may come to think murder wrong, because it is violent, and not because it is unjust."
   -- G. K. Chesterton. "The Ball and the Cross"
* TagZilla 0.066 * http://tagzilla.mozdev.org

John Joyce wrote:

> xhtml is not one thing there are several varieties at this time. xhtml 1.0 strict and all future versions have no frames.

Neither does HTML 4 strict. Or HTML 3.2. W3C has /never/ wanted frames.

···

--
John W. Kennedy
"...if you had to fall in love with someone who was evil, I can see why it was her."
   -- "Alias"
* TagZilla 0.066 * http://tagzilla.mozdev.org

It may be viewed as refusal to follow standards and encouraging bad
webmaster practices (using some proprietary Windows encoding and
relying on Explorer to guess right). On the other hand, it could be
seen as an attempt to remove some burden from the users. A web browser
developer may implement a scheme for guessing the encoding on sites that
do not specify it but cannot fix the sites.

However, the right implementation would also include a big fat warning
about the encoding being guessed. This serves both to let the user
know that the site is deficient and may be displayed incorrectly and
to remind the web developer that it should be fixed.
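
The "guess, but warn" idea can be sketched in Python 3 like this. It is not how any actual browser works; the fallback order, the function name `decode_page`, and the warning text are all assumptions for illustration.

```python
import warnings

# Illustrative fallback order; a real browser would weigh far more evidence.
FALLBACKS = ("utf-8", "euc_jp", "shift_jis")

def decode_page(raw, declared=None):
    """Return (text, guessed). Warn whenever the encoding had to be guessed."""
    if declared:
        return raw.decode(declared), False
    for name in FALLBACKS:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # try the next candidate
        warnings.warn(f"page declares no encoding; guessed {name}, display may be wrong")
        return text, True
    raise ValueError("none of the fallback encodings fit")

text, guessed = decode_page("日本語".encode("euc_jp"))
print(text, guessed)
```

A page that declares its encoding takes the first branch and triggers no warning, so the nagging lands only on the deficient sites.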

Thanks

Michal

···

On 10/05/07, John W. Kennedy <jwkenne@attglobal.net> wrote:
