[ANN] How to spy on the Japanese Rubists

Harry wrote:

Could we get Eskimo and Inuit translations as well please?

Sorry :)

Uhhhh......No :)

Maybe ... what's Inuit for me?

<ducking>

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

Their language is based more upon sounds than most. Each character
represents a sound, and then sounds together create a word. Our letters
do have sounds, but the complete sound is only made with a combination
of letters. 'rubima' (Ruby Magazine) can only be written 1 way in their
language, and apparently also means 'motivation bean jam'. Any time
they shorten something like that, it's almost assured to also mean
something else. The translator has no way of knowing this was a short
form of other words, and does its best to translate.

Back to technology - now I need an RSS feed proxy that auto-translates
RSS feeds before my feedreader gets them. Netvibes won't be happy if I
send it off to Google for translating :)

The Google Translator returns a split frame panel. The content frame's
URL is:
/translate_p?hl=en&ie=UTF-8&oe=UTF-8&langpair=ja&u=TARGETURL

And that redirects to /translate_c?...

(Note that the en has been removed from the langpair).

But theoretically, if we pass an RSS feed to
/translate_c?hl=en&ie=UTF-8&oe=UTF-8&langpair=ja&u=FEED.XML

it should autotranslate each time it loads.

I passed it: 家庭内インフラ管理者の独り言（はなずきんの日記っぽいの）
(a random feed I found)

Unfortunately, Google Translate doesn't seem to work on RSS feeds. Pity.
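
For anyone who wants to poke at this, here is a minimal Ruby sketch of building the URL described above. The path and query parameters are the ones quoted in this post; the host name, the helper name `translate_url`, and the example feed address are assumptions for illustration only, and the endpoint may no longer behave this way.

```ruby
require "uri"

# Assumed host: the post only gives the path and the query string.
TRANSLATE_HOST = "translate.google.com"

# Hypothetical helper that builds the content-frame URL from the post.
# langpair defaults to "ja", matching the observation that "en" was
# dropped from the langpair after the redirect.
def translate_url(target_url, path: "/translate_c", langpair: "ja")
  query = URI.encode_www_form(
    "hl"       => "en",
    "ie"       => "UTF-8",
    "oe"       => "UTF-8",
    "langpair" => langpair,
    "u"        => target_url   # percent-encoded by encode_www_form
  )
  "http://#{TRANSLATE_HOST}#{path}?#{query}"
end

# Example feed address is a placeholder, not a real feed.
puts translate_url("http://example.com/feed.xml")
```

The target URL is percent-encoded here so that any query string of its own does not get mixed up with the translator's parameters. Whether the service accepts a feed at all is a separate question, as noted above.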

--
Posted via http://www.ruby-forum.com/.

> Their language is based more upon sounds than most.

Well, Japanese does have very regular pronunciation: there are
relatively few syllables, around 50, and these tend to be pronounced
much more consistently than in other languages.

> Each character
> represents a sound, and then sounds together create a word.
> Our letters
> do have sounds, but the complete sound is only made with a combination
> of letters.

Sort of. Actually, there are three Japanese 'character sets' which are
used for different purposes.

Kanji are the Chinese pictorial characters; each kanji stands for a
word or a concept. One of the neat things about kanji is that
speakers of languages which use Chinese characters can often read
written material despite the fact that the writer and reader speak
different languages.

There are two character sets (kana) in which each character represents
a syllable. Hiragana is used for writing Japanese words, often in
combination with Kanji. Japanese children usually learn hiragana
first, since there are far fewer symbols than Kanji. The forms of the
Hiragana characters are derived from Kanji, and look curvier than...
Katakana, which covers the same syllables as Hiragana, but is used
primarily for writing words borrowed or adapted from other languages.

There's also romanji, which is the English/European alphabet, which is
used to directly quote foreign names, and sometimes as a
transliteration of katakana/hiragana. Foreign words tend to get
modified to match the closest Japanese pronunciation, so "Miss America"
would get rendered as Mi-su A-me-ri-ka in romanji, with the Japanese
'mi' being pronounced something like 'me', the Japanese 'me' like the
English 'may,' and the Japanese 'ri' something between 'ree' and
'lee,' actually a sound close to 'dee.'

> 'rubima' (Ruby Magazine) can only be written 1 way in their
> language,

Actually, I'm pretty sure that rubima comes from a popular style of
jargon used by young Japanese, and among various Japanese enthusiast
groups, which comes from abbreviating words and phrases, usually English
ones. There are at least two words for magazine (periodical) in
Japanese: ma-ga-jin, which would be written in katakana, or zasshi,
which would be written in kanji.

> and apparently also means 'motivation bean jam'. Any time
> they shorten something like that, it's almost assured to also mean
> something else. The translator has no way of knowing this was a short
> form of other words, and does its best to translate.

Well maybe it's coming from there. I've discussed this with my
Japanese-American wife, and she couldn't figure out how. The only
thing I can think of as being bean jam would be anko, which is a
popular stiff jelly-like sweet made from the azuki bean.

I suspect that it's really coming from a translation of something
else, perhaps someone's name, sort of like a rote translator of
Italian, translating my last name to "of Christmas."

But then, I could be wrong.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

There's a famous story about an early attempt at machine translation.

Sometime in the 1960s the US Air Force funded a project for
Russian/English translation.

They decided that good test cases could be obtained by taking famous
quotations, translating them to Russian and then back again.

Two of the tests were:

"Out of sight, out of mind."
and
"The spirit is willing, but the flesh is weak"

and they came back respectively as:
"Invisible idiot"
and
"The vodka is strong, but the meat is rotten."

T'aint just a matter of grammar and vocabulary.

I've been trying to program my meatware to do NL translation for 40 or
more years with limited success.

On 8/30/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

Paul Robinson wrote:

> Why is machine translation so bad anyway? It's only changing one set of
> vocabulary and grammar for an equivalent set - it would seem that the
> biggest problem with most tools out there is ridiculously small foreign
> language dictionaries and tiny grammar rule sets. Very odd, and
> probably quite easy to fix.

Not to be unkind, but that's very naive. I can understand people
thinking that way... but only until they've studied the field in
detail, or better yet, tried it themselves.

So try it. If you fail, you'll have learned. If you succeed, you'll
be rich and famous.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Frankly, I think this is absurd. I'm studying Japanese myself, and
from an online dictionary (http://kanjidict.stc.cx) their word for
'motivation' is 'shigeki' or 'mochibeshon' (imported from English and
mangled in the way the Japanese tend to mangle borrowed foreign
words), and as Mr. DeNatale has mentioned, 'anko' means red bean jam.
Apparently the only common words that begin with 'ru-bi' are their
imported word Ruby itself, and the chemical element Rubidium (ルビジウム,
ru-bi-ji-u-mu). There are also apparently no common kanji that have the
reading ru-bi or bi-ma, and of the 26 or so kanji that have the
reading (either on-yomi or kun-yomi) 'ru' none of them have a meaning
even remotely close to any of 'motivation' or 'bean' or 'jam'
(however, interestingly enough, there are apparently several kanji
with the reading 'ru' that mean 'precious stone' or 'lapis lazuli').
Someone's translation software is really screwed up if it rendered
ru-bi-ma as motivation bean paste, however it was written.

On 8/30/06, William Crawford <wccrawford@gmail.com> wrote:

Their language is based more upon sounds than most. Each character
represents a sound, and then sounds together create a word. Our letters
do have sounds, but the complete sound is only made with a combination
of letters. 'rubima' (Ruby Magazine) can only be written 1 way in their
language, and apparently also means 'motivation bean jam'. Any time
they shorten something like that, it's almost assured to also mean
something else. The translator has no way of knowing this was a short
form of other words, and does its best to translate.

> Heh, it's not so.
>
> Have you ever tried to learn a foreign language?

Only Romance languages - French is particularly easy if you're natively English and know a little about construction of sentences from Latin.

> For one, I heard that Eskimos use some tens of various words for
> different kinds of snow and ice. You get the idea.

They're called inuits, not eskimos - calling somebody an eskimo is like calling them a nigger, i.e. highly offensive - but I'm aware of what you mean by the snow/ice thing.

However, this is just social slang, all cultures have it, and there is more slang for those things that culture is obsessed by. Think how many different terms there are in western culture for genitals and having sex and getting drunk (yes, I know this is a sad statement on western culture) - it's exactly the same thing. A big enough dictionary takes care of it.

> Even in areas where the needs for precision weren't very different the
> slicing often happens at different places. So one word would be
> translated as different words in different contexts, often in both
> ways.

That is just basic grammar though - if you know that a pronoun references an antecedent, and you can spot the antecedent, you can deal with the pronoun correctly. If you are able to identify the subject and predicate of a sentence and identify a passive or active voice, spot tense, etc. then you can translate easily.

This isn't hard - a 2-year-old can do it with very primitive language skills.

My point is that current translation technology doesn't even attempt to do that.

> And the grammar is far from equivalent. I already found English
> sentences that I can understand (or so I think) but which would need
> several sentences to be explained in my native language.
> And these are both Indoeuropean languages. Japanese grammar is much
> more interesting :)

Maybe translation should move away from the literal word-for-word translation of words and move toward being able to express an identical idea or thought.

> The only correct translation can come from analyzing the meaning of a
> sentence in context of the previous text, and constructing sentence(s)
> in the other language with similar meaning. Of course, this is nearly
> impossible to do with a computer.

Shouldn't be - all a brain is, is a computer - and I know that gcc can parse syntax within a context far more accurately than I can. What you actually need is a compiler of natural languages - a very, very big syntax parsing mechanism - and then we have all the right tools we need. If somebody is able to write a tool that can translate ruby into C accurately, I don't see why somebody can't write a tool that can translate Japanese into English using a similar set of methods.

Anyway, enough of that, we're way off topic. Interesting discussion. :)

On 31 Aug 2006, at 14:25, Michal Suchanek wrote:

--
Paul

Thanks for the input.

I have listed a few of the subject titles from the Japanese mailing list.
I think if I translated any further I would need permission.
Anyway, there are just a few and only the ones I could do quickly.
It's late at night here so I will add to the list another day as I can
if there are no objections.

Actually, Ruby magazine is interesting to me, so I will be looking at
that some more.

http://www.kakueki.com/ruby/list.html

--
http://www.kakueki.com/

(Sorry, I misplaced Harry's post, which is the one to which I should have replied.)

What takes the time in translation?

I'm just wondering--if people are interested, perhaps someone fluent in
Japanese and English (i.e., a translator) could read Matz's blog (out loud)
and record it. Then other volunteers might be willing to transcribe the
recordings to written (English) text.

I say that (as opposed to streaming the audio, or similar), because with a
slow dial-up connection (and even without it), I prefer written text--it
takes less bandwidth over the Internet and (usually at least) less bandwidth
on my Internet <=> brain interface.

And, I'm not interested enough to volunteer as a transcriber--I'm more curious
to know whether that would be a significant help to a translator.

Randy Kramer

On Tuesday 16 January 2007 05:43 am, Zev Blut wrote:

On Tue, 16 Jan 2007 18:35:02 +0900, Harry <rubyprogrammer@gmail.com> wrote:

> One person interested in the list and one person interested in Matz's
> blog. Hmmm....
> Translating takes a lot of time. Something I don't have much of right
> now.

Dr Nic wrote:

Unfortunately, Google Translate doesn't seem to work on RSS feeds. Pity.

I've summarised this all here:
http://drnicwilliams.com/2006/08/30/translation-of-rss-feeds-a-failure/

Looks like it ignores the feed as bad input and merely redirects you
back to the feed.

Anyone know anyone at Google?

--
Posted via http://www.ruby-forum.com/.

Hi,

At Thu, 31 Aug 2006 02:35:55 +0900,
Rick DeNatale wrote in [ruby-talk:211552]:

> There's also romanji which is the english/european alphabet, which is

Japanese say it romaji, without 'n'.

> > and apparently also means 'motivation bean jam'. Any time
> > they shorten something like that, it's almost assured to also mean
> > something else. The translator has no way of knowing this was a short
> > form of other words, and does its best to translate.
>
> Well maybe it's coming from there. I've discussed this with my
> Japanese-American wife, and she couldn't figure out how. The only
> thing I can think of as being bean jam would be anko, which is a
> popular stiff jelly-like sweet made from the azuki bean.

It came from "yaruki an'noka?", which means "do you have the
motivation?". "an'noka" is rough expression for "arunoka".

--
Nobu Nakada

Dido Sevilla wrote:

Someone's translation software is really screwed up if it rendered
ru-bi-ma as motivation bean paste, however it was written.

But the bookmarklet idea is fun though :)

Also, I added some JavaScript for people to add to their own sites to allow
easy translation of their own site:
http://drnicwilliams.com/2006/08/30/foreign-tourists-to-your-websites-part-2/

Then the Japanese can laugh at the translations in their forum... :)

--
Posted via http://www.ruby-forum.com/.

Paul Robinson wrote:

For one, I heard that Eskimos use some tens of various words for
different kinds of snow and ice. You get the idea.

They're called inuits, not eskimos - calling somebody an eskimo is
like calling them a nigger, i.e. highly offensive - but I'm aware of
what you mean by the snow/ice thing.

I never knew that Eskimos was a derogatory term. The things you learn on
the Ruby forum :)

--
Posted via http://www.ruby-forum.com/.

And it was the other First Nations that called the Inuit
"Eskimo". It means "raw meat eater", IIRC (probably Iroquois). The
things you learn having a teacher as a partner.

-austin

On 8/31/06, Paul Robinson <paul@iconoplex.co.uk> wrote:

On 31 Aug 2006, at 14:25, Michal Suchanek wrote:
> For one, I heard that Eskimos use some tens of various words for
> different kinds of snow and ice. You get the idea.
They're called inuits, not eskimos - calling somebody an eskimo is
like calling them a nigger, i.e. highly offensive - but I'm aware of
what you mean by the snow/ice thing.

--
Austin Ziegler * halostatue@gmail.com * http://www.halostatue.ca/
               * austin@halostatue.ca * You are in a maze of twisty little passages, all alike. // halo • statue
               * austin@zieglers.ca

> Heh, it's not so.
>
> Have you ever tried to learn a foreign language?

Only Romance languages - French is particularly easy if you're
natively English and know a little about construction of sentences
from Latin.

Well, French is, of course, one of the main mother languages of English,
particularly the King's English. Starting in 1066 and for quite
a while, the language spoken in the English court was French.

That's why we commonly have two words for the same thing which come
from either French or Anglo-Saxon. Pork and Pig, Beef and Cow... In
the case of these food terms the word which has more affinity to the
food being on the table is French in origin, while the English word
has more affinity to it being on the farm. Wonder why!

Kent Beck told me while he was working in Zurich as a consultant, that
he had the best results in talking to his Swiss clients if he used
fancy vocabulary and simple grammar, since most Swiss (oops I typed
Suisse at first) speak French as, at least, a second language.

I used to say that I was Swiss, because my Mother was German, my
Father was Italian and je parle un peu de francais.

> For one, I heard that Eskimos use some tens of various words for
> different kinds of snow and ice. You get the idea.

This old wives' tale is the basis of the Sapir-Whorf hypothesis, that
our language constrains our thought. As others have pointed out, the
Eskimo snow vocabulary doesn't stand up to scrutiny; it's not that
speakers of other languages can't express the same ideas, but that
they have different or less jargon for some subject areas.

Which reminds me of another story (you guys are going to think of me
as the old British guy who keeps saying "It reminds me of when I was
in the Crimean").

Back when I was working for a certain three-letter large computer
company, I found myself in a bar late one night in Budapest, with une
amie, (une belle nicoise), and a guy who worked for the same company,
who had been at the Paris office for a couple of years and refused to
learn ANY French. Somehow the conversation turned to his espousal of
the Sapir-Whorf hypothesis:

    UglyAmerican: You don't have to speak anything but English, because
                  everyone you need to do business with speaks English.
                  And people who don't speak English have thoughts they
                  can't form.

    Me: Oh really?!

    UA: For example, there's no way to say "I like you" in French.

    Aside - Yes, in French, the verb aimer means to love, and "Je t'aime"
            means "I love you," but the French, being subtle, will say
            something like "je t'aime bien", which would seem to intensify
            the verb but really turns off the amorous aspects of the verb.
            As a Catholic priest told my grammar school class many years
            ago, like is more than love, because I HAVE to love you even
            though I don't like you.

            And now back to our story

    Me: Really!

    UA: And the Japanese don't have a way to say no, because it wouldn't
        be polite.

    Me: Yes they do, in fact they have lots of ways to say it, at various
        levels of politeness and urgency.

I guess the point is that, even compared to programming languages,
human languages are subtle, and a lot of us have misconceptions about
the languages we don't speak or speak well. I guess that the latter
holds for programming languages as well.

Hmmmm, anyone up for a ruby quiz on machine translation?

On 8/31/06, Paul Robinson <paul@iconoplex.co.uk> wrote:

On 31 Aug 2006, at 14:25, Michal Suchanek wrote:

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Paul Robinson wrote:

Shouldn't be - all a brain is, is a computer - and I know that gcc can
parse syntax within a context far more accurately than I can. What you
actually need is a compiler of natural languages - a very, very big
syntax parsing mechanism - and then we have all the right tools we need.
If somebody is able to write a tool that can translate ruby into C
accurately, I don't see why somebody can't write a tool that can
translate Japanese into English using a similar set of methods.

Anyway, enough of that, we're way off topic. Interesting discussion. :)

Back in the good old days, people were predicting that "artificial
intelligence", meaning machines playing grandmaster chess, making
intelligent translations of English to Russian and back again, and being
indistinguishable from a human in an interview via teletype, was "just
around the corner."

Well, there's Deep Blue ... there's Babel Fish ... and there are
"chatbots" that can at least fool a lovestruck teenager. :)

But ... it took a *lot* longer than people back in the late 1950s
thought it would, didn't it? :)

> For one, I heard that Eskimos use some tens of various words for
> different kinds of snow and ice. You get the idea.

They're called inuits, not eskimos - calling somebody an eskimo is
like calling them a nigger, i.e. highly offensive - but I'm aware of
what you mean by the snow/ice thing.

I hereby apologize to any Inuit frequenting this list. In my defence
I'd like to say that the word Eskimo was adopted into Czech probably
more than half a century ago, and since there are no Inuit around, the
nationality designation could not collect any negative racist or
pejorative connotations.

However, we also have to develop new designations for nationalities
that are around, because designations that are used too long become
somewhat racist or pejorative, just like in English-speaking
countries.

However, this is just social slang, all cultures have it, and there
is more slang for those things that culture is obsessed by. Think how
many different terms there are in western culture for genitals and
having sex and getting drunk (yes, I know this is a sad statement on
western culture) - it's exactly the same thing. A big enough
dictionary takes care of it.

Slang or not, it creates many words with different meanings; at least
in the snow case these should be different kinds of snow, I believe.

However the drunk case is interesting as well. As far as I am aware in
Czech there are two words widely used that mean 'get drunk by wine',
'and get drunk by beer', and the rest of the slang is just a synonym
for 'get drunk'. Though some are probably used for 'get drunk much'
and others for 'get drunk slightly'. I do not have detailed knowledge
of English in this regard.

This reflects that wine and beer are the two historically most widely used
drugs in Europe, so one needs to talk about the consequences of using
them.

Thanks

Michal

On 8/31/06, Paul Robinson <paul@iconoplex.co.uk> wrote:

On 31 Aug 2006, at 14:25, Michal Suchanek wrote:

/me is interested in Matz's blog

What takes the time in translation?

I'm just wondering--if people are interested, perhaps someone fluent in
Japanese and English (i.e., a translator) could read Matz's blog (out loud)
and record it. Then other volunteers might be willing to transcribe the
recordings to written (English) text.

I say that (as opposed to streaming the audio, or similar), because with a
slow dial-up connection (and even without it), I prefer written text--it
takes less bandwidth over the Internet and (usually at least) less bandwidth
on my Internet <=> brain interface.

And, I'm not interested enough to volunteer as a transcriber--I'm more curious
to know whether that would be a significant help to a translator.

Randy Kramer

I don't know about other translators but I prefer Japanese text to English text.
You have probably guessed that Japanese is not my first language :)
If the topic is somewhat complex, I need to read it again, and maybe again.
Then I type slooooowly. It's the only way I can type :)

Many people who do not translate think that a translator should be
able to spit out a page every 5 minutes. It doesn't work that way. If
it were that easy someone would have done all these translations
already. I have noticed people here who can handle both languages.

--

Diplomacy is the art of saying "Nice doggie" until you can find a rock.
Will Rogers

Domo arigato, Nobu-sensei.

I wonder where the "bean jam" translation is coming from.

On 8/30/06, nobu@ruby-lang.org <nobu@ruby-lang.org> wrote:

Hi,

At Thu, 31 Aug 2006 02:35:55 +0900,
Rick DeNatale wrote in [ruby-talk:211552]:
> There's also romanji which is the english/european alphabet, which is

Japanese say it romaji, without 'n'.

> > and apparently also means 'motivation bean jam'. Any time
> > they shorten something like that, it's almost assured to also mean
> > something else. The translator has no way of knowing this was a short
> > form of other words, and does its best to translate.
>
> Well maybe it's coming from there. I've discussed this with my
> Japanese-American wife, and she couldn't figure out how. The only
> thing I can think of as being bean jam would be anko, which is a
> popular stiff jelly-like sweet made from the azuki bean.

It came from "yaruki an'noka?", which means "do you have the
motivation?". "an'noka" is rough expression for "arunoka".

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

IPMS/USA Region 12 Coordinator
http://ipmsr12.denhaven2.com/

Visit the Project Mercury Wiki Site
http://www.mercuryspacecraft.com/

Dr Nic wrote:

Paul Robinson wrote:

For one, I heard that Eskimos use some tens of various words for
different kinds of snow and ice. You get the idea.

They're called inuits, not eskimos - calling somebody an eskimo is
like calling them a nigger, i.e. highly offensive - but I'm aware of
what you mean by the snow/ice thing.

I never knew that Eskimos was a derogatory term. The things you learn on the Ruby forum :)

The Inuit don't have any more words for snow than anybody else: http://www.straightdope.com/columns/010202.html

That's 10,000,000 pedantic points for me. Do I win?