'kakasi': different style for hiragana and katakana?

In my kakasi-renamer I want to change filenames to Roumaji as follows:

hiragana
KATAKANA
Kanji

How to do that? I’m thinking of running kakasi multiple times, perhaps:

Katakana->ASCII
upcase
Everything else->ASCII

But this would also upcase existing ASCII characters in the file name.

Currently I have

if Kconv::guess (name) != Kconv::UNKNOWN
  euc = Kconv::toeuc (name)
  ascii = Kakasi::kakasi ('-C -s -ja -ga -ka -Ea -Ka -Ha -Ja -ieuc -oeuc -rhepburn -u', euc)
  # Could I instead write Kanji, hiragana and KATAKANA?
  # Watashi wa gaijin desu, I don't know the language, I don't know the
  # coding systems, I don't know kakasi's internals, therefore I don't
  # know if it is safe to call multiple times with different options.
  # All I know is how to use kinput2 to write Greek letters (ARUFA, PAI)
  # and the infinity sign (mugendai).

Does anyone have a solution or at least a hint? Or is there a Kakasi
version that can do this directly?

···



In my kakasi-renamer I want to change filenames to Roumaji as follows:

hiragana
KATAKANA
Kanji

As far as I know, kakasi doesn’t have such function.

How to do that? I’m thinking of running kakasi multiple times, perhaps:

Katakana->ASCII
upcase
Everything else->ASCII

But this would also upcase existing ASCII characters in the file name.

You can use gsub.

require "kakasi"

def eucjp_katakana_to_upcase(str)
  katakana_from = "\xa5\xa1" # small a
  katakana_end = "\xa5\xf6" # small ke
  katakana = /[#{katakana_from}-#{katakana_end}]+/em # em for euc&multiline
  str.gsub(katakana){Kakasi::kakasi("-ka -Ka", $&).upcase}
end

if __FILE__ == $0
  puts eucjp_katakana_to_upcase(ARGF.read)
end
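
The same gsub idea can be stretched to cover all three styles from the question (hiragana, KATAKANA, Kanji). What follows is only a sketch under assumptions: it works on UTF-8 strings with Unicode script properties instead of EUC-JP byte ranges, and `romanize` is a hypothetical callable standing in for the `Kakasi::kakasi` call. `style_romaji`, `SCRIPT_RUN`, and the toy lookup table are illustrative names, not part of kakasi.

```ruby
# One alternative per script; the prolonged sound mark is added by hand
# because it is not classified as Katakana by the script property.
SCRIPT_RUN = /(?<kata>[\p{Katakana}ー]+)|(?<hira>\p{Hiragana}+)|(?<kanji>\p{Han}+)/

# Romanize each run of Japanese text and style it by script.
# Existing ASCII in the name is never matched, so it stays as it is.
def style_romaji(str, romanize)
  str.gsub(SCRIPT_RUN) do
    m = Regexp.last_match
    roman = romanize.call(m[0])
    if m[:kata]
      roman.upcase
    elsif m[:hira]
      roman.downcase
    else
      roman.capitalize
    end
  end
end

# Toy romanizer (hypothetical): a three-word lookup table for the demo only.
TOY_WORDS = { "ひらがな" => "hiragana", "カタカナ" => "katakana", "漢字" => "kanji" }.freeze
TOY_ROMANIZER = ->(run) { TOY_WORDS.fetch(run, run) }

puts style_romaji("ひらがな_カタカナ_漢字_README.txt", TOY_ROMANIZER)
```

With a real romanizer plugged in, the surrounding ASCII (here `README.txt`) is left alone because only the matched runs are rewritten.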

Currently I have

if Kconv::guess (name) != Kconv::UNKNOWN
  euc = Kconv::toeuc (name)

Kconv::guess and Kconv::toeuc use different algorithms to determine the
encoding, so "if Kconv::guess (name) != Kconv::UNKNOWN" may not work.

– Gotoken

···

At Fri, 27 Sep 2002 08:18:10 +0900, Rudolf Polzer wrote:

Scripsit ille aut illa GOTO Kentaro gotoken@notwork.org:

Rudolf Polzer wrote:

In my kakasi-renamer I want to change filenames to Roumaji as follows:

hiragana
KATAKANA
Kanji

As far as I know, kakasi doesn’t have such function.

[…]

def eucjp_katakana_to_upcase(str)
  katakana_from = "\xa5\xa1" # small a
  katakana_end = "\xa5\xf6" # small ke
  katakana = /[#{katakana_from}-#{katakana_end}]+/em # em for euc&multiline
  str.gsub(katakana){Kakasi::kakasi("-ka -Ka", $&).upcase}
end

Thanks! Where can I find an EUC-JP coding table (not containing each
single code, but containing all relevant ranges)?
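
For orientation, the ranges that matter here can be sketched as a small classifier. This is an assumption-laden summary rather than a full coding table: it relies on the JIS X 0208 row layout, where the EUC-JP lead byte is 0xA0 plus the row number, and `eucjp_class` is a hypothetical helper name.

```ruby
# Classify one EUC-JP character by its lead byte.
# Ranges follow the JIS X 0208 row layout (lead byte = 0xA0 + row number).
def eucjp_class(char_bytes)
  case char_bytes.first
  when 0x00..0x7F then :ascii
  when 0x8E       then :halfwidth_katakana # SS2 + one byte 0xA1..0xDF
  when 0x8F       then :jisx0212           # SS3 + two bytes
  when 0xA4       then :hiragana           # row 4: 0xA4A1..0xA4F3
  when 0xA5       then :katakana           # row 5: 0xA5A1..0xA5F6
  when 0xB0..0xF4 then :kanji              # rows 16..84
  when 0xA1..0xFE then :other              # symbols, Greek, Cyrillic, ...
  else :invalid
  end
end

p eucjp_class("カ".encode("EUC-JP").bytes) # => :katakana
```

The katakana row is the one the regular expression above spans with "\xa5\xa1" to "\xa5\xf6".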

if __FILE__ == $0
  puts eucjp_katakana_to_upcase(ARGF.read)
end

Currently I have

if Kconv::guess (name) != Kconv::UNKNOWN
  euc = Kconv::toeuc (name)

Kconv::guess and Kconv::toeuc use different algorithms to determine the
encoding, so "if Kconv::guess (name) != Kconv::UNKNOWN" may not work.

I’m using this to prevent changing of file names that just contain
German umlaut characters (ä, ö, ü), the German ß or the Euro sign (¤,
\xa4). But I could save the result of Kconv::guess and use this when
converting.
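
The "decide once, then reuse the decision" idea could look roughly like this. It is a sketch that swaps in core String methods for Kconv, so the check and the conversion cannot disagree; `eucjp_name_or_nil` is a hypothetical helper, and the validity test is only a heuristic (two adjacent Latin-1 high bytes, such as "üß", also form a valid EUC-JP pair).

```ruby
# Decide once whether a file name looks like EUC-JP, then reuse that decision.
# Returns the name tagged as EUC-JP, or nil if it should be left alone.
def eucjp_name_or_nil(name)
  return nil if name.ascii_only?              # nothing to romanize
  candidate = name.dup.force_encoding(Encoding::EUC_JP)
  candidate.valid_encoding? ? candidate : nil # heuristic, not proof
end

# A lone Latin-1 umlaut followed by ASCII is not a valid EUC-JP sequence,
# so this name is skipped; the katakana name is accepted.
p eucjp_name_or_nil("M\xFCller.txt".b)
p eucjp_name_or_nil("\xA5\xAB.mp3".b)&.encoding
```

Only names for which the helper returns a string would then be handed to kakasi.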

···

