[ruby-talk:444454] codepoints

Hi,

How to reverse str.codepoints?

How can I convert codepoints between UTF-8 and UTF-16?
e.g.: cp1 = [0xf0, 0x9f, 0x8f, 0xb3]; cp2 = [0xe2, 0x82, 0xac] #=> 0x20ac
Could someone add UTF-8 to pack/unpack?
# The output seems to be UTF-16 rather than UTF-8 as the docs say!

How can I show the US flag (the Stars and Stripes as one graphical symbol),
composed of [0x1f1fa, 0x1f1f8]?
# How do I compose Unicode symbols that consist of several codepoints?


______________________________________________
ruby-talk mailing list -- ruby-talk@ml.ruby-lang.org
To unsubscribe send an email to ruby-talk-leave@ml.ruby-lang.org
ruby-talk info -- Info | ruby-talk@ml.ruby-lang.org - ml.ruby-lang.org

Hi,

I'm not sure if you've checked, but there is plenty of online
documentation for Ruby.

Hi,

   How to reverse str.codepoints?

  str.codepoints.reverse
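
If the question was instead about the inverse operation (turning an array of codepoints back into a string), a minimal sketch could look like the following. The helper name `codepoints_to_string` is made up for illustration; `Array#pack` with the `U` directive is standard Ruby and already emits UTF-8, which also covers the pack/unpack part of the question.

```ruby
# Hypothetical helper (name chosen for this example only):
# rebuild a UTF-8 string from an array of integer codepoints.
def codepoints_to_string(codepoints)
  # The 'U*' template encodes every integer as one UTF-8 character.
  codepoints.pack('U*')
end

original = "h\u20ACllo"                       # "h€llo"
restored = codepoints_to_string(original.codepoints)

puts restored == original                     # the round trip preserves the text
puts restored.encoding                        # encoding of the packed string
```

The opposite direction is `str.unpack('U*')`, which for a valid UTF-8 string gives the same integers as `str.codepoints`.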

   How can I convert codepoints between UTF-8 and UTF-16?
   e.g.: cp1 = [0xf0, 0x9f, 0x8f, 0xb3]; cp2 = [0xe2, 0x82, 0xac] #=> 0x20ac
   Could someone add UTF-8 to pack/unpack?
         # The output seems to be UTF-16 rather than UTF-8 as the docs say!

Those are arrays of bytes, not codepoints. A codepoint is usually
transported as a single integer. The most logical way to deal with
characters and codepoints is with strings, so I'd start by getting it
back from an array of bytes to a string with the correct encoding
metadata:

  str1 = cp1.pack('C*').force_encoding('UTF-8') #=> "🏳"
  str2 = cp2.pack('C*').force_encoding('UTF-8') #=> "€"

Then you can convert to UTF-16:

  str1.encode('UTF-16').codepoints #=> [0xFEFF, 0x1F3F3]
  str2.encode('UTF-16').codepoints #=> [0xFEFF, 0x20AC]
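
If what is wanted is the 16-bit code units of the UTF-16 form (including surrogate pairs) rather than the codepoints, something along these lines may help. This is only a sketch: the helper name `utf16_code_units` is invented here, and it uses 'UTF-16BE' so that the byte order is explicit and no byte order mark is prepended.

```ruby
# Hypothetical helper: UTF-8 bytes in, array of 16-bit UTF-16 code units out.
def utf16_code_units(utf8_bytes)
  # Bytes -> binary string -> tag it as UTF-8 (no transcoding happens here).
  str = utf8_bytes.pack('C*').force_encoding('UTF-8')
  # Transcode to big-endian UTF-16, then read it back as
  # big-endian unsigned 16-bit integers ('n*').
  str.encode('UTF-16BE').unpack('n*')
end

cp1 = [0xf0, 0x9f, 0x8f, 0xb3]   # U+1F3F3, outside the Basic Multilingual Plane
cp2 = [0xe2, 0x82, 0xac]         # U+20AC, inside the Basic Multilingual Plane

p utf16_code_units(cp1).map { |u| format('0x%04X', u) }  # ["0xD83C", "0xDFF3"]
p utf16_code_units(cp2).map { |u| format('0x%04X', u) }  # ["0x20AC"]
```

A codepoint above 0xFFFF becomes two code units (a surrogate pair), which is why the first result has two entries for a single character.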

My question is: why? What are you trying to do? You seem to be partway
down a rabbit hole and have maybe lost track of the actual goal you're
trying to achieve?

   How can I show the US flag (the Stars and Stripes as one graphical symbol),
composed of [0x1f1fa, 0x1f1f8]?
         # How do I compose Unicode symbols that consist of several codepoints?

Emoji sequences (and other character sequences) are sequences of
codepoints, so you have to transmit them as a sequence. I don't know
what you're asking.

  [0x1f1fa, 0x1f1f8].pack 'U*' #=> "🇺🇸"
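
For flags in general, the two codepoints are regional indicator symbols, one per letter of the two-letter country code, starting at U+1F1E6 for "A". A rough sketch, assuming a two-letter A-Z code; the helper `flag_for` and the constant name are hypothetical and exist only for this example:

```ruby
# U+1F1E6 is REGIONAL INDICATOR SYMBOL LETTER A; B..Z follow consecutively.
REGIONAL_INDICATOR_A = 0x1F1E6

# Hypothetical helper: map a two-letter country code such as "US"
# to the string made of the two matching regional indicator symbols.
def flag_for(country_code)
  codepoints = country_code.upcase.each_char.map do |letter|
    REGIONAL_INDICATOR_A + (letter.ord - 'A'.ord)
  end
  codepoints.pack('U*')
end

flag = flag_for('US')
puts flag
puts flag.codepoints.map { |cp| format('0x%x', cp) }.join(', ')  # 0x1f1fa, 0x1f1f8
puts flag.length                     # 2, because String#length counts codepoints
puts flag.grapheme_clusters.length   # user-perceived characters, per Ruby's Unicode tables
```

Whether the pair is drawn as a single flag depends on the font and the terminal or browser; the string itself always holds two codepoints.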

Cheers


On Sun, 21 Apr 2024 at 19:02, Information via ruby-talk <ruby-talk@ml.ruby-lang.org> wrote:
--
  Matthew Kerwin [he/him]
  https://matthew.kerwin.net.au/