How do you feel about utf-8 in ruby code?


(RRRoy BBBean) #1

Is this acceptable?

'a⟶b'.to_sym == :a⟶b

Granted, this is a simplistic example, but how do you feel about using utf-8 in the ruby code itself?
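
For readers who want to try the question themselves, here is a minimal sketch. It assumes the file is saved as UTF-8, which Ruby 2.0 and later treat as the default source encoding. The method name `a⟶b` is only an illustration and is not from any library.

```ruby
# Assumes this file is saved as UTF-8 (the default source encoding since Ruby 2.0).

# A symbol literal containing a non-ASCII character.
sym = :a⟶b

# The string-derived symbol and the literal are the same symbol.
same = ('a⟶b'.to_sym == sym)
puts same

# Non-ASCII characters are also accepted in method names
# (hypothetical method, for illustration only).
def a⟶b(x)
  x + 1
end

puts a⟶b(1)
puts sym.encoding
```

Whether this is acceptable in a shared code base is a separate question from whether it parses, which is what the rest of the thread discusses.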


(Walter Lee Davis) #2

I will sometimes use non-ascii characters in comments, but never in a method name or signature. It’s an ergonomic thing—I just don’t want to have to type a combo keystroke that often.

Walter

···

On Mar 14, 2019, at 4:42 PM, RRRoy BBBean <rrroybbbean@gmail.com> wrote:

Is this acceptable?

'a⟶b'.to_sym == :a⟶b

Granted, this is a simplistic example, but how do you feel about using utf-8 in the ruby code itself?

Unsubscribe: <mailto:ruby-talk-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk>


(RRRoy BBBean) #3

Objection noted: Ergonomics. Thank you 🙂

···

On 3/14/19 3:45 PM, Walter Lee Davis wrote:

I will sometimes use non-ascii characters in comments, but never in a method name or signature. It’s an ergonomic thing—I just don’t want to have to type a combo keystroke that often.

Walter


(Peter Hickman) #4

As an observation on other languages that try to show off Unicode in the
source:

1) I probably don't know how to type whatever weird symbol you used, so I
will forever have to cut and paste to edit your code
2) I, personally, know that *𝛿* is read as delta, but what does ϕ mean?
Meaningful (or even pronounceable) variable names are out the window
3) Someone is going to edit this in an editor that doesn't understand
Unicode, and we will end up with æøåÆØÅ in our code instead
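
Point 3 can be reproduced in a few lines. This is a sketch only, assuming the "editor that doesn't understand Unicode" reads the file as ISO-8859-1, which is one of several possible wrong guesses. The variable names are illustrative.

```ruby
# Assumes this file is saved as UTF-8.
source = "a⟶b"

# Reinterpret the same bytes as ISO-8859-1 (one character per byte),
# then transcode back to UTF-8 so the misreading can be displayed.
misread = source.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

puts source.length    # characters as the author sees them
puts source.bytesize  # bytes on disk
puts misread.length   # characters as the misconfigured editor sees them
puts misread.inspect  # the arrow has turned into several unrelated characters
```

If that editor then saves the file, the damaged characters are written back as new bytes, and the identifier no longer matches its other uses.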


(Wolf) #5

You also need to take fonts into account: this one is rendered as a square
in my terminal, since it seems I don't have a font for it 😕

W.

···

Peter Hickman wrote:

[..] know that *𝛿* is read as delta [..]

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.


(Viktor Ml. Justo Vasquez) #6

Honest question: why do you want that? Why not just use a font with
ligatures, such as Fira Code?


(Igor Fontana) #7

💣 = Class.new { def self.🔥; raise "💥💀"; end }
💣.🔥

···

On Tue, Mar 19, 2019 at 08:41, Viktor Ml. Justo Vasquez (viktor.is.a.genius@gmail.com) wrote:

Honest question, why do you want that? why not just using a font with ligatures like fira code or something like that?

--
miau =o3


(Igor Fontana) #8

Actually I'm not nearly as good as this wonderful person:

···

On Wed, Mar 20, 2019 at 15:38, Igor Fontana (rogi@skylittlesystem.org) wrote:

💣 = Class.new { def self.🔥; raise "💥💀"; end }
💣.🔥

--
miau =o3


(Gerald Bauer) #9

Hello,

In the safestruct [1] library / gem I'm using Unicode (UTF-8)
identifiers for typesafe class constants. For example, instead of this
"generated" name:

     Hash_Address_x_Hash_Address_x_Game

for a nested Hash with a Hash of Game structs, the library uses:

     Hash‹Address→Hash‹Address→Game››

And yes, that's a valid class constant! Here's another "real world"
example with arrays. Instead of the "classic" Array_Array_Integer_x3
the library uses:

     Array‹Array‹Integer››×3

Conclusion: Yes, Unicode (UTF-8) identifiers rock!

     Cheers. Prost.

PS: Try this - yes, it works:

     pp Hash‹Address→Hash‹Address→Game››
     pp Hash‹Address→Hash‹Address→Game››.new

     pp Array‹Array‹Integer››×3
     pp Array‹Array‹Integer››×3.new

     and so on

[1] https://github.com/s6ruby/safestruct
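
To see why such names are valid constants, here is a small stand-alone sketch. It is not safestruct's implementation: the `Sketch` module and the `typed_array` helper are hypothetical, and the only assumption is that Ruby accepts a constant name that starts with an ASCII uppercase letter and continues with non-ASCII characters.

```ruby
# Hypothetical illustration, NOT the safestruct implementation.
module Sketch
  # Returns (and caches) an Array subclass stored under a constant
  # named like "Array‹Integer›".
  def self.typed_array(klass)
    name = "Array‹#{klass.name}›"
    return const_get(name) if const_defined?(name, false)

    const_set(name, Class.new(Array))
  end
end

ints = Sketch.typed_array(Integer)
puts ints.name

# The generated constant can also be written literally in source code.
puts Sketch::Array‹Integer›.equal?(ints)
```

Because the class is assigned to a constant, its `name` (and therefore the default `inspect` output of its instances) carries the Unicode characters without anyone having to type them.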


(Peter Hickman) #10

Gerald, thank you for providing the perfect example of why using Unicode is
such a bad idea. Let's take the line you gave:

Hash‹Address→Hash‹Address→Game››

Now, I don't know which actual code point your › is, but here is a list of
Unicode code points that look similar (and there are probably more):

U+003E >
U+02C3 ˃
U+02F2 ˲ <-- In my editor this looks like a squished U+02C3, but in Gmail
this looks like a , to me. Go figure
U+203A ›
U+227B ≻

So to work on your code I would be forced to cut and paste all the time to
guarantee I was using the correct code point. That is not going to be a
pleasant experience, and it is definitely going to be a source of bugs.

As a bonus let us take the end of the line, the ››, which we can assume is
just two › code points. Well how about:

U+00BB »
U+226B ≫

Visually similar, especially U+00BB, but incorrect

Until all editors, mail clients, and so on handle Unicode correctly, this is
simply a source of confusion. (Trust me here: U+02F2 looks similar to > in my
editor but looks like a comma in Gmail. Perhaps you are not seeing this,
though; who can tell?)

Unicode is nice and all, but it seems to offer no advantage in this case.
Rather, it can be a source of frustration and bugs.
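
One way to find out which code points an identifier actually contains, instead of guessing from how they render, is to ask Ruby. This is a minimal sketch, and the helper name `non_ascii_code_points` is made up for illustration.

```ruby
# Hypothetical helper: lists each distinct non-ASCII character in a string
# together with its code point.
def non_ascii_code_points(text)
  text.each_char
      .reject(&:ascii_only?)
      .uniq
      .map { |ch| format("U+%04X %s", ch.ord, ch) }
end

puts non_ascii_code_points("Hash‹Address→Game›")

# The look-alikes from the list above compare as different strings,
# even when a font draws them almost identically.
lookalikes = [">", "\u02C3", "\u02F2", "\u203A", "\u227B"]
puts lookalikes.uniq.length == lookalikes.length

# Two single angle quotes are not the same as one double angle quote.
puts "\u203A\u203A" == "\u00BB"
```

This helps when reading someone else's code, but it does not remove the typing and rendering problems described above.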


(Peter Hickman) #11

Oh yes. I forgot fonts. Full unicode fonts are rare so using unicode limits
the fonts I can use in my editor


(Gerald Bauer) #12

Hello,

   Good point about code points. I will add the code points to the documentation.

    Note: For now the unicode names are auto-generated (no typing
needed) and used for pretty printing. Example:

#<Safe::Array‹Integer›:0x31c1f10 @ary=[]>
.#<Safe::Array‹Array‹Integer››:0x31c11c0
 @ary=
  [#<Safe::Array‹Integer›:0x31c1160 @ary=[100, 200]>,
   #<Safe::Array‹Integer›:0x31c1130 @ary=[300]>]>
#<Safe::Array‹Array‹Integer››:0x31bb388
 @ary=
  [#<Safe::Array‹Integer›:0x31bb358 @ary=[100, 200]>,
   #<Safe::Array‹Integer›:0x31bb328 @ary=[300]>]>
..#<Safe::Array‹Array‹Integer›×3›×3:0x31b8c58
 @ary=
  [#<Safe::Array‹Integer›×3:0x31b8bf8 @ary=[0, 0, 0]>,
   #<Safe::Array‹Integer›×3:0x31b8bc8 @ary=[0, 0, 0]>,
   #<Safe::Array‹Integer›×3:0x31b8b98 @ary=[0, 0, 0]>]>
#<Safe::Array‹Array‹Integer›×3›×3:0x31b8c58
 @ary=
  [#<Safe::Array‹Integer›×3:0x31b8bf8 @ary=[1, 2, 1]>,
   #<Safe::Array‹Integer›×3:0x31b8bc8 @ary=[2, 0, 0]>,
   #<Safe::Array‹Integer›×3:0x31b8b98 @ary=[0, 0, 0]>]>

Isn't ruby (with unicode) beautiful?

  Happy coding with ruby (and unicode). Cheers. Prost.

PS: Note, the "magic" unicode names in the safestruct (safe data
structures) library are now documented in the section titled "Bonus:
Auto-Generated Unicode (UTF-8) Class Constants / Names for Pretty
Printing" [1].

[1] https://github.com/s6ruby/safestruct#bonus-auto-generated-unicode-utf-8-class-constants--names-for-pretty-printing