Greetings, folks. First time poster, so if I breach
any etiquette I sincerely apologize. I'm a bit of a Ruby Nuby who's
been
bouncing around between Python and Ruby, not entirely satisfied with
either
and wishing both were better. Two years ago I had no familiarity with
either
language, then I quit working for Microsoft and learned the true joy
of
programming in dynamic languages.
I am not a zealot and have little tolerance for zealotry, and I have
no
desire to get involved in holy wars. I'm a little apprehensive that
I'm
about to step into one, but here goes anyway. In general, I prefer
Ruby's
computational model to Python's. I think code blocks are cool, and I
love Ruby's
very flexible expressiveness. I dig the way every statement is an
expression,
and being able to return a value simply by stating it rather than
using the
'return' keyword. I hate Python's reliance on global methods like len
() and
filter() and map() (rather than making thesem methods members of the
classes
to which they apply) and I absolutely loathe its reliance on __magic__
method names. Ruby's ability to reopen and modify _any_ class kicks
ass, and
any Python fan who wants to deride "monkeypatching" can shove it. It
rocks.
That being said, I think monkeypatching could use some syntactic sugar
to
provide a cleaner way of referencing overridden methods, so instead
of:
module Kernel
alias oldprint print
def print(*args)
do_something
oldprint *(args + [" :-)"])
end
end
...maybe something like this:
module Kernel
override print(*args)
do_something
overridden *(args + [" :-)"])
end
end
But I digress... the purpose of this post is to talk about one of the
relatively
few areas where I think Python beats Ruby, and that's syntatically-
significant
indentation.
Before I get into it, let me say to those of you whose eyes are
rolling way
back in your skulls that I have a proposal that will allow you to keep
your
precious end keyword if you insist, and will be 100% backward
compatible with
your existing code. Skip down to "My proposal is" if you want to cut
to the
chase.
When I encounter engineers who don't know Python, I sometimes ask them
if they've heard anything about the language, and more often than not,
they
answer, "Whitespace is significant." And more often than not, they
think that's
about the dumbest idea ever ever. I used to think the same. Then I
learned
Python, and now I think that using indentation to define scope is
awesome.
I started looking over my code in C++ and realized that if some
nefarious
person took all of my code and stripped out the braces, I could easily
write a simple script in either Python or Ruby to restore them,
because
their locations would be completely unambiguous: open braces go right
before
the indentation level increases, close braces go right before it
decreases. And
having gotten used to this beautiful way of making code cleaner, I
hate that
Ruby doesn't have it.
I've read the two-year-old thread at
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/252034
(well, most of it, anyway) and I'll answer some of the objections
raised
in it, but first let me illustrate the scope of the problem with the
output
of a quick and dirty script I wrote:
Examined 1433 files in /usr/lib/ruby/1.8.
Total non-empty lines: 193458
Lines consisting of NOTHING BUT THE WORD END: 31587 (a whopping 16.33%)Streaks:
7: 4
6: 37
5: 28
4: 159
3: 765
2: 4082
1: 16505
My friends, when ONE OUT OF EVERY SIX of your code lines consists of
just the
word "end", you have a problem with conciseness. I recognize that
syntactically-
significant indentation is not perfect, and it would bring a few pain
points
with it. But let me say that again: ONE OUT OF EVERY SIX LINES, for
crying out
loud! This should be intolerable to engineers who value elegance.
"Streaks"
means what you'd expect: there are four places in the scanned files
that look
like this:
end
end
end
end
end
end
end
This is *not* DRY. Or anything remotely resembling it. This is an
ugly blemidh on a language that otherwise is very beautiful. The
problem of
endless ends is exacerbated by Ruby's expressiveness, which lends
itself to very short methods, which can make defs and ends take up a
large amount of space relative to lines of code that actually do
something.
Even if you can find some ways in which the explicit "end" keyword is
preferable
to letting indentation do the talking... one out of every six lines.
Matz's objections in the cited thread were:
* tab/space mixture
Well, tough. Programmers shouldn't be using freakin' tabs anyway, and
if they
are, they _definitely_ shouldn't be mixing them with spaces. I don't
think
it's worthwhile to inflate the code base by a staggering 20% to
accommodate
people who want to write ugly code, mixing tabs and spaces when
there's no
reason to. And if for some reason it's really, really critical to
support this
use case, there could be some kernel-level method for specifying how
many
spaces a tab equates to, so the interpreter can figure out just how
far indented
that line with the tabs is.
* templates, e.g. eRuby
Not having used eRuby and therefore not being very familiar with it, I
don't
want to comment on specifics other than to note that plenty of Python-
based
template systems manage to get by.
* expression with code chunk, e.g lambdas and blocks
I don't really see the problem. My blocks are generally indented
relative to
the context to which they're being passed, isn't that standard?
My proposal is to, first, not change a thing with respect to existing
syntax. Second, steal the : from Python and use it to signify a scope
that's marked by indentation:
while some_condition
# this scope will terminate with an 'end' statement
do_something
end
while some_condition:
# this scope will terminate when the indentation level decreases
to the
# level before it was entered
do_something
%w{foo bar baz}.each do |val|
print val
end
%w{foo bar baz}.each do |val|:
print val
A valid objection that was raised in the earlier thread was regarding
a
quick and easy and common debugging technique: throwing in print
statements
def do_something(a, b, c)
print a, b, c # for debugging purposes
a + b + c
end
def do_something(a, b, c):
print a, b, c # error! unexpected indentation level
a + b + c
end
We can get around this by saying that braces, wherever they may
appear,
always define a new scope nested within the current scope, regardless
of
indentation.
def do_something(a, b, c):
{ print a, b, c } # this works
a + b + c
Alternatively, perhaps a character that's not normally valid at the
start of a
line (maybe !) could tell the interpreter "treat this line as though
it were
indented to the level of the current scope":
def do_something(a, b, c):
!print a, b, c
a + b + c
Well, I think that's probably enough for my first post. Thank you for
your
time, and Matz, thanks for the language. Thoughts, anyone?
--J