Syntax checker wtf?

Just about every language and compiler I've run across in 30+ years
has shown some head-scratching syntax errors, usually caused by a
mis-matched pair of somethings. The solution has always been some form
of tool besides the syntax error messages:

The old-fashioned way was for the compiler to produce a listing with
annotations on the side showing nesting levels of things like do/end
procedures/functions and their ends etc.

Proc  Do
  0    0   Function foo(a,b)
  1    0     do while something
  1    1       sing(a)
  1    1       do something else
  1    2         skip(b)
  1    1       end
  1    1       frobnitz()
  1    0     end
  0    0   end

Of course that looks ancient to anyone but us old greybeards who cut
our teeth on keypunches instead of vi or emacs.
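
For what it's worth, a listing along those lines can be roughed out in a few lines of Ruby. This is only a sketch: the method name nesting_listing and the two regexps are made up for illustration, keywords are only recognised at the start of a line (plus a trailing "do"), and one-liners, modifiers, strings and heredocs will all fool it.

```ruby
# Toy nesting-level listing: prefix each line with the block depth in
# effect for that line. Not a real parser, just keyword spotting.
LISTING_OPENER = /\A\s*(class|module|def|if|unless|while|until|case|begin)\b|\bdo\s*(\|[^|]*\|)?\s*\z/
LISTING_CLOSER = /\A\s*end\b/

def nesting_listing(source)
  depth = 0
  source.each_line.map do |line|
    text = line.chomp
    # An 'end' is printed at the same level as the line that opened it.
    depth -= 1 if text =~ LISTING_CLOSER && depth > 0
    annotated = format("%2d  %s", depth, text)
    depth += 1 if text =~ LISTING_OPENER
    annotated
  end
end

sample = <<~RUBY
  def foo(a, b)
    while something
      sing(a)
    end
    frobnitz
  end
RUBY

puts nesting_listing(sample)
```

If the number on the last line is not 0, something above it was never closed.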

The more modern way is to eschew listings in favor of editors which
are somewhat knowledgeable about the language syntax and can do things
like fold, indent, and find matching items.

Back when I did Lisp, I learned techniques like numbering parentheses myself:
   (a (b c)( (( (d e) f) ) ) ))))))))))
   0 1 11234 4 3210

And of course those old Lots of Irritating Silly Parentheses
evaluators would throw away those unneeded closing parens.
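
The paren-numbering trick is easy to mechanise. A minimal sketch, assuming we only care about raw "(" and ")" characters (no strings or comments); paren_depths is a name invented here, and depths are printed modulo 10 so that each mark stays one column wide:

```ruby
# Returns [marks, final_depth]. marks is a string as long as expr with
# the nesting depth written under every paren: an opening paren and its
# partner carry the same digit. A closing paren with no partner is
# marked "!".
def paren_depths(expr)
  depth = 0
  marks = expr.each_char.map do |ch|
    case ch
    when "("
      depth += 1
      ((depth - 1) % 10).to_s
    when ")"
      depth -= 1
      depth >= 0 ? (depth % 10).to_s : "!"
    else
      " "
    end
  end
  [marks.join, depth]
end

expr = "(a (b c) ((d e) f))"
marks, depth = paren_depths(expr)
puts expr
puts marks
puts "unbalanced by #{depth}" unless depth.zero?
```

A positive final depth means closing parens are missing; a negative one means there are too many, which is the case those old evaluators silently forgave.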


On 8/21/06, Mike Cargal <mike@cargal.net> wrote:

My point was not to say that it was a problem for Ruby to not have
statement terminators. I was trying to answer the question as to why
the user was getting his error reported at EOF. It's also pertinent
because a lot of discussion has been around the need for "better error
messages". While I have certainly seen cases where Ruby error messages
could be better, I don't think a little more work on error messages is
going to correct his issue. It's implicit with the lack of statement
terminators.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Bill Kelly wrote:

From: "inboulder" <rubyforum@wendlink.com>

Matthew Smillie wrote:

Actually yes. Error identification and recovery in parsing is a
difficult task to begin with, engineering it into an existing system
is far from trivial.

The compiler throws away the line # of the start of the expression it is trying to evaluate? Keeping this # around and printing it out would be a great step in improving the debug info.

I wonder...

0001: class Foo
0002:
0003: def initialize
0004: # whatever
0005: end
0006:
[...]
0347: def bar
0348: if @gargle
0349: puts "glug!"
0350: #end (missing end for if)
0360: end
[...]
0415:
0416: end # of class Foo

Now, the error message we'd like is that the 'end' is missing for the 'if'
expression starting on line 348.

But the parser did find an 'end' that paired up with the 'if', at line 360.
And it found an 'end' for the 'def' at line 416. Ultimately, it reaches the
end of the file, and is missing an 'end' for 'class Foo'.

So, would "unexpected EOF, missing kEND from line 1" really be all
that helpful?

It seems like a missing 'end' would probably tend to be reported for
whatever line the outermost class or module being compiled started
on... (?)

This is sort of an interesting idea, discussed before. I wonder what
Matz thinks about it. It's not like "enforcing" indentation, it's just
using it to "guess" about a syntax error.

Of course, the way it works now, it DOES find an 'end' for the 'if'...
it's on line 360. :) What it doesn't find is an end for the class...

So unless it was whitespace-sensitive, it would tell you that it
reached the end of file while parsing the expression that started
on line 1. :)

Cheers,
Hal
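
The "guess from indentation" idea can be sketched roughly as follows. This is not how Ruby's parser works; it is a toy that assumes consistently indented source and keywords at the start of a line, and guess_missing_end and both regexps are hypothetical names. It pairs each 'end' with the most recent opener, as the parser would, but flags the first opener whose 'end' sits at a different indentation.

```ruby
INDENT_OPENER = /\A(\s*)(class|module|def|if|unless|while|until|case|begin)\b/
INDENT_CLOSER = /\A(\s*)end\b/

# Returns a hash describing the opener most likely to be missing its
# 'end', or nil if nothing looks wrong.
def guess_missing_end(source)
  stack = []
  source.each_line.with_index(1) do |line, lineno|
    if (m = INDENT_OPENER.match(line))
      stack.push(keyword: m[2], line: lineno, indent: m[1].length)
    elsif (m = INDENT_CLOSER.match(line)) && !stack.empty?
      opener = stack.pop
      # The 'end' closes this opener syntactically, but its indentation
      # says it was meant for something further out.
      return opener if opener[:indent] != m[1].length
    end
  end
  stack.last # indentation was consistent; report whatever is still open
end

source = <<~RUBY
  class Foo
    def bar
      if @gargle
        puts "glug!"
    end
  end
RUBY

suspect = guess_missing_end(source)
puts "probably missing 'end' for '#{suspect[:keyword]}' on line #{suspect[:line]}" if suspect
```

On the sample it points at the 'if' rather than at 'class Foo', which is the answer a human would want, at the price of trusting the whitespace.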

M. Edward (Ed) Borasky wrote:

Now R is a functional language with (two kinds of) objects, not an
"object-oriented language". But I think I'd write the same way in Ruby.

Proper modularization helps readability and understanding in *every* programming language regardless of the idiom.

Kind regards

  robert

# Bill Kelly wrote:
# > I wonder...
# >
# > 0001: class Foo
# > 0002:
# > 0003: def initialize
# > 0004: # whatever
# > 0005: end
# > 0006:
# > [...]
# > 0347: def bar
# > 0348: if @gargle
# > 0349: puts "glug!"
# > 0350: #end (missing end for if)
# > 0360: end
# > [...]
# > 0415:
# > 0416: end # of class Foo
# >
# > Now, the error message we'd like is that the 'end' is missing for the 'if'
# > expression starting on line 348.
# >
# > But the parser did find an 'end' that paired up with the 'if', at line 360.
# > And it found an 'end' for the 'def' at line 416. Ultimately, it reaches the
# > end of the file, and is missing an 'end' for 'class Foo'.
# >
# > So, would "unexpected EOF, missing kEND from line 1" really be all
# > that helpful?
# >
# > It seems like a missing 'end' would probably tend to be reported for
# > whatever line the outermost class or module being compiled started
# > on... (?)


From: Hal Fulton [mailto:hal9000@hypermetrics.com]
#
# This is sort of an interesting idea, discussed before. I wonder what
# Matz thinks about it. It's not like "enforcing" indentation, it's just
# using it to "guess" about a syntax error.
#
# Of course, the way it works now, it DOES find an 'end' for the 'if'...
# it's on line 360. :) What it doesn't find is an end for the class...
#
# So unless it was whitespace-sensitive, it would tell you that it
# reached the end of file while parsing the expression that started
# on line 1. :)

is it possible for ruby to return a list of the end pairs, pretty printed (think folding)?

Error: unexpected EOF, missing kEND from line 1.
Listing end pairs encountered:

0001: class Foo
0003: def initialize
0005: end
0347: def bar
0348: if @gargle
0360: end
0416: end # of class Foo
      <--- missing end here

an intelligent editor may then possibly associate the end pairs with the actual code, like so (pls forgive the crude ascii art),

0001: class Foo .................. 0001: class Foo
                                   0002:
0003: def initialize ............. 0003: def initialize
                                   0004: # whatever
0005: end ........................ 0005: end
                                   0006:
                                   [...]
0347: def bar .................... 0347: def bar
0348: if @gargle ................. 0348: if @gargle
                                   0349: puts "glug!"
                                   0350: #end
0360: end ........................ 0360: end
                                   [...]
                                   0415:
0416: end # of class Foo ......... 0416: end # of class Foo
      <--- missing end here

i think this could be a good ruby quiz/challenge.

kind regards -botp
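
As a rough first cut at that challenge, the pair listing could be produced outside the interpreter with a simple stack. This is a sketch under heavy assumptions: end_pairs and pair_report are invented names, only keywords at the start of a line are recognised, and do-blocks, modifiers, one-liners and heredocs are ignored. The error wording is copied from the message quoted in this thread, not taken from Ruby itself.

```ruby
PAIR_OPENER = /\A\s*(class|module|def|if|unless|while|until|case|begin)\b/
PAIR_CLOSER = /\A\s*end\b/

# Returns [pairs, unclosed]: pairs is an array of [opener_line, end_line],
# unclosed holds the line numbers of openers still open at end of file.
def end_pairs(source)
  stack = []
  pairs = []
  source.each_line.with_index(1) do |line, lineno|
    if line =~ PAIR_OPENER
      stack.push(lineno)
    elsif line =~ PAIR_CLOSER && !stack.empty?
      pairs << [stack.pop, lineno]
    end
  end
  [pairs, stack]
end

# Builds the report as an array of lines, pairs ordered by opener line.
def pair_report(source)
  lines = source.lines
  pairs, unclosed = end_pairs(source)
  out = unclosed.map { |n| "Error: unexpected EOF, missing kEND from line #{n}." }
  out << "Listing end pairs encountered:"
  pairs.sort.each do |open_line, close_line|
    out << format("%04d..%04d  %s", open_line, close_line, lines[open_line - 1].strip)
  end
  out
end

source = <<~RUBY
  class Foo
    def initialize
    end
    def bar
      if @gargle
        puts "glug!"
    end
  end
RUBY

puts pair_report(source)
```

The report still blames line 1, but the listing shows 'def bar' swallowing the final 'end', which is usually enough of a hint to find the real culprit.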

Robert Klemme wrote:

M. Edward (Ed) Borasky wrote:

Now R is a functional language with (two kinds of) objects, not an
"object-oriented language". But I think I'd write the same way in Ruby.

Proper modularization helps readability and understanding in *every*
programming language regardless of the idiom.

Kind regards

    robert

Exactly! Which is why Ruby can "get away with" a loose syntax, duck
typing, open syntactic elements as continuations, etc. So old
programmers like me who are new to Ruby are free to use semicolons and
curly braces in R or Ruby the same way we do in C and Perl, just to
facilitate our thinking when we switch among languages.

For someone new to programming who chooses to learn starting with Ruby,
though, perhaps our introductory "textbooks" ought to emphasize a coding
style that promotes factoring and readability at an equal or even
greater level than the basics of how to construct classes, objects,
methods, expressions and the other semantic elements of the language.

Bill Kelly wrote:

So, would "unexpected EOF, missing kEND from line 1" really be all
that helpful?

It seems like a missing 'end' would probably tend to be reported for
whatever line the outermost class or module being compiled started
on... (?)

So unless it was whitespace-sensitive, it would tell you that it
reached the end of file while parsing the expression that started
on line 1. :)

That was, indeed, the very nub of my gist. :D

Regards,

John Cleese


From: "Hal Fulton" <hal9000@hypermetrics.com>

M. Edward (Ed) Borasky wrote:

For someone new to programming who chooses to learn starting with Ruby,
though, perhaps our introductory "textbooks" ought to emphasize a coding
style that promotes factoring and readability at an equal or even
greater level than the basics of how to construct classes, objects,
methods, expressions and the other semantic elements of the language.

The introductory textbooks are supposed to teach someone to code Ruby, not to proselytize the virtues of clean design. And for what it's worth, the code examples in those (and all books) are almost always very short, without convoluted methods with three levels of nesting like you always end up with when you get someone else's code to maintain *sulk*.

Putting actual emphasis on those would be beyond the scope of said books, even if mentioning the gotchas of the loose syntax in the appropriate parts of the book wouldn't be out of place. The problem is that the appropriate parts tend to be towards the end of the books, in the boring grammar treatises that people new to programming rarely actually get to.

Also, readability is in the eye of the observer; I'm sure there are plenty of people with macho-programmer nerve-twitches who would find guides like that personally insulting. I know I've met a lot... (C# partial classes fanboys specifically.)

David Vallner

Bill Kelly wrote:

That was, indeed, the very nub of my gist. :D

:) Apparently I'm better at replying than reading.

Hal