I'm not sure this is the right place to report issues about Syck, but it's
the best I've found so far. I think I've found a problem with the way Syck
quotes strings that could look like floats. Here is a short example:
irb(main):001:0> require 'yaml'
=> true
irb(main):002:0> YAML.dump(["1.2", "1.2.3", "1.2_3"])
=> "--- \n- \"1.2\"\n- 1.2.3\n- 1.2_3\n"
The first element is quoted appropriately, and the second isn't because
there's no ambiguity, but the third should be quoted. A YAML float can contain
underscores; they're used as visual separators and should be ignored by the
parser. So on seeing 1.2_3 the parser should read the float 1.23. To
disambiguate, the string "1.2_3" should therefore be quoted.
In practice this isn't really a problem, since YAML converts on write, so
YAML.dump(1.2_3) is written directly as 1.23. However, it creates
interoperability issues when "1.2_3" gets written unquoted by Syck and is
then read by another parser as 1.23. The string isn't a string anymore in
that case.
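A minimal sketch of the check being asked for, assuming the reading above that a parser ignores underscores anywhere in a float. The helpers `float_like?` and `emit_scalar` are hypothetical names for illustration, not part of Syck or Ruby's YAML library, and the pattern is a simplified decimal form rather than the full YAML grammar.

```ruby
# Hypothetical helper: true when an unquoted scalar could be read back as a
# float by a parser that ignores "_" inside numbers.
def float_like?(str)
  stripped = str.delete("_")
  # Simplified decimal float pattern (an assumption, not the YAML grammar).
  !!(stripped =~ /\A[-+]?\d+\.\d*([eE][-+]?\d+)?\z/)
end

# Hypothetical emitter step: quote only the strings that would be ambiguous.
def emit_scalar(str)
  float_like?(str) ? "\"#{str}\"" : str
end

puts ["1.2", "1.2.3", "1.2_3"].map { |s| "- #{emit_scalar(s)}" }
```

With this rule both "1.2" and "1.2_3" come out quoted, while 1.2.3 stays plain.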
I do not believe underscores are valid in either YAML floats or YAML
integers. Although they are allowed in Ruby, I believe Syck is
handling your example case properly and without ambiguity.
Blessings,
TwP
On Tue, Feb 20, 2007 at 10:03:25AM +0900, Matthieu Riou wrote:
> The first element is quoted appropriately, and the second isn't because
> there's no ambiguity, but the third should be quoted. A YAML float can contain
> underscores; they're used as visual separators and should be ignored by the
> parser.
The specification says:
"Any "_" characters in the number are ignored, allowing a readable
representation of large values."
So clearly they're allowed: in 1.2_3 the underscore is simply ignored and
the parser should understand 1.23. If I wanted to write my own YAML by hand
and make it easily readable (which is kind of the original purpose), I could
write 100_000_000.03, which would be a nice-looking float. Hence the
ambiguity with "1.2_3".
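As a rough illustration of that reading, here is a sketch of a reader that follows the quoted sentence literally: drop every underscore, then parse what is left. `read_float` is a hypothetical helper, not an API of Syck or any other YAML parser.

```ruby
# Hypothetical reader: ignore every "_" and parse the rest as a decimal float.
def read_float(text)
  Float(text.delete("_"))
end

p read_float("1.2_3")            # prints 1.23
p read_float("100_000_000.03")
```

Under this rule the unquoted scalar 1.2_3 and the float 1.23 are indistinguishable to the reader.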
Cheers,
Matthieu
···
On 2/19/07, Tim Pease <tim.pease@gmail.com> wrote:
On 2/19/07, Matthieu Riou <matthieu.riou@gmail.com> wrote:
> Hi,
>
> I'm not sure this is the right place to report issues about Syck, but it's
> the best I've found so far. I think I've found a problem with the way Syck
> quotes strings that could look like floats. Here is a short example:
>
> irb(main):001:0> require 'yaml'
> => true
> irb(main):002:0> YAML.dump(["1.2", "1.2.3", "1.2_3"])
> => "--- \n- \"1.2\"\n- 1.2.3\n- 1.2_3\n"
>
> The first element is quoted appropriately, and the second isn't because
> there's no ambiguity, but the third should be quoted. A YAML float can contain
> underscores; they're used as visual separators and should be ignored by the
> parser. So on seeing 1.2_3 the parser should read the float 1.23. To
> disambiguate, the string "1.2_3" should therefore be quoted.
>
> In practice this isn't really a problem, since YAML converts on write, so
> YAML.dump(1.2_3) is written directly as 1.23. However, it creates
> interoperability issues when "1.2_3" gets written unquoted by Syck and is
> then read by another parser as 1.23. The string isn't a string anymore in
> that case.
>
I do not believe underscores are valid in either YAML floats or YAML
integers. Although they are allowed in Ruby, I believe Syck is
handling your example case properly and without ambiguity.
> "Any "_" characters in the number are ignored, allowing a readable
> representation of large values."
> So clearly they're allowed: in 1.2_3 the underscore is simply ignored and
> the parser should understand 1.23. If I wanted to write my own YAML by hand
> and make it easily readable (which is kind of the original purpose), I could
> write 100_000_000.03, which would be a nice-looking float. Hence the
> ambiguity with "1.2_3".
But the specification is not just in english, it's also in regex form:
Notwithstanding the spelling of exponential, that example is not
consistent with the regular expression, which allows underscores only if
they precede the decimal point:
[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?
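To make the mismatch concrete, here is a small sketch that applies the pattern quoted above, anchored to the whole scalar, to the values discussed in this thread. The constant `YAML_FLOAT_BASE10` and the helper `yaml_float?` are hypothetical names chosen for this example.

```ruby
# Base-10 float pattern as quoted above, anchored to the whole scalar.
YAML_FLOAT_BASE10 = /\A[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?\z/

# Hypothetical helper: does the scalar match the quoted pattern?
def yaml_float?(scalar)
  YAML_FLOAT_BASE10.match?(scalar)
end

%w[1.2 100_000_000.03 1.2_3].each do |scalar|
  puts "#{scalar}: #{yaml_float?(scalar)}"
end
```

By the pattern, 1.2 and 100_000_000.03 are floats but 1.2_3 is not, whereas the prose rule about ignoring "_" would read 1.2_3 as 1.23.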
This is horrible, IMHO. It has to be fixed. I’m very sorry for the
late response.
Clark
···
On Tue, Feb 20, 2007 at 04:17:20PM +1100, Daniel Sheppard wrote:
> So clearly they're allowed: in 1.2_3 the underscore is simply ignored and
> the parser should understand 1.23. If I wanted to write my own YAML by hand
> and make it easily readable (which is kind of the original purpose), I could
> write 100_000_000.03, which would be a nice-looking float. Hence the
> ambiguity with "1.2_3".
Notwithstanding the spelling of exponential, that example is not
consistent with the regular expression, which allows underscores only if
they precede the decimal point:
[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?