YAML and Windows newlines

I've run into the following odd behavior using YAML:

   # a = "something: >\r\n Some text with a\r\n newline in it.\r\n"
   # b = "something: >\n Some text with a\n newline in it.\n"
   # p YAML.load(a)
   {"something"=>"Some text with a\nnewline in it.\n"}
   # p YAML.load(b)
   {"something"=>"Some text with a newline in it.\n"}

Should this be reported as a bug, or is it a known issue with how Ruby's YAML processor handles Windows line endings? It threw us for a loop in some code generation routines I'm writing at work, since we expected the folded text to end up on a single line. For now the workaround is simple: strip all newlines from the string. I just wanted to raise a small warning flag on the list and ask whether anyone else has run into this.

Oh, and:

   # ruby -v
   ruby 1.8.1 (2004-04-24) [i686-linux-gnu]

--
Jamis Buck
jgb3@email.byu.edu
http://www.jamisbuck.org/jamis

ruby -h | ruby -e 'a=[];readlines.join.scan(/-(.)\[e|Kk(\S*)|le.l(..)e|#!(\S*)/) {|r| a << r.compact.first };puts "\n>#{a.join(%q/ /)}<\n\n"'

Hi Jamis,

I think your description is consistent with reading
input from file(s) opened in binary mode?

The script below writes a string with "\n"s to a file
and the file dump shows that "\r\n"s have been written.

Reading this file back in "normal" mode, then "binary" mode
gives the same pair of results you showed in your strings.

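On DOS/Windows, Ruby IO in "normal" (text) mode converts "\r\n" to "\n"
when reading, so the "\r"s are dropped. In "binary" mode you see the
"\r\n"s exactly as they are stored in the file.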

#---------------------------------------
fsn = 'C:\TEMP\rbtest.txt'

File.open(fsn, 'w') do |fs|
  fs.write("something: >\n Some text with a\n newline in it.\n")
end

# Hexdump of file: ('=' shows the "\r\n" DOS newlines)
# ----------------
# 0x00: 73 6F 6D 65 74 68 69 6E 67 3A 20 3E 0D=0A 20 20 ; something: >..
# 0x10: 53 6F 6D 65 20 74 65 78 74 20 77 69 74 68 20 61 ; Some text with a
# 0x20: 0D=0A 20 20 6E 65 77 6C 69 6E 65 20 69 6E 20 69 ; .. newline in i
# 0x30: 74 2E 0D=0A ; t...

sa = sb = nil
require 'yaml'

File.open(fsn, 'r') do |fs|
  sa = fs.read
end
p sa
#=> "something: >\n Some text with a\n newline in it.\n"

p YAML.load(sa)
#=> {"something"=>"Some text with a newline in it.\n"}

File.open(fsn, 'rb') do |fs| # open in binary mode !!
  sb = fs.read
end
p sb
#=> "something: >\r\n Some text with a\r\n newline in it.\r\n"

p YAML.load(sb)
#=> {"something"=>"Some text with a\nnewline in it.\n"}
#---------------------------------------
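
If the file really has to be read in binary mode, one option is to normalize the line endings before handing the text to YAML. This is only a minimal sketch: `normalize_newlines` is a hypothetical helper name, not part of Ruby or the yaml library, and it assumes that converting every "\r\n" (and any bare "\r") to "\n" is acceptable for your data.

```ruby
require 'yaml'

# Hypothetical helper (not part of Ruby or the yaml library):
# convert DOS "\r\n" and bare "\r" line endings to "\n".
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# The same string a binary-mode read would return on Windows.
raw = "something: >\r\n Some text with a\r\n newline in it.\r\n"

# Normalize first, then parse, so the folded scalar is folded as expected.
data = YAML.load(normalize_newlines(raw))
p data
```

Doing the conversion on the input, rather than stripping newlines from the loaded value, keeps any line breaks that the YAML itself intends to preserve.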

HTH,

daz

daz wrote:

I think your description is consistent with reading
input from file(s) opened in binary mode?

Arg! You're right, of course. Gotta love Windows.

Thanks for the tip, and sorry for the noise...

--
Jamis Buck
jgb3@email.byu.edu
http://www.jamisbuck.org/jamis

ruby -h | ruby -e 'a=[];readlines.join.scan(/-(.)\[e|Kk(\S*)|le.l(..)e|#!(\S*)/) {|r| a << r.compact.first };puts "\n>#{a.join(%q/ /)}<\n\n"'