Newbie: recursively converting LF to CRLF and vice versa

Hello,

I'd like to recursively go through a bunch of directories and, for each
text file, convert any linefeed (LF) endings to carriage return line feeds
(CRLF). Ideally, I'd like to be able to do the conversion either way
and to distinguish text from binary files (so that I don't bother making
any changes to binary files). Lastly, I'd like to package this utility as
a standalone .exe for Windows, for systems that don't have Ruby installed.

I'm slowly going through the Pickaxe book, but I need to get this
utility done and I'd like to do it in Ruby if possible, rather than
resort to some other thing like Java (gasp). Can anyone provide me
some hints or at least some pointers to parts of the Pickaxe from which I
could assemble this functionality?

Any info would be appreciated.

Elliott

A few things to look at:

- The Find module searches recursively in directories, calling your
block of code for each file. Find it in the reference section.
- Scan for control characters in the file contents to determine
(usually) if it is binary or not. You could use a Regexp like
/[[:cntrl:]]/, which matches a single control character.
- The standalone exe can be made using RubyScript2Exe (by Erik Veenstra).
Find more info at http://www.erikveen.dds.nl/rubyscript2exe

HTH,
Mark

On 6/22/05, Elliott <e.hmlhml@gmail.com> wrote:

Hello,

I'd like to recursively go through a bunch of directories and, for each
text file, convert any linefeed (LF) endings to carriage return line feeds
(CRLF). Ideally, I'd like to be able to do the conversion either way
and to distinguish text from binary files (so that I don't bother making
any changes to binary files). Lastly, I'd like to package this utility as
a standalone .exe for Windows, for systems that don't have Ruby installed.

I'm slowly going through the Pickaxe book, but I need to get this
utility done and I'd like to do it in Ruby if possible, rather than
resort to some other thing like Java (gasp). Can anyone provide me
some hints or at least some pointers to parts of the Pickaxe from which I
could assemble this functionality?

Any info would be appreciated.

My "hint" would be to get it working on one file first. That's pretty easy. You can do it with a single Regexp. Then expand it to a recursive directory search.
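A minimal sketch of that single-file step, assuming the whole file fits in memory. The helper names (to_crlf, to_lf, convert_file) are made up for illustration and do not come from any library:

```ruby
# Normalize every line ending to CRLF. Matching an optional \r first
# keeps existing CRLF pairs from turning into \r\r\n.
def to_crlf(text)
  text.gsub(/\r?\n/, "\r\n")
end

# Normalize CRLF pairs to a bare LF.
def to_lf(text)
  text.gsub(/\r\n/, "\n")
end

# Rewrite one file in place and report whether it changed.
# Binary read/write avoids the newline translation that text mode
# performs on Windows.
def convert_file(path, direction)
  original = File.binread(path)
  converted = direction == :crlf ? to_crlf(original) : to_lf(original)
  File.binwrite(path, converted) unless converted == original
  converted != original
end
```

Calling convert_file('notes.txt', :crlf) a second time should be a no-op, because the optional \r in the pattern makes the conversion idempotent.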

How do you want to detect binary files? By extension? Regexp can probably help there too.

My "where to look advice" is the Regular Expression section (page 68) for the conversion, the Find library (page 679) for the directory traversal, and the OptionParser library (page 711) for the options you want to give yourself.
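As a rough illustration of the OptionParser part, here is one way the direction switch might be parsed. The --to option name, the default, and the usage banner are assumptions made for this sketch, not anything prescribed by the library or the book:

```ruby
require 'optparse'

# Parse a hypothetical "--to lf|crlf" switch; everything left over is
# treated as a directory to process.
def parse_options(argv)
  options = { to: :crlf }
  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: eolconv [options] DIRECTORY...'
    # The array restricts the argument to the listed values.
    opts.on('--to FORMAT', [:lf, :crlf], 'Target line ending: lf or crlf') do |format|
      options[:to] = format
    end
  end
  # parse returns the non-option arguments and leaves argv untouched.
  options[:dirs] = parser.parse(argv)
  options
end
```

In a script this would typically be called as parse_options(ARGV). An unsupported value for --to raises an OptionParser error, which the script can rescue to print the usage text.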

I'm not knowledgeable about packaging Ruby .exe files, but I know there are tools for this. Check out AllInOneRuby, for example.

If you really need this fast and coding it isn't a requirement, I'm going to be blown away if a line ending conversion tool doesn't already exist in some form.

Hope that helps.

James Edward Gray II

On Jun 22, 2005, at 10:52 AM, Elliott wrote:

Can anyone provide me
some hints or at least some pointers to parts of the Pickaxe from which I
could assemble this functionality?

Elliott wrote:

Hello,

I'd like to recursively go through a bunch of directories and, for each
text file, convert any linefeed (LF) endings to carriage return line feeds
(CRLF).

For the end of line conversion, see File.nl_convert, part of the "ptools" package.

Available on the RAA.

Regards,

Dan

Elliott said:

Hello,

I'd like to recursively go through a bunch of directories and, for each
text file, convert any linefeed (LF) endings to carriage return line feeds
(CRLF). Ideally, I'd like to be able to do the conversion either way
and to distinguish text from binary files (so that I don't bother making
any changes to binary files). Lastly, I'd like to package this utility as
a standalone .exe for Windows, for systems that don't have Ruby installed.

[snip]

Any info would be appreciated.

Hi Elliott,

Firstly, welcome to Ruby! :)

For the recursive directory search you will need to use the Find module.
For the conversion you will need to learn about reading and writing files
with either the IO or File classes, as well as regular expressions. Also
take a look at the FileUtils module.
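A small sketch of the traversal with Find, under the assumption that version-control directories should be skipped. The function name and the skip list are illustrative choices only:

```ruby
require 'find'

# Collect every regular file beneath root, skipping the named
# directories (the default skip list is an assumption).
def regular_files(root, skip_dirs = %w[.git .svn CVS])
  files = []
  Find.find(root) do |path|
    if File.directory?(path)
      # Find.prune stops the descent into the current directory.
      Find.prune if skip_dirs.include?(File.basename(path))
    elsif File.file?(path)
      files << path
    end
  end
  files.sort
end
```

Each returned path can then be read, converted, and written back with the File class.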

As Mark Hubbart says, binary files can usually be determined by looking for
control characters. Unfortunately [[:cntrl:]] matches newlines, so that
isn't the character class to use. This should work fairly well:

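class File
  # Heuristic: treat the file as binary if any of the first 100 bytes
  # is a control character in the range 0x00-0x06.
  def binary?
    !(read(100) =~ /[\x00-\x06]/).nil?
  end
end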

Here is a test for the above (for Windows systems with the default install
location):

['C:\ruby\bin\ruby.exe', 'C:\ruby\readme.txt'].each do |filename|
  # Binary mode keeps Windows newline translation from altering the
  # bytes being inspected.
  File.open(filename, 'rb') do |file|
    if file.binary?
      puts "#{filename} is binary"
    else
      puts "#{filename} is NOT binary"
    end
  end
end

Also as Mark said, you can package up your ruby script using rubyscript2exe.

Ryan

Gah! What is wrong with me? *hides face*

cheers,
Mark

On 6/22/05, Ryan Leavengood <mrcode@netrox.net> wrote:

As Mark Hubbart says, binary files can usually be determined by looking for
control characters. Unfortunately [[:cntrl:]] matches newlines, so that
isn't the character class to use. This should work fairly well:

Ryan Leavengood said:

class File
  def binary?
    (self.read(100) =~ /[\x00-\x06]/) != nil
  end
end

I realized the above alters the position of the filestream, so I think
this is better:

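class File
  # Same heuristic, but the stream position is restored afterwards
  # (even if reading raises) so later reads are not disturbed.
  def binary?
    old_pos = pos
    seek(0)
    !(read(100) =~ /[\x00-\x06]/).nil?
  ensure
    seek(old_pos) if old_pos
  end
end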
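Putting the pieces from this thread together, a sketch of the whole job might look like the following. It swaps the File#binary? monkey patch for a standalone function that takes a path, and the names binary_file? and convert_tree are hypothetical. It assumes each file fits in memory and that the 0x00-0x06 heuristic is good enough for the files involved:

```ruby
require 'find'

# Heuristic from the thread: a file counts as binary if any of its
# first 100 bytes falls in the range 0x00-0x06.
def binary_file?(path)
  head = File.open(path, 'rb') { |file| file.read(100) }
  # read returns nil for an empty file, which is treated as text.
  !head.nil? && head.match?(/[\x00-\x06]/)
end

# Convert every non-binary file under root to the requested ending
# (:crlf or :lf) and return the paths that were rewritten.
def convert_tree(root, to)
  changed = []
  Find.find(root) do |path|
    next unless File.file?(path) && !binary_file?(path)
    original = File.binread(path)
    converted = if to == :crlf
                  original.gsub(/\r?\n/, "\r\n")
                else
                  original.gsub(/\r\n/, "\n")
                end
    next if converted == original
    File.binwrite(path, converted)
    changed << path
  end
  changed
end
```

Running it on a copy of the directory tree first is a sensible precaution, since the heuristic can misjudge unusual files (UTF-16 text, for example, contains zero bytes and would be skipped as binary).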

Ryan

Thank you all for the great information and the ways to approach this problem.