Something that corresponds to Perl's -T and -B tests?

I’ve searched the Ruby documentation, and I can’t find descriptions of
anything that corresponds to Perl’s `-T' and `-B' tests … these test
whether a file contains ASCII text or “binary” data, respectively.

I’m happy to write such a test myself in Ruby and then post it here, and
I can easily do so. But I prefer not to re-invent the wheel, so does
anyone know of such a thing that already might have been written, or
which I might have missed in my search through the docs?

Thank you very much in advance.

Lloyd Zusman
ljz@asfast.com

I've searched the Ruby documentation, and I can't find descriptions of
anything that corresponds to Perl's `-T' and `-B' tests ... these test
whether a file contains ASCII text or "binary" data, respectively.

See [ruby-talk:44936]

Guy Decoux

ts decoux@moulon.inra.fr writes:

I’ve searched the Ruby documentation, and I can’t find descriptions of
anything that corresponds to Perl’s `-T' and `-B' tests … these test
whether a file contains ASCII text or “binary” data, respectively.

See [ruby-talk:44936]

Thank you very much.

Yes, the code in that post is a version of the text/binary test
that currently exists in Perl, and it’s pretty much what I would have
written … although I’d probably change the “non-ASCII” test. As coded
in the example you referred me to, certain normal ASCII text characters
(such as TAB, to name but one) would incorrectly cause a file to be
flagged as binary.

I also would want this test to be performable on a file that’s already
open (i.e., an IO object).
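
The two wishes above (treat TAB and similar control characters as text, and work on an already-open IO) could be sketched roughly as follows. This is only an illustration, not code from the referenced post: the names `text_block?`, `text_io?`, and `TEXT_CONTROL_BYTES` are made up here, the one-third threshold and the NUL-byte rule follow the Perl-style heuristic discussed in this thread, and the IO is assumed to be seekable so that its position can be restored.

```ruby
require "stringio"

# Control bytes that commonly appear in plain text: \b \t \n \f \r.
# (Hypothetical constant name, chosen for this sketch.)
TEXT_CONTROL_BYTES = [8, 9, 10, 12, 13].freeze

# Returns true when a sample of bytes looks like text: it is non-empty,
# contains no NUL byte, and fewer than one third of its bytes fall outside
# printable ASCII plus the control bytes listed above.
def text_block?(block)
  return false if block.nil? || block.empty?
  return false if block.each_byte.include?(0)

  odd = block.each_byte.count do |b|
    !(b.between?(32, 126) || TEXT_CONTROL_BYTES.include?(b))
  end
  odd < block.bytesize / 3.0
end

# Applies the heuristic to an open, seekable IO without disturbing its
# current read position. The 512-byte sample size is an arbitrary choice.
def text_io?(io, sample_size = 512)
  pos = io.pos
  text_block?(io.read(sample_size))
ensure
  io.pos = pos if pos
end

puts text_io?(StringIO.new("name\tvalue\n"))
```

Because only the first `sample_size` bytes are inspected, a file whose binary content starts later would still be reported as text; that limitation is shared with any block-sampling test of this kind.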

I’ll wait a day or so to see if someone reports any other currently
written code, but if not, I’ll write and post my own version here.

Guy Decoux

Thank you again.

Lloyd Zusman
ljz@asfast.com

Lloyd Zusman ljz@asfast.com writes:

ts decoux@moulon.inra.fr writes:

I’ve searched the Ruby documentation, and I can’t find descriptions of
anything that corresponds to Perl’s `-T' and `-B' tests … these test
whether a file contains ASCII text or “binary” data, respectively.

See [ruby-talk:44936]

[ … ]

I’ll wait a day or so to see if someone reports any other currently
written code, but if not, I’ll write and post my own version here.

I’ve written an extension to File::Stat which adds a textfile? and
binaryfile? method. Each works pretty much the same as the file?
and directory? methods of File::Stat. For example:

s0 = File.stat("/etc/fstab")
s1 = File.stat("/bin/sh")
s2 = File.stat("/tmp")

p s0.textfile?    # => true
p s0.binaryfile?  # => false
p s1.textfile?    # => false
p s1.binaryfile?  # => true
p s2.textfile?    # => false
p s2.binaryfile?  # => false

It currently only works with File.stat, but eventually, I’d like to make
this work with IO#stat and File.lstat.

But before I do this work, I’d like the group’s feedback on what I’ve
written so far. Thanks in advance.

class File

  def self.stat(arg)
    Stat.new(arg)
  end

  class Stat

    private

    def self.isText(block)
      return (block.count("^ -~", "^\b\f\t\r\n") < (block.size / 3.0) &&
              block.count("\x00") < 1)
    end

    public

    alias_method :__old_initialize, :initialize

    def initialize(arg)
      @item = arg
      __old_initialize(arg)
    end

    # The textfile? and binaryfile? methods are not inverses of each
    # other.  Both return false if the item is not a file, or if the
    # item is a zero-length file.  The "textness" or "binariness" of
    # an item can only be determined if it's a file that contains at
    # least one byte.

    def textfile?
      if self.zero? then
        return false
      end
      begin
        open(@item) { |file|
          block = file.read(self.blksize < self.size ?
                            self.blksize : self.size)
          return self.class.isText(block)
        }
      rescue
        return false
      end
    end

    def binaryfile?
      if self.zero? then
        return false
      end
      begin
        open(@item) { |file|
          block = file.read(self.blksize < self.size ?
                            self.blksize : self.size)
          return !self.class.isText(block)
        }
      rescue
        return false
      end
    end

  end # class Stat

end # class File

Lloyd Zusman
ljz@asfast.com

Hi,

At Sun, 11 Aug 2002 11:41:00 +0900, Lloyd Zusman wrote:

I’ve written an extension to File::Stat which adds a textfile? and
binaryfile? method. Each works pretty much the same as the file?
and directory? methods of File::Stat. For example:

Why not File::textfile? and so on?

def initialize(arg)
  @item = arg
  super
end


Nobu Nakada

nobu.nokada@softhome.net writes:

Hi,

I’ve written an extension to File::Stat which adds a textfile? and
binaryfile? method. Each works pretty much the same as the file?
and directory? methods of File::Stat. For example:

Why not File::textfile? and so on?

Hello, and thank you for your feedback.

Well, to me, the textfile? and binaryfile? methods seem closer in
meaning to the file?, directory?, etc. methods of File::Stat. And if
I’m not mistaken, the necessity to inspect the data inside of a file in
order to determine whether it’s a text file or binary file is
OS-specific … I believe that there are some OS’s where this is just an
attribute of a file that can be determined by making some sort of system
call, in a similar way to which the inode, device, etc. are returned via
the Unix stat() system call.

So to me, textfile? and binaryfile? seem “stat-like”, rather than
“file-like” methods.

But I don’t have a strong objection to making this a method of File …
to me, this is just a matter of personal taste, and I am flexible.
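
For comparison, the File-level alternative suggested above might look something like the sketch below. It is an assumption-laden illustration rather than anything posted in the thread: `File.content_kind` is a hypothetical helper, the 512-byte sample size is arbitrary, and the heuristic is the same one-third / NUL-byte rule used in the posted File::Stat code, with TAB and similar control characters counted as text.

```ruby
class File
  # Hypothetical helper: returns :text, :binary, or nil when the path is
  # not a regular file, is empty, or cannot be read.
  def self.content_kind(path, sample_size = 512)
    return nil unless file?(path) && !zero?(path)

    sample = binread(path, sample_size)
    return :binary if sample.include?("\x00".b)

    # Count bytes outside printable ASCII and \b \t \n \f \r.
    odd = sample.each_byte.count do |b|
      !(b.between?(32, 126) || [8, 9, 10, 12, 13].include?(b))
    end
    odd < sample.bytesize / 3.0 ? :text : :binary
  rescue SystemCallError
    nil
  end

  # As in the File::Stat version, these two are not inverses of each
  # other: both are false for directories and zero-length files.
  def self.textfile?(path)
    content_kind(path) == :text
  end

  def self.binaryfile?(path)
    content_kind(path) == :binary
  end
end

p File.textfile?(__FILE__)
```

With this shape no `@item` bookkeeping is needed, since the path is passed in directly; the trade-off is that the predicates no longer sit next to `file?` and `directory?` on a File::Stat object.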

def initialize(arg)
  @item = arg
  super
end

??? “super” doesn’t work in place of __old_initialize(arg) in my code.
Neither does “super(arg)”. What would the superclass be in this case?

At Sun, 11 Aug 2002 11:41:00 +0900, Lloyd Zusman wrote:

Nobu Nakada


Lloyd Zusman
ljz@asfast.com

Hi,

Well, to me, the textfile? and binaryfile? methods seem closer in
meaning to the file?, directory?, etc. methods of File::Stat. And if
I’m not mistaken, the necessity to inspect the data inside of a file in
order to determine whether it’s a text file or binary file is
OS-specific … I believe that there are some OS’s where this is just an
attribute of a file that can be determined by making some sort of system
call, in a similar way to which the inode, device, etc. are returned via
the Unix stat() system call.

That’s certainly true on some particular systems, but not on many
others. I don’t think it’s common, and I feel Stat is too
restrictive a place to attach text/binary-ness.

def initialize(arg)
  @item = arg
  super
end

??? “super” doesn’t work in place of __old_initialize(arg) in my code.
Neither does “super(arg)”. What would the superclass be in this case?

Oops, it wasn’t a subclass but Stat itself, sorry.
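
To spell out the distinction: `super` looks for the same method further along the ancestor chain, so it reaches an existing `initialize` only from a subclass (or from a prepended module). When a class is reopened and `initialize` is redefined in place, the earlier definition is simply replaced, which is why the posted code saves it first with `alias_method`. A minimal sketch with toy classes follows; `Record`, `TrackedRecord`, `Entry`, `Note`, and `Tracking` are hypothetical stand-ins, not File::Stat itself, and `Module#prepend` was only added to Ruby well after this 2002 thread.

```ruby
# Case 1: a subclass. A bare `super` forwards `arg` to Record#initialize.
class Record
  attr_reader :source

  def initialize(source)
    @source = source
  end
end

class TrackedRecord < Record
  attr_reader :item

  def initialize(arg)
    @item = arg
    super
  end
end

# Case 2: reopening the class itself. The old initialize must be kept
# under another name before it is overwritten.
class Entry
  attr_reader :source

  def initialize(source)
    @source = source
  end
end

class Entry
  alias_method :__old_initialize, :initialize
  attr_reader :item

  def initialize(arg)
    @item = arg
    __old_initialize(arg)
  end
end

# Case 3: a prepended module sits in front of the class in the ancestor
# chain, so `super` inside it reaches the class's own initialize.
module Tracking
  attr_reader :item

  def initialize(arg)
    @item = arg
    super
  end
end

class Note
  prepend Tracking
  attr_reader :source

  def initialize(source)
    @source = source
  end
end

p TrackedRecord.new("a").item
p Entry.new("b").source
p Note.new("c").item
```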

At Sun, 11 Aug 2002 12:42:50 +0900, Lloyd Zusman wrote:


Nobu Nakada