Hi guys,
After some research I still cannot find a way to tell whether a file is
plain text or binary. In fact, I want to check whether a file is plain text
no matter what characters are in it.
Is this possible in Ruby?
Thanks,
Alin
--
Posted via http://www.ruby-forum.com/.
Alin Popa wrote:
> Hi guys,
> After some research I still cannot find a way how to see if a file is
> plain text or binary. In fact I want to check if a file is plain text no
> matter what characters are in it.
> This thing may be possible by using ruby ?
I think so, but it's a little unclear exactly what you're trying to achieve. Do you have an example?
--
Alex
On 6/19/07, Alin Popa <alin.popa@gmail.com> wrote:
> Hi guys,
> After some research I still cannot find a way how to see if a file is
> plain text or binary. In fact I want to check if a file is plain text no
> matter what characters are in it.
> This thing may be possible by using ruby ?
> Thanks,
> Alin
If you can't use 'file' directly, you should look at its source and see how
the detection works. I think CVS also detects text quite well.
--
I always thought Smalltalk would beat Java, I just didn't know it would be
called 'Ruby' when it did.
-- Kent Beck
On Jun 18, 2007, at 23:59, Alin Popa wrote:
> After some research I still cannot find a way how to see if a file is
> plain text or binary. In fact I want to check if a file is plain text no
> matter what characters are in it.
http://blog.zenspider.com/archives/2006/08/i_miss_perls_b.html
Alex Young wrote:
> Alin Popa wrote:
>> Hi guys,
>> After some research I still cannot find a way how to see if a file is
>> plain text or binary. In fact I want to check if a file is plain text no
>> matter what characters are in it.
>> This thing may be possible by using ruby ?
> I think so, but it's a little unclear exactly what you're trying to
> achieve. Do you have an example?
I'm trying to do a search-and-replace for some text in files, but I don't
want to touch archives or other binary files.
Hi,
At Wed, 20 Jun 2007 02:10:57 +0900,
Ryan Davis wrote in [ruby-talk:256206]:
> After some research I still cannot find a way how to see if a file is
> plain text or binary. In fact I want to check if a file is plain
> text no
> matter what characters are in it.
http://blog.zenspider.com/archives/2006/08/i_miss_perls_b.html
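You can use String#count:
    def File.binary?(path)
      # Sample the first 4 KB. A NUL byte marks the file as binary; so does
      # a sample in which at most 70% of the bytes are tab, newline, or
      # printable ASCII (space through tilde). Empty files are not binary.
      s = read(path, 4096) and
        !s.empty? and
        (/\0/n =~ s or s.count("\t\n -~").to_f/s.size<=0.7)
    end
In any case, it doesn't work for non-ASCII files.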
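For the search-and-replace use case described earlier in the thread, the heuristic above could be wired in roughly as follows. This is only a sketch: the helper names (binary_sample?, probably_binary?, replace_in_text_file) are made up for illustration, the 4096-byte sample and 0.7 threshold are simply carried over from the snippet above, and it assumes the text files are in an encoding Ruby can read with File.read.

```ruby
# Heuristic from the snippet above, applied to an already-read sample.
# Hypothetical helper; returns true when the sample looks binary.
def binary_sample?(sample)
  return false if sample.nil? || sample.empty?
  bytes = sample.b                      # work on raw bytes, ignore encoding
  return true if bytes.include?("\0")   # NUL bytes rarely occur in text
  printable = bytes.count("\t\n -~")    # tab, newline, space through tilde
  printable.to_f / bytes.size <= 0.7
end

# Hypothetical helper: sample the start of the file and apply the heuristic.
def probably_binary?(path, sample_size = 4096)
  binary_sample?(File.binread(path, sample_size))
end

# Hypothetical helper: rewrite the file only if it looks like text.
# Returns true if the file was processed, false if it was skipped.
def replace_in_text_file(path, pattern, replacement)
  return false if probably_binary?(path)
  text = File.read(path)
  File.write(path, text.gsub(pattern, replacement))
  true
end
```

Usage would be something like replace_in_text_file("notes.txt", "foo", "bar"), called for each candidate path; files that look binary are left alone.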
--
Nobu Nakada
Alin Popa wrote:
> Alex Young wrote:
>> Alin Popa wrote:
>>> Hi guys,
>>> After some research I still cannot find a way how to see if a file is
>>> plain text or binary. In fact I want to check if a file is plain text no
>>> matter what characters are in it.
>>> This thing may be possible by using ruby ?
>> I think so, but it's a little unclear exactly what you're trying to
>> achieve. Do you have an example?
> I'm trying to do a replace in file for some text but I don't want to
> consider files like archives or other binary files.
Of course, on Windows I can go by the file extension and ignore specific
ones (e.g. .exe, .zip, .jar, .rar, .anything_i_want), but I don't know how
to do it on Linux/Unix, where a file extension is not mandatory.
Which I shamelessly plagiarized and stuck in the ptools library.
    $ gem install ptools
    # then, in Ruby:
    require 'ptools'
    File.binary?('some_file')
Regards,
Dan
On Jun 19, 11:33 am, Alin Popa <alin.p...@gmail.com> wrote:
Ryan Davis wrote:
> On Jun 18, 2007, at 23:59 , Alin Popa wrote:
>> After some research I still cannot find a way how to see if a file is
>> plain text or binary. In fact I want to check if a file is plain
>> text no
>> matter what characters are in it.
>http://blog.zenspider.com/archives/2006/08/i_miss_perls_b.html
Nice, thanks.
Nobuyoshi Nakada wrote:
> You can use String#count:
>     def File.binary?(path)
>       s = read(path, 4096) and
>         !s.empty? and
>         (/\0/n =~ s or s.count("\t\n -~").to_f/s.size<=0.7)
>     end
> In any case, it doesn't work for non-ascii files.
Pedantic correction: it doesn't work for non-Western scripts. French uses accents here and there, but it would still pass the test above.
Still, I have to say I was surprised; I didn't know that a hyphen in String#count had the same effect as in a regexp character class. Talk about an undocumented feature!
Daniel
You could read the file (or a portion of it), build a histogram of byte (or byte-group) occurrences, and compare that to what you expect for text files (e.g. most characters are "0-9a-zA-Z" and punctuation).
You could also use the "file" command and parse its output.
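The histogram idea can be sketched roughly as below. The method names, the set of bytes treated as "text-like" (tab, LF, CR, and printable ASCII), and the 0.9 threshold are all assumptions chosen for illustration, not something proposed in the thread; like the other heuristics here, it will misjudge text in non-ASCII encodings.

```ruby
# Bytes assumed to be typical of plain text: tab, LF, CR, printable ASCII.
TEXT_BYTES = ([9, 10, 13] + (32..126).to_a).freeze

# Count how often each byte value occurs in the sample.
def byte_histogram(sample)
  sample.each_byte.each_with_object(Hash.new(0)) do |byte, counts|
    counts[byte] += 1
  end
end

# Fraction of the sample made up of text-like bytes (1.0 for an empty sample).
def text_like_ratio(sample)
  return 1.0 if sample.empty?
  histogram = byte_histogram(sample)
  text_count = TEXT_BYTES.sum { |byte| histogram[byte] }
  text_count.to_f / sample.bytesize
end

# Hypothetical decision rule; the 0.9 threshold is an arbitrary assumption.
def looks_like_text?(sample, threshold = 0.9)
  text_like_ratio(sample) >= threshold
end
```

A caller would typically pass in the first few kilobytes of a file, for example looks_like_text?(File.binread(path, 4096).to_s).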
Kind regards
robert
On 19.06.2007 09:33, Alin Popa wrote:
> Alin Popa wrote:
>> Alex Young wrote:
>>> Alin Popa wrote:
>>>> Hi guys,
>>>> After some research I still cannot find a way how to see if a file is
>>>> plain text or binary. In fact I want to check if a file is plain text no
>>>> matter what characters are in it.
>>>> This thing may be possible by using ruby ?
>>> I think so, but it's a little unclear exactly what you're trying to
>>> achieve. Do you have an example?
>> I'm trying to do a replace in file for some text but I don't want to consider files like archives or other binary files.
> Of course, when I'm on windows I can go after the file extension and try to ignore some specific (eg. .exe, .zip, .jar, .rar, .anything_i_want) but I don't know how to do it on Linux/Unix OS where file extension is not mandatory.
Hello,
On a *nix system, you can do
    file_type = `file my_file`
    puts file_type
but this will not work on Windows.
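A slightly more defensive version of the same idea might look like the sketch below. It assumes a file(1) binary on the PATH that supports the --brief option, and it uses a deliberately crude rule (the description contains the word "text") to classify the result; both method names are hypothetical.

```ruby
require "open3"

# Ask file(1) for a one-line description of the given path.
# Returns nil if the command is not installed or exits with an error.
def file_description(path)
  output, status = Open3.capture2("file", "--brief", path)
  status.success? ? output.strip : nil
rescue Errno::ENOENT
  nil
end

# Crude parse: descriptions of text files normally contain the word "text"
# (e.g. "ASCII English text"). This rule is an assumption, not a guarantee.
def text_description?(description)
  !description.nil? && description.match?(/\btext\b/i)
end
```

Passing the path as a separate argument to Open3.capture2 avoids shell quoting problems that the backtick form has with unusual file names. Typical use would be text_description?(file_description("my_file")).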
George
Hi,
At Wed, 20 Jun 2007 08:22:51 +0900,
Daniel DeLorme wrote in [ruby-talk:256241]:
> Still, I have to say I was surprised; I didn't know that a hyphen in
> String#count had the same effect as in a regexp character class. Talk
> about an undocumented feature!
It's documented.
The test can also be written with a negated set:
    s.count("^\t\n -~").to_f/s.size>0.3
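As a small illustration of the String#count behavior being discussed, the sketch below shows ranges and negation on two made-up sample strings; the variable names are only for this example.

```ruby
sample = "Hello, World!\n"

# A hyphen between two characters denotes a range, as in a character class.
lowercase = sample.count("a-z")            # => 8

# A leading caret negates the set: everything that is NOT a lowercase letter.
not_lowercase = sample.count("^a-z")       # => 6

# The sets used in the thread: tab, newline, and space through tilde.
mixed = "\x00\x01abc"
printable = mixed.count("\t\n -~")         # => 3
non_printable = mixed.count("^\t\n -~")    # => 2
```

The two counts for the same set always add up to the string's length, which is why the check can be phrased either way.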
--
Nobu Nakada
robert@fussel ~
$ file .inputrc
.inputrc: ASCII English text
robert@fussel ~
$ uname -a
CYGWIN_NT-5.1 fussel 1.5.24(0.156/4/2) 2007-01-31 10:57 i686 Cygwin
robert
On 19.06.2007 10:01, George Malamidis wrote:
> Hello,
> On a *nix system, you can do
>     file_type = `file my_file`
>     puts file_type
> but this will not work on Windows.
George Malamidis wrote:
> Hello,
> On a *nix system, you can do
>     file_type = `file my_file`
>     puts file_type
> but this will not work on Windows.
> George
Thanks guys, the problem is solved thanks to your suggestions.
Regarding the file command, I can use it on Windows as well, since the
GnuWin32 tools provide it.
Best regards,
Alin