Directory searching against a text file

I am in the middle of writing a quick program which will scan the
contents of a given file path recursively for a list of keywords stored
in a file. My code so far is below, but before moving ahead I have two
questions.

First: I am passing in a text file called "terms.txt". To search for
each keyword in the file, I assume the best way to do so is as follows:

terms.each do |term|
  if line.include?(term)
    puts ""
  end
end

My second question is: this program works well for searching text files,
but what about Word documents and spreadsheets? Do I need some Windows
API in there?

Many thanks

require 'find'

class ESearch

  #method which is passed file path from cmd line
  def scanFiles(path)
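    #single quotes keep the backslashes in the Windows path literal
    terms_file = 'C:\Documents and Settings\user\Desktop\terms.txt'
    #load the keywords, one per line, skipping blank lines
    terms = File.readlines(terms_file).map(&:strip).reject(&:empty?)
    #process each file under the passed file path
    Find.find(path) do |curPath|
      next unless File.file?(curPath)
      #process the contents of each file line by line, counting line numbers
      File.open(curPath) do |file|
        file.each do |line|
          #check if the line contains any term and output the path and line number
          if terms.any? { |term| line.include?(term) }
            puts "#{curPath}:#{file.lineno}"
          end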
        end
      end
    end
  end
end

#run from the cmd line, passing in a file path; this prints the usage
#message if a path is not passed
if __FILE__ == $0
  if ARGV.size != 1
    puts "Use: #{$0} [path]"
    exit
  end

  esearch = ESearch.new()
  esearch.scanFiles(ARGV[0])
end

--
Posted via http://www.ruby-forum.com/.

Stuart Clarke wrote:

> I am in the middle of writing a quick program which will scan the
> contents of a given file path recursively for a list of keywords
> stored in a file. My code so far is below, but before moving ahead I
> have two questions.
>
> First: I am passing in a text file called "terms.txt". To search for
> each keyword in the file, I assume the best way to do so is as
> follows:
>
> terms.each do |term|
>   if line.include?(term)
>     puts ""
>   end
> end
>
> My second question is: this program works well for searching text
> files, but what about Word documents and spreadsheets? Do I need some
> Windows API in there?

You can read these files if you open them in binary mode.
However, they will contain so much extra binary crap that
it may not be easy to search in them.
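
As a rough sketch of the binary-mode idea: the helper below (a hypothetical
name, not part of the original script) reads the whole file as raw bytes and
looks for each term both as plain single-byte text and as UTF-16LE. The
UTF-16LE check is an assumption about how an older .doc or .xls file might
store its text. It will not help with .docx or .xlsx files, which are
zip-compressed.

```ruby
# Hypothetical helper: returns the terms whose bytes occur anywhere in the
# file, either as plain single-byte text or encoded as UTF-16LE.
# Which encoding a given document uses is an assumption, so both are tried.
def terms_in_binary_file(path, terms)
  data = File.binread(path)              # whole file as a binary string
  terms.select do |term|
    plain = term.b                       # raw bytes of the term as given
    wide  = term.encode("UTF-16LE").b    # the same term as UTF-16LE bytes
    data.include?(plain) || data.include?(wide)
  end
end

# Example call (the file name is only a placeholder):
#   terms_in_binary_file("report.doc", ["invoice", "budget"])
```

This only tells you that a term appears somewhere in the file. There are no
meaningful line numbers in a binary document.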

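# Load the keywords, one per line, dropping whitespace and blank lines.
terms = File.readlines("terms.txt").map(&:strip).reject(&:empty?)

# ARGF reads each file named on the command line in turn.
ARGF.each_line do |line|
  line.strip!
  # Report the line if it contains any of the keywords.
  if terms.any? { |term| line.include?(term) }
    # ARGF.file.lineno counts within the current file;
    # ARGF.lineno keeps counting across all files.
    puts "#{ARGF.filename}:#{ARGF.file.lineno}: #{line}"
  end
end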

Running it:

ruby scanner.rb *.dat
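
If the list of terms gets long, another option is to combine them into one
pattern and test each line once, instead of looping over the terms for every
line. Here is a minimal sketch, assuming case-insensitive matching is wanted
(the original post does not say). The method names build_pattern and
matching_lines are made up for illustration.

```ruby
require 'stringio'

# Hypothetical helper: builds one case-insensitive pattern from the keywords.
# Regexp.union escapes each string, so "a.b" only matches a literal "a.b".
def build_pattern(terms)
  Regexp.new(Regexp.union(terms).source, Regexp::IGNORECASE)
end

# Hypothetical helper: returns [line_number, text] pairs for matching lines.
def matching_lines(io, pattern)
  hits = []
  io.each_line.with_index(1) do |line, number|
    hits << [number, line.chomp] if line =~ pattern
  end
  hits
end

# Small demonstration on in-memory text instead of a real file.
pattern = build_pattern(["invoice", "a.b"])
sample  = StringIO.new("first line\nAn INVOICE here\naxb\na.b\n")
matching_lines(sample, pattern).each do |number, text|
  puts "#{number}: #{text}"
end
```

On the sample text this reports lines 2 and 4. "axb" is skipped because the
dot in "a.b" is treated literally.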