Slow file access with mapped drives ?!

Hi,

running ruby 1.8.4 (2006-04-14) [i386-mswin32]
on Windows2000

i have a script that scans some directories for given filename patterns
and does string replacement in those found files.

it works fine, but when it comes to working with files on a mapped drive
the script hangs for about 20 seconds, then does its work and everything is OK.

i tried with =

V:/path/to/files
and also with
//servername/c$/path/to/files

and it's both the same, no difference.

Any ideas how to speed up the file access ?
Are there general problems with unc path and ruby ?

Gilbert

Hi,

after some irb it shows the bottleneck lies in
the Dir.glob method

i do something like =

config['targetdirs'].each do |dir|

   Dir.chdir("//servername/c$/targetdir")
   Dir.glob("**/"<<config['targetfilepattern']).each do |file|
   ...
  end
...
end

any alternatives to Dir.glob ?

Regards, Gilbert

···

-----Original Message-----
From: Rebhan, Gilbert [mailto:Gilbert.Rebhan@huk-coburg.de]
Sent: Monday, May 14, 2007 11:39 AM
To: ruby-talk ML
Subject: slow file access with mapped drives ?!

Don't you guys ever sleep? Impressive.

···

On 5/14/07, Rebhan, Gilbert <Gilbert.Rebhan@huk-coburg.de> wrote:

--
ruby

Hi,

after some irb it shows the bottleneck lies in
the Dir.glob method

i do something like =

config['targetdirs'].each do |dir|

   Dir.chdir("//servername/c$/targetdir")
   Dir.glob("**/"<<config['targetfilepattern']).each do |file|

Why don't you just do

Dir.glob("//servername/c$/targetdir/**/"<<config['targetfilepattern']).each do |file|

Btw, I don't see any reference to block variable "dir" is this on purpose?

   ...
  end
..
end

any alternatives to Dir.glob ?

Find.find - but don't expect that it's faster because the slowness lies in file system accesses. Network drives are inherently slower than local disks.

Kind regards

  robert
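A minimal sketch of the suggestion above (not from the original thread): put the directory into the glob pattern instead of calling Dir.chdir, and do it once per configured directory. The helper name `glob_in_dirs` is hypothetical, and `flat_map` assumes a newer Ruby than 1.8.4.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: returns all files below each directory in `dirs`
# whose name matches the glob `pattern`, without calling Dir.chdir.
def glob_in_dirs(dirs, pattern)
  dirs.flat_map do |dir|
    Dir.glob(File.join(dir, "**", pattern))
  end
end

# Self-contained demonstration using a temporary directory.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "a", "b"))
  FileUtils.touch(File.join(root, "a", "b", "ScmConfig.xml"))
  FileUtils.touch(File.join(root, "a", "other.txt"))
  puts glob_in_dirs([root], "Scm*.xml")
end
```

Note that this takes a glob pattern such as `Scm*.xml`, not a regular expression.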

···

On 14.05.2007 12:05, Rebhan, Gilbert wrote:

Hi, Robert

···

-----Original Message-----
From: Robert Klemme [mailto:shortcutter@googlemail.com]
Sent: Monday, May 14, 2007 12:20 PM
To: ruby-talk ML
Subject: Re: slow file access with mapped drives ?!

/*
Why don't you just do

Dir.glob("//servername/c$/targetdir/**/"<<config['targetfilepattern']).each do |file|
*/

Because i need to have the targetdirs flexible, for example i have in
my yaml =

targetdirs:
- Y:/tempwork
- T:/rubytest
- //wvp10175/c$/SCM_Server

/*
Btw, I don't see any reference to block variable "dir" is this on
purpose?
*/

config['targetdirs'].each do |dir|
    Dir.chdir(dir)
    Dir.glob("**/"<<config['targetfilepattern']).each do |file|
    ...
    end
    ...
end

any alternatives to Dir.glob ?

/*
Find.find - but don't expect that it's faster because the slowness lies
in file system accesses. Network drives are inherently slower than
local disks.
*/

it runs in almost half the time, much quicker :-)
thanks for the pointer !! , now i have =

config['targetdirs'].each do |dir|
    Find.find(dir) do |f|
      if f =~ /#{config['targetfilepattern']}/
      puts "... processing "<<f
      filesed!(config['replacefrom'],s,f)
      end
    end
  end

Best regards, Gilbert

Hi, Robert

From: Robert Klemme [mailto:shortcutter@googlemail.com]
Sent: Monday, May 14, 2007 12:20 PM
To: ruby-talk ML
Subject: Re: slow file access with mapped drives ?!

/*
Why don't you just do

Dir.glob("//servername/c$/targetdir/**/"<<config['targetfilepattern']).each do |file|
*/

Because i need to have the targetdirs flexible, for example i have in
my yaml =

targetdirs:
- Y:/tempwork
- T:/rubytest
- //wvp10175/c$/SCM_Server

??? The code you posted had no dynamic piece there. Did you maybe mean to write the following?

Dir.chdir("//servername/c$/#{dir}")

/*
Btw, I don't see any reference to block variable "dir" is this on
purpose?
*/

config['targetdirs'].each do |dir|
    Dir.chdir(dir)
    Dir.glob("**/"<<config['targetfilepattern']).each do |file|
    ..
    end
    ..
end

That was not in the code you posted.

any alternatives to Dir.glob ?

/*
Find.find - but don't expect that it's faster because the slowness lies in file system accesses. Network drives are inherently slower than local disks.
*/

it runs in almost half the time, much quicker :-)

... which might be due to the fact that your new code doesn't do the chdir! You changed multiple parameters at the same time so it's hard to attribute the perceived performance change to any one parameter.
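To attribute a timing difference to one thing, both approaches can be measured on the same directory with nothing else changed. The following is only a sketch, not from the original thread: the helper names are hypothetical, and the temporary directory is a stand-in that would be replaced by the real share. Running it more than once helps, since the first pass may also warm the file system cache.

```ruby
require 'benchmark'
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helpers: both return the files below `dir` whose path matches `regex`.
def scan_with_glob(dir, regex)
  Dir.glob(File.join(dir, "**", "*")).select { |f| File.file?(f) && f =~ regex }
end

def scan_with_find(dir, regex)
  found = []
  Find.find(dir) { |f| found << f if File.file?(f) && f =~ regex }
  found
end

# Stand-in directory; replace with the mapped drive or UNC path under test.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "sub"))
  FileUtils.touch(File.join(dir, "sub", "ScmConfig.xml"))
  regex = /Scm\w+\.xml$/

  Benchmark.bm(10) do |bm|
    bm.report("Dir.glob:")  { scan_with_glob(dir, regex) }
    bm.report("Find.find:") { scan_with_find(dir, regex) }
  end
end
```

One difference to keep in mind: a `**/*` glob skips dot files by default, while Find.find visits them.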

thanks for the pointer !! , now i have =

config['targetdirs'].each do |dir|
    Find.find(dir) do |f|
      if f =~ /#{config['targetfilepattern']}/
      puts "... processing "<<f
      filesed!(config['replacefrom'],s,f)
      end
    end
  end

Please make sure you post the code you are actually testing - otherwise everybody else will have a hard time understanding what's going on let alone come up with helpful replies.

Regards

  robert

···

On 14.05.2007 14:18, Rebhan, Gilbert wrote:

-----Original Message-----

Hi,

···

-----Original Message-----
From: Robert Klemme [mailto:shortcutter@googlemail.com]
Sent: Monday, May 14, 2007 2:45 PM
To: ruby-talk ML
Subject: Re: slow file access with mapped drives ?!

/*
Please make sure you post the code you are actually testing - otherwise
everybody else will have a hard time understanding what's going on let
alone come up with helpful replies.
*/

i don't want to bother with bad code, but here is the whole thing,
for you to understand what i try to achieve =

----snipp----

require 'yaml'
require 'highline/import'
require 'win32/registry'
require 'find'

config=YAML.load_file("cvs_login.yaml")

cvsuser = ask("Enter CVS User: ") {|q|
q.default = "#{ENV["USERNAME"]}"
q.echo = true}
cvspass = ask("Enter password: ") { |q| q.echo = '*' }
puts "\n\n"

config['cvsrepos'].each {|x|
puts "Login CVS Repository >> #{x} ..."
IO.popen("#{config['CVSEXE']} -d :pserver:#{cvsuser}:#{cvspass}@cvsprod:d:/cvsrepos/#{x} login")
}
puts "Login successful !!"

def filesed!(pattern, replaceString, filename)
  regexPattern = Regexp.compile(pattern)
  tmpFilename = "#{filename}.#{Time.now.usec}"
  tmpFile = File.new(tmpFilename, "w")
    srcFile = File.open(filename)
      srcFile.each_line do |line|
        tmpFile.print line.gsub(regexPattern) {|match|
        replaceString
      }
  end
  tmpFile.close
  srcFile .close
  File.rename(tmpFilename, filename)
end

if config['targetdirs']

  Win32::Registry::HKEY_CURRENT_USER.open('Software\Cvsnt\cvspass') do |reg|
    @cvsreg = reg[":pserver:#{ENV["USERNAME"]}@cvsprod:d:/cvsrepos/test"]
  end

  s=config['replaceto1']<<"#{@cvsreg}"<<config['replaceto2']

  config['targetdirs'].each do |dir|
    puts "\n\n"
    Find.find(dir) do |f|
      if f =~ /#{config['targetfilepattern']}/
      puts "... processing "<<f
      filesed!(config['replacefrom'],s,f)
      end
    end
  end
  puts "\n\nDone !!"
  sleep 1
  else
  puts "\n\nDone !!"
    sleep 1
  end

----snipp---

----yaml---

---
CVSEXE: "./cvsnt/cvs.exe"
cvsrepos:
- foo
- foobar
- foobaz
- bla
- ...

# optional stuff following
targetdirs:
- Y:/tempwork
- T:/rubytest
- //wvp10175/c$/SCM_Server

targetfilepattern: Scm\w+\.xml$

replacefrom: "<Passwort>.*</Passwort>"
replaceto1: "<Passwort><![CDATA["
replaceto2: "]]></Passwort>"

----yaml----

the filesed method is not by me, found it somewhere, don't remember

Regards, Gilbert

From: Robert Klemme [mailto:shortcutter@googlemail.com]
Sent: Monday, May 14, 2007 2:45 PM
To: ruby-talk ML
Subject: Re: slow file access with mapped drives ?!

/*
Please make sure you post the code you are actually testing - otherwise everybody else will have a hard time understanding what's going on let alone come up with helpful replies.
*/

i don't want to bother with bad code, but here is the whole thing,
for you to understand what i try to achieve =

I was referring to the fact that you present one bit of code and in your answer to my posting you present another bit of code as if *that* was the original code.

However, there are quite a few things to say about this bit of code.

----snipp----

require 'yaml'
require 'highline/import'
require 'win32/registry'
require 'find'

config=YAML.load_file("cvs_login.yaml")

cvsuser = ask("Enter CVS User: ") {|q|
q.default = "#{ENV["USERNAME"]}"
q.echo = true}
cvspass = ask("Enter password: ") { |q| q.echo = '*' }
puts "\n\n"

config['cvsrepos'].each {|x|
puts "Login CVS Repository >> #{x} ..."
IO.popen("#{config['CVSEXE']} -d :pserver:#{cvsuser}:#{cvspass}@cvsprod:d:/cvsrepos/#{x} login")
}

Why do you use popen if you do not read the pipe? *If* you use popen you should make sure the pipe is read from - even if you ignore what you find because otherwise the other process might get blocked.
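A small illustration of reading the pipe (not from the original thread). The helper name `run_and_capture` is hypothetical, and the Ruby interpreter stands in for cvs.exe so the example runs anywhere. The block form of IO.popen closes the pipe and sets `$?` when the block returns.

```ruby
require 'rbconfig'

# Hypothetical helper: runs a command and returns [output, success].
# Reading the pipe to EOF keeps the child from blocking on a full pipe buffer.
def run_and_capture(*cmd)
  output = IO.popen(cmd) { |io| io.read }
  [output, $?.success?]
end

# Demonstration with the Ruby interpreter itself standing in for cvs.exe.
out, ok = run_and_capture(RbConfig.ruby, "-e", "puts 'logged in'")
puts "output: #{out.inspect}, success: #{ok}"
```

Passing the command as separate arguments also bypasses the shell, so special characters in a password are not interpreted by it.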

puts "Login successful !!"

def filesed!(pattern, replaceString, filename)
  regexPattern = Regexp.compile(pattern)
  tmpFilename = "#{filename}.#{Time.now.usec}"
  tmpFile = File.new(tmpFilename, "w")
    srcFile = File.open(filename)
      srcFile.each_line do |line|
        tmpFile.print line.gsub(regexPattern) {|match|
        replaceString
      }

Rather use the non block form of gsub unless you want to make sure that there is no metacharacter interpretation going on. (Even in that case I'd probably prefer the non block form with a modified replaceString.)
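The difference between the two forms of gsub, as a short sketch with made-up values (not from the original thread): in the non-block form, backslash sequences such as \0 in the replacement string are expanded; in the block form, the block's return value is inserted as-is.

```ruby
line = "<Passwort>old</Passwort>"
replacement = 'a\0b'   # contains the backreference \0 (the whole match)

# Non-block form: \0 in the replacement string is expanded to the match.
expanded = line.gsub(/old/, replacement)

# Block form: the block's return value is inserted literally.
literal = line.gsub(/old/) { replacement }

puts expanded  # <Passwort>aoldb</Passwort>
puts literal   # <Passwort>a\0b</Passwort>
```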

  end
  tmpFile.close
  srcFile .close

Rather use the block form of File.open which is much safer.

  File.rename(tmpFilename, filename)
end

if config['targetdirs']

  Win32::Registry::HKEY_CURRENT_USER.open('Software\Cvsnt\cvspass') do |reg|
    @cvsreg = reg[":pserver:#{ENV["USERNAME"]}@cvsprod:d:/cvsrepos/test"]
  end

  s=config['replaceto1']<<"#{@cvsreg}"<<config['replaceto2']

Why do you use "#{@cvsreg}" instead of plain @cvsreg or @cvsreg.to_s?

  config['targetdirs'].each do |dir|
    puts "\n\n"
    Find.find(dir) do |f|
      if f =~ /#{config['targetfilepattern']}/

You should pull out this regexp compilation from both loops for more efficiency.

      puts "... processing "<<f
      filesed!(config['replacefrom'],s,f)

Indentation.

      end
    end
  end
  puts "\n\nDone !!"
  sleep 1

Why the sleep?

  else
  puts "\n\nDone !!"
    sleep 1

Why the sleep?

  end

----snipp---

----yaml---

---
CVSEXE: "./cvsnt/cvs.exe"
cvsrepos:
- foo
- foobar
- foobaz
- bla
- ...

# optional stuff following
targetdirs:
- Y:/tempwork
- T:/rubytest
- //wvp10175/c$/SCM_Server

targetfilepattern: Scm\w+\.xml$

replacefrom: "<Passwort>.*</Passwort>"

This pattern will kill you if there are two sections with <Passwort></Passwort> in the file. You should at least use the reluctant quantifier.
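To illustrate the greedy versus reluctant behaviour on a made-up input (not from the original thread): `.*` runs to the last closing tag on the line, while `.*?` stops at the first one.

```ruby
xml = "<Passwort>one</Passwort><x/><Passwort>two</Passwort>"

# Greedy: one match spanning from the first opening to the last closing tag.
greedy    = xml.gsub(/<Passwort>.*<\/Passwort>/,  "<Passwort>NEW</Passwort>")

# Reluctant: each <Passwort> element is matched on its own.
reluctant = xml.gsub(/<Passwort>.*?<\/Passwort>/, "<Passwort>NEW</Passwort>")

puts greedy     # <Passwort>NEW</Passwort>
puts reluctant  # <Passwort>NEW</Passwort><x/><Passwort>NEW</Passwort>
```

Since filesed! works line by line, this only bites when both elements sit on the same line.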

replaceto1: "<Passwort><![CDATA["
replaceto2: "]]></Passwort>"

This cries for using a regexp group. Also filesed is pretty inflexible. I'd rather do something like this:

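require 'fileutils'

def file_replace(file, tmp = file + Time.now.usec.to_s)
   File.open(tmp, "w") do |out|
     File.open(file) do |inf|
       # each line is passed to the caller's block; the block's result is written out
       inf.each_line {|line| out.puts(yield(line)) }
     end
   end
   # mv lives in FileUtils, not File
   FileUtils.mv(tmp, file, :force => true)
end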

Now you can do arbitrary line based replacements like

file_replace "foo.txt" do |line|
   "# " << line
end

----yaml----

the filesed method is not by me, found it somewhere, don't remember

Regards, Gilbert

Regards

  robert

···

On 14.05.2007 14:56, Rebhan, Gilbert wrote:
  > -----Original Message-----

Hi,

···

-----Original Message-----
From: Robert Klemme [mailto:shortcutter@googlemail.com]
Sent: Monday, May 14, 2007 3:51 PM
To: ruby-talk ML
Subject: Re: slow file access with mapped drives ?!

/*
I was referring to the fact that you present one bit of code and in your
answer to my posting you present another bit of code as if *that* was
the original code.
*/

sorry for that
Thanks for your annotations !!

IO.popen("#{config['CVSEXE']} -d :pserver:#{cvsuser}:#{cvspass}@cvsprod:d:/cvsrepos/#{x} login")
}

/*
Why do you use popen if you do not read the pipe? *If* you use popen
you should make sure the pipe is read from - even if you ignore what you
find because otherwise the other process might get blocked.
*/

now i have = system(...) instead
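With system, the return value tells whether the command succeeded, so "Login successful" need not be printed unconditionally. A small sketch (not from the original thread): the helper name `command_succeeded?` is hypothetical, and the Ruby interpreter stands in for cvs.exe.

```ruby
require 'rbconfig'

# Hypothetical helper: runs a command without a shell and reports whether it
# exited with status 0. system returns true, false, or nil (command not run).
def command_succeeded?(*cmd)
  system(*cmd) == true
end

# The Ruby interpreter stands in for cvs.exe here.
puts command_succeeded?(RbConfig.ruby, "-e", "exit 0")   # true
puts command_succeeded?(RbConfig.ruby, "-e", "exit 3")   # false
```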

Why do you use "#{@cvsreg}" instead of plain @cvsreg or @cvsreg.to_s?

assuring a string, but @cvsreg works

  config['targetdirs'].each do |dir|
    puts "\n\n"
    Find.find(dir) do |f|
      if f =~ /#{config['targetfilepattern']}/

You should pull out this regexp compilation from both loops for more
efficiency.

OK, now i have =

s=config['replaceto1']<<@cvsreg<<config['replaceto2']
  r=/#{config['targetfilepattern']}/

  config['targetdirs'].each do |dir|
    puts "\n\n"
    Find.find(dir) do |f|
      if f =~ r
      puts "... processing "<<f
     filesed!(config['replacefrom'],s,f)
    ....

  else
  puts "\n\nDone !!"
    sleep 1

/*
Why the sleep?
*/

to see the echoes on stdout, before they disappear

replacefrom: "<Passwort>.*</Passwort>"

/*
This pattern will kill you if there are two sections with
<Passwort></Passwort> in the file. You should at least use the
reluctant quantifier.
*/

i know, it's dangerous, but there's no other match in the file for sure

/*

replaceto1: "<Passwort><![CDATA["
replaceto2: "]]></Passwort>"

This cries for using a regexp group. Also filesed is pretty inflexible.
I'd rather do something like this:

def file_replace(file, tmp = file + Time.now.usec.to_s)
   File.open(tmp, "w") do |out|
     File.open(file) do |inf|
       inf.each_line {|line| out.puts(yield(line)) }
     end
   end
   FileUtils.mv(tmp, file, :force => true)  # FileUtils (require 'fileutils'); File has no mv
end

Now you can do arbitrary line based replacements like

file_replace "foo.txt" do |line|
   "# " << line
end

*/

Sorry don't understand how your code works and also there is no regex
stuff in it ?!
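A possible reading of the quoted helper, as a sketch rather than Robert's own answer: file_replace hands every line to the caller's block via yield and writes whatever the block returns, so the regexp belongs in the block. The file name, the password value, and the use of a temporary directory are assumptions for the demonstration; FileUtils.mv is used because File has no mv method.

```ruby
require 'fileutils'
require 'tmpdir'

# The helper from the quoted message, with FileUtils.mv.
def file_replace(file, tmp = file + Time.now.usec.to_s)
  File.open(tmp, "w") do |out|
    File.open(file) do |inf|
      # yield passes each line to the caller's block; its return value is written out.
      inf.each_line { |line| out.puts(yield(line)) }
    end
  end
  FileUtils.mv(tmp, file, :force => true)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "ScmConfig.xml")   # hypothetical file name
  File.write(path, "<Passwort>old</Passwort>\n")

  new_password = "test123"                 # placeholder value
  file_replace(path) do |line|
    # Groups 1 and 2 capture the opening and closing tags; the block form of
    # gsub keeps the password from being interpreted as a replacement pattern.
    line.gsub(/(<Passwort>).*?(<\/Passwort>)/) { "#{$1}<![CDATA[#{new_password}]]>#{$2}" }
  end

  puts File.read(path)
end
```

Lines that do not match are returned unchanged by gsub and so are copied through as they are.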

I thought there would be an easy XML/XPath solution for my
needs, as i have to edit an XML file. i tried with REXML =

require 'rexml/document'
=> true

include REXML
=> Object

file=File.new("Y:/tempwork/ScmConfig.xml")
=> #<File:Y:/tempwork/ScmConfig.xml>

doc=Document.new file
=> <UNDEFINED> ... </>

cvspass = XPath.match( doc, "//de.foobar.scm.repository.repcvs.CvsConnector/Passwort")
=> [<Passwort> ... </>]

i tried several combinations of =

cvspass[1].gsub(/(<Passwort>!\[CDATA\[)\.*(\[\[>Passwort>)/, $1.to_s<<'test123'<<$2.to_s)
NoMethodError: private method `gsub' called for nil:NilClass

cvspass[1].gsub(/(<Passwort>!\[CDATA\[)\.*(\[\[>Passwort>)/, $1<<'test123'<<$2)
NoMethodError: undefined method `<<' for nil:NilClass

what's the right method/syntax with REXML ?
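One possible approach, sketched under assumptions rather than given as the definitive answer: the errors above are consistent with `cvspass[1]` being nil (XPath.match returned a one-element array, so the element is at index 0) and with `$1`/`$2` being nil because the arguments are evaluated before any match has happened. With REXML the element can be changed directly, without a regexp. The helper name `set_password`, the short XPath, and the sample document are made up for the illustration.

```ruby
require 'rexml/document'

# Hypothetical helper: replaces the text of every <Passwort> element with a
# CDATA section holding new_password and returns the resulting XML string.
def set_password(xml, new_password)
  doc = REXML::Document.new(xml)
  REXML::XPath.each(doc, "//Passwort") do |el|
    # Element#text= accepts a Text node; CData is a subclass of Text.
    el.text = REXML::CData.new(new_password)
  end
  doc.to_s
end

xml = "<config><Passwort><![CDATA[old]]></Passwort></config>"
puts set_password(xml, "test123")
```

To change a file in place, the document could be read with `REXML::Document.new(File.read(path))` and written back with `File.open(path, "w") { |f| doc.write(f) }`.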

Regards, Gilbert