Kernel#system bug?

Running Ruby 1.8.4 on Linux

This problem shows up with Kernel#system and also with IO.popen. It seems
that the working directory of a subshell can be affected by the command
being executed. In the first call to popen, I'm just executing "env". In
the second, I'm calling "env;" (with a semicolon). I ran into this because
some commands I was calling with system() were unable to find files that
should have been in the directory that I chdir'd to.

Program and Output below.

Thanks,
Charlton

Program:

Dir.mkdir("bug") if !FileTest.exists?("bug")
File.open("bug/bugfile", "w") { |file|
    file << "Bug!"
}
Dir.chdir("bug")

puts "without semi-colon"
IO.popen("env").readlines.each do |entry|
    puts entry if entry =~ /PWD/
end

puts "with semi-colon"
IO.popen("env;").readlines.each do |entry|
    puts entry if entry =~ /PWD/
end

Output:

without semi-colon
PWD=/user/blah/tests/ruby
with semi-colon
PWD=/user/blah/tests/ruby/bug

See the recent discussion on ruby-core.

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net

I LIT YOUR GEM ON FIRE!

Hi,

At Thu, 21 Dec 2006 06:45:06 +0900,
Charlton wrote in [ruby-talk:230639]:

> Program:
>
> Dir.mkdir("bug") if !FileTest.exists?("bug")
> File.open("bug/bugfile", "w") { |file|
>     file << "Bug!"
> }
> Dir.chdir("bug")
>
> puts "without semi-colon"
> IO.popen("env").readlines.each do |entry|
>     puts entry if entry =~ /PWD/
> end
>
> puts "with semi-colon"
> IO.popen("env;").readlines.each do |entry|
>     puts entry if entry =~ /PWD/
> end
>
> Output:
>
> without semi-colon
> PWD=/user/blah/tests/ruby
> with semi-colon
> PWD=/user/blah/tests/ruby/bug

It is a natural result.
PWD is not set by the OS automatically; it is set by sh.

You should use getcwd() in C or Dir.pwd in ruby instead of
relying on $PWD.
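
A minimal sketch of this point, using only the Ruby standard library (the helper name cwd_report is made up for illustration). Dir.chdir changes the real working directory of the process but leaves ENV["PWD"] alone, so the two can disagree:

```ruby
require "tmpdir"

# Hypothetical helper: report the real working directory alongside
# whatever the inherited PWD environment variable happens to say.
def cwd_report
  { :real => Dir.pwd, :env => ENV["PWD"] }
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    report = cwd_report
    puts "Dir.pwd    : #{report[:real]}"         # follows the chdir
    puts "ENV['PWD'] : #{report[:env].inspect}"  # still the value inherited at startup
  end
end
```

Only the first value is guaranteed to track Dir.chdir; the second is whatever the parent process handed down.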

···

--
Nobu Nakada

Hi Nobu,

Hm, I'm not sure I understand how this is natural. It's true that the
shell sets PWD but if I'm executing the same command from within a
Kernel#system call, I would have expected the directory context to be
consistent. If the directory viewed by the shell isn't coherent with
wherever I've taken ruby to (via Dir.chdir), then I would almost say
it's a busted implementation.

Cheers,
Charlton

Thanks, Eric. I see the discussion over at ruby-core. Unfortunately, I
don't really understand the resolution (if there is one). I'll keep my
eyes on the mailing list.

Cheers,
Charlton

> Hi Nobu,
>
> Hm, I'm not sure I understand how this is natural. It's true that the
> shell sets PWD

and this here is the issue. in this command there is no shell involved:

> puts "without semi-colon"
> IO.popen("env").readlines.each do |entry|
>     puts entry if entry =~ /PWD/
> end

which you can confirm thusly:

   harp:~ > cat a.rb
   IO.popen 'env'

   harp:~ > strace -f -- ruby a.rb 2>&1|grep exec
   execve("/home/ahoward/bin/ruby", ["ruby", "a.rb"], [/* 52 vars */]) = 0
   set_thread_area({entry_number:-1 -> 6, base_addr:0xb75e1460, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
   execve("/bin/env", ["env"], [/* 52 vars */]) = 0
   set_thread_area({entry_number:-1 -> 6, base_addr:0xb75e2780, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0

yet a semi-colon terminated command does indeed invoke /bin/sh:

   harp:~ > cat b.rb
   IO.popen 'env;'

   harp:~ > strace -f -- ruby b.rb 2>&1|grep exec
   execve("/home/ahoward/bin/ruby", ["ruby", "b.rb"], [/* 52 vars */]) = 0
   set_thread_area({entry_number:-1 -> 6, base_addr:0xb75e0460, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
   execve("/bin/sh", ["sh", "-c", "env;"], [/* 52 vars */]) = 0
   set_thread_area({entry_number:-1 -> 6, base_addr:0xb75e0080, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
   execve("/bin/env", ["env"], [/* 50 vars */]) = 0
   set_thread_area({entry_number:-1 -> 6, base_addr:0xb75e4780, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0

which you see here

> puts "with semi-colon"
> IO.popen("env;").readlines.each do |entry|
>     puts entry if entry =~ /PWD/
> end

this is because the string 'env;' is, in fact, not the name of a program. in a
c program you would not be able to exec it directly. ruby, however, is kind:
when it sees any of the special chars

   "*?{}<>()~&|\\$;'`\"\n"

in the command string it runs your command via sh. this is documented
somewhere, though i forget where atm...
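
as a rough sketch of that check, assuming the character list quoted above is the whole story (the interpreter's real rules may well differ, and the names SHELL_META and probably_needs_shell? are made up here):

```ruby
# Characters copied from the list quoted in this message; this is an
# approximation for illustration, not ruby's actual implementation.
SHELL_META = "*?{}<>()~&|\\$;'`\"\n"

# Hypothetical helper: true if the command string contains any character
# from the list, i.e. it would likely be handed to sh rather than exec'd.
def probably_needs_shell?(command)
  command.each_char.any? { |ch| SHELL_META.include?(ch) }
end

puts probably_needs_shell?("env")   # prints false
puts probably_needs_shell?("env;")  # prints true
```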

so what's happening is that, in one case, you exec 'env' directly, which simply
inherits the parent's env, including the current value of PWD. Dir.chdir does
not touch ENV['PWD'], so that value is stale. in the second case you actually
exec sh, which resets PWD to the real working directory and then runs env as a
child process.

in summary, nobu is right - simply use Dir.pwd and do not rely on auto-magical
behaviour of child processes which set, or may not set, the PWD env var.
similarly, if you want to avoid the special handling of cmd strings given to
system/popen, make sure the commands given are valid (in the 'c' sense) so you
bypass ruby filtering them via /bin/sh.
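
to make the no-shell case concrete, here is a small sketch. it assumes ruby 1.9 or later, where IO.popen accepts an argument array (which is exec'd directly, never through sh), and it uses a child ruby located via RbConfig.ruby in place of env. the helper name is hypothetical:

```ruby
require "rbconfig"
require "tmpdir"

# Hypothetical helper: run a child ruby directly (array form, so /bin/sh is
# not involved) and return what the child sees as [PWD variable, real cwd].
def child_view_without_shell
  script = 'puts ENV["PWD"].to_s; puts Dir.pwd'
  IO.popen([RbConfig.ruby, "-e", script]) { |io| io.read.lines.map(&:chomp) }
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    env_pwd, real_cwd = child_view_without_shell
    puts "child PWD variable: #{env_pwd}"   # inherited unchanged from this process
    puts "child real cwd    : #{real_cwd}"  # the directory we chdir'd into
  end
end
```

the child's real working directory follows the chdir, while its PWD variable is just a copy of the parent's.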

regards.

-a

···

On Thu, 21 Dec 2006, Charlton wrote:
--
if you find yourself slandering anybody, first imagine that your mouth is
filled with excrement. it will break you of the habit quickly enough. - the
dalai lama

"Charlton" <charlton.wang@gmail.com> writes:

> Hi Nobu,
>
> Hm, I'm not sure I understand how this is natural. It's true that the
> shell sets PWD but if I'm executing the same command from within a
> Kernel#system call, I would have expected the directory context to be
> consistent. If the directory viewed by the shell isn't coherent with
> wherever I've taken ruby to (via Dir.chdir), then I would almost say
> it's a busted implementation.

Maybe this helps:

    Dir.mkdir("bug") if !FileTest.exists?("bug")
    Dir.chdir("bug")
    
    puts "without semi-colon"
    puts IO.popen("pwd").readlines
    puts IO.popen("env").readlines.grep(/^PWD/)[0].split(/=/)[1]
     
    puts "\nwith semi-colon"
    puts IO.popen("pwd;").readlines
    puts IO.popen("env;").readlines.grep(/^PWD/)[0].split(/=/)[1]
    
    # >> without semi-colon
    # >> /tmp/bug
    # >> /tmp
    # >>
    # >> with semi-colon
    # >> /tmp/bug
    # >> /tmp/bug

The working directory *does* change, but the PWD environment variable is
set by the shell, not the operating system.

-Marshall

Thanks, Ara,

That clarifies it beautifully. I wasn't aware that Ruby actually looked
behind the scenes for shell characters in order to determine whether or
not to execute the SHELL. I understand the behaviour now. I guess the
original program that caused me to run into this snag was Perforce
(p4). It's clearly using the PWD environment variable to do its work as
is witnessed by:

( setenv PWD dont_exist ; p4 -v3 info | grep cwd )

RpcSendBuffer cwd = dont_exist
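
For tools like this that trust PWD, one possible workaround (a sketch only, not something proposed elsewhere in this thread) is to hand the child a fresh PWD explicitly. It assumes Ruby 1.9 or later, where IO.popen accepts a leading environment hash; the helper name run_with_fresh_pwd is hypothetical, and a child ruby stands in for p4:

```ruby
require "rbconfig"
require "tmpdir"

# Hypothetical helper: run a command without a shell, but with PWD forced to
# the real working directory so PWD-trusting tools see a consistent value.
def run_with_fresh_pwd(*argv)
  IO.popen({ "PWD" => Dir.pwd }, argv) { |io| io.read }
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    # A child ruby stands in for p4 here; it just echoes the PWD it was given.
    puts run_with_fresh_pwd(RbConfig.ruby, "-e", 'print ENV["PWD"]')
  end
end
```

This keeps the command out of /bin/sh while still giving the child a PWD that matches Dir.pwd.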

Thanks all for clarifying.

Cheers,
Charlton
