Pathname with backslashes

Hello everybody,

I came across this today. It seems that Pathname is not able to parse
Windows paths with backslashes when running in a Unix environment.

# on Unix
irb(main):016:0> Pathname("c:\\hello\\world\\file.exe").basename
=> #<Pathname:c:\hello\world\file.exe>

# on Windows
irb(main):016:0> Pathname("c:\\hello\\world\\file.exe").basename
=> #<Pathname:file.exe>

I know that a workaround on Unix is to use the #gsub method to replace
backslashes with forward slashes:
irb(main):019:0> Pathname("c:\\hello\\world\\file.exe".gsub(/\\/, '/')).basename
=> #<Pathname:file.exe>

However, Unix paths are parsed correctly in a Windows environment:
irb(main):016:0> Pathname("/hello/world/file.exe").basename
=> #<Pathname:file.exe>

...but Windows paths don't work well on Unix.
Shouldn't Pathname take care of this instead of me manipulating the path?
Personally, I would consider this unexpected behavior.

Any thoughts?

Thank you,

*Fabio Pitino*

"c:\\hello\\world\\file.exe" is a valid filename in unix, special
chars included : )

read up on unix filesystems and compare with windows.

kind regards
--botp

You should simply do it the other way round. Windows accepts slashes as
path dividers as well as backslashes, so if you just use ordinary
slashes everywhere on all platforms it just works (tm). Just try it.
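
As a minimal sketch of that advice (the path components here are made up for illustration), building and parsing a path with forward slashes only should behave the same on Unix and on Windows:

```ruby
require "pathname"

# File.join always joins with "/", regardless of the platform.
joined = File.join("hello", "world", "file.exe")

# Pathname treats "/" as a separator on Unix and on Windows alike.
path = Pathname.new(joined)
base = path.basename.to_s
dir  = path.dirname.to_s

puts joined  # hello/world/file.exe
puts base    # file.exe
puts dir     # hello/world
```

This only helps when you control how the path string is produced; it does not change how an incoming backslash-separated string is parsed on Unix.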

Greetings
Marvin

--
Blog: https://www.guelkerdev.de
PGP/GPG ID: F1D8799FBCC8BC4F

"c:\a\b" is a valid FILENAME in Unix. So Pathname treats it as such on a
Unix system.

While inconvenient, the best option if you know you are handling Windows
paths on a Unix box is to gsub the backslashes into forward slashes.

There certainly could be an argument that adding a :style parameter to
Pathname would be helpful - it's just not available at the moment.

John

Sorry, I didn't get your comment :(
I mean that on Unix, methods like #dirname, #basename, etc. don't work as expected if Pathname is given a path with backslashes.

To give you some context: I have an app on a Linux server that parses feeds containing file metadata coming from Windows machines. One of the operations is to break down path information.

Just what he said, that is a valid unix file name:

[ruby-2.4.0p0] tmp/ruby-talk $ ls -ltrah
total 0
-rw-r--r-- 1 rab wheel 0B Feb 24 11:36 c:\hello\world\file.exe
-rw-r--r-- 1 rab wheel 0B Feb 24 11:36 normal_file
drwxrwxrwt 11 root wheel 374B Feb 24 11:38 ..
drwxr-xr-x 4 rab wheel 136B Feb 24 11:38 .

You can either use .tr("\\","/") or just split the string yourself. This was perhaps a valid path on the source machine, but not on your Linux server. (You could have a local subdirectory named "c:" also so it could get quite confusing.)
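
Both options can be sketched roughly as follows, assuming the input is always a backslash-separated Windows path (the variable names are illustrative only):

```ruby
require "pathname"

windows_path = "c:\\hello\\world\\file.exe"

# Option 1: translate backslashes to forward slashes, then let Pathname
# parse the result with its normal "/" handling.
normalized = Pathname.new(windows_path.tr("\\", "/"))
puts normalized.basename  # file.exe
puts normalized.dirname   # c:/hello/world

# Option 2: skip Pathname and split on the backslash directly.
parts = windows_path.split("\\")
puts parts.last           # file.exe
p parts[0...-1]           # ["c:", "hello", "world"]
```

Note that after option 1 the "c:" component is just an ordinary directory name as far as a Unix Pathname is concerned, which is the confusion mentioned above.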

-Rob

Thanks all for the feedback!
That makes total sense now. I agree with John that it would be helpful to
be able to pass in the platform as a parameter when working on this type
of cross-platform operation.

But I see that Pathname's primary focus is to work correctly in the
environment it runs in.

Thank you for your time, much appreciated!

*Fabio Pitino*

> I mean that on Unix methods like #dirname, #basename, etc don't work as
> expected if Pathname takes in input a path with backslashes.

in unix, colons and backslashes (among others) are valid chars for
file naming. so a file named "c:\\testing\\only" is a valid
file/filename; the "c:" and the backslashes are part of the filename.
unix has only one root("/"). windows has multiple roots "based" on the
drive. in unix, you can have a path like "/c:/testing/only/", and it
would treat "c:" as just another subdirectory.

eg,

$ echo "this is a test" > c:\\testing\\only

$ ls -la c:\\testing\\only
-rw-rw-r-- 1 botp botp 15 Feb 25 00:22 c:\testing\only

$ cat c:\\testing\\only
this is a test

$ file c:\\testing\\only
c:\testing\only: ASCII text

> To give you some context: I have an app on a Linux server that parses feeds
> containing file metadata coming from Windows machines. One of the operations
> is to break down path information.

as of now, Pathname deduces from the underlying/running os. it does
not yet have the option to receive an os param to change its default
behaviour.
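
one way to see this for yourself is to look at the separator constants that ruby exposes. the sketch below assumes that Pathname's splitting follows File::SEPARATOR and File::ALT_SEPARATOR, which is my understanding of the standard library and not something stated elsewhere in this thread:

```ruby
require "pathname"

# File::SEPARATOR is "/" everywhere; File::ALT_SEPARATOR is "\\" on Windows
# and nil on Unix-like systems.
separators = [File::SEPARATOR, File::ALT_SEPARATOR].compact

path = "c:\\hello\\world\\file.exe"
backslash_splits = separators.include?("\\")

puts "separators on this platform: #{separators.inspect}"
puts "backslash is a path separator: #{backslash_splits}"
puts "basename: #{Pathname.new(path).basename}"
```

on a unix box the last line prints the whole string back, because the backslash is just another filename character there.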

so just create your own method for parsing that kind of data.

1 you can start by separating the drive part from the pathname.

drive,path="c:\\this\\is\\a\\test.exe".split("\\",2)

=> ["c:", "this\\is\\a\\test.exe"]

2 then process the path by temporarily replacing the backslash with slash.

path=Pathname.new(path.gsub("\\","/"))
=> #<Pathname:this/is/a/test.exe>

3 then get the dirname and basename

dir,filename=path.dirname.to_s, path.basename.to_s

=> ["this/is/a", "test.exe"]

4 convert the dirname back to windows notation by replacing the "/"

dir=dir.gsub("/","\\")

=> "this\\is\\a"

5 now you have the basic "windows-based" info

[drive, dir, filename]

=> ["c:", "this\\is\\a", "test.exe"]
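
the five steps above can be wrapped into one method. this is only a sketch: the name split_windows_path is made up, and it assumes the input always starts with a drive followed by a backslash (for example "c:\\dir\\file.ext"). it uses tr instead of gsub for the final conversion, which has the same effect here.

```ruby
require "pathname"

# Hypothetical helper: splits a Windows-style path into [drive, dir, filename].
# Assumes a drive prefix followed by at least one backslash.
def split_windows_path(windows_path)
  # Step 1: separate the drive from the rest of the path.
  drive, rest = windows_path.split("\\", 2)
  # Step 2: temporarily switch to forward slashes so Pathname can parse it.
  path = Pathname.new(rest.tr("\\", "/"))
  # Steps 3 and 4: take dirname and basename, then restore the backslashes.
  dir = path.dirname.to_s.tr("/", "\\")
  [drive, dir, path.basename.to_s]
end

p split_windows_path("c:\\this\\is\\a\\test.exe")
# ["c:", "this\\is\\a", "test.exe"]
```

for a file directly under the drive root, such as "c:\\test.exe", the dir part comes back as "." because that is what Pathname#dirname returns for a bare filename.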

kind regards
--botp

oops, my bad, sorry for the noise.
the questions were already answered by rob and john.