[ruby-talk:443045] Re: parallel require in ruby, dirwalking and other windows things

> I haven’t used Windows in more than a decade, but there are active Windows
> core developers that can be found on ruby-core and the issue reported here
> could be raised more usefully at
> Issues - Ruby master - Ruby Issue Tracking System.

Yes, I might end up doing that! I just wanted to hear general thoughts
first :)

(re, lstr funcs)

> I would also suggest that this be raised, but as a separate issue. I
> suspect that there are very good reasons to keep it, but there are more
> knowledgeable people on this.

I filed a bug about it a few months ago
<https://bugs.ruby-lang.org/issues/18924> :)

No activity on this or my other related bug
<https://bugs.ruby-lang.org/issues/18923>, and I don't know how to ping the
devs.

> I’ve never used it, but Zeitwerk (GitHub - fxn/zeitwerk: Efficient and
> thread-safe code loader for Ruby) is at least one approach to improving
> loading. Because `require`s are only *nominally* related to the files that
> are loaded, however, I would not count on parallel `require`s being clean,
> easy, or thread-safe. As before, the best place to discuss this would be
> Issues - Ruby master - Ruby Issue Tracking System
> or the ruby-core mailing list.

Zeitwerk is now the default loader for Rails! It seems to work nicely, but
I have no clue whether require can be parallelized from there. Could
Zeitwerk, in theory, wrap the blocking Kernel#require to simply run async,
and do some threadpool/coroutine dispatching around it? I just don't know
enough about Ruby internals to tell.
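
A minimal sketch of the thread-pool idea in the paragraph above, using only
core Ruby (Thread, Queue, Kernel#require). The feature list and pool size are
illustrative assumptions, and this is not how Zeitwerk works. It only shows
that `require` can be issued from worker threads; on CRuby the GVL still
keeps Ruby code from running on several cores at once, so any gain would
have to come from overlapping blocking file I/O.

```ruby
# Sketch only: the feature list and pool size are illustrative assumptions.
FEATURES = %w[json set time].freeze   # stdlib features picked for the demo
POOL_SIZE = 2

queue = Queue.new
FEATURES.each { |feature| queue << feature }
queue.close                            # pop returns nil once a closed queue is drained

results = Queue.new                    # thread-safe collection of [feature, result]
workers = Array.new(POOL_SIZE) do
  Thread.new do
    while (feature = queue.pop)
      # require returns true on first load, false if the feature was already loaded
      results << [feature, require(feature)]
    end
  end
end
workers.each(&:join)

loaded = []
loaded << results.pop until results.empty?
loaded.each do |feature, fresh|
  puts "#{feature}: #{fresh ? 'loaded' : 'already loaded'}"
end
```

Files that depend on each other's side effects or load order would need much
more care than this; the sketch assumes the features are independent.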

I spent an hour or two poking around. The code that itself loads & parses
files (i.e. load_iseq_eval?) doesn't seem to mess with global data much.
I'm presuming that's the part that is slowest, but it might instead
be rb_iseq_eval. I don't have the proper environment set up to profile this.

If actually executing the file (which is what I suspect rb_iseq_eval does?)
is the slow part, then yes, that would be insanely messy.
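
One rough way to separate the two phases from Ruby code, without a C build
environment, is to time compilation and evaluation separately with
RubyVM::InstructionSequence (CRuby only). This is a sketch under
assumptions: the sample file and the `elapsed` helper are hypothetical, and
whether these two steps map exactly onto load_iseq_eval and rb_iseq_eval
internally is not something the sketch establishes.

```ruby
require "tempfile"

# Hypothetical sample file; its contents are invented for the demo.
sample = Tempfile.new(["load_timing_demo", ".rb"])
sample.write("module LoadTimingDemo\n  VALUE = (1..1000).sum\nend\n")
sample.close

# Hypothetical helper: wall-clock seconds spent in the block.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

iseq = nil
# Phase 1: read, parse, and compile the file to an instruction sequence.
compile_time = elapsed { iseq = RubyVM::InstructionSequence.compile_file(sample.path) }
# Phase 2: execute the compiled instruction sequence.
eval_time = elapsed { iseq.eval }

printf("compile: %.6fs, eval: %.6fs\n", compile_time, eval_time)
sample.unlink
```

Pointing the same two measurements at a real application file would give a
first hint about which phase dominates.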

If I'm understanding this correctly, then sure, I'll be happy to post as an
issue on the redmine tracker.

Sincerely,
Alexander Riccio

--
"Change the world or go home."
about.me/ariccio <http://about.me/ariccio>
If left to my own devices, I will build more.

On Wed, Oct 5, 2022 at 11:00 PM <ruby-talk-request@ruby-lang.org> wrote:

----------------------------------------------------------------------

Message: 1
Date: Wed, 5 Oct 2022 19:32:09 -0400
From: Austin Ziegler <halostatue@gmail.com>
To: Ruby users <ruby-talk@ruby-lang.org>
Subject: [ruby-talk:443039] Re: parallel require in ruby, dirwalking
        and other windows things
Message-ID:
        <CAJ4ekQs3Ush10ohyukYBtCsLzzATk5v3LBaKP3EAoF+fFD+Yvw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Wed, Oct 5, 2022 at 6:42 PM Alexander G. Riccio <test35965@gmail.com> wrote:

> Dear ruby community,
>
> I have one question, a suggestion to speedup directory walking, and a few
> brief notes about win32.c.
>
> I've been a Windows Ruby user for 3 years or so, and I'm fairly satisfied.
> I have noticed, however, that some operations get pathologically slow at
> times. I have seen 5-minute startup times for Rails at worst. This is
> partly the residual MFT congestion left over from when I worked on
> altWinDirStat, but I believe it is also partly the result of slow
> directory tree walking.
>
> It looks like the way directory tree walking is implemented *may* be
> 1.5-3x slower than necessary on Windows. Sadly, I do not have the proper
> build environment set up to debug or fix this, but someone may be able to!
>
> It's hard for me to fully understand glob_helper, since there are many
> #ifdefs and it's nearly 300 lines long, but I think that when globbing
> the directory recursively, there is first a compatibility layer of sorts
> for Windows in rb_w32_opendir that fills out a bunch of fake direntries
> from the underlying FindFirstFileW/FindNextFileW.
>
> One interesting opportunity here is that it appears ruby makes a
> wstati128 call before FindFirstFile to find out if the file/path is
> indeed a directory, which is roughly the same behavior that Ben Hoyt saw
> when working on BetterWalk <https://github.com/benhoyt/betterwalk>.
> Windows already provides this info from the FindFirstFile/FindNextFile
> APIs, so the extra call looks redundant. If this happens on every file it
> encounters (and I cannot tell), this is a 2x slowdown.
>
> Sidenote: there's no reporting/mapping of GetLastError if FindNextFile
> fails. This could bite someone in the future. I'd say the same about
> CloseHandle, but I'm the only dev I've ever met who checks that return
> value.
>

I haven’t used Windows in more than a decade, but there are active Windows
core developers that can be found on ruby-core and the issue reported here
could be raised more usefully at
Issues - Ruby master - Ruby Issue Tracking System.

> The other interesting issue I've noticed is that lstrlenW is still used
> throughout win32.c. I'm perplexed by this! lstrlenW is a full syscall
> that has the exact same behavior as wcslen, just way slower. Is there a
> good reason to keep it? I believe it exists as a legacy matter more than
> for any useful reason. Back in the day, lstrlenW used to catch and
> silently ignore access violations, like its friends lstrcat and lstrcpy.
> There are a few uses of lstrcat left in ruby; those should be removed for
> security reasons even if there's no memory corruption at the moment! I've
> personally been bitten by them many times in the past.
>

I would also suggest that this be raised, but as a separate issue. I
suspect that there are very good reasons to keep it, but there are more
knowledgeable people on this.

> The last thing, what I initially came here to ask, has anybody ever
> thought about parallelizing require?
>

I’ve never used it, but Zeitwerk (GitHub - fxn/zeitwerk: Efficient and
thread-safe code loader for Ruby) is at least one approach to improving
loading. Because `require`s are only *nominally* related to the files that
are loaded, however, I would not count on parallel `require`s being clean,
easy, or thread-safe. As before, the best place to discuss this would be
Issues - Ruby master - Ruby Issue Tracking System
or the ruby-core mailing list.

-a
--
Austin Ziegler • halostatue@gmail.com • austin@halostatue.ca
http://www.halostatue.ca/ • http://twitter.com/halostatue